Resume for Jonas S Karlsson
===========================
My locations:
San Francisco, USA
Sydney, Australia
Amsterdam, The Netherlands
Stockholm region, Sweden
Phone: +1 691 4936 (mobile)
Email: jsk@yesco.org (preferred)
Name : Jonas S Karlsson
Date of Birth : 12th July 1968
Place of Birth: Enkoping, Sweden
Education : PhD, Computer Science
Status : free (not married)
Visa : US Green card
I n t r o d u c t i o n :
NEWS: I gave notice at end of June, 2010, and quit Google! Now I'm
travelling around the world visiting friends, and giving talks and consulting
about how to design and implement for scalability. Do you need scalability
design review/help for your successful internet services? :-)
Since September 2008 I've been working on the scalability and indexing
of Google Wave, and on full-text indexing support in Megastore.
In 2006, I was co-creator of the Google Megastore project, which provides
a storage layer with SQL-style schemas, secondary indices and consistency:
Google's "database" storage system for building scalable applications,
if you will.
I was the second person on the project, and therefore designed many features
and implemented most of the indexing features. The project aimed for, and
delivered, a storage layer that allows applications to scale to Google-sized
web deployments. It is currently used in a large number of newly released
Google applications and numerous internal applications.
Up till April 2006 at Google I was working in the Ads-Serving group as a
developer implementing various back-end features. This work involved
coordination between front-end developers, PMs and usability engineers
as well as major design effort and implementation. One notable feature was
the backend server support for dayparting, also known as "Ad Scheduling"
feature.
Previously, I worked at IBM in the former Cloudscape group that we
inherited from Informix. Just recently, our product has been made open
source through the Apache Foundation. While working with the
Cloudscape team, one of my ongoing focuses was designing and implementing
an emerging research prototype for JDBC Edge Caching, for scalability.
In 2000, when I started at IBM, I was the team/technical lead
for the DB2 Everyplace database kernel group. DB2 Everyplace is a fast
and small (200KB) SQL database engine for embedded use in small devices
and programs.
I'm foremost an architect/designer/programmer for new functionality.
I have a wide range of interests, see the project list below.
My foremost additions to DB2 Everyplace were designing/validating, and
leading interns in implementing, database encryption and key management
for mobile devices (article ref below); transaction management and
recovery; and a 40% speedup of the database storage engine's
operations, achieved by writing specialized profiling tools in my first
few weeks of employment.
I was hired by Josephine Cheng in 2000, after an hour-long phone interview.
She is now an IBM Fellow and heads IBM Almaden Research; she was
previously Director of the IBM China Software Development Laboratory.
I did my PhD at CWI (Dutch Centre for Mathematics
and Computer Science) under the supervision of prof. Martin Kersten. The
thesis was defended in autumn 2000, after I'd been with IBM
for almost a year. Before that I enjoyed some research at
the database laboratory (EDSLAB) at IDA, Linköping University,
with prof. Tore Risch.
My research focus was on Data Structures and Parallel Databases:
Parallel Storage (LH*), Parallel Query Processing, Parallel Object
Handling. I focused on Scalable Distributed Data Structures (SDDSs),
which involve dynamic hashing and dealing with multiple attributes, and
their integration into DBMSs. I also have a keen interest in Main-Memory
Database Systems and their efficient implementation, with experience
from WS-Iris/AMOS, developed by Tore Risch at HP-Labs, and Monet,
developed by the Data mining theme at CWI/Amsterdam. I have broad
experience of automatically generating web-pages as interfaces
to data storage systems.
G o a l :
I consider myself the inventive type. I like working on problems that
have not been solved elsewhere, and finding new solutions and new
algorithms for specific challenging issues.
The type of work I am looking for is in a free/open environment, where
information and ideas are shared. Working in a group with a broad
range of contacts is encouraging for me. I have participated in a
number of projects where designs have been discussed, and later on
implemented. It is encouraging to share and see others' views. I also
enjoy working on projects on my own - from idea/design to finished
implementation. My areas of interest are broad. Some words that
capture part of them are: Very Large Data (>G/T), Main Memory, Data
Structures, Indexing, Data Storage, (O)DBMS kernel implementations &
prototyping, Web based systems, Network, Scalability and Distribution,
HTML/HTTP, Active Web-systems.
P e r s o n a l :
I speak Swedish as my native language; my mother lives in Norway,
so I also easily make myself understood there. English is my
technical language. I studied German for 5 years in "Gymnasium" and
spent 1/2 year in Berlin; I now speak German fluently. When I
studied in The Netherlands, Dutch was added to the set of languages
that I understand.
Other interests include film festivals, movie (video) watching with
friends, cooking, traveling, meeting new cultures, occasional "hacking"
(programming interesting ideas, see projects at the end), reading
Science Fiction, practicing Tai Chi, Zen, Fencing, meditation, and
driving my Yamaha XJ750 through Europe!
P r o f e s s i o n a l S k i l l s :
* NOSQL implementations and the next logical step: database technologies
applied to NOSQL giving transactions, consistency, indexing, high performance,
high availability.
* Designing scalable backend systems. This requires a particular way of
thinking throughout the design of a new application. Scalability
needs to be the main part of the design of a system's architecture
in order for it to scale seamlessly by throwing more machines at it.
Enabled this at Google through the Megastore project.
* Technical Team Lead for a group of 10 people: architecture, writing designs,
researching solutions, reviewing designs, pin-pointing important issues,
encouraging group members, reviewing code, writing code, leading the team
forward, customer feature request negotiation.
* Security, Encryption (PKZ11, DES, key management/storage seeding).
* Broad knowledge/interest of programming languages, from
object-oriented, to functional, procedural. Examples are:
C (DB implementations), Perl (scripting/prototyping), Java,
Erlang, lisp (simulations), C++, Smalltalk, Pascal, Fortran, HTML, SQL,
Object Oriented SQLs, emacs; implementing my own specialized languages.
jML - my own functional programming language for fast web prototyping.
* Early Unix experience (88-); used several Unix variants
(SGI, HP, AIX, Sun, Linux), as well as MS-DOS, MS-Windows, OS/2, CP/M.
Have run Linux on my home system since 92; we invited Linus to Sweden.
* Protocols hands-on implementations: HTTP, SMTP, RPC, KOM,
Web-server implementations, Import of emails/threading to DBMS.
* Database Kernel Implementations: Memory Management, Data Structures,
Query Evaluators/Interpreters/Compilers, Indices, Cost-Models
Query Optimizer, Database Recovery and Logging, Main Memory
awareness, Transaction Paging System, Persistence implementations.
(WS-Iris/Amos (OOSQL), Monet (datamining), DB2 Everyplace, Cloudscape),
Google Megastore
* GUI programming: using specialized tiny web servers that generate HTML
gives the most portable GUIs; used the idea since 1996 (see JySKom below).
* Programming language interpreters: forth, lisp, Mascal, jML.
* Parallel & threaded: Implemented a portable C-threads package,
Multitasked a DBMS using it. LH*lh - Distributed Data Storage,
implemented using threads for real-time performance, using
24-node machine with 48 CPUs and 48GB of memory. Parsytec
parallel computer programming (European project).
* PDA/embedded devices "library programming" (SQL engine in 200K),
frustrating but challenging!, Palm Pilot, WinCE, Pocket PC,
Symbian (ugh!), embedded linux (rocks!)
* Experiences in text/information filtering, reformatting, and fast
prototyping of ideas; using combinations Lisp/Perl/C/Web-servers.
D i p l o m a s :
* PhD in Computer Science
(Scalable Distributed Data Structures for Database Management)
(2000, University of Amsterdam, The Netherlands)
* Technical Licentiate Exam
(A Scalable Data Structure for a Parallel Data Server)
(1997, Linköping University, Sweden)
* Master of Science in Computer Technology and Computer Science
(Implementation of Transaction Logging and Recovery in
a Main Memory Resident Database System)
(1993, Linköping University, Sweden)
("minor" in digital electronics)
P u b l i c T a l k s
* Nottingham University: A New Way of Thinking: Infinite Scalability...
(upcoming talk)
* Uppsala University: A New Way of Thinking: Infinite Scalability and
Rapid Scale Development (Bigger picture for NOSQL?)
1st September, 2010, Uppsala University, invited by prof. Tore Risch.
* Bedarra Research: A New Way of Thinking; Infinite Scalability and
Rapid Scale Development - bigger picture for NOSQL?
invited by Dave Thomas, founder Bedarra Research, Ottawa, Canada 2010
* CWI: A New Way of Thinking
at CWI, The Netherlands, 2010 (Dutch Centre for Mathematics and
Computer Science)
* NOSQL: New Way of Thinking - infinite scalability
A short "flash-talk" about NOSQL, next gen NOSQL, and how to implement it.
http://vimeo.com/5184606
NOSQL meetup, San Francisco, June 2009.
* SAP Research: A repeat of JAOO talk aka "Infinite Scalability"
Invited to speak at the SAP Labs by the R&D group
by Hui Ding and Shel Finkelstein. July 2009.
* 2x JAOO: Consistency, Storage, and Reliability in the Cloud.
"Jonas talks about how to scale web apps and underlying principle
of scalability of storage and data processing to millions of users
on many thousands of machines. The talk is based on experiences with
Google Megastore and many applications that it has enabled to scale
seamlessly. He'll discuss the importance of consistency and when it
may be appropriate to relax it, while maintaining a high quality
user experience."
Talk given at JAOO, Sydney and Brisbane, 2009
* Megastore: A Scalable Data System for User Facing Applications
JJ Furman, Jonas S Karlsson, Jean-Michel Leon, Alex Lloyd,
Steve Newman, and Philip Zeyliger (Google Inc.)
Abstract: Megastore provides a rich model and API that facilitates
implementation of user facing applications storing data in Bigtable.
Our goal is to enable Google developers to quickly build and launch
highly available applications at Google scale. We extend Bigtable
to provide strong consistency guarantees and higher-level abstractions
such as transactions, secondary indexes and synchronous replication.
Megastore takes a practical approach to schema management, providing
integrated declarative schemas with rich data extensions, such as logical
data partitioning, which is key to achieving high performance querying
and scalable massively parallel transactions.
SIGMOD 2008, Vancouver, Canada. Invited talk for special session.
http://www.sigmod08.org/program_glance.shtml
* Bigtable, gave the Bigtable talk at IBM R&D in Beijing, China.
Jan 2007.
I n t e r n a l T a l k s a t G o o g l e
* Megastore Tea-time Q&A, Sydney, 2009.
* Megastore, internal Google EMEA conference talk at Davos, Switzerland.
March 2007.
* Megastore, various internal talks at Google offices: Mountain View,
Beijing, Seoul, Santa Monica.
L i s t o f p u b l i c a t i o n s :
* J. S Karlsson, Thomas Fanghaenel, Cliff Leung
... (DB2 Everyplace article in German Database Journal)
* J. S Karlsson, Thomas Fanghaenel, Cliff Leung, Xin Hu
DB2 Everyplace: A High Performance Secure Small Footprint
Database using Encryption.
* J. S Karlsson, Amrish Lal, T. Y. Cliff Leung, Thanh Pham
IBM DB2 Everyplace: A Small Footprint Relational Database System.
ICDE 2001: 230-232, Heidelberg, Germany, April 2001.
* Jonas S. Karlsson, PhD Thesis: Scalable Distributed Data Structures
for Database Management, 2000 Amsterdam
* J. S Karlsson, M. L. Kersten
Omega-Storage: A Self Organizing Multi-Attribute Storage Technique
for Very Large Main Memories. (Presented at the Australian Database
Conference, Canberra, Australia, January 2000)
* J. S Karlsson
hQT*: A Scalable Distributed Data Structure for High-Performance
Spatial Access. (Presented at Int'l Conf. on Foundations of
Data Organization, Kobe, Japan, November 1998)
* J. S Karlsson, M. L. Kersten
Transparent Distribution in a Storage Manager
(Presented at the Int'l Conf. on Parallel and
Distributed Processing Techniques and Applications, Las Vegas, NV,
USA, July 1998)
* Licentiate of Engineering Thesis No 609, by Jonas S Karlsson, 1997
A Scalable Data Structure for a Parallel Data Server
ISBN 91-7871-918-6, ISSN 0280-7971. April, Linköping University.
(A Licentiate Thesis is a simpler form of PhD Thesis made about 3
years after the MSc, available only in Sweden. A full PhD thesis
takes 2-3 more years.)
* J.S. Karlsson, W. Litwin, T. Risch: LH*lh : A Scalable High
Performance Data Structure for Switched Multicomputers.
(In The 5th International Conference on Extending Database Technology
(EDBT'96), Avignon, France, March 1996.)
* Karlsson, J. S. (1995). An Implementation of Transaction Logging and
Recovery in a Main Memory Resident Database System. (Master Thesis
LiTH-IDA-Ex-94-04, Department of Computer and Information Science,
Linköping University, Sweden.)
T e c h n i c a l R e p o r t s :
* J. S Karlsson, M. L. Kersten
Scalable Storage for a DBMS Using Transparent Distribution
Technical Report INS-R9710, CWI, Amsterdam, The Netherlands, 1997.
* Karlsson, J. S., Litwin, W., and Risch, T. (1995). LH*LH : A
Scalable High Performance Data Structure for Switched
Multicomputers. Technical Report LiTH-IDA-R-95-25, Department of
Computer and Information Science, Linköping University, Sweden. Was
presented at EDBT-96, Avignon, France.
* J. S. Karlsson, S. Larsson, T. Risch, M. Sköld, M. Werner
AMOS User's Guide, CAELAB, IDA, Department of Computer Science
and Information Science, Linköping University, Sweden,
Memo 94-01 edition, March 1994
P a t e n t s
* I've filed a patent in IBM's name together with Torsten Grust, Prof.
"Small-footprint applicative query interpreter method, system
and program product", approved
Summary W o r k i n g E x p e r i e n c e (details below):
Google (2004-)
+ Google Wave - 1 year
+ Megastore - 3+ years
+ Ads backend - 1 year
IBM Database Lab (2000-2004)
+ Cloudscape/Derby R&D - 2 years
+ DB2 Everyplace, Tech Lead - 1.5 years
+ DB2 Everyplace
Research (1995-1998)
+ Database Lab (EDSLAB) - 2 years
+ Teaching uni. courses - 9 months
+ CWI - 2 years
Consulting (1986-1995)
+ Rektron AB - 3 years (1986-1992)
+ Schmidt+Haench - 6 months (1992)
+ Tech Prog, EDSLAB - 1 year (1993-1995)
Technical work (1985)
+ Bahco AB - 3 months,
(List is in reverse time order, time overlaps)
Convention:
year month : Event at a date, such as exam
year month- : Activity started
year-year : (part time) activity during period
(time) : (estimated) working time spent on activity
2008 Sep - Google Wave: Scalability consultant and fulltext indexing
implementor. Enabling Megastore to be used for indexing
Google Wave consistently. Indexing reliability and recovery.
2006 Feb - Google: Megastore team. As second person on the project
I defined features and implemented most of the
data encoding, index encoding, indexing features,
querying API, command line (SQL-ish) tool, data web browser.
Helped numerous applications define their schema and application
for scalability.
(2005 Jan) Started a meditation group at Google. Led the group
that sits every Wednesday lunch for 30 minutes + discussion.
(ongoing - still going strong 2008 July)
2004 Dec - Google: Ads backend team, implemented various features for
the backend ads serving; relaxing MinCPC (from 5c->2c) for
some countries; Ad Scheduling (dayparting feature);
AOL Marketplace prototype.
2002 IBM: Cloudscape team, R&D for JDBC Edge Caching technologies.
2000 Nov- IBM: DB2 Everyplace, Technical Lead, architect changes,
design, features. Designed and implemented. Led the work of 10
people of varied cultural origins. Architected mobile database
secured storage solution, led implementation.
2000 Dec PhD thesis defense in Amsterdam.
2000- IBM: DB2 Everyplace, database kernel, various
optimizations of algorithms, data structures, design and
implement transaction facility for small device.
1999 autumn CWI: Delivery of PhD thesis.
1998 Nov CWI: Referee ICDE'99
1997 Jul- CWI: Database Researcher, due to finish PhD in 1999.
New spatial scalable distributed data structure (SDDS).
Integration of SDDS into a DBMS (Monet).
1997 Feb- CWI: Invited Guest (Researcher) involved in porting the
(5 months) database system Monet to a parallel machine (Parsytec). Other
activities include application of SDDS (Scalable Distributed
Data Structures) to a database system, Monet.
1997 April EDSLAB: Technical Licentiate Thesis presentation.
(could be described as "half-a-PhD", two years into PhD).
1994-1996 EDSLAB: PhD courses taken in: Principles of Modern Databases,
(~8 months) Distributed Databases, Configuration Management, Advanced
Computer-based Learning Environment, Scientific Writing,
Temporal Databases, Parallel Programming and Compilation
Techniques, Debugging and Performance Monitoring with the
TOPSYS Environment, Network Databases, Computer Network
and Service Protocols, Multi-databases.
1993-1996 EDSLAB: Teaching in undergraduate courses
(~9 months) Programming Abstractions and Methodology. Lisp course. (lect/lab)
Process Programming and Operating Systems. (lect/lab)
Databases (labs)
1996 March EDSLAB: Presentation of LH*lh at EDBT96 Avignon France
together with prof. Tore Risch and prof. Witold Litwin
(Paris 9, Dauphine).
1995 Feb- EDSLAB: Employment as researcher (PhD-student) at EDSLAB
(still (Engineering Databases and Systems LABoratory),
employed) IDA (Department of Computer and Information Science),
LiTH (Linköping Institute of Technology).
1993 Aug- EDSLAB: Technical Programming support for CAELAB/EDSLAB
(6 months) Extension of WS-Iris/AMOS database (C,lisp)
1993 Febr- EDSLAB: Master Thesis Work: An Implementation of Transaction
(6 months) Logging and Recovery in a Main Memory Resident Database
System. The work was done for prof. Tore Risch. (C,lisp)
1992 Aug- Rektron AB: Networked multi-user version of Rektron GAUGE
(6 months) database software. Further extensions on user defined database
schemas, and user defined query plans. (Borland O.Pascal)
1992 Febr- Schmidt+Haench: Student Exchange program (COMETT II):
(6 months) Programming and planning at the Product Development group at
Schmidt+Haench, Berlin, Germany. Project for Sudzucker
Industrie to handle information flow at chemical laboratory.
Software for controlling of mechanical laboratory equipment
was developed, as well as means for data storage. (B.O.Pascal)
1988-1994 Rektron AB: Database system & application design and product
(2 years) development. The product is support for ISO-9000 calibration and
traceability in industry (Volvo, FMV...). The work took place
during summers at Rektron AB in Stockholm, but also besides
studies. Total working time (hours counted) 2 years.
(B.O.Pascal)
1987 July Joined studies at the Master of Science Program in Computer
Science and Engineering at Linköping University.
1987-1995 Board member of various university societies:
Lysator - Lysator Academic Computer Club Society
2 year secretary + 1 year board member
Admittansen - University Electronics Club
1 year vice president
+other activities.
1987 Summer Rektron AB: Practical work. Porting/Hotting up DBase
(3 months) application to newer DBase version in network environment. (basic)
1986 Summer Rektron AB: Practical work. Porting CP/M Basic software
(3 months) to (original) IBM PC. Wrote script to translate code (basic)
1985 Summer Bahco Ventilation AB: Summer work, helping out with
(3 months) technical drawings updates for air conditioning systems.
L a n g u a g e s :
* Swedish is my mother tongue.
* English is my technical language.
* Proficient in German.
* Dutch I understand.
* Interest in Asian languages.
P r o j e c t s :
I briefly mention some of my "fun" projects that I sometimes spend some
"hacking" time on.
* Built and experimented with own construction of a digital pulse dialer
phone using discrete components like TTLs, built a remote phone status
indicator (off hook, connected, free) (age 13)
* Built a crude "scanner" using my MCP-40 plotter. Hardware interface &
Software for Oric-Atmos. (85)
* Forth - wrote an interpreter/compiler for forth on a Z80, using
Turbo Pascal 3, CP/M. (86)
* Assembler - written for a digital project where we developed
our own CPU/computer from VLSI-chips. (87)
* Adventure Language - a project in lisp, developing an object-oriented
language for implementing adventures in. (87)
* In project at university; digital electronics: Designed, wired,
tested a one cycle instruction CPU: VNISC (Virtually No Instruction
Set Computer) using 50 TTLs. Wrote an assembly language compiler. (87)
* MASCAL - my own Pascal, interpreter with loose typing, and types like
"code", "expression" (lazy evaluation style) (89)
* LISP - a lisp-interpreter written in Borland Pascal. Using
reference counting garbage collection. Tongue in cheek. (94)
* Benchmarking Repository - interfaced AMOS MMDB (OSQL) with gnuplot
facilities to automatically plot graphs and extract excessive
amounts of data - later used for plots in EDBT paper. (95)
* Querying External File Data - general interface for AMOSQL to
query semi-structured external data, such as mail-files. (95)
* QPM - Query Processing Memory manager in C, a memory management and
efficient ObjectLOG interpreter. I.e. the core of a DB system.
Extensible with new data types and data structures, crude TUI.(96-)
* JySKom - a web-interface for a Conference System (stored messages).
Implemented a web-server (HTTP-protocol) from scratch in Perl,
this server acts as a client to the Kom database using a plain-text RPC,
it generates HTML-pages on the fly. (96-97)
* KOM AMOSQL interface - query external specialized conference
database system to facilitate Advance Object Queries.
* cjtml - a preprocessing macro language for HTML, a simple but yet
capable interpreter written in a few lines of Perl! (96)
* Mankan Search - another "application" for the web-server code.
Extracted and generated well-formed (legacy) data from web-pages,
create well formed database that can be queried
structured. (http://skiff.cwi.nl:2442/) (98)
* JML - a new web-language based on cjtml, allows "active"
web-pages. It is an inline text/string macro language that allows
easy generation of HTML. Avoids the irritating quoting problem when
generating HTML. Allows for *extremely* rapid prototyping of
web-applications. For example, the basic JySKom project could be
"implemented" in a day using this language. Mankan Search can
be replaced by a few pages of code. Uses XML-style "databases". (98-99)
* Album - Automatic Photo-album (HTML-pages) generation for manual
classification. Creates views and facilitate queries over picture
collections. Implemented in Perl. (98)
* Vicky - a Wiki Wiki Web system used internally at my IBM group.
Allows remote editing of HTML documents, easy linking, history
of changes. System is used for keeping track of features, design,
line-items. Used by a startup for their intranet. (2000-)
* Album - reimplemented in JML, using XML, active webpages, templates.
Is used to search/manage over 15,000 digital pictures.
* Voogle - "google" inspired brute force search engine/ranking for pages.
* Text - full text indexing scripts in Perl for online conferences
in LysKOM (Swedish server), speedup using main memory word index,
two layer position index, etc.
* jML - compilation of jML programs from interpretive to more efficient
executing by compiling (to perl!).
* mm - a perl reimplementation of old command line interface for
"Mail Manager"(TOPS), superior list/search from command line.
* zdb - Playing with Java for building small but efficient database
system kernel, mapping all data to long ints for code reuse.
* mail - Started to write my own mail web interface search engine for
searching/organizing all my emails since 1987. Killed by Gmail...
* yesco.org - integrated information site for my stuff
- wiki for pages
- online address book management (generic form/database)
- album for my digital photos
- my books server
- persistent XML database interface
- server side bookmark management
- related data "search engine", for page shows relevant:
pages, titles, photos, albums, bookmarks, books
- what's new (list changed pages, "pre RSS")
- (type non-existent page and it'll search yesco.org/oogl )