New article on ACM Queue:
Beyond Relational Databases
http://www.acmqueue.com/modules.php?name=Content&pa=showpage&pid=299
There is more to data access than SQL.
by MARGO SELTZER, SLEEPYCAT
From the Databases issue, vol. 3, no. 3 - April 2005
article excerpt:
The number and variety of computing devices in the environment are
increasing rapidly. Real computers are no longer tethered to desktops
or locked in server rooms. PDAs, highly mobile tablet and laptop
devices, palmtop computers, and mobile telephony handsets now offer
powerful platforms for the delivery of new applications and services.
These devices are, however, only the tip of the iceberg. Hidden from
sight are the many computing and network elements required to support
the infrastructure that makes ubiquitous computing possible.
With so much computing power traveling around in briefcases and pockets,
developers are building applications that would have been impossible
just a few years ago. Among the interesting services available today
are text and multimedia messaging, location-based search and
information services (for example, on-demand reviews of nearby
restaurants), and ad hoc multiplayer games. Over the next several
years, new classes of mobile and personalized services, impossible to
predict today, will certainly be developed.
While these services differ from one another in major ways, they also
share some important attributes. One--the focus of this article--is the
need for data storage and retrieval functions built into the
application. Messaging applications need to move messages around the
network reliably and without loss. Location-based services need to map
physical location to logical location (for example, GPS or cell-tower
coordinates to postal code) and then look up location-based
information. Gaming applications must record and share the current state
of the game on distributed devices and manage content retrieval and
delivery to each of the devices in real time. In all these cases, fast,
reliable data storage and retrieval are critical.
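As a rough illustration of the location-based case described above, the two-step lookup (physical location to logical location, then logical location to information) can be sketched with plain key/value maps. This is only a sketch under stated assumptions: the grid-cell scheme, the sample coordinates, the review text, and the names `cell_for` and `reviews_near` are all hypothetical and do not come from the article.

```python
import math

# Hypothetical sample data (not from the article).
# Step 1 table: grid cell (latitude/longitude floored to 0.01 degree) -> postal code.
CELL_TO_POSTAL = {(4237, -7112): "02138"}
# Step 2 table: postal code -> location-based information.
POSTAL_TO_REVIEWS = {"02138": ["Bit Bistro: great soup"]}


def cell_for(lat, lon, cells_per_degree=100):
    """Quantize a coordinate pair to an integer grid cell usable as a key."""
    return (math.floor(lat * cells_per_degree),
            math.floor(lon * cells_per_degree))


def reviews_near(lat, lon):
    """Map physical location to postal code, then look up reviews by key."""
    postal = CELL_TO_POSTAL.get(cell_for(lat, lon))
    if postal is None:
        return []  # no logical location known for this cell
    return POSTAL_TO_REVIEWS.get(postal, [])


nearby = reviews_near(42.3736, -71.1190)
```

Both steps here are simple lookups by key, with no query language involved; a real service would need persistent, reliable storage rather than in-memory dictionaries.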
As soon as the discussion turns to data storage and retrieval, relational
databases come to mind. Relational databases have been tremendously
successful over the past three decades, and SQL has become the lingua
franca for data access. Although data management has become almost
synonymous with RDBMS, for an increasing number of applications
lighter-weight alternatives are more appropriate.
In this article, we begin with a brief review of how relational systems
came to dominate the data management landscape, discuss how the
relational technologies have evolved, present a data-centric overview of
today's emergent applications, and delve into data management needs for
today's and tomorrow's applications.
RELATIONAL PREHISTORY
Relational databases came out of research at IBM [1, 2] and the
University of California at Berkeley [3] in the 1970s. Relational
databases were fundamentally a reaction to the escalating costs required
for deploying and maintaining complex systems.
The key observation
was that programmers, who were very expensive, had to rewrite large
amounts of application software manually whenever the content or
physical organization of a database changed. Because the application
generally knew in detail how its data was stored, including its
on-disk layout, reorganizing databases or adding new information to
existing databases forced wholesale changes to the code accessing
those databases.
Relational databases solved this problem in two ways. First, they hid
the physical organization of the database from the application and
provided only a logical view of the data. Second, they used a
declarative language to describe the data of interest in a particular
query, rather than forcing the programmer to write a collection of
function calls to fetch the data. These two changes allowed programmers
to describe the information they wanted and to leave the details of
optimization and access to the database management system. This
transformation relieved programmers of the burden of rewriting
application code whenever the database layout or organization changed.
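The two properties described above can be illustrated with a small sketch using Python's standard sqlite3 module. The table, column, and index names and the sample rows are hypothetical, chosen only for illustration. The application function states what it wants in SQL and keeps working unchanged after the database is reorganized (an index is added) and extended (a column is added).

```python
import sqlite3


def restaurants_in(conn, postal_code):
    """Declarative query: says WHAT is wanted; the engine decides HOW."""
    rows = conn.execute(
        "SELECT name FROM restaurant WHERE postal_code = ? ORDER BY name",
        (postal_code,),
    )
    return [name for (name,) in rows]


# Hypothetical schema and data, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE restaurant (name TEXT, postal_code TEXT)")
conn.executemany(
    "INSERT INTO restaurant VALUES (?, ?)",
    [("Chez Ada", "02138"), ("Bit Bistro", "02138"), ("Cafe Null", "94720")],
)
before = restaurants_in(conn, "02138")

# Change the physical organization (new index) and the content (new column).
conn.execute("CREATE INDEX idx_postal ON restaurant (postal_code)")
conn.execute("ALTER TABLE restaurant ADD COLUMN rating REAL")

# The application code above is untouched and returns the same answer.
after = restaurants_in(conn, "02138")
```

Code that read records at fixed on-disk offsets would, by contrast, have to be rewritten after either change.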
Relational databases enjoyed tremendous success in the IT shops and data
centers of the world. Businesses with large quantities of data to manage
and sophisticated applications using that data adopted the new
technology quickly. Demand for relational products created a market
worth billions of dollars in licensing revenue per year. Several RDBMS
vendors arose in the 1980s to compete for this lucrative business.
In the 20 years that followed, two related trends emerged. First, the
RDBMS vendors increased functionality to provide market differentiators
and to address each new market niche as it arose. Second, few
applications need all the features available in today's RDBMSs, so as
the feature set grew, each application used a smaller fraction of it.
This drive toward increasing DBMS functionality has been accompanied by
increasing complexity, and most deployments now require a specialist,
trained in database administration, to keep the systems and applications
running. Because these systems are developed and sold as monolithic
entities, each installation pays the price of the overall complexity,
even though an application may require only a small subset of the
system's functionality. Surely, there must be a better way.
THE NEW FRONTIER
We are not the first to notice these tides of change. In 1998, the leading
database researchers concluded that database management systems were
becoming too complex and that automated configuration and management
were becoming essential [4]. Two years later, Surajit Chaudhuri and
Gerhard Weikum proposed radically rethinking database management
system architecture [5]. They suggested that database management systems
be made more modular and that we broaden our thoughts about data
management to include rather simple, component-based building blocks.
Most recently, Michael Stonebraker joined the chorus, arguing that
"one size no longer fits all" and citing particular application examples
where the conventional RDBMS architecture is inappropriate [6].
As argued by Stonebraker, the relational vendors
have been providing the illusion that an RDBMS is the answer to any
data management need. For example, as data warehousing and decision
support have emerged as important application domains, the vendors
have adapted products to address the specialized needs that arise in
these new domains. They do this by hiding fairly different data
management implementations behind the familiar SQL front end. This
model breaks down, however, as one begins to examine emerging data
needs in more depth.
Read the rest of this article at acmqueue.com