For some reason I got on a research jag about this topic last month
(maybe to avoid my FOSS4G prep?). Anyhow, I sketched out some ideas on
the storage/provision side here:

http://etherpad.com/I3dgOoyQKV

Feel free to annotate / edit that if you like.

The raw text is below for your interest.
----

The OpenTile Federation
=======================

How can we create globally available image data sets without
committing to running centralized data centres? How can organizations
place their imagery online in a way that it will be globally
available, integrated with imagery from other organizations, and
still retain good performance?

We propose a decentralized approach, in which any organization can
easily set up an OpenTile server, which will join the OpenTile
federation and automatically share in the burden of providing access
to, and redundant storage of, global imagery and map tile sets.

An Attempt: OpenAerialMap
-------------------------

OpenAerialMap attempted a centralized solution to the problem of
uniform access to a shared imagery cache. Servers were donated by a
university and some basic web infrastructure was written that ingested
images in a few basic formats and wrote the result out to a tile
cache.

It didn't work, though fortunately for reasons that are mostly
technical. The interest and enthusiasm generated even in a short
period of time were quite high, particularly given the limitations of
the platform. This is a problem which is amenable to an engineering
solution.

OpenAerialMap shut down a few months ago, and one of the reasons
behind the shutdown was an issue of scale -- there is just too much
image data in the world for one organization to take on the charitable
work of hosting it for everybody.

A secondary reason for the shutdown was the difficulty involved in
getting new data in, given the required knowledge of raster imagery
necessary to parameterize a new upload. We'll treat this problem as a
second-order issue, since without infrastructure it's a moot point.

We need a solution that allows people to easily donate bandwidth and
storage space, without creating a coordination or institution problem.
We don't need an OpenAerialFoundation to take in donations and run a
data center, we need a way for people to bind their servers together
into a federation, with minimal input.

An Attempt: OpenStreetMap
-------------------------

OpenStreetMap (OSM) is a successful attempt at data sharing, but the
handling of actual tile service has been treated as an afterthought
-- the primary goal of the organization is capturing and improving
vector data. OpenStreetMap tiles are served centrally, so the
system has a single point of failure and requires direct cash inputs
to maintain. Thus far, demand for OpenStreetMap tiles has been
mitigated by commercial organizations handling much of the
distribution. CloudMade serves tiles based on OSM data. DeCarta
has also recently entered the business. This is fine for converting a
free resource (OSM vectors) into a proprietary service (CloudMade or
DeCarta tiles), but not good for a resource like imagery, where the
image tile itself is public and free.

A New Approach
--------------

We want a way for organizations to easily collaborate, for data to
reside close to where it will be used, but still be part of a seamless
global coverage of imagery.

BigPeer2BigPeer
~~~~~~~~~~~~~~~

A pure P2P solution is not practical, as the look-up costs and
unpredictable latency will make tile delivery problematic. What we want
is a way for BigPeers, with good local connectivity in their region,
to easily set up a new node and begin servicing local clients, with
very little technical or communication overhead. Similarly, if those
BigPeers drop offline for various reasons, we want the network to
re-balance, but not lose data in the process.

Each BigPeer will set up an OpenTile server, with as much storage as
they can provide to it. The server will join the network, be
registered, and pull over the tiles necessary to service the local
clients in its area. How does this work? See below.

Global Addressing
~~~~~~~~~~~~~~~~~

We want every tile and the metadata for every tile to be globally
addressable and accessible.  We want the history of every tile to also
be globally addressable and accessible.

Fortunately, the solution already exists, in a technology called a
"distributed hash table" (DHT). DHT technology received a lot of
research between 2000 and 2005 after the implosion of Napster and the
rise of a number of different attempts at federated non-centralized
systems.

Using a DHT approach, a user could ask any OpenTile server for any
tile in the world, and have that tile returned. By routing users to
nearby servers, and having servers store tiles that are in their local
area, the bandwidth requirements would be shared, and the performance
improved over a centralized system. By spreading out the storage among
servers, the hardware requirements would be shared. By replicating the
tiles to multiple servers, reliability would be guaranteed.
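The core DHT behaviour described above, routing each tile key to the
server whose key is nearest, can be illustrated with a toy lookup
(the server names and key values here are invented for the example):

```python
# Toy illustration of DHT-style routing: each server is responsible
# for the keys numerically nearest its own key. Names and key values
# are invented for the example.

def nearest_server(servers, tile_key):
    """Return the name of the server whose key is numerically
    closest to tile_key. servers maps name -> integer key."""
    return min(servers, key=lambda name: abs(servers[name] - tile_key))

servers = {"buenos-aires": 0x2000, "new-york": 0x9000, "oslo": 0xF000}

# A request for a tile key near 0x8800 lands on the server whose
# key is closest:
print(nearest_server(servers, 0x8800))  # -> new-york
```

A real DHT reaches the responsible node in O(log n) routing hops
rather than by a global scan, but the end result is the same: the
nearest key wins.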

Using a DHT, the coordinating function of the central institution
could be greatly reduced. No huge storage arrays to maintain, no thick
network cords to wire up. The central OpenTile web site would just
maintain a manual for operating the OpenTile software, a copy of the
code, downloadable binaries, a list of running OpenTile servers so new
servers can join the federation, and a DNS system to map generic tile
requests to random or nearby servers.

Some tweaks to generic DHT technology can be used to make the OpenTile
solution an even better fit.  Unlike a classic DHT, in a tile DHT the
keys have meaning: they have a location.  And requests for tiles
are likely to be well correlated with the tile locations themselves --
that is, people in Argentina are more likely to ask for Argentina
tiles than people in NY, and vice versa.

Using the user/tile affinity, tiles can be migrated to servers that
are close to where the users are likely to request them. A further
benefit of this approach is that it migrates tiles past international
internet bottlenecks, into local national backbones (assuming an
OpenTile server is set up locally). This makes an OpenTile solution an
excellent fit for international development.

Another tweak is to use DNS for server discovery, rather than rely on
the DHT protocol itself. So if you want a tile, as an HTTP client (web
browser, etc) the OpenTile DNS system will give you the geographically
nearest server to talk to. That server will either provide you your
tiles directly out of its cache, or fetch the tiles you need over the
DHT network (and then locally cache them) if it doesn't already have
them.

Technical Details
-----------------

Enabling technologies
~~~~~~~~~~~~~~~~~~~~~

- Chimera: open source overlay network; binds servers together,
maintains routing information, routes messages to keys, and leaves
the details of local storage, replication, etc. to the application level
- Berkeley DB: open source disk hash; provides the local key-value store
- GDAL: open source image format handler; provides format conversion
where necessary
- FastCGI: implement OpenTile as an FCGI and have the HTTP layer
handled elsewhere

Open issues
~~~~~~~~~~~

- Object size: a 256x256 tile is pretty small, so perhaps make each
key map to a 1024x1024 area, but transfer around a collection of
256x256 chunks in a message bundle (or just a JPEG-compressed, tiled
TIFF, if we're feeling spunky). That way the server can easily serve
standard tiles, but can replicate on the basis of larger chunks.
- Object management policy: Chimera delegates these "minor" details,
like how to ensure that objects get replicated, that we still have
redundancy in the system for a given object, and that new servers get
copies (which copies?) of objects.
- Some policy is natural, in that tile requests by clients should lead
to local caching.
- Some policy is not natural: what is the rule for discarding objects
when the cache approaches maximum size? A combination of redundancy
(it's already well replicated) and locality (it's not close to me)
might imply dropping an object.
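The "not natural" eviction rule above can be sketched as a simple
scoring function combining redundancy and locality. This is a sketch
under assumed inputs (a known replica count and a rough distance
measure); the replication floor and all names are illustrative, not a
settled design:

```python
# A sketch of the cache-eviction rule discussed above: combine
# redundancy (well-replicated elsewhere) with locality (far from this
# server's region). The threshold and names are illustrative.

MIN_REPLICAS = 3  # assumed replication floor for the network

def eviction_score(replica_count, distance):
    """Higher score = better eviction candidate. Tiles below the
    replication floor score 0 and are never dropped."""
    if replica_count < MIN_REPLICAS:
        return 0.0
    return replica_count * distance

def pick_victim(tiles):
    """tiles: iterable of (key, replica_count, distance) tuples.
    Return the key with the highest positive score, else None."""
    score, key = max((eviction_score(r, d), k) for k, r, d in tiles)
    return key if score > 0 else None
```

A well-replicated, far-away tile is dropped first; anything below the
replication floor is kept regardless of distance.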

Key layout
~~~~~~~~~~

- Chimera supports a 160 bit key, lots of room
- First 64 bits can be the tile address, encoded in the "Microsoft
Bing Maps" manner: the first two bits indicate which quadrant of the
top level (NW=0, NE=1, SW=2, SE=3), the next two bits indicate which
quadrant of that quadrant, etc. This allows embedding the tile number
in a fixed-length key, and means that spatially nearby keys will be
numerically nearby, even across zoom levels.
- A 64-bit tile key translates to a ground resolution of about 1cm
per tile at the equator in Mercator
- Next 8 bits could be the "layer number" allowing for 256 global layers
- A key combining the first 64 bits with a layer number and then
zeroes can be the "canonical tile", the current active view of the
tile
- Next 8 bits can be the "metadata number", a space for storing
information about the tiles in this tile location in this layer. Ask
for "tile x/y/z, layer n, metadata" and get back an XML document
- XML document lists all the versions of tiles that have been added to
this tile slot, what their key number is, who added them, what their
effective resolution is, color/bw, and which one is the current
canonical tile
o Some explanation of this idea: when a new tile is inserted into the
system, it will have to be inserted with some standard metadata
indicating the effective image resolution, date of capture, user doing
the insert, etc. The OpenTile server will apply the policy for the
layer and decide: do I accept this tile at all? If so, give it a slot
under this tile number. Is it "better" than the current canonical
tile? If so, also replace the canonical tile content with this tile
content, and update the metadata to include the information about the
new tile. Some layers, like imagery, will have pretty complex policies
for canonical tile replacement, balancing currency and effective
resolution. Others, like a rendered map view, might just take the
latest view as canonical and discard all other copies (a reasonable
policy for an OSM tile set, for example)
- Initially there will be only one metadata number, but having room
for others hopefully future-proofs the problem of ever-growing
metadata documents
- Next 16 bits can be spare!
- Next 32 bits can be random hash, so that historical tiles get a
random slot for storage
- Next 32 bits can be spare!
- I've probably forgotten something, but the important point is that
spatially nearby tiles sort together, that the metadata and historical
tiles sort together, and that there is, in fact, metadata and the
ability to maintain a history (and potential rollback path) of the
layer
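The key layout above can be sketched in code. The quadkey digits
follow the Bing-style convention already described; the
left-alignment and zero-padding of the 64-bit tile field is an
assumption on my part (a real scheme would also need to record the
zoom level somewhere, perhaps in the spare bits, since zero padding
is ambiguous with NW digits):

```python
def quadkey_bits(x, y, z):
    """Encode tile (x, y) at zoom z as Bing-style quadkey bits:
    at each level the quadrant digit is NW=0, NE=1, SW=2, SE=3."""
    bits = 0
    for i in range(z - 1, -1, -1):
        digit = ((y >> i) & 1) << 1 | ((x >> i) & 1)
        bits = (bits << 2) | digit
    return bits

def tile_key(x, y, z, layer=0, metadata=0, version_hash=0):
    """Pack the proposed 160-bit key:
    64-bit tile address | 8-bit layer | 8-bit metadata number |
    16 spare | 32-bit random hash | 32 spare.
    The quadkey is left-aligned in the 64-bit field and zero-padded
    below zoom z (an assumption, see the note above)."""
    addr = quadkey_bits(x, y, z) << (64 - 2 * z)    # left-align
    key = addr
    key = (key << 8) | (layer & 0xFF)
    key = (key << 8) | (metadata & 0xFF)
    key = key << 16                                  # spare
    key = (key << 32) | (version_hash & 0xFFFFFFFF)  # historical slot
    key = key << 32                                  # spare
    return key                                       # 160-bit integer

# The Bing quadkey for x=3, y=5 at zoom 3 is "213": bits 10 01 11.
```

Because the tile address occupies the top bits, keys for spatially
adjacent tiles (and for a tile's metadata and history, which differ
only in the low fields) sort together, as intended.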

Key affinity
~~~~~~~~~~~~

Messages in Chimera are routed to the host with the key value
"nearest" to the message key. Server configuration can include the
provision of a lat/lon coordinate, which can be converted into a key.
Add some random noise at the bottom of the key, to keep the servers
distinct, even when they are at the "same" location, and you have a
server key to which messages can migrate that is spatially associated
with (a) the location of the server and (b) the tiles that will end up
there.
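A sketch of deriving such a server key, assuming a 64-bit
tile-address prefix (as in the key layout above) with 96 random low
bits as the "noise". The Mercator math is the standard projection;
the 64/96 field split is an illustrative assumption:

```python
import math
import random

def latlon_quadkey_bits(lat, lon, levels=32):
    """Map a lat/lon to a 64-bit Bing-style tile address (32 two-bit
    levels) in the spherical Mercator projection."""
    x = (lon + 180.0) / 360.0
    s = math.sin(math.radians(lat))
    y = 0.5 - math.log((1 + s) / (1 - s)) / (4 * math.pi)
    bits = 0
    for _ in range(levels):
        x *= 2.0
        y *= 2.0
        xi = min(int(x), 1)   # clamp the lon=180 edge case
        yi = min(int(y), 1)
        bits = (bits << 2) | (yi << 1) | xi
        x -= xi
        y -= yi
    return bits

def server_key(lat, lon):
    """160-bit server key: 64-bit location prefix plus 96 random low
    bits so co-located servers stay distinct."""
    return (latlon_quadkey_bits(lat, lon) << 96) | random.getrandbits(96)
```

Two servers in the same city share a long key prefix, so tiles for
that region migrate toward both of them, while the random tail keeps
their keys distinct.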

Global Metadata
~~~~~~~~~~~~~~~

- Some global metadata will need to be stored in the network, like
how the layer numbers map to human-readable names, the insert
policies for layers, and so on. What key that stuff is stored under
needs to be worked out.
- Possibly shift the whole tile key over two bits (we have resolution
to burn) and use the top bit to flag global metadata. Give it a much
wider replication policy.

Interacting with the Server
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The server will be accessed using web services; specifically, each
tile will be a writable URI.

- GET /layer/tilenumber
o Where tilenumber is the bing maps key (other schemes can be
supported, obviously)
- GET /layer/tilenumber/tileversion
o Where tileversion is a number returned in the metadata that allows
access to the underlying historical tiles
- GET /layer/tilenumber/metadata
o Get a version of the XML metadata document, with the tile keys
replaced with appropriate version numbers (the random hash portion
extracted from the key)
- POST /layer/tilenumber/new
o Put a metadata fragment into the POST payload; if the metadata is
judged valid, you get back a /layer/tilenumber/tileversion URL to
which you can then PUT the image data
- PUT /layer/tilenumber/tileversion
o Put the actual image data into place
- DELETE /layer/tilenumber/tileversion

What about overviews? Creating and uploading overviews will become the
province of the data loading tools. The server will only operate
tile-by-tile.
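The upload handshake implied by these URIs might look like the
following network-free sketch; the path helper, layer name, and tile
number are illustrative, not a fixed API:

```python
def tile_uri(layer, tilenumber, suffix=None):
    """Build the writable-URI paths described above. tilenumber is a
    Bing-style quadkey string; suffix is a tile version number,
    "metadata", or "new". (Helper name is illustrative.)"""
    path = "/%s/%s" % (layer, tilenumber)
    if suffix is not None:
        path = "%s/%s" % (path, suffix)
    return path

# The upload handshake, step by step (hypothetical layer and tile):
#   1. POST a metadata fragment to tile_uri("imagery", "0231", "new")
#   2. the server replies with a versioned URL, e.g.
#      tile_uri("imagery", "0231", 7)
#   3. PUT the actual image bytes to that versioned URL
print(tile_uri("imagery", "0231", "metadata"))  # -> /imagery/0231/metadata
```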

Uploading Data
~~~~~~~~~~~~~~

A desktop GDAL utility will take in image files, request mandatory
metadata (capture date, capture source, user id, and which layer to
upload to -- initially there will be only two layers, "global imagery
latlon" and "global imagery mercator"), and then get to work:
reproject the raw data into the layer projection, overlay the tile
grid, and clip out just those tiles fully contained in the data frame
(we'll be throwing away some edge imagery, but we're trying to be
brutalist and effective here). Up-sample for overviews and repeat.
Upload each tile to the server, and generate a final report of which
tiles were accepted and which ones were rejected (some overviews will
probably get rejected on the basis of a policy maintaining some
global overview like bluemarble or landsat)
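The "clip out just those tiles fully contained in the data frame"
step reduces to index arithmetic. This sketch assumes a simple
2^z x 2^z grid over the whole lat/lon world; a real tool would use
the layer's actual tiling scheme:

```python
import math

def contained_tiles(minx, miny, maxx, maxy, z):
    """Indices of the tiles at zoom z that fall entirely inside the
    data frame (partial edge tiles are discarded, as described above).
    Assumes a 2^z x 2^z grid over the lat/lon world; a real tool
    would use the layer's actual tiling scheme."""
    wminx, wminy, wmaxx, wmaxy = -180.0, -90.0, 180.0, 90.0
    tw = (wmaxx - wminx) / 2 ** z   # tile width in degrees
    th = (wmaxy - wminy) / 2 ** z   # tile height in degrees
    x0 = math.ceil((minx - wminx) / tw)    # first fully-inside column
    y0 = math.ceil((miny - wminy) / th)    # first fully-inside row
    x1 = math.floor((maxx - wminx) / tw)   # one past the last column
    y1 = math.floor((maxy - wminy) / th)   # one past the last row
    return [(x, y) for x in range(x0, x1) for y in range(y0, y1)]

# A frame covering lon -100..100, lat -50..50 at zoom 2 fully
# contains a 2x2 block of tiles:
print(len(contained_tiles(-100, -50, 100, 50, 2)))  # -> 4
```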

Can we do this server-side? Yes, we could, as a separate project. The
key here is to describe the mechanics of uploading, and the best
practices (attempt to upload overviews as well as best resolution) and
then any upload tool can populate the system. The infrastructure of
storage and serving tiles is separate from the infrastructure of
preparing and uploading.

Distributing Query Load
~~~~~~~~~~~~~~~~~~~~~~~

The overlay network and cache management application layers should
nicely distribute the data to appropriate localities. The next step is
to try to match the locality of users to the locality of data. At the
most basic level, having new OpenTile servers automatically register
with a central DNS service would allow us to round-robin requests to
national-level DNS aliases (ca.opentiles.org) to appropriate national
servers (server1.ca.opentiles.org). Doing distribution at a DNS level
seems preferable to any system of redirects, since the number of
requests to be redirected would only grow and the redirect would add a
uniform HTTP connection setup/teardown overhead to every request.
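The register-then-round-robin idea can be modeled in a few lines.
This is a toy in-process model; a real deployment would publish
rotating records in actual DNS:

```python
class TileDNS:
    """Toy model of the proposed DNS layer: servers register under a
    national alias, and lookups round-robin across the registered
    servers. A real deployment would do this with DNS records."""

    def __init__(self):
        self.zones = {}

    def register(self, country, server):
        self.zones.setdefault(country, []).append(server)

    def resolve(self, country):
        zone = self.zones[country]
        zone.append(zone.pop(0))   # rotate for round-robin
        return zone[-1]

dns = TileDNS()
dns.register("ca", "server1.ca.opentiles.org")
dns.register("ca", "server2.ca.opentiles.org")
print(dns.resolve("ca"))  # -> server1.ca.opentiles.org
print(dns.resolve("ca"))  # -> server2.ca.opentiles.org
```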

_______________________________________________
talk mailing list
[email protected]
http://openaerialmap.org/mailman/listinfo/talk_openaerialmap.org
