Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Ian Dees
On Tue, May 12, 2009 at 4:56 PM, Bernhard zwischenbrugger <
b...@datenkueche.com> wrote:

> Hi all
>
> Is there a possibility to get all new data entered to OSM in realtime?
>
> If someone adds a new road, building, restaurant,... I would like to
> have this data.
>
> There were talks about putting this kind of data on the Jabber network.
> Is this already available?


There is no live feed of data available. The closest to live is the minutely
diffs on the planet server.
___
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Iván Sánchez Ortega
On Tuesday, 12 May 2009, Bernhard zwischenbrugger wrote:
> Is there a possibility to get all new data entered to OSM in realtime?

No, AFAIK. The closest you can get is the minutely diffs (all the changes done 
in the last minute).
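
A minimal sketch of what reading one of these diffs could look like, assuming the file has already been downloaded and decompressed and that it follows the osmChange layout (create/modify/delete blocks wrapping node, way and relation elements). The sample document, its IDs and its coordinates are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written sample in the osmChange layout; IDs and coordinates are invented.
SAMPLE_OSC = """<osmChange version="0.6">
  <create>
    <node id="1" lat="48.2" lon="16.37" version="1" timestamp="2009-05-12T21:56:10Z"/>
    <way id="10" version="1" timestamp="2009-05-12T21:56:12Z"><nd ref="1"/></way>
  </create>
  <modify>
    <node id="2" lat="51.5" lon="-0.12" version="3" timestamp="2009-05-12T21:56:30Z"/>
  </modify>
  <delete>
    <node id="3" lat="40.4" lon="-3.7" version="2" timestamp="2009-05-12T21:56:45Z"/>
  </delete>
</osmChange>"""


def summarise_changes(osc_text):
    """Count (action, element type) pairs in an osmChange document."""
    root = ET.fromstring(osc_text)
    counts = Counter()
    for action in root:          # <create>, <modify> or <delete>
        for element in action:   # <node>, <way> or <relation>
            counts[(action.tag, element.tag)] += 1
    return counts


def edited_node_positions(osc_text):
    """Return (lat, lon) for every node that carries coordinates."""
    root = ET.fromstring(osc_text)
    return [
        (float(node.get("lat")), float(node.get("lon")))
        for node in root.iter("node")
        if node.get("lat") is not None and node.get("lon") is not None
    ]


counts = summarise_changes(SAMPLE_OSC)
positions = edited_node_positions(SAMPLE_OSC)
print(counts)
print(positions)
```

The node positions are probably the most useful part for anything that wants to show where editing is happening.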

> If someone adds a new road, building, restaurant,... I would like to
> have this data.

Well, head on to planet.openstreetmap.org and download planets and diffs to 
your heart's content :-)


Cheers,
-- 
--
Iván Sánchez Ortega 

Notice: This e-mail is confidential and should not be used by anyone who is
not the original recipient. Reproduction by photocopy, walkie-talkie, amateur
radio, satellite, cable television, projector, smoke signals, Morse code,
braille, sign language, shorthand or any other means is not permitted. Under
no circumstances may this e-mail be translated into French. This e-mail may
not be ridiculed, parodied, judged in a competition, or read aloud in a funny
accent while wearing a false moustache and/or any kind of hat, including but
not limited to headscarves. Do not incite or provoke this e-mail. If you are
on medication, you may experience nausea, disorientation, hysteria, vomiting,
temporary loss of short-term memory and general malaise when reading this
e-mail. Consult your doctor or pharmacist before reading this e-mail. All
models described in this e-mail are over 18 years of age. If you have
received this e-mail in error, it is probably because I was drunk when I
wrote the recipient's address.




Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread andrzej zaborowski
2009/5/13 Ian Dees :
> On Tue, May 12, 2009 at 4:56 PM, Bernhard zwischenbrugger
>  wrote:
>>
>> Hi all
>>
>> Is there a possibility to get all new data entered to OSM in realtime?
>>
>> If someone adds a new road, building, restaurant,... I would like to
>> have this data.
>>
>> There were talks about putting this kind of data on the Jabber network.
>> Is this already available?
>
> There is no live feed of data available. The closest to live is the minutely
> diffs on the planet server.

You can in theory extract all edits, at higher than 1 minute
granularity, from http://www.openstreetmap.org/browse/changesets
together with all history.  (From the minutely diffs if a new way is
created and deleted in the same minute, you would never know about it)

Cheers



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Iván Sánchez Ortega
On Wednesday, 13 May 2009, andrzej zaborowski wrote:
> From the minutely diffs if a new way is created and deleted in the same 
> minute, you would never know about it 

Can't you get the changeset IDs from the diff, then query the API to know the 
exact time of the changeset?

-- 
--
Iván Sánchez Ortega 

"Good people do not need laws to tell them to act responsibly, while bad 
people will find a way around the laws."
 - Plato (427-347 B.C.)




Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Matt Amos
On Tue, May 12, 2009 at 11:50 PM, andrzej zaborowski  wrote:
> You can in theory extract all edits, at higher than 1 minute
> granularity, from http://www.openstreetmap.org/browse/changesets
> together with all history.  (From the minutely diffs if a new way is
> created and deleted in the same minute, you would never know about it)

in theory, yes, but please don't as it puts extra strain on the
servers. please use the minute diffs from the planet server instead
:-)

cheers,

matt



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Tom Hughes
andrzej zaborowski wrote:

> You can in theory extract all edits, at higher than 1 minute
> granularity, from http://www.openstreetmap.org/browse/changesets
> together with all history.  (From the minutely diffs if a new way is
> created and deleted in the same minute, you would never know about it)

Anybody trying such a stunt will be liable to summary blocking when 
caught however.

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Matt Amos
2009/5/13 Iván Sánchez Ortega :
> On Wednesday, 13 May 2009, andrzej zaborowski wrote:
>> From the minutely diffs if a new way is created and deleted in the same
>> minute, you would never know about it
>
> Can't you get the changeset IDs from the diff, then query the API to know the
> exact time of the changeset?

the way (and the changeset it's in) may not even appear in the diff.
also, changesets are not atomic, so they don't have a single time -
they have a created_at time and a closed_at time which can be up to
24h apart.
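
To make the point about changeset times concrete, here is a small sketch that reads the created_at and closed_at attributes from a changeset document and computes how far apart they are. The sample XML is hand-written with invented values; it assumes timestamps in the usual UTC "...Z" form.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Hand-written sample; the ID and times are invented for illustration.
SAMPLE_CHANGESET = """<osm version="0.6">
  <changeset id="123" created_at="2009-05-12T09:00:00Z"
             closed_at="2009-05-12T21:30:00Z" open="false"/>
</osm>"""


def parse_time(value):
    """Parse a UTC timestamp such as 2009-05-12T09:00:00Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def changeset_window(xml_text):
    """Return (created, closed); closed is None while the changeset is open."""
    changeset = ET.fromstring(xml_text).find("changeset")
    created = parse_time(changeset.get("created_at"))
    closed_raw = changeset.get("closed_at")
    closed = parse_time(closed_raw) if closed_raw else None
    return created, closed


created, closed = changeset_window(SAMPLE_CHANGESET)
span = closed - created
print(span)
```

An edit inside this changeset could have happened at any moment within that window, so the changeset alone does not pin down when a particular way was created or deleted.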

however, brett is testing a new form of diffs that contain all edits,
which should solve that problem.

cheers,

matt



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Frederik Ramm
Hi,

Tom Hughes wrote:
> Anybody trying such a stunt will be liable to summary blocking when 
> caught however.

I was waiting for that ;-)

To be just slightly more constructive, the least invasive way of 
querying the API for new data only without changing the code would be to 
make multi-GETs for batches of object IDs just above the highest known 
object ID. That would probably not disrupt services if done by one user, 
but then if one user is allowed to do it, what can we say if 10 others 
wanted to do the same?

Probably the best way to have a live feed - and a technique that has 
been discussed on dev about two years ago - would be to have the rails 
code log all successful database operations into a file which could then 
be retrieved by an independent daemon and fed into whatever distribution 
network you want. That would be about the same thing that database 
replication does, just on a higher level.
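
A rough sketch of the reading side of such a scheme, assuming a purely hypothetical log file with one change record per line. Nothing here reflects actual rails code; the file name and record format are invented. The reader remembers a byte offset and only consumes complete lines, so a record that is still being written is left for the next poll.

```python
import os
import tempfile


def read_new_lines(path, offset):
    """Return (lines, new_offset) for complete lines appended after `offset`."""
    with open(path, "rb") as log:
        log.seek(offset)
        data = log.read()
    # Consume only up to the last newline; a half-written record stays unread.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8").splitlines()
    return lines, offset + end


# Demonstration with a temporary file standing in for the hypothetical log.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "changes.log")
    with open(log_path, "wb") as log:
        log.write(b"create node 1\nmodify way 10\ndelete no")  # last record incomplete
    first, offset = read_new_lines(log_path, 0)
    with open(log_path, "ab") as log:
        log.write(b"de 3\n")  # the writer finishes the record
    second, offset = read_new_lines(log_path, offset)

print(first)
print(second)
```

A real daemon would call read_new_lines in a loop and forward each line to whatever distribution network is chosen.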

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Tom Hughes
Frederik Ramm wrote:

> Probably the best way to have a live feed - and a technique that has 
> been discussed on dev about two years ago - would be to have the rails 
> code log all successful database operations into a file which could then 
> be retrieved by an independent daemon and fed into whatever distribution 
> network you want. That would be about the same thing that database 
> replication does, just on a higher level.

It's a completely insane solution though. If we want to do it we should
just do it properly in the database not fart around with stupid hacks in 
the rails code that break as soon as any updates are not done via rails.

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Matt Amos
On Wed, May 13, 2009 at 12:36 AM, Frederik Ramm  wrote:
> To be just slightly more constructive, the least invasive way of
> querying the API for new data only without changing the code would be to
> make multi-GETs for batches of object IDs just above the highest known
> object ID. That would probably not disrupt services if done by one user,
> but then if one user is allowed to do it, what can we say if 10 others
> wanted to do the same?

the least invasive way is to use the minutely diffs, as it doesn't
touch the API or DB servers at all.

> Probably the best way to have a live feed - and a technique that has
> been discussed on dev about two years ago - would be to have the rails
> code log all successful database operations into a file which could then
> be retrieved by an independent daemon and fed into whatever distribution
> network you want. That would be about the same thing that database
> replication does, just on a higher level.

given that there are more efficient ways of doing the database
replication than aggregating these feeds from all the different API
servers into a coherent whole, i think it's probably better to continue
creating the feed (i.e: diffs) from the database.

unless, of course, you're talking about twittering the updates. that
would be teh moar ;-)

cheers,

matt



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Frederik Ramm
Hi,

Tom Hughes wrote:
> It's a completely insane solution though. If we want to do it we should
> just do it properly in the database not fart around with stupid hacks in 
> the rails code that break as soon as any updates are not done via rails.

Assuming for a moment that the database was our bottleneck, something 
that can be done by "farting around" on a number of easily scalable API 
servers would of course compare favourably to burdening the 
not-so-scalable database with triggers and extra write operations, would 
it not?

Now I don't know how often you manually modify database contents, but I 
would think that any operation of a scale that would lead us to bypass 
the rails API would also be very likely to blow apart anyone who listens 
for edits downstream, so in my eyes there's not much to be gained by 
streaming these "manual override" kinds of edits as well.

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Bernhard zwischenbrugger
Hi
> http://planet.openstreetmap.org/minute/
>
That's perfect!!!

Is there also a file with the *newest* data?
Or do I have to read the timestamp file?

I don't want to synchronize a database. The thing I'm thinking
about is a visualization of the current activity.

Bernhard



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Frederik Ramm
Hi,

Bernhard zwischenbrugger wrote:
> I don't want to synchronize a database. The thing I'm thinking
> about is a visualization of the current activity.

Google for "OSMAware" for some inspiration!

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Matt Amos
On Wed, May 13, 2009 at 1:10 AM, Bernhard zwischenbrugger
 wrote:
> Hi
>> http://planet.openstreetmap.org/minute/
>>
> That's perfect!!!
>
> Is there also a file with the *newest* data?
> Or do I have to read the timestamp file?

reading the timestamp.txt is the best way to do it.
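
A sketch of how polling that file might work. The URL simply joins the directory and the file name mentioned in this thread, and the content format (a single UTC timestamp like 2009-05-12T23:56:00Z) is an assumption, not something confirmed here. The demonstration uses canned strings instead of network requests.

```python
from datetime import datetime, timezone
from urllib.request import urlopen

# Directory and file name as mentioned in the thread; treat as an assumption.
TIMESTAMP_URL = "http://planet.openstreetmap.org/minute/timestamp.txt"


def parse_timestamp(text):
    """Parse the assumed file content, e.g. '2009-05-12T23:56:00Z'."""
    stamp = datetime.strptime(text.strip(), "%Y-%m-%dT%H:%M:%SZ")
    return stamp.replace(tzinfo=timezone.utc)


def fetch_timestamp(url=TIMESTAMP_URL):
    """Download and parse the timestamp file (not called in the demo below)."""
    with urlopen(url, timeout=30) as response:
        return parse_timestamp(response.read().decode("ascii"))


def is_newer(current, last_seen):
    """True when the server reports data we have not processed yet."""
    return last_seen is None or current > last_seen


# Demonstration: three polls, the second one sees no change.
polled = ["2009-05-12T23:55:00Z\n", "2009-05-12T23:55:00Z\n", "2009-05-12T23:56:00Z\n"]
last_seen = None
fresh = []
for text in polled:
    current = parse_timestamp(text)
    if is_newer(current, last_seen):
        fresh.append(current)
        last_seen = current

print(fresh)
```

Polling about once a minute should be enough, since the diffs are not produced more often than that.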

> I don't want to synchronize a database. The thing I'm thinking
> about is a visualization of the current activity.

these might be of interest:

http://matt.sandbox.cloudmade.com/
http://trac.openstreetmap.org/browser/applications/utils/export/tile_expiry
http://lists.openstreetmap.org/pipermail/dev/2009-February/013934.html
http://vimeo.com/4548155



cheers,

matt



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Shaun McDonald
Frederik,

On 13 May 2009, at 01:01, Frederik Ramm wrote:

> Hi,
>
> Tom Hughes wrote:
>> It's a completely insane solution though. If we want to do it we should
>> just do it properly in the database not fart around with stupid hacks in
>> the rails code that break as soon as any updates are not done via rails.
>
> Assuming for a moment that the database was our bottleneck, something
> that can be done by "farting around" on a number of easily scalable API
> servers would of course compare favourably to burdening the
> not-so-scalable database with triggers and extra write operations, would
> it not?
>
> Now I don't know how often you manually modify database contents, but I
> would think that any operation of a scale that would lead us to bypass
> the rails API would also be very likely to blow apart anyone who listens
> for edits downstream, so in my eyes there's not much to be gained by
> streaming these "manual override" kinds of edits as well.
>

I really don't want to be trying to collate the edits from the api
server logs. For a start they don't contain all the information that
you would need.

There needs to be a fix for the Osmosis method where things are  
committed with a huge delay from the timestamp, however that is still  
the best method of distributing updates of the OSM data. It is then up  
to someone else to do what they like with that. If you want to  
summarise each minutely diff and twitter it, be my guest, though  
remember you need to compress it into 140 chars.

Shaun




Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Frederik Ramm
Matt,

> the least invasive way is to use the minutely diffs, as it doesn't
> touch the API or DB servers at all.

Sure, but they are (a) delayed by 5 minutes and (b) broken ;-)

I was initially opposed to the concept of diffs. I remember a developer 
meeting in Essen in 2007 where I rather violently requested more 
frequent updates and NickB said something like "we could do daily or 
hourly diffs" and I said "I want the f*ing real thing, not canned diffs".

I must say that, especially with the convenience Osmosis brings in 
dealing with them, I have meanwhile changed my mind. The diffs are a 
very crude solution but they work remarkably well, and they are quite 
robust compared to some kind of replication feed that may go out of sync 
at any time.

I still think that there are use cases for almost-realtime feeds but the 
diffs work for most people. - I didn't know the original poster was 
unaware of the diffs; I assumed he must know the diffs and was looking 
for something better!

> given that there are more efficient ways of doing the database
> replication than aggregating these feeds from all the different API
> servers into a coherent whole, 

As I said in another post, I was under the impression that while you can 
easily have any number of servers running API daemons on them, you'd 
rather not stuff too much into the database because at least for write 
requests we'll be stuck with it for a long while to come. But hey, maybe 
I underestimate the Postgres factor ;-)

> unless, of course, you're talking about twittering the updates. that
> would be teh moar ;-)

For once, it would not be TomH who bans an IP range then ;-)

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Frederik Ramm
Hi,

Shaun McDonald wrote:
> I really don't want to be attempting to try and collate the edits from 
> the api server logs. For a start they don't contain all the information 
> that you would need.

I was not talking about the web server logs, but special log files 
created solely for the purpose of recording, and relaying, changes.

> If you want to summarise 
> each minutely diff and twitter it, be my guest, though remember you need 
> to compress it into 140 chars.

Well, I could spread the content over 60 seconds ;-)

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Matt Amos
On Wed, May 13, 2009 at 1:27 AM, Frederik Ramm  wrote:
> Matt,
>
>> the least invasive way is to use the minutely diffs, as it doesn't
>> touch the API or DB servers at all.
>
> Sure, but they are (a) delayed by 5 minutes and (b) broken ;-)

we're working on both (a) and (b) at the moment... we'll fix it real
soon now, i promise :-)

> I was initially opposed to the concept of diffs. I remember a developer
> meeting in Essen in 2007 where I rather violently requested more frequent
> updates and NickB said something like "we could do daily or hourly diffs"
> and I said "I want the f*ing real thing, not canned diffs".

the trouble with "the f*ing real thing" is that, because it needs the
very latest information, it has to hit the database. imagine that
TF*RT is like WMS - every different request has a slightly different
lat/lon/scale, so it's basically uncacheable unless some clever things
are done. granular diffs are like tiles - you only get discrete
chunks, but it makes caching *so* much easier. in fact, you could look
at the files on planet.osm.org as direct access to the cache - no need
to hit the DB, no extra DB load which would be better used serving
editors**. :-)

> I must say that, especially with the convenience Osmosis brings in dealing
> with them, I have meanwhile changed my mind. The diffs are a very crude
> solution but they work remarkably well, and they are quite robust compared
> to some kind of replication feed that may go out of sync at any time.

exactly. because they're just files on disk they're robust against API
downtime or bugs, they're quick to download, etc...

> I still think that there are use cases for almost-realtime feeds but the
> diffs work for most people. - I didn't know the original poster was unaware
> of the diffs; I assumed he must know the diffs and was looking for something
> better!

i think we can find a compromise. if we could get the diff generation
time down from about 5 minutes (and fix (b)!) to 1-2 minutes, would
that be good enough for almost-realtime?

>> given that there are more efficient ways of doing the database
>> replication than aggregating these feeds from all the different API
>> servers into a coherent whole,
>
> As I said in another post, I was under the impression that while you can
> easily have any number of servers running API daemons on them, you'd rather
> not stuff too much into the database because at least for write requests
> we'll be stuck with it for a long while to come. But hey, maybe I
> underestimate the Postgres factor ;-)

but then a single something has to communicate with all the API
daemons, collate all the API activity, and ensure edits' atomicity,
consistency, isolation and durability... what kind of software might
have these ACID properties, i wonder? ;-)

>> unless, of course, you're talking about twittering the updates. that
>> would be teh moar ;-)
>
> For once, it would not be TomH who bans an IP range then ;-)

hey, the postgres guys were happy with OSM using postgres - why
wouldn't twitter be happy? they just re-wrote their backend for better
scalability, so we'd be doing them a favour by testing it!

cheers,

matt

**: yeah, there's going to be an overhead for pulling the minute diffs
out, but that's done once and amortised over all the consumers of the
data.



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Paul Johnson
Iván Sánchez Ortega wrote:
> On Tuesday, 12 May 2009, Bernhard zwischenbrugger wrote:
>> Is there a possibility to get all new data entered to OSM in realtime?
> 
> No, AFAIK. The closest you can get is the minutely diffs (all the changes 
> done 
> in the last minute).

It would be cool to get this automagically delivered via XMPP... that
would be handy since both XMPP and OSM are XML.





Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-12 Thread Ian Dees
Sorry, I lost the thread in Gmail here, but:
On Tue, May 12, 2009 at 7:53 PM, Matt Amos  wrote:

> >> unless, of course, you're talking about twittering the updates. that
> >> would be teh moar ;-)
> >
>

I'd like to continue this part of the thread. As was discussed by Frederik,
I think the end goal should be a real-time OSM stream of what's getting
applied to the database. Doing that in a performant way is relatively
difficult (which is why we're using Osmosis and minutely diffs right now),
but I think we should be striving for having a realtime XML feed.

If we assume that's the goal (ok, it can just be my goal and you guys can
think I'm crazy :)), what do we need to think about or plan for in the
future to make it happen?

DB triggers? API collation? Realtime data stream server**?

I'd love to hear lively, continued discussion on this topic.

-Ian

** Currently, my day job is writing server software for medical devices that
does "broadcast" streams of XML data over TCP/HTTP channels. I'd love to
spend some time working on this if I knew there was a source for the data.


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Tom Hughes
Ian Dees wrote:

> I'd like to continue this part of the thread. As was discussed by 
> Frederik, I think the end goal should be a real-time OSM stream of 
> what's getting applied to the database. Doing that in a performant way 
> is relatively difficult (which is why we're using Osmosis and minutely 
> diffs right now), but I think we should be striving for having a 
> realtime XML feed.

I have to say I don't see any great reason to strive for it. I don't 
think anybody has ever given a use case which requires such a stream and 
can't work with the diffs.

Given that such a stream is uncacheable (and hence requires much higher 
bandwidth outgoing from the core servers) and much more fragile than the 
diffs, it is not obvious that we should put what would undoubtedly be a 
huge amount of effort into creating and maintaining such a system rather 
than into doing other things.

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Lennard
Matt Amos wrote:

> these might be of interest:
> 
> http://matt.sandbox.cloudmade.com/

Which would have been fine and dandy in the past, but somebody needs to 
nudge that one into life again, /me thinks.

-- 
Lennard



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Tom Hughes
Frederik Ramm wrote:

> Tom Hughes wrote:
>> It's a completely insane solution though. If we want to do it we
>> should just do it properly in the database not fart around with stupid 
>> hacks in the rails code that break as soon as any updates are not done 
>> via rails.
> 
> Assuming for a moment that the database was our bottleneck, something 
> that can be done by "farting around" on a number of easily scalable API 
> servers would of course compare favourably to burdening the 
> not-so-scalable database with triggers and extra write operations, would 
> it not?

The fact that the servers are easily scalable is part of the problem as 
it means that any such logging system involves merging the actions of 
some 80 or so processes spread over 4 separate machines (at present).

That either means some complicated and fragile locking scheme to control 
who is writing to the log at any given time or some scheme for merging a 
whole load of separate logs.
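
The "merging a whole load of separate logs" option can be sketched as a k-way merge, assuming each process writes its own log already sorted by timestamp. The record layout and process names below are invented for illustration.

```python
import heapq


def merge_logs(*logs):
    """Merge per-process logs, each sorted by timestamp, into one ordered list."""
    return list(heapq.merge(*logs, key=lambda record: record[0]))


# Hypothetical records: (UTC timestamp, process name, operation).
log_a = [
    ("2009-05-13T08:00:01Z", "api-1", "create node 1"),
    ("2009-05-13T08:00:07Z", "api-1", "modify way 10"),
]
log_b = [
    ("2009-05-13T08:00:03Z", "api-2", "delete node 3"),
]
log_c = [
    ("2009-05-13T08:00:02Z", "api-3", "create node 4"),
    ("2009-05-13T08:00:09Z", "api-3", "modify node 4"),
]

merged = merge_logs(log_a, log_b, log_c)
for record in merged:
    print(record)
```

The merge itself is cheap. The fragile part is everything around it: the result is only correct if the clocks on all machines agree and if every log is complete up to the cut-off time before the merge runs.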

> Now I don't know how often you manually modify database contents, but I 
> would think that any operation of a scale that would lead us to bypass 
> the rails API would also be very likely to blow apart anyone who listens 
> for edits downstream, so in my eyes there's not much to be gained by 
> streaming these "manual override" kinds of edits as well.

I'm not thinking about manual modifications. I'm thinking about things 
like the gpx import that are no longer in rails. I think that is only 
likely to spread to include much of the API in the not too distant future.

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Bernhard zwischenbrugger
Hi

Maybe you like this:
http://datenkueche.com/osmlive/

If I get nice feedback I will make it zoomable.

Bernhard



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Matt Amos
On Wed, May 13, 2009 at 8:43 AM, Lennard  wrote:
> Matt Amos wrote:
>
>> these might be of interest:
>>
>> http://matt.sandbox.cloudmade.com/
>
> Which would have been fine and dandy in the past, but somebody needs to
> nudge that one into life again, /me thinks.

yeah, sorry. it's on my todo list ;-)

cheers,

matt



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees
On Wed, May 13, 2009 at 2:41 AM, Tom Hughes  wrote:

> Ian Dees wrote:
>
>> I'd like to continue this part of the thread. As was discussed by
>> Frederik, I think the end goal should be a real-time OSM stream of what's
>> getting applied to the database. Doing that in a performant way is
>> relatively difficult (which is why we're using Osmosis and minutely diffs
>> right now), but I think we should be striving for having a realtime XML
>> feed.
>>
>
> I have to say I don't see any great reason to strive for it.


Because it's there? Why are we striving to cover the globe with map data? :)


> I don't think anybody has ever given a use case which requires such a
> stream and can't work with the diffs.


I agree, but the point is that minutely-diffs are a minute old. At some
point in the future someone will want to see the data in real time as a
stream. The only reason I can currently think of is because they don't want
to have to deal with downloading the minutely diffs and would rather read a
stream of XML messages, applying each one to their database somehow as they
came in.

> Given that such a stream is uncacheable (and hence requires much higher
> bandwidth outgoing from the core servers)


The stream would be uncacheable, but could be repeated by others outside of
the core server so that the bandwidth load was spread amongst the community.


> and much more fragile than the diffs, it is not obvious that we should put
> what would undoubtedly be a huge amount of effort into creating and
> maintaining such a system rather than into doing other things.


Ok, this I'll agree on. My original post was just to talk about it... not
really to do it. But it sounds like we should take "baby" steps. Let's work
on the minutely diffs first and if some crazy person comes up with a good
use case for streaming, we can talk about it then.


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Jonathan Bennett
Ian Dees wrote:
>> I don't think anybody has ever given a use case which requires such
>> a stream and can't work with the diffs.
> 
> 
> I agree, but the point is that minutely-diffs are a minute old. At some
> point in the future someone will want to see the data in real time as a
> stream. The only reason I can currently think of is because they don't
> want to have to deal with downloading the minutely diffs and would
> rather read a stream of XML messages, applying each one to their
> database somehow as they came in.

The updates to the database aren't records of real-time, real-world
events; They're just mappers updating parts of the map. Anything which
analyses that, rather than the data itself as a whole is just
navel-gazing. It tells you something about the project, but not the
world it's mapping.

You're not missing out on anything by having minute-old data. We're not
recording how the world is changing, we're just making our map more
accurate.

As for wanting updates in a different format: Patches Welcome.



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Peter Childs
2009/5/13 Ian Dees :

> Ok, this I'll agree on. My original post was just to talk about it... not
> really to do it. But it sounds like we should take "baby" steps. Let's work
> on the minutely diffs first and if some crazy person comes up with a good
> use case for streaming, we can talk about it then.
>

The problem is that you can't rebuild the map from a continuing
stream. This is the problem with database replication in general.

If you lose the stream for any reason you have to start again, which
is a nightmare.
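
One way to picture this: a consumer that applies messages in order and has to give up as soon as one is missing. The sequence numbers and the StreamGap exception are hypothetical, used only to illustrate the failure mode.

```python
class StreamGap(Exception):
    """Raised when a message is missing and the local copy can no longer be trusted."""

    def __init__(self, last_applied, received):
        super().__init__(f"expected {last_applied + 1}, got {received}")
        self.last_applied = last_applied
        self.received = received


def consume(messages, apply):
    """Apply (sequence, payload) messages in order; raise StreamGap on a hole."""
    last = None
    for sequence, payload in messages:
        if last is not None and sequence != last + 1:
            raise StreamGap(last, sequence)
        apply(payload)
        last = sequence
    return last


# Demonstration: message 3 was lost in transit.
applied = []
gap = None
try:
    consume([(1, "create node 1"), (2, "modify way 10"), (4, "delete node 3")], applied.append)
except StreamGap as error:
    gap = error

print(applied)
print(gap)
```

With files on a server the equivalent situation is harmless: the missing file can simply be downloaded later.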

Things that can be streamed, like TV and radio, change over time and you
don't need what's gone before. If you miss the beginning of a 24x7 news
channel or a soap you can still work out what's going on after a
few minutes. With a map you have not got a chance.

Maps worry about quality. Streams don't.

I'm not quite sure how your stream would work.

In theory great; in practice it doesn't work.

Peter.



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Frederik Ramm
Hi,

Peter Childs wrote:
> The Problem is that you can't rebuild the map from a continuing
> stream, This is the problem with Database Replication in general.

True, but maybe the stream use cases don't require that? Maybe it is 
more important for an application to know in an instant where something 
is being edited, than having complete knowledge of what has been edited 
yesterday?

I don't have a killer app in mind where I would say "this works with a 
stream and doesn't work with minute diffs". But I can think of a number 
of applications that would be cooler with a proper stream. I mean, just 
look at Bernhard's application:

http://datenkueche.com/osmlive/

It looks very cool and you have the individual spots lighting up in 
something that looks like "real time" but then it is five minutes 
delayed and based on chunked diffs - meaning what you see is a 
fabricated replay of what has probably happened, and not "the real 
thing". Which diminishes the coolness, if only slightly.

Now I'm not saying we should turn the database inside out to support 
fractionally more coolness.

But saying: "We don't intend to support this because we cannot think of 
an application that absolutely requires it", is quite un-OSM, is it not?

Bye
Frederik



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Iván Sánchez Ortega
On Wednesday, 13 May 2009, Ian Dees wrote:
> [...] the point is that minutely-diffs are a minute old. At some point in 
> the future someone will want to see the data in real time as a stream.

If you can't wait *one* minute to see the data, you have a very acute case of 
OSMOCD, and you should see a psychiatrist.

> The only reason I can currently think of is because they don't want 
> to have to deal with downloading the minutely diffs and would rather read a
> stream of XML messages, applying each one to their database somehow as they
> came in.

As a wise man once said, "all problems in computer science can be solved by 
adding another indirection layer".

If you really really want a stream, I'm positive it can be hacked with a 
couple of scripts and the minutely diffs.
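
A minimal sketch of what such a script might look like, assuming a
minutely diff has already been downloaded and decompressed into an
osmChange document. The function name and the sample data are made up
for illustration; only the osmChange layout (create/modify/delete
blocks holding node/way/relation elements) comes from the OSM format.

```python
import xml.etree.ElementTree as ET


def iter_changes(osmchange_xml):
    """Yield (action, element_type, element_id) for each element in an osmChange document."""
    root = ET.fromstring(osmchange_xml)
    for action in root:            # <create>, <modify> or <delete> blocks
        for element in action:     # <node>, <way> or <relation>
            yield action.tag, element.tag, int(element.get("id"))


# Hypothetical sample diff, not real OSM data.
SAMPLE = """<osmChange version="0.6">
  <create><node id="101" lat="48.2" lon="16.4" version="1"/></create>
  <modify><way id="55" version="3"><nd ref="101"/></way></modify>
  <delete><node id="7" lat="0" lon="0" version="2"/></delete>
</osmChange>"""

for change in iter_changes(SAMPLE):
    print(change)
```

Each yielded tuple could then be pushed to whatever transport the
listeners use, which is what turns the chunked diff into a (delayed)
stream.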

-- 
--
Iván Sánchez Ortega 

A computer is not a television or a microwave; it is a complex tool.




Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees
2009/5/13 Iván Sánchez Ortega 

> As a wise man once said, "all problems in computer science can be solved by
> adding another indirection layer".
>
> If you really really want a stream, I'm positive it can be hacked with a
> couple of scripts and the minutely diffs.


You have discovered one of my use-cases for the stream: the minutely diffs
should be generated from the stream by slicing the stream up into
minute-long segments and saving them to disk, not the other way around.

From previous discussions with Brett, this is essentially what Osmosis is
doing, but with the database as the input instead of the stream.
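
A rough sketch of the slicing step, assuming each stream event arrives
as a (timestamp, payload) pair. The event representation and the
function name are hypothetical; this is not how Osmosis does it.

```python
from collections import OrderedDict
from datetime import datetime


def slice_by_minute(events):
    """Group (timestamp, payload) pairs into buckets keyed by the minute they fall in."""
    buckets = OrderedDict()
    for timestamp, payload in events:
        minute = timestamp.replace(second=0, microsecond=0)
        buckets.setdefault(minute, []).append(payload)
    return buckets


# Hypothetical events, for illustration only.
EVENTS = [
    (datetime(2009, 5, 13, 10, 11, 5), "node 101 created"),
    (datetime(2009, 5, 13, 10, 11, 48), "way 55 modified"),
    (datetime(2009, 5, 13, 10, 12, 2), "node 7 deleted"),
]

for minute, payloads in slice_by_minute(EVENTS).items():
    print(minute.isoformat(), payloads)
```

Each bucket would then be written to disk as one minutely diff.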


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Jonathan Bennett
Frederik Ramm wrote:
> But saying: "We don't intend to support this because we cannot think of 
> an application that absolutely requires it", is quite un-OSM, is it not?

Qualify "application" as "application which actually uses the geodata",
and it's not so far off the mark. We don't need a million tools that
just tell us where people are mapping.
-- 
Jonathan (Jonobennett)



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread andrzej zaborowski
2009/5/13 Jonathan Bennett :
> Ian Dees wrote:
>>>     I don't think anybody has ever given a use case which requires such
>>>    a stream and can't work with the diffs.
>>
>>
>> I agree, but the point is that minutely-diffs are a minute old. At some
>> point in the future someone will want to see the data in real time as a
>> stream. The only reason I can currently think of is because they don't
>> want to have to deal with downloading the minutely diffs and would
>> rather read a stream of XML messages, applying each one to their
>> database somehow as they came in.
>
> The updates to the database aren't records of real-time, real-world
> events; They're just mappers updating parts of the map. Anything which
> analyses that, rather than the data itself as a whole is just
> navel-gazing. It tells you something about the project, but not the
> world it's mapping.
>
> You're not missing out on anything by having minute-old data.

You might be missing out on a cool visualisation tool though (maybe
what Bernhard is trying to do is similar), but that's the only use
case I can think of right now.

What is a little worrying is that, as far as I see, there's no simple
way to get a copy of the osm data (as in, everything that's in the
database), even a week old -- because the planet file is only a
"projection" of the data on a plane.  AFAIK Wikipedia manages to
provide full database dumps so technically it should also be possible
for OSM as we still (?) have less data and less traffic than WP.

I'd think the streaming/download and upload (merging) of new data are
two separable tasks that can be provided by separate servers with db
replication between them.

Cheers



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Jonathan Bennett
andrzej zaborowski wrote:
> You might be missing out on a cool visualisation tool though (maybe
> what Bernhard is trying to do is similar), but that's the only use
> case I can think of right now.

How does that help anyone a) use the data, or b) improve the data? See
ITO's OSM Mapper if you want a *useful* visualisation tool. No live
stream needed there.

> What is a little worrying is that, as far as I see, there's no simple
> way to get a copy of the osm data (as in, everything that's in the
> database), even a week old -- because the planet file is only a
> "projection" of the data on a plane.  

I have no idea what a "projection of the data on a plane" is, unless
you're talking about an in-flight OSM movie. The planet file is
everything that's in the database, barring history info. Nothing more,
nothing less.

-- 
Jonathan (Jonobennett)



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Matt Amos
2009/5/13 Ian Dees :
> 2009/5/13 Iván Sánchez Ortega 
>>
>> As a wise man once said, "all problems in computer science can be solved
>> by
>> adding another indirection layer".
>>
>> If you really really want a stream, I'm positive it can be hacked with a
>> couple of scripts and the minutely diffs.

+1

> You have discovered one of my use-cases for the stream: the minutely diffs
> should be generated from the stream by slicing the stream up into
> minute-long segments and saving them to disk, not the other way around.

why not?

when it's done the other way around it's far, far simpler - just xml
files on disk.

cheers,

matt



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees
On Wed, May 13, 2009 at 10:11 AM, Jonathan Bennett <
openstreet...@jonno.cix.co.uk> wrote:

> Frederik Ramm wrote:
> > But saying: "We don't intend to support this because we cannot think of
> > an application that absolutely requires it", is quite un-OSM, is it not?
>
> Qualify "application" as "application which actually uses the geodata",
> and it's not so far off the mark. We don't need a million tools that
> just tell us where people are mapping.


Woah! Since when can OSM tell me what sort of applications I can and can't
write with the open source data that OSM is providing**?

OSM isn't about the geodata, it's about the data. That includes the fact
that it is in the geographic domain, but it also means that we can
manipulate it or store it however we want.

** Provided it meets the requirements of the license that the data is
released under.


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Jonathan Bennett
Ian Dees wrote:
> Woah! Since when can OSM tell me what sort of applications I can and
> can't write with the open source data that OSM is providing**?

You're not being told what to do with the data, but it's being suggested
to you that you can't have it in a particular, resource-intensive format
unless you can justify why you need it over and above an existing, less
resource hungry format, for an application that does something other
than go "Ooooh, shiny!"

> OSM isn't about the geodata, it's about the data. That includes the fact
> that it is in the geographic domain, but it also means that we can
> manipulate it or store it however we want.

You can. On your own infrastructure.

-- 
Jonathan (Jonobennett)



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread andrzej zaborowski
2009/5/13 Jonathan Bennett :
> andrzej zaborowski wrote:
>> You might be missing out on a cool visualisation tool though (maybe
>> what Bernhard is trying to do is similar), but that's the only use
>> case I can think of right now.
>
> How does that help anyone a) use the data, or b) improve the data? See
> ITO's OSM Mapper if you want a *useful* visualisation tool. No live
> stream needed there.

Cool visualisation tools don't have to comply with a) or b), they just
need to be cool :)

>
>> What is a little worrying is that, as far as I see, there's no simple
>> way to get a copy of the osm data (as in, everything that's in the
>> database), even a week old -- because the planet file is only a
>> "projection" of the data on a plane.
>
> I have no idea what a "projection of the data on a plane" is, unless
> you're talking about an in-flight OSM movie. The planet file is
> everything that's in the database, barring history info.

Yup, barring history info.  One of the dimensions is thrown away, this
operation is called projection.
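
To illustrate what is meant by projection here, a small sketch,
assuming history is available as (id, version, tags) tuples. The data
layout is invented for the example and says nothing about the real
database schema.

```python
def current_state(history):
    """Keep only the highest version of each object, discarding the history dimension."""
    latest = {}
    for obj_id, version, tags in history:
        if obj_id not in latest or version > latest[obj_id][0]:
            latest[obj_id] = (version, tags)
    return latest


# Hypothetical history of two objects.
HISTORY = [
    (101, 1, {"highway": "track"}),
    (101, 2, {"highway": "residential"}),
    (102, 1, {"amenity": "restaurant"}),
]

print(current_state(HISTORY))
```

Version 1 of object 101 cannot be recovered from the result, which is
the information a planet file lacks.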

I don't say I need to have a use case for the full database, but in
any project it's only fair to give contributors a way to download the
entire database with the data they created.

Cheers



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Jonathan Bennett
andrzej zaborowski wrote:
 > Cool visualisation tools don't have to comply with a) or b), they just
> need to be cool :)

So cool you're prepared to pay for the infrastructure to support it?


-- 
Jonathan (Jonobennett)



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread andrzej zaborowski
2009/5/13 Jonathan Bennett :
> andrzej zaborowski wrote:
>  > Cool visualisation tools don't have to comply with a) or b), they just
>> need to be cool :)
>
> So cool you're prepared to pay for the infrastructure to support it?

I didn't say that.  I said there *are* things you're missing out on.

In a different mail you said:
> Ian Dees wrote:
>> OSM isn't about the geodata, it's about the data. That includes the fact
>> that it is in the geographic domain, but it also means that we can
>> manipulate it or store it however we want.
>
> You can. On your own infrastructure.

Except you can't right now, the dumps don't provide enough information
to duplicate OSM database even on your own infrastructure.

Cheers



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees
On Wed, May 13, 2009 at 10:28 AM, Jonathan Bennett <
openstreet...@jonno.cix.co.uk> wrote:

> Ian Dees wrote:
> > Woah! Since when can OSM tell me what sort of applications I can and
> > can't write with the open source data that OSM is providing**?
>
> You're not being told what to do with the data, but it's being suggested
> to you that you can't have it in a particular, resource-intensive format
> unless you can justify why you need it over and above an existing, less
> resource hungry format, for an application that does something other
> than go "Ooooh, shiny!"


The whole argument I'm making is that after the initial implementation**,
streaming the data is a lot less resource intensive than what we are
currently doing. Perhaps I don't have the whole picture of what goes on in
the backend, but at some point the changeset XML files are applied to the
database. At this point, we already have the XML changeset that was created
by the client. The stream would simply be mirroring that out to anyone
listening over a compressed HTTP channel.

Of course it could then be propagated to other servers if the bandwidth load
was too great.

One of the clients to this stream might be Osmosis, saving off chunks of
data one minute wide and sending it to planet.openstreetmap.org, for
example.

** ...and I've always said I would be willing to implement this if we
discussed it and decided there was a way to source the data in a technically
feasible way.


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Frederik Ramm
Hi,

Jonathan Bennett wrote:
> andrzej zaborowski wrote:
>> You might be missing out on a cool visualisation tool though (maybe
>> what Bernhard is trying to do is similar), but that's the only use
>> case I can think of right now.
> 
> How does that help anyone a) use the data, or b) improve the data? See
> ITO's OSM Mapper if you want a *useful* visualisation tool. No live
> stream needed there.

Who are you to say what is useful and what isn't? The presentation from 
SOTM 2007 that I remember most vividly - the "wiggly maps" - was also 
the most useless.

Every day someone says "let's not map  because it is 
useless", and our mantra is "maybe it is just your limited imagination 
that makes this look useless". Why suddenly this very different attitude 
of yours?

I fully agree that streaming is probably a niche thing, a nice-to-have 
and not a must-have, and I have no problem if the idea is treated as a 
small priority. But dismissing it just because your imagination is too 
limited...?

Bye
Frederik




Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees
On Wed, May 13, 2009 at 10:33 AM, Jonathan Bennett <
openstreet...@jonno.cix.co.uk> wrote:

> andrzej zaborowski wrote:
>  > Cool visualisation tools don't have to comply with a) or b), they just
> > need to be cool :)
>
> So cool you're prepared to pay for the infrastructure to support it?
>

I think talking about hardware infrastructure is a little premature at this
point, but yes, I would be happy to set up a server or 3 to send this
streaming data around the world and take the load off of the db/api servers.


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Bernhard zwischenbrugger
Jonathan Bennett wrote:
> andrzej zaborowski wrote:
>  > Cool visualisation tools don't have to comply with a) or b), they just
>   
>> need to be cool :)
>> 
>
> So cool you're prepared to pay for the infrastructure to support it?
>
>
>   
Putting OSM data live onto XMPP is very simple and I don't think it's
expensive.

An easy way would be to post it to a xmpp groupchat:


geodata here


After login it's just a copy to a tcp socket port 5222.
Everybody who wants the data can log into the groupchat and gets all the 
new data.
Jabber servers can handle the load without problem (not sure about that)
and maybe it's possible to use an existing Jabber server like
jabber.org or jabber.ru.
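
A small sketch of the kind of message meant here, assuming an ordinary
XMPP groupchat message stanza with the geodata carried in the body. The
room address is a placeholder, and the login and socket handling are
left out.

```python
import xml.etree.ElementTree as ET


def groupchat_stanza(room_jid, payload):
    """Build an XMPP <message type="groupchat"> stanza carrying payload as its body."""
    message = ET.Element("message", {"to": room_jid, "type": "groupchat"})
    body = ET.SubElement(message, "body")
    body.text = payload
    return ET.tostring(message, encoding="unicode")


# "osm-live@conference.example.org" is a made-up room address.
stanza = groupchat_stanza("osm-live@conference.example.org", "geodata here")
print(stanza)
```

Building the stanza with an XML library rather than string
concatenation means that any markup inside the geodata gets escaped
properly.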

I would like to see that. It would be a perfect playground for me.

Bernhard

OSM Live (6 Minutes delay):
http://datenkueche.com/osmlive/








Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Tom Hughes
Bernhard zwischenbrugger wrote:

> Putting OSM data live onto XMPP is very simple and I don't think it's
> expensive.
> 
> An easy way would be to post it to a xmpp groupchat:
> 
> 
> geodata here
> 
> 
> After login it's just a copy to a tcp socket port 5222.
> Everybody who wants the data can log into the groupchat and gets all the 
> new data.
> Jabber Servers can handle the load without problem (not sure about that 
> ) and maybe its possible to use an existing jabber server like 
> jabber.org, jabber.ru,

Yes and then as soon as your client disconnects for a second you've lost a
ton of edits and you have no way to resync.
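
One way a client could at least notice the loss, sketched under the
assumption that every stream message carries an increasing sequence
number. Nothing in the groupchat proposal provides this; the class and
the numbering are hypothetical.

```python
def missing_sequences(last_seen, newly_received):
    """Return the sequence numbers skipped between the last message seen and a new one."""
    return list(range(last_seen + 1, newly_received))


class StreamClient:
    """Tracks which messages were missed so they can be fetched some other way."""

    def __init__(self):
        self.last_seen = 0
        self.to_refetch = []

    def receive(self, sequence):
        self.to_refetch.extend(missing_sequences(self.last_seen, sequence))
        self.last_seen = max(self.last_seen, sequence)


client = StreamClient()
for seq in [1, 2, 3, 7, 8]:   # messages 4, 5 and 6 were lost during a disconnect
    client.receive(seq)
print(client.to_refetch)
```

The missed messages would still have to come from somewhere else, for
example the diffs on the planet server.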

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Tom Hughes
Ian Dees wrote:

> The whole argument I'm making is that after the initial 
> implementation**, streaming the data is a lot less resource intensive 
> than what we are currently doing. Perhaps I don't have the whole picture 
> of what goes on in the backend, but at some point the changeset XML 
> files are applied to the database. At this point, we already have the 
> XML changeset that was created by the client. The stream would simply be 
> mirroring that out to anyone listening over a compressed HTTP channel.

You don't want Potlatch's changes then? or changes made by changing 
individual objects rather than uploading diffs?

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Tom Hughes
Frederik Ramm wrote:

> I fully agree that streaming is probably a niche thing, a nice-to-have 
> and not a must-have, and I have no problem if the idea is treated as a 
> small priority. But dismissing it just because your imagination is too 
> limited...?

It's fine for people to discuss it. What I object to is people wanting 
to implement it (or anything else) on the basis that somebody might, at 
some point, find it useful.

If people have a concrete goal that they want to achieve then I'm all 
for discussing ways of achieving it but I don't see the point of 
creating technology just because we can.

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Matt Amos
On Wed, May 13, 2009 at 4:40 PM, andrzej zaborowski  wrote:
> In a different mail you said:
>> Ian Dees wrote:
>>> OSM isn't about the geodata, it's about the data. That includes the fact
>>> that it is in the geographic domain, but it also means that we can
>>> manipulate it or store it however we want.
>>
>> You can. On your own infrastructure.
>
> Except you can't right now, the dumps don't provide enough information
> to duplicate OSM database even on your own infrastructure.

they don't *yet*. brett has been working on "full" diffs, i.e: diffs
with all edits, whether they were later overridden or not. this would
allow you to fully reproduce the whole database. see
http://planet.openstreetmap.org/history/ for what's been done so far.

Frederik said:
> I fully agree that streaming is probably a niche thing, a nice-to-have
> and not a must-have, and I have no problem if the idea is treated as a
> small priority. But dismissing it just because your imagination is too
> limited...?

+1

i think if we can get the delay on the diffs down from 5 mins to under
2 mins then there's no reason why streaming can't be built on top of
the diffs and be able to support all the things people want to do with
streaming.

cheers,

matt



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Matt Amos
On Wed, May 13, 2009 at 5:15 PM, Tom Hughes  wrote:
> Ian Dees wrote:
>> The whole argument I'm making is that after the initial
>> implementation**, streaming the data is a lot less resource intensive
>> than what we are currently doing. Perhaps I don't have the whole picture
>> of what goes on in the backend, but at some point the changeset XML
>> files are applied to the database. At this point, we already have the
>> XML changeset that was created by the client. The stream would simply be
>> mirroring that out to anyone listening over a compressed HTTP channel.
>
> You don't want Potlatch's changes then? or changes made by changing
> individual objects rather than uploading diffs?

+1

or even the diffs? any diff where someone creates an element has
negative placeholder IDs, so extra work would have to be done altering
the XML to match the IDs returned by the database.
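
a sketch of that extra work, assuming the placeholder-to-real mapping is
available as a dict keyed by (element type, old id). the function name
and sample data are made up for illustration.

```python
import xml.etree.ElementTree as ET


def apply_id_mapping(osmchange_xml, id_map):
    """Replace placeholder IDs using id_map: {(element_type, old_id): new_id}."""
    root = ET.fromstring(osmchange_xml)
    for element in root.iter():
        if element.tag in ("node", "way", "relation"):
            key = (element.tag, int(element.get("id")))
            if key in id_map:
                element.set("id", str(id_map[key]))
        elif element.tag == "nd":        # way -> node reference
            key = ("node", int(element.get("ref")))
            if key in id_map:
                element.set("ref", str(id_map[key]))
        elif element.tag == "member":    # relation member reference
            key = (element.get("type"), int(element.get("ref")))
            if key in id_map:
                element.set("ref", str(id_map[key]))
    return root


# Hypothetical upload: a new node and a new way that refers to it.
UPLOAD = """<osmChange version="0.6"><create>
  <node id="-1" lat="48.2" lon="16.4"/>
  <way id="-1"><nd ref="-1"/><nd ref="42"/></way>
</create></osmChange>"""
ID_MAP = {("node", -1): 9001, ("way", -1): 7001}

rewritten = apply_id_mapping(UPLOAD, ID_MAP)
print(ET.tostring(rewritten, encoding="unicode"))
```

note that references have to be rewritten too, not just the elements'
own ids, and that node -1 and way -1 are different objects.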

and the HTTP stream would contain many osmChange documents? that won't
really work with any XML parser i know of... you'd need to pre-parse
it into separate XML documents first.
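
a naive sketch of that pre-parse step, assuming the stream is text and
every document ends with a literal closing osmChange tag. it ignores the
possibility of that string appearing inside a comment or CDATA section.

```python
import xml.etree.ElementTree as ET

CLOSING_TAG = "</osmChange>"


def split_documents(buffer):
    """Split a text buffer into complete osmChange documents plus the unfinished remainder."""
    documents = []
    while True:
        end = buffer.find(CLOSING_TAG)
        if end == -1:
            return documents, buffer
        end += len(CLOSING_TAG)
        documents.append(buffer[:end].strip())
        buffer = buffer[end:]


# Hypothetical stream contents: two whole documents and the start of a third.
BUFFER = (
    '<osmChange version="0.6"><create><node id="1" lat="0" lon="0"/></create></osmChange>\n'
    '<osmChange version="0.6"><delete><node id="2" lat="0" lon="0"/></delete></osmChange>\n'
    '<osmChange version="0.6"><modify>'
)

complete, remainder = split_documents(BUFFER)
roots = [ET.fromstring(doc) for doc in complete]
print(len(roots), repr(remainder))
```

the remainder has to be kept and prepended to the next chunk read from
the connection.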

and how would you take these XML documents on the API servers and
merge them into a consistent ordered stream, ensuring all data
dependencies are satisfied?

all of that in less work than osmosis' diff queries?

cheers,

matt



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Frederik Ramm
Hi,

Matt Amos wrote:
> i think if we can get the delay on the diffs down from 5 mins to under
> 2 mins then there's no reason why streaming can't be built on top of
> the diffs and be able to support all the things people want to do with
> streaming.

What you are talking about is "simulated streaming" not real streaming. 
But it would be a good start; establish some kind of simulated streaming 
that is based on the diffs and costs us almost nothing (can be done by 
someone on their own server off-site!), and when interesting 
applications spring from this where everybody says "oh if these could 
only be real-time instead of 2 minutes delayed" then one can still work
on providing the same stream in a live fashion.

By the way, if someone really wants to chase the edge of the database by 
always downloading the latest minute diff, what is the suggested way to 
do this? If he makes only one GET request per minute then the diff he is 
looking for might already be 59 seconds delayed ;-) can any of today's 
hip & trendy messaging protocols be used to painlessly notify anyone who 
is interested that "there's a new diff ready", instead of having 
over-eager scripts poll the directory every 10 seconds?

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees
On Wed, May 13, 2009 at 12:18 PM, Matt Amos  wrote:

> On Wed, May 13, 2009 at 5:15 PM, Tom Hughes  wrote:
> > Ian Dees wrote:
> >> The whole argument I'm making is that after the initial
> >> implementation**, streaming the data is a lot less resource intensive
> >> than what we are currently doing. Perhaps I don't have the whole picture
> >> of what goes on in the backend, but at some point the changeset XML
> >> files are applied to the database. At this point, we already have the
> >> XML changeset that was created by the client. The stream would simply be
> >> mirroring that out to anyone listening over a compressed HTTP channel.
> >
> > You don't want Potlatch's changes then? or changes made by changing
> > individual objects rather than uploading diffs?
>
> +1
>
> or even the diffs? any diff where someone creates an element has
> negative placeholder IDs, so extra work would have to be done altering
> the XML to match the IDs returned by the database.


These are implementation details that would have to be hammered out after we
talk about design.

You're right, I would prefer to have the database itself (via triggers) dump
to a file/network handle the data that's being written to it. This way, it
would be able to get everything (including Potlatch and diffs) as it was
created.


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees
On Wed, May 13, 2009 at 12:30 PM, Frederik Ramm  wrote:

> can any of today's
> hip & trendy messaging protocols be used to painlessly notify anyone who
> is interested that "there's a new diff ready", instead of having
> over-eager scripts poll the directory every 10 seconds?


The server would need to open up a socket and send out some sort of
notification to whoever is listening whenever a new diff is ready.

I imagine there might even be an intermediate server application that listens to
that notification, grabs the diff, and creates the pseudo-stream for others.
This way, the pseudo-stream would be delayed by N+60 seconds, where N is the
number of seconds it took to create/post/notify/download the diff. That's
pretty darn good.


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Matt Amos
On Wed, May 13, 2009 at 7:05 PM, Ian Dees  wrote:
> On Wed, May 13, 2009 at 12:18 PM, Matt Amos  wrote:
>> On Wed, May 13, 2009 at 5:15 PM, Tom Hughes  wrote:
>> > Ian Dees wrote:
>> >> The whole argument I'm making is that after the initial
>> >> implementation**, streaming the data is a lot less resource intensive
>> >> than what we are currently doing. Perhaps I don't have the whole
>> >> picture
>> >> of what goes on in the backend, but at some point the changeset XML
>> >> files are applied to the database. At this point, we already have the
>> >> XML changeset that was created by the client. The stream would simply
>> >> be
>> >> mirroring that out to anyone listening over a compressed HTTP channel.
>> >
>> > You don't want Potlatch's changes then? or changes made by changing
>> > individual objects rather than uploading diffs?
>>
>> +1
>>
>> or even the diffs? any diff where someone creates an element has
>> negative placeholder IDs, so extra work would have to be done altering
>> the XML to match the IDs returned by the database.
>
> These are implementation details that would have to be hammered out after we
> talk about design.
>
> You're right, I would prefer to have the database itself (via triggers) dump
> to a file/network handle the data that's being written to it. This way, it
> would be able to get everything (including Potlatch and diffs) as it was
> created.

why via triggers?

cheers,

matt



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees
On Wed, May 13, 2009 at 1:09 PM, Matt Amos  wrote:

> why via triggers?
>

Because the database is the only aggregation point for the data. There are
many API servers (which would be the ideal spot for creating this data
feed), but my initial thought was that it was quite cumbersome to try and
aggregate the streams from the various API servers (along with time-aligning
them) when the DB server was already doing that for you.
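
For comparison, a sketch of what the time-aligning would involve if it
were done outside the database, assuming each API server emits events
already sorted by its own timestamps. The server names and events are
invented, and clock skew between servers is ignored.

```python
import heapq


def merge_streams(*streams):
    """Merge per-server streams, each already sorted by timestamp, into one ordered stream."""
    return heapq.merge(*streams, key=lambda event: event[0])


# Hypothetical (timestamp, description) events from two API servers.
API_SERVER_A = [(1, "node 101 created"), (4, "way 55 modified")]
API_SERVER_B = [(2, "node 7 deleted"), (3, "relation 9 modified")]

for timestamp, description in merge_streams(API_SERVER_A, API_SERVER_B):
    print(timestamp, description)
```

The merge itself is simple; the cumbersome part is that timestamp order
across servers does not guarantee that data dependencies are satisfied.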


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Matt Amos
On Wed, May 13, 2009 at 7:13 PM, Ian Dees  wrote:
> On Wed, May 13, 2009 at 1:09 PM, Matt Amos  wrote:
>>
>> why via triggers?
>
> Because the database is the only aggregation point for the data. There are
> many API servers (which would be the ideal spot for creating this data
> feed), but my initial thought was that it was quite cumbersome to try and
> aggregate the streams from the various API servers (along with time-aligning
> them) when the DB server was already doing that for you.

sorry, i wasn't clear in my question: why triggers in particular,
rather than one of the many other features that the DB provides for
doing this?

cheers,

matt



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Matt Amos
On Wed, May 13, 2009 at 6:30 PM, Frederik Ramm  wrote:
> Matt Amos wrote:
>>
>> i think if we can get the delay on the diffs down from 5 mins to under
>> 2 mins then there's no reason why streaming can't be built on top of
>> the diffs and be able to support all the things people want to do with
>> streaming.
>
> What you are talking about is "simulated streaming" not real streaming. But
> it would be a good start; establish some kind of simulated streaming that is
> based on the diffs and costs us almost nothing (can be done by someone on
> their own server off-site!),

indeed! good, isn't it? ;-)

> and when interesting applications spring from
> this where everybody says "oh if these could only be real-time instead of 2
> minutes delayed" then one can still work on providing the same stream in a
> live fashion.

given that nothing is ever truly live - there will be a processing
delay with any method - what's the real advantage of a <1 minute delay
over a 2 minute delay?

> By the way, if someone really wants to chase the edge of the database by
> always downloading the latest minute diff, what is the suggested way to do
> this? If he makes only one GET request per minute then the diff he is
> looking for might already be 59 seconds delayed ;-)

yep... but does another 59 seconds really matter? ;-)

> can any of today's hip &
> trendy messaging protocols be used to painlessly notify anyone who is
> interested that "there's a new diff ready", instead of having over-eager
> scripts poll the directory every 10 seconds?

i guess it would be fairly easy to have a CGI script for the "next
diff", i.e: after receiving the request it blocks until a new diff is
ready and then returns that diff.
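
a sketch of the blocking part, assuming diffs are identified by an
increasing sequence number and that the request handler and the diff
publisher live in one process. the class is hypothetical and the
CGI/HTTP side is left out.

```python
import threading


class DiffNotifier:
    """Lets callers block until a diff newer than the one they already have is published."""

    def __init__(self):
        self._condition = threading.Condition()
        self._latest = 0

    def publish(self, sequence):
        """Record that diff number `sequence` is ready and wake up all waiters."""
        with self._condition:
            self._latest = sequence
            self._condition.notify_all()

    def wait_for_next(self, have, timeout=None):
        """Block until a diff newer than `have` exists (or timeout); return the latest number."""
        with self._condition:
            self._condition.wait_for(lambda: self._latest > have, timeout=timeout)
            return self._latest


notifier = DiffNotifier()
# Simulate a new diff appearing shortly after the request starts waiting.
threading.Timer(0.1, notifier.publish, args=[1]).start()
print(notifier.wait_for_next(have=0, timeout=5))
```

a client would call this in a loop, passing the number of the last diff
it has, so one open request replaces the polling.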

cheers,

matt



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees
On Wed, May 13, 2009 at 1:27 PM, Matt Amos  wrote:

> On Wed, May 13, 2009 at 7:13 PM, Ian Dees  wrote:
> > On Wed, May 13, 2009 at 1:09 PM, Matt Amos  wrote:
> >>
> >> why via triggers?
> >
> > Because the database is the only aggregation point for the data. There
> are
> > many API servers (which would be the ideal spot for creating this data
> > feed), but my initial thought was that it was quite cumbersome to try and
> > aggregate the streams from the various API servers (along with
> time-aligning
> > them) when the DB server was already doing that for you.
>
> sorry, i wasn't clear in my question: why triggers in particular,
> rather than one of the many other features that the DB provides for
> doing this?
>
>
Mostly because it would allow us to use the same XML format that everybody
already knows how to parse and because it's what I've worked with in my
limited PostgreSQL experience.

What other features were you thinking about?


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Matt Amos
On Wed, May 13, 2009 at 7:30 PM, Ian Dees  wrote:
> On Wed, May 13, 2009 at 1:27 PM, Matt Amos  wrote:
>> sorry, i wasn't clear in my question: why triggers in particular,
>> rather than one of the many other features that the DB provides for
>> doing this?
>
> Mostly because it would allow us to use the same XML format that everybody
> already knows how to parse and because it's what I've worked with in my
> limited PostgreSQL experience.

why would it allow us to use the XML format? nothing in XML ever goes
near the database.

> What other features were you thinking about?

i was looking at snapshots and transaction IDs to isolate the updated
rows in the history tables.
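
roughly the idea, sketched under the assumption that each history row
can be matched to the transaction id that wrote it and that snapshots
are in the "xmin:xmax:in-progress list" text form PostgreSQL uses for
txid snapshots. the helper names and the numbers are made up.

```python
def parse_snapshot(text):
    """Parse 'xmin:xmax:xip1,xip2,...' into (xmin, xmax, set of in-progress txids)."""
    xmin, xmax, xip = text.split(":")
    in_progress = {int(x) for x in xip.split(",") if x}
    return int(xmin), int(xmax), in_progress


def is_visible(txid, snapshot):
    """A transaction is finished, as far as the snapshot knows, if it is below xmax and not in progress."""
    _xmin, xmax, in_progress = snapshot
    return txid < xmax and txid not in in_progress


def newly_finished(txids, old_snapshot, new_snapshot):
    """Transactions that finished between the two snapshots; their rows form the next diff."""
    return [t for t in txids
            if is_visible(t, new_snapshot) and not is_visible(t, old_snapshot)]


OLD = parse_snapshot("10:13:10,12")   # taken when the previous diff was cut
NEW = parse_snapshot("12:16:12")      # taken now; transaction 12 is still running
print(newly_finished([9, 10, 11, 12, 13, 15], OLD, NEW))
```

transaction 12 is left for a later diff, so a long-running upload is
neither missed nor emitted twice, which a plain timestamp window
cannot guarantee.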

cheers,

matt



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees
On Wed, May 13, 2009 at 1:33 PM, Matt Amos  wrote:

> On Wed, May 13, 2009 at 7:30 PM, Ian Dees  wrote:
> > On Wed, May 13, 2009 at 1:27 PM, Matt Amos  wrote:
> >> sorry, i wasn't clear in my question: why triggers in particular,
> >> rather than one of the many other features that the DB provides for
> >> doing this?
> >
> > Mostly because it would allow us to use the same XML format that everybody
> > already knows how to parse and because it's what I've worked with in my
> > limited PostgreSQL experience.
>
> why would it allow us to use the XML format? nothing in XML ever goes
> near the database.
>

I meant that it would trigger some external executable that would build up
the XML, not that the database would do it.
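
As a minimal sketch of what such an external program might do with one changed
row: the row layout below (a dict with id, version, lat, lon and tags) is made
up for illustration and is not the real database schema, and node_to_xml() is
a hypothetical helper name. Only the output shape follows the familiar OSM XML
node element.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def node_to_xml(row: Dict) -> str:
    """Serialise one (hypothetical) node row as an OSM XML <node> element."""
    node = ET.Element("node", {
        "id": str(row["id"]),
        "version": str(row["version"]),
        "lat": str(row["lat"]),
        "lon": str(row["lon"]),
    })
    # Emit tags in sorted order so the output is stable between runs.
    for key, value in sorted(row.get("tags", {}).items()):
        ET.SubElement(node, "tag", {"k": key, "v": value})
    return ET.tostring(node, encoding="unicode")
```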


> > What other features were you thinking about?
>
> i was looking at snapshots and transaction IDs to isolate the updated
> rows in the history tables.


I yield to your judgment on that. I haven't given myself enough time to
explore abusing the database app for such a thing.


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Matt Amos
On Wed, May 13, 2009 at 7:36 PM, Ian Dees  wrote:
> On Wed, May 13, 2009 at 1:33 PM, Matt Amos  wrote:
>> On Wed, May 13, 2009 at 7:30 PM, Ian Dees  wrote:
>> > On Wed, May 13, 2009 at 1:27 PM, Matt Amos  wrote:
>> >> sorry, i wasn't clear in my question: why triggers in particular,
>> >> rather than one of the many other features that the DB provides for
>> >> doing this?
>> >
>> > Mostly because it would allow us to use the same XML format that everybody
>> > already knows how to parse and because it's what I've worked with in my
>> > limited PostgreSQL experience.
>>
>> why would it allow us to use the XML format? nothing in XML ever goes
>> near the database.
>
> I meant that it would trigger some external executable that would build up
> the XML, not that the database would do it.

is the external executable called osmosis?

>> > What other features were you thinking about?
>>
>> i was looking at snapshots and transaction IDs to isolate the updated
>> rows in the history tables.
>
> I yield to your judgment on that. I haven't given myself enough time to
> explore abusing the database app for such a thing.

it's better to get this done without the main db and the rails_port
code diverging too much, so i'm looking for methods which are as
un-invasive as possible.

cheers,

matt



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Ian Dees
On Wed, May 13, 2009 at 1:48 PM, Matt Amos  wrote:

> it's better to get this done without the main db and the rails_port
> code diverging too much, so i'm looking for methods which are as
> un-invasive as possible.


I agree. Since it seems like a huge amount of work to augment the current
infrastructure to support this, perhaps it would make more sense to follow
what (I think) Frederik said: use the minutely diffs to create a
pseudo-stream and see what sort of apps build up around it.
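
To make the pseudo-stream idea concrete, here is a small sketch that turns one
diff into a sequence of events. It assumes the diff is an osmChange document
with create, modify and delete blocks. Fetching and decompressing the minutely
files is left out, and change_events() and SAMPLE are illustrative names, not
part of any existing tool.

```python
import xml.etree.ElementTree as ET
from typing import Iterator, Tuple

ACTIONS = ("create", "modify", "delete")


def change_events(osmchange_xml: str) -> Iterator[Tuple[str, str, int]]:
    """Yield (action, element type, element id) for each change in a diff."""
    root = ET.fromstring(osmchange_xml)
    for action in root:
        if action.tag not in ACTIONS:
            continue  # ignore anything that is not a change block
        for element in action:
            yield action.tag, element.tag, int(element.get("id"))


# Hand-written sample document, for illustration only.
SAMPLE = """<osmChange version="0.6">
  <create><node id="1" lat="51.5" lon="-0.1" version="1"/></create>
  <modify><way id="7" version="3"><nd ref="1"/></way></modify>
  <delete><node id="2" version="4"/></delete>
</osmChange>"""

for event in change_events(SAMPLE):
    print(event)
```

A consumer would call this once per downloaded diff and hand the events to
whatever wants a feed, which gives a stream that lags by about a minute.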

What's left on making the diffs work?


Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-13 Thread Erik Johansson
This is an implementation of this idea for LiveJournal:
http://updates.sixapart.com/

It lets you connect to a TCP port and get a live XML feed of all updates on
LiveJournal. It has some cool features, such as discarding data from the
stream when you can't keep up.
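
One simple way to get that "discard when the client can't keep up" behaviour
is a bounded per-client buffer, sketched below. Dropping the oldest update
first is an assumption made for this sketch and is not necessarily what the
Six Apart stream does. LossyBuffer is a made-up name.

```python
from collections import deque
from typing import Any, Optional


class LossyBuffer:
    """Per-client buffer that discards the oldest update once it is full."""

    def __init__(self, capacity: int) -> None:
        self._items: deque = deque(maxlen=capacity)
        self.dropped = 0  # how many updates this client has missed

    def push(self, update: Any) -> None:
        # A full deque with maxlen evicts from the left on append,
        # so count the loss before it happens.
        if len(self._items) == self._items.maxlen:
            self.dropped += 1
        self._items.append(update)

    def pop(self) -> Optional[Any]:
        """Return the oldest buffered update, or None if there is none."""
        return self._items.popleft() if self._items else None
```

The server pushes every update into each client's buffer and a slow client
simply loses the oldest ones, so no single reader can hold up the others.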

/Erik



Re: [OSM-talk] Live Data - all new Data in OSM

2009-05-21 Thread Jaak Laineste
> To put OSM data live on XMPP is very simple and I don't think it's
> expensive.

Coming back to this slightly older topic: XMPP is a server-based solution, so
you will overload some server. Why not use the good old and free "Kazaa"
network, in its Skype group-chat reincarnation, so that the delivery channel
would be nicely distributed?

There could be traffic limitations in Skype, so it needs to be checked out and
tested. Also, creating a Skype plugin for generating and loading the feed
might be even easier than with XMPP.

/Jaak


