Re: [OSM-dev] How reconstrucing a way from the history?

2009-01-14 Thread Richard Fairhurst

Thomas Wood wrote:
 [updating way when node moved]
 However, I think potlatch does work in this way..

It does in 0.5 but won't in 0.6.

This does make it a bit more difficult in 0.6 for Potlatch's way revert tool
('H') to figure out what the revision dates were, which is what I'm working
on at the moment.

cheers
Richard
-- 
View this message in context: 
http://www.nabble.com/How-reconstrucing-a-way-from-the-history--tp21413719p21453282.html
Sent from the OpenStreetMap - Dev mailing list archive at Nabble.com.


___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Google doesn't find new wiki urls

2009-01-14 Thread Erik Johansson
On Sun, Jan 4, 2009 at 12:15 PM, Jochen Topf joc...@remote.org wrote:
 Somebody just asked on the German mailing list about Google not finding
 pages from our wiki any more. I just checked this and it does find some,
 but they are all still ../index.php/.. URLs. Looking at the robots.txt and
 the Sitemap, it should find the new ../wiki/.. URLs, but maybe something
 is wrong somewhere.



+1  It's impossible to find stuff on the wiki with google

http://www.google.se/search?q=site%3Awiki.openstreetmap.org+-inurl%3Aindex+-inurl%3Aimages

http://trac.openstreetmap.org/ticket/1496


Regards



Re: [OSM-dev] Google doesn't find new wiki urls

2009-01-14 Thread Tom Hughes
Erik Johansson wrote:
 On Sun, Jan 4, 2009 at 12:15 PM, Jochen Topf joc...@remote.org wrote:
 Somebody just asked on the German mailing list about Google not finding
 pages from our wiki any more. I just checked this and it does find some,
 but they are all still ../index.php/.. URLs. Looking at the robots.txt and
 the Sitemap, it should find the new ../wiki/.. URLs, but maybe something
 is wrong somewhere.

 
 
 +1  It's impossible to find stuff on the wiki with google
 
 http://www.google.se/search?q=site%3Awiki.openstreetmap.org+-inurl%3Aindex+-inurl%3Aimages
 
 http://trac.openstreetmap.org/ticket/1496

Googlebot is probably blocked, as bots tend to kill the wiki due to 
the extreme crapness of MediaWiki.

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/



Re: [OSM-dev] Google doesn't find new wiki urls

2009-01-14 Thread Ævar Arnfjörð Bjarmason
On Wed, Jan 14, 2009 at 10:36 AM, Tom Hughes t...@compton.nu wrote:
 Erik Johansson wrote:
 On Sun, Jan 4, 2009 at 12:15 PM, Jochen Topf joc...@remote.org wrote:
 Somebody just asked on the German mailing list about Google not finding
 pages from our wiki any more. I just checked this and it does find some,
 but they are all still ../index.php/.. URLs. Looking at the robots.txt and
 the Sitemap, it should find the new ../wiki/.. URLs, but maybe something
 is wrong somewhere.



 +1  It's impossible to find stuff on the wiki with google

 http://www.google.se/search?q=site%3Awiki.openstreetmap.org+-inurl%3Aindex+-inurl%3Aimages

 http://trac.openstreetmap.org/ticket/1496

 The googlebot is probably blocked as bots tend to kill the wiki due to
 the extreme crapness of mediawiki.

Bots happily hit Wikipedia all day without killing it. Is the OSM wiki
set up equivalently, i.e. with a cache in front of it, memcached to
cache various things within MW, etc.?



Re: [OSM-dev] Limit on the number of tags on a node

2009-01-14 Thread Dave Stubbs
2009/1/13 Neil Penman ianaf4...@yahoo.com:
 Hmm, trying my post again with a message created from scratch!  I didn't
 realise I couldn't just reply all to another message, change the subject and
 delete the old text!  Its a bad habit anyway so time I stopped it.

 I get a 400 error when I try and upload nodes with more than 50 tags and
 about 4,300 bytes.  This is to a test database, not
 www.openstreetmap.org/api.  Is anyone aware of any limitations?



API 0.5 concatenates node tags into a single text column for storage.
This means there isn't actually a limit on the number of tags, but if the
total size exceeds the capacity of a database text column then there
may be a problem. I think that's 2^16 bytes on MySQL, so it shouldn't
be causing you a problem.

I don't know of any limits on way tags other than a 255 char limit on
each key and value.
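
The limits mentioned above can be checked before uploading. What follows is only a minimal sketch: the `key=value;key=value` join format is an assumption about how the tags are concatenated, the function name `check_tags` is hypothetical, and the 65,535-byte figure is the capacity of a MySQL TEXT column (2^16 - 1).

```python
# Hypothetical pre-upload check; the "k=v;k=v" join format is an assumption.
MYSQL_TEXT_MAX_BYTES = 2**16 - 1   # capacity of a MySQL TEXT column
MAX_KEY_VALUE_CHARS = 255          # per-key / per-value limit mentioned above


def check_tags(tags):
    """Return a list of human-readable problems; empty means the tags fit."""
    problems = []
    for key, value in tags.items():
        if len(key) > MAX_KEY_VALUE_CHARS:
            problems.append(f"key too long: {key[:20]}...")
        if len(value) > MAX_KEY_VALUE_CHARS:
            problems.append(f"value too long for key {key[:20]}")
    # Size of all tags once concatenated into one text value.
    joined = ";".join(f"{k}={v}" for k, v in tags.items())
    size = len(joined.encode("utf-8"))
    if size > MYSQL_TEXT_MAX_BYTES:
        problems.append(f"concatenated tags are {size} bytes")
    return problems


if __name__ == "__main__":
    # Roughly the reported case: a little over 50 tags, about 4,300 bytes.
    tags = {f"key{i:02d}": "x" * 75 for i in range(52)}
    print(check_tags(tags))
```

Under these assumptions the reported case is well inside both limits, which suggests the 400 error comes from somewhere else.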

Dave



Re: [OSM-dev] Google doesn't find new wiki urls

2009-01-14 Thread Tom Hughes
Ævar Arnfjörð Bjarmason wrote:

 On Wed, Jan 14, 2009 at 10:36 AM, Tom Hughes t...@compton.nu wrote:

 The googlebot is probably blocked as bots tend to kill the wiki due to
 the extreme crapness of mediawiki.
 
 Bots happily hit Wikipedia all day without killing it, is the OSM wiki
 set up equivalently? I.e. with a cache in front of it, memcache to
 cache various things within MW etc?

Yes, we're set up a little differently.

We have one little Atom-based machine. They have a metric buttload of 
database servers serving pages to ten metric buttloads of web servers 
fronted by about a thousand metric buttloads of squid caches.

Basically Wikipedia has solved the problem of MediaWiki being a crock 
by throwing ludicrous amounts of hardware at it.

It doesn't help that we have some ridiculously complicated pages like 
Map Features which are not only hugely complicated with masses of 
templates but also get changed very frequently.

Tom

-- 
Tom Hughes (t...@compton.nu)
http://www.compton.nu/



Re: [OSM-dev] Google doesn't find new wiki urls

2009-01-14 Thread Frederik Ramm
Hi,

Tom Hughes wrote:
 We have one little Atom based machine. They have a metric buttload of 
 database servers serving pages to ten metric buttloads of web servers 
 fronted by about a thousand metric buttloads of squid caches.

Then don't be such a sissy and get us the metric buttload of servers we 
need for the wiki! Who needs to access nodes and ways when they can 
instead access wiki pages ;-)

 It doesn't help that we have some ridiculously complicated pages like 
 map features which are not only hugely complicated with masses of 
 templates but also get changed very frequently.

Jokes aside - as my other hobby I'm involved with a browser game and 
they have their own Mediawiki describing the areas you can go to and the 
items you can collect and things. A typical page is here: 
http://www.fwwiki.de/index.php/Rovonia - it uses about 40 templates. 
Some of them are very magic and set some variables that are used on 
almost every page. I recently changed one of these variables in one 
template, bringing down the whole Wiki for about two hours ;-)

So yes, MediaWiki is crap, especially when used by computer geeks who 
tend to actually use the templating features. What options do we have? 
Close down the wiki? Replace it with other wiki software? Look for 
hardware/money to throw at the problem? Perhaps we could produce a 
nightly dump into static pages that could then be indexed by Google et al.?

Bye
Frederik




Re: [OSM-dev] Google doesn't find new wiki urls

2009-01-14 Thread Ævar Arnfjörð Bjarmason
On Wed, Jan 14, 2009 at 1:25 PM, Frederik Ramm frede...@remote.org wrote:
 So yes, MediaWiki is crap, especially when used by computer geeks who tend
 to actually use the templating features. What options do we have? Close down
 the Wiki? Replace it with another Wiki software? Look for hardware/money to
 throw at the problem? Perhaps we would produce a nightly dump into static
 pages that could then be indexed by google et al?

The first thing that should be done is to ensure that MediaWiki's
caching infrastructure, both internal (memcached, message cache etc.)
and external (squid), is correctly set up and all working optimally
together. I discussed this briefly with TomH on IRC, who told me to
contact Grant regarding this, which I've done. I'd be happy to take a
look at it to see if anything can be improved.

If it's correctly set up, requests from anonymous users should either
bypass MW entirely or at least hit some cache or other for the stuff
that's computationally expensive. There are some edge cases, of course,
like when a template that's used 5 levels deep by a few hundred other
templates needs to be invalidated. But from looking at RC it looks like
the wiki has around 5-50 changes in the Template: namespace per day, so
it shouldn't be that unmanageable.



Re: [OSM-dev] Google doesn't find new wiki urls

2009-01-14 Thread Grant Slater
Ævar Arnfjörð Bjarmason wrote:
 The first thing that should be done is to ensure that MediaWiki's
 caching infrastructure, both internal (memcache, message cache etc)
 and external (squid) is correctly set up and all working optimally
 together. I discussed this briefly with TomH on IRC which told me to
 contact Grant regarding this, which I've done. I'd be happy to take a
 look at it to see if anything can be improved.

   

Emailed off-list.
The current squid server was meant to be a temporary loan; it needs to 
move somewhere better later.

 If it's correctly set up (anonymous user) requests should either
 bypass MW entirely or alternatively at least hit some cache or other
 for the stuff that's computationally expensive. Although there are
 some edge cases of course like when a template that's used 5 levels
 deep by a few hundred other templates needs to be invalidated. But
 from looking at RC it looks like the wiki has around 5-50 changes in
 the Template: namespace per day so it shouldn't be that unmanageable.

   

Anonymous users often have a cookie set, causing cache misses. The 
machine runs Squid WITHOUT the X-Vary-Options patch! The Squid server is 
also used in production for other low-usage clients.
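
Why stray cookies cause misses can be shown with a toy model. This is only a sketch of the idea, not Squid's actual implementation: `cache_key` is a hypothetical function, the cookie names are made up, and the second mode only approximates what the X-Vary-Options patch allows (varying on whether certain substrings are present in the Cookie header rather than on the whole header).

```python
# Simplified model of cache-key selection; not Squid's real algorithm.
def cache_key(url, cookie_header, relevant_substrings=None):
    """Build the key under which a cached page would be stored.

    relevant_substrings=None models a plain "Vary: Cookie": the whole
    Cookie header is part of the key. A list of substrings models the
    X-Vary-Options idea: only the presence of those substrings matters.
    """
    if relevant_substrings is None:
        return (url, cookie_header)
    flags = tuple(s in cookie_header for s in relevant_substrings)
    return (url, flags)


if __name__ == "__main__":
    url = "http://wiki.openstreetmap.org/wiki/Map_Features"
    # Two anonymous visitors carrying different, irrelevant cookies.
    a = "tracker=abc123"
    b = "tracker=zzz999"
    # Plain Vary: Cookie -> different keys, so each visitor is a miss.
    print(cache_key(url, a) == cache_key(url, b))
    # Only session-like cookies matter -> same key, so the copy is shared.
    session = ["wiki_session", "wikiUserID"]  # hypothetical cookie names
    print(cache_key(url, a, session) == cache_key(url, b, session))
```

In this model the first comparison is false and the second is true, which is the difference the patch is meant to make for anonymous traffic.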

When the templates change, load spikes on the MediaWiki server: the load 
average goes from ~2 to ~18 on the single-processor machine. It takes a 
while to settle down.

Squid Log examples:
1231921832.295  15911 89.16.177.88 TCP_MISS/404 110 PURGE http://wiki.openstreetmap.org/wiki/Map_Features - NONE/- -
1231921832.303  0 89.16.177.88 TCP_MISS/404 110 PURGE http://wiki.openstreetmap.org/wiki/DE:Map_Features - NONE/- -
1231921832.310  4 89.16.177.88 TCP_MISS/404 110 PURGE http://wiki.openstreetmap.org/wiki/IT:Map_Features - NONE/- -
1231921832.318  3 89.16.177.88 TCP_MISS/404 110 PURGE http://wiki.openstreetmap.org/wiki/FR:Map_Features - NONE/- -
1231921832.326  0 89.16.177.88 TCP_MISS/404 110 PURGE http://wiki.openstreetmap.org/wiki/Cz:Map_Features - NONE/- -
1231921832.367  3 89.16.177.88 TCP_MISS/404 110 PURGE http://wiki.openstreetmap.org/wiki/Ro:Map_Features - NONE/- -
1231921832.403  9 89.16.177.88 TCP_MISS/200 103 PURGE http://wiki.openstreetmap.org/wiki/Da:Map_Features - NONE/- -
1231921832.412  6 89.16.177.88 TCP_MISS/404 110 PURGE http://wiki.openstreetmap.org/wiki/Hu:Map_Features - NONE/- -
1231921832.452  5 89.16.177.88 TCP_MISS/404 110 PURGE http://wiki.openstreetmap.org/wiki/Ja:Map_Features - NONE/- -
1231921832.459  7 89.16.177.88 TCP_MISS/404 110 PURGE http://wiki.openstreetmap.org/wiki/Fi:Map_Features - NONE/- -
1231921832.480  4 89.16.177.88 TCP_MISS/404 110 PURGE http://wiki.openstreetmap.org/wiki/Ru:Map_Features - NONE/- -
...
1231923817.516  60008 89.16.177.88 TCP_HIT/000 0 PURGE http://wiki.openstreetmap.org/wiki/DE:Map_Features - NONE/- -
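
Log entries like these can be tallied to see how many PURGE requests actually found something in the cache. A minimal sketch, assuming each entry sits on a single line in Squid's native access.log layout (timestamp, elapsed ms, client, result/status, bytes, method, URL, ...); the function names are hypothetical. As far as I know, a 404 reply to PURGE means the object was not cached in the first place.

```python
from collections import Counter


def parse_squid_line(line):
    """Split one native-format access.log line into named fields."""
    parts = line.split()
    if len(parts) < 10:
        return None  # wrapped or truncated line
    result, _, status = parts[3].partition("/")
    return {
        "timestamp": float(parts[0]),
        "elapsed_ms": int(parts[1]),
        "client": parts[2],
        "result": result,
        "status": status,
        "bytes": int(parts[4]),
        "method": parts[5],
        "url": parts[6],
    }


def purge_summary(lines):
    """Count PURGE requests by HTTP status code."""
    counts = Counter()
    for line in lines:
        entry = parse_squid_line(line)
        if entry and entry["method"] == "PURGE":
            counts[entry["status"]] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "1231921832.295  15911 89.16.177.88 TCP_MISS/404 110 PURGE "
        "http://wiki.openstreetmap.org/wiki/Map_Features - NONE/- -",
        "1231921832.403  9 89.16.177.88 TCP_MISS/200 103 PURGE "
        "http://wiki.openstreetmap.org/wiki/Da:Map_Features - NONE/- -",
    ]
    print(purge_summary(sample))
```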

/ Grant



Re: [OSM-dev] Google doesn't find new wiki urls

2009-01-14 Thread Karl Guggisberg
 It doesn't help that we have some ridiculously complicated pages like map 
 features which are not only hugely complicated with masses of templates but 
 also get changed very frequently.

We are currently evaluating how Semantic MediaWiki can help us manage map 
feature information on the wiki (see [1] for ideas and [2] for a dev playground).

We will still use templates, of course, even quite extensively, I'd say. In 
addition we will bring semantic inline queries into the game. This might have a 
significant impact on caching, and I am beginning to wonder whether OSM's servers 
will be able to handle this kind of load. 

Does anybody have experience with installations of SMW under significant load?

-- Karl

[1] http://wiki.openstreetmap.org/wiki/Machine-readable_Map_Feature_list
[2] http://dev.openstreetmap.org/~edgemaster/semwiki/index.php/Main_Page


-----Original Message-----
From: dev-boun...@openstreetmap.org [mailto:dev-boun...@openstreetmap.org] On 
Behalf Of Tom Hughes
Sent: Wednesday, 14 January 2009 13:37
To: Ævar Arnfjörð Bjarmason
Cc: Dev Openstreetmap
Subject: Re: [OSM-dev] Google doesn't find new wiki urls

 Ævar Arnfjörð Bjarmason wrote:

 On Wed, Jan 14, 2009 at 10:36 AM, Tom Hughes t...@compton.nu wrote:

 The googlebot is probably blocked as bots tend to kill the wiki due 
 to the extreme crapness of mediawiki.
 
 Bots happily hit Wikipedia all day without killing it, is the OSM wiki 
 set up equivalently? I.e. with a cache in front of it, memcache to 
 cache various things within MW etc?

Yes, we're setup a little differently.

We have one little Atom based machine. They have a metric buttload of database 
servers serving pages to ten metric buttloads of web servers fronted by about a 
thousand metric buttloads of squid caches.

Basically wikipedia have solved the problem of mediawiki being a crock by 
throwing ludicrous amounts of hardware at it.

It doesn't help that we have some ridiculously complicated pages like map 
features which are not only hugely complicated with masses of templates but 
also get changed very frequently.

Tom

--
Tom Hughes (t...@compton.nu)
http://www.compton.nu/



[OSM-dev] Osmosis - handling of linestring column

2009-01-14 Thread Lars Francke
Hi,

Osmosis used to create a bbox column for the ways table. As I didn't
need this I replaced all its queries with queries to create and update
a LINESTRING-column instead. This works flawlessly and as I understand
this feature has since been added to Osmosis (I'm using 0.29.2). As
this update seems(!) to be the main source for slowdowns when applying
diffs I wondered what could be done about it.

The column needs to be updated when any of the nodes in a way gets updated.
In my hack and the current NodeDAO.java[1] file the
SQL_UPDATE_WAY_LINESTRING is executed once for each node that has been
updated and once again if the way has been updated (WayDAO.java). So a
way with 1000 nodes would be updated up to 1001 times. First question
is: Is my assumption correct? If not please ignore the rest ;-)

If it is correct, a solution depends on the OSM format. If a node is
updated, does Osmosis produce a modify element for all the related
ways? If yes, the solution would be to simply delete the
SQL_UPDATE_WAY_LINESTRING from NodeDAO. If no, we could compile a list
of all ways that are touched by the node updates and, after
successfully importing everything, update these ways. This wouldn't be
the best solution memory-wise and perhaps wouldn't even speed things up
at all, because we'd have to find out which ways a node belongs to for
each updated node. But this is all I could come up with.

I don't have access to a running database at the moment as I'm
currently rebuilding the server so all this is not thoroughly checked.
But any comments are greatly appreciated.

Lars

[1] 
http://svn.openstreetmap.org/applications/utils/osmosis/trunk/src/com/bretth/osmosis/core/pgsql/v0_6/impl/NodeDao.java



Re: [OSM-dev] Osmosis - handling of linestring column

2009-01-14 Thread Brett Henderson
Lars Francke wrote:
 Hi,

 Osmosis used to create a bbox column for the ways table. As I didn't
 need this I replaced all its queries with queries to create and update
 a LINESTRING-column instead. This works flawlessly and as I understand
 this feature has since been added to Osmosis (I'm using 0.29.2). As
 this update seems(!) to be the main source for slowdowns when applying
 diffs I wondered what could be done about it.
   
Yes, the feature has been added, but only for the 0.6 code.

There are 3 optional features in the 0.6 codebase: the way.bbox column, 
the way.linestring column and the action table (contains all changes for 
the current import, useful for updating custom downstream tables).

 The column needs to be updated when any of the nodes in a way gets updated.
 In my hack and the current NodeDAO.java[1] file the
 SQL_UPDATE_WAY_LINESTRING is executed once for each node that has been
 updated and once again if the way has been updated (WayDAO.java). So a
 way with 1000 nodes would be updated up to 1001 times. First question
 is: Is my assumption correct? If not please ignore the rest ;-)
   
Yes, that is correct.

 If it is correct a solution depends on the OSM-format. If a node is
 updated does Osmosis produce a modify-element of all the related
 ways? If yes the solution would be to simply delete the
 SQL_UPDATE_WAY_LINESTRING from NodeDAO. If no we could compile a list
 of all ways that are touched by the node updates and after
 successfully importing everything update these ways. This wouldn't be
 the best solution memory-wise and perhaps not even speed up things at
 all because we'd have to find out which ways a node belongs to for
 each updated node. But this is all I could come up with.
   
The OSM format doesn't contain related elements; it only contains those 
elements that have changed.  Finding related elements would require 
additional queries while extracting data from the main database, which 
I've kept to a minimum.  In effect the load is placed on the client 
instead of the server, since scaling the central server is one of the 
primary objectives.  It has the additional benefit of keeping the 
changeset file sizes minimal.
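
To illustrate the point, here is a small sketch that lists what a change file actually contains. The XML fragment is made up for the example and assumes the usual osmChange layout of create/modify/delete blocks; `changed_elements` is a hypothetical helper. A moved node appears on its own, and the ways that use it have to be found on the client side.

```python
import xml.etree.ElementTree as ET

# Made-up osmChange fragment: one node moved, no way element included.
SAMPLE = """<osmChange version="0.6">
  <modify>
    <node id="101" version="2" lat="51.5001" lon="-0.1201"/>
  </modify>
</osmChange>"""


def changed_elements(xml_text):
    """Return (action, element type, id) for every element in the change."""
    root = ET.fromstring(xml_text)
    return [
        (action.tag, element.tag, int(element.get("id")))
        for action in root       # create / modify / delete blocks
        for element in action    # node / way / relation inside each block
    ]


if __name__ == "__main__":
    print(changed_elements(SAMPLE))
```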

The difficult bit is identifying which ways are impacted by a node 
change.  Currently the code is naive and just runs an update query on 
all ways related to the node being modified.  As you point out, this 
isn't ideal, so it might be possible to break this into two parts: 1. 
Identify impacted ways, 2. Update ways.  Step 1 will still need to be 
done per node (although we could query on several nodes within a single 
query), so the savings there might not be noticeable (performance seems 
to be most impacted by disk seeks, not database round trips), but step 2 
could be reduced significantly if each way is impacted by many nodes 
(i.e. we'd only update each way once rather than once per node).
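
The two-step idea can be sketched roughly as follows. This is not the Osmosis code (which is Java against PostgreSQL/PostGIS): it uses an in-memory SQLite database, simplified hypothetical tables `nodes`, `ways` and `way_nodes`, and a WKT string in place of a real geometry column, purely to show each way being updated once.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory stand-in for the nodes/ways/way_nodes tables."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE nodes (id INTEGER PRIMARY KEY, lon REAL, lat REAL);
        CREATE TABLE ways (id INTEGER PRIMARY KEY, linestring TEXT);
        CREATE TABLE way_nodes (way_id INTEGER, node_id INTEGER,
                                sequence_id INTEGER);
    """)
    db.executemany("INSERT INTO nodes VALUES (?, ?, ?)",
                   [(1, 0.0, 0.0), (2, 1.0, 0.0), (3, 1.0, 1.0)])
    db.executemany("INSERT INTO ways VALUES (?, NULL)", [(10,), (20,)])
    db.executemany("INSERT INTO way_nodes VALUES (?, ?, ?)",
                   [(10, 1, 0), (10, 2, 1), (10, 3, 2), (20, 3, 0), (20, 1, 1)])
    return db


def impacted_ways(db, node_ids):
    """Step 1: find each way touched by any modified node, once."""
    if not node_ids:
        return []
    marks = ",".join("?" * len(node_ids))
    rows = db.execute(
        "SELECT DISTINCT way_id FROM way_nodes "
        f"WHERE node_id IN ({marks}) ORDER BY way_id", list(node_ids))
    return [row[0] for row in rows]


def update_linestrings(db, way_ids):
    """Step 2: rebuild the linestring of each impacted way exactly once."""
    for way_id in way_ids:
        points = db.execute(
            "SELECT n.lon, n.lat FROM way_nodes wn "
            "JOIN nodes n ON n.id = wn.node_id "
            "WHERE wn.way_id = ? ORDER BY wn.sequence_id",
            (way_id,)).fetchall()
        wkt = "LINESTRING(" + ", ".join(f"{lon} {lat}" for lon, lat in points) + ")"
        db.execute("UPDATE ways SET linestring = ? WHERE id = ?", (wkt, way_id))
    return len(way_ids)


if __name__ == "__main__":
    db = build_demo_db()
    ways = impacted_ways(db, [1, 2, 3])   # three modified nodes
    print(update_linestrings(db, ways))   # number of way updates performed
```

With the demo data, the per-node approach would run five way updates (node 1 and node 3 each touch both ways, node 2 touches one), while the batched version runs two. The set of impacted way ids is what has to fit in RAM.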

The current solution is the simplest and, I hoped, would be satisfactory.  
If it's taking too long we'll have to find a better way.  So long as the 
solution can fit in a reasonable amount of RAM I'm happy.  I doubt 
I'll be able to look at this myself soon, so feel free to experiment with 
improvements.

 I don't have access to a running database at the moment as I'm
 currently rebuilding the server so all this is not thoroughly checked.
 But any comments are greatly appreciated.
   
Brett




Re: [OSM-dev] Osmosis - handling of linestring column

2009-01-14 Thread Lars Francke
Hi,

 I doubt if I'll
 be able to look at this myself soon so feel free to experiment with
 improvements.

and thank you for your comments and clarifications. I will have a look
at this but it will take me some time. I'll get back to you when I
have something to report.

Lars
