Re: [GRASS-dev] vector libs: file based spatial index

2009-07-14 Thread Moritz Lennert
On 13/07/09 14:44, Markus Metz wrote: Now in trunk r38390, time to make distclean again... Some testing: 1) ssbel: 20025 areas, 74674 primitives time v.what --q -a map=ss...@mlennert east_north=213355.121152,112569.565623 distance=3555.157725 GRASS6.4: real 0m3.401s user 0m3.316s

Re: [GRASS-dev] vector libs: file based spatial index

2009-07-14 Thread Markus Metz
Moritz Lennert wrote: Some testing: [...] 3) erm_roads: 1883345 lines, 1883345 primitives time v.build erm_roads GRASS6.4: real 1m54.298s user 1m49.107s sys 0m2.888s GRASS7: real 2m54.266s user 2m40.606s sys 0m6.688s (Note the fact that here GRASS6.4 is

Re: [GRASS-dev] vector libs: file based spatial index

2009-07-13 Thread Markus Metz
Moritz Lennert wrote: On 25/06/09 08:51, Markus GRASS wrote: I would suggest that I first implement a new version where the spatial index is always written out when a new or modified vector is closed. Intermediate data are still stored in memory. Opening an old vector in read-only mode would

Re: [GRASS-dev] vector libs: file based spatial index

2009-07-13 Thread Martin Landa
Hi, 2009/7/13 Markus Metz markus.metz.gisw...@googlemail.com: To work with an existing vector in grass7, topology needs to be rebuilt because a support file is missing, the spatial index. After that everything is fine and grass6 can read the vector again as it is. great! Just trying to build

Re: [GRASS-dev] vector libs: file based spatial index

2009-07-13 Thread Martin Landa
Hi, 2009/7/13 Martin Landa landa.mar...@gmail.com: [...] Just trying to build vector map 'bridges' from nc_spm. The module ends up with the 'position mismatch' error. $ v.build bridges Building topology for vector map bridges... [...] ERROR: position mismatch Some debug info... D4/5:

Re: [GRASS-dev] vector libs: file based spatial index

2009-07-13 Thread Markus Metz
Martin Landa wrote: Hi, 2009/7/13 Markus Metz markus.metz.gisw...@googlemail.com: To work with an existing vector in grass7, topology needs to be rebuilt because a support file is missing, the spatial index. After that everything is fine and grass6 can read the vector again as it is.

Re: [GRASS-dev] vector libs: file based spatial index

2009-07-09 Thread Martin Landa
Hi Markus, 2009/7/7 Markus Metz markus.metz.gisw...@googlemail.com: [...] For the time being, the only reasonable way to deal with these massive datasets is to *not* build topology. It's not only the spatial index that is getting out of hand, but also topology itself and the category index.

Re: [GRASS-dev] vector libs: file based spatial index

2009-07-09 Thread Markus Metz
Martin Landa wrote: Hi Markus, 2009/7/7 Markus Metz markus.metz.gisw...@googlemail.com: [...] For the time being, the only reasonable way to deal with these massive datasets is to *not* build topology. It's not only the spatial index that is getting out of hand, but also topology

Re: [GRASS-dev] vector libs: file based spatial index

2009-07-07 Thread Markus Metz
doug_newc...@fws.gov wrote: I guess my point is that lidar datasets are getting quite massive. If we are going to be working with the lidar data as point data in the GRASS vector framework, go with the most scalable options. Scalability in working with large data sets is a huge benefit in

Re: [GRASS-dev] vector libs: file based spatial index

2009-06-29 Thread Doug_Newcomb
The biggest lidar file used that I know about is Doug's 379GB dataset (14.5 billion points). Frightening. The above dataset was for two watersheds collected in 2001; the larger of the two watersheds is 9000 square miles. I've recently been working with newer lidar data (2007) from a

Re: [GRASS-dev] vector libs: file based spatial index

2009-06-26 Thread Moritz Lennert
On 25/06/09 08:51, Markus GRASS wrote: I would suggest that I first implement a new version where the spatial index is always written out when a new or modified vector is closed. Intermediate data are still stored in memory. Opening an old vector in read-only mode would then be faster, opening an

Re: [GRASS-dev] vector libs: file based spatial index

2009-06-25 Thread Markus GRASS
Moritz Lennert wrote: Markus: If an old vector is opened just for reading (v.what, v.info, probably also d.vect), the fastest solution is probably to only load the header of the spatial index, as is done for the coor file, and perform spatial queries in the file. This is very fast AFAICT. Then

Re: [GRASS-dev] vector libs: file based spatial index

2009-06-25 Thread Paul Kelly
On Wed, 24 Jun 2009, Markus GRASS wrote: Paul Kelly wrote: On Tue, 23 Jun 2009, Markus GRASS wrote: My implementation is completely file based, also when creating or updating a vector. This comes obviously with a speed penalty because reading in memory is faster than reading from file. With

Re: [GRASS-dev] vector libs: file based spatial index

2009-06-25 Thread Markus GRASS
Hamish wrote: Moritz wrote: The largest file I have used is about 125000 areas with a topo file weighing 42M, so taking your worst estimation, this would mean around 200MB of spatial index, which is still largely acceptable for me. lidar and swath bathymetry data will easily have

Re: [GRASS-dev] vector libs: file based spatial index

2009-06-25 Thread Hamish
Markus M wrote: Lidar is a special case, I don't see a reason to drag along with them topo and the spatial index, maybe the spatial index, but not topo. There is nothing at all special about lidar data, it's just a bunch of x,y,z points (often with other variables like signal return strength).

Re: [GRASS-dev] vector libs: file based spatial index

2009-06-25 Thread Markus GRASS
Paul Kelly wrote: Markus: What are memory-mapped files? Excuse my ignorance, I'm just a self-trained coder (learning by doing). http://en.wikipedia.org/wiki/Memory-mapped_file A chunk of a disk file is directly mapped into memory so you can access it using normal pointers as if it was
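To illustrate the memory-mapped file idea described above, here is a minimal Python sketch (not GRASS code). It assumes a hypothetical file layout of fixed-size records, each holding four doubles that stand in for a bounding box; the function name `read_record` and the record format are illustrative only. The mapped file is addressed by byte offset as if it were an in-memory buffer, and the operating system pages in only the parts that are touched.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout: four little-endian doubles (xmin, ymin, xmax, ymax).
RECORD_FMT = "<4d"
RECORD_SIZE = struct.calcsize(RECORD_FMT)


def read_record(path, index):
    """Read record number `index` from `path` through a read-only memory map."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; nothing is read until it is accessed.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # The map supports the buffer protocol, so it can be unpacked
            # at an arbitrary offset just like a bytes object in memory.
            return struct.unpack_from(RECORD_FMT, mm, index * RECORD_SIZE)


# Build a small demo file with 1000 records, then fetch one at random access.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    for i in range(1000):
        out.write(struct.pack(RECORD_FMT, i, i, i + 1.0, i + 1.0))

box = read_record(demo_path, 500)
os.remove(demo_path)
print(box)
```

Whether this beats explicit seek-and-read calls for fully random access depends on the access pattern and the system's page cache, which is the point under discussion in the thread.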

Re: [GRASS-dev] vector libs: file based spatial index

2009-06-25 Thread Paul Kelly
On Thu, 25 Jun 2009, Markus GRASS wrote: [...] very little about in this case. E.g. for completely random access there might not be a lot of gain. It is completely random, the next chunk to be read/written can be anywhere in the file. [...] But if there was random access only within a

Re: [GRASS-dev] vector libs: file based spatial index

2009-06-25 Thread Markus GRASS
Paul Kelly wrote: On Thu, 25 Jun 2009, Markus GRASS wrote: [...] very little about in this case. E.g. for completely random access there might not be a lot of gain. It is completely random, the next chunk to be read/written can be anywhere in the file. [...] But if there was random

Re: [GRASS-dev] vector libs: file based spatial index

2009-06-24 Thread Jachym Cepicky
Nice work. What about making it scalable? In case the vector library would take more than GRASS_VECTOR_MEMORY_MAX (or similar), make it file-based. Keep it in memory otherwise. j On Tue, Jun 23, 2009 at 08:28:49PM +0200, Markus GRASS wrote: Paolo Cavallini wrote: Markus GRASS ha scritto:

Re: [GRASS-dev] vector libs: file based spatial index

2009-06-24 Thread Moritz Lennert
On 23/06/09 20:28, Markus GRASS wrote: Paolo Cavallini wrote: Markus GRASS ha scritto: What to do now? Leave it all in memory as in grass6, build in memory then write out (risk of running out of memory on massive datasets), or keep it always in file? I'll not commit any time soon (also

Re: [GRASS-dev] vector libs: file based spatial index

2009-06-24 Thread Benjamin Ducke
Great work! In keeping with good GRASS traditions, I would suggest to have this user-configurable via an env var. Perhaps: GRASS_SPATIAL_INDEX = AUTO (default: keep in mem if enough RAM available) | FILE | MEMORY. Cheers, Ben Moritz Lennert wrote: On 23/06/09 20:28,
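The proposal above could be sketched roughly as follows. This is only an illustration in Python: `GRASS_SPATIAL_INDEX` and its three values are the suggestion made in this message, not an existing GRASS setting, and the function name `choose_index_storage` and the byte-count parameters are hypothetical.

```python
import os

# Values proposed in the thread; not an existing GRASS configuration option.
VALID_MODES = ("AUTO", "FILE", "MEMORY")


def choose_index_storage(estimated_bytes, available_bytes, environ=None):
    """Return "FILE" or "MEMORY" for the spatial index.

    estimated_bytes: assumed size of the spatial index (hypothetical input).
    available_bytes: assumed amount of free RAM (hypothetical input).
    environ: mapping to read the variable from; defaults to os.environ.
    """
    environ = os.environ if environ is None else environ
    mode = environ.get("GRASS_SPATIAL_INDEX", "AUTO").strip().upper()
    if mode not in VALID_MODES:
        raise ValueError("GRASS_SPATIAL_INDEX must be one of %s" % (VALID_MODES,))
    if mode == "AUTO":
        # Default: keep the index in memory only if it is expected to fit.
        return "MEMORY" if estimated_bytes <= available_bytes else "FILE"
    # An explicit FILE or MEMORY setting overrides the size check.
    return mode
```

An explicit setting wins over the size check, so a user on a machine with plenty of RAM could still force the file-based index, and vice versa.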

Re: [GRASS-dev] vector libs: file based spatial index

2009-06-24 Thread Paul Kelly
On Tue, 23 Jun 2009, Markus GRASS wrote: My implementation is completely file based, also when creating or updating a vector. This comes obviously with a speed penalty because reading in memory is faster than reading from file. With all sorts of tricks and relying on the system to cache files,

Re: [GRASS-dev] vector libs: file based spatial index

2009-06-24 Thread Moritz Lennert
On 24/06/09 22:49, Markus GRASS wrote: Moritz: I'm not sure I understand everything correctly here, but I have the feeling that there are two questions here: 1) Should we have a file-based storage of the spatial index ? This can then be read into memory when necessary, which still should be

Re: [GRASS-dev] vector libs: file based spatial index

2009-06-24 Thread Hamish
Moritz wrote: The largest file I have used is about 125000 areas with a topo file weighing 42M, so taking your worst estimation, this would mean around 200MB of spatial index, which is still largely acceptable for me. lidar and swath bathymetry data will easily have millions of points, and

[GRASS-dev] vector libs: file based spatial index

2009-06-23 Thread Markus GRASS
Hi, I have now a completely file based spatial index for vector libs that could reduce memory consumption considerably. The spatial index file is usually 2 - 3 times larger than the topo file, for point datasets it is about 5 times larger than the topo file. In GRASS6.x all that is always kept in
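As a rough worked example of the size ratios quoted above (2 to 3 times the topo file in general, about 5 times for point datasets), the following Python sketch estimates the spatial index size from a topo file size. The function name and the `points_only` flag are illustrative, and the factors are only the rules of thumb from this message, not measured guarantees.

```python
MB = 1024 * 1024


def estimate_index_size(topo_bytes, points_only=False):
    """Return a (low, high) estimate of the spatial index size in bytes.

    Uses the rule-of-thumb factors from the message: 2-3x the topo file
    in general, about 5x for point datasets.
    """
    low_factor, high_factor = (5, 5) if points_only else (2, 3)
    return topo_bytes * low_factor, topo_bytes * high_factor


# A 42 MB topo file, as in the example mentioned elsewhere in the thread.
low, high = estimate_index_size(42 * MB)
worst = estimate_index_size(42 * MB, points_only=True)[1]
print(low // MB, high // MB, worst // MB)
```

For a 42 MB topo file this gives 84 to 126 MB in the general case and 210 MB with the point-dataset factor, in line with the "around 200MB" worst-case figure mentioned in the replies.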

Re: [GRASS-dev] vector libs: file based spatial index

2009-06-23 Thread Paolo Cavallini
Markus GRASS ha scritto: What to do now? Leave it all in memory as in grass6, build in memory then write out (risk of running out of memory on massive datasets), or keep it always in file? I'll not commit any time soon (also waiting for the lib/raster commotion to settle down), I need feedback