Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-18 Thread Jan Hartmann



Thanks Mateusz. I am going to look at GeoJSON as an ASCII data archiving 
format, as long as a full-fledged GML driver isn't available. I have 
always liked JSON more than XML.


Jan
___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev


Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-17 Thread Mateusz Loskot
Jan Hartmann wrote:
 Does that mean that I can use ogrinfo on a gzipped archive, like
 gdalinfo (http://trac.osgeo.org/gdal/wiki/UserDocs/ReadInZip)?


Yes, it does, but only as long as the OGR driver performs filesystem
operations using the VSI*L API. The problem is that only a few of them use VSI*L.

As an example, I just updated OGR GeoJSON driver to use VSI*L API:

http://trac.osgeo.org/gdal/changeset/18573

As you can see, it is not a complex task, but may be time consuming and
tedious, and requires quite a lot of testing.

After these changes are applied, it is possible to (un)gzip GeoJSON
datasets:

1) Translate Shapefile to GeoJSON file compressed using GZip:

$ ogr2ogr -f GeoJSON /vsigzip/./points.geojson.gz points.shp

2) Read GZip compressed GeoJSON dataset:

$ ogrinfo /vsigzip/./points.geojson.gz

$ ogrinfo /vsigzip/./points.geojson.gz OGRGeoJSON
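
For readers without a VSI*L-enabled build at hand, the effect of the two commands above can be approximated outside GDAL. What follows is only a minimal Python sketch using the standard library: it does not call GDAL or OGR, and the file name and the sample feature are made up for illustration. It writes a small GeoJSON FeatureCollection through on-the-fly gzip compression and reads it back through on-the-fly decompression.

```python
import gzip
import json
import os
import tempfile

# A tiny, made-up GeoJSON FeatureCollection standing in for points.shp.
collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [4.9, 52.37]},
            "properties": {"name": "sample point"},
        }
    ],
}


def write_gzipped_geojson(path, data):
    """Serialize data as GeoJSON text and compress it with gzip on the fly."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(data, f)


def read_gzipped_geojson(path):
    """Decompress on the fly and parse the GeoJSON text."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)


path = os.path.join(tempfile.mkdtemp(), "points.geojson.gz")
write_gzipped_geojson(path, collection)
restored = read_gzipped_geojson(path)
print(len(restored["features"]), "feature(s) read back")
```

The point of the sketch is that the reader never sees an uncompressed file on disk, which is what the /vsigzip/ prefix gives a driver once it goes through VSI*L.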


 This page says:
 
 The fact that this new capability is implemented as virtual file
 systems imply that it will only work for GDAL drivers supporting the
 large file API

Apparently, the Wiki needs to be updated.

 It would be a nice archive functionality. 

Yes, indeed.

The work requires stepping into each OGR driver directory, grepping the
.cpp files for VSI, and verifying which VSI API is used by the driver.

For example, the OGR Shapefile driver would, in theory, need to get
updated VSI calls in about 50 places:

$ grep VSI *.cpp | wc -l
47

I'm quite sure volunteers would be appreciated.
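
A plain grep count lumps together calls that already use the large file API and calls that still need porting. As a rough aid for volunteers, here is a small Python sketch that separates the two. It rests on an assumption that is only a heuristic: VSIF* function names ending in an upper-case "L" (VSIFOpenL, VSIFReadL, ...) are counted as large file API calls, and all other VSIF* names as the older API. The sample source text is made up.

```python
import re
from collections import Counter
from pathlib import Path

# Heuristic: VSIF* names ending in an upper-case "L" (VSIFOpenL, VSIFReadL, ...)
# are treated as large file API calls; other VSIF* names as the older API.
VSI_CALL = re.compile(r"\b(VSIF[A-Za-z]+)\s*\(")


def classify_vsi_calls(source_text):
    """Count large-file (*L) and legacy VSIF* calls in one source string."""
    counts = Counter()
    for name in VSI_CALL.findall(source_text):
        counts["large_file" if name.endswith("L") else "legacy"] += 1
    return counts


def scan_driver_dir(driver_dir):
    """Sum the counts over every .cpp file directly inside driver_dir."""
    total = Counter()
    for cpp in sorted(Path(driver_dir).glob("*.cpp")):
        total += classify_vsi_calls(cpp.read_text(errors="replace"))
    return total


# Made-up snippet, used only to show the classification.
sample = """
FILE *fp = VSIFOpen(pszName, "rb");
VSIFRead(buf, 1, 100, fp);
FILE *fpL = VSIFOpenL(pszName, "rb");
VSIFCloseL(fpL);
"""
print(dict(classify_vsi_calls(sample)))
```

Running scan_driver_dir on a driver directory would give a first estimate of how many call sites remain to be ported; it does not replace reading the code.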

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org


Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-16 Thread Even Rouault

I think Frank meant:


With Even's work, it is *now* possible for many drivers to
transparently access compressed files using the /vsigzip/ mechanism.





Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-16 Thread Jan Hartmann


On 16-1-2010 3:01, Frank Warmerdam wrote:


Instead, if this is a goal of archiving, I'd suggest archiving
the original data (in a possibly arcane format), and a copy in
a more accessible format likely to still be usable decades later.


That's what we are doing now. It's OK for practical purposes.


The OGR VRT driver already captures most of this with my recent
addition of schema support.  It could be extended to actually be
a feature store.

Alternatively, we could look at improving the GML driver to support
capture of everything that OGR can represent.  This would have the
benefit of being useful in non-OGR applications.

Both very good propositions, especially the GML one. I guess such a GML 
dump could be read back into OGR without problems? I'm not proposing an 
RFC (I don't know how much work it is), but perhaps you could keep this in 
the back of your head ...



Jan


Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-16 Thread Jan Hartmann
Oh, that makes a difference indeed! When I read Frank's long answer, 
this was the only point I didn't like. How does this mechanism work?


Jan



Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-16 Thread Even Rouault

Jan,

Some GDAL and OGR drivers use a specific API, the VSI Large File API, to 
access files. That API mimics the semantics of the standard C library I/O API:


fopen -> VSIFOpenL
fread -> VSIFReadL
fseek -> VSIFSeekL


Usually that API enables access to large files (> 4 GB) on Unix and 
Windows. But there are a few 'plugins' for specific purposes. For 
example, if you pass /vsigzip/path/to/your/file.gz to VSIFOpenL, the 
calls will go through a plug-in that does on-the-fly decompression of 
a GZip compressed file (compression support added by Frank in 1.7.0). 
This is used internally by the GDAL R driver and by the OGR GTM driver. 
We can also use the /vsimem/ prefix to read or write in-memory 
files (used internally by some GDAL drivers and algorithms, and by 
MapServer to generate the output image without creating a temporary 
file on the file system, etc.). Or /vsisubfile/ to access a file 
embedded within another file (used to decompress JPEG2000 or JPEG 
streams in some formats like NITF).
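
To make the plug-in idea concrete, here is a minimal Python sketch of prefix-based dispatch. It is not GDAL code and does not reflect how the VSI layer is actually implemented; the function names open_virtual and write_memory_file are hypothetical, and the /vsisubfile/<offset>_<size>,<filename> form is assumed for the purpose of the example. It only shows the principle: one open call, with the path prefix selecting the handler.

```python
import gzip
import io
import os
import tempfile

# In-memory "files" for the /vsimem/ stand-in, keyed by name.
_MEMORY_FILES = {}


def write_memory_file(name, data):
    """Store bytes under a name so they can be opened via /vsimem/<name>."""
    _MEMORY_FILES[name] = bytes(data)


def open_virtual(path):
    """Return a readable binary file object, chosen by the path prefix."""
    if path.startswith("/vsigzip/"):
        # Decompress on the fly while reading.
        return gzip.open(path[len("/vsigzip/"):], "rb")
    if path.startswith("/vsimem/"):
        return io.BytesIO(_MEMORY_FILES[path[len("/vsimem/"):]])
    if path.startswith("/vsisubfile/"):
        # Assumed form: /vsisubfile/<offset>_<size>,<filename>
        spec, filename = path[len("/vsisubfile/"):].split(",", 1)
        offset, size = (int(part) for part in spec.split("_"))
        with open(filename, "rb") as f:
            f.seek(offset)
            return io.BytesIO(f.read(size))
    return open(path, "rb")  # plain file, no plug-in involved


# Usage with made-up data.
workdir = tempfile.mkdtemp()
gz_path = os.path.join(workdir, "file.gz")
with gzip.open(gz_path, "wb") as f:
    f.write(b"hello from gzip")

raw_path = os.path.join(workdir, "container.bin")
with open(raw_path, "wb") as f:
    f.write(b"HEADERpayloadTRAILER")

write_memory_file("scratch.txt", b"hello from memory")

with open_virtual("/vsigzip/" + gz_path) as f:
    gz_data = f.read()
with open_virtual("/vsimem/scratch.txt") as f:
    mem_data = f.read()
with open_virtual("/vsisubfile/6_7," + raw_path) as f:
    sub_data = f.read()
print(gz_data, mem_data, sub_data)
```

The caller's code is the same in all three cases, which is why a driver that already goes through the VSI*L calls gets these capabilities without further changes.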


Here are a few links for further reading on the subject :
http://trac.osgeo.org/gdal/wiki/UserDocs/ReadInZip
http://gdal.org/cpl__vsi_8h.html

Best regards,

Even



Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-16 Thread Jan Hartmann
Yes, that is clear, thanks. I see that at the moment only raster files 
are supported. Would it make sense to do this for vector formats too? I 
am thinking of a dump from a large PostGIS database I had to upgrade 
from a 32-bit to a 64-bit server. I didn't like the pg_dump format, as I got 
into all sorts of trouble with system tables and libraries, and the 
dump file got really big. If I had thought about it, I would have 
extracted all schemas with OGR, and I am certainly going to archive and 
gzip my PostGIS vector files that way. It would be nice if ogrinfo could 
give information about vector files within a (g)zip archive, and, of 
course, it would be ideal if the files within that archive were in 
some sort of GML that could be translated back to a binary OGR format. 
That would make a perfect data storage and archive medium, but 
meanwhile, thanks to this conversation, I have already found a nice way 
to back up and archive my maps. Thanks.


Jan



Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-16 Thread Mateusz Loskot
Jan Hartmann wrote:
 Yes, that is clear, thanks. I see that at the moment only raster files
 are supported.
 Would it make sense to do this for vector formats too?

The VSI layer is available to all parts of GDAL and OGR.
If you scan source code of OGR drivers, you'll find that this feature
is already used by, for example, GTM driver:

http://trac.osgeo.org/gdal/browser/trunk/gdal/ogr/ogrsf_frmts/gtm/gtm.cpp?rev=17612#L339

IOW, the VSI layer is available and not dependent on any GDAL or OGR format;
it's a separate library of common functions.

Best regards,
-- 
Mateusz Loskot
http://mateusz.loskot.net



Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-16 Thread Jan Hartmann




Does that mean that I can use ogrinfo on a gzipped archive, like 
gdalinfo (http://trac.osgeo.org/gdal/wiki/UserDocs/ReadInZip)? This page 
says:


The fact that this new capability is implemented as virtual file 
systems imply that it will only work for GDAL drivers supporting the 
large file API


It would be a nice archive functionality. At the moment I store my 
raster files in a directory tree and retrieve information on some or all 
of them with small scripts calling gdalinfo and filtering the results. I 
know now that I can put the files in a gzip archive, while still being able 
to do the same queries. If the same were possible for vector maps, I 
could store everything in one large file for backup.


Jan


Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-16 Thread Jan Hartmann

Thanks Frank, I'll try and let you know.

BTW: Perhaps your idea of GML-output for OGR would solve the 
Metadata-problem for OGR. Everyone can add whatever data they want to 
the GML-file, as long as OGR knows what parts to retrieve to reconstruct 
the vector map.


Jan

On 16-Jan-10 21:53, Frank Warmerdam wrote:


Some OGR drivers (and some GDAL drivers) will support this capability.
It depends on which ones go through the VSI*L API.  On the GDAL side we
now have a technique to keep track of this in an organized way, while
this does not yet exist on the OGR side.  But you can just try it to
know for a particular driver.

Best regards,



Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-15 Thread Peter J Halls

Dear All,

Mateusz Loskot wrote:
% Snip


Yes. Also, most applications I've seen using OGR do define their own
data models and translate OGRFeature to features of their own types.
Perhaps it would be interesting to know why they don't use OGRFeature
as a part of their data model, what's missing...


Thinking about this in terms of my own programs, I think that it is not 
necessarily that OGRFeature is missing anything but rather that my program data 
structures (objects) are designed for some specific task.  So, for example, I 
have a program that reads from a GPS and creates OGRFeatures for storage 
somewhere using OGR; another uses OGR to read data of some format and then 
builds a Green & Sibson neighbourhood structure from the OGRFeatures in order to 
measure neighbourhood characteristics and writes or updates attribute tables; 
and so on.  These simple OGC-type features are, for me, ideal input into or 
output from what are primarily research models.  Indeed, were the data structure 
more complex, I should probably have to unpick it into a simpler structure, 
like OGRFeature, in order to build appropriate data structures for whatever it 
was I was doing.


Note: I am not using OGR as a component of a GIS, rather my programs are either 
extensions to GIS methodology (eg neighbours and cluster detection) or are 
designed to model the behaviour of some phenomenon.  I use OGR for format 
independent spatial object IO because it is easy to map OGR objects to my 
objects.  This means that I use my own object methods for tasks like 
intersection detection, etc, when needed.
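
As an illustration of the kind of mapping described here, the following Python sketch translates simple point features into an application-specific object that carries its own methods. It is only a sketch under stated assumptions: the features are plain GeoJSON-like dictionaries standing in for what an OGR reader would deliver, and the Site class, its fields, and site_from_feature are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Site:
    """Hypothetical application-side object for a point observation."""
    site_id: int
    x: float
    y: float
    label: str

    def distance_to(self, other):
        """Planar distance, computed by the application's own method."""
        return hypot(self.x - other.x, self.y - other.y)


def site_from_feature(feature):
    """Translate a simple point feature (GeoJSON-like dict) into a Site."""
    geometry = feature["geometry"]
    if geometry["type"] != "Point":
        raise ValueError("only point features are handled in this sketch")
    x, y = geometry["coordinates"][:2]
    props = feature.get("properties", {})
    return Site(int(feature["id"]), float(x), float(y), str(props.get("name", "")))


# Made-up features standing in for what an OGR reader would deliver.
features = [
    {"id": 1, "geometry": {"type": "Point", "coordinates": [0.0, 0.0]},
     "properties": {"name": "A"}},
    {"id": 2, "geometry": {"type": "Point", "coordinates": [3.0, 4.0]},
     "properties": {"name": "B"}},
]
sites = [site_from_feature(f) for f in features]
print(sites[0].distance_to(sites[1]))
```

The translation step is a single small function, which is the sense in which a simple feature model is easy to map onto task-specific structures.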


Having said that, do I want more?  There are times when geometric topology is, 
or could be, very useful.  Currently, if I really want that, I create an ESRI 
'coverage' dataset and use that as input via infolib: it's not necessarily ideal 
and I have no idea how long that format will persist, but it serves my needs 
well.  I do not think I would expect OGR to offer topology functions, though: I 
think I would expect to use a separate but related library to build topology 
from OGRFeatures.





There is of course some non-trivial overhead converting underlying
features into OGRFeatures, and as was noted there is some performance
impedance between OGR and GEOS due to the need to translate
geometries frequently.


There usually is yet another step (cost), it is translation from
OGRFeature to feature of application's data model.


This is very true and is probably inevitable, unless one is inventing the wheel 
yet again.  Of course, the overhead can be minimised by the use of appropriate 
structures and avoiding repetition.


Dunno how useful that is to anyone else, but if it is, then great.

Best wishes,

Peter


Peter J Halls, GIS Advisor, University of York
Telephone: 01904 433806 Fax: 01904 433740
Snail mail: Computing Service, University of York, Heslington, York YO10 5DD
This message has the status of a private and personal communication



Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-15 Thread Jan Hartmann
I was thinking along the same lines, but more in the direction of OGR as 
an archival standard.  I have been working with archives and stored maps 
all my life and am now busy with lots of digital historical maps. For 
raster maps the best storage format is GeoTIFF, from which all other 
formats can be derived if you want something small or quick for high 
performance purposes. For vector maps there is no such standard. Most 
vector maps are exchanged as shapefiles, but this is certainly not 
optimal. Some sort of file-based OGR format could perhaps fill this gap; 
conceptually, it certainly is the best model there is. For archival 
purposes, however, there should be three additional options:


1) If a map is converted to OGR and converted back, there should be an 
option of getting it back byte-identical. Law #1 of archiving: always 
make sure that you can get back the original.
2) For long term archival purposes, there should be an ASCII format (XML 
or GeoJSON) associated with OGR. I had to convert binary files 
from a 60-bit Cyber long ago, so I know what I am talking about. God 
knows what a computer word will look like in the age of quark-computers.
3) There should be some sort of lossless compression scheme associated 
with OGR.
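
Points 1 and 3 can at least be checked mechanically for the compression step. The following Python sketch, using only the standard library, compresses a byte string with gzip, stores a SHA-256 checksum of the original, and refuses to return restored data that is not byte-identical. It covers lossless compression only and says nothing about whether a conversion through OGR round-trips; the function names and the payload are made up.

```python
import gzip
import hashlib


def sha256_hex(data):
    """Hex digest used to compare the original with the restored bytes."""
    return hashlib.sha256(data).hexdigest()


def archive_bytes(original):
    """Return (compressed bytes, checksum of the original bytes)."""
    return gzip.compress(original), sha256_hex(original)


def restore_bytes(compressed, expected_checksum):
    """Decompress and refuse to return data that differs from the original."""
    restored = gzip.decompress(compressed)
    if sha256_hex(restored) != expected_checksum:
        raise ValueError("restored data is not byte-identical to the original")
    return restored


# Made-up payload standing in for the bytes of an archived map file.
original = b'{"type": "FeatureCollection", "features": []}' * 100
compressed, checksum = archive_bytes(original)
restored = restore_bytes(compressed, checksum)
print(len(original), len(compressed), restored == original)
```

Keeping the checksum next to the archived file is one simple way to verify, years later, that what comes out is what went in.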


As to additional functionality, it wouldn't take first place for me, but 
it would be nice to have it, perhaps not baked into the format but as 
additional libraries and an API to be linked in:


1) Most of the time I do not need optimization for speed. I find it more 
important to create a work-flow with as few copying steps as possible 
(always a gigantic source of errors). Only at the last moment, e.g. for 
setting up a high performance server, do I create the necessary production 
files, optimized for speed or memory space.
2) IMHO large vector maps are useless without indices. It would be nice 
to have an indexing scheme for these OGR maps, perhaps as standalone 
files, comparable to the OVR raster files created by gdaladdo.
3) Topology would be nice too. What about the way ArcGIS does 
it nowadays? It uses some sort of shapefiles as base maps, but computes 
the topology (you can choose different criteria) and puts it in separate 
files. There is work going on for topology in PostGIS, I believe, 
but it is a horribly difficult subject, of course.
4) Regular GIS functions are already available via the GEOS library.
5) I am not a big fan of Metadata. Most maps are from governmental 
organisations, and my experience with Metadata is that those 
bureaucratic offices want to put the complete structure of their 
specific organisation into the definition of the map. It is impossible 
to get all these definitions into one overarching metadata system. The 
nice thing about maps is that every map can be combined with every 
other. The problem with (governmental) organisations is that they create 
their own small universes, which aren't compatible with other 
universes, and don't even know about each other's existence. It's like 
combining Euclidean and non-Euclidean universes, don't try it! There 
should be documentation associated with a map, of course, but that is 
different from the basic definition of a map in terms of points, lines, 
polygons and projections.


Again, I am not asking for this functionality or even 
commenting on the ongoing work on OGR; I don't know enough about its 
internals or the way people are working on it to be in any way qualified 
for that. These are just the thoughts of a long-term (and very happy) 
GDAL/OGR user from a historical/archival point of view.


Jan


Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-15 Thread Peter J Halls

Jan,

   for a vector archival format, surely GML is the nearest equivalent to 
GeoTIFF?  That should preserve all information, be vendor neutral, and make it 
possible to retrieve all information in the future.


Peter


Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-15 Thread Jan Hartmann
Personally, I find GML far too complex to be of practical use (for my 
own purposes, mind). The GML 3.1 specification is 595 pages of small 
print, almost none of which are part of existing datasets. I just want 
to store existing datasets in a system-independent way, and preferably, 
to be able to read them directly into my applications. Perhaps GML is 
the future for new datasets in large governmental or international 
organisations, although I have my doubts about that, but for the job of 
archiving what already exists it is certainly overkill. I prefer a 
small, conceptually clear standard like OGR, that already can process 
everything that exists under the sun, and can be handled by individuals 
or small companies. I really dislike standards that require massive 
bureaucracies to get implemented. We've got enough of those here in 
Europe. Again IMHO.


Oh, and OGR already includes a subset of GML 2.0. Is this comparable 
with WCS/GDAL for raster maps?


http://www.gdal.org/ogr/drv_gml.html

Jan



Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-15 Thread Frank Warmerdam

Jan Hartmann wrote:
1) If a map is converted to OGR and converted back, there should be an 
option of getting it back byte-identical. Law #1 of archiving: always 
make sure that you can get back the original


Jan,

I think this is generally impractical with most OGR formats.  In
most cases ogr2ogr from a file to a new file of the same file
format will not be byte identical.

Instead, if this is a goal of archiving, I'd suggest archiving
the original data (in a possibly arcane format), and a copy in
a more accessible format likely to still be usable decades later.
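
The suggestion above (keep the untouched original, plus a converted copy) can be checked mechanically. The following is only a minimal sketch using the Python standard library, not anything provided by OGR; the function names `sha256_of`, `archive_original` and `is_byte_identical`, and the file names, are made up for illustration.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_original(source, archive_dir):
    """Copy the original unchanged and record its checksum beside it."""
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / Path(source).name
    shutil.copyfile(source, target)
    checksum = sha256_of(target)
    Path(str(target) + ".sha256").write_text(checksum + "\n")
    return target, checksum


def is_byte_identical(path_a, path_b):
    """Two files are byte identical exactly when their digests match."""
    return sha256_of(path_a) == sha256_of(path_b)


with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical legacy file standing in for the "arcane" original.
    original = Path(workdir) / "parcels.e00"
    original.write_bytes(b"EXP  0 arcane legacy bytes\r\n")

    archived, recorded = archive_original(original, Path(workdir) / "archive")
    identical = is_byte_identical(original, archived)

    # A converted copy (written by hand here, not by ogr2ogr) will
    # normally not match the original byte for byte.
    converted = Path(workdir) / "parcels.geojson"
    converted.write_text('{"type": "FeatureCollection", "features": []}')
    converted_identical = is_byte_identical(original, converted)
```

The stored `.sha256` file lets a later reader confirm that the archived original has not changed, independently of whatever accessible copy sits next to it.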

2) For long term archival purposes, there should be an ASCII format (XML 
or GeoJSON) associated with OGR. I had to convert binary files 
from a 60-bit Cyber long ago, so I know what I am talking about. God 
knows what a computer word will look like in the age of quark-computers.


It would be relatively easy to have an XML format which is a lossless
dump of what OGR knows through its data model.  However, it is
unlikely that any application not built on OGR would ever support
it.

The OGR VRT driver already captures most of this with my recent
addition of schema support.  It could be extended to actually be
a feature store.

Alternatively, we could look at improving the GML driver to support
capture of everything that OGR can represent.  This would have the
benefit of being useful in non-OGR applications.

3) There should be some sort of lossless compression scheme associated 
with OGR


With Even's work, it is now possible for many drivers to
transparently access compressed files using the /vsigzip/ mechanism.

3) Topology would be nice too. What about the way ArcGIS does it 
nowadays? It uses some sort of shapefiles as base maps, but computes 
the topology (you can choose different criteria) and puts it in separate 
files. There is work going on for topology in PostGIS, I believe, 
but it is a horribly difficult subject, of course.


Note that the OGR Arc/Info binary coverage driver (and I think a few
other drivers) do capture and represent topological relationships
with features.  However, different drivers do this in slightly
different ways since there is no well-defined way of doing this
in the OGR data model.

There are no tools to build topology in GDAL but perhaps this
could be a GRASS task.

Of course, OGR does nothing to update topologies cleanly.  Currently
it really just allows access to existing topological datasets.

The problem with (governmental) organisations is that they create 
their own small universes, which aren't compatible with other 
universes, and don't even know about each other's existence. 


Very true, and to some extent this can also happen to software
projects (open source and proprietary).

Best regards,
--
---+--
I set the clouds in motion - turn up   | Frank Warmerdam, warmer...@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush| Geospatial Programmer for Rent

___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev


Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-14 Thread Jan Hartmann



On 13-1-2010 21:19, Mateusz Loskot wrote:


IMHO, it's a misunderstanding to consider OGR a fully featured data model
and I/O engine to read, write, process and analyse spatial vector data,
especially if performance is a critical factor. IMHO, there are too many
compromises in OGR.

   
OK, that is a very clear statement. I must say that I always thought of 
OGR as an independent GIS data model, the most encompassing of all, and 
that it could (in principle anyway) be used in some sort of stand-alone 
fashion. I can certainly imagine, however, that for real applications 
it is not as optimal as more specialized formats.


Jan


Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-14 Thread Mateusz Loskot

Jan Hartmann wrote:

On 13-1-2010 21:19, Mateusz Loskot wrote:


IMHO, it's a misunderstanding to consider OGR a fully featured data model
and I/O engine to read, write, process and analyse spatial vector data,
especially if performance is a critical factor. IMHO, there are too many
compromises in OGR.

   

OK, that is a very clear statement.


Please notice the IMHO at the beginning of my sentence.

Best regards.
--
Mateusz Loskot, http://mateusz.loskot.net


Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-14 Thread 刘忠志
Maybe we can check the source code and improve that.
For example: when we get a feature from a layer, do not create a new one every 
time; just return the same feature object, with its coordinates changed.
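
As a rough illustration of the trade-off in this proposal, here is a small sketch in plain Python. It does not use or describe the OGR API; `Feature`, `iter_fresh` and `iter_reused` are hypothetical names invented for the example. Reusing one object saves allocations, but any caller that keeps references across iterations sees only the last feature.

```python
class Feature:
    """Hypothetical stand-in for a feature: an id plus (x, y) coordinates."""

    __slots__ = ("fid", "coords")

    def __init__(self):
        self.fid = -1
        self.coords = []


def iter_fresh(records):
    """Allocate a new Feature for every record (the current behaviour described)."""
    for fid, coords in records:
        feature = Feature()
        feature.fid = fid
        feature.coords = list(coords)
        yield feature


def iter_reused(records):
    """Yield the same Feature every time, overwriting its contents in place."""
    shared = Feature()
    for fid, coords in records:
        shared.fid = fid
        shared.coords[:] = coords
        yield shared


records = [(0, [(0.0, 0.0)]), (1, [(1.0, 1.0), (2.0, 2.0)])]

# Streaming use is safe with either iterator: each feature is consumed
# before the next one is requested.
streamed = [feature.fid for feature in iter_reused(records)]

# Holding on to the yielded objects is only safe with fresh features.
kept_fresh = [feature.fid for feature in list(iter_fresh(records))]
kept_reused = [feature.fid for feature in list(iter_reused(records))]
```

With the reused object, `kept_reused` ends up holding the last id twice, which is why an interface like this would need to state clearly who owns the returned feature.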




Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Peter J Halls

Jason,

   are you constrained to retaining your data in an ArcGIS compatible format? 
If so and if you do not have ArcSDE, then what follows may not be much help.


Otherwise, I think it likely that you will find using a DBMS as your data 
repository advantageous for many reasons.  Apart from the built-in indexing and 
index-based operations, it is *very* much easier to share data between users, 
retaining a single copy with all users having effective access.  Until the File 
Geodatabase format is published (later this year?) and someone makes the effort to 
build an OGR interface, the DBMS route is probably the best route to 
compatibility.  We happen to be a corporate Oracle site, but PostgreSQL is pretty 
similar.  PostgreSQL is supported by ESRI with ArcSDE, so it is possible to retain 
ArcGIS compatibility this way.


Many years ago, I had a Simula class for performing many of these basic spatial 
operations, however now my data is all in Oracle: I am able to use the Oracle 
functions and no longer have to worry about building and rebuilding indexes, 
etc. - other than USER_SDO_GEOM_METADATA which, unfortunately, OGR only writes 
to at table creation and does not update.  Frankly, life (and maintenance) is 
much easier now and, certainly with Oracle, I think there have been performance 
gains.


Just my ha'pence-worth.

Peter

Mateusz Loskot wrote:

Jason Roberts wrote:

Mateusz,

I'm not an expert in this area, but I think that big performance 
gains can be obtained by using a spatial index.


Yes, likely true.

For example, consider a situation where you want to clip out a study 
region from the full resolution GSHHS shoreline database, a polygon 
layer. The shoreline polygons have very large, complicated 
geometries. It would be expensive to loop over every polygon, loading 
its full geometry and calling GEOS. Instead, you would use the 
spatial index to isolate the polygons that are likely to overlap with 
the study region, then loop over just those ones.


GEOS, like JTS, provides support for various spatial indexes.
It is possible to index data and optimise it in the manner you
mention. In fact, GEOS uses indexes internally in various operations.
The problem is that such an index is not persistent: it is not
serialised anywhere, so it exists in memory only. In fact, there are
many more problems than this one.

BTW, PostGIS does serialise its index.

OGR does not provide any spatial indexing layer common to various
vector datasets. For many simple formats it performs the brute-force
selection.

Alternative is to try to divide the tasks:
1. Query features from data source using spatial index capability of
data source.
2. Having only subject features selected, apply geometric processing.

I did it that way, actually.

If OGR takes advantage of spatial indexes internally (e.g. if the 
data source drivers can tell the core about these indexes, and the 
core can use them when OGRLayer::SetSpatialFilter is called), then 
many scenarios could be efficiently implemented by just OGR and GEOS 
alone.


The problem with OGR and GEOS is the cost of translation from OGR geometry
to GEOS geometry. It can be a bottleneck.

However, if such processing functionality were built in to OGR,
that would make sense, but I still see limitations.

Let's brainstorm a bit and assume it implements this operation:

OGRLayer OGR::SymDifference(OGRLayer layer1, OGRLayer layer2);

Depending on the data source, OGR could exploit its capabilities.
If both layers sit in the same PostGIS (or other spatial)
database, OGR just delegates the processing to PostGIS,
where ST_SymDifference is executed, and OGR only grabs the
results and generates an OGRLayer.

What if layer1 is a Shapefile and layer2 is Oracle table?
Let's assume Shapefile has .qix file with spatial index
and Oracle has its own index. What does OGR do?

Does it load the .qix into memory, then grab layer2 and decide which
features to select from layer1?
Or does it load the whole Shapefile into memory and use the Oracle index
to select features from layer2 masked by layer1?
How does it calculate the cost of which one to transfer in which
direction, etc.?

Certainly, it depends on number of elements, what algorithm is used,
direction of application of algorithm (who is subject, who is object),
and many more.

There are plenty of combinations, and my point is that if performance
(not only in terms of speed, but of any resource) is critical, it would be
extremely difficult to provide an efficient implementation of such
features in OGR with a guaranteed or even determinable degree of
complexity. Without these guarantees, I see little use for
such a solution.

Given that, depending on needs, write a specialised application using
available tools like OGR and GEOS, that is optimised according to
specifics of datasets, type of processing, system requirements, etc.

If not, then your suggestion may be as fast as any other. For 
example, the idea of loading the features in to PostGIS or SpatiaLite
 will require loading all of the full 

RE: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Duarte Carreira
Jason,

Have you looked at GeoKettle [1]? And recently I found GearScape [2], which 
seemed very interesting to me. Though neither is based on python...

Duarte Carreira

[1] - http://sourceforge.net/projects/geokettle/
[2] - http://www.fergonco.es/gearscape/index.php

From: Emilio Mayorga [emiliomayo...@gmail.com]
Sent: Tuesday, 12 January 2010 18:25
To: Jason Roberts
Cc: gdal-dev
Subject: Re: [gdal-dev] Open source vector geoprocessing libraries?

Hi Jason,

This may not be quite what you have in mind, but check out the PySAL
(Open Source Python Library for Spatial Analytical Functions) project:
http://geodacenter.asu.edu/pysal

I've never used it, and have only looked at a recent presentation
(http://conference.scipy.org/static/wiki/rey_pysal.pdf). It's not
clear that it includes or even aims to include the traditional spatial
operators provided by GEOS. I also have no idea if it uses OGR for its
vector data access. But the developers have done some terrific work in
spatial analysis tools in the past.

BTW, I'd love to see your marine spatial ecology tools moved to an
open source, platform neutral code base!

Cheers,

-Emilio Mayorga
Applied Physics Laboratory
University of Washington
Box 355640
Seattle, WA 98105-6698  USA


On Mon, Jan 11, 2010 at 2:32 PM, Jason Roberts jason.robe...@duke.edu wrote:
 Dear geospatial software experts,



 By integrating with GEOS, OGR can perform various spatial operations on
 individual geometries, such as buffer, intersection, union, and so on. Is
 there a library that efficiently performs these kinds of operations on
 entire OGRLayers? For example, this library would have functions that would
 buffer all of the features in a layer, or intersect all of the features in
 one layer with all of those in another. Basically, I am looking for an open
 source technology that replicates the geoprocessing tools found in ArcGIS
 and other GIS packages. These tools traditionally operate on one or more
 layers as input and produce one or more layers as output.



 If such a library does not exist, does the OGR team envision that they might
 add such capabilities to OGR in the future? From software design and
 performance points of view, would it be appropriate to extend OGR to include
 functions for spatial operations on entire layers, or is this best left to
 other libraries? I can see rudimentary ways to implement such tools (e.g.
 for intersecting layers: loop over all features in both layers, calling
 OGRGeometry::Touches on all combinations, or something similar). But I am
 not a geometry expert and do not know if OGRLayer's cursor-based design is
 compatible with such capabilities; I do not know about spatial indexing, for
 example.



 I develop open source geoprocessing tools that help with spatial ecology
 problems. At the moment, my tools depend heavily on ArcGIS for these
 operations with vector layers. I would like to remove this dependency, and,
 if possible, develop a toolbox that exposes the same ecology tools to
 several GIS packages. Many GIS packages, such as ArcGIS, QGIS, MapWindow,
 and OpenJump, support plugin extensions. I am wondering how
 difficult it would be to develop a package of tools that does not depend on
 a specific GIS package but exposes them to several packages via the
 package-specific plugin mechanisms. For this to work, I'd have to find a
 library that can do the kind of geoprocessing with layers that ArcGIS can
 do, or write my own. Writing it myself sounds daunting and I am hoping that
 there are existing projects to draw from.



 Thank you very much for any comments you can provide.



 Jason









Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Jan Hartmann



On 13-1-2010 2:33, Mateusz Loskot wrote:


OGR does not provide any spatial indexing layer common to various
vector datasets. For many simple formats it performs the brute-force
selection.
   

Just curious, would it make sense / be possible to implement indexing in 
OGR, something like a generalized version of Mapserver's shptree, the 
quadtree-based spatial index for shapefiles?


http://mapserver.org/utilities/shptree.html

Jan


Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Ari Jolma

Jan Hartmann wrote:



On 13-1-2010 2:33, Mateusz Loskot wrote:


OGR does not provide any spatial indexing layer common to various
vector datasets. For many simple formats it performs the brute-force
selection.
  
Just curious, would it make sense / be possible to implement indexing 
in OGR, something like a generalized version of Mapserver's shptree, 
the quadtree-based spatial index for shapefiles?


http://mapserver.org/utilities/shptree.html


It could make sense to have an in-memory index for in-memory geometries. 
Perhaps use the GiST library (1) (I don't know whether it can use in-memory 
indexes) for geometries in an OGRGeometryCollection or OGRMemLayer, if 
it's available.


For other formats it might not make sense because OGR is not responsible 
for the actual geometries. As has been said, one should use PostGIS, 
which has this functionality built in, for larger and more 
static datasets.


Just my quick thoughts.

Ari

(1) http://www.sai.msu.su/~megera/postgres/gist/



Jan


RE: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Jason Roberts
Mateusz,

Thank you very much for your insight. I have a few more questions I'm hoping
you could answer.

 Alternative is to try to divide the tasks:
 1. Query features from data source using spatial index capability of
 data source.
 2. Having only subject features selected, apply geometric processing.

That sounds like a reasonable approach. Considering just the simpler
scenarios, such as the one I mentioned, is it possible to implement
it efficiently with OGR compiled with GEOS? I believe OGR can pass through
SQL directly to the data source driver, allowing the caller to submit SQL
containing spatial operators. In principle, one could submit a spatial query
to PostGIS or SpatiaLite and efficiently get back the features (including
geometry) that could possibly intersect a bounding box. Then one could use
the GEOS functions on OGRGeometry to do the actual intersecting. Is that
what you were suggesting?

Of course, it may be that PostGIS or SpatiaLite can handle both steps 1 and
2 in a single query. If so, would it be best to do it that way?

It appears that the OGR shapefile driver supports a spatial indexing scheme
(.qix file) that is respected by OGRLayer::SetSpatialFilter. The
documentation says that "Currently this test may be inaccurately
implemented, but it is guaranteed that all features whose envelope (as
returned by OGRGeometry::getEnvelope()) overlaps the envelope of the spatial
filter will be returned." Therefore, it appears that the shapefile driver
can implement step 1 but not step 2. Is that correct?
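
To make the two steps concrete, here is a minimal pure-Python sketch of the "filter by envelope, then refine with an exact test" pattern. It deliberately avoids OGR and GEOS so that it runs anywhere; the helper names (`envelope`, `candidates`, `refine`) and the sample polygons are made up for illustration, and a point query with a ray-casting test stands in for a real geometric predicate.

```python
def envelope(ring):
    """Bounding box (minx, miny, maxx, maxy) of a list of (x, y) vertices."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def envelopes_overlap(a, b):
    """True when two bounding boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def point_in_ring(x, y, ring):
    """Exact test by ray casting: count edge crossings to the right of (x, y)."""
    inside = False
    count = len(ring)
    for i in range(count):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % count]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def candidates(features, query_env):
    """Step 1: cheap envelope test; may return false positives."""
    return [fid for fid, ring in features.items()
            if envelopes_overlap(envelope(ring), query_env)]


def refine(features, fids, x, y):
    """Step 2: exact geometric test on the surviving candidates only."""
    return [fid for fid in fids if point_in_ring(x, y, features[fid])]


# Hypothetical sample data: the L shape has the same envelope as the square.
features = {
    "square": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "l_shape": [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)],
    "far": [(10, 10), (12, 10), (12, 12), (10, 12)],
}
px, py = 3.0, 3.0
step1 = candidates(features, (px, py, px, py))
step2 = refine(features, step1, px, py)
```

The L-shaped polygon passes step 1 because its envelope covers the query point, and only the exact test in step 2 rejects it. That is the gap between an envelope-based filter and a true geometric predicate.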

 The problem with OGR and GEOS is cost of translation from OGR geometry
 to GEOS geometry. It can be a bottleneck.

Is it correct that this cost would only be incurred when you call OGR
functions implemented by GEOS, such as OGRGeometry::Intersects,
OGRGeometry::Disjoint, etc? 

 There are plenty of combinations, and my point is that if performance
 (not only in terms of speed, but of any resource) is critical, it would be
 extremely difficult to provide an efficient implementation of such
 features in OGR with a guaranteed or even determinable degree of
 complexity. Without these guarantees, I see little use for
 such a solution.

Yes, I see what you mean. But I suggest to the open source community that
there is still value in implementing such features, either as part of OGR or
another library, even if optimal performance cannot be guaranteed in all
scenarios. The reason is that ArcGIS provides such generic tools (e.g.
intersect/union/symdiff layers, regardless of underlying storage). These
geoprocessing tools are considered the most basic capabilities of ArcGIS,
available in the cheapest versions of the software. IMHO, if the open source
community wants to win over a large number of ArcGIS users to open GIS
systems, I believe the community needs to provide parity with these basic
tools.

Thanks again,

Jason


Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Ari Jolma

Jan Hartmann wrote:


On 13-1-2010 15:49, Ari Jolma wrote:

Jan Hartmann wrote:



Just curious, would it make sense / be possible to implement 
indexing in OGR, something like a generalized version of Mapserver's 
shptree, the quadtree-based spatial index for shapefiles?


http://mapserver.org/utilities/shptree.html


It could make sense to have an in-memory index for in-memory 
geometries. Perhaps use the GiST library (1) (I don't know whether it can 
use in-memory indexes) for geometries in an OGRGeometryCollection or 
OGRMemLayer, if it's available.


For other formats it might not make sense because OGR is not 
responsible for the actual geometries. As has been said, one should 
use PostGIS, which has this functionality built in, for larger 
and more static datasets.


Is that so? Reading the OGR API tutorial 
(http://www.gdal.org/ogr/ogr_apitut.html), I see that all geometries, 
from whatever input source, are represented internally as a generic 
OGRGeometry pointer, which is a virtual base class for all real 
geometry classes (http://www.gdal.org/ogr/classOGRGeometry.html). Most 
of the GEOS functionality can be implemented on OGRGeometries, so in 
principle the same could be done with indexing libraries (GIST, 
b-tree, quadtree, etc). Such indices should be written out to disk to 
be of any use at all, of course, like shptree does.




What I meant is that with formats other than the in-memory format, the 
features are stored on disk (possibly even on remote servers) and are only 
available for indexing when retrieved. When they are retrieved, they are 
of course OGR objects and accessible through the generic OGR API. Maybe 
it's possible, but it would probably mean that the library would need to 
retrieve and go through all the features, and prepare the index and store 
it in some local(?) file. Thus I think that for those formats, it's 
up to the format itself to provide the indexing or not.
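
The cost described above (one full pass over every feature, then an index stored in a local file) can be sketched as follows. This is an illustrative uniform-grid index in plain Python, not shptree's quadtree and not an OGR facility; the function names, the JSON sidecar layout and the sample envelopes are all assumptions made for the example.

```python
import json
import math
import os
import tempfile


def cell_range(low, high, cell_size):
    """Indices of the grid cells covered by the interval [low, high]."""
    return range(math.floor(low / cell_size), math.floor(high / cell_size) + 1)


def build_grid_index(envelopes, cell_size):
    """One full pass over all features: map each cell to the ids touching it."""
    cells = {}
    for fid, (minx, miny, maxx, maxy) in envelopes.items():
        for col in cell_range(minx, maxx, cell_size):
            for row in cell_range(miny, maxy, cell_size):
                cells.setdefault(f"{col},{row}", []).append(fid)
    return cells


def save_index(cells, cell_size, path):
    """Persist the index as a standalone sidecar file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump({"cell_size": cell_size, "cells": cells}, handle)


def load_index(path):
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    return data["cells"], data["cell_size"]


def query(cells, cell_size, env):
    """Return ids of features whose cells intersect the query envelope."""
    minx, miny, maxx, maxy = env
    hits = set()
    for col in cell_range(minx, maxx, cell_size):
        for row in cell_range(miny, maxy, cell_size):
            hits.update(cells.get(f"{col},{row}", []))
    return sorted(hits)


# Hypothetical envelopes, as if collected while reading a layer once.
envelopes = {1: (0, 0, 1, 1), 2: (5, 5, 6, 6), 3: (0.5, 0.5, 5.5, 5.5)}

with tempfile.TemporaryDirectory() as workdir:
    sidecar = os.path.join(workdir, "layer.idx.json")
    save_index(build_grid_index(envelopes, cell_size=2), 2, sidecar)

    # A later session only needs the sidecar, not a rescan of the data.
    cells, cell_size = load_index(sidecar)
    near_origin = query(cells, cell_size, (0, 0, 0.5, 0.5))
    upper_right = query(cells, cell_size, (4.5, 4.5, 7, 7))
```

The sketch also shows the weak spot: nothing ties the sidecar to the data, so if the underlying file or remote table changes, the stored index silently goes stale. That is one reason to leave indexing to the format itself.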


Ari



RE: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Jason Roberts
Hi Duarte,

Thanks for the suggestions.

I took a look at GeoKettle. Here are some relevant excerpts from a document:

GeoKettle is a ... powerful, metadata‐driven spatial ETL tool dedicated to the 
integration of different spatial data sources for building/updating geospatial 
data warehouses. At present, Oracle spatial, PostgreSQL/PostGIS and MySQL DBMS 
and the ESRI shapefiles are natively supported in read and write modes. Spatial 
Reference Systems management and coordinates transformations have been fully 
implemented. It is possible to access Geometry objects in JavaScript and define 
custom transformation steps (“Modified JavaScript Value” step). Topological 
predicates (Intersects, crosses, etc.) have all been implemented.

It looks interesting, but oriented to server applications. We are building a 
set of desktop GIS analysis tools. It would probably not be practical to try to 
embed GeoKettle in our application.

GearScape also looks interesting, with SQL-oriented geoprocessing, but it is 
more of an extensible GIS program than a geospatial library. Again, probably 
not practical to embed it in our app.

Best regards,

Jason


Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Peter J Halls

Jason,

Jason Roberts wrote:

Peter,


are you constrained to retaining your data in an ArcGIS compatible format?



We are attempting to build tools that can work with data stored in a variety
of formats. Our current user community uses mostly shapefiles, ArcGIS
personal geodatabases, and ArcGIS file geodatabases. Many of them are
ecologists who do not have the interest or skills to deploy a real DBMS
system. Thus we are hoping to provide tools that can work without one. This
is one reason I was exploring how embeddable PostGIS and SpatiaLite might be
in the other fork of this thread.


I wonder how many users are aware that ESRI have announced the file geodatabase 
as replacing the (Access) personal geodatabase?  They have not, as yet, 
announced a cut off for this format, but its many limitations as a result of 
Access capabilities may make this sooner rather than later.




Until the File Geodatabase format is published (later this year?) and
someone makes the effort to build an OGR interface, the DBMS route is
probably the best route to compatibility.


It would be really great for that to happen, but I'm not holding my breath.
If it does get published, I would seriously contemplate building an OGR
driver.


At the EMEA User Conference in November 2008 (London), ESRI announced that 
publication would accompany the release of ArcGIS 9.4.  They said that they see the 
file geodatabase replacing both the personal geodatabase and shapefiles.  I 
believe 9.4 is currently in beta test.




I have contemplated building an ArcObjects- or arcgisscripting-based driver.
This would at least allow people who have ArcGIS to use OGR to access any
ArcGIS layer, including those created by ArcGIS's tools for joining
arbitrary layers, etc. That would handle file geodatabases, as well as ALL
formats accessible from ArcGIS. If such a driver existed, then we could use
OGR as the base interface inside our application. But creating such a driver
would be a lot of work and have funky dependencies because it either needs
to use Windows COM (for ArcObjects) or Python (for arcgisscripting) to call
the ArcGIS APIs. I am certainly capable of implementing it but because most
of our code is in Python, it is probably easier for me to wrap OGR and
arcgisscripting behind a common abstraction, and then have our tools work
against that abstraction rather than OGR directly.


GDAL, including OGR, is actually embedded in ArcGIS: however I do not know quite 
what ESRI use it for.




At any rate, I'm sure it is nice being able to do all your work in a
spatially-enabled DBMS...


Also an attraction of PostgreSQL, of course.

Best wishes,

Peter


Peter J Halls, GIS Advisor, University of York
Telephone: 01904 433806 Fax: 01904 433740
Snail mail: Computing Service, University of York, Heslington, York YO10 5DD
This message has the status of a private and personal communication



Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Frank Warmerdam

Jan Hartmann wrote:
Is that so? Reading the OGR API tutorial 
(http://www.gdal.org/ogr/ogr_apitut.html), I see that all geometries, 
from whatever input source, are represented internally as a generic 
OGRGeometry pointer, which is a virtual base class for all real geometry 
classes (http://www.gdal.org/ogr/classOGRGeometry.html). Most of the 
GEOS functionality can be implemented on OGRGeometries, so in principle 
the same could be done with indexing libraries (GIST, b-tree, quadtree, 
etc). Such indices should be written out to disk to be of any use at 
all, of course, like shptree does.


Jan,

I have had trouble keeping up with this spirited discussion, but I wanted
to note that it is not intended that alternate implementations of geometries
be derived from OGRGeometry.  There are many places, for instance, that
assume an OGRGeometry can be cast to OGRLineString if its type is
wkbLineString.

Best regards,
--
---+--
I set the clouds in motion - turn up   | Frank Warmerdam, warmer...@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush| Geospatial Programmer for Rent



RE: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Ragi Y. Burhum

 Date: Wed, 13 Jan 2010 10:27:43 -0500
 From: Jason Roberts jason.robe...@duke.edu
 Subject: RE: [gdal-dev] Open source vector geoprocessing libraries?
 To: 'Peter J Halls' p.ha...@york.ac.uk
 Cc: 'gdal-dev' gdal-dev@lists.osgeo.org
 Message-ID: 008001ca9464$f4059f10$dc10dd...@roberts@duke.edu
 Content-Type: text/plain;   charset=US-ASCII

 Peter,

  are you constrained to retaining your data in an ArcGIS compatible
 format?


 We are attempting to build tools that can work with data stored in a
 variety
 of formats. Our current user community uses mostly shapefiles, ArcGIS
 personal geodatabases, and ArcGIS file geodatabases. Many of them are
 ecologists who do not have the interest or skills to deploy a real DBMS
 system. Thus we are hoping to provide tools that can work without one. This
 is one reason I was exploring how embeddable PostGIS and SpatiaLite might
 be
 in the other fork of this thread.

  Until the File
  Geodatabase format is published (later this year?) and someone has the
 effort to
  build an OGR interface, the DBMS route is probably the best route to
  compatibility.

 It would be really great for that to happen, but I'm not holding my breath.
 If it does get published, I would seriously contemplate building an OGR
 driver.

 I have contemplated building an ArcObjects- or arcgisscripting-based
 driver.
 This would at least allow people who have ArcGIS to use OGR to access any
 ArcGIS layer, including those created by ArcGIS's tools for joining
 arbitrary layers, etc. That would handle file geodatabases, as well as ALL
 formats accessible from ArcGIS. If such a driver existed, then we could use
 OGR as the base interface inside our application. But creating such a
 driver
 would be a lot of work and have funky dependencies because it either needs
 to use Windows COM (for ArcObjects) or Python (for arcgisscripting) to call
 the ArcGIS APIs. I am certainly capable of implementing it but because most
 of our code is in Python, it is probably easier for me to wrap OGR and
 arcgisscripting behind a common abstraction, and then have our tools work
 against that abstraction rather than OGR directly.


I find it very amusing that you mention this right now.

Why?

I asked Frank this past Thursday whether there was an ArcObjects-based OGR
driver, and he said "not that I know of". What I wanted was, among other
things, to get data out of FileGDB into PostGIS in one shot and add some
custom behavior for a client of mine. So I spent the past three days looking
at OGR drivers and wrote an ArcObjects-based one. I got it working
yesterday.

- Right now I only instantiate 3 factories (Enterprise GDB aka ArcSDE,
AccessDB and FileGDB). This means it reads FileGDB just fine. If you want
more factories, only one line of the driver has to be modified to add any
other factories, and everything else would just work.

- I only implemented the parts that I needed, so it is read-only (it should be
straightforward to expand if need be).

- Although it can read other GeoDatabase abstractions (Topology, Geometric
Networks, Annotations, Cadastral Fabrics, etc.), currently I am explicitly
filtering for FeatureClasses and FeatureDatasets.

- It is an ATL / COM / C++ based driver, so it will only compile on Windows. It
can be modified to use the cross-platform ArcEngine SDK, since all the COM
objects that I use have the same names and behave the same way... I just did
not have an ArcEngine SDK installer, so I could not test this.

Anyway, if you are interested in the source code, let me know. Perhaps we
can add it as an OGR driver contribution (what is the process for that,
anyway?). I may not respond quickly to e-mail, since the next 4 weeks
are pretty crazy for me.

- Ragi Burhum

Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Mateusz Loskot
Jan Hartmann wrote:
 On 13-1-2010 2:33, Mateusz Loskot wrote:
 
 OGR does not provide any spatial indexing layer common to various 
 vector datasets. For many simple formats it performs the 
 brute-force selection.
 
 Just curious, would it make sense / be possible to implement indexing
  in OGR, something like a generalized version of Mapserver's shptree,
  the quadtree-based spatial index for shapefiles?

This index implementation comes from Shapelib, written by Frank.
The very same bits of Shapelib are used in MapServer and OGR,
namely the .qix spatial index file support.
So, it's already there, but for Shapefiles only.

Back to the question: personally, I'm sceptical.
Recalling the example of processing two layers, one from a DBMS and one from
a file-based data source, how would it be supposed to work?
...a common .qix file generated for the DBMS data source?

In my opinion, this kind of functionality is out of scope for OGR.
I see OGR as a data provider. OGR is basically a translation library
that reads from one data source and writes to another data source,
providing a reasonably limited set of features to process data during
translation - a common denominator for popular vector spatial
data formats.

IMHO, it's a misunderstanding to consider OGR a fully featured data model
and I/O engine to read, write, process and analyse spatial vector data,
especially if performance is a critical factor. IMHO, there are too many
compromises in OGR.

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org


Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Mateusz Loskot
Jason Roberts wrote:
 Mateusz,
 
 Thank you very much for your insight. I have a few more questions I'm
 hoping you could answer.
 
 Alternative is to try to divide the tasks: 1. Query features from
 data source using spatial index capability of data source. 2.
 Having only subject features selected, apply geometric processing.
 
 That sounds like a reasonable approach. Considering just the simpler 
 scenarios, such as the one I mentioned, is it possible to implement 
 it efficiently with OGR compiled with GEOS?

Should be, but the OGRGeometry to geos::Geometry translation may add overhead.

 I believe OGR can pass through SQL directly to the data source
 driver, allowing the caller to submit SQL containing spatial
 operators. In principle, one could submit a spatial query to PostGIS
 or SpatiaLite and efficiently get back the features (including 
 geometry) that could possibly intersect a bounding box. Then one
 could use the GEOS functions on OGRGeometry to do the actual
 intersecting. Is that what you were suggesting?

Yes, that's the concept.
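
A rough sketch of the two-step approach discussed here, in plain Python. It is an illustration under stated assumptions, not OGR or GEOS code: the function names (envelope, envelopes_overlap, point_in_polygon, select_points) are invented for this example, features are bare (x, y) points, and the exact test is simple ray casting standing in for a GEOS predicate.

```python
# Illustrative only: plain-Python stand-ins, not OGR or GEOS calls.

def envelope(points):
    """Return (minx, miny, maxx, maxy) for a list of (x, y) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def envelopes_overlap(a, b):
    """True if two (minx, miny, maxx, maxy) boxes share any point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def point_in_polygon(pt, ring):
    """Exact test by ray casting: is pt inside the polygon ring?"""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def select_points(points, polygon):
    """Step 1: cheap envelope filter. Step 2: exact test on the survivors."""
    box = envelope(polygon)
    candidates = [
        p for p in points if envelopes_overlap((p[0], p[1], p[0], p[1]), box)
    ]
    hits = [p for p in candidates if point_in_polygon(p, polygon)]
    return candidates, hits


triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
points = [(1.0, 1.0), (3.0, 3.0), (5.0, 5.0)]
candidates, hits = select_points(points, triangle)
```

Here the envelope filter keeps two of the three points, and the exact test then rejects the one that lies inside the triangle's bounding box but outside the triangle itself. The expensive test only runs on the candidates, which is the point of splitting the work.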

 Of course, it may be that PostGIS or SpatiaLite can handle both steps
 1 and 2 in a single query. If so, would it be best to do it that way?
 
It's usually a good idea to let the DBMS engine do as much as
possible, so that looks like a good idea to me.

 It appears that the OGR shapefile driver supports a spatial indexing
 scheme (.qix file) that is respected by OGRLayer::SetSpatialFilter.
 The documentation says that "Currently this test is may be
 inaccurately implemented, but it is guaranteed that all features
 who's envelope (as returned by OGRGeometry::getEnvelope()) overlaps
 the envelope of the spatial filter will be returned." Therefore, it
 appears that the shapefile driver can implement step 1 but not step
 2. Is that correct?

Yes.

 The problem with OGR and GEOS is cost of translation from OGR
 geometry to GEOS geometry. It can be a bottleneck.
 
 Is it correct that this cost would only be incurred when you call OGR
  functions implemented by GEOS, such as OGRGeometry::Intersects, 
 OGRGeometry::Disjoint, etc?

Yes.

Namely, this is where the potential cost is incurred:

http://trac.osgeo.org/gdal/browser/trunk/gdal/ogr/ogrgeometry.cpp#L333

 There are plenty of combinations, and my point is that if performance
 (not only in terms of speed, but of any resource) is critical, it
 would be extremely difficult to provide an efficient implementation
 of such features in OGR with a guaranteed or even determinable degree
 of complexity. Without these guarantees, I see little use for
 such a solution.
 
 Yes, I see what you mean. But I suggest to the open source community
 that there is still value in implementing such features, either as
 part of OGR or another library, even if optimal performance cannot be
 guaranteed in all scenarios.

Perhaps you'll find these inspiring:

http://trac.osgeo.org/qgis/browser/trunk/qgis/src/analysis/vector

Look at the Java camp too.

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org


RE: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-12 Thread Jason Roberts
Frank,

Thanks for your thoughts on this.

 I'd like to see something along this line happen.  To do it efficiently
 it would be necessary to dig into GEOS past the C interface so that
 a spatial index on a collection of features can be maintained over time
 rather than created and discarded for each pairwise test of two
 geometries.
 
 I am somewhat hesitant to have this sort of processing go into GDAL/OGR
 itself, especially as an extensive set of methods on OGRLayer.  I think
 it could be done as a layered processing library without any noticeable
 loss of performance.

Do you know of anyone working on such a library?

It sounds like such a library would sit on top of GDAL/OGR, to leverage the
abstraction of data sources and layers. Although I am not yet familiar with
efficient algorithms for operations with layers, I suspect that library
would need spatial index support from OGR. The underlying data sources often
maintain spatial indexes. OGR would need to expose these via a new
abstraction (new methods on OGRLayer, for example). If the underlying
source did not support spatial indexes, perhaps OGR could loop through the
layer, build an index with GEOS, and expose that via the same abstraction.
Is that similar to what you were thinking?
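
To make the "loop through the layer and build an index" idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: GridIndex and build_index are invented names, the index is a toy uniform grid rather than the quadtree or R-tree a real library would likely use, and features are reduced to (fid, envelope) pairs such as a layer cursor might yield.

```python
from collections import defaultdict


class GridIndex:
    """Toy uniform-grid spatial index over feature envelopes (illustration only)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)   # (col, row) -> set of feature ids
        self.envelopes = {}             # feature id -> (minx, miny, maxx, maxy)

    def _cells_for(self, env):
        """Yield every grid cell touched by an envelope."""
        minx, miny, maxx, maxy = env
        c0, c1 = int(minx // self.cell_size), int(maxx // self.cell_size)
        r0, r1 = int(miny // self.cell_size), int(maxy // self.cell_size)
        for col in range(c0, c1 + 1):
            for row in range(r0, r1 + 1):
                yield (col, row)

    def insert(self, fid, env):
        self.envelopes[fid] = env
        for cell in self._cells_for(env):
            self.cells[cell].add(fid)

    def query(self, env):
        """Return sorted ids of features whose envelope overlaps env."""
        found = set()
        for cell in self._cells_for(env):
            for fid in self.cells.get(cell, ()):
                e = self.envelopes[fid]
                if e[0] <= env[2] and env[0] <= e[2] and e[1] <= env[3] and env[1] <= e[3]:
                    found.add(fid)
        return sorted(found)


def build_index(features, cell_size=10.0):
    """One pass over an iterable of (fid, envelope) pairs."""
    index = GridIndex(cell_size)
    for fid, env in features:
        index.insert(fid, env)
    return index


layer = [(1, (0, 0, 5, 5)), (2, (12, 12, 18, 18)), (3, (4, 4, 13, 13))]
index = build_index(layer)
result = index.query((10, 10, 11, 11))
```

The query only inspects features registered in the cells it touches, so feature 1 is never compared at all, and feature 2 is compared once and rejected. A source with its own native index would presumably skip build_index and answer query directly.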

It sounds like there is not presently an open source project that provides
this geoprocessing with layers functionality. If not, I will still have to
use ArcGIS for my own project, but I would like to hide ArcGIS behind an
abstraction that is likely to be architecturally compatible with a future
library, so that maybe I could swap it in at some future point. This is why
I am probing for more details on what you envision, even if those ideas are
still somewhat distant.
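
One possible shape for such an abstraction, sketched in Python since most of the code in question is Python. All names here (VectorLayer, InMemoryLayer, count_in_envelope) are hypothetical, and the in-memory backend with point geometries is only a stand-in; a real backend might wrap OGR or arcgisscripting behind the same two methods.

```python
from abc import ABC, abstractmethod


class VectorLayer(ABC):
    """Hypothetical backend-neutral layer interface; all names are invented."""

    @abstractmethod
    def features(self):
        """Yield (fid, geometry, attributes) tuples."""

    @abstractmethod
    def features_in_envelope(self, env):
        """Yield features whose geometry falls in env = (minx, miny, maxx, maxy)."""


class InMemoryLayer(VectorLayer):
    """Stand-in backend holding point features in a list."""

    def __init__(self, rows):
        self._rows = list(rows)

    def features(self):
        return iter(self._rows)

    def features_in_envelope(self, env):
        minx, miny, maxx, maxy = env
        for fid, (x, y), attrs in self._rows:
            if minx <= x <= maxx and miny <= y <= maxy:
                yield fid, (x, y), attrs


def count_in_envelope(layer, env):
    """A 'tool' written only against the abstraction, not a specific backend."""
    return sum(1 for _ in layer.features_in_envelope(env))


layer = InMemoryLayer([
    (1, (1.0, 1.0), {"species": "a"}),
    (2, (9.0, 9.0), {"species": "b"}),
])
n = count_in_envelope(layer, (0.0, 0.0, 5.0, 5.0))
```

Because count_in_envelope never touches the backend directly, swapping in a different VectorLayer subclass later would not require changing the tool.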

Thanks,

Jason



Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-12 Thread Doug_Newcomb

Jason,
  If you're working with vector data, why not throw the data into
PostgreSQL/PostGIS, http://postgis.refractions.net, and use the spatial
operators there to select/buffer/intersect the vector geometries as you
describe?  See http://postgis.refractions.net/documentation/manual-1.4/ch07.html
for geoprocessing operations.  Your application can pass SQL
commands to the database.  You can use OGR to load data into, and export your
finished product from, PostgreSQL/PostGIS.

  You might be able to do similar things in SpatiaLite,
http://www.gaia-gis.it/spatialite/spatialite-tutorial-2.3.1.html#t4.

Doug

Doug Newcomb
USFWS
Raleigh, NC
919-856-4520 ext. 14 doug_newc...@fws.gov
-

The opinions I express are my own and are not representative of the
official policy of the U.S.Fish and Wildlife Service or Dept. of the
Interior.   Life is too short for undocumented, proprietary data formats.


   
Jason Roberts jason.robe...@duke.edu
Sent by: gdal-dev-bounces@lists.osgeo.org
01/11/2010 05:32 PM
To: 'gdal-dev' gdal-dev@lists.osgeo.org
cc:
Subject: [gdal-dev] Open source vector geoprocessing libraries?




Dear geospatial software experts,

By integrating with GEOS, OGR can perform various spatial operations on
individual geometries, such as buffer, intersection, union, and so on. Is
there a library that efficiently performs these kinds of operations on
entire OGRLayers? For example, this library would have functions that would
buffer all of the features in a layer, or intersect all of the features in
one layer with all of those in another. Basically, I am looking for an open
source technology that replicates the geoprocessing tools found in ArcGIS
and other GIS packages. These tools traditionally operate on one or more
layers as input and produce one or more layers as output.

If such a library does not exist, does the OGR team envision that they
might add such capabilities to OGR in the future? From software design and
performance points of view, would it be appropriate to extend OGR to
include functions for spatial operations on entire layers, or is this best
left to other libraries? I can see rudimentary ways to implement such tools
(e.g. for intersecting layers: loop over all features in both layers,
calling OGRGeometry::Touches on all combinations, or something similar).
But I am not a geometry expert and do not know if OGRLayer's cursor-based
design is compatible with such capabilities; I do not know about spatial
indexing, for example.

I develop open source geoprocessing tools that help with spatial ecology
problems. At the moment, my tools depend heavily on ArcGIS for these
operations with vector layers. I would like to remove this dependency, and,
if possible, develop a toolbox that exposes the same ecology tools to
several GIS packages. Many GIS packages, such as ArcGIS, QGIS, MapWindow,
and OpenJump, support plugin extensions. I am wondering how
difficult it would be to develop a package of tools that does not depend on
a specific GIS package but exposes them to several packages via the
package-specific plugin mechanisms. For this to work, I'd have to find a
library that can do the kind of geoprocessing with layers that ArcGIS can
do, or write my own. Writing it myself sounds daunting, and I am hoping that
there are existing projects to draw from.

Thank you very much for any comments you can provide.

Jason



Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-12 Thread Emilio Mayorga
Hi Jason,

This may not be quite what you have in mind, but check out the PySAL
(Open Source Python Library for Spatial Analytical Functions) project:
http://geodacenter.asu.edu/pysal

I've never used it, and have only looked at a recent presentation
(http://conference.scipy.org/static/wiki/rey_pysal.pdf). It's not
clear that it includes or even aims to include the traditional spatial
operators provided by GEOS. I also have no idea if it uses OGR for its
vector data access. But the developers have done some terrific work in
spatial analysis tools in the past.

BTW, I'd love to see your marine spatial ecology tools moved to an
open source, platform neutral code base!

Cheers,

-Emilio Mayorga
Applied Physics Laboratory
University of Washington
Box 355640
Seattle, WA 98105-6698  USA




RE: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-12 Thread Jason Roberts
Doug,

 

Thanks for these suggestions. It looks like PostGIS and SpatiaLite both
provide a SQL-based approach for accomplishing what I need. Both look
promising and I will dig into them in more detail.

 

It might be less than optimal to load data into one of these, execute the
desired spatial query, and export data back out. But there is probably no
suitable alternative that provides a complete set of spatial operators that
is any faster. I'm sure a big part of executing efficient spatial queries is
having a spatial index. Even OGR does not appear to expose spatial indexes
that may be maintained by the underlying data sources. Thus any
geoprocessing library that sits on OGR or a similar API must already
retrieve all records, build a spatial index, then execute the spatial query.
This is basically the same thing as loading data into PostGIS or SpatiaLite
and then executing the query.

 

I have tons of questions but will resist asking all but one: do you know how
well these systems can be embedded in other software? In my collection of
tools, I want the infrastructure that supports them to be hidden and
config-less. Although I have not used SQLite, I know it is designed
explicitly for easy embedding, so it seems promising. What about Postgres?
In my past experience, it appeared to be much more of a full-blown
enterprise database system, designed to run as a service or daemon, listen
for connections, etc. If it can be easily embedded, I might prefer to use
it, as PostGIS appears to provide a richer set of spatial operators.

 

Jason

 


RE: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-12 Thread Jason Roberts
Emilio,

Thanks for the suggestion of PySAL. It does look interesting, but as you
speculated, it does not seem to aim to include the traditional spatial
operators. Instead it looks like a collection of various interesting
algorithms, implemented in Python on top of SciPy, NumPy, spatialindex, and
Rtree. This might be useful for specific problems, but I need a more
comprehensive library of the traditional stuff.

 BTW, I'd love to see your marine spatial ecology tools moved to an
 open source, platform neutral code base!

Yes, we would love that too. At the moment, I am evaluating whether we
should develop our next batch of tools under our existing framework which
depends heavily on ArcGIS, or take a time-out to rework the framework to
eliminate that dependency. I have already done pieces of it, here and there,
but this vector geoprocessing functionality is a key blocker that remains
unresolved.

Best,

Jason



Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-12 Thread Stephen Woodbridge

Jason Roberts wrote:

Doug,

 

Thanks for these suggestions. It looks like PostGIS and SpatiaLite both 
provide a SQL-based approach for accomplishing what I need. Both look 
promising and I will dig into them in more detail.


 

It might be less than optimal to load data into one of these, execute 
the desired spatial query, and export data back out. But there is 
probably no suitable alternative that provides a complete set of spatial 
operators that is any faster. I'm sure a big part of executing efficient 
spatial queries is having a spatial index. Even OGR does not appear to 
expose spatial indexes that may be maintained by the underlying data 
sources. Thus any geoprocessing library that sits on OGR or a similar 
API must already retrieve all records, build a spatial index, then 
execute the spatial query. This is basically the same thing as loading 
data into PostGIS or SpatiaLite and then executing the query.


 

I have tons of questions but will resist asking all but one: do you know 
how well these systems can be embedded in other software? In my 
collection of tools, I want the infrastructure that supports them to be 
hidden and config-less. Although I have not used SQLite, I know it is 
designed explicitly for easy embedding, so it seems promising. What 
about Postgres? In my past experience, it appeared to be much more of a 
full-blown enterprise database system, designed to run as a service or 
daemon, listen for connections, etc. If it can be easily embedded, I 
might prefer to use it, as PostGIS appears to provide a richer set of 
spatial operators.


I have used SQLite in a bunch of tools. It is easy and straightforward. 
You can do any config stuff in your code, so it is config-less from the 
user's point of view.


I also use PostGIS for a lot of stuff, but you are right that you need 
to set up a server and have your tools connect to it. Once the server is 
set up and running, the client applications just need a valid login 
to the server to do whatever they want.


For Perl I tend to use PostGIS and for C code I tend to use SQLite. I 
have looked at the SpatiaLite extensions but have not really used them yet.


If you are building a system where you don't want to deal with a 
database server, I would have no qualms about using SQLite and SpatiaLite 
for building an embedded solution.
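
To illustrate the config-less point with something concrete: Python ships an sqlite3 module in its standard library, so a tool can create and query a database entirely from its own code. The sketch below is only that, a sketch: the table name (sites) and its columns are made up, and SpatiaLite is deliberately not loaded, so the "spatial" query is just a bounding-box comparison on plain numeric columns.

```python
import sqlite3

# In-memory database: nothing to install, configure, or start.
# Pass a file path instead of ":memory:" to keep the data between runs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sites (id INTEGER PRIMARY KEY, name TEXT, x REAL, y REAL)"
)
conn.executemany(
    "INSERT INTO sites (id, name, x, y) VALUES (?, ?, ?, ?)",
    [(1, "reef", 1.0, 1.0), (2, "shoal", 9.0, 9.0), (3, "bank", 4.0, 2.0)],
)

# Plain-SQL bounding-box selection. SpatiaLite would add real geometry types
# and spatial functions on top of this; it is not used here.
rows = conn.execute(
    "SELECT name FROM sites "
    "WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ? ORDER BY id",
    (0.0, 5.0, 0.0, 5.0),
).fetchall()
names = [name for (name,) in rows]
conn.close()
```

The user of such a tool never sees a server, a port, or a login, which is the property being asked about.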


-Steve
 http://imaptools.com/


Jason

 


*From:* doug_newc...@fws.gov [mailto:doug_newc...@fws.gov]
*Sent:* Tuesday, January 12, 2010 12:29 PM
*To:* Jason Roberts
*Cc:* 'gdal-dev'; gdal-dev-boun...@lists.osgeo.org
*Subject:* Re: [gdal-dev] Open source vector geoprocessing libraries?

 


Jason,
If you're working with vector data, why not throw the data into 
Postgresql/Postgis, http://postgis.refractions.net, and use the spatial 
operators there to select/buffer/intersect the vector geometries as you 
describe. 
http://postgis.refractions.net/documentation/manual-1.4/ch07.html for 
geoprocessing operations. Your application can pass SQL commands to the 
database. You can use ogr to load data /export your finished product 
to/from postgresql/postgis .


You might be able to similar things in spatialite, 
http://www.gaia-gis.it/spatialite/spatialite-tutorial-2.3.1.html#t4.


Doug

Doug Newcomb
USFWS
Raleigh, NC
919-856-4520 ext. 14 doug_newc...@fws.gov
-
The opinions I express are my own and are not representative of the 
official policy of the U.S.Fish and Wildlife Service or Dept. of the 
Interior. Life is too short for undocumented, proprietary data formats.


*From:* Jason Roberts jason.robe...@duke.edu
*Sent by:* gdal-dev-boun...@lists.osgeo.org
*Sent:* 01/11/2010 05:32 PM
*To:* 'gdal-dev' gdal-dev@lists.osgeo.org
*Subject:* [gdal-dev] Open source vector geoprocessing libraries?

Dear geospatial software experts,

By integrating with GEOS, OGR can perform various spatial operations on 
individual geometries, such as buffer, intersection, union, and so on. 
Is there a library that efficiently performs these kinds of operations 
on entire OGRLayers? For example, this library would have functions that 
would buffer all of the features in a layer, or intersect all of the 
features in one layer with all of those in another. Basically, I am 
looking for an open source technology that replicates the geoprocessing 
tools found in ArcGIS and other GIS packages. These tools traditionally 
operate on one or more layers as input and produce one or more layers as 
output.


If such a library does not exist, does the OGR team envision that they 
might add such capabilities to OGR in the future? From software design 
and performance points of view, would it be appropriate to extend OGR to 
include functions for spatial operations on entire layers, or is this 
best

Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-12 Thread Mateusz Loskot
Jason Roberts wrote:
 By integrating with GEOS, OGR can perform various spatial operations on
 individual geometries, such as buffer, intersection, union, and so on. Is
 there a library that efficiently performs these kinds of operations on
 entire OGRLayers? For example, this library would have functions that would
 buffer all of the features in a layer, or intersect all of the features in
 one layer with all of those in another. Basically, I am looking for an open
 source technology that replicates the geoprocessing tools found in ArcGIS
 and other GIS packages. These tools traditionally operate on one or more
 layers as input and produce one or more layers as output.

What prevents you from calling GEOS and processing all the features in a layer?

I've used GEOS to process layers with large numbers of features,
in a manner similar to PostGIS, but programmatically (C++).

For example, generating buffers from tens of polygons,
or performing boolean operations like cookie-cutting
a layer of 1000-2000 polygons with one polygon.

IMHO, I can't see much point in making a new library.

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org


RE: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-12 Thread Jason Roberts
Mateusz,

I'm not an expert in this area, but I think that big performance gains can
be obtained by using a spatial index. For example, consider a situation
where you want to clip out a study region from the full resolution GSHHS
shoreline database, a polygon layer. The shoreline polygons have very large,
complicated geometries. It would be expensive to loop over every polygon,
loading its full geometry and calling GEOS. Instead, you would use the
spatial index to isolate the polygons that are likely to overlap with the
study region, then loop over just those ones. I do not know much about
spatial indexes yet, but I suspect they do something like store the
rectangular envelope of each feature, which can then be quickly compared to
other envelopes to determine whether it is possible for features to overlap.
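
To make the envelope idea concrete, here is a minimal sketch in plain Python. It is not OGR or GEOS code: the names envelope_of, envelopes_overlap and candidate_ids, and the sample polygons, are made up for illustration. It assumes each feature's rectangular envelope is stored up front, so that only features whose envelope overlaps the study region need their full geometry loaded and passed to an exact test.

```python
from typing import Iterable, List, Sequence, Tuple

Point = Tuple[float, float]
Envelope = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def envelope_of(vertices: Iterable[Point]) -> Envelope:
    """Return the axis-aligned bounding rectangle of a vertex list."""
    xs, ys = zip(*vertices)
    return (min(xs), min(ys), max(xs), max(ys))


def envelopes_overlap(a: Envelope, b: Envelope) -> bool:
    """True if the two rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def candidate_ids(envelopes: Sequence[Envelope], region: Envelope) -> List[int]:
    """Positions of features whose envelope overlaps the study region.

    Only these candidates need their full geometry loaded and tested exactly.
    """
    return [i for i, env in enumerate(envelopes) if envelopes_overlap(env, region)]


# Hypothetical data: three small polygons and a rectangular study region.
polygons = [
    [(0, 0), (4, 0), (4, 3), (0, 3)],
    [(10, 10), (12, 10), (11, 14)],
    [(3, 2), (6, 2), (6, 5), (3, 5)],
]
stored_envelopes = [envelope_of(p) for p in polygons]
study_region = (3.5, 2.5, 5.0, 4.0)
print(candidate_ids(stored_envelopes, study_region))
```

This still scans every envelope, so it only saves the cost of loading and testing full geometries; a real spatial index would also organise the envelopes so that most of them are never visited.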

If OGR takes advantage of spatial indexes internally (e.g. if the data
source drivers can tell the core about these indexes, and the core can use
them when OGRLayer::SetSpatialFilter is called), then many scenarios could
be efficiently implemented by just OGR and GEOS alone. Of course, it might
get more complicated when you have two layers to perform the operation on,
rather than one layer and a single feature. I'm sure that others have done a
lot of thinking about how to optimize the different scenarios, while I
haven't done much. This is why I wondered if there was another library for
doing this kind of thing.

If not, then your suggestion may be as fast as any other. For example, the
idea of loading the features into PostGIS or SpatiaLite will require
loading all of the full geometries, passing them to another database system,
etc, etc. It may be that shuffling all of the data around will be hugely
expensive and that just using OGR functions with simple approaches like
calling GEOS from nested loops will be faster than shuffling the data to a
system that implements a more efficient approach once the data gets there.
Is that basically what you are saying? Or have I totally missed the point?

Thanks for your thoughts,

Jason



Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-12 Thread Mateusz Loskot
Jason Roberts wrote:
 Mateusz,
 
 I'm not an expert in this area, but I think that big performance 
 gains can be obtained by using a spatial index.

Yes, likely true.

 For example, consider a situation where you want to clip out a study 
 region from the full resolution GSHHS shoreline database, a polygon 
 layer. The shoreline polygons have very large, complicated 
 geometries. It would be expensive to loop over every polygon, loading
  its full geometry and calling GEOS. Instead, you would use the 
 spatial index to isolate the polygons that are likely to overlap with
  the study region, then loop over just those ones.

GEOS, like JTS, provides support for various spatial indexes.
It is possible to index the data and optimise the processing in the manner
you mention. In fact, GEOS uses an index internally in various operations.
The problem is that such an index is not persistent: it is not serialised
anywhere, so it exists in memory only. In fact, there are many more
problems than this one.

BTW, PostGIS does provide index serialisation.

OGR does not provide any spatial indexing layer common to the various
vector datasets. For many simple formats it performs brute-force
selection.

An alternative is to divide the task:
1. Query features from the data source using the spatial index capability
of the data source.
2. With only the features of interest selected, apply the geometric processing.

I did it that way, actually.
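
The two steps might be sketched as follows. This is plain Python, not OGR or GEOS: GridIndex, process_region and the sample data are hypothetical, a uniform in-memory grid stands in for whatever index the data source offers, and len() stands in for the real geometric operation. The point is only that step 2 touches the geometries selected by step 1 and nothing else.

```python
import math
from collections import defaultdict
from typing import Callable, Dict, Iterator, List, Set, Tuple, TypeVar

Envelope = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)
G = TypeVar("G")
R = TypeVar("R")


def overlaps(a: Envelope, b: Envelope) -> bool:
    """True if the two rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class GridIndex:
    """Uniform grid over feature envelopes; it lives in memory only."""

    def __init__(self, cell_size: float) -> None:
        self.cell_size = cell_size
        self.cells: Dict[Tuple[int, int], Set[int]] = defaultdict(set)
        self.envelopes: Dict[int, Envelope] = {}

    def _cells_for(self, env: Envelope) -> Iterator[Tuple[int, int]]:
        """Yield every grid cell touched by the envelope."""
        x0 = math.floor(env[0] / self.cell_size)
        y0 = math.floor(env[1] / self.cell_size)
        x1 = math.floor(env[2] / self.cell_size)
        y1 = math.floor(env[3] / self.cell_size)
        for cx in range(x0, x1 + 1):
            for cy in range(y0, y1 + 1):
                yield (cx, cy)

    def insert(self, fid: int, env: Envelope) -> None:
        self.envelopes[fid] = env
        for cell in self._cells_for(env):
            self.cells[cell].add(fid)

    def query(self, region: Envelope) -> List[int]:
        """Step 1: ids of features whose envelope overlaps the region."""
        found: Set[int] = set()
        for cell in self._cells_for(region):
            found |= self.cells.get(cell, set())
        return sorted(f for f in found if overlaps(self.envelopes[f], region))


def process_region(
    index: GridIndex,
    region: Envelope,
    load_geometry: Callable[[int], G],
    operation: Callable[[G], R],
) -> Dict[int, R]:
    """Step 2: load and process only the features selected in step 1."""
    return {fid: operation(load_geometry(fid)) for fid in index.query(region)}


# Hypothetical data source: envelopes are indexed, geometries are loaded on demand.
geometries = {
    1: [(0, 0), (2, 0), (2, 2), (0, 2)],
    2: [(50, 50), (60, 50), (60, 60)],
    3: [(1, 1), (5, 1), (5, 5), (1, 5)],
}
index = GridIndex(cell_size=10.0)
for fid, ring in geometries.items():
    xs, ys = zip(*ring)
    index.insert(fid, (min(xs), min(ys), max(xs), max(ys)))

loaded: List[int] = []


def load_geometry(fid: int) -> List[Tuple[int, int]]:
    """Record which features were actually loaded, then return the geometry."""
    loaded.append(fid)
    return geometries[fid]


# len() is a stand-in for the real geometric processing.
result = process_region(index, (1.5, 1.5, 3.0, 3.0), load_geometry, len)
print(result, loaded)
```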

 If OGR takes advantage of spatial indexes internally (e.g. if the 
 data source drivers can tell the core about these indexes, and the 
 core can use them when OGRLayer::SetSpatialFilter is called), then 
 many scenarios could be efficiently implemented by just OGR and GEOS 
 alone.

The problem with OGR and GEOS is the cost of translating an OGR geometry
into a GEOS geometry. It can be a bottleneck.

However, if such processing functionality were built into OGR,
it would make sense, but I still see limitations.

Let's brainstorm a bit and assume OGR implements this operation:

OGRLayer OGR::SymDifference(OGRLayer layer1, OGRLayer layer2);

Depending on the data source, OGR could exploit its capabilities.
If both layers sit in the same PostGIS (or other spatial)
database, OGR just delegates the processing to PostGIS,
where ST_SymDifference is executed, and OGR only grabs the
results and generates an OGRLayer.
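
As a rough illustration of the delegation case, the query handed to the database might be built as below. The helper symdifference_sql, the table names and the geom column are all hypothetical. Dissolving each layer with ST_Union before calling ST_SymDifference is only one possible reading of a layer-level symmetric difference; the thread does not specify the semantics.

```python
def symdifference_sql(table1: str, table2: str, geom_column: str = "geom") -> str:
    """Build a query that lets PostGIS compute the layer-level result.

    The identifiers are interpolated directly, so they must come from
    trusted input (hypothetical helper, for illustration only).
    """
    return (
        f"SELECT ST_SymDifference(a.g, b.g) AS {geom_column} "
        f"FROM (SELECT ST_Union({geom_column}) AS g FROM {table1}) AS a, "
        f"(SELECT ST_Union({geom_column}) AS g FROM {table2}) AS b"
    )


# Hypothetical table names standing in for layer1 and layer2.
print(symdifference_sql("layer1", "layer2"))
```

The resulting string could then be passed to the database, for instance through OGR's ExecuteSQL on a PostgreSQL data source, so that only the result geometry travels back to the client.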

What if layer1 is a Shapefile and layer2 is an Oracle table?
Let's assume the Shapefile has a .qix file with a spatial index
and Oracle has its own index. What does OGR do?

Does it load the .qix into memory, then grab layer2 and decide which
features to select from layer1?
Does it load the whole Shapefile into memory and use the Oracle index to
select features from layer2 masked by layer1?
How does it calculate the cost of transferring which layer in which
direction, etc.?

Certainly, it depends on the number of elements, which algorithm is used,
the direction in which the algorithm is applied (which layer is the
subject and which the object), and much more.
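
To show how quickly such a decision becomes a matter of heuristics, here is a deliberately naive planner in plain Python. LayerInfo, choose_strategy and the source labels are invented for this sketch; it looks only at feature counts and index availability and ignores transfer cost, memory, and the choice of algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerInfo:
    name: str
    source: str             # hypothetical label, e.g. "postgis:gisdb" or "shapefile"
    feature_count: int
    has_spatial_index: bool


def is_database(layer: LayerInfo) -> bool:
    """True for sources that could run the operation themselves (assumed set)."""
    return layer.source.split(":")[0] in {"postgis", "oracle"}


def choose_strategy(a: LayerInfo, b: LayerInfo) -> str:
    """Pick a strategy from a few coarse facts about the two layers."""
    if a.source == b.source and is_database(a):
        return f"delegate to {a.source}"
    # Otherwise loop over the smaller layer and probe the larger one.
    outer, inner = (a, b) if a.feature_count <= b.feature_count else (b, a)
    if inner.has_spatial_index:
        return f"loop over {outer.name}, probe {inner.name} via its index"
    if outer.has_spatial_index:
        return f"loop over {inner.name}, probe {outer.name} via its index"
    return f"brute force: nested loops over {outer.name} and {inner.name}"


# Hypothetical layers matching the Shapefile/Oracle scenario.
shp = LayerInfo("layer1", "shapefile", 1500, True)
ora = LayerInfo("layer2", "oracle:orcl", 200000, True)
print(choose_strategy(shp, ora))
```

Even this toy version has to guess; every factor it leaves out is another way for the chosen plan to be the wrong one.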

There are plenty of combinations, and my point is that if performance (not
only in terms of speed, but of any resource) is critical, it would be
extremely difficult to provide an efficient implementation of such
features in OGR with a guaranteed or even determinable degree of
complexity. Without these guarantees, I see little use for
such a solution.

Given that, I would, depending on needs, write a specialised application
using available tools like OGR and GEOS, optimised for the
specifics of the datasets, type of processing, system requirements, etc.

 If not, then your suggestion may be as fast as any other. For 
 example, the idea of loading the features in to PostGIS or SpatiaLite
  will require loading all of the full geometries, passing them to 
 another database system, etc, etc. It may be that shuffling all of 
 the data around will be hugely expensive and that just using OGR 
 functions with simple approaches like calling GEOS from nested loops 
 will be faster than shuffling the data to a system that implements a 
 more efficient approach once the data gets there.

It's never "just using". Performance is usually a concern with large
datasets. Large datasets are unlikely to be stored in a simple
format; they are more likely to sit in a proper spatial data store, like
PostGIS. It nicely combines all the elements necessary to perform geometric
processing in a usable and optimised form, with an index.

 Is that basically what you are saying?

It is.

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org


Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-11 Thread Frank Warmerdam

Jason Roberts wrote:

Dear geospatial software experts,

 

By integrating with GEOS, OGR can perform various spatial operations on 
individual geometries, such as buffer, intersection, union, and so on. 
Is there a library that efficiently performs these kinds of operations 
on entire OGRLayers? For example, this library would have functions that 
would buffer all of the features in a layer, or intersect all of the 
features in one layer with all of those in another. Basically, I am 
looking for an open source technology that replicates the geoprocessing 
tools found in ArcGIS and other GIS packages. These tools traditionally 
operate on one or more layers as input and produce one or more layers as 
output.


Jason,

I'd like to see something along these lines happen.  To do it efficiently
it would be necessary to dig into GEOS past the C interface so that
a spatial index on a collection of features can be maintained over time
rather than created and discarded for each pairwise test of two geometries.

I am somewhat hesitant to have this sort of processing go into GDAL/OGR
itself, especially as an extensive set of methods on OGRLayer.  I think
it could be done as a layered processing library without any noticeable
loss of performance.
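
A minimal sketch of what "maintained over time" could look like in a layered library follows. It is plain Python rather than GEOS: LayerEnvelopeIndex and intersect_layers are hypothetical names, and features are reduced to rectangles so that the exact intersection is trivial. The index on the second layer is built once and then reused for every feature of the first layer, instead of being rebuilt for each pairwise test.

```python
from bisect import bisect_right
from typing import Dict, Iterator, List, Tuple

Envelope = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


class LayerEnvelopeIndex:
    """Envelope index built once per layer and reused for every probe."""

    def __init__(self, layer: Dict[int, Envelope]) -> None:
        # Sort by min_x so a probe can skip everything that starts to its right.
        self._items = sorted(layer.items(), key=lambda kv: kv[1][0])
        self._min_xs = [env[0] for _, env in self._items]

    def probe(self, env: Envelope) -> Iterator[Tuple[int, Envelope]]:
        """Yield (id, envelope) for indexed features overlapping env."""
        stop = bisect_right(self._min_xs, env[2])
        for fid, other in self._items[:stop]:
            if other[2] >= env[0] and other[1] <= env[3] and other[3] >= env[1]:
                yield fid, other


def intersect_layers(
    layer_a: Dict[int, Envelope], layer_b: Dict[int, Envelope]
) -> List[Tuple[int, int, Envelope]]:
    """Intersect every rectangle in layer_a with every rectangle in layer_b."""
    index = LayerEnvelopeIndex(layer_b)  # created once, not once per pair
    out: List[Tuple[int, int, Envelope]] = []
    for fid_a, a in sorted(layer_a.items()):
        for fid_b, b in index.probe(a):
            clipped = (max(a[0], b[0]), max(a[1], b[1]),
                       min(a[2], b[2]), min(a[3], b[3]))
            out.append((fid_a, fid_b, clipped))
    return out


# Hypothetical layers keyed by feature id.
layer_a = {1: (0, 0, 4, 4), 2: (10, 10, 12, 12)}
layer_b = {7: (3, 3, 6, 6), 8: (20, 20, 21, 21), 9: (11, 9, 13, 11)}
print(intersect_layers(layer_a, layer_b))
```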

Best regards,
--
---+--
I set the clouds in motion - turn up   | Frank Warmerdam, warmer...@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush| Geospatial Programmer for Rent
