Re: [gdal-dev] Open source vector geoprocessing libraries?
On 17-1-2010 22:02, Mateusz Loskot wrote:

Does that mean that I can use ogrinfo on a gzipped archive, like gdalinfo (http://trac.osgeo.org/gdal/wiki/UserDocs/ReadInZip)?

Yes, it does, but... only as long as the OGR driver performs its filesystem operations using the VSI*L API. The problem is that only a few of them use VSI*L. As an example, I just updated the OGR GeoJSON driver to use the VSI*L API: http://trac.osgeo.org/gdal/changeset/18573

As you can see, it is not a complex task, but it may be time-consuming and tedious, and it requires quite a lot of testing. After these changes are applied, it is possible to (un)gzip GeoJSON datasets:

1) Translate a Shapefile to a GeoJSON file compressed using GZip:

$ ogr2ogr -f GeoJSON /vsigzip/./points.geojson.gz points.shp

2) Read a GZip-compressed GeoJSON dataset:

$ ogrinfo /vsigzip/./points.geojson.gz
$ ogrinfo /vsigzip/./points.geojson.gz OGRGeoJSON

It would be a nice archive functionality.

Yes, indeed. The work requires stepping into each OGR driver directory, grepping the .cpp files for VSI, and verifying which VSI API the driver uses. For example, the OGR Shapefile driver would, in theory, need its VSI calls updated in about 50 places:

$ grep VSI *.cpp | wc -l
47

I'm quite sure volunteers would be appreciated.

Thanks Mateusz. I am going to look at GeoJSON as an ASCII data archiving format, as long as a full-fledged GML driver isn't available. I have always liked JSON more than XML.

Jan

___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev
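As a rough illustration of the "GeoJSON plus gzip as an ASCII archive" idea in the message above, here is a minimal Python sketch that uses only the standard library. It does not use GDAL or OGR at all; the file name points.geojson.gz is borrowed from the commands above, while the function names and the sample feature are made up for illustration.

```python
import gzip
import json
import os
import tempfile


def write_geojson_gz(path, feature_collection):
    """Write a GeoJSON object to a gzip-compressed text file."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(feature_collection, f)


def read_geojson_gz(path):
    """Decompress on the fly and parse the GeoJSON back into Python objects."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)


# Hypothetical sample data: one point feature with one attribute.
SAMPLE = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [4.9, 52.37]},
            "properties": {"name": "example"},
        }
    ],
}


def round_trip(feature_collection):
    """Archive to a temporary .geojson.gz file and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "points.geojson.gz")
        write_geojson_gz(path, feature_collection)
        return read_geojson_gz(path)
```

Because the archive is plain text under the compression layer, the content stays readable with generic tools (gunzip plus any JSON parser) even if no GIS library is at hand.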
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jan Hartmann wrote:

On 16-1-2010 16:03, Mateusz Loskot wrote:

Jan Hartmann wrote:

Yes, that is clear, thanks. I see that at the moment only raster files are supported. Would it make sense to do this for vector formats too?

The VSI layer is available to all parts of GDAL and OGR. If you scan the source code of the OGR drivers, you'll find that this feature is already used by, for example, the GTM driver: http://trac.osgeo.org/gdal/browser/trunk/gdal/ogr/ogrsf_frmts/gtm/gtm.cpp?rev=17612#L339

IOW, the VSI layer is available and not dependent on any GDAL or OGR format; it's a separate library of common functions.

Best regards,

Does that mean that I can use ogrinfo on a gzipped archive, like gdalinfo (http://trac.osgeo.org/gdal/wiki/UserDocs/ReadInZip)?

Yes, it does, but... only as long as the OGR driver performs its filesystem operations using the VSI*L API. The problem is that only a few of them use VSI*L. As an example, I just updated the OGR GeoJSON driver to use the VSI*L API: http://trac.osgeo.org/gdal/changeset/18573

As you can see, it is not a complex task, but it may be time-consuming and tedious, and it requires quite a lot of testing. After these changes are applied, it is possible to (un)gzip GeoJSON datasets:

1) Translate a Shapefile to a GeoJSON file compressed using GZip:

$ ogr2ogr -f GeoJSON /vsigzip/./points.geojson.gz points.shp

2) Read a GZip-compressed GeoJSON dataset:

$ ogrinfo /vsigzip/./points.geojson.gz
$ ogrinfo /vsigzip/./points.geojson.gz OGRGeoJSON

This page says: "The fact that this new capability is implemented as virtual file systems imply that it will only work for GDAL drivers supporting the large file API"

Apparently, the Wiki needs to be updated.

It would be a nice archive functionality.

Yes, indeed. The work requires stepping into each OGR driver directory, grepping the .cpp files for VSI, and verifying which VSI API the driver uses. For example, the OGR Shapefile driver would, in theory, need its VSI calls updated in about 50 places:

$ grep VSI *.cpp | wc -l
47

I'm quite sure volunteers would be appreciated.
Best regards,
--
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org
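The audit step described in the message above (grep the .cpp files of a driver for VSI and check which API is used) could be scripted roughly as follows. This is only a sketch: the function name is hypothetical, and the regular expression rests on the assumption that large-file calls are spelled VSIF...L, as in the VSI*L names used in this thread.

```python
import re
from pathlib import Path

# Assumption for this sketch: large-file-API calls look like VSIF<Name>L
# (for example VSIFOpenL), matching the VSI*L naming used in this thread.
LARGE_FILE_CALL = re.compile(r"\bVSIF\w+L\b")


def count_vsi_lines(driver_dir):
    """Count lines mentioning VSI in *.cpp files, like `grep VSI *.cpp | wc -l`.

    Returns (total_lines_with_vsi, lines_already_using_a_VSIF...L_call).
    """
    total = 0
    large_file = 0
    for cpp in sorted(Path(driver_dir).glob("*.cpp")):
        for line in cpp.read_text(errors="replace").splitlines():
            if "VSI" in line:
                total += 1
                if LARGE_FILE_CALL.search(line):
                    large_file += 1
    return total, large_file
```

The difference between the two counts gives a first, crude estimate of how many places in a driver would still need attention; each hit would of course still have to be inspected by hand.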
Re: [gdal-dev] Open source vector geoprocessing libraries?
I think Frank meant: "With Even's work, it is *now* possible for many drivers to transparently access compressed files using the /vsigzip/ mechanism."
Re: [gdal-dev] Open source vector geoprocessing libraries?
On 16-1-2010 3:01, Frank Warmerdam wrote:

Instead, if this is a goal of archiving, I'd suggest archiving the original data (in a possibly arcane format), and a copy in a more accessible format likely to still be usable decades later.

That's what we are doing now. It's OK for practical purposes.

The OGR VRT driver already captures most of this with my recent addition of schema support. It could be extended to actually be a feature store. Alternatively, we could look at improving the GML driver to support capture of everything that OGR can represent. This would have the benefit of being useful in non-OGR applications.

Both very good propositions, especially the GML one. I guess such a GML dump could be read back into OGR without problems? I'm not proposing an RFC (don't know how much work it is), but perhaps you could keep this in the back of your head ...

Jan
Re: [gdal-dev] Open source vector geoprocessing libraries?
Oh, that makes a difference indeed! When I read Frank's long answer, this was the only point I didn't like. How does this mechanism work?

Jan

On 16-1-2010 12:33, Even Rouault wrote:

I think Frank meant: "With Even's work, it is *now* possible for many drivers to transparently access compressed files using the /vsigzip/ mechanism."
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jan,

Some GDAL and OGR drivers use a specific API, the VSI Large File API, to access files. That API mimics the semantics of the standard C library I/O API:

fopen -> VSIFOpenL
fread -> VSIFReadL
fseek -> VSIFSeekL

Usually that API enables access to large files (> 4 GB) on Unix and Windows. But there are a few 'plugins' for specific purposes. For example, if you pass /vsigzip/path/to/your/file.gz to VSIFOpenL, the calls will go through a plug-in that does on-the-fly decompression of a GZip-compressed file (compression support added by Frank in 1.7.0). This is used internally by the GDAL R driver and by the OGR GTM driver.

We can also use the /vsimem/ prefix to read or write in-memory files (used internally in GDAL in some drivers and algorithms, used by MapServer to generate the output image and avoid creating a temporary file on the file system, etc.). Or /vsisubfile/ to access a file embedded within another file (used to decompress JPEG2000 or JPEG streams in some formats like NITF).

Here are a few links for further reading on the subject:
http://trac.osgeo.org/gdal/wiki/UserDocs/ReadInZip
http://gdal.org/cpl__vsi_8h.html

Best regards,

Even

Oh, that makes a difference indeed! When I read Frank's long answer, this was the only point I didn't like. How does this mechanism work?

Jan

On 16-1-2010 12:33, Even Rouault wrote:

I think Frank meant: "With Even's work, it is *now* possible for many drivers to transparently access compressed files using the /vsigzip/ mechanism."
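To make the prefix mechanism described above more concrete, here is a conceptual analogue in Python using only the standard library. It is not how GDAL implements VSI, and it does not call GDAL; the names open_virtual, _MemoryWriter and _MEMORY_FILES are hypothetical. It only shows the general idea of one open call that picks a handler from the path prefix, with /vsigzip/ and /vsimem/ borrowed from the message above.

```python
import gzip
import io

# In-memory "files" for the hypothetical /vsimem/ analogue: path -> bytes.
_MEMORY_FILES = {}


class _MemoryWriter(io.BytesIO):
    """BytesIO that stores its contents in _MEMORY_FILES when closed."""

    def __init__(self, name):
        super().__init__()
        self._name = name

    def close(self):
        if not self.closed:
            _MEMORY_FILES[self._name] = self.getvalue()
        super().close()


def open_virtual(path, mode="rb"):
    """Pick a file handler from the path prefix; fall back to plain open().

    Only binary modes ("rb", "wb") are handled in this sketch.
    """
    if path.startswith("/vsigzip/"):
        # Strip the prefix and (de)compress transparently.
        return gzip.open(path[len("/vsigzip/"):], mode)
    if path.startswith("/vsimem/"):
        if "w" in mode:
            return _MemoryWriter(path)
        return io.BytesIO(_MEMORY_FILES[path])
    return open(path, mode)
```

Code that reads through open_virtual does not need to know whether the bytes come from a gzip stream, from memory, or from a regular file, which is the property that lets a driver gain compressed-file support simply by routing its I/O through one common layer.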
Re: [gdal-dev] Open source vector geoprocessing libraries?
Yes, that is clear, thanks. I see that at the moment only raster files are supported. Would it make sense to do this for vector formats too? I am thinking of a dump from a large PostGIS database I had to upgrade from a 32-bit to a 64-bit server. I didn't like the pg_dump format, as I got into all sorts of trouble with system tables and libraries, and the dump file got really big. If I had thought about it, I would have extracted all schemas with OGR, and I am certainly going to archive and gzip my PostGIS vector files that way.

It would be nice if ogrinfo could give information about vector files within a (g)zip archive, and, of course, it would be ideal if the files within that archive were in some sort of GML that could be translated back to a binary OGR format. That would make a perfect data storage and archive medium, but meanwhile, thanks to this conversation, I have already found a nice way to back up and archive my maps. Thanks.

Jan

On 16-1-2010 13:52, Even Rouault wrote:

Jan,

Some GDAL and OGR drivers use a specific API, the VSI Large File API, to access files. That API mimics the semantics of the standard C library I/O API:

fopen -> VSIFOpenL
fread -> VSIFReadL
fseek -> VSIFSeekL

Usually that API enables access to large files (> 4 GB) on Unix and Windows. But there are a few 'plugins' for specific purposes. For example, if you pass /vsigzip/path/to/your/file.gz to VSIFOpenL, the calls will go through a plug-in that does on-the-fly decompression of a GZip-compressed file (compression support added by Frank in 1.7.0). This is used internally by the GDAL R driver and by the OGR GTM driver.

We can also use the /vsimem/ prefix to read or write in-memory files (used internally in GDAL in some drivers and algorithms, used by MapServer to generate the output image and avoid creating a temporary file on the file system, etc.). Or /vsisubfile/ to access a file embedded within another file (used to decompress JPEG2000 or JPEG streams in some formats like NITF).
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jan Hartmann wrote:

Yes, that is clear, thanks. I see that at the moment only raster files are supported. Would it make sense to do this for vector formats too?

The VSI layer is available to all parts of GDAL and OGR. If you scan the source code of the OGR drivers, you'll find that this feature is already used by, for example, the GTM driver: http://trac.osgeo.org/gdal/browser/trunk/gdal/ogr/ogrsf_frmts/gtm/gtm.cpp?rev=17612#L339

IOW, the VSI layer is available and not dependent on any GDAL or OGR format; it's a separate library of common functions.

Best regards,
--
Mateusz Loskot
http://mateusz.loskot.net
Re: [gdal-dev] Open source vector geoprocessing libraries?
On 16-1-2010 16:03, Mateusz Loskot wrote:

Jan Hartmann wrote:

Yes, that is clear, thanks. I see that at the moment only raster files are supported. Would it make sense to do this for vector formats too?

The VSI layer is available to all parts of GDAL and OGR. If you scan the source code of the OGR drivers, you'll find that this feature is already used by, for example, the GTM driver: http://trac.osgeo.org/gdal/browser/trunk/gdal/ogr/ogrsf_frmts/gtm/gtm.cpp?rev=17612#L339

IOW, the VSI layer is available and not dependent on any GDAL or OGR format; it's a separate library of common functions.

Best regards,

Does that mean that I can use ogrinfo on a gzipped archive, like gdalinfo (http://trac.osgeo.org/gdal/wiki/UserDocs/ReadInZip)? This page says: "The fact that this new capability is implemented as virtual file systems imply that it will only work for GDAL drivers supporting the large file API"

It would be a nice archive functionality. At the moment I store my raster files in a directory tree and retrieve information on some or all of them with small scripts calling gdalinfo and filtering the results. I know now that I can put the files in a gzip archive, while still being able to do the same queries. If the same were possible for vector maps, I could store everything in one large file for backup.

Jan
Re: [gdal-dev] Open source vector geoprocessing libraries?
Thanks Frank, I'll try and let you know. BTW: Perhaps your idea of GML output for OGR would solve the metadata problem for OGR. Everyone can add whatever data they want to the GML file, as long as OGR knows what parts to retrieve to reconstruct the vector map.

Jan

On 16-Jan-10 21:53, Frank Warmerdam wrote:

Some OGR drivers (and some GDAL drivers) will support this capability. It depends on which ones go through the VSI*L API. On the GDAL side we now have a technique to keep track of this in an organized way, while this does not yet exist on the OGR side. But you can just try it to know for a particular driver.

Best regards,
Re: [gdal-dev] Open source vector geoprocessing libraries?
Dear All,

Mateusz Loskot wrote:

% Snip

Yes. Also, most applications I've seen using OGR do define their own data models and translate OGRFeature to features of their own types. Perhaps it would be interesting to know why they don't use OGRFeature as a part of their data model, what's missing...

Thinking about this in terms of my own programs, I think that it is not necessarily that OGRFeature is missing anything but rather that my program data structures (objects) are designed for some specific task. So, for example, I have a program that reads from a GPS and creates OGRFeatures for storage somewhere using OGR; another uses OGR to read data of some format and then builds a Green-Sibson neighbourhood structure from the OGRFeatures in order to measure neighbourhood characteristics and writes or updates attribute tables; and so on. These simple OGC-type features are, for me, ideal input into or output from what are primarily research models. Indeed, were the data structure more complex, I should probably have to unpick it into a simpler structure, like OGRFeature, in order to build appropriate data structures for whatever it was I was doing.

Note: I am not using OGR as a component of a GIS; rather, my programs are either extensions to GIS methodology (e.g. neighbours and cluster detection) or are designed to model the behaviour of some phenomenon. I use OGR for format-independent spatial object I/O because it is easy to map OGR objects to my objects. This means that I use my own object methods for tasks like intersection detection, etc., when needed.

Having said that, do I want more? There are times when geometric topology is, or could be, very useful. Currently, if I really want that, I create an ESRI 'coverage' dataset and use that as input via infolib: it's not necessarily ideal and I have no idea how long that format will persist, but it serves my needs well. I do not think I would expect OGR to offer topology functions, though: I think I would expect to use a separate but related library to build topology from OGRFeatures.

There is of course some non-trivial overhead converting underlying features into OGRFeatures, and as was noted there is some performance impedance between OGR and GEOS due to the need to translate geometries frequently.

There usually is yet another step (cost): the translation from OGRFeature to a feature of the application's data model.

This is very true and is probably inevitable, unless one is reinventing the wheel yet again. Of course, the overhead can be minimised by the use of appropriate structures and avoiding repetition.

Dunno how useful that is to anyone else, but if it is, then great.

Best wishes,

Peter

Peter J Halls, GIS Advisor, University of York
Telephone: 01904 433806 Fax: 01904 433740
Snail mail: Computing Service, University of York, Heslington, York YO10 5DD
This message has the status of a private and personal communication
Re: [gdal-dev] Open source vector geoprocessing libraries?
I was thinking along the same lines, but more in the direction of OGR as an archival standard. I have been working with archives and stored maps all my life and am now busy with lots of digital historical maps. For raster maps the best storage format is GeoTIFF, from which all other formats can be derived if you want something small or quick for high-performance purposes. For vector maps there is no such standard. Most vector maps are exchanged as shapefiles, but this is certainly not optimal. Some sort of file-based OGR format could perhaps fill this gap; conceptually, it certainly is the best model there is.

For archival purposes, however, there should be three additional options:

1) If a map is converted to OGR and converted back, there should be an option of getting it back byte-identical. Law #1 of archiving: always make sure that you can get back the original.

2) For long-term archival purposes, there should be an ASCII format (XML or GeoJSON) associated with OGR. I had to convert binary files from a 60-bit Cyber long ago, so I know what I am talking about. God knows what a computer word will look like in the age of quark computers.

3) There should be some sort of lossless compression scheme associated with OGR.

As to additional functionality, it wouldn't take first place for me, but it would be nice to have it, perhaps not baked into the format but as additional libraries and an API to be linked in:

1) Most of the time I do not need optimization for speed. I find it more important to create a workflow with as few copying steps as possible (always a gigantic source of errors). Only at the last moment, e.g. for setting up a high-performance server, do I create the necessary production files, optimized for speed or memory space.

2) IMHO large vector maps are useless without indices. It would be nice to have an indexing scheme for these OGR maps, perhaps as standalone files, comparable to the OVR raster files used by gdaladdo.

3) Topology would be nice too. What about the way ArcGIS does it nowadays? It uses some sort of shapefiles as base maps, but computes the topology (you can choose different criteria) and puts it in separate files. There is work going on for topology in PostGIS, I believe, but it is a horribly difficult subject, of course.

4) Regular GIS functions are already available via the GEOS library.

5) I am not a big fan of metadata. Most maps are from governmental organisations, and my experience with metadata is that those bureaucratic offices want to put the complete structure of their specific organisation into the definition of the map. It is impossible to get all these definitions into one overarching metadata system. The nice thing about maps is that every map can be combined with every other. The problem with (governmental) organisations is that they create their own small universes, which aren't compatible with other universes, and don't even know about each other's existence. It's like combining Euclidean and non-Euclidean universes, don't try it! There should be documentation associated with a map, of course, but that is different from the basic definition of a map in terms of points, lines, polygons and projections.

Again, I am not asking for this functionality or even commenting on the ongoing work on OGR; I don't know enough about its internals or the way people are working on it to be in any way qualified for that. These are just the thoughts of a long-term (and very happy) GDAL/OGR user from a historical/archival point of view.

Jan

On 15-1-2010 9:10, Peter J Halls wrote:

Dear All,

Mateusz Loskot wrote:

% Snip

Yes. Also, most applications I've seen using OGR do define their own data models and translate OGRFeature to features of their own types. Perhaps it would be interesting to know why they don't use OGRFeature as a part of their data model, what's missing...
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jan,

for a vector archival format, surely GML is the nearest equivalent to GeoTIFF? That should preserve all information and be vendor neutral, and it should be possible to retrieve all information in the future.

Peter

Jan Hartmann wrote:

I was thinking along the same lines, but more in the direction of OGR as an archival standard. I have been working with archives and stored maps all my life and am now busy with lots of digital historical maps. For raster maps the best storage format is GeoTIFF, from which all other formats can be derived if you want something small or quick for high-performance purposes. For vector maps there is no such standard. Most vector maps are exchanged as shapefiles, but this is certainly not optimal. Some sort of file-based OGR format could perhaps fill this gap; conceptually, it certainly is the best model there is.

For archival purposes, however, there should be three additional options:

1) If a map is converted to OGR and converted back, there should be an option of getting it back byte-identical. Law #1 of archiving: always make sure that you can get back the original.

2) For long-term archival purposes, there should be an ASCII format (XML or GeoJSON) associated with OGR. I had to convert binary files from a 60-bit Cyber long ago, so I know what I am talking about. God knows what a computer word will look like in the age of quark computers.

3) There should be some sort of lossless compression scheme associated with OGR.

As to additional functionality, it wouldn't take first place for me, but it would be nice to have it, perhaps not baked into the format but as additional libraries and an API to be linked in:

1) Most of the time I do not need optimization for speed. I find it more important to create a workflow with as few copying steps as possible (always a gigantic source of errors). Only at the last moment, e.g. for setting up a high-performance server, do I create the necessary production files, optimized for speed or memory space.
Re: [gdal-dev] Open source vector geoprocessing libraries?
Personally, I find GML far too complex to be of practical use (for my own purposes, mind). The GML 3.1 specification is 595 pages of small print, almost none of which is used in existing datasets. I just want to store existing datasets in a system-independent way and, preferably, to be able to read them directly into my applications. Perhaps GML is the future for new datasets in large governmental or international organisations, although I have my doubts about that, but for the job of archiving what already exists it is certainly overkill. I prefer a small, conceptually clear standard like OGR, which can already process everything that exists under the sun and can be handled by individuals or small companies. I really dislike standards that require massive bureaucracies to get implemented. We've got enough of those here in Europe. Again, IMHO.

Oh, and OGR already includes a subset of GML 2.0. Is this comparable with WCS/GDAL for raster maps? http://www.gdal.org/ogr/drv_gml.html

Jan

On 15-1-2010 12:25, Peter J Halls wrote:

Jan,

for a vector archival format, surely GML is the nearest equivalent to GeoTIFF? That should preserve all information and be vendor neutral, and it should be possible to retrieve all information in the future.

Peter

Jan Hartmann wrote:

I was thinking along the same lines, but more in the direction of OGR as an archival standard. I have been working with archives and stored maps all my life and am now busy with lots of digital historical maps. For raster maps the best storage format is GeoTIFF, from which all other formats can be derived if you want something small or quick for high-performance purposes. For vector maps there is no such standard. Most vector maps are exchanged as shapefiles, but this is certainly not optimal. Some sort of file-based OGR format could perhaps fill this gap; conceptually, it certainly is the best model there is.
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jan Hartmann wrote: 1) If a map is converted to OGR and converted back, there should be an option of getting it back byte-identical. Law #1 of archiving: always make sure that you can get back the original Jan, I think this is generally impractical with most OGR formats. In most cases ogr2ogr from a file to a new file of the same file format will not be byte identical. Instead, if this is a goal of archiving, I'd suggest archiving the original data (in a possibly arcane format), and a copy in a more accessable format likely to still be usable decades later. 2) For long term archival purposes, there should be an ascii format (XML or GeoJSON) associated with OGR. I have had to converted binary files from a 60-bits Cyber long ago, so I know what I am talking about. God knows what a computer word will look like in the age of quark-computers. It would be relatively easy to have an XML format which is a lossless dump of what OGR knows through it's data model. However, it might be unlikely that any application not built on OGR would ever support it. The OGR VRT driver already captures most of this with my recent addition of schema support. It could be extended to actually be a feature store. Alternatively, we could look at improving the GML driver to support capture of everything that OGR can represent. This would have the benefit of being useful in non-OGR applications. 3) There should be some sort of lossless compression scheme associated with OGR With Even's work, it is not possible for many drivers to to transparently access compressed files using the /vsigzip/ mechanism. 3) Topology would be nice too. What to think about the way ArcGIS does it nowadays? It uses some sort of shapefiles as base maps, but computes the topology (you can choose different criteria) and puts it in separate files. There is work going on for topology in the PostGIS, I believe, but it is a horribly difficult subject, of course. 
Note that the OGR Arc/Info binary coverage driver (and I think a few other drivers) do capture and represent topological relationships with features. However, different drivers do this in slightly different ways since there is no well defined way of doing this in the OGR data model. There are no tools to build topology in GDAL but perhaps this could be a GRASS task. Of course, OGR does nothing to update topologies cleanly. Currently it really just allows access to existing topological datasets.

> The problem with (governmental) organisations is that they create their own small universes, which aren't compatible with other universes, and don't even know about each other's existence.

Very true, and to some extent this can also happen to software projects (open source and proprietary).

Best regards,
--
Frank Warmerdam, warmer...@pobox.com
http://pobox.com/~warmerdam
Geospatial Programmer for Rent
"I set the clouds in motion - turn up light and sound - activate the windows and watch the world go round" - Rush
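The archival points in the message above (keep the original bytes, add lossless compression, expect a format conversion to change the bytes) can be illustrated with a small sketch. It uses only the Python standard library and a made-up GeoJSON document as a stand-in for an original dataset file; it does not call OGR or /vsigzip/, and the names `original`, `archived` and `reserialised` are purely illustrative.

```python
import gzip
import hashlib
import json


def sha256(data: bytes) -> str:
    """Return the hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical stand-in for the bytes of an original dataset file.
original = json.dumps({
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [4.9, 52.37]},
        "properties": {"name": "Amsterdam"},
    }],
}).encode("utf-8")

# Archive the original bytes with lossless compression and record a checksum.
archived = gzip.compress(original)
checksum = sha256(original)

# Restoring the archive must give back exactly the same bytes.
restored = gzip.decompress(archived)
assert restored == original
assert sha256(restored) == checksum

# A conversion, by contrast, typically keeps the content but not the bytes:
# re-serialising the parsed document with different formatting changes the file.
reserialised = json.dumps(json.loads(original), indent=2).encode("utf-8")
same_bytes = reserialised == original
same_content = json.loads(reserialised) == json.loads(original)
print("byte-identical after re-serialising:", same_bytes)
print("same content after re-serialising:", same_content)
```

This is why archiving the untouched original next to a converted, more accessible copy is the safer route when byte-identical recovery matters.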
Re: [gdal-dev] Open source vector geoprocessing libraries?
On 13-1-2010 21:19, Mateusz Loskot wrote:
> IMHO, it's misunderstanding to consider OGR fully featured data model and I/O engine to read, write, process and analyse spatial vector data, especially if performance is a critical factor. IMHO, there are too many compromises in OGR.

OK, that is a very clear statement. I must say that I always thought of OGR as an independent GIS data model, the most encompassing of all, and that it could (in principle anyway) be used in some sort of stand-alone fashion. I can certainly imagine, however, that for real applications it is not as optimal as more specialized formats.

Jan
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jan Hartmann wrote: On 13-1-2010 21:19, Mateusz Loskot wrote: IMHO, it's misunderstanding to consider OGR fully featured data model and I/O engine to read, write, process and analyse spatial vector data, especially if performance is a critical factor. IMHO, there are too many compromises in OGR. OK, that is a very clear statement. Please, notice the IMHO at the beginning of my sentence. Best regards. -- Mateusz Loskot, http://mateusz.loskot.net ___ gdal-dev mailing list gdal-dev@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/gdal-dev
Re: [gdal-dev] Open source vector geoprocessing libraries?
Maybe we can check the source code and improve that. For example: when we get a feature from a layer, do not create a new one every time; just return the same feature, but change the coordinates of the feature.
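The suggestion above (hand back one reused feature object instead of allocating a new one per read) can be sketched in a few lines. This is a toy model in plain Python, not OGR code: `Feature` and `read_features_reusing` are hypothetical names, and the sketch mainly shows the trade-off, namely that callers who keep references must copy.

```python
from dataclasses import dataclass, replace


@dataclass
class Feature:
    """Toy feature: an id and a single point coordinate."""
    fid: int = -1
    x: float = 0.0
    y: float = 0.0


def read_features_reusing(rows):
    """Yield one shared Feature object, overwritten for every row."""
    shared = Feature()
    for fid, (x, y) in enumerate(rows):
        shared.fid, shared.x, shared.y = fid, x, y
        yield shared


rows = [(0.0, 0.0), (1.0, 2.0), (3.0, 4.0)]

# Safe: each feature is fully consumed before the next one is requested.
total_x = sum(f.x for f in read_features_reusing(rows))

# Pitfall: keeping references stores the same object three times,
# so every entry shows the values of the last row read.
kept = list(read_features_reusing(rows))

# Callers that need to keep features have to copy them explicitly.
copies = [replace(f) for f in read_features_reusing(rows)]

print(total_x, [f.fid for f in kept], [f.fid for f in copies])
```

Reuse saves allocations in a streaming loop, but it changes the ownership contract of the reader, which is presumably the kind of compromise a library API would need to document.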
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jason, are you constrained to retaining your data in an ArcGIS compatible format? If so and if you do not have ArcSDE, then what follows may not be much help. Otherwise, I think it likely that you will find using a DBMS as your data repository advantageous for many reasons. Apart from the built in indexing and index based operations, it is *very* much easier to share data between users, retaining a single copy and all user having effective access. Until the File Geodatabase format is published (later this year?) and someone has the effort to build an OGR interface, the DBMS route is probably the best route to compatibility. We happen to be a corporate Oracle site, but PostGres is pretty similar. PostGres is supported by ESRI with ArcSDE, so it is possible to retain ArcGIS compatibility this way. Many years ago, I had a Simula class for performing many of these basic spatial operations, however now my data is all in Oracle: I am able to use the Oracle functions and no longer have to worry about building and rebuilding indexes, etc. - other than USER_SDO_GEOM_METADATA which, unfortunately, OGR only writes to at table creation and does not update. Frankly, life (and maintenance) is much easier now and, certainly with Oracle, I think there have been performance gains. Just my ha'pence-worth. Peter Mateusz Loskot wrote: Jason Roberts wrote: Mateusz, I'm not an expert in this area, but I think that big performance gains can be obtained by using a spatial index. Yes, likely true. For example, consider a situation where you want to clip out a study region from the full resolution GSHHS shoreline database, a polygon layer. The shoreline polygons have very large, complicated geometries. It would be expensive to loop over every polygon, loading its full geometry and calling GEOS. Instead, you would use the spatial index to isolate the polygons that are likely to overlap with the study region, then loop over just those ones. 
GEOS, like JTS, provides support for various spatial indexes. It is possible to index data and optimise it in this manner as you mention. In fact, GEOS uses an index internally in various operations. The problem is that such an index is not persistent, not serialised anywhere, so it exists in memory only. In fact, there are many more problems than this one. BTW, PostGIS is an index serialisation. OGR does not provide any spatial indexing layer common to various vector datasets. For many simple formats it performs brute-force selection. An alternative is to try to divide the tasks: 1. Query features from the data source using the spatial index capability of the data source. 2. Having only subject features selected, apply geometric processing. I did it that way, actually. If OGR takes advantage of spatial indexes internally (e.g. if the data source drivers can tell the core about these indexes, and the core can use them when OGRLayer::SetSpatialFilter is called), then many scenarios could be efficiently implemented by just OGR and GEOS alone. The problem with OGR and GEOS is the cost of translation from OGR geometry to GEOS geometry. It can be a bottleneck. However, if such processing functionality would be considered as built in to OGR, that would make sense, but I still see limitations: Let's brainstorm a bit and assume it implements the operation: OGRLayer OGR::SymDifference(OGRLayer layer1, OGRLayer layer2); Depending on the data source, OGR could exploit its capabilities. If both layers sit in the same PostGIS (or other spatial) database, OGR just delegates the processing to PostGIS where ST_SymDifference is executed, and OGR only grabs the results and generates an OGRLayer. What if layer1 is a Shapefile and layer2 is an Oracle table? Let's assume the Shapefile has a .qix file with a spatial index and Oracle has its own index. What does OGR do? Loads .qix to memory, then grabs layer2 and decides which features to select from layer1?
Loads the whole Shapefile to memory and uses Oracle index to select features from layer2 masked by layer1? How to calculate cost which one to transfer in which direction, etc. Certainly, it depends on number of elements, what algorithm is used, direction of application of algorithm (who is subject, who is object), and many more. It's plenty of combinations and my point is that if performance (it's not only in terms of speed, but any resource) is critical, it would be extremely difficult to provide efficient implementation of such features in OGR with guaranteed or even determinable degree of complexity. Without these guarantees, I see little of use of such solution. Given that, depending on needs, write a specialised application using available tools like OGR and GEOS, that is optimised according to specifics of datasets, type of processing, system requirements, etc. If not, then your suggestion may be as fast as any other. For example, the idea of loading the features in to PostGIS or SpatiaLite will require loading all of the full
RE: [gdal-dev] Open source vector geoprocessing libraries?
Jason, Have you looked at GeoKettle [1]? And recently I found GearScape [2], which seemed very interesting to me. Though neither is based on Python... Duarte Carreira [1] - http://sourceforge.net/projects/geokettle/ [2] - http://www.fergonco.es/gearscape/index.php From: Emilio Mayorga [emiliomayo...@gmail.com] Sent: Tuesday, 12 January 2010 18:25 To: Jason Roberts Cc: gdal-dev Subject: Re: [gdal-dev] Open source vector geoprocessing libraries? Hi Jason, This may not be quite what you have in mind, but check out the PySAL (Open Source Python Library for Spatial Analytical Functions) project: http://geodacenter.asu.edu/pysal I've never used it, and have only looked at a recent presentation (http://conference.scipy.org/static/wiki/rey_pysal.pdf). It's not clear that it includes or even aims to include the traditional spatial operators provided by GEOS. I also have no idea if it uses OGR for its vector data access. But the developers have done some terrific work in spatial analysis tools in the past. BTW, I'd love to see your marine spatial ecology tools moved to an open source, platform neutral code base! Cheers, -Emilio Mayorga Applied Physics Laboratory University of Washington Box 355640 Seattle, WA 98105-6698 USA On Mon, Jan 11, 2010 at 2:32 PM, Jason Roberts jason.robe...@duke.edu wrote: Dear geospatial software experts, By integrating with GEOS, OGR can perform various spatial operations on individual geometries, such as buffer, intersection, union, and so on. Is there a library that efficiently performs these kinds of operations on entire OGRLayers? For example, this library would have functions that would buffer all of the features in a layer, or intersect all of the features in one layer with all of those in another. Basically, I am looking for an open source technology that replicates the geoprocessing tools found in ArcGIS and other GIS packages.
These tools traditionally operate on one or more layers as input and produce one or more layers as output. If such a library does not exist, does the OGR team envision that they might add such capabilities to OGR in the future? From software design and performance points of view, would it be appropriate to extend OGR to include functions for spatial operations on entire layers, or is this best left to other libraries? I can see rudimentary ways to implement such tools (e.g. for intersecting layers: loop over all features in both layers, calling OGRGeometry::Touches on all combinations, or something similar). But I am not a geometry expert and do not know if OGRLayer's cursor-based design is compatible with such capabilities; I do not know about spatial indexing, for example. I develop open source geoprocessing tools that help with spatial ecology problems. At the moment, my tools depend heavily on ArcGIS for these operations with vector layers. I would like to remove this dependency, and, if possible, develop a toolbox that exposes the same ecology tools to several GIS packages. Many GIS packages, such as ArcGIS, QGIS, MapWindow, and OpenJump, support plugin extensions. I am wondering how difficult it would be to develop a package of tools that does not depend on a specific GIS package but exposes them to several packages via the package-specific plugin mechanisms. For this to work, I'd have to find a library that can do the kind of geoprocessing with layers that ArcGIS can do, or write my own. Writing it myself sounds daunting and I am hoping that there are existing projects to draw from. Thank you very much for any comments you can provide. Jason
Re: [gdal-dev] Open source vector geoprocessing libraries?
On 13-1-2010 2:33, Mateusz Loskot wrote: OGR does not provide any spatial indexing layer common to various vector datasets. For many simple formats it performs the brute-force selection. Just curious, would it make sense / be possible to implement indexing in OGR, something like a generalized version of Mapserver's shptree, the quadtree-based spatial index for a shapefiles? http://mapserver.org/utilities/shptree.html Jan ___ gdal-dev mailing list gdal-dev@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/gdal-dev
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jan Hartmann wrote:
> On 13-1-2010 2:33, Mateusz Loskot wrote:
>> OGR does not provide any spatial indexing layer common to various vector datasets. For many simple formats it performs the brute-force selection.
> Just curious, would it make sense / be possible to implement indexing in OGR, something like a generalized version of Mapserver's shptree, the quadtree-based spatial index for shapefiles? http://mapserver.org/utilities/shptree.html

It could make sense to have an in-memory index for in-memory geometries. Perhaps use the GiST library (1) (I don't know whether it can use in-memory indexes) for geometries in an OGRGeometryCollection or OGRMemLayer, if it's available. For other formats it might not make sense because OGR is not responsible for the actual geometries. As has been said, one should use the PostGIS format, which has this functionality built in, for larger and more static datasets.

Just my quick thoughts.

Ari

(1) http://www.sai.msu.su/~megera/postgres/gist/
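To make the idea of an in-memory quadtree over feature envelopes concrete, here is a minimal sketch in plain Python. It is written in the spirit of a quadtree index only; it is not the shptree/.qix algorithm and not GiST, and `QuadNode`, `MAX_DEPTH` and the sample envelopes are illustrative assumptions. Each envelope is stored at the deepest node whose quadrant fully contains it, and a query only descends into quadrants that overlap the search box.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)
MAX_DEPTH = 6  # assumed depth limit for this sketch


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def contains(outer: Box, inner: Box) -> bool:
    """True if inner lies completely within outer."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


@dataclass
class QuadNode:
    bounds: Box
    depth: int = 0
    items: List[Tuple[int, Box]] = field(default_factory=list)
    children: Optional[List["QuadNode"]] = None

    def _quadrants(self) -> List[Box]:
        minx, miny, maxx, maxy = self.bounds
        cx, cy = (minx + maxx) / 2, (miny + maxy) / 2
        return [(minx, miny, cx, cy), (cx, miny, maxx, cy),
                (minx, cy, cx, maxy), (cx, cy, maxx, maxy)]

    def insert(self, fid: int, env: Box) -> None:
        """Store the envelope at the deepest node that fully contains it."""
        if self.depth < MAX_DEPTH:
            if self.children is None:
                self.children = [QuadNode(q, self.depth + 1)
                                 for q in self._quadrants()]
            for child in self.children:
                if contains(child.bounds, env):
                    child.insert(fid, env)
                    return
        self.items.append((fid, env))  # straddles quadrants, or depth limit hit

    def query(self, box: Box) -> List[int]:
        """Return ids of all stored envelopes that overlap the search box."""
        hits = [fid for fid, env in self.items if overlaps(env, box)]
        for child in self.children or []:
            if overlaps(child.bounds, box):
                hits.extend(child.query(box))
        return hits


# Made-up feature envelopes, keyed by feature id.
envelopes = {1: (10, 10, 20, 20), 2: (60, 60, 90, 90),
             3: (40, 40, 60, 60), 4: (12, 70, 18, 80)}
index = QuadNode((0.0, 0.0, 100.0, 100.0))
for fid, env in envelopes.items():
    index.insert(fid, env)

candidates = sorted(index.query((0, 0, 25, 25)))
print(candidates)
```

The index returns candidates by envelope only; an exact geometric test would still be needed afterwards. The index also lives purely in memory here, which is the limitation discussed in this thread.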
RE: [gdal-dev] Open source vector geoprocessing libraries?
Mateusz,

Thank you very much for your insight. I have a few more questions I'm hoping you could answer.

> Alternative is to try to divide the tasks:
> 1. Query features from data source using spatial index capability of data source.
> 2. Having only subject features selected, apply geometric processing.

That sounds like a reasonable approach. Considering just the simpler scenarios, such as the one I mentioned, is it possible to implement it efficiently with OGR compiled with GEOS?

I believe OGR can pass through SQL directly to the data source driver, allowing the caller to submit SQL containing spatial operators. In principle, one could submit a spatial query to PostGIS or SpatiaLite and efficiently get back the features (including geometry) that could possibly intersect a bounding box. Then one could use the GEOS functions on OGRGeometry to do the actual intersecting. Is that what you were suggesting?

Of course, it may be that PostGIS or SpatiaLite can handle both steps 1 and 2 in a single query. If so, would it be best to do it that way?

It appears that the OGR shapefile driver supports a spatial indexing scheme (.qix file) that is respected by OGRLayer::SetSpatialFilter. The documentation says: "Currently this test is may be inaccurately implemented, but it is guaranteed that all features who's envelope (as returned by OGRGeometry::getEnvelope()) overlaps the envelope of the spatial filter will be returned." Therefore, it appears that the shapefile driver can implement step 1 but not step 2. Is that correct?

> The problem with OGR and GEOS is cost of translation from OGR geometry to GEOS geometry. It can be a bottleneck.

Is it correct that this cost would only be incurred when you call OGR functions implemented by GEOS, such as OGRGeometry::Intersects, OGRGeometry::Disjoint, etc?

> It's plenty of combinations and my point is that if performance (it's not only in terms of speed, but any resource) is critical, it would be extremely difficult to provide efficient implementation of such features in OGR with guaranteed or even determinable degree of complexity. Without these guarantees, I see little of use of such solution.

Yes, I see what you mean. But I suggest to the open source community that there is still value in implementing such features, either as part of OGR or another library, even if optimal performance cannot be guaranteed in all scenarios. The reason is that ArcGIS provides such generic tools (e.g. intersect/union/symdiff layers, regardless of underlying storage). These geoprocessing tools are considered the most basic capabilities of ArcGIS, available in the cheapest versions of the software. IMHO, if the open source community wants to win over a large number of ArcGIS users to open GIS systems, the community needs to provide parity with these basic tools.

Thanks again,
Jason
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jan Hartmann wrote:
> On 13-1-2010 15:49, Ari Jolma wrote:
>> Jan Hartmann wrote:
>>> Just curious, would it make sense / be possible to implement indexing in OGR, something like a generalized version of Mapserver's shptree, the quadtree-based spatial index for shapefiles? http://mapserver.org/utilities/shptree.html
>> It could make sense to have an in-memory index for in-memory geometries. Perhaps use the GiST library (1) (I don't know whether it can use in-memory indexes) for geometries in an OGRGeometryCollection or OGRMemLayer, if it's available. For other formats it might not make sense because OGR is not responsible for the actual geometries. As has been said, one should use the PostGIS format, which has this functionality built in, for larger and more static datasets.
> Is that so? Reading the OGR API tutorial (http://www.gdal.org/ogr/ogr_apitut.html), I see that all geometries, from whatever input source, are represented internally as a generic OGRGeometry pointer, which is a virtual base class for all real geometry classes (http://www.gdal.org/ogr/classOGRGeometry.html). Most of the GEOS functionality can be implemented on OGRGeometries, so in principle the same could be done with indexing libraries (GiST, b-tree, quadtree, etc). Such indices should be written out to disk to be of any use at all, of course, like shptree does.

What I meant is that with formats other than the in-memory format, the features are stored on disk (possibly even on remote servers) and only available for indexing when retrieved. When they are retrieved, they are of course OGR objects and accessible through the generic OGR API. Maybe it's possible, but it would probably mean that the library would need to retrieve and go through all the features, and prepare and store the index in some local(?) file. Thus I think that for those formats, it's up to the format itself to provide the indexing or not.

Ari
RE: [gdal-dev] Open source vector geoprocessing libraries?
Hi Duarte,

Thanks for the suggestions. I took a look at GeoKettle. Here are some relevant excerpts from a document:

GeoKettle is a ... powerful, metadata-driven spatial ETL tool dedicated to the integration of different spatial data sources for building/updating geospatial data warehouses.

At present, Oracle spatial, PostgreSQL/PostGIS and MySQL DBMS and the ESRI shapefiles are natively supported in read and write modes.

Spatial Reference Systems management and coordinates transformations have been fully implemented.

It is possible to access Geometry objects in JavaScript and define custom transformation steps ("Modified JavaScript Value" step).

Topological predicates (Intersects, crosses, etc.) have all been implemented.

It looks interesting, but oriented to server applications. We are building a set of desktop GIS analysis tools. It would probably not be practical to try to embed GeoKettle in our application.

GearScape also looks interesting, with SQL-oriented geoprocessing, but it is more of an extensible GIS program than a geospatial library. Again, probably not practical to embed it in our app.

Best regards,
Jason
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jason, Jason Roberts wrote: Peter, are you constrained to retaining your data in an ArcGIS compatible format? We are attempting to build tools that can work with data stored in a variety of formats. Our current user community uses mostly shapefiles, ArcGIS personal geodatabases, and ArcGIS file geodatabases. Many of them are ecologists who do not have the interest or skills to deploy a real DBMS system. Thus we are hoping to provide tools that can work without one. This is one reason I was exploring how embeddable PostGIS and SpatiaLite might be in the other fork of this thread. I wonder how many users are aware that ESRI have announced the file geodatabase as replacing the (Access) personal geodatabase? They have not, as yet, announced a cut off for this format, but its many limitations as a result of Access capabilities may make this sooner rather than later. Until the File Geodatabase format is published (later this year?) and someone has the effort to build an OGR interface, the DBMS route is probably the best route to compatibility. It would be really great for that to happen, but I'm not holding my breath. If it does get published, I would seriously contemplate building an OGR driver. ESRI announced publication would be alongside the release of ArcGIS 9.4 at the EMEA User Conference in November 2008 (London). They said that they see the file geodatabase replacing both the personal geodatabase and shapefiles. I believe 9.4 to currently be in beta test. I have contemplated building an ArcObjects- or arcgisscripting-based driver. This would at least allow people who have ArcGIS to use OGR to access any ArcGIS layer, including those created by ArcGIS's tools for joining arbitrary layers, etc. That would handle file geodatabases, as well as ALL formats accessible from ArcGIS. If such a driver existed, then we could use OGR as the base interface inside our application. 
But creating such a driver would be a lot of work and have funky dependencies because it either needs to use Windows COM (for ArcObjects) or Python (for arcgisscripting) to call the ArcGIS APIs. I am certainly capable of implementing it but because most of our code is in Python, it is probably easier for me to wrap OGR and arcgisscripting behind a common abstraction, and then have our tools work against that abstraction rather than OGR directly. GDAL, including OGR, is actually embedded in ArcGIS: however I do not know quite what ESRI use it for. At any rate, I'm sure it is nice being able to do all your work in a spatially-enabled DBMS... Also an attraction of PostGres, of course. Best wishes, Peter Peter J Halls, GIS Advisor, University of York Telephone: 01904 433806 Fax: 01904 433740 Snail mail: Computing Service, University of York, Heslington, York YO10 5DD This message has the status of a private and personal communication ___ gdal-dev mailing list gdal-dev@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/gdal-dev
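The "common abstraction" idea mentioned above (tools coded against one interface, with OGR and arcgisscripting hidden behind it) might look roughly like the following sketch. Everything here is hypothetical: `LayerSource`, `InMemorySource` and `count_features` are invented names, and the only backend shown is an in-memory stand-in so that the example runs without OGR or ArcGIS installed. Real backends would wrap the respective libraries behind the same two methods.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, List, Tuple

Record = Tuple[int, str]  # (feature id, geometry as WKT text)


class LayerSource(ABC):
    """Hypothetical common interface the analysis tools would code against."""

    @abstractmethod
    def layer_names(self) -> List[str]:
        """Return the names of all layers in this source."""

    @abstractmethod
    def features(self, layer: str) -> Iterator[Record]:
        """Iterate over (fid, wkt) records of one layer."""


class InMemorySource(LayerSource):
    """Stand-in backend; an OGR- or arcgisscripting-based class would
    implement the same methods on top of the real library."""

    def __init__(self, layers: Dict[str, List[str]]):
        self._layers = layers

    def layer_names(self) -> List[str]:
        return sorted(self._layers)

    def features(self, layer: str) -> Iterator[Record]:
        for fid, wkt in enumerate(self._layers[layer]):
            yield fid, wkt


def count_features(source: LayerSource) -> Dict[str, int]:
    """Example tool: depends only on LayerSource, not on a specific backend."""
    return {name: sum(1 for _ in source.features(name))
            for name in source.layer_names()}


source = InMemorySource({
    "sites": ["POINT (1 2)", "POINT (3 4)"],
    "coast": ["LINESTRING (0 0, 1 1)"],
})
counts = count_features(source)
print(counts)
```

The design choice is the usual one for such wrappers: the interface is kept to the smallest set of operations the tools need, so that each backend stays cheap to write.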
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jan Hartmann wrote:
> Is that so? Reading the OGR API tutorial (http://www.gdal.org/ogr/ogr_apitut.html), I see that all geometries, from whatever input source, are represented internally as a generic OGRGeometry pointer, which is a virtual base class for all real geometry classes (http://www.gdal.org/ogr/classOGRGeometry.html). Most of the GEOS functionality can be implemented on OGRGeometries, so in principle the same could be done with indexing libraries (GiST, b-tree, quadtree, etc). Such indices should be written out to disk to be of any use at all, of course, like shptree does.

Jan,

I have had trouble keeping up with this spirited discussion, but I wanted to note that it is not intended that alternate implementations of geometries would be derived from OGRGeometry. There are many places, for instance, that assume an OGRGeometry can be cast to OGRLineString if its type is wkbLineString.

Best regards,
--
Frank Warmerdam, warmer...@pobox.com
http://pobox.com/~warmerdam
Geospatial Programmer for Rent
"I set the clouds in motion - turn up light and sound - activate the windows and watch the world go round" - Rush
RE: [gdal-dev] Open source vector geoprocessing libraries?
Date: Wed, 13 Jan 2010 10:27:43 -0500 From: Jason Roberts jason.robe...@duke.edu Subject: RE: [gdal-dev] Open source vector geoprocessing libraries? To: 'Peter J Halls' p.ha...@york.ac.uk Cc: 'gdal-dev' gdal-dev@lists.osgeo.org Message-ID: 008001ca9464$f4059f10$dc10dd...@roberts@duke.edu Content-Type: text/plain; charset=US-ASCII Peter, are you constrained to retaining your data in an ArcGIS compatible format? We are attempting to build tools that can work with data stored in a variety of formats. Our current user community uses mostly shapefiles, ArcGIS personal geodatabases, and ArcGIS file geodatabases. Many of them are ecologists who do not have the interest or skills to deploy a real DBMS system. Thus we are hoping to provide tools that can work without one. This is one reason I was exploring how embeddable PostGIS and SpatiaLite might be in the other fork of this thread. Until the File Geodatabase format is published (later this year?) and someone has the effort to build an OGR interface, the DBMS route is probably the best route to compatibility. It would be really great for that to happen, but I'm not holding my breath. If it does get published, I would seriously contemplate building an OGR driver. I have contemplated building an ArcObjects- or arcgisscripting-based driver. This would at least allow people who have ArcGIS to use OGR to access any ArcGIS layer, including those created by ArcGIS's tools for joining arbitrary layers, etc. That would handle file geodatabases, as well as ALL formats accessible from ArcGIS. If such a driver existed, then we could use OGR as the base interface inside our application. But creating such a driver would be a lot of work and have funky dependencies because it either needs to use Windows COM (for ArcObjects) or Python (for arcgisscripting) to call the ArcGIS APIs. 
I am certainly capable of implementing it but because most of our code is in Python, it is probably easier for me to wrap OGR and arcgisscripting behind a common abstraction, and then have our tools work against that abstraction rather than OGR directly.

I find it very amusing you mention this right now. Why? I asked Frank if there was an ArcObjects-based OGR driver this very past Thursday and he said "not that I know of". What I wanted was, among other things, to get data out of FileGDB to PostGIS in one shot and add some custom behavior for a client of mine. So I spent the past three days looking at OGR drivers and wrote an ArcObjects-based one. I got it working yesterday.

- Right now I only instantiate 3 factories (Enterprise GDB aka ArcSDE, AccessDB and FileGDB). This means it reads FileGDB just fine. If you want more factories, the driver only has to be modified with one line to add any other factories and everything else would just work.
- I only implemented the parts that I needed, so it is read-only (should be straightforward to expand if need be).
- Although it can read other GeoDatabase abstractions (Topology, Geometric Networks, Annotations, Cadastral Fabrics, etc), currently I am explicitly filtering for FeatureClasses and FeatureDatasets.
- It is an ATL / COM / C++ based one, so it will only compile on Windows. It can be modified to use the cross-platform ArcEngine SDK since all the COM objects that I use are called the same and behave the same way... I just did not have an ArcEngine SDK installer, so I could not test this.

Anyway, if you are interested in the source code, let me know. Perhaps we can add it as an OGR driver contribution (what is the process for that anyway?). I may not respond fast enough to e-mail, since the next 4 weeks are pretty crazy for me.

- Ragi Burhum
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jan Hartmann wrote: On 13-1-2010 2:33, Mateusz Loskot wrote: OGR does not provide any spatial indexing layer common to various vector datasets. For many simple formats it performs the brute-force selection. Just curious, would it make sense / be possible to implement indexing in OGR, something like a generalized version of MapServer's shptree, the quadtree-based spatial index for shapefiles? This implementation of the index comes from Shapelib, made by Frank. The very same bits of Shapelib are used in MapServer and OGR, namely .qix spatial index file support. So, it's already there, but for Shapefiles only. Back to the question, I'm personally sceptical. Recalling the example of processing two layers, one from a DBMS and one from a file-based data source, how would it be supposed to work? ...a common .qix file generated for the DBMS data source? In my opinion, this kind of functionality is out of scope of OGR. I see OGR as a data provider. OGR is basically a translation library that reads from one data source and writes to another data source, providing a reasonably limited set of features to process data during translation - a common denominator for popular vector spatial data formats. IMHO, it's a misunderstanding to consider OGR a fully featured data model and I/O engine to read, write, process and analyse spatial vector data, especially if performance is a critical factor. IMHO, there are too many compromises in OGR. Best regards, -- Mateusz Loskot, http://mateusz.loskot.net Charter Member of OSGeo, http://osgeo.org
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jason Roberts wrote: Mateusz, Thank you very much for your insight. I have a few more questions I'm hoping you could answer. Alternative is to try to divide the tasks: 1. Query features from data source using spatial index capability of data source. 2. Having only subject features selected, apply geometric processing. That sounds like a reasonable approach. Considering just the simpler scenarios, such as the one I mentioned, is it possible to implement it efficiently with OGR compiled with GEOS? Should be, but the OGRGeometry to geos::Geometry translation may be an overhead. I believe OGR can pass SQL through directly to the data source driver, allowing the caller to submit SQL containing spatial operators. In principle, one could submit a spatial query to PostGIS or SpatiaLite and efficiently get back the features (including geometry) that could possibly intersect a bounding box. Then one could use the GEOS functions on OGRGeometry to do the actual intersecting. Is that what you were suggesting? Yes, that's the concept. Of course, it may be that PostGIS or SpatiaLite can handle both steps 1 and 2 in a single query. If so, would it be best to do it that way? It's usually a good idea to let the DBMS engine do as much as possible, so it looks like a good idea to me. It appears that the OGR shapefile driver supports a spatial indexing scheme (.qix file) that is respected by OGRLayer::SetSpatialFilter. The documentation says: "Currently this test is may be inaccurately implemented, but it is guaranteed that all features who's envelope (as returned by OGRGeometry::getEnvelope()) overlaps the envelope of the spatial filter will be returned." Therefore, it appears that the shapefile driver can implement step 1 but not step 2. Is that correct? Yes. The problem with OGR and GEOS is the cost of translation from OGR geometry to GEOS geometry. It can be a bottleneck.
Is it correct that this cost would only be incurred when you call OGR functions implemented by GEOS, such as OGRGeometry::Intersects, OGRGeometry::Disjoint, etc? Yes. Namely, here potential cost takes place: http://trac.osgeo.org/gdal/browser/trunk/gdal/ogr/ogrgeometry.cpp#L333 It's plenty of combinations and my point is that if performance (it's not only in terms of speed, but any resource) is critical, it would be extremely difficult to provide efficient implementation of such features in OGR with guaranteed or even determinable degree of complexity. Without these guarantees, I see little of use of such solution. Yes, I see what you mean. But I suggest to the open source community that there is still value in implementing such features, either as part of OGR or another library, even if optimal performance cannot be guaranteed in all scenarios. Perhaps you'll find these inspiring: http://trac.osgeo.org/qgis/browser/trunk/qgis/src/analysis/vector Look at the Java camp too. Best regards, -- Mateusz Loskot, http://mateusz.loskot.net Charter Member of OSGeo, http://osgeo.org ___ gdal-dev mailing list gdal-dev@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/gdal-dev
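The two-step division discussed above (a cheap envelope filter first, exact geometric processing only on the survivors) can be sketched in plain Python. This is only an illustration under stated assumptions: it does not call OGR or GEOS, the features are bare points, the filter geometry is a single ring, and the helper names (envelope_of, in_envelope, point_in_ring, select_points) are made up for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Envelope = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def envelope_of(ring: Sequence[Point]) -> Envelope:
    """Bounding rectangle of a ring of vertices."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return (min(xs), min(ys), max(xs), max(ys))


def in_envelope(p: Point, env: Envelope) -> bool:
    """Cheap test: is the point inside (or on) the rectangle?"""
    return env[0] <= p[0] <= env[2] and env[1] <= p[1] <= env[3]


def point_in_ring(p: Point, ring: Sequence[Point]) -> bool:
    """Exact test by ray casting (even-odd rule).

    Points lying exactly on the boundary are not treated specially.
    """
    x, y = p
    inside = False
    closed = list(ring[1:]) + [ring[0]]
    for (x1, y1), (x2, y2) in zip(ring, closed):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def select_points(points: Sequence[Point], ring: Sequence[Point]) -> List[Point]:
    """Step 1: envelope filter. Step 2: exact test on the candidates only."""
    env = envelope_of(ring)
    candidates = [p for p in points if in_envelope(p, env)]
    return [p for p in candidates if point_in_ring(p, ring)]
```

With a triangular ring, a point can pass step 1 and still be rejected in step 2, which is the same situation as a driver that returns every feature whose envelope overlaps the filter and leaves the exact test to the caller.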
RE: [gdal-dev] Open source vector geoprocessing libraries?
Frank, Thanks for your thoughts on this. I'd like to see something along this line happen. To do it efficiently it would be necessary to dig into GEOS past the C interface so that a spatial index on a collection of features can be maintained over time rather than created and discarded for each pairwise test of two geometries. I am somewhat hesitant to have this sort of processing go into GDAL/OGR itself, especially as an extensive set of methods on OGRLayer. I think it could be done as a layered processing library without any noticeable loss of performance. Do you know of anyone working on such a library? It sounds like such a library would sit on top of GDAL/OGR, to leverage the abstraction of data sources and layers. Although I am not yet familiar with efficient algorithms for operations with layers, I suspect that library would need spatial index support from OGR. The underlying data sources often maintain spatial indexes. OGR would need to expose these via a new abstraction (new methods on OGRLayer, for example). Or, if the underlying source did not support spatial indexes, perhaps OGR could loop through the layer, build an index with GEOS, and expose that via the same abstraction. Is that similar to what you were thinking? It sounds like there is not presently an open source project that provides this geoprocessing with layers functionality. If not, I will still have to use ArcGIS for my own project, but I would like to hide ArcGIS behind an abstraction that is likely to be architecturally compatible with a future library, so that maybe I could swap it in at some future point. This is why I am probing for more details on what you envision, even if those ideas are still somewhat distant. Thanks, Jason ___ gdal-dev mailing list gdal-dev@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/gdal-dev
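The idea of a common abstraction over "whatever index the source has, or a fallback built once" might look roughly like the sketch below. Everything here is hypothetical: CandidateSource, ScanFallback and overlay_pairs are names invented for illustration, not OGR or GEOS API, and the fallback is a plain cached scan of envelopes rather than a real index. The point is only that the layered library talks to one interface and the envelopes are read once, not once per pairwise test.

```python
from typing import Iterable, Iterator, List, Protocol, Tuple

Envelope = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def envelopes_overlap(a: Envelope, b: Envelope) -> bool:
    """True when the two rectangles share at least one point."""
    return a[0] <= b[2] and a[2] >= b[0] and a[1] <= b[3] and a[3] >= b[1]


class CandidateSource(Protocol):
    """Hypothetical interface a layer could expose to a processing library."""

    def candidates(self, env: Envelope) -> Iterator[int]:
        """Yield ids of features whose envelope overlaps env."""
        ...


class ScanFallback:
    """Fallback for sources without a native index.

    The envelopes are read once at construction and kept for the lifetime
    of the object, so repeated queries do not re-read the layer.
    """

    def __init__(self, envelopes: Iterable[Envelope]) -> None:
        self._envelopes: List[Envelope] = list(envelopes)

    def candidates(self, env: Envelope) -> Iterator[int]:
        for fid, other in enumerate(self._envelopes):
            if envelopes_overlap(other, env):
                yield fid


def overlay_pairs(left: List[Envelope],
                  right: CandidateSource) -> Iterator[Tuple[int, int]]:
    """Pair each left feature with right-hand candidates (envelope level only)."""
    for left_fid, env in enumerate(left):
        for right_fid in right.candidates(env):
            yield (left_fid, right_fid)
```

A source with a real index (a .qix file, a DBMS index) would provide its own candidates() and overlay_pairs would not change; the exact geometric test would still have to follow for each candidate pair.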
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jason, If you're working with vector data, why not throw the data into PostgreSQL/PostGIS, http://postgis.refractions.net, and use the spatial operators there to select/buffer/intersect the vector geometries as you describe? See http://postgis.refractions.net/documentation/manual-1.4/ch07.html for geoprocessing operations. Your application can pass SQL commands to the database. You can use OGR to load data into, and export your finished product from, PostgreSQL/PostGIS. You might be able to do similar things in SpatiaLite, http://www.gaia-gis.it/spatialite/spatialite-tutorial-2.3.1.html#t4. Doug Doug Newcomb USFWS Raleigh, NC 919-856-4520 ext. 14 doug_newc...@fws.gov - The opinions I express are my own and are not representative of the official policy of the U.S.Fish and Wildlife Service or Dept. of the Interior. Life is too short for undocumented, proprietary data formats. Jason Roberts jason.robe...@duke.edu Sent by: gdal-dev-bounces@lists.osgeo.org 01/11/2010 05:32 PM To 'gdal-dev' gdal-dev@lists.osgeo.org cc Subject [gdal-dev] Open source vector geoprocessing libraries? Dear geospatial software experts, By integrating with GEOS, OGR can perform various spatial operations on individual geometries, such as buffer, intersection, union, and so on. Is there a library that efficiently performs these kinds of operations on entire OGRLayers? For example, this library would have functions that would buffer all of the features in a layer, or intersect all of the features in one layer with all of those in another. Basically, I am looking for an open source technology that replicates the geoprocessing tools found in ArcGIS and other GIS packages. These tools traditionally operate on one or more layers as input and produce one or more layers as output. If such a library does not exist, does the OGR team envision that they might add such capabilities to OGR in the future?
From software design and performance points of view, would it be appropriate to extend OGR to include functions for spatial operations on entire layers, or is this best left to other libraries? I can see rudimentary ways to implement such tools (e.g. for intersecting layers: loop over all features in both layers, calling OGRGeometry::Touches on all combinations, or something similar). But I am not a geometry expert and do not know if OGRLayer's cursor-based design is compatible with such capabilities; I do not know about spatial indexing, for example. I develop open source geoprocessing tools that help with spatial ecology problems. At the moment, my tools depend heavily on ArcGIS for these operations with vector layers. I would like to remove this dependency, and, if possible, develop a toolbox that exposes the same ecology tools to several GIS packages. Many GIS packages, such as ArcGIS, QGIS, MapWindow, and OpenJump, support plugin extensions. I am wondering how difficult it would be to develop a package of tools that does not depend on a specific GIS package but exposes them to several packages via the package-specific plugin mechanisms. For this to work, I'd have to find a library that can do the kind of geoprocessing with layers that ArcGIS can do, or write my own. Writing it myself sounds daunting and I am hoping that there are existing projects to draw from. Thank you very much for any comments you can provide. Jason
Re: [gdal-dev] Open source vector geoprocessing libraries?
Hi Jason, This may not be quite what you have in mind, but check out the PySAL (Open Source Python Library for Spatial Analytical Functions) project: http://geodacenter.asu.edu/pysal I've never used it, and have only looked at a recent presentation (http://conference.scipy.org/static/wiki/rey_pysal.pdf). It's not clear that it includes or even aims to include the traditional spatial operators provided by GEOS. I also have no idea if it uses OGR for its vector data access. But the developers have done some terrific work in spatial analysis tools in the past. BTW, I'd love to see your marine spatial ecology tools moved to an open source, platform neutral code base! Cheers, -Emilio Mayorga Applied Physics Laboratory University of Washington Box 355640 Seattle, WA 98105-6698 USA
RE: [gdal-dev] Open source vector geoprocessing libraries?
Doug, Thanks for these suggestions. It looks like PostGIS and SpatiaLite both provide a SQL-based approach for accomplishing what I need. Both look promising and I will dig into them in more detail. It might be less than optimal to load data into one of these, execute the desired spatial query, and export data back out. But there is probably no suitable alternative that provides a complete set of spatial operators that is any faster. I'm sure a big part of executing efficient spatial queries is having a spatial index. Even OGR does not appear to expose spatial indexes that may be maintained by the underlying data sources. Thus any geoprocessing library that sits on OGR or a similar API must already retrieve all records, build a spatial index, then execute the spatial query. This is basically the same thing as loading data into PostGIS or SpatiaLite and then executing the query. I have tons of questions but will resist asking all but one: do you know how well these systems can be embedded in other software? In my collection of tools, I want the infrastructure that supports them to be hidden and config-less. Although I have not used SQLite, I know it is designed explicitly for easy embedding, so it seems promising. What about Postgres? In my past experience, it appeared to be much more of a full-blown enterprise database system, designed to run as a service or daemon, listen for connections, etc. If it can be easily embedded, I might prefer to use it, as PostGIS appears to provide a richer set of spatial operators. Jason
RE: [gdal-dev] Open source vector geoprocessing libraries?
Emilio, Thanks for the suggestion of pysal. It does look interesting, but as you speculated, it seems to not aim to include the traditional spatial operators. Instead it looks like a collection of various interesting algorithms, implemented in Python on top of SciPy, NumPy, spatialindex, and Rtree. This might be useful for specific problems, but I need a more comprehensive library of the traditional stuff. BTW, I'd love to see your marine spatial ecology tools moved to an open source, platform neutral code base! Yes, we would love that too. At the moment, I am evaluating whether we should develop our next batch of tools under our existing framework which depends heavily on ArcGIS, or take a time-out to rework the framework to eliminate that dependency. I have already done pieces of it, here and there, but this vector geoprocessing functionality is a key blocker that remains unresolved. Best, Jason
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jason Roberts wrote: I have tons of questions but will resist asking all but one: do you know how well these systems can be embedded in other software? In my collection of tools, I want the infrastructure that supports them to be hidden and config-less. Although I have not used SQLite, I know it is designed explicitly for easy embedding, so it seems promising. What about Postgres? In my past experience, it appeared to be much more of a full-blown enterprise database system, designed to run as a service or daemon, listen for connections, etc. If it can be easily embedded, I might prefer to use it, as PostGIS appears to provide a richer set of spatial operators. I have used SQLite in a bunch of tools. It is easy and straightforward. You can do any config stuff in your code so it is config-less from the user's point of view. I also use PostGIS for a lot of stuff, but you are right that you need to set up a server and have your tools connect to it. Once the server is set up and running, the client applications just need a valid login to the server to do whatever they want. For Perl I tend to use PostGIS and for C code I tend to use SQLite. I have looked at the SpatiaLite extensions but have not really used them yet. If you are building a system where you don't want to deal with a database server, I would have no qualms using SQLite and SpatiaLite for building an embedded solution. -Steve http://imaptools.com/
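To make the "config-less from the user's point of view" remark about SQLite concrete, here is a minimal sketch using only Python's standard sqlite3 module. It is an assumption-laden illustration: it does not load SpatiaLite, the features table and its envelope columns are invented for this example, and the query is an envelope-overlap filter written in plain SQL, not a true spatial operator.

```python
import sqlite3
from typing import List, Tuple

Envelope = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the embedded store; no server, no config file.

    The 'features' table is a hypothetical schema that keeps one
    bounding rectangle per feature.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS features ("
        " fid INTEGER PRIMARY KEY,"
        " name TEXT,"
        " minx REAL, miny REAL, maxx REAL, maxy REAL)"
    )
    return conn


def add_feature(conn: sqlite3.Connection, name: str, env: Envelope) -> int:
    """Insert one feature envelope and return its generated id."""
    cur = conn.execute(
        "INSERT INTO features (name, minx, miny, maxx, maxy)"
        " VALUES (?, ?, ?, ?, ?)",
        (name, env[0], env[1], env[2], env[3]),
    )
    return cur.lastrowid


def names_overlapping(conn: sqlite3.Connection, env: Envelope) -> List[str]:
    """Names of features whose envelope overlaps env, in insertion order."""
    minx, miny, maxx, maxy = env
    rows = conn.execute(
        "SELECT name FROM features"
        " WHERE minx <= ? AND maxx >= ? AND miny <= ? AND maxy >= ?"
        " ORDER BY fid",
        (maxx, minx, maxy, miny),
    )
    return [row[0] for row in rows]
```

The calling tool decides where the database file lives (or keeps it in memory), so nothing has to be installed or started by the user beforehand. A server-based setup such as PostGIS would need connection details instead.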
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jason Roberts wrote: By integrating with GEOS, OGR can perform various spatial operations on individual geometries, such as buffer, intersection, union, and so on. Is there a library that efficiently performs these kinds of operations on entire OGRLayers? For example, this library would have functions that would buffer all of the features in a layer, or intersect all of the features in one layer with all of those in another. Basically, I am looking for an open source technology that replicates the geoprocessing tools found in ArcGIS and other GIS packages. These tools traditionally operate on one or more layers as input and produce one or more layers as output. What prevents you from calling GEOS and process all features in a layer? I've used GEOS processing layers of large number of features in similar manner as PostGIS would, but programmatically (C++). For example, generating buffer from tens of polygons or performing boolean operations like cookie-cutting layer of 1000-2000 polygons with one polygon. IMHO, I can't see much point making a new library. Best regards, -- Mateusz Loskot, http://mateusz.loskot.net Charter Member of OSGeo, http://osgeo.org ___ gdal-dev mailing list gdal-dev@lists.osgeo.org http://lists.osgeo.org/mailman/listinfo/gdal-dev
RE: [gdal-dev] Open source vector geoprocessing libraries?
Mateusz, I'm not an expert in this area, but I think that big performance gains can be obtained by using a spatial index. For example, consider a situation where you want to clip out a study region from the full resolution GSHHS shoreline database, a polygon layer. The shoreline polygons have very large, complicated geometries. It would be expensive to loop over every polygon, loading its full geometry and calling GEOS. Instead, you would use the spatial index to isolate the polygons that are likely to overlap with the study region, then loop over just those ones. I do not know much about spatial indexes yet, but I suspect they do something like store the rectangular envelope of each feature, which can then be quickly compared to other envelopes to determine whether it is possible for features to overlap. If OGR takes advantage of spatial indexes internally (e.g. if the data source drivers can tell the core about these indexes, and the core can use them when OGRLayer::SetSpatialFilter is called), then many scenarios could be efficiently implemented by just OGR and GEOS alone. Of course, it might get more complicated when you have two layers to perform the operation on, rather than one layer and a single feature. I'm sure that others have done a lot of thinking about how to optimize the different scenarios, while I haven't done much. This is why I wondered if there was another library for doing this kind of thing. If not, then your suggestion may be as fast as any other. For example, the idea of loading the features in to PostGIS or SpatiaLite will require loading all of the full geometries, passing them to another database system, etc, etc. It may be that shuffling all of the data around will be hugely expensive and that just using OGR functions with simple approaches like calling GEOS from nested loops will be faster than shuffling the data to a system that implements a more efficient approach once the data gets there. Is that basically what you are saying? 
Or have I totally missed the point? Thanks for your thoughts, Jason
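The guess above, that a spatial index stores each feature's rectangular envelope so that envelopes can be compared quickly, can be illustrated with a very simple uniform-grid index. This is a sketch only: it is not how .qix, GEOS or PostGIS indexes are actually implemented (those use tree structures), and GridIndex with its cell_size parameter is a name invented here.

```python
from collections import defaultdict
from math import floor
from typing import Dict, Iterator, List, Set, Tuple

Envelope = Tuple[float, float, float, float]  # (minx, miny, maxx, maxy)


def envelopes_overlap(a: Envelope, b: Envelope) -> bool:
    """True when the two rectangles share at least one point."""
    return a[0] <= b[2] and a[2] >= b[0] and a[1] <= b[3] and a[3] >= b[1]


class GridIndex:
    """Toy in-memory index: each envelope is registered in every grid cell it touches."""

    def __init__(self, cell_size: float) -> None:
        self.cell_size = cell_size
        self.cells: Dict[Tuple[int, int], Set[int]] = defaultdict(set)
        self.envelopes: Dict[int, Envelope] = {}

    def _cells_for(self, env: Envelope) -> Iterator[Tuple[int, int]]:
        minx, miny, maxx, maxy = env
        first_col, last_col = floor(minx / self.cell_size), floor(maxx / self.cell_size)
        first_row, last_row = floor(miny / self.cell_size), floor(maxy / self.cell_size)
        for col in range(first_col, last_col + 1):
            for row in range(first_row, last_row + 1):
                yield (col, row)

    def insert(self, fid: int, env: Envelope) -> None:
        self.envelopes[fid] = env
        for cell in self._cells_for(env):
            self.cells[cell].add(fid)

    def query(self, env: Envelope) -> List[int]:
        """Sorted ids of features whose envelope overlaps env.

        Only features sharing a grid cell with env are examined; the final
        envelope comparison removes same-cell features that do not overlap.
        """
        hits: Set[int] = set()
        for cell in self._cells_for(env):
            hits |= self.cells.get(cell, set())
        return sorted(f for f in hits if envelopes_overlap(self.envelopes[f], env))
```

For the shoreline clipping case, the study region's envelope would be the query rectangle, and only the returned polygons would have their full geometry loaded and handed to GEOS.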
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jason Roberts wrote:
> Mateusz, I'm not an expert in this area, but I think that big performance gains can be obtained by using a spatial index.

Yes, likely true.

> For example, consider a situation where you want to clip out a study region from the full resolution GSHHS shoreline database, a polygon layer. The shoreline polygons have very large, complicated geometries. It would be expensive to loop over every polygon, loading its full geometry and calling GEOS. Instead, you would use the spatial index to isolate the polygons that are likely to overlap with the study region, then loop over just those ones.

GEOS, like JTS, provides support for various spatial indexes. It is possible to index data and optimise processing in the manner you mention. In fact, GEOS uses an index internally in various operations. The problem is that such an index is not persistent: it is not serialised anywhere, so it exists in memory only. In fact, there are many more problems than this one. BTW, PostGIS does serialise its index.

OGR does not provide any spatial indexing layer common to the various vector datasets. For many simple formats it performs brute-force selection.

An alternative is to divide the task:

1. Query features from the data source using the spatial index capability of the data source.
2. Having selected only the subject features, apply geometric processing.

I did it that way, actually.

> If OGR takes advantage of spatial indexes internally (e.g. if the data source drivers can tell the core about these indexes, and the core can use them when OGRLayer::SetSpatialFilter is called), then many scenarios could be efficiently implemented by just OGR and GEOS alone.

The problem with OGR and GEOS is the cost of translating OGR geometries to GEOS geometries. It can be a bottleneck. However, if such processing functionality were built into OGR, that would make sense, but I still see limitations.

Let's brainstorm a bit and assume it implements this operation:

    OGRLayer OGR::SymDifference(OGRLayer layer1, OGRLayer layer2);

Depending on the data source, OGR could exploit its capabilities. If both layers sit in the same PostGIS (or other spatial) database, OGR just delegates the processing to PostGIS, where ST_SymDifference is executed, and OGR only grabs the results and generates an OGRLayer.

What if layer1 is a Shapefile and layer2 is an Oracle table? Let's assume the Shapefile has a .qix file with a spatial index and Oracle has its own index. What does OGR do? Load the .qix into memory, then grab layer2 and decide which features to select from layer1? Load the whole Shapefile into memory and use the Oracle index to select features from layer2 masked by layer1? How do you calculate the cost of transferring which one in which direction, etc.?

Certainly, it depends on the number of elements, what algorithm is used, the direction in which the algorithm is applied (who is subject, who is object), and much more. There are plenty of combinations, and my point is that if performance (not only in terms of speed, but of any resource) is critical, it would be extremely difficult to provide an efficient implementation of such features in OGR with a guaranteed or even determinable degree of complexity. Without these guarantees, I see little use for such a solution.

Given that, I would, depending on needs, write a specialised application using available tools like OGR and GEOS, optimised according to the specifics of the datasets, type of processing, system requirements, etc.

> If not, then your suggestion may be as fast as any other. For example, the idea of loading the features in to PostGIS or SpatiaLite will require loading all of the full geometries, passing them to another database system, etc, etc. It may be that shuffling all of the data around will be hugely expensive and that just using OGR functions with simple approaches like calling GEOS from nested loops will be faster than shuffling the data to a system that implements a more efficient approach once the data gets there.

It's never "just using". Performance is usually a concern with large datasets. Large datasets are unlikely to be stored in a simple format; they are more likely to be in proper spatial data storage, like PostGIS. It nicely combines all the elements necessary to perform geometric processing in a usable and optimised form, with an index.

> Is that basically what you are saying?

It is.

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jason Roberts wrote:
> Dear geospatial software experts,
>
> By integrating with GEOS, OGR can perform various spatial operations on individual geometries, such as buffer, intersection, union, and so on. Is there a library that efficiently performs these kinds of operations on entire OGRLayers? For example, this library would have functions that would buffer all of the features in a layer, or intersect all of the features in one layer with all of those in another.
>
> Basically, I am looking for an open source technology that replicates the geoprocessing tools found in ArcGIS and other GIS packages. These tools traditionally operate on one or more layers as input and produce one or more layers as output.

Jason,

I'd like to see something along this line happen. To do it efficiently, it would be necessary to dig into GEOS past the C interface, so that a spatial index on a collection of features can be maintained over time rather than created and discarded for each pairwise test of two geometries.

I am somewhat hesitant to have this sort of processing go into GDAL/OGR itself, especially as an extensive set of methods on OGRLayer. I think it could be done as a layered processing library without any noticeable loss of performance.

Best regards,
-- 
---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmer...@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for Rent
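Frank's point about keeping one spatial index alive across a whole layer, instead of building and discarding one per pairwise test, can be illustrated with a small sketch. It is plain Python and does not use GEOS or OGR; the GridIndex class, its uniform-grid design, and the (min_x, min_y, max_x, max_y) envelope tuples are assumptions made for illustration only. GEOS itself offers tree-based indexes, which this does not attempt to reproduce.

```python
from collections import defaultdict
from math import floor


def overlaps(a, b):
    """Envelope overlap test; envelopes are (min_x, min_y, max_x, max_y)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class GridIndex:
    """Uniform-grid index over feature envelopes, built once and reused."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(set)   # (ix, iy) -> ids touching that cell
        self.envelopes = {}             # id -> envelope

    def _cells_for(self, env):
        # Enumerate every grid cell that the envelope touches.
        c = self.cell_size
        for ix in range(floor(env[0] / c), floor(env[2] / c) + 1):
            for iy in range(floor(env[1] / c), floor(env[3] / c) + 1):
                yield ix, iy

    def insert(self, fid, env):
        self.envelopes[fid] = env
        for cell in self._cells_for(env):
            self.cells[cell].add(fid)

    def query(self, env):
        """Return sorted ids whose envelope overlaps `env`."""
        hits = set()
        for cell in self._cells_for(env):
            for fid in self.cells.get(cell, ()):
                if overlaps(self.envelopes[fid], env):
                    hits.add(fid)
        return sorted(hits)


def candidate_pairs(layer_a, index_b):
    """Pair each feature of layer A with overlapping candidates from layer B.

    The index over layer B is passed in already built, so it is shared by
    every feature of layer A rather than rebuilt per pairwise test.
    """
    return [(a, b) for a, env in sorted(layer_a.items())
            for b in index_b.query(env)]
```

Each candidate pair would still go through an exact geometric operation afterwards; the index only limits which pairs reach that stage.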