Re: [gdal-dev] Open source vector geoprocessing libraries?
Jason, are you constrained to retaining your data in an ArcGIS compatible format? If so, and if you do not have ArcSDE, then what follows may not be much help. Otherwise, I think it likely that you will find using a DBMS as your data repository advantageous for many reasons. Apart from the built-in indexing and index-based operations, it is *very* much easier to share data between users, retaining a single copy with all users having effective access. Until the File Geodatabase format is published (later this year?) and someone has the effort to build an OGR interface, the DBMS route is probably the best route to compatibility. We happen to be a corporate Oracle site, but Postgres is pretty similar. Postgres is supported by ESRI with ArcSDE, so it is possible to retain ArcGIS compatibility this way. Many years ago, I had a Simula class for performing many of these basic spatial operations; however, now my data is all in Oracle: I am able to use the Oracle functions and no longer have to worry about building and rebuilding indexes, etc. - other than USER_SDO_GEOM_METADATA which, unfortunately, OGR only writes to at table creation and does not update. Frankly, life (and maintenance) is much easier now and, certainly with Oracle, I think there have been performance gains. Just my ha'pence-worth. Peter

Mateusz Loskot wrote:

Jason Roberts wrote: Mateusz, I'm not an expert in this area, but I think that big performance gains can be obtained by using a spatial index.

Yes, likely true.

For example, consider a situation where you want to clip out a study region from the full resolution GSHHS shoreline database, a polygon layer. The shoreline polygons have very large, complicated geometries. It would be expensive to loop over every polygon, loading its full geometry and calling GEOS. Instead, you would use the spatial index to isolate the polygons that are likely to overlap with the study region, then loop over just those ones.

GEOS, like JTS, provides support for various spatial indexes. It is possible to index data and optimise it in the manner you mention. In fact, GEOS uses indexes internally in various operations. The problem is that such an index is not persistent, not serialised anywhere, so it exists in memory only. In fact, there are many more problems than this one. BTW, PostGIS is an example of index serialisation. OGR does not provide any spatial indexing layer common to various vector datasets. For many simple formats it performs brute-force selection. An alternative is to try to divide the tasks: 1. Query features from the data source using the spatial index capability of the data source. 2. Having only the subject features selected, apply geometric processing. I did it that way, actually.

If OGR takes advantage of spatial indexes internally (e.g. if the data source drivers can tell the core about these indexes, and the core can use them when OGRLayer::SetSpatialFilter is called), then many scenarios could be efficiently implemented by just OGR and GEOS alone. The problem with OGR and GEOS is the cost of translation from OGR geometry to GEOS geometry. It can be a bottleneck.
However, if such processing functionality were considered as built in to OGR, that would make sense, but I still see limitations. Let's brainstorm a bit and assume it implements this operation: OGRLayer OGR::SymDifference(OGRLayer layer1, OGRLayer layer2); Depending on the data source, OGR could exploit its capabilities. If both layers sit in the same PostGIS (or other spatial) database, OGR just delegates the processing to PostGIS, where ST_SymDifference is executed, and OGR only grabs the results and generates an OGRLayer. What if layer1 is a Shapefile and layer2 is an Oracle table? Let's assume the Shapefile has a .qix file with a spatial index and Oracle has its own index. What does OGR do? Load the .qix into memory, then grab layer2 and decide which features to select from layer1? Load the whole Shapefile into memory and use the Oracle index to select features from layer2 masked by layer1? How do we calculate the cost of which one to transfer in which direction, etc.? Certainly, it depends on the number of elements, what algorithm is used, the direction of application of the algorithm (who is subject, who is object), and many more factors. There are plenty of combinations, and my point is that if performance (not only in terms of speed, but of any resource) is critical, it would be extremely difficult to provide an efficient implementation of such features in OGR with a guaranteed or even determinable degree of complexity. Without these guarantees, I see little use for such a solution. Given that, depending on needs, write a specialised application using available tools like OGR and GEOS, one that is optimised according to the specifics of the datasets, type of processing, system requirements, etc. If not, then your suggestion may be as fast as any other. For example, the idea of loading the features in to PostGIS or SpatiaLite will require loading all of the full
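To make the two-step idea above concrete, here is a minimal sketch using the GDAL/OGR Python bindings; the shapefile name and study-region WKT are hypothetical. Step 1 leans on OGRLayer::SetSpatialFilter, which lets a driver use its spatial index where one exists; step 2 applies the exact GEOS-backed operation only to the surviving candidates:

from osgeo import ogr

ds = ogr.Open('shorelines.shp')
layer = ds.GetLayer(0)
region = ogr.CreateGeometryFromWkt(
    'POLYGON((-10 40,5 40,5 55,-10 55,-10 40))')

# Step 1: envelope-level filtering; drivers with an index (e.g. a .qix
# file) can avoid scanning every feature.
layer.SetSpatialFilter(region)

# Step 2: exact, GEOS-backed test and clip on the reduced set only.
clipped = []
feat = layer.GetNextFeature()
while feat is not None:
    geom = feat.GetGeometryRef()
    if geom.Intersects(region):
        clipped.append(geom.Intersection(region))
    feat = layer.GetNextFeature()

Note that each Intersects/Intersection call still pays the OGR-to-GEOS translation cost discussed above; the filter only reduces how many times it is paid.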
[gdal-dev] Re: Memory use in GDALDriver::CreateCopy()
Greg Coats gregcoats at mac.com writes: I find that GDAL version 1.6.3, released 2009/11/19, gdal_translate fully supports reading and writing a 150 GB GeoTiff image 260,000 columns by 195,000 rows by RGB. Greg

Hi, The problem is not the image size itself. It may be related to, as mentioned earlier, the image being organised by scanlines and having blocks wider than the supported limit in pixels. Your image is tiled: Tile Width: 512 Tile Length: 512, while the file in question has 4 pixel wide blocks: Band 1 Block=4x1 Type=Byte, ColorInterp=Gray -Jukka Rahkonen-
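For anyone wanting to check how their own file is organised, a small sketch with the GDAL Python bindings (the file name is hypothetical); it reports the same block shape that gdalinfo prints:

from osgeo import gdal

ds = gdal.Open('input.ntf')
band = ds.GetRasterBand(1)
block_x, block_y = band.GetBlockSize()
# A block height of 1 means scanline organisation; values like
# 512x512 indicate a tiled layout.
print('Block=%dx%d' % (block_x, block_y))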
[gdal-dev] NITF JPEG2000 compression and Kakadu
Frank, In the file NITFDatasetCreate.cpp in the function NITFDatasetCreate(), if the compression option is set to C8 (JPEG2000) it looks like you:

1. get a handle to an installed J2K driver if available.
2. test for metadata creation capability.
3. create the nitf file.
4. open a new handle to the nitf file on disk.
5. set up a j2k subfile option based on the new nitf file segment offset.
6. call create on the j2k driver with the j2k_subfile option.
7. return an open handle to the new nitf file.

It seems to me that I could hack my version of GDAL to include support for doing this with my copy of Kakadu, with the exception that I would have to first create a VRT dataset of my output J2K file and then use CreateCopy() on the Kakadu driver instead of Create(). Do you think I am missing something here and that it is more difficult than that? Does the Kakadu library not have some feature I would need to do this? If it can be done, would my approach of using a VRT dataset work? I only want to create single dataset output. Best regards, Martin.
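The VRT route Martin describes can be sketched in a few lines of Python. This is a hedged illustration, not the NITF driver's internal logic: the file names are hypothetical, the JP2KAK driver is only present in Kakadu-enabled builds, and this writes a bare JP2 rather than a JP2 embedded in an NITF segment:

from osgeo import gdal

src = gdal.Open('source.tif')
# Wrap the source in an in-memory VRT so a CreateCopy()-only driver
# like JP2KAK can consume any dataset.
vrt = gdal.GetDriverByName('VRT').CreateCopy('', src)
out = gdal.GetDriverByName('JP2KAK').CreateCopy('output.jp2', vrt)
out = None  # flush and close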
RE: [gdal-dev] Open source vector geoprocessing libraries?
Jason, Have you looked at GeoKettle [1]? And recently I found GearScape [2], which seemed very interesting to me. Though neither is based on Python... Duarte Carreira

[1] - http://sourceforge.net/projects/geokettle/
[2] - http://www.fergonco.es/gearscape/index.php

From: Emilio Mayorga [emiliomayo...@gmail.com] Sent: Tuesday, 12 January 2010 18:25 To: Jason Roberts Cc: gdal-dev Subject: Re: [gdal-dev] Open source vector geoprocessing libraries?

Hi Jason, This may not be quite what you have in mind, but check out the PySAL (Open Source Python Library for Spatial Analytical Functions) project: http://geodacenter.asu.edu/pysal I've never used it, and have only looked at a recent presentation (http://conference.scipy.org/static/wiki/rey_pysal.pdf). It's not clear that it includes or even aims to include the traditional spatial operators provided by GEOS. I also have no idea if it uses OGR for its vector data access. But the developers have done some terrific work in spatial analysis tools in the past. BTW, I'd love to see your marine spatial ecology tools moved to an open source, platform neutral code base! Cheers, -Emilio Mayorga Applied Physics Laboratory University of Washington Box 355640 Seattle, WA 98105-6698 USA

On Mon, Jan 11, 2010 at 2:32 PM, Jason Roberts jason.robe...@duke.edu wrote:

Dear geospatial software experts, By integrating with GEOS, OGR can perform various spatial operations on individual geometries, such as buffer, intersection, union, and so on. Is there a library that efficiently performs these kinds of operations on entire OGRLayers? For example, this library would have functions that would buffer all of the features in a layer, or intersect all of the features in one layer with all of those in another. Basically, I am looking for an open source technology that replicates the geoprocessing tools found in ArcGIS and other GIS packages. These tools traditionally operate on one or more layers as input and produce one or more layers as output. If such a library does not exist, does the OGR team envision that they might add such capabilities to OGR in the future? From software design and performance points of view, would it be appropriate to extend OGR to include functions for spatial operations on entire layers, or is this best left to other libraries? I can see rudimentary ways to implement such tools (e.g. for intersecting layers: loop over all features in both layers, calling OGRGeometry::Touches on all combinations, or something similar). But I am not a geometry expert and do not know if OGRLayer's cursor-based design is compatible with such capabilities; I do not know about spatial indexing, for example. I develop open source geoprocessing tools that help with spatial ecology problems. At the moment, my tools depend heavily on ArcGIS for these operations with vector layers. I would like to remove this dependency and, if possible, develop a toolbox that exposes the same ecology tools to several GIS packages. Many GIS packages, such as ArcGIS, QGIS, MapWindow, and OpenJump, support plugin extensions. I am wondering how difficult it would be to develop a package of tools that does not depend on a specific GIS package but exposes them to several packages via the package-specific plugin mechanisms. For this to work, I'd have to find a library that can do the kind of geoprocessing with layers that ArcGIS can do, or write my own. Writing it myself sounds daunting, and I am hoping that there are existing projects to draw from.
Thank you very much for any comments you can provide. Jason
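For illustration, a layer-level operation of the kind Jason describes can be composed from the per-geometry GEOS calls OGR already exposes. This is a hedged sketch, not an existing OGR function; the function name is made up:

from osgeo import ogr

def buffer_layer(src_layer, dst_layer, distance):
    # Naive layer-level operation: apply a GEOS-backed Buffer() to each
    # feature and append the result to the destination layer.
    src_layer.ResetReading()
    feat = src_layer.GetNextFeature()
    while feat is not None:
        out = ogr.Feature(dst_layer.GetLayerDefn())
        out.SetGeometry(feat.GetGeometryRef().Buffer(distance))
        dst_layer.CreateFeature(out)
        feat = src_layer.GetNextFeature()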
Re: [gdal-dev] Open source vector geoprocessing libraries?
On 13-1-2010 2:33, Mateusz Loskot wrote: OGR does not provide any spatial indexing layer common to various vector datasets. For many simple formats it performs the brute-force selection.

Just curious, would it make sense / be possible to implement indexing in OGR, something like a generalized version of MapServer's shptree, the quadtree-based spatial index for shapefiles? http://mapserver.org/utilities/shptree.html Jan
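For shapefiles specifically, OGR can already build and use that same quadtree index format through a driver-specific SQL command; a minimal sketch with hypothetical file and layer names:

from osgeo import ogr

ds = ogr.Open('roads.shp', 1)  # open in update mode
# Builds roads.qix, which SetSpatialFilter will then honour.
ds.ExecuteSQL('CREATE SPATIAL INDEX ON roads')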
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jan Hartmann wrote: On 13-1-2010 2:33, Mateusz Loskot wrote: OGR does not provide any spatial indexing layer common to various vector datasets. For many simple formats it performs the brute-force selection.

Just curious, would it make sense / be possible to implement indexing in OGR, something like a generalized version of MapServer's shptree, the quadtree-based spatial index for shapefiles? http://mapserver.org/utilities/shptree.html

It could make sense to have an in-memory index for in-memory geometries. Perhaps use the GiST library (1) (I don't know whether it can use in-memory indexes) for geometries in an OGRGeometryCollection or OGRMemLayer if it's available. For other formats it might not make sense because OGR is not responsible for the actual geometries. As has been said, one should use the PostGIS format, which has this functionality built in, for larger and more static datasets. Just my quick thoughts. Ari (1) http://www.sai.msu.su/~megera/postgres/gist/
RE: [gdal-dev] Open source vector geoprocessing libraries?
Mateusz, Thank you very much for your insight. I have a few more questions I'm hoping you could answer.

An alternative is to try to divide the tasks: 1. Query features from the data source using the spatial index capability of the data source. 2. Having only the subject features selected, apply geometric processing.

That sounds like a reasonable approach. Considering just the simpler scenarios, such as the one I mentioned, is it possible to implement it efficiently with OGR compiled with GEOS? I believe OGR can pass SQL directly through to the data source driver, allowing the caller to submit SQL containing spatial operators. In principle, one could submit a spatial query to PostGIS or SpatiaLite and efficiently get back the features (including geometry) that could possibly intersect a bounding box. Then one could use the GEOS functions on OGRGeometry to do the actual intersecting. Is that what you were suggesting? Of course, it may be that PostGIS or SpatiaLite can handle both steps 1 and 2 in a single query. If so, would it be best to do it that way?

It appears that the OGR shapefile driver supports a spatial indexing scheme (.qix file) that is respected by OGRLayer::SetSpatialFilter. The documentation says that this test "may be inaccurately implemented, but it is guaranteed that all features whose envelope (as returned by OGRGeometry::getEnvelope()) overlaps the envelope of the spatial filter will be returned." Therefore, it appears that the shapefile driver can implement step 1 but not step 2. Is that correct?

The problem with OGR and GEOS is the cost of translation from OGR geometry to GEOS geometry. It can be a bottleneck.

Is it correct that this cost would only be incurred when you call OGR functions implemented by GEOS, such as OGRGeometry::Intersects, OGRGeometry::Disjoint, etc.?

There are plenty of combinations, and my point is that if performance (not only in terms of speed, but of any resource) is critical, it would be extremely difficult to provide an efficient implementation of such features in OGR with a guaranteed or even determinable degree of complexity. Without these guarantees, I see little use for such a solution.

Yes, I see what you mean. But I suggest to the open source community that there is still value in implementing such features, either as part of OGR or another library, even if optimal performance cannot be guaranteed in all scenarios. The reason is that ArcGIS provides such generic tools (e.g. intersect/union/symdiff layers, regardless of underlying storage). These geoprocessing tools are considered the most basic capabilities of ArcGIS, available in the cheapest versions of the software. IMHO, if the open source community wants to win over a large number of ArcGIS users to open GIS systems, it needs to provide parity with these basic tools. Thanks again, Jason

-Original Message- From: Mateusz Loskot [mailto:mate...@loskot.net] Sent: Tuesday, January 12, 2010 8:33 PM To: Jason Roberts Cc: 'gdal-dev' Subject: Re: [gdal-dev] Open source vector geoprocessing libraries?

Jason Roberts wrote: Mateusz, I'm not an expert in this area, but I think that big performance gains can be obtained by using a spatial index.

Yes, likely true.

For example, consider a situation where you want to clip out a study region from the full resolution GSHHS shoreline database, a polygon layer. The shoreline polygons have very large, complicated geometries. It would be expensive to loop over every polygon, loading its full geometry and calling GEOS.
Instead, you would use the spatial index to isolate the polygons that are likely to overlap with the study region, then loop over just those ones.

GEOS, like JTS, provides support for various spatial indexes. It is possible to index data and optimise it in the manner you mention. In fact, GEOS uses indexes internally in various operations. The problem is that such an index is not persistent, not serialised anywhere, so it exists in memory only. In fact, there are many more problems than this one. BTW, PostGIS is an example of index serialisation. OGR does not provide any spatial indexing layer common to various vector datasets. For many simple formats it performs brute-force selection. An alternative is to try to divide the tasks: 1. Query features from the data source using the spatial index capability of the data source. 2. Having only the subject features selected, apply geometric processing. I did it that way, actually.

If OGR takes advantage of spatial indexes internally (e.g. if the data source drivers can tell the core about these indexes, and the core can use them when OGRLayer::SetSpatialFilter is called), then many scenarios could be efficiently implemented by just OGR and GEOS alone. The problem with OGR and GEOS is the cost of translation from OGR geometry to GEOS geometry. It can be a bottleneck. However, if such processing functionality were considered as built in to OGR, that would make sense, but I still
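One way to realise step 1 through the SQL pass-through Jason mentions: hand the driver a statement that PostGIS executes natively, so its GiST index does the coarse filtering. A hedged sketch; the connection string, table, and column names are hypothetical:

from osgeo import ogr

ds = ogr.Open('PG:dbname=gis')
sql = ("SELECT * FROM shorelines WHERE geom && "
       "ST_GeomFromText('POLYGON((-10 40,5 40,5 55,-10 55,-10 40))', 4326)")
# The PostGIS driver passes this statement straight to the database,
# where && is the index-assisted bounding-box overlap operator.
result = ds.ExecuteSQL(sql)
# ... step 2: exact GEOS-backed tests on the returned features ...
ds.ReleaseResultSet(result)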
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jan Hartmann wrote: On 13-1-2010 15:49, Ari Jolma wrote: Jan Hartmann wrote: Just curious, would it make sense / be possible to implement indexing in OGR, something like a generalized version of MapServer's shptree, the quadtree-based spatial index for shapefiles? http://mapserver.org/utilities/shptree.html

It could make sense to have an in-memory index for in-memory geometries. Perhaps use the GiST library (1) (I don't know whether it can use in-memory indexes) for geometries in an OGRGeometryCollection or OGRMemLayer if it's available. For other formats it might not make sense because OGR is not responsible for the actual geometries. As has been said, one should use the PostGIS format, which has this functionality built in, for larger and more static datasets.

Is that so? Reading the OGR API tutorial (http://www.gdal.org/ogr/ogr_apitut.html), I see that all geometries, from whatever input source, are represented internally as a generic OGRGeometry pointer, which is a virtual base class for all real geometry classes (http://www.gdal.org/ogr/classOGRGeometry.html). Most of the GEOS functionality can be implemented on OGRGeometries, so in principle the same could be done with indexing libraries (GiST, b-tree, quadtree, etc.). Such indices should be written out to disk to be of any use at all, of course, like shptree does.

What I meant is that with formats other than the in-memory format, the features are stored on disk (possibly even on remote servers) and are only available for indexing when retrieved. When they are retrieved, they are of course OGR objects and accessible through the generic OGR API. Maybe it's possible, but it would probably mean that the library would need to retrieve and go through all the features, and prepare the index and store it into some local(?) file. Thus I think that for those formats, it's up to the format itself to provide the indexing or not. Ari
RE: [gdal-dev] Open source vector geoprocessing libraries?
Hi Duarte, Thanks for the suggestions. I took a look at GeoKettle. Here are some relevant excerpts from a document: GeoKettle is a ... powerful, metadata-driven spatial ETL tool dedicated to the integration of different spatial data sources for building/updating geospatial data warehouses. At present, Oracle Spatial, PostgreSQL/PostGIS and MySQL DBMS and ESRI shapefiles are natively supported in read and write modes. Spatial Reference Systems management and coordinate transformations have been fully implemented. It is possible to access Geometry objects in JavaScript and define custom transformation steps ("Modified JavaScript Value" step). Topological predicates (Intersects, Crosses, etc.) have all been implemented.

It looks interesting, but it is oriented to server applications. We are building a set of desktop GIS analysis tools. It would probably not be practical to try to embed GeoKettle in our application. GearScape also looks interesting, with SQL-oriented geoprocessing, but it is more of an extensible GIS program than a geospatial library. Again, probably not practical to embed it in our app. Best regards, Jason

-Original Message- From: Duarte Carreira [mailto:dcarre...@edia.pt] Sent: Wednesday, January 13, 2010 4:54 AM To: Jason Roberts Cc: gdal-dev Subject: RE: [gdal-dev] Open source vector geoprocessing libraries?

Jason, Have you looked at GeoKettle [1]? And recently I found GearScape [2], which seemed very interesting to me. Though neither is based on Python... Duarte Carreira [1] - http://sourceforge.net/projects/geokettle/ [2] - http://www.fergonco.es/gearscape/index.php

From: Emilio Mayorga [emiliomayo...@gmail.com] Sent: Tuesday, 12 January 2010 18:25 To: Jason Roberts Cc: gdal-dev Subject: Re: [gdal-dev] Open source vector geoprocessing libraries?

Hi Jason, This may not be quite what you have in mind, but check out the PySAL (Open Source Python Library for Spatial Analytical Functions) project: http://geodacenter.asu.edu/pysal I've never used it, and have only looked at a recent presentation (http://conference.scipy.org/static/wiki/rey_pysal.pdf). It's not clear that it includes or even aims to include the traditional spatial operators provided by GEOS. I also have no idea if it uses OGR for its vector data access. But the developers have done some terrific work in spatial analysis tools in the past. BTW, I'd love to see your marine spatial ecology tools moved to an open source, platform neutral code base! Cheers, -Emilio Mayorga Applied Physics Laboratory University of Washington Box 355640 Seattle, WA 98105-6698 USA

On Mon, Jan 11, 2010 at 2:32 PM, Jason Roberts jason.robe...@duke.edu wrote: Dear geospatial software experts, By integrating with GEOS, OGR can perform various spatial operations on individual geometries, such as buffer, intersection, union, and so on. Is there a library that efficiently performs these kinds of operations on entire OGRLayers? For example, this library would have functions that would buffer all of the features in a layer, or intersect all of the features in one layer with all of those in another. Basically, I am looking for an open source technology that replicates the geoprocessing tools found in ArcGIS and other GIS packages. These tools traditionally operate on one or more layers as input and produce one or more layers as output. If such a library does not exist, does the OGR team envision that they might add such capabilities to OGR in the future?
From software design and performance points of view, would it be appropriate to extend OGR to include functions for spatial operations on entire layers, or is this best left to other libraries? I can see rudimentary ways to implement such tools (e.g. for intersecting layers: loop over all features in both layers, calling OGRGeometry::Touches on all combinations, or something similar). But I am not a geometry expert and do not know if OGRLayer's cursor-based design is compatible with such capabilities; I do not know about spatial indexing, for example. I develop open source geoprocessing tools that help with spatial ecology problems. At the moment, my tools depend heavily on ArcGIS for these operations with vector layers. I would like to remove this dependency and, if possible, develop a toolbox that exposes the same ecology tools to several GIS packages. Many GIS packages, such as ArcGIS, QGIS, MapWindow, and OpenJump, support plugin extensions. I am wondering how difficult it would be to develop a package of tools that does not depend on a specific GIS package but exposes them to several packages via the package-specific plugin mechanisms. For this to work, I'd have to find a library that can do the kind of geoprocessing with layers that ArcGIS can do, or write my own. Writing it myself sounds daunting, and I am hoping that there are existing projects to draw from.
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jason, Jason Roberts wrote: Peter, are you constrained to retaining your data in an ArcGIS compatible format? We are attempting to build tools that can work with data stored in a variety of formats. Our current user community uses mostly shapefiles, ArcGIS personal geodatabases, and ArcGIS file geodatabases. Many of them are ecologists who do not have the interest or skills to deploy a real DBMS system. Thus we are hoping to provide tools that can work without one. This is one reason I was exploring how embeddable PostGIS and SpatiaLite might be in the other fork of this thread.

I wonder how many users are aware that ESRI have announced the file geodatabase as replacing the (Access) personal geodatabase? They have not, as yet, announced a cut-off for this format, but its many limitations, a result of Access capabilities, may make this sooner rather than later.

Until the File Geodatabase format is published (later this year?) and someone has the effort to build an OGR interface, the DBMS route is probably the best route to compatibility.

It would be really great for that to happen, but I'm not holding my breath. If it does get published, I would seriously contemplate building an OGR driver.

ESRI announced that publication would be alongside the release of ArcGIS 9.4, at the EMEA User Conference in November 2008 (London). They said that they see the file geodatabase replacing both the personal geodatabase and shapefiles. I believe 9.4 is currently in beta test.

I have contemplated building an ArcObjects- or arcgisscripting-based driver. This would at least allow people who have ArcGIS to use OGR to access any ArcGIS layer, including those created by ArcGIS's tools for joining arbitrary layers, etc. That would handle file geodatabases, as well as ALL formats accessible from ArcGIS. If such a driver existed, then we could use OGR as the base interface inside our application. But creating such a driver would be a lot of work and have funky dependencies, because it either needs to use Windows COM (for ArcObjects) or Python (for arcgisscripting) to call the ArcGIS APIs. I am certainly capable of implementing it, but because most of our code is in Python, it is probably easier for me to wrap OGR and arcgisscripting behind a common abstraction, and then have our tools work against that abstraction rather than OGR directly.

GDAL, including OGR, is actually embedded in ArcGIS; however, I do not know quite what ESRI use it for. At any rate, I'm sure it is nice being able to do all your work in a spatially-enabled DBMS... Also an attraction of Postgres, of course. Best wishes, Peter

Peter J Halls, GIS Advisor, University of York Telephone: 01904 433806 Fax: 01904 433740 Snail mail: Computing Service, University of York, Heslington, York YO10 5DD This message has the status of a private and personal communication
Re: [gdal-dev] Memory use in GDALDriver::CreateCopy()
As a practical matter, I do not see this restriction in GDAL. On Thu 21 Sep 2006, I created with gdal_merge.py a 3 GB .tif having 18,400 columns by 52,800 rows by RGB. On Thu 11 Dec 2009, gdal_translate processed a 150 GB untiled .tif to a tiled .tif with 260,000 columns by 195,000 rows. Greg

On Jan 12, 2010, at 6:38 PM, Even Rouault wrote: I'm a bit surprised that you even managed to read a 40Kx100K large NITF file organized as scanlines. There was a limit until very recently that prevented reading blocks where one dimension was bigger than a certain threshold. This was fixed recently in trunk (see ticket http://trac.osgeo.org/gdal/ticket/3263) and branches/1.6, but it has not yet appeared in an officially released version. So which GDAL version are you using?
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jan Hartmann wrote: Is that so? Reading the OGR API tutorial (http://www.gdal.org/ogr/ogr_apitut.html), I see that all geometries, from whatever input source, are represented internally as a generic OGRGeometry pointer, which is a virtual base class for all real geometry classes (http://www.gdal.org/ogr/classOGRGeometry.html). Most of the GEOS functionality can be implemented on OGRGeometries, so in principle the same could be done with indexing libraries (GiST, b-tree, quadtree, etc.). Such indices should be written out to disk to be of any use at all, of course, like shptree does.

Jan, I have had trouble keeping up with this spirited discussion, but I wanted to note that it is not intended that alternate implementations of geometries would be derived from OGRGeometry. There are many places, for instance, that assume an OGRGeometry can be cast to OGRLineString if its type is wkbLineString. Best regards,

---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmer...@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for Rent
[gdal-dev] Re: NITF JPEG2000 compression and Kakadu
Martin Chapman wrote: Frank, In the file NITFDatasetCreate.cpp in the function NITFDatasetCreate(), if the compression option is set to C8 (JPEG2000) it looks like you: 1. get a handle to an installed J2K driver if available. 2. test for metadata creation capability. 3. create the nitf file. 4. open a new handle to the nitf file on disk. 5. set up a j2k subfile option based on the new nitf file segment offset. 6. call create on the j2k driver with the j2k_subfile option. 7. return an open handle to the new nitf file. It seems to me that I could hack my version of GDAL to include support for doing this with my copy of Kakadu, with the exception that I would have to first create a VRT dataset of my output J2K file and then use CreateCopy() on the Kakadu driver instead of Create(). Do you think I am missing something here and that it is more difficult than that? Does the Kakadu library not have some feature I would need to do this? If it can be done, would my approach of using a VRT dataset work? I only want to create single dataset output.

Martin, I have skimmed NITFCreateCopy() in nitfdataset.cpp, and I was somewhat surprised to find it does not already support using the JP2KAK (Kakadu) driver to write jpeg2000 encoded nitf files. I *think* it could be trivially extended to support JP2KAK with a case similar to the one for JasPer, with the filename encoded using the /vsisubfile/ mechanism. I must confess I'm not clear on why you are bringing VRT files or the Create() method into the discussion. Hmm, rereading your email... I assume you are referring to the function NITFDatasetCreate() in nitfdataset.cpp (not NITFDatasetCreate.cpp, which I don't think exists). I see it utilizes some special hacks taking advantage of the fact that the JP2ECW driver supports create+write as long as the application writes in a very specific top-down order. I had forgotten about this hack, which was implemented (somewhat against my better judgement) for ERMapper. Do you have a compelling need to support imperative creation via Create() instead of CreateCopy()? Feel free to give me a call at +1 613 754-2041 if that would expedite this discussion. Best regards,

---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmer...@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for Rent
Re: [gdal-dev] Memory use in GDALDriver::CreateCopy()
Hi Even, yes, I tried: gdal_translate -of NITF -co ICORDS=G -co BLOCKXSIZE=128 -co BLOCKYSIZE=128 NITF_IM:0:input.ntf output.ntf I monitored the memory use using top and it was steadily increasing till it reached 98.4% (I have 8GB of RAM and 140 GB of local disk for swap etc.) before the node died (not just the program; the whole system just stopped responding). My GDAL version is 1.6.2. gdalinfo on this image shows a raster size of (37504, 98772) and Block=37504x1. The image is compressed using the JPEG2000 option and contains two subdatasets (data and cloud data ~ I used only the data for the gdal_translate test). Band info from gdalinfo: Band 1 Block=37504x1 Type=UInt16, ColorInterp=Gray Ozy

On Tue, Jan 12, 2010 at 5:38 PM, Even Rouault even.roua...@mines-paris.org wrote: Ozy, Did you try with gdal_translate -of NITF src.tif output.tif -co BLOCKSIZE=128 ? Does it give similar results? I'm a bit surprised that you even managed to read a 40Kx100K large NITF file organized as scanlines. There was a limit until very recently that prevented reading blocks where one dimension was bigger than a certain threshold. This was fixed recently in trunk (see ticket http://trac.osgeo.org/gdal/ticket/3263) and branches/1.6, but it has not yet appeared in an officially released version. So which GDAL version are you using? Does the output of gdalinfo on your scanline oriented input NITF give something like: Band 1 Block=4x1 Type=Byte, ColorInterp=Gray Is your input NITF compressed or uncompressed? Anyway, with latest trunk, I've simulated creating a similarly large NITF image with the following python snippet:

import gdal
ds = gdal.GetDriverByName('NITF').Create('scanline.ntf', 4, 10)
ds = None

and then creating the tiled NITF: gdal_translate -of NITF scanline.ntf tiled.ntf -co BLOCKSIZE=128 The memory consumption is very reasonable (less than 50 MB: the default block cache size of 40 MB + temporary buffers), so I'm not clear why you would have a problem of increasing memory use.

ozy sjahputera a écrit : I was trying to make a copy of a very large NITF image (about 40Kx100K pixels) using GDALDriver::CreateCopy(). The new file was set to have a different block size (input was a scanline image; output is to have a 128x128 block size). The program keeps getting killed by the system (Linux). I monitored the memory use of the program as it was executing CreateCopy, and the memory use was steadily increasing as the progress indicator from CreateCopy was moving forward. Why does CreateCopy() use so much memory? I have not perused the source code of CreateCopy() yet, but I am guessing it employs RasterIO() to perform the read/write? I tried different sizes for the GDAL cache: 64MB, 256MB, 512MB, 1GB, and 2GB. The program got killed with all these cache sizes. In fact, my Linux box became unresponsive when I set GDALSetCacheMax() to 64MB. Thank you. Ozy
Re: [gdal-dev] Memory use in GDALDriver::CreateCopy()
Update: after more than 20 minutes of being non-responsive, the OS finally regained functionality and promptly killed gdal_translate after about 80% of the process. On Wed, Jan 13, 2010 at 11:14 AM, ozy sjahputera sjahpute...@gmail.com wrote:

Hi Even, yes, I tried: gdal_translate -of NITF -co ICORDS=G -co BLOCKXSIZE=128 -co BLOCKYSIZE=128 NITF_IM:0:input.ntf output.ntf I monitored the memory use using top and it was steadily increasing till it reached 98.4% (I have 8GB of RAM and 140 GB of local disk for swap etc.) before the node died (not just the program; the whole system just stopped responding). My GDAL version is 1.6.2. gdalinfo on this image shows a raster size of (37504, 98772) and Block=37504x1. The image is compressed using the JPEG2000 option and contains two subdatasets (data and cloud data ~ I used only the data for the gdal_translate test). Band info from gdalinfo: Band 1 Block=37504x1 Type=UInt16, ColorInterp=Gray Ozy

On Tue, Jan 12, 2010 at 5:38 PM, Even Rouault even.roua...@mines-paris.org wrote: Ozy, Did you try with gdal_translate -of NITF src.tif output.tif -co BLOCKSIZE=128 ? Does it give similar results? I'm a bit surprised that you even managed to read a 40Kx100K large NITF file organized as scanlines. There was a limit until very recently that prevented reading blocks where one dimension was bigger than a certain threshold. This was fixed recently in trunk (see ticket http://trac.osgeo.org/gdal/ticket/3263) and branches/1.6, but it has not yet appeared in an officially released version. So which GDAL version are you using? Does the output of gdalinfo on your scanline oriented input NITF give something like: Band 1 Block=4x1 Type=Byte, ColorInterp=Gray Is your input NITF compressed or uncompressed? Anyway, with latest trunk, I've simulated creating a similarly large NITF image with the following python snippet:

import gdal
ds = gdal.GetDriverByName('NITF').Create('scanline.ntf', 4, 10)
ds = None

and then creating the tiled NITF: gdal_translate -of NITF scanline.ntf tiled.ntf -co BLOCKSIZE=128 The memory consumption is very reasonable (less than 50 MB: the default block cache size of 40 MB + temporary buffers), so I'm not clear why you would have a problem of increasing memory use.

ozy sjahputera a écrit : I was trying to make a copy of a very large NITF image (about 40Kx100K pixels) using GDALDriver::CreateCopy(). The new file was set to have a different block size (input was a scanline image; output is to have a 128x128 block size). The program keeps getting killed by the system (Linux). I monitored the memory use of the program as it was executing CreateCopy, and the memory use was steadily increasing as the progress indicator from CreateCopy was moving forward. Why does CreateCopy() use so much memory? I have not perused the source code of CreateCopy() yet, but I am guessing it employs RasterIO() to perform the read/write? I tried different sizes for the GDAL cache: 64MB, 256MB, 512MB, 1GB, and 2GB. The program got killed with all these cache sizes. In fact, my Linux box became unresponsive when I set GDALSetCacheMax() to 64MB. Thank you. Ozy
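For what it's worth, the failing gdal_translate run can be reproduced programmatically, which makes it easier to attach a debugger or vary the cache limit in one place. A hedged sketch whose options mirror the command above; the cache value is only an example:

from osgeo import gdal

gdal.SetCacheMax(64 * 1024 * 1024)  # cap the block cache at 64 MB
src = gdal.Open('NITF_IM:0:input.ntf')
dst = gdal.GetDriverByName('NITF').CreateCopy(
    'output.ntf', src,
    options=['ICORDS=G', 'BLOCKXSIZE=128', 'BLOCKYSIZE=128'])
dst = None  # flush and close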
RE: [gdal-dev] Open source vector geoprocessing libraries?
Jason Roberts wrote: Peter, are you constrained to retaining your data in an ArcGIS compatible format? We are attempting to build tools that can work with data stored in a variety of formats. Our current user community uses mostly shapefiles, ArcGIS personal geodatabases, and ArcGIS file geodatabases. Many of them are ecologists who do not have the interest or skills to deploy a real DBMS system. Thus we are hoping to provide tools that can work without one. This is one reason I was exploring how embeddable PostGIS and SpatiaLite might be in the other fork of this thread.

Until the File Geodatabase format is published (later this year?) and someone has the effort to build an OGR interface, the DBMS route is probably the best route to compatibility.

It would be really great for that to happen, but I'm not holding my breath. If it does get published, I would seriously contemplate building an OGR driver. I have contemplated building an ArcObjects- or arcgisscripting-based driver. This would at least allow people who have ArcGIS to use OGR to access any ArcGIS layer, including those created by ArcGIS's tools for joining arbitrary layers, etc. That would handle file geodatabases, as well as ALL formats accessible from ArcGIS. If such a driver existed, then we could use OGR as the base interface inside our application. But creating such a driver would be a lot of work and have funky dependencies, because it either needs to use Windows COM (for ArcObjects) or Python (for arcgisscripting) to call the ArcGIS APIs. I am certainly capable of implementing it, but because most of our code is in Python, it is probably easier for me to wrap OGR and arcgisscripting behind a common abstraction, and then have our tools work against that abstraction rather than OGR directly.

I find it very amusing you mention this right now. Why? I asked Frank if there was an ArcObjects based OGR driver this very past Thursday and he said "not that I know of". What I wanted was, among other things, to get data out of FileGDB to PostGIS in one shot and add some custom behavior for a client of mine. So I spent the past three days looking at OGR drivers and wrote an ArcObjects based one. I got it working yesterday.

- Right now I only instantiate 3 factories (Enterprise GDB aka ArcSDE, AccessDB and FileGDB). This means it reads FileGDB just fine. If you want more factories, the driver only has to be modified with one line to add any other factories and everything else would just work.
- I only implemented the parts that I needed, so it is read-only (should be straightforward to expand if need be).
- Although it can read other GeoDatabase abstractions (Topology, Geometric Networks, Annotations, Cadastral Fabrics, etc.), currently I am explicitly filtering for FeatureClasses and FeatureDatasets.
- It is an ATL/COM/C++ based driver, so it will only compile on Windows. It could be modified to use the cross-platform ArcEngine SDK, since all the COM objects that I use are called the same and behave the same way... I just did not have an ArcEngine SDK installer, so I could not test this.

Anyway, if you are interested in the source code, let me know.
Perhaps we can add it as an OGR driver contribution (what is the process for that, anyway?). I may not respond fast enough to e-mail, since the next 4 weeks are pretty crazy for me. - Ragi Burhum
[gdal-dev] Motion: Paid Maintainer Contract with Chaitanya
Motion: Frank Warmerdam is authorized to negotiate a paid maintainer contract with Chaitanya Kumar CH for up to $9360 USD at $13 USD/hr over six months, and would be acting as supervisor, operating under the terms of RFC 9 (GDAL Paid Maintainer Guidelines). --- Folks, Chaitanya's current paid maintainer contract ended at the end of December, and he has invoiced for the hours worked. Both he and I are interested in his continuing in the role, which has been helpful in resolving a number of issues and moving the project forward. I have not done a detailed analysis of our financial position as a project, but I am confident we can cover the above amount for the first half of this year. It may be that Chaitanya will not be able to work the full number of hours proposed (1560), as he has activities with OSGeo India that he is also pursuing, so the above really establishes an upper bound. In the coming weeks I hope to review our income from sponsorship renewals and our expenses, to see what our financial position is. Depending on how that goes, we might consider looking for another paid maintainer in addition to Chaitanya, but I'll leave that till after the financial review. Best regards,

---------------------------------------+--------------------------------------
I set the clouds in motion - turn up   | Frank Warmerdam, warmer...@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush    | Geospatial Programmer for Rent
Re: [gdal-dev] Memory use in GDALDriver::CreateCopy()
Greg, You've probably missed that the issue raised by Ozy was with NITF, not with GeoTIFF.

As a practical matter, I do not see this restriction in GDAL. On Thu 21 Sep 2006, I created with gdal_merge.py a 3 GB .tif having 18,400 columns by 52,800 rows by RGB. On Thu 11 Dec 2009, gdal_translate processed a 150 GB untiled .tif to a tiled .tif with 260,000 columns by 195,000 rows. Greg

On Jan 12, 2010, at 6:38 PM, Even Rouault wrote: I'm a bit surprised that you even managed to read a 40Kx100K large NITF file organized as scanlines. There was a limit until very recently that prevented reading blocks where one dimension was bigger than a certain threshold. This was fixed recently in trunk (see ticket http://trac.osgeo.org/gdal/ticket/3263) and branches/1.6, but it has not yet appeared in an officially released version. So which GDAL version are you using?
Re: [gdal-dev] Memory use in GDALDriver::CreateCopy()
Ozy, The interesting info is that your input image is JPEG2000 compressed. This explains why you were able to read a scanline oriented NITF with such a large block width. My guess would be that the leak is in the JPEG2000 driver in question, so this may be more a problem on the reading part than on the writing part. You can check that by running: gdalinfo -checksum NITF_IM:0:input.ntf If you see the memory increasing again and again, there's definitely a problem. In case you have GDAL configured with several JPEG2000 drivers, you'll have to find which one is used: JP2KAK (Kakadu based), JP2ECW (ECW SDK based), JPEG2000 (JasPer based, but I doubt you're using it with such a big dataset), or JP2MRSID. Normally, they are selected in the order I've described (JP2KAK first, etc.). As you're on Linux, it might be interesting for you to run valgrind to see if it reports leaks. As it might be very slow on such a big dataset, you could try translating just a smaller window of your input dataset, like:

valgrind --leak-check=full gdal_translate NITF_IM:0:input.ntf output.tif -srcwin 0 0 37504 128

I've selected TIF as the output format, as it shouldn't matter if you confirm that the problem is in the reading part. As far as the window size is concerned, it's difficult to guess which value will show the leak. Filing a ticket with your findings on GDAL Trac might be appropriate. It might be good to try with GDAL trunk first though, in case the leak has been fixed since 1.6.2. The beta2 source zip is to be found here: http://download.osgeo.org/gdal/gdal-1.7.0b2.tar.gz Best regards, Even

ozy sjahputera a écrit : Hi Even, yes, I tried: gdal_translate -of NITF -co ICORDS=G -co BLOCKXSIZE=128 -co BLOCKYSIZE=128 NITF_IM:0:input.ntf output.ntf I monitored the memory use using top and it was steadily increasing till it reached 98.4% (I have 8GB of RAM and 140 GB of local disk for swap etc.) before the node died (not just the program; the whole system just stopped responding). My GDAL version is 1.6.2. gdalinfo on this image shows a raster size of (37504, 98772) and Block=37504x1. The image is compressed using the JPEG2000 option and contains two subdatasets (data and cloud data ~ I used only the data for the gdal_translate test). Band info from gdalinfo: Band 1 Block=37504x1 Type=UInt16, ColorInterp=Gray Ozy

On Tue, Jan 12, 2010 at 5:38 PM, Even Rouault even.roua...@mines-paris.org wrote: Ozy, Did you try with gdal_translate -of NITF src.tif output.tif -co BLOCKSIZE=128 ? Does it give similar results? I'm a bit surprised that you even managed to read a 40Kx100K large NITF file organized as scanlines. There was a limit until very recently that prevented reading blocks where one dimension was bigger than a certain threshold. This was fixed recently in trunk (see ticket http://trac.osgeo.org/gdal/ticket/3263) and branches/1.6, but it has not yet appeared in an officially released version. So which GDAL version are you using? Does the output of gdalinfo on your scanline oriented input NITF give something like: Band 1 Block=4x1 Type=Byte, ColorInterp=Gray Is your input NITF compressed or uncompressed?
Anyway, with latest trunk, I've simulated creating a similarly large NITF image with the following python snippet:

import gdal
ds = gdal.GetDriverByName('NITF').Create('scanline.ntf', 4, 10)
ds = None

and then creating the tiled NITF: gdal_translate -of NITF scanline.ntf tiled.ntf -co BLOCKSIZE=128 The memory consumption is very reasonable (less than 50 MB: the default block cache size of 40 MB + temporary buffers), so I'm not clear why you would have a problem of increasing memory use.

ozy sjahputera a écrit : I was trying to make a copy of a very large NITF image (about 40Kx100K pixels) using GDALDriver::CreateCopy(). The new file was set to have a different block size (input was a scanline image; output is to have a 128x128 block size). The program keeps getting killed by the system (Linux). I monitored the memory use of the program as it was executing CreateCopy, and the memory use was steadily increasing as the progress indicator from CreateCopy was moving forward. Why does CreateCopy() use so much memory? I have not perused the source code of CreateCopy() yet, but I am guessing it employs RasterIO() to perform the read/write? I tried different sizes for the GDAL cache: 64MB, 256MB, 512MB, 1GB, and 2GB. The program got killed with all these cache sizes. In fact, my Linux box became unresponsive when I set GDALSetCacheMax() to 64MB. Thank you. Ozy
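The gdalinfo -checksum probe Even suggests can also be scripted, which makes it easy to watch process memory between bands while the read proceeds; a hedged sketch using the same dataset name as above:

from osgeo import gdal

ds = gdal.Open('NITF_IM:0:input.ntf')
for i in range(1, ds.RasterCount + 1):
    # Checksum() forces a full read of the band through the driver.
    print('band %d checksum %d' % (i, ds.GetRasterBand(i).Checksum()))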
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jan Hartmann wrote: On 13-1-2010 2:33, Mateusz Loskot wrote: OGR does not provide any spatial indexing layer common to various vector datasets. For many simple formats it performs the brute-force selection.

Just curious, would it make sense / be possible to implement indexing in OGR, something like a generalized version of MapServer's shptree, the quadtree-based spatial index for shapefiles?

This implementation of the index comes from Shapelib, made by Frank. The very same bits of Shapelib are used in MapServer and OGR, namely the .qix spatial index file support. So, it's already there, but for Shapefiles only. Back to the question, I'm personally sceptical. Recalling the example of processing two layers, one from a DBMS and one from a file-based data source, how would it be supposed to work? ...a common .qix file generated for the DBMS data source? In my opinion, this kind of functionality is out of scope for OGR. I see OGR as a data provider. OGR is basically a translation library that reads from one data source and writes to another data source, providing a reasonably limited set of features to process data during translation - a common denominator for popular vector spatial data formats. IMHO, it's a misunderstanding to consider OGR a fully featured data model and I/O engine to read, write, process and analyse spatial vector data, especially if performance is a critical factor. IMHO, there are too many compromises in OGR. Best regards, -- Mateusz Loskot, http://mateusz.loskot.net Charter Member of OSGeo, http://osgeo.org
Re: [gdal-dev] Open source vector geoprocessing libraries?
Jason Roberts wrote: Mateusz, Thank you very much for your insight. I have a few more questions I'm hoping you could answer.

An alternative is to try to divide the tasks: 1. Query features from the data source using the spatial index capability of the data source. 2. Having only the subject features selected, apply geometric processing.

That sounds like a reasonable approach. Considering just the simpler scenarios, such as the one I mentioned, is it possible to implement it efficiently with OGR compiled with GEOS?

Should be, but the OGRGeometry -> geos::Geometry translation may be an overhead.

I believe OGR can pass SQL directly through to the data source driver, allowing the caller to submit SQL containing spatial operators. In principle, one could submit a spatial query to PostGIS or SpatiaLite and efficiently get back the features (including geometry) that could possibly intersect a bounding box. Then one could use the GEOS functions on OGRGeometry to do the actual intersecting. Is that what you were suggesting?

Yes, that's the concept.

Of course, it may be that PostGIS or SpatiaLite can handle both steps 1 and 2 in a single query. If so, would it be best to do it that way?

It's usually a good idea to let the DBMS engine do as much as possible, so that looks like a good idea to me.

It appears that the OGR shapefile driver supports a spatial indexing scheme (.qix file) that is respected by OGRLayer::SetSpatialFilter. The documentation says that this test "may be inaccurately implemented, but it is guaranteed that all features whose envelope (as returned by OGRGeometry::getEnvelope()) overlaps the envelope of the spatial filter will be returned." Therefore, it appears that the shapefile driver can implement step 1 but not step 2. Is that correct?

Yes.

The problem with OGR and GEOS is the cost of translation from OGR geometry to GEOS geometry. It can be a bottleneck. Is it correct that this cost would only be incurred when you call OGR functions implemented by GEOS, such as OGRGeometry::Intersects, OGRGeometry::Disjoint, etc.?

Yes. Namely, here is where the potential cost takes place: http://trac.osgeo.org/gdal/browser/trunk/gdal/ogr/ogrgeometry.cpp#L333

There are plenty of combinations, and my point is that if performance (not only in terms of speed, but of any resource) is critical, it would be extremely difficult to provide an efficient implementation of such features in OGR with a guaranteed or even determinable degree of complexity. Without these guarantees, I see little use for such a solution.

Yes, I see what you mean. But I suggest to the open source community that there is still value in implementing such features, either as part of OGR or another library, even if optimal performance cannot be guaranteed in all scenarios.

Perhaps you'll find these inspiring: http://trac.osgeo.org/qgis/browser/trunk/qgis/src/analysis/vector Look at the Java camp too. Best regards, -- Mateusz Loskot, http://mateusz.loskot.net Charter Member of OSGeo, http://osgeo.org
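A cheap way to keep that translation cost down is to reject non-overlapping pairs with a pure-OGR envelope test before ever calling into GEOS; a hedged sketch (the helper names are made up):

from osgeo import ogr

def envelopes_overlap(g1, g2):
    # GetEnvelope() returns (minX, maxX, minY, maxY) without touching GEOS.
    minx1, maxx1, miny1, maxy1 = g1.GetEnvelope()
    minx2, maxx2, miny2, maxy2 = g2.GetEnvelope()
    return not (maxx1 < minx2 or maxx2 < minx1 or
                maxy1 < miny2 or maxy2 < miny1)

def exact_intersects(g1, g2):
    # Only the Intersects() call pays the OGR -> GEOS conversion.
    return envelopes_overlap(g1, g2) and g1.Intersects(g2)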
Re: [gdal-dev] Memory use in GDALDriver::CreateCopy()
Even, We use the JP2ECW driver. I did the valgrind test and did not see any reported leak. Here is some of the output from valgrind:

==11469== Invalid free() / delete / delete[]
==11469==    at 0x4CE: free (in /usr/lib64/valgrind/amd64-linux/vgpreload_memcheck.so)
==11469==    by 0x95D1CDA: (within /lib64/libc-2.9.so)
==11469==    by 0x95D1879: (within /lib64/libc-2.9.so)
==11469==    by 0x4A1D60C: _vgnU_freeres (in /usr/lib64/valgrind/amd64-linux/vgpreload_core.so)
==11469==    by 0x950AB98: exit (in /lib64/libc-2.9.so)
==11469==    by 0x94F55EA: (below main) (in /lib64/libc-2.9.so)
==11469== Address 0x40366f0 is not stack'd, malloc'd or (recently) free'd
==11469==
==11469== ERROR SUMMARY: 13177 errors from 14 contexts (suppressed: 0 from 0)
==11469== malloc/free: in use at exit: 376 bytes in 9 blocks.
==11469== malloc/free: 8,856,910 allocs, 8,856,902 frees, 5,762,693,361 bytes allocated.
==11469== For counts of detected errors, rerun with: -v
==11469== Use --track-origins=yes to see where uninitialised values come from
==11469== searching for pointers to 9 not-freed blocks.
==11469== checked 1,934,448 bytes.
==11469==
==11469== LEAK SUMMARY:
==11469==    definitely lost: 0 bytes in 0 blocks.
==11469==    possibly lost: 0 bytes in 0 blocks.
==11469==    still reachable: 376 bytes in 9 blocks.
==11469==    suppressed: 0 bytes in 0 blocks.
==11469== Reachable blocks (those to which a pointer was found) are not shown.

I will check GDAL trunk, but we are looking forward to an upgrade to 1.7. For now, I will try to find a scanline, uncompressed NITF image and perform the same gdal_translate operation on it. If the memory use does not climb when operating on an uncompressed image, then we can say with more certainty that the problem lies with the JPEG2000 drivers. I'll let you know. Thanks. Ozy

On Wed, Jan 13, 2010 at 1:46 PM, Even Rouault even.roua...@mines-paris.org wrote: Ozy, The interesting info is that your input image is JPEG2000 compressed. This explains why you were able to read a scanline oriented NITF with such a large block width. My guess would be that the leak is in the JPEG2000 driver in question, so this may be more a problem on the reading part than on the writing part. You can check that by running: gdalinfo -checksum NITF_IM:0:input.ntf If you see the memory increasing again and again, there's definitely a problem. In case you have GDAL configured with several JPEG2000 drivers, you'll have to find which one is used: JP2KAK (Kakadu based), JP2ECW (ECW SDK based), JPEG2000 (JasPer based, but I doubt you're using it with such a big dataset), or JP2MRSID. Normally, they are selected in the order I've described (JP2KAK first, etc.). As you're on Linux, it might be interesting for you to run valgrind to see if it reports leaks. As it might be very slow on such a big dataset, you could try translating just a smaller window of your input dataset, like: valgrind --leak-check=full gdal_translate NITF_IM:0:input.ntf output.tif -srcwin 0 0 37504 128 I've selected TIF as the output format, as it shouldn't matter if you confirm that the problem is in the reading part. As far as the window size is concerned, it's difficult to guess which value will show the leak. Filing a ticket with your findings on GDAL Trac might be appropriate. It might be good to try with GDAL trunk first though, in case the leak has been fixed since 1.6.2.
The beta2 source zip is to be found here: http://download.osgeo.org/gdal/gdal-1.7.0b2.tar.gz Best regards, Even

ozy sjahputera a écrit : Hi Even, yes, I tried: gdal_translate -of NITF -co ICORDS=G -co BLOCKXSIZE=128 -co BLOCKYSIZE=128 NITF_IM:0:input.ntf output.ntf I monitored the memory use using top and it was steadily increasing till it reached 98.4% (I have 8GB of RAM and 140 GB of local disk for swap etc.) before the node died (not just the program; the whole system just stopped responding). My GDAL version is 1.6.2. gdalinfo on this image shows a raster size of (37504, 98772) and Block=37504x1. The image is compressed using the JPEG2000 option and contains two subdatasets (data and cloud data ~ I used only the data for the gdal_translate test). Band info from gdalinfo: Band 1 Block=37504x1 Type=UInt16, ColorInterp=Gray Ozy

On Tue, Jan 12, 2010 at 5:38 PM, Even Rouault even.roua...@mines-paris.org wrote: Ozy, Did you try with gdal_translate -of NITF src.tif output.tif -co BLOCKSIZE=128 ? Does it give similar results? I'm a bit surprised that you even managed to read a 40Kx100K large NITF file organized as scanlines. There was a limit until very recently that prevented reading blocks where one dimension was bigger than a certain threshold. This was fixed recently in trunk (see ticket http://trac.osgeo.org/gdal/ticket/3263) and branches/1.6, but it has not yet been released to an officially
Re: [Gdal-dev] When GDAL 1.7.0 branch?
Mateusz Loskot wrote: Frank Warmerdam wrote: Mateusz Loskot wrote: Hi, when will the upcoming 1.7.0 get its own branch in SVN?

Mateusz, my normal practice is to produce a 1.7 branch at the point the first RC is prepared.

Frank, great, thanks! One more thing if I may: the wiki roadmap says RC1 was planned for Dec 15, 2009. I understand the schedule has changed. Would you have a new date for RC1? Best regards, -- Mateusz Loskot http://mateusz.loskot.net
[gdal-dev] UTF8/Wide chars in path
Dear all, I've been looking around the GDAL codebase and I can't see anything that would deal with wide/multibyte characters in file names/paths on Windows. Would you please confirm that on Windows GDAL will only be able to open files with pure ASCII names, not names in international character sets? -M
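A quick empirical probe for the question above is simply to try opening a non-ASCII name through the bindings. This is a hedged sketch: the file name is hypothetical, and the result will depend on the OS code page and file system encoding rather than on GDAL alone:

# -*- coding: utf-8 -*-
from osgeo import gdal

ds = gdal.Open('b\xc3\xa4rensee.tif')  # "bärensee.tif" as UTF-8 bytes
print('opened' if ds is not None else 'failed to open')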