Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Peter J Halls

Jason,

   are you constrained to retaining your data in an ArcGIS compatible format? 
If so and if you do not have ArcSDE, then what follows may not be much help.


Otherwise, I think it likely that you will find using a DBMS as your data 
repository advantageous for many reasons.  Apart from the built-in indexing and 
index-based operations, it is *very* much easier to share data between users, 
retaining a single copy with all users having effective access.  Until the File 
Geodatabase format is published (later this year?) and someone has the effort to 
build an OGR interface, the DBMS route is probably the best route to 
compatibility.  We happen to be a corporate Oracle site, but Postgres is pretty 
similar.  Postgres is supported by ESRI with ArcSDE, so it is possible to retain 
ArcGIS compatibility this way.


Many years ago, I had a Simula class for performing many of these basic spatial 
operations, however now my data is all in Oracle: I am able to use the Oracle 
functions and no longer have to worry about building and rebuilding indexes, 
etc. - other than USER_SDO_GEOM_METADATA which, unfortunately, OGR only writes 
to at table creation and does not update.  Frankly, life (and maintenance) is 
much easier now and, certainly with Oracle, I think there have been performance 
gains.


Just my ha'pence-worth.

Peter

Mateusz Loskot wrote:

Jason Roberts wrote:

Mateusz,

I'm not an expert in this area, but I think that big performance 
gains can be obtained by using a spatial index.


Yes, likely true.

For example, consider a situation where you want to clip out a study 
region from the full resolution GSHHS shoreline database, a polygon 
layer. The shoreline polygons have very large, complicated 
geometries. It would be expensive to loop over every polygon, loading
 its full geometry and calling GEOS. Instead, you would use the 
spatial index to isolate the polygons that are likely to overlap with

 the study region, then loop over just those ones.


GEOS, like JTS, provides support for various spatial indexes.
It is possible to index data and optimise it in this manner as you
mention. In fact, GEOS uses indexes internally in various operations.
The problem is that such an index is not persistent and not serialised
anywhere, so it exists in memory only. In fact, there are many more
problems than this one.

BTW, PostGIS is an index serialisation.

OGR does not provide any spatial indexing layer common to various
vector datasets. For many simple formats it performs the brute-force
selection.

An alternative is to divide the task:
1. Query features from the data source using its spatial index capability.
2. With only the subject features selected, apply the geometric processing.

I did it that way, actually.
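
For illustration, a rough sketch of that pattern through the OGR Python
bindings might look like the following (the file name, coordinates and
geometry are invented, and the Intersects/Intersection calls assume OGR
was built with GEOS):

from osgeo import ogr

ds = ogr.Open("shoreline.shp")                 # hypothetical shapefile
layer = ds.GetLayer(0)

# Step 1: let the data source's spatial index narrow down the candidates.
layer.SetSpatialFilterRect(-80.0, 30.0, -70.0, 40.0)

# Step 2: apply the exact, GEOS-backed work to the candidates only.
region = ogr.CreateGeometryFromWkt(
    "POLYGON((-80 30,-70 30,-70 40,-80 40,-80 30))")
feat = layer.GetNextFeature()
while feat is not None:
    geom = feat.GetGeometryRef()
    if geom.Intersects(region):
        clipped = geom.Intersection(region)
        # ... write 'clipped' to an output layer ...
    feat = layer.GetNextFeature()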

If OGR takes advantage of spatial indexes internally (e.g. if the 
data source drivers can tell the core about these indexes, and the 
core can use them when OGRLayer::SetSpatialFilter is called), then 
many scenarios could be efficiently implemented by just OGR and GEOS 
alone.


The problem with OGR and GEOS is the cost of translation from an OGR geometry
to a GEOS geometry. It can be a bottleneck.

However, if such processing functionality were considered for building
into OGR, that would make sense, but I still see limitations:

Let's brainstorm a bit and assume OGR implements this operation:

OGRLayer OGR::SymDifference(OGRLayer layer1, OGRLayer layer2);

Depending on the data source, OGR could exploit its capabilities.
If both layers sit in the same PostGIS (or other spatial)
database, OGR just delegates the processing to PostGIS,
where ST_SymDifference is executed, and OGR only grabs the
results and generates an OGRLayer.

What if layer1 is a Shapefile and layer2 is an Oracle table?
Let's assume the Shapefile has a .qix file with a spatial index
and Oracle has its own index. What does OGR do?

Load the .qix into memory, then grab layer2 and decide which features to
select from layer1?
Load the whole Shapefile into memory and use the Oracle index to select
features from layer2 masked by layer1?
How does it calculate the cost of which one to transfer in which direction, etc.?

Certainly, it depends on the number of elements, the algorithm used,
the direction in which the algorithm is applied (which layer is subject, which is object),
and many more factors.

There are plenty of combinations, and my point is that if performance (not
only in terms of speed, but of any resource) is critical, it would be
extremely difficult to provide an efficient implementation of such
features in OGR with a guaranteed or even determinable degree of
complexity. Without those guarantees, I see little use for
such a solution.

Given that, depending on your needs, write a specialised application using
available tools like OGR and GEOS, optimised for the specifics of your
datasets, type of processing, system requirements, etc.

If not, then your suggestion may be as fast as any other. For 
example, the idea of loading the features into PostGIS or SpatiaLite
 will require loading all of the full 

[gdal-dev] Re: Memory use in GDALDriver::CreateCopy()

2010-01-13 Thread Jukka Rahkonen
Greg Coats gregcoats at mac.com writes:

 I find that with GDAL version 1.6.3, released 2009/11/19, gdal_translate fully
supports reading and writing a 150 GB GeoTiff image of 260,000 columns by 195,000
rows by RGB. Greg

Hi,

The problem is not the image size itself.  It may be related to, as
mentioned earlier, the image being organised by scanlines and having blocks wider
than the supported limit.  Your image is tiled
   Tile Width: 512 Tile Length: 512
while the file in question has 4 pixel wide blocks:
 Band 1 Block=4x1 Type=Byte, ColorInterp=Gray

-Jukka Rahkonen-




[gdal-dev] NITF JPEG2000 compression and Kakadu

2010-01-13 Thread Martin Chapman
Frank,

 

In the file NITFDatasetCreate.cpp, in the function NITFDatasetCreate(), if the
compression option is set to C8 (JPEG2000), it looks like you:

 

1.  get a handle to an installed J2K driver if available.
2.  test for metadata creation capability.
3.  create the nitf file.
4.  open a new handle to the nitf file on disk.
5.  set up a j2k subfile option based on the new nitf file segment
offset.
6.  call create on the j2k driver with the j2k_subfile option.
7.  return an open handle to the new nitf file.

 

It seems to me that I could hack my version of GDAL to include support for
doing this with my copy of Kakadu with the exception that I would have to
first create a VRT dataset of my output J2K file and then use CreateCopy()
on the Kakadu driver instead of Create().

 

Do you think I am missing something here and that it is more difficult than
that?  Does the Kakadu library lack some feature I would need to do
this?

 

If it can be done, would my approach of using a VRT dataset work?

 

I only want to create single dataset output.

 

Best regards,

Martin.

 


RE: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Duarte Carreira
Jason,

Have you looked at GeoKettle [1]? And recently I found GearScape [2], which 
seemed very interesting to me. Though neither is based on python...

Duarte Carreira

[1] - http://sourceforge.net/projects/geokettle/
[2] - http://www.fergonco.es/gearscape/index.php

From: Emilio Mayorga [emiliomayo...@gmail.com]
Sent: Tuesday, 12 January 2010 18:25
To: Jason Roberts
Cc: gdal-dev
Subject: Re: [gdal-dev] Open source vector geoprocessing libraries?

Hi Jason,

This may not be quite what you have in mind, but check out the PySAL
(Open Source Python Library for Spatial Analytical Functions) project:
http://geodacenter.asu.edu/pysal

I've never used it, and have only looked at a recent presentation
(http://conference.scipy.org/static/wiki/rey_pysal.pdf). It's not
clear that it includes or even aims to include the traditional spatial
operators provided by GEOS. I also have no idea if it uses OGR for its
vector data access. But the developers have done some terrific work in
spatial analysis tools in the past.

BTW, I'd love to see your marine spatial ecology tools moved to an
open source, platform neutral code base!

Cheers,

-Emilio Mayorga
Applied Physics Laboratory
University of Washington
Box 355640
Seattle, WA 98105-6698  USA


On Mon, Jan 11, 2010 at 2:32 PM, Jason Roberts jason.robe...@duke.edu wrote:
 Dear geospatial software experts,



 By integrating with GEOS, OGR can perform various spatial operations on
 individual geometries, such as buffer, intersection, union, and so on. Is
 there a library that efficiently performs these kinds of operations on
 entire OGRLayers? For example, this library would have functions that would
 buffer all of the features in a layer, or intersect all of the features in
 one layer with all of those in another. Basically, I am looking for an open
 source technology that replicates the geoprocessing tools found in ArcGIS
 and other GIS packages. These tools traditionally operate on one or more
 layers as input and produce one or more layers as output.



 If such a library does not exist, does the OGR team envision that they might
 add such capabilities to OGR in the future? From software design and
 performance points of view, would it be appropriate to extend OGR to include
 functions for spatial operations on entire layers, or is this best left to
 other libraries? I can see rudimentary ways to implement such tools (e.g.
 for intersecting layers: loop over all features in both layers, calling
 OGRGeometry::Touches on all combinations, or something similar). But I am
 not a geometry expert and do not know if OGRLayer's cursor-based design is
 compatible with such capabilities; I do not know about spatial indexing, for
 example.



 I develop open source geoprocessing tools that help with spatial ecology
 problems. At the moment, my tools depend heavily on ArcGIS for these
 operations with vector layers. I would like to remove this dependency, and,
 if possible, develop a toolbox that exposes the same ecology tools to
 several GIS packages. Many GIS packages, such as ArcGIS, QGIS, MapWindow,
 and OpenJump, support plugin extensions. I am wondering how
 difficult it would be to develop a package of tools that does not depend on
 a specific GIS package but exposes them to several packages via the
 package-specific plugin mechanisms. For this to work, I'd have to find a
 library that can do the kind of geoprocessing with layers that ArcGIS can
 do, or write my own. Writing it myself sounds daunting, and I am hoping that
 there are existing projects to draw from.



 Thank you very much for any comments you can provide.



 Jason









Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Jan Hartmann



On 13-1-2010 2:33, Mateusz Loskot wrote:


OGR does not provide any spatial indexing layer common to various
vector datasets. For many simple formats it performs the brute-force
selection.
   

Just curious, would it make sense / be possible to implement indexing in 
OGR, something like a
generalized version of MapServer's shptree, the quadtree-based spatial 
index for shapefiles?


http://mapserver.org/utilities/shptree.html

Jan


Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Ari Jolma

Jan Hartmann wrote:



On 13-1-2010 2:33, Mateusz Loskot wrote:


OGR does not provide any spatial indexing layer common to various
vector datasets. For many simple formats it performs the brute-force
selection.
  
Just curious, would it make sense / be possible to implement indexing 
in OGR, something like a
generalized version of Mapserver's shptree, the quadtree-based 
spatial index for a shapefiles?


http://mapserver.org/utilities/shptree.html


It could make sense to have an in-memory index for in-memory geometries. 
Perhaps use the GiST library (1) (I don't know whether it can use in-memory 
indexes) for geometries in an OGRGeometryCollection or OGRMemLayer if 
it's available.


For other formats it might not make sense, because OGR is not responsible 
for the actual geometries. As has been said, one should use the PostGIS 
format, which has this functionality built in, for larger and more 
static datasets.


Just my quick thoughts.

Ari

(1) http://www.sai.msu.su/~megera/postgres/gist/



Jan


RE: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Jason Roberts
Mateusz,

Thank you very much for your insight. I have a few more questions I'm hoping
you could answer.

 Alternative is to try to divide the tasks:
 1. Query features from data source using spatial index capability of
 data source.
 2. Having only subject features selected, apply geometric processing.

That sounds like a reasonable approach. Considering just the simpler
scenarios, such as the one I mentioned, is it possible to implement
it efficiently with OGR compiled with GEOS? I believe OGR can pass through
SQL directly to the data source driver, allowing the caller to submit SQL
containing spatial operators. In principle, one could submit a spatial query
to PostGIS or SpatiaLite and efficiently get back the features (including
geometry) that could possibly intersect a bounding box. Then one could use
the GEOS functions on OGRGeometry to do the actual intersecting. Is that
what you were suggesting?
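
Concretely, I am imagining something along these lines (the connection
string, table name, and geometry column are invented; the SQL is handed
straight to PostGIS via ExecuteSQL, and the exact clipping is then done
with GEOS-backed OGR calls):

from osgeo import ogr

ds = ogr.Open("PG:dbname=mydb")            # hypothetical PostGIS connection
wkt = "POLYGON((-80 30,-70 30,-70 40,-80 40,-80 30))"

# Step 1: index-backed candidate selection inside PostGIS (&& is the
# bounding-box overlap operator).
candidates = ds.ExecuteSQL(
    "SELECT * FROM shoreline "
    "WHERE geom && ST_GeomFromText('%s', 4326)" % wkt)

# Step 2: exact intersection with GEOS through OGR.
region = ogr.CreateGeometryFromWkt(wkt)
feat = candidates.GetNextFeature()
while feat is not None:
    clipped = feat.GetGeometryRef().Intersection(region)
    feat = candidates.GetNextFeature()

ds.ReleaseResultSet(candidates)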

Of course, it may be that PostGIS or SpatiaLite can handle both steps 1 and
2 in a single query. If so, would it be best to do it that way?

It appears that the OGR shapefile driver supports a spatial indexing scheme
(.qix file) that is respected by OGRLayer::SetSpatialFilter. The
documentation says that "Currently this test is may be inaccurately
implemented, but it is guaranteed that all features who's envelope (as
returned by OGRGeometry::getEnvelope()) overlaps the envelope of the spatial
filter will be returned." Therefore, it appears that the shapefile driver
can implement step 1 but not step 2. Is that correct?
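
As an aside, if I understand the driver correctly, the .qix file can be
generated through OGR itself using the shapefile driver's special SQL
command. A rough sketch, with an invented file name:

from osgeo import ogr

ds = ogr.Open("shoreline.shp", 1)   # open for update
# Builds the .qix quadtree index; SetSpatialFilter() can then use it.
ds.ExecuteSQL("CREATE SPATIAL INDEX ON shoreline")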

 The problem with OGR and GEOS is cost of translation from OGR geometry
 to GEOS geometry. It can be a bottleneck.

Is it correct that this cost would only be incurred when you call OGR
functions implemented by GEOS, such as OGRGeometry::Intersects,
OGRGeometry::Disjoint, etc? 

 It's plenty of combinations and my point is that if performance (it's
 not only in terms of speed, but any resource) is critical, it would be
 extremely difficult to provide efficient  implementation of such
 features in OGR with guaranteed or even determinable degree of
 complexity. Without these guarantees, I see little of use of
 such solution.

Yes, I see what you mean. But I suggest to the open source community that
there is still value in implementing such features, either as part of OGR or
another library, even if optimal performance cannot be guaranteed in all
scenarios. The reason is that ArcGIS provides such generic tools (e.g.
intersect/union/symdiff layers, regardless of underlying storage). These
geoprocessing tools are considered the most basic capabilities of ArcGIS,
available in the cheapest versions of the software. IMHO, if the open source
community wants to win over a large number of ArcGIS users to open GIS
systems, it needs to provide parity with these basic
tools.

Thanks again,

Jason

-Original Message-
From: Mateusz Loskot [mailto:mate...@loskot.net] 
Sent: Tuesday, January 12, 2010 8:33 PM
To: Jason Roberts
Cc: 'gdal-dev'
Subject: Re: [gdal-dev] Open source vector geoprocessing libraries?

Jason Roberts wrote:
 Mateusz,
 
 I'm not an expert in this area, but I think that big performance 
 gains can be obtained by using a spatial index.

Yes, likely true.

 For example, consider a situation where you want to clip out a study 
 region from the full resolution GSHHS shoreline database, a polygon 
 layer. The shoreline polygons have very large, complicated 
 geometries. It would be expensive to loop over every polygon, loading
  its full geometry and calling GEOS. Instead, you would use the 
 spatial index to isolate the polygons that are likely to overlap with
  the study region, then loop over just those ones.

GEOS as JTS provides support of various spatial indexes.
It is possible to index data and optimise it in this manner as you
mention. In fact, GEOS uses index internally in various operations.
The problem is that such index is not persistent, not serialised
anywhere, so happens in memory only. In fact, there are much more
problems than this one.

BTW, PostGIS is an index serialisation.

OGR does not provide any spatial indexing layer common to various
vector datasets. For many simple formats it performs the brute-force
selection.

Alternative is to try to divide the tasks:
1. Query features from data source using spatial index capability of
data source.
2. Having only subject features selected, apply geometric processing.

I did it that way, actually.

 If OGR takes advantage of spatial indexes internally (e.g. if the 
 data source drivers can tell the core about these indexes, and the 
 core can use them when OGRLayer::SetSpatialFilter is called), then 
 many scenarios could be efficiently implemented by just OGR and GEOS 
 alone.

The problem with OGR and GEOS is cost of translation from OGR geometry
to GEOS geometry. It can be a bottleneck.

However, if such processing functionality would be considered as
built in to OGR, that would make sense, but I still 

Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Ari Jolma

Jan Hartmann wrote:


On 13-1-2010 15:49, Ari Jolma wrote:

Jan Hartmann wrote:



Just curious, would it make sense / be possible to implement 
indexing in OGR, something like a
generalized version of Mapserver's shptree, the quadtree-based 
spatial index for a shapefiles?


http://mapserver.org/utilities/shptree.html


It could make sense to have a in-memory index for in-memory 
geometries. Pehaps use GiST library(1) (I don't know whether it can 
use in-memory indexes) for geometries in an OGRGeometryCollection  or 
OGRMemLayer if it's available.


For other formats it might not make sense because OGR is not 
responsible for the actual geometries. As have been said, one should 
use PostGIS format, which has this functionality built-in, for larger 
and more static datasets.


Is that so? Reading the OGR API tutorial 
(http://www.gdal.org/ogr/ogr_apitut.html), I see that all geometries, 
from whatever input source, are represented internally via a generic 
OGRGeometry pointer, where OGRGeometry is a virtual base class for all real 
geometry classes (http://www.gdal.org/ogr/classOGRGeometry.html). Most 
of the GEOS functionality can be implemented on OGRGeometries, so in 
principle the same could be done with indexing libraries (GiST, 
b-tree, quadtree, etc.). Such indices should be written out to disk to 
be of any use at all, of course, like shptree does.




What I meant is that with formats other than the in-memory format, the 
features are stored on disk (possibly even on remote servers) and only 
available for indexing when retrieved. When they are retrieved, they are 
of course OGR objects and accessible through the generic OGR API. Maybe 
it's possible, but it would probably mean that the library would need to 
retrieve and go through all the features, and prepare the index and store 
it into some local(?) file. Thus I think that for those formats, it's 
up to the format itself to provide the indexing or not.


Ari



RE: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Jason Roberts
Hi Duarte,

Thanks for the suggestions.

I took a look at GeoKettle. Here are some relevant excerpts from a document:

GeoKettle is a ... powerful, metadata‐driven spatial ETL tool dedicated to the 
integration of different spatial data sources for building/updating geospatial 
data warehouses. At present, Oracle spatial, PostgreSQL/PostGIS and MySQL DBMS 
and the ESRI shapefiles are natively supported in read and write modes. Spatial 
Reference Systems management and coordinates transformations have been fully 
implemented. It is possible to access Geometry objects in JavaScript and define 
custom transformation steps (“Modified JavaScript Value” step). Topological 
predicates (Intersects, crosses, etc.) have all been implemented.

It looks interesting, but oriented to server applications. We are building a 
set of desktop GIS analysis tools. It would probably not be practical to try to 
embed GeoKettle in our application.

GearScape also looks interesting, with SQL-oriented geoprocessing, but it is 
more of an extensible GIS program than a geospatial library. Again, probably 
not practical to embed it in our app.

Best regards,

Jason

-Original Message-
From: Duarte Carreira [mailto:dcarre...@edia.pt] 
Sent: Wednesday, January 13, 2010 4:54 AM
To: Jason Roberts
Cc: gdal-dev
Subject: RE: [gdal-dev] Open source vector geoprocessing libraries?

Jason,

Have you looked at GeoKettle [1]? And recently I found GearScape [2], which 
seemed very interesting to me. Though neither is based on python...

Duarte Carreira

[1] - http://sourceforge.net/projects/geokettle/
[2] - http://www.fergonco.es/gearscape/index.php

From: Emilio Mayorga [emiliomayo...@gmail.com]
Sent: Tuesday, 12 January 2010 18:25
To: Jason Roberts
Cc: gdal-dev
Subject: Re: [gdal-dev] Open source vector geoprocessing libraries?

Hi Jason,

This may not be quite what you have in mind, but check out the PySAL
(Open Source Python Library for Spatial Analytical Functions) project:
http://geodacenter.asu.edu/pysal

I've never used it, and have only looked at a recent presentation
(http://conference.scipy.org/static/wiki/rey_pysal.pdf). It's not
clear that it includes or even aims to include the traditional spatial
operators provided by GEOS. I also have no idea if it uses OGR for its
vector data access. But the developers have done some terrific work in
spatial analysis tools in the past.

BTW, I'd love to see your marine spatial ecology tools moved to an
open source, platform neutral code base!

Cheers,

-Emilio Mayorga
Applied Physics Laboratory
University of Washington
Box 355640
Seattle, WA 98105-6698  USA


On Mon, Jan 11, 2010 at 2:32 PM, Jason Roberts jason.robe...@duke.edu wrote:
 Dear geospatial software experts,



 By integrating with GEOS, OGR can perform various spatial operations on
 individual geometries, such as buffer, intersection, union, and so on. Is
 there a library that efficiently performs these kinds of operations on
 entire OGRLayers? For example, this library would have functions that would
 buffer all of the features in a layer, or intersect all of the features in
 one layer with all of those in another. Basically, I am looking for an open
 source technology that replicates the geoprocessing tools found in ArcGIS
 and other GIS packages. These tools traditionally operate on one or more
 layers as input and produce one or more layers as output.



 If such a library does not exist, does the OGR team envision that they might
 add such capabilities to OGR in the future? From software design and
 performance points of view, would it be appropriate to extend OGR to include
 functions for spatial operations on entire layers, or is this best left to
 other libraries? I can see rudimentary ways to implement such tools (e.g.
 for intersecting layers: loop over all features in both layers, calling
 OGRGeometry::Touches on all combinations, or something similar). But I am
 not a geometry expert and do not know if OGRLayer's cursor-based design is
 compatible with such capabilities; I do not know about spatial indexing, for
 example.



 I develop open source geoprocessing tools that help with spatial ecology
 problems. At the moment, my tools depend on heavily on ArcGIS for these
 operations with vector layers. I would like to remove this dependency, and,
 if possible, develop a toolbox that exposes the same ecology tools to
 several GIS packages. Many GIS packages, such as ArcGIS, QGIS, MapWindow,
 and OpenJump, support plugin extensions. I am wondering whether how
 difficult it would be to develop a package of tools that does not depend on
 a specific GIS package but exposes them to several packages via the
 package-specific plugin mechanisms. For this to work, I'd have to find a
 library that can do the kind of geoprocessing with layers that ArcGIS can
 do, or write my own. Writing it myself sounds daunting and am hoping that
 there are existing projects to draw from.




Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Peter J Halls

Jason,

Jason Roberts wrote:

Peter,


are you constrained to retaining your data in an ArcGIS compatible format?



We are attempting to build tools that can work with data stored in a variety
of formats. Our current user community uses mostly shapefiles, ArcGIS
personal geodatabases, and ArcGIS file geodatabases. Many of them are
ecologists who do not have the interest or skills to deploy a real DBMS
system. Thus we are hoping to provide tools that can work without one. This
is one reason I was exploring how embeddable PostGIS and SpatiaLite might be
in the other fork of this thread.


I wonder how many users are aware that ESRI have announced the file geodatabase 
as replacing the (Access) personal geodatabase?  They have not, as yet, 
announced a cut-off date for this format, but its many limitations, a result of 
Access's capabilities, may make this sooner rather than later.




Until the File 
Geodatabase format is published (later this year?) and someone has the
effort to 
build an OGR interface, the DBMS route is probably the best route to 
compatibility.


It would be really great for that to happen, but I'm not holding my breath.
If it does get published, I would seriously contemplate building an OGR
driver.


ESRI announced that publication would be alongside the release of ArcGIS 9.4, at the 
EMEA User Conference in November 2008 (London).  They said that they see the 
file geodatabase replacing both the personal geodatabase and shapefiles.  I 
believe 9.4 is currently in beta test.




I have contemplated building an ArcObjects- or arcgisscripting-based driver.
This would at least allow people who have ArcGIS to use OGR to access any
ArcGIS layer, including those created by ArcGIS's tools for joining
arbitrary layers, etc. That would handle file geodatabases, as well as ALL
formats accessible from ArcGIS. If such a driver existed, then we could use
OGR as the base interface inside our application. But creating such a driver
would be a lot of work and have funky dependencies because it either needs
to use Windows COM (for ArcObjects) or Python (for arcgisscripting) to call
the ArcGIS APIs. I am certainly capable of implementing it but because most
of our code is in Python, it is probably easier for me to wrap OGR and
arcgisscripting behind a common abstraction, and then have our tools work
against that abstraction rather than OGR directly.


GDAL, including OGR, is actually embedded in ArcGIS: however I do not know quite 
what ESRI use it for.




At any rate, I'm sure it is nice being able to do all your work in a
spatially-enabled DBMS...


Also an attraction of PostGres, of course.

Best wishes,

Peter


Peter J Halls, GIS Advisor, University of York
Telephone: 01904 433806 Fax: 01904 433740
Snail mail: Computing Service, University of York, Heslington, York YO10 5DD
This message has the status of a private and personal communication



Re: [gdal-dev] Memory use in GDALDriver::CreateCopy()

2010-01-13 Thread Greg Coats
As a practical matter, I do not see this restriction in GDAL. On Thu 21 
Sep 2006, I created with gdal_merge.py a 3 GB .tif having 18,400 columns by 
52,800 rows by RGB. On Thu 11 Dec 2009, gdal_translate processed a 150 GB 
untiled .tif to a tiled .tif with 260,000 columns by 195,000 rows. Greg

On Jan 12, 2010, at 6:38 PM, Even Rouault wrote:

 I'm a bit surprised that you even managed to read a 40Kx100K large NITF file 
 organized as scanlines. There was a limit until very recently that prevented 
 to read blocks whose one dimension was bigger than . This was fixed 
 recently in trunk ( see ticket http://trac.osgeo.org/gdal/ticket/3263 ) and 
 branches/1.6, but it has not yet been released to an officially released 
 version. So which GDAL version are you using ?

Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Frank Warmerdam

Jan Hartmann wrote:
Is that so? Reading the OGR API tutorial 
(http://www.gdal.org/ogr/ogr_apitut.html), I see that all geometries, 
frowm whatever input source, are represented internally as a generic 
OGRGeometry pointer, which is a virtual base class for all real geometry 
classes (http://www.gdal.org/ogr/classOGRGeometry.html). Most of the 
GEOS functionality can be implemented on OGRGeometries, so in principle 
the same could be done with indexing libraries (GIST, b-tree, quadtree, 
etc). Such indices should be written out to disk to be of any use at 
all, of course, like shptree does.


Jan,

I have had trouble keeping up with this spirited discussion, but I wanted
to note that it is not intended that alternate implementations of geometries
would be derived from OGRGeometry.  There are many places, for instance, that
assume an OGRGeometry can be cast to OGRLineString if its type is
wkbLineString.

Best regards,
--
---+--
I set the clouds in motion - turn up   | Frank Warmerdam, warmer...@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush| Geospatial Programmer for Rent



[gdal-dev] Re: NITF JPEG2000 compression and Kakadu

2010-01-13 Thread Frank Warmerdam

Martin Chapman wrote:

Frank,

 

In the file NITFDatasetCreate.cpp in the function NITFDatasetCreate() if 
the compression option is set to C8 (JPEG2000) it looks like you:


   1. get a handle to an installed J2K driver if available.
   2. test for metadata creation capability.
   3. create the nitf file.
   4. open a new handle to the nitf file on disk.
   5. setup a j2k subfile option based on the new nitf file segment offset.
   6. call create on the j2k driver with the j2k_subfile option.
   7. return an open handle to the new nitf file.

It seems to me that I could hack my version of GDAL to include support 
for doing this with my copy of Kakadu with the exception that I would 
have to first create a VRT dataset of my output J2K file and then use 
CreateCopy() on the Kakadu driver instead of Create().


Do you think I am missing something here and that it is more difficult 
then that?  Does the Kakadu library not have some feature I would need 
to do this?


If it can be done, would my approach of using a VRT dataset work?

I only want to create single dataset output.


Martin,

I have skimmed NITFCreateCopy() in nitfdataset.cpp, and I was somewhat
surprised to find it does not already support using the JP2KAK (kakadu)
driver to write jpeg2000 encoded nitf files.

I *think* it could be trivially extended to support JP2KAK with a case
similar to the one for JasPer with the filename encoded using the
/vsisubfile/ mechanism.
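
For anyone unfamiliar with it, /vsisubfile/ exposes a byte range of a file as
a GDAL-openable path of the form /vsisubfile/offset[_size],filename, so a
JPEG2000 driver can be pointed directly at the image segment inside the NITF
container. Very roughly, and with a made-up offset and file names, the idea is:

from osgeo import gdal

src = gdal.Open("input.tif")
j2k = gdal.GetDriverByName("JP2KAK")
if j2k is not None:
    # Write the JPEG2000 codestream starting at the image segment offset
    # inside the already-created NITF container file.
    j2k.CreateCopy("/vsisubfile/4096,output.ntf", src, 0, [])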

I must confess I'm not clear on why you are bringing VRT files or the
Create() method into the discussion.  Hmm, rereading your email...

I assume you are referring to the function NITFDatasetCreate() in
nitfdataset.cpp (not NITFDatasetCreate.cpp, which I don't think exists).
I see it utilizes some special hacks taking advantage of the fact that
the JP2ECW driver supports create+write as long as the application writes
in a very specific top-down order.  I had forgotten about this hack,
which was implemented (somewhat against my better judgement) for
ERMapper.

Do you have a compelling need to support imperative creation via
Create() instead of CreateCopy()?  Feel free to give me a call at
+1 613 754-2041 if that would expedite this discussion.

Best regards,
--
---+--
I set the clouds in motion - turn up   | Frank Warmerdam, warmer...@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush| Geospatial Programmer for Rent



Re: [gdal-dev] Memory use in GDALDriver::CreateCopy()

2010-01-13 Thread ozy sjahputera
Hi Even,

yes, I tried:
gdal_translate -of NITF -co ICORDS=G -co BLOCKXSIZE=128 -co
BLOCKYSIZE=128  NITF_IM:0:input.ntf output.ntf

I monitored the memory use using top and it was steadily increasing till it
reached 98.4% (I have 8GB of RAM and 140 GB of local disk for swap etc.)
before the node died (not just the program, but the whole system just
stopped responding).

My GDAL version is 1.6.2.

gdalinfo on this image shows the raster size of (37504, 98772) and
Block=37504x1.
The image is compressed using JPEG2000 option and contains two subdatasets
(data and cloud data ~ I used only the data for gdal_translate test).

Band info from gdalinfo:
Band 1 Block=37504x1 Type=UInt16, ColorInterp=Gray

Ozy

On Tue, Jan 12, 2010 at 5:38 PM, Even Rouault
even.roua...@mines-paris.orgwrote:

 Ozy,

 Did you try with gdal_translate -of NITF src.tif output.tif -co
 BLOCKSIZE=128 ? Does it give similar results ?

 I'm a bit surprised that you even managed to read a 40Kx100K large NITF
 file organized as scanlines. There was a limit until very recently that
 prevented to read blocks whose one dimension was bigger than . This
 was fixed recently in trunk ( see ticket
 http://trac.osgeo.org/gdal/ticket/3263 ) and branches/1.6, but it has
 not yet been released to an officially released version. So which GDAL
 version are you using ?

 Does the output of gdalinfo on your scanline oriented input NITF gives
 something like :
 Band 1 Block=4x1 Type=Byte, ColorInterp=Gray

 Is your input NITF compressed or uncompressed ?

 Anyway, with latest trunk, I've simulated creating a similarly large
 NITF image with the following python snippet :

 import gdal
 ds = gdal.GetDriverByName('NITF').Create('scanline.ntf', 4, 10)
 ds = None

 and then creating the tiled NITF :

 gdal_translate -of NITF scanline.ntf tiled.ntf -co BLOCKSIZE=128

 The memory consumption is very reasonnable (less than 50 MB : the
 default block cache size of 40 MB + temporary buffers ), so I'm not
 clear why you would have a problem of increasing memory use.

 ozy sjahputera a écrit :
  I was trying to make a copy of a very large NITF image (about 40Kx100K
  pixels) using GDALDriver::CreateCopy(). The new file was set to have
  different block-size (input was a scanline image, output is to have a
  128x128 blocksize). The program keeps getting killed by the system
  (Linux). I monitor the memory use of the program as it was executing
  CreateCopy and the memory use was steadily increasing as the progress
  indicator from CreateCopy was moving forward.
 
  Why does CreateCopy() use so much memory? I have not perused the
  source code of CreateCopy() yet, but I am guessing it employs
  RasterIO() to perform the read/write?
 
  I was trying different sizes for GDAL  cache from 64MB, 256MB, 512MB,
  1GB, and 2GB. The program got killed in all these cache sizes. In
  fact, my Linux box became unresponsive when I set GDALSetCacheMax() to
  64MB.
 
  Thank you.
  Ozy
 
 
  
 

Re: [gdal-dev] Memory use in GDALDriver::CreateCopy()

2010-01-13 Thread ozy sjahputera
Update:

after more than 20 minutes of being non-responsive, the OS finally regained
functionality and promptly killed gdal_translate about 80% of the way through the
process.


On Wed, Jan 13, 2010 at 11:14 AM, ozy sjahputera sjahpute...@gmail.comwrote:

 Hi Even,

 yes, I tried:
 gdal_translate -of NITF -co ICORDS=G -co BLOCKXSIZE=128 -co
 BLOCKYSIZE=128  NITF_IM:0:input.ntf output.ntf

 I monitored the memory use using top and it was steadily increasing till it
 reached 98.4% (I have 8GB of RAM and 140 GB of local disk for swap etc.)
 before the node died (not just the program, but the whole system just
 stopped responding).

 My GDAL version is 1.6.2.

 gdalinfo on this image shows the raster size of (37504, 98772) and
 Block=37504x1.
 The image is compressed using JPEG2000 option and contains two subdatasets
 (data and cloud data ~ I used only the data for gdal_translate test).

 Band info from gdalinfo:
 Band 1 Block=37504x1 Type=UInt16, ColorInterp=Gray

 Ozy


 On Tue, Jan 12, 2010 at 5:38 PM, Even Rouault 
 even.roua...@mines-paris.org wrote:

 Ozy,

 Did you try with gdal_translate -of NITF src.tif output.tif -co
 BLOCKSIZE=128 ? Does it give similar results ?

 I'm a bit surprised that you even managed to read a 40Kx100K large NITF
 file organized as scanlines. There was a limit until very recently that
 prevented to read blocks whose one dimension was bigger than . This
 was fixed recently in trunk ( see ticket
 http://trac.osgeo.org/gdal/ticket/3263 ) and branches/1.6, but it has
 not yet been released to an officially released version. So which GDAL
 version are you using ?

 Does the output of gdalinfo on your scanline oriented input NITF gives
 something like :
 Band 1 Block=4x1 Type=Byte, ColorInterp=Gray

 Is your input NITF compressed or uncompressed ?

 Anyway, with latest trunk, I've simulated creating a similarly large
 NITF image with the following python snippet :

 import gdal
 ds = gdal.GetDriverByName('NITF').Create('scanline.ntf', 4, 10)
 ds = None

 and then creating the tiled NITF :

 gdal_translate -of NITF scanline.ntf tiled.ntf -co BLOCKSIZE=128

 The memory consumption is very reasonnable (less than 50 MB : the
 default block cache size of 40 MB + temporary buffers ), so I'm not
 clear why you would have a problem of increasing memory use.

 ozy sjahputera a écrit :
  I was trying to make a copy of a very large NITF image (about 40Kx100K
  pixels) using GDALDriver::CreateCopy(). The new file was set to have
  different block-size (input was a scanline image, output is to have a
  128x128 blocksize). The program keeps getting killed by the system
  (Linux). I monitor the memory use of the program as it was executing
  CreateCopy and the memory use was steadily increasing as the progress
  indicator from CreateCopy was moving forward.
 
  Why does CreateCopy() use so much memory? I have not perused the
  source code of CreateCopy() yet, but I am guessing it employs
  RasterIO() to perform the read/write?
 
  I was trying different sizes for GDAL  cache from 64MB, 256MB, 512MB,
  1GB, and 2GB. The program got killed in all these cache sizes. In
  fact, my Linux box became unresponsive when I set GDALSetCacheMax() to
  64MB.
 
  Thank you.
  Ozy
 
 
  
 

RE: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Ragi Y. Burhum

 Date: Wed, 13 Jan 2010 10:27:43 -0500
 From: Jason Roberts jason.robe...@duke.edu
 Subject: RE: [gdal-dev] Open source vector geoprocessing libraries?
 To: 'Peter J Halls' p.ha...@york.ac.uk
 Cc: 'gdal-dev' gdal-dev@lists.osgeo.org

 Peter,

  are you constrained to retaining your data in an ArcGIS compatible
 format?


 We are attempting to build tools that can work with data stored in a
 variety
 of formats. Our current user community uses mostly shapefiles, ArcGIS
 personal geodatabases, and ArcGIS file geodatabases. Many of them are
 ecologists who do not have the interest or skills to deploy a real DBMS
 system. Thus we are hoping to provide tools that can work without one. This
 is one reason I was exploring how embeddable PostGIS and SpatiaLite might
 be
 in the other fork of this thread.

  Until the File
  Geodatabase format is published (later this year?) and someone has the
 effort to
  build an OGR interface, the DBMS route is probably the best route to
  compatibility.

 It would be really great for that to happen, but I'm not holding my breath.
 If it does get published, I would seriously contemplate building an OGR
 driver.

 I have contemplated building an ArcObjects- or arcgisscripting-based
 driver.
 This would at least allow people who have ArcGIS to use OGR to access any
 ArcGIS layer, including those created by ArcGIS's tools for joining
 arbitrary layers, etc. That would handle file geodatabases, as well as ALL
 formats accessible from ArcGIS. If such a driver existed, then we could use
 OGR as the base interface inside our application. But creating such a
 driver
 would be a lot of work and have funky dependencies because it either needs
 to use Windows COM (for ArcObjects) or Python (for arcgisscripting) to call
 the ArcGIS APIs. I am certainly capable of implementing it but because most
 of our code is in Python, it is probably easier for me to wrap OGR and
 arcgisscripting behind a common abstraction, and then have our tools work
 against that abstraction rather than OGR directly.


I find it very amusing you mention this right now.

Why?

I asked Frank if there was an ArcObjects-based OGR driver just this past
Thursday, and he said "not that I know of." What I wanted was, among other
things, to get data out of FileGDB to PostGIS in one shot and add some
custom behavior for a client of mine. So I spent the past three days looking
at OGR drivers and wrote an ArcObjects-based one. I got it working
yesterday.

- Right now I only instantiate 3 factories (Enterprise GDB aka ArcSDE,
AccessDB and FileGDB). This means it reads FileGDB just fine. If you want
more factories, the driver only has to be modified with one line to add any
other factories and everything else would just work.

- I only implemented the parts that I needed, so it is read-only (should be
straightforward to expand if need be).

- Although it can read other GeoDatabase abstractions (Topology, Geometric
Networks, Annotations, Cadastral Fabrics, etc.), currently I am explicitly
filtering for FeatureClasses and FeatureDatasets.

- It is an ATL / COM / C++ based one, so it will only compile on Windows. It
can be modified to use the cross-platform ArcEngine SDK, since all the COM
objects that I use are called the same and behave the same way... I just did
not have an ArcEngine SDK installer, so I could not test this.

Anyway, if you are interested in the source code, let me know. Perhaps we
can add it as an OGR driver contribution (what is the process for that,
anyway?). I may not respond quickly to e-mail, since the next 4 weeks
are pretty crazy for me.

- Ragi Burhum

[gdal-dev] Motion: Paid Maintainer Contract with Chaitanya

2010-01-13 Thread Frank Warmerdam

Motion: Frank Warmerdam is authorized to negotiate a paid maintainer
contract with Chaitanya Kumar CH for up to $9360 USD at $13USD/hr over
six months, and would be acting as supervisor, operating under the terms
of RFC 9 (GDAL Paid Maintainer Guidelines).

---

Folks,

Chaitanya's current paid maintainer contract ended at the end of December,
and he has invoiced for the hours worked.  Both he and I are interested in
his continuing in the role, which has been helpful in resolving a number
of issues and moving the project forward.

I have not done a detailed analysis of our financial position as a project,
but I am confident we can cover the above amount for the first half of this
year.  It may be that Chaitanya will not be able to work the full number of
hours proposed (1560) as he has activities with OSGeo India that he is also
pursuing, so the above really establishes an upper bound.

In the coming weeks I hope to review our income from sponsorship renewals,
and our expenses to see what our financial position is.  Depending on how
that goes we might consider looking for another paid maintainer in
addition to Chaitanya, but I'll leave that till after the financial review.

Best regards,
--
---+--
I set the clouds in motion - turn up   | Frank Warmerdam, warmer...@pobox.com
light and sound - activate the windows | http://pobox.com/~warmerdam
and watch the world go round - Rush| Geospatial Programmer for Rent



Re: [gdal-dev] Memory use in GDALDriver::CreateCopy()

2010-01-13 Thread Even Rouault

Greg,

You've probably missed that the issue raised by Ozy was with NITF, not 
with GeoTIFF.


As a practical matter, I do not see this  restriction in GDAL. On 
Thu 21 Sep 2006, I created with gdal_merge.py a 3 GB .tif having 
18,400 columns by 52,800 rows by RGB. On Thu 11 Dec 
2009, gdal_translate processed a 150 GB untiled .tif to a tiled .tif 
with 260,000 columns by 195,000 rows. Greg


On Jan 12, 2010, at 6:38 PM, Even Rouault wrote:

I'm a bit surprised that you even managed to read a 40Kx100K large 
NITF file organized as scanlines. There was a limit until very 
recently that prevented to read blocks whose one dimension was bigger 
than . This was fixed recently in trunk ( see 
ticket http://trac.osgeo.org/gdal/ticket/3263 ) and branches/1.6, but 
it has not yet been released to an officially released version. So 
which GDAL version are you using ?





Re: [gdal-dev] Memory use in GDALDriver::CreateCopy()

2010-01-13 Thread Even Rouault

Ozy,

The interesting info is that your input image is JPEG2000 compressed. 
This explains why you were able to read a scanline-oriented NITF with 
such a block width. My guess would be that the leak is in the JPEG2000 
driver in question, so this may be more a problem on the reading part 
than on the writing part. You can check that by running: gdalinfo 
-checksum NITF_IM:0:input.ntf. If you see the memory increasing again 
and again, there's definitely a problem. In case you have GDAL 
configured with several JPEG2000 drivers, you'll have to find which one 
is used: JP2KAK (Kakadu based), JP2ECW (ECW SDK based), JPEG2000 
(Jasper based, but I doubt you're using it with such a big dataset), 
JP2MRSID. Normally, they are selected in the order I've described 
(JP2KAK first, etc.). As you're on Linux, it might be interesting for 
you to run valgrind to see if it reports leaks. As it might be very slow on 
such a big dataset, you could try translating just a smaller window of 
your input dataset, like


valgrind --leak-check=full gdal_translate NITF_IM:0:input.ntf output.tif 
-srcwin 0 0 37504 128


I've selected TIF as the output format, as it shouldn't matter if you confirm 
that the problem is in the reading part. As far as the window size is 
concerned, it's difficult to guess which value will show the leak.


Filing a ticket with your findings on GDAL Trac might be appropriate.

It might be good to try with GDAL trunk first though, in case the leak 
has been fixed since 1.6.2. The beta2 source archive can be found 
here : http://download.osgeo.org/gdal/gdal-1.7.0b2.tar.gz


Best regards,

Even

ozy sjahputera a écrit :

Hi Even,

yes, I tried:
gdal_translate -of NITF -co ICORDS=G -co BLOCKXSIZE=128 -co 
BLOCKYSIZE=128  NITF_IM:0:input.ntf output.ntf


I monitored the memory use using top and it was steadily increasing 
till it reached 98.4% (I have 8GB of RAM and 140 GB of local disk for 
swap etc.) before the node died (not just the program, but the whole 
system just stopped responding).


My GDAL version is 1.6.2.

gdalinfo on this image shows the raster size of (37504, 98772) and 
Block=37504x1. 
The image is compressed using JPEG2000 option and contains two 
subdatasets (data and cloud data ~ I used only the data for 
gdal_translate test).


Band info from gdalinfo:
Band 1 Block=37504x1 Type=UInt16, ColorInterp=Gray

Ozy

On Tue, Jan 12, 2010 at 5:38 PM, Even Rouault 
even.roua...@mines-paris.org mailto:even.roua...@mines-paris.org 
wrote:


Ozy,

Did you try with gdal_translate -of NITF src.tif output.tif -co
BLOCKSIZE=128 ? Does it give similar results ?

I'm a bit surprised that you even managed to read a 40Kx100K large
NITF
file organized as scanlines. There was a limit until very recently
that
prevented to read blocks whose one dimension was bigger than .
This
was fixed recently in trunk ( see ticket
http://trac.osgeo.org/gdal/ticket/3263 ) and branches/1.6, but it has
not yet been released to an officially released version. So which GDAL
version are you using ?

Does the output of gdalinfo on your scanline oriented input NITF gives
something like :
Band 1 Block=4x1 Type=Byte, ColorInterp=Gray

Is your input NITF compressed or uncompressed ?

Anyway, with latest trunk, I've simulated creating a similarly large
NITF image with the following python snippet :

import gdal
ds = gdal.GetDriverByName('NITF').Create('scanline.ntf', 4,
10)
ds = None

and then creating the tiled NITF :

gdal_translate -of NITF scanline.ntf tiled.ntf -co BLOCKSIZE=128

The memory consumption is very reasonnable (less than 50 MB : the
default block cache size of 40 MB + temporary buffers ), so I'm not
clear why you would have a problem of increasing memory use.

ozy sjahputera a écrit :
 I was trying to make a copy of a very large NITF image (about
40Kx100K
 pixels) using GDALDriver::CreateCopy(). The new file was set to have
 different block-size (input was a scanline image, output is to
have a
 128x128 blocksize). The program keeps getting killed by the system
 (Linux). I monitor the memory use of the program as it was executing
 CreateCopy and the memory use was steadily increasing as the
progress
 indicator from CreateCopy was moving forward.

 Why does CreateCopy() use so much memory? I have not perused the
 source code of CreateCopy() yet, but I am guessing it employs
 RasterIO() to perform the read/write?

 I was trying different sizes for GDAL  cache from 64MB, 256MB,
512MB,
 1GB, and 2GB. The program got killed in all these cache sizes. In
 fact, my Linux box became unresponsive when I set
GDALSetCacheMax() to
 64MB.

 Thank you.
 Ozy





 

Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Mateusz Loskot
Jan Hartmann wrote:
 On 13-1-2010 2:33, Mateusz Loskot wrote:
 
 OGR does not provide any spatial indexing layer common to various 
 vector datasets. For many simple formats it performs the 
 brute-force selection.
 
 Just curious, would it make sense / be possible to implement indexing
  in OGR, something like a generalized version of Mapserver's shptree,
  the quadtree-based spatial index for a shapefiles?

This implementation of the index comes from Shapelib, written by Frank.
The very same bits of Shapelib are used in MapServer and OGR,
namely the .qix spatial index file support.
So, it's already there, but for Shapefiles only.

Back to the question, I'm personally sceptical.
Recalling the example of processing two layers, one from a DBMS and one from
a file-based data source, how would it be supposed to work?
...a common .qix file generated for the DBMS data source?

In my opinion, this kind of functionality is out of the scope of OGR.
I see OGR as a data provider. OGR is basically a translation library
that reads from one data source and writes to another data source,
providing a reasonably limited set of features to process data during
translation - a common denominator for popular vector spatial
data formats.

IMHO, it's a misunderstanding to consider OGR a fully featured data model
and I/O engine to read, write, process and analyse spatial vector data,
especially if performance is a critical factor. IMHO, there are too many
compromises in OGR.

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org


Re: [gdal-dev] Open source vector geoprocessing libraries?

2010-01-13 Thread Mateusz Loskot
Jason Roberts wrote:
 Mateusz,
 
 Thank you very much for your insight. I have a few more questions I'm
 hoping you could answer.
 
 Alternative is to try to divide the tasks: 1. Query features from
 data source using spatial index capability of data source. 2.
 Having only subject features selected, apply geometric processing.
 
 That sounds like a reasonable approach. Considering just the simpler 
 scenarios, such as the one I mentioned, is it possible to implement 
 efficiently it with OGR compiled with GEOS?

Should be, but the OGRGeometry to geos::Geometry translation may be an overhead.

 I believe OGR can pass through SQL directly to the data source
 driver, allowing the caller to submit SQL containing spatial
 operators. In principle, one could submit a spatial query to PostGIS
 or SpatiaLite and efficiently get back the features (including 
 geometry) that could possibly intersect a bounding box. Then one
 could use the GEOS functions on OGRGeometry to do the actual
 intersecting. Is that what you were suggesting?

Yes, that's the concept.

 Of course, it may be that PostGIS or SpatiaLite can handle both steps
 1 and 2 in a single query. If so, would it be best to do it that way?
 
It's usually a good idea to let the DBMS engine do as much as
possible, so that looks like a good idea to me.
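
For instance, a single pass-through query along these lines would hand back
the already-clipped geometries (the connection string, table and column names
are invented):

from osgeo import ogr

ds = ogr.Open("PG:dbname=mydb")     # hypothetical PostGIS connection
wkt = "POLYGON((-80 30,-70 30,-70 40,-80 40,-80 30))"
sql = ("SELECT ST_Intersection(geom, ST_GeomFromText('%s', 4326)) AS geom "
       "FROM shoreline "
       "WHERE ST_Intersects(geom, ST_GeomFromText('%s', 4326))" % (wkt, wkt))
result = ds.ExecuteSQL(sql)
# ... consume the features from 'result' ...
ds.ReleaseResultSet(result)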

 It appears that the OGR shapefile driver supports a spatial indexing
 scheme (.qix file) that is respected by OGRLayer::SetSpatialFilter.
 The documentation says that Currently this test is may be
 inaccurately implemented, but it is guaranteed that all features
 who's envelope (as returned by OGRGeometry::getEnvelope()) overlaps
 the envelope of the spatial filter will be returned. Therefore, it
 appears that the shapefile driver can implement step 1 but not step
 2. Is that correct?

Yes.

 The problem with OGR and GEOS is cost of translation from OGR
 geometry to GEOS geometry. It can be a bottleneck.
 
 Is it correct that this cost would only be incurred when you call OGR
  functions implemented by GEOS, such as OGRGeometry::Intersects, 
 OGRGeometry::Disjoint, etc?

Yes.

Namely, here is where the potential cost takes place:

http://trac.osgeo.org/gdal/browser/trunk/gdal/ogr/ogrgeometry.cpp#L333
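
To make the cost concrete: in a snippet like the one below, each GEOS-backed
call converts the OGR geometries to GEOS geometries internally, so chaining
several calls on the same pair repeats that conversion (the geometries here
are just made-up examples):

from osgeo import ogr

a = ogr.CreateGeometryFromWkt("POLYGON((0 0,10 0,10 10,0 10,0 0))")
b = ogr.CreateGeometryFromWkt("POLYGON((5 5,15 5,15 15,5 15,5 5))")

if a.Intersects(b):        # OGR -> GEOS translation for both geometries
    c = a.Intersection(b)  # ...and again here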

 It's plenty of combinations and my point is that if performance
 (it's not only in terms of speed, but any resource) is critical, it
 would be extremely difficult to provide efficient  implementation
 of such features in OGR with guaranteed or even determinable degree
 of complexity. Without these guarantees, I see little of use of 
 such solution.
 
 Yes, I see what you mean. But I suggest to the open source community
 that there is still value in implementing such features, either as
 part of OGR or another library, even if optimal performance cannot be
 guaranteed in all scenarios.

Perhaps you'll find these inspiring:

http://trac.osgeo.org/qgis/browser/trunk/qgis/src/analysis/vector

Look at the Java camp too.

Best regards,
-- 
Mateusz Loskot, http://mateusz.loskot.net
Charter Member of OSGeo, http://osgeo.org


Re: [gdal-dev] Memory use in GDALDriver::CreateCopy()

2010-01-13 Thread ozy sjahputera
Even,

We use the JP2ECW driver.

I did the valgrind test and did not see any reported leak. Here is some of
the output from valgrind:

==11469== Invalid free() / delete / delete[]
==11469==at 0x4CE: free (in
/usr/lib64/valgrind/amd64-linux/vgpreload_memcheck.so)
==11469==by 0x95D1CDA: (within /lib64/libc-2.9.so)
==11469==by 0x95D1879: (within /lib64/libc-2.9.so)
==11469==by 0x4A1D60C: _vgnU_freeres (in
/usr/lib64/valgrind/amd64-linux/vgpreload_core.so)
==11469==by 0x950AB98: exit (in /lib64/libc-2.9.so)
==11469==by 0x94F55EA: (below main) (in /lib64/libc-2.9.so)
==11469==  Address 0x40366f0 is not stack'd, malloc'd or (recently) free'd
==11469==
==11469== ERROR SUMMARY: 13177 errors from 14 contexts (suppressed: 0 from
0)
==11469== malloc/free: in use at exit: 376 bytes in 9 blocks.
==11469== malloc/free: 8,856,910 allocs, 8,856,902 frees, 5,762,693,361
bytes allocated.
==11469== For counts of detected errors, rerun with: -v
==11469== Use --track-origins=yes to see where uninitialised values come
from
==11469== searching for pointers to 9 not-freed blocks.
==11469== checked 1,934,448 bytes.
==11469==
==11469== LEAK SUMMARY:
==11469==definitely lost: 0 bytes in 0 blocks.
==11469==  possibly lost: 0 bytes in 0 blocks.
==11469==still reachable: 376 bytes in 9 blocks.
==11469== suppressed: 0 bytes in 0 blocks.
==11469== Reachable blocks (those to which a pointer was found) are not
shown.

I will check GDAL trunk, but we are looking forward to an upgrade to 1.7.
For now, I will try to find a scanline-oriented, uncompressed NITF image and perform
the same gdal_translate operation on it. If the memory use does not climb
when operating on the uncompressed image, then we can say with more certainty
that the problem lies with the JPEG2000 drivers. I'll let you know.

Thanks.
Ozy

On Wed, Jan 13, 2010 at 1:46 PM, Even Rouault
even.roua...@mines-paris.orgwrote:

 Ozy,

 The interesting info is that your input image is JPEG2000 compressed.
 This explains why you were able to read a scanline oriented NITF with
 blockwidth  . My guess would be that the leak is in the JPEG2000
 driver in question, so this may be more a problem on the reading part
 than on the writing part. You can check that by running : gdalinfo
 -checksum NITF_IM:0:input.ntf. If you see the memory increasing again
 and again, there's definitely a problem. In case you have GDAL
 configured with several JPEG2000 drivers, you'll have to find which one
 is used : JP2KAK (Kakadu based), JP2ECW (ECW SDK based), JPEG2000
 (Jasper based, but I doubt you're using it with such a big dataset),
 JP2MRSID. Normally, they are selected in the order I've described
 (JP2KAK first, etc). As you're on Linux, it might be interesting that
 you run valgrind to see if it reports leaks. As it might very slow on
 such a big dataset, you could try translating just a smaller window of
 your input dataset, like

 valgrind --leak-check=full gdal_translate NITF_IM:0:input.ntf output.tif
 -srcwin 0 0 37504 128

 I've selected TIF as output format as it shouldn't matter if you confirm
 that the problem is in the reading part. As far as the window size is
 concerned, it's difficult to guess which value will show the leak.

 Filing a ticket with your findings on GDAL Trac might be appropriate.

 It might be good trying with GDAL trunk first though, in case the leak
 might have been fixed since 1.6.2. The beta2 source zip are to be found
 here : http://download.osgeo.org/gdal/gdal-1.7.0b2.tar.gz

 Best regards,

 Even

 ozy sjahputera a écrit :
  Hi Even,
 
  yes, I tried:
  gdal_translate -of NITF -co ICORDS=G -co BLOCKXSIZE=128 -co
  BLOCKYSIZE=128  NITF_IM:0:input.ntf output.ntf
 
  I monitored the memory use using top and it was steadily increasing
  till it reached 98.4% (I have 8GB of RAM and 140 GB of local disk for
  swap etc.) before the node died (not just the program, but the whole
  system just stopped responding).
 
  My GDAL version is 1.6.2.
 
  gdalinfo on this image shows the raster size of (37504, 98772) and
  Block=37504x1.
  The image is compressed using JPEG2000 option and contains two
  subdatasets (data and cloud data ~ I used only the data for
  gdal_translate test).
 
  Band info from gdalinfo:
  Band 1 Block=37504x1 Type=UInt16, ColorInterp=Gray
 
  Ozy
 
  On Tue, Jan 12, 2010 at 5:38 PM, Even Rouault
  even.roua...@mines-paris.org mailto:even.roua...@mines-paris.org
  wrote:
 
  Ozy,
 
  Did you try with gdal_translate -of NITF src.tif output.tif -co
  BLOCKSIZE=128 ? Does it give similar results ?
 
  I'm a bit surprised that you even managed to read a 40Kx100K large
  NITF
  file organized as scanlines. There was a limit until very recently
  that
  prevented to read blocks whose one dimension was bigger than .
  This
  was fixed recently in trunk ( see ticket
  http://trac.osgeo.org/gdal/ticket/3263 ) and branches/1.6, but it
 has
  not yet been released to an officially 

Re: [Gdal-dev] When GDAL 1.7.0 branch?

2010-01-13 Thread Mateusz Loskot


Mateusz Loskot wrote:
 
 Frank Warmerdam wrote:
 Mateusz Loskot wrote:
 Hi,

 When the upcoming 1.7.0 will get its own branch in SVN?
 
 Mateusz,
 
 My normal practice is to produce a 1.7 branch at the point the first
 RC is prepared.
 
 Frank,
 
 Great. Thanks!
 

Frank,

One more thing if I may: the Wiki/Roadmap says that RC1 was planned for Dec 15,
2009.
I understand the schedule has changed. Would you have a new date for RC1?

Best regards,

-
-- 
Mateusz Loskot
http://mateusz.loskot.net
-- 


[gdal-dev] UTF8/Wide chars in path

2010-01-13 Thread ogr user
Dear all,

 I've been looking around the GDAL codebase and I can't see anything that
would deal with wide/multibyte characters for file names/paths on
Windows. Would you please confirm that on Windows GDAL will only be able
to open files with pure ASCII names, not an international charset?

-M