Re: [gdal-dev] Large shapefile issues

2013-11-25 Thread Even Rouault
On Tuesday 26 November 2013 00:46:18, Stephen Woodbridge wrote:
> Unless something has changed, I have never been able to work with dbf
> file over 2GB using shapelib.

With shapelib or with OGR? With shapelib, you need to define SAOffset as a
64-bit integer, which OGR does.
The shapefile driver regression tests include one that reads and updates a
record located beyond the 2GB mark.
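
To illustrate why the offset type matters, here is a small sketch (not shapelib's actual code). Records in a .dbf are fixed-width, so the byte position of a record is the header length plus the record index times the record length. The field sizes below are made up, and `as_int32` is a hypothetical helper that mimics storing the result in a signed 32-bit integer.

```python
import ctypes


def dbf_record_offset(header_length, record_length, record_index):
    """Byte offset of a record in a .dbf file (records are fixed-width)."""
    return header_length + record_index * record_length


def as_int32(value):
    """Value as it would read back from a signed 32-bit integer (wraps around)."""
    return ctypes.c_int32(value).value


# Made-up layout: 321-byte header, 100-byte records, 25 million records in.
offset = dbf_record_offset(header_length=321, record_length=100,
                           record_index=25_000_000)
print("exact offset:        ", offset)
print("as signed 32-bit int:", as_int32(offset))
```

Past 2GB the 32-bit value goes negative, so a seek based on it cannot land on the right record. A 64-bit offset type avoids that.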


-- 
Geospatial professional services
http://even.rouault.free.fr/services.html
___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/gdal-dev

Re: [gdal-dev] Large shapefile issues

2013-11-25 Thread Stephen Woodbridge
Unless something has changed, I have never been able to work with dbf 
file over 2GB using shapelib.


-Steve W


Re: [gdal-dev] Large shapefile issues

2013-11-25 Thread Even Rouault
Darren,

Yes, as Chaitanya pointed out, the actual limit depends on the software
implementation. In OGR the limit was actually 4 GB for the .shp, and AFAICS
unlimited for the .dbf.

I've added in http://trac.osgeo.org/gdal/changeset/26657 a 2GB_LIMIT=YES layer
creation option (and a SHAPE_2GB_LIMIT configuration option) that enforces
the 2 GB limit, and in http://trac.osgeo.org/gdal/changeset/26658 a change so
that when the limit is reached the file is properly closed with valid
information.
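
For reference, the two ways of switching the limit on could be spelled out roughly as follows. This sketch only builds the ogr2ogr argument lists. The data source names are placeholders, and `build_2gb_limit_command` is a hypothetical helper, not part of GDAL. Running the commands needs a GDAL build that contains the changesets mentioned above.

```python
import shlex


def build_2gb_limit_command(src, dst, use_config_option=False):
    """Build an ogr2ogr argument list that asks the shapefile driver
    to enforce the 2 GB limit.

    use_config_option=False -> layer creation option: -lco 2GB_LIMIT=YES
    use_config_option=True  -> config option: --config SHAPE_2GB_LIMIT YES
    """
    args = ["ogr2ogr"]
    if use_config_option:
        args += ["--config", "SHAPE_2GB_LIMIT", "YES"]
    # ogr2ogr takes the destination first, then the source.
    args += ["-f", "ESRI Shapefile", dst, src]
    if not use_config_option:
        args += ["-lco", "2GB_LIMIT=YES"]
    return args


# Placeholder names; print the commands instead of running them.
print(shlex.join(build_2gb_limit_command("source.sqlite", "out.shp")))
print(shlex.join(build_2gb_limit_command("source.sqlite", "out.shp",
                                         use_config_option=True)))
```

The lists can be passed to `subprocess.run(args, check=True)` once the names point at real data.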

Splitting the output over several files would probably need to be done outside
of this, in a script, by restarting from the source layer at the index next to
the one that was last written to the shapefile.
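
The bookkeeping for such a script might look like the sketch below. It is only a planner and does not touch GDAL. `plan_chunks` is a hypothetical helper, and the per-feature sizes are assumed to be estimates supplied by the caller. The only format fact used is the 100-byte .shp file header.

```python
def plan_chunks(feature_sizes, limit_bytes, header_bytes=100):
    """Yield (first_index, last_index) ranges of source features so that each
    output file's estimated size (header + features) stays within limit_bytes.

    Each range starts at the index right after the last one written.
    """
    start = 0
    used = header_bytes
    index = -1
    for index, size in enumerate(feature_sizes):
        if header_bytes + size > limit_bytes:
            raise ValueError(f"feature {index} alone exceeds the limit")
        if used + size > limit_bytes:
            # Close the current file and restart at this feature.
            yield (start, index - 1)
            start = index
            used = header_bytes
        used += size
    if index >= start:
        yield (start, index)


# Toy numbers: five 40-byte features and a 200-byte limit.
for first, last in plan_chunks([40, 40, 40, 40, 40], limit_bytes=200):
    print(f"write features {first}..{last} to a new shapefile")
```

For real data, `limit_bytes` would be 2**31 and the sizes would come from the geometries. The .dbf grows separately and would need the same check.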

Best regards,

Even

Re: [gdal-dev] Large shapefile issues

2013-11-25 Thread Chaitanya kumar CH
Darren,

These limitations are a result of the shapefile reader/writer
implementation. While the specification permits up to 8GB, the
implementation might not use it fully.

According to the specification, a feature offset is stored as a 32-bit
integer that counts 16-bit words. So the addressable range is 2^32 words of
16 bits each, or (2^32)*2 bytes = 8GB. To fully utilize this, we need to use
unsigned 32-bit integers and check for overflows.
Originally, the shapelib library used by GDAL had the same limit as ESRI's.
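
The figures in this thread can be checked with a few lines of arithmetic. This is a sketch: `max_file_bytes` is a hypothetical helper. Treating the 4GB figure as a signed 32-bit word offset, and the 2GB figure as a signed 32-bit byte offset, is an assumption about the implementations, not something the specification states.

```python
GIB = 2**30


def max_file_bytes(offset_bits, unit_bytes):
    """Largest addressable size for an offset of offset_bits usable bits
    that counts units of unit_bytes bytes."""
    return (2**offset_bits) * unit_bytes


# Unsigned 32-bit offset counting 16-bit (2-byte) words: the format's ceiling.
spec_limit = max_file_bytes(32, 2)
# Same offset held in a signed 32-bit integer: one bit is lost to the sign.
signed_word_limit = max_file_bytes(31, 2)
# Signed 32-bit offset counting bytes (assumed origin of the 2GB figure).
signed_byte_limit = max_file_bytes(31, 1)

for name, value in [("unsigned word offset", spec_limit),
                    ("signed word offset", signed_word_limit),
                    ("signed byte offset", signed_byte_limit)]:
    print(f"{name}: {value // GIB} GiB")
```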

If you are reaching the file size limits, I suggest you look into some
other file format to store your data. SQLite/SpatiaLite is a good option.

A hacky way to split the features across multiple shapefiles is to use the
-where option in ogr2ogr. You can filter based on the FID values.
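
A sketch of that approach, generating one ogr2ogr call per FID range. It assumes the source FIDs are contiguous and start at `first_fid`, which is true for a shapefile source but not guaranteed for every driver. The file names are placeholders and `fid_range_commands` is a hypothetical helper.

```python
import shlex


def fid_range_commands(src, dst_prefix, feature_count, chunk_size, first_fid=0):
    """Return ogr2ogr argument lists that copy consecutive FID ranges of src
    into separate shapefiles named <dst_prefix>_<n>.shp."""
    end_fid = first_fid + feature_count
    commands = []
    for part, start in enumerate(range(first_fid, end_fid, chunk_size)):
        stop = min(start + chunk_size, end_fid)
        where = f"FID >= {start} AND FID < {stop}"
        commands.append(["ogr2ogr", "-f", "ESRI Shapefile",
                         f"{dst_prefix}_{part}.shp", src, "-where", where])
    return commands


# Placeholder source with 25 features, split 10 at a time.
for cmd in fid_range_commands("big.sqlite", "part", 25, 10):
    print(shlex.join(cmd))
```

The chunk size has to be picked so that each piece stays under the size limit, for instance from the average feature size.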



-- 
Best regards,
Chaitanya kumar CH.

+91-9494447584
17.2416N 80.1426E

[gdal-dev] Large shapefile issues

2013-11-25 Thread CARMAN, Darren
Hi List

 

I notice on the OGR formats page for ESRI Shapefile the following is
mentioned:



Size Issues

Geometry: The Shapefile format explicitly uses 32bit offsets and so
cannot go over 8GB (it actually uses 32bit offsets to 16bit words).
Hence, it is not recommended to use a file size over 4GB.

Attributes: The dbf format does not have any offsets in it, so it can be
arbitrarily large.



 

Yet on the ESRI website:



Geometry limitations

There is a 2 GB size limit for any shapefile component file, which
translates to a maximum of roughly 70 million point features. The actual
number of line or polygon features you can store in a shapefile depends
on the number of vertices in each line or polygon (a vertex is
equivalent to a point).



 

I assume the OGR web page is wrong, or has a different meaning outside
of ESRI S/W use.
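
As a rough cross-check of the point-feature figure, the sketch below computes an upper bound for the .shp file alone. It assumes plain 2D points, and uses the record layout from the shapefile specification: a 100-byte file header, then per point an 8-byte record header, a 4-byte shape type and two 8-byte doubles.

```python
FILE_HEADER_BYTES = 100
RECORD_HEADER_BYTES = 8          # record number + content length
POINT_CONTENT_BYTES = 4 + 8 + 8  # shape type + X + Y
POINT_RECORD_BYTES = RECORD_HEADER_BYTES + POINT_CONTENT_BYTES


def max_point_features(limit_bytes):
    """How many 2D point records fit in a .shp of at most limit_bytes."""
    return (limit_bytes - FILE_HEADER_BYTES) // POINT_RECORD_BYTES


print(max_point_features(2**31))
```

This comes out at roughly 76 million, the same ballpark as the 70 million quoted. The .dbf may reach 2GB earlier, depending on the attribute width.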

 

I notice that the PostGIS utility pgsql2shp stops processing with an
error once the dbf file goes over 2GB. Is there any way to get ogr2ogr
to do this?

 

Ideally (and this is something I can't find much information about online), is
there a way to get ogr2ogr to start writing to a new shapefile at a
certain processing point (number of objects or file size)?

 

Alternatively, assuming the OGR website text is correct for files it
creates, is there any utility that could be used to split the created
shapefile files into ones with a size less than 2GB?

 

Thanks in advance

Darren

 

 

 

 Darren Carman
Senior Software Engineer

GEO-Information Services

 

Astrium Services

Tel +44 (0)1252 362138

http://www.astrium-geo.com  

 

