Re: [gdal-dev] Deciding whether to use WFS paging after opening the datasource

2023-05-03 Thread Even Rouault

Craig,

Both options you propose are reasonable. I'd perhaps have a slight 
preference for option 2. (Is my understanding correct that the user would 
be responsible for issuing GetFeatureCount() manually before calling 
GetNextFeature()? We don't want to call GetFeatureCount() systematically, 
because it can be a rather costly operation on some servers.)


It could be enhanced by allowing a CHECK_WITH_HITS value for 
OGR_WFS_PAGING_ALLOWED (or perhaps through a new dedicated open option?) 
that would automatically issue the feature count request. That could be 
useful for ogr2ogr-type scenarios, where the user can't easily cause 
ogr2ogr to issue GetFeatureCount().


Even

On 03/05/2023 at 03:10, Craig de Stigter wrote:

Hi folks

We're having trouble with OGR trying to fetch a response from this 
WFS2 URL: 
https://geo.irceline.be/wfs?SERVICE=WFS&VERSION=2.0.0&REQUEST=GetFeature&TYPENAMES=realtime:pm25_24hmean_station&COUNT=100&STARTINDEX=0 



The error from the server is

Cannot do natural order without a primary key, please add it or
specify a manual sort over existing attributes


This seems fair enough on the surface - we specified *STARTINDEX=0* 
but the server is unable to support paging for this dataset due to no 
natural sort order.


We'd happily disable paging for this dataset - it has a fairly low 
number of features - but we would prefer to implement a general 
purpose solution rather than special-case this particular dataset.


It would make sense to open the datasource, check the feature count by 
doing a 'hits' query (we're doing this anyway), and then use paging if 
there are more than 100 features - otherwise don't use paging.
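
For reference, a 'hits' query in WFS 2.0 is a GetFeature request with 
RESULTTYPE=hits, and the count comes back in the numberMatched attribute 
of the response. A minimal Python sketch of that step, outside of GDAL: 
the helper names and the sample response are made up for illustration, 
and a real server may report numberMatched="unknown".

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def hits_url(base_url, type_name):
    """Build a WFS 2.0 GetFeature request that only asks for the match count."""
    params = {
        "SERVICE": "WFS",
        "VERSION": "2.0.0",
        "REQUEST": "GetFeature",
        "TYPENAMES": type_name,
        "RESULTTYPE": "hits",
    }
    return base_url + "?" + urlencode(params)


def parse_number_matched(xml_text):
    """Return numberMatched as an int, or None if absent or 'unknown'."""
    root = ET.fromstring(xml_text)
    value = root.get("numberMatched")
    if value is None or value == "unknown":
        return None
    return int(value)


# Hypothetical response body, not captured from the server above.
sample = (
    '<wfs:FeatureCollection xmlns:wfs="http://www.opengis.net/wfs/2.0" '
    'numberMatched="42" numberReturned="0" '
    'timeStamp="2023-05-03T00:00:00Z"/>'
)

print(hits_url("https://geo.irceline.be/wfs", "realtime:pm25_24hmean_station"))
print(parse_number_matched(sample))
```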


However I can't find a way to accomplish this with GDAL. The 
*OGR_WFS_PAGING_ALLOWED* config var is only checked when the 
datasource is opened - so you can't open the datasource, check the 
feature count and then change the paging behaviour.


I can imagine either of these changes in GDAL itself might fix this:

1. GDAL could check *OGR_WFS_PAGING_ALLOWED* when actually first 
issuing GetFeature requests, rather than during datasource creation
2. GDAL could omit the pagination parameters if the feature count is 
already known and is less than the page size
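
As a rough illustration of option 2 (this is not how GDAL implements 
anything; the function name is made up, and keeping paging when the count 
is unknown is my own assumption), the decision could look like this in 
Python:

```python
PAGE_SIZE = 100  # matches COUNT=100 in the request quoted above


def get_feature_params(type_name, feature_count, page_size=PAGE_SIZE):
    """Return GetFeature parameters, adding paging only when it is needed.

    feature_count is the result of an earlier 'hits' query, or None when
    the count is not known.
    """
    params = {
        "SERVICE": "WFS",
        "VERSION": "2.0.0",
        "REQUEST": "GetFeature",
        "TYPENAMES": type_name,
    }
    # Assumption: with an unknown count we keep paging, as happens today.
    needs_paging = feature_count is None or feature_count > page_size
    if needs_paging:
        params["COUNT"] = str(page_size)
        params["STARTINDEX"] = "0"
    return params


small = get_feature_params("realtime:pm25_24hmean_station", 12)
large = get_feature_params("realtime:pm25_24hmean_station", 250)
print("STARTINDEX" in small, "STARTINDEX" in large)
```

With a small layer the request carries no COUNT or STARTINDEX, so a server 
without a natural sort order never has to page.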


How would you suggest we proceed?

--
Regards,
Craig

Platform Engineer
Koordinates

+64 21 256 9488  / koordinates.com 
 / @koordinates 


___
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev


--
http://www.spatialys.com
My software is free, but my time generally not.


Re: [gdal-dev] Ogr2ogr CSV driver not handling correctly line breaks inside columns

2023-05-03 Thread Moises Calzado via gdal-dev
Hi Even,

Thanks so much for taking a look into that one!

I have a question about the CSVT content: we're not really using it,
but it's required when using the GEOMETRY_NAME layer creation option, as
the CSV driver documentation states:

> GEOMETRY_NAME=name (Starting with GDAL 2.1): Name of geometry column.
> Only used if GEOMETRY=AS_WKT and CREATE_CSVT=YES. Defaults to WKT

We really need this flag as we are processing files that contain
geometries with different column names, and we always want the same
geometry name in the generated output. Are we losing something when using
that flag to avoid this problem?

In my humble opinion, generating an invalid CSV when using -lco
CREATE_CSVT=YES looks like a bug, as I can't see the reason why
strings containing line breaks can't be quoted.

Could you please shed some light on this?

Looking forward to your reply,
Regards.

On Wed, 3 May 2023 at 14:00, Even Rouault wrote:

> you didn't post to the list
>
> On Sat, 29 Apr 2023 at 15:44, Even Rouault wrote:
>
>> Moises,
>>
>> as far as I can see with your example, the CSV driver behaves "properly"
>> in reading and writing of field values with line breaks.
>>
>> It follows the "Fields with embedded line breaks must be quoted" rule of
>> https://en.wikipedia.org/wiki/Comma-separated_values
>>
>> $ ogr2ogr out.csv /vsizip/dataframe.zip
>>
>> $ cat out.csv
>> id,descriptio
>> "1",This is my third row
>> "2","this is
>> my string
>> "
>> "3",This is my third row
>>
>> $ ogrinfo out.csv -al
>> INFO: Open of `out.csv'
>>   using driver `CSV' successful.
>>
>> Layer name: out
>> Geometry: None
>> Feature Count: 3
>> Layer SRS WKT:
>> (unknown)
>> id: String (0.0)
>> descriptio: String (0.0)
>> OGRFeature(out):1
>>   id (String) = 1
>>   descriptio (String) = This is my third row
>>
>> OGRFeature(out):2
>>   id (String) = 2
>>   descriptio (String) = this is
>> my string
>>
>>
>> OGRFeature(out):3
>>   id (String) = 3
>>   descriptio (String) = This is my third row
>>
>> But in your example using /vsistdout/ and -lco CREATE_CSVT=YES is going
>> to result in an invalid CSV file which will mix both the .csvt and .csv
>> content
>>
>> Even
>> On 24/04/2023 at 13:34, Moises Calzado via gdal-dev wrote:
>>
>> Hello!
>>
>> We're trying to convert a Shapefile into a CSV using ogr2ogr and we're
>> having some issues while dealing with some columns that contain line breaks
>> inside their values. If we have a line with the following string, ogr2ogr
>> detects that the line break is a new line and it returns two lines.
>>
>> "this is my \n value"
>>
>>
>> That's the command that we're executing:
>>
>> ogr2ogr -f CSV -skipfailures -makevalid /vsistdout/ /vsizip/shapefile.zip \
>>   -simplify 0.1 -dim XY -t_srs EPSG:4326 -lco GEOMETRY=AS_WKT \
>>   -lco GEOMETRY_NAME=geom -lco CREATE_CSVT=YES > result.csv
>>
>> Is this an expected behaviour, or is there any way to avoid this?
>> Sharing an example Shapefile so that you can try to reproduce that
>> behaviour:
>> https://drive.google.com/file/d/1gFqfTP02KTFoavJyyO-Ix05YwZB2tS24/view?usp=sharing
>>
>> Thanks so much in advance,
>> Regards.
>>
>> --
>> *Moises Calzado*
>>
>> Support Engineer
>>
>> +34671264286 | mcalz...@carto.com | CARTO 
>> 
>>

-- 
*Moises Calzado*

Support Engineer

+34671264286 | mcalz...@carto.com | CARTO 

Re: [gdal-dev] JPEG2000 change progression order without decoding

2023-05-03 Thread Even Rouault
I'm not aware of any open-source software that does that, but the 
kdu_transcode utility from Kakadu (proprietary) should be able to do it. 
Cf. https://trac.osgeo.org/gdal/ticket/3295 / 
https://kakadusoftware.com/wp-content/uploads/Usage_Examples.txt


On 03/05/2023 at 04:20, Tobby Moalem wrote:
I wonder if there is a way to change the progression order of JPEG2000 
images without decoding them and then encoding them again. For 
example, if I have an image with RPCL progression and I want to convert 
it to LRCP, is there a way to only move bits around (change the order 
of the file) instead of re-encoding the whole image 
(which may take a long time)?


Thanks in advance,
Tobby


Re: [gdal-dev] Ogr2ogr CSV driver not handling correctly line breaks inside columns

2023-05-03 Thread Even Rouault


On 03/05/2023 at 14:22, Moises Calzado via gdal-dev wrote:

> Hi Even,
>
> Thanks so much for taking a look into that one!
>
> I have one doubt regarding the CSVT content, as we're not really using 
> it, but it's required when using the GEOMETRY_NAME layer creation 
> option, as can be checked in the CSV driver documentation:
>
>   GEOMETRY_NAME=name (Starting with GDAL 2.1): Name of geometry
>   column. Only used if GEOMETRY=AS_WKT and CREATE_CSVT=YES.
>   Defaults to WKT
>
> We really need this flag as we are processing files that contain 
> geometries with different column names, and we always want the same 
> geometry name in the generated output. Are we losing something when 
> using that flag to avoid this problem?


The reason for requiring CREATE_CSVT=YES is that when reading back a 
.csv without a .csvt, the geometry column must be named WKT, unless you 
specify the GEOM_POSSIBLE_NAMES open option (which must have been a 
later addition). That said, it could be reasonable to relax that coupling 
and allow GEOMETRY_NAME without CREATE_CSVT=YES, with a warning in the 
doc about the consequence I just mentioned.


> In my humble opinion, generating an invalid CSV when using the -lco 
> CREATE_CSVT=YES looks like a bug for me,

Are you speaking about emitting the .prj and .csvt content when writing 
to /vsistdout/? Yes, I'd tend to agree they should not be emitted in 
that mode.

> as I can't see the reason why strings containing line breaks can't be 
> quoted.

I'm not following you about the issue with line breaks. In my previous 
message, I showed I didn't reproduce any issue: the CSV driver emits 
fields with double quotes, even when there are line breaks. Can you be 
more specific about what's wrong? I don't see the connection with 
GEOMETRY_NAME.
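
To make the quoting point concrete, here is a small Python check using 
only the standard csv module (not GDAL). It parses the out.csv content 
from my previous message and shows that the quoted field with embedded 
line breaks is read back as a single value. The helper name is just for 
illustration.

```python
import csv
import io

# Content of out.csv as produced by ogr2ogr in the earlier message.
OUT_CSV = (
    'id,descriptio\n'
    '"1",This is my third row\n'
    '"2","this is\n'
    'my string\n'
    '"\n'
    '"3",This is my third row\n'
)


def read_records(text):
    """Parse CSV text into a list of dicts, one per record."""
    # newline="" leaves line-break handling inside quoted fields to csv.
    return list(csv.DictReader(io.StringIO(text, newline="")))


records = read_records(OUT_CSV)
print(len(records))                    # 3 records, not 5 lines
print(repr(records[1]["descriptio"]))  # 'this is\nmy string\n'
```

Any tool that splits the file on raw line breaks before parsing will see 
extra "rows" here, but a CSV-aware reader will not.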



Re: [gdal-dev] Ogr2ogr CSV driver not handling correctly line breaks inside columns

2023-05-03 Thread Robert Hewlett
The .CSVT and .PRJ files help to make a proper GeoCSV dataset, which helps
with QGIS and geopandas. The column name that I use in the CSV is usually
geom, and WKT shows up in the CSVT file, which seems to be a one-line file
that hints at the data types in the CSV file.

I hope that makes sense.

CSVT
Integer, Integer,WKT

CSV
line_id,point_id,geom
1,1,"POINT(1000 1000)"

PRJ
EPSG:26910
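
To show what I mean by "hints at the data types", here is a rough Python
sketch that applies the CSVT line above to the CSV rows. It is only an
illustration: it handles Integer and Real, leaves everything else
(including WKT) as text, and ignores quoted entries or widths such as
Integer(10) that a real .csvt may contain. The function names are made up.

```python
import csv
import io


def parse_csvt(csvt_line):
    """Split the one-line CSVT content into a list of type names."""
    return [name.strip() for name in csvt_line.strip().split(",")]


def apply_types(row, types):
    """Convert one CSV row according to the CSVT type hints."""
    converted = []
    for value, type_name in zip(row, types):
        if type_name == "Integer":
            converted.append(int(value))
        elif type_name == "Real":
            converted.append(float(value))
        else:
            converted.append(value)  # WKT, String, etc. stay as text
    return converted


CSVT = "Integer, Integer,WKT"
CSV_TEXT = 'line_id,point_id,geom\n1,1,"POINT(1000 1000)"\n'

reader = csv.reader(io.StringIO(CSV_TEXT, newline=""))
header = next(reader)
types = parse_csvt(CSVT)
rows = [apply_types(row, types) for row in reader]
print(header, rows)
```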





Re: [gdal-dev] Ogr2ogr CSV driver not handling correctly line breaks inside columns

2023-05-03 Thread Moises Calzado via gdal-dev
We're also specifying GEOM_POSSIBLE_NAMES, so it would be great if, with
that option, we could use GEOMETRY_NAME without the CREATE_CSVT=YES
option.

Regarding emitting the .prj and .csvt content in /vsistdout/ mode: that's
what I mean when I say there is an issue with the generated CSV.
As we see it, when using /vsistdout/ the result is a CSV file with the
.prj information on the first line and the .csvt on the second line. We
handle this by deleting the first two lines and using the rest of the
content as a CSV, which should be identical to the result obtained when
running ogr2ogr without the CREATE_CSVT=YES option.
We're probably losing something, but as we see it, the remaining CSV
should be a valid one. Does that make sense?

Thanks so much for your help!


Re: [gdal-dev] Ogr2ogr CSV driver not handling correctly line breaks inside columns

2023-05-03 Thread Robert Hewlett
Just to clarify, instead of getting three files you are getting one with
all the info: types, projection, data?

https://giswiki.hsr.ch/GeoCSV


Re: [gdal-dev] Ogr2ogr CSV driver not handling correctly line breaks inside columns

2023-05-03 Thread Moises Calzado via gdal-dev
Hi Robert,

Yes, we're getting one with all the info!

El mié, 3 may 2023 a las 18:14, Robert Hewlett ()
escribió:

> Just to clarify, instead of getting three files you are getting one with
> all the info: types, projection, data?
>
> https://giswiki.hsr.ch/GeoCSV
>
> On Wed, May 3, 2023 at 8:57 AM Moises Calzado via gdal-dev <
> gdal-dev@lists.osgeo.org> wrote:
>
>> We're also specifying the GEOM_POSSIBLE_NAMES, so it would be great if
>> with that option we could use the GEOMETRY_NAME without using the
>> CREATE_CSVT=YES option.
>>
>> Regarding emitting the .prj and .csvt in /vsistdout mode, that's why I'm
>> saying that there is an issue while generating the resultant CSV.
>> The way we see it is that when using the /vsistdout mode, the result is a
>> CSV file with the .prj information in the first line, and the .csvt in the
>> second line. We're dealing with the result deleting the first two lines and
>> using the rest of the content as a CSV, which should be equal to the result
>> obtained when using ogr2ogr without the CREATE_CSVT=YES option.
>> Probably we're losing something, but as we see it, the generated CSV
>> should be a valid one. Does that make sense?
>>
>> Thanks so much for your help!
>>
>> El mié, 3 may 2023 a las 15:10, Robert Hewlett ()
>> escribió:
>>
>>> The .CSVT and .PRJ help to make a proper geocsv dataset. Helps with QGIS
>>> And geopandas. The column name that I use in the CSV is usually geom and
>>> WKT shows up in the CSVT file which seems to be a one line file that hints
>>> at the data types in the CSV file.
>>>
>>> I hope that makes sense.
>>>
>>> CSVT
>>> Integer, Integer,WKT
>>>
>>> CSV
>>> line_id,point_id,geom
>>> 1,1,"POINT(1000 1000)"
>>>
>>> PRJ
>>> EPSG:26910
>>>
>>>
>>>
>>>
>>> On Wed, May 3, 2023, 05:23 Moises Calzado via gdal-dev <
>>> gdal-dev@lists.osgeo.org> wrote:
>>>
 Hi Even,

 Thanks so much for taking a look into that one!

 I have one doubt regarding the CSVT content, as we're not really using
 it, but it's required when using the GEOMETRY_NAME layer creation option,
 as can be checked in the CSV driver documentation:


>-
>
>GEOMETRY_NAME=name (Starting with GDAL 2.1): Name of geometry
>column. Only used if GEOMETRY=AS_WKT and CREATE_CSVT=YES. Defaults to 
> WKT
>
> We really need this flag as we are processing files that contain
 geometries with different column names, and we always want the same
 geometry name in the generated output. Are we losing something when using
 that flag to avoid this problem?
 In my humble opinion, generating an invalid CSV when using the -lco
 CREATE_CSVT=YES looks like a bug for me, as I can't see the reason why
 strings containing line breaks can't be quoted.

 Could you please shed some light on this?

 Looking forward to your reply,
 Regards.

 El mié, 3 may 2023 a las 14:00, Even Rouault (<
 even.roua...@spatialys.com>) escribió:

> you didn't post to the list
> Le 03/05/2023 à 13:49, Moises Calzado a écrit :
>
> Hi Even,
>
> Thanks so much for taking a look into that one!
>
> I have one doubt regarding the CSVT content, as we're not really using
> it, but it's required when using the GEOMETRY_NAME layer creation option,
> as can be checked in the CSV driver documentation:
>
>
>>-
>>
>>GEOMETRY_NAME=name (Starting with GDAL 2.1): Name of geometry
>>column. Only used if GEOMETRY=AS_WKT and CREATE_CSVT=YES. Defaults to 
>> WKT
>>
>> We really need this flag as we are processing files that contain
> geometries with different column names, and we always want the same
> geometry name in the generated output. Are we losing something when using
> that flag to avoid this problem?
> In my humble opinion, generating an invalid CSV when using the -lco
> CREATE_CSVT=YES looks like a bug for me, as I can't see the reason why
> strings containing line breaks can't be quoted.
>
> Could you please shed some light on this?
>
> Looking forward to your reply,
> Regards.
>
> On Sat, 29 Apr 2023 at 15:44, Even Rouault (<
> even.roua...@spatialys.com>) wrote:
>
>> Moises,
>>
>> as far as I can see with your example, the CSV driver behaves
>> "properly" in reading and writing of field values with line breaks.
>>
>> It follows the "Fields with embedded line breaks must be quoted" rule
>> of https://en.wikipedia.org/wiki/Comma-separated_values
>>
>> $ ogr2ogr out.csv /vsizip/dataframe.zip
>>
>> $ cat out.csv
>> id,descriptio
>> "1",This is my third row
>> "2","this is
>> my string
>> "
>> "3",This is my third row
>>
>> $ ogrinfo out.csv -al
>> INFO: Open of `out.csv'
>>   using driver `CSV' successful.
>>
>> Layer name: out
>> Geometry: None
>> Feature Coun

Re: [gdal-dev] Ogr2ogr CSV driver not handling correctly line breaks inside columns

2023-05-03 Thread Robert Hewlett
Hi,

Not to start a controversy, but it feels like the standard hints at three
files. Did the standard change?

If it is three files, which works for me in QGIS and geopandas (i.e. the data
lands where it is supposed to), then more layer creation options are needed
to handle the SRID/CRS:

CREATE_PRJ=YES/NO
or -t_srs and/or -s_srs triggers the .prj file being created.

Just saying 😊.

In the meantime, would a short Python script help parse the one file into
three?
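
A minimal sketch of what such a script could look like, assuming the layout
described in the quoted message below: the .prj content on the first line of
the /vsistdout output, the .csvt content on the second line, and the CSV data
in the remainder. The function name split_geocsv_stream and the sample text
are hypothetical, and the sketch assumes the .prj content fits on one line.

```python
import csv
import io


def split_geocsv_stream(text):
    """Split combined output into (prj, csvt, csv_text).

    Assumption: line 1 is the .prj content, line 2 is the .csvt content,
    and everything after that is the CSV itself (header plus rows).
    """
    # Split only on the first two newlines so the CSV part is left untouched,
    # including any quoted fields that contain line breaks.
    parts = text.split("\n", 2)
    if len(parts) < 3:
        raise ValueError("expected a .prj line, a .csvt line and CSV data")
    prj, csvt, csv_text = parts
    # Tolerate CRLF line endings on the two metadata lines.
    return prj.rstrip("\r"), csvt.rstrip("\r"), csv_text


# Hypothetical sample mimicking the layout described in the thread.
sample = (
    "EPSG:26910\n"
    "Integer,String\n"
    "id,description\n"
    '"1","this is\nmy string"\n'
)

prj, csvt, csv_text = split_geocsv_stream(sample)

# Parse the remaining CSV to confirm the multi-line field survives intact.
rows = list(csv.reader(io.StringIO(csv_text, newline="")))
```

The three returned strings could then be written to out.prj, out.csvt and
out.csv. Splitting only on the first two newlines, instead of dropping lines
after CSV parsing, keeps the data portion exactly as it was emitted.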


On Wed, May 3, 2023 at 9:16 AM Moises Calzado via gdal-dev <
gdal-dev@lists.osgeo.org> wrote:

> Hi Robert,
>
> Yes, we're getting one with all the info!
>
> On Wed, 3 May 2023 at 18:14, Robert Hewlett wrote:
>
>> Just to clarify, instead of getting three files you are getting one with
>> all the info: types, projection, data?
>>
>> https://giswiki.hsr.ch/GeoCSV
>>
>> On Wed, May 3, 2023 at 8:57 AM Moises Calzado via gdal-dev <
>> gdal-dev@lists.osgeo.org> wrote:
>>
>>> We're also specifying the GEOM_POSSIBLE_NAMES, so it would be great if
>>> with that option we could use the GEOMETRY_NAME without using the
>>> CREATE_CSVT=YES option.
>>>
>>> Regarding emitting the .prj and .csvt in /vsistdout mode, that's why I'm
>>> saying that there is an issue when generating the resulting CSV.
>>> As we see it, when using /vsistdout mode the result is a CSV file with
>>> the .prj information in the first line and the .csvt in the second line.
>>> We work around this by deleting the first two lines and using the rest of
>>> the content as a CSV, which should be equal to the result obtained when
>>> using ogr2ogr without the CREATE_CSVT=YES option.
>>> Probably we're losing something, but as we see it, the generated CSV
>>> should be a valid one. Does that make sense?
>>>
>>> Thanks so much for your help!
>>>
>>> On Wed, 3 May 2023 at 15:10, Robert Hewlett wrote:
>>>
>>>> The .CSVT and .PRJ help to make a proper GeoCSV dataset. Helps with
>>>> QGIS and geopandas. The column name that I use in the CSV is usually geom
>>>> and WKT shows up in the CSVT file, which seems to be a one-line file that
>>>> hints at the data types in the CSV file.
>>>>
>>>> I hope that makes sense.
>>>>
>>>> CSVT
>>>> Integer, Integer,WKT
>>>>
>>>> CSV
>>>> line_id,point_id,geom
>>>> 1,1,"POINT(1000 1000)"
>>>>
>>>> PRJ
>>>> EPSG:26910
>>>>
>>>> On Wed, May 3, 2023, 05:23 Moises Calzado via gdal-dev <
>>>> gdal-dev@lists.osgeo.org> wrote:

>>>>> Hi Even,
>>>>>
>>>>> Thanks so much for taking a look into that one!
>>>>>
>>>>> I have one question regarding the CSVT content, as we're not really using
>>>>> it, but it's required when using the GEOMETRY_NAME layer creation option,
>>>>> as can be checked in the CSV driver documentation:
>>>>>
>>>>>> - GEOMETRY_NAME=name (Starting with GDAL 2.1): Name of geometry
>>>>>>   column. Only used if GEOMETRY=AS_WKT and CREATE_CSVT=YES. Defaults to
>>>>>>   WKT
>>>>>
>>>>> We really need this flag as we are processing files that contain
>>>>> geometries with different column names, and we always want the same
>>>>> geometry name in the generated output. Are we losing something when using
>>>>> that flag to avoid this problem?
>>>>> In my humble opinion, generating an invalid CSV when using the -lco
>>>>> CREATE_CSVT=YES looks like a bug to me, as I can't see the reason why
>>>>> strings containing line breaks can't be quoted.
>>>>>
>>>>> Could you please shed some light on this?
>>>>>
>>>>> Looking forward to your reply,
>>>>> Regards.

Re: [gdal-dev] Ogr2ogr CSV driver not handling correctly line breaks inside columns

2023-05-03 Thread Robert Hewlett
Hi,

I just tested with GDAL 3.6.4, released 2023/04/17.

Using ogr2ogr as follows:
ogr2ogr -f CSV poi_out.csv poi.shp -lco CREATE_CSVT=YES
I get three files but no geometry.

ogr2ogr -f CSV poi_out.csv poi.shp -lco CREATE_CSVT=YES -lco GEOMETRY=AS_WKT
I get three files, with the geometry as WKT in a column named WKT:

WKT,id,poi_name,poi_types
"POINT (508878.602179846 5433913.2763688)","1",crescent,"4"
"POINT (517836.918121302 5447702.01715829)","2",Tynehead Regional Park,"1"

ogr2ogr -f CSV poi_out.csv poi.shp -lco CREATE_CSVT=YES -lco GEOMETRY=AS_WKT -lco GEOMETRY_NAME=geom
I get three files, with the geometry as WKT but the column named geom:

geom,id,poi_name,poi_types
"POINT (508878.602179846 5433913.2763688)","1",crescent,"4"
"POINT (517836.918121302 5447702.01715829)","2",Tynehead Regional Park,"1"

What does
ogr2ogr --version
report back?
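
For comparing results across versions, one rough way to confirm which column
carries the geometry in output like the samples earlier in this message is to
read it with Python's csv module. This is only a sketch: the helper name
find_wkt_columns is hypothetical, and testing whether every value starts with
a WKT geometry keyword is a heuristic of my own, not what GDAL does
internally.

```python
import csv
import io

# WKT geometry keywords used by the heuristic below.
WKT_PREFIXES = (
    "POINT", "LINESTRING", "POLYGON", "MULTIPOINT",
    "MULTILINESTRING", "MULTIPOLYGON", "GEOMETRYCOLLECTION",
)


def find_wkt_columns(csv_text):
    """Return the names of columns whose values all look like WKT."""
    reader = csv.DictReader(io.StringIO(csv_text, newline=""))
    rows = list(reader)
    columns = reader.fieldnames or []
    if not rows:
        return []
    return [
        name
        for name in columns
        if all(
            (row[name] or "").strip().upper().startswith(WKT_PREFIXES)
            for row in rows
        )
    ]


# Sample rows copied from the GEOMETRY_NAME=geom run in this message.
sample = (
    "geom,id,poi_name,poi_types\n"
    '"POINT (508878.602179846 5433913.2763688)","1",crescent,"4"\n'
    '"POINT (517836.918121302 5447702.01715829)","2",Tynehead Regional Park,"1"\n'
)

wkt_columns = find_wkt_columns(sample)
```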



On Wed, May 3, 2023 at 9:38 AM Robert Hewlett wrote:

> Hi,
>
> Not to start a controversy, but it feels like the standard hints at three
> files. Did the standard change?
>
> If it is three files, which works for me in QGIS and geopandas (i.e. the data
> lands where it is supposed to), then more layer creation options are needed
> to handle the SRID/CRS:
>
> CREATE_PRJ=YES/NO
> or -t_srs and/or -s_srs triggers the .prj file being created.
>
> Just saying 😊.
>
> In the meantime, would a short Python script help parse the one file into
> three?