Hi Robert!
I think we're losing sight of the main issue we reported: the problem
is with line breaks in the output generated when using /vsistdout/
together with the CREATE_CSVT=YES option.
Even pointed out that it works as expected without that option, but
when it's used the generated output is not valid, as the "Fields with
embedded line breaks must be quoted" rule is not followed.
IMHO, although the generated output is not a CSV in itself, we should be
able to delete the first two lines (projection info and types) and
treat the rest of the content as a CSV.
What we're doing is streaming the /vsistdout/ output of ogr2ogr to
another process that performs some steps on the resulting CSV. In all
other cases it works correctly, as the output of the ogr2ogr execution
is a valid CSV once the first two lines are deleted, but in the case
reported in my first email it is not.
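
To make the downstream step concrete, here is a minimal Python sketch of what the consuming process does, assuming (as described in this thread) that the first two lines of the stream are the .prj and .csvt content. The function name rows_after_preamble and the skip_lines parameter are hypothetical, not part of GDAL.

```python
import csv
import io
import itertools


def rows_after_preamble(stream, skip_lines=2):
    """Drop the leading non-CSV lines, then parse the rest as CSV.

    Assumption: the preamble (.prj line, .csvt line) is exactly
    `skip_lines` physical lines long.
    """
    remaining = itertools.islice(stream, skip_lines, None)
    # csv.reader pulls further lines from the iterator when a quoted
    # field contains a line break, so quoted breaks stay in one record.
    return list(csv.reader(remaining))


if __name__ == "__main__":
    sample = io.StringIO(
        'EPSG:4326\n'
        'WKT,String,String\n'
        'geom,id,descriptio\n'
        '"POINT (1 2)","2","this is\nmy string"\n'
    )
    for row in rows_after_preamble(sample):
        print(row)
```

This only works when fields with embedded line breaks are quoted, which is exactly what fails in the reported case.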
The CREATE_CSVT=YES option is mandatory for us for the moment, as it's
required in order to use the GEOMETRY_NAME=geom option, so we don't
have any workaround.
I just wanted to confirm whether that's expected on your side
(generating output that is not a valid CSV in the end)!
On Wed, 3 May 2023 at 21:05, Robert Hewlett (<rob.h...@gmail.com>)
wrote:
Hi,
I just tested with GDAL 3.6.4, released 2023/04/17.
Using ogr2ogr as follows:
ogr2ogr -f CSV poi_out.csv poi.shp -lco CREATE_CSVT=YES
I get three files but no geometry.
ogr2ogr -f CSV poi_out.csv poi.shp -lco CREATE_CSVT=YES -lco GEOMETRY=AS_WKT
I get three files with the geometry as WKT, in a column named WKT:
WKT,id,poi_name,poi_types
"POINT (508878.602179846 5433913.2763688)","1",crescent,"4"
"POINT (517836.918121302 5447702.01715829)","2",Tynehead Regional Park,"1"
ogr2ogr -f CSV poi_out.csv poi.shp -lco CREATE_CSVT=YES -lco GEOMETRY=AS_WKT -lco GEOMETRY_NAME=geom
I get three files with the geometry as WKT, but in a column named geom:
geom,id,poi_name,poi_types
"POINT (508878.602179846 5433913.2763688)","1",crescent,"4"
"POINT (517836.918121302 5447702.01715829)","2",Tynehead Regional Park,"1"
What does
ogr2ogr --version
report back?
On Wed, May 3, 2023 at 9:38 AM Robert Hewlett <rob.h...@gmail.com>
wrote:
Hi,
Not to start a controversy, but it feels like the standard
hints at three files. Did the standard change?
If it is three files, which works for me in QGIS and geopandas
(i.e. the data lands where it is supposed to), then more layer
creation options are needed to handle the SRID/CRS, e.g.
CREATE_PRJ=YES/NO,
or having -t_srs and/or -s_srs trigger the .prj file being created.
Just saying 😊.
In the meantime, would a short Python script help parse the one
file into three?
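
A rough sketch of such a script, assuming the combined stream has the .prj content on its first line and the .csvt content on its second, with everything after that being the CSV body. The names split_combined_output and write_three_files are made up for illustration.

```python
from pathlib import Path


def split_combined_output(text):
    """Split the single stream into (prj, csvt, csv_body).

    Assumption: .prj and .csvt each occupy exactly one line.
    Raises ValueError if there are fewer than three lines.
    """
    prj_line, csvt_line, csv_body = text.split("\n", 2)
    # Tolerate CRLF line endings on the two preamble lines.
    return prj_line.rstrip("\r"), csvt_line.rstrip("\r"), csv_body


def write_three_files(text, stem):
    """Write <stem>.prj, <stem>.csvt and <stem>.csv next to each other."""
    prj, csvt, body = split_combined_output(text)
    stem = Path(stem)
    stem.with_suffix(".prj").write_text(prj + "\n", encoding="utf-8")
    stem.with_suffix(".csvt").write_text(csvt + "\n", encoding="utf-8")
    stem.with_suffix(".csv").write_text(body, encoding="utf-8")
```

Usage would be along the lines of write_three_files(open("result.csv").read(), "poi_out"). The CSV body is passed through untouched, so any unquoted line breaks in it are still there afterwards.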
On Wed, May 3, 2023 at 9:16 AM Moises Calzado via gdal-dev
<gdal-dev@lists.osgeo.org> wrote:
Hi Robert,
Yes, we're getting one with all the info!
On Wed, 3 May 2023 at 18:14, Robert Hewlett
(<rob.h...@gmail.com>) wrote:
Just to clarify, instead of getting three files you
are getting one with all the info: types, projection,
data?
https://giswiki.hsr.ch/GeoCSV
On Wed, May 3, 2023 at 8:57 AM Moises Calzado via
gdal-dev <gdal-dev@lists.osgeo.org> wrote:
We're also specifying GEOM_POSSIBLE_NAMES, so
it would be great if with that option we could use
GEOMETRY_NAME without using the
CREATE_CSVT=YES option.
Regarding emitting the .prj and .csvt in
/vsistdout/ mode, that's why I'm saying that there
is an issue when generating the resulting CSV.
The way we see it, when using /vsistdout/
the result is a CSV file with the
.prj information in the first line and the .csvt
content in the second line. We handle the result by
deleting the first two lines and using the rest of
the content as a CSV, which should be equal to the
result obtained when running ogr2ogr without the
CREATE_CSVT=YES option.
We're probably missing something, but as we see it,
the generated CSV should be a valid one. Does that
make sense?
Thanks so much for your help!
On Wed, 3 May 2023 at 15:10, Robert Hewlett
(<rob.h...@gmail.com>) wrote:
The .csvt and .prj files help to make a proper
GeoCSV dataset, which helps with QGIS and geopandas.
The column name that I use in the CSV is
usually geom, and WKT shows up in the .csvt file,
which seems to be a one-line file that hints
at the data types in the CSV file.
I hope that makes sense.
CSVT
Integer, Integer,WKT
CSV
line_id,point_id,geom
1,1,"POINT(1000 1000)"
PRJ
EPSG:26910
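
To illustrate how the one-line .csvt can drive type handling, here is a small Python sketch that casts the CSV values according to the type names in the .csvt line. The CASTS mapping and the typed_rows name are assumptions for illustration; only Integer and WKT appear in the example above, and any unknown type is simply left as a string.

```python
import csv
import io

# Assumed mapping from .csvt type names to Python casts.
# Anything not listed here (e.g. WKT, String) stays a string.
CASTS = {"integer": int, "integer64": int, "real": float}


def typed_rows(csv_text, csvt_line):
    """Return the CSV records as dicts, cast using the .csvt type hints."""
    # Parse the .csvt line as CSV, drop surrounding spaces and any
    # "(width)" suffix, and compare case-insensitively.
    kinds = [
        kind.strip().split("(")[0].lower()
        for kind in next(csv.reader([csvt_line]))
    ]
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    rows = []
    for record in reader:
        row = {}
        for name, kind, value in zip(header, kinds, record):
            cast = CASTS.get(kind, str)
            # Empty cells become None instead of failing the cast.
            row[name] = cast(value) if value != "" else None
        rows.append(row)
    return rows
```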
On Wed, May 3, 2023, 05:23 Moises Calzado via
gdal-dev <gdal-dev@lists.osgeo.org> wrote:
Hi Even,
Thanks so much for taking a look at this!
I have one question regarding the CSVT
content: we're not really using it, but
it's required when using the GEOMETRY_NAME
layer creation option, as can be checked
in the CSV driver documentation:
* GEOMETRY_NAME=name (Starting with
GDAL 2.1): Name of geometry
column. Only used if
GEOMETRY=AS_WKT and
CREATE_CSVT=YES. Defaults to WKT
We really need this option, as we are
processing files that contain geometries
with different column names, and we always
want the same geometry name in the
generated output. Are we missing something
when using that option to avoid this problem?
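
For reference, one conceivable way to get a fixed geometry column name without CREATE_CSVT=YES would be to rename the header column after the fact. This is only a sketch of a possible post-processing step, not something GDAL provides: rename_geometry_column and its parameters are hypothetical, and it assumes the geometry column comes out with the default name WKT.

```python
import csv
import io


def rename_geometry_column(csv_text, new_name="geom", candidates=("WKT",)):
    """Rewrite the CSV header so the geometry column is called `new_name`.

    Assumption: the geometry column currently has one of the names in
    `candidates`. Records are re-serialised by Python's csv module, so
    quoting may differ from the input, but values are unchanged.
    """
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    header = next(reader)
    writer.writerow([new_name if col in candidates else col for col in header])
    writer.writerows(reader)
    return out.getvalue()
```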
In my humble opinion, generating an
invalid CSV when using -lco
CREATE_CSVT=YES looks like a bug to me,
as I can't see any reason why strings
containing line breaks can't be quoted.
Could you please shed some light on this?
Looking forward to your reply,
Regards.
On Wed, 3 May 2023 at 14:00, Even
Rouault (<even.roua...@spatialys.com>)
wrote:
you didn't post to the list
On 03/05/2023 at 13:49, Moises Calzado
wrote:
On Sat, 29 Apr 2023 at 15:44, Even
Rouault
(<even.roua...@spatialys.com>) wrote:
Moises,
as far as I can see with your
example, the CSV driver behaves
"properly" in reading and writing
of field values with line breaks.
It follows the "Fields with
embedded line breaks must be
quoted" rule of
https://en.wikipedia.org/wiki/Comma-separated_values
$ ogr2ogr out.csv /vsizip/dataframe.zip
$ cat out.csv
id,descriptio
"1",This is my third row
"2","this is
my string
"
"3",This is my third row
$ ogrinfo out.csv -al
INFO: Open of `out.csv'
using driver `CSV' successful.
Layer name: out
Geometry: None
Feature Count: 3
Layer SRS WKT:
(unknown)
id: String (0.0)
descriptio: String (0.0)
OGRFeature(out):1
id (String) = 1
descriptio (String) = This is my third row
OGRFeature(out):2
id (String) = 2
descriptio (String) = this is
my string
OGRFeature(out):3
id (String) = 3
descriptio (String) = This is my third row
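
The same quoting rule can be checked outside GDAL. The sketch below uses Python's standard csv module (not the GDAL CSV driver) to write and re-read the rows shown above; SAMPLE_ROWS, to_csv and from_csv are illustrative names only.

```python
import csv
import io

# The rows from the out.csv example above, including the value
# with embedded line breaks.
SAMPLE_ROWS = [
    ["id", "descriptio"],
    ["1", "This is my third row"],
    ["2", "this is\nmy string\n"],
    ["3", "This is my third row"],
]


def to_csv(rows):
    """Serialise rows; the writer quotes fields that contain line breaks."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def from_csv(text):
    """Parse CSV text back into rows, keeping quoted line breaks intact."""
    return list(csv.reader(io.StringIO(text, newline="")))


if __name__ == "__main__":
    text = to_csv(SAMPLE_ROWS)
    print(text)
    print(from_csv(text) == SAMPLE_ROWS)
```

As long as the field with line breaks is quoted, the round trip gives back three records plus the header.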
But in your example, using
/vsistdout/ with -lco
CREATE_CSVT=YES is going to
result in an invalid CSV file,
which will mix both the .csvt and
.csv content.
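
One way to spot such an invalid file downstream is to check that every record has as many fields as the header. The sketch below is an assumption about how a consumer might validate the data, written with Python's csv module; find_ragged_records is a hypothetical helper, and the check is a heuristic, since a broken record could by chance still have the right number of fields.

```python
import csv
import io


def find_ragged_records(csv_text):
    """Return (line_number, record) pairs whose field count differs
    from the header's field count.

    line_number is the physical line at which the record ended.
    """
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    header = next(reader)
    ragged = []
    for record in reader:
        if len(record) != len(header):
            ragged.append((reader.line_num, record))
    return ragged
```

An unquoted line break splits one record into two short ones, so the second fragment shows up in the returned list.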
Even
On 24/04/2023 at 13:34, Moises
Calzado via gdal-dev wrote:
Hello!
We're trying to convert a
Shapefile into a CSV using
ogr2ogr, and we're having some
issues with columns that
contain line breaks
inside their values. If we have
a value with the following
string, ogr2ogr treats the
line break as a new line and
returns two lines.
"this is my \n value"
This is the command we're
executing:
ogr2ogr -f CSV -skipfailures -makevalid /vsistdout/ \
  /vsizip/shapefile.zip -simplify 0.00001 -dim XY \
  -t_srs EPSG:4326 -lco GEOMETRY=AS_WKT \
  -lco GEOMETRY_NAME=geom -lco CREATE_CSVT=YES > result.csv
Is this expected behaviour,
or is there any way to avoid it?
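
For anyone reproducing this from a script, the same invocation can be assembled as an argument list in Python, which makes it easy to toggle the layer creation options. This is only a sketch: build_ogr2ogr_command is a hypothetical helper, and the flags are exactly those from the command above.

```python
def build_ogr2ogr_command(source, layer_options, destination="/vsistdout/"):
    """Build the ogr2ogr argument list used in this report.

    `layer_options` is a dict of layer creation options; each entry
    becomes one "-lco KEY=VALUE" pair.
    """
    command = [
        "ogr2ogr", "-f", "CSV", "-skipfailures", "-makevalid",
        destination, source,
        "-simplify", "0.00001", "-dim", "XY", "-t_srs", "EPSG:4326",
    ]
    for key, value in layer_options.items():
        command += ["-lco", f"{key}={value}"]
    return command


if __name__ == "__main__":
    print(build_ogr2ogr_command(
        "/vsizip/shapefile.zip",
        {"GEOMETRY": "AS_WKT", "GEOMETRY_NAME": "geom", "CREATE_CSVT": "YES"},
    ))
```

The resulting list can be passed to subprocess.run with stdout redirected to a file, so the output with and without CREATE_CSVT=YES can be compared.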
Sharing an example Shapefile so
that you can try to reproduce
that behaviour:
https://drive.google.com/file/d/1gFqfTP02KTFoavJyyO-Ix05YwZB2tS24/view?usp=sharing
Thanks so much in advance,
Regards.
--
*Moises Calzado*
Support Engineer
+34671264286 |
mcalz...@carto.com | CARTO
<https://www.carto.com/>
<https://spatial-data-science-conference.com/2023/london/>
_______________________________________________
gdal-dev mailing list
gdal-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/gdal-dev
--
http://www.spatialys.com
My software is free, but my time generally not.