paleolimbot opened a new pull request, #632:
URL: https://github.com/apache/sedona-db/pull/632

   This PR adds a thin wrapper around `pyogrio.raw.write_arrow()` that fills in 
some default arguments using information we already have helpers to get (e.g., 
the geometry column name and its CRS).
   
   ```python
   import sedona.db
   
   sd = sedona.db.connect()
   
   url = "https://github.com/geoarrow/geoarrow-data/releases/download/v0.2.0/ns-water_water-point.fgb"
   sd.read_pyogrio(url).to_pyogrio("foofy.fgb")
   ```
   
   Because the Arrow interface also powers some calls to `ogr2ogr`, I did a 
quick check to make sure performance was not too far off (it is about the same):
   
   ```python
   # curl -L https://github.com/geoarrow/geoarrow-data/releases/download/v0.2.0/ns-water_elevation.fgb -o input.fgb
   %time sd.read_pyogrio("input.fgb").to_pyogrio("foofy.fgb")
   #> CPU times: user 5.51 s, sys: 1.3 s, total: 6.81 s
   #> Wall time: 7 s
   
   # Similar wall time to ogr2ogr, which does approximately the same thing
   # ogr2ogr foofy.fgb input.fgb  6.51s user 1.58s system 93% cpu 8.697 total
   ```
   
   One hiccup that I will solve in a follow-up PR is that the version of GDAL 
in the current pip release of pyogrio doesn't support string or binary views, 
so some pretty common operations like `read_parquet().to_pyogrio()` will error 
(https://github.com/OSGeo/gdal/issues/13942). Some other libraries have this 
restriction too, so we can provide a function that inserts the appropriate 
casts to eliminate exotic types (e.g., string view, run-end encoded, 
dictionary).

