umartin commented on PR #828: URL: https://github.com/apache/sedona/pull/828#issuecomment-1544118849
Hi,

I would prefer a plain binary data source and separate RS_AsXXX functions. There are several benefits to separating the raster formats from the data source:

- The raster formats are useful for transferring rasters between different systems. For example, I could use RS_AsXXX and write the raster to PostGIS, Parquet, or Kafka for further processing outside Sedona and Spark. That means we would need the RS_AsXXX functions anyway.
- Different formats require different parameters. Since ArcGrid is single-layered, we might want a parameter to select which layer to convert in a multi-layer raster. With GeoTIFF, we might have parameters for compression level and compression codec. Doing conversion and binary file writing in the data source will lead to a messy API when there are many formats with wildly different parameters.
- The data source API in Spark is constantly evolving. We might want to minimize our exposure to that API by keeping the data source simple.
- We can create Flink bindings for the RS_AsXXX functions as well.
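To illustrate the separation argued for above, here is a rough sketch in plain Python. It is not Sedona or Spark code. The names `Raster`, `as_arcgrid_like`, `as_toy_compressed`, and `write_binary` are hypothetical, and the encoders are toy stand-ins rather than real ArcGrid or GeoTIFF writers. The point is only that each encoder carries its own parameters, while the writer knows nothing about formats.

```python
import zlib
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Raster:
    """Hypothetical in-memory raster, indexed as bands[band][row][col]."""
    bands: list


def as_arcgrid_like(raster: Raster, band: int = 0) -> bytes:
    """Toy single-band text encoder (NOT the full ArcGrid format).

    The format-specific parameter here is `band`: the output holds only
    one layer, so the caller picks which layer to convert.
    """
    rows = raster.bands[band]
    header = f"ncols {len(rows[0])}\nnrows {len(rows)}\n"
    body = "\n".join(" ".join(str(v) for v in row) for row in rows)
    return (header + body + "\n").encode("ascii")


def as_toy_compressed(raster: Raster, level: int = 6) -> bytes:
    """Toy multi-band encoder (NOT GeoTIFF).

    The format-specific parameter here is the compression `level`.
    """
    payload = repr(raster.bands).encode("ascii")
    return zlib.compress(payload, level)


def write_binary(path: str, data: bytes) -> int:
    """Format-agnostic writer: it only ever sees bytes."""
    return Path(path).write_bytes(data)
```

With this split, the same bytes could go to a file, a database column, or a message queue, and adding a new format means adding one encoder without touching the writer.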
