Re: [PR] [MINOR][DOCS][FOLLOWUP] Document supported SRIDs in geospatial types [spark]
pratham76 commented on PR #55207:
URL: https://github.com/apache/spark/pull/55207#issuecomment-4205372795

> Thanks for adding SRID documentation, @pratham76; the "Commonly Used SRIDs" table is a useful addition. I have some suggestions to improve accuracy and reduce redundancy with existing content on the page.
>
> Also, minor note: the PR description references #54780, but that PR was closed (not merged). You may want to update the description to reference the correct merged work.

Thank you @szehon-ho for the review comments. I referenced https://github.com/apache/spark/pull/54780 because it appears to be the PR referenced in the JIRA, and it also appears to be the one that was merged. Please let me know if I missed anything, and whether the changes look okay. Thanks!

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]

-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] [MINOR][DOCS][FOLLOWUP] Document supported SRIDs in geospatial types [spark]
pratham76 commented on code in PR #55207:
URL: https://github.com/apache/spark/pull/55207#discussion_r3050478338

## docs/sql-ref-geospatial-types.md: ##
@@ -142,6 +142,92 @@ SELECT ST_Srid(ST_SetSrid(ST_GeomFromWKB(X'010100F03F000
 * **Fixed-SRID columns**: Every value in the column must have the same SRID as the column type. Inserting a value with a different SRID can raise an error (or you can use `ST_SetSrid` to set the value’s SRID to match the column).
 * **Mixed-SRID columns** (`GEOMETRY(ANY)` or `GEOGRAPHY(ANY)`): Values can have different SRIDs. Only valid SRIDs are allowed.
 * **Storage**: Parquet, Delta, and Iceberg store geometry/geography with a fixed SRID per column; mixed-SRID types are for in-memory/query use. When writing to these formats, a concrete (fixed) SRID is required.
+### Supported SRIDs
+
+Spark includes a pre-built registry of standard Spatial Reference Identifiers (SRIDs) from the PROJ database, with overrides to support OGC standards. This registry enables validation and proper handling of coordinate systems for geospatial data.
+
+ Commonly Used SRIDs
+
+| SRID | Name | Description | Typical Use Case |
+|--|--|-|--|
+| 4326 | WGS 84 | World Geodetic System 1984 (latitude/longitude) | GPS coordinates, global data (default for GEOGRAPHY) |
+| 3857 | Web Mercator | Pseudo-Mercator projection used by web mapping services | Web maps (Google Maps, OpenStreetMap, Bing Maps) |
+| 2154 | RGF93 / Lambert-93 | French national coordinate system | France-specific mapping and GIS |
+| 32633 | WGS 84 / UTM zone 33N | Universal Transverse Mercator, zone 33 North | Central Europe (6°E to 12°E) |
+| 32634 | WGS 84 / UTM zone 34N | Universal Transverse Mercator, zone 34 North | Eastern Europe (12°E to 18°E) |
+| 32635 | WGS 84 / UTM zone 35N | Universal Transverse Mercator, zone 35 North | Eastern Europe/Western Asia (18°E to 24°E) |
+
+The registry includes many additional SRIDs for various UTM zones, national coordinate systems, and other projections. For a complete list, refer to the [EPSG Geodetic Parameter Dataset](https://epsg.org/).
+
+ Using Different SRIDs
+
+**Creating tables with specific SRIDs:**
+
+```sql
+-- Web Mercator projection (common for web mapping applications)
+CREATE TABLE web_map_data (
+  id BIGINT,
+  location GEOMETRY(3857)
+);
+
+-- UTM zone 33N for Central Europe
+CREATE TABLE europe_survey_data (
+  id BIGINT,
+  measurement_point GEOMETRY(32633)
+);
+
+-- French national grid
+CREATE TABLE france_cadastre (
+  id BIGINT,
+  parcel GEOMETRY(2154)
+);
+```
+
+**Converting between SRIDs:**

Review Comment:
Thank you, noted, have removed the repeated section.

## docs/sql-ref-geospatial-types.md: ##
@@ -142,6 +142,92 @@ SELECT ST_Srid(ST_SetSrid(ST_GeomFromWKB(X'010100F03F000
 * **Fixed-SRID columns**: Every value in the column must have the same SRID as the column type. Inserting a value with a different SRID can raise an error (or you can use `ST_SetSrid` to set the value’s SRID to match the column).
 * **Mixed-SRID columns** (`GEOMETRY(ANY)` or `GEOGRAPHY(ANY)`): Values can have different SRIDs. Only valid SRIDs are allowed.
 * **Storage**: Parquet, Delta, and Iceberg store geometry/geography with a fixed SRID per column; mixed-SRID types are for in-memory/query use. When writing to these formats, a concrete (fixed) SRID is required.
+### Supported SRIDs
+
+Spark includes a pre-built registry of standard Spatial Reference Identifiers (SRIDs) from the PROJ database, with overrides to support OGC standards. This registry enables validation and proper handling of coordinate systems for geospatial data.
+
+ Commonly Used SRIDs
+
+| SRID | Name | Description | Typical Use Case |
+|--|--|-|--|
+| 4326 | WGS 84 | World Geodetic System 1984 (latitude/longitude) | GPS coordinates, global data (default for GEOGRAPHY) |
+| 3857 | Web Mercator | Pseudo-Mercator projection used by web mapping services | Web maps (Google Maps, OpenStreetMap, Bing Maps) |
+| 2154 | RGF93 / Lambert-93 | French national coordinate system | France-specific mapping and GIS |
+| 32633 | WGS 84 / UTM zone 33N | Universal Transverse Mercator, zone 33 North | Central Europe (6°E to 12°E) |
+| 32634 | WGS 84 / UTM zone 34N | Universal Transverse Mercator, zone 34 North | Eastern Europe (12°E to 18°E) |
+| 32635 | WGS 84 / UTM zone 35N | Universal Transverse Mercator, zone 35 North | Eastern Europe/Western Asia (18°E to 24°E) |
+
+The registry includes many additional SRIDs for various UTM zones, national coordinate systems, and other projections. For a complete list, refer to the [EPSG Geodetic Parameter Dataset](https://epsg.org/).
+
+ Using Different SRIDs
+
+**Creating tables with specific SRIDs:**
+
+```sql
+-- Web Mercator projection (common for web mapping applications)
+CREATE TABLE web_map_
Re: [PR] [MINOR][DOCS][FOLLOWUP] Document supported SRIDs in geospatial types [spark]
pratham76 commented on code in PR #55207:
URL: https://github.com/apache/spark/pull/55207#discussion_r3050477105

## docs/sql-ref-geospatial-types.md: ##
@@ -142,6 +142,92 @@ SELECT ST_Srid(ST_SetSrid(ST_GeomFromWKB(X'010100F03F000
 * **Fixed-SRID columns**: Every value in the column must have the same SRID as the column type. Inserting a value with a different SRID can raise an error (or you can use `ST_SetSrid` to set the value’s SRID to match the column).
 * **Mixed-SRID columns** (`GEOMETRY(ANY)` or `GEOGRAPHY(ANY)`): Values can have different SRIDs. Only valid SRIDs are allowed.
 * **Storage**: Parquet, Delta, and Iceberg store geometry/geography with a fixed SRID per column; mixed-SRID types are for in-memory/query use. When writing to these formats, a concrete (fixed) SRID is required.
+### Supported SRIDs
+
+Spark includes a pre-built registry of standard Spatial Reference Identifiers (SRIDs) from the PROJ database, with overrides to support OGC standards. This registry enables validation and proper handling of coordinate systems for geospatial data.
+
+ Commonly Used SRIDs
+
+| SRID | Name | Description | Typical Use Case |
+|--|--|-|--|
+| 4326 | WGS 84 | World Geodetic System 1984 (latitude/longitude) | GPS coordinates, global data (default for GEOGRAPHY) |
+| 3857 | Web Mercator | Pseudo-Mercator projection used by web mapping services | Web maps (Google Maps, OpenStreetMap, Bing Maps) |
+| 2154 | RGF93 / Lambert-93 | French national coordinate system | France-specific mapping and GIS |
+| 32633 | WGS 84 / UTM zone 33N | Universal Transverse Mercator, zone 33 North | Central Europe (6°E to 12°E) |
+| 32634 | WGS 84 / UTM zone 34N | Universal Transverse Mercator, zone 34 North | Eastern Europe (12°E to 18°E) |
+| 32635 | WGS 84 / UTM zone 35N | Universal Transverse Mercator, zone 35 North | Eastern Europe/Western Asia (18°E to 24°E) |

Review Comment:
Noted! I have added a `CRS identifier` column to the table, which contains the corresponding mappings as stated above. I have also introduced a `type` column, which indicates whether each SRID is valid for GEOGRAPHY, GEOMETRY, or both. Along with these, I have added some notes based on the above comments. Please let me know if this helps. Thanks!

## docs/sql-ref-geospatial-types.md: ##
@@ -142,6 +142,92 @@ SELECT ST_Srid(ST_SetSrid(ST_GeomFromWKB(X'010100F03F000
 * **Fixed-SRID columns**: Every value in the column must have the same SRID as the column type. Inserting a value with a different SRID can raise an error (or you can use `ST_SetSrid` to set the value’s SRID to match the column).
 * **Mixed-SRID columns** (`GEOMETRY(ANY)` or `GEOGRAPHY(ANY)`): Values can have different SRIDs. Only valid SRIDs are allowed.
 * **Storage**: Parquet, Delta, and Iceberg store geometry/geography with a fixed SRID per column; mixed-SRID types are for in-memory/query use. When writing to these formats, a concrete (fixed) SRID is required.
+### Supported SRIDs
+
+Spark includes a pre-built registry of standard Spatial Reference Identifiers (SRIDs) from the PROJ database, with overrides to support OGC standards. This registry enables validation and proper handling of coordinate systems for geospatial data.
+
+ Commonly Used SRIDs
+
+| SRID | Name | Description | Typical Use Case |
+|--|--|-|--|
+| 4326 | WGS 84 | World Geodetic System 1984 (latitude/longitude) | GPS coordinates, global data (default for GEOGRAPHY) |
+| 3857 | Web Mercator | Pseudo-Mercator projection used by web mapping services | Web maps (Google Maps, OpenStreetMap, Bing Maps) |
+| 2154 | RGF93 / Lambert-93 | French national coordinate system | France-specific mapping and GIS |
+| 32633 | WGS 84 / UTM zone 33N | Universal Transverse Mercator, zone 33 North | Central Europe (6°E to 12°E) |
+| 32634 | WGS 84 / UTM zone 34N | Universal Transverse Mercator, zone 34 North | Eastern Europe (12°E to 18°E) |
+| 32635 | WGS 84 / UTM zone 35N | Universal Transverse Mercator, zone 35 North | Eastern Europe/Western Asia (18°E to 24°E) |
+
+The registry includes many additional SRIDs for various UTM zones, national coordinate systems, and other projections. For a complete list, refer to the [EPSG Geodetic Parameter Dataset](https://epsg.org/).

Review Comment:
Thanks for pointing this out. I have updated the note.

## docs/sql-ref-geospatial-types.md: ##
@@ -142,6 +142,92 @@ SELECT ST_Srid(ST_SetSrid(ST_GeomFromWKB(X'010100F03F000
 * **Fixed-SRID columns**: Every value in the column must have the same SRID as the column type. Inserting a value with a different SRID can raise an error (or you can use `ST_SetSrid` to set the value’s SRID to match the column).
 * **Mixed-SRID columns** (`GEOMETRY(ANY)` or `GEOGRAPHY(ANY)`): Values can have different SRIDs. Only valid SRIDs are
Re: [PR] [MINOR][DOCS][FOLLOWUP] Document supported SRIDs in geospatial types [spark]
szehon-ho commented on code in PR #55207:
URL: https://github.com/apache/spark/pull/55207#discussion_r3047796739

## docs/sql-ref-geospatial-types.md: ##
@@ -142,6 +142,92 @@ SELECT ST_Srid(ST_SetSrid(ST_GeomFromWKB(X'010100F03F000
 * **Fixed-SRID columns**: Every value in the column must have the same SRID as the column type. Inserting a value with a different SRID can raise an error (or you can use `ST_SetSrid` to set the value’s SRID to match the column).
 * **Mixed-SRID columns** (`GEOMETRY(ANY)` or `GEOGRAPHY(ANY)`): Values can have different SRIDs. Only valid SRIDs are allowed.
 * **Storage**: Parquet, Delta, and Iceberg store geometry/geography with a fixed SRID per column; mixed-SRID types are for in-memory/query use. When writing to these formats, a concrete (fixed) SRID is required.
+### Supported SRIDs
+
+Spark includes a pre-built registry of standard Spatial Reference Identifiers (SRIDs) from the PROJ database, with overrides to support OGC standards. This registry enables validation and proper handling of coordinate systems for geospatial data.
+
+ Commonly Used SRIDs
+
+| SRID | Name | Description | Typical Use Case |
+|--|--|-|--|
+| 4326 | WGS 84 | World Geodetic System 1984 (latitude/longitude) | GPS coordinates, global data (default for GEOGRAPHY) |
+| 3857 | Web Mercator | Pseudo-Mercator projection used by web mapping services | Web maps (Google Maps, OpenStreetMap, Bing Maps) |
+| 2154 | RGF93 / Lambert-93 | French national coordinate system | France-specific mapping and GIS |
+| 32633 | WGS 84 / UTM zone 33N | Universal Transverse Mercator, zone 33 North | Central Europe (6°E to 12°E) |
+| 32634 | WGS 84 / UTM zone 34N | Universal Transverse Mercator, zone 34 North | Eastern Europe (12°E to 18°E) |
+| 32635 | WGS 84 / UTM zone 35N | Universal Transverse Mercator, zone 35 North | Eastern Europe/Western Asia (18°E to 24°E) |

Review Comment:
Consider adding a **CRS Identifier** column. Spark maps SRIDs to CRS strings internally, and these strings are visible to users in `df.schema.json()` output and in Parquet/Delta/Iceberg storage metadata. For example, `GEOMETRY(4326)` stores as `geometry(OGC:CRS84)` in JSON schema, not `EPSG:4326`. This is a common source of confusion. The key mappings are:

| SRID | CRS Identifier |
|--|---|
| 0 | `SRID:0` |
| 3857 | `EPSG:3857` |
| 4326 | `OGC:CRS84` |
| 4267 | `OGC:CRS27` |
| 4269 | `OGC:CRS83` |

Also worth noting which SRIDs are valid for GEOGRAPHY vs GEOMETRY. For instance, `GEOMETRY(3857)` works but `GEOGRAPHY(3857)` will error because 3857 is a projected (non-geographic) CRS. That's a real pitfall for users.

## docs/sql-ref-geospatial-types.md: ##
@@ -142,6 +142,92 @@ SELECT ST_Srid(ST_SetSrid(ST_GeomFromWKB(X'010100F03F000
 * **Fixed-SRID columns**: Every value in the column must have the same SRID as the column type. Inserting a value with a different SRID can raise an error (or you can use `ST_SetSrid` to set the value’s SRID to match the column).
 * **Mixed-SRID columns** (`GEOMETRY(ANY)` or `GEOGRAPHY(ANY)`): Values can have different SRIDs. Only valid SRIDs are allowed.
 * **Storage**: Parquet, Delta, and Iceberg store geometry/geography with a fixed SRID per column; mixed-SRID types are for in-memory/query use. When writing to these formats, a concrete (fixed) SRID is required.
+### Supported SRIDs
+
+Spark includes a pre-built registry of standard Spatial Reference Identifiers (SRIDs) from the PROJ database, with overrides to support OGC standards. This registry enables validation and proper handling of coordinate systems for geospatial data.
+
+ Commonly Used SRIDs
+
+| SRID | Name | Description | Typical Use Case |
+|--|--|-|--|
+| 4326 | WGS 84 | World Geodetic System 1984 (latitude/longitude) | GPS coordinates, global data (default for GEOGRAPHY) |
+| 3857 | Web Mercator | Pseudo-Mercator projection used by web mapping services | Web maps (Google Maps, OpenStreetMap, Bing Maps) |
+| 2154 | RGF93 / Lambert-93 | French national coordinate system | France-specific mapping and GIS |
+| 32633 | WGS 84 / UTM zone 33N | Universal Transverse Mercator, zone 33 North | Central Europe (6°E to 12°E) |
+| 32634 | WGS 84 / UTM zone 34N | Universal Transverse Mercator, zone 34 North | Eastern Europe (12°E to 18°E) |
+| 32635 | WGS 84 / UTM zone 35N | Universal Transverse Mercator, zone 35 North | Eastern Europe/Western Asia (18°E to 24°E) |
+
+The registry includes many additional SRIDs for various UTM zones, national coordinate systems, and other projections. For a complete list, refer to the [EPSG Geodetic Parameter Dataset](https://epsg.org/).
+
+ Using Different SRIDs
+
+**Creating tables with specific SRIDs:**

Review Comment:
Most of the examples in sections "Using Different SRID
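
[Editor's illustration] The SRID-to-CRS mappings listed in szehon-ho's review comment above can be sketched as a small lookup. This is only a rough sketch under stated assumptions: the dictionary holds just the five pairs quoted in the comment, and the names `KEY_CRS_MAPPINGS`, `crs_identifier`, and `schema_type_string` are hypothetical helpers made up for illustration, not part of Spark's API or its internal registry. SRIDs that the comment does not list return `None` rather than a guessed identifier.

```python
from typing import Optional

# SRID -> CRS identifier pairs copied from the review comment.
# Illustrative only: this is NOT Spark's internal registry.
KEY_CRS_MAPPINGS = {
    0: "SRID:0",
    3857: "EPSG:3857",
    4326: "OGC:CRS84",
    4267: "OGC:CRS27",
    4269: "OGC:CRS83",
}


def crs_identifier(srid: int) -> Optional[str]:
    """Return the CRS identifier quoted for `srid`, or None if it is not listed."""
    return KEY_CRS_MAPPINGS.get(srid)


def schema_type_string(srid: int) -> Optional[str]:
    """Hypothetical helper: build the `geometry(<CRS>)` string that the comment
    says appears in JSON schema output, using only the listed mappings."""
    crs = crs_identifier(srid)
    return None if crs is None else f"geometry({crs})"


if __name__ == "__main__":
    print(schema_type_string(4326))
    print(crs_identifier(32633))
```

The point of the sketch is the one made in the comment: SRID 4326 surfaces as `OGC:CRS84` rather than `EPSG:4326`, so code that compares schema strings should not assume an `EPSG:` prefix.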
Re: [PR] [MINOR][DOCS][FOLLOWUP] Document supported SRIDs in geospatial types [spark]
pratham76 commented on PR #55207:
URL: https://github.com/apache/spark/pull/55207#issuecomment-4196202080

@szehon-ho gentle ping!
Re: [PR] [MINOR][DOCS][FOLLOWUP] Document supported SRIDs in geospatial types [spark]
pratham76 commented on PR #55207:
URL: https://github.com/apache/spark/pull/55207#issuecomment-4189228469

@szehon-ho @cloud-fan @uros-db Could you have a look at the doc additions for https://github.com/apache/spark/pull/54780? Thanks!
