gh-yzou commented on code in PR #3000: URL: https://github.com/apache/polaris/pull/3000#discussion_r2501281826
##########
plugins/spark/README.md:
##########
@@ -113,14 +117,15 @@ bin/spark-shell \ ```
 # Limitations
-The Polaris Spark client supports catalog management for both Iceberg and Delta tables, it routes all Iceberg table
-requests to the Iceberg REST endpoints, and routes all Delta table requests to the Generic Table REST endpoints.
-
-The Spark Client requires at least delta 3.2.1 to work with Delta tables, which requires at least Apache Spark 3.5.3.
-Following describes the current functionality limitations of the Polaris Spark client:
-1) Create table as select (CTAS) is not supported for Delta tables. As a result, the `saveAsTable` method of `Dataframe`
-  is also not supported, since it relies on the CTAS support.
-2) Create a Delta table without explicit location is not supported.
-3) Rename a Delta table is not supported.
-4) ALTER TABLE ... SET LOCATION is not supported for DELTA table.
-5) For other non-Iceberg tables like csv, it is not supported today.
+The following describes the current limitations of the Polaris Spark client:
+
+## General Limitations
+1. The Polaris Spark client only supports Iceberg and Delta Lake tables. It does not support other table formats like CSV, JSON, etc.
+2. Generic tables (non-Iceberg tables) do not currently support credential vending.
+
+## Delta Lake Limitations
+1. Create table as select (CTAS) is not supported for Delta Lake tables. As a result, the `saveAsTable` method of `Dataframe`

Review Comment:
   I see we changed "Delta tables" to "Delta Lake tables". I think Delta is the actual table metadata format, while Delta Lake refers more to the storage layer or system; within Delta Lake, the table is stored in Delta format. So from the table format's point of view, I think "Delta table" is the more accurate term.

##########
site/content/in-dev/unreleased/generic-table.md:
##########
@@ -22,17 +22,19 @@ type: docs
 weight: 435
 ---
-The Generic Table in Apache Polaris is designed to provide support for non-Iceberg tables across different table formats includes delta, csv etc. It currently provides the following capabilities:
+The generic tables framework provides support for non-Iceberg table formats including Delta Lake, CSV, etc. With this framework, you can:

Review Comment:
   I don't think generic table is a framework; it is a catalog concept. Can we keep what we originally had?

##########
site/content/in-dev/unreleased/generic-table.md:
##########
@@ -157,13 +158,10 @@ curl -X DELETE http://localhost:8181/api/catalog/polaris/v1/delta_catalog/namesp
 For the complete and up-to-date API specification, see the [Catalog API Spec](https://editor-next.swagger.io/?url=https://raw.githubusercontent.com/apache/polaris/refs/heads/main/spec/generated/bundled-polaris-catalog-service.yaml).
-## Limitations
+## Known Limitations
-Current limitations of Generic Table support:
-1) Limited spec information. Currently, there is no spec for information like Schema, Partition etc.
-2) No commit coordination or update capability provided at the catalog service level.
-
-Therefore, the catalog itself is unaware of anything about the underlying table except some of the loosely defined metadata.
-It is the responsibility of the engine (and plugins used by the engine) to determine exactly how loading or committing data

Review Comment:
   This part explains the current contract between the server and the client. Can we add it back?

##########
site/content/in-dev/unreleased/polaris-spark-client.md:
##########
@@ -22,17 +22,17 @@ type: docs
 weight: 650
 ---
-Apache Polaris now provides Catalog support for Generic Tables (non-Iceberg tables), please check out
-the [Polaris Catalog OpenAPI Spec]({{% ref "polaris-api-specs/polaris-catalog-api.md" %}}) for Generic Table API specs.
+Polaris provides a Spark client to manage non-Iceberg tables through [Generic Tables]({{% ref "generic-table.md" %}}).
-Along with the Generic Table Catalog support, Polaris is also releasing a Spark client, which helps to
-provide an end-to-end solution for Apache Spark to manage Delta tables using Polaris.
+{{< alert note >}}
+The Spark client can manage Iceberg tables and non-Iceberg tables.
-Note the Polaris Spark client is able to handle both Iceberg and Delta tables, not just Delta.
+Users who only use Iceberg tables can use Spark without this client.

Review Comment:
   -> Users who only need to interact with Iceberg tables are not strictly required to use the Polaris Spark Client. The regular Iceberg-provided Spark client should continue to work.

##########
plugins/spark/README.md:
##########
@@ -113,14 +117,15 @@ bin/spark-shell \ ```
 # Limitations

Review Comment:
   How about updating this title to `# Current Limitations`? All of these limitations can eventually be removed; they just require extra work.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
