[GitHub] [spark] brkyvz commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider
brkyvz commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider
URL: https://github.com/apache/spark/pull/26913#issuecomment-570798373

@cloud-fan Any more comments on this? Shall we merge this?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards, Apache Git Services

- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] brkyvz commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider
brkyvz commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider
URL: https://github.com/apache/spark/pull/26913#issuecomment-572586913

retest this please
[GitHub] [spark] brkyvz commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider
brkyvz commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider
URL: https://github.com/apache/spark/pull/26913#issuecomment-572711937

Thanks @rdblue and @cloud-fan. Merging to master.
[GitHub] [spark] brkyvz commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider
brkyvz commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider
URL: https://github.com/apache/spark/pull/26913#issuecomment-568208318

retest this please
[GitHub] [spark] brkyvz commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider
brkyvz commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider
URL: https://github.com/apache/spark/pull/26913#issuecomment-568315720

retest this please
[GitHub] [spark] brkyvz commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider
brkyvz commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider
URL: https://github.com/apache/spark/pull/26913#issuecomment-568384325

@cloud-fan In an offline conversation with @rdblue, we discussed all the APIs (including the proposal in #26868) and all the use cases we have today. We discussed the following use cases:

```
1. Just TableProvider
  - spark.table(...): Only table properties from the metastore are passed in to create the Table
  - spark.read.load(...): Only dataFrameOptions are passed in to create the Table
  - spark.read.schema().load(...): Throws an exception for not supporting `SupportsUserSpecifiedSchema`
2. SupportsExternalMetadata
  2a. SupportsUserSpecifiedSchema (extends SupportsExternalMetadata)
    - spark.table(...): Metastore schema + partitioning info + properties are passed in to create the Table
    - spark.read.load(...): Call inferSchema + inferPartitioning, then pass in inferred schema + partitioning + df options to create the Table
    - spark.read.schema().load(...): Call inferPartitioning, then pass in schema + inferred partitioning + df options to create the Table
  2b. Just SupportsExternalMetadata
    - spark.table(...): Metastore schema + partitioning info + properties are passed in to create the Table
    - spark.read.load(...): Call inferSchema + inferPartitioning, then pass in inferred schema + partitioning + df options to create the Table
    - spark.read.schema().load(...): Throws an exception for not supporting `SupportsUserSpecifiedSchema`
3. SupportsCatalogOptions
  - spark.table(...): Metastore schema + partitioning info + properties are passed in to create the Table
  - spark.read.load(...): extractCatalog provides a catalog to provide schema + partitioning + properties
  - spark.read.schema().load(...): Use the catalog, ignore userSpecifiedSchema
```

We noticed that we could cover all use cases with 2 and 3. The question then was whether we even need the `getTable(properties)` method in #26868, and it didn't seem to be required.
So, the flow would be:
 - If `SupportsCatalogOptions` is extended and we're trying to resolve a table through data source options, always delegate to the catalog implementation blessed by the data source.
 - If not, the methods in `SupportsExternalMetadata` can be used to create the table instance.
 - We also need a marker `SupportsUserSpecifiedSchema` to handle the case of users providing schemas where they shouldn't. (Maybe `SupportsUserSpecifiedSchema` has a `withSchema` method to recreate a table with the user-provided schema.)

Let me know what you think.
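The flow described above, for the `spark.read[.schema()].load(...)` path, could be sketched roughly as follows. This is only an illustration under stated assumptions: the nested interfaces are empty, hypothetical stand-ins that reuse the names from the comment and are not the real Spark connector API, and `ResolutionSketch`, `Resolution`, and `resolveForLoad` are made-up names. The fallback for a plain `TableProvider` corresponds to use case 1, which the comment suggests may not be needed.

```java
import java.util.Optional;

// Illustrative sketch only: these nested types are hypothetical stand-ins
// named after the interfaces in the comment, NOT the real Spark API.
class ResolutionSketch {
    interface TableProvider {}
    interface SupportsExternalMetadata extends TableProvider {}
    interface SupportsUserSpecifiedSchema extends SupportsExternalMetadata {}
    interface SupportsCatalogOptions extends TableProvider {}

    // How the table would be built for spark.read[.schema()].load(...).
    enum Resolution {
        CATALOG,                              // use case 3
        USER_SCHEMA_INFERRED_PARTITIONING,    // use case 2a with schema()
        INFERRED_SCHEMA_AND_PARTITIONING,     // use cases 2a/2b without schema()
        OPTIONS_ONLY                          // use case 1
    }

    static Resolution resolveForLoad(TableProvider provider, Optional<String> userSchema) {
        // Always delegate to the catalog blessed by the data source;
        // a user-specified schema is ignored on this path.
        if (provider instanceof SupportsCatalogOptions) {
            return Resolution.CATALOG;
        }
        // The marker guards against schemas supplied where they are not supported.
        if (userSchema.isPresent() && !(provider instanceof SupportsUserSpecifiedSchema)) {
            throw new IllegalArgumentException(
                "provider does not support a user-specified schema");
        }
        if (provider instanceof SupportsExternalMetadata) {
            return userSchema.isPresent()
                ? Resolution.USER_SCHEMA_INFERRED_PARTITIONING
                : Resolution.INFERRED_SCHEMA_AND_PARTITIONING;
        }
        // Plain TableProvider: only the DataFrame options are available.
        return Resolution.OPTIONS_ONLY;
    }

    public static void main(String[] args) {
        System.out.println(resolveForLoad(new SupportsCatalogOptions() {}, Optional.of("id INT")));
        System.out.println(resolveForLoad(new SupportsUserSpecifiedSchema() {}, Optional.empty()));
    }
}
```

Checking `SupportsCatalogOptions` first mirrors the "always delegate to the catalog" bullet; the schema check comes second so that a catalog-backed source never fails just because a schema was passed.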
[GitHub] [spark] brkyvz commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider
brkyvz commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider
URL: https://github.com/apache/spark/pull/26913#issuecomment-568619664

retest this please
[GitHub] [spark] brkyvz commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider
brkyvz commented on issue #26913: [SPARK-29219][SQL] Introduce SupportsCatalogOptions for TableProvider
URL: https://github.com/apache/spark/pull/26913#issuecomment-568807340

retest this please