[GitHub] [spark] cloud-fan commented on a change in pull request #25651: [SPARK-28948][SQL] support data source v2 in CREATE TABLE USING

2019-09-12 Thread GitBox
URL: https://github.com/apache/spark/pull/25651#discussion_r323785172
 
 

File path: external/avro/src/main/scala/org/apache/spark/sql/v2/avro/AvroDataSourceV2.scala

@@ -35,7 +36,10 @@ class AvroDataSourceV2 extends FileDataSourceV2 {
     AvroTable(tableName, sparkSession, options, paths, None, fallbackFileFormat)
   }
 
-  override def getTable(options: CaseInsensitiveStringMap, schema: StructType): Table = {
+  override def getTable(
+      options: CaseInsensitiveStringMap,
+      schema: StructType,
+      partitions: Array[Transform]): Table = {
 
 Review comment:
  Read options should be passed to `Table.newScanBuilder`. The `options` here are the table properties.
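  For illustration, a minimal sketch of that separation, assuming a hypothetical `MyFormatTable` (the class name, the `path` constructor argument, and the `mode` option are invented for this example; package names are as of this PR and moved under `org.apache.spark.sql.connector` in later refactors). The constructor carries the table properties handed over by `TableProvider.getTable`, while `newScanBuilder` receives the per-scan read options:

```scala
import java.util

import org.apache.spark.sql.sources.v2.{SupportsRead, Table, TableCapability}
import org.apache.spark.sql.sources.v2.reader.{Scan, ScanBuilder}
import org.apache.spark.sql.types.StructType
import org.apache.spark.sql.util.CaseInsensitiveStringMap

// Hypothetical table. The constructor arguments come from TableProvider.getTable,
// i.e. they are *table*-level properties (path, format settings, ...).
class MyFormatTable(path: String, tableSchema: StructType) extends Table with SupportsRead {
  override def name(): String = s"myFormat:$path"
  override def schema(): StructType = tableSchema
  override def capabilities(): util.Set[TableCapability] =
    util.EnumSet.of(TableCapability.BATCH_READ)

  // `options` here are the *read* options of one particular scan
  // (e.g. spark.read.option("mode", "failFast")), not the table properties.
  override def newScanBuilder(options: CaseInsensitiveStringMap): ScanBuilder = {
    val mode = options.getOrDefault("mode", "permissive") // per-scan option (invented)
    new ScanBuilder {
      override def build(): Scan = new Scan {
        override def readSchema(): StructType = tableSchema
      }
    }
  }
}
```

  So `CREATE TABLE ... USING` properties land in the table itself, and `spark.read.option(...)` settings only show up when a scan is built.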





[GitHub] [spark] cloud-fan commented on a change in pull request #25651: [SPARK-28948][SQL] support data source v2 in CREATE TABLE USING

2019-09-04 Thread GitBox
URL: https://github.com/apache/spark/pull/25651#discussion_r320769069
 
 

File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2ScanSupportCheck.scala

@@ -20,17 +20,20 @@ package org.apache.spark.sql.execution.datasources.v2
 import org.apache.spark.sql.AnalysisException
 import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
 import org.apache.spark.sql.execution.streaming.{StreamingRelation, StreamingRelationV2}
-import org.apache.spark.sql.sources.v2.TableCapability.{CONTINUOUS_READ, MICRO_BATCH_READ}
+import org.apache.spark.sql.sources.v2.TableCapability.{BATCH_READ, CONTINUOUS_READ, MICRO_BATCH_READ}
 
 /**
  * This rules adds some basic table capability check for streaming scan, without knowing the actual
  * streaming execution mode.
  */
-object V2StreamingScanSupportCheck extends (LogicalPlan => Unit) {
+object V2ScanSupportCheck extends (LogicalPlan => Unit) {
   import DataSourceV2Implicits._
 
   override def apply(plan: LogicalPlan): Unit = {
     plan.foreach {
+      case r: DataSourceV2Relation if !r.table.supports(BATCH_READ) =>
+        throw new AnalysisException(
+          s"Table ${r.table.name()} does not support batch scan.")
 
 Review comment:
  I've created a separate PR to fix it: https://github.com/apache/spark/pull/25679
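  For context, the renamed rule now gates batch plans as well: any `DataSourceV2Relation` whose table does not advertise `BATCH_READ` fails analysis up front, instead of erroring deep inside planning. A minimal sketch of that contract, with a plain `require` standing in for the `AnalysisException` the rule throws inside Spark:

```scala
import org.apache.spark.sql.sources.v2.{Table, TableCapability}

// The contract the rule enforces: a table must advertise BATCH_READ in
// capabilities() to be scanned by a batch query. require() stands in here
// for the AnalysisException thrown inside the rule.
def assertBatchReadable(table: Table): Unit =
  require(
    table.capabilities().contains(TableCapability.BATCH_READ),
    s"Table ${table.name()} does not support batch scan.")
```

  A source opts in simply by including `BATCH_READ` in the set returned by `Table.capabilities()`, as in the `MyFormatTable` sketch above.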





[GitHub] [spark] cloud-fan commented on a change in pull request #25651: [SPARK-28948][SQL] support data source v2 in CREATE TABLE USING

2019-09-02 Thread GitBox
URL: https://github.com/apache/spark/pull/25651#discussion_r319931123
 
 

File path: sql/catalyst/src/main/java/org/apache/spark/sql/sources/v2/TableProvider.java

@@ -44,18 +45,28 @@
   Table getTable(CaseInsensitiveStringMap options);
 
   /**
-   * Return a {@link Table} instance to do read/write with user-specified schema and options.
+   * Return a {@link Table} instance to do read/write with user-specified options and additional
+   * schema/partitions information. The additional schema/partitions information can be specified
+   * by users (e.g. {@code session.read.format("myFormat").schema(...)}) or retrieved from the
+   * metastore (e.g. {@code CREATE TABLE t(i INT) USING myFormat}).
+   * <p>
+   * The returned table must report the same schema/partitions with the ones that are passed in.
    * <p>
    * By default this method throws {@link UnsupportedOperationException}, implementations should
-   * override this method to handle user-specified schema.
+   * override this method to handle the additional schema/partitions information.
    * <p>
    * @param options the user-specified options that can identify a table, e.g. file path, Kafka
    *                topic name, etc. It's an immutable case-insensitive string-to-string map.
-   * @param schema the user-specified schema.
+   * @param schema the additional schema information.
+   * @param partitions the additional partitions information.
   * @throws UnsupportedOperationException
   */
-  default Table getTable(CaseInsensitiveStringMap options, StructType schema) {
+  default Table getTable(
 Review comment:
  I'll refine the classdoc of this interface after we reach an agreement on the proposal.
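  Until then, the intended usage already follows from the signatures: a provider keeps the one-argument overload for schema inference and overrides the new three-argument overload to accept schema/partitions handed down by the user or the metastore. A minimal sketch, reusing the hypothetical `MyFormatTable` from the first sketch and ignoring partitioning for brevity (`inferSchema` is an invented placeholder; the `Transform` import reflects the catalog-v2 package layout at the time of this PR, later moved to `org.apache.spark.sql.connector.expressions`):

```scala
import org.apache.spark.sql.catalog.v2.expressions.Transform
import org.apache.spark.sql.sources.v2.{Table, TableProvider}
import org.apache.spark.sql.types.StructType
import org.apache.spark.sql.util.CaseInsensitiveStringMap

// Hypothetical provider for the sketch table defined earlier.
class MyFormatProvider extends TableProvider {

  // No schema given: infer it from the data behind options("path").
  override def getTable(options: CaseInsensitiveStringMap): Table = {
    val path = options.get("path")
    new MyFormatTable(path, inferSchema(path))
  }

  // Schema (and partitions) supplied by the user or the metastore: the
  // returned table must report exactly what was passed in, not re-infer it.
  override def getTable(
      options: CaseInsensitiveStringMap,
      schema: StructType,
      partitions: Array[Transform]): Table = {
    new MyFormatTable(options.get("path"), schema)
  }

  // Invented placeholder for format-specific schema inference.
  private def inferSchema(path: String): StructType = ???
}
```

  A full implementation would also have to report the passed-in `partitions` back through `Table.partitioning()`, per the contract quoted above.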

