[GitHub] [spark] dongjinleekr commented on issue #25251: [MINOR] Trivial cleanups
dongjinleekr commented on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-514886740 @maropu I will give it a try. Let me see. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #25068: [SPARK-28156][SQL][BACKPORT-2.4] Self-join should not miss cached view
AmplabJenkins removed a comment on issue #25068: [SPARK-28156][SQL][BACKPORT-2.4] Self-join should not miss cached view URL: https://github.com/apache/spark/pull/25068#issuecomment-514886374 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25068: [SPARK-28156][SQL][BACKPORT-2.4] Self-join should not miss cached view
AmplabJenkins removed a comment on issue #25068: [SPARK-28156][SQL][BACKPORT-2.4] Self-join should not miss cached view URL: https://github.com/apache/spark/pull/25068#issuecomment-514886379 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108138/
[GitHub] [spark] AmplabJenkins commented on issue #25068: [SPARK-28156][SQL][BACKPORT-2.4] Self-join should not miss cached view
AmplabJenkins commented on issue #25068: [SPARK-28156][SQL][BACKPORT-2.4] Self-join should not miss cached view URL: https://github.com/apache/spark/pull/25068#issuecomment-514886379 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108138/
[GitHub] [spark] SparkQA removed a comment on issue #25068: [SPARK-28156][SQL][BACKPORT-2.4] Self-join should not miss cached view
SparkQA removed a comment on issue #25068: [SPARK-28156][SQL][BACKPORT-2.4] Self-join should not miss cached view URL: https://github.com/apache/spark/pull/25068#issuecomment-514836024 **[Test build #108138 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108138/testReport)** for PR 25068 at commit [`474c16e`](https://github.com/apache/spark/commit/474c16ea5eae33da3f198c8f240a77ae8b1ce13e).
[GitHub] [spark] AmplabJenkins commented on issue #25068: [SPARK-28156][SQL][BACKPORT-2.4] Self-join should not miss cached view
AmplabJenkins commented on issue #25068: [SPARK-28156][SQL][BACKPORT-2.4] Self-join should not miss cached view URL: https://github.com/apache/spark/pull/25068#issuecomment-514886374 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25068: [SPARK-28156][SQL][BACKPORT-2.4] Self-join should not miss cached view
SparkQA commented on issue #25068: [SPARK-28156][SQL][BACKPORT-2.4] Self-join should not miss cached view URL: https://github.com/apache/spark/pull/25068#issuecomment-514886133 **[Test build #108138 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108138/testReport)** for PR 25068 at commit [`474c16e`](https://github.com/apache/spark/commit/474c16ea5eae33da3f198c8f240a77ae8b1ce13e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage
AmplabJenkins removed a comment on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage URL: https://github.com/apache/spark/pull/24879#issuecomment-514885083 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13251/
[GitHub] [spark] AmplabJenkins removed a comment on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage
AmplabJenkins removed a comment on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage URL: https://github.com/apache/spark/pull/24879#issuecomment-514885080 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage
SparkQA commented on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage URL: https://github.com/apache/spark/pull/24879#issuecomment-514885053 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/13251/
[GitHub] [spark] AmplabJenkins commented on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage
AmplabJenkins commented on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage URL: https://github.com/apache/spark/pull/24879#issuecomment-514885080 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage
AmplabJenkins commented on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage URL: https://github.com/apache/spark/pull/24879#issuecomment-514885083 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13251/
[GitHub] [spark] AmplabJenkins removed a comment on issue #25236: [SPARK-28487][k8s] More responsive dynamic allocation with K8S.
AmplabJenkins removed a comment on issue #25236: [SPARK-28487][k8s] More responsive dynamic allocation with K8S. URL: https://github.com/apache/spark/pull/25236#issuecomment-514883978 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13250/
[GitHub] [spark] AmplabJenkins removed a comment on issue #25236: [SPARK-28487][k8s] More responsive dynamic allocation with K8S.
AmplabJenkins removed a comment on issue #25236: [SPARK-28487][k8s] More responsive dynamic allocation with K8S. URL: https://github.com/apache/spark/pull/25236#issuecomment-514883973 Merged build finished. Test PASSed.
[GitHub] [spark] brkyvz commented on a change in pull request #24832: [SPARK-27845][SQL] DataSourceV2: InsertTable
brkyvz commented on a change in pull request #24832: [SPARK-27845][SQL] DataSourceV2: InsertTable URL: https://github.com/apache/spark/pull/24832#discussion_r307102960

File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala

```diff
@@ -284,9 +294,13 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
    */
   override def visitInsertIntoTable(
       ctx: InsertIntoTableContext): InsertTableParams = withOrigin(ctx) {
-    val tableIdent = visitTableIdentifier(ctx.tableIdentifier)
+    val tableIdent = visitMultipartIdentifier(ctx.multipartIdentifier)
     val partitionKeys = Option(ctx.partitionSpec).map(visitPartitionSpec).getOrElse(Map.empty)
+    if (ctx.EXISTS != null) {
```

Review comment: what's the point of adding this to the parser if we're not going to support it?
[GitHub] [spark] brkyvz commented on a change in pull request #24832: [SPARK-27845][SQL] DataSourceV2: InsertTable
brkyvz commented on a change in pull request #24832: [SPARK-27845][SQL] DataSourceV2: InsertTable URL: https://github.com/apache/spark/pull/24832#discussion_r307103799

File path: sql/core/src/test/scala/org/apache/spark/sql/sources/v2/TestInMemoryTableCatalog.scala

```diff
@@ -149,49 +161,89 @@ private class InMemoryTable(
   }

   override def newWriteBuilder(options: CaseInsensitiveStringMap): WriteBuilder = {
-    new WriteBuilder with SupportsTruncate {
-      private var shouldTruncate: Boolean = false
+    new WriteBuilder with SupportsTruncate with SupportsOverwrite with SupportsDynamicOverwrite {
+      private var writer: BatchWrite = Append

       override def truncate(): WriteBuilder = {
-        shouldTruncate = true
+        assert(writer == Append)
+        writer = TruncateAndAppend
+        this
+      }
+
+      override def overwrite(filters: Array[Filter]): WriteBuilder = {
+        assert(writer == Append)
+        writer = new Overwrite(filters)
         this
       }

-      override def buildForBatch(): BatchWrite = {
-        if (shouldTruncate) TruncateAndAppend else Append
+      override def overwriteDynamicPartitions(): WriteBuilder = {
+        assert(writer == Append)
+        writer = DynamicOverwrite
+        this
       }
+
+      override def buildForBatch(): BatchWrite = writer
     }
   }

-  private object TruncateAndAppend extends BatchWrite {
+  private abstract class TestBatchWrite extends BatchWrite {
     override def createBatchWriterFactory(): DataWriterFactory = {
       BufferedRowsWriterFactory
     }

-    override def commit(messages: Array[WriterCommitMessage]): Unit = {
-      replaceData(messages.map(_.asInstanceOf[BufferedRows]))
+    override def abort(messages: Array[WriterCommitMessage]): Unit = {
+    }
+  }

-    override def abort(messages: Array[WriterCommitMessage]): Unit = {
+  private object Append extends TestBatchWrite {
+    override def commit(messages: Array[WriterCommitMessage]): Unit = dataMap.synchronized {
+      withData(messages.map(_.asInstanceOf[BufferedRows]))
     }
   }

-  private object Append extends BatchWrite {
-    override def createBatchWriterFactory(): DataWriterFactory = {
-      BufferedRowsWriterFactory
+  private object DynamicOverwrite extends TestBatchWrite {
+    override def commit(messages: Array[WriterCommitMessage]): Unit = dataMap.synchronized {
+      val newData = messages.map(_.asInstanceOf[BufferedRows])
+      dataMap --= newData.flatMap(_.rows.map(getKey))
+      withData(newData)
     }
+  }

-    override def commit(messages: Array[WriterCommitMessage]): Unit = {
-      replaceData(data ++ messages.map(_.asInstanceOf[BufferedRows]))
+  private class Overwrite(filters: Array[Filter]) extends TestBatchWrite {
+    override def commit(messages: Array[WriterCommitMessage]): Unit = dataMap.synchronized {
+      val deleteKeys = dataMap.keys.filter { partValues =>
+        filters.exists {
+          case EqualTo(attr, value) =>
+            partFieldNames.zipWithIndex.find(_._1 == attr) match {
+              case Some((_, partIndex)) =>
+                value == partValues(partIndex)
+              case _ =>
+                throw new IllegalArgumentException(s"Unknown filter attribute: $attr")
+            }
+          case f @ _ =>
+            throw new IllegalArgumentException(s"Unsupported filter type: $f")
+        }
+      }
+      dataMap --= deleteKeys
+      withData(messages.map(_.asInstanceOf[BufferedRows]))
     }
+  }

-    override def abort(messages: Array[WriterCommitMessage]): Unit = {
+  private object TruncateAndAppend extends TestBatchWrite {
+    override def commit(messages: Array[WriterCommitMessage]): Unit = dataMap.synchronized {
+      dataMap = mutable.Map.empty
```

Review comment: @rdblue You forgot to address this?
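For context on the `Overwrite` commit being discussed: it deletes every partition key that satisfies the pushed-down `EqualTo` filters before writing the new rows. Below is a simplified sketch of that key-matching logic, with plain Scala collections standing in for the test catalog's `dataMap`; all names and data here are illustrative, not Spark's API.

```scala
// Illustrative sketch: drop partition keys that satisfy every EqualTo filter,
// mimicking the Overwrite commit in the in-memory test catalog.
case class EqualTo(attr: String, value: Any)

val partFieldNames = Seq("region", "day")
var dataMap = Map(
  Seq("us", "mon") -> "rows-a",
  Seq("us", "tue") -> "rows-b",
  Seq("eu", "mon") -> "rows-c")

def overwrite(filters: Seq[EqualTo]): Unit = {
  val deleteKeys = dataMap.keys.filter { partValues =>
    filters.forall { case EqualTo(attr, value) =>
      partFieldNames.indexOf(attr) match {
        case -1 => throw new IllegalArgumentException(s"Unknown filter attribute: $attr")
        case i  => value == partValues(i)
      }
    }
  }
  dataMap --= deleteKeys  // new rows would be written here afterwards
}

overwrite(Seq(EqualTo("region", "us")))
assert(dataMap.keySet == Set(Seq("eu", "mon")))
```

Using `forall` across the filters gives AND semantics over the pushed-down predicates, which matches how a static partition overwrite selects the partitions to replace.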
[GitHub] [spark] AmplabJenkins commented on issue #25236: [SPARK-28487][k8s] More responsive dynamic allocation with K8S.
AmplabJenkins commented on issue #25236: [SPARK-28487][k8s] More responsive dynamic allocation with K8S. URL: https://github.com/apache/spark/pull/25236#issuecomment-514883978 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13250/
[GitHub] [spark] AmplabJenkins commented on issue #25236: [SPARK-28487][k8s] More responsive dynamic allocation with K8S.
AmplabJenkins commented on issue #25236: [SPARK-28487][k8s] More responsive dynamic allocation with K8S. URL: https://github.com/apache/spark/pull/25236#issuecomment-514883973 Merged build finished. Test PASSed.
[GitHub] [spark] brkyvz commented on a change in pull request #24832: [SPARK-27845][SQL] DataSourceV2: InsertTable
brkyvz commented on a change in pull request #24832: [SPARK-27845][SQL] DataSourceV2: InsertTable URL: https://github.com/apache/spark/pull/24832#discussion_r307103888

File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceResolution.scala

```diff
@@ -23,17 +23,19 @@ import scala.collection.mutable
 import org.apache.spark.sql.{AnalysisException, SaveMode}
 import org.apache.spark.sql.catalog.v2.{CatalogPlugin, Identifier, LookupCatalog, TableCatalog}
-import org.apache.spark.sql.catalog.v2.expressions.Transform
+import org.apache.spark.sql.catalog.v2.expressions.{FieldReference, IdentityTransform, Transform}
```

Review comment: are any of the changes here needed?
[GitHub] [spark] brkyvz commented on a change in pull request #24832: [SPARK-27845][SQL] DataSourceV2: InsertTable
brkyvz commented on a change in pull request #24832: [SPARK-27845][SQL] DataSourceV2: InsertTable URL: https://github.com/apache/spark/pull/24832#discussion_r307103748

File path: sql/core/src/test/scala/org/apache/spark/sql/sources/v2/TestInMemoryTableCatalog.scala

```diff
@@ -157,43 +170,86 @@ class InMemoryTable(
   override def newWriteBuilder(options: CaseInsensitiveStringMap): WriteBuilder = {
     TestInMemoryTableCatalog.maybeSimulateFailedTableWrite(options)
-    new WriteBuilder with SupportsTruncate {
-      private var shouldTruncate: Boolean = false
+
+    new WriteBuilder with SupportsTruncate with SupportsOverwrite with SupportsDynamicOverwrite {
+      private var writer: BatchWrite = Append

       override def truncate(): WriteBuilder = {
-        shouldTruncate = true
+        assert(writer == Append)
+        writer = TruncateAndAppend
         this
       }

-      override def buildForBatch(): BatchWrite = {
-        if (shouldTruncate) TruncateAndAppend else Append
+      override def overwrite(filters: Array[Filter]): WriteBuilder = {
+        assert(writer == Append)
+        writer = new Overwrite(filters)
+        this
       }
+
+      override def overwriteDynamicPartitions(): WriteBuilder = {
+        assert(writer == Append)
+        writer = DynamicOverwrite
+        this
+      }
+
+      override def buildForBatch(): BatchWrite = writer
     }
   }

-  private object TruncateAndAppend extends BatchWrite {
+  private abstract class TestBatchWrite extends BatchWrite {
     override def createBatchWriterFactory(): DataWriterFactory = {
       BufferedRowsWriterFactory
     }

-    override def commit(messages: Array[WriterCommitMessage]): Unit = {
-      replaceData(messages.map(_.asInstanceOf[BufferedRows]))
+    override def abort(messages: Array[WriterCommitMessage]): Unit = {
+    }
+  }

-    override def abort(messages: Array[WriterCommitMessage]): Unit = {
+  private object Append extends TestBatchWrite {
+    override def commit(messages: Array[WriterCommitMessage]): Unit = dataMap.synchronized {
+      withData(messages.map(_.asInstanceOf[BufferedRows]))
     }
   }

-  private object Append extends BatchWrite {
-    override def createBatchWriterFactory(): DataWriterFactory = {
-      BufferedRowsWriterFactory
+  private object DynamicOverwrite extends TestBatchWrite {
+    override def commit(messages: Array[WriterCommitMessage]): Unit = dataMap.synchronized {
+      val newData = messages.map(_.asInstanceOf[BufferedRows])
+      dataMap --= newData.flatMap(_.rows.map(getKey))
+      withData(newData)
     }
+  }

-    override def commit(messages: Array[WriterCommitMessage]): Unit = {
-      replaceData(data ++ messages.map(_.asInstanceOf[BufferedRows]))
+  private class Overwrite(filters: Array[Filter]) extends TestBatchWrite {
+    override def commit(messages: Array[WriterCommitMessage]): Unit = dataMap.synchronized {
+      val deleteKeys = dataMap.keys.filter { partValues =>
+        filters.flatMap(splitAnd).forall {
+          case EqualTo(attr, value) =>
+            partFieldNames.zipWithIndex.find(_._1 == attr) match {
+              case Some((_, partIndex)) =>
+                value == partValues(partIndex)
+              case _ =>
+                throw new IllegalArgumentException(s"Unknown filter attribute: $attr")
+            }
+          case f @ _ =>
```

Review comment: nit, no need for `@ _`
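The `@ _` nit above is purely syntactic: in Scala, `case f @ _` binds `f` to a wildcard pattern that matches anything, which is exactly what a bare `case f` already does. A minimal, self-contained sketch (the `Filter` hierarchy here is a stand-in for illustration, not Spark's):

```scala
// Demo: `f @ _` and a bare variable pattern `f` have identical semantics.
sealed trait Filter
case class EqualTo(attr: String, value: Any) extends Filter
case class GreaterThan(attr: String, value: Any) extends Filter

def describeVerbose(filter: Filter): String = filter match {
  case EqualTo(a, v) => s"$a = $v"
  case f @ _         => s"unsupported: $f"  // binder attached to a wildcard
}

def describeTerse(filter: Filter): String = filter match {
  case EqualTo(a, v) => s"$a = $v"
  case f             => s"unsupported: $f"  // same behavior, shorter
}

assert(describeVerbose(EqualTo("a", 2)) == "a = 2")
assert(describeVerbose(GreaterThan("x", 1)) == describeTerse(GreaterThan("x", 1)))
```

The `x @ p` binder form earns its keep only when `p` is more specific than `_`, e.g. `case f @ EqualTo(_, _)` to keep the whole matched value while still constraining its shape.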
[GitHub] [spark] brkyvz commented on a change in pull request #24832: [SPARK-27845][SQL] DataSourceV2: InsertTable
brkyvz commented on a change in pull request #24832: [SPARK-27845][SQL] DataSourceV2: InsertTable URL: https://github.com/apache/spark/pull/24832#discussion_r307058102

File path: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4

```diff
@@ -294,8 +294,8 @@ query
     ;

 insertInto
-    : INSERT OVERWRITE TABLE tableIdentifier (partitionSpec (IF NOT EXISTS)?)?        #insertOverwriteTable
-    | INSERT INTO TABLE? tableIdentifier partitionSpec?                               #insertIntoTable
+    : INSERT OVERWRITE TABLE? multipartIdentifier (partitionSpec (IF NOT EXISTS)?)?   #insertOverwriteTable
+    | INSERT INTO TABLE? multipartIdentifier partitionSpec? (IF NOT EXISTS)?          #insertIntoTable
```

Review comment: do we need to wrap this with parentheses, `(partitionSpec (IF NOT EXISTS)?)?`, like the line above? Otherwise, what happens if there is an `IF NOT EXISTS` but no `partitionSpec`? If the table does not exist, wouldn't that be a CTAS?
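Reading the question above against the overwrite branch: keeping `IF NOT EXISTS` inside the parenthesized group ties it to a partition spec, so a bare `INSERT INTO t IF NOT EXISTS` would no longer parse. A sketch of that parenthesized alternative (purely an illustration of the reviewer's suggestion, not what the PR merged):

```
insertInto
    : INSERT OVERWRITE TABLE? multipartIdentifier (partitionSpec (IF NOT EXISTS)?)?   #insertOverwriteTable
    | INSERT INTO TABLE? multipartIdentifier (partitionSpec (IF NOT EXISTS)?)?        #insertIntoTable
    ;
```

With this shape, `IF NOT EXISTS` can only follow a `partitionSpec`, sidestepping the ambiguity with a table-level `IF NOT EXISTS` (which would read like CTAS semantics).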
[GitHub] [spark] SparkQA commented on issue #25236: [SPARK-28487][k8s] More responsive dynamic allocation with K8S.
SparkQA commented on issue #25236: [SPARK-28487][k8s] More responsive dynamic allocation with K8S. URL: https://github.com/apache/spark/pull/25236#issuecomment-514883941 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/13250/
[GitHub] [spark] SparkQA removed a comment on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage
SparkQA removed a comment on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage URL: https://github.com/apache/spark/pull/24879#issuecomment-514881351 **[Test build #108149 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108149/testReport)** for PR 24879 at commit [`6e5fcf6`](https://github.com/apache/spark/commit/6e5fcf64472cd25785673b3c9599cc59360d9381).
[GitHub] [spark] AmplabJenkins removed a comment on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage
AmplabJenkins removed a comment on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage URL: https://github.com/apache/spark/pull/24879#issuecomment-514883509 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108149/
[GitHub] [spark] AmplabJenkins removed a comment on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage
AmplabJenkins removed a comment on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage URL: https://github.com/apache/spark/pull/24879#issuecomment-514883504 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage
SparkQA commented on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage URL: https://github.com/apache/spark/pull/24879#issuecomment-514883450 **[Test build #108149 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108149/testReport)** for PR 24879 at commit [`6e5fcf6`](https://github.com/apache/spark/commit/6e5fcf64472cd25785673b3c9599cc59360d9381). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage
AmplabJenkins commented on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage URL: https://github.com/apache/spark/pull/24879#issuecomment-514883509 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108149/
[GitHub] [spark] AmplabJenkins commented on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage
AmplabJenkins commented on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage URL: https://github.com/apache/spark/pull/24879#issuecomment-514883504 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage
SparkQA commented on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage URL: https://github.com/apache/spark/pull/24879#issuecomment-514882742 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/13251/
[GitHub] [spark] maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r307102653
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala
## @@ -228,6 +234,156 @@ case class FilterExec(condition: Expression, child: SparkPlan)
   override def outputPartitioning: Partitioning = child.outputPartitioning
 }
+
+/**
+ * Physical plan node for a recursive table that encapsulates the physical plans of the anchor
+ * terms, the logical plans of the recursive terms, and the maximum number of rows to return.
+ *
+ * Anchor terms are physical plans and they are used to initialize the query in the first run.
+ * Recursive terms are used to extend the result with new rows. They are logical plans and
+ * contain references to the result of the previous iteration or to the accumulated result so
+ * far. These references are updated with new statistics, compiled to physical plans and then
+ * updated to reflect the appropriate RDD before execution.
+ *
+ * The execution terminates once the anchor terms or the current iteration of the recursive
+ * terms return no rows, or the number of accumulated rows reaches the limit.
+ *
+ * During the execution of a recursive query the previously computed results are reused multiple
+ * times. To avoid massive recomputation of these pieces of the final result, they are cached.
+ *
+ * @param name the name of the recursive table
+ * @param anchorTerms these children are used for initializing the query
+ * @param recursiveTerms these children are used for extending the set of results with new rows
+ *                       based on the results of the previous iteration (or the anchor in the
+ *                       first iteration)
+ * @param limit the maximum number of rows to return
+ */
+case class RecursiveTableExec(
+    name: String,
+    anchorTerms: Seq[SparkPlan],
+    @transient val recursiveTerms: Seq[LogicalPlan],
+    limit: Option[Long]) extends SparkPlan {
+
+  override def children: Seq[SparkPlan] = anchorTerms
+
+  override def output: Seq[Attribute] = anchorTerms.head.output.map(_.withNullability(true))
+
+  override def simpleString(maxFields: Int): String =
+    s"RecursiveTable $name${limit.map(", " + _).getOrElse("")}"
+
+  override def innerChildren: Seq[QueryPlan[_]] = recursiveTerms ++ super.innerChildren
+
+  override protected def doExecute(): RDD[InternalRow] = {
+    val storageLevel = StorageLevel.fromString(conf.getConf(SQLConf.RECURSION_CACHE_STORAGE_LEVEL))
+
+    val prevIterationRDDs = ArrayBuffer.empty[RDD[InternalRow]]
+    var prevIterationCount = 0L
+
+    val anchorTermsIterator = anchorTerms.iterator
+    while (anchorTermsIterator.hasNext && limit.forall(_ > prevIterationCount)) {
+      val anchorTerm = anchorTermsIterator.next()
+
+      lazy val cumulatedResult = if (prevIterationRDDs.size > 1) {
+        sparkContext.union(prevIterationRDDs)
+      } else {
+        prevIterationRDDs.head
+      }
+
+      anchorTerm.foreach {
+        case rr: RecursiveReferenceExec if rr.name == name => rr.recursiveTable = cumulatedResult
+        case _ =>
+      }
+
+      val rdd = anchorTerm.execute().map(_.copy()).persist(storageLevel)
+      val count = rdd.count()
+      if (count > 0) {
+        prevIterationRDDs += rdd
+        prevIterationCount += count
+      }
+    }
+
+    val cumulatedRDDs = ArrayBuffer(prevIterationRDDs: _*)
+    var cumulatedCount = prevIterationCount
+    var level = 0
+    val levelLimit = conf.getConf(SQLConf.RECURSION_LEVEL_LIMIT)

Review comment: `conf.recursionLevelLimit`? Also, can you move this definition outside the loop?
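The review above concerns the fixpoint loop that drives recursive query execution: evaluate the anchor terms once to seed the result, then repeatedly evaluate the recursive term against the previous iteration's rows until an iteration produces no new rows, the row limit is hit, or the level limit is exceeded. A minimal Python sketch of that loop (all names are illustrative, not Spark's API; `anchor_terms` are callables producing the seed rows and `recursive_term` maps one iteration's rows to the next):

```python
def evaluate_recursive_table(anchor_terms, recursive_term, row_limit=None, level_limit=100):
    """Iteratively evaluate a recursive table to a fixpoint.

    anchor_terms: list of callables returning lists of rows (the seed).
    recursive_term: callable taking the previous iteration's rows and
    returning the next iteration's rows.  Illustrative sketch only.
    """
    result = []
    prev = []

    # Anchor phase: initialize the result, honoring the optional row limit.
    for anchor in anchor_terms:
        if row_limit is not None and len(result) >= row_limit:
            break
        rows = anchor()
        prev.extend(rows)
        result.extend(rows)

    # Recursive phase: extend until an iteration adds no rows, the row
    # limit is reached, or the level limit is exceeded.
    level = 0
    while prev and (row_limit is None or len(result) < row_limit):
        level += 1
        if level > level_limit:
            raise RuntimeError(f"recursion level limit {level_limit} exceeded")
        prev = recursive_term(prev)
        result.extend(prev)

    return result if row_limit is None else result[:row_limit]
```

For example, seeding with `[1]` and a recursive term that increments rows up to 5 yields `[1, 2, 3, 4, 5]`; the level limit guards against a recursive term that never converges, mirroring the `RECURSION_LEVEL_LIMIT` check discussed in the review.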
[GitHub] [spark] maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r307102482
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/basicPhysicalOperators.scala
## @@ -228,6 +234,156 @@ case class FilterExec(condition: Expression, child: SparkPlan)
   override def outputPartitioning: Partitioning = child.outputPartitioning
 }
+
+/**
+ * Physical plan node for a recursive table that encapsulates the physical plans of the anchor
+ * terms, the logical plans of the recursive terms, and the maximum number of rows to return.
+ *
+ * Anchor terms are physical plans and they are used to initialize the query in the first run.
+ * Recursive terms are used to extend the result with new rows. They are logical plans and
+ * contain references to the result of the previous iteration or to the accumulated result so
+ * far. These references are updated with new statistics, compiled to physical plans and then
+ * updated to reflect the appropriate RDD before execution.
+ *
+ * The execution terminates once the anchor terms or the current iteration of the recursive
+ * terms return no rows, or the number of accumulated rows reaches the limit.
+ *
+ * During the execution of a recursive query the previously computed results are reused multiple
+ * times. To avoid massive recomputation of these pieces of the final result, they are cached.
+ *
+ * @param name the name of the recursive table
+ * @param anchorTerms these children are used for initializing the query
+ * @param recursiveTerms these children are used for extending the set of results with new rows
+ *                       based on the results of the previous iteration (or the anchor in the
+ *                       first iteration)
+ * @param limit the maximum number of rows to return
+ */
+case class RecursiveTableExec(
+    name: String,
+    anchorTerms: Seq[SparkPlan],
+    @transient val recursiveTerms: Seq[LogicalPlan],
+    limit: Option[Long]) extends SparkPlan {
+
+  override def children: Seq[SparkPlan] = anchorTerms
+
+  override def output: Seq[Attribute] = anchorTerms.head.output.map(_.withNullability(true))
+
+  override def simpleString(maxFields: Int): String =
+    s"RecursiveTable $name${limit.map(", " + _).getOrElse("")}"
+
+  override def innerChildren: Seq[QueryPlan[_]] = recursiveTerms ++ super.innerChildren
+
+  override protected def doExecute(): RDD[InternalRow] = {
+    val storageLevel = StorageLevel.fromString(conf.getConf(SQLConf.RECURSION_CACHE_STORAGE_LEVEL))

Review comment: `conf.recursionCacheStorageLevel`?
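Both review suggestions (`conf.recursionLevelLimit`, `conf.recursionCacheStorageLevel`) ask for the same thing: wrap raw `conf.getConf(SQLConf.KEY)` lookups behind named accessors so call sites are shorter and the key plus its parsing live in one place. A small Python sketch of that typed-accessor pattern (class name, keys, and defaults are all illustrative, not Spark's actual configuration):

```python
class SQLConf:
    """Sketch of the typed-accessor pattern the reviewer suggests:
    call sites read conf.recursion_level_limit instead of repeating
    the raw key and the parsing logic.  Names are illustrative."""

    _defaults = {
        "spark.sql.recursion.cacheStorageLevel": "MEMORY_AND_DISK",
        "spark.sql.recursion.levelLimit": "100",
    }

    def __init__(self, overrides=None):
        # Merge user-supplied settings over the defaults.
        self._settings = {**self._defaults, **(overrides or {})}

    def get_conf(self, key):
        # Raw, string-valued lookup (the verbose form being replaced).
        return self._settings[key]

    @property
    def recursion_cache_storage_level(self):
        return self.get_conf("spark.sql.recursion.cacheStorageLevel")

    @property
    def recursion_level_limit(self):
        # Parsing to int happens once, here, not at every call site.
        return int(self.get_conf("spark.sql.recursion.levelLimit"))
```

With this in place, `doExecute` could read `conf.recursion_level_limit` directly, which is what the review comment is driving at.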
[GitHub] [spark] SparkQA commented on issue #25236: [SPARK-28487][k8s] More responsive dynamic allocation with K8S.
SparkQA commented on issue #25236: [SPARK-28487][k8s] More responsive dynamic allocation with K8S. URL: https://github.com/apache/spark/pull/25236#issuecomment-514881458 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/13250/
[GitHub] [spark] SparkQA commented on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage
SparkQA commented on issue #24879: [SPARK-28042][K8S] Support using volume mount as local storage URL: https://github.com/apache/spark/pull/24879#issuecomment-514881351 **[Test build #108149 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108149/testReport)** for PR 24879 at commit [`6e5fcf6`](https://github.com/apache/spark/commit/6e5fcf64472cd25785673b3c9599cc59360d9381).
[GitHub] [spark] maropu commented on issue #25251: [MINOR] Trivial cleanups
maropu commented on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-514880267 At the same time, can you check for unused imports using IDE support?
[GitHub] [spark] SparkQA commented on issue #25236: [SPARK-28487][k8s] More responsive dynamic allocation with K8S.
SparkQA commented on issue #25236: [SPARK-28487][k8s] More responsive dynamic allocation with K8S. URL: https://github.com/apache/spark/pull/25236#issuecomment-514878393 **[Test build #108148 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108148/testReport)** for PR 25236 at commit [`fd0a1d2`](https://github.com/apache/spark/commit/fd0a1d2cb240e4a0f0f8e442e083dfa221cfce00).
[GitHub] [spark] dongjoon-hyun commented on issue #25236: [SPARK-28487][k8s] More responsive dynamic allocation with K8S.
dongjoon-hyun commented on issue #25236: [SPARK-28487][k8s] More responsive dynamic allocation with K8S. URL: https://github.com/apache/spark/pull/25236#issuecomment-514878133 Retest this please.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #25242: [SPARK-28497][SQL] Disallow upcasting complex data types to string type
HyukjinKwon commented on a change in pull request #25242: [SPARK-28497][SQL] Disallow upcasting complex data types to string type URL: https://github.com/apache/spark/pull/25242#discussion_r307098867
## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/EncoderResolutionSuite.scala
## @@ -196,6 +196,43 @@ class EncoderResolutionSuite extends PlanTest {
     encoder.resolveAndBind(attrs)
   }
 
+  test("SPARK-28497: complex type is not compatible with string encoder schema") {
+    val encoder = ExpressionEncoder[String]
+
+    {
+      val attrs = Seq('a.struct('x.long))
+      assert(intercept[AnalysisException](encoder.resolveAndBind(attrs)).message ==
+        s"""
+          |Cannot up cast `a` from struct to string.
+          |The type path of the target object is:
+          |- root class: "java.lang.String"
+          |You can either add an explicit cast to the input data or choose a higher precision type
+        """.stripMargin.trim + " of the field in the target object")
+    }
+
+    {
+      val attrs = Seq('a.array(StringType))
+      assert(intercept[AnalysisException](encoder.resolveAndBind(attrs)).message ==
+        s"""
+          |Cannot up cast `a` from array to string.

Review comment: Oh, it was the same comment as https://github.com/apache/spark/pull/25242#discussion_r307064357
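The test above asserts that resolving a struct or array input column against a `String` encoder raises an `AnalysisException` rather than silently casting. The rule being tested can be sketched as a small upcast-compatibility check in Python (a simplified illustration, not Spark's actual `Cast.canUpCast`; the type tags and numeric ordering are assumptions for the sketch):

```python
# Illustrative type tags; Spark models these as DataType subclasses.
COMPLEX = {"struct", "array", "map"}
NUMERIC_ORDER = ["byte", "short", "int", "long", "float", "double"]

def can_up_cast(from_type: str, to_type: str) -> bool:
    """Return True if from_type can be safely up-cast to to_type.

    Mirrors the rule under test: complex types (struct/array/map)
    must NOT be up-castable to string, while atomic-to-string and
    numeric widenings remain allowed.  Simplified sketch only.
    """
    if from_type == to_type:
        return True
    # The SPARK-28497 rule: no complex type may up-cast to string.
    if from_type in COMPLEX:
        return False
    # Any remaining atomic type widens safely to string.
    if to_type == "string":
        return True
    # Numeric widening, e.g. int -> long, is allowed; narrowing is not.
    if from_type in NUMERIC_ORDER and to_type in NUMERIC_ORDER:
        return NUMERIC_ORDER.index(from_type) <= NUMERIC_ORDER.index(to_type)
    return False
```

Under this rule `can_up_cast("int", "string")` holds but `can_up_cast("struct", "string")` does not, which is exactly the failure the `EncoderResolutionSuite` test expects.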
[GitHub] [spark] hddong commented on issue #23952: [SPARK-26929][SQL]fix table owner use user instead of principal when create table through spark-sql or beeline
hddong commented on issue #23952: [SPARK-26929][SQL]fix table owner use user instead of principal when create table through spark-sql or beeline URL: https://github.com/apache/spark/pull/23952#issuecomment-514876679 ok to test
[GitHub] [spark] cloud-fan commented on a change in pull request #25242: [SPARK-28497][SQL] Disallow upcasting complex data types to string type
cloud-fan commented on a change in pull request #25242: [SPARK-28497][SQL] Disallow upcasting complex data types to string type URL: https://github.com/apache/spark/pull/25242#discussion_r307097063
## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/EncoderResolutionSuite.scala
## @@ -196,6 +196,43 @@ class EncoderResolutionSuite extends PlanTest {
     encoder.resolveAndBind(attrs)
   }
 
+  test("SPARK-28497: complex type is not compatible with string encoder schema") {
+    val encoder = ExpressionEncoder[String]
+
+    {
+      val attrs = Seq('a.struct('x.long))
+      assert(intercept[AnalysisException](encoder.resolveAndBind(attrs)).message ==
+        s"""

Review comment: +1
[GitHub] [spark] AmplabJenkins removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
AmplabJenkins removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API URL: https://github.com/apache/spark/pull/25007#issuecomment-514875182 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108136/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
AmplabJenkins removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API URL: https://github.com/apache/spark/pull/25007#issuecomment-514875176 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
AmplabJenkins commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API URL: https://github.com/apache/spark/pull/25007#issuecomment-514875176 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
AmplabJenkins commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API URL: https://github.com/apache/spark/pull/25007#issuecomment-514875182 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108136/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
SparkQA removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API URL: https://github.com/apache/spark/pull/25007#issuecomment-514834222 **[Test build #108136 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108136/testReport)** for PR 25007 at commit [`9f17b9b`](https://github.com/apache/spark/commit/9f17b9bbf0d3d5677abf424c5b2d4d3b93dfc95a).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
AmplabJenkins removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API URL: https://github.com/apache/spark/pull/25007#issuecomment-514875071 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108140/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
AmplabJenkins removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API URL: https://github.com/apache/spark/pull/25007#issuecomment-514875069 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
AmplabJenkins commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API URL: https://github.com/apache/spark/pull/25007#issuecomment-514875071 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108140/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
AmplabJenkins commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API URL: https://github.com/apache/spark/pull/25007#issuecomment-514875069 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
SparkQA commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API URL: https://github.com/apache/spark/pull/25007#issuecomment-514874761 **[Test build #108136 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108136/testReport)** for PR 25007 at commit [`9f17b9b`](https://github.com/apache/spark/commit/9f17b9bbf0d3d5677abf424c5b2d4d3b93dfc95a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
SparkQA removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API URL: https://github.com/apache/spark/pull/25007#issuecomment-514837877 **[Test build #108140 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108140/testReport)** for PR 25007 at commit [`56fa450`](https://github.com/apache/spark/commit/56fa450b4d703c7bc34f338e2ed0cd21fc82a98c).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25249: [SPARK-28237][SQL] Enforce Idempotence for Once batches in RuleExecutor
AmplabJenkins removed a comment on issue #25249: [SPARK-28237][SQL] Enforce Idempotence for Once batches in RuleExecutor URL: https://github.com/apache/spark/pull/25249#issuecomment-514874301 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108143/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
SparkQA commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API URL: https://github.com/apache/spark/pull/25007#issuecomment-514874681 **[Test build #108140 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108140/testReport)** for PR 25007 at commit [`56fa450`](https://github.com/apache/spark/commit/56fa450b4d703c7bc34f338e2ed0cd21fc82a98c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #25249: [SPARK-28237][SQL] Enforce Idempotence for Once batches in RuleExecutor
AmplabJenkins commented on issue #25249: [SPARK-28237][SQL] Enforce Idempotence for Once batches in RuleExecutor URL: https://github.com/apache/spark/pull/25249#issuecomment-514874301 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108143/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #25249: [SPARK-28237][SQL] Enforce Idempotence for Once batches in RuleExecutor
SparkQA removed a comment on issue #25249: [SPARK-28237][SQL] Enforce Idempotence for Once batches in RuleExecutor URL: https://github.com/apache/spark/pull/25249#issuecomment-514854663 **[Test build #108143 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108143/testReport)** for PR 25249 at commit [`10678ad`](https://github.com/apache/spark/commit/10678ad436e9c50499703df96fae64f782e8722f).
[GitHub] [spark] AmplabJenkins commented on issue #25249: [SPARK-28237][SQL] Enforce Idempotence for Once batches in RuleExecutor
AmplabJenkins commented on issue #25249: [SPARK-28237][SQL] Enforce Idempotence for Once batches in RuleExecutor URL: https://github.com/apache/spark/pull/25249#issuecomment-514874295 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25249: [SPARK-28237][SQL] Enforce Idempotence for Once batches in RuleExecutor
AmplabJenkins removed a comment on issue #25249: [SPARK-28237][SQL] Enforce Idempotence for Once batches in RuleExecutor URL: https://github.com/apache/spark/pull/25249#issuecomment-514874295 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25249: [SPARK-28237][SQL] Enforce Idempotence for Once batches in RuleExecutor
SparkQA commented on issue #25249: [SPARK-28237][SQL] Enforce Idempotence for Once batches in RuleExecutor URL: https://github.com/apache/spark/pull/25249#issuecomment-514874161 **[Test build #108143 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108143/testReport)** for PR 25249 at commit [`10678ad`](https://github.com/apache/spark/commit/10678ad436e9c50499703df96fae64f782e8722f). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
AmplabJenkins removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API URL: https://github.com/apache/spark/pull/25007#issuecomment-514873120 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
AmplabJenkins removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API URL: https://github.com/apache/spark/pull/25007#issuecomment-514873123 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108139/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
AmplabJenkins commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API URL: https://github.com/apache/spark/pull/25007#issuecomment-514873123 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/108139/
[GitHub] [spark] AmplabJenkins commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
AmplabJenkins commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API URL: https://github.com/apache/spark/pull/25007#issuecomment-514873120 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
SparkQA removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API URL: https://github.com/apache/spark/pull/25007#issuecomment-514836023 **[Test build #108139 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108139/testReport)** for PR 25007 at commit [`e53a001`](https://github.com/apache/spark/commit/e53a001b30a15ebc06df6b62c5650ae6f3213477).
[GitHub] [spark] SparkQA commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
SparkQA commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API URL: https://github.com/apache/spark/pull/25007#issuecomment-514872726 **[Test build #108139 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108139/testReport)** for PR 25007 at commit [`e53a001`](https://github.com/apache/spark/commit/e53a001b30a15ebc06df6b62c5650ae6f3213477). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on issue #25251: [MINOR] Trivial cleanups
SparkQA commented on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-514872275 **[Test build #108147 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108147/testReport)** for PR 25251 at commit [`600444e`](https://github.com/apache/spark/commit/600444e9f7044c9972fb76599d99a985641e840d).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25251: [MINOR] Trivial cleanups
AmplabJenkins removed a comment on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-514871919 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25251: [MINOR] Trivial cleanups
AmplabJenkins removed a comment on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-514871920 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13249/
[GitHub] [spark] AmplabJenkins commented on issue #25251: [MINOR] Trivial cleanups
AmplabJenkins commented on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-514871919 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25251: [MINOR] Trivial cleanups
AmplabJenkins commented on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-514871920 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13249/
[GitHub] [spark] AmplabJenkins removed a comment on issue #25251: [MINOR] Trivial cleanups
AmplabJenkins removed a comment on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-514866728 Can one of the admins verify this patch?
[GitHub] [spark] HyukjinKwon removed a comment on issue #25251: [MINOR] Trivial cleanups
HyukjinKwon removed a comment on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-514871649 add to whitelist
[GitHub] [spark] HyukjinKwon commented on issue #25251: [MINOR] Trivial cleanups
HyukjinKwon commented on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-514871696 ok to test
[GitHub] [spark] HyukjinKwon commented on issue #25251: [MINOR] Trivial cleanups
HyukjinKwon commented on issue #25251: [MINOR] Trivial cleanups URL: https://github.com/apache/spark/pull/25251#issuecomment-514871649 add to whitelist
[GitHub] [spark] dongjoon-hyun closed pull request #25248: [SPARK-28152][SQL][2.4] Mapped ShortType to SMALLINT and FloatType to REAL for MsSqlServerDialect
dongjoon-hyun closed pull request #25248: [SPARK-28152][SQL][2.4] Mapped ShortType to SMALLINT and FloatType to REAL for MsSqlServerDialect URL: https://github.com/apache/spark/pull/25248
[GitHub] [spark] maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r307093683 ## File path: sql/core/src/test/resources/sql-tests/results/cte.sql.out ## @@ -328,16 +328,891 @@ struct -- !query 25 -DROP VIEW IF EXISTS t +WITH r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r WHERE level < 10 +) +SELECT * FROM r -- !query 25 schema struct<> -- !query 25 output - +org.apache.spark.sql.AnalysisException +Table or view not found: r; line 4 pos 24 -- !query 26 -DROP VIEW IF EXISTS t2 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r WHERE level < 10 +) +SELECT * FROM r -- !query 26 schema -struct<> +struct -- !query 26 output +0 +1 +10 +2 +3 +4 +5 +6 +7 +8 +9 + + +-- !query 27 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT * FROM r +-- !query 27 schema +struct<> +-- !query 27 output +org.apache.spark.SparkException +Recursion level limit 100 reached but query has not exhausted, try increasing spark.sql.cte.recursion.level.limit + + +-- !query 28 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT * FROM r LIMIT 10 +-- !query 28 schema +struct +-- !query 28 output +0 +1 +2 +3 +4 +5 +6 +7 +8 +9 + + +-- !query 29 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT level, level FROM r LIMIT 10 +-- !query 29 schema +struct +-- !query 29 output +0 0 +1 1 +2 2 +3 3 +4 4 +5 5 +6 6 +7 7 +8 8 +9 9 + + +-- !query 30 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT level, level FROM r ORDER BY level LIMIT 10 +-- !query 30 schema +struct<> +-- !query 30 output +org.apache.spark.SparkException +Recursion level limit 100 reached but query has not exhausted, try increasing spark.sql.cte.recursion.level.limit + + +-- !query 31 +WITH RECURSIVE r(c) AS ( + SELECT 'a' + UNION ALL + SELECT c 
|| ' b' FROM r WHERE LENGTH(c) < 10 +) +SELECT * FROM r +-- !query 31 schema +struct +-- !query 31 output +a +a b +a b b +a b b b +a b b b b +a b b b b b + + +-- !query 32 +WITH RECURSIVE r(level) AS ( + SELECT level + 1 FROM r WHERE level < 10 + UNION ALL + VALUES (0) +) +SELECT * FROM r +-- !query 32 schema +struct +-- !query 32 output +0 +1 +10 +2 +3 +4 +5 +6 +7 +8 +9 + + +-- !query 33 +WITH RECURSIVE r(level, data) AS ( + VALUES (0, 'A') + UNION ALL + VALUES (0, 'B') + UNION ALL + SELECT level + 1, data || 'C' FROM r WHERE level < 3 +) +SELECT * FROM r +-- !query 33 schema +struct +-- !query 33 output +0 A +0 B +1 AC +1 BC +2 ACC +2 BCC +3 ACCC +3 BCCC + + +-- !query 34 +WITH RECURSIVE r(level, data) AS ( + VALUES (0, 'A') + UNION ALL + SELECT level + 1, data || 'B' FROM r WHERE level < 2 + UNION ALL + SELECT level + 1, data || 'C' FROM r WHERE level < 3 +) +SELECT * FROM r +-- !query 34 schema +struct +-- !query 34 output +0 A +1 AB +1 AC +2 ABB +2 ABC +2 ACB +2 ACC +3 ABBC +3 ABCC +3 ACBC +3 ACCC + + +-- !query 35 +WITH RECURSIVE r(level, data) AS ( + VALUES (0, 'A') + UNION ALL + VALUES (0, 'B') + UNION ALL + SELECT level + 1, data || 'C' FROM r WHERE level < 2 + UNION ALL + SELECT level + 1, data || 'D' FROM r WHERE level < 3 +) +SELECT * FROM r +-- !query 35 schema +struct +-- !query 35 output +0 A +0 B +1 AC +1 AD +1 BC +1 BD +2 ACC +2 ACD +2 ADC +2 ADD +2 BCC +2 BCD +2 BDC +2 BDD +3 ACCD +3 ACDD +3 ADCD +3 ADDD +3 BCCD +3 BCDD +3 BDCD +3 BDDD + + +-- !query 36 +WITH RECURSIVE r(level) AS ( + SELECT level + 1 FROM r WHERE level < 3 +) +SELECT * FROM r +-- !query 36 schema +struct<> +-- !query 36 output +org.apache.spark.sql.AnalysisException +Recursive query r should contain UNION or UNION ALL statements only. 
This error can also be caused by ORDER BY or LIMIT keywords used on result of UNION or UNION ALL.; + + +-- !query 37 +WITH RECURSIVE r(level) AS ( + VALUES (0), (0) + UNION + SELECT (level + 1) % 10 FROM r +) +SELECT * FROM r +-- !query 37 schema +struct +-- !query 37 output +0 +1 +2 +3 +4 +5 +6 +7 +8 +9 + + +-- !query 38 +WITH RECURSIVE r(level) AS ( + VALUES (0) + INTERSECT + SELECT level + 1 FROM r WHERE level < 10 +) +SELECT * FROM r +-- !query 38 schema +struct<> +-- !query 38 output +org.apache.spark.sql.AnalysisException +Recursive query r should contain UNION or UNION ALL statements only. This error can also be caused by ORDER BY or LIMIT keywords used on result of UNION or UNION ALL.; + + +-- !query 39 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r WHERE (SELECT SUM(level) FROM r) < 10 +) +SELECT * FROM r +-- !query 39 schema +struct<> +-- !
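The `cte.sql.out` cases quoted above show how a recursive CTE is meant to evaluate: seed with the anchor rows, re-apply the recursive term to the previous iteration's rows, and stop when an iteration yields nothing or the level limit is hit. A minimal sketch of that fixpoint loop in plain Scala (an illustration only, not the PR's implementation; the `levelLimit` guard mirrors the `spark.sql.cte.recursion.level.limit` error message quoted above):

```scala
// Evaluate a recursive CTE such as
//   WITH RECURSIVE r(level) AS (
//     VALUES (0) UNION ALL SELECT level + 1 FROM r WHERE level < 10
//   ) SELECT * FROM r
// by iterating the recursive term to a fixpoint.
def evalRecursiveCte(
    anchor: Seq[Int],
    recursiveTerm: Seq[Int] => Seq[Int],
    levelLimit: Int = 100): Seq[Int] = {
  var result = anchor
  var current = anchor
  var level = 0
  while (current.nonEmpty) {
    level += 1
    if (level > levelLimit) {
      sys.error(s"Recursion level limit $levelLimit reached but query has not exhausted")
    }
    current = recursiveTerm(current) // the recursive term only sees the previous iteration
    result = result ++ current       // UNION ALL keeps every produced row
  }
  result
}

// Recursive term of the first query above: SELECT level + 1 FROM r WHERE level < 10
val rows = evalRecursiveCte(Seq(0), prev => prev.filter(_ < 10).map(_ + 1))
println(rows.sorted.mkString(" "))  // 0 1 2 3 4 5 6 7 8 9 10
```

Dropping the `WHERE level < 10` guard reproduces the "Recursion level limit 100 reached" failure in query 27, since the working set never becomes empty.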
[GitHub] [spark] maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r307093581 ## File path: sql/core/src/test/resources/sql-tests/results/cte.sql.out ## (quotes the same cte.sql.out diff hunk as the previous comment)
[GitHub] [spark] maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r307093321 ## File path: sql/core/src/test/resources/sql-tests/results/cte.sql.out ## (quotes the same cte.sql.out diff hunk as the first maropu comment above)
[GitHub] [spark] HyukjinKwon commented on a change in pull request #25245: [SPARK-25382][SQL][PYSPARK] Remove ImageSchema.readImages in 3.0
HyukjinKwon commented on a change in pull request #25245: [SPARK-25382][SQL][PYSPARK] Remove ImageSchema.readImages in 3.0 URL: https://github.com/apache/spark/pull/25245#discussion_r307093160 ## File path: mllib/src/main/scala/org/apache/spark/ml/source/image/ImageFileFormat.scala ## @@ -98,3 +103,163 @@ private[image] class ImageFileFormat extends FileFormat with DataSourceRegister } } } + +object ImageFileFormat { + + val undefinedImageType = "Undefined" + + /** + * (Scala-specific) OpenCV type mapping supported + */ + val ocvTypes: Map[String, Int] = Map( +undefinedImageType -> -1, +"CV_8U" -> 0, "CV_8UC1" -> 0, "CV_8UC3" -> 16, "CV_8UC4" -> 24 + ) + + /** + * (Java-specific) OpenCV type mapping supported + */ + val javaOcvTypes: java.util.Map[String, Int] = ocvTypes.asJava + + /** + * Schema for the image column: Row(String, Int, Int, Int, Int, Array[Byte]) + */ + private[image] val columnSchema = StructType( +StructField("origin", StringType, true) :: +StructField("height", IntegerType, true) :: +StructField("width", IntegerType, true) :: +StructField("nChannels", IntegerType, true) :: +// OpenCV-compatible type: CV_8UC3 in most cases +StructField("mode", IntegerType, true) :: Review comment: That's not true in structured streaming. Shall we leave this unchanged here for now? It sounds orthogonal to this change.
[GitHub] [spark] HyukjinKwon commented on a change in pull request #25245: [SPARK-25382][SQL][PYSPARK] Remove ImageSchema.readImages in 3.0
HyukjinKwon commented on a change in pull request #25245: [SPARK-25382][SQL][PYSPARK] Remove ImageSchema.readImages in 3.0 URL: https://github.com/apache/spark/pull/25245#discussion_r307092949 ## File path: mllib/src/main/scala/org/apache/spark/ml/image/ImageSchema.scala ## @@ -1,266 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one or more - * contributor license agreements. See the NOTICE file distributed with - * this work for additional information regarding copyright ownership. - * The ASF licenses this file to You under the Apache License, Version 2.0 - * (the "License"); you may not use this file except in compliance with - * the License. You may obtain a copy of the License at - * - *http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ - -package org.apache.spark.ml.image - -import java.awt.Color -import java.awt.color.ColorSpace -import java.io.ByteArrayInputStream -import javax.imageio.ImageIO - -import scala.collection.JavaConverters._ - -import org.apache.spark.annotation.{Experimental, Since} -import org.apache.spark.input.PortableDataStream -import org.apache.spark.sql.{DataFrame, Row, SparkSession} -import org.apache.spark.sql.types._ - -/** - * :: Experimental :: - * Defines the image schema and methods to read and manipulate images. 
- */ -@Experimental -@Since("2.3.0") -object ImageSchema { - - val undefinedImageType = "Undefined" - - /** - * (Scala-specific) OpenCV type mapping supported - */ - val ocvTypes: Map[String, Int] = Map( -undefinedImageType -> -1, -"CV_8U" -> 0, "CV_8UC1" -> 0, "CV_8UC3" -> 16, "CV_8UC4" -> 24 - ) - - /** - * (Java-specific) OpenCV type mapping supported - */ - val javaOcvTypes: java.util.Map[String, Int] = ocvTypes.asJava - - /** - * Schema for the image column: Row(String, Int, Int, Int, Int, Array[Byte]) - */ - val columnSchema = StructType( -StructField("origin", StringType, true) :: -StructField("height", IntegerType, false) :: -StructField("width", IntegerType, false) :: -StructField("nChannels", IntegerType, false) :: -// OpenCV-compatible type: CV_8UC3 in most cases -StructField("mode", IntegerType, false) :: -// Bytes in OpenCV-compatible order: row-wise BGR in most cases -StructField("data", BinaryType, false) :: Nil) - - val imageFields: Array[String] = columnSchema.fieldNames - - /** - * DataFrame with a single column of images named "image" (nullable) - */ - val imageSchema = StructType(StructField("image", columnSchema, true) :: Nil) - - /** - * Gets the origin of the image - * - * @return The origin of the image - */ - def getOrigin(row: Row): String = row.getString(0) - - /** - * Gets the height of the image - * - * @return The height of the image - */ - def getHeight(row: Row): Int = row.getInt(1) - - /** - * Gets the width of the image - * - * @return The width of the image - */ - def getWidth(row: Row): Int = row.getInt(2) - - /** - * Gets the number of channels in the image - * - * @return The number of channels in the image - */ - def getNChannels(row: Row): Int = row.getInt(3) - - /** - * Gets the OpenCV representation as an int - * - * @return The OpenCV representation as an int - */ - def getMode(row: Row): Int = row.getInt(4) - - /** - * Gets the image data - * - * @return The image data - */ - def getData(row: Row): Array[Byte] = 
row.getAs[Array[Byte]](5) - - /** - * Default values for the invalid image - * - * @param origin Origin of the invalid image - * @return Row with the default values - */ - private[spark] def invalidImageRow(origin: String): Row = -Row(Row(origin, -1, -1, -1, ocvTypes(undefinedImageType), Array.ofDim[Byte](0))) - - /** - * Convert the compressed image (jpeg, png, etc.) into OpenCV - * representation and store it in DataFrame Row - * - * @param origin Arbitrary string that identifies the image - * @param bytes Image bytes (for example, jpeg) - * @return DataFrame Row or None (if the decompression fails) - */ - private[spark] def decode(origin: String, bytes: Array[Byte]): Option[Row] = { - -val img = try { - ImageIO.read(new ByteArrayInputStream(bytes)) -} catch { - // Catch runtime exception because `ImageIO` may throw unexcepted `RuntimeException`. - // But do not catch the declared `IOException` (regarded as FileSystem failure) - case _: RuntimeException => null -} - -if (img == null) { - None -} else { - val isGray = img.getColorModel.getColorSpace.getType == ColorSpace.TYPE_GRAY - val hasAlpha = img.getColorModel.hasA
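The deleted `ImageSchema.decode` quoted above is built on `javax.imageio`, which ships with the JDK, so its decode-or-`None` behavior can be exercised without Spark. A self-contained sketch (the `decodeDims` helper below is hypothetical, written here only to mirror the deleted logic, including the deliberate `RuntimeException` catch around `ImageIO.read`):

```scala
import java.awt.color.ColorSpace
import java.awt.image.BufferedImage
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import javax.imageio.ImageIO

// Parse compressed image bytes and return (height, width, nChannels),
// or None when ImageIO cannot decode them. ImageIO.read may throw an
// undeclared RuntimeException on malformed input, which the original
// code swallowed; it returns null when no reader matches the bytes.
def decodeDims(bytes: Array[Byte]): Option[(Int, Int, Int)] = {
  val img =
    try ImageIO.read(new ByteArrayInputStream(bytes))
    catch { case _: RuntimeException => null }
  Option(img).map { i =>
    val cm = i.getColorModel
    val nChannels =
      if (cm.hasAlpha) 4
      else if (cm.getColorSpace.getType == ColorSpace.TYPE_GRAY) 1
      else 3
    (i.getHeight, i.getWidth, nChannels)
  }
}

// Round-trip a tiny 4x3 RGB image through an in-memory PNG.
val out = new ByteArrayOutputStream()
ImageIO.write(new BufferedImage(4, 3, BufferedImage.TYPE_3BYTE_BGR), "png", out)
val dims = decodeDims(out.toByteArray)
println(dims)

// Garbage bytes decode to None rather than throwing.
val bad = decodeDims(Array[Byte](1, 2, 3))
println(bad)
```

This is the same contract the `image` file format relies on: undecodable files become the invalid-image row instead of failing the job.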
[GitHub] [spark] HyukjinKwon commented on a change in pull request #25245: [SPARK-25382][SQL][PYSPARK] Remove ImageSchema.readImages in 3.0
HyukjinKwon commented on a change in pull request #25245: [SPARK-25382][SQL][PYSPARK] Remove ImageSchema.readImages in 3.0 URL: https://github.com/apache/spark/pull/25245#discussion_r307092864

## File path: mllib/src/main/scala/org/apache/spark/ml/source/image/ImageFileFormat.scala

```
@@ -98,3 +103,163 @@ private[image] class ImageFileFormat extends FileFormat with DataSourceRegister
     }
   }
 }
+
+object ImageFileFormat {
```

Review comment: `BinaryFileFormat` has to be private. The only modules where we don't use `private[sql]` or `private[spark]` are the execution and catalyst modules, because we explicitly state that those modules are private as of SPARK-16813 and SPARK-16964. I don't think we should keep non-API instances public.
[GitHub] [spark] maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r307092843 ## File path: sql/core/src/test/resources/sql-tests/results/cte.sql.out ## @@ -328,16 +328,891 @@ struct -- !query 25 -DROP VIEW IF EXISTS t +WITH r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r WHERE level < 10 +) +SELECT * FROM r -- !query 25 schema struct<> -- !query 25 output - +org.apache.spark.sql.AnalysisException +Table or view not found: r; line 4 pos 24 -- !query 26 -DROP VIEW IF EXISTS t2 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r WHERE level < 10 +) +SELECT * FROM r -- !query 26 schema -struct<> +struct -- !query 26 output +0 +1 +10 +2 +3 +4 +5 +6 +7 +8 +9 + + +-- !query 27 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT * FROM r +-- !query 27 schema +struct<> +-- !query 27 output +org.apache.spark.SparkException +Recursion level limit 100 reached but query has not exhausted, try increasing spark.sql.cte.recursion.level.limit + + +-- !query 28 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT * FROM r LIMIT 10 +-- !query 28 schema +struct +-- !query 28 output +0 +1 +2 +3 +4 +5 +6 +7 +8 +9 + + +-- !query 29 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT level, level FROM r LIMIT 10 +-- !query 29 schema +struct +-- !query 29 output +0 0 +1 1 +2 2 +3 3 +4 4 +5 5 +6 6 +7 7 +8 8 +9 9 + + +-- !query 30 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT level, level FROM r ORDER BY level LIMIT 10 +-- !query 30 schema +struct<> +-- !query 30 output +org.apache.spark.SparkException +Recursion level limit 100 reached but query has not exhausted, try increasing spark.sql.cte.recursion.level.limit + + +-- !query 31 +WITH RECURSIVE r(c) AS ( + SELECT 'a' + UNION ALL + SELECT c 
|| ' b' FROM r WHERE LENGTH(c) < 10 +) +SELECT * FROM r +-- !query 31 schema +struct +-- !query 31 output +a +a b +a b b +a b b b +a b b b b +a b b b b b + + +-- !query 32 +WITH RECURSIVE r(level) AS ( + SELECT level + 1 FROM r WHERE level < 10 + UNION ALL + VALUES (0) +) +SELECT * FROM r +-- !query 32 schema +struct +-- !query 32 output +0 +1 +10 +2 +3 +4 +5 +6 +7 +8 +9 + + +-- !query 33 +WITH RECURSIVE r(level, data) AS ( + VALUES (0, 'A') + UNION ALL + VALUES (0, 'B') + UNION ALL + SELECT level + 1, data || 'C' FROM r WHERE level < 3 +) +SELECT * FROM r +-- !query 33 schema +struct +-- !query 33 output +0 A +0 B +1 AC +1 BC +2 ACC +2 BCC +3 ACCC +3 BCCC + + +-- !query 34 +WITH RECURSIVE r(level, data) AS ( + VALUES (0, 'A') + UNION ALL + SELECT level + 1, data || 'B' FROM r WHERE level < 2 + UNION ALL + SELECT level + 1, data || 'C' FROM r WHERE level < 3 +) +SELECT * FROM r +-- !query 34 schema +struct +-- !query 34 output +0 A +1 AB +1 AC +2 ABB +2 ABC +2 ACB +2 ACC +3 ABBC +3 ABCC +3 ACBC +3 ACCC + + +-- !query 35 +WITH RECURSIVE r(level, data) AS ( + VALUES (0, 'A') + UNION ALL + VALUES (0, 'B') + UNION ALL + SELECT level + 1, data || 'C' FROM r WHERE level < 2 + UNION ALL + SELECT level + 1, data || 'D' FROM r WHERE level < 3 +) +SELECT * FROM r Review comment: ``` psql:with-recursive.sql:98: ERROR: recursive reference to query "r" must not appear within its non-recursive term LINE 6: SELECT level + 1, data || 'C' FROM r WHERE level < 2 ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r307092786 ## File path: sql/core/src/test/resources/sql-tests/results/cte.sql.out ## @@ -328,16 +328,891 @@ struct -- !query 25 -DROP VIEW IF EXISTS t +WITH r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r WHERE level < 10 +) +SELECT * FROM r -- !query 25 schema struct<> -- !query 25 output - +org.apache.spark.sql.AnalysisException +Table or view not found: r; line 4 pos 24 -- !query 26 -DROP VIEW IF EXISTS t2 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r WHERE level < 10 +) +SELECT * FROM r -- !query 26 schema -struct<> +struct -- !query 26 output +0 +1 +10 +2 +3 +4 +5 +6 +7 +8 +9 + + +-- !query 27 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT * FROM r +-- !query 27 schema +struct<> +-- !query 27 output +org.apache.spark.SparkException +Recursion level limit 100 reached but query has not exhausted, try increasing spark.sql.cte.recursion.level.limit + + +-- !query 28 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT * FROM r LIMIT 10 +-- !query 28 schema +struct +-- !query 28 output +0 +1 +2 +3 +4 +5 +6 +7 +8 +9 + + +-- !query 29 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT level, level FROM r LIMIT 10 +-- !query 29 schema +struct +-- !query 29 output +0 0 +1 1 +2 2 +3 3 +4 4 +5 5 +6 6 +7 7 +8 8 +9 9 + + +-- !query 30 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT level, level FROM r ORDER BY level LIMIT 10 +-- !query 30 schema +struct<> +-- !query 30 output +org.apache.spark.SparkException +Recursion level limit 100 reached but query has not exhausted, try increasing spark.sql.cte.recursion.level.limit + + +-- !query 31 +WITH RECURSIVE r(c) AS ( + SELECT 'a' + UNION ALL + SELECT c 
|| ' b' FROM r WHERE LENGTH(c) < 10 +) +SELECT * FROM r +-- !query 31 schema +struct +-- !query 31 output +a +a b +a b b +a b b b +a b b b b +a b b b b b + + +-- !query 32 +WITH RECURSIVE r(level) AS ( + SELECT level + 1 FROM r WHERE level < 10 + UNION ALL + VALUES (0) +) +SELECT * FROM r +-- !query 32 schema +struct +-- !query 32 output +0 +1 +10 +2 +3 +4 +5 +6 +7 +8 +9 + + +-- !query 33 +WITH RECURSIVE r(level, data) AS ( + VALUES (0, 'A') + UNION ALL + VALUES (0, 'B') + UNION ALL + SELECT level + 1, data || 'C' FROM r WHERE level < 3 +) +SELECT * FROM r +-- !query 33 schema +struct +-- !query 33 output +0 A +0 B +1 AC +1 BC +2 ACC +2 BCC +3 ACCC +3 BCCC + + +-- !query 34 +WITH RECURSIVE r(level, data) AS ( + VALUES (0, 'A') + UNION ALL + SELECT level + 1, data || 'B' FROM r WHERE level < 2 + UNION ALL + SELECT level + 1, data || 'C' FROM r WHERE level < 3 +) +SELECT * FROM r Review comment: ``` psql:with-recursive.sql:86: ERROR: recursive reference to query "r" must not appear within its non-recursive term LINE 4: SELECT level + 1, data || 'B' FROM r WHERE level < 2 ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r307092625 ## File path: sql/core/src/test/resources/sql-tests/results/cte.sql.out ## @@ -328,16 +328,891 @@ struct -- !query 25 -DROP VIEW IF EXISTS t +WITH r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r WHERE level < 10 +) +SELECT * FROM r -- !query 25 schema struct<> -- !query 25 output - +org.apache.spark.sql.AnalysisException +Table or view not found: r; line 4 pos 24 -- !query 26 -DROP VIEW IF EXISTS t2 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r WHERE level < 10 +) +SELECT * FROM r -- !query 26 schema -struct<> +struct -- !query 26 output +0 +1 +10 +2 +3 +4 +5 +6 +7 +8 +9 + + +-- !query 27 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT * FROM r +-- !query 27 schema +struct<> +-- !query 27 output +org.apache.spark.SparkException +Recursion level limit 100 reached but query has not exhausted, try increasing spark.sql.cte.recursion.level.limit + + +-- !query 28 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT * FROM r LIMIT 10 +-- !query 28 schema +struct +-- !query 28 output +0 +1 +2 +3 +4 +5 +6 +7 +8 +9 + + +-- !query 29 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT level, level FROM r LIMIT 10 +-- !query 29 schema +struct +-- !query 29 output +0 0 +1 1 +2 2 +3 3 +4 4 +5 5 +6 6 +7 7 +8 8 +9 9 + + +-- !query 30 +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT level, level FROM r ORDER BY level LIMIT 10 +-- !query 30 schema +struct<> +-- !query 30 output +org.apache.spark.SparkException +Recursion level limit 100 reached but query has not exhausted, try increasing spark.sql.cte.recursion.level.limit + + +-- !query 31 +WITH RECURSIVE r(c) AS ( + SELECT 'a' + UNION ALL + SELECT c 
|| ' b' FROM r WHERE LENGTH(c) < 10 +) +SELECT * FROM r +-- !query 31 schema +struct +-- !query 31 output +a +a b +a b b +a b b b +a b b b b +a b b b b b + + +-- !query 32 +WITH RECURSIVE r(level) AS ( + SELECT level + 1 FROM r WHERE level < 10 + UNION ALL + VALUES (0) +) +SELECT * FROM r +-- !query 32 schema +struct +-- !query 32 output Review comment: pg cannot accept this query; ``` psql:with-recursive.sql:66: ERROR: recursive reference to query "r" must not appear within its non-recursive term LINE 2: SELECT level + 1 FROM r WHERE level < 10 ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
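The `UNION ALL` semantics these test cases exercise (an anchor term seeds the result, then the recursive term is re-applied to the rows produced in the previous step until it yields nothing, with a level limit standing in for `spark.sql.cte.recursion.level.limit`) can be sketched as a small fixed-point loop. The names here are illustrative, not Spark internals:

```python
def recursive_cte(anchor_rows, recursive_term, level_limit=100):
    """Evaluate WITH RECURSIVE r AS (anchor UNION ALL recursive) SELECT * FROM r.

    Starts from the anchor rows, then repeatedly applies the recursive term
    to the rows produced in the previous step, accumulating everything,
    until a step produces no rows or the level limit trips.
    """
    result = list(anchor_rows)
    working = list(anchor_rows)
    for _ in range(level_limit):
        working = recursive_term(working)
        if not working:
            return result
        result.extend(working)
    raise RuntimeError(
        "Recursion level limit %d reached but query has not exhausted" % level_limit)

# WITH RECURSIVE r(level) AS (VALUES (0) UNION ALL
#   SELECT level + 1 FROM r WHERE level < 10) SELECT * FROM r
levels = recursive_cte([0], lambda rows: [lv + 1 for lv in rows if lv < 10])
```

With the `level < 10` guard this yields 0 through 10, matching query 26 above; dropping the guard never produces an empty step, which is the behavior behind the "Recursion level limit 100 reached" error in queries 27 and 30.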
[GitHub] [spark] maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r307092141 ## File path: sql/core/src/test/resources/sql-tests/inputs/cte.sql ## @@ -155,6 +155,419 @@ SELECT ( ) ); +-- fails due to recursion isn't allowed with RECURSIVE keyword +WITH r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r WHERE level < 10 +) +SELECT * FROM r; + +-- very basic recursion +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r WHERE level < 10 +) +SELECT * FROM r; Review comment: better to sort the output by using `ORDER BY`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r307091967

## File path: sql/core/src/test/resources/sql-tests/results/cte.sql.out

```
@@ -328,16 +328,891 @@ struct
 -- !query 25
-DROP VIEW IF EXISTS t
+WITH r(level) AS (
+  VALUES (0)
+  UNION ALL
+  SELECT level + 1 FROM r WHERE level < 10
+)
+SELECT * FROM r
 -- !query 25 schema
 struct<>
 -- !query 25 output
-
+org.apache.spark.sql.AnalysisException
+Table or view not found: r; line 4 pos 24
```

Review comment: Can you make the error message clearer, as pg does?

```
postgres=# WITH r(level) AS (
postgres(#   VALUES (0)
postgres(#   UNION ALL
postgres(#   SELECT level + 1 FROM r WHERE level < 10
postgres(# )
postgres-# SELECT * FROM r;
ERROR:  relation "r" does not exist
LINE 4:   SELECT level + 1 FROM r WHERE level < 10
                                ^
DETAIL:  There is a WITH item named "r", but it cannot be referenced from this part of the query.
HINT:  Use WITH RECURSIVE, or re-order the WITH items to remove forward references.
```
[GitHub] [spark] HyukjinKwon commented on a change in pull request #25245: [SPARK-25382][SQL][PYSPARK] Remove ImageSchema.readImages in 3.0
HyukjinKwon commented on a change in pull request #25245: [SPARK-25382][SQL][PYSPARK] Remove ImageSchema.readImages in 3.0 URL: https://github.com/apache/spark/pull/25245#discussion_r307091759 ## File path: python/pyspark/ml/image.py ## @@ -203,52 +205,16 @@ def toImage(self, array, origin=""): return _create_row(self.imageFields, [origin, height, width, nChannels, mode, data]) -def readImages(self, path, recursive=False, numPartitions=-1, - dropImageFailures=False, sampleRatio=1.0, seed=0): -""" -Reads the directory of images from the local or remote source. - -.. note:: If multiple jobs are run in parallel with different sampleRatio or recursive flag, -there may be a race condition where one job overwrites the hadoop configs of another. - -.. note:: If sample ratio is less than 1, sampling uses a PathFilter that is efficient but -potentially non-deterministic. - -.. note:: Deprecated in 2.4.0. Use `spark.read.format("image").load(path)` instead and -this `readImages` will be removed in 3.0.0. - -:param str path: Path to the image directory. -:param bool recursive: Recursive search flag. -:param int numPartitions: Number of DataFrame partitions. -:param bool dropImageFailures: Drop the files that are not valid images. -:param float sampleRatio: Fraction of the images loaded. -:param int seed: Random number seed. -:return: a :class:`DataFrame` with a single column of "images", - see ImageSchema for details. - ->>> df = ImageSchema.readImages('data/mllib/images/origin/kittens', recursive=True) ->>> df.count() -5 - -.. versionadded:: 2.3.0 -""" -warnings.warn("`ImageSchema.readImage` is deprecated. 
" + - "Use `spark.read.format(\"image\").load(path)` instead.", DeprecationWarning) -spark = SparkSession.builder.getOrCreate() -image_schema = spark._jvm.org.apache.spark.ml.image.ImageSchema -jsession = spark._jsparkSession -jresult = image_schema.readImages(path, jsession, recursive, numPartitions, - dropImageFailures, float(sampleRatio), seed) -return DataFrame(jresult, spark._wrapped) - -ImageSchema = _ImageSchema() +ImageUtils = _ImageUtils() Review comment: Are we going to expose those utils as APIs for PySpark specifically or not? If we're going to keep them, we should keep the original name `ImageSchema` for backward compatibility. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
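The migration path the deprecation warning above points at is the built-in image data source. A minimal wrapper might look like the following sketch; the function name and `drop_invalid` parameter are illustrative, while `dropInvalid` is the data source option that supersedes the old `dropImageFailures` flag:

```python
def load_images(spark, path, drop_invalid=False):
    """Replacement for the removed ImageSchema.readImages:
    spark.read.format("image").load(path) on an existing SparkSession."""
    reader = spark.read.format("image")
    if drop_invalid:
        # dropInvalid supersedes the old dropImageFailures parameter.
        reader = reader.option("dropInvalid", True)
    return reader.load(path)
```

Unlike `readImages`, this goes through the normal `DataFrameReader` path, so it does not mutate shared Hadoop configs and avoids the race condition the old docstring warned about.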
[GitHub] [spark] maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r307091599 ## File path: sql/core/src/test/resources/sql-tests/inputs/cte.sql ## @@ -155,6 +155,419 @@ SELECT ( ) ); +-- fails due to recursion isn't allowed with RECURSIVE keyword +WITH r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r WHERE level < 10 +) +SELECT * FROM r; + +-- very basic recursion +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r WHERE level < 10 +) +SELECT * FROM r; + +-- unlimited recursion fails at spark.sql.cte.recursion.level.limits level +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT * FROM r; + +-- terminate recursion with LIMIT +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT * FROM r LIMIT 10; + +-- terminate projected recursion with LIMIT +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT level, level FROM r LIMIT 10; + +-- fails because using LIMIT to terminate recursion only works where Limit can be pushed through +-- recursion +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r +) +SELECT level, level FROM r ORDER BY level LIMIT 10; + +-- using string column in recursion +WITH RECURSIVE r(c) AS ( + SELECT 'a' + UNION ALL + SELECT c || ' b' FROM r WHERE LENGTH(c) < 10 +) +SELECT * FROM r; + +-- recursion works regardless the order of anchor and recursive terms +WITH RECURSIVE r(level) AS ( + SELECT level + 1 FROM r WHERE level < 10 + UNION ALL + VALUES (0) +) +SELECT * FROM r; + +-- multiple anchor terms are supported +WITH RECURSIVE r(level, data) AS ( + VALUES (0, 'A') + UNION ALL + VALUES (0, 'B') + UNION ALL + SELECT level + 1, data || 'C' FROM r WHERE level < 3 +) +SELECT * FROM r; + +-- multiple recursive terms are supported +WITH RECURSIVE r(level, data) AS ( + VALUES 
(0, 'A') + UNION ALL + SELECT level + 1, data || 'B' FROM r WHERE level < 2 + UNION ALL + SELECT level + 1, data || 'C' FROM r WHERE level < 3 +) +SELECT * FROM r; + +-- multiple anchor and recursive terms are supported +WITH RECURSIVE r(level, data) AS ( + VALUES (0, 'A') + UNION ALL + VALUES (0, 'B') + UNION ALL + SELECT level + 1, data || 'C' FROM r WHERE level < 2 + UNION ALL + SELECT level + 1, data || 'D' FROM r WHERE level < 3 +) +SELECT * FROM r; + +-- recursion without an anchor term fails +WITH RECURSIVE r(level) AS ( + SELECT level + 1 FROM r WHERE level < 3 +) +SELECT * FROM r; + +-- UNION combinator supported to eliminate duplicates and stop recursion +WITH RECURSIVE r(level) AS ( + VALUES (0), (0) + UNION + SELECT (level + 1) % 10 FROM r +) +SELECT * FROM r; + +-- fails because a recursive query should contain UNION ALL or UNION combinator +WITH RECURSIVE r(level) AS ( + VALUES (0) + INTERSECT + SELECT level + 1 FROM r WHERE level < 10 +) +SELECT * FROM r; + +-- recursive reference is not allowed in a subquery +WITH RECURSIVE r(level) AS ( + VALUES (0) + UNION ALL + SELECT level + 1 FROM r WHERE (SELECT SUM(level) FROM r) < 10 +) +SELECT * FROM r; + +-- recursive reference can't be used multiple times in a recursive term +WITH RECURSIVE r(level, data) AS ( + VALUES (0, 'A') + UNION ALL + SELECT r1.level + 1, r1.data + FROM r AS r1 + JOIN r AS r2 ON r2.data = r1.data + WHERE r1.level < 10 +) +SELECT * FROM r; + +-- recursive reference is not allowed on right side of a left outer join +WITH RECURSIVE r(level, data) AS ( + VALUES (0, 'A') + UNION ALL + SELECT level + 1, r.data + FROM ( +SELECT 'B' AS data + ) AS o + LEFT JOIN r ON r.data = o.data +) +SELECT * FROM r; + +-- recursive reference is not allowed on left side of a right outer join +WITH RECURSIVE r(level, data) AS ( + VALUES (0, 'A') + UNION ALL + SELECT level + 1, r.data + FROM r + RIGHT JOIN ( +SELECT 'B' AS data + ) AS o ON o.data = r.data +) +SELECT * FROM r; + +-- aggregate is supported 
in the anchor term +WITH RECURSIVE r(level, data) AS ( + SELECT MAX(level) AS level, SUM(data) AS data FROM VALUES (0, 1), (0, 2) + UNION ALL + SELECT level + 1, data FROM r WHERE level < 10 +) +SELECT * FROM r ORDER BY level; + +-- recursive reference is not allowed in an aggregate in a recursive term +WITH RECURSIVE r(group, data) AS ( + VALUES (0, 1L) + UNION ALL + SELECT 1, SUM(data) FROM r WHERE data < 10 GROUP BY group +) +SELECT * FROM r; + +-- recursive reference is not allowed in an aggregate (made from project) in a recursive term +WITH RECURSIVE r(level) AS ( + VALUES (1L) + UNION ALL + SELECT SUM(level) FROM r WHERE level < 10 +) +SELECT * FROM r; + +-- aggregate is supported on a recursive table +WITH RECURSIVE r(level, data) AS ( + VALUES (0, 'A') + UNION ALL + SELECT level + 1, data FROM r WHERE level < 10 +) +SELECT COUNT(*) FROM r; + +-- recursive refe
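One case in the test file above is worth spelling out: with plain `UNION` (not `UNION ALL`), duplicates are dropped and only never-before-seen rows feed the next iteration, so recursion can terminate through deduplication alone. A Python sketch of that fixed point (illustrative, not Spark code):

```python
def recursive_cte_distinct(anchor_rows, recursive_term, level_limit=100):
    """Evaluate WITH RECURSIVE r AS (anchor UNION recursive) SELECT * FROM r.

    Rows already seen are discarded, and only the genuinely new rows are
    passed to the next application of the recursive term; when a step
    produces nothing new, the result set has reached its fixed point.
    """
    seen = set(anchor_rows)
    working = set(anchor_rows)
    for _ in range(level_limit):
        new_rows = set(recursive_term(working)) - seen
        if not new_rows:
            return sorted(seen)
        seen |= new_rows
        working = new_rows
    raise RuntimeError("recursion level limit %d reached" % level_limit)

# WITH RECURSIVE r(level) AS (VALUES (0), (0) UNION
#   SELECT (level + 1) % 10 FROM r) SELECT * FROM r
levels = recursive_cte_distinct([0, 0], lambda rows: [(lv + 1) % 10 for lv in rows])
```

The `% 10` term would loop forever under `UNION ALL`, but under `UNION` it stops once all ten residues have been seen, which is exactly the "UNION combinator supported to eliminate duplicates and stop recursion" case in the file.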
[GitHub] [spark] maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query URL: https://github.com/apache/spark/pull/23531#discussion_r307091452

## File path: sql/core/src/test/resources/sql-tests/results/cte.sql.out

```
@@ -1,5 +1,5 @@
 -- Automatically generated by SQLQueryTestSuite
--- Number of queries: 27
+-- Number of queries: 63
```

Review comment: I checked whether the queries in this file also work in PostgreSQL. Queries rewritten for PostgreSQL: https://gist.github.com/maropu/0f7ac12ef3f1b6c6262ecfda4be6be09 Output: https://gist.github.com/maropu/a3c12d50058157c6e697ba2a0d4b19dd
[GitHub] [spark] HyukjinKwon commented on a change in pull request #25245: [SPARK-25382][SQL][PYSPARK] Remove ImageSchema.readImages in 3.0
HyukjinKwon commented on a change in pull request #25245: [SPARK-25382][SQL][PYSPARK] Remove ImageSchema.readImages in 3.0 URL: https://github.com/apache/spark/pull/25245#discussion_r307091483 ## File path: python/pyspark/ml/image.py ## @@ -16,11 +16,11 @@ # """ -.. attribute:: ImageSchema +.. attribute:: ImageUtils -An attribute of this module that contains the instance of :class:`_ImageSchema`. +An attribute of this module that contains the instance of :class:`_ImageUtils`. -.. autoclass:: _ImageSchema +.. autoclass:: _ImageUtils :members: """ Review comment: We should remove ``` pyspark.ml.image module .. automodule:: pyspark.ml.image :members: :undoc-members: :inherited-members: ``` at `spark/python/docs/pyspark.ml.rst` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #25245: [SPARK-25382][SQL][PYSPARK] Remove ImageSchema.readImages in 3.0
HyukjinKwon commented on a change in pull request #25245: [SPARK-25382][SQL][PYSPARK] Remove ImageSchema.readImages in 3.0 URL: https://github.com/apache/spark/pull/25245#discussion_r307091121 ## File path: python/pyspark/ml/image.py ## @@ -16,11 +16,11 @@ # """ Review comment: Shall we remove this doc since this isn't an API anymore? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] HyukjinKwon commented on a change in pull request #25245: [SPARK-25382][SQL][PYSPARK] Remove ImageSchema.readImages in 3.0
HyukjinKwon commented on a change in pull request #25245: [SPARK-25382][SQL][PYSPARK] Remove ImageSchema.readImages in 3.0 URL: https://github.com/apache/spark/pull/25245#discussion_r307091258 ## File path: python/pyspark/ml/image.py ## @@ -16,11 +16,11 @@ # """ Review comment: We should remove: ``` pyspark.ml.image module .. automodule:: pyspark.ml.image :members: :undoc-members: :inherited-members: ``` at `spark/python/docs/pyspark.ml.rst` as well. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] beliefer commented on a change in pull request #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax
beliefer commented on a change in pull request #25074: [SPARK-27924][SQL] Support ANSI SQL Boolean-Predicate syntax URL: https://github.com/apache/spark/pull/25074#discussion_r305786347

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala

```
@@ -1243,6 +1244,12 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging
         IsNotNull(e)
       case SqlBaseParser.NULL =>
         IsNull(e)
+      case SqlBaseParser.TRUE =>
+        invertIfNotDefined(BooleanTest(e, Some(true)))
+      case SqlBaseParser.FALSE =>
+        invertIfNotDefined(BooleanTest(e, Some(false)))
+      case SqlBaseParser.UNKNOWN =>
+        invertIfNotDefined(BooleanTest(e, None))
```

Review comment: Thanks for the reminder. But I don't know why we must use `IsNull` or `IsNotNull` here; I think the boolean predicate and the null predicate are two different syntaxes. If we do need to do this, I will change it following @maropu's suggestion.
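For reference, the ANSI boolean predicate under discussion (`x IS [NOT] TRUE | FALSE | UNKNOWN`) always returns a definite true or false, even when the operand is NULL; for a boolean operand, `IS UNKNOWN` coincides with `IS NULL`. A Python model with `None` standing in for UNKNOWN (illustrative, not the Catalyst implementation):

```python
def boolean_test(value, target):
    """Model of the ANSI <boolean test>: value IS target, where value and
    target are each True, False, or None (UNKNOWN). Unlike the '=' operator,
    the result is never NULL: a NULL operand simply fails to match
    TRUE or FALSE and matches UNKNOWN."""
    return value is target

def boolean_test_not(value, target):
    # value IS NOT target is the plain negation of value IS target
    # (still two-valued, never NULL).
    return not boolean_test(value, target)
```

This two-valued result is what distinguishes `x IS TRUE` from a bare `x = TRUE` comparison, which would yield NULL for a NULL operand.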
[GitHub] [spark] maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
URL: https://github.com/apache/spark/pull/23531#discussion_r307091036

## File path: sql/core/src/test/resources/sql-tests/inputs/cte.sql ##

@@ -155,6 +155,419 @@ SELECT (
   )
 );
 
+-- fails because recursion isn't allowed without the RECURSIVE keyword
+WITH r(level) AS (
+  VALUES (0)
+  UNION ALL
+  SELECT level + 1 FROM r WHERE level < 10
+)
+SELECT * FROM r;
+
+-- very basic recursion
+WITH RECURSIVE r(level) AS (
+  VALUES (0)
+  UNION ALL
+  SELECT level + 1 FROM r WHERE level < 10
+)
+SELECT * FROM r;
+
+-- unlimited recursion fails at the spark.sql.cte.recursion.level.limit level
+WITH RECURSIVE r(level) AS (
+  VALUES (0)
+  UNION ALL
+  SELECT level + 1 FROM r
+)
+SELECT * FROM r;
+
+-- terminate recursion with LIMIT
+WITH RECURSIVE r(level) AS (
+  VALUES (0)
+  UNION ALL
+  SELECT level + 1 FROM r
+)
+SELECT * FROM r LIMIT 10;
+
+-- terminate projected recursion with LIMIT
+WITH RECURSIVE r(level) AS (
+  VALUES (0)
+  UNION ALL
+  SELECT level + 1 FROM r
+)
+SELECT level, level FROM r LIMIT 10;
+
+-- fails because using LIMIT to terminate recursion only works where Limit can be pushed through
+-- recursion
+WITH RECURSIVE r(level) AS (
+  VALUES (0)
+  UNION ALL
+  SELECT level + 1 FROM r
+)
+SELECT level, level FROM r ORDER BY level LIMIT 10;
+
+-- using string column in recursion
+WITH RECURSIVE r(c) AS (
+  SELECT 'a'
+  UNION ALL
+  SELECT c || ' b' FROM r WHERE LENGTH(c) < 10
+)
+SELECT * FROM r;
+
+-- recursion works regardless of the order of anchor and recursive terms
+WITH RECURSIVE r(level) AS (
+  SELECT level + 1 FROM r WHERE level < 10
+  UNION ALL
+  VALUES (0)
+)
+SELECT * FROM r;
+
+-- multiple anchor terms are supported
+WITH RECURSIVE r(level, data) AS (
+  VALUES (0, 'A')
+  UNION ALL
+  VALUES (0, 'B')
+  UNION ALL
+  SELECT level + 1, data || 'C' FROM r WHERE level < 3
+)
+SELECT * FROM r;
+
+-- multiple recursive terms are supported
+WITH RECURSIVE r(level, data) AS (
+  VALUES (0, 'A')
+  UNION ALL
+  SELECT level + 1, data || 'B' FROM r WHERE level < 2
+  UNION ALL
+  SELECT level + 1, data || 'C' FROM r WHERE level < 3
+)
+SELECT * FROM r;
+
+-- multiple anchor and recursive terms are supported
+WITH RECURSIVE r(level, data) AS (
+  VALUES (0, 'A')
+  UNION ALL
+  VALUES (0, 'B')
+  UNION ALL
+  SELECT level + 1, data || 'C' FROM r WHERE level < 2
+  UNION ALL
+  SELECT level + 1, data || 'D' FROM r WHERE level < 3
+)
+SELECT * FROM r;
+
+-- recursion without an anchor term fails
+WITH RECURSIVE r(level) AS (
+  SELECT level + 1 FROM r WHERE level < 3
+)
+SELECT * FROM r;
+
+-- UNION combinator supported to eliminate duplicates and stop recursion
+WITH RECURSIVE r(level) AS (
+  VALUES (0), (0)
+  UNION
+  SELECT (level + 1) % 10 FROM r
+)
+SELECT * FROM r;
+
+-- fails because a recursive query should contain UNION ALL or UNION combinator
+WITH RECURSIVE r(level) AS (
+  VALUES (0)
+  INTERSECT
+  SELECT level + 1 FROM r WHERE level < 10
+)
+SELECT * FROM r;
+
+-- recursive reference is not allowed in a subquery
+WITH RECURSIVE r(level) AS (
+  VALUES (0)
+  UNION ALL
+  SELECT level + 1 FROM r WHERE (SELECT SUM(level) FROM r) < 10
+)
+SELECT * FROM r;
+
+-- recursive reference can't be used multiple times in a recursive term
+WITH RECURSIVE r(level, data) AS (
+  VALUES (0, 'A')
+  UNION ALL
+  SELECT r1.level + 1, r1.data
+  FROM r AS r1
+  JOIN r AS r2 ON r2.data = r1.data
+  WHERE r1.level < 10
+)
+SELECT * FROM r;
+
+-- recursive reference is not allowed on right side of a left outer join
+WITH RECURSIVE r(level, data) AS (
+  VALUES (0, 'A')
+  UNION ALL
+  SELECT level + 1, r.data
+  FROM (
+    SELECT 'B' AS data
+  ) AS o
+  LEFT JOIN r ON r.data = o.data
+)
+SELECT * FROM r;
+
+-- recursive reference is not allowed on left side of a right outer join
+WITH RECURSIVE r(level, data) AS (
+  VALUES (0, 'A')
+  UNION ALL
+  SELECT level + 1, r.data
+  FROM r
+  RIGHT JOIN (
+    SELECT 'B' AS data
+  ) AS o ON o.data = r.data
+)
+SELECT * FROM r;
+
+-- aggregate is supported in the anchor term
+WITH RECURSIVE r(level, data) AS (
+  SELECT MAX(level) AS level, SUM(data) AS data FROM VALUES (0, 1), (0, 2)
+  UNION ALL
+  SELECT level + 1, data FROM r WHERE level < 10
+)
+SELECT * FROM r ORDER BY level;
+
+-- recursive reference is not allowed in an aggregate in a recursive term
+WITH RECURSIVE r(group, data) AS (
+  VALUES (0, 1L)
+  UNION ALL
+  SELECT 1, SUM(data) FROM r WHERE data < 10 GROUP BY group
+)
+SELECT * FROM r;
+
+-- recursive reference is not allowed in an aggregate (made from project) in a recursive term
+WITH RECURSIVE r(level) AS (
+  VALUES (1L)
+  UNION ALL
+  SELECT SUM(level) FROM r WHERE level < 10
+)
+SELECT * FROM r;
+
+-- aggregate is supported on a recursive table
+WITH RECURSIVE r(level, data) AS (
+  VALUES (0, 'A')
+  UNION ALL
+  SELECT level + 1, data FROM r WHERE level < 10
+)
+SELECT COUNT(*) FROM r;
+
+-- recursive refe
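The tests above exercise the usual evaluation model for a recursive CTE: run the anchor term once, then repeatedly apply the recursive term to the rows produced by the previous iteration until no new rows appear (or a level limit is hit). A minimal Python sketch of that semantics, using the basic `VALUES (0) UNION ALL SELECT level + 1 FROM r WHERE level < 10` test as the example; this is an illustrative model only, not Spark's actual implementation:

```python
# Illustrative fixpoint evaluation of a recursive CTE.
# anchor_rows: output of the anchor term(s).
# recursive_step: maps one row to the rows the recursive term derives from it.
# level_limit: mirrors a config like spark.sql.cte.recursion.level.limit.

def evaluate_recursive_cte(anchor_rows, recursive_step, level_limit=100):
    result = list(anchor_rows)
    frontier = list(anchor_rows)   # rows produced by the previous iteration
    iterations = 0
    while frontier:
        iterations += 1
        if iterations > level_limit:
            raise RuntimeError(f"recursion level limit {level_limit} exceeded")
        # Apply the recursive term only to the previous iteration's rows.
        frontier = [new for row in frontier for new in recursive_step(row)]
        result.extend(frontier)
    return result

# Anchor: VALUES (0); recursive term: SELECT level + 1 FROM r WHERE level < 10.
rows = evaluate_recursive_cte(
    [(0,)],
    lambda row: [(row[0] + 1,)] if row[0] < 10 else [],
)
# rows now holds (0,) through (10,): the WHERE clause stops the recursion.
```

Without the `WHERE` filter (the "unlimited recursion" test), the frontier never empties and the level limit is what terminates the query with an error.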
[GitHub] [spark] HyukjinKwon commented on a change in pull request #25242: [SPARK-28497][SQL] Disallow upcasting complex data types to string type
HyukjinKwon commented on a change in pull request #25242: [SPARK-28497][SQL] Disallow upcasting complex data types to string type
URL: https://github.com/apache/spark/pull/25242#discussion_r307090548

## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/EncoderResolutionSuite.scala ##

@@ -196,6 +196,43 @@ class EncoderResolutionSuite extends PlanTest {
     encoder.resolveAndBind(attrs)
   }
 
+  test("SPARK-28497: complex type is not compatible with string encoder schema") {
+    val encoder = ExpressionEncoder[String]
+
+    {
+      val attrs = Seq('a.struct('x.long))
+      assert(intercept[AnalysisException](encoder.resolveAndBind(attrs)).message ==
+        s"""
+           |Cannot up cast `a` from struct to string.
+           |The type path of the target object is:
+           |- root class: "java.lang.String"
+           |You can either add an explicit cast to the input data or choose a higher precision type
+         """.stripMargin.trim + " of the field in the target object")
+    }
+
+    {
+      val attrs = Seq('a.array(StringType))
+      assert(intercept[AnalysisException](encoder.resolveAndBind(attrs)).message ==
+        s"""
+           |Cannot up cast `a` from array to string.

Review comment: It doesn't necessarily have to compare the whole message. We can check whether the message contains some keywords.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
With regards, Apache Git Services
- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
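The reviewer's suggestion is to assert that the error message *contains* key phrases rather than matching the full text, so the test doesn't break on cosmetic message changes. The PR's test is Scala/ScalaTest; here is a small language-neutral Python sketch of the same pattern, where `resolve_and_bind` is a hypothetical stand-in for the encoder call:

```python
# Hypothetical stand-in for encoder.resolveAndBind raising on complex types.
def resolve_and_bind(attr_type):
    if attr_type in ("struct", "array", "map"):
        raise ValueError(f"Cannot up cast `a` from {attr_type} to string.")

# Assert the raised message contains every keyword, instead of comparing
# the message against a fully spelled-out multi-line string.
def assert_message_contains(fn, *keywords):
    try:
        fn()
    except ValueError as e:
        message = str(e)
        missing = [k for k in keywords if k not in message]
        assert not missing, f"message {message!r} lacks {missing}"
    else:
        raise AssertionError("expected an exception")

# Robust keyword check: survives wording tweaks around the key phrases.
assert_message_contains(lambda: resolve_and_bind("array"),
                        "Cannot up cast", "array", "string")
```

The trade-off is the usual one: full-message equality catches every regression in the error text, while keyword containment keeps the test stable across harmless rewording.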
[GitHub] [spark] SparkQA commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
SparkQA commented on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
URL: https://github.com/apache/spark/pull/25007#issuecomment-514867350

**[Test build #108146 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108146/testReport)** for PR 25007 at commit [`b8b7b8d`](https://github.com/apache/spark/commit/b8b7b8d00418ccc735ec7bdcc10de5e71384e8bc).
[GitHub] [spark] maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
maropu commented on a change in pull request #23531: [SPARK-24497][SQL] Support recursive SQL query
URL: https://github.com/apache/spark/pull/23531#discussion_r307090403

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ##

@@ -1873,6 +1874,22 @@ object SQLConf {
     .booleanConf
     .createWithDefault(false)
 
+  val RECURSION_LEVEL_LIMIT = buildConf("spark.sql.cte.recursion.level.limit")
+    .internal()
+    .doc("Maximum level of recursion that is allowed while executing a recursive CTE definition. " +
+      "If a query does not get exhausted before reaching this limit, it fails.")
+    .intConf
+    .createWithDefault(100)

Review comment: Can you use "-1" for the unlimited case?
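Using a sentinel such as `-1` for "no limit", as the reviewer suggests, is a common convention for integer configs. A minimal sketch of the check, with a hypothetical helper name (the PR's actual Scala code may differ):

```python
# Hypothetical limit check: a negative limit means "unlimited",
# so the recursion is never cut off by the config.
UNLIMITED = -1

def limit_exceeded(current_level, limit):
    """True when a positive limit exists and the current level has passed it."""
    return limit != UNLIMITED and current_level > limit

assert limit_exceeded(101, 100)          # default limit of 100 is exceeded
assert not limit_exceeded(50, 100)       # still within the limit
assert not limit_exceeded(10_000, UNLIMITED)  # -1 disables the check
```

The alternative, keeping the limit strictly positive and requiring users to raise it, avoids accidentally unbounded queries but makes "run to completion, whatever it takes" impossible to express.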
[GitHub] [spark] AmplabJenkins removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
AmplabJenkins removed a comment on issue #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
URL: https://github.com/apache/spark/pull/25007#issuecomment-514866817

Test PASSed. Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/13248/