[GitHub] [spark] itholic opened a new pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-24 Thread GitBox


itholic opened a new pull request #33054:
URL: https://github.com/apache/spark/pull/33054


   ### What changes were proposed in this pull request?
   
   This PR proposes moving the `to_pandas_on_spark` function from 
`pyspark.pandas.frame` to `pyspark.sql.dataframe`, and adds the related tests 
to the PySpark DataFrame tests.
   
   ### Why are the changes needed?
   
   Now that Koalas has been ported into PySpark, we no longer need to 
auto-patch Spark.
   Also, it doesn't make sense for `to_pandas_on_spark` to belong to the 
pandas-on-Spark DataFrame.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No, it's kinda internal refactoring stuff.
   
   ### How was this patch tested?
   
   Added the related tests and manually checked that they pass.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] cloud-fan opened a new pull request #33055: [SPARK-35872][INFRA] Automatize some steps to finalize the release

2021-06-24 Thread GitBox


cloud-fan opened a new pull request #33055:
URL: https://github.com/apache/spark/pull/33055


   
   
   ### What changes were proposed in this pull request?
   
   After the RC vote, the release manager still needs to do a lot of work to 
finalize the release. This PR updates the script to automate some steps:
   1. create the final git tag
   2. publish to pypi
   3. publish docs to spark-website
   4. move the release binaries from dev directory to release directory.
   5. update the KEYS file
   
   ### Why are the changes needed?
   
   Eases the work of the release manager.
   
   ### Does this PR introduce _any_ user-facing change?
   
   no
   
   ### How was this patch tested?
   
   Tested with the recent 3.0.3 release.





[GitHub] [spark] cloud-fan commented on pull request #33055: [SPARK-35872][INFRA] Automatize some steps to finalize the release

2021-06-24 Thread GitBox


cloud-fan commented on pull request #33055:
URL: https://github.com/apache/spark/pull/33055#issuecomment-867403746


   @srowen @dongjoon-hyun @HyukjinKwon 





[GitHub] [spark] viirya commented on a change in pull request #33002: [SPARK-35843][SQL] Unify the file name between batch and streaming file writers

2021-06-24 Thread GitBox


viirya commented on a change in pull request #33002:
URL: https://github.com/apache/spark/pull/33002#discussion_r657690836



##
File path: 
core/src/main/scala/org/apache/spark/internal/io/HadoopMapReduceCommitProtocol.scala
##
@@ -152,12 +153,21 @@ class HadoopMapReduceCommitProtocol(
 tmpOutputPath
   }
 
-  protected def getFilename(taskContext: TaskAttemptContext, ext: String): 
String = {
-// The file name looks like 
part-0-2dd664f9-d2c4-4ffe-878f-c6c70c1fb0cb_3-c000.parquet
-// Note that %05d does not truncate the split number, so if we have more 
than 10 tasks,
+  protected def getFilename(ext: String): String = {
+// Use the Spark task attempt ID which is unique within the write job, so 
that file writes never
+// collide if the file name also includes job ID. The Hadoop task id is 
equivalent to Spark's
+// partitionId, which is not unique within the write job, for cases like 
task retry or
+// speculative tasks.
+// NOTE: this is not necessary for certain Hadoop output committers, as 
they will create a
+// unique staging directory for each task attempt, so we don't need to 
worry about file name
+// collision between different task attempts, and using Hadoop task 
ID/Spark partition ID is
+// also fine. For extra safety and consistency with the streaming side, we 
always use the
+// Spark task attempt ID here.
+val taskId = TaskContext.get.taskAttemptId()
+// The file name looks like 
part-0-2dd664f9-d2c4-4ffe-878f-c6c70c1fb0cb_3.gz.parquet
+// Note that %05d does not truncate the taskId, so if we have more than 
10 tasks,

Review comment:
   taskId -> taskAttemptId?
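
The naming scheme discussed in this diff can be sketched in isolation. This is a minimal illustration, not the PR's code: the object and method names are hypothetical, and it assumes the pattern `part-<id>-<jobId><ext>` with the ID formatted through `%05d`. It shows that `%05d` only pads to five digits and never truncates, so larger IDs simply produce a longer name.

```scala
object FileNameSketch {
  // Hypothetical helper: zero-pads the ID to at least 5 digits.
  // %05d sets a minimum width, so IDs above 99999 are kept in full.
  def fileName(taskAttemptId: Long, jobId: String, ext: String): String =
    f"part-$taskAttemptId%05d-$jobId$ext"

  def main(args: Array[String]): Unit = {
    println(fileName(3L, "2dd664f9", ".parquet"))      // part-00003-2dd664f9.parquet
    println(fileName(123456L, "2dd664f9", ".parquet")) // part-123456-2dd664f9.parquet
  }
}
```

Because two attempts of the same partition get different attempt IDs, the two names differ even when the job ID and extension are identical.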







[GitHub] [spark] viirya commented on a change in pull request #33002: [SPARK-35843][SQL] Unify the file name between batch and streaming file writers

2021-06-24 Thread GitBox


viirya commented on a change in pull request #33002:
URL: https://github.com/apache/spark/pull/33002#discussion_r657691597



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ManifestFileCommitProtocol.scala
##
@@ -113,12 +113,15 @@ class ManifestFileCommitProtocol(jobId: String, path: 
String)
 
   override def newTaskTempFile(
   taskContext: TaskAttemptContext, dir: Option[String], ext: String): 
String = {
-// The file name looks like 
part-r-0-2dd664f9-d2c4-4ffe-878f-c6c70c1fb0cb_3.gz.parquet
-// Note that %05d does not truncate the split number, so if we have more 
than 10 tasks,
+// Use the Spark task attempt ID which is unique within the write job, so 
that file writes never
+// collide if the file name also includes job ID. The Hadoop task id is 
equivalent to Spark's
+// partitionId, which is not unique within the write job, for cases like 
task retry or
+// speculative tasks.
+val taskId = TaskContext.get.taskAttemptId()
+// The file name looks like 
part-0-2dd664f9-d2c4-4ffe-878f-c6c70c1fb0cb_3.gz.parquet
+// Note that %05d does not truncate the taskId, so if we have more than 
10 tasks,

Review comment:
   same, taskId -> taskAttemptId?







[GitHub] [spark] AngersZhuuuu opened a new pull request #33056: [SPARK-35871][SQL] Literal.create(value, dataType) should support fields

2021-06-24 Thread GitBox


AngersZhuuuu opened a new pull request #33056:
URL: https://github.com/apache/spark/pull/33056


   ### What changes were proposed in this pull request?
   Currently, `Literal.create(data, dataType)` is not correct when converting 
a Period to YearMonthIntervalType or a Duration to DayTimeIntervalType.
   
   If the data is a Period/Duration, it creates a converter for the default 
YearMonthIntervalType/DayTimeIntervalType, so the result is incorrect. This 
PR fixes the bug.
   
   ### Why are the changes needed?
   Fixes a bug when using `Literal.create()`.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Added UT





[GitHub] [spark] AngersZhuuuu commented on pull request #33056: [SPARK-35871][SQL] Literal.create(value, dataType) should support fields

2021-06-24 Thread GitBox


AngersZhuuuu commented on pull request #33056:
URL: https://github.com/apache/spark/pull/33056#issuecomment-867406449


   FYI @MaxGekk @cloud-fan 





[GitHub] [spark] AngersZhuuuu commented on pull request #33000: [WIP][SPARK-35778][TESTS][SQL] Check multiply/divide of year-month intervals of any fields by numeric

2021-06-24 Thread GitBox


AngersZhuuuu commented on pull request #33000:
URL: https://github.com/apache/spark/pull/33000#issuecomment-867406649


   This PR should wait for https://github.com/apache/spark/pull/33056





[GitHub] [spark] SparkQA commented on pull request #33047: [SPARK-35730][SQL][TESTS] Check all day-time interval types in UDF

2021-06-24 Thread GitBox


SparkQA commented on pull request #33047:
URL: https://github.com/apache/spark/pull/33047#issuecomment-867409836


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44774/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33046: [SPARK-35282][SQL][FOLLOWUP] Simplify condition code of shuffled hash join

2021-06-24 Thread GitBox


AmplabJenkins commented on pull request #33046:
URL: https://github.com/apache/spark/pull/33046#issuecomment-867410799


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140232/
   





[GitHub] [spark] cloud-fan commented on a change in pull request #32675: [SPARK-35531][SQL] Can not insert into hive bucket table if create table with upper case schema

2021-06-24 Thread GitBox


cloud-fan commented on a change in pull request #32675:
URL: https://github.com/apache/spark/pull/32675#discussion_r657699288



##
File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
##
@@ -1092,14 +1092,23 @@ private[hive] object HiveClientImpl extends Logging {
   hiveTable.setViewExpandedText(t)
 }
 
+// hive may convert schema into lower cases while bucketSpec will not
+// only convert if case not match
+def restoreHiveBucketSpecColNames(schema: StructType, names: Seq[String]): 
Seq[String] = {
+  names.map { name =>
+schema.find(col => SQLConf.get.resolver(col.name, 
name)).map(_.name).getOrElse(name)
+  }
+}
+
 table.bucketSpec match {
   case Some(bucketSpec) if !HiveExternalCatalog.isDatasourceTable(table) =>
 hiveTable.setNumBuckets(bucketSpec.numBuckets)
-hiveTable.setBucketCols(bucketSpec.bucketColumnNames.toList.asJava)
+hiveTable.setBucketCols(
+  restoreHiveBucketSpecColNames(table.schema, 
bucketSpec.bucketColumnNames).toList.asJava)
 
 if (bucketSpec.sortColumnNames.nonEmpty) {
   hiveTable.setSortCols(
-bucketSpec.sortColumnNames
+restoreHiveBucketSpecColNames(table.schema, 
bucketSpec.sortColumnNames)

Review comment:
   catalogTable->HiveTable is fine, as long as the catalogTable is 
correctly initialized. The problem I see here is that we get the catalogTable 
via `HiveClient.getTable`, which doesn't go through the initialization logic 
in `HiveExternalCatalog`.
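
For reference, the name-restoring step in the diff above can be sketched without Spark. This is only an illustration under stated assumptions: the object name and `caseInsensitive` are hypothetical stand-ins (the real code uses `SQLConf.get.resolver`, which is configurable), and the schema is reduced to a plain list of column names.

```scala
object BucketColumnSketch {
  // Assumption: a case-insensitive resolver stands in for SQLConf.get.resolver.
  val caseInsensitive: (String, String) => Boolean =
    (a, b) => a.equalsIgnoreCase(b)

  // For each bucket/sort column name, use the schema's spelling when a column
  // resolves to it; otherwise keep the name unchanged.
  def restoreColNames(schemaNames: Seq[String], names: Seq[String]): Seq[String] =
    names.map { name =>
      schemaNames.find(col => caseInsensitive(col, name)).getOrElse(name)
    }

  def main(args: Array[String]): Unit = {
    // The schema was lower-cased, while the bucket spec kept the original case.
    println(restoreColNames(Seq("id", "name"), Seq("ID", "Name"))) // List(id, name)
  }
}
```

A name with no match in the schema passes through untouched, which mirrors the `getOrElse(name)` fallback in the diff.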







[GitHub] [spark] SparkQA commented on pull request #33056: [SPARK-35871][SQL] Literal.create(value, dataType) should support fields

2021-06-24 Thread GitBox


SparkQA commented on pull request #33056:
URL: https://github.com/apache/spark/pull/33056#issuecomment-867411921


   **[Test build #140247 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140247/testReport)**
 for PR 33056 at commit 
[`775f6fe`](https://github.com/apache/spark/commit/775f6feee8e046090c53410779bca47d1bd5a110).





[GitHub] [spark] SparkQA commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-24 Thread GitBox


SparkQA commented on pull request #33054:
URL: https://github.com/apache/spark/pull/33054#issuecomment-867411977


   **[Test build #140249 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140249/testReport)**
 for PR 33054 at commit 
[`32c24d7`](https://github.com/apache/spark/commit/32c24d7f8a7652dbc03457d9b1963fbc03354a88).





[GitHub] [spark] SparkQA commented on pull request #33045: [SPARK-35476][PYTHON] Fix disallow_untyped_defs mypy checks for pyspark.pandas.series

2021-06-24 Thread GitBox


SparkQA commented on pull request #33045:
URL: https://github.com/apache/spark/pull/33045#issuecomment-867412122


   **[Test build #140250 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140250/testReport)**
 for PR 33045 at commit 
[`6d862be`](https://github.com/apache/spark/commit/6d862be3075c4a38b2726e5c148bfa7269fe9dfe).





[GitHub] [spark] SparkQA commented on pull request #33055: [SPARK-35872][INFRA] Automatize some steps to finalize the release

2021-06-24 Thread GitBox


SparkQA commented on pull request #33055:
URL: https://github.com/apache/spark/pull/33055#issuecomment-867411990


   **[Test build #140248 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140248/testReport)**
 for PR 33055 at commit 
[`403b55a`](https://github.com/apache/spark/commit/403b55adecd564f0ed75dc99c607ef974d468286).





[GitHub] [spark] SparkQA commented on pull request #33000: [WIP][SPARK-35778][TESTS][SQL] Check multiply/divide of year-month intervals of any fields by numeric

2021-06-24 Thread GitBox


SparkQA commented on pull request #33000:
URL: https://github.com/apache/spark/pull/33000#issuecomment-867412108


   **[Test build #140251 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140251/testReport)**
 for PR 33000 at commit 
[`0e7d5fc`](https://github.com/apache/spark/commit/0e7d5fc25f02754bb774eda1be654abf2d0ab884).





[GitHub] [spark] SparkQA commented on pull request #33053: [SPARK-35870][BUILD] Upgrade Jetty to 9.4.42

2021-06-24 Thread GitBox


SparkQA commented on pull request #33053:
URL: https://github.com/apache/spark/pull/33053#issuecomment-867413055


   Kubernetes integration test unable to build dist.
   
   exiting with code: 1
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44775/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33053: [SPARK-35870][BUILD] Upgrade Jetty to 9.4.42

2021-06-24 Thread GitBox


AmplabJenkins commented on pull request #33053:
URL: https://github.com/apache/spark/pull/33053#issuecomment-867413086


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44775/
   





[GitHub] [spark] cloud-fan commented on a change in pull request #32850: [SPARK-34920][CORE][SQL] Add error classes with SQLSTATE

2021-06-24 Thread GitBox


cloud-fan commented on a change in pull request #32850:
URL: https://github.com/apache/spark/pull/32850#discussion_r657701584



##
File path: core/src/main/scala/org/apache/spark/SparkError.scala
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark
+
+import java.net.URL
+
+import scala.collection.immutable.SortedMap
+
+import com.fasterxml.jackson.annotation.JsonIgnore
+import com.fasterxml.jackson.core.`type`.TypeReference
+import com.fasterxml.jackson.databind.SerializationFeature
+import com.fasterxml.jackson.databind.json.JsonMapper
+import com.fasterxml.jackson.module.scala.DefaultScalaModule
+
+import org.apache.spark.util.Utils
+
+/**
+ * Information associated with an error class.
+ *
+ * @param sqlState SQLSTATE associated with this class.
+ * @param messageFormatLines C-style message format compatible with printf.
+ *   The error message is constructed by concatenating 
the lines with
+ *   linebreaks.
+ */
+case class ErrorInfo(sqlState: Option[String], messageFormatLines: 
Seq[String]) {

Review comment:
   It's better to have a unified representation for all the errors. "error 
class" and "message" are required fields, and it's ok to have more optional 
fields for other purposes, like "sqlState" for JDBC compatibility.
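
To make the shape under discussion concrete, here is a minimal sketch of an error record with a required message and an optional `sqlState`. It is an assumption-laden illustration rather than the PR's implementation: the `render` helper and the sample message lines are hypothetical, and only the field names `sqlState` and `messageFormatLines` come from the diff above.

```scala
object ErrorInfoSketch {
  // sqlState is optional; the message lines are always present.
  final case class ErrorInfo(sqlState: Option[String], messageFormatLines: Seq[String]) {
    // The printf-style format is the lines joined with line breaks.
    val messageFormat: String = messageFormatLines.mkString("\n")
  }

  // Hypothetical helper: fills the printf-style placeholders with parameters.
  def render(info: ErrorInfo, params: Seq[String]): String =
    info.messageFormat.format(params: _*)

  def main(args: Array[String]): Unit = {
    val info = ErrorInfo(Some("22012"), Seq("Division by zero in %s.", "Hint: %s"))
    println(render(info, Seq("a / b", "check the divisor")))
  }
}
```

An error without a SQLSTATE is represented the same way with `sqlState = None`, which keeps one representation for all errors.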







[GitHub] [spark] MaxGekk commented on pull request #33047: [SPARK-35730][SQL][TESTS] Check all day-time interval types in UDF

2021-06-24 Thread GitBox


MaxGekk commented on pull request #33047:
URL: https://github.com/apache/spark/pull/33047#issuecomment-867415234


   +1, LGTM. GA passed. Merging to master.
   https://user-images.githubusercontent.com/1580697/123222569-bd5dd600-d4d8-11eb-9b43-834c0b67ecba.png
   Thank you, @AngersZhuuuu.





[GitHub] [spark] cloud-fan commented on a change in pull request #33056: [SPARK-35871][SQL] Literal.create(value, dataType) should support fields

2021-06-24 Thread GitBox


cloud-fan commented on a change in pull request #33056:
URL: https://github.com/apache/spark/pull/33056#discussion_r657704145



##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralExpressionSuite.scala
##
@@ -432,4 +432,22 @@ class LiteralExpressionSuite extends SparkFunSuite with 
ExpressionEvalHelper {
   assert(literal.toString === expected)
 }
   }
+
+  test("SPARK-35871: Literal.create(value, dataType) should support fields") {
+Seq((Period.ofMonths(13), Array(13, 12, 13)))
+  .foreach { case (period, expect) =>
+DataTypeTestUtils.yearMonthIntervalTypes.zip(expect).foreach { case 
(dt, result) =>
+  checkEvaluation(Literal.create(period, dt), result)
+}
+  }
+
+Seq((Duration.ofSeconds(86400 + 3600 + 60 + 1),
+  Array(864L, 900L, 9006000L, 9006100L, 
900L,
+9006000L, 9006100L, 9006000L, 9006100L, 9006100L)))

Review comment:
   This test is too hard to read, I don't know which value is for which 
type. How about
   ```
   val duration = Duration.ofSeconds(86400 + 3600 + 60 + 1)
   DataTypeTestUtils.dayTimeIntervalTypes.foreach { dt =>
 val result = dt.endField match {
   case SECOND => ...
   ...
 }
 checkEvaluation(Literal.create(duration, dt), result)
   }
   ```
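
As a worked example of the per-field expectations suggested above, the sketch below computes the value a `Period` or `Duration` would have after truncation to an interval's end field. It assumes that truncation to the end field is the intended semantics; the object, the string field names, and both helpers are hypothetical and use only `java.time`, not Spark's types. Sub-second precision and negative values are ignored for brevity.

```scala
import java.time.{Duration, Period}

object IntervalTruncationSketch {
  val MicrosPerSecond = 1000000L

  // Seconds per unit for each (hypothetical) day-time end field name.
  val secondsPerField: Map[String, Long] =
    Map("DAY" -> 86400L, "HOUR" -> 3600L, "MINUTE" -> 60L, "SECOND" -> 1L)

  // Microseconds of `d` after dropping everything finer than `endField`.
  def truncatedMicros(d: Duration, endField: String): Long = {
    val unit = secondsPerField(endField)
    (d.getSeconds / unit) * unit * MicrosPerSecond
  }

  // Months of `p` after dropping everything finer than `endField`.
  def truncatedMonths(p: Period, endField: String): Long = endField match {
    case "YEAR"  => (p.toTotalMonths / 12) * 12
    case "MONTH" => p.toTotalMonths
  }

  def main(args: Array[String]): Unit = {
    val duration = Duration.ofSeconds(86400 + 3600 + 60 + 1)
    Seq("DAY", "HOUR", "MINUTE", "SECOND").foreach { f =>
      println(s"$f -> ${truncatedMicros(duration, f)}")
    }
  }
}
```

For the 90061-second duration this gives 86400000000 for DAY, 90000000000 for HOUR, 90060000000 for MINUTE and 90061000000 for SECOND; for `Period.ofMonths(13)` it gives 12 for YEAR and 13 for MONTH.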







[GitHub] [spark] SparkQA commented on pull request #33047: [SPARK-35730][SQL][TESTS] Check all day-time interval types in UDF

2021-06-24 Thread GitBox


SparkQA commented on pull request #33047:
URL: https://github.com/apache/spark/pull/33047#issuecomment-867415803


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44774/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33047: [SPARK-35730][SQL][TESTS] Check all day-time interval types in UDF

2021-06-24 Thread GitBox


AmplabJenkins commented on pull request #33047:
URL: https://github.com/apache/spark/pull/33047#issuecomment-867415839


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44774/
   





[GitHub] [spark] MaxGekk closed pull request #33047: [SPARK-35730][SQL][TESTS] Check all day-time interval types in UDF

2021-06-24 Thread GitBox


MaxGekk closed pull request #33047:
URL: https://github.com/apache/spark/pull/33047


   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33046: [SPARK-35282][SQL][FOLLOWUP] Simplify condition code of shuffled hash join

2021-06-24 Thread GitBox


AmplabJenkins removed a comment on pull request #33046:
URL: https://github.com/apache/spark/pull/33046#issuecomment-867410799


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140232/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33053: [SPARK-35870][BUILD] Upgrade Jetty to 9.4.42

2021-06-24 Thread GitBox


AmplabJenkins removed a comment on pull request #33053:
URL: https://github.com/apache/spark/pull/33053#issuecomment-867413086


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44775/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33047: [SPARK-35730][SQL][TESTS] Check all day-time interval types in UDF

2021-06-24 Thread GitBox


AmplabJenkins removed a comment on pull request #33047:
URL: https://github.com/apache/spark/pull/33047#issuecomment-867415839


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44774/
   





[GitHub] [spark] SparkQA commented on pull request #33000: [WIP][SPARK-35778][TESTS][SQL] Check multiply/divide of year-month intervals of any fields by numeric

2021-06-24 Thread GitBox


SparkQA commented on pull request #33000:
URL: https://github.com/apache/spark/pull/33000#issuecomment-867416502


   **[Test build #140251 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140251/testReport)** for PR 33000 at commit [`0e7d5fc`](https://github.com/apache/spark/commit/0e7d5fc25f02754bb774eda1be654abf2d0ab884).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins commented on pull request #33000: [WIP][SPARK-35778][TESTS][SQL] Check multiply/divide of year-month intervals of any fields by numeric

2021-06-24 Thread GitBox


AmplabJenkins commented on pull request #33000:
URL: https://github.com/apache/spark/pull/33000#issuecomment-867416557


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140251/
   





[GitHub] [spark] SparkQA removed a comment on pull request #33000: [WIP][SPARK-35778][TESTS][SQL] Check multiply/divide of year-month intervals of any fields by numeric

2021-06-24 Thread GitBox


SparkQA removed a comment on pull request #33000:
URL: https://github.com/apache/spark/pull/33000#issuecomment-867412108


   **[Test build #140251 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140251/testReport)** for PR 33000 at commit [`0e7d5fc`](https://github.com/apache/spark/commit/0e7d5fc25f02754bb774eda1be654abf2d0ab884).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33000: [WIP][SPARK-35778][TESTS][SQL] Check multiply/divide of year-month intervals of any fields by numeric

2021-06-24 Thread GitBox


AmplabJenkins removed a comment on pull request #33000:
URL: https://github.com/apache/spark/pull/33000#issuecomment-867416557


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140251/
   





[GitHub] [spark] cloud-fan commented on pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast

2021-06-24 Thread GitBox


cloud-fan commented on pull request #32940:
URL: https://github.com/apache/spark/pull/32940#issuecomment-867419498


   thanks, merging to master!





[GitHub] [spark] cloud-fan closed pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast

2021-06-24 Thread GitBox


cloud-fan closed pull request #32940:
URL: https://github.com/apache/spark/pull/32940


   





[GitHub] [spark] MaxGekk commented on a change in pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast

2021-06-24 Thread GitBox


MaxGekk commented on a change in pull request #32940:
URL: https://github.com/apache/spark/pull/32940#discussion_r657709476



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
##
@@ -108,18 +108,22 @@ object IntervalUtils {
 
   def castStringToYMInterval(
   input: UTF8String,
-  // TODO(SPARK-35768): Take into account year-month interval fields in cast
   startField: Byte,
   endField: Byte): Int = {
+
+def truncatedMonth(month: String) : String = {
+  if (endField == YearMonthIntervalType.YEAR) "0" else month
+}
+
 input.trimAll().toString match {
-  case yearMonthRegex("-", year, month) => toYMInterval(year, month, -1)
-  case yearMonthRegex(_, year, month) => toYMInterval(year, month, 1)
+  case yearMonthRegex("-", year, month) => toYMInterval(year, truncatedMonth(month), -1)
+  case yearMonthRegex(_, year, month) => toYMInterval(year, truncatedMonth(month), 1)
   case yearMonthLiteralRegex(firstSign, secondSign, year, month) =>

Review comment:
   @AngersZh Could you open a JIRA ticket for this, please?
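
For readers of the archive, the effect of `truncatedMonth` in the diff above can be illustrated with a minimal, self-contained Java sketch. This is not Spark's implementation: the class name, the `castStringToMonths` helper, the simplified regex, and the `YEAR`/`MONTH` constants are assumptions made purely for illustration. It only shows the idea that the month part is dropped when the end field is `YEAR`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class YearMonthCastSketch {
    // Hypothetical stand-ins for the year-month interval field constants.
    static final byte YEAR = 0;
    static final byte MONTH = 1;

    // Simplified pattern: an optionally signed "year-month" string such as "1-2" or "-1-2".
    private static final Pattern YEAR_MONTH = Pattern.compile("([+-])?(\\d+)-(\\d+)");

    /** Returns the interval as a total number of months, honoring the end field. */
    static int castStringToMonths(String input, byte endField) {
        Matcher m = YEAR_MONTH.matcher(input.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a year-month interval: " + input);
        }
        int sign = "-".equals(m.group(1)) ? -1 : 1;
        int years = Integer.parseInt(m.group(2));
        // Same idea as truncatedMonth in the diff: ignore months when the end field is YEAR.
        int months = (endField == YEAR) ? 0 : Integer.parseInt(m.group(3));
        return sign * Math.addExact(Math.multiplyExact(years, 12), months);
    }

    public static void main(String[] args) {
        System.out.println(castStringToMonths("1-2", MONTH)); // 14
        System.out.println(castStringToMonths("1-2", YEAR));  // 12
        System.out.println(castStringToMonths("-1-2", YEAR)); // -12
    }
}
```

The sketch covers only the plain `year-month` string form; the literal form handled by `yearMonthLiteralRegex` in the diff is left out.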







[GitHub] [spark] SparkQA commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-24 Thread GitBox


SparkQA commented on pull request #33054:
URL: https://github.com/apache/spark/pull/33054#issuecomment-867421422


   **[Test build #140249 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140249/testReport)** for PR 33054 at commit [`32c24d7`](https://github.com/apache/spark/commit/32c24d7f8a7652dbc03457d9b1963fbc03354a88).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] cloud-fan commented on pull request #33055: [SPARK-35872][INFRA] Automatize some steps to finalize the release

2021-06-24 Thread GitBox


cloud-fan commented on pull request #33055:
URL: https://github.com/apache/spark/pull/33055#issuecomment-867421503


   >  I expected Moving Spark binaries to the release directory first 
   
   This is a good point, as the release directory should be better able to serve download requests. We can try this idea in the next release.





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-24 Thread GitBox


AmplabJenkins removed a comment on pull request #33054:
URL: https://github.com/apache/spark/pull/33054#issuecomment-867421659


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140249/
   





[GitHub] [spark] SparkQA commented on pull request #33000: [WIP][SPARK-35778][TESTS][SQL] Check multiply/divide of year-month intervals of any fields by numeric

2021-06-24 Thread GitBox


SparkQA commented on pull request #33000:
URL: https://github.com/apache/spark/pull/33000#issuecomment-867421694


   Kubernetes integration test unable to build dist.
   
   exiting with code: 1
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44780/
   





[GitHub] [spark] SparkQA removed a comment on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-24 Thread GitBox


SparkQA removed a comment on pull request #33054:
URL: https://github.com/apache/spark/pull/33054#issuecomment-867411977


   **[Test build #140249 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140249/testReport)** for PR 33054 at commit [`32c24d7`](https://github.com/apache/spark/commit/32c24d7f8a7652dbc03457d9b1963fbc03354a88).





[GitHub] [spark] AmplabJenkins commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-24 Thread GitBox


AmplabJenkins commented on pull request #33054:
URL: https://github.com/apache/spark/pull/33054#issuecomment-867421659


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140249/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33000: [WIP][SPARK-35778][TESTS][SQL] Check multiply/divide of year-month intervals of any fields by numeric

2021-06-24 Thread GitBox


AmplabJenkins commented on pull request #33000:
URL: https://github.com/apache/spark/pull/33000#issuecomment-867421720


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44780/
   





[GitHub] [spark] Peng-Lei commented on a change in pull request #33056: [SPARK-35871][SQL] Literal.create(value, dataType) should support fields

2021-06-24 Thread GitBox


Peng-Lei commented on a change in pull request #33056:
URL: https://github.com/apache/spark/pull/33056#discussion_r657712009



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala
##
@@ -153,7 +153,13 @@ object Literal {
   def fromObject(obj: Any): Literal = new Literal(obj, ObjectType(obj.getClass))
 
   def create(v: Any, dataType: DataType): Literal = {
-Literal(CatalystTypeConverters.convertToCatalyst(v), dataType)
+dataType match {
+  case _: YearMonthIntervalType if v.isInstanceOf[Period] =>
+Literal(CatalystTypeConverters.createToCatalystConverter(dataType)(v), dataType)
+  case _: DayTimeIntervalType if v.isInstanceOf[Duration] =>
+Literal(CatalystTypeConverters.createToCatalystConverter(dataType)(v), dataType)
+  case _ => Literal(CatalystTypeConverters.convertToCatalyst(v), dataType)
+}

Review comment:
   I don't think so. When I debug `val r = Literal.create(duration, dt)`, `r.dataType` is correct. Here `dt` is a day-time or year-month interval type with any fields.







[GitHub] [spark] AmplabJenkins removed a comment on pull request #33000: [WIP][SPARK-35778][TESTS][SQL] Check multiply/divide of year-month intervals of any fields by numeric

2021-06-24 Thread GitBox


AmplabJenkins removed a comment on pull request #33000:
URL: https://github.com/apache/spark/pull/33000#issuecomment-867421720


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44780/
   





[GitHub] [spark] cloud-fan commented on a change in pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-24 Thread GitBox


cloud-fan commented on a change in pull request #32753:
URL: https://github.com/apache/spark/pull/32753#discussion_r657714379



##
File path: 
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetReadState.java
##
@@ -17,13 +17,31 @@
 
 package org.apache.spark.sql.execution.datasources.parquet;
 
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+import java.util.PrimitiveIterator;
+
 /**
  * Helper class to store intermediate state while reading a Parquet column chunk.
  */
 final class ParquetReadState {
-  /** Maximum definition level */
+  private static final RowRange MAX_ROW_RANGE = new RowRange(Long.MIN_VALUE, Long.MAX_VALUE);
+  private static final RowRange MIN_ROW_RANGE = new RowRange(Long.MAX_VALUE, Long.MIN_VALUE);

Review comment:
   so `start` can be larger than `end` in `RowRange`?







[GitHub] [spark] cloud-fan commented on a change in pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-24 Thread GitBox


cloud-fan commented on a change in pull request #32753:
URL: https://github.com/apache/spark/pull/32753#discussion_r657715329



##
File path: 
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetReadState.java
##
@@ -17,13 +17,31 @@
 
 package org.apache.spark.sql.execution.datasources.parquet;
 
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+import java.util.PrimitiveIterator;
+
 /**
  * Helper class to store intermediate state while reading a Parquet column chunk.
  */
 final class ParquetReadState {
-  /** Maximum definition level */
+  private static final RowRange MAX_ROW_RANGE = new RowRange(Long.MIN_VALUE, Long.MAX_VALUE);
+  private static final RowRange MIN_ROW_RANGE = new RowRange(Long.MAX_VALUE, Long.MIN_VALUE);
+
+  /** Iterator over all row ranges, only not-null if column index is present */
+  private final Iterator<RowRange> rowRanges;

Review comment:
   does each column generate one row range?







[GitHub] [spark] cloud-fan commented on a change in pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-24 Thread GitBox


cloud-fan commented on a change in pull request #32753:
URL: https://github.com/apache/spark/pull/32753#discussion_r657715983



##
File path: 
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetReadState.java
##
@@ -33,31 +51,102 @@
   /** The remaining number of values to read in the current batch */
   int valuesToReadInBatch;
 
-  ParquetReadState(int maxDefinitionLevel) {
+  ParquetReadState(int maxDefinitionLevel, PrimitiveIterator.OfLong rowIndexes) {
 this.maxDefinitionLevel = maxDefinitionLevel;
+this.rowRanges = rowIndexes == null ? null : constructRanges(rowIndexes);
+nextRange();
+  }
+
+  private Iterator<RowRange> constructRanges(PrimitiveIterator.OfLong rowIndexes) {
+List<RowRange> rowRanges = new ArrayList<>();
+long currentStart, previous;
+currentStart = previous = Long.MIN_VALUE;

Review comment:
   I'd prefer
   ```
   long currentStart = Long.MIN_VALUE;
   long previous = Long.MIN_VALUE;
   ```







[GitHub] [spark] cloud-fan commented on a change in pull request #32753: [SPARK-34859][SQL] Handle column index when using vectorized Parquet reader

2021-06-24 Thread GitBox


cloud-fan commented on a change in pull request #32753:
URL: https://github.com/apache/spark/pull/32753#discussion_r657716760



##
File path: 
sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetReadState.java
##
@@ -33,31 +51,102 @@
   /** The remaining number of values to read in the current batch */
   int valuesToReadInBatch;
 
-  ParquetReadState(int maxDefinitionLevel) {
+  ParquetReadState(int maxDefinitionLevel, PrimitiveIterator.OfLong rowIndexes) {
 this.maxDefinitionLevel = maxDefinitionLevel;
+this.rowRanges = rowIndexes == null ? null : constructRanges(rowIndexes);
+nextRange();
+  }
+
+  private Iterator<RowRange> constructRanges(PrimitiveIterator.OfLong rowIndexes) {

Review comment:
   Can we add some comments to explain how to generate row ranges?
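
As background for the question above, one plausible way to turn an ascending stream of row indexes into row ranges is to merge runs of consecutive indexes. The following is a minimal Java sketch of that idea under stated assumptions, not the code from the pull request: the `RowRangeSketch` class, the inclusive `[start, end]` convention, and the merging loop are illustrative, and the actual patch uses `Long.MIN_VALUE` sentinels rather than the empty-input check shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PrimitiveIterator;

public class RowRangeSketch {
    /** Inclusive range of row indexes [start, end]; an illustrative stand-in. */
    static final class RowRange {
        final long start;
        final long end;

        RowRange(long start, long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + "]";
        }
    }

    /** Collapses ascending row indexes into maximal runs of consecutive values. */
    static List<RowRange> constructRanges(PrimitiveIterator.OfLong rowIndexes) {
        List<RowRange> ranges = new ArrayList<>();
        if (!rowIndexes.hasNext()) {
            return ranges;
        }
        long currentStart = rowIndexes.nextLong();
        long previous = currentStart;
        while (rowIndexes.hasNext()) {
            long idx = rowIndexes.nextLong();
            if (idx != previous + 1) {
                // A gap ends the current run, so emit it and start a new one.
                ranges.add(new RowRange(currentStart, previous));
                currentStart = idx;
            }
            previous = idx;
        }
        // Emit the final run.
        ranges.add(new RowRange(currentStart, previous));
        return ranges;
    }

    public static void main(String[] args) {
        long[] indexes = {0, 1, 2, 5, 6, 9};
        System.out.println(constructRanges(Arrays.stream(indexes).iterator()));
        // prints [[0, 2], [5, 6], [9, 9]]
    }
}
```

Storing ranges instead of individual indexes keeps the state small when the column index selects long contiguous stretches of rows.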







[GitHub] [spark] AngersZhuuuu commented on a change in pull request #33056: [SPARK-35871][SQL] Literal.create(value, dataType) should support fields

2021-06-24 Thread GitBox


AngersZh commented on a change in pull request #33056:
URL: https://github.com/apache/spark/pull/33056#discussion_r657717341



##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralExpressionSuite.scala
##
@@ -432,4 +432,22 @@ class LiteralExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
   assert(literal.toString === expected)
 }
   }
+
+  test("SPARK-35871: Literal.create(value, dataType) should support fields") {
+Seq((Period.ofMonths(13), Array(13, 12, 13)))
+  .foreach { case (period, expect) =>
+DataTypeTestUtils.yearMonthIntervalTypes.zip(expect).foreach { case (dt, result) =>
+  checkEvaluation(Literal.create(period, dt), result)
+}
+  }
+
+Seq((Duration.ofSeconds(86400 + 3600 + 60 + 1),
+  Array(864L, 900L, 9006000L, 9006100L, 
900L,
+9006000L, 9006100L, 9006000L, 9006100L, 9006100L)))

Review comment:
   How about the current version?
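
The `Array(13, 12, 13)` expectation for `Period.ofMonths(13)` in the test above suggests that a value is truncated to the end field of the target interval type. A minimal Java sketch of that arithmetic follows. The class and helper names are hypothetical, the "unit in seconds" parameter is an assumption made for illustration, and sub-second nanos are ignored. It does not reproduce Spark's converters.

```java
import java.time.Duration;
import java.time.Period;

public class IntervalFieldSketch {
    /** Months kept for a year-month interval whose end field is YEAR (true) or MONTH (false). */
    static int monthsFor(Period p, boolean endFieldIsYear) {
        int total = Math.toIntExact(p.toTotalMonths());
        // Dropping the remainder modulo 12 keeps whole years only.
        return endFieldIsYear ? total - total % 12 : total;
    }

    /** Microseconds kept after dropping everything finer than a unit of unitSeconds seconds. */
    static long microsFor(Duration d, long unitSeconds) {
        long seconds = d.getSeconds();
        return (seconds - seconds % unitSeconds) * 1_000_000L;
    }

    public static void main(String[] args) {
        Period period = Period.ofMonths(13);
        System.out.println(monthsFor(period, false)); // 13
        System.out.println(monthsFor(period, true));  // 12

        Duration duration = Duration.ofSeconds(86400 + 3600 + 60 + 1);
        System.out.println(microsFor(duration, 86400)); // 86400000000 (end field DAY)
        System.out.println(microsFor(duration, 3600));  // 90000000000 (end field HOUR)
        System.out.println(microsFor(duration, 60));    // 90060000000 (end field MINUTE)
        System.out.println(microsFor(duration, 1));     // 90061000000 (end field SECOND)
    }
}
```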







[GitHub] [spark] AngersZhuuuu commented on a change in pull request #32940: [SPARK-35768][SQL] Take into account year-month interval fields in cast

2021-06-24 Thread GitBox


AngersZh commented on a change in pull request #32940:
URL: https://github.com/apache/spark/pull/32940#discussion_r657717789



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala
##
@@ -108,18 +108,22 @@ object IntervalUtils {
 
   def castStringToYMInterval(
   input: UTF8String,
-  // TODO(SPARK-35768): Take into account year-month interval fields in cast
   startField: Byte,
   endField: Byte): Int = {
+
+def truncatedMonth(month: String) : String = {
+  if (endField == YearMonthIntervalType.YEAR) "0" else month
+}
+
 input.trimAll().toString match {
-  case yearMonthRegex("-", year, month) => toYMInterval(year, month, -1)
-  case yearMonthRegex(_, year, month) => toYMInterval(year, month, 1)
+  case yearMonthRegex("-", year, month) => toYMInterval(year, truncatedMonth(month), -1)
+  case yearMonthRegex(_, year, month) => toYMInterval(year, truncatedMonth(month), 1)
   case yearMonthLiteralRegex(firstSign, secondSign, year, month) =>

Review comment:
   > @AngersZh Could you open JIRA ticket for this, please.
   
   Sure, I will work on this.







[GitHub] [spark] ulysses-you commented on pull request #33046: [SPARK-35282][SQL][FOLLOWUP] Simplify condition code of shuffled hash join

2021-06-24 Thread GitBox


ulysses-you commented on pull request #33046:
URL: https://github.com/apache/spark/pull/33046#issuecomment-867433817


   cc @maropu @cloud-fan 





[GitHub] [spark] sarutak opened a new pull request #33057: [SPARK-35736][SPARK-35774][SQL][FOLLOWUP] Prohibit to specify the same units for FROM and TO with unit-to-unit interval syntax.

2021-06-24 Thread GitBox


sarutak opened a new pull request #33057:
URL: https://github.com/apache/spark/pull/33057


   ### What changes were proposed in this pull request?
   
   This PR changes the behavior of the unit-to-unit interval syntax to prohibit specifying the same unit for both FROM and TO.
   
   ### Why are the changes needed?
   
   For ANSI compliance.
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   New test.





[GitHub] [spark] SparkQA commented on pull request #33047: [SPARK-35730][SQL][TESTS] Check all day-time interval types in UDF

2021-06-24 Thread GitBox


SparkQA commented on pull request #33047:
URL: https://github.com/apache/spark/pull/33047#issuecomment-867440031


   **[Test build #140237 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140237/testReport)** for PR 33047 at commit [`6c942d0`](https://github.com/apache/spark/commit/6c942d01b7fb4b2d058c9a5e055da3899572f500).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #33045: [SPARK-35476][PYTHON] Fix disallow_untyped_defs mypy checks for pyspark.pandas.series

2021-06-24 Thread GitBox


SparkQA commented on pull request #33045:
URL: https://github.com/apache/spark/pull/33045#issuecomment-867440278


   **[Test build #140250 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140250/testReport)** for PR 33045 at commit [`6d862be`](https://github.com/apache/spark/commit/6d862be3075c4a38b2726e5c148bfa7269fe9dfe).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `case class MergingSessionsExec(`
 * `class MergingSessionsIterator(`





[GitHub] [spark] HeartSaVioR commented on pull request #31989: [SPARK-34891][SS] Introduce state store manager for session window in streaming query

2021-06-24 Thread GitBox


HeartSaVioR commented on pull request #31989:
URL: https://github.com/apache/spark/pull/31989#issuecomment-867443816


   UPDATE: This is now on top of SPARK-35861 (#33038). I'll rebase this once we 
merge #33038.





[GitHub] [spark] AmplabJenkins commented on pull request #33045: [SPARK-35476][PYTHON] Fix disallow_untyped_defs mypy checks for pyspark.pandas.series

2021-06-24 Thread GitBox


AmplabJenkins commented on pull request #33045:
URL: https://github.com/apache/spark/pull/33045#issuecomment-867449652


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140250/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33047: [SPARK-35730][SQL][TESTS] Check all day-time interval types in UDF

2021-06-24 Thread GitBox


AmplabJenkins commented on pull request #33047:
URL: https://github.com/apache/spark/pull/33047#issuecomment-867449653


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140237/
   





[GitHub] [spark] steveloughran commented on pull request #33044: [SPARK-35868][CORE] Add fs.s3a.downgrade.syncable.exceptions if not set

2021-06-24 Thread GitBox


steveloughran commented on pull request #33044:
URL: https://github.com/apache/spark/pull/33044#issuecomment-867449625


   Thanks. FWIW, given it's causing trouble, do you want this to be the default in 
hadoop default-xml?
   
   It's there to stop people attempting to use S3 as a WAL for HBase or similar, 
but if applications have been treating it as a low-cost operation in general 
file IO, then we can just downgrade it broadly and rely on the hope that people 
don't do this.
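
   The PR title describes a set-if-unset pattern: the option is added only when the user has not configured it. A minimal sketch of that pattern, assuming a plain `Map` stands in for the real configuration object and assuming `"true"` is the value being supplied (neither is confirmed in this thread; `withSyncableDowngrade` is a hypothetical name):

```scala
object SyncableDefaultSketch {
  // Option name taken from the PR title.
  val Key = "fs.s3a.downgrade.syncable.exceptions"

  // Hypothetical helper: add the option only when the caller has not set it,
  // so an explicit user choice (including "false") is never overwritten.
  def withSyncableDowngrade(conf: Map[String, String]): Map[String, String] =
    if (conf.contains(Key)) conf
    else conf + (Key -> "true") // "true" is an assumed default, not confirmed here
}
```

   Making it the default in Hadoop itself, as asked above, would presumably make a client-side guard like this unnecessary.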





[GitHub] [spark] SparkQA commented on pull request #33056: [SPARK-35871][SQL] Literal.create(value, dataType) should support fields

2021-06-24 Thread GitBox


SparkQA commented on pull request #33056:
URL: https://github.com/apache/spark/pull/33056#issuecomment-867450077


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44776/
   





[GitHub] [spark] SparkQA commented on pull request #33056: [SPARK-35871][SQL] Literal.create(value, dataType) should support fields

2021-06-24 Thread GitBox


SparkQA commented on pull request #33056:
URL: https://github.com/apache/spark/pull/33056#issuecomment-867451470


   **[Test build #140253 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140253/testReport)** for PR 33056 at commit [`e878815`](https://github.com/apache/spark/commit/e878815889b4297b90253e4d1d66680b479c2781).





[GitHub] [spark] SparkQA commented on pull request #33057: [SPARK-35736][SPARK-35774][SQL][FOLLOWUP] Prohibit to specify the same units for FROM and TO with unit-to-unit interval syntax.

2021-06-24 Thread GitBox


SparkQA commented on pull request #33057:
URL: https://github.com/apache/spark/pull/33057#issuecomment-867451388


   **[Test build #140252 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140252/testReport)** for PR 33057 at commit [`29f09df`](https://github.com/apache/spark/commit/29f09df912eb886dd944f6ca2b151528027afbfd).





[GitHub] [spark] SparkQA commented on pull request #31989: [SPARK-34891][SS] Introduce state store manager for session window in streaming query

2021-06-24 Thread GitBox


SparkQA commented on pull request #31989:
URL: https://github.com/apache/spark/pull/31989#issuecomment-867452234


   **[Test build #140254 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140254/testReport)** for PR 31989 at commit [`efab51d`](https://github.com/apache/spark/commit/efab51d6783651d529cc1cc000e77ef89ce5b3cc).





[GitHub] [spark] SparkQA commented on pull request #33055: [SPARK-35872][INFRA] Automatize some steps to finalize the release

2021-06-24 Thread GitBox


SparkQA commented on pull request #33055:
URL: https://github.com/apache/spark/pull/33055#issuecomment-867452860


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44777/
   





[GitHub] [spark] cloud-fan commented on a change in pull request #33056: [SPARK-35871][SQL] Literal.create(value, dataType) should support fields

2021-06-24 Thread GitBox


cloud-fan commented on a change in pull request #33056:
URL: https://github.com/apache/spark/pull/33056#discussion_r657747329



##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralExpressionSuite.scala
##
@@ -432,4 +434,27 @@ class LiteralExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
       assert(literal.toString === expected)
     }
   }
+
+  test("SPARK-35871: Literal.create(value, dataType) should support fields") {
+    val period = Period.ofMonths(13)
+    Seq(YearMonthIntervalType(YEAR, MONTH) -> 13,
+      YearMonthIntervalType(YEAR) -> 12,
+      YearMonthIntervalType(MONTH) -> 13).foreach { case (dt, result) =>
+      checkEvaluation(Literal.create(period, dt), result)
+    }
+
+    val duration = Duration.ofSeconds(86400 + 3600 + 60 + 1)
+    Seq(DayTimeIntervalType(DAY) -> 86400000000L,
+      DayTimeIntervalType(DAY, HOUR) -> 90000000000L,
+      DayTimeIntervalType(DAY, MINUTE) -> 90060000000L,
+      DayTimeIntervalType(DAY, SECOND) -> 90061000000L,
+      DayTimeIntervalType(HOUR) -> 90000000000L,
+      DayTimeIntervalType(HOUR, MINUTE) -> 90060000000L,
+      DayTimeIntervalType(HOUR, SECOND) -> 90061000000L,
+      DayTimeIntervalType(MINUTE) -> 90060000000L,
+      DayTimeIntervalType(MINUTE, SECOND) -> 90061000000L,
+      DayTimeIntervalType(SECOND) -> 90061000000L).foreach { case (dt, result) =>

Review comment:
   This is still hard to read as I don't know if all the types are listed. 
I'd prefer `DataTypeTestUtils.dayTimeIntervalTypes`
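
   One reading of the expected values in this test is that each is the duration in microseconds, truncated to the last field of the interval type (so `DAY` keeps whole days only). A small sketch of that arithmetic using only `java.time`, under that assumption; `truncatedMicros` is a hypothetical helper, not part of Spark's API:

```scala
import java.time.Duration
import java.time.temporal.ChronoUnit

object IntervalTruncationSketch {
  // Hypothetical helper (not Spark code): drop everything finer than
  // `endField` and return what is left as a microsecond count.
  def truncatedMicros(d: Duration, endField: ChronoUnit): Long = {
    val unitMicros = endField.getDuration.toNanos / 1000L
    val totalMicros = d.toNanos / 1000L
    totalMicros / unitMicros * unitMicros
  }

  // Same duration as in the test: 1 day, 1 hour, 1 minute, 1 second.
  val duration: Duration = Duration.ofSeconds(86400 + 3600 + 60 + 1)

  // Expected value per end field, computed rather than listed by hand.
  val expectedByEndField: Map[ChronoUnit, Long] =
    Seq(ChronoUnit.DAYS, ChronoUnit.HOURS, ChronoUnit.MINUTES, ChronoUnit.SECONDS)
      .map(unit => unit -> truncatedMicros(duration, unit))
      .toMap
}
```

   Deriving the expectation from the end field in this way is one option for a test that loops over every day-time interval type instead of listing each pair by hand.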







[GitHub] [spark] cloud-fan commented on pull request #33046: [SPARK-35282][SQL][FOLLOWUP] Simplify condition code of shuffled hash join

2021-06-24 Thread GitBox


cloud-fan commented on pull request #33046:
URL: https://github.com/apache/spark/pull/33046#issuecomment-867454625


   The GA failure of pyspark is unrelated. merging to master!





[GitHub] [spark] SparkQA commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-24 Thread GitBox


SparkQA commented on pull request #33054:
URL: https://github.com/apache/spark/pull/33054#issuecomment-867454801


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44778/
   





[GitHub] [spark] cloud-fan closed pull request #33046: [SPARK-35282][SQL][FOLLOWUP] Simplify condition code of shuffled hash join

2021-06-24 Thread GitBox


cloud-fan closed pull request #33046:
URL: https://github.com/apache/spark/pull/33046


   





[GitHub] [spark] SparkQA removed a comment on pull request #33045: [SPARK-35476][PYTHON] Fix disallow_untyped_defs mypy checks for pyspark.pandas.series

2021-06-24 Thread GitBox


SparkQA removed a comment on pull request #33045:
URL: https://github.com/apache/spark/pull/33045#issuecomment-867412122


   **[Test build #140250 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140250/testReport)** for PR 33045 at commit [`6d862be`](https://github.com/apache/spark/commit/6d862be3075c4a38b2726e5c148bfa7269fe9dfe).





[GitHub] [spark] SparkQA removed a comment on pull request #33047: [SPARK-35730][SQL][TESTS] Check all day-time interval types in UDF

2021-06-24 Thread GitBox


SparkQA removed a comment on pull request #33047:
URL: https://github.com/apache/spark/pull/33047#issuecomment-867309381


   **[Test build #140237 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140237/testReport)** for PR 33047 at commit [`6c942d0`](https://github.com/apache/spark/commit/6c942d01b7fb4b2d058c9a5e055da3899572f500).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33047: [SPARK-35730][SQL][TESTS] Check all day-time interval types in UDF

2021-06-24 Thread GitBox


AmplabJenkins removed a comment on pull request #33047:
URL: https://github.com/apache/spark/pull/33047#issuecomment-867449653


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140237/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33045: [SPARK-35476][PYTHON] Fix disallow_untyped_defs mypy checks for pyspark.pandas.series

2021-06-24 Thread GitBox


AmplabJenkins removed a comment on pull request #33045:
URL: https://github.com/apache/spark/pull/33045#issuecomment-867449652


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140250/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33056: [SPARK-35871][SQL] Literal.create(value, dataType) should support fields

2021-06-24 Thread GitBox


AmplabJenkins commented on pull request #33056:
URL: https://github.com/apache/spark/pull/33056#issuecomment-867455537


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140253/
   





[GitHub] [spark] SparkQA commented on pull request #33056: [SPARK-35871][SQL] Literal.create(value, dataType) should support fields

2021-06-24 Thread GitBox


SparkQA commented on pull request #33056:
URL: https://github.com/apache/spark/pull/33056#issuecomment-867455483


   **[Test build #140253 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140253/testReport)** for PR 33056 at commit [`e878815`](https://github.com/apache/spark/commit/e878815889b4297b90253e4d1d66680b479c2781).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] HeartSaVioR commented on pull request #31989: [SPARK-34891][SS] Introduce state store manager for session window in streaming query

2021-06-24 Thread GitBox


HeartSaVioR commented on pull request #31989:
URL: https://github.com/apache/spark/pull/31989#issuecomment-867455800


   Changed this to draft for now to prevent merging before merging #33038





[GitHub] [spark] SparkQA removed a comment on pull request #33056: [SPARK-35871][SQL] Literal.create(value, dataType) should support fields

2021-06-24 Thread GitBox


SparkQA removed a comment on pull request #33056:
URL: https://github.com/apache/spark/pull/33056#issuecomment-867451470


   **[Test build #140253 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140253/testReport)** for PR 33056 at commit [`e878815`](https://github.com/apache/spark/commit/e878815889b4297b90253e4d1d66680b479c2781).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33056: [SPARK-35871][SQL] Literal.create(value, dataType) should support fields

2021-06-24 Thread GitBox


AmplabJenkins removed a comment on pull request #33056:
URL: https://github.com/apache/spark/pull/33056#issuecomment-867455537


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140253/
   





[GitHub] [spark] SparkQA commented on pull request #33056: [SPARK-35871][SQL] Literal.create(value, dataType) should support fields

2021-06-24 Thread GitBox


SparkQA commented on pull request #33056:
URL: https://github.com/apache/spark/pull/33056#issuecomment-867456553


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44776/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33056: [SPARK-35871][SQL] Literal.create(value, dataType) should support fields

2021-06-24 Thread GitBox


AmplabJenkins commented on pull request #33056:
URL: https://github.com/apache/spark/pull/33056#issuecomment-867456588


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44776/
   





[GitHub] [spark] cloud-fan commented on pull request #33011: [SPARK-35841][SQL] Casting string to decimal type doesn't work if the…

2021-06-24 Thread GitBox


cloud-fan commented on pull request #33011:
URL: https://github.com/apache/spark/pull/33011#issuecomment-867456657


   thanks, merging to master/3.1!





[GitHub] [spark] AmplabJenkins commented on pull request #31989: [SPARK-34891][SS] Introduce state store manager for session window in streaming query

2021-06-24 Thread GitBox


AmplabJenkins commented on pull request #31989:
URL: https://github.com/apache/spark/pull/31989#issuecomment-867456793


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140254/
   





[GitHub] [spark] cloud-fan closed pull request #33011: [SPARK-35841][SQL] Casting string to decimal type doesn't work if the…

2021-06-24 Thread GitBox


cloud-fan closed pull request #33011:
URL: https://github.com/apache/spark/pull/33011


   





[GitHub] [spark] SparkQA removed a comment on pull request #31989: [SPARK-34891][SS] Introduce state store manager for session window in streaming query

2021-06-24 Thread GitBox


SparkQA removed a comment on pull request #31989:
URL: https://github.com/apache/spark/pull/31989#issuecomment-867452234


   **[Test build #140254 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140254/testReport)** for PR 31989 at commit [`efab51d`](https://github.com/apache/spark/commit/efab51d6783651d529cc1cc000e77ef89ce5b3cc).





[GitHub] [spark] SparkQA commented on pull request #31989: [SPARK-34891][SS] Introduce state store manager for session window in streaming query

2021-06-24 Thread GitBox


SparkQA commented on pull request #31989:
URL: https://github.com/apache/spark/pull/31989#issuecomment-867456739


   **[Test build #140254 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140254/testReport)** for PR 31989 at commit [`efab51d`](https://github.com/apache/spark/commit/efab51d6783651d529cc1cc000e77ef89ce5b3cc).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `sealed trait StreamingSessionWindowStateManager extends Serializable `
 * `class StreamingSessionWindowStateManagerImplV1(`
 * `class StreamingSessionWindowHelper(sessionExpression: Attribute, inputSchema: Seq[Attribute]) `
 * `  case class WindowRecord(start: Long, end: Long, isNew: Boolean, row: UnsafeRow)`





[GitHub] [spark] AmplabJenkins removed a comment on pull request #31989: [SPARK-34891][SS] Introduce state store manager for session window in streaming query

2021-06-24 Thread GitBox


AmplabJenkins removed a comment on pull request #31989:
URL: https://github.com/apache/spark/pull/31989#issuecomment-867456793


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140254/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33056: [SPARK-35871][SQL] Literal.create(value, dataType) should support fields

2021-06-24 Thread GitBox


AmplabJenkins removed a comment on pull request #33056:
URL: https://github.com/apache/spark/pull/33056#issuecomment-867456588


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44776/
   





[GitHub] [spark] SparkQA commented on pull request #33055: [SPARK-35872][INFRA] Automatize some steps to finalize the release

2021-06-24 Thread GitBox


SparkQA commented on pull request #33055:
URL: https://github.com/apache/spark/pull/33055#issuecomment-867460212


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44777/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33055: [SPARK-35872][INFRA] Automatize some steps to finalize the release

2021-06-24 Thread GitBox


AmplabJenkins commented on pull request #33055:
URL: https://github.com/apache/spark/pull/33055#issuecomment-867460253


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44777/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33055: [SPARK-35872][INFRA] Automatize some steps to finalize the release

2021-06-24 Thread GitBox


AmplabJenkins removed a comment on pull request #33055:
URL: https://github.com/apache/spark/pull/33055#issuecomment-867460253


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44777/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-24 Thread GitBox


AmplabJenkins commented on pull request #33054:
URL: https://github.com/apache/spark/pull/33054#issuecomment-867461504


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44778/
   





[GitHub] [spark] SparkQA commented on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-24 Thread GitBox


SparkQA commented on pull request #33054:
URL: https://github.com/apache/spark/pull/33054#issuecomment-867461467


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44778/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33054: [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame

2021-06-24 Thread GitBox


AmplabJenkins removed a comment on pull request #33054:
URL: https://github.com/apache/spark/pull/33054#issuecomment-867461504


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/44778/
   





[GitHub] [spark] SparkQA commented on pull request #33045: [SPARK-35476][PYTHON] Fix disallow_untyped_defs mypy checks for pyspark.pandas.series

2021-06-24 Thread GitBox


SparkQA commented on pull request #33045:
URL: https://github.com/apache/spark/pull/33045#issuecomment-867462481


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/44779/
   





[GitHub] [spark] HeartSaVioR commented on pull request #31989: [SPARK-34891][SS] Introduce state store manager for session window in streaming query

2021-06-24 Thread GitBox


HeartSaVioR commented on pull request #31989:
URL: https://github.com/apache/spark/pull/31989#issuecomment-867463664


   retest this, please





[GitHub] [spark] MaxGekk commented on pull request #33014: [SPARK-35728][SQL][TESTS] Check multiply/divide of day-time intervals of any fields by numeric

2021-06-24 Thread GitBox


MaxGekk commented on pull request #33014:
URL: https://github.com/apache/spark/pull/33014#issuecomment-867465260


   > I use Literal.create(duration, dt) instead of Literal(duration, dt)
   
   @Peng-Lei Do you mean "instead of Literal(duration)"?





[GitHub] [spark] HyukjinKwon commented on pull request #33050: [SPARK-35301][PYTHON][DOCS] Document migration guide from Koalas to pandas APIs on Spark

2021-06-24 Thread GitBox


HyukjinKwon commented on pull request #33050:
URL: https://github.com/apache/spark/pull/33050#issuecomment-867465696


   Thanks all. Merged to master.





[GitHub] [spark] HyukjinKwon closed pull request #33050: [SPARK-35301][PYTHON][DOCS] Document migration guide from Koalas to pandas APIs on Spark

2021-06-24 Thread GitBox


HyukjinKwon closed pull request #33050:
URL: https://github.com/apache/spark/pull/33050


   





[GitHub] [spark] SparkQA commented on pull request #32932: [SPARK-35786][SQL] Add a new operator to distingush if AQE can optimize safely

2021-06-24 Thread GitBox


SparkQA commented on pull request #32932:
URL: https://github.com/apache/spark/pull/32932#issuecomment-867466382


   **[Test build #140243 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140243/testReport)** for PR 32932 at commit [`0fe14d0`](https://github.com/apache/spark/commit/0fe14d0001adedef66696699d478dd4d4e9c392c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA removed a comment on pull request #32932: [SPARK-35786][SQL] Add a new operator to distingush if AQE can optimize safely

2021-06-24 Thread GitBox


SparkQA removed a comment on pull request #32932:
URL: https://github.com/apache/spark/pull/32932#issuecomment-867328177


   **[Test build #140243 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140243/testReport)** for PR 32932 at commit [`0fe14d0`](https://github.com/apache/spark/commit/0fe14d0001adedef66696699d478dd4d4e9c392c).





[GitHub] [spark] Peng-Lei commented on a change in pull request #33056: [SPARK-35871][SQL] Literal.create(value, dataType) should support fields

2021-06-24 Thread GitBox


Peng-Lei commented on a change in pull request #33056:
URL: https://github.com/apache/spark/pull/33056#discussion_r657762610



##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/LiteralExpressionSuite.scala
##
@@ -432,4 +434,27 @@ class LiteralExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
   assert(literal.toString === expected)
 }
   }
+
+  test("SPARK-35871: Literal.create(value, dataType) should support fields") {
+val period = Period.ofMonths(13)
+Seq(YearMonthIntervalType(YEAR, MONTH) -> 13,
+  YearMonthIntervalType(YEAR) -> 12,
+  YearMonthIntervalType(MONTH) -> 13).foreach { case (dt, result) =>
+  checkEvaluation(Literal.create(period, dt), result)
+}
+
+val duration = Duration.ofSeconds(86400 + 3600 + 60 + 1)
+Seq(DayTimeIntervalType(DAY) -> 86400000000L,
+  DayTimeIntervalType(DAY, HOUR) -> 90000000000L,
+  DayTimeIntervalType(DAY, MINUTE) -> 90060000000L,
+  DayTimeIntervalType(DAY, SECOND) -> 90061000000L,
+  DayTimeIntervalType(HOUR) -> 90000000000L,
+  DayTimeIntervalType(HOUR, MINUTE) -> 90060000000L,
+  DayTimeIntervalType(HOUR, SECOND) -> 90061000000L,
+  DayTimeIntervalType(MINUTE) -> 90060000000L,
+  DayTimeIntervalType(MINUTE, SECOND) -> 90061000000L,
+  DayTimeIntervalType(SECOND) -> 90061000000L).foreach { case (dt, result) =>

Review comment:
   how about:
   ```scala
   val duration = Duration.ofSeconds(86400 + 3600 + 60 + 1)
   DataTypeTestUtils.dayTimeIntervalTypes.foreach { dt =>
     checkEvaluation(Literal.create(duration, dt),
       durationToMicros(duration, dt.endField))
   }
   ```
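
   For readers following the arithmetic, here is a minimal standalone sketch of the truncation the test relies on: a value is converted to microseconds (or months) and anything finer than the type's end field is dropped. It uses only `java.time`. The names `EndField`, `IntervalTruncation`, `truncatedMicros` and `truncatedMonths` are hypothetical and are not Spark's API. Keeping sub-second precision for the `SECOND` end field is an assumption of this sketch.

   ```scala
   import java.time.{Duration, Period}

   // Hypothetical stand-ins for the day-time interval end fields (not Spark's types).
   sealed trait EndField
   case object Day extends EndField
   case object Hour extends EndField
   case object Minute extends EndField
   case object Second extends EndField

   object IntervalTruncation {
     private val MicrosPerSecond = 1000000L
     private val MicrosPerMinute = 60L * MicrosPerSecond
     private val MicrosPerHour = 60L * MicrosPerMinute
     private val MicrosPerDay = 24L * MicrosPerHour

     // Total microseconds in `d`, with everything finer than `endField` dropped.
     def truncatedMicros(d: Duration, endField: EndField): Long = {
       val micros = d.getSeconds * MicrosPerSecond + d.getNano / 1000L
       val unit = endField match {
         case Day => MicrosPerDay
         case Hour => MicrosPerHour
         case Minute => MicrosPerMinute
         case Second => 1L // assumption: sub-second microseconds are kept
       }
       micros - micros % unit
     }

     // Total months in `p`, rounded down to whole years when the end field is YEAR.
     def truncatedMonths(p: Period, endIsYear: Boolean): Int = {
       val months = Math.toIntExact(p.toTotalMonths)
       if (endIsYear) months - months % 12 else months
     }
   }
   ```

   Calling `IntervalTruncation.truncatedMicros(Duration.ofSeconds(86400 + 3600 + 60 + 1), Hour)` drops the trailing minute and second, which is why interval types sharing an end field share an expected value regardless of their start field.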






