[GitHub] [spark] MaxGekk commented on a change in pull request #26102: [SPARK-29448][SQL] Support the `INTERVAL` type by Parquet datasource
MaxGekk commented on a change in pull request #26102: [SPARK-29448][SQL] Support the `INTERVAL` type by Parquet datasource
URL: https://github.com/apache/spark/pull/26102#discussion_r335287678

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRowConverter.scala ##
@@ -325,6 +325,26 @@ private[parquet] class ParquetRowConverter(
           override def set(value: Any): Unit = updater.set(value.asInstanceOf[InternalRow].copy())
         })

+      case CalendarIntervalType
+        if parquetType.asPrimitiveType().getPrimitiveTypeName == FIXED_LEN_BYTE_ARRAY =>
+        new ParquetPrimitiveConverter(updater) {
+          override def addBinary(value: Binary): Unit = {
+            assert(
+              value.length() == 12,
+              "Intervals are expected to be stored in 12-byte fixed len byte array, " +
+                s"but got a ${value.length()}-byte array.")
+
+            val buf = value.toByteBuffer.order(ByteOrder.LITTLE_ENDIAN)
+            val milliseconds = buf.getInt
+            var microseconds = milliseconds * DateTimeUtils.MICROS_PER_MILLIS
+            val days = buf.getInt
+            val daysInUs = Math.multiplyExact(days, DateTimeUtils.MICROS_PER_DAY)

Review comment:
I don't want to defend the other side :-) but a consequence of storing days separately is that hours are unbounded. As a result, `interval 1 day 25 hours` and `interval 2 days 1 hours` are represented differently in Parquet: (0, 1, 9000) and (0, 2, 360). As @cloud-fan wrote above, this can lead to different results when adding those intervals to 2 November 2019: `2019-11-02` + `interval 1 day 25 hours` = `2019-11-04 00:00:00` but `2019-11-02` + `interval 2 days 1 hour` = `2019-11-04 01:00:00`.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
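The arithmetic in the review comment above can be reproduced with plain `java.time`, without Spark. This is only a sketch under an assumption the thread does not state: the session time zone is taken to be `America/Los_Angeles`, where daylight saving time ended on 3 November 2019, so that local day lasts 25 hours. The object and method names are illustrative.

```scala
import java.time.{LocalDateTime, ZoneId, ZonedDateTime}

object IntervalDstSketch {
  // Assumed time zone (not named in the thread): DST ends here on 2019-11-03.
  val zone: ZoneId = ZoneId.of("America/Los_Angeles")
  val start: ZonedDateTime = ZonedDateTime.of(2019, 11, 2, 0, 0, 0, 0, zone)

  // Days are added on the local calendar; hours are added on the instant timeline.
  def add(days: Long, hours: Long): LocalDateTime =
    start.plusDays(days).plusHours(hours).toLocalDateTime

  def main(args: Array[String]): Unit = {
    println(add(1, 25)) // interval 1 day 25 hours
    println(add(2, 1))  // interval 2 days 1 hour
  }
}
```

Under that assumption the two calls give different local times (midnight versus 01:00 on 4 November), which is why the two intervals cannot be normalized into one another when days are stored separately.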
[GitHub] [spark] SparkQA commented on issue #26014: [SPARK-29349][SQL] Support FETCH_PRIOR in Thriftserver fetch request
SparkQA commented on issue #26014: [SPARK-29349][SQL] Support FETCH_PRIOR in Thriftserver fetch request URL: https://github.com/apache/spark/pull/26014#issuecomment-542529388 **[Test build #112138 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112138/testReport)** for PR 26014 at commit [`62e1f98`](https://github.com/apache/spark/commit/62e1f98aa9ad92d958e85e7b499da6ab0ffd4119).
[GitHub] [spark] AmplabJenkins removed a comment on issue #26014: [SPARK-29349][SQL] Support FETCH_PRIOR in Thriftserver fetch request
AmplabJenkins removed a comment on issue #26014: [SPARK-29349][SQL] Support FETCH_PRIOR in Thriftserver fetch request URL: https://github.com/apache/spark/pull/26014#issuecomment-542527423 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17131/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26014: [SPARK-29349][SQL] Support FETCH_PRIOR in Thriftserver fetch request
AmplabJenkins removed a comment on issue #26014: [SPARK-29349][SQL] Support FETCH_PRIOR in Thriftserver fetch request URL: https://github.com/apache/spark/pull/26014#issuecomment-542527420 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26014: [SPARK-29349][SQL] Support FETCH_PRIOR in Thriftserver fetch request
AmplabJenkins commented on issue #26014: [SPARK-29349][SQL] Support FETCH_PRIOR in Thriftserver fetch request URL: https://github.com/apache/spark/pull/26014#issuecomment-542527420 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26014: [SPARK-29349][SQL] Support FETCH_PRIOR in Thriftserver fetch request
AmplabJenkins commented on issue #26014: [SPARK-29349][SQL] Support FETCH_PRIOR in Thriftserver fetch request URL: https://github.com/apache/spark/pull/26014#issuecomment-542527423 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17131/ Test PASSed.
[GitHub] [spark] wangyum commented on issue #26014: [SPARK-29349][SQL] Support FETCH_PRIOR in Thriftserver fetch request
wangyum commented on issue #26014: [SPARK-29349][SQL] Support FETCH_PRIOR in Thriftserver fetch request URL: https://github.com/apache/spark/pull/26014#issuecomment-542526964 retest this please
[GitHub] [spark] AmplabJenkins removed a comment on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
AmplabJenkins removed a comment on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table URL: https://github.com/apache/spark/pull/25840#issuecomment-542525576 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
AmplabJenkins removed a comment on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table URL: https://github.com/apache/spark/pull/25840#issuecomment-542525580 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112135/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
AmplabJenkins commented on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table URL: https://github.com/apache/spark/pull/25840#issuecomment-542525580 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112135/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
AmplabJenkins commented on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table URL: https://github.com/apache/spark/pull/25840#issuecomment-542525576 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
SparkQA removed a comment on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table URL: https://github.com/apache/spark/pull/25840#issuecomment-542487308 **[Test build #112135 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112135/testReport)** for PR 25840 at commit [`9526f85`](https://github.com/apache/spark/commit/9526f8507c2dd3f504308894f131b96fedc817a0).
[GitHub] [spark] SparkQA commented on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
SparkQA commented on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table URL: https://github.com/apache/spark/pull/25840#issuecomment-542525062 **[Test build #112135 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112135/testReport)** for PR 25840 at commit [`9526f85`](https://github.com/apache/spark/commit/9526f8507c2dd3f504308894f131b96fedc817a0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] dilipbiswal commented on a change in pull request #26042: [SPARK-29092][SQL] Report additional information about DataSourceScanExec in EXPLAIN FORMATTED
dilipbiswal commented on a change in pull request #26042: [SPARK-29092][SQL] Report additional information about DataSourceScanExec in EXPLAIN FORMATTED
URL: https://github.com/apache/spark/pull/26042#discussion_r335284509

## File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ##
@@ -436,6 +436,9 @@ class SQLQueryTestSuite extends QueryTest with SharedSparkSession {
       .replaceAll(
         s"Location.*/sql/core/spark-warehouse/$clsName/",
         s"Location ${notIncludedMsg}sql/core/spark-warehouse/")
+      .replaceAll(
+        s"Location.*\\.\\.\\.",

Review comment:
@cloud-fan Because we truncate the location to 100 chars for explain.
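To make the answer above concrete, here is a small sketch of why the existing pattern misses a truncated location while the new one catches it. The values of `notIncludedMsg` and `clsName`, the sample paths, and the replacement text of the second `replaceAll` are all assumptions for illustration; the diff in the thread is cut off before the replacement argument.

```scala
object LocationMasking {
  // Hypothetical values, chosen only for illustration.
  val notIncludedMsg = "[not included in comparison]"
  val clsName = "SQLQueryTestSuite"

  def mask(line: String): String =
    line
      // Existing rule: needs the full warehouse path to be present.
      .replaceAll(
        s"Location.*/sql/core/spark-warehouse/$clsName/",
        s"Location ${notIncludedMsg}sql/core/spark-warehouse/")
      // New rule: matches a location that was cut off and ends in "...".
      // The replacement string here is assumed, not taken from the PR.
      .replaceAll("Location.*\\.\\.\\.", s"Location $notIncludedMsg")

  // A full path, and a path truncated before the warehouse directory appears.
  val full = "Location file:/home/user/spark/sql/core/spark-warehouse/SQLQueryTestSuite/t"
  val truncated = "Location [file:/a/very/long/checkout/path/spark/sql/co..."
}
```

With these inputs, `mask(full)` is handled by the first rule only, and `mask(truncated)` is handled by the second rule only, since the truncated text never reaches the `/sql/core/spark-warehouse/` segment.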
[GitHub] [spark] cloud-fan commented on a change in pull request #26126: [SPARK-27880][SQL] Add support for bool_and and bool_or aggregates
cloud-fan commented on a change in pull request #26126: [SPARK-27880][SQL] Add support for bool_and and bool_or aggregates
URL: https://github.com/apache/spark/pull/26126#discussion_r335283934

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ##
@@ -316,6 +316,8 @@ object FunctionRegistry {
     expression[EveryAgg]("every"),
     expression[AnyAgg]("any"),
     expression[SomeAgg]("some"),
+    expression[BoolAnd]("bool_and"),

Review comment:
We don't need to keep the function alias in the schema. There are many examples in `FunctionRegistry` where a function has more than one name, and we always use one name when displaying it.
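The idea in this comment, several lookup names but a single display name, can be sketched with a toy registry. Nothing here is Spark's actual `FunctionRegistry` API: the `BoolAnd` case class, `prettyName`, and the choice to point `every` at the same builder are hypothetical stand-ins (in the quoted diff, `every` is still registered against `EveryAgg`).

```scala
object AliasRegistrySketch {
  // Toy expression with exactly one display name; not Spark's BoolAnd.
  final case class BoolAnd(arg: String) {
    def prettyName: String = "bool_and"
    def sql: String = s"$prettyName($arg)"
  }

  // One builder shared by every name the function can be looked up under.
  val build: String => BoolAnd = arg => BoolAnd(arg)

  val registry: Map[String, String => BoolAnd] = Map(
    "every" -> build,
    "bool_and" -> build
  )
}
```

Looking the function up as `every` or as `bool_and` yields an expression that prints the same way, so the alias never needs to appear in the output schema.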
[GitHub] [spark] cloud-fan commented on a change in pull request #26042: [SPARK-29092][SQL] Report additional information about DataSourceScanExec in EXPLAIN FORMATTED
cloud-fan commented on a change in pull request #26042: [SPARK-29092][SQL] Report additional information about DataSourceScanExec in EXPLAIN FORMATTED
URL: https://github.com/apache/spark/pull/26042#discussion_r335282862

## File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ##
@@ -436,6 +436,9 @@ class SQLQueryTestSuite extends QueryTest with SharedSparkSession {
       .replaceAll(
         s"Location.*/sql/core/spark-warehouse/$clsName/",
         s"Location ${notIncludedMsg}sql/core/spark-warehouse/")
+      .replaceAll(
+        s"Location.*\\.\\.\\.",

Review comment:
Oh wait, why doesn't the previous case work for the new changes?
[GitHub] [spark] dongjoon-hyun closed pull request #26123: [SPARK-27259][CORE] Allow setting -1 as length for FileBlock
dongjoon-hyun closed pull request #26123: [SPARK-27259][CORE] Allow setting -1 as length for FileBlock URL: https://github.com/apache/spark/pull/26123
[GitHub] [spark] viirya commented on issue #26115: [SPARK-29469][Shuffle] Avoid retries by RetryingBlockFetcher when ExternalBlockStoreClient is closed
viirya commented on issue #26115: [SPARK-29469][Shuffle] Avoid retries by RetryingBlockFetcher when ExternalBlockStoreClient is closed URL: https://github.com/apache/spark/pull/26115#issuecomment-542519818 thanks @dongjoon-hyun @cloud-fan!
[GitHub] [spark] cloud-fan closed pull request #26115: [SPARK-29469][Shuffle] Avoid retries by RetryingBlockFetcher when ExternalBlockStoreClient is closed
cloud-fan closed pull request #26115: [SPARK-29469][Shuffle] Avoid retries by RetryingBlockFetcher when ExternalBlockStoreClient is closed URL: https://github.com/apache/spark/pull/26115
[GitHub] [spark] cloud-fan commented on issue #26115: [SPARK-29469][Shuffle] Avoid retries by RetryingBlockFetcher when ExternalBlockStoreClient is closed
cloud-fan commented on issue #26115: [SPARK-29469][Shuffle] Avoid retries by RetryingBlockFetcher when ExternalBlockStoreClient is closed URL: https://github.com/apache/spark/pull/26115#issuecomment-542515910 thanks, merging to master!
[GitHub] [spark] cloud-fan commented on a change in pull request #25854: [SPARK-29145][SQL]Spark SQL cannot handle "NOT IN" condition when using "JOIN"
cloud-fan commented on a change in pull request #25854: [SPARK-29145][SQL]Spark SQL cannot handle "NOT IN" condition when using "JOIN"
URL: https://github.com/apache/spark/pull/25854#discussion_r335278200

## File path: sql/core/src/test/scala/org/apache/spark/sql/SubquerySuite.scala ##
@@ -204,6 +204,30 @@ class SubquerySuite extends QueryTest with SharedSparkSession {
     }
   }

+  test("SPARK-29145: JOIN Condition use QueryList") {
+    withTempView("s1", "s2", "s3") {
+      Seq(1, 3, 5, 7, 9).toDF("id").createOrReplaceTempView("s1")
+      Seq(1, 3, 4, 6, 9).toDF("id").createOrReplaceTempView("s2")
+      Seq(3, 4, 6, 9).toDF("id").createOrReplaceTempView("s3")
+
+      checkAnswer(
+        sql("SELECT s1.id from s1 JOIN s2 ON s1.id = s2.id and s1.id IN (select 9)"),

Review comment:
Can we put a correlated subquery in the join condition?
[GitHub] [spark] cloud-fan commented on a change in pull request #25854: [SPARK-29145][SQL]Spark SQL cannot handle "NOT IN" condition when using "JOIN"
cloud-fan commented on a change in pull request #25854: [SPARK-29145][SQL]Spark SQL cannot handle "NOT IN" condition when using "JOIN"
URL: https://github.com/apache/spark/pull/25854#discussion_r335278004

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ##
@@ -602,7 +602,7 @@ trait CheckAnalysis extends PredicateHelper {
           case inSubqueryOrExistsSubquery =>
             plan match {
-              case _: Filter | _: SupportsSubquery => // Ok
+              case _: Filter | _: SupportsSubquery | _: Join => // Ok
               case _ =>
                 failAnalysis(s"IN/EXISTS predicate sub-queries can only be used in" +
                   s" Filter and a few commands: $plan")

Review comment:
Let's update the message: `Filter/Join and a few commands`
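A minimal sketch of the check with the suggested wording applied, using toy plan nodes rather than Catalyst's real classes. `Plan`, `DeleteCommand`, `Project`, and the `Either`-based `check` are hypothetical; the real code calls `failAnalysis` inside `CheckAnalysis`.

```scala
object SubqueryPlacementSketch {
  // Toy stand-ins for Catalyst plan nodes; these are not the real classes.
  sealed trait Plan
  trait SupportsSubquery
  case object Filter extends Plan
  case object Join extends Plan
  case object DeleteCommand extends Plan with SupportsSubquery
  case object Project extends Plan

  // Accept IN/EXISTS sub-queries in Filter, Join, and subquery-aware commands.
  def check(plan: Plan): Either[String, Unit] = plan match {
    case Filter | _: SupportsSubquery | Join => Right(())
    case other =>
      Left("IN/EXISTS predicate sub-queries can only be used in" +
        s" Filter/Join and a few commands: $other")
  }
}
```

The point of the review comment is only the last branch: once `Join` is accepted, the error text should mention it so the message stays accurate.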
[GitHub] [spark] AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542510827 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112131/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542510822 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542510822 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542510827 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112131/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25670: [SPARK-28869][CORE] Roll over event log files
AmplabJenkins removed a comment on issue #25670: [SPARK-28869][CORE] Roll over event log files URL: https://github.com/apache/spark/pull/25670#issuecomment-542510224 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25670: [SPARK-28869][CORE] Roll over event log files
AmplabJenkins removed a comment on issue #25670: [SPARK-28869][CORE] Roll over event log files URL: https://github.com/apache/spark/pull/25670#issuecomment-542510228 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112134/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
SparkQA commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542510411 **[Test build #112131 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112131/testReport)** for PR 26123 at commit [`53ae391`](https://github.com/apache/spark/commit/53ae39174b73b4ea7cf981b97f2bb24168752c5d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
SparkQA removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542477946 **[Test build #112131 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112131/testReport)** for PR 26123 at commit [`53ae391`](https://github.com/apache/spark/commit/53ae39174b73b4ea7cf981b97f2bb24168752c5d).
[GitHub] [spark] AmplabJenkins commented on issue #25670: [SPARK-28869][CORE] Roll over event log files
AmplabJenkins commented on issue #25670: [SPARK-28869][CORE] Roll over event log files URL: https://github.com/apache/spark/pull/25670#issuecomment-542510224 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25670: [SPARK-28869][CORE] Roll over event log files
AmplabJenkins commented on issue #25670: [SPARK-28869][CORE] Roll over event log files URL: https://github.com/apache/spark/pull/25670#issuecomment-542510228 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112134/ Test PASSed.
[GitHub] [spark] dilipbiswal commented on a change in pull request #26042: [SPARK-29092][SQL] Report additional information about DataSourceScanExec in EXPLAIN FORMATTED
dilipbiswal commented on a change in pull request #26042: [SPARK-29092][SQL] Report additional information about DataSourceScanExec in EXPLAIN FORMATTED
URL: https://github.com/apache/spark/pull/26042#discussion_r335274882

## File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala ##
@@ -436,6 +436,9 @@ class SQLQueryTestSuite extends QueryTest with SharedSparkSession {
       .replaceAll(
         s"Location.*/sql/core/spark-warehouse/$clsName/",
         s"Location ${notIncludedMsg}sql/core/spark-warehouse/")
+      .replaceAll(
+        s"Location.*\\.\\.\\.",

Review comment:
If we decide to do it, I feel we should do it in another PR, so that it can be reverted easily if required.
[GitHub] [spark] dilipbiswal commented on a change in pull request #26042: [SPARK-29092][SQL] Report additional information about DataSourceScanExec in EXPLAIN FORMATTED
dilipbiswal commented on a change in pull request #26042: [SPARK-29092][SQL] Report additional information about DataSourceScanExec in EXPLAIN FORMATTED
URL: https://github.com/apache/spark/pull/26042#discussion_r335274734

## File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala
## @@ -436,6 +436,9 @@ class SQLQueryTestSuite extends QueryTest with SharedSparkSession {
         .replaceAll(
           s"Location.*/sql/core/spark-warehouse/$clsName/",
           s"Location ${notIncludedMsg}sql/core/spark-warehouse/")
+        .replaceAll(
+          s"Location.*\\.\\.\\.",

Review comment: @cloud-fan Actually, I had thought about it, but for some tests I wasn't sure whether we should change the output. For example, in `sql/core/src/test/resources/sql-tests/results/describe-part-after-analyze.sql` the test is `DESC EXTENDED t PARTITION (ds='2017-08-01', hr=10)`. Perhaps displaying the location is the intention. What do you think?
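To make the two `replaceAll` steps in the hunk above concrete, here is a minimal sketch of how such masking behaves on sample output. The values of `notIncludedMsg` and `clsName`, the sample paths, and the replacement string of the second call are assumptions for illustration only; the hunk shows the patterns but not these details.

```scala
object LocationMasking {
  // Hypothetical stand-ins; the real values are defined in SQLQueryTestSuite.
  val notIncludedMsg = "[not included in comparison]"
  val clsName = "SQLQueryTestSuite"

  def mask(output: String): String =
    output
      // Pattern quoted in the hunk: hide the machine-specific prefix of the warehouse path.
      .replaceAll(
        s"Location.*/sql/core/spark-warehouse/$clsName/",
        s"Location ${notIncludedMsg}sql/core/spark-warehouse/")
      // Pattern added by the hunk: a Location value that was truncated with "...".
      // The replacement text below is an assumption; the hunk does not show it.
      .replaceAll("Location.*\\.\\.\\.", s"Location $notIncludedMsg/{warehouse_dir}")
}
```

With this sketch, a full warehouse path keeps its table-relative suffix, while a truncated path is masked entirely. That difference is the reason the second pattern changes more of the expected output than the first.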
[GitHub] [spark] SparkQA removed a comment on issue #25670: [SPARK-28869][CORE] Roll over event log files
SparkQA removed a comment on issue #25670: [SPARK-28869][CORE] Roll over event log files URL: https://github.com/apache/spark/pull/25670#issuecomment-542482603 **[Test build #112134 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112134/testReport)** for PR 25670 at commit [`2ff349b`](https://github.com/apache/spark/commit/2ff349bcba539c5d1e0ea40ac52fd4b4bf75c3b8).
[GitHub] [spark] SparkQA commented on issue #25670: [SPARK-28869][CORE] Roll over event log files
SparkQA commented on issue #25670: [SPARK-28869][CORE] Roll over event log files URL: https://github.com/apache/spark/pull/25670#issuecomment-542509756 **[Test build #112134 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112134/testReport)** for PR 25670 at commit [`2ff349b`](https://github.com/apache/spark/commit/2ff349bcba539c5d1e0ea40ac52fd4b4bf75c3b8).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542508848 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542508854 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112133/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542508848 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542508854 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112133/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
SparkQA removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542481135 **[Test build #112133 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112133/testReport)** for PR 26123 at commit [`840f1ce`](https://github.com/apache/spark/commit/840f1ce4bd15c58fe2d6b2a5aa9980127b2291b8).
[GitHub] [spark] SparkQA commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
SparkQA commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542508394 **[Test build #112133 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112133/testReport)** for PR 26123 at commit [`840f1ce`](https://github.com/apache/spark/commit/840f1ce4bd15c58fe2d6b2a5aa9980127b2291b8).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] dongjoon-hyun closed pull request #26089: [SPARK-29423][SS] lazily initialize StreamingQueryManager in SessionState
dongjoon-hyun closed pull request #26089: [SPARK-29423][SS] lazily initialize StreamingQueryManager in SessionState URL: https://github.com/apache/spark/pull/26089
[GitHub] [spark] yaooqinn commented on a change in pull request #26126: [SPARK-27880][SQL] Add support for bool_and and bool_or aggregates
yaooqinn commented on a change in pull request #26126: [SPARK-27880][SQL] Add support for bool_and and bool_or aggregates
URL: https://github.com/apache/spark/pull/26126#discussion_r335269997

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
## @@ -316,6 +316,8 @@ object FunctionRegistry {
     expression[EveryAgg]("every"),
     expression[AnyAgg]("any"),
     expression[SomeAgg]("some"),
+    expression[BoolAnd]("bool_and"),

Review comment: Yes, it is (see https://github.com/apache/spark/pull/26126/files#diff-fdccd9945d709da6c561b67a2e46a0d8R296). Is there a way to keep the `bool_and` string in the schema?
[GitHub] [spark] merrily01 commented on a change in pull request #26088: [SPARK-29436][K8S] Support executor for selecting scheduler through scheduler name in the case of k8s multi-scheduler scenario
merrily01 commented on a change in pull request #26088: [SPARK-29436][K8S] Support executor for selecting scheduler through scheduler name in the case of k8s multi-scheduler scenario
URL: https://github.com/apache/spark/pull/26088#discussion_r335268520

## File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/BasicExecutorFeatureStep.scala
## @@ -216,6 +216,11 @@ private[spark] class BasicExecutorFeatureStep(
         .endSpec()
       .build()
+    val schedulerName = kubernetesConf.get(KUBERNETES_EXECUTOR_SCHEDULER_NAME)
+    if (schedulerName.nonEmpty) {

Review comment: Resolved, thx Sean~
[GitHub] [spark] cloud-fan commented on a change in pull request #26053: [SPARK-29379][SQL]SHOW FUNCTIONS show '!=', '<>' , 'between', 'case'
cloud-fan commented on a change in pull request #26053: [SPARK-29379][SQL]SHOW FUNCTIONS show '!=', '<>' , 'between', 'case'
URL: https://github.com/apache/spark/pull/26053#discussion_r335268319

## File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
## @@ -59,7 +59,8 @@ class SQLQuerySuite extends QueryTest with SharedSparkSession {
   test("show functions") {
     def getFunctions(pattern: String): Seq[Row] = {
       StringUtils.filterPattern(
-        spark.sessionState.catalog.listFunctions("default").map(_._1.funcName), pattern)
+        spark.sessionState.catalog.listFunctions("default").map(_._1.funcName)
+        ++ Seq("!=", "<>", "between", "case"), pattern)

Review comment: `Seq("!=", "<>", "between", "case")` appears many times; shall we put it in

```
object ShowFunctionsCommand {
  // operators that do not have corresponding functions.
  // They should be handled in `DropFunctionCommand` as well.
  val virtualOperators = Seq("!=", "<>", "between", "case")
}
```
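The suggestion above can be sketched outside Spark as follows. `filterPattern` here is a simplified, hypothetical stand-in for `StringUtils.filterPattern` (it only understands `*` wildcards and `|` alternatives), and `showFunctions` is an illustrative helper rather than the actual command. Only the `virtualOperators` list is taken from the review comment.

```scala
object ShowFunctionsSketch {
  // Operators that have no corresponding function (list from the review comment).
  val virtualOperators: Seq[String] = Seq("!=", "<>", "between", "case")

  // Simplified stand-in for StringUtils.filterPattern (an assumption, not Spark's code):
  // '*' matches any run of characters and '|' separates alternative patterns.
  def filterPattern(names: Seq[String], pattern: String): Seq[String] = {
    val regexes = pattern.trim.split("\\|").map { p =>
      p.trim.split("\\*", -1).map(s => java.util.regex.Pattern.quote(s)).mkString(".*")
    }
    names.filter(n => regexes.exists(r => n.matches(r)))
  }

  // Registered names and virtual operators go through the same filter,
  // so the operator list is written in exactly one place.
  def showFunctions(registered: Seq[String], pattern: Option[String]): Seq[String] =
    filterPattern(registered ++ virtualOperators, pattern.getOrElse("*")).sorted
}
```

Keeping the list in one named value means the test helper and the command cannot drift apart when an operator is added or removed.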
[GitHub] [spark] SparkQA commented on issue #26089: [SPARK-29423][SS] lazily initialize StreamingQueryManager in SessionState
SparkQA commented on issue #26089: [SPARK-29423][SS] lazily initialize StreamingQueryManager in SessionState URL: https://github.com/apache/spark/pull/26089#issuecomment-542500923 **[Test build #112137 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112137/testReport)** for PR 26089 at commit [`c33208b`](https://github.com/apache/spark/commit/c33208b762cbebe34d204bfb1c976d62ef9b6dea).
[GitHub] [spark] cloud-fan commented on a change in pull request #26053: [SPARK-29379][SQL]SHOW FUNCTIONS show '!=', '<>' , 'between', 'case'
cloud-fan commented on a change in pull request #26053: [SPARK-29379][SQL]SHOW FUNCTIONS show '!=', '<>' , 'between', 'case'
URL: https://github.com/apache/spark/pull/26053#discussion_r335267938

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
## @@ -222,6 +223,13 @@ case class ShowFunctionsCommand(
         case (f, "USER") if showUserFunctions => f.unquotedString
         case (f, "SYSTEM") if showSystemFunctions => f.unquotedString
       }
-    functionNames.sorted.map(Row(_))
+    if (showSystemFunctions) {
+      (functionNames ++
+        StringUtils.filterPattern(Seq("!=", "<>", "between", "case"), pattern.getOrElse("*")))

Review comment: Shall we add a comment like the one at https://github.com/apache/spark/blob/9b8d63e7ce76740eb0fc08fcd2b24849ff7693be/sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala#L117?
[GitHub] [spark] AmplabJenkins removed a comment on issue #26089: [SPARK-29423][SS] lazily initialize StreamingQueryManager in SessionState
AmplabJenkins removed a comment on issue #26089: [SPARK-29423][SS] lazily initialize StreamingQueryManager in SessionState URL: https://github.com/apache/spark/pull/26089#issuecomment-542499630 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26089: [SPARK-29423][SS] lazily initialize StreamingQueryManager in SessionState
AmplabJenkins removed a comment on issue #26089: [SPARK-29423][SS] lazily initialize StreamingQueryManager in SessionState URL: https://github.com/apache/spark/pull/26089#issuecomment-542499636 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17130/ Test PASSed.
[GitHub] [spark] dongjoon-hyun commented on issue #26089: [SPARK-29423][SS] lazily initialize StreamingQueryManager in SessionState
dongjoon-hyun commented on issue #26089: [SPARK-29423][SS] lazily initialize StreamingQueryManager in SessionState URL: https://github.com/apache/spark/pull/26089#issuecomment-542499411 Retest this please.
[GitHub] [spark] AmplabJenkins commented on issue #26089: [SPARK-29423][SS] lazily initialize StreamingQueryManager in SessionState
AmplabJenkins commented on issue #26089: [SPARK-29423][SS] lazily initialize StreamingQueryManager in SessionState URL: https://github.com/apache/spark/pull/26089#issuecomment-542499636 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17130/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26089: [SPARK-29423][SS] lazily initialize StreamingQueryManager in SessionState
AmplabJenkins commented on issue #26089: [SPARK-29423][SS] lazily initialize StreamingQueryManager in SessionState URL: https://github.com/apache/spark/pull/26089#issuecomment-542499630 Merged build finished. Test PASSed.
[GitHub] [spark] cloud-fan commented on a change in pull request #26126: [SPARK-27880][SQL] Add support for bool_and and bool_or aggregates
cloud-fan commented on a change in pull request #26126: [SPARK-27880][SQL] Add support for bool_and and bool_or aggregates
URL: https://github.com/apache/spark/pull/26126#discussion_r335266748

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
## @@ -316,6 +316,8 @@ object FunctionRegistry {
     expression[EveryAgg]("every"),
     expression[AnyAgg]("any"),
     expression[SomeAgg]("some"),
+    expression[BoolAnd]("bool_and"),

Review comment: Is `bool_and` just an alias of `every`? If so, we can do `expression[EveryAgg]("bool_and")`.
[GitHub] [spark] cloud-fan commented on a change in pull request #26098: [SPARK-29444] Add configuration to support JacksonGenrator to keep fields with null values
cloud-fan commented on a change in pull request #26098: [SPARK-29444] Add configuration to support JacksonGenrator to keep fields with null values
URL: https://github.com/apache/spark/pull/26098#discussion_r335266132

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JSONOptions.scala
## @@ -76,6 +76,9 @@ private[sql] class JSONOptions(
   // Whether to ignore column of all null values or empty array/struct during schema inference
   val dropFieldIfAllNull = parameters.get("dropFieldIfAllNull").map(_.toBoolean).getOrElse(false)
 
+  // Whether to ignore column of all null during json generating
+  val structIngoreNull = parameters.getOrElse("structIngoreNull", "true").toBoolean

Review comment: Is it specific to struct type columns? If not, how about naming it `ignoreNullFields`?
[GitHub] [spark] cloud-fan commented on a change in pull request #26098: [SPARK-29444] Add configuration to support JacksonGenrator to keep fields with null values
cloud-fan commented on a change in pull request #26098: [SPARK-29444] Add configuration to support JacksonGenrator to keep fields with null values
URL: https://github.com/apache/spark/pull/26098#discussion_r335266426

## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/json/JacksonGeneratorSuite.scala
## @@ -39,6 +39,19 @@ class JacksonGeneratorSuite extends SparkFunSuite {
     assert(writer.toString === """{"a":1}""")
   }
 
+  test("SPARK-29444: initial with StructType and write out an empty row " +
+    "with allowStructIncludeNull=true") {
+    val dataType = StructType(StructField("a", IntegerType) :: Nil)
+    val input = InternalRow(null)
+    val writer = new CharArrayWriter()
+    val allowNullOption =
+      new JSONOptions(Map("structIngoreNull" -> "false"), gmtId)
+    val gen = new JacksonGenerator(dataType, writer, allowNullOption)
+    gen.write(input)
+    gen.flush()
+    assert(writer.toString === """{"a":null}""")

Review comment: Can we also test a null inner field, e.g. `{"a": {"b": null}}`?
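To illustrate what the requested nested case would exercise, here is a toy model in plain Scala. It is not `JacksonGenerator`: `toJson`, its `ignoreNullFields` flag, and the tuple-based struct encoding are all hypothetical, and the output for the "ignore" case reflects only this toy, not confirmed Spark behaviour.

```scala
object NullFieldSketch {
  // Toy model: a struct is an ordered list of (name, value) pairs, where a value is
  // null, an Int, or a nested struct. String escaping is deliberately out of scope.
  def toJson(fields: Seq[(String, Any)], ignoreNullFields: Boolean): String = {
    val parts = fields.flatMap {
      // Drop the field entirely when nulls are ignored.
      case (_, null) if ignoreNullFields => None
      // Otherwise write an explicit JSON null.
      case (name, null) => Some(s""""$name":null""")
      // Recurse so that the flag also applies to inner fields.
      case (name, nested: Seq[_]) =>
        Some(s""""$name":""" + toJson(nested.asInstanceOf[Seq[(String, Any)]], ignoreNullFields))
      case (name, v) => Some(s""""$name":$v""")
    }
    parts.mkString("{", ",", "}")
  }
}
```

The point of the nested test is the recursive branch: a flag that is honoured only at the top level would still pass the single-column test quoted in the hunk.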
[GitHub] [spark] cloud-fan commented on issue #26098: [SPARK-29444] Add configuration to support JacksonGenrator to keep fields with null values
cloud-fan commented on issue #26098: [SPARK-29444] Add configuration to support JacksonGenrator to keep fields with null values URL: https://github.com/apache/spark/pull/26098#issuecomment-542498420 The change looks reasonable. Do you know why the JSON data source ignores null fields in the first place?
[GitHub] [spark] cloud-fan commented on a change in pull request #26098: [SPARK-29444] Add configuration to support JacksonGenrator to keep fields with null values
cloud-fan commented on a change in pull request #26098: [SPARK-29444] Add configuration to support JacksonGenrator to keep fields with null values
URL: https://github.com/apache/spark/pull/26098#discussion_r335266132

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JSONOptions.scala
## @@ -76,6 +76,9 @@ private[sql] class JSONOptions(
   // Whether to ignore column of all null values or empty array/struct during schema inference
   val dropFieldIfAllNull = parameters.get("dropFieldIfAllNull").map(_.toBoolean).getOrElse(false)
 
+  // Whether to ignore column of all null during json generating
+  val structIngoreNull = parameters.getOrElse("structIngoreNull", "true").toBoolean

Review comment: Is it specific to struct type columns? If not, how about naming it `ignoreNullColumns`?
[GitHub] [spark] AmplabJenkins removed a comment on issue #26129: [SPARK-29482][SQL] ANALYZE TABLE should look up catalog/table like v2 commands
AmplabJenkins removed a comment on issue #26129: [SPARK-29482][SQL] ANALYZE TABLE should look up catalog/table like v2 commands URL: https://github.com/apache/spark/pull/26129#issuecomment-542495739 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17129/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26129: [SPARK-29482][SQL] ANALYZE TABLE should look up catalog/table like v2 commands
AmplabJenkins removed a comment on issue #26129: [SPARK-29482][SQL] ANALYZE TABLE should look up catalog/table like v2 commands URL: https://github.com/apache/spark/pull/26129#issuecomment-542495733 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26129: [SPARK-29482][SQL] ANALYZE TABLE should look up catalog/table like v2 commands
AmplabJenkins commented on issue #26129: [SPARK-29482][SQL] ANALYZE TABLE should look up catalog/table like v2 commands URL: https://github.com/apache/spark/pull/26129#issuecomment-542495739 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17129/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26129: [SPARK-29482][SQL] ANALYZE TABLE should look up catalog/table like v2 commands
AmplabJenkins commented on issue #26129: [SPARK-29482][SQL] ANALYZE TABLE should look up catalog/table like v2 commands URL: https://github.com/apache/spark/pull/26129#issuecomment-542495733 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26129: [SPARK-29482][SQL] ANALYZE TABLE should look up catalog/table like v2 commands
SparkQA commented on issue #26129: [SPARK-29482][SQL] ANALYZE TABLE should look up catalog/table like v2 commands URL: https://github.com/apache/spark/pull/26129#issuecomment-542495390 **[Test build #112136 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112136/testReport)** for PR 26129 at commit [`df18d7a`](https://github.com/apache/spark/commit/df18d7ab4df4d8e4c85d92e7d56c805a07cfbb31).
[GitHub] [spark] yaooqinn commented on issue #26126: [SPARK-27880][SQL] Add support for bool_and and bool_or aggregates
yaooqinn commented on issue #26126: [SPARK-27880][SQL] Add support for bool_and and bool_or aggregates URL: https://github.com/apache/spark/pull/26126#issuecomment-542495049 cc @wangyum @cloud-fan @gatorsmile
[GitHub] [spark] AmplabJenkins removed a comment on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query
AmplabJenkins removed a comment on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query URL: https://github.com/apache/spark/pull/22952#issuecomment-542493213 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query
AmplabJenkins removed a comment on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query URL: https://github.com/apache/spark/pull/22952#issuecomment-542493217 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112129/ Test PASSed.
[GitHub] [spark] cloud-fan commented on a change in pull request #26112: [SPARK-29364][SQL] Return an interval from date subtract according to SQL standard
cloud-fan commented on a change in pull request #26112: [SPARK-29364][SQL] Return an interval from date subtract according to SQL standard
URL: https://github.com/apache/spark/pull/26112#discussion_r335263183

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ##
@@ -849,12 +850,13 @@ object TypeCoercion {
       case Add(l @ DateType(), r @ IntegerType()) => DateAdd(l, r)
       case Add(l @ IntegerType(), r @ DateType()) => DateAdd(r, l)
       case Subtract(l @ DateType(), r @ IntegerType()) => DateSub(l, r)
-      case Subtract(l @ DateType(), r @ DateType()) => DateDiff(l, r)
-      case Subtract(l @ TimestampType(), r @ TimestampType()) => TimestampDiff(l, r)
+      case Subtract(l @ DateType(), r @ DateType()) => SubtractDates(l, r)

Review comment:
Seems like we should use `DateDiff` if the dialect is pgsql.
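The comment above turns on what `date - date` should return: an integer number of days (the PostgreSQL behaviour that `DateDiff` matches) or an interval (what the PR title attributes to the SQL standard). The following is a minimal sketch of that difference in plain Scala using `java.time`, not Spark's Catalyst expressions. The object and method names are hypothetical, and `java.time.Period` is only a stand-in for an interval value; it is not how `SubtractDates` represents its result.

```scala
import java.time.{LocalDate, Period}
import java.time.temporal.ChronoUnit

// Hypothetical helper names for illustration; this is not Spark's implementation.
object DateSubtractSketch {
  // PostgreSQL-style result: date - date is a whole number of days.
  def diffInDays(end: LocalDate, start: LocalDate): Long =
    ChronoUnit.DAYS.between(start, end)

  // Interval-style result: date - date is a calendar interval
  // (years, months, days), modelled here with java.time.Period.
  def diffAsInterval(end: LocalDate, start: LocalDate): Period =
    Period.between(start, end)
}
```

For 2019-10-01 and 2019-11-02, `diffInDays` gives 32, while `diffAsInterval` gives a period of 1 month and 1 day. Which of the two a query should see is the dialect-dependent choice the review comment raises.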
[GitHub] [spark] AmplabJenkins commented on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query
AmplabJenkins commented on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query URL: https://github.com/apache/spark/pull/22952#issuecomment-542493217 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112129/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query
AmplabJenkins commented on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query URL: https://github.com/apache/spark/pull/22952#issuecomment-542493213 Merged build finished. Test PASSed.
[GitHub] [spark] cloud-fan commented on issue #26128: [SPARK-28560][SQL][followup] code cleanup for local shuffle reader
cloud-fan commented on issue #26128: [SPARK-28560][SQL][followup] code cleanup for local shuffle reader URL: https://github.com/apache/spark/pull/26128#issuecomment-542492762 thanks for the review, merging to master!
[GitHub] [spark] SparkQA commented on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query
SparkQA commented on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query URL: https://github.com/apache/spark/pull/22952#issuecomment-542492837 **[Test build #112129 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112129/testReport)** for PR 22952 at commit [`33a5331`](https://github.com/apache/spark/commit/33a5331c4517eda23c512a4654981ca3aed9a9a5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query
SparkQA removed a comment on issue #22952: [SPARK-20568][SS] Provide option to clean up completed files in streaming query URL: https://github.com/apache/spark/pull/22952#issuecomment-542450514 **[Test build #112129 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112129/testReport)** for PR 22952 at commit [`33a5331`](https://github.com/apache/spark/commit/33a5331c4517eda23c512a4654981ca3aed9a9a5).
[GitHub] [spark] cloud-fan closed pull request #26128: [SPARK-28560][SQL][followup] code cleanup for local shuffle reader
cloud-fan closed pull request #26128: [SPARK-28560][SQL][followup] code cleanup for local shuffle reader URL: https://github.com/apache/spark/pull/26128
[GitHub] [spark] cloud-fan commented on a change in pull request #25651: [SPARK-28948][SQL] Support passing all Table metadata in TableProvider
cloud-fan commented on a change in pull request #25651: [SPARK-28948][SQL] Support passing all Table metadata in TableProvider
URL: https://github.com/apache/spark/pull/25651#discussion_r335262702

## File path: sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/TableProvider.java ##
@@ -36,26 +35,12 @@ public interface TableProvider {
   /**
-   * Return a {@link Table} instance to do read/write with user-specified options.
+   * Return a {@link Table} instance with the given table options to do read/write.
+   * Implementations should infer the table schema and partitioning.
    *
    * @param options the user-specified options that can identify a table, e.g. file path, Kafka
    *                topic name, etc. It's an immutable case-insensitive string-to-string map.
    */
+  // TODO: this should take a Map as table properties.

Review comment:
> I don't think that partition inference needs to scan the entire file system tree.

Spark needs to scan it to get all the partition values and infer the schema. It is an existing feature that Spark infers a common type for partition values of different types. The same applies to schema inference: Spark can read Parquet files with different but compatible schemas, so it must read all the files to infer the schema.

Can you share more about the static cache? Do you mean a global cache that maps a directory to its listed files?
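To illustrate why inferring a common type requires seeing every partition value, here is a deliberately simplified sketch in plain Scala. It is not Spark's partition inference code: the type names, the three-level widening order (integer, then double, then string), and the fallback for an empty column are all assumptions made for the example.

```scala
import scala.util.Try

// Simplified, hypothetical model of inferring one type for a partition column
// from its raw directory values; Spark's real rules cover more types.
object PartitionTypeSketch {
  sealed trait ColType
  case object IntType extends ColType
  case object DoubleType extends ColType
  case object StringType extends ColType

  // Infer the narrowest type that can represent a single raw value.
  def inferOne(raw: String): ColType =
    if (Try(raw.toInt).isSuccess) IntType
    else if (Try(raw.toDouble).isSuccess) DoubleType
    else StringType

  // Widen two types to a common one, assuming the order Int < Double < String.
  def widen(a: ColType, b: ColType): ColType = (a, b) match {
    case (StringType, _) | (_, StringType) => StringType
    case (DoubleType, _) | (_, DoubleType) => DoubleType
    case _ => IntType
  }

  // The column type is only known once every value has been examined:
  // a single late value can widen the result. An empty column falls back
  // to StringType (an assumption of this sketch).
  def inferColumn(values: Seq[String]): ColType =
    values.map(inferOne).reduceOption(widen).getOrElse(StringType)
}
```

With values `"1"` and `"2"` the column is inferred as `IntType`; adding `"2.5"` widens it to `DoubleType`, and adding `"abc"` widens it to `StringType`. Because any one value can change the outcome, stopping the listing early could yield a type that later files contradict.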
[GitHub] [spark] viirya commented on a change in pull request #26127: [SPARK-29348][SQL] Add observable Metrics for Streaming queries
viirya commented on a change in pull request #26127: [SPARK-29348][SQL] Add observable Metrics for Streaming queries
URL: https://github.com/apache/spark/pull/26127#discussion_r335261651

## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ##
@@ -1847,6 +1847,53 @@ class Dataset[T] private[sql](
   @scala.annotation.varargs
   def agg(expr: Column, exprs: Column*): DataFrame = groupBy().agg(expr, exprs : _*)

+  /**
+   * Define (named) metrics to observe on the Dataset. This method returns an 'observed' Dataset
+   * that returns the same result as the input, with the following guarantees:
+   * - It will compute the defined aggregates (metrics) on all the data that is flowing through the
+   *   Dataset at that point.
+   * - It will report the value of the defined aggregate columns as soon as we reach a completion
+   *   point. A completion point is either the end of a query (batch mode) or the end of a streaming
+   *   epoch. The value of the aggregates only reflects the data processed since the previous
+   *   completion point.
+   *
+   * The metrics columns must either contain a literal (e.g. lit(42)), or should contain one or
+   * more aggregate functions (e.g. sum(a) or sum(a + b) + avg(c) - lit(1)). Expressions that
+   * contain references to the input Dataset's columns must always be wrapped in an aggregate
+   * function.
+   *
+   * A user can observe these metrics by either adding
+   * [[org.apache.spark.sql.streaming.StreamingQueryListener]] or a
+   * [[org.apache.spark.sql.util.QueryExecutionListener]] to the spark session.
+   *
+   * {{{
+   *   // Observe row count (rc) and error row count (erc) in the streaming Dataset
+   *   val observed_ds = ds.observe("my_event", count(lit(1)).as("rc"), count($"error").as("erc"))
+   *   observed_ds.writeStream.format("...").start()
+   *
+   *   // Monitor the metrics using a listener.
+   *   spark.streams.addListener(new StreamingQueryListener() {
+   *     override def onQueryProgress(event: QueryProgressEvent): Unit = {
+   *       event.progress.observedMetrics.get("my_event").foreach { row =>
+   *         // Trigger if the number of errors exceeds 5 percent
+   *         val num_rows = row.getAs[Long]("rc")
+   *         val num_error_rows = row.getAs[Long]("erc")
+   *         val ratio = num_error_rows.toDouble / num_rows
+   *         if (ratio > 0.05) {
+   *           // Trigger alert
+   *         }
+   *       }
+   *     }
+   *   })
+   * }}}
+   *
+   * @group typedrel
+   * @since DBR-6.0

Review comment:
DBR-6.0?
[GitHub] [spark] cloud-fan commented on issue #26114: [SPARK-29468][SQL] Change Literal.sql to be correct for floats.
cloud-fan commented on issue #26114: [SPARK-29468][SQL] Change Literal.sql to be correct for floats. URL: https://github.com/apache/spark/pull/26114#issuecomment-542489303 Shall we update the old test suite to do a round trip? We can also move it to catalyst.
[GitHub] [spark] AmplabJenkins commented on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
AmplabJenkins commented on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table URL: https://github.com/apache/spark/pull/25840#issuecomment-542487583 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
AmplabJenkins commented on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table URL: https://github.com/apache/spark/pull/25840#issuecomment-542487588 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17128/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
AmplabJenkins removed a comment on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table URL: https://github.com/apache/spark/pull/25840#issuecomment-542487588 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17128/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
AmplabJenkins removed a comment on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table URL: https://github.com/apache/spark/pull/25840#issuecomment-542487583 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table
SparkQA commented on issue #25840: [SPARK-29166][SQL] Add parameters to limit the number of dynamic partitions for data source table URL: https://github.com/apache/spark/pull/25840#issuecomment-542487308 **[Test build #112135 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112135/testReport)** for PR 25840 at commit [`9526f85`](https://github.com/apache/spark/commit/9526f8507c2dd3f504308894f131b96fedc817a0).
[GitHub] [spark] HeartSaVioR commented on a change in pull request #26108: [SPARK-26154][SS] Streaming left/right outer join should not return outer nulls for already matched rows
HeartSaVioR commented on a change in pull request #26108: [SPARK-26154][SS] Streaming left/right outer join should not return outer nulls for already matched rows
URL: https://github.com/apache/spark/pull/26108#discussion_r335254184

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ##
@@ -1069,6 +1069,16 @@ object SQLConf {
     .checkValue(v => Set(1, 2).contains(v), "Valid versions are 1 and 2")
     .createWithDefault(2)

+  val STREAMING_JOIN_STATE_FORMAT_VERSION =

Review comment:
The "version" you see in the state store denotes the "batchId", which differs from the semantics of "version" here, where it denotes a difference in schema. I know that's confusing; I'm not sure why "version" was chosen instead of "batch" or "epoch" in the state store. We have already mixed up multiple terms to denote the same thing (or similar things: an epoch is not strictly the same as a batch in continuous mode, as there's no micro-batch).
[GitHub] [spark] HeartSaVioR commented on a change in pull request #26108: [SPARK-26154][SS] Streaming left/right outer join should not return outer nulls for already matched rows
HeartSaVioR commented on a change in pull request #26108: [SPARK-26154][SS] Streaming left/right outer join should not return outer nulls for already matched rows
URL: https://github.com/apache/spark/pull/26108#discussion_r335254184

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ##
@@ -1069,6 +1069,16 @@ object SQLConf {
     .checkValue(v => Set(1, 2).contains(v), "Valid versions are 1 and 2")
     .createWithDefault(2)

+  val STREAMING_JOIN_STATE_FORMAT_VERSION =

Review comment:
The "version" you see in the state store denotes the "batchId", which differs from the semantics of "version" here, where it denotes a difference in schema. I know that's really confusing; I'm not sure why "version" was chosen instead of "batch" or "epoch" in the state store. We have already mixed up multiple terms to denote the same thing (or similar things: an epoch is not strictly the same as a batch in continuous mode, as there's no micro-batch).
[GitHub] [spark] SparkQA commented on issue #25670: [SPARK-28869][CORE] Roll over event log files
SparkQA commented on issue #25670: [SPARK-28869][CORE] Roll over event log files URL: https://github.com/apache/spark/pull/25670#issuecomment-542482603 **[Test build #112134 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112134/testReport)** for PR 25670 at commit [`2ff349b`](https://github.com/apache/spark/commit/2ff349bcba539c5d1e0ea40ac52fd4b4bf75c3b8).
[GitHub] [spark] SparkQA commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
SparkQA commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542481135 **[Test build #112133 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112133/testReport)** for PR 26123 at commit [`840f1ce`](https://github.com/apache/spark/commit/840f1ce4bd15c58fe2d6b2a5aa9980127b2291b8).
[GitHub] [spark] AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542479808 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542479819 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17127/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542479735 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112132/ Test FAILed.
[GitHub] [spark] SparkQA commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
SparkQA commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542479726 **[Test build #112132 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112132/testReport)** for PR 26123 at commit [`c67bcf3`](https://github.com/apache/spark/commit/c67bcf303f0764bae900ac243121523bdcd35274). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542479731 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
SparkQA removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542479487 **[Test build #112132 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112132/testReport)** for PR 26123 at commit [`c67bcf3`](https://github.com/apache/spark/commit/c67bcf303f0764bae900ac243121523bdcd35274).
[GitHub] [spark] AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542479735 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/112132/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542479731 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542479808 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542479819 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17127/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
SparkQA commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542479487 **[Test build #112132 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112132/testReport)** for PR 26123 at commit [`c67bcf3`](https://github.com/apache/spark/commit/c67bcf303f0764bae900ac243121523bdcd35274).
[GitHub] [spark] AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542478178 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins removed a comment on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542478183 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17126/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542478178 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock
AmplabJenkins commented on issue #26123: [SPARK-27259][CORE] Allow setting -1 as split size for InputFileBlock URL: https://github.com/apache/spark/pull/26123#issuecomment-542478183 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/17126/ Test PASSed.