[GitHub] [spark] AmplabJenkins removed a comment on pull request #32328: [SPARK-35214][SQL] OptimizeSkewedJoin support ShuffledHashJoinExec

2021-04-26 Thread GitBox
AmplabJenkins removed a comment on pull request #32328: URL: https://github.com/apache/spark/pull/32328#issuecomment-826265715 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [spark] HeartSaVioR commented on pull request #32316: [SPARK-28247][SS][TEST]Fix flaky test "query without test harness" on ContinuousSuite

2021-04-26 Thread GitBox
HeartSaVioR commented on pull request #32316: URL: https://github.com/apache/spark/pull/32316#issuecomment-826441652 Would we like to wait for @jose-torres to do the final review (and probably sign-off), or OK to go merging?

[GitHub] [spark] SparkQA commented on pull request #31744: [WIP][SPARK-34625][R] Enable Arrow optimization for float types with SparkR

2021-04-26 Thread GitBox
SparkQA commented on pull request #31744: URL: https://github.com/apache/spark/pull/31744#issuecomment-826273481

[GitHub] [spark] cloud-fan commented on a change in pull request #32340: [SPARK-35139][SQL]Support ANSI intervals as Arrow Column vectors

2021-04-26 Thread GitBox
cloud-fan commented on a change in pull request #32340: URL: https://github.com/apache/spark/pull/32340#discussion_r619962492 ## File path: sql/catalyst/src/main/java/org/apache/spark/sql/vectorized/ArrowColumnVector.java ## @@ -508,4 +516,39 @@ final ColumnarMap getMap(int ro

[GitHub] [spark] ulysses-you commented on a change in pull request #32236: [WIP][SPARK-35137][SQL] Revise outputpartitioning number in some SparkPlan

2021-04-26 Thread GitBox
ulysses-you commented on a change in pull request #32236: URL: https://github.com/apache/spark/pull/32236#discussion_r619735173 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala ## @@ -775,8 +775,8 @@ class PlannerSuite extends SharedSpark

[GitHub] [spark] sadhen commented on pull request #32319: [SPARK-35211][PYSPARK] _create_dataframe: infer schema earlier and do type check

2021-04-26 Thread GitBox
sadhen commented on pull request #32319: URL: https://github.com/apache/spark/pull/32319#issuecomment-826071082 CI failed: ``` fatal: couldn't find remote ref SPARK-35211 Error: Process completed with exit code 128. ``` I guess it is related to branch name, so I created an

[GitHub] [spark] asfgit closed pull request #30480: [SPARK-32921][SHUFFLE] MapOutputTracker extensions to support push-based shuffle

2021-04-26 Thread GitBox
asfgit closed pull request #30480: URL: https://github.com/apache/spark/pull/30480

[GitHub] [spark] AmplabJenkins commented on pull request #32323: [SPARK-35210][BUILD][3.0] Upgrade Jetty to 9.4.40 to fix ERR_CONNECTION_RESET issue

2021-04-26 Thread GitBox
AmplabJenkins commented on pull request #32323: URL: https://github.com/apache/spark/pull/32323#issuecomment-826084238

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #32266: [SPARK-35111][SQL] Support Cast string to year-month interval

2021-04-26 Thread GitBox
AngersZhuuuu commented on a change in pull request #32266: URL: https://github.com/apache/spark/pull/32266#discussion_r619736559 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/IntervalUtils.scala ## @@ -93,6 +93,23 @@ object IntervalUtils { p

[GitHub] [spark] sadhen closed pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-26 Thread GitBox
sadhen closed pull request #32026: URL: https://github.com/apache/spark/pull/32026

[GitHub] [spark] SparkQA removed a comment on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-04-26 Thread GitBox
SparkQA removed a comment on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-826124648 **[Test build #137892 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137892/testReport)** for PR 32136 at commit [`173bb07`](https://gi

[GitHub] [spark] AngersZhuuuu commented on pull request #32333: [SPARK-33985][SQL] Add query test of combine usage of TRANSFORM and CLUSTER BY/ORDER BY

2021-04-26 Thread GitBox
AngersZhuuuu commented on pull request #32333: URL: https://github.com/apache/spark/pull/32333#issuecomment-826291950

[GitHub] [spark] SparkQA removed a comment on pull request #32321: [SPARK-34771][PYTHON] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-26 Thread GitBox
SparkQA removed a comment on pull request #32321: URL: https://github.com/apache/spark/pull/32321#issuecomment-826071637 **[Test build #137883 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137883/testReport)** for PR 32321 at commit [`5845ee2`](https://gi

[GitHub] [spark] SparkQA commented on pull request #32327: [SPARK-35211][PYTHON][FOLLOW_UP] Proper NumericType conversion for applySchemaToPythonRDD

2021-04-26 Thread GitBox
SparkQA commented on pull request #32327: URL: https://github.com/apache/spark/pull/32327#issuecomment-826242974

[GitHub] [spark] SparkQA commented on pull request #32324: [SPARK-35210][BUILD][3.1] Upgrade Jetty to 9.4.40 to fix ERR_CONNECTION_RESET issue

2021-04-26 Thread GitBox
SparkQA commented on pull request #32324: URL: https://github.com/apache/spark/pull/32324#issuecomment-826077488

[GitHub] [spark] HyukjinKwon closed pull request #32325: [SPARK-33913][SS] Upgrade Kafka to 2.8.0

2021-04-26 Thread GitBox
HyukjinKwon closed pull request #32325: URL: https://github.com/apache/spark/pull/32325

[GitHub] [spark] JWenBin commented on pull request #32329: com.google.protobuf.Parser.parseFrom() method Can't use in spark

2021-04-26 Thread GitBox
JWenBin commented on pull request #32329: URL: https://github.com/apache/spark/pull/32329#issuecomment-826275834 > > > Please use JIRA to file an issue: https://issues.apache.org/jira/projects/SPARK thanks!

[GitHub] [spark] Kimahriman commented on pull request #32338: [SPARK-35213][SQL] Keep the correct ordering of nested structs in chained withField operations

2021-04-26 Thread GitBox
Kimahriman commented on pull request #32338: URL: https://github.com/apache/spark/pull/32338#issuecomment-826350011

[GitHub] [spark] HyukjinKwon commented on pull request #32320: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

2021-04-26 Thread GitBox
HyukjinKwon commented on pull request #32320: URL: https://github.com/apache/spark/pull/32320#issuecomment-826274103 @sadhen, can we separate refactoring and the UDT inferred type verification? It would make the change much easier to review.

[GitHub] [spark] sadhen commented on pull request #32327: [SPARK-35211][PYTHON] Proper NumericType conversion for applySchemaToPythonRDD

2021-04-26 Thread GitBox
sadhen commented on pull request #32327: URL: https://github.com/apache/spark/pull/32327#issuecomment-826477653 > We should probably define one standard to follow. I don't know about standard. That's why I only do conversion for numeric type: int/long/byte/short/float/double. How

[GitHub] [spark] SparkQA removed a comment on pull request #32146: [SPARK-34990][SQL][TESTS][test-maven][test-hadoop2.7] Add ParquetEncryptionSuite

2021-04-26 Thread GitBox
SparkQA removed a comment on pull request #32146: URL: https://github.com/apache/spark/pull/32146#issuecomment-826063304

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32343: [SPARK-35220][DOCS][FOLLOWUP] DayTimeIntervalType/YearMonthIntervalType show different between Hive SerDe and row format delimi

2021-04-26 Thread GitBox
AmplabJenkins removed a comment on pull request #32343: URL: https://github.com/apache/spark/pull/32343#issuecomment-826494988

[GitHub] [spark] SparkQA removed a comment on pull request #32236: [WIP][SPARK-35137][SQL] Revise outputpartitioning in some SparkPlan

2021-04-26 Thread GitBox
SparkQA removed a comment on pull request #32236: URL: https://github.com/apache/spark/pull/32236#issuecomment-826220463 **[Test build #137899 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137899/testReport)** for PR 32236 at commit [`e104a30`](https://gi

[GitHub] [spark] SparkQA removed a comment on pull request #32340: [SPARK-35139][SQL]Support ANSI intervals as Arrow Column vectors

2021-04-26 Thread GitBox
SparkQA removed a comment on pull request #32340: URL: https://github.com/apache/spark/pull/32340#issuecomment-826443626 **[Test build #137928 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137928/testReport)** for PR 32340 at commit [`01367f2`](https://gi

[GitHub] [spark] SparkQA commented on pull request #32169: [SPARK-35009][CORE] Avoid creating multiple python worker monitor threads for the same worker and same task context

2021-04-26 Thread GitBox
SparkQA commented on pull request #32169: URL: https://github.com/apache/spark/pull/32169#issuecomment-826051729

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32266: [SPARK-35111][SQL] Support Cast string to year-month interval

2021-04-26 Thread GitBox
AmplabJenkins removed a comment on pull request #32266: URL: https://github.com/apache/spark/pull/32266#issuecomment-826257657

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32316: [SPARK-28247][SS][TEST]Fix flaky test "query without test harness" on ContinuousSuite

2021-04-26 Thread GitBox
AmplabJenkins removed a comment on pull request #32316: URL: https://github.com/apache/spark/pull/32316#issuecomment-826165940

[GitHub] [spark] AmplabJenkins commented on pull request #32325: [SPARK-33913][SS] Upgrade Kafka to 2.8.0

2021-04-26 Thread GitBox
AmplabJenkins commented on pull request #32325: URL: https://github.com/apache/spark/pull/32325#issuecomment-826165955

[GitHub] [spark] AmplabJenkins commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-04-26 Thread GitBox
AmplabJenkins commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-826133080

[GitHub] [spark] SparkQA commented on pull request #32346: Update the resolver for spark-packages in SparkSubmit

2021-04-26 Thread GitBox
SparkQA commented on pull request #32346: URL: https://github.com/apache/spark/pull/32346#issuecomment-826549311 **[Test build #137937 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137937/testReport)** for PR 32346 at commit [`4a678ff`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #32321: [SPARK-34771][PySpark] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-26 Thread GitBox
SparkQA commented on pull request #32321: URL: https://github.com/apache/spark/pull/32321#issuecomment-826071637

[GitHub] [spark] mridulm commented on a change in pull request #32287: [SPARK-27991][CORE] Defer the fetch request on Netty OOM

2021-04-26 Thread GitBox
mridulm commented on a change in pull request #32287: URL: https://github.com/apache/spark/pull/32287#discussion_r619982295 ## File path: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala ## @@ -683,7 +694,28 @@ final class ShuffleBlockFetcherItera

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32333: [SPARK-33985][SQL] Add query test of combine usage of TRANSFORM and CLUSTER BY/ORDER BY

2021-04-26 Thread GitBox
AmplabJenkins removed a comment on pull request #32333: URL: https://github.com/apache/spark/pull/32333#issuecomment-826298909

[GitHub] [spark] SparkQA removed a comment on pull request #32339: [SPARK-35224][SQL][TESTS] Fix buffer overflow in `MutableProjectionSuite`

2021-04-26 Thread GitBox
SparkQA removed a comment on pull request #32339: URL: https://github.com/apache/spark/pull/32339#issuecomment-826392237 **[Test build #137925 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137925/testReport)** for PR 32339 at commit [`47cb0e5`](https://gi

[GitHub] [spark] AmplabJenkins commented on pull request #32333: [SPARK-33985][SQL] Add query test of combine usage of TRANSFORM and CLUSTER BY/ORDER BY

2021-04-26 Thread GitBox
AmplabJenkins commented on pull request #32333: URL: https://github.com/apache/spark/pull/32333#issuecomment-826298909

[GitHub] [spark] AmplabJenkins commented on pull request #32335: [SPARK-35220][SQL] DayTimeIntervalType/YearMonthIntervalType show different between Hive SerDe and row format delimited

2021-04-26 Thread GitBox
AmplabJenkins commented on pull request #32335: URL: https://github.com/apache/spark/pull/32335#issuecomment-826308152

[GitHub] [spark] bozhang2820 opened a new pull request #32346: Update the resolver for spark-packages in SparkSubmit

2021-04-26 Thread GitBox
bozhang2820 opened a new pull request #32346: URL: https://github.com/apache/spark/pull/32346 ### What changes were proposed in this pull request? This change is to use repos.spark-packages.org instead of Bintray as the repository service for spark-packages. ### Why are the change

[GitHub] [spark] ulysses-you commented on pull request #32328: [SPARK-35214][SQL] OptimizeSkewedJoin support ShuffledHashJoinExec

2021-04-26 Thread GitBox
ulysses-you commented on pull request #32328: URL: https://github.com/apache/spark/pull/32328#issuecomment-826293222

[GitHub] [spark] viirya edited a comment on pull request #32338: [SPARK-35213][SQL] Keep the correct ordering of nested structs in chained withField operations

2021-04-26 Thread GitBox
viirya edited a comment on pull request #32338: URL: https://github.com/apache/spark/pull/32338#issuecomment-826474985 Jenkins tests already passed, so I think it should be fine.

[GitHub] [spark] AmplabJenkins commented on pull request #32311: [SPARK-35088][SQL] Accept ANSI intervals by the Sequence expression

2021-04-26 Thread GitBox
AmplabJenkins commented on pull request #32311: URL: https://github.com/apache/spark/pull/32311#issuecomment-826281938

[GitHub] [spark] LuciferYang commented on a change in pull request #32232: [SPARK-35135][CORE] Turn the `WritablePartitionedIterator` from a trait into a default implementation class

2021-04-26 Thread GitBox
LuciferYang commented on a change in pull request #32232: URL: https://github.com/apache/spark/pull/32232#discussion_r619760870 ## File path: core/src/main/scala/org/apache/spark/util/collection/WritablePartitionedPairCollection.scala ## @@ -94,3 +83,20 @@ private[spark] trait

[GitHub] [spark] AmplabJenkins commented on pull request #32341: [SPARK-35212][Spark Core][DStreams] Another way to added PreferRandom for the scenario that topic partitions need to be randomly distri

2021-04-26 Thread GitBox
AmplabJenkins commented on pull request #32341: URL: https://github.com/apache/spark/pull/32341#issuecomment-826475018 Can one of the admins verify this patch?

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32327: [SPARK-35211][PYTHON] Proper NumericType conversion for applySchemaToPythonRDD

2021-04-26 Thread GitBox
AmplabJenkins removed a comment on pull request #32327: URL: https://github.com/apache/spark/pull/32327#issuecomment-826242257

[GitHub] [spark] mcdull-zhang commented on pull request #31653: [SPARK-33832][SQL] v2. move OptimzieSkewedJoin to query stage preparation

2021-04-26 Thread GitBox
mcdull-zhang commented on pull request #31653: URL: https://github.com/apache/spark/pull/31653#issuecomment-82648 > We found the same issue about failed to optimize skewed join due to the extra shuffle. Before submit a ticket, I just found this PR and [#30829](https://github.com/apache

[GitHub] [spark] AmplabJenkins commented on pull request #32324: [SPARK-35210][BUILD][3.1] Upgrade Jetty to 9.4.40 to fix ERR_CONNECTION_RESET issue

2021-04-26 Thread GitBox
AmplabJenkins commented on pull request #32324: URL: https://github.com/apache/spark/pull/32324#issuecomment-826084240

[GitHub] [spark] sadhen commented on pull request #32332: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

2021-04-26 Thread GitBox
sadhen commented on pull request #32332: URL: https://github.com/apache/spark/pull/32332#issuecomment-826519391

[GitHub] [spark] cloud-fan commented on a change in pull request #32032: [SPARK-34701][SQL] Introduce AnalysisOnlyCommand that allows its children to be removed once the command is marked as analyzed.

2021-04-26 Thread GitBox
cloud-fan commented on a change in pull request #32032: URL: https://github.com/apache/spark/pull/32032#discussion_r619787170 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala ## @@ -48,29 +48,41 @@ import org.apache.spark.sql.util.Schema

[GitHub] [spark] HeartSaVioR commented on a change in pull request #31986: [SPARK-34888][SS] Introduce UpdatingSessionIterator adjusting session window on elements

2021-04-26 Thread GitBox
HeartSaVioR commented on a change in pull request #31986: URL: https://github.com/apache/spark/pull/31986#discussion_r619919017 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/UpdatingSessionsExec.scala ## @@ -0,0 +1,79 @@ +/* + * Licensed to the

[GitHub] [spark] SparkQA removed a comment on pull request #31744: [WIP][SPARK-34625][R] Enable Arrow optimization for float types with SparkR

2021-04-26 Thread GitBox
SparkQA removed a comment on pull request #31744: URL: https://github.com/apache/spark/pull/31744#issuecomment-826273481 **[Test build #137909 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137909/testReport)** for PR 31744 at commit [`4d5de83`](https://gi

[GitHub] [spark] SparkQA commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-04-26 Thread GitBox
SparkQA commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-826124648

[GitHub] [spark] AmplabJenkins commented on pull request #32344: [SPARK-35226][SQL] JDBC datasources should accept refreshKrb5Config parameter

2021-04-26 Thread GitBox
AmplabJenkins commented on pull request #32344: URL: https://github.com/apache/spark/pull/32344#issuecomment-826544923 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42456/

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #32333: [SPARK-33985][SQL][TESTS] Add query test of combine usage of TRANSFORM and CLUSTER BY/ORDER BY

2021-04-26 Thread GitBox
AngersZhuuuu commented on a change in pull request #32333: URL: https://github.com/apache/spark/pull/32333#discussion_r620011217 ## File path: sql/core/src/test/resources/sql-tests/inputs/transform.sql ## @@ -342,3 +354,41 @@ SELECT TRANSFORM(b, MAX(a) AS max_a, CAST(sum(c) AS

[GitHub] [spark] SparkQA commented on pull request #32340: [SPARK-35139][SQL]Support ANSI intervals as Arrow Column vectors

2021-04-26 Thread GitBox
SparkQA commented on pull request #32340: URL: https://github.com/apache/spark/pull/32340#issuecomment-826443626

[GitHub] [spark] AmplabJenkins commented on pull request #32336: [SPARK-35222] In YARN mode, the tracking URL is printed to allow users to better track Spark Job

2021-04-26 Thread GitBox
AmplabJenkins commented on pull request #32336: URL: https://github.com/apache/spark/pull/32336#issuecomment-826307536 Can one of the admins verify this patch?

[GitHub] [spark] AngersZhuuuu commented on a change in pull request #32314: [SPARK-35169][SQL] Fix wrong result of min ANSI interval division by -1

2021-04-26 Thread GitBox
AngersZhuuuu commented on a change in pull request #32314: URL: https://github.com/apache/spark/pull/32314#discussion_r619202058 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/intervalExpressions.scala ## @@ -402,6 +403,19 @@ case class Div

[GitHub] [spark] Kimahriman commented on a change in pull request #32338: [SPARK-35213][SQL] Keep the correct ordering of nested structs in chained withField operations

2021-04-26 Thread GitBox
Kimahriman commented on a change in pull request #32338: URL: https://github.com/apache/spark/pull/32338#discussion_r619831036 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizeWithFieldsSuite.scala ## @@ -99,7 +99,7 @@ class OptimizeWit

[GitHub] [spark] cloud-fan commented on a change in pull request #32330: [SPARK-35215][SQL] Update custom metric per certain rows and at the end of the task

2021-04-26 Thread GitBox
cloud-fan commented on a change in pull request #32330: URL: https://github.com/apache/spark/pull/32330#discussion_r620010077 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DataSourceRDD.scala ## @@ -92,12 +104,15 @@ private class Partition

[GitHub] [spark] SparkQA removed a comment on pull request #32337: [SPARK-35223] [IDEA] Add IssueNavigationLink

2021-04-26 Thread GitBox
SparkQA removed a comment on pull request #32337: URL: https://github.com/apache/spark/pull/32337#issuecomment-826358905

[GitHub] [spark] SparkQA commented on pull request #31966: [SPARK-34638][SQL] Single field nested column prune on generator output

2021-04-26 Thread GitBox
SparkQA commented on pull request #31966: URL: https://github.com/apache/spark/pull/31966#issuecomment-826189477

[GitHub] [spark] mridulm commented on pull request #30480: [SPARK-32921][SHUFFLE] MapOutputTracker extensions to support push-based shuffle

2021-04-26 Thread GitBox
mridulm commented on pull request #30480: URL: https://github.com/apache/spark/pull/30480#issuecomment-826495631

[GitHub] [spark] HyukjinKwon commented on pull request #32338: [SPARK-35213][SQL] Keep the correct ordering of nested structs in chained withField operations

2021-04-26 Thread GitBox
HyukjinKwon commented on pull request #32338: URL: https://github.com/apache/spark/pull/32338#issuecomment-826447988 @Kimahriman can you retrigger https://github.com/Kimahriman/spark/actions/runs/783536510 please? The PR in Apache Spark repo runs the build in your forked repository.

[GitHub] [spark] viirya closed pull request #32322: [SPARK-35210][BUILD][2.4] Upgrade Jetty to 9.4.40 to fix ERR_CONNECTION_RESET issue

2021-04-26 Thread GitBox
viirya closed pull request #32322: URL: https://github.com/apache/spark/pull/32322

[GitHub] [spark] HyukjinKwon closed pull request #32329: com.google.protobuf.Parser.parseFrom() method Can't use in spark

2021-04-26 Thread GitBox
HyukjinKwon closed pull request #32329: URL: https://github.com/apache/spark/pull/32329

[GitHub] [spark] HyukjinKwon commented on pull request #32329: com.google.protobuf.Parser.parseFrom() method Can't use in spark

2021-04-26 Thread GitBox
HyukjinKwon commented on pull request #32329: URL: https://github.com/apache/spark/pull/32329#issuecomment-826275418 Please use JIRA to file an issue: https://issues.apache.org/jira/projects/SPARK

[GitHub] [spark] SparkQA commented on pull request #32330: [SPARK-35215][SQL] Update custom metric per certain rows and at the end of the task

2021-04-26 Thread GitBox
SparkQA commented on pull request #32330: URL: https://github.com/apache/spark/pull/32330#issuecomment-826272960

[GitHub] [spark] SparkQA removed a comment on pull request #32315: [SPARK-35206][TESTS][SQL] Extract common used get project path into a function in SparkFunctionSuite

2021-04-26 Thread GitBox
SparkQA removed a comment on pull request #32315: URL: https://github.com/apache/spark/pull/32315#issuecomment-826101112

[GitHub] [spark] mridulm commented on pull request #32136: [SPARK-35022][CORE] Task Scheduling Plugin in Spark

2021-04-26 Thread GitBox
mridulm commented on pull request #32136: URL: https://github.com/apache/spark/pull/32136#issuecomment-826506510 As I mentioned in the doc, are we trying to retrofit scenarios that Spark is not trying to handle? Namely: some task for some stage must only run on a particular executor a

[GitHub] [spark] HyukjinKwon commented on pull request #32332: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

2021-04-26 Thread GitBox
HyukjinKwon commented on pull request #32332: URL: https://github.com/apache/spark/pull/32332#issuecomment-826520847

[GitHub] [spark] SparkQA removed a comment on pull request #32232: [SPARK-35135][CORE] Turn the `WritablePartitionedIterator` from a trait into a default implementation class

2021-04-26 Thread GitBox
SparkQA removed a comment on pull request #32232: URL: https://github.com/apache/spark/pull/32232#issuecomment-826238292 **[Test build #137902 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137902/testReport)** for PR 32232 at commit [`8835ecc`](https://gi

[GitHub] [spark] maropu commented on pull request #32210: [SPARK-32634][SQL] Introduce sort-based fallback for shuffled hash join (non-code-gen path)

2021-04-26 Thread GitBox
maropu commented on pull request #32210: URL: https://github.com/apache/spark/pull/32210#issuecomment-826501739 I have the same impression with @sigmod.

[GitHub] [spark] xuanyuanking commented on a change in pull request #32272: [SPARK-35172][SS] The implementation of RocksDBCheckpointMetadata

2021-04-26 Thread GitBox
xuanyuanking commented on a change in pull request #32272: URL: https://github.com/apache/spark/pull/32272#discussion_r619785921 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBFileManager.scala ## @@ -0,0 +1,165 @@ +/* + * Licensed

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32318: [SPARK-35210][BUILD] Upgrade Jetty to 9.4.40 to fix ERR_CONNECTION_RESET issue

2021-04-26 Thread GitBox
AmplabJenkins removed a comment on pull request #32318: URL: https://github.com/apache/spark/pull/32318#issuecomment-826048519 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [spark] HyukjinKwon commented on pull request #32335: [SPARK-35220][SQL] DayTimeIntervalType/YearMonthIntervalType show different between Hive SerDe and row format delimited

2021-04-26 Thread GitBox
HyukjinKwon commented on pull request #32335: URL: https://github.com/apache/spark/pull/32335#issuecomment-826458656 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For que

[GitHub] [spark] SparkQA removed a comment on pull request #32026: [SPARK-34771] Support UDT for Pandas/Spark conversion with Arrow support Enabled

2021-04-26 Thread GitBox
SparkQA removed a comment on pull request #32026: URL: https://github.com/apache/spark/pull/32026#issuecomment-826048887 **[Test build #137878 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137878/testReport)** for PR 32026 at commit [`5845ee2`](https://gi

[GitHub] [spark] AngersZhuuuu commented on pull request #32343: [SPARK-35220][DOCS][FOLLOWUP] DayTimeIntervalType/YearMonthIntervalType show different between Hive SerDe and row format delimited

2021-04-26 Thread GitBox
AngersZhuuuu commented on pull request #32343: URL: https://github.com/apache/spark/pull/32343#issuecomment-826480244 FYI @HyukjinKwon -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the sp

[GitHub] [spark] Amitg1 commented on pull request #13599: [SPARK-13587] [PYSPARK] Support virtualenv in pyspark

2021-04-26 Thread GitBox
Amitg1 commented on pull request #13599: URL: https://github.com/apache/spark/pull/13599#issuecomment-826307849 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries

[GitHub] [spark] sarutak commented on pull request #32190: [SPARK-35087][UI] Some columns in table Aggregated Metrics by Executor of stage-detail page shows incorrectly.

2021-04-26 Thread GitBox
sarutak commented on pull request #32190: URL: https://github.com/apache/spark/pull/32190#issuecomment-826472470 Merged to `master` and `branch-3.1`. Thanks @kyoty . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [spark] AmplabJenkins commented on pull request #32319: [SPARK-35211][PYSPARK] _create_dataframe: infer schema earlier and do type check

2021-04-26 Thread GitBox
AmplabJenkins commented on pull request #32319: URL: https://github.com/apache/spark/pull/32319#issuecomment-826070387 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For q

[GitHub] [spark] AmplabJenkins commented on pull request #32322: [SPARK-35210][BUILD][2.4] Upgrade Jetty to 9.4.40 to fix ERR_CONNECTION_RESET issue

2021-04-26 Thread GitBox
AmplabJenkins commented on pull request #32322: URL: https://github.com/apache/spark/pull/32322#issuecomment-826077279 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For q

[GitHub] [spark] AmplabJenkins commented on pull request #32031: [WIP] Initial work of Remote Shuffle Service on Kubernetes

2021-04-26 Thread GitBox
AmplabJenkins commented on pull request #32031: URL: https://github.com/apache/spark/pull/32031#issuecomment-826048520 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For q

[GitHub] [spark] SparkQA commented on pull request #32345: [WIP][SPARK-35085][SQL] Get columns operation should handle ANSI interval column properly

2021-04-26 Thread GitBox
SparkQA commented on pull request #32345: URL: https://github.com/apache/spark/pull/32345#issuecomment-826545637 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries

[GitHub] [spark] SparkQA removed a comment on pull request #32311: [SPARK-35088][SQL] Accept ANSI intervals by the Sequence expression

2021-04-26 Thread GitBox
SparkQA removed a comment on pull request #32311: URL: https://github.com/apache/spark/pull/32311#issuecomment-826281705 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] sarutak edited a comment on pull request #32317: [SPARK-33195][UI] Fix stages/stage UI page fails because of uri parameters encoded twice

2021-04-26 Thread GitBox
sarutak edited a comment on pull request #32317: URL: https://github.com/apache/spark/pull/32317#issuecomment-826523987 @mdianjun > A new Unit test + manually test on a cluster (Added later). Could you add the new test? -- This is an automated message from the Apache Git Servic

[GitHub] [spark] SparkQA commented on pull request #32335: [SPARK-35220][SQL] DayTimeIntervalType/YearMonthIntervalType show different between Hive SerDe and row format delimited

2021-04-26 Thread GitBox
SparkQA commented on pull request #32335: URL: https://github.com/apache/spark/pull/32335#issuecomment-826300918 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries

[GitHub] [spark] viirya closed pull request #32323: [SPARK-35210][BUILD][3.0] Upgrade Jetty to 9.4.40 to fix ERR_CONNECTION_RESET issue

2021-04-26 Thread GitBox
viirya closed pull request #32323: URL: https://github.com/apache/spark/pull/32323 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please

[GitHub] [spark] SparkQA removed a comment on pull request #32330: [SPARK-35215][SQL] Update custom metric per certain rows and at the end of the task

2021-04-26 Thread GitBox
SparkQA removed a comment on pull request #32330: URL: https://github.com/apache/spark/pull/32330#issuecomment-826272960 **[Test build #137908 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137908/testReport)** for PR 32330 at commit [`1a98660`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30480: [SPARK-32921][SHUFFLE] MapOutputTracker extensions to support push-based shuffle

2021-04-26 Thread GitBox
AmplabJenkins removed a comment on pull request #30480: URL: https://github.com/apache/spark/pull/30480#issuecomment-826517567 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42455/

[GitHub] [spark] viirya commented on pull request #32316: [SPARK-28247][SS][TEST]Fix flaky test "query without test harness" on ContinuousSuite

2021-04-26 Thread GitBox
viirya commented on pull request #32316: URL: https://github.com/apache/spark/pull/32316#issuecomment-826127058 It seems to fail Scala 2.13 build. > [error] /home/runner/work/spark/spark/sql/core/src/test/scala/org/apache/spark/sql/streaming/continuous/ContinuousSuite.scala:63:53: t

[GitHub] [spark] MaxGekk commented on a change in pull request #32339: [SPARK-35224][SQL][TESTS] Fix buffer overflow in `MutableProjectionSuite`

2021-04-26 Thread GitBox
MaxGekk commented on a change in pull request #32339: URL: https://github.com/apache/spark/pull/32339#discussion_r62002 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/MutableProjectionSuite.scala ## @@ -50,8 +51,10 @@ class MutableProje

[GitHub] [spark] AmplabJenkins commented on pull request #32337: [SPARK-35223] [IDEA] Add IssueNavigationLink

2021-04-26 Thread GitBox
AmplabJenkins commented on pull request #32337: URL: https://github.com/apache/spark/pull/32337#issuecomment-826334246 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For q

[GitHub] [spark] viirya commented on a change in pull request #31398: [SPARK-34297][SQL][SS] Add metrics for data loss and offset out range for KafkaMicroBatchStream

2021-04-26 Thread GitBox
viirya commented on a change in pull request #31398: URL: https://github.com/apache/spark/pull/31398#discussion_r619993990 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/metric/CustomMetrics.scala ## @@ -47,25 +47,21 @@ object CustomMetrics { } /** -

[GitHub] [spark] SparkQA commented on pull request #32333: [SPARK-33985][SQL] Add query test of combine usage of TRANSFORM and CLUSTER BY/ORDER BY

2021-04-26 Thread GitBox
SparkQA commented on pull request #32333: URL: https://github.com/apache/spark/pull/32333#issuecomment-826291935 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries

[GitHub] [spark] pan3793 commented on a change in pull request #32337: [SPARK-35223] Add IssueNavigationLink

2021-04-26 Thread GitBox
pan3793 commented on a change in pull request #32337: URL: https://github.com/apache/spark/pull/32337#discussion_r619934141 ## File path: .gitignore ## @@ -15,7 +15,9 @@ .ensime_cache/ .ensime_lucene .generated-mima* -.idea/ +# The star is required for further !/.idea/ to wo

[GitHub] [spark] AmplabJenkins commented on pull request #32340: [SPARK-35139][SQL]Support ANSI intervals as Arrow Column vectors

2021-04-26 Thread GitBox
AmplabJenkins commented on pull request #32340: URL: https://github.com/apache/spark/pull/32340#issuecomment-826442231 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For q

[GitHub] [spark] dongjoon-hyun commented on pull request #31966: [SPARK-34638][SQL] Single field nested column prune on generator output

2021-04-26 Thread GitBox
dongjoon-hyun commented on pull request #31966: URL: https://github.com/apache/spark/pull/31966#issuecomment-826229100 No problem. Thank you for updates, @viirya . -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the U

[GitHub] [spark] cloud-fan commented on a change in pull request #32339: [SPARK-35224][SQL][TESTS] Fix buffer overflow in `MutableProjectionSuite`

2021-04-26 Thread GitBox
cloud-fan commented on a change in pull request #32339: URL: https://github.com/apache/spark/pull/32339#discussion_r619975922 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/MutableProjectionSuite.scala ## @@ -50,8 +51,10 @@ class MutablePro

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32330: [SPARK-35215][SQL] Update custom metric per certain rows and at the end of the task

2021-04-26 Thread GitBox
AmplabJenkins removed a comment on pull request #32330: URL: https://github.com/apache/spark/pull/32330#issuecomment-826281272 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment

[GitHub] [spark] sadhen edited a comment on pull request #32332: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

2021-04-26 Thread GitBox
sadhen edited a comment on pull request #32332: URL: https://github.com/apache/spark/pull/32332#issuecomment-826519391 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For q

[GitHub] [spark] AmplabJenkins commented on pull request #32339: [SPARK-35224][SQL][TESTS] Fix buffer overflow in `MutableProjectionSuite`

2021-04-26 Thread GitBox
AmplabJenkins commented on pull request #32339: URL: https://github.com/apache/spark/pull/32339#issuecomment-826399280 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For q

[GitHub] [spark] SparkQA commented on pull request #32253: [SPARK-35150][ML] Accelerate fallback BLAS with dev.ludovic.netlib

2021-04-26 Thread GitBox
SparkQA commented on pull request #32253: URL: https://github.com/apache/spark/pull/32253#issuecomment-826358919 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries
