[GitHub] [spark] AngersZhuuuu commented on a change in pull request #25028: [SPARK-28227][SQL] Support TRANSFORM with aggregation.

2019-10-28 Thread GitBox
AngersZhuuuu commented on a change in pull request #25028: [SPARK-28227][SQL] Support TRANSFORM with aggregation. URL: https://github.com/apache/spark/pull/25028#discussion_r339891398 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala

[GitHub] [spark] AmplabJenkins commented on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-547252524 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] SparkQA commented on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's

2019-10-28 Thread GitBox
SparkQA commented on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-547252503 **[Test build #112819 has

[GitHub] [spark] SparkQA removed a comment on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's

2019-10-28 Thread GitBox
SparkQA removed a comment on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-547251927 **[Test build #112819 has

[GitHub] [spark] cloud-fan commented on issue #25962: [SPARK-29285][Shuffle] Temporary shuffle files should be able to handle disk failures

2019-10-28 Thread GitBox
cloud-fan commented on issue #25962: [SPARK-29285][Shuffle] Temporary shuffle files should be able to handle disk failures URL: https://github.com/apache/spark/pull/25962#issuecomment-547252507 This makes sense to me. Temp files are not only used by shuffle, but also external sorter,

[GitHub] [spark] AmplabJenkins removed a comment on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-547252519 Merged build finished. Test FAILed.

[GitHub] [spark] AmplabJenkins commented on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-547252519 Merged build finished. Test FAILed. This is an

[GitHub] [spark] AmplabJenkins removed a comment on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-547252203 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins commented on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-547252196 Merged build finished. Test PASSed. This is an

[GitHub] [spark] AmplabJenkins removed a comment on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-547252196 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins commented on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-547252203 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] SparkQA commented on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's

2019-10-28 Thread GitBox
SparkQA commented on issue #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#issuecomment-547251927 **[Test build #112819 has

[GitHub] [spark] HyukjinKwon commented on a change in pull request #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's

2019-10-28 Thread GitBox
HyukjinKwon commented on a change in pull request #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#discussion_r339890564 ## File path: python/pyspark/tests/test_pin_thread.py ## @@ -0,0 +1,156 @@ +#

[GitHub] [spark] alfozan commented on a change in pull request #25028: [SPARK-28227][SQL] Support TRANSFORM with aggregation.

2019-10-28 Thread GitBox
alfozan commented on a change in pull request #25028: [SPARK-28227][SQL] Support TRANSFORM with aggregation. URL: https://github.com/apache/spark/pull/25028#discussion_r339890074 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala

[GitHub] [spark] HyukjinKwon commented on issue #26243: Prepare Spark release v3.0.0-preview-rc1

2019-10-28 Thread GitBox
HyukjinKwon commented on issue #26243: Prepare Spark release v3.0.0-preview-rc1 URL: https://github.com/apache/spark/pull/26243#issuecomment-547251312 should be good to go. This is an automated message from the Apache Git

[GitHub] [spark] alfozan commented on a change in pull request #25028: [SPARK-28227][SQL] Support TRANSFORM with aggregation.

2019-10-28 Thread GitBox
alfozan commented on a change in pull request #25028: [SPARK-28227][SQL] Support TRANSFORM with aggregation. URL: https://github.com/apache/spark/pull/25028#discussion_r339890074 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala

[GitHub] [spark] alfozan commented on issue #25028: [SPARK-28227][SQL] Support TRANSFORM with aggregation.

2019-10-28 Thread GitBox
alfozan commented on issue #25028: [SPARK-28227][SQL] Support TRANSFORM with aggregation. URL: https://github.com/apache/spark/pull/25028#issuecomment-547250946 Follow up - I think It's even better to always use a manual aliasing function and not just for a subset of expressions:

[GitHub] [spark] AmplabJenkins removed a comment on issue #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core URL: https://github.com/apache/spark/pull/26239#issuecomment-547250771 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins removed a comment on issue #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core URL: https://github.com/apache/spark/pull/26239#issuecomment-547250772 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins commented on issue #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core URL: https://github.com/apache/spark/pull/26239#issuecomment-547250771 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins commented on issue #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core URL: https://github.com/apache/spark/pull/26239#issuecomment-547250772 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] HyukjinKwon commented on a change in pull request #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's

2019-10-28 Thread GitBox
HyukjinKwon commented on a change in pull request #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#discussion_r339889301 ## File path: python/pyspark/tests/test_pin_thread.py ## @@ -0,0 +1,156 @@ +#

[GitHub] [spark] SparkQA commented on issue #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core

2019-10-28 Thread GitBox
SparkQA commented on issue #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core URL: https://github.com/apache/spark/pull/26239#issuecomment-547250552 **[Test build #112818 has

[GitHub] [spark] HyukjinKwon commented on a change in pull request #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's

2019-10-28 Thread GitBox
HyukjinKwon commented on a change in pull request #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#discussion_r339889301 ## File path: python/pyspark/tests/test_pin_thread.py ## @@ -0,0 +1,156 @@ +#

[GitHub] [spark] ConeyLiu commented on a change in pull request #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core

2019-10-28 Thread GitBox
ConeyLiu commented on a change in pull request #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core URL: https://github.com/apache/spark/pull/26239#discussion_r339889000 ## File path: python/pyspark/worker.py ## @@ -505,6 +505,9 @@ def

[GitHub] [spark] ConeyLiu commented on a change in pull request #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core

2019-10-28 Thread GitBox
ConeyLiu commented on a change in pull request #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core URL: https://github.com/apache/spark/pull/26239#discussion_r339888940 ## File path: python/pyspark/taskcontext.py ## @@ -162,7 +166,10

[GitHub] [spark] ConeyLiu commented on a change in pull request #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core

2019-10-28 Thread GitBox
ConeyLiu commented on a change in pull request #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core URL: https://github.com/apache/spark/pull/26239#discussion_r33991 ## File path: python/pyspark/taskcontext.py ## @@ -162,7 +166,10

[GitHub] [spark] ConeyLiu commented on a change in pull request #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core

2019-10-28 Thread GitBox
ConeyLiu commented on a change in pull request #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core URL: https://github.com/apache/spark/pull/26239#discussion_r339888910 ## File path: python/pyspark/tests/test_taskcontext.py ## @@

[GitHub] [spark] ConeyLiu commented on a change in pull request #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core

2019-10-28 Thread GitBox
ConeyLiu commented on a change in pull request #26239: [SPARK-29582][PYSPARK] Unify the behavior of pyspark.TaskContext with spark core URL: https://github.com/apache/spark/pull/26239#discussion_r339888616 ## File path: python/pyspark/worker.py ## @@ -596,6 +599,10 @@ def

[GitHub] [spark] AngersZhuuuu commented on issue #25028: [SPARK-28227][SQL] Support TRANSFORM with aggregation.

2019-10-28 Thread GitBox
AngersZhuuuu commented on issue #25028: [SPARK-28227][SQL] Support TRANSFORM with aggregation. URL: https://github.com/apache/spark/pull/25028#issuecomment-547248832 > Regarding the issue in [#25028 (comment)](https://github.com/apache/spark/pull/25028#issuecomment-547230458) > >

[GitHub] [spark] AmplabJenkins removed a comment on issue #26261: [SPARK-29607][SQL] Move static methods from CalendarInterval to IntervalUtils

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #26261: [SPARK-29607][SQL] Move static methods from CalendarInterval to IntervalUtils URL: https://github.com/apache/spark/pull/26261#issuecomment-547247999 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on issue #26261: [SPARK-29607][SQL] Move static methods from CalendarInterval to IntervalUtils

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #26261: [SPARK-29607][SQL] Move static methods from CalendarInterval to IntervalUtils URL: https://github.com/apache/spark/pull/26261#issuecomment-547247988 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins commented on issue #26261: [SPARK-29607][SQL] Move static methods from CalendarInterval to IntervalUtils

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #26261: [SPARK-29607][SQL] Move static methods from CalendarInterval to IntervalUtils URL: https://github.com/apache/spark/pull/26261#issuecomment-547247999 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] HyukjinKwon commented on a change in pull request #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's

2019-10-28 Thread GitBox
HyukjinKwon commented on a change in pull request #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#discussion_r339884667 ## File path: python/pyspark/tests/test_pin_thread.py ## @@ -0,0 +1,156 @@ +#

[GitHub] [spark] AmplabJenkins commented on issue #26261: [SPARK-29607][SQL] Move static methods from CalendarInterval to IntervalUtils

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #26261: [SPARK-29607][SQL] Move static methods from CalendarInterval to IntervalUtils URL: https://github.com/apache/spark/pull/26261#issuecomment-547247988 Merged build finished. Test PASSed.

[GitHub] [spark] SparkQA commented on issue #26261: [SPARK-29607][SQL] Move static methods from CalendarInterval to IntervalUtils

2019-10-28 Thread GitBox
SparkQA commented on issue #26261: [SPARK-29607][SQL] Move static methods from CalendarInterval to IntervalUtils URL: https://github.com/apache/spark/pull/26261#issuecomment-547247780 **[Test build #112817 has

[GitHub] [spark] cloud-fan commented on issue #26261: [SPARK-29607][SQL] Move static methods from CalendarInterval to IntervalUtils

2019-10-28 Thread GitBox
cloud-fan commented on issue #26261: [SPARK-29607][SQL] Move static methods from CalendarInterval to IntervalUtils URL: https://github.com/apache/spark/pull/26261#issuecomment-547247193 retest this please This is an

[GitHub] [spark] alfozan commented on issue #25028: [SPARK-28227][SQL] Support TRANSFORM with aggregation.

2019-10-28 Thread GitBox
alfozan commented on issue #25028: [SPARK-28227][SQL] Support TRANSFORM with aggregation. URL: https://github.com/apache/spark/pull/25028#issuecomment-547246714 Regarding the issue in https://github.com/apache/spark/pull/25028#issuecomment-547230458 Instead of ``` val

[GitHub] [spark] zhengruifeng commented on issue #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component

2019-10-28 Thread GitBox
zhengruifeng commented on issue #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component URL: https://github.com/apache/spark/pull/26124#issuecomment-547246258 In practice I am using FM/FFM, and IMHO SSP or ASYNC solvers (like Difacto/PS-lite) seems more

[GitHub] [spark] cloud-fan closed pull request #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType

2019-10-28 Thread GitBox
cloud-fan closed pull request #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType URL: https://github.com/apache/spark/pull/26287 This is

[GitHub] [spark] cloud-fan commented on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType

2019-10-28 Thread GitBox
cloud-fan commented on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType URL: https://github.com/apache/spark/pull/26287#issuecomment-547245580 thanks, merging to master!

[GitHub] [spark] AmplabJenkins removed a comment on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances URL: https://github.com/apache/spark/pull/26288#issuecomment-547245340 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances URL: https://github.com/apache/spark/pull/26288#issuecomment-547245334 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins commented on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances URL: https://github.com/apache/spark/pull/26288#issuecomment-547245334 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins commented on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances URL: https://github.com/apache/spark/pull/26288#issuecomment-547245340 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins commented on issue #26289: [SPARK-28560][SQL][followup] support the build side to local shuffle reader as far as possible in BroadcastHashJoin

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #26289: [SPARK-28560][SQL][followup] support the build side to local shuffle reader as far as possible in BroadcastHashJoin URL: https://github.com/apache/spark/pull/26289#issuecomment-547245247 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins commented on issue #26289: [SPARK-28560][SQL][followup] support the build side to local shuffle reader as far as possible in BroadcastHashJoin

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #26289: [SPARK-28560][SQL][followup] support the build side to local shuffle reader as far as possible in BroadcastHashJoin URL: https://github.com/apache/spark/pull/26289#issuecomment-547245250 Test PASSed. Refer to this link for build results (access

[GitHub] [spark] SparkQA removed a comment on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances

2019-10-28 Thread GitBox
SparkQA removed a comment on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances URL: https://github.com/apache/spark/pull/26288#issuecomment-547239197 **[Test build #112815 has

[GitHub] [spark] AmplabJenkins removed a comment on issue #26289: [SPARK-28560][SQL][followup] support the build side to local shuffle reader as far as possible in BroadcastHashJoin

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #26289: [SPARK-28560][SQL][followup] support the build side to local shuffle reader as far as possible in BroadcastHashJoin URL: https://github.com/apache/spark/pull/26289#issuecomment-547245250 Test PASSed. Refer to this link for build results

[GitHub] [spark] AmplabJenkins removed a comment on issue #26289: [SPARK-28560][SQL][followup] support the build side to local shuffle reader as far as possible in BroadcastHashJoin

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #26289: [SPARK-28560][SQL][followup] support the build side to local shuffle reader as far as possible in BroadcastHashJoin URL: https://github.com/apache/spark/pull/26289#issuecomment-547245247 Merged build finished. Test PASSed.

[GitHub] [spark] beliefer commented on a change in pull request #26282: [SPARK-29619] Add meaningful log and retry for start python worker process

2019-10-28 Thread GitBox
beliefer commented on a change in pull request #26282: [SPARK-29619] Add meaningful log and retry for start python worker process URL: https://github.com/apache/spark/pull/26282#discussion_r339884891 ## File path:

[GitHub] [spark] SparkQA commented on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances

2019-10-28 Thread GitBox
SparkQA commented on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances URL: https://github.com/apache/spark/pull/26288#issuecomment-547245121 **[Test build #112815 has

[GitHub] [spark] SparkQA commented on issue #26289: [SPARK-28560][SQL][followup] support the build side to local shuffle reader as far as possible in BroadcastHashJoin

2019-10-28 Thread GitBox
SparkQA commented on issue #26289: [SPARK-28560][SQL][followup] support the build side to local shuffle reader as far as possible in BroadcastHashJoin URL: https://github.com/apache/spark/pull/26289#issuecomment-547244971 **[Test build #112816 has

[GitHub] [spark] beliefer commented on a change in pull request #26282: [SPARK-29619] Add meaningful log and retry for start python worker process

2019-10-28 Thread GitBox
beliefer commented on a change in pull request #26282: [SPARK-29619] Add meaningful log and retry for start python worker process URL: https://github.com/apache/spark/pull/26282#discussion_r339884891 ## File path:

[GitHub] [spark] JkSelf commented on issue #26289: [SPARK-28560][SQL][followup] support the build side to local shuffle reader as far as possible in BroadcastHashJoin

2019-10-28 Thread GitBox
JkSelf commented on issue #26289: [SPARK-28560][SQL][followup] support the build side to local shuffle reader as far as possible in BroadcastHashJoin URL: https://github.com/apache/spark/pull/26289#issuecomment-547244817 @cloud-fan Please help me review. Thanks.

[GitHub] [spark] HyukjinKwon commented on a change in pull request #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's

2019-10-28 Thread GitBox
HyukjinKwon commented on a change in pull request #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#discussion_r339884667 ## File path: python/pyspark/tests/test_pin_thread.py ## @@ -0,0 +1,156 @@ +#

[GitHub] [spark] JkSelf opened a new pull request #26289: [SPARK-28560][SQL][followup] support the build side to local shuffle reader as far as possible in BroadcastHashJoin

2019-10-28 Thread GitBox
JkSelf opened a new pull request #26289: [SPARK-28560][SQL][followup] support the build side to local shuffle reader as far as possible in BroadcastHashJoin URL: https://github.com/apache/spark/pull/26289 ### What changes were proposed in this pull request?

[GitHub] [spark] zhengruifeng commented on a change in pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component

2019-10-28 Thread GitBox
zhengruifeng commented on a change in pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component URL: https://github.com/apache/spark/pull/26124#discussion_r339881152 ## File path:

[GitHub] [spark] zhengruifeng commented on a change in pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component

2019-10-28 Thread GitBox
zhengruifeng commented on a change in pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component URL: https://github.com/apache/spark/pull/26124#discussion_r339883747 ## File path:

[GitHub] [spark] zhengruifeng commented on a change in pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component

2019-10-28 Thread GitBox
zhengruifeng commented on a change in pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component URL: https://github.com/apache/spark/pull/26124#discussion_r339881680 ## File path:

[GitHub] [spark] zhengruifeng commented on a change in pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component

2019-10-28 Thread GitBox
zhengruifeng commented on a change in pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component URL: https://github.com/apache/spark/pull/26124#discussion_r339882900 ## File path:

[GitHub] [spark] zhengruifeng commented on a change in pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component

2019-10-28 Thread GitBox
zhengruifeng commented on a change in pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component URL: https://github.com/apache/spark/pull/26124#discussion_r339882486 ## File path:

[GitHub] [spark] zhengruifeng commented on a change in pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component

2019-10-28 Thread GitBox
zhengruifeng commented on a change in pull request #26124: [SPARK-29224][ML]Implement Factorization Machines as a ml-pipeline component URL: https://github.com/apache/spark/pull/26124#discussion_r339882282 ## File path:

[GitHub] [spark] HyukjinKwon commented on a change in pull request #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's

2019-10-28 Thread GitBox
HyukjinKwon commented on a change in pull request #24898: [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's URL: https://github.com/apache/spark/pull/24898#discussion_r339882917 ## File path: python/pyspark/context.py ## @@ -1010,13 +1010,42 @@ def

[GitHub] [spark] yaooqinn commented on issue #25962: [SPARK-29285][Shuffle] Temporary shuffle files should be able to handle disk failures

2019-10-28 Thread GitBox
yaooqinn commented on issue #25962: [SPARK-29285][Shuffle] Temporary shuffle files should be able to handle disk failures URL: https://github.com/apache/spark/pull/25962#issuecomment-547242547 @cloud-fan @squito sorry for the irrelevant answer. Yes, the final files are not handled here

[GitHub] [spark] yaooqinn edited a comment on issue #25962: [SPARK-29285][Shuffle] Temporary shuffle files should be able to handle disk failures

2019-10-28 Thread GitBox
yaooqinn edited a comment on issue #25962: [SPARK-29285][Shuffle] Temporary shuffle files should be able to handle disk failures URL: https://github.com/apache/spark/pull/25962#issuecomment-547242547 @cloud-fan @squito sorry for the irrelevant answer. Yes, the final files are not handled

[GitHub] [spark] AmplabJenkins removed a comment on issue #26243: Prepare Spark release v3.0.0-preview-rc1

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #26243: Prepare Spark release v3.0.0-preview-rc1 URL: https://github.com/apache/spark/pull/26243#issuecomment-547242030 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on issue #26243: Prepare Spark release v3.0.0-preview-rc1

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #26243: Prepare Spark release v3.0.0-preview-rc1 URL: https://github.com/apache/spark/pull/26243#issuecomment-547242022 Merged build finished. Test PASSed. This is an automated

[GitHub] [spark] AmplabJenkins commented on issue #26243: Prepare Spark release v3.0.0-preview-rc1

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #26243: Prepare Spark release v3.0.0-preview-rc1 URL: https://github.com/apache/spark/pull/26243#issuecomment-547242022 Merged build finished. Test PASSed. This is an automated message from

[GitHub] [spark] viirya edited a comment on issue #26159: [SPARK-29506][SQL] Use dynamicPartitionOverwrite in FileCommitProtocol when insert into hive table

2019-10-28 Thread GitBox
viirya edited a comment on issue #26159: [SPARK-29506][SQL] Use dynamicPartitionOverwrite in FileCommitProtocol when insert into hive table URL: https://github.com/apache/spark/pull/26159#issuecomment-547242091 @rezasafi thanks for referring that. Is it specified to enabling

[GitHub] [spark] viirya commented on issue #26159: [SPARK-29506][SQL] Use dynamicPartitionOverwrite in FileCommitProtocol when insert into hive table

2019-10-28 Thread GitBox
viirya commented on issue #26159: [SPARK-29506][SQL] Use dynamicPartitionOverwrite in FileCommitProtocol when insert into hive table URL: https://github.com/apache/spark/pull/26159#issuecomment-547242091 @rezasafi thanks for referring that. Is it specified to enabling

[GitHub] [spark] AmplabJenkins commented on issue #26243: Prepare Spark release v3.0.0-preview-rc1

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #26243: Prepare Spark release v3.0.0-preview-rc1 URL: https://github.com/apache/spark/pull/26243#issuecomment-547242030 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] SparkQA removed a comment on issue #26243: Prepare Spark release v3.0.0-preview-rc1

2019-10-28 Thread GitBox
SparkQA removed a comment on issue #26243: Prepare Spark release v3.0.0-preview-rc1 URL: https://github.com/apache/spark/pull/26243#issuecomment-547211559 **[Test build #112809 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112809/testReport)** for PR

[GitHub] [spark] SparkQA commented on issue #26243: Prepare Spark release v3.0.0-preview-rc1

2019-10-28 Thread GitBox
SparkQA commented on issue #26243: Prepare Spark release v3.0.0-preview-rc1 URL: https://github.com/apache/spark/pull/26243#issuecomment-547241691 **[Test build #112809 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/112809/testReport)** for PR 26243 at

[GitHub] [spark] AmplabJenkins removed a comment on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType URL: https://github.com/apache/spark/pull/26287#issuecomment-547241512 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins removed a comment on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType URL: https://github.com/apache/spark/pull/26287#issuecomment-547241519 Test PASSed. Refer to this link for

[GitHub] [spark] AmplabJenkins commented on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType URL: https://github.com/apache/spark/pull/26287#issuecomment-547241519 Test PASSed. Refer to this link for build

[GitHub] [spark] AmplabJenkins commented on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType URL: https://github.com/apache/spark/pull/26287#issuecomment-547241512 Merged build finished. Test PASSed.

[GitHub] [spark] SparkQA removed a comment on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType

2019-10-28 Thread GitBox
SparkQA removed a comment on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType URL: https://github.com/apache/spark/pull/26287#issuecomment-547216280 **[Test build #112811 has

[GitHub] [spark] SparkQA commented on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType

2019-10-28 Thread GitBox
SparkQA commented on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType URL: https://github.com/apache/spark/pull/26287#issuecomment-547241219 **[Test build #112811 has

[GitHub] [spark] maropu commented on a change in pull request #26285: [SPARK-29623][SQL] do not allow multiple unit TO unit statements in interval literal syntax

2019-10-28 Thread GitBox
maropu commented on a change in pull request #26285: [SPARK-29623][SQL] do not allow multiple unit TO unit statements in interval literal syntax URL: https://github.com/apache/spark/pull/26285#discussion_r339881094 ## File path:

[GitHub] [spark] cloud-fan commented on issue #25962: [SPARK-29285][Shuffle] Temporary shuffle files should be able to handle disk failures

2019-10-28 Thread GitBox
cloud-fan commented on issue #25962: [SPARK-29285][Shuffle] Temporary shuffle files should be able to handle disk failures URL: https://github.com/apache/spark/pull/25962#issuecomment-547240532 @yaooqinn I think @squito was asking about the final file, which has a fixed name. How do you

[GitHub] [spark] AmplabJenkins removed a comment on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances URL: https://github.com/apache/spark/pull/26288#issuecomment-547239509 Merged build finished. Test PASSed.

[GitHub] [spark] cloud-fan commented on a change in pull request #26285: [SPARK-29623][SQL] do not allow multiple unit TO unit statements in interval literal syntax

2019-10-28 Thread GitBox
cloud-fan commented on a change in pull request #26285: [SPARK-29623][SQL] do not allow multiple unit TO unit statements in interval literal syntax URL: https://github.com/apache/spark/pull/26285#discussion_r339880267 ## File path:

[GitHub] [spark] AmplabJenkins removed a comment on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances URL: https://github.com/apache/spark/pull/26288#issuecomment-547239515 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins commented on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances URL: https://github.com/apache/spark/pull/26288#issuecomment-547239515 Test PASSed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins commented on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances URL: https://github.com/apache/spark/pull/26288#issuecomment-547239509 Merged build finished. Test PASSed.

[GitHub] [spark] SparkQA commented on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances

2019-10-28 Thread GitBox
SparkQA commented on issue #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances URL: https://github.com/apache/spark/pull/26288#issuecomment-547239197 **[Test build #112815 has

[GitHub] [spark] HyukjinKwon commented on a change in pull request #26282: [SPARK-29619] Add meaningful log and retry for start python worker process

2019-10-28 Thread GitBox
HyukjinKwon commented on a change in pull request #26282: [SPARK-29619] Add meaningful log and retry for start python worker process URL: https://github.com/apache/spark/pull/26282#discussion_r339879865 ## File path:

[GitHub] [spark] AmplabJenkins removed a comment on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType URL: https://github.com/apache/spark/pull/26287#issuecomment-547238766 Test PASSed. Refer to this link for

[GitHub] [spark] HyukjinKwon opened a new pull request #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances

2019-10-28 Thread GitBox
HyukjinKwon opened a new pull request #26288: [SPARK-29627][PYTHON][SQL] Allow array_contains to take column instances URL: https://github.com/apache/spark/pull/26288 ### What changes were proposed in this pull request? This PR proposes to allow `array_contains` to take column

[GitHub] [spark] AmplabJenkins removed a comment on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType

2019-10-28 Thread GitBox
AmplabJenkins removed a comment on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType URL: https://github.com/apache/spark/pull/26287#issuecomment-547238759 Merged build finished. Test PASSed.

[GitHub] [spark] AmplabJenkins commented on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType URL: https://github.com/apache/spark/pull/26287#issuecomment-547238759 Merged build finished. Test PASSed.

[GitHub] [spark] SparkQA removed a comment on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType

2019-10-28 Thread GitBox
SparkQA removed a comment on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType URL: https://github.com/apache/spark/pull/26287#issuecomment-547213116 **[Test build #112810 has

[GitHub] [spark] AmplabJenkins commented on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType

2019-10-28 Thread GitBox
AmplabJenkins commented on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType URL: https://github.com/apache/spark/pull/26287#issuecomment-547238766 Test PASSed. Refer to this link for build

[GitHub] [spark] SparkQA commented on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType

2019-10-28 Thread GitBox
SparkQA commented on issue #26287: [SPARK-28158][SQL][FOLLOWUP] HiveUserDefinedTypeSuite: don't use RandomDataGenerator to create row for UDT backed by ArrayType URL: https://github.com/apache/spark/pull/26287#issuecomment-547238532 **[Test build #112810 has

[GitHub] [spark] ConeyLiu commented on issue #25470: [SPARK-28751][Core][WIP] Improve java serializer deserialization performance

2019-10-28 Thread GitBox
ConeyLiu commented on issue #25470: [SPARK-28751][Core][WIP] Improve java serializer deserialization performance URL: https://github.com/apache/spark/pull/25470#issuecomment-547238238 I have tested it locally; the performance is almost equal. Test case: ```scala object

[GitHub] [spark] beliefer commented on a change in pull request #26282: [SPARK-29619] Add meaningful log and retry for start python worker process

2019-10-28 Thread GitBox
beliefer commented on a change in pull request #26282: [SPARK-29619] Add meaningful log and retry for start python worker process URL: https://github.com/apache/spark/pull/26282#discussion_r339878728 ## File path:
