[GitHub] spark issue #22112: [WIP][SPARK-23243][Core] Fix RDD.repartition() data corr...

2018-08-15 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/22112 > IMO we should traverse the dependency graph and rely on how ShuffledRDD is configured A trivial point here - Since `ShuffleDependency` is also a DeveloperAPI, it's possible for users

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21320 @mallman, can you close this and put your effort into https://github.com/apache/spark/pull/21889 instead? I see no point in leaving this PR open.

[GitHub] spark issue #22112: [WIP][SPARK-23243][Core] Fix RDD.repartition() data corr...

2018-08-15 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/22112 You are perfectly correct @jiangxb1987, that was a silly mistake on my part - and not trivial at all ! It should be shuffle dependency we should rely on when traversing the dependency tree, not

[GitHub] spark issue #22117: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

2018-08-15 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/22117 Jenkins, retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark pull request #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL ...

2018-08-15 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/22107#discussion_r210488641 --- Diff: R/pkg/R/DataFrame.R --- @@ -2876,6 +2905,37 @@ setMethod("except", dataFrame(excepted) }) +#' exceptA

[GitHub] spark pull request #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL ...

2018-08-15 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/22107#discussion_r210488754 --- Diff: R/pkg/R/DataFrame.R --- @@ -2848,6 +2848,35 @@ setMethod("intersect", dataFrame(intersected) }) +#' i

[GitHub] spark issue #22117: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

2018-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22117 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2238/

[GitHub] spark issue #22117: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

2018-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22117 Merged build finished. Test PASSed.

[GitHub] spark pull request #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL ...

2018-08-15 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/22107#discussion_r210488890 --- Diff: R/pkg/R/DataFrame.R --- @@ -2876,6 +2905,37 @@ setMethod("except", dataFrame(excepted) }) +#' exceptA

[GitHub] spark pull request #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL ...

2018-08-15 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/22107#discussion_r210488842 --- Diff: R/pkg/R/DataFrame.R --- @@ -2848,6 +2848,35 @@ setMethod("intersect", dataFrame(intersected) }) +#' i

[GitHub] spark issue #22117: [SPARK-23654][BUILD] remove jets3t as a dependency of sp...

2018-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22117 **[Test build #94840 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94840/testReport)** for PR 22117 at commit [`3cad78f`](https://github.com/apache/spark/commit/3c

[GitHub] spark pull request #22119: [WIP][SPARK-25129][SQL] Revert mapping com.databr...

2018-08-15 Thread gengliangwang
GitHub user gengliangwang opened a pull request: https://github.com/apache/spark/pull/22119 [WIP][SPARK-25129][SQL] Revert mapping com.databricks.spark.avro to org.apache.spark.sql.avro ## What changes were proposed in this pull request? In https://issues.apache.org/jira/br

[GitHub] spark pull request #20725: [SPARK-23555][PYTHON] Add BinaryType support for ...

2018-08-15 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20725#discussion_r210489614 --- Diff: python/pyspark/sql/tests.py --- @@ -4331,13 +4354,22 @@ def test_createDataFrame_fallback_enabled(self): self.assertEq

[GitHub] spark pull request #22119: [WIP][SPARK-25129][SQL] Revert mapping com.databr...

2018-08-15 Thread gengliangwang
Github user gengliangwang commented on a diff in the pull request: https://github.com/apache/spark/pull/22119#discussion_r210489530 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -637,6 +635,12 @@ object DataSource extends Logg

[GitHub] spark issue #22119: [WIP][SPARK-25129][SQL] Revert mapping com.databricks.sp...

2018-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22119 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2239/

[GitHub] spark issue #22119: [WIP][SPARK-25129][SQL] Revert mapping com.databricks.sp...

2018-08-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/22119 Merged build finished. Test PASSed.

[GitHub] spark issue #22119: [WIP][SPARK-25129][SQL] Revert mapping com.databricks.sp...

2018-08-15 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22119 @tgravescs @dongjoon-hyun @HyukjinKwon @cloud-fan

[GitHub] spark issue #22119: [WIP][SPARK-25129][SQL] Revert mapping com.databricks.sp...

2018-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/22119 **[Test build #94841 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94841/testReport)** for PR 22119 at commit [`656790e`](https://github.com/apache/spark/commit/65

[GitHub] spark pull request #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL ...

2018-08-15 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request: https://github.com/apache/spark/pull/22107#discussion_r210490074 --- Diff: R/pkg/R/DataFrame.R --- @@ -2876,6 +2905,37 @@ setMethod("except", dataFrame(excepted) }) +#' exceptA

[GitHub] spark pull request #21835: [SPARK-24779]Add sequence / map_concat / map_from...

2018-08-15 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/21835#discussion_r210489980 --- Diff: R/pkg/R/functions.R --- @@ -3320,7 +3321,7 @@ setMethod("explode", #' @aliases sequence sequence,Column-method #' @note sequence sinc

[GitHub] spark issue #22119: [WIP][SPARK-25129][SQL] Revert mapping com.databricks.sp...

2018-08-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22119 Sorry if I missed some comments somewhere but just for clarification, should we do it for CSV in 3.0.0? Inconsistency should also be taken into account. Actually configuration sounds making mor

[GitHub] spark pull request #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL ...

2018-08-15 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request: https://github.com/apache/spark/pull/22107#discussion_r210490166 --- Diff: R/pkg/R/DataFrame.R --- @@ -2876,6 +2905,37 @@ setMethod("except", dataFrame(excepted) }) +#' exceptA

[GitHub] spark pull request #22107: [SPARK-25117][R] Add EXEPT ALL and INTERSECT ALL ...

2018-08-15 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request: https://github.com/apache/spark/pull/22107#discussion_r210490145 --- Diff: R/pkg/R/DataFrame.R --- @@ -2848,6 +2848,35 @@ setMethod("intersect", dataFrame(intersected) }) +#' i

[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-15 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/21537 Thanks for involving me in an important thread. I was busy this morning in Japan. I think there are three topics in the thread. 1. Merge or revert this PR 2. Design document 3. IR d

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21320 **[Test build #4277 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4277/testReport)** for PR 21320 at commit [`0e5594b`](https://github.com/apache/spark/commit/0

[GitHub] spark issue #21889: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-08-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21889 **[Test build #4278 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4278/testReport)** for PR 21889 at commit [`8d822ee`](https://github.com/apache/spark/commit/8

[GitHub] spark pull request #22119: [WIP][SPARK-25129][SQL] Revert mapping com.databr...

2018-08-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22119#discussion_r210491110 --- Diff: external/avro/src/test/scala/org/apache/spark/sql/avro/AvroSuite.scala --- @@ -503,7 +495,7 @@ class AvroSuite extends QueryTest with SharedSQLC

[GitHub] spark pull request #22119: [WIP][SPARK-25129][SQL] Revert mapping com.databr...

2018-08-15 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/22119#discussion_r210491239 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala --- @@ -637,6 +635,12 @@ object DataSource extends Logging

[GitHub] spark issue #22098: [SPARK-24886][INFRA] Fix the testing script to increase ...

2018-08-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/22098 @shaneknapp, seems this was first introduced in https://issues.apache.org/jira/browse/SPARK-3076 / https://github.com/apache/spark/pull/1974 for a good reason fwiw. One thing I am not s

[GitHub] spark issue #22111: [SPARK-25123][SQL] Use Block to track code in SimpleExpr...

2018-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22111 Let us hold these codegen PRs until we see the design doc for building IR for the codegen?

[GitHub] spark issue #22119: [WIP][SPARK-25129][SQL] Revert mapping com.databricks.sp...

2018-08-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/22119 If we all agree this databricks mapping is not reasonable, I think it's ok to have this inconsistency and remove the mapping for CSV in 3.0. It's weird to make the same mistake just to mak

[GitHub] spark issue #22119: [WIP][SPARK-25129][SQL] Revert mapping com.databricks.sp...

2018-08-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/22119 For details, see the discussion in the JIRA https://issues.apache.org/jira/browse/SPARK-24924

[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21537 I wouldn't revert this unless there are specific concerns about it. Do you see any bug caused by mixing the `s""` and `code""` representations?

[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21537 If there's a bug, then let's fix it in another JIRA. If that's impossible to fix or sounds super risky and there's something I missed, let's revert.

[GitHub] spark issue #22119: [WIP][SPARK-25129][SQL] Revert mapping com.databricks.sp...

2018-08-15 Thread gengliangwang
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/22119 CSV is loaded by default, while AVRO is not. So having a backward compatibility mapping in CSV only still makes sense. Let's remove the mapping for CSV in 3.0.

[GitHub] spark issue #21537: [SPARK-24505][SQL] Convert strings in codegen to blocks:...

2018-08-15 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/21537 For 2. and 3., it is harder to give my opinion in a comment. Let me make some short comments first. For 2., if I remember correctly, @viirya once wrote the API document in a JIRA entry. it woul

[GitHub] spark pull request #22110: [SPARK-25122][SQL] Deduplication of supports equa...

2018-08-15 Thread mn-mikke
Github user mn-mikke commented on a diff in the pull request: https://github.com/apache/spark/pull/22110#discussion_r210493260 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TypeUtils.scala --- @@ -73,4 +73,14 @@ object TypeUtils { } x.l

[GitHub] spark pull request #21221: [SPARK-23429][CORE] Add executor memory metrics t...

2018-08-15 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/21221#discussion_r210492513 --- Diff: core/src/main/scala/org/apache/spark/internal/config/package.scala --- @@ -69,6 +69,11 @@ package object config { .bytesConf(ByteUnit

[GitHub] spark pull request #21221: [SPARK-23429][CORE] Add executor memory metrics t...

2018-08-15 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/21221#discussion_r210492311 --- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala --- @@ -216,8 +217,7 @@ private[spark] class Executor( def stop(): Un

[GitHub] spark issue #21221: [SPARK-23429][CORE] Add executor memory metrics to heart...

2018-08-15 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/21221 Jenkins, retest this please
