[GitHub] spark issue #19180: [SPARK-21967][CORE] org.apache.spark.unsafe.types.UTF8St...

2017-09-13 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19180 CC @davies By the way, do you have any measurements that show the speedup? I imagine it's faster, but I'm mostly curious whether it's still a win for short strings.

[GitHub] spark pull request #19225: [SPARK-4131] [Follow-up] Support "Writing data in...

2017-09-13 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19225 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #19225: [SPARK-4131] [Follow-up] Support "Writing data into the ...

2017-09-13 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19225 thanks, merging to master!

[GitHub] spark issue #19208: [SPARK-21087] [ML] CrossValidator, TrainValidationSplit ...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19208 **[Test build #81767 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81767/testReport)** for PR 19208 at commit [`e0f4ce6`](https://github.com/apache/spark/commit/e0

[GitHub] spark issue #19208: [SPARK-21087] [ML] CrossValidator, TrainValidationSplit ...

2017-09-13 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19208 @jkbradley I split this PR and removed the code for "dump models to disk", so the PR is smaller and easier to review. Once this PR is merged, I will create a follow-up PR for "dump models to disk"

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19229 Merged build finished. Test PASSed.

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19229 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81755/ Test PASSed.

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19229 **[Test build #81755 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81755/testReport)** for PR 19229 at commit [`4efb643`](https://github.com/apache/spark/commit/4

[GitHub] spark issue #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json support c...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19223 **[Test build #81766 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81766/testReport)** for PR 19223 at commit [`af8d941`](https://github.com/apache/spark/commit/af

[GitHub] spark issue #19181: [SPARK-21907][CORE] oom during spill

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19181 **[Test build #81765 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81765/testReport)** for PR 19181 at commit [`57f20b7`](https://github.com/apache/spark/commit/57

[GitHub] spark pull request #19230: [SPARK-22003][SQL] support array column in vector...

2017-09-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19230#discussion_r138806043 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnVector.java --- @@ -99,73 +100,18 @@ public ArrayData copy() {

[GitHub] spark pull request #19230: [SPARK-22003][SQL] support array column in vector...

2017-09-13 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/19230#discussion_r138805752 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnVector.java --- @@ -99,73 +100,18 @@ public ArrayData copy() {

[GitHub] spark issue #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json support c...

2017-09-13 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19223 LGTM except for one comment left.

[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19223#discussion_r138805358 --- Diff: python/pyspark/sql/functions.py --- @@ -1921,10 +1921,12 @@ def from_json(col, schema, options={}): @since(2.1) def to_json(col, options={

[GitHub] spark pull request #19132: [SPARK-21922] Fix duration always updating when t...

2017-09-13 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19132#discussion_r138805063 --- Diff: core/src/main/scala/org/apache/spark/status/api/v1/AllStagesResource.scala --- @@ -69,7 +70,8 @@ private[v1] object AllStagesResource {

[GitHub] spark pull request #19220: [SPARK-18608][ML][FOLLOWUP] Fix double caching fo...

2017-09-13 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19220

[GitHub] spark pull request #19198: [MINOR][DOC] Add missing call of `update()` in ex...

2017-09-13 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19198

[GitHub] spark issue #19220: [SPARK-18608][ML][FOLLOWUP] Fix double caching for PySpa...

2017-09-13 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/19220 Merged into master, thanks for all the reviews.

[GitHub] spark issue #19186: [SPARK-21972][ML] Add param handlePersistence

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19186 **[Test build #81763 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81763/testReport)** for PR 19186 at commit [`29f38e4`](https://github.com/apache/spark/commit/29

[GitHub] spark issue #19132: [SPARK-21922] Fix duration always updating when task fai...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19132 **[Test build #81764 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81764/testReport)** for PR 19132 at commit [`8ec4319`](https://github.com/apache/spark/commit/8e

[GitHub] spark issue #19186: [SPARK-21972][ML] Add param handlePersistence

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19186 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81762/ Test FAILed.

[GitHub] spark issue #19186: [SPARK-21972][ML] Add param handlePersistence

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19186 Merged build finished. Test FAILed.

[GitHub] spark issue #19227: [SPARK-20060][CORE] Support accessing secure Hadoop clus...

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19227 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81750/ Test PASSed.

[GitHub] spark issue #19186: [SPARK-21972][ML] Add param handlePersistence

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19186 **[Test build #81762 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81762/testReport)** for PR 19186 at commit [`3f11c67`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #19227: [SPARK-20060][CORE] Support accessing secure Hadoop clus...

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19227 Merged build finished. Test PASSed.

[GitHub] spark issue #19227: [SPARK-20060][CORE] Support accessing secure Hadoop clus...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19227 **[Test build #81750 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81750/testReport)** for PR 19227 at commit [`2b3d2f2`](https://github.com/apache/spark/commit/2

[GitHub] spark pull request #19185: [Spark-21854] Added LogisticRegressionTrainingSum...

2017-09-13 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19185

[GitHub] spark pull request #19215: [MINOR][SQL] Only populate type metadata for requ...

2017-09-13 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19215

[GitHub] spark issue #19186: [SPARK-21972][ML] Add param handlePersistence

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19186 **[Test build #81762 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81762/testReport)** for PR 19186 at commit [`3f11c67`](https://github.com/apache/spark/commit/3f

[GitHub] spark issue #19215: [MINOR][SQL] Only populate type metadata for required ty...

2017-09-13 Thread dilipbiswal
Github user dilipbiswal commented on the issue: https://github.com/apache/spark/pull/19215 many thanks @gatorsmile

[GitHub] spark pull request #19130: [SPARK-21917][CORE][YARN] Supporting adding http(...

2017-09-13 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19130#discussion_r138801682 --- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala --- @@ -897,6 +897,80 @@ class SparkSubmitSuite sysProps("spark.subm

[GitHub] spark pull request #19130: [SPARK-21917][CORE][YARN] Supporting adding http(...

2017-09-13 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19130#discussion_r138801550 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -367,6 +368,53 @@ object SparkSubmit extends CommandLineUtils with Logging

[GitHub] spark issue #19215: [MINOR][SQL] Only populate type metadata for required ty...

2017-09-13 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19215 Thanks! Merged to master.

[GitHub] spark issue #19215: [MINOR][SQL] Only populate type metadata for required ty...

2017-09-13 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19215 LGTM

[GitHub] spark issue #19185: [Spark-21854] Added LogisticRegressionTrainingSummary fo...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19185 **[Test build #81757 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81757/testReport)** for PR 19185 at commit [`6529fa6`](https://github.com/apache/spark/commit/6

[GitHub] spark issue #19185: [Spark-21854] Added LogisticRegressionTrainingSummary fo...

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19185 Merged build finished. Test PASSed.

[GitHub] spark issue #19185: [Spark-21854] Added LogisticRegressionTrainingSummary fo...

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19185 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81757/ Test PASSed.

[GitHub] spark issue #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json support c...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19223 **[Test build #81761 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81761/testReport)** for PR 19223 at commit [`158140e`](https://github.com/apache/spark/commit/15

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid calling reserveUnrollMemoryForT...

2017-09-13 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19135 Hi @jerryshao, thanks for your review. > So it somehow reflects that CPU core contention is the main issue for memory pre-occupation I have modified the code; now it will not request m

[GitHub] spark pull request #19231: [SPARK-22002][SQL] Read JDBC table use custom sch...

2017-09-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19231#discussion_r138800677 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala --- @@ -993,7 +996,10 @@ class JDBCSuite extends SparkFunSuite S

[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread goldmedal
Github user goldmedal commented on a diff in the pull request: https://github.com/apache/spark/pull/19223#discussion_r138800321 --- Diff: python/pyspark/sql/functions.py --- @@ -1921,10 +1921,12 @@ def from_json(col, schema, options={}): @since(2.1) def to_json(col, option

[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19223#discussion_r13870 --- Diff: sql/core/src/test/resources/sql-tests/results/json-functions.sql.out --- @@ -26,13 +26,13 @@ Extended Usage: {"time":"26/08/2015"}

[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread goldmedal
Github user goldmedal commented on a diff in the pull request: https://github.com/apache/spark/pull/19223#discussion_r138799591 --- Diff: R/pkg/R/functions.R --- @@ -1715,7 +1717,15 @@ setMethod("to_date", #' #' # Converts an array of structs into a JSON array #' df2

[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread goldmedal
Github user goldmedal commented on a diff in the pull request: https://github.com/apache/spark/pull/19223#discussion_r138799483 --- Diff: sql/core/src/test/resources/sql-tests/results/json-functions.sql.out --- @@ -26,13 +26,13 @@ Extended Usage: {"time":"26/08/2015"}

[GitHub] spark issue #19226: [SPARK-21985][PySpark] PairDeserializer is broken for do...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19226 **[Test build #81760 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81760/testReport)** for PR 19226 at commit [`e99ed23`](https://github.com/apache/spark/commit/e

[GitHub] spark issue #19226: [SPARK-21985][PySpark] PairDeserializer is broken for do...

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19226 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81760/ Test PASSed.

[GitHub] spark issue #19226: [SPARK-21985][PySpark] PairDeserializer is broken for do...

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19226 Merged build finished. Test PASSed.

[GitHub] spark pull request #19230: [SPARK-22003][SQL] support array column in vector...

2017-09-13 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19230#discussion_r138799219 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ColumnVector.java --- @@ -99,73 +100,18 @@ public ArrayData copy() { @Ov

[GitHub] spark pull request #19231: [SPARK-22002][SQL] Read JDBC table use custom sch...

2017-09-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19231#discussion_r138797882 --- Diff: docs/sql-programming-guide.md --- @@ -1333,7 +1333,7 @@ the following case-insensitive options: customSchema -

[GitHub] spark pull request #19188: [SPARK-21973][SQL] Add an new option to filter qu...

2017-09-13 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19188

[GitHub] spark pull request #19226: [SPARK-21985][PySpark] PairDeserializer is broken...

2017-09-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19226#discussion_r138797273 --- Diff: python/pyspark/serializers.py --- @@ -343,6 +343,8 @@ def _load_stream_without_unbatching(self, stream): key_batch_stream = self.

[GitHub] spark pull request #19188: [SPARK-21973][SQL] Add an new option to filter qu...

2017-09-13 Thread maropu
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/19188#discussion_r138797271 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmarkArguments.scala --- @@ -29,7 +33,11 @@ class TPCDSQueryBenchma

[GitHub] spark pull request #19226: [SPARK-21985][PySpark] PairDeserializer is broken...

2017-09-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19226#discussion_r138797113 --- Diff: python/pyspark/tests.py --- @@ -644,6 +644,18 @@ def test_cartesian_chaining(self): set([(x, (y, y)) for x in range(10) for y

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-13 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 @zhengruifeng Yeah, it is better. Actually the difference between running multiple `withColumn` and one `withColumns` is mainly in the cost of query analysis and plan/dataset initialization. I will
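The cost difference described in the comment above can be illustrated with a toy model. This is only a sketch and not Spark code: `ToyFrame`, `with_column`, `with_columns`, and the `analysis_passes` counter are hypothetical names invented here, and the assumption that each call costs exactly one analysis pass is a simplification of the "query analysis and plan/dataset initialization" cost the comment mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyFrame:
    """Hypothetical stand-in for a dataset; it only tracks column names
    and how many times a new plan had to be analyzed."""
    columns: tuple
    analysis_passes: int = 0

    def with_column(self, name):
        # Assumption: every call builds a new frame and pays one analysis pass.
        return ToyFrame(self.columns + (name,), self.analysis_passes + 1)

    def with_columns(self, names):
        # Assumption: adding many columns in one call pays one analysis pass in total.
        return ToyFrame(self.columns + tuple(names), self.analysis_passes + 1)


base = ToyFrame(("a",))
names = [f"c{i}" for i in range(100)]

# Adding columns one at a time: one pass per column.
one_by_one = base
for n in names:
    one_by_one = one_by_one.with_column(n)

# Adding all columns in a single call: one pass overall.
batched = base.with_columns(names)

print(one_by_one.analysis_passes, batched.analysis_passes)
```

Both paths end with the same columns; under the stated assumption only the number of analysis passes differs, which is the overhead the comment attributes to repeated `withColumn` calls.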

[GitHub] spark issue #19188: [SPARK-21973][SQL] Add an new option to filter queries i...

2017-09-13 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19188 LGTM

[GitHub] spark issue #19188: [SPARK-21973][SQL] Add an new option to filter queries i...

2017-09-13 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19188 Merging to master.

[GitHub] spark pull request #19188: [SPARK-21973][SQL] Add an new option to filter qu...

2017-09-13 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/19188#discussion_r138796870 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/TPCDSQueryBenchmarkArguments.scala --- @@ -29,7 +33,11 @@ class TPCDSQueryBen

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-13 Thread zhengruifeng
Github user zhengruifeng commented on the issue: https://github.com/apache/spark/pull/19229 In the test code, should we use `model.transform(df).count` instead?

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-13 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19230 Add a test for it?

[GitHub] spark issue #19226: [SPARK-21985][PySpark] PairDeserializer is broken for do...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19226 **[Test build #81760 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81760/testReport)** for PR 19226 at commit [`e99ed23`](https://github.com/apache/spark/commit/e9

[GitHub] spark issue #19231: [SPARK-22002][SQL] Read JDBC table use custom schema sup...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19231 **[Test build #81758 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81758/testReport)** for PR 19231 at commit [`9e7a8a4`](https://github.com/apache/spark/commit/9e

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19230 **[Test build #81759 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81759/testReport)** for PR 19230 at commit [`adbaeab`](https://github.com/apache/spark/commit/ad

[GitHub] spark pull request #19230: [SPARK-22003][SQL] support array column in vector...

2017-09-13 Thread liufengdb
GitHub user liufengdb opened a pull request: https://github.com/apache/spark/pull/19230 [SPARK-22003][SQL] support array column in vectorized reader with UDF ## What changes were proposed in this pull request? The UDF needs to deserialize the `UnsafeRow`. When the column typ

[GitHub] spark pull request #19231: [SPARK-22002][SQL] Read JDBC table use custom sch...

2017-09-13 Thread wangyum
GitHub user wangyum opened a pull request: https://github.com/apache/spark/pull/19231 [SPARK-22002][SQL] Read JDBC table use custom schema support specify partial fields. ## What changes were proposed in this pull request? https://github.com/apache/spark/pull/18266 add a ne

[GitHub] spark issue #19216: [SPARK-21990][SQL] QueryPlanConstraints misses some cons...

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19216 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81749/ Test PASSed.

[GitHub] spark issue #19216: [SPARK-21990][SQL] QueryPlanConstraints misses some cons...

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19216 Merged build finished. Test PASSed.

[GitHub] spark issue #19216: [SPARK-21990][SQL] QueryPlanConstraints misses some cons...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19216 **[Test build #81749 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81749/testReport)** for PR 19216 at commit [`e4cffda`](https://github.com/apache/spark/commit/e

[GitHub] spark issue #19185: [Spark-21854] Added LogisticRegressionTrainingSummary fo...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19185 **[Test build #81757 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81757/testReport)** for PR 19185 at commit [`6529fa6`](https://github.com/apache/spark/commit/65

[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19223#discussion_r138795133 --- Diff: sql/core/src/test/resources/sql-tests/results/json-functions.sql.out --- @@ -26,13 +26,13 @@ Extended Usage: {"time":"26/08/2015"}

[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19223#discussion_r138795006 --- Diff: sql/core/src/test/resources/sql-tests/results/json-functions.sql.out --- @@ -26,13 +26,13 @@ Extended Usage: {"time":"26/08/2015"}

[GitHub] spark issue #19213: [SPARK-17642] [SQL] [FOLLOWUP] drop test tables and impr...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19213 **[Test build #81747 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81747/testReport)** for PR 19213 at commit [`d922c85`](https://github.com/apache/spark/commit/d

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-13 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 FYI, the `withColumns` API was proposed in #17819.

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-13 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 Ran a benchmark similar to https://github.com/apache/spark/pull/18902#issuecomment-321727416: numColumns | Old Mean | Old Median | New Mean | New Median -- | -- | -- | -- | --

[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2017-09-13 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r138794370 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/hash/Murmur3_x86_32.java --- @@ -59,6 +60,18 @@ public static int hashUnsafeWords(Object base, l

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19229 **[Test build #81756 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81756/testReport)** for PR 19229 at commit [`4b47709`](https://github.com/apache/spark/commit/4b

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-13 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 cc @MLnick @zhengruifeng @yanboliang

[GitHub] spark issue #19213: [SPARK-17642] [SQL] [FOLLOWUP] drop test tables and impr...

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19213 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81747/ Test PASSed.

[GitHub] spark issue #19211: [SPARK-18838][core] Add separate listener queues to Live...

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19211 Merged build finished. Test PASSed.

[GitHub] spark pull request #19222: [SPARK-10399][CORE][SQL] Introduce multiple Memor...

2017-09-13 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19222#discussion_r138794058 --- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/Platform.java --- @@ -75,67 +76,131 @@ public static boolean unaligned() { return unali

[GitHub] spark issue #19213: [SPARK-17642] [SQL] [FOLLOWUP] drop test tables and impr...

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19213 Merged build finished. Test PASSed.

[GitHub] spark pull request #19223: [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json su...

2017-09-13 Thread felixcheung
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/19223#discussion_r138794052 --- Diff: R/pkg/R/functions.R --- @@ -1715,7 +1717,15 @@ setMethod("to_date", #' #' # Converts an array of structs into a JSON array #' df

[GitHub] spark issue #19211: [SPARK-18838][core] Add separate listener queues to Live...

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19211 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81744/ Test PASSed.

[GitHub] spark issue #19211: [SPARK-18838][core] Add separate listener queues to Live...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19211 **[Test build #81744 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81744/testReport)** for PR 19211 at commit [`20b8382`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19229 **[Test build #81755 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81755/testReport)** for PR 19229 at commit [`4efb643`](https://github.com/apache/spark/commit/4e

[GitHub] spark issue #19228: [SPARK-21985][PYTHON] Fix zip-chained RDD to work

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19228 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81753/ Test PASSed.

[GitHub] spark issue #19228: [SPARK-21985][PYTHON] Fix zip-chained RDD to work

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19228 Merged build finished. Test PASSed.

[GitHub] spark issue #19228: [SPARK-21985][PYTHON] Fix zip-chained RDD to work

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19228 **[Test build #81753 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81753/testReport)** for PR 19228 at commit [`0703b67`](https://github.com/apache/spark/commit/0

[GitHub] spark pull request #19229: [SPARK-22001][ML][SQL] ImputerModel can do withCo...

2017-09-13 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/19229 [SPARK-22001][ML][SQL] ImputerModel can do withColumn for all input columns at one pass ## What changes were proposed in this pull request? SPARK-21690 makes one-pass `Imputer` by paralleli

[GitHub] spark pull request #19185: [Spark-21854] Added LogisticRegressionTrainingSum...

2017-09-13 Thread yanboliang
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/19185#discussion_r138792851 --- Diff: python/pyspark/ml/tests.py --- @@ -1464,20 +1464,79 @@ def test_logistic_regression_summary(self): self.assertEqual(s.probabilityCo

[GitHub] spark issue #19186: [SPARK-21972][ML] Add param handlePersistence

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19186 **[Test build #81752 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81752/testReport)** for PR 19186 at commit [`74445cd`](https://github.com/apache/spark/commit/7

[GitHub] spark issue #19186: [SPARK-21972][ML] Add param handlePersistence

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19186 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81752/ Test FAILed.

[GitHub] spark issue #19186: [SPARK-21972][ML] Add param handlePersistence

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19186 Merged build finished. Test FAILed.

[GitHub] spark issue #19210: Fix Graphite re-connects for Graphite instances behind E...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19210 **[Test build #81754 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81754/testReport)** for PR 19210 at commit [`8e982c7`](https://github.com/apache/spark/commit/8e

[GitHub] spark issue #19210: Fix Graphite re-connects for Graphite instances behind E...

2017-09-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19210 ok to test

[GitHub] spark pull request #19130: [SPARK-21917][CORE][YARN] Supporting adding http(...

2017-09-13 Thread jerryshao
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/19130#discussion_r138791246 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -367,6 +368,53 @@ object SparkSubmit extends CommandLineUtils with Logging

[GitHub] spark issue #19188: [SPARK-21973][SQL] Add an new option to filter queries i...

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19188 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81745/ Test PASSed.

[GitHub] spark issue #19188: [SPARK-21973][SQL] Add an new option to filter queries i...

2017-09-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19188 Merged build finished. Test PASSed.

[GitHub] spark issue #19210: Fix Graphite re-connects for Graphite instances behind E...

2017-09-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19210 Sure. ok to test

[GitHub] spark issue #19188: [SPARK-21973][SQL] Add an new option to filter queries i...

2017-09-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19188 **[Test build #81745 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81745/testReport)** for PR 19188 at commit [`b543e71`](https://github.com/apache/spark/commit/b

[GitHub] spark pull request #19226: [SPARK-21985][PySpark] PairDeserializer is broken...

2017-09-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/19226#discussion_r138790747 --- Diff: python/pyspark/serializers.py --- @@ -343,9 +346,6 @@ def _load_stream_without_unbatching(self, stream): key_batch_stream = self.
