[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18853 **[Test build #81876 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81876/testReport)** for PR 18853 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17819 @viirya It is possible I think. A similar example is, `HasRegParam` trait, do not put `setRegParam` in trait but moved into concrete estimator/transformer class, should be the same reason.

[GitHub] spark pull request #19238: [SPARK-22016][SQL] Add HiveDialect for JDBC conne...

2017-09-18 Thread danielfx90
Github user danielfx90 commented on a diff in the pull request: https://github.com/apache/spark/pull/19238#discussion_r139441849 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/jdbc/JDBCSuite.scala --- @@ -1103,6 +1103,17 @@ class JDBCSuite extends SparkFunSuite

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-09-18 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/18805 It looks like zstd-jni has now been updated to pull 1.3.1 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19265 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81880/ Test PASSed. ---

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19265 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19265 **[Test build #81880 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81880/testReport)** for PR 19265 at commit

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19265 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81879/ Test PASSed. ---

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19265 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19265 **[Test build #81879 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81879/testReport)** for PR 19265 at commit

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @WeichenXu123 I'm ok for that but I think adding an interface doesn't break binary compatibility? --- - To unsubscribe, e-mail:

[GitHub] spark issue #19261: [SPARK-22040] Add current_date function with timezone id

2017-09-18 Thread jaceklaskowski
Github user jaceklaskowski commented on the issue: https://github.com/apache/spark/pull/19261 @gatorsmile Dunno, but the logical operator does. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 Btw, I don't see any hint about `df.withColumn(..).withColumn(..).withColumn(..)` can prevent the shuffle re-using. --- - To

[GitHub] spark issue #19234: [WIP][SPARK-22010][PySpark] Change fromInternal method o...

2017-09-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19234 Seems fine to me too as is. @maver1ck, I think you could take out `[WIP]` and let it be merged. --- - To unsubscribe,

[GitHub] spark issue #15544: [SPARK-17997] [SQL] Add an aggregation function for coun...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15544 **[Test build #81881 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81881/testReport)** for PR 15544 at commit

[GitHub] spark issue #15544: [SPARK-17997] [SQL] Add an aggregation function for coun...

2017-09-18 Thread wzhfy
Github user wzhfy commented on the issue: https://github.com/apache/spark/pull/15544 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 I don't think re-using shuffle is the reason behind the numbers. If you looked at the previous comments, you will find that I ran the test before without `count` after `model.transform`. Namely the

[GitHub] spark issue #19145: [spark-21933][yarn] Spark Streaming request more executo...

2017-09-18 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19145 Did you enable RM or NM recovery, can you please clarify it? Normally, if we assume there's are 2 containers running on this NM, after 10 minutes, RM will detect the failure of NM and

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17819 @viirya Yes. But if there is some better design I will be happy to listen. --- - To unsubscribe, e-mail:

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 New numbers: numColums | Old Mean | Old Median | New Mean | New Median -- | -- | -- | -- | -- 1 | 0.1727832915997 | 0.1537169693 | 0.1687325048997 | 0.1521283075 10 |

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19229 Great! That's it. thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands,

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 Updated test codes: import org.apache.spark.ml.feature._ import org.apache.spark.sql.{DataFrame, Row} import org.apache.spark.sql.types._ import spark.implicits._

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19229 @viirya I guess the reason is, the old PR version: `df.withColumn(..).withColumn(..).withColumn(..)`, the long df chain prevent the shuffle re-using... but now you merge them into one

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @WeichenXu123 Do you mean we keep both inputCol and inputCols in `Bucketizer`? --- - To unsubscribe, e-mail:

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 @WeichenXu123 Btw, the test is basically re-using the codes from https://github.com/apache/spark/pull/18902#issuecomment-321727416. Is your concern is specified for this? ---

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19229 @WeichenXu123 Sure. And I must point out that I ran this benchmark in spark-shell under local mode. It is great if you can run the benchmark too to verify the numbers. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17819 ok to test. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18853 **[Test build #81876 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81876/testReport)** for PR 18853 at commit

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19265 **[Test build #81880 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81880/testReport)** for PR 19265 at commit

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19230 **[Test build #81877 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81877/testReport)** for PR 19230 at commit

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid calling reserveUnrollMemoryForT...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19135 **[Test build #81878 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81878/testReport)** for PR 19135 at commit

[GitHub] spark issue #19264: [SPARK-22047][TEST] ignore HiveExternalCatalogVersionsSu...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19264 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19210: [SPARK-22030][CORE] GraphiteSink fails to re-connect to ...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19210 **[Test build #81875 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81875/testReport)** for PR 19210 at commit

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19265 **[Test build #81879 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81879/testReport)** for PR 19265 at commit

[GitHub] spark issue #19264: [SPARK-22047][TEST] ignore HiveExternalCatalogVersionsSu...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19264 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81873/ Test PASSed. ---

[GitHub] spark issue #19264: [SPARK-22047][TEST] ignore HiveExternalCatalogVersionsSu...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19264 **[Test build #81873 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81873/testReport)** for PR 19264 at commit

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19265 ok to test --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark pull request #19265: [SPARK-22047][flaky test] HiveExternalCatalogVers...

2017-09-18 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/19265 [SPARK-22047][flaky test] HiveExternalCatalogVersionsSuite ## What changes were proposed in this pull request? This PR tries to download Spark for each test run, to make sure each test

[GitHub] spark issue #19263: [SPARK-22050][CORE] Allow BlockUpdated events to be opti...

2017-09-18 Thread michaelmior
Github user michaelmior commented on the issue: https://github.com/apache/spark/pull/19263 Whoops. Sorry about that. I opened the PR via the CLI so I didn't see the pointer on the web interface. I should have known better though. Updated! ---

[GitHub] spark issue #19145: [spark-21933][yarn] Spark Streaming request more executo...

2017-09-18 Thread klion26
Github user klion26 commented on the issue: https://github.com/apache/spark/pull/19145 Hi @jerryshao, thank you for your reply. # Problem the problem is for long running jobs which run on **yarn with HA** will request more executors than it requests. # How to

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid calling reserveUnrollMemoryForT...

2017-09-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19135 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19230 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-09-18 Thread wangyum
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/18853 retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19234: [WIP][SPARK-22010][PySpark] Change fromInternal method o...

2017-09-18 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19234 I check with some samples and code with float can trigger errors. --- - To unsubscribe, e-mail:

[GitHub] spark issue #19152: [SPARK-21915][ML][PySpark] Model 1 and Model 2 ParamMaps...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19152 @marktab You should close merged PR. Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid calling reserveUnrollMemoryForT...

2017-09-18 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19135 Hi @cloud-fan, thanks for reviewing. The code has updated, pls take a look. --- - To unsubscribe, e-mail:

[GitHub] spark issue #19259: [BACKPORT-2.1][SPARK-19318][SPARK-22041][SQL] Docker tes...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19259 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81870/ Test PASSed. ---

[GitHub] spark issue #19259: [BACKPORT-2.1][SPARK-19318][SPARK-22041][SQL] Docker tes...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19259 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19259: [BACKPORT-2.1][SPARK-19318][SPARK-22041][SQL] Docker tes...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19259 **[Test build #81870 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81870/testReport)** for PR 19259 at commit

[GitHub] spark issue #19160: [SPARK-21934][CORE] Expose Shuffle Netty memory usage to...

2017-09-18 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19160 @zsxwing @jiangxb1987 would you please help to review this PR when you have time, thanks a lot. --- - To unsubscribe, e-mail:

[GitHub] spark issue #19263: Optionally add block updates to log

2017-09-18 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19263 @michaelmior would you please follow the instruction (https://spark.apache.org/contributing.html) to update PR title and create a corresponding JIRA, thanks! ---

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-18 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 Encounter two problems: 1. I tried to fix it in the order of 'compression' > 'parquet.compression' > 'spark.sql.parquet.compression. codec', but found 'parquet.compression' may come from a

[GitHub] spark issue #19145: [spark-21933][yarn] Spark Streaming request more executo...

2017-09-18 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19145 Hi @klion26 , sorry for the late response. Can we please understand the problem first, would you please describe your problem in detail and how to reproduce your issue? ---

[GitHub] spark issue #19254: [MINOR][CORE] Cleanup dead code and duplication in Mem. ...

2017-09-18 Thread original-brownbear
Github user original-brownbear commented on the issue: https://github.com/apache/spark/pull/19254 @srowen rebased against `master` to get the test ignore https://github.com/apache/spark/commit/894a7561de2c2ff01fe7fcc5268378161e9e5643 , should be good to retest now :) ---

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19230 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81872/ Test FAILed. ---

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19230 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19230 **[Test build #81872 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81872/testReport)** for PR 19230 at commit

[GitHub] spark issue #19210: [SPARK-22030][CORE] GraphiteSink fails to re-connect to ...

2017-09-18 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19210 LGTM, let me retest this again. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands,

[GitHub] spark issue #19210: [SPARK-22030][CORE] GraphiteSink fails to re-connect to ...

2017-09-18 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19210 Jenkins, retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands,

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18853 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81871/ Test FAILed. ---

[GitHub] spark issue #19219: [SPARK-21993][SQL][WIP] Close sessionState when finish

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19219 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81874/ Test FAILed. ---

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18853 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19219: [SPARK-21993][SQL][WIP] Close sessionState when finish

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19219 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18853 **[Test build #81871 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81871/testReport)** for PR 18853 at commit

[GitHub] spark issue #19219: [SPARK-21993][SQL][WIP] Close sessionState when finish

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19219 **[Test build #81874 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81874/testReport)** for PR 19219 at commit

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139366789 --- Diff: python/pyspark/sql/functions.py --- @@ -2142,18 +2159,26 @@ def udf(f=None, returnType=StringType()): | 8| JOHN DOE|

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139365685 --- Diff: python/pyspark/sql/functions.py --- @@ -2142,18 +2159,26 @@ def udf(f=None, returnType=StringType()): | 8| JOHN DOE|

[GitHub] spark pull request #19264: [SPARK-22047][TEST] ignore HiveExternalCatalogVer...

2017-09-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19264 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #18704: [SPARK-20783][SQL] Create ColumnVector to abstrac...

2017-09-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18704#discussion_r139364602 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/compression/compressionSchemes.scala --- @@ -169,6 +267,125 @@

[GitHub] spark issue #19264: [SPARK-22047][TEST] ignore HiveExternalCatalogVersionsSu...

2017-09-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19264 cc @srowen --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19264: [SPARK-22047][TEST] ignore HiveExternalCatalogVersionsSu...

2017-09-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19264 merging to master/2.2 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19219: [SPARK-21993][SQL][WIP] Close sessionState when finish

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19219 **[Test build #81874 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81874/testReport)** for PR 19219 at commit

[GitHub] spark issue #19264: [SPARK-22047][TEST] ignore HiveExternalCatalogVersionsSu...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19264 **[Test build #81873 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81873/testReport)** for PR 19264 at commit

[GitHub] spark pull request #19264: [SPARK-22047][TEST] ignore HiveExternalCatalogVer...

2017-09-18 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/19264 [SPARK-22047][TEST] ignore HiveExternalCatalogVersionsSuite ## What changes were proposed in this pull request? As reported in https://issues.apache.org/jira/browse/SPARK-22047 ,

[GitHub] spark pull request #18704: [SPARK-20783][SQL] Create ColumnVector to abstrac...

2017-09-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18704#discussion_r139362028 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/compression/CompressibleColumnAccessor.scala --- @@ -17,8 +17,11 @@

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-18 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19230 Can we add test code for `null` row in a column for each type? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

[GitHub] spark pull request #18704: [SPARK-20783][SQL] Create ColumnVector to abstrac...

2017-09-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18704#discussion_r139361487 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/WritableColumnVector.java --- @@ -147,6 +147,11 @@ private void

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17819 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81869/ Test FAILed. ---

[GitHub] spark pull request #19230: [SPARK-22003][SQL] support array column in vector...

2017-09-18 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/19230#discussion_r139361387 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnVectorSuite.scala --- @@ -0,0 +1,201 @@ +/* + * Licensed to the

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17819 **[Test build #81869 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81869/testReport)** for PR 17819 at commit

[GitHub] spark issue #19234: [WIP][SPARK-22010][PySpark] Change fromInternal method o...

2017-09-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19234 I think it was changed back in https://github.com/apache/spark/pull/7363. I think that avoids precision loss from Python's `float` limitation: ```python >>> decimal.Decimal(1 / 1e6)

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139359087 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala --- @@ -0,0 +1,127 @@ +/* + * Licensed to the

[GitHub] spark pull request #18887: [SPARK-20642][core] Store FsHistoryProvider listi...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18887#discussion_r13935 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -422,208 +456,101 @@ private[history] class

[GitHub] spark pull request #18887: [SPARK-20642][core] Store FsHistoryProvider listi...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18887#discussion_r139353082 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -720,19 +633,67 @@ private[history] class

[GitHub] spark pull request #18887: [SPARK-20642][core] Store FsHistoryProvider listi...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18887#discussion_r139358597 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -720,19 +633,67 @@ private[history] class

[GitHub] spark pull request #18887: [SPARK-20642][core] Store FsHistoryProvider listi...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18887#discussion_r139344378 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala --- @@ -624,7 +639,9 @@ class FsHistoryProviderSuite extends

[GitHub] spark pull request #18887: [SPARK-20642][core] Store FsHistoryProvider listi...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18887#discussion_r139346016 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala --- @@ -74,6 +76,7 @@ class HistoryServerSuite extends

[GitHub] spark pull request #18887: [SPARK-20642][core] Store FsHistoryProvider listi...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18887#discussion_r139349515 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -720,19 +633,67 @@ private[history] class

[GitHub] spark pull request #18887: [SPARK-20642][core] Store FsHistoryProvider listi...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18887#discussion_r139346721 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -117,17 +122,37 @@ private[history] class

[GitHub] spark pull request #18887: [SPARK-20642][core] Store FsHistoryProvider listi...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18887#discussion_r139348395 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -742,53 +703,150 @@ private[history] object

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139358487 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala --- @@ -0,0 +1,127 @@ +/* + * Licensed to the

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139357733 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowEvalPythonExec.scala --- @@ -0,0 +1,127 @@ +/* + * Licensed to the

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19230 **[Test build #81872 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81872/testReport)** for PR 19230 at commit

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19230 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDFs

2017-09-18 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18659 @BryanCutler I think it's okay to rename `size` to `length` (or longer name to avoid name-conflict like `_length_`?). --- - To

[GitHub] spark issue #19249: [SPARK-22032][PySpark] Speed up StructType conversion

2017-09-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19249 We can split two `needConversion` for key and value only and save key conversion or value conversion call though? --- - To

[GitHub] spark issue #19254: [MINOR][CORE] Cleanup dead code and duplication in Mem. ...

2017-09-18 Thread original-brownbear
Github user original-brownbear commented on the issue: https://github.com/apache/spark/pull/19254 Sure, added a JIRA for the failure here https://issues.apache.org/jira/browse/SPARK-22047 :) --- - To unsubscribe,

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18853 **[Test build #81871 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81871/testReport)** for PR 18853 at commit

[GitHub] spark issue #19259: [BACKPORT-2.1][SPARK-19318][SPARK-22041][SQL] Docker tes...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19259 **[Test build #81870 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81870/testReport)** for PR 19259 at commit

<    1   2   3   4   5   >