[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139366789 --- Diff: python/pyspark/sql/functions.py --- @@ -2142,18 +2159,26 @@ def udf(f=None, returnType=StringType()): | 8| JOHN DOE|

[GitHub] spark issue #19145: [spark-21933][yarn] Spark Streaming request more executo...

2017-09-18 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19145 Hi @klion26 , sorry for the late response. Can we please understand the problem first? Would you please describe your problem in detail and how to reproduce it? ---

[GitHub] spark issue #19259: [BACKPORT-2.1][SPARK-19318][SPARK-22041][SQL] Docker tes...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19259 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19259: [BACKPORT-2.1][SPARK-19318][SPARK-22041][SQL] Docker tes...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19259 **[Test build #81870 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81870/testReport)** for PR 19259 at commit

[GitHub] spark pull request #18704: [SPARK-20783][SQL] Create ColumnVector to abstrac...

2017-09-18 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request: https://github.com/apache/spark/pull/18704#discussion_r139364602 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/compression/compressionSchemes.scala --- @@ -169,6 +267,125 @@

[GitHub] spark pull request #19264: [SPARK-22047][TEST] ignore HiveExternalCatalogVer...

2017-09-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19264 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid calling reserveUnrollMemoryForT...

2017-09-18 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19135 Hi @cloud-fan, thanks for reviewing. The code has been updated, please take a look. ---

[GitHub] spark issue #19263: [SPARK-22050][CORE] Allow BlockUpdated events to be opti...

2017-09-18 Thread michaelmior
Github user michaelmior commented on the issue: https://github.com/apache/spark/pull/19263 Whoops. Sorry about that. I opened the PR via the CLI so I didn't see the pointer on the web interface. I should have known better though. Updated! ---

[GitHub] spark pull request #19265: [SPARK-22047][flaky test] HiveExternalCatalogVers...

2017-09-18 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/19265 [SPARK-22047][flaky test] HiveExternalCatalogVersionsSuite ## What changes were proposed in this pull request? This PR tries to download Spark for each test run, to make sure each test

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK][WIP] Python Vectorized UDF...

2017-09-18 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r139365685 --- Diff: python/pyspark/sql/functions.py --- @@ -2142,18 +2159,26 @@ def udf(f=None, returnType=StringType()): | 8| JOHN DOE|

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19230 **[Test build #81872 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81872/testReport)** for PR 19230 at commit

[GitHub] spark issue #19210: [SPARK-22030][CORE] GraphiteSink fails to re-connect to ...

2017-09-18 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19210 LGTM, let me retest this again. ---

[GitHub] spark issue #19152: [SPARK-21915][ML][PySpark] Model 1 and Model 2 ParamMaps...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19152 @marktab You should close merged PR. Thanks! ---

[GitHub] spark issue #19234: [WIP][SPARK-22010][PySpark] Change fromInternal method o...

2017-09-18 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19234 I checked with some samples, and the code with float can trigger errors. ---

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid calling reserveUnrollMemoryForT...

2017-09-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19135 retest this please ---

[GitHub] spark issue #19145: [spark-21933][yarn] Spark Streaming request more executo...

2017-09-18 Thread klion26
Github user klion26 commented on the issue: https://github.com/apache/spark/pull/19145 Hi @jerryshao, thank you for your reply. # Problem the problem is for long running jobs which run on **yarn with HA** will request more executors than it requests. # How to

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18853 Merged build finished. Test FAILed. ---

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18853 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81871/ Test FAILed. ---

[GitHub] spark issue #19219: [SPARK-21993][SQL][WIP] Close sessionState when finish

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19219 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81874/ Test FAILed. ---

[GitHub] spark issue #19263: Optionally add block updates to log

2017-09-18 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19263 @michaelmior would you please follow the instructions (https://spark.apache.org/contributing.html) to update the PR title and create a corresponding JIRA, thanks! ---

[GitHub] spark issue #19160: [SPARK-21934][CORE] Expose Shuffle Netty memory usage to...

2017-09-18 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19160 @zsxwing @jiangxb1987 would you please help to review this PR when you have time, thanks a lot. ---

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19230 Merged build finished. Test FAILed. ---

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19230 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81872/ Test FAILed. ---

[GitHub] spark issue #19259: [BACKPORT-2.1][SPARK-19318][SPARK-22041][SQL] Docker tes...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19259 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81870/ Test PASSed. ---

[GitHub] spark issue #19210: [SPARK-22030][CORE] GraphiteSink fails to re-connect to ...

2017-09-18 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19210 Jenkins, retest this please. ---

[GitHub] spark issue #19254: [MINOR][CORE] Cleanup dead code and duplication in Mem. ...

2017-09-18 Thread original-brownbear
Github user original-brownbear commented on the issue: https://github.com/apache/spark/pull/19254 @srowen rebased against `master` to get the test ignore https://github.com/apache/spark/commit/894a7561de2c2ff01fe7fcc5268378161e9e5643 , should be good to retest now :) ---

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-09-18 Thread wangyum
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/18853 retest this please. ---

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19230 retest this please ---

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19265 ok to test ---

[GitHub] spark issue #19219: [SPARK-21993][SQL][WIP] Close sessionState when finish

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19219 **[Test build #81874 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81874/testReport)** for PR 19219 at commit

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18853 **[Test build #81871 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81871/testReport)** for PR 18853 at commit

[GitHub] spark issue #19219: [SPARK-21993][SQL][WIP] Close sessionState when finish

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19219 Merged build finished. Test FAILed. ---

[GitHub] spark issue #19218: [SPARK-21786][SQL] The 'spark.sql.parquet.compression.co...

2017-09-18 Thread fjh100456
Github user fjh100456 commented on the issue: https://github.com/apache/spark/pull/19218 Encounter two problems: 1. I tried to fix it in the order of 'compression' > 'parquet.compression' > 'spark.sql.parquet.compression.codec', but found 'parquet.compression' may come from a

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid calling reserveUnrollMemoryForT...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19135 **[Test build #81878 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81878/testReport)** for PR 19135 at commit

[GitHub] spark issue #19210: [SPARK-22030][CORE] GraphiteSink fails to re-connect to ...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19210 **[Test build #81875 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81875/testReport)** for PR 19210 at commit

[GitHub] spark issue #15544: [SPARK-17997] [SQL] Add an aggregation function for coun...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15544 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81881/ Test FAILed. ---

[GitHub] spark issue #15544: [SPARK-17997] [SQL] Add an aggregation function for coun...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15544 **[Test build #81881 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81881/testReport)** for PR 15544 at commit

[GitHub] spark issue #15544: [SPARK-17997] [SQL] Add an aggregation function for coun...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15544 Merged build finished. Test FAILed. ---

[GitHub] spark issue #19266: [SPARK-22033][CORE] BufferHolder, other size checks shou...

2017-09-18 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/19266 I thought, if this limit highly depends on JVM implementations, better to put the limit as a global variable somewhere (e.g., `ARRAY_INT_MAX` in `spark.util.Utils` or other places)? As another

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18924 **[Test build #81885 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81885/testReport)** for PR 18924 at commit

[GitHub] spark pull request #18853: [SPARK-21646][SQL] CommonType for binary comparis...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18853#discussion_r139465565 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -352,11 +374,16 @@ object TypeCoercion {

[GitHub] spark issue #19256: [SPARK-21338][SQL]implement isCascadingTruncateTable() m...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19256 It looks good, but the actual code should be very simple if you are writing using the Scala way ---

[GitHub] spark issue #19210: [SPARK-22030][CORE] GraphiteSink fails to re-connect to ...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19210 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81875/ Test PASSed. ---

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19230 Thanks! Merged to master. ---

[GitHub] spark issue #19250: [SPARK-12297] Table timezone correction for Timestamps

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19250 **[Test build #81888 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81888/testReport)** for PR 19250 at commit

[GitHub] spark pull request #18887: [SPARK-20642][core] Store FsHistoryProvider listi...

2017-09-18 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/18887#discussion_r139467580 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/FsHistoryProviderSuite.scala --- @@ -624,7 +639,9 @@ class FsHistoryProviderSuite extends

[GitHub] spark pull request #18887: [SPARK-20642][core] Store FsHistoryProvider listi...

2017-09-18 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/18887#discussion_r139467662 --- Diff: core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala --- @@ -74,6 +76,7 @@ class HistoryServerSuite extends SparkFunSuite

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19222 **[Test build #81889 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81889/testReport)** for PR 19222 at commit

[GitHub] spark issue #19264: [SPARK-22047][TEST] ignore HiveExternalCatalogVersionsSu...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19264 **[Test build #81873 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81873/testReport)** for PR 19264 at commit

[GitHub] spark issue #19264: [SPARK-22047][TEST] ignore HiveExternalCatalogVersionsSu...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19264 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81873/ Test PASSed. ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/17819 ok to test. ---

[GitHub] spark issue #19145: [spark-21933][yarn] Spark Streaming request more executo...

2017-09-18 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19145 Did you enable RM or NM recovery? Can you please clarify? Normally, if we assume there are 2 containers running on this NM, after 10 minutes, RM will detect the failure of NM and

[GitHub] spark issue #19234: [WIP][SPARK-22010][PySpark] Change fromInternal method o...

2017-09-18 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19234 Seems fine to me too as is. @maver1ck, I think you could take out `[WIP]` and let it be merged. ---

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19265 **[Test build #81880 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81880/testReport)** for PR 19265 at commit

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19265 Merged build finished. Test PASSed. ---

[GitHub] spark issue #19265: [SPARK-22047][flaky test] HiveExternalCatalogVersionsSui...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19265 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81879/ Test PASSed. ---

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18853 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81876/ Test PASSed. ---

[GitHub] spark issue #18853: [SPARK-21646][SQL] CommonType for binary comparison

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18853 Merged build finished. Test PASSed. ---

[GitHub] spark pull request #18945: Add option to convert nullable int columns to flo...

2017-09-18 Thread a10y
Github user a10y commented on a diff in the pull request: https://github.com/apache/spark/pull/18945#discussion_r139450187 --- Diff: python/pyspark/sql/dataframe.py --- @@ -1810,17 +1810,20 @@ def _to_scala_map(sc, jm): return sc._jvm.PythonUtils.toScalaMap(jm)

[GitHub] spark issue #19254: [MINOR][CORE] Cleanup dead code and duplication in Mem. ...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19254 **[Test build #3925 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3925/testReport)** for PR 19254 at commit

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19229 That doesn't look like the reason; maybe the issue is somewhere else. Let me run the test later. Thanks! But there are some small issues in the test: Don't include gen data time: ``` val start =

[GitHub] spark pull request #19211: [SPARK-18838][core] Add separate listener queues ...

2017-09-18 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/19211#discussion_r139458935 --- Diff: core/src/main/scala/org/apache/spark/scheduler/LiveListenerBus.scala --- @@ -65,53 +60,76 @@ private[spark] class LiveListenerBus(conf: SparkConf)

[GitHub] spark pull request #19211: [SPARK-18838][core] Add separate listener queues ...

2017-09-18 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/19211#discussion_r139458812 --- Diff: core/src/main/scala/org/apache/spark/scheduler/LiveListenerBus.scala --- @@ -65,53 +60,76 @@ private[spark] class LiveListenerBus(conf: SparkConf)

[GitHub] spark pull request #18853: [SPARK-21646][SQL] CommonType for binary comparis...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18853#discussion_r139464231 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -925,6 +925,12 @@ object SQLConf { .intConf

[GitHub] spark issue #19266: [SPARK-22033][CORE] BufferHolder, other size checks shou...

2017-09-18 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19266 Yeah, agree, it could be some global constant. I don't think it should be configurable. Ideally it's determined from the JVM, but don't know a way to do that. In many cases, assuming

[GitHub] spark pull request #18887: [SPARK-20642][core] Store FsHistoryProvider listi...

2017-09-18 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/18887#discussion_r139468324 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -720,19 +633,67 @@ private[history] class

[GitHub] spark issue #19238: [SPARK-22016][SQL] Add HiveDialect for JDBC connection t...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19238 I can see the value, but it does not perform well in most cases if we use a JDBC connection. Instead of adding the extra dialect to upstream, could you please add Hive as a separate data source?

[GitHub] spark pull request #19268: Incorrect Metric reported in MetricsReporter.scal...

2017-09-18 Thread Taaffy
GitHub user Taaffy opened a pull request: https://github.com/apache/spark/pull/19268 Incorrect Metric reported in MetricsReporter.scala Current implementation for processingRate-total uses wrong metric: mistakenly uses inputRowsPerSecond instead of processedRowsPerSecond

[GitHub] spark pull request #19211: [SPARK-18838][core] Add separate listener queues ...

2017-09-18 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/19211#discussion_r139458303 --- Diff: core/src/main/scala/org/apache/spark/scheduler/LiveListenerBus.scala --- @@ -65,53 +60,76 @@ private[spark] class LiveListenerBus(conf: SparkConf)

[GitHub] spark issue #19234: [SPARK-22010][PySpark] Change fromInternal method of Tim...

2017-09-18 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19234 OK. It passed all tests, so let's merge it ---

[GitHub] spark issue #19211: [SPARK-18838][core] Add separate listener queues to Live...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19211 **[Test build #81884 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81884/testReport)** for PR 19211 at commit

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-09-18 Thread akopich
Github user akopich commented on the issue: https://github.com/apache/spark/pull/18924 Ping @jkbradley . Thank you @WeichenXu123 once again for the comment! Please, have a look. ---

[GitHub] spark pull request #18887: [SPARK-20642][core] Store FsHistoryProvider listi...

2017-09-18 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/18887#discussion_r139468045 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -720,19 +633,67 @@ private[history] class

[GitHub] spark pull request #18887: [SPARK-20642][core] Store FsHistoryProvider listi...

2017-09-18 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/18887#discussion_r139468080 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -720,19 +633,67 @@ private[history] class

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18924 **[Test build #81885 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81885/testReport)** for PR 18924 at commit

[GitHub] spark pull request #18887: [SPARK-20642][core] Store FsHistoryProvider listi...

2017-09-18 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/18887#discussion_r139468509 --- Diff: core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala --- @@ -742,53 +703,150 @@ private[history] object FsHistoryProvider {

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18924 Merged build finished. Test FAILed. ---

[GitHub] spark issue #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should not coll...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/18924 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81885/ Test FAILed. ---

[GitHub] spark pull request #19267: [WIP][SPARK-20628][CORE] Blacklist nodes when the...

2017-09-18 Thread juanrh
GitHub user juanrh opened a pull request: https://github.com/apache/spark/pull/19267 [WIP][SPARK-20628][CORE] Blacklist nodes when they transition to DECOMMISSIONING state in YARN ## What changes were proposed in this pull request? Dynamic cluster configurations where cluster

[GitHub] spark issue #19261: [SPARK-22040] Add current_date function with timezone id

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19261 I think we should not do it, because no DB vendor does it. ---

[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-09-18 Thread kevinyu98
Github user kevinyu98 commented on the issue: https://github.com/apache/spark/pull/12646 Can we retest this? The unknown return code is not related to the code. Thanks. ---

[GitHub] spark issue #18704: [SPARK-20783][SQL] Create ColumnVector to abstract exist...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18704 **[Test build #81883 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81883/testReport)** for PR 18704 at commit

[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/12646 **[Test build #81886 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81886/testReport)** for PR 12646 at commit

[GitHub] spark issue #19196: [SPARK-21977] SinglePartition optimizations break certai...

2017-09-18 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/19196 retest this please ---

[GitHub] spark pull request #19230: [SPARK-22003][SQL] support array column in vector...

2017-09-18 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/19230 ---

[GitHub] spark issue #19196: [SPARK-21977] SinglePartition optimizations break certai...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19196 **[Test build #81887 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81887/testReport)** for PR 19196 at commit

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19229 @viirya I ran the code, and you're right: most of the time is spent on the executedPlan generation (in the old version of the code). Thanks! But can you append a benchmark comparison with the `RDD.aggregate` version?

[GitHub] spark issue #19261: [SPARK-22040] Add current_date function with timezone id

2017-09-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/19261 What does this even mean? ---

[GitHub] spark issue #18887: [SPARK-20642][core] Store FsHistoryProvider listing data...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18887 **[Test build #81890 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81890/testReport)** for PR 18887 at commit

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid calling reserveUnrollMemoryForT...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19135 Merged build finished. Test PASSed. ---

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid calling reserveUnrollMemoryForT...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19135 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/81878/ Test PASSed. ---

[GitHub] spark issue #12646: [SPARK-14878][SQL] Trim characters string function suppo...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/12646 retest this please ---

[GitHub] spark pull request #18853: [SPARK-21646][SQL] CommonType for binary comparis...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18853#discussion_r139464749 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -925,6 +925,12 @@ object SQLConf { .intConf

[GitHub] spark pull request #18853: [SPARK-21646][SQL] CommonType for binary comparis...

2017-09-18 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/18853#discussion_r139464467 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -925,6 +925,12 @@ object SQLConf { .intConf

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r139470472 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +462,44 @@ final class OnlineLDAOptimizer extends

[GitHub] spark pull request #18924: [SPARK-14371] [MLLIB] OnlineLDAOptimizer should n...

2017-09-18 Thread WeichenXu123
Github user WeichenXu123 commented on a diff in the pull request: https://github.com/apache/spark/pull/18924#discussion_r139467949 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala --- @@ -462,31 +462,44 @@ final class OnlineLDAOptimizer extends

[GitHub] spark issue #19267: [WIP][SPARK-20628][CORE] Blacklist nodes when they trans...

2017-09-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19267 Can one of the admins verify this patch? ---

[GitHub] spark issue #19268: Incorrect Metric reported in MetricsReporter.scala

2017-09-18 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19268 Please make a JIRA @Taaffy ---

[GitHub] spark issue #17819: [SPARK-20542][ML][SQL] Add an API to Bucketizer that can...

2017-09-18 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/17819 @WeichenXu123 Do you mean we keep both inputCol and inputCols in `Bucketizer`? ---

[GitHub] spark issue #19230: [SPARK-22003][SQL] support array column in vectorized re...

2017-09-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19230 **[Test build #81877 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/81877/testReport)** for PR 19230 at commit
