[GitHub] spark pull request #19327: [WIP] Implement stream-stream outer joins.

2017-09-22 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/19327#discussion_r140606013 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala --- @@ -89,61 +89,124 @@ class

[GitHub] spark pull request #19327: [WIP] Implement stream-stream outer joins.

2017-09-22 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/19327#discussion_r140615574 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingJoinSuite.scala --- @@ -116,6 +116,26 @@ class StreamingJoinSuite extends

[GitHub] spark pull request #19327: [WIP] Implement stream-stream outer joins.

2017-09-22 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/19327#discussion_r140608032 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala --- @@ -89,61 +89,124 @@ class

[GitHub] spark pull request #19327: [WIP] Implement stream-stream outer joins.

2017-09-22 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/19327#discussion_r140614563 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala --- @@ -89,61 +89,124 @@ class

[GitHub] spark pull request #19327: [WIP] Implement stream-stream outer joins.

2017-09-22 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/19327#discussion_r140614721 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/SymmetricHashJoinStateManager.scala --- @@ -89,61 +89,124 @@ class

[GitHub] spark issue #19323: [SPARK-22092] Reallocation in OffHeapColumnVector.reserv...

2017-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19323 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82095/ Test PASSed.

[GitHub] spark issue #19323: [SPARK-22092] Reallocation in OffHeapColumnVector.reserv...

2017-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19323 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19323: [SPARK-22092] Reallocation in OffHeapColumnVector.reserv...

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19323 **[Test build #82095 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82095/testReport)** for PR 19323 at commit

[GitHub] spark issue #19325: [SPARK-22106][PYSPARK][SQL] Disable 0-parameter pandas_u...

2017-09-22 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/19325 cc @HyukjinKwon @viirya

[GitHub] spark pull request #19311: [SPARK-22083][CORE] Release locks in MemoryStore....

2017-09-22 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/19311#discussion_r140612177 --- Diff: core/src/test/scala/org/apache/spark/storage/MemoryStoreSuite.scala --- @@ -407,4 +407,119 @@ class MemoryStoreSuite })

[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-09-22 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/16578 > @mallman how about adding comment explaining why such workaround was done + bug number in parquet-mr ? So in future once that bug is fixed, code can be cleaned. It will take me more time

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19194 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82092/ Test PASSed.

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19194 Merged build finished. Test PASSed.

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19194 **[Test build #82092 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82092/testReport)** for PR 19194 at commit

[GitHub] spark issue #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16578 **[Test build #82097 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82097/testReport)** for PR 16578 at commit

[GitHub] spark issue #19302: [SPARK-14878][SQL] Adding examples for Trim characters s...

2017-09-22 Thread kevinyu98
Github user kevinyu98 commented on the issue: https://github.com/apache/spark/pull/19302 I am opening a new JIRA SPARK-22088 for this. I will close this PR. The style failure occurred because a new JIRA SPARK-22088 fixed a style issue after I submitted my PR. I have included that JIRA in my new

[GitHub] spark pull request #19302: [SPARK-14878][SQL] Adding examples for Trim chara...

2017-09-22 Thread kevinyu98
Github user kevinyu98 closed the pull request at: https://github.com/apache/spark/pull/19302

[GitHub] spark pull request #16578: [SPARK-4502][SQL] Parquet nested column pruning

2017-09-22 Thread mallman
Github user mallman commented on a diff in the pull request: https://github.com/apache/spark/pull/16578#discussion_r140611282 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaPruning.scala --- @@ -0,0 +1,130 @@ +/* + *

[GitHub] spark issue #19327: [WIP] Implement stream-stream outer joins.

2017-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19327 Merged build finished. Test FAILed.

[GitHub] spark issue #19327: [WIP] Implement stream-stream outer joins.

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19327 **[Test build #82094 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82094/testReport)** for PR 19327 at commit

[GitHub] spark issue #19327: [WIP] Implement stream-stream outer joins.

2017-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19327 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82094/ Test FAILed.

[GitHub] spark issue #19325: [SPARK-22106][PYSPARK][SQL] Disable 0-parameter pandas_u...

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19325 **[Test build #82096 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82096/testReport)** for PR 19325 at commit

[GitHub] spark issue #19311: [SPARK-22083][CORE] Release locks in MemoryStore.evictBl...

2017-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19311 Merged build finished. Test PASSed.

[GitHub] spark issue #19311: [SPARK-22083][CORE] Release locks in MemoryStore.evictBl...

2017-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19311 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82091/ Test PASSed.

[GitHub] spark issue #19311: [SPARK-22083][CORE] Release locks in MemoryStore.evictBl...

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19311 **[Test build #82091 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82091/testReport)** for PR 19311 at commit

[GitHub] spark issue #19325: [SPARK-22106][PYSPARK][SQL] Disable 0-parameter pandas_u...

2017-09-22 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/19325 retest this please

[GitHub] spark issue #19325: [SPARK-22106][PYSPARK][SQL] Disable 0-parameter pandas_u...

2017-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19325 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82093/ Test FAILed.

[GitHub] spark issue #19325: [SPARK-22106][PYSPARK][SQL] Disable 0-parameter pandas_u...

2017-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19325 Merged build finished. Test FAILed.

[GitHub] spark issue #19325: [SPARK-22106][PYSPARK][SQL] Disable 0-parameter pandas_u...

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19325 **[Test build #82093 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82093/testReport)** for PR 19325 at commit

[GitHub] spark pull request #19122: [SPARK-21911][ML][PySpark] Parallel Model Evaluat...

2017-09-22 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/19122#discussion_r140600118 --- Diff: python/pyspark/ml/tests.py --- @@ -836,6 +836,27 @@ def test_save_load_simple_estimator(self): loadedModel =

[GitHub] spark issue #19323: [SPARK-22092] Reallocation in OffHeapColumnVector.reserv...

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19323 **[Test build #82095 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82095/testReport)** for PR 19323 at commit

[GitHub] spark issue #19323: [SPARK-22092] Reallocation in OffHeapColumnVector.reserv...

2017-09-22 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/19323 retest this please

[GitHub] spark issue #19327: [WIP] Implement stream-stream outer joins.

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19327 **[Test build #82094 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82094/testReport)** for PR 19327 at commit

[GitHub] spark issue #19326: [SPARK-22107] Change as to alias in python quickstart

2017-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19326 Can one of the admins verify this patch?

[GitHub] spark pull request #19327: [WIP] Implement stream-stream outer joins.

2017-09-22 Thread joseph-torres
GitHub user joseph-torres opened a pull request: https://github.com/apache/spark/pull/19327 [WIP] Implement stream-stream outer joins. ## What changes were proposed in this pull request? Allow one-sided outer joins between two streams when a watermark is defined.

[GitHub] spark pull request #19326: [SPARK-22107] Change as to alias in python quicks...

2017-09-22 Thread jgoleary
GitHub user jgoleary opened a pull request: https://github.com/apache/spark/pull/19326 [SPARK-22107] Change as to alias in python quickstart ## What changes were proposed in this pull request? Updated docs so that a line of python in the quick start guide executes.
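
For context, a minimal sketch of why a quick start line calling `as` could not execute: `as` is a reserved word in Python, so it cannot appear as a method name after a dot, whereas `alias` parses normally. The snippet uses only the standard library and never touches Spark; the expression strings and the `parses` helper are hypothetical illustrations, not text from the PR.

```python
import keyword


def parses(expr: str) -> bool:
    """Return True if `expr` is syntactically valid Python (it is compiled, never run)."""
    try:
        compile(expr, "<quickstart>", "eval")
        return True
    except SyntaxError:
        return False


# Hypothetical expressions, for illustration only (not copied from the docs).
with_as = 'count.as("numWords")'
with_alias = 'count.alias("numWords")'

print(keyword.iskeyword("as"))  # whether `as` is a reserved word
print(parses(with_as))          # whether the `as` form is valid syntax
print(parses(with_alias))       # whether the `alias` form is valid syntax
```

Because the failure is a syntax error, it would surface before any Spark code runs, which is consistent with the PR's description of a line that does not execute.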

[GitHub] spark issue #19325: [SPARK-22106][PYSPARK][SQL] Disable 0-parameter pandas_u...

2017-09-22 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/19325 Adding the patch to enable 0-parameter pandas_udf if it is requested in the future

[GitHub] spark issue #19325: [SPARK-22106][PYSPARK][SQL] Disable 0-parameter pandas_u...

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19325 **[Test build #82093 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82093/testReport)** for PR 19325 at commit

[GitHub] spark issue #19325: [SPARK-22106][PYSPARK][SQL] Disable 0-parameter pandas_u...

2017-09-22 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/19325 This is a followup to #18659

[GitHub] spark issue #19325: [SPARK--22106][PYSPARK][SQL] Disable 0-parameter pandas_...

2017-09-22 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/19325 @cloud-fan @ueshin I'm not sure if you are ok with merging this soon, but in adding the doctests I found there were problems with using the decorator and having empty partitions. I fixed those

[GitHub] spark pull request #19325: [SPARK--22106][PYSPARK][SQL] Disable 0-parameter ...

2017-09-22 Thread BryanCutler
GitHub user BryanCutler opened a pull request: https://github.com/apache/spark/pull/19325 [SPARK--22106][PYSPARK][SQL] Disable 0-parameter pandas_udf and add doctests ## What changes were proposed in this pull request? This change disables the use of 0-parameter pandas_udfs

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19194 **[Test build #82092 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82092/testReport)** for PR 19194 at commit

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-09-22 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/19194 Jenkins, test this please

[GitHub] spark issue #19311: [SPARK-22083][CORE] Release locks in MemoryStore.evictBl...

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19311 **[Test build #82091 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82091/testReport)** for PR 19311 at commit

[GitHub] spark issue #19303: [SPARK-22085][CORE]When the application has no core left...

2017-09-22 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/19303 Personally I don't suggest we add extra logic to resolve a non-bug.

[GitHub] spark issue #19317: [SPARK-22098][CORE] Add new method aggregateByKeyLocally...

2017-09-22 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/19317 cc @WeichenXu123 mind taking a look?

[GitHub] spark issue #19270: [SPARK-21809] : Change Stage Page to use datatables to s...

2017-09-22 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/19270 I just tried this out and it appears to be working for me for a running application; I haven't tried the history UI yet. @ajbozarth What browser are you using and what are you running (a

[GitHub] spark pull request #19311: [SPARK-22083][CORE] Release locks in MemoryStore....

2017-09-22 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/19311#discussion_r140559549 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -544,20 +544,39 @@ private[spark] class MemoryStore( }

[GitHub] spark issue #18994: [SPARK-21784][SQL] Adds support for defining information...

2017-09-22 Thread sureshthalamati
Github user sureshthalamati commented on the issue: https://github.com/apache/spark/pull/18994 Thank you for the input @gatorsmile

[GitHub] spark pull request #18659: [SPARK-21190][PYSPARK] Python Vectorized UDFs

2017-09-22 Thread BryanCutler
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/18659#discussion_r14089 --- Diff: python/pyspark/serializers.py --- @@ -199,6 +211,55 @@ def __repr__(self): return "ArrowSerializer" +class

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-09-22 Thread sitalkedia
Github user sitalkedia commented on the issue: https://github.com/apache/spark/pull/18805 Thanks for looking into this @srowen. It's weird, I don't understand that either. Also, I am not able to reproduce this issue on my laptop.

[GitHub] spark issue #17743: [SPARK-20448][DOCS] Document how FileInputDStream works ...

2017-09-22 Thread steveloughran
Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/17743 People don't realise how much object stores aren't file systems until they discover all their assumptions are broken. Once you know how they work, you can set up a workflow which is

[GitHub] spark issue #19300: [SPARK-22082][SparkR]Spelling mistake: "choosen" in API ...

2017-09-22 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/19300 or find and fix more typos ;)

[GitHub] spark issue #13143: [SPARK-15359] [Mesos] Mesos dispatcher should handle DRI...

2017-09-22 Thread devaraj-kavali
Github user devaraj-kavali commented on the issue: https://github.com/apache/spark/pull/13143 @ArtRand I think this is still an issue which needs to be merged, do you have any observations with this PR?

[GitHub] spark issue #18659: [SPARK-21190][PYSPARK] Python Vectorized UDFs

2017-09-22 Thread BryanCutler
Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/18659 Thanks @cloud-fan @ueshin and others who reviewed! I'll make followups to disable 0-param and complete the docs for this.

[GitHub] spark issue #19324: [SPARK-22103] Move HashAggregateExec parent consume to a...

2017-09-22 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/19324 @viirya This is related to https://github.com/apache/spark/pull/18931/, as it also separates out the consume function. Maybe it would be enough to do similar splits into functions in the

[GitHub] spark issue #19229: [SPARK-22001][ML][SQL] ImputerModel can do withColumn fo...

2017-09-22 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19229 I created a separate JIRA to track the performance gap issue (compared with the RDD version): https://issues.apache.org/jira/browse/SPARK-22105 As the result of discussion with @cloud-fan

[GitHub] spark pull request #19324: [SPARK-22103] Move HashAggregateExec parent consu...

2017-09-22 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/19324#discussion_r140532783 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala --- @@ -462,18 +464,36 @@ case class

[GitHub] spark issue #19194: [SPARK-20589] Allow limiting task concurrency per stage

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19194 **[Test build #82090 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82090/testReport)** for PR 19194 at commit

[GitHub] spark pull request #19324: [SPARK-22103] Move HashAggregateExec parent consu...

2017-09-22 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/19324#discussion_r140528883 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala --- @@ -329,6 +332,15 @@ case class

[GitHub] spark pull request #19324: [SPARK-22103] Move HashAggregateExec parent consu...

2017-09-22 Thread juliuszsompolski
Github user juliuszsompolski commented on a diff in the pull request: https://github.com/apache/spark/pull/19324#discussion_r140528662 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala --- @@ -599,10 +621,14 @@ case class

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82080/ Test FAILed.

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Merged build finished. Test FAILed.

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19222 **[Test build #82080 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82080/testReport)** for PR 19222 at commit

[GitHub] spark issue #19324: [SPARK-22103] Move HashAggregateExec parent consume to a...

2017-09-22 Thread juliuszsompolski
Github user juliuszsompolski commented on the issue: https://github.com/apache/spark/pull/19324 @hvanhovell @gatorsmile @cloud-fan @rednaxelafx

[GitHub] spark issue #19324: [SPARK-22103] Move HashAggregateExec parent consume to a...

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19324 **[Test build #82088 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82088/testReport)** for PR 19324 at commit

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19222 **[Test build #82089 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82089/testReport)** for PR 19222 at commit

[GitHub] spark pull request #19324: [SPARK-22103] Move HashAggregateExec parent consu...

2017-09-22 Thread juliuszsompolski
GitHub user juliuszsompolski opened a pull request: https://github.com/apache/spark/pull/19324 [SPARK-22103] Move HashAggregateExec parent consume to a separate function in codegen ## What changes were proposed in this pull request? HashAggregateExec codegen uses two paths

[GitHub] spark issue #19322: [SPARK-22102][SQL] Set ConfVars.METASTOREWAREHOUSE befor...

2017-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19322 Merged build finished. Test FAILed.

[GitHub] spark issue #19322: [SPARK-22102][SQL] Set ConfVars.METASTOREWAREHOUSE befor...

2017-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19322 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82082/ Test FAILed.

[GitHub] spark issue #19322: [SPARK-22102][SQL] Set ConfVars.METASTOREWAREHOUSE befor...

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19322 **[Test build #82082 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82082/testReport)** for PR 19322 at commit

[GitHub] spark issue #19156: [SPARK-19634][SQL][ML][FOLLOW-UP] Improve interface of d...

2017-09-22 Thread WeichenXu123
Github user WeichenXu123 commented on the issue: https://github.com/apache/spark/pull/19156 @cloud-fan Can you help review the part of the code related to the SQL interface?

[GitHub] spark issue #19277: [SPARK-22058][CORE]the BufferedInputStream will not be c...

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19277 **[Test build #3932 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3932/testReport)** for PR 19277 at commit

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82078/ Test FAILed.

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19222 **[Test build #82078 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82078/testReport)** for PR 19222 at commit

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19222 Merged build finished. Test FAILed.

[GitHub] spark issue #19290: [WIP][SPARK-22063][R] Upgrades lintr to latest commit sh...

2017-09-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19290 Will update the comment tomorrow.

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19222 **[Test build #82087 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82087/testReport)** for PR 19222 at commit

[GitHub] spark issue #19323: [SPARK-22092] Reallocation in OffHeapColumnVector.reserv...

2017-09-22 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/19323 LGTM

[GitHub] spark issue #19323: [SPARK-22092] Reallocation in OffHeapColumnVector.reserv...

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19323 **[Test build #82085 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82085/testReport)** for PR 19323 at commit

[GitHub] spark issue #19020: [SPARK-3181] [ML] Implement huber loss for LinearRegress...

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19020 **[Test build #82086 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82086/testReport)** for PR 19020 at commit

[GitHub] spark issue #19020: [SPARK-3181] [ML] Implement huber loss for LinearRegress...

2017-09-22 Thread yanboliang
Github user yanboliang commented on the issue: https://github.com/apache/spark/pull/19020 Jenkins, test this please.

[GitHub] spark issue #19323: [SPARK-22092] Reallocation in OffHeapColumnVector.reserv...

2017-09-22 Thread ala
Github user ala commented on the issue: https://github.com/apache/spark/pull/19323 @hvanhovell

[GitHub] spark pull request #19323: [SPARK-22092] Reallocation in OffHeapColumnVector...

2017-09-22 Thread ala
GitHub user ala opened a pull request: https://github.com/apache/spark/pull/19323 [SPARK-22092] Reallocation in OffHeapColumnVector.reserveInternal corrupts struct and array data `OffHeapColumnVector.reserveInternal()` will only copy already inserted values during reallocation if

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-22 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/19222 Ok, I missed that you had moved these into the base class. I still look forward to the benchmark :)... I still think that the hierarchy does not give any benefit. All block subclasses

[GitHub] spark pull request #19301: [SPARK-22084][SQL] Fix performance regression in ...

2017-09-22 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/19301#discussion_r140508154 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/interfaces.scala --- @@ -72,11 +74,19 @@ object

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-22 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/19222 As @hvanhovell pointed out, the first implementation introduced a lot of polymorphic call sites in very performance-critical code (e.g. `getBaseObject()` or `getBaseOffset()`). While `MemoryBlock` class

[GitHub] spark pull request #19322: [SPARK-22102][SQL] Set ConfVars.METASTOREWAREHOUS...

2017-09-22 Thread wangyum
Github user wangyum commented on a diff in the pull request: https://github.com/apache/spark/pull/19322#discussion_r140503940 --- Diff: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/HiveCliSessionStateSuite.scala --- @@ -27,12 +28,12 @@ import

[GitHub] spark issue #19144: [UI][Streaming]Modify the title, 'Records' instead of 'I...

2017-09-22 Thread guoxiaolongzte
Github user guoxiaolongzte commented on the issue: https://github.com/apache/spark/pull/19144 Thank you, I was wrong. I've seen it.

[GitHub] spark issue #19301: [SPARK-22084][SQL] Fix performance regression in aggrega...

2017-09-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19301 @stanzhai Thanks. I see. Because the aggregation functions are bound to individual buffer slots, they are recognized as different expressions and won't be eliminated.

[GitHub] spark issue #19144: [UI][Streaming]Modify the title, 'Records' instead of 'I...

2017-09-22 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/19144 That's right, please look at https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/ui/AllBatchesTable.scala#L28 BTW, sometimes there's another

[GitHub] spark issue #19319: [SPARK-21766][PySpark][SQL] DataFrame toPandas() raises ...

2017-09-22 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/19319 Thanks @HyukjinKwon

[GitHub] spark issue #19144: [UI][Streaming]Modify the title, 'Records' instead of 'I...

2017-09-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19144 Because you didn't fetch the upstream I guess ..

[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...

2017-09-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19285 **[Test build #82084 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82084/testReport)** for PR 19285 at commit

[GitHub] spark issue #19322: [SPARK-22102][SQL] Set ConfVars.METASTOREWAREHOUSE befor...

2017-09-22 Thread wangyum
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/19322 cc @cloud-fan @yaooqinn

[GitHub] spark issue #19222: [SPARK-10399][CORE][SQL] Introduce multiple MemoryBlocks...

2017-09-22 Thread hvanhovell
Github user hvanhovell commented on the issue: https://github.com/apache/spark/pull/19222 Circling back to the inheritance discussion. My worry is that this will introduce a lot of polymorphic call sites in very performance-critical code. Even if you tag on final to each method, the

[GitHub] spark issue #18805: [SPARK-19112][CORE] Support for ZStandard codec

2017-09-22 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/18805 Hm, this might be a real error. It seems to be hanging at: ``` [info] ExternalAppendOnlyMapSuite: ... [info] - simple cogroup (56 milliseconds) [info] - spilling (3 seconds,

[GitHub] spark issue #19144: [UI][Streaming]Modify the title, 'Records' instead of 'I...

2017-09-22 Thread guoxiaolongzte
Github user guoxiaolongzte commented on the issue: https://github.com/apache/spark/pull/19144 https://github.com/guoxiaolongzte/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/ui/AllBatchesTable.scala It's strange that I didn't see my revised record.

[GitHub] spark issue #19144: [UI][Streaming]Modify the title, 'Records' instead of 'I...

2017-09-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19144 This is merged to master ... https://github.com/apache/spark/commit/10e37f6eb6819c9233830c0d97e8fd1c713be0f1 ...

[GitHub] spark issue #19144: [UI][Streaming]Modify the title, 'Records' instead of 'I...

2017-09-22 Thread guoxiaolongzte
Github user guoxiaolongzte commented on the issue: https://github.com/apache/spark/pull/19144 Why is it closed? I think this should be incorporated into master.
