[jira] [Commented] (SPARK-18717) Datasets - crash (compile exception) when mapping to immutable scala map

2016-12-05 Thread Damian Momot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724702#comment-15724702 ] Damian Momot commented on SPARK-18717: -- Yep, it's already worked around this way in my code, but usage

[jira] [Created] (SPARK-18737) Serialization setting "spark.serializer" ignored in Spark 2.x

2016-12-05 Thread Dr. Michael Menzel (JIRA)
Dr. Michael Menzel created SPARK-18737: -- Summary: Serialization setting "spark.serializer" ignored in Spark 2.x Key: SPARK-18737 URL: https://issues.apache.org/jira/browse/SPARK-18737 Project:

[jira] [Created] (SPARK-18736) [SQL] CreateMap allow non-unique keys

2016-12-05 Thread Eyal Farago (JIRA)
Eyal Farago created SPARK-18736: --- Summary: [SQL] CreateMap allow non-unique keys Key: SPARK-18736 URL: https://issues.apache.org/jira/browse/SPARK-18736 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-18735) Why don't we destroy the broadcast variable after each iteration?

2016-12-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724620#comment-15724620 ] Sean Owen commented on SPARK-18735: --- Should be because they are used in a computation that produces a

[jira] [Commented] (SPARK-18735) Why don't we destroy the broadcast variable after each iteration?

2016-12-05 Thread Jianfei Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724605#comment-15724605 ] Jianfei Wang commented on SPARK-18735: -- Oh no, I just see in KMeans and GaussianMixture they both use

[jira] [Resolved] (SPARK-18735) Why don't we destroy the broadcast variable after each iteration?

2016-12-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18735. --- Resolution: Invalid (Please ask questions on the mailing list) Two reasons: 1) if the computation

[jira] [Created] (SPARK-18735) Why don't we destroy the broadcast variable after each iteration?

2016-12-05 Thread Jianfei Wang (JIRA)
Jianfei Wang created SPARK-18735: Summary: Why don't we destroy the broadcast variable after each iteration? Key: SPARK-18735 URL: https://issues.apache.org/jira/browse/SPARK-18735 Project: Spark

[jira] [Commented] (SPARK-18731) Task size in K-means is so large

2016-12-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724488#comment-15724488 ] Sean Owen commented on SPARK-18731: --- Yes, the scheduler delay comes from having to transmit the huge

[jira] [Commented] (SPARK-18712) keep the order of sql expression and support short circuit

2016-12-05 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724442#comment-15724442 ] Xiao Li commented on SPARK-18712: - [~cloud_fan] Sure, let me take this. > keep the order of sql

[jira] [Comment Edited] (SPARK-18712) keep the order of sql expression and support short circuit

2016-12-05 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724391#comment-15724391 ] Wenchen Fan edited comment on SPARK-18712 at 12/6/16 5:35 AM: -- Spark SQL has

[jira] [Commented] (SPARK-18731) Task size in K-means is so large

2016-12-05 Thread Xiaoye Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724415#comment-15724415 ] Xiaoye Sun commented on SPARK-18731: My concern is not about improving the overall performance of

[jira] [Commented] (SPARK-18712) keep the order of sql expression and support short circuit

2016-12-05 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724391#comment-15724391 ] Wenchen Fan commented on SPARK-18712: - Spark SQL has no guarantee about the filter conditions

[jira] [Comment Edited] (SPARK-18712) keep the order of sql expression and support short circuit

2016-12-05 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724381#comment-15724381 ] Cheng Lian edited comment on SPARK-18712 at 12/6/16 5:10 AM: - I think the

[jira] [Commented] (SPARK-18712) keep the order of sql expression and support short circuit

2016-12-05 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724381#comment-15724381 ] Cheng Lian commented on SPARK-18712: I think the contract here is that for a DataFrame {{df}} and 1

[jira] [Assigned] (SPARK-18734) Represent timestamp in StreamingQueryProgress as formatted string instead of millis

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18734: Assignee: Apache Spark (was: Tathagata Das) > Represent timestamp in

[jira] [Assigned] (SPARK-18734) Represent timestamp in StreamingQueryProgress as formatted string instead of millis

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18734: Assignee: Tathagata Das (was: Apache Spark) > Represent timestamp in

[jira] [Commented] (SPARK-18734) Represent timestamp in StreamingQueryProgress as formatted string instead of millis

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18734?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724371#comment-15724371 ] Apache Spark commented on SPARK-18734: -- User 'tdas' has created a pull request for this issue:

[jira] [Created] (SPARK-18734) Represent timestamp in StreamingQueryProgress as formatted string instead of millis

2016-12-05 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-18734: - Summary: Represent timestamp in StreamingQueryProgress as formatted string instead of millis Key: SPARK-18734 URL: https://issues.apache.org/jira/browse/SPARK-18734

[jira] [Updated] (SPARK-18734) Represent timestamp in StreamingQueryProgress as formatted string instead of millis

2016-12-05 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-18734: -- Issue Type: Improvement (was: Bug) > Represent timestamp in StreamingQueryProgress as

[jira] [Assigned] (SPARK-18733) Spark history server file cleaner excludes in-progress files

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18733: Assignee: (was: Apache Spark) > Spark history server file cleaner excludes

[jira] [Assigned] (SPARK-18733) Spark history server file cleaner excludes in-progress files

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18733: Assignee: Apache Spark > Spark history server file cleaner excludes in-progress files >

[jira] [Commented] (SPARK-18733) Spark history server file cleaner excludes in-progress files

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724362#comment-15724362 ] Apache Spark commented on SPARK-18733: -- User 'seyfe' has created a pull request for this issue:

[jira] [Created] (SPARK-18733) Spark history server file cleaner excludes in-progress files

2016-12-05 Thread Ergin Seyfe (JIRA)
Ergin Seyfe created SPARK-18733: --- Summary: Spark history server file cleaner excludes in-progress files Key: SPARK-18733 URL: https://issues.apache.org/jira/browse/SPARK-18733 Project: Spark

[jira] [Commented] (SPARK-18731) Task size in K-means is so large

2016-12-05 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724335#comment-15724335 ] yuhao yang commented on SPARK-18731: Based on my experience, generally KMeans is fast even for large

[jira] [Resolved] (SPARK-18721) ForeachSink breaks Watermark in append mode

2016-12-05 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-18721. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 16160

[jira] [Resolved] (SPARK-18672) Close recordwriter in SparkHadoopMapReduceWriter before committing

2016-12-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18672. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16098

[jira] [Updated] (SPARK-17591) Fix/investigate the failure of tests in Scala On Windows

2016-12-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-17591: -- Assignee: Hyukjin Kwon > Fix/investigate the failure of tests in Scala On Windows >

[jira] [Updated] (SPARK-18672) Close recordwriter in SparkHadoopMapReduceWriter before committing

2016-12-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-18672: -- Assignee: Hyukjin Kwon > Close recordwriter in SparkHadoopMapReduceWriter before committing >

[jira] [Resolved] (SPARK-18684) Spark Executors off-heap memory usage keeps increasing while running spark streaming

2016-12-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-18684. --- Resolution: Not A Problem > Spark Executors off-heap memory usage keeps increasing while running

[jira] [Resolved] (SPARK-18572) Use the hive client method "getPartitionNames" to answer "SHOW PARTITIONS" queries on partitioned Hive tables

2016-12-05 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-18572. - Resolution: Fixed Assignee: Michael Allman Fix Version/s: 2.1.0 > Use the hive

[jira] [Commented] (SPARK-18709) Automatic null conversion bug (instead of throwing error) when creating a Spark Datarame with incompatible types for fields.

2016-12-05 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724144#comment-15724144 ] Dongjoon Hyun commented on SPARK-18709: --- I'll check which commit added the guard condition. >

[jira] [Commented] (SPARK-18731) Task size in K-means is so large

2016-12-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724147#comment-15724147 ] Sean Owen commented on SPARK-18731: --- Although later versions may be more efficient, I don't think this

[jira] [Assigned] (SPARK-18732) The Y axis ranges of "schedulingDelay", "processingTime", and "totalDelay" should not keep the same.

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18732: Assignee: (was: Apache Spark) > The Y axis ranges of "schedulingDelay",

[jira] [Commented] (SPARK-18732) The Y axis ranges of "schedulingDelay", "processingTime", and "totalDelay" should not keep the same.

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724139#comment-15724139 ] Apache Spark commented on SPARK-18732: -- User 'uncleGen' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18732) The Y axis ranges of "schedulingDelay", "processingTime", and "totalDelay" should not keep the same.

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18732: Assignee: Apache Spark > The Y axis ranges of "schedulingDelay", "processingTime", and

[jira] [Resolved] (SPARK-18722) Move no data rate limit from StreamExecution to ProgressReporter

2016-12-05 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-18722. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 16155

[jira] [Created] (SPARK-18732) The Y axis ranges of "schedulingDelay", "processingTime", and "totalDelay" should not keep the same.

2016-12-05 Thread Genmao Yu (JIRA)
Genmao Yu created SPARK-18732: - Summary: The Y axis ranges of "schedulingDelay", "processingTime", and "totalDelay" should not keep the same. Key: SPARK-18732 URL: https://issues.apache.org/jira/browse/SPARK-18732

[jira] [Commented] (SPARK-18593) JDBCRDD returns incorrect results for filters on CHAR of PostgreSQL

2016-12-05 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724129#comment-15724129 ] Dongjoon Hyun commented on SPARK-18593: --- Oops. Thank you for correction. > JDBCRDD returns

[jira] [Commented] (SPARK-18731) Task size in K-means is so large

2016-12-05 Thread Xiaoye Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724121#comment-15724121 ] Xiaoye Sun commented on SPARK-18731: Could you please provide a link to the solution? Large task

[jira] [Resolved] (SPARK-18555) na.fill miss up original values in long integers

2016-12-05 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-18555. - Resolution: Fixed Assignee: Song Jun Fix Version/s: 2.2.0 > na.fill miss up

[jira] [Commented] (SPARK-18711) NPE in generated SpecificMutableProjection for Aggregator

2016-12-05 Thread koert kuipers (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724115#comment-15724115 ] koert kuipers commented on SPARK-18711: --- confirmed it resolved the issue for me. thanks > NPE in

[jira] [Commented] (SPARK-18539) Cannot filter by nonexisting column in parquet file

2016-12-05 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724110#comment-15724110 ] Liang-Chi Hsieh commented on SPARK-18539: - That's cool. > Cannot filter by nonexisting column in

[jira] [Updated] (SPARK-18657) Persist UUID across query restart

2016-12-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-18657: -- Assignee: Tathagata Das > Persist UUID across query restart > - > >

[jira] [Updated] (SPARK-18728) Consider using Algebird's Aggregator instead of org.apache.spark.sql.expressions.Aggregator

2016-12-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-18728: -- Issue Type: Improvement (was: Bug) I think the questions will be: what does it gain? and what is the

[jira] [Updated] (SPARK-18593) JDBCRDD returns incorrect results for filters on CHAR of PostgreSQL

2016-12-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-18593: -- Assignee: Takeshi Yamamuro > JDBCRDD returns incorrect results for filters on CHAR of PostgreSQL >

[jira] [Commented] (SPARK-18709) Automatic null conversion bug (instead of throwing error) when creating a Spark Datarame with incompatible types for fields.

2016-12-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724101#comment-15724101 ] Sean Owen commented on SPARK-18709: --- BTW do you know what change fixed this, by any chance? >

[jira] [Resolved] (SPARK-18720) Code Refactoring of withColumn

2016-12-05 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-18720. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16152

[jira] [Updated] (SPARK-18720) Code Refactoring of withColumn

2016-12-05 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-18720: Affects Version/s: (was: 2.0.2) > Code Refactoring of withColumn >

[jira] [Updated] (SPARK-18720) Code Refactoring of withColumn

2016-12-05 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-18720: Assignee: Xiao Li > Code Refactoring of withColumn > -- > >

[jira] [Commented] (SPARK-18731) Task size in K-means is so large

2016-12-05 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724082#comment-15724082 ] Sean Owen commented on SPARK-18731: --- This may have been improved in more recent versions, by the way.

[jira] [Resolved] (SPARK-18668) Do not auto-generate query name

2016-12-05 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-18668. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 16113

[jira] [Resolved] (SPARK-18657) Persist UUID across query restart

2016-12-05 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-18657. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 16113

[jira] [Resolved] (SPARK-18729) MemorySink should not call DataFrame.collect when holding a lock

2016-12-05 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-18729. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 16162

[jira] [Commented] (SPARK-18539) Cannot filter by nonexisting column in parquet file

2016-12-05 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15724013#comment-15724013 ] Cheng Lian commented on SPARK-18539: [~xwu0226], thanks for the new use case! [~viirya], I do think

[jira] [Resolved] (SPARK-18634) Corruption and Correctness issues with exploding Python UDFs

2016-12-05 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-18634. --- Resolution: Fixed Assignee: Liang-Chi Hsieh Fix Version/s: 2.1.0

[jira] [Updated] (SPARK-18729) MemorySink should not call DataFrame.collect when holding a lock

2016-12-05 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-18729: -- Priority: Critical (was: Major) > MemorySink should not call DataFrame.collect when holding a

[jira] [Updated] (SPARK-18729) MemorySink should not call DataFrame.collect when holding a lock

2016-12-05 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-18729: -- Target Version/s: 2.1.0 > MemorySink should not call DataFrame.collect when holding a lock >

[jira] [Updated] (SPARK-18671) Add tests to ensure stability of that all Structured Streaming log formats

2016-12-05 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-18671: -- Priority: Major (was: Critical) > Add tests to ensure stability of that all Structured

[jira] [Comment Edited] (SPARK-18728) Consider using Algebird's Aggregator instead of org.apache.spark.sql.expressions.Aggregator

2016-12-05 Thread Mansur Ashraf (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723786#comment-15723786 ] Mansur Ashraf edited comment on SPARK-18728 at 12/6/16 1:37 AM: Alex,

[jira] [Created] (SPARK-18731) Task size in K-means is so large

2016-12-05 Thread Xiaoye Sun (JIRA)
Xiaoye Sun created SPARK-18731: -- Summary: Task size in K-means is so large Key: SPARK-18731 URL: https://issues.apache.org/jira/browse/SPARK-18731 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-18730) Ask the build script to link to Jenkins test report page instead of full console output page when posting to GitHub

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723958#comment-15723958 ] Apache Spark commented on SPARK-18730: -- User 'liancheng' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18730) Ask the build script to link to Jenkins test report page instead of full console output page when posting to GitHub

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18730: Assignee: Cheng Lian (was: Apache Spark) > Ask the build script to link to Jenkins test

[jira] [Assigned] (SPARK-18730) Ask the build script to link to Jenkins test report page instead of full console output page when posting to GitHub

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18730: Assignee: Apache Spark (was: Cheng Lian) > Ask the build script to link to Jenkins test

[jira] [Commented] (SPARK-18539) Cannot filter by nonexisting column in parquet file

2016-12-05 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723960#comment-15723960 ] Liang-Chi Hsieh commented on SPARK-18539: - Actually I am not sure if this is a valid usage. I

[jira] [Updated] (SPARK-18730) Ask the build script to link to Jenkins test report page instead of full console output page when posting to GitHub

2016-12-05 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-18730: --- Priority: Minor (was: Major) > Ask the build script to link to Jenkins test report page instead of

[jira] [Commented] (SPARK-18539) Cannot filter by nonexisting column in parquet file

2016-12-05 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723949#comment-15723949 ] Liang-Chi Hsieh commented on SPARK-18539: - Because we respect user-specified schema, we won't

[jira] [Created] (SPARK-18730) Ask the build script to link to Jenkins test report page instead of full console output page when posting to GitHub

2016-12-05 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-18730: -- Summary: Ask the build script to link to Jenkins test report page instead of full console output page when posting to GitHub Key: SPARK-18730 URL:

[jira] [Comment Edited] (SPARK-18539) Cannot filter by nonexisting column in parquet file

2016-12-05 Thread Xin Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723888#comment-15723888 ] Xin Wu edited comment on SPARK-18539 at 12/6/16 12:46 AM: -- I think we will hit

[jira] [Commented] (SPARK-18539) Cannot filter by nonexisting column in parquet file

2016-12-05 Thread Xin Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723888#comment-15723888 ] Xin Wu commented on SPARK-18539: I think we will hit the issue if we use user-specified schema. Here is

[jira] [Commented] (SPARK-18728) Consider using Algebird's Aggregator instead of org.apache.spark.sql.expressions.Aggregator

2016-12-05 Thread Alex Levenson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723866#comment-15723866 ] Alex Levenson commented on SPARK-18728: --- I think the main selling point of Algebird aggregators

[jira] [Commented] (SPARK-14280) Update change-version.sh and pom.xml to add Scala 2.12 profiles

2016-12-05 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723814#comment-15723814 ] Jakob Odersky commented on SPARK-14280: --- You're welcome to pull the changes back into your repo of

[jira] [Commented] (SPARK-14280) Update change-version.sh and pom.xml to add Scala 2.12 profiles

2016-12-05 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723807#comment-15723807 ] Jakob Odersky commented on SPARK-14280: --- Hi [~joshrosen], I rebased your initial work onto the

[jira] [Comment Edited] (SPARK-18728) Consider using Algebird's Aggregator instead of org.apache.spark.sql.expressions.Aggregator

2016-12-05 Thread Mansur Ashraf (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723786#comment-15723786 ] Mansur Ashraf edited comment on SPARK-18728 at 12/5/16 11:55 PM: - Alex,

[jira] [Commented] (SPARK-18728) Consider using Algebird's Aggregator instead of org.apache.spark.sql.expressions.Aggregator

2016-12-05 Thread Mansur Ashraf (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723786#comment-15723786 ] Mansur Ashraf commented on SPARK-18728: --- Alex, Thanks for opening the issue. Let me add some more

[jira] [Commented] (SPARK-18539) Cannot filter by nonexisting column in parquet file

2016-12-05 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723781#comment-15723781 ] Cheng Lian commented on SPARK-18539: Please remind me if I missed anything important, otherwise, we

[jira] [Assigned] (SPARK-18729) MemorySink should not call DataFrame.collect when holding a lock

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18729: Assignee: Shixiong Zhu (was: Apache Spark) > MemorySink should not call

[jira] [Commented] (SPARK-18729) MemorySink should not call DataFrame.collect when holding a lock

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723772#comment-15723772 ] Apache Spark commented on SPARK-18729: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18729) MemorySink should not call DataFrame.collect when holding a lock

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18729: Assignee: Apache Spark (was: Shixiong Zhu) > MemorySink should not call

[jira] [Updated] (SPARK-14660) Executors show up active tasks indefinitely after stage is killed

2016-12-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-14660: --- Component/s: Scheduler > Executors show up active tasks indefinitely after stage is killed >

[jira] [Created] (SPARK-18729) MemorySink should not call DataFrame.collect when holding a lock

2016-12-05 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-18729: Summary: MemorySink should not call DataFrame.collect when holding a lock Key: SPARK-18729 URL: https://issues.apache.org/jira/browse/SPARK-18729 Project: Spark

[jira] [Updated] (SPARK-18729) MemorySink should not call DataFrame.collect when holding a lock

2016-12-05 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-18729: - Issue Type: Improvement (was: Bug) > MemorySink should not call DataFrame.collect when holding

[jira] [Comment Edited] (SPARK-18539) Cannot filter by nonexisting column in parquet file

2016-12-05 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723747#comment-15723747 ] Cheng Lian edited comment on SPARK-18539 at 12/5/16 11:43 PM: --

[jira] [Commented] (SPARK-18539) Cannot filter by nonexisting column in parquet file

2016-12-05 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723747#comment-15723747 ] Cheng Lian commented on SPARK-18539: [~v-gerasimov], [~smilegator], and [~xwu0226], after some

[jira] [Commented] (SPARK-18539) Cannot filter by nonexisting column in parquet file

2016-12-05 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723718#comment-15723718 ] Cheng Lian commented on SPARK-18539: As commented on GitHub, there're two issues right now: # This

[jira] [Updated] (SPARK-18721) ForeachSink breaks Watermark in append mode

2016-12-05 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-18721: - Target Version/s: 2.1.0 > ForeachSink breaks Watermark in append mode >

[jira] [Assigned] (SPARK-18717) Datasets - crash (compile exception) when mapping to immutable scala map

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18717: Assignee: (was: Apache Spark) > Datasets - crash (compile exception) when mapping to

[jira] [Commented] (SPARK-18717) Datasets - crash (compile exception) when mapping to immutable scala map

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723694#comment-15723694 ] Apache Spark commented on SPARK-18717: -- User 'aray' has created a pull request for this issue:

[jira] [Assigned] (SPARK-18717) Datasets - crash (compile exception) when mapping to immutable scala map

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18717: Assignee: Apache Spark > Datasets - crash (compile exception) when mapping to immutable

[jira] [Assigned] (SPARK-18721) ForeachSink breaks Watermark in append mode

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18721: Assignee: Shixiong Zhu (was: Apache Spark) > ForeachSink breaks Watermark in append mode

[jira] [Assigned] (SPARK-18721) ForeachSink breaks Watermark in append mode

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18721: Assignee: Apache Spark (was: Shixiong Zhu) > ForeachSink breaks Watermark in append mode

[jira] [Commented] (SPARK-18721) ForeachSink breaks Watermark in append mode

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723691#comment-15723691 ] Apache Spark commented on SPARK-18721: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Created] (SPARK-18728) Consider using Algebird's Aggregator instead of org.apache.spark.sql.expressions.Aggregator

2016-12-05 Thread Alex Levenson (JIRA)
Alex Levenson created SPARK-18728: - Summary: Consider using Algebird's Aggregator instead of org.apache.spark.sql.expressions.Aggregator Key: SPARK-18728 URL: https://issues.apache.org/jira/browse/SPARK-18728

[jira] [Updated] (SPARK-14932) Allow DataFrame.replace() to replace values with None

2016-12-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-14932: --- Labels: starter (was: ) > Allow DataFrame.replace() to replace values with None >

[jira] [Resolved] (SPARK-18694) Add StreamingQuery.explain and exception to Python and fix StreamingQueryException

2016-12-05 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-18694. -- Resolution: Fixed Fix Version/s: 2.1.0 > Add StreamingQuery.explain and exception to

[jira] [Commented] (SPARK-14932) Allow DataFrame.replace() to replace values with None

2016-12-05 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723620#comment-15723620 ] Josh Rosen commented on SPARK-14932: I think that there's a similar issue impacting the Scala / Java

[jira] [Assigned] (SPARK-18721) ForeachSink breaks Watermark in append mode

2016-12-05 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-18721: Assignee: Shixiong Zhu > ForeachSink breaks Watermark in append mode >

[jira] [Updated] (SPARK-18719) Document spark.ui.showConsoleProgress

2016-12-05 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-18719: --- Assignee: Nicholas > Document spark.ui.showConsoleProgress > - >

[jira] [Resolved] (SPARK-18719) Document spark.ui.showConsoleProgress

2016-12-05 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-18719. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16151

[jira] [Updated] (SPARK-18719) Document spark.ui.showConsoleProgress

2016-12-05 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-18719: --- Assignee: Nicholas Chammas (was: Nicholas) > Document spark.ui.showConsoleProgress >

[jira] [Commented] (SPARK-18697) Upgrade sbt plugins

2016-12-05 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723579#comment-15723579 ] Apache Spark commented on SPARK-18697: -- User 'weiqingy' has created a pull request for this issue:
