[jira] [Commented] (SPARK-6567) Large linear model parallelism via a join and reduceByKey

2016-09-11 Thread WangJianfei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483287#comment-15483287 ] WangJianfei commented on SPARK-6567: @Reza Zadeh Any progress about this problems? Th

[jira] [Comment Edited] (SPARK-6567) Large linear model parallelism via a join and reduceByKey

2016-09-11 Thread WangJianfei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483287#comment-15483287 ] WangJianfei edited comment on SPARK-6567 at 9/12/16 6:57 AM: -

[jira] [Assigned] (SPARK-17462) Check for places within MLlib which should use VersionUtils to parse Spark version strings

2016-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17462: Assignee: Apache Spark > Check for places within MLlib which should use VersionUtils to pa

[jira] [Assigned] (SPARK-17462) Check for places within MLlib which should use VersionUtils to parse Spark version strings

2016-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17462: Assignee: (was: Apache Spark) > Check for places within MLlib which should use Version

[jira] [Commented] (SPARK-17462) Check for places within MLlib which should use VersionUtils to parse Spark version strings

2016-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483285#comment-15483285 ] Apache Spark commented on SPARK-17462: -- User 'VinceShieh' has created a pull request

[jira] [Updated] (SPARK-17503) Memory leak in Memory store when unable to cache the whole RDD

2016-09-11 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17503: --- Description: h2.Problem description: The following query triggers out of memory error. {code} sc.

[jira] [Created] (SPARK-17503) Memory leak in Memory store which unable to cache whole RDD

2016-09-11 Thread Sean Zhong (JIRA)
Sean Zhong created SPARK-17503: -- Summary: Memory leak in Memory store which unable to cache whole RDD Key: SPARK-17503 URL: https://issues.apache.org/jira/browse/SPARK-17503 Project: Spark Issu

[jira] [Updated] (SPARK-17503) Memory leak in Memory store when unable to cache the whole RDD

2016-09-11 Thread Sean Zhong (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Zhong updated SPARK-17503: --- Summary: Memory leak in Memory store when unable to cache the whole RDD (was: Memory leak in Memory

[jira] [Commented] (SPARK-17502) Multiple Bugs in DDL Statements on Temporary Views

2016-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17502?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483219#comment-15483219 ] Apache Spark commented on SPARK-17502: -- User 'gatorsmile' has created a pull request

[jira] [Assigned] (SPARK-17502) Multiple Bugs in DDL Statements on Temporary Views

2016-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17502: Assignee: (was: Apache Spark) > Multiple Bugs in DDL Statements on Temporary Views >

[jira] [Assigned] (SPARK-17502) Multiple Bugs in DDL Statements on Temporary Views

2016-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17502: Assignee: Apache Spark > Multiple Bugs in DDL Statements on Temporary Views > ---

[jira] [Created] (SPARK-17502) Multiple Bugs in DDL Statements on Temporary Views

2016-09-11 Thread Xiao Li (JIRA)
Xiao Li created SPARK-17502: --- Summary: Multiple Bugs in DDL Statements on Temporary Views Key: SPARK-17502 URL: https://issues.apache.org/jira/browse/SPARK-17502 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-17501) Re-register BlockManager again and again

2016-09-11 Thread Jagadeesan A S (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15483144#comment-15483144 ] Jagadeesan A S commented on SPARK-17501: Does this fail consistently? what is co

[jira] [Updated] (SPARK-17501) Re-register BlockManager again and again

2016-09-11 Thread cen yuhai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cen yuhai updated SPARK-17501: -- Description: After many times re-register, executor will exit because of timeout exception {code}

[jira] [Updated] (SPARK-17501) Re-register BlockManager again and again

2016-09-11 Thread cen yuhai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cen yuhai updated SPARK-17501: -- Description: {code} 16/09/11 04:02:42 INFO executor.Executor: Told to re-register on heartbeat 16/09/11

[jira] [Created] (SPARK-17501) Re-register BlockManager again and again

2016-09-11 Thread cen yuhai (JIRA)
cen yuhai created SPARK-17501: - Summary: Re-register BlockManager again and again Key: SPARK-17501 URL: https://issues.apache.org/jira/browse/SPARK-17501 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-17486) Remove unused TaskMetricsUIData.updatedBlockStatuses field

2016-09-11 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-17486. -- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 > Remove unused TaskMetr

[jira] [Commented] (SPARK-12008) Spark hive security authorization doesn't work as Apache hive's

2016-09-11 Thread pin_zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15482828#comment-15482828 ] pin_zhang commented on SPARK-12008: --- Does Spark SQL have any plan to support authrizati

[jira] [Commented] (SPARK-11374) skip.header.line.count is ignored in HiveContext

2016-09-11 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15482812#comment-15482812 ] Dongjoon Hyun commented on SPARK-11374: --- Which versions of Spark are you using now?

[jira] [Commented] (SPARK-17454) Add option to specify Mesos resource offer constraints

2016-09-11 Thread Michael Gummelt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15482682#comment-15482682 ] Michael Gummelt commented on SPARK-17454: - As of Spark 2.0, Mesos mode supports s

[jira] [Updated] (SPARK-17500) The DiskBytesSpilled metric in ExternalMerger && ExternalGroupBy is not right

2016-09-11 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-17500: Description: The DiskBytesSpilled metric in ExternalMerger && ExternalGroupBy is increased by file size in

[jira] [Updated] (SPARK-17500) The DiskBytesSpilled metric in ExternalMerger && ExternalGroupBy is not right

2016-09-11 Thread DjvuLee (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DjvuLee updated SPARK-17500: Summary: The DiskBytesSpilled metric in ExternalMerger && ExternalGroupBy is not right (was: The DiskBytes

[jira] [Assigned] (SPARK-17500) The DiskBytesSpilled metric in ExternalMerger && ExternalGroupBy is wrong

2016-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17500: Assignee: Apache Spark > The DiskBytesSpilled metric in ExternalMerger && ExternalGroupBy

[jira] [Assigned] (SPARK-17500) The DiskBytesSpilled metric in ExternalMerger && ExternalGroupBy is wrong

2016-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17500: Assignee: (was: Apache Spark) > The DiskBytesSpilled metric in ExternalMerger && Exter

[jira] [Commented] (SPARK-17500) The DiskBytesSpilled metric in ExternalMerger && ExternalGroupBy is wrong

2016-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15482303#comment-15482303 ] Apache Spark commented on SPARK-17500: -- User 'djvulee' has created a pull request fo

[jira] [Created] (SPARK-17500) The DiskBytesSpilled metric in ExternalMerger && ExternalGroupBy is wrong

2016-09-11 Thread DjvuLee (JIRA)
DjvuLee created SPARK-17500: --- Summary: The DiskBytesSpilled metric in ExternalMerger && ExternalGroupBy is wrong Key: SPARK-17500 URL: https://issues.apache.org/jira/browse/SPARK-17500 Project: Spark

[jira] [Assigned] (SPARK-17499) make the default params in sparkR spark.mlp consistent with MultilayerPerceptronClassifier

2016-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17499: Assignee: Apache Spark > make the default params in sparkR spark.mlp consistent with > Mu

[jira] [Commented] (SPARK-17499) make the default params in sparkR spark.mlp consistent with MultilayerPerceptronClassifier

2016-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481973#comment-15481973 ] Apache Spark commented on SPARK-17499: -- User 'WeichenXu123' has created a pull reque

[jira] [Assigned] (SPARK-17499) make the default params in sparkR spark.mlp consistent with MultilayerPerceptronClassifier

2016-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17499: Assignee: (was: Apache Spark) > make the default params in sparkR spark.mlp consistent

[jira] [Created] (SPARK-17499) make the default params in sparkR spark.mlp consistent with MultilayerPerceptronClassifier

2016-09-11 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-17499: -- Summary: make the default params in sparkR spark.mlp consistent with MultilayerPerceptronClassifier Key: SPARK-17499 URL: https://issues.apache.org/jira/browse/SPARK-17499

[jira] [Resolved] (SPARK-17415) Better error message for driver-side broadcast join OOMs

2016-09-11 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-17415. --- Resolution: Fixed Assignee: Sameer Agarwal Fix Version/s: 2.1.0 > Bet

[jira] [Created] (SPARK-17498) StringIndexer.setHandleInvalid should have another option 'new'

2016-09-11 Thread Miroslav Balaz (JIRA)
Miroslav Balaz created SPARK-17498: -- Summary: StringIndexer.setHandleInvalid should have another option 'new' Key: SPARK-17498 URL: https://issues.apache.org/jira/browse/SPARK-17498 Project: Spark

[jira] [Updated] (SPARK-17497) Preserve order when scanning ordered buckets over multiple partitions

2016-09-11 Thread Fridtjof Sander (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fridtjof Sander updated SPARK-17497: Description: Non-associative aggregations (like ```collect_list```) require the data to be

[jira] [Updated] (SPARK-17497) Preserve order when scanning ordered buckets over multiple partitions

2016-09-11 Thread Fridtjof Sander (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fridtjof Sander updated SPARK-17497: Description: Non-associative aggregations (like `collect_list`) require the data to be sor

[jira] [Created] (SPARK-17497) Preserve order when scanning ordered buckets over multiple partitions

2016-09-11 Thread Fridtjof Sander (JIRA)
Fridtjof Sander created SPARK-17497: --- Summary: Preserve order when scanning ordered buckets over multiple partitions Key: SPARK-17497 URL: https://issues.apache.org/jira/browse/SPARK-17497 Project:

[jira] [Commented] (SPARK-17445) Reference an ASF page as the main place to find third-party packages

2016-09-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481515#comment-15481515 ] Sean Owen commented on SPARK-17445: --- That's OK by me. I don't draw much distinction bet

[jira] [Commented] (SPARK-17389) KMeans speedup with better choice of k-means|| init steps = 2

2016-09-11 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481490#comment-15481490 ] Apache Spark commented on SPARK-17389: -- User 'yanboliang' has created a pull request

[jira] [Resolved] (SPARK-17336) Repeated calls sbin/spark-config.sh file Causes ${PYTHONPATH} Value duplicate

2016-09-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17336. --- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull request

[jira] [Updated] (SPARK-17336) Repeated calls sbin/spark-config.sh file Causes ${PYTHONPATH} Value duplicate

2016-09-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-17336: -- Assignee: Bryan Cutler Priority: Minor (was: Major) > Repeated calls sbin/spark-config.sh file Cau

[jira] [Resolved] (SPARK-17330) Clean up spark-warehouse in UT

2016-09-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17330. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14894 [https://github.co

[jira] [Updated] (SPARK-17330) Clean up spark-warehouse in UT

2016-09-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-17330: -- Assignee: tone Issue Type: Improvement (was: Bug) > Clean up spark-warehouse in UT > ---

[jira] [Resolved] (SPARK-16834) TrainValidationSplit and direct evaluation produce different scores

2016-09-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-16834. --- Resolution: Won't Fix I think the answer is the same as in https://issues.apache.org/jira/browse/SPA

[jira] [Resolved] (SPARK-17439) QuantilesSummaries returns the wrong result after compression

2016-09-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17439. --- Resolution: Fixed Assignee: Tim Hunter Fix Version/s: 2.1.0 2.0.1

[jira] [Resolved] (SPARK-17306) QuantileSummaries doesn't compress

2016-09-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17306. --- Resolution: Fixed Assignee: Sean Owen Fix Version/s: 2.1.0 2.0.1 R

[jira] [Commented] (SPARK-16834) TrainValidationSplit and direct evaluation produce different scores

2016-09-11 Thread Max Moroz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481341#comment-15481341 ] Max Moroz commented on SPARK-16834: --- [~bryanc] thanks for looking into this. I have no

[jira] [Commented] (SPARK-17493) Spark Job hangs while DataFrame writing to HDFS path with parquet mode

2016-09-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481314#comment-15481314 ] Sean Owen commented on SPARK-17493: --- I don't think this is enough info. What does 'hang

[jira] [Issue Comment Deleted] (SPARK-15687) Columnar execution engine

2016-09-11 Thread Kiran Lonikar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kiran Lonikar updated SPARK-15687: -- Comment: was deleted (was: I agree. It will then be possible to offer off-heap (sun.misc.Unsafe

[jira] [Commented] (SPARK-15687) Columnar execution engine

2016-09-11 Thread Kiran Lonikar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481272#comment-15481272 ] Kiran Lonikar commented on SPARK-15687: --- I agree. It will then be possible to offer

[jira] [Comment Edited] (SPARK-15687) Columnar execution engine

2016-09-11 Thread Kiran Lonikar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481268#comment-15481268 ] Kiran Lonikar edited comment on SPARK-15687 at 9/11/16 7:29 AM: ---

[jira] [Commented] (SPARK-17496) missing int to float coercion in df.sample() signature

2016-09-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481270#comment-15481270 ] Sean Owen commented on SPARK-17496: --- I don't think that's a bug. The value is virtually

[jira] [Commented] (SPARK-15687) Columnar execution engine

2016-09-11 Thread Kiran Lonikar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15481268#comment-15481268 ] Kiran Lonikar commented on SPARK-15687: --- I agree. It will then be possible to offer

[jira] [Created] (SPARK-17496) missing int to float coercion in df.sample() signature

2016-09-11 Thread Max Moroz (JIRA)
Max Moroz created SPARK-17496: - Summary: missing int to float coercion in df.sample() signature Key: SPARK-17496 URL: https://issues.apache.org/jira/browse/SPARK-17496 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-17389) KMeans speedup with better choice of k-means|| init steps = 2

2016-09-11 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17389. --- Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14956 [https://github.co