[jira] [Updated] (SPARK-17491) MemoryStore.putIteratorAsBytes() may silently lose values when KryoSerializer is used

2016-09-09 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-17491: --- Labels: correctness (was: ) > MemoryStore.putIteratorAsBytes() may silently lose values when KryoSer

[jira] [Assigned] (SPARK-17491) MemoryStore.putIteratorAsBytes() may silently lose values when KryoSerializer is used

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17491: Assignee: Josh Rosen (was: Apache Spark) > MemoryStore.putIteratorAsBytes() may silently

[jira] [Commented] (SPARK-17491) MemoryStore.putIteratorAsBytes() may silently lose values when KryoSerializer is used

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15479312#comment-15479312 ] Apache Spark commented on SPARK-17491: -- User 'JoshRosen' has created a pull request

[jira] [Assigned] (SPARK-17491) MemoryStore.putIteratorAsBytes() may silently lose values when KryoSerializer is used

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17491: Assignee: Apache Spark (was: Josh Rosen) > MemoryStore.putIteratorAsBytes() may silently

[jira] [Created] (SPARK-17491) MemoryStore.putIteratorAsBytes() may silently lose values when KryoSerializer is used

2016-09-09 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-17491: -- Summary: MemoryStore.putIteratorAsBytes() may silently lose values when KryoSerializer is used Key: SPARK-17491 URL: https://issues.apache.org/jira/browse/SPARK-17491 Pro

[jira] [Created] (SPARK-17490) Optimize SerializeFromObject for primitive array

2016-09-09 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-17490: Summary: Optimize SerializeFromObject for primitive array Key: SPARK-17490 URL: https://issues.apache.org/jira/browse/SPARK-17490 Project: Spark Issu

[jira] [Commented] (SPARK-17479) Fix LDA example in docs

2016-09-09 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15479243#comment-15479243 ] zhengruifeng commented on SPARK-17479: -- Because the paths in examples are relative p

[jira] [Commented] (SPARK-17479) Fix LDA example in docs

2016-09-09 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15479240#comment-15479240 ] zhengruifeng commented on SPARK-17479: -- +1 I tested this example in Scala, Java, Py2 and

[jira] [Assigned] (SPARK-17449) Relation between heartbeatInterval and network timeout

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17449: Assignee: Apache Spark > Relation between heartbeatInterval and network timeout >

[jira] [Assigned] (SPARK-17449) Relation between heartbeatInterval and network timeout

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17449: Assignee: (was: Apache Spark) > Relation between heartbeatInterval and network timeout

[jira] [Commented] (SPARK-17449) Relation between heartbeatInterval and network timeout

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15479237#comment-15479237 ] Apache Spark commented on SPARK-17449: -- User 'jagadeesanas2' has created a pull requ

[jira] [Created] (SPARK-17489) Improve filtering for bucketed tables

2016-09-09 Thread Shuai Lin (JIRA)
Shuai Lin created SPARK-17489: - Summary: Improve filtering for bucketed tables Key: SPARK-17489 URL: https://issues.apache.org/jira/browse/SPARK-17489 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-17488) TakeAndOrder will OOM when the data is very large

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17488: Assignee: (was: Apache Spark) > TakeAndOrder will OOM when the data is very large > --

[jira] [Assigned] (SPARK-17488) TakeAndOrder will OOM when the data is very large

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17488: Assignee: Apache Spark > TakeAndOrder will OOM when the data is very large > -

[jira] [Commented] (SPARK-17488) TakeAndOrder will OOM when the data is very large

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15479183#comment-15479183 ] Apache Spark commented on SPARK-17488: -- User 'cenyuhai' has created a pull request f

[jira] [Created] (SPARK-17488) TakeAndOrder will OOM when the data is very large

2016-09-09 Thread cen yuhai (JIRA)
cen yuhai created SPARK-17488: - Summary: TakeAndOrder will OOM when the data is very large Key: SPARK-17488 URL: https://issues.apache.org/jira/browse/SPARK-17488 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-17445) Reference an ASF page as the main place to find third-party packages

2016-09-09 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15479121#comment-15479121 ] Matei Zaharia commented on SPARK-17445: --- The powered by wiki page is a bit of a mes

[jira] [Commented] (SPARK-17450) spark sql rownumber OOM

2016-09-09 Thread cen yuhai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15479097#comment-15479097 ] cen yuhai commented on SPARK-17450: --- can you provide me davies's pr? > spark sql rownu

[jira] [Comment Edited] (SPARK-17487) Configurable bucketing info extraction

2016-09-09 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15479042#comment-15479042 ] Tejas Patil edited comment on SPARK-17487 at 9/10/16 3:39 AM: -

[jira] [Assigned] (SPARK-17487) Configurable bucketing info extraction

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17487: Assignee: (was: Apache Spark) > Configurable bucketing info extraction > -

[jira] [Commented] (SPARK-17487) Configurable bucketing info extraction

2016-09-09 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15479042#comment-15479042 ] Tejas Patil commented on SPARK-17487: - I have a WIP for this. I am looking for early

[jira] [Assigned] (SPARK-17487) Configurable bucketing info extraction

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17487: Assignee: Apache Spark > Configurable bucketing info extraction >

[jira] [Commented] (SPARK-17487) Configurable bucketing info extraction

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15479043#comment-15479043 ] Apache Spark commented on SPARK-17487: -- User 'tejasapatil' has created a pull reques

[jira] [Created] (SPARK-17487) Configurable bucketing info extraction

2016-09-09 Thread Tejas Patil (JIRA)
Tejas Patil created SPARK-17487: --- Summary: Configurable bucketing info extraction Key: SPARK-17487 URL: https://issues.apache.org/jira/browse/SPARK-17487 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-17447) performance improvement in Partitioner.DefaultPartitioner

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478926#comment-15478926 ] Apache Spark commented on SPARK-17447: -- User 'codlife' has created a pull request fo

[jira] [Updated] (SPARK-15453) Improve join planning for bucketed / sorted tables

2016-09-09 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-15453: Assignee: Tejas Patil > Improve join planning for bucketed / sorted tables > --

[jira] [Resolved] (SPARK-15453) Improve join planning for bucketed / sorted tables

2016-09-09 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-15453. - Resolution: Fixed Fix Version/s: 2.1.0 Issue resolved by pull request 14864 [https://githu

[jira] [Commented] (SPARK-17468) Cluster workers crushed when master network bad more than one WORKER_TIMEOUT_MS!

2016-09-09 Thread zhangzhiyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478846#comment-15478846 ] zhangzhiyan commented on SPARK-17468: - some of my worker died because of memory excee

[jira] [Commented] (SPARK-17400) MinMaxScaler.transform() outputs DenseVector by default, which causes poor performance

2016-09-09 Thread Frank Dai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478819#comment-15478819 ] Frank Dai commented on SPARK-17400: --- [~mlnick] After reading the doc of MaxAbsScaler, I

[jira] [Closed] (SPARK-17400) MinMaxScaler.transform() outputs DenseVector by default, which causes poor performance

2016-09-09 Thread Frank Dai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frank Dai closed SPARK-17400. - Resolution: Not A Problem > MinMaxScaler.transform() outputs DenseVector by default, which causes poor >

[jira] [Assigned] (SPARK-17486) Remove unused TaskMetricsUIData.updatedBlockStatuses field

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17486: Assignee: Josh Rosen (was: Apache Spark) > Remove unused TaskMetricsUIData.updatedBlockSt

[jira] [Commented] (SPARK-17486) Remove unused TaskMetricsUIData.updatedBlockStatuses field

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478817#comment-15478817 ] Apache Spark commented on SPARK-17486: -- User 'JoshRosen' has created a pull request

[jira] [Assigned] (SPARK-17486) Remove unused TaskMetricsUIData.updatedBlockStatuses field

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17486: Assignee: Apache Spark (was: Josh Rosen) > Remove unused TaskMetricsUIData.updatedBlockSt

[jira] [Created] (SPARK-17486) Remove unused TaskMetricsUIData.updatedBlockStatuses field

2016-09-09 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-17486: -- Summary: Remove unused TaskMetricsUIData.updatedBlockStatuses field Key: SPARK-17486 URL: https://issues.apache.org/jira/browse/SPARK-17486 Project: Spark Issue

[jira] [Commented] (SPARK-17476) Proper handling for unseen labels in logistic regression training.

2016-09-09 Thread Xin Ren (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478517#comment-15478517 ] Xin Ren commented on SPARK-17476: - Hi I can try to work on this one, thanks :) > Proper

[jira] [Comment Edited] (SPARK-16239) SQL issues with cast from date to string around daylight savings time

2016-09-09 Thread Dean Wampler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478438#comment-15478438 ] Dean Wampler edited comment on SPARK-16239 at 9/9/16 10:20 PM:

[jira] [Commented] (SPARK-16239) SQL issues with cast from date to string around daylight savings time

2016-09-09 Thread Dean Wampler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478438#comment-15478438 ] Dean Wampler commented on SPARK-16239: -- I investigated this a bit today for a customer.

[jira] [Commented] (SPARK-14221) Cross-publish Chill for Scala 2.12

2016-09-09 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478428#comment-15478428 ] Jakob Odersky commented on SPARK-14221: --- I just saw that chill already [has a pendi

[jira] [Commented] (SPARK-16834) TrainValildationSplit and direct evaluation produce different scores

2016-09-09 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478414#comment-15478414 ] Bryan Cutler commented on SPARK-16834: -- [~mmoroz], your sample doesn't quite do the

[jira] [Assigned] (SPARK-17485) Failed remote cached block reads can lead to whole job failure

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17485: Assignee: Apache Spark (was: Josh Rosen) > Failed remote cached block reads can lead to w

[jira] [Assigned] (SPARK-17485) Failed remote cached block reads can lead to whole job failure

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17485: Assignee: Josh Rosen (was: Apache Spark) > Failed remote cached block reads can lead to w

[jira] [Commented] (SPARK-17485) Failed remote cached block reads can lead to whole job failure

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478407#comment-15478407 ] Apache Spark commented on SPARK-17485: -- User 'JoshRosen' has created a pull request

[jira] [Resolved] (SPARK-17469) mapWithState causes block lock warning

2016-09-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-17469. --- Resolution: Duplicate > mapWithState causes block lock warning >

[jira] [Commented] (SPARK-17469) mapWithState causes block lock warning

2016-09-09 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478339#comment-15478339 ] Miao Wang commented on SPARK-17469: --- Can you give command for reproduction? > mapWithS

[jira] [Commented] (SPARK-16026) Cost-based Optimizer framework

2016-09-09 Thread Srinath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478341#comment-15478341 ] Srinath commented on SPARK-16026: - Thanks for the response: 1. You’re correct that the se

[jira] [Created] (SPARK-17485) Failed remote cached block reads can lead to whole job failure

2016-09-09 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-17485: -- Summary: Failed remote cached block reads can lead to whole job failure Key: SPARK-17485 URL: https://issues.apache.org/jira/browse/SPARK-17485 Project: Spark I

[jira] [Updated] (SPARK-17485) Failed remote cached block reads can lead to whole job failure

2016-09-09 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-17485: --- Priority: Critical (was: Major) > Failed remote cached block reads can lead to whole job failure > -

[jira] [Resolved] (SPARK-17354) java.lang.ClassCastException: java.lang.Integer cannot be cast to java.sql.Date

2016-09-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-17354. Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 Issue resolved by pull reque

[jira] [Created] (SPARK-17484) Race condition when cancelling a job during a cache write can lead to block fetch failures

2016-09-09 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-17484: -- Summary: Race condition when cancelling a job during a cache write can lead to block fetch failures Key: SPARK-17484 URL: https://issues.apache.org/jira/browse/SPARK-17484

[jira] [Updated] (SPARK-17477) SparkSQL cannot handle schema evolution from Int -> Long when parquet files have Int as its type while hive metastore has Long as its type

2016-09-09 Thread Gang Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated SPARK-17477: Shepherd: (was: Gang Wu) > SparkSQL cannot handle schema evolution from Int -> Long when parquet files >

[jira] [Assigned] (SPARK-17483) Minor refactoring and cleanup in BlockManager block status reporting and block removal

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17483: Assignee: Apache Spark (was: Josh Rosen) > Minor refactoring and cleanup in BlockManager

[jira] [Commented] (SPARK-17483) Minor refactoring and cleanup in BlockManager block status reporting and block removal

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478208#comment-15478208 ] Apache Spark commented on SPARK-17483: -- User 'JoshRosen' has created a pull request

[jira] [Assigned] (SPARK-17483) Minor refactoring and cleanup in BlockManager block status reporting and block removal

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17483: Assignee: Josh Rosen (was: Apache Spark) > Minor refactoring and cleanup in BlockManager

[jira] [Assigned] (SPARK-17477) SparkSQL cannot handle schema evolution from Int -> Long when parquet files have Int as its type while hive metastore has Long as its type

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17477: Assignee: Apache Spark > SparkSQL cannot handle schema evolution from Int -> Long when par

[jira] [Commented] (SPARK-17477) SparkSQL cannot handle schema evolution from Int -> Long when parquet files have Int as its type while hive metastore has Long as its type

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478196#comment-15478196 ] Apache Spark commented on SPARK-17477: -- User 'wgtmac' has created a pull request for

[jira] [Assigned] (SPARK-17477) SparkSQL cannot handle schema evolution from Int -> Long when parquet files have Int as its type while hive metastore has Long as its type

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17477: Assignee: (was: Apache Spark) > SparkSQL cannot handle schema evolution from Int -> Lo

[jira] [Created] (SPARK-17483) Minor refactoring and cleanup in BlockManager block status reporting and block removal

2016-09-09 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-17483: -- Summary: Minor refactoring and cleanup in BlockManager block status reporting and block removal Key: SPARK-17483 URL: https://issues.apache.org/jira/browse/SPARK-17483 Pr

[jira] [Created] (SPARK-17482) Analyzer should be able run on top of optimized rule

2016-09-09 Thread Davies Liu (JIRA)
Davies Liu created SPARK-17482: -- Summary: Analyzer should be able run on top of optimized rule Key: SPARK-17482 URL: https://issues.apache.org/jira/browse/SPARK-17482 Project: Spark Issue Type:

[jira] [Commented] (SPARK-16240) model loading backward compatibility for ml.clustering.LDA

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478139#comment-15478139 ] Apache Spark commented on SPARK-16240: -- User 'jkbradley' has created a pull request

[jira] [Commented] (SPARK-5992) Locality Sensitive Hashing (LSH) for MLlib

2016-09-09 Thread Yun Ni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478135#comment-15478135 ] Yun Ni commented on SPARK-5992: --- Thank you very much for reviewing it, Joseph! I will work

[jira] [Commented] (SPARK-15573) Backwards-compatible persistence for spark.ml

2016-09-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478131#comment-15478131 ] Joseph K. Bradley commented on SPARK-15573: --- I'd prefer to put this in unit tes

[jira] [Commented] (SPARK-5992) Locality Sensitive Hashing (LSH) for MLlib

2016-09-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478125#comment-15478125 ] Joseph K. Bradley commented on SPARK-5992: -- The design doc LGTM! Thanks for upda

[jira] [Commented] (SPARK-17478) Create spark.eventLog.dir if it does not exist

2016-09-09 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478087#comment-15478087 ] Robert Kruszewski commented on SPARK-17478: --- Thanks for the pointers and apolog

[jira] [Commented] (SPARK-17478) Create spark.eventLog.dir if it does not exist

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478085#comment-15478085 ] Apache Spark commented on SPARK-17478: -- User 'robert3005' has created a pull request

[jira] [Commented] (SPARK-17479) Fix LDA example in docs

2016-09-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478058#comment-15478058 ] Nick Pentreath commented on SPARK-17479: I just ran Scala, Java and Python exampl

[jira] [Commented] (SPARK-17479) Fix LDA example in docs

2016-09-09 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478050#comment-15478050 ] Nick Pentreath commented on SPARK-17479: I do see the data file: https://github.

[jira] [Commented] (SPARK-17480) CompressibleColumnBuilder inefficiently call gatherCompressibilityStats

2016-09-09 Thread Ergin Seyfe (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478043#comment-15478043 ] Ergin Seyfe commented on SPARK-17480: - Yes [~srowen], that was exactly same as my PR:

[jira] [Assigned] (SPARK-17480) CompressibleColumnBuilder inefficiently call gatherCompressibilityStats

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17480: Assignee: (was: Apache Spark) > CompressibleColumnBuilder inefficiently call gatherCom

[jira] [Assigned] (SPARK-17480) CompressibleColumnBuilder inefficiently call gatherCompressibilityStats

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17480: Assignee: Apache Spark > CompressibleColumnBuilder inefficiently call gatherCompressibilit

[jira] [Resolved] (SPARK-17478) Create spark.eventLog.dir if it does not exist

2016-09-09 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-17478. Resolution: Duplicate There are reasons why this will never be done. > Create spark.eventL

[jira] [Commented] (SPARK-17480) CompressibleColumnBuilder inefficiently call gatherCompressibilityStats

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478040#comment-15478040 ] Apache Spark commented on SPARK-17480: -- User 'seyfe' has created a pull request for

[jira] [Commented] (SPARK-17480) CompressibleColumnBuilder inefficiently call gatherCompressibilityStats

2016-09-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478035#comment-15478035 ] Sean Owen commented on SPARK-17480: --- Yeah, I wonder if the "unrolled" while loop here i

[jira] [Updated] (SPARK-17481) Flaky test: org.apache.spark.DistributedSuite.passing environment variables to cluster

2016-09-09 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-17481: - Attachment: log-17481.txt > Flaky test: org.apache.spark.DistributedSuite.passing environment variables

[jira] [Created] (SPARK-17481) Flaky test: org.apache.spark.DistributedSuite.passing environment variables to cluster

2016-09-09 Thread Yin Huai (JIRA)
Yin Huai created SPARK-17481: Summary: Flaky test: org.apache.spark.DistributedSuite.passing environment variables to cluster Key: SPARK-17481 URL: https://issues.apache.org/jira/browse/SPARK-17481 Projec

[jira] [Updated] (SPARK-17478) Create spark.eventLog.dir if it does not exist

2016-09-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-17478: -- Issue Type: Improvement (was: Bug) When this has come up previously, I think the problem has been that

[jira] [Updated] (SPARK-17480) CompressibleColumnBuilder inefficiently call gatherCompressibilityStats

2016-09-09 Thread Ergin Seyfe (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ergin Seyfe updated SPARK-17480: Description: When we profiled one of our Spark jobs we saw that: 6.24% of the CPU is spent on List.

[jira] [Updated] (SPARK-17480) CompressibleColumnBuilder inefficiently call gatherCompressibilityStats

2016-09-09 Thread Ergin Seyfe (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ergin Seyfe updated SPARK-17480: Description: When we profiled one of our Spark jobs we saw that: 6.24% of the CPU is spent on List.

[jira] [Created] (SPARK-17480) CompressibleColumnBuilder inefficiently call gatherCompressibilityStats

2016-09-09 Thread Ergin Seyfe (JIRA)
Ergin Seyfe created SPARK-17480: --- Summary: CompressibleColumnBuilder inefficiently call gatherCompressibilityStats Key: SPARK-17480 URL: https://issues.apache.org/jira/browse/SPARK-17480 Project: Spark

[jira] [Commented] (SPARK-17479) Fix LDA example in docs

2016-09-09 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478019#comment-15478019 ] Joseph K. Bradley commented on SPARK-17479: --- CC [~podongfeng] [~mlnick] from th

[jira] [Created] (SPARK-17479) Fix LDA example in docs

2016-09-09 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-17479: - Summary: Fix LDA example in docs Key: SPARK-17479 URL: https://issues.apache.org/jira/browse/SPARK-17479 Project: Spark Issue Type: Documentation

[jira] [Created] (SPARK-17478) Create spark.eventLog.dir if it does not exist

2016-09-09 Thread Robert Kruszewski (JIRA)
Robert Kruszewski created SPARK-17478: - Summary: Create spark.eventLog.dir if it does not exist Key: SPARK-17478 URL: https://issues.apache.org/jira/browse/SPARK-17478 Project: Spark Issu

[jira] [Updated] (SPARK-17474) Python UDF does not work between Sort and Limit

2016-09-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17474: --- Summary: Python UDF does not work between Sort and Limit (was: expressions of QueryPlan does not inc

[jira] [Updated] (SPARK-17474) expressions of QueryPlan does not include those inside Option[Seq[Expression]]

2016-09-09 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-17474: --- Affects Version/s: (was: 1.6.2) (was: 1.5.2) > expressions of QueryPla

[jira] [Assigned] (SPARK-17474) expressions of QueryPlan does not include those inside Option[Seq[Expression]]

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17474: Assignee: Davies Liu (was: Apache Spark) > expressions of QueryPlan does not include thos

[jira] [Commented] (SPARK-17474) expressions of QueryPlan does not include those inside Option[Seq[Expression]]

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15477985#comment-15477985 ] Apache Spark commented on SPARK-17474: -- User 'davies' has created a pull request for

[jira] [Assigned] (SPARK-17474) expressions of QueryPlan does not include those inside Option[Seq[Expression]]

2016-09-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17474: Assignee: Apache Spark (was: Davies Liu) > expressions of QueryPlan does not include thos

[jira] [Updated] (SPARK-17476) Proper handling for unseen labels in logistic regression training.

2016-09-09 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai updated SPARK-17476: Issue Type: Sub-task (was: New Feature) Parent: SPARK-17133 > Proper handling for unseen labels in

[jira] [Updated] (SPARK-17477) SparkSQL cannot handle schema evolution from Int -> Long when parquet files have Int as its type while hive metastore has Long as its type

2016-09-09 Thread Gang Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Wu updated SPARK-17477: Shepherd: Gang Wu > SparkSQL cannot handle schema evolution from Int -> Long when parquet files > have Int

[jira] [Commented] (SPARK-17477) SparkSQL cannot handle schema evolution from Int -> Long when parquet files have Int as its type while hive metastore has Long as its type

2016-09-09 Thread Gang Wu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15477945#comment-15477945 ] Gang Wu commented on SPARK-17477: - I'm working on a fix for this issue. Will send pull re

[jira] [Commented] (SPARK-4563) Allow spark driver to bind to different ip then advertise ip

2016-09-09 Thread Sunil Kotagiri (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15477942#comment-15477942 ] Sunil Kotagiri commented on SPARK-4563: --- +1 I also disagree that it is Minor bug. We

[jira] [Created] (SPARK-17477) SparkSQL cannot handle schema evolution from Int -> Long when parquet files have Int as its type while hive metastore has Long as its type

2016-09-09 Thread Gang Wu (JIRA)
Gang Wu created SPARK-17477: --- Summary: SparkSQL cannot handle schema evolution from Int -> Long when parquet files have Int as its type while hive metastore has Long as its type Key: SPARK-17477 URL: https://issues.apa

[jira] [Resolved] (SPARK-17433) YarnShuffleService doesn't handle moving credentials levelDb

2016-09-09 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-17433. --- Resolution: Fixed Assignee: Thomas Graves Fix Version/s: 2.1.0 > YarnShuffleS

[jira] [Commented] (SPARK-14221) Cross-publish Chill for Scala 2.12

2016-09-09 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15477922#comment-15477922 ] Josh Rosen commented on SPARK-14221: Assuming that the license is the same, I don't s

[jira] [Comment Edited] (SPARK-14221) Cross-publish Chill for Scala 2.12

2016-09-09 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15477903#comment-15477903 ] Jakob Odersky edited comment on SPARK-14221 at 9/9/16 6:30 PM:

[jira] [Commented] (SPARK-14221) Cross-publish Chill for Scala 2.12

2016-09-09 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15477903#comment-15477903 ] Jakob Odersky commented on SPARK-14221: --- [~joshrosen]'s upstream PR requires Kryo 3

[jira] [Updated] (SPARK-17475) HDFSMetadataLog should not leak CRC files

2016-09-09 Thread Frederick Reiss (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Frederick Reiss updated SPARK-17475: Description: When HDFSMetadataLog uses a log directory on a filesystem other than HDFS (i.e

[jira] [Updated] (SPARK-17421) Document warnings about "MaxPermSize" parameter when building with Maven and Java 8

2016-09-09 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-17421: -- Priority: Trivial (was: Minor) Issue Type: Improvement (was: Bug) Summary: Document warni

[jira] [Commented] (SPARK-17466) Error message is not very clear

2016-09-09 Thread Tim Chan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15477873#comment-15477873 ] Tim Chan commented on SPARK-17466: -- Thanks [~srowen]! > Error message is not very clear

[jira] [Created] (SPARK-17476) Proper handling for unseen labels in logistic regression training.

2016-09-09 Thread Seth Hendrickson (JIRA)
Seth Hendrickson created SPARK-17476: Summary: Proper handling for unseen labels in logistic regression training. Key: SPARK-17476 URL: https://issues.apache.org/jira/browse/SPARK-17476 Project: S

[jira] [Commented] (SPARK-17421) Warnings about "MaxPermSize" parameter when building with Maven and Java 8

2016-09-09 Thread Frederick Reiss (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15477864#comment-15477864 ] Frederick Reiss commented on SPARK-17421: - Committer feedback on first PR was tha
