[jira] [Updated] (SPARK-5724) misconfiguration in Akka system

2015-02-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5724: - Priority: Minor (was: Major) Target Version/s: (was: 1.3.0) Assignee: Nan Zhu

[jira] [Resolved] (SPARK-5943) Update the API to remove several warns in building for Spark Streaming

2015-02-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5943. -- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 4722

[jira] [Updated] (SPARK-5943) Update the API to remove several warns in building for Spark Streaming

2015-02-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-5943: - Affects Version/s: (was: 1.3.0) Assignee: Saisai Shao Update the API to remove several

[jira] [Resolved] (SPARK-5724) misconfiguration in Akka system

2015-02-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5724. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 4512

[jira] [Created] (SPARK-5949) Driver program has to register roaring bitmap classes used by spark with Kryo when number of partitions is greater than 2000

2015-02-23 Thread Peter Torok (JIRA)
Peter Torok created SPARK-5949: -- Summary: Driver program has to register roaring bitmap classes used by spark with Kryo when number of partitions is greater than 2000 Key: SPARK-5949 URL:

[jira] [Created] (SPARK-5947) First class partitioning support in data sources API

2015-02-23 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-5947: - Summary: First class partitioning support in data sources API Key: SPARK-5947 URL: https://issues.apache.org/jira/browse/SPARK-5947 Project: Spark Issue Type:

[jira] [Created] (SPARK-5948) Support writing to partitioned table for the Parquet data source

2015-02-23 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-5948: - Summary: Support writing to partitioned table for the Parquet data source Key: SPARK-5948 URL: https://issues.apache.org/jira/browse/SPARK-5948 Project: Spark

[jira] [Commented] (SPARK-5940) Graph Loader: refactor + add more formats

2015-02-23 Thread lukovnikov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333265#comment-14333265 ] lukovnikov commented on SPARK-5940: --- Probably it's better to involve core Spark/GraphX

[jira] [Commented] (SPARK-5905) Improve RowMatrix user guide and doc.

2015-02-23 Thread Mike Beyer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333417#comment-14333417 ] Mike Beyer commented on SPARK-5905: --- ok, then we have the same understanding on naming

[jira] [Updated] (SPARK-5949) Driver program has to register roaring bitmap classes used by spark with Kryo when number of partitions is greater than 2000

2015-02-23 Thread Peter Torok (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Torok updated SPARK-5949: --- Description: When more than 2000 partitions are being used with Kryo, the following classes need to

[jira] [Updated] (SPARK-5944) Python release docs say SNAPSHOT + Author is missing

2015-02-23 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-5944: -- Target Version/s: 1.3.0, 1.2.2 (was: 1.2.2) Python release docs say SNAPSHOT + Author is missing

[jira] [Resolved] (SPARK-5939) Make FPGrowth example app take parameters

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-5939. -- Resolution: Fixed Issue resolved by pull request 4714

[jira] [Updated] (SPARK-5939) Make FPGrowth example app take parameters

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5939: - Assignee: Jacky Li Make FPGrowth example app take parameters

[jira] [Commented] (SPARK-5944) Python release docs say SNAPSHOT + Author is missing

2015-02-23 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333496#comment-14333496 ] Nicholas Chammas commented on SPARK-5944: - I'm not sure, but I think [here in the

[jira] [Commented] (SPARK-5950) Insert array into table saved as parquet should work when using datasource api

2015-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333532#comment-14333532 ] Apache Spark commented on SPARK-5950: - User 'viirya' has created a pull request for

[jira] [Updated] (SPARK-5944) Python release docs say SNAPSHOT + Author is missing

2015-02-23 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5944: Target Version/s: 1.2.2 Python release docs say SNAPSHOT + Author is missing

[jira] [Created] (SPARK-5950) Insert array into table saved as parquet should work when using datasource api

2015-02-23 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-5950: -- Summary: Insert array into table saved as parquet should work when using datasource api Key: SPARK-5950 URL: https://issues.apache.org/jira/browse/SPARK-5950

[jira] [Commented] (SPARK-5953) NoSuchMethodException with a Kafka input stream and custom decoder in Scala

2015-02-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333799#comment-14333799 ] Sean Owen commented on SPARK-5953: -- Dumb question, is it on the classpath? in your app

[jira] [Updated] (SPARK-5953) NoSuchMethodException with a Kafka input stream and custom decoder in Scala

2015-02-23 Thread Aleksandar Stojadinovic (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandar Stojadinovic updated SPARK-5953: --- Description: When using a Kafka input stream, and setting a custom Kafka

[jira] [Updated] (SPARK-5953) NoSuchMethodException with a Kafka input stream and custom decoder in Scala

2015-02-23 Thread Aleksandar Stojadinovic (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandar Stojadinovic updated SPARK-5953: --- Description: When using a Kafka input stream, and setting a custom Kafka

[jira] [Commented] (SPARK-5953) NoSuchMethodException with a Kafka input stream and custom decoder in Scala

2015-02-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333818#comment-14333818 ] Sean Owen commented on SPARK-5953: -- YARN or standalone? Did you look into

[jira] [Commented] (SPARK-5953) NoSuchMethodException with a Kafka input stream and custom decoder in Scala

2015-02-23 Thread Aleksandar Stojadinovic (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333828#comment-14333828 ] Aleksandar Stojadinovic commented on SPARK-5953: Standalone, in local, in

[jira] [Updated] (SPARK-4144) Support incremental model training of Naive Bayes classifier

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4144: - Target Version/s: 1.4.0 Support incremental model training of Naive Bayes classifier

[jira] [Commented] (SPARK-5912) Programming guide for feature selection

2015-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333832#comment-14333832 ] Apache Spark commented on SPARK-5912: - User 'jkbradley' has created a pull request for

[jira] [Updated] (SPARK-5953) NoSuchMethodException with a Kafka input stream and custom decoder in Scala

2015-02-23 Thread Aleksandar Stojadinovic (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandar Stojadinovic updated SPARK-5953: --- Description: When using a Kafka input stream, and setting a custom Kafka

[jira] [Created] (SPARK-5953) NoSuchMethodException with a Kafka input stream and custom decoder in Scala

2015-02-23 Thread Aleksandar Stojadinovic (JIRA)
Aleksandar Stojadinovic created SPARK-5953: -- Summary: NoSuchMethodException with a Kafka input stream and custom decoder in Scala Key: SPARK-5953 URL: https://issues.apache.org/jira/browse/SPARK-5953

[jira] [Updated] (SPARK-5953) NoSuchMethodException with a Kafka input stream and custom decoder in Scala

2015-02-23 Thread Aleksandar Stojadinovic (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandar Stojadinovic updated SPARK-5953: --- Description: When using a Kafka input stream, and setting a custom Kafka

[jira] [Commented] (SPARK-5953) NoSuchMethodException with a Kafka input stream and custom decoder in Scala

2015-02-23 Thread Aleksandar Stojadinovic (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333808#comment-14333808 ] Aleksandar Stojadinovic commented on SPARK-5953: The decoder? Yes, it's a

[jira] [Commented] (SPARK-4144) Support incremental model training of Naive Bayes classifier

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333830#comment-14333830 ] Xiangrui Meng commented on SPARK-4144: -- [~freeman-lab] I've assigned this ticket to

[jira] [Updated] (SPARK-4144) Support incremental model training of Naive Bayes classifier

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4144: - Assignee: Jeremy Freeman (was: Liquan Pei) Support incremental model training of Naive Bayes

[jira] [Created] (SPARK-5951) Remove unreachable driver memory properties in yarn client mode (YarnClientSchedulerBackend)

2015-02-23 Thread Shekhar Bansal (JIRA)
Shekhar Bansal created SPARK-5951: - Summary: Remove unreachable driver memory properties in yarn client mode (YarnClientSchedulerBackend) Key: SPARK-5951 URL: https://issues.apache.org/jira/browse/SPARK-5951

[jira] [Resolved] (SPARK-5904) DataFrame methods with varargs do not work in Java

2015-02-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5904. Resolution: Fixed Fix Version/s: 1.3.0 I think rxin just forgot to close this. It

[jira] [Commented] (SPARK-5463) Fix Parquet filter push-down

2015-02-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333608#comment-14333608 ] Patrick Wendell commented on SPARK-5463: Bumping to critical. Per our offline

[jira] [Commented] (SPARK-5845) Time to cleanup intermediate shuffle files not included in shuffle write time

2015-02-23 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333691#comment-14333691 ] Ilya Ganelin commented on SPARK-5845: - Hi Kay - I can knock this one out. Thanks.

[jira] [Commented] (SPARK-5951) Remove unreachable driver memory properties in yarn client mode (YarnClientSchedulerBackend)

2015-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333568#comment-14333568 ] Apache Spark commented on SPARK-5951: - User 'zuxqoj' has created a pull request for

[jira] [Updated] (SPARK-5463) Fix Parquet filter push-down

2015-02-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5463: --- Priority: Critical (was: Blocker) Fix Parquet filter push-down

[jira] [Updated] (SPARK-3650) Triangle Count handles reverse edges incorrectly

2015-02-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3650: --- Priority: Critical (was: Blocker) Triangle Count handles reverse edges incorrectly

[jira] [Commented] (SPARK-765) Test suite should run Spark example programs

2015-02-23 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333682#comment-14333682 ] Josh Rosen commented on SPARK-765: -- Yep, this still needs to be done. It's more of an

[jira] [Commented] (SPARK-5944) Python release docs say SNAPSHOT + Author is missing

2015-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333699#comment-14333699 ] Apache Spark commented on SPARK-5944: - User 'davies' has created a pull request for

[jira] [Commented] (SPARK-5750) Document that ordering of elements in shuffled partitions is not deterministic across runs

2015-02-23 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333685#comment-14333685 ] Ilya Ganelin commented on SPARK-5750: - Hi Josh - I can knock this out. Thanks.

[jira] [Updated] (SPARK-5845) Time to cleanup intermediate shuffle files not included in shuffle write time

2015-02-23 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout updated SPARK-5845: -- Assignee: Ilya Ganelin Time to cleanup intermediate shuffle files not included in shuffle

[jira] [Resolved] (SPARK-5912) Programming guide for feature selection

2015-02-23 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-5912. -- Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Alexander Ulanov

[jira] [Commented] (SPARK-5079) Detect failed jobs / batches in Spark Streaming unit tests

2015-02-23 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333743#comment-14333743 ] Ilya Ganelin commented on SPARK-5079: - Hi [~joshrosen] - I'm trying to wrap my head

[jira] [Created] (SPARK-5954) Add topByKey to pair RDDs

2015-02-23 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5954: Summary: Add topByKey to pair RDDs Key: SPARK-5954 URL: https://issues.apache.org/jira/browse/SPARK-5954 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-5922) Add diff(other: RDD[VertexId, VD]) in VertexRDD

2015-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333875#comment-14333875 ] Apache Spark commented on SPARK-5922: - User 'brennonyork' has created a pull request

[jira] [Closed] (SPARK-4284) BinaryClassificationMetrics precision-recall method names should correspond to return types

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-4284. Resolution: Won't Fix I'm closing this JIRA per discussion on the Github PR page.

[jira] [Updated] (SPARK-4510) Add k-medoids Partitioning Around Medoids (PAM) algorithm

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4510: - Labels: clustering features (was: features) Add k-medoids Partitioning Around Medoids (PAM)

[jira] [Closed] (SPARK-5010) native openblas library doesn't work: undefined symbol: cblas_dscal

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-5010. Resolution: Not a Problem I'm closing this PR because it is an upstream issue with the native BLAS

[jira] [Updated] (SPARK-5226) Add DBSCAN Clustering Algorithm to MLlib

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5226: - Labels: DBSCAN clustering (was: DBSCAN) Add DBSCAN Clustering Algorithm to MLlib

[jira] [Commented] (SPARK-5940) Graph Loader: refactor + add more formats

2015-02-23 Thread Magellanea (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334013#comment-14334013 ] Magellanea commented on SPARK-5940: --- [~lukovnikov] Thanks a lot for the reply, Do you

[jira] [Commented] (SPARK-794) Remove sleep() in ClusterScheduler.stop

2015-02-23 Thread Brennon York (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334020#comment-14334020 ] Brennon York commented on SPARK-794: [~srowen] [~joshrosen] bump on this. Would assume

[jira] [Commented] (SPARK-5261) In some cases ,The value of word's vector representation is too big

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334021#comment-14334021 ] Xiangrui Meng commented on SPARK-5261: -- Could you try a larger minCount to reduce the

[jira] [Updated] (SPARK-5405) Spark clusterer should support high dimensional data

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5405: - Labels: clustering (was: ) Spark clusterer should support high dimensional data

[jira] [Closed] (SPARK-4039) KMeans support sparse cluster centers

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-4039. Resolution: Duplicate KMeans support sparse cluster centers -

[jira] [Reopened] (SPARK-4039) KMeans support sparse cluster centers

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng reopened SPARK-4039: -- KMeans support sparse cluster centers -

[jira] [Commented] (SPARK-1182) Sort the configuration parameters in configuration.md

2015-02-23 Thread Brennon York (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334025#comment-14334025 ] Brennon York commented on SPARK-1182: - Given [~joshrosen]'s comments on the PR making

[jira] [Commented] (SPARK-5405) Spark clusterer should support high dimensional data

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334027#comment-14334027 ] Xiangrui Meng commented on SPARK-5405: -- Dimension reduction should be separated from

[jira] [Commented] (SPARK-5490) KMeans costs can be incorrect if tasks need to be rerun

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334030#comment-14334030 ] Xiangrui Meng commented on SPARK-5490: -- [~sandyr] This is a bug in core. Could you

[jira] [Updated] (SPARK-5490) KMeans costs can be incorrect if tasks need to be rerun

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5490: - Target Version/s: 1.4.0 KMeans costs can be incorrect if tasks need to be rerun

[jira] [Updated] (SPARK-5832) Add Affinity Propagation clustering algorithm

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5832: - Labels: clustering (was: ) Add Affinity Propagation clustering algorithm

[jira] [Closed] (SPARK-5927) Modify FPGrowth's partition strategy to reduce transactions in partitions

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-5927. Resolution: Won't Fix I'm closing this JIRA per discussion on the PR page. Modify FPGrowth's

[jira] [Updated] (SPARK-5490) KMeans costs can be incorrect if tasks need to be rerun

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5490: - Labels: clustering (was: ) KMeans costs can be incorrect if tasks need to be rerun

[jira] [Updated] (SPARK-2429) Hierarchical Implementation of KMeans

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2429: - Labels: clustering (was: ) Hierarchical Implementation of KMeans

[jira] [Updated] (SPARK-3439) Add Canopy Clustering Algorithm

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3439: - Labels: clustering (was: ) Add Canopy Clustering Algorithm ---

[jira] [Updated] (SPARK-3218) K-Means clusterer can fail on degenerate data

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3218: - Labels: clustering (was: ) K-Means clusterer can fail on degenerate data

[jira] [Updated] (SPARK-3220) K-Means clusterer should perform K-Means initialization in parallel

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3220: - Labels: clustering (was: ) K-Means clusterer should perform K-Means initialization in parallel

[jira] [Commented] (SPARK-3850) Scala style: disallow trailing spaces

2015-02-23 Thread Brennon York (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334038#comment-14334038 ] Brennon York commented on SPARK-3850: - This made it into the [master

[jira] [Updated] (SPARK-3219) K-Means clusterer should support Bregman distance functions

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3219: - Labels: clustering (was: ) K-Means clusterer should support Bregman distance functions

[jira] [Updated] (SPARK-3504) KMeans optimization: track distances and unmoved cluster centers across iterations

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3504: - Labels: clustering (was: ) KMeans optimization: track distances and unmoved cluster centers

[jira] [Updated] (SPARK-5272) Refactor NaiveBayes to support discrete and continuous labels,features

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5272: - Labels: clustering (was: ) Refactor NaiveBayes to support discrete and continuous

[jira] [Updated] (SPARK-4039) KMeans support sparse cluster centers

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4039: - Labels: clustering (was: ) KMeans support sparse cluster centers

[jira] [Updated] (SPARK-2336) Approximate k-NN Models for MLLib

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2336: - Labels: clustering features newbie (was: features newbie) Approximate k-NN Models for MLLib

[jira] [Updated] (SPARK-2138) The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2138: - Labels: clustering (was: ) The KMeans algorithm in the MLlib can lead to the Serialized Task

[jira] [Updated] (SPARK-2344) Add Fuzzy C-Means algorithm to MLlib

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2344: - Labels: clustering (was: ) Add Fuzzy C-Means algorithm to MLlib

[jira] [Updated] (SPARK-3261) KMeans clusterer can return duplicate cluster centers

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3261: - Labels: clustering (was: ) KMeans clusterer can return duplicate cluster centers

[jira] [Updated] (SPARK-2308) Add KMeans MiniBatch clustering algorithm to MLlib

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2308: - Labels: clustering (was: ) Add KMeans MiniBatch clustering algorithm to MLlib

[jira] [Updated] (SPARK-2336) Approximate k-NN Models for MLLib

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2336: - Labels: clustering features (was: clustering features newbie) Approximate k-NN Models for MLLib

[jira] [Commented] (SPARK-5490) KMeans costs can be incorrect if tasks need to be rerun

2015-02-23 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334039#comment-14334039 ] Sandy Ryza commented on SPARK-5490: --- The relevant JIRA is SPARK-732, but it's marked as

[jira] [Commented] (SPARK-4123) Show new dependencies added in pull requests

2015-02-23 Thread Brennon York (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14334049#comment-14334049 ] Brennon York commented on SPARK-4123: - [~nchammas] have you started this? If not I can

[jira] [Created] (SPARK-5955) Add checkpointInterval to ALS

2015-02-23 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-5955: Summary: Add checkpointInterval to ALS Key: SPARK-5955 URL: https://issues.apache.org/jira/browse/SPARK-5955 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-3355) Allow running maven tests in run-tests

2015-02-23 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333887#comment-14333887 ] Apache Spark commented on SPARK-3355: - User 'brennonyork' has created a pull request

[jira] [Closed] (SPARK-1006) MLlib ALS gets stack overflow with too many iterations

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-1006. Resolution: Duplicate MLlib ALS gets stack overflow with too many iterations

[jira] [Closed] (SPARK-3080) ArrayIndexOutOfBoundsException in ALS for Large datasets

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-3080. Resolution: Fixed I'm closing this issue since the only way that I can re-produce this bug is the

[jira] [Updated] (SPARK-3436) Streaming SVM

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3436: - Assignee: (was: Liquan Pei) Streaming SVM -- Key: SPARK-3436

[jira] [Commented] (SPARK-5541) Allow running Maven or SBT in run-tests

2015-02-23 Thread Brennon York (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333991#comment-14333991 ] Brennon York commented on SPARK-5541: - Just pushed up a PR for

[jira] [Closed] (SPARK-3435) Distributed matrix multiplication

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-3435. Resolution: Duplicate I'm closing this JIRA because it is hard to control data locality. We

[jira] [Closed] (SPARK-3436) Streaming SVM

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-3436. Resolution: Duplicate Streaming SVM -- Key: SPARK-3436

[jira] [Updated] (SPARK-5673) Implement Streaming wrapper for all linear methos

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-5673: - Assignee: Kirill A. Korinskiy Implement Streaming wrapper for all linear methos

[jira] [Closed] (SPARK-3403) NaiveBayes crashes with blas/lapack native libraries for breeze (netlib-java)

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-3403. Resolution: Not a Problem [~avulanov] Did you try OpenBLAS 0.2.12, as suggested by xianyi on

[jira] [Commented] (SPARK-4039) KMeans support sparse cluster centers

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333997#comment-14333997 ] Xiangrui Meng commented on SPARK-4039: -- I changed the JIRA title to be more

[jira] [Updated] (SPARK-4039) KMeans support sparse cluster centers

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4039: - Summary: KMeans support sparse cluster centers (was: KMeans support HashingTF vectors) KMeans

[jira] [Closed] (SPARK-4956) Vector Initialization error when initialize a Sparse Vector by calling Vectors.sparse(size, indices, values)

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-4956. Resolution: Won't Fix I'm closing this PR per discussion on the PR page. Vector Initialization

[jira] [Updated] (SPARK-3147) Implement A/B testing

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3147: - Assignee: Feynman Liang Implement A/B testing - Key:

[jira] [Commented] (SPARK-4289) Creating an instance of Hadoop Job fails in the Spark shell when toString() is called on the instance.

2015-02-23 Thread Alexander Bezzubov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333118#comment-14333118 ] Alexander Bezzubov commented on SPARK-4289: --- Could you please tell how exactly

[jira] [Comment Edited] (SPARK-4289) Creating an instance of Hadoop Job fails in the Spark shell when toString() is called on the instance.

2015-02-23 Thread Alexander Bezzubov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333142#comment-14333142 ] Alexander Bezzubov edited comment on SPARK-4289 at 2/23/15 9:26 AM:

[jira] [Commented] (SPARK-3147) Implement A/B testing

2015-02-23 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333086#comment-14333086 ] Xiangrui Meng commented on SPARK-3147: -- Done:) Implement A/B testing

[jira] [Commented] (SPARK-4289) Creating an instance of Hadoop Job fails in the Spark shell when toString() is called on the instance.

2015-02-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333127#comment-14333127 ] Sean Owen commented on SPARK-4289: -- [~bzz] Just type {{:silent}} into the shell at the

[jira] [Commented] (SPARK-4289) Creating an instance of Hadoop Job fails in the Spark shell when toString() is called on the instance.

2015-02-23 Thread Alexander Bezzubov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333142#comment-14333142 ] Alexander Bezzubov commented on SPARK-4289: --- [~sowen] Thanks, that's what I

[jira] [Comment Edited] (SPARK-4289) Creating an instance of Hadoop Job fails in the Spark shell when toString() is called on the instance.

2015-02-23 Thread Alexander Bezzubov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333142#comment-14333142 ] Alexander Bezzubov edited comment on SPARK-4289 at 2/23/15 9:39 AM:
