[jira] [Closed] (SPARK-1543) Add ADMM for solving Lasso (and elastic net) problem

2014-07-29 Thread Shuo Xiang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shuo Xiang closed SPARK-1543. - Resolution: Later close at this time for more design discussion Add ADMM for solving Lasso (and

[jira] [Updated] (SPARK-2726) Remove SortOrder in ShuffleDependency and HashShuffleReader

2014-07-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2726: --- Description: SPARK-2125 introduced a SortOrder in ShuffleDependency and HashShuffleReader. However,

[jira] [Commented] (SPARK-2723) Block Manager should catch exceptions in putValues

2014-07-29 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077419#comment-14077419 ] Patrick Wendell commented on SPARK-2723: I think blacklisting directories is a

[jira] [Commented] (SPARK-2677) BasicBlockFetchIterator#next can wait forever

2014-07-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077439#comment-14077439 ] Apache Spark commented on SPARK-2677: - User 'sarutak' has created a pull request for

[jira] [Commented] (SPARK-1630) PythonRDDs don't handle nulls gracefully

2014-07-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077452#comment-14077452 ] Josh Rosen commented on SPARK-1630: --- We aren't passing completely arbitrary iterators of

[jira] [Commented] (SPARK-1630) PythonRDDs don't handle nulls gracefully

2014-07-29 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077465#comment-14077465 ] Davies Liu commented on SPARK-1630: --- If a RDD is generated in Scala/Java by user code,

[jira] [Commented] (SPARK-1981) Add AWS Kinesis streaming support

2014-07-29 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077473#comment-14077473 ] Matei Zaharia commented on SPARK-1981: -- The EC2 scripts actually fetch a package that

[jira] [Resolved] (SPARK-2580) broken pipe collecting schemardd results

2014-07-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2580. --- Resolution: Fixed Fix Version/s: 1.0.3 1.1.0 Target Version/s:

[jira] [Commented] (SPARK-2388) Streaming from multiple different Kafka topics is problematic

2014-07-29 Thread Silver (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077488#comment-14077488 ] Silver commented on SPARK-2388: --- Can't you use the overload of Kafka's KeyedMessage(topic :

[jira] [Commented] (SPARK-1630) PythonRDDs don't handle nulls gracefully

2014-07-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077492#comment-14077492 ] Josh Rosen commented on SPARK-1630: --- In the current Spark codebase, the PythonRDD

[jira] [Resolved] (SPARK-2727) HashShuffleReader should do in-place sort

2014-07-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-2727. Resolution: Fixed Fix Version/s: 1.1.0 HashShuffleReader should do in-place sort

[jira] [Resolved] (SPARK-2726) Remove SortOrder in ShuffleDependency and HashShuffleReader

2014-07-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-2726. Resolution: Fixed Fix Version/s: 1.1.0 Remove SortOrder in ShuffleDependency and

[jira] [Resolved] (SPARK-2174) Implement treeReduce and treeAggregate

2014-07-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-2174. Resolution: Fixed Fix Version/s: 1.1.0 Implement treeReduce and treeAggregate

[jira] [Resolved] (SPARK-791) [pyspark] operator.getattr not serialized

2014-07-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-791. -- Resolution: Fixed [pyspark] operator.getattr not serialized -

[jira] [Updated] (SPARK-791) [pyspark] operator.getattr not serialized

2014-07-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-791: - Affects Version/s: 1.0.0 Fix Version/s: 1.0.3 0.9.3

[jira] [Updated] (SPARK-791) [pyspark] operator.getattr not serialized

2014-07-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-791: - Target Version/s: 1.0.2 [pyspark] operator.getattr not serialized

[jira] [Comment Edited] (SPARK-2720) spark-examples should depend on HBase modules for HBase 0.96+

2014-07-29 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077525#comment-14077525 ] Sean Owen edited comment on SPARK-2720 at 7/29/14 8:55 AM: --- See

[jira] [Commented] (SPARK-2632) Importing a method of class in Spark REPL causes the REPL to pull in unnecessary stuff.

2014-07-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077552#comment-14077552 ] Apache Spark commented on SPARK-2632: - User 'ScrapCodes' has created a pull request

[jira] [Created] (SPARK-2728) Integer overflow in partition index calculation RangePartitioner

2014-07-29 Thread Jianshi Huang (JIRA)
Jianshi Huang created SPARK-2728: Summary: Integer overflow in partition index calculation RangePartitioner Key: SPARK-2728 URL: https://issues.apache.org/jira/browse/SPARK-2728 Project: Spark

[jira] [Commented] (SPARK-2379) stopReceive in dead loop, cause stackoverflow exception

2014-07-29 Thread dai zhiyuan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077675#comment-14077675 ] dai zhiyuan commented on SPARK-2379: me too. stopReceive in dead loop, cause

[jira] [Commented] (SPARK-2720) spark-examples should depend on HBase modules for HBase 0.96+

2014-07-29 Thread Ted Yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077740#comment-14077740 ] Ted Yu commented on SPARK-2720: --- 0.98 is currently the stable release of HBase. bq. It

[jira] [Commented] (SPARK-1630) PythonRDDs don't handle nulls gracefully

2014-07-29 Thread Kalpit Shah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077899#comment-14077899 ] Kalpit Shah commented on SPARK-1630: Here's my case that led me to filing this bug and

[jira] [Updated] (SPARK-2729) Forgot to match Timestamp type in ColumnBuilder

2014-07-29 Thread Teng Qiu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teng Qiu updated SPARK-2729: Description: after SPARK-2710 we can create a table in Spark SQL with ColumnType Timestamp from jdbc.

[jira] [Created] (SPARK-2729) Forgot to match Timestamp type in ColumnBuilder

2014-07-29 Thread Teng Qiu (JIRA)
Teng Qiu created SPARK-2729: --- Summary: Forgot to match Timestamp type in ColumnBuilder Key: SPARK-2729 URL: https://issues.apache.org/jira/browse/SPARK-2729 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-1630) PythonRDDs don't handle nulls gracefully

2014-07-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077945#comment-14077945 ] Josh Rosen commented on SPARK-1630: --- Hi Kalpit, Thanks for sharing your use-case; it

[jira] [Commented] (SPARK-2308) Add KMeans MiniBatch clustering algorithm to MLlib

2014-07-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077952#comment-14077952 ] Xiangrui Meng commented on SPARK-2308: -- Thanks for testing! Did you mean k-means||

[jira] [Created] (SPARK-2730) When retrieving a value from a Map, GetItem evaluates key twice

2014-07-29 Thread Yin Huai (JIRA)
Yin Huai created SPARK-2730: --- Summary: When retrieving a value from a Map, GetItem evaluates key twice Key: SPARK-2730 URL: https://issues.apache.org/jira/browse/SPARK-2730 Project: Spark Issue

[jira] [Commented] (SPARK-2729) Forgot to match Timestamp type in ColumnBuilder

2014-07-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077972#comment-14077972 ] Apache Spark commented on SPARK-2729: - User 'chutium' has created a pull request for

[jira] [Updated] (SPARK-2206) Automatically infer the number of classification classes in multiclass classification

2014-07-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2206: - Target Version/s: 1.2.0 (was: 1.1.0) Automatically infer the number of classification classes

[jira] [Commented] (SPARK-2206) Automatically infer the number of classification classes in multiclass classification

2014-07-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077994#comment-14077994 ] Xiangrui Meng commented on SPARK-2206: -- [~manishamde] I'm thinking about adding a

[jira] [Commented] (SPARK-2341) loadLibSVMFile doesn't handle regression datasets

2014-07-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078031#comment-14078031 ] Xiangrui Meng commented on SPARK-2341: -- [~srowen] For the doc in your version:

[jira] [Closed] (SPARK-2512) Stratified sampling

2014-07-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-2512. Resolution: Duplicate Stratified sampling --- Key: SPARK-2512

[jira] [Updated] (SPARK-2207) Add minimum information gain and minimum instances per node as training parameters for decision tree.

2014-07-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2207: - Target Version/s: (was: 1.1.0) Add minimum information gain and minimum instances per node as

[jira] [Commented] (SPARK-2138) The KMeans algorithm in the MLlib can lead to the Serialized Task size become bigger and bigger

2014-07-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078036#comment-14078036 ] Xiangrui Meng commented on SPARK-2138: -- [DjvuLee] Do you mind testing the latest

[jira] [Commented] (SPARK-2341) loadLibSVMFile doesn't handle regression datasets

2014-07-29 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078040#comment-14078040 ] Sean Owen commented on SPARK-2341: -- To me, it's less confusing than writing multiclass

[jira] [Commented] (SPARK-2447) Add common solution for sending upsert actions to HBase (put, deletes, and increment)

2014-07-29 Thread Ted Malaska (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078047#comment-14078047 ] Ted Malaska commented on SPARK-2447: So Spark has HBase 94.6 as the default HBase.

[jira] [Commented] (SPARK-2447) Add common solution for sending upsert actions to HBase (put, deletes, and increment)

2014-07-29 Thread Ted Malaska (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078059#comment-14078059 ] Ted Malaska commented on SPARK-2447: Just talked to JMS from HBase and we don't want

[jira] [Created] (SPARK-2731) Update Tachyon dependency to 0.5.0

2014-07-29 Thread Henry Saputra (JIRA)
Henry Saputra created SPARK-2731: Summary: Update Tachyon dependency to 0.5.0 Key: SPARK-2731 URL: https://issues.apache.org/jira/browse/SPARK-2731 Project: Spark Issue Type: Task

[jira] [Updated] (SPARK-1729) Make Flume pull data from source, rather than the current push model

2014-07-29 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1729: - Target Version/s: 1.1.0 Make Flume pull data from source, rather than the current push model

[jira] [Resolved] (SPARK-1729) Make Flume pull data from source, rather than the current push model

2014-07-29 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-1729. -- Resolution: Fixed Make Flume pull data from source, rather than the current push model

[jira] [Updated] (SPARK-2377) Create a Python API for Spark Streaming

2014-07-29 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-2377: - Target Version/s: 1.2.0 (was: 1.1.0) Create a Python API for Spark Streaming

[jira] [Created] (SPARK-2732) Update build script to Tachyon 0.5.0

2014-07-29 Thread Henry Saputra (JIRA)
Henry Saputra created SPARK-2732: Summary: Update build script to Tachyon 0.5.0 Key: SPARK-2732 URL: https://issues.apache.org/jira/browse/SPARK-2732 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-2733) Update make-distribution.sh to download Tachyon 0.5.0

2014-07-29 Thread Henry Saputra (JIRA)
Henry Saputra created SPARK-2733: Summary: Update make-distribution.sh to download Tachyon 0.5.0 Key: SPARK-2733 URL: https://issues.apache.org/jira/browse/SPARK-2733 Project: Spark Issue

[jira] [Commented] (SPARK-1729) Make Flume pull data from source, rather than the current push model

2014-07-29 Thread Hari Shreedharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078123#comment-14078123 ] Hari Shreedharan commented on SPARK-1729: - Thanks for merging! Make Flume pull

[jira] [Created] (SPARK-2734) DROP TABLE should also uncache table

2014-07-29 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-2734: --- Summary: DROP TABLE should also uncache table Key: SPARK-2734 URL: https://issues.apache.org/jira/browse/SPARK-2734 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-2730) When retrieving a value from a Map, GetItem evaluates key twice

2014-07-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2730. - Resolution: Fixed Fix Version/s: 1.0.3 1.1.0 When retrieving

[jira] [Resolved] (SPARK-2674) Add date and time types to inferSchema

2014-07-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2674. - Resolution: Fixed Fix Version/s: 1.1.0 Add date and time types to inferSchema

[jira] [Resolved] (SPARK-2082) Stratified sampling implementation in PairRDDFunctions

2014-07-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-2082. -- Resolution: Fixed Fix Version/s: 1.1.0 Issue resolved by pull request 1025

[jira] [Updated] (SPARK-2729) Forgot to match Timestamp type in ColumnBuilder

2014-07-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2729: Target Version/s: 1.1.0 Forgot to match Timestamp type in ColumnBuilder

[jira] [Updated] (SPARK-2406) Partitioned Parquet Support

2014-07-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2406: Target Version/s: 1.2.0 (was: 1.1.0) Partitioned Parquet Support

[jira] [Updated] (SPARK-2472) Spark SQL Thrift server sometimes assigns wrong job group name

2014-07-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2472: Target Version/s: 1.2.0 (was: 1.1.0) Spark SQL Thrift server sometimes assigns wrong job

[jira] [Resolved] (SPARK-2592) Make CACHE TABLE statement eager

2014-07-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2592. - Resolution: Won't Fix Assignee: Cheng Lian I think it's better to just be

[jira] [Resolved] (SPARK-2136) Spark SQL does not display the job description on web ui/ event log

2014-07-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2136. - Resolution: Fixed Fix Version/s: 1.1.0 I'm going to mark this as fixed since the

[jira] [Created] (SPARK-2735) Remove deprecation in jekyll for pygment in _config.yml

2014-07-29 Thread Rajiv Abraham (JIRA)
Rajiv Abraham created SPARK-2735: Summary: Remove deprecation in jekyll for pygment in _config.yml Key: SPARK-2735 URL: https://issues.apache.org/jira/browse/SPARK-2735 Project: Spark Issue

[jira] [Updated] (SPARK-2449) Spark sql reflection code requires a constructor taking all the fields for the table

2014-07-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2449: Target Version/s: 1.2.0 (was: 1.1.0) Spark sql reflection code requires a constructor

[jira] [Created] (SPARK-2736) Create Pyspark RDD from Apache Avro File

2014-07-29 Thread Eric Garcia (JIRA)
Eric Garcia created SPARK-2736: -- Summary: Create Pyspark RDD from Apache Avro File Key: SPARK-2736 URL: https://issues.apache.org/jira/browse/SPARK-2736 Project: Spark Issue Type: Improvement

[jira] [Updated] (SPARK-2736) Create Pyspark RDD from Apache Avro File

2014-07-29 Thread Eric Garcia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Garcia updated SPARK-2736: --- Description: There is a partially working example Avro Converter at this pull request:

[jira] [Commented] (SPARK-2631) In-memory Compression is not configured with SQLConf

2014-07-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078344#comment-14078344 ] Apache Spark commented on SPARK-2631: - User 'marmbrus' has created a pull request for

[jira] [Created] (SPARK-2737) ClassCastExceptions when collect()ing JavaRDDs' underlying Scala RDDs

2014-07-29 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-2737: - Summary: ClassCastExceptions when collect()ing JavaRDDs' underlying Scala RDDs Key: SPARK-2737 URL: https://issues.apache.org/jira/browse/SPARK-2737 Project: Spark

[jira] [Commented] (SPARK-2737) ClassCastExceptions when collect()ing JavaRDDs' underlying Scala RDDs

2014-07-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078398#comment-14078398 ] Apache Spark commented on SPARK-2737: - User 'JoshRosen' has created a pull request for

[jira] [Updated] (SPARK-2447) Add common solution for sending upsert actions to HBase (put, deletes, and increment)

2014-07-29 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-2447: - Assignee: Ted Malaska (was: Tathagata Das) Add common solution for sending upsert actions to

[jira] [Commented] (SPARK-1981) Add AWS Kinesis streaming support

2014-07-29 Thread Chris Fregly (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078422#comment-14078422 ] Chris Fregly commented on SPARK-1981: - [~matei] the ec2 scripts allow you to specify a

[jira] [Commented] (SPARK-2716) Having clause with no references fails to resolve

2014-07-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078446#comment-14078446 ] Apache Spark commented on SPARK-2716: - User 'marmbrus' has created a pull request for

[jira] [Commented] (SPARK-2397) Get rid of LocalHiveContext

2014-07-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078509#comment-14078509 ] Apache Spark commented on SPARK-2397: - User 'marmbrus' has created a pull request for

[jira] [Resolved] (SPARK-2393) Simple cost estimation and auto selection of broadcast join

2014-07-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2393. - Resolution: Fixed Fix Version/s: 1.1.0 Simple cost estimation and auto selection

[jira] [Commented] (SPARK-2712) Add a small note that mvn package must happen before test

2014-07-29 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078536#comment-14078536 ] Josh Rosen commented on SPARK-2712: --- I think that we wouldn't need this if we modified

[jira] [Updated] (SPARK-2576) slave node throws NoClassDefFoundError $line11.$read$ when executing a Spark QL query on HDFS CSV file

2014-07-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2576: Assignee: Prashant Sharma (was: Yin Huai) slave node throws NoClassDefFoundError

[jira] [Updated] (SPARK-2714) DAGScheduler logs jobid when runJob finishes

2014-07-29 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hamstra updated SPARK-2714: Issue Type: Improvement (was: Documentation) DAGScheduler logs jobid when runJob finishes

[jira] [Updated] (SPARK-2352) [MLLIB] Add Artificial Neural Network (ANN) to Spark

2014-07-29 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-2352: - Target Version/s: (was: 1.1.0) [MLLIB] Add Artificial Neural Network (ANN) to Spark

[jira] [Updated] (SPARK-2260) Spark submit standalone-cluster mode is broken

2014-07-29 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-2260: - Priority: Blocker (was: Major) Spark submit standalone-cluster mode is broken

[jira] [Commented] (SPARK-2392) Executors should not start their own HTTP servers

2014-07-29 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078594#comment-14078594 ] Andrew Or commented on SPARK-2392: -- https://github.com/apache/spark/pull/1335 Executors

[jira] [Updated] (SPARK-1630) PythonRDDs don't handle nulls gracefully

2014-07-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-1630: Component/s: (was: SQL) PythonRDDs don't handle nulls gracefully

[jira] [Created] (SPARK-2738) Remove redundant imports in BlockManagerSuite

2014-07-29 Thread Sandy Ryza (JIRA)
Sandy Ryza created SPARK-2738: - Summary: Remove redundant imports in BlockManagerSuite Key: SPARK-2738 URL: https://issues.apache.org/jira/browse/SPARK-2738 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-2738) Remove redundant imports in BlockManagerSuite

2014-07-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078661#comment-14078661 ] Apache Spark commented on SPARK-2738: - User 'sryza' has created a pull request for

[jira] [Created] (SPARK-2739) Rename registerAsTable to registerTempTable

2014-07-29 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-2739: --- Summary: Rename registerAsTable to registerTempTable Key: SPARK-2739 URL: https://issues.apache.org/jira/browse/SPARK-2739 Project: Spark Issue Type:

[jira] [Commented] (SPARK-1740) Pyspark cancellation kills unrelated pyspark workers

2014-07-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078673#comment-14078673 ] Apache Spark commented on SPARK-1740: - User 'davies' has created a pull request for

[jira] [Commented] (SPARK-2197) Spark invoke DecisionTree by Java

2014-07-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078675#comment-14078675 ] Joseph K. Bradley commented on SPARK-2197: -- This error is at least partly caused

[jira] [Commented] (SPARK-2737) ClassCastExceptions when collect()ing JavaRDDs' underlying Scala RDDs

2014-07-29 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078705#comment-14078705 ] Joseph K. Bradley commented on SPARK-2737: -- Relating to [SPARK-2197 Spark invoke

[jira] [Commented] (SPARK-2702) Upgrade Tachyon dependency to 0.5.0

2014-07-29 Thread Haoyuan Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078704#comment-14078704 ] Haoyuan Li commented on SPARK-2702: --- New dependency introduced by 0.5.0: [INFO]+-

[jira] [Resolved] (SPARK-2716) Having clause with no references fails to resolve

2014-07-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2716. - Resolution: Fixed Fix Version/s: 1.1.0 Having clause with no references fails to

[jira] [Commented] (SPARK-2000) cannot connect to cluster in Standalone mode when run spark-shell in one of the cluster node without specify master

2014-07-29 Thread Chen Chao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078738#comment-14078738 ] Chen Chao commented on SPARK-2000: -- [~pwendell] We can close this issue I think. There's

[jira] [Closed] (SPARK-2000) cannot connect to cluster in Standalone mode when run spark-shell in one of the cluster node without specify master

2014-07-29 Thread Chen Chao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Chao closed SPARK-2000. Resolution: Not a Problem cannot connect to cluster in Standalone mode when run spark-shell in one of

[jira] [Resolved] (SPARK-2631) In-memory Compression is not configured with SQLConf

2014-07-29 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2631. - Resolution: Fixed Fix Version/s: 1.1.0 In-memory Compression is not configured

[jira] [Resolved] (SPARK-2305) pyspark - depend on py4j 0.8.1

2014-07-29 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2305. -- Resolution: Fixed Fix Version/s: 1.1.0 pyspark - depend on py4j 0.8.1

[jira] [Created] (SPARK-2740) In JavaPairRdd, allow user to specify ascending and numPartitions for sortByKey

2014-07-29 Thread Rui Li (JIRA)
Rui Li created SPARK-2740: - Summary: In JavaPairRdd, allow user to specify ascending and numPartitions for sortByKey Key: SPARK-2740 URL: https://issues.apache.org/jira/browse/SPARK-2740 Project: Spark

[jira] [Commented] (SPARK-2387) Remove the stage barrier for better resource utilization

2014-07-29 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078798#comment-14078798 ] Rui Li commented on SPARK-2387: --- Right, thanks [~joshrosen] for pointing out. This is just

[jira] [Created] (SPARK-2741) Publish version of spark assembly which does not contain Hive

2014-07-29 Thread Brock Noland (JIRA)
Brock Noland created SPARK-2741: --- Summary: Publish version of spark assembly which does not contain Hive Key: SPARK-2741 URL: https://issues.apache.org/jira/browse/SPARK-2741 Project: Spark

[jira] [Created] (SPARK-2742) The variable inputFormatInfo and inputFormatMap never used

2014-07-29 Thread meiyoula (JIRA)
meiyoula created SPARK-2742: --- Summary: The variable inputFormatInfo and inputFormatMap never used Key: SPARK-2742 URL: https://issues.apache.org/jira/browse/SPARK-2742 Project: Spark Issue Type:

[jira] [Commented] (SPARK-2741) Publish version of spark assembly which does not contain Hive

2014-07-29 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14078813#comment-14078813 ] Xuefu Zhang commented on SPARK-2741: cc: [~rxin], [~sandyr] Publish version of spark

[jira] [Commented] (SPARK-2741) Publish version of spark assembly which does not contain Hive

2014-07-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078818#comment-14078818 ] Reynold Xin commented on SPARK-2741: cc [~pwendell] Publish version of spark

[jira] [Commented] (SPARK-2741) Publish version of spark assembly which does not contain Hive

2014-07-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078819#comment-14078819 ] Reynold Xin commented on SPARK-2741: Actually as I understand the assembly can be

[jira] [Commented] (SPARK-2741) Publish version of spark assembly which does not contain Hive

2014-07-29 Thread Brock Noland (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078832#comment-14078832 ] Brock Noland commented on SPARK-2741: - I understand that may be true, though I didn't

[jira] [Created] (SPARK-2743) Parquet has issues with capital letters and case insensitivity

2014-07-29 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-2743: --- Summary: Parquet has issues with capital letters and case insensitivity Key: SPARK-2743 URL: https://issues.apache.org/jira/browse/SPARK-2743 Project: Spark

[jira] [Commented] (SPARK-2741) Publish version of spark assembly which does not contain Hive

2014-07-29 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078835#comment-14078835 ] Xuefu Zhang commented on SPARK-2741: I did see a profile about Hive. However, it seems

[jira] [Commented] (SPARK-2743) Parquet has issues with capital letters and case insensitivity

2014-07-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078877#comment-14078877 ] Apache Spark commented on SPARK-2743: - User 'marmbrus' has created a pull request for

[jira] [Commented] (SPARK-2585) Remove special handling of Hadoop JobConf

2014-07-29 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078879#comment-14078879 ] Apache Spark commented on SPARK-2585: - User 'rxin' has created a pull request for this

[jira] [Commented] (SPARK-1812) Support cross-building with Scala 2.11

2014-07-29 Thread Anand Avati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078887#comment-14078887 ] Anand Avati commented on SPARK-1812: So the akka-2.3.x incompatibility turns out to be

[jira] [Commented] (SPARK-2741) Publish version of spark assembly which does not contain Hive

2014-07-29 Thread Brock Noland (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078891#comment-14078891 ] Brock Noland commented on SPARK-2741: - Yes, after looking into it more, to include

[jira] [Updated] (SPARK-2741) Publish version of spark assembly which does not contain Hive

2014-07-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2741: --- Target Version/s: 1.1.0 Publish version of spark assembly which does not contain Hive

[jira] [Updated] (SPARK-2741) Publish version of spark assembly which does not contain Hive

2014-07-29 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-2741: --- Assignee: Patrick Wendell Publish version of spark assembly which does not contain Hive
