[jira] [Updated] (SPARK-1836) REPL $outer type mismatch causes lookup() and equals() problems

2014-05-16 Thread Michael Malak (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Malak updated SPARK-1836: - Description: Anand Avati partially traced the cause to REPL wrapping classes in $outer classes.

[jira] [Commented] (SPARK-1781) Generalized validity checking for configuration parameters

2014-05-16 Thread Erik Erlandson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993903#comment-13993903 ] Erik Erlandson commented on SPARK-1781: --- Ideally, pre-fab predicates could be

[jira] [Updated] (SPARK-1647) Prevent data loss when Streaming driver goes down

2014-05-16 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1647: - Fix Version/s: 1.1.0 Prevent data loss when Streaming driver goes down

[jira] [Updated] (SPARK-1850) Bad exception if multiple jars exist when running PySpark

2014-05-16 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-1850: - Description: {code} Found multiple Spark assembly jars in

[jira] [Reopened] (SPARK-1860) Standalone Worker cleanup should not clean up running applications by default

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell reopened SPARK-1860: The PR actually just disabled it, it didn't fix this. Standalone Worker cleanup should not

[jira] [Updated] (SPARK-1839) PySpark take() does not launch a Spark job when it has to

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1839: --- Fix Version/s: 1.1.0 PySpark take() does not launch a Spark job when it has to

[jira] [Created] (SPARK-1850) Bad exception if multiple jars exist when running PySpark

2014-05-16 Thread Andrew Or (JIRA)
Andrew Or created SPARK-1850: Summary: Bad exception if multiple jars exist when running PySpark Key: SPARK-1850 URL: https://issues.apache.org/jira/browse/SPARK-1850 Project: Spark Issue Type:

[jira] [Updated] (SPARK-1359) SGD implementation is not efficient

2014-05-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1359: - Affects Version/s: 1.0.0 SGD implementation is not efficient

[jira] [Resolved] (SPARK-1110) Clean up and clarify use of SPARK_HOME

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1110. Resolution: Fixed This was subsumed by the other configuration clean-up. Clean up and

[jira] [Updated] (SPARK-1849) Broken UTF-8 encoded data gets character replacements and thus can't be fixed

2014-05-16 Thread Harry Brundage (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harry Brundage updated SPARK-1849: -- Attachment: encoding_test Here's the windows encoded file I was using to test with if you'd

[jira] [Updated] (SPARK-911) Support map pruning on sorted (K, V) RDD's

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-911: -- Fix Version/s: (was: 1.0.0) Support map pruning on sorted (K, V) RDD's

[jira] [Created] (SPARK-1853) Show Streaming application code context (file, line number) in Spark Stages UI

2014-05-16 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-1853: Summary: Show Streaming application code context (file, line number) in Spark Stages UI Key: SPARK-1853 URL: https://issues.apache.org/jira/browse/SPARK-1853

[jira] [Updated] (SPARK-1741) Add predict(JavaRDD) to predictive models

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1741: --- Fix Version/s: 1.0.1 Add predict(JavaRDD) to predictive models

[jira] [Updated] (SPARK-874) Have version of `sc.stop()` that blocks until all executors are cleaned up.

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-874: -- Fix Version/s: (was: 1.0.0) 1.1.0 Have version of `sc.stop()` that

[jira] [Updated] (SPARK-1485) Implement AllReduce

2014-05-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1485: - Priority: Critical (was: Major) Implement AllReduce ---

[jira] [Updated] (SPARK-1487) Support record filtering via predicate pushdown in Parquet

2014-05-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-1487: Fix Version/s: (was: 1.1.0) Support record filtering via predicate pushdown in

[jira] [Updated] (SPARK-1704) java.lang.AssertionError: assertion failed: No plan for ExplainCommand (Project [*])

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1704: --- Fix Version/s: (was: 1.0.0) 1.1.0 java.lang.AssertionError:

[jira] [Updated] (SPARK-1820) Make GenerateMimaIgnore @DeveloperApi annotation aware.

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1820: --- Assignee: Prashant Sharma Make GenerateMimaIgnore @DeveloperApi annotation aware.

[jira] [Commented] (SPARK-1826) Some bad head notations in sparksql

2014-05-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998988#comment-13998988 ] Michael Armbrust commented on SPARK-1826: - Fixed in:

[jira] [Updated] (SPARK-1669) SQLContext.cacheTable() should be idempotent

2014-05-16 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-1669: -- Description: Calling {{cacheTable()}} on some table {{t}} multiple times causes table {{t}} to be

[jira] [Resolved] (SPARK-1633) Various examples for Scala and Java custom receiver, etc.

2014-05-16 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-1633. -- Resolution: Fixed Fix Version/s: 1.0.0 Various examples for Scala and Java custom

[jira] [Updated] (SPARK-732) Recomputation of RDDs may result in duplicated accumulator updates

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-732: -- Fix Version/s: (was: 1.0.0) Recomputation of RDDs may result in duplicated accumulator

[jira] [Updated] (SPARK-1389) Make numPartitions in Exchange configurable

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1389: --- Fix Version/s: (was: 1.0.0) 1.1.0 Make numPartitions in Exchange

[jira] [Commented] (SPARK-1603) flaky test case in StreamingContextSuite

2014-05-16 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999241#comment-13999241 ] Tathagata Das commented on SPARK-1603: -- I think we haven't seen the flakiness since

[jira] [Commented] (SPARK-1154) Spark fills up disk with app-* folders

2014-05-16 Thread Mingyu Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999562#comment-13999562 ] Mingyu Kim commented on SPARK-1154: --- I looked at the commit, and it seems like it wipes

[jira] [Resolved] (SPARK-1826) Some bad head notations in sparksql

2014-05-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-1826. - Resolution: Fixed Some bad head notations in sparksql

[jira] [Created] (SPARK-1849) Broken UTF-8 encoded data gets character replacements and thus can't be fixed

2014-05-16 Thread Harry Brundage (JIRA)
Harry Brundage created SPARK-1849: - Summary: Broken UTF-8 encoded data gets character replacements and thus can't be fixed Key: SPARK-1849 URL: https://issues.apache.org/jira/browse/SPARK-1849

[jira] [Commented] (SPARK-1782) svd for sparse matrix using ARPACK

2014-05-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999499#comment-13999499 ] Xiangrui Meng commented on SPARK-1782: -- Btw, this approach only gives us \Sigma and

[jira] [Updated] (SPARK-1752) Standardize input/output format for vectors and labeled points

2014-05-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1752: - Fix Version/s: 1.1.0 Standardize input/output format for vectors and labeled points

[jira] [Commented] (SPARK-1845) Use AllScalaRegistrar for SparkSqlSerializer to register serializers of Scala collections.

2014-05-16 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998680#comment-13998680 ] Takuya Ueshin commented on SPARK-1845: -- Pull-requested:

[jira] [Resolved] (SPARK-1845) Use AllScalaRegistrar for SparkSqlSerializer to register serializers of Scala collections.

2014-05-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-1845. - Resolution: Fixed Use AllScalaRegistrar for SparkSqlSerializer to register serializers

[jira] [Updated] (SPARK-1850) Bad exception if multiple jars exist when running PySpark

2014-05-16 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-1850: - Description: {code} Found multiple Spark assembly jars in

[jira] [Commented] (SPARK-1830) Deploy failover, Make Persistence engine and LeaderAgent Pluggable.

2014-05-16 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998549#comment-13998549 ] Prashant Sharma commented on SPARK-1830: PR at

[jira] [Commented] (SPARK-944) Give example of writing to HBase from Spark Streaming

2014-05-16 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999226#comment-13999226 ] Tathagata Das commented on SPARK-944: - Hi Kanwal, the usual process of the contributing

[jira] [Updated] (SPARK-1022) Add unit tests for kafka streaming

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1022: --- Fix Version/s: (was: 1.0.0) Add unit tests for kafka streaming

[jira] [Updated] (SPARK-1749) DAGScheduler supervisor strategy broken with Mesos

2014-05-16 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Hamstra updated SPARK-1749: Fix Version/s: 1.0.1 DAGScheduler supervisor strategy broken with Mesos

[jira] [Resolved] (SPARK-1603) flaky test case in StreamingContextSuite

2014-05-16 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-1603. -- Resolution: Fixed flaky test case in StreamingContextSuite

[jira] [Updated] (SPARK-1442) Add Window function support

2014-05-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-1442: Fix Version/s: 1.1.0 Add Window function support ---

[jira] [Resolved] (SPARK-1741) Add predict(JavaRDD) to predictive models

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1741. Resolution: Fixed Fix Version/s: 1.0.0 Issue resolved by pull request 670

[jira] [Updated] (SPARK-1860) Standalone Worker cleanup should not clean up running applications

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1860: --- Assignee: (was: Aaron Davidson) Standalone Worker cleanup should not clean up running

[jira] [Updated] (SPARK-1824) Python examples still take in master

2014-05-16 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-1824: - Issue Type: Improvement (was: Bug) Python examples still take in master

[jira] [Updated] (SPARK-911) Support map pruning on sorted (K, V) RDD's

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-911: -- Description: If someone has sorted a (K, V) rdd, we should offer them a way to filter a range

[jira] [Updated] (SPARK-874) Have version of `sc.stop()` that blocks until all executors are cleaned up.

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-874: -- Assignee: (was: Patrick Cogan) Have version of `sc.stop()` that blocks until all executors

[jira] [Updated] (SPARK-874) Have a --wait flag in ./sbin/stop-all.sh that polls until Worker's are finished

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-874: -- Description: When running benchmarking jobs, sometimes the cluster takes a long time to shut

[jira] [Commented] (SPARK-1860) Standalone Worker cleanup should not clean up running applications

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999602#comment-13999602 ] Patrick Wendell commented on SPARK-1860: I think it would be better to only start

[jira] [Updated] (SPARK-874) Have version of `sc.stop()` that blocks until all executors are cleaned up.

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-874: -- Labels: starter (was: ) Have version of `sc.stop()` that blocks until all executors are

[jira] [Updated] (SPARK-944) Give example of writing to HBase from Spark Streaming

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-944: -- Assignee: (was: Patrick Cogan) Give example of writing to HBase from Spark Streaming

[jira] [Updated] (SPARK-1792) Missing Spark-Shell Configure Options

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1792: --- Fix Version/s: 1.1.0 Missing Spark-Shell Configure Options

[jira] [Updated] (SPARK-1478) Upgrade FlumeInputDStream's FlumeReceiver to support FLUME-1915

2014-05-16 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1478: - Fix Version/s: 1.1.0 Upgrade FlumeInputDStream's FlumeReceiver to support FLUME-1915

[jira] [Updated] (SPARK-1820) Make GenerateMimaIgnore @DeveloperApi annotation aware.

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1820: --- Affects Version/s: (was: 1.1.0) Make GenerateMimaIgnore @DeveloperApi annotation aware.

[jira] [Commented] (SPARK-1154) Spark fills up disk with app-* folders

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999601#comment-13999601 ] Patrick Wendell commented on SPARK-1154: [~mkim] yes you are correct - this is

[jira] [Updated] (SPARK-874) Have a --wait flag in ./sbin/stop-all.sh that polls until Worker's are finished

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-874: -- Description: When running benchmarking jobs, sometimes the cluster takes a long time to shut

[jira] [Updated] (SPARK-1851) Upgrade Avro dependency to 1.7.6 so Spark can read Avro files

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1851: --- Assignee: Sandy Ryza Upgrade Avro dependency to 1.7.6 so Spark can read Avro files

[jira] [Updated] (SPARK-1830) Deploy failover, Make Persistence engine and LeaderAgent Pluggable.

2014-05-16 Thread Prashant Sharma (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Sharma updated SPARK-1830: --- Fix Version/s: 1.0.1 1.1.0 Deploy failover, Make Persistence engine and

[jira] [Created] (SPARK-1846) RAT checks should exclude logs/ directory

2014-05-16 Thread Andrew Ash (JIRA)
Andrew Ash created SPARK-1846: - Summary: RAT checks should exclude logs/ directory Key: SPARK-1846 URL: https://issues.apache.org/jira/browse/SPARK-1846 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-1836) REPL $outer type mismatch causes lookup() and equals() problems

2014-05-16 Thread Michael Malak (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998807#comment-13998807 ] Michael Malak commented on SPARK-1836: -- Michael Armbrust: Indeed. Do you think I

[jira] [Updated] (SPARK-1730) Make receiver store data reliably to avoid data-loss on executor failures

2014-05-16 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1730: - Fix Version/s: 1.1.0 Make receiver store data reliably to avoid data-loss on executor failures

[jira] [Created] (SPARK-1858) Update third-party Hadoop distros doc to list more distros

2014-05-16 Thread Matei Zaharia (JIRA)
Matei Zaharia created SPARK-1858: Summary: Update third-party Hadoop distros doc to list more distros Key: SPARK-1858 URL: https://issues.apache.org/jira/browse/SPARK-1858 Project: Spark

[jira] [Updated] (SPARK-1553) Support alternating nonnegative least-squares

2014-05-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1553: - Fix Version/s: 1.1.0 Support alternating nonnegative least-squares

[jira] [Created] (SPARK-1854) Add a version of StreamingContext.fileStream that take hadoop conf object

2014-05-16 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-1854: Summary: Add a version of StreamingContext.fileStream that take hadoop conf object Key: SPARK-1854 URL: https://issues.apache.org/jira/browse/SPARK-1854 Project:

[jira] [Commented] (SPARK-1782) svd for sparse matrix using ARPACK

2014-05-16 Thread Li Pu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14000232#comment-14000232 ] Li Pu commented on SPARK-1782: -- [~mengxr] thank you for the comments! You are right, (A^T A)

[jira] [Updated] (SPARK-874) Have version of `sc.stop()` that blocks until all executors are cleaned up.

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-874: -- Component/s: Deploy Have version of `sc.stop()` that blocks until all executors are cleaned

[jira] [Updated] (SPARK-944) Give example of writing to HBase from Spark Streaming

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-944?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-944: -- Fix Version/s: (was: 1.0.0) Give example of writing to HBase from Spark Streaming

[jira] [Commented] (SPARK-1368) HiveTableScan is slow

2014-05-16 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998619#comment-13998619 ] Cheng Lian commented on SPARK-1368: --- Corresponding PR:

[jira] [Commented] (SPARK-1585) Not robust Lasso causes Infinity on weights and losses

2014-05-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999123#comment-13999123 ] Xiangrui Meng commented on SPARK-1585: -- I think the gradient should pull the weights

[jira] [Resolved] (SPARK-1851) Upgrade Avro dependency to 1.7.6 so Spark can read Avro files

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1851. Resolution: Fixed Fix Version/s: 1.0.0 Issue resolved by pull request 795

[jira] [Created] (SPARK-1855) Provide memory-and-local-disk RDD checkpointing

2014-05-16 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-1855: Summary: Provide memory-and-local-disk RDD checkpointing Key: SPARK-1855 URL: https://issues.apache.org/jira/browse/SPARK-1855 Project: Spark Issue Type:

[jira] [Updated] (SPARK-1485) Implement AllReduce

2014-05-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1485: - Fix Version/s: 1.1.0 Implement AllReduce --- Key: SPARK-1485

[jira] [Created] (SPARK-1852) SparkSQL Queries with Sorts run before the user asks them to

2014-05-16 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-1852: --- Summary: SparkSQL Queries with Sorts run before the user asks them to Key: SPARK-1852 URL: https://issues.apache.org/jira/browse/SPARK-1852 Project: Spark

[jira] [Updated] (SPARK-1729) Make Flume pull data from source, rather than the current push model

2014-05-16 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1729: - Fix Version/s: 1.1.0 Make Flume pull data from source, rather than the current push model

[jira] [Resolved] (SPARK-1230) Enable SparkContext.addJars() to load classes not in CLASSPATH

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1230. Resolution: Incomplete I forget what this actually means (hah) so I'm gonna close it for

[jira] [Resolved] (SPARK-1638) Executors fail to come up if spark.executor.extraJavaOptions is set

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1638. Resolution: Duplicate Executors fail to come up if spark.executor.extraJavaOptions is set

[jira] [Updated] (SPARK-1368) HiveTableScan is slow

2014-05-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-1368: Assignee: Cheng Lian HiveTableScan is slow - Key:

[jira] [Created] (SPARK-1847) Pushdown filters on non-required parquet columns

2014-05-16 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-1847: --- Summary: Pushdown filters on non-required parquet columns Key: SPARK-1847 URL: https://issues.apache.org/jira/browse/SPARK-1847 Project: Spark Issue

[jira] [Updated] (SPARK-1768) History Server enhancements

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1768: --- Fix Version/s: 1.1.0 History Server enhancements ---

[jira] [Created] (SPARK-1862) Add build support for MapR

2014-05-16 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-1862: -- Summary: Add build support for MapR Key: SPARK-1862 URL: https://issues.apache.org/jira/browse/SPARK-1862 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-1853) Show Streaming application code context (file, line number) in Spark Stages UI

2014-05-16 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1853: - Fix Version/s: 1.1.0 Show Streaming application code context (file, line number) in Spark

[jira] [Updated] (SPARK-1682) Add gradient descent w/o sampling and RDA L1 updater

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1682: --- Fix Version/s: (was: 1.0.0) Add gradient descent w/o sampling and RDA L1 updater

[jira] [Updated] (SPARK-1272) Don't fail job if some local directories are buggy

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1272: --- Fix Version/s: (was: 1.0.0) 1.1.0 Don't fail job if some local

[jira] [Commented] (SPARK-1782) svd for sparse matrix using ARPACK

2014-05-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999493#comment-13999493 ] Xiangrui Meng commented on SPARK-1782: -- This sounds good to me. Let's assume that A

[jira] [Updated] (SPARK-1580) ALS: Estimate communication and computation costs given a partitioner

2014-05-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1580: - Fix Version/s: 1.1.0 ALS: Estimate communication and computation costs given a partitioner

[jira] [Commented] (SPARK-1813) Add a utility to SparkConf that makes using Kryo really easy

2014-05-16 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998557#comment-13998557 ] Sandy Ryza commented on SPARK-1813: --- https://github.com/apache/spark/pull/789 is what I

[jira] [Updated] (SPARK-874) Have a --wait flag in ./sbin/stop-all.sh that polls until Worker's are finished

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-874: -- Summary: Have a --wait flag in ./sbin/stop-all.sh that polls until Worker's are finished (was:

[jira] [Created] (SPARK-1856) Standardize MLlib interfaces

2014-05-16 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-1856: Summary: Standardize MLlib interfaces Key: SPARK-1856 URL: https://issues.apache.org/jira/browse/SPARK-1856 Project: Spark Issue Type: New Feature

[jira] [Created] (SPARK-1860) Standalone Worker cleanup should not clean up running applications by default

2014-05-16 Thread Aaron Davidson (JIRA)
Aaron Davidson created SPARK-1860: - Summary: Standalone Worker cleanup should not clean up running applications by default Key: SPARK-1860 URL: https://issues.apache.org/jira/browse/SPARK-1860

[jira] [Created] (SPARK-1861) ArrayIndexOutOfBoundsException when reading bzip2 files

2014-05-16 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-1861: Summary: ArrayIndexOutOfBoundsException when reading bzip2 files Key: SPARK-1861 URL: https://issues.apache.org/jira/browse/SPARK-1861 Project: Spark Issue

[jira] [Updated] (SPARK-1585) Not robust Lasso causes Infinity on weights and losses

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1585: --- Fix Version/s: (was: 1.0.0) 1.1.0 Not robust Lasso causes Infinity

[jira] [Updated] (SPARK-1486) Support multi-model training in MLlib

2014-05-16 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-1486: - Priority: Critical (was: Major) Support multi-model training in MLlib

[jira] [Resolved] (SPARK-1765) Modify a typo in monitoring.md

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1765. Resolution: Fixed Fix Version/s: 1.0.0 Modify a typo in monitoring.md

[jira] [Created] (SPARK-1859) Linear, Ridge and Lasso Regressions with SGD yield unexpected results

2014-05-16 Thread Vlad Frolov (JIRA)
Vlad Frolov created SPARK-1859: -- Summary: Linear, Ridge and Lasso Regressions with SGD yield unexpected results Key: SPARK-1859 URL: https://issues.apache.org/jira/browse/SPARK-1859 Project: Spark

[jira] [Updated] (SPARK-1630) PythonRDDs don't handle nulls gracefully

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1630: --- Fix Version/s: (was: 1.0.0) 1.1.0 PythonRDDs don't handle nulls

[jira] [Updated] (SPARK-1600) flaky test case in streaming.CheckpointSuite

2014-05-16 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-1600: - Fix Version/s: 1.1.0 flaky test case in streaming.CheckpointSuite

[jira] [Updated] (SPARK-1487) Support record filtering via predicate pushdown in Parquet

2014-05-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-1487: Fix Version/s: 1.1.0 Support record filtering via predicate pushdown in Parquet

[jira] [Resolved] (SPARK-1340) Some Spark Streaming receivers are not restarted when worker fails

2014-05-16 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-1340. -- Resolution: Fixed Resolved with https://issues.apache.org/jira/browse/SPARK-1332 Some Spark

[jira] [Created] (SPARK-1848) Executors are mysteriously dying when using Spark on Mesos

2014-05-16 Thread Bouke van der Bijl (JIRA)
Bouke van der Bijl created SPARK-1848: - Summary: Executors are mysteriously dying when using Spark on Mesos Key: SPARK-1848 URL: https://issues.apache.org/jira/browse/SPARK-1848 Project: Spark

[jira] [Resolved] (SPARK-1810) The spark tar ball does not unzip into a separate folder when un-tarred.

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1810. Resolution: Cannot Reproduce The spark tar ball does not unzip into a separate folder

[jira] [Commented] (SPARK-1741) Add predict(JavaRDD) to predictive models

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999115#comment-13999115 ] Patrick Wendell commented on SPARK-1741: This might end up in 1.0 or not depending

[jira] [Commented] (SPARK-1487) Support record filtering via predicate pushdown in Parquet

2014-05-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999034#comment-13999034 ] Michael Armbrust commented on SPARK-1487: - PR here:

[jira] [Comment Edited] (SPARK-1154) Spark fills up disk with app-* folders

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13999601#comment-13999601 ] Patrick Wendell edited comment on SPARK-1154 at 5/16/14 5:00 AM:

[jira] [Updated] (SPARK-1623) SPARK-1623. Broadcast cleaner should use getCanonicalPath when deleting files by name

2014-05-16 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1623: --- Fix Version/s: (was: 1.0.0) SPARK-1623. Broadcast cleaner should use getCanonicalPath
