[jira] [Updated] (SPARK-2629) Improve performance of DStream.updateStateByKey

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-2629: - Summary: Improve performance of DStream.updateStateByKey (was: Improve performance of DStream.upd

[jira] [Created] (SPARK-6235) Address various 2G limits

2015-03-09 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-6235: -- Summary: Address various 2G limits Key: SPARK-6235 URL: https://issues.apache.org/jira/browse/SPARK-6235 Project: Spark Issue Type: Umbrella Components

[jira] [Updated] (SPARK-5042) Updated Receiver API to make it easier to write reliable receivers that ack source

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5042: - Target Version/s: (was: 1.4.0) > Updated Receiver API to make it easier to write reliable receiv

[jira] [Commented] (SPARK-2629) Improve performance of DStream.updateStateByKey

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353906#comment-14353906 ] Tathagata Das commented on SPARK-2629: -- Since IndexRDD is not supposed to be added to

[jira] [Commented] (SPARK-5042) Updated Receiver API to make it easier to write reliable receivers that ack source

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353908#comment-14353908 ] Tathagata Das commented on SPARK-5042: -- This is being deprioritized for features that

[jira] [Updated] (SPARK-5044) Update ReliableKafkaReceiver to use updated Receiver API

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5044: - Target Version/s: (was: 1.3.0) > Update ReliableKafkaReceiver to use updated Receiver API >

[jira] [Comment Edited] (SPARK-5042) Updated Receiver API to make it easier to write reliable receivers that ack source

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353908#comment-14353908 ] Tathagata Das edited comment on SPARK-5042 at 3/9/15 11:55 PM: -

[jira] [Commented] (SPARK-6234) 10% Performance regression with Breeze upgrade

2015-03-09 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353909#comment-14353909 ] Xiangrui Meng commented on SPARK-6234: -- [~nravi] This seems to be a regression in Bre

[jira] [Updated] (SPARK-5559) Flaky test: o.a.s.streaming.flume.FlumeStreamSuite

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5559: - Target Version/s: 1.4.0 (was: 1.3.0) > Flaky test: o.a.s.streaming.flume.FlumeStreamSuite > -

[jira] [Updated] (SPARK-6190) create LargeByteBuffer abstraction for eliminating 2GB limit on blocks

2015-03-09 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-6190: --- Issue Type: Sub-task (was: Improvement) Parent: SPARK-6235 > create LargeByteBuffer abstracti

[jira] [Updated] (SPARK-5155) Python API for MQTT streaming

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5155: - Target Version/s: 1.4.0 (was: 1.3.0) > Python API for MQTT streaming > --

[jira] [Created] (SPARK-6236) Support caching blocks larger than 2G

2015-03-09 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-6236: -- Summary: Support caching blocks larger than 2G Key: SPARK-6236 URL: https://issues.apache.org/jira/browse/SPARK-6236 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-5155) Python API for MQTT streaming

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353913#comment-14353913 ] Tathagata Das commented on SPARK-5155: -- This issue is still blocking on us figuring o

[jira] [Created] (SPARK-6238) Support shuffle where individual blocks might be > 2G

2015-03-09 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-6238: -- Summary: Support shuffle where individual blocks might be > 2G Key: SPARK-6238 URL: https://issues.apache.org/jira/browse/SPARK-6238 Project: Spark Issue Type: S

[jira] [Created] (SPARK-6237) Support network transfer for blocks larger than 2G

2015-03-09 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-6237: -- Summary: Support network transfer for blocks larger than 2G Key: SPARK-6237 URL: https://issues.apache.org/jira/browse/SPARK-6237 Project: Spark Issue Type: Sub-

[jira] [Updated] (SPARK-5045) Update FlumePollingReceiver to use updated Receiver API

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5045: - Target Version/s: (was: 1.3.0) > Update FlumePollingReceiver to use updated Receiver API > -

[jira] [Updated] (SPARK-5048) Add Flume to the Python Streaming API

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5048: - Assignee: Hari Shreedharan > Add Flume to the Python Streaming API > -

[jira] [Commented] (SPARK-5048) Add Flume to the Python Streaming API

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353918#comment-14353918 ] Tathagata Das commented on SPARK-5048: -- [~hshreedharan] Can you take a crack at this?

[jira] [Updated] (SPARK-5048) Add Flume to the Python Streaming API

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5048: - Target Version/s: 1.4.0 (was: 1.3.0) > Add Flume to the Python Streaming API > --

[jira] [Updated] (SPARK-5046) Update KinesisReceiver to use updated Receiver API

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5046: - Target Version/s: 1.4.0 (was: 1.3.0) > Update KinesisReceiver to use updated Receiver API > -

[jira] [Updated] (SPARK-5205) Inconsistent behaviour between Streaming job and others, when click kill link in WebUI

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5205: - Target Version/s: 1.4.0, 1.3.1 (was: 1.3.0, 1.2.1) > Inconsistent behaviour between Streaming job

[jira] [Commented] (SPARK-6190) create LargeByteBuffer abstraction for eliminating 2GB limit on blocks

2015-03-09 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353940#comment-14353940 ] Reynold Xin commented on SPARK-6190: Hi [~imranr], As I said earlier, I would advise

[jira] [Comment Edited] (SPARK-6190) create LargeByteBuffer abstraction for eliminating 2GB limit on blocks

2015-03-09 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353940#comment-14353940 ] Reynold Xin edited comment on SPARK-6190 at 3/10/15 12:14 AM: --

[jira] [Updated] (SPARK-5682) Add encrypted shuffle in spark

2015-03-09 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liyunzhang_intel updated SPARK-5682: Summary: Add encrypted shuffle in spark (was: Reuse hadoop encrypted shuffle algorithm to e

[jira] [Updated] (SPARK-6222) [STREAMING] All data may not be recovered from WAL when driver is killed

2015-03-09 Thread Hari Shreedharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Shreedharan updated SPARK-6222: Attachment: AfterPatch.txt CleanWithoutPatch.txt Here are the logs: CleanWi

[jira] [Commented] (SPARK-4118) Create python bindings for Streaming KMeans

2015-03-09 Thread mat (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353963#comment-14353963 ] mat commented on SPARK-4118: Is there a plan to support this? > Create python bindings for S

[jira] [Updated] (SPARK-6177) LDA should check partitions size of the input

2015-03-09 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-6177: -- Description: Add comment to introduce coalesce to LDA example to avoid the possible massive partitions

[jira] [Updated] (SPARK-6177) Add note for

2015-03-09 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-6177: -- Summary: Add note for (was: LDA should check partitions size of the input) > Add note for > -

[jira] [Updated] (SPARK-6177) Add note in LDA example to remind possible coalesce

2015-03-09 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang updated SPARK-6177: -- Summary: Add note in LDA example to remind possible coalesce (was: Add note for ) > Add note in LDA e

[jira] [Commented] (SPARK-5523) TaskMetrics and TaskInfo have innumerable copies of the hostname string

2015-03-09 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354078#comment-14354078 ] Saisai Shao commented on SPARK-5523: Hi [~tdas], I will take a look at this issue and

[jira] [Commented] (SPARK-6211) Test Python Kafka API using Python unit tests

2015-03-09 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354088#comment-14354088 ] Saisai Shao commented on SPARK-6211: Thanks TD for your suggestion, I was thinking of

[jira] [Commented] (SPARK-6211) Test Python Kafka API using Python unit tests

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354094#comment-14354094 ] Tathagata Das commented on SPARK-6211: -- Yes. It would be good to keep the change at t

[jira] [Updated] (SPARK-5817) UDTF column names didn't set properly

2015-03-09 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-5817: - Description: {code} createQueryTest("Specify the udtf output", select d from (select explode(array(key,1)

[jira] [Updated] (SPARK-5817) UDTF column names didn't set properly

2015-03-09 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-5817: - Description: createQueryTest("Specify the udtf output", """ select d from (select explode(array(key,1)

[jira] [Commented] (SPARK-6222) [STREAMING] All data may not be recovered from WAL when driver is killed

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354122#comment-14354122 ] Tathagata Das commented on SPARK-6222: -- Offline discussion with Hari, these logs dont

[jira] [Updated] (SPARK-5817) UDTF column names didn't set properly

2015-03-09 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-5817: - Description: {code} createQueryTest("Specify the udtf output", select d from (select explode(array(1,1))

[jira] [Updated] (SPARK-5817) UDTF column names didn't set properly

2015-03-09 Thread Cheng Hao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-5817: - Description: {code} createQueryTest("Specify the udtf output", "select d from (select explode(array(1,1))

[jira] [Updated] (SPARK-6222) [STREAMING] All data may not be recovered from WAL when driver is killed

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-6222: - Target Version/s: 1.4.0 (was: 1.3.0) > [STREAMING] All data may not be recovered from WAL when dr

[jira] [Updated] (SPARK-5659) Flaky test: o.a.s.streaming.ReceiverSuite.block

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5659: - Target Version/s: 1.4.0 > Flaky test: o.a.s.streaming.ReceiverSuite.block > --

[jira] [Updated] (SPARK-6222) [STREAMING] All data may not be recovered from WAL when driver is killed

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-6222: - Target Version/s: 1.4.0, 1.3.1 (was: 1.4.0) > [STREAMING] All data may not be recovered from WAL

[jira] [Updated] (SPARK-5659) Flaky test: o.a.s.streaming.ReceiverSuite.block

2015-03-09 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-5659: - Target Version/s: (was: 1.3.0) > Flaky test: o.a.s.streaming.ReceiverSuite.block > -

[jira] [Commented] (SPARK-5183) Document data source API

2015-03-09 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354150#comment-14354150 ] Apache Spark commented on SPARK-5183: - User 'marmbrus' has created a pull request for

[jira] [Commented] (SPARK-6221) SparkSQL should support auto merging output files

2015-03-09 Thread Yi Tian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354166#comment-14354166 ] Yi Tian commented on SPARK-6221: [~srowen], thanks for your comment. I think we should add

[jira] [Closed] (SPARK-4734) [Streaming]limit the file Dstream size for each batch

2015-03-09 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-4734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] 宿荣全 closed SPARK-4734. -- > [Streaming]limit the file Dstream size for each batch > - > >

[jira] [Comment Edited] (SPARK-6221) SparkSQL should support auto merging output files

2015-03-09 Thread Yi Tian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354166#comment-14354166 ] Yi Tian edited comment on SPARK-6221 at 3/10/15 3:25 AM: - [~srowen

[jira] [Commented] (SPARK-6220) Allow extended EC2 options to be passed through spark-ec2

2015-03-09 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354217#comment-14354217 ] Nicholas Chammas commented on SPARK-6220: - I took another look at the 2 boto metho

[jira] [Created] (SPARK-6239) Spark MLlib fpm#FPGrowth minSupport should use double instead

2015-03-09 Thread Littlestar (JIRA)
Littlestar created SPARK-6239: - Summary: Spark MLlib fpm#FPGrowth minSupport should use double instead Key: SPARK-6239 URL: https://issues.apache.org/jira/browse/SPARK-6239 Project: Spark Issue

[jira] [Updated] (SPARK-6239) Spark MLlib fpm#FPGrowth minSupport should use long instead

2015-03-09 Thread Littlestar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Littlestar updated SPARK-6239: -- Description: Spark MLlib fpm#FPGrowth minSupport should use long instead == val minCount =
