[jira] [Commented] (SPARK-2321) Design a proper progress reporting event listener API

2014-11-14 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212000#comment-14212000 ] Rui Li commented on SPARK-2321: --- Hi [~joshrosen], The new API is quite useful. But the

[jira] [Created] (SPARK-4398) Specialize rdd.parallelize for xrange

2014-11-14 Thread Xiangrui Meng (JIRA)
Xiangrui Meng created SPARK-4398: Summary: Specialize rdd.parallelize for xrange Key: SPARK-4398 URL: https://issues.apache.org/jira/browse/SPARK-4398 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-4398) Specialize rdd.parallelize for xrange

2014-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212014#comment-14212014 ] Apache Spark commented on SPARK-4398: - User 'mengxr' has created a pull request for

[jira] [Comment Edited] (SPARK-689) Task will crash when setting SPARK_WORKER_CORES > 128

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207721#comment-14207721 ] Andrew Ash edited comment on SPARK-689 at 11/14/14 8:46 AM: I

[jira] [Commented] (SPARK-4395) Running a Spark SQL SELECT command from PySpark causes a hang for ~ 1 hour

2014-11-14 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212019#comment-14212019 ] Davies Liu commented on SPARK-4395: --- [~marmbrus] After removing the cache(), this script

[jira] [Commented] (SPARK-755) Kryo serialization failing - MLbase

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212033#comment-14212033 ] Andrew Ash commented on SPARK-755: -- Quick check [~sparks], this hasn't been updated in

[jira] [Commented] (SPARK-2352) [MLLIB] Add Artificial Neural Network (ANN) to Spark

2014-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212043#comment-14212043 ] Apache Spark commented on SPARK-2352: - User 'witgo' has created a pull request for

[jira] [Commented] (SPARK-664) Accumulator updates should get locally merged before sent to the driver

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212046#comment-14212046 ] Andrew Ash commented on SPARK-664: -- [~irashid] it sounds like your proposal is to batch

[jira] [Commented] (SPARK-748) Add documentation page describing interoperability with other software (e.g. HBase, JDBC, Kafka, etc.)

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212049#comment-14212049 ] Andrew Ash commented on SPARK-748: -- I agree this would be valuable -- almost like a Spark

[jira] [Commented] (SPARK-625) Client hangs when connecting to standalone cluster using wrong address

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212062#comment-14212062 ] Andrew Ash commented on SPARK-625: -- Spark is very sensitive to hostnames in Spark URLs,

[jira] [Commented] (SPARK-2468) Netty-based block server / client module

2014-11-14 Thread zzc (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212064#comment-14212064 ] zzc commented on SPARK-2468: Hi, Aaron Davidson, I send a email to you about shuffle data

[jira] [Commented] (SPARK-809) Give newly registered apps a set of executors right away

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212068#comment-14212068 ] Andrew Ash commented on SPARK-809: -- I believe this situation hasn't changed? Looking a

[jira] [Updated] (SPARK-835) RDD$parallelize() should use object serializer (not closure serializer) for collection objects

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash updated SPARK-835: - Fix Version/s: 0.8.0 RDD$parallelize() should use object serializer (not closure serializer) for

[jira] [Resolved] (SPARK-835) RDD$parallelize() should use object serializer (not closure serializer) for collection objects

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash resolved SPARK-835. -- Resolution: Fixed Closing per [~dlyubimov] with the Fix Version from SPARK-826 RDD$parallelize()

[jira] [Commented] (SPARK-904) Not able to Start/Stop Spark Worker from Remote Machine

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212078#comment-14212078 ] Andrew Ash commented on SPARK-904: -- [~ayushmishra2005] I suspect you don't have Spark

[jira] [Closed] (SPARK-904) Not able to Start/Stop Spark Worker from Remote Machine

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash closed SPARK-904. Resolution: Not a Problem Not able to Start/Stop Spark Worker from Remote Machine

[jira] [Commented] (SPARK-957) The problem that repeated computation among iterations

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212084#comment-14212084 ] Andrew Ash commented on SPARK-957: -- Hi [~caizhua], are you still having issues with your

[jira] [Commented] (SPARK-794) Remove sleep() in ClusterScheduler.stop

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212088#comment-14212088 ] Andrew Ash commented on SPARK-794: -- I don't see a {{ClusterScheduler}} class on master --

[jira] [Commented] (SPARK-665) Create RPM packages for Spark

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212094#comment-14212094 ] Andrew Ash commented on SPARK-665: -- This role of creating RPM packages seems to have been

[jira] [Closed] (SPARK-1206) Add python support for average and other summary satistics

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash closed SPARK-1206. - Resolution: Implemented Closing per [~holdenk_amp] as already done. Add python support for average and

[jira] [Commented] (SPARK-665) Create RPM packages for Spark

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212104#comment-14212104 ] Andrew Ash commented on SPARK-665: -- Sean are you suggesting dropping the .deb packages

[jira] [Closed] (SPARK-818) Design Spark Job Server

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash closed SPARK-818. Resolution: Won't Fix The community consensus was that the Spark job server should live in a separate

[jira] [Commented] (SPARK-665) Create RPM packages for Spark

2014-11-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212109#comment-14212109 ] Sean Owen commented on SPARK-665: - Not suggesting that, no. I suppose it does depend on

[jira] [Closed] (SPARK-1231) DEAD worker should recover automaticly

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash closed SPARK-1231. - Resolution: Duplicate DEAD worker should recover automaticly --

[jira] [Commented] (SPARK-1231) DEAD worker should recover automaticly

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212110#comment-14212110 ] Andrew Ash commented on SPARK-1231: --- Sorry [~tianyi], when I did my search for prior

[jira] [Commented] (SPARK-1169) Add countApproxDistinct and countApproxDistinctByKey to PySpark

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212112#comment-14212112 ] Andrew Ash commented on SPARK-1169: --- On current master (1.2) I see that rdd.py now has a

[jira] [Commented] (SPARK-754) Multiple Spark Contexts active in a single Spark Context

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212113#comment-14212113 ] Andrew Ash commented on SPARK-754: -- This is actually currently unsupported, and a ticket

[jira] [Closed] (SPARK-754) Multiple Spark Contexts active in a single Spark Context

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash closed SPARK-754. Resolution: Duplicate Multiple Spark Contexts active in a single Spark Context

[jira] [Updated] (SPARK-2243) Support multiple SparkContexts in the same JVM

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash updated SPARK-2243: -- Affects Version/s: 0.7.0 Support multiple SparkContexts in the same JVM

[jira] [Created] (SPARK-4399) Support multiple cloud providers

2014-11-14 Thread Andrew Ash (JIRA)
Andrew Ash created SPARK-4399: - Summary: Support multiple cloud providers Key: SPARK-4399 URL: https://issues.apache.org/jira/browse/SPARK-4399 Project: Spark Issue Type: New Feature

[jira] [Created] (SPARK-4400) Add scripts for launching Spark on Google Compute Engine (GCE)

2014-11-14 Thread Andrew Ash (JIRA)
Andrew Ash created SPARK-4400: - Summary: Add scripts for launching Spark on Google Compute Engine (GCE) Key: SPARK-4400 URL: https://issues.apache.org/jira/browse/SPARK-4400 Project: Spark

[jira] [Resolved] (SPARK-2398) Trouble running Spark 1.0 on Yarn

2014-11-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2398. -- Resolution: Duplicate The PR that resolved that was ultimately tied to a new JIRA, SPARK-3768.

[jira] [Commented] (SPARK-1358) Continuous integrated test should be involved in Spark ecosystem

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212125#comment-14212125 ] Andrew Ash commented on SPARK-1358: --- I've heard these sorts of extended tests called end

[jira] [Closed] (SPARK-4400) Add scripts for launching Spark on Google Compute Engine (GCE)

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash closed SPARK-4400. - Resolution: Duplicate Add scripts for launching Spark on Google Compute Engine (GCE)

[jira] [Commented] (SPARK-928) Add support for Unsafe-based serializer in Kryo 2.22

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212132#comment-14212132 ] Andrew Ash commented on SPARK-928: -- Latest Chill (0.5.0) is still using Kryo 2.21 so this

[jira] [Commented] (SPARK-1568) Spark 0.9.0 hangs reading s3

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212145#comment-14212145 ] Andrew Ash commented on SPARK-1568: --- [~sams] did you see an improvement when you

[jira] [Created] (SPARK-4401) RuleExecutor correctly logs trace iteration num

2014-11-14 Thread YanTang Zhai (JIRA)
YanTang Zhai created SPARK-4401: --- Summary: RuleExecutor correctly logs trace iteration num Key: SPARK-4401 URL: https://issues.apache.org/jira/browse/SPARK-4401 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-4402) Output path validation of an action statement resulting in runtime exception

2014-11-14 Thread Vijay (JIRA)
Vijay created SPARK-4402: Summary: Output path validation of an action statement resulting in runtime exception Key: SPARK-4402 URL: https://issues.apache.org/jira/browse/SPARK-4402 Project: Spark

[jira] [Resolved] (SPARK-3722) Spark on yarn docs work

2014-11-14 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-3722. -- Resolution: Fixed Fix Version/s: 1.3.0 Assignee: WangTaoTheTonic

[jira] [Updated] (SPARK-4403) Elastic allocation(spark.dynamicAllocation.enabled) results in task never being execued.

2014-11-14 Thread Egor Pahomov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Egor Pahomov updated SPARK-4403: Attachment: ipython_out Elastic allocation(spark.dynamicAllocation.enabled) results in task never

[jira] [Resolved] (SPARK-1568) Spark 0.9.0 hangs reading s3

2014-11-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-1568. -- Resolution: Fixed Fix Version/s: 1.0.0 Spark 0.9.0 hangs reading s3

[jira] [Commented] (SPARK-4402) Output path validation of an action statement resulting in runtime exception

2014-11-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212376#comment-14212376 ] Sean Owen commented on SPARK-4402: -- Is this not the same issue resolved by

[jira] [Commented] (SPARK-4404) SparkSubmitDriverBootstrapper should stop after its SparkSubmit sub-process ends

2014-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212412#comment-14212412 ] Apache Spark commented on SPARK-4404: - User 'WangTaoTheTonic' has created a pull

[jira] [Updated] (SPARK-4404) SparkSubmitDriverBootstrapper should stop after its SparkSubmit sub-process ends

2014-11-14 Thread WangTaoTheTonic (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] WangTaoTheTonic updated SPARK-4404: --- Description: When we have spark.driver.extra* or spark.driver.memory in

[jira] [Commented] (SPARK-4354) 14/11/12 09:39:00 WARN TaskSetManager: Lost task 5.0 in stage 0.0 (TID 5, HYD-RNDNW-VFRCO-RCORE2): java.lang.NoClassDefFoundError: Could not initialize class org.xerial

2014-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212482#comment-14212482 ] Apache Spark commented on SPARK-4354: - User 'alexliu68' has created a pull request for

[jira] [Commented] (SPARK-4338) Remove yarn-alpha support

2014-11-14 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212485#comment-14212485 ] Andrew Ash commented on SPARK-4338: --- From discussion on the dev list today, Sandy aims

[jira] [Created] (SPARK-4405) Matrices.* construction methods should check for rows x cols overflow

2014-11-14 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-4405: Summary: Matrices.* construction methods should check for rows x cols overflow Key: SPARK-4405 URL: https://issues.apache.org/jira/browse/SPARK-4405 Project:

[jira] [Created] (SPARK-4406) SVD should check for k < 1

2014-11-14 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-4406: Summary: SVD should check for k < 1 Key: SPARK-4406 URL: https://issues.apache.org/jira/browse/SPARK-4406 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-4407) Thrift server for 0.13.1 doesn't deserialize complex types properly

2014-11-14 Thread Cheng Lian (JIRA)
Cheng Lian created SPARK-4407: - Summary: Thrift server for 0.13.1 doesn't deserialize complex types properly Key: SPARK-4407 URL: https://issues.apache.org/jira/browse/SPARK-4407 Project: Spark

[jira] [Created] (SPARK-4408) Behavior difference between spark-submit conf vs cmd line args

2014-11-14 Thread Pedro Rodriguez (JIRA)
Pedro Rodriguez created SPARK-4408: -- Summary: Behavior difference between spark-submit conf vs cmd line args Key: SPARK-4408 URL: https://issues.apache.org/jira/browse/SPARK-4408 Project: Spark

[jira] [Commented] (SPARK-4407) Thrift server for 0.13.1 doesn't deserialize complex types properly

2014-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212544#comment-14212544 ] Apache Spark commented on SPARK-4407: - User 'liancheng' has created a pull request for

[jira] [Resolved] (SPARK-603) add simple Counter API

2014-11-14 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-603. --- Resolution: Won't Fix Closing this one as part of [~aash]'s cleanup. I think this problem is being

[jira] [Created] (SPARK-4410) Support for external sort

2014-11-14 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-4410: --- Summary: Support for external sort Key: SPARK-4410 URL: https://issues.apache.org/jira/browse/SPARK-4410 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-4409) Additional (but limited) Linear Algebra Utils

2014-11-14 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-4409: -- Summary: Additional (but limited) Linear Algebra Utils Key: SPARK-4409 URL: https://issues.apache.org/jira/browse/SPARK-4409 Project: Spark Issue Type:

[jira] [Updated] (SPARK-4395) Running a Spark SQL SELECT command from PySpark causes a hang for ~ 1 hour

2014-11-14 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4395: Description: When I run this command it hangs for one to many hours and then finally

[jira] [Resolved] (SPARK-4394) Allow datasources to support IN and sizeInBytes

2014-11-14 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-4394. Resolution: Fixed Fix Version/s: 1.2.0 Allow datasources to support IN and sizeInBytes

[jira] [Updated] (SPARK-4409) Additional (but limited) Linear Algebra Utils

2014-11-14 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-4409: --- Description: This ticket is to discuss the addition of a very limited number of local matrix

[jira] [Created] (SPARK-4411) Add kill link for jobs in the UI

2014-11-14 Thread Kay Ousterhout (JIRA)
Kay Ousterhout created SPARK-4411: - Summary: Add kill link for jobs in the UI Key: SPARK-4411 URL: https://issues.apache.org/jira/browse/SPARK-4411 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-4411) Add kill link for jobs in the UI

2014-11-14 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout updated SPARK-4411: -- Issue Type: New Feature (was: Bug) Add kill link for jobs in the UI

[jira] [Created] (SPARK-4413) Parquet support through datasource API

2014-11-14 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-4413: --- Summary: Parquet support through datasource API Key: SPARK-4413 URL: https://issues.apache.org/jira/browse/SPARK-4413 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-4398) Specialize rdd.parallelize for xrange

2014-11-14 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-4398. -- Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3264

[jira] [Commented] (SPARK-4413) Parquet support through datasource API

2014-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212805#comment-14212805 ] Apache Spark commented on SPARK-4413: - User 'marmbrus' has created a pull request for

[jira] [Comment Edited] (SPARK-3080) ArrayIndexOutOfBoundsException in ALS for Large datasets

2014-11-14 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212815#comment-14212815 ] Xiangrui Meng edited comment on SPARK-3080 at 11/14/14 8:56 PM:

[jira] [Commented] (SPARK-3080) ArrayIndexOutOfBoundsException in ALS for Large datasets

2014-11-14 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212815#comment-14212815 ] Xiangrui Meng commented on SPARK-3080: -- Thanks for the confirmation! If [~ilganeli]

[jira] [Updated] (SPARK-3080) ArrayIndexOutOfBoundsException in ALS for Large datasets

2014-11-14 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-3080: - Assignee: Xiangrui Meng ArrayIndexOutOfBoundsException in ALS for Large datasets

[jira] [Commented] (SPARK-3080) ArrayIndexOutOfBoundsException in ALS for Large datasets

2014-11-14 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212822#comment-14212822 ] Ilya Ganelin commented on SPARK-3080: - Hi Xiangrui - I was not doing any sort of

[jira] [Created] (SPARK-4414) SparkContext.wholeTextFiles Doesn't work with S3 Buckets

2014-11-14 Thread Pedro Rodriguez (JIRA)
Pedro Rodriguez created SPARK-4414: -- Summary: SparkContext.wholeTextFiles Doesn't work with S3 Buckets Key: SPARK-4414 URL: https://issues.apache.org/jira/browse/SPARK-4414 Project: Spark

[jira] [Commented] (SPARK-1867) Spark Documentation Error causes java.lang.IllegalStateException: unread block data

2014-11-14 Thread sam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212833#comment-14212833 ] sam commented on SPARK-1867: [~ansonism] Are you 100% sure your jar is also the same hadoop

[jira] [Updated] (SPARK-3860) Improve dimension joins

2014-11-14 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-3860: Priority: Critical (was: Major) Improve dimension joins ---

[jira] [Commented] (SPARK-1867) Spark Documentation Error causes java.lang.IllegalStateException: unread block data

2014-11-14 Thread Anson Abraham (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212852#comment-14212852 ] Anson Abraham commented on SPARK-1867: -- Yes. i added 3 data nodes just for this.

[jira] [Updated] (SPARK-4380) Executor full of log spilling in-memory map of 0 MB to disk

2014-11-14 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4380: - Assignee: Hong Shen Executor full of log spilling in-memory map of 0 MB to disk

[jira] [Commented] (SPARK-4380) Executor full of log spilling in-memory map of 0 MB to disk

2014-11-14 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212857#comment-14212857 ] Andrew Or commented on SPARK-4380: -- In general it's pretty worrying that it's spilling so

[jira] [Closed] (SPARK-4313) Thread Dump link is broken in yarn-cluster mode

2014-11-14 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-4313. Resolution: Fixed Fix Version/s: 1.2.0 Target Version/s: 1.2.0 Thread Dump link is broken

[jira] [Updated] (SPARK-4313) Thread Dump link is broken in yarn-cluster mode

2014-11-14 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4313: - Assignee: Shixiong Zhu Thread Dump link is broken in yarn-cluster mode

[jira] [Updated] (SPARK-4404) SparkSubmitDriverBootstrapper should stop after its SparkSubmit sub-process ends

2014-11-14 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4404: - Affects Version/s: 1.1.0 SparkSubmitDriverBootstrapper should stop after its SparkSubmit sub-process

[jira] [Commented] (SPARK-3694) Allow printing object graph of tasks/RDD's with a debug flag

2014-11-14 Thread Ilya Ganelin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212875#comment-14212875 ] Ilya Ganelin commented on SPARK-3694: - There is also task serialization that happens

[jira] [Commented] (SPARK-2321) Design a proper progress reporting event listener API

2014-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14212884#comment-14212884 ] Apache Spark commented on SPARK-2321: - User 'JoshRosen' has created a pull request for

[jira] [Resolved] (SPARK-4239) support view in HiveQL

2014-11-14 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4239. - Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3131

[jira] [Resolved] (SPARK-4245) Fix containsNull of the result ArrayType of CreateArray expression.

2014-11-14 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4245. - Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3110

[jira] [Resolved] (SPARK-4333) Correctly log number of iterations in RuleExecutor

2014-11-14 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4333. - Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3180

[jira] [Resolved] (SPARK-4062) Improve KafkaReceiver to prevent data loss

2014-11-14 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-4062. -- Resolution: Fixed Improve KafkaReceiver to prevent data loss

[jira] [Resolved] (SPARK-3129) Prevent data loss in Spark Streaming on driver failure

2014-11-14 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-3129. -- Resolution: Fixed Fix Version/s: 1.2.0 I am marking this as fixed, as all non-test

[jira] [Updated] (SPARK-4246) Add testsuite with end-to-end testing of driver failure

2014-11-14 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-4246: - Target Version/s: 1.2.0 Add testsuite with end-to-end testing of driver failure

[jira] [Updated] (SPARK-3129) Prevent data loss in Spark Streaming on driver failure using Write Ahead Logs

2014-11-14 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-3129: - Summary: Prevent data loss in Spark Streaming on driver failure using Write Ahead Logs (was:

[jira] [Resolved] (SPARK-4391) Parquet Filter pushdown flag should be set with SQLConf

2014-11-14 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4391. - Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3258

[jira] [Resolved] (SPARK-4322) Struct fields can't be used as sub-expression of grouping fields

2014-11-14 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4322. - Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3248

[jira] [Resolved] (SPARK-4386) Parquet file write performance improvement

2014-11-14 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4386. - Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3254

[jira] [Resolved] (SPARK-4365) Remove unnecessary filter call on records returned from parquet library

2014-11-14 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4365. - Resolution: Fixed Issue resolved by pull request 3229

[jira] [Resolved] (SPARK-4412) Parquet logger cannot be configured

2014-11-14 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4412. - Resolution: Fixed Fix Version/s: 1.2.0 Issue resolved by pull request 3271

[jira] [Commented] (SPARK-4415) Driver did not exit after python driver had exited.

2014-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14213076#comment-14213076 ] Apache Spark commented on SPARK-4415: - User 'davies' has created a pull request for

[jira] [Closed] (SPARK-4214) With dynamic allocation, avoid outstanding requests for more executors than pending tasks need

2014-11-14 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-4214. Resolution: Fixed Fix Version/s: 1.2.0 With dynamic allocation, avoid outstanding requests for more

[jira] [Issue Comment Deleted] (SPARK-4345) Spark SQL Hive throws exception when drop a none-exist table

2014-11-14 Thread Alex Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Liu updated SPARK-4345: Comment: was deleted (was: Swallow NoSuchObjectException exception when drop a none-exist hive table. pull

[jira] [Commented] (SPARK-4345) Spark SQL Hive throws exception when drop a none-exist table

2014-11-14 Thread Alex Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14213121#comment-14213121 ] Alex Liu commented on SPARK-4345: - It looks like a bug in Hive

[jira] [Resolved] (SPARK-4345) Spark SQL Hive throws exception when drop a none-exist table

2014-11-14 Thread Alex Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Liu resolved SPARK-4345. - Resolution: Won't Fix Spark SQL Hive throws exception when drop a none-exist table

[jira] [Commented] (SPARK-4349) Spark driver hangs on sc.parallelize() if exception is thrown during serialization

2014-11-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14213128#comment-14213128 ] Apache Spark commented on SPARK-4349: - User 'mccheah' has created a pull request for

[jira] [Created] (SPARK-4417) New API: sample RDD to fixed number of items

2014-11-14 Thread Davies Liu (JIRA)
Davies Liu created SPARK-4417: - Summary: New API: sample RDD to fixed number of items Key: SPARK-4417 URL: https://issues.apache.org/jira/browse/SPARK-4417 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-3445) Deprecate and later remove YARN alpha support

2014-11-14 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14213300#comment-14213300 ] Guoqiang Li commented on SPARK-3445: I think there are a lot of people using

[jira] [Closed] (SPARK-4404) SparkSubmitDriverBootstrapper should stop after its SparkSubmit sub-process ends

2014-11-14 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-4404. Resolution: Fixed Fix Version/s: 1.2.0 Assignee: WangTaoTheTonic Target

[jira] [Closed] (SPARK-4415) Driver did not exit after python driver had exited.

2014-11-14 Thread Andrew Or (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-4415. Resolution: Fixed Fix Version/s: 1.2.0 Assignee: Davies Liu Target Version/s:

[jira] [Created] (SPARK-4420) Change nullability of Cast from DoubleType/FloatType to DecimalType.

2014-11-14 Thread Takuya Ueshin (JIRA)
Takuya Ueshin created SPARK-4420: Summary: Change nullability of Cast from DoubleType/FloatType to DecimalType. Key: SPARK-4420 URL: https://issues.apache.org/jira/browse/SPARK-4420 Project: Spark
