[jira] [Resolved] (SPARK-4505) Reduce the memory usage of CompactBuffer[T] when T is a primitive type

2014-11-29 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4505. Resolution: Fixed Fix Version/s: 1.3.0 > Reduce the memory usage of CompactBuffe

[jira] [Resolved] (SPARK-4057) Use -agentlib instead of -Xdebug in sbt-launch-lib.bash for debugging

2014-11-29 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4057. Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Kousuke Saruta >

[jira] [Commented] (SPARK-4082) Show Waiting/Queued Stages in Spark UI

2014-11-29 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14228820#comment-14228820 ] Patrick Wendell commented on SPARK-4082: IMO this is sufficiently addresse

Re: [VOTE] Release Apache Spark 1.2.0 (RC1)

2014-11-29 Thread Patrick Wendell
rtunately you got some of the text here wrong, saying 1.1.0 > instead of 1.2.0. Not sure it will matter since there can well be another RC > after testing, but we should be careful. > > Matei > >> On Nov 28, 2014, at 9:16 PM, Patrick Wendell wrote: >> >> Pleas

[VOTE] Release Apache Spark 1.2.0 (RC1)

2014-11-28 Thread Patrick Wendell
Please vote on releasing the following candidate as Apache Spark version 1.2.0! The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb The release files, including signatures, digests, etc. c

[jira] [Updated] (SPARK-4352) Incorporate locality preferences in dynamic allocation requests

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4352: --- Priority: Critical (was: Major) > Incorporate locality preferences in dynamic allocat

[jira] [Resolved] (SPARK-1450) Specify the default zone in the EC2 script help

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1450. Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Sean Owen (was: Tathagata

[jira] [Resolved] (SPARK-2985) Buffered data in BlockGenerator gets lost when receiver crashes

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2985. Resolution: Invalid I think this represents a misunderstanding of the internal API

[jira] [Updated] (SPARK-4584) 2x Performance regression for Spark-on-YARN

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4584: --- Fix Version/s: 1.2.0 > 2x Performance regression for Spark-on-Y

[jira] [Resolved] (SPARK-4584) 2x Performance regression for Spark-on-YARN

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4584. Resolution: Fixed > 2x Performance regression for Spark-on-Y

[jira] [Commented] (SPARK-4349) Spark driver hangs on sc.parallelize() if exception is thrown during serialization

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14228493#comment-14228493 ] Patrick Wendell commented on SPARK-4349: Hey Matt, It turns out that para

[jira] [Issue Comment Deleted] (SPARK-3694) Allow printing object graph of tasks/RDD's with a debug flag

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3694: --- Comment: was deleted (was: User 'davies' has created a pull request for this is

[jira] [Commented] (SPARK-3694) Allow printing object graph of tasks/RDD's with a debug flag

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14228489#comment-14228489 ] Patrick Wendell commented on SPARK-3694: Yes we should print that too - I

[jira] [Resolved] (SPARK-4193) Disable doclint in Java 8 to prevent from build error.

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4193. Resolution: Fixed Fix Version/s: 1.2.0 Assignee: Takuya Ueshin https

[jira] [Resolved] (SPARK-4645) Asynchronous execution in HiveThriftServer2 with Hive 0.13.1 doesn't play well with Simba ODBC driver

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4645. Resolution: Fixed Fix Version/s: 1.2.0 > Asynchronous execution in HiveThriftServ

[jira] [Updated] (SPARK-4643) Remove unneeded staging repositories from build

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4643: --- Assignee: Adrian Wang > Remove unneeded staging repositories from bu

[jira] [Updated] (SPARK-4643) Remove unneeded staging repositories from build

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4643: --- Summary: Remove unneeded staging repositories from build (was: spark staging repository

[jira] [Resolved] (SPARK-4643) Remove unneeded staging repositories from build

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4643. Resolution: Fixed Fix Version/s: 1.3.0 > Remove unneeded staging repositories f

[jira] [Updated] (SPARK-4632) Upgrade MQTT dependency to use latest mqtt-client

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4632: --- Target Version/s: 1.3.0 (was: 1.2.0) > Upgrade MQTT dependency to use latest mqtt-cli

[jira] [Updated] (SPARK-4632) Upgrade MQTT dependency to use latest mqtt-client

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4632: --- Priority: Critical (was: Blocker) > Upgrade MQTT dependency to use latest mqtt-cli

[jira] [Updated] (SPARK-4628) Put all external projects behind a build flag

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4628: --- Target Version/s: 1.3.0 (was: 1.2.0) > Put all external projects behind a build f

[jira] [Updated] (SPARK-4645) Asynchronous execution in HiveThriftServer2 with Hive 0.13.1 doesn't play well with Simba ODBC driver

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4645: --- Assignee: Cheng Lian > Asynchronous execution in HiveThriftServer2 with Hive 0.13.1 does

[jira] [Updated] (SPARK-3182) Twitter Streaming Geoloaction Filter

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3182: --- Target Version/s: 1.3.0 (was: 1.0.0, 1.0.2) > Twitter Streaming Geoloaction Fil

[jira] [Updated] (SPARK-3182) Twitter Streaming Geoloaction Filter

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3182: --- Affects Version/s: (was: 1.0.2) (was: 1.0.0) > Twit

[jira] [Updated] (SPARK-3182) Twitter Streaming Geoloaction Filter

2014-11-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3182: --- Fix Version/s: (was: 1.2.0) > Twitter Streaming Geoloaction Fil

[jira] [Commented] (SPARK-4598) Paginate stage page to avoid OOM with > 100,000 tasks

2014-11-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14227842#comment-14227842 ] Patrick Wendell commented on SPARK-4598: Having sorting with pagination s

[jira] [Created] (SPARK-4628) Put all external projects behind a build flag

2014-11-26 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-4628: -- Summary: Put all external projects behind a build flag Key: SPARK-4628 URL: https://issues.apache.org/jira/browse/SPARK-4628 Project: Spark Issue Type

Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava

2014-11-26 Thread Patrick Wendell
"org/spark-project/guava/common/base/Preconitions".checkArgument:(ZLjava/lang/Object;)V 50: invokestatic #502// Method "org/spark-project/guava/common/base/Preconitions".checkArgument:(ZLjava/lang/Object;)V On Wed, Nov 26, 2014 at 11:08 AM, Patri

Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava

2014-11-26 Thread Patrick Wendell
Hi Judy, Are you somehow modifying Spark's classpath to include jars from Hadoop and Hive that you have running on the machine? The issue seems to be that you are somehow including a version of Hadoop that references the original guava package. The Hadoop that is bundled in the Spark jars should n

[jira] [Resolved] (SPARK-4516) Netty off-heap memory use causes executors to be killed by OS

2014-11-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4516. Resolution: Fixed Fix Version/s: 1.2.0 > Netty off-heap memory use causes execut

[jira] [Updated] (SPARK-4613) Make JdbcRDD easier to use from Java

2014-11-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4613: --- Assignee: Cheng Lian > Make JdbcRDD easier to use from J

[jira] [Commented] (SPARK-4613) Make JdbcRDD easier to use from Java

2014-11-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225632#comment-14225632 ] Patrick Wendell commented on SPARK-4613: Yeah the only other tricky bit is

[jira] [Updated] (SPARK-4613) Make JdbcRDD easier to use from Java

2014-11-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4613: --- Description: We might eventually deprecate it, but for now it would be nice to expose a Java

[jira] [Commented] (SPARK-4605) Proposed Contribution: Spark Kernel to enable interactive Spark applications

2014-11-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225264#comment-14225264 ] Patrick Wendell commented on SPARK-4605: I see - so basically this

[jira] [Commented] (SPARK-4605) Proposed Contribution: Spark Kernel to enable interactive Spark applications

2014-11-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225173#comment-14225173 ] Patrick Wendell commented on SPARK-4605: Thanks for sharing this design

[jira] [Resolved] (SPARK-4462) flume-sink build broken in SBT

2014-11-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4462. Resolution: Cannot Reproduce I couldn't reproduce this either. However Michael, plea

[jira] [Commented] (SPARK-4598) Paginate stage page to avoid OOM with > 100,000 tasks

2014-11-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224810#comment-14224810 ] Patrick Wendell commented on SPARK-4598: It is a good idea to paginate this

[jira] [Updated] (SPARK-4598) java.lang.OutOfMemoryError occurs when opening stage page of an application has 100000 tasks,

2014-11-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4598: --- Priority: Critical (was: Major) > java.lang.OutOfMemoryError occurs when opening stage p

[jira] [Updated] (SPARK-4598) Paginate stage page to avoid OOM with > 100,000 tasks

2014-11-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4598: --- Summary: Paginate stage page to avoid OOM with > 100,000 tasks (

[jira] [Updated] (SPARK-1476) 2GB limit in spark for blocks

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1476: --- Target Version/s: (was: 1.2.0) > 2GB limit in spark for blo

[jira] [Updated] (SPARK-4525) MesosSchedulerBackend.resourceOffers cannot decline unused offers from acceptedOffers

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4525: --- Fix Version/s: 1.2.0 > MesosSchedulerBackend.resourceOffers cannot decline unused offers f

[jira] [Resolved] (SPARK-4525) MesosSchedulerBackend.resourceOffers cannot decline unused offers from acceptedOffers

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4525. Resolution: Fixed > MesosSchedulerBackend.resourceOffers cannot decline unused offers f

[jira] [Resolved] (SPARK-4266) Avoid expensive JavaScript for StagePages with huge numbers of tasks

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4266. Resolution: Fixed Fix Version/s: 1.2.0 [~kayousterhout] I'm resolving this beca

[jira] [Updated] (SPARK-4525) MesosSchedulerBackend.resourceOffers cannot decline unused offers from acceptedOffers

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4525: --- Target Version/s: 1.2.0 (was: 1.2.0, 1.3.0) > MesosSchedulerBackend.resourceOffers can

[jira] [Resolved] (SPARK-4578) Row.asDict() should keep the type of values

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4578. Resolution: Fixed Fix Version/s: 1.2.0 Thanks davies I've resolved

[jira] [Updated] (SPARK-4196) Streaming + checkpointing + saveAsNewAPIHadoopFiles = NotSerializableException for Hadoop Configuration

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4196: --- Assignee: Tathagata Das (was: Patrick Wendell) > Streaming + checkpoint

[jira] [Assigned] (SPARK-4196) Streaming + checkpointing + saveAsNewAPIHadoopFiles = NotSerializableException for Hadoop Configuration

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell reassigned SPARK-4196: -- Assignee: Patrick Wendell > Streaming + checkpointing + saveAsNewAPIHadoopFi

[jira] [Updated] (SPARK-4548) Python broadcast perf regression from Spark 1.1

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4548: --- Assignee: Davies Liu > Python broadcast perf regression from Spark

[jira] [Updated] (SPARK-4266) Avoid expensive JavaScript for StagePages with huge numbers of tasks

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4266: --- Priority: Blocker (was: Critical) > Avoid expensive JavaScript for StagePages with h

[jira] [Resolved] (SPARK-4145) Create jobs overview and job details pages on the web UI

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4145. Resolution: Fixed Fix Version/s: 1.2.0 > Create jobs overview and job details pa

[jira] [Updated] (SPARK-4266) Avoid expensive JavaScript for StagePages with huge numbers of tasks

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4266: --- Affects Version/s: 1.2.0 > Avoid expensive JavaScript for StagePages with huge numbers

[jira] [Resolved] (SPARK-4515) OOM/GC errors with sort-based shuffle

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4515. Resolution: Duplicate > OOM/GC errors with sort-based shuf

[jira] [Updated] (SPARK-3452) Maven build should skip publishing artifacts people shouldn't depend on

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3452: --- Fix Version/s: 1.2.0 > Maven build should skip publishing artifacts people shouldn'

[jira] [Reopened] (SPARK-4515) OOM/GC errors with sort-based shuffle

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell reopened SPARK-4515: > OOM/GC errors with sort-based shuf

[jira] [Updated] (SPARK-1860) Standalone Worker cleanup should not clean up running executors

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1860: --- Fix Version/s: 1.2.0 > Standalone Worker cleanup should not clean up running execut

[jira] [Updated] (SPARK-4468) Wrong Parquet filters are created for all inequality predicates with literals on the left hand side

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4468: --- Fix Version/s: 1.2.0 > Wrong Parquet filters are created for all inequality predicates w

[jira] [Updated] (SPARK-4264) SQL HashJoin induces "refCnt = 0" error in ShuffleBlockFetcherIterator

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4264: --- Fix Version/s: 1.2.0 > SQL HashJoin induces "refCnt = 0" error in ShuffleBlockFe

[jira] [Updated] (SPARK-3686) flume.SparkSinkSuite.Success is flaky

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3686: --- Fix Version/s: 1.2.0 > flume.SparkSinkSuite.Success is fl

[jira] [Updated] (SPARK-3615) Kafka test should not hard code Zookeeper port

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3615: --- Fix Version/s: 1.2.0 > Kafka test should not hard code Zookeeper p

[jira] [Updated] (SPARK-4385) DataSource DDL Parser can't handle table names with '_'

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4385: --- Fix Version/s: (was: 1.2.0) > DataSource DDL Parser can't handle table na

[jira] [Updated] (SPARK-4385) DataSource DDL Parser can't handle table names with '_'

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4385: --- Fix Version/s: 1.2.0 > DataSource DDL Parser can't handle table na

[jira] [Updated] (SPARK-4090) Memory leak in snappy-java 1.1.1.4/5

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4090: --- Fix Version/s: 1.2.0 > Memory leak in snappy-java 1.1.1.

[jira] [Updated] (SPARK-3633) Fetches failure observed after SPARK-2711

2014-11-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3633: --- Fix Version/s: 1.2.0 1.1.1 > Fetches failure observed after SPARK-2

[jira] [Commented] (SPARK-4567) Make SparkJobInfo and SparkStageInfo serializable

2014-11-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222729#comment-14222729 ] Patrick Wendell commented on SPARK-4567: [~xuefuz] please don'

[jira] [Updated] (SPARK-3628) Don't apply accumulator updates multiple times for tasks in result stages

2014-11-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3628: --- Component/s: Spark Core > Don't apply accumulator updates multiple times for tasks i

[jira] [Updated] (SPARK-4567) Make SparkJobInfo and SparkStageInfo serializable

2014-11-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4567: --- Fix Version/s: (was: 1.2.0) > Make SparkJobInfo and SparkStageInfo serializa

[jira] [Updated] (SPARK-4562) serialization in MLlib is slow

2014-11-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4562: --- Priority: Blocker (was: Major) > serialization in MLlib is s

[jira] [Updated] (SPARK-4562) GLM testing time regressions from Spark 1.1

2014-11-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4562: --- Summary: GLM testing time regressions from Spark 1.1 (was: serialization in MLlib is slow

[jira] [Updated] (SPARK-4548) Python broadcast is very slow

2014-11-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4548: --- Priority: Blocker (was: Major) > Python broadcast is very s

[jira] [Updated] (SPARK-4548) Python broadcast perf regression from Spark 1.1

2014-11-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4548: --- Summary: Python broadcast perf regression from Spark 1.1 (was: Python broadcast is very slow

[jira] [Commented] (SPARK-3628) Don't apply accumulator updates multiple times for tasks in result stages

2014-11-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222672#comment-14222672 ] Patrick Wendell commented on SPARK-3628: I took a quick look at the current p

[jira] [Updated] (SPARK-3628) Don't apply accumulator updates multiple times for tasks in result stages

2014-11-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3628: --- Target Version/s: 1.1.1, 0.9.3, 1.0.3, 1.2.1 (was: 1.1.1, 1.2.0, 0.9.3, 1.0.3) > Do

Re: Notes on writing complex spark applications

2014-11-23 Thread Patrick Wendell
Hey Evan, It might be nice to merge this into existing documentation. In particular, a lot of this could serve to update the current tuning section and programming guides. It could also work to paste this wholesale as a reference for Spark users, but in that case it's less likely to get updated w

[jira] [Updated] (SPARK-4568) Publish release candidates under $VERSION-RCX instead of $VERSION

2014-11-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4568: --- Issue Type: Improvement (was: Bug) > Publish release candidates under $VERSION-RCX inst

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Patrick Wendell
Hey Stephen, Thanks for bringing this up. Technically when we call a release vote it needs to be on the exact commit that will be the final release. However, one thing I've thought of doing for a while would be to publish the maven artifacts using a version tag with $VERSION-rcX even if the underl

[jira] [Created] (SPARK-4568) Publish release candidates under $VERSION-RCX instead of $VERSION

2014-11-23 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-4568: -- Summary: Publish release candidates under $VERSION-RCX instead of $VERSION Key: SPARK-4568 URL: https://issues.apache.org/jira/browse/SPARK-4568 Project: Spark

[jira] [Updated] (SPARK-3958) Possible stream-corruption issues in TorrentBroadcast

2014-11-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3958: --- Target Version/s: 1.2.1 (was: 1.2.0) > Possible stream-corruption issues in TorrentBroadc

[jira] [Updated] (SPARK-4105) FAILED_TO_UNCOMPRESS(5) errors when fetching shuffle data with sort-based shuffle

2014-11-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4105: --- Target Version/s: 1.2.1 (was: 1.2.0) > FAILED_TO_UNCOMPRESS(5) errors when fetching shuf

[jira] [Commented] (SPARK-4258) NPE with new Parquet Filters

2014-11-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222551#comment-14222551 ] Patrick Wendell commented on SPARK-4258: After discussion with [~lian cheng]

[jira] [Updated] (SPARK-4258) NPE with new Parquet Filters

2014-11-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4258: --- Priority: Critical (was: Blocker) > NPE with new Parquet Filt

Re: [VOTE] Release Apache Spark 1.1.1 (RC2)

2014-11-23 Thread Patrick Wendell
+1 (binding). Don't see any evidence of regressions at this point. The issue reported by Hector was not related to this release. On Sun, Nov 23, 2014 at 9:50 AM, Debasish Das wrote: > -1 from me...same FetchFailed issue as what Hector saw... > > I am running Netflix dataset and dumping out recomm

Re: Apache infra github sync down

2014-11-22 Thread Patrick Wendell
Hi All, Unfortunately this went back down again. I've opened a new JIRA to track it: https://issues.apache.org/jira/browse/INFRA-8688 - Patrick On Tue, Nov 18, 2014 at 10:24 PM, Patrick Wendell wrote: > Hey All, > > The Apache-->github mirroring is not working right no

[jira] [Updated] (SPARK-4377) ZooKeeperPersistenceEngine: java.lang.IllegalStateException: Trying to deserialize a serialized ActorRef without an ActorSystem in scope.

2014-11-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4377: --- Affects Version/s: (was: 1.2.0) 1.3.0 > ZooKeeperPersistenceEng

[jira] [Updated] (SPARK-4377) ZooKeeperPersistenceEngine: java.lang.IllegalStateException: Trying to deserialize a serialized ActorRef without an ActorSystem in scope.

2014-11-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4377: --- Target Version/s: (was: 1.2.0) > ZooKeeperPersistenceEng

[jira] [Resolved] (SPARK-4377) ZooKeeperPersistenceEngine: java.lang.IllegalStateException: Trying to deserialize a serialized ActorRef without an ActorSystem in scope.

2014-11-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4377. Resolution: Fixed > ZooKeeperPersistenceEngine: java.lang.IllegalStateException: Trying

Re: How spark and hive integrate in long term?

2014-11-22 Thread Patrick Wendell
There are two distinct topics when it comes to Hive integration. Part of the 1.3 roadmap will likely be better defining the plan for Hive integration as Hive adds future versions. 1. Ability to interact with Hive metastores from different versions ==> I.e. if a user has a metastore, can Spark SQL

[jira] [Comment Edited] (SPARK-4556) binary distribution assembly can't run in local mode

2014-11-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1415#comment-1415 ] Patrick Wendell edited comment on SPARK-4556 at 11/22/14 10:1

[jira] [Commented] (SPARK-4556) binary distribution assembly can't run in local mode

2014-11-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1415#comment-1415 ] Patrick Wendell commented on SPARK-4556: Check out make-distribution.sh ra

[jira] [Updated] (SPARK-4377) ZooKeeperPersistenceEngine: java.lang.IllegalStateException: Trying to deserialize a serialized ActorRef without an ActorSystem in scope.

2014-11-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4377: --- Fix Version/s: 1.3.0 > ZooKeeperPersistenceEngine: java.lang.IllegalStateException: Trying

[jira] [Updated] (SPARK-4507) PR merge script should support closing multiple JIRA tickets

2014-11-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4507: --- Labels: starter (was: ) > PR merge script should support closing multiple JIRA tick

[jira] [Updated] (SPARK-1517) Publish nightly snapshots of documentation, maven artifacts, and binary builds

2014-11-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1517: --- Priority: Critical (was: Major) > Publish nightly snapshots of documentation, ma

[jira] [Updated] (SPARK-1517) Publish nightly snapshots of documentation, maven artifacts, and binary builds

2014-11-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1517: --- Target Version/s: 1.3.0 > Publish nightly snapshots of documentation, maven artifacts,

[jira] [Resolved] (SPARK-4542) Post nightly releases

2014-11-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4542. Resolution: Duplicate > Post nightly relea

[jira] [Updated] (SPARK-1517) Publish nightly snapshots of documentation, maven artifacts, and binary builds

2014-11-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1517: --- Fix Version/s: (was: 1.2.0) > Publish nightly snapshots of documentation, maven artifa

[jira] [Updated] (SPARK-2143) Display Spark version on Driver web page

2014-11-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2143: --- Priority: Critical (was: Major) > Display Spark version on Driver web p

[jira] [Commented] (SPARK-4516) Netty off-heap memory use causes executors to be killed by OS

2014-11-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222172#comment-14222172 ] Patrick Wendell commented on SPARK-4516: Okay then I think this is ju

[jira] [Commented] (SPARK-3633) Fetches failure observed after SPARK-2711

2014-11-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221699#comment-14221699 ] Patrick Wendell commented on SPARK-3633: [~nravi] resolved this because

[jira] [Commented] (SPARK-4550) In sort-based shuffle, store map outputs in serialized form

2014-11-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221690#comment-14221690 ] Patrick Wendell commented on SPARK-4550: Not an expert on the internals of

[jira] [Commented] (SPARK-4516) Netty off-heap memory use causes executors to be killed by OS

2014-11-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221519#comment-14221519 ] Patrick Wendell commented on SPARK-4516: Okay sounds good. Does changing

[jira] [Commented] (SPARK-4541) Add --version to spark-submit

2014-11-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14221475#comment-14221475 ] Patrick Wendell commented on SPARK-4541: This is a good idea. > Add --ver
