RE: memory size for caching RDD

2014-09-04 Thread Liu, Raymond
I think there is no public API available to do this. In this case, the best you can do might be to unpersist some RDDs manually. The problem is that this is done per RDD, not per block. Also, if the storage level includes disk, the data on disk will be removed too. Best Rega

Re: memory size for caching RDD

2014-09-04 Thread 牛兆捷
ok. So can I use logic similar to what the block manager does when space fills up? 2014-09-04 15:05 GMT+08:00 Liu, Raymond : > I think there is no public API available to do this. In this case, the > best you can do might be unpersist some RDDs manually. The problem is that > this is done by RDD
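To make the two options in this thread concrete, here is a minimal sketch of a toy block store, assuming a plain least-recently-used (LRU) policy over blocks with a fixed byte budget. It is not Spark code: `ToyBlockStore`, its methods, and the `(rdd_id, partition)` block ids are hypothetical names chosen for illustration, and the real block manager has additional rules that this sketch ignores. It contrasts fine-grained eviction (one block at a time when space fills up) with the coarse per-RDD `unpersist` described above.

```python
from collections import OrderedDict


class ToyBlockStore:
    """Toy in-memory block store with LRU eviction (illustration only, not Spark code)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        # Maps (rdd_id, partition) -> size in bytes, least recently used first.
        self.blocks = OrderedDict()

    def get(self, block_id):
        """Mark a block as recently used; return True if it is cached."""
        if block_id not in self.blocks:
            return False
        self.blocks.move_to_end(block_id)
        return True

    def put(self, block_id, size_bytes):
        """Cache a block, evicting least recently used blocks until it fits.

        Returns the list of evicted block ids, or None if the block can never fit.
        """
        if size_bytes > self.capacity_bytes:
            return None
        if block_id in self.blocks:
            # Re-inserting an existing block: release its old size first.
            self.used_bytes -= self.blocks.pop(block_id)
        evicted = []
        while self.used_bytes + size_bytes > self.capacity_bytes:
            victim, victim_size = self.blocks.popitem(last=False)
            self.used_bytes -= victim_size
            evicted.append(victim)
        self.blocks[block_id] = size_bytes
        self.used_bytes += size_bytes
        return evicted

    def unpersist(self, rdd_id):
        """Drop every block of one RDD at once (the coarse, per-RDD option)."""
        dropped = [b for b in self.blocks if b[0] == rdd_id]
        for block_id in dropped:
            self.used_bytes -= self.blocks.pop(block_id)
        return dropped


store = ToyBlockStore(capacity_bytes=100)
store.put((1, 0), 40)
store.put((1, 1), 40)
store.get((1, 0))                # touch (1, 0), so (1, 1) is now least recently used
evicted = store.put((2, 0), 40)  # does not fit, so one block of RDD 1 is evicted
dropped = store.unpersist(1)     # drops whatever is left of RDD 1 in one call
```

In this toy model, block-level eviction frees only as much as the new block needs, while `unpersist` removes all remaining blocks of an RDD regardless of how recently they were used.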

Dependency hell in Spark applications

2014-09-04 Thread Aniket Bhatnagar
I am trying to use Kinesis as source to Spark Streaming and have run into a dependency issue that can't be resolved without making my own custom Spark build. The issue is that Spark is transitively dependent on org.apache.httpcomponents:httpclient:jar:4.1.2 (I think because of libfb303 coming from

Re: Dependency hell in Spark applications

2014-09-04 Thread Sean Owen
Dumb question -- are you using a Spark build that includes the Kinesis dependency? that build would have resolved conflicts like this for you. Your app would need to use the same version of the Kinesis client SDK, ideally. All of these ideas are well-known, yes. In cases of super-common dependenci

Re: Dependency hell in Spark applications

2014-09-04 Thread Felix Garcia Borrego
Hi, I ran into the same issue and, apart from the ideas Aniket mentioned, I could only find a nasty workaround: adding my custom PoolingClientConnectionManager to my classpath. http://stackoverflow.com/questions/24788949/nosuchmethoderror-while-running-aws-s3-client-on-spark-while-javap-shows-otherwi/25488

Re: Dependency hell in Spark applications

2014-09-04 Thread Koert Kuipers
custom spark builds should not be the answer. at least not if spark ever wants to have a vibrant community for spark apps. spark does support a user-classpath-first option, which would deal with some of these issues, but I don't think it works. On Sep 4, 2014 9:01 AM, "Felix Garcia Borrego" wrote

RE: Is breeze thread safe in Spark?

2014-09-04 Thread Ulanov, Alexander
I've experienced something related to what we discussed. NaiveBayes crashes with native blas/lapack libraries for breeze/netlib on Windows: https://issues.apache.org/jira/browse/SPARK-3403 I've also attached to the issue another example with gradient that crashes in runMiniBatchSGD, probably try

Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Tom Graves
+1. Ran spark on yarn on hadoop 0.23 and 2.x. Tom On Wednesday, September 3, 2014 2:25 AM, Patrick Wendell wrote: Please vote on releasing the following candidate as Apache Spark version 1.1.0! The tag to be voted on is v1.1.0-rc4 (commit 2f9b2bd): https://git-wip-us.apache.org/repos/asf?

Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Gurvinder Singh
On 09/03/2014 04:23 PM, Nicholas Chammas wrote: > On Wed, Sep 3, 2014 at 3:24 AM, Patrick Wendell wrote: > >> == What default changes should I be aware of? == >> 1. The default value of "spark.io.compression.codec" is now "snappy" >> --> Old behavior can be restored by switching to "lzf" >> >> 2.

Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Henry Saputra
LICENSE and NOTICE files are good Hash files are good Signature files are good No 3rd parties executables Source compiled Run local and standalone tests Test persist off heap with Tachyon looks good +1 - Henry On Wed, Sep 3, 2014 at 12:24 AM, Patrick Wendell wrote: > Please vote on releasing th

amplab jenkins is down

2014-09-04 Thread shane knapp
i am trying to get things up and running, but it looks like either the firewall gateway or jenkins server itself is down. i'll update as soon as i know more.

Re: amplab jenkins is down

2014-09-04 Thread shane knapp
looks like a power outage in soda hall. more updates as they happen. On Thu, Sep 4, 2014 at 12:25 PM, shane knapp wrote: > i am trying to get things up and running, but it looks like either the > firewall gateway or jenkins server itself is down. i'll update as soon as > i know more. >

Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Egor Pahomov
+1 Compiled, ran on yarn-hadoop-2.3 simple job. 2014-09-04 22:22 GMT+04:00 Henry Saputra : > LICENSE and NOTICE files are good > Hash files are good > Signature files are good > No 3rd parties executables > Source compiled > Run local and standalone tests > Test persist off heap with Tachyon lo

Re: amplab jenkins is down

2014-09-04 Thread shane knapp
looks like some hardware failed, and we're swapping in a replacement. i don't have more specific information yet -- including *what* failed, as our sysadmin is super busy ATM. the root cause was an incorrect circuit being switched off during building maintenance. on a side note, this incident wi

Re: amplab jenkins is down

2014-09-04 Thread shane knapp
it's a faulty power switch on the firewall, which has been swapped out. we're about to reboot and be good to go. On Thu, Sep 4, 2014 at 1:19 PM, shane knapp wrote: > looks like some hardware failed, and we're swapping in a replacement. i > don't have more specific information yet -- including

Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Nicholas Chammas
On Thu, Sep 4, 2014 at 1:50 PM, Gurvinder Singh wrote: > There is a regression when using pyspark to read data > from HDFS. > Could you open a JIRA with a brief repro? We'll look into it. (You could also provide a repro in a separate thread.) Nick

Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread randomuser54
+1 -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/VOTE-Release-Apache-Spark-1-1-0-RC4-tp8219p8278.html Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

How to kill a Spark job running in local mode programmatically ?

2014-09-04 Thread randomuser54
I have a Java class which calls SparkSubmit.scala with all the arguments to run a Spark job in a thread. I am running them in local mode for now but also want to run them in yarn-cluster mode later. Now, I want to kill the running Spark job (which can be in local or yarn-cluster mode) programmatic

Re: amplab jenkins is down

2014-09-04 Thread shane knapp
AND WE'RE UP! sorry that this took so long... i'll send out a more detailed explanation of what happened soon. now, off to back up jenkins. shane On Thu, Sep 4, 2014 at 1:27 PM, shane knapp wrote: > it's a faulty power switch on the firewall, which has been swapped out. > we're about to re

Re: amplab jenkins is down

2014-09-04 Thread Nicholas Chammas
Woohoo! Thanks Shane. Do you know if queued PR builds will automatically be picked up? Or do we have to ping the Jenkinmensch manually from each PR? Nick On Thu, Sep 4, 2014 at 5:37 PM, shane knapp wrote: > AND WE'RE UP! > > sorry that this took so long... i'll send out a more detailed expla

Re: amplab jenkins is down

2014-09-04 Thread shane knapp
i'd ping the Jenkinsmench... the master was completely offline, so any new jobs wouldn't have reached it. any jobs that were queued when power was lost probably started up, but jobs that were running would fail. On Thu, Sep 4, 2014 at 2:45 PM, Nicholas Chammas wrote: > Woohoo! Thanks Shane. >

Re: amplab jenkins is down

2014-09-04 Thread Nicholas Chammas
It appears that our main man is having trouble hearing new requests. Do we need some smelling salts? On Thu, Sep 4, 2014 at 5:49

Re: amplab jenkins is down

2014-09-04 Thread Patrick Wendell
Hm yeah it seems that it hasn't been polling since 3:45. On Thu, Sep 4, 2014 at 4:21 PM, Nicholas Chammas wrote: > It appears that our main man is having trouble hearing new requests. > > Do we need some smelling salts? > > > On Thu, Sep 4, 2014 at 5:49 PM, shane knapp wrote: >> >> i'd ping the

Re: amplab jenkins is down

2014-09-04 Thread shane knapp
looking On Thu, Sep 4, 2014 at 4:21 PM, Nicholas Chammas wrote: > It appears that our main man is having trouble > > hearing new requests >

Re: amplab jenkins is down

2014-09-04 Thread shane knapp
i'm going to restart jenkins and see if that fixes things. On Thu, Sep 4, 2014 at 4:56 PM, shane knapp wrote: > looking > > > On Thu, Sep 4, 2014 at 4:21 PM, Nicholas Chammas < > nicholas.cham...@gmail.com> wrote: > >> It appears that our main man is having trouble >>

Re: [VOTE] Release Apache Spark 1.1.0 (RC4)

2014-09-04 Thread Kan Zhang
+1 Compiled, ran newly-introduced PySpark Hadoop input/output examples. On Thu, Sep 4, 2014 at 1:10 PM, Egor Pahomov wrote: > +1 > > Compiled, ran on yarn-hadoop-2.3 simple job. > > > 2014-09-04 22:22 GMT+04:00 Henry Saputra : > > > LICENSE and NOTICE files are good > > Hash files are good > >

Re: amplab jenkins is down

2014-09-04 Thread Nicholas Chammas
Looks like during the last build Jenkins was unable to execute a git fetch? On Thu, Sep 4, 2014 at 7:58 PM, shane knapp wrote: > i'm going to restart jenkins and see if that fixes t

Re: amplab jenkins is down

2014-09-04 Thread shane knapp
yep. that's exactly the behavior i saw earlier, and i'll be figuring it out first thing tomorrow morning. i bet it's an environment issue on the slaves. On Thu, Sep 4, 2014 at 7:10 PM, Nicholas Chammas wrote: > Looks like during the last build >