Hi DB,
Thanks a lot.
Appreciated.
BR,
Aslan
On Sun, Jun 8, 2014 at 2:52 AM, DB Tsai wrote:
> Hi Aslan,
>
> You can check out the unittest code of GradientDescent.runMiniBatchSGD
>
>
> https://github.com/apache/spark/blob/master/mllib/src/test/scala/org/apache/spark/mllib/optimization/Gradient
Hi All,
I am new to Spark.
In the Spark shell, how can I get the help or explanation for the
functions that I can use on a variable or RDD? For example, after I input an
RDD's name with a dot (.) at the end, if I press the Tab key, a list of
functions that I can use on this RDD will be displayed.
You can consult the docs at:
https://spark.apache.org/docs/latest/api/scala/index.html#package
In particular, the RDD docs contain the explanation of each method:
https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.rdd.RDD
Kr, Gerard
On Jun 8, 2014 1:00 PM, "Carter" wrote:
Thank you very much Gerard.
Yes.. but from what I understand that's a "sliding window" so for a window
of (60) over (1) second DStreams, that would save the entire last minute of
data once per second. That's more than I need.
I think what I'm after is probably updateStateByKey... I want to mutate
data structures (probably ev
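For reference, a rough sketch of updateStateByKey (untested; the running
count per key is just an illustrative state, `words` stands in for a
DStream[String], and it needs ssc.checkpoint(...) configured):

    // fold each batch's new values into one piece of state per key
    val counts = words.map((_, 1L)).updateStateByKey[Long] {
      (newValues: Seq[Long], state: Option[Long]) =>
        Some(state.getOrElse(0L) + newValues.sum)
    }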
I read it more carefully, and window() might actually work for some other
stuff like logs. (assuming I can have multiple windows with entirely
different attributes on a single stream..)
Thanks for that!
On Sun, Jun 8, 2014 at 11:11 PM, Jeremy Lee
wrote:
> Yes.. but from what I understand that'
Yeah... Have not tried it, but if you set the slidingDuration == windowDuration
that should prevent overlaps.
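Something like this, maybe (untested; `lines` stands in for your DStream,
and Seconds comes from org.apache.spark.streaming):

    // slide length == window length gives "tumbling" 60s windows: each
    // batch falls into exactly one window, so nothing is saved twice
    val tumbling = lines.window(Seconds(60), Seconds(60))
    tumbling.saveAsTextFiles("minutely")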
Gino B.
> On Jun 8, 2014, at 8:25 AM, Jeremy Lee wrote:
>
> I read it more carefully, and window() might actually work for some other
> stuff like logs. (assuming I can have multiple
Hi All,
I just downloaded the Scala IDE for Eclipse. After I created a Spark project
and clicked "Run", there was an error on this line of code "import
org.apache.spark.SparkContext": "object apache is not a member of package
org". I guess I need to import the Spark dependency into Scala IDE for
Eclipse.
Project->Properties->Java Build Path->Add External Jars
Add the /spark-1.0.0-bin-hadoop2/lib/spark-assembly-1.0.0-hadoop2.2.0.jar
Cheers
On Sun, Jun 8, 2014 at 8:06 AM, Carter wrote:
> Hi All,
>
> I just downloaded the Scala IDE for Eclipse. After I created a Spark
> project
> and clicked "Run
This will make the compilation pass, but you may not be able to run it
correctly.
I used Maven, adding these two dependencies (I use Hadoop 1); Maven pulled in
their dependent jars (a lot) for me.

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.0.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>1.2.1</version>
    </dependency>
Best
I shut down my first (working) cluster and brought up a fresh one... and
it's been a bit of a horror, and I need to sleep now. Should I be worried
about these errors? Or did I just have the old log4j.config tuned so I
didn't see them?
I
14/06/08 16:32:52 ERROR scheduler.JobScheduler: Error running
A match clause needs to cover all the possibilities, and not matching
any regex is a distinct possibility. It's not really like 'switch',
because match requires exhaustiveness, and I think that has benefits, like
being able to interpret a match as something with a type. I think it's all
in order, but it's more of
When you use match, the match must be exhaustive. That is, a match error is
thrown if the match fails.
That's why you usually handle the default case using "case _ => ..."
Here it looks like you're taking the text of all statuses - which means not all
of them will be commands... Which means
>
> The solution is either to add a default case which does nothing, or
> probably better to add a .filter such that you filter out anything that's
> not a command before matching.
>
And you probably want to push down that filter into the cluster --
collecting all of the elements of an RDD only to
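Something like this, perhaps (untested; `statuses` and the regex are
placeholders, getText as in twitter4j):

    // drop obvious non-commands on the workers, before any match runs
    val CommandRegex = """!(\w+).*""".r
    val commands = statuses.map(_.getText).filter(_.startsWith("!"))
    val actions = commands.map {
      case CommandRegex(cmd) => Some(cmd)
      case _                 => None // default case keeps the match exhaustive
    }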
Hi,
I am stuck here; my cluster is not efficiently utilized. Appreciate any
input on this.
Thanks
Subacini
On Sat, Jun 7, 2014 at 10:54 PM, Subacini B wrote:
> Hi All,
>
> My cluster has 5 workers, each having 4 cores (so 20 cores in total). It is in
> stand alone mode (not using Mesos or Yarn)
Hi All,
I was writing a simple Streaming job to get more understanding about Spark
streaming.
I don't understand the union behaviour in this particular case.
*WORKS:*
val lines = ssc.socketTextStream("localhost", ,
StorageLevel.MEMORY_AND_DISK_SER)
val words = lines.flatMap(_.
In PySpark you can also do help(my_rdd) and get a nice help page of methods
available.
On Sunday, June 8, 2014, Carter wrote:
> Thank you very much Gerard.
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-get-the-help-or-explanation-for-the-funct
Have a look at:
https://spark.apache.org/docs/1.0.0/job-scheduling.html
https://spark.apache.org/docs/1.0.0/spark-standalone.html
The default is to grab resources on all nodes. In your case you could set
spark.cores.max to 2 or less to enable running two apps on a cluster of
4-core machines simultaneously.
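For example, in the application itself (a sketch; the value and app name
are illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    // cap this app at 2 cores so a second app can get the rest
    val conf = new SparkConf()
      .setAppName("app-one")
      .set("spark.cores.max", "2")
    val sc = new SparkContext(conf)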
Moving over to the dev list, as this isn't a user-scope issue.
I just ran into this issue with the missing saveAsTextFile, and here's a
little additional information:
- Code ported from 0.9.1 up to 1.0.0; works with local[n] in both cases.
- Driver built as an uberjar via Maven.
- Deployed to sma
Paul,
Could you give the version of Java that you are building with and the
version of Java you are running with? Are they the same?
Just off the cuff, I wonder if this is related to:
https://issues.apache.org/jira/browse/SPARK-1520
If it is, it could appear that certain functions are not in the
Also I should add - thanks for taking time to help narrow this down!
On Sun, Jun 8, 2014 at 1:02 PM, Patrick Wendell wrote:
> Paul,
>
> Could you give the version of Java that you are building with and the
> version of Java you are running with? Are they the same?
>
> Just off the cuff, I wonder
I suspect Patrick is right about the cause. The Maven artifact that
was released does contain this class (phew)
http://search.maven.org/#artifactdetails%7Corg.apache.spark%7Cspark-core_2.10%7C1.0.0%7Cjar
As to the hadoop1 / hadoop2 artifact question -- agree that is often
done. Here the working t
Hi, Patrick --
Java 7 on the development machines:
» java -version
java version "1.7.0_51"
Java(TM) SE Runtime Environment (build 1.7.0_51-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.51-b03, mixed mode)
And on the deployed boxes:
$ java -version
Okay I think I've isolated this a bit more. Let's discuss over on the JIRA:
https://issues.apache.org/jira/browse/SPARK-2075
On Sun, Jun 8, 2014 at 1:16 PM, Paul Brown wrote:
>
> Hi, Patrick --
>
> Java 7 on the development machines:
>
> » java -version
> java version "1.7.0_51"
> Java(TM)
Thanks Sean, let me try to set spark.deploy.spreadOut as false.
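If I remember the standalone docs right, that one is read by the master
rather than by the application, so it would go into the master's
environment, e.g. in conf/spark-env.sh (then restart the master):

    export SPARK_MASTER_OPTS="-Dspark.deploy.spreadOut=false"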
On Sun, Jun 8, 2014 at 12:44 PM, Sean Owen wrote:
> Have a look at:
>
> https://spark.apache.org/docs/1.0.0/job-scheduling.html
> https://spark.apache.org/docs/1.0.0/spark-standalone.html
>
> The default is to grab resource on al
Gaurav,
I am not sure that the "*" expands to what you expect it to.
Normally bash expands "*" to a space-separated string, not a
colon-separated one. Try specifying all the jars manually, maybe?
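For example (untested):

    # build a colon-separated classpath from all jars in lib/
    CLASSPATH=$(echo lib/*.jar | tr ' ' ':')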
Tobias
On Thu, Jun 5, 2014 at 6:45 PM, Gaurav Dasgupta wrote:
> Hi,
>
> I have written my own cust
On Sun, Jun 8, 2014 at 10:00 AM, Nick Pentreath
wrote:
> When you use match, the match must be exhaustive. That is, a match error
> is thrown if the match fails.
Ahh, right. That makes sense. Scala is applying its "strong typing" rules
here instead of "no ceremony"... but isn't the idea that t
Jeremy,
On Mon, Jun 9, 2014 at 10:22 AM, Jeremy Lee
wrote:
>> When you use match, the match must be exhaustive. That is, a match error
>> is thrown if the match fails.
>
> Ahh, right. That makes sense. Scala is applying its "strong typing" rules
> here instead of "no ceremony"... but isn't the id
I'm having some trouble getting a basic matrix multiply to work with Breeze.
I'm pretty sure it's related to my classpath. My setup is a cluster on AWS
with 8 m3.xlarges. To create the cluster I used the provided ec2 scripts and
Spark 1.0.0.
I've made a gist with the relevant pieces of my app:
ht
Dear All,
I recently installed Spark 1.0.0 on a 10-slave dedicated cluster. However,
the max input rate that the system can sustain with stable latency seems
very low. I use a simple word counting workload over tweets:
theDStream.flatMap(extractWordOnePairs).reduceByKey(sumFunc).count.print
With
Hi,
I had a similar problem; I was using `sbt assembly` to build a jar
containing all my dependencies, but since my file system has a problem
with long file names (due to disk encryption), some class files (which
correspond to functions in Scala) were not included in the jar I
uploaded. Although,
Thanks a lot Krishna, this works for me.
Thanks for your reply Wei, will try this.
Thanks for the quick response. No, I actually build my jar via 'sbt package'
on EC2 on the master itself.
Hi dlaw,
You are using breeze-0.8.1, but the Spark assembly jar depends on
breeze-0.7. If the Spark assembly jar comes first on the classpath
but the method from DenseMatrix is only available in breeze-0.8.1, you
get a NoSuchMethodError. So,
a) If you don't need the features in breeze-0.8.1, do not
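For example, with sbt (a sketch; coordinates from memory, so double-check
them against Maven Central):

    // build.sbt -- compile against the breeze the Spark 1.0.0
    // assembly already ships, so the classpath agrees at runtime
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "1.0.0" % "provided",
      "org.scalanlp"     %% "breeze"     % "0.7"
    )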
Hi Tobias,
Which file system and which encryption are you using?
Best,
Xiangrui
On Sun, Jun 8, 2014 at 10:16 PM, Xiangrui Meng wrote:
> Hi dlaw,
>
> You are using breeze-0.8.1, but the spark assembly jar depends on
> breeze-0.7. If the spark assembly jar comes the first on the classpath
> but t
Without (C), what is the best practice to implement the following scenario?
1. rdd = sc.textFile(FileA)
2. rdd = rdd.map(...) // actually modifying the rdd
3. rdd.saveAsTextFile(FileA)
Since the rdd transformation is 'lazy', rdd will not materialize until
saveAsTextFile(), so FileA must still exist
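One common pattern (a sketch; the temp path and the swap step are
illustrative, not from this thread) is to write somewhere else first and
swap only after the job succeeds:

    import org.apache.hadoop.fs.{FileSystem, Path}

    val rdd = sc.textFile("FileA").map(line => line) // your map() here
    rdd.saveAsTextFile("FileA_tmp")                  // materialize first

    // only after the save succeeded, swap the directories
    val fs = FileSystem.get(sc.hadoopConfiguration)
    fs.delete(new Path("FileA"), true)
    fs.rename(new Path("FileA_tmp"), new Path("FileA"))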
Hi Jacob,
The port configuration docs that we worked on together are now available
at:
http://spark.apache.org/docs/latest/spark-standalone.html#configuring-ports-for-network-security
Thanks for the help!
Andrew
On Wed, May 28, 2014 at 3:21 PM, Jacob Eisinger wrote:
> Howdy Andrew,
>
> This