Re: Implementing a custom Spark shell

2014-02-26 Thread Matei Zaharia
In Spark 0.9 and master, you can pass the -i argument to spark-shell to load a script containing commands before opening the prompt. This is also a feature of the Scala shell as a whole (try scala -help for details). Also, once you're in the shell, you can use :load file.scala to execute the contents of that file.
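For instance, a minimal init script might look like this (a sketch: the file name init.scala and the input path are hypothetical, and sc is the SparkContext that spark-shell predefines):

    // init.scala -- run before the prompt opens when the shell is
    // started as: bin/spark-shell -i init.scala
    // (or later, from a running shell, with :load init.scala)
    val data = sc.textFile("/tmp/input.txt")  // hypothetical path
    println("lines in input: " + data.count())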

Re: ReduceByKey or groupByKey to Count?

2014-02-26 Thread dmpour23
If I use groupByKey as so... JavaPairRDD&lt;String, List&lt;String&gt;&gt; twos = ones.groupByKey(3).cache(); How would I write the contents of the lists of strings to a file or to Hadoop? Do I need to transform the JavaPairRDD to a JavaRDD and call saveAsTextFile?
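The question uses the Java API, but the same pipeline reads as follows in Scala (a sketch for spark-shell, where sc is predefined; the input path, output path, and the shape of `ones` are assumptions carried over from the question):

    import org.apache.spark.SparkContext._   // pair-RDD functions

    // hypothetical input: key is the first tab-separated field
    val ones = sc.textFile("/tmp/input.txt").map(line => (line.split("\t")(0), line))
    val twos = ones.groupByKey(3).cache()     // RDD[(String, Seq[String])]

    // Write one tab-separated line per key. map already returns a
    // plain RDD[String], so saveAsTextFile can be called directly;
    // no separate conversion step is needed.
    twos.map { case (k, vs) => k + "\t" + vs.mkString(",") }.saveAsTextFile("/tmp/twos-out")

    // If the goal is only a count per key (per the thread title),
    // reduceByKey avoids materialising the full value lists:
    val counts = ones.mapValues(_ => 1).reduceByKey(_ + _)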

Build Spark in IntelliJ IDEA 13

2014-02-26 Thread Yanzhe Chen
Hi all, I'm trying to build Spark in IntelliJ IDEA 13. I cloned the latest repo and ran sbt/sbt gen-idea in the root folder, then imported the project into IntelliJ IDEA. The Scala plugin for IntelliJ IDEA is installed. Everything seems OK until I run Build > Make Project: Information: Using javac

Re: Build Spark in IntelliJ IDEA 13

2014-02-26 Thread Sean Owen
I also use IntelliJ 13 on a Mac, with only Java 7, and have never seen this. If you look at the Spark build, you will see that it specifies Java 6, not 7. Even if you changed java.version in the build, you would not get this error, since it specifies source and target to be the same value. In

Re: Dealing with headers in csv file pyspark

2014-02-26 Thread Chengi Liu
I am not sure... the suggestion is to open a TB-sized file and remove a line? That doesn't sound that good. I am hacking my way around it by using a filter. Can I put a try/except clause in my lambda function? Maybe I should just try that out. But thanks for the suggestion. Also, can I run scripts against Spark
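The question concerns PySpark, but the filter hack mentioned here reads as follows in Scala (a sketch for spark-shell: the path and the header predicate are assumptions). The mapPartitionsWithIndex variant drops only the literal first line of the file instead of testing every row:

    val lines = sc.textFile("/tmp/data.csv")   // hypothetical path

    // Hack: keep every line that does not look like the header.
    // The "id," prefix is an assumption -- adapt it to the real header.
    val noHeader = lines.filter(line => !line.startsWith("id,"))

    // Safer: drop only the first line of the first partition, so a
    // data row that happens to start with "id," is not discarded.
    val noHeader2 = lines.mapPartitionsWithIndex { (i, iter) =>
      if (i == 0) iter.drop(1) else iter
    }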

Actors and sparkcontext actions

2014-02-26 Thread Ognen Duzlevski
Can someone point me to a simple, short code example of creating a basic Actor that gets a SparkContext and runs an operation such as .textFile.count? I am trying to figure out how to create just a basic actor that gets a message like this: case class Msg(filename: String, ctx: SparkContext) and
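A minimal sketch of such an actor, assuming Akka (which Spark 0.9 bundles) and a hypothetical input file:

    import akka.actor.{Actor, ActorSystem, Props}
    import org.apache.spark.SparkContext

    case class Msg(filename: String, ctx: SparkContext)

    // Basic actor: counts the lines of the named file using the
    // SparkContext carried in the message.
    class CountActor extends Actor {
      def receive = {
        case Msg(filename, ctx) =>
          val n = ctx.textFile(filename).count()
          println("count of " + filename + ": " + n)
      }
    }

    object ActorDemo extends App {
      val sc = new SparkContext("local[2]", "actor-demo")
      val system = ActorSystem("demo")
      val counter = system.actorOf(Props[CountActor], "counter")
      counter ! Msg("/tmp/input.txt", sc)   // hypothetical file
      // remember system.shutdown() and sc.stop() when done
    }

Note that this only works while the actor and the context live in the same JVM: a SparkContext is not serializable, so such a message cannot cross process boundaries.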

worker keeps getting disassociated upon a failed job spark version 0.90

2014-02-26 Thread Shirish
I am a newbie! I am running Spark 0.9.0 in standalone mode on my Mac. The master and worker run on the same machine. Both of them start up fine (at least that is what I see in the logs). *Upon start-up the master log is:* 14/02/26 15:38:08 INFO Slf4jLogger: Slf4jLogger started 14/02/26 15:38:08

Re: [incubating-0.9.0] Too Many Open Files on Workers

2014-02-26 Thread Rohit Rai
Hello Andy, This is a problem we have seen when using the CQL Java driver under heavy read loads: it uses NIO and waits on many pending responses, which causes too many open sockets and hence too many open files. Are you by any chance using async queries? I am the maintainer of