How to convert a non-rdd data to rdd.

2014-10-12 Thread rapelly kartheek
Hi, I am trying to write a String that is not an RDD to HDFS. This data is a variable in the Spark scheduler code. None of the Spark file operations work because my data is not an RDD. So I tried using SparkContext.parallelize(data), but it throws an error: [error]
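One likely issue: parallelize expects a Seq, so a bare String must be wrapped first; the resulting one-element RDD can then be saved with the usual file operations. A minimal sketch, assuming a live SparkContext named sc (the path and contents are illustrative):

```scala
// Sketch: turning a plain String into an RDD so it can be written to
// HDFS. Assumes `sc` is an existing SparkContext; the path is invented.
val data: String = "scheduler state to persist"
val rdd = sc.parallelize(Seq(data))       // RDD[String] with one element
rdd.saveAsTextFile("hdfs:///tmp/scheduler-data")
```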

Re: How to convert a non-rdd data to rdd.

2014-10-12 Thread @Sanjiv Singh
Hi Karthik, can you provide more detail on the dataset data that you want to parallelize with SparkContext.parallelize(data)? Regards, Sanjiv Singh Mob : +091 9990-447-339 On Sun, Oct 12, 2014 at 11:45 AM, rapelly kartheek kartheek.m...@gmail.com wrote: Hi, I am

Re: How to convert a non-rdd data to rdd.

2014-10-12 Thread rapelly kartheek
It's a variable in the spark-1.0.0/*/storage/BlockManagerMaster.scala class: the value returned by the askDriverWithReply() method for the getPeers() method. Basically, it is a Seq[ArrayBuffer]: ArraySeq(ArrayBuffer(BlockManagerId(1, s1, 47006, 0), BlockManagerId(0, s1, 34625, 0)),

Re: How to convert a non-rdd data to rdd.

2014-10-12 Thread Kartheek.R
Hi Sean, I tried even with sc as: sc.parallelize(data). But I get the error: value sc not found. On Sun, Oct 12, 2014 at 1:47 PM, sowen [via Apache Spark User List] ml-node+s1001560n16233...@n3.nabble.com wrote: It is a method of the class, not a static method of the object. Since a
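The "value sc not found" error means no SparkContext is in scope at that point; sc is only pre-defined inside the spark-shell. A minimal sketch of creating one explicitly (the app name is illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// `sc` is not a global: outside the spark-shell you must construct the
// SparkContext yourself before you can call parallelize.
val conf = new SparkConf().setAppName("parallelize-demo")
val sc = new SparkContext(conf)
val rdd = sc.parallelize(Seq("a", "b", "c"))
```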

RE: Spark SQL parser bug?

2014-10-12 Thread Cheng, Hao
Hi, I couldn’t reproduce the bug with the latest master branch. Which version are you using? Can you also list data in the table “x”? case class T(a:String, ts:java.sql.Timestamp) val sqlContext = new org.apache.spark.sql.SQLContext(sc) import sqlContext.createSchemaRDD val data =
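The repro above was truncated; a hypothetical completion is sketched below (the row values are invented, and registerTempTable is the 1.1+ name; earlier releases used registerAsTable):

```scala
// Hypothetical completion of the truncated repro; data values invented.
case class T(a: String, ts: java.sql.Timestamp)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.createSchemaRDD
val data = sc.parallelize(Seq(
  T("a", new java.sql.Timestamp(System.currentTimeMillis()))))
data.registerTempTable("x")
sqlContext.sql("SELECT a, ts FROM x").collect().foreach(println)
```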

Re: How to convert a non-rdd data to rdd.

2014-10-12 Thread Kartheek.R
Does SparkContext exist when this part (askDriverWithReply()) of the scheduler code gets executed? On Sun, Oct 12, 2014 at 1:54 PM, rapelly kartheek kartheek.m...@gmail.com wrote: Hi Sean, I tried even with sc as: sc.parallelize(data). But I get the error: value sc not found. On Sun, Oct
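In BlockManagerMaster and similar scheduler-side code there is generally no SparkContext in scope, so parallelize is not an option there; the plain string can instead be written straight to HDFS with the Hadoop FileSystem API. A sketch, with an invented path and contents:

```scala
import java.nio.charset.StandardCharsets
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Writing a String to HDFS without any RDD. Assumes the Hadoop config
// on the classpath points at the target cluster; the path is invented.
val fs = FileSystem.get(new Configuration())
val out = fs.create(new Path("/tmp/block-manager-peers.txt"))
try out.write("peers snapshot".getBytes(StandardCharsets.UTF_8))
finally out.close()
```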

Re: Interactive interface tool for spark

2014-10-12 Thread andy petrella
Dear Sparkers, As promised, I've just updated the repo with a new name (for the sake of clarity), default branch, but especially with a dedicated README containing: * explanations on how to launch and use it * an intro on each feature like Spark, Classpaths, SQL, Dynamic update, ... * pictures

ClassNotFoundException was thrown while trying to save rdd

2014-10-12 Thread Tao Xiao
Hi all, I'm using CDH 5.0.1 (Spark 0.9) and submitting a job in Spark Standalone Cluster mode. The job is quite simple, as follows: object HBaseApp { def main(args: Array[String]) { testHBase("student", "/test/xt/saveRDD") } def testHBase(tableName: String, outFile: String) {

Re: ClassNotFoundException was thrown while trying to save rdd

2014-10-12 Thread Ted Yu
Your app is named scala.HBaseApp. Does it read/write to HBase? Just curious. On Sun, Oct 12, 2014 at 8:00 AM, Tao Xiao xiaotao.cs@gmail.com wrote: Hi all, I'm using CDH 5.0.1 (Spark 0.9) and submitting a job in Spark Standalone Cluster mode. The job is quite simple as follows:

setting heap space

2014-10-12 Thread Chengi Liu
Hi, I am trying to use Spark but I am having a hard time configuring the SparkConf... My current conf is conf = SparkConf().set("spark.executor.memory", "10g").set("spark.akka.frameSize", "1").set("spark.driver.memory", "16g") but I still see the java heap size error 14/10/12 09:54:50 ERROR Executor:
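Note that spark.driver.memory set inside the program usually has no effect, because the driver JVM has already started by then; driver memory is better supplied at launch time. An illustrative invocation (Spark 1.0+; the file name is invented):

```shell
# Illustrative: pass memory settings at submit time rather than in code.
spark-submit \
  --driver-memory 16g \
  --conf spark.executor.memory=10g \
  my_app.py
```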

Re: Interactive interface tool for spark

2014-10-12 Thread Jaonary Rabarisoa
And what about Hue http://gethue.com ? On Sun, Oct 12, 2014 at 1:26 PM, andy petrella andy.petre...@gmail.com wrote: Dear Sparkers, As promised, I've just updated the repo with a new name (for the sake of clarity), default branch but specially with a dedicated README containing: *

Re: Interactive interface tool for spark

2014-10-12 Thread andy petrella
Yeah, if it allows crafting some Scala/Spark code in a shareable manner, it is another good option! Thx for sharing. aℕdy ℙetrella about.me/noootsab On Sun, Oct 12, 2014 at 9:47 PM, Jaonary Rabarisoa jaon...@gmail.com wrote: And

NullPointerException when deploying JAR to standalone cluster..

2014-10-12 Thread Jorge Simão
Hi, everybody! I'm trying to deploy a simple app on a Spark standalone cluster with a single node (the localhost). Unfortunately, something goes wrong while processing the JAR file and a NullPointerException is thrown. I'm running everything on a single machine with Windows 8. Check below

Spark in cluster and errors

2014-10-12 Thread Morbious
Hi, Can anyone point me to how Spark works? Why is it trying to connect from master port A to master port ABCD in cluster mode with 6 workers? 14/10/09 19:37:19 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkWorker@...:7078] - [akka.tcp://sparkExecutor@...:53757]: Error
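The high port in the error is an executor's randomly chosen Akka port, not a second master port. When a firewall sits between the nodes, one common workaround is to pin the otherwise-random ports and open them. A hypothetical spark-defaults.conf fragment (property availability varies by Spark version, and the port numbers are arbitrary):

```
spark.driver.port        51000
spark.fileserver.port    51100
spark.broadcast.port     51200
spark.blockManager.port  51300
```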

Nested Query using SparkSQL 1.1.0

2014-10-12 Thread shahab
Hi, Apparently it is possible to query nested JSON using Spark SQL, but, mainly due to the lack of proper documentation/examples, I did not manage to make it work. I would appreciate it if you could point me to any example or help with this issue. Here is my code: val anotherPeopleRDD =
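A sketch along the lines of the Spark SQL 1.1 docs' jsonRDD example: nested fields are reached with dot notation in the query. Assumes sc and a SQLContext already exist; the JSON record is illustrative.

```scala
// Sketch: querying a nested JSON field with Spark SQL 1.1's jsonRDD.
val anotherPeopleRDD = sc.parallelize(
  """{"name":"Yin","address":{"city":"Columbus","state":"Ohio"}}""" :: Nil)
val people = sqlContext.jsonRDD(anotherPeopleRDD)
people.registerTempTable("people")
// Dot notation reaches into the nested `address` struct:
sqlContext.sql("SELECT name, address.city FROM people").collect()
```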

Re: Spark in cluster and errors

2014-10-12 Thread Jorge Simão
You have a connection refused error. You need to check: - that the master is listening on the specified host:port; - that no firewall is blocking access; - that your config points to the master's host:port. Check the host name from the web console. Send more details about your cluster layout for further help.
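The checklist above can be exercised from a worker host with standard tools (host name and ports are illustrative; 7077 and 8080 are the standalone defaults):

```shell
# Is the master's service port reachable from this worker?
nc -vz master-host 7077
# Is the master web UI responding?
curl -sI http://master-host:8080 | head -n 1
```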

Re: Spark job doesn't clean after itself

2014-10-12 Thread Rohit Pujari
Reviving this... any thoughts, experts? On Thu, Oct 9, 2014 at 3:47 PM, Rohit Pujari rpuj...@hortonworks.com wrote: Hello Folks: I'm running a Spark job on YARN. After the execution, I would expect the Spark job to clean the staging area, but it seems every run creates a new staging directory.
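Two things worth checking: that the application calls sc.stop() so the shutdown hook runs, and that staging-file preservation is not enabled. A hedged config fragment (this property exists in the YARN integration of that era; consult your version's docs):

```
spark.yarn.preserve.staging.files  false
```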

Re: ClassNotFoundException was thrown while trying to save rdd

2014-10-12 Thread Tao Xiao
In the beginning I tried to read HBase and found that the exception was thrown, then I started to debug the app. I removed the code reading HBase and tried to save an RDD containing a list, and the exception was still thrown. So I'm sure the exception was not caused by reading HBase. While debugging I
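On standalone clusters of that era, a ClassNotFoundException at job time often means the application jar never reached the executors; shipping it explicitly is one sketch of a fix (the jar path is invented):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch: ship the application jar to the executors explicitly so the
// job's classes can be loaded there. The jar path is illustrative.
val conf = new SparkConf()
  .setAppName("HBaseApp")
  .setJars(Seq("/path/to/hbase-app.jar"))
val sc = new SparkContext(conf)
sc.parallelize(List(1, 2, 3)).saveAsTextFile("/test/xt/saveRDD")
```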

Re: small bug in pyspark

2014-10-12 Thread Josh Rosen
Hi Andy, You may be interested in https://github.com/apache/spark/pull/2651, a recent pull request of mine which cleans up / simplifies the configuration of PySpark's Python executables. For instance, it makes it much easier to control which Python options are passed when launching the PySpark
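For context, the Python executable used by PySpark is chosen via environment variables; an illustrative session (the exact variable set depends on the Spark version, and the pull request above reworks precisely this area):

```shell
# Illustrative: select the Python interpreter PySpark will run.
export PYSPARK_PYTHON=/usr/bin/python2.7
pyspark
```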

Re: What if I port Spark from TCP/IP to RDMA?

2014-10-12 Thread Josh Rosen
Hi Theo, Check out *spark-perf*, a suite of performance benchmarks for Spark: https://github.com/databricks/spark-perf. - Josh On Fri, Oct 10, 2014 at 7:27 PM, Theodore Si sjyz...@gmail.com wrote: Hi, Let's say that I managed to port Spark from TCP/IP to RDMA. What tool or benchmark can I
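At the time of writing, the spark-perf README describes a flow along these lines (a sketch; check the repo for the current steps):

```shell
git clone https://github.com/databricks/spark-perf.git
cd spark-perf
cp config/config.py.template config/config.py   # then edit for your cluster
./bin/run
```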