Hi,
I am trying to write a String that is not an RDD to HDFS. This data is a
variable in Spark scheduler code. None of the Spark file operations are
working because my data is not an RDD.
So, I tried using SparkContext.parallelize(data). But it throws error:
[error]
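For what it's worth, a plain String can be written to HDFS directly with the Hadoop FileSystem API, with no RDD involved; a minimal sketch, where the namenode URI and output path are placeholders rather than values from this thread:

```scala
import java.nio.charset.StandardCharsets
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Write a plain String straight to HDFS, no RDD involved.
// The URI and path below are placeholders, not values from this thread.
def writeStringToHdfs(data: String, path: String): Unit = {
  val conf = new Configuration()
  conf.set("fs.defaultFS", "hdfs://namenode:9000") // placeholder
  val fs = FileSystem.get(conf)
  val out = fs.create(new Path(path))
  try out.write(data.getBytes(StandardCharsets.UTF_8))
  finally out.close()
}
```

This sidesteps the SparkContext question entirely, which matters because scheduler-internal code may run where no SparkContext is in scope.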
Hi Karthik,
Can you provide more detail about the dataset `data` that you wanted to
parallelize with
SparkContext.parallelize(data)?
Regards,
Sanjiv Singh
Mob : +091 9990-447-339
On Sun, Oct 12, 2014 at 11:45 AM, rapelly kartheek kartheek.m...@gmail.com
wrote:
Hi,
I am
It's a variable in the spark-1.0.0/*/storage/BlockManagerMaster.scala class:
the return value of the AskDriverWithReply() method for the getPeers() method.
Basically, it is a Seq[ArrayBuffer]:
ArraySeq(ArrayBuffer(BlockManagerId(1, s1, 47006, 0), BlockManagerId(0, s1,
34625, 0)),
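If a SparkContext really were in scope, a Seq of ArrayBuffers like the one above could be parallelized directly; a minimal sketch, with (id, host, port) tuples standing in for BlockManagerId and a local master used purely for illustration:

```scala
import scala.collection.mutable.ArrayBuffer
import org.apache.spark.{SparkConf, SparkContext}

// Local context just for illustration; inside scheduler code no
// SparkContext may be available, which is the real obstacle here.
val sc = new SparkContext(
  new SparkConf().setAppName("peers").setMaster("local[*]"))

// (id, host, port) tuples stand in for BlockManagerId.
val data: Seq[ArrayBuffer[(Int, String, Int)]] = Seq(
  ArrayBuffer((1, "s1", 47006), (0, "s1", 34625)))
val rdd = sc.parallelize(data)
rdd.saveAsTextFile("/tmp/peers") // placeholder output path
sc.stop()
```

The catch, as the replies below note, is whether any SparkContext exists at the point in the scheduler where this data lives.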
Hi Sean,
I even tried with sc, as sc.parallelize(data). But I get the error: value
sc not found.
On Sun, Oct 12, 2014 at 1:47 PM, sowen [via Apache Spark User List]
ml-node+s1001560n16233...@n3.nabble.com wrote:
It is a method of the class, not a static method of the object. Since a
Hi, I couldn’t reproduce the bug with the latest master branch. Which version
are you using? Can you also list data in the table “x”?
case class T(a:String, ts:java.sql.Timestamp)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.createSchemaRDD
val data =
Does SparkContext exist when this part (AskDriverWithReply()) of the
scheduler code gets executed?
On Sun, Oct 12, 2014 at 1:54 PM, rapelly kartheek kartheek.m...@gmail.com
wrote:
Hi Sean,
I even tried with sc, as sc.parallelize(data). But I get the error: value
sc not found.
On Sun, Oct
Dear Sparkers,
As promised, I've just updated the repo with a new name (for the sake of
clarity) and default branch, but especially with a dedicated README containing:
* explanations on how to launch and use it
* an intro on each feature like Spark, Classpaths, SQL, Dynamic update, ...
* pictures
Hi all,
I'm using CDH 5.0.1 (Spark 0.9) and submitting a job in Spark Standalone
Cluster mode.
The job is quite simple as follows:
object HBaseApp {
  def main(args: Array[String]) {
    testHBase("student", "/test/xt/saveRDD")
  }
  def testHBase(tableName: String, outFile: String) {
Your app is named scala.HBaseApp
Does it read / write to HBase ?
Just curious.
On Sun, Oct 12, 2014 at 8:00 AM, Tao Xiao xiaotao.cs@gmail.com wrote:
Hi all,
I'm using CDH 5.0.1 (Spark 0.9) and submitting a job in Spark Standalone
Cluster mode.
The job is quite simple as follows:
Hi,
I am trying to use Spark but I am having a hard time configuring the
SparkConf...
My current conf is
conf =
SparkConf().set("spark.executor.memory", "10g").set("spark.akka.frameSize",
"1").set("spark.driver.memory", "16g")
but I still see the java heap size error
14/10/12 09:54:50 ERROR Executor:
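One likely culprit (an assumption about the setup, not something stated above): spark.driver.memory only takes effect if it is set before the driver JVM starts, so setting it in application code is usually too late; it belongs on the spark-submit command line (--driver-memory 16g). In Scala form, the executor-side settings would look like:

```scala
import org.apache.spark.SparkConf

// Executor-side settings can go through SparkConf; every value is a String.
// spark.driver.memory, by contrast, must be set before the driver JVM
// launches (e.g. spark-submit --driver-memory 16g), not in code.
val conf = new SparkConf()
  .setAppName("myApp")
  .set("spark.executor.memory", "10g")
  .set("spark.akka.frameSize", "1") // in MB; 1 MB is a very small frame size
```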
And what about Hue http://gethue.com ?
On Sun, Oct 12, 2014 at 1:26 PM, andy petrella andy.petre...@gmail.com
wrote:
Dear Sparkers,
As promised, I've just updated the repo with a new name (for the sake of
clarity) and default branch, but especially with a dedicated README containing:
*
Yeah, if it allows you to craft some Scala/Spark code in a shareable manner, it
is another good option!
thx for sharing
aℕdy ℙetrella
about.me/noootsab
On Sun, Oct 12, 2014 at 9:47 PM, Jaonary Rabarisoa jaon...@gmail.com
wrote:
And
Hi, everybody!
I'm trying to deploy a simple app on a Spark standalone cluster with a single
node (the localhost).
Unfortunately, something goes wrong while processing the JAR file and a
NullPointerException is thrown.
I'm running everything on a single machine with Windows 8.
Check below
Hi,
Can anyone explain how Spark works?
Why is it trying to connect from master port A to master port ABCD in
cluster mode with 6 workers?
14/10/09 19:37:19 ERROR remote.EndpointWriter: AssociationError
[akka.tcp://sparkWorker@...:7078] - [akka.tcp://sparkExecutor@...:53757]:
Error
Hi,
Apparently it is possible to query nested JSON using Spark SQL but, mainly
due to a lack of proper documentation/examples, I did not manage to make it
work. I would appreciate it if you could point me to any example or help
with this issue.
Here is my code:
val anotherPeopleRDD =
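A minimal sketch of the nested-JSON route, following the pattern in the Spark 1.1 SQL programming guide (the JSON records here are made up, and an existing SparkContext `sc` is assumed):

```scala
import org.apache.spark.sql.SQLContext

// Assumes an existing SparkContext `sc`; the JSON records are made up.
val sqlContext = new SQLContext(sc)
val anotherPeopleRDD = sc.parallelize(
  """{"name":"Yin","address":{"city":"Columbus","state":"Ohio"}}""" :: Nil)
val people = sqlContext.jsonRDD(anotherPeopleRDD)
people.registerTempTable("people")
// Nested fields are reached with dot notation in SQL:
sqlContext.sql("SELECT name, address.city FROM people")
  .collect().foreach(println)
```

The key point is that jsonRDD infers the nested schema, after which nested fields are addressed with dot notation like address.city.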
You have a connection refused error.
You need to check:
- That the master is listening on the specified host:port.
- That no firewall is blocking access.
- That your config is pointing to the master host:port. Check the host
name from the web console.
Send more details about the cluster layout for further diagnosis.
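The third check can be sketched as follows; the host and port are placeholders, and the authoritative value is the spark:// URL shown at the top of the master's web console:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// The master URL must match exactly what the master's web UI reports,
// typically spark://<hostname>:7077. Host and port here are placeholders.
val conf = new SparkConf()
  .setAppName("myApp")
  .setMaster("spark://master-host:7077")
val sc = new SparkContext(conf)
```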
Reviving this... any thoughts, experts?
On Thu, Oct 9, 2014 at 3:47 PM, Rohit Pujari rpuj...@hortonworks.com
wrote:
Hello Folks:
I'm running a Spark job on YARN. After the execution, I would expect the
Spark job to clean up the staging area, but it seems every run creates a new
staging directory.
In the beginning I tried to read HBase and found that an exception was thrown,
so I started to debug the app. I removed the code reading HBase and tried
to save an RDD containing a list, and the exception was still thrown. So I'm
sure the exception was not caused by reading HBase.
While debugging I
Hi Andy,
You may be interested in https://github.com/apache/spark/pull/2651, a
recent pull request of mine which cleans up / simplifies the configuration
of PySpark's Python executables. For instance, it makes it much easier to
control which Python options are passed when launching the PySpark
Hi Theo,
Check out *spark-perf*, a suite of performance benchmarks for Spark:
https://github.com/databricks/spark-perf.
- Josh
On Fri, Oct 10, 2014 at 7:27 PM, Theodore Si sjyz...@gmail.com wrote:
Hi,
Let's say that I managed to port Spark from TCP/IP to RDMA.
What tool or benchmark can I