val jsonString = getUrlAsString("https://somehost.com/test.json")
val jsonDataRDD = ?
val json1 = sqlContext.jsonRDD(jsonDataRDD)
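One way to fill in the `?` (a sketch, not a confirmed answer): Spark 1.x's `sqlContext.jsonRDD` expects an `RDD[String]` with one JSON record per element, so the fetched document can be wrapped in a one-element collection and parallelized. The `getUrlAsString` implementation below is a hypothetical stand-in for the helper named in the snippet, and `toRecords` is a helper introduced here for illustration.

```scala
object JsonFetch {
  // Hypothetical implementation of the getUrlAsString helper from the
  // snippet, using only the standard library (assumption: the payload is
  // small enough that a blocking read is fine).
  def getUrlAsString(url: String): String =
    scala.io.Source.fromURL(url).mkString

  // sqlContext.jsonRDD expects one JSON record per RDD element, so a single
  // fetched document becomes a one-element collection before parallelizing.
  def toRecords(json: String): Seq[String] = Seq(json)
}
```

Usage would then be along the lines of `val jsonDataRDD = sc.parallelize(JsonFetch.toRecords(JsonFetch.getUrlAsString(url)))` followed by `sqlContext.jsonRDD(jsonDataRDD)`.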
Thanks,
RP
In fact I think it's next to impossible, but I just want some confirmation
from you; please leave your opinion, thanks :)
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/The-running-time-of-spark-tp12624p12691.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
The algorithm uses Pregel of GraphX.
It ran for more than one day and had only reached the third stage, so I
cancelled it because that cost is unacceptable.
The expected time is about ten minutes (not expected by me ...), but I think
a couple of hours would be acceptable.
The bottleneck seems to be I/O.
Thanks for the suggestion. The program actually failed with
java.lang.OutOfMemoryError: Java heap space; I tried some modifications and it
got further, but the exception might occur again anyway.
How long did your test take? I can use it as a reference.
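Since the failure above was a Java heap space error, one knob worth checking is the executor heap size. `spark.executor.memory` is the standard Spark property for this; the app name and the `2g` value below are placeholders to tune against the 3.5 GB VMs mentioned elsewhere in the thread.

```scala
// Sketch: raising the executor heap via SparkConf (the property name is the
// real Spark setting; the size is a placeholder, and it must fit within the
// worker's available RAM).
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("ShortestPath")
  .set("spark.executor.memory", "2g")
```

The same setting can also be passed on the command line as `spark-submit --executor-memory 2g`.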
Hi,
I'm using Spark on a cluster of 8 VMs, each with two cores and 3.5 GB of RAM,
but I need to run a shortest-path algorithm on 500+ GB of data (a text file
where each line contains a node id and the ids of the nodes it points to).
I've tested it on the cluster, but the speed seems to be extremely slow, and I
haven't got any results yet.
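For the input format described above (one line per node: a node id followed by the ids it points to), each line has to be expanded into edge pairs before GraphX can build the graph. A minimal sketch, assuming whitespace-separated numeric ids with the source id first; the object and function names are introduced here for illustration:

```scala
object AdjacencyParser {
  // Turn one adjacency-list line, e.g. "1 2 3", into (src, dst) edge pairs:
  // (1,2) and (1,3). A line with no successors yields no edges.
  def parseLine(line: String): Seq[(Long, Long)] = {
    val ids = line.trim.split("\\s+").map(_.toLong).toSeq
    ids.tail.map(dst => (ids.head, dst))
  }
}
```

With Spark, `sc.textFile(path).flatMap(AdjacencyParser.parseLine)` then gives the edge tuples that `Graph.fromEdgeTuples` accepts.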
Hi, I'm running a Spark standalone cluster to calculate single-source
shortest paths.
Here is the code. The vertex type is VertexRDD[(String, Long)]: String for the
path and Long for the distance. The code before these lines reads the graph
data from file and builds the graph.
val sssp =
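The `val sssp =` line is truncated here; for reference, the usual GraphX Pregel pattern for single-source shortest path looks like the sketch below. It assumes a `Graph[Double, Double]` with numeric edge weights and a hypothetical `sourceId`; the thread's variant with `VertexRDD[(String, Long)]` would additionally thread the path String through the vertex attribute.

```scala
// Sketch of the standard GraphX Pregel SSSP (assumption: `graph` and
// `sourceId` come from the earlier graph-building code).
val initialGraph = graph.mapVertices((id, _) =>
  if (id == sourceId) 0.0 else Double.PositiveInfinity)

val sssp = initialGraph.pregel(Double.PositiveInfinity)(
  (id, dist, newDist) => math.min(dist, newDist),      // vertex program
  triplet =>                                           // send message
    if (triplet.srcAttr + triplet.attr < triplet.dstAttr)
      Iterator((triplet.dstId, triplet.srcAttr + triplet.attr))
    else
      Iterator.empty,
  (a, b) => math.min(a, b)                             // merge messages
)
```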
I build it with sbt package and run it with sbt run, and I do use
SparkConf.set for deployment options and external jars. It seems that
spark-submit can't load the extra jars and fails with a NoClassDefFoundError;
should I pack all the jars into one giant jar and give it a try?
I run it on a cluster of 8.
Can anyone help? I'm using Spark 1.0.1.
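On the extra-jars question: rather than SparkConf.set, dependencies are normally passed to spark-submit with the --jars flag (comma-separated), which ships them to the executors and puts them on the classpath. A sketch; the class name, master URL, and jar paths are placeholders:

```shell
# --class, --master, and --jars are standard spark-submit options;
# everything else here is a placeholder path.
spark-submit \
  --class com.example.ShortestPath \
  --master spark://master:7077 \
  --jars /path/to/dep1.jar,/path/to/dep2.jar \
  target/scala-2.10/myapp_2.10-0.1.jar
```

Building a single assembly jar (e.g. with sbt-assembly), as the question suggests, is the other common workaround.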
What confuses me is: if the block is found, why are no non-empty blocks
retrieved, and why does the process keep going forever?
Thanks!
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-got-stuck-with-a-loop-tp10590p10663.html
Hi,
I ran Spark in standalone mode on a cluster and it went well for approximately
one hour; then the driver's output stopped with the following:
14/07/24 08:07:36 INFO MapOutputTrackerMasterActor: Asked to send map output
locations for shuffle 36 to spark@worker5.local:47416
14/07/24 08:07:36 INFO