Dear Sparkers,
Has anyone got any insight on this? I am really stuck.
Yadid
On 4/28/14, 11:28 AM, Yadid Ayzenberg wrote:
Thanks for your answer.
I tried running on a single machine - master and worker on one host. I
get exactly the same results.
Very little CPU activity on the machine in question. [...]
That is quite mysterious, and I do not think we have enough information to
answer. JavaPairRDD<String, Tuple2>.lookup() works fine on a remote Spark
cluster:
$ MASTER=spark://localhost:7077 bin/spark-shell
scala> val rdd = org.apache.spark.api.java.JavaPairRDD.fromRDD(sc.makeRDD(0
until 10, 3).map(x => [...]
Thanks for your answer.
I tried running on a single machine - master and worker on one host. I
get exactly the same results.
Very little CPU activity on the machine in question. The web UI shows a
single task and its state is RUNNING. It will remain so indefinitely.
I have a single partition. Could this be related to the size of the
lookup result?
I tried to recreate a similar scenario in the spark-shell, which causes
an exception:
scala> val rdd =
org.apache.spark.api.java.JavaPairRDD.fromRDD(sc.makeRDD(0 until 4,
3).map(x => ( ( 0, "52fb9b1a3004f07d1a87c8f3" ), [...]
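For anyone poking at this locally, the composite-key shape in the snippet above can be exercised without a cluster. A minimal plain-Java sketch (the Key record and sample values are illustrative, not Spark API), assuming lookup matches keys by equals():

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of looking up by a composite (Tuple2-style) key, as in the
// snippet above. No Spark involved: the Key record stands in for
// Tuple2<Integer, String>, and lookup() matches keys by equals(),
// which records implement automatically.
public class CompositeKeyLookup {
    record Key(int id, String oid) {}

    public static <K, V> List<V> lookup(List<Map.Entry<K, V>> pairs, K key) {
        List<V> out = new ArrayList<>();
        for (Map.Entry<K, V> e : pairs) {
            if (e.getKey().equals(key)) {
                out.add(e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<Key, Integer>> pairs = List.of(
            new SimpleEntry<>(new Key(0, "52fb9b1a3004f07d1a87c8f3"), 42),
            new SimpleEntry<>(new Key(1, "52fb9b1a3004f07d1a87c8f4"), 7));
        // A freshly constructed but equal key still matches:
        System.out.println(lookup(pairs, new Key(0, "52fb9b1a3004f07d1a87c8f3"))); // [42]
    }
}
```

If the real key type does not implement equals/hashCode (e.g. a raw driver object rather than a Tuple2), an empty result rather than a match would be expected.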
Can someone please suggest how I can move forward with this?
My Spark version is 0.9.1.
The big challenge is that this issue is not recreated when running in
local mode. What could be the difference?
I would really appreciate any pointers, as currently the job just hangs.
On 4/25/14, 1:55 PM, Yadid Ayzenberg wrote:
Hi All,
I'm running a lookup on a JavaPairRDD<String, Tuple2>.
When running on a local machine, the lookup is successful. However, when
running on a standalone cluster with the exact same dataset, one of the
tasks never ends (constantly in RUNNING status).
When viewing the worker log, it seems that [...]
Some additional information - maybe this rings a bell with someone:
I suspect this happens when the lookup returns more than one value.
For 0 and 1 values, the function behaves as you would expect.
Anyone?
On 4/25/14, 1:55 PM, Yadid Ayzenberg wrote:
Hi All,
I'm running a lookup on a JavaPairRDD<String, Tuple2>. [...]
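To make the 0/1/many distinction above concrete: lookup should return every value stored under the key, so more than one match means a multi-element result. A plain-Java sketch of the expected behavior (no Spark involved; lookupAll is my stand-in name, not Spark API):

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Plain-Java sketch of the three lookup cases (0 matches, 1 match, many
// matches). lookupAll stands in for JavaPairRDD.lookup, which returns
// every value stored under the given key.
public class MultiValueLookup {
    public static <K, V> List<V> lookupAll(List<Map.Entry<K, V>> pairs, K key) {
        List<V> out = new ArrayList<>();
        for (Map.Entry<K, V> e : pairs) {
            if (e.getKey().equals(key)) {
                out.add(e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = List.of(
            new SimpleEntry<>("a", 1),
            new SimpleEntry<>("b", 2),
            new SimpleEntry<>("a", 3));
        System.out.println(lookupAll(pairs, "c")); // 0 matches: []
        System.out.println(lookupAll(pairs, "b")); // 1 match:   [2]
        System.out.println(lookupAll(pairs, "a")); // 2 matches: [1, 3] -- the case reported to hang
    }
}
```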