That is quite mysterious, and I do not think we have enough information to
answer yet. For what it's worth, JavaPairRDD<String, Tuple2>.lookup() works
fine for me on a remote Spark cluster:

$ MASTER=spark://localhost:7077 bin/spark-shell
scala> val rdd = org.apache.spark.api.java.JavaPairRDD.fromRDD(
     |   sc.makeRDD(0 until 10, 3).map(x => ((x % 3).toString, (x, x % 3))))
scala> rdd.lookup("1")
res0: java.util.List[(Int, Int)] = [(1,1), (4,1), (7,1)]

You suggest that maybe the driver does not receive a message from an
executor. That is certainly possible, though it has never happened to me. I
would recommend trying to reproduce this on a single machine in standalone
mode: start the master and worker on the same machine and run the
application there too. That should rule out network configuration problems.
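Concretely, something like the following, assuming a Spark 0.9.x layout (the exact script locations may differ in your build, so treat this as a sketch, not gospel):

```shell
# Start a standalone master on this machine (listens on spark://localhost:7077
# by default when SPARK_MASTER_IP is unset or set to localhost).
sbin/start-master.sh

# Start one worker on the same machine, pointed at that master.
bin/spark-class org.apache.spark.deploy.worker.Worker spark://localhost:7077

# Run the application (or the shell) against the local standalone master.
MASTER=spark://localhost:7077 bin/spark-shell
```

With master, worker, and driver all on one host, any hang that remains cannot be a firewall or hostname-resolution issue.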

If you still see the issue, I'd check whether the task has really
completed. What do you see on the web UI? Is the executor still using CPU?
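Beyond the web UI, a thread dump of the driver JVM can show where it is blocked while waiting on the lookup. A rough sketch using the standard JDK tools (the PID below is hypothetical; find the real one with `jps`):

```shell
# List running JVM processes with their main classes; pick out the driver.
jps -l

# Dump the driver's thread stacks (replace 12345 with the actual PID).
# Look for a thread blocked under the job-submission / result-fetching path.
jstack 12345 > driver-threads.txt
```

If the dump shows the driver parked waiting for a task result that the executor log claims was already sent, that narrows the problem to result delivery rather than task execution.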

Good luck.




On Mon, Apr 28, 2014 at 2:35 AM, Yadid Ayzenberg <ya...@media.mit.edu> wrote:

> Can someone please suggest how I can move forward with this?
> My spark version is 0.9.1.
> The big challenge is that this issue is not recreated when running in
> local mode. What could be the difference?
>
> I would really appreciate any pointers, as currently the job just
> hangs.
>
>
>
>
> On 4/25/14, 7:37 PM, Yadid Ayzenberg wrote:
>
>> Some additional information - maybe this rings a bell with someone:
>>
>> I suspect this happens when the lookup returns more than one value.
>> For 0 and 1 values, the function behaves as you would expect.
>>
>> Anyone ?
>>
>>
>>
>> On 4/25/14, 1:55 PM, Yadid Ayzenberg wrote:
>>
>>> Hi All,
>>>
>>> I'm running a lookup on a JavaPairRDD<String, Tuple2>.
>>> When running on the local machine, the lookup is successful. However, when
>>> running on a standalone cluster with the exact same dataset, one of the
>>> tasks never ends (it stays in RUNNING status).
>>> When viewing the worker log, it seems that the task has finished
>>> successfully:
>>>
>>> 14/04/25 13:40:38 INFO BlockManager: Found block rdd_2_0 locally
>>> 14/04/25 13:40:38 INFO Executor: Serialized size of result for 2 is
>>> 10896794
>>> 14/04/25 13:40:38 INFO Executor: Sending result for 2 directly to driver
>>> 14/04/25 13:40:38 INFO Executor: Finished task ID 2
>>>
>>> But it seems the driver is not aware of this, and hangs indefinitely.
>>>
>>> If I execute a count prior to the lookup, I get the correct number,
>>> which suggests that the cluster is operating as expected.
>>>
>>> The exact same scenario works with a different type of key (Tuple2):
>>> JavaPairRDD<Tuple2, Tuple2>.
>>>
>>> Any ideas on how to debug this problem?
>>>
>>> Thanks,
>>>
>>> Yadid
>>>
>>>
>>
>
