Try this:

Is this Java right? I'm not used to it; I'm used to Scala. The idea is to
assign the result of the lazy partition operation to a variable and then
force it with an action:

// note: foreachPartition is an action and returns void, so its result
// can't be assigned; mapPartitions is the lazy equivalent
JavaRDD<String> toDebug = rdd.mapPartitions(partition -> { // breakpoint stops here
    List<String> seen = new ArrayList<>();
    partition.forEachRemaining(message -> {
        seen.add(message); // breakpoint doesn't stop here until an action runs
    });
    return seen.iterator();
});

toDebug.first(); // now is when this code will actually run
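For what it's worth, here is a minimal, self-contained sketch of the same
idea (this assumes the Spark 2.x Java API, where the mapPartitions function
returns an Iterator; on 1.x it returns an Iterable instead; the class and
app names are just placeholders):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LazyDebugSketch {
    public static void main(String[] args) {
        // local[*] keeps the driver and the tasks in one JVM, so breakpoints
        // in the lambdas below are reachable from the attached debugger
        SparkConf conf = new SparkConf().setAppName("lazy-debug").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<String> rdd = sc.parallelize(Arrays.asList("a", "b", "c"));

            // transformation: nothing inside the lambda executes yet
            JavaRDD<String> toDebug = rdd.mapPartitions(partition -> {
                List<String> seen = new ArrayList<>();
                partition.forEachRemaining(seen::add); // breakpoint here fires only once an action runs
                return seen.iterator();
            });

            toDebug.first(); // action: this is the moment the lambda above runs
        }
    }
}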

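One more caveat, given your note that code running on the workers is
unreachable: those lambdas execute in the executor JVMs, while
SPARK_SUBMIT_OPTS only attaches the debugger to the driver/launcher JVM.
Two sketches of ways around that (assumptions about your setup, not
something I have tested against it): run in local mode so everything shares
one debuggable JVM, or pass a JDWP agent to the executors via
spark.executor.extraJavaOptions and attach to that port (note suspend=n
here, and this assumes a single executor so the port isn't contended):

bin/spark-submit --master "local[*]" app-jar-with-dependencies.jar <arguments>

bin/spark-submit \
  --conf "spark.executor.extraJavaOptions=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005" \
  app-jar-with-dependencies.jar <arguments>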

2016-05-31 17:59 GMT-03:00 Marcelo Oikawa <marcelo.oik...@webradar.com>:

>
>
>> Hi Marcelo, this is because the operations on an RDD are lazy; you will
>> only stop at the breakpoint inside foreach when you call a first, a
>> collect, or a reduce operation.
>>
>
> Isn't forEachRemaining a terminal method like first, collect, or reduce?
> Anyway, I guess this is not the problem itself, because the code inside
> forEachRemaining runs fine; I just can't debug that block.
>
>
>> This is when Spark will run the operations.
>> Have you tried that?
>>
>> Cheers.
>>
>> 2016-05-31 17:18 GMT-03:00 Marcelo Oikawa <marcelo.oik...@webradar.com>:
>>
>>> Hello, list.
>>>
>>> I'm trying to debug my Spark application in the IntelliJ IDE. Before
>>> submitting my job, I ran this command line:
>>>
>>> export SPARK_SUBMIT_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=4000
>>>
>>> after that:
>>>
>>> bin/spark-submit app-jar-with-dependencies.jar <arguments>
>>>
>>> The IDE connects to the running job, but all the code that runs on the
>>> worker machines is unreachable to the debugger. See below:
>>>
>>> rdd.foreachPartition(partition -> { // breakpoint stops here
>>>
>>>     partition.forEachRemaining(message -> {
>>>
>>>         // breakpoint doesn't stop here
>>>
>>>     });
>>> });
>>>
>>> Does anyone know if this is possible? How? Any ideas?
