Try this — is this right? I'm not used to this language; I'm used to Scala. By "val toDebug" I mean: assign the result of foreachPartition to a variable:

    val toDebug = rdd.foreachPartition(partition -> { // breakpoint stops here
        partition.forEachRemaining(message -> {
            // breakpoint doesn't stop here
        })
    });

    toDebug.first // now is when this method will run

2016-05-31 17:59 GMT-03:00 Marcelo Oikawa <marcelo.oik...@webradar.com>:

>> Hi Marcelo, this is because the operations on an RDD are lazy: you will
>> only stop at the breakpoint inside the foreach when you call a first, a
>> collect or a reduce operation.
>
> Isn't forEachRemaining a terminal method like first, collect or reduce?
> Anyway, I guess this is not the problem itself, because the code inside
> forEachRemaining runs fine — I just can't debug that block.
>
>> That is when Spark will run the operations.
>> Have you tried that?
>>
>> Cheers.
>>
>> 2016-05-31 17:18 GMT-03:00 Marcelo Oikawa <marcelo.oik...@webradar.com>:
>>
>>> Hello, list.
>>>
>>> I'm trying to debug my Spark application in the IntelliJ IDE. Before I
>>> submit my job, I run:
>>>
>>> export SPARK_SUBMIT_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=4000
>>>
>>> and after that:
>>>
>>> bin/spark-submit app-jar-with-dependencies.jar <arguments>
>>>
>>> The IDE connects to the running job, but all code that runs on the
>>> worker machines is unreachable to the debugger. See below:
>>>
>>> rdd.foreachPartition(partition -> { // breakpoint stops here
>>>     partition.forEachRemaining(message -> {
>>>         // breakpoint doesn't stop here
>>>     })
>>> });
>>>
>>> Does anyone know if this is possible? How? Any ideas?
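One thing worth noting (not stated in the thread, so treat it as an assumption): SPARK_SUBMIT_OPTS attaches the JDWP agent to the driver JVM only, while the body of the forEachRemaining lambda runs on executor JVMs — which would explain why that breakpoint never fires. A sketch of attaching the debugger to the executors instead, using the standard spark.executor.extraJavaOptions property; the port number and single-executor setup are arbitrary choices for illustration:

```shell
# Sketch, not from the thread: attach the JDWP agent to executor JVMs rather
# than the driver. suspend=n so executors do not block waiting for a debugger.
# Limiting to one executor keeps a single JVM listening on the chosen port.
bin/spark-submit \
  --conf "spark.executor.extraJavaOptions=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005" \
  --conf "spark.executor.instances=1" \
  app-jar-with-dependencies.jar <arguments>
```

You would then create a second "Remote" run configuration in IntelliJ pointed at the worker host and port 5005, in addition to (or instead of) the driver one.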
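On the question raised mid-thread: forEachRemaining is not a Spark operation at all — it is a plain java.util.Iterator method (Java 8+) that eagerly consumes whatever remains of the iterator, on whichever JVM holds it. A minimal, Spark-free sketch of its behavior (the class and method names are invented for illustration):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ForEachRemainingDemo {
    // Consume one element manually, then drain the rest with forEachRemaining.
    static List<String> drainAfterFirst(List<String> input) {
        Iterator<String> it = input.iterator();
        List<String> seen = new ArrayList<>();
        if (it.hasNext()) {
            it.next(); // skip the first element
        }
        it.forEachRemaining(seen::add); // eagerly processes all remaining elements
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(drainAfterFirst(List.of("a", "b", "c"))); // prints [b, c]
    }
}
```

Inside foreachPartition, the partition iterator lives on an executor, so this loop body runs there too — which is why a debugger attached only to the driver never reaches it.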