Re: Debug spark jobs on Intellij

2016-05-31 Thread Marcelo Oikawa
> Is this Python, right? I'm not used to it; I'm used to Scala.
>

No. It is Java.


> val toDebug = rdd.foreachPartition(partition -> { // breakpoint stops here
> *// by val toDebug I mean to assign the result of foreachPartition to a
> variable*
> partition.forEachRemaining(message -> {
> // breakpoint doesn't stop here
>
>  })
> });
>
> *toDebug.first* // this is when the method will run
>

foreachPartition is a void method, so there is no result to assign and call first on.
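
One likely reason the inner breakpoint never fires (not stated in this thread, so treat it as an assumption): the partition lambda executes inside the executor JVMs, while the JDWP agent configured via SPARK_SUBMIT_OPTS attaches only to the driver. A sketch of attaching the agent to the executors instead, via the standard `spark.executor.extraJavaOptions` setting (port and jar name are illustrative):

```shell
# Sketch: attach the JDWP agent to executor JVMs so breakpoints inside
# foreachPartition/forEachRemaining can be hit. suspend=n avoids blocking
# executor startup; the port (5006) is an arbitrary choice.
EXECUTOR_DEBUG="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5006"
# bin/spark-submit \
#   --conf "spark.executor.extraJavaOptions=$EXECUTOR_DEBUG" \
#   app-jar-with-dependencies.jar
```

Note this only works conveniently when one executor runs per host (or with `address=0` to pick a free port), since each executor JVM needs its own listen port.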


>
>
> 2016-05-31 17:59 GMT-03:00 Marcelo Oikawa :
>
>>
>>
>>> Hi Marcelo, this is because operations on an RDD are lazy; you will
>>> only stop at the breakpoint inside the foreach when you call a first, a
>>> collect, or a reduce operation.
>>>
>>
>> Isn't forEachRemaining a terminal method like first, collect, or reduce?
>> Anyway, I guess this is not the problem itself, because the code inside
>> forEachRemaining runs fine but I can't debug this block.
>>
>>
>>> That is when Spark will run the operations.
>>> Have you tried that?
>>>
>>> Cheers.
>>>
>>> 2016-05-31 17:18 GMT-03:00 Marcelo Oikawa :
>>>
 Hello, list.

 I'm trying to debug my spark application on Intellij IDE. Before I
 submit my job, I ran the command line:

 export
 SPARK_SUBMIT_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=4000

 after that:

 bin/spark-submit app-jar-with-dependencies.jar 

 The IDE connects with the running job, but all code that runs on the
 worker machines is unreachable to the debugger. See below:

 rdd.foreachPartition(partition -> { // breakpoint stops here

 partition.forEachRemaining(message -> {

 // breakpoint doesn't stop here

  })
 });

 Does anyone know if this is possible? How? Any ideas?



>>>
>>
>


Re: Debug spark jobs on Intellij

2016-05-31 Thread Dirceu Semighini Filho
Try this:
Is this Python, right? I'm not used to it; I'm used to Scala.

val toDebug = rdd.foreachPartition(partition -> { // breakpoint stops here
*// by val toDebug I mean to assign the result of foreachPartition to a
variable*
partition.forEachRemaining(message -> {
// breakpoint doesn't stop here

 })
});

*toDebug.first* // this is when the method will run


2016-05-31 17:59 GMT-03:00 Marcelo Oikawa :

>
>
>> Hi Marcelo, this is because operations on an RDD are lazy; you will only
>> stop at the breakpoint inside the foreach when you call a first, a collect,
>> or a reduce operation.
>>
>
> Isn't forEachRemaining a terminal method like first, collect, or reduce?
> Anyway, I guess this is not the problem itself, because the code inside
> forEachRemaining runs fine but I can't debug this block.
>
>
>> That is when Spark will run the operations.
>> Have you tried that?
>>
>> Cheers.
>>
>> 2016-05-31 17:18 GMT-03:00 Marcelo Oikawa :
>>
>>> Hello, list.
>>>
>>> I'm trying to debug my spark application on Intellij IDE. Before I
>>> submit my job, I ran the command line:
>>>
>>> export
>>> SPARK_SUBMIT_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=4000
>>>
>>> after that:
>>>
>>> bin/spark-submit app-jar-with-dependencies.jar 
>>>
>>> The IDE connects with the running job, but all code that runs on the
>>> worker machines is unreachable to the debugger. See below:
>>>
>>> rdd.foreachPartition(partition -> { // breakpoint stops here
>>>
>>> partition.forEachRemaining(message -> {
>>>
>>> // breakpoint doesn't stop here
>>>
>>>  })
>>> });
>>>
>>> Does anyone know if this is possible? How? Any ideas?
>>>
>>>
>>>
>>
>


Re: Debug spark jobs on Intellij

2016-05-31 Thread Marcelo Oikawa
> Hi Marcelo, this is because operations on an RDD are lazy; you will only
> stop at the breakpoint inside the foreach when you call a first, a collect,
> or a reduce operation.
>

Isn't forEachRemaining a terminal method like first, collect, or reduce?
Anyway, I guess this is not the problem itself, because the code inside
forEachRemaining runs fine but I can't debug this block.


> That is when Spark will run the operations.
> Have you tried that?
>
> Cheers.
>
> 2016-05-31 17:18 GMT-03:00 Marcelo Oikawa :
>
>> Hello, list.
>>
>> I'm trying to debug my spark application on Intellij IDE. Before I submit
>> my job, I ran the command line:
>>
>> export
>> SPARK_SUBMIT_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=4000
>>
>> after that:
>>
>> bin/spark-submit app-jar-with-dependencies.jar 
>>
>> The IDE connects with the running job, but all code that runs on the
>> worker machines is unreachable to the debugger. See below:
>>
>> rdd.foreachPartition(partition -> { // breakpoint stops here
>>
>> partition.forEachRemaining(message -> {
>>
>> // breakpoint doesn't stop here
>>
>>  })
>> });
>>
>> Does anyone know if this is possible? How? Any ideas?
>>
>>
>>
>


Re: Debug spark jobs on Intellij

2016-05-31 Thread Dirceu Semighini Filho
Hi Marcelo, this is because operations on an RDD are lazy; you will only
stop at the breakpoint inside the foreach when you call a first, a collect,
or a reduce operation.
That is when Spark will run the operations.
Have you tried that?

Cheers.

2016-05-31 17:18 GMT-03:00 Marcelo Oikawa :

> Hello, list.
>
> I'm trying to debug my spark application on Intellij IDE. Before I submit
> my job, I ran the command line:
>
> export
> SPARK_SUBMIT_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=4000
>
> after that:
>
> bin/spark-submit app-jar-with-dependencies.jar 
>
> The IDE connects with the running job, but all code that runs on the
> worker machines is unreachable to the debugger. See below:
>
> rdd.foreachPartition(partition -> { // breakpoint stops here
>
> partition.forEachRemaining(message -> {
>
> // breakpoint doesn't stop here
>
>  })
> });
>
> Does anyone know if this is possible? How? Any ideas?
>
>
>
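
The laziness described above holds for RDD transformations (map, filter, and friends); note, though, that foreachPartition is itself an action and runs eagerly, which is why it returns void. As a cluster-free analogy (plain Java streams, not the Spark API, so treat the parallel as illustrative only), intermediate operations are likewise deferred until a terminal operation runs:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

public class LazyDemo {
    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();

        // Building the pipeline runs nothing, just as chaining RDD
        // transformations only builds a lineage without starting a job.
        Stream<Integer> doubled = List.of(1, 2, 3).stream()
                .map(x -> { calls.incrementAndGet(); return x * 2; });
        int before = calls.get();   // still 0: map has not run yet

        // The terminal operation plays the role of a Spark action:
        // only now does the whole pipeline execute.
        int sum = doubled.reduce(0, Integer::sum);

        System.out.println("map calls before reduce: " + before);
        System.out.println("map calls after reduce: " + calls.get() + ", sum=" + sum);
    }
}
```

A breakpoint set inside the map lambda here is only hit once reduce is called, which is the behavior Dirceu describes for first/collect/reduce.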


Re: Debug spark core and streaming programs in scala

2016-05-16 Thread Ted Yu
From https://spark.apache.org/docs/latest/monitoring.html#metrics :

   - JmxSink: Registers metrics for viewing in a JMX console.

FYI
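
Ted's pointer can be wired up through Spark's metrics configuration; a minimal sketch, assuming the stock conf/metrics.properties mechanism described on that page:

```
# conf/metrics.properties -- enable JmxSink for all instances (master,
# worker, driver, executor) so their metrics appear in a JMX console
# such as jconsole.
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
```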

On Sun, May 15, 2016 at 11:54 PM, Mich Talebzadeh  wrote:

> Have you tried the Spark GUI on port 4040? It will show the jobs being
> executed by the executors in each stage, and the line of code as well.
>
> [image: Inline images 1]
>
> Also command line tools like jps and jmonitor
>
> HTH
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 16 May 2016 at 06:25, Deepak Sharma  wrote:
>
>> Hi
>> I have a Scala program consisting of Spark Core and Spark Streaming APIs.
>> Is there any open source tool that I can use to debug the program for
>> performance reasons?
>> My primary interest is to find the blocks of code that would be executed
>> on the driver and what would go to the executors.
>> Is there a JMX extension for Spark?
>>
>> --
>> Thanks
>> Deepak
>>
>>
>


Re: Debug spark core and streaming programs in scala

2016-05-16 Thread Mich Talebzadeh
Have you tried the Spark GUI on port 4040? It will show the jobs being
executed by the executors in each stage, and the line of code as well.

[image: Inline images 1]

Also command line tools like jps and jmonitor

HTH


Dr Mich Talebzadeh



LinkedIn:
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 16 May 2016 at 06:25, Deepak Sharma  wrote:

> Hi
> I have a Scala program consisting of Spark Core and Spark Streaming APIs.
> Is there any open source tool that I can use to debug the program for
> performance reasons?
> My primary interest is to find the blocks of code that would be executed on
> the driver and what would go to the executors.
> Is there a JMX extension for Spark?
>
> --
> Thanks
> Deepak
>
>


Re: Debug Spark

2015-12-02 Thread Masf
This is very interesting.

Thanks!!!

On Thu, Dec 3, 2015 at 8:28 AM, Sudhanshu Janghel <
sudhanshu.jang...@cloudwick.com> wrote:

> Hi,
>
> Here is a doc that I had created for my team. It has steps, along with
> snapshots, for how to set up debugging in Spark using IntelliJ locally.
>
>
> https://docs.google.com/a/cloudwick.com/document/d/13kYPbmK61di0f_XxxJ-wLP5TSZRGMHE6bcTBjzXD0nA/edit?usp=sharing
>
> Kind Regards,
> Sudhanshu
>
> On Thu, Dec 3, 2015 at 6:46 AM, Akhil Das 
> wrote:
>
>> This doc will get you started
>> https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools#UsefulDeveloperTools-IntelliJ
>>
>> Thanks
>> Best Regards
>>
>> On Sun, Nov 29, 2015 at 9:48 PM, Masf  wrote:
>>
>>> Hi
>>>
>>> Is it possible to debug spark locally with IntelliJ or another IDE?
>>>
>>> Thanks
>>>
>>> --
>>> Regards.
>>> Miguel Ángel
>>>
>>
>>
>


-- 


Saludos.
Miguel Ángel


Re: Debug Spark

2015-12-02 Thread Sudhanshu Janghel
Hi,

Here is a doc that I had created for my team. It has steps, along with
snapshots, for how to set up debugging in Spark using IntelliJ locally.

https://docs.google.com/a/cloudwick.com/document/d/13kYPbmK61di0f_XxxJ-wLP5TSZRGMHE6bcTBjzXD0nA/edit?usp=sharing

Kind Regards,
Sudhanshu

On Thu, Dec 3, 2015 at 6:46 AM, Akhil Das 
wrote:

> This doc will get you started
> https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools#UsefulDeveloperTools-IntelliJ
>
> Thanks
> Best Regards
>
> On Sun, Nov 29, 2015 at 9:48 PM, Masf  wrote:
>
>> Hi
>>
>> Is it possible to debug spark locally with IntelliJ or another IDE?
>>
>> Thanks
>>
>> --
>> Regards.
>> Miguel Ángel
>>
>
>


Re: Debug Spark

2015-12-02 Thread Akhil Das
This doc will get you started
https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools#UsefulDeveloperTools-IntelliJ

Thanks
Best Regards

On Sun, Nov 29, 2015 at 9:48 PM, Masf  wrote:

> Hi
>
> Is it possible to debug spark locally with IntelliJ or another IDE?
>
> Thanks
>
> --
> Regards.
> Miguel Ángel
>


Re: Debug Spark

2015-11-30 Thread Jacek Laskowski
Hi,

Yes, that's possible -- I'm doing it every day in local and standalone modes.

Just use SPARK_PRINT_LAUNCH_COMMAND=1 before any Spark command, e.g.
spark-submit or spark-shell, to see the command used to start it:

$ SPARK_PRINT_LAUNCH_COMMAND=1 ./bin/spark-shell

The SPARK_PRINT_LAUNCH_COMMAND environment variable controls whether the
Spark launch command is printed to the standard error output (System.err).

Once you've got the command, add the following command-line option to
enable the JDWP agent and have the JVM suspended (suspend=y) until a remote
debugging client connects (on port 5005):

-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005

In IntelliJ IDEA, define a new debug configuration for Remote and
press Debug. You're done.

https://www.jetbrains.com/idea/help/debugging-2.html might help.
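
The recipe above can be condensed into a short sketch (same port 5005 as above; the launch line is shown as a comment since it needs a Spark installation):

```shell
# Expose the JDWP agent to any JVM started by spark-submit/spark-shell.
# suspend=y makes the JVM wait until the IDE's Remote debug session attaches.
export SPARK_SUBMIT_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005"
# Then launch as usual and attach from IntelliJ's Remote configuration on 5005:
#   ./bin/spark-shell
```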

Regards,
Jacek

--
Jacek Laskowski | https://medium.com/@jaceklaskowski/ |
http://blog.jaceklaskowski.pl
Mastering Spark https://jaceklaskowski.gitbooks.io/mastering-apache-spark/
Follow me at https://twitter.com/jaceklaskowski
Upvote at http://stackoverflow.com/users/1305344/jacek-laskowski


On Sun, Nov 29, 2015 at 5:18 PM, Masf  wrote:
> Hi
>
> Is it possible to debug spark locally with IntelliJ or another IDE?
>
> Thanks
>
> --
> Regards.
> Miguel Ángel

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Debug Spark

2015-11-29 Thread Ndjido Ardo BAR
Spark Job Server allows you to submit your apps to any kind of deployment
(standalone, cluster). I think it could suit your use case.
Check the following Github repo:
https://github.com/spark-jobserver/spark-jobserver

Ardo

On Sun, Nov 29, 2015 at 6:42 PM, Նարեկ Գալստեան 
wrote:

> A question regarding the topic,
>
> I am using IntelliJ to write Spark applications and then have to ship the
> source code to my cluster in the cloud to compile and test.
>
> Is there a way to automate the process using IntelliJ?
>
> Narek Galstyan
>
> Նարեկ Գալստյան
>
> On 29 November 2015 at 20:51, Ndjido Ardo BAR  wrote:
>
>> Masf, the following link sets the basics to start debugging your spark
>> apps in local mode:
>>
>>
>> https://medium.com/large-scale-data-processing/how-to-kick-start-spark-development-on-intellij-idea-in-4-steps-c7c8f5c2fe63#.675s86940
>>
>> Ardo
>>
>> On Sun, Nov 29, 2015 at 5:34 PM, Masf  wrote:
>>
>>> Hi Ardo
>>>
>>>
>>> Some tutorial to debug with Intellij?
>>>
>>> Thanks
>>>
>>> Regards.
>>> Miguel.
>>>
>>>
>>> On Sun, Nov 29, 2015 at 5:32 PM, Ndjido Ardo BAR 
>>> wrote:
>>>
 hi,

 IntelliJ is just great for that!

 cheers,
 Ardo.

 On Sun, Nov 29, 2015 at 5:18 PM, Masf  wrote:

> Hi
>
> Is it possible to debug spark locally with IntelliJ or another IDE?
>
> Thanks
>
> --
> Regards.
> Miguel Ángel
>


>>>
>>>
>>> --
>>>
>>>
>>> Saludos.
>>> Miguel Ángel
>>>
>>
>>
>


Re: Debug Spark

2015-11-29 Thread Նարեկ Գալստեան
A question regarding the topic,

I am using IntelliJ to write Spark applications and then have to ship the
source code to my cluster in the cloud to compile and test.

Is there a way to automate the process using IntelliJ?

Narek Galstyan

Նարեկ Գալստյան

On 29 November 2015 at 20:51, Ndjido Ardo BAR  wrote:

> Masf, the following link sets the basics to start debugging your spark
> apps in local mode:
>
>
> https://medium.com/large-scale-data-processing/how-to-kick-start-spark-development-on-intellij-idea-in-4-steps-c7c8f5c2fe63#.675s86940
>
> Ardo
>
> On Sun, Nov 29, 2015 at 5:34 PM, Masf  wrote:
>
>> Hi Ardo
>>
>>
>> Some tutorial to debug with Intellij?
>>
>> Thanks
>>
>> Regards.
>> Miguel.
>>
>>
>> On Sun, Nov 29, 2015 at 5:32 PM, Ndjido Ardo BAR 
>> wrote:
>>
>>> hi,
>>>
>>> IntelliJ is just great for that!
>>>
>>> cheers,
>>> Ardo.
>>>
>>> On Sun, Nov 29, 2015 at 5:18 PM, Masf  wrote:
>>>
 Hi

 Is it possible to debug spark locally with IntelliJ or another IDE?

 Thanks

 --
 Regards.
 Miguel Ángel

>>>
>>>
>>
>>
>> --
>>
>>
>> Saludos.
>> Miguel Ángel
>>
>
>


Re: Debug Spark

2015-11-29 Thread Ndjido Ardo BAR
Masf, the following link sets the basics to start debugging your spark apps
in local mode:

https://medium.com/large-scale-data-processing/how-to-kick-start-spark-development-on-intellij-idea-in-4-steps-c7c8f5c2fe63#.675s86940

Ardo

On Sun, Nov 29, 2015 at 5:34 PM, Masf  wrote:

> Hi Ardo
>
>
> Some tutorial to debug with Intellij?
>
> Thanks
>
> Regards.
> Miguel.
>
>
> On Sun, Nov 29, 2015 at 5:32 PM, Ndjido Ardo BAR  wrote:
>
>> hi,
>>
>> IntelliJ is just great for that!
>>
>> cheers,
>> Ardo.
>>
>> On Sun, Nov 29, 2015 at 5:18 PM, Masf  wrote:
>>
>>> Hi
>>>
>>> Is it possible to debug spark locally with IntelliJ or another IDE?
>>>
>>> Thanks
>>>
>>> --
>>> Regards.
>>> Miguel Ángel
>>>
>>
>>
>
>
> --
>
>
> Saludos.
> Miguel Ángel
>


Re: Debug Spark

2015-11-29 Thread Danny Stephan
Hi,

You can use "jdwp" (the Java Debug Wire Protocol) to debug anything that runs on top of the JVM, including Spark.
 
Specific with IntelliJ,  maybe this link can help you:

http://danosipov.com/?p=779 

regards,
Danny


> On 29 Nov 2015, at 17:34, Masf  wrote:
> 
> Hi Ardo
> 
> 
> Some tutorial to debug with Intellij?
> 
> Thanks
> 
> Regards.
> Miguel.
> 
> 
> On Sun, Nov 29, 2015 at 5:32 PM, Ndjido Ardo BAR  > wrote:
> hi,
> 
> IntelliJ is just great for that!
> 
> cheers,
> Ardo.
> 
> On Sun, Nov 29, 2015 at 5:18 PM, Masf  > wrote:
> Hi
> 
> Is it possible to debug spark locally with IntelliJ or another IDE?
> 
> Thanks
> 
> -- 
> Regards.
> Miguel Ángel
> 
> 
> 
> 
> -- 
> 
> 
> Saludos.
> Miguel Ángel



Re: Debug Spark

2015-11-29 Thread Ndjido Ardo BAR
hi,

IntelliJ is just great for that!

cheers,
Ardo.

On Sun, Nov 29, 2015 at 5:18 PM, Masf  wrote:

> Hi
>
> Is it possible to debug spark locally with IntelliJ or another IDE?
>
> Thanks
>
> --
> Regards.
> Miguel Ángel
>


Re: Debug Spark

2015-11-29 Thread Masf
Hi Ardo


Some tutorial to debug with Intellij?

Thanks

Regards.
Miguel.


On Sun, Nov 29, 2015 at 5:32 PM, Ndjido Ardo BAR  wrote:

> hi,
>
> IntelliJ is just great for that!
>
> cheers,
> Ardo.
>
> On Sun, Nov 29, 2015 at 5:18 PM, Masf  wrote:
>
>> Hi
>>
>> Is it possible to debug spark locally with IntelliJ or another IDE?
>>
>> Thanks
>>
>> --
>> Regards.
>> Miguel Ángel
>>
>
>


-- 


Saludos.
Miguel Ángel


Re: Debug Spark Streaming in PyCharm

2015-07-10 Thread Tathagata Das
spark-submit does a lot of magic configuration (classpaths, etc.) under
the covers to enable PySpark to find the Spark JARs. I am not sure how you
can start running things directly from the PyCharm IDE. Others in the
community may be able to answer. For now, the main way to run PySpark code
is through spark-submit, or pyspark (which uses spark-submit underneath).
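
One commonly used workaround (an assumption on my part, not from this thread: it requires SPARK_HOME to point at a local Spark installation) is to put Spark's bundled Python sources on sys.path at the top of the script, so a plain PyCharm run/debug configuration can import pyspark without going through spark-submit:

```python
import glob
import os
import sys

# Hypothetical default path: SPARK_HOME must point at a real installation.
SPARK_HOME = os.environ.get("SPARK_HOME", "/opt/spark")

# Spark ships its Python API under $SPARK_HOME/python, plus a bundled
# py4j zip that pyspark needs to talk to the JVM.
sys.path.insert(0, os.path.join(SPARK_HOME, "python"))
sys.path.extend(glob.glob(os.path.join(SPARK_HOME, "python", "lib", "py4j-*.zip")))

# After this, `import pyspark` works in a normal IDE run -- provided the
# installation actually exists; only the path setup is shown here.
```

Extra jars such as the Kafka streaming assembly would still need to reach the JVM classpath (e.g. via spark.jars or spark.driver.extraClassPath), so this only removes the import problem, not the dependency one.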

On Fri, Jul 10, 2015 at 6:28 AM, blbradley bradleytas...@gmail.com wrote:

 Hello,

 I'm trying to debug a PySpark app with Kafka Streaming in PyCharm. However,
 PySpark cannot find the jar dependencies for Kafka Streaming without
 editing
 the program. I can temporarily use SparkConf to set 'spark.jars', but I'm
 using Mesos for production and don't want to edit my program every time I
 want to debug. I'd like to find a way to debug without editing the source.

 Here's what my PyCharm debug execution command looks like:

 home/brandon/.pyenv/versions/coinspark/bin/python2.7
 /opt/pycharm-community/helpers/pydev/pydevd.py --multiproc --client
 127.0.0.1 --port 59042 --file
 /home/brandon/src/coins/coinspark/streaming.py

 I might be able to use spark-submit as the command PyCharm runs, but I'm
 not sure if that will work with the debugger.

 Thoughts?

 Cheers!
 Brandon Bradley



 --
 View this message in context:
 http://apache-spark-user-list.1001560.n3.nabble.com/Debug-Spark-Streaming-in-PyCharm-tp23766.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.





Re: Debug Spark in Cluster Mode

2014-10-10 Thread Ilya Ganelin
I would also be interested in knowing more about this. I have used
Cloudera Manager and the Spark web UI (clientnode:4040), but I would
love to know if there are other tools out there, either for
post-processing or better observation during execution.
On Oct 9, 2014 4:50 PM, Rohit Pujari rpuj...@hortonworks.com wrote:

 Hello Folks:

 What're some best practices to debug Spark in cluster mode?


 Thanks,
 Rohit
