Thank you, Yiming. It is helpful. Regards!
Chen

On Tue, Nov 18, 2014 at 8:00 PM, Yiming (John) Zhang <sdi...@gmail.com> wrote:

> Hi,
>
> I noticed it is hard to find a thorough introduction to debugging SPARK-1.1
> applications in IntelliJ with mvn/sbt, which is not straightforward for
> beginners. So I spent several days figuring it out, and I hope it will be
> helpful for beginners like me and that professionals can help me improve it.
> (The intro with figures can be found at:
> http://kylinx.com/spark/Debug-Spark-in-IntelliJ.htm)
>
> (1) Install the Scala plugin.
>
> (2) Download, unzip, and open spark-1.1.0 in IntelliJ.
>
> a) mvn: File -> Open. Select the Spark source folder (e.g.,
> /root/spark-1.1.0). It may take a long time to download dependencies and
> compile everything.
>
> b) sbt: File -> Import Project. Select "Import project from external
> model", choose SBT project, and click Next. Enter the Spark source path
> (e.g., /root/spark-1.1.0) for "SBT project", and select "Use auto-import".
>
> (3) First compile and run the Spark examples in the console to ensure
> everything is OK:
>
> # mvn -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests clean package
> # ./sbt/sbt assembly -Phadoop-2.2 -Dhadoop.version=2.2.0
>
> (4) Add the compiled Spark-Hadoop assembly
> (spark-assembly-1.1.0-hadoop2.2.0) to "Libraries" (File -> Project
> Structure -> Libraries -> green "+"), and choose the modules that use it
> (right-click the library and click "Add to Modules"). It seems only
> spark-examples needs it.
>
> (5) On the "Dependencies" page of the modules using this library, ensure
> that the "Scope" of the library is "Compile" (File -> Project Structure ->
> Modules).
>
> (6) For sbt, it seems we have to mark the scope of all other Hadoop
> dependencies (SBT: org.apache.hadoop.hadoop-*) as "Test" (due to a poor
> Internet connection?), and this has to be redone every time IntelliJ is
> opened (due to a bug?).
>
> (7) Configure the debug environment (using LogQuery as an example).
> Run -> Edit Configurations:
>
> Main class: org.apache.spark.examples.LogQuery
> VM options: -Dspark.master=local
> Working directory: /root/spark-1.1.0
> Use classpath of module: spark-examples_2.10
>
> Before launch (mvn): External tool: mvn
> Program: /root/Programs/apache-maven-3.2.1/bin/mvn
> Parameters: -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests package
> Working directory: /root/spark-1.1.0
>
> Before launch (sbt): External tool: sbt
> Program: /root/spark-1.1.0/sbt/sbt
> Parameters: -Phadoop-2.2 -Dhadoop.version=2.2.0 assembly
> Working directory: /root/spark-1.1.0
>
> (8) Click Run -> Debug 'LogQuery' to start debugging.
>
> Cheers,
> Yiming
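As a sanity check before wiring this into IntelliJ, the run configuration in step (7) can be sketched as a shell command. This is a hedged sketch, not from the original mail: the paths ($SPARK_HOME, the assembly jar name, the examples classes directory) follow the guide's example layout and may differ on your machine. The script only prints the launch command so you can inspect the classpath first.

```shell
#!/bin/sh
# Sketch of the step (7) debug configuration as a plain command line.
# SPARK_HOME defaults to the path used in the guide; override it for
# your own checkout.
SPARK_HOME="${SPARK_HOME:-/root/spark-1.1.0}"

# The assembly jar produced by the step (3) build (and added to
# "Libraries" in step (4)), plus the compiled example classes.
ASSEMBLY="$SPARK_HOME/assembly/target/scala-2.10/spark-assembly-1.1.0-hadoop2.2.0.jar"
EXAMPLES="$SPARK_HOME/examples/target/scala-2.10/classes"

# Same VM option and main class as the IntelliJ run configuration.
CMD="java -cp $ASSEMBLY:$EXAMPLES -Dspark.master=local org.apache.spark.examples.LogQuery"

# Print rather than execute, so this is safe to run before the build exists.
echo "$CMD"
```

If the printed command runs cleanly in a terminal, the IntelliJ debug configuration with the same main class, VM options, and module classpath should work as well.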