Re: New Codes in GraphX

Ankur Dave Tue, 18 Nov 2014 02:06:39 -0800

At 2014-11-18 14:51:54 +0530, Deep Pradhan <pradhandeep1...@gmail.com> wrote:
> I am using Spark 1.0.0. There are two GraphX directories that I can see here:
>
> 1. spark-1.0.0/examples/src/main/scala/org/apache/spark/examples/graphx
> which contains LiveJournalPageRank.scala
>
> 2. spark-1.0.0/graphx/src/main/scala/org/apache/spark/graphx/lib which
> contains Analytics.scala, ConnectedComponents.scala, etc.
>
> Now, if I want to add my own code to GraphX, i.e., if I want to write a
> small application on GraphX, in which directory should I add my code, 1
> or 2? And what is the difference?


If you want to add an algorithm that you can call from the Spark shell and
submit as a pull request, you should add it to org.apache.spark.graphx.lib
(#2). To run it from the command line, you'll also have to modify
Analytics.scala, the command-line driver in that package, so that it knows
about your algorithm.
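
As a rough sketch of what such an addition could look like, here is a tiny
algorithm written in the same shape as the existing ones in that package (an
object with a run method that takes a Graph). The name OutDegreeLabels and
what it computes are made up purely for illustration; nothing like it ships
with Spark.

```scala
// In a Spark checkout this file would start with:
//   package org.apache.spark.graphx.lib
import scala.reflect.ClassTag

import org.apache.spark.graphx._

/** Hypothetical example algorithm; not part of Spark. */
object OutDegreeLabels {
  /**
   * Returns a graph with the same edges in which every vertex attribute
   * is replaced by that vertex's out-degree. Vertices without outgoing
   * edges are missing from `outDegrees`, so they get 0.
   */
  def run[VD: ClassTag, ED: ClassTag](graph: Graph[VD, ED]): Graph[Int, ED] = {
    graph.outerJoinVertices(graph.outDegrees) { (vid, attr, degOpt) =>
      degOpt.getOrElse(0)
    }
  }
}
```

After rebuilding Spark, something like OutDegreeLabels.run(graph).vertices
should then be callable from the Spark shell on any graph you have loaded.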

If you want to write a separate application, the ideal way is to do it in a 
separate project that links in Spark as a dependency [1]. It will also work to 
put it in either #1 or #2, but this will be worse in the long term because each 
build cycle will require you to rebuild and restart all of Spark rather than 
just building your application and calling spark-submit on the new JAR.
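
For the separate-project route, a minimal application might look like the
sketch below. It assumes the project's build declares the spark-core and
spark-graphx artifacts (group org.apache.spark) at the version matching your
cluster. The object name SimpleGraphApp, the three-vertex toy graph, and the
PageRank tolerance are placeholders, not anything from the Spark sources.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx._

/** Hypothetical standalone GraphX application; all names are placeholders. */
object SimpleGraphApp {

  /** Builds a small directed cycle 1 -> 2 -> 3 -> 1 and runs PageRank on it. */
  def run(sc: SparkContext): Array[(VertexId, Double)] = {
    val edges = sc.parallelize(Seq(
      Edge(1L, 2L, 1),
      Edge(2L, 3L, 1),
      Edge(3L, 1L, 1)))
    // Vertices are created from the edge endpoints with attribute 0.
    val graph = Graph.fromEdges(edges, 0)
    // Iterate until ranks change by less than the given tolerance.
    graph.pageRank(0.001).vertices.collect()
  }

  def main(args: Array[String]): Unit = {
    // The master URL is left unset here so that spark-submit can supply it.
    val conf = new SparkConf().setAppName("SimpleGraphApp")
    val sc = new SparkContext(conf)
    try {
      run(sc).foreach { case (id, rank) => println(s"$id\t$rank") }
    } finally {
      sc.stop()
    }
  }
}
```

Once packaged, this can be launched with spark-submit, passing
--class SimpleGraphApp, a --master of your choice, and the path to the JAR
your build produces. Only this small project needs rebuilding when the code
changes.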

Ankur

[1] http://spark.apache.org/docs/1.0.2/quick-start.html#standalone-applications

