Yeah, my comment was mostly reflecting the fact that mvn is what
creates the releases and is the 'build of reference', from which the
SBT build is generated. The docs were recently changed to suggest that
Maven is the default build and SBT is for advanced users. I find Maven
plays nicer with IDEs, or at least, IntelliJ.

SBT is faster for incremental compilation and better for anyone who
knows and can leverage SBT's model.

If someone's new to it all, I dunno, they're likelier to have fewer
problems using Maven to start? YMMV.
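
For anyone weighing the two, here's a rough sketch of equivalent invocations. These follow the Spark build docs of this era (`sbt/sbt` wrapper script, Maven from the repo root); exact profiles and flags depend on your environment, so treat this as illustrative rather than canonical:

```shell
# Full build with Maven (the reference build, used for releases).
# Skip tests for speed; add profiles (e.g. -Pyarn) as your setup requires.
mvn -DskipTests clean package

# Roughly equivalent full build with SBT.
sbt/sbt assembly

# The long-running SBT console: start it once, then issue commands
# interactively, avoiding repeated JVM startup cost.
sbt/sbt
# > sql/compile     -- compile just the sql subproject
# > ~compile        -- recompile incrementally on every source change
```

The `~` prefix is sbt's triggered execution: it watches sources and re-runs the command on change, which is where most of the incremental-compilation speedup shows up in day-to-day development.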

On Sun, Nov 16, 2014 at 9:23 PM, Michael Armbrust
<mich...@databricks.com> wrote:
> I'm going to have to disagree here.  If you are building a release
> distribution or integrating with legacy systems then maven is probably the
> correct choice.  However, most of the core developers I know use sbt,
> and I think it's a better choice for exploration and development overall.
> That said, this probably falls into the category of a religious argument so
> you might want to look at both options and decide for yourself.
>
> In my experience the SBT build is significantly faster with less effort (and
> I think sbt is still faster even if you go through the extra effort of
> installing zinc) and easier to read.  The console mode of sbt (just run
> sbt/sbt and then a long running console session is started that will accept
> further commands) is great for building individual subprojects or running
> single test suites.  In addition to being faster since it's a long-running
> JVM, it's got a lot of nice features like tab-completion for test case names.
>
> For example, to see what test cases are available in the SQL
> subproject, you can do the following:
>
> [marmbrus@michaels-mbp spark (tpcds)]$ sbt/sbt
> [info] Loading project definition from
> /Users/marmbrus/workspace/spark/project/project
> [info] Loading project definition from
> /Users/marmbrus/.sbt/0.13/staging/ad8e8574a5bcb2d22d23/sbt-pom-reader/project
> [info] Set current project to spark-parent (in build
> file:/Users/marmbrus/workspace/spark/)
>> sql/test-only <tab>
> --
> org.apache.spark.sql.CachedTableSuite
> org.apache.spark.sql.DataTypeSuite
> org.apache.spark.sql.DslQuerySuite
> org.apache.spark.sql.InsertIntoSuite
> ...
>
> Another very useful feature is the development console, which starts an
> interactive REPL including the most recent version of the code and a lot of
> useful imports for some subprojects.  For example in the hive subproject it
> automatically sets up a temporary database with a bunch of test data
> pre-loaded:
>
> $ sbt/sbt hive/console
>> hive/console
> ...
> import org.apache.spark.sql.hive._
> import org.apache.spark.sql.hive.test.TestHive._
> import org.apache.spark.sql.parquet.ParquetTestData
> Welcome to Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java
> 1.7.0_45).
> Type in expressions to have them evaluated.
> Type :help for more information.
>
> scala> sql("SELECT * FROM src").take(2)
> res0: Array[org.apache.spark.sql.Row] = Array([238,val_238], [86,val_86])
>
> Michael
>
> On Sun, Nov 16, 2014 at 3:27 AM, Dinesh J. Weerakkody
> <dineshjweerakk...@gmail.com> wrote:
>>
>> Hi Stephen and Sean,
>>
>> Thanks for correction.
>>
>> On Sun, Nov 16, 2014 at 12:28 PM, Sean Owen <so...@cloudera.com> wrote:
>>
>> > No, the Maven build is the main one.  I would use it unless you have a
>> > need to use the SBT build in particular.
>> > On Nov 16, 2014 2:58 AM, "Dinesh J. Weerakkody" <
>> > dineshjweerakk...@gmail.com> wrote:
>> >
>> >> Hi Yiming,
>> >>
>> >> I believe both SBT and MVN are supported in Spark, but that SBT is
>> >> preferred (I'm not 100% sure about this :) ). When I used MVN I got some
>> >> build failures; after switching to SBT it worked fine.
>> >>
>> >> You can go through these discussions regarding SBT vs MVN to learn the
>> >> pros and cons of both [1] [2].
>> >>
>> >> [1]
>> >> http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Necessity-of-Maven-and-SBT-Build-in-Spark-td2315.html
>> >> [2]
>> >> https://groups.google.com/forum/#!msg/spark-developers/OxL268v0-Qs/fBeBY8zmh3oJ
>> >>
>> >> Thanks,
>> >>
>> >> On Sun, Nov 16, 2014 at 7:11 AM, Yiming (John) Zhang <sdi...@gmail.com>
>> >> wrote:
>> >>
>> >> > Hi,
>> >> >
>> >> > I am new to developing Spark, and my current focus is the
>> >> > co-scheduling of Spark tasks. However, I am confused by the build
>> >> > tools: sometimes the documentation uses mvn and sometimes sbt.
>> >> >
>> >> > So my question is: which one is the preferred tool of the Spark
>> >> > community? And what's the technical difference between them? Thank you!
>> >> >
>> >> > Cheers,
>> >> >
>> >> > Yiming
>> >> >
>> >> >
>> >>
>> >>
>> >> --
>> >> Thanks & Best Regards,
>> >>
>> >> *Dinesh J. Weerakkody*
>> >>
>> >
>>
>>
>> --
>> Thanks & Best Regards,
>>
>> *Dinesh J. Weerakkody*
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org
