Ok, strictly speaking, that's equivalent to your second class of examples,
the "development console", not the first, "sbt console".
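(For anyone following along: both consoles are ordinary Scala REPLs, so a plain-Scala warm-up like the one below works in either. It just mimics the key/value shape of the sql("SELECT * FROM src").take(2) example quoted further down; nothing in it is Spark-specific, and the names are illustrative only.)

```scala
// Build some (key, value) pairs resembling the Hive test table's rows
// ([238,val_238], [86,val_86], ...) and take the first two, the same
// shape as the sql(...).take(2) call shown later in this thread.
val rows = (1 to 100).map(i => (i, s"val_$i"))
val firstTwo = rows.take(2)
println(firstTwo)
```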

On Sun, Nov 16, 2014 at 1:47 PM, Mark Hamstra <m...@clearstorydata.com>
wrote:

> The console mode of sbt (just run
>> sbt/sbt and a long-running console session is started that will accept
>> further commands) is great for building individual subprojects or running
>> single test suites.  In addition to being faster since it's a long-running
>> JVM, it's got a lot of nice features like tab completion for test case
>> names.
>
>
> We include the scala-maven-plugin in spark/pom.xml, so equivalent
> functionality is available using Maven.  You can start a console session
> with `mvn scala:console`.
>
>
> On Sun, Nov 16, 2014 at 1:23 PM, Michael Armbrust <mich...@databricks.com>
> wrote:
>
>> I'm going to have to disagree here.  If you are building a release
>> distribution or integrating with legacy systems then Maven is probably the
>> correct choice.  However, most of the core developers that I know use sbt,
>> and I think it's a better choice for exploration and development overall.
>> That said, this probably falls into the category of a religious argument,
>> so you might want to look at both options and decide for yourself.
>>
>> In my experience the SBT build is significantly faster with less effort
>> (and I think sbt is still faster even if you go through the extra effort
>> of installing zinc) and easier to read.  The console mode of sbt (just run
>> sbt/sbt and a long-running console session is started that will accept
>> further commands) is great for building individual subprojects or running
>> single test suites.  In addition to being faster since it's a long-running
>> JVM, it's got a lot of nice features like tab completion for test case
>> names.
>>
>> For example, if you wanted to see what test cases are available in the SQL
>> subproject, you could do the following:
>>
>> [marmbrus@michaels-mbp spark (tpcds)]$ sbt/sbt
>> [info] Loading project definition from
>> /Users/marmbrus/workspace/spark/project/project
>> [info] Loading project definition from
>> /Users/marmbrus/.sbt/0.13/staging/ad8e8574a5bcb2d22d23/sbt-pom-reader/project
>> [info] Set current project to spark-parent (in build
>> file:/Users/marmbrus/workspace/spark/)
>> > sql/test-only *<tab>*
>> --
>> org.apache.spark.sql.CachedTableSuite
>> org.apache.spark.sql.DataTypeSuite
>> org.apache.spark.sql.DslQuerySuite
>> org.apache.spark.sql.InsertIntoSuite
>> ...
>>
>> Another very useful feature is the development console, which starts an
>> interactive REPL including the most recent version of the code and a lot
>> of useful imports for some subprojects.  For example, in the hive
>> subproject it automatically sets up a temporary database with a bunch of
>> test data pre-loaded:
>>
>> $ sbt/sbt hive/console
>> ...
>> import org.apache.spark.sql.hive._
>> import org.apache.spark.sql.hive.test.TestHive._
>> import org.apache.spark.sql.parquet.ParquetTestData
>> Welcome to Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java
>> 1.7.0_45).
>> Type in expressions to have them evaluated.
>> Type :help for more information.
>>
>> scala> sql("SELECT * FROM src").take(2)
>> res0: Array[org.apache.spark.sql.Row] = Array([238,val_238], [86,val_86])
>>
>> Michael
>>
>> On Sun, Nov 16, 2014 at 3:27 AM, Dinesh J. Weerakkody <
>> dineshjweerakk...@gmail.com> wrote:
>>
>> > Hi Stephen and Sean,
>> >
>> > Thanks for correction.
>> >
>> > On Sun, Nov 16, 2014 at 12:28 PM, Sean Owen <so...@cloudera.com> wrote:
>> >
>> > > No, the Maven build is the main one.  I would use it unless you have a
>> > > need to use the SBT build in particular.
>> > > On Nov 16, 2014 2:58 AM, "Dinesh J. Weerakkody" <
>> > > dineshjweerakk...@gmail.com> wrote:
>> > >
>> > >> Hi Yiming,
>> > >>
>> > >> I believe that both SBT and MVN are supported in Spark, but SBT is
>> > >> preferred (I'm not 100% sure about this :) ). When I was using MVN I
>> > >> got some build failures; after that I used SBT and it worked fine.
>> > >>
>> > >> You can go through these discussions regarding SBT vs MVN and learn
>> > >> the pros and cons of both [1] [2].
>> > >>
>> > >> [1]
>> > >> http://apache-spark-developers-list.1001551.n3.nabble.com/DISCUSS-Necessity-of-Maven-and-SBT-Build-in-Spark-td2315.html
>> > >>
>> > >> [2]
>> > >> https://groups.google.com/forum/#!msg/spark-developers/OxL268v0-Qs/fBeBY8zmh3oJ
>> > >>
>> > >> Thanks,
>> > >>
>> > >> On Sun, Nov 16, 2014 at 7:11 AM, Yiming (John) Zhang <
>> sdi...@gmail.com>
>> > >> wrote:
>> > >>
>> > >> > Hi,
>> > >> >
>> > >> >
>> > >> >
>> > >> > I am new to developing Spark and my current focus is on
>> > >> > co-scheduling of Spark tasks. However, I am confused by the build
>> > >> > tools: sometimes the documentation uses mvn but sometimes uses sbt.
>> > >> >
>> > >> >
>> > >> >
>> > >> > So, my question is: which one is the preferred tool of the Spark
>> > >> > community? And what's the technical difference between them? Thank
>> > >> > you!
>> > >> >
>> > >> >
>> > >> >
>> > >> > Cheers,
>> > >> >
>> > >> > Yiming
>> > >> >
>> > >> >
>> > >>
>> > >>
>> > >> --
>> > >> Thanks & Best Regards,
>> > >>
>> > >> *Dinesh J. Weerakkody*
>> > >>
>> > >
>> >
>> >
>> > --
>> > Thanks & Best Regards,
>> >
>> > *Dinesh J. Weerakkody*
>> >
>>
>
>
