I agree that the Spark Geode Connector should have its own repo.

In fact, to use the Spark Geode Connector, users write a Spark
application (rather than a Geode application) that calls the Spark Geode
Connector APIs.

There are a bunch of similar Spark connector projects which connect Spark
with other data stores (e.g. Cassandra, HBase). Each of these projects has
its own independent repo, instead of living in the same repo as the data
store it supports. Please take a look at http://spark-packages.org for more
details.

I don't think having the Spark Geode Connector in its own repo will make
Geode releases difficult. On the contrary, it will make them easier,
because a Geode release then doesn't have to worry about the Spark Geode
Connector.
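For reference, with the connector in its own repo, pinning it to particular
Spark and Geode versions comes down to its sbt dependency declarations. A
minimal sketch (all coordinates and version numbers here are illustrative
assumptions, not the project's actual ones):

```scala
// build.sbt sketch for a standalone Spark Geode Connector repo.
// Coordinates and versions below are illustrative assumptions.
name := "spark-geode-connector"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  // "provided" scope: the Spark runtime supplies these jars on the
  // cluster, so they are needed at compile time only.
  "org.apache.spark" %% "spark-core" % "1.3.1" % "provided",
  "org.apache.spark" %% "spark-sql"  % "1.3.1" % "provided",
  // Hypothetical Geode client coordinates, for illustration.
  "org.apache.geode" % "geode-core"  % "1.0.0-incubating"
)
```

Bumping to a new Spark version is then a one-line change in this file,
tagged in the connector's own repo, independent of any Geode release.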


On Tue, Jul 7, 2015 at 10:35 AM, Jason Huynh <jhu...@pivotal.io> wrote:

> I agree with the GitHub approach, as the Spark connector was originally
> designed to be in its own repo with dependencies on the Spark and Geode
> jars.  I think backwards compatibility across Spark versions would be
> handled as John described, based on the sbt dependencies file.
>
> If we go with the single repo approach, as everyone has stated, we would
> want to have more releases for Geode, which would mean Geode would have at
> least as many releases as Spark.
>
>
> On Tue, Jul 7, 2015 at 10:21 AM, Kirk Lund <kl...@pivotal.io> wrote:
>
> > The recommended ideal time for building and executing all unit tests for
> > a project is 10 minutes.[0][1][2]
> >
> > "Builds should be fast. Anything beyond 10 minutes becomes a dysfunction
> > in the process, because people won’t commit as frequently. Large builds
> > can be broken into multiple jobs and executed in parallel."[3]
> >
> > Now imagine packing 6 projects together into 1 project. Even assuming
> > all 6 have very fast unit tests that use Mockito and each takes 10
> > minutes to run, you end up with 60 minutes to build the overall project.
> >
> > This is then heading in the opposite direction from where Geode needs to
> > go. If Geode continues to execute distributedTest and integrationTest
> > from the main build target, that drives the build time up even longer.
> > I'd recommend considering every option to reduce build time, including
> > moving independent tools to other repos.
> >
> > I think it's more likely that other contributors will join in on the
> > Spark connector or JVSD or even Geode if they are isolated in their own
> > projects.
> >
> > But, if group consensus is to keep everything in 1 project, then let's at
> > least talk seriously about committing to breaking up tests into multiple
> > jobs for parallel execution.
> >
> > [0] http://www.jamesshore.com/Agile-Book/ten_minute_build.html
> > [1] http://www.martinfowler.com/articles/continuousIntegration.html#KeepTheBuildFast
> > [2] http://www.energizedwork.com/weblog/2006/02/ten-minute-build-continuous
> > [3] http://blogs.collab.net/devopsci/ten-best-practices-for-continuous-integration
> >
> > On Tue, Jul 7, 2015 at 9:55 AM, Anthony Baker <aba...@pivotal.io> wrote:
> >
> > > Given the rate of change, it doesn’t seem like we should be trying to
> > > add (and maintain) support for every single Spark release.  We’re early
> > > in the lifecycle of the Spark connector, and too much emphasis on
> > > backwards compatibility will be a drag on our ongoing development,
> > > particularly since the Spark community values rapid evolution over
> > > stability.
> > >
> > > (apologies if I have misconstrued the state of Spark)
> > >
> > > Anthony
> > >
> > >
> > > > On Jul 6, 2015, at 11:22 PM, Qihong Chen <qc...@pivotal.io> wrote:
> > > >
> > > > The problem is caused by multiple major dependencies with different
> > > > release cycles. The Spark Geode Connector depends on two products:
> > > > Spark and Geode (not counting other dependencies); Spark moves much
> > > > faster than Geode, and some features/code are not backward compatible.
> > > >
> > > > Our initial connector implementation depended on Spark 1.2, up until
> > > > the last week of March 2015. Then Spark 1.3 was released in the last
> > > > week of March, and some connector features didn't work with Spark 1.3,
> > > > so we moved on and now support Spark 1.3 (but not 1.2 any more, though
> > > > we did create a tag). Two weeks ago, Spark 1.4 was released, and it
> > > > broke our connector code again.
> > > >
> > > > Therefore, for each Geode release, we probably need multiple Connector
> > > > releases, and probably need to maintain the last 2 or 3 Connector
> > > > releases. For example, we need to support both Spark 1.3 and 1.4 with
> > > > the current Geode code.
> > > >
> > > > The question is how to support this with single source repository?
> > > >
> > > > Thanks,
> > > > Qihong
> > >
> > >
> >
>
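On Qihong's question of supporting multiple Spark versions from a single
source repository: one common sbt pattern is to keep version-specific shims
in per-version source directories and select them at build time. A rough
sketch, with all names, paths, and versions being illustrative assumptions:

```scala
// build.sbt sketch: pick the Spark version via a system property and add a
// matching version-specific source directory (e.g. src/main/scala-spark-1.3
// or src/main/scala-spark-1.4) alongside the shared sources.
// All names and versions here are illustrative assumptions.
val sparkVersion = sys.props.getOrElse("spark.version", "1.4.0")

libraryDependencies +=
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided"

unmanagedSourceDirectories in Compile += {
  // Reduce "1.4.0" to "1.4" to name the per-version source directory.
  val major = sparkVersion.split('.').take(2).mkString(".")
  (sourceDirectory in Compile).value / s"scala-spark-$major"
}
```

The incompatible calls then live behind a small shim trait with one
implementation per Spark version, and the shared code compiles only against
the shim, so one repo can build against Spark 1.3 or 1.4 from the same
sources.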
