Re: Where to place Spark + GemFire connector.

2015-07-07 Thread Qihong Chen
The problem is caused by multiple major dependencies with different release cycles. The Spark Geode Connector depends on two products, Spark and Geode (not counting other dependencies); Spark moves much faster than Geode, and some features/code are not backward compatible. Our initial connector

Re: Where to place Spark + GemFire connector.

2015-07-07 Thread Anthony Baker
Given the rate of change, it doesn’t seem like we should be trying to add (and maintain) support for every single Spark release. We’re early in the lifecycle of the Spark connector and too much emphasis on backwards-compatibility will be a drag on our ongoing development, particularly since

Re: Where to place Spark + GemFire connector.

2015-07-07 Thread Kirk Lund
I would think that github would be a better option for the Spark Geode Connector. That way it's not tightly coupled to the Geode release cycle. I don't see why it's desirable to bloat Geode with every single script, tool, or connector that might interact with Geode. Another reason to consider

Re: Where to place Spark + GemFire connector.

2015-07-07 Thread Roman Shaposhnik
On Tue, Jul 7, 2015 at 10:34 AM, Anilkumar Gingade aging...@pivotal.io wrote: Agree... And that's the point... The connector code needs to catch up with the Spark release train; if it's part of Geode, then Geode releases need to happen as often as Spark releases (along with other planned Geode

Re: Where to place Spark + GemFire connector.

2015-07-07 Thread Dan Smith
To support different versions of Spark, wouldn't it be better to have a single code base with adapters for different versions of Spark? It seems like that would be better than maintaining several active branches with semi-duplicate code. I do think it would be better to keep the Geode Spark
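
To make the adapter idea concrete, here is a minimal Scala sketch. All names (SparkAdapter, Spark12Adapter, Spark13Adapter) are hypothetical, not taken from the connector code base; the point is that a shared core stays version-neutral while only thin adapter modules compile against a particular Spark release:

```scala
// A version-neutral facade over APIs that changed between Spark releases.
// All names here are hypothetical, not taken from the connector code base.
trait SparkAdapter {
  // Spark 1.3 renamed SchemaRDD to DataFrame, for example; each adapter
  // wraps such version-specific calls behind a common method.
  def describe: String
}

// Compiled in a module built against Spark 1.2.x only.
class Spark12Adapter extends SparkAdapter {
  def describe: String = "adapter for Spark 1.2.x APIs"
}

// Compiled in a module built against Spark 1.3.x only.
class Spark13Adapter extends SparkAdapter {
  def describe: String = "adapter for Spark 1.3.x APIs"
}

object SparkAdapter {
  // Select the adapter matching the Spark version found on the classpath.
  def forVersion(version: String): SparkAdapter =
    if (version.startsWith("1.3")) new Spark13Adapter
    else new Spark12Adapter
}
```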

Re: Where to place Spark + GemFire connector.

2015-07-07 Thread Jianxia Chen
I agree that the Spark Geode Connector should have its own repo. In fact, in order to use the Spark Geode Connector, users write a Spark application (instead of a Geode application) that calls the Spark Geode Connector APIs. There are a bunch of similar Spark connector projects which connect Spark with other
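
As a hedged illustration of that usage model, here is a sketch of a Spark driver program in Scala. The commented-out connector calls (saveToGemfire, gemfireRegion) and the spark.gemfire.locators property are assumptions about the connector's API, not confirmed signatures:

```scala
// Illustrative only: a Spark application (not a Geode application) that
// would reach Geode through connector implicits. The commented-out calls
// and the spark.gemfire.locators key are assumptions, not confirmed API.
import org.apache.spark.{SparkConf, SparkContext}

object SaveToGeodeExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("spark-geode-connector-example")
      .set("spark.gemfire.locators", "localhost[10334]") // assumed config key
    val sc = new SparkContext(conf)

    val pairs = sc.parallelize(Seq(("1", "one"), ("2", "two")))
    // With the connector's implicits imported, a write might look like:
    //   pairs.saveToGemfire("exampleRegion")
    // and a read like:
    //   val region = sc.gemfireRegion[String, String]("exampleRegion")
    println(pairs.count()) // prints 2
    sc.stop()
  }
}
```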

Re: Where to place Spark + GemFire connector.

2015-07-07 Thread Roman Shaposhnik
On Tue, Jul 7, 2015 at 11:21 AM, Gregory Chase gch...@pivotal.io wrote: More important than easy to develop is easy to pick up and use. Improving the new user experience is something that needs attention from Geode. How we develop and provide Spark integration needs to take this into

Re: Where to place Spark + GemFire connector.

2015-07-07 Thread Gregory Chase
More important than easy to develop is easy to pick up and use. Improving the new user experience is something that needs attention from Geode. How we develop and provide Spark integration needs to take this into account. Once we are able to provide official releases, how can a user know and

Re: Where to place Spark + GemFire connector.

2015-07-07 Thread Eric Pederson
I would vote to support at least the previous Spark release. The big Hadoop distros are usually a version behind in their Spark support. For example, we use MapR, which in its latest release (4.1.0) only supports Spark 1.2.1 and 1.3.1: http://doc.mapr.com/display/MapR/Ecosystem+Support+Matrix.

Re: Where to place Spark + GemFire connector.

2015-07-07 Thread John Blum
Just a quick word on maintaining different (release) branches for main dependencies (e.g., driver dependencies). Again, this is exactly what Spring Data GemFire does to support GemFire, and now Geode. In fact, it has to be this way for Apache Geode and Pivotal GemFire given the fork in the

Re: Where to place Spark + GemFire connector.

2015-07-07 Thread John Blum
For clarification... what I specifically mean is that the level of modularity can be reflected in the dependencies between modules. The POM distinguishes required vs. non-required dependencies based on the scope (i.e. 'compile'-time vs. 'optional', and so on). If you look at the Maven POM files in
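
The same required-vs-provided distinction can be expressed in an sbt build definition (sbt build files use Scala syntax). A sketch under assumed coordinates and versions, not the connector's actual build:

```scala
// build.sbt sketch (sbt build files use Scala syntax). Coordinates and
// versions below are illustrative assumptions, not the actual build.
name := "spark-geode-connector"
scalaVersion := "2.10.5"

libraryDependencies ++= Seq(
  // Required dependency: compiled against and shipped with the connector.
  "org.apache.geode" % "geode-core" % "1.0.0-incubating",
  // Compile-time only: the Spark cluster supplies it at runtime, the
  // rough sbt analogue of Maven's non-'compile' ('provided') scope.
  "org.apache.spark" %% "spark-core" % "1.3.1" % "provided"
)
```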

Re: Where to place Spark + GemFire connector.

2015-07-07 Thread Bruce Schuchardt
+1 On 7/7/2015 3:58 PM, John Blum wrote: There are a few Spring projects that are exemplary in their modularity, contained within a single repo. The core Spring Framework and Spring Boot are 2 such projects that immediately come to mind. However, this sort of disciplined

Re: Where to place Spark + GemFire connector.

2015-07-06 Thread Roman Shaposhnik
On Thu, Jul 2, 2015 at 5:39 PM, Anthony Baker aba...@pivotal.io wrote: We are wondering whether to have this as part of the Geode repo or on a separate public GitHub repo? I think the Spark connector belongs in the Geode community, which implies the Geode ASF repo. I think we can address the

Re: Where to place Spark + GemFire connector.

2015-07-06 Thread John Blum
if you unbundle your spark connector from Geode releases how do you know that a given Geode release actually works with it? Because the Spark Connector will depend on (i.e. have been developed and tested with) a specific version of Apache Geode, and is not guaranteed to work with downstream
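
One way to honor that "developed and tested with" contract at runtime, as a purely hypothetical sketch (the guard itself and the version strings are invented for illustration), is to fail fast when the connector meets a Geode build it was never tested against:

```scala
// Hypothetical fail-fast guard illustrating "developed and tested with a
// specific version of Apache Geode". The version strings are invented.
object GeodeVersionGuard {
  private val testedVersions = Set("1.0.0-incubating")

  // Throw early rather than fail in odd ways against an untested Geode.
  def requireTested(actualVersion: String): Unit =
    if (!testedVersions.contains(actualVersion))
      throw new IllegalStateException(
        s"Connector was tested against $testedVersions but found Geode $actualVersion")
}
```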

Re: Where to place Spark + GemFire connector.

2015-07-06 Thread Roman Shaposhnik
On Mon, Jul 6, 2015 at 3:10 PM, John Blum jb...@pivotal.io wrote: if you unbundle your spark connector from Geode releases how do you know that a given Geode release actually works with it? Because the Spark Connector will depend on (i.e. have been developed and tested with) a specific

Where to place Spark + GemFire connector.

2015-07-02 Thread Anilkumar Gingade
Hi Team, We have built a Spark + Geode connector, which allows users to write Spark applications to store/retrieve/query RDDs to/from a Geode cluster. We are wondering whether to have this as part of the Geode repo or on a separate public GitHub repo? Why are we thinking about a separate GitHub repo: - The

Re: Where to place Spark + GemFire connector.

2015-07-02 Thread John Blum
Personally, I would like to see Apache Geode become more modular, even down to the key low-level functional components, or features of Geode (such as Querying/Indexing, Persistence, Compression, Security, Management/Monitoring, Function Execution, even Membership, etc, etc). Of course, such