Attaching the design doc here:
https://docs.google.com/document/d/1Wxzghj0PvpOVu7XD1iA8uonRYhexwn18utdcTxtkxlI/edit#

I think the initial design intention was to replace GraphX in Spark. At
first, the plan was to merge the GraphFrames project into Spark; the design
has since been expanded to include not only GraphFrames but also a graph
query (Cypher) engine.
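
For anyone less familiar with GraphFrames, here is a minimal sketch (in
Scala) of what a DataFrame-backed property graph looks like with the
existing GraphFrames package; the cypher(...) line at the end is only
illustrative of the direction described in the design doc, not the actual
proposed API (see the linked doc and the API PR for that):

// Assumes the graphframes Spark package is on the classpath.
import org.apache.spark.sql.SparkSession
import org.graphframes.GraphFrame

val spark = SparkSession.builder().appName("graph-sketch").getOrCreate()
import spark.implicits._

// GraphFrames convention: vertices carry an "id" column, edges carry "src"/"dst".
val vertices = Seq(("a", "Alice"), ("b", "Bob")).toDF("id", "name")
val edges = Seq(("a", "b", "KNOWS")).toDF("src", "dst", "relationship")
val g = GraphFrame(vertices, edges)

// Today's GraphFrames motif query, expressed directly on DataFrames.
g.find("(x)-[e]->(y)").show()

// The proposed Spark Graph module would add a Cypher engine on top of such
// property graphs, roughly: graph.cypher("MATCH (x)-[:KNOWS]->(y) RETURN y.name")
// (method and type names here are illustrative, not the final API surface).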

On Thu, Oct 17, 2019 at 1:41 AM Reynold Xin <r...@databricks.com> wrote:

> Just curious - did we discuss why this shouldn't be another Apache sister
> project?
>
>
> On Wed, Oct 16, 2019 at 10:21 AM, Sean Owen <sro...@gmail.com> wrote:
>
>> We don't all have to agree on whether to add this -- there are like 10
>> people with an opinion -- and I certainly would not veto it. In practice, a
>> medium-sized change needs someone to review and merge it all the way through,
>> with nobody strongly objecting. I too don't know what to make of the
>> situation; what happened to the supporters here?
>>
>> I am concerned about maintenance, as inevitably any new module falls on
>> everyone to maintain to some degree, and people come and go despite their
>> intentions. But that isn't the substance of why I personally wouldn't merge
>> it. It just doesn't seem like it must live in Spark. But again, this is my
>> opinion; you don't need to convince me, you just need to (re?)convince a
>> shepherd or sponsor for this change.
>>
>> Voting on the dependency part or whatever is also not important. It's a
>> detail, and already merged even.
>>
>> The issue at hand is: if nobody supports reviewing and merging the rest
>> of the change, what then? We can't leave it half implemented. The fallback
>> plan is just to back it out and reconsider later. This would be a poor
>> outcome process-wise, but better than leaving it incomplete.
>>
>> On Wed, Oct 16, 2019 at 3:15 AM Martin Junghanns
>> <martin.jungha...@neo4j.com> wrote:
>>
>> I'm slightly confused about this discussion. I worked on all of the
>> aforementioned PRs: the module PR that has been merged, the current PR that
>> introduces the Graph API and the PoC PR that contains the full
>> implementation. The issues around shading were addressed and the module PR
>> eventually got merged. Two PMC members including the SPIP shepherd are
>> working with me (and others) on the current API PR. The SPIP to bring Spark
>> Graph into Apache Spark itself was successfully voted on earlier this
>> year. I presented this work at Spark Summit in San Francisco in May and was
>> asked by the organizers to present the topic at the European Spark Summit.
>> I'm currently sitting in the speakers room of that conference preparing for
>> the talk and reading this thread. I hope you understand my confusion.
>>
>> I admit - and Xiangrui pointed this out in the other thread, too - that
>> we made the early mistake of not bringing more Spark committers on board,
>> which led to a stagnation period during the summer when Xiangrui wasn't around
>> to help review and bring progress to the project.
>>
>> Sean, if your concern is the lack of maintainers of that module, I
>> personally would like to volunteer to maintain Spark Graph. I'm also a
>> contributor to the Okapi stack and am able to work on whatever issue might
>> come up on that end including updating dependencies etc. FWIW, Okapi is
>> actively maintained by a team at Neo4j.
>>
>> Best, Martin
>>
>> On Wed, 16 Oct 2019, 4:35 AM Sean Owen <sro...@gmail.com> wrote:
>>
>> I do not have a very informed opinion here, so take this with a grain of
>> salt.
>>
>> I'd say that we need to either commit a coherent version of this for
>> Spark 3, or not at all. If it doesn't have support, I'd back out the
>> existing changes.
>> I was initially skeptical about how much this needs to be in Spark vs a
>> third-party package, and that still stands.
>>
>> The addition of another dependency isn't that big a deal IMHO, but, yes,
>> it does add something to the maintenance overhead. But that's all the more
>> true of a new module.
>>
>> I don't feel strongly about it, but if this isn't obviously getting
>> support from any committers, can we keep it as a third party library for
>> now?
>>
>> On Tue, Oct 15, 2019 at 8:53 PM Weichen Xu <weichen...@databricks.com>
>> wrote:
>>
>> Hi Mats Rydberg,
>>
>> Although the dependency "org.opencypher:okapi-shade.okapi" was already
>> added to Spark, Xiangrui raised two concerns about it (see the mail above),
>> so we had better rethink whether this is a good choice. That is why I am
>> calling this vote.
>>
>> Thanks!
>>
>> On Tue, Oct 15, 2019 at 10:56 PM Mats Rydberg <m...@neo4j.org.invalid>
>> wrote:
>>
>> Hello Weichen, community
>>
>> I'm sorry, I'm feeling a little bit confused about this vote. Is this
>> about the PR (https://github.com/apache/spark/pull/24490) that was
>> merged in early June and introduced the spark-graph module including the
>> okapi-shade dependency?
>>
>> Regarding the okapi-shade dependency, which was developed as part of the
>> above PR work, Scala experts at TripleQuote offered advice that helped us
>> find a satisfactory solution. The shading mechanism used is standard and
>> very comparable to a Java library shading solution.
>>
>> The PR you link (https://github.com/apache/spark/pull/24297) is not
>> meant for merging. It is just a proof-of-concept branch containing a full
>> implementation of the system, which is kept up-to-date with the API
>> discussion on the currently proposed PR:
>> https://github.com/apache/spark/pull/24851.
>>
>> Thank you
>> Mats
>>
>> On Tue, Oct 15, 2019 at 10:38 AM Weichen Xu <weichen...@databricks.com>
>> wrote:
>>
>> Hi everyone,
>>
>> I'd like to call a new vote on the issue: should we add the dependency
>> "org.opencypher:okapi-shade.okapi" to Spark? The background is:
>>
>> Spark is going to add a big feature, "Spark Graph". The prototype
>> implementation is here: https://github.com/apache/spark/pull/24297,
>> which will introduce the dependency org.opencypher:okapi-shade.okapi.
>>
>> Xiangrui already mentioned 2 concerns on this dependency change:
>>
>> On the technical side, my main concern is that the runtime dependency
>> org.opencypher:okapi-shade.okapi depends on several Scala libraries. We
>> came up with the solution of shading a few Scala libraries to avoid
>> pollution. However, I'm not super confident that the approach is
>> sustainable, for two reasons: a) there is no proper shading library for
>> Scala, and b) we will have to wait for upgrades of those Scala libraries
>> before we can upgrade Spark to use a newer Scala version. So it would be
>> great if some Scala experts could help review the current implementation
>> and help assess the risk.
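>>
>> For readers less familiar with the shading approach, here is a rough
>> sketch of the kind of sbt-assembly relocation rules involved; the package
>> names below are purely illustrative and not taken from the actual
>> okapi-shade build:
>>
>> // build.sbt of the shaded artifact (sketch only); requires the sbt-assembly
>> // plugin, e.g. addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "...") in
>> // project/plugins.sbt.
>> assemblyShadeRules in assembly := Seq(
>>   // Relocate hypothetical Scala dependencies under a shaded namespace so
>>   // they cannot clash with versions already on Spark's classpath.
>>   ShadeRule.rename("cats.**" -> "org.opencypher.okapi.shade.cats.@1").inAll,
>>   ShadeRule.rename("io.circe.**" -> "org.opencypher.okapi.shade.io.circe.@1").inAll
>> )
>> // Caveat: this kind of renaming rewrites bytecode references but not
>> // Scala's pickled signature metadata, which is part of why there is no
>> // fully "proper" shading tool for Scala.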
>>
>> So let's discuss and vote on whether this is a good choice. For the Spark
>> Graph feature to get into Spark ASAP, this issue should be resolved
>> first.
>>
>> This vote is open until next Tuesday (Oct. 22).
>>
>> [ ] +1: Accept the proposal
>> [ ] +0
>> [ ] -1: I don't think this is a good idea because ...
>>
>> Thank you!
>>
>> Weichen
>>
>>
>
>
