A lot of people like me use GraphFrames for its connected components implementation and its motif matching feature. I am willing to work on it to keep it alive. They did a 0.8.3 release not too long ago. Please keep GraphX alive.
On Sat, Oct 5, 2024 at 3:44 PM Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> I added the user list as they may have a vested interest here and
> hopefully can contribute.
>
> A few suggestions:
>
> 1. Data-Driven Decision Making: Return to the core metrics: analyze
>    usage trends, performance benchmarks, and the actual impact on
>    businesses that rely on GraphX. Objectivity can be restored by
>    letting data speak louder than opinions, so to speak.
> 2. Broaden the Discussion: Engage more stakeholders from diverse
>    backgrounds (especially Spark users) to bring in new perspectives
>    and counterbalance the more vocal but potentially narrow interests
>    of core maintainers or open-source contributors.
> 3. Define Clear Criteria for Decision Making: Agree on a set of
>    objective criteria by which the project's future will be judged.
>    These could include market demand, contribution levels, maintenance
>    costs, alternative solutions, and alignment with the overall Spark
>    ecosystem goals. Some have already been covered.
> 4. Timely Conclusion of Discussions: Set a timeline for making a
>    decision. Long, open-ended discussions tend to lose focus. Setting
>    deadlines forces participants to focus on key issues and prevents
>    endless debates.
> 5. Leadership Decision: Borrowing from commercial settings, it is often
>    necessary for a strong leadership team to step in and make the final
>    decision after considering the input. When the objectivity of
>    discussions starts to wane, leadership needs to cut through the
>    circular discussions and steer towards action based on business and
>    technical realities.
>
> HTH
>
> Mich Talebzadeh,
> Architect | Data Engineer | Data Science | Financial Crime
> PhD, Imperial College London
> London, United Kingdom
>
> *Disclaimer:* The information provided is correct to the best of my
> knowledge but of course cannot be guaranteed. It is essential to note
> that, as with any advice, "one test result is worth one-thousand
> expert opinions" (Wernher von Braun).
>
> On Sat, 5 Oct 2024 at 06:26, Ángel <angel.alvarez.pas...@gmail.com> wrote:
>
>> I completely agree with everyone here. I don't think the issue is
>> deprecating it; to me, the problem lies in not providing a new and
>> better solution for handling graphs in Spark. In the past, I used
>> GraphX via GraphFrames for record linkage, and I found it both useful
>> and effective. Is there any discussion about a potential replacement?
>>
>> I'd be willing to help maintain GraphX, though I don't have previous
>> experience with maintaining open-source projects. All I can promise is
>> good intentions, willingness to learn and lots of energy and passion.
>> Is that enough?
>>
>> Btw, what's your take on this?
>>
>>    - GraphX will be deprecated in favor of a new graphing component,
>>      SparkGraph, based on Cypher
>>      <https://neo4j.com/developer/cypher-query-language/>, a much
>>      richer graph language than previously offered by GraphX.
>>
>> https://cloud.google.com/blog/products/data-analytics/introducing-spark-3-and-hadoop-3-on-dataproc-image-version-2-0
>>
>> On Sat, Oct 5, 2024 at 2:17, Mark Hamstra <markhams...@gmail.com> wrote:
>>
>>> As I wrote to Holden privately, I might well change my vote to be in
>>> favor of a deprecation label combined with some effective means of
>>> communicating that this doesn't mean the end for GraphX if interested
>>> contributors come forward to rescue it. I don't like either the idea
>>> of keeping unmaintained code and public APIs around (especially if
>>> there are problems with them) or the idea of removing Spark
>>> functionality just because no one has contributed to it for a while. A
>>> naked deprecation label feels somewhat drastic and pre-emptive to me.
>>> I don't expect that GraphX will be the last part of Spark to run the
>>> risk of death through neglect, and I think we need an effective means
>>> of encouraging resuscitation that a deprecation label on its own does
>>> not provide. On the other hand, if no one really is willing to come to
>>> the aid of GraphX or other neglected functionality given adequate
>>> warning of possible removal, I'm not then opposed to the usual
>>> deprecation and removal process.
>>>
>>> On Fri, Oct 4, 2024 at 4:10 PM Sean Owen <sro...@gmail.com> wrote:
>>> >
>>> > This is a reasonable discussion, but maybe the more practical point
>>> > is: are you sure you want to block this unilaterally? This
>>> > effectively makes a decision that GraphX cannot be removed for a
>>> > long while. I'd understand it more if we had an active maintainer
>>> > and/or active user proposing to veto, but my understanding is this
>>> > is just a proposal to block this on behalf of some users, someone
>>> > else who might do some work and hasn't to date for some reason. Add
>>> > to that the fact that the 'pro' arguments all seem to be arguments
>>> > for working on GraphFrames, and I find this somewhat drastic.
>>> >
>>> > On Fri, Oct 4, 2024 at 5:23 PM Mark Hamstra <markhams...@gmail.com> wrote:
>>> >>
>>> >> "You can't say nothing is removable until there are no users."
>>> >>
>>> >> That is not what I am saying. Rather, I am countering what others seem
>>> >> to be suggesting: There are no users and no interest, therefore we can
>>> >> and should deprecate.
>>> >>
>>> >> On Fri, Oct 4, 2024 at 3:10 PM Sean Owen <sro...@gmail.com> wrote:
>>> >> >
>>> >> > I could flip this argument around. More strongly, not being
>>> >> > deprecated means "won't be removed" and likewise implies support
>>> >> > and development. I don't think either of the latter have been
>>> >> > true for years. What suggests this will change? A todo list is
>>> >> > not going to do anything, IMHO.
>>> >> >
>>> >> > I'm also concerned about the cost of that, which I have observed.
>>> >> > GraphX PRs are almost certainly not going to be reviewed because
>>> >> > of its state. Deprecation both communicates that reality, and
>>> >> > leaves an option open, whereas not deprecating forecloses that
>>> >> > option for a while.
>>> >> >
>>> >> > I don't think the question is, does anyone use it? because anyone
>>> >> > can continue to use it -- in Spark 3.x for sure, and in 4.x if
>>> >> > not removed.
>>> >> > You can't say nothing is removable until there are no users.
>>> >> >
>>> >> > Also, why would GraphFrames not be the logical home of this going
>>> >> > forward anyway? which I think is the subtext.
>>> >> >
>>> >> > On Fri, Oct 4, 2024 at 4:56 PM Mark Hamstra <markhams...@gmail.com> wrote:
>>> >> >>
>>> >> >> I'm -1(*) because, while it technically means "might be removed in
>>> >> >> the future", I think developers and users are more prone to
>>> >> >> interpret something being marked as deprecated as "very likely
>>> >> >> will be removed in the future, so don't depend on this or waste
>>> >> >> your time contributing to its further development."
>>> >> >> I don't think the latter is what we want just because something
>>> >> >> hasn't been updated meaningfully in a while. There have been How
>>> >> >> To articles for GraphX and GraphFrames posted in the not too
>>> >> >> distant past, and the Google Search trend shows a pretty steady
>>> >> >> level of interest, not a decline to zero, so I don't think that
>>> >> >> it is accurate to declare that there is no use or interest in
>>> >> >> GraphX.
>>> >> >>
>>> >> >> Unless retaining GraphX is imposing significant costs on continuing
>>> >> >> Spark development, I can't support deprecating GraphX. I can
>>> >> >> support encouraging GraphX and GraphFrames development through
>>> >> >> something like a To Do list or document of "What we'd like to see
>>> >> >> in the way of further development of Spark's graph processing
>>> >> >> capabilities" -- i.e., things that encourage and support new
>>> >> >> contributions to address any shortcomings in Spark's graph
>>> >> >> processing, not things that discourage contributions and use in
>>> >> >> the way that I believe simply declaring GraphX to be deprecated
>>> >> >> would.
>>> >> >>
>>> >> >> On Sun, Sep 29, 2024 at 11:04 AM Holden Karau <holden.ka...@gmail.com> wrote:
>>> >> >> >
>>> >> >> > Since we're getting close to cutting a 4.0 branch I'd like to
>>> >> >> > float the idea of officially deprecating GraphX. What that
>>> >> >> > would mean (to me) is we would update the docs to indicate
>>> >> >> > that GraphX is deprecated and its APIs may be removed at any
>>> >> >> > time in the future.
>>> >> >> >
>>> >> >> > Alternatively, we could mark it as "unmaintained and in search
>>> >> >> > of maintainers" with a note that if no maintainers are found,
>>> >> >> > we may remove it in a future minor version.
>>> >> >> >
>>> >> >> > Looking at the GraphX source, I don't see any meaningful active
>>> >> >> > development going back over three years*.
>>> >> >> > There is even a thread on user@ from 2017 asking if GraphX is
>>> >> >> > maintained anymore, with no response from the developers.
>>> >> >> >
>>> >> >> > Now I'm open to the idea that GraphX is stable and "works as
>>> >> >> > is" and simply doesn't require modifications, but given the
>>> >> >> > user thread I'm a little concerned here about bringing this
>>> >> >> > API with us into Spark 4 if we don't have anyone signed up to
>>> >> >> > maintain it.
>>> >> >> >
>>> >> >> > * Excluding globally applied changes
>>> >> >> > --
>>> >> >> > Twitter: https://twitter.com/holdenkarau
>>> >> >> > Fight Health Insurance: https://www.fighthealthinsurance.com/
>>> >> >> > Books (Learning Spark, High Performance Spark, etc.):
>>> >> >> > https://amzn.to/2MaRAG9
>>> >> >> > YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>> >> >> > Pronouns: she/her
>>> >> >>
>>> >> >> ---------------------------------------------------------------------
>>> >> >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org