+1. Graph analytics is now mainstream, and first-class Cypher support in Spark would let users work with highly connected datasets (fraud detection, epidemiology analysis, genomic analysis, and so on), going beyond the limits of joins when you must traverse a dataset.

On 2019/01/15 16:52:44, Xiangrui Meng <m...@gmail.com> wrote:
> Hi all,
>
> I want to re-send the previous SPIP on introducing a DataFrame-based graph
> component to collect more feedback. It supports property graphs, Cypher
> graph queries, and graph algorithms built on top of the DataFrame API. If
> you are a GraphX user or your workload is essentially graph queries, please
> help review and check how it fits into your use cases. Your feedback would
> be greatly appreciated!
>
> # Links to SPIP and design sketch:
>
> * Jira issue for the SPIP: https://issues.apache.org/jira/browse/SPARK-25994
> * Google Doc:
> https://docs.google.com/document/d/1ljqVsAh2wxTZS8XqwDQgRT6i_mania3ffYSYpEgLx9k/edit?usp=sharing
> * Jira issue for a first design sketch:
> https://issues.apache.org/jira/browse/SPARK-26028
> * Google Doc:
> https://docs.google.com/document/d/1Wxzghj0PvpOVu7XD1iA8uonRYhexwn18utdcTxtkxlI/edit?usp=sharing
>
> # Sample code:
>
> ~~~
> val graph = ...
>
> // query
> val result = graph.cypher("""
> MATCH (p:Person)-[r:STUDY_AT]->(u:University)
> RETURN p.name, r.since, u.name
> """)
>
> // algorithms
> val ranks = graph.pageRank.run()
> ~~~
>
> Best,
> Xiangrui