[jira] [Commented] (SPARK-25994) SPIP: Property Graphs, Cypher Queries, and Algorithms

Martin Junghanns (JIRA) Tue, 09 Apr 2019 02:11:23 -0700


    [ 
https://issues.apache.org/jira/browse/SPARK-25994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813156#comment-16813156
 ]


Martin Junghanns commented on SPARK-25994:
------------------------------------------

[~kanjilal] Thanks for the initial comments on the doc. Looking forward to you 
PR comments. Let's discuss tasks after the API is settled and 
https://issues.apache.org/jira/browse/SPARK-27300 is merged. How familiar are 
you with pyspark? Would implementing the Python API be something of interest 
for you?

> SPIP: Property Graphs, Cypher Queries, and Algorithms
> -----------------------------------------------------
>
>                 Key: SPARK-25994
>                 URL: https://issues.apache.org/jira/browse/SPARK-25994
>             Project: Spark
>          Issue Type: Epic
>          Components: Graph
>    Affects Versions: 3.0.0
>            Reporter: Xiangrui Meng
>            Assignee: Martin Junghanns
>            Priority: Major
>              Labels: SPIP
>
> Copied from the SPIP doc:
> {quote}
> GraphX was one of the foundational pillars of the Spark project, and is the 
> current graph component. This reflects the importance of the graphs data 
> model, which naturally pairs with an important class of analytic function, 
> the network or graph algorithm. 
> However, GraphX is not actively maintained. It is based on RDDs, and cannot 
> exploit Spark 2’s Catalyst query engine. GraphX is only available to Scala 
> users.
> GraphFrames is a Spark package, which implements DataFrame-based graph 
> algorithms, and also incorporates simple graph pattern matching with fixed 
> length patterns (called “motifs”). GraphFrames is based on DataFrames, but 
> has a semantically weak graph data model (based on untyped edges and 
> vertices). The motif pattern matching facility is very limited by comparison 
> with the well-established Cypher language. 
> The Property Graph data model has become quite widespread in recent years, 
> and is the primary focus of commercial graph data management and of graph 
> data research, both for on-premises and cloud data management. Many users of 
> transactional graph databases also wish to work with immutable graphs in 
> Spark.
> The idea is to define a Cypher-compatible Property Graph type based on 
> DataFrames; to replace GraphFrames querying with Cypher; to reimplement 
> GraphX/GraphFrames algos on the PropertyGraph type. 
> To achieve this goal, a core subset of Cypher for Apache Spark (CAPS), 
> reusing existing proven designs and code, will be employed in Spark 3.0. This 
> graph query processor, like CAPS, will overlay and drive the SparkSQL 
> Catalyst query engine, using the CAPS graph query planner.
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Commented] (SPARK-25994) SPIP: Property Graphs, Cypher Queries, and Algorithms

Reply via email to