Re: SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms
Hi, this is fantastic, and it will be great to have this. Another place where we could use graph frames is data lineage. You would see very wide adoption of graph frames if we could send dependency data from Catalyst to be stored somewhere as graphs of dependencies. If you are including data lineage as well, please do let me know; I would love to be part of the testing.

Regards,
Gourav Sengupta

On Tue, Jan 15, 2019 at 4:53 PM Xiangrui Meng wrote:
> Hi all,
>
> I want to re-send the previous SPIP on introducing a DataFrame-based graph
> component to collect more feedback. It supports property graphs, Cypher
> graph queries, and graph algorithms built on top of the DataFrame API. If
> you are a GraphX user or your workload is essentially graph queries, please
> help review and check how it fits into your use cases. Your feedback would
> be greatly appreciated!
>
> # Links to SPIP and design sketch:
>
> * Jira issue for the SPIP:
>   https://issues.apache.org/jira/browse/SPARK-25994
> * Google Doc:
>   https://docs.google.com/document/d/1ljqVsAh2wxTZS8XqwDQgRT6i_mania3ffYSYpEgLx9k/edit?usp=sharing
> * Jira issue for a first design sketch:
>   https://issues.apache.org/jira/browse/SPARK-26028
> * Google Doc:
>   https://docs.google.com/document/d/1Wxzghj0PvpOVu7XD1iA8uonRYhexwn18utdcTxtkxlI/edit?usp=sharing
>
> # Sample code:
>
> ~~~
> val graph = ...
>
> // query
> val result = graph.cypher("""
>   MATCH (p:Person)-[r:STUDY_AT]->(u:University)
>   RETURN p.name, r.since, u.name
> """)
>
> // algorithms
> val ranks = graph.pageRank.run()
> ~~~
>
> Best,
> Xiangrui
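[Editor's note: the Cypher pattern in the quoted sample code maps directly onto plain DataFrame joins, which is the core idea of the SPIP. Below is a minimal relational sketch of what `MATCH (p:Person)-[r:STUDY_AT]->(u:University)` computes, using pandas DataFrames as lightweight stand-ins for Spark DataFrames. The node/relationship table layouts (`id`, `src`, `dst` columns) and the sample rows are assumptions for illustration, not the SPIP's actual schema.]

```python
import pandas as pd

# Node DataFrames: one table per label, with an id and properties.
persons = pd.DataFrame({"id": [0, 1], "name": ["Alice", "Bob"]})
universities = pd.DataFrame({"id": [10], "name": ["MIT"]})

# Relationship DataFrame: source id, destination id, and properties.
study_at = pd.DataFrame({"src": [0], "dst": [10], "since": [2015]})

# MATCH (p:Person)-[r:STUDY_AT]->(u:University)
# RETURN p.name, r.since, u.name
# ...expressed as two joins over the tables above.
result = (study_at
          .merge(persons, left_on="src", right_on="id")
          .merge(universities, left_on="dst", right_on="id",
                 suffixes=("_p", "_u"))
          [["name_p", "since", "name_u"]])
```

The pattern match becomes a join of the relationship table against each node table; a distributed Cypher engine on Spark would plan essentially the same joins over partitioned DataFrames.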
Re: SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms
I support the proposal. I am assisting various companies as a system integrator of Neo4j. Several Japanese telecommunications companies use it to quickly grasp the state of their network topology. In addition, many advertising companies associate enormous amounts of metadata and use it for marketing with Neo4j and Cypher queries. These use cases benefit from Cypher's flexible search and extraction mechanisms. Many manufacturing companies in Japan are also interested in the graph database model.

We believe that support for Cypher queries in Apache Spark can give Japanese graph data users a more convenient path to distributed processing. In Japan, communities that love Neo4j and Cypher are already active and communicating frequently ( https://jp-neo4j-usersgroup.connpass.com/ ). With Cypher query support in Apache Spark, they will be encouraged and will love Apache Spark. We are convinced that the Apache Spark developer community will expand further.

Regards,
--
Mitsutoshi Kiuchi
Re: Spark on Yarn, is it possible to manually blacklist nodes before running spark job?
Thanks, so I'll check YARN. Does anyone know if Spark-on-YARN plans to expose such functionality?

On Sat, Jan 19, 2019 at 18:04, Felix Cheung wrote:
> To clarify, YARN actually supports excluding nodes right when requesting
> resources. It's Spark that doesn't provide a way to populate such a
> blacklist.
>
> If you can change the YARN config, the equivalent is node labels:
> https://hadoop.apache.org/docs/r2.7.4/hadoop-yarn/hadoop-yarn-site/NodeLabel.html
>
> ------------------------------
> From: Li Gao
> Sent: Saturday, January 19, 2019 8:43 AM
> To: Felix Cheung
> Cc: Serega Sheypak; user
> Subject: Re: Spark on Yarn, is it possible to manually blacklist nodes before running spark job?
>
> On YARN it is impossible AFAIK. On Kubernetes you can use taints to keep
> certain nodes outside of Spark.
>
> On Fri, Jan 18, 2019 at 9:35 PM Felix Cheung wrote:
>
>> Not as far as I recall...
>>
>> ------------------------------
>> From: Serega Sheypak
>> Sent: Friday, January 18, 2019 3:21 PM
>> To: user
>> Subject: Spark on Yarn, is it possible to manually blacklist nodes before running spark job?
>>
>> Hi, is there any possibility to tell the scheduler to blacklist specific
>> nodes in advance?
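[Editor's note: a sketch of the node-label approach mentioned in this thread. The `yarn rmadmin` subcommands and the `spark.yarn.*.nodeLabelExpression` configs are real, but this assumes node labels are already enabled in yarn-site.xml, and the label name, hostnames, class, and jar below are placeholders.]

```shell
# 1. Register a label and attach it to the known-good nodes.
#    ("healthy" and the hostnames are placeholders.)
yarn rmadmin -addToClusterNodeLabels "healthy"
yarn rmadmin -replaceLabelsOnNode "node1.example.com=healthy node2.example.com=healthy"

# 2. Constrain the Spark AM and executors to the labeled nodes,
#    which keeps the job off any node you left unlabeled.
spark-submit \
  --master yarn \
  --conf spark.yarn.am.nodeLabelExpression=healthy \
  --conf spark.yarn.executor.nodeLabelExpression=healthy \
  --class com.example.MyApp \
  my-app.jar   # class and jar are placeholders
```

This inverts the requested "blacklist" into a whitelist: instead of excluding bad nodes at submit time, you label the good ones and restrict the application to them.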