Re: SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms

2019-01-20 Thread Gourav Sengupta
Hi,

this is fantastic and it will be great to have. Another place where we
could use graph frames is data lineage: if dependency information from
Catalyst could be sent somewhere and stored as a graph of dependencies, I
would expect graph frames to be adopted almost universally for that purpose.
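As a rough illustration only (the dataset names, columns and transformations
below are entirely made up), such lineage could be captured as a pair of
vertex and edge DataFrames that a DataFrame-based graph component might
consume:

~~~
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: lineage as a property graph backed by two DataFrames.
val spark = SparkSession.builder().appName("lineage-sketch").getOrCreate()
import spark.implicits._

// Nodes: one row per dataset involved in a pipeline.
val nodes = Seq(
  ("raw_events", "table"),
  ("cleaned_events", "view"),
  ("daily_report", "table")
).toDF("id", "kind")

// Edges: src was derived from dst via the given transformation.
val edges = Seq(
  ("cleaned_events", "raw_events", "filter"),
  ("daily_report", "cleaned_events", "aggregate")
).toDF("src", "dst", "transformation")
~~~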

If you are including data lineage as well, please do let me know; I would
love to be a part of the testing.

Regards,
Gourav Sengupta

On Tue, Jan 15, 2019 at 4:53 PM Xiangrui Meng  wrote:

> Hi all,
>
> I want to re-send the previous SPIP on introducing a DataFrame-based graph
> component to collect more feedback. It supports property graphs, Cypher
> graph queries, and graph algorithms built on top of the DataFrame API. If
> you are a GraphX user or your workload is essentially graph queries, please
> help review and check how it fits into your use cases. Your feedback would
> be greatly appreciated!
>
> # Links to SPIP and design sketch:
>
> * Jira issue for the SPIP:
> https://issues.apache.org/jira/browse/SPARK-25994
> * Google Doc:
> https://docs.google.com/document/d/1ljqVsAh2wxTZS8XqwDQgRT6i_mania3ffYSYpEgLx9k/edit?usp=sharing
> * Jira issue for a first design sketch:
> https://issues.apache.org/jira/browse/SPARK-26028
> * Google Doc:
> https://docs.google.com/document/d/1Wxzghj0PvpOVu7XD1iA8uonRYhexwn18utdcTxtkxlI/edit?usp=sharing
>
> # Sample code:
>
> ~~~
> val graph = ...
>
> // query
> val result = graph.cypher("""
>   MATCH (p:Person)-[r:STUDY_AT]->(u:University)
>   RETURN p.name, r.since, u.name
> """)
>
> // algorithms
> val ranks = graph.pageRank.run()
> ~~~
>
> Best,
> Xiangrui
>
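A rough approximation of the example above is already possible with the
existing GraphFrames package; here is a sketch with made-up vertex and edge
data, where motif finding stands in loosely for the Cypher query (which
GraphFrames itself does not support):

~~~
import org.apache.spark.sql.SparkSession
import org.graphframes.GraphFrame

// Sketch using the existing GraphFrames package; all data is made up.
val spark = SparkSession.builder().appName("graphframes-sketch").getOrCreate()
import spark.implicits._

val vertices = Seq(
  ("alice", "Person"),
  ("mit", "University")
).toDF("id", "label")

val edges = Seq(
  ("alice", "mit", "STUDY_AT", 2015)
).toDF("src", "dst", "relationship", "since")

val g = GraphFrame(vertices, edges)

// Loose analogue of the Cypher pattern: motif finding plus a filter.
val studies = g.find("(p)-[r]->(u)")
  .filter("r.relationship = 'STUDY_AT' AND u.label = 'University'")

// Counterpart of graph.pageRank.run() in the proposal.
val ranks = g.pageRank.resetProbability(0.15).maxIter(10).run()
~~~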


Re: SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms

2019-01-20 Thread 木内満歳
I support the proposal. I assist various companies as a Neo4j system
integrator. Several Japanese telecommunications companies use graphs to
quickly grasp the state of their network topology. In addition, many
advertising companies associate enormous amounts of metadata and use it for
marketing with Neo4j and Cypher queries; these activities benefit from
Cypher's flexible search and extraction mechanisms. Many manufacturing
companies in Japan are also interested in the graph database model. We
believe that Cypher query support in Apache Spark would give Japanese graph
data users a more convenient path to distributed processing.
In Japan, communities that love Neo4j and Cypher are already active and
communicate frequently ( https://jp-neo4j-usersgroup.connpass.com/ ).
With Cypher query support in Apache Spark, they will be encouraged and will
come to love Apache Spark as well. We are convinced that the Apache Spark
developer community will expand further.

Regards,

--
Mitsutoshi Kiuchi


On Wed, Jan 16, 2019 at 1:53 AM Xiangrui Meng wrote:

> Hi all,
>
> I want to re-send the previous SPIP on introducing a DataFrame-based graph
> component to collect more feedback. It supports property graphs, Cypher
> graph queries, and graph algorithms built on top of the DataFrame API. If
> you are a GraphX user or your workload is essentially graph queries, please
> help review and check how it fits into your use cases. Your feedback would
> be greatly appreciated!
>
> # Links to SPIP and design sketch:
>
> * Jira issue for the SPIP:
> https://issues.apache.org/jira/browse/SPARK-25994
> * Google Doc:
> https://docs.google.com/document/d/1ljqVsAh2wxTZS8XqwDQgRT6i_mania3ffYSYpEgLx9k/edit?usp=sharing
> * Jira issue for a first design sketch:
> https://issues.apache.org/jira/browse/SPARK-26028
> * Google Doc:
> https://docs.google.com/document/d/1Wxzghj0PvpOVu7XD1iA8uonRYhexwn18utdcTxtkxlI/edit?usp=sharing
>
> # Sample code:
>
> ~~~
> val graph = ...
>
> // query
> val result = graph.cypher("""
>   MATCH (p:Person)-[r:STUDY_AT]->(u:University)
>   RETURN p.name, r.since, u.name
> """)
>
> // algorithms
> val ranks = graph.pageRank.run()
> ~~~
>
> Best,
> Xiangrui
>


Re: Spark on Yarn, is it possible to manually blacklist nodes before running spark job?

2019-01-20 Thread Serega Sheypak
Thanks, so I'll check YARN.
Does anyone know if Spark-on-Yarn plans to expose such functionality?

On Sat, Jan 19, 2019 at 6:04 PM Felix Cheung wrote:

> To clarify, YARN actually supports excluding nodes right when requesting
> resources. It’s Spark that doesn’t provide a way to populate such a
> blacklist.
>
> If you can change the YARN config, the equivalent is node labels:
> https://hadoop.apache.org/docs/r2.7.4/hadoop-yarn/hadoop-yarn-site/NodeLabel.html
>
>
>
> --
> *From:* Li Gao 
> *Sent:* Saturday, January 19, 2019 8:43 AM
> *To:* Felix Cheung
> *Cc:* Serega Sheypak; user
> *Subject:* Re: Spark on Yarn, is it possible to manually blacklist nodes
> before running spark job?
>
> On YARN it is impossible, AFAIK. On Kubernetes you can use taints to keep
> certain nodes outside of Spark.
>
> On Fri, Jan 18, 2019 at 9:35 PM Felix Cheung 
> wrote:
>
>> Not as far as I recall...
>>
>>
>> --
>> *From:* Serega Sheypak 
>> *Sent:* Friday, January 18, 2019 3:21 PM
>> *To:* user
>> *Subject:* Spark on Yarn, is it possible to manually blacklist nodes
>> before running spark job?
>>
>> Hi, is there any possibility to tell the scheduler to blacklist specific
>> nodes in advance?
>>
>
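For reference, the node-label route Felix mentions comes down to two
Spark-on-YARN settings. Below is a minimal sketch, assuming the cluster
administrator has already defined a YARN node label (here called
"good_nodes") and assigned it to the acceptable nodes:

~~~
import org.apache.spark.sql.SparkSession

// Sketch only: pin the AM and executors to nodes carrying a YARN node label,
// which indirectly keeps the application off every other node. The label name
// "good_nodes" is an assumption; it must already exist in YARN.
val spark = SparkSession.builder()
  .appName("node-label-example")
  .config("spark.yarn.am.nodeLabelExpression", "good_nodes")
  .config("spark.yarn.executor.nodeLabelExpression", "good_nodes")
  .getOrCreate()

// In cluster mode, pass the same settings as --conf options to spark-submit
// instead, since the driver itself runs inside the application master.
~~~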