[jira] [Commented] (IGNITE-3084) Spark Data Frames Support in Apache Ignite

Valentin Kulichenko (JIRA) Tue, 26 Dec 2017 17:08:34 -0800

    [ 
https://issues.apache.org/jira/browse/IGNITE-3084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16304114#comment-16304114
 ]


Valentin Kulichenko commented on IGNITE-3084:
---------------------------------------------

[~NIzhikov], this looks much better now, but there are a few points I want to 
revisit.
* {{onApplicationEnd}} method - makes sense. But it sounds like it should be at 
the {{IgniteContext}} level. What do you think?
* {{IgniteSQLRelation#calcPartitions}} - got it, but what will happen if 
topology changes? Will partitions be recalculated?
* {{IgniteCacheRelation}} - let's remove it for now and discuss it on dev@ as a 
separate task. If we come up with a good API for this task, we can then create 
a Jira ticket and implement it. That said, I feel there are more important 
tasks at the moment, like implementing a custom strategy for SQL execution.
* {{org.apache.spark.sql.ignite}} package. We currently have three classes 
there, and it looks like only {{IgniteSparkSession}} is supposed to be used in 
application code. Can we move it to {{org.apache.ignite.spark}} package and put 
it with all other public classes? {{IgniteExternalCatalog}} and 
{{IgniteSharedState}} can then remain in this weird package, as they are 
implementation only and not public.

> Spark Data Frames Support in Apache Ignite
> ------------------------------------------
>
>                 Key: IGNITE-3084
>                 URL: https://issues.apache.org/jira/browse/IGNITE-3084
>             Project: Ignite
>          Issue Type: Task
>          Components: spark
>    Affects Versions: 1.5.0.final
>            Reporter: Vladimir Ozerov
>            Assignee: Nikolay Izhikov
>            Priority: Critical
>              Labels: bigdata, important
>             Fix For: 2.4
>
>
> Apache Spark already benefits from integration with Apache Ignite. The latter 
> provides shared RDDs, an implementation of Spark RDD, that help Spark to 
> share a state between Spark workers and execute SQL queries much faster. The 
> next logical step is to enable support for modern Spark Data Frames API in a 
> similar way.
> As a contributor, you will be fully in charge of the integration of Spark 
> Data Frame API and Apache Ignite.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
