It's probably just an indication of lack of interest (or at least there
isn't a substantial overlap between Ignite users and Spark users). A new
catalog implementation is also pretty fundamental to Spark and the bar for
that would be pretty high. See my comment in SPARK-17767.
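
(For context on why it currently isn't pluggable: Spark 2.x hard-codes the
catalog choice in SharedState. The snippet below is a simplified paraphrase,
not the actual source; the point is that spark.sql.catalogImplementation only
recognizes the two built-in implementations:)

    import org.apache.spark.SparkConf

    // Simplified paraphrase of Spark 2.x catalog resolution: the config
    // value maps to exactly two built-in classes, so a third-party catalog
    // cannot be plugged in through configuration alone.
    object CatalogResolutionSketch {
      def externalCatalogClassName(conf: SparkConf): String =
        conf.get("spark.sql.catalogImplementation", "in-memory") match {
          case "hive"      => "org.apache.spark.sql.hive.HiveExternalCatalog"
          case "in-memory" => "org.apache.spark.sql.catalyst.catalog.InMemoryCatalog"
        }
    }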

Guys - while I think this is very useful to do, I'm going to mark this as
"later" for now. The reason is that there are a lot of things to consider
before making this switch, including:

   - The ExternalCatalog API is currently internal, and we can't just make
   it public without thinking about the consequences and whether this API is
   maintainable in the long run.
   - SPARK-15777 <https://issues.apache.org/jira/browse/SPARK-15777> We
   need to design this in the context of catalog federation and persistence.
   - SPARK-15691 <https://issues.apache.org/jira/browse/SPARK-15691>
   Refactoring of how we integrate with Hive.

This is not as simple as just submitting a PR to make it pluggable.
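
To make the scope concrete, below is a rough sketch of the kind of interface
that would have to be frozen as public API. Everything in it is hypothetical:
MiniExternalCatalog is a drastically simplified stand-in for the internal
ExternalCatalog (which has dozens of methods covering databases, tables,
partitions and functions), and the mutable map in IgniteExternalCatalog is a
placeholder for real Ignite client calls.

    import scala.collection.mutable

    // Hypothetical, heavily simplified stand-in for Spark's internal
    // ExternalCatalog API; the real abstract class is far larger.
    trait MiniExternalCatalog {
      def tableExists(db: String, table: String): Boolean
      def listTables(db: String): Seq[String]
      def createTable(db: String, table: String, schemaDdl: String): Unit
    }

    // Sketch of an Ignite-backed implementation. A real one would delegate
    // to an Ignite client; a mutable map stands in for the cluster here.
    class IgniteExternalCatalog extends MiniExternalCatalog {
      private val tables = mutable.Map.empty[(String, String), String]

      override def tableExists(db: String, table: String): Boolean =
        tables.contains((db, table))

      override def listTables(db: String): Seq[String] =
        tables.keys.collect { case (`db`, t) => t }.toSeq.sorted

      override def createTable(db: String, table: String, schemaDdl: String): Unit =
        tables((db, table)) = schemaDdl
    }

Committing to something like this as a public interface is exactly the
long-term maintainability question raised in the first bullet above.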

On Mon, Sep 25, 2017 at 10:50 AM, Николай Ижиков <nizhikov....@gmail.com>
wrote:

> Guys.
>
> Did I miss something, or did I write a completely wrong mail?
> Can you give me some feedback?
> What have I missed in my proposal?
>
> 2017-09-19 15:39 GMT+03:00 Nikolay Izhikov <nizhikov....@gmail.com>:
>
>> Guys,
>>
>> Anyone had a chance to look at my message?
>>
>> 15.09.2017 15:50, Nikolay Izhikov wrote:
>>
>>> Hello, guys.
>>>
>>> I'm a contributor to the Apache Ignite project, which describes itself
>>> as an in-memory computing platform.
>>>
>>> It provides Data Grid features: a distributed, transactional key-value
>>> store [1], distributed SQL support [2], etc. [3]
>>>
>>> Currently, I'm working on the integration between Ignite and Spark [4].
>>> I want to add support for the Spark Data Frame API to Ignite.
>>>
>>> Since Ignite is a distributed store, it would be useful to create an
>>> implementation of the Catalog [5] for Apache Ignite.
>>>
>>> I see two ways to implement this feature:
>>>
>>>      1. Spark could provide an API for custom catalog implementations.
>>> As far as I can see, there is a ticket for it [6], but it was closed with
>>> resolution “Later”. Is now a suitable time to continue working on that
>>> ticket? How can I help with it?
>>>
>>>      2. I can provide an implementation of the Catalog and the other
>>> required APIs as a pull request to Spark, as was done for Hive [7].
>>> Would such a pull request be acceptable?
>>>
>>> Which way is more convenient for the Spark community?
>>>
>>> [1] https://ignite.apache.org/features/datagrid.html
>>> [2] https://ignite.apache.org/features/sql.html
>>> [3] https://ignite.apache.org/features.html
>>> [4] https://issues.apache.org/jira/browse/IGNITE-3084
>>> [5] https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala
>>> [6] https://issues.apache.org/jira/browse/SPARK-17767
>>> [7] https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala
>>>
>>
>
>
> --
> Nikolay Izhikov
> nizhikov....@gmail.com
>
