Thank you, this is great news.

We currently use the Ignite cache as a reference dataset RDD in Spark, convert
it into a Spark DataFrame, and then join this DF with the incoming-data DF. I
hope we can change this three-step process to a single step with the Spark DF
integration. If so, would indexes / affinity keys on the join columns help with
performance? We currently do not have them defined on the reference dataset.
Are there examples available of joining an Ignite DF with a Spark DF? Also,
what is the best way to get the latest executables with IGNITE-3084 included?
Thanks again.
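
If the merged data source works the way Val describes below, our current
cache -> RDD -> DataFrame -> join pipeline should collapse into a single
read-and-join. A minimal sketch of what I have in mind, assuming the data
source format name "ignite" and the option keys "table" and "config" from
the IGNITE-3084 patch (the exact constants should be checked against the
merged sources, e.g. IgniteDataFrameSettings), and using a hypothetical
REFERENCE_DATA table, REF_KEY join column and client config file:

import org.apache.spark.sql.SparkSession

object IgniteDataFrameJoinSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ignite-df-join")
      .master("local[*]")
      .getOrCreate()

    // Read the reference data straight from Ignite as a DataFrame.
    // Format/option names follow the IGNITE-3084 patch as far as I can
    // tell; the table name and config file below are placeholders.
    val referenceDf = spark.read
      .format("ignite")
      .option("table", "REFERENCE_DATA")
      .option("config", "ignite-client-config.xml")
      .load()

    // Incoming data from any other source, e.g. a Parquet drop.
    val incomingDf = spark.read.parquet("/data/incoming")

    // Single-step join replacing cache -> RDD -> DataFrame -> join.
    // REF_KEY is a placeholder join column present in both frames.
    val joined = incomingDf.join(referenceDf, Seq("REF_KEY"))

    joined.show()
    spark.stop()
  }
}

If I read [2] below correctly, a join like this still runs on the Spark side
for now, and only the planned custom execution strategy would keep
Ignite-only queries inside Ignite.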


On 12/29/17, 10:34 PM, "Nikolay Izhikov" <nizhikov....@gmail.com> wrote:

    Thank you, guys.
    
    Val, thanks for all the reviews, advice, and patience.
    
    Anton, thanks for the Ignite wisdom you share with me.
    
    Looking forward to the next issues :)
    
    P.S. Happy New Year to the whole Ignite community!
    
    On Fri, 29/12/2017 at 13:22 -0800, Valentin Kulichenko wrote:
    > Igniters,
    > 
    > Great news! We completed and merged the first part of the integration
    > with Spark data frames [1]. It contains an implementation of a Spark
    > data source which allows using the DataFrame API to query Ignite data,
    > as well as joining it with other data frames originating from
    > different sources.
    > 
    > The next planned steps are the following:
    > - Implement a custom execution strategy to avoid transferring data
    > from Ignite to Spark when possible [2]. This should give a serious
    > performance improvement in cases where only Ignite tables participate
    > in a query.
    > - Implement the ability to save a data frame into Ignite via the
    > DataFrameWriter API [3].
    > 
    > [1] https://issues.apache.org/jira/browse/IGNITE-3084
    > [2] https://issues.apache.org/jira/browse/IGNITE-7077
    > [3] https://issues.apache.org/jira/browse/IGNITE-7337
    > 
    > Nikolay Izhikov, thanks for the contribution and for all the hard
    > work!
    > 
    > -Val
    
