[jira] [Commented] (IGNITE-12054) Upgrade Spark module to 2.4

2019-09-26 Thread Alexey Zinoviev (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16938765#comment-16938765
 ] 

Alexey Zinoviev commented on IGNITE-12054:
--

Also, Spark fixed a bug with incorrect null handling on columns in a condition:

https://issues.apache.org/jira/browse/SPARK-21479

This requires fixes in IgniteOptimizationJoinSpec (the same thing happened 
during the previous migration from 2.2 to 2.3).

> Upgrade Spark module to 2.4
> ---------------------------
>
>                Key: IGNITE-12054
>                URL: https://issues.apache.org/jira/browse/IGNITE-12054
>            Project: Ignite
>         Issue Type: Task
>         Components: spark
>   Affects Versions: 2.7.5
>           Reporter: Denis A. Magda
>           Assignee: Alexey Zinoviev
>           Priority: Blocker
>            Fix For: 2.8
>
>         Time Spent: 10m
> Remaining Estimate: 0h
>
> Users can't use APIs that are already available in Spark 2.4:
> https://stackoverflow.com/questions/57392143/persisting-spark-dataframe-to-ignite
> Let's upgrade Spark from 2.3 to 2.4 until we extract the Spark Integration as 
> a separate module that can support multiple Spark versions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (IGNITE-12054) Upgrade Spark module to 2.4

2019-09-25 Thread Aleksey Zinoviev (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937737#comment-16937737
 ] 

Aleksey Zinoviev commented on IGNITE-12054:
---

I've added a PR with a version that compiles (the previous issue was resolved):

[https://github.com/apache/ignite/pull/6909]

But a few examples and tests are broken.



[jira] [Commented] (IGNITE-12054) Upgrade Spark module to 2.4

2019-09-25 Thread Aleksey Zinoviev (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937715#comment-16937715
 ] 

Aleksey Zinoviev commented on IGNITE-12054:
---

ExternalCatalog was refactored, and all listener properties were moved to 
ExternalCatalogWithListener.

Nobody on GitHub has inherited from this class yet; the known implementations 
are HiveExternalCatalog and MemoryExternalCatalog (neither of them supports 
listeners and events).
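
To make the shape of such a refactoring concrete, here is a minimal Java sketch 
of a wrapper that adds listener support around a plain catalog. It assumes the 
listener support is layered on top of a delegate; every name in it (Catalog, 
SimpleCatalog, CatalogWithListener, the event strings) is hypothetical and is 
not Spark's actual class or signature.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical stand-in for a catalog API; not Spark's ExternalCatalog.
interface Catalog {
    void createTable(String name);

    boolean tableExists(String name);
}

// Plain implementation that knows nothing about listeners or events.
class SimpleCatalog implements Catalog {
    private final Set<String> tables = new HashSet<>();

    @Override public void createTable(String name) {
        tables.add(name);
    }

    @Override public boolean tableExists(String name) {
        return tables.contains(name);
    }
}

// Wrapper that owns the listener list and posts an event after delegating.
class CatalogWithListener implements Catalog {
    private final Catalog delegate;
    private final List<Consumer<String>> listeners = new ArrayList<>();

    CatalogWithListener(Catalog delegate) {
        this.delegate = delegate;
    }

    void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    @Override public void createTable(String name) {
        delegate.createTable(name);

        // Notify every registered listener once the delegate has succeeded.
        for (Consumer<String> listener : listeners)
            listener.accept("CreateTable:" + name);
    }

    @Override public boolean tableExists(String name) {
        // Read-only calls pass straight through without producing events.
        return delegate.tableExists(name);
    }
}
```

With this layout, code that extended the plain catalog and relied on inherited 
listener members would have to be reworked, because those members now belong 
to the wrapper rather than to the base type.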

 

Also, people in Spark ML couldn't solve the same problem

[http://mail-archives.apache.org/mod_mbox/spark-issues/201812.mbox/%3cjira.13144856.1520975543000.147283.1544598241...@atlassian.jira%3E]

> Upgrade Spark module to 2.4
> ---------------------------
>
>                Key: IGNITE-12054
>                URL: https://issues.apache.org/jira/browse/IGNITE-12054
>            Project: Ignite
>         Issue Type: Task
>         Components: spark
>   Affects Versions: 2.7.5
>           Reporter: Denis Magda
>           Assignee: Nikolay Izhikov
>           Priority: Blocker
>            Fix For: 2.8
>
>
> Users can't use APIs that are already available in Spark 2.4:
> https://stackoverflow.com/questions/57392143/persisting-spark-dataframe-to-ignite
> Let's upgrade Spark from 2.3 to 2.4 until we extract the Spark Integration as 
> a separate module that can support multiple Spark versions.





[jira] [Commented] (IGNITE-12054) Upgrade Spark module to 2.4

2019-09-24 Thread Denis Magda (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16937242#comment-16937242
 ] 

Denis Magda commented on IGNITE-12054:
--

[~NIzhikov], let's follow your original plan by producing 2 new modules. This 
modularization story will take a while. 



[jira] [Commented] (IGNITE-12054) Upgrade Spark module to 2.4

2019-09-12 Thread Nikolay Izhikov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928509#comment-16928509
 ] 

Nikolay Izhikov commented on IGNITE-12054:
--

OK, let's try it.

[~zaleslaw] Can you start the discussion on the dev list?



[jira] [Commented] (IGNITE-12054) Upgrade Spark module to 2.4

2019-09-11 Thread Denis Magda (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16928005#comment-16928005
 ] 

Denis Magda commented on IGNITE-12054:
--

[~NIzhikov], [~zaleslaw], how about creating the first separate Ignite module 
for Spark and releasing it independently? The module will be in its own 
repository, and there might be different branches for different versions. 
The Ignite community will support and release only the latest version of the 
Spark integration, while users can always build an old version from one of the 
branches.

More details are recorded on this page:
https://cwiki.apache.org/confluence/display/IGNITE/IEP-36%3A+Modularization

If the three of us agree, then we can restart the discussion on the dev list, 
have a call with [~agoncharuk] and other community members, and get down to 
the implementation. 



[jira] [Commented] (IGNITE-12054) Upgrade Spark module to 2.4

2019-09-11 Thread Nikolay Izhikov (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16927325#comment-16927325
 ] 

Nikolay Izhikov commented on IGNITE-12054:
--

Hello, [~zaleslaw]

The main issue with upgrading to 2.4 and supporting both 2.3 and 2.4 is the 
changes to Spark's internal API that were introduced in version 2.4.
As you may know, the Ignite-Spark integration uses some internal API 
({{ExternalCatalog}}) and relies on other close-to-internal parts of Spark 
(the query parser and optimizer). That code was changed in 2.4, so we can't 
simply change the version in pom.xml. We need to fix our code for the new 
Spark behaviour.

I tried to do so but failed to do it quickly.
If you want to take care of this upgrade, please do.

As for supporting both versions, it seems we need separate modules or 
something similar to do so.

[~dmagda] In the past, Ignite had modules of this kind that supported two 
different versions of an external library.
Should we do something similar and create spark_23 and spark_24 modules with 
very similar (but still different) codebases?



[jira] [Commented] (IGNITE-12054) Upgrade Spark module to 2.4

2019-09-10 Thread Aleksey Zinoviev (Jira)


[ 
https://issues.apache.org/jira/browse/IGNITE-12054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16926751#comment-16926751
 ] 

Aleksey Zinoviev commented on IGNITE-12054:
---

[~dmagda] [~NIzhikov] Let's discuss how to support both versions, 2.3 and 
2.4 (because a lot of people use both now). Maybe we could support both 
versions internally, selecting between them with an external parameter 
(SPARK_VERSION). I could try to investigate this feature if nobody starts the 
work in the near future.
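
One way to picture the SPARK_VERSION idea is a small factory that picks a 
version-specific adapter at startup. The Java sketch below is only an 
illustration under assumptions: SparkCompat, Spark23Compat, Spark24Compat and 
SparkCompatFactory are hypothetical names, and the version is passed in as a 
plain string rather than read from any particular source.

```java
// Hypothetical version-specific adapter; an integration would hide the
// differences in Spark's internal API behind an interface like this.
interface SparkCompat {
    String describe();
}

class Spark23Compat implements SparkCompat {
    @Override public String describe() {
        return "adapter for Spark 2.3";
    }
}

class Spark24Compat implements SparkCompat {
    @Override public String describe() {
        return "adapter for Spark 2.4";
    }
}

final class SparkCompatFactory {
    private SparkCompatFactory() {
        // Static factory only.
    }

    // Accepts a full version such as "2.4.4" and dispatches on "major.minor".
    static SparkCompat forVersion(String sparkVersion) {
        String[] parts = sparkVersion.trim().split("\\.");

        if (parts.length < 2)
            throw new IllegalArgumentException("Unrecognized Spark version: " + sparkVersion);

        switch (parts[0] + "." + parts[1]) {
            case "2.3":
                return new Spark23Compat();
            case "2.4":
                return new Spark24Compat();
            default:
                throw new IllegalArgumentException("Unsupported Spark version: " + sparkVersion);
        }
    }
}
```

A caller would write something like SparkCompatFactory.forVersion("2.4.4"). 
Each adapter would presumably still have to be compiled against its own Spark 
version, so a switch like this selects an implementation but does not by 
itself settle how the two code paths are built.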



[jira] [Commented] (IGNITE-12054) Upgrade Spark module to 2.4

2019-08-08 Thread Denis Magda (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-12054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16903210#comment-16903210
 ] 

Denis Magda commented on IGNITE-12054:
--

Raised to BLOCKER because our DataFrames integration doesn't work with the 
latest Spark version.

> Upgrade Spark module to 2.4
> ---------------------------
>
>                Key: IGNITE-12054
>                URL: https://issues.apache.org/jira/browse/IGNITE-12054
>            Project: Ignite
>         Issue Type: Task
>         Components: spark
>   Affects Versions: 2.7.5
>           Reporter: Denis Magda
>           Priority: Blocker
>            Fix For: 2.7.6
>
>
> Users can't use APIs that are already available in Spark 2.4:
> https://stackoverflow.com/questions/57392143/persisting-spark-dataframe-to-ignite
> Let's upgrade Spark from 2.3 to 2.4 until we extract the Spark Integration as 
> a separate module that can support multiple Spark versions.


