Re: Removal of ignite ml module (or moving it to extensions)

2023-08-16 Thread Aleksei Zinovev
Hi, as the module maintainer I object to a fast merge (not to the move
itself).

I have never used Ignite extensions, so I need time to become familiar with
them and to test the PR.

Please postpone it until 10 September.

I don't understand the reasons for doing it so fast. I suppose it's OK to
wait 15-20 days with the PR.

Thanks for the collaboration and for doing this work.


Re: Removal of ignite ml module (or moving it to extensions)

2023-08-16 Thread Ivan Daschinsky
The ML Extensions suite is ready and works: all tests from the main module
and the parsers, and all examples, pass and are green [1].
The green visa has been obtained, so I am going to merge it tomorrow if
there are no objections.



[1] ---
https://ci.ignite.apache.org/buildConfiguration/IgniteExtensions_Tests_Ml/7438920?hideProblemsFromDependencies=false=false


Re: TX code cleanup (MVCC removal)

2023-08-16 Thread Ivan Daschinsky
The plan looks good to me. Some of the tests are in the ODBC test suite, so
I can help if needed.

On Wed, Aug 16, 2023 at 16:32, Anton Vinogradov wrote:

> Igniters,
>
> I started the TX code cleanup [1] last month and have almost finished with
> the obvious garbage.
> Now, having started the code deduplication, I have run into excessive code
> complexity caused by the unfinished MVCC.
>
> The community agreed to remove MVCC, but the initial attempt [2] was not
> successful because it was impossible to get rid of 20k+ lines of code at
> once.
> So, my proposal is to remove it step by step.
>
> 1) MVCC tests should be removed from the project
> 2) MVCC-related code should be removed from the project in reasonably sized
> commits, checking that each one does not affect the existing tests.
>
> I'm ready to perform the removal.
>
> Any objections/tips?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-19844
> [2] https://issues.apache.org/jira/browse/IGNITE-13871
>


-- 
Sincerely yours, Ivan Daschinskiy


TX code cleanup (MVCC removal)

2023-08-16 Thread Anton Vinogradov
Igniters,

I started the TX code cleanup [1] last month and have almost finished with
the obvious garbage.
Now, having started the code deduplication, I have run into excessive code
complexity caused by the unfinished MVCC.

The community agreed to remove MVCC, but the initial attempt [2] was not
successful because it was impossible to get rid of 20k+ lines of code at
once.
So, my proposal is to remove it step by step.

1) MVCC tests should be removed from the project
2) MVCC-related code should be removed from the project in reasonably sized
commits, checking that each one does not affect the existing tests.

I'm ready to perform the removal.

Any objections/tips?

[1] https://issues.apache.org/jira/browse/IGNITE-19844
[2] https://issues.apache.org/jira/browse/IGNITE-13871


Re: Removal of ignite ml module (or moving it to extensions)

2023-08-16 Thread Ivan Daschinsky
>> https://issues.apache.org/jira/browse/IGNITE-20216
Also, I've updated the dependencies and fixed the BLAS issue (tested with
Intel MKL BLAS on Ubuntu 22.04).

On Wed, Aug 16, 2023 at 12:11, Ivan Daschinsky wrote:

> I've filed a ticket and created 2 PRs. After tuning of TC I'm going to
> merge both of them, if nobody disagrees with it.
>
> https://issues.apache.org/jira/browse/IGNITE-20216
>
>
>
> On Mon, Aug 14, 2023 at 22:29, Ivan Daschinsky wrote:
>
>> >> * com.github.fommil.netlib:core:1.1.2 - not developed and archived
>> since 2017. Last version released in 2013 [2]
>> Moreover, this version is so outdated, and its JNI extension was built so
>> strangely (linked to libgfortran3, for example), that native BLAS simply
>> doesn't work.
>> The fallback option (f2jBLAS, which is also outdated) is always used.
>> There is a modern option, https://github.com/luhenry/netlib, which is
>> used in Spark MLlib.
>>
>> I have run all tests successfully with it (with a few lines changed, of
>> course) using native BLAS (libopenblas on Ubuntu 22.04).
>>
>> So it is possible to state that nobody has run it on native BLAS, and I
>> have some concerns about the existence of production-like installations
>> with the Ignite ML module.
>>
>> On Fri, Aug 11, 2023 at 13:14, Николай Ижиков wrote:
>>
>>> A few cents to let you know how abandoned the ML module is.
>>>
>>> 1. Last valuable commit December 9, 2020 -
>>> https://github.com/apache/ignite/commit/04f6a33851d9f7bd269a09fdc2c74485b1e01a8a
>>>
>>> 2. Dependencies and current versions of them:
>>>
>>>   * com.dropbox.core:dropbox-core-sdk:2.1.1 current version is - 5.4.5
>>> [1]
>>>   * com.github.fommil.netlib:core:1.1.2 - not developed and archived
>>> since 2017. Last version released in 2013 [2]
>>>   * org.apache.commons:commons-rng-core:1.0 current version is 1.5 [3]
>>>   * com.zaxxer:SparseBitSet:1.0 current version is 1.2 [4]
>>>   * ai.catboost:catboost-prediction:0.24 current version is 1.2 [5]
>>>   * ai.h2o:h2o-genmodel:3.26.0.8 current version is 3.42.0.2 [6]
>>>
>>> The ML community has made a huge step forward since 2020,
>>> so I doubt the ML features and tool integrations work as expected nowadays.
>>> Ignite features of this type (abandoned or only partially supported) have
>>> to be in extensions.
>>>
>>> [1] https://github.com/dropbox/dropbox-sdk-java
>>> [2] https://github.com/fommil/netlib-java
>>> [3] https://commons.apache.org/proper/commons-rng/commons-rng-core/
>>> [4] https://github.com/brettwooldridge/SparseBitSet
>>> [5] https://mvnrepository.com/artifact/ai.catboost/catboost-prediction
>>> [6] https://mvnrepository.com/artifact/ai.h2o/h2o-genmodel
>>>
>>>
>>>
>>> > On Aug 11, 2023, at 12:19, Kseniya Romanova wrote:
>>> >
>>> > As far as I know, the integration was removed from the Tensorflow side.
>>> >
>>> > On Thu, Aug 10, 2023 at 2:04 PM Andrey Mashenkov <
>>> andrey.mashen...@gmail.com>
>>> > wrote:
>>> >
>>> >> Ivan,
>>> >>
>>> >>> Actually, I haven't found any integration with tensorflow in AI code.
>>> >>
>>> >> Ok. You are right.
>>> >> Tensorflow is mentioned in docs: docs/_docs/setup.adoc.
>>> >>
>>> >> Adapters may require compile-time dependencies, but these dependencies
>>> >> shouldn't be part of the release package,
>>> >> regardless of whether the ML module is part of Ignite or extensions.
>>> >> WDYT?
>>> >>
>>> >> On Thu, Aug 10, 2023 at 1:36 PM Ivan Daschinsky 
>>> >> wrote:
>>> >>
>>> >>> Actually, I haven't found any integration with tensorflow in AI code.
>>> >>> Actually, all integrations are adapters that allow loading pretrained
>>> >>> models (h2o, catboost etc.)
>>> >>>
>>> >>> On Thu, Aug 10, 2023 at 13:08, Ivan Daschinsky wrote:
>>> >>>
>>> >>>> I am personally for moving to extensions. Alex has already mentioned all
>>> >>>> the reasons why it should be done and all of them are quite important.
>>> >>>> The module seems to be quite independent and there is no problem moving
>>> >>>> it to ignite-extensions.
>>> >>>> So I am +1 for moving to ignite-extensions.
>>> >>>>
>>> >>>>
>>> >>>> On Thu, Aug 10, 2023 at 12:45, Kseniya Romanova <ksroman...@apache.org> wrote:
>>> 
>>> >>
>>> >> do you know anyone who uses it?
>>> >
>>> > I know some teams that do. At the last Ignite Summit we had a talk
>>> > featuring the ML module (from the Groovy community).
>>> > Anyway, we need the module maintainer's opinion here.
>>> >  + Alex
>>> >
>>> > On Wed, Aug 9, 2023 at 3:38 PM Andrey Mashenkov <
>>> > andrey.mashen...@gmail.com>
>>> > wrote:
>>> >
>>> >> -1 for removal.
>>> >> 0 for relocation
>>> >>
>>> >> imho, TC resources and module size aren't good arguments for
>>> >> removal/moving.
>>> >> ML tests could be run nightly.
>>> >> The ML module contains a few integrations (with TensorFlow and others);
>>> >> these optional integrations are weighty and could be moved to an extension,
>>> >> but the core functionality can still be left untouched if it is highly coupled
>>> >> with 

Re: Removal of ignite ml module (or moving it to extensions)

2023-08-16 Thread Aleksei Zinovev
Hi, as a PMC member and the maintainer of this module:

-1 for removal
+1 for moving to an extension, if it stays compatible with Ignite and
can be compiled separately from the other extension modules

Some facts:

   - nobody has updated it for the last 3 years, that's true
   - classic ML algorithms have not changed in the last 3 years, just like
   CSV parsing or JDBC (we never supported DL as part of the module, it's
   not a goal; Random Forest has not changed in the last 20 years)
   - the TensorFlow integration was removed 3 years ago
   - some people contacted me a few weeks ago asking to urgently fix or
   develop some features in Ignite ML, but I have no time to do it urgently
   - I met some companies that used IgniteML in 2021 and 2022, including at
   my job interview :)
   - I agree about the BLAS issue; it would be great if somebody could update
   it, and again I could help with testing


I could help review the GitHub PR that moves the module to an extension;
please assign it to me (@zaleslaw). I am on vacation now, but could do it
in September.


Re: Removal of ignite ml module (or moving it to extensions)

2023-08-16 Thread Ivan Daschinsky
I've filed a ticket and created 2 PRs. After tuning of TC I'm going to
merge both of them, if nobody disagrees with it.

https://issues.apache.org/jira/browse/IGNITE-20216



On Mon, Aug 14, 2023 at 22:29, Ivan Daschinsky wrote:

> >> * com.github.fommil.netlib:core:1.1.2 - not developed and archived
> since 2017. Last version released in 2013 [2]
> Moreover, this version is so outdated, and its JNI extension was built so
> strangely (linked to libgfortran3, for example), that native BLAS simply
> doesn't work.
> The fallback option (f2jBLAS, which is also outdated) is always used.
> There is a modern option, https://github.com/luhenry/netlib, which is used
> in Spark MLlib.
>
> I have run all tests successfully with it (with a few lines changed, of
> course) using native BLAS (libopenblas on Ubuntu 22.04).
>
> So it is possible to state that nobody has run it on native BLAS, and I
> have some concerns about the existence of production-like installations
> with the Ignite ML module.
>
> On Fri, Aug 11, 2023 at 13:14, Николай Ижиков wrote:
>
>> A few cents to let you know how abandoned the ML module is.
>>
>> 1. Last valuable commit December 9, 2020 -
>> https://github.com/apache/ignite/commit/04f6a33851d9f7bd269a09fdc2c74485b1e01a8a
>>
>> 2. Dependencies and current versions of them:
>>
>>   * com.dropbox.core:dropbox-core-sdk:2.1.1 current version is - 5.4.5 [1]
>>   * com.github.fommil.netlib:core:1.1.2 - not developed and archived
>> since 2017. Last version released in 2013 [2]
>>   * org.apache.commons:commons-rng-core:1.0 current version is 1.5 [3]
>>   * com.zaxxer:SparseBitSet:1.0 current version is 1.2 [4]
>>   * ai.catboost:catboost-prediction:0.24 current version is 1.2 [5]
>>   * ai.h2o:h2o-genmodel:3.26.0.8 current version is 3.42.0.2 [6]
>>
>> The ML community has made a huge step forward since 2020,
>> so I doubt the ML features and tool integrations work as expected nowadays.
>> Ignite features of this type (abandoned or only partially supported) have
>> to be in extensions.
>>
>> [1] https://github.com/dropbox/dropbox-sdk-java
>> [2] https://github.com/fommil/netlib-java
>> [3] https://commons.apache.org/proper/commons-rng/commons-rng-core/
>> [4] https://github.com/brettwooldridge/SparseBitSet
>> [5] https://mvnrepository.com/artifact/ai.catboost/catboost-prediction
>> [6] https://mvnrepository.com/artifact/ai.h2o/h2o-genmodel
>>
>>
>>
>> > On Aug 11, 2023, at 12:19, Kseniya Romanova wrote:
>> >
>> > As far as I know, the integration was removed from the Tensorflow side.
>> >
>> > On Thu, Aug 10, 2023 at 2:04 PM Andrey Mashenkov <
>> andrey.mashen...@gmail.com>
>> > wrote:
>> >
>> >> Ivan,
>> >>
>> >>> Actually, I haven't found any integration with tensorflow in AI code.
>> >>
>> >> Ok. You are right.
>> >> Tensorflow is mentioned in docs: docs/_docs/setup.adoc.
>> >>
>> >> Adapters may require compile-time dependencies, but these dependencies
>> >> shouldn't be part of the release package,
>> >> regardless of whether the ML module is part of Ignite or extensions.
>> >> WDYT?
>> >>
>> >> On Thu, Aug 10, 2023 at 1:36 PM Ivan Daschinsky 
>> >> wrote:
>> >>
>> >>> Actually, I haven't found any integration with tensorflow in AI code.
>> >>> Actually, all integrations are adapters that allow loading pretrained
>> >>> models (h2o, catboost etc.)
>> >>>
>> >>> On Thu, Aug 10, 2023 at 13:08, Ivan Daschinsky wrote:
>> >>>
>> >>>> I am personally for moving to extensions. Alex has already mentioned all
>> >>>> the reasons why it should be done and all of them are quite important.
>> >>>> The module seems to be quite independent and there is no problem moving
>> >>>> it to ignite-extensions.
>> >>>> So I am +1 for moving to ignite-extensions.
>> >>>>
>> >>>>
>> >>>> On Thu, Aug 10, 2023 at 12:45, Kseniya Romanova wrote:
>> 
>> >>
>> >> do you know anyone who uses it?
>> >
>> > I know some teams that do. At the last Ignite Summit we had a talk
>> > featuring the ML module (from the Groovy community).
>> > Anyway, we need the module maintainer's opinion here.
>> >  + Alex
>> >
>> > On Wed, Aug 9, 2023 at 3:38 PM Andrey Mashenkov <
>> > andrey.mashen...@gmail.com>
>> > wrote:
>> >
>> >> -1 for removal.
>> >> 0 for relocation
>> >>
>> >> imho, TC resources and module size aren't good arguments for
>> >> removal/moving.
>> >> ML tests could be run nightly.
>> >> The ML module contains a few integrations (with TensorFlow and others);
>> >> these optional integrations are weighty and could be moved to an extension,
>> >> but the core functionality can still be left untouched if it is highly coupled
>> >> with core Ignite and moving it to an extension is hard.
>> >>
>> >>
>> >> On Wed, Aug 9, 2023 at 3:22 PM Anton Vinogradov 
>> >>> wrote:
>> >>
>> >>> +1 to relocation
>> >>>
>> >>> On Wed, Aug 9, 2023 at 3:09 PM Alex Plehanov <
>> >>> plehanov.a...@gmail.com
>> >>
>> >>> wrote:
>> >>>
>>  Pavel, do you know anyone who