[jira] [Commented] (IGNITE-20216) Moving ML module to ignite-extensions
[ https://issues.apache.org/jira/browse/IGNITE-20216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17763612#comment-17763612 ] Alexey Zinoviev commented on IGNITE-20216: -- Everything is fine, could be merged > Moving ML module to ignite-extensions > - > > Key: IGNITE-20216 > URL: https://issues.apache.org/jira/browse/IGNITE-20216 > Project: Ignite > Issue Type: Improvement >Reporter: Ivan Daschinsky >Assignee: Ivan Daschinsky >Priority: Major > Labels: ise > Fix For: 2.16 > > Time Spent: 50m > Remaining Estimate: 0h > > It is time to move this module to ignite extensions. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (IGNITE-20216) Moving ML module to ignite-extensions
[ https://issues.apache.org/jira/browse/IGNITE-20216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17754952#comment-17754952 ]

Alexey Zinoviev commented on IGNITE-20216:
------------------------------------------

Hi, as a PMC member and maintainer of this module:

-1 for removal
+1 for moving it to an extension, if it stays compatible with Ignite and can be compiled separately from the other extension modules

Some facts:
* nobody has updated it in the last 3 years, that's true
* classic ML algorithms have not changed in the last 3 years (we have not supported DL as part of the module, it's not a goal; Random Forest has not changed in the last 20 years), just like CSV parsing or JDBC
* the Tensorflow integration was removed 3 years ago
* some people contacted me a few weeks ago asking to urgently fix or develop some features in Ignite ML, but I have no time to do it urgently
* I met some companies that used IgniteML in 2021 and 2022, including at my job interview :)
* I agree with the blas issue; it would be great if somebody could update it, and again I could help with testing

I could help review the PR on GitHub that moves the module to an extension; please assign it to me (@zaleslaw). I am on vacation now, so I could do it in September.

> Moving ML module to ignite-extensions
> -------------------------------------
>
> Key: IGNITE-20216
> URL: https://issues.apache.org/jira/browse/IGNITE-20216
> Project: Ignite
> Issue Type: Improvement
> Reporter: Ivan Daschinsky
> Assignee: Ivan Daschinsky
> Priority: Major
> Labels: ise
> Fix For: 2.16
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> It is time to move this module to ignite extensions.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (IGNITE-13803) Scalar test failed due to incorrect Jackson dependency
[ https://issues.apache.org/jira/browse/IGNITE-13803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev resolved IGNITE-13803. -- Resolution: Fixed > Scalar test failed due to incorrect Jackson dependency > -- > > Key: IGNITE-13803 > URL: https://issues.apache.org/jira/browse/IGNITE-13803 > Project: Ignite > Issue Type: Bug > Components: ml >Affects Versions: 2.10 >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.10 > > Time Spent: 20m > Remaining Estimate: 0h > > It's failed with > ``` > java.lang.ExceptionInInitializerError > Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible > Jackson version: 2.10.3``` > > > https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_ScalaExamples?branch=%3Cdefault%3E&buildTypeTab=overview&mode=builds# -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-13803) Scalar test failed due to incorrect Jackson dependency
[ https://issues.apache.org/jira/browse/IGNITE-13803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17242512#comment-17242512 ]

Alexey Zinoviev commented on IGNITE-13803:
------------------------------------------

After excluding the dependency from ignite-ml in the "example" POM, the RDD tests, the scalar suite, the ML project and all examples run without errors.

> Scalar test failed due to incorrect Jackson dependency
> -------------------------------------------------------
>
> Key: IGNITE-13803
> URL: https://issues.apache.org/jira/browse/IGNITE-13803
> Project: Ignite
> Issue Type: Bug
> Components: ml
> Affects Versions: 2.10
> Reporter: Alexey Zinoviev
> Assignee: Alexey Zinoviev
> Priority: Major
> Fix For: 2.10
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> It failed with
> ```
> java.lang.ExceptionInInitializerError
> Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.10.3```
>
> https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_ScalaExamples?branch=%3Cdefault%3E&buildTypeTab=overview&mode=builds#

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-13803) Scalar test failed due to incorrect Jackson dependency
Alexey Zinoviev created IGNITE-13803: Summary: Scalar test failed due to incorrect Jackson dependency Key: IGNITE-13803 URL: https://issues.apache.org/jira/browse/IGNITE-13803 Project: Ignite Issue Type: Bug Components: ml Affects Versions: 2.10 Reporter: Alexey Zinoviev Assignee: Alexey Zinoviev Fix For: 2.10 It's failed with ``` java.lang.ExceptionInInitializerError Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.10.3``` https://ci.ignite.apache.org/buildConfiguration/IgniteTests24Java8_ScalaExamples?branch=%3Cdefault%3E&buildTypeTab=overview&mode=builds# -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IGNITE-12337) [ML] Redesign the package structure
[ https://issues.apache.org/jira/browse/IGNITE-12337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Zinoviev resolved IGNITE-12337.
--------------------------------------
    Resolution: Won't Fix

> [ML] Redesign the package structure
> -----------------------------------
>
> Key: IGNITE-12337
> URL: https://issues.apache.org/jira/browse/IGNITE-12337
> Project: Ignite
> Issue Type: Improvement
> Components: ml
> Reporter: Alexey Zinoviev
> Assignee: Alexey Zinoviev
> Priority: Minor
> Fix For: 2.10
>
> The problem is as follows: many classes and algorithms are not located in appropriate places and are not grouped into high-level packages.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-12337) [ML] Redesign the package structure
[ https://issues.apache.org/jira/browse/IGNITE-12337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Zinoviev updated IGNITE-12337:
-------------------------------------
    Fix Version/s: (was: 2.10)

> [ML] Redesign the package structure
> -----------------------------------
>
> Key: IGNITE-12337
> URL: https://issues.apache.org/jira/browse/IGNITE-12337
> Project: Ignite
> Issue Type: Improvement
> Components: ml
> Reporter: Alexey Zinoviev
> Assignee: Alexey Zinoviev
> Priority: Minor
>
> The problem is as follows: many classes and algorithms are not located in appropriate places and are not grouped into high-level packages.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-12288) [ML] Replace assert logic with exceptions
[ https://issues.apache.org/jira/browse/IGNITE-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-12288: - Fix Version/s: (was: 2.10) > [ML] Replace assert logic with exceptions > - > > Key: IGNITE-12288 > URL: https://issues.apache.org/jira/browse/IGNITE-12288 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Minor > > 1) Add exceptions instead of assert logic > 2) Add tests for the proposed exceptions -- This message was sent by Atlassian Jira (v8.3.4#803005)
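The change proposed in IGNITE-12288 could look roughly like the following minimal sketch. `VectorChecks` and `dot` are hypothetical names chosen for illustration and are not taken from the Ignite ML code base; the point is only that an `assert` is skipped unless the JVM runs with `-ea`, while an explicit exception is always raised and can be covered by a test.

```java
import java.util.Objects;

/** Hypothetical helper (not an Ignite ML class): argument checks via exceptions instead of assert. */
final class VectorChecks {
    private VectorChecks() {
        // Static helpers only.
    }

    /**
     * Dot product of two vectors.
     * An assert-based version would use "assert a.length == b.length",
     * which is silently skipped unless assertions are enabled with -ea.
     */
    static double dot(double[] a, double[] b) {
        Objects.requireNonNull(a, "a");
        Objects.requireNonNull(b, "b");

        if (a.length != b.length)
            throw new IllegalArgumentException("Vector sizes differ: " + a.length + " vs " + b.length);

        double sum = 0.0;
        for (int i = 0; i < a.length; i++)
            sum += a[i] * b[i];

        return sum;
    }
}

class VectorChecksDemo {
    public static void main(String[] args) {
        System.out.println(VectorChecks.dot(new double[] {1, 2}, new double[] {3, 4}));

        try {
            VectorChecks.dot(new double[] {1}, new double[] {1, 2});
        }
        catch (IllegalArgumentException e) {
            // The size mismatch is reported regardless of JVM flags.
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

The second item of the ticket (tests for the proposed exceptions) then reduces to asserting that the mismatched call throws `IllegalArgumentException`.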
[jira] [Updated] (IGNITE-12079) [ML][Umbrella] Add advanced preprocessing techniques
[ https://issues.apache.org/jira/browse/IGNITE-12079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Zinoviev updated IGNITE-12079:
-------------------------------------
    Fix Version/s: (was: 2.10)

> [ML][Umbrella] Add advanced preprocessing techniques
> ----------------------------------------------------
>
> Key: IGNITE-12079
> URL: https://issues.apache.org/jira/browse/IGNITE-12079
> Project: Ignite
> Issue Type: New Feature
> Components: ml
> Reporter: Alexey Zinoviev
> Assignee: Alexey Zinoviev
> Priority: Major
>
> *Main goal:*
> To reduce the gap between Apache Spark and Apache Ignite in preprocessing operations. Reducing the gap could help with loading Spark ML Pipelines into Ignite ML.
>
> Next steps:
> # Add Frequency Encoder
> # Add two Imputing Strategies (MIN, MAX, COUNT, MOST_FREQUENT, LEAST_FREQUENT)
> # Add RobustScaler (will be added in Spark 3.0)
> # Add CountVectorizer
> # Add FeatureHasher
> # Add QuantileDiscretizer
> # Add Locality Sensitive Hashing (LSH)
> # Add LabelEncoder
> # Add RevertStringIndexing
> # Add multi-column preprocessor

--
This message was sent by Atlassian Jira (v8.3.4#803005)
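As an illustration of the first item in the list above, here is a minimal standalone sketch of frequency encoding, where each category is replaced by its relative frequency in the training column. `FrequencyEncoderSketch` is a hypothetical class written for this note; it does not use or reflect the Ignite ML preprocessing API, and mapping unseen categories to 0.0 is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical standalone sketch of frequency encoding; not the Ignite ML preprocessing API. */
final class FrequencyEncoderSketch {
    /** Category -> relative frequency learned from the training column. */
    private final Map<String, Double> freqs = new HashMap<>();

    private FrequencyEncoderSketch() {
        // Instances are created through fit().
    }

    /** Learns the relative frequency of each category in the training column. */
    static FrequencyEncoderSketch fit(List<String> column) {
        if (column.isEmpty())
            throw new IllegalArgumentException("Training column is empty");

        FrequencyEncoderSketch enc = new FrequencyEncoderSketch();

        // First pass: count occurrences per category.
        for (String val : column)
            enc.freqs.merge(val, 1.0, Double::sum);

        // Second pass: turn counts into relative frequencies.
        enc.freqs.replaceAll((category, cnt) -> cnt / column.size());

        return enc;
    }

    /** Encodes a category; unseen categories map to 0.0 (an assumption of this sketch). */
    double encode(String category) {
        return freqs.getOrDefault(category, 0.0);
    }
}

class FrequencyEncoderDemo {
    public static void main(String[] args) {
        FrequencyEncoderSketch enc = FrequencyEncoderSketch.fit(Arrays.asList("a", "b", "a", "c"));

        System.out.println(enc.encode("a"));
        System.out.println(enc.encode("b"));
        System.out.println(enc.encode("unseen"));
    }
}
```

A distributed version would need to aggregate the counts per partition before normalizing; that part is deliberately left out here.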
[jira] [Updated] (IGNITE-10426) [ML] Spread parameter isKeepRawLabels across all models
[ https://issues.apache.org/jira/browse/IGNITE-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Zinoviev updated IGNITE-10426:
-------------------------------------
    Fix Version/s: (was: 2.10)

> [ML] Spread parameter isKeepRawLabels across all models
> -------------------------------------------------------
>
> Key: IGNITE-10426
> URL: https://issues.apache.org/jira/browse/IGNITE-10426
> Project: Ignite
> Issue Type: Improvement
> Components: ml
> Reporter: Alexey Zinoviev
> Assignee: Alexey Zinoviev
> Priority: Major
>
> Currently, only a few models have the parameters isKeepRawLabels and threshold, which are used to convert the predicted value to one of the class labels 1 or 0.
> Discuss this on the dev list and think about how to solve this task to optimize MultiClassModel.
> Possible solutions:
> * add these methods to the common model
> * add this method to MultiClassModel and use reflection to check this parameter, for example in the apply method

--
This message was sent by Atlassian Jira (v8.3.4#803005)
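To make the two parameters concrete, here is a minimal sketch of how a raw-labels flag and a threshold could interact in a binary model. `ThresholdedModelSketch`, `withThreshold` and `withRawLabels` are hypothetical names used only for illustration; the sketch assumes a logistic-style model and is not the Ignite ML implementation.

```java
/** Hypothetical binary model sketch; names are illustrative and not taken from Ignite ML. */
final class ThresholdedModelSketch {
    private final double[] weights;
    private final double intercept;

    /** Scores at or above this value are mapped to label 1, others to label 0. */
    private double threshold = 0.5;

    /** When true, predict() returns the raw score instead of a class label. */
    private boolean keepRawLabels;

    ThresholdedModelSketch(double[] weights, double intercept) {
        this.weights = weights.clone();
        this.intercept = intercept;
    }

    ThresholdedModelSketch withThreshold(double threshold) {
        this.threshold = threshold;
        return this;
    }

    ThresholdedModelSketch withRawLabels(boolean keepRawLabels) {
        this.keepRawLabels = keepRawLabels;
        return this;
    }

    double predict(double[] features) {
        if (features.length != weights.length)
            throw new IllegalArgumentException("Expected " + weights.length + " features, got " + features.length);

        double z = intercept;
        for (int i = 0; i < weights.length; i++)
            z += weights[i] * features[i];

        // Sigmoid turns the linear score into a value in (0, 1).
        double raw = 1.0 / (1.0 + Math.exp(-z));

        if (keepRawLabels)
            return raw;

        return raw >= threshold ? 1.0 : 0.0;
    }
}

class ThresholdedModelDemo {
    public static void main(String[] args) {
        ThresholdedModelSketch mdl = new ThresholdedModelSketch(new double[] {1.0}, 0.0);

        System.out.println(mdl.predict(new double[] {0.0}));
        System.out.println(mdl.withRawLabels(true).predict(new double[] {0.0}));
        System.out.println(mdl.withRawLabels(false).withThreshold(0.7).predict(new double[] {0.0}));
    }
}
```

Moving both setters into a shared base type, the first option listed in the ticket, would let a multi-class wrapper call them without reflection.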
[jira] [Updated] (IGNITE-10870) [ML] Add an example for KNN/LogReg and multi-class task full Iris dataset
[ https://issues.apache.org/jira/browse/IGNITE-10870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Zinoviev updated IGNITE-10870:
-------------------------------------
    Fix Version/s: (was: 2.10)

> [ML] Add an example for KNN/LogReg and multi-class task full Iris dataset
> -------------------------------------------------------------------------
>
> Key: IGNITE-10870
> URL: https://issues.apache.org/jira/browse/IGNITE-10870
> Project: Ignite
> Issue Type: Sub-task
> Components: ml
> Reporter: Alexey Zinoviev
> Assignee: Alexey Zinoviev
> Priority: Minor
> Labels: newbie
>
> Add one or two examples for KNN/LogReg on the Iris dataset with 3 classes.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-13672) [ML] Add initial JSON export/import support for all models
[ https://issues.apache.org/jira/browse/IGNITE-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Zinoviev updated IGNITE-13672:
-------------------------------------
    Labels: important (was: )

> [ML] Add initial JSON export/import support for all models
> ----------------------------------------------------------
>
> Key: IGNITE-13672
> URL: https://issues.apache.org/jira/browse/IGNITE-13672
> Project: Ignite
> Issue Type: Sub-task
> Components: ml
> Reporter: Alexey Zinoviev
> Assignee: Alexey Zinoviev
> Priority: Major
> Labels: important
> Fix For: 2.10
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> This approach uses the capabilities of the JAXB project to import from and export to a human-readable JSON format.
> Should include:
> * Basic interfaces
> * Implementations for all models
> * Examples for all models (maybe export only)
> * Tests with import/export to the temp directory

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-6642) [Umbrella] Model export/import to PMML and custom JSON format
[ https://issues.apache.org/jira/browse/IGNITE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Zinoviev updated IGNITE-6642:
------------------------------------
    Fix Version/s: (was: 2.10)

> [Umbrella] Model export/import to PMML and custom JSON format
> -------------------------------------------------------------
>
> Key: IGNITE-6642
> URL: https://issues.apache.org/jira/browse/IGNITE-6642
> Project: Ignite
> Issue Type: New Feature
> Components: ml
> Reporter: Alexey Zinoviev
> Assignee: Alexey Zinoviev
> Priority: Major
>
> We need to be able to export/import Ignite models across clusters running different versions, and to have an exchangeable, human-readable format for inference with other systems such as scikit-learn, Spark ML, etc.
> The PMML format is a good choice here:
> PMML (Predictive Model Markup Language) is an XML-based language that is used in Spark MLlib and other platforms.
> Here is some additional info about PMML:
> (i) [http://dmg.org/pmml/v4-3/GeneralStructure.html]
> (i) [https://github.com/jpmml/jpmml-model]
>
> But PMML has limited support for ensembles like Random Forest, Gradient Boosted Trees, Stacking, Bagging and so on.
> These cases could be covered with our own JSON format, which could be easily parsed in another system.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-6642) [Umbrella] Model export/import to PMML and custom JSON format
[ https://issues.apache.org/jira/browse/IGNITE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Zinoviev updated IGNITE-6642:
------------------------------------
    Labels: (was: important)

> [Umbrella] Model export/import to PMML and custom JSON format
> -------------------------------------------------------------
>
> Key: IGNITE-6642
> URL: https://issues.apache.org/jira/browse/IGNITE-6642
> Project: Ignite
> Issue Type: New Feature
> Components: ml
> Reporter: Alexey Zinoviev
> Assignee: Alexey Zinoviev
> Priority: Major
> Fix For: 2.10
>
> We need to be able to export/import Ignite models across clusters running different versions, and to have an exchangeable, human-readable format for inference with other systems such as scikit-learn, Spark ML, etc.
> The PMML format is a good choice here:
> PMML (Predictive Model Markup Language) is an XML-based language that is used in Spark MLlib and other platforms.
> Here is some additional info about PMML:
> (i) [http://dmg.org/pmml/v4-3/GeneralStructure.html]
> (i) [https://github.com/jpmml/jpmml-model]
>
> But PMML has limited support for ensembles like Random Forest, Gradient Boosted Trees, Stacking, Bagging and so on.
> These cases could be covered with our own JSON format, which could be easily parsed in another system.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-13672) [ML] Add initial JSON export/import support for all models
Alexey Zinoviev created IGNITE-13672:
----------------------------------------

             Summary: [ML] Add initial JSON export/import support for all models
                 Key: IGNITE-13672
                 URL: https://issues.apache.org/jira/browse/IGNITE-13672
             Project: Ignite
          Issue Type: Sub-task
          Components: ml
            Reporter: Alexey Zinoviev
            Assignee: Alexey Zinoviev
             Fix For: 2.10

This approach uses the capabilities of the JAXB project to import from and export to a human-readable JSON format.

Should include:
* Basic interfaces
* Implementations for all models
* Examples for all models (maybe export only)
* Tests with import/export to the temp directory

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-13533) [ML] Tutorial examples runs more than 300000ms
[ https://issues.apache.org/jira/browse/IGNITE-13533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Zinoviev updated IGNITE-13533:
-------------------------------------
    Component/s: ml
    Fix Version/s: 2.10

> [ML] Tutorial examples runs more than 300000ms
> ----------------------------------------------
>
> Key: IGNITE-13533
> URL: https://issues.apache.org/jira/browse/IGNITE-13533
> Project: Ignite
> Issue Type: Bug
> Components: ml
> Reporter: Alexey Zinoviev
> Assignee: Alexey Zinoviev
> Priority: Major
> Fix For: 2.10
>
> Test has been timed out [test=testExample, timeout=300000]
> Seems like we have a race condition in Genetic Parallel Hyper-parameter tuning
>
> {code:java}
> [12:22:10] : [Step 4/5] Thread [name="test-runner-#31311%ml.TutorialStepByStepExampleSelfTest%", id=32007, state=RUNNABLE, blockCnt=1982, waitCnt=91727]
> [12:22:10] : [Step 4/5] Thread [name="test-runner-#31311%ml.TutorialStepByStepExampleSelfTest%", id=32007, state=RUNNABLE, blockCnt=1982, waitCnt=91727]
> [12:22:10] : [Step 4/5] at java.lang.System.identityHashCode(Native Method)
> [12:22:10] : [Step 4/5] at java.io.ObjectOutputStream$HandleTable.hash(ObjectOutputStream.java:2360)
> [12:22:10] : [Step 4/5] at java.io.ObjectOutputStream$HandleTable.lookup(ObjectOutputStream.java:2293)
> [12:22:10] : [Step 4/5] at java.io.ObjectOutputStream$ReplaceTable.lookup(ObjectOutputStream.java:2399)
> [12:22:10] : [Step 4/5] at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1113)
> [12:22:10] : [Step 4/5] at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
> [12:22:10] : [Step 4/5] at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
> [12:22:10] : [Step 4/5] at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
> [12:22:10] : [Step 4/5] at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
> [12:22:10] : [Step 4/5] at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
> [12:22:10] : [Step 4/5] at o.a.i.marshaller.jdk.JdkMarshaller.marshal0(JdkMarshaller.java:97)
> [12:22:10] : [Step 4/5] at o.a.i.marshaller.jdk.JdkMarshaller.marshal0(JdkMarshaller.java:109)
> [12:22:10] : [Step 4/5] at o.a.i.marshaller.AbstractNodeNameAwareMarshaller.marshal(AbstractNodeNameAwareMarshaller.java:56)
> [12:22:10] : [Step 4/5] at o.a.i.i.util.IgniteUtils.marshal(IgniteUtils.java:10505)
> [12:22:10] : [Step 4/5] at o.a.i.i.processors.cache.GridCacheProcessor$7.applyx(GridCacheProcessor.java:4952)
> [12:22:10] : [Step 4/5] at o.a.i.i.processors.cache.GridCacheProcessor$7.applyx(GridCacheProcessor.java:4933)
> [12:22:10] : [Step 4/5] at o.a.i.i.processors.cache.GridCacheProcessor.withBinaryContext(GridCacheProcessor.java:4978)
> [12:22:10] : [Step 4/5] at o.a.i.i.processors.cache.GridCacheProcessor.cloneCheckSerializable(GridCacheProcessor.java:4933)
> [12:22:10] : [Step 4/5] at o.a.i.i.processors.cache.GridCacheProcessor.prepareCacheChangeRequest(GridCacheProcessor.java:5036)
> [12:22:10] : [Step 4/5] at o.a.i.i.processors.cache.GridCacheProcessor.lambda$dynamicStartCache$26(GridCacheProcessor.java:3472)
> [12:22:10] : [Step 4/5] at o.a.i.i.processors.cache.GridCacheProcessor$$Lambda$722/1638695311.apply(Unknown Source)
> [12:22:10] : [Step 4/5] at o.a.i.i.processors.cache.GridCacheProcessor.dynamicStartCache(GridCacheProcessor.java:3503)
> [12:22:10] : [Step 4/5] at o.a.i.i.processors.cache.GridCacheProcessor.dynamicStartCache(GridCacheProcessor.java:3408)
> [12:22:10] : [Step 4/5] at o.a.i.i.IgniteKernal.createCache(IgniteKernal.java:3191)
> [12:22:10] : [Step 4/5] at o.a.i.ml.dataset.impl.cache.CacheBasedDatasetBuilder.build(CacheBasedDatasetBuilder.java:151)
> [12:22:10] : [Step 4/5] at o.a.i.ml.dataset.impl.cache.CacheBasedDatasetBuilder.build(CacheBasedDatasetBuilder.java:43)
> [12:22:10] : [Step 4/5] at o.a.i.ml.selection.scoring.evaluator.Evaluator.evaluate(Evaluator.java:429)
> [12:22:10] : [Step 4/5] at o.a.i.ml.selection.cv.AbstractCrossValidation.score(AbstractCrossValidation.java:350)
> [12:22:10] : [Step 4/5] at o.a.i.ml.selection.cv.CrossValidation.scoreOnIgnite(CrossValidation.java:79)
> [12:22:10] : [Step 4/5] at o.a.i.ml.selection.cv.CrossValidation.scoreByFolds(CrossValidation.java:53)
> [12:22:10] : [Step 4/5] at o.a.i.ml.selection.cv.AbstractCrossValidation.calculateScoresForFixedParamSet(AbstractCrossValidation.java:294)
> [12:22:10] : [Step 4/5] at o.a.i.ml.selection.c
[jira] [Updated] (IGNITE-13533) [ML] Tutorial examples runs more than 300000ms
[ https://issues.apache.org/jira/browse/IGNITE-13533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Zinoviev updated IGNITE-13533:
-------------------------------------
    Description:
Test has been timed out [test=testExample, timeout=300000]

Seems like we have a race condition in Genetic Parallel Hyper-parameter tuning

{code:java}
[12:22:10] : [Step 4/5] Thread [name="test-runner-#31311%ml.TutorialStepByStepExampleSelfTest%", id=32007, state=RUNNABLE, blockCnt=1982, waitCnt=91727]
[12:22:10] : [Step 4/5] Thread [name="test-runner-#31311%ml.TutorialStepByStepExampleSelfTest%", id=32007, state=RUNNABLE, blockCnt=1982, waitCnt=91727]
[12:22:10] : [Step 4/5] at java.lang.System.identityHashCode(Native Method)
[12:22:10] : [Step 4/5] at java.io.ObjectOutputStream$HandleTable.hash(ObjectOutputStream.java:2360)
[12:22:10] : [Step 4/5] at java.io.ObjectOutputStream$HandleTable.lookup(ObjectOutputStream.java:2293)
[12:22:10] : [Step 4/5] at java.io.ObjectOutputStream$ReplaceTable.lookup(ObjectOutputStream.java:2399)
[12:22:10] : [Step 4/5] at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1113)
[12:22:10] : [Step 4/5] at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
[12:22:10] : [Step 4/5] at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
[12:22:10] : [Step 4/5] at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
[12:22:10] : [Step 4/5] at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
[12:22:10] : [Step 4/5] at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
[12:22:10] : [Step 4/5] at o.a.i.marshaller.jdk.JdkMarshaller.marshal0(JdkMarshaller.java:97)
[12:22:10] : [Step 4/5] at o.a.i.marshaller.jdk.JdkMarshaller.marshal0(JdkMarshaller.java:109)
[12:22:10] : [Step 4/5] at o.a.i.marshaller.AbstractNodeNameAwareMarshaller.marshal(AbstractNodeNameAwareMarshaller.java:56)
[12:22:10] : [Step 4/5] at o.a.i.i.util.IgniteUtils.marshal(IgniteUtils.java:10505)
[12:22:10] : [Step 4/5] at o.a.i.i.processors.cache.GridCacheProcessor$7.applyx(GridCacheProcessor.java:4952)
[12:22:10] : [Step 4/5] at o.a.i.i.processors.cache.GridCacheProcessor$7.applyx(GridCacheProcessor.java:4933)
[12:22:10] : [Step 4/5] at o.a.i.i.processors.cache.GridCacheProcessor.withBinaryContext(GridCacheProcessor.java:4978)
[12:22:10] : [Step 4/5] at o.a.i.i.processors.cache.GridCacheProcessor.cloneCheckSerializable(GridCacheProcessor.java:4933)
[12:22:10] : [Step 4/5] at o.a.i.i.processors.cache.GridCacheProcessor.prepareCacheChangeRequest(GridCacheProcessor.java:5036)
[12:22:10] : [Step 4/5] at o.a.i.i.processors.cache.GridCacheProcessor.lambda$dynamicStartCache$26(GridCacheProcessor.java:3472)
[12:22:10] : [Step 4/5] at o.a.i.i.processors.cache.GridCacheProcessor$$Lambda$722/1638695311.apply(Unknown Source)
[12:22:10] : [Step 4/5] at o.a.i.i.processors.cache.GridCacheProcessor.dynamicStartCache(GridCacheProcessor.java:3503)
[12:22:10] : [Step 4/5] at o.a.i.i.processors.cache.GridCacheProcessor.dynamicStartCache(GridCacheProcessor.java:3408)
[12:22:10] : [Step 4/5] at o.a.i.i.IgniteKernal.createCache(IgniteKernal.java:3191)
[12:22:10] : [Step 4/5] at o.a.i.ml.dataset.impl.cache.CacheBasedDatasetBuilder.build(CacheBasedDatasetBuilder.java:151)
[12:22:10] : [Step 4/5] at o.a.i.ml.dataset.impl.cache.CacheBasedDatasetBuilder.build(CacheBasedDatasetBuilder.java:43)
[12:22:10] : [Step 4/5] at o.a.i.ml.selection.scoring.evaluator.Evaluator.evaluate(Evaluator.java:429)
[12:22:10] : [Step 4/5] at o.a.i.ml.selection.cv.AbstractCrossValidation.score(AbstractCrossValidation.java:350)
[12:22:10] : [Step 4/5] at o.a.i.ml.selection.cv.CrossValidation.scoreOnIgnite(CrossValidation.java:79)
[12:22:10] : [Step 4/5] at o.a.i.ml.selection.cv.CrossValidation.scoreByFolds(CrossValidation.java:53)
[12:22:10] : [Step 4/5] at o.a.i.ml.selection.cv.AbstractCrossValidation.calculateScoresForFixedParamSet(AbstractCrossValidation.java:294)
[12:22:10] : [Step 4/5] at o.a.i.ml.selection.cv.AbstractCrossValidation.lambda$scoreEvolutionAlgorithmSearchHyperparameterOptimization$0(AbstractCrossValidation.java:142)
[12:22:10] : [Step 4/5] at o.a.i.ml.selection.cv.AbstractCrossValidation$$Lambda$1731/1372309000.apply(Unknown Source)
[12:22:10] : [Step 4/5] at o.a.i.ml.util.genetic.Population.calculateFitnessForChromosome(Population.java:58)
[12:22:10] : [Step 4/5] at o.a.i.ml.util.genetic.GeneticAlgorithm.run(GeneticAlgorithm.java:118)
[12:22:10] : [Step 4/5] at o.a.i.ml.selection.cv.AbstractCrossValidation.scoreEvolutionAlgorithmSearchHyperparam
[jira] [Commented] (IGNITE-13532) [ML] Test DatasetAffinityFunctionWrapperTest failed with UnnecessaryStubbingException
[ https://issues.apache.org/jira/browse/IGNITE-13532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208647#comment-17208647 ]

Alexey Zinoviev commented on IGNITE-13532:
------------------------------------------

I've run only the ML visa, because the changes relate only to the ML module.

Currently, the TC Bot run on master shows hundreds of broken tests and missing licenses.

> [ML] Test DatasetAffinityFunctionWrapperTest failed with UnnecessaryStubbingException
> --------------------------------------------------------------------------------------
>
> Key: IGNITE-13532
> URL: https://issues.apache.org/jira/browse/IGNITE-13532
> Project: Ignite
> Issue Type: Bug
> Components: ml
> Reporter: Alexey Zinoviev
> Assignee: Alexey Zinoviev
> Priority: Critical
> Fix For: 2.10
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> NOTE: This is not reproduced locally, but reproduced on TC
>
> org.mockito.exceptions.misusing.UnnecessaryStubbingException: Unnecessary stubbings detected in test class: DatasetAffinityFunctionWrapperTest Clean & maintainable test code requires zero unnecessary code. Following stubbings are unnecessary (click to navigate to relevant line of code): 1. -> at org.apache.ignite.ml.dataset.impl.cache.util.DatasetAffinityFunctionWrapperTest.testPartition(DatasetAffinityFunctionWrapperTest.java:80) Please remove unnecessary stubbings or use 'lenient' strictness. More info: javadoc for UnnecessaryStubbingException class.
> org.mockito.exceptions.misusing.UnnecessaryStubbingException:
> Unnecessary stubbings detected in test class: DatasetAffinityFunctionWrapperTest
> Clean & maintainable test code requires zero unnecessary code.
> Following stubbings are unnecessary (click to navigate to relevant line of code):
> 1. -> at org.apache.ignite.ml.dataset.impl.cache.util.DatasetAffinityFunctionWrapperTest.testPartition(DatasetAffinityFunctionWrapperTest.java:80)
> Please remove unnecessary stubbings or use 'lenient' strictness. More info: javadoc for UnnecessaryStubbingException class.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-13533) [ML] Tutorial examples runs more than 300000ms
Alexey Zinoviev created IGNITE-13533:
----------------------------------------

             Summary: [ML] Tutorial examples runs more than 300000ms
                 Key: IGNITE-13533
                 URL: https://issues.apache.org/jira/browse/IGNITE-13533
             Project: Ignite
          Issue Type: Bug
            Reporter: Alexey Zinoviev
            Assignee: Alexey Zinoviev

Test has been timed out [test=testExample, timeout=300000]

Seems like we have a race condition in Genetic Parallel Hyper-parameter tuning

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-13532) [ML] Test DatasetAffinityFunctionWrapperTest failed with UnnecessaryStubbingException
Alexey Zinoviev created IGNITE-13532: Summary: [ML] Test DatasetAffinityFunctionWrapperTest failed with UnnecessaryStubbingException Key: IGNITE-13532 URL: https://issues.apache.org/jira/browse/IGNITE-13532 Project: Ignite Issue Type: Bug Components: ml Reporter: Alexey Zinoviev Assignee: Alexey Zinoviev Fix For: 2.10 org.mockito.exceptions.misusing.UnnecessaryStubbingException: Unnecessary stubbings detected in test class: DatasetAffinityFunctionWrapperTest Clean & maintainable test code requires zero unnecessary code. Following stubbings are unnecessary (click to navigate to relevant line of code): 1. -> at org.apache.ignite.ml.dataset.impl.cache.util.DatasetAffinityFunctionWrapperTest.testPartition(DatasetAffinityFunctionWrapperTest.java:80) Please remove unnecessary stubbings or use 'lenient' strictness. More info: javadoc for UnnecessaryStubbingException class. org.mockito.exceptions.misusing.UnnecessaryStubbingException: Unnecessary stubbings detected in test class: DatasetAffinityFunctionWrapperTest Clean & maintainable test code requires zero unnecessary code. Following stubbings are unnecessary (click to navigate to relevant line of code): 1. -> at org.apache.ignite.ml.dataset.impl.cache.util.DatasetAffinityFunctionWrapperTest.testPartition(DatasetAffinityFunctionWrapperTest.java:80) Please remove unnecessary stubbings or use 'lenient' strictness. More info: javadoc for UnnecessaryStubbingException class. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-13532) [ML] Test DatasetAffinityFunctionWrapperTest failed with UnnecessaryStubbingException
[ https://issues.apache.org/jira/browse/IGNITE-13532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-13532: - Description: NOTE: This is not reproduced locally, but reproduced on TC org.mockito.exceptions.misusing.UnnecessaryStubbingException: Unnecessary stubbings detected in test class: DatasetAffinityFunctionWrapperTest Clean & maintainable test code requires zero unnecessary code. Following stubbings are unnecessary (click to navigate to relevant line of code): 1. -> at org.apache.ignite.ml.dataset.impl.cache.util.DatasetAffinityFunctionWrapperTest.testPartition(DatasetAffinityFunctionWrapperTest.java:80) Please remove unnecessary stubbings or use 'lenient' strictness. More info: javadoc for UnnecessaryStubbingException class. org.mockito.exceptions.misusing.UnnecessaryStubbingException: Unnecessary stubbings detected in test class: DatasetAffinityFunctionWrapperTest Clean & maintainable test code requires zero unnecessary code. Following stubbings are unnecessary (click to navigate to relevant line of code): 1. -> at org.apache.ignite.ml.dataset.impl.cache.util.DatasetAffinityFunctionWrapperTest.testPartition(DatasetAffinityFunctionWrapperTest.java:80) Please remove unnecessary stubbings or use 'lenient' strictness. More info: javadoc for UnnecessaryStubbingException class. was: org.mockito.exceptions.misusing.UnnecessaryStubbingException: Unnecessary stubbings detected in test class: DatasetAffinityFunctionWrapperTest Clean & maintainable test code requires zero unnecessary code. Following stubbings are unnecessary (click to navigate to relevant line of code): 1. -> at org.apache.ignite.ml.dataset.impl.cache.util.DatasetAffinityFunctionWrapperTest.testPartition(DatasetAffinityFunctionWrapperTest.java:80) Please remove unnecessary stubbings or use 'lenient' strictness. More info: javadoc for UnnecessaryStubbingException class. 
org.mockito.exceptions.misusing.UnnecessaryStubbingException: Unnecessary stubbings detected in test class: DatasetAffinityFunctionWrapperTest Clean & maintainable test code requires zero unnecessary code. Following stubbings are unnecessary (click to navigate to relevant line of code): 1. -> at org.apache.ignite.ml.dataset.impl.cache.util.DatasetAffinityFunctionWrapperTest.testPartition(DatasetAffinityFunctionWrapperTest.java:80) Please remove unnecessary stubbings or use 'lenient' strictness. More info: javadoc for UnnecessaryStubbingException class. > [ML] Test DatasetAffinityFunctionWrapperTest failed with > UnnecessaryStubbingException > - > > Key: IGNITE-13532 > URL: https://issues.apache.org/jira/browse/IGNITE-13532 > Project: Ignite > Issue Type: Bug > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Critical > Fix For: 2.10 > > > NOTE: This is not reproduced locally, but reproduced on TC > > org.mockito.exceptions.misusing.UnnecessaryStubbingException: Unnecessary > stubbings detected in test class: DatasetAffinityFunctionWrapperTest Clean & > maintainable test code requires zero unnecessary code. Following stubbings > are unnecessary (click to navigate to relevant line of code): 1. -> at > org.apache.ignite.ml.dataset.impl.cache.util.DatasetAffinityFunctionWrapperTest.testPartition(DatasetAffinityFunctionWrapperTest.java:80) > Please remove unnecessary stubbings or use 'lenient' strictness. More info: > javadoc for UnnecessaryStubbingException class. > org.mockito.exceptions.misusing.UnnecessaryStubbingException: > Unnecessary stubbings detected in test class: > DatasetAffinityFunctionWrapperTest > Clean & maintainable test code requires zero unnecessary code. > Following stubbings are unnecessary (click to navigate to relevant line of > code): > 1. 
-> at > org.apache.ignite.ml.dataset.impl.cache.util.DatasetAffinityFunctionWrapperTest.testPartition(DatasetAffinityFunctionWrapperTest.java:80) > Please remove unnecessary stubbings or use 'lenient' strictness. More info: > javadoc for UnnecessaryStubbingException class. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-13531) [ML] Code cleanup in Util classes
[ https://issues.apache.org/jira/browse/IGNITE-13531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208597#comment-17208597 ] Alexey Zinoviev commented on IGNITE-13531: -- [~mrkandreev] please have a look. I've created a ticket for the clean-up edits you suggested in the doc [https://docs.google.com/document/d/1_oBgmNfu6YnuSxEg9e1ImyGSV-fgmHq4Ut-hPq2bakQ/edit?usp=sharing], except the cloning (I need to think about how to do it better and may do it later myself) > [ML] Code cleanup in Util classes > - > > Key: IGNITE-13531 > URL: https://issues.apache.org/jira/browse/IGNITE-13531 > Project: Ignite > Issue Type: Improvement >Reporter: Alexey Zinoviev >Priority: Major > Fix For: 2.10 > > > *Suggest improvement to Util classes* > > I suggest adding a final class modifier and a private constructor > to the Util classes in Ignite ML. This is Sonar rule RSPEC-1118 ( > [https://rules.sonarsource.com/java/tag/design/RSPEC-1118]). > > Motivation: > Utility classes, which are collections of static members, are not meant to > be instantiated. Even abstract utility classes, which can be extended, > should not have public constructors. Java adds an implicit public > constructor to every class which does not define at least one explicitly. > Hence, at least one non-public constructor should be defined. > > We can add this to: > * DistributedMetaStorageUtil.java > * ComputeUtils.java > * IgniteModelStorageUtil.java > * MapUtil.java > * MatrixUtil.java > * Utils.java > Class JdbcThinSSLUtil.java already has a private constructor. > > *Suggest adding Serializable to the Blas class* > I found that class Blas (org.apache.ignite.ml.math) is not Serializable, yet > its fields f2jBlas and nativeBlas are transient. So I suggest making > the Blas class Serializable. > > *Add final modifier to static inner fields in utils class* > Motivation: > This static field is public but not final, and could be changed by malicious > code or by accident from another package. 
The field could be made final to > avoid this vulnerability. > > For example, replace: > public static IgniteDifferentiableDoubleToDoubleFunction SIGMOID = new > IgniteDifferentiableDoubleToDoubleFunction() { > } > With: > public static final IgniteDifferentiableDoubleToDoubleFunction SIGMOID = new > IgniteDifferentiableDoubleToDoubleFunction() { > } > > *Inefficient use of keySet iterator instead of entrySet* > This method accesses the value of a Map entry, using a key that was retrieved > from a keySet iterator. It is more efficient to use an iterator on the > entrySet of the map, to avoid the Map.get(key) lookup. > > A possible problem is the expected iteration order of the set. > > For example: > for (Integer bucket : hist.keySet()) { > accum += hist.get(bucket); > res.put(bucket, accum); > } > > *Can be replaced with single Arrays.fill method call* > For example: > for (int i = 0; i < mins.length; i++) > mins[i] = Double.POSITIVE_INFINITY; > Can be replaced with: > Arrays.fill(mins, Double.POSITIVE_INFINITY); > Found in: > * ImputerTrainer > * MaxAbsScalerTrainer > * MinMaxScalerTrainer -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-13531) [ML] Code cleanup in Util classes
Alexey Zinoviev created IGNITE-13531: Summary: [ML] Code cleanup in Util classes Key: IGNITE-13531 URL: https://issues.apache.org/jira/browse/IGNITE-13531 Project: Ignite Issue Type: Improvement Reporter: Alexey Zinoviev Fix For: 2.10 *Suggest improvement to Util classes* I suggest adding a final class modifier and a private constructor to the Util classes in Ignite ML. This is Sonar rule RSPEC-1118 ( [https://rules.sonarsource.com/java/tag/design/RSPEC-1118]). Motivation: Utility classes, which are collections of static members, are not meant to be instantiated. Even abstract utility classes, which can be extended, should not have public constructors. Java adds an implicit public constructor to every class which does not define at least one explicitly. Hence, at least one non-public constructor should be defined. We can add this to: * DistributedMetaStorageUtil.java * ComputeUtils.java * IgniteModelStorageUtil.java * MapUtil.java * MatrixUtil.java * Utils.java Class JdbcThinSSLUtil.java already has a private constructor. *Suggest adding Serializable to the Blas class* I found that class Blas (org.apache.ignite.ml.math) is not Serializable, yet its fields f2jBlas and nativeBlas are transient. So I suggest making the Blas class Serializable. *Add final modifier to static inner fields in utils class* Motivation: This static field is public but not final, and could be changed by malicious code or by accident from another package. The field could be made final to avoid this vulnerability. For example, replace: public static IgniteDifferentiableDoubleToDoubleFunction SIGMOID = new IgniteDifferentiableDoubleToDoubleFunction() { } With: public static final IgniteDifferentiableDoubleToDoubleFunction SIGMOID = new IgniteDifferentiableDoubleToDoubleFunction() { } *Inefficient use of keySet iterator instead of entrySet* This method accesses the value of a Map entry, using a key that was retrieved from a keySet iterator. 
It is more efficient to use an iterator on the entrySet of the map, to avoid the Map.get(key) lookup. A possible problem is the expected iteration order of the set. For example: for (Integer bucket : hist.keySet()) { accum += hist.get(bucket); res.put(bucket, accum); } *Can be replaced with single Arrays.fill method call* For example: for (int i = 0; i < mins.length; i++) mins[i] = Double.POSITIVE_INFINITY; Can be replaced with: Arrays.fill(mins, Double.POSITIVE_INFINITY); Found in: * ImputerTrainer * MaxAbsScalerTrainer * MinMaxScalerTrainer -- This message was sent by Atlassian Jira (v8.3.4#803005)
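A minimal sketch of three of the clean-ups proposed in IGNITE-13531, gathered in one place. The class HistogramUtils and its methods are hypothetical names chosen for illustration and are not part of Ignite; the TreeMap parameter is likewise an assumption, used only so the bucket order is deterministic.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical utility class (not an Ignite class) illustrating the suggested clean-ups. */
final class HistogramUtils {
    /** Private constructor: a utility class is not meant to be instantiated (RSPEC-1118). */
    private HistogramUtils() {
        throw new AssertionError("No instances.");
    }

    /** Running totals per bucket, iterating entrySet() so no Map.get(key) lookup is needed. */
    static Map<Integer, Double> cumulative(TreeMap<Integer, Double> hist) {
        Map<Integer, Double> res = new TreeMap<>();
        double accum = 0.0;

        for (Map.Entry<Integer, Double> e : hist.entrySet()) {
            accum += e.getValue();
            res.put(e.getKey(), accum);
        }

        return res;
    }

    /** Array of the given size filled with +infinity by a single Arrays.fill call. */
    static double[] initialMins(int size) {
        double[] mins = new double[size];
        Arrays.fill(mins, Double.POSITIVE_INFINITY);
        return mins;
    }
}
```

On the ordering concern raised in the ticket: for the standard java.util map implementations, keySet() and entrySet() of the same map iterate in the same order, so switching views should not change which bucket is visited first.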
[jira] [Commented] (IGNITE-13386) [ML] Add more distances between two Vectors (Part 2)
[ https://issues.apache.org/jira/browse/IGNITE-13386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208580#comment-17208580 ] Alexey Zinoviev commented on IGNITE-13386: -- [~mrkandreev] please move to patch available > [ML] Add more distances between two Vectors (Part 2) > > > Key: IGNITE-13386 > URL: https://issues.apache.org/jira/browse/IGNITE-13386 > Project: Ignite > Issue Type: Sub-task > Components: ml >Reporter: Alexey Zinoviev >Assignee: Mark Andreev >Priority: Minor > Fix For: 2.10 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Mark suggested to add more distances, below his letter about topic > [http://apache-ignite-developers.2346864.n4.nabble.com/First-contribute-to-Ignite-ML-td48950.html] > "Currently, Ignite supports only these distances > (org.apache.ignite.ml.math.distances) : > - ChebyshevDistance > - CosineSimilarity > - EuclideanDistance > - HammingDistance > - JaccardIndex > - ManhattanDistance > - MinkowskiDistance > But in scipy ( > [https://docs.scipy.org/doc/scipy/reference/spatial.distance.html]) we can > find at least: > - BrayCurtis > - Canberra > - Jensen-Shannon > - Seuclidean > - Weighted Minkowski > I can implement those and coverage with unit tests." -- This message was sent by Atlassian Jira (v8.3.4#803005)
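For orientation, two of the distances listed in IGNITE-13386 could look roughly like this. This is a sketch on plain double arrays following the formulas in the SciPy documentation; it does not use Ignite's Vector or distance interfaces, and the class name VectorDistances is hypothetical.

```java
/** Hypothetical helper (not an Ignite class) with two of the proposed distances. */
final class VectorDistances {
    private VectorDistances() {
    }

    /** Canberra distance: sum_i |u_i - v_i| / (|u_i| + |v_i|). */
    static double canberra(double[] u, double[] v) {
        checkSameLength(u, v);
        double sum = 0.0;

        for (int i = 0; i < u.length; i++) {
            double denom = Math.abs(u[i]) + Math.abs(v[i]);

            // A 0/0 term is counted as 0, as the SciPy documentation describes.
            if (denom != 0.0)
                sum += Math.abs(u[i] - v[i]) / denom;
        }

        return sum;
    }

    /** Bray-Curtis distance: sum_i |u_i - v_i| / sum_i |u_i + v_i| (NaN if the denominator is 0). */
    static double brayCurtis(double[] u, double[] v) {
        checkSameLength(u, v);
        double num = 0.0;
        double denom = 0.0;

        for (int i = 0; i < u.length; i++) {
            num += Math.abs(u[i] - v[i]);
            denom += Math.abs(u[i] + v[i]);
        }

        return num / denom;
    }

    private static void checkSameLength(double[] u, double[] v) {
        if (u.length != v.length)
            throw new IllegalArgumentException("Vectors must have the same length.");
    }
}
```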
[jira] [Commented] (IGNITE-13392) Incorrect Vector::kNorm evaluation for odd powers
[ https://issues.apache.org/jira/browse/IGNITE-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17208562#comment-17208562 ] Alexey Zinoviev commented on IGNITE-13392: -- [~mrkandreev] Please move ticket to the patch available status > Incorrect Vector::kNorm evaluation for odd powers > - > > Key: IGNITE-13392 > URL: https://issues.apache.org/jira/browse/IGNITE-13392 > Project: Ignite > Issue Type: Bug > Components: ml >Reporter: Mark Andreev >Assignee: Mark Andreev >Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h > > Current implementation of `Vector::kNorm` is incorrect. > Current formula is > (`org.apache.ignite.ml.math.primitives.vector.AbstractVector:882`): > {code:java} > (\sum_{i}{x^p})^{1/p} > {code} > But correct formula is: > {code:java} > (\sum_{i}{|x|^p})^{1/p} > {code} > We can verify this using lectures > ([https://www.math.usm.edu/lambers/mat610/sum10/lecture2.pdf)] or using > Wolfram Mathematica: > {code:java} > > Norm[{x, y, z}, p] > (Abs[x]^p+Abs[y]^p+Abs[z]^p)^(1/p){code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
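The difference between the two formulas in IGNITE-13392 can be checked with a short sketch on plain arrays. The class Norms and both method names are hypothetical and are not taken from AbstractVector; the second method only reproduces the formula the ticket describes as the current one, for comparison.

```java
/** Hypothetical helper (not Ignite code) comparing the two formulas from the ticket. */
final class Norms {
    private Norms() {
    }

    /** p-norm with absolute values: (sum_i |x_i|^p)^(1/p). */
    static double kNorm(double[] x, double p) {
        double sum = 0.0;

        for (double v : x)
            sum += Math.pow(Math.abs(v), p);

        return Math.pow(sum, 1.0 / p);
    }

    /** Formula without absolute values, (sum_i x_i^p)^(1/p), kept only for comparison. */
    static double kNormWithoutAbs(double[] x, double p) {
        double sum = 0.0;

        for (double v : x)
            sum += Math.pow(v, p);

        return Math.pow(sum, 1.0 / p);
    }
}
```

For x = (-1, -1) and p = 3 the version without absolute values sums to -2, and Math.pow of a negative base with a non-integer exponent is NaN, while the version with absolute values gives the cube root of 2. For even p the two agree, which is why the ticket title mentions odd powers.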
[jira] [Updated] (IGNITE-13392) Incorrect Vector::kNorm evaluation for odd powers
[ https://issues.apache.org/jira/browse/IGNITE-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-13392: - Fix Version/s: 2.10 > Incorrect Vector::kNorm evaluation for odd powers > - > > Key: IGNITE-13392 > URL: https://issues.apache.org/jira/browse/IGNITE-13392 > Project: Ignite > Issue Type: Bug > Components: ml >Reporter: Mark Andreev >Assignee: Mark Andreev >Priority: Minor > Fix For: 2.10 > > Time Spent: 20m > Remaining Estimate: 0h > > Current implementation of `Vector::kNorm` is incorrect. > Current formula is > (`org.apache.ignite.ml.math.primitives.vector.AbstractVector:882`): > {code:java} > (\sum_{i}{x^p})^{1/p} > {code} > But correct formula is: > {code:java} > (\sum_{i}{|x|^p})^{1/p} > {code} > We can verify this using lectures > ([https://www.math.usm.edu/lambers/mat610/sum10/lecture2.pdf)] or using > Wolfram Mathematica: > {code:java} > > Norm[{x, y, z}, p] > (Abs[x]^p+Abs[y]^p+Abs[z]^p)^(1/p){code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-13392) [ML] Incorrect Vector::kNorm evaluation for odd powers
[ https://issues.apache.org/jira/browse/IGNITE-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-13392: - Summary: [ML] Incorrect Vector::kNorm evaluation for odd powers (was: Incorrect Vector::kNorm evaluation for odd powers) > [ML] Incorrect Vector::kNorm evaluation for odd powers > -- > > Key: IGNITE-13392 > URL: https://issues.apache.org/jira/browse/IGNITE-13392 > Project: Ignite > Issue Type: Bug > Components: ml >Reporter: Mark Andreev >Assignee: Mark Andreev >Priority: Minor > Fix For: 2.10 > > Time Spent: 20m > Remaining Estimate: 0h > > Current implementation of `Vector::kNorm` is incorrect. > Current formula is > (`org.apache.ignite.ml.math.primitives.vector.AbstractVector:882`): > {code:java} > (\sum_{i}{x^p})^{1/p} > {code} > But correct formula is: > {code:java} > (\sum_{i}{|x|^p})^{1/p} > {code} > We can verify this using lectures > ([https://www.math.usm.edu/lambers/mat610/sum10/lecture2.pdf)] or using > Wolfram Mathematica: > {code:java} > > Norm[{x, y, z}, p] > (Abs[x]^p+Abs[y]^p+Abs[z]^p)^(1/p){code} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-13386) [ML] Add more distances between two Vectors (Part 2)
[ https://issues.apache.org/jira/browse/IGNITE-13386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-13386: - Ignite Flags: Docs Required,Release Notes Required (was: Release Notes Required) > [ML] Add more distances between two Vectors (Part 2) > > > Key: IGNITE-13386 > URL: https://issues.apache.org/jira/browse/IGNITE-13386 > Project: Ignite > Issue Type: Sub-task > Components: ml >Reporter: Alexey Zinoviev >Assignee: Mark Andreev >Priority: Minor > Fix For: 2.10 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Mark suggested to add more distances, below his letter about topic > [http://apache-ignite-developers.2346864.n4.nabble.com/First-contribute-to-Ignite-ML-td48950.html] > "Currently, Ignite supports only these distances > (org.apache.ignite.ml.math.distances) : > - ChebyshevDistance > - CosineSimilarity > - EuclideanDistance > - HammingDistance > - JaccardIndex > - ManhattanDistance > - MinkowskiDistance > But in scipy ( > [https://docs.scipy.org/doc/scipy/reference/spatial.distance.html]) we can > find at least: > - BrayCurtis > - Canberra > - Jensen-Shannon > - Seuclidean > - Weighted Minkowski > I can implement those and coverage with unit tests." -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-13386) [ML] Add more distances between two Vectors (Part 2)
[ https://issues.apache.org/jira/browse/IGNITE-13386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-13386: - Fix Version/s: 2.10 > [ML] Add more distances between two Vectors (Part 2) > > > Key: IGNITE-13386 > URL: https://issues.apache.org/jira/browse/IGNITE-13386 > Project: Ignite > Issue Type: Sub-task > Components: ml >Reporter: Alexey Zinoviev >Assignee: Mark Andreev >Priority: Minor > Fix For: 2.10 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Mark suggested to add more distances, below his letter about topic > [http://apache-ignite-developers.2346864.n4.nabble.com/First-contribute-to-Ignite-ML-td48950.html] > "Currently, Ignite supports only these distances > (org.apache.ignite.ml.math.distances) : > - ChebyshevDistance > - CosineSimilarity > - EuclideanDistance > - HammingDistance > - JaccardIndex > - ManhattanDistance > - MinkowskiDistance > But in scipy ( > [https://docs.scipy.org/doc/scipy/reference/spatial.distance.html]) we can > find at least: > - BrayCurtis > - Canberra > - Jensen-Shannon > - Seuclidean > - Weighted Minkowski > I can implement those and coverage with unit tests." -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-13344) [ML] DummyVectorizer fails to extract label for coordinate with value "0.0" when backed by sparse vector
[ https://issues.apache.org/jira/browse/IGNITE-13344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-13344: - Fix Version/s: 2.10 > [ML] DummyVectorizer fails to extract label for coordinate with value "0.0" > when backed by sparse vector > > > Key: IGNITE-13344 > URL: https://issues.apache.org/jira/browse/IGNITE-13344 > Project: Ignite > Issue Type: Bug > Components: ml >Affects Versions: 2.8.1 >Reporter: Thilo-Alexander Ginkel >Assignee: Alexey Zinoviev >Priority: Minor > Fix For: 2.10 > > > Given: A labeled DummyVectorizer: > > {code:java} > new DummyVectorizer() > .exclude(excludeCoordinates.stream().map(coord -> vectorLength + > coord).toArray(Integer[]::new)) > .labeled(labelCoord); > {code} > {{When extracting the label, the call hierarchy eventually ends up at > org.apache.ignite.ml.dataset.feature.extractor.impl.DummyVectorizer#feature, > which returns null for val.getRaw when val is a sparse vector with the > element at the requested label coordinate being 0.0. This causes the training > job to fail (which expects a non-null label):}} > {code:java} > org.apache.ignite.IgniteException: Remote job threw user exception (override > or implement ComputeTask.result(..) method if you would like to have > automatic failover for this exception): > nullorg.apache.ignite.IgniteException: Remote job threw user exception > (override or implement ComputeTask.result(..) 
method if you would like to > have automatic failover for this exception): null at > org.apache.ignite.compute.ComputeTaskAdapter.result(ComputeTaskAdapter.java:102) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.task.GridTaskWorker$5.apply(GridTaskWorker.java:1062) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.task.GridTaskWorker$5.apply(GridTaskWorker.java:1055) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7037) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.task.GridTaskWorker.result(GridTaskWorker.java:1055) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.task.GridTaskWorker.onResponse(GridTaskWorker.java:862) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.task.GridTaskProcessor.processJobExecuteResponse(GridTaskProcessor.java:1146) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.job.GridJobWorker.finishJob(GridJobWorker.java:961) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.job.GridJobWorker.finishJob(GridJobWorker.java:809) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:659) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:519) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) > ~[ignite-core-2.8.1.jar:2.8.1] at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > ~[na:na] at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) > ~[na:na] at java.base/java.lang.Thread.run(Thread.java:832) ~[na:na]Caused > by: org.apache.ignite.IgniteException: null at > 
org.apache.ignite.internal.processors.closure.GridClosureProcessor$C2.execute(GridClosureProcessor.java:1858) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:596) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7005) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:590) > ~[ignite-core-2.8.1.jar:2.8.1] ... 5 common frames omittedCaused by: > java.lang.NullPointerException: null at > org.apache.ignite.ml.dataset.impl.bootstrapping.BootstrappedDatasetBuilder.build(BootstrappedDatasetBuilder.java:91) > ~[ignite-ml-2.8.1.jar:2.8.1] at > org.apache.ignite.ml.dataset.impl.bootstrapping.BootstrappedDatasetBuilder.build(BootstrappedDatasetBuilder.java:41) > ~[ignite-ml-2.8.1.jar:2.8.1] at > org.apache.ignite.ml.dataset.impl.cache.util.ComputeUtils.lambda$getData$4(ComputeUtils.java:239) > ~[ignite-ml-2.8.1.jar:2.8.1] at > org.apache.ignite.ml.dataset.impl.cache.util.PartitionDataStorage.lambda$computeDataIfAbsent$1(PartitionDataStorage.java:56) > ~[ignite-ml-2.8.1.jar:2.8.1] at > java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.j
[jira] [Updated] (IGNITE-10870) [ML] Add an example for KNN/LogReg and multi-class task full Iris dataset
[ https://issues.apache.org/jira/browse/IGNITE-10870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-10870: - Fix Version/s: (was: 3.0) 2.10 > [ML] Add an example for KNN/LogReg and multi-class task full Iris dataset > - > > Key: IGNITE-10870 > URL: https://issues.apache.org/jira/browse/IGNITE-10870 > Project: Ignite > Issue Type: Sub-task > Components: ml >Affects Versions: 3.0 >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Minor > Fix For: 2.10 > > > Add one or two examples for KNN/LogReg on the Iris dataset with 3 classes -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-10539) [ML] Make 'with' methods consistent
[ https://issues.apache.org/jira/browse/IGNITE-10539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-10539: - Reporter: Alexey Zinoviev (was: Artem Malykh) > [ML] Make 'with' methods consistent > --- > > Key: IGNITE-10539 > URL: https://issues.apache.org/jira/browse/IGNITE-10539 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.10 > > > In some places we have 'with*' methods that make in-place changes and return > the object itself (for example MLPTrainer::withLoss), while in other places > they create new instances with the corresponding parameter changed (for > example DatasetBuilder::withFilter, > DatasetBuilder::withUpstreamTransformer). This inconsistency makes the user look > into the javadoc each time and lowers the overall consistency of the API. -- This message was sent by Atlassian Jira (v8.3.4#803005)
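The two 'with*' conventions contrasted in IGNITE-10539 can be illustrated with a small sketch. MutatingTrainer and CopyingBuilder are hypothetical stand-ins invented for this note; they do not reproduce the real MLPTrainer or DatasetBuilder code.

```java
/** Hypothetical class: 'with' method changes this instance in place and returns it. */
final class MutatingTrainer {
    private double learningRate = 0.1;

    MutatingTrainer withLearningRate(double learningRate) {
        this.learningRate = learningRate;
        return this;
    }

    double learningRate() {
        return learningRate;
    }
}

/** Hypothetical class: 'with' method leaves this instance untouched and returns a modified copy. */
final class CopyingBuilder {
    private final int limit;

    CopyingBuilder(int limit) {
        this.limit = limit;
    }

    CopyingBuilder withLimit(int newLimit) {
        return new CopyingBuilder(newLimit);
    }

    int limit() {
        return limit;
    }
}
```

A caller who discards the return value gets the change in the first style and silently loses it in the second, which is the kind of surprise the ticket wants to remove.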
[jira] [Updated] (IGNITE-10539) [ML] Make 'with' methods consistent
[ https://issues.apache.org/jira/browse/IGNITE-10539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-10539: - Fix Version/s: (was: 2.10) > [ML] Make 'with' methods consistent > --- > > Key: IGNITE-10539 > URL: https://issues.apache.org/jira/browse/IGNITE-10539 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > > In some places we have 'with*' methods that make in-place changes and return > the object itself (for example MLPTrainer::withLoss), while in other places > they create new instances with the corresponding parameter changed (for > example DatasetBuilder::withFilter, > DatasetBuilder::withUpstreamTransformer). This inconsistency makes the user look > into the javadoc each time and lowers the overall consistency of the API. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-10869) [ML] Add MultiClass classification metrics
[ https://issues.apache.org/jira/browse/IGNITE-10869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-10869: - Affects Version/s: (was: 2.9) > [ML] Add MultiClass classification metrics > -- > > Key: IGNITE-10869 > URL: https://issues.apache.org/jira/browse/IGNITE-10869 > Project: Ignite > Issue Type: Sub-task > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Minor > > Add ability to calculate multiple metrics (as binary metrics) for multiclass > classification > It can be merged with OneVsRest approach -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-10870) [ML] Add an example for KNN/LogReg and multi-class task full Iris dataset
[ https://issues.apache.org/jira/browse/IGNITE-10870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-10870: - Affects Version/s: (was: 3.0) > [ML] Add an example for KNN/LogReg and multi-class task full Iris dataset > - > > Key: IGNITE-10870 > URL: https://issues.apache.org/jira/browse/IGNITE-10870 > Project: Ignite > Issue Type: Sub-task > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Minor > Fix For: 2.10 > > > Add one or two examples for KNN/LogReg on the Iris dataset with 3 classes -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-10869) [ML] Add MultiClass classification metrics
[ https://issues.apache.org/jira/browse/IGNITE-10869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-10869: - Fix Version/s: (was: 3.0) > [ML] Add MultiClass classification metrics > -- > > Key: IGNITE-10869 > URL: https://issues.apache.org/jira/browse/IGNITE-10869 > Project: Ignite > Issue Type: Sub-task > Components: ml >Affects Versions: 2.9 >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Minor > > Add ability to calculate multiple metrics (as binary metrics) for multiclass > classification > It can be merged with OneVsRest approach -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-10870) [ML] Add an example for KNN/LogReg and multi-class task full Iris dataset
[ https://issues.apache.org/jira/browse/IGNITE-10870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-10870: - Labels: newbie (was: ) > [ML] Add an example for KNN/LogReg and multi-class task full Iris dataset > - > > Key: IGNITE-10870 > URL: https://issues.apache.org/jira/browse/IGNITE-10870 > Project: Ignite > Issue Type: Sub-task > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Minor > Labels: newbie > Fix For: 2.10 > > > Add one or two examples for KNN/LogReg on the Iris dataset with 3 classes -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-10503) [ML] Meta information for vectors
[ https://issues.apache.org/jira/browse/IGNITE-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-10503: - Reporter: Alexey Zinoviev (was: Alexey Platonov) > [ML] Meta information for vectors > - > > Key: IGNITE-10503 > URL: https://issues.apache.org/jira/browse/IGNITE-10503 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > > We need to design and implement vector meta-information like feature names, > bagging information, etc -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-9414) [ML] Using sparce vectors in Tree-based algorithms.
[ https://issues.apache.org/jira/browse/IGNITE-9414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-9414: Fix Version/s: (was: 2.10) > [ML] Using sparce vectors in Tree-based algorithms. > --- > > Key: IGNITE-9414 > URL: https://issues.apache.org/jira/browse/IGNITE-9414 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > > We need to support sparse vectors in DecisionTrees, RF, and GDB -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-9415) [ML] Using sparce vectors in LSQR and MLP
[ https://issues.apache.org/jira/browse/IGNITE-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-9415: Fix Version/s: (was: 2.10) > [ML] Using sparce vectors in LSQR and MLP > - > > Key: IGNITE-9415 > URL: https://issues.apache.org/jira/browse/IGNITE-9415 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > > We need to investigate and apply sparse vector support in BLAS for LSQR and > MLP (or implement our own version) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-10503) [ML] Meta information for vectors
[ https://issues.apache.org/jira/browse/IGNITE-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-10503: - Fix Version/s: (was: 2.10) > [ML] Meta information for vectors > - > > Key: IGNITE-10503 > URL: https://issues.apache.org/jira/browse/IGNITE-10503 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Platonov >Assignee: Alexey Zinoviev >Priority: Major > > We need to design and implement vector meta-information like feature names, > bagging information, etc -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-12396) [ML] Random Forest generates NaN for a part of models on small datasets
[ https://issues.apache.org/jira/browse/IGNITE-12396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-12396: - Affects Version/s: (was: 2.8) > [ML] Random Forest generates NaN for a part of models on small datasets > --- > > Key: IGNITE-12396 > URL: https://issues.apache.org/jira/browse/IGNITE-12396 > Project: Ignite > Issue Type: Bug > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.10 > > > @Override public Double predict(Vector features) { > double[] predictions = new double[models.size()]; > for (int i = 0; i < models.size(); i++) > predictions[i] = models.get(i).predict(features); > return predictionsAggregator.apply(predictions); > } > > predictionAggreagtor gets a lot of models and part of them returns null and > it could be aggregated, first of all handle this in Aggregator (using > threshold for amount of broken models before aggregation) also RandomForest > trees should return Double.NaN - it should fail or throw message after the > training > > I've tested with 100 or 1000 rows and it fails and doesn't fail on 10 000 rows > > RF generates a few models with one LEAF node with empty val (Double.NaN by > default) -- This message was sent by Atlassian Jira (v8.3.4#803005)
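One way to read the "threshold for amount of broken models" idea in IGNITE-12396 is sketched below. NanAwareMeanAggregator, its constructor parameter, and the choice of a mean are all assumptions made for illustration; this is not the Ignite predictions aggregator and not necessarily the fix that was applied.

```java
/** Hypothetical aggregator (not Ignite code): mean of valid predictions with a NaN threshold. */
final class NanAwareMeanAggregator {
    /** Largest tolerated fraction of NaN predictions, in [0, 1]. */
    private final double maxBrokenFraction;

    NanAwareMeanAggregator(double maxBrokenFraction) {
        this.maxBrokenFraction = maxBrokenFraction;
    }

    /** Averages the non-NaN predictions, or fails if too many models returned NaN. */
    double apply(double[] predictions) {
        int broken = 0;
        double sum = 0.0;

        for (double p : predictions) {
            if (Double.isNaN(p))
                broken++;
            else
                sum += p;
        }

        int valid = predictions.length - broken;

        if (valid == 0 || broken > maxBrokenFraction * predictions.length)
            throw new IllegalStateException(broken + " of " + predictions.length + " models returned NaN.");

        return sum / valid;
    }
}
```

With a threshold of 0.5, one NaN among three predictions is skipped and the remaining two are averaged, while two NaN values among three raise an exception instead of propagating NaN to the caller.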
[jira] [Updated] (IGNITE-12396) [ML] Random Forest generates NaN for a part of models on small datasets
[ https://issues.apache.org/jira/browse/IGNITE-12396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-12396: - Fix Version/s: (was: 3.0) 2.10 > [ML] Random Forest generates NaN for a part of models on small datasets > --- > > Key: IGNITE-12396 > URL: https://issues.apache.org/jira/browse/IGNITE-12396 > Project: Ignite > Issue Type: Bug > Components: ml >Affects Versions: 2.8 >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.10 > > > @Override public Double predict(Vector features) { > double[] predictions = new double[models.size()]; > for (int i = 0; i < models.size(); i++) > predictions[i] = models.get(i).predict(features); > return predictionsAggregator.apply(predictions); > } > > predictionAggreagtor gets a lot of models and part of them returns null and > it could be aggregated, first of all handle this in Aggregator (using > threshold for amount of broken models before aggregation) also RandomForest > trees should return Double.NaN - it should fail or throw message after the > training > > I've tested with 100 or 1000 rows and it fails and doesn't fail on 10 000 rows > > RF generates a few models with one LEAF node with empty val (Double.NaN by > default) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (IGNITE-13497) [ML] Tutorial examples fails with serialization error
Alexey Zinoviev created IGNITE-13497: Summary: [ML] Tutorial examples fails with serialization error Key: IGNITE-13497 URL: https://issues.apache.org/jira/browse/IGNITE-13497 Project: Ignite Issue Type: Bug Components: ml Reporter: Alexey Zinoviev Assignee: Alexey Zinoviev Fix For: 2.10 Cross-validation uses non-serializable functions (DoubleConsumer, etc.) in its interfaces. Add custom serializable functions and double-check all public interfaces for similar problems -- This message was sent by Atlassian Jira (v8.3.4#803005)
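The kind of "custom serializable function" IGNITE-13497 asks for can be sketched with plain JDK types. SerializableDoubleConsumer and SerializationCheck are hypothetical names for this note; the interfaces actually added to Ignite ML may differ.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.DoubleConsumer;

/** Hypothetical functional interface: a DoubleConsumer whose lambdas are serializable. */
@FunctionalInterface
interface SerializableDoubleConsumer extends DoubleConsumer, Serializable {
}

/** Hypothetical helper that tries Java serialization on an object. */
final class SerializationCheck {
    private SerializationCheck() {
    }

    /** Returns true if the object can be written with standard Java serialization. */
    static boolean isJavaSerializable(Object obj) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        }
        catch (IOException e) {
            // NotSerializableException is a subclass of IOException.
            return false;
        }
    }
}
```

A lambda assigned to a plain DoubleConsumer fails this check, while the same lambda assigned to SerializableDoubleConsumer passes, because the compiler only generates serializable lambdas when the target type extends Serializable.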
[jira] [Created] (IGNITE-13386) [ML] Add more distances between two Vectors (Part 2)
Alexey Zinoviev created IGNITE-13386: Summary: [ML] Add more distances between two Vectors (Part 2) Key: IGNITE-13386 URL: https://issues.apache.org/jira/browse/IGNITE-13386 Project: Ignite Issue Type: Sub-task Components: ml Reporter: Alexey Zinoviev Assignee: Mark Andreev Mark suggested to add more distances, below his letter about topic [http://apache-ignite-developers.2346864.n4.nabble.com/First-contribute-to-Ignite-ML-td48950.html] "Currently, Ignite supports only these distances (org.apache.ignite.ml.math.distances) : - ChebyshevDistance - CosineSimilarity - EuclideanDistance - HammingDistance - JaccardIndex - ManhattanDistance - MinkowskiDistance But in scipy ( [https://docs.scipy.org/doc/scipy/reference/spatial.distance.html]) we can find at least: - BrayCurtis - Canberra - Jensen-Shannon - Seuclidean - Weighted Minkowski I can implement those and coverage with unit tests." -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-10592) [ML] DatasetTrainer#update should be thought over.
[ https://issues.apache.org/jira/browse/IGNITE-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-10592: - Reporter: Alexey Zinoviev (was: Artem Malykh) > [ML] DatasetTrainer#update should be thought over. > -- > > Key: IGNITE-10592 > URL: https://issues.apache.org/jira/browse/IGNITE-10592 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 3.0 > > > DatasetTrainer#update was designed to contain the skeleton for updating models, > whereas the concrete behaviour of update is implemented in subclasses by > overriding this skeleton's protected components, namely > DatasetTrainer#checkState and DatasetTrainer#updateModel. > We have a problem here: if we retain the skeleton method, then it should be > final. But making it final removes the possibility of writing wrappers around > a given DatasetTrainer, because in that case we will not be able to > implement Wrapper#checkState and Wrapper#updateModel by delegation to the wrapped > object (these methods have protected access). We need wrappers for stacking > and for bagging, for example. > Now in wrappers we have two options: > 1. Override the skeleton method, but this does not seem to be a very clean solution, > since it is no longer a skeleton method and we lose the guarantee that checkState > and updateModel will be used at all; > 2. Place the wrapper in the same package as DatasetTrainer, but this forces > a not-so-good class structure. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-11192) [ML] Use nd4j for matrix inversions and determinants
[ https://issues.apache.org/jira/browse/IGNITE-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-11192: - Reporter: Alexey Zinoviev (was: Alexey Platonov) > [ML] Use nd4j for matrix inversions and determinants > > > Key: IGNITE-11192 > URL: https://issues.apache.org/jira/browse/IGNITE-11192 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > > From an optimization point of view, we should use the matrix inversion and > determinant computations of dl4j instead of our own implementation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-11192) [ML] Use nd4j for matrix inversions and determinants
[ https://issues.apache.org/jira/browse/IGNITE-11192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-11192: - Fix Version/s: (was: 3.0) > [ML] Use nd4j for matrix inversions and determinants > > > Key: IGNITE-11192 > URL: https://issues.apache.org/jira/browse/IGNITE-11192 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Platonov >Assignee: Alexey Zinoviev >Priority: Major > > From an optimization point of view, we should use the matrix inversion and > determinant computations of dl4j instead of our own implementation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-10441) Fluent API refactoring.
[ https://issues.apache.org/jira/browse/IGNITE-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-10441: - Reporter: Alexey Zinoviev (was: Artem Malykh) > Fluent API refactoring. > --- > > Key: IGNITE-10441 > URL: https://issues.apache.org/jira/browse/IGNITE-10441 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > > In many classes we have a fluent API ("with*" methods). We have the following > problem: these methods should return an instance of exactly the class they are called on > (otherwise we'll have problems with subclasses; more precisely, if a "with" > method is declared in class A and we have class B extending A, the "with" method > (if we do not override it) will return A). Currently we have opted to override > "with" methods in subclasses. There is one solution which is probably more > elegant, but involves a relatively complex generics construction which reduces > readability: > > {code:java} > class A<Self extends A<Self>> { > Self withX(X x) { > this.x = x; > > return (Self)this; > } > } > class B<Self extends B<Self>> extends A<Self> { > // No need to override "withX" here > Self withY(Y y) { > this.y = y; > > return (Self)this; > } > } > class C<Self extends C<Self>> extends B<Self> { > // No need to override "withX" and "withY" methods here. > } > //... etc > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
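A compilable variant of the self-typed generics idea from this ticket might look as follows. The class names (BaseTrainer, TreeTrainer, ForestTrainer), the fields and the self() helper are hypothetical and chosen only for illustration; they are not Ignite ML classes. The unchecked cast is confined to one place, and a concrete leaf class closes the recursion by passing itself as the type argument.

```java
/** Hypothetical demo of a fluent API that keeps the most specific type while chaining. */
final class SelfTypeDemo {
    public static void main(String[] args) {
        // Inherited "with" methods return ForestTrainer, so the chain never loses the subtype.
        ForestTrainer trainer = new ForestTrainer()
            .withSeed(42L)
            .withMaxDepth(5)
            .withTrees(10);

        System.out.println(trainer.describe());
    }
}

/** Root of the hierarchy: Self is bound to the concrete subclass. */
abstract class BaseTrainer<Self extends BaseTrainer<Self>> {
    protected long seed;

    /** The single place where the unchecked cast happens. */
    @SuppressWarnings("unchecked")
    protected Self self() {
        return (Self)this;
    }

    public Self withSeed(long seed) {
        this.seed = seed;

        return self();
    }
}

/** Intermediate class: no need to override withSeed here. */
abstract class TreeTrainer<Self extends TreeTrainer<Self>> extends BaseTrainer<Self> {
    protected int maxDepth;

    public Self withMaxDepth(int maxDepth) {
        this.maxDepth = maxDepth;

        return self();
    }
}

/** Concrete leaf class: passes itself as the type argument. */
final class ForestTrainer extends TreeTrainer<ForestTrainer> {
    private int trees;

    public ForestTrainer withTrees(int trees) {
        this.trees = trees;

        return this;
    }

    public String describe() {
        return "seed=" + seed + ", maxDepth=" + maxDepth + ", trees=" + trees;
    }
}
```

The cost mentioned in the ticket is visible here: every intermediate class has to carry the recursive type parameter, and intermediate classes cannot be instantiated directly without a raw type or an extra concrete subclass.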
[jira] [Updated] (IGNITE-10746) [ML] Participate in TensorFlow 2.0 preparation
[ https://issues.apache.org/jira/browse/IGNITE-10746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-10746: - Fix Version/s: (was: 3.0) > [ML] Participate in TensorFlow 2.0 preparation > -- > > Key: IGNITE-10746 > URL: https://issues.apache.org/jira/browse/IGNITE-10746 > Project: Ignite > Issue Type: Task > Components: ml, tensorflow >Affects Versions: 2.7 >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > > The next TensorFlow releases, starting from 2.0, introduce significant > structural changes: all code from the contribution module will be moved into > separate sub-projects. Our "TensorFlow on Apache Ignite" integration code in > the contribution module is also moving into the so-called "tensorflow/io" sub-project > (see [https://github.com/tensorflow/io]). > Almost everything related to this move has already been done by community > members. We need to check that "TensorFlow on Apache Ignite" is still working > after the move and clarify details about the "tensorflow/io" > review/build/publish procedures, including the Windows build, which is not > supported so far. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-6707) .NET: Machine learning APIs
[ https://issues.apache.org/jira/browse/IGNITE-6707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-6707: Fix Version/s: (was: 3.0) > .NET: Machine learning APIs > --- > > Key: IGNITE-6707 > URL: https://issues.apache.org/jira/browse/IGNITE-6707 > Project: Ignite > Issue Type: Improvement > Components: ml, platforms >Reporter: Pavel Tupitsyn >Assignee: Alexey Zinoviev >Priority: Major > Labels: .NET > > Propagate ML APIs to .NET (see {{modules\ml\}} in Java). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-11871) [ML] IP resolver in TensorFlow cluster manager doesn't work properly
[ https://issues.apache.org/jira/browse/IGNITE-11871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-11871: - Fix Version/s: (was: 3.0) > [ML] IP resolver in TensorFlow cluster manager doesn't work properly > > > Key: IGNITE-11871 > URL: https://issues.apache.org/jira/browse/IGNITE-11871 > Project: Ignite > Issue Type: Bug > Components: ml >Affects Versions: 2.7, 2.8 >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Critical > > TensorFlow cluster manager requires NodeId to be resolved into IP address or > hostname to pass the address/name to TensorFlow worker. Currently, it uses > strategy "return first" and returns the first available address/name. As a > result of that, in the case when the server has more than one interface > cluster resolver might work incorrectly and return different addresses/names > for the same server. > To fix this problem we need to update > [TensorFlowServerAddressSpec|https://github.com/apache/ignite/blob/master/modules/tensorflow/src/main/java/org/apache/ignite/tensorflow/cluster/spec/TensorFlowServerAddressSpec.java] > so that it returns the same address/name for the same server all the time. > If a server has multiple network interfaces we need to find a "GCD", a > network with all Ignite nodes. -- This message was sent by Atlassian Jira (v8.3.4#803005)
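One way to make the choice stable, sketched below, is to order the candidate addresses and always take the smallest one, so that the result no longer depends on enumeration order. This is an illustrative assumption and not the fix applied to TensorFlowServerAddressSpec; StableAddressPicker and pick are hypothetical names, addresses are plain strings, and the sketch does not cover the second requirement of finding a network shared by all Ignite nodes.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.Optional;

/** Hypothetical sketch: picks the same address for a server regardless of enumeration order. */
final class StableAddressPicker {
    private StableAddressPicker() {
        // Utility class, no instances.
    }

    /** Returns the lexicographically smallest address, or an empty result if none is given. */
    static Optional<String> pick(Collection<String> addrs) {
        return addrs.stream().min(Comparator.naturalOrder());
    }

    public static void main(String[] args) {
        // The same two interfaces reported in different orders yield the same answer.
        System.out.println(pick(Arrays.asList("192.168.0.5", "10.0.0.7")).orElse("none"));
        System.out.println(pick(Arrays.asList("10.0.0.7", "192.168.0.5")).orElse("none"));
    }
}
```

Lexicographic order is arbitrary but deterministic. A real implementation would more likely rank addresses by reachability from the other nodes.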
[jira] [Updated] (IGNITE-11342) [ML] Umbrella: Create a Python API for Ignite ML
[ https://issues.apache.org/jira/browse/IGNITE-11342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-11342: - Priority: Minor (was: Major) > [ML] Umbrella: Create a Python API for Ignite ML > > > Key: IGNITE-11342 > URL: https://issues.apache.org/jira/browse/IGNITE-11342 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Minor > Fix For: 3.0 > > > Currently Apache Ignite ML provides only a Java API. The most popular language > among data analysts is Python. To allow data analysts to work with Ignite ML we > need to provide a Python API. > The architecture of this Python API should be based on the > [Py4J|https://www.py4j.org/] library. This library makes it possible to start a simple > server on the Java side and then translate all calls from the Python API into calls > of the corresponding Java API, interacting with the server via TCP. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-11871) [ML] IP resolver in TensorFlow cluster manager doesn't work properly
[ https://issues.apache.org/jira/browse/IGNITE-11871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-11871: - Priority: Minor (was: Critical) > [ML] IP resolver in TensorFlow cluster manager doesn't work properly > > > Key: IGNITE-11871 > URL: https://issues.apache.org/jira/browse/IGNITE-11871 > Project: Ignite > Issue Type: Bug > Components: ml >Affects Versions: 2.7, 2.8 >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Minor > > TensorFlow cluster manager requires NodeId to be resolved into IP address or > hostname to pass the address/name to TensorFlow worker. Currently, it uses > strategy "return first" and returns the first available address/name. As a > result of that, in the case when the server has more than one interface > cluster resolver might work incorrectly and return different addresses/names > for the same server. > To fix this problem we need to update > [TensorFlowServerAddressSpec|https://github.com/apache/ignite/blob/master/modules/tensorflow/src/main/java/org/apache/ignite/tensorflow/cluster/spec/TensorFlowServerAddressSpec.java] > so that it returns the same address/name for the same server all the time. > If a server has multiple network interfaces we need to find a "GCD", a > network with all Ignite nodes. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-13344) [ML] DummyVectorizer fails to extract label for coordinate with value "0.0" when backed by sparse vector
[ https://issues.apache.org/jira/browse/IGNITE-13344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17174122#comment-17174122 ] Alexey Zinoviev commented on IGNITE-13344: -- [~thilo.ginkel] Great, thanks, I'll take into account and fix in the next release > [ML] DummyVectorizer fails to extract label for coordinate with value "0.0" > when backed by sparse vector > > > Key: IGNITE-13344 > URL: https://issues.apache.org/jira/browse/IGNITE-13344 > Project: Ignite > Issue Type: Bug > Components: ml >Affects Versions: 2.8.1 >Reporter: Thilo-Alexander Ginkel >Assignee: Alexey Zinoviev >Priority: Minor > > Given: A labeled DummyVectorizer: > > {code:java} > new DummyVectorizer() > .exclude(excludeCoordinates.stream().map(coord -> vectorLength + > coord).toArray(Integer[]::new)) > .labeled(labelCoord); > {code} > {{When extracting the label, the call hierarchy eventually ends up at > org.apache.ignite.ml.dataset.feature.extractor.impl.DummyVectorizer#feature, > which returns null for val.getRaw when val is a sparse vector with the > element at the requested label coordinate being 0.0. This causes the training > job to fail (which expects a non-null label):}} > {code:java} > org.apache.ignite.IgniteException: Remote job threw user exception (override > or implement ComputeTask.result(..) method if you would like to have > automatic failover for this exception): > nullorg.apache.ignite.IgniteException: Remote job threw user exception > (override or implement ComputeTask.result(..) 
method if you would like to > have automatic failover for this exception): null at > org.apache.ignite.compute.ComputeTaskAdapter.result(ComputeTaskAdapter.java:102) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.task.GridTaskWorker$5.apply(GridTaskWorker.java:1062) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.task.GridTaskWorker$5.apply(GridTaskWorker.java:1055) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7037) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.task.GridTaskWorker.result(GridTaskWorker.java:1055) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.task.GridTaskWorker.onResponse(GridTaskWorker.java:862) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.task.GridTaskProcessor.processJobExecuteResponse(GridTaskProcessor.java:1146) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.job.GridJobWorker.finishJob(GridJobWorker.java:961) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.job.GridJobWorker.finishJob(GridJobWorker.java:809) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:659) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:519) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) > ~[ignite-core-2.8.1.jar:2.8.1] at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > ~[na:na] at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) > ~[na:na] at java.base/java.lang.Thread.run(Thread.java:832) ~[na:na]Caused > by: org.apache.ignite.IgniteException: null at > 
org.apache.ignite.internal.processors.closure.GridClosureProcessor$C2.execute(GridClosureProcessor.java:1858) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:596) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7005) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:590) > ~[ignite-core-2.8.1.jar:2.8.1] ... 5 common frames omittedCaused by: > java.lang.NullPointerException: null at > org.apache.ignite.ml.dataset.impl.bootstrapping.BootstrappedDatasetBuilder.build(BootstrappedDatasetBuilder.java:91) > ~[ignite-ml-2.8.1.jar:2.8.1] at > org.apache.ignite.ml.dataset.impl.bootstrapping.BootstrappedDatasetBuilder.build(BootstrappedDatasetBuilder.java:41) > ~[ignite-ml-2.8.1.jar:2.8.1] at > org.apache.ignite.ml.dataset.impl.cache.util.ComputeUtils.lambda$getData$4(ComputeUtils.java:239) > ~[ignite-ml-2.8.1.jar:2.8.1] at > org.apache.ignite.ml.dataset.impl.cache.util.PartitionDataStorage.lambda$computeDataIfAbsent$1(PartitionDataStorage.java:56) > ~[ignite-ml-2.8.1.jar:2.8.1] at
[jira] [Assigned] (IGNITE-13344) [ML] DummyVectorizer fails to extract label for coordinate with value "0.0" when backed by sparse vector
[ https://issues.apache.org/jira/browse/IGNITE-13344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev reassigned IGNITE-13344: Assignee: Alexey Zinoviev > [ML] DummyVectorizer fails to extract label for coordinate with value "0.0" > when backed by sparse vector > > > Key: IGNITE-13344 > URL: https://issues.apache.org/jira/browse/IGNITE-13344 > Project: Ignite > Issue Type: Bug > Components: ml >Affects Versions: 2.8.1 >Reporter: Thilo-Alexander Ginkel >Assignee: Alexey Zinoviev >Priority: Minor > > Given: A labeled DummyVectorizer: > > {code:java} > new DummyVectorizer() > .exclude(excludeCoordinates.stream().map(coord -> vectorLength + > coord).toArray(Integer[]::new)) > .labeled(labelCoord); > {code} > {{When extracting the label, the call hierarchy eventually ends up at > org.apache.ignite.ml.dataset.feature.extractor.impl.DummyVectorizer#feature, > which returns null for val.getRaw when val is a sparse vector with the > element at the requested label coordinate being 0.0. This causes the training > job to fail (which expects a non-null label):}} > {code:java} > org.apache.ignite.IgniteException: Remote job threw user exception (override > or implement ComputeTask.result(..) method if you would like to have > automatic failover for this exception): > nullorg.apache.ignite.IgniteException: Remote job threw user exception > (override or implement ComputeTask.result(..) 
method if you would like to > have automatic failover for this exception): null at > org.apache.ignite.compute.ComputeTaskAdapter.result(ComputeTaskAdapter.java:102) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.task.GridTaskWorker$5.apply(GridTaskWorker.java:1062) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.task.GridTaskWorker$5.apply(GridTaskWorker.java:1055) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7037) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.task.GridTaskWorker.result(GridTaskWorker.java:1055) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.task.GridTaskWorker.onResponse(GridTaskWorker.java:862) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.task.GridTaskProcessor.processJobExecuteResponse(GridTaskProcessor.java:1146) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.job.GridJobWorker.finishJob(GridJobWorker.java:961) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.job.GridJobWorker.finishJob(GridJobWorker.java:809) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:659) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.job.GridJobWorker.body(GridJobWorker.java:519) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) > ~[ignite-core-2.8.1.jar:2.8.1] at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) > ~[na:na] at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) > ~[na:na] at java.base/java.lang.Thread.run(Thread.java:832) ~[na:na]Caused > by: org.apache.ignite.IgniteException: null at > 
org.apache.ignite.internal.processors.closure.GridClosureProcessor$C2.execute(GridClosureProcessor.java:1858) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.job.GridJobWorker$2.call(GridJobWorker.java:596) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:7005) > ~[ignite-core-2.8.1.jar:2.8.1] at > org.apache.ignite.internal.processors.job.GridJobWorker.execute0(GridJobWorker.java:590) > ~[ignite-core-2.8.1.jar:2.8.1] ... 5 common frames omittedCaused by: > java.lang.NullPointerException: null at > org.apache.ignite.ml.dataset.impl.bootstrapping.BootstrappedDatasetBuilder.build(BootstrappedDatasetBuilder.java:91) > ~[ignite-ml-2.8.1.jar:2.8.1] at > org.apache.ignite.ml.dataset.impl.bootstrapping.BootstrappedDatasetBuilder.build(BootstrappedDatasetBuilder.java:41) > ~[ignite-ml-2.8.1.jar:2.8.1] at > org.apache.ignite.ml.dataset.impl.cache.util.ComputeUtils.lambda$getData$4(ComputeUtils.java:239) > ~[ignite-ml-2.8.1.jar:2.8.1] at > org.apache.ignite.ml.dataset.impl.cache.util.PartitionDataStorage.lambda$computeDataIfAbsent$1(PartitionDataStorage.java:56) > ~[ignite-ml-2.8.1.jar:2.8.1] at > java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1708) > ~[na
[jira] [Commented] (IGNITE-11942) IGFS and Hadoop Accelerator Discontinuation
[ https://issues.apache.org/jira/browse/IGNITE-11942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152093#comment-17152093 ] Alexey Zinoviev commented on IGNITE-11942: -- [~alex_pl] I'm not working on it, but I was a blocker for this ticket, as I mentioned above. Currently it could be removed (if somebody does this) > IGFS and Hadoop Accelerator Discontinuation > --- > > Key: IGNITE-11942 > URL: https://issues.apache.org/jira/browse/IGNITE-11942 > Project: Ignite > Issue Type: Task >Reporter: Denis A. Magda >Assignee: Anton Kalashnikov >Priority: Blocker > Fix For: 2.9 > > > The community has voted for the following decision: > * IGFS and In-Memory Hadoop Accelerator components are to be discontinued and > no longer supported by the community > * The existing source code of IGFS and In-Memory Hadoop Accelerator is to be > removed from Ignite master. Before that, a special branch like > "ignite-igfs-and-hadoop-accelerator" is to be forked off the master in order to > preserve the sources in Git history for those who might need them. > The voting thread: > http://apache-ignite-developers.2346864.n4.nabble.com/VOTE-Complete-Discontinuation-of-IGFS-and-Hadoop-Accelerator-td42405.html > Once the changes are made for Ignite 2.8, please contact Denis Magda to > update the public documentation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IGNITE-10292) ML: Replace IGFS by model storage for TensorFlow
[ https://issues.apache.org/jira/browse/IGNITE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev resolved IGNITE-10292. -- Resolution: Not A Problem The TensorFlow component was removed from 2.8. No need to fix tickets related to TensorFlow > ML: Replace IGFS by model storage for TensorFlow > > > Key: IGNITE-10292 > URL: https://issues.apache.org/jira/browse/IGNITE-10292 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Critical > Fix For: 3.0 > > > Currently we have a TensorFlow IGFS plugin that provides file system > functionality (see > https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/ignite). > At the same time, IGFS is deprecated and it would be great to replace it with a > simple cache-based model storage. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-10292) ML: Replace IGFS by model storage for TensorFlow
[ https://issues.apache.org/jira/browse/IGNITE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17152075#comment-17152075 ] Alexey Zinoviev commented on IGNITE-10292: -- [~alex_pl] not an issue, I'll close the ticket, thanks. The IGFS could be removed > ML: Replace IGFS by model storage for TensorFlow > > > Key: IGNITE-10292 > URL: https://issues.apache.org/jira/browse/IGNITE-10292 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Critical > Fix For: 3.0 > > > Currently we have a TensorFlow IGFS plugin that provides file system > functionality (see > https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/ignite). > At the same time, IGFS is deprecated and it would be great to replace it with a > simple cache-based model storage. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IGNITE-10782) javadoc description for ml.math.exceptions.preprocessing and ml.selection.scoring.evaluator
[ https://issues.apache.org/jira/browse/IGNITE-10782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev resolved IGNITE-10782. -- Resolution: Fixed > javadoc description for ml.math.exceptions.preprocessing and > ml.selection.scoring.evaluator > --- > > Key: IGNITE-10782 > URL: https://issues.apache.org/jira/browse/IGNITE-10782 > Project: Ignite > Issue Type: Bug > Components: documentation, ml >Reporter: Stepan Pilschikov >Assignee: Alexey Zinoviev >Priority: Critical > Fix For: 2.9 > > > Need to add modules description for > - org.apache.ignite.ml.math.exceptions.preprocessing > - org.apache.ignite.ml.selection.scoring.evaluator > Located in ignite/docs/overview-summary.html -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-12274) [ML] DecisionTree works incorrectly if maxDeep > amount of features
[ https://issues.apache.org/jira/browse/IGNITE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-12274: - Affects Version/s: (was: 2.8) > [ML] DecisionTree works incorrectly if maxDeep > amount of features > --- > > Key: IGNITE-12274 > URL: https://issues.apache.org/jira/browse/IGNITE-12274 > Project: Ignite > Issue Type: Bug > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Blocker > Fix For: 2.10 > > > We have a problem in two places: > null nodes could be created in the *MeanDecisionTreeLeafBuilder.createLeafNode* > method, in the line *return aa != null ? new DecisionTreeLeafNode(aa[0]) : > null;* > Probably, this situation arises when the number of features is smaller > than maxDeep -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-12274) [ML] DecisionTree works incorrectly if maxDeep > amount of features
[ https://issues.apache.org/jira/browse/IGNITE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-12274: - Priority: Major (was: Blocker) > [ML] DecisionTree works incorrectly if maxDeep > amount of features > --- > > Key: IGNITE-12274 > URL: https://issues.apache.org/jira/browse/IGNITE-12274 > Project: Ignite > Issue Type: Bug > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.10 > > > We have a problem in two places: > null nodes could be created in the *MeanDecisionTreeLeafBuilder.createLeafNode* > method, in the line *return aa != null ? new DecisionTreeLeafNode(aa[0]) : > null;* > Probably, this situation arises when the number of features is smaller > than maxDeep -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-12274) [ML] DecisionTree works incorrectly if maxDeep > amount of features
[ https://issues.apache.org/jira/browse/IGNITE-12274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-12274: - Fix Version/s: (was: 2.9) 2.10 > [ML] DecisionTree works incorrectly if maxDeep > amount of features > --- > > Key: IGNITE-12274 > URL: https://issues.apache.org/jira/browse/IGNITE-12274 > Project: Ignite > Issue Type: Bug > Components: ml >Affects Versions: 2.8 >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Blocker > Fix For: 2.10 > > > We have a problem in two places: > null nodes could be created in the *MeanDecisionTreeLeafBuilder.createLeafNode* > method, in the line *return aa != null ? new DecisionTreeLeafNode(aa[0]) : > null;* > Probably, this situation arises when the number of features is smaller > than maxDeep -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (IGNITE-9740) [ML] Remove IgniteThread wrapper from ml unit test EvaluatorTest (follow up to IGNITE-9711)
[ https://issues.apache.org/jira/browse/IGNITE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev resolved IGNITE-9740. - Resolution: Fixed > [ML] Remove IgniteThread wrapper from ml unit test EvaluatorTest (follow up > to IGNITE-9711) > --- > > Key: IGNITE-9740 > URL: https://issues.apache.org/jira/browse/IGNITE-9740 > Project: Ignite > Issue Type: Bug > Components: ml >Reporter: Oleg Ignatenko >Assignee: Alexey Zinoviev >Priority: Critical > Fix For: 2.9 > > > [EvaluatorTest|https://github.com/apache/ignite/blob/master/modules/ml/src/test/java/org/apache/ignite/ml/selection/scoring/evaluator/EvaluatorTest.java] > involves {{IgniteThread}} which is in fact not needed there and should be > removed. > {{IgniteThread}} usage is a remainder / copy-paste from older tests and > examples that were using API requiring it. This API has been removed and > there is no need for wrapping like that anymore. For the reference on how to > perform suggested cleanup check changes made to ml examples per IGNITE-9711. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-9740) [ML] Remove IgniteThread wrapper from ml unit test EvaluatorTest (follow up to IGNITE-9711)
[ https://issues.apache.org/jira/browse/IGNITE-9740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17146194#comment-17146194 ] Alexey Zinoviev commented on IGNITE-9740: - This test was removed, no IgniteThread usage in ML tests anymore > [ML] Remove IgniteThread wrapper from ml unit test EvaluatorTest (follow up > to IGNITE-9711) > --- > > Key: IGNITE-9740 > URL: https://issues.apache.org/jira/browse/IGNITE-9740 > Project: Ignite > Issue Type: Bug > Components: ml >Reporter: Oleg Ignatenko >Assignee: Alexey Zinoviev >Priority: Critical > Fix For: 2.9 > > > [EvaluatorTest|https://github.com/apache/ignite/blob/master/modules/ml/src/test/java/org/apache/ignite/ml/selection/scoring/evaluator/EvaluatorTest.java] > involves {{IgniteThread}} which is in fact not needed there and should be > removed. > {{IgniteThread}} usage is a remainder / copy-paste from older tests and > examples that were using API requiring it. This API has been removed and > there is no need for wrapping like that anymore. For the reference on how to > perform suggested cleanup check changes made to ml examples per IGNITE-9711. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-10782) javadoc description for ml.math.exceptions.preprocessing and ml.selection.scoring.evaluator
[ https://issues.apache.org/jira/browse/IGNITE-10782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17146193#comment-17146193 ] Alexey Zinoviev commented on IGNITE-10782: -- [~spilschikov] could you please have a look at the 2.8 or 2.8.1 release? It seems that both package descriptions are in place > javadoc description for ml.math.exceptions.preprocessing and > ml.selection.scoring.evaluator > --- > > Key: IGNITE-10782 > URL: https://issues.apache.org/jira/browse/IGNITE-10782 > Project: Ignite > Issue Type: Bug > Components: documentation, ml >Reporter: Stepan Pilschikov >Assignee: Alexey Zinoviev >Priority: Critical > Fix For: 2.9 > > > Need to add modules description for > - org.apache.ignite.ml.math.exceptions.preprocessing > - org.apache.ignite.ml.selection.scoring.evaluator > Located in ignite/docs/overview-summary.html -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-12587) ML examples failed on start
[ https://issues.apache.org/jira/browse/IGNITE-12587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17146188#comment-17146188 ] Alexey Zinoviev commented on IGNITE-12587: -- [~agoncharuk] It was merged in common PR [https://github.com/apache/ignite/pull/7430] > ML examples failed on start > --- > > Key: IGNITE-12587 > URL: https://issues.apache.org/jira/browse/IGNITE-12587 > Project: Ignite > Issue Type: Bug > Components: ml >Affects Versions: 2.8 > Environment: Java 8 > Linux/Win >Reporter: Stepan Pilschikov >Assignee: Alexey Zinoviev >Priority: Blocker > Time Spent: 20m > Remaining Estimate: 0h > > The new 2.8 release build comes without the data sets for ML > Steps: > - Try to run any ML example that uses MLSandboxDatasets > (for example, org.apache.ignite.examples.ml.environment.TrainingWithCustomPreprocessorsExample) > Actual: > - FileNotFoundException > {code} > Exception in thread "main" java.io.FileNotFoundException: > modules/ml/src/main/resources/datasets/boston_housing_dataset.txt > at > org.apache.ignite.ml.util.SandboxMLCache.fillCacheWith(SandboxMLCache.java:119) > at > org.apache.ignite.examples.ml.environment.TrainingWithCustomPreprocessorsExample.main(TrainingWithCustomPreprocessorsExample.java:62) > {code} > Release build - > https://ci.ignite.apache.org/viewLog.html?buildId=4957767&buildTypeId=Releases_ApacheIgniteMain_ReleaseBuild&tab=artifacts&branch_Releases_ApacheIgniteMain=ignite-2.8 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-12673) [ML] Fix ML examples logging
[ https://issues.apache.org/jira/browse/IGNITE-12673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17146186#comment-17146186 ] Alexey Zinoviev commented on IGNITE-12673: -- It was merged to master in [https://github.com/apache/ignite/pull/7430] > [ML] Fix ML examples logging > > > Key: IGNITE-12673 > URL: https://issues.apache.org/jira/browse/IGNITE-12673 > Project: Ignite > Issue Type: Bug > Components: examples, ml >Affects Versions: 2.8 >Reporter: Stepan Pilschikov >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.8 > > Time Spent: 20m > Remaining Estimate: 0h > > A compilation of several minor fixes for ML examples: > 1. In TutorialStepByStepExample we run 17 examples. > For the first 12, the logging is pretty good and looks like "Tutorial step N: name" -> > model -> accuracy -> "Tutorial step N: completed". > But starting with step 13 this pattern is broken: the step start and step > completion messages are missing. > 2. Step_8_CV_with_Param_Grid_and_metrics_and_pipeline has no step > completion log. > 3. The completion log for Step_9_Scaling_With_Stacking looks like 'Tutorial step 5 > (scaling) example completed' -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-12657) ML examples EvaluatorExample and MultipleMetricsExample looks the same
[ https://issues.apache.org/jira/browse/IGNITE-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17146185#comment-17146185 ] Alexey Zinoviev commented on IGNITE-12657: -- It was merged to master branch here [https://github.com/apache/ignite/pull/7430] > ML examples EvaluatorExample and MultipleMetricsExample looks the same > -- > > Key: IGNITE-12657 > URL: https://issues.apache.org/jira/browse/IGNITE-12657 > Project: Ignite > Issue Type: Bug > Components: examples, ml >Affects Versions: 2.8 >Reporter: Stepan Pilschikov >Assignee: Alexey Zinoviev >Priority: Blocker > Fix For: 2.8 > > > The examples > org.apache.ignite.examples.ml.selection.scoring.EvaluatorExample > and > org.apache.ignite.examples.ml.selection.scoring.MultipleMetricsExample > look exactly the same. > I think MultipleMetricsExample is wrong because its description mentions > KNNClassificationTrainer, but the code actually uses SVMLinearClassificationTrainer -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-12658) [ML][Examples] TutorialStepByStepExample failed on cluster with more then 1 node
[ https://issues.apache.org/jira/browse/IGNITE-12658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17146182#comment-17146182 ] Alexey Zinoviev commented on IGNITE-12658: -- It was merged to master branch in this PR [https://github.com/apache/ignite/pull/7430] > [ML][Examples] TutorialStepByStepExample failed on cluster with more then 1 > node > > > Key: IGNITE-12658 > URL: https://issues.apache.org/jira/browse/IGNITE-12658 > Project: Ignite > Issue Type: Bug > Components: examples, ml >Affects Versions: 2.8 > Environment: Ubuntu/Win > Java 8 >Reporter: Stepan Pilschikov >Assignee: Alexey Zinoviev >Priority: Critical > Fix For: 2.8 > > > Steps to reproduce: > 1. Run Ignite node with org.apache.ignite.examples.ExampleNodeStartup (1 > node will be enough) > 2. Run org.apache.ignite.examples.ml.tutorial.TutorialStepByStepExample > Actual: > On Step_8_CV_with_Param_Grid_and_metrics starting to throw a lot of > exceptions > {code:java} > Train with p: 2 and maxDeep: 1 > >>> Trained model: if (x1 > 0.4368) then return 1. else return 0. > >>> Accuracy 0.7679083094555874 > >>> Test Error 0.2320916905444126 > >>> Tutorial step 8 (cross-validation) example completed. > [13:25:40] Ignite node stopped OK [uptime=00:00:17.453] > >>> Tutorial step 8 (cross-validation with param grid) example started. > [13:25:40]__ > [13:25:40] / _/ ___/ |/ / _/_ __/ __/ > [13:25:40] _/ // (7 7// / / / / _/ > [13:25:40] /___/\___/_/|_/___/ /_/ /___/ > [13:25:40] > [13:25:40] ver. 2.8.0#20200130-sha1:f478aa56 > [13:25:40] 2020 Copyright(C) Apache Software Foundation > [13:25:40] > [13:25:40] Ignite documentation: http://ignite.apache.org > [13:25:40] > [13:25:40] Quiet mode. 
> [13:25:40] ^-- Logging to file > '/opt/buildagent/work/d501ae8146bd8253/i2test/var/suite-examples/app-ignite/work/log/ignite-e156b2f2.log' > [13:25:40] ^-- Logging by 'Log4JLogger [quiet=true, config=null]' > [13:25:40] ^-- To see **FULL** console log here add -DIGNITE_QUIET=false or > "-v" to ignite.{sh|bat} > [13:25:40] > [13:25:40] OS: Linux 4.15.0-65-generic amd64 > [13:25:40] VM information: Java(TM) SE Runtime Environment 1.8.0_221-b11 > Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.221-b11 > [13:25:40] Please set system property '-Djava.net.preferIPv4Stack=true' to > avoid possible problems in mixed environments. > [13:25:40] Configured plugins: > [13:25:40] ^-- ml-inference-plugin 1.0.0 > [13:25:40] ^-- null > [13:25:40] > [13:25:40] Configured failure handler: [hnd=StopNodeOrHaltFailureHandler > [tryStop=false, timeout=0, super=AbstractFailureHandler > [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, > SYSTEM_CRITICAL_OPERATION_TIMEOUT > [13:25:40] Message queue limit is set to 0 which may lead to potential OOMEs > when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to > message queues growth on sender and receiver sides. 
> [13:25:40] Security status [authentication=off, tls/ssl=off] > [13:25:41] Performance suggestions for grid (fix if possible) > [13:25:41] To disable, set -DIGNITE_PERFORMANCE_SUGGESTIONS_DISABLED=true > [13:25:41] ^-- Disable grid events (remove 'includeEventTypes' from > configuration) > [13:25:41] ^-- Enable G1 Garbage Collector (add '-XX:+UseG1GC' to JVM > options) > [13:25:41] ^-- Set max direct memory size if getting 'OOME: Direct buffer > memory' (add '-XX:MaxDirectMemorySize=[g|G|m|M|k|K]' to JVM options) > [13:25:41] ^-- Disable processing of calls to System.gc() (add > '-XX:+DisableExplicitGC' to JVM options) > [13:25:41] Refer to this page for more performance suggestions: > https://apacheignite.readme.io/docs/jvm-and-system-tuning > [13:25:41] > [13:25:41] To start Console Management & Monitoring run > ignitevisorcmd.{sh|bat} > [13:25:41] Data Regions Configured: > [13:25:41] ^-- Default_Region [initSize=500.0 MiB, maxSize=18.9 GiB, > persistence=false, lazyMemoryAllocation=true] > [13:25:41] > [13:25:41] Ignite node started OK (id=e156b2f2) > [13:25:41] Topology snapshot [ver=20, locNode=e156b2f2, servers=2, clients=0, > state=ACTIVE, CPUs=5, offheap=38.0GB, heap=3.0GB] > [13:25:41] ^-- Baseline [id=0, size=2, online=2, offline=0] > [2020-02-11 13:25:42,428][ERROR][sys-#593][GridTaskWorker] Failed to obtain > remote job result policy for result from ComputeTask.result(..) method (will > fail the whole task): GridJobResultImpl [job=C2 > [c=o.a.i.ml.dataset.impl.cache.util.ComputeUtils$DeployableCallable@30e27659], > sib=GridJobSiblingImpl > [sesId=f9aced33071-e156b2f2-d116-4389-bd43-8536dc59, > jobId=1aaced33071-e156b2f2-d116-4389-bd43-8536dc59, > nodeId=f1135598-73c8-43
[jira] [Commented] (IGNITE-12660) [ML] The ParamGrid uses unserialized lambdas in interface to get an access to the trainer fields
[ https://issues.apache.org/jira/browse/IGNITE-12660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17146181#comment-17146181 ] Alexey Zinoviev commented on IGNITE-12660: -- [~agoncharuk] It was resolved in this PR [https://github.com/apache/ignite/pull/7430] and merged to master > [ML] The ParamGrid uses unserialized lambdas in interface to get an access to > the trainer fields > > > Key: IGNITE-12660 > URL: https://issues.apache.org/jira/browse/IGNITE-12660 > Project: Ignite > Issue Type: Bug > Components: ml >Affects Versions: 2.8 >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Blocker > Fix For: 2.8 > > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-10539) [ML] Make 'with' methods consistent
[ https://issues.apache.org/jira/browse/IGNITE-10539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-10539: - Priority: Major (was: Critical) > [ML] Make 'with' methods consistent > --- > > Key: IGNITE-10539 > URL: https://issues.apache.org/jira/browse/IGNITE-10539 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Artem Malykh >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.10 > > > In some places we have 'with*' methods that make in-place changes and return > the object itself (for example MLPTrainer::withLoss), while in other places > they create new instances with the corresponding parameter changed (for > example DatasetBuilder::withFilter, > DatasetBuilder::withUpstreamTrainsformer). This inconsistency forces users to check > the Javadoc each time and lowers the overall consistency of the API. -- This message was sent by Atlassian Jira (v8.3.4#803005)
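A minimal sketch of the two 'with*' styles that the IGNITE-10539 description contrasts. The classes MutatingTrainer and CopyingBuilder and their fields are hypothetical stand-ins, not Ignite APIs; they only mirror the behavior the ticket attributes to MLPTrainer::withLoss (in-place change) and DatasetBuilder::withFilter (new instance).

```java
/** Hypothetical trainer: its with* method mutates the receiver and returns it. */
final class MutatingTrainer {
    private double learningRate = 0.1;

    MutatingTrainer withLearningRate(double learningRate) {
        this.learningRate = learningRate; // in-place change
        return this;                      // the same instance is returned
    }

    double learningRate() {
        return learningRate;
    }
}

/** Hypothetical builder: its with* method leaves the receiver untouched and returns a copy. */
final class CopyingBuilder {
    private final int minAge;

    CopyingBuilder(int minAge) {
        this.minAge = minAge;
    }

    CopyingBuilder withMinAge(int minAge) {
        return new CopyingBuilder(minAge); // new instance with the changed parameter
    }

    int minAge() {
        return minAge;
    }
}

final class WithStylesDemo {
    public static void main(String[] args) {
        MutatingTrainer trainer = new MutatingTrainer();
        MutatingTrainer sameTrainer = trainer.withLearningRate(0.5);
        // Mutating style: the original object has changed.
        System.out.println(trainer == sameTrainer);
        System.out.println(trainer.learningRate());

        CopyingBuilder builder = new CopyingBuilder(18);
        CopyingBuilder otherBuilder = builder.withMinAge(30);
        // Copying style: the original object keeps its old value.
        System.out.println(builder == otherBuilder);
        System.out.println(builder.minAge());
    }
}
```

With the mutating style, ignoring the return value still changes the object; with the copying style, ignoring it silently discards the change. That difference is why mixing both styles under the same 'with*' prefix is easy to misuse.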
[jira] [Updated] (IGNITE-12288) [ML] Replace assert logic with exceptions
[ https://issues.apache.org/jira/browse/IGNITE-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-12288: - Priority: Minor (was: Critical) > [ML] Replace assert logic with exceptions > - > > Key: IGNITE-12288 > URL: https://issues.apache.org/jira/browse/IGNITE-12288 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Minor > Fix For: 2.10 > > > 1) Add exceptions instead of assert logic > 2) Add tests for the proposed exceptions -- This message was sent by Atlassian Jira (v8.3.4#803005)
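A minimal sketch of what "exceptions instead of assert logic" in IGNITE-12288 could look like. The class ArgumentChecks and both methods are hypothetical and are not taken from the Ignite ML code base; the exception type is an assumption as well.

```java
final class ArgumentChecks {
    /** Assert-based check: skipped entirely unless the JVM is started with -ea. */
    static double meanWithAssert(double[] values) {
        assert values.length > 0 : "values must not be empty";
        double sum = 0.0;
        for (double v : values)
            sum += v;
        // With assertions disabled, an empty array reaches this line and yields NaN.
        return sum / values.length;
    }

    /** Exception-based check: enforced regardless of JVM flags. */
    static double meanWithException(double[] values) {
        if (values == null || values.length == 0)
            throw new IllegalArgumentException("values must not be empty");
        double sum = 0.0;
        for (double v : values)
            sum += v;
        return sum / values.length;
    }

    public static void main(String[] args) {
        System.out.println(meanWithException(new double[] {1.0, 2.0, 3.0}));
        try {
            meanWithException(new double[0]);
        }
        catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The second point of the ticket (tests for the proposed exceptions) then amounts to asserting that invalid input raises the expected exception type.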
[jira] [Updated] (IGNITE-12054) [Umbrella][Spark] Upgrade Spark module to 2.4
[ https://issues.apache.org/jira/browse/IGNITE-12054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-12054: - Priority: Major (was: Blocker) > [Umbrella][Spark] Upgrade Spark module to 2.4 > - > > Key: IGNITE-12054 > URL: https://issues.apache.org/jira/browse/IGNITE-12054 > Project: Ignite > Issue Type: New Feature > Components: spark >Reporter: Denis A. Magda >Assignee: Alexey Zinoviev >Priority: Major > Labels: important > Fix For: 3.0 > > Attachments: ignite-spark-patch-new.diff > > Time Spent: 0.5h > Remaining Estimate: 0h > > Users can't use APIs that are already available in Spark 2.4: > https://stackoverflow.com/questions/57392143/persisting-spark-dataframe-to-ignite > Let's upgrade Spark from 2.3 to 2.4 until we extract the Spark Integration as > a separate module that can support multiple Spark versions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-6642) [Umbrella] Model export/import to PMML and custom JSON format
[ https://issues.apache.org/jira/browse/IGNITE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-6642: Fix Version/s: (was: 2.9) 2.10 > [Umbrella] Model export/import to PMML and custom JSON format > - > > Key: IGNITE-6642 > URL: https://issues.apache.org/jira/browse/IGNITE-6642 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.10 > > > > We need to be able to export/import Ignite model versions across clusters > with different versions and have exchangable & human-readable format for > inference with different systems like scikit-learn, Spark ML and etc > The PMML format is a good choice here: > PMML - Predictive Model Markup Language is XML based language which used in > SPARK MLlib and others platforms. > Here some additional info about PMML: > (i) [http://dmg.org/pmml/v4-3/GeneralStructure.html] > (i) [https://github.com/jpmml/jpmml-model] > > But PMML has limitation support for Ensembles like Random Forest, Gradient > Boosted Trees, Stacking, Bagging and so on. > These cases could be covered with our own JSON format which could be easily > parsed in another system. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-6642) [Umbrella] Model export/import to PMML and custom JSON format
[ https://issues.apache.org/jira/browse/IGNITE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-6642: Description: We need to be able to export/import Ignite model versions across clusters with different versions and have exchangable & human-readable format for inference with different systems like scikit-learn, Spark ML and etc The PMML format is a good choice here: PMML - Predictive Model Markup Language is XML based language which used in SPARK MLlib and others platforms. Here some additional info about PMML: (i) [http://dmg.org/pmml/v4-3/GeneralStructure.html] (i) [https://github.com/jpmml/jpmml-model] But PMML has limitation support for Ensembles like Random Forest, Gradient Boosted Trees, Stacking, Bagging and so on. These cases could be covered with our own JSON format which could be easily parsed in another system. was: PMML - Predictive Model Markup Language is XML based language which used in SPARK MLlib and others platforms. Here some additional info about PMML: (i) [http://dmg.org/pmml/v4-3/GeneralStructure.html] (i) [https://github.com/jpmml/jpmml-model] > [Umbrella] Model export/import to PMML and custom JSON format > - > > Key: IGNITE-6642 > URL: https://issues.apache.org/jira/browse/IGNITE-6642 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.9 > > > > We need to be able to export/import Ignite model versions across clusters > with different versions and have exchangable & human-readable format for > inference with different systems like scikit-learn, Spark ML and etc > The PMML format is a good choice here: > PMML - Predictive Model Markup Language is XML based language which used in > SPARK MLlib and others platforms. 
> Here some additional info about PMML: > (i) [http://dmg.org/pmml/v4-3/GeneralStructure.html] > (i) [https://github.com/jpmml/jpmml-model] > > But PMML has limitation support for Ensembles like Random Forest, Gradient > Boosted Trees, Stacking, Bagging and so on. > These cases could be covered with our own JSON format which could be easily > parsed in another system. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-6642) [Umbrella] Model export/import to PMML and custom JSON format
[ https://issues.apache.org/jira/browse/IGNITE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-6642: Description: PMML - Predictive Model Markup Language is XML based language which used in SPARK MLlib and others platforms. Here some additional info about PMML: (i) [http://dmg.org/pmml/v4-3/GeneralStructure.html] (i) [https://github.com/jpmml/jpmml-model] was: PMML - Predictive Model Markup Language is XML based language which used in SPARK MLlib and others platforms. Here some additional info about PMML: (i) http://dmg.org/pmml/v4-3/GeneralStructure.html (i) https://github.com/jpmml/jpmml-model > [Umbrella] Model export/import to PMML and custom JSON format > - > > Key: IGNITE-6642 > URL: https://issues.apache.org/jira/browse/IGNITE-6642 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.9 > > > > > > PMML - Predictive Model Markup Language is XML based language which used in > SPARK MLlib and others platforms. > Here some additional info about PMML: > (i) [http://dmg.org/pmml/v4-3/GeneralStructure.html] > (i) [https://github.com/jpmml/jpmml-model] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-12337) [ML] Redesign the package structure
[ https://issues.apache.org/jira/browse/IGNITE-12337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-12337: - Fix Version/s: (was: 2.9) 2.10 > [ML] Redesign the package structure > --- > > Key: IGNITE-12337 > URL: https://issues.apache.org/jira/browse/IGNITE-12337 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Minor > Fix For: 2.10 > > > The problem is as follows: many classes and algorithms are not located in > appropriate places and are not grouped into high-level packages -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-7593) Improve data used in DecisionTreesExample
[ https://issues.apache.org/jira/browse/IGNITE-7593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-7593: Fix Version/s: (was: 2.9) > Improve data used in DecisionTreesExample > - > > Key: IGNITE-7593 > URL: https://issues.apache.org/jira/browse/IGNITE-7593 > Project: Ignite > Issue Type: Task > Components: ml >Reporter: Oleg Ignatenko >Assignee: Alexey Zinoviev >Priority: Minor > > Data currently used in {{DecisionTreesExample}} looks not quite optimal: > # It is large, as evidenced in the warning in javadocs: "It is recommended to > start at least one node prior to launching this example if you intend to run > it with default memory settings." > # It makes example run for quite a long time. > # It doesn't have license (likely meaning "all rights reserved" by default) > which makes it troublesome to include in project sources so that current > approach is to prompt user to download it, additionally complicated by making > example skip when run unattended from {{IgniteExamplesMLTestSuite}}. > Suggest to find or construct a smaller data for this example which would > still make sense to demonstrate how algorithm works and in the same time > would be 1) easier on memory usage, 2) quicker to run and 3) would allow > carrying it within project instead of prompting user to download it. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-12331) [ML] ML Preprocessing doesn't work on SQL Tables
[ https://issues.apache.org/jira/browse/IGNITE-12331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-12331: - Fix Version/s: (was: 2.9) 2.10 > [ML] ML Preprocessing doesn't work on SQL Tables > > > Key: IGNITE-12331 > URL: https://issues.apache.org/jira/browse/IGNITE-12331 > Project: Ignite > Issue Type: Bug > Components: ml >Affects Versions: 2.8 >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.10 > > > {code:java} > /* > * Licensed to the Apache Software Foundation (ASF) under one or more > * contributor license agreements. See the NOTICE file distributed with > * this work for additional information regarding copyright ownership. > * The ASF licenses this file to You under the Apache License, Version 2.0 > * (the "License"); you may not use this file except in compliance with > * the License. You may obtain a copy of the License at > * > * http://www.apache.org/licenses/LICENSE-2.0 > * > * Unless required by applicable law or agreed to in writing, software > * distributed under the License is distributed on an "AS IS" BASIS, > * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. > * See the License for the specific language governing permissions and > * limitations under the License. 
> */ > package org.apache.ignite.examples.ml.tutorial.sql; > import java.util.List; > import org.apache.ignite.Ignite; > import org.apache.ignite.IgniteCache; > import org.apache.ignite.Ignition; > import org.apache.ignite.cache.query.QueryCursor; > import org.apache.ignite.cache.query.SqlFieldsQuery; > import org.apache.ignite.configuration.CacheConfiguration; > import org.apache.ignite.internal.util.IgniteUtils; > import org.apache.ignite.ml.dataset.feature.extractor.Vectorizer; > import > org.apache.ignite.ml.dataset.feature.extractor.impl.BinaryObjectVectorizer; > import org.apache.ignite.ml.math.primitives.vector.Vector; > import org.apache.ignite.ml.math.primitives.vector.VectorUtils; > import org.apache.ignite.ml.preprocessing.Preprocessor; > import org.apache.ignite.ml.preprocessing.minmaxscaling.MinMaxScalerTrainer; > import org.apache.ignite.ml.preprocessing.normalization.NormalizationTrainer; > import org.apache.ignite.ml.sql.SqlDatasetBuilder; > import org.apache.ignite.ml.tree.DecisionTreeClassificationTrainer; > import org.apache.ignite.ml.tree.DecisionTreeNode; > /** > * Example of using distributed {@link DecisionTreeClassificationTrainer} on > a data stored in SQL table. > */ > public class PreprocessingAndTrainingSQLTableExample { > /** > * Dummy cache name. > */ > private static final String DUMMY_CACHE_NAME = "dummy_cache"; > /** > * Training data. > */ > private static final String TRAIN_DATA_RES = > "examples/src/main/resources/datasets/titanic_train.csv"; > /** > * Test data. > */ > private static final String TEST_DATA_RES = > "examples/src/main/resources/datasets/titanic_test.csv"; > /** > * Run example. > */ > public static void main(String[] args) { > System.out.println(">>> Decision tree classification trainer example > started."); > // Start ignite grid. > try (Ignite ignite = > Ignition.start("examples/config/example-ignite.xml")) { > System.out.println(">>> Ignite grid started."); > // Dummy cache is required to perform SQL queries. 
> CacheConfiguration cacheCfg = new > CacheConfiguration<>(DUMMY_CACHE_NAME) > .setSqlSchema("PUBLIC"); > IgniteCache cache = null; > try { > cache = ignite.getOrCreateCache(cacheCfg); > System.out.println(">>> Creating table with training > data..."); > cache.query(new SqlFieldsQuery("create table titanic_train > (\n" + > "passengerid int primary key,\n" + > "survived int,\n" + > "pclass int,\n" + > "name varchar(255),\n" + > "sex varchar(255),\n" + > "age float,\n" + > "sibsp int,\n" + > "parch int,\n" + > "ticket varchar(255),\n" + > "fare float,\n" + > "cabin varchar(255),\n" + > "embarked varchar(255)\n" + > ") with \"template=partitioned\";")).getAll(); > System.out.println(">>> Filling training data..."); > cache.query(new SqlFieldsQuery("insert into titanic_train > select * from csvread('" + > > IgniteUtils.resolveI
[jira] [Updated] (IGNITE-12685) [ML] [Umbrella] Unify Preprocessors and Pipeline approaches to collect common statistics
[ https://issues.apache.org/jira/browse/IGNITE-12685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-12685: - Fix Version/s: (was: 2.9) 2.10 > [ML] [Umbrella] Unify Preprocessors and Pipeline approaches to collect > common statistics > -- > > Key: IGNITE-12685 > URL: https://issues.apache.org/jira/browse/IGNITE-12685 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.10 > > > In the current implementation, Cross-Validation behaves differently when run > on the experimental Pipeline and on a chain of Preprocessors. > > See tutorial steps 8 CV_Param_Grid and 8_CV_Param_Grid_and_pipeline. > In the first example, all preprocessors are fitted on the whole dataset and don't > use the train/test filter (due to the limited API in preprocessors), so they collect > statistics on the whole initial dataset. > > In the second example, we have honest re-fitting on each cross-validation > fold: three times with three different sets of statistics. As a result, we could get > different encoding values or Max/Min values for each column, and so on. > > We should investigate this question and stay consistent with the most popular > approaches. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
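A minimal sketch of the difference IGNITE-12685 describes, using plain arrays rather than Ignite classes. FoldScalingSketch, fitMinMax and trainPart are hypothetical names, and the index-modulo fold split is an assumption made only to keep the example short; it is not how Ignite's cross-validation assigns rows to folds.

```java
import java.util.Arrays;

final class FoldScalingSketch {
    /** Fits min-max statistics: returns {min, max} of the given values. */
    static double[] fitMinMax(double[] values) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return new double[] {min, max};
    }

    /** Training part of a fold: every row whose index modulo 'folds' differs from 'fold'. */
    static double[] trainPart(double[] values, int fold, int folds) {
        double[] buf = new double[values.length];
        int n = 0;
        for (int i = 0; i < values.length; i++) {
            if (i % folds != fold)
                buf[n++] = values[i];
        }
        return Arrays.copyOf(buf, n);
    }

    public static void main(String[] args) {
        double[] feature = {1, 5, 9, 2, 100, 3};

        // Chain-of-preprocessors behavior: statistics are fitted once on the whole dataset.
        System.out.println("whole dataset: " + Arrays.toString(fitMinMax(feature)));

        // Pipeline behavior: statistics are re-fitted on the training part of each fold.
        for (int fold = 0; fold < 3; fold++) {
            double[] stats = fitMinMax(trainPart(feature, fold, 3));
            System.out.println("fold " + fold + ": " + Arrays.toString(stats));
        }
    }
}
```

In this toy data the fold that holds out the value 100 fits a maximum of 9 instead of 100, so the same raw value is scaled differently depending on which approach is used.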
[jira] [Updated] (IGNITE-10426) [ML] Spread parameter isKeepRawLabels across all models
[ https://issues.apache.org/jira/browse/IGNITE-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-10426: - Fix Version/s: (was: 2.9) 2.10 > [ML] Spread parameter isKeepRawLabels across all models > --- > > Key: IGNITE-10426 > URL: https://issues.apache.org/jira/browse/IGNITE-10426 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.10 > > > Currently, a few models have the isKeepRawLabels parameter and a threshold to > change the predicted value to one of the class labels 1 or 0. > Discuss this on the dev list and think about how to solve this task to optimize > MultiClassModel > Possible solutions: > * add these methods to the common model > * add this method to MultiClassModel and use reflection to check this > parameter, for example in the apply method -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-12079) [ML][Umbrella] Add advanced preprocessing techniques
[ https://issues.apache.org/jira/browse/IGNITE-12079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-12079: - Fix Version/s: (was: 2.9) 2.10 > [ML][Umbrella] Add advanced preprocessing techniques > > > Key: IGNITE-12079 > URL: https://issues.apache.org/jira/browse/IGNITE-12079 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.10 > > > *Main goal:* > To reduce the gap between Apache Spark and Apache Ignite in preprocessing > operations. The reducing of the gap could help with loading Spark ML > Pipelines to Ignite ML. > > Next steps: > # Add Frequency Encoder > # Add two Imputing Strategies (MIN, MAX, COUNT, MOST_FREQUENT, > LEAST_FREQUENT) > # Add RobustScaler (will be added in Spark 3.0) > # Add CountVectorizer > # Add FeatureHasher > # Add QuantileDiscretizer > # Add Locality Sensitive Hashing (LSH) > # Add LabelEncoder > # Add RevertStringIndexing > # Add multi-column preprocessor -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-10539) [ML] Make 'with' methods consistent
[ https://issues.apache.org/jira/browse/IGNITE-10539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-10539: - Fix Version/s: (was: 2.9) 2.10 > [ML] Make 'with' methods consistent > --- > > Key: IGNITE-10539 > URL: https://issues.apache.org/jira/browse/IGNITE-10539 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Artem Malykh >Assignee: Alexey Zinoviev >Priority: Critical > Fix For: 2.10 > > > In some places we have 'with*' methods that make in-place changes and return > the object itself (for example MLPTrainer::withLoss), while in other places > they create new instances with the corresponding parameter changed (for > example DatasetBuilder::withFilter, > DatasetBuilder::withUpstreamTrainsformer). This inconsistency forces users to check > the Javadoc each time and lowers the overall consistency of the API. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-9414) [ML] Using sparse vectors in Tree-based algorithms.
[ https://issues.apache.org/jira/browse/IGNITE-9414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-9414: Fix Version/s: (was: 2.9) 2.10 > [ML] Using sparse vectors in Tree-based algorithms. > --- > > Key: IGNITE-9414 > URL: https://issues.apache.org/jira/browse/IGNITE-9414 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.10 > > > We need to support sparse vectors in DecisionTrees, RF, GDB -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-9415) [ML] Using sparse vectors in LSQR and MLP
[ https://issues.apache.org/jira/browse/IGNITE-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-9415: Fix Version/s: (was: 2.9) 2.10 > [ML] Using sparse vectors in LSQR and MLP > - > > Key: IGNITE-9415 > URL: https://issues.apache.org/jira/browse/IGNITE-9415 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.10 > > > We need to investigate and apply sparse vector support in BLAS for LSQR and > MLP (or implement our own version) -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-12288) [ML] Replace assert logic with exceptions
[ https://issues.apache.org/jira/browse/IGNITE-12288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-12288: - Fix Version/s: (was: 2.9) 2.10 > [ML] Replace assert logic with exceptions > - > > Key: IGNITE-12288 > URL: https://issues.apache.org/jira/browse/IGNITE-12288 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Critical > Fix For: 2.10 > > > 1) Add exceptions instead of assert logic > 2) Add tests for the proposed exceptions -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-11664) [ML] Use Double.NaN as default values for missing values in Vector
[ https://issues.apache.org/jira/browse/IGNITE-11664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-11664: - Fix Version/s: (was: 2.9) 2.10 > [ML] Use Double.NaN as default values for missing values in Vector > -- > > Key: IGNITE-11664 > URL: https://issues.apache.org/jira/browse/IGNITE-11664 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Labels: stability > Fix For: 2.10 > > > Currently, we use 0.0 as the default value in vectors if a value is > missing. This contradicts the preprocessors' policy, where Double.NaN is used > for missing values. Moreover, Double.NaN is a more convenient value > for missing feature values. -- This message was sent by Atlassian Jira (v8.3.4#803005)
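A minimal sketch of why the choice between 0.0 and Double.NaN matters for IGNITE-11664, using plain double arrays instead of the Ignite Vector type. MissingValueSketch and both methods are hypothetical names introduced only for illustration.

```java
final class MissingValueSketch {
    /** Mean over every slot; a missing value stored as 0.0 is counted as a real observation. */
    static double meanAll(double[] values) {
        double sum = 0.0;
        for (double v : values)
            sum += v;
        return sum / values.length;
    }

    /** Mean over present values only; Double.NaN marks a missing value and is skipped. */
    static double meanIgnoringNaN(double[] values) {
        double sum = 0.0;
        int n = 0;
        for (double v : values) {
            if (!Double.isNaN(v)) {
                sum += v;
                n++;
            }
        }
        return n == 0 ? Double.NaN : sum / n;
    }

    public static void main(String[] args) {
        // Ages of three passengers; the second one is unknown.
        double[] zeroFilled = {20.0, 0.0, 40.0};
        double[] nanFilled = {20.0, Double.NaN, 40.0};

        System.out.println(meanAll(zeroFilled));        // the missing age drags the mean down
        System.out.println(meanIgnoringNaN(nanFilled)); // the missing age is recognizable and skipped
    }
}
```

With 0.0 as the default, a missing value cannot be told apart from a genuine zero, so a later imputation step has nothing to detect; with Double.NaN it stays distinguishable.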
[jira] [Updated] (IGNITE-10503) [ML] Meta information for vectors
[ https://issues.apache.org/jira/browse/IGNITE-10503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-10503: - Fix Version/s: (was: 2.9) 2.10 > [ML] Meta information for vectors > - > > Key: IGNITE-10503 > URL: https://issues.apache.org/jira/browse/IGNITE-10503 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Platonov >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.10 > > > We need to design and implement vector meta-information like feature names, > bagging information, etc -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-6642) [Umbrella] Model export/import to PMML and custom JSON format
[ https://issues.apache.org/jira/browse/IGNITE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-6642: Priority: Major (was: Minor) > [Umbrella] Model export/import to PMML and custom JSON format > - > > Key: IGNITE-6642 > URL: https://issues.apache.org/jira/browse/IGNITE-6642 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.9 > > > PMML - Predictive Model Markup Language is XML based language which used in > SPARK MLlib and others platforms. > Here some additional info about PMML: > (i) http://dmg.org/pmml/v4-3/GeneralStructure.html > (i) https://github.com/jpmml/jpmml-model -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-6642) [Umbrella] Model export/import to PMML and custom JSON format
[ https://issues.apache.org/jira/browse/IGNITE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-6642: Summary: [Umbrella] Model export/import to PMML and custom JSON format (was: [Umbrella] Integration with PMML) > [Umbrella] Model export/import to PMML and custom JSON format > - > > Key: IGNITE-6642 > URL: https://issues.apache.org/jira/browse/IGNITE-6642 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Minor > Fix For: 2.9 > > > PMML - Predictive Model Markup Language is XML based language which used in > SPARK MLlib and others platforms. > Here some additional info about PMML: > (i) http://dmg.org/pmml/v4-3/GeneralStructure.html > (i) https://github.com/jpmml/jpmml-model -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-6642) [Umbrella] Integration with PMML
[ https://issues.apache.org/jira/browse/IGNITE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-6642: Reporter: Alexey Zinoviev (was: Yury Babak) > [Umbrella] Integration with PMML > > > Key: IGNITE-6642 > URL: https://issues.apache.org/jira/browse/IGNITE-6642 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Minor > Fix For: 2.9 > > > PMML - Predictive Model Markup Language is XML based language which used in > SPARK MLlib and others platforms. > Here some additional info about PMML: > (i) http://dmg.org/pmml/v4-3/GeneralStructure.html > (i) https://github.com/jpmml/jpmml-model -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-12716) Ignite support for spark-3.0.0
[ https://issues.apache.org/jira/browse/IGNITE-12716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17143850#comment-17143850 ] Alexey Zinoviev commented on IGNITE-12716: -- Dear [~shensonj], I'm not going to release anything for Spark support; I hope another contributor can help. > Ignite support for spark-3.0.0 > -- > > Key: IGNITE-12716 > URL: https://issues.apache.org/jira/browse/IGNITE-12716 > Project: Ignite > Issue Type: Bug > Components: spark >Affects Versions: 2.7, 2.8, 2.7.5, 2.7.6 >Reporter: Shenson Joseph >Priority: Blocker > > Ignite support for spark-3.0.0 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-12903) Fix ML + SQL examples
[ https://issues.apache.org/jira/browse/IGNITE-12903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-12903: - Fix Version/s: 2.9 > Fix ML + SQL examples > - > > Key: IGNITE-12903 > URL: https://issues.apache.org/jira/browse/IGNITE-12903 > Project: Ignite > Issue Type: Task > Components: examples, ml >Reporter: Taras Ledkov >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 2.9 > > > The examples > {{DecisionTreeClassificationTrainerSQLInferenceExample}} > {{DecisionTreeClassificationTrainerSQLTableExample}} > use the CSVREAD function for the initial data load into the cluster. > They must be changed because this function is disabled by default. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-12903) Fix ML + SQL examples
[ https://issues.apache.org/jira/browse/IGNITE-12903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136757#comment-17136757 ] Alexey Zinoviev commented on IGNITE-12903: -- [~tledkov-gridgain] Great, I'd prefer the third example, as in the other examples, I suppose. I'll wait for cool implementations; I reassigned the ticket to myself and will fix it for 2.9. > Fix ML + SQL examples > - > > Key: IGNITE-12903 > URL: https://issues.apache.org/jira/browse/IGNITE-12903 > Project: Ignite > Issue Type: Task > Components: examples, ml >Reporter: Taras Ledkov >Assignee: Alexey Zinoviev >Priority: Major > > The examples > {{DecisionTreeClassificationTrainerSQLInferenceExample}} > {{DecisionTreeClassificationTrainerSQLTableExample}} > use the CSVREAD function for the initial data load into the cluster. > They must be changed because this function is disabled by default. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (IGNITE-12903) Fix ML + SQL examples
[ https://issues.apache.org/jira/browse/IGNITE-12903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev reassigned IGNITE-12903: Assignee: Alexey Zinoviev (was: Taras Ledkov) > Fix ML + SQL examples > - > > Key: IGNITE-12903 > URL: https://issues.apache.org/jira/browse/IGNITE-12903 > Project: Ignite > Issue Type: Task > Components: examples >Reporter: Taras Ledkov >Assignee: Alexey Zinoviev >Priority: Major > > The examples > {{DecisionTreeClassificationTrainerSQLInferenceExample}} > {{DecisionTreeClassificationTrainerSQLTableExample}} > use the CSVREAD function for the initial data load into the cluster. > They must be changed because this function is disabled by default. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-12903) Fix ML + SQL examples
[ https://issues.apache.org/jira/browse/IGNITE-12903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-12903: - Component/s: ml > Fix ML + SQL examples > - > > Key: IGNITE-12903 > URL: https://issues.apache.org/jira/browse/IGNITE-12903 > Project: Ignite > Issue Type: Task > Components: examples, ml >Reporter: Taras Ledkov >Assignee: Alexey Zinoviev >Priority: Major > > The examples > {{DecisionTreeClassificationTrainerSQLInferenceExample}} > {{DecisionTreeClassificationTrainerSQLTableExample}} > use the CSVREAD function for the initial data load into the cluster. > They must be changed because this function is disabled by default. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (IGNITE-12903) Fix ML + SQL examples
[ https://issues.apache.org/jira/browse/IGNITE-12903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17136508#comment-17136508 ] Alexey Zinoviev commented on IGNITE-12903: -- [~tledkov-gridgain] What is the best way to fix this? Should we enable this function manually (could you suggest how, here in the comments), or is it better to populate the cache manually rather than from CSV? What do you think? > Fix ML + SQL examples > - > > Key: IGNITE-12903 > URL: https://issues.apache.org/jira/browse/IGNITE-12903 > Project: Ignite > Issue Type: Task > Components: examples >Reporter: Taras Ledkov >Assignee: Taras Ledkov >Priority: Major > > The examples > {{DecisionTreeClassificationTrainerSQLInferenceExample}} > {{DecisionTreeClassificationTrainerSQLTableExample}} > use the CSVREAD function for the initial data load into the cluster. > They must be changed because this function is disabled by default. -- This message was sent by Atlassian Jira (v8.3.4#803005)
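The second option raised in the comment above, populating the cache manually instead of relying on CSVREAD, might look roughly like the sketch below. It is not taken from the ticket or from the Ignite examples: the class name ManualLoadSketch, its methods, and the sample values are all hypothetical, and a plain Map stands in for the cluster so the code runs with the JDK alone. In a real example the sink would presumably be a cache put or an SQL INSERT.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

/**
 * Hypothetical sketch: load rows in application code rather than via SQL CSVREAD.
 * The sink is an assumption; here it is a plain Map, not an Ignite cache.
 */
public class ManualLoadSketch {
    /** Parses comma-separated numeric lines into rows, skipping blank lines. */
    static List<double[]> parseRows(List<String> lines) {
        List<double[]> rows = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty())
                continue;
            String[] cells = line.split(",");
            double[] row = new double[cells.length];
            for (int i = 0; i < cells.length; i++)
                row[i] = Double.parseDouble(cells[i].trim());
            rows.add(row);
        }
        return rows;
    }

    /** Hands every row to the sink under a sequential integer key; returns the row count. */
    static int load(List<double[]> rows, BiConsumer<Integer, double[]> sink) {
        int key = 0;
        for (double[] row : rows)
            sink.accept(key++, row);
        return key;
    }

    public static void main(String[] args) {
        // Made-up sample lines; a real example would read its own data set.
        List<String> lines = Arrays.asList("1.0, 22.0, 7.25", "0.0, 38.0, 71.28");
        Map<Integer, double[]> store = new HashMap<>();
        int n = load(parseRows(lines), store::put);
        System.out.println("Loaded rows: " + n);
    }
}
```

Keeping the sink behind a BiConsumer means the parsing step does not depend on how the data reaches the cluster, so the same loader could feed either a key-value put or a parameterized INSERT.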
[jira] [Updated] (IGNITE-8451) [ML] Refactor Labeled Dataset: remove unused methods and fields
[ https://issues.apache.org/jira/browse/IGNITE-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-8451: Fix Version/s: (was: 2.9) 3.0 > [ML] Refactor Labeled Dataset: remove unused methods and fields > --- > > Key: IGNITE-8451 > URL: https://issues.apache.org/jira/browse/IGNITE-8451 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 3.0 > > > Remove > * loading from file > * distributed version (we need local version only) > * parent class Dataset and meta-information -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (IGNITE-12432) [Spark] Need to add test for AVG function in IgniteOptimizationAggregationFuncSpec
[ https://issues.apache.org/jira/browse/IGNITE-12432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Zinoviev updated IGNITE-12432: - Fix Version/s: (was: 2.9) 3.0 > [Spark] Need to add test for AVG function in > IgniteOptimizationAggregationFuncSpec > -- > > Key: IGNITE-12432 > URL: https://issues.apache.org/jira/browse/IGNITE-12432 > Project: Ignite > Issue Type: Test > Components: spark >Reporter: Alexey Zinoviev >Assignee: Alexey Zinoviev >Priority: Major > Fix For: 3.0 > > > The test is skipped with a "TODO: write me" comment: > it("AVG - DECIMAL") { > //TODO: write me > } > The fix should be merged for Spark 2.3 and 2.4 together. -- This message was sent by Atlassian Jira (v8.3.4#803005)