[jira] [Created] (IGNITE-9587) [ML] Umbrella ticket: Handle different labels in training data and handle unknown labels in test or updated training data correctly
Aleksey Zinoviev created IGNITE-9587:

    Summary: [ML] Umbrella ticket: Handle different labels in training data and handle unknown labels in test or updated training data correctly
    Key: IGNITE-9587
    URL: https://issues.apache.org/jira/browse/IGNITE-9587
    Project: Ignite
    Issue Type: New Feature
    Reporter: Aleksey Zinoviev
    Assignee: Aleksey Zinoviev

The problem is that all binary classification algorithms expect datasets labeled with 0/1 and predict 0/1 labels, with no special mapping for other label values. The algorithms also do not handle unknown labels that appear during the updating and testing phases.

Possible solution: the mapping could be stored in the context of ML training.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
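The label mapping described in this ticket can be illustrated with a minimal sketch. It assumes exactly two distinct numeric labels; the class BinaryLabelMapper and its methods are hypothetical and are not part of the Ignite ML API. An unknown label is reported as an empty result rather than silently mapped.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalDouble;

/** Hypothetical helper (not an Ignite class): maps two arbitrary labels to 0/1 and back. */
class BinaryLabelMapper {
    private final Map<Double, Double> toInternal = new HashMap<>();
    private final Map<Double, Double> toExternal = new HashMap<>();

    BinaryLabelMapper(double negativeLabel, double positiveLabel) {
        if (negativeLabel == positiveLabel)
            throw new IllegalArgumentException("Labels must be distinct");
        toInternal.put(negativeLabel, 0.0);
        toInternal.put(positiveLabel, 1.0);
        toExternal.put(0.0, negativeLabel);
        toExternal.put(1.0, positiveLabel);
    }

    /** Returns the internal 0/1 label, or empty if the label was not seen during training. */
    OptionalDouble encode(double externalLabel) {
        Double v = toInternal.get(externalLabel);
        return v == null ? OptionalDouble.empty() : OptionalDouble.of(v);
    }

    /** Converts a predicted 0/1 label back to the label used in the original dataset. */
    double decode(double internalLabel) {
        Double v = toExternal.get(internalLabel);
        if (v == null)
            throw new IllegalArgumentException("Not a 0/1 label: " + internalLabel);
        return v;
    }
}
```

With labels -1 and 1, encode(-1.0) yields 0.0, decode(1.0) yields 1.0, and encode(7.0) is empty, which leaves the caller to decide how to treat an unknown label.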
[jira] [Created] (IGNITE-9582) Document Model Updating
Aleksey Zinoviev created IGNITE-9582: Summary: Document Model Updating Key: IGNITE-9582 URL: https://issues.apache.org/jira/browse/IGNITE-9582 Project: Ignite Issue Type: Task Components: documentation, ml Reporter: Aleksey Zinoviev Assignee: Alexey Platonov -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9581) Document ANN algorithm based on ACD concept
Aleksey Zinoviev created IGNITE-9581: Summary: Document ANN algorithm based on ACD concept Key: IGNITE-9581 URL: https://issues.apache.org/jira/browse/IGNITE-9581 Project: Ignite Issue Type: Task Components: documentation, ml Reporter: Aleksey Zinoviev Assignee: Aleksey Zinoviev Fix For: 2.7 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9579) Document Random Forest
Aleksey Zinoviev created IGNITE-9579: Summary: Document Random Forest Key: IGNITE-9579 URL: https://issues.apache.org/jira/browse/IGNITE-9579 Project: Ignite Issue Type: Task Components: documentation, ml Reporter: Aleksey Zinoviev Assignee: Alexey Platonov Fix For: 2.7 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9578) Document K-fold cross validation of models
Aleksey Zinoviev created IGNITE-9578: Summary: Document K-fold cross validation of models Key: IGNITE-9578 URL: https://issues.apache.org/jira/browse/IGNITE-9578 Project: Ignite Issue Type: Task Components: documentation, ml Reporter: Aleksey Zinoviev Assignee: Anton Dmitriev -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9577) Document Preprocessing
Aleksey Zinoviev created IGNITE-9577: Summary: Document Preprocessing Key: IGNITE-9577 URL: https://issues.apache.org/jira/browse/IGNITE-9577 Project: Ignite Issue Type: Task Components: documentation, ml Reporter: Aleksey Zinoviev Assignee: Aleksey Zinoviev Fix For: 2.7 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9576) Document Multi-Class Logistic Regression
Aleksey Zinoviev created IGNITE-9576: Summary: Document Multi-Class Logistic Regression Key: IGNITE-9576 URL: https://issues.apache.org/jira/browse/IGNITE-9576 Project: Ignite Issue Type: Task Components: documentation, ml Reporter: Aleksey Zinoviev Assignee: Aleksey Zinoviev Fix For: 2.7 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9575) Document Binary Logistic Regression
Aleksey Zinoviev created IGNITE-9575: Summary: Document Binary Logistic Regression Key: IGNITE-9575 URL: https://issues.apache.org/jira/browse/IGNITE-9575 Project: Ignite Issue Type: Task Components: documentation, ml Reporter: Aleksey Zinoviev Assignee: Aleksey Zinoviev Fix For: 2.7 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9574) Document Gradient boosting
Aleksey Zinoviev created IGNITE-9574: Summary: Document Gradient boosting Key: IGNITE-9574 URL: https://issues.apache.org/jira/browse/IGNITE-9574 Project: Ignite Issue Type: Task Components: documentation, ml Reporter: Aleksey Zinoviev Assignee: Alexey Platonov Fix For: 2.7 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-9313) ML TF integration: killed user script or chief processes didn't restart workers
[ https://issues.apache.org/jira/browse/IGNITE-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Zinoviev updated IGNITE-9313:
-------------------------------------
    Ignite Flags: (was: Docs Required)

> ML TF integration: killed user script or chief processes didn't restart workers
> -------------------------------------------------------------------------------
>
> Key: IGNITE-9313
> URL: https://issues.apache.org/jira/browse/IGNITE-9313
> Project: Ignite
> Issue Type: Bug
> Components: ml
> Affects Versions: 2.7
> Reporter: Stepan Pilschikov
> Assignee: Anton Dmitriev
> Priority: Major
> Labels: tf-integration
> Fix For: 2.7
>
> Case:
> * Run cluster
> * Filling caches with data
> * Running python script
> * Killing user script or chief
> Expected:
> - chief and user script processes shutdown and run again on same node (-)
> - rerun user script (-) (+)
> - directory with metadata was deleted and created new one in /tmp (-)
> Actual:
> - chief or user script shutting down and run again
> - all workers still running and didn't restart
> - directory with metadata (/tmp/tf_us_*) not deleted
> - new directory with metadata is not created after restart
> - user script did not rerun after 'chief process' killing ('user_script' process killing restarting script execution)
[jira] [Updated] (IGNITE-9338) ML TF integration: tf cluster can't connect after killing first node with default port 10800
[ https://issues.apache.org/jira/browse/IGNITE-9338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Zinoviev updated IGNITE-9338:
-------------------------------------
    Ignite Flags: (was: Docs Required)

> ML TF integration: tf cluster can't connect after killing first node with default port 10800
> --------------------------------------------------------------------------------------------
>
> Key: IGNITE-9338
> URL: https://issues.apache.org/jira/browse/IGNITE-9338
> Project: Ignite
> Issue Type: Bug
> Components: ml
> Reporter: Stepan Pilschikov
> Assignee: Anton Dmitriev
> Priority: Major
> Labels: tf-integration
> Fix For: 2.7
>
> Case:
> - Run cluster with 3 nodes on 1 host
> - Filling caches with data
> - Running python script
> - Killing lead node with port 10800 with chief + user_script processes
> Expected:
> - chief and user_script restarted on other node
> - script rerun
> Actual:
> - chief and user_script restarted on other node but started to crash and run again because they can't connect to default port 10800
[jira] [Updated] (IGNITE-9278) ML TF integration: Can't find free ports in range
[ https://issues.apache.org/jira/browse/IGNITE-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Zinoviev updated IGNITE-9278:
-------------------------------------
    Ignite Flags: (was: Docs Required)

> ML TF integration: Can't find free ports in range
> -------------------------------------------------
>
> Key: IGNITE-9278
> URL: https://issues.apache.org/jira/browse/IGNITE-9278
> Project: Ignite
> Issue Type: Bug
> Components: ml
> Affects Versions: 2.7
> Environment: CentOS 7
> Java 8
> Python 3.6.3
> Ports in range 1-11000 are free
> Reporter: Stepan Pilschikov
> Assignee: Anton Dmitriev
> Priority: Major
> Labels: tf-integration
> Fix For: 2.7
>
> - Running cluster
> - Fill caches
> - Start script
> Exception in nodes log
> {code:java}
> [15:27:50,295][SEVERE][service-#105][GridServiceProcessor] Service execution stopped with error [name=TF_SERVICE_2e3875d0-1471-4f58-b51a-28d6e2dc8497, execId=d40f3ffd-547c-4f26-867e-07c48b867bd5]
> java.lang.IllegalStateException: No free ports in range [from=1, cnt=1000]
>     at org.apache.ignite.tensorflow.cluster.util.ClusterPortManager.acquirePort(ClusterPortManager.java:107)
>     at org.apache.ignite.tensorflow.cluster.util.TensorFlowClusterResolver.resolveAndAcquirePortsForWorkers(TensorFlowClusterResolver.java:103)
>     at org.apache.ignite.tensorflow.cluster.util.TensorFlowClusterResolver.resolveAndAcquirePorts(TensorFlowClusterResolver.java:67)
>     at org.apache.ignite.tensorflow.cluster.TensorFlowClusterManager.createCluster(TensorFlowClusterManager.java:116)
>     at org.apache.ignite.tensorflow.cluster.TensorFlowClusterMaintainer.execute(TensorFlowClusterMaintainer.java:138)
>     at org.apache.ignite.internal.processors.service.GridServiceProcessor$3.run(GridServiceProcessor.java:1396)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
[jira] [Updated] (IGNITE-9336) [ML] ANN/SVM Trainer tests produce unpredictable results due to random data generation
[ https://issues.apache.org/jira/browse/IGNITE-9336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-9336: - Ignite Flags: (was: Docs Required) > [ML] ANN/SVM Trainer tests produce unpredictable results due to random data > generation > -- > > Key: IGNITE-9336 > URL: https://issues.apache.org/jira/browse/IGNITE-9336 > Project: Ignite > Issue Type: Bug > Components: ml >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major > Fix For: 2.7 > > > Remove random data generation and add static dataset into tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-9482) [ML] Refactor all trainers' setters to withFieldName format for meta-algorithms
[ https://issues.apache.org/jira/browse/IGNITE-9482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Zinoviev updated IGNITE-9482:
-------------------------------------
    Ignite Flags: (was: Docs Required)

> [ML] Refactor all trainers' setters to withFieldName format for meta-algorithms
> -------------------------------------------------------------------------------
>
> Key: IGNITE-9482
> URL: https://issues.apache.org/jira/browse/IGNITE-9482
> Project: Ignite
> Issue Type: Sub-task
> Components: ml
> Affects Versions: 2.7
> Reporter: Aleksey Zinoviev
> Assignee: Aleksey Zinoviev
> Priority: Major
> Fix For: 2.7
[jira] [Updated] (IGNITE-9393) [ML] KMeans fails on complex data in cache
[ https://issues.apache.org/jira/browse/IGNITE-9393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-9393: - Ignite Flags: (was: Docs Required) > [ML] KMeans fails on complex data in cache > -- > > Key: IGNITE-9393 > URL: https://issues.apache.org/jira/browse/IGNITE-9393 > Project: Ignite > Issue Type: Bug > Components: ml >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major > Fix For: 2.7 > > > Described here > http://apache-ignite-users.70518.x6.nabble.com/NPE-exception-in-KMeansTrainer-td23504.html#a23512 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-7149) Gradient boosting for decision tree
[ https://issues.apache.org/jira/browse/IGNITE-7149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Zinoviev updated IGNITE-7149:
-------------------------------------
    Ignite Flags: Docs Required

> Gradient boosting for decision tree
> -----------------------------------
>
> Key: IGNITE-7149
> URL: https://issues.apache.org/jira/browse/IGNITE-7149
> Project: Ignite
> Issue Type: New Feature
> Components: ml
> Reporter: Yury Babak
> Assignee: Alexey Platonov
> Priority: Major
> Labels: ml
> Fix For: 2.7
>
> We want to implement gradient boosting for decision trees. It should be a new
> implementation of the Trainer interface, and we should keep the possibility to
> choose which trainer we want to use for our tree.
[jira] [Updated] (IGNITE-8667) Splitting of dataset to test and training sets
[ https://issues.apache.org/jira/browse/IGNITE-8667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Zinoviev updated IGNITE-8667:
-------------------------------------
    Ignite Flags: Docs Required

> Splitting of dataset to test and training sets
> ----------------------------------------------
>
> Key: IGNITE-8667
> URL: https://issues.apache.org/jira/browse/IGNITE-8667
> Project: Ignite
> Issue Type: New Feature
> Components: ml
> Reporter: Yury Babak
> Assignee: Anton Dmitriev
> Priority: Major
> Fix For: 2.7
>
> A mandatory part of any ML task is splitting the dataset into test and train
> subsets. The goal of this issue is to implement this splitting based on the
> ability to filter upstream cache entries that was added in IGNITE-8666.
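The idea of splitting by filtering entries can be sketched in plain Java. This is an illustration only: the class HashSplitSketch, its methods, and the hash-based rule are assumptions, not the filter API added in IGNITE-8666. Each key is mapped to a stable value in [0, 1), so the train and test filters are deterministic and complementary.

```java
import java.util.function.Predicate;

/** Hypothetical sketch (not Ignite API): train/test split expressed as two key filters. */
class HashSplitSketch {
    /** Maps a key to a stable pseudo-random value in [0, 1). */
    static double bucket(Object key, long seed) {
        long h = key.hashCode() * 0x9E3779B97F4A7C15L + seed;
        h ^= (h >>> 32);
        h *= 0xBF58476D1CE4E5B9L;
        h ^= (h >>> 29);
        // Keep the top 53 bits so the value fits a double mantissa exactly.
        return (h >>> 11) / (double) (1L << 53);
    }

    /** Accepts keys that belong to the training subset. */
    static <K> Predicate<K> trainFilter(double trainFraction, long seed) {
        return key -> bucket(key, seed) < trainFraction;
    }

    /** Accepts exactly the keys rejected by the training filter. */
    static <K> Predicate<K> testFilter(double trainFraction, long seed) {
        Predicate<K> train = trainFilter(trainFraction, seed);
        return train.negate();
    }
}
```

Because both filters derive from the same function of the key, no entry can land in both subsets, and the split does not require materializing either subset.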
[jira] [Updated] (IGNITE-8840) Random Forest
[ https://issues.apache.org/jira/browse/IGNITE-8840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-8840: - Ignite Flags: Docs Required > Random Forest > - > > Key: IGNITE-8840 > URL: https://issues.apache.org/jira/browse/IGNITE-8840 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Yury Babak >Assignee: Alexey Platonov >Priority: Major > Fix For: 2.7 > > > We want to implement random forest algorithm. It should be based on our > implementation of decision trees. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-8668) K-fold cross validation of models
[ https://issues.apache.org/jira/browse/IGNITE-8668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Zinoviev updated IGNITE-8668:
-------------------------------------
    Ignite Flags: Docs Required

> K-fold cross validation of models
> ---------------------------------
>
> Key: IGNITE-8668
> URL: https://issues.apache.org/jira/browse/IGNITE-8668
> Project: Ignite
> Issue Type: New Feature
> Components: ml
> Reporter: Yury Babak
> Assignee: Anton Dmitriev
> Priority: Major
> Fix For: 2.7
>
> Cross validation is a well-known approach that helps to avoid overfitting
> and therefore improves model quality. K-fold cross validation splits the
> dataset into _k_ disjoint subsets and uses _k-1_ of them as the train subset
> and the remaining subset for testing (over all possible combinations).
> The goal of this task is to implement K-fold cross validation based on the
> ability to filter a dataset that was added recently in IGNITE-8666.
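The fold construction described above can be sketched as follows. This is a minimal illustration over row indices; the class KFoldSketch and the round-robin assignment of indices to folds are assumptions and do not reflect how Ignite assigns entries to folds.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch (not Ignite API): k disjoint folds over the indices 0..n-1. */
class KFoldSketch {
    /** Returns k folds; fold i holds the indices i, i + k, i + 2k, ... */
    static List<List<Integer>> folds(int n, int k) {
        if (k < 2 || k > n)
            throw new IllegalArgumentException("k must be in [2, n]");
        List<List<Integer>> folds = new ArrayList<>();
        for (int i = 0; i < k; i++)
            folds.add(new ArrayList<>());
        for (int idx = 0; idx < n; idx++)
            folds.get(idx % k).add(idx);
        return folds;
    }

    /** Training indices for one round: every fold except the held-out test fold. */
    static List<Integer> trainIndices(List<List<Integer>> folds, int testFold) {
        List<Integer> train = new ArrayList<>();
        for (int i = 0; i < folds.size(); i++)
            if (i != testFold)
                train.addAll(folds.get(i));
        return train;
    }
}
```

A full cross-validation run would loop testFold over 0..k-1, train on trainIndices(folds, testFold), evaluate on folds.get(testFold), and average the k scores.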
[jira] [Resolved] (IGNITE-8665) Umbrella: ML model validation for 2.7 release
[ https://issues.apache.org/jira/browse/IGNITE-8665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-8665. -- Resolution: Fixed > Umbrella: ML model validation for 2.7 release > - > > Key: IGNITE-8665 > URL: https://issues.apache.org/jira/browse/IGNITE-8665 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Yury Babak >Assignee: Yury Babak >Priority: Major > Fix For: 2.7 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-8924) [ML] Parameter Grid for tuning hyper-parameters in Cross-Validation process
[ https://issues.apache.org/jira/browse/IGNITE-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-8924: - Ignite Flags: Docs Required > [ML] Parameter Grid for tuning hyper-parameters in Cross-Validation process > --- > > Key: IGNITE-8924 > URL: https://issues.apache.org/jira/browse/IGNITE-8924 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Yury Babak >Assignee: Aleksey Zinoviev >Priority: Major > Fix For: 2.7 > > > We want to have an analogue of Parameter Grid from scikit-learn to tune > hyper-parameters in Cross-Validation process. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
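For readers unfamiliar with scikit-learn's parameter grid, the core idea is enumerating every combination of candidate hyper-parameter values. The sketch below shows that enumeration in plain Java; the class ParamGridSketch and the parameter names used in the usage note are hypothetical and are not the Ignite ParamGrid API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch (not Ignite API): Cartesian product of hyper-parameter values. */
class ParamGridSketch {
    /** Expands {name -> candidate values} into one map per combination. */
    static List<Map<String, Double>> combinations(Map<String, double[]> grid) {
        List<Map<String, Double>> result = new ArrayList<>();
        result.add(new LinkedHashMap<>()); // start from a single empty combination
        for (Map.Entry<String, double[]> param : grid.entrySet()) {
            List<Map<String, Double>> next = new ArrayList<>();
            for (Map<String, Double> partial : result) {
                for (double value : param.getValue()) {
                    Map<String, Double> extended = new LinkedHashMap<>(partial);
                    extended.put(param.getKey(), value);
                    next.add(extended);
                }
            }
            result = next;
        }
        return result;
    }
}
```

A grid with three values for a hypothetical "maxDepth" and two for "minImpurityDecrease" expands to six combinations, each of which would be scored by cross-validation.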
[jira] [Updated] (IGNITE-8664) Encoding categorical features with One-of-K Encoder
[ https://issues.apache.org/jira/browse/IGNITE-8664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-8664: - Ignite Flags: Docs Required > Encoding categorical features with One-of-K Encoder > --- > > Key: IGNITE-8664 > URL: https://issues.apache.org/jira/browse/IGNITE-8664 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Yury Babak >Assignee: Aleksey Zinoviev >Priority: Major > Fix For: 2.7 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-8680) Encoding categorical features with OneHotEncoder
[ https://issues.apache.org/jira/browse/IGNITE-8680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-8680: - Ignite Flags: Docs Required > Encoding categorical features with OneHotEncoder > > > Key: IGNITE-8680 > URL: https://issues.apache.org/jira/browse/IGNITE-8680 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Yury Babak >Assignee: Aleksey Zinoviev >Priority: Major > Fix For: 2.7 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-8511) [ML] Add support for Multi-Class Logistic Regression
[ https://issues.apache.org/jira/browse/IGNITE-8511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-8511: - Ignite Flags: Docs Required > [ML] Add support for Multi-Class Logistic Regression > > > Key: IGNITE-8511 > URL: https://issues.apache.org/jira/browse/IGNITE-8511 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major > Fix For: 2.7 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-8403) [ML] Add Binary Logistic Regression based on partitioned datasets and MLP
[ https://issues.apache.org/jira/browse/IGNITE-8403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-8403: - Ignite Flags: Docs Required > [ML] Add Binary Logistic Regression based on partitioned datasets and MLP > - > > Key: IGNITE-8403 > URL: https://issues.apache.org/jira/browse/IGNITE-8403 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major > Fix For: 2.7 > > > Add binary logistic regression implementation based on partitioned dataset > and MLP(Multi-layered perceptron) architecture with SGD (Stochastic Gradient > Descent). > Provide test, example, model and trainer -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-8567) [ML] Add Imputer and Binarizer for data preprocessing
[ https://issues.apache.org/jira/browse/IGNITE-8567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-8567: - Ignite Flags: Docs Required > [ML] Add Imputer and Binarizer for data preprocessing > - > > Key: IGNITE-8567 > URL: https://issues.apache.org/jira/browse/IGNITE-8567 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major > Fix For: 2.7 > > > The imputing with Mean and Most frequent values options can be effectively > distributed. > [http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.Imputer.html#sklearn.preprocessing.Imputer] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
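Why mean imputation distributes well can be shown with a small sketch: each partition reports only a partial sum and count, and the partial results merge associatively. The class MeanImputerSketch is hypothetical, and NaN is assumed to mark a missing value; neither is taken from the Ignite implementation.

```java
/** Hypothetical sketch (not Ignite API): mean imputation via mergeable per-partition statistics. */
class MeanImputerSketch {
    /** Partial result of one partition: sum and count of the non-missing values. */
    static final class Partial {
        final double sum;
        final long cnt;

        Partial(double sum, long cnt) {
            this.sum = sum;
            this.cnt = cnt;
        }

        /** Merging is associative, so partitions can be reduced in any order. */
        Partial merge(Partial other) {
            return new Partial(sum + other.sum, cnt + other.cnt);
        }

        double mean() {
            return sum / cnt;
        }
    }

    /** Computes the partial statistics of one partition, skipping NaN (missing) values. */
    static Partial map(double[] partition) {
        double sum = 0.0;
        long cnt = 0;
        for (double v : partition) {
            if (!Double.isNaN(v)) {
                sum += v;
                cnt++;
            }
        }
        return new Partial(sum, cnt);
    }

    /** Returns a copy of the partition with every NaN replaced by the global mean. */
    static double[] impute(double[] partition, double mean) {
        double[] out = partition.clone();
        for (int i = 0; i < out.length; i++)
            if (Double.isNaN(out[i]))
                out[i] = mean;
        return out;
    }
}
```

The most-frequent-value option follows the same pattern, with a per-partition value-to-count map merged by adding counts.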
[jira] [Updated] (IGNITE-9513) [ML] Unify all preprocessors trainers' generics
[ https://issues.apache.org/jira/browse/IGNITE-9513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Zinoviev updated IGNITE-9513:
-------------------------------------
    Component/s: ml

> [ML] Unify all preprocessors trainers' generics
> -----------------------------------------------
>
> Key: IGNITE-9513
> URL: https://issues.apache.org/jira/browse/IGNITE-9513
> Project: Ignite
> Issue Type: Improvement
> Components: ml
> Reporter: Aleksey Zinoviev
> Assignee: Aleksey Zinoviev
> Priority: Major
>
> Currently we have
> EncoderTrainer implements PreprocessingTrainer
> and
> BinarizationTrainer implements PreprocessingTrainer Vector>
> Unifying them will help with raw types in OneVsRest and in the Pipeline and CV processes.
[jira] [Created] (IGNITE-9514) [ML] Reduce time for the updating models on many partitions
Aleksey Zinoviev created IGNITE-9514: Summary: [ML] Reduce time for the updating models on many partitions Key: IGNITE-9514 URL: https://issues.apache.org/jira/browse/IGNITE-9514 Project: Ignite Issue Type: Task Components: ml Reporter: Aleksey Zinoviev Assignee: Aleksey Zinoviev -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9513) [ML] Unify all preprocessors trainers' generics
Aleksey Zinoviev created IGNITE-9513:

    Summary: [ML] Unify all preprocessors trainers' generics
    Key: IGNITE-9513
    URL: https://issues.apache.org/jira/browse/IGNITE-9513
    Project: Ignite
    Issue Type: Improvement
    Reporter: Aleksey Zinoviev
    Assignee: Aleksey Zinoviev

Currently we have
EncoderTrainer implements PreprocessingTrainer
and
BinarizationTrainer implements PreprocessingTrainer
Unifying them will help with raw types in OneVsRest and in the Pipeline and CV processes.
[jira] [Updated] (IGNITE-8410) [ML] Unify KNNClassification/KNNRegression Model Trainer .fit() signatures
[ https://issues.apache.org/jira/browse/IGNITE-8410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Zinoviev updated IGNITE-8410:
-------------------------------------
    Fix Version/s: 2.8

> [ML] Unify KNNClassification/KNNRegression Model Trainer .fit() signatures
> --------------------------------------------------------------------------
>
> Key: IGNITE-8410
> URL: https://issues.apache.org/jira/browse/IGNITE-8410
> Project: Ignite
> Issue Type: Improvement
> Components: ml
> Reporter: Aleksey Zinoviev
> Assignee: Aleksey Zinoviev
> Priority: Minor
> Fix For: 2.8
>
> Make the fit calls similar.
> One of the trainers should be refactored and one signature removed. A possible
> solution is to pass dataCache and ignite separately.
[jira] [Updated] (IGNITE-9463) [ML] Update ML tutorial with new model composition/update features
[ https://issues.apache.org/jira/browse/IGNITE-9463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-9463: - Fix Version/s: 2.8 > [ML] Update ML tutorial with new model composition/update features > -- > > Key: IGNITE-9463 > URL: https://issues.apache.org/jira/browse/IGNITE-9463 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major > Fix For: 2.8 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-8542) [ML] Add OneVsRest Trainer to handle cases with multiple class labels in dataset
[ https://issues.apache.org/jira/browse/IGNITE-8542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-8542: - Fix Version/s: 2.8 > [ML] Add OneVsRest Trainer to handle cases with multiple class labels in > dataset > > > Key: IGNITE-8542 > URL: https://issues.apache.org/jira/browse/IGNITE-8542 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major > Fix For: 2.8 > > > method extractClassLabels in LogRegressionMultiClassTrainer and in > SVMLinearMultiClassClassificationTrainer. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-8410) [ML] Unify KNNClassification/KNNRegression Model Trainer .fit() signatures
[ https://issues.apache.org/jira/browse/IGNITE-8410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Zinoviev updated IGNITE-8410:
-------------------------------------
    Affects Version/s: (was: 2.6)

> [ML] Unify KNNClassification/KNNRegression Model Trainer .fit() signatures
> --------------------------------------------------------------------------
>
> Key: IGNITE-8410
> URL: https://issues.apache.org/jira/browse/IGNITE-8410
> Project: Ignite
> Issue Type: Improvement
> Components: ml
> Reporter: Aleksey Zinoviev
> Assignee: Aleksey Zinoviev
> Priority: Minor
>
> Make the fit calls similar.
> One of the trainers should be refactored and one signature removed. A possible
> solution is to pass dataCache and ignite separately.
[jira] [Updated] (IGNITE-9463) [ML] Update ML tutorial with new model composition/update features
[ https://issues.apache.org/jira/browse/IGNITE-9463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-9463: - Affects Version/s: (was: 2.8) > [ML] Update ML tutorial with new model composition/update features > -- > > Key: IGNITE-9463 > URL: https://issues.apache.org/jira/browse/IGNITE-9463 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major > Fix For: 2.8 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-9463) [ML] Update ML tutorial with new model composition/update features
[ https://issues.apache.org/jira/browse/IGNITE-9463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-9463: - Fix Version/s: (was: 2.7) > [ML] Update ML tutorial with new model composition/update features > -- > > Key: IGNITE-9463 > URL: https://issues.apache.org/jira/browse/IGNITE-9463 > Project: Ignite > Issue Type: New Feature > Components: ml >Affects Versions: 2.8 >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-9463) [ML] Update ML tutorial with new model composition/update features
[ https://issues.apache.org/jira/browse/IGNITE-9463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-9463: - Affects Version/s: (was: 2.7) 2.8 > [ML] Update ML tutorial with new model composition/update features > -- > > Key: IGNITE-9463 > URL: https://issues.apache.org/jira/browse/IGNITE-9463 > Project: Ignite > Issue Type: New Feature > Components: ml >Affects Versions: 2.8 >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9497) [ML] Add Pipeline support to Cross-Validation process
Aleksey Zinoviev created IGNITE-9497:

    Summary: [ML] Add Pipeline support to Cross-Validation process
    Key: IGNITE-9497
    URL: https://issues.apache.org/jira/browse/IGNITE-9497
    Project: Ignite
    Issue Type: New Feature
    Components: ml
    Affects Versions: 2.8
    Reporter: Aleksey Zinoviev
    Assignee: Aleksey Zinoviev
    Fix For: 2.8

Change the API of ParamGrid.addHyperParam to support meta-information about the Pipeline Stage.

Add a method to Cross-Validation that evaluates the whole Pipeline Process and injects hyper-parameters from the ParamGrid.
[jira] [Created] (IGNITE-9482) [ML] Refactor all trainers' setters to withFieldName format for meta-algorithms
Aleksey Zinoviev created IGNITE-9482:

    Summary: [ML] Refactor all trainers' setters to withFieldName format for meta-algorithms
    Key: IGNITE-9482
    URL: https://issues.apache.org/jira/browse/IGNITE-9482
    Project: Ignite
    Issue Type: Sub-task
    Components: ml
    Affects Versions: 2.7
    Reporter: Aleksey Zinoviev
    Assignee: Aleksey Zinoviev
    Fix For: 2.7
[jira] [Created] (IGNITE-9463) [ML] Update ML tutorial with new model composition/update features
Aleksey Zinoviev created IGNITE-9463: Summary: [ML] Update ML tutorial with new model composition/update features Key: IGNITE-9463 URL: https://issues.apache.org/jira/browse/IGNITE-9463 Project: Ignite Issue Type: New Feature Components: ml Affects Versions: 2.7 Reporter: Aleksey Zinoviev Assignee: Aleksey Zinoviev Fix For: 2.7 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-9145) [ML] Add different strategies to index labels in StringEncoderTrainer
[ https://issues.apache.org/jira/browse/IGNITE-9145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Zinoviev updated IGNITE-9145:
-------------------------------------
    Fix Version/s: (was: 2.7)

> [ML] Add different strategies to index labels in StringEncoderTrainer
> ---------------------------------------------------------------------
>
> Key: IGNITE-9145
> URL: https://issues.apache.org/jira/browse/IGNITE-9145
> Project: Ignite
> Issue Type: Improvement
> Components: ml
> Reporter: Aleksey Zinoviev
> Assignee: Aleksey Zinoviev
> Priority: Major
>
> The main idea is to add a few indexing strategies: sorting and so on.
> Currently only one strategy is supported (the most frequent label gets index
> zero and the least frequent label gets the maximum index).
> There can be a few options:
> * 'frequencyDesc': descending order by label frequency (most frequent label assigned 0)
> * 'frequencyAsc': ascending order by label frequency (least frequent label assigned 0)
> * 'alphabetDesc': descending alphabetical order
> * 'alphabetAsc': ascending alphabetical order
>
> Please update the method transformFrequenciesToEncodingValues and add the
> strategy as a parameter of the trainer.
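The four orderings listed in this ticket can be sketched as comparators over a label frequency map. Everything here is an assumption made for illustration: the class LabelIndexerSketch, the Strategy enum, and the alphabetical tie-break for equal frequencies are not taken from StringEncoderTrainer.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch (not Ignite API): assigns label indices under a chosen ordering. */
class LabelIndexerSketch {
    enum Strategy { FREQUENCY_DESC, FREQUENCY_ASC, ALPHABET_DESC, ALPHABET_ASC }

    /** Sorts the labels by the strategy and assigns 0, 1, 2, ... in that order. */
    static Map<String, Integer> index(Map<String, Integer> frequencies, Strategy strategy) {
        Comparator<String> alphabetical = Comparator.naturalOrder();
        Comparator<String> byFrequency =
            (a, b) -> Integer.compare(frequencies.get(a), frequencies.get(b));

        Comparator<String> order;
        switch (strategy) {
            case FREQUENCY_DESC:
                // Assumed tie-break: equal frequencies fall back to alphabetical order.
                order = byFrequency.reversed().thenComparing(alphabetical);
                break;
            case FREQUENCY_ASC:
                order = byFrequency.thenComparing(alphabetical);
                break;
            case ALPHABET_DESC:
                order = alphabetical.reversed();
                break;
            default:
                order = alphabetical;
        }

        List<String> labels = new ArrayList<>(frequencies.keySet());
        labels.sort(order);

        Map<String, Integer> indices = new HashMap<>();
        for (int i = 0; i < labels.size(); i++)
            indices.put(labels.get(i), i);
        return indices;
    }
}
```

Passing the strategy as an argument mirrors the ticket's request to make it a trainer parameter, and an explicit tie-break keeps the encoding deterministic when two labels occur equally often.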
[jira] [Updated] (IGNITE-9421) ML Examples: LogisticRegressionSGDTrainerExample example result not correct
[ https://issues.apache.org/jira/browse/IGNITE-9421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Zinoviev updated IGNITE-9421:
-------------------------------------
    Affects Version/s: (was: 2.6)

> ML Examples: LogisticRegressionSGDTrainerExample example result not correct
> ---------------------------------------------------------------------------
>
> Key: IGNITE-9421
> URL: https://issues.apache.org/jira/browse/IGNITE-9421
> Project: Ignite
> Issue Type: Bug
> Components: ml
> Reporter: Stepan Pilschikov
> Assignee: Aleksey Zinoviev
> Priority: Major
> Fix For: 2.7
>
> Running
> org.apache.ignite.examples.ml.regression.logistic.binary.LogisticRegressionSGDTrainerExample
> example
> Output:
> {code}
> >>> Absolute amount of errors 100
> >>> Accuracy 0.0
> >>> Confusion matrix is [[50, 50], [0, 0]]
> >>> -
> >>> Logistic regression model over partitioned dataset usage example completed.
> {code}
[jira] [Created] (IGNITE-9393) [ML] KMeans fails on complex data in cache
Aleksey Zinoviev created IGNITE-9393: Summary: [ML] KMeans fails on complex data in cache Key: IGNITE-9393 URL: https://issues.apache.org/jira/browse/IGNITE-9393 Project: Ignite Issue Type: Bug Components: ml Reporter: Aleksey Zinoviev Assignee: Aleksey Zinoviev Described here http://apache-ignite-users.70518.x6.nabble.com/NPE-exception-in-KMeansTrainer-td23504.html#a23512 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9336) [ML] ANN/SVM Trainer tests produce unpredictable results due to random data generation
Aleksey Zinoviev created IGNITE-9336: Summary: [ML] ANN/SVM Trainer tests produce unpredictable results due to random data generation Key: IGNITE-9336 URL: https://issues.apache.org/jira/browse/IGNITE-9336 Project: Ignite Issue Type: Bug Components: ml Reporter: Aleksey Zinoviev Assignee: Aleksey Zinoviev Remove random data generation and add static dataset into tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-9283) [ML] Add Discrete Cosine preprocessor
[ https://issues.apache.org/jira/browse/IGNITE-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Zinoviev updated IGNITE-9283:
-------------------------------------
    Description:
Add [https://en.wikipedia.org/wiki/Discrete_cosine_transform]

Please look at the MinMaxScaler or Normalization packages in the preprocessing package.

Add classes if required
1) Preprocessor
2) Trainer
3) custom PartitionData if shuffling is a step of the algorithm

Requirements for a successful PR:
 # PartitionedDataset usage
 # Trainer-Model paradigm support
 # Tests for Model and for Trainer (and other stuff)
 # Example of usage with a small but well-known dataset like IRIS, Titanic or House Prices
 # Javadocs/codestyle according to the guidelines

  was:
Add [https://en.wikipedia.org/wiki/Discrete_cosine_transform]

Please look at the MinMaxScaler or Normalization packages in the preprocessing package.

Add classes if required
1) Preprocessor
2) Trainer
3) custom PartitionData if shuffling is a step of the algorithm

> [ML] Add Discrete Cosine preprocessor
> -------------------------------------
>
> Key: IGNITE-9283
> URL: https://issues.apache.org/jira/browse/IGNITE-9283
> Project: Ignite
> Issue Type: Sub-task
> Components: ml
> Reporter: Aleksey Zinoviev
> Priority: Major
>
> Add [https://en.wikipedia.org/wiki/Discrete_cosine_transform]
> Please look at the MinMaxScaler or Normalization packages in the preprocessing package.
> Add classes if required
> 1) Preprocessor
> 2) Trainer
> 3) custom PartitionData if shuffling is a step of the algorithm
>
> Requirements for a successful PR:
> # PartitionedDataset usage
> # Trainer-Model paradigm support
> # Tests for Model and for Trainer (and other stuff)
> # Example of usage with a small but well-known dataset like IRIS, Titanic or House Prices
> # Javadocs/codestyle according to the guidelines
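As a reference point for the transform itself, here is a direct, unnormalized DCT-II of a single feature vector. The ticket does not say which DCT variant or scaling is wanted, so both are assumptions, and the class DctSketch is hypothetical. The O(n^2) loop is chosen for readability, not speed.

```java
/** Hypothetical sketch (not Ignite API): unnormalized DCT-II of one feature vector. */
class DctSketch {
    /** Computes X[k] = sum over i of x[i] * cos(pi / n * (i + 0.5) * k), for k = 0..n-1. */
    static double[] dct2(double[] x) {
        int n = x.length;
        double[] out = new double[n];
        for (int k = 0; k < n; k++) {
            double sum = 0.0;
            for (int i = 0; i < n; i++)
                sum += x[i] * Math.cos(Math.PI / n * (i + 0.5) * k);
            out[k] = sum;
        }
        return out;
    }
}
```

For a constant vector, all the energy lands in the first coefficient and the remaining coefficients are zero up to rounding. Because the transform acts on each row independently, it needs no statistics gathered across partitions.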
[jira] [Updated] (IGNITE-9285) [ML] Add MaxAbsScaler as a preprocessing stage
[ https://issues.apache.org/jira/browse/IGNITE-9285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Zinoviev updated IGNITE-9285:
-------------------------------------
    Description:
Add analogue of [http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MaxAbsScaler.html#sklearn.preprocessing.MaxAbsScaler]

Please look at the MinMaxScaler or Normalization packages in the preprocessing package.

Add classes if required
1) Preprocessor
2) Trainer
3) custom PartitionData if shuffling is a step of the algorithm

Requirements for a successful PR:
# PartitionedDataset usage
# Trainer-Model paradigm support
# Tests for Model and for Trainer (and other stuff)
# Example of usage with a small but well-known dataset such as IRIS, Titanic or House Prices
# Javadocs/code style according to the guidelines

  was:
Add analogue of [http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MaxAbsScaler.html#sklearn.preprocessing.MaxAbsScaler]

Please look at the MinMaxScaler or Normalization packages in preprocessing package.

Add classes if required
1) Preprocessor
2) Trainer
3) custom PartitionData if shuffling is a step of algorithm

> [ML] Add MaxAbsScaler as a preprocessing stage
> ----------------------------------------------
>
> Key: IGNITE-9285
> URL: https://issues.apache.org/jira/browse/IGNITE-9285
> Project: Ignite
> Issue Type: Sub-task
> Components: ml
> Reporter: Aleksey Zinoviev
> Priority: Major
>
> Add analogue of
> [http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MaxAbsScaler.html#sklearn.preprocessing.MaxAbsScaler]
> Please look at the MinMaxScaler or Normalization packages in the preprocessing package.
> Add classes if required
> 1) Preprocessor
> 2) Trainer
> 3) custom PartitionData if shuffling is a step of the algorithm
>
> Requirements for a successful PR:
> # PartitionedDataset usage
> # Trainer-Model paradigm support
> # Tests for Model and for Trainer (and other stuff)
> # Example of usage with a small but well-known dataset such as IRIS, Titanic or House Prices
> # Javadocs/code style according to the guidelines

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
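As a rough illustration of what IGNITE-9285 asks for, the sketch below splits max-abs scaling into a fit step (per-feature maximum absolute value) and a transform step (division by that maximum). All names (`MaxAbsScalerSketch`, `fit`, `merge`, `transform`) are hypothetical and not Ignite API. The `merge` method is an assumption about how statistics computed on separate partitions could be combined; the ticket only says that PartitionedDataset should be used.

```java
/** Illustrative only: max-abs scaling on plain arrays, not the Ignite API. */
final class MaxAbsScalerSketch {
    private MaxAbsScalerSketch() {
    }

    /** Computes the per-feature maximum absolute value over one partition. */
    static double[] fit(double[][] rows) {
        double[] maxAbs = new double[rows[0].length];

        for (double[] row : rows)
            for (int j = 0; j < row.length; j++)
                maxAbs[j] = Math.max(maxAbs[j], Math.abs(row[j]));

        return maxAbs;
    }

    /** Combines statistics of two partitions (assumed reduce step). */
    static double[] merge(double[] a, double[] b) {
        double[] res = new double[a.length];

        for (int j = 0; j < a.length; j++)
            res[j] = Math.max(a[j], b[j]);

        return res;
    }

    /** Scales each feature into [-1, 1]; all-zero features are left as is. */
    static double[] transform(double[] row, double[] maxAbs) {
        double[] res = new double[row.length];

        for (int j = 0; j < row.length; j++)
            res[j] = maxAbs[j] == 0.0 ? row[j] : row[j] / maxAbs[j];

        return res;
    }
}
```

Because the element-wise maximum is associative and commutative, per-partition results can be merged in any order.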
[jira] [Updated] (IGNITE-9282) [ML] Add Naive Bayes classifier
[ https://issues.apache.org/jira/browse/IGNITE-9282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Zinoviev updated IGNITE-9282:
-------------------------------------
    Description:
Naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence assumptions between the features.

So we want to add this algorithm to the Apache Ignite ML module.

Ideally, the implementation should support both multinomial naive Bayes and Bernoulli naive Bayes.

Requirements for a successful PR:
# PartitionedDataset usage
# Trainer-Model paradigm support
# Tests for Model and for Trainer (and other stuff)
# Example of usage with a small but well-known dataset such as IRIS, Titanic or House Prices
# Javadocs/code style according to the guidelines

  was:
Naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence assumptions between the features.

So we want to add this algorithm to Apache Ignite ML module.

Ideally, implementation should support both multinomial naive Bayes and Bernoulli naive Bayes.

> [ML] Add Naive Bayes classifier
> -------------------------------
>
> Key: IGNITE-9282
> URL: https://issues.apache.org/jira/browse/IGNITE-9282
> Project: Ignite
> Issue Type: Sub-task
> Components: ml
> Reporter: Aleksey Zinoviev
> Priority: Major
>
> Naive Bayes classifiers are a family of simple probabilistic classifiers
> based on applying Bayes' theorem with strong (naive) independence assumptions
> between the features.
> So we want to add this algorithm to the Apache Ignite ML module.
> Ideally, the implementation should support both multinomial naive Bayes and
> Bernoulli naive Bayes.
> Requirements for a successful PR:
> # PartitionedDataset usage
> # Trainer-Model paradigm support
> # Tests for Model and for Trainer (and other stuff)
> # Example of usage with a small but well-known dataset such as IRIS, Titanic or House Prices
> # Javadocs/code style according to the guidelines

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
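To make the Bernoulli variant mentioned in IGNITE-9282 concrete, here is a small single-node sketch with add-one (Laplace) smoothing. It is not the Ignite implementation: the class name `BernoulliNaiveBayesSketch`, the 0/1 integer feature encoding, and the class labels `0..classes-1` are all assumptions made for illustration. A distributed version would presumably accumulate the same counts per partition and sum them.

```java
/** Illustrative only: Bernoulli naive Bayes on binary features. */
final class BernoulliNaiveBayesSketch {
    private final double[] logPrior;   // log P(class)
    private final double[][] logP;     // log P(feature = 1 | class)
    private final double[][] log1mP;   // log P(feature = 0 | class)

    private BernoulliNaiveBayesSketch(double[] logPrior, double[][] logP, double[][] log1mP) {
        this.logPrior = logPrior;
        this.logP = logP;
        this.log1mP = log1mP;
    }

    /** Trains on rows of 0/1 features with labels in [0, classes). */
    static BernoulliNaiveBayesSketch fit(int[][] x, int[] y, int classes) {
        int features = x[0].length;
        double[] classCnt = new double[classes];
        double[][] onesCnt = new double[classes][features];

        for (int i = 0; i < x.length; i++) {
            classCnt[y[i]]++;
            for (int j = 0; j < features; j++)
                onesCnt[y[i]][j] += x[i][j];
        }

        double[] logPrior = new double[classes];
        double[][] logP = new double[classes][features];
        double[][] log1mP = new double[classes][features];

        for (int c = 0; c < classes; c++) {
            // Add-one smoothing keeps every probability strictly inside (0, 1).
            logPrior[c] = Math.log((classCnt[c] + 1.0) / (x.length + classes));
            for (int j = 0; j < features; j++) {
                double p = (onesCnt[c][j] + 1.0) / (classCnt[c] + 2.0);
                logP[c][j] = Math.log(p);
                log1mP[c][j] = Math.log(1.0 - p);
            }
        }

        return new BernoulliNaiveBayesSketch(logPrior, logP, log1mP);
    }

    /** Returns the class with the highest log-posterior score. */
    int predict(int[] row) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;

        for (int c = 0; c < logPrior.length; c++) {
            double score = logPrior[c];
            for (int j = 0; j < row.length; j++)
                score += row[j] == 1 ? logP[c][j] : log1mP[c][j];

            if (score > bestScore) {
                bestScore = score;
                best = c;
            }
        }

        return best;
    }
}
```

Working in log space avoids underflow when many features are multiplied together.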
[jira] [Updated] (IGNITE-9284) [ML] Add a Standard Scaler
[ https://issues.apache.org/jira/browse/IGNITE-9284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Zinoviev updated IGNITE-9284:
-------------------------------------
    Description:
Add analogue of [http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html]

Please look at the MinMaxScaler or Normalization packages in the preprocessing package.

Add classes if required
1) Preprocessor
2) Trainer
3) custom PartitionData if shuffling is a step of the algorithm

Requirements for a successful PR:
# PartitionedDataset usage
# Trainer-Model paradigm support
# Tests for Model and for Trainer (and other stuff)
# Example of usage with a small but well-known dataset such as IRIS, Titanic or House Prices
# Javadocs/code style according to the guidelines

  was:
Add analogue of [http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html]

Please look at the MinMaxScaler or Normalization packages in preprocessing package.

Add classes if required
1) Preprocessor
2) Trainer
3) custom PartitionData if shuffling is a step of algorithm

> [ML] Add a Standard Scaler
> --------------------------
>
> Key: IGNITE-9284
> URL: https://issues.apache.org/jira/browse/IGNITE-9284
> Project: Ignite
> Issue Type: Sub-task
> Components: ml
> Reporter: Aleksey Zinoviev
> Priority: Major
>
> Add analogue of
> [http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html]
> Please look at the MinMaxScaler or Normalization packages in the preprocessing package.
> Add classes if required
> 1) Preprocessor
> 2) Trainer
> 3) custom PartitionData if shuffling is a step of the algorithm
>
> Requirements for a successful PR:
> # PartitionedDataset usage
> # Trainer-Model paradigm support
> # Tests for Model and for Trainer (and other stuff)
> # Example of usage with a small but well-known dataset such as IRIS, Titanic or House Prices
> # Javadocs/code style according to the guidelines

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
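The standard scaler requested in IGNITE-9284 can be sketched as follows: the fit step accumulates per-feature sums and sums of squares, and the transform step subtracts the mean and divides by the population standard deviation. The names (`StandardScalerSketch`, `fit`, `transform`) are hypothetical, not Ignite API. Accumulating sums rather than computing the variance in two passes is an assumption chosen because sums from different partitions can simply be added; it is numerically less robust than a two-pass or Welford-style computation.

```java
/** Illustrative only: zero-mean, unit-variance scaling on plain arrays. */
final class StandardScalerSketch {
    final double[] mean;
    final double[] std;

    private StandardScalerSketch(double[] mean, double[] std) {
        this.mean = mean;
        this.std = std;
    }

    /** Computes per-feature mean and population standard deviation. */
    static StandardScalerSketch fit(double[][] rows) {
        int d = rows[0].length;
        double[] sum = new double[d];
        double[] sumSq = new double[d];

        for (double[] row : rows)
            for (int j = 0; j < d; j++) {
                sum[j] += row[j];
                sumSq[j] += row[j] * row[j];
            }

        double[] mean = new double[d];
        double[] std = new double[d];

        for (int j = 0; j < d; j++) {
            mean[j] = sum[j] / rows.length;
            // Clamp at zero to absorb small negative values from rounding.
            double var = Math.max(sumSq[j] / rows.length - mean[j] * mean[j], 0.0);
            std[j] = Math.sqrt(var);
        }

        return new StandardScalerSketch(mean, std);
    }

    /** Centers and scales one row; constant features are only centered. */
    double[] transform(double[] row) {
        double[] res = new double[row.length];

        for (int j = 0; j < row.length; j++) {
            double centered = row[j] - mean[j];
            res[j] = std[j] == 0.0 ? centered : centered / std[j];
        }

        return res;
    }
}
```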
[jira] [Assigned] (IGNITE-9281) [ML] Starter ML tasks
[ https://issues.apache.org/jira/browse/IGNITE-9281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev reassigned IGNITE-9281: Assignee: Aleksey Zinoviev > [ML] Starter ML tasks > - > > Key: IGNITE-9281 > URL: https://issues.apache.org/jira/browse/IGNITE-9281 > Project: Ignite > Issue Type: Wish > Components: ml >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major > Fix For: None > > > This ticket is an umbrella ticket for ML starter tasks. > Please contact [~zaleslaw] to get assigned and get help with one of these tasks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-9239) [ML] KMeansTrainer crashed if amount of possible clusters more than amount of partitions in dataset
[ https://issues.apache.org/jira/browse/IGNITE-9239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksey Zinoviev resolved IGNITE-9239.
--------------------------------------
    Resolution: Fixed

> [ML] KMeansTrainer crashed if amount of possible clusters more than amount of partitions in dataset
> ---------------------------------------------------------------------------------------------------
>
> Key: IGNITE-9239
> URL: https://issues.apache.org/jira/browse/IGNITE-9239
> Project: Ignite
> Issue Type: Bug
> Components: ml
> Reporter: Aleksey Zinoviev
> Assignee: Aleksey Zinoviev
> Priority: Major
>
> How to reproduce?
> Set the K parameter in KMeans Trainer to 100, and run KMeansClusterizationExample
>
> StackTrace is
> Exception in thread "KMeansClusterizationExample-#44" java.lang.RuntimeException: java.lang.IllegalArgumentException: bound must be positive
>     at org.apache.ignite.ml.clustering.kmeans.KMeansTrainer.fit(KMeansTrainer.java:112)
>     at org.apache.ignite.ml.clustering.kmeans.KMeansTrainer.fit(KMeansTrainer.java:46)
>     at org.apache.ignite.ml.trainers.DatasetTrainer.fit(DatasetTrainer.java:68)
>     at org.apache.ignite.examples.ml.clustering.KMeansClusterizationExample.lambda$main$0(KMeansClusterizationExample.java:60)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.IllegalArgumentException: bound must be positive
>     at java.util.Random.nextInt(Random.java:388)
>     at org.apache.ignite.ml.clustering.kmeans.KMeansTrainer.initClusterCentersRandomly(KMeansTrainer.java:193)
>     at org.apache.ignite.ml.clustering.kmeans.KMeansTrainer.fit(KMeansTrainer.java:86)
>     ... 4 more
>
> The possible solution :
> correct the mechanism of rndPnts computation in the row 180-190 in KMeansTrainer

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
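The root cause in the stack trace is generic: `java.util.Random.nextInt(int bound)` throws `IllegalArgumentException` whenever `bound` is zero or negative, which can happen if the code asks for more random points than a partition can supply. The sketch below shows one way to guard such a selection. It is not the actual fix applied in KMeansTrainer; `RandomPickSketch` and `pick` are hypothetical names, and the behaviour of returning fewer than `k` points is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Illustrative only: random selection that never passes a non-positive bound. */
final class RandomPickSketch {
    private RandomPickSketch() {
    }

    /**
     * Picks up to k rows without repetition. When the partition holds fewer
     * than k rows (or none at all), it returns what is available instead of
     * calling nextInt with a bound of zero.
     */
    static List<double[]> pick(List<double[]> partitionRows, int k, Random rnd) {
        List<double[]> pool = new ArrayList<>(partitionRows);
        List<double[]> res = new ArrayList<>();

        // pool.size() is checked before every nextInt call, so the bound is always >= 1.
        while (res.size() < k && !pool.isEmpty())
            res.add(pool.remove(rnd.nextInt(pool.size())));

        return res;
    }
}
```

With this shape, requesting 100 centers from a partition that holds two rows yields two candidates, and the caller can decide how to fill the remainder.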
[jira] [Updated] (IGNITE-9283) [ML] Add Discrete Cosine preprocessor
[ https://issues.apache.org/jira/browse/IGNITE-9283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-9283: - Component/s: ml > [ML] Add Discrete Cosine preprocessor > - > > Key: IGNITE-9283 > URL: https://issues.apache.org/jira/browse/IGNITE-9283 > Project: Ignite > Issue Type: Sub-task > Components: ml >Reporter: Aleksey Zinoviev >Priority: Major > > Add [https://en.wikipedia.org/wiki/Discrete_cosine_transform] > Please look at the MinMaxScaler or Normalization packages in preprocessing > package. > Add classes if required > 1) Preprocessor > 2) Trainer > 3) custom PartitionData if shuffling is a step of algorithm > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-9284) [ML] Add a Standard Scaler
[ https://issues.apache.org/jira/browse/IGNITE-9284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-9284: - Component/s: ml > [ML] Add a Standard Scaler > -- > > Key: IGNITE-9284 > URL: https://issues.apache.org/jira/browse/IGNITE-9284 > Project: Ignite > Issue Type: Sub-task > Components: ml >Reporter: Aleksey Zinoviev >Priority: Major > > Add analogue of > [http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html] > Please look at the MinMaxScaler or Normalization packages in preprocessing > package. > Add classes if required > 1) Preprocessor > 2) Trainer > 3) custom PartitionData if shuffling is a step of algorithm > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9285) [ML] Add MaxAbsScaler as a preprocessing stage
Aleksey Zinoviev created IGNITE-9285: Summary: [ML] Add MaxAbsScaler as a preprocessing stage Key: IGNITE-9285 URL: https://issues.apache.org/jira/browse/IGNITE-9285 Project: Ignite Issue Type: Sub-task Components: ml Reporter: Aleksey Zinoviev Add analogue of [http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MaxAbsScaler.html#sklearn.preprocessing.MaxAbsScaler] Please look at the MinMaxScaler or Normalization packages in preprocessing package. Add classes if required 1) Preprocessor 2) Trainer 3) custom PartitionData if shuffling is a step of algorithm -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9284) [ML] Add a Standard Scaler
Aleksey Zinoviev created IGNITE-9284: Summary: [ML] Add a Standard Scaler Key: IGNITE-9284 URL: https://issues.apache.org/jira/browse/IGNITE-9284 Project: Ignite Issue Type: Sub-task Reporter: Aleksey Zinoviev Add analogue of [http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html] Please look at the MinMaxScaler or Normalization packages in preprocessing package. Add classes if required 1) Preprocessor 2) Trainer 3) custom PartitionData if shuffling is a step of algorithm -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9283) [ML] Add Discrete Cosine preprocessor
Aleksey Zinoviev created IGNITE-9283: Summary: [ML] Add Discrete Cosine preprocessor Key: IGNITE-9283 URL: https://issues.apache.org/jira/browse/IGNITE-9283 Project: Ignite Issue Type: Sub-task Reporter: Aleksey Zinoviev Add [https://en.wikipedia.org/wiki/Discrete_cosine_transform] Please look at the MinMaxScaler or Normalization packages in preprocessing package. Add classes if required 1) Preprocessor 2) Trainer 3) custom PartitionData if shuffling is a step of algorithm -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9282) [ML] Add Naive Bayes classifier
Aleksey Zinoviev created IGNITE-9282: Summary: [ML] Add Naive Bayes classifier Key: IGNITE-9282 URL: https://issues.apache.org/jira/browse/IGNITE-9282 Project: Ignite Issue Type: Sub-task Components: ml Reporter: Aleksey Zinoviev Naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence assumptions between the features. So we want to add this algorithm to Apache Ignite ML module. Ideally, implementation should support both multinomial naive Bayes and Bernoulli naive Bayes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-7741) Fix javadoc for QR factorization
[ https://issues.apache.org/jira/browse/IGNITE-7741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-7741. -- Resolution: Invalid The QR factorization was removed in 2.6 > Fix javadoc for QR factorization > > > Key: IGNITE-7741 > URL: https://issues.apache.org/jira/browse/IGNITE-7741 > Project: Ignite > Issue Type: Bug > Components: ml >Reporter: Yury Babak >Priority: Minor > > Wrong javadoc for QR factorization. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-5828) Decompositions refactoring
[ https://issues.apache.org/jira/browse/IGNITE-5828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-5828. -- Resolution: Invalid The decomposition algorithms were removed in 2.6 > Decompositions refactoring > -- > > Key: IGNITE-5828 > URL: https://issues.apache.org/jira/browse/IGNITE-5828 > Project: Ignite > Issue Type: Bug > Components: ml >Reporter: Yury Babak >Priority: Major > > (?) Externalization for decompositions. > (?) QRDecomposition performance. > (?) EigenDecompositionTest - corner case failure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-5845) Benchmarks for ML algorithms.
[ https://issues.apache.org/jira/browse/IGNITE-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-5845. -- Resolution: Duplicate > Benchmarks for ML algorithms. > - > > Key: IGNITE-5845 > URL: https://issues.apache.org/jira/browse/IGNITE-5845 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Yury Babak >Priority: Major > > We want to create some benchmarks for ML algorithms. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-5844) Distributed versions of matrix decompositions
[ https://issues.apache.org/jira/browse/IGNITE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-5844. -- Resolution: Invalid The matrix decomposition was removed in 2.6 > Distributed versions of matrix decompositions > - > > Key: IGNITE-5844 > URL: https://issues.apache.org/jira/browse/IGNITE-5844 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Yury Babak >Priority: Major > > We want to add support for distributed matrices. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-6059) Use any distributed matrix in K-Means
[ https://issues.apache.org/jira/browse/IGNITE-6059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-6059. -- Resolution: Invalid This algorithm was totally rewritten > Use any distributed matrix in K-Means > - > > Key: IGNITE-6059 > URL: https://issues.apache.org/jira/browse/IGNITE-6059 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Yury Babak >Priority: Major > > Currently k-means works only with row/col matrices. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-5825) K-Means refactoring
[ https://issues.apache.org/jira/browse/IGNITE-5825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-5825. -- Resolution: Invalid The KMeans algorithm was totally changed > K-Means refactoring > --- > > Key: IGNITE-5825 > URL: https://issues.apache.org/jira/browse/IGNITE-5825 > Project: Ignite > Issue Type: Bug > Components: ml >Reporter: Yury Babak >Priority: Major > > Improve performance of points copying. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-5824) Adjust precision in math unit tests.
[ https://issues.apache.org/jira/browse/IGNITE-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-5824. -- Resolution: Invalid This problem is solved in other tickets > Adjust precision in math unit tests. > > > Key: IGNITE-5824 > URL: https://issues.apache.org/jira/browse/IGNITE-5824 > Project: Ignite > Issue Type: Bug > Components: ml >Reporter: Yury Babak >Priority: Major > > Find which precision is sufficient for math related tests. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-5801) Externalization for offheap vectors/matrices
[ https://issues.apache.org/jira/browse/IGNITE-5801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-5801. -- Resolution: Fixed All offheap vectors will be removed in 2.7 > Externalization for offheap vectors/matrices > > > Key: IGNITE-5801 > URL: https://issues.apache.org/jira/browse/IGNITE-5801 > Project: Ignite > Issue Type: Bug > Components: ml >Reporter: Yury Babak >Priority: Major > > Add externalization support for off-heap structures. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-5799) Caching for some intermediate calcs
[ https://issues.apache.org/jira/browse/IGNITE-5799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-5799. -- Resolution: Invalid The distributed matrices were removed from the codebase in 2.6 > Caching for some intermediate calcs > --- > > Key: IGNITE-5799 > URL: https://issues.apache.org/jira/browse/IGNITE-5799 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Yury Babak >Priority: Major > > Check possibility and necessity of caching some intermediate calcs like > decomposition for matrix determinant calculation -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-5723) Improve code quality for existing code.
[ https://issues.apache.org/jira/browse/IGNITE-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-5723. -- Resolution: Invalid Unclear ticket should be closed > Improve code quality for existing code. > --- > > Key: IGNITE-5723 > URL: https://issues.apache.org/jira/browse/IGNITE-5723 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Yury Babak >Priority: Major > > (?) check code style for all sources. > (?) check code coverage. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-5724) Remove all autoboxing staff from the component.
[ https://issues.apache.org/jira/browse/IGNITE-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-5724. -- Resolution: Invalid There is no such code in the codebase > Remove all autoboxing staff from the component. > --- > > Key: IGNITE-5724 > URL: https://issues.apache.org/jira/browse/IGNITE-5724 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Yury Babak >Priority: Major > > Find and remove all boxing/unboxing code from vectors and matrices. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IGNITE-5645) Locking mechanism for distributed datasets.
[ https://issues.apache.org/jira/browse/IGNITE-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev reassigned IGNITE-5645: Assignee: Aleksey Zinoviev > Locking mechanism for distributed datasets. > --- > > Key: IGNITE-5645 > URL: https://issues.apache.org/jira/browse/IGNITE-5645 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Yury Babak >Assignee: Aleksey Zinoviev >Priority: Major > > We must have a mechanism to protect a distributed matrix from changes during > calculations. The current locking mechanism is a bad choice for locking a huge > cache keyset, so we need a new one. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-5646) Use affinity keys for distributed matrice blocks
[ https://issues.apache.org/jira/browse/IGNITE-5646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-5646. -- Resolution: Invalid The distributed matrices were dropped from the codebase in 2.6 > Use affinity keys for distributed matrice blocks > > > Key: IGNITE-5646 > URL: https://issues.apache.org/jira/browse/IGNITE-5646 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Yury Babak >Priority: Major > > We want to implement affinity collocation for distributed matrices. > We must guarantee that the new block holding a computation result will be stored on > the same node as the initial blocks -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (IGNITE-5645) Locking mechanism for distributed datasets.
[ https://issues.apache.org/jira/browse/IGNITE-5645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-5645: - Summary: Locking mechanism for distributed datasets. (was: Locking mechanism for distributed matrices.) > Locking mechanism for distributed datasets. > --- > > Key: IGNITE-5645 > URL: https://issues.apache.org/jira/browse/IGNITE-5645 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Yury Babak >Priority: Major > > We must have a mechanism to protect a distributed matrix from changes during > calculations. The current locking mechanism is a bad choice for locking a huge > cache keyset, so we need a new one. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-5220) Partial derivatives calculation.
[ https://issues.apache.org/jira/browse/IGNITE-5220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-5220. -- Resolution: Won't Fix The algorithm was totally changed. This ticket was closed during SGD implementation. > Partial derivatives calculation. > > > Key: IGNITE-5220 > URL: https://issues.apache.org/jira/browse/IGNITE-5220 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Yury Babak >Priority: Major > > We need a mechanism for computing the partial derivatives required for > gradient descent. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-5219) Generalization of cost function for Linear Regression.
[ https://issues.apache.org/jira/browse/IGNITE-5219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-5219. -- Resolution: Won't Fix The algorithm was totally changed > Generalization of cost function for Linear Regression. > -- > > Key: IGNITE-5219 > URL: https://issues.apache.org/jira/browse/IGNITE-5219 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Yury Babak >Priority: Major > > We want to add support of custom cost functions for Linear Regression. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-5216) Add Stream API support to Ignite ML matrices.
[ https://issues.apache.org/jira/browse/IGNITE-5216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-5216. -- Resolution: Won't Fix > Add Stream API support to Ignite ML matrices. > - > > Key: IGNITE-5216 > URL: https://issues.apache.org/jira/browse/IGNITE-5216 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Yury Babak >Priority: Major > > We want to add Stream API support to Ignite matrices and possibly to vectors. > We already have an implementation of Spliterator for AbstractVector and > AbstractMatrix, so this looks like the next step. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9281) [ML] Starter ML tasks
Aleksey Zinoviev created IGNITE-9281: Summary: [ML] Starter ML tasks Key: IGNITE-9281 URL: https://issues.apache.org/jira/browse/IGNITE-9281 Project: Ignite Issue Type: Wish Components: ml Reporter: Aleksey Zinoviev Fix For: None This ticket is an umbrella ticket for ML starter tasks. Please contact [~zaleslaw] to get assigned and get help with one of these tasks. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IGNITE-6642) Integration with PMML
[ https://issues.apache.org/jira/browse/IGNITE-6642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev reassigned IGNITE-6642: Assignee: Aleksey Zinoviev > Integration with PMML > - > > Key: IGNITE-6642 > URL: https://issues.apache.org/jira/browse/IGNITE-6642 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Yury Babak >Assignee: Aleksey Zinoviev >Priority: Major > > PMML (Predictive Model Markup Language) is an XML-based language which is used in > Spark MLlib and other platforms. > Here is some additional info about PMML: > (i) http://dmg.org/pmml/v4-3/GeneralStructure.html > (i) https://github.com/jpmml/jpmml-model -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9261) [ML] Add ANN algorithm based on ACD concept
Aleksey Zinoviev created IGNITE-9261: Summary: [ML] Add ANN algorithm based on ACD concept Key: IGNITE-9261 URL: https://issues.apache.org/jira/browse/IGNITE-9261 Project: Ignite Issue Type: New Feature Components: ml Reporter: Aleksey Zinoviev Assignee: Aleksey Zinoviev The ACD concept is implemented via centroids searching with KMeans help. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-7797) Adopt yardstick tests for the new version of kNN classification algorithm
[ https://issues.apache.org/jira/browse/IGNITE-7797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-7797. -- Resolution: Won't Fix It was decided: no new specific yardstick tests here > Adopt yardstick tests for the new version of kNN classification algorithm > - > > Key: IGNITE-7797 > URL: https://issues.apache.org/jira/browse/IGNITE-7797 > Project: Ignite > Issue Type: Sub-task >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (IGNITE-9239) [ML] KMeansTrainer crashed if amount of possible clusters more than amount of partitions in dataset
Aleksey Zinoviev created IGNITE-9239:
----------------------------------------

Summary: [ML] KMeansTrainer crashed if amount of possible clusters more than amount of partitions in dataset
Key: IGNITE-9239
URL: https://issues.apache.org/jira/browse/IGNITE-9239
Project: Ignite
Issue Type: Bug
Components: ml
Reporter: Aleksey Zinoviev
Assignee: Aleksey Zinoviev

How to reproduce?
Set the K parameter in KMeans Trainer to 100, and run KMeansClusterizationExample

StackTrace is
Exception in thread "KMeansClusterizationExample-#44" java.lang.RuntimeException: java.lang.IllegalArgumentException: bound must be positive
    at org.apache.ignite.ml.clustering.kmeans.KMeansTrainer.fit(KMeansTrainer.java:112)
    at org.apache.ignite.ml.clustering.kmeans.KMeansTrainer.fit(KMeansTrainer.java:46)
    at org.apache.ignite.ml.trainers.DatasetTrainer.fit(DatasetTrainer.java:68)
    at org.apache.ignite.examples.ml.clustering.KMeansClusterizationExample.lambda$main$0(KMeansClusterizationExample.java:60)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: bound must be positive
    at java.util.Random.nextInt(Random.java:388)
    at org.apache.ignite.ml.clustering.kmeans.KMeansTrainer.initClusterCentersRandomly(KMeansTrainer.java:193)
    at org.apache.ignite.ml.clustering.kmeans.KMeansTrainer.fit(KMeansTrainer.java:86)
    ... 4 more

The possible solution :
correct the mechanism of rndPnts computation in the row 180-190 in KMeansTrainer

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Created] (IGNITE-9145) [ML] Add different strategies to index labels in StringEncoderTrainer
Aleksey Zinoviev created IGNITE-9145:
----------------------------------------

Summary: [ML] Add different strategies to index labels in StringEncoderTrainer
Key: IGNITE-9145
URL: https://issues.apache.org/jira/browse/IGNITE-9145
Project: Ignite
Issue Type: Improvement
Components: ml
Reporter: Aleksey Zinoviev
Assignee: Aleksey Zinoviev
Fix For: 2.7

The main idea is to add a few indexing strategies: sorting and so on.

Currently it supports only one strategy (the most popular label gets index zero and the least popular label gets the maximum index).

There can be a few options:
* 'frequencyDesc': descending order by label frequency (most frequent label assigned 0)
* 'frequencyAsc': ascending order by label frequency (least frequent label assigned 0)
* 'alphabetDesc': descending alphabetical order
* 'alphabetAsc': ascending alphabetical order

Please update the method transformFrequenciesToEncodingValues and add the strategy as a parameter of the trainer.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
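The four orderings listed in IGNITE-9145 can be sketched as comparators over a label-frequency map. This is only an illustration under stated assumptions: `LabelIndexingSketch`, the `Strategy` enum, and the `index` method are hypothetical names, not the signature of `transformFrequenciesToEncodingValues`, and breaking frequency ties alphabetically is a choice made here so that the result is deterministic.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: label indexing strategies over a frequency map. */
final class LabelIndexingSketch {
    /** Hypothetical enum mirroring the option names from the ticket. */
    enum Strategy { FREQUENCY_DESC, FREQUENCY_ASC, ALPHABET_DESC, ALPHABET_ASC }

    private LabelIndexingSketch() {
    }

    /** Maps each label to an index 0..n-1 in the order given by the strategy. */
    static Map<String, Integer> index(Map<String, Integer> freq, Strategy strategy) {
        Comparator<Map.Entry<String, Integer>> byFreq = Map.Entry.comparingByValue();
        Comparator<Map.Entry<String, Integer>> byName = Map.Entry.comparingByKey();
        Comparator<Map.Entry<String, Integer>> cmp;

        switch (strategy) {
            case FREQUENCY_DESC:
                cmp = byFreq.reversed().thenComparing(byName);
                break;
            case FREQUENCY_ASC:
                cmp = byFreq.thenComparing(byName);
                break;
            case ALPHABET_DESC:
                cmp = byName.reversed();
                break;
            default:
                cmp = byName;
        }

        List<Map.Entry<String, Integer>> entries = new ArrayList<>(freq.entrySet());
        entries.sort(cmp);

        // LinkedHashMap keeps the encoding in assignment order for readability.
        Map<String, Integer> res = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries)
            res.put(e.getKey(), res.size());

        return res;
    }
}
```

Passing the strategy as an argument keeps the existing frequency-counting step unchanged; only the final ordering differs between options.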
[jira] [Resolved] (IGNITE-7827) Adopt kNN regression to the new Partitioned Dataset
[ https://issues.apache.org/jira/browse/IGNITE-7827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-7827. -- Resolution: Fixed > Adopt kNN regression to the new Partitioned Dataset > --- > > Key: IGNITE-7827 > URL: https://issues.apache.org/jira/browse/IGNITE-7827 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-7828) Adopt yardstick tests for the new version of kNN regression algorithm
[ https://issues.apache.org/jira/browse/IGNITE-7828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-7828. -- Resolution: Won't Fix > Adopt yardstick tests for the new version of kNN regression algorithm > - > > Key: IGNITE-7828 > URL: https://issues.apache.org/jira/browse/IGNITE-7828 > Project: Ignite > Issue Type: Sub-task > Components: ml >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (IGNITE-8669) Model estimation
[ https://issues.apache.org/jira/browse/IGNITE-8669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-8669. -- Resolution: Fixed > Model estimation > > > Key: IGNITE-8669 > URL: https://issues.apache.org/jira/browse/IGNITE-8669 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Yury Babak >Assignee: Aleksey Zinoviev >Priority: Major > Fix For: 2.7 > > > We want to have the common mechanism for model estimation. > For estimation we want to have: > * Accuracy/precision/recall > * F score > * TPR/FPR -- This message was sent by Atlassian JIRA (v7.6.3#76005)
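For binary classification, the metrics listed in IGNITE-8669 all derive from the four cells of a confusion matrix. The sketch below shows those formulas, reading the last list item as true and false positive rates. It is not the estimation API that was added to Ignite: the class name `BinaryMetricsSketch` and the convention that label `1` is the positive class are assumptions.

```java
/** Illustrative only: binary classification metrics from a confusion matrix. */
final class BinaryMetricsSketch {
    final long tp;
    final long fp;
    final long tn;
    final long fn;

    /** Counts confusion-matrix cells; label 1 is treated as the positive class. */
    BinaryMetricsSketch(int[] truth, int[] predicted) {
        long tpCnt = 0, fpCnt = 0, tnCnt = 0, fnCnt = 0;

        for (int i = 0; i < truth.length; i++) {
            if (predicted[i] == 1) {
                if (truth[i] == 1) tpCnt++; else fpCnt++;
            } else {
                if (truth[i] == 1) fnCnt++; else tnCnt++;
            }
        }

        tp = tpCnt;
        fp = fpCnt;
        tn = tnCnt;
        fn = fnCnt;
    }

    // Each ratio is NaN when its denominator is zero (double division 0.0 / 0.0).

    double accuracy() {
        return (double) (tp + tn) / (tp + fp + tn + fn);
    }

    double precision() {
        return (double) tp / (tp + fp);
    }

    /** Recall is the same quantity as the true positive rate (TPR). */
    double recall() {
        return (double) tp / (tp + fn);
    }

    /** False positive rate (FPR). */
    double fpr() {
        return (double) fp / (fp + tn);
    }

    /** F1 score: harmonic mean of precision and recall. */
    double f1() {
        double p = precision();
        double r = recall();
        return 2 * p * r / (p + r);
    }
}
```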
[jira] [Updated] (IGNITE-8669) Model estimation
[ https://issues.apache.org/jira/browse/IGNITE-8669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-8669: - Description: We want to have the common mechanism for model estimation. For estimation we want to have: * Accuracy/precision/recall * F score * TPR/FPR was: We want to have the common mechanism for model estimation. For estimation we want to have: * Accuracy/precision/recall * F score * TPR/FPR * ROC AUC > Model estimation > > > Key: IGNITE-8669 > URL: https://issues.apache.org/jira/browse/IGNITE-8669 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Yury Babak >Assignee: Aleksey Zinoviev >Priority: Major > Fix For: 2.7 > > > We want to have the common mechanism for model estimation. > For estimation we want to have: > * Accuracy/precision/recall > * F score > * TPR/FPR -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (IGNITE-8669) Model estimation
[ https://issues.apache.org/jira/browse/IGNITE-8669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev reassigned IGNITE-8669: Assignee: Aleksey Zinoviev (was: Anton Dmitriev) > Model estimation > > > Key: IGNITE-8669 > URL: https://issues.apache.org/jira/browse/IGNITE-8669 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Yury Babak >Assignee: Aleksey Zinoviev >Priority: Major > Fix For: 2.6 > > > We want to have a common mechanism for model estimation. > For estimation we want to have: > * Accuracy/precision/recall > * F score > * TPR/FPR > * ROC AUC
[jira] [Updated] (IGNITE-8664) Encoding categorical features with One-of-K Encoder
[ https://issues.apache.org/jira/browse/IGNITE-8664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-8664: - Summary: Encoding categorical features with One-of-K Encoder (was: Encoding categorical features) > Encoding categorical features with One-of-K Encoder > --- > > Key: IGNITE-8664 > URL: https://issues.apache.org/jira/browse/IGNITE-8664 > Project: Ignite > Issue Type: New Feature > Components: ml >Reporter: Yury Babak >Assignee: Aleksey Zinoviev >Priority: Major > Fix For: 2.6
[jira] [Assigned] (IGNITE-8451) [ML] Refactor Labeled Dataset: remove unused methods and fields
[ https://issues.apache.org/jira/browse/IGNITE-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev reassigned IGNITE-8451: Assignee: Yury Babak (was: Aleksey Zinoviev) > [ML] Refactor Labeled Dataset: remove unused methods and fields > --- > > Key: IGNITE-8451 > URL: https://issues.apache.org/jira/browse/IGNITE-8451 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Aleksey Zinoviev >Assignee: Yury Babak >Priority: Major > > Remove > * loading from file > * distributed version (we need local version only) > * parent class Dataset and meta-information
[jira] [Created] (IGNITE-8567) [ML] Add Imputer and Binarizer for data preprocessing
Aleksey Zinoviev created IGNITE-8567: Summary: [ML] Add Imputer and Binarizer for data preprocessing Key: IGNITE-8567 URL: https://issues.apache.org/jira/browse/IGNITE-8567 Project: Ignite Issue Type: New Feature Components: ml Reporter: Aleksey Zinoviev Assignee: Aleksey Zinoviev Imputing with the mean and most-frequent-value options can be distributed effectively. [http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.Imputer.html#sklearn.preprocessing.Imputer]
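One way to see why these two imputation options distribute well: each partition only has to report per-feature sums, counts and value frequencies, and those partial results merge by simple addition. The sketch below is plain Python under that assumption, not the Ignite ML implementation; `partition_stats`, `merge_stats`, `fit_imputer` and `impute` are hypothetical names, and treating `None` and NaN as missing and falling back to 0.0 for an all-missing feature are illustrative choices.

```python
import math
from collections import Counter
from functools import reduce


def is_missing(value):
    """Treat None and NaN as missing values (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def partition_stats(rows, n_features):
    """Map step: per-feature sum, count and value frequencies for one partition."""
    sums = [0.0] * n_features
    counts = [0] * n_features
    freqs = [Counter() for _ in range(n_features)]
    for row in rows:
        for j, value in enumerate(row):
            if not is_missing(value):
                sums[j] += value
                counts[j] += 1
                freqs[j][value] += 1
    return sums, counts, freqs


def merge_stats(a, b):
    """Reduce step: combine the statistics of two partitions by addition."""
    sums = [x + y for x, y in zip(a[0], b[0])]
    counts = [x + y for x, y in zip(a[1], b[1])]
    freqs = [x + y for x, y in zip(a[2], b[2])]
    return sums, counts, freqs


def fit_imputer(partitions, n_features, strategy="mean"):
    """Compute one fill value per feature from all partitions."""
    empty = partition_stats([], n_features)
    per_partition = (partition_stats(p, n_features) for p in partitions)
    sums, counts, freqs = reduce(merge_stats, per_partition, empty)
    if strategy == "mean":
        return [s / c if c else 0.0 for s, c in zip(sums, counts)]
    if strategy == "most_frequent":
        return [f.most_common(1)[0][0] if f else 0.0 for f in freqs]
    raise ValueError("unknown strategy: " + strategy)


def impute(rows, fill_values):
    """Replace missing entries with the fitted fill value of their feature."""
    return [
        [fill_values[j] if is_missing(v) else v for j, v in enumerate(row)]
        for row in rows
    ]
```

For example, `fit_imputer(partitions, n_features=2)` returns the per-feature means across all partitions, and `impute(partition, means)` can then run locally on each partition without further data exchange.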
[jira] [Created] (IGNITE-8542) [ML] Add OneVsRest Trainer to handle cases with multiple class labels in dataset
Aleksey Zinoviev created IGNITE-8542: Summary: [ML] Add OneVsRest Trainer to handle cases with multiple class labels in dataset Key: IGNITE-8542 URL: https://issues.apache.org/jira/browse/IGNITE-8542 Project: Ignite Issue Type: Improvement Components: ml Reporter: Aleksey Zinoviev Assignee: Aleksey Zinoviev
[jira] [Created] (IGNITE-8511) [ML] Add support for Multi-Class Logistic Regression
Aleksey Zinoviev created IGNITE-8511: Summary: [ML] Add support for Multi-Class Logistic Regression Key: IGNITE-8511 URL: https://issues.apache.org/jira/browse/IGNITE-8511 Project: Ignite Issue Type: New Feature Components: ml Reporter: Aleksey Zinoviev Assignee: Aleksey Zinoviev
[jira] [Updated] (IGNITE-8451) [ML] Refactor Labeled Dataset: remove unused methods and fields
[ https://issues.apache.org/jira/browse/IGNITE-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-8451: - Description: Remove * loading from file * distributed version (we need local version only) > [ML] Refactor Labeled Dataset: remove unused methods and fields > --- > > Key: IGNITE-8451 > URL: https://issues.apache.org/jira/browse/IGNITE-8451 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major > > Remove > * loading from file > * distributed version (we need local version only)
[jira] [Updated] (IGNITE-8451) [ML] Refactor Labeled Dataset: remove unused methods and fields
[ https://issues.apache.org/jira/browse/IGNITE-8451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-8451: - Description: Remove * loading from file * distributed version (we need local version only) * parent class Dataset and meta-information was: Remove * loading from file * distributed version (we need local version only) > [ML] Refactor Labeled Dataset: remove unused methods and fields > --- > > Key: IGNITE-8451 > URL: https://issues.apache.org/jira/browse/IGNITE-8451 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major > > Remove > * loading from file > * distributed version (we need local version only) > * parent class Dataset and meta-information
[jira] [Created] (IGNITE-8451) [ML] Refactor Labeled Dataset: remove unused methods and fields
Aleksey Zinoviev created IGNITE-8451: Summary: [ML] Refactor Labeled Dataset: remove unused methods and fields Key: IGNITE-8451 URL: https://issues.apache.org/jira/browse/IGNITE-8451 Project: Ignite Issue Type: Improvement Components: ml Reporter: Aleksey Zinoviev Assignee: Aleksey Zinoviev
[jira] [Updated] (IGNITE-8450) [ML] Cleanup the ML package: remove unused vector/matrix classes
[ https://issues.apache.org/jira/browse/IGNITE-8450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-8450: - Description: Remove * unused algebraic classes * related tests * related matrix algorithms * related utils stuff * related examples * related yardstick tests > [ML] Cleanup the ML package: remove unused vector/matrix classes > > > Key: IGNITE-8450 > URL: https://issues.apache.org/jira/browse/IGNITE-8450 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major > > Remove > * unused algebraic classes > * related tests > * related matrix algorithms > * related utils stuff > * related examples > * related yardstick tests
[jira] [Created] (IGNITE-8450) [ML] Cleanup the ML package: remove unused vector/matrix classes
Aleksey Zinoviev created IGNITE-8450: Summary: [ML] Cleanup the ML package: remove unused vector/matrix classes Key: IGNITE-8450 URL: https://issues.apache.org/jira/browse/IGNITE-8450 Project: Ignite Issue Type: Improvement Components: ml Reporter: Aleksey Zinoviev Assignee: Aleksey Zinoviev
[jira] [Assigned] (IGNITE-8398) Update documentation for KMeans clustering (release 2.5)
[ https://issues.apache.org/jira/browse/IGNITE-8398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev reassigned IGNITE-8398: Assignee: Akmal Chaudhri (was: Aleksey Zinoviev) > Update documentation for KMeans clustering (release 2.5) > > > Key: IGNITE-8398 > URL: https://issues.apache.org/jira/browse/IGNITE-8398 > Project: Ignite > Issue Type: Improvement > Components: documentation, ml >Affects Versions: 2.5 >Reporter: Aleksey Zinoviev >Assignee: Akmal Chaudhri >Priority: Major > > In Apache Ignite 2.5 we changed kMeans clustering to work on top of the > partition-based dataset and removed FuzzyCMeans, so we need to update the > documentation for this feature. > > Previous version: > [https://dash.readme.io/project/apacheignite/v2.4/docs/k-means-clustering] > update with > New version: > [https://dash.readme.io/project/apacheignite/v2.4/docs/k-means-clustering-25] > > IMPORTANT: Remove page > [https://dash.readme.io/project/apacheignite/v2.4/docs/fuzzy-c-means-clustering]
[jira] [Assigned] (IGNITE-8399) Add documentation for SVM Binary and Multi-class classification (release 2.5)
[ https://issues.apache.org/jira/browse/IGNITE-8399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev reassigned IGNITE-8399: Assignee: Akmal Chaudhri (was: Aleksey Zinoviev) > Add documentation for SVM Binary and Multi-class classification (release 2.5) > - > > Key: IGNITE-8399 > URL: https://issues.apache.org/jira/browse/IGNITE-8399 > Project: Ignite > Issue Type: Improvement > Components: documentation, ml >Affects Versions: 2.5 >Reporter: Aleksey Zinoviev >Assignee: Akmal Chaudhri >Priority: Major > > In Apache Ignite 2.5 we added SVM binary and multi-class > classification working on top of the partition-based dataset, and now we need to > update the documentation for this feature. > Add page [https://dash.readme.io/project/apacheignite/v2.4/docs/svm-25] > Add page > [https://dash.readme.io/project/apacheignite/v2.4/docs/svm-multi-class-classification-25]
[jira] [Assigned] (IGNITE-8397) Update documentation for kNN regression (release 2.5)
[ https://issues.apache.org/jira/browse/IGNITE-8397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev reassigned IGNITE-8397: Assignee: Akmal Chaudhri (was: Aleksey Zinoviev) > Update documentation for kNN regression (release 2.5) > - > > Key: IGNITE-8397 > URL: https://issues.apache.org/jira/browse/IGNITE-8397 > Project: Ignite > Issue Type: Improvement > Components: documentation, ml >Affects Versions: 2.5 >Reporter: Aleksey Zinoviev >Assignee: Akmal Chaudhri >Priority: Major > > In Apache Ignite 2.5 we changed kNN regression to work on top of the > partition-based dataset, and now we need to update the documentation for this > feature. > > Previous version: > [https://dash.readme.io/project/apacheignite/v2.4/docs/knn-regression] > update with > New version: > [https://dash.readme.io/project/apacheignite/v2.4/docs/k-nn-regression-25|http://example.com]
[jira] [Assigned] (IGNITE-8396) Update documentation for kNN classification (release 2.5)
[ https://issues.apache.org/jira/browse/IGNITE-8396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev reassigned IGNITE-8396: Assignee: Akmal Chaudhri (was: Aleksey Zinoviev) > Update documentation for kNN classification (release 2.5) > - > > Key: IGNITE-8396 > URL: https://issues.apache.org/jira/browse/IGNITE-8396 > Project: Ignite > Issue Type: Improvement > Components: documentation, ml >Affects Versions: 2.5 >Reporter: Aleksey Zinoviev >Assignee: Akmal Chaudhri >Priority: Major > > In Apache Ignite 2.5 we changed kNN classification to work on top of the > partition-based dataset, and now we need to update the documentation for this > feature. > > Previous version: > [https://dash.readme.io/project/apacheignite/v2.4/docs/knn-classification] > update with > New version: > [https://dash.readme.io/project/apacheignite/v2.4/docs/k-nn-classification-25]
[jira] [Created] (IGNITE-8410) [ML] Unify KNNClassification/KNNRegression Model Trainer .fit() signatures
Aleksey Zinoviev created IGNITE-8410: Summary: [ML] Unify KNNClassification/KNNRegression Model Trainer .fit() signatures Key: IGNITE-8410 URL: https://issues.apache.org/jira/browse/IGNITE-8410 Project: Ignite Issue Type: Improvement Components: ml Affects Versions: 2.6 Reporter: Aleksey Zinoviev Assignee: Aleksey Zinoviev Make the fit calls similar: refactor one of the trainers and remove one signature. A possible solution is to pass dataCache and ignite separately.
[jira] [Updated] (IGNITE-8250) Adopt Fuzzy CMeans to PartitionedDatasets
[ https://issues.apache.org/jira/browse/IGNITE-8250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev updated IGNITE-8250: - Description: Add Model/Trainer, tests, example > Adopt Fuzzy CMeans to PartitionedDatasets > - > > Key: IGNITE-8250 > URL: https://issues.apache.org/jira/browse/IGNITE-8250 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major > > Add Model/Trainer, tests, example
[jira] [Resolved] (IGNITE-8168) [ML] Add KMeans version for Partitioned Datasets
[ https://issues.apache.org/jira/browse/IGNITE-8168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Zinoviev resolved IGNITE-8168. -- Resolution: Fixed > [ML] Add KMeans version for Partitioned Datasets > > > Key: IGNITE-8168 > URL: https://issues.apache.org/jira/browse/IGNITE-8168 > Project: Ignite > Issue Type: Improvement > Components: ml >Reporter: Aleksey Zinoviev >Assignee: Aleksey Zinoviev >Priority: Major