[ https://issues.apache.org/jira/browse/IGNITE-12079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexey Zinoviev updated IGNITE-12079:
-------------------------------------
    Fix Version/s:     (was: 2.8)
                   2.9

> [ML][Umbrella] Add advanced preprocessing techniques
> ----------------------------------------------------
>
>                 Key: IGNITE-12079
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12079
>             Project: Ignite
>          Issue Type: New Feature
>          Components: ml
>    Affects Versions: 2.9
>            Reporter: Alexey Zinoviev
>            Assignee: Alexey Zinoviev
>            Priority: Major
>             Fix For: 2.9
>
>
> *Main goal:*
> To reduce the gap between Apache Spark and Apache Ignite in preprocessing
> operations. Reducing this gap could help with loading Spark ML
> Pipelines into Ignite ML.
>
> Next steps:
> # Add Frequency Encoder
> # Add two Imputing Strategies (MIN, MAX, COUNT, MOST_FREQUENT,
> LEAST_FREQUENT)
> # Add RobustScaler (will be added in Spark 3.0)
> # Add CountVectorizer
> # Add FeatureHasher
> # Add QuantileDiscretizer
> # Add Locality Sensitive Hashing (LSH)
> # Add LabelEncoder
> # Add RevertStringIndexing
> # Add multi-column preprocessor

--
This message was sent by Atlassian Jira
(v8.3.4#803005)