[ https://issues.apache.org/jira/browse/IGNITE-8666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmitriy Pavlov updated IGNITE-8666:
-----------------------------------
    Fix Version/s:     (was: 2.6)
                   2.7

> Add ability of filtering data during datasets creation
> ------------------------------------------------------
>
>                 Key: IGNITE-8666
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8666
>             Project: Ignite
>          Issue Type: New Feature
>          Components: ml
>            Reporter: Yury Babak
>            Assignee: Anton Dmitriev
>            Priority: Major
>             Fix For: 2.7
>
>
> So far we have used a straightforward strategy to feed data into a
> partition-based dataset: we retrieve all entries from an upstream cache
> partition, transform them, and write the result into the corresponding
> dataset partition (data and context). As a result, we cannot choose which
> data is fed into the dataset and which is not. To implement IGNITE-8667
> (Splitting of dataset to test and training sets) and IGNITE-8668 (K-fold
> cross validation of models) we need this ability.
> The goal of this task is to add the ability to filter the data that is fed
> from the cache to the dataset. This will allow us to create different
> datasets (training, testing, k-fold, etc.) based on a single cache.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
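
For illustration only, the filtering step described in the issue could look roughly like the sketch below. It is plain Java and does not use any Ignite API: the class `DatasetFilterSketch`, the method `buildPartition`, and the use of a `Map` as a stand-in for an upstream cache partition are all assumptions made for this example, not the design adopted in the ticket.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiPredicate;
import java.util.function.Function;

// Hypothetical sketch: none of these names are Ignite APIs.
final class DatasetFilterSketch {

    /**
     * Builds one dataset partition from one upstream partition.
     * Only entries accepted by the filter are transformed and written.
     */
    static <K, V, D> Map<K, D> buildPartition(
            Map<K, V> upstreamPartition,   // stand-in for a cache partition
            BiPredicate<K, V> filter,      // decides which entries are fed in
            Function<V, D> transform) {    // upstream value -> dataset value
        Map<K, D> datasetPartition = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : upstreamPartition.entrySet()) {
            if (filter.test(e.getKey(), e.getValue())) {
                datasetPartition.put(e.getKey(), transform.apply(e.getValue()));
            }
        }
        return datasetPartition;
    }

    public static void main(String[] args) {
        // A single "cache" holding ten feature vectors keyed by integer id.
        Map<Integer, double[]> upstream = new LinkedHashMap<>();
        for (int i = 0; i < 10; i++) {
            upstream.put(i, new double[] {i, 2.0 * i});
        }

        // Deterministic split by key: every fifth key goes to the test set.
        BiPredicate<Integer, double[]> isTest = (k, v) -> k % 5 == 0;

        Map<Integer, double[]> train =
            buildPartition(upstream, isTest.negate(), v -> v);
        Map<Integer, double[]> test =
            buildPartition(upstream, isTest, v -> v);

        System.out.println("train keys=" + train.keySet()
            + " test keys=" + test.keySet());
    }
}
```

Because the filter is an ordinary predicate over key and value, the same upstream data can back a training set, a test set, or each fold of a k-fold split by passing a different predicate, which is the property IGNITE-8667 and IGNITE-8668 depend on.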