[ https://issues.apache.org/jira/browse/IGNITE-8666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy Pavlov updated IGNITE-8666:
-----------------------------------
    Fix Version/s:     (was: 2.6)
                   2.7

> Add ability of filtering data during datasets creation
> ------------------------------------------------------
>
>                 Key: IGNITE-8666
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8666
>             Project: Ignite
>          Issue Type: New Feature
>          Components: ml
>            Reporter: Yury Babak
>            Assignee: Anton Dmitriev
>            Priority: Major
>             Fix For: 2.7
>
>
> So far we have used a straightforward strategy to feed data into a
> partition-based dataset: we retrieve all entries from an upstream cache
> partition, transform them, and write the result into the corresponding
> dataset partition (data and context). As a result, we cannot choose which
> data is fed into the dataset and which is not. To implement IGNITE-8667
> (Splitting of dataset to test and training sets) and IGNITE-8668 (K-fold
> cross validation of models) we need this ability.
> The goal of this task is to add the ability to filter the data that is fed
> from a cache into a dataset. This will allow us to create different datasets
> (training, testing, k-fold, etc.) based on a single cache.
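
The filtering idea described above can be illustrated with a minimal sketch. It does not use the Ignite API: a plain java.util.Map stands in for one upstream cache partition, and the names FilteredDatasetSketch, buildPartition and isTest are hypothetical, chosen only for this example. The assumption is that the filter is a predicate over the key and value of each upstream entry, and that a train/test split is obtained by applying a predicate and its negation to the same source.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiPredicate;

public class FilteredDatasetSketch {
    /** Hypothetical helper (not an Ignite API): keeps only the upstream entries accepted by the filter. */
    static <K, V> Map<K, V> buildPartition(Map<K, V> upstream, BiPredicate<K, V> filter) {
        Map<K, V> res = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : upstream.entrySet()) {
            if (filter.test(e.getKey(), e.getValue()))
                res.put(e.getKey(), e.getValue());
        }
        return res;
    }

    public static void main(String[] args) {
        // Stand-in for one upstream cache partition.
        Map<Integer, Double> upstream = new LinkedHashMap<>();
        for (int i = 0; i < 10; i++)
            upstream.put(i, i * 0.5);

        // Deterministic split by key: every fifth key goes to the test set.
        BiPredicate<Integer, Double> isTest = (k, v) -> k % 5 == 0;

        // Two different datasets built from the same upstream data.
        Map<Integer, Double> test = buildPartition(upstream, isTest);
        Map<Integer, Double> train = buildPartition(upstream, isTest.negate());

        System.out.println("train=" + train.size() + ", test=" + test.size());
    }
}
```

Because the predicate depends only on the key, the split is reproducible across runs. A k-fold variant could, under the same assumptions, use a predicate such as Math.floorMod(key, k) == foldIndex to select the held-out fold.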



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
