[jira] [Updated] (IGNITE-8666) Add ability of filtering data during datasets creation

2018-06-26 Thread Dmitriy Pavlov (JIRA)


 [ 
https://issues.apache.org/jira/browse/IGNITE-8666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Pavlov updated IGNITE-8666:
---
Fix Version/s: 2.7 (was: 2.6)

> Add ability of filtering data during datasets creation
> --
>
> Key: IGNITE-8666
> URL: https://issues.apache.org/jira/browse/IGNITE-8666
> Project: Ignite
> Issue Type: New Feature
> Components: ml
> Reporter: Yury Babak
> Assignee: Anton Dmitriev
> Priority: Major
> Fix For: 2.7
>
>
> So far we use a straightforward strategy to feed data into a partition-based
> dataset: we retrieve all entries from an upstream cache partition, transform
> them, and write the result into the corresponding dataset partition (data and
> context). As a result, we cannot choose which data is fed into the dataset and
> which is not. We need this ability to implement IGNITE-8667 (Splitting of
> dataset to test and training sets) and IGNITE-8668 (K-fold cross validation of
> models).
> The goal of this task is to add the ability to filter the data that is fed
> from the cache to the dataset. This will allow us to create different datasets
> (training, testing, k-fold, etc.) based on a single cache.
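
The filtering idea described above can be illustrated with a small,
self-contained Python sketch. It is not the Ignite ML API: a plain dict stands
in for one upstream cache partition, and the names build_partition, keep,
fold_of and to_row are hypothetical. It only shows how a key/value predicate
applied while data is fed lets a single cache back training, testing and
k-fold datasets.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple[int, float]


def build_partition(
    upstream: Dict[int, float],
    transform: Callable[[int, float], Row],
    keep: Callable[[int, float], bool] = lambda key, value: True,
) -> List[Row]:
    """Feed one upstream partition into a dataset partition.

    Entries rejected by the `keep` predicate are skipped before the
    transformation, so they never reach the dataset.
    """
    return [transform(k, v) for k, v in upstream.items() if keep(k, v)]


def fold_of(key: int, folds: int) -> int:
    """Hypothetical deterministic fold assignment for integer keys."""
    return key % folds


def to_row(key: int, value: float) -> Row:
    """Trivial stand-in for the upstream-to-dataset transformation."""
    return (key, value)


# A dict stands in for one upstream cache partition (assumption).
upstream = {i: i * 1.5 for i in range(10)}

FOLDS = 5

# Train/test split: fold 0 is held out for testing.
train = build_partition(upstream, to_row, lambda k, v: fold_of(k, FOLDS) != 0)
test = build_partition(upstream, to_row, lambda k, v: fold_of(k, FOLDS) == 0)

# K-fold cross validation: one (train, validation) pair per fold.
# `f=f` binds the current fold number inside each lambda.
k_fold = [
    (
        build_partition(upstream, to_row, lambda k, v, f=f: fold_of(k, FOLDS) != f),
        build_partition(upstream, to_row, lambda k, v, f=f: fold_of(k, FOLDS) == f),
    )
    for f in range(FOLDS)
]
```

In this sketch the predicate is evaluated per entry while the partition is
read, so every split is derived from the same upstream data without copying
the cache first.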



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (IGNITE-8666) Add ability of filtering data during datasets creation

2018-06-01 Thread Anton Dmitriev (JIRA)



Anton Dmitriev updated IGNITE-8666:
---
Description:
So far we use a straightforward strategy to feed data into a partition-based
dataset: we retrieve all entries from an upstream cache partition, transform
them, and write the result into the corresponding dataset partition (data and
context). As a result, we cannot choose which data is fed into the dataset and
which is not. We need this ability to implement IGNITE-8667 (Splitting of
dataset to test and training sets) and IGNITE-8668 (K-fold cross validation of
models).

The goal of this task is to add the ability to filter the data that is fed
from the cache to the dataset. This will allow us to create different datasets
(training, testing, k-fold, etc.) based on a single cache.

  was:
So far we use a straightforward strategy to feed data into a partition-based
dataset: we retrieve all entries from an upstream cache partition, transform
them, and write the result into the corresponding dataset partition (data and
context). As a result, we cannot choose which data is fed into the dataset and
which is not. We need this ability to implement IGNITE-8667 (Splitting of
dataset to test and training sets) and IGNITE-8668 (K-fold cross validation of
models).

The goal of this task is to add the ability to filter the data that is fed
from the cache to the dataset. This will allow us to create different datasets
(training, testing, k-fold, etc.) based on a single cache


> Add ability of filtering data during datasets creation
> --
>
> Key: IGNITE-8666
> URL: https://issues.apache.org/jira/browse/IGNITE-8666
> Project: Ignite
> Issue Type: New Feature
> Components: ml
> Reporter: Yury Babak
> Assignee: Anton Dmitriev
> Priority: Major
> Fix For: 2.6
>
>
> So far we use a straightforward strategy to feed data into a partition-based
> dataset: we retrieve all entries from an upstream cache partition, transform
> them, and write the result into the corresponding dataset partition (data and
> context). As a result, we cannot choose which data is fed into the dataset and
> which is not. We need this ability to implement IGNITE-8667 (Splitting of
> dataset to test and training sets) and IGNITE-8668 (K-fold cross validation of
> models).
> The goal of this task is to add the ability to filter the data that is fed
> from the cache to the dataset. This will allow us to create different datasets
> (training, testing, k-fold, etc.) based on a single cache.





[jira] [Updated] (IGNITE-8666) Add ability of filtering data during datasets creation

2018-06-01 Thread Anton Dmitriev (JIRA)



Anton Dmitriev updated IGNITE-8666:
---
Description:
So far we use a straightforward strategy to feed data into a partition-based
dataset: we retrieve all entries from an upstream cache partition, transform
them, and write the result into the corresponding dataset partition (data and
context). As a result, we cannot choose which data is fed into the dataset and
which is not. We need this ability to implement IGNITE-8667 (Splitting of
dataset to test and training sets) and IGNITE-8668 (K-fold cross validation of
models).

The goal of this task is to add the ability to filter the data that is fed
from the cache to the dataset. This will allow us to create different datasets
(training, testing, k-fold, etc.) based on a single cache

  was:
So far we use a straightforward strategy to feed data into a partition-based
dataset: we retrieve all entries from an upstream cache partition, transform
them, and write the result into the corresponding dataset partition (data and
context). As a result, we cannot choose which data is fed into the dataset and
which is not. We need this ability to implement IGNITE-8667 (Splitting of
dataset to test and training sets) and IGNITE-8668 (K-fold cross validation of
models).

The goal of this task is to add the ability to filter the data that is fed
from the cache to the dataset.


> Add ability of filtering data during datasets creation
> --
>
> Key: IGNITE-8666
> URL: https://issues.apache.org/jira/browse/IGNITE-8666
> Project: Ignite
> Issue Type: New Feature
> Components: ml
> Reporter: Yury Babak
> Assignee: Anton Dmitriev
> Priority: Major
> Fix For: 2.6
>
>
> So far we use a straightforward strategy to feed data into a partition-based
> dataset: we retrieve all entries from an upstream cache partition, transform
> them, and write the result into the corresponding dataset partition (data and
> context). As a result, we cannot choose which data is fed into the dataset and
> which is not. We need this ability to implement IGNITE-8667 (Splitting of
> dataset to test and training sets) and IGNITE-8668 (K-fold cross validation of
> models).
> The goal of this task is to add the ability to filter the data that is fed
> from the cache to the dataset. This will allow us to create different datasets
> (training, testing, k-fold, etc.) based on a single cache





[jira] [Updated] (IGNITE-8666) Add ability of filtering data during datasets creation

2018-06-01 Thread Anton Dmitriev (JIRA)



Anton Dmitriev updated IGNITE-8666:
---
Description:
So far we use a straightforward strategy to feed data into a partition-based
dataset: we retrieve all entries from an upstream cache partition, transform
them, and write the result into the corresponding dataset partition (data and
context). As a result, we cannot choose which data is fed into the dataset and
which is not. We need this ability to implement IGNITE-8667 (Splitting of
dataset to test and training sets) and IGNITE-8668 (K-fold cross validation of
models).

The goal of this task is to add the ability to filter the data that is fed
from the cache to the dataset.

  was: So far we use a straightforward strategy to feed data into a
partition-based dataset: we retrieve all entries from an upstream cache
partition, transform them, and write the result into the corresponding dataset
partition (data and context). As a result, we cannot choose which data is fed
into the dataset and which is not. We need this ability to implement
IGNITE-8667 (Splitting of dataset to test and training sets) and IGNITE-8668
(K-fold cross validation of models).


> Add ability of filtering data during datasets creation
> --
>
> Key: IGNITE-8666
> URL: https://issues.apache.org/jira/browse/IGNITE-8666
> Project: Ignite
> Issue Type: New Feature
> Components: ml
> Reporter: Yury Babak
> Assignee: Anton Dmitriev
> Priority: Major
> Fix For: 2.6
>
>
> So far we use a straightforward strategy to feed data into a partition-based
> dataset: we retrieve all entries from an upstream cache partition, transform
> them, and write the result into the corresponding dataset partition (data and
> context). As a result, we cannot choose which data is fed into the dataset and
> which is not. We need this ability to implement IGNITE-8667 (Splitting of
> dataset to test and training sets) and IGNITE-8668 (K-fold cross validation of
> models).
> The goal of this task is to add the ability to filter the data that is fed
> from the cache to the dataset.





[jira] [Updated] (IGNITE-8666) Add ability of filtering data during datasets creation

2018-06-01 Thread Anton Dmitriev (JIRA)



Anton Dmitriev updated IGNITE-8666:
---
Description: So far we use a straightforward strategy to feed data into a
partition-based dataset: we retrieve all entries from an upstream cache
partition, transform them, and write the result into the corresponding dataset
partition (data and context). As a result, we cannot choose which data is fed
into the dataset and which is not. We need this ability to implement
IGNITE-8667 (Splitting of dataset to test and training sets) and IGNITE-8668
(K-fold cross validation of models).

> Add ability of filtering data during datasets creation
> --
>
> Key: IGNITE-8666
> URL: https://issues.apache.org/jira/browse/IGNITE-8666
> Project: Ignite
> Issue Type: New Feature
> Components: ml
> Reporter: Yury Babak
> Assignee: Anton Dmitriev
> Priority: Major
> Fix For: 2.6
>
>
> So far we use a straightforward strategy to feed data into a partition-based
> dataset: we retrieve all entries from an upstream cache partition, transform
> them, and write the result into the corresponding dataset partition (data and
> context). As a result, we cannot choose which data is fed into the dataset and
> which is not. We need this ability to implement IGNITE-8667 (Splitting of
> dataset to test and training sets) and IGNITE-8668 (K-fold cross validation of
> models).


