Chengxiang Li created FLINK-2396:
------------------------------------
Summary: Review the datasets of dynamic path and static path in
iteration.
Key: FLINK-2396
URL: https://issues.apache.org/jira/browse/FLINK-2396
Project: Flink
Issue Type: Improvement
Components: Core
Reporter: Chengxiang Li
Priority: Minor
Currently Flink caches any dataset on the static path, because it assumes that
such a dataset stays the same across iterations, but this assumption does not
always hold. Take sampling as an example: the iteration dataset is something
like the weight vector of a model, and there is a separate training dataset
from which a small sample is drawn to update the weight vector in each
iteration (e.g. Stochastic Gradient Descent). We expect the sampled dataset to
be different in each iteration, but Flink caches it because it is on the
static path.
We should review how Flink identifies the dynamic path and the static path, and
support adding the sampled dataset in the above example to the dynamic path.
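
To make the difference concrete, here is a minimal sketch in plain Python. It
is not Flink code and does not use any Flink API; the function name
run_iterations, its parameters, and the toy update rule are all hypothetical
and only mimic the two behaviours described above: a sample that is evaluated
once and reused (static path), and a sample that is redrawn in every iteration
(dynamic path).

```python
import random


def run_iterations(training, iterations, sample_size,
                   resample_each_iteration, seed=0):
    """Toy stand-in for an iterative job (hypothetical, not Flink API).

    Returns the final weight and the list of samples each iteration saw.
    """
    rng = random.Random(seed)
    # "Static path": the sample is computed once, before the loop starts.
    cached_sample = rng.sample(training, sample_size)

    weight = 0.0
    samples_seen = []
    for _ in range(iterations):
        if resample_each_iteration:
            # "Dynamic path": the sample is recomputed in every iteration.
            batch = rng.sample(training, sample_size)
        else:
            # Cached behaviour: every iteration reuses the same sample.
            batch = cached_sample
        # Toy update: move the weight halfway toward the batch mean.
        batch_mean = sum(batch) / len(batch)
        weight += 0.5 * (batch_mean - weight)
        samples_seen.append(tuple(batch))
    return weight, samples_seen


if __name__ == "__main__":
    data = list(range(1000))
    _, cached = run_iterations(data, 5, 10, resample_each_iteration=False)
    _, redrawn = run_iterations(data, 5, 10, resample_each_iteration=True)
    print("distinct samples when cached: ", len(set(cached)))
    print("distinct samples when redrawn:", len(set(redrawn)))
```

With the cached variant every iteration updates the weight from the same
sample, which is the behaviour this issue reports for a sampled dataset placed
on the static path.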
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)