Andrew Palumbo created MAHOUT-1802:
--------------------------------------
Summary: Capture attached checkpoints (if cached)
Key: MAHOUT-1802
URL: https://issues.apache.org/jira/browse/MAHOUT-1802
Project: Mahout
Issue Type: Improvement
Affects Versions: 0.11.1
Reporter: Andrew Palumbo
Assignee: Andrew Palumbo
Fix For: 0.11.2
Currently, the optimizer generates checkpoints and attaches them to the actual
logical elements of the DAG via CheckpointAction$cp.
The way it works today is as follows:
{code}
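val drmC = drmA + drmB
val cp1 = drmC.checkpoint() // creates a checkpoint and attaches it to drmC
val cp2 = drmC.checkpoint() // returns the attached checkpoint: cp2 == cp1
val drmD = cp1 + drmE // reuses the checkpointed result of drmA + drmB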
{code}
However, in:
{code}
val drmD = drmC + drmE // recomputes drmA + drmB all over again
{code}
{{drmC}} already has {{cp1}} attached to it, so we should assume that the common
computational path is the intent here and use it, instead of building plans
that recompute it. That is, {{drmD = drmC + drmE}} should imply {{cp1 + drmE}}
as well, even if the checkpoint is not referenced explicitly.
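
To make the proposal concrete, here is a minimal toy sketch of the intended rewrite. It is not Mahout code: {{Plan}}, {{Leaf}}, {{Plus}}, {{Checkpoint}}, {{captureCheckpoints}} and {{pendingAdds}} are hypothetical names made up for illustration, and the real optimizer and {{CheckpointAction}} may be structured quite differently. The sketch only assumes that a logical node can carry an attached checkpoint and that a cached checkpoint can stand in for the node it was taken from.

```scala
// Toy model only: every type and function here is hypothetical and is NOT
// part of Mahout's API. It illustrates the proposed plan rewrite.
object CheckpointSketch {

  sealed trait Plan {
    // Checkpoint attached to this logical node, if any.
    var attached: Option[Checkpoint] = None

    // Idempotent: the first call attaches a checkpoint, later calls return it.
    def checkpoint(cached: Boolean = true): Checkpoint = this match {
      case cp: Checkpoint => cp
      case _ =>
        attached.getOrElse {
          val cp = new Checkpoint(this, cached)
          attached = Some(cp)
          cp
        }
    }

    def +(that: Plan): Plan = Plus(this, that)
  }

  final case class Leaf(name: String) extends Plan
  final case class Plus(a: Plan, b: Plan) extends Plan
  final class Checkpoint(val source: Plan, val cached: Boolean) extends Plan

  // Proposed behavior: any node that carries a cached checkpoint is replaced
  // by that checkpoint, even if the user referenced the node itself.
  def captureCheckpoints(plan: Plan): Plan = plan match {
    case cp: Checkpoint                   => cp
    case p if p.attached.exists(_.cached) => p.attached.get
    case Plus(a, b)                       => Plus(captureCheckpoints(a), captureCheckpoints(b))
    case leaf: Leaf                       => leaf
  }

  // Number of `+` operators that still have to be evaluated for a plan.
  // A cached checkpoint counts as already materialized.
  def pendingAdds(plan: Plan): Int = plan match {
    case cp: Checkpoint => if (cp.cached) 0 else pendingAdds(cp.source)
    case Plus(a, b)     => 1 + pendingAdds(a) + pendingAdds(b)
    case _: Leaf        => 0
  }

  // The scenario from this issue.
  val drmA = Leaf("A")
  val drmB = Leaf("B")
  val drmE = Leaf("E")

  val drmC = drmA + drmB
  val cp1 = drmC.checkpoint()
  val cp2 = drmC.checkpoint() // same instance as cp1

  val drmD = drmC + drmE                    // the user did not mention cp1
  val rewritten = captureCheckpoints(drmD)  // plan becomes cp1 + drmE
}
```

In this toy model {{pendingAdds(drmD)}} counts two additions before the rewrite and {{pendingAdds(rewritten)}} counts one afterwards, which is the saving the issue asks for. An uncached checkpoint is deliberately left alone, matching the "(if cached)" qualifier in the summary.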