[
https://issues.apache.org/jira/browse/MAHOUT-319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914134#action_12914134
]
Ted Dunning commented on MAHOUT-319:
------------------------------------
Saikat,
That sounds like an excellent thing to do. My guess is that you won't need
much machinery and will be able to implement the local disk and HDFS versions
with the same code. The RDBMS is probably not of interest because this is for
checkpointing a running algorithm so that it can be restarted later. If you
want to analyze something, you probably don't want the vectors in the RDBMS
anyway because dot products are so painful there.
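To illustrate the point about sharing code between local disk and HDFS, here
is a minimal sketch, not anything that exists in Mahout. The class name
SolverCheckpoint, its fields, and the byte layout are all hypothetical. The
idea is only that the checkpoint code writes to a plain java.io stream and
lets the caller decide where the bytes end up.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical checkpoint holder for an iterative solver; not a Mahout class.
final class SolverCheckpoint {
  final int iteration;       // number of solver iterations completed so far
  final double[][] vectors;  // dense vectors accumulated so far (assumed layout)

  SolverCheckpoint(int iteration, double[][] vectors) {
    this.iteration = iteration;
    this.vectors = vectors;
  }

  // Writes the state to any OutputStream. The stream is flushed but not
  // closed, so the caller keeps control of the underlying file.
  static void save(SolverCheckpoint state, OutputStream rawOut) throws IOException {
    DataOutputStream out = new DataOutputStream(rawOut);
    out.writeInt(state.iteration);
    out.writeInt(state.vectors.length);
    for (double[] v : state.vectors) {
      out.writeInt(v.length);
      for (double x : v) {
        out.writeDouble(x);
      }
    }
    out.flush();
  }

  // Reads back a state written by save(), in the same field order.
  static SolverCheckpoint load(InputStream rawIn) throws IOException {
    DataInputStream in = new DataInputStream(rawIn);
    int iteration = in.readInt();
    double[][] vectors = new double[in.readInt()][];
    for (int i = 0; i < vectors.length; i++) {
      vectors[i] = new double[in.readInt()];
      for (int j = 0; j < vectors[i].length; j++) {
        vectors[i][j] = in.readDouble();
      }
    }
    return new SolverCheckpoint(iteration, vectors);
  }
}
```

For local disk the caller would pass a java.io.FileOutputStream or
FileInputStream. For HDFS, the streams returned by Hadoop's
FileSystem.create(Path) and FileSystem.open(Path) are ordinary java.io
streams as well, so the same save() and load() should work unchanged.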
We have a 3-9 month release cycle, but any patch that you produce would be
available pretty much right away in the trunk version of Mahout. At this
point, trunk is typically much more useful than an older release because so
many new and exciting things are going into Mahout on a daily basis. Once
things are more stable and the changes become more incremental, I would expect
stable releases to become more prominent.
> SVD solvers should be gracefully stoppable/restartable
> ------------------------------------------------------
>
> Key: MAHOUT-319
> URL: https://issues.apache.org/jira/browse/MAHOUT-319
> Project: Mahout
> Issue Type: Improvement
> Affects Versions: 0.3
> Reporter: Jake Mannix
> Assignee: Jake Mannix
> Fix For: 0.5
>
>
> LanczosSolver, DistributedLanczosSolver, and HebbianSolver all keep copious
> amounts of memory-resident data which is lost if the app crashes or is killed
> (OOM, forgetting to run in a screen session, losing network connectivity to
> the server running it, etc.).
> These algorithms (and many other Mahout processes!) should enable a pluggable
> "persist state" mechanism (to HDFS, RDBMS, local disk, key-value store, etc),
> and similarly, a way to pick up and start from such a state.