[
https://issues.apache.org/jira/browse/MAHOUT-376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12917375#action_12917375
]
Dmitriy Lyubimov commented on MAHOUT-376:
-----------------------------------------
Actually, I reviewed the getSplits() code for sequence files. It seems to honor
the minSplitSize property of FileInputFormat. What's more, it applies a slop
factor of 1.1, so a short tail gets folded into the last split instead of
becoming a tiny split of its own. So that should work nicely. If we get an
insufficient # of rows in mappers, I guess we can just increase the
minSplitSize. Unless the input is partitioned so that min split size > min file
size in the input, since a split never spans more than one file.
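
For reference, here is a rough standalone sketch of the split arithmetic as I understand it. It does not call Hadoop at all: the class SplitSketch and its method names are made up for illustration, and the formula max(minSize, min(maxSize, blockSize)) together with the 1.1 slop loop is my reading of FileInputFormat, so treat it as an assumption and check it against the Hadoop version actually in use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of per-file split sizing; not Hadoop code.
public class SplitSketch {
    // Assumed slop factor: a tail of up to 10% of the split size
    // is merged into the last split.
    static final double SPLIT_SLOP = 1.1;

    // Assumed rule: the block size, clamped to [minSize, maxSize].
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Lengths of the splits produced for ONE file; splits never span files.
    static List<Long> splitLengths(long fileLength, long splitSize) {
        List<Long> lengths = new ArrayList<>();
        long remaining = fileLength;
        // Carve off full-size splits while more than 1.1 split sizes remain.
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            lengths.add(splitSize);
            remaining -= splitSize;
        }
        // Whatever is left becomes the last split.
        if (remaining != 0) {
            lengths.add(remaining);
        }
        return lengths;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // 64 MB blocks, min split size raised to 128 MB, no upper bound.
        long splitSize = computeSplitSize(64 * mb, 128 * mb, Long.MAX_VALUE);
        System.out.println(splitLengths(260 * mb, splitSize));
        System.out.println(splitLengths(50 * mb, splitSize));
    }
}
```

With these numbers a 260 MB file gives splits of 128 MB and 132 MB, while a 50 MB file gives a single 50 MB split no matter how large the min split size is. That second case is the caveat above: raising minSplitSize cannot help when the input files themselves are smaller than it.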
> Implement Map-reduce version of stochastic SVD
> ----------------------------------------------
>
> Key: MAHOUT-376
> URL: https://issues.apache.org/jira/browse/MAHOUT-376
> Project: Mahout
> Issue Type: Improvement
> Components: Math
> Reporter: Ted Dunning
> Assignee: Ted Dunning
> Fix For: 0.5
>
> Attachments: MAHOUT-376.patch, sd-bib.bib, sd.pdf, sd.tex, Stochastic
> SVD using eigensolver trick.pdf
>
>
> See attached pdf for outline of proposed method.
> All comments are welcome.