[
https://issues.apache.org/jira/browse/MAHOUT-633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012735#comment-13012735
]
Dmitriy Lyubimov commented on MAHOUT-633:
-----------------------------------------
bq. In the tests I ran, not significant. But I was using a preprocessing handler
on VectorWritable (the patch you guys were reluctant to accept) which did not
create intermediate vector storage at all. All matrix elements were passed in
on the stack. Mahout's patch doesn't have this code, but I will be happy to put it
in JIRA for discussion. Also, I did not run close to memory limits on the tasks
I ran with SSVD. I just don't have datasets that big.
Actually, a correction: I do think I saw a difference in running time between the
code running with the existing VectorWritable, which creates interim vector
instances for each iteration, and the code that just passes matrix elements on
the stack. But like I said, I did not measure the difference, as that was not my
goal. And it's not the evidence I'd like to appeal to anyway, since there wasn't
much side info. But I have other situations (outside the Mahout realm, but also
iterative batches with big side info) that I'd like to appeal to.
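To make the two approaches being compared concrete, here is a minimal sketch in plain Java. It is not the patch mentioned above and does not use VectorWritable; the class RowHandlerSketch, the ElementHandler interface, and both methods are hypothetical names invented for illustration. One path allocates an interim per-row copy before consuming it, the other hands each element to a callback as it is read, so nothing is allocated per row.

```java
public class RowHandlerSketch {
    /** Hypothetical callback: receives each element directly, with no interim vector. */
    public interface ElementHandler {
        void onElement(int index, double value);
    }

    /** Approach A: materialize an interim copy of each row, then consume it. */
    public static double sumWithInterimVector(double[][] rows) {
        double total = 0.0;
        for (double[] row : rows) {
            double[] interim = row.clone(); // stands in for a per-record vector allocation
            for (double v : interim) {
                total += v;
            }
        }
        return total;
    }

    /** Approach B: push each element to the handler as it is read; no per-row allocation. */
    public static void streamElements(double[][] rows, ElementHandler handler) {
        for (double[] row : rows) {
            for (int i = 0; i < row.length; i++) {
                handler.onElement(i, row[i]);
            }
        }
    }

    public static void main(String[] args) {
        double[][] rows = {{1.0, 2.0}, {3.0, 4.0}};
        double[] acc = new double[1]; // single-element array so the lambda can mutate it
        streamElements(rows, (i, v) -> acc[0] += v);
        System.out.println(sumWithInterimVector(rows) + " " + acc[0]);
    }
}
```

Both paths compute the same sum; the only difference is whether a per-row object is created along the way. Whether that matters for running time depends on the workload, which is the open question in the comment.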
> Add SequenceFileIterable; put Iterable stuff in one place
> ---------------------------------------------------------
>
> Key: MAHOUT-633
> URL: https://issues.apache.org/jira/browse/MAHOUT-633
> Project: Mahout
> Issue Type: Improvement
> Components: Classification, Clustering, Collaborative Filtering
> Affects Versions: 0.4
> Reporter: Sean Owen
> Assignee: Sean Owen
> Priority: Minor
> Labels: iterable, iterator, sequence-file
> Fix For: 0.5
>
> Attachments: MAHOUT-633.patch, MAHOUT-633.patch, MAHOUT-633.patch
>
>
> In another project I have a useful little class, SequenceFileIterable, which
> simplifies iterating over a sequence file. It's like FileLineIterable. I'd
> like to add it, then use it throughout the code. See patch, which for now
> merely has the proposed new classes.
> Well it also moves some other iterator-related classes that seemed to be
> outside their rightful home in common.iterator.
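The general pattern behind an Iterable wrapper of this kind can be sketched without Hadoop. This is not the class from the attached patch: ReaderIterableSketch, RecordSource, iterate, and fromList are hypothetical names, and an in-memory list stands in for a real sequence file. The idea is only that a pull-style reader gets adapted to Iterable so callers can use a for-each loop.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

public class ReaderIterableSketch {
    /** Hypothetical pull-style source: returns the next record, or null when exhausted. */
    public interface RecordSource<T> {
        T next();
    }

    /** Adapts a factory of pull-style sources to Iterable; each iterator() call opens a fresh source. */
    public static <T> Iterable<T> iterate(Supplier<RecordSource<T>> open) {
        return () -> {
            RecordSource<T> source = open.get();
            return new Iterator<T>() {
                private T lookahead = source.next(); // read one record ahead

                @Override
                public boolean hasNext() {
                    return lookahead != null;
                }

                @Override
                public T next() {
                    if (lookahead == null) {
                        throw new NoSuchElementException();
                    }
                    T current = lookahead;
                    lookahead = source.next();
                    return current;
                }
            };
        };
    }

    /** Stand-in source backed by an in-memory list instead of a real file. */
    public static RecordSource<String> fromList(List<String> items) {
        Iterator<String> it = items.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("a", "b", "c");
        for (String r : ReaderIterableSketch.<String>iterate(() -> fromList(records))) {
            System.out.println(r);
        }
    }
}
```

A real implementation over a file would also need to close the underlying reader once iteration finishes or fails; that is omitted here.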
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira