[
https://issues.apache.org/jira/browse/MAHOUT-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177902#comment-13177902
]
Josh Patterson commented on MAHOUT-833:
---------------------------------------
OK, I went ahead and added the chunksize param. The output only inflates
slightly over the param, because of the prefix and the path added to each key.
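To illustrate why a chunk can end up slightly over the param, here is a minimal
sketch. It is not the actual patch: it assumes the chunksize limit is checked
against the file contents only, while each key (prefix + path) adds a few
uncounted bytes. The class name, method name, and sample paths are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ChunkSizeSketch {

    /**
     * Groups files into chunks and returns the total size (keys + values) of
     * each chunk in bytes. Assumption: only value bytes count toward the limit.
     */
    public static List<Long> chunkSizes(Map<String, Integer> fileSizes,
                                        String prefix,
                                        long chunkSizeBytes) {
        List<Long> chunks = new ArrayList<>();
        long valueBytes = 0; // file contents in the current chunk
        long keyBytes = 0;   // prefix + path overhead in the current chunk
        for (Map.Entry<String, Integer> entry : fileSizes.entrySet()) {
            int size = entry.getValue();
            // Close the current chunk if this file would push it past the limit.
            if (valueBytes > 0 && valueBytes + size > chunkSizeBytes) {
                chunks.add(valueBytes + keyBytes);
                valueBytes = 0;
                keyBytes = 0;
            }
            String key = prefix + entry.getKey();
            keyBytes += key.getBytes(StandardCharsets.UTF_8).length;
            valueBytes += size;
        }
        if (valueBytes > 0) {
            chunks.add(valueBytes + keyBytes);
        }
        return chunks;
    }

    public static void main(String[] args) {
        Map<String, Integer> files = new LinkedHashMap<>();
        files.put("/docs/a.txt", 40);
        files.put("/docs/b.txt", 50);
        files.put("/docs/c.txt", 30);
        // Limit of 100 value bytes; each key is "p" + an 11-character path.
        System.out.println(chunkSizes(files, "p", 100)); // prints [114, 42]
    }
}
```

In this sketch the first chunk holds 90 bytes of content plus 24 bytes of keys,
so it lands at 114 bytes against a limit of 100.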
On another note, what is the best way to kick off this job?
1. add another command to the mahout bash script prop file?
2. add a flag to the existing "/bin/mahout seqdirectory" setup that would
kick off the MR job instead of the serial process, something like:
/bin/mahout seqdirectory -mr [ more options ]
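For option 2, the dispatch could look roughly like the sketch below. This is
only an illustration of the idea: the "-mr" flag is the proposal above, not an
existing option, and the class and method names are hypothetical placeholders
rather than Mahout code.

```java
public class SeqDirectoryDispatchSketch {

    enum Mode { SEQUENTIAL, MAPREDUCE }

    /** Picks the execution mode: MapReduce if "-mr" is present, else serial. */
    public static Mode chooseMode(String[] args) {
        for (String arg : args) {
            if ("-mr".equals(arg)) {
                return Mode.MAPREDUCE;
            }
        }
        return Mode.SEQUENTIAL;
    }

    public static void main(String[] args) {
        if (chooseMode(args) == Mode.MAPREDUCE) {
            // Placeholder: this is where the MR job would be submitted.
            System.out.println("would launch the MapReduce job");
        } else {
            // Placeholder: this is where the existing serial process would run.
            System.out.println("would run the serial process");
        }
    }
}
```

Keeping the serial process as the default means existing invocations would
behave as before unless the flag is passed.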
JP
> Make conversion to sequence files map-reduce
> --------------------------------------------
>
> Key: MAHOUT-833
> URL: https://issues.apache.org/jira/browse/MAHOUT-833
> Project: Mahout
> Issue Type: Improvement
> Components: Integration
> Affects Versions: 0.5
> Reporter: Grant Ingersoll
> Labels: MAHOUT_INTRO_CONTRIBUTE
>
> Given input that is on HDFS, the SequenceFilesFrom****.java classes should be
> able to do their work in parallel.