Hi Joe,

Sure, I'd be very happy to take a look. If you like, we could continue to work on this together, depending on how much time you have available. Even just bouncing ideas off each other could be interesting.

On Dec 20, 2011, at 2:33 PM, "Joe Prasanna Kumar (Commented) (JIRA)" <[email protected]> wrote:

>
> [ https://issues.apache.org/jira/browse/MAHOUT-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173573#comment-13173573 ]
>
> Joe Prasanna Kumar commented on MAHOUT-833:
> -------------------------------------------
>
> Raphael,
>
> For SequenceFilesFromMailArchives, I started writing a new InputFormat and
> RecordReader that would parse each of the mail messages and output them into
> a Sequence File. It is in a very raw state and since I am travelling for
> work, i wont be able to do much with it for a month. so if you have already
> started on it, please feel free to upload a patch.. If you are interested in
> looking at what i've written so far, let me know and i'll cleanup a bit, add
> some comments and email you or something.
>
> reg
> Joe.
>
>> Make conversion to sequence files map-reduce
>> --------------------------------------------
>>
>> Key: MAHOUT-833
>> URL: https://issues.apache.org/jira/browse/MAHOUT-833
>> Project: Mahout
>> Issue Type: Improvement
>> Components: Integration
>> Affects Versions: 0.5
>> Reporter: Grant Ingersoll
>> Labels: MAHOUT_INTRO_CONTRIBUTE
>>
>> Given input that is on HDFS, the SequenceFilesFrom****.java classes should
>> be able to do their work in parallel.
