[ https://issues.apache.org/jira/browse/MAHOUT-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677237#comment-13677237 ]
Grant Ingersoll commented on MAHOUT-944:
----------------------------------------

Hmm, I wonder if I should have squashed my local commits:
{quote}
Committed r1490329
W: 0a28b0f322ffe888553b9e2adf0b6f098b679f16 and refs/remotes/origin/trunk differ, using rebase:
:040000 040000 779e2a48da78d2f59f994c83eb1cb91a42b04d41 6e8221954eecd7ee27788976dc7b2665985cd7e6 M integration
:100644 100644 492aa3aacbee4e33fb70a2e361d772a9d881ae04 09c5ae712a035af3eef2c3c56db708b8fa75e1b3 M pom.xml
:040000 040000 39350289431946a74a7bd15fbf72947261055536 c7274b40f5de032b1668ed9d6f2d1fa24ff0a124 M src
Current branch MAHOUT-944 is up to date.
# of revisions changed
  before:
 d668ddf606dbb0d046f0fe8e3eb97e06fcd4c406
 9eafd07120a1810d778dfeb4502ba36b5b3eacfe
 253a58c30d0a22150234975f782720248b51a8cb

  after:
 0a28b0f322ffe888553b9e2adf0b6f098b679f16
 d668ddf606dbb0d046f0fe8e3eb97e06fcd4c406
 9eafd07120a1810d778dfeb4502ba36b5b3eacfe
 253a58c30d0a22150234975f782720248b51a8cb

If you are attempting to commit merges, try running:
  git rebase --interactive --preserve-merges refs/remotes/origin/trunk
Before dcommitting
{quote}

> LuceneIndexToSequenceFiles (lucene2seq) utility
> -----------------------------------------------
>
>                 Key: MAHOUT-944
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-944
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Integration
>    Affects Versions: 0.5
>            Reporter: Frank Scholten
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 0.8
>
>         Attachments: MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch,
> MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch,
> MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch,
> MAHOUT-944.patch
>
>
> Here is a lucene2seq tool I used in a project. It creates sequence files
> based on the stored fields of a Lucene index.
> The output from this tool can then be fed into seq2sparse, and from there you
> can do text clustering.
> Comes with Java bean configuration.
> Let me know what you think.
> Some CLI code can be added later on. I used this
> for a small-scale project of roughly 100,000 docs. Is an MR version useful, or is that
> overkill?
> See https://github.com/frankscholten/mahout/tree/lucene2seq for commits and
> review comments from Simon Willnauer (Thanks Simon!)
> or the attached patch.