[ 
https://issues.apache.org/jira/browse/MAHOUT-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677237#comment-13677237
 ] 

Grant Ingersoll commented on MAHOUT-944:
----------------------------------------

Hmm, I wonder if I should have squashed my local commits:
{quote}
Committed r1490329
W: 0a28b0f322ffe888553b9e2adf0b6f098b679f16 and refs/remotes/origin/trunk differ, using rebase:
:040000 040000 779e2a48da78d2f59f994c83eb1cb91a42b04d41 6e8221954eecd7ee27788976dc7b2665985cd7e6 M      integration
:100644 100644 492aa3aacbee4e33fb70a2e361d772a9d881ae04 09c5ae712a035af3eef2c3c56db708b8fa75e1b3 M      pom.xml
:040000 040000 39350289431946a74a7bd15fbf72947261055536 c7274b40f5de032b1668ed9d6f2d1fa24ff0a124 M      src
Current branch MAHOUT-944 is up to date.
# of revisions changed  
before:
 d668ddf606dbb0d046f0fe8e3eb97e06fcd4c406
9eafd07120a1810d778dfeb4502ba36b5b3eacfe
253a58c30d0a22150234975f782720248b51a8cb 

after:
 0a28b0f322ffe888553b9e2adf0b6f098b679f16
d668ddf606dbb0d046f0fe8e3eb97e06fcd4c406
9eafd07120a1810d778dfeb4502ba36b5b3eacfe
253a58c30d0a22150234975f782720248b51a8cb 
 If you are attempting to commit  merges, try running:
         git rebase --interactive --preserve-merges  refs/remotes/origin/trunk 
Before dcommitting
{quote}
                
> LuceneIndexToSequenceFiles (lucene2seq) utility
> -----------------------------------------------
>
>                 Key: MAHOUT-944
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-944
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Integration
>    Affects Versions: 0.5
>            Reporter: Frank Scholten
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 0.8
>
>         Attachments: MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, 
> MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, 
> MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, 
> MAHOUT-944.patch
>
>
> Here is a lucene2seq tool I used in a project. It creates sequence files 
> based on the stored fields of a Lucene index.
> The output from this tool can then be fed into seq2sparse, and from there you 
> can do text clustering.
> It comes with Java bean configuration.
> Let me know what you think. Some CLI code can be added later on. I used this 
> for a small-scale project of roughly 100,000 docs. Is an MR (MapReduce) 
> version useful, or is that overkill?
> See https://github.com/frankscholten/mahout/tree/lucene2seq for commits and 
> review comments from Simon Willnauer (thanks, Simon!),
> or the attached patch.
