Re: [jira] [Commented] (MAHOUT-944) LuceneIndexToSequenceFiles (lucene2seq) utility

Grant Ingersoll Thu, 06 Jun 2013 21:13:35 -0700

Sorry, didn't see this until I had already fixed it.

On Jun 7, 2013, at 5:59 AM, "Suneel Marthi (JIRA)" <j...@apache.org> wrote:


> 
>    [ 
> https://issues.apache.org/jira/browse/MAHOUT-944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13677794#comment-13677794
>  ] 
> 
> Suneel Marthi commented on MAHOUT-944:
> --------------------------------------
> 
> I'll take care of the Version thing, have a JIRA M-1244 open for that.
> 
>> LuceneIndexToSequenceFiles (lucene2seq) utility
>> -----------------------------------------------
>> 
>>                Key: MAHOUT-944
>>                URL: https://issues.apache.org/jira/browse/MAHOUT-944
>>            Project: Mahout
>>         Issue Type: New Feature
>>         Components: Integration
>>   Affects Versions: 0.5
>>           Reporter: Frank Scholten
>>           Assignee: Grant Ingersoll
>>           Priority: Minor
>>            Fix For: 0.8
>> 
>>        Attachments: MAHOUT-944-minor.patch, MAHOUT-944.patch, 
>> MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, 
>> MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch, 
>> MAHOUT-944.patch, MAHOUT-944.patch, MAHOUT-944.patch
>> 
>> 
>> Here is a lucene2seq tool I used in a project. It creates sequence files 
>> based on the stored fields of a Lucene index.
>> The output from this tool can then be fed into seq2sparse, and from there you 
>> can do text clustering.
>> Comes with Java bean configuration.
>> Let me know what you think. Some CLI code can be added later on. I used this 
>> for a small-scale project of roughly 100,000 docs. Is an MR version useful or is that 
>> overkill?
>> See https://github.com/frankscholten/mahout/tree/lucene2seq for commits and 
>> review comments from Simon Willnauer (Thanks Simon!)
>> or the attached patch.
> 
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA administrators
> For more information on JIRA, see: http://www.atlassian.com/software/jira

--------------------------------------------
Grant Ingersoll | @gsingers
http://www.lucidworks.com
