Please file a jira for this. Thanks again.
> On Apr 18, 2014, at 10:34 PM, Terry Blankers <te...@amritanet.com> wrote:
>
> Hi Frank,
>
> In working with a small test index, if I change the 'body' field to indexed
> it indeed does work as expected. It would be great if lucene2seq could be
> fixed to read un-indexed stored fields as per design as I need to query
> various corpora where I don't have control over the schema. Is there anything
> else I can do at this point?
>
> Thanks,
>
> Terry
>
>
>> On 4/16/14, 1:52 PM, Frank Scholten wrote:
>> Hi Terry,
>>
>> What happens when you make the 'body' field indexed in your schema?
>>
>> LuceneIndexHelper checks the field using an IndexSearcher so it might be
>> that the field has to be indexed as well as being stored, which would be a
>> bug because lucene2seq is designed to load stored fields.
>>
>> Cheers,
>>
>> Frank
>>
>>
>>> On Fri, Apr 11, 2014 at 5:33 AM, Terry Blankers <te...@amritanet.com> wrote:
>>>
>>> Hi All, I'm very new to trying to use lucene2seq so I'm not sure if it's
>>> just user error, but I'm experiencing some unexpected behavior when running
>>> lucene2seq against my solr index (4.7.1). I've tried using both 0.9 and the
>>> trunk build of mahout. (And BTW, I have been able to successfully run the
>>> Reuters example as a test baseline.)
>>>
>>>
>>> Here's the command I'm running:
>>>
>>> $MAHOUT_HOME/bin/mahout lucene2seq -i
>>> /home/ec2-user/solr/solr-data/solrindex/index -o solr/sequence -id
>>> key_sha1hex -f body -xm sequential -q topics:diabetes -n 500
>>>
>>>
>>> Excerpts from my solr schema:
>>>
>>> <field name="content" type="text" stored="false" indexed="true" multiValued="true"/>
>>> <field name="body" type="string" stored="true" indexed="false"/>
>>>
>>> <!-- Use the indexed/un-stored "content" field for searching -->
>>> <copyField source="body" dest="content" />
>>> <!-- field for the QueryParser to use when an explicit fieldname is absent -->
>>> <defaultSearchField>content</defaultSearchField>
>>>
>>>
>>>
>>> When I use SolrAdmin and specify fl=body the search handler returns the
>>> 'body' field with data as expected. Yet I get the following error when
>>> running lucene2seq and specifying '-f body':
>>>
>>> IllegalArgumentException: Field 'body' does not exist in the index
>>>
>>>
>>>
>>> And if I specify '-f content', lucene2seq runs without errors or warnings,
>>> but seqdumper output shows no values for any key:
>>>
>>> Key class: class org.apache.hadoop.io.Text Value Class: class org.apache.hadoop.io.Text
>>> Key: 96C4C76CF9D7449C724CA77CB8F650EAFD33E31C: Value:
>>> Key: D6842B81B8D09733B50BEDB4767C2A5C49E43B20: Value:
>>> Key: 61CB95FEE2C6BF0AC6E8A1F7738338CA36F42264: Value:
>>> Key: 0F9903B72A7C9F0373A5171403B3AAEB291B16E1: Value:
>>>
>>>
>>> Can anyone give me any suggestions as to how to track down what might be
>>> happening here?
>>>
>>> Many thanks,
>>>
>>> Terry
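
Both symptoms reported in this thread are consistent with Frank's guess that the field check only consults the inverted index. The toy Python model below sketches that assumption. It is not Lucene or Mahout code: ToyIndex, field_has_terms and stored_value are hypothetical names invented for the illustration, and the sample text is made up. A stored-only field fails the index-based existence check, while an indexed-only field passes it but has no stored value to return.

```python
from dataclasses import dataclass, field


@dataclass
class ToyIndex:
    """Toy stand-in for a search index; NOT the Lucene or Mahout API."""
    stored: dict = field(default_factory=dict)   # doc_id -> {field name: value}
    indexed: dict = field(default_factory=dict)  # field name -> {term: doc ids}

    def add(self, doc_id, name, value, *, stored, indexed):
        # A stored field keeps the original value for later retrieval.
        if stored:
            self.stored.setdefault(doc_id, {})[name] = value
        # An indexed field only contributes terms to the inverted index.
        if indexed:
            terms = self.indexed.setdefault(name, {})
            for term in value.lower().split():
                terms.setdefault(term, set()).add(doc_id)

    def field_has_terms(self, name):
        # Assumed behaviour: an existence check that looks only at indexed terms.
        return bool(self.indexed.get(name))

    def stored_value(self, doc_id, name):
        # Returns an empty string when nothing was stored for this field.
        return self.stored.get(doc_id, {}).get(name, "")


index = ToyIndex()
text = "diabetes management overview"  # made-up sample text
# Mirrors the schema in the thread: 'body' is stored only, 'content' is indexed only.
index.add("doc1", "body", text, stored=True, indexed=False)
index.add("doc1", "content", text, stored=False, indexed=True)

body_found = index.field_has_terms("body")
content_found = index.field_has_terms("content")
body_value = index.stored_value("doc1", "body")
content_value = index.stored_value("doc1", "content")
print(body_found, content_found, repr(body_value), repr(content_value))
```

In this model 'body' is reported as missing even though its value can be retrieved, and 'content' is found but yields an empty value, which matches the "does not exist in the index" error and the empty seqdumper values respectively.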