I ran one of the :data fields through the StandardAnalyzer - the only
analyzer we have used - and it tokenized the contents with no complaints.
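
For reference, this is roughly what that check looks like - just a sketch,
with a placeholder index path and doc id:

  require 'rubygems'
  require 'ferret'

  # Pull the stored :data contents back out of the index and re-run them
  # through the same StandardAnalyzer we use at indexing time.
  index    = Ferret::Index::Index.new(:path => '/path/to/our/index')
  doc      = index[0]   # one of the affected documents
  analyzer = Ferret::Analysis::StandardAnalyzer.new

  stream = analyzer.token_stream(:data, doc[:data])
  while token = stream.next
    puts "#{token.text} (#{token.start}..#{token.end})"
  end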

Interestingly, the last batch of 1700 sites that we added
incrementally to our index does not seem to suffer from this problem.
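
Per your suggestion below, the next thing I'll try is re-indexing the
stored contents of one of these docs into a throwaway index and looking at
the term vector there. Something along these lines - again only a sketch
with placeholder paths, and assuming the default field settings keep term
vectors with positions and offsets:

  require 'rubygems'
  require 'ferret'

  src     = Ferret::Index::Index.new(:path => '/path/to/our/index')
  scratch = Ferret::Index::Index.new(:path => '/tmp/ferret_scratch')

  doc = src[0]                       # one of the affected documents
  scratch << {:data => doc[:data]}   # re-index just the stored :data contents
  scratch.flush

  # If the analyzer produced terms, they should show up here; an empty
  # terms array would reproduce the problem outside the live index.
  p scratch.reader.term_vector(0, :data)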



On 6/13/07, Jens Kraemer <[EMAIL PROTECTED]> wrote:
> On Wed, Jun 13, 2007 at 08:58:36AM -0400, Richard Jones wrote:
> > According to my IndexReader's field_infos, all the fields are stored
> > and indexed, with :with_positions_offsets for the term_vectors.
> >
> > A look at a term vector for one of these :data fields gives:
> >
> > #<struct Ferret::Index::TermVector field=:data, terms=[], offsets=nil>
> >
> > Is this what they look like when you index with :index=>no?
>
> No - with :index => :no, no term vectors can be stored, and term_vector
> then returns nil, not an empty term vector.
>
> The scenario you describe could happen if your analyzer choked at indexing
> time and returned no terms at all for your document (just as if the doc
> consisted entirely of stop words).
>
> Since you have the stored contents, could you try to index that data
> again and see if the problem can be reproduced?
>
> Jens
>
>
>
> --
> Jens Krämer
> webit! Gesellschaft für neue Medien mbH
> Schnorrstraße 76 | 01069 Dresden
> Telefon +49 351 46766-0 | Telefax +49 351 46766-66
> [EMAIL PROTECTED] | www.webit.de
>
> Amtsgericht Dresden | HRB 15422
> GF Sven Haubold, Hagen Malessa
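
PS: for completeness, pulling the field_infos and the term vector back out
of the reader looks roughly like this (placeholder path and doc id):

  require 'rubygems'
  require 'ferret'

  reader = Ferret::Index::Index.new(:path => '/path/to/our/index').reader

  p reader.field_infos[:data]      # how :data is stored/indexed, and its term_vector setting
  p reader.term_vector(0, :data)   # gives the empty TermVector struct quoted above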


-- 
Richard Jones
[EMAIL PROTECTED]
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk
