BTW Erick this works brilliantly with UN_TOKENIZED. SUPER fast :)
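
For anyone reading the archive later, here is a rough sketch of why the lookup only started returning documents once the id field was indexed. It is a toy model in plain Java, not Lucene code: ToyIndex, addDocument, termDocs and storedValue are made-up names for illustration, and the boolean flag only stands in for Field.Index.UN_TOKENIZED versus Field.Index.NO. The assumption it illustrates is the one from the thread: a term lookup consults the postings, and a stored-only field never gets a postings entry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: this is NOT Lucene's API. All names are hypothetical.
class ToyIndex {
    // Stored values: docId -> (field -> value). Stands in for Field.Store.YES.
    private final List<Map<String, String>> stored = new ArrayList<>();
    // Postings: "field:value" -> doc ids. Only indexed fields get an entry.
    private final Map<String, List<Integer>> postings = new TreeMap<>();

    // Adds a one-field document. 'indexed' stands in for
    // UN_TOKENIZED (true) versus Field.Index.NO (false).
    int addDocument(String field, String value, boolean indexed) {
        int docId = stored.size();
        Map<String, String> doc = new HashMap<>();
        doc.put(field, value);
        stored.add(doc);
        if (indexed) {
            postings.computeIfAbsent(field + ":" + value, k -> new ArrayList<>())
                    .add(docId);
        }
        return docId;
    }

    // Stands in for seeking a TermDocs to a term: it reads postings only,
    // so a stored-but-unindexed value is invisible here.
    List<Integer> termDocs(String field, String value) {
        return postings.getOrDefault(field + ":" + value, new ArrayList<>());
    }

    // Stored values remain retrievable by doc id either way.
    String storedValue(int docId, String field) {
        return stored.get(docId).get(field);
    }

    public static void main(String[] args) {
        ToyIndex idx = new ToyIndex();
        int a = idx.addDocument("id", "42", false); // like Field.Index.NO
        idx.addDocument("id", "43", true);          // like UN_TOKENIZED
        System.out.println(idx.termDocs("id", "42")); // prints []
        System.out.println(idx.termDocs("id", "43")); // prints [1]
        System.out.println(idx.storedValue(a, "id")); // prints 42
    }
}
```

The per-id lookup is one map access per key, which is roughly why seeking straight to each id felt so much faster than walking every term in the field.
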
On 2/25/07, Erick Erickson <[EMAIL PROTECTED]> wrote:
Yes, I'm pretty sure you have to index the field (UN_TOKENIZED) to be able
to fetch it with TermDocs/TermEnum!

The loop I posted works like this:

for each term in the index for the field
    if this is one I want to update
        use a TermDocs to get to that document and operate on it.

But this is actually pretty silly. Your loop uses a better approach, except
you're not using TermDocs correctly. Try

TermDocs tDocs = reader.termDocs();
for (Business biz : updates) {
    Term t = new Term("id", biz.getId());
    tDocs.seek(t);
    while (tDocs.next()) {
        Document doc = reader.document(tDocs.doc());
    }
}

But TermDocs/TermEnum is looking at terms in the index. If you haven't
indexed the term, you won't find it, so your Field.Index.NO is really
hurting you here.

Best
Erick

On 2/24/07, no spam <[EMAIL PROTECTED]> wrote:
>
> I didn't fully understand your last post and why I wanted to do
> IndexReader.terms() then IndexReader.termDocs(). Won't something like
> this work?
>
> for (Business biz : updates)
> {
>     Term t = new Term("id", biz.getId()+"");
>     TermDocs tDocs = reader.termDocs(t);
>
>     while (tDocs.next())
>     {
>         Document doc = reader.document(tDocs.doc());
>     }
> }
>
> But tDocs never contains any docs. Is this because I've indexed my pk
> like this:
>
> doc.add(new Field("id", biz.getId(), Field.Store.YES, Field.Index.NO));
>
> instead of
>
> doc.add(new Field("id", biz.getId(), Field.Store.YES,
>         Field.Index.UN_TOKENIZED));
>
> Mark
>
> On 2/21/07, Erick Erickson <[EMAIL PROTECTED]> wrote:
> >
> > I think you can get MUCH better efficiency by using TermEnum/TermDocs.
> > But I think you need to index (UN_TOKENIZED) your primary key (although
> > now I'm not sure. But I'd be surprised if TermEnum worked with
> > un-indexed data. Still, it'd be worth trying, but I've always assumed
> > that TermEnums only worked on indexed fields...).
> >
> > Anyway, your loop looks more like this...
> >
> > TermEnum terms = reader.terms(new Term("primarykey", ""));
> > TermDocs tDocs = reader.termDocs();
> >
> > while (terms.next()) {
> >     if (docsToUpdate.contains(terms.term().text())) {
> >         tDocs.seek(terms.term());
> >         if (tDocs.next()) {
> >             // pseudocode: re-index the document with this doc number
> >             writer.updateDocument(tDocs.doc());
> >         }
> >     }
> > }
> >
> > NOTE: I've been fast and loose with edge conditions, like ensuring that
> > while (terms.next()) doesn't skip the first term, so caveat emptor....
> > This loop also assumes that there is one and only one document in your
> > index with the primary key. Otherwise, you have to do some more work
> > with the TermDocs class to process each document that has your primary
> > key...
> >
> > This is similar to creating Lucene filters, which is very fast....
> >
> > Hope this helps
> > Erick