[jira] Commented: (LUCENE-1278) Add optional storing of document numbers in term dictionary

Jason Rutherglen (JIRA) Mon, 05 May 2008 06:04:24 -0700

    [ https://issues.apache.org/jira/browse/LUCENE-1278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594231#action_12594231 ]


Jason Rutherglen commented on LUCENE-1278:
------------------------------------------

Storing the docs is off by default, so the index grows only if the user 
opts in.  The docs are written as a byte blob, which lets the reader skip 
them without reading them when loaddocs is false.  Field cache and range query 
loading are very slow today because of the dual seeks per term (first for the 
termenum, then for the termdocs).  If the docs were stored in a separate file 
instead, the terms would have to be stored redundantly.  

A field cache example:

protected Object createValue(IndexReader reader, Object entryKey)
    throws IOException {
  Entry entry = (Entry) entryKey;
  String field = entry.field;
  IntParser parser = (IntParser) entry.custom;
  final int[] retArray = new int[reader.maxDoc()];
  // Old approach, kept commented out for comparison:
  // TermDocs termDocs = reader.termDocs();
  // TermEnum termEnum = reader.terms(new Term(field, ""));
  // New approach from the patch: second argument true = load docs.
  TermEnum termEnum = reader.terms(new Term(field, ""), true);
  try {
    do {
      Term term = termEnum.term();
      // Stop once the enumeration leaves this field.
      if (term == null || term.field() != field) break;
      int termval = parser.parseInt(term.text());
      // Doc numbers come straight from the term dictionary entry.
      int[] docs = termEnum.docs();
      for (int x = 0; x < docs.length; x++) {
        retArray[docs[x]] = termval;
      }
      // Old approach: one extra seek per term.
      // termDocs.seek(termEnum);
      // while (termDocs.next()) {
      //   retArray[termDocs.doc()] = termval;
      // }
    } while (termEnum.next());
  } finally {
    // termDocs.close();
    termEnum.close();
  }
  return retArray;
}
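
To make the "dual seeks per term" point concrete, here is a toy sketch that does not use Lucene at all. It models the index as an in-memory map and simply counts simulated file accesses while filling the same array as the field cache example above. The class name, method names, sample data, and the one-access-per-file cost model are all assumptions made up for illustration; real I/O behavior depends on the index files and the OS cache.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model, NOT Lucene code: it only counts simulated file accesses.
class SeekCountSketch {
    // Simulated accesses performed by the most recent fill call.
    static int seeks;

    // Hypothetical index: term text -> document numbers containing the term.
    static Map<String, int[]> sampleIndex() {
        Map<String, int[]> index = new LinkedHashMap<>();
        index.put("1", new int[] {0, 3});
        index.put("2", new int[] {1});
        index.put("7", new int[] {2, 4});
        return index;
    }

    // Current pattern: each term costs one access to the term dictionary
    // and one more to the separate postings (termenum, then termdocs).
    static int[] fillWithSeparatePostings(Map<String, int[]> index, int maxDoc) {
        int[] values = new int[maxDoc];
        seeks = 0;
        for (Map.Entry<String, int[]> e : index.entrySet()) {
            seeks++; // read the term from the dictionary
            seeks++; // seek into the postings for that term
            int termval = Integer.parseInt(e.getKey());
            for (int doc : e.getValue()) {
                values[doc] = termval;
            }
        }
        return values;
    }

    // Proposed pattern: the doc numbers sit next to the term, so reading
    // the term entry is the only access needed.
    static int[] fillWithInlineDocs(Map<String, int[]> index, int maxDoc) {
        int[] values = new int[maxDoc];
        seeks = 0;
        for (Map.Entry<String, int[]> e : index.entrySet()) {
            seeks++; // read the term together with its doc numbers
            int termval = Integer.parseInt(e.getKey());
            for (int doc : e.getValue()) {
                values[doc] = termval;
            }
        }
        return values;
    }
}
```

Both methods produce the same array. Under this simplified cost model the inline variant performs half as many accesses, which is the saving the patch is aiming for.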

> Add optional storing of document numbers in term dictionary
> -----------------------------------------------------------
>
>                 Key: LUCENE-1278
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1278
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index
>    Affects Versions: 2.3.1
>            Reporter: Jason Rutherglen
>            Priority: Minor
>         Attachments: lucene.1278.5.4.2008.patch, 
> lucene.1278.5.5.2008.2.patch, lucene.1278.5.5.2008.patch
>
>
> Add optional storing of document numbers in term dictionary.  String index 
> field cache and range filter creation will be faster.  
> Example read code:
> {noformat}
> TermEnum termEnum = indexReader.terms(TermEnum.LOAD_DOCS);
> do {
>   Term term = termEnum.term();
>   if (term == null || term.field() != field) break;
>   int[] docs = termEnum.docs();
> } while (termEnum.next());
> {noformat}
> Example write code:
> {noformat}
> Document document = new Document();
> document.add(new Field("tag", "dog", Field.Store.YES, 
> Field.Index.UN_TOKENIZED, Field.Term.STORE_DOCS));
> indexWriter.addDocument(document);
> {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.