I'm sending a snippet of code that shows how to reconstruct UNSTORED fields.
It has two parts:
DB + terms

           Class.forName("org.postgresql.Driver").newInstance();
           con = DriverManager.getConnection("jdbc:postgresql:lucene", "lucene", "lucene");
           PreparedStatement psCompany = con.prepareStatement("INSERT INTO term (name,value,doc,pos) VALUES ('company',?,?,?)");
           Directory sourceDir = FSDirectory.getDirectory(source,false);
           Directory targetDir = FSDirectory.getDirectory(target,true);
           Analyzer analyzer = (Analyzer)analyzers.get(lang);

           if(!IndexReader.indexExists(sourceDir))
           {
               System.err.println("Source index doesn't exist at the specified path");
               return;
           }
           ir = IndexReader.open(sourceDir);
           iw = new IndexWriter(targetDir,analyzer,true);
           int numdocs = ir.numDocs();
           TermEnum terms = ir.terms();

           String fnCompany = "company".intern();
           while(terms.next())
           {
               Term t = terms.term();
               if(fnCompany==t.field())
               {
                   psCompany.setString(1, t.text());
                   TermPositions tp = ir.termPositions(t);
                   // loop until the enumeration is exhausted; docFreq(t) may
                   // still count deleted documents that tp.next() skips
                   while(tp.next())
                   {
                       int docId = tp.doc();
                       for(int j=0,len=tp.freq();j<len;j++)
                       {
                           int pos = tp.nextPosition();
                           psCompany.setInt(2, docId);
                           psCompany.setInt(3, pos);
                           psCompany.executeUpdate();
                       }
                   }
                   tp.close();
               }
           }

For indexing you need, I suppose, the length of the field (max(pos) + maxPosTerm.length()). The field is reconstructed with
select pos,value,length(value) from term where name=? and doc=? order by pos asc;
and the (analyzer) TokenStream is just a wrapper around the ResultSet.
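
As a rough illustration of the reconstruction step, here is a minimal sketch that rebuilds a field's text from rows already fetched by the query above. TermRow and FieldRebuilder are hypothetical names, not Lucene or JDBC classes, and the "_" placeholder for a gap in positions is just an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for one row of the "term" table (pos, value).
class TermRow {
    final int pos;
    final String value;
    TermRow(int pos, String value) { this.pos = pos; this.value = value; }
}

class FieldRebuilder {
    // Rows must be sorted by pos ascending, as the ORDER BY in the query does.
    static String rebuild(List<TermRow> rows) {
        StringBuilder sb = new StringBuilder();
        int expected = 0; // next position we expect to see
        for (TermRow r : rows) {
            // fill any gap in positions with a placeholder token
            while (expected < r.pos) {
                if (sb.length() > 0) sb.append(' ');
                sb.append('_');
                expected++;
            }
            if (sb.length() > 0) sb.append(' ');
            sb.append(r.value);
            // several terms may share one position; advance past it only once
            expected = Math.max(expected, r.pos + 1);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<TermRow> rows = new ArrayList<TermRow>();
        rows.add(new TermRow(0, "acme"));
        rows.add(new TermRow(2, "ltd"));
        System.out.println(rebuild(rows)); // prints "acme _ ltd"
    }
}
```

The rebuilt string contains only the analyzed terms, so it can be fed back to the IndexWriter but is not the original text.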


If you would like to check whether the table is correct, then run
select t.value,count(*) as co from (select distinct doc,value from term) as t group by t.value order by co desc;
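
The check query counts, for each term value, the number of distinct documents that contain it. The same count can be sketched in memory as below; TermTableCheck is a hypothetical name, and the idea of comparing the result against IndexReader.docFreq for the same term assumes the source index has no deleted documents.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

class TermTableCheck {
    // docs[i] and values[i] together form one row of the "term" table.
    static Map<String, Integer> docCounts(int[] docs, String[] values) {
        // collect the distinct documents seen for each term value
        Map<String, Set<Integer>> seen = new TreeMap<String, Set<Integer>>();
        for (int i = 0; i < docs.length; i++) {
            Set<Integer> s = seen.get(values[i]);
            if (s == null) {
                s = new HashSet<Integer>();
                seen.put(values[i], s);
            }
            s.add(docs[i]);
        }
        // reduce each set to its size: the per-term document count
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (Map.Entry<String, Set<Integer>> e : seen.entrySet()) {
            counts.put(e.getKey(), e.getValue().size());
        }
        return counts;
    }
}
```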

Issues: you could store the reconstructed text, but it is not the original text (only the analyzed terms survive). It just clones an UN_STORED field well.

PS: PostgreSQL with a proper PK is pretty fast.
