Wow, very cool, even though I don't use Oracle anywhere at the moment. You probably don't want that rowid field tokenized, by the way.
Otis ----- Original Message ---- From: Marcelo Ochoa <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Wednesday, November 22, 2006 8:44:58 AM Subject: Re: Oracle and Lucene Integration Hi Mark: > Very interesting. > > So how does this solution manage mapping Oracle primary keys to and from > Lucene doc ids? I am storing the rowid value as a Document field, here a code sniped Document doc = new Document(); doc.add(new Field("rowid", rowid, Field.Store.YES, Field.Index.TOKENIZED)); Object value = rs.getObject(2); String valueStr = null; if (value!=null) { // Sanity checks if (value instanceof CLOB) valueStr = ((CLOB)value).getSubString(1,(int)((CLOB)value).length()); else if (value instanceof XMLType) valueStr = ((XMLType)value).extract("//text()","").getStringVal(); else valueStr = value.toString(); doc.add(new Field(col,valueStr,Field.Store.NO,Field.Index.TOKENIZED)); writer.addDocument(doc); So when I am querying I can get the rowid back using: if (iterator.hasNext()) { // append rowid to collection Hit hit = (Hit) iterator.next(); try { rid = hit.get("rowid"); score = hit.getScore(); } catch (IOException e) { e.printStackTrace(); throw new SQLException(e.getMessage()); } rlist[i] = new String(rid); slist.put(rid,new Float(score)); idx++; } else {............. and passing the rowid to the Oracle execution plan which is collecting in bacth of 2000 rowids. > > >> Another benefits of using the Data Cartridge API is that if the > >>table T1 has insert, update or delete rows operations a corresponding > >>Java method will be called to automatically update the Lucene Index. > > I suspect the tricky bit is optimizing the opening/closing of Lucene > IndexReaders/Writers especially in the event of large batches of database > updates. > Does this API pass the transactional info which would help organize the > batching of the Lucene reader.delete and writer.add calls? Well, I think that Oracle Text uses a Queue to store large batches, because it use a ctx_sys.sync procedure to update the index ;) We can make the same solution using Oracle AQ. > > Cheers > Mark Best regards, Marcelo. -- Marcelo F. Ochoa http://marcelo.ochoa.googlepages.com/home ______________ Do you Know DBPrism? Look @ DB Prism's Web Site http://www.dbprism.com.ar/index.html More info? Chapter 17 of the book "Programming the Oracle Database using Java & Web Services" http://www.amazon.com/gp/product/1555583296/ Chapter 21 of the book "Professional XML Databases" - Wrox Press http://www.amazon.com/gp/product/1861003587/ Chapter 8 of the book "Oracle & Open Source" - O'Reilly http://www.oreilly.com/catalog/oracleopen/ --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]