Hi John:
  Did you test/know Lucene Domain Index for Oracle database?
http://marceloochoa.blogspot.com/2007/09/running-lucene-inside-your-oracle-jvm.html
  If you are using Oracle 10g/11g is completed integrated in Oracle
memory space like Oracle Text but based in  Lucene.
  No network round trip is involved during indexing/querying time and
Lucene store is replaced by BLOB database storage.
  Also you can query your Oracle text store direct by SQL and two new
operators, lcontains and lscore based and Lucene and directly
integrated with the Oracle execution plan.
  Best regards, Marcelo.

On Mon, Jul 28, 2008 at 10:53 PM, John Evans <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> I have successfully used Lucene in the "tradtiional" way to provide
> full-text search for various websites.  Now I am tasked with developing a
> data-store to back a web crawler.  The crawler can be configured to retrieve
> arbitrary fields from arbitrary pages, so the result is that each document
> may have a random assortment of fields.  It seems like Lucene may be a
> natural fit for this scenario since you can obviously add arbitrary fields
> to each document and you can store the actually data in the database. I've
> done some research to make sure that it would meet all of our individual
> requirements (that we can iterate over documents, update (delete/replace)
> documents, etc.) and everything looks good.  I've also seen a couple of
> references around the net to other people trying similar things... however,
> I know it's not meant to be used this way, so I thought I would post here
> and ask for guidance?  Has anyone done something similar?  Is there any
> specific reason to think this is a bad idea?
>
> The one thing that I am least certain about his how well it will scale.  We
> may reach the point where we have tens of millions of documents and a high
> percentage of those documents may be relatively large (10k-50k each).  We
> actually would NOT be expecting/needing Lucene's normal extreme fast text
> search times for this, but we would need reasonable times for adding new
> documents to the index, retrieving documents by ID (for iterating over all
> documents), optimizing the index after a series of changes, etc.
>
> Any advice/input/theories anyone can contribute would be greatly
> appreciated.
>
> Thanks,
> -
> John
>



-- 
Marcelo F. Ochoa
http://marceloochoa.blogspot.com/
http://marcelo.ochoa.googlepages.com/home
______________
Do you Know DBPrism? Look @ DB Prism's Web Site
http://www.dbprism.com.ar/index.html
More info?
Chapter 17 of the book "Programming the Oracle Database using Java &
Web Services"
http://www.amazon.com/gp/product/1555583296/
Chapter 21 of the book "Professional XML Databases" - Wrox Press
http://www.amazon.com/gp/product/1861003587/
Chapter 8 of the book "Oracle & Open Source" - O'Reilly
http://www.oreilly.com/catalog/oracleopen/

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to