+1

----- Original Message ----- 
From: "Kevin Burton" <[EMAIL PROTECTED]>
To: "Lucene Users List" <[EMAIL PROTECTED]>
Sent: Tuesday, May 18, 2004 2:43 PM
Subject: Internal full content store within Lucene


> Per the discussion the other day about storing content external to 
> Lucene I think we have an opportunity to improve the lucene core and 
> bring a lot of functionality to future developers.
> 
> Right now Lucene allows you to have a 'stored' field which keeps the 
> content with a segment along with your inverted index.
> 
> While this is flexible for small indexes in production environments it 
> falls down because index merges take FOREVER.
> 
> A thread the other day opened up and suggesting storing just a pointer 
> to a file on the filesystem.  This got me thinking about a long term 
> mechanism I wanted for our cluster where we store content outside of the 
> index in a high performance flat-file database.
> 
> The Lucene index would only maintain FILENO-:OFFSET:LENGTH info within 
> the index and this would allow us to point to our flat file database. 
> 
> This would allow Lucene index merges to be FAST, support native field 
> storage, and allow the filesystem optimize contiguous blocks for the 
> flat content store.  Everyone wins.
> 
> This is what the Internet archive uses:
> 
> http://www.archive.org/web/researcher/ArcFileFormat.php
> 
> I propose that Lucene support a new form of stored field that allows 
> external storage engine to keep the content in a flat text store.
> 
> How much interest is there for this?  I have to do this for work and 
> will certainly take the extra effort into making this a standard Lucene 
> feature. 
> 
> I can come up with a requirements doc and a more formal proposal in 
> another email if I get enough +1s...
> 
> Kevin
> 
> -- 
> 
> Please reply using PGP.
> 
>     http://peerfear.org/pubkey.asc    
>     
>     NewsMonster - http://www.newsmonster.org/
>     
> Kevin A. Burton, Location - San Francisco, CA, Cell - 415.595.9965
>        AIM/YIM - sfburtonator,  Web - http://peerfear.org/
> GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
>   IRC - freenode.net #infoanarchy | #p2p-hackers | #newsmonster
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
> 
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to