Similar thoughts here. I don't have pointers to ML threads or JIRA issues,
but this area has been discussed before, and I believe the thinking was that
what's needed is a general interface/abstraction/API for storing field data
in, and loading it from, an external component, be that BDB, an RDBMS, or
something like Skwish. I *think* that often came up in the context of
Document updates (as opposed to delete+add).
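
To make the idea a bit more concrete, here is a minimal sketch of what such an
abstraction might look like. The names ExternalFieldStore and
InMemoryFieldStore are hypothetical and are not existing Lucene API; the
in-memory map is only a stand-in for a real backend such as BDB, an RDBMS, or
Skwish. The returned id is what one would put in the Lucene stored field
instead of the value itself.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical abstraction for keeping field values outside the index.
// This is an illustration only, not an interface that exists in Lucene.
interface ExternalFieldStore {
    /** Stores the value and returns an opaque id to keep in the index. */
    String store(byte[] value);

    /** Returns the value for the id, or null if the id is unknown. */
    byte[] load(String id);

    /** Removes the value; returns true if something was actually deleted. */
    boolean delete(String id);
}

// Stand-in backend: a concurrent map keyed by a random UUID.
// A real implementation would delegate to BDB, JDBC, Skwish, etc.
class InMemoryFieldStore implements ExternalFieldStore {
    private final Map<String, byte[]> blobs = new ConcurrentHashMap<String, byte[]>();

    public String store(byte[] value) {
        String id = UUID.randomUUID().toString();
        blobs.put(id, value.clone()); // defensive copy so callers cannot mutate stored data
        return id;
    }

    public byte[] load(String id) {
        byte[] value = blobs.get(id);
        return value == null ? null : value.clone();
    }

    public boolean delete(String id) {
        return blobs.remove(id) != null;
    }
}
```

Keeping the interface down to store/load/delete is a deliberate choice in this
sketch: anything that can map an id to bytes can be plugged in behind it.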


I didn't look at Skwish, but I think this is the direction to explore, Babak,
especially if we can come up with something that lets one plug in other types
of storage and also handles the transactional concerns Ian mentioned.
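
As a rough illustration of the transactional side, the sketch below writes the
blob first, then commits on the index side, and removes the blob again if that
commit fails. It also includes a reconciliation pass for blobs whose index
entries have been deleted. Everything here is an assumption for illustration:
BlobIndexCoordinator and IndexCommit are hypothetical names, IndexCommit merely
stands in for whatever adds a document and commits it, and a plain HashMap
stands in for the external store. It is not thread-safe.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical coordinator between an external blob store and an index.
class BlobIndexCoordinator {

    /** Hypothetical stand-in for "add a document holding this id and commit". */
    interface IndexCommit {
        void addAndCommit(String blobId);
    }

    private final Map<String, byte[]> blobs = new HashMap<String, byte[]>();
    private int nextId = 0;

    /** Writes the blob, then the index entry; undoes the blob if the index side fails. */
    String add(byte[] value, IndexCommit index) {
        String id = "blob-" + (nextId++);
        blobs.put(id, value);
        try {
            index.addAndCommit(id);
        } catch (RuntimeException e) {
            blobs.remove(id); // compensate: roll back the blob write
            throw e;
        }
        return id;
    }

    /** Reconciliation: deletes blobs the index no longer references; returns the removed ids. */
    Set<String> removeOrphans(Set<String> idsReferencedByIndex) {
        Set<String> orphans = new HashSet<String>(blobs.keySet());
        orphans.removeAll(idsReferencedByIndex);
        blobs.keySet().removeAll(orphans);
        return orphans;
    }

    int size() {
        return blobs.size();
    }
}
```

Writing the blob before the index entry means a failure can leave at most an
unreferenced blob, never an index entry pointing at nothing, and the
reconciliation pass can clean up such leftovers later.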

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: Ian Holsman <li...@holsman.net>
> To: java-dev@lucene.apache.org
> Sent: Friday, December 26, 2008 5:40:36 AM
> Subject: Re: Blob storage
> 
> Babak Farhang wrote:
> > Most of all, I'm trying to communicate an *idea* which itself cannot
> > be encumbered by any license, anyway. But if you want to incorporate
> > some of this code into an asf project, I'd be happy to also release it
> > under the apache license. Hope the license I chose for my project
> > doesn't get in the way of this conversation..
> >  
> 
> as an idea, let me offer some thoughts.
> - there will be a trade-off where reading the info from a 2nd system 
> would be slower than just a single call which has all the results. 
> Especially if you have to fetch a couple of these things.
> 
> - how is this different from BDB plus a UUID? Couldn't you just store it 
> using that?
> 
> - how are you going to deal with situations where the commit fails in 
> Lucene? Does the client have to recognize this and roll back skwish?
> 
> - there will need to be some kind of reconciliation process that will 
> need to deal with inconsistencies where someone forgets to delete the 
> skwish object when they have deleted the lucene record.
> 
> on a positive note, it would shrink the index size and allow more 
> records to fit in memory.
> 
> Regards
> Ian
> > On Fri, Dec 26, 2008 at 12:46 AM, Noble Paul നോബിള്‍ नोब्ळ्
> > wrote:
> >  
> >> The license is GPL. It cannot be used directly in any Apache projects.
> >>
> >> On Fri, Dec 26, 2008 at 12:47 PM, Babak Farhang wrote:
> >>    
> >>>> I assume one could use Skwish instead of Lucene's normal stored fields to
> >>>> store & retrieve document data?
> >>>>        
> >>> Exactly: instead of storing the field's value directly in Lucene, you
> >>> could store it in skwish and then store its skwish id in the Lucene
> >>> field instead.  This works well for serving large streams (e.g.
> >>> original document contents).
> >>>
> >>>      
> >>>> Have you run any threaded performance tests comparing the two?
> >>>>        
> >>> No direct comps, yet.
> >>>
> >>> -b
> >>>
> >>>
> >>> On Thu, Dec 25, 2008 at 5:22 AM, Michael McCandless
> >>> wrote:
> >>>      
> >>>> This looks interesting!
> >>>> I assume one could use Skwish instead of Lucene's normal stored fields to
> >>>> store & retrieve document data?
> >>>> Have you run any threaded performance tests comparing the two?
> >>>> Mike
> >>>>
> >>>> Babak Farhang wrote:
> >>>>        
> >>>>> Hi everyone,
> >>>>>
> >>>>> I've been working on a library called Skwish to complement indexes
> >>>>> like Lucene,  for blob storage and retrieval. This is nothing more
> >>>>> than a structured implementation of storing all the files in one file
> >>>>> and managing their offsets in another.  The idea is to provide a fast,
> >>>>> concurrent, lock-free way to serve lots of files to lots of users.
> >>>>>
> >>>>> Hope you find it useful or interesting.
> >>>>>
> >>>>> -Babak
> >>>>> http://skwish.sourceforge.net/
> >>>>>
> >>>>> ---------------------------------------------------------------------
> >>>>> To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
> >>>>> For additional commands, e-mail: java-dev-h...@lucene.apache.org
> >>>>>
> >>>>>          
> >>>>        
> >>>
> >>>
> >>>      
> >>
> >> --
> >> --Noble Paul
> >>
> >>
> >>
> >>    
> 
> 


