On Saturday, November 29, 2003, at 07:00 AM, Stefano Mazzocchi wrote:
I think you'll find that Lucene will serve Slide's needs nicely - you'll just have to be a little creative in how you build Lucene Document objects and break things into fields. Lucene is a "flat" structure, so implying hierarchy requires some thought - perhaps just the URI will work to give you the hierarchy you need. But if properties are also hierarchical (can't non-live, or "dead", properties contain an entire DOM tree?) then things will get more interesting and tricky.

Hmmm, seems to me like trying to fit a square into a rounded hole.

Perhaps. But folks are doing object-relational mapping with databases, and a database could be viewed as simply a flat structure of bytes on a disk. So mapping Lucene's flat structure into something more structured and hierarchical is doable. ZOE (the e-mail client-server-indexer) does a lot of this type of stuff using Lucene, in fact.
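
To make the flattening idea concrete, here is a minimal sketch in plain Java, with no Lucene calls. It turns a resource's URI plus a nested property tree into a flat map of field names to string values, joining nested names with "/". The class name, the map-based property model, and the path-style field names are all assumptions made for illustration; this is not how Slide or ZOE is known to do it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: flattens nested properties into flat field/value pairs. */
final class FlattenSketch {

    /** Returns one flat field map per resource; "uri" carries the hierarchy position. */
    static Map<String, String> flatten(String uri, Map<String, Object> properties) {
        Map<String, String> fields = new LinkedHashMap<>();
        addAll("", properties, fields);
        // Written last so a property that happens to be named "uri" cannot shadow it.
        fields.put("uri", uri);
        return fields;
    }

    @SuppressWarnings("unchecked")
    private static void addAll(String prefix, Map<String, Object> node, Map<String, String> out) {
        for (Map.Entry<String, Object> e : node.entrySet()) {
            String name = prefix.isEmpty() ? e.getKey() : prefix + "/" + e.getKey();
            if (e.getValue() instanceof Map) {
                // A nested property becomes several fields sharing a name prefix.
                addAll(name, (Map<String, Object>) e.getValue(), out);
            } else {
                out.put(name, String.valueOf(e.getValue()));
            }
        }
    }
}
```

Each entry of the resulting map would then become one field of a flat document, so a nested "author" property with a "name" child ends up as a single field called "author/name".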


But, certainly it is just one possible solution and may not be the most pragmatic one. If a database is being used for property storage already, then Lucene might be overkill for a query like you provided.

Can you elaborate more on how you would do a query like

 SELECT {DAV}allprop
 FROM /files/whatever
 WHERE {DAV}contentlength > 40000
 ORDER BY {DAV}lastmodified

on top of Lucene?

I would AND together a PrefixQuery for URI "/files/whatever" (allowing it to search a sub-tree rooted there) with a RangeQuery on field "contentlength" for values greater than 40000.
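
A rough illustration of that selection, written in plain Java rather than against Lucene's own API: the prefix test stands in for the PrefixQuery on the URI, and the string comparison stands in for the RangeQuery. Term-based range matching compares values as strings, so the sketch assumes content lengths are indexed zero-padded to a fixed width. The class name, the Doc type, the 12-digit width, and the "/" boundary check on the prefix are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in for "PrefixQuery AND RangeQuery" over flat documents. */
final class SubtreeRangeSketch {

    /** Flat record playing the role of an indexed document. */
    static final class Doc {
        final String uri;
        final String contentLength; // stored zero-padded, as it would be indexed

        Doc(String uri, long contentLength) {
            this.uri = uri;
            this.contentLength = pad(contentLength);
        }
    }

    /** Pads a non-negative number so that string order equals numeric order. */
    static String pad(long n) {
        return String.format(Locale.ROOT, "%012d", n);
    }

    /** True for the root itself and anything below it, but not for siblings
     *  such as "/files/whatever2" that merely share the same leading characters. */
    static boolean inSubtree(String uri, String root) {
        return uri.equals(root) || uri.startsWith(root + "/");
    }

    /** Keeps documents under root whose content length is strictly greater than min. */
    static List<Doc> select(List<Doc> docs, String root, long min) {
        String lower = pad(min);
        List<Doc> hits = new ArrayList<>();
        for (Doc d : docs) {
            if (inSubtree(d.uri, root) && d.contentLength.compareTo(lower) > 0) {
                hits.add(d);
            }
        }
        return hits;
    }
}
```

The padding matters: compared as plain strings, "9000" sorts after "40000", so an unpadded range would wrongly admit a 9000-byte file.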


Ordering is not something Lucene does, though, other than by its score calculation, so this is where the mismatch is strongest. If you're doing full-text searching combined with these types of conditions and want the results ordered by how well the documents match your query, then Lucene will shine. Traditional relational-database queries with ORDER BY clauses don't map as well. In this case, though, ordering can be applied after the query results are returned, since you will want to collect all documents that match the query anyway. I'd almost be willing to bet that Lucene will beat most, if not all, relational databases here, especially in this case where the hierarchy is being recursively traversed.
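
Sorting after the fact could look something like the following sketch, again in plain Java with no Lucene calls. It assumes every matching document has already been collected into a list and that each hit carries its last-modified time as a number; the Hit type and the method name are made up for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical post-processing step: ORDER BY lastmodified applied to collected hits. */
final class PostQuerySortSketch {

    /** Minimal view of a matching document. */
    static final class Hit {
        final String uri;
        final long lastModified; // e.g. milliseconds since the epoch

        Hit(String uri, long lastModified) {
            this.uri = uri;
            this.lastModified = lastModified;
        }
    }

    /** Returns a new list, oldest first; the caller's list is left untouched. */
    static List<Hit> orderByLastModified(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingLong((Hit h) -> h.lastModified));
        return sorted;
    }
}
```

Since the whole result set is in memory at that point anyway, the extra cost is one sort over the hits rather than anything inside the index.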

Erik

