On Saturday 13 September 2008, Shi Yusen wrote:
> Perhaps something easy and simple can be done first: creating and
> updating product info in a catalog to a Lucene index.
>
> Whether or not the Lucene index would be used in the OFBiz backoffice,
> it's still very useful for some other scenarios, such as a catalog
> CD. Someone may want to distribute a catalog and other info on a CD-ROM
> or similar; a Java-based client can then use the Lucene index to search
> without an OFBiz installation.
>
> Actually, we can offer the latter part in a component with a
> configurable search pipeline function.
>
> Shi Yusen/Beijing Langhua Ltd.
>
> On Sat, 2008-09-13 at 11:18 -0600, David E Jones wrote:
>
> > While it's possible that Lucene (or Solr) is faster for the keyword
> > searches I wouldn't be convinced until I saw a comparison done on a
> > reasonably large data set between Lucene and the ProductKeyword table
> > using a few different keyword combinations. With ProductKeyword we're
> > using a database index on the keywords to look up productIds, which is
> > basically what Lucene does with its own reverse index.
> >
> > Lucene does do some cool search expression stuff that our current
> > product searching doesn't support. However, the current product search
> > does support various features like stem removal and thesaurus
> > expansion (which has been mentioned in this thread).
> >
> > One of the really big problems with moving to Lucene is how to handle
> > the parametric searching and flexible sorting that we currently do by
> > taking advantage of a dozen or so tables in the database to search on
> > features associated with products and categories (optionally including
> > all sub-categories) and prices and catalogs and stores, and on top of
> > that it's easy to add constraints for just about anything else you
> > might associate with a product.
> >
> > The option of doing a Lucene search first to get a set of productIds
> > that match and then passing that to the database with a possibly
> > massive IN expression would work, but might perform horribly because
> > of all of the data that needs to be moved around and such.
> >
> > If Solr supports this sort of parametric search it might be
> > interesting, but it would be a LOT of redundant data to keep track of,
> > and I don't really like that a whole lot...
> >
> > So, back to the beginning: unless someone can show that Lucene beats
> > out the keyword indexing that a good database (properly configured
> > to make sure the keyword index is working and so on) does with the
> > ProductKeyword table, I wouldn't even want to start going in this
> > direction.
> >
> > -David
> >
> > On Sep 13, 2008, at 6:43 AM, Patrick Antivackis wrote:
> > > Hello,
> > > Just to shed some light on the product search.
> > > Main class involved:
> > > applications/product/src/org/ofbiz/product/product/ProductSearch.java
> > >
> > > It's 100% DBMS-based, not Lucene or anything else.
> > >
> > > As a reminder, there is an entity in OFBiz called ProductKeyword
> > > whose primary key is ProductId and Keyword (varchar(60)) and that is
> > > filled at each creation or update of the product's characteristics,
> > > name, fields, ...
> > >
> > > So is it today the best and most efficient way to do search? Uh oh,
> > > not sure you are right. But for products only, it's usually enough
> > > (speaking of boolean search). Now, if we also need to index files
> > > that are associated with products, and maybe (I don't know whether
> > > this exists already, as I never looked) CMS content and files
> > > uploaded through the CMS, a solution based on a real search engine
> > > should be far superior.
> > >
> > > Regards
> > >
> > > 2008/9/11 madppiper <[EMAIL PROTECTED]>
> > >
> > >> BJ Freeman wrote:
> > >>> You have stated what caused the responses, when you made
> > >>> assumptions.
> > >>> [I have worked with Solr, not Lucene.]
> > >>>
> > >>> You have not investigated how OFBiz works.
> > >>
> > >> I think that comments like that are not only unnecessary, but
> > >> unhealthy for any open discussion. (Please read my original message
> > >> again, replace the term "proprietary" with "native", keep in mind
> > >> that OFBiz does NOT use Lucene for searching - so I was told several
> > >> times now, and then skip through the original question at hand)
> > >>
> > >> @Jacques: Thanks for the response - not quite. There are actually two
> > >> questions at hand:
> > >>
> > >> 1)
> > >> What search engine, if any, is used by OFBiz to generate keyword
> > >> search results for Products?
> > >>
> > >> 2)
> > >> If 1) can be answered with "NO search engine per se" - which would
> > >> imply that we are doing real database queries right now (perhaps ones
> > >> that use fulltext-query algorithms), would it not be a good idea to
> > >> move to a standalone search engine such as Solr?
> > >>
> > >> --
> > >> View this message in context:
> > >> http://www.nabble.com/Replacing-Lucene-with-Solr-tp19412826p19429281.html
> > >> Sent from the OFBiz - Dev mailing list archive at Nabble.com.

Might some answers be found by looking at the performance of H2, which
has Lucene built into it?

David
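
For anyone wanting to try the comparison discussed in the quoted thread, here is a minimal sketch of the two approaches side by side. It is only an illustration under stated assumptions: it uses Python's built-in SQLite rather than OFBiz's entity engine, a plain dict stands in for Lucene's inverted index, and the table names (`product`, `product_keyword`), columns, and function names are all hypothetical, not OFBiz's actual schema or API. The first search does keyword matching and a price constraint in one SQL query; the second resolves keywords outside the database and then passes the matching ids back in an IN expression.

```python
import sqlite3


def build_db():
    """Create a tiny in-memory catalog (hypothetical schema, not OFBiz's)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE product (product_id TEXT PRIMARY KEY, price REAL);
        CREATE TABLE product_keyword (
            product_id TEXT NOT NULL,
            keyword    TEXT NOT NULL,
            PRIMARY KEY (product_id, keyword)
        );
        -- Index on keyword so lookups go keyword -> product_id.
        CREATE INDEX idx_keyword ON product_keyword (keyword);
    """)
    products = {
        "P1": (10.0, ["red", "widget"]),
        "P2": (25.0, ["blue", "widget"]),
        "P3": (40.0, ["red", "gadget"]),
    }
    for pid, (price, keywords) in products.items():
        conn.execute("INSERT INTO product VALUES (?, ?)", (pid, price))
        conn.executemany(
            "INSERT INTO product_keyword VALUES (?, ?)",
            [(pid, kw) for kw in keywords],
        )
    return conn


def search_in_database(conn, keywords, max_price):
    """AND-match all keywords and apply the price constraint in one query."""
    marks = ",".join("?" * len(keywords))
    sql = f"""
        SELECT p.product_id
        FROM product p
        JOIN product_keyword k ON k.product_id = p.product_id
        WHERE k.keyword IN ({marks}) AND p.price <= ?
        GROUP BY p.product_id
        HAVING COUNT(DISTINCT k.keyword) = ?
        ORDER BY p.price
    """
    rows = conn.execute(sql, (*keywords, max_price, len(keywords)))
    return [row[0] for row in rows]


def build_inverted_index(conn):
    """Stand-in for an external index: keyword -> set of product ids."""
    index = {}
    for pid, kw in conn.execute("SELECT product_id, keyword FROM product_keyword"):
        index.setdefault(kw, set()).add(pid)
    return index


def search_two_phase(conn, index, keywords, max_price):
    """Phase 1: keyword match outside the database.
    Phase 2: send the matching ids back in an IN expression for the
    parametric constraint and sorting."""
    id_sets = [index.get(kw, set()) for kw in keywords]
    ids = sorted(set.intersection(*id_sets)) if id_sets else []
    if not ids:
        return []
    marks = ",".join("?" * len(ids))  # grows with the number of matches
    sql = f"""
        SELECT product_id FROM product
        WHERE product_id IN ({marks}) AND price <= ?
        ORDER BY price
    """
    return [row[0] for row in conn.execute(sql, (*ids, max_price))]


if __name__ == "__main__":
    conn = build_db()
    index = build_inverted_index(conn)
    print(search_in_database(conn, ["red"], 50.0))
    print(search_two_phase(conn, index, ["red"], 50.0))
```

Both functions return the same ids on this toy data; the difference that matters for the thread is that the two-phase version has to ship every matching id back to the database, so the IN list grows with the result set. Timing both on a reasonably large data set, with a real Lucene index in place of the dict, would be the actual comparison.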