Sure, I have no problem with that. And if you want to switch it off by default to keep the current behavior (CI), I'm fine with that too.
Thanks!

________________________________________
From: K. John Wu [[email protected]]
Sent: March-09-12 10:47 PM
To: Dominique Prunier
Cc: FastBit Users
Subject: Re: [FastBit-users] PATCH: new CS_PATTERN_MATCH define added to match LIKE patterns case-sensitively and perform specific optimizations

Hi, Dominique,

I would like to add the FASTBIT_ prefix to the macro CS_PATTERN_MATCH to avoid possible collisions when FastBit is used with other packages. Hope you don't mind.

John

On 3/9/12 4:03 PM, K. John Wu wrote:
> Hi, Dominique,
>
> I have run through my usual set of tests and did not find any problem
> with your patch. It is now in SVN 482. Please give it a try when you
> get the chance.
>
> Thanks.
>
> John
>
> On 3/9/12 10:17 AM, Dominique Prunier wrote:
>> Quick update to my patch:
>>
>> · Changed dictionary::patternMatch to make it work with CI too
>>   (and I think, for efficiency reasons, I have to keep all this here)
>>
>> · Moved the STR_MATCH_* constants from util.cpp to util.h and
>>   use them in dictionary::patternMatch
>>
>> · Removed the CS/CI ifdef from category.cpp
>>
>> I did more testing, and on my set of ~90 000 test queries, the
>> execution time dropped from ~515 seconds to ~20 seconds.
>>
>> Thanks,
>>
>> *From:* [email protected]
>> [mailto:[email protected]] *On Behalf Of* Dominique Prunier
>> *Sent:* Thursday, March 08, 2012 2:39 PM
>> *To:* FastBit Users
>> *Subject:* [FastBit-users] PATCH: new CS_PATTERN_MATCH define added to
>> match LIKE patterns case-sensitively and perform specific optimizations
>>
>> Here is the first version of my patch to switch SQL LIKE from
>> case-insensitive to case-sensitive and optimize this use case with
>> CATEGORY columns.
>>
>> In a nutshell, what changed is:
>>
>> · We extract the longest constant prefix from the pattern
>>   (handling the escape char too)
>>
>> · Instead of testing every value in the dictionary, we binary
>>   search for the range of values to test (which sometimes even lets us
>>   skip pattern matching entirely, if no valid range can be found)
>>
>> · We test every value in the range
>>
>> On a large dictionary (~130k entries), I've commonly seen it be one or
>> two orders of magnitude faster (in my example, a simple query with a
>> single LIKE predicate drops from ~10ms to ~0.4ms).
>>
>> What I'd like to change/refactor (I'm really a newbie in C++):
>>
>> · Remove the prefix extraction and pattern matching code from
>>   dictionary and replace the added method patternSearch by something
>>   like findRange. I believe that matching and pattern handling code
>>   doesn't belong in the dictionary. I'd rather move this back to the
>>   category class or something.
>>
>> · Having to use a C++ string object to rebuild the longest
>>   constant prefix bugs me (suggestions?). I'm also thinking of having a
>>   version that doesn't support escaping, but it would force me to change
>>   strMatch a bit more
>>
>> · To closely match the previous behavior, you can't match an
>>   empty pattern (even the empty string doesn't match); maybe that would
>>   be worth changing
>>
>> As always John, feel free to include this into the main branch. I'm
>> waiting for suggestions to make it more efficient, cleaner, ...
>>
>> Thanks,
>>
>> Dominique Prunier
>> APG Lead Developer
>> 4388, rue Saint-Denis
>> Bureau 309
>> Montreal (Quebec) H2J 2L1
>> Tel. +1 514-842-6767 x310
>> Fax +1 514-842-3989
>> [email protected]
>> www.watch4net.com
>>
>> This message is for the designated recipient only and may contain
>> privileged, proprietary, or otherwise private information. If you have
>> received it in error, please notify the sender immediately and delete
>> the original. Any other use of this electronic mail by you is prohibited.
>>
>> _______________________________________________
>> FastBit-users mailing list
>> [email protected]
>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
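
For readers following the thread, the three steps from the original patch description (extract the longest constant prefix, binary search the matching range, test only the values in that range) can be sketched roughly as below. This is not the code from the patch or from FastBit: the names constantPrefix, likeMatch and patternSearch are hypothetical here, the dictionary is assumed to be a std::vector<std::string> sorted in ascending byte order, the escape character is assumed to be a backslash, and matching is case-sensitive only.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: longest literal prefix of a LIKE pattern.
// '%' and '_' are wildcards; `esc` makes the following character literal.
std::string constantPrefix(const std::string& pat, char esc = '\\') {
    std::string prefix;
    for (std::size_t i = 0; i < pat.size(); ++i) {
        char c = pat[i];
        if (c == '%' || c == '_') break;                   // first wildcard ends the prefix
        if (c == esc && i + 1 < pat.size()) c = pat[++i];  // unescape the next char
        prefix.push_back(c);
    }
    return prefix;
}

// Hypothetical case-sensitive LIKE matcher (simple backtracking, sketch only).
bool likeMatch(const char* s, const char* p, char esc = '\\') {
    while (*p != '\0') {
        if (*p == '%') {
            ++p;
            for (;; ++s) {                                 // try every length for '%'
                if (likeMatch(s, p, esc)) return true;
                if (*s == '\0') return false;
            }
        }
        if (*s == '\0') return false;
        if (*p == '_') { ++s; ++p; continue; }             // '_' matches any one char
        if (*p == esc && p[1] != '\0') ++p;                // escaped literal
        if (*s != *p) return false;
        ++s; ++p;
    }
    return *s == '\0';
}

// Returns the positions in the sorted dictionary that match the pattern.
std::vector<std::size_t> patternSearch(const std::vector<std::string>& dict,
                                       const std::string& pattern) {
    std::vector<std::size_t> hits;
    if (pattern.empty()) return hits;  // mirrors "an empty pattern matches nothing"
    const std::string prefix = constantPrefix(pattern);
    // Binary search: first entry >= prefix ...
    auto lo = std::lower_bound(dict.begin(), dict.end(), prefix);
    // ... and first entry after it that no longer starts with the prefix.
    auto hi = std::partition_point(lo, dict.end(), [&prefix](const std::string& v) {
        return v.compare(0, prefix.size(), prefix) == 0;
    });
    // Only the entries in [lo, hi) need the full pattern test.
    for (auto it = lo; it != hi; ++it)
        if (likeMatch(it->c_str(), pattern.c_str()))
            hits.push_back(static_cast<std::size_t>(it - dict.begin()));
    return hits;
}
```

With a pattern such as "ban%", only the entries starting with "ban" are ever passed to the matcher; a pattern that starts with a wildcard has an empty prefix and falls back to scanning the whole dictionary. The prefix shortcut relies on case-sensitive matching, since a case-insensitive match would not map to one contiguous range in a byte-ordered dictionary.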
