Hi, Dominique,

I would like to add the FASTBIT_ prefix to the macro CS_PATTERN_MATCH to
avoid possible collisions when FastBit is used with other packages.
Hope you don't mind.

John


On 3/9/12 4:03 PM, K. John Wu wrote:
> Hi, Dominique,
> 
> I have run through my usual set of tests and did not find any problem
> with your patch.  It is now in SVN 482.  Please give it a try when you
> get the chance.
> 
> Thanks.
> 
> John
> 
> 
> 
> On 3/9/12 10:17 AM, Dominique Prunier wrote:
>> Quick update to my patch:
>>
>> ·         Changed dictionary::patternMatch to make it work with CI too
>> (and I think, for efficiency reasons, I have to keep all of this here)
>>
>> ·         Moved the STR_MATCH_* constants from util.cpp to util.h and
>> use them in dictionary::patternMatch
>>
>> ·         Removed the CS/CI ifdef from category.cpp
>>
>>  
>>
>> I did more testing, and on my set of ~90 000 test queries, the
>> execution time dropped from ~515 seconds to ~20 seconds.
>>
>>  
>>
>> Thanks,
>>
>>  
>>
>> *From:*[email protected]
>> [mailto:[email protected]] *On Behalf Of *Dominique
>> Prunier
>> *Sent:* Thursday, March 08, 2012 2:39 PM
>> *To:* FastBit Users
>> *Subject:* [FastBit-users] PATCH: new CS_PATTERN_MATCH define added to
>> match LIKE patterns case-sensitively and perform specific optimizations
>>
>>  
>>
>> Here is the first version of my patch to switch SQL LIKE from
>> case-insensitive to case-sensitive matching and to optimize this use
>> case for CATEGORY columns.
>>
>>  
>>
>> In a nutshell, what changed is:
>>
>> ·         We extract the longest (handling the escape char too)
>> constant prefix from the pattern
>>
>> ·         Instead of testing every value in the dictionary, we binary
>> search for the range of values to test (which sometimes even allows
>> skipping pattern matching entirely, if no valid range can be found)
>>
>> ·         We test every value in the range
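
A minimal sketch of the three steps listed above (prefix extraction,
binary search for the candidate range, pattern test on that range only).
This is not the code from the patch. The names constantPrefix and
likeMatch and this patternSearch signature are hypothetical, and the
sketch assumes a sorted std::vector<std::string> standing in for the
dictionary, '%' and '_' as wildcards, and backslash as the escape
character.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: longest literal prefix of a LIKE pattern.
// Stops at the first unescaped wildcard or at a dangling escape.
std::string constantPrefix(const std::string& pat) {
    std::string prefix;
    for (std::size_t i = 0; i < pat.size(); ++i) {
        char c = pat[i];
        if (c == '%' || c == '_') break;
        if (c == '\\') {
            if (i + 1 >= pat.size()) break;  // dangling escape: stop here
            c = pat[++i];                    // escaped char is a literal
        }
        prefix += c;
    }
    return prefix;
}

// Hypothetical case-sensitive LIKE matcher over C strings.
bool likeMatch(const char* s, const char* p) {
    for (; *p != '\0'; ++p, ++s) {
        if (*p == '%') {
            while (*p == '%') ++p;           // collapse runs of '%'
            if (*p == '\0') return true;     // trailing '%' matches the rest
            for (; *s != '\0'; ++s)
                if (likeMatch(s, p)) return true;
            return false;
        }
        if (*s == '\0') return false;        // pattern needs one more char
        if (*p == '_') continue;             // '_' matches any single char
        if (*p == '\\' && p[1] != '\0') ++p; // compare the escaped char
        if (*p != *s) return false;
    }
    return *s == '\0';
}

// Returns the positions of all entries of the sorted dictionary that
// match the pattern, testing only entries that share its literal prefix.
std::vector<std::size_t> patternSearch(const std::vector<std::string>& dict,
                                       const std::string& pat) {
    assert(std::is_sorted(dict.begin(), dict.end()));
    std::vector<std::size_t> hits;
    if (pat.empty()) return hits;  // as in the thread: empty pattern matches nothing
    const std::string prefix = constantPrefix(pat);
    auto first = dict.begin();
    auto last = dict.end();
    if (!prefix.empty()) {
        // First entry >= prefix; entries starting with the prefix follow
        // it contiguously because the dictionary is sorted.
        first = std::lower_bound(dict.begin(), dict.end(), prefix);
        last = std::partition_point(first, dict.end(),
            [&prefix](const std::string& s) {
                return s.compare(0, prefix.size(), prefix) == 0;
            });
    }
    for (auto it = first; it != last; ++it)
        if (likeMatch(it->c_str(), pat.c_str()))
            hits.push_back(static_cast<std::size_t>(it - dict.begin()));
    return hits;
}
```

With a pattern such as "ban%a", only the entries that start with "ban"
are handed to the matcher. When no entry shares the prefix, the range is
empty and the matcher is never called. A pattern that begins with a
wildcard has an empty prefix, so the whole dictionary is scanned as
before.
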
>>
>>  
>>
>> On a large dictionary (~130k entries), it is commonly one or two
>> orders of magnitude faster (in my example, a simple query with a
>> single LIKE predicate drops from ~10ms to ~0.4ms).
>>
>>  
>>
>> What I'd like to change/refactor (I'm really a newbie in C++):
>>
>> ·         Remove the prefix extraction and pattern matching code from
>> dictionary and replace the added method patternSearch by something
>> like findRange. I believe that matching and pattern handling code
>> doesn’t belong to the dictionary. I’d rather move this back to the
>> category class or something.
>>
>> ·         Having to use a C++ string object to rebuild the longest
>> constant prefix bugs me (suggestions?). I'm also thinking of adding a
>> version that doesn't support escaping, but that would force me to
>> change strMatch a bit more
>>
>> ·         To closely match the previous behavior, you can't match an
>> empty pattern (even the empty string doesn't match); maybe that would
>> be worth changing
>>
>>  
>>
>> As always John, feel free to include this into the main branch. I’m
>> waiting for suggestions to make it more efficient, cleaner, ...
>>
>>  
>>
>> Thanks,
>>
>>  
>>
>> Dominique Prunier
>>
>>  APG Lead Developper
>>
>>  4388, rue Saint-Denis
>>
>>  Bureau 309
>>
>>  Montreal (Quebec)  H2J 2L1
>>
>>  Tel. +1 514-842-6767  x310
>>
>>  Fax +1 514-842-3989
>>
>>  [email protected] <mailto:[email protected]>
>>
>>  www.watch4net.com <http://www.watch4net.com/>
>>
>> _______________________________________________
>> FastBit-users mailing list
>> [email protected]
>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
