Hey John,

The problem is that it doesn't actually fail when reading the index. The index 
is read, but during evaluation I get segfaults, bogus results, or valgrind 
errors. Once I regenerated the indexes for my category column, everything 
worked like a charm.

It was also misleading because of the other issue (unsigned ints that should 
have been signed ints), which segfaulted too.

Thanks,

-----Original Message-----
From: K. John Wu [mailto:[email protected]] 
Sent: Monday, March 12, 2012 12:43 PM
To: Dominique Prunier
Cc: FastBit Users
Subject: Re: [FastBit-users] PATCH: new CS_PATTERN_MATCH define added to match 
LIKE patterns case-sensitively and perform specific optimizations

Hi, Dominique,

I thought that I had checked index types.  If you happen to know the
stack trace for the reading operation, let me know.  Otherwise, it
might take me a while to figure out a good way to reproduce the problem.

John


On 3/12/12 9:30 AM, Dominique Prunier wrote:
> Ok, figured out the other segfault. The indexes have to be regenerated after the 
> change from relic to direkte. My guess is that it was reading something 
> invalid. Is there a missing check in the index read method?
> 
> Thanks,
> 
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Dominique Prunier
> Sent: Monday, March 12, 2012 11:45 AM
> To: K. John Wu
> Cc: FastBit Users
> Subject: Re: [FastBit-users] PATCH: new CS_PATTERN_MATCH define added to 
> match LIKE patterns case-sensitively and perform specific optimizations
> 
> Hey John,
> 
> The fix, as checked out in revision 484, breaks the binary search for the 
> pattern prefix:
> -       int32_t b = 0;
> -       int32_t e = key_.size() - 1;
> +       uint32_t b = 0;
> +       uint32_t e = key_.size() - 1;
> 
> Since the stop condition of the loop can be that one of the indexes is -1 
> (which an unsigned index cannot represent), this now fails with a segfault.
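To make the failure mode concrete, here is a minimal sketch of a closed-interval binary search of the kind described above. It is not the FastBit code: `Keys`, `firstNotLessSigned` and `firstNotLessUnsigned` are hypothetical names, and a plain sorted vector stands in for key_. The first function needs signed indices because its upper bound can legitimately reach -1; the second shows one way to keep unsigned indices by switching to a half-open interval.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the sorted dictionary keys (not FastBit's key_).
using Keys = std::vector<std::string>;

// Closed-interval search over [b, e] for the first key >= target.
// The indices must be signed: when target sorts before keys[0] (or keys is
// empty), e becomes -1. With uint32_t that value wraps to 4294967295, the
// loop keeps running, and keys[m] reads far out of bounds.
int32_t firstNotLessSigned(const Keys& keys, const std::string& target) {
    int32_t b = 0;
    int32_t e = static_cast<int32_t>(keys.size()) - 1;  // -1 if keys is empty
    while (b <= e) {
        const int32_t m = b + (e - b) / 2;
        if (keys[m] < target) {
            b = m + 1;
        } else {
            e = m - 1;  // may reach -1, which ends the loop
        }
    }
    return b;  // equals keys.size() when every key is < target
}

// Half-open search over [b, e): no bound ever drops below zero, so
// unsigned indices are safe here.
uint32_t firstNotLessUnsigned(const Keys& keys, const std::string& target) {
    uint32_t b = 0;
    uint32_t e = static_cast<uint32_t>(keys.size());
    while (b < e) {
        const uint32_t m = b + (e - b) / 2;
        if (keys[m] < target) {
            b = m + 1;
        } else {
            e = m;
        }
    }
    return b;
}
```

Both functions return the same position for any sorted input, so either reverting to int32_t or moving to the half-open form should avoid the crash; which one fits the surrounding code better is a judgment call.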
> 
> I'm troubleshooting another segfault in the bitvector right now (could it be 
> related to the change in r479?)
> 
> Thanks,
> 
> -----Original Message-----
> From: K. John Wu [mailto:[email protected]] 
> Sent: Saturday, March 10, 2012 2:24 PM
> To: Dominique Prunier
> Cc: FastBit Users
> Subject: Re: [FastBit-users] PATCH: new CS_PATTERN_MATCH define added to 
> match LIKE patterns case-sensitively and perform specific optimizations
> 
> Just checked in the modification to allow users to define
> FASTBIT_CS_PATTERN_MATCH to 0 to disable case sensitive matches.  The
> new SVN revision is 484.
> 
> Also looked through other macros to make sure they are used consistently.
> 
> John
> 
> 
> On 3/10/12 9:20 AM, Dominique Prunier wrote:
>> Hey John,
>>
>> I just noticed a small typo in util.h: the macro is called FASTBOT_... I 
>> don't think it was intended, but it has the nice side effect of disabling the new 
>> code by default, thus preserving the current behavior (case insensitive). Should 
>> we actually keep it in util.h now that it is documented in INSTALL?
>>
>> https://codeforge.lbl.gov/plugins/scmsvn/viewcvs.php/trunk/src/util.h?root=fastbit&r1=483&r2=482&pathrev=483
>>
>> Thanks,
>> ________________________________________
>> From: K. John Wu [[email protected]]
>> Sent: March-09-12 10:47 PM
>> To: Dominique Prunier
>> Cc: FastBit Users
>> Subject: Re: [FastBit-users] PATCH: new CS_PATTERN_MATCH define added to 
>> match LIKE patterns case-sensitively and perform specific optimizations
>>
>> Hi, Dominique,
>>
>> I would like to add the FASTBIT_ prefix to the macro CS_PATTERN_MATCH to
>> avoid possible collisions when FastBit is used with other packages.
>> Hope you don't mind.
>>
>> John
>>
>>
>> On 3/9/12 4:03 PM, K. John Wu wrote:
>>> Hi, Dominique,
>>>
>>> I have run through my usual set of tests and did not find any problem
>>> with your patch.  It is now in SVN 482.  Please give it a try when you
>>> get the chance.
>>>
>>> Thanks.
>>>
>>> John
>>>
>>>
>>>
>>> On 3/9/12 10:17 AM, Dominique Prunier wrote:
>>>> Quick update to my patch:
>>>>
>>>> ·         Changed dictionary::patternMatch to make it work with CI too
>>>> (and i think for efficiency reasons, i have to keep all this here)
>>>>
>>>> ·         Moved the STR_MATCH_* constants from util.cpp to util.h and
>>>> use them in dictionary::patternMatch
>>>>
>>>> ·         Removed the CS/CI ifdef from category.cpp
>>>>
>>>>
>>>>
>>>> I did more testing, and on my set of ~90 000 test queries, the
>>>> execution time dropped from ~515 seconds to ~20 seconds.
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>>
>>>>
>>>> *From:*[email protected]
>>>> [mailto:[email protected]] *On Behalf Of *Dominique
>>>> Prunier
>>>> *Sent:* Thursday, March 08, 2012 2:39 PM
>>>> *To:* FastBit Users
>>>> *Subject:* [FastBit-users] PATCH: new CS_PATTERN_MATCH define added to
>>>> match LIKE patterns case-sensitively and perform specific optimizations
>>>>
>>>>
>>>>
>>>> Here is the first version of my patch to switch SQL LIKE from case
>>>> insensitive to case sensitive and optimize this use case for CATEGORY
>>>> columns.
>>>>
>>>>
>>>>
>>>> In a nutshell, what changed is:
>>>>
>>>> ·         We extract the longest (handling the escape char too)
>>>> constant prefix from the pattern
>>>>
>>>> ·         Instead of testing every value in the dictionary, we binary
>>>> search for the range of values to test (which sometimes even allows us to
>>>> skip pattern matching entirely if no valid range can be found)
>>>>
>>>> ·         We test every value in the range
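The three steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patch itself: `constantPrefix`, `likeMatch` and `patternSearch` are hypothetical names, the dictionary is assumed to be a vector of keys sorted in plain byte order, `%` and `_` are the wildcards, and backslash is assumed to be the escape character.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Step 1: longest literal prefix of the pattern, stopping at the first
// unescaped wildcard. An escaped character is copied without its backslash.
std::string constantPrefix(const std::string& pat) {
    std::string prefix;
    for (std::size_t i = 0; i < pat.size(); ++i) {
        if (pat[i] == '%' || pat[i] == '_') break;
        if (pat[i] == '\\' && i + 1 < pat.size()) ++i;  // skip the escape
        prefix += pat[i];
    }
    return prefix;
}

// Case-sensitive LIKE matcher: '%' matches any run of characters (including
// none), '_' matches exactly one, and '\' escapes the next character.
bool likeMatch(const char* s, const char* p) {
    for (; *p != '\0'; ++s, ++p) {
        if (*p == '%') {
            while (*p == '%') ++p;         // collapse consecutive '%'
            if (*p == '\0') return true;   // trailing '%' matches the rest
            for (; *s != '\0'; ++s) {
                if (likeMatch(s, p)) return true;
            }
            return false;
        }
        if (*s == '\0') return false;      // pattern needs one more character
        if (*p == '_') continue;           // any single character
        if (*p == '\\' && p[1] != '\0') ++p;
        if (*p != *s) return false;
    }
    return *s == '\0';
}

// Steps 2 and 3: binary search for the first key >= prefix, then test only
// the contiguous run of keys that start with that prefix.
std::vector<std::size_t> patternSearch(const std::vector<std::string>& sortedKeys,
                                       const std::string& pat) {
    std::vector<std::size_t> hits;
    if (pat.empty()) return hits;  // an empty pattern matches nothing
    const std::string prefix = constantPrefix(pat);
    auto it = std::lower_bound(sortedKeys.begin(), sortedKeys.end(), prefix);
    for (; it != sortedKeys.end() &&
           it->compare(0, prefix.size(), prefix) == 0; ++it) {
        if (likeMatch(it->c_str(), pat.c_str())) {
            hits.push_back(static_cast<std::size_t>(it - sortedKeys.begin()));
        }
    }
    return hits;
}
```

With keys {"alpha", "alpine", "beta", "betamax", "gamma"}, a pattern such as "bet%" only touches the two keys starting with "bet", while a pattern with a leading wildcard such as "%a" has an empty prefix and falls back to scanning the whole dictionary.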
>>>>
>>>>
>>>>
>>>> On a large dictionary (~130k entries), I’ve commonly seen it be one or
>>>> two orders of magnitude faster (in my example, a simple query with a
>>>> single LIKE predicate drops from ~10ms to ~0.4ms).
>>>>
>>>>
>>>>
>>>> What I’d like to change/refactor (I’m really a newbie in C++):
>>>>
>>>> ·         Remove the prefix extraction and pattern matching code from
>>>> dictionary and replace the added method patternSearch by something
>>>> like findRange. I believe that matching and pattern handling code
>>>> doesn’t belong to the dictionary. I’d rather move this back to the
>>>> category class or something.
>>>>
>>>> ·         Having to use a C++ string object to rebuild the longest
>>>> constant prefix bugs me (suggestions?). I’m also thinking of having a
>>>> version that doesn’t support escaping, but it would force me to change
>>>> strMatch a bit more
>>>>
>>>> ·         To closely match the previous behavior, you can’t match an
>>>> empty pattern (even the empty string doesn’t match); maybe that would
>>>> be worth changing
>>>>
>>>>
>>>>
>>>> As always John, feel free to include this into the main branch. I’m
>>>> waiting for suggestions to make it more efficient, cleaner, ...
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>>
>>>>
>>>> Dominique Prunier
>>>>
>>>>  APG Lead Developper
>>>>
>>>>  4388, rue Saint-Denis
>>>>
>>>>  Bureau 309
>>>>
>>>>  Montreal (Quebec)  H2J 2L1
>>>>
>>>>  Tel. +1 514-842-6767  x310
>>>>
>>>>  Fax +1 514-842-3989
>>>>
>>>>  [email protected] <mailto:[email protected]>
>>>>
>>>>  www.watch4net.com <http://www.watch4net.com/>
>>>>
>>>> _______________________________________________
>>>> FastBit-users mailing list
>>>> [email protected]
>>>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
