Hi, Dominique,

Thanks for the suggestions and and test cases.  Just checked in a set
of changes as SVN Revision 506.  Here is a bit more explanation.

On item 1, I have taken the option you've suggested in the first
message, i.e., use the macro FASTBIT_EMPTY_STRING_AS_NULL

On item 2, I have restored the escaping of using backslash as you
requested.

Will go through you tests cases next and see what else I need to change.

John


On 3/28/12 8:02 AM, Dominique Prunier wrote:
> Hi John,
> 
> Here is two simple test cases for my two issues with r505.
> 
> 1. empty-strings.zip: there is no way to select empty string anymore
> 
> **** FAILED (bad result count) with where clause << a='' >>
> 
> **** FAILED (bad result count) with where clause << a=0 >>
> 
> **** FAILED (hard) with where clause << a LIKE '' >>
> 
> 2. de-escaping.zip: there is no way to select a string with a reserved char 
> in it
> 
> **** FAILED (bad result count) with where clause << a='it\'s good' >>
> 
> Warning -- ibis::whereParser encountered syntax error, unexpected name string 
> at location a='it's good':1.7
> Warning -- query[96w0-3xn8Tg----1]::setWhereClause -- failed to parse the 
> WHERE clause "a='it's good'"
> Warning -- fastbit_build_query failed to assign conditions (a='it's good') to 
> a query
> **** FAILED (hard) with where clause << a='it's good' >>
> 
> Warning -- ibis::whereParser encountered syntax error, unexpected $undefined 
> at location a=it's good:1.5
> Warning -- query[96w0-3xn8Tg----2]::setWhereClause -- failed to parse the 
> WHERE clause "a=it's good"
> Warning -- fastbit_build_query failed to assign conditions (a=it's good) to a 
> query
> **** FAILED (hard) with where clause << a=it's good >>
> 
> With the attached patch, it fixes my test cases (except a LIKE '' that didn't 
> work before and a='it's good' and a=it's good whieh are invalid where 
> clauses).
> 
> Thanks,
> 
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Dominique Prunier
> Sent: Wednesday, March 28, 2012 10:04 AM
> To: K. John Wu
> Cc: FastBit Users
> Subject: Re: [FastBit-users] PATCH: perf boost on top of r501
> 
> Hey John,
> 
> I'm trying r505 right know, i have two questions/remarks:
> 
>  1. I saw a change about the dictionary now able to accept empty strings and 
> treat them as normal strings instead of NULLs. This breaks quite a lot of my 
> test cases, specifically those which used to test category=0 or !=0 to 
> simulate the IS NULL/IS NOT NULL because testing for an empty string doesn't 
> work (='' or LIKE '' used to fail or return invalid results). What would be 
> your recommendation for that use case ? Could we add a define for those 
> relying on this behavior ? Something like:
> 
> -    //if (*str == 0) return 0;
> +#ifdef FASTBIT_EMPTY_STRING_AS_NULL
> +    if (*str == 0) return 0;
> +#endif
> 
>  2. I can't find anywhere the de-escaping patch in r505, am i missing 
> something ?
> 
> Thanks,
> 
> -----Original Message-----
> From: K. John Wu [mailto:[email protected]] 
> Sent: Wednesday, March 28, 2012 1:55 AM
> To: Dominique Prunier
> Cc: FastBit Users
> Subject: Re: [FastBit-users] PATCH: perf boost on top of r501
> 
> Hi, Dominique,
> 
> There are some updates to involving merge of dictionaries and to
> exercise the operations involving unmatched quotes.  The new code is
> SVN R505.
> 
> Please let me know if you have any additional questions.
> 
> John
> 
> 
> On 3/27/12 1:03 PM, Dominique Prunier wrote:
>> Hey John,
>>
>> There is definitely a need for FastBit escaping. The escaping problem is not 
>> at the shell level (though we could have one there) since in pure C/C++ 
>> code, there's no shell involved when i'm building a where clause from a 
>> string. The problem is at the where clause parsing level (in the lexer to be 
>> more precise) to be able to express string literals among other things (and 
>> not only metas, it is also white spaces, ...).
>> Typically, my test that fails is as simple as calling 
>> fastbit_build_query(..., ..., "a='it\\'s good'"). This is expected to create 
>> a qString << a = it's good >> but now, it creates a qString << a = it\'s 
>> good >> which is wrong. The attached patch restores the descaping, but _not_ 
>> the double quote stripping (because it is already handled in the lexer). All 
>> my test cases works after applying it on r503.
> 
> The constructors for ibis::qString and ibis::qLike really should not
> strip away anything.  In your case, you should be able to do the following
> 
> .../tcapi data-dir "a=\"it's good\""
> 
> if you are using fastbit_build_query, you can use the same string
> fastbit_build_query(..., ..., "a=\"it's good\"");
> 
> Since FastBit regular expression only support four meta characters ? *
> _ %.  There is no need to escape anything.  It is probably cleaner to
> not introduce stripping of anything special (except the outer most
> quotes, which should be only done once).
> 
>>
>> About the decompression, thanks for the link, this is very interesting stuff 
>> ! But my point here is not about questing the fact the decompression can be 
>> better is some case, i was just under the impression that the hit vector 
>> given to category::patternSearch was _always_ already decompressed since it 
>> is ultimately a bitvector that has been created from scratch for query 
>> evaluation (it would need verification though). My guess is that the few 
>> percent of performance i'm loosing here are attributable to the check 
>> (hits.isCompressed() && hits.bytes()*mult + bv->bytes() > hits.size()), 
>> since it gets executed _A LOT_ of times. I'll try to investigate it a little 
>> bit further.
> 
> I have rearranged the tests for decompression in layers which
> hopefully will eliminate the need to perform more expensive tests in
> your case that presumably involve a fairly small number of values.
> _______________________________________________
> FastBit-users mailing list
> [email protected]
> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
_______________________________________________
FastBit-users mailing list
[email protected]
https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users

Reply via email to