On 3/27/12 9:41 AM, Dominique Prunier wrote: > Hey John, > > I'm testing r503 right away (my usual performance and regression testing). > > Multiple remarks: > * in qExpr.cpp, i think you removed too much code. The superfluous part was > the stripping of the quotes (which is already handled in the lexer), but you > still have to keep the de-escaping logic here.
Presumably, the characters are special to a shell will be stripped away by the shell already. What passed through argv to a program should be exactly what user intended. None of the meta characters "?*_%" are special or require escaping, therefore, there is no need for FastBit to have an escape character. Let me know if you have a use case there requires an escape. > * about the decompression, shouldn't the hit vector already be uncompressed > (it is a vector created from scratch) ? it seems that r503 is a few percent > slower than my original patch There is a need in some cases to decompress the bitvector to avoid memory allocations. Exactly when to do this is obviously a subject of further discussion. There was an extensive study on this some time ago <http://lbl.gov/%7Ekwu/ps/LBNL-54673.html>. Maybe this case deserves a detailed review.. > > Except that, it seems all good. Hopefully, i'll give a closer look this > afternoon and get my stable version. > > Thanks, > > -----Original Message----- > From: [email protected] > [mailto:[email protected]] On Behalf Of K. John Wu > Sent: Tuesday, March 27, 2012 1:38 AM > To: FastBit Users > Subject: Re: [FastBit-users] PATCH: perf boost on top of r501 > > Hi, Dominique, > > All you changes should be in SVN Revision 503. Please take a look. > > John > > > On 3/26/12 6:07 PM, Dominique Prunier wrote: >> Hey John, >> >> >> About the compression/decompression, i'm not sure what they were useful for, >> i simply removed them and it simply gave me a nice performance boost :) >> Do you think these 2 performance patches would be worth including into the >> subversion (along with the few other bug fixes i sent today) ? >> >> Thanks, >> >> ________________________________________ >> From: K. John Wu [[email protected]] >> Sent: March-26-12 7:18 PM >> To: FastBit Users >> Cc: Dominique Prunier >> Subject: Re: [FastBit-users] PATCH: perf boost on top of r501 >> >> Hi, Dominique, >> >> Thanks for the performance patches. >> >> The real problem with the decompress call in patternSearch is that as >> is the call to decompress seems to be repeated with without any limit. >> It should only be called once. I will examine all other calls to >> decompress to make sure they are only called once.. >> >> John >> >> >> On 3/26/12 9:43 AM, Dominique Prunier wrote: >>> To go even further, it is also possible to add the attached patch __on >>> top__ of the previous one: >>> >>> >>> >>> · don't use bitvector::sloppyCount at all in >>> category::patternSearch >>> >>> · need to AND with the mast when there's no match in >>> query::doEvaluate >>> >>> >>> >>> Thanks, >>> >>> >>> >>> *From:*Dominique Prunier >>> *Sent:* Monday, March 26, 2012 11:46 AM >>> *To:* FastBit Users >>> *Subject:* PATCH: perf boost on top of r501 >>> >>> >>> >>> Hey John, >>> >>> >>> >>> I tried the performance of the sloppyCount and it seems quite >>> convincing. It is only 1% slower than returning constants (1 or 0) and >>> could potentially speedup trivial cases from time to time i guess. >>> >>> >>> >>> So i ported my performance patch on top of r501 and replaced my >>> “return 1 or 0” by some calls to sloppyCount. Here is what the patch >>> includes: >>> >>> · in category::patternSearch, skip the >>> decompression/compression phase (this one has a __HUGE__ impact on perf) >>> >>> · in category::patternSearch, use sloppyCount for estimation >>> >>> · in query::doEvaluate, fix returned value when term == null >>> (was 0, now is row count) >>> >>> · in query::doEvaluate, even more usage of >>> bitvector::sloppyCount instead of bitvector::cnt >>> >>> · in query::doEvaluate, removed complex OR evaluation >>> depending on bitmap sizes (maybe this can be restored in a less costly >>> way) >>> >>> >>> >>> So far, it passes all my tests (this is not a very intrusive change). >>> >>> Please tell me if this would be candidate for the next stable release. >>> >>> >>> >>> Thanks, >>> >>> >>> >>> */Dominique Prunier/**//* >>> >>> APG Lead Developper >>> >>> Logo-W4N-100dpi >>> >>> 4388, rue Saint-Denis >>> >>> Bureau 309 >>> >>> Montreal (Quebec) H2J 2L1 >>> >>> Tel. +1 514-842-6767 x310 >>> >>> Fax +1 514-842-3989 >>> >>> [email protected] <mailto:[email protected]> >>> >>> www.watch4net.com <http://www.watch4net.com/> >>> >>> / / >>> >>> /This message is for the designated recipient only and may contain >>> privileged, proprietary, or otherwise private information. If you have >>> received it in error, please notify the sender immediately and delete >>> the original. Any other use of this electronic mail by you is prohibited. >>> >>> //Ce message est pour le récipiendaire désigné seulement et peut >>> contenir des informations privilégiées, propriétaires ou autrement >>> privées. Si vous l'avez reçu par erreur, S.V.P. avisez l'expéditeur >>> immédiatement et effacez l'original. Toute autre utilisation de ce >>> courrier électronique par vous est prohibée./// >>> >>> >>> >>> >>> >>> _______________________________________________ >>> FastBit-users mailing list >>> [email protected] >>> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users >> _______________________________________________ >> FastBit-users mailing list >> [email protected] >> https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users > _______________________________________________ > FastBit-users mailing list > [email protected] > https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users > _______________________________________________ > FastBit-users mailing list > [email protected] > https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users _______________________________________________ FastBit-users mailing list [email protected] https://hpcrdm.lbl.gov/cgi-bin/mailman/listinfo/fastbit-users
