Re: [pkg-discuss] Code Review Request for 2672 Case Insensitive search

Brock Pytlik Wed, 27 Aug 2008 20:17:06 -0700

Roland Mainz wrote:
> Brock Pytlik wrote:
>   
>> Roland Mainz wrote:
>>     
>>> Brock Pytlik wrote:
>>>       
>>>> http://cr.opensolaris.org/~bpytlik/ips-2672-v1/
>>>> has the patch.
>>>>
>>>> It makes case-insensitive search both possible and the default behavior.
>>>> For local search, there's an option (-I, though I'm definitely open to
>>>> alternatives; -c?) that allows for case-sensitive search. For protocol
>>>> reasons, remote search is only case insensitive. There is a small (IMO)
>>>> performance hit for both local and remote search. Case sensitive search
>>>> still takes the same amount of time: .5-.6 seconds but case-insensitive
>>>> search takes .7-.8 seconds.
>>>>
>>>> Remote search takes the biggest hit, going from .1-.2 seconds to .6-.7
>>>> seconds.
>>>>         
>>> Just curious: What does "remote search" mean in this case ?
>>> Case-insensitive matching depends on the locale and AFAIK would require
>>> passing the locale token (e.g. LANG, LC_*, LC_ALL) to the remote
>>> site...
>>>       
>> As Shawn said, remote search means searching the repositories of the
>> authorities you have set. Since right now, we do no language specific
>> adaptation, this would be an RFE.
>>     
>
> Erm... this isn't an RFE - this is AFAIK a _bug_ (that's why I've CC:ed
> [EMAIL PROTECTED] to get a verification and help) and right
> now the code only works by accident. If client and server run in
> different locales you may end up in a situation where the
> case-insensitive matching doesn't work as expected.
>   
I think of it as an RFE because it wasn't in my requirements and I didn't
change the behavior of search compared to the previous version. If you want
to call it a bug when you file it, that's fine by me. So when the server
has data from one locale, is running in a different locale (I presume the
metadata will be in more than one locale, or this wouldn't be an issue),
and the client sends a request from a third locale, what's the correct
behavior?
>   
>> We use Python's .upper and .lower
>> functions, which eventually call down to the C library functions tolower
>> and toupper.
>>     
>
> Erm... does it call |toupper()| or |towupper()| ? The first form is
> AFAIK for single-byte locales only while the 2nd form works on |wchar_t|
> (which should IMO be preferred).
>
>   
Offhand, I'd guess the first ones, since that's what I said previously, but
I don't have the code in front of me at the moment.
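
To make the byte-versus-wide-character distinction concrete, here is a minimal
sketch. It assumes Python 3, not the Python the patch was written against, so
it illustrates the general issue and not what the depot code does today. In
Python 3, bytes.upper() only maps ASCII letters and passes every other byte
through unchanged, while str.upper() works on Unicode code points. The sample
string is made up for illustration.

```python
# Assumption: Python 3 semantics, used only to illustrate the difference
# between byte-oriented and character-oriented case mapping.
text = "Ünïcode"                  # hypothetical search token
as_bytes = text.encode("utf-8")

# Byte-oriented mapping: only ASCII letters change; the multi-byte
# sequences for "Ü" and "ï" are passed over untouched.
upper_bytes = as_bytes.upper()

# Character-oriented mapping: every cased code point is converted.
upper_text = text.upper()

print(upper_bytes.decode("utf-8"))
print(upper_text)
```

The two results differ in the non-ASCII letter that was lowercase to begin
with, which is the kind of mismatch a byte-oriented mapping can produce in a
multi-byte encoding.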
>> Those use the locale for the program to determine the
>> mapping between cases. If they don't recognize the character, they
>> simply pass over it. Luckily, there is a solution: we can use Python's
>> locale module to adapt our locale, but it's not thread safe, so making
>> it work correctly in the current depot would not be trivial. Of course,
>> we could roll our own lower and upper methods that didn't rely on a
>> global locale and were thread safe, but that's also outside the scope of
>> this bug. Please file an RFE for this.
>>     
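
For whoever picks up that RFE, a rough sketch of what a case mapping that
doesn't touch the global locale might look like. It assumes Python 3, where
str.casefold() and unicodedata.normalize() are pure functions of their
arguments, so nothing process-wide is read or modified. The names search_key
and matches are hypothetical and are not part of the patch or of pkg.

```python
import unicodedata


def search_key(token):
    """Hypothetical helper: a comparison key that ignores case.

    Normalizes to NFKC first so equivalent Unicode spellings compare
    equal, then applies Unicode case folding.
    """
    return unicodedata.normalize("NFKC", token).casefold()


def matches(query, token, case_sensitive=False):
    """Hypothetical helper: compare a query against an indexed token."""
    if case_sensitive:
        return query == token
    return search_key(query) == search_key(token)


print(matches("SUNWcs", "sunwcs"))
print(matches("SUNWcs", "sunwcs", case_sensitive=True))
print(matches("Straße", "STRASSE"))
```

Note that this applies the default Unicode folding only. Locale-specific
rules, such as the Turkish dotted and dotless i, are not handled, so it
doesn't answer the question of which locale's rules should win when client
and server disagree.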
>
> As said above this isn't an RFE since you can't assume that an uppercase
> character in one locale has the same lowercase counterpart in a
> different locale.
>
> ----
>
> Bye,
> Roland
>
>


