Have you checked whether the font you are using to display the character 
string contains the bicycle character? If it does not, you probably won’t get 
the character you seek.

- Jack

> On Apr 6, 2015, at 11:15 AM, Gerriet M. Denkmann <gerr...@mdenkmann.de> wrote:
> 
> 
>> On 7 Apr 2015, at 00:15, Quincey Morris 
>> <quinceymor...@rivergatesoftware.com> wrote:
>> 
>> On Apr 6, 2015, at 09:19 , Gerriet M. Denkmann <gerr...@mdenkmann.de> wrote:
>>> 
>>> Where has my bicycle gone? What am I doing wrong?
>> 
>> Before this thread heads further into outer space…
>> 
>> I suspect it [NSCharacterSet] is just broken. Look here, for example:
>> 
>> http://stackoverflow.com/questions/23000812/creating-nscharacterset-with-unicode-smp-entries-testing-membership-is-this
>> 
>> The problem is that it’s unclear whether the “characters” in NSCharacterSet 
>> are internally UTF-16 code units, UTF-32 code units, Unicode code points, or 
>> something else. According to the NSCharacterSet documentation:
>> 
>>> “An NSCharacterSet object represents a set of Unicode-compliant characters.”
>> 
>> and:
>> 
>>> “The NSCharacterSet class declares the programmatic interface for an object 
>>> that manages a set of Unicode characters (see the NSString class cluster 
>>> specification for information on Unicode).”
>> 
>> According to the NSString documentation:
>> 
>>> “A string object presents itself as an array of Unicode characters (Unicode 
>>> is a registered trademark of Unicode, Inc.). You can determine how many 
>>> characters a string object contains with the length method and can retrieve 
>>> a specific character with the characterAtIndex: method.”
>> 
>> Working backwards, we know that the characters that are counted by 
>> ‘-[NSString length]’ are UTF-16 code units, so this all *possibly* implies 
>> that NSCharacterSet characters are UTF-16 code units, too. Plus, back in 
>> the NSCharacterSet documentation:
>> 
>>> “NSCharacterSet’s principal primitive method, characterIsMember:, provides 
>>> the basis for all other instance methods in its interface.”
>> 
>> If that’s true, ‘longCharacterIsMember:’ is pretty much screwed.
>> 
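For anyone following the code-unit argument above, here is a minimal sketch in plain C (it also compiles as Objective-C) of how a code point above U+FFFF, such as the bicycle U+1F6B2, turns into two UTF-16 code units. The function names are made up for illustration and nothing here is Foundation API; it only shows why a 16-bit "character" cannot hold the bicycle on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Split a supplementary-plane code point (U+10000..U+10FFFF) into the two
   UTF-16 code units (a surrogate pair) that encode it.
   Hypothetical helper, not part of any Apple framework. */
void utf16_encode_supplementary(uint32_t cp, uint16_t out[2])
{
    assert(cp >= 0x10000 && cp <= 0x10FFFF);
    uint32_t v = cp - 0x10000;                  /* 20-bit value */
    out[0] = (uint16_t)(0xD800 + (v >> 10));    /* high (lead) surrogate */
    out[1] = (uint16_t)(0xDC00 + (v & 0x3FF));  /* low (trail) surrogate */
}

/* Number of UTF-16 code units needed for one code point. */
int utf16_length(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    return cp < 0x10000 ? 1 : 2;
}
```

Calling utf16_encode_supplementary(0x1F6B2, pair) fills pair with 0xD83D and 0xDEB2, which is why a string holding only the bicycle counts as two characters in UTF-16 code units.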
>> Perhaps the NSCharacterSet documentation is just wrong. Or perhaps, when the 
>> API was enhanced in 10.2 (see 
>> http://www.cocoabuilder.com/archive/cocoa/73297-working-with-32-bit-unicode-nsstring-stringwithutf32string-const-utf32char-bytes-needed.html 
>> for some tantalizing hints about NSCharacterSet), the implementation was a 
>> hack that works somehow but isn’t documented. I don’t think you’re going to 
>> get any definitive answer except directly from Apple.
>> 
>> A suggestion, though:
>> 
>> Try building your character set using ‘characterSetWithRange:’ and/or the 
>> NSMutableCharacterSet methods that add ranges, instead of using NSStrings. 
>> Maybe NSCharacterSet really is UTF-32-based, but not — for code 
>> compatibility reasons — when using NSStrings explicitly.
> 
> 1. longCharacterIsMember seems to be ok:
>       NSCharacterSet *alphanumericCharacterSet = [NSCharacterSet alphanumericCharacterSet];
>       BOOL pp = [alphanumericCharacterSet longCharacterIsMember:0x2f800];
> returns YES as it should.
> 
> 2. characterSetWithCharactersInString seems to take only the lower 16 bits of 
> the code points in the string. Bug.
> Works ok though, if all chars in the string have code points ≥ 0x10000 (e.g. 
> "𝄞🚲")
> 
> 3. the documentation about bitmapRepresentation is wrong. It says: "A raw 
> bitmap representation of a character set is a byte array of 2^16 bits (that 
> is, 8192 bytes)."
> But alphanumericCharacterSet has a bitmap with 32771 = 0x8003 bytes, which 
> mostly looks ok.
> It has some strange bits set near the end, though: 
> 0x2fa1e → 0x2fa2d 
> 0x30011 → 0x30207 
> which I do not recognise as alphanumeric.
> 
> 4. characterSetWithRange works a bit better:
>       NSCharacterSet *a = [NSCharacterSet characterSetWithRange:NSMakeRange(0x1F6B2, 1)];
>       BOOL pp = [a longCharacterIsMember:0x1F6B2];
> → returns YES as it should.
> 
> But when I look at the bitmapRepresentation I see 16385 bytes with two bits 
> set: 0x10000 and 0x1f6ba (8 bits off)
> 
> Looks like the format of the bitmapRepresentation is slightly more complex 
> than documented.
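The sizes reported in points 3 and 4 fit one possible layout: 8192 bytes for the BMP, then, for each further plane that has members, one plane-index byte followed by another 8192 bytes. That gives 8192 + 1 + 8192 = 16385 and 8192 + 3 × 8193 = 32771, and the index byte would account for both the bit at 0x10000 and the 8-bit shift. This layout is an assumption inferred from the numbers in this thread, not something the quoted documentation states. A sketch in plain C of a membership test under that assumed layout (the function name is hypothetical):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PLANE_BYTES = 8192 };  /* 2^16 bits per plane */

/* Returns 1 if code point cp is set in a bitmap of `len` bytes, else 0.
   ASSUMED layout: BMP bitmap (8192 bytes), then zero or more records of
   [1 plane-index byte][8192-byte bitmap for that plane].
   Bits are numbered from the least significant bit of each byte. */
int bitmap_has_code_point(const uint8_t *bits, size_t len, uint32_t cp)
{
    assert(bits != NULL && cp <= 0x10FFFF);
    uint32_t plane = cp >> 16;     /* 0 for the BMP, 1..16 otherwise */
    uint32_t low = cp & 0xFFFF;    /* position inside the plane */
    const uint8_t *plane_bits = NULL;

    if (plane == 0) {
        if (len >= PLANE_BYTES) plane_bits = bits;
    } else {
        /* Walk the records after the BMP looking for a matching index byte. */
        size_t off = PLANE_BYTES;
        while (off + 1 + PLANE_BYTES <= len) {
            if (bits[off] == plane) { plane_bits = bits + off + 1; break; }
            off += 1 + PLANE_BYTES;
        }
    }
    if (plane_bits == NULL) return 0;
    return (plane_bits[low >> 3] >> (low & 7)) & 1;
}
```

Fed a 16385-byte buffer with raw bits 0x10000 and 0x1f6ba set, as described in point 4, this reports U+1F6B2 as a member and U+F6B2 as not a member.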
> 
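Point 2 can be made concrete with a little arithmetic. If only the low 16 bits of a code point were kept, as guessed there, U+1F6B2 would land on U+F6B2 in the Private Use Area, which would explain a missing bicycle. A plain C sketch of that guess; the helper name is hypothetical, and this models the suspected behaviour only, not any confirmed Foundation implementation:

```c
#include <assert.h>
#include <stdint.h>

/* Models the SUSPECTED bug from point 2: keep only the low 16 bits
   of a code point and drop the plane number. */
uint16_t truncate_to_16_bits(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    return (uint16_t)(cp & 0xFFFF);
}
```

For the bicycle, truncate_to_16_bits(0x1F6B2) gives 0xF6B2, while code points below 0x10000 pass through unchanged.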
> 
> Kind regards,
> 
> Gerriet.
> 
> 
> _______________________________________________
> 
> Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
> 
> Please do not post admin requests or moderator comments to the list.
> Contact the moderators at cocoa-dev-admins(at)lists.apple.com
> 
> Help/Unsubscribe/Update your Subscription:
> https://lists.apple.com/mailman/options/cocoa-dev/jackbrindle%40me.com
> 
> This email sent to jackbrin...@me.com


