Re: Swift: How to determine if a Character represents whitespace?

Quincey Morris Wed, 01 Apr 2015 22:28:48 -0700

On Apr 1, 2015, at 21:17 , Charles Jenkins <cejw...@gmail.com> wrote:
> 
> for ch in String(char).utf16 {  
> if !set.characterIsMember(ch) { found = false }  
> }


Except that this code can’t possibly be right, in general. 

1. A ‘unichar’ is a UTF-16 code value, but it’s not a Unicode code point. Some 
UTF-16 code values have no meaning as “characters” by themselves. I think you 
could mitigate this problem by using ‘longCharacterIsMember’, which takes a 
UTF-32 code value instead (and enumerating the string as UTF-32 instead of 
UTF-16).

2. A Swift ‘Character’ isn’t a Unicode code point, but rather a grapheme. That 
is, it might be a sequence of code points (and I mean code points, not code 
values). It might be such a sequence either because there’s no way of 
representing the grapheme by a single code point, or because it’s a composed 
character made up of a base code points and some combining characters.

In this case, you can’t validly test the individual code points for membership 
of the character set.

I’m not sure, but I suspect the underlying obstacle is that NSCharacterSet is 
at best a set of code points, and you cannot test a grapheme for membership of 
a set of code points.

In your particular application, if it’s true that all** Unicode whitespace 
characters are represented as a single code point (via a single UTF-32 code 
value), or a single UTF-16 code value, then you can get away with one of the 
above solutions. Otherwise you’re going to need a more complex solution, that 
doesn’t involve NSCharacterSet at all.



** Or at least the ones you happen to care about, but ignoring the others may 
be a perilous proceeding.



_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)

Please do not post admin requests or moderator comments to the list.
Contact the moderators at cocoa-dev-admins(at)lists.apple.com

Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/cocoa-dev/archive%40mail-archive.com

This email sent to arch...@mail-archive.com

Re: Swift: How to determine if a Character represents whitespace?

Reply via email to