On 23 Jan 2013, at 01:32, Keary Suska <cocoa-...@esoteritech.com> wrote:

> On Jan 22, 2013, at 6:18 PM, Jens Alfke wrote:
> 
>> On Jan 22, 2013, at 3:28 PM, jonat...@mugginsoft.com wrote:
>> 
>>> Is  + (id)letterCharacterSet the best choice here?
>> 
>> The API docs say "Informally, this set is the set of all characters used as 
>> letters of alphabets and ideographs."
>> Which very strongly implies it is not just ASCII, but covers all Unicode 
>> alphabets.
>> 
>> Some languages, like Java and Go, can handle non-ASCII letters in 
>> identifiers, but most can’t. I would stick with a character set consisting 
>> of only upper and lowercase ASCII letters, digits and the underscore. And 
>> you’d probably want to force the first character to be a lowercase letter 
>> since some languages assign special meaning to identifiers that start with a 
>> capital letter or with an underscore.
> 
> 
> I was thinking that by "language" the OP meant linguistic rather than 
> programming.
Yes, I mean natural (linguistic) language rather than a programming language.

> For the latter it is a bit easier to find the lowest common denominator, 
> which is probably strictly ASCII alpha and numbers, beginning with lowercase 
> alpha and probably even a character limit of around 12. That would 
> automatically exclude the identifier pattern example provided as most 
> languages do not permit dashes in identifier names (that I know of).

After a bit more experimenting it is clear that restricting the variable's 
character set is sensible.

I think I will have to introduce an intermediate NSTextView that filters and 
displays the natural language input.
The user can then modify the variable name if necessary before it gets utilised 
further.

So the approach may be:

1. decompose the input with -decomposedStringWithCanonicalMapping
2. build an NSMutableCharacterSet that starts from +alphanumericCharacterSet
   and filter the decomposed input against it.

The docs for +alphanumericCharacterSet say:

"A character set containing the characters in the categories Letters, Marks, and 
Numbers."

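Since Marks are included, I think this means that I will also have to exclude 
+nonBaseCharacterSet, so that the combining marks left behind by the 
decomposition get filtered out.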
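
A minimal sketch of the approach above, assuming ARC. The function name 
FilteredVariableName is hypothetical, and keeping whitespace in the allowed set 
is my assumption (so that the template can deal with it later):

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: reduces natural language input to base letters,
// digits and whitespace. Assumes ARC.
static NSString *FilteredVariableName(NSString *input)
{
    // Step 1: canonical decomposition (NFD), so an accented letter becomes
    // a base letter followed by one or more combining marks.
    NSString *decomposed = [input decomposedStringWithCanonicalMapping];

    // Step 2: start from Letters + Marks + Numbers ...
    NSMutableCharacterSet *allowed =
        [[NSCharacterSet alphanumericCharacterSet] mutableCopy];

    // ... and drop the Marks by intersecting with the inverse of
    // +nonBaseCharacterSet.
    [allowed formIntersectionWithCharacterSet:
        [[NSCharacterSet nonBaseCharacterSet] invertedSet]];

    // Assumption: whitespace is kept here and handled downstream.
    [allowed formUnionWithCharacterSet:[NSCharacterSet whitespaceCharacterSet]];

    // Splitting on every disallowed character and rejoining removes them.
    NSArray *parts =
        [decomposed componentsSeparatedByCharactersInSet:[allowed invertedSet]];
    return [parts componentsJoinedByString:@""];
}
```

With this, FilteredVariableName(@"Crème brûlée no. 1!") should come back as 
@"Creme brulee no 1". Note that letters from non-Latin alphabets still pass 
through, so a stricter ASCII-only set may be needed depending on the target.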

The actual content of an NSCharacterSet is only accessible via 
-bitmapRepresentation which makes previewing the content slightly involved.
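
For a quick preview, one possible alternative to decoding the bitmap is to 
probe the set one character at a time with -characterIsMember:. This is only a 
sketch: the function name PreviewOfCharacterSet is hypothetical, and it only 
covers the Basic Multilingual Plane (the other planes would need 
-longCharacterIsMember:).

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns up to `limit` members of `set`, in code point
// order, taken from the Basic Multilingual Plane only.
static NSString *PreviewOfCharacterSet(NSCharacterSet *set, NSUInteger limit)
{
    NSMutableString *preview = [NSMutableString string];
    NSUInteger count = 0;

    for (uint32_t c = 0; c <= 0xFFFF && count < limit; c++) {
        // Skip the surrogate range; lone surrogates are not valid characters.
        if (c >= 0xD800 && c <= 0xDFFF) {
            continue;
        }
        if ([set characterIsMember:(unichar)c]) {
            [preview appendFormat:@"%C", (unichar)c];
            count++;
        }
    }
    return preview;
}
```

For example, PreviewOfCharacterSet([NSCharacterSet decimalDigitCharacterSet], 10) 
gives the ten ASCII digits, which can then be logged or shown in a text view.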

>  Additionally, for higher-level interpreted languages the app would need to 
> understand identifier prefixes such as $, @ and % (and maybe &).

The output from this is processed through a Mustache template that takes care 
of all the variable name decoration ($, @, - and the like).
The template will also enforce case (upper, lower, Pascal etc.) and whitespace 
handling (remove, replace with - or _, etc.).

Thanks for the suggestions.

Jonathan
_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)