Re: character escapes in source? ... was: Re: Eclipse: Invalid character constant

Earwin Burrfoot Thu, 07 Apr 2011 23:50:10 -0700

On Fri, Apr 8, 2011 at 03:01, Robert Muir <[email protected]> wrote:
> On Thu, Apr 7, 2011 at 6:48 PM, Chris Hostetter
> <[email protected]> wrote:
>>
>> : -1. These files should be readable, for maintaining, debugging and
>> : knowing what's going on.
>>
>> Readability is my main concern ... i don't know (and frequently can't
>> tell) the difference between a lot of non ascii characters -- and i'm
>> guessing i'm not alone.  when it's spelled out explicitly using the
>> character name or escape code, there is no ambiguity about what character
>> was intended, or whether it got screwed up by some tool along the way (ie:
>> the svn server, an svn client, the patch command, a text editor, an IDE,
>> ant's "fixcrlf" task, etc...)
>
> Please take the time, just 5 or 10 minutes, to look through some of this
> source code and tests.
>
> Imagine if you couldn't just look at the code to see what it does, but
> had to decode from some crazy numeric encoding scheme.
> Imagine if it were this way for things like stopword lists too.
>
> It would be basically impossible for you to look at the code and
> figure out what it does!
> For example, try looking at thai analyzer tests, if these were all
> numbers, how would you know wtf is going on?
>
> Although this comes up from time to time, I stand firm on my -1
> because it's important to me for the source code to be readable.
> I'm not willing to give this up just because some people cannot read
> writing system XYZ.
>
> I have said before, i'm willing to change my -1 vote on this, if *ALL*
> string constants (including english ones) are changed to be character
> escapes.
> If you imagine what the code would look like if english string
> constants were instead codes, then I think you will understand my
> point of view!
>
> It's really, really important to source code readability to be able to
> open a file and understand what it does, not to have to use some
> decoder because it uses characters other people don't understand.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>


I think having both the raw characters /and/ the encoded representation
(one of the two in a comment) is best.
I'm all for Unicode sources, but at least two things hit me repeatedly:
1. Tools do screw up, and you have to recover somehow.
E.g. IntelliJ IDEA's 'shelve' function uses the platform default encoding
(MacRoman in my case), and I've lost some text in things I shelved but
never committed anywhere.
2. There are characters that look exactly alike.
E.g. different whitespace/dashes. Or (if you have Cyrillic in your
fonts) I dare you to tell apart a/а, c/с, e/е, o/о.
These are different characters from the Latin and Cyrillic alphabets
(Latin on the left, Cyrillic on the right), but in 99% of fonts they are
visually identical.
I had a filter that folded such similar-looking characters together, and
it was documented in exactly this way: raw char + code.
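
To make the "raw char + code" idea concrete, here is a minimal sketch in
Java. It is not the filter mentioned above; the class name LookalikeDemo
and its methods are hypothetical, and the folding table covers only the
four pairs listed in point 2. The code itself uses escapes only, with the
raw character kept in a comment, so it still compiles if a tool damages
the comments. ISO-8859-1 stands in for MacRoman in the garble() helper,
on the assumption that MacRoman may not be available in every JDK.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration: escapes in code, raw characters in comments.
final class LookalikeDemo {
    static final char LATIN_A = '\u0061';    // a  LATIN SMALL LETTER A
    static final char CYRILLIC_A = '\u0430'; // а  CYRILLIC SMALL LETTER A

    /** Folds four Cyrillic lookalikes to their Latin counterparts. */
    static char fold(char c) {
        switch (c) {
            case '\u0430': return 'a'; // а -> a
            case '\u0441': return 'c'; // с -> c
            case '\u0435': return 'e'; // е -> e
            case '\u043E': return 'o'; // о -> o
            default: return c;         // everything else is left alone
        }
    }

    /** Spells out a character as code point plus Unicode name. */
    static String describe(char c) {
        return String.format("U+%04X %s", (int) c, Character.getName(c));
    }

    /** Simulates a tool that reads UTF-8 bytes with the wrong charset. */
    static String garble(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The two characters render alike but compare as different.
        System.out.println(LATIN_A == CYRILLIC_A);
        System.out.println(describe(LATIN_A));
        System.out.println(describe(CYRILLIC_A));
        // After folding, the Cyrillic letter equals the Latin one.
        System.out.println(fold(CYRILLIC_A) == LATIN_A);
        // One Cyrillic letter becomes two unrelated characters.
        System.out.println(garble(String.valueOf(CYRILLIC_A)).length());
    }
}
```

With the escape in the code and the raw character in the comment, a
reader sees what is meant, and a damaged file can still be repaired
from the escape.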

-- 
Kirill Zakharenko/Кирилл Захаренко
E-Mail/Jabber: [email protected]
Phone: +7 (495) 683-567-4
ICQ: 104465785

