I think it's only an issue for code ported from C/C++. Existing C# code nearly
always does it properly, as people have learned the issues, e.g. the two
Turkish 'I's (dotted and dotless) that both need to match, the double-accent
chars, etc. In 2000-2003, when C# was new and there were a lot of C++ devs
who were newer to C#, such code was more common. I would also argue with the
claim that such code is simple (these loops can become complex messes), and
with the comment about "a small number of users who actually want to do
internationalization properly": it's any lib or app that wants to run in any
number of countries.
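
To make the accent and Turkish 'I' issues concrete, here is a minimal sketch.
It is in Python only because that is easy to run; it is not C# or BitC, and
the helper name same_text is made up for illustration.

```python
import unicodedata

# "e with acute" written two ways: one precomposed code point,
# or a plain "e" followed by a combining acute accent.
composed = "\u00e9"
decomposed = "e\u0301"

def same_text(a: str, b: str) -> bool:
    """Compare after NFC normalization so both spellings match."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# Raw comparison and per-index loops see two different strings.
raw_equal = composed == decomposed
lengths = (len(composed), len(decomposed))

# Turkish dotted capital I (U+0130): the locale-independent lowercase is
# "i" plus a combining dot above, so it does not compare equal to plain "i".
lowered = "\u0130".lower()
```

An index-based loop over `decomposed` visits the accent as if it were a
separate character, which is exactly the kind of bug that ported C-style code
tends to carry.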

Personally, if it weren't for backward compatibility I'd prefer string to do
it properly and not even have an indexer (except via find(int)). If you want
to do it the C way (e.g. for a C benchmark) or use legacy code, use
string.GetASCIIArray or getUCS2/4Array and do what you like. Such an array
could be a value-type array on the stack with no card-table cost. It just
should not impact the quality of normal code or allow people to do bad
things. It can be quite good, e.g. for a C comparison benchmark:

 getasciiarray(true) // user indicates original data is ASCII, so it can
                     // just return the pointer

and you should get a C-like benchmark (if run length and card-table costs are
not too high), except for the auto-vectorization that newer Java/C compilers
can do (is this an issue for stream reads?). I'm not sure why (or if) people
object to getting an array if you want to do array-style work; if you want
strings, use strings.
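
A rough sketch of the "ask for an array explicitly" idea, again in Python
just for illustration. get_ascii_array is a hypothetical stand-in for the
proposed string.GetASCIIArray. Unlike the proposal, this version copies the
data rather than returning a pointer, so it only shows the API shape, not the
performance.

```python
def get_ascii_array(text: str) -> bytearray:
    """Hypothetical stand-in for string.GetASCIIArray.

    Fails loudly (UnicodeEncodeError) if the text is not pure ASCII,
    so array-style code never silently mangles other characters.
    """
    return bytearray(text.encode("ascii"))

def count_spaces(buf: bytearray) -> int:
    """C-style index loop, which is fine on a plain byte array."""
    count = 0
    for i in range(len(buf)):
        if buf[i] == 0x20:  # ASCII space
            count += 1
    return count

spaces = count_spaces(get_ascii_array("a b c"))
```

The point is that the index loop lives on the array type, where it is
correct, rather than on string.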

Though with slices, string code should be much faster (not creating new
strings unless there is a change, only value objects for substrings, no
conversion from a UTF-8 source, etc.), be more succinct, have fewer bugs
(less [index - length - i + 1]) and use 40-50% less memory than Java/C#
anyway. For really hard-core stuff you're working with your own types
(trees, UTF-8 byte arrays, etc.), not string, anyway.
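
As an illustration of substrings as views rather than copies, here is a small
Python sketch using memoryview over UTF-8 bytes. The function name
split_views and the sample data are made up; a real slice type in the
language would look different.

```python
from typing import Iterator

def split_views(buf: memoryview, sep: int) -> Iterator[memoryview]:
    """Yield slices of buf between separator bytes without copying.

    Splitting on an ASCII byte is safe in UTF-8 because ASCII values
    never occur inside a multi-byte sequence.
    """
    start = 0
    for i, b in enumerate(buf):
        if b == sep:
            yield buf[start:i]
            start = i + 1
    yield buf[start:]

data = "key=value;other=thing".encode("utf-8")
views = list(split_views(memoryview(data), ord(";")))

# Decoding to str only happens at the edge, when a real string is needed.
fields = [bytes(v).decode("utf-8") for v in views]
```

Each element of `views` refers back to the original buffer, so no new
string is created until one is actually needed.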

Ben


On Mon, Sep 16, 2013 at 11:59 PM, Jonathan S. Shapiro <[email protected]> wrote:

> On Fri, Sep 13, 2013 at 12:26 AM, David Jeske <[email protected]> wrote:
>
>> On Thu, Sep 12, 2013 at 2:04 PM, Jonathan S. Shapiro <[email protected]> wrote:
>>
>>> The more interesting challenge is to either (a) get people to use some
>>> form of StringReader for sequential access to strings
>>>
>>
>> To me this seems the saner route.
>>
>> If it meant more flexible ways to handle proper I18N, I would be happy
>> with a cursor-based string-reader. Here you could get a cursor from
>> start/end or a saved handle, and then increment/decrement the cursor
>> position, and save any position or range. You would not be able to
>> fabricate an arbitrary location like is possible with the a[i] API.
>>
>> This would prevent C-style "for (i=0;i<length;i++)" loops, but would
>> support an iterator style of the same loop "for(cursor=s.start;
>> cursor.haschar; cursor.inc)", where inside the iterator you could save off
>> locations easily.
>>
>> Whether other folks would find this acceptable or not is another matter.
>>
>
> The problem with this line of reasoning is that it requires the majority
> of users to give up simple, bad code in order to serve the needs of a small
> number of users who actually want to do internationalization properly. Even
> if it's the right thing to do, changes of this kind rarely prosper.
>
> Fortunately, it shouldn't be that hard to design a pass that turns
> "for(i=0; i < length; i++)" and "foreach c in s" into something that uses a
> StringReader. The main catch with such optimizations is that you don't want
> them to introduce, say, heap allocation into a NoHeap procedure...
>
> shap
>
> _______________________________________________
> bitc-dev mailing list
> [email protected]
> http://www.coyotos.org/mailman/listinfo/bitc-dev
>
>