It's been said that what the "masses" think of as binary data is outside
the concept of a string, and this lurker just doesn't see that.  A binary
string is a string over a character set of size two, just like an ASCII
string is a string over a character set of size 128.  [Like character
strings, so-called binary data can even have different encodings besides
the usual eight bits packed into a byte, e.g. Base64 or 7E1 (7 bits even
parity 1 stop bit).]  And shifting is not at all limited to bit strings.
 If I have the bit string of length 5 (1, 0, 0, 0, 1) or 17 for short,
and the text string "Hello" then I can shift the second left by two to
get "llo  " as easily as I can shift the first left by two to get 4.
(The choice of fill character is of course up for debate.) Or
"arithmetically" shift the second right by two to form "HHHel", analogous
to an ASR of the bit string to yield 28. "And," "xor," etc. are less
obvious because there are multiple ways to define such operations when
there are more than two truth values. But they can, at any rate, be
defined: you can apply a function of two arguments element-wise to two
character strings of equal length to produce a third character string of
the same length. It seems right to leave these ops undefined by default
for non-binary strings (since there is definitely no one "right"
definition), but the prevailing notion that they *can't* be applied to,
say, Unicode text without making a horrible mess is just wrong.
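
For what it's worth, here is a rough Python sketch of the above. It is
only an illustration, not anything Parrot does or proposes: the names
shift_left, arith_shift_right, zip_with and bits_to_int are made up for
this example, the fill symbol is passed in explicitly, and using min as
the element-wise "and" is just one arbitrary choice among many.

```python
def shift_left(symbols, n, fill):
    """Logical left shift: drop n leading symbols, pad on the right with fill."""
    symbols = list(symbols)
    n = min(n, len(symbols))
    return symbols[n:] + [fill] * n


def arith_shift_right(symbols, n):
    """'Arithmetic' right shift: replicate the leading symbol n times."""
    symbols = list(symbols)
    if not symbols:
        return []
    n = min(n, len(symbols))
    return [symbols[0]] * n + symbols[:len(symbols) - n]


def zip_with(op, a, b):
    """Apply a two-argument function element-wise to equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return [op(x, y) for x, y in zip(a, b)]


def bits_to_int(bits):
    """Read a sequence of 0/1 symbols as an unsigned integer, MSB first."""
    value = 0
    for bit in bits:
        value = value * 2 + bit
    return value


bits = [1, 0, 0, 0, 1]                               # 17 for short
print(bits_to_int(shift_left(bits, 2, 0)))           # 4
print("".join(shift_left("Hello", 2, " ")))          # "llo  "
print(bits_to_int(arith_shift_right(bits, 2)))       # 28
print("".join(arith_shift_right("Hello", 2)))        # "HHHel"

# min over {0, 1} is ordinary "and"; over characters it is merely one
# possible generalization (here: the smaller code point wins).
print(zip_with(min, bits, [1, 1, 0, 0, 0]))          # [1, 0, 0, 0, 0]
print("".join(zip_with(min, "Hello", "World")))      # "Helld"
```

The same three functions serve both the bit string and the text string;
only the alphabet and the choice of fill or combining function differ.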

>>> Dan Sugalski <[EMAIL PROTECTED]> 04/30/04 10:25 PM >>>
At 7:07 PM -0700 4/30/04, Jeff Clites wrote:
>On Apr 30, 2004, at 10:22 AM, Dan Sugalski wrote:
>
>>At 2:57 AM +1000 5/1/04, Andre Pang wrote:
>>>Of course Parrot should have a function to reinterpret something 
>>>of a string type as raw binary data and vice versa, but don't mix 
>>>binary data with strings: they are completely different types, and 
>>>raw binary data should never be able to be put into a string 
>>>register.  Maybe some blurring of binary data/strings should 
>>>happen at the Perl layer, but Parrot should keep them as distinct 
>>>as possible, IMHO.
>>
>>I'm trying to make sure that keeping them separate is possible, but 
>>it's important for everyone to remember that we're limited in what 
>>we can do.
>>
>>Parrot *can't* dictate semantics. That's not what we get to do.
>
>But your plan seems to be very much dictating semantics--treating a 
>whole class of reasonable string operations as "in that case, punt 
>and throw an exception".

That's why it's overridable. I fully expect most languages will do so 
by default, but the option is there to leave the exceptions on as a 
debugging aid.

>  And it's not clear that the semantics it is dictating in fact match 
>any of the target languages (or in fact, any existing language at 
>all). The at-runtime association of character set/encoding/language, 
>and the semantics it implies, is what I'm referring to here.

Yep, but with the exceptions disabled things'll act the way they should.
-- 
                                         Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                         have teddy bears and even
                                       teddy bears get drunk
