Richard, your diligence is awesome!

I'm left feeling that, generally, objects should not implement nor understand 
#asString. I realize there is a lot of room for debate on this subject.



On May 11, 2020 10:12:10 PM PDT, Richard O'Keefe <rao...@gmail.com> wrote:
>For what it's worth, here's a moderately thorough examination of
>several
>different Smalltalk systems.
>Let A = astc, D = Dolphin, G = GNU Smalltalk, Q = Squeak,
>    S = ANSI standard, T = Strongtalk, V = VisualWorks, X = ST/X
>
>#asString
>
>ADGPQST!X  Character => String with: self
>ADGPQSTVX  String => self
>ADGPQSTVX  Symbol => String withAll: self
>.........  Text/Paragraph, where provided, => underlying string.
>A-G-----X  Collection => String withAll: self
>---PQ----  Collection => self printString
>A........  ByteArray => "decoded as Unicode"
>.DGPQ-TVX  ByteArray => "decoded as Latin1"
>ADG.Q..VX  Filename => "unparse (Filename, FilePath, whatever)
>AdgpQ.TVX  URI => "unparse (ZnUrl in P)
>Ad.PQ...X  UUID => "unparse (d, GUID equivalent)"
>.DGpq...X  Number => self displayString "or self printString, maybe via
>Object"
>...PQ...X  Object => self printString
>
>There's a rough consensus here:
> - if something *is* textual (Character, String, Symbol, Text,
>   Paragraph) #asString returns it as a normal string
> - if something is a parsed form of structured text (file name,
>   URI, UUID) its unparsed form is returned, converting it back
>   should yield an equal parsed form
> - if something is a byte array, it may be converted to a string
>   (the rule is to preserve the bytes, my library presumes byte
>   arrays are UTF8-encoded)
> - if numbers are accepted at all, their #printString is returned
> - if is none of the above, but has a #name or #title, that is
>   commonly the result.
>
>The open question is what converting an array of characters using
>#asString will do.  There is at least one Smalltalk out there
>where #(65 66 67) asString => 'ABC', but that's a step further
>than I personally want, though it is consistent with ByteArray.
>
>
>
>
>On Tue, 12 May 2020 at 11:16, Jerry Kott <jk...@image-ware.com> wrote:
>
>> Hi all,
>>
>> I’ve been lurking so far, but I must add my voice here and agree with
>> Richard.
>>
>> The malleability of Smalltalk tempts people into implementing
>#asString,
>> #name, and similar semantically ambiguous method names. Like Richard,
>I
>> regretted every single time I (or someone else on my team before me)
>> decided to use these. It goes back to ’Smalltalk with Style’
>(probably the
>> best little red book you could ever own): write intention revealing
>code
>> whenever you can, and comment liberally when you can’t.
>>
>> *Jerry Kott*
>> This message has been digitally signed.
>> PGP Fingerprint:
>> A9181736DD2F1B6CC7CF9E51AC8514F48C0979A5
>>
>>
>>
>> On 11-05-2020, at 2:48 PM, Richard Sargent <rsarg...@5x5.on.ca>
>wrote:
>>
>>
>>
>> On May 11, 2020 2:19:49 PM PDT, Richard O'Keefe <rao...@gmail.com>
>wrote:
>>
>> I was saying that I expected #($a $b $c) asString ==> 'abc'.
>>
>>
>> Over the years, I found myself being opposed to the idea that all
>objects
>> can sensibly have an #asString implementation. When it's been done,
>it
>> ultimately caused more problems than it solved.
>>
>> Consider $(48 49 50) asString. Do you expect it to give you a string
>with
>> all the digits? Or perhaps it's meant to interpret the elements as
>> byte-like things, as you would need for "String withAll:
>aCollection". So,
>> the numbers could be interpreted as codepoints, as they are in a
>ByteArray.
>>
>> But, what does "(Array with: Object new with: ProcessScheduler)
>asString"
>> mean?
>>
>> It seems to me that having all objects understand #asString leads to
>> confusion.
>>
>> If you want an array to print as its literal representation,
>implement
>> #printAsLiteral, so that your intention is clear.
>>
>>
>> If you want something that can be read back, that's what #storeString
>> is for,
>>
>> On Tue, 12 May 2020 at 01:28, Stéphane Ducasse
>> <stephane.duca...@inria.fr> wrote:
>>
>>
>>
>>
>> On 5 May 2020, at 16:16, Richard O'Keefe <rao...@gmail.com> wrote:
>>
>> By the way, while playing with this problem, I ran into a moderately
>> painful issue.
>>
>> There is a reason that Smalltalk has both #printString (to get a
>> printable representation of an object) and #asString (to convert a
>> sequence to another kind of sequence with the same elements.)  If I
>> *want* #printString, I know where to find it.  The definition in my
>> Smalltalk no reads
>>
>>   asString
>>     "What should #($a $b $c) do?
>>     - Blue Book, Inside Smalltalk, Apple Smalltalk-80:
>>       there is no #asString.
>>     - ANSI, VW, Dolphin, CSOM:
>>       #asString is defined on characters and strings
>>       (and things like file names and URIs that are sort of
>>
>> strings),
>>
>>       so expect an error report.
>>     - VisualAge Smalltalk:
>>       '($a $b $c)'
>>     - Squeak and Pharo:
>>       '#($a $b $c)'
>>     - GNU Smalltalk, Smalltalk/X, and astc:
>>       'abc'
>>      I don't intend any gratuitous incompatibility, but when there
>>      is no consensus to be compatible with, one must pick something,
>>      and this seems most useful.
>>     "
>>     ^String withAll: self
>>
>> Does anyone here know WHY Squeak and Pharo do what they do here?
>>
>>
>> Oops I did not see the quotes on my screen..
>>
>> #( a b c) asString
>>
>> '#(#a #b #c)’
>>
>>
>> this is unclear to me why this is not good but I have no strong
>>
>> opinion
>>
>> that this is good.
>>
>> I worked on printString for literals because I wanted to have
>> self evaluating properties for basic literal like in Scheme and
>>
>> others.
>>
>> where
>> #t
>>
>>
>> #t
>>
>> And I payed attention that we get the same for literal arrays.
>> Now the conversion is open to me.
>>
>> #($a $b $c) asString
>>
>>
>> '#($a $b $c)’
>>
>> In fact I do not really understand why a string
>>
>> #($a $b $c) asString would be '(a b c)’
>> and its use
>> if this is to nicely display in the ui I would have
>> displayString doing it.
>>
>> S.
>>
>>
>>
>> On Wed, 6 May 2020 at 01:20, Richard O'Keefe <rao...@gmail.com>
>>
>> wrote:
>>
>>
>>
>> The irony is that the code I was responding to ISN'T obviously
>>
>> correct.
>>
>> Indeed, I found it rather puzzling.
>> The problem specification says that the input string may contain
>>
>> digits
>>
>> AND SPACES.  The original message includes this:
>>
>> Strings of length 1 or less are not valid. Spaces are allowed in the
>> input, but they should be stripped before checking. All other
>> non-digit characters are disallowed.
>>
>> Now it isn't clear what "disallowed" means.  I took it to mean "may
>>
>> occur and
>>
>> should simply mean the input is rejected as invalid."  Perhaps "may
>>
>> not occur"
>>
>> was the intention.  So we shall not quibble about such characters.
>>
>> But I can't for the life of me figure out how Trygve's code checks
>>
>> for spaces.
>>
>> One reason this is an issue is that the behaviour of #digitValue is
>>
>> not
>>
>> consistent between systems.
>> Character space digitValue
>>   does not exist in the ANSI standard
>>   answers -1 in many Smalltalks (which is a pain)
>>   answers a positive integer that can't be mistake for a digit in my
>>
>> Smalltalk
>>
>>   raises an exception in some Smalltalks.
>>
>> This is a comment I now have in my Smalltalk library for #digitValue
>>     "This is in the Blue Book, but unspecified on non-digits.
>>      Squeak, Pharo, Dolphin, VW, VAST, and Apple Smalltalk-80
>>      answer -1 for characters that are not digits (or ASCII
>>
>> letters),
>>
>>      which is unfortunate but consistent with Inside Smalltalk
>>      which specifies this result for non-digits.
>>      ST/X and GST raise an exception which is worse.
>>      Digitalk ST/V documentation doesn't specify the result.
>>      This selector is *much* easier to use safely if it
>>      returns a 'large' (>= 36) value for non-digits."
>>
>> Let's compare three versions, the two I compared last time,
>> and the "version A" code I discussed before, which to my mind
>> is fairly readable.
>>
>> "Don't add slowness": 1 (normalised time)
>> "Trygve's code":  6.5
>> "High level code": 30.6 (or 4.7 times slower than Trygve's)
>>
>> Here's the "High level code".
>>     ^(aString allSatisfy: [:each | each isSpace or: [each isDigit]])
>>
>> and: [
>>
>>       |digitsReverse|
>>       digitsReverse := (aString select: [:each | each isDigit])
>>
>> reverse.
>>
>>       digitsReverse size > 1 and: [
>>         |evens odds evenSum oddSum|
>>         odds  := digitsReverse withIndexSelect: [:y :i | i odd].
>>         evens := digitsReverse withIndexSelect: [:x :i | i even].
>>         oddSum  := odds  detectSum: [:y | y digitValue].
>>         evenSum := evens detectSum: [:x |
>>                      #(0 2 4 6 8 1 3 5 7 9) at: x digitValue + 1].
>>         (oddSum + evenSum) \\ 10 = 0]]
>>
>> This is the kind of code I was recommending that Roelof write.
>>
>> As a rough guide, by counting traversals (including ones inside
>>
>> existing
>>
>> methods), I'd expect the "high level" code to be at least 10 times
>>
>> slower
>>
>> than the "no added slowness" code.
>>
>> We are in vehement agreement that there is a time to write high level
>> really obvious easily testable and debuggable code, and that's most
>> of the time, especially with programming exercises.
>>
>> I hope that we are also in agreement that factors of 30 (or even 6)
>> *can* be a serious problem.  I mean, if I wanted something that slow,
>> I'd use Ruby.
>>
>> I hope we are also agreed that (with the exception of investigations
>> like this one) the time to hack on something to make it faster is
>>
>> AFTER
>>
>> you have profiled it and determined that you have a problem.
>>
>> But I respectfully suggest that there is a difference taking slowness
>>
>> OUT
>>
>> and simply not going out of your way to add slowness in the first
>>
>> place.
>>
>>
>> I'd also like to remark that my preference for methods that traverse
>>
>> a
>>
>> sequence exactly once has more to do with Smalltalk protocols than
>> with efficiency.  If the only method I perform on an object is #do:
>> the method will work just as well for readable streams as for
>> collections.  If the only method I perform on an object is
>>
>> #reverseDo:
>>
>> the method will work just as well for Read[Write]Streams as for
>> SequenceReadableCollections, at least in my library.   It's just like
>> trying to write #mean so that it works for Durations as well as
>>
>> Numbers.
>>
>>
>> Oh heck, I suppose I should point out that much of the overheads in
>> this case could be eliminated by a Self-style compiler doing dynamic
>> inlining + loop fusion.    There's no reason *in principle*, given
>>
>> enough
>>
>> people, money, and time, that the differences couldn't be greatly
>> reduced in Pharo.
>>
>> On Tue, 5 May 2020 at 21:50, Trygve Reenskaug <tryg...@ifi.uio.no>
>>
>> wrote:
>>
>>
>>
>> Richard,
>>
>> Thank you for looking at the code. It is comforting to learn that the
>>
>> code has been executed for a large number of examples without
>breaking.
>> The code is not primarily written for execution but for being read
>and
>> checked by the human end user. It would be nice if we could also
>check
>> that it gave the right answers, but I don't know how to do that.
>>
>>
>> The first question is: Can a human domain expert read the code and
>>
>> sign their name for its correctness?
>>
>>
>>
>> When this is achieved, a programming expert will transcribe the first
>>
>> code to a professional quality program. This time, the second code
>> should be reviewed by an independent programmer who signs their name
>> for its correct transcription from the first version.
>>
>>
>> --Trygve
>>
>> PS: In his 1991 Turing Award Lecture, Tony Hoare said: "There are two
>>
>> ways of constructing a software design: One way is to make it so
>simple
>> that there are obviously no deficiencies and the other is to make it
>so
>> complicated that there are no obvious deficiencies. The first method
>is
>> far more difficult."
>>
>>
>> --Trygve
>>
>> On tirsdag.05.05.2020 04:41, Richard O'Keefe wrote:
>>
>> As a coding experiment, I adapted Trygve  Reenskoug's code to my
>> Smalltalk compiler, put in my code slightly tweaked, and benchmarked
>> them on randomly generated data.
>>
>> Result: a factor of 6.3.
>>
>> In Squeak it was a factor of ten.
>>
>> I had not, in all honesty, expected it to to be so high.
>>
>> On Tue, 5 May 2020 at 02:00, Trygve Reenskaug <tryg...@ifi.uio.no>
>>
>> wrote:
>>
>>
>> A coding experiment.
>> Consider a Scrum development environment. Every programming team has
>>
>> an end user as a member.
>>
>> The team's task is to code a credit card validity check.
>> A first goal is that the user representative shall read the code and
>>
>> agree that it is a correct rendering of their code checker:
>>
>>
>>   luhnTest: trialNumber
>>       | s1 odd s2 even charValue reverse |
>> -----------------------------------------------
>> " Luhn test according to Rosetta"
>> "Reverse the order of the digits in the number."
>>   reverse := trialNumber reversed.
>> "Take the first, third, ... and every other odd digit in the reversed
>>
>> digits and sum them to form the partial sum s1"
>>
>>   s1 := 0.
>>   odd := true.
>>   reverse do:
>>       [:char |
>>           odd
>>               ifTrue: [
>>                   s1 := s1 + char digitValue.
>>               ].
>>               odd := odd not
>>       ].
>> "Taking the second, fourth ... and every other even digit in the
>>
>> reversed digits:
>>
>> Multiply each digit by two and sum the digits if the answer is
>>
>> greater than nine to form partial sums for the even digits"
>>
>>   "The subtracting 9 gives the same answer. "
>> "Sum the partial sums of the even digits to form s2"
>>   s2 := 0.
>>   even := false.
>>   reverse do:
>>       [:char |
>>           even
>>               ifTrue: [
>>                   charValue := char digitValue * 2.
>>                   charValue > 9 ifTrue: [charValue := charValue -
>>
>> 9].
>>
>>                   s2 := s2 + charValue
>>               ].
>>               even := even not
>>       ].
>> "If s1 + s2 ends in zero then the original number is in the form of a
>>
>> valid credit card number as verified by the Luhn test."
>>
>>   ^(s1 + s2) asString last = $0
>> ---------------------------------
>> Once this step is completed, the next step will be to make the code
>>
>> right without altering the algorithm (refactoring). The result should
>> be readable and follow the team's conventions.
>>
>>
>>
>> P.S. code attached.
>>
>>
>> --
>>
>> The essence of object orientation is that objects collaborate  to
>>
>> achieve a goal.
>>
>> Trygve Reenskaug      mailto: tryg...@ifi.uio.no
>> Morgedalsvn. 5A       http://folk.uio.no/trygver/
>> N-0378 Oslo             http://fullOO.info
>> Norway                     Tel: (+47) 468 58 625
>>
>>
>>
>> --------------------------------------------
>> Stéphane Ducasse
>> http://stephane.ducasse.free.fr / http://www.pharo.org
>> 03 59 35 87 52
>> Assistant: Julie Jonas
>> FAX 03 59 57 78 50
>> TEL 03 59 35 86 16
>> S. Ducasse - Inria
>> 40, avenue Halley,
>> Parc Scientifique de la Haute Borne, Bât.A, Park Plaza
>> Villeneuve d'Ascq 59650
>> France
>>
>>
>>
>>
>>

Reply via email to