> On 11 Apr 2018, at 10:29, Alistair Grant <akgrant0...@gmail.com> wrote: > > Hi Denis, > > On 11 April 2018 at 10:02, Denis Kudriashov <dionisi...@gmail.com> wrote: >> >> 2018-04-11 8:32 GMT+02:00 Alistair Grant <akgrant0...@gmail.com>: >>> >>>>>> Where is it being said that #next and/or #atEnd should be blocking or >>>>>> non-blocking ? >>>>> >>>>> There is existing code that assumes that #atEnd is non-blocking and >>>>> that #next is allowed block. I believe that we should keep those >>>>> conditions. >>>> >>>> I fail to see where that is written down, either way. Can you point me >>>> to comments stating that, I would really like to know ? >>> >>> I'm not aware of it being written down, just that ever existing >>> implementation I'm aware of behaves this way. >>> >>> On the other hand, making #atEnd blocking breaks Eliot's REPL sample >>> (in Squeak). >> >> >> Could you write here this example, please? > > The code is loaded in squeak using: > > https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/buildspurtrunkreaderimage.sh > > for 32 bit images. It loads: > > https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/LoadReader.st > > which loads package CogTools-Listener in http://source.squeak.org/VMMaker > > An image that automatically runs the code and nothing else is created in: > > https://github.com/OpenSmalltalk/opensmalltalk-vm/blob/Cog/image/StartReader.st > > > If you want to run it interactively you can load CogTools-Listener and > do something like: > > StdioListener new > quitOnEof: false; > run
What does #quitOnEof: do ? Can the StdioListener code be browsed/viewed online somewhere ? > If you modify #atEnd to block it will result in the "squeak>" input > prompt being printed in the terminal after the input has been entered. How does one modify #atEnd to block ? I suppose you are talking about StdioStream>>#atEnd ? ^ self peek isNil ? PS: I liked your runnable example better, I will try it later on. Thx! > The code can be loaded in to Pharo and basically works, but the output > tends to be hidden behind the next input prompt because it uses #cr > instead of #lf. You can easily modify StdioListener>>initialize to > set the line end convention in stdout. > > NOTE: It is not intended to be a release quality implementation of a > evaluation loop. The whole purpose as I understand it is for it to be > as simple as possible to assist in tracking down issues using the VM > simulator. It runs minimal code to get to the point of waiting for > user input and then allows an expression that causes problems to be > entered and traced using the simulator. > > Cheers, > Alistair > > > >>>>>> How is this related to how EOF is signalled ? >>>>> >>>>> Because, combined with terminal EOF not being known until the user >>>>> explicitly flags it (with Ctrl-D) it means that #atEnd can't be used >>>>> for iterating over input from stdin connected to a terminal. >>>> >>>> This seems to me like an exception that only holds for one particular >>>> stream in one particular scenario (interactive stdin). I might be wrong. >>>> >>>>>> It seems to me that there are quite a few classes of streams that are >>>>>> 'special' in the sense that #next could be blocking and/or #atEnd could >>>>>> be >>>>>> unclear - socket/network streams, serial streams, maybe stdio >>>>>> (interactive >>>>>> or not). Without a message like #isDataAvailable you cannot handle those >>>>>> without blocking. >>>>> >>>>> Right. I think this is a distraction (I was trying to explain some >>>>> details, but it's causing more confusion instead of helping). >>>>> >>>>> The important point is that #atEnd doesn't work for iterating over >>>>> streams with terminal input >>>> >>>> Maybe you should also point to the actual code that fails. I mean you >>>> showed a partial stack trace, but not how you got there, precisely. How >>>> does >>>> the application reading from an interactive stdin do to get into trouble ? >>> >>> Included below. >>> >>> >>>>>> Reading from stdin seems like a very rare case for a Smalltalk system >>>>>> (not that it should not be possible). >>>>> >>>>> There's been quite a bit of discussion and several projects recently >>>>> related to using pharo for scripting, so it may become more common. >>>>> E.g. >>>>> >>>>> >>>>> https://www.quora.com/Can-Smalltalk-be-a-batch-file-scripting-language/answer/Philippe-Back-1?share=c19bfc95 >>>>> https://github.com/rajula96reddy/pharo-cli >>>> >>>> Still, it is not common at all. >>>> >>>>>> I have a feeling that too much functionality is being pushed into too >>>>>> small an API. >>>>> >>>>> This is just about how should Zinc streams be iterating over the >>>>> underlying streams. You didn't like checking the result of #next for >>>>> nil since it isn't general, correctly pointing out that nil is a valid >>>>> value for non-byte oriented streams. But #atEnd doesn't work for >>>>> stdin from a terminal. >>>>> >>>>> >>>>> At this point I think there are three options: >>>>> >>>>> 1. Modify Zinc to check the return value of #next instead of using >>>>> #atEnd. >>>>> >>>>> This is what all existing character / byte oriented streams in Squeak >>>>> and Pharo do. At that point the Zinc streams can be used on all file >>>>> / stdio input and output. >>>> >>>> I agree that such code exists in many places, but there is lots of >>>> stream reading that does not check for nils. >>> >>> Right. Streams can be categorised in many ways, but for this >>> discussion I think streams are broken in to two types: >>> >>> 1) Byte / Character oriented >>> 2) All others >>> >>> For historical reasons, byte / character oriented streams need to >>> check for EOF by using "stream next == nil" and all other streams >>> should use #atEnd. >>> >>> This avoids the "nil being part of the domain" issue that was >>> discussed earlier in the thread. >>> >>> >>>>> 2. Modify all streams to signal EOF in some other way, i.e. a sentinel >>>>> or notification / exception. >>>>> >>>>> This is what we were discussing below. But it is a decent chunk of >>>>> work with significant impact on the existing code base. >>>> >>>> Agreed. This would be a future extension. >>>> >>>>> 3. Require anyone who wants to read from stdin to code around Zinc's >>>>> inability to handle terminal input. >>>>> >>>>> I'd prefer to avoid this option if possible. >>>> >>>> See higher for a more concrete usage example request. >>> >>> >>> testAtEnd.st >>> -- >>> | ch stream string stdin | >>> >>> 'stdio.cs' asFileReference fileIn. >>> "stdin := FileStream stdin." >>> stdin := ZnCharacterReadStream on: >>> (ZnBufferedReadStream on: >>> Stdio stdin). >>> stream := (String new: 100) writeStream. >>> ch := stdin next. >>> [ ch == nil ] whileFalse: [ >>> stream nextPut: ch. >>> ch := stdin next. ]. >>> string := stream contents. >>> FileStream stdout >>> nextPutAll: string; lf; >>> nextPutAll: 'Characters read: '; >>> nextPutAll: string size asString; >>> lf. >>> Smalltalk snapshot: false andQuit: true. >>> -- >>> >>> Execute with: >>> >>> ./pharo --headless Pharo7.0-64bit-e76f1a2.image testAtEnd.st >>> >>> and type Ctrl-D gives: >>> >>> >>> 'Errors in script loaded from testAtEnd.st' >>> MessageNotUnderstood: receiver of "<" is nil >>> UndefinedObject(Object)>>doesNotUnderstand: #< >>> ZnUTF8Encoder>>nextCodePointFromStream: >>> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream: >>> ZnCharacterReadStream>>nextElement >>> ZnCharacterReadStream(ZnEncodedReadStream)>>next >>> UndefinedObject>>DoIt >>> OpalCompiler>>evaluate >>> >>> >>> Using #atEnd to control the loop instead of "stdin next == nil" >>> produces the same result. >>> >>> Replacing stdin with FileStream stdin makes the script work. >>> >>> stdio.cs fixes a bug in StdioStream which really isn't part of this >>> discussion (PR to be submitted). >>> >>> Cheers, >>> Alistair >>> >>> >>> >>> >>>>> Does that clarify the situation? >>>> >>>> Yes, it helps. Thanks. But questions remain. >>>> >>>>> Thanks, >>>>> Alistair >>>>> >>>>> >>>>> >>>>>>> On 10 Apr 2018, at 18:30, Alistair Grant <akgrant0...@gmail.com> >>>>>>> wrote: >>>>>>> >>>>>>> First a quick update: >>>>>>> >>>>>>> After doing some work on primitiveFileAtEnd, #atEnd now answers >>>>>>> correctly for files that don't report their size correctly, e.g. >>>>>>> /dev/urandom and /proc/cpuinfo, whether the files are opened directly >>>>>>> or >>>>>>> redirected through stdin. >>>>>>> >>>>>>> However determining whether stdin from a terminal has reached the end >>>>>>> of >>>>>>> file can't be done without making #atEnd blocking since we have to >>>>>>> wait >>>>>>> for the user to flag the end of file, e.g. by typing Ctrl-D. And >>>>>>> #atEnd >>>>>>> is assumed to be non-blocking. >>>>>>> >>>>>>> So currently using ZnCharacterReadStream with stdin from a terminal >>>>>>> will >>>>>>> result in a stack dump similar to: >>>>>>> >>>>>>> MessageNotUnderstood: receiver of "<" is nil >>>>>>> UndefinedObject(Object)>>doesNotUnderstand: #< >>>>>>> ZnUTF8Encoder>>nextCodePointFromStream: >>>>>>> ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream: >>>>>>> ZnCharacterReadStream>>nextElement >>>>>>> ZnCharacterReadStream(ZnEncodedReadStream)>>next >>>>>>> UndefinedObject>>DoIt >>>>>>> >>>>>>> >>>>>>> Going back through the various suggestions that have been made >>>>>>> regarding >>>>>>> using a sentinel object vs. raising a notification / exception, my >>>>>>> (still to be polished) suggestion is to: >>>>>>> >>>>>>> 1. Add an endOfStream instance variable >>>>>>> 2. When the end of the stream is reached answer the value of the >>>>>>> instance variable (i.e. the result of sending #value to the >>>>>>> variable). >>>>>>> 3. The initial default value would be a block that raises a >>>>>>> Deprecation >>>>>>> warning and then returns nil. This would allow existing code to >>>>>>> function for a changeover period. >>>>>>> 4. At the end of the deprecation period the default value would be >>>>>>> changed to a unique sentinel object which would answer itself as its >>>>>>> #value. >>>>>>> >>>>>>> At any time users of the stream can set their own sentinel, including >>>>>>> a >>>>>>> block that raises an exception. >>>>>>> >>>>>>> >>>>>>> Cheers, >>>>>>> Alistair >>>>>>> >>>>>>> >>>>>>> On 4 April 2018 at 19:24, Stephane Ducasse <stepharo.s...@gmail.com> >>>>>>> wrote: >>>>>>>> Thanks for this discussion. >>>>>>>> >>>>>>>> On Wed, Apr 4, 2018 at 1:37 PM, Sven Van Caekenberghe <s...@stfx.eu> >>>>>>>> wrote: >>>>>>>>> Alistair, >>>>>>>>> >>>>>>>>> First off, thanks for the discussions and your contributions, I >>>>>>>>> really appreciate them. >>>>>>>>> >>>>>>>>> But I want to have a discussion at the high level of the definition >>>>>>>>> and semantics of the stream API in Pharo. >>>>>>>>> >>>>>>>>>> On 4 Apr 2018, at 13:20, Alistair Grant <akgrant0...@gmail.com> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> On 4 April 2018 at 12:56, Sven Van Caekenberghe <s...@stfx.eu> >>>>>>>>>> wrote: >>>>>>>>>>> Playing a bit devil's advocate, the idea is that, in general, >>>>>>>>>>> >>>>>>>>>>> [ stream atEnd] whileFalse: [ stream next. "..." ]. >>>>>>>>>>> >>>>>>>>>>> is no longer allowed ? >>>>>>>>>> >>>>>>>>>> It hasn't been allowed "forever" [1]. It's just been misused for >>>>>>>>>> almost as long. >>>>>>>>>> >>>>>>>>>> [1] Time began when stdio stream support was introduced. :-) >>>>>>>>> >>>>>>>>> I am still not convinced. Another way to put it would be that the >>>>>>>>> old #atEnd or #upToEnd do not make sense for these streams and some >>>>>>>>> new loop >>>>>>>>> is needed, based on a new test (it exists for socket streams already). >>>>>>>>> >>>>>>>>> [ stream isDataAvailable ] whileTrue: [ stream next ] >>>>>>>>> >>>>>>>>>>> And you want to replace it with >>>>>>>>>>> >>>>>>>>>>> [ stream next ifNil: [ false ] ifNotNil: [ :x | "..." true ] >>>>>>>>>>> whileTrue. >>>>>>>>>>> >>>>>>>>>>> That is a pretty big change, no ? >>>>>>>>>> >>>>>>>>>> That's the way quite a bit of code already operates. >>>>>>>>>> >>>>>>>>>> As Denis pointed out, it's obviously problematic in the general >>>>>>>>>> sense, >>>>>>>>>> since nil can be embedded in non-byte oriented streams. I suspect >>>>>>>>>> that in practice not many people write code that reads streams >>>>>>>>>> from >>>>>>>>>> both byte oriented and non-byte oriented streams. >>>>>>>>> >>>>>>>>> Maybe yes, maybe no. As Denis' example shows there is a clear >>>>>>>>> definition problem. >>>>>>>>> >>>>>>>>> And I do use streams of byte arrays or strings all the time, this >>>>>>>>> is really important. I want my parsers to work on all kinds of >>>>>>>>> streams. >>>>>>>>> >>>>>>>>>>> I think/feel like a proper EOF exception would be better, more >>>>>>>>>>> correct. >>>>>>>>>>> >>>>>>>>>>> [ [ stream next. "..." true ] on: EOF do: [ false ] ] whileTrue. >>>>>>>>>> >>>>>>>>>> I agree, but the email thread Nicolas pointed to raises some >>>>>>>>>> performance questions about this approach. It should be >>>>>>>>>> straightforward to do a basic performance comparison which I'll >>>>>>>>>> get >>>>>>>>>> around to if other objections aren't raised. >>>>>>>>> >>>>>>>>> Reading in bigger blocks, using #readInto:startingAt:count: (which >>>>>>>>> is basically Unix's (2) Read sys call), would solve performance >>>>>>>>> problems, I >>>>>>>>> think. >>>>>>>>> >>>>>>>>>>> Will we throw away #atEnd then ? Do we need it if we cannot use >>>>>>>>>>> it ? >>>>>>>>>> >>>>>>>>>> Unix file i/o returns EOF if the end of file has been reach OR if >>>>>>>>>> an >>>>>>>>>> error occurs. You should still check #atEnd after reading past >>>>>>>>>> the >>>>>>>>>> end of the file to make sure no error occurred. Another part of >>>>>>>>>> the >>>>>>>>>> primitive change I'm proposing is to return additional information >>>>>>>>>> about what went wrong in the event of an error. >>>>>>>>> >>>>>>>>> I am sorry, but this kind of semantics (the OR) is way too complex >>>>>>>>> at the general image level, it is too specific and based on certain >>>>>>>>> underlying implementation details. >>>>>>>>> >>>>>>>>> Sven >>>>>>>>> >>>>>>>>>> We could modify the read primitive so that it fails if an error >>>>>>>>>> has >>>>>>>>>> occurred, and then #atEnd wouldn't be required. >>>>>>>>>> >>>>>>>>>> Cheers, >>>>>>>>>> Alistair >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> On 4 Apr 2018, at 12:41, Alistair Grant <akgrant0...@gmail.com> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hi Nicolas, >>>>>>>>>>>> >>>>>>>>>>>> On 4 April 2018 at 12:36, Nicolas Cellier >>>>>>>>>>>> <nicolas.cellier.aka.n...@gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> 2018-04-04 12:18 GMT+02:00 Alistair Grant >>>>>>>>>>>>> <akgrant0...@gmail.com>: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Sven, >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, Apr 04, 2018 at 11:32:02AM +0200, Sven Van >>>>>>>>>>>>>> Caekenberghe wrote: >>>>>>>>>>>>>>> Somehow, somewhere there was a change to the implementation >>>>>>>>>>>>>>> of the >>>>>>>>>>>>>>> primitive called by some streams' #atEnd. >>>>>>>>>>>>>> >>>>>>>>>>>>>> That's a proposed change by me, but it hasn't been integrated >>>>>>>>>>>>>> yet. So >>>>>>>>>>>>>> the discussion below should apply to the current stable vm >>>>>>>>>>>>>> (from August >>>>>>>>>>>>>> last year). >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> IIRC, someone said it is implemented as 'remaining size being >>>>>>>>>>>>>>> zero' >>>>>>>>>>>>>>> and some virtual unix files like /dev/random are zero sized. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Currently, for files other than sdio (stdout, stderr, stdin) >>>>>>>>>>>>>> it is >>>>>>>>>>>>>> effectively defined as: >>>>>>>>>>>>>> >>>>>>>>>>>>>> atEnd := stream position >= stream size >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> And, as you say, plenty of virtual unix files report size 0. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Now, all kinds of changes are being done image size to work >>>>>>>>>>>>>>> around this. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I would phrase this slightly differently :-) >>>>>>>>>>>>>> >>>>>>>>>>>>>> Some code does the right thing, while other code doesn't. >>>>>>>>>>>>>> E.g.: >>>>>>>>>>>>>> >>>>>>>>>>>>>> MultiByteFileStream>>upToEnd is good, while >>>>>>>>>>>>>> FileStream>>contents is incorrect >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> I am a strong believer in simple, real (i.e. infinite) >>>>>>>>>>>>>>> streams, but I >>>>>>>>>>>>>>> am not sure we are doing the right thing here. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Point is, I am not sure #next returning nil is official and >>>>>>>>>>>>>>> universal. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Consider the comments: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Stream>>#next >>>>>>>>>>>>>>> "Answer the next object accessible by the receiver." >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> ReadStream>>#next >>>>>>>>>>>>>>> "Primitive. Answer the next object in the Stream represented >>>>>>>>>>>>>>> by the >>>>>>>>>>>>>>> receiver. Fail if the collection of this stream is not an >>>>>>>>>>>>>>> Array or a >>>>>>>>>>>>>>> String. >>>>>>>>>>>>>>> Fail if the stream is positioned at its end, or if the >>>>>>>>>>>>>>> position is out >>>>>>>>>>>>>>> of >>>>>>>>>>>>>>> bounds in the collection. Optional. See Object documentation >>>>>>>>>>>>>>> whatIsAPrimitive." >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Note how there is no talk about returning nil ! >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I think we should discuss about this first. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Was the low level change really correct and the right thing >>>>>>>>>>>>>>> to do ? >>>>>>>>>>>>>> >>>>>>>>>>>>>> The primitive change proposed doesn't affect this discussion. >>>>>>>>>>>>>> It will >>>>>>>>>>>>>> mean that #atEnd returns false (correctly) sometimes, while >>>>>>>>>>>>>> currently it >>>>>>>>>>>>>> returns true (incorrectly). The end result is still >>>>>>>>>>>>>> incorrect, e.g. >>>>>>>>>>>>>> #contents returns an empty string for /proc/cpuinfo. >>>>>>>>>>>>>> >>>>>>>>>>>>>> You're correct about no mention of nil, but we have: >>>>>>>>>>>>>> >>>>>>>>>>>>>> FileStream>>next >>>>>>>>>>>>>> >>>>>>>>>>>>>> (position >= readLimit and: [self atEnd]) >>>>>>>>>>>>>> ifTrue: [^nil] >>>>>>>>>>>>>> ifFalse: [^collection at: (position := position + >>>>>>>>>>>>>> 1)] >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> which has been around for a long time (I suspect, before Pharo >>>>>>>>>>>>>> existed). >>>>>>>>>>>>>> >>>>>>>>>>>>>> Having said that, I think that raising an exception is a >>>>>>>>>>>>>> better >>>>>>>>>>>>>> solution, but it is a much, much bigger change than the one I >>>>>>>>>>>>>> proposed >>>>>>>>>>>>>> in https://github.com/pharo-project/pharo/pull/1180. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Cheers, >>>>>>>>>>>>>> Alistair >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Hi, >>>>>>>>>>>>> yes, if you are after universal behavior englobing Unix >>>>>>>>>>>>> streams, the >>>>>>>>>>>>> Exception might be the best way. >>>>>>>>>>>>> Because on special stream you can't allways say in advance, you >>>>>>>>>>>>> have to try. >>>>>>>>>>>>> That's the solution adopted by authors of Xtreams. >>>>>>>>>>>>> But there is a runtime penalty associated to it. >>>>>>>>>>>>> >>>>>>>>>>>>> The penalty once was so high that my proposal to generalize >>>>>>>>>>>>> EndOfStream >>>>>>>>>>>>> usage was rejected a few years ago by AndreaRaab. >>>>>>>>>>>>> http://forum.world.st/EndOfStream-unused-td68806.html >>>>>>>>>>>> >>>>>>>>>>>> Thanks for this, I'll definitely take a look. >>>>>>>>>>>> >>>>>>>>>>>> Do you have a sense of how Denis' suggestion of using an >>>>>>>>>>>> EndOfStream >>>>>>>>>>>> object would compare? >>>>>>>>>>>> >>>>>>>>>>>> It would keep the same coding style, but avoid the problems with >>>>>>>>>>>> nil. >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Alistair >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> I have regularly benched Xtreams, but stopped a few years ago. >>>>>>>>>>>>> Maybe i can excavate and pass on newer VM. >>>>>>>>>>>>> >>>>>>>>>>>>> In the mean time, i had experimented a programmable end of >>>>>>>>>>>>> stream behavior >>>>>>>>>>>>> (via a block, or any other valuable) >>>>>>>>>>>>> http://www.squeaksource.com/XTream.htm >>>>>>>>>>>>> so as to reconcile performance and universality, but it was a >>>>>>>>>>>>> source of >>>>>>>>>>>>> complexification at implementation side. >>>>>>>>>>>>> >>>>>>>>>>>>> Nicolas >>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Note also that a Guille introduced something new, #closed >>>>>>>>>>>>>>> which is >>>>>>>>>>>>>>> related to the difference between having no more elements >>>>>>>>>>>>>>> (maybe right now, >>>>>>>>>>>>>>> like an open network stream) and never ever being able to >>>>>>>>>>>>>>> produce more data. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Sven >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>> <stdio.cs>