I think there is another problem here: if creationix writes a stream
that likes to read a byte, or a line, or whatever, it means that I
can't just pipe anything into it, because it basically needs a custom
`flow` function (as per
https://github.com/isaacs/readable-stream/blob/master/readable.js#L84-95
) but that will be on the readable side.

if data is gonna be pulled off the readable stream, then the pulling
duties really belong to the puller, not the pullee.

but I can also sense complexity rearing its ugly head. still, maybe
pipe could check for a dest.pull method, and use that instead of flow?
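something like this, roughly (the `pull` method and the flow fallback
shape are just a sketch of the idea, not the readable-stream code):

```javascript
// Hypothetical sketch: pipe() prefers a dest.pull() method when present,
// so pull-style consumers drive the reads at their own pace instead of
// being pushed to by a flow() loop on the readable side.
function pipe(src, dest) {
  if (typeof dest.pull === 'function') {
    // The destination pulls: it calls src.read() itself.
    dest.pull(src);
  } else {
    // Default push loop, roughly what flow() does.
    var chunk;
    while ((chunk = src.read()) !== null) {
      dest.write(chunk);
    }
  }
}

// Tiny in-memory source for demonstration.
function arraySource(chunks) {
  var i = 0;
  return { read: function () { return i < chunks.length ? chunks[i++] : null; } };
}

var pushed = [];
pipe(arraySource(['a', 'b']), { write: function (c) { pushed.push(c); } });

var pulled = [];
pipe(arraySource(['x', 'y']), {
  pull: function (src) {
    var c;
    while ((c = src.read()) !== null) pulled.push(c);
  }
});
```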

On Mon, Jul 30, 2012 at 5:51 AM, Bruno Jouhier <bjouh...@gmail.com> wrote:
> @tim
>
> The API that I used in this blog post is a simplified version of the API I
> implemented in streamline. I simplified it in the blog post because I just
> wanted to demo the equivalence between the two styles of API.
>
> The streams module that I am using
> (https://github.com/Sage/streamlinejs/blob/master/lib/streams/server/streams.md)
> has most of the features that you saw missing:
>
> * an optional "len" parameter in the read call.
> * low and high water mark options in the ReadableStream constructor.
>
> The "len" parameter has your "bytes" semantics and I use it exactly the way
> you describe (typically to read 4 bytes to get a frame length and then read
> N bytes for a frame). I did not implement "maxBytes" semantics because I did
> not need it (which does not mean it would not be useful). The thing is that
> all the additional bells and whistles can be implemented around the basic
> read(cb) call (called readChunk in my module).
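for instance, a "len" read layered over a basic readChunk(cb) might look
roughly like this (makeLenReader is a made-up name; only readChunk comes
from Bruno's module, and the error/EOF handling here is a guess):

```javascript
// Sketch: layering read(len, cb) on top of a basic readChunk(cb).
// readChunk delivers arbitrary chunks; read(len) buffers until len
// bytes are available, then hands back exactly len bytes.
function makeLenReader(readChunk) {
  var buffered = Buffer.alloc(0);
  return function read(len, cb) {
    (function fill() {
      if (buffered.length >= len) {
        var out = buffered.slice(0, len);
        buffered = buffered.slice(len);
        return cb(null, out);
      }
      readChunk(function (err, chunk) {
        if (err) return cb(err);
        if (chunk === null) return cb(null, null); // EOF before len bytes
        buffered = Buffer.concat([buffered, chunk]);
        fill();
      });
    })();
  };
}
```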
>
> I introduced low and high mark options because I wanted to avoid a
> pause/resume dance around every data event when the data arrives faster than
> it is consumed. My assumption was that a little queue with high and low
> marks would reduce the number of pause/resume calls and improve performance.
> Basically trading a bit of space for speed. But I have to admit that I did
> not bench it. So, if the pause/resume dance costs very little this may be
> overkill.
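the low/high mark idea could be sketched like this (markedQueue and the
producer shape are illustrative names, not the streamline API):

```javascript
// Sketch of the high/low water mark idea: pause the producer only when
// the queue crosses "high", resume only when it drains back to "low",
// instead of a pause/resume dance around every single data event.
function markedQueue(low, high, producer) {
  var q = [], paused = false;
  return {
    push: function (item) {
      q.push(item);
      if (!paused && q.length >= high) { paused = true; producer.pause(); }
    },
    shift: function () {
      var item = q.shift();
      if (paused && q.length <= low) { paused = false; producer.resume(); }
      return item;
    }
  };
}
```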
>
> @isaac and mikeal,
>
> This callback proposal may sound very "anti-eventish" and it may give the
> impression that I'm sorta trying to eradicate events from node's APIs
> (nobody said it but I can see how it could be perceived this way). This is
> not the case. I like node's event API and I find it very elegant. But node
> gives us two API styles (callbacks and events) and it is not always easy to
> choose between the two. Here is the rationale that I use to decide between
> them:
>
> My main criterion is CORRELATION. Basically, I start with the assumption that
> the API is event-oriented and then I analyze the degree of correlation
> between the various events. If the events are highly correlated, I choose
> the callback style. If they are loosely correlated, I keep the event style.
> Some examples:
>
> * User events (browser side) are very loosely correlated => event style
> * Incoming HTTP requests (server side) are also very loosely correlated =>
> event style
> * Data streams vary. If each data chunk is a complete message which is more
> or less independent from other messages, the event style is best. If, on the
> other hand, the chunks are correlated (because the whole stream has a strong
> internal structure, or because it has been chunked on arbitrary boundaries
> that don't match its internal structure), then the callback style is best.
> * Confirmation events (like "connect/error" events that follow a connection
> attempt, or a "drain" event that follows a write returning false) are fully
> correlated => callback style.
>
> Also, the event style API is more powerful than the callback style API as it
> supports multiple listeners.
> BUT:
>
> * It is very easy to wrap a callback API with an event listener.
> * Very often, in the correlated case, there is a "main" consumer which needs
> to correlate the events, and auxiliary consumers that don't care that much
> about the correlations (log them, feed statistics, etc). A dual API with
> callbacks for the main consumer and events for the auxiliary ones works
> great.
> * Wrapping an event style API with a callback style API is a lot more
> difficult.
> * Callback style APIs are easier to use when the events are correlated
> because you don't need to set up state machines to re-correlate the events.
>
> Given this, I probably favor the callback style a lot more than most node
> developers. But this is not a systematic "anti-event" attitude, there is a
> rationale behind it and I wanted to share it with you.
>
> Bruno
>
>
>
> On Saturday, July 28, 2012 9:14:11 PM UTC+2, Mikeal Rogers wrote:
>>
>>
>> On Jul 28, 2012, at 12:05 PM, Tim Caswell
>> <t...@creationix.com> wrote:
>>
>> > FWIW, I actually like Bruno's proposal.  It doesn't cover all the use
>> > cases, but it makes backpressure enabled pumps really easy.
>> >
>> > One use case missing that's easy to add is when consuming a binary
>> > protocol, I often only want part of the input.  For example, I might
>> > want to get the first 4 bytes, decode that as a uint32 length header
>> > and then read n more bytes for the body.  Without being able to
>> > request how many bytes I want, I have to handle putting data back in
>> > the stream that I don't need.  That's very error prone and tedious.
>> > So on the read function, add an optional "maxBytes" or "bytes"
>> > parameter.  The difference is in the maxBytes case, I want the data as
>> > soon as there is anything, even if it's less than the number of bytes
>> > I want.   In the "bytes" case I want to wait till that many bytes are
>> > available.  Both are valid for different use cases.
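Tim's framing case, assuming a read(bytes, cb) with the "wait until that
many bytes are available" semantics (the signature and the bufferReader
helper here are hypothetical), might look like:

```javascript
// Sketch of the framing use case: with a "bytes" read that delivers
// exactly n bytes, decoding a 4-byte big-endian length header plus a
// body of that length is just two calls, with no put-back bookkeeping.
function readFrame(read, cb) {
  read(4, function (err, header) {
    if (err) return cb(err);
    var len = header.readUInt32BE(0);
    read(len, function (err, body) {
      if (err) return cb(err);
      cb(null, body);
    });
  });
}

// Toy read(bytes, cb) over a fixed buffer, for demonstration only.
function bufferReader(buf) {
  var pos = 0;
  return function read(n, cb) {
    var out = buf.slice(pos, pos + n);
    pos += n;
    cb(null, out);
  };
}
```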
>>
>> The early stuff I saw included a "length" option.
>>
>> >
>> > Also streams (both readable and writable) need a configurable
>> > low-water mark.  I don't want to wait till the pipe is empty before I
>> > start piping data again.  This mark would control how soon writable
>> > streams called my write callback and how much readable streams would
>> > readahead from their data source before waiting for me to call read.
>> > I want to keep it always full.  It would be great if this was handled
>> > internally in the stream and consumers of the stream simply configured
>> > what the mark should be.
>>
>> I think you're missing how this works. Nobody automatically asks for data
>> so watermarks aren't strictly necessary. You ask for data if it's available
>> and you read as much as you can handle.
>>
>> There is no "readahead". If someone stops calling read() then the buffer
>> fills and, if it's a TCP stream, it's asked to stop sending data.
>>
>> Remember that when the "readable" event goes off it's expected that the
>> pending data is read in the same event loop cycle.
>>
>
