Re: A Small Contribution to Phobos
I have implemented and submitted cache. Please review/destroy :) https://github.com/D-Programming-Language/phobos/pull/1364
Re: A Small Contribution to Phobos
On Monday, 3 June 2013 at 06:58:00 UTC, monarch_dodra wrote: I disagree. One thing a user could expect out of tee is to

Hello. I have created a pull request for tee: https://github.com/D-Programming-Language/phobos/pull/1348 (I did this based on the discussion in Issue 9882, and was not aware of this forum thread until monarch_dodra asked me to present my approach here).

My approach is to create an InputRange wrapper called TeeRange, which calls the user-provided function with each element of the wrapped range during iteration. The user can pass a flag to specify whether the function should be called on popFront (default) or on front. The following unittest from my change illustrates that distinction:

---
unittest
{
    // Manually stride to test different pipe behavior.
    void testRange(Range)(Range r)
    {
        const int strideLen = 3;
        int i = 0;
        typeof(Range.front) elem;
        while (!r.empty)
        {
            if (i % strideLen == 0)
            {
                elem = r.front();
            }
            r.popFront();
            i++;
        }
    }

    string txt = "abcdefghijklmnopqrstuvwxyz";

    int popCount = 0;
    auto pipeOnPop = tee!(a => popCount++)(txt);
    testRange(pipeOnPop);
    assert(popCount == 26);

    int frontCount = 0;
    auto pipeOnFront = tee!(a => frontCount++, false)(txt);
    testRange(pipeOnFront);
    assert(frontCount == 9);
}
---

Thanks, irritate
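For readers trying this today: a tee later shipped in std.range, where the flag is spelled Yes.pipeOnPop / No.pipeOnPop rather than a plain bool. The following is a minimal sketch that assumes such a Phobos version is available; strideThrough and hookCounts are hypothetical helper names, and the counts are expected to match the unittest above only if the shipped behaviour follows the pull request.

```d
import std.range : tee;
import std.typecons : No, Yes;

// Walk the whole range, but read front only on every third element,
// mirroring the manual stride in the unittest above.
void strideThrough(Range)(Range r)
{
    size_t i = 0;
    while (!r.empty)
    {
        if (i % 3 == 0)
        {
            auto elem = r.front;
        }
        r.popFront();
        ++i;
    }
}

// Returns how many times the hook fired: [pipe on pop, pipe on front].
int[2] hookCounts()
{
    string txt = "abcdefghijklmnopqrstuvwxyz";

    int onPop = 0;
    txt.tee!(a => onPop++, Yes.pipeOnPop).strideThrough;

    int onFront = 0;
    txt.tee!(a => onFront++, No.pipeOnPop).strideThrough;

    return [onPop, onFront];
}
```

The pop mode fires once per element no matter how the consumer reads; the front mode fires only for the elements the consumer actually looks at.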
Re: A Small Contribution to Phobos
On Sunday, 16 June 2013 at 13:39:35 UTC, irritate wrote: [SNIP] One concept with this is that the user can pass a flag to specify whether the function should be called on popFront (default) or on front. [SNIP] Thanks, irritate

What made you change the parameter from:
* pipeOnPop = false (i.e. call on front by default)
to
* pipeOnFront = false (i.e. call on pop by default)
?

I think pipe on front makes more sense, since you'll actually *see* the last value that was passed if the stream is terminated, e.g.:

    [1, 2, 3, 4].tee!`writeln("processing: ", a)`.until!"a > 2"();

Which will output:

    processing: 1
    processing: 2
    end

What about 3? Or:

    processing something A...
    processing something B...
    core dump...

(The stream was actually processing C, but we are fooled into investigating B...)

The *advantage* of pipeOnPop is that each element is piped at least once, and at most once, so that's good. However, it comes with pitfalls, which is why (IMO) it should be an explicit opt-in.
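The early-termination pitfall described above can be simulated by hand on a plain array. This is only an illustration of the two policies, not the TeeRange code from the pull request; observed and its parameters are hypothetical names.

```d
// Which elements does the hook observe when the consumer stops at the
// first element greater than `limit`?
int[] observed(bool pipeOnFront)(int[] data, int limit)
{
    int[] seen;
    while (data.length != 0)
    {
        static if (pipeOnFront)
            seen ~= data[0];        // hook fires when the consumer reads front
        if (data[0] > limit)
            break;                  // consumer stops without popping this element
        static if (!pipeOnFront)
            seen ~= data[0];        // hook fires as the element is popped
        data = data[1 .. $];
    }
    return seen;
}
```

With the input [1, 2, 3, 4] and a limit of 2, the front policy records 1, 2, 3 while the pop policy records only 1, 2, so the element that stopped the iteration is visible only in the first case.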
Re: A Small Contribution to Phobos
On Sunday, 16 June 2013 at 17:37:32 UTC, monarch_dodra wrote: What made you change the parameter of : * pipeOnPop = false (eg call on front by default) to * pipeOnFront = false (eg call on pop by default) ? Actually the original version was pipeOnPop = true by default. So I didn't change the logic, I just renamed the variable to make the flag more clear that you would pass in pipeOnFront.yes to opt-in. (Also to coincide with your comment on the pull request). I think pipe on front makes more sense, since you'll actually *see* the last value that was passed if the stream is terminated, eg: [1, 2, 3, 4].tee!`writeln(processing: , a).until!a 2(); Which will output: processing: 1 processing: 2 end what about 3? . . . The *advantage* of pipeOnPop is that each element is piped at least once, and at most once, so that's good. I think they both have their advantages, which is why it's probably important to be able to control the behavior regardless of which one is default. I choose pipeOnPop as the default because: 1) It more closely tied in to the idea of tapping into the data as the wrapped range is iterated over (i.e. calling front multiple times won't call the function multiple times, as you said). 2) I felt like I would personally use pipeOnPop more often, and figured the most commonly used case should not require the flag. But I'm not especially tied to it, and could see making pipeOnFront default if that is preferred. And actually, as I think about what I just wrote and also my previous unittest example, it almost feels like pipeOnPop gives you insight into the wrapped range itself, and pipeOnFront gives you more insight into how the range is used. The incoming vs. the outgoing, as it were. And I imagine I'd like to know more about what is coming in more of the time, but that's just my opinion. irritate
Re: A Small Contribution to Phobos
On Monday, June 17, 2013 03:41:45 irritate wrote: I think they both have their advantages, which is why it's probably important to be able to control the behavior regardless of which one is default. I choose pipeOnPop as the default because: 1) It more closely tied in to the idea of tapping into the data as the wrapped range is iterated over (i.e. calling front multiple times won't call the function multiple times, as you said). 2) I felt like I would personally use pipeOnPop more often, and figured the most commonly used case should not require the flag. But I'm not especially tied to it, and could see making pipeOnFront default if that is preferred. And actually, as I think about what I just wrote and also my previous unittest example, it almost feels like pipeOnPop gives you insight into the wrapped range itself, and pipeOnFront gives you more insight into how the range is used. The incoming vs. the outgoing, as it were. And I imagine I'd like to know more about what is coming in more of the time, but that's just my opinion. I would think that pipeOnPop would be better by default simply because it's the more efficient thing to do, especially if it's not clear which the programmer is more likely to want in the general case. - Jonathan M Davis
Re: A Small Contribution to Phobos
On Monday, 3 June 2013 at 02:31:00 UTC, Andrei Alexandrescu wrote: On 6/2/13 2:43 PM, monarch_dodra wrote: I think I just had a good idea. First, we introduce cached: cached will take the result of front, but only evaluate it once. This is a good idea in and of itself, and should take the place of .array() in UFCS chains.

Yah, cached() (better cache()?) should be nice. It may also offer lookahead, e.g. cache(5) would offer a non-standard lookahead(size_t n) up to 5 elements ahead.

Hum... That'd be a whole different ballpark in terms of power, as opposed to the simple-minded cached I had in mind. But I think both can coexist anyway, so I see no problem with adding extra functionality.

From there, tee is nothing more than a range that calls fun on the front element every time front is called, then returns front. From there, users can use either of:

MyRange.tee!foo(): This calls foo on every front element, and several times if front gets called several times.
MyRange.tee!foo().cached(): This calls foo on every front element, but only once, and guaranteed at least once, if it gets iterated.

I kinda dislike that tee() is hardly useful without cache. Andrei

I disagree. One thing a user could expect out of tee is to print on every access, just to see which elements get pushed down the pipe, and in which order, as opposed to just "print my range". In particular, I don't see why tee would not mix with random access. For example, with this program:

    auto r = [4, 3, 2, 1].tee!writeln();
    writeln("first sort (not sorted)");
    r.sort();
    writeln("second sort (already sorted)");
    r.sort();

I can see the output as:

    first sort (not sorted)
    2 1 1 2 1 3 1 1 3 2 2 1 2 4 1 1 4 2 2 1 3 3 2 3 2 1 3 2 4 3
    second sort (already sorted)
    3 4 3 2 3 2 1 2 1 2 1 3 2 4 3

which gives me a good idea of how costly the sort algorithm is. It's a good way to find out if cache(d) or array should be inserted in my chain.
Re: A Small Contribution to Phobos
On Sunday, 2 June 2013 at 04:10:15 UTC, Jonathan M Davis wrote: On Sunday, June 02, 2013 04:57:53 Meta wrote:

The callable bit won't work. It'll just call front. You'd have to do something like static if(isCallable!(ElementType!R)) r.front()(); Also, if front were pure, then calling it and doing nothing with its return value would result in a compilation error. The same goes if the element type is a pure callable.

Calling front is kind of the point of exhaust(), otherwise you'd use takeNone(). You wouldn't use this if front were pure, because the only reason you'd want exhaust is if you were (ab)using side effects (like I was the other day on D.learn). Having it error out if you were using it on a range with a pure front() is actually a good thing, because you've made some error in your reasoning if you think you want exhaust() to run in that situation. processSideEffects() is probably too long of a name.

And even if this did work exactly as you intended, I think that assuming that someone exhausting the range would want what front returns to be called is a bad idea. Maybe they do, maybe they don't; I'd expect that in most cases, they wouldn't. If that's what they want, they can call map before calling exhaust.

Sticking a map before exhaust without it calling front() would accomplish nothing. I know this because my own little toy eat() originally just called popFront() on a Map range and nothing happened. You'd be skipping map's function if you don't call front.
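A rough sketch of how the points in this exchange could fit together: evaluate front on every step, call the element only when isCallable says it is callable, and discard an unused result explicitly. This version of exhaust is hypothetical (it returns void rather than the range), and countMapCalls and countElementCalls are illustrative helpers.

```d
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;
import std.traits : isCallable;

// Hypothetical variant of exhaust: evaluates front for every element and,
// if the element itself is callable, invokes it as well.
void exhaust(Range)(Range r)
if (isInputRange!Range)
{
    while (!r.empty)
    {
        static if (isCallable!(ElementType!Range))
        {
            auto f = r.front;       // read the element...
            f();                    // ...then call it
        }
        else
        {
            cast(void) r.front;     // evaluate front and discard the result
        }
        r.popFront();
    }
}

// Counts how often map's function runs while exhaust drains the range.
int countMapCalls()
{
    import std.algorithm.iteration : map;

    int calls = 0;
    [1, 2, 3].map!((n) { ++calls; return n * n; }).exhaust;
    return calls;
}

// Counts how often callable elements are invoked.
int countElementCalls()
{
    int hits = 0;
    void delegate() bump = () { ++hits; };
    [bump, bump].exhaust;
    return hits;
}
```

Because front is read on every iteration, the function passed to map runs once per element, which is the behaviour that a popFront-only loop would skip.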
Re: A Small Contribution to Phobos
For reference type ranges and input ranges which are not forward ranges, this will consume the range and return nothing.

I originally wrote it to accept forward ranges and use save, but I wanted to make it as inclusive as possible. I guess I overlooked the case of ref ranges. As for ranges that aren't forward ranges, consider a simple input range.

    struct InputRange
    {
        int[] arr = [1, 2, 3, 4, 5];
        int front() { return arr.front; }
        bool empty() { return arr.empty; }
        void popFront() { arr.popFront(); }
    }

    writeln(isForwardRange!InputRange); //False

    InputRange()
    .each!(n => write(n, " "))
    .map!(n => n * n)
    .writeln;

This outputs 1 2 3 4 5 [1, 4, 9, 16, 25], so each is not returning an empty range. I believe this is because r in this case is a value type range, and the foreach loop makes a copy of it. This does still leave the problem of reference type ranges.

Also, range-based functions should not be strict (i.e. not lazy) without good reason. And I don't see much reason to make this strict.

It's not lazy because it's intended to perform some mutating or otherwise side-effectful operation. Map doesn't play well with side effects, partially because of its laziness. A very contrived example:

    auto arr = [1, 2, 3, 4].map!(n => n.writeln); //Now what?

It's not clear now what to do with the result. You could try a foreach loop:

    foreach (n; arr) n(); //Error: n cannot be of type void

But that doesn't work. A solution would be to modify the function you pass to map:

    auto arr = [1, 2, 3, 4].map!((n) { n.writeln; return n; });
    foreach (n; arr) {} //Prints 1 2 3 4

But that's both ugly and verbose. each also has the advantage of being able to return the original range (possibly modified), whereas map must return a MapResult due to its laziness, and you need that extra array call to bludgeon it into the correct form. each is also more efficient in that it doesn't need to return a copy of the data passed to it. It simply mutates it in-place.

Also, it's almost the same thing as map.
Why not just use map? The predicate can simply return the same value after it's operated on it. See above. There are some cases where map is clunky to work with due to it being non-strict. If we did add this, I'd argue that transform is a better name, but I'm still inclined to think that it's not worth adding. I chose the name each because it's a common idiom in a couple of other languages (Javascript, Ruby and Rust off the top of my head), and because I think it underlines the fact that each is meant to perform side-effectful operations. exhaust iterates a range until it is exhausted. It also has the nice feature that if range.front is callable, exhaust will call it upon each iteration. Range exhaust(Range)(Range r) if (isInputRange!(Unqual!Range)) { while (!r.empty) { r.front(); r.popFront(); } return r; } //Writes www.dlang.org. x is an empty MapResult range. auto x = www.dlang.org .map!((c) { c.write; return false; }) .exhaust; //Prints [] [1, 2, 3].exhaust.writeln; The callable bit won't work. It'll just call front. You'd have to do something like static if(isCallable!(ElementType!R)) r.front()(); I was having some trouble with writing exhaust and forgot all about ElementType. I'll change that. Also, if front were pure, then calling it and doing nothing with its return value would result in a compilation error. The same goes if the element type is a pure callable. Is this true for all pure functions? That seems like kind of strange behaviour to me, and doesn't really make sense given the definition of functional purity. And even if this did work exactly as you intended. I think that assuming that someone exhausting the range would would what front returns to be called is a bad idea. Maybe they do, maybe they don't, I'd expect that in most cases, they wouldn't. If that's what they want, they can call map before calling exhaust. 
I think the original reason that somebody wanted exhaust was because map is lazy and they wanted a function which could take the result of map and consume it while calling front each time. Otherwise, there wouldn't be much reason to have this, as there is takeNone and popFrontN. So, you want to have a function which you pass something (including a range) and then returns that same value after calling some other function? Does this really buy you much over just splitting up the expression - you're already giving a multline example anyway. It gives you the advantage of not having to split your UFCS chain up, which I personally find valuable, and I think other people would as well. I think it's quite similar to the various side-effectful monads in Haskell, which don't do anything with their argument other than return it, but perform some operation with side-effects in the process. I'll try to think up a better example for this, because I think
Re: A Small Contribution to Phobos
Meta: perform is pretty badly named, but I couldn't come up with a better one. It can be inserted in a UFCS chain and perform some operation with side-effects. It doesn't alter its argument, just returns it for the next function in the chain.

    T perform(alias dg, T)(ref T val)
    {
        dg();
        return val;
    }

    //Prints "Mapped: 2 4"
    [1, 2, 3, 4, 5]
    .filter!(n => n < 3)
    .map!(n => n * n)
    .perform!({write("Mapped: ");})
    .each!(n => write(n, " "));

I'd like something like this in Phobos, but I'd like it to have a better name. But in most (all?) cases what I want to put inside such a perform is a printing function, so I have opened this: http://d.puremagic.com/issues/show_bug.cgi?id=9882

exhaust iterates a range until it is exhausted. It also has the nice feature that if range.front is callable, exhaust will call it upon each iteration.

    Range exhaust(Range)(Range r)
    if (isInputRange!(Unqual!Range))
    {
        while (!r.empty)
        {
            r.front();
            r.popFront();
        }
        return r;
    }

    //Writes "www.dlang.org". x is an empty MapResult range.
    auto x = "www.dlang.org"
             .map!((c) { c.write; return false; })
             .exhaust;

    //Prints []
    [1, 2, 3].exhaust.writeln;

I'd also like this in Phobos, for debugging purposes. But I'd like it to return nothing, so you are forced to use it only at the end of a chain. (So I appreciate 2 of your 4 proposals. I proposed both of them in D.learn some time ago.)

Brad Anderson: Andrei didn't care for the tap() you propose but loved the idea of a tap() function that works like unix tee.

Something like this Python itertool?

    import collections

    def tee(iterable, n=2):
        it = iter(iterable)
        deques = [collections.deque() for i in range(n)]
        def gen(mydeque):
            while True:
                if not mydeque:             # when the local deque is empty
                    newval = next(it)       # fetch a new value and
                    for d in deques:        # load it to all the deques
                        d.append(newval)
                yield mydeque.popleft()
        return tuple(gen(d) for d in deques)

Bye, bearophile
Re: A Small Contribution to Phobos
On 6/2/13 1:58 AM, Meta wrote: For reference type ranges and input ranges which are not forward ranges, this will consume the range and return nothing. I originally wrote it to accept forward ranges and use save, but I wanted to make it as inclusive as possible. I guess I overlooked the case of ref ranges. [snip] Thanks for sharing your ideas. I think consuming all of a range evaluating front and doing nothing should be the role of reduce with only one parameter (the range). That overload would take the range to be exhausted and return void. Thus your example becomes: [1, 2, 3, 4].map!(n = n.writeln).reduce; Andrei
Re: A Small Contribution to Phobos
Andrei Alexandrescu: [1, 2, 3, 4].map!(n => n.writeln).reduce;

I have to shoot this down for many reasons:

I think it's better to give that final function a different name (like consume or something like that) because it's used for very different purposes and it returns nothing. Re-using the name reduce doesn't reduce the number of lines of code in Phobos, it doesn't make the user code simpler to understand, it's more obscure because it's more semantically overloaded, and it's no easier to find in the documentation for future D users.

Function names are not language keywords; packing different purposes into the same name, as was done with static, doesn't give any advantage, only disadvantages.

And using map with a lambda that returns nothing is not a style I like :-( It's probably better to encourage D programmers to give pure lambdas to map/filter, for several reasons (safety, cleanness, code style, future D front-end optimizations done on those higher order functions, better debuggability, and avoiding Phobos bugs like http://d.puremagic.com/issues/show_bug.cgi?id=9674 ).

So I think it's better to introduce a new Phobos function like tap() that accepts a function/delegate with side effects that takes no input arguments.

Bye, bearophile
Re: A Small Contribution to Phobos
I think consuming all of a range evaluating front and doing nothing should be the role of reduce with only one parameter (the range). That overload would take the range to be exhausted and return void. Thus your example becomes: Maybe, then, it would be best to have a template that calls reduce in such a way, that makes it perfectly clear what is happening.
Re: A Small Contribution to Phobos
On Sunday, 2 June 2013 at 13:07:18 UTC, Andrei Alexandrescu wrote: On 6/2/13 1:58 AM, Meta wrote: For reference type ranges and input ranges which are not forward ranges, this will consume the range and return nothing. I originally wrote it to accept forward ranges and use save, but I wanted to make it as inclusive as possible. I guess I overlooked the case of ref ranges. [snip] Thanks for sharing your ideas. I think consuming all of a range evaluating front and doing nothing should be the role of reduce with only one parameter (the range). That overload would take the range to be exhausted and return void. Thus your example becomes: [1, 2, 3, 4].map!(n = n.writeln).reduce; Andrei One of the problems with using map for something such as this, is that the resulting object is not a range, since front now returns void, and a range *must* return a value. So that code will never compile (since reduce will ask for at least input range). Heck, I think we should make it so that map refuses to compile with an operator that returns void. It doesn't make much sense as-is. Usage has to be something like: map!((n) {n.writeln; return n;}) which is quite clunky. The idea of a tee range, that takes n, runs an operation on it, and then returns said n as is becomes really very useful (and more idiomatic). [1, 2, 3, 4].tee!(n = n.writeln). There! perfect :) I've dabbled in implementing such a function, but there are conceptual problems: If the user calls front twice in a row, then should fun be called twice? If user popsFront without calling front, should fun be called at all? Should it keep track of calls, to guarantee 1, and only 1, call on each element? I'm not sure there is a correct answer to that, which is one of the reasons I haven't actually submitted anything. I don't think argument-less reduce should do what you describe, as it would be a bit confusing what the function does. 1-names; 1-operation, IMO. 
Users might accidentally think they are getting an additive reduction :( I think a function called walk, in line with walkLength, would be much more appropriate, and make more sense to boot! But we run into the same problem... Should walk call front between each element? Both answers are correct, IMO.
Re: A Small Contribution to Phobos
On 6/2/13 9:20 AM, bearophile wrote: Andrei Alexandrescu: [1, 2, 3, 4].map!(n = n.writeln).reduce; I have to shot this down for many reasons: I think it's better to give that final function a different name (like consume or something like that) because it's used for very different purposes and it returns nothing. Re-using the name reduce doesn't reduce the amount of Phobos lines of code, it doesn't make the user code simpler to understand, it's more obscure because it's more semantically overloaded, and it's not more easy to find in the documentation by the future D users. Au contraire, there are many advantages. Using reduce leverages a well-understood notion instead of introducing a new one. There is less need for documentation, motivation, and explanations. Reduce with no function simply spans the entire range. Builds on an already-eager construct par excellence instead of adding a new one that must be remembered and distinguished from the lazy constructs. Actually my first thought when I saw consume() was to look up reduce, thinking, how do I reduce a range to nothing? Because that's the goal. Reduce is the obvious choice here. Function names are not language keywords, packing different purposes in the same name as static doesn't give any advantage, and only disadvantages. Strawman argument. Andrei
Re: A Small Contribution to Phobos
On Sunday, 2 June 2013 at 13:07:18 UTC, Andrei Alexandrescu wrote: On 6/2/13 1:58 AM, Meta wrote: For reference type ranges and input ranges which are not forward ranges, this will consume the range and return nothing. I originally wrote it to accept forward ranges and use save, but I wanted to make it as inclusive as possible. I guess I overlooked the case of ref ranges. [snip] Thanks for sharing your ideas. I think consuming all of a range evaluating front and doing nothing should be the role of reduce with only one parameter (the range). That overload would take the range to be exhausted and return void. Thus your example becomes: [1, 2, 3, 4].map!(n = n.writeln).reduce; Andrei map being lazy, this can really do all kind of different stuff.
Re: A Small Contribution to Phobos
On 6/2/13 11:41 AM, monarch_dodra wrote: On Sunday, 2 June 2013 at 13:07:18 UTC, Andrei Alexandrescu wrote: [1, 2, 3, 4].map!(n = n.writeln).reduce; Andrei One of the problems with using map for something such as this, is that the resulting object is not a range, since front now returns void, and a range *must* return a value. So that code will never compile (since reduce will ask for at least input range). Heck, I think we should make it so that map refuses to compile with an operator that returns void. It doesn't make much sense as-is. Hm, interesting. I'm destroyed. Usage has to be something like: map!((n) {n.writeln; return n;}) which is quite clunky. The idea of a tee range, that takes n, runs an operation on it, and then returns said n as is becomes really very useful (and more idiomatic). [1, 2, 3, 4].tee!(n = n.writeln). There! perfect :) I've dabbled in implementing such a function, but there are conceptual problems: If the user calls front twice in a row, then should fun be called twice? If user popsFront without calling front, should fun be called at all? Should it keep track of calls, to guarantee 1, and only 1, call on each element? I'm not sure there is a correct answer to that, which is one of the reasons I haven't actually submitted anything. I think there is one answer that arguably narrows the design space appropriately: just like the Unix utility, tee should provide a hook that creates an exact replica of the (portion of the) range being iterated. So calling front several times is nicely out of the picture. The remaining tactical options are: 1. evaluate .front for the parent range once in its constructor and then every time right after forwarding popFront() to the parent range. This is a bit eager because the constructor evaluates .front even if the client never does. 2. evaluate .front for the parent range just before forwarding popFront() to parent. This will call front even though the client doesn't (which I think is fine). 3. 
keep a bool that is set by constructor and popFront() and reset by front(). The bool makes sure front() is called if and only if the client calls it. I started writing the options mechanically without thinking of the implications. Now that I'm done, I think 2 is by far the best. I don't think argument-less reduce should do what you describe, as it would be a bit confusing what the function does. 1-names; 1-operation, IMO. Users might accidentally think they are getting an additive reduction :( Good point. I think a function called walk, in line with walkLength, would be much more appropriate, and make more sense to boot! But we run into the same problem... Should walk call front between each element? Both answers are correct, IMO. That's why I'm thinking: the moment .front gets evaluated, we get into the realm of reduce. Andrei
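Option 2 from the list above can be sketched roughly as follows. TeeOnPop, teeOnPop and hookedElements are hypothetical names used only for illustration; this is a minimal wrapper under those assumptions, not the implementation from any pull request.

```d
import std.range.primitives : empty, front, isInputRange, popFront;

// Forwards the input range unchanged and hands each element to `hook`
// just before it is popped.
struct TeeOnPop(alias hook, Range)
if (isInputRange!Range)
{
    private Range source;

    this(Range r) { source = r; }

    @property bool empty() { return source.empty; }
    @property auto front() { return source.front; }

    void popFront()
    {
        hook(source.front);   // option 2: evaluate front right before popping
        source.popFront();
    }
}

auto teeOnPop(alias hook, Range)(Range r)
{
    return TeeOnPop!(hook, Range)(r);
}

// Reads front twice for the first element and never again; the hook
// still sees every element exactly once.
int[] hookedElements()
{
    int[] seen;
    auto r = [1, 2, 3, 4].teeOnPop!(n => seen ~= n);
    auto first = r.front;
    first = r.front;          // repeated front calls do not trigger the hook
    while (!r.empty)
        r.popFront();
    return seen;
}
```

The cost of this choice is that the parent's front is evaluated even when the client never asks for it, which is the trade-off accepted in the post above.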
Re: A Small Contribution to Phobos
On Sunday, 2 June 2013 at 16:55:23 UTC, Andrei Alexandrescu wrote: On 6/2/13 11:41 AM, monarch_dodra wrote: On Sunday, 2 June 2013 at 13:07:18 UTC, Andrei Alexandrescu wrote: [1, 2, 3, 4].map!(n = n.writeln).reduce; Andrei One of the problems with using map for something such as this, is that the resulting object is not a range, since front now returns void, and a range *must* return a value. So that code will never compile (since reduce will ask for at least input range). Heck, I think we should make it so that map refuses to compile with an operator that returns void. It doesn't make much sense as-is. Hm, interesting. I'm destroyed. Usage has to be something like: map!((n) {n.writeln; return n;}) which is quite clunky. The idea of a tee range, that takes n, runs an operation on it, and then returns said n as is becomes really very useful (and more idiomatic). [1, 2, 3, 4].tee!(n = n.writeln). There! perfect :) I've dabbled in implementing such a function, but there are conceptual problems: If the user calls front twice in a row, then should fun be called twice? If user popsFront without calling front, should fun be called at all? Should it keep track of calls, to guarantee 1, and only 1, call on each element? I'm not sure there is a correct answer to that, which is one of the reasons I haven't actually submitted anything. I think there is one answer that arguably narrows the design space appropriately: just like the Unix utility, tee should provide a hook that creates an exact replica of the (portion of the) range being iterated. So calling front several times is nicely out of the picture. The remaining tactical options are: 1. evaluate .front for the parent range once in its constructor and then every time right after forwarding popFront() to the parent range. This is a bit eager because the constructor evaluates .front even if the client never does. 2. evaluate .front for the parent range just before forwarding popFront() to parent. 
This will call front even though the client doesn't (which I think is fine). 3. keep a bool that is set by constructor and popFront() and reset by front(). The bool makes sure front() is called if and only if the client calls it. I started writing the options mechanically without thinking of the implications. Now that I'm done, I think 2 is by far the best. I think I just had a good idea. First, we introduce cached: cached will take the result of front, but only evaluate it once. This is a good idea in and out of itself, and should take the place of .array() in UFCS chains. It can store the result of an operation, but keeps the lazy iteration semantic. That's a win for functional programming right there. It would be most convenient right after an expansive call, such as after a map or whatnot. The semantic of cached would be: eagerly calls front once, always once, and exactly once, and stores the result. Calling front on cached returns said result. calling popFront repeats operation. From there, tee, is nothing more than calls funs on the front element every time front is called, then returns front. From there, users can user either of: MyRange.tee!foo(): This calls foo on every front element, and several times is front gets called several times. MyRange.tee!foo().cached(): This calls foo on every front element, but only once, and guaranteed at least once, if it gets iterated. I don't think argument-less reduce should do what you describe, as it would be a bit confusing what the function does. 1-names; 1-operation, IMO. Users might accidentally think they are getting an additive reduction :( Good point. I think a function called walk, in line with walkLength, would be much more appropriate, and make more sense to boot! But we run into the same problem... Should walk call front between each element? Both answers are correct, IMO. That's why I'm thinking: the moment .front gets evaluated, we get into the realm of reduce. 
Combined with my cached proposal, the problem is solved I think: walk does not call front, it merely pops. But, if combined with cached, then cache *will*, call front. Once and exactly once. This will call foo on all elements of my range (once exactly once): MyRange.tee!foo().cached().walk(); Unless I'm missing something, it looks like a sweet spot between functionality, modularity, and even efficiency...?
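To make the cached-plus-walk proposal concrete, here is a minimal sketch under the semantics stated in this post: cached evaluates front eagerly, exactly once per element, and walk only pops. Cached, cached, walk and evaluations are hypothetical names; the struct assumes an element type that can be assigned.

```d
import std.range.primitives : ElementType, empty, front, isInputRange, popFront;

struct Cached(Range)
if (isInputRange!Range)
{
    private Range source;
    private ElementType!Range value;
    private bool exhausted;

    this(Range r)
    {
        source = r;
        load();
    }

    private void load()
    {
        exhausted = source.empty;
        if (!exhausted)
            value = source.front;   // the only place the source's front is read
    }

    @property bool empty() { return exhausted; }
    @property ElementType!Range front() { return value; }

    void popFront()
    {
        source.popFront();
        load();
    }
}

auto cached(Range)(Range r) { return Cached!Range(r); }

// Pops to the end without ever touching front.
void walk(Range)(Range r)
{
    while (!r.empty)
        r.popFront();
}

// Counts evaluations of map's function: [walk alone, cached then walk].
int[2] evaluations()
{
    import std.algorithm.iteration : map;

    int plain = 0;
    [1, 2, 3].map!((n) { ++plain; return n; }).walk;

    int withCache = 0;
    [1, 2, 3].map!((n) { ++withCache; return n; }).cached.walk;

    return [plain, withCache];
}
```

Walking the bare map never runs the function, while inserting cached runs it once per element, which is the "once and exactly once" guarantee claimed above.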
Re: A Small Contribution to Phobos
On Sunday, 2 June 2013 at 18:43:44 UTC, monarch_dodra wrote:
> On Sunday, 2 June 2013 at 16:55:23 UTC, Andrei Alexandrescu wrote:
> On 6/2/13 11:41 AM, monarch_dodra wrote:
> On Sunday, 2 June 2013 at 13:07:18 UTC, Andrei Alexandrescu wrote:
>
>     [1, 2, 3, 4].map!(n => n.writeln).reduce;
>
> Andrei
>
> One of the problems with using map for something such as this is that the resulting object is not a range, since front now returns void, and a range *must* return a value. So that code will never compile (since reduce will ask for at least an input range). Heck, I think we should make it so that map refuses to compile with an operator that returns void. It doesn't make much sense as-is.
>
> Hm, interesting. I'm destroyed.
>
> Usage has to be something like:
>
>     map!((n) {n.writeln; return n;})
>
> which is quite clunky. The idea of a tee range, that takes n, runs an operation on it, and then returns said n as is becomes really very useful (and more idiomatic).
>
>     [1, 2, 3, 4].tee!(n => n.writeln)
>
> There! perfect :)
>
> I've dabbled in implementing such a function, but there are conceptual problems: If the user calls front twice in a row, then should fun be called twice? If the user calls popFront without calling front, should fun be called at all? Should it keep track of calls, to guarantee 1, and only 1, call on each element? I'm not sure there is a correct answer to that, which is one of the reasons I haven't actually submitted anything.
>
> I think there is one answer that arguably narrows the design space appropriately: just like the Unix utility, tee should provide a hook that creates an exact replica of the (portion of the) range being iterated. So calling front several times is nicely out of the picture. The remaining tactical options are:
>
> 1. evaluate .front for the parent range once in its constructor and then every time right after forwarding popFront() to the parent range. This is a bit eager because the constructor evaluates .front even if the client never does.
>
> 2. evaluate .front for the parent range just before forwarding popFront() to parent. This will call front even though the client doesn't (which I think is fine).
>
> 3. keep a bool that is set by constructor and popFront() and reset by front(). The bool makes sure front() is called if and only if the client calls it.
>
> I started writing the options mechanically without thinking of the implications. Now that I'm done, I think 2 is by far the best.
>
> I think I just had a good idea. First, we introduce cached: cached will take the result of front, but only evaluate it once. This is a good idea in and of itself, and should take the place of .array() in UFCS chains. It can store the result of an operation, but keeps the lazy iteration semantic. That's a win for functional programming right there. It would be most convenient right after an expensive call, such as after a map or whatnot.
>
> The semantic of cached would be: eagerly calls front once, always once, and exactly once, and stores the result. Calling front on cached returns said result. Calling popFront repeats the operation.
>
> From there, tee is nothing more than "calls fun on the front element every time front is called, then returns front". From there, users can use either of:
>
> MyRange.tee!foo(): This calls foo on every front element, and several times if front gets called several times.
>
> MyRange.tee!foo().cached(): This calls foo on every front element, but only once, and guaranteed at least once, if it gets iterated.
>
> I don't think argument-less reduce should do what you describe, as it would be a bit confusing what the function does. 1-names; 1-operation, IMO. Users might accidentally think they are getting an additive reduction :(
>
> Good point.
>
> I think a function called walk, in line with walkLength, would be much more appropriate, and make more sense to boot!
>
> But we run into the same problem... Should walk call front between each element? Both answers are correct, IMO.
>
> That's why I'm thinking: the moment .front gets evaluated, we get into the realm of reduce.
>
> Combined with my cached proposal, the problem is solved I think: walk does not call front, it merely pops. But, if combined with cached, then cache *will* call front. Once and exactly once. This will call foo on all elements of my range (once, exactly once):
>
>     MyRange.tee!foo().cached().walk();
>
> Unless I'm missing something, it looks like a sweet spot between functionality, modularity, and even efficiency...?

I like the idea of cached and it's certainly useful if you need to iterate a range multiple times or something like that, but I also think that 90% of the time the user is just going to want to do something simple such as printing every element, and I think the syntax

    tee!(x => writeln(x)).cached.walk();

is both unnecessarily long and less efficient than simply:

    consume!(x => writeln(x)); // Template parameter is optional

consume would always call front once
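For reference, here is a minimal sketch of the tee/walk split discussed in this post, assuming the "call fun on every front" semantic. The names teeOnFront and walk are hypothetical and written from scratch for illustration; they are not the code from any pull request.

```d
import std.range.primitives;

// Hypothetical wrapper: runs `fun` on the element every time `front` is read.
struct TeeOnFront(alias fun, R) if (isInputRange!R)
{
    R source;

    @property bool empty() { return source.empty; }

    @property auto front()
    {
        auto e = source.front;
        fun(e);              // the side effect fires on every read of front
        return e;
    }

    void popFront() { source.popFront(); }
}

auto teeOnFront(alias fun, R)(R r) if (isInputRange!R)
{
    return TeeOnFront!(fun, R)(r);
}

// Hypothetical walk: pops every element and never reads front.
void walk(R)(R r) if (isInputRange!R)
{
    while (!r.empty)
        r.popFront();
}

void main()
{
    int calls = 0;
    auto r = [1, 2, 3].teeOnFront!((int n) { calls++; });

    auto a = r.front;
    auto b = r.front;
    assert(a == 1 && b == 1);
    assert(calls == 2);      // two reads of front, two calls to fun

    calls = 0;
    [1, 2, 3].teeOnFront!((int n) { calls++; }).walk();
    assert(calls == 0);      // walk on its own never triggers fun
}
```

With this split, something has to read front for fun to run at all, which is why the proposal pairs tee with cached before walk.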
Re: A Small Contribution to Phobos
On 6/2/13 2:43 PM, monarch_dodra wrote:
> I think I just had a good idea. First, we introduce cached: cached will take the result of front, but only evaluate it once. This is a good idea in and of itself, and should take the place of .array() in UFCS chains.

Yah, cached() (better cache()?) should be nice. It may also offer lookahead, e.g. cache(5) would offer a non-standard lookahead(size_t n) up to 5 elements ahead.

> From there, tee is nothing more than "calls fun on the front element every time front is called, then returns front". From there, users can use either of:
>
> MyRange.tee!foo(): This calls foo on every front element, and several times if front gets called several times.
>
> MyRange.tee!foo().cached(): This calls foo on every front element, but only once, and guaranteed at least once, if it gets iterated.

I kinda dislike that tee() is hardly useful without cache.

Andrei
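A rough sketch of the cached semantic described above ("eagerly calls front once, always once, and exactly once, and stores the result") might look like the following. Cached and cached are hypothetical names for illustration, and the lookahead idea is not covered.

```d
import std.algorithm.iteration : map;
import std.range.primitives;

// Hypothetical wrapper: reads source.front exactly once per element
// (eagerly) and serves the stored value on every later call to front.
struct Cached(R) if (isInputRange!R)
{
    private R source;
    private ElementType!R current;
    private bool done;

    this(R r)
    {
        source = r;
        fetch();             // eager: the first element is evaluated here
    }

    private void fetch()
    {
        done = source.empty;
        if (!done)
            current = source.front;   // the only place source.front is read
    }

    @property bool empty() { return done; }
    @property ElementType!R front() { return current; }
    void popFront() { source.popFront(); fetch(); }
}

auto cached(R)(R r) if (isInputRange!R)
{
    return Cached!R(r);
}

void main()
{
    int reads = 0;
    auto r = [1, 2, 3].map!((int n) { reads++; return n * 10; }).cached;

    assert(reads == 1);      // the constructor already evaluated element one
    assert(r.front == 10);
    assert(r.front == 10);
    assert(reads == 1);      // repeated front does not re-run the map function

    r.popFront();
    assert(reads == 2);
    assert(r.front == 20);
}
```

The cost of this design is the eagerness: the first element is evaluated even if the caller never asks for it.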
A Small Contribution to Phobos
I saw a thread a few days ago about somebody wanting a few UFCS-based convenience functions, so I thought that I'd take the opportunity to make a small contribution to Phobos. Currently I have four small functions: each, exhaust, perform, and tap, and would like some feedback.

each is designed to perform operations with side-effects on each range element. To actually change the elements of the range, each element must be accepted by reference.

    Range each(alias fun, Range)(Range r)
    if (isInputRange!(Unqual!Range))
    {
        alias unaryFun!fun _fun;
        foreach (ref e; r)
        {
            fun(e);
        }
        return r;
    }

    //Prints [-1, 0, 1]
    [1, 2, 3].each!((ref i) => i -= 2).writeln;

exhaust iterates a range until it is exhausted. It also has the nice feature that if range.front is callable, exhaust will call it upon each iteration.

    Range exhaust(Range)(Range r)
    if (isInputRange!(Unqual!Range))
    {
        while (!r.empty)
        {
            r.front();
            r.popFront();
        }
        return r;
    }

    //Writes www.dlang.org. x is an empty MapResult range.
    auto x = "www.dlang.org"
             .map!((c) { c.write; return false; })
             .exhaust;

    //Prints []
    [1, 2, 3].exhaust.writeln;

perform is pretty badly named, but I couldn't come up with a better one. It can be inserted in a UFCS chain and perform some operation with side-effects. It doesn't alter its argument, just returns it for the next function in the chain.

    T perform(alias dg, T)(ref T val)
    {
        dg();
        return val;
    }

    //Prints Mapped: 2 4
    [1, 2, 3, 4, 5]
    .filter!(n => n < 3)
    .map!(n => n * n)
    .perform!({write("Mapped: ");})
    .each!(n => write(n, " "));

Lastly is tap, which takes a value and performs some mutating operation on it. It then returns the value.

    T tap(alias dg, T)(auto ref T val)
    {
        dg(val);
        return val;
    }

    class Foo { int x; int y; }
    auto f = (new Foo).tap!((f) { f.x = 2; f.y = 3; });
    //Prints 2 3
    writeln(f.x, " ", f.y);

    struct Foo2 { int x; int y; }
    //Need to use ref for value types
    auto f2 = Foo2().tap!((ref f) { f.x = 3; f.y = 2; });
    //Prints 3 2
    writeln(f2.x, " ", f2.y);

Do you think these small functions have a place in Phobos? I think each and exhaust would be best put into std.range, but I'm not quite sure where perform and tap should go. Also, there's that horrible name for perform, for which I would like to come up with a better name.
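To make the by-reference point about each concrete, here is a small self-contained sketch. It is renamed eachRef (a hypothetical name) and only exercised on arrays, where foreach (ref ...) is known to work; it is an illustration, not a proposed final implementation.

```d
import std.range.primitives;

// Hypothetical variant of the each shown above. It is only used with
// arrays here: foreach (ref e; r) needs elements that can be taken by ref.
R eachRef(alias fun, R)(R r) if (isInputRange!R)
{
    foreach (ref e; r)
        fun(e);
    return r;
}

void main()
{
    auto a = [1, 2, 3];

    // By reference: the array elements themselves are modified.
    a.eachRef!((ref int i) { i -= 2; });
    assert(a == [-1, 0, 1]);

    // By value: the callback only changes its own copy.
    a.eachRef!((int i) { i -= 2; });
    assert(a == [-1, 0, 1]);
}
```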
Re: A Small Contribution to Phobos
On Sunday, June 02, 2013 04:57:53 Meta wrote:
> I saw a thread a few days ago about somebody wanting a few UFCS-based convenience functions, so I thought that I'd take the opportunity to make a small contribution to Phobos. Currently I have four small functions: each, exhaust, perform, and tap, and would like some feedback.
>
> each is designed to perform operations with side-effects on each range element. To actually change the elements of the range, each element must be accepted by reference.
>
>     Range each(alias fun, Range)(Range r)
>     if (isInputRange!(Unqual!Range))
>     {
>         alias unaryFun!fun _fun;
>         foreach (ref e; r)
>         {
>             fun(e);
>         }
>         return r;
>     }
>
>     //Prints [-1, 0, 1]
>     [1, 2, 3].each!((ref i) => i -= 2).writeln;

For reference type ranges and input ranges which are not forward ranges, this will consume the range and return nothing. It would have to accept only forward ranges and save the result before iterating over it. Also, range-based functions should not be strict (i.e. not lazy) without good reason. And I don't see much reason to make this strict.

Also, it's almost the same thing as map. Why not just use map? The predicate can simply return the same value after it has operated on it.

If we did add this, I'd argue that transform is a better name, but I'm still inclined to think that it's not worth adding.

> exhaust iterates a range until it is exhausted. It also has the nice feature that if range.front is callable, exhaust will call it upon each iteration.
>
>     Range exhaust(Range)(Range r)
>     if (isInputRange!(Unqual!Range))
>     {
>         while (!r.empty)
>         {
>             r.front();
>             r.popFront();
>         }
>         return r;
>     }
>
>     //Writes www.dlang.org. x is an empty MapResult range.
>     auto x = "www.dlang.org"
>              .map!((c) { c.write; return false; })
>              .exhaust;
>
>     //Prints []
>     [1, 2, 3].exhaust.writeln;

The callable bit won't work. It'll just call front. You'd have to do something like

    static if(isCallable!(ElementType!R))
        r.front()();

Also, if front were pure, then calling it and doing nothing with its return value would result in a compilation error. The same goes if the element type is a pure callable.

And even if this did work exactly as you intended, I think that assuming that someone exhausting the range would want what front returns to be called is a bad idea. Maybe they do, maybe they don't. I'd expect that in most cases, they wouldn't. If that's what they want, they can call map before calling exhaust.

> perform is pretty badly named, but I couldn't come up with a better one. It can be inserted in a UFCS chain and perform some operation with side-effects. It doesn't alter its argument, just returns it for the next function in the chain.
>
>     T perform(alias dg, T)(ref T val)
>     {
>         dg();
>         return val;
>     }
>
>     //Prints Mapped: 2 4
>     [1, 2, 3, 4, 5]
>     .filter!(n => n < 3)
>     .map!(n => n * n)
>     .perform!({write("Mapped: ");})
>     .each!(n => write(n, " "));

So, you want to have a function which you pass something (including a range) and which then returns that same value after calling some other function? Does this really buy you much over just splitting up the expression? You're already giving a multi-line example anyway.

    auto foo = [1, 2, 3, 4, 4].filter!(n => n < 3)().map!(n => n * n)();
    write("Mapped: ");
    foo.each!(n => write(n, " "))();

And I think that this is a perfect example of something that should just be done with foreach anyway. Not to mention, if you're calling very many functions, you're going to need to use multiple lines, in which case chaining the functions like that doesn't buy you much. All you end up doing is taking what would normally be a sequence of statements and turning it into one multi-line statement. I don't think that this buys us much, especially when it's just calling one function which does nothing on any object in the chain.

> Lastly is tap, which takes a value and performs some mutating operation on it. It then returns the value.
>
>     T tap(alias dg, T)(auto ref T val)
>     {
>         dg(val);
>         return val;
>     }
>
>     class Foo { int x; int y; }
>     auto f = (new Foo).tap!((f) { f.x = 2; f.y = 3; });
>     //Prints 2 3
>     writeln(f.x, " ", f.y);
>
>     struct Foo2 { int x; int y; }
>     //Need to use ref for value types
>     auto f2 = Foo2().tap!((ref f) { f.x = 3; f.y = 2; });
>     //Prints 3 2
>     writeln(f2.x, " ", f2.y);

Why do you need tap? So that you can use an anonymous function? If it had a name, you'd just use it with UFCS. I'd argue that this use case is minimal enough that you might as well just give it a name and then use UFCS if you really want to use UFCS, and if you want an anonymous function, what's the real gain of chaining it with UFCS anyway? It makes the expression much harder to read if you try and chain calls on the anonymous function. UFCS' main purpose is making it so that a function can be called on multiple types in the same manner
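The static if suggestion for exhaust made earlier in this reply can be sketched as below. exhaustCalling is a hypothetical name, and invoking callable elements is shown only to illustrate the mechanism, not to argue that exhaust should behave this way.

```d
import std.range.primitives;
import std.traits : isCallable;

// Hypothetical variant of exhaust: invokes each element only when the
// element type itself is callable, otherwise it just pops.
R exhaustCalling(R)(R r) if (isInputRange!R)
{
    for (; !r.empty; r.popFront())
    {
        static if (isCallable!(ElementType!R))
            r.front()();     // first () reads front, second () calls the element
    }
    return r;
}

void main()
{
    int hits = 0;
    void delegate()[] actions;
    actions ~= () { hits++; };
    actions ~= () { hits += 10; };

    auto rest = actions.exhaustCalling;
    assert(hits == 11);      // both delegates ran once
    assert(rest.empty);

    // Non-callable elements: the range is simply drained.
    assert([1, 2, 3].exhaustCalling.empty);
}
```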
Re: A Small Contribution to Phobos
On Sunday, 2 June 2013 at 02:57:56 UTC, Meta wrote:
> [SNIP]
>
> Do you think these small functions have a place in Phobos? I think each and exhaust would be best put into std.range, but I'm not quite sure where perform and tap should go. Also, there's that horrible name for perform, for which I would like to come up with a better name.

You may find this forum discussion from several months ago interesting. http://forum.dlang.org/post/kglo9d$rjf$1...@digitalmars.com

Confusingly, your each() seems to be fairly similar to what Andrei wanted tap() used for. Andrei didn't care for the tap() you propose but loved the idea of a tap() function that works like Unix tee.

I like exhaust() as I just had to write something similar.

I like perform() just because I love UFCS range chains, and anything to avoid those extra statements is alright in my book. This is probably not a majority opinion though. I can't think of a better name either.