Re: One way to deal with transient ranges & char[] buffers

Jakob Ovrum Fri, 02 Aug 2013 00:55:40 -0700

On Friday, 2 August 2013 at 05:35:28 UTC, H. S. Teoh wrote:

Recently, I discovered an interesting idiom for dealing with transient ranges, esp. w.r.t. strings / char[]. Many places in D require string, but sometimes what you have is char[] which can't be converted to string except by .idup. But you don't want to .idup in generic code, because if the input's already a string, then it's wasteful duplication.

Places in D that require `string` either do so because they need the immutable guarantee or they do so out of error (e.g. should have used a string of const characters instead). The latter can of course be worked around, but the only *solution* involves fixing the upstream code, so I'll assume we're discussing the former case.

We don't have any generic mechanism for deep copying ranges. The `save` primitive is often implemented by means of copying, but conceptually is doing something very different, so it cannot be applied here. So, I don't see how your idea translates to ranges in general (not completely sure if it was intended to).
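To illustrate the point about `save`, here is a minimal sketch using a plain slice. It uses `int[]` rather than `char[]` only to keep auto-decoding out of the picture; the variable names are made up for the example. For a slice, `save` gives an independent iteration position, but the elements are still shared, so it is no substitute for a deep copy.

```d
import std.range.primitives : front, popFront, save;

void main()
{
    int[] buffer = [1, 2, 3];

    // `save` on a slice returns the slice itself: a second, independent
    // position over the very same elements.
    auto saved = buffer.save;

    buffer.popFront();            // advances only `buffer`
    assert(saved.front == 1);     // `saved` still starts at the first element

    buffer[0] = 42;               // writes to what was originally index 1
    assert(saved[1] == 42);       // the write is visible through `saved`
}
```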

Thus, let's tackle the case of arrays/slices in particular, of which strings are the most common example. There is a precedent in D to push the decision to copy an array upwards in the code. When the operations at hand require the immutable guarantee, state it in the interface of the code, such as by asking for `string` on a function's parameter. That's why so many functions take `string` when they need the immutable guarantee, as opposed to taking `const(char)[]` or a template parameter and then making a GC copy. This way, copies are not only minimized, but centralized more in user code where they are more visible, and the method of making the copy - remember, not all client code is fine with rampant GC use - is also pushed up. Also, copies are one thing, but what if the caller had a string but in a different encoding? Not only does an allocation have to be made, but decoding and encoding is also necessary; the details of how to handle this are also pushed up, with the same benefits. It's a pretty mainstream idiom and is often reiterated by members of the community, such as in Ali's talk at DConf.
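A rough sketch of that idiom, for concreteness. `Registry` and its `add` method are hypothetical names invented for this example, not anything from Phobos. The callee asks for `string` because it retains the data; the caller holding a mutable buffer decides where and how the copy is made.

```d
// Hypothetical type that retains the names it is given.
struct Registry
{
    string[] names;

    // Asking for `string` states the immutability requirement in the
    // interface, so the slice can be stored without a defensive copy.
    void add(string name)
    {
        names ~= name;
    }
}

void main()
{
    Registry reg;

    // The caller already has a string: nothing is copied.
    string literal = "alpha";
    reg.add(literal);
    assert(reg.names[0].ptr is literal.ptr);

    // The caller has a reusable mutable buffer. `reg.add(buffer)` would not
    // compile, because char[] does not implicitly convert to string, so the
    // copy has to be made visibly at the call site.
    char[] buffer = "beta".dup;
    reg.add(buffer.idup);

    buffer[0] = 'z';                  // reusing the buffer is now harmless
    assert(reg.names[1] == "beta");
}
```

Here the caller used `.idup`, but since the choice sits in user code it could just as well copy into memory it manages itself, or transcode from another encoding first.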

Your proposed solution only shares one benefit with the solution described above - that if the direct caller had a `string` already (or a range of `string`s), nothing has to be done. It forfeits all the other benefits for convenience. It also has problems with template bloat, which can be fixed but at a syntactical cost.

Overall I think it reduces the genericity of algorithms by trying to handle input types it doesn't actually support, which can be a big problem for performance-critical code.
