Something's bothering me about the .trans method. This email lists a proposal to split its semantics into two methods.
I'm not yet convinced myself about this proposal. It's quite late in the game to make spec changes of established methods, and the change will break some downstream application code, some of which I wrote. So if the advantages don't come shining through, I will withdraw my proposal. But I thought I would make a case for it and see what people think of it.

Here's the .trans spec:

<http://perlcabal.org/syn/S05.html#Transliteration>

Summary of the spec: .trans does what tr/// does (in Perl 5, and Perl 6). .trans also does a bunch of cool stuff with strings and regexes and Longest-Token Matching that tr/// never did.

My proposal is to split away the cool stuff with strings and regexes and LTM into its own method. For purposes of concreteness, I will call this proposed method .translate -- name proposals are allowed in the thread and on the #perl6 channel, but in order to reduce bikeshedding, proposals without a rationale will be ignored. Also note that the discussion is not primarily about the name, but about the split itself.

Here are my reasons for the split:

* The spec literally goes from saying that the tr/// operator has a method form, to overloading this method form with features that are not in tr///. The method is showing signs of lack of cohesion.

* I use the more advanced features frequently, and they're great for parallel, one-pass substitution of substrings. They're an improvement over Perl 5's corresponding features. I don't want them to go away, just to separate them into their own method.

* Linguistically, trans*literation* (which is what C<tr> stands for) is about replacing individual characters. Substituting bigger chunks is a kind of translation.

* Over the years, I've seen people struggling with the API of .trans, which is for all intents and purposes two separate APIs: one involving pairs of strings, and one involving pairs of Positional (spec says Array). Something about the whole thing violates Least Surprise. This whole email was prompted by my having to check the spec for the umpteenth time and realizing that the API simply doesn't vibe with me. Splitting up the two separate APIs into two methods would help.

* The more advanced parts of the current API -- the ones that allow the right hand sides of pairs to be regexes or closures -- could be migrated out along with the pairs-of-Positional API. Then the .trans method would be left to handle *only* the things that tr/// handles, and the .translate method could do the cool stuff.

* .trans could then have specially optimized code which does one-char substitution efficiently. Though we're not there yet, because of the cool stuff that I propose to move out into .translate, the plans for an efficient implementation of .trans involve somehow generating a grammar on-the-fly and then running it on the string to be substituted. This seems like extreme overkill for tr///.

Have these things been bothering any of you too? Does the split make sense? Would it help us to simplify the API and make people Less Surprised by it? Are the projected wins of the spec change worth the upsetting of the ecosystem?

Hopefully,
// Carl
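
To make the two behaviours in the proposal concrete, here is a rough sketch of the split. It is written in Python rather than Perl 6, purely so it can be run as-is; it is not the spec's algorithm and not any implementation's code. The function names trans_chars and translate_substrings are hypothetical stand-ins for the proposed .trans and .translate, and the sketch simplifies in two ways: character ranges such as a..z are not supported, and the source and target character lists must have equal length.

```python
import re


def trans_chars(text, source, target):
    """Character-for-character replacement, roughly the tr/// subset.

    Assumption for this sketch: no ranges, and source/target have equal length.
    """
    if len(source) != len(target):
        raise ValueError("source and target must have the same length")
    # A plain per-character lookup table; no pattern matching is involved.
    return text.translate(str.maketrans(source, target))


def translate_substrings(text, replacements):
    """Parallel, one-pass substring replacement with longest-match-first.

    `replacements` maps each substring to its replacement. Replaced text
    is never scanned again, so the substitutions cannot feed each other.
    """
    # Try longer keys first so that "ab" beats "a" at the same position.
    keys = [k for k in sorted(replacements, key=len, reverse=True) if k]
    if not keys:
        return text
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: replacements[m.group(0)], text)


if __name__ == "__main__":
    print(trans_chars("hello", "el", "ip"))
    print(translate_substrings("a < b & c", {"&": "&amp;", "<": "&lt;"}))
```

The first function needs nothing more than a lookup table, which is the kind of cheap implementation the last bullet has in mind for a tr///-only .trans. The second needs a matcher that prefers the longest candidate at each position, and under the split that machinery would live only in .translate.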