Switching over to the 'nom' branch of Rakudo introduced a large number of regressions and changes that affected many users of Rakudo. I know all of the core developers agree that this was really not a good thing, and we want to work very hard to avoid such instabilities in the future.
Ideally a commitment of this sort should probably be documented as a policy or framework somewhere that we can point to and use for guidance whenever we encounter potential breakages (which are bound to occur, since Perl 6 is a living language).

In practice what this means is that we want to minimize the number and impact of any "breakages" that people encounter when using existing code on subsequent releases of Rakudo Star. Here I'm focusing mostly on the Star distribution, because that's where our stability commitments are strongest. We'll also need to manage breakages within the compiler, but part of the reason for separating compiler releases from distribution releases is exactly to make it possible for version management issues to be handled at different time scales.

Rather than talk in generalities though, I have a few specific "use cases" that demonstrate breakages that we're going to have to manage very soon, and I'd like ideas and suggestions about how to handle them. From the steps we choose to handle these specific cases I think we can develop some broader guidelines for future ones. (If others know of any cases beyond these that need discussing, feel free to contribute them to this conversation!)

----

The first category of breakages covers places where the Perl 6 specification has changed from what Rakudo currently implements.

1. ? quantifier in regexes

The C<?> quantifier used to be specified to capture matches in the same manner as C<*> and C<+> -- that is, it produced a List of Match objects in its capture slot (either named or positional). The current version of the specification says that a ?-quantified capture fills the slot with either a single Match object or Nil. The difference can be seen in this code example:

    / <digit>? y z /

In the specification used to implement the current regex engine, the returned match object would always have a List in the $<digit> slot; that list would contain a single Match object if a digit was found, or the list would be empty if no digit was found. That is, the match acted the same as a C< / <digit> ** 0,1 y z / > regex, thus the match for the digit would be found at $<digit>[0].

In the current specification (which Rakudo must now migrate to), the regex will capture any digit directly into the $<digit> slot of the returned Match object, and $<digit> will be Nil if no digit is found. A program looking for a capture result at $<digit>[0] will always get an undefined value.

How do we inform users of this change, and when should it be made (in Rakudo and in Rakudo Star)?

2. Leading whitespace in rules and :sigspace

A previous version of :sigspace (and hence 'rule') caused _all_ whitespace in a regex to be treated as significant; i.e., a rule declaration like

    rule xyz { x y z }

would be identical to

    token xyz { <.ws> x <.ws> y <.ws> z <.ws> }

In other words, the space before the 'x' in the rule declaration would invoke <.ws> to consume any whitespace prior to the 'x'.

The current regex syntax definition changes this such that whitespace following certain constructs is no longer significant (in this case, the space following the opening brace). Thus the current spec has the xyz rule above translating to

    token xyz { x <.ws> y <.ws> z <.ws> }

with no <.ws> consumption prior to the 'x'. This will break any existing grammars or rules that have been relying on the previous rule / :sigspace definition.

Updating existing code to mimic the old behavior of 'rule' is fairly simple -- just add a <?> and a space where <.ws> is expected.

    rule xyz { <?> x y z }

Again, how do we inform users of this change, and when should it be made?

----

Another category is where things are outright removed from the specification.

3. Str.bytes

The C<.bytes> method on C<Str> has always been somewhat problematic; in Perl 6 we typically think of strings in terms of characters, codepoints, graphemes, or units other than bytes. The C<.bytes> method really makes more sense for something like C<Buf>, but not C<Str>. Thus it was decided to remove C<.bytes> completely from the C<Str> specification.

How long should Rakudo keep Str.bytes available for programs that may be using it? How do we let people know that it's going away, and what they might use instead?

(For Str.bytes, I've introduced an experimental "is DEPRECATED" trait into Rakudo, thus Str.bytes is actually marked as "DEPRECATED" in the source. We could potentially extend this trait so that an option or pragma causes any uses of DEPRECATED routines to generate warnings or exceptions.)

----

Another category is where parts of the Perl 6 specification are known to be fairly slushy, in that what is written is not at all what we expect Perl 6 to ultimately look like, nor what Rakudo implements.

The IO library is the current poster child for this; we all agree that what is documented in S16 and other places is almost certainly not what we want, and changes are being introduced to Rakudo to explore better options. The most recent example is the C<dir> function; a new implementation of C<dir> was committed that completely invalidated existing code. This has since been rectified so that older programs still work, but we know there are other IO-related changes that really need to be made, and they need exploration before we can determine what they will be.

Other examples of this from the past would include Lists and Iterators, and regexes before that. The IO library is an immediate issue (including things like sockets and non-blocking I/O); macros may also end up changing somewhat as they're implemented; and in the future I expect that S09 and parallel processing will have fairly slushy specs as people explore the implementations.
How should we manage exploration of new(ish) Perl 6 features and libraries while preserving some sense of stability for people who are actively using those features?

One suggestion in this case has been to completely freeze the existing IO implementation for stability purposes, while simultaneously prototyping and testing new IO features in other namespaces. Then as those newer IO features stabilize, deprecate and phase out the existing IO library in favor of the newer one.

----

Ultimately the Perl 6 specification says that version numbers are supposed to be able to manage these sorts of issues for us; i.e., if a program says C<use v6.0.2>, then it gets all of the semantics of exactly version 6.0.2, regardless of any deprecations or changes that may have happened since then. However, I don't believe Rakudo is yet at a place where we can provide this level of compatibility, so we need some other management policies in place until we do get there.

Thanks in advance for any comments or suggestions.

Pm