Martin,
Thank you for the additional testcases. They point out a few issues that I didn't interpret from the description in the Bash Reference Manual [http://www.gnu.org/software/bash/manual/bashref.html#Brace-Expansion]. Note that below I refer to paragraphs from this documentation. I do have a few issues with the expectations you've laid out. Comments follow... sebor-2 wrote: > > + > + TEST ("foo {1,2} bar", "foo 1 2 bar"); > + > This isn't a brace expansion. It is a literal string, followed by a brace expansion, followed by a literal string. When you run 'echo foo {1,2} bar' in the shell, each of the args are brace expanded individually, so the only thing that is brace expanded is the '{1,2}' and everything else is written literally. I believe this testcase is invalid. sebor-2 wrote: > > + // we don't have eval > + // TEST ("`zecho foo {1,2} bar`", "foo 1 2 bar"); > + // TEST ("$(zecho foo {1,2} bar)", "foo 1 2 bar"); > Same problem here. sebor-2 wrote: > > +#if 0 // not implemented yet > + > + // set the three variables > + rw_putenv ("var=baz:varx=vx:vary=vy"); > + > + TEST ("foo{bar,${var}.}", "foobar foobaz."); > + TEST ("foo{bar,${var}}", "foobar foobaz"); > + > + TEST ("${var}\"{x,y}", "bazx bazy"); > + TEST ("$var{x,y}", "vx vy"); > + TEST ("${var}{x,y}", "bazx bazy"); > + > + // unset all three variables > + rw_putenv ("var=:varx=:vary="); > + > +#endif // 0 > I don't expect this functionality to ever be implemented inside rw_brace_expand(). As mentioned in paragraph 4, the brace expansion itself is done before other expansions, and it does not interpret the text between the braces. Given this, I feel that the environment variable expansion must done at some later stage, by some other function, and the above test block is inappropriate for this test. sebor-2 wrote: > > + > + TEST ("{1..10}", "1 2 3 4 5 6 7 8 9 10"); > + > This is a case that I should be handling. I need to go back and add complete support for integer ranges, specifically ranges that include multidigit numbers and sign. sebor-2 wrote: > > + // this doesn't work in Bash 3.2 > + // TEST ("{0..10,braces}", "0 1 2 3 4 5 6 7 8 9 10 braces"); > + > I don't know how anyone could expect this to work. The first subexpression of the brace expansion list is '0..10', which itself is not a brace expansion, so it should not be expanded. It should be left as a literal. This happens to be the behavior I see with Bash 3.0. sebor-2 wrote: > > + // but this does > + TEST ("{{0..10},braces}", "0 1 2 3 4 5 6 7 8 9 10 braces"); > + TEST ("x{{0..10},braces}y", > + "x0y x1y x2y x3y x4y x5y x6y x7y x8y x9y x10y xbracesy"); > + > Obviously, both of these are valid versions of the previous test expression. sebor-2 wrote: > > + TEST ("{a..A}", > + "a ` _ ^ ] [ Z Y X W V U T S R Q P O N M L K J I H G F E D C B > A"); > + TEST ("{A..a}", > + "A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ ] ^ _ ` > a"); > + > Interesting. I didn't think it would make sense to allow mixing of lower and uppercase characters in the sequence expression because of the characters between 'Z' and 'a'. Obviously I was wrong. BTW, any idea what happened to ASCII 92? It is the backslash character that should appear between '[' and ']'. sebor-2 wrote: > > + > + TEST ("0{1..9} {10..20}", > + "01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20"); > This has the same problem as the first issue I brought up. This is actually two seperate brace expansions, the first is '0{1..9}' and the second is '{10..20}'. This is how the shell handles them, and this is how I handle them. If they were treated as one brace expansion by the shell, I would expect the postscript '{10..20}' expanded for each prefix/body expansion, much like you would see if you escaped the space. sebor-2 wrote: > > + // weirdly-formed brace expansions -- fixed in post-bash-3.1 > + TEST ("a-{b{d,e}}-c", "a-{bd}-c a-{be}-c"); > I don't understand how this could be interpreted as valid brace expansion at all. The body of the expansion is '{b{d,e}}'. Paragraph 5 [and paragraph 1 for that matter] require a correctly-formed brace expansion have unquoted [unescaped?] opening and closing braces, and at least one unquoted comma or a valid sequence expression. The body does not meet either of these requirements, so it must be invalid. To get the result shown, the obvious thing to do is to escape the outer braces. This would give us the valid expression 'a-\{b{d,e}\}-c', that happens to also work with previous versions of bash also. sebor-2 wrote: > > + TEST ("a-{bdef-{g,i}-c", "a-{bdef-g-c a-{bdef-i-c"); > Again, this does not seem correct according to the requirements of paragraph 5 [and 1]. If the body is supposed to be between a pair of braces, shouldn't the first unescaped opening brace match the first unescaped close brace at the same brace depth? If it is, then the outer brace expansion isn't valid because it doesn't have a terminating close brace. Even if one was added, the resulting expression has the same problem as the previous example. The nested expression 'bdef-{g,i}-c' isn't a series comma-seperated strings or a sequence expression. If you wanted the first brace to be ignored, as it is in the test, then it should be escaped. Then we would have 'a-\{bdef-{g,i}-c'. That expression follows the requirements outlined in the manual, and works with old versions of bash, and a human can pretty easily figure out what the expected result would be. Now I suppose that since invalid brace expansions are to be left unchanged, you could say that the first brace expansion is copied literally because it is invalid, but the second is valid and should be expanded. This almost explains how bash 3.2 gets these results, but it still seems wrong. If a subexpression is invalid it seems that the whole expression is invalid. sebor-2 wrote: > > + TEST ("{", "{"); > + TEST ("}", "}"); > + TEST ("{}", "{}"); > + TEST ("{ }", "{ }"); > + TEST ("{ }", "{ }"); // is this right? > I sure think it is. Again, the requirements say that these are not valid brace expansions, so they should be left unchanged. I'm wondering if the shell is doing some sort of whitespace collapse. Everything seems to work fine if you escape the spaces, so I'm thinking that is why you see the behavior that you do. So, with all that said, I've got a few thoughts. 1. I don't really like the idea of trying to emulate all behavior of the shell in rw_brace_expand. If we want that, then we should have made a bug entitled 'provide a complete implementation of bash'. 2. I don't feel comfortable trying to maintain compatibility with version 3.2 of bash. It doesn't seem to follow the documented requirements, and I believe that the odd behavior may be difficult to implement. The bash 3.0 implementation seems much more sane and that is what I tried to emulate when writing this code. 3. If you, er, we want to do brace expansion exactly like you see within bash, then we should write another function that tokenizes a string on whitespace and does brace expansion on each token. I was expecting the caller of rw_brace_expand() to expect the function to do brace expansion, not complete shell emulation. Travis -- View this message in context: http://www.nabble.com/svn-commit%3A-r628839----stdcxx-trunk-tests-self-0.braceexp.cpp-tp15551361p15560343.html Sent from the stdcxx-commits mailing list archive at Nabble.com.