Eric Kow <[email protected]> writes: > Trent: could you comment on the utf8.sh test?
As requested, I've prioritized this review. I started commenting on this test script by hunks, before I realized it had been amended later in the bundle. Therefore this review jumps around A LOT. Sorry! Apart from the VISUAL and JIS/10646 comments, I think these are all style nitpicks. >> [Add test for UTF-8 metadata >> Reinier Lamers <[email protected]>**20090917165327 >> Ignore-this: 3e81237e8af61a45d64ac60269e1fe90 UTF-8 >> ] addfile ./tests/utf8.sh >> hunk ./tests/utf8.sh 1 >> +#!/usr/bin/env bash >> +## Test for issue64 - Should store patch metadata in UTF-8 The file might better be called issue64-utf8.sh. >> +# This file is encoded in ISO-8859-15 aka latin9. It was crafted >> with a hex editor. Could do with vim/Emacs magic to tell them that the file is latin-9. >> +echo 'Selbstverst\344ndlich \374berraschend' > something.txt Is it worth including translations as comments, for monoglots? ;-) >> +echo 'l33tking\[email protected]' > interaction_script.txt >> +echo y >> interaction_script.txt >> +echo y >> interaction_script.txt >> +echo '\244uroh4xx0rz' >> interaction_script.txt >> +echo n >> interaction_script.txt Could be done with a heredoc later (i.e. <<EOF), but that might change the nature of the input if bash and files do encoding differently. >> +unset DARCSEMAIL >> +unset DARCS_TESTING_PREFS_DIR >> +unset EMAIL >> +set Is it necessary to clear these in this script specifically? Could they better be moved to tests/lib (or Setup.lhs) or deleted? Either way, I would move them up so that interaction_script is written to and read from on adjacent lines. >> +# Test recording non-UTF-8-encoded non-ASCII ("funny") metadata from >> +# interactive input >> +darcs record -i < interaction_script.txt This pattern repeats throughought the script, testing different ways for metadata to get into Darcs (interactively, from a file, from arguments, from the environment, ...) >> +if grep 'l33tking\305\[email protected]' changes.xml ; then >> + echo "patch author OK from interactive prompt" >> +else >> + echo "patch author from interactive prompt not UTF-8-encoded!" >> + exit 1 >> +fi Given the way these tests are used, I would simplify this to # Patch author from interactive prompt is UTF-8 encoded? grep '...' changes.xml Since if the grep fails, both the above lines will be shown by Setup.lhs. This comment applies to similar subsequent code. Update: I see a later patch in the bundle replaces this with a wrapper function. I'm not enthused, because tests/lib's set -x will make this even noisier. AFAICT it doesn't matter for any of these strings, but I am in the habit of saying "fgrep" if the search pattern is literal (i.e. not a regex). >> +# Test recording funny metadata from environment, Incidentally, "funny" is a bit loaded. Non-Latin1 characters shouldn't be amusing nor peculiar! :-) >> +echo '#!/usr/bin/env bash' > editor >> +echo 'echo All my \244s are gone > $1' >> editor # create an 'editor' that >> writes latin9 >> +chmod +x editor >> +export EDITOR="`pwd`/editor" Either use VISUAL, or ensure VISUAL is unset -- it takes precedence over EDITOR. >> +printf "y\ny\n" | darcs amend --edit -p 'Patch by ' This could simply be darcs amend ... <<<yyn since Darcs doesn't care about missing newlines, and the bashism <<< is a herestring. You might even be able to get away with (untested): VISUAL=tee darcs amend --edit -p 'Patch by ' <<EOF yny All my \244s are gone EOF >> + if [ -z "$3" ]; then >> + last="" >> + else >> + last="--last $3" >> + fi >> + >> + darcs changes $last --xml > changes.xml For a function with optional extra arguments, I suggest simply using "$@": f () { # f x y zs x=$1 y=$2 shift 2 g "$x" "$y" "$@" } f foo bar f baz quux --last 3 This gives you some flexibility later if a different call to "f" needs something other than a --last N, e.g. f quuux quuuux --max-count 7 In this case, you could perhaps add "changes max-count 1" to _darcs/prefs/defaults. >> +diff temp1/changes.xml temp2/changes.xml Could use cmp instead of diff, since we only need to know IF the file has changed, not HOW (because a human can inspect the files afterwards). >> +if grep \264ors _darcs/prefs/author ; then Not putting quotes around such a literal string won't affect a working system, but I would be a little wary of The Right Thing happening when the system isn't encoding things correctly. That is, what happens if Bash decides a multi-byte character is actually "à "? You might end up with "grep à oid file" instead of "grep 'à oid' file". I'm probably just being paranoid. Hmm, but now that I'm thinking about it, this script is only testing 8859 -- a unibyte encoding. Suggest iconving it into a second test, using JIS or 10646. PS: this test overall ends up being relatively long. Long procedural code means stuff might accidentally fail (or worse: accidentally pass) due to state leftover from earlier in the script. For example, I notice you cleaning up after _darcs/prefs/author vs. $EMAIL. Is it worth breaking up this script into smaller "chunks" that each test a little bit? _______________________________________________ darcs-users mailing list [email protected] http://lists.osuosl.org/mailman/listinfo/darcs-users
