Re: [arch-projects] [dbscripts] [PATCH 1/8] Fix quoting around variables, especially arrays.

2018-03-13 Thread Eli Schwartz via arch-projects
On 03/14/2018 12:53 AM, Luke Shumaker wrote:
> Part of it is to have a common style.  Trying to rectify two codebases
> that diverged 7 years ago is rough.  When trying to come up with clean
> diffs, having to guess "did the other one quote this variable?" makes
> it harder.  If you can say "always quote (except for the LHS of [[
> ]])" or something, that makes it a bit easier.

I'm not sure that "specifically for the sole sake of diffs against our
fork" is a valid justification on its own for modifying a coding style.

>>>  backup_package_variables() {
>>> -   for var in ${splitpkg_overrides[@]}; do
>>> +   for var in "${splitpkg_overrides[@]}"; do
>>> indirect="${var}_backup"
>>> -   eval "${indirect}=(\${$var[@]})"
>>> +   eval "${indirect}=(\"\${$var[@]}\")"
>>> done
>>>  }
>>>  
>>>  restore_package_variables() {
>>> -   for var in ${splitpkg_overrides[@]}; do
>>> +   for var in "${splitpkg_overrides[@]}"; do
>>> indirect="${var}_backup"
>>> if [ -n "${!indirect}" ]; then
>>> -   eval "${var}=(\${$indirect[@]})"
>>> +   eval "${var}=(\"\${$indirect[@]}\")"
>>> else
>>> -   unset ${var}
>>> +   unset "${var}"
>>> fi
>>> done
>>
>> This is too much escaping and metaprogramming, there are better ways of
>> backing up a variable to begin with. :/
>>
>> We do it in makepkg, I will have us do it here as well. Advantage: using
>> declare -p means the shell auto-escapes things where needed.
> 
> I haven't been keeping my thumb on makepkg git, but the eval lines as
> I wrote them exactly match the eval lines in makepkg 5.0.2's version
> of {backup,restore}_package_variables (makepkg's versions don't quote
> the for loops, or the unset command).

Hmm, I was thinking of:

eval "$restoretrap"
eval "$restoreset"
eval "$restoreshopt"
eval "$restore_envvars"

and similar. Maybe I should fix the backups as well, but that is a
slightly more complicated case there.

>>> -   if ! ${CLEANUP_DRYRUN}; then
>>> +   if ! "${CLEANUP_DRYRUN}"; then
>>
>> This is a variable being run as a command, so if there were to be spaces
>> in it we'd end up trying to run a command with a space in it. Arguably
>> we should not be running this as a command (even though they are set to
>> true/false which is a shell builtin blah blah blah) but since we are it
>> would be illogical to indicate that if there are spaces they should be
>> interpreted as string literals in an executable filename.
> 
> For the true/false idiom, quoting it is just a style rule.  I figure
> accepting the true/false idiom doesn't imply allowing the boolean
> variable to have any value.  Having the quotes would help catch the
> variable being erroneously set to a different value.

So would doing a bash test.

> At some point, I'd like to have `make lint` run shellcheck over
> dbscripts.  That's a long way off, both because of a whole bunch of
> changes needed in dbscripts to make it come back clean, and a few
> features needed in shellcheck to avoid having to drop entirely too
> many shellcheck directives in to the dbscripts source.
> 
> Anyway, I know linters should be taken with a grain of salt, but when
> there's something simple like this, that you know just about any
> linter would complain about... why not?

That would imply one of my long-term goals is being able to run a linter.

If I did, this rule would be the first thing I disabled -- it is far,
far too prone to both false positives and false negatives.


-- 
Eli Schwartz
Bug Wrangler and Trusted User



signature.asc
Description: OpenPGP digital signature


Re: [arch-projects] [dbscripts] [PATCH 1/8] Fix quoting around variables, especially arrays.

2018-03-13 Thread Luke Shumaker
On Wed, 14 Mar 2018 00:11:05 -0400,
Eli Schwartz via arch-projects wrote:
> > while read line; do
> > -   pkginfo=(${line})
> > +   pkginfo=("${line}")
> 
> That's completely wrong, just look at the next five lines.
> 
> > pkgbase=${pkginfo[0]}
> > pkgver=${pkginfo[1]}
> > pkgarch=${pkginfo[2]}
> > -   pkglicense=(${pkginfo[@]:3})
> > +   pkglicense=("${pkginfo[@]:3}")

You're absolutely right.  I realized I screwed up right after I sent
it.

> I don't really see the need to quote every variable just because it is a
> variable, at least in cases where the variable is fairly statically
> defined... I'd like to see path components that could have spaces
> quoted, and arrays I guess because semantically that's how you iterate
> over an array. Also you introduced some bugs, in cases where we actually
> want whitespace splitting...

Part of it is to have a common style.  Trying to rectify two codebases
that diverged 7 years ago is rough.  When trying to come up with clean
diffs, having to guess "did the other one quote this variable?" makes
it harder.  If you can say "always quote (except for the LHS of [[
]])" or something, that makes it a bit easier.

> >  backup_package_variables() {
> > -   for var in ${splitpkg_overrides[@]}; do
> > +   for var in "${splitpkg_overrides[@]}"; do
> > indirect="${var}_backup"
> > -   eval "${indirect}=(\${$var[@]})"
> > +   eval "${indirect}=(\"\${$var[@]}\")"
> > done
> >  }
> >  
> >  restore_package_variables() {
> > -   for var in ${splitpkg_overrides[@]}; do
> > +   for var in "${splitpkg_overrides[@]}"; do
> > indirect="${var}_backup"
> > if [ -n "${!indirect}" ]; then
> > -   eval "${var}=(\${$indirect[@]})"
> > +   eval "${var}=(\"\${$indirect[@]}\")"
> > else
> > -   unset ${var}
> > +   unset "${var}"
> > fi
> > done
> 
> This is too much escaping and metaprogramming, there are better ways of
> backing up a variable to begin with. :/
> 
> We do it in makepkg, I will have us do it here as well. Advantage: using
> declare -p means the shell auto-escapes things where needed.

I haven't been keeping my thumb on makepkg git, but the eval lines as
I wrote them exactly match the eval lines in makepkg 5.0.2's version
of {backup,restore}_package_variables (makepkg's versions don't quote
the for loops, or the unset command).

> > -   if ! ${CLEANUP_DRYRUN}; then
> > +   if ! "${CLEANUP_DRYRUN}"; then
> 
> This is a variable being run as a command, so if there were to be spaces
> in it we'd end up trying to run a command with a space in it. Arguably
> we should not be running this as a command (even though they are set to
> true/false which is a shell builtin blah blah blah) but since we are it
> would be illogical to indicate that if there are spaces they should be
> interpreted as string literals in an executable filename.

For the true/false idiom, quoting it is just a style rule.  I figure
accepting the true/false idiom doesn't imply allowing the boolean
variable to have any value.  Having the quotes would help catch the
variable being erroneously set to a different value.

> > -   if ! ([[ -z ${ALLOWED_LICENSES[@]} ]] || chk_license 
> > ${pkglicense[@]} || grep -Fqx "${pkgbase}" "${dirname}/sourceballs.force"); 
> > then
> > +   if ! ([[ -z ${ALLOWED_LICENSES[*]} ]] || chk_license 
> > "${pkglicense[@]}" || grep -Fqx "${pkgbase}" 
> > "${dirname}/sourceballs.force"); then
> 
> What's the check here anyways? This considers ALLOWED_LICENSES=('') to
> be non-empty, so we might as well check the length of
> ${#ALLOWED_LICENSES[@]} which is a clearer read.

Agree.

> > -   if [[ ! -z "$(echo ${@%\.*} | sed "s/ /\n/g" | sort | uniq -D)" ]]; then
> > +   if [[ ! -z "$(printf '%s\n' "${@%\.*}" | sort | uniq -D)" ]]; then
> 
> Thanks for noticing this! printf is a lot nicer. I'm also wondering if
> it makes more sense to use a single awk 'a[$0]++{exit 1}' rather than
> chaining sort/uniq and reading its length in a bash test... the problem
> here was always that uniq -D returns successfully even if it prints
> nothing. With awk we can actually have an exit code stating whether
> duplicates were found.

Yeah.  Awk is probably the better solution here.  Even if it's not,
that !-z could be -n.  This commit was mostly a dumb grep for unquoted
variables (also, awk exiting early is safe, because Bash builtins are
silent when the recieve SIGPIPE).

> > -   arch_pkgs=($(getpkgfiles 
> > "${STAGING}/${repo}/"*-${pkgarch}${PKGEXTS} 2>/dev/null))
> > -   for pkg in ${arch_pkgs[@]} ${any_pkgs[@]}; do
> > +   arch_pkgs=($(getpkgfiles 
> > "${STAGING}/${repo}/"*-"${pkgarch}"${PKGEXTS} 2>/dev/null))
> > +   for pkg in "${arch_pkgs[@]}" "${any_pkgs[@]}"; do
> 
> Dropping in and out of quotes here a number of times sort of hammers
>