Re: [PATCH] Skip initial expansion of valid array reference tokens in unset

2021-04-25 Thread Chet Ramey

On 4/19/21 11:09 AM, konsolebox wrote:

Attached patch demonstrates a solution that solves the current issues
of unset discussed in
https://lists.gnu.org/archive/html/bug-bash/2021-03/msg00056.html
while avoiding breakage of scripts and keeping the expansion of
subscripts consistent with expansion in references.


Thanks. I can use some of this.


--
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/



Re: [PATCH] Skip initial expansion of valid array reference tokens in unset

2021-04-19 Thread konsolebox
On Tue, Apr 20, 2021 at 9:35 AM Koichi Murase  wrote:
> In addition, you need to change `1 << 31' to `1u << 31'.  The literal
> `1' has the type `(signed) int'.  One needs to use `1u'---the literal
> of the type `unsigned int'.

I see, but I think I'll just skip that as I would also have to modify
other declarations for consistency.  I think the code is already good
enough for a proof of concept at least.

-- 
konsolebox



Re: [PATCH] Skip initial expansion of valid array reference tokens in unset

2021-04-19 Thread Koichi Murase
2021年4月20日(火) 9:14 konsolebox :
> On Tue, Apr 20, 2021 at 2:08 AM Koichi Murase  wrote:
> > AFAIK `1 << 31' causes undefined behavior when int is 32-bit signed
> > integer.  Instead, I think you may write `(int)(1u << 31)' for
> > conforming C compilers. `u << 31' is defined for unsigned integers,
> > and the conversion between signed and unsigned integers is defined to
> > be made by mod 2^{32} (when int is 32 bits).
>
> Not sure how this should be done right so I decided to just change
> 'flags' to unsigned int.
>
> -  int flags;   /* Flags associated with this word. */
> +  unsigned int flags;  /* Flags associated with this word. */

Yeah, right.  I haven't checked the type of `flags'.  We also need to
make it unsigned if we want to use bit 31.

In addition, you need to change `1 << 31' to `1u << 31'.  The literal
`1' has the type `(signed) int'.  One needs to use `1u'---the literal
of the type `unsigned int'.

--
Koichi



Re: [PATCH] Skip initial expansion of valid array reference tokens in unset

2021-04-19 Thread konsolebox
On Tue, Apr 20, 2021 at 2:08 AM Koichi Murase  wrote:
> AFAIK `1 << 31' causes undefined behavior when int is 32-bit signed
> integer.  Instead, I think you may write `(int)(1u << 31)' for
> conforming C compilers. `u << 31' is defined for unsigned integers,
> and the conversion between signed and unsigned integers is defined to
> be made by mod 2^{32} (when int is 32 bits).

Not sure how this should be done right so I decided to just change
'flags' to unsigned int.

-  int flags;   /* Flags associated with this word. */
+  unsigned int flags;  /* Flags associated with this word. */

> `y.tab.c' is automatically generated from `parse.y'. You should edit
> `parse.y' instead of `y.tab.c'

I also modified parse.y.  All changes made so far can be seen in
https://git.io/JOam0.

Thanks for the tips.

-- 
konsolebox



Re: [PATCH] Skip initial expansion of valid array reference tokens in unset

2021-04-19 Thread Koichi Murase
2021年4月20日(火) 0:16 konsolebox :
> Attached patch demonstrates a solution that solves the current issues
> of unset discussed in
> https://lists.gnu.org/archive/html/bug-bash/2021-03/msg00056.html
> while avoiding breakage of scripts and keeping the expansion of
> subscripts consistent with expansion in references.

Nice demonstration!  I haven't tried but have comments after taking a
glance at the patch.

> diff --git a/command.h b/command.h
> index 914198f..4dafcf0 100644
> --- a/command.h
> +++ b/command.h
> @@ -104,6 +104,7 @@ enum command_type { cm_for, cm_case, cm_while, cm_if, 
> cm_simple, cm_select,
>  #define W_CHKLOCAL (1 << 28) /* check for local vars on assignment */
>  #define W_NOASSNTILDE (1 << 29) /* don't do tilde expansion like an 
> assignment statement */
>  #define W_FORCELOCAL (1 << 30) /* force assignments to be to local 
> variables, non-fatal on assignment errors */
> +#define W_NOEXPAND (1 << 31) /* inhibits any form of expansion */

AFAIK `1 << 31' causes undefined behavior when int is 32-bit signed
integer.  Instead, I think you may write `(int)(1u << 31)' for
conforming C compilers. `u << 31' is defined for unsigned integers,
and the conversion between signed and unsigned integers is defined to
be made by mod 2^{32} (when int is 32 bits).

> diff --git a/y.tab.c b/y.tab.c
> index dcc5b7f..a9ae0e6 100644
> --- a/y.tab.c
> +++ b/y.tab.c
> @@ -7700,7 +7700,15 @@ got_token:

`y.tab.c' is automatically generated from `parse.y'. You should edit
`parse.y' instead of `y.tab.c'.  The corresponding lines are
parse.y:5512--5517.  To regenerate `y.tab.c' after the edit, maybe you
need bison (or yacc).

--
Koichi



[PATCH] Skip initial expansion of valid array reference tokens in unset

2021-04-19 Thread konsolebox
Attached patch demonstrates a solution that solves the current issues
of unset discussed in
https://lists.gnu.org/archive/html/bug-bash/2021-03/msg00056.html
while avoiding breakage of scripts and keeping the expansion of
subscripts consistent with expansion in references.

The patched code is also available in
https://github.com/konsolebox/bash/tree/skip_expansion_of_valid_array_refs_in_unset.
The patch is applied against the latest commit (f3a35a2d60) in master
branch.  The changes can be conveniently compared in
https://git.io/JOgrE.

The solution allows the following tests to pass.  A few of them were
described to fail in the other thread.

-

#!/bin/bash

declare -A a
key='$(echo foo)'

# Here the tokens are valid array references and are not expanded.
# The references are completely similar to the assignments.
# This solves the surprise expansion issues.

a[$key]=1
unset a[$key]
declare -p a # Displays no element

a['$key']=2
unset a['$key']
declare -p a # Displays no element

a["foo"]=3
unset a["foo"]
declare -p a # Displays no element

echo -

# Here the tokens are "strings".  They expand and keep the
# original behavior and allows existing scripts to not break.
# It also allows nref or iref references to be transparently
# referenced in it.

a[$key]=1
unset 'a[$key]' # Transforms to a[$key] after expansion
declare -p a # Displays no element

a['$key']=2
unset "a['\$key']" # Transforms to a['$key'] after expansion
declare -p a # Displays no element

a["foo"]=3
unset 'a["foo"]' # Transforms to a["foo"] after expansion
declare -p a # Displays no element

echo -

# The update also keeps compatibility with already existing behavior of
# unset when assoc_expand_once is enabled, but only for quoted tokens.
# I would prefer it totally removed though.
# Also notice assoc_expand_once's limitation below.

shopt -s assoc_expand_once

a[$key]=1
unset "a[$key]"
declare -p a # Displays no element

a['$key']=2
unset "a[\$key]"
declare -p a # Displays no element

a["foo"]=3
unset "a[foo]"
declare -p a # Displays no element

echo --

# For unsetting '@' and all elements:

key=@

declare -A a=(@ v0 . v1)
unset a[$key]
declare -p a # Displays 'declare -A a=([.]="v1" )'

declare -A a=(@ v0 . v1)
unset a[@]
declare -p a # Displays 'declare: a: not found'

echo -

shopt -u assoc_expand_once

declare -A a=(@ v0 . v1)
unset 'a[$key]'
declare -p a # Displays 'declare -A a=([.]="v1" )'

declare -A a=(@ v0 . v1)
unset 'a[@]'
declare -p a # Displays 'declare: a: not found'

echo -

shopt -s assoc_expand_once

declare -A a=(@ v0 . v1)
unset "a[$key]"
declare -p a # Displays 'declare: a: not found' (A limitation)

declare -A a=(@ v0 . v1)
unset 'a[@]'
declare -p a # Displays 'declare: a: not found'


-- 
konsolebox
diff --git a/builtins/set.def b/builtins/set.def
index 8ee0165..6ef1d17 100644
--- a/builtins/set.def
+++ b/builtins/set.def
@@ -872,10 +872,6 @@ unset_builtin (list)
   else if (unset_function && nameref)
 nameref = 0;
 
-#if defined (ARRAY_VARS)
-  vflags = assoc_expand_once ? (VA_NOEXPAND|VA_ONEWORD) : 0;
-#endif
-
   while (list)
 {
   SHELL_VAR *var;
@@ -890,6 +886,9 @@ unset_builtin (list)
   unset_variable = global_unset_var;
 
 #if defined (ARRAY_VARS)
+  vflags = (assoc_expand_once && (list->word->flags & W_NOEXPAND) == 0) ?
+	  (VA_NOEXPAND|VA_ONEWORD) : 0;
+
   unset_array = 0;
   /* XXX valid array reference second arg was 0 */
   if (!unset_function && nameref == 0 && valid_array_reference (name, vflags))
diff --git a/command.h b/command.h
index 914198f..4dafcf0 100644
--- a/command.h
+++ b/command.h
@@ -104,6 +104,7 @@ enum command_type { cm_for, cm_case, cm_while, cm_if, cm_simple, cm_select,
 #define W_CHKLOCAL	(1 << 28)	/* check for local vars on assignment */
 #define W_NOASSNTILDE	(1 << 29)	/* don't do tilde expansion like an assignment statement */
 #define W_FORCELOCAL	(1 << 30)	/* force assignments to be to local variables, non-fatal on assignment errors */
+#define W_NOEXPAND	(1 << 31)	/* inhibits any form of expansion */
 
 /* Flags for the `pflags' argument to param_expand() and various
parameter_brace_expand_xxx functions; also used for string_list_dollar_at */
diff --git a/parser.h b/parser.h
index 59bddac..a70481a 100644
--- a/parser.h
+++ b/parser.h
@@ -48,6 +48,7 @@
 #define PST_REDIRLIST	0x08	/* parsing a list of redirections preceding a simple command name */
 #define PST_COMMENT	0x10	/* parsing a shell comment; used by aliases */
 #define PST_ENDALIAS	0x20	/* just finished expanding and consuming an alias */
+#define PST_UNSET	0x40
 
 /* Definition of the delimiter stack.  Needed by parse.y and bashhist.c. */
 struct dstack {
diff --git a/subst.c b/subst.c
index 6132316..da391d2 100644
--- a/subst.c
+++ b/subst.c
@@ -4378,6 +4378,8 @@ dequote_list (list)
 
   for (tlist = list; tlist; tlist = tlist->next)
 {
+  if (tlist->word->flags & W_NOEXPAND)
+	continue;
   s =