Hi!
----
I found an issue in ksh93s-_20060912 on Solaris 11/B48/i386 which may be
related to the substring operator "${strvar:index:ressize}": it seems
the operator does not handle multibyte characters correctly.
The following test case...
-- snip --
(TESTSHELL=/path/to_shell_which_should_be_tested
export LC_ALL=en_US.UTF-8 LANG=en_US.UTF-8
$TESTSHELL -c 'cat /usr/pub/UTF-8 | while read i ; do
    echo "a=$(printf "%s\n" "$i" | /usr/bin/wc -m) b=${#i} c=$( (for (( ci=0 ; ci < ${#i} ; ci++ )) ; do printf "%s" "${i:$ci:1}" ; done) | /usr/bin/wc -m)"
done') | head -20
-- snip --
("/usr/pub/UTF-8" is a sample file which contains a large range of
Unicode characters encoded in UTF-8)
...returns the following output when TESTSHELL is set to
/bin/bash ("GNU bash, version 3.00.16(1)-release
(i386-pc-solaris2.11)"):
-- snip --
a= 73 b=72 c= 72
a= 73 b=72 c= 72
a= 72 b=71 c= 71
a= 72 b=71 c= 71
a= 72 b=71 c= 71
a= 71 b=70 c= 70
a= 72 b=71 c= 71
a= 74 b=73 c= 73
a= 74 b=73 c= 73
a= 74 b=73 c= 73
a= 72 b=71 c= 71
a= 72 b=71 c= 71
a= 72 b=71 c= 71
a= 72 b=71 c= 71
a= 72 b=71 c= 71
a= 72 b=71 c= 71
a= 72 b=71 c= 71
a= 72 b=71 c= 71
a= 70 b=69 c= 69
a= 70 b=69 c= 69
-- snip --
(This is AFAIK the expected behaviour: "b" and "c" agree, and "a" is one
larger because "wc -m" also counts the newline added by printf.)
ksh93s-_20060912 produces different output:
-- snip --
a= 73 b=72 c= 72
a= 73 b=72 c= 72
a= 72 b=71 c= 71
a= 72 b=71 c= 71
a= 72 b=71 c= 71
a= 71 b=70 c= 70
a= 72 b=71 c= 71
a= 74 b=73 c= 73
a= 74 b=73 c= 73
a= 74 b=73 c= 73
a= 72 b=71 c= 59
a= 72 b=71 c= 59
a= 72 b=71 c= 59
a= 72 b=71 c= 59
a= 72 b=71 c= 59
a= 72 b=71 c= 59
a= 72 b=71 c= 59
a= 72 b=71 c= 59
a= 70 b=69 c= 57
a= 70 b=69 c= 57
-- snip --
Note the values of "c" in the last ten lines: as soon as multibyte
characters are read from /usr/pub/UTF-8, "c" no longer matches "b"
(59 instead of 71, 57 instead of 69), while "b" still matches the bash
result. I have uploaded a bzip2'ed version of /usr/pub/UTF-8 to
http://www.opensolaris.org/os/project/ksh93-integration/downloads/solaris_11_b48__usr_pub_UTF-8.bz2
for testing purposes.
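A reduced variant of the check that does not depend on /usr/pub/UTF-8
might look like the sketch below. The function name "substr_roundtrip"
and the sample string are made up for illustration and are not part of
the original test case; the script assumes it is run with
LC_ALL=en_US.UTF-8 as above, under a shell that supports "(( ))" loops
and "${var:index:length}" (bash or ksh93).

```shell
#!/bin/bash
# Sketch only: substr_roundtrip is a hypothetical helper, not a shell builtin.
# It rebuilds a string one character at a time with ${s:index:1} and reports
# whether the result is identical to the original string.
substr_roundtrip() {
    typeset s="$1" out='' ci
    for (( ci=0 ; ci < ${#s} ; ci++ )) ; do
        out="${out}${s:$ci:1}"
    done
    if [ "$out" = "$s" ] ; then
        echo "ok"
    else
        echo "mismatch: expected ${#s} characters, got ${#out}"
    fi
}

# Sample input mixing single-byte and multibyte characters (arbitrary choice).
substr_roundtrip 'aäö€z'
```

A shell whose substring operator works in units of characters should
print "ok" here; a "mismatch" line would point at the same kind of
problem as the "c" column above.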
----
Bye,
Roland
--
__ . . __
(o.\ \/ /.o) roland.mainz at nrubsig.org
\__\/\/__/ MPEG specialist, C&&JAVA&&Sun&&Unix programmer
/O /==\ O\ TEL +49 641 7950090
(;O/ \/ \O;)