Re: clarification needed: shell 'exec' + function (builtin, ???)

2020-12-10 Thread Steffen Nurpmeso
[I bring back austin-group-l, ok?]

Thorsten Glaser wrote in
 :
 |Steffen Nurpmeso dixit:
 |
 |>  #include <errno.h>
 |>  #include <iconv.h>
 |>  #include <stdio.h>
 |>  #include <string.h>
 |>  int main(void){
 |> char inb[16], oub[16], *inbp, *oubp;
 |> iconv_t id;
 |> size_t inl, oul;
 |>
 |> memcpy(inbp = inb, "a\303\244c", sizeof("a\303\244c"));
 |> inl = sizeof("a\303\244c") -1;
 |
 |Not -1 otherwise oub will not be NUL-terminated and end with junk:
 |
 |$ ./a.out
 |Converting 4 
 |GOT 

Sure thing.  Just like below.  Normally stack pages are cow forked
from zero if i understand that right.  But maybe i do not.

 |Without the trailing NUL, stateful conversion may also be
 |incomplete…
 |
 |> oul = sizeof oub;
 |> oubp = oub;
 |>
 |> if((id = iconv_open("ascii", "utf8")) == (iconv_t)-1)
 |>   return 1;
 |
 |Throws 1 because you need "utf-8", but with it, see above.

Well names and iconv are a thing.  Especially regarding Unicode
(and nl_langinfo(CODESET), if i remember UnixWare right).
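
Since the accepted spelling of a codeset name ("utf8" versus "utf-8") differs between iconv implementations, a caller can try several spellings in turn. The following is only a minimal sketch: the helper name open_first_codeset and its parameters are hypothetical, and only iconv_open() itself is a standard call.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper (not from this thread): try several spellings of the
 * source codeset until iconv_open() accepts one.  `names` must end with a
 * NULL entry.  On success *used (if non-NULL) points at the accepted name. */
iconv_t open_first_codeset(const char *to, const char *const *names,
                           const char **used)
{
    size_t i;

    for (i = 0; names[i] != NULL; ++i) {
        iconv_t cd = iconv_open(to, names[i]);

        if (cd != (iconv_t)-1) {
            if (used != NULL)
                *used = names[i];
            return cd;
        }
    }
    return (iconv_t)-1; /* no spelling was accepted */
}
```

A caller would pass something like { "utf8", "UTF-8", NULL } as names and close the returned descriptor with iconv_close() when done.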

 |> fprintf(stderr, "Converting %lu <%s>\n",(unsigned long)inl, inbp);
 |> if(iconv(id, &inbp, &inl, &oubp, &oul) == (size_t)-1){
 |>fprintf(stderr, "Fail <%s>\n", strerror(errno));
 |>return 2;
 |>}
 |> fprintf(stderr, "GOT <%s>\n", oub);
 |> iconv_close(id);
 |> return 0;
 |>}
 |>
 |>you should get replacement characters out of the box?
 |
 |Citrus iconv agrees. Its manpage says:
 |
 | If the string pointed to by *src contains a character which is valid
 | under the source codeset but can not be converted to the destination
 | codeset, the character is replaced by an "invalid character" which
 | depends on the destination codeset, e.g., '?', and the conversion
 | is continued. iconv() returns the number of such "invalid conversions".

That was my thinking.
Thanks for confirming this.


--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)


Re: clarification needed: shell 'exec' + function (builtin, ???)

2020-12-10 Thread Joerg Schilling
"shwaresyst via austin-group-l at The Open Group" 
 wrote:

> 
> I agree more clarification is desirable. The reason I see as why the function 
> isn't executed is it may be treating it as an invoke of "sh -c ls", because 
> ls is a function, but this new sh does not inherit that definition so it 
> looks on path instead and finds the utility.
> On Wednesday, December 9, 2020 Thorsten Glaser via austin-group-l at The Open 
> Group  wrote:
> Hi *,

Hi,

this is where the original mail ended for me. Interesting that you got
more content. Any idea why I received only the first line of the
original mail?

...
 
> I’ve got a report in IRC by a user who spotted a cross-shell difference.
> 
> In my opinion, the invocation…
> 
>     sh -c 'ls() { echo meow; }; exec ls'
> 
> … is supposed to output "meow\n" and return to the caller with a zero
> errorlevel.
> 
> Some shells execve() the ls(1) binary instead.

Thorsten,

do you know any shell besides mksh and zsh that calls the function with this 
command? 

From my understanding, calling the function is a bug.

Important for me is that the Bourne Shell, ksh88 and ksh93 call ls(1), so this 
is historically correct and it was not seen as a problem by David Korn.
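
To check how a particular shell behaves, the one-liner from the original report can be run through popen(3). This is a rough sketch, not a test suite: the function name exec_runs_function is hypothetical, the shell name is pasted into the command line unquoted (so it is meant for trusted names only), and a directory whose first entry is literally named "meow" would fool it.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

/* Hypothetical probe: returns 1 if `shell` runs the function on "exec ls",
 * 0 if it does something else (typically execve() of ls(1)), and -1 if the
 * shell (or ls) could not be started. */
int exec_runs_function(const char *shell)
{
    char cmd[256], line[64];
    FILE *fp;
    int n, status, meow = 0;

    n = snprintf(cmd, sizeof cmd,
            "%s -c 'ls() { echo meow; }; exec ls' 2>/dev/null", shell);
    if (n < 0 || n >= (int)sizeof cmd)
        return -1;
    if ((fp = popen(cmd, "r")) == NULL)
        return -1;
    /* Only the first output line decides: the function prints "meow" */
    if (fgets(line, sizeof line, fp) != NULL)
        meow = (strcmp(line, "meow\n") == 0);
    while (fgets(line, sizeof line, fp) != NULL)
        ; /* drain the rest so the child is not killed by SIGPIPE */
    status = pclose(fp);
    /* Exit status 127 is the shell's "command not found" */
    if (status == -1 || (WIFEXITED(status) && WEXITSTATUS(status) == 127))
        return -1;
    return meow;
}
```

Calling it with "mksh", "bash", "dash" and so on gives one result per installed shell; which result is the conforming one is exactly what this thread is trying to settle.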

Jörg

-- 
EMail:jo...@schily.net  Jörg Schilling D-13353 Berlin
Blog: http://schily.blogspot.com/
URL:  http://cdrecord.org/private/ 
http://sourceforge.net/projects/schilytools/files/


Re: clarification needed: shell 'exec' + function (builtin, ???)

2020-12-10 Thread Joerg Schilling
Steffen Nurpmeso  wrote:

> this is an iconv(3)-related error that was fixed in later version
> of the mailer you use.  The very error came up on the ML this
> year[1], basically you use LATIN1 on your box, as could be
> expected, but Thorsten is known to be a Unicode character
> "junkie", so to say.

You are correct,

I was able to save the mail as a file and to run iconv(1) on the content.
Maybe the problem is that the first missing line contains a character that
is not part of ISO-8859-1.
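
Whether a UTF-8 text can be represented in ISO-8859-1 at all can be checked without iconv, because the code points U+0080 to U+00FF are exactly the two-byte sequences that start with 0xC2 or 0xC3. A minimal sketch, with the hypothetical function name fits_latin1:

```c
#include <stddef.h>

/* Hypothetical check: returns 1 if every code point of the UTF-8 string s
 * is at most U+00FF (and so has an ISO-8859-1 byte), 0 otherwise.
 * Malformed input also yields 0. */
int fits_latin1(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p != '\0') {
        if (*p < 0x80) {
            ++p;                     /* plain ASCII */
        } else if ((*p == 0xC2 || *p == 0xC3) && (p[1] & 0xC0) == 0x80) {
            p += 2;                  /* U+0080 .. U+00FF */
        } else {
            return 0;                /* beyond Latin-1, or malformed */
        }
    }
    return 1;
}
```

A name like "J\303\266rg" passes, while a line containing a typographic apostrophe (U+2019) or an ellipsis (U+2026) does not.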

Jörg



Re: clarification needed: shell 'exec' + function (builtin, ???)

2020-12-10 Thread Joerg Schilling
"Thorsten Glaser via austin-group-l at The Open Group" 
 wrote:

> This is because m4.opengroup.org runs qmail, the arsehole under the MTAs,
> which auto-converted the mail from quoted-printable to 8bit, sending it
> as 8bit even to MTAs that don't offer 8BITMIME (I configured my sendmail
> not to do that as well, so I got the same truncated mail back :( other
> than qmail, exim is known to break the MIME and SMTP standards like that).

Thank you for this information.

> From IRC:
> 
> 15:57 < orbea> yash matches the bash behavior fwiw
> 16:26 < orbea> pdksh, oksh, loksh, zsh and posh match mksh's behavior with
> 'exec', everything else including ksh2020 and hsh match bash/yash
> 16:26 < orbea> as reproduced with: ls () { echo foo; }; exec ls
> 16:27 < miskatonic> and the difference is what?
> 16:28 < orbea> mksh prints 'foo', yash executes ls(1)

OK, mksh, pdksh and posh have the same origin.
I don't know oksh or loksh.

> I can live with it being open to implementations as well, but it's
> best to clarify.

Well, the Bourne Shell man page says:

 The command specified by the arguments  is  executed  in
 place  of  this  shell  without  creating a new process. ...
 
The POSIX text is:

If exec is specified with command, it shall replace the shell 
with command without creating a new process. ...

So the main statement in both is that the command is executed in place of 
the shell. This seems to be obviously a hint that the shell cannot run 
anymore and thus the function cannot be executed.
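
The "in place of the shell" reading can be illustrated at the system call level: once an exec-family call succeeds, nothing that lived in the old process image (a shell function included) is left to run. The sketch below is only an illustration of that mechanism, not a statement about what POSIX requires of the exec built-in; the function name status_after_exec is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: fork a child that calls execlp(prog).  Returns the
 * child's exit status, or -1 on fork/wait failure.  The _exit(42) after
 * execlp() is reached only if the exec itself failed. */
int status_after_exec(const char *prog)
{
    pid_t pid = fork();
    int status;

    if (pid == -1)
        return -1;
    if (pid == 0) {
        execlp(prog, prog, (char *)NULL);
        _exit(42); /* the old image survives only when exec failed */
    }
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With "true" as prog the result is 0, so the line after execlp() never ran; with a path that does not exist the result is 42.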

Jörg



Re: clarification needed: shell 'exec' + function (builtin, ???)

2020-12-10 Thread Thorsten Glaser
Steffen Nurpmeso dixit:

>  #include <errno.h>
>  #include <iconv.h>
>  #include <stdio.h>
>  #include <string.h>
>  int main(void){
> char inb[16], oub[16], *inbp, *oubp;
> iconv_t id;
> size_t inl, oul;
>
> memcpy(inbp = inb, "a\303\244c", sizeof("a\303\244c"));
> inl = sizeof("a\303\244c") -1;

Not -1 otherwise oub will not be NUL-terminated and end with junk:

$ ./a.out
Converting 4 
GOT 

Without the trailing NUL, stateful conversion may also be
incomplete…

> oul = sizeof oub;
> oubp = oub;
>
> if((id = iconv_open("ascii", "utf8")) == (iconv_t)-1)
>   return 1;

Throws 1 because you need "utf-8", but with it, see above.

> fprintf(stderr, "Converting %lu <%s>\n",(unsigned long)inl, inbp);
> if(iconv(id, &inbp, &inl, &oubp, &oul) == (size_t)-1){
>fprintf(stderr, "Fail <%s>\n", strerror(errno));
>return 2;
> }
> fprintf(stderr, "GOT <%s>\n", oub);
> iconv_close(id);
> return 0;
>  }
>
>you should get replacement characters out of the box?

Citrus iconv agrees. Its manpage says:

 If the string pointed to by *src contains a character which is valid
 under the source codeset but can not be converted to the destination
 codeset, the character is replaced by an "invalid character" which
 depends on the destination codeset, e.g., '?', and the conversion is
 continued. iconv() returns the number of such "invalid conversions".
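
On the NUL-termination point above: instead of pushing the terminating NUL through the converter, a caller can flush any shift state with a second iconv() call that has a NULL input, and then terminate the output by hand. A minimal sketch under those assumptions; the wrapper name iconv_str and its signature are hypothetical.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: convert the NUL-terminated string `in` from codeset
 * `from` to codeset `to`.  Returns 0 on success and -1 on any failure.
 * `out` is always NUL-terminated when outsz > 0. */
int iconv_str(const char *to, const char *from, const char *in,
              char *out, size_t outsz)
{
    iconv_t cd;
    char *ip = (char *)in, *op = out;
    size_t il = strlen(in), ol;
    int rv = -1;

    if (outsz == 0)
        return -1;
    ol = outsz - 1; /* keep one byte for the terminating NUL */
    if ((cd = iconv_open(to, from)) == (iconv_t)-1) {
        *out = '\0';
        return -1;
    }
    /* First convert the text, then flush the shift state (NULL input) */
    if (iconv(cd, &ip, &il, &op, &ol) != (size_t)-1 &&
            iconv(cd, NULL, NULL, &op, &ol) != (size_t)-1)
        rv = 0;
    *op = '\0'; /* terminate at the position iconv() actually reached */
    iconv_close(cd);
    return rv;
}
```

Because the terminator is written at the final output position, the buffer never ends in junk, whatever it contained before the call.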


Re: mail encoding not-fun (was Re: clarification needed: shell 'exec' + function (builtin, ???))

2020-12-10 Thread Steffen Nurpmeso
Thorsten Glaser via austin-group-l at The Open Group wrote in
 :
 |Steffen Nurpmeso via austin-group-l at The Open Group dixit:
 |
 |>|This is because m4.opengroup.org runs qmail, the arsehole under the MTAs,
 |>|which auto-converted the mail from quoted-printable to 8bit, sending it
 |>|as 8bit even to MTAs that don't offer 8BITMIME (I configured my sendmail
 |>|not to do that as well, so I got the same truncated mail back :( other
 |>|than qmail, exim is known to break the MIME and SMTP standards like \
 |>|that).
 |>
 |>Naaah, not true Thorsten.  At least this time.
 |
 |This one *is* correct, as I got the broken message back as well.
 |It contains an embedded NUL.
 |
 |But apparently, this was not the cause of Jörg’s problem ☻

Evil, you.  Hey, i live also on IRC since ~1.5 years for the first
time ever, and on this Linux-Distro (just released 3.6 two days
ago) here is someone active from Düsseldorf, and i now sometimes
listen to La Düsseldorf from La Düsseldorf.  (Of course Neu! and
Lilo Engel are less populist.)  Lots of longing for the 70s here.
Tja.

All he would need to do would be to upgrade to a newer version,
i think i innocently prod him two times on that.

 |>Related to my MUA.
 |[…]
 |>I have been able to save the mail as file and to run iconv(1) on the \
 |>content.
 |
 |oic
 |
 |>Maybe a problem is that the first missing line is a line with a character \
 |>that
 |>is not part of ISO-8859-1
 |
 |Yes, of course, I have been writing in UTF-8 for a while.

Not so here, even though i have ~/.kent.xmodmaprc keycode
adjustments for some German and French quotation marks, i find it
hard to go "random Unicode".  I usually use altgr/g-a in vim and
then look in UnicodeData.txt to find the codepoint ;)

 |[00:02]  gecko: do you use emacs?
 |[00:03]  nope
 |[00:03]  just a normal mac

Graphical selection may be a winner here, indeed.

 |[00:04]  argh
 |[00:04]  no, the editor
 | -- Vutral and gecko2 in #deutsch (NB: Editor? Operating system.)

--steffen


Re: clarification needed: shell 'exec' + function (builtin, ???)

2020-12-10 Thread Steffen Nurpmeso
Hallo Jörg, all,

Joerg Schilling wrote in
 <20201210004945.i3n8e%sch...@schily.net>:
 |Steffen Nurpmeso  wrote:
 |> this is an iconv(3)-related error that was fixed in later version
 |> of the mailer you use.  The very error came up on the ML this
 |> year[1], basically you use LATIN1 on your box, as could be
 |> expected, but Thorsten is known to be a Unicode character
 |> "junkie", so to say.
 |
 |You are correct,

Yep -- unfortunately.

 |I have been able to save the mail as file and to run iconv(1) on the \
 |content.

Yes, we temporarily did not restart for ILSEQ; if your prompt
included "set prompt='\${^ERRNAME}'", for example, you would
have seen that an error happened.
But of course we are tolerant of weird base64, so we should be
tolerant of weird iconv, thus i "restored the original
behaviour", so to say.

That reminds me of iconv weirdness regarding replacement
characters, which makes testing really hard.  Wasn't there an
issue going on about that?  Being able to specify the replacement
explicitly, and whether it stands for an entire character or for
each byte of a sequence, would be a great improvement.
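
Until something like that exists, a caller can do the substitution itself: on EILSEQ write one explicit replacement, skip one whole input character, and restart. The following is a minimal sketch for the UTF-8 to ASCII case only; the function name utf8_to_ascii_lossy is hypothetical, and on implementations that substitute on their own the replacement character is still whatever that libc picks.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert UTF-8 to ASCII, writing one '?' per
 * unconvertible character.  Returns the number of replacements, or
 * (size_t)-1 on error.  `out` is NUL-terminated when outsz > 0. */
size_t utf8_to_ascii_lossy(const char *in, char *out, size_t outsz)
{
    iconv_t cd;
    char *ip = (char *)in, *op = out;
    size_t il = strlen(in), ol, repl = 0;

    if (outsz == 0)
        return (size_t)-1;
    *out = '\0';
    ol = outsz - 1; /* keep one byte for the terminating NUL */
    if ((cd = iconv_open("ASCII", "UTF-8")) == (iconv_t)-1)
        return (size_t)-1;

    while (il > 0) {
        size_t r = iconv(cd, &ip, &il, &op, &ol);

        if (r != (size_t)-1) {
            repl += r; /* this libc substituted on its own and counted */
            break;
        }
        if (errno == E2BIG || ol == 0) {
            repl = (size_t)-1; /* output buffer too small */
            break;
        }
        /* EILSEQ or EINVAL: emit one '?' and skip one character, that is
         * the lead byte plus any UTF-8 continuation bytes (10xxxxxx) */
        *op++ = '?';
        --ol;
        ++repl;
        do {
            ++ip;
            --il;
        } while (il > 0 && ((unsigned char)*ip & 0xC0) == 0x80);
    }
    *op = '\0';
    iconv_close(cd);
    return repl;
}
```

On an iconv that fails with EILSEQ, the input "a\303\244c" should come out as "a?c" with a return value of 1, that is one replacement per character and not one per byte.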

While talking about iconv, i got glibc bug[1] closed as "resolved
invalid", but wouldn't you all agree that in the following

  #include <errno.h>
  #include <iconv.h>
  #include <stdio.h>
  #include <string.h>

  int main(void){
     char inb[16], oub[16], *inbp, *oubp;
     iconv_t id;
     size_t inl, oul;

     /* "a", U+00E4 as UTF-8, "c"; the memcpy includes the NUL */
     memcpy(inbp = inb, "a\303\244c", sizeof("a\303\244c"));
     inl = sizeof("a\303\244c") -1; /* the NUL is not converted */
     oul = sizeof oub;
     oubp = oub;

     if((id = iconv_open("ascii", "utf8")) == (iconv_t)-1)
        return 1;
     fprintf(stderr, "Converting %lu <%s>\n",(unsigned long)inl, inbp);
     if(iconv(id, &inbp, &inl, &oubp, &oul) == (size_t)-1){
        fprintf(stderr, "Fail <%s>\n", strerror(errno));
        return 2;
     }
     /* oub is printed as-is, nothing terminates it explicitly */
     fprintf(stderr, "GOT <%s>\n", oub);
     iconv_close(id);
     return 0;
  }

you should get replacement characters out of the box?
I said back then

   $ /tmp/zt
   Converting 4 
   Fail 

  whereas musl gives

   $ ./zt
   Converting 4 
   GOT 

and i still think musl is totally right (also by giving only one
replacement character).

  [1] https://sourceware.org/bugzilla/show_bug.cgi?id=22908

 |Maybe a problem is that the first missing line is a line with a character \
 |that
 |is not part of ISO-8859-1

Yes, transliteration should possibly be possible.
On the other hand, if i change the above to

   if((id = iconv_open("ascii//TRANSLIT", "utf8")) == (iconv_t)-1)

i get

  Converting 4 
  GOT 

and with

   if((id = iconv_open("ascii//TRANSLIT", "utf8")) == (iconv_t)-1)

we are back at the error.

Ciao,

--steffen