    Date:        Wed, 28 Aug 2024 02:03:54 +0200
    From:        Steffen Nurpmeso <stef...@sdaoden.eu>
    Message-ID:  <20240828000354.qZaQvm7v@steffen%sdaoden.eu>

  | That confuses me again, unfortunately i got a bug report and
  | distracted.  I mean, i would
  |
  | 1. skip leading whitespace anyhow (IFS or not, which
  |    is a "documented bug" here i would say),
  |    for the shell this would be: leading IFS whitespace,

First, since you're concerned with your MUA, you can define whatever
rules you like for this.   There's nothing in any MUA spec I know of
which requires anything like shell parsing/syntax/evaluation (with
the possible exception of MH (nmh), and only because that actually
uses the shell for everything other than the actual access to the
messages, etc).   So if you want, you get to do as you like.

But if you're trying to emulate the shell rules, you should do
it correctly, not just almost, or you'll confuse people.   So
in your step 1 above it is "leading IFS whitespace", certainly.

Further, if this reaches the end of the bytes subject to
field splitting, you're done (this is the exit condition).

  | 2. pass by none-to-many non-IFS bytes, the "field data", then
  |
  | 3.
  |    a. if there is a non-IFS-whitespace character:
  |       - delimit the field, even with empty "field data",
  |
  |    b. if there is a IFS-whitespace character:
  |       - delimit the field only with non-empty "field data",

No, you simply delimit the field.   The field cannot be empty
if the delimiter found is IFS whitespace, or you would have
ignored that in the "skip leading IFS whitespace" above.

That is, unless the skip in #1 ran out of data (in which case you
don't get here), it must have stopped at either a non-IFS character
(so the field is not empty: at the very least that character is
in it) or a non-IFS-whitespace character, that is, an IFS character
which is not whitespace (there empty fields are allowed).

  | 4. skip trailing (new leading) (IFS-) whitespace

Just "goto 1" (or "repeat").
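
As a rough sketch of that loop (Python, not taken from any shell's
source; split_fields, data and ifs are hypothetical names picked for
illustration), assuming IFS whitespace means whichever of space, tab
and newline actually appear in IFS.   It adds one detail the steps
above do not spell out: IFS whitespace followed by a single
non-whitespace IFS character is treated as one delimiter, which is
how I read the POSIX field splitting text.

```python
def split_fields(data: str, ifs: str = " \t\n") -> list[str]:
    """Split data into fields, following the steps discussed above."""
    # IFS whitespace: space, tab, newline, but only those present in IFS.
    ifs_ws = {c for c in ifs if c in " \t\n"}
    fields = []
    i, n = 0, len(data)
    while True:
        # 1. Skip leading IFS whitespace; running out of data is the exit.
        while i < n and data[i] in ifs_ws:
            i += 1
        if i == n:
            return fields
        # 2. Pass over none-to-many non-IFS characters, the field data.
        start = i
        while i < n and data[i] not in ifs:
            i += 1
        # 3. Delimit the field.  It can only be empty if we stopped at
        #    a non-whitespace IFS character.
        fields.append(data[start:i])
        # Assumption: whitespace plus at most one non-whitespace IFS
        # character form a single delimiter.
        while i < n and data[i] in ifs_ws:
            i += 1
        if i < n and data[i] in ifs and data[i] not in ifs_ws:
            i += 1
        # 4. "goto 1"
```

With that, split_fields("a::b:", ":") gives three fields (the middle
one empty, and no extra field for the trailing colon), while
split_fields("  a  b  ") gives just "a" and "b".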

The reason all this is messy is that it is (more or less) the
way it was implemented in the original Bourne shell.   That tells
you that the implementation must be simple: the rules might seem
complex to explain, but that shell wasted no code it could avoid.

kre

