On Wed, Jun 21, 2017 at 1:08 AM, William A Rowe Jr <wr...@rowe-clan.net> wrote:
>
> But there were only two questionable values for \0, and in this case the
> answer is obvious. Invert the rule to a TOKEN char from the rather dubious
> TOKEN_STOP definition. Solved.

... for trunk, IMO. I don't suggest we touch this code again on
2.4 branch. For trunk, when I encounter the concept of a STOP
I envision there is a set of defined delimiters (e.g. punctuation/ctrls)
which serve as terminators. What HTTP defined was the set of
token characters in the bnf. T_HTTP_NOT_TOKEN would have
been a better name for the existing rule. T_HTTP_TOKEN is
simply a better understood rule.

On to whether element [NUL] tests aught to be trustable or
unused; My research (unsure if this was shared earlier) starts
with tests of isfn('\0');

isalnum  false
isalpha  false
iscntrl  true
isdigit  false
isgraph  false
islower  false
isprint  false
ispunct  false
isspace  false
isupper  false
isxdigit false
isascii  true
isblank  false

In each case above, the isfn('\0') results are plainly obvious, right
down to iscntrl(NUL)'s and isascii(NUL)'s result. Our resulting
test_char table values include element [NUL] as well, and it needs
to be sensible for the many purposes...

Our declarations;

T_ESCAPE_SHELL_CMD     false (correct, \0 is EOS, not a char)
T_ESCAPE_PATH_SEGMENT  false (correct, \0 is EOS, not a char)
T_OS_ESCAPE_PATH       false (correct, \0 is EOS, not a char)
T_HTTP_TOKEN_STOP      true (correct, \0 is a non-token byte)
T_ESCAPE_LOGITEM       false (correct, \0 is EOS not a char)
T_ESCAPE_FORENSIC      true (interesting/debatable)
T_ESCAPE_URLENCODED    false (correct, \0 is EOS not a char)
T_HTTP_CTRLS           true (correct, \0 is NUL, non VCHAR)
T_VCHAR_OBSTEXT        false (correct, \0 is NUL < 0x80)

In the _ESCAPE's we are escaping a string, and NUL's terminate
a string and are not escaped. Any NUL byte traveling the wire
needs to be evaluated by the appropriate TOKEN, CTRLS or
OBSTEXT rule by HTTP RFC.

The T_ESCAPE_FORENSIC case inverts the general rule for
command lines and paths we are constructing, but as a symbol
over the wire, I can agree with the idea of logging an escaped \0
in the case of a NUL input byte. This differs from logging in that
our logging API has no counted-string semantic (if it did, I guess
we would encrypt \0's.)

Reply via email to