2010/12/9 Kevin P. Fleming <[email protected]>:
> ISO-8859 isn't specific enough; there are 16 subsections of ISO-8859,
> with different encodings. The character you are trying to represent has
> different encodings in many of them.
Yes :(
> In SMTP there is some sort of syntax that can be used to specify the
> character encoding of the display name portion of a header string... but
> I don't know if that's allowed in SIP or not. Based on the ABNF you've
> posted above it's clearly not allowed.
It's not allowed, sure.
The problem is the following:
Currently my parser applies the official BNF grammar to unknown header values:
  unknown-header = header-name HCOLON header-value CRLF
  header-value   = *(TEXT-UTF8char / UTF8-CONT / LWS)
  TEXT-UTF8char  = %x21-7E / UTF8-NONASCII
  UTF8-NONASCII  = %xC0-DF 1UTF8-CONT
                 / %xE0-EF 2UTF8-CONT
                 / %xF0-F7 3UTF8-CONT
                 / %xF8-Fb 4UTF8-CONT
                 / %xFC-FD 5UTF8-CONT
  UTF8-CONT      = %x80-BF
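
For illustration only, the strict grammar above can be transcribed into a byte-level regular expression. This is a rough sketch, not the parser discussed here: the names STRICT_VALUE and is_strict_header_value are hypothetical, and the LWS rule (LWS = [*WSP CRLF] 1*WSP) is taken from RFC 3261 because it is not quoted in this message.

```python
import re

# Building blocks transcribed from the grammar quoted above (bytes patterns).
UTF8_CONT = rb"[\x80-\xBF]"
UTF8_NONASCII = (
    rb"(?:[\xC0-\xDF][\x80-\xBF]{1}"
    rb"|[\xE0-\xEF][\x80-\xBF]{2}"
    rb"|[\xF0-\xF7][\x80-\xBF]{3}"
    rb"|[\xF8-\xFB][\x80-\xBF]{4}"
    rb"|[\xFC-\xFD][\x80-\xBF]{5})"
)
TEXT_UTF8CHAR = rb"(?:[\x21-\x7E]|" + UTF8_NONASCII + rb")"

# Assumption: LWS = [*WSP CRLF] 1*WSP (RFC 3261). Inside a repetition this is
# the same as "a single SP/HTAB, or a CRLF immediately followed by SP/HTAB".
LWS_STEP = rb"(?:[ \t]|\r\n[ \t])"

STRICT_VALUE = re.compile(
    rb"(?:" + TEXT_UTF8CHAR + rb"|" + UTF8_CONT + rb"|" + LWS_STEP + rb")*"
)

def is_strict_header_value(value: bytes) -> bool:
    """Return True if the whole value matches the strict header-value rule."""
    return STRICT_VALUE.fullmatch(value) is not None
```

With this sketch, a UTF-8 encoded value such as b"I\xc3\xb1aki" is accepted, while the same name encoded in ISO-8859-1 (b"I\xf1aki") is rejected, because 0xF1 must be followed by three UTF8-CONT bytes.
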
I've relaxed that grammar to:
  unknown-header = header-name HCOLON header-value CRLF
  header-value   = ( any )*
However, this makes the parser wrong in some cases, for example when a
custom header value contains line folding. The official grammar (above)
avoids this problem. So I need a "mix": something less strict than the
official BNF, but which still parses well-formed headers correctly (even
exotic ones).
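
One possible shape for such a "mix", purely as a sketch: accept any byte in the value except CR and LF, and additionally accept a CRLF when it is immediately followed by a space or tab (a folded line). The names RELAXED_VALUE, is_relaxed_header_value and unfold are hypothetical and not taken from any existing SIP stack.

```python
import re

# Relaxed rule (an assumption, not official grammar): any byte that is not
# CR or LF, or a CRLF immediately followed by SP/HTAB (line folding).
RELAXED_VALUE = re.compile(rb"(?:[^\r\n]|\r\n[ \t])*")

def is_relaxed_header_value(value: bytes) -> bool:
    """Return True if the value contains no CR/LF outside of line folding."""
    return RELAXED_VALUE.fullmatch(value) is not None

def unfold(value: bytes) -> bytes:
    """Replace each fold (optional WSP, CRLF, one or more WSP) by one space."""
    return re.sub(rb"[ \t]*\r\n[ \t]+", b" ", value)
```

Under this rule a non-UTF-8 value such as b"I\xf1aki" is accepted and folded values are still recognized, while a CRLF that is not followed by whitespace ends the value, so the next header line is not swallowed.
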
Thanks a lot.
--
Iñaki Baz Castillo
<[email protected]>
_______________________________________________
Sip-implementors mailing list
[email protected]
https://lists.cs.columbia.edu/cucslists/listinfo/sip-implementors