On Mon, 2014-05-05 at 14:42 +0200, Patrick Ohly wrote:
> On Fri, 2014-05-02 at 11:59 +0200, Lukas Zeller wrote:
> > On 02.05.2014, at 10:38, Patrick Ohly <patrick.o...@intel.com> wrote:
> > > I noticed another problem with the "use X-ABLabel parameter" approach:
> > > storing complex strings (spaces, quotation marks) in a parameter value
> > > is harder.
> > 
> > That's probably why Apple chose the X-ABLabel property approach. An
> > unparseable parameter could ruin the real data, a unknown property is
> > less dangerous.
> > 
> > > The EDS vCard parser gets it wrong and fails to parse:
> > > 
> > > X-ABRELATEDNAMES;X-ABLabel=domestic partner:domestic partner
> > > 
> > > That is valid according to http://tools.ietf.org/html/rfc2425#page-5
> > > because the space is a SAFE-CHAR and thus may appear in a ptext, but the
> > > EDS parser does not expect the space. To work around this, we could
> > > voluntarily quote the string even though it is not required. 
> > > 
> > > Now, the conceptual problem with "X-ABLabel parameter" is that a quoted
> > > string cannot contain double quotes. It is probably rare that a user
> > > enters double quotes as part of a label, but it cannot be ruled out
> > > either. Line breaks are also only allowed in property values and not
> > > parameter values.
> 
> I've looked into TMimeDirProfileHandler::generateValue() some more to
> understand under which circumstances libsynthesis uses quoted strings
> (with double quotes at start and end) as parameter value. At first
> glance it doesn't seem to do that at all. Instead special values are
> escaped with backslash.
> 
> item29.X-ABLabel:custom-label5\nUmlaut-ä\nSemicolon\;
> ->
> X-ABRELATEDNAMES;X-ABLabel=custom-label5\nUmlaut-ä\nSemicolon\;:custom 
> relationship
> 
> Where is it specified that the backslash escape mechanism can be used in
> parameter values?
> 
> http://tools.ietf.org/html/rfc2425#page-5 says:
> 
> 
>    param        = param-name "=" param-value *("," param-value)
> 
>    param-name   = x-name / iana-token
> 
>    param-value  = ptext / quoted-string
> 
>    ptext  = *SAFE-CHAR
> 
>    SAFE-CHAR    = WSP / %x21 / %x23-2B / %x2D-39 / %x3C-7E / NON-ASCII
>       ; Any character except CTLs, DQUOTE, ";", ":", ","
> 
> My reading of that is that special characters must not appear in a ptext
> at all, not even when escaped with backslash. One has to use a quoted
> string, which (unfortunately) cannot hold all characters either.
> 
> IMHO libsynthesis is currently producing broken vCards. I consider
> changing the code so that it uses quoted strings if it detects unsafe
> characters in the value and filters out invalid ones. "unsafe" would be
> more conservative than in the RFC itself and also include spaces, to
> work around the EDS parser bug.

Attached is a patch which makes the libsynthesis parser and generator
behave according to my current understanding of the RFCs. The risk of
course is that there are cases where backslashes are used in parameter
values and the peer (unpatched libsynthesis, other implementations)
expect that backslash escaping is used.

The Evolution Data Server parser does not use backslash escaping for
parameters. To exchange values containing backslashes, the patch is
needed.

This becomes relevant in the context of the X-ABLabel parameter. My hope
is that all other parameters are simple enough that the ambiguity never
arises.

Or do we need an on/off switch for backslash escaping depending on the
peer?

-- 
Best Regards, Patrick Ohly

The content of this message is my personal opinion only and although
I am an employee of Intel, the statements I make here in no way
represent Intel's position on the issue, nor am I authorized to speak
on behalf of Intel on this matter.


>From 8a7356c1518aca405eb29ddf12a2d1052375da68 Mon Sep 17 00:00:00 2001
From: Patrick Ohly <patrick.o...@intel.com>
Date: Tue, 6 May 2014 12:13:19 +0200
Subject: [PATCH 2/2] MIME parser + encoder: no backslash quoting in parameter
 values
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

RFC 2425 (MIME DIR) and RFC 6350 (vCard) do not describe backslash
escaping for special characters in parameter values. Only the more
limited quoted-string (double quotes at start and end, no line breaks,
no double quotes inside value) is specified.

The following two examples both contain a literal backslash:

URL;X-ABLabel="Custom-label6 Backslash \":http://custom.com
X-ABRELATEDNAMES;X-ABLabel="custom-label5 Umlaut ä Semicolon ; Backslash \ end of label":custom relationship

This commit limits backslash escaping to parsing and generating
property values. Backslashs in parameters are stored literally during
parsing. A quoted string parameter value is used for every value that
is more complex than alphanumeric plus underscore and hyphen.

In particular spaces (while allowed in unquoted values) are only
generated as part of quoted strings because the Evolution Data Server
parser had problems with them (fixed in EDS 3.10).
---
 src/sysync/mimedirprofile.cpp |   60 ++++++++++++++++++++++++++++-------------
 1 file changed, 42 insertions(+), 18 deletions(-)

diff --git a/src/sysync/mimedirprofile.cpp b/src/sysync/mimedirprofile.cpp
index a1f88b8..83acbca 100644
--- a/src/sysync/mimedirprofile.cpp
+++ b/src/sysync/mimedirprofile.cpp
@@ -2359,6 +2359,34 @@ sInt16 TMimeDirProfileHandler::generateValue(
             }
             // - val is now translated enum (or original value if value does not match any enum text)
             valsiz+=val.size();
+            // We have two choices for parameter values:
+            // - quoted string in double quotes
+            // - a simple string without
+            //
+            // Line breaks and control characters are not supported
+            // either way; the backslash escape mechanism is not used
+            // unless explicitly specified otherwise for specific
+            // parameters (like TYPE).
+            //
+            // We pick the simple string option only if the value
+            // contains only alphanumeric characters plus hyphen and
+            // underscore. Spaces are allowed by the RFC, but are
+            // known to cause issues in other parsers (EDS before
+            // 3.10) unless used in a quoted string, therefore we
+            // are more conservative than the RFC.
+            bool quotedstring = false;
+            if (aParamValue) {
+              for (const char *p=val.c_str();(c=*p)!=0;p++) {
+                if (!(isalnum(c) ||
+                      c == '-' ||
+                      c == '_')) {
+                  quotedstring = true;
+                  outval+='"';
+                  break;
+                }
+              }
+            }
+
             // perform escaping and determine need for encoding
             bool spaceonly = true;
             bool firstchar = true;
@@ -2380,24 +2408,26 @@ sInt16 TMimeDirProfileHandler::generateValue(
               // escape reserved chars
               switch (c) {
                 case '"':
-                  if (firstchar && aParamValue && aMimeMode==mimo_standard) goto do_escape; // if param value starts with a double quote, we need to escape it because param value can be in double-quote-enclosed form
+                  if (aParamValue) { c = '\''; goto add_char; } // replace double quotes with single quotes
+                  // if (firstchar && aParamValue && aMimeMode==mimo_standard) goto do_escape; // if param value starts with a double quote, we need to escape it because param value can be in double-quote-enclosed form
                   goto add_char; // otherwise, just add
                 case ',':
                   // in MIME-DIR, always escape commas, in pre-MIME-DIR only if usage in value list requires it
-                  if (!aCommaEscape && aMimeMode==mimo_old) goto add_char;
+                  if ((!aCommaEscape && aMimeMode==mimo_old) || quotedstring) goto add_char;
                   goto do_escape;
                 case ':':
                   // always escape colon in parameters
-                  if (!aParamValue) goto add_char;
+                  if (!aParamValue || quotedstring) goto add_char;
                   goto do_escape;
                 case '\\':
+                  if (quotedstring) goto add_char;
                   // Backslash must always be escaped
                   // - for MIMO-old: at least Nokia 9210 does it this way
                   // - for MIME-DIR: specified in the standard
                   goto do_escape;
                 case ';':
                   // in MIME-DIR, always escape semicolons, in pre-MIME-DIR only in parameters and structured values
-                  if (!aParamValue && !aStructured && aMimeMode==mimo_old) goto add_char;
+                  if ((!aParamValue && !aStructured && aMimeMode==mimo_old) || quotedstring) goto add_char;
                 do_escape:
                   if (!aEscapeOnlyLF) {
                     // escape chars with backslash
@@ -2408,6 +2438,7 @@ sInt16 TMimeDirProfileHandler::generateValue(
                   // ignore returns
                   break;
                 case '\n':
+                  if (quotedstring) { c = ' '; goto add_char; }
                   // quote linefeeds
                   if (aMimeMode==mimo_old) {
                     if (aEncoding==enc_none) {
@@ -2435,6 +2466,12 @@ sInt16 TMimeDirProfileHandler::generateValue(
                   break;
               }
             } // for all chars in val item
+
+            // terminate quoted string parameter value
+            if (quotedstring) {
+              outval+='"';
+            }
+
             // go to next item in the val list (if any)
             if (*lp!=0) {
               // more items in the list
@@ -3710,7 +3747,7 @@ bool TMimeDirProfileHandler::parseValue(
             break;
           }
           // check for escaped chars
-          if (c=='\\') {
+          if (!aParamValue && c=='\\') {
             p++;
             c=*p;
             if (!c) break; // half escape sequence, ignore
@@ -3914,19 +3951,6 @@ bool TMimeDirProfileHandler::parseProperty(
         else {
           // not within double quoted value
           if (c==':' || c==';') break; // end of value
-          // check escaped characters
-          if (c=='\\') {
-            // escape char, do not check next char for end-of-value (but DO NOT expand \-escaped chars here!!)
-            vp=nextunfolded(vp,aMimeMode);
-            c=*vp; // get next
-            if (c) {
-              val+='\\'; // keep the escaped sequence for later when value is actually processed!
-            }
-            else {
-              // half-finished escape at end of value, ignore
-              break;
-            }
-          }
         }
         val+=c;
         // cancel QP softbreaks if encoding is already switched to QP at this point
-- 
1.7.10.4

_______________________________________________
os-libsynthesis mailing list
os-libsynthesis@synthesis.ch
http://lists.synthesis.ch/mailman/listinfo/os-libsynthesis

Reply via email to