Qgil added a subscriber: Petrb.
Qgil added a comment.

Petr Bena via lists.wikimedia.org @petrb ?
2:20 PM (2 hours ago)

to Wikimedia 
After a short investigation, the answer is pretty straightforward and
explained in https://bugzilla.mozilla.org/show_bug.cgi?id=839023

quoting:

U+0000-U+001F are illegal in HTML 4.0 and XML 1.0 (except the
characters HT, LF and CR). And it's not permitted to use named
character references such as  either (although it is permitted
in XML 1.1, except for NUL):
http://www.w3.org/International/questions/qa-controls
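As a rough illustration of the rule quoted above, the following Python sketch lists the control characters in a string that XML 1.0 would reject. The function name and the regular expression are illustrative assumptions, not code from Bugzilla or from the linked bug.

```python
import re

# Code points below U+0020 that XML 1.0 does not allow: everything except
# U+0009 (tab), U+000A (LF) and U+000D (CR).
ILLEGAL_XML10_CONTROLS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def find_illegal_controls(text):
    """Return (offset, code point) pairs for each illegal control character."""
    return [(m.start(), ord(m.group()))
            for m in ILLEGAL_XML10_CONTROLS.finditer(text)]
```

A check like this could be run over exported comments to see how many of them are affected before picking a fix.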

possible fixes:

* Run an SQL query that finds and replaces these characters
* Patch Bugzilla so that it replaces them during XML conversion
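The first option might look roughly like the Python sketch below. It runs against an in-memory SQLite database so that it is self-contained. The table name `comments` and the columns `comment_id` and `thetext` are hypothetical stand-ins, the real Bugzilla schema and database engine will differ, and dropping the characters (rather than substituting something for them) is just one possible choice.

```python
import re
import sqlite3

# Control characters that XML 1.0 forbids (tab, LF and CR are allowed).
ILLEGAL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def clean_comments(conn):
    """Strip illegal control characters from every row; return rows changed."""
    rows = conn.execute("SELECT comment_id, thetext FROM comments").fetchall()
    changed = 0
    for comment_id, text in rows:
        cleaned = ILLEGAL.sub("", text)
        if cleaned != text:
            conn.execute(
                "UPDATE comments SET thetext = ? WHERE comment_id = ?",
                (cleaned, comment_id),
            )
            changed += 1
    conn.commit()
    return changed


def demo():
    """Build a tiny hypothetical table, clean it, and return the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE comments (comment_id INTEGER PRIMARY KEY, thetext TEXT)"
    )
    conn.executemany(
        "INSERT INTO comments (thetext) VALUES (?)",
        [("clean text\n",), ("bad\x01text\x1f",)],
    )
    changed = clean_comments(conn)
    texts = [row[0] for row in
             conn.execute("SELECT thetext FROM comments ORDER BY comment_id")]
    return changed, texts
```

Doing the replacement in application code rather than in a single SQL statement keeps the character class in one place, at the cost of reading every row.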

Inside Bugzilla/WebService/Server/XMLRPC.pm, in _strip_undefs, at the
end of the function (around line 250):

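    # Escape the control characters \x01-\x1f that XML 1.0 forbids;
    # tab (\x09), LF (\x0a) and CR (\x0d) are legal and left alone.
    if (ref($initial) eq '')
    {
      $initial =~ s/([\x01-\x08\x0b\x0c\x0e-\x1f])/sprintf "\\x%02x",ord($1)/ge;
    }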

should do the trick, but it does damage some binaries. Do we
actually want to export them? XML is not a good format for
exporting binary files, as it doesn't allow some characters. What
about getting them out using an SQL query? Why do we even need to use
XML? Is it the only way to import into Phab?
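To make the trade-off concrete, here is a small Python sketch of the two approaches. `escape_controls` mimics the idea of the Perl substitution above (using the range \x01-\x08, \x0b, \x0c, \x0e-\x1f) and changes the data, which is why binaries get damaged. The base64 helpers show one way binary content can travel through XML unchanged. All function names are illustrative, and whether the Bugzilla export or the Phabricator import could use base64 here is an assumption, not something established in this thread.

```python
import base64
import re

# Control characters escaped by the substitution (tab, LF, CR untouched).
ILLEGAL = re.compile(r"[\x01-\x08\x0b\x0c\x0e-\x1f]")


def escape_controls(text):
    """Replace each illegal control character with a literal \\xNN sequence."""
    return ILLEGAL.sub(lambda m: "\\x%02x" % ord(m.group()), text)


def encode_binary(data):
    """Encode raw bytes as base64 text, which contains no control characters."""
    return base64.b64encode(data).decode("ascii")


def decode_binary(payload):
    """Recover the original bytes from base64 text."""
    return base64.b64decode(payload)
```

The escaped text is XML-safe but no longer identical to the original, whereas `decode_binary(encode_binary(data))` returns the original bytes. XML-RPC defines a `<base64>` value type for this kind of payload.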

TASK DETAIL
  https://phabricator.wikimedia.org/T815

To: Qgil
Cc: wikibugs-l, chasemp, Dzahn, QChris, Aklapper, Qgil, luser, Amire80, jayvdb, 
Liuxinyu970226, Petrb



_______________________________________________
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
