Resending, since I don't think this email reached the libxml mailing list:
I was not subscribed when I first sent it.

---------- Forwarded message ----------
From: Joel Hockey <joelhoc...@chromium.org>
Date: Wed, Jan 3, 2018 at 5:01 PM
Subject: Re: [xml] Patch to fix ICU flush and pivot buffer
To: "Jungshik Shin (신정식, 申政湜)" <js...@chromium.org>
Cc: Nick Wellnhofer <wellnho...@aevum.de>, Markus Scherer <
msche...@google.com>, "xml@gnome.org" <xml@gnome.org>, Markus Scherer <
markus....@gmail.com>


Nick, I have another patch covering additional call sites of xmlCharEncInput
where flush is incorrectly set to 1 on a non-final read.

This was found by the Chromium fuzzing tests.
https://bugs.chromium.org/p/chromium/issues/detail?id=790944

I have included a test case for this, which uses UTF-8 and only works with
ICU.

I saw that you were able to create a test case with EUC-JP last time, which
worked with both ICU and iconv.  I've tried quite a bit to do something
similar, but I can't replicate the error condition with that encoding.  I
don't expect that you would want to check in this test case, but I've
included it for you to run locally if you like.
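
To illustrate why flush matters on a non-final read, here is a minimal
sketch of a toy streaming UTF-8 validator. It is not libxml2 or ICU code:
toy_decode, seq_len, and the DEC_* codes are hypothetical names made up for
this example. It assumes only that a converter treats flush as "no more
input will follow", so a multi-byte sequence cut off at the end of a chunk
is an error when flush is set and is simply left for the next call when it
is not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes for this sketch (not libxml2 or ICU values). */
enum { DEC_OK = 0, DEC_TRUNCATED = -1, DEC_INVALID = -2 };

/* Length of the UTF-8 sequence introduced by lead byte b, or 0 if b
 * cannot start a sequence. */
static size_t seq_len(unsigned char b)
{
    if (b < 0x80) return 1;
    if (b >= 0xC2 && b <= 0xDF) return 2;
    if (b >= 0xE0 && b <= 0xEF) return 3;
    if (b >= 0xF0 && b <= 0xF4) return 4;
    return 0;
}

/*
 * Consume as many complete sequences from in[0..len) as possible and
 * store the number of bytes consumed in *consumed.
 *
 * If the chunk ends in the middle of a sequence:
 *   flush == 0: the partial bytes are left unconsumed and DEC_OK is
 *               returned, so the caller can retry with more input;
 *   flush != 0: the caller promised there is no more input, so the
 *               partial sequence is reported as DEC_TRUNCATED.
 *
 * Only the lead/continuation byte pattern is checked; overlong forms
 * and surrogates are not rejected in this sketch.
 */
int toy_decode(const unsigned char *in, size_t len, int flush,
               size_t *consumed)
{
    size_t i = 0;

    assert(in != NULL && consumed != NULL);
    while (i < len) {
        size_t n = seq_len(in[i]);
        size_t k;

        if (n == 0) {
            *consumed = i;
            return DEC_INVALID;
        }
        /* Check whatever continuation bytes are present in this chunk. */
        for (k = 1; k < n && i + k < len; k++) {
            if ((in[i + k] & 0xC0) != 0x80) {
                *consumed = i;
                return DEC_INVALID;
            }
        }
        if (i + n > len) {
            /* Sequence is cut off at the chunk boundary. */
            *consumed = i;
            return flush ? DEC_TRUNCATED : DEC_OK;
        }
        i += n;
    }
    *consumed = i;
    return DEC_OK;
}
```

For example, if the two bytes of U+00A3 (0xC2 0xA3) straddle a read
boundary, calling toy_decode on the first chunk { 'a', 0xC2 } with flush set
reports DEC_TRUNCATED, while the same call with flush clear consumes 'a',
leaves 0xC2 pending, and succeeds once the next chunk supplies 0xA3.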


On Thu, Nov 9, 2017 at 11:36 AM, Joel Hockey <joelhoc...@chromium.org>
wrote:

> Yes, I will update chromium with this as per
> https://cs.chromium.org/chromium/src/third_party/libxml/chromium/roll.py
>
> On Thu, Nov 9, 2017 at 10:35 AM, Jungshik Shin (신정식, 申政湜) <
> js...@chromium.org> wrote:
>
>> Thank you, Joel and Nick !
>>
>> Joel:  I guess you're gonna roll libxml in the Chromium tree to a version
>> including these changes.
>>
>> Jungshik
>>
>> 2017-11-08 15:22 GMT-08:00 Joel Hockey <joelhoc...@chromium.org>:
>>
>>> Thanks Nick.  Nice work with the test.
>>>
>>>
>>>
>>> On Sun, Nov 5, 2017 at 2:04 AM, Nick Wellnhofer <wellnho...@aevum.de>
>>> wrote:
>>>
>>>> On 26/10/2017 03:17, Joel Hockey wrote:
>>>>
>>>>> I've updated the patch using git format-patch.
>>>>>
>>>>
>>>> Thanks for the updated patch. Applied here:
>>>> https://git.gnome.org/browse/libxml2/commit/?id=0b19f236a263a7b0acacd4ea84dc7237303ee3d9
>>>>
>>>>> The original bug found by the fuzzer only relates to UTF8 decoding, so
>>>>> using Shift-JIS or anything else won't help.
>>>>>
>>>>
>>>> Why not? My reasoning was that ICU uses the same code path for all
>>>> variable-width encodings. I simply converted your test file to EUC-JP and
>>>> it turns out that this triggers the bug as well:
>>>>
>>>> https://git.gnome.org/browse/libxml2/commit/?id=72182550926d31ad17357bd3ed69e49d7e69df02
>>>>
>>>> Nick
>>>>
>>>
>>>
>>
>
From 441e1e413a8f67c0813fa0e04b19dfea76e5ece6 Mon Sep 17 00:00:00 2001
From: Joel Hockey <joel.hoc...@gmail.com>
Date: Tue, 2 Jan 2018 21:47:35 -0800
Subject: [PATCH] Change calls to xmlCharEncInput to set flush false when not
 final call. Having flush incorrectly set to true causes errors for ICU.

---
 HTMLparser.c      | 2 +-
 parserInternals.c | 2 +-
 xmlIO.c           | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/HTMLparser.c b/HTMLparser.c
index 7e243e60..9adeb174 100644
--- a/HTMLparser.c
+++ b/HTMLparser.c
@@ -3635,7 +3635,7 @@ htmlCheckEncodingDirect(htmlParserCtxtPtr ctxt, const xmlChar *encoding) {
 	     */
 	    processed = ctxt->input->cur - ctxt->input->base;
 	    xmlBufShrink(ctxt->input->buf->buffer, processed);
-	    nbchars = xmlCharEncInput(ctxt->input->buf, 1);
+	    nbchars = xmlCharEncInput(ctxt->input->buf, 0);
 	    if (nbchars < 0) {
 		htmlParseErr(ctxt, XML_ERR_INVALID_ENCODING,
 		             "htmlCheckEncoding: encoder error\n",
diff --git a/parserInternals.c b/parserInternals.c
index 09876ab4..8c0cd57a 100644
--- a/parserInternals.c
+++ b/parserInternals.c
@@ -1214,7 +1214,7 @@ xmlSwitchInputEncodingInt(xmlParserCtxtPtr ctxt, xmlParserInputPtr input,
                 /*
                  * convert as much as possible of the buffer
                  */
-                nbchars = xmlCharEncInput(input->buf, 1);
+                nbchars = xmlCharEncInput(input->buf, 0);
             } else {
                 /*
                  * convert just enough to get
diff --git a/xmlIO.c b/xmlIO.c
index f61dd05a..82543477 100644
--- a/xmlIO.c
+++ b/xmlIO.c
@@ -3157,7 +3157,7 @@ xmlParserInputBufferPush(xmlParserInputBufferPtr in,
 	 * convert as much as possible to the parser reading buffer.
 	 */
 	use = xmlBufUse(in->raw);
-	nbchars = xmlCharEncInput(in, 1);
+	nbchars = xmlCharEncInput(in, 0);
 	if (nbchars < 0) {
 	    xmlIOErr(XML_IO_ENCODER, NULL);
 	    in->error = XML_IO_ENCODER;
@@ -3273,7 +3273,7 @@ xmlParserInputBufferGrow(xmlParserInputBufferPtr in, int len) {
 	 * convert as much as possible to the parser reading buffer.
 	 */
 	use = xmlBufUse(in->raw);
-	nbchars = xmlCharEncInput(in, 1);
+	nbchars = xmlCharEncInput(in, 0);
 	if (nbchars < 0) {
 	    xmlIOErr(XML_IO_ENCODER, NULL);
 	    in->error = XML_IO_ENCODER;
-- 
2.15.1.620.gb9897f4670-goog

<?xml version="1.0" encoding="UTF8-"?>
<foo>
Text with UTF8 chars at position 214 (0xd6) and 513 (0x201)
______
_______________
_______________
_______________
_______________
_______________
_______________
____񑘓£_____
_______________
_______________
_______________
_______________
_______________
_______________
_______________
_______________
_______________
_______________
_______________
_______________
_______________
_______________
_______________
_______________
_______________
_______________
_񑘓£________
</foo>
