D7030: dirs: fix trivial over-read of input data

2019-10-09 Thread durin42 (Augie Fackler)
Closed by commit rHG2a0774e9d2a8: dirs: fix trivial over-read of input data 
(authored by durin42).
This revision was automatically updated to reflect the committed changes.

REPOSITORY
  rHG Mercurial

CHANGES SINCE LAST UPDATE
  https://phab.mercurial-scm.org/D7030?vs=16986&id=17031

CHANGES SINCE LAST ACTION
  https://phab.mercurial-scm.org/D7030/new/

REVISION DETAIL
  https://phab.mercurial-scm.org/D7030

AFFECTED FILES
  mercurial/cext/dirs.c

CHANGE DETAILS

diff --git a/mercurial/cext/dirs.c b/mercurial/cext/dirs.c
--- a/mercurial/cext/dirs.c
+++ b/mercurial/cext/dirs.c
@@ -68,26 +68,41 @@
while ((pos = _finddir(cpath, pos - 1)) != -1) {
PyObject *val;
 
-   /* It's likely that every prefix already has an entry
-  in our dict. Try to avoid allocating and
-  deallocating a string for each prefix we check. */
-   if (key != NULL)
-   ((PyBytesObject *)key)->ob_shash = -1;
-   else {
-   /* Force Python to not reuse a small shared string. */
-   key = PyBytes_FromStringAndSize(cpath,
-pos < 2 ? 2 : pos);
+   if (pos < 2) {
+   key = PyBytes_FromStringAndSize(cpath, pos);
if (key == NULL)
goto bail;
+   } else {
+   /* It's likely that every prefix already has an entry
+  in our dict. Try to avoid allocating and
+  deallocating a string for each prefix we check. */
+   if (key != NULL)
+   ((PyBytesObject *)key)->ob_shash = -1;
+   else {
+   /* We know pos >= 2, so we won't get a small
+* shared string. */
+   key = PyBytes_FromStringAndSize(cpath, pos);
+   if (key == NULL)
+   goto bail;
+   }
+   /* Py_SIZE(o) refers to the ob_size member of
+* the struct. Yes, assigning to what looks
+* like a function seems wrong. */
+   Py_SIZE(key) = pos;
+   ((PyBytesObject *)key)->ob_sval[pos] = '\0';
}
-   /* Py_SIZE(o) refers to the ob_size member of the struct. Yes,
-   * assigning to what looks like a function seems wrong. */
-   Py_SIZE(key) = pos;
-   ((PyBytesObject *)key)->ob_sval[pos] = '\0';
 
val = PyDict_GetItem(dirs, key);
if (val != NULL) {
PYLONG_VALUE(val) += 1;
+   if (pos < 2) {
+   /* This was a short string, so we
+* probably got a small shared string
+* we can't mutate on the next loop
+* iteration. Clear it.
+*/
+   Py_CLEAR(key);
+   }
break;
}
 



To: durin42, #hg-reviewers, indygreg
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel


D7030: dirs: fix trivial over-read of input data

2019-10-09 Thread durin42 (Augie Fackler)
durin42 created this revision.
Herald added a subscriber: mercurial-devel.
Herald added a reviewer: hg-reviewers.

REVISION SUMMARY
  This code, introduced in 8c0a7eeda06d 
, 
was intentionally over-reading
  an input string to avoid getting a shared string object for a one-byte
  input. Unfortunately with an empty input (like in the case of a fuzzer
  getting started) this was a trivial over-read and triggered an
  AddressSanitizer failure.
  
  I went out of my way to make sure the code still does the
  copy-avoidance tricks. I don't think this change will cost us much
  performance since the one-character strings should be cached
  aggressively anyway.

REPOSITORY
  rHG Mercurial

REVISION DETAIL
  https://phab.mercurial-scm.org/D7030

AFFECTED FILES
  mercurial/cext/dirs.c

CHANGE DETAILS

diff --git a/mercurial/cext/dirs.c b/mercurial/cext/dirs.c
--- a/mercurial/cext/dirs.c
+++ b/mercurial/cext/dirs.c
@@ -68,26 +68,41 @@
while ((pos = _finddir(cpath, pos - 1)) != -1) {
PyObject *val;
 
-   /* It's likely that every prefix already has an entry
-  in our dict. Try to avoid allocating and
-  deallocating a string for each prefix we check. */
-   if (key != NULL)
-   ((PyBytesObject *)key)->ob_shash = -1;
-   else {
-   /* Force Python to not reuse a small shared string. */
-   key = PyBytes_FromStringAndSize(cpath,
-pos < 2 ? 2 : pos);
+   if (pos < 2) {
+   key = PyBytes_FromStringAndSize(cpath, pos);
if (key == NULL)
goto bail;
+   } else {
+   /* It's likely that every prefix already has an entry
+  in our dict. Try to avoid allocating and
+  deallocating a string for each prefix we check. */
+   if (key != NULL)
+   ((PyBytesObject *)key)->ob_shash = -1;
+   else {
+   /* We know pos >= 2, so we won't get a small
+* shared string. */
+   key = PyBytes_FromStringAndSize(cpath, pos);
+   if (key == NULL)
+   goto bail;
+   }
+   /* Py_SIZE(o) refers to the ob_size member of
+* the struct. Yes, assigning to what looks
+* like a function seems wrong. */
+   Py_SIZE(key) = pos;
+   ((PyBytesObject *)key)->ob_sval[pos] = '\0';
}
-   /* Py_SIZE(o) refers to the ob_size member of the struct. Yes,
-   * assigning to what looks like a function seems wrong. */
-   Py_SIZE(key) = pos;
-   ((PyBytesObject *)key)->ob_sval[pos] = '\0';
 
val = PyDict_GetItem(dirs, key);
if (val != NULL) {
PYLONG_VALUE(val) += 1;
+   if (pos < 2) {
+   /* This was a short string, so we
+* probably got a small shared string
+* we can't mutate on the next loop
+* iteration. Clear it.
+*/
+   Py_CLEAR(key);
+   }
break;
}
 



To: durin42, #hg-reviewers
Cc: mercurial-devel
___
Mercurial-devel mailing list
Mercurial-devel@mercurial-scm.org
https://www.mercurial-scm.org/mailman/listinfo/mercurial-devel