* Fay Stegerman <f...@obfusk.net> [2024-08-23 17:38]:
> * Fay Stegerman <f...@obfusk.net> [2024-08-22 23:12]:
> > * Alan Coopersmith <alan.coopersm...@oracle.com> [2024-08-22 20:56]:
> > > -------- Forwarded Message --------
> > > Subject:  [Security-announce][CVE-2024-8088] Infinite loop when iterating
> > > over zip archive entry names
> > > Date:     Thu, 22 Aug 2024 13:40:20 -0500
> > > From:     Seth Larson <s...@python.org>
> > > Reply-To:         security-...@python.org
> > > To:       security-annou...@python.org
> > >
> > > There is a HIGH severity vulnerability affecting the CPython "zipfile" 
> > > module.
> > >
> > > When iterating over names of entries in a zip archive (for example, 
> > > methods
> > > of "zipfile.ZipFile" like "namelist()", "iterdir()", "extractall()", etc)
> > > the process can be put into an infinite loop with a maliciously crafted
> > > zip archive. This defect applies when reading only metadata or extracting
> > > the contents of the zip archive. Programs that are not handling
> > > user-controlled zip archives are not affected.
> > >
> > > Please see the linked CVE ID for the latest information on affected 
> > > versions:
> > >
> > > * https://www.cve.org/CVERecord?id=CVE-2024-8088
> > > * https://github.com/python/cpython/pull/122906
> > > * https://github.com/python/cpython/issues/122905
> >
> > A small correction/addendum based on reading the vulnerability report and 
> > the PR
> > that fixes this (as well as being quite familiar with Python zipfile.ZipFile
> > internals and confused how this would affect it): it's not zipfile.ZipFile 
> > and
> > its methods that are affected, at least not directly, but zipfile.Path.  The
> > issue being this code in zipfile._path._ancestry():
> >
> >   path = path.rstrip(posixpath.sep)
> >   while path and path != posixpath.sep:
> >       yield path
> >       path, tail = posixpath.split(path)
> >
> > Which results in an infinite loop because for example posixpath.split("//") 
> > ==
> > ("//", "") but "//" != posixpath.sep:
> >
> >   >>> it = zipfile._path._parents("//foo")
> >   >>> next(it)
> >   '//'
> >   >>> next(it)
> >   '//'
> >   >>> next(it)
> >   '//'
> >
> > The infinite loop has been fixed by sanitising the paths.
> 
> Forgot to mention this: the infinite loop is triggered when zipfile.Path adds
> "implied directories" -- using _parents(), which calls _ancestry() -- in the
> overridden .namelist() for the custom zipfile.ZipFile subclass it wraps.  
> Which
> is (indirectly) used by almost all of the zipfile.Path methods like 
> .iterdir(),
> .glob(), .exists(), .joinpath() etc.
> 
>   >>> zf = zipfile.ZipFile(io.BytesIO(), "w")
>   >>> zf.filename = "foo.zip"
>   >>> zf.writestr("a/b/c", "abc")
>   >>> zf.writestr("d/e", "de")
>   >>> zf.namelist()
>   ['a/b/c', 'd/e']
>   >>> zf.__class__
>   <class 'zipfile.ZipFile'>
> 
>   >>> p = zipfile.Path(zf)
>   >>> p.root.namelist()
>   ['a/b/c', 'd/e', 'a/b/', 'a/', 'd/']
>   >>> list(p.iterdir())
>   [Path('foo.zip', 'a/'), Path('foo.zip', 'd/')]
>   >>> zf.__class__
>   <class 'zipfile._path.CompleteDirs'>
> 
>   >>> zf.writestr("//oops", "oops")
>   >>> # infinite loop via joinpath -> resolve_dir -> _name_set -> namelist ->
>   >>> # _implied_dirs -> _ancestry
>   >>> p / "foo"
> 
> As zipfile.Path modifies the class of the original ZipFile, calling 
> .namelist()
> or .extractall() on the original ZipFile used to create the Path afterwards is
> also affected even though zipfile.ZipFile as such is not.

I just reported [1] that the patch introduced a regression:

  >>> import io, zipfile
  >>> zf = zipfile.ZipFile(io.BytesIO(), "w")
  >>> zf.writestr("d:/foo", "bar")
  >>> zf.extractall("a")
  >>> open("a/d:/foo").read()
  'bar'
  >>> p = zipfile.Path(zf)
  >>> x = p / "d" / "foo"
  >>> y = p / "d:" / "foo"
  >>> list(p.iterdir())   # before: [Path(None, 'd:/')]
  [Path(None, 'd/')]
  >>> p.root.namelist()   # before: ['d:/foo', 'd:/']
  ['d/foo', 'd/']
  >>> x.exists()          # before: False
  True
  >>> y.exists()          # before: True
  False
  >>> zf.extractall("b")  # before: worked like above
  KeyError: "There is no item named 'd/foo' in the archive"
  >>> x.read_text()       # before: FileNotFoundError
  KeyError: "There is no item named 'd/foo' in the archive"
  >>> y.read_text()       # before: worked
  FileNotFoundError: ...

- Fay

[1] https://github.com/python/cpython/issues/123270

Reply via email to