Package: release.debian.org
Severity: normal
Tags: bullseye
User: release.debian....@packages.debian.org
Usertags: pu
X-Debbugs-Cc: pan...@packages.debian.org, Guilhem Moulin <guil...@debian.org>
Control: affects -1 + src:pandoc

[ Reason ]

pandoc 2.9.2.1-1 is vulnerable to CVE-2023-35936: Arbitrary file write
vulnerability via specially crafted image element in the input when generating
files using the `--extract-media` option or outputting to PDF format.
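
To make the traversal concrete, here is a minimal Haskell sketch (not pandoc
code). It percent-decodes the media file name shown in the PoC output quoted in
CVE-2023-38745.patch and checks whether the result climbs out of the extraction
directory. `unescapeOnce` and `escapes` are hypothetical helpers written for
this illustration; the former only stands in for a single pass of
`Network.URI.unEscapeString`.

```haskell
import Data.Char (chr, digitToInt, isHexDigit)
import System.FilePath (splitDirectories)

-- Hypothetical stand-in for one pass of percent-decoding.
unescapeOnce :: String -> String
unescapeOnce ('%' : a : b : rest)
  | isHexDigit a && isHexDigit b =
      chr (16 * digitToInt a + digitToInt b) : unescapeOnce rest
unescapeOnce (c : rest) = c : unescapeOnce rest
unescapeOnce [] = []

-- True when a relative path, followed component by component,
-- climbs above the directory it starts in.
escapes :: FilePath -> Bool
escapes = go 0 . splitDirectories
  where
    go :: Int -> [FilePath] -> Bool
    go _ [] = False
    go depth (".." : rest)
      | depth == 0 = True
      | otherwise = go (depth - 1) rest
    go depth ("." : rest) = go depth rest
    go depth (_ : rest) = go (depth + 1) rest

main :: IO ()
main = do
  -- File name taken from the PoC output in the patch header.
  let name = "2a0eaa89f43fada3e6c577beea4f2f8f53ab6a1d.lua+%2f%2e%2e%2f%2e%2e%2fb%2elua"
      decoded = unescapeOnce name
  putStrLn decoded         -- 2a0eaa89f43fada3e6c577beea4f2f8f53ab6a1d.lua+/../../b.lua
  print (escapes decoded)  -- True
```

The decoded name first descends into a directory ending in `.lua+` and then
goes up twice, which matches the PoC output: `b.lua` lands next to the
extraction directory rather than inside it.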

The Security Team decided not to issue a DSA for that CVE, but it's now fixed in
buster-security (2.2.1-3+deb10u1) as well as sid (2.17.1.1-2), so it makes sense
to fix it via (o)s-pu too.

[ Impact ]

Without this update, users upgrading from buster-security to bullseye would
face a security regression.

[ Tests ]

A new unit test was added upstream and backported along with the code fixes.
The test suite is now run at build time (this was not the case before due to
#1010179; in fact, some unit tests had to be updated for the suite to pass).
I also manually verified that the PoCs no longer succeed.

[ Risks ]

The upstream fixes were not trivial to backport due to major refactoring, but
test coverage is good.  (Upstream changes to pandoc.cabal are a no-op as far as
Debian packaging is concerned.)

[ Checklist ]

  [x] *all* changes are documented in the d/changelog
  [x] I reviewed all changes and I approve them
  [x] attach debdiff against the package in oldstable
  [x] the issue is verified as fixed in unstable

[ Changes ]

  * Add d/salsa-ci.yml for Salsa CI.
  * Fix upstream test suite and make sure it is run at build time (cf.
    #1010179).
  * Fix CVE-2023-35936 and CVE-2023-38745: Arbitrary file write vulnerability
    via specially crafted image element in the input when generating files using
    the `--extract-media` option or outputting to PDF format. (Closes: #1041976)
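
As a rough illustration of the extension handling after both patches, the
following self-contained Haskell sketch mirrors the `case` expression added to
`fetchMediaResource` in CVE-2023-38745.patch. It is not the pandoc code itself:
`safeExtension` and `unescapeOnce` are hypothetical names, `unescapeOnce` only
approximates `unEscapeString`, and the MIME-derived extension is passed in as a
plain `Maybe String` (assumed here to be "png" for image/png).

```haskell
import Data.Char (chr, digitToInt, isHexDigit)
import System.FilePath (takeExtension)

-- Hypothetical stand-in for one pass of percent-decoding.
unescapeOnce :: String -> String
unescapeOnce ('%' : a : b : rest)
  | isHexDigit a && isHexDigit b =
      chr (16 * digitToInt a + digitToInt b) : unescapeOnce rest
unescapeOnce (c : rest) = c : unescapeOnce rest
unescapeOnce [] = []

-- Keep the extension from the decoded source only if it contains no '%'
-- (i.e. nothing is left to decode); otherwise fall back to the extension
-- derived from the MIME type, or to no extension at all.
safeExtension :: Maybe String -> String -> String
safeExtension mimeExt src =
  case takeExtension (unescapeOnce src) of
    '.' : e | '%' `notElem` e -> '.' : e
    _ -> maybe "" ('.' :) mimeExt

main :: IO ()
main = do
  -- Single-encoded payload from the upstream unit test.
  putStrLn (safeExtension (Just "png")
    "data://image/png;base64,cHJpbnQgImhlbGxvIgo=;.lua+%2f%2e%2e%2f%2e%2e%2fa%2elua")
  -- prints ".lua"
  -- Double-encoded payload from the CVE-2023-38745 PoC.
  putStrLn (safeExtension (Just "png")
    "data://image/png;base64,cHJpbnQgImhlbGxvIgo=;.lua+%252f%252e%252e%252f%252e%252e%252fb%252elua")
  -- prints ".png"
```

Both results agree with the file names the backported unit test expects: a
hashed `.lua` file for the single-encoded payload and a hashed `.png` file for
the double-encoded one.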

-- 
Guilhem.
diffstat for pandoc-2.9.2.1 pandoc-2.9.2.1

 changelog                                         |   13 +
 patches/2001_templates_avoid_privacy_breach.patch |   72 ++++++++++-
 patches/Adjust-tests.patch                        |  133 ++++++++++++++++++++
 patches/CVE-2023-35936.patch                      |  144 ++++++++++++++++++++++
 patches/CVE-2023-38745.patch                      |   92 ++++++++++++++
 patches/series                                    |    3 
 rules                                             |    2 
 salsa-ci.yml                                      |    8 +
 8 files changed, 462 insertions(+), 5 deletions(-)

diff -Nru pandoc-2.9.2.1/debian/changelog pandoc-2.9.2.1/debian/changelog
--- pandoc-2.9.2.1/debian/changelog     2020-08-23 10:24:33.000000000 +0200
+++ pandoc-2.9.2.1/debian/changelog     2023-07-21 19:59:53.000000000 +0200
@@ -1,3 +1,16 @@
+pandoc (2.9.2.1-1+deb11u1) bullseye; urgency=high
+
+  * Non-maintainer upload.
+  * Add d/salsa-ci.yml for Salsa CI.
+  * Fix upstream test suite and make sure it is run at build time (cf.
+    #1010179).
+  * Fix CVE-2023-35936 and CVE-2023-38745: Arbitrary file write vulnerability
+    via specially crafted image element in the input when generating files
+    using the `--extract-media` option or outputting to PDF format. (Closes:
+    #1041976)
+
+ -- Guilhem Moulin <guil...@debian.org>  Fri, 21 Jul 2023 19:59:53 +0200
+
 pandoc (2.9.2.1-1) unstable; urgency=medium
 
   [ upstream ]
diff -Nru pandoc-2.9.2.1/debian/patches/2001_templates_avoid_privacy_breach.patch pandoc-2.9.2.1/debian/patches/2001_templates_avoid_privacy_breach.patch
--- pandoc-2.9.2.1/debian/patches/2001_templates_avoid_privacy_breach.patch     2020-08-23 09:39:53.000000000 +0200
+++ pandoc-2.9.2.1/debian/patches/2001_templates_avoid_privacy_breach.patch     2023-07-21 19:59:53.000000000 +0200
@@ -1,9 +1,12 @@
 Description: Avoid potential privacy breaches in templates
 Author: Jonas Smedegaard <d...@jones.dk>
 License: GPL-3+
-Last-Update: 2018-06-12
+Last-Update: 2023-07-21
 ---
 This patch header follows DEP-3: http://dep.debian.net/deps/dep3/
+
+diff --git a/data/dzslides/template.html b/data/dzslides/template.html
+index 56ef896..c3c4c9e 100644
 --- a/data/dzslides/template.html
 +++ b/data/dzslides/template.html
 @@ -48,7 +48,7 @@
@@ -42,9 +45,11 @@
        font-size: 30px;
    }
  
+diff --git a/data/templates/default.dzslides b/data/templates/default.dzslides
+index 11103ab..21bb8fa 100644
 --- a/data/templates/default.dzslides
 +++ b/data/templates/default.dzslides
-@@ -20,15 +20,12 @@
+@@ -20,15 +20,12 @@ $for(css)$
    <link rel="stylesheet" href="$css$">
  $endfor$
  $else$
@@ -61,9 +66,11 @@
        font-size: 30px;
    }
  
+diff --git a/data/templates/default.html5 b/data/templates/default.html5
+index 0676215..724cde4 100644
 --- a/data/templates/default.html5
 +++ b/data/templates/default.html5
-@@ -23,9 +23,6 @@
+@@ -23,9 +23,6 @@ $endfor$
  $if(math)$
    $math$
  $endif$
@@ -73,9 +80,11 @@
  $for(header-includes)$
    $header-includes$
  $endfor$
+diff --git a/src/Text/Pandoc/Options.hs b/src/Text/Pandoc/Options.hs
+index fde8a9a..c40c3ec 100644
 --- a/src/Text/Pandoc/Options.hs
 +++ b/src/Text/Pandoc/Options.hs
-@@ -308,10 +308,10 @@
+@@ -308,10 +308,10 @@ isEnabled :: HasSyntaxExtensions a => Extension -> a -> Bool
  isEnabled ext opts = ext `extensionEnabled` getExtensions opts
  
  defaultMathJaxURL :: Text
@@ -88,3 +97,58 @@
  
  $(deriveJSON defaultOptions ''ReaderOptions)
  
+diff --git a/test/lhs-test.html b/test/lhs-test.html
+index 8153a4b..6bde36c 100644
+--- a/test/lhs-test.html
++++ b/test/lhs-test.html
+@@ -75,9 +75,6 @@
+     code span.vs { color: #4070a0; } /* VerbatimString */
+     code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } 
/* Warning */
+   </style>
+-  <!--[if lt IE 9]>
+-    <script 
src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
+-  <![endif]-->
+ </head>
+ <body>
+ <h1 id="lhs-test">lhs test</h1>
+diff --git a/test/lhs-test.html+lhs b/test/lhs-test.html+lhs
+index 0ada9fa..e86bfd6 100644
+--- a/test/lhs-test.html+lhs
++++ b/test/lhs-test.html+lhs
+@@ -75,9 +75,6 @@
+     code span.vs { color: #4070a0; } /* VerbatimString */
+     code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } 
/* Warning */
+   </style>
+-  <!--[if lt IE 9]>
+-    <script 
src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
+-  <![endif]-->
+ </head>
+ <body>
+ <h1 id="lhs-test">lhs test</h1>
+diff --git a/test/s5-fancy.html b/test/s5-fancy.html
+index 7f5350b..23892c8 100644
+--- a/test/s5-fancy.html
++++ b/test/s5-fancy.html
+@@ -28,7 +28,7 @@
+   <link rel="stylesheet" href="s5/default/opera.css" type="text/css" media="projection" id="operaFix" />
+   <!-- S5 JS -->
+   <script src="s5/default/slides.js" type="text/javascript"></script>
+-  <script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js" type="text/javascript"></script>
++  <script src="/usr/share/javascript/mathjax/tex-mml-chtml.js" type="text/javascript"></script>
+ </head>
+ <body>
+ <div class="layout">
+diff --git a/test/writer.html5 b/test/writer.html5
+index 431503b..859f8b9 100644
+--- a/test/writer.html5
++++ b/test/writer.html5
+@@ -16,9 +16,6 @@
+     div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
+     ul.task-list{list-style: none;}
+   </style>
+-  <!--[if lt IE 9]>
+-    <script 
src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
+-  <![endif]-->
+ </head>
+ <body>
+ <header id="title-block-header">
diff -Nru pandoc-2.9.2.1/debian/patches/Adjust-tests.patch pandoc-2.9.2.1/debian/patches/Adjust-tests.patch
--- pandoc-2.9.2.1/debian/patches/Adjust-tests.patch    1970-01-01 01:00:00.000000000 +0100
+++ pandoc-2.9.2.1/debian/patches/Adjust-tests.patch    2023-07-21 19:59:53.000000000 +0200
@@ -0,0 +1,133 @@
+From: John MacFarlane <j...@berkeley.edu>
+Date: Tue, 12 May 2020 13:51:48 -0700
+Subject: Use latest skylighting.
+
+This adds `aria-hidden="true"` to the empty a elements, which
+helps people who use screen readers.
+
+Origin: https://github.com/jgm/pandoc/commit/8d0c124e5f76af6aa08acf9b0c526822f65e232b
+Origin: https://github.com/jgm/pandoc/commit/112e98def6baf3433e99fbaa3e7280cad16f5422
+---
+ pandoc.cabal           | 4 ++--
+ stack.yaml             | 4 ++--
+ test/command/5627.md   | 8 ++++----
+ test/command/5650.md   | 8 ++++----
+ test/lhs-test.html     | 6 +++---
+ test/lhs-test.html+lhs | 6 +++---
+ 6 files changed, 18 insertions(+), 18 deletions(-)
+
+diff --git a/pandoc.cabal b/pandoc.cabal
+index f6c03a0..a8b9f8e 100644
+--- a/pandoc.cabal
++++ b/pandoc.cabal
+@@ -404,8 +404,8 @@ library
+                  tagsoup >= 0.14.6 && < 0.15,
+                  base64-bytestring >= 0.1 && < 1.1,
+                  zlib >= 0.5 && < 0.7,
+-                 skylighting >= 0.8.3.2 && < 0.9,
+-                 skylighting-core >= 0.8.3.2 && < 0.9,
++                 skylighting >= 0.8.5 && < 0.9,
++                 skylighting-core >= 0.8.5 && < 0.9,
+                  data-default >= 0.4 && < 0.8,
+                  temporary >= 1.1 && < 1.4,
+                  blaze-html >= 0.9 && < 0.10,
+diff --git a/stack.yaml b/stack.yaml
+index b56beb7..92a80f7 100644
+--- a/stack.yaml
++++ b/stack.yaml
+@@ -15,8 +15,8 @@ extra-deps:
+ - pandoc-types-1.20
+ - texmath-0.12.0.1
+ - haddock-library-1.8.0
+-- skylighting-0.8.3.2
+-- skylighting-core-0.8.3.2
++- skylighting-0.8.5
++- skylighting-core-0.8.5
+ - regex-pcre-builtin-0.95.0.8.8.35
+ - doclayout-0.3
+ - emojis-0.1
+diff --git a/test/command/5627.md b/test/command/5627.md
+index 0f67a08..41fec00 100644
+--- a/test/command/5627.md
++++ b/test/command/5627.md
+@@ -20,8 +20,8 @@ Something
+ <li>Two <code>--&gt;something&lt;!--</code></li>
+ <li>Three</li>
+ </ol>
+-<div class="sourceCode" id="cb1"><pre class="sourceCode html"><code 
class="sourceCode html"><span id="cb1-1"><a href="#cb1-1"></a>--&gt;<span 
class="co">&lt;!--&lt;script&gt;alert(&#39;Escaped!&#39;)&lt;/script&gt;</span></span></code></pre></div>
+-<div class="sourceCode" id="cb2"><pre class="sourceCode html"><code 
class="sourceCode html"><span id="cb2-1"><a 
href="#cb2-1"></a>Something</span></code></pre></div>
++<div class="sourceCode" id="cb1"><pre class="sourceCode html"><code 
class="sourceCode html"><span id="cb1-1"><a href="#cb1-1" 
aria-hidden="true"></a>--&gt;<span 
class="co">&lt;!--&lt;script&gt;alert(&#39;Escaped!&#39;)&lt;/script&gt;</span></span></code></pre></div>
++<div class="sourceCode" id="cb2"><pre class="sourceCode html"><code 
class="sourceCode html"><span id="cb2-1"><a href="#cb2-1" 
aria-hidden="true"></a>Something</span></code></pre></div>
+ ```
+ 
+ ```
+@@ -46,8 +46,8 @@ Something
+ <li><code>--&gt;something&lt;!--</code></li>
+ <li>bye <code>--&gt;something else&lt;!--</code></li>
+ </ul>
+-<div class="sourceCode" id="cb1"><pre class="sourceCode html"><code 
class="sourceCode html"><span id="cb1-1"><a href="#cb1-1"></a>--&gt;<span 
class="co">&lt;!--&lt;script&gt;alert(&#39;Escaped!&#39;)&lt;/script&gt;</span></span></code></pre></div>
+-<div class="sourceCode" id="cb2"><pre class="sourceCode html"><code 
class="sourceCode html"><span id="cb2-1"><a 
href="#cb2-1"></a>Something</span></code></pre></div>
++<div class="sourceCode" id="cb1"><pre class="sourceCode html"><code 
class="sourceCode html"><span id="cb1-1"><a href="#cb1-1" 
aria-hidden="true"></a>--&gt;<span 
class="co">&lt;!--&lt;script&gt;alert(&#39;Escaped!&#39;)&lt;/script&gt;</span></span></code></pre></div>
++<div class="sourceCode" id="cb2"><pre class="sourceCode html"><code 
class="sourceCode html"><span id="cb2-1"><a href="#cb2-1" 
aria-hidden="true"></a>Something</span></code></pre></div>
+ ```
+ 
+ ```
+diff --git a/test/command/5650.md b/test/command/5650.md
+index 0618f41..2f36c60 100644
+--- a/test/command/5650.md
++++ b/test/command/5650.md
+@@ -5,8 +5,8 @@ a
+ b
+ ```
+ ^D
+-<div class="sourceCode" id="foocb1"><pre class="sourceCode haskell"><code 
class="sourceCode haskell"><span id="foocb1-1"><a href="#foocb1-1"></a>a</span>
+-<span id="foocb1-2"><a href="#foocb1-2"></a>b</span></code></pre></div>
++<div class="sourceCode" id="foocb1"><pre class="sourceCode haskell"><code 
class="sourceCode haskell"><span id="foocb1-1"><a href="#foocb1-1" 
aria-hidden="true"></a>a</span>
++<span id="foocb1-2"><a href="#foocb1-2" 
aria-hidden="true"></a>b</span></code></pre></div>
+ ````
+ 
+ ````
+@@ -16,7 +16,7 @@ a
+ b
+ ```
+ ^D
+-<div class="sourceCode" id="foobar"><pre class="sourceCode haskell"><code 
class="sourceCode haskell"><span id="foobar-1"><a href="#foobar-1"></a>a</span>
+-<span id="foobar-2"><a href="#foobar-2"></a>b</span></code></pre></div>
++<div class="sourceCode" id="foobar"><pre class="sourceCode haskell"><code 
class="sourceCode haskell"><span id="foobar-1"><a href="#foobar-1" 
aria-hidden="true"></a>a</span>
++<span id="foobar-2"><a href="#foobar-2" 
aria-hidden="true"></a>b</span></code></pre></div>
+ ````
+ 
+diff --git a/test/lhs-test.html b/test/lhs-test.html
+index 6bde36c..43ff13e 100644
+--- a/test/lhs-test.html
++++ b/test/lhs-test.html
+@@ -80,9 +80,9 @@
+ <h1 id="lhs-test">lhs test</h1>
+ <p><code>unsplit</code> is an arrow that takes a pair of values and combines 
them to
+ return a single value:</p>
+-<div class="sourceCode" id="cb1"><pre class="sourceCode haskell 
literate"><code class="sourceCode haskell"><span id="cb1-1"><a 
href="#cb1-1"></a><span class="ot">unsplit ::</span> (<span 
class="dt">Arrow</span> a) <span class="ot">=&gt;</span> (b <span 
class="ot">-&gt;</span> c <span class="ot">-&gt;</span> d) <span 
class="ot">-&gt;</span> a (b, c) d</span>
+-<span id="cb1-2"><a href="#cb1-2"></a>unsplit <span class="ot">=</span> arr 
<span class="op">.</span> <span class="fu">uncurry</span></span>
+-<span id="cb1-3"><a href="#cb1-3"></a>          <span class="co">-- arr (\op 
(x,y) -&gt; x `op` y)</span></span></code></pre></div>
++<div class="sourceCode" id="cb1"><pre class="sourceCode haskell 
literate"><code class="sourceCode haskell"><span id="cb1-1"><a href="#cb1-1" 
aria-hidden="true"></a><span class="ot">unsplit ::</span> (<span 
class="dt">Arrow</span> a) <span class="ot">=&gt;</span> (b <span 
class="ot">-&gt;</span> c <span class="ot">-&gt;</span> d) <span 
class="ot">-&gt;</span> a (b, c) d</span>
++<span id="cb1-2"><a href="#cb1-2" aria-hidden="true"></a>unsplit <span 
class="ot">=</span> arr <span class="op">.</span> <span 
class="fu">uncurry</span></span>
++<span id="cb1-3"><a href="#cb1-3" aria-hidden="true"></a>          <span 
class="co">-- arr (\op (x,y) -&gt; x `op` y)</span></span></code></pre></div>
+ <p><code>(***)</code> combines two arrows into a new arrow by running the two 
arrows on a
+ pair of values (one arrow on the first item of the pair and one arrow on the
+ second item of the pair).</p>
+diff --git a/test/lhs-test.html+lhs b/test/lhs-test.html+lhs
+index e86bfd6..8cdf3f2 100644
+--- a/test/lhs-test.html+lhs
++++ b/test/lhs-test.html+lhs
+@@ -80,9 +80,9 @@
+ <h1 id="lhs-test">lhs test</h1>
+ <p><code>unsplit</code> is an arrow that takes a pair of values and combines 
them to
+ return a single value:</p>
+-<div class="sourceCode" id="cb1"><pre class="sourceCode literatehaskell 
literate"><code class="sourceCode literatehaskell"><span id="cb1-1"><a 
href="#cb1-1"></a><span class="ot">&gt; unsplit ::</span> (<span 
class="dt">Arrow</span> a) <span class="ot">=&gt;</span> (b <span 
class="ot">-&gt;</span> c <span class="ot">-&gt;</span> d) <span 
class="ot">-&gt;</span> a (b, c) d</span>
+-<span id="cb1-2"><a href="#cb1-2"></a><span class="ot">&gt;</span> unsplit 
<span class="ot">=</span> arr <span class="op">.</span> <span 
class="fu">uncurry</span></span>
+-<span id="cb1-3"><a href="#cb1-3"></a><span class="ot">&gt;</span>           
<span class="co">-- arr (\op (x,y) -&gt; x `op` 
y)</span></span></code></pre></div>
++<div class="sourceCode" id="cb1"><pre class="sourceCode literatehaskell 
literate"><code class="sourceCode literatehaskell"><span id="cb1-1"><a 
href="#cb1-1" aria-hidden="true"></a><span class="ot">&gt; unsplit ::</span> 
(<span class="dt">Arrow</span> a) <span class="ot">=&gt;</span> (b <span 
class="ot">-&gt;</span> c <span class="ot">-&gt;</span> d) <span 
class="ot">-&gt;</span> a (b, c) d</span>
++<span id="cb1-2"><a href="#cb1-2" aria-hidden="true"></a><span 
class="ot">&gt;</span> unsplit <span class="ot">=</span> arr <span 
class="op">.</span> <span class="fu">uncurry</span></span>
++<span id="cb1-3"><a href="#cb1-3" aria-hidden="true"></a><span 
class="ot">&gt;</span>           <span class="co">-- arr (\op (x,y) -&gt; x 
`op` y)</span></span></code></pre></div>
+ <p><code>(***)</code> combines two arrows into a new arrow by running the two 
arrows on a
+ pair of values (one arrow on the first item of the pair and one arrow on the
+ second item of the pair).</p>
diff -Nru pandoc-2.9.2.1/debian/patches/CVE-2023-35936.patch pandoc-2.9.2.1/debian/patches/CVE-2023-35936.patch
--- pandoc-2.9.2.1/debian/patches/CVE-2023-35936.patch  1970-01-01 01:00:00.000000000 +0100
+++ pandoc-2.9.2.1/debian/patches/CVE-2023-35936.patch  2023-07-21 19:59:53.000000000 +0200
@@ -0,0 +1,144 @@
+From: John MacFarlane <j...@berkeley.edu>
+Date: Tue, 20 Jun 2023 13:50:13 -0700
+Subject: Fix a security vulnerability in MediaBag and
+ T.P.Class.IO.writeMedia.
+
+This vulnerability, discovered by Entroy C, allows users to write
+arbitrary files to any location by feeding pandoc a specially crafted
+URL in an image element.  The vulnerability is serious for anyone
+using pandoc to process untrusted input.
+
+Origin: https://github.com/jgm/pandoc/commit/5e381e3878b5da87ee7542f7e51c3c1a7fd84b89
+Origin: https://github.com/jgm/pandoc/commit/54561e9a6667b36a8452b01d2def9e3642013dd6
+Origin: https://github.com/jgm/pandoc/commit/df4f13b262f7be5863042f8a5a1c365282c81f07
+Origin: https://github.com/jgm/pandoc/commit/fe62da61dfd33e6b4c0c03895c528a47a0405bf7
+Origin: https://github.com/jgm/pandoc/commit/5246f02f0bb9c176a6d2f6e3d0c03407d8a67445
+Bug: https://github.com/jgm/pandoc/security/advisories/GHSA-xj5q-fv23-575g
+Bug-Debian: https://security-tracker.debian.org/tracker/CVE-2023-35936
+---
+ pandoc.cabal                         |  1 +
+ src/Text/Pandoc/Class/PandocIO.hs    | 12 ++++++------
+ src/Text/Pandoc/Class/PandocMonad.hs |  2 +-
+ test/Tests/MediaBag.hs               | 37 ++++++++++++++++++++++++++++++++++++
+ test/test-pandoc.hs                  |  2 ++
+ 5 files changed, 47 insertions(+), 7 deletions(-)
+ create mode 100644 test/Tests/MediaBag.hs
+
+diff --git a/pandoc.cabal b/pandoc.cabal
+index a8b9f8e..04ab218 100644
+--- a/pandoc.cabal
++++ b/pandoc.cabal
+@@ -762,6 +762,7 @@ test-suite test-pandoc
+                   Tests.Lua
+                   Tests.Lua.Module
+                   Tests.Shared
++                  Tests.MediaBag
+                   Tests.Readers.LaTeX
+                   Tests.Readers.HTML
+                   Tests.Readers.JATS
+diff --git a/src/Text/Pandoc/Class/PandocIO.hs b/src/Text/Pandoc/Class/PandocIO.hs
+index 1cbfd68..0472816 100644
+--- a/src/Text/Pandoc/Class/PandocIO.hs
++++ b/src/Text/Pandoc/Class/PandocIO.hs
+@@ -57,7 +57,7 @@ import Network.HTTP.Client.Internal (addProxy)
+ import Network.HTTP.Client.TLS (tlsManagerSettings)
+ import Network.HTTP.Types.Header ( hContentType )
+ import Network.Socket (withSocketsDo)
+-import Network.URI ( unEscapeString )
++import Network.URI (URI(..), parseURI, unEscapeString)
+ import Prelude
+ import System.Directory (createDirectoryIfMissing)
+ import System.Environment (getEnv)
+@@ -131,11 +131,11 @@ instance PandocMonad PandocIO where
+   newUniqueHash = hashUnique <$> liftIO IO.newUnique
+ 
+   openURL u
+-   | Just u'' <- T.stripPrefix "data:" u = do
+-       let mime     = T.takeWhile (/=',') u''
+-       let contents = UTF8.fromString $
+-                       unEscapeString $ T.unpack $ T.drop 1 $ T.dropWhile (/=',') u''
+-       return (decodeLenient contents, Just mime)
++   | Just (URI{ uriScheme = "data:",
++                uriPath = upath }) <- parseURI (T.unpack u) = do
++       let (mime, rest) = break (== ',') $ unEscapeString upath
++       let contents = UTF8.fromString $ drop 1 rest
++       return (decodeLenient contents, Just (T.pack mime))
+    | otherwise = do
+        let toReqHeader (n, v) = (CI.mk (UTF8.fromText n), UTF8.fromText v)
+        customHeaders <- map toReqHeader <$> getsCommonState stRequestHeaders
+diff --git a/src/Text/Pandoc/Class/PandocMonad.hs b/src/Text/Pandoc/Class/PandocMonad.hs
+index 8229668..eb6bedd 100644
+--- a/src/Text/Pandoc/Class/PandocMonad.hs
++++ b/src/Text/Pandoc/Class/PandocMonad.hs
+@@ -612,7 +612,7 @@ fetchMediaResource :: PandocMonad m
+               => T.Text -> m (FilePath, Maybe MimeType, BL.ByteString)
+ fetchMediaResource src = do
+   (bs, mt) <- downloadOrRead src
+-  let ext = fromMaybe (T.pack $ takeExtension $ T.unpack src)
++  let ext = fromMaybe (T.pack $ takeExtension $ unEscapeString $ T.unpack src)
+                       (mt >>= extensionFromMimeType)
+   let bs' = BL.fromChunks [bs]
+   let basename = showDigest $ sha1 bs'
+diff --git a/test/Tests/MediaBag.hs b/test/Tests/MediaBag.hs
+new file mode 100644
+index 0000000..8a57337
+--- /dev/null
++++ b/test/Tests/MediaBag.hs
+@@ -0,0 +1,37 @@
++{-# LANGUAGE OverloadedStrings #-}
++module Tests.MediaBag (tests) where
++
++import Test.Tasty
++import Test.Tasty.HUnit
++-- import Tests.Helpers
++import Text.Pandoc.Class (extractMedia, fillMediaBag, runIOorExplode)
++import System.IO.Temp (withTempDirectory)
++import Text.Pandoc.Shared (inDirectory)
++import System.FilePath
++import Text.Pandoc.Builder as B
++import System.Directory (doesFileExist, copyFile)
++
++tests :: [TestTree]
++tests = [
++  testCase "test fillMediaBag & extractMedia" $
++      withTempDirectory "." "extractMediaTest" $ \tmpdir -> inDirectory tmpdir $ do
++        copyFile "../../test/bodybg.gif" "bodybg.gif"
++        let d = B.doc $
++                  B.para (B.image "../../test/lalune.jpg" "" mempty) <>
++                  B.para (B.image "bodybg.gif" "" mempty) <>
++                  B.para (B.image "data://image/png;base64,cHJpbnQgImhlbGxvIgo=;.lua+%2f%2e%2e%2f%2e%2e%2fa%2elua" "" mempty) <>
++                  B.para (B.image 
""
 "" mempty)
++        runIOorExplode $ do
++          fillMediaBag d
++          extractMedia "foo" d
++        exists1 <- doesFileExist ("foo" </> "278e30c6961bc3e263c638fb15e114d35290db05.gif")
++        assertBool "file in directory is not extracted with hashed name" exists1
++        exists2 <- doesFileExist ("foo" </> "f9d88c3dbe18f6a7f5670e994a947d51216cdf0e.jpg")
++        assertBool "file above directory is not extracted with hashed name" exists2
++        exists3 <- doesFileExist ("foo" </> "2a0eaa89f43fada3e6c577beea4f2f8f53ab6a1d.lua")
++        exists4 <- doesFileExist "a.lua"
++        assertBool "data uri with malicious payload gets written outside of destination dir"
++          (exists3 && not exists4)
++        exists5 <- doesFileExist ("foo" </> "d5fceb6532643d0d84ffe09c40c481ecdf59e15a.gif")
++        assertBool "data uri with gif is not properly decoded" exists5
++  ]
+diff --git a/test/test-pandoc.hs b/test/test-pandoc.hs
+index 9d64b61..b1804dd 100644
+--- a/test/test-pandoc.hs
++++ b/test/test-pandoc.hs
+@@ -44,6 +44,7 @@ import qualified Tests.Writers.Powerpoint
+ import qualified Tests.Writers.RST
+ import qualified Tests.Writers.TEI
+ import Tests.Helpers (findPandoc)
++import qualified Tests.MediaBag
+ import Text.Pandoc.Shared (inDirectory)
+ 
+ tests :: FilePath -> TestTree
+@@ -51,6 +52,7 @@ tests pandocPath = testGroup "pandoc tests"
+         [ Tests.Command.tests pandocPath
+         , testGroup "Old" (Tests.Old.tests pandocPath)
+         , testGroup "Shared" Tests.Shared.tests
++        , testGroup "MediaBag" Tests.MediaBag.tests
+         , testGroup "Writers"
+           [ testGroup "Native" Tests.Writers.Native.tests
+           , testGroup "ConTeXt" Tests.Writers.ConTeXt.tests
diff -Nru pandoc-2.9.2.1/debian/patches/CVE-2023-38745.patch pandoc-2.9.2.1/debian/patches/CVE-2023-38745.patch
--- pandoc-2.9.2.1/debian/patches/CVE-2023-38745.patch  1970-01-01 01:00:00.000000000 +0100
+++ pandoc-2.9.2.1/debian/patches/CVE-2023-38745.patch  2023-07-21 19:59:53.000000000 +0200
@@ -0,0 +1,92 @@
+From: John MacFarlane <j...@berkeley.edu>
+Date: Thu, 20 Jul 2023 09:26:38 -0700
+Subject: Fix new variant of the vulnerability in CVE-2023-35936.
+
+Guilhem Moulin noticed that the fix to CVE-2023-35936 was incomplete.
+An attacker could get around it by double-encoding the malicious
+extension to create or override arbitrary files.
+
+    $ echo '![](data://image/png;base64,cHJpbnQgImhlbGxvIgo=;.lua+%252f%252e%252e%252f%252e%252e%252fb%252elua)' >b.md
+    $ .cabal/bin/pandoc b.md --extract-media=bar
+    <p><img
+    src="bar/2a0eaa89f43fada3e6c577beea4f2f8f53ab6a1d.lua+%2f%2e%2e%2f%2e%2e%2fb%2elua" /></p>
+    $ cat b.lua
+    print "hello"
+    $ find bar
+    bar/
+    bar/2a0eaa89f43fada3e6c577beea4f2f8f53ab6a1d.lua+
+
+This commit adds a test case for this more complex attack and fixes
+the vulnerability.
+
+Origin: https://github.com/jgm/pandoc/commit/eddedbfc14916aa06fc01ff04b38aeb30ae2e625
+Bug-Debian: https://security-tracker.debian.org/tracker/CVE-2023-38745
+---
+ src/Text/Pandoc/Class/PandocMonad.hs | 10 +++++-----
+ test/Tests/MediaBag.hs               | 12 +++++++++++-
+ 2 files changed, 16 insertions(+), 6 deletions(-)
+
+diff --git a/src/Text/Pandoc/Class/PandocMonad.hs b/src/Text/Pandoc/Class/PandocMonad.hs
+index eb6bedd..13fd7c2 100644
+--- a/src/Text/Pandoc/Class/PandocMonad.hs
++++ b/src/Text/Pandoc/Class/PandocMonad.hs
+@@ -56,14 +56,13 @@ import Codec.Archive.Zip
+ import Control.Monad.Except (MonadError (catchError, throwError),
+                              MonadTrans, lift, when)
+ import Data.Digest.Pure.SHA (sha1, showDigest)
+-import Data.Maybe (fromMaybe)
+ import Data.Time (UTCTime)
+ import Data.Time.Clock.POSIX (POSIXTime, utcTimeToPOSIXSeconds)
+ import Data.Time.LocalTime (TimeZone, ZonedTime, utcToZonedTime)
+ import Network.URI ( escapeURIString, nonStrictRelativeTo,
+                      unEscapeString, parseURIReference, isAllowedInURI,
+                      parseURI, URI(..) )
+-import System.FilePath ((</>), (<.>), takeExtension, dropExtension,
++import System.FilePath ((</>), takeExtension, dropExtension,
+                         isRelative, splitDirectories)
+ import System.Random (StdGen)
+ import Text.Pandoc.BCP47 (Lang(..), parseBCP47, renderLang)
+@@ -612,11 +611,12 @@ fetchMediaResource :: PandocMonad m
+               => T.Text -> m (FilePath, Maybe MimeType, BL.ByteString)
+ fetchMediaResource src = do
+   (bs, mt) <- downloadOrRead src
+-  let ext = fromMaybe (T.pack $ takeExtension $ unEscapeString $ T.unpack src)
+-                      (mt >>= extensionFromMimeType)
++  let ext = case (takeExtension $ unEscapeString $ T.unpack src) of
++              '.':e | '%' `notElem` e -> '.':e
++              _ -> maybe "" (\x -> '.':T.unpack x) (mt >>= extensionFromMimeType)
+   let bs' = BL.fromChunks [bs]
+   let basename = showDigest $ sha1 bs'
+-  let fname = basename <.> T.unpack ext
++  let fname = basename <> ext
+   return (fname, mt, bs')
+ 
+ -- | Traverse tree, filling media bag for any images that
+diff --git a/test/Tests/MediaBag.hs b/test/Tests/MediaBag.hs
+index 8a57337..ffba89f 100644
+--- a/test/Tests/MediaBag.hs
++++ b/test/Tests/MediaBag.hs
+@@ -19,7 +19,7 @@ tests = [
+         let d = B.doc $
+                   B.para (B.image "../../test/lalune.jpg" "" mempty) <>
+                   B.para (B.image "bodybg.gif" "" mempty) <>
+-                  B.para (B.image 
"data://image/png;base64,cHJpbnQgImhlbGxvIgo=;.lua+%2f%2e%2e%2f%2e%2e%2fa%2elua"
 "" mempty) <>
++                  B.para (B.image 
";.lua+%2f%2e%2e%2f%2e%2e%2fa%2elua" 
"" mempty) <>
+                   B.para (B.image 
""
 "" mempty)
+         runIOorExplode $ do
+           fillMediaBag d
+@@ -34,4 +34,14 @@ tests = [
+           (exists3 && not exists4)
+         exists5 <- doesFileExist ("foo" </> 
"d5fceb6532643d0d84ffe09c40c481ecdf59e15a.gif")
+         assertBool "data uri with gif is not properly decoded" exists5
++        -- double-encoded version:
++        let e = B.doc $
++                  B.para (B.image 
";.lua+%252f%252e%252e%252f%252e%252e%252fb%252elua"
 "" mempty)
++        runIOorExplode $ do
++          fillMediaBag e
++          extractMedia "bar" e
++        exists6 <- doesFileExist ("bar" </> 
"772ceca21a2751863ec46cb23db0e7fc35b9cff8.png")
++        exists7 <- doesFileExist "b.lua"
++        assertBool "data uri with double-encoded malicious payload gets 
written outside of destination dir"
++          (exists6 && not exists7)
+   ]
diff -Nru pandoc-2.9.2.1/debian/patches/series pandoc-2.9.2.1/debian/patches/series
--- pandoc-2.9.2.1/debian/patches/series        2020-08-23 10:15:20.000000000 +0200
+++ pandoc-2.9.2.1/debian/patches/series        2023-07-21 19:59:53.000000000 +0200
@@ -1,3 +1,6 @@
 020200417~a9ef15b.patch
 2001_templates_avoid_privacy_breach.patch
 2002_program_package_hint.patch
+Adjust-tests.patch
+CVE-2023-35936.patch
+CVE-2023-38745.patch
diff -Nru pandoc-2.9.2.1/debian/rules pandoc-2.9.2.1/debian/rules
--- pandoc-2.9.2.1/debian/rules 2020-08-23 09:35:10.000000000 +0200
+++ pandoc-2.9.2.1/debian/rules 2023-07-21 19:59:53.000000000 +0200
@@ -1,5 +1,6 @@
 #!/usr/bin/make -f
 
+export DEB_ENABLE_TESTS = yes
 include /usr/share/cdbs/1/rules/debhelper.mk
 -include /usr/share/cdbs/1/class/hlibrary.mk
 
@@ -182,7 +183,6 @@
 DEB_SETUP_GHC_CONFIGURE_ARGS += --ghc-options="-optc--param -optcggc-min-expand=10 -O0"
 endif
 
-DEB_ENABLE_TESTS = yes
 DEB_SETUP_GHC_CONFIGURE_ARGS += $(if $(filter nocheck,$(DEB_BUILD_OPTIONS)),,-ftests)
 
 DEB_INSTALL_DOCS_ALL += README.md
diff -Nru pandoc-2.9.2.1/debian/salsa-ci.yml pandoc-2.9.2.1/debian/salsa-ci.yml
--- pandoc-2.9.2.1/debian/salsa-ci.yml  1970-01-01 01:00:00.000000000 +0100
+++ pandoc-2.9.2.1/debian/salsa-ci.yml  2023-07-21 19:59:53.000000000 +0200
@@ -0,0 +1,8 @@
+---
+include:
+  - https://salsa.debian.org/salsa-ci-team/pipeline/raw/master/recipes/debian.yml
+
+variables:
+  RELEASE: 'bullseye'
+  SALSA_CI_DISABLE_REPROTEST: 1
+  SALSA_CI_DISABLE_LINTIAN: 1
