[oss-security] Re: Ghostscript 10.03.1 (2024-05-02) fixed 5 CVEs including CVE-2024-33871 arbitrary code execution

Thomas Rinsma Wed, 03 Jul 2024 08:21:08 -0700

Hi,

Per Solar's request, here is some information on recent Ghostscript bugs. They have all been fixed upstream already for either ~1 month (10.03.1) or ~4 months (10.03.0). It looks like patches have also landed in most distros, but there is not a super clear changelog or version history so this might help clarify things.

Note that this is just a subset of all vulnerabilities fixed in 10.03.0 and 10.03.1: these are just the bugs I myself found and reported.


# CVE-2024-29509 - heap buffer overflow via the PDFPassword parameter

The `runpdf` command (and friends) allows the new C-based PDF interpreter to be invoked from within PS. With this, we can pass various flags and arguments (see `pdf_impl_set_param`) that are normally passed via the command-line when the PDF interpreter is invoked directly.

It turns out that validation of several of these parameters is flawed, maybe because they were considered somewhat "trusted", being command-line arguments originally.

The fields `ctx->encryption.Password` and `ctx->encryption.PasswordLen` are set based on the value of `PDFPassword`. During the decryption process, in `check_password_R5` in `pdf_sec.c`, a buffer is allocated based on the string-length of this field:

```
code = pdfi_object_alloc(ctx, PDF_STRING, strlen(ctx->encryption.Password), (pdf_obj **)&P);
```

However, a `memcpy` later copies the full length of the PS-supplied object into this buffer:


```
memcpy(P->data, Password, PasswordLen);
```

Because PS-strings are not null-terminated, this will result in a heap buffer overflow when a value of `PDFPassword` is supplied with a null byte in the middle. For example, the following will result in a `memcpy` of 7 bytes into a buffer of size 3:


```
/PDFPassword (foo\000bar) def
```
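The length mismatch can be reproduced in isolation. The following is a minimal sketch, not Ghostscript code and not the upstream fix: `ps_string`, `strlen_based_size` and `copy_password` are hypothetical names, standing in for a length-counted PS string, the flawed size computation, and a copy that sizes its buffer from the explicit length instead.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a length-counted PostScript string. */
typedef struct {
    const char *data;
    size_t      len;   /* explicit length; data may contain NUL bytes */
} ps_string;

/* Size that a strlen()-based allocation would use (the flawed choice). */
static size_t strlen_based_size(const ps_string *s)
{
    return strlen(s->data);   /* stops at the first NUL byte */
}

/* Allocate from the explicit length so the copy cannot overflow.
 * Returns NULL on allocation failure; the caller frees the result. */
static unsigned char *copy_password(const ps_string *s)
{
    unsigned char *buf = malloc(s->len ? s->len : 1);
    if (buf != NULL)
        memcpy(buf, s->data, s->len);
    return buf;
}
```

For the 7-byte value `foo\000bar`, `strlen_based_size` yields 3 while `len` is 7, which is exactly the gap the original `memcpy` writes into.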

This bug was fixed in 10.03.0 (2024-03-06), and is bug (1) in this report: https://bugs.ghostscript.com/show_bug.cgi?id=707510



# CVE-2024-29506 - stack buffer overflow in pdfi_apply_filter()

The `PDFDEBUG` flag controls the value of `ctx->args.pdfdebug`. In `pdfi_apply_filter` this enables execution of a `memcpy` into a stack buffer, without bounds checks. The input (`n->data`, the PDF filter name) is an attacker-controlled buffer of arbitrary size. A filter name longer than 100 bytes will overflow the `str` buffer (and at exactly 100 bytes the terminating `'\0'` is already written one byte past its end).


```
if (ctx->args.pdfdebug)
    {
        char str[100];
        memcpy(str, (const char *)n->data, n->length);
        str[n->length] = '\0';
        dmprintf1(ctx->memory, "FILTER NAME:%s\n", str);
    }
```

This bug was also fixed in 10.03.0 (2024-03-06), and is bug (2) in this report: https://bugs.ghostscript.com/show_bug.cgi?id=707510
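For illustration, a bounded version of this kind of debug copy could look like the sketch below. This is an assumption about one reasonable shape, not the patch that went upstream; `copy_name_bounded` is a hypothetical helper name.

```c
#include <stddef.h>
#include <string.h>

/* Copy at most dst_size - 1 bytes of a length-counted name into dst and
 * NUL-terminate it. Returns the number of bytes copied; the name is
 * truncated if it does not fit. dst_size must be at least 1. */
static size_t copy_name_bounded(char *dst, size_t dst_size,
                                const unsigned char *name, size_t name_len)
{
    size_t n = name_len < dst_size - 1 ? name_len : dst_size - 1;

    memcpy(dst, name, n);
    dst[n] = '\0';
    return n;
}
```

Truncating is acceptable here only because the buffer feeds a debug message; a caller that needs the full name would have to allocate or reject instead.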



# CVE-2024-29507 - stack buffer overflow via CIDFSubstPath/Font params

Under specific conditions, the `cidfsubstpath` and `cidfsubstfont` parameters (set by corresponding PostScript objects) are used to load substitute fonts (this is in `pdfi_open_CIDFont_substitute_file`). The values are `memcpy`d into the `fontfname` buffer without bounds checks. Hence, an attacker can pass values larger than the buffer size to trigger a stack buffer overflow.


```
char fontfname[gp_file_name_sizeof]; // 4096

// .. <snip> ...

if (ctx->args.cidfsubstpath.data == NULL) {
    memcpy(fontfname, fsprefix, fsprefixlen);
}
else {
    memcpy(fontfname, ctx->args.cidfsubstpath.data, ctx->args.cidfsubstpath.size);
    fsprefixlen = ctx->args.cidfsubstpath.size;
}

if (ctx->args.cidfsubstfont.data == NULL) {
    // ... <snip> ...
}
else {
    memcpy(fontfname, ctx->args.cidfsubstfont.data, ctx->args.cidfsubstfont.size);
    defcidfallacklen = ctx->args.cidfsubstfont.size;
}
```

This bug was also fixed in 10.03.0 (2024-03-06), and is bug (3) in this report: https://bugs.ghostscript.com/show_bug.cgi?id=707510
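The missing check amounts to comparing the attacker-supplied sizes against the destination size before copying. Below is a generic sketch of a bounds-checked join of two length-counted pieces into a fixed buffer. `join_bounded` is a hypothetical helper, and how the real function is meant to lay out the path and font name in `fontfname` is not something this sketch claims to reproduce.

```c
#include <stddef.h>
#include <string.h>

/* Join a length-counted prefix and name into dst as a NUL-terminated
 * string. Returns 0 on success, -1 if the result plus terminator would
 * not fit in dst_size bytes. */
static int join_bounded(char *dst, size_t dst_size,
                        const char *prefix, size_t prefix_len,
                        const char *name, size_t name_len)
{
    /* Written as subtractions so the size arithmetic cannot wrap around. */
    if (prefix_len >= dst_size || name_len >= dst_size - prefix_len)
        return -1;

    memcpy(dst, prefix, prefix_len);
    memcpy(dst + prefix_len, name, name_len);
    dst[prefix_len + name_len] = '\0';
    return 0;
}
```

Rejecting oversized input, rather than truncating it, seems the safer choice for a file name, since a truncated path could silently refer to a different file.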



# CVE-2024-29508 - heap pointer leak in pdf_base_font_alloc()

The function `pdf_base_font_alloc` used by the `pdfwrite` device will use a hexadecimal pointer representation (`".F" PRI_INTPTR`) for the constructed BaseFont name if the input name is empty:


```
if (pfname->size > 0) {
    font_name.data = pfname->chars;
    font_name.size = pfname->size;
    while (pdf_has_subset_prefix(font_name.data, font_name.size)) {
        /* Strip off an existing subset prefix. */
        font_name.data += SUBSET_PREFIX_SIZE;
        font_name.size -= SUBSET_PREFIX_SIZE;
    }
} else {
    gs_snprintf(fnbuf, sizeof(fnbuf), ".F" PRI_INTPTR, (intptr_t)copied);
    font_name.data = (byte *)fnbuf;
    font_name.size = strlen(fnbuf);
}
```

Resulting in, for example:

```
<</BaseFont/YZKFTQ+.F0x5618b147e378/FontDescriptor 8 0 R/ToUnicode 11 0 R/Type/Font ...
```

An attacker can obtain this pointer value by reading back the output file (after writing to a temporary writable and readable location).

This bug (along with various other pointer leaks) was fixed in 10.03.0 (2024-03-06), and is bug (4) in this report: https://bugs.ghostscript.com/show_bug.cgi?id=707510
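One way to keep such a fallback name unique without embedding an address is to derive it from a counter. The sketch below only illustrates that idea: `fallback_font_name` and its `id` argument are hypothetical, and this is not a description of how upstream changed `pdf_base_font_alloc`.

```c
#include <stdio.h>

/* Build a fallback font name from a caller-maintained counter, so the
 * output file carries no address. Returns the name length, or -1 if the
 * name does not fit in buf_size bytes. */
static int fallback_font_name(char *buf, size_t buf_size, unsigned long id)
{
    int n = snprintf(buf, buf_size, ".F%lu", id);

    return (n < 0 || (size_t)n >= buf_size) ? -1 : n;
}
```

With `id` set to 7 this produces `.F7`, which is as unique as the counter is within one output document.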



# CVE-2024-29511 - arbitrary file read/write through Tesseract config

The `ocr` family of devices invoke Tesseract to perform OCR operations. The device parameter `OCRLanguage` is used by Tesseract to load a data file for that specific language. Specifically, such a file is loaded from `./<OCRLanguage>.traineddata`. By using a path traversal to `/tmp/`, we can force Tesseract to load our own data file:


```
mark
/OutputFile (/tmp/notused)
/OCRLanguage (../../../../../tmp/test) % loads /tmp/test.traineddata
/OutputDevice /ocr
.dicttomark
setpagedevice
```

As it turns out, Tesseract `traineddata` files can include various configuration values, including `user_patterns_file` which will try to load patterns from the given path, and `debug_file` which will write debug information to the given path. The debug information is quite verbose, and will print full input lines if they don't start with a valid character in the trained language. By constructing our "language" such that no character is valid, all lines in the pattern file are printed. For example, the configuration settings:


```
debug_file /tmp/out
user_patterns_file /etc/passwd
```

will result in a file `/tmp/out` containing:

```
Error: failed to insert pattern 'root:x:0:0:root:/root:/bin/bash'
Error: failed to insert pattern 'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin'
Error: failed to insert pattern 'bin:x:2:2:bin:/bin:/usr/sbin/nologin'
Error: failed to insert pattern 'sys:x:3:3:sys:/dev:/usr/sbin/nologin'
Error: failed to insert pattern 'sync:x:4:65534:sync:/bin:/bin/sync'
<etc>
```

In PostScript we can:

1. Construct the traineddata file under `/tmp/`
2. Use path traversal in `OCRLanguage` to load it when initializing the `ocr` device
3. Read the resulting output data in `/tmp/out`

This allows us to read arbitrary files outside of the SAFER sandbox, and write to arbitrary file paths, although during writing, every line will start with `Error: failed to insert pattern '` and end with `'`.

Note that this is the Tesseract/OCR-related bug that was referred to by the Ghostscript changelog (and quoted earlier in this thread). Contrary to what is stated in the changelog it does not lead to RCE by itself, just file read/write. It also requires Ghostscript to be compiled with Tesseract support.
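The traversal works because the language name is used as a path component without validation. A conservative allow-list check could be sketched as follows. `is_plain_language_name` is a hypothetical helper, the accepted character set is an assumption, and the check is deliberately strict: it would also reject any language name that legitimately contains a path separator.

```c
#include <ctype.h>
#include <stddef.h>

/* Return 1 if the length-counted name consists only of letters, digits,
 * '_', '-' and '+', and 0 otherwise. Empty names are rejected, and so is
 * anything containing '/', '\\' or '.', so "../../tmp/test" cannot be
 * used to escape the data directory. */
static int is_plain_language_name(const char *name, size_t len)
{
    if (len == 0)
        return 0;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)name[i];

        if (!(isalnum(c) || c == '_' || c == '-' || c == '+'))
            return 0;
    }
    return 1;
}
```

An allow-list is used rather than a search for `..` because it fails closed on inputs nobody thought of, such as absolute paths or embedded NUL bytes.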



# CVE-2024-29510 - format string injection in uniprint device

The `uniprint` device allows the user to provide various string fragments as device options, which are later appended to the output file. Two of these parameters, `upWriteComponentCommands` and `upYMoveCommand`, are actually treated as format strings, specifically for `gp_fprintf` and `gs_snprintf`. For these, the intention is for the user to include just one format specifier in the string, but there is no logic preventing arbitrary format strings (with multiple specifiers) from being used.

With full control over the format string (by setting a page device with the respective options), and read access to the device output (by setting it to a temporary file path), an attacker can abuse this to leak data from the stack and perform memory corruption. This is especially impactful in the case of `gs_snprintf` (as opposed to `gp_fprintf`), as its format-string parsing logic is not hardened by compiler measures like `_FORTIFY_SOURCE`, while it still supports the `%n` modifier.

Bug report and public blog post with more details and PoC leading to a SAFER sandbox bypass:
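The missing logic is a check that a user-supplied format string contains exactly the one conversion the caller is going to supply an argument for. The sketch below shows such a check under stated assumptions: `has_single_int_conversion` is a hypothetical helper, and the choice of `%d`/`%i` as the only permitted conversion is an assumption made for illustration, not a statement about what the uniprint parameters actually expect or about the upstream fix.

```c
#include <ctype.h>
#include <stddef.h>

static int is_flag(char c)
{
    return c == '-' || c == '+' || c == ' ' || c == '0' || c == '#';
}

/* Return 1 if the length-counted fmt contains exactly one conversion and
 * that conversion is a plain integer one ('d' or 'i' with optional flags
 * and width; no '*', precision or length modifier). "%%" counts as a
 * literal. Anything else, including %n, %s and a trailing '%', fails. */
static int has_single_int_conversion(const char *fmt, size_t len)
{
    int conversions = 0;
    size_t i = 0;

    while (i < len) {
        if (fmt[i] != '%') { i++; continue; }
        i++;                                        /* skip '%' */
        if (i < len && fmt[i] == '%') { i++; continue; }
        while (i < len && is_flag(fmt[i]))
            i++;                                    /* flags */
        while (i < len && isdigit((unsigned char)fmt[i]))
            i++;                                    /* width */
        if (i >= len || (fmt[i] != 'd' && fmt[i] != 'i'))
            return 0;
        i++;
        conversions++;
    }
    return conversions == 1;
}
```

A string such as `%d%n` is rejected at the second specifier, and a run of `%x` specifiers used to dump the stack is rejected at the first.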


https://bugs.ghostscript.com/show_bug.cgi?id=707662
https://codeanlabs.com/blog/research/cve-2024-29510-ghostscript-format-string-exploitation/

---

Cheers,
Thomas
