Hi all,

I'm a first-time contributor to NumPy and would appreciate community feedback 
on a proposed small API addition, currently in PR #29294:

TL;DR:
`np.savez_compressed` today always uses `zipfile.ZIP_DEFLATED` at the default 
level. The PR lets users choose any compression method and level that 
`zipfile.ZipFile` supports, in a backwards-compatible way.

### What would change?

A new optional keyword argument:

```python
np.savez_compressed(
    "data.npz",
    a=array0,
    b=array1,
    zipfile_kwargs={"compression": "lzma", "compresslevel": 9},
)
```

* `zipfile_kwargs` (default `None`) is forwarded to `zipfile.ZipFile`. The only 
processing NumPy does is to map the human-friendly aliases `"stored"`, 
`"deflated"`, `"bzip2"` and `"lzma"` (case-insensitive) to the corresponding 
`zipfile` constants.
* If `zipfile_kwargs` is not used, behavior remains identical to current 
`np.savez_compressed`.
* No new top-level keywords like `compression=` are added, so existing code 
like `np.savez_compressed(file, compression=my_array)` remains valid.
* Tests are included in the new class `TestSavezCompressed`.
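
As a rough illustration of the alias handling described above (this is not the code in the PR; the helper name `resolve_zipfile_kwargs`, the `_ALIASES` table, and the fallback to deflate when no method is given are all my assumptions), the mapping could look something like this:

```python
import zipfile

# Hypothetical alias table covering the four names listed above.
_ALIASES = {
    "stored": zipfile.ZIP_STORED,
    "deflated": zipfile.ZIP_DEFLATED,
    "bzip2": zipfile.ZIP_BZIP2,
    "lzma": zipfile.ZIP_LZMA,
}


def resolve_zipfile_kwargs(zipfile_kwargs=None):
    """Return keyword arguments for zipfile.ZipFile (illustrative only)."""
    # Copy so the caller's dict is never mutated.
    resolved = dict(zipfile_kwargs or {})
    # Assumption: without an explicit method, keep today's deflate default.
    method = resolved.setdefault("compression", zipfile.ZIP_DEFLATED)
    if isinstance(method, str):
        try:
            # Aliases are matched case-insensitively.
            resolved["compression"] = _ALIASES[method.lower()]
        except KeyError:
            raise ValueError(f"unknown compression alias: {method!r}") from None
    return resolved


# Everything except the alias is passed through untouched.
print(resolve_zipfile_kwargs({"compression": "BZIP2", "compresslevel": 9}))
```

One detail from the `zipfile` documentation worth keeping in mind: `compresslevel` only has an effect with `ZIP_DEFLATED` (0 to 9) and `ZIP_BZIP2` (1 to 9); it is ignored for `ZIP_STORED` and `ZIP_LZMA`.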

### Why this addition?

There have been recurring requests for:

* Controlling deflate level,
* Using `bzip2` or `lzma` in `.npz` files,
* Improving compression ratio for large arrays.

Without this, users must manually rewrite `.npz` archives, which is inconvenient 
and inefficient because every array is compressed twice.
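
For reference, the manual workaround looks roughly like the sketch below. It uses only the standard library; the function name `recompress_npz` is hypothetical, and I am assuming the source file is an ordinary `.npz` written by `np.savez` or `np.savez_compressed`.

```python
import zipfile


def recompress_npz(src, dst, compression=zipfile.ZIP_BZIP2, compresslevel=None):
    """Copy every member of the archive `src` into `dst` using a new method.

    `src` and `dst` may be paths or binary file objects.
    """
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(
        dst, "w", compression=compression, compresslevel=compresslevel
    ) as zout:
        for info in zin.infolist():
            # Each member is fully decompressed into memory and then
            # compressed again, which is the cost the proposal avoids.
            zout.writestr(info.filename, zin.read(info.filename))
```

Since the member names (`a.npy`, `b.npy`, ...) are preserved, the rewritten archive should remain loadable with `np.load`, which reads `.npz` files through `zipfile`.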

By forwarding `zipfile.ZipFile` kwargs, users can leverage all its options 
cleanly, including any future compression methods added to the standard library.


### Risk and Compatibility

* No changes to default behavior.
* Single new reserved keyword, `zipfile_kwargs`, to minimize potential key 
collisions.
* Python version requirements respected: the PR only targets Python ≥3.11.
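
To make the key-collision point concrete, here is a toy stand-in for the signature (the function `split_arguments` is hypothetical and only mimics how keywords would be routed; it is not NumPy's implementation):

```python
def split_arguments(file, *args, zipfile_kwargs=None, **kwds):
    """Show which keywords would become array names and which are reserved."""
    return {"arrays": sorted(kwds), "zipfile_kwargs": zipfile_kwargs}


# `compression=` is still collected as an array name, as it is today.
routed = split_arguments("data.npz", compression=[1, 2, 3])
print(routed)

# Only the single reserved name is intercepted.
reserved = split_arguments(
    "data.npz", a=[0], zipfile_kwargs={"compression": "bzip2"}
)
print(reserved)
```

The flip side is that `zipfile_kwargs` itself can no longer be used as an array name, which is why a deliberately unusual keyword seems preferable to a short one.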

### Open Question

Is `zipfile_kwargs` an acceptable keyword name? It makes intent clear, but 
alternative suggestions are welcome.

### Call for Feedback

I'd appreciate hearing from maintainers and contributors:

* Is there support for exposing this functionality?
* Any objections to the proposed API shape?

Thank you for your time and guidance!

Sajjad Ali
PR link: https://github.com/numpy/numpy/pull/29294