Hello community,

here is the log from the commit of package openhtj2k for openSUSE:Factory 
checked in at 2026-04-10 17:54:08
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/openhtj2k (Old)
 and      /work/SRC/openSUSE:Factory/.openhtj2k.new.21863 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Package is "openhtj2k"

Fri Apr 10 17:54:08 2026 rev:5 rq:1345772 version:0.9.0

Changes:
--------
--- /work/SRC/openSUSE:Factory/openhtj2k/openhtj2k.changes      2026-04-07 16:51:24.006727006 +0200
+++ /work/SRC/openSUSE:Factory/.openhtj2k.new.21863/openhtj2k.changes   2026-04-10 18:03:25.935647421 +0200
@@ -1,0 +2,88 @@
+Fri Apr 10 08:29:23 UTC 2026 - Michael Vetter <[email protected]>
+
+- Update to 0.9.0:
+  Performance:
+  * HT block decoder: nibble-packed sigma representation for SPP/MRP passes,
+    replacing byte-per-sample storage; halves memory traffic in significance
+    propagation and magnitude refinement
+  * HT block decoder: branchless VLC unstuff eliminates short-circuit branches;
+    CTZ-based SPP/MRP with branchless bit importers halve branch misses
+  * HT block decoder: fused dequantize pass with vectorized kappa computation
+    and specialized 9/7 cascade; removes a separate dequant loop
+  * HT block decoder: vectorize Emax computation in cleanup decode
+  * AVX2: add 4-quad 16-bit decode path for HT block decoder
+  * AVX-512: expand coverage to FDWT vertical lifting and all color transform
+    functions (forward/inverse ICT and RCT)
+  * AVX-512: masked stores for vertical DWT tail columns, eliminating scalar
+    fallback for non-multiple-of-16 widths
+  * NEON: port nibble-packed sigma SPP/MRP, add irrev 5/3 (ATK) IDWT kernels
+    (horizontal and vertical), add 16-bit HT cleanup decode path (pLSB > 16)
+  * WASM: port nibble-packed sigma SPP/MRP to WASM SIMD decoder
+  * Vectorize PPM interleave output (SSE4.1 / NEON) and PGM/PGX byte-packing
+    in decoder CLI
+  * ThreadPool: lock-free get() via atomic index; skip pool overhead entirely
+    in single-threaded mode
+  * ThreadPool: replace std::queue<std::function<void()>> with a fixed-size
+    32 K-slot InlineTask ring buffer with cache-line aligned head/tail; eliminates
+    per-task heap allocation from std::deque chunk management and std::function
+    type-erased storage. 32-byte inline storage with memcpy fast path for
+    trivially-copyable lambdas (all hot-path captures); relocate-trampoline slow
+    path for non-trivial captures
+  * DWT batch path: eliminate per-tile per-level Xext / Yext copy buffer.
+    In-place horizontal DWT now handles all rows (including the first and last
+    row of each tile) by prepending DWT_LEFT_SLACK = 8 floats and appending
+    DWT_RIGHT_SLACK = SIMD_PADDING = 32 floats of border slack to every
+    j2k_subband / j2k_resolution::i_samples allocation; the user-visible
+    pointer is offset by DWT_LEFT_SLACK. Removes a per-row memcpy round-trip
+    for the boundary rows
+  * DWT: replace scalar PSE-fill loops with constant-pattern SIMD reflection
+    (vpermps on AVX2 / AVX-512, vrev64q + vextq on NEON, i32x4_shuffle
+    on WASM SIMD); covers idwt_1d_row_inplace, idwt_1d_sr_inplace, and
+    fdwt_1d_sr_inplace. Drops the PSEo() modulo + branch on the hot path
+  * coding_units: flatten unique_ptr<unique_ptr<T>[]> double-indirection in
+    pband, subbands, precincts, and resolution to single-allocation
+    flat arrays via operator new[] + placement-new (matching the existing
+    j2k_codeblock pattern). Reduces N+1 heap allocations to 1 per container
+    and improves cache locality
+  * coding_units: embed tagtree by value in j2k_precinct_subband,
+    eliminating two heap allocations per precinct-subband
+  * decoder: hoist per-tile scratch vectors (band_h, tile_x_off, tile_w,
+    row_ptrs, accum) out of the multi-tile scatter loop in
+    decode_line_based_stream; reduces O(numTiles.x × numTiles.y) heap
+    allocations to a constant 5
+  Bug Fixes:
+  * Fix AVX2 pack_i32_to_u8: signed values were incorrectly clamped to zero
+    by packus; now clamps via max(0) before packing
+  * Fix fused dequant SIMD overshoot race condition with branchless VLC/MEL
+  * Fix WASM wasm_u16x8_shl → wasm_i16x8_shl in 16-bit decode path
+  * Fix WASM 16-bit HT decode regression and DC level shift
+  * Fix intermittent DFS/ATK test failures: zero full stride×height in LL
+    resolution buffer
+  * Fix segfault when /dev/null used as decoder output
+  * Suppress all compiler warnings in native and WASM builds
+  * Fix four memory leaks on exception paths: ~j2k_tile_component now finalizes
+    line_dec / line_enc so aligned_mem_alloc'd sub-resources are released
+    even if finalize_*() was never called explicitly; init_line_decode
+    validates DWT direction before allocating state arrays (fail-fast);
+    create_compressed_buffer NULL-checks the realloc result instead of
+    losing the original pointer; the four decoder.cpp invoke variants
+    validate tile count before allocating raw new int32_t[] output buffers
+    (was leaking on the validation throw)
+  * coding_units: add RAII placement_new_array_guard<T> and apply it at all
+    five placement-new array construction sites (codeblocks, pband, subbands,
+    precincts, resolution); partial construction failures now unwind cleanly
+    instead of leaving the destructor calling ~T on uninitialized memory
+  * ThreadPool: replace silent ring overflow under NDEBUG (assert-only) with
+    a std::runtime_error throw in both push_task and push_batch; raise
+    RING_CAP from 8192 to 32768 to comfortably hold the worst-case 4K-tile
+    encode burst (a single precinct can enqueue ~5-8 K codeblock tasks
+    back-to-back via push_batch before workers begin draining)
+  WASM:
+  * Node.js v24 compatibility: rewrite module loader to evaluate Emscripten JS
+    as CJS via new Function(), pass .wasm binary directly via wasmBinary
+    option, bypassing Emscripten's fetch()-based loader
+  * Add ~137 WASM conformance tests (HT Profile 0, Profile 1, HiFi) using
+    node open_htj2k_dec.mjs decode + native imgcmp comparison
+  * Align index.html ESM exports with updated wrapper API
+
+-------------------------------------------------------------------

Old:
----
  v0.8.0.tar.gz

New:
----
  v0.9.0.tar.gz

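The ThreadPool entry under Bug Fixes replaces an assert-only overflow check with a std::runtime_error throw. The sketch below shows the idea on a simplified, single-threaded ring; task_ring and its members are hypothetical names, and the real pool is concurrent with a different layout.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical fixed-capacity ring for illustration (single-threaded).
template <typename T, std::size_t Cap>
class task_ring {
  static_assert(Cap > 0 && (Cap & (Cap - 1)) == 0,
                "capacity must be a non-zero power of two");
  std::array<T, Cap> slots_{};
  std::size_t head_ = 0;  // index of the next element to pop
  std::size_t tail_ = 0;  // index of the next free slot

 public:
  std::size_t size() const { return tail_ - head_; }

  void push(const T &v) {
    // An assert() would compile away under NDEBUG and the write would
    // silently overwrite unconsumed entries; throwing keeps the check
    // active in release builds.
    if (size() == Cap) throw std::runtime_error("task ring overflow");
    slots_[tail_ & (Cap - 1)] = v;
    ++tail_;
  }

  T pop() {
    if (size() == 0) throw std::runtime_error("task ring empty");
    T v = slots_[head_ & (Cap - 1)];
    ++head_;
    return v;
  }
};
```

With a capacity of 4, the fifth push() without an intervening pop() throws instead of wrapping over the oldest entry.
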
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ openhtj2k.spec ++++++
--- /var/tmp/diff_new_pack.BPd08g/_old  2026-04-10 18:03:27.951730579 +0200
+++ /var/tmp/diff_new_pack.BPd08g/_new  2026-04-10 18:03:27.955730744 +0200
@@ -17,7 +17,7 @@
 
 
 Name:           openhtj2k
-Version:        0.8.0
+Version:        0.9.0
 Release:        0
 Summary:        An open source implementation of ITU-T Rec.814 | ISO 15444-15 (a.k.a. HTJ2K)
 License:        BSD-3-Clause

++++++ v0.8.0.tar.gz -> v0.9.0.tar.gz ++++++
/work/SRC/openSUSE:Factory/openhtj2k/v0.8.0.tar.gz /work/SRC/openSUSE:Factory/.openhtj2k.new.21863/v0.9.0.tar.gz differ: char 13, line 1

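The coding_units entry in the changelog mentions an RAII placement_new_array_guard<T> that unwinds partially constructed arrays. The following is a rough sketch of how such a guard could work. Only the class name comes from the changelog; its interface here (construct_next, release), the counted test type, and build_counted_array are assumptions made for illustration.

```cpp
#include <cstddef>
#include <new>
#include <stdexcept>
#include <utility>

// Sketch of a guard that tracks how many elements have been constructed
// in raw storage and destroys exactly those if it goes out of scope
// before release() is called. Interface is assumed, not the library's.
template <typename T>
class placement_new_array_guard {
  T *base_;
  std::size_t constructed_ = 0;
  bool released_ = false;

 public:
  explicit placement_new_array_guard(T *base) : base_(base) {}
  placement_new_array_guard(const placement_new_array_guard &) = delete;
  placement_new_array_guard &operator=(const placement_new_array_guard &) = delete;

  ~placement_new_array_guard() {
    if (released_) return;
    // Destroy only the elements that were built, in reverse order.
    while (constructed_ > 0) base_[--constructed_].~T();
  }

  template <typename... Args>
  void construct_next(Args &&...args) {
    ::new (static_cast<void *>(base_ + constructed_)) T(std::forward<Args>(args)...);
    ++constructed_;  // counted only after the constructor succeeded
  }

  void release() { released_ = true; }
};

// Hypothetical element type whose constructor fails for id == 3.
struct counted {
  static int &live() { static int n = 0; return n; }
  explicit counted(int id) {
    if (id == 3) throw std::runtime_error("constructor failed");
    ++live();
  }
  ~counted() { --live(); }
};

// Single allocation plus placement-new, protected by the guard.
inline counted *build_counted_array(std::size_t n) {
  void *raw = ::operator new[](n * sizeof(counted));
  try {
    placement_new_array_guard<counted> guard(static_cast<counted *>(raw));
    for (std::size_t i = 0; i < n; ++i) guard.construct_next(static_cast<int>(i));
    guard.release();  // every element is built; the caller now owns them
  } catch (...) {
    // The guard has already destroyed the constructed prefix by now.
    ::operator delete[](raw);
    throw;
  }
  return static_cast<counted *>(raw);
}
```

On success the caller owns the array and must destroy each element and call ::operator delete[] on the storage. On a constructor failure no destructor runs on uninitialized memory and the storage is freed.
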