Hi Yuan, I think it’s not ideal to disclose security issues before the release candidate is officially released. If discussion is needed on this, it should happen on the private list (or a security list, if the project has one). After the RC is released, a separate announcement would be the ideal path. As it stands, we are disclosing an attack vector while users have no way to mitigate it, possibly giving malicious entities an exact guide on how to exploit it.
And your email contained the wording I was referring to in the other discussion: "open-source release" 😉 What else should this be? I know that internally the commercial offering is referred to as the "enterprise version" and the other is naturally the "open-source version". Just mentioning it to increase sensitivity on this.

Chris

From: Yuan Tian <[email protected]>
Date: Friday, 3 July 2026 at 01:49
To: dev <[email protected]>
Subject: [DISCUSS] Kicking off 1.3.8 release: critical query livelock fix in 1.3.6 / 1.3.7

Hi all,

I'd like to propose that we finalize the scope of the next open-source release 1.3.8 and start the release process as soon as possible. The main motivation is a critical query bug that affects both v1.3.6 and v1.3.7. The fix is already merged on dev/1.3, so 1.3.8 should at least include it.

== Bug description and severity ==

Under a specific but fairly common data pattern, a query on an aligned device enters a livelock: it never returns and never errors out, while the query driver thread spins at ~100% CPU, repeatedly burning its full time slice, until the query finally hits the timeout.

I would rate this as critical:

* The affected query always fails (by timeout), with no error message that points to the cause, which makes it very hard for users to diagnose.
* Each stuck query pins query driver threads at full CPU for the entire timeout window. A handful of such queries can saturate the query thread pool and CPU, degrading all other queries on the node.
* The trigger pattern is common in practice: an aligned device where the queried measurement is sparse (contains nulls), combined with a time range filter. Aggregations (e.g. count) and raw queries are both affected, in both ASC and DESC order.

We hit this in production on v1.3.6: EXPLAIN ANALYZE snapshots showed the scan operators' CPU time growing linearly (~60s per 15s wall time across driver threads) while output rows and all I/O statistics stayed completely frozen, and CPU flame graphs showed ~90% of samples inside SeriesScanUtil.initFirstChunkMetadata (with ~1/3 of that in the System.nanoTime() calls of the time-slice guard loop, i.e. a pure busy wait).

== How to reproduce (verified on 1.3.6 / 1.3.7) ==

CREATE DATABASE root.sg1;
INSERT INTO root.sg1.d1(timestamp, s1, s2) ALIGNED VALUES (1, 1, 1);
INSERT INTO root.sg1.d1(timestamp, s1, s2) ALIGNED VALUES (2, null, 2);
INSERT INTO root.sg1.d1(timestamp, s1, s2) ALIGNED VALUES (3, null, 3);
FLUSH;
SELECT s1 FROM root.sg1.d1 WHERE time >= 3 AND time <= 4 ORDER BY time DESC;

Expected: an empty result set.
Actual: the query hangs until timeout.

An ascending variant triggers the same livelock, e.g. "SELECT count(s1) FROM ... WHERE time <= X" when s1's non-null values all lie after X (this is the shape we hit in production).

== Root cause ==

Two statistics sources got out of sync:

* File-level pruning (TimeFilter#canSkip) used the *time-column* statistics of the aligned timeseries metadata.
* SeriesScanUtil's overlap checks use ITimeSeriesMetadata#getStatistics(), which for a single-measurement aligned scan returns the *value-column* statistics (the non-null range, a subset of the time-column range). Since v1.3.6, the memtable scan optimization (commit dbc0133a on dev/1.3) additionally clamps the overlap-check endpoint by the global time filter range.

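To make the mismatch concrete, here is a minimal sketch using the reproduction data above. It only models closed-interval overlap; Range, canSkip and StatsMismatchSketch are hypothetical stand-ins for illustration and not the actual IoTDB or TsFile classes.

```java
// Illustrative sketch only: Range and canSkip are hypothetical stand-ins,
// not the actual IoTDB/TsFile classes or their real pruning logic.
final class StatsMismatchSketch {

    /** Closed time interval [start, end]. */
    static final class Range {
        final long start;
        final long end;

        Range(long start, long end) {
            this.start = start;
            this.end = end;
        }

        /** True when the two closed intervals share at least one timestamp. */
        boolean overlaps(Range other) {
            return start <= other.end && other.start <= end;
        }
    }

    /** A file can be pruned when its statistics range misses the filter range. */
    static boolean canSkip(Range statistics, Range filter) {
        return !statistics.overlaps(filter);
    }

    public static void main(String[] args) {
        Range timeColumn = new Range(1, 3);    // rows were written at timestamps 1..3
        Range valueColumnS1 = new Range(1, 1); // s1 is non-null only at timestamp 1
        Range filter = new Range(3, 4);        // time >= 3 AND time <= 4

        // Pruning on time-column statistics keeps the file ...
        System.out.println("skip by time-column stats:  " + canSkip(timeColumn, filter));
        // ... while the value-column statistics never overlap the filter.
        System.out.println("skip by value-column stats: " + canSkip(valueColumnS1, filter));
    }
}
```

With this data the time-column range [1, 3] overlaps the filter [3, 4], so the file is not skipped, whereas the value-column range [1, 1] does not overlap it. The two checks therefore disagree about the same file.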
As a result, a file whose time-column range overlaps the filter but whose queried measurement has no non-null value inside the filter range passes canSkip() and gets loaded, yet the clamped endpoint can never overlap the metadata's own statistics. initFirstChunkMetadata() then neither unpacks nor discards firstTimeSeriesMetadata, hasNextChunk() keeps returning Optional.empty(), and the operator's time-slice loop spins forever.

v1.3.5 and earlier are not affected because the overlap endpoint was the metadata's own endTime, which always overlaps itself.

== The fix ==

Already on dev/1.3:

* apache/tsfile#716: TimeFilter.canSkip()/allSatisfy() now use getStatistics(), consistent with the scan-side overlap checks. (develop-branch equivalent: apache/tsfile#715)
* apache/iotdb#17120: bumps dev/1.3 to a tsfile version containing the fix and adds a regression IT (testQueryWithGlobalTimeFilterOrderByTimeDesc).

Note that dev/1.3 currently depends on tsfile 1.1.4-SNAPSHOT, so an official tsfile 1.1.4 release is a prerequisite for releasing IoTDB 1.3.8.

== Proposal ==

1. Release tsfile 1.1.4 (dev/1.1) first.
2. Cut rc/1.3.8 from dev/1.3 shortly after, which already contains the fix above as well as several other correctness fixes in the same area (e.g. #16993, #16970).
3. If you have other fixes or changes that should go into 1.3.8, please reply in this thread so we can settle the scope quickly.

Any feedback is welcome.

Best regards,
----------------
Yuan Tian
