hongzhi-gao opened a new pull request, #767:
URL: https://github.com/apache/tsfile/pull/767
## Summary
Adds **device enumeration** and **per-device timeseries metadata** to the
**C wrapper**, with matching **Python** bindings on `TsFileReader`. A follow-up
extends **`TimeseriesStatistic`** (numeric min/max/first/last, boolean
first/last, string stats).
---
## C API (`tsfile_cwrapper.h`)
**Types:** `DeviceID`, `TimeseriesStatistic`, `TimeseriesMetadata`,
`DeviceTimeseriesMetadataEntry`, `DeviceTimeseriesMetadataMap`.
**Sentinel:** `tsfile_c_metadata_empty_device_list_marker` — use with
`length == 0` when you need a non-null `device_ids` pointer for an empty query.
**Functions**
- `tsfile_reader_get_all_devices` → `out_devices` / `out_length`; free with
`tsfile_free_device_id_array`.
- `tsfile_reader_get_timeseries_metadata` → `out_map`; free with
`tsfile_free_device_timeseries_metadata_map`.
**Query semantics (`tsfile_reader_get_timeseries_metadata`)**
| `device_ids` | `length` | Result |
|--------------|----------|--------|
| `NULL` | (ignored) | All devices in the file |
| non-`NULL` | `0` | Empty map, `E_OK` (IDs not read) |
| non-`NULL` | `> 0` | Subset; only devices that exist appear |
**Errors:** If `length > 0` and any `device_ids[i].path == NULL` →
`E_INVALID_ARG` / `RET_INVALID_ARG`.
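
The query semantics and the error rule above can be modelled in a few lines of plain Python. This is only an illustrative sketch, not the wrapper's implementation: `select_metadata` is a hypothetical helper, the metadata is a plain dict keyed by device path, and `ValueError` stands in for `E_INVALID_ARG`.

```python
from typing import Dict, List, Optional, Sequence


def select_metadata(
    all_metadata: Dict[str, List[str]],
    device_paths: Optional[Sequence[Optional[str]]],
) -> Dict[str, List[str]]:
    """Hypothetical model of the three query cases in the table above."""
    # NULL device_ids: return every device in the file.
    if device_paths is None:
        return dict(all_metadata)
    # Non-NULL with length 0: empty result, and the IDs are never inspected.
    if len(device_paths) == 0:
        return {}
    # length > 0: any NULL path is rejected (stand-in for E_INVALID_ARG).
    if any(path is None for path in device_paths):
        raise ValueError("device path must not be None")
    # Subset query: devices that do not exist are silently skipped.
    return {p: all_metadata[p] for p in device_paths if p in all_metadata}
```

Note that the empty-list check runs before the per-path validation, which mirrors the "IDs not read" remark in the table.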
**Memory:** All heap data in `out_map` (including
`TimeseriesStatistic.str_*`) is owned by the API; **only** release via
`tsfile_free_device_timeseries_metadata_map`. Do not free `str_*` individually.
**`TimeseriesStatistic` (which flags matter)**
- `int_range_valid` + `*int64` — `INT32`, `DATE`, `INT64`, `TIMESTAMP`
- `float_range_valid` + `*float64` — `FLOAT`, `DOUBLE`
- `bool_ext_valid` + `first_bool` / `last_bool` — `BOOLEAN` (plus existing
`sum_valid` / `sum`)
- `str_ext_valid` + `str_*` — `STRING` (lexicographic min/max, time-ordered
first/last); `TEXT` — **first/last only**; do not rely on min/max for `TEXT`
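
As a reading aid, the flag list above can be summarised as a lookup from data type to the validity flag to check and the statistic kinds that are safe to rely on. This is a sketch under assumptions: the table and `reliable_stats` are hypothetical, the flag names are taken from the list above, the numeric kinds follow the Summary ("min/max/first/last"), and `TEXT` is assumed to share `str_ext_valid` with `STRING`.

```python
from typing import Dict, FrozenSet, Tuple

_ALL = frozenset({"min", "max", "first", "last"})
_FIRST_LAST = frozenset({"first", "last"})

# data type -> (validity flag to check, statistic kinds safe to read)
STAT_RULES: Dict[str, Tuple[str, FrozenSet[str]]] = {
    "INT32": ("int_range_valid", _ALL),
    "DATE": ("int_range_valid", _ALL),
    "INT64": ("int_range_valid", _ALL),
    "TIMESTAMP": ("int_range_valid", _ALL),
    "FLOAT": ("float_range_valid", _ALL),
    "DOUBLE": ("float_range_valid", _ALL),
    "BOOLEAN": ("bool_ext_valid", _FIRST_LAST),
    "STRING": ("str_ext_valid", _ALL),
    # TEXT: min/max must not be relied on (assumed to use the same flag).
    "TEXT": ("str_ext_valid", _FIRST_LAST),
}


def reliable_stats(data_type: str) -> Tuple[str, FrozenSet[str]]:
    """Return the flag to test and the statistic kinds to trust for a type."""
    return STAT_RULES[data_type.upper()]
```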
---
## Python API
**Schema:** `DeviceID`, `TimeseriesStatistic` (extended; string fields
`Optional[str]`), `TimeseriesMetadata`.
**`TsFileReader`**
- `get_all_devices() -> list[DeviceID]`
- `get_timeseries_metadata(device_ids=None) -> dict[str, list[TimeseriesMetadata]]`
  - `None` → all devices; `[]` → `{}`; non-empty list → filter (`DeviceID` or `str` paths).
Native memory is freed inside the extension; callers work only with the
returned Python dict and objects and never free anything themselves.
**Example**
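```python
from tsfile import TsFileReader, DeviceID  # import path assumed; adjust to your install

reader = TsFileReader("file.tsfile")
try:
    devices = reader.get_all_devices()           # list[DeviceID]
    meta = reader.get_timeseries_metadata(None)  # all devices
    # Subset query; DeviceID objects and plain str paths can be mixed.
    sub = reader.get_timeseries_metadata([DeviceID("root.sg.d1"), "root.sg.d2"])
finally:
    reader.close()  # release the reader even if a call above raises
```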
---
## Tests
- C++: `cpp/test/cwrapper/cwrapper_metadata_test.cc`
- Python: `python/tests/test_reader_metadata.py`
---
**Note:** Devices and series are returned in the iteration order of the
underlying native C++ map/container; no stronger ordering guarantee is
documented yet, so callers should not depend on a particular order.
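
Since the ordering is not guaranteed, callers that need stable output (for example in tests or diffs) may want to impose their own order. A minimal sketch, assuming the result is an ordinary `dict` keyed by device path as described in the Python API section; `sorted_by_device` is a hypothetical helper, not part of this PR.

```python
from typing import Dict, List, TypeVar

T = TypeVar("T")


def sorted_by_device(meta: Dict[str, List[T]]) -> Dict[str, List[T]]:
    """Return a copy of the metadata dict with device paths in sorted order.

    Python dicts preserve insertion order, so iterating the result is
    deterministic regardless of the native container's iteration order.
    """
    return {device: meta[device] for device in sorted(meta)}
```

The per-device series lists are left untouched here; sorting them would need a key taken from `TimeseriesMetadata`, whose field names are not listed in this description.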