JackieTien97 commented on code in PR #816:
URL: https://github.com/apache/tsfile/pull/816#discussion_r3251951435
##########
python/tsfile/dataset/dataframe.py:
##########
@@ -593,17 +613,19 @@ def _from_subset(
obj._paths = parent._paths
obj._show_progress = parent._show_progress
obj._readers = parent._readers
- obj._index = _LogicalIndex(
+ # Reuse the parent's full mapping but restrict the membership scope to
+ # the requested subset.
+ subset_refs = list(series_refs)
+ parent_shards = parent._index.series_shards
+ subset_shards = {ref: parent_shards[ref] for ref in subset_refs}
+ obj._index = _DataFrameCatalog(
+ model=parent._index.model,
Review Comment:
`_from_subset` creates a new `_DataFrameCatalog` but does not propagate
`tables_with_sparse_tag_values` or `sparse_device_indices_by_compressed_path`.
If a view (created by slicing or boolean indexing) later calls
`_resolve_series_name` on a series whose device has sparse tags, the lookup
will fail because the compressed-path index is empty.
The old `_LogicalIndex` code had the same omission, so this is a
pre-existing gap, but since this PR refactors `_from_subset`, it's worth
noting. Passing both fields through from the parent index should fix it:
```python
tables_with_sparse_tag_values=parent._index.tables_with_sparse_tag_values,
sparse_device_indices_by_compressed_path=parent._index.sparse_device_indices_by_compressed_path,
```
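
To illustrate the failure mode, here is a minimal, self-contained sketch. `Catalog`, `lookup`, and `from_subset` are hypothetical stand-ins rather than the real `_DataFrameCatalog` API, and the field types (a set and a dict keyed by table and compressed path) are assumptions made only for the example:

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Hypothetical stand-in for _DataFrameCatalog; field types are assumed."""

    series_shards: dict
    tables_with_sparse_tag_values: set = field(default_factory=set)
    sparse_device_indices_by_compressed_path: dict = field(default_factory=dict)

    def lookup(self, table, compressed_path):
        # Assumed behaviour: sparse devices are found via the compressed-path index.
        key = (table, compressed_path)
        return self.sparse_device_indices_by_compressed_path.get(key)


def from_subset(parent, refs, propagate_sparse):
    # Restrict the shard mapping to the requested subset, as in the diff above.
    shards = {ref: parent.series_shards[ref] for ref in refs}
    if not propagate_sparse:
        # Mirrors the gap: the sparse-tag fields fall back to empty defaults.
        return Catalog(series_shards=shards)
    return Catalog(
        series_shards=shards,
        tables_with_sparse_tag_values=parent.tables_with_sparse_tag_values,
        sparse_device_indices_by_compressed_path=(
            parent.sparse_device_indices_by_compressed_path
        ),
    )


parent = Catalog(
    series_shards={"s1": [0], "s2": [1]},
    tables_with_sparse_tag_values={"t"},
    sparse_device_indices_by_compressed_path={("t", "a.b"): 3},
)
broken_view = from_subset(parent, ["s1"], propagate_sparse=False)
fixed_view = from_subset(parent, ["s1"], propagate_sparse=True)
```

In this sketch `broken_view.lookup("t", "a.b")` finds nothing, while `fixed_view.lookup("t", "a.b")` resolves to the parent's device index. The real `_resolve_series_name` may surface the miss differently (for example, by raising).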