JackieTien97 commented on code in PR #816:
URL: https://github.com/apache/tsfile/pull/816#discussion_r3251953344


##########
python/tsfile/dataset/reader.py:
##########
@@ -217,6 +227,111 @@ def _cache_metadata_table_model(self):
             )
             sys.stderr.flush()
 
+    def _cache_metadata_tree_model(self):
+        """Build the in-memory catalog by synthesizing one virtual table.
+
+        Tree TsFiles have no schema, so we materialize a single
+        ``TableEntry``: table name = the shared root segment, tag columns
+        = ``_col_1..._col_{N_max}`` (one per remaining path segment),
+        fields = union of measurements across devices. Per-device
+        ownership is preserved by registering only the
+        ``(device_id, field_idx)`` pairs actually written on disk in
+        ``series_stats_by_ref``.
+        """
+        metadata_groups = self._reader.get_timeseries_metadata(None)
+        if not metadata_groups:
+            raise ValueError("No devices found in tree-model TsFile")
+
+        # 1) Walk every device once to collect: root-segment, max depth, and
+        #    the union of measurements that pass the numeric filter.
+        root_name = None
+        max_depth = 0  # segments after the root (i.e. virtual tag depth)
+        device_specs = []  # list of (tail_segments, group, stats_by_field)
+        union_fields = []  # ordered union of measurement names
+        seen_field_names = set()
+
+        for device_path, group in metadata_groups.items():
+            if not device_path:
+                continue
+            full_segments = tuple(device_path.split("."))
+            if not full_segments:
+                continue
+            current_root = full_segments[0]
+            if root_name is None:
+                root_name = current_root

Review Comment:
   This validation rejects tree files whose devices sit under more than one root
segment (`root_name != current_root`). In practice, IoTDB tree-model files
virtually always use `root` as the single root. The check still matters because
the synthetic table uses the root segment as its `table_name`: if a file had
devices under multiple roots, the virtual table's name would be ambiguous.
   
   The error message is clear and this is the right behavior. I'm just noting that
I'd expect it never to fire in practice. If it does, it signals a seriously
malformed file rather than a normal use case.
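
   For reference, the check discussed above can be sketched roughly as follows.
This is a minimal standalone illustration, not the PR's code: the quoted hunk
stops before the comparison, so the helper name `resolve_single_root` and the
wording of the multi-root error message are assumptions.

```python
def resolve_single_root(device_paths):
    """Return the root segment shared by all device paths.

    Hypothetical helper for illustration only; the name and the
    multi-root error text are not taken from the PR.
    """
    root_name = None
    for device_path in device_paths:
        if not device_path:
            continue  # skip empty paths, mirroring the quoted hunk
        current_root = device_path.split(".")[0]
        if root_name is None:
            root_name = current_root
        elif root_name != current_root:
            # A second root would make the synthetic table name ambiguous.
            raise ValueError(
                "Tree-model TsFile has multiple root segments: "
                f"{root_name!r} and {current_root!r}"
            )
    if root_name is None:
        raise ValueError("No devices found in tree-model TsFile")
    return root_name
```

   With paths such as `root.sg1.d1` and `root.sg2.d2` this returns `root`; mixing
`root.sg1.d1` with a path under a different first segment raises `ValueError`.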



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
