GitHub user mengw15 added a comment to the discussion: Design: interactive grid 
for the operator result pane

Thanks for putting this together - a few comments / questions:

1. Full-scan latency + cancellation. Sort always needs the full set in memory 
(Iceberg has no order-by pushdown), and the residual filters (contains / 
endsWith / row-search) can't be pruned by file stats. Even the pushdownable ops 
(=, <, >, in, startsWith) only skip files when the data is clustered by that 
column — operator results are written in arrival order, so min/max ranges 
overlap and pruning is usually weak. On a large output this can mean scanning 
most of the table. Do we have a sense of the actual latency and compute cost 
there — how long does a user wait for a sort or row-search to come back? And if 
a user accidentally sorts/filters a huge result, is there a way to cancel an 
in-flight query (or a timeout) so the panel doesn't hang?

2. View vs. dataflow semantics. We're a workflow system, so I'm assuming the 
filter/sort here only changes what's shown in the panel — the data passed to 
the downstream operator is still the full, unfiltered output. If so, could this 
mislead users into thinking they've filtered the actual data? 

3. Persistence of the query state. Are the filter/sort (and their results) 
persisted? If a user sets a filter, switches away, and re-opens the operator, 
do they get the filtered view back, or does it re-scan and re-filter from 
scratch? 

4. Overlap with the Filter / Selection operators. We already have Filter and 
Selection operators. For a dataflow system, the more intuitive way to persist a 
filter is an operator — its output flows downstream (which also addresses the 
second point) and stays semantically consistent with the rest of the system; 
and if the operator-result cache (cc @Xiao-zhen-Liu) is enabled, the cost 
should be comparable since the upstream is cached. Curious how you're drawing 
that line.
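
To make the pruning concern in point 1 concrete, here is a toy Python sketch. It is not Iceberg's actual API: `file_stats` and `files_to_scan` are made-up helpers, the row and file counts are arbitrary, and real planning involves more than one column's min/max. It only models the idea that a file can be skipped for an equality predicate when the target lies outside that file's [min, max] range, and compares arrival-order data with the same data sorted by the filtered column.

```python
import random


def file_stats(values, rows_per_file):
    """Split values into consecutive 'files' and record each file's (min, max).

    Hypothetical stand-in for per-file column statistics.
    """
    stats = []
    for start in range(0, len(values), rows_per_file):
        chunk = values[start:start + rows_per_file]
        stats.append((min(chunk), max(chunk)))
    return stats


def files_to_scan(stats, target):
    """Count files whose [min, max] range may contain target (cannot be skipped)."""
    return sum(1 for lo, hi in stats if lo <= target <= hi)


random.seed(0)  # fixed seed so the comparison is repeatable

# Assumption: 100k rows with uniformly random keys, written 1,000 rows per file.
rows = [random.randint(0, 999_999) for _ in range(100_000)]
target = 500_000

arrival_stats = file_stats(rows, 1_000)            # written in arrival order
clustered_stats = file_stats(sorted(rows), 1_000)  # clustered by the filter column

arrival_scanned = files_to_scan(arrival_stats, target)
clustered_scanned = files_to_scan(clustered_stats, target)

print(f"arrival order: scan {arrival_scanned} of {len(arrival_stats)} files")
print(f"clustered:     scan {clustered_scanned} of {len(clustered_stats)} files")
```

In this toy setup the arrival-order layout should end up opening nearly every file, because each file's min/max range spans almost the whole key space, while the sorted layout touches at most a file or two. That gap is what I'd like to see measured on a real operator output.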

GitHub link: 
https://github.com/apache/texera/discussions/5395#discussioncomment-17287802
