GitHub user mengw15 added a comment to the discussion: Design: interactive grid for the operator result pane
Thanks for putting this together - a few comments / questions:

1. Full-scan latency + cancellation. Sort always needs the full set in memory (Iceberg has no order-by pushdown), and the residual filters (contains / endsWith / row-search) can't be pruned by file stats. Even the pushdownable ops (=, <, >, in, startsWith) only skip files when the data is clustered by that column; operator results are written in arrival order, so min/max ranges overlap and pruning is usually weak. On a large output this can mean scanning most of the table. Do we have a sense of the actual latency and compute cost there: how long does a user wait for a sort or row-search to come back? And if a user accidentally sorts/filters a huge result, is there a way to cancel an in-flight query (or a timeout) so the panel doesn't hang?

2. View vs. dataflow semantics. We're a workflow system, so I'm assuming the filter/sort here only changes what's shown in the panel, and the data passed to the downstream operator is still the full, unfiltered output. If so, could this mislead users into thinking they've filtered the actual data?

3. Persistence of the query state. Are the filter/sort (and their results) persisted? If a user sets a filter, switches away, and re-opens the operator, do they get the filtered view back, or does it re-scan and re-filter from scratch?

4. Overlap with the Filter / Selection operators. We already have Filter and Selection operators. For a dataflow system, the more intuitive way to persist a filter is an operator: its output flows downstream (which also addresses the second point) and stays semantically consistent with the rest of the system; and if the operator-result cache (cc @Xiao-zhen-Liu) is enabled, the cost should be comparable since the upstream is cached. Curious how you're drawing that line.

GitHub link: https://github.com/apache/texera/discussions/5395#discussioncomment-17287802
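
To make the pruning concern in point 1 concrete, here is a toy sketch in Python. It is not Iceberg or Texera code: `make_files` and `files_to_scan` are hypothetical helpers, and the "files" are just in-memory chunks with min/max stats attached. It only illustrates why an equality predicate can skip almost everything on clustered data but almost nothing on data written in arrival order.

```python
import random

def make_files(values, rows_per_file):
    """Split values into fixed-size 'files' and record per-file min/max stats.

    Hypothetical stand-in for data files with column statistics.
    """
    files = []
    for i in range(0, len(values), rows_per_file):
        chunk = values[i:i + rows_per_file]
        files.append({"min": min(chunk), "max": max(chunk), "rows": chunk})
    return files

def files_to_scan(files, lo, hi):
    """Keep only files whose [min, max] range overlaps the predicate range [lo, hi]."""
    return [f for f in files if f["max"] >= lo and f["min"] <= hi]

random.seed(0)
values = list(range(10_000))

# Case 1: data clustered (sorted) by the filtered column.
clustered = make_files(sorted(values), 100)

# Case 2: data written in arrival order, modeled here as a random shuffle.
shuffled = values[:]
random.shuffle(shuffled)
arrival = make_files(shuffled, 100)

# Equality predicate `col = 4200`, expressed as the range [4200, 4200].
clustered_hits = len(files_to_scan(clustered, 4200, 4200))
arrival_hits = len(files_to_scan(arrival, 4200, 4200))

print(f"clustered: scan {clustered_hits} of {len(clustered)} files")
print(f"arrival order: scan {arrival_hits} of {len(arrival)} files")
```

In the clustered case only the single file covering 4200 survives pruning; in the shuffled case nearly every file's min/max range straddles 4200, so nearly every file has to be read. Real operator output may be less uniformly random than this shuffle, so the actual pruning rate would need to be measured.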
