LakshSingla commented on code in PR #16800:
URL: https://github.com/apache/druid/pull/16800#discussion_r1719328472
##########
processing/src/main/java/org/apache/druid/query/operator/WindowOperatorQueryQueryToolChest.java:
##########
@@ -116,6 +128,36 @@ public Sequence<Object[]> resultsAsArrays(
return (Sequence) resultSequence;
}
+ @Override
+ public Optional<Sequence<FrameSignaturePair>> resultsAsFrames(
+ WindowOperatorQuery query,
+ Sequence<RowsAndColumns> resultSequence,
+ MemoryAllocatorFactory memoryAllocatorFactory,
+ boolean useNestedForUnknownTypes
+ )
+ {
Review Comment:
> the context flag about num bytes versus num rows is what determines which
thing does what, so there's a thing that already describes how to do the
transition
I thought about this. However, a cluster-level config can also determine the
limit, so the window tool chest would have to check that as well. It seems
wrong for the window tool chest to be the one deciding what to do.
> If we just blindly try one, fail and then do the other, that will show up
to users as a performance hit because they have no clue that there's this rando
intermediate logic that is failing for a reason
The fallback is mostly for cases where the types aren't known. I agree that it
is a performance hit, but when this feature was added, the signature reported
by the tool chest didn't need to include types. Scan queries only knew the
column names (not the types), and the group by, time series, etc. toolchests
could return `null` for the aggregator's dimensions. The fallback existed for
these cases, where the failure is easy to detect relatively early in the
subquery processing flow, and it made the transition from the row-based to the
byte-based limit simple. There's an undocumented parameter that treated
these null types as JSON types, but that had logical flaws of its own, iirc.
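To make the fallback concrete, here is a rough sketch of the "try frames, else
fall back to rows" flow described above. It is not Druid's actual code: the
class `FallbackSketch`, the methods `tryAsFrames` and `materialize`, and the
string results are all hypothetical stand-ins, and a signature is modeled as a
plain map from column name to type name, with `null` meaning "type unknown".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch only; none of these names exist in Druid.
final class FallbackSketch
{
  // Returns a frame-like result only when every column type is known.
  // An empty Optional signals "cannot convert", which is detected before
  // any rows are processed.
  static Optional<String> tryAsFrames(Map<String, String> signature, List<Object[]> rows)
  {
    for (Map.Entry<String, String> column : signature.entrySet()) {
      if (column.getValue() == null) {
        return Optional.empty();
      }
    }
    return Optional.of("frames(" + rows.size() + " rows)");
  }

  // Prefers the byte-based limit (frames) when requested, and falls back to
  // the row-based limit when the frame conversion is not possible.
  static String materialize(Map<String, String> signature, List<Object[]> rows, boolean preferBytes)
  {
    if (preferBytes) {
      Optional<String> frames = tryAsFrames(signature, rows);
      if (frames.isPresent()) {
        return "byte-limited: " + frames.get();
      }
    }
    return "row-limited: " + rows.size() + " rows";
  }

  public static void main(String[] args)
  {
    List<Object[]> rows = new ArrayList<>();
    rows.add(new Object[]{"a", 1L});

    Map<String, String> typed = new LinkedHashMap<>();
    typed.put("dim", "STRING");
    typed.put("cnt", "LONG");

    Map<String, String> untyped = new LinkedHashMap<>();
    untyped.put("dim", null); // e.g. a scan query that only knows column names

    System.out.println(materialize(typed, rows, true));
    System.out.println(materialize(untyped, rows, true));
  }
}
```

In this sketch the fallback costs only a pass over the signature, which is the
"easy to detect early" property mentioned above; a tool chest that always knows
its types would never take the second branch.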
Removing the fallback would make the change much easier, and I am now much
more confident that the query doesn't need to fall back (and we know the
relevant cases beforehand). However, I'd still like to keep it for a while,
just in case. I have an idea; it depends on the fact that RACs can convert
themselves to frames properly and that window toolchests would never fall back.
Thanks for the help!! I can work with the "serialization": "frame" parameter
as a workaround to the current design choices.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]