praveenc7 commented on PR #15350:
URL: https://github.com/apache/pinot/pull/15350#issuecomment-2845736782
Hi @Jackie — adding a bit more colour here.
For some production workloads, **correctness of the returned data matters
more than a delay in projection (due to reload)**; in those cases the
“drop-until-fully-loaded” behaviour is preferable.
That said, we can start with the **on-the-fly virtual-column** approach as the
default and later, if needed, gate the behaviour behind a flag such as
`pinot.reloadColumnHandling = {DROP | VIRTUAL}`.
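To make the two modes concrete, here is a rough Python sketch (illustrative only, not Pinot code). The enum name `ReloadColumnHandling` and the helper `resolve_column` are hypothetical, and it assumes that `DROP` means the new column is omitted from the result until every responding server has it, while `VIRTUAL` keeps the column and fills the gaps with nulls.

```python
from enum import Enum


class ReloadColumnHandling(Enum):
    # Hypothetical modes mirroring the proposed flag values.
    DROP = "DROP"        # assumption: omit the column until every server has it
    VIRTUAL = "VIRTUAL"  # keep the column, pad missing data with nulls


def resolve_column(column, server_columns, mode):
    """Decide how a newly added column is projected.

    server_columns: one set of column names per responding server.
    Returns "real", "padded", or "dropped".
    """
    have = [column in cols for cols in server_columns]
    if all(have):
        return "real"       # every server already serves the column
    if mode is ReloadColumnHandling.DROP:
        return "dropped"    # correctness first: hide the partial column
    return "padded"         # availability first: null-fill the gaps
```

For example, with S1 reloaded and S2 not yet reloaded, `resolve_column("c", [{"a", "c"}, {"a"}], ReloadColumnHandling.VIRTUAL)` returns `"padded"`, and the same call with `ReloadColumnHandling.DROP` returns `"dropped"`.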
Below is how the virtual-column path would work in the various consistency
windows that can arise right after a schema update (✅ = column present in the
schema on that component, ❌ = column absent).
| Broker schema | Server S1 schema | Server S2 schema | What happens in this query path? |
|---------------|------------------|------------------|----------------------------------|
| ✅ | ❌ | ❌ | Broker detects the column is missing from both result blocks and adds an **all-null virtual column** during the reduce phase before returning results. |
| ✅ | ✅ | ❌ | S1 returns real data; S2 returns no column data. Broker stitches the two blocks, injecting the **null vector** only for S2’s rows so the merged table is schema-consistent. |
| ✅ | ✅ | ✅ | All segments agree → no action required. |
| ❌ | ✅ | ✅ | **Corner case:** Servers have the column, but the broker hasn’t refreshed the schema yet; we accept the server schema, merge, and return results. |
| ❌ | ✅ | ❌ | Mixed state + broker unaware → treated like the previous row (add column). |
| ❌ | ❌ | ❌ | No component knows about the column yet → nothing to do. |
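
The rows above boil down to one merge rule, roughly sketched below in Python (illustrative only, not the actual broker reduce code). The function `merge_blocks` and its block layout are hypothetical: each server block is modelled as a `(columns, rows)` pair, the merged schema is the broker's columns plus any extra column a server returned, and any column a block lacks is padded with `None`.

```python
def merge_blocks(broker_schema, server_blocks):
    """Merge per-server result blocks into one schema-consistent table.

    broker_schema: list of column names the broker knows about.
    server_blocks: list of (columns, rows) pairs, one per server; each row
                   is a list of values aligned with that block's columns.
    """
    # Start from the broker's view, then accept any column a server returned
    # that the broker has not seen yet (the "broker unaware" rows).
    merged_columns = list(broker_schema)
    for columns, _ in server_blocks:
        for name in columns:
            if name not in merged_columns:
                merged_columns.append(name)

    merged_rows = []
    for columns, rows in server_blocks:
        index = {name: i for i, name in enumerate(columns)}
        for row in rows:
            # Columns missing from this block are padded with None (null).
            merged_rows.append(
                [row[index[name]] if name in index else None
                 for name in merged_columns]
            )
    return merged_columns, merged_rows
```

With the broker and S1 on the new schema and S2 behind, `merge_blocks(["id", "new_col"], [(["id", "new_col"], [[1, "x"]]), (["id"], [[2]])])` returns `(["id", "new_col"], [[1, "x"], [2, None]])`. If the broker's schema is only `["id"]`, the result is the same, since the server-reported column is accepted.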
**Key point**
**Single-hop correctness guarantee** – As long as the broker or a server has
the new schema, I can try to change the behaviour so that we guarantee a
consistent projection by padding missing segments with a null vector. How do
you feel about this?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]