On 1/9/22 5:13 PM, Julien Rouhaud wrote:
> For now the queryid mixes two different things: fingerprinting and query text
> normalization. Should each calculation method be allowed to do a different
> normalization too, and if yes, where should the state data needed for that be
> stored? If not, we would need some kind of primary hash for that purpose.

Do you mean JumbleState?
I think that, when registering a queryId generator, we should also store
a pointer (void **args) to an additional data entry, as usual.
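
To make the idea concrete, here is a minimal standalone sketch of such a registration. All names in it (QueryIdGenerator, register_queryid_generator, compute_queryid, CountingState) are hypothetical and are not an existing PostgreSQL API. It also simplifies in two ways: it passes a plain void * rather than void **args, and it hashes the query text with FNV-1a instead of jumbling a query tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type: a generator receives its own private state. */
typedef uint64_t (*queryid_generator_fn) (const char *query, void *private_data);

typedef struct QueryIdGenerator
{
    const char *name;
    queryid_generator_fn fn;
    void       *private_data;   /* generator-specific state, JumbleState-like */
} QueryIdGenerator;

#define MAX_GENERATORS 8
static QueryIdGenerator generators[MAX_GENERATORS];
static int  n_generators = 0;

/* Store the callback together with its state; returns the slot, or -1 if full. */
static int
register_queryid_generator(const char *name, queryid_generator_fn fn,
                           void *private_data)
{
    assert(fn != NULL);
    if (n_generators >= MAX_GENERATORS)
        return -1;
    generators[n_generators] = (QueryIdGenerator) {name, fn, private_data};
    return n_generators++;
}

/* Call one registered generator, handing back the state it registered. */
static uint64_t
compute_queryid(int slot, const char *query)
{
    assert(slot >= 0 && slot < n_generators);
    return generators[slot].fn(query, generators[slot].private_data);
}

/* Example generator: FNV-1a over the text, counting calls in its state. */
typedef struct CountingState
{
    int         calls;
} CountingState;

static uint64_t
fnv1a_generator(const char *query, void *private_data)
{
    CountingState *state = (CountingState *) private_data;
    uint64_t    h = 14695981039346656037ULL;

    for (const unsigned char *p = (const unsigned char *) query; *p; p++)
    {
        h ^= *p;
        h *= 1099511628211ULL;
    }
    if (state != NULL)
        state->calls++;
    return h;
}
```

Each generator then owns its normalization state, and the core code only has to carry the pointer around.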

> Looking at Andrey's use case for wanting multiple hashes, I don't think that
> adaptive optimization needs a normalized query string. The only use would be
> to output some statistics, but this could be achieved by storing a list of
> "primary queryid" for each adaptive entry. That's probably also true for
> anything that's not monitoring intended. Also, all monitoring consumers should
> probably agree on the same queryid, both fingerprint and normalized string, as
> otherwise it's impossible to cross-reference metric data.

I can add one more use case.
Our extension for freezing query plans uses a query tree comparison
technique to prove that a plan can be applied (so that we don't need to
run the planner at all).
Checking tree equality is expensive, so we use a cheaper queryId
comparison to identify possible candidates. So here, for better
performance and query coverage, we need query tree normalization: the
queryId should be stable under modifications of the query text that do
not change its semantics.
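
The two-stage check can be sketched roughly as below. This is only an illustration under stated assumptions: FrozenPlan, trees_equal and find_frozen_plan are hypothetical names, a string stands in for the query tree, and strcmp stands in for the real (expensive) tree comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct FrozenPlan
{
    uint64_t    query_id;       /* cheap fingerprint */
    const char *tree_repr;      /* stand-in for the stored query tree */
    const char *plan_name;      /* stand-in for the frozen plan */
} FrozenPlan;

/* Counts how often the expensive comparison actually runs. */
static int  expensive_checks = 0;

/* Stand-in for the expensive query tree equality check. */
static bool
trees_equal(const char *a, const char *b)
{
    expensive_checks++;
    return strcmp(a, b) == 0;
}

/*
 * Stage 1: skip every entry whose queryId differs (cheap integer compare).
 * Stage 2: prove equality with the expensive check, because equal queryIds
 * only make an entry a candidate, not a guaranteed match.
 */
static const FrozenPlan *
find_frozen_plan(const FrozenPlan *plans, size_t n,
                 uint64_t query_id, const char *tree_repr)
{
    for (size_t i = 0; i < n; i++)
    {
        if (plans[i].query_id != query_id)
            continue;
        if (trees_equal(plans[i].tree_repr, tree_repr))
            return &plans[i];
    }
    return NULL;
}
```

The more stable the queryId is under harmless changes of the query text, the more incoming queries pass stage 1 and can reuse a frozen plan.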
As an example, a query plan with external parameters can be used to
execute a query with constants if these constants correspond by position
and type to the parameters. So the queryId calculation technique also
returns pointers to all constants and parameters found during the
calculation.
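
A simplified sketch of that idea follows. It is not our extension's code: Node, ValueSlots and queryid_with_slots are hypothetical, a flat node array stands in for the query tree, and FNV-1a stands in for the real hash. Constants and parameters are hashed only by their type, so they fingerprint the same, and a pointer to each one is recorded in order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SLOTS 16

typedef enum NodeKind {NODE_OP, NODE_CONST, NODE_PARAM} NodeKind;

typedef struct Node
{
    NodeKind    kind;
    int         type_id;        /* data type of a constant or parameter */
    const char *text;           /* operator text or literal value */
} Node;

/* Pointers to every constant/parameter, in the order they were visited. */
typedef struct ValueSlots
{
    int         count;
    const Node *slots[MAX_SLOTS];
} ValueSlots;

static uint64_t
fnv1a(uint64_t h, const void *data, size_t len)
{
    const unsigned char *p = data;

    while (len-- > 0)
    {
        h ^= *p++;
        h *= 1099511628211ULL;
    }
    return h;
}

/*
 * Operators contribute their text to the hash. Constants and parameters
 * contribute only a common tag plus their type, so "a = $1" and "a = 42"
 * get the same queryId when the types agree. Each such node is recorded
 * so the caller can later match constants to parameters by position.
 */
static uint64_t
queryid_with_slots(const Node *nodes, size_t n, ValueSlots *out)
{
    uint64_t    h = 14695981039346656037ULL;

    out->count = 0;
    for (size_t i = 0; i < n; i++)
    {
        if (nodes[i].kind == NODE_OP)
        {
            h = fnv1a(h, "O", 1);
            h = fnv1a(h, nodes[i].text, strlen(nodes[i].text));
        }
        else
        {
            h = fnv1a(h, "V", 1);
            h = fnv1a(h, &nodes[i].type_id, sizeof(nodes[i].type_id));
            assert(out->count < MAX_SLOTS);
            out->slots[out->count++] = &nodes[i];
        }
    }
    return h;
}
```

With both slot lists in hand, slot i of the constant query supplies the value for slot i of the parameterized plan, provided the type_id fields match.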
--
regards,
Andrey Lepikhov
Postgres Professional