On Wed, 2 Aug 2023 at 03:05, Andy Fan <zhihui.fan1...@gmail.com> wrote: > > Hi Matthias: > > On Wed, Aug 2, 2023 at 7:33 AM Andy Fan <zhihui.fan1...@gmail.com> wrote: >> >> >> >> On Tue, Aug 1, 2023 at 7:03 PM Matthias van de Meent >> <boekewurm+postg...@gmail.com> wrote: >>> >>> On Tue, 1 Aug 2023 at 06:39, Andy Fan <zhihui.fan1...@gmail.com> wrote: >>> > >>> > Hi: >>> > >>> > Currently if we want to extract a numeric field in jsonb, we need to use >>> > the following expression: cast (a->>'a' as numeric). It will turn a >>> > numeric >>> > to text first and then turn the text to numeric again. >>> >>> Why wouldn't you use cast(a->'a' as numeric), or ((a->'a')::numeric)? >> >> >> Thanks for this information! I didn't realize we have this function >> already at [1]. >> >> https://www.postgresql.org/docs/15/functions-json.html > > > Hi: > > I just found ((a->'a')::numeric) is not as effective as I expected. > > First in the above expression we used jsonb_object_field which > returns a jsonb (see JsonbValueToJsonb), and then we convert jsonb > to jsonbValue in jsonb_numeric (see JsonbExtractScalar). This > looks like a wastage.
Yes, it's not great, but that's just how this works. We can't pre-specialize all possible operations that one might want to do in PostgreSQL - that'd be absurdly expensive for binary and initial database sizes. > Secondly, because of the same reason above, we use PG_GETARG_JSONB_P(0), > which may detoast a value so we need to free it with PG_FREE_IF_COPY. > then this looks like another potential wastage. Is it? Detoasting only happens if the argument was toasted, and I have serious doubts that the result of (a->'a') will be toasted in our current system. Sure, we do need to allocate an intermediate result, but that's in a temporary memory context that should be trivially cheap to free. > /* > * v.val.numeric points into jsonb body, so we need to make a copy to > * return > */ > retValue = DatumGetNumericCopy(NumericGetDatum(v.val.numeric)); > > At last this method needs 1 extra FuncExpr than my method, this would > cost some expression execution effort. I'm not saying we need to avoid > expression execution generally, but extracting numeric fields from jsonb > looks a reasonable case. But we don't have special cases for the other jsonb types - the one that is available (text) is lossy and doesn't work reliably without making sure the field we're accessing is actually a string, and not any other type of value. > As a comparison, cast to other data types like > int2/int4 may be not needed since they are not binary compatible. Yet there are casts from jsonb to and back from int2, int4 and int8. I don't see a very good reason to add this, for the same reasons mentioned by Pavel. *If* we were to add this operator, I would want this patch to also include a #-variant for text[]-based deep access (c.q. #> / #>>), and equivalent operators for the json type to keep the current access operator parity. > Here is the performance comparison (with -O3, my previous post is -O0). > > select 1 from tb where (a->'a')::numeric = 2; 31ms. > select 1 from tb where (a@->'a') = 2; 15ms What's tb here? Kind regards, Matthias van de Meent Neon (https://neon.tech)