Re: Shared detoast Datum proposal

2024-05-23 Thread Andy Fan
Robert Haas writes: > On Wed, May 22, 2024 at 9:46 PM Andy Fan wrote: >> Please give me one more chance to explain this. What I mean is: >> >> Take SELECT f(a) FROM t1 join t2...; for example, >> >> When we read the Datum of a Var, we read it from tts->tts_values[*], no >> matter what kind of

Re: Shared detoast Datum proposal

2024-05-23 Thread Robert Haas
On Wed, May 22, 2024 at 9:46 PM Andy Fan wrote: > Please give me one more chance to explain this. What I mean is: > > Take SELECT f(a) FROM t1 join t2...; for example, > > When we read the Datum of a Var, we read it from tts->tts_values[*], no > matter what kind of TupleTableSlot is. So if we

Re: Shared detoast Datum proposal

2024-05-22 Thread Andy Fan
Nikita Malakhov writes: > Hi, > Andy, glad you've not lost interest in this work, I'm looking > forward to your improvements! Thanks for your words, I've adjusted to the rhythm of the community and welcome more feedback:) -- Best Regards Andy Fan

Re: Shared detoast Datum proposal

2024-05-22 Thread Andy Fan
Robert Haas writes: > On Tue, May 21, 2024 at 10:02 PM Andy Fan wrote: >> One more things I want to highlight it "syscache" is used for metadata >> and *detoast cache* is used for user data. user data is more >> likely bigger than metadata, so cache size control (hence eviction logic >> run

Re: Shared detoast Datum proposal

2024-05-22 Thread Robert Haas
On Tue, May 21, 2024 at 10:02 PM Andy Fan wrote: > One more things I want to highlight it "syscache" is used for metadata > and *detoast cache* is used for user data. user data is more > likely bigger than metadata, so cache size control (hence eviction logic > run more often) is more necessary

Re: Shared detoast Datum proposal

2024-05-22 Thread Nikita Malakhov
Hi, Robert, thank you very much for your response to this thread. I agree with most of things you've mentioned, but this proposal is a PoC, and anyway has a long way to go to be presented (if it ever would) as something to be committed. Andy, glad you've not lost interest in this work, I'm

Re: Shared detoast Datum proposal

2024-05-21 Thread Andy Fan
Andy Fan writes: > Hi Robert, > >> Andy Fan asked me off-list for some feedback about this proposal. I >> have hesitated to comment on it for lack of having studied the matter >> in any detail, but since I've been asked for my input, here goes: > > Thanks for doing this! Since we have two

Re: Shared detoast Datum proposal

2024-05-21 Thread Andy Fan
Hi Robert, > Andy Fan asked me off-list for some feedback about this proposal. I > have hesitated to comment on it for lack of having studied the matter > in any detail, but since I've been asked for my input, here goes: Thanks for doing this! Since we have two totally different designs

Re: Shared detoast Datum proposal

2024-05-20 Thread Robert Haas
Hi, Andy Fan asked me off-list for some feedback about this proposal. I have hesitated to comment on it for lack of having studied the matter in any detail, but since I've been asked for my input, here goes: I agree that there's a problem here, but I suspect this is not the right way to solve

Re: Shared detoast Datum proposal

2024-03-15 Thread Nikita Malakhov
Hi! Here's a slightly improved version of patch Tomas provided above (v2), with cache invalidations and slices caching added, still as PoC. The main issue I've encountered during tests is that when the same query retrieves both slices and full value - slices, like substring(t,...) the order of

Re: Shared detoast Datum proposal

2024-03-07 Thread Tomas Vondra
On 3/7/24 08:33, Nikita Malakhov wrote: > Hi! > > Tomas, I thought about this issue - >> What if you only need the first slice? In that case decoding everything >> will be a clear regression. > And completely agree with you, I'm currently working on it and will post > a patch a bit later. Cool.

Re: Shared detoast Datum proposal

2024-03-06 Thread Nikita Malakhov
Hi! Tomas, I thought about this issue - >What if you only need the first slice? In that case decoding everything >will be a clear regression. And completely agree with you, I'm currently working on it and will post a patch a bit later. Another issue we have here - if we have multiple slices

Re: Shared detoast Datum proposal

2024-03-06 Thread Tomas Vondra
On 3/5/24 10:03, Nikita Malakhov wrote: > Hi, > > Tomas, sorry for confusion, in my previous message I meant exactly > the same approach you've posted above, and came up with almost > the same implementation. > > Thank you very much for your attention to this thread! > > I've asked Andy

Re: Shared detoast Datum proposal

2024-03-06 Thread Tomas Vondra
On 3/6/24 07:09, Nikita Malakhov wrote: > Hi, > > In addition to the previous message - for the toast_fetch_datum_slice > the first (seems obvious) way is to detoast the whole value, save it > to cache and get slices from it on demand. I have another one on my > mind, but have to play with it

Re: Shared detoast Datum proposal

2024-03-05 Thread Nikita Malakhov
Hi, In addition to the previous message - for the toast_fetch_datum_slice the first (seems obvious) way is to detoast the whole value, save it to cache and get slices from it on demand. I have another one on my mind, but have to play with it first. -- Regards, Nikita Malakhov Postgres

Re: Shared detoast Datum proposal

2024-03-05 Thread Nikita Malakhov
Hi, Tomas, sorry for confusion, in my previous message I meant exactly the same approach you've posted above, and came up with almost the same implementation. Thank you very much for your attention to this thread! I've asked Andy about this approach because of the same reasons you mentioned -

Re: Shared detoast Datum proposal

2024-03-04 Thread Andy Fan
> >> 2. more likely to use up all the memory which is allowed. for example: >> if we set the limit to 1MB, then we kept more data which will be not >> used and then consuming all of the 1MB. >> >> My method is resolving this with some helps from other modules (kind of >> invasive) but can

Re: Shared detoast Datum proposal

2024-03-04 Thread Tomas Vondra
On 3/4/24 18:08, Andy Fan wrote: > ... >> >>> I assumed that releasing all of the memory at the end of executor once >>> is not an option since it may consumed too many memory. Then, when and >>> which entry to release becomes a trouble for me. For example: >>> >>> QUERY PLAN >>>

Re: Shared detoast Datum proposal

2024-03-04 Thread Andy Fan
> > I decided to whip up a PoC patch implementing this approach, to get a > better idea of how it compares to the original proposal, both in terms > of performance and code complexity. > > Attached are two patches that both add a simple cache in detoast.c, but > differ in when exactly the

Re: Shared detoast Datum proposal

2024-03-04 Thread Tomas Vondra
On 3/4/24 02:23, Andy Fan wrote: > > Tomas Vondra writes: > >> On 2/26/24 16:29, Andy Fan wrote: >>> >>> ...> >>> I can understand the benefits of the TOAST cache, but the following >>> issues looks not good to me IIUC: >>> >>> 1). we can't put the result to tts_values[*] since without the

Re: Shared detoast Datum proposal

2024-03-04 Thread Andy Fan
Tomas Vondra writes: >>> But I'm a bit surprised the patch needs to pass a >>> MemoryContext to so many places as a function argument - that seems to >>> go against how we work with memory contexts. Doubly so because it seems >>> to only ever pass CurrentMemoryContext anyway. So what's that

Re: Shared detoast Datum proposal

2024-03-04 Thread Tomas Vondra
On 3/4/24 02:29, Andy Fan wrote: > > Tomas Vondra writes: > >> On 3/3/24 07:10, Andy Fan wrote: >>> >>> Hi, >>> >>> Here is a updated version, the main changes are: >>> >>> 1. an shared_detoast_datum.org file which shows the latest desgin and >>> pending items during discussion. >>> 2. I

Re: Shared detoast Datum proposal

2024-03-03 Thread Andy Fan
Tomas Vondra writes: > On 3/3/24 07:10, Andy Fan wrote: >> >> Hi, >> >> Here is a updated version, the main changes are: >> >> 1. an shared_detoast_datum.org file which shows the latest desgin and >> pending items during discussion. >> 2. I removed the slot->pre_detoast_attrs totally. >>

Re: Shared detoast Datum proposal

2024-03-03 Thread Andy Fan
Tomas Vondra writes: > On 2/26/24 16:29, Andy Fan wrote: >> >> ...> >> I can understand the benefits of the TOAST cache, but the following >> issues looks not good to me IIUC: >> >> 1). we can't put the result to tts_values[*] since without the planner >> decision, we don't know if this will

Re: Shared detoast Datum proposal

2024-03-03 Thread Andy Fan
Tomas Vondra writes: > On 3/3/24 02:52, Andy Fan wrote: >> >> Hi Nikita, >> >>> >>> Have you considered another one - to alter pg_detoast_datum >>> (actually, it would be detoast_attr function) and save >>> detoasted datums in the detoast context derived >>> from the query context? >>> >>>

Re: Shared detoast Datum proposal

2024-03-03 Thread Tomas Vondra
On 3/3/24 07:10, Andy Fan wrote: > > Hi, > > Here is a updated version, the main changes are: > > 1. an shared_detoast_datum.org file which shows the latest desgin and > pending items during discussion. > 2. I removed the slot->pre_detoast_attrs totally. > 3. handle some pg_detoast_datum_slice

Re: Shared detoast Datum proposal

2024-03-03 Thread Tomas Vondra
On 2/26/24 16:29, Andy Fan wrote: > > ...> > I can understand the benefits of the TOAST cache, but the following > issues looks not good to me IIUC: > > 1). we can't put the result to tts_values[*] since without the planner > decision, we don't know if this will break small_tlist logic. But

Re: Shared detoast Datum proposal

2024-03-03 Thread Tomas Vondra
On 3/3/24 02:52, Andy Fan wrote: > > Hi Nikita, > >> >> Have you considered another one - to alter pg_detoast_datum >> (actually, it would be detoast_attr function) and save >> detoasted datums in the detoast context derived >> from the query context? >> >> We have just enough information at

Re: Shared detoast Datum proposal

2024-03-02 Thread Andy Fan
Hi, Here is a updated version, the main changes are: 1. an shared_detoast_datum.org file which shows the latest desgin and pending items during discussion. 2. I removed the slot->pre_detoast_attrs totally. 3. handle some pg_detoast_datum_slice use case. 4. Some implementation improvement.

Re: Shared detoast Datum proposal

2024-03-02 Thread Andy Fan
Hi Nikita, > > Have you considered another one - to alter pg_detoast_datum > (actually, it would be detoast_attr function) and save > detoasted datums in the detoast context derived > from the query context? > > We have just enough information at this step to identify > the datum - toast

Re: Shared detoast Datum proposal

2024-03-02 Thread Nikita Malakhov
Hi, Andy! Sorry for the delay, I have had long flights this week. I've reviewed the patch set, thank you for your efforts. I have several notes about patch set code, but first of I'm not sure the overall approach is the best for the task. As Tomas wrote above, the approach is very invasive and

Re: Shared detoast Datum proposal

2024-02-26 Thread Andy Fan
Nikita Malakhov writes: > Hi, > > Tomas, we already have a working jsonb partial detoast prototype, > and currently I'm porting it to the recent master. This is really awesome! Acutally when I talked to MySQL guys, they said MySQL already did this and I admit it can resolve some different

Re: Shared detoast Datum proposal

2024-02-26 Thread Andy Fan
> On 2/26/24 14:22, Andy Fan wrote: >> >>... >> >>> Also, toasted values >>> are not always being used immediately and as a whole, i.e. jsonb values are >>> fully >>> detoasted (we're working on this right now) to extract the smallest value >>> from >>> big json, and these values are not

Re: Shared detoast Datum proposal

2024-02-26 Thread Nikita Malakhov
Hi, Tomas, we already have a working jsonb partial detoast prototype, and currently I'm porting it to the recent master. Due to the size of the changes and very invasive nature it takes a lot of effort, but it is already done. I'm also trying to make the core patch less invasive. Actually, it is

Re: Shared detoast Datum proposal

2024-02-26 Thread Tomas Vondra
On 2/26/24 14:22, Andy Fan wrote: > >... > >> Also, toasted values >> are not always being used immediately and as a whole, i.e. jsonb values are >> fully >> detoasted (we're working on this right now) to extract the smallest value >> from >> big json, and these values are not worth keeping

Re: Shared detoast Datum proposal

2024-02-26 Thread Andy Fan
Hi, > When we store and detoast large values, say, 1Gb - that's a very likely > scenario, > we have such cases from prod systems - we would end up in using a lot of > shared > memory to keep these values alive, only to discard them later. First the feature can make sure if the original

Re: Shared detoast Datum proposal

2024-02-25 Thread Nikita Malakhov
Hi! I see this to be a very promising initiative, but some issues come into my mind. When we store and detoast large values, say, 1Gb - that's a very likely scenario, we have such cases from prod systems - we would end up in using a lot of shared memory to keep these values alive, only to discard

Re: Shared detoast Datum proposal

2024-02-25 Thread Andy Fan
Hi, >> >> Good idea. Either that (a separate README), or a comment in a header of >> some suitable .c/.h file (I prefer that, because that's kinda obvious >> when reading the code, I often not notice a README exists next to it). > > Great, I'd try this from tomorrow. I have made it.

Re: Shared detoast Datum proposal

2024-02-21 Thread Andy Fan
I see your reply when I started to write a more high level document. Thanks for the step by step help! Tomas Vondra writes: > On 2/20/24 19:38, Andy Fan wrote: >> >> ... >> >>> I think it should be done the other >>> way around, i.e. the patch should introduce the main feature first >>>

Re: Shared detoast Datum proposal

2024-02-21 Thread Tomas Vondra
On 2/20/24 19:38, Andy Fan wrote: > > ... > >> I think it should be done the other >> way around, i.e. the patch should introduce the main feature first >> (using the traditional Bitmapset), and then add Bitset on top of that. >> That way we could easily measure the impact and see if it's useful.

Re: Shared detoast Datum proposal

2024-02-20 Thread Andy Fan
Tomas Vondra writes: Hi Tomas, > > I took a quick look on this thread/patch today, so let me share a couple > initial thoughts. I may not have a particularly coherent/consistent > opinion on the patch or what would be a better way to do this yet, but > perhaps it'll start a discussion ...

Re: Shared detoast Datum proposal

2024-02-20 Thread Tomas Vondra
Hi, I took a quick look on this thread/patch today, so let me share a couple initial thoughts. I may not have a particularly coherent/consistent opinion on the patch or what would be a better way to do this yet, but perhaps it'll start a discussion ... The goal of the patch (as I understand it)

Re: Shared detoast Datum proposal

2024-02-19 Thread Andy Fan
Hi, I didn't another round of self-review. Comments, variable names, the order of function definition are improved so that it can be read as smooth as possible. so v6 attached. -- Best Regards Andy Fan >From f2e7772228e8a18027b9c29f10caba9c6570d934 Mon Sep 17 00:00:00 2001 From: "yizhi.fzh"

Re: Shared detoast Datum proposal

2024-01-23 Thread Andy Fan
Michael Zhilin writes: > Hi Andy, > > It looks like v5 is missing in your mail. Could you please check and resend > it? ha, yes.. v5 is really attached this time. commit eee0b2058f912d0d56282711c5d88bc0b1b75c2f (HEAD -> shared_detoast_value_v3) Author: yizhi.fzh Date: Tue Jan 23 13:38:34

Re: Shared detoast Datum proposal

2024-01-23 Thread Michael Zhilin
Hi Andy, It looks like v5 is missing in your mail. Could you please check and resend it? Thanks,  Michael. On 1/23/24 08:44, Andy Fan wrote: Hi, Peter Smith writes: 2024-01 Commitfest. Hi, This patch has a CF status of "Needs Review" [1], but it seems there were CFbot test failures

Re: Shared detoast Datum proposal

2024-01-22 Thread Andy Fan
Hi, Peter Smith writes: > 2024-01 Commitfest. > > Hi, This patch has a CF status of "Needs Review" [1], but it seems > there were CFbot test failures last time it was run [2]. Please have a > look and post an updated version if necessary. > > == > [1]

Re: Shared detoast Datum proposal

2024-01-21 Thread Peter Smith
2024-01 Commitfest. Hi, This patch has a CF status of "Needs Review" [1], but it seems there were CFbot test failures last time it was run [2]. Please have a look and post an updated version if necessary. == [1] https://commitfest.postgresql.org/46/4759/ [2]

Re: Shared detoast Datum proposal

2024-01-08 Thread Andy Fan
Andy Fan writes: >> >> One of the tests was aborted at CFBOT [1] with: >> [09:47:00.735] dumping /tmp/cores/postgres-11-28182.core for >> /tmp/cirrus-ci-build/build/tmp_install//usr/local/pgsql/bin/postgres >> [09:47:01.035] [New LWP 28182] > > There was a bug in JIT part, here is the fix.

Re: Shared detoast Datum proposal

2024-01-06 Thread Andy Fan
Hi, vignesh C writes: > On Mon, 1 Jan 2024 at 19:26, Andy Fan wrote: >> >> >> Andy Fan writes: >> >> > >> > Some Known issues: >> > -- >> > >> > 1. Currently only Scan & Join nodes are considered for this feature. >> > 2. JIT is not adapted for this purpose yet. >> >> JIT is

Re: Shared detoast Datum proposal

2024-01-06 Thread vignesh C
On Mon, 1 Jan 2024 at 19:26, Andy Fan wrote: > > > Andy Fan writes: > > > > > Some Known issues: > > -- > > > > 1. Currently only Scan & Join nodes are considered for this feature. > > 2. JIT is not adapted for this purpose yet. > > JIT is adapted for this feature in v2. Any

Re: Shared detoast Datum proposal

2024-01-01 Thread Andy Fan
Andy Fan writes: > > Some Known issues: > -- > > 1. Currently only Scan & Join nodes are considered for this feature. > 2. JIT is not adapted for this purpose yet. JIT is adapted for this feature in v2. Any feedback is welcome. -- Best Regards Andy Fan >From

Shared detoast Datum proposal

2023-12-27 Thread Andy Fan
Problem: Toast works well for its claimed purposes. However the current detoast infrastructure doesn't shared the detoast datum in the query's lifespan, which will cause some inefficiency like this: SELECT big_jsonb_col->'a', big_jsonb_col->'b', big_jsonb_col->'c' FROM t; In the