Re: [GENERAL] insert on conflict

armand pirvu Tue, 27 Jun 2017 13:45:12 -0700


> On Jun 27, 2017, at 3:30 PM, Peter Geoghegan <[email protected]> wrote:
> 
> On Tue, Jun 27, 2017 at 12:40 PM, armand pirvu <[email protected]> wrote:
>> so how is it working in fact ? Isn't it working like looping in the
>> IVEE.dim_company  and for each company_id if the record does have a
>> correspondent in csischema.dim_company then update csischema.dim_company set
>> company_name = EXCLUDED.company_name  where company_id=...  ? If so isn't it
>> supposed to use the PK for each company_id ? Or is it more like building a
>> whole list from IVEE.dim_company and treat like a join ? Just trying to
>> understand
> 
> The processing here has to happen a tuple at a time. That's because
> the index structure itself is the only authoritative source of truth
> about whether or not there is a possible conflict. An MVCC snapshot
> isn't good enough, because it sees a consistent view of things, not
> the true physical reality of what exists or does not exist in the
> index.
> 
> What you end up with here is a weird nested loop join, you might say.
> The implementation couldn't do it any other way (this could never
> behave more like a merge join), because we must eagerly check for
> conflicts right as we insert (our insert would be registered by
> *other* inserters/upserters as a conflict). If the implementation did
> ever do it that way, it would break the important UPSERT guarantees
> around concurrency.
> 
> MERGE does this in other systems, which is okay for those other
> systems because MERGE makes no special promises about concurrency
> (e.g., you can get a unique violation in the joined-on column with
> MERGE). But, MERGE would be faster for bulk loading, which is what
> MERGE is good for.
> 
> -- 
> Peter Geoghegan

Hi Peter


So for example if IVEE.dim_company has 10k rows and csischema.dim_company has 
40 rows, what will happen for each row in IVEE.dim_company we check 
csischema.dim_company and if the check tells row is in it switches to update 
and this would mean yes I scan IVEE.dim_company, however should an update be 
needed in csischema.dim_company it will use the csischema.dim_company PK since 
we pass one value gotten from IVEE.dim_company
The question I guess is what happens IF I IVEE.dim_company accounts for far 
more than 5% of csischema.dim_company ? Will that translate into a scan on 
csischema.dim_company ?

What I am looking at now looks like a potential racing contention which so I am 
wondering if there are better ways to do it


Thanks
Armand
Re: [GENERAL] insert on conflict

Reply via email to