> On Nov 9, 2020, at 6:41 PM, Bharath Rupireddy 
> <bharath.rupireddyforpostg...@gmail.com> wrote:
>
> On Tue, Nov 3, 2020 at 4:54 PM Bharath Rupireddy
> <bharath.rupireddyforpostg...@gmail.com> wrote:
>>
>> If the approach followed in the patch looks okay, I can work on a separate 
>> patch for multi inserts in refresh materialized view cases.
>>
>
> Hi, I'm attaching a v2 patch that has multi inserts for CTAS as well
> as REFRESH MATERIALiZED VIEW.
>
> I did some testing: exec time in seconds.
>
> Use case 1: 1 int and 1 text column. each row size 129 bytes, size of
> 1 text column 101 bytes, number of rows 100million, size of heap file
> 12.9GB.
> HEAD: 220.733, 220.428
> Patch: 151.923, 152.484
>
> Thoughts?
>
> With Regards,
> Bharath Rupireddy.
> EnterpriseDB: 
> https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.enterprisedb.com%2F&amp;data=04%7C01%7Cguopa%40vmware.com%7C2471a90558ce4bf0af5b08d8849c03bb%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C1%7C637405152899337347%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=QKeRMGQjOlOL%2FlQv%2BuEAb2ocLVq6zqXESKoNOaJ6YCo%3D&amp;reserved=0
> <v2-0001-Multi-Inserts-in-CTAS-Refresh-Materialized-View.patch>

Thanks for doing this. There might be another solution - use raw insert 
interfaces (i.e. raw_heap_insert()).
Attached is the test (not formal) patch that verifies this idea. 
raw_heap_insert() writes the page into the
table files directly and also write the FPI xlog when the tuples filled up the 
whole page. This seems be
more efficient.

In addition, those raw write interfaces call smgrimmedsync() when finishing raw 
inserting, this is because
the write bypasses the shared buffer so a CHECKPOINT plus crash might cause 
data corruption since
some FPI xlogs cannot be replayed and those table files are not fsync-ed during 
crash. It seems that a sync
request could be forwarded to the checkpointer for each table segment file and 
then we do not need to call
smgrimmedsync(). If the theory is correct this should be in a separate patch. 
Anyway I tested this idea
also by simply commenting out the smgrimmedsync() call in heap_raw_insert_end() 
(a new function in
the attached patch) since forwarding fsync request is light-weight.

I did a quick and simple testing. The test environment is a centos6 vm with 7G 
memory on my Mac laptop.
-O3 gcc compiler option; shared_buffers as 2GB. Did not check if parallel 
scanning is triggered by the test
query and the data volume is not large so test time is not long.

Here are the test script.
 create table t1 (a int, b int, c int, d int);
  insert into t1 select i,i,i,i from generate_series(1,10000000) i;
  show shared_buffers;
  \timing on
  create table t2 as select * from t1;
  \timing off

Here are the results:

HEAD (37d2ff380312):
        Time: 5143.041 ms (00:05.143)
Multi insert patch:
        Time: 4456.461 ms (00:04.456)
Raw insert (attached):
        Time: 2317.453 ms (00:02.317)
Raw insert + no smgrimmedsync():
        Time: 2103.610 ms (00:02.104).

From the above data raw insert is better; also forwarding sync should be able 
to improve further
(Note my laptop is with SSD so on machine with SATA/SAS, I believe forwarding 
sync should
be able to help more.)

I tested removing smgrimmedsync in "vacuum full” code that uses raw insert 
also. FYI.
HEAD:
  Time: 3567.036 ms (00:03.567)
no smgrimmedsync:
  Time: 3023.487 ms (00:03.023)


Raw insert could be used on CTAS & Create MatView. For Refresh MatView the code 
is a bit
different. I did not spend more time on this so not sure raw insert could be 
used for that.

But I think the previous multi insert work could be still used in at least 
"INSERT tbl SELECT…” (if the INSERT
is a simple one, e.g. no trigger, no index, etc).


Regards,
Paul

Attachment: v1-0001-ctas-using-raw-insert.patch
Description: v1-0001-ctas-using-raw-insert.patch

Reply via email to