Re: [External] Multiple COPY on the same table
1. The tables have no indexes at the time of load.
2. The CREATE TABLE and COPY are in the same transaction.

So I guess that's pretty much it. I understand the long load time, as some of the tables have 400+ million rows. Also, the environment is a container, and since this is currently a POC system, not much time has been invested in fine-tuning it.

Thanks, all.
Re: [External] Multiple COPY on the same table
Maybe he just has a large file that needs to be loaded into a table...

On 08/20/2018 11:47 AM, Vijaykumar Jain wrote:
> Hey Ravi,
>
> What is the goal you are trying to achieve here?
> To make pgdump/restore faster?
> To make replication faster?
> To make backup faster?
>
> Also, no matter how small you split the files, if the network is your bottleneck then I am not sure you can attain n times the benefit by simply sending the files in parallel, but yeah, maybe some benefit.
>
> But then for parallel processing you also need to ensure your server has the relevant resources, or else it will just be a lot of context switching, I guess?
>
> pg_dump has an option to dump in parallel. pg_basebackup is single-threaded, I read, but pgBackRest can allow better parallel processing in backups.
> There is also logical replication, where you can selectively replicate your tables to avoid bandwidth issues.
>
> I might have said a lot and nothing may be relevant, but you need to let us know the goal you want to achieve :)
>
> Regards,
> Vijay
>
> From: Ravi Krishna
> Sent: Monday, August 20, 2018 8:24:35 PM
> To: pgsql-general@lists.postgresql.org
> Subject: [External] Multiple COPY on the same table
>
> Can I split a large file into multiple files and then run COPY using each file? The table does not contain any serial or sequence column which may need serialization. Let us say I split a large file into 4 files. Will the performance improve by close to 4x?
>
> PS: Please ignore my previous post, which was sent without a subject by mistake.

--
Angular momentum makes the world go 'round.
Re: [External] Multiple COPY on the same table
I guess this should help you, Ravi.

https://www.postgresql.org/docs/10/static/populate.html

On 8/20/18, 10:30 PM, "Christopher Browne" wrote:

> On Mon, 20 Aug 2018 at 12:53, Ravi Krishna wrote:
> >
> > > What is the goal you are trying to achieve here.
> > > To make pgdump/restore faster?
> > > To make replication faster?
> > > To make backup faster ?
> >
> > None of the above.
> >
> > We got csv files from external vendor which are 880GB in total size, in 44 files. Some of the large tables had COPY running for several hours. I was just thinking of a faster way to load.
>
> Seems like #4...
>
> #4 - To Make Recovery faster
>
> Using COPY pretty much *is* the "faster way to load"...
>
> The main thing you should consider doing to make it faster is to drop indexes and foreign keys from the tables, and recreate them afterwards.
>
> --
> When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
Re: [External] Multiple COPY on the same table
On Mon, 20 Aug 2018 at 12:53, Ravi Krishna wrote:
>
> > What is the goal you are trying to achieve here.
> > To make pgdump/restore faster?
> > To make replication faster?
> > To make backup faster ?
>
> None of the above.
>
> We got csv files from external vendor which are 880GB in total size, in 44
> files. Some of the large tables had COPY running for several hours. I was
> just thinking of a faster way to load.

Seems like #4...

#4 - To Make Recovery faster

Using COPY pretty much *is* the "faster way to load"...

The main thing you should consider doing to make it faster is to drop indexes and foreign keys from the tables, and recreate them afterwards.

--
When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
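
The ordering suggested above (drop indexes and foreign keys, load, then recreate them) can be sketched as a small Python helper that only builds the SQL statements in the right order. This is an illustration under assumptions, not a tested load script: the function name `load_plan`, the table name, the file path, and the index and constraint definitions are all hypothetical, and the caller is assumed to supply the exact definitions needed to recreate each object. Indexes that back a primary key or unique constraint cannot be removed with DROP INDEX and would need to go through the constraint list instead.

```python
def load_plan(table, csv_path, indexes, foreign_keys=None):
    """Return SQL statements for a bulk load, in execution order.

    indexes:      {index_name: full CREATE INDEX statement}
    foreign_keys: {constraint_name: definition, e.g.
                   "FOREIGN KEY (a_id) REFERENCES a (id)"}
    All names and definitions are supplied by the caller (hypothetical here).
    """
    foreign_keys = foreign_keys or {}
    plan = []

    # 1. Remove foreign keys and indexes so COPY does not maintain them row by row.
    for name in foreign_keys:
        plan.append(f"ALTER TABLE {table} DROP CONSTRAINT IF EXISTS {name};")
    for name in indexes:
        plan.append(f"DROP INDEX IF EXISTS {name};")

    # 2. Load the data. COPY ... FROM 'file' reads a path on the server.
    plan.append(f"COPY {table} FROM '{csv_path}' WITH (FORMAT csv);")

    # 3. Recreate indexes first, then foreign keys.
    plan.extend(indexes.values())
    for name, definition in foreign_keys.items():
        plan.append(f"ALTER TABLE {table} ADD CONSTRAINT {name} {definition};")

    return plan


if __name__ == "__main__":
    for stmt in load_plan(
        "big_table",
        "/data/big_table.csv",
        {"big_table_id_idx": "CREATE INDEX big_table_id_idx ON big_table (id);"},
        {"big_table_a_fk": "FOREIGN KEY (a_id) REFERENCES a (id)"},
    ):
        print(stmt)
```

Building the index once over the loaded data is generally cheaper than updating it for every inserted row, which is the reasoning behind the advice.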
Re: [External] Multiple COPY on the same table
> What is the goal you are trying to achieve here.
> To make pgdump/restore faster?
> To make replication faster?
> To make backup faster ?

None of the above.

We got csv files from external vendor which are 880GB in total size, in 44 files. Some of the large tables had COPY running for several hours. I was just thinking of a faster way to load.
Re: [External] Multiple COPY on the same table
Hey Ravi,

What is the goal you are trying to achieve here?
To make pgdump/restore faster?
To make replication faster?
To make backup faster?

Also, no matter how small you split the files, if the network is your bottleneck then I am not sure you can attain n times the benefit by simply sending the files in parallel, but yeah, maybe some benefit.

But then for parallel processing you also need to ensure your server has the relevant resources, or else it will just be a lot of context switching, I guess?

pg_dump has an option to dump in parallel. pg_basebackup is single-threaded, I read, but pgBackRest can allow better parallel processing in backups.
There is also logical replication, where you can selectively replicate your tables to avoid bandwidth issues.

I might have said a lot and nothing may be relevant, but you need to let us know the goal you want to achieve :)

Regards,
Vijay

From: Ravi Krishna
Sent: Monday, August 20, 2018 8:24:35 PM
To: pgsql-general@lists.postgresql.org
Subject: [External] Multiple COPY on the same table

Can I split a large file into multiple files and then run COPY using each file? The table does not contain any serial or sequence column which may need serialization. Let us say I split a large file into 4 files. Will the performance improve by close to 4x?

PS: Please ignore my previous post, which was sent without a subject by mistake.
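
For reference, the split-then-load idea in the original question could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a benchmarked recipe: the helper names `split_lines` and `copy_commands` are hypothetical, the input is assumed to have no header row and no quoted fields containing embedded newlines (splitting on line boundaries would break such rows), and the generated psql `\copy` commands are meant to be run one per session, for example from separate shells. Whether this gets anywhere near 4x depends on where the bottleneck is (disk, network, CPU), as discussed in the thread.

```python
from pathlib import Path


def split_lines(src: Path, parts: int, out_dir: Path) -> list[Path]:
    """Distribute the lines of src round-robin across `parts` output files.

    Streams the input, so memory use stays small even for very large files.
    Assumes one record per line (no embedded newlines, no header row).
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = [out_dir / f"{src.stem}.part{i}{src.suffix}" for i in range(parts)]
    # newline="" keeps the original line endings untouched on read and write.
    handles = [p.open("w", newline="") for p in outputs]
    try:
        with src.open("r", newline="") as f:
            for n, line in enumerate(f):
                handles[n % parts].write(line)
    finally:
        for h in handles:
            h.close()
    return outputs


def copy_commands(table: str, files: list[Path]) -> list[str]:
    """Build one psql \\copy meta-command per chunk (table name is illustrative)."""
    return [f"\\copy {table} FROM '{p}' WITH (FORMAT csv)" for p in files]
```

A possible usage: call `split_lines(Path("big.csv"), 4, Path("chunks"))`, then pass each string from `copy_commands("big_table", ...)` to its own `psql -c` invocation so the four loads run concurrently against the same table.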