Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
On 07.11.2013 12:42, Dilip kumar wrote:
> This patch implements the following TODO item:
>
> Allow parallel cores to be used by vacuumdb
> http://www.postgresql.org/message-id/4f10a728.7090...@agliodbs.com [1]
>
> Like parallel pg_dump, vacuumdb is provided with the option to run the vacuum of multiple tables in parallel. [ VACUUMDB -J ]
>
> 1. One new option is provided with vacuumdb to give the number of workers.
>
> 2. All workers will be started at the beginning and will wait for vacuum instructions from the master.
>
> 3. If a table list is provided to the vacuumdb command using -t, it will send the vacuum of one table to one IDLE worker, the next table to the next IDLE worker, and so on.
>
> 4. If vacuum is requested for a whole DB, it will execute a select on pg_class to get the table list, fetch the table names one by one, and likewise assign the vacuum work to IDLE workers.
>
> [...]

For this use case, would it make sense to queue the work (tables) in order of their size, starting with the largest one? Where tables vary in size, this reduces the overall processing time, because it prevents large (read: long processing time) tables from being processed in the last step. Processing the large tables first and filling the "processing slots/jobs" with smaller tables as they become free saves overall execution time; see the sketch below.

Regards
Jan

--
professional: http://www.oscar-consult.de

Links:
--
[1] http://www.postgresql.org/message-id/4f10a728.7090...@agliodbs.com
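Here is a minimal sketch of the "largest table first" queueing idea, written as a standalone Python script rather than as the patch itself. It assumes psycopg2 is available; the DSN and worker count are placeholders, and the pool of local processes simply stands in for vacuumdb's workers.

# Sketch only: hand out tables largest-first to whichever worker is idle.
from multiprocessing import Pool
import psycopg2

DSN = "dbname=mydb"   # placeholder connection string
N_WORKERS = 4         # would correspond to vacuumdb -j 4

def list_tables_largest_first(dsn):
    """Return user tables ordered by total size, descending."""
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute("""
            SELECT c.oid::regclass::text
            FROM pg_class c
            JOIN pg_namespace n ON n.oid = c.relnamespace
            WHERE c.relkind = 'r'
              AND n.nspname NOT IN ('pg_catalog', 'information_schema')
            ORDER BY pg_total_relation_size(c.oid) DESC
        """)
        return [row[0] for row in cur.fetchall()]

def vacuum_one(table):
    # Each worker uses its own connection; VACUUM cannot run inside a
    # transaction block, so autocommit is required.
    conn = psycopg2.connect(DSN)
    conn.autocommit = True
    with conn.cursor() as cur:
        cur.execute("VACUUM ANALYZE " + table)  # regclass::text is already quoted
    conn.close()

if __name__ == "__main__":
    tables = list_tables_largest_first(DSN)
    with Pool(N_WORKERS) as pool:
        # imap_unordered hands the next (largest remaining) table to
        # whichever worker becomes idle first -- the queueing suggested above.
        list(pool.imap_unordered(vacuum_one, tables))

This is only meant to illustrate the scheduling order; the actual patch dispatches work from the master over libpq connections rather than via local subprocesses.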
Re: [HACKERS] Changing pg_dump default file format
On 07.11.2013 19:08, Joshua D. Drake wrote:
> On 11/07/2013 10:00 AM, Josh Berkus wrote:
>> If we wanted to change the defaults, I think it would be easier to
>> create a separate bin name (e.g. pg_backup) than to change the existing
>> parameters for pg_dump.
>
> I am not opposed to that. Allow pg_dump to be what it is, and create a
> pg_backup?
>
> JD

I would definitely agree to having "one" backup utility and making -Fc the default for SQL dumps. One could even argue whether the functionality of pg_basebackup should be part of that too. But I would be fine with having two distinct utilities (one for file-level backups and one for logical/SQL-level backups), too.

Btw, how hard would it be to have pg_restore, and now also pg_dump, run with the -j option and order the work by size of, e.g., the tables? E.g. if you run with -j4 it would make sense to start working on the largest tables (and their indexes) first and continue in descending size order, to keep all available "slots" filled as well as possible; a toy illustration follows below.

Just a thought.

Jan
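To make the "keep all slots filled" argument concrete, here is a toy simulation (not pg_restore's actual scheduler) of greedy largest-first dispatch versus the reverse order. The sizes are made-up numbers standing in for pg_total_relation_size().

# Sketch: wall-clock time of -j style scheduling for a given table order.
import heapq

def makespan(order, jobs):
    """Total wall time if tables are handed out in the given order."""
    slots = [0.0] * jobs                  # time at which each job slot frees up
    heapq.heapify(slots)
    for size in order:
        # next table goes to the earliest-free slot
        heapq.heappush(slots, heapq.heappop(slots) + size)
    return max(slots)

sizes = [90, 10, 10, 10, 10, 10, 10]      # one big table, several small ones
print(makespan(sorted(sizes, reverse=True), jobs=4))  # largest first -> 90.0
print(makespan(sorted(sizes), jobs=4))                # largest last  -> 100.0

With four slots, starting the big table last leaves three slots idle while it finishes; starting it first hides its cost behind the small tables.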
Re: [HACKERS] swapcache-style cache?
On 23.02.2012 21:57, Greg Smith wrote:
> On 02/22/2012 05:31 PM, james wrote:
>> Has anyone considered managing a system like the DragonFly swapcache for a DBMS like PostgreSQL? ie where the admin can assign drives with good random read behaviour (but perhaps also-ran random write) such as SSDs to provide a cache for blocks that were dirtied, with async write that hopefully writes them out before they are forcibly discarded.
>
> We know that battery-backed write caches are extremely effective for PostgreSQL writes. I see most of these tiered storage ideas as acting like a big one of those, which seems to hold in things like SAN storage that have adopted this sort of technique already. A SSD is quite large relative to a typical BBWC.
> [...]
> Ultimately all this data needs to make it out to real disk. The funny thing about caches is that no matter how big they are, you can easily fill them up if doing something faster than the underlying storage can handle.
> [...]
> I don't think the idea of a swapcache is without merit; there's surely some applications that will benefit from it. It's got a lot of potential as a way to absorb short-term bursts of write activity. And there are some applications that could benefit from having a second tier of read cache, not as fast as RAM but larger and faster than real disk seeks. In all of those potential win cases, though, I don't see why the OS couldn't just manage the whole thing for us.

First off, thanks very much for mentioning DragonFly's swapcache on this mailing list, which takes the burden off me/us of having to self-advertise this feature :)

But swapcache is clearly not meant or designed to speed up write activity by caching writes and delaying the write to the "target storage" to a later point in time. Swapcache does not affect writes in any way, actually. Swapcache does its writing when a clean VM page hits the inactive VM page queue. VM pages related to filesystem writes are dirty; the write occurs normally, and then they become clean. But they still have to cycle into the inactive VM page queue before swapcache will touch them (write them out to swap).

So, basically, it is designed to speed up metadata reads and, if configured to do so, data reads. It can take some read-load burden off the disk subsystem and free it up for more write activity, but that would be just a side effect, not a design goal.

And yes, it does affect pgsql performance on read loads seriously. See BSD Mag 5/2011
http://bsdmag.org/magazine/1691-embedded-bsd-freebsd-alix
and
http://www.shiningsilence.com/dbsdlog/2011/04/12/7586.html

Jan
Re: [HACKERS] 9.3 feature proposal: vacuumdb -j #
On 13.01.2012 22:50, Josh Berkus wrote:
> It occurs to me that I would find it quite personally useful if the vacuumdb utility was multiprocess capable.
>
> For example, just today I needed to manually analyze a database with over 500 tables, on a server with 24 cores. And I needed to know when the analyze was done, because it was part of a downtime. I had to resort to a python script.
>
> I'm picturing doing this in the simplest way possible: get the list of tables and indexes, divide them by the number of processes, and give each child process its own list.
>
> Any reason not to hack on this for 9.3?

I don't see any reason not to do it, but plenty of reasons to do it. Right now I have systems hosting many databases that I need to vacuum full from time to time. I have wrapped vacuumdb with a shell script to actually use all the capacity that is available (a rough sketch of that wrapper approach follows below). A plain vacuumdb -faz just isn't that useful on large machines anymore.

Jan
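For what it's worth, a rough sketch of that "divide the table list and give each child its own list" wrapper, here in Python instead of shell. The database name, worker count, and table list are placeholders; each chunk is handed to a plain (pre-9.3) vacuumdb invocation using only existing flags (-z, -d, -t).

# Sketch only: split the tables into one chunk per process and run one
# vacuumdb per chunk in parallel.
import subprocess
from multiprocessing import Pool

DBNAME = "mydb"   # placeholder database name
N_PROCS = 24      # e.g. one worker per core

def vacuum_chunk(tables):
    """Run one vacuumdb process over its own list of tables."""
    cmd = ["vacuumdb", "-z", "-d", DBNAME]
    for t in tables:
        cmd += ["-t", t]
    subprocess.check_call(cmd)

if __name__ == "__main__":
    # The table list would come from pg_class (or pg_tables) in practice.
    tables = ["t1", "t2", "t3"]                       # placeholder list
    chunks = [tables[i::N_PROCS] for i in range(N_PROCS)]
    chunks = [c for c in chunks if c]                 # drop empty chunks
    with Pool(len(chunks)) as pool:
        pool.map(vacuum_chunk, chunks)

The static split is the simplest possible scheme; dispatching tables one at a time to idle workers (as in the later vacuumdb -j work) balances uneven table sizes better.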
Re: [HACKERS] psql expanded auto
I have not tried the patch (yet), but Informix's dbaccess would do about the same - and it's something I really missed.

Jan

--
This message was sent from my Android mobile phone with K-9 Mail.