Re: blending fast and temp space volumes
On Wed, Feb 21, 2018 at 5:09 PM, Peter Geoghegan <p...@bowt.ie> wrote:
> On Wed, Feb 21, 2018 at 12:07 PM, Claudio Freire <klaussfre...@gmail.com> wrote:
>> On Wed, Feb 21, 2018 at 4:50 PM, Peter Geoghegan <p...@bowt.ie> wrote:
>>> On Wed, Feb 21, 2018 at 7:53 AM, Rick Otten <rottenwindf...@gmail.com> wrote:
>>>> side note: The disadvantage of local SSD is that it won't survive "hitting
>>>> the virtual power button" on an instance, nor can it migrate automatically
>>>> to other hardware. (We have to hit the power button to add memory/cpu to
>>>> the system, and sometimes the power button might get hit by accident.) This
>>>> is OK for temp space. I never have my database come up automatically on
>>>> boot, and I have scripted the entire setup of the temp space volume and data
>>>> structures. I can run that script before starting the database. I've done
>>>> some tests and it seems to work great. I don't mind rolling back any
>>>> transaction that might be in play during a power failure.
>>>
>>> It sounds like you're treating a temp_tablespaces tablespace as
>>> ephemeral, which IIRC can have problems that an ephemeral
>>> stats_temp_directory does not have.
>>
>> For instance?
>>
>> I've been doing that for years without issue. If you're careful to
>> restore the skeleton directory structure at server boot up, I haven't
>> had any issues.
>
> Then you clearly know what I mean already. That's not documented as
> either required or safe anywhere.

Ah, OK. But the OP did mention he was doing that already, so it should be safe.
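The "restore the skeleton directory structure" step mentioned above could look roughly like the sketch below, run from the boot script before starting the server. The mount point and the PG_* version directory name are assumptions; they must match your tablespace location and the server's catalog version (check what the directory was called while the tablespace was healthy).

```shell
# Sketch: recreate the skeleton of an ephemeral temp tablespace before
# starting PostgreSQL. TEMP_TS_DIR and PG_VERSION_DIR are hypothetical;
# adjust them to your local-SSD mount point and server version.
TEMP_TS_DIR=/tmp/ephemeral_ssd/pgtemp
PG_VERSION_DIR=PG_9.6_201608131

mkdir -p "$TEMP_TS_DIR/$PG_VERSION_DIR"
chmod 700 "$TEMP_TS_DIR"
# chown postgres:postgres "$TEMP_TS_DIR" "$TEMP_TS_DIR/$PG_VERSION_DIR"  # if run as root

ls -ld "$TEMP_TS_DIR/$PG_VERSION_DIR"
```

After this, `pg_ctl start` can proceed and the tablespace OID registered in the catalog will resolve to a valid (empty) directory.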
Re: blending fast and temp space volumes
On Wed, Feb 21, 2018 at 4:50 PM, Peter Geoghegan wrote:
> On Wed, Feb 21, 2018 at 7:53 AM, Rick Otten wrote:
>> side note: The disadvantage of local SSD is that it won't survive "hitting
>> the virtual power button" on an instance, nor can it migrate automatically
>> to other hardware. (We have to hit the power button to add memory/cpu to
>> the system, and sometimes the power button might get hit by accident.) This
>> is OK for temp space. I never have my database come up automatically on
>> boot, and I have scripted the entire setup of the temp space volume and data
>> structures. I can run that script before starting the database. I've done
>> some tests and it seems to work great. I don't mind rolling back any
>> transaction that might be in play during a power failure.
>
> It sounds like you're treating a temp_tablespaces tablespace as
> ephemeral, which IIRC can have problems that an ephemeral
> stats_temp_directory does not have.

For instance?

I've been doing that for years without issue. If you're careful to
restore the skeleton directory structure at server boot up, I haven't
had any issues.

On Wed, Feb 21, 2018 at 4:22 PM, Craig James wrote:
> On Wed, Feb 21, 2018 at 7:53 AM, Rick Otten wrote:
>> I was wondering if anyone had ideas for how to make that possible.
>> I don't think I want to add the SAN disk to the same LVM volume group as the
>> local disk, but maybe that would work, since I'm already building it with a
>> script anyhow ... Is LVM smart enough to optimize radically different disk
>> performances?
>
> Couldn't you configure both devices into a single 6T device via RAID0 using
> md?

That would probably perform as slow as the slowest disk.
Re: effective_io_concurrency on EBS/gp2
On Mon, Feb 5, 2018 at 8:26 AM, Vitaliy Garnashevich wrote:
>> I mean, that the issue is indeed affected by the order of rows in the
>> table. Random heap access patterns result in sparse bitmap heap scans,
>> whereas less random heap access patterns result in denser bitmap heap
>> scans. Dense scans have large portions of contiguous fetches, a
>> pattern that is quite adversely affected by the current prefetch
>> mechanism in Linux.
>
> Thanks for your input.
>
> How can I test a sparse bitmap scan? Can you think of any SQL commands which
> would generate data and run such scans?
>
> Would a bitmap scan over expression index ((aid%1000)=0) do a sparse bitmap
> scan?

If you have a minimally correlated index (ie: totally random order), and you have N tuples per page, you need to select less (much less) than 1/Nth of the table for the bitmap heap scan to be sparse.
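The 1/Nth rule above can be made concrete with a little arithmetic: if a fraction s of the tuples is selected uniformly at random and each heap page holds N tuples, the expected fraction of pages containing at least one match is 1 - (1 - s)^N. The tuples-per-page figure below is an illustrative assumption, not a number from the thread.

```python
# Back-of-the-envelope check of the "select much less than 1/Nth" rule:
# fraction of heap pages a bitmap heap scan must fetch when a uniformly
# random fraction `s` of tuples is selected and each page holds `n` tuples.

def pages_touched_fraction(s: float, n: int) -> float:
    """Probability that a given page contains at least one selected tuple."""
    return 1.0 - (1.0 - s) ** n

n = 100  # tuples per page (assumed for illustration)
for s in (0.00001, 0.0001, 0.001, 0.01, 0.1):
    print(f"selectivity {s:>7}: {pages_touched_fraction(s, n):.1%} of pages touched")
```

At s = 1/N (1% here) roughly two-thirds of pages already contain a match, so the scan is effectively dense; the scan only becomes sparse when s is well below 1/N.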
Re: effective_io_concurrency on EBS/gp2
On Sat, Feb 3, 2018 at 8:05 PM, Vitaliy Garnashevich wrote:
> Looks like this behavior is not caused by, and does not depend on:
> - variable performance in the cloud
> - order of rows in the table
> - whether the disk is EBS (backed by SSD or HDD), or ordinary SSD
> - kernel version
>
> Does this mean that the default setting for eic on Linux is just inadequate
> for how the modern kernels behave? Or am I missing something else in the
> tests?
>
> Regards,
> Vitaliy

I have analyzed this issue quite extensively in the past, and I can say with high confidence that your analysis on point 2 is most likely wrong. I don't have all the information to make that a categorical assertion, and you might have a point, but I believe you're misinterpreting the data.

I mean that the issue is indeed affected by the order of rows in the table. Random heap access patterns result in sparse bitmap heap scans, whereas less random heap access patterns result in denser bitmap heap scans. Dense scans have large portions of contiguous fetches, a pattern that is quite adversely affected by the current prefetch mechanism in Linux.

This analysis does point to the fact that I should probably revisit this issue. There's a rather simple workaround for this: postgres should just avoid issuing prefetch requests for sequential block patterns, since those are already much better handled by the kernel itself.
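The proposed workaround could be sketched as the filter below. This is illustrative pseudologic, not actual PostgreSQL source: it only shows the idea of suppressing explicit prefetch requests for blocks that merely continue a sequential run, leaving those to the kernel's own read-ahead.

```python
# Sketch of the workaround: only issue explicit prefetch requests
# (posix_fadvise-style) for blocks that start a non-sequential jump;
# sequential continuations are left to kernel read-ahead.

def blocks_to_prefetch(block_numbers):
    """Return the subset of blocks worth prefetching explicitly."""
    out = []
    prev = None
    for blk in block_numbers:
        if prev is None or blk != prev + 1:
            out.append(blk)  # random jump: kernel read-ahead won't predict it
        # else: sequential continuation, the kernel handles it better
        prev = blk
    return out

print(blocks_to_prefetch([10, 11, 12, 40, 41, 90]))  # -> [10, 40, 90]
```

A dense (well-correlated) scan collapses to a handful of prefetch requests, while a sparse random scan still prefetches nearly every block, which matches the behavior described above.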
Re: effective_io_concurrency on EBS/gp2
On Wed, Jan 31, 2018 at 11:21 PM, hzzhangjiazhi wrote:
> HI
>
> I think this parameter will be useful when the storage is using RAID
> striping; otherwise, turning up this parameter is meaningless when there is
> only one device.

Not at all. Especially on EBS, where keeping a relatively full queue is necessary to get max throughput out of the drive.

Problem is, if you're scanning a highly correlated index, the mechanism is counterproductive. I had worked on some POC patches for correcting that; I guess I could work something out, but it's low priority for me. Especially since it's actually a kernel "bug" (or shortcoming) that could be fixed in the kernel rather than worked around by postgres.
Re: effective_io_concurrency on EBS/gp2
On Wed, Jan 31, 2018 at 1:57 PM, Vitaliy Garnashevich wrote:
> More tests:
>
> io1, 100 GB:
>
> effective_io_concurrency=0
> Execution time: 40333.626 ms
> effective_io_concurrency=1
> Execution time: 163840.500 ms

In my experience playing with prefetch, e_i_c > 0 interferes with kernel read-ahead. What you've got there would make sense if what postgres thinks will be random I/O ends up being sequential. With e_i_c = 0, the kernel will optimize the hell out of it, because it's a predictable pattern. But with e_i_c = 1, the kernel's optimization gets disabled but postgres isn't reading much ahead, so you get the worst possible case.
Re: 8.2 Autovacuum BUG ?
On Tue, Jan 30, 2018 at 10:55 AM, pavan95 wrote:
> Hello all,
>
> Will a sudden restart (stop/start) of a postgres database generate this
> huge WAL?

Shouldn't
Re: 8.2 Autovacuum BUG ?
On Wed, Jan 24, 2018 at 8:50 AM, pavan95 wrote:
> Hello all,
>
> One more interesting observation made by me.
>
> I have run the below query(s) on production:
>
> SELECT
>     relname,
>     age(relfrozenxid) as xid_age,
>     pg_size_pretty(pg_table_size(oid)) as table_size
> FROM pg_class
> WHERE relkind = 'r' and pg_table_size(oid) > 1073741824
> ORDER BY age(relfrozenxid) DESC;
>
>  relname  | xid_age | table_size
> ----------+---------+------------
>  hxx      | 7798262 | 3245 MB
>  hrx      | 7797554 | 4917 MB
>  irxx     | 7796771 | 2841 MB
>  hr_      | 7744262 | 4778 MB
>  reimbxxx | 6767712 | 1110 MB
>
> show autovacuum_freeze_max_age;
>  autovacuum_freeze_max_age
> ---------------------------
>  200000000
> (1 row)

You seem to be rather far from the freeze_max_age. Unless you're consuming txids at a very high rate, I don't think that's your problem.
Re: 8.2 Autovacuum BUG ?
On Wed, Jan 24, 2018 at 3:54 AM, pavan95 wrote:
> Hi Claudio,
>
> We didn't configure any replication to our production server. Which strace
> are you talking about?

This one: https://linux.die.net/man/1/strace

You can attach it to a process (assuming you have the necessary permissions) and it will report all the syscalls the process makes. That does slow down the process, though.

Then lsof ( https://linux.die.net/man/8/lsof ) can be used to map file descriptor numbers to file paths. You have to do that as soon as you read the output, because files get closed and file descriptors get reused, so it's better to have a script that directly reads from /proc/<pid>/fd or fdinfo, but that takes some programming.

It is nontrivial, but sometimes it's the only tool in your belt. You may want to try something else first, though.

> We did a keen observation that only at the 9th minute of the hour and the
> 39th minute of the hour the so-called archive logs are generated, even
> when nobody is connecting from the application (off the business hours).

Well, if you don't know what happens at those times (and only at those times), it's not that useful. Since you don't know for certain what is causing this, the first thing you have to do is ascertain that. Try increasing logging as much as you can, especially around those times, and see what turns on then and not at other times.

You can also monitor autovacuum processes in pg_stat_activity, so make sure you check that as well, as autovacuum will only log once it's done. You know autovacuum is running at those times; you have to check whether it isn't running when WAL isn't being generated, and whether autovacuum is vacuuming the same tables over and over or not.

Your earlier mails show autoanalyze runs, not vacuum. Those shouldn't cause so much WAL, but if analyze is running very often and you have lots of stats, then maybe.
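The "script that directly reads from /proc/pid/fd" can be as small as the loop below (Linux only). Inspecting our own shell via `$$` is just for illustration; you would substitute the PID of the postgres backend you're watching.

```shell
# Minimal sketch: map a process's file descriptors to paths by reading
# /proc directly, avoiding the close/reuse race that makes lsof output
# stale. Replace $$ with the PID of the backend under investigation.
pid=$$
for fd in /proc/"$pid"/fd/*; do
    printf '%s -> %s\n' "${fd##*/}" "$(readlink "$fd")"
done
```

Run it in a tight loop (or from a watch(1) invocation) while strace shows writes to an unknown fd number, and the mapping tells you which file is being written.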
You can also try pg_stat_statements: https://www.postgresql.org/docs/9.1/static/pgstatstatements.html

Again, concentrate on the differential: what happens at those times that doesn't at other times.

Another idea would be to check for freeze runs in autovacuum. Ie, what's described here: https://wiki.postgresql.org/wiki/VacuumHeadaches#FREEZE

There's a nice blog post with some queries to help you with that here: http://www.databasesoup.com/2012/09/freezing-your-tuples-off-part-1.html (and its continuation here: http://www.databasesoup.com/2012/10/freezing-your-tuples-off-part-2.html ).

I'm not saying you should tune those parameters; what you were showing was autoanalyze activity, not vacuum freeze. But you should check whether you need to anyway.
Re: Batch insert heavily affecting query performance.
On Wed, Dec 27, 2017 at 2:10 PM, Mike Sofen wrote:
> In my experience, that 77ms will stay quite constant even if your db grew to
> 1TB. Postgres IS amazing. BTW, for a db, you should always have
> provisioned IOPS or else your performance can vary wildly, since the SSDs
> are shared.
>
> Re Lambda: another team is working on a new web app using Lambda calls
> and they were also experiencing horrific performance, just like yours (2
> seconds per call). They discovered it was the Lambda connection/spin-up
> time causing the problem. They solved it by keeping several Lambdas
> “hot”, for an instant connection…solved the problem, the last I heard.
> Google for that topic, you’ll find solutions.

You should try to implement an internal connection pool in your lambda. Lambda functions are reused. You have no guarantees as to how long these processes will live, but they will live for more than one request. So if you keep a persistent connection in your lambda code, the first invocation may be slow, but further invocations will be fast.

Lambda will also try to batch several calls at once. In fact, you can usually configure batching in the event source to try to maximize this effect.

In my experience, your lambda will most probably be network-bound. Increase the lambda's memory allocation to get a bigger chunk of the available network bandwidth (why they decided to call that "memory", nobody will ever be able to tell).
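The "persistent connection in your lambda code" pattern boils down to caching the connection at module level, since module state survives across invocations of a warm Lambda container. In this sketch, `connect_to_db` is a hypothetical stand-in for a real driver call such as `psycopg2.connect(dsn)`; only the caching structure is the point.

```python
# Sketch of module-level connection reuse in an AWS Lambda handler.
# Module globals live for the lifetime of the Lambda container, so the
# expensive connection setup is paid only on a cold start.

_connection = None  # cached across warm invocations

def connect_to_db():
    # Hypothetical stand-in for e.g. psycopg2.connect(dsn).
    return object()

def get_connection():
    global _connection
    if _connection is None:      # cold start: pay the setup cost once
        _connection = connect_to_db()
    return _connection           # warm invocations reuse the connection

def handler(event, context):
    conn = get_connection()
    # ... run queries with conn ...
    return {"connection_id": id(conn)}
```

The first invocation after a cold start is slow; every later invocation served by the same container gets the already-open connection.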
Re: Need Help on wal_compression
On Tue, Jan 9, 2018 at 3:53 AM, Rambabu V wrote:
> Hi Team,
>
> Daily 4000 archive files are generated and these are occupying more
> space. We are trying to compress WAL files by using the wal_compression
> parameter, but we are not seeing any change in the WAL file count. Could you
> please help us with this?

That's very little information to go on. You'll probably want to inspect WAL record stats before and after enabling wal_compression to see whether it makes sense to do so.

Take a look at pg_xlogdump --stats

For example:

$ pg_xlogdump --stats -p /path/to/pg_xlog 00010002C36400F0 00010002C36400FA

Type                   N    (%)   Record size    (%)     FPI size    (%)   Combined size    (%)
----                   -    ---   -----------    ---     --------    ---   -------------    ---
XLOG                   0 (  0.00)           0 (  0.00)          0 (  0.00)            0 (  0.00)
Transaction           11 (  0.00)         352 (  0.00)          0 (  0.00)          352 (  0.00)
Storage                0 (  0.00)           0 (  0.00)          0 (  0.00)            0 (  0.00)
CLOG                   0 (  0.00)           0 (  0.00)          0 (  0.00)            0 (  0.00)
Database               0 (  0.00)           0 (  0.00)          0 (  0.00)            0 (  0.00)
Tablespace             0 (  0.00)           0 (  0.00)          0 (  0.00)            0 (  0.00)
MultiXact              4 (  0.00)         208 (  0.00)          0 (  0.00)          208 (  0.00)
RelMap                 0 (  0.00)           0 (  0.00)          0 (  0.00)            0 (  0.00)
Standby                2 (  0.00)         116 (  0.00)          0 (  0.00)          116 (  0.00)
Heap2               2504 (  0.18)       78468 (  0.20)    1385576 (  3.55)      1464044 (  1.89)
Heap              667619 ( 48.23)    19432159 ( 50.47)   28641357 ( 73.35)     48073516 ( 61.99)
Btree             712093 ( 51.45)    18643846 ( 48.42)    9021270 ( 23.10)     27665116 ( 35.67)
Hash                   0 (  0.00)           0 (  0.00)          0 (  0.00)            0 (  0.00)
Gin                    0 (  0.00)           0 (  0.00)          0 (  0.00)            0 (  0.00)
Gist                   0 (  0.00)           0 (  0.00)          0 (  0.00)            0 (  0.00)
Sequence            1918 (  0.14)      349076 (  0.91)          0 (  0.00)       349076 (  0.45)
SPGist                 0 (  0.00)           0 (  0.00)          0 (  0.00)            0 (  0.00)
BRIN                   0 (  0.00)           0 (  0.00)          0 (  0.00)            0 (  0.00)
CommitTs               0 (  0.00)           0 (  0.00)          0 (  0.00)            0 (  0.00)
ReplicationOrigin      0 (  0.00)           0 (  0.00)          0 (  0.00)            0 (  0.00)
Total            1384151             38504225 [49.65%]   39048203 [50.35%]     77552428 [100%]

That shows about 50% of the WAL volume is full-page images, and that's with compression already enabled. WAL compression will only help FPIs, so if you don't have a large volume of FPIs, or they don't compress well, you won't benefit much.