Re: [PERFORM] Performance Woes
Jeff Davis <[EMAIL PROTECTED]> writes:
> On Wed, 2007-05-09 at 17:29 -0700, Joshua D. Drake wrote:
>> Sounds to me like you just need to up the total amount of open files
>> allowed by the operating system.

> It looks more like the opposite, here's the docs for
> max_files_per_process:

I think Josh has got the right advice.  The manual is just saying that
you can reduce max_files_per_process to avoid the failure, but it's not
making any promises about the performance penalty for doing that.
Apparently Ralph's app needs a working set of between 800 and 1000 open
files to have reasonable performance.

> That is a lot of tables. Maybe a different OS will handle it better?
> Maybe there's some way that you can use fewer connections and then the
> OS could still handle it?

Also, it might be worth rethinking the database structure to reduce the
number of tables.  But for a quick fix, increasing the kernel limit
seems like the easiest answer.

			regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
Re: [PERFORM] Performance Woes
Ralph Mason wrote:
> I have several databases. They are each about 35gb in size and have about
> 10.5K relations (count from pg_stat_all_tables) in them. Pg_class is about
> 26k rows and the data directory contains about 70k files. These are busy
> machines; they run about 50 xactions per second (approx insert / update /
> delete about 500 rows per second).

Is it always the same trigger that is the problematic one?  Is it just
PL/pgSQL, or do you have something else?  Something that may be trying to
open additional files behind your back, for example?  PL/Perl with funky
operators or temp files?  Also, what PG version is this?

> So we decreased the max_files_per_process to 800. This took care of the
> error *BUT* about quadrupled the IO wait that is happening on the machine.
> It went from a peak of about 50% to peaks of over 200% (4 processor
> machines, 4 gigs ram, raid). The load on the machine remained constant.

The max_files_per_process setting controls how many actual file
descriptors each process is allowed to have.  Postgres internally uses
"virtual file descriptors" (VFDs), each of which normally has one kernel
file descriptor open.  However, if your transactions need to access lots
of files, the VFDs will close their kernel FDs so that other VFDs can
open theirs.  So it sounds like your transactions have more than 800
files open.  The extra IO wait could be caused by the additional system
calls to open and close those files as needed.  I would actually expect
it to cause extra "system" load (as opposed to "user") rather than IO,
but I'm not sure.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
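Alvaro's theory (transactions touching more than 800 files, with VFDs recycling kernel FDs underneath) can be checked directly. A minimal sketch, assuming Linux's /proc and that the backend process name is "postgres" (it may be "postmaster" on older installs):

```shell
# Count the kernel file descriptors a process currently holds (Linux).
count_fds() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l
}

# Demonstrate on the current shell:
count_fds "$$"

# For each postgres backend (assumes pgrep exists and the name matches):
for pid in $(pgrep -x postgres); do
    printf '%s: %s fds\n' "$pid" "$(count_fds "$pid")"
done
```

If busy backends sit pinned at the max_files_per_process cap, they are almost certainly cycling kernel FDs through the VFD cache, which matches the open/close churn Alvaro describes.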
Re: [PERFORM] Performance Woes
Just adding a bit of relevant information: We have the kernel file-max
setting set to 297834 (256 per 4MB of RAM). /proc/sys/fs/file-nr tells us
that we have roughly 13000 allocated handles, of which zero are always
free.

On 10/05/07, Jeff Davis <[EMAIL PROTECTED]> wrote:
> On Wed, 2007-05-09 at 17:29 -0700, Joshua D. Drake wrote:
> > > 2007-05-09 03:07:50.083 GMT 1146975740: LOCATION: BasicOpenFile,
> > > fd.c:471
> > >
> > > 2007-05-09 03:07:50.091 GMT 0: LOG: 0: duration: 12.362 ms
> > >
> > > 2007-05-09 03:07:50.091 GMT 0: LOCATION: exec_simple_query,
> > > postgres.c:1090
> > >
> > > So we decreased the max_files_per_process to 800. This took care
> > > of the error *BUT* about quadrupled the IO wait that is happening
> > > on the machine. It went from a peak of about 50% to peaks of over
> > > 200% (4 processor machines, 4 gigs ram, raid). The load on the
> > > machine remained constant.
> >
> > Sounds to me like you just need to up the total amount of open files
> > allowed by the operating system.
>
> It looks more like the opposite; here's the docs for
> max_files_per_process:
>
> "Sets the maximum number of simultaneously open files allowed to each
> server subprocess. The default is one thousand files. If the kernel is
> enforcing a safe per-process limit, you don't need to worry about this
> setting. But on some platforms (notably, most BSD systems), the kernel
> will allow individual processes to open many more files than the system
> can really support when a large number of processes all try to open
> that many files. If you find yourself seeing "Too many open files"
> failures, try reducing this setting. This parameter can only be set at
> server start."
>
> To me, that means that his machine is allowing the new FD to be
> created, but then can't really support that many, so it gives an error.
>
> Ralph, how many connections do you have open at once? It seems like the
> machine perhaps just can't handle that many FDs in all of those
> processes at once.
>
> That is a lot of tables. Maybe a different OS will handle it better?
> Maybe there's some way that you can use fewer connections and then the
> OS could still handle it?
>
> Regards,
> 	Jeff Davis

-- 
Scott Mohekey
Systems Administrator
Telogis Intelligent Location Technologies
Re: [PERFORM] Performance Woes
> To me, that means that his machine is allowing the new FD to be created,
> but then can't really support that many so it gives an error.

file-max is 297834
ulimit is 100 (doesn't make sense, but there you go)

What I don't really understand is that with max_files_per_process at 800
we don't get the problem, but with 1000 we do.

$ lsof | wc -l
14944

$ cat /proc/sys/fs/file-nr
12240	0	297834

> Ralph, how many connections do you have open at once? It seems like the
> machine perhaps just can't handle that many FDs in all of those
> processes at once.

There are only 30 connections - of those, probably only 10 are really
active. It doesn't seem like we should be stressing this machine.

> That is a lot of tables. Maybe a different OS will handle it better?
> Maybe there's some way that you can use fewer connections and then the
> OS could still handle it?

We could use fewer connections, but then you can't maintain the db
because of the constant vacuuming needed :-(  I think the Linux folks
would get up in arms if you told them they couldn't handle that many
open files ;-)

Thanks,
Ralph
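For anyone following along, the three columns of /proc/sys/fs/file-nr above (12240, 0, 297834) are the number of allocated file handles, the number allocated but unused, and the fs.file-max ceiling. A quick sketch (Linux only) to label them:

```shell
# Split /proc/sys/fs/file-nr into its three fields:
# allocated handles, allocated-but-unused handles, and fs.file-max.
read -r allocated unused maximum < /proc/sys/fs/file-nr
echo "allocated=$allocated unused=$unused max=$maximum"

# System-wide headroom before the kernel starts returning ENFILE:
echo "headroom=$((maximum - allocated))"
```

So at 12240 of 297834 the system-wide table is nowhere near exhausted; the failure being hit here is the per-process limit (EMFILE), which is what ulimit -n and max_files_per_process govern.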
Re: [PERFORM] Performance Woes
On Wed, 2007-05-09 at 17:29 -0700, Joshua D. Drake wrote:
> > 2007-05-09 03:07:50.083 GMT 1146975740: LOCATION: BasicOpenFile,
> > fd.c:471
> >
> > 2007-05-09 03:07:50.091 GMT 0: LOG: 0: duration: 12.362 ms
> >
> > 2007-05-09 03:07:50.091 GMT 0: LOCATION: exec_simple_query,
> > postgres.c:1090
> >
> > So we decreased the max_files_per_process to 800. This took care
> > of the error *BUT* about quadrupled the IO wait that is happening
> > on the machine. It went from a peak of about 50% to peaks of over
> > 200% (4 processor machines, 4 gigs ram, raid). The load on the
> > machine remained constant.
>
> Sounds to me like you just need to up the total amount of open files
> allowed by the operating system.

It looks more like the opposite; here's the docs for
max_files_per_process:

"Sets the maximum number of simultaneously open files allowed to each
server subprocess. The default is one thousand files. If the kernel is
enforcing a safe per-process limit, you don't need to worry about this
setting. But on some platforms (notably, most BSD systems), the kernel
will allow individual processes to open many more files than the system
can really support when a large number of processes all try to open that
many files. If you find yourself seeing "Too many open files" failures,
try reducing this setting. This parameter can only be set at server
start."

To me, that means that his machine is allowing the new FD to be created,
but then can't really support that many, so it gives an error.

Ralph, how many connections do you have open at once? It seems like the
machine perhaps just can't handle that many FDs in all of those
processes at once.

That is a lot of tables. Maybe a different OS will handle it better?
Maybe there's some way that you can use fewer connections and then the
OS could still handle it?

Regards,
	Jeff Davis
Re: [PERFORM] Performance Woes
Hello,

You likely need to increase your file-max parameter using sysctl.conf.

Sincerely,

Joshua D. Drake

-- 
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
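A sketch of what that looks like (the value 297834 matches the figure reported elsewhere in the thread; size it to your own RAM):

```conf
# /etc/sysctl.conf (Linux): raise the system-wide open-file ceiling.
fs.file-max = 297834
```

Apply it without a reboot via `sysctl -p` as root, or one-off with `sysctl -w fs.file-max=297834`. Note that this is the system-wide cap only; the per-process cap (`ulimit -n`) is governed separately.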
Re: [PERFORM] Performance Woes
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of CAJ CAJ
Sent: 10 May 2007 12:26
To: Ralph Mason
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Performance Woes

> I have several databases. They are each about 35gb in size and have
> about 10.5K relations (count from pg_stat_all_tables) in them. Pg_class
> is about 26k rows and the data directory contains about 70k files.
> These are busy machines; they run about 50 xactions per second (approx
> insert / update / delete about 500 rows per second).
>
> We started getting errors about the number of open file descriptors:
>
> 2007-05-09 03:07:50.083 GMT 1146975740: LOG: 53000: out of file descriptors: Too many open files; release and retry
> 2007-05-09 03:07:50.083 GMT 1146975740: CONTEXT: SQL statement "insert ….. " PL/pgSQL function "trigfunc_whatever" line 50 at execute statement
> 2007-05-09 03:07:50.083 GMT 1146975740: LOCATION: BasicOpenFile, fd.c:471
> 2007-05-09 03:07:50.091 GMT 0: LOG: 0: duration: 12.362 ms
> 2007-05-09 03:07:50.091 GMT 0: LOCATION: exec_simple_query, postgres.c:1090
>
> So we decreased the max_files_per_process to 800. This took care of the
> error *BUT* about quadrupled the IO wait that is happening on the
> machine. It went from a peak of about 50% to peaks of over 200% (4
> processor machines, 4 gigs ram, raid). The load on the machine remained
> constant.

> What version of Pg/OS? What is your hardware config?
> I had seen these errors with earlier versions of Pg 7.4.x which were
> fixed in later releases according to the changelogs

"PostgreSQL 8.1.4 on x86_64-redhat-linux-gnu, compiled by GCC
x86_64-redhat-linux-gcc (GCC) 4.1.0 20060304 (Red Hat 4.1.0-3)"

su postgres -c 'ulimit -a'
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
max nice                        (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 49152
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 100
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
max rt priority                 (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 49152
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Seems like I should be able to use lots and lots of open files.

Machines are quad processor opterons, 4gb ram, with RAID 5 for data and
logging to a RAID 0.

Ralph
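That "open files (-n) 100" line looks like the real anomaly: the postgres user's per-process soft limit is far below max_files_per_process. A sketch of one common fix, assuming a PAM-based Linux where pam_limits is active (the values are illustrative, not tuned for this workload):

```conf
# /etc/security/limits.conf: raise the per-user open-file (nofile)
# limits for the postgres account. Values here are illustrative.
postgres  soft  nofile  4096
postgres  hard  nofile  8192
```

The new limits take effect at the next login or service start; verify with `su postgres -c 'ulimit -n'`. Be aware that the limits in force when the postmaster was started may differ from what `su` reports, which could be why 800 works while 1000 fails.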
Re: [PERFORM] Performance Woes
> 2007-05-09 03:07:50.083 GMT 1146975740: LOCATION: BasicOpenFile, fd.c:471
> 2007-05-09 03:07:50.091 GMT 0: LOG: 0: duration: 12.362 ms
> 2007-05-09 03:07:50.091 GMT 0: LOCATION: exec_simple_query, postgres.c:1090
>
> So we decreased the max_files_per_process to 800. This took care of the
> error *BUT* about quadrupled the IO wait that is happening on the
> machine. It went from a peak of about 50% to peaks of over 200% (4
> processor machines, 4 gigs ram, raid). The load on the machine remained
> constant.

Sounds to me like you just need to up the total amount of open files
allowed by the operating system.

Joshua D. Drake

-- 
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/
Re: [PERFORM] Performance Woes
> I have several databases. They are each about 35gb in size and have
> about 10.5K relations (count from pg_stat_all_tables) in them. Pg_class
> is about 26k rows and the data directory contains about 70k files.
> These are busy machines; they run about 50 xactions per second (approx
> insert / update / delete about 500 rows per second).
>
> We started getting errors about the number of open file descriptors:
>
> 2007-05-09 03:07:50.083 GMT 1146975740: LOG: 53000: out of file descriptors: Too many open files; release and retry
> 2007-05-09 03:07:50.083 GMT 1146975740: CONTEXT: SQL statement "insert ….. " PL/pgSQL function "trigfunc_whatever" line 50 at execute statement
> 2007-05-09 03:07:50.083 GMT 1146975740: LOCATION: BasicOpenFile, fd.c:471
> 2007-05-09 03:07:50.091 GMT 0: LOG: 0: duration: 12.362 ms
> 2007-05-09 03:07:50.091 GMT 0: LOCATION: exec_simple_query, postgres.c:1090
>
> So we decreased the max_files_per_process to 800. This took care of the
> error *BUT* about quadrupled the IO wait that is happening on the
> machine. It went from a peak of about 50% to peaks of over 200% (4
> processor machines, 4 gigs ram, raid). The load on the machine remained
> constant.

What version of Pg/OS? What is your hardware config?

I had seen these errors with earlier versions of Pg 7.4.x, which were
fixed in later releases according to the changelogs.
[PERFORM] Performance Woes
Hi,

I have several databases. They are each about 35gb in size and have
about 10.5K relations (count from pg_stat_all_tables) in them. Pg_class
is about 26k rows and the data directory contains about 70k files. These
are busy machines; they run about 50 xactions per second (approx
insert / update / delete about 500 rows per second).

We started getting errors about the number of open file descriptors:

2007-05-09 03:07:50.083 GMT 1146975740: LOG: 53000: out of file descriptors: Too many open files; release and retry
2007-05-09 03:07:50.083 GMT 1146975740: CONTEXT: SQL statement "insert ….. " PL/pgSQL function "trigfunc_whatever" line 50 at execute statement
2007-05-09 03:07:50.083 GMT 1146975740: LOCATION: BasicOpenFile, fd.c:471
2007-05-09 03:07:50.091 GMT 0: LOG: 0: duration: 12.362 ms
2007-05-09 03:07:50.091 GMT 0: LOCATION: exec_simple_query, postgres.c:1090

So we decreased the max_files_per_process to 800. This took care of the
error *BUT* about quadrupled the IO wait that is happening on the
machine. It went from a peak of about 50% to peaks of over 200% (4
processor machines, 4 gigs ram, raid). The load on the machine remained
constant.

I am really trying to get an understanding of exactly what this setting
does, what exactly is running out of file descriptors, and how I can fix
that. I need to bring that IO wait back down.

Thanks for any help.

Ralph
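For what it's worth, the error itself is just EMFILE from open(). A self-contained sketch (assumes bash 4.1+ on Linux, for the {fd}< redirection syntax) that reproduces it by capping the per-process limit in a throwaway subshell and opening descriptors until open() fails:

```shell
# Lower the fd limit in a disposable subshell, then keep opening
# /dev/null until open() fails with EMFILE ("Too many open files");
# show the first error line.
bash -c 'ulimit -n 50
         for i in $(seq 1 60); do exec {fd}< /dev/null || break; done' \
    2>&1 | head -n 1
```

This is the same errno PostgreSQL surfaces through BasicOpenFile; max_files_per_process exists precisely so backends stop short of this point.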