Send Netdot-devel mailing list submissions to
[email protected]
To subscribe or unsubscribe via the World Wide Web, visit
https://osl.uoregon.edu/mailman/listinfo/netdot-devel
or, via email, send a message with subject or body 'help' to
[email protected]
You can reach the person managing the list at
[email protected]
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Netdot-devel digest..."
Today's Topics:
1. Re: odd PostgreSQL ping tests in v1.0.4 (William Bulley)
----------------------------------------------------------------------
Message: 1
Date: Thu, 1 May 2014 15:37:21 -0400
From: William Bulley <[email protected]>
Subject: Re: [Netdot-devel] odd PostgreSQL ping tests in v1.0.4
To: Anton Berezin <[email protected]>
Cc: Carlos Vicente <[email protected]>,
[email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=us-ascii
Thank you for your important comments and clarifications.
See my comments below.
According to Anton Berezin <[email protected]> on Tue, 04/29/14 at 15:53:
>
> I am not sure I understand what you mean when you say that there is only one
> connection.
>
> In my test installation, the query you mentioned shows the PID of the
> postgres daemons which are handling the connection. Moreover, it also gives
> you the connection port number (if a TCP connection is used). This last bit
> is especially useful, since it allows one to find out exactly what process
> is holding the connection open (via grepping sockstat output on FreeBSD, or
> grepping netstat --tcp --program output on Linux).
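For example (a sketch only; the netstat line below is a fabricated illustration, and 50767 is just one of the client ports that shows up later in this message):

```shell
# Sketch: given a client_port value from pg_stat_activity (e.g. 50767),
# find the local process holding that connection open.
# The sample line below is purely illustrative, not real output.
port=50767
sample='tcp 0 0 127.0.0.1:50767 127.0.0.1:5432 ESTABLISHED 12345/httpd'

# On a live Linux box you would pipe real output instead:
#   netstat --tcp --program | grep ":${port}"
# On FreeBSD:
#   sockstat -4 | grep ":${port}"
echo "$sample" | grep ":${port}" | awk '{print $NF}'   # last field is PID/program
```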
>
> Without any cronjobs running, and having just a single psql shell open,
> we can see that all those connections except the one made by the psql shell
> are made by instances of Apache httpd. The number of such processes is
> controlled by various Apache parameters such as ServerLimit, StartServers,
> etc., so you can reduce it if you wish.
>
> > Does the PostgreSQL server have any limits on the number of such sessions?
>
> It does, but this will only become a problem if the number of apache
> processes + the number of simultaneous cronjobs exceeds this limit (assuming
> a spherical cow in a vacuum case when the DB server is only used by Netdot
> application itself).
>
> If need be, one can either decrease the number of Apaches running, or
> increase the connection limit in PostgreSQL - see max_connections parameter
> in postgresql.conf - PostgreSQL restart required.
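For reference, the relevant postgresql.conf fragment would look like this (the value shown is only an example; a server restart is required for it to take effect):

```
# postgresql.conf
max_connections = 300    # example value; restart PostgreSQL after changing
```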
>
> > Another problem occurs when, after several hours/days of testing,
> > we issue the following command as "root" from the top of the Netdot
> > file tree:
> >
> > # make dropdb (to be later followed by "# make installdb")
> >
> > the make(1) session complains with the error that there are "n"
> > sessions still using the PostgreSQL server and therefore the "drop"
> > action cannot be completed.
>
> > At this point, the "postgres" user must intervene and clear the
> > offending "sessions".
>
> Well. The easiest solution is to stop apache and disable the cronjobs
> before doing this. This is the most sensible thing to do anyways, since one
> would not want to leave the frontend accessible while the database it uses
> is being dropped, right?
>
> > BTW, each "login" to Netdot causes two such "ping" entries, and each
> > "logout" from Netdot by a user causes two more "ping" entries to be added.
>
> Hmmm. So basically you are saying that the amount of open sessions
> increases over time - without reaching some sort of saturation limit? This
> is not something I observe - but of course in this case pretty much
> everything I said above does not really apply, right? Which would mean
> that you need to first investigate what processes are holding those
> connections open.
>
> Also, in this case it might be interesting to know the versions of
>
> - DBI
> - DBD::Pg
> - PostgreSQL
> - apache
DBI - "1.609"
DBD::Pg - "2.19.3"
PostgreSQL - "9.2.6"
apache - "2.2.26"
> If the number of open connections increases without bound, this is of course
> highly abnormal.
Yep.
> Oh, and by the way, how did you find out that the "source" of those rows are
> the ping methods?
See below.
Our PostgreSQL 9.2.6 configuration has max_connections set
to 300. If each "user" interaction with Netdot (login, do
something, logout) creates four "idle" PostgreSQL processes
on RHEL6, then roughly 75 such sessions would exhaust that
limit, and the system would quickly be overloaded under even
mild usage.
We are in testing mode. We have no cron jobs running now.
Is there a way to configure, or to code, a "session idle
timeout" equivalent for PostgreSQL? We would rather not
resort to the heavy-handed method of running periodic cron
jobs to get rid of any "idle" processes.
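That said, for anyone who does need a stopgap, one option (a sketch, not anything from Netdot itself; run as a PostgreSQL superuser, and treat the 10-minute threshold as an example) would be to terminate long-idle netdot backends with pg_terminate_backend(), using the pg_stat_activity columns as they exist in 9.2:

```sql
-- Terminate netdot_user backends that have been idle for over 10 minutes.
-- pid, state, and state_change are pg_stat_activity columns as of 9.2.
SELECT pg_terminate_backend(pid)
  FROM pg_stat_activity
 WHERE usename = 'netdot_user'
   AND state = 'idle'
   AND state_change < now() - interval '10 minutes';
```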
In order to ensure that any work we have done here in v1.0.4
was not contributing to our problems, I installed a standard,
unmodified v1.0.6 on our RHEL6 server. After I'd configured
the netdot/etc/Site.conf and apache22/conf/httpd.conf files,
I launched the apache22 server, and entered the standard URL
into my browser.
All I did was log in as "admin", change the "admin" password,
then log out to verify that the new password worked. After
this, as "admin" I changed the passwords for both the
"operator" and "guest" users. Finally, I verified that the
two new passwords worked for those two users, respectively.
These few tasks were literally all that I did.
At this point, I examined the process table using the ps(1)
command for any process containing the string "postgres":
postgres 10718 0.0 0.2 199200 21244 ? S  Apr29 0:03 /usr/local/pgsql/bin/postgres -D /opt/database
postgres 10726 0.0 0.1 199428 13236 ? Ss Apr29 0:00 postgres: checkpointer process
postgres 10727 0.0 0.0 199200  1408 ? Ss Apr29 0:02 postgres: writer process
postgres 10728 0.0 0.0 199200  1168 ? Ss Apr29 0:01 postgres: wal writer process
postgres 10729 0.0 0.0 200232  2504 ? Ss Apr29 0:04 postgres: autovacuum launcher process
postgres 10730 0.0 0.0 142420  1212 ? Ss Apr29 0:07 postgres: stats collector process
postgres 22360 0.0 0.0 200564  4604 ? Ss 14:21 0:00 postgres: netdot_user netdot 127.0.0.1(50767) idle
postgres 22372 0.0 0.0 200648  5584 ? Ss 14:21 0:00 postgres: netdot_user netdot 127.0.0.1(50768) idle
postgres 22375 0.0 0.0 200648  5456 ? Ss 14:21 0:00 postgres: netdot_user netdot 127.0.0.1(50769) idle
postgres 22378 0.0 0.0 201520  7164 ? Ss 14:21 0:00 postgres: netdot_user netdot 127.0.0.1(50770) idle
postgres 22381 0.0 0.0 201548  7168 ? Ss 14:21 0:00 postgres: netdot_user netdot 127.0.0.1(50771) idle
postgres 22384 0.0 0.0 201540  7988 ? Ss 14:21 0:00 postgres: netdot_user netdot 127.0.0.1(50772) idle
postgres 22389 0.0 0.0 201236  6380 ? Ss 14:22 0:00 postgres: netdot_user netdot 127.0.0.1(50773) idle
postgres 22392 0.0 0.0 201440  7788 ? Ss 14:22 0:00 postgres: netdot_user netdot 127.0.0.1(50774) idle
postgres 22397 0.0 0.0 201388  7660 ? Ss 14:22 0:00 postgres: netdot_user netdot 127.0.0.1(50775) idle
postgres 22400 0.0 0.0 201236  6340 ? Ss 14:23 0:00 postgres: netdot_user netdot 127.0.0.1(50776) idle
postgres 22404 0.0 0.1 201676  8100 ? Ss 14:23 0:00 postgres: netdot_user netdot 127.0.0.1(50777) idle
Our PostgreSQL server had been running continuously since we
last restarted it on 4/29/2014. I then ran the following SQL
SELECT command from within the PostgreSQL "psql" client in a
user shell on the server:
netdot=# select * from pg_stat_activity;

(wide output reformatted for readability; every netdot_user row also has
datid=87718, datname=netdot, usesysid=16387, client_addr=127.0.0.1, an empty
application_name and xact_start, and waiting=f, and its state_change follows
query_start by well under a millisecond; the last row is the psql session
itself, with usename=postgres, usesysid=10, application_name=psql, and a
local connection, hence client_port=-1)

  pid  | client_port | backend_start                 | query_start                   | state  | query
-------+-------------+-------------------------------+-------------------------------+--------+---------------------------------
 22360 | 50767       | 2014-05-01 14:21:20.715827-04 | 2014-05-01 14:21:20.721127-04 | idle   | DEALLOCATE dbdpg_p22356_1
 22372 | 50768       | 2014-05-01 14:21:54.708273-04 | 2014-05-01 14:24:54.628021-04 | idle   | SELECT 'DBD::Pg ping test'
 22375 | 50769       | 2014-05-01 14:21:54.764297-04 | 2014-05-01 14:24:14.109033-04 | idle   | SELECT 'DBD::Pg ping test'
 22378 | 50770       | 2014-05-01 14:21:54.76834-04  | 2014-05-01 14:24:54.582058-04 | idle   | SELECT 'DBD::Pg ping test'
 22381 | 50771       | 2014-05-01 14:21:54.774615-04 | 2014-05-01 14:24:24.088902-04 | idle   | SELECT 'DBD::Pg ping test'
 22384 | 50772       | 2014-05-01 14:21:54.784423-04 | 2014-05-01 14:24:32.794984-04 | idle   | SELECT 'DBD::Pg ping test'
 22389 | 50773       | 2014-05-01 14:22:11.910088-04 | 2014-05-01 14:24:21.440224-04 | idle   | SELECT 'DBD::Pg ping test'
 22392 | 50774       | 2014-05-01 14:22:11.955962-04 | 2014-05-01 14:24:45.012678-04 | idle   | SELECT 'DBD::Pg ping test'
 22397 | 50775       | 2014-05-01 14:22:52.449597-04 | 2014-05-01 14:24:03.810301-04 | idle   | SELECT 'DBD::Pg ping test'
 22400 | 50776       | 2014-05-01 14:23:04.149136-04 | 2014-05-01 14:24:14.063364-04 | idle   | SELECT 'DBD::Pg ping test'
 22404 | 50777       | 2014-05-01 14:23:32.853409-04 | 2014-05-01 14:24:45.000936-04 | idle   | SELECT 'DBD::Pg ping test'
 22500 | -1          | 2014-05-01 14:54:10.748147-04 | 2014-05-01 14:55:25.802684-04 | active | select * from pg_stat_activity;
(12 rows)
You can match the PID and TCP client port values between the
ps(1) output and the query results above. I cannot explain
the origin of all these "idle" processes. I hope you can help.
Regards,
web...
--
William Bulley Email: [email protected]
72 characters width template ----------------------------------------->|
End of Netdot-devel Digest, Vol 86, Issue 2
*******************************************