Re: Policy server problem: connection timed out or connection reset by peer

Bill Cole Thu, 18 Aug 2016 20:58:05 -0700

On 18 Aug 2016, at 21:32, Zhang Huangbin wrote:

Dear Bill,
Thanks very much for helping.
On Aug 19, 2016, at 4:17 AM, Bill Cole <postfixlists-070...@billmail.scconsult.com> wrote:
What do you mean "run" the policy service? It's a python program.
Which must be running in order for it to be listening for connections. Likely mechanisms would be via a SysV init script in /etc/init.d/ or via a systemd service definition.
On some old Linux distributions, it's run with a SysV init script, but on CentOS 7 and Ubuntu 16.04, it's run via systemd.
If your policy server is listening on 127.0.0.1:1234, you could try this:
for x in {1..100} ; do nc 127.0.0.1 1234 & done
That attempts to make 100 TCP connections to 127.0.0.1:1234 with 100 different 'nc' processes, all running in the background.
If your policy server is accepting the connections, running the "jobs" command after all of those background processes have launched should show them all in "Stopped(SIGTTIN)" state, meaning that they are connected and waiting for input.
I did this test with shell:

for i in $(seq 200); do
    nc 127.0.0.1 1234 &
done

The 'jobs' command shows 200 "Stopped" jobs.

That seems promising. It implies (I think) that your policy server is at least accepting simultaneous connections. On the other hand, it is possible that there's some facility in whatever Python components you are using to implement the server that gives you that low-level functionality (accepting TCP connections asynchronously) without your code necessarily handling those connections in parallel.

If all 100 processes connect in a reasonable time, the next step would be to do the same test, but with input piped into all of the nc commands simulating what Postfix sends to a policy server.
I tested with shell commands below:

for i in $(seq 1000); do
    (cat <<EOF
request=smtpd_access_policy
protocol_state=RCPT
... [omit other attr=value here] ...
ccert_pubkey_fingerprint=

EOF
) | nc 127.0.0.1 7777 &
done


So, is this policy server listening on port 1234 or port 7777?

I'll assume this is just inconsistent (and pointless) obfuscation...

I get some "Ncat: Connection reset by peer." and "Ncat: Connection timed out." errors.
Does it mean that my policy server design (programming) is improper? Or, just slow performance?

That's beyond my expertise to say, but I think 1000 connections in parallel is likely to be challenging the normal resource constraints of the Linux kernel.
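
One way to tell kernel limits apart from server behaviour is to raise the number of parallel clients step by step and tally which errors appear at each step. A minimal sketch in Python, assuming only the standard library; the shortened request and the `probe` and `run` names are illustrative, not part of Postfix or of the original test:

```python
import socket
from concurrent.futures import ThreadPoolExecutor

# Hypothetical, shortened request; a real Postfix request carries
# many more attribute=value lines.
REQUEST = (
    "request=smtpd_access_policy\n"
    "protocol_state=RCPT\n"
    "\n"  # an empty line ends the request
).encode()


def probe(host, port, timeout=5.0):
    """Send one request; return 'ok' or the name of the error raised."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(REQUEST)
            reply = b""
            # The reply is terminated by an empty line as well.
            while not reply.endswith(b"\n\n"):
                chunk = sock.recv(4096)
                if not chunk:
                    return "closed_early"
                reply += chunk
            return "ok"
    except OSError as exc:
        # Covers timeouts, resets, and refused connections.
        return type(exc).__name__


def run(host, port, clients):
    """Run `clients` probes in parallel and tally the outcomes."""
    with ThreadPoolExecutor(max_workers=clients) as pool:
        results = list(pool.map(lambda _: probe(host, port), range(clients)))
    return {name: results.count(name) for name in sorted(set(results))}
```

Calling `run("127.0.0.1", 1234, n)` for n = 10, 100, 500 (host, port, and counts are placeholders) would show roughly where resets and timeouts begin, which should be more informative than a single run at 1000.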

As Wietse noted more tersely, the only way to handle concurrent connections is to not block your ability to accept and handle a new connection while you wait for the completion of anything that might take time with an existing connection. You have to hand off a new connection to a new thread or process without reading from it or writing to it, and get back to accepting new connections as quickly as possible. I'm not fluent in Python and haven't worked with network server code in any language for decades, so I can't say specifically what you need to do in your program, but I know for sure that trying to serialize your transactions in a single-threaded design is unworkable.
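
For illustration only, the hand-off described above might look something like the following with Python's standard `socketserver` module. This is a sketch under assumptions, not the actual policy server's code: `PolicyHandler`, `PolicyServer`, and `decide` are hypothetical names, and the policy always answers DUNNO.

```python
import socketserver


class PolicyHandler(socketserver.StreamRequestHandler):
    """Handles one client connection; each instance runs in its own thread."""

    def handle(self):
        attrs = {}
        # Postfix may send several requests over one connection,
        # so keep reading until the client closes it.
        for raw in self.rfile:
            line = raw.decode("utf-8", "replace").rstrip("\r\n")
            if line:
                name, _, value = line.partition("=")
                attrs[name] = value
                continue
            # Empty line: the request is complete, so answer it.
            self.wfile.write(f"action={decide(attrs)}\n\n".encode())
            attrs = {}


def decide(attrs):
    """Placeholder policy; slow work (database, DNS) would happen here."""
    return "DUNNO"


class PolicyServer(socketserver.ThreadingTCPServer):
    """The accept loop only spawns a thread; it never reads or writes."""

    allow_reuse_address = True
    daemon_threads = True     # stuck clients do not block shutdown
    request_queue_size = 128  # listen() backlog; the class default is 5
```

It would be started with something like `PolicyServer(("127.0.0.1", 1234), PolicyHandler).serve_forever()`, where the address is again a placeholder. The point of the design is that a slow `decide` call delays only its own connection, while the main loop goes straight back to accepting.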
