On 18 Aug 2016, at 21:32, Zhang Huangbin wrote:
Dear Bill,
Thanks very much for helping.
On Aug 19, 2016, at 4:17 AM, Bill Cole
<postfixlists-070...@billmail.scconsult.com> wrote:
What do you mean "run" the policy service? It's a Python program.
Which must be running in order for it to be listening for
connections.
Likely mechanisms would be via a SysV init script in /etc/init.d/ or
via a systemd service definition.
On some old Linux distributions, it's run with a SysV init script, but
on CentOS 7 and Ubuntu 16.04, it's run via systemd.
If your policy server is listening on 127.0.0.1:1234, you could try
this:
for x in {1..100} ; do nc 127.0.0.1 1234 & done
That attempts to make 100 TCP connections to 127.0.0.1:1234 with 100
different 'nc' processes, all running in the background.
If your policy server is accepting the connections, running the
"jobs" command after all of those background processes have launched
should show them all in "Stopped(SIGTTIN)" state, meaning that they
are connected and waiting for input.
I ran this test in the shell:
for i in $(seq 200); do
nc 127.0.0.1 1234 &
done
The 'jobs' command shows 200 "Stopped" jobs.
That seems promising. It implies (I think) that your policy server is at
least accepting simultaneous connections. On the other hand, it is
possible that there's some facility in whatever Python components you
are using to implement the server that gives you that low-level
functionality (accepting TCP connections asynchronously) without your
code necessarily handling those connections in parallel.
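One way this can happen even without any help from Python libraries: the kernel itself completes TCP handshakes for a listening socket, up to its backlog, before the program ever calls accept(). The following is a minimal sketch of that effect, not a claim about what this particular policy server does. The function name connect_without_accept and the backlog value of 16 are made up for illustration.

```python
import socket


def connect_without_accept(clients=3):
    """Return how many clients connect to a listener that never calls accept()."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(16)              # backlog of connections awaiting accept()
    address = listener.getsockname()

    connected = []
    try:
        for _ in range(clients):
            # The kernel completes the handshake and queues the connection,
            # so connect() returns even though accept() is never called.
            connected.append(socket.create_connection(address, timeout=5))
        return len(connected)
    finally:
        for sock in connected:
            sock.close()
        listener.close()
```

So a pile of "connected" clients shows only that the listening socket exists and has room in its queue. It does not show that anything is reading from those connections.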
If all 100 processes connect in a reasonable time, the next step
would be to do the same test, but with input piped into all of the nc
commands, simulating what Postfix sends to a policy server.
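For anyone who would rather drive that test from Python than from nc, here is a rough sketch under some assumptions. The names REQUEST, one_request and run_load_test are hypothetical, and the request carries only the two attributes quoted in this thread, where a real Postfix request has more. The empty line at the end is what terminates a policy request.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

# Only the two attributes shown in this thread; a real request has more.
REQUEST = (
    b"request=smtpd_access_policy\n"
    b"protocol_state=RCPT\n"
    b"\n"  # an empty line ends the request
)


def one_request(host, port, timeout=10.0):
    """Open one connection, send one request, return the reply or the error."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(REQUEST)
            with sock.makefile("rb") as reply:
                # First reply line, e.g. "action=..."; empty if the peer closed.
                return reply.readline().decode("utf-8", "replace").strip()
    except OSError as exc:
        # Covers refused, reset and timed-out connections.
        return "error: " + type(exc).__name__


def run_load_test(host, port, clients=100):
    """Issue `clients` requests at the same time, one thread per connection."""
    with ThreadPoolExecutor(max_workers=clients) as pool:
        futures = [pool.submit(one_request, host, port) for _ in range(clients)]
        return [f.result() for f in futures]
```

Calling run_load_test("127.0.0.1", 1234) returns one string per connection, so resets and timeouts can be counted instead of scrolling past on the terminal.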
I tested with the shell commands below:
for i in $(seq 1000); do
(cat <<EOF
request=smtpd_access_policy
protocol_state=RCPT
... [omit other attr=value here] ...
ccert_pubkey_fingerprint=
EOF
) | nc 127.0.0.1 7777 &
done
So, is this policy server listening on port 1234 or port 7777?
I'll assume this is just inconsistent (and pointless) obfuscation...
I get some "Ncat: Connection reset by peer." and "Ncat: Connection
timed out." errors.
Does this mean that my policy server is badly designed (programmed)?
Or is it just slow?
That's beyond my expertise to say, but I think 1000 connections in
parallel are likely to strain the normal resource limits of the Linux
kernel.
As Wietse noted more tersely, the only way to handle concurrent
connections is to not block your ability to accept and handle a new
connection while you wait for the completion of anything that might take
time with an existing connection. You have to hand off a new connection
to a new thread or process without reading from it or writing to it, and
get back to accepting new connections as quickly as possible. I'm not
fluent in Python and haven't worked with network server code in any
language for decades, so I can't say specifically what you need to do in
your program, but I know for sure that trying to serialize your
transactions in a single-threaded design is unworkable.
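For what it's worth, the hand-off described above might look roughly like this with the socketserver module from Python's standard library. This is a sketch under assumptions, not the actual policy server from this thread. PolicyHandler, PolicyServer, decide and demo are hypothetical names, and the DUNNO reply is only a placeholder for real policy logic.

```python
import socket
import socketserver
import threading


def decide(attrs):
    """Placeholder policy: always defer to later restrictions."""
    return "DUNNO"


class PolicyHandler(socketserver.StreamRequestHandler):
    """Runs in its own thread, one instance per connection."""

    def handle(self):
        attrs = {}
        for raw in self.rfile:
            line = raw.decode("utf-8", "replace").rstrip("\r\n")
            if line:
                name, _, value = line.partition("=")
                attrs[name] = value
                continue
            # An empty line ends the request: reply, then await the next one.
            self.wfile.write(b"action=" + decide(attrs).encode() + b"\n\n")
            attrs = {}


class PolicyServer(socketserver.ThreadingTCPServer):
    # The accept loop only accepts and starts a thread; it never reads
    # from or writes to the connection itself.
    allow_reuse_address = True
    daemon_threads = True
    request_queue_size = 128  # listen backlog for not-yet-accepted connections


def demo(clients=5):
    """Hold several connections open at once and query them in reverse order."""
    with PolicyServer(("127.0.0.1", 0), PolicyHandler) as server:
        threading.Thread(target=server.serve_forever, daemon=True).start()
        socks = [socket.create_connection(server.server_address, timeout=5)
                 for _ in range(clients)]
        replies = []
        for sock in reversed(socks):
            sock.sendall(b"request=smtpd_access_policy\nprotocol_state=RCPT\n\n")
            with sock.makefile("rb") as reply:
                replies.append(reply.readline())
        for sock in socks:
            sock.close()
        server.shutdown()
    return replies
```

The demo function keeps every connection open and queries the last one first. A server that finished with one client before accepting the next would stall there, while this one answers each connection from its own thread. To run it for real, something like PolicyServer(("127.0.0.1", 7777), PolicyHandler).serve_forever() would do, with the port being whatever Postfix is configured to use.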