On 18 Aug 2016, at 21:32, Zhang Huangbin wrote:

Dear Bill,

Thanks very much for helping.

On Aug 19, 2016, at 4:17 AM, Bill Cole <postfixlists-070...@billmail.scconsult.com> wrote:

What do you mean "run" the policy service? It's a Python program.

Which must be running in order for it to be listening for connections. Likely mechanisms would be via a SysV init script in /etc/init.d/ or via a systemd service definition.

On some old Linux distributions, it's run with a SysV init script, but on CentOS 7 and Ubuntu 16.04, it's run via systemd.

If your policy server is listening on 127.0.0.1:1234, you could try this:

for x in {1..100} ; do nc 127.0.0.1 1234 & done

That attempts to make 100 TCP connections to 127.0.0.1:1234 with 100 different 'nc' processes, all running in the background.

If your policy server is accepting the connections, running the "jobs" command after all of those background processes have launched should show them all in "Stopped(SIGTTIN)" state, meaning that they are connected and waiting for input.

I did this test with shell:

for i in $(seq 200); do
    nc 127.0.0.1 1234 &
done

The 'jobs' command shows 200 "Stopped" jobs.

That seems promising. It implies (I think) that your policy server is at least accepting simultaneous connections. On the other hand, it is possible that there's some facility in whatever Python components you are using to implement the server that gives you that low-level functionality (accepting TCP connections asynchronously) without your code necessarily handling those connections in parallel.
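One related effect can be shown with a small sketch: on a listening TCP socket, the kernel completes the handshake and queues the connection (up to the listen backlog) even if the program never calls accept(). So a successful connect by itself says little about how the server handles the connection afterwards. The function name and the backlog value below are made up for illustration, and this is a bare listener, not the policy server discussed in this thread.

```python
import socket


def connect_without_accept(n=5, backlog=16):
    """Open n client connections to a listener that never calls accept().

    Returns the number of clients that connected successfully.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the kernel pick a free port
    listener.listen(backlog)             # hypothetical backlog, larger than n
    port = listener.getsockname()[1]

    clients = []
    try:
        for _ in range(n):
            # connect() returns once the kernel has queued the connection;
            # the listening program has not touched it at this point.
            clients.append(socket.create_connection(("127.0.0.1", port), timeout=2))
        return len(clients)
    finally:
        for c in clients:
            c.close()
        listener.close()
```

Calling connect_without_accept(5) should report five successful connects although accept() is never called, which is why the next step in the test (actually sending a request and waiting for the reply) is the more meaningful one.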


If all 100 processes connect in a reasonable time, the next step would be to do the same test, but with input piped into all of the nc commands simulating what Postfix sends to a policy server.

I tested with shell commands below:

for i in $(seq 1000); do
    (cat <<EOF
request=smtpd_access_policy
protocol_state=RCPT
... [omit other attr=value here] ...
ccert_pubkey_fingerprint=

EOF
) | nc 127.0.0.1 7777 &
done

So, is this policy server listening on port 1234 or port 7777?

I'll assume this is just inconsistent (and pointless) obfuscation...

I get some "Ncat: Connection reset by peer." and "Ncat: Connection timed out." errors.
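To see how many of the parallel requests fail and in which way, the same kind of test can be sketched in Python so that each outcome is counted. This is only a rough equivalent of the shell loop above: the names probe and run are made up for illustration, the request carries only the two attributes shown in this thread (a real Postfix request has more), and it assumes the server answers with an attribute list terminated by an empty line.

```python
import socket
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Only the attributes quoted in the thread; the rest are omitted here too.
REQUEST = (
    b"request=smtpd_access_policy\n"
    b"protocol_state=RCPT\n"
    b"\n"
)


def probe(host, port, timeout=5.0):
    """Send one policy request and classify what happened."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.sendall(REQUEST)
            data = b""
            # A policy reply ends with an empty line.
            while not data.endswith(b"\n\n"):
                chunk = s.recv(4096)
                if not chunk:
                    return "closed early"
                data += chunk
            return "ok"
    except socket.timeout:
        return "timed out"
    except ConnectionResetError:
        return "reset"
    except OSError as exc:
        return "error: " + exc.__class__.__name__


def run(host, port, n):
    """Run n probes in parallel threads and tally the outcomes."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        return Counter(pool.map(lambda _: probe(host, port), range(n)))
```

For example, run("127.0.0.1", 7777, 200) returns a Counter of outcomes, which makes it easier to see whether failures start at a particular level of parallelism.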

Does this mean that my policy server's design (programming) is improper, or is it just slow?

That's beyond my expertise to say, but I think 1000 connections in parallel is likely to be challenging the normal resource constraints of the Linux kernel.
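Two limits that may be worth checking on the machine running the server are the per-process open file limit (each accepted connection uses one file descriptor) and the kernel's cap on the listen backlog. The sketch below only reads them; the function name is made up, and nothing in this thread establishes which limit, if any, is actually being hit.

```python
import resource


def connection_limits():
    """Report limits that can matter when many connections arrive at once."""
    # Soft and hard limits on open file descriptors for this process.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # Upper bound the Linux kernel applies to the backlog passed to listen().
    try:
        with open("/proc/sys/net/core/somaxconn") as f:
            somaxconn = int(f.read().strip())
    except (OSError, ValueError):
        somaxconn = None  # not Linux, or /proc is not readable

    return {"nofile_soft": soft, "nofile_hard": hard, "somaxconn": somaxconn}
```

Note that the values apply to the process that runs this code, so to be meaningful it has to be run in the same environment (user, init system settings) as the policy server itself.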

As Wietse noted more tersely, the only way to handle concurrent connections is to not block your ability to accept and handle a new connection while you wait for the completion of anything that might take time with an existing connection. You have to hand off a new connection to a new thread or process without reading from it or writing to it, and get back to accepting new connections as quickly as possible. I'm not fluent in Python and haven't worked with network server code in any language for decades, so I can't say specifically what you need to do in your program, but I know for sure that trying to serialize your transactions in a single threaded design is unworkable.
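A minimal sketch of that hand-off in Python, using the standard library's socketserver module with one thread per connection, might look like the following. It is not the code of the policy server discussed here: the class names, the query helper, and the fixed action=DUNNO reply are placeholders, and a real server would make its decision where the comment indicates.

```python
import socket
import socketserver


class PolicyHandler(socketserver.StreamRequestHandler):
    """Runs in its own thread; serves one client connection."""

    def handle(self):
        while True:
            attrs = {}
            # Read attr=value lines until the empty line that ends a request.
            while True:
                line = self.rfile.readline()
                if not line:
                    return  # client closed the connection
                line = line.rstrip(b"\r\n")
                if not line:
                    break
                name, _, value = line.partition(b"=")
                attrs[name.decode()] = value.decode()
            # Placeholder decision: a real server would inspect attrs here.
            self.wfile.write(b"action=DUNNO\n\n")


class PolicyServer(socketserver.ThreadingTCPServer):
    """The accept loop only spawns a thread, then accepts again."""

    allow_reuse_address = True
    daemon_threads = True
    request_queue_size = 128  # listen backlog; the value is an assumption


def query(port, request, host="127.0.0.1"):
    """Hypothetical test helper: send one request, return the raw reply."""
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(request)
        data = b""
        while not data.endswith(b"\n\n"):
            chunk = s.recv(4096)
            if not chunk:
                break
            data += chunk
        return data
```

To try it, create PolicyServer(("127.0.0.1", 7777), PolicyHandler) and call its serve_forever() method. Because the main loop never reads from or writes to a client socket itself, a slow or idle client only occupies its own thread and does not delay the next accept.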
