Re: [Twisted-Python] spawnProcess - reapProcess not retrying on failures
On 3 September 2014 18:55, exar...@twistedmatrix.com wrote:
> On 03:27 pm, a...@roiban.ro wrote:
>> On 3 September 2014 14:39, exar...@twistedmatrix.com wrote:
>> [snip]
>>
>> Do you have any suggestion for how the calls should be made?
>>
>>     reactor.run(installSignalHandlers=True, installStopHandlers=False)
>
> Perhaps.
>
> [snip]
>
> It might be nice to try to be somewhat flexible - in case there's some
> reason to change what signals the reactor wants to handle in the
> future. Perhaps:
>
>     reactor.run(installSignalHandlers={SIGCHLD})
>
> An entirely different direction could be to make this bit of
> configuration into initialization for the reactor.
>
>     from twisted.internet.epollreactor import install
>     install(installSignalHandlers={SIGCHLD})
>
>     from twisted.internet import reactor
>     ...
>     reactor.run()
>
> By keeping these details away from `IReactorCore.run`, that method
> remains maximally useful. For example, if you could set up the reactor
> this way, a normal `twistd` plugin would still be able to benefit from
> your choice, even with twistd's naive call of `reactor.run()` with no
> extra arguments.
>
> Application code calling these `install` functions is already
> supported (it's how you select a specific reactor, after all). Some of
> the install functions even accept arguments already.
>
> This would actually eliminate another existing issue -
> `IReactorCore.run` is actually defined to take no arguments. The
> implementations ignore this because someone thought it was important
> to be able to disable installation of signal handlers.

I am happy to have a simple reactor.run() and move installSignalHandlers
somewhere else.

Working with install(installSignalHandlers={SIGCHLD}) seems a bit
complicated, as I assume that many developers rely on the automatic
reactor installation.

At the same time, I assume that the 'installSignalHandlers' argument
would be supported by all reactors, which is why maybe we can have
something like:

    from twisted.internet import reactor

    def customHandler(signum, frame):
        pass

    reactor.installSignalHandlers(
        SIGCHLD=True,  # Install default handler
        SIGTERM=None,  # Don't install handler
        SIGINT=customHandler,  # Install custom handler
        # SIGBREAK is not passed, so its default handler is installed.
        )
    # reactor.installSignalHandlers() installs all default handlers.
    reactor.run()

reactor.run(installSignalHandlers=True|False) would be deprecated.

If reactor.installSignalHandlers is not called before run(), all
default handlers will be installed.

[snip]

> The sidecar process is an example of a general fix, though. The idea
> there is that Twisted itself runs a private child process (perhaps
> only when the first call to spawnProcess is made). It talks to that
> process using a file descriptor. That process can install a SIGCHLD
> handler (because Twisted owns it, application developers don't get to
> say they don't want one installed) or use another more invasive
> strategy for child process management. When you want to spawn a
> process, the main process tells the sidecar to do it. The sidecar
> relays traffic between the child and the original parent (or does
> something involving passing file descriptors across processes).
>
> This removes the need to ever install a SIGCHLD handler in the main
> process. It also probably enables some optimizations (reapProcesses is
> O(N) on the number of child processes right now) that are very tricky
> or impossible otherwise.
>
> Jean-Paul

Thanks for the details regarding the side-process dedicated to child
process management.

Not sure if we need a separate ticket for that, or add it as a comment
to https://twistedmatrix.com/trac/ticket/5710

Thanks!
--
Adi Roiban

___
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
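The message does not spell out what the "more invasive strategy" for child process management would be. One possibility, and it is only an assumption here, is reaping with `os.waitpid(-1, os.WNOHANG)`: a single call returns whichever child has exited, so there is no need to poll every known child in turn. It also collects the exit status of children started by unrelated code in the same process, which is why it would only be reasonable inside a process that Twisted fully owns, such as the sidecar. The helper names below are hypothetical and POSIX-only; this is a sketch, not Twisted's implementation.

```python
import os


def exit_code(status):
    """Decode a waitpid() status: the exit code, or minus the signal number."""
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)


def reap_any_children():
    """Reap every child that has already exited, whoever started it.

    Returns a list of (pid, exit_code) pairs. The idea is that the sidecar
    would call this from its own loop after it notices a SIGCHLD.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, exit_code(status)))
    return reaped
```

The sidecar would then report each `(pid, exit_code)` pair back to the main process over the file descriptor they share; how that message is framed is left open here.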
Re: [Twisted-Python] spawnProcess - reapProcess not retrying on failures
On 07:26 am, a...@roiban.ro wrote:
> On 3 September 2014 18:55, exar...@twistedmatrix.com wrote:
>> On 03:27 pm, a...@roiban.ro wrote:
>>> On 3 September 2014 14:39, exar...@twistedmatrix.com wrote:
>>> [snip]
>>>
>>> Do you have any suggestion for how the calls should be made?
>>>
>>>     reactor.run(installSignalHandlers=True, installStopHandlers=False)

Also note there is an old, widely scoped ticket:

    https://twistedmatrix.com/trac/ticket/2415

with some more stuff (not necessarily directly related to your comments
on signal handling) on it.

What would be really nice is if someone collected *all* of the
complaints about `spawnProcess` into one place and integrated solutions
to them into a design for a replacement. :)

Jean-Paul
Re: [Twisted-Python] Graceful shutdown of twistd application
On 10:52 am, sangiova...@nweb.it wrote:
> Hello list,
>
> I need to implement a graceful shutdown procedure for a twistd
> application. The application is made up of two services: an
> internet.TCPClient and an internet.TCPServer. They're glued together
> with a MultiService instance, which is in turn set to have
> 'application' as parent. The server and the client work together,
> making a proxy (SMTP server and AMQP client).
>
> My goal is the following:
>
> - intercept a SIGTERM signal
> - 'block' on the server side: since it's SMTP I get this by setting a
>   variable that makes the server return tempfails (4xx) for new
>   messages, while keeping current sessions active
> - wait until current requests are satisfied (I keep a dictionary of
>   current pending messages)
> - shut the whole thing down

This is exactly what before shutdown triggers are for. Alternatively,
use the higher-level API and implement `stopService` on one of your
services. Either way, return a `Deferred` that only fires when you're
satisfied it is time for shutdown to proceed.

You said before shutdown triggers are too late but you didn't say why.
I think that's based on a misunderstanding - but if not, then explain
why it doesn't work for your scenario.

Jean-Paul

> What is the best solution for this use case? It's not really clear to
> me how to catch SIGTERM and handle pending requests *before* the
> underlying services start to shutdown (i.e. even
> addSystemEventTrigger('before', 'shutdown', callable) is called too
> late for my needs).
>
> Thank you very much for your help!
>
> Fabio
Re: [Twisted-Python] Graceful shutdown of twistd application
On Thu, Sep 4, 2014 at 2:02 PM, exar...@twistedmatrix.com wrote:
> You said before shutdown triggers are too late but you didn't say why.
> I think that's based on a misunderstanding - but if not, then explain
> why it doesn't work for your scenario.

Hi, thanks for your reply. I've tried the following:

    def sleep(secs):
        log.msg('from within trigger')
        d = defer.Deferred()
        reactor.callLater(secs, d.callback, None)
        return d

    reactor.addSystemEventTrigger('before', 'shutdown', sleep, 10)

This is what I can see in the logs:

    Sep 4 14:25:06 prepyproxy01 proxy [4924]: [-] Received SIGTERM, shutting down.
    Sep 4 14:25:06 prepyproxy01 proxy [4924]: [-] from within trigger
    Sep 4 14:25:06 prepyproxy01 proxy [4924]: [TwistedProtocolConnection,client] twisted.internet.tcp.Connector instance at 0x05717be0 will retry in 2 seconds
    Sep 4 14:25:06 prepyproxy01 proxy [4924]: [TwistedProtocolConnection,client] Stopping factory __builtin__.RabbitMQClientFactory instance at 0x057172c0
    Sep 4 14:25:06 prepyproxy01 proxy [4924]: [-] (TCP Port 10025 Closed)
    Sep 4 14:25:06 prepyproxy01 proxy [4924]: [-] Stopping factory __builtin__.TempfailingESMTPFactory instance at 0x057172a0
    Sep 4 14:25:09 prepyproxy01 proxy [4924]: [-] Starting factory __builtin__.RabbitMQClientFactory instance at 0x057172c0
    Sep 4 14:25:09 prepyproxy01 proxy [4924]: [TwistedProtocolConnection,client] [rmq01] RabbitMQ connection established
    Sep 4 14:25:16 prepyproxy01 proxy [4924]: [TwistedProtocolConnection,client] twisted.internet.tcp.Connector instance at 0x05717be0 will retry in 2 seconds
    Sep 4 14:25:16 prepyproxy01 proxy [4924]: [TwistedProtocolConnection,client] Stopping factory __builtin__.RabbitMQClientFactory instance at 0x057172c0
    Sep 4 14:25:16 prepyproxy01 proxy [4924]: [-] Main loop terminated.
    Sep 4 14:25:16 prepyproxy01 proxy [4924]: [-] Server Shut Down.

It seems to me that the shutdown phase doesn't wait for the deferred to
fire before stopping my client and server.

To be clear, my expected result is:

- SIGTERM
- pause 10s
- client/server shutdown

I am surely missing something, but I really can't figure out what.

Oh, for the record: I'm using Twisted 13.2.0 on PyPy.

Thanks!
[Twisted-Python] High throughput database logger
Hi,

I'm looking at various options for implementing a high throughput
database logger that will work with Twisted. My requirements, listed by
importance:

1) small memory footprint
2) high speed
3) low garbage generation

The application I'm working on runs continuously (24/7). I've
experimented a bit with pysqlite and Twisted to see which approach is
better suited (see attached example).

Question 1: I noticed that all of the Twisted-based versions are very
slow compared to the plain sqlite3 test. This seems to be caused by
atomic transaction management, namely a commit after each insert.

Would be interested to know if there is a simple way to avoid this and
do my own transaction management (aka batch commit).

One other thing is the greatly varying amounts of garbage generated
(peak memory) and memory usage between the Twisted variants.

Question 2: I would have expected B (Twisted ADBAPI) to behave very
similarly to C/E since I'm using a connection pool of size 1 and all
requests are queued and handled sequentially. Could any of you please
give me some pointers as to why this is happening?

Question 3: Even though objgraph lists the exact same reference count
once the code has run, the amount of used memory greatly differs. Any
ideas what might be causing this?

Any suggestions and/or pointers on how to improve/do this are more than
welcome.

Thank you for your time,
Adrian

    import gc
    import objgraph
    import os
    import sqlite3
    import sys
    from time import sleep

    from twisted.enterprise.adbapi import ConnectionPool
    from twisted.internet import defer, task, reactor


    def _removeFile(path):
        try:
            os.unlink(path)
        except OSError:
            pass


    def plain_sqlite3(conn, rows):
        # Baseline: all inserts on one cursor, a single commit at the end.
        query = 'INSERT INTO t (value) VALUES (1)'
        cursor = conn.cursor()
        for row in range(rows):
            cursor.execute(query)
        cursor.close()
        conn.commit()


    def adbapi(pool, rows):
        # Queues every operation up front; returns the last Deferred.
        query = 'INSERT INTO tw (value) VALUES (2)'
        last = None
        for row in range(rows):
            last = pool.runOperation(query)
            last.addCallback(lambda _: None)
        return last


    def inline_callbacks(pool, rows):
        # Waits for each insert to finish before starting the next.
        query = 'INSERT INTO tw (value) VALUES (3)'

        @defer.inlineCallbacks
        def do_insert():
            for row in range(rows):
                deferred = pool.runOperation(query)
                deferred.addCallback(lambda _: None)
                yield deferred

        return do_insert()


    def semaphore(pool, rows):
        # Queues every operation on a semaphore that admits one at a time.
        query = 'INSERT INTO tw (value) VALUES (4)'
        semaphore = defer.DeferredSemaphore(1)
        last = None
        for row in range(rows):
            last = semaphore.run(pool.runOperation, query)
            last.addCallback(lambda _: None)
        return last


    def cooperator(pool, rows):
        # Lets a Cooperator pull the next insert when the previous one is done.
        query = 'INSERT INTO tw (value) VALUES (5)'

        def generator():
            for row in range(rows):
                deferred = pool.runOperation(query)
                deferred.addCallback(lambda _: None)
                yield deferred

        cooperator = task.Cooperator()
        return cooperator.coiterate(generator())


    def run(callable, repeats):
        _removeFile('test-sq3.db3')
        conn = sqlite3.connect('./test-sq3.db3')
        cursor = conn.cursor()
        cursor.execute('CREATE TABLE t (id ROWID, value INTEGER)')
        for step in range(repeats):
            # The original referenced an undefined name, `inserter`, here.
            print('Run #%d %s...' % (step, callable))
            callable(conn, 2000)
        conn.close()
        cursor = None
        conn = None


    def run_twisted(callable, repeats):
        _removeFile('test-twisted.db3')
        pool = ConnectionPool('sqlite3', cp_min=1, cp_max=1,
                              database='test-twisted.db3',
                              check_same_thread=False)
        pool.runOperation('CREATE TABLE tw (id ROWID, value INTEGER)')
        last = None

        @defer.inlineCallbacks
        def execute():
            for step in range(repeats):
                print('Run #%d %s...' % (step, callable))
                last = callable(pool, 2000)
                yield last
            last.addCallback(lambda _: pool.close())
            last.addCallback(lambda _: reactor.stop())

        reactor.callWhenRunning(execute)
        reactor.run()
        last = None
        pool = None
        gc.collect()
        objgraph.show_growth()


    #run(plain_sqlite3, 100)
    #run_twisted(adbapi, 100)
    #run_twisted(inline_callbacks, 100)
    #run_twisted(semaphore, 100)
    run_twisted(cooperator, 100)

    print('Press ENTER to exit...')
    sys.stdin.read(1)
    gc.collect()
    objgraph.show_growth()

A. Plain SQLite3
   Memory: 17 Mb
   Peak memory: 19 Mb

B. Twisted ADBAPI
   Memory: 36 Mb
   Peak memory: 240 Mb

   wrapper_descriptor   1326   +15
   function             2716   +13
   dict                 1895    +8
   getset_descriptor     444    +5
   weakref              1067    +4
   member_descriptor     374    +3
   list                  331    +3
   method_descriptor     700    +1
   classobj              108    +1
   module                165    +1

C. Twisted Inline Callbacks
   Memory: 21 Mb
   Peak memory: 23 Mb
Re: [Twisted-Python] Graceful shutdown of twistd application
On 12:36 pm, sangiova...@nweb.it wrote:
> On Thu, Sep 4, 2014 at 2:02 PM, exar...@twistedmatrix.com wrote:
>> You said before shutdown triggers are too late but you didn't say
>> why. I think that's based on a misunderstanding - but if not, then
>> explain why it doesn't work for your scenario.
>
> Hi, thanks for your reply. I've tried the following:
>
>     def sleep(secs):
>         log.msg('from within trigger')
>         d = defer.Deferred()
>         reactor.callLater(secs, d.callback, None)
>         return d
>
>     reactor.addSystemEventTrigger('before', 'shutdown', sleep, 10)

All 'before' triggers are run concurrently. If you're using
`Application` then your `sleep` trigger runs concurrently with the
application's `stopService` trigger (because `Application` has its
stopService added as another 'before' 'shutdown' trigger alongside
yours).

If you want to delay your application shutdown, you need to cooperate a
little more closely with it. Either attach your application shutdown
code as a callback to the sleep Deferred, or move the sleep into the
stopService implementation of one of the services on your application
and trigger the remaining stopService calls (e.g. the stopService call
on the MultiService you mentioned) when the sleep Deferred fires there.

Jean-Paul
Re: [Twisted-Python] High throughput database logger
On 12:51 pm, adi.libot...@proatria.com wrote:
> Hi,
>
> I'm looking at various options for implementing a high throughput
> database logger that will work with Twisted. My requirements, listed
> by importance:
>
> 1) small memory footprint
> 2) high speed
> 3) low garbage generation
>
> The application I'm working on runs continuously (24/7). I've
> experimented a bit with pysqlite and Twisted to see which approach is
> better suited (see attached example).
>
> Question 1: I noticed that all of the Twisted-based versions are very
> slow compared to the plain sqlite3 test. This seems to be caused by
> atomic transaction management, namely a commit after each insert.

Not only this but in some of the Twisted versions you've introduced a
round-trip communication from the reactor thread to a worker thread
between each operation. This will greatly impact throughput by adding
lots of latency to each insert.

> Would be interested to know if there is a simple way to avoid this and
> do my own transaction management (aka batch commit).

Using twisted.enterprise.adbapi? You could probably hack something
horrible together but it would definitely be a hack. I suggest you take
a look at adbapi2 instead - http://trac.calendarserver.org/wiki/twext.

> One other thing is the greatly varying amounts of garbage generated
> (peak memory) and memory usage between the Twisted variants.

Garbage and peak memory are different things. The Twisted-using version
does a lot more - and some of your Twisted-using versions put the
*entire* data set into memory (in a vastly expanded form, where each
insert is represented by multiple large objects including Deferreds).
So it's not too surprising the memory usage is greater.

> Question 2: I would have expected B (Twisted ADBAPI) to behave very
> similarly to C/E since I'm using a connection pool of size 1 and all
> requests are queued and handled sequentially. Could any of you please
> give me some pointers as to why this is happening?

You didn't actually label the code with these letters. :) I'm guessing
B is the `adbapi` function, C is `inline_callbacks`, and E is
`cooperator`.

Also you didn't say in what respect you expected them to behave
similarly. You expected their memory usage to be the same? You expected
their runtime to be the same? You expected them to put the same data
into the database?

As far as memory usage goes, B uses lots of memory for the same reason
`semaphore` (D?) uses lots of memory. You queue up the entire dataset
in memory as piles of tuples, lists, Deferreds, etc. adbapi might be
executing the operations one at a time, but the *loop* inside `adbapi`
runs all the way to the end all in one go. It starts every one of those
`runOperation`s before any of them (probably) has a chance to execute.

> Question 3: Even though objgraph lists the exact same reference count
> once the code has run, the amount of used memory greatly differs. Any
> ideas what might be causing this?

Hopefully the above helps explain this.

Something else you might consider is batching up your inserts (not just
only committing after a batch of inserts). Since SQLite3 can only write
from a single thread at a time, you're effectively limited to
serialized inserts - so it doesn't make sense to try to start a second
insert before the first has finished. When the first finishes, if 50
more data points have arrived, you should do one insert for all 50 of
those - not 50 inserts each for one piece of data. This cuts off a
bunch of your overhead - Python objects, round-trip latency for
inter-thread communication, function calls, etc.

Jean-Paul

> Any suggestions and/or pointers on how to improve/do this are more
> than welcome.
>
> Thank you for your time,
> Adrian
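The batching idea can be illustrated with plain sqlite3, leaving Twisted out entirely. In this minimal sketch the class `BatchingLogger` and its methods are hypothetical names, the table layout is simplified, and `executemany` stands in for "one insert for all 50"; a single multi-row `INSERT ... VALUES (...), (...)` statement would be another way to do it.

```python
import sqlite3


class BatchingLogger(object):
    """Hypothetical helper: buffer rows, then write each batch in one go."""

    def __init__(self, conn, table="tw"):
        self._conn = conn
        self._insert = "INSERT INTO %s (value) VALUES (?)" % (table,)
        self._buffer = []

    def log(self, value):
        # Cheap: only appends to a list, no database work happens yet.
        self._buffer.append((value,))

    def flush(self):
        """Write everything buffered so far: one call, one commit."""
        if not self._buffer:
            return 0
        batch, self._buffer = self._buffer, []
        self._conn.executemany(self._insert, batch)
        self._conn.commit()
        return len(batch)


# Usage: 50 data points arrive, then a single flush writes all of them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tw (id INTEGER PRIMARY KEY, value INTEGER)")
logger = BatchingLogger(conn)
for value in range(50):
    logger.log(value)
written = logger.flush()
```

In a Twisted program the flush step would have to run off the reactor thread and be started again only once the previous flush has finished, so that at most one write is in flight and everything that arrived in the meantime goes into the next batch.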