Re: [Twisted-Python] Twisted Plugins - Implementation Discussion
On Apr 6, 2011, at 8:35 PM, Stephen Thorne wrote:

> Part of the discussion was about how to rewrite this in such a way that no python code needs to be run in order to discover all the tapname+description combinations that are available to twistd, this is because of a perceived performance and sanity deficit in using 'twistd'.

My interest in this discussion is not so much in "no python code should be executed" but rather "the current constraints of the system should be preserved (your whole package doesn't get imported) but you shouldn't have to write hacks like ServiceMaker (http://twistedmatrix.com/documents/11.0.0/api/twisted.application.service.ServiceMaker.html) to preserve them".

Or, for that matter, do inner imports, like this one from your example:

    def makeService(self, options):
        from examplepackage.examplemodule import make_service
        return make_service(debug=options['debug'])

Someone unfamiliar with the Twisted plugin system would probably not realize that the positioning of that import is critically important. It seems kind of random, and maybe sloppy, and a refactoring for stylistic fixes might move it to the top of the module.

Of course, such a refactoring would make 'twistd --help' on any system with your code installed start executing gobs and gobs of additional code. Also, as a result of such a change, every 'twistd' server on such a system would have your entire examplepackage.examplemodule imported, silently of course, increasing their memory footprint and so on.

As I have mentioned in other parts of this mailing list thread, there's already some caching going on, but it's never used. Observe:

    glyph@... twisted/plugins$ python
    Python 2.6.1 (...)
    >>> from cPickle import load
    >>> plugins = load(file('dropin.cache'))
    >>> plugins['twisted_names'].plugins
    [<CachedPlugin 'TwistedNames'/'twisted.plugins.twisted_names' (provides 'IPlugin, IServiceMaker')>]
    >>> plugins['twisted_names'].plugins[0].name
    'TwistedNames'
    >>> plugins['twisted_names'].plugins[0].description
    '\nUtility class to simplify the definition of L{IServiceMaker} plugins.\n '
    >>> plugins['twisted_names'].plugins[0].provided
    [<InterfaceClass twisted.plugin.IPlugin>, <InterfaceClass twisted.application.service.IServiceMaker>]
    >>> import sys
    >>> 'twisted.plugins' in sys.modules
    False

The problem with this is that once you've loaded the plugins, you can't see it any more:

    >>> from twisted.plugin import getPlugins
    >>> from twisted.application.service import IServiceMaker
    >>> allPlugins = list(getPlugins(IServiceMaker))
    >>> plugin = [p for p in allPlugins if p.tapname == 'dns'][0]
    >>> plugin.description
    'A domain name server.'
    >>> plugin.name
    'Twisted DNS Server'

Those are the 'name' and 'description' attributes from the IServiceMaker provider, already implicitly loaded by getPlugins. You can't see the CachedPlugin any more.

So, here's an idea, very similar to the one on the ticket. Keeping in mind the state described above, hopefully it will communicate my idea better.

Right now, IPlugin is purely a marker. It provides no methods. I propose a new subinterface (designed to eventually replace it), IPlugin2, with one method, 'metadata()', that returns a dictionary mapping strings to strings. This _could_ be any object, limited only by what we think is a good idea to allow serializing. The second method would be 'willProvide(I)' which returns a boolean, whether the result of load() will provide the interface 'I'.
Then there's a helper which you inherit which looks like:

    class Plugin2(object):
        implements(IPlugin2)

        def metadata(self):
            raise NotImplementedError("your metadata here")

        def willProvide(self, I):
            return I.providedBy(self)

        def load(self):
            return self

The one rule here is that 'metadata()' must always return the same value for a particular version of the code.

We will then serialize the metadata from calling metadata() into dropin.cache, and expose it to application code.

My idea for exposing it is that if you then do 'getPlugins(IPlugin2)', you will get back an iterable of IPlugin2 providers, but not necessarily instances of your classes: they could be cached plugins, with cached results for metadata() and willProvide() - the latter based on the list currently saved as the 'provided' attribute. So a loop like this to load a twistd plugin by name:

    def twistdPluginByTapname(name):
        for p2 in getPlugins(IPlugin2):
            if p2.willProvide(IServiceMaker) and p2.metadata()['tapname'] == name:
                return p2.load()

... would not actually load any plugins, but work entirely from the cached metadata.

Since you wouldn't be loading the plugin except to actually invoke its dynamic behavior, we would no longer need ServiceMaker, just an instance of the actual IServiceMaker plugin, with no local imports or anything.

This would at least partially address one of your complaints, Stephen, in that it would mean that a plugin could be defined with 2 lines:
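The message is cut off at this point in the archive, so the two-line plugin definition it refers to is not shown. Separately from that, the cached lookup described above might look roughly like the following sketch. It is an illustration under stated assumptions only: `CachedPlugin2` and `plugin_by_tapname` are hypothetical names, nothing here exists in Twisted, interfaces are represented by plain strings rather than zope.interface objects, and a standard-library module stands in for a real plugin module.

```python
import importlib


class CachedPlugin2(object):
    """Hypothetical stand-in built from cache data (not a Twisted class).

    It answers metadata() and willProvide() from stored values and only
    imports the real module when load() is called.
    """

    def __init__(self, module_name, attr_name, metadata, provided_names):
        self._module_name = module_name
        self._attr_name = attr_name
        self._metadata = dict(metadata)
        self._provided = frozenset(provided_names)

    def metadata(self):
        # Return a copy so callers cannot mutate the cached values.
        return dict(self._metadata)

    def willProvide(self, interface_name):
        # Assumption: interfaces are compared by name to stay stdlib-only.
        return interface_name in self._provided

    def load(self):
        # The only place where plugin code is actually imported.
        module = importlib.import_module(self._module_name)
        return getattr(module, self._attr_name)


def plugin_by_tapname(cached_plugins, name, interface_name="IServiceMaker"):
    """Find a plugin using cached data only, then load just that one."""
    for p2 in cached_plugins:
        if p2.willProvide(interface_name) and p2.metadata().get("tapname") == name:
            return p2.load()
    return None


# Usage: the second entry names a module that does not exist; because its
# cached tapname does not match, it is never imported.
cached = [
    CachedPlugin2("json", "dumps", {"tapname": "dns"}, ["IServiceMaker"]),
    CachedPlugin2("no_such_module_xyz", "x", {"tapname": "web"}, ["IServiceMaker"]),
]
maker = plugin_by_tapname(cached, "dns")
```

The point of the design is that selection happens entirely on cached values, so the cost of importing is paid for one plugin rather than for all of them.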
Re: [Twisted-Python] Twisted Plugins - Implementation Discussion
On 04/07/2011 02:08 PM, Tim Allen wrote:

> If you need a non-Turing-complete config language and rule out .ini and XML, I'm not sure what's left. JSON, perhaps.

Having had experience with JSON for configuration: it is a terrible format for configuration, if only because it does not support comments. The syntax is also a bit too strict: enough to be annoying in something you want to edit all the time and easily, in my experience.

cheers,

David

___
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python
Re: [Twisted-Python] Twisted Plugins - Implementation Discussion
On Thu, Apr 07, 2011 at 03:24:57PM +0900, David wrote:

> On 04/07/2011 02:08 PM, Tim Allen wrote:
> > If you need a non-Turing-complete config language and rule out .ini and XML, I'm not sure what's left. JSON, perhaps.
>
> Having had experience with JSON for configuration: it is a terrible format for configuration, if only because it does not support comments. The syntax is also a bit too strict: enough to be annoying in something you want to edit all the time and easily in my experience.

Well, that's pretty depressing. The only other candidate I can even think of is YAML, and that's not in the standard library (as far as I know).

Who'd have guessed it'd be so complicated to associate keys with values?
Re: [Twisted-Python] Twisted Plugins - Implementation Discussion
On 04/07/2011 03:34 PM, Tim Allen wrote:

> Who'd have guessed it'd be so complicated to associate keys with values?

If that's the only thing you need, .ini would work fine. Another solution would be python files with only literals, parsed through the ast module for safety.

cheers,

David
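A minimal sketch of the "python files with only literals" idea, assuming a config file made of plain `name = literal` lines. The function name `load_literal_config` is made up for illustration; `ast.parse` and `ast.literal_eval` are the standard-library calls doing the work, and nothing in the file is ever executed.

```python
import ast


def load_literal_config(source):
    """Parse 'name = literal' lines without executing any code."""
    tree = ast.parse(source, mode="exec")
    config = {}
    for node in tree.body:
        simple_assignment = (
            isinstance(node, ast.Assign)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
        )
        if not simple_assignment:
            raise ValueError("only simple 'name = literal' lines are allowed")
        # literal_eval raises ValueError for anything that is not a literal
        # (function calls, attribute access, names, and so on).
        config[node.targets[0].id] = ast.literal_eval(node.value)
    return config


example = """
# comments are allowed, unlike in JSON
port = 8080
hosts = ['alpha', 'beta']
"""
settings = load_literal_config(example)
```

Such a file keeps Python's comment syntax and tolerant literals while staying non-Turing-complete from the loader's point of view.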
Re: [Twisted-Python] Instrumenting Reactors
On 6 April 2011 10:55, Paul Thomas <spongelavap...@googlemail.com> wrote:

> Should I just hack into the reactor somewhere? Or is there something sitting in a library I haven't seen that will help with this?

You can time blocking calls by instrumenting twisted.python.log.callWithContext and you could try writing the timing info to something fast, like Redis.

Michael
Re: [Twisted-Python] Asynchronous context in Twisted
Something like this would be awesome to have in Twisted.
Re: [Twisted-Python] Twisted Plugins - Implementation Discussion
On 4/7/11 8:24 AM, David wrote:

> Having had experience with JSON for configuration: it is a terrible format for configuration, if only because it does not support comments. The syntax is also a bit too strict: enough to be annoying in something you want to edit all the time and easily in my experience.

I agree. We use json as a config-file format from time to time, but it always ends up hurting you. I therefore hacked up this little library: https://github.com/edgeware/structprop
Re: [Twisted-Python] Twisted Plugins - Implementation Discussion
On Thu, Apr 7, 2011 at 2:34 AM, Tim Allen <t...@commsecure.com.au> wrote:

> Well, that's pretty depressing. The only other candidate I can even think of is YAML, and that's not in the standard library (as far as I know).

There's Coil, but it's also not in the std lib AFAIK: http://mike.marineau.org/coil/

Jason

--
Jason Rennie
Research Scientist, ITA Software
617-714-2645
http://www.itasoftware.com/
Re: [Twisted-Python] Twisted Plugins - Implementation Discussion
On Apr 7, 2011, at 1:08 AM, Tim Allen wrote:

> If you need a non-Turing-complete config language and rule out .ini and XML, I'm not sure what's left. JSON, perhaps.

I bet a lot of people have a deja-vu feeling about a config file syntax debate so I'll propose an alternative approach: RDF.

Perhaps most people in this community will not like it, yet some might find it more fun than revamping their 2002 arguments about merits and pitfalls of various syntaxes.

One of the reasons why I like RDF so much is that I can focus on what I need to express and let people pick the serialization syntax that better suits their mood, habits, tools and use-cases.

I know that the use-case that's being discussed is slightly different (config files for the plugin system as opposed to config files for a specific plugin) but as an example: my twistd-plugin-driven webserver will gladly accept any of the attached configuration files, they are equivalent and there are commonly available tools to switch back and forth, including pure python ones.

It could as well accept any other standard RDF serialization syntax, for example there are several other XML formats, a line-based grep-friendly syntax (NTriples) and a JSON format.

The code that parses this and turns it into running twisted Services and web applications is about the same size as your average TAC file. If anybody wants to see it please email me privately, I'm not proud enough of other parts of my open source project containing it to advertise it on this list.

Other than mentioning that RDF also comes with a standard query and update language and protocol (SPARQL), I won't enumerate other advantages here so let's see what some of the drawbacks (and their counter-arguments) are:

1) it's not widely known yet (but so was XML in 2000 and JSON in 2002 and INI in 2011)

2) it would require adding a dependency on an RDF parser (people often argued the same way about XML, remember when libxml2 became a Gnome dependency? RDF is now becoming a requirement of Gnome and KDE...)

3) it's not python (yet the several python object-RDF-mapper libraries available seem to me much easier to use and way simpler than SQLAlchemy so I already switched from pickle to RDF whenever I want to serialize some object graph, BTW it's also safer and hand-editable)

Sorry if I went too off-topic,

ciao
ste

    @prefix : <http://example.com/twisted#> .

    [ a :Server;
      :binder [ :port 80; :ip "0.0.0.0" ];
      :binder [ :port 443; :ip "0.0.0.0"; :pem "server.pem" ];
      :app [ :description "Some app";
             :resource "someapp.SomeApp";
             :path "/someapp";
             :users "authdomains/someapp" ];
      :app [ :description """My tiny little app...
    ...with verbose multi-line description text!
    """;
             :resource "myapp.MyRootApp";
             :path "/";
             :users "authdomains/myapp" ];
    ].

    [ a :Server;
      :binder [ :port 2020; :ip "127.0.0.1" ];
      :app [ :resource "anotherapp.AnotherApp";
             :path "/"; ];
    ].

    <?xml version="1.0" encoding="utf-8"?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns="http://example.com/twisted#">
      <Server>
        <app>
          <rdf:Description>
            <description>Some app</description>
            <path>/someapp</path>
            <resource>someapp.SomeApp</resource>
            <users>authdomains/someapp</users>
          </rdf:Description>
        </app>
        <app>
          <rdf:Description>
            <description>My tiny little app...
    ...with verbose multi-line description text!
    </description>
            <path>/</path>
            <resource>myapp.MyRootApp</resource>
            <users>authdomains/myapp</users>
          </rdf:Description>
        </app>
        <binder>
          <rdf:Description>
            <ip>0.0.0.0</ip>
            <port rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">80</port>
          </rdf:Description>
        </binder>
        <binder>
          <rdf:Description>
            <ip>0.0.0.0</ip>
            <pem>server.pem</pem>
            <port rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">443</port>
          </rdf:Description>
        </binder>
      </Server>
      <Server>
        <app>
          <rdf:Description>
            <path>/</path>
            <resource>anotherapp.AnotherApp</resource>
          </rdf:Description>
        </app>
        <binder>
          <rdf:Description>
            <ip>127.0.0.1</ip>
            <port rdf:datatype="http://www.w3.org/2001/XMLSchema#integer">2020</port>
          </rdf:Description>
        </binder>
      </Server>
    </rdf:RDF>
Re: [Twisted-Python] Twisted Plugins - Implementation Discussion
Itamar Turner-Trauring wrote:
[…]
> So, the design has to *not* rely on caching working.

FWIW: this is an achievable goal. I have 32 different bzr plugins currently installed, and here's the difference they make:

    $ time bzr --no-plugins rocks
    It sure does!

    real    0m0.075s

    $ time bzr rocks
    It sure does!

    real    0m0.119s

So that's about 1.5ms per plugin, on average. With a hot disk cache, at least…

For comparison, 'twistd --version' takes 116ms, with a dropin.cache and (I think, although how can I tell?) no plugins installed.

In part, we achieve this via the bzrlib.lazy_import hack, which plugins can and often do use, and by encouraging plugin authors to put as little code into their __init__.py files as possible. A typical plugin's __init__ might do just:

    # This is example_plugin/__init__.py
    # The actual command implementation is in
    # example_plugin/example_commands.py
    from bzrlib import commands
    commands.plugin_cmds.register_lazy('cmd_class_name', [],
        'bzrlib.plugins.example_plugin.example_commands')

Glyph's expressed scepticism that plugin authors and maintainers will know to keep their __init__.py files cheap to import. Bazaar's experience is different. Partly that's probably because the Bazaar community has paid a fair bit of attention to start up time and I suppose Twisted doesn't have that. But I think also it's partly because we've provided tools to help people diagnose what/who to blame for bzr being slow to start, like 'bzr --profile-imports', and even the crude 'time bzr rocks'.

-Andrew.
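The general idea behind lazy registration can be sketched with the standard library alone. This is not bzrlib's implementation or its `register_lazy` signature: `LazyRegistry` and its methods are hypothetical names, and `json.dumps` merely stands in for a command class living in an expensive module.

```python
import importlib


class LazyRegistry(object):
    """Hypothetical registry that records where things live, not the things."""

    def __init__(self):
        self._entries = {}  # name -> (module name, attribute name)
        self._loaded = {}   # name -> imported object

    def register_lazy(self, name, module_name, attr_name):
        # Cheap: stores two strings, imports nothing.
        self._entries[name] = (module_name, attr_name)

    def names(self):
        # Listing what is available never triggers an import.
        return sorted(self._entries)

    def get(self, name):
        # The import happens on first use and is remembered afterwards.
        if name not in self._loaded:
            module_name, attr_name = self._entries[name]
            module = importlib.import_module(module_name)
            self._loaded[name] = getattr(module, attr_name)
        return self._loaded[name]


registry = LazyRegistry()
registry.register_lazy("dump", "json", "dumps")
# A broken or slow module costs nothing until somebody asks for it.
registry.register_lazy("broken", "no_such_module_xyz", "x")
available = registry.names()
dump = registry.get("dump")
```

With this shape, a plugin's `__init__.py` only needs the registration call, which is why it stays cheap to import.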
[Twisted-Python] WaitForMultipleObjects socket limitation
Hi,

I've made a proof of concept for asynchronous console input on Windows [1] and now I am trying to understand the limits of the WaitForMultipleObjects API I've used. The documentation on win32eventreactor mentions a limit of 64 objects:
http://twistedmatrix.com/documents/11.0.0/api/twisted.internet.win32eventreactor.htm

However, it is completely opaque what these objects are. For console handles and process handles it is quite obvious, but not for sockets. Is 64 the limit for the total number of sockets opened on different ports? Is 64 the limit for connections made to a socket on a specified port?

1. http://techtonik.rainforce.org/2011/03/asynchronous-input-from-windows-console.html

--
anatoly t.
Re: [Twisted-Python] Twisted Plugins - Implementation Discussion
On Apr 7, 2011, at 8:14 AM, Itamar Turner-Trauring wrote:

> On Thu, 2011-04-07 at 02:08 -0400, Glyph Lefkowitz wrote:
> > My idea for exposing it is that if you then do 'getPlugins(IPlugin2)', you will get back an iterable of IPlugin2 providers, but not necessarily instances of your classes: they could be cached plugins, with cached results for metadata() and willProvide() - the latter based on the list currently saved as the 'provided' attribute. So a loop like this to load a twistd plugin by name:
> >
> >     def twistdPluginByTapname(name):
> >         for p2 in getPlugins(IPlugin2):
> >             if p2.willProvide(IServiceMaker) and p2.metadata()['tapname'] == name:
> >                 return p2.load()
> >
> > ... would not actually load any plugins, but work entirely from the cached metadata.
>
> That's where the whole idea falls down for me. Evidence suggests (and you note this earlier) that caching doesn't work anywhere in the real world. My current Ubuntu install complains about a read-only cache every time I run lore (and I'm pretty sure there's nothing added to my PYTHONPATH other than installed system packages). Any design which assumes caching works appears to be useless in the real world. So, the design has to *not* rely on caching working.

Here's an idea: let's make caching actually work :). Prior experience indicates that with some small amount of dedication, it's possible to make a module in Twisted not be broken all the time.

As you observed that I already mentioned earlier in the thread, caching never works because post-installation hooks are such a pain, and you have to have special permissions to access the cache file. So, separately from this, we could attempt a secondary cache read/write to a location much more likely to be writable by the user (something like ~/.local/var/cache/usr_lib_python2.6_site-packages.dropin.cache) read if the first one is out of date and written if writing the first one fails.

Also: we already rely on this behavior, so things are just as broken now for you. For example, you'll end up loading the code for all twistd plugins and trial reporters when what you want are lore plugins. This could also be fixed independently. (To fix your particular installation right now, 'sudo twistd --help' or 'sudo lore' once.)

And, finally, as a separate consideration, we could make "cached" metadata mean "explicitly specified" metadata instead.

The important thing that I'm talking about doing first is making the system work exactly the same way that it does now, with one additional feature in the API which would allow us to make use of metadata that lives outside the Python code, using the existing mechanism for storing metadata that is currently not used. For a first cut, we wouldn't even remove the ServiceMaker hack, just add the new feature to it so that we could do slightly less importing at startup.
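The secondary-cache idea (read the per-user copy if the first is out of date, write it if writing the first fails) could be sketched like this. The helper names `write_cache` and `read_cache`, and the use of a file modification time as the staleness test, are assumptions made for illustration; this is not how twisted.plugin is implemented.

```python
import os
import pickle


def write_cache(data, primary_path, fallback_path):
    """Try the system-wide cache first, then the per-user one.

    Returns the path that was written, or None if both attempts failed.
    """
    for path in (primary_path, fallback_path):
        try:
            directory = os.path.dirname(path)
            if directory and not os.path.isdir(directory):
                os.makedirs(directory)
            with open(path, "wb") as f:
                pickle.dump(data, f)
            return path
        except (IOError, OSError):
            continue  # e.g. read-only site-packages; try the next location
    return None


def read_cache(primary_path, fallback_path, newest_source_mtime):
    """Return the first cache that exists and is not older than the sources."""
    for path in (primary_path, fallback_path):
        try:
            if os.path.getmtime(path) < newest_source_mtime:
                continue  # stale: the plugin files changed after it was written
            with open(path, "rb") as f:
                return pickle.load(f)
        except (IOError, OSError, EOFError, pickle.UnpicklingError):
            continue  # missing, unreadable or corrupt; try the next location
    return None
```

A caller that gets `None` back from `read_cache` would fall through to scanning the plugin modules directly, which is the behaviour the thread describes as the status quo.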
Re: [Twisted-Python] Twisted Plugins - Implementation Discussion
On Apr 7, 2011, at 9:19 AM, Andrew Bennetts wrote:

> Itamar Turner-Trauring wrote:
> […]
> > So, the design has to *not* rely on caching working.
>
> FWIW: this is an achievable goal. I have 32 different bzr plugins currently installed, and here's the difference they make:
>
>     $ time bzr --no-plugins rocks
>     It sure does!
>
>     real    0m0.075s
>
>     $ time bzr rocks
>     It sure does!
>
>     real    0m0.119s
>
> So that's about 1.5ms per plugin, on average. With a hot disk cache, at least…

Is your cache as hot for Twisted as for bzr? Have you replicated these results in a randomized, double-blind clinical trial? ;-)

I'm not surprised that bzr has faster startup though; twistd has not been (and I doubt it ever will be) nearly so ruthlessly optimized. Maybe it's time to put a startup benchmark on http://speed.twistedmatrix.com/, at least that way we could keep track.

> For comparison, 'twistd --version' takes 116ms, with a dropin.cache and (I think, although how can I tell?) no plugins installed.

Twisted itself installs 22 dropins (python files which each define at least one plugin), which comprise 48 plugins of various types, so there are always some.

You should be able to tell, though. It's pathetic that we don't have a command-line tool to inspect the available plugins and what they're doing. Independent of the other issues under discussion here: http://twistedmatrix.com/trac/ticket/5039.

But this is all moot. 'twistd --version' doesn't scan for plugins, so that's all just the normal startup time; apparently we import too much in the first place. The thing to compare with is 'twistd --help' or even just 'twistd [some-plugin]' (since invoking one plugin actually loads all of them).

Plus - this is really the genesis for this thread - the dropin.cache isn't really saving us much work at all right now, because all the plugins get loaded anyway for all practical uses of plugin scanning.

> In part, we achieve this via the bzrlib.lazy_import hack, which plugins can and often do use, and by encouraging plugin authors to put as little code into their __init__.py files as possible. A typical plugin's __init__ might do just:
>
>     # This is example_plugin/__init__.py
>     # The actual command implementation is in
>     # example_plugin/example_commands.py
>     from bzrlib import commands
>     commands.plugin_cmds.register_lazy('cmd_class_name', [],
>         'bzrlib.plugins.example_plugin.example_commands')

This looks very similar to ServiceMaker.

> Glyph's expressed scepticism that plugin authors and maintainers will know to keep their __init__.py files cheap to import. Bazaar's experience is different. Partly that's probably because the Bazaar community has paid a fair bit of attention to start up time and I suppose Twisted doesn't have that.

Yeah, bzr's audience makes this easier. For one thing, the audience is much bigger :), but more importantly, bzr is a user-facing tool which users are running _constantly_ at the command line. The only visible consequence of a rogue twistd plugin is that your server which runs for days at a time takes 0.2s longer to start; the real problem sets in later, where your 25 subprocesses are suddenly consuming an additional 50meg each because of the extra plugin they loaded. You do find this eventually, it's just rare to find it while you're writing the plugin.

> But I think also it's partly because we've provided tools to help people diagnose what/who to blame for bzr being slow to start, like 'bzr --profile-imports', and even the crude 'time bzr rocks'.

Yes. These are a great idea, and there's no excuse that Twisted's plugin system is so difficult to inspect and debug. A couple of good tools would address a wide range of plugin issues, many of them much more interesting than performance, like the ever-popular "why isn't my plugin getting loaded?".

Thanks for the impetus to file the ticket above. (I kinda hope it's a dup, but I couldn't find one.)
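As a rough illustration of the kind of diagnostic discussed here, the sketch below reports how long one import takes and which modules it drags into `sys.modules`. It is a crude stand-in and not a reimplementation of 'bzr --profile-imports' or of anything in Twisted; `import_cost` is a hypothetical helper and `colorsys` is just a small standard-library module used as the example target.

```python
import importlib
import sys
import time


def import_cost(module_name):
    """Import a module and report the time taken and the modules it added."""
    before = set(sys.modules)
    start = time.time()
    importlib.import_module(module_name)
    elapsed = time.time() - start
    # Everything that appeared in sys.modules was pulled in by this import.
    new_modules = sorted(set(sys.modules) - before)
    return elapsed, new_modules


elapsed, pulled_in = import_cost("colorsys")
print("%.1f ms, %d new modules" % (elapsed * 1000, len(pulled_in)))
```

Run against a plugin module, a long list of new modules is a hint that a top-level import should be moved or made lazy. A module that is already loaded reports nothing new, so the measurement is only meaningful in a fresh interpreter.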
Re: [Twisted-Python] Twisted Plugins - Implementation Discussion
On 2011-04-07, Stefano Debenedetti wrote:

> On Apr 7, 2011, at 1:08 AM, Tim Allen wrote:
> > If you need a non-Turing-complete config language and rule out .ini and XML, I'm not sure what's left. JSON, perhaps.
>
> I bet a lot of people have a deja-vu feeling about a config file syntax debate so I'll propose an alternative approach: RDF.

I am +1 on this idea. I like rdf. My question is now: is there an rdf parser lib that is available on python2.4+ which can either be gently embedded within twisted, or used as a dependency?

We don't really need SparQL or anything complicated, just the ability to resolve some simple triples.

I do not like angle brackets, but I have always had a fond affection for n3.

--
Regards,
Stephen Thorne
Development Engineer
Netbox Blue
Re: [Twisted-Python] Twisted Plugins - Implementation Discussion
On 2011-04-07, Glyph Lefkowitz wrote:

> > I am +1 on this idea. I like rdf. My question is now: is there an rdf parser lib that is available on python2.4+ which can either be gently embedded within twisted, or used as a dependency?
>
> You're welcome to try and do this; I'm not particularly interested in blocking it or holding it up, but I don't think that changing the input format actually solves any real problems. I guess I will hold it up if you can't convince me that I'm wrong about that, and demonstrate an actual problem that it solves :-). You still have to define all the same classes in order to get a plugin, unless we change some of that too - which has nothing to do with the metadata format at all.
>
> I think the way to avoid caching issues in general is to generate the packaging metadata from the source earlier in advance (i.e. at development time, and check it in with the source code, like you would do with a Cython-generated C file or something), not to just mess around with it in a text editor.
>
> I think that there is a benefit to sticking with a format that people very much dislike editing. Having separately manually-edited metadata introduces an opportunity for the metadata to diverge from the reality of the code. Making this easy to edit manually means making it more likely that people will think that they need to introduce some manual tweaks. If it's a huge pain to actually generate the metadata without running a tool that inspects the code, it's less likely that someone will feel the need to get clever.

I was just thinking about this. It would be very easy to write a single twisted/plugins/rdf_plugins.py file that scans for non-python metadata defined plugins and creates them.

That way twisted doesn't need to depend on an RDF lib, and this can be a 'third party' outside-of-twisted package that if you want to use, you just specify it as a dependency along with the rest of the things that your project depends on.

... I like this idea for a variety of reasons.

--
Regards,
Stephen Thorne
Development Engineer
Netbox Blue
Re: [Twisted-Python] WaitForMultipleObjects socket limitation
On Thu, Apr 7, 2011 at 1:23 PM, Glyph Lefkowitz <gl...@twistedmatrix.com> wrote:

> On Apr 7, 2011, at 12:16 PM, anatoly techtonik wrote:
> > I've made a proof of concept for asynchronous console input on Windows [1] and now I am trying to understand the limits of WaitForMultipleObjects API I've used. Documentation on win32eventreactor mentions limit for 64 objects:
> > http://twistedmatrix.com/documents/11.0.0/api/twisted.internet.win32eventreactor.htm
> >
> > However, it is completely opaque what these objects are? For console handles and process handles it is quite obvious, but not for sockets. Is 64 the limit for total amount sockets opened on different ports? Is 64 the limit for connections made to a socket on specified port?
> >
> > 1. http://techtonik.rainforce.org/2011/03/asynchronous-input-from-windows-console.html
>
> 64 is the limit for the total number of objects (listening ports, connections to a port, client connections, your console, the waker, serial ports, whatever) that WFMO may wait upon at once.
>
> Put another way, MAXIMUM_WAIT_OBJECTS=64: http://msdn.microsoft.com/en-us/library/ms687025(v=vs.85).aspx

Note that you can wait on more than 64 objects at a time, just not using a single WaitForMultipleObjects call. The MSDN page Glyph pointed out has a little more info.

Kevin Horn
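One common way around the per-call limit is to split the handles into groups of at most 64 and give each group its own wait call (for example, one waiting thread per group). The sketch below shows only the partitioning step in plain Python; `chunk_handles` is a hypothetical helper, the handles are ordinary integers here, and no Windows API is called.

```python
MAXIMUM_WAIT_OBJECTS = 64  # per-call limit quoted in the thread


def chunk_handles(handles, limit=MAXIMUM_WAIT_OBJECTS):
    """Split handles into groups small enough for one wait call each."""
    handles = list(handles)
    return [handles[i:i + limit] for i in range(0, len(handles), limit)]


# 130 pretend handles end up in three groups, each within the limit.
groups = chunk_handles(range(130))
group_sizes = [len(group) for group in groups]
```

How the groups are then waited on (threads, nested waits, or a different mechanism altogether) is a separate design choice that this sketch does not address.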