Re: one word synchronize once more
Greg Ames wrote: please see rev. 558039. requests_this_child does not need to be 100% accurate. the cure below is worse than the disease. Greg

-requests_this_child--; /* FIXME: should be synchronized - aaron */
+apr_atomic_dec32(&requests_this_child); /* much slower than important */

Maybe someone could reconfirm to the list what exactly the disease is? If this counter is not accurate (meaning it may lose increments/decrements) does the world cave in?

Maybe someone could explain how and where this particular variable requests_this_child is used. For example, if it just provides guideline information to allow termination after 1 request, that's less of a problem than if someone configured it to terminate after 2 requests. Being 0.0001% out is less of a problem than 100% out. For example, if the counter is used in some way to allow apache to restart/logrotate, and it does not come back to zero reliably, then apache will remain hung trying to shut down (as is often the case in my real world experience); the disease would then be a hung webserver during shutdown. This is much more fatal. I'm not suggesting this particular counter requests_this_child is a direct cause of hung apache instances, but I'm asking those that understand the disease(s) if they could explain exactly what they are.

I'm not familiar with the apr_atomic_dec32() API; is this correctly optimized down to a single asm instruction, lock decl 0xaddr32, on Intel IA32? How many clock cycles do you think that operation takes, and what technical understanding makes you think such a cure should be utilized with caution? My understanding of the cure is that it's very, very light weight; so light weight I'd challenge you to prove the mythical performance issue in the situation it's being used in apache.
To give this some weight, I was recently involved in a multi-threaded ethernet packet processing application dealing with hundreds of megabits, and each packet that arrived updated multiple counters using the IA32 LOCK prefix, yet there was zero noticeable contention. Indeed this is the purpose of the assembly primitive: to implement in hardware something that is most efficiently done there. Darryl
Re: one word synchronize
sebb wrote: On 14/06/07, Dmytro Fedonin [EMAIL PROTECTED] wrote: Looking through 'server/mpm/worker/worker.c' I have found such a combination of TODO/FIXME comments:

1) /* TODO: requests_this_child should be synchronized - aaron */ if (requests_this_child <= 0) {
2) requests_this_child--; /* FIXME: should be synchronized - aaron */

And I can not see any point here. These are one word CPU operations, thus there is no way to preempt inside this kind of operation. So, one CPU is safe by the nature of the basic operation. If we have several CPUs they will synchronize caches anyway, thus we will never get an inconsistent state here. We can only lose time trying to synchronize it in code. Am I not right?

The decrement operation is a read-modify-write cycle; it is possible for 2 CPUs to overlap their operations, ending up with an observable lost decrement, since they both end up reading the same initial value.

On IA32/x86 the DEC instruction can be given the LOCK prefix; this makes the CPU continue to assert memory bus locking for the duration of the instruction, so there is no way for CPU2 to perform a read access until CPU1 releases control of the memory bus when it completes the instruction. This is effectively what atomic_dec() enforces. The amount of performance lost by using atomic_xxx() really is minimal; with any luck it might only be that cache-line that remains locked, not the entire memory bus.

The decrement operation may be handled as load, decrement, store on some architectures, so it can be pre-empted by a different CPU.

There is no other way to handle it :) Memory itself can't perform arithmetic operations, so the decrement always happens inside the ALU inside the CPU. It is true that non-SMP aware CPUs might maintain memory bus acquisition during the 'decrement' (aka modify) phase of the operation, since there is no reason to give it up when they are the only user of memory.
This becomes a performance bottleneck for any SMP capable CPU which has a cache that can operate at full CPU clock speed: the 'decrement' (aka modify) phase is going to require at least 1 clock cycle to perform, so why not let another CPU make use of the memory bus in the meantime.

Also some hardware architectures (e.g. HP Alpha) have an unusual memory model. One CPU may see memory updates in a different order from another CPU. Software that relies on the updates being seen across all CPUs must use the appropriate memory synchronisation instructions. I don't know if these considerations apply to this code.

Memory update ordering applies when considering how 2 or more distinct machine words are updated with respect to each other, as observed from another CPU. The concern here is with a single machine word being updated on SMP systems. Darryl
Re: Creating a thread safe module and the problem of calling of 'CRYPTO_set_locking_callback' twice!
William A. Rowe, Jr. wrote: Darryl Miles wrote: Your thinking is correct, there is a problem. Those OpenSSL functions are not documented in my man page but exist in the library. Yes, there is a read-test-write race window by using those APIs alone.

Nope. This is set when the server process is running in single process, single thread mode, long before the server 'opens up' and spawns off its worker threads.

Understood on the single thread situation. But what about module initialization order guarantees and possible future modules which may also use OpenSSL? The goal is to ensure the CRYPTO_() init functions are called, and called exactly once before OpenSSL's first use, and only when OpenSSL is being used:

* mod_core (no mod_ssl, no mod_frank: no OpenSSL users, no need to do anything)
* mod_core + mod_ssl (no mod_frank)
* mod_core + mod_frank (no mod_ssl)
* mod_core + mod_ssl + mod_frank
* mod_core + mod_frank + mod_ssl (opposite initialization order)

There may not be worker threads, and apache-httpd may not support dynamic runtime loading/unloading of modules (at this time), but it's possible to deal with all those concerns cleanly.

What I DO agree with is that these callbacks should be locked in much earlier than post_config. I'm happy to see these callbacks locked in at the time we register the module itself. Is the module registration order the same for a given canonical configuration (i.e. is it not subject to httpd.conf LoadModule directive ordering)? Also, can one module ask the core what other modules are loaded, and ask them whether they too are users of OpenSSL? I.e. whatever system is used to address this problem would need to deal with the possible existence of a future mod_frank_v2 (which does not exist today but may in the future) in a way that mod_ssl does not need to be recompiled or upgraded to know about mod_frank_v2 when the time comes to release mod_frank_v2. Darryl
Re: Creating a thread safe module and the problem of calling of 'CRYPTO_set_locking_callback' twice!
Frank wrote: Joe Orton wrote: On Wed, Dec 06, 2006 at 06:20:55PM +, Darryl Miles wrote: [...] Is there an API to get the current value? Yes, CRYPTO_get_locking_callback/CRYPTO_get_id_callback. [...]

I already know that these functions exist. But what if my module gets inited before mod_ssl, which doesn't use the get-functions to determine that something is already there? I was hoping to see a clean general purpose solution. :-)

Your thinking is correct, there is a problem. Those OpenSSL functions are not documented in my man page but exist in the library. Yes, there is a read-test-write race window by using those APIs alone.

After this long and informative discussion I really think there is need for a ssl_thread_init_if_not_already_done inside Apache. (Besides the correction/removal of OpenSSL's stupid global locking mechanism.)

I disagree. I should not need to compile apache mod_core in a way that has to understand the quirks of OpenSSL, so adding a function ssl_thread_init_if_not_already_done() to the apache core is too specific to that one problem. I'm saying it's possible to solve a bunch of problems like this in a more generic way. This helps you and the next man.

Maybe there is some (small) re-design of the Apache code needed?

Agreed, something needs to be added. I'm saying there is no need to make it specific to OpenSSL. Serializing the initialization can be made generic such that these objectives are met:

* When building Apache mod_core with minimal functionality it is not necessary to build it with a special API function to handle OpenSSL. We get the co-ordination mechanism by default.
* I can build my mod_ssl or mod_frank at any point later without having to rebuild apache mod_core as well.
* It provides a generic mechanism which can be used to solve the same problem wherever a common subordinate library is linked with by a module but is not required by mod_core.
* The overhead of adding the handful of new functions to apache mod_core should be low.
Subjective but...

* Because the solution is generic it can be reused for other purposes where one module has to co-ordinate with another module at runtime over the use of a shared resource, but due to the modular nature of apache it's impossible to tell if/when/what other modules need to co-ordinate their behavior until runtime.
* Only those modules that use OpenSSL directly need to be linked against OpenSSL.
* Only those modules that link with OpenSSL need OpenSSL around at build time. So the helper functions below would be repeated within each mod_ that uses OpenSSL.

Those are rough concerns to address. All Apache should provide is an accessible keyed namespace to attach arbitrary pieces of memory, where the attachment functions provide some thread-safe guarantees. Maybe there is already a mechanism to achieve most of that within apache. All these functions must be thread-safe with respect to themselves.

The key itself could be a string or an integer. Or even both, if you wanted to abstract the key the way X11 atoms work: you look up/create your key independently, get back a reference (usually a cardinal integer starting at 1), and then use this API with that reference rather than the key directly.

* void *get(key) gets the (void *) key value.
* void *attach(key, newdata) overrides the current key value with an optional exchange (by returning the old value).
* void *attach_smart_exchange(key, newdata) does an exchange but will only exchange a NULL for non-NULL or a non-NULL for NULL. This is used to set without overwriting. It always returns the pre-existing value; by checking that value in the context of how you called it, you can work out whether you were successful.
Sample hypothetical usage in self-documenting non-compilable code:

struct openssl_init_struct {
    long magic; /* do versioning here */
    int boolean_flag;
    mutex_t mutex;
} *ois;

ois = calloc_from_global_pool(1, sizeof(struct openssl_init_struct));
ois->magic = OPENSSL_INIT_STRUCT_V1; /* magic constant */
ois->boolean_flag = FALSE;
mutex_init(&ois->mutex);
/* as first use we lock before attachment */
mutex_lock(&ois->mutex);

const char *label_key = "mylabel_openssl_initialization_serializer"; /* could be integer based key system */

/* This ap_newapi_attach_smart_exchange function is a thread-safe atomic
   attachment of mydata to the global namespace; it will only attach if the
   current value for the key does not exist or is NULL already, which are
   essentially treated identically (does not exist vs NULL value) */
void *retval = apache_core_handle->ap_newapi_attach_smart_exchange(label_key, ois);
if (retval == NULL) {
    /* we won the attachment; we already locked it and we own that lock,
       fall through to the code below */
} else {
    free_from_global_pool(ois); /* FIXME: destruct mutex etc.. properly */
    /* we lost, go get the winner, you can optimize ... */
}
Re: Creating a thread safe module and the problem of calling of 'CRYPTO_set_locking_callback' twice!
Frank wrote: William A. Rowe, Jr. wrote: Nick Kew wrote: [...] An SSL_CTX can't be cross-threaded. If the scope of use of that CTX is restricted to one thread at a time, then yes, OpenSSL has been threadsafe for a very very long time.

You mean if I were able to create one SSL_CTX for every thread then I do not have to use both of the thread-safe-maker callbacks? I don't think this is true. But correct my understanding too if I am wrong.

Cross-threaded might confuse someone into thinking there may be some apartment threading rules to obey; there aren't. An SSL * can't have a method invoked on the same instance at the same time. So long as you serialize your method calls (the SSL_() family) to that same instance, any thread can call that method. It is unusual to need to do so.

But SSL_CTX * is the template context specifically designed to be shared and used across multiple threads if need be, providing you make correct use of 'CRYPTO_set_locking_callback' and 'CRYPTO_set_id_callback' and friends as part of your application initialization. This allows for (amongst other things) the obviously parallel usage of SSL_new(SSL_CTX *) when creating new connections.

Maybe the openssl-users list would be a better place for assistance. Darryl
Re: Creating a thread safe module and the problem of calling of 'CRYPTO_set_locking_callback' twice!
Nick Kew wrote: Unless OpenSSL nomenclature is rather confusing here, an SSL_CTX sounds like the kind of thing you would instantiate per-connection or per-request. Does your module act on a request or a connection?

Maybe a bit of background reading and examination of reference implementations would be a better help for you right now.

SSL_CTX_new(3): SSL_CTX_new - create a new SSL_CTX object as framework for TLS/SSL enabled functions
SSL_new(3): SSL_new - create a new SSL structure for a connection

The SSL_CTX is a template/configuration holder to stamp out your connection instances from. This saves configuring certificates, cipher specs, etc. for every connection. Darryl
Re: vote on concept of ServerTokens Off
Jeff Trawick wrote: I know... that's why I asked :) We're up to two great answers to disable some output from the server that isn't required by the HTTP protocol anyway: 1) modify the source 2) install third-party module ROFL. Please add to the list: 3) Start a new apache-httpd fork. apache-phewbits :D
Re: Creating a thread safe module and the problem of calling of 'CRYPTO_set_locking_callback' twice!
Frank wrote:

EVP_CIPHER_CTX ctx;
EVP_CIPHER_CTX_init(&ctx);
EVP_EncryptInit(&ctx, EVP_bf_cbc(), key, iv);
EVP_EncryptUpdate(&ctx, outbuf, &olen, inbuff, n);
EVP_EncryptFinal(&ctx, outbuf + olen, &tlen);

Because 'EVP_CIPHER_CTX_init' is 'slow', I want to call it once! (Yes! I can call it for every request and then (I think) I am on the safe side, but I do not want this because there are MANY requests!) So my code has to be thread safe, as Apache might be compiled with thread support! To make it thread safe http://www.openssl.org/docs/crypto/threads.html told me:

These functions are generally thread safe AFAIK, so long as you are not using an engine/crypto card or something shared. If it's just a basic memory-in, memory-out transform it should be pretty clean of locking. It's only stuff like the SSL session cache and SSL_CTX and registration schemes (cipher registration, digest registration, etc...) which use locking, as they are shared concurrently. But I understand your point. It's a necessity because the API contract says that is what you must do.

OpenSSL can safely be used in multi-threaded applications provided that at least two callback functions are set. This means the two functions 'CRYPTO_set_locking_callback' and 'CRYPTO_set_id_callback'! These two functions are being called from mod_ssl by the ssl_init_Module function (via ssl_util_thread_setup, which creates some thread mutexes and calls both functions) without testing whether they have already been called or not. My question is: How does this interfere with my module? How can I ensure that only one of us (mod_ssl or my module) is calling these two functions? I cannot believe that there is no problem when my module creates some thread mutexes and mod_ssl does it too...

Of course it is necessary to arbitrate the loading and unloading of the OpenSSL libraries. Where loading really means the initial configuration that OpenSSL requires to initialize itself before it achieves maximal thread-safety.
This is true of any programming paradigm. If something is thread-unsafe until you configure it correctly to be thread-safe, you have to serialize all usage up to the point it becomes safe. This includes calling the functions that initialize the safety mechanism. You also can't go and change the locking policy while there may be one or more active users of the old/existing policy. From a practical standpoint it's impossible to tell how many users there are once the first user gets stuck into application work. So you are correct, you can't blindly go overwriting the policy with CRYPTO_set_locking_callback() to point to your mutexes.

P.S.: I still think there is need for a test routine like 'ssl_is_thread_safe_maker_on()'.

Agreed. The problem is reduced to serializing the setting of the locking policy, which is reduced to the apache http server framework providing a non-performance-critical locking mechanism and a boolean flag. Then each mod_ user locks that external lock and checks the external flag; if the flag is not set, it configures the locking policy in OpenSSL, sets the flag and unlocks. The apache framework needs to be the arbitrator here, by holding the lock and flag in a namespace accessible by all modules. It does not matter which module wins and sets the policy, just so long as a policy is set up. But it would be helpful if both mod_ssl and mod_frank would agree on the same policy (OpenSSL kind of gives you a few options here). Darryl
Re: Creating a thread safe module and the problem of calling of 'CRYPTO_set_locking_callback' twice!
Joe Orton wrote: What I do with OpenSSL in neon is to check that the existing callback is NULL before registering a new callback; and likewise to check that the ID callback is the one neon previously registered before un-registering it later. If everybody did that it would be relatively safe.

Is there an API to get the current value?

It shouldn't be too hard to attach a piece of memory to a keyed namespace which the core module maintains. 3 primitives should do the trick: data = get(key), set(key, data), set_if_null(key, data), where the set_if_null() operation is the especially thread-safe one. This would be little impact/bloat on the core module, but be user extendable to allow other such things. Someone just needs to maintain an assigned key value list. A notice board for modules to talk to each other.

The OpenSSL guys have actually obviated the ID callback for some future release; it was entirely unportable because of the cast-to-long issue anyway, but the locking callback remains.

(void *) in the latest releases AFAIK. This is to address this very concern. Darryl
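A rough sketch of such a notice board: a string-keyed table of (void *) values behind one mutex, where set_if_null() is the "first caller wins" primitive. The fixed-size table and all names are illustrative only, not a proposed ABI:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MAX_KEYS 16

/* Illustrative keyed namespace: zero-initialized table plus one lock. */
static struct { const char *key; void *data; } board[MAX_KEYS];
static pthread_mutex_t board_lock = PTHREAD_MUTEX_INITIALIZER;

/* Find or create the slot for key; caller must hold board_lock. */
static int slot_for(const char *key)
{
    for (int i = 0; i < MAX_KEYS; i++)
        if (board[i].key && strcmp(board[i].key, key) == 0)
            return i;
    for (int i = 0; i < MAX_KEYS; i++)
        if (!board[i].key) { board[i].key = key; return i; }
    return -1;  /* table full */
}

static void *board_get(const char *key)
{
    pthread_mutex_lock(&board_lock);
    int i = slot_for(key);
    void *d = (i >= 0) ? board[i].data : NULL;
    pthread_mutex_unlock(&board_lock);
    return d;
}

static void *board_set(const char *key, void *data)
{
    pthread_mutex_lock(&board_lock);
    int i = slot_for(key);
    void *old = (i >= 0) ? board[i].data : NULL;
    if (i >= 0)
        board[i].data = data;
    pthread_mutex_unlock(&board_lock);
    return old;
}

/* Attach without overwriting: only stores if the slot is empty.
 * Returns the previous value, so NULL back means "our value stuck". */
static void *board_set_if_null(const char *key, void *data)
{
    pthread_mutex_lock(&board_lock);
    int i = slot_for(key);
    void *old = (i >= 0) ? board[i].data : NULL;
    if (i >= 0 && old == NULL)
        board[i].data = data;
    pthread_mutex_unlock(&board_lock);
    return old;
}
```

Usage: both mod_ssl and a hypothetical mod_frank call board_set_if_null("openssl.init", their_struct); exactly one gets NULL back and owns the initialization, the other receives the winner's struct and cooperates with it.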
Re: [PATCH 40026] ServerTokens Off
Mads Toftum wrote: +1 - looking at the number of IIS targeted worms that keep hitting my apache installs seems to suggest that obscuring the server name will at most lead to a false sense of security. Besides, if you really care, I'm pretty sure it wouldn't be all that hard to guess what server it is by looking at all the rest of the headers.

Looking at the way the TCP/IP stack behaves under normal and error conditions. Looking at the way the HTTP server behaves under normal and error conditions. Looking at the way the file serving behaves under normal and error conditions. Looking at the way any scripting technology behaves under normal and error conditions.

You can't hide everything, and why waste your own CPU cycles trying to imitate another platform's quirks when you could be spending them serving documents? Another major point about OSS security is that it can withstand source code disclosure _AND_ still be secure. Maybe other server implementations just aren't in the same league of security. Darryl
Re: [PATCH 40026] ServerTokens Off
Joshua Slive wrote: <note>Setting <directive>ServerTokens</directive> to less than <code>minimal</code> is not recommended because it makes it more difficult to debug interoperational problems.</note> And my +1 isn't very strong. I have no problem with saying that this small bit of advertising is the tiny price that you pay for using our free software. But just to make this never-ending issue go away, I'd say put it in.

It should also be pointed out in the documentation that those thinking of setting it to Off for the purpose of security by obscurity (the hiding of implementation and version number) should realize that this concept has no technical merit in the HTTP server situation. Call this an education clause in the documentation which may head off inappropriate usage by less clueful users.

With regards to "the price that you pay"... I take it you mean that as a karma equalization policy and not any legal policy, since one of the fundamental points of the Apache Foundation is that advertisement is not one of the prices you pay. Darryl
Re: mod_proxy_balancer/mod_proxy_ajp TODO
Henri Gomez wrote: Well, we could always indicate some sort of CPU power for a remote (a sort of bogomips) and use this in computation.

Why should the CPU power matter? What if the high power CPU is getting all the complex requests and the lower power CPU is ending up with simple requests (static content)?

You are better off implementing it as control packets over AJP that advertise that host's current willingness to take on new requests, on a 32/64-bit scale. Call this a flow control back pressure packet: a stateless beacon of information which may or may not be used by the client AJP. Then have a pluggable implementation at the server end for calculating that value and the frequency of advertising it. All the apache LB has to do is work from that load calculation. All existing AJPs have to do is just ignore the packet.

In the case of different horse power CPUs, that factor is better fed into the server AJP algorithm by biasing the advertised willingness to take on new requests once a certain low level is reached. Only the server end of the AJP knows the true request rate and how near that system is to breaking point. This scheme also works where apache may not see all the work the backend is doing, for example if you have different apache front end clusters linked to the same single backend cluster. Darryl
Re: mod_proxy_balancer/mod_proxy_ajp TODO
Henri Gomez wrote: The TomcatoMips indicator was just something to tell that it's not the raw CPU power which is important, but the estimated LOAD capacity of an instance.

But it's still apache working out TomcatoMips; I think that approach is still flawed. I'm saying only the server end of the AJP knows the true situation. The current setup presumes that the running apache instance has all the facts necessary to determine balancing, when all it knows about is the work it has given the backend and the rate at which it's getting work back. I'm thinking both ends, apache and tomcat, should make load calculations based on what they know at hand.

As far as I know there is no provision in the AJP to announce willingness to serve.

Both ends should feed their available information and configuration biases into their respective algorithms and come out with results that can be compared against each other. The worker would then announce the info down the connector to apache as necessary (there may be a minimum % change threshold to dampen information flap). There probably needs to be a random magic number and a wrapping sequence number in the packet to help the apache end spot obvious problems. This would allow kernel load avg/IO load (and anything else) to be periodically taken into account at the tomcat end. It would be expected that each member of the backend tomcat cluster uses the same algorithm to announce willingness; otherwise you get disparity when apache comes to make a decision.

So I suppose it's just the framework to allow an LB worker to announce its willingness to serve that I am calling for here, not any specific algorithm; that issue can be toyed with until the end of time. An initial implementation would need to experiment and work out:

* How that willingness value impacts/biases the existing apache LB calculations.
* Guidelines on how to configure the algorithm at each end based on known factors (like CPUs, average background workload, relative IO performance).
I'm thinking with that you can hit the widest audience and provide a usable default without users giving much thought to configuration. This is the type of approach kernels take these days: you only have to tweak and think about configuration in extreme scenarios, but for the most part it works well out of the box. Darryl
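To make the beacon idea concrete, here is one hypothetical wire layout carrying the magic number and wrapping sequence number suggested above. This is not part of any real AJP protocol revision; the field sizes, byte order, and constants are all assumptions for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "willingness to serve" beacon. The magic number lets the
 * apache end detect corruption or misframing (and old AJP clients can
 * simply ignore the packet); the wrapping sequence number exposes
 * reordering or duplication; willingness is a 32-bit scale where 0 means
 * "refuse new work" and UINT32_MAX means "fully willing". */
struct willingness_beacon {
    uint32_t magic;
    uint32_t seq;
    uint32_t willingness;
};

#define BEACON_MAGIC 0x41AFB347u  /* invented constant */

static void beacon_pack(const struct willingness_beacon *b, uint8_t out[12])
{
    const uint32_t f[3] = { b->magic, b->seq, b->willingness };
    for (int i = 0; i < 3; i++) {        /* big-endian on the wire */
        out[i * 4 + 0] = (uint8_t)(f[i] >> 24);
        out[i * 4 + 1] = (uint8_t)(f[i] >> 16);
        out[i * 4 + 2] = (uint8_t)(f[i] >> 8);
        out[i * 4 + 3] = (uint8_t)(f[i]);
    }
}

/* Returns 0 on success, -1 if the magic doesn't match (drop the packet). */
static int beacon_unpack(const uint8_t in[12], struct willingness_beacon *b)
{
    uint32_t f[3];
    for (int i = 0; i < 3; i++)
        f[i] = ((uint32_t)in[i * 4] << 24) | ((uint32_t)in[i * 4 + 1] << 16) |
               ((uint32_t)in[i * 4 + 2] << 8) | (uint32_t)in[i * 4 + 3];
    if (f[0] != BEACON_MAGIC)
        return -1;
    b->magic = f[0];
    b->seq = f[1];
    b->willingness = f[2];
    return 0;
}
```

The backend would compute `willingness` from whatever its pluggable algorithm sees (load average, IO pressure, thread pool headroom) and emit a beacon only when the value moves past the configured damping threshold.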
mod_proxy_xxxxx last resort fallback redirect ?
I'm interested in your comments (good and bad) on implementing a new option to ProxyPass which would make apache perform a redirect when the proxy server or balancer cluster is not available. This mimics the functionality of a dedicated hardware load balancer by issuing an HTTP redirect response to a holding page (This service is not available, or We are updating our website).

Can anyone see any reason why this is a bad idea to implement within apache, or see any obstacles in my way?

The last time I looked at the internals of apache it had a sub-request facility; I would like to be able to redirect to both external and internal pages. External being an HTTP 3xx response with a Location: header, configured with an absolute URL "http://...". Internal being a fall-through to the configured mapping and content handler for the URL just as if the ProxyPass directive wasn't there; I was thinking along the lines of this being configured with a fallthrough: scheme prefix. This means it would be possible to use:

DocumentRoot /opt/apache/htdocs
ProxyPass / balancer://group1/ timeout=5 maxattempts=3 fallback-redirect=fallthrough:/holding.html

Then if the balancer://group1 cluster fails, the document at /opt/apache/htdocs/holding.html is served instead.

Request for comments, Darryl -- Darryl L. Miles