I think I now understand what happened with the outage of PyPI yesterday and today.
As Thomas found, somebody was crawling the wiki at multiple requests per second, following every link on every page, e.g. in a series such as

  /moin/PyConFrancescAlted?action=AttachFile
  /moin/PyConFrancescAlted?action=diff
  /moin/PyConFrancescAlted?action=info
  /moin/PyConFrancescAlted?action=edit
  /moin/PyConFrancescAlted?action=LocalSiteMap
  /moin/PyConFrancescAlted?action=print
  /moin/PyConFrancescAlted?action=refresh

and so on, for every page. That caused considerable load on the machine (load average 17).

In turn, PyPI began to respond more slowly; in some cases, it would not respond within the 60s timeout that I configured for FastCGI. As a result, mod_fastcgi would close the connection for the request (and log an error). thfcgi.py then found that it could no longer write to the pipe (EPIPE), and therefore decided to terminate the FCGI server. mod_fastcgi attempted to restart the server for some time, and eventually started throttling the restarts, making all PyPI servers go away (i.e. they would quit, and then not get restarted for some time). At that point, my maintenance script would detect that all PyPI instances had gone away, and initiate a graceful restart of Apache.

Today the crawler came back from the same ISP, but with a different IP address. I blocked that address as well.

Can anybody suggest a more reliable way to prevent crawlers from hitting the wiki so hard?

Regards,
Martin

_______________________________________________
Catalog-SIG mailing list
[email protected]
http://mail.python.org/mailman/listinfo/catalog-sig
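
One possible direction for the question above, offered only as a rough sketch and not as what is deployed on the machine: throttle each client address on the expensive ?action= wiki URLs instead of blocking addresses by hand. The class name SlidingWindowLimiter, the helpers is_action_url and should_serve, and the limits used below are all hypothetical; how such a check would be wired into the wiki or into Apache is left open.

```python
import time
from collections import defaultdict, deque
from urllib.parse import urlsplit, parse_qs


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds.

    Hypothetical helper; the default limits are illustrative only.
    """

    def __init__(self, max_requests=5, window_seconds=1.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._hits = defaultdict(deque)  # client address -> recent timestamps

    def allow(self, client_ip):
        now = self.clock()
        hits = self._hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


def is_action_url(url):
    """True for URLs carrying an action=... query parameter."""
    return "action" in parse_qs(urlsplit(url).query)


def should_serve(limiter, client_ip, url):
    """Plain page views always pass; only action URLs are rate limited."""
    if not is_action_url(url):
        return True
    return limiter.allow(client_ip)
```

With a limiter such as SlidingWindowLimiter(max_requests=2, window_seconds=1.0), a third request for an ?action= URL from the same address within one second would be refused (for example with a 503), while ordinary page views and other clients are unaffected. A crawler that rotates addresses within the same ISP would still get through, so this complements rather than replaces a robots.txt rule or an address block.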
