I have been experimenting with a protocol plugin that connects to and disconnects from the server for each and every request. HTTP may be lightweight (or its connections may be cached in the httpclient code), but other protocols are not.
My code was ruthless about establishing and tearing down connections, but it looked very repetitive across getProtocolOutput and getRobotRules. When I tried to make the functions reusable, the first thing I lost was full control over the connection. No worries, they get garbage collected, don't they? Well, it seems these connections do get closed and gc'ed, but it takes too long. In the meantime the fetcher hits problems and runs into grace periods of 300 000 milliseconds. The whole scan becomes slow just because I tried to tidy up the code.
Which leads me to the next question: what is the plugin's life cycle? Is there one plugin instance per server? One per URL? One per thread? Or one in total? That scope determines whether I can use local variables or instance fields. Or is there some other mechanism by which a plugin can store data that should survive across getProtocolOutput calls? Could a plugin declare which scope it wants to be in?
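To illustrate the refactoring problem, here is a minimal sketch of how the shared code could keep deterministic control over the connection instead of leaving the close to the garbage collector. It is not the actual plugin code and does not use any Nutch API: DemoConnection, withConnection and the simplified signatures of getProtocolOutput and getRobotRules are all assumptions made up for this example. It also does not answer the life cycle question, since it deliberately keeps no state between calls.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class ConnectionScopeDemo {
    // Counts connections that are currently open, only to make leaks visible.
    static final AtomicInteger OPEN = new AtomicInteger();

    /** Hypothetical stand-in for a protocol connection; not a Nutch class. */
    static final class DemoConnection implements AutoCloseable {
        private final String host;

        DemoConnection(String host) {
            this.host = host;
            OPEN.incrementAndGet(); // a real implementation would connect here
        }

        String fetch(String path) {
            return host + path; // placeholder for a real request
        }

        @Override
        public void close() {
            OPEN.decrementAndGet(); // a real implementation would disconnect here
        }
    }

    /** Shared helper: opens a connection, runs the action, and always closes. */
    static <T> T withConnection(String host, Function<DemoConnection, T> action) {
        try (DemoConnection c = new DemoConnection(host)) {
            return action.apply(c);
        } // close() runs here, even if the action throws
    }

    // Simplified, hypothetical signatures; the real plugin methods differ.
    static String getProtocolOutput(String host, String path) {
        return withConnection(host, c -> c.fetch(path));
    }

    static String getRobotRules(String host) {
        return withConnection(host, c -> c.fetch("/robots.txt"));
    }

    public static void main(String[] args) {
        System.out.println(getProtocolOutput("example.org", "/index.html"));
        System.out.println(getRobotRules("example.org"));
        System.out.println("open connections: " + OPEN.get());
    }
}
```

The repetitive open/close code lives in one place, and the connection is released as soon as each call returns. Reusing one connection across several calls would need a field or a cache, and whether that is safe depends on the answer to the scope question above.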

