Hi Hiran,

... to answer the questions...

> What is the plugin's life cycle? Is there one plugin instance per
> server? One per URL? One per thread? Or one in total?

There is a single instance of each plugin per task (a task being a Java
process running Nutch). In local mode (i.e. not running in a distributed
manner on a Hadoop cluster), this means there is a single instance of
every plugin. So the Fetcher job holds exactly one instance of every
protocol plugin.

> This scope defines whether I can make use of local variables, or
> instance fields.

Yes, you can use instance fields, e.g. to pool connections.

> Or is there some other mechanism where a plugin could
> store data that should survive across the getProtocolOutput calls?

No.

> Could a plugin define which scope it wants to be in?

No.

Keep in mind that most methods, for example getProtocolOutput, need
to be thread-safe. That is, they may be called concurrently from
multiple Fetcher threads, all sharing the same plugin instance.
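
To illustrate the point about instance fields and thread safety, here is
a minimal sketch. It does not use the Nutch API: SketchProtocol, fetch()
and requestCount() are hypothetical names, and fetch() merely stands in
for getProtocolOutput. It only shows state kept in an instance field
while several threads call into the same object.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a protocol plugin; NOT the Nutch Protocol interface.
public class SketchProtocol {

  // Instance field: survives across calls and is shared by every thread
  // that uses this plugin instance, so it must be a thread-safe structure.
  private final ConcurrentHashMap<String, AtomicInteger> requestsPerHost =
      new ConcurrentHashMap<>();

  // Stand-in for getProtocolOutput; may be called concurrently.
  public int fetch(String host) {
    // computeIfAbsent is atomic: two threads never create two counters
    // for the same host. incrementAndGet is atomic as well.
    return requestsPerHost
        .computeIfAbsent(host, h -> new AtomicInteger())
        .incrementAndGet();
  }

  // Number of fetch() calls seen so far for the given host.
  public int requestCount(String host) {
    AtomicInteger counter = requestsPerHost.get(host);
    return counter == null ? 0 : counter.get();
  }

  public static void main(String[] args) throws InterruptedException {
    // One instance, as within a single task.
    SketchProtocol plugin = new SketchProtocol();

    // Eight worker threads play the role of the Fetcher threads.
    ExecutorService threads = Executors.newFixedThreadPool(8);
    for (int i = 0; i < 1000; i++) {
      threads.execute(() -> plugin.fetch("example.org"));
    }
    threads.shutdown();
    threads.awaitTermination(10, TimeUnit.SECONDS);

    System.out.println(plugin.requestCount("example.org"));
  }
}
```

With a plain HashMap and an int counter instead, updates could get lost
under concurrent access. The same reasoning applies to anything else you
keep in an instance field, such as a connection pool.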


Best,
Sebastian
