Re: Initializing - break init() API compatibility?

Chris Hostetter Mon, 19 Nov 2007 10:26:52 -0800

Hey guys, i'm *WAY* behind on my email, ironically due to going mostly "off 
the grid" while at ApacheCon -- but the first thing i'm trying to do is 
get caught up with what's going on here.


if i can sum up my understanding (and please correct me if i'm wrong, this 
is based purely on email and jira comments, i haven't read any patches) 
...

1) In SOLR-399 Henri attempted to implement the ideas we talked about a 
few weeks ago in which the init methods stayed "1.2" style, and plugins 
that wanted to know about the core implemented a new 
SolrCore.Initializable "marker interface" -- after they'd been initialized 
using their config options, they would be added to a queue, which the 
SolrCore would iterate over, calling a method from that marker interface 
and passing in itself.

2) Ryan felt that the complexity of SOLR-399 was too much, and proposed a 
simpler approach of adding SolrCore to the signature of most init methods 
-- and made the very astute point that right now an IndexSchema can be 
initialized without a SolrCore.  Which made me realize two key things 
(that i'll come back to later)
  a) the only reason an IndexSchema needs a SolrConfig is for 
     config dir / classpath access
  b) the only current or discussed use case for analysis factories to have 
     access to the SolrCore is for the same reason.

3) Henri experimented with an idea involving a SolrSystem which i don't 
fully understand, but apparently the end result is that neither Henri nor 
Ryan liked it much - but it got them both seriously considering the 
possibility of using ThreadLocal to keep track of the core so that it 
would always be available via a static method if anyone wanted it, which 
Ryan implemented as SOLR-414

...Did i get that all right?

Here's my take(s) on things: 

1) the ThreadLocal approach scares me ... it seems like it would be very 
easy for us to screw it up down the road, when making changes to classes 
such that we may not always remember to "set" the core/config when we 
should, and it would be very hard to spot the problem (particularly in 
single threaded tests)
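
To make that worry concrete, here is a minimal sketch of the failure mode. 
FakeCore and CurrentCore are hypothetical stand-ins invented for this 
example; they are not the classes from SOLR-414. The lookup works on the 
thread that called set(), and quietly finds nothing on any other thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for SolrCore; not the real Solr class.
final class FakeCore {
    final String name;
    FakeCore(String name) { this.name = name; }
}

// Hypothetical static holder in the spirit of the ThreadLocal idea.
final class CurrentCore {
    private static final ThreadLocal<FakeCore> CURRENT = new ThreadLocal<>();
    static void set(FakeCore core) { CURRENT.set(core); }
    static FakeCore get() { return CURRENT.get(); }
    static void clear() { CURRENT.remove(); }
}

final class ThreadLocalPitfallDemo {
    // Name of the core visible to the calling thread, or "none" if unset.
    static String visibleCoreName() {
        FakeCore core = CurrentCore.get();
        return core == null ? "none" : core.name;
    }

    // Same lookup, but run on a worker thread where nobody called set().
    static String visibleCoreNameOnWorker() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = ThreadLocalPitfallDemo::visibleCoreName;
            return pool.submit(task).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        CurrentCore.set(new FakeCore("core0"));
        System.out.println(visibleCoreName());         // core0
        System.out.println(visibleCoreNameOnWorker()); // none
        CurrentCore.clear();
    }
}
```

A test that only ever runs on the thread that called set() would never 
reach the "none" branch, which is the hard-to-spot part.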

2) As far as analysis factory "stuff" goes -- let's not give anyone 
(including the IndexSchema) access to the SolrCore ... let's even 
deprecate the IndexSchema's usage of the SolrConfig interface, and instead 
refactor the classpath related stuff in SolrConfig/Config into a new 
interface/utility class (for the sake of argument: ResourceFinder) 
that just knows how to find config files and instantiate classes using a 
classpath.  after the IndexSchema has inited a factory (using the 1.2 
sigs) it can use a marker interface like we've discussed before (ie: 
ResourceFinderInitable) to tell that factory about the ResourceFinder (if 
it wants to go get a stopwords file, or instantiate another Tokenizer for 
parsing a synonym file etc)
   i) backwards compatible, just adds a new optional interface
  ii) no crazy queuing or callback loops - the ResourceFinder will be 
      completely initialized by the SolrConfig well before the IndexSchema 
      uses it.
 iii) keeps IndexSchema independent of SolrCore, which is nice; and may 
      someday help us promote the IndexSchema concept to a lucene contrib
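
A rough sketch of how point 2 might look. Only the names ResourceFinder 
and ResourceFinderInitable come from the text above; the method 
signatures, ArgsInitable (standing in for the existing 1.2-style init), 
InMemoryResourceFinder, StopWordsFactorySketch and SchemaInitSketch are 
all assumptions made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical: knows how to find config files and instantiate classes.
interface ResourceFinder {
    InputStream openResource(String name);
    Object newInstance(String className);
}

// Hypothetical optional interface for factories that want a finder.
interface ResourceFinderInitable {
    void init(ResourceFinder finder);
}

// Stand-in for the existing "1.2 style" factory init signature.
interface ArgsInitable {
    void init(Map<String, String> args);
}

// Toy finder backed by an in-memory map instead of a config dir.
final class InMemoryResourceFinder implements ResourceFinder {
    private final Map<String, String> files = new HashMap<>();

    void put(String name, String content) { files.put(name, content); }

    public InputStream openResource(String name) {
        String content = files.get(name);
        if (content == null) {
            throw new IllegalArgumentException("no such resource: " + name);
        }
        return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
    }

    public Object newInstance(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot instantiate " + className, e);
        }
    }
}

// Example factory: plain init first, resource lookup second.
final class StopWordsFactorySketch implements ArgsInitable, ResourceFinderInitable {
    private String wordsFile;
    String stopWords = "";

    public void init(Map<String, String> args) { wordsFile = args.get("words"); }

    public void init(ResourceFinder finder) {
        try (InputStream in = finder.openResource(wordsFile)) {
            stopWords = new String(in.readAllBytes(), StandardCharsets.UTF_8).trim();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

final class SchemaInitSketch {
    // Per factory: the old init, then the optional ResourceFinder callback.
    static <T extends ArgsInitable> T initFactory(
            T factory, Map<String, String> args, ResourceFinder finder) {
        factory.init(args);
        if (factory instanceof ResourceFinderInitable) {
            ((ResourceFinderInitable) factory).init(finder);
        }
        return factory;
    }
}
```

Factories that don't implement the optional interface go through 
initFactory untouched, which is the backwards compatibility claimed in (i).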

3) as far as non schema related things go (ie: request handlers, etc...) 
and making them aware of the SolrCore .. what if we revisit the marker 
interface callback idea for a minute: the reason it got crazy and required 
complex queuing and callbacks was because the SolrCore isn't fully 
initialized when the plugins are initialized -- the same problem as if we 
added SolrCore to the init signature for the plugins; and because Henri 
wanted to account for both a plugin triggering initialization of other 
plugins that need to know about the core, and for wanting to "delay" their 
initialization until other plugins are done.  

what if we just don't support any of that complexity?

make the semantics simple:
    a) Foo.init( ...whatever current args are...) will be called exactly once.
    b) if Foo implements SolrCoreInitalizable, then:
       init(SolrCore) will be called exactly once, at some point after 
       the previously mentioned init method and before any work is 
       expected from this plugin.
    c) there is no implied order that plugins will be initialized, or that 
       SolrCoreInitalizable.init will be called in -- the orders may not 
       be the same.
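
The three rules above could be sketched roughly like this. CoreSketch, 
PluginSketch, PluginRegistrySketch and RecordingHandler are hypothetical 
names invented for the example, and the init(CoreSketch) signature on 
SolrCoreInitalizable is an assumption; nothing here is existing Solr code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for SolrCore.
final class CoreSketch {
    final String name;
    CoreSketch(String name) { this.name = name; }
}

// Stand-in for whatever init signature a plugin already has.
interface PluginSketch {
    void init(Map<String, String> args);
}

// Optional interface, named as in the text above; signature is assumed.
interface SolrCoreInitalizable {
    void init(CoreSketch core);
}

final class PluginRegistrySketch {
    private final List<PluginSketch> pending = new ArrayList<>();

    // (a) the existing init is called exactly once, at registration time.
    void register(PluginSketch plugin, Map<String, String> args) {
        plugin.init(args);
        pending.add(plugin);
    }

    // (b) once the core is usable, each interested plugin is told once.
    // (c) nobody should rely on the iteration order used here.
    void coreReady(CoreSketch core) {
        for (PluginSketch plugin : pending) {
            if (plugin instanceof SolrCoreInitalizable) {
                ((SolrCoreInitalizable) plugin).init(core);
            }
        }
        pending.clear(); // a second coreReady() call re-initializes nothing
    }
}

// Example plugin that records the order of its callbacks.
final class RecordingHandler implements PluginSketch, SolrCoreInitalizable {
    final List<String> events = new ArrayList<>();
    public void init(Map<String, String> args) { events.add("args"); }
    public void init(CoreSketch core) { events.add("core:" + core.name); }
}
```

A plugin registered this way sees its old init first and the core 
callback second; plugins that skip the optional interface only ever see 
the first call.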

if we *really* want to support some sort of "deferred" SolrCore 
initializability, we can do it by making the "queue" publicly appendable 
using some sort of "core.needsInit(SolrCoreInitalizable)" method ... it 
would initially build up a queue, but once the core is ready to go, could 
just call the init method immediately -- if a Plugin wants to be deferred 
for later, its init(SolrCore c) method could just call c.needsInit(this) 
...might lead to an infinite loop, but we could mitigate that by 
making needsInit both dedup its queue, and record if it's already in the 
middle of initing something in realtime, returning false if 
it's asked to init the same thing again (ie: nothing in the queue)
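
One possible reading of that needsInit idea, as a sketch only. 
DeferrableInit (standing in for SolrCoreInitalizable), DeferredCoreSketch, 
markReady and DeferringPlugin are hypothetical names, and the choice to 
return true when a plugin is merely queued is an assumption, not 
something settled above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the SolrCoreInitalizable interface.
interface DeferrableInit {
    void init(DeferredCoreSketch core);
}

// Hypothetical stand-in for the SolrCore side of the idea.
final class DeferredCoreSketch {
    // A set, so queuing the same plugin twice is deduped.
    private final Set<DeferrableInit> queue = new LinkedHashSet<>();
    // Plugins whose init(core) is on the call stack right now.
    private final Set<DeferrableInit> inProgress =
            Collections.newSetFromMap(new IdentityHashMap<DeferrableInit, Boolean>());
    private boolean ready = false;

    // Returns false only when asked to init something already being inited.
    boolean needsInit(DeferrableInit plugin) {
        if (inProgress.contains(plugin)) {
            return false; // re-entrant request: nothing gets queued
        }
        if (!ready) {
            queue.add(plugin); // core not usable yet: remember for later
            return true;
        }
        inProgress.add(plugin);
        try {
            plugin.init(this); // core is ready: init immediately
        } finally {
            inProgress.remove(plugin);
        }
        return true;
    }

    // Called once the core is fully constructed; drains the queue.
    void markReady() {
        ready = true;
        List<DeferrableInit> drained = new ArrayList<>(queue);
        queue.clear();
        for (DeferrableInit plugin : drained) {
            needsInit(plugin);
        }
    }
}

// Example plugin that always tries to defer itself.
final class DeferringPlugin implements DeferrableInit {
    int realInits = 0;

    public void init(DeferredCoreSketch core) {
        if (core.needsInit(this)) {
            return; // deferred; init will be called again later
        }
        realInits++; // false means "you are being inited now", so do the work
    }
}
```

In this reading the false return value is what breaks the loop: a plugin 
that defers unconditionally gets queued at most once, and does its real 
work on the call made while the queue is being drained.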


That would all play nicely with AbstractPluginLoader as well, right?



-Hoss
