Hello!

This is to bring attention to ticket GORA-227 <https://issues.apache.org/jira/browse/GORA-227>. My team and I have developed a plugin for Nutch (in a fork of the 2.x branch) and wanted to write a test similar to TestGenerator <http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/crawl/TestGenerator.java?view=markup>. It turned out that TestGenerator is currently disabled, and our investigation led us to tickets NUTCH-1572 <https://issues.apache.org/jira/browse/NUTCH-1572> and GORA-225 <https://issues.apache.org/jira/browse/GORA-225>; the latter has GORA-227 as a subtask. We did some debugging and posted a message on the ticket. I'm copying it here:

> Hello!
>
> I have debugged TestGenerator and, from what I saw, it fails because the
> query is executed on a different MemStore instance than the one that holds
> the injected web pages. That is, when GeneratorJob initializes its mapper
> and reducer, it creates a new MemStore instance for each. Each of these two
> instances holds its own internal map and knows nothing about the other or
> about the MemStore created by TestGenerator (and populated with web pages).
>
> What is the best way to address this issue? Should we somehow amend
> DataStoreFactory to make it return a single MemStore instance, or should
> all MemStores share their state? Any suggestions?

A day passed with no reply, so I figured it might be a good idea to post it on the mailing list. Any reply is welcome. Thank you in advance!

Best regards,
Sergey Weiss
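
P.S. To make the question more concrete, here is a minimal, self-contained sketch of the behaviour described in the quoted message and of the first option (a factory that hands out a single instance). It does not use the real Gora API: ToyMemStore, ToyStoreFactory, createFresh, getShared and the "webpage" key are hypothetical names invented for illustration, and the per-JVM cache is only one possible way to share an instance.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an in-memory store; NOT the real Gora MemStore.
class ToyMemStore {
    // Each instance owns its own map, so instances cannot see each other's data.
    private final Map<String, String> rows = new ConcurrentHashMap<>();

    void put(String key, String value) { rows.put(key, value); }
    String get(String key) { return rows.get(key); }
    int size() { return rows.size(); }
}

// Hypothetical factory; NOT the real DataStoreFactory.
class ToyStoreFactory {
    // Assumption: one shared store per name within a single JVM.
    private static final Map<String, ToyMemStore> CACHE = new ConcurrentHashMap<>();

    // Mirrors the behaviour we observed: every call returns a new, empty store.
    static ToyMemStore createFresh() {
        return new ToyMemStore();
    }

    // Possible fix: return the same instance for the same name.
    static ToyMemStore getShared(String name) {
        return CACHE.computeIfAbsent(name, n -> new ToyMemStore());
    }
}

class MemStoreIsolationDemo {
    public static void main(String[] args) {
        // The test injects a page into its own store...
        ToyMemStore testStore = ToyStoreFactory.createFresh();
        testStore.put("http://example.com/", "injected page");

        // ...but the mapper gets a different instance and finds nothing.
        ToyMemStore mapperStore = ToyStoreFactory.createFresh();
        System.out.println("fresh store rows: " + mapperStore.size());

        // With a shared instance, the mapper sees what the test injected.
        ToyStoreFactory.getShared("webpage").put("http://example.com/", "injected page");
        ToyMemStore sharedMapperStore = ToyStoreFactory.getShared("webpage");
        System.out.println("shared store rows: " + sharedMapperStore.size());
    }
}
```

A cache like this would only help while the test, mapper and reducer run in the same JVM, which I assume is the case for a local test run; it would not help across separate task JVMs.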