Hello!

I'd like to draw attention to ticket GORA-227
<https://issues.apache.org/jira/browse/GORA-227>. My team and I have
developed a plugin for Nutch (in a fork of the 2.x branch) and wanted to
write a test similar to TestGenerator
<http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/crawl/TestGenerator.java?view=markup>.
It turned out that TestGenerator is currently disabled, and our investigation
led us to tickets NUTCH-1572
<https://issues.apache.org/jira/browse/NUTCH-1572> and GORA-225
<https://issues.apache.org/jira/browse/GORA-225>, which has GORA-227 as a
subtask. We did some debugging and posted a message on the ticket. I'm
copying it here:

> Hello!
>
> I have debugged TestGenerator and, from what I saw, it fails because the
> query is executed on a different MemStore instance than the one that
> holds the injected web pages. That is, when GeneratorJob initializes its
> mapper and reducer, it creates a new MemStore instance for each. Each of
> these two instances holds its own internal map and knows nothing about the
> other or about the MemStore created by TestGenerator (and populated with
> web pages).
>
> What is the best way to address this issue? Should we somehow amend
> DataStoreFactory to make it return a single instance of MemStore, or should
> all MemStores share their state? Any suggestions?
>
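
To make the two options concrete, here is a minimal Java sketch. It is not
Gora code: IsolatedStore, CachingStoreFactory and SharedStateStore are
hypothetical stand-ins that I made up for illustration, and the real MemStore
and DataStoreFactory have different signatures. It only shows why separate
instances cannot see each other's data, and what the two proposed fixes might
look like in principle.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an in-memory store (NOT the real Gora MemStore).
// Each instance owns its own map, which mirrors the behavior described above.
class IsolatedStore {
    private final Map<String, String> rows = new ConcurrentHashMap<>();
    void put(String key, String value) { rows.put(key, value); }
    String get(String key) { return rows.get(key); }
}

// Option 1 (assumption): a factory that caches one store per name, so the
// test, the mapper and the reducer would all receive the same instance.
class CachingStoreFactory {
    private static final Map<String, IsolatedStore> CACHE = new ConcurrentHashMap<>();
    static IsolatedStore getStore(String name) {
        return CACHE.computeIfAbsent(name, n -> new IsolatedStore());
    }
}

// Option 2 (assumption): instances stay separate, but all of them read and
// write one static map, so state is shared within the JVM.
class SharedStateStore {
    private static final Map<String, String> ROWS = new ConcurrentHashMap<>();
    void put(String key, String value) { ROWS.put(key, value); }
    String get(String key) { return ROWS.get(key); }
}

public class MemStoreSketch {
    public static void main(String[] args) {
        // Current behavior: data written through one instance is invisible to another.
        IsolatedStore injected = new IsolatedStore();
        injected.put("http://example.com/", "page");
        System.out.println("isolated: " + new IsolatedStore().get("http://example.com/"));

        // Option 1: both lookups resolve to the same cached instance.
        CachingStoreFactory.getStore("webpage").put("http://example.com/", "page");
        System.out.println("factory: " + CachingStoreFactory.getStore("webpage").get("http://example.com/"));

        // Option 2: distinct instances, shared static state.
        new SharedStateStore().put("http://example.com/", "page");
        System.out.println("shared: " + new SharedStateStore().get("http://example.com/"));
    }
}
```

Either variant only helps when the test and the job run in the same JVM, and
static state would need to be cleared between tests.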

A day passed with no reply, so I figured it might be a good idea to
post it on the mailing list.
Any reply is welcome, thank you in advance!

Best regards,
Sergey Weiss
