Hi Sergey,

There is an attempt to make MemStore work like the other stores: https://issues.apache.org/jira/browse/GORA-228
But just to fix the tests, I think we could use a static ConcurrentHashMap or Collections.synchronizedMap(new LinkedHashMap<>()) as the backing store for MemStore for now.

CC @Lewis and @Renato

- Henry

On Wed, Oct 8, 2014 at 6:18 AM, Sergey Weiss <swe...@griddynamics.com> wrote:

> Hello!
>
> This is to bring attention to the ticket GORA-227
> <https://issues.apache.org/jira/browse/GORA-227>. My team and I have
> developed a plugin for Nutch (in a fork of the 2.x branch) and wanted to
> write a test similar to TestGenerator
> <http://svn.apache.org/viewvc/nutch/branches/2.x/src/test/org/apache/nutch/crawl/TestGenerator.java?view=markup>.
> It turned out that TestGenerator is currently disabled, and investigation
> led us to tickets NUTCH-1572
> <https://issues.apache.org/jira/browse/NUTCH-1572> and GORA-225
> <https://issues.apache.org/jira/browse/GORA-225>, which has GORA-227 as a
> subtask. We did some debugging and posted a message on the ticket. I'm
> copying it here:
>
>> Hello!
>>
>> I have debugged TestGenerator and, from what I saw, it fails because the
>> query is executed on a different MemStore instance from the one that
>> holds the injected web pages. That is, when GeneratorJob initializes its
>> mapper and reducer, it creates a new instance of MemStore for each. Each
>> of these two instances holds its own internal map and knows nothing about
>> the other or about the MemStore created by TestGenerator (and populated
>> with web pages).
>>
>> What is the best way to address this issue? Should we somehow amend
>> DataStoreFactory to make it return a single instance of MemStore, or
>> should all MemStores share their state? Any suggestions?
>
> A day passed with no reply, so I figured it might be a good idea to
> post it on the mailing list.
> Any reply is welcome, thank you in advance!
>
> Best regards,
> Sergey Weiss
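
P.S. To make the static-map suggestion at the top of this reply concrete, here is a rough sketch of what I have in mind. It is not Gora's actual MemStore: the class name SharedMemStoreSketch, its String key/value types, and the put/get/clear methods are all made up for illustration. The only point is that the map is static, so every instance created in the same JVM sees the same data.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for MemStore; names and signatures are assumptions,
// not Gora's API.
class SharedMemStoreSketch {
    // Static, so one map is shared by every instance in this JVM.
    // A store created by a mapper or reducer then sees what the test wrote.
    private static final ConcurrentMap<String, String> BACKING =
            new ConcurrentHashMap<>();

    // Note: ConcurrentHashMap rejects null keys and null values.
    public void put(String key, String value) {
        BACKING.put(key, value);
    }

    public String get(String key) {
        return BACKING.get(key);
    }

    // Tests would need to reset the shared state between runs.
    public static void clear() {
        BACKING.clear();
    }
}
```

Two caveats. ConcurrentHashMap does not preserve insertion order, so if the tests depend on iteration order, Collections.synchronizedMap(new LinkedHashMap<>()) would be the safer of the two options. And static state is only shared within a single JVM, so this helps only when the job's mapper and reducer run in the same process as the test.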