Hi Markus,

this sounds somewhat similar to NUTCH-1252, but that was rather trivial and easy to reproduce.

Sebastian

2012/11/30 Markus Jelsma <markus.jel...@openindex.io>:
> Hi,
>
> We've got an issue where one in a few thousand records partially contains
> another record's ParseMeta data. To be specific, record A ends up with the
> ParseMeta data of record B that is added by one of our custom parse plugins.
> I'm unsure as to where the problem really is because the parse plugin
> receives data from a modified parser plugin that in turn adds a custom Tika
> ContentHandler.
>
> Because I'm unable to reproduce this I had to inspect the code for places
> where an object is reused but an attribute is not reset. To me, that would be
> the most obvious problem, but until now I've been unsuccessful in finding the
> issue!
>
> Regardless of how remote the chance is of someone having had some similar
> issue: does anyone have some ideas to share?
>
> Thanks,
> Markus
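
For reference, the failure mode described in the quoted message (an object that is reused across records while one of its attributes is never reset) can be sketched in plain Java. This is only an illustration under assumptions: `ReusedHandler`, `parseLeaky` and `parseClean` are hypothetical names, not Nutch or Tika APIs, and the actual plugin code in question may look quite different.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a handler instance that is shared between records.
// It is NOT a Nutch or Tika class; it only illustrates the reuse pattern.
class ReusedHandler {
    // Per-record state kept in a field that outlives a single parse call.
    private final Map<String, String> meta = new HashMap<>();

    // Leaky variant: the field is never cleared, so keys collected for an
    // earlier record are still present when a later record is parsed.
    Map<String, String> parseLeaky(Map<String, String> fields) {
        meta.putAll(fields);
        return new HashMap<>(meta); // snapshot handed back to the caller
    }

    // Safe variant: the state is reset before each record is processed.
    Map<String, String> parseClean(Map<String, String> fields) {
        meta.clear();
        meta.putAll(fields);
        return new HashMap<>(meta);
    }
}
```

With the leaky variant, parsing record B and then record A on the same instance leaves B's keys in A's result wherever A does not overwrite them, which would look like A partially containing B's metadata. Resetting the state per record, or creating a fresh handler per record, avoids that in this sketch.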