Hi Gabriel,

I'm not an expert FileListEntityProcessor user, but I'd expect a consistent processing order too. Your code seems "kosher" to me. You sort by last-modified date, which seems OK to me. So create a Jira issue and attach your patch!
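For anyone following the thread without the attachment, here is a minimal sketch of the kind of ordering being discussed. It is not Gabriel's patch and not the actual FileListEntityProcessor code: the class name SortedFileLister and the method listOldestFirst are hypothetical, and it assumes Java 8 or later. It lists a directory and sorts oldest-first by last-modified time, with the file name as a tie-breaker so the order is fully deterministic.

```java
import java.io.File;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper for illustration only; not part of Solr or the patch.
public class SortedFileLister {

    /**
     * Returns the entries of dir sorted oldest-first by last-modified time.
     * Ties are broken by file name so repeated runs give the same order.
     */
    public static File[] listOldestFirst(File dir) {
        // Like File.list(), listFiles() makes no promise about ordering.
        File[] files = dir.listFiles();
        if (files == null) {
            // dir does not exist, is not a directory, or an I/O error occurred.
            return new File[0];
        }
        Arrays.sort(files, Comparator
                .comparingLong(File::lastModified)
                .thenComparing(File::getName));
        return files;
    }
}
```

With the two files from the example in the original message, xyz.xml (one hour old) would come before abc.xml (one minute old), so the newer update to "Bar" would be applied last.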
Martijn

On 24 October 2011 21:49, Gabriel Cooper <[email protected]> wrote:
> Hello,
>
> I noticed what appears to be a bug in DataImportHandler's
> FileListEntityProcessor. Specifically, it relies on Java's File.list()
> method to retrieve a list of files from the configured dataimport directory,
> but list() does not guarantee a sort order. This means that if you have two
> files that update the same record, the results are non-deterministic.
> Typically, list() does in fact return them lexicographically sorted, but this
> is not guaranteed.
>
> An example of how you can get into trouble is to imagine the following:
>
> xyz.xml -- Created one hour ago. Contains updates to records "Foo" and
> "Bar".
> abc.xml -- Created one minute ago. Contains updates to records "Bar" and
> "Baz".
>
> In this case, the newest file, abc.xml, would (likely, but not
> guaranteed) be run first, updating the "Bar" and "Baz" records. Next, the
> older file, xyz.xml, would update "Foo" and overwrite "Bar" with outdated
> changes.
>
> The "HowToContribute" wiki page suggested I send my request here before
> opening an actual bug ticket, so please let me know if there's anything else
> I can or should do to get this patch submitted and approved. I've attached a
> patch of FileListEntityProcessor, along with an updated test, please let me
> know if it's kosher.
>
> Thank you,
>
> Gabriel.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]

--
Kind regards,

Martijn van Groningen
