[ https://issues.apache.org/jira/browse/NUTCH-893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12908791#action_12908791 ]
Andrzej Bialecki commented on NUTCH-893:
-----------------------------------------

+1 and +1.

> DataStore.put() silently loses records when executed from multiple processes
> ----------------------------------------------------------------------------
>
>                 Key: NUTCH-893
>                 URL: https://issues.apache.org/jira/browse/NUTCH-893
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 2.0
>         Environment: Gora HEAD, SqlStore, MySQL 5.1, Ubuntu 10.4 x64, Sun JDK 1.6
>            Reporter: Andrzej Bialecki
>            Priority: Blocker
>             Fix For: 2.0
>
>         Attachments: NUTCH-893.patch, NUTCH-893_v2.patch
>
>
> In order to debug the issue described in NUTCH-879 I created a test to
> simulate multiple clients appending to webtable (please see the patch), which
> is the situation that we have in distributed map-reduce jobs.
> There are two tests there: one that uses multiple threads within the same
> JVM, and another that uses single thread in multiple JVMs. Each test first
> clears webtable (be careful!), and then puts a bunch of pages, and finally
> counts that all are present and their values correspond to keys. To make
> things more interesting each execution context (thread or process) closes and
> reopens its instance of DataStore a few times.
> The multithreaded test passes just fine. However, the multi-process test
> fails with missing keys, as many as 30%.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.