On Mon, Aug 24, 2015 at 10:30 PM, Tom Hood <[email protected]> wrote:

> Hi,
>
> There appears to be a bug where two rows are merging into one as a result
> of doing separate calls to the Iface.mutate method using
> RowMutationType.UPDATE_ROW and RecordMutationType.REPLACE_ENTIRE_RECORD.
> (I can also see the problem using REPLACE_ROW and REPLACE_ENTIRE_RECORD
> instead).
>
> For example, if the index has 2 rows with 1 record each that has a copy of
> the rowId in cf.key:
>   row A:   cf.key=A
>   row B:   cf.key=B
>
> After an attempt to Iface.mutate row A with exactly the same data,
> sometimes the result is:
>   row A:  cf.key=A
>   row B:  cf.key=B  cf.key=A
>
> instead of the expected result of a no-op.  The corruption is visible with
> "blur get" and "blur query cf.key:B" and an Iface.fetchRow from java.
>
> For the above, the recordId is always "0" and the rowId is a UUID generated
> from java UUID.randomUUID (although for my test I'm also using the same
> UUIDs).
>
> I'm not setting a schema at all in my test program, so all the defaults
> apply for analyzers, fieldless=true, etc.
>
> I do notice the following show up in the shard server log:  INFO ...
> [thrift-processors1] search.PrimeDocCache: PrimeDoc for reader
> [_k(4.3):C19/4] not stored, because count [13] and freq [16] do not match.
>
> Restarting blur doesn't seem to help.
>
> Blur version is 0.2.4.  Hadoop stack is CDH 5.1.0
>
> Cluster configuration is running 1 shard, 1 controller, 1 namenode all on
> the same machine (redhat 6.3 Santiago).
>
> I have a fairly small test case that if I run repeatedly sometimes fails,
> sometimes doesn't.  I run it after using blur shell to remove the old table
> and create a new one with 1 shard.
>
> Although it isn't 100% reproducible, it seems to fail pretty often for me.
> As I've typed the code in on a different network, I don't have the code for
> you yet.
>
> Have you seen this kind of issue before?
>

I have not.


>
> Any suggestions for how to track it down?
>

Not sure yet, maybe we could reproduce it in the IndexManagerTest.  That's
where most of the mutation tests are located.


>
> Are there any commands you want me to run on the resulting table that might
> yield some clues?
>

I don't know enough yet to suggest anything.  I have opened a jira ticket
where we can track the issue.

https://issues.apache.org/jira/browse/BLUR-441

I will try to investigate ASAP.

Aaron


>
> Thanks,
> -- Tom
>
