Thanks Tom, I will take a look.

On Wed, Aug 26, 2015 at 2:12 PM, Tom Hood <[email protected]> wrote:

> I uploaded a test program to the jira issue that demonstrates the problem
> I'm seeing.
>
> Please let me know if you are able to reproduce the problem and whether you
> think there's a workaround for it that doesn't involve a patch.
>
> Thanks,
> -- Tom
>
>
> On Tue, Aug 25, 2015 at 12:51 PM, Aaron McCurry <[email protected]>
> wrote:
>
> > On Mon, Aug 24, 2015 at 10:30 PM, Tom Hood <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > There appears to be a bug where two rows are merging into one as a result
> > > of doing separate calls to the Iface.mutate method using
> > > RowMutationType.UPDATE_ROW and RecordMutationType.REPLACE_ENTIRE_RECORD.
> > > (I can also see the problem using REPLACE_ROW and REPLACE_ENTIRE_RECORD
> > > instead).
> > >
> > > For example, if the index has 2 rows with 1 record each that has a copy of
> > > the rowId in cf.key:
> > >   row A:   cf.key=A
> > >   row B:   cf.key=B
> > >
> > > After an attempt to Iface.mutate row A with exactly the same data,
> > > sometimes the result is:
> > >   row A:  cf.key=A
> > >   row B:  cf.key=B  cf.key=A
> > >
> > > instead of the expected result of a no-op.  The corruption is visible with
> > > "blur get" and "blur query cf.key:B" and an Iface.fetchRow from java.
> > >
> > > For the above, the recordId is always "0" and the rowId is a UUID generated
> > > from java UUID.randomUUID (although for my test I'm also using the same
> > > UUIDs).
> > >
> > > I'm not setting a schema at all in my test program, so it uses all the
> > > defaults for analyzers, fieldless=true, etc.
> > >
> > > I do notice the following show up in the shard server log:  INFO ...
> > > [thrift-processors1] search.PrimeDocCache: PrimeDoc for reader
> > > [_k(4.3):C19/4] not stored, because count [13] and freq [16] do not match.
> > >
> > > Restarting blur doesn't seem to help.
> > >
> > > Blur version is 0.2.4.  Hadoop stack is CDH 5.1.0
> > >
> > > Cluster configuration is running 1 shard, 1 controller, 1 namenode all on
> > > the same machine (redhat 6.3 Santiago).
> > >
> > > I have a fairly small test case that if I run repeatedly sometimes fails,
> > > sometimes doesn't.  I run it after using blur shell to remove the old table
> > > and create a new one with 1 shard.
> > >
> > > Although it isn't 100% reproducible, it seems to fail pretty often for me.
> > > As I've typed the code in on a different network, I don't have the code for
> > > you yet.
> > >
> > > Have you seen this kind of issue before?
> > >
> >
> > I have not.
> >
> >
> > >
> > > Any suggestions for how to track it down?
> > >
> >
> > Not sure yet, maybe we could reproduce it in the IndexManagerTest.  That's
> > where most of the mutation tests are located.
> >
> >
> > >
> > > Are there any commands you want me to run on the resulting table that might
> > > yield some clues?
> > >
> >
> > I don't know enough yet to suggest anything.  I have opened a jira ticket
> > where we can track the issue.
> >
> > https://issues.apache.org/jira/browse/BLUR-441
> >
> > I will try to investigate ASAP.
> >
> > Aaron
> >
> >
> > >
> > > Thanks,
> > > -- Tom
> > >
> >
>
