Hi,

Thanks for the response.  Agreed.  I have no intention of doing this on a
large index at all.

We're upgrading from cdh4.2.0, java 1.6.x, blur 0.1.4 to cdh5.1, java
1.7.x, blur 0.2.3.

I'm just trying to get a warm fuzzy that a small blur index created before
the upgrade is the same after the upgrade.  I was just planning to dump all
the records and the terms and diff.  Then I might do some sampled (or heck,
it's small, maybe all of them) lookups of records to see that they are
returned in both.  I realize this is not an efficient way to go about
this for a large index, and perhaps I'm fooling myself into thinking this
is a good test at all.  It's a little better than nothing, I guess.
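
For what it's worth, the dump-and-diff idea could be sketched roughly like
this.  It assumes each dump has already been written out as one text line
per record or term; the tab-separated layout and the names load_dump and
diff_dumps are made up for illustration and are not a Blur format or API.
The dumps are compared as multisets so that ordering differences between
the two versions don't show up as noise.

```python
from collections import Counter


def load_dump(lines):
    """Turn dump lines into a multiset of stripped, non-empty lines."""
    return Counter(line.strip() for line in lines if line.strip())


def diff_dumps(before_lines, after_lines):
    """Return (missing, added): lines only in the old dump, lines only in the new."""
    before = load_dump(before_lines)
    after = load_dump(after_lines)
    # Counter subtraction keeps only positive counts, so duplicates are respected.
    missing = sorted((before - after).elements())
    added = sorted((after - before).elements())
    return missing, added


if __name__ == "__main__":
    # Hypothetical dump lines: rowid, family.column, value (layout is an assumption).
    before = ["row1\tfam.col\tvalueA", "row2\tfam.col\tvalueB"]
    after = ["row2\tfam.col\tvalueB", "row1\tfam.col\tvalueA"]
    missing, added = diff_dumps(before, after)
    print("missing after upgrade:", missing)
    print("new after upgrade:", added)
```

An empty result on both sides would only say the dumped text matches; it
says nothing about whether lookups behave the same, which is why the
sampled lookups still seem worth doing.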

Did the binary format of the files in hdfs change between the blur
versions?  Can I just do a compare of the various shard files?  Would you
expect such a comparison to work?  Or do the files embed a blur version
string somewhere, or other stuff like hostnames, etc., that might trip up a
diff/compare?

How would you compare two blur indexes?

-- Tom

On Mon, Oct 6, 2014 at 4:08 PM, Tim Williams <[email protected]> wrote:

> On Mon, Oct 6, 2014 at 3:38 PM, Tom Hood <[email protected]> wrote:
> > Hi,
> >
> > If I want to iterate all rows in a blur index, is the recommended way to
> > just do a query on "*"?
>
> I think it's fair to say it's recommended *not* to do that (iterate
> over all rows) at all:)  On a tiny index, sure, a *:* query works.
> But again, beyond playing around, going deep into results will only
> cause problems - maybe you could share what you're trying to achieve
> so we could help with a different approach?
>
> We are, btw, working on some set of features that will provide some
> simple constructs that will make it easier for you to achieve
> something like this... they'll be in the next release but will be
> experimental only.
>
> > It looks like the Selector.setLocationId isn't something I'm supposed to
> > use although it looks like it might provide a means of doing this (i.e.
> > exposing IndexReader.document(docId)).  I don't know the set of all rowids
> > in advance, so Selector.setRowId won't help here.
>
> Set your blurQuery.setSelector(new Selector()) (to tell blur to bring
> back all the columns) and adjust the fetch size.  Does that help?
>
> --tim
>
