Hi, Thanks for the response. Agreed. I have no intention of doing this on a large index at all.
We're upgrading from cdh4.2.0, java 1.6.x, blur 0.1.4 to cdh5.1, java 1.7.x, blur 0.2.3. I'm just trying to get a warm fuzzy that a small blur index created before the upgrade is the same after the upgrade.

I was just planning to dump all the records and the terms and diff. Then I might do some sampled (or heck it's small, maybe all of them) lookups of records to see that they are returned in both.

I realize this is less than an efficient way to go about this for a large index and perhaps I'm fooling myself into thinking this is a good test at all. It's a little better than nothing I guess.

Did the binary format of the files in hdfs change between the blur versions? Can I just do a compare of the various shard files? Would you expect such a comparison to work? Or does it embed a blur version string in there somewhere or other stuff like hostnames, etc, that might trip up a diff/compare?

How would you compare two blur indexes?

-- Tom

On Mon, Oct 6, 2014 at 4:08 PM, Tim Williams <[email protected]> wrote:
> On Mon, Oct 6, 2014 at 3:38 PM, Tom Hood <[email protected]> wrote:
> > Hi,
> >
> > If I want to iterate all rows in a blur index, is the recommended way to
> > just do a query on "*"?
>
> I think it's fair to say it's recommended *not* to do that (iterate
> over all rows) at all:) On a tiny index, sure, a *:* query works.
> But again, beyond playing around, going deep into results will only
> cause problems - maybe you could share what you're trying to achieve
> we could help with a different approach?
>
> We are, btw, working on some set of features that will provide some
> simple constructs that will make it easier for you to achieve
> something like this... they'll be in the next release but will be
> experimental only.
>
> > It looks like the Selector.setLocationId isn't something I'm supposed to
> > use although it looks like it might provide a means of doing this (i.e.
> > exposing IndexReader.document(docId)). I don't know the set of all rowids
> > in advance, so Selector.setRowId won't help here.
>
> Set your blurQuery.setSelector(new Selector()) (to tell blur to bring
> back all the columns) and adjust the fetch size. Does that help?
>
> --tim
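
P.S. For the "dump and diff" step, here is a rough sketch of the comparison I have in mind. It is only a sketch and assumes each index has already been dumped to a text file with one record per line; that dump format and the function names (diff_dumps, diff_dump_files) are my own placeholders, not anything blur provides. It compares the two dumps as multisets of lines, so the order in which records come back from each cluster doesn't matter.

```python
from collections import Counter


def diff_dumps(before_lines, after_lines):
    """Compare two record dumps as multisets of lines, ignoring order.

    Returns (only_in_before, only_in_after) as sorted lists of lines.
    The one-record-per-line format is an assumption, not a blur format.
    """
    before = Counter(line.rstrip("\n") for line in before_lines)
    after = Counter(line.rstrip("\n") for line in after_lines)
    # Counter subtraction keeps only positive counts, so a record that
    # appears twice before and once after is reported once as missing.
    only_before = before - after
    only_after = after - before
    return sorted(only_before.elements()), sorted(only_after.elements())


def diff_dump_files(before_path, after_path):
    """Same comparison, reading the two dumps from hypothetical file paths."""
    with open(before_path, encoding="utf-8") as b, \
            open(after_path, encoding="utf-8") as a:
        return diff_dumps(b, a)
```

If both returned lists are empty, the dumps contain the same records. That only says the dumped text matches, not that the underlying shard files are byte-identical.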
