Hi,
I have field indexes that look something like:
Row Id: <date>-<UUID>
CF: fi||<type>||<value>
CQ: <date>-<UUID>
For example:
20130814-550e8400-e29b-41d4-a716-446655440000 fi||verb||run 20130814-550e8400-e29b-41d4-a716-446655440000
20130814-550e8400-e29b-41d4-a716-446655440000 page||58 line||16 "the boy can run up the hill"
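To make the layout concrete, here is a minimal sketch in plain Java (no Accumulo dependency) of how the key parts of a field-index entry are composed. The class and method names are hypothetical and exist only for illustration; they are not part of any Accumulo API.

```java
import java.util.UUID;

// Hypothetical helper, for illustration only: composes the key parts of a
// field-index entry following the layout described above.
final class FieldIndexKeys {
    private static final String SEP = "||";

    // Row ID and column qualifier share the same shape: <date>-<UUID>.
    static String docId(String date, UUID uuid) {
        return date + "-" + uuid;
    }

    // Column family: fi||<type>||<value>
    static String columnFamily(String type, String value) {
        return "fi" + SEP + type + SEP + value;
    }
}
```

For the example entry, `FieldIndexKeys.columnFamily("verb", "run")` yields the column family `fi||verb||run`.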
From what I could determine from the doco and API, I am executing the following
code to perform an intersecting query on two values:
Set<Range> shards = new HashSet<Range>();
Text[] terms = {new Text("fi||<type>||<value>"), new Text("fi||<type>||<value>")};
BatchScanner bs = conn.createBatchScanner(table, auths, 20);
bs.setTimeout(360, TimeUnit.SECONDS);
IteratorSetting iter = new IteratorSetting(20, "ii", IntersectingIterator.class);
IntersectingIterator.setColumnFamilies(iter, terms);
bs.addScanIterator(iter);
bs.setRanges(Collections.singleton(new Range()));
for (Entry<Key,Value> entry : bs) {
    shards.add(new Range(entry.getKey().getColumnQualifier()));
}
I then perform a second batch scan using the set of ranges returned by the
above to get my actual results.
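The intended two-step logic can be sketched without Accumulo: intersect the sets of document IDs indexed under each term, then fetch the matching documents. This is a plain-Java stand-in over in-memory maps with hypothetical names; it only illustrates the expected result and is not how IntersectingIterator is implemented.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for the two scans, for illustration only.
final class TwoStepQuery {

    // Step 1: collect the document IDs that appear under every requested term.
    // 'index' maps a term (e.g. "fi||verb||run") to the IDs indexed under it.
    static Set<String> intersect(Map<String, Set<String>> index, List<String> terms) {
        Set<String> result = null;
        for (String term : terms) {
            Set<String> ids = index.getOrDefault(term, Collections.<String>emptySet());
            if (result == null) {
                result = new TreeSet<>(ids);   // first term seeds the candidate set
            } else {
                result.retainAll(ids);         // later terms narrow it down
            }
        }
        return result == null ? new TreeSet<String>() : result;
    }

    // Step 2: look up the documents for the surviving IDs.
    static List<String> fetch(Map<String, String> docs, Set<String> ids) {
        List<String> out = new ArrayList<>();
        for (String id : ids) {
            String doc = docs.get(id);
            if (doc != null) {
                out.add(doc);
            }
        }
        return out;
    }
}
```

Only IDs present under both terms survive step 1, so step 2 touches just those documents rather than the whole table.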
My issue is that the intersecting query takes several minutes to return, if it
returns at all (in some cases it times out). Is this expected? Is there some way
to improve performance? Is there a better way to do this sort of query?
Any guidance would be much appreciated.
Thanks
Luke