There are two kinds of scans. One is a unique scan, where we know
that only one unique row or a set of unique rows will be returned.
The second is a non-unique scan, where multiple rows are returned.

Does this optimization apply to both of these cases?

We do have estimates of accessed rows at compile time and could turn this
optimization on if the estimated number of rows is small.
What happens if this flag is set and the scan is not small or doesn't fit in
the cache? Will that work with some perf degradation or will it fail?
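
To make the compile-time idea concrete, here is a minimal sketch of how the row estimate could gate the flag. Everything in it is an assumption for illustration: the class name, the method name, and the byte budget are hypothetical and not taken from the Trafodion code. The result would be the boolean handed to Scan.setSmall(boolean) in the HBase 0.98 client. It does not answer what happens when the estimate is wrong; it only shows the decision itself.

```java
public final class SmallScanHeuristic {
    private SmallScanHeuristic() {}

    /**
     * Hypothetical helper: returns true when the estimated result size
     * (estimatedRows * avgRowBytes) fits within budgetBytes.
     * budgetBytes stands in for whatever cache size is configured.
     */
    public static boolean shouldUseSmallScan(long estimatedRows,
                                             long avgRowBytes,
                                             long budgetBytes) {
        // Without a trustworthy estimate, fall back to the regular scanner.
        if (estimatedRows < 0 || avgRowBytes < 0 || budgetBytes <= 0) {
            return false;
        }
        if (estimatedRows == 0) {
            return true;
        }
        // Guard against overflow when multiplying two large estimates.
        if (avgRowBytes > 0 && estimatedRows > Long.MAX_VALUE / avgRowBytes) {
            return false;
        }
        return estimatedRows * avgRowBytes <= budgetBytes;
    }

    public static void main(String[] args) {
        long budget = 64L * 1024; // assumed budget, for illustration only
        // A unique scan returning one row of roughly 512 bytes.
        System.out.println(shouldUseSmallScan(1, 512, budget));
        // A non-unique scan estimated at 100,000 rows.
        System.out.println(shouldUseSmallScan(100_000, 512, budget));
    }
}
```

A unique scan would normally pass this check; a non-unique scan would pass only when the compiler's estimate is low.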

We only read metadata information from HBase when the table is used for the
first time in a session or if the table definition is changed. That info is
then cached in compiler session structures.
Are you seeing metadata traffic every time you run a query from the same
session? Are these queries on the same table?
Is the table definition changing within your session?
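
The caching behavior described above can be sketched roughly as follows. This is a simplified model and not the actual compiler session structures: the class, the version number used to detect a changed definition, and the loader function are all hypothetical. How a definition change is really detected is not covered in this thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public final class SessionMetadataCache {
    // One cached table definition plus the version it was loaded at.
    private static final class Entry {
        final long version;
        final String definition;

        Entry(long version, String definition) {
            this.version = version;
            this.definition = definition;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final Function<String, String> loader; // stands in for the metadata read
    private int loads = 0;

    public SessionMetadataCache(Function<String, String> loader) {
        this.loader = loader;
    }

    /**
     * Returns the cached definition, reloading only on first use of the
     * table in this session or when the (assumed) version has changed.
     */
    public String get(String table, long currentVersion) {
        Entry e = entries.get(table);
        if (e == null || e.version != currentVersion) {
            loads++;
            e = new Entry(currentVersion, loader.apply(table));
            entries.put(table, e);
        }
        return e.definition;
    }

    /** Number of times the loader was actually invoked. */
    public int loadCount() {
        return loads;
    }
}
```

With this model, repeated queries on the same unchanged table in one session trigger a single load, which is why metadata traffic on every query would be unexpected.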

anoop

-----Original Message-----
From: Eric Owhadi [mailto:eric.owh...@esgyn.com]
Sent: Friday, July 24, 2015 6:46 AM
To: dev@trafodion.incubator.apache.org
Subject: Small scanner: do you know about it, can it buy us perf
improvement?

Hi all,
Stepping through the code, I have seen scan traffic on metadata. Now I don't
know if this traffic is loading some cache, or if metadata is accessed
every time with no caching (apart from the HBase block cache, of course).
If so, I am not sure developers are aware of a new scanner optimized for
scans whose result set fits in the configured cache size. It is
3 times faster than the regular scanner, and all you have to do is set a
flag in the scan object to indicate that it can benefit from the small scanner
perf optimization. In a nutshell, the small scanner does its job in one RPC
instead of 3 for the regular scanner (combining the open/next/close). The small
scanner is already available in the HBase 0.98 we use.

Do you think we can take advantage of the small scanner feature, either for
metadata or somewhere else?

regards,
Eric Owhadi
