For each file, there is a time range. When you scan or search, the file is
skipped if there is no overlap between the file's time range and the time range
of the query. As there are other parameters as well (row distribution,
compaction effects, cache, bloom filters, ...), it's difficult to know in
It seems like the internal logic for handling a time range is
two-part: First, as you said, each file contains the minimum and maximum
timestamps contained within. This provides a very rough filter for the
data, but if your data is right, the effect can be huge. Second, a
time range acts a
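
The file-level filter described above can be sketched in a few lines. This is only a toy model, not HBase code: `StoreFileInfo`, `overlaps`, and `files_to_read` are hypothetical names made up for illustration, and the sketch assumes the query range is half-open, [query_min, query_max), while each file's recorded minimum and maximum timestamps are inclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreFileInfo:
    """Hypothetical stand-in for the timestamp metadata kept per file."""
    name: str
    min_ts: int  # smallest cell timestamp in the file (inclusive)
    max_ts: int  # largest cell timestamp in the file (inclusive)


def overlaps(f: StoreFileInfo, query_min: int, query_max: int) -> bool:
    # Assumption: the query covers [query_min, query_max), so a file whose
    # smallest timestamp equals query_max cannot contain a matching cell.
    return f.min_ts < query_max and f.max_ts >= query_min


def files_to_read(files, query_min: int, query_max: int):
    # Files with no overlap are skipped entirely; the rest still need a
    # per-cell timestamp check, because this filter is only a rough one.
    return [f.name for f in files if overlaps(f, query_min, query_max)]


files = [
    StoreFileInfo("a", 0, 99),
    StoreFileInfo("b", 100, 199),
    StoreFileInfo("c", 200, 299),
]
print(files_to_read(files, 100, 200))
```

With timestamps spread cleanly across files as in this example, a narrow query touches a single file. If every file spans the whole time axis, nothing is skipped, which is why the benefit depends so much on how the data is laid out.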
Subject: Re: Performance of scan setTimeRange VS manually doing it
Hi,
Do you have a Python script for rack awareness configuration?
Thanks!
beatls
On Thu, Sep 13, 2012 at 5:52 AM, Tom Brown tombrow...@gmail.com wrote:
When I query HBase, I always include a time range. This has not been a
problem when querying recent data, but it seems to be an issue