Re: Speeding Up Group By Queries

2016-03-25 Thread James Taylor
Hi Amit, Using 4.7.0-HBase-1.1 release, I see the index being used for that query (see below). An index will help some, as the aggregation can be done in place as the scan over the index is occurring (as opposed to having to hold the distinct values found during grouping in memory per chunk of

Re: Speeding Up Group By Queries

2016-03-25 Thread Mujtaba Chohan
That seems excessively slow for 10M rows, which should take on the order of a few seconds at most without an index. 1. How wide is your table? 2. How many region servers is your data distributed on, and what's the heap size? 3. Do you see lots of disk I/O on region servers during aggregation? 4. Can you try

Re: Disabling HBase Block Cache

2016-03-25 Thread James Taylor
Hi Amit, Have you seen our documentation and examples for ALTER TABLE [1]? So you could do ALTER TABLE my_table SET BLOCKCACHE=false; If you want to prevent rows from being put in the block cache on a per-query basis, you can use the /*+ NO_CACHE */ hint [2] on a query like this: SELECT /*+

Speeding Up Group By Queries

2016-03-25 Thread Amit Shah
Hi, I am trying to evaluate Apache HBase (version 1.0.0) and Phoenix (version 4.6) deployed through Cloudera for our OLAP workload. I have a table that has 10 million rows. When I execute the rollup query below, it takes around 2 minutes to return 1,850 rows. SELECT SUM(UNIT_CNT_SOLD),

Re: Disabling HBase Block Cache

2016-03-25 Thread Amit Shah
I noticed that the charts in Cloudera indicate no block usage when the group by query is executed. This probably means that the block cache is disabled. The only strange thing is that the HBase shell describe command gave BLOCKCACHE => 'true'. It would be great if

Disabling HBase Block Cache

2016-03-25 Thread Amit Shah
Hi, I am using Apache HBase (version 1.0.0) and Phoenix (version 4.6) deployed through Cloudera. Since my aggregation queries with GROUP BY are slow, I want to try disabling the block cache for a particular HBase table. I tried a couple of approaches but couldn't succeed. I am verifying if the