Hey Gabriel,

Thanks a lot for the suggestion; it turns out we were not quite using our Phoenix client (the sqlline program) correctly - the right configs were not in the right place on the client side. Because of that, I had been trying a bunch of different salt bucket settings and not seeing any improvement. Once I put the right configs in the right place for the client, we got the desired results and everything made sense.
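For reference, the client-side settings live in the hbase-site.xml that sqlline picks up on its classpath. The snippet below is only a rough sketch of the kind of entries involved; the property values are illustrative, not a recommendation:

    <!-- client-side hbase-site.xml on the sqlline classpath -->
    <property>
      <name>phoenix.query.timeoutMs</name>
      <value>600000</value> <!-- 10 minutes, matching the query timeout mentioned below -->
    </property>
    <property>
      <name>hbase.rpc.timeout</name>
      <value>600000</value>
    </property>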
Thanks a ton for your help!

Gaurav

On 28 September 2015 at 10:41, Gabriel Reid <gabriel.r...@gmail.com> wrote:
> Hi Gaurav,
>
> Looking at your DDL statement, I'm guessing that your table is
> currently made up of 33 regions, which means that a full count query
> will take at least as long as it takes to count 27 million rows with
> a single thread (900 million rows divided by 33 regions).
>
> The most likely reason for issues like this is poor parallelization of
> the query. Have you tried running "UPDATE STATISTICS" on the table? If
> not, please see this link:
> http://phoenix.apache.org/update_statistics.html
>
> Manually splitting the table will also likely improve the
> parallelization of a select count(*) query.
>
> - Gabriel
>
>
> On Fri, Sep 25, 2015 at 11:10 PM, Gaurav Kanade <gaurav.kan...@gmail.com> wrote:
> > Hello Guys
> >
> > I was able to load my large data set (200 GB) with the Phoenix bulk load
> > tool with your help last week.
> >
> > But I am now running into another problem when running queries on it
> > using sqlline.
> >
> > All I am trying to do is run a simple count(*) query.
> >
> > Initially I hit timeout issues due to a socket connection exception; I
> > figured out a way to get past this by setting hbase.client.retry.count in
> > hbase-site.xml on the client side.
> >
> > However, I have never been able to successfully run the count(*), even
> > though I increased my Phoenix query timeout to 10 minutes (the query
> > eventually times out).
> >
> > To try to figure out what was happening, I tried to do count(*) on
> > smaller chunks of data by filtering on the first column in the primary
> > key, i.e. WHERE Col1 < x, Col1 < y, and so on.
> >
> > The queries succeed on the smaller chunks, but soon hit a point where
> > they cannot scale and run into the same timeout issues. What is weird is
> > that the behavior is very erratic too; for example, a query may time out
> > (10 minutes) on the first attempt and complete in 2 minutes on the
> > second attempt, and so on.
> >
> > I looked at the region server logs and see no errors that might point to
> > anything, except for a responseTooSlow message from time to time on scan
> > requests.
> >
> > I realize I don't have many specific error messages to provide, but that
> > is mainly because I couldn't find any; it seems to me from the nature of
> > the overall behavior that I might be missing something obvious at a high
> > level; if so, it would be great if you could point me in that direction.
> >
> > The schema for the table is as follows (900 million rows), and I am
> > using a 32 node cluster now:
> >
> > CREATE TABLE AP1.EstimateA33 (
> >     BookId INTEGER NOT NULL, MarketId INTEGER NOT NULL,
> >     StationId INTEGER NOT NULL, EstimateTypeId INTEGER NOT NULL,
> >     DaypartId INTEGER NOT NULL, DemographicId INTEGER NOT NULL,
> >     EstimateValue INTEGER, StateId INTEGER, StateCode VARCHAR,
> >     StateName VARCHAR, EstimateType VARCHAR, MarketName VARCHAR,
> >     StationName VARCHAR, DaypartName VARCHAR, BookType VARCHAR,
> >     BookYear INTEGER, GeographyId INTEGER, SeasonId INTEGER,
> >     BookGeography VARCHAR, BookSeason VARCHAR, DemographicType VARCHAR,
> >     CONSTRAINT pk_Estimates PRIMARY KEY (BookId, MarketId, StationId,
> >         EstimateTypeId, DaypartId, DemographicId)
> > ) SALT_BUCKETS = 33
> >
> > --
> > Thanks
> > Gaurav
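P.S. For anyone who finds this thread in the archives: the statistics refresh Gabriel pointed to is a single statement run from sqlline, after which the count can be re-run (the BookId cutoff in the filtered count below is just an illustrative value, not something specific to our data):

    UPDATE STATISTICS AP1.EstimateA33;

    -- re-run the full count, or a slice on the leading PK column first
    SELECT COUNT(*) FROM AP1.EstimateA33;
    SELECT COUNT(*) FROM AP1.EstimateA33 WHERE BookId < 100;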