Hi Arvind,

Your use case interests me a lot, as I have also been thinking about loading 
wide-column datasets into HBase. It prompted me to do a test run on my local 
machine:

OS X 10.12, 4 CPUs, 4 GB memory, 128 GB SSD
HBase 1.2.2, Phoenix 4.8.0

I generated 5,000 rows, each consisting of 10,000 columns of float values, 
all in a single column family. 
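
For anyone who wants to reproduce a similar dataset, here is a minimal sketch 
of one way to generate it. It is not the exact script used for the test above. 
The table name WIDE_TEST, the column family F, the column names C0..C9999 and 
the BIGINT row key are all assumptions made for illustration.

```python
import csv
import io
import random

N_ROWS = 5000        # row count from the test above
N_COLS = 10000       # float columns per row, from the test above
TABLE = "WIDE_TEST"  # hypothetical table name
FAMILY = "F"         # hypothetical column family name


def build_ddl(table=TABLE, family=FAMILY, n_cols=N_COLS):
    """Return a Phoenix CREATE TABLE statement with n_cols FLOAT columns in one family."""
    cols = ", ".join(f"{family}.C{i} FLOAT" for i in range(n_cols))
    return f"CREATE TABLE IF NOT EXISTS {table} (ID BIGINT NOT NULL PRIMARY KEY, {cols})"


def write_rows(out, n_rows=N_ROWS, n_cols=N_COLS, seed=0):
    """Write n_rows CSV lines: a row id followed by n_cols random floats."""
    rng = random.Random(seed)
    writer = csv.writer(out)
    for row_id in range(n_rows):
        writer.writerow([row_id] + [round(rng.random(), 4) for _ in range(n_cols)])


# Small smoke run into memory; pass a real file and keep the defaults
# to produce the full 5,000 x 10,000 dataset.
buf = io.StringIO()
write_rows(buf, n_rows=2, n_cols=3)
```

The DDL from build_ddl() can be run through any Phoenix client, and the CSV 
written by write_rows() can then be loaded with whichever loading method you 
prefer.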

It took about 23 minutes to insert, but only about 7 seconds to run SELECT * 
FROM table LIMIT 5. Maximum heap usage stayed below 1 GB.
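
To put the insert time in perspective, the arithmetic can be sketched as 
follows. The inputs are the approximate figures quoted above, so the result is 
only a rough estimate: roughly 3.6 rows, or about 36,000 cells, per second.

```python
ROWS = 5000               # from the test above
COLS = 10000              # from the test above
INSERT_SECONDS = 23 * 60  # "about 23 minutes", so treat as approximate


def throughput(rows, cols, seconds):
    """Return (rows per second, cells per second) for a load of rows x cols cells."""
    return rows / seconds, rows * cols / seconds


rows_per_s, cells_per_s = throughput(ROWS, COLS, INSERT_SECONDS)
print(f"{rows_per_s:.1f} rows/s, {cells_per_s:,.0f} cells/s")
```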

Given your setup (16 GB RS heap, 8 cores), I would expect yours to be much 
faster than mine, though the SSD may compensate a lot in my case.

-- 
Juvenn Woo


On Thursday, 29 December 2016 at 1:58 PM, Ankit Singhal wrote:

> Have you checked your query performance without sqlline? As Jonathan also 
> mentioned, sqlline has its own performance issues when reading metadata 
> (so the time is probably being spent by sqlline reading the metadata for 
> 3600 columns and printing the header).
> 
> On Wed, Dec 28, 2016 at 12:04 AM, Mark Heppner wrote:
> > If you don't need to query any of the 3600 columns, you could even just use 
> > JSON inside of a VARCHAR field.
> > 
> > On Mon, Dec 26, 2016 at 2:25 AM, Arvind S wrote:
> > > Setup .. 
> > > hbase (1.1.2.2.4) cluster on azure with 1 Region server. (8core 28 gb ram 
> > > ..~16gb RS heap)
> > > phoenix .. 4.4
> > > 
> > > Observation .. 
> > > created a table with 3 col composite PK and 3600 float type columns (1 
> > > per sec).
> > > loaded with <5000 lines of data (<100 MB compressed snappy & fast diff 
> > > encoding) 
> > > 
> > > On performing "select *", or a select that names each of these 3600 
> > > columns individually, the query takes 2+ mins just to return a few 
> > > lines (limit 2, 10, etc.).
> > > 
> > > Subsequently, on selecting fewer columns, the performance seems to 
> > > improve. 
> > > 
> > > Is it an anti-pattern to have a large number of columns in Phoenix tables? 
> > > Cheers !!
> > > Arvind
> > > 
> > 
> > -- 
> > Mark Heppner
> 
