Hi, I have tested read performance after reducing the number of column families from 14 to 3, and yes, there is an improvement. Meanwhile, I was going through the paper published by Google on BigTable. It says:
"It is our intent that the number of distinct column families in a table be small (in the hundreds at most), and that families rarely change during operation."

So is that a theoretical value (100 CFs), or is it possible, just not with the current version of HBase?

On Tue, Jul 2, 2013 at 12:48 AM, Viral Bajaria <viral.baja...@gmail.com> wrote:

> On Mon, Jul 1, 2013 at 10:06 AM, Vimal Jain <vkj...@gmail.com> wrote:
>
> > Sorry for the typo .. please ignore previous mail.. Here is the corrected
> > one..
> > 1) I have around 140 columns for each row; out of those 140, around 100
> > columns hold Java primitive data types, and the remaining 40 columns
> > contain serialized Java objects as byte arrays (inside each object is an
> > ArrayList). Yes, I do delete data, but the frequency is very low (1 out
> > of 5K operations). I don't run any compaction.
>
> This answers the type of data in each cell, not the size of the data. Can
> you figure out the average size of the data that you insert? For example,
> what is the length of the byte array? Also, for the Java primitives, are
> they 8-byte longs? 4-byte ints?
> In addition to that, what is in the row key? How long is it in bytes?
> Same for the column families: can you share the names of the column
> families? How about the qualifiers?
>
> If you have disabled major compactions, you should run one every few days
> (if not once a day) to consolidate the number of files that each scan will
> have to open.
>
> > 2) I had run the scan keeping in mind the CPU, IO and other system
> > related parameters. I found them to be normal, with the system load
> > being 0.1-0.3.
>
> How many disks do you have in your box? Have you ever benchmarked the
> hardware?
>
> Thanks,
> Viral

--
Thanks and Regards,
Vimal Jain
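
[Editor's note for readers following this thread: the lengths Viral is asking about (row key, family name, qualifier, value) matter because HBase stores the full key with every cell. A rough back-of-the-envelope sketch, assuming the classic KeyValue on-disk layout and purely hypothetical lengths -- the 16-byte row key, 2-byte family name and 10-byte qualifier below are placeholders, not values from this thread:]

```python
def cell_size(row_key_len, family_len, qualifier_len, value_len):
    """Approximate bytes for one HBase cell in the classic KeyValue layout.

    Layout: key length (4) + value length (4) + row length (2) + row
    + family length (1) + family + qualifier + timestamp (8)
    + key type (1) + value.
    """
    fixed_overhead = 4 + 4 + 2 + 1 + 8 + 1  # 20 bytes of framing per cell
    return fixed_overhead + row_key_len + family_len + qualifier_len + value_len

# Hypothetical numbers: 16-byte row key, 2-byte family name,
# 10-byte qualifier, 8-byte long value.
per_cell = cell_size(16, 2, 10, 8)
per_row = per_cell * 140  # 140 columns per row, as described in the thread
print(per_cell, per_row)  # -> 56 7840
```

Note how the 20 bytes of framing plus the repeated row key, family and qualifier can dwarf an 8-byte value, which is why short family and qualifier names (and answering Viral's length questions) matter when sizing this table.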