Anil,

Please let us know how well this works.

On Mon, Aug 27, 2012 at 4:19 PM, anil gupta <anilgupt...@gmail.com> wrote:

> Hi Guys,
>
> I was digging through the hbase-default.xml file and I found this property
> related to HFile handling:
>
>   </property>
>   <property>
>     <name>hfile.format.version</name>
>     <value>2</value>
>     <description>
>       The HFile format version to use for new files. Set this to 1 to test
>       backwards-compatibility. The default value of this option should be
>       consistent with FixedFileTrailer.MAX_VERSION.
>     </description>
>   </property>
>
> I believe setting this to 1 would help me carry out my test. Now we know
> how to store data in HFileV1 in HBase0.92 :) . I'll post the result once I
> try this out.
>
> Thanks,
> Anil
>
> On Wed, Aug 15, 2012 at 5:09 AM, J Mohamed Zahoor <jmo...@gmail.com> wrote:
>
> > Cool. Now we have something on the records :-)
> >
> > ./Zahoor@iPad
> >
> > On 15-Aug-2012, at 3:12 AM, Harsh J <ha...@cloudera.com> wrote:
> >
> > > Not wanting this thread, too, to end up as a mystery-result on the
> > > web, I did some tests. I loaded 10k rows (of 100 KB random chars each)
> > > into test tables on both 0.90 and 0.92, flushed them, major_compact'ed
> > > them (waited for completion and a drop in IO write activity) and then
> > > measured them to find this:
> > >
> > > 0.92 takes a total of 1049661190 bytes under its /hbase/test directory.
> > > 0.90 takes a total of 1049467570 bytes under its /hbase/test directory.
> > >
> > > So… not much of a difference. It is still your data that counts. I
> > > believe what Anil may have had were merely additional, un-compacted
> > > stores?
> > >
> > > P.S. Note that my 'test' table was all defaults. That is, merely
> > > "create 'test', 'col1'", nothing else, so a block index entry must've
> > > probably gotten created for every row, as the block size is 64k by
> > > default, while my rows are all 100k each.
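
For the record, the gap between the two totals Harsh reports can be put into numbers. This is a minimal Python sketch that uses only the two byte counts quoted above; the helper name `size_delta` is made up for illustration and is not part of HBase or Hadoop.

```python
def size_delta(bytes_a, bytes_b):
    """Return (bytes_a - bytes_b, that difference as a percentage of bytes_b)."""
    diff = bytes_a - bytes_b
    return diff, 100.0 * diff / bytes_b


# Totals under /hbase/test, copied from the measurements quoted in this thread.
V092_BYTES = 1049661190
V090_BYTES = 1049467570

diff, pct = size_delta(V092_BYTES, V090_BYTES)
print(f"0.92 - 0.90 = {diff} bytes ({pct:.4f}% of the 0.90 total)")
```

The difference works out to 193620 bytes, roughly two hundredths of a percent, which matches the "not much of a difference" reading.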
> > >
> > > On Wed, Aug 15, 2012 at 2:25 AM, anil gupta <anilgupt...@gmail.com> wrote:
> > >> Hi Kevin,
> > >>
> > >> If it's not possible to store a table in HFilev1 in HBase 0.92 then my last
> > >> option will be to store data on a pseudo-distributed or standalone cluster
> > >> for the comparison.
> > >> The advantage with the current installation is that it's a fully distributed
> > >> cluster with around 33 million records in a table. So, it would give me a
> > >> better estimate.
> > >>
> > >> Thanks,
> > >> Anil Gupta
> > >>
> > >> On Tue, Aug 14, 2012 at 1:48 PM, Kevin O'dell <kevin.od...@cloudera.com> wrote:
> > >>
> > >>> Do you not have a pseudo cluster for testing anywhere?
> > >>>
> > >>> On Tue, Aug 14, 2012 at 4:46 PM, anil gupta <anilgupt...@gmail.com> wrote:
> > >>>
> > >>>> Hi Jerry,
> > >>>>
> > >>>> I am willing to do that but the problem is that I wiped off the HBase0.90
> > >>>> cluster. Is there a way to store a table in HFilev1 in HBase0.92? If I can
> > >>>> store a file in HFilev1 in 0.92 then I can do the comparison.
> > >>>>
> > >>>> Thanks,
> > >>>> Anil Gupta
> > >>>>
> > >>>> On Tue, Aug 14, 2012 at 1:28 PM, Jerry Lam <chiling...@gmail.com> wrote:
> > >>>>
> > >>>>> Hi Anil:
> > >>>>>
> > >>>>> Maybe you can try to compare the two HFile implementations directly? Let's
> > >>>>> say write 1000 rows into HFile v1 format and then into HFile v2 format. You
> > >>>>> can then compare the size of the two directly?
> > >>>>>
> > >>>>> HTH,
> > >>>>>
> > >>>>> Jerry
> > >>>>>
> > >>>>> On Tue, Aug 14, 2012 at 3:36 PM, anil gupta <anilgupt...@gmail.com> wrote:
> > >>>>>
> > >>>>>> Hi Zahoor,
> > >>>>>>
> > >>>>>> Then it seems like I might have missed something when doing the hdfs usage
> > >>>>>> estimation of HBase. I usually do hadoop fs -dus /hbase/$TABLE_NAME for
> > >>>>>> getting the hdfs usage of a table. Is this the right way?
> > >>>>>> Since I wiped off the HBase0.90 cluster, I cannot look into its hdfs
> > >>>>>> usage now. Is it possible to store a table in HFileV1 instead of HFileV2
> > >>>>>> in HBase0.92? In this way I can do a fair comparison.
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> Anil Gupta
> > >>>>>>
> > >>>>>> On Tue, Aug 14, 2012 at 12:13 PM, jmozah <jmo...@gmail.com> wrote:
> > >>>>>>
> > >>>>>>> Hi Anil,
> > >>>>>>>
> > >>>>>>> I really doubt that there is a 50% drop in file sizes... As far as I
> > >>>>>>> know.. there is no drastic space conserving feature in V2. Just as an
> > >>>>>>> afterthought.. do a major compact and check the sizes.
> > >>>>>>>
> > >>>>>>> ./Zahoor
> > >>>>>>> http://blog.zahoor.in
> > >>>>>>>
> > >>>>>>> On 15-Aug-2012, at 12:31 AM, anil gupta <anilgupt...@gmail.com> wrote:
> > >>>>>>>
> > >>>>>>>> l
> > >>>>>>>
> > >>>>>>
> > >>>>>> --
> > >>>>>> Thanks & Regards,
> > >>>>>> Anil Gupta
> > >>>>>>
> > >>>>
> > >>>> --
> > >>>> Thanks & Regards,
> > >>>> Anil Gupta
> > >>>>
> > >>>
> > >>> --
> > >>> Kevin O'Dell
> > >>> Customer Operations Engineer, Cloudera
> > >>>
> > >>
> > >> --
> > >> Thanks & Regards,
> > >> Anil Gupta
> > >
> > > --
> > > Harsh J
>
> --
> Thanks & Regards,
> Anil Gupta

--
Kevin O'Dell
Customer Operations Engineer, Cloudera
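
For anyone who wants to try the override Anil describes in the quoted message, here is a hedged Python sketch that renders the property element with the value changed from 2 to 1. It assumes the element would be pasted inside the `<configuration>` block of hbase-site.xml, the usual place for site overrides. The helper name `hbase_property` is hypothetical, and nobody in this thread has yet confirmed that 0.92 then writes HFile v1 files.

```python
import xml.etree.ElementTree as ET


def hbase_property(name, value):
    """Build a <property> element in the layout used by hbase-default.xml."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return prop


# hfile.format.version is the property quoted in the thread; "1" is the value
# its description suggests for testing backwards-compatibility.
override = hbase_property("hfile.format.version", "1")
snippet = ET.tostring(override, encoding="unicode")
print(snippet)
```

Any change to the site configuration would presumably need a restart of the region servers before new files pick it up; that part is an assumption as well.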