On Mon, Nov 9, 2009 at 8:25 AM, Murali Krishna. P <[email protected]> wrote:

> Hi Stack,
>  Thanks for the script.  The cluster is in use and I am not comfortable
> modifying that big table right now (only .001% of the data will be missing,
> which is OK for now).


OK.


> I will definitely try the script some time this week and will report back.
>  Regarding the table, I have 8 region servers (each hosting 5.5k regions),
> each using a 2.3G heap. I have Jetty in front of this to serve data, but have
> yet to benchmark it for performance. Will definitely post the metrics soon.
> Thanks for the upload tool; I couldn't think of loading such a table earlier.
> I did indeed try the TableOutputFormat idea some time back and gave up :)
>


5.5k regions per RS is more than anything I've seen before.  My guess is that
it probably won't perform too well; let us know (this mail message from
Andrew might be of interest to you:
http://osdir.com/ml/hbase-user-hadoop-apache/2009-09/msg00118.html).  How
many rows do you think you have loaded, and what size is your hbase dir in
hdfs?  (Out of interest.)  Also out of interest, what are you trying to
evaluate?  Maybe we can help.  And what are your hardware stats like?  Disks,
RAM and CPUs?

You might try setting larger regions, say 1G.  Be sure to up the flush size
x4 too, so you flush 256M at a time.  (There is a bit of a balance you need to
keep between flusher and splitter; it's better not to have too many files
feeding compactions, since compactions can take a while.)  See
hbase.hregion.memstore.flush.size and hbase.hregion.max.filesize.
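
For what it's worth, here is a rough Python sketch of the arithmetic behind
those two settings.  It assumes the current values are a 64M flush size and a
256M max region size (that is what "x4" to 256M and 1G implies; check your own
hbase-site.xml), and the helper names are made up for illustration.  It prints
the byte values as hbase-site.xml properties, plus a very rough guess at
regions per server if every region were four times larger.

```python
# Sketch only: byte values for the two settings named above, scaled x4.
# ASSUMPTION: current values are 64M (flush size) and 256M (max region
# size), as implied by "x4 -> 256M" and "1G"; verify in hbase-site.xml.

MB = 1024 * 1024
FACTOR = 4

current = {
    "hbase.hregion.memstore.flush.size": 64 * MB,
    "hbase.hregion.max.filesize": 256 * MB,
}

# Scale both settings by the same factor to keep flusher and splitter in step.
proposed = {name: value * FACTOR for name, value in current.items()}


def as_property(name, value):
    """Render one hbase-site.xml <property> element (hypothetical helper)."""
    return (
        "<property>\n"
        f"  <name>{name}</name>\n"
        f"  <value>{value}</value>\n"
        "</property>"
    )


def estimated_regions(regions_now, factor=FACTOR):
    """Rough region count if every region held `factor` times more data."""
    return -(-regions_now // factor)  # ceiling division


if __name__ == "__main__":
    for name, value in proposed.items():
        print(as_property(name, value))
    print("regions per server, roughly:", estimated_regions(5500))
```

The region estimate only describes a table loaded fresh with the larger
setting.  As far as I know, raising hbase.hregion.max.filesize stops further
splits sooner but does not merge regions that already exist.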

St.Ack



>
> Thanks,
> Murali Krishna
>
>
>
>
> ________________________________
> From: stack <[email protected]>
> To: [email protected]
> Sent: Mon, 9 November, 2009 9:12:55 PM
> Subject: Re: Issue with bulk loader tool
>
> On Mon, Nov 9, 2009 at 2:26 AM, Murali Krishna. P <[email protected]
> >wrote:
>
> > hi Stack,
> >  I attached the changes to that patch to the jira.
>
>
>
> Thanks for fixing my hackup.
>
>
>
>
> >   Is it possible to correct an existing table created via the old
> > loadtable? I have a table with a few billion records and 40k regions; it
> > took almost 2 days for the MR to create this. Don't want to do it
> > again :)
> >
> 40k regions is good going.  How many per regionserver?
>
> See the script bin/add_table.rb.  See how the first thing it does is delete
> the old table.   I'd suggest you comment out the moving aside of the table
> directory at line #72 and then everything from line #103 on, where it adds
> the new table into .META.
>
> I uploaded a version with those sections commented out to issue 1962.
> Review before running.
>
> St.Ack
>
