Yes, I thought about a large number, and you said it depends on the block size.
Good point.

One record is ~4 KB, and the block size is:

<property>
  <name>dfs.block.size</name>
  <value>268435456</value>
  <description>HDFS blocksize of 256MB for large file-systems.
</description>
</property>

What number should I choose? I am afraid that using a value large enough to
fetch a whole block at once will lead to a SocketTimeoutException. Am I right?
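As a back-of-the-envelope check (just arithmetic on the numbers quoted in this thread, not an HBase recommendation): with ~4 KB records and the 256 MB dfs.block.size above, one block holds tens of thousands of records, so a caching value of 50 still fetches only a tiny fraction of a block per round trip:

```python
# Rough sizing based on the numbers in this thread.
# Assumes ~4 KB per record (approximate) and the quoted dfs.block.size.

record_size = 4 * 1024        # ~4 KB per record
block_size = 268435456        # dfs.block.size = 256 MB

records_per_block = block_size // record_size
print(records_per_block)      # how many ~4 KB records fit in one 256 MB block
```

So going from caching = 1 to caching = 50 changes very little relative to a block; whether a much larger value risks client timeouts depends on how long the server needs to fill one batch.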

Thanks, Oleg.




On Thu, Nov 11, 2010 at 1:30 PM, Friso van Vollenhoven <
fvanvollenho...@xebia.com> wrote:

> How small is small? If it is bytes, then setting the value to 50 is not so
> much different from 1, I suppose. If 50 rows fit in one block, it will just
> fetch one block whether the setting is 1 or 50. You might want to try a
> larger value. It should be fine if the records are small and you need them
> all on the client side anyway.
>
> It also depends on the block size, of course. When you only ever do full
> scans on a table and little random access, you might want to increase that.
>
> Friso
>
>
>
>
> On 11 nov 2010, at 12:15, Oleg Ruchovets wrote:
>
> > Hi,
> >   To improve client performance I changed
> > hbase.client.scanner.caching from 1 to 50.
> > After running the client with the new value (hbase.client.scanner.caching =
> > 50), it didn't improve the execution time at all.
> >
> > I have ~ 9 million small records.
> > I have to do a full scan, so it brings all 9 million records to the client.
> > My assumption was that this change would bring a significant improvement,
> > but it did not.
> >
> > Additional Information.
> > I scan a table which has 100 regions
> > 5 servers
> > 20 maps
> > 4 concurrent maps
> > The scan takes 5.5-6 hours, which seems like too much time to me. Am I
> > right, and how can I improve it?
> >
> >
> > I changed the value in all hbase-site.xml files and restarted HBase.
> >
> > Any suggestions?
>
>
