Re: How to improve HBase read performance.

Juan Jose Escobar Thu, 07 Jun 2018 00:46:10 -0700

Hello,

Are you performing a single Scan? Note that a single Scan processes the rows
sequentially, one row at a time and one region at a time.


To benefit from your salted rowkeys and speed up the read, you may want to
execute N Scans in parallel (e.g. one per salt prefix). If you are using
MapReduce, you can create a custom TableInputFormat that generates one
split per region (or per prefix) with the salted ranges and pass that to
your job configuration (e.g. for MapReduce or Spark).
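
A minimal Python sketch of the one-Scan-per-salt-prefix idea above. It is not
HBase client code. `scan_range` is a hypothetical callable that you would back
with a real range scan (start row inclusive, stop row exclusive, as in an HBase
Scan). The key layout (string salt followed by a zero-padded decimal timestamp
of fixed `width`) is an assumption, since the thread does not say how the salt
or the timestamp is encoded.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Sequence, Tuple

RowRange = Tuple[bytes, bytes]


def salted_ranges(salts: Sequence[str], start_ts: int, stop_ts: int,
                  width: int = 13) -> List[RowRange]:
    """Build one (start_row, stop_row) pair per salt prefix.

    Assumes row keys look like salt + zero-padded timestamp + suffix, so that
    byte order within one salt prefix matches timestamp order.
    """
    ranges = []
    for salt in salts:
        start_row = f"{salt}{start_ts:0{width}d}".encode()
        stop_row = f"{salt}{stop_ts:0{width}d}".encode()
        ranges.append((start_row, stop_row))
    return ranges


def parallel_scan(ranges: Sequence[RowRange],
                  scan_range: Callable[[bytes, bytes], Iterable[bytes]],
                  max_workers: int = 0) -> List[bytes]:
    """Run scan_range once per range on a thread pool and merge the results.

    Results are concatenated in the order of `ranges`, i.e. grouped by salt,
    not globally sorted by timestamp.
    """
    workers = max_workers or max(1, len(ranges))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Materialize each scan inside its worker thread.
        per_range = pool.map(lambda r: list(scan_range(r[0], r[1])), ranges)
        return [row for rows in per_range for row in rows]
```

The same list of ranges could also serve as the split boundaries for a custom
input format, one split per salt prefix. If the caller needs rows in timestamp
order, the per-salt results still have to be merged afterwards.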


On Thu, Jun 7, 2018 at 4:05 AM, Kang Minwoo <minwoo.k...@outlook.com> wrote:

> Sorry for the late reply.
>
> The row key structure is shown below.
>
> {hash_salt (string hash)} + {timestamp} + {string}
>
> and I am doing a range scan.
>
> Best regards,
> Minwoo Kang
>
> ________________________________________
> From: sudhir patil <spatil.sud...@gmail.com>
> Sent: Saturday, May 26, 2018 19:29
> To: user@hbase.apache.org
> Subject: Re: How to improve HBase read performance.
>
> HBase performance is highly dependent on the query and the row key format.
>
> Can you share a few rowkeys and the query format? Also, what encoding are
> you using?
>
> On Thu, May 24, 2018 at 8:38 PM, Kang Minwoo <minwoo.k...@outlook.com>
> wrote:
>
> > 5B logs a day?
> > => Yes, 5B/day
> >
> > You store a single log into a single cell?
> > => No, I store more than 50 cells, because there is extra information.
> >
> > The log in the hbase cell is compressed or you have enabled compression on
> > the hbase column family?
> > => enabled compression on the hbase column family
> >
> > What is your test environment like? Is it a single server? How are you
> > running your compare?
> > => About 100 region servers. I tested Hive SQL vs. HBase scan on the same
> > source data.
> > (source data -> put into HBase, convert to ORC file -> test Hive SQL vs.
> > HBase scan)
> >
> >
> > Best regards,
> > Minwoo Kang
> >
> > ________________________________________
> > From: saint....@gmail.com <saint....@gmail.com> on behalf of
> > Stack <st...@duboce.net>
> > Sent: Thursday, May 24, 2018 01:33
> > To: Hbase-User
> > Subject: Re: How to improve HBase read performance.
> >
> > On Wed, May 16, 2018 at 7:30 PM, Kang Minwoo <minwoo.k...@outlook.com>
> > wrote:
> >
> > > Here is information.
> > >
> > > store about 5 billion a day.
> > >
> >
> >
> > 5B logs a day?
> >
> >
> >
> > > size of each log is about 4k.
> > >
> >
> > You store a single log into a single cell?
> >
> >
> >
> > > compression by snappy.
> > >
> > >
> > The log in the hbase cell is compressed or you have enabled compression on
> > the hbase column family?
> >
> >
> > What is your test environment like? Is it a single server? How are you
> > running your compare?
> >
> > Thanks,
> > M
> >
> >
> >
> >
> > > Best regards,
> > > Minwoo Kang
> > >
> > > ________________________________________
> > > From: Ted Yu <yuzhih...@gmail.com>
> > > Sent: Thursday, May 17, 2018 00:01
> > > To: user@hbase.apache.org
> > > Subject: Re: How to improve HBase read performance.
> > >
> > > bq. store a lot of logs in HBase
> > >
> > > Kang:
> > > Can you tell us a bit more about how you store (and access) the log(s):
> > > the size of each log, and whether the log is encoded in HBase?
> > >
> > > ORC is a columnar format, while HBase uses a different format.
> > >
> > > Thanks
> > >
> > > On Wed, May 16, 2018 at 6:41 AM, Marcell Ortutay <mortu...@23andme.com>
> > > wrote:
> > >
> > > > This ticket: https://issues.apache.org/jira/browse/HBASE-20459 was fixed
> > > > in the latest version of HBase; upgrading to the latest may help with
> > > > performance.
> > > >
> > > > On Wed, May 16, 2018 at 3:55 AM, Kang Minwoo <minwoo.k...@outlook.com>
> > > > wrote:
> > > >
> > > > > Hi, Users.
> > > > >
> > > > > I store a lot of logs in HBase.
> > > > > However, the reading speed of the log is too slow compared to the
> > > > > Hive ORC file.
> > > > > I know that HBase is slow compared to the Hive ORC file.
> > > > > The problem is that it is too slow.
> > > > > HBase is about 6 times slower.
> > > > >
> > > > > Is there a good way to speed up HBase's reading speed?
> > > > > Should I add a lot more servers?
> > > > >
> > > > > I am using HBase 1.2.6.
> > > > >
> > > > > Best regards,
> > > > > Minwoo Kang
> > > > >
> > > >
> > >
> >
>
