Re: Loading data, hbase slower than Hive?

praveenesh kumar Fri, 18 Jan 2013 09:57:41 -0800

Hey,
Can someone share some pointers on best practices for bulk
imports into HBase?
That would be really helpful.


Regards,
Praveenesh

On Thu, Jan 17, 2013 at 11:16 PM, Mohammad Tariq <donta...@gmail.com> wrote:

> Just to add to whatever all the heavyweights have said above, your MR job
> may not be as efficient as the MR job corresponding to your Hive query. You
> can enhance the performance by setting the mapred config parameters wisely
> and by tuning your MR job.
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> On Thu, Jan 17, 2013 at 10:39 PM, ramkrishna vasudevan <
> ramkrishna.s.vasude...@gmail.com> wrote:
>
> > Hive is geared more toward batch work, and HBase more toward real-time data.
> >
> > Regards
> > Ram
> >
> > On Thu, Jan 17, 2013 at 10:30 PM, Anoop John <anoop.hb...@gmail.com>
> > wrote:
> >
> > > In the case of Hive, inserting data just means placing the file under
> > > the table path in HDFS. HBase needs to read the data and convert it
> > > into its own format (HFiles). MR is doing this work, so it is clear
> > > that HBase will be slower. :)  As Michael said the read operation...
> > >
> > >
> > >
> > > -Anoop-
> > >
> > > On Thu, Jan 17, 2013 at 10:14 PM, Austin Chungath <austi...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > > Problem: Hive took 6 mins to load a data set; HBase took 1 hr 14 mins.
> > > > It's a 20 GB data set, approx. 230 million records. The data is in
> > > > HDFS, in a single text file. The cluster is 11 nodes, 8 cores.
> > > >
> > > > I loaded this into Hive, partitioned by date, bucketed into 32, and
> > > > sorted. Time taken is 6 mins.
> > > >
> > > > I loaded the same data into HBase, in the same cluster, by writing
> > > > MapReduce code. It took 1 hr 14 mins. The cluster wasn't running
> > > > anything else, and assuming that the code I wrote is good enough,
> > > > what is it that makes HBase slower than Hive in loading the data?
> > > >
> > > > Thanks,
> > > > Austin
> > > >
> > >
> >
>
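
For a sense of scale, the figures reported in the original post can be
turned into rough throughput numbers. The sketch below is only arithmetic
on those figures (20 GB, approx. 230 million records, 6 mins vs. 1 hr 14
mins). The function and variable names are made up for illustration, and
it assumes 1 GB = 1024 MB.

```python
def load_rates(records, size_gb, seconds):
    """Return (records per second, MB per second) for one load run."""
    # Assumption: 1 GB = 1024 MB; the original post only says "20 gb".
    return records / seconds, size_gb * 1024 / seconds


# Figures taken from the original post (all approximate).
RECORDS = 230_000_000
SIZE_GB = 20
HIVE_SECONDS = 6 * 60            # 6 mins
HBASE_SECONDS = (60 + 14) * 60   # 1 hr 14 mins

hive_rps, hive_mbps = load_rates(RECORDS, SIZE_GB, HIVE_SECONDS)
hbase_rps, hbase_mbps = load_rates(RECORDS, SIZE_GB, HBASE_SECONDS)

# How many times longer the HBase load took than the Hive load.
slowdown = HBASE_SECONDS / HIVE_SECONDS

print(f"Hive : {hive_rps:,.0f} records/s, {hive_mbps:.1f} MB/s")
print(f"HBase: {hbase_rps:,.0f} records/s, {hbase_mbps:.1f} MB/s")
print(f"HBase load took {slowdown:.1f}x as long as the Hive load")
```

On those numbers the HBase load took a little over 12 times as long as
the Hive load. The Hive figure mostly reflects how fast files can be
placed in HDFS, which is the point Anoop makes above.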
