Re: Improving query performance on hive and hdfs

Bertrand Dechoux Wed, 05 Sep 2012 00:32:53 -0700

It would be interesting to know the size of the data, how it is stored
within Hive and what kind of query you run on it.

Typically, 90,000,000 records could be less than 64 MB and could even be
loaded entirely into memory. In that case, yes, it is not surprising that
alternatives could outperform Hadoop.

If you are using regexes to parse each line (row format), that could be
a point of improvement.
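
To give a rough feel for why regex-based row parsing can cost more than
plain delimiter splitting, here is a minimal sketch in Python. It is only an
illustration and not how Hive parses rows internally: the tab-separated
three-column layout, the sample line and the function names are all
assumptions made up for the example, and actual timings will vary by machine.

```python
import re
import timeit

# Hypothetical tab-separated row layout (assumption): id, name, score.
LINE = "42\talice\t3.14"

# Regex-based parsing, similar in spirit to a regex row format.
ROW_PATTERN = re.compile(r"^([^\t]*)\t([^\t]*)\t([^\t]*)$")


def parse_with_regex(line):
    """Return the three fields using a compiled regex, or None if no match."""
    match = ROW_PATTERN.match(line)
    return list(match.groups()) if match else None


def parse_with_split(line):
    """Return the three fields using a plain delimiter split, or None."""
    fields = line.split("\t")
    return fields if len(fields) == 3 else None


if __name__ == "__main__":
    # Both parsers must agree before their cost is compared.
    assert parse_with_regex(LINE) == parse_with_split(LINE)
    for fn in (parse_with_regex, parse_with_split):
        seconds = timeit.timeit(lambda: fn(LINE), number=100_000)
        print(f"{fn.__name__}: {seconds:.3f}s for 100,000 rows")
```

If the data is already cleanly delimited, a plain delimited row format gives
the same fields without running a pattern match on every row.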

Then again, depending on the query (multiple joins? group by?), that could
have a huge impact too.

Regards

Bertrand

On Wed, Sep 5, 2012 at 8:28 AM, MiaoMiao <[email protected]> wrote:

> You store 90 million records in DB? What kind?
>
> Sure there are some optimizations to speed up hive query, but I don't
> see a universal one, except adding more servers.
>
> On Wed, Sep 5, 2012 at 2:19 PM, iwannaplay games
> <[email protected]> wrote:
> > Hi all,
> >
> > I ran a query on hive on top of 90 million records that took 12 minutes to
> > execute and same query on sql server took 8 minutes. My question is how can i
> > make hadoop's performance better. What all configurations will improve the
> > latency?
> >
> > Thanks & Regards
> > Prabhjot
>

-- 
Bertrand Dechoux
