It would be interesting to know the size of the data, how it is stored within Hive and what kind of query you run on it.
Typically, 90,000,000 records could be less than 64 MB and could even be loaded entirely into memory. In that case, yes, it is not surprising that alternatives could outperform Hadoop.

If you are using regexes to parse each line (row format), there could be room for improvement there. Then again, depending on the query (multiple joins? group by?), that could have a huge impact too.

Regards

Bertrand

On Wed, Sep 5, 2012 at 8:28 AM, MiaoMiao <[email protected]> wrote:

> Your store 90 million records in DB? What kind?
>
> Sure there are some optimizations to speed up hive query, but I don't
> see a universal one, except adding more servers.
>
> On Wed, Sep 5, 2012 at 2:19 PM, iwannaplay games
> <[email protected]> wrote:
> > Hi all,
> >
> > I ran a query on hive on top of 90 million records that took 12 minutes to
> > execute and same query on sql server took 8 minutes.My question is how can i
> > make hadoop's performance better.What all configurations will improve the
> > latency?
> >
> > Thanks & Regards
> > Prabhjot

--
Bertrand Dechoux
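
To illustrate the remark above about regex-based row parsing, here is a minimal Python sketch that parses the same row once with a regular expression and once by splitting on the delimiter. The three-column tab-separated layout, the sample row, and the function names are assumptions made up for this example; they are not taken from the original poster's table. It only shows the per-row cost difference in plain Python and says nothing definitive about how Hive itself behaves.

```python
import re
import timeit

# Hypothetical row layout (assumption): three tab-separated columns.
ROW_PATTERN = re.compile(r"^([^\t]*)\t([^\t]*)\t([^\t]*)$")
SAMPLE = "42\talice\t2012-09-05"


def parse_with_regex(line):
    """Parse one row with a regular expression."""
    match = ROW_PATTERN.match(line)
    if match is None:
        raise ValueError("malformed row: %r" % line)
    return list(match.groups())


def parse_with_split(line):
    """Parse the same row by splitting on the delimiter."""
    fields = line.split("\t")
    if len(fields) != 3:
        raise ValueError("malformed row: %r" % line)
    return fields


if __name__ == "__main__":
    # Both parsers must agree before their timings are worth comparing.
    assert parse_with_regex(SAMPLE) == parse_with_split(SAMPLE)
    for parser in (parse_with_regex, parse_with_split):
        seconds = timeit.timeit(lambda: parser(SAMPLE), number=200000)
        print("%s: %.3f s for 200000 rows" % (parser.__name__, seconds))
```

If the data is plainly delimited, the rough Hive analogue would be declaring a delimited row format instead of a regex-based SerDe, but whether that helps here depends on how the table was actually defined.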
