Re: OLAP with Hive

Josh Ferguson Sun, 14 Dec 2008 13:20:02 -0800

I'd honestly like to see hive remain a partitioned flat file store. I don't think indexing what's inside the files is too incredibly useful in most situations where you'd use hive. I also think this kind of store is just the right fit for the hadoop and large scale analytics situation. I don't want to see hive go toward hbase or katta. What is the long term vision for hive?


Josh


On Dec 14, 2008, at 1:06 PM, Joydeep Sen Sarma wrote:

We have done some preliminary work with indexing – but that’s not the focus right now and no code is available in the open source trunk for this purpose. I think it’s fair to say that hive is not optimized for online processing right now. (and we are quite some ways off from columnar storage).

From: Martin Matula [mailto:[email protected]]
Sent: Sunday, December 14, 2008 6:54 AM
To: [email protected]
Subject: OLAP with Hive

Hi,
Is Hive capable of indexing the data and storing them in a way optimized for querying (like a columnar database - bitmap indexes, compression, etc.)? I need to be able to get decent response times for queries (up to a few seconds) over huge amounts of analytical data. Is that achievable (with appropriate number of machines in a cluster)? I saw the serialization/deserialization of tables is pluggable. Is that the way to make the storage more efficient? Any existing implementation (either ready or in progress) that would be targeted at this? Or any hints on what I may want to take a look at among the things that are currently available in Hive/Hadoop?
Thanks,
Martin
