> I have also heard about Hortonworks with Tez + LLAP but that is a distro?

Yes. AFAIK, during Hadoop Summit there was an HDP 2.5 tech preview sandbox
instance which shipped Hive2 (scroll all the way down to the end of the
downloads page).

Enable the "interactive mode" in Ambari for a HiveServer2 config group &
HiveServer2 switches over to LLAP.

Though if you're interested in measuring performance, I'd question the
usefulness of an in-memory buffer-cache on a 1-node, cpu/memory
constrained VM.

> Is it a complicated work to build it with Do It Yourself so to speak?

Complicated enough that I have automated it (at least for myself & most of
the devs).

https://github.com/t3rmin4t0r/tez-autobuild/blob/llap/README.md

That setup should work as long as you have a base Apache-compatible
hadoop-2.7.1 install.

Because LLAP is deployed with a "yarn jar" command, after which YARN runs
the instances, no part of the actual deploy requires root on any worker
node.

All you need is access to the metastore db (new features in the metastore)
and a single Zk ensemble to register LLAP onto.

That makes it really easy to "drop into" an existing YARN cluster where
you're not an admin, but the LLAP install is then tied to a single user
(you).
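
If you want to sanity-check those two prerequisites before attempting a
deploy, something like the sketch below might help. It is only an
illustration, not part of tez-autobuild: the function names are made up
for this example, and it assumes your Zk servers still answer the "ruok"
four-letter command (newer ZooKeeper releases may need it whitelisted).

```python
import socket


def zk_is_ok(host, port=2181, timeout=2.0):
    """Send ZooKeeper's 'ruok' four-letter command.

    Returns True only if the server answers 'imok'. Hypothetical helper
    for illustration; not part of LLAP or tez-autobuild.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = sock.recv(16)
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False
    return reply == b"imok"


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    Only proves the metastore db port is reachable, not that your
    credentials or schema version are right.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, zk_is_ok("zk1.example.com") and
port_is_open("metastore-db.example.com", 3306) would check one Zk server
and a MySQL-backed metastore db; both hostnames and the 3306 port are
placeholders for whatever your cluster actually uses.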

That single-user setup is a bit unconventional, since LLAP was never meant
to hijack a user like this and allow access from the CLI.

The real reason for that is so that I can run hive --debug and attach a
remote debugger to the CLI, which is much easier than wading through
HiveServer2's massive number of threads.

I did put up a demo GIF earlier during the Summit, which should give you
an idea of how fast/slow LLAP is with S3 data (which is when the
read-through cache really comes into the limelight).

<https://twitter.com/t3rmin4t0r/status/748630764959338497/photo/1>


Cheers,
Gopal














