Are you looking for something like this:
https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html

To answer your original question: why not implement the whole job in Hive? Or 
orchestrate it with Oozie, running some parts in MR and some in Hive.
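For reference, HDFS centralized caching (from the linked doc) is driven by the `hdfs cacheadmin` CLI, which pins a path's blocks into off-heap memory on the DataNodes so any framework reading it gets cached reads. A minimal sketch, assuming a running cluster; the pool and path names here are made up for illustration:

```shell
# Create a cache pool to group cache directives (pool name is illustrative)
hdfs cacheadmin -addPool terasort-pool

# Pin the blocks under a directory into the DataNodes' off-heap cache
hdfs cacheadmin -addDirective -path /data/terasort-input -pool terasort-pool

# Inspect the active directives and pool usage
hdfs cacheadmin -listDirectives -stats
hdfs cacheadmin -listPools -stats
```

Unlike the LLAP cache, this works at the HDFS layer, so a plain MapReduce job (e.g. TeraSort) benefits without talking to any Hive daemon.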

> On 30. Jan 2018, at 05:15, Sungwoo Park <[email protected]> wrote:
> 
> Hello all,
> 
> I wonder if an external YARN container can send requests to LLAP daemon to 
> read data from its in-memory cache. For example, YARN containers owned by a 
> typical MapReduce job (e.g., TeraSort) could fetch data directly from LLAP 
> instead of contacting HDFS. In this scenario, LLAP daemon just serves IO 
> requests from YARN containers and does not run its executors to perform 
> non-trivial computation. 
> 
> If this is feasible, LLAP daemon can be shared by all services running in the 
> cluster. Any comment would be appreciated. Thanks a lot.
> 
> -- Gla Park
> 
