Are you looking for something like this: https://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html
To answer your original question: why not implement the whole job in Hive? Or orchestrate with Oozie, running some parts in MapReduce and some in Hive.

> On 30. Jan 2018, at 05:15, Sungwoo Park <[email protected]> wrote:
>
> Hello all,
>
> I wonder if an external YARN container can send requests to an LLAP daemon to
> read data from its in-memory cache. For example, YARN containers owned by a
> typical MapReduce job (e.g., TeraSort) could fetch data directly from LLAP
> instead of contacting HDFS. In this scenario, the LLAP daemon would just serve
> IO requests from YARN containers and would not run its executors to perform
> non-trivial computation.
>
> If this is feasible, the LLAP daemon could be shared by all services running
> in the cluster. Any comment would be appreciated. Thanks a lot.
>
> -- Gla Park
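For what it's worth, the Oozie route could look roughly like the sketch below: a workflow that runs a MapReduce action first and then a Hive action. All names here (the action names, the script path, the property values) are placeholders for illustration, not taken from the thread:

```xml
<workflow-app name="mixed-mr-hive" xmlns="uri:oozie:workflow:0.5">
    <start to="mr-step"/>

    <!-- First stage: a plain MapReduce job -->
    <action name="mr-step">
        <map-reduce>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapreduce.job.queuename</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
        </map-reduce>
        <ok to="hive-step"/>
        <error to="fail"/>
    </action>

    <!-- Second stage: a Hive script over the MR output -->
    <action name="hive-step">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>process.hql</script>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Workflow failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

With Hive on LLAP, the Hive stages would naturally benefit from the LLAP cache, while the MR stages read from HDFS as usual.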
