Hi Sungwoo,

Thanks for your reply, but I was referring exclusively to the LLAP application master, which is not Tez-related.
Thanks,
Aaron

On Wed, 2023-03-22 at 20:02 +0900, Sungwoo Park wrote:

Hello,

A similar issue was discussed in the Tez mailing list a long time ago:
https://lists.apache.org/thread/0vjor12lpcncg43rn6vddw8yc1k62c81

Tez still does not support specifying node labels for AMs, but as explained in the response, this is quite easy to implement if you can re-compile Tez. (Hive-MR3 is still a valid option, with hundreds of patches backported to Hive 3.1.3.)

--- Sungwoo

On Wed, Mar 22, 2023 at 7:21 PM Aaron Grubb <aa...@kaden.ai> wrote:

Hi all,

I have a Hadoop cluster (3.3.4) with 6 nodes of equal resource size that run HDFS and YARN, plus 1 node with lower resources that runs only YARN, which I use for Hive AMs, the LLAP AM, Spark AMs, and Hive file merge containers. The HDFS nodes are set up so that the queue for LLAP on the YARN NodeManager is allocated resources exactly equal to what the LLAP daemons consume. However, when I need to re-launch LLAP, I currently have to stop the NodeManager process on each HDFS node, then launch LLAP to guarantee that the application master ends up on the YARN-only machine, then start the NodeManager processes again so the daemons can start spawning on the nodes.

This used not to be a problem because only Hive/LLAP was using YARN, but now we've started using Spark at my company, and I'm in a position where, if LLAP happens to crash, I would need to wait for Spark jobs to finish before I can re-launch LLAP. That would put our ETL processes behind, potentially causing unacceptable delays.

I could allocate 1 extra vcore and 1024 MB of extra memory for the LLAP queue on each machine, but that would mean 5 vcores and 5 GB of RAM reserved and unused at all times. So I was wondering: is there a way to specify which node the LLAP AM is launched on, perhaps through YARN node labels, similar to Spark's "spark.yarn.am.nodeLabelExpression" configuration? Or even a way to specify the machine through a different mechanism?

My Hive version is 3.1.3.

Thanks,
Aaron
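For anyone finding this thread later, the YARN side of the node-label approach can be sketched as follows. This is only a sketch, assuming the Capacity Scheduler; the label name `llap_am` and the hostname `yarn-only-node.example.com` are hypothetical placeholders. Note that, per the reply above, Tez (which launches the LLAP AM) does not expose an AM node-label expression, so the YARN configuration alone does not pin the AM without the Tez change discussed in the linked thread.

```shell
# Sketch, assuming Capacity Scheduler. "llap_am" and
# "yarn-only-node.example.com" are hypothetical names.

# 1. Enable node labels in yarn-site.xml (standard YARN properties):
#      yarn.node-labels.enabled          = true
#      yarn.node-labels.fs-store.root-dir = hdfs:///yarn/node-labels

# 2. Create the label and attach it to the YARN-only machine:
yarn rmadmin -addToClusterNodeLabels "llap_am(exclusive=true)"
yarn rmadmin -replaceLabelsOnNode "yarn-only-node.example.com=llap_am"

# 3. In capacity-scheduler.xml, give the queue access to the label and
#    make it the queue's default, so container requests submitted to that
#    queue without their own label land on the labeled node:
#      yarn.scheduler.capacity.root.llap.accessible-node-labels                  = llap_am
#      yarn.scheduler.capacity.root.llap.accessible-node-labels.llap_am.capacity = 100
#      yarn.scheduler.capacity.root.llap.default-node-label-expression           = llap_am

# 4. Reload the scheduler configuration:
yarn rmadmin -refreshQueues
```

One caveat: `default-node-label-expression` applies to every container request in the queue that does not set its own label, so the LLAP daemons would also target the labeled node unless the AM were submitted to a dedicated queue or could set a per-request label, which is exactly the Tez support gap described in the reply.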