ayushtkn commented on PR #6507: URL: https://github.com/apache/hive/pull/6507#issuecomment-4709529031
Thanks @zhangbutao for the great insights! You hit the nail on the head regarding the shift from "YARN-thinking" to "Kubernetes-native thinking."

1. Physical vs. Logical Isolation

You are completely right about Workload Management (WLM). Trying to carve up a single JVM's heap and CPU cycles among competing tenants is incredibly complex and never gives you true isolation. By shifting to Kubernetes, we get physical isolation via namespaces, cgroups, and dedicated pod resources.

2. How this could work technically

What you are describing is entirely feasible. The LLAP instances register themselves in ZooKeeper under a specific app name (defaulting to @llap0). If we update the Operator to support an array of LLAP profiles (e.g., llap-cluster1, llap-cluster2), the Operator would spin up multiple independent StatefulSets, each registering under a different ZK path. Then, exactly as you said, a user simply sets hive.llap.daemon.service.hosts=@llap-cluster1 in their JDBC string or session. TezAM would look up that specific ZK path, find the pods registered there, and route the fragments exclusively to that tenant's dedicated executors.

3. The Autoscaling Synergy

The best part is how it ties into the autoscaling logic in this PR! Because each tenant's LLAP cluster would be its own independent K8s StatefulSet, the autoscaler would scale llap-cluster1 and llap-cluster2 completely independently. If user1 isn't running queries, their dedicated LLAP cluster scales to zero, costing nothing, while user2 can comfortably stay scaled up to 100 pods.

This is a fantastic concept for multi-tenancy. Since the core autoscaling loop and K8s operator primitives are established in this PR, building out "Multi-Tenant LLAP Compute Groups" on top of it feels like a perfect follow-up Jira ticket. I think it is well worth exploring, and I will definitely give it a shot :-)
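As a rough illustration of the client side of point 2, here is a minimal sketch of how a tenant could be pinned to its own LLAP compute group through the JDBC string. It is not code from this PR: the tenant-to-group mapping, the function names, and the host name are all hypothetical, and it assumes the usual Hive JDBC URL layout in which Hive configuration overrides follow the `?`.

```python
# Hypothetical mapping from tenant to LLAP compute group. The group names
# mirror the llap-cluster1 / llap-cluster2 examples from the comment; nothing
# here is defined by the PR itself.
TENANT_TO_LLAP_GROUP = {
    "user1": "llap-cluster1",
    "user2": "llap-cluster2",
}


def llap_service_hosts(tenant: str) -> str:
    """Return the value to use for hive.llap.daemon.service.hosts.

    The leading '@' marks the value as a ZooKeeper-registered app name
    rather than a list of host names.
    """
    return "@" + TENANT_TO_LLAP_GROUP[tenant]


def jdbc_url(host: str, port: int, database: str, tenant: str) -> str:
    """Build a Hive JDBC URL that carries the tenant's LLAP group.

    Assumes the jdbc:hive2://host:port/db?hive_conf_list form, where Hive
    configuration overrides are placed after the '?'.
    """
    conf = "hive.llap.daemon.service.hosts=" + llap_service_hosts(tenant)
    return f"jdbc:hive2://{host}:{port}/{database}?{conf}"


if __name__ == "__main__":
    # hiveserver2.example.com is a placeholder host.
    print(jdbc_url("hiveserver2.example.com", 10000, "default", "user1"))
```

With a mapping like this, the only thing that differs between tenants is the app name after the `@`, which is the property that would let each StatefulSet be discovered and scaled independently.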
