abstractdog commented on PR #456:
URL: https://github.com/apache/tez/pull/456#issuecomment-3799488247

   > @abstractdog, I was able to start DAGAppMaster with ZK locally.
   > Attaching logs for the container:
   > [docker_logs.txt](https://github.com/user-attachments/files/24839584/docker_logs.txt)
   > 
   > ```
   > docker run -d \
   >         --name tez-am \
   >         -p 10001:10001 \
   >         -e TEZ_FRAMEWORK_MODE="STANDALONE_ZOOKEEPER" \
   >         apache/tez-am:1.0.0-SNAPSHOT
   > ```
   > 
   > ```
   > brew install zookeeper
   > zkServer start
   > ```
   > 
   > But this PR has a lot of open items and I need some advice on the following:
   > 
   > 1. Is the docker directory inside tez-dist fine, or should I create a
   > separate sub-module for Dockerfile-related code, to be executed after the
   > tez-dist module?
   > 2. This image will presumably be run with ZK + K8s + S3. The question is
   > whether we need a Hadoop tarball inside this image, just in case, for some
   > 3rd-party jars etc. If my understanding is correct, it shouldn't be there,
   > but I've kept it for now. Will remove it if you say so.
   > 3. In DAGAppMaster#main() there are a lot of ENV variables, which I have
   > mocked for now in `entrypoint.sh`. I'll try to improve this (suggestions
   > are welcome here).
   > 4. My `tez-site.xml` is not getting picked up from the classpath, see
   > https://github.com/apache/tez/blob/4632058795de8f871504601d5f2992f311be792a/tez-dag/src/main/java/org/apache/tez/dag/app/DAGAppMaster.java#L2432
   > . Will debug that.
   > 5. Is there any way to test this AM container without YARN by running some job?
   
   Very good, very good. Let me check this in detail sometime this week. Here
   are some pointers in the meantime, responding to your questions:
   
   1. I believe we can follow Apache Hive in this area. Feel free to do
   something like what is done here: https://github.com/apache/hive/tree/master/packaging
   
   2. We should keep the Hadoop jars. Even if the k8s environment is no longer
   a Hadoop/YARN environment, Tez depends heavily on Hadoop at both compile time
   and runtime, and this is something we don't intend to break in the short or
   medium term.
   
   3. I'll check it. What we should really be clear about is e.g.
   ```
   # 3. NodeManager Details
   export NM_HOST=${NM_HOST:-"localhost"}
   export NM_PORT=${NM_PORT:-"12345"}
   ```
   There is no YARN NodeManager in a k8s environment, so a reader of
   entrypoint.sh should see the code clearly distinguish between env vars that
   are actually needed and legacy/backward-compatible env vars. In my opinion,
   that is what should be handled with care.
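
To illustrate the distinction, here is a minimal sketch of how entrypoint.sh could separate the two groups. It is not the PR's actual script: the helper names (`require_env`, `legacy_env`, `main`) are hypothetical, and treating TEZ_FRAMEWORK_MODE as required is an assumption based on the docker run command above.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the required/legacy grouping are assumptions.

# Required variables: fail with a clear message if unset or empty.
require_env() {
  local name="$1"
  if [ -z "${!name:-}" ]; then
    echo "ERROR: required env var ${name} is not set" >&2
    return 1
  fi
}

# Legacy (YARN-era) variables: there is no NodeManager on k8s, so fall back
# to a placeholder and say so, instead of silently pretending it is real.
legacy_env() {
  local name="$1" default="$2"
  if [ -z "${!name:-}" ]; then
    export "${name}=${default}"
    echo "INFO: ${name} not set, using legacy placeholder '${default}'" >&2
  fi
}

main() {
  # --- Needed in standalone (ZooKeeper) mode ---
  require_env TEZ_FRAMEWORK_MODE || return 1

  # --- Legacy / backward-compatible, placeholders only ---
  legacy_env NM_HOST "localhost"
  legacy_env NM_PORT "12345"
}
```

An entrypoint could then run `main || exit 1` before launching the AM, so that a missing required variable stops the container early while the legacy placeholders remain visible in the logs.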
   
   4. Okay.
   
   5. Yes. Given that neither Tez containers (TEZ-4665) nor LLAP containers
   ([HIVE-29411](https://issues.apache.org/jira/browse/HIVE-29411)) are
   implemented yet, we cannot successfully run a whole DAG, but we can get to
   the point where a DAG is at least successfully submitted from Hive to this
   AM container. To make this happen, I believe we need an HS2 container (see
   the Hive instructions for a dockerized setup) that is able to find this Tez
   AM container, so most probably we need to stop using tez.local.mode=true
   for this experiment.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
