Re: [PR] DO-N0T-MERGE Move to Hadoop3 [zeppelin]

via GitHub Mon, 27 Nov 2023 04:55:59 -0800


Reamer commented on code in PR #4691:
URL: https://github.com/apache/zeppelin/pull/4691#discussion_r1406119893



##########
rlang/pom.xml:
##########
@@ -116,18 +116,10 @@
 
         <dependency>
             <groupId>org.apache.hadoop</groupId>
-            <artifactId>hadoop-client</artifactId>
+            <artifactId>hadoop-client-runtime</artifactId>
             <version>${hadoop.version}</version>
             <scope>compile</scope>
         </dependency>
-
-        <dependency>
-            <groupId>org.apache.hadoop</groupId>
-            <artifactId>hadoop-common</artifactId>
-            <version>${hadoop.version}</version>
-            <scope>compile</scope>
-        </dependency>
-
         <dependency>

Review Comment:
   > There is a switch in YARN to enable/disable Hadoop class population for 
containers.
   
   I don't know how this is used in Zeppelin.
   
   > QQ, I understand we should not include Hadoop classes in plugins, because 
they will be loaded into the same JVM as the Zeppelin server, so that they can 
share the Hadoop classes. What about the interpreters? I assume the 
interpreters always run in dedicated JVMs, so Hadoop classes seem always 
necessary (except for those runtimes that already provide Hadoop classes, e.g. 
Spark, Flink)?
   
   Correct, the Zeppelin server and zengine run in the same JVM as the Zeppelin 
plugins.
   As far as I know, the interpreters usually run in separate JVM instances. We 
should set the scope of the Hadoop dependency to `provided` in the interpreter, 
because I think the Hadoop code in the interpreter is only used for YARN. See 
https://github.com/apache/zeppelin/blob/56da029ffe413c55ba34f46e4e4b91b8d20d9ce2/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/remote/YarnUtils.java#L20
   
   Maybe there will be a way to remove the dependency at some point.
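   
   For illustration, the scope change could look roughly like the fragment 
below. This is an untested sketch: it assumes the module keeps the 
`hadoop-client-runtime` artifact from this diff and that nothing else in the 
module needs Hadoop classes bundled at runtime.
   
   ```xml
   <!-- Sketch only: artifact taken from the diff above; the scope value is the proposal, not a tested change -->
   <dependency>
       <groupId>org.apache.hadoop</groupId>
       <artifactId>hadoop-client-runtime</artifactId>
       <version>${hadoop.version}</version>
       <!-- provided: on the compile classpath, but not packaged with the module -->
       <scope>provided</scope>
   </dependency>
   ```
   
   With `provided`, the Hadoop classes would have to come from the environment 
at runtime, for example from the YARN container classpath.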



