[
https://issues.apache.org/jira/browse/RANGER-4166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
caijialiang updated RANGER-4166:
--------------------------------
Description:
This description explains how to reason about this build error and how to
reproduce it reliably.
Environment:
[root@gs-server-12223 ~]# locale
LANG=zh_CN.UTF-8
LC_CTYPE="zh_CN.UTF-8"
LC_NUMERIC="zh_CN.UTF-8"
LC_TIME="zh_CN.UTF-8"
LC_COLLATE="zh_CN.UTF-8"
LC_MONETARY="zh_CN.UTF-8"
LC_MESSAGES="zh_CN.UTF-8"
LC_PAPER="zh_CN.UTF-8"
LC_NAME="zh_CN.UTF-8"
LC_ADDRESS="zh_CN.UTF-8"
LC_TELEPHONE="zh_CN.UTF-8"
LC_MEASUREMENT="zh_CN.UTF-8"
LC_IDENTIFICATION="zh_CN.UTF-8"
LC_ALL=zh_CN.UTF-8
lsb_release -a
LSB Version: :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 7.4.1708 (Core)
Release: 7.4.1708
Codename: Core
uname -a
Linux gs-server-12223 3.10.0-693.el7.x86_64 #1 SMP Tue Aug 22 21:09:27 UTC 2017
x86_64 x86_64 x86_64 GNU/Linux
Maven version 3.6.3
Description:
There are compilation errors when building Ranger 2.3 and Ranger 2.4 in a Linux
environment.
Compilation command:
mvn -Pall clean compile package install -Dmaven.test.skip=true
-DskipTests=true -Dfindbugs.skip=true -Dcheckstyle.skip=true
-Djacoco.skip=true -Dpmd.skip=true -Drat.skip=true -Dspotbugs.skip=true
-Dhadoop.version=3.3.4 -Dhbase.version=2.4.13 -Dhive.version=3.1.3
-Dkafka.version=2.8.1 -Dsolr.version=8.11.2 -Dzookeeper.version=3.6.4
The following two patches were applied to ranger2.3 in order to compile
successfully.
git apply ../ranger/patch1-RANGER-3818.diff
git apply ../patch0-RANGER-3373.diff
*The compilation of ranger 2.3 fails with the following error:*
{code:java}
[ERROR] Failed to execute goal
org.apache.maven.plugins:maven-assembly-plugin:2.6:single (default) on project
ranger-distro: Failed to create assembly: Error creating assembly archive
schema-registry-plugin: Problem creating jar:
jar:file:/home/jialiang/prjs/ranger/distro/target/ranger-distro-2.3.0.jar!/META-INF/maven/org.apache.ranger/ranger-distro/pom.xml:
JAR entry META-INF/maven/org.apache.ranger/ranger-distro/pom.xml not found in
/home/jialiang/prjs/ranger/distro/target/ranger-distro-2.3.0.jar -> [Help 1]
{code}
*No patches were applied to ranger 2.4; its build fails with the following error:*
{code:java}
[ERROR] Failed to execute goal
org.apache.maven.plugins:maven-assembly-plugin:2.6:single (default) on project
ranger-distro: Failed to create assembly: Error creating assembly archive
schema-registry-plugin: IOException when zipping
rMETA-INF/maven/org.apache.ranger/ranger-distro/pom.properties: invalid code
lengths set -> [Help 1]{code}
Based on the ranger2.4 error message, the issue is suspected to be related to
encoding. Checking the encoding of the corresponding file shows that it is
ASCII, while Linux defaults to UTF-8:
file ./distro/target/maven-archiver/pom.properties
./distro/target/maven-archiver/pom.properties: ASCII text
Therefore, this may be an encoding problem. In addition, the error
message mentions "Error creating assembly archive." The Maven Assembly Plugin
is executed during the package phase of Maven, after compilation, testing, and
other operations are completed, to prepare the build artifacts for distribution
as archive files.
This error occurs when the Assembly Plugin is creating a distributable archive,
such as a zip or tar.gz format, from the build artifacts. Therefore, it is
related to how the archive tool used by Maven Assembly Plugin handles encoding.
In both ranger2.3 and ranger2.4,
<assembly.plugin.version>2.6</assembly.plugin.version> is used. Hence, it is
necessary to investigate the code of this version of the Assembly Plugin:
[https://github.com/apache/maven-assembly-plugin],
[https://github.com/apache/maven-assembly-plugin/blob/maven-assembly-plugin-2.6/pom.xml]
From the pom file and the compression logic in the code, it can be concluded
that the compression tool used is plexus-archiver, version 3.0.1.
!image-2023-04-04-10-15-39-889.png!
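The same lookup can be scripted. The following is a minimal sketch, assuming a copy of the plugin's pom.xml has been downloaded locally and that the plexus-archiver version is written literally inside its <dependency> element (if it is managed through a property or a parent pom, the sketch prints the placeholder or nothing). The function name archiver_version is hypothetical.

```shell
# archiver_version: print the <version> declared for the plexus-archiver
# dependency in the pom file given as $1 (hypothetical helper).
archiver_version() {
  awk '
    # Remember that we are inside the plexus-archiver dependency.
    /<artifactId>plexus-archiver<\/artifactId>/ { found = 1; next }
    # First <version> line after that artifactId: strip the tags and print.
    found && /<version>/ {
      gsub(/.*<version>|<\/version>.*/, "")
      print
      exit
    }
    # Leaving a dependency block resets the flag.
    /<\/dependency>/ { found = 0 }
  ' "$1"
}
```

Usage: archiver_version path/to/maven-assembly-plugin/pom.xml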
The release notes for plexus-archiver are here:
[https://github.com/codehaus-plexus/plexus-archiver/blob/master/ReleaseNotes.md]
Searching for the keyword 'encod' in the release note reveals that many
encoding-related issues have been fixed since version 3.0, including
* [Issue #37|https://github.com/codehaus-plexus/plexus-archiver/issues/37] -
Deprecate Manifest(Reader) and update all related Implementation does not
properly map characters to map and makes assumptions about character encoding
which might lead to failures. Deprecate and rely on Java Manifest reader to do
the right thing.
* [Issue #39|https://github.com/codehaus-plexus/plexus-archiver/issues/39] -
Updated to stop falling back to the unicode path extra field policy
NOT_ENCODEABLE. If a name is not encodeable in UTF-8, it also is not encodeable
in the extra field. Updated to always add the Info-ZIP Unicode Path Extra Field
when creating an archive using an encoding different from UTF-8 instead of only
when a name is not encodeable. Additionally support that extra field when
unarchiving.
* [Pull Request
#73|https://github.com/codehaus-plexus/plexus-archiver/pull/73] - Symbolic
links not properly encoded in ZIP archives
Next, download the plexus-archiver source code and search it for the error
message 'IOException when zipping':
!image-2023-04-04-10-16-01-609.png!
!image-2023-04-04-10-16-29-682.png!
By reading the plexus-archiver code, it was found that setting encoding is
necessary when creating a jar file using plexus-archiver, because the jar file
contains text files such as the manifest file, which may have non-ASCII
characters and need to be correctly encoded to avoid potential issues.
Therefore, setting the encoding ensures that the text files in the jar file are
properly encoded.
However, when creating a tar.gz file using plexus-archiver, there is no need
for the setEncoding() method, because tar.gz files do not have a text encoding
format. They are binary files that contain compressed data.
At this point, we can explain why only the schema-registry assembly in the
distro packaging fails. Its descriptor is specified as follows:
<descriptor>src/main/assembly/plugin-schema-registry.xml</descriptor>
and the format specified in that descriptor is jar.
!image-2023-04-04-10-17-22-307.png!
All other assembly descriptors specify tar.gz as the format:
!image-2023-04-04-10-16-59-645.png!
We can use the file command to check the encoding of the pom.properties files
generated while building each module:
file ./xxx/target/maven-archiver/pom.properties
All of them are reported as ASCII. This explains why, although every file is
ASCII, only the assembly packaging of schema-registry (the one jar-format
assembly) results in an error.
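To run this check across every module at once, a byte-level scan can complement the file command. The following is a minimal sketch, assuming the build has already produced the target/maven-archiver directories; the function name list_non_ascii is hypothetical. It prints only the pom.properties files that contain bytes outside 7-bit ASCII, so empty output agrees with the "ASCII text" result above.

```shell
# list_non_ascii: print every generated pom.properties under $1 (default: .)
# that contains at least one byte outside the 7-bit ASCII range.
list_non_ascii() {
  find "${1:-.}" -path '*/target/maven-archiver/pom.properties' -type f |
  while IFS= read -r f; do
    # Delete all ASCII bytes; anything left over is non-ASCII.
    if [ -n "$(LC_ALL=C tr -d '\000-\177' < "$f")" ]; then
      printf '%s\n' "$f"
    fi
  done
}
```

Usage: run list_non_ascii from the root of the ranger checkout after a build.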
Based on the above inference, I modified the 'format' in
plugin-schema-registry.xml from 'jar' to 'tar' and it passed the compilation
smoothly. Adding the line '<encoding>UTF-8</encoding>' in the distro's pom file
also allowed it to pass the compilation.
!image-2023-04-04-10-18-02-975.png!
However, these are not the fundamental solutions. The root cause is a bug in
plexus-archiver that re-encodes when packaging jars. This bug has been fixed in
the latest version of plexus-archiver. Our assembly plugin was using an older
version of plexus-archiver, causing the issue. Therefore, upgrading to the
latest version can solve the problem.
By checking the pom file of the assembly plugin, I found that
maven-assembly-plugin 3.4.2 uses plexus-archiver 4.4. Therefore, I changed
Ranger's <assembly.plugin.version>2.6</assembly.plugin.version> to
<assembly.plugin.version>3.4.2</assembly.plugin.version>, and the compilation
problem was also solved.
!image-2023-04-04-10-18-31-532.png!
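The property change can also be applied from the command line. The following is a minimal sketch, assuming the property is defined in a single pom.xml whose path is passed in; the function name bump_assembly_plugin is hypothetical, and the default of 3.4.2 is simply the version tested in this issue.

```shell
# bump_assembly_plugin: rewrite the assembly.plugin.version property.
#   $1 = path to the pom.xml that defines the property
#   $2 = target plugin version (defaults to 3.4.2)
bump_assembly_plugin() {
  _pom=$1
  _version=${2:-3.4.2}
  _tmp=$(mktemp) || return 1
  # Replace whatever value the property currently holds, then write the
  # result back in place so the file keeps its permissions.
  sed "s|<assembly.plugin.version>[^<]*</assembly.plugin.version>|<assembly.plugin.version>${_version}</assembly.plugin.version>|" "$_pom" > "$_tmp" &&
    cat "$_tmp" > "$_pom"
  rm -f "$_tmp"
}
```

Usage: bump_assembly_plugin pom.xml, then confirm the result with git diff before rebuilding.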
I have tested both ranger 2.3 and ranger 2.4: either upgrading the assembly
plugin or setting the encoding solves the compilation issue on Linux.
https://issues.apache.org/jira/browse/RANGER-2721
Therefore, that issue does not solve the problem of compilation errors. It
merely avoids using the assembly command, so that this compilation error is
not triggered 100% of the time. In reality, even if assembly is removed,
many environments will still encounter compilation errors in the final step.
How to reproduce and test reliably: We use ranger2.4 for testing because it
does not require a patch to be applied. Before testing, clear the ranger
directory installed in the Maven M2 repository.
ranger2.4
1. To reproduce the error, compile using the following command without making
any modifications.
{code:java}
[root@gs-server-12223 ranger]# git branch -vv
  master               460a176 [origin/master] RANGER-4085: Search filter hint is not available where you search for policy
* ranger-2.4           50ad9c1 [origin/ranger-2.4] RANGER-4155 : Structure of resource(UI) hierarchy in policy form not proper formatted for multiple values.
  release-ranger-2.3.0 ce3339c RANGER-3730: use reload4j to replace log4j-1.2
[root@gs-server-12223 ranger]# git diff
[root@gs-server-12223 ranger]# rm -rf /home/jzhou/m2/org/apache/ranger
[root@gs-server-12223 ranger]# /usr/local/src/apache-maven-3.6.3/bin/mvn -Pall \
  clean compile package install assembly:single -Dmaven.test.skip=true \
  -DskipTests=true -Dfindbugs.skip=true -Dcheckstyle.skip=true -Djacoco.skip=true \
  -Dpmd.skip=true -Drat.skip=true -Dspotbugs.skip=true -Dhadoop.version=3.3.4 \
  -Dhbase.version=2.4.13 -Dhive.version=3.1.3 -Dkafka.version=2.8.1 \
  -Dsolr.version=8.11.2 -Dzookeeper.version=3.6.4
{code}
!image-2023-04-04-10-21-17-574.png!
2. Upgrade the assembly.plugin.version in the ranger project to 3.4.2, and
compile again using the above command. The error disappears and the
compilation proceeds smoothly.
!image-2023-04-04-10-21-38-104.png!
!image-2023-04-04-10-21-50-064.png!
3. After reverting the change, the build fails again.
!image-2023-04-04-10-22-07-056.png!
Unfortunately, it has not yet been determined which line of code causes the
compilation problem and under what circumstances, nor why the issue cannot be
reproduced reliably without adding assembly:single. Anyone interested can
continue to dig deeper; the answer may lie in the maven-assembly-plugin,
plexus-archiver, or commons-compress libraries.
[https://github.com/apache/maven-assembly-plugin]
[https://github.com/codehaus-plexus/plexus-archiver/|https://github.com/codehaus-plexus/plexus-archiver/blob/master/ReleaseNotes.md]
[https://github.com/apache/commons-compress]
was:
{*}Environment{*}: bigtop/slaves:3.2.0-centos-7
The ranger compilation failed in the "do-component-build" because the
compilation command for ranger was incorrect. Removing "install" will result in
a successful compilation.
{code:java}
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal
org.apache.maven.plugins:maven-assembly-plugin:2.6:single (default) on project
ranger-distro: Failed to create assembly: Error creating assembly archive
schema-registry-plugin: Problem creating jar:
jar:file:/ws/build/ranger/rpm/BUILD/ranger-release-ranger-2.3.0/distro/target/ranger-distro-2.3.0.jar!/META-INF/maven/org.apache.ranger/ranger-distro/pom.xml:
JAR entry META-INF/maven/org.apache.ranger/ranger-distro/pom.xml not found in
/ws/build/ranger/rpm/BUILD/ranger-release-ranger-2.3.0/distro/target/ranger-distro-2.3.0.jar
-> [Help 1]
[ERROR]
[ERROR] To see the full stack{code}
!image-2023-04-01-18-31-58-091.png!
Changing the compilation command from "mvn clean compile package install" to
"mvn clean compile package" resulted in successful compilation.
!image-2023-04-01-18-33-29-756.png!
> ranger2.3 build failed
> ----------------------
>
> Key: RANGER-4166
> URL: https://issues.apache.org/jira/browse/RANGER-4166
> Project: Ranger
> Issue Type: Bug
> Components: Ranger
> Affects Versions: 2.3.0
> Reporter: caijialiang
> Priority: Major
> Attachments: image-2023-04-01-18-31-58-091.png,
> image-2023-04-01-18-33-29-756.png
>
> Time Spent: 40m
> Remaining Estimate: 0h
>