[
https://issues.apache.org/jira/browse/ANY23-447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated ANY23-447:
---------------------------------------
Attachment: output.txt
> Reduce Any23 dependency bloat
> -----------------------------
>
> Key: ANY23-447
> URL: https://issues.apache.org/jira/browse/ANY23-447
> Project: Apache Any23
> Issue Type: Improvement
> Components: core
> Affects Versions: 2.3
> Reporter: David Cockbill
> Priority: Minor
> Attachments: output.txt
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Prompted by an email conversation with Hans Brende:
> {code:java}
> David, unfortunately this move won't reduce the number of core dependencies
> we have: the plugins and service modules are not dependencies of the core
> module. However, it might be useful if you posted an issue about the
> dependency bloat, including the various exclusions you are using: we might
> be able to mitigate the problem.
> {code}
> This followed from having to exclude dependencies in the pom.xml for a
> product. (Not much thought went into the exclusions; I was trying to get
> the code size down before a release.) The relevant section of the pom.xml:
> {code:java}
> <dependency>
> <groupId>org.apache.any23</groupId>
> <artifactId>apache-any23-core</artifactId>
> <exclusions>
> <!-- Any23 brings in a lot of dependencies, which bloat the shaded
> jar.
> This is an attempt to reduce that by excluding packages
> that we may not be using as part of Any23.
> NOTE: If an excluded dependency is required at runtime, a
> java.lang.NoClassDefFoundError is thrown. -->
>
> <exclusion>
> <groupId>org.apache.tika</groupId>
> <artifactId>tika-parsers</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.bouncycastle</groupId>
> <artifactId>bcmail-jdk15on</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.bouncycastle</groupId>
> <artifactId>bcprov-jdk15on</artifactId>
> </exclusion>
> <exclusion>
> <groupId>edu.ucar</groupId>
> <artifactId>cdm</artifactId>
> </exclusion>
> <exclusion>
> <groupId>net.sf.trove4j</groupId>
> <artifactId>trove4j</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.apache.cxf</groupId>
> <artifactId>cxf-rt-rs-client</artifactId>
> </exclusion>
> <exclusion>
> <groupId>com.github.ben-manes.caffeine</groupId>
> <artifactId>caffeine</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.opengis</groupId>
> <artifactId>geoapi</artifactId>
> </exclusion>
> <exclusion>
> <groupId>com.drewnoakes</groupId>
> <artifactId>metadata-extractor</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.eclipse.rdf4j</groupId>
> <artifactId>rdf4j-repository-sail</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.eclipse.rdf4j</groupId>
> <artifactId>rdf4j-sail-memory</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.tukaani</groupId>
> <artifactId>xz</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.codelibs</groupId>
> <artifactId>jhighlight</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.gagravarr</groupId>
> <artifactId>vorbis-java-core</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.gagravarr</groupId>
> <artifactId>vorbis-java-tika</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.apache.opennlp</groupId>
> <artifactId>opennlp-tools</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.apache.pdfbox</groupId>
> <artifactId>pdfbox</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.apache.pdfbox</groupId>
> <artifactId>pdfbox-tools</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.apache.poi</groupId>
> <artifactId>poi-scratchpad</artifactId>
> </exclusion>
> <exclusion>
> <groupId>edu.ucar</groupId>
> <artifactId>grib</artifactId>
> </exclusion>
> <exclusion>
> <groupId>com.googlecode.mp4parser</groupId>
> <artifactId>isoparser</artifactId>
> </exclusion>
> <exclusion>
> <groupId>com.healthmarketscience.jackcess</groupId>
> <artifactId>jackcess</artifactId>
> </exclusion>
> <exclusion>
> <groupId>com.healthmarketscience.jackcess</groupId>
> <artifactId>jackcess-encrypt</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.apache.sis.core</groupId>
> <artifactId>sis-utility</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.apache.sis.storage</groupId>
> <artifactId>sis-netcdf</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.apache.sis.core</groupId>
> <artifactId>sis-metadata</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.eclipse.rdf4j</groupId>
> <artifactId>rdf4j-rio-trix</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.yaml</groupId>
> <artifactId>snakeyaml</artifactId>
> </exclusion>
> <exclusion>
> <groupId>org.eclipse.rdf4j</groupId>
> <artifactId>rdf4j-rio-turtle</artifactId>
> </exclusion>
> </exclusions>
> </dependency>
> {code}
> Some background that may be useful from my notes:
> {code:java}
> Whilst adding Any23 to the product, the Any23 Core package was causing
> Lintian to fail.
> Lintian is a Debian package checker written in Perl. It uses Archive::Zip
> to unpack any .jar file in the Debian package. Archive::Zip does not
> handle the Zip64 format, which caused the failure. The original zip format
> has various restrictions, one of which is the number of files in the
> archive. Therefore, if the number of files in the product jar exceeds this
> limit (65535), a Zip64 file is produced instead of a standard zip file.
> The Any23 Core library does seem quite excessive in what it pulls in.
> Running the following, the output for the product goes from 40490 to
> 78513 with Any23 added:
> zipinfo -1 product.jar | wc -l
> {code}
> This Lintian failure on a Linux build was the original motivation for the
> exclusions; however, the product .jar also grew in a similar fashion.
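
The entry-count check in the notes above can also be scripted, which may help when deciding which exclusions actually matter. The following is only a rough sketch using Python's standard zipfile module; the function names, the prefix depth of 3, and the idea of grouping entries by package prefix are illustrative assumptions, not part of Any23 or Lintian.

```python
import sys
import zipfile
from collections import Counter

# Entry-count limit of the classic zip format, as quoted in the notes above.
ZIP_MAX_ENTRIES = 65535


def entry_names(jar_path):
    """Return every entry name in the jar (directories included),
    which is the same listing that `zipinfo -1` prints."""
    with zipfile.ZipFile(jar_path) as jar:
        return jar.namelist()


def needs_zip64_by_count(names):
    """True if the number of entries alone exceeds the classic zip limit."""
    return len(names) > ZIP_MAX_ENTRIES


def top_prefixes(names, depth=3, limit=10):
    """Count file entries per leading package path (hypothetical helper),
    e.g. 'org/apache/tika', to show which dependencies add the most files."""
    counts = Counter()
    for name in names:
        if name.endswith("/"):
            continue  # skip directory entries
        parts = name.split("/")[:-1]  # drop the file name itself
        counts["/".join(parts[:depth]) or "(root)"] += 1
    return counts.most_common(limit)


if __name__ == "__main__" and len(sys.argv) == 2:
    names = entry_names(sys.argv[1])
    print(f"{len(names)} entries; exceeds zip limit: {needs_zip64_by_count(names)}")
    for prefix, count in top_prefixes(names):
        print(f"{count:8d}  {prefix}")
```

Saved under a name of your choosing and run with the path to product.jar as its only argument, it prints the total entry count followed by the package prefixes that contribute the most files.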