Hello Gren,

Feel free to provide a PR on GitHub (with a unit test) so we can see clearly what you suggest.

TY!
Gary

On Tue, Mar 19, 2024, 7:46 AM Gren Elliot <gell...@mimecast.com.invalid> wrote:

> Hi,
>
> I’m finding that commons-compress-1.26.1 is recognising a utf-16 text file
> as a tar archive – unlike the previous version
>
> This is the code that changed in that release in ArchiveStreamFactory -
> public static String detect(final InputStream in) throws ArchiveException {
> that differs in detection:
>
>     if (signatureLength >= TAR_HEADER_SIZE) {
>         try (TarArchiveInputStream inputStream = new TarArchiveInputStream(new ByteArrayInputStream(tarHeader))) {
>             // COMPRESS-191 - verify the header checksum
>             // COMPRESS-644 - do not allow zero byte file entries
>             TarArchiveEntry entry = inputStream.getNextEntry();
>             // try to find the first non-directory entry within the first 10 entries.
>             int count = 0;
>             while (entry != null && entry.isDirectory() && count++ < TAR_TEST_ENTRY_COUNT) {
>                 entry = inputStream.getNextEntry();
>             }
>             if (entry != null && entry.isCheckSumOK() && !entry.isDirectory() && entry.getSize() > 0 || count > 0) {
>                 return TAR;
>             }
>         } catch (final Exception e) { // NOPMD NOSONAR
>             // can generate IllegalArgumentException as well as IOException
>             // auto-detection, simply not a TAR
>             // ignored
>         }
>     }
>
> I feel this is being too lenient. For instance at the last “if”
> statement, for the test file, entry is null and count=1. The code suggests
> it is looking for the first non-directory entry. It hasn’t found a
> non-directory entry in our case.
>
> For instance, the earlier code at least checked that the checksum was OK
> for the one entry it checked (it isn’t for our test file…)
>
> Regards,
>
> Gren
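
For anyone following the thread, here is a minimal, self-contained sketch of why the quoted "if" accepts the case Gren describes (entry is null, count=1). It does not use commons-compress: FakeEntry, quotedCondition and strictCondition are hypothetical stand-ins written for this illustration. In Java, && binds tighter than ||, so the quoted condition is evaluated as (entry checks) || (count > 0). The stricter variant is only one possible reading of the intent, not the project's actual fix.

```java
public class TarDetectPrecedence {

    // Hypothetical stand-in for TarArchiveEntry; only the fields the condition reads.
    static final class FakeEntry {
        final boolean checkSumOk;
        final boolean directory;
        final long size;

        FakeEntry(boolean checkSumOk, boolean directory, long size) {
            this.checkSumOk = checkSumOk;
            this.directory = directory;
            this.size = size;
        }
    }

    // Same shape as the condition quoted from detect(): because && has higher
    // precedence than ||, this is (a && b && c && d) || (count > 0).
    static boolean quotedCondition(FakeEntry entry, int count) {
        return entry != null && entry.checkSumOk && !entry.directory && entry.size > 0 || count > 0;
    }

    // Assumed stricter variant (illustration only): require a non-directory
    // entry with a valid checksum and a non-zero size, regardless of count.
    static boolean strictCondition(FakeEntry entry) {
        return entry != null && entry.checkSumOk && !entry.directory && entry.size > 0;
    }

    public static void main(String[] args) {
        // The case from the report: no non-directory entry was found, count is 1.
        System.out.println("quoted, entry=null, count=1: " + quotedCondition(null, 1));
        System.out.println("strict, entry=null:          " + strictCondition(null));

        // A plausible regular file entry passes both checks.
        FakeEntry file = new FakeEntry(true, false, 42L);
        System.out.println("quoted, file entry, count=0: " + quotedCondition(file, 0));
        System.out.println("strict, file entry:          " + strictCondition(file));
    }
}
```

With entry == null the && chain short-circuits to false, and the result then depends on count > 0 alone, which is why a single bogus "directory" header followed by no further entry is enough to be reported as TAR.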