Dawid Iwo Cokan created COMPRESS-689:
----------------------------------------
Summary: Unable to detect symlinks in ZIP
Key: COMPRESS-689
URL: https://issues.apache.org/jira/browse/COMPRESS-689
Project: Commons Compress
Issue Type: Bug
Affects Versions: 1.27.1
Environment: macOS M1 Sonoma 14.4
Reporter: Dawid Iwo Cokan
*Context:*
In my project I need to build a ZIP that stores data under several different
paths. Some resources appear there multiple times, so I wanted each of them to
be stored only once. I initially considered hard links in a tar archive (I
built a proof of concept and it works), but the format has to be ZIP, so I
decided to use symlinks.
*Problem:*
Creating a ZIP with symlinks works (when I unzip it on my system the link is
preserved), but when I parse the same ZIP with my code, using the same version
of commons-compress,
{code:java}
entry.isUnixSymlink()
{code}
always returns false.
Here is a snippet to reproduce the problem:
{code:java}
@Test
public void createZIPWithLinks() throws IOException {
    OutputStream output = new FileOutputStream("zipWithLinks.zip");
    try (ZipArchiveOutputStream zipOutputStream = new ZipArchiveOutputStream(output)) {
        zipOutputStream.putArchiveEntry(new ZipArchiveEntry("original"));
        zipOutputStream.write("original content".getBytes());
        zipOutputStream.closeArchiveEntry();

        ZipArchiveEntry symlinkEntry = new ZipArchiveEntry("link");
        symlinkEntry.setUnixMode(0120444);
        zipOutputStream.putArchiveEntry(symlinkEntry);
        zipOutputStream.write("original".getBytes());
        zipOutputStream.closeArchiveEntry();
    }

    ZipArchiveInputStream zipInputStream =
            new ZipArchiveInputStream(new FileInputStream("zipWithLinks.zip"));
    ZipArchiveEntry entry;
    int entriesCount = 0;
    while ((entry = zipInputStream.getNextEntry()) != null) {
        boolean isSymLink = entry.isUnixSymlink();
        if ("link".equals(entry.getName())) {
            assertTrue(entry.isUnixSymlink(), "'link' detected but it's not sym link");
        } else {
            assertFalse(entry.isUnixSymlink(),
                    "'original' detected but it's sym link and should be regular file");
        }
        entriesCount++;
    }
    assertEquals(2, entriesCount);
}
{code}
I dug into the ZIP specification to understand what is wrong, and in the
zipdetails output for the archive I can see this:
{quote}0000 LOCAL HEADER #1 04034B50
0004 Extract Zip Spec 14 '2.0'
*0005 Extract OS 00 'MS-DOS'*
{quote}
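The two fields in that dump can be decoded by hand. The following is a minimal sketch, assuming a little-endian local file header whose first six bytes match the zipdetails output quoted above; the class and method names are illustrative only and are not part of commons-compress.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class LocalHeaderPeek {
    // Local file header signature, as shown in the zipdetails output.
    static final int LFH_SIGNATURE = 0x04034B50;

    /** Returns {specVersion, os} decoded from the two bytes after the signature. */
    static int[] decode(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        if (buf.getInt() != LFH_SIGNATURE) {
            throw new IllegalArgumentException("not a local file header");
        }
        int field = buf.getShort() & 0xFFFF;
        int specVersion = field & 0xFF;   // offset 0004: 0x14 = 20, i.e. '2.0'
        int os = (field >> 8) & 0xFF;     // offset 0005: 0x00, shown as 'MS-DOS'
        return new int[] { specVersion, os };
    }

    public static void main(String[] args) {
        // Hand-built bytes mirroring the dump: signature, then 14 00.
        byte[] header = { 0x50, 0x4B, 0x03, 0x04, 0x14, 0x00 };
        int[] fields = decode(header);
        System.out.println("spec=" + fields[0] + " os=" + fields[1]);
    }
}
```

With these bytes the high byte of the field is 0, which is the value that ends up being used as the platform.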
I can see that 'isUnixSymlink()' requires the platform field to be UNIX;
otherwise it ignores the rest of the information. The platform field in
ZipArchiveEntry is set in ZipArchiveInputStream, line 687, based on the local
file header:
{code:java}
int off = WORD; // = 4
current = new CurrentEntry();
final int versionMadeBy = ZipShort.getValue(lfhBuf, off); // = 20, HEX: 14
off += SHORT;
current.entry.setPlatform(versionMadeBy >> ZipFile.BYTE_SHIFT & ZipFile.NIBLET_MASK);
{code}
And here is something I don't understand. I see that this reads the 'Extract
Zip Spec' field, which is correct. But then the operation:
{code:java}
versionMadeBy >> ZipFile.BYTE_SHIFT & ZipFile.NIBLET_MASK
{code}
produces 0, so later isUnixSymlink() always returns false because:
{code:java}
public boolean isUnixSymlink() {
    return (getUnixMode() & UnixStat.FILE_TYPE_FLAG) == UnixStat.LINK_FLAG;
}

public int getUnixMode() {
    return platform != PLATFORM_UNIX ? 0
            : (int) (getExternalAttributes() >> SHORT_SHIFT & SHORT_MASK);
}
{code}
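The arithmetic can be replayed outside the library. This is a rough sketch, not the library code: the constants are local copies whose values are assumed to match the commons-compress ones (BYTE_SHIFT = 8, NIBLET_MASK = 0x0F, SHORT_SHIFT = 16, SHORT_MASK = 0xFFFF, PLATFORM_UNIX = 3, FILE_TYPE_FLAG = 0170000, LINK_FLAG = 0120000), and 0x0314 is a hypothetical "version made by" value with UNIX in the high byte, used only for comparison.

```java
class PlatformBits {
    // Local copies; values are assumptions mirroring the library constants.
    static final int BYTE_SHIFT = 8;
    static final int NIBLET_MASK = 0x0F;
    static final int SHORT_SHIFT = 16;
    static final int SHORT_MASK = 0xFFFF;
    static final int PLATFORM_UNIX = 3;
    static final int FILE_TYPE_FLAG = 0170000;
    static final int LINK_FLAG = 0120000;

    // Same expression as in the snippet from ZipArchiveInputStream.
    static int platform(int versionField) {
        return versionField >> BYTE_SHIFT & NIBLET_MASK;
    }

    // Same logic as getUnixMode(): the mode is dropped unless platform is UNIX.
    static int unixMode(int platform, long externalAttributes) {
        return platform != PLATFORM_UNIX ? 0
                : (int) (externalAttributes >> SHORT_SHIFT & SHORT_MASK);
    }

    static boolean isUnixSymlink(int platform, long externalAttributes) {
        return (unixMode(platform, externalAttributes) & FILE_TYPE_FLAG) == LINK_FLAG;
    }

    public static void main(String[] args) {
        // Mode 0120444 stored in the upper 16 bits of the external attributes.
        long external = (long) 0120444 << SHORT_SHIFT;

        int fromLocalHeader = platform(0x0014); // value read in the report
        int unixMadeBy = platform(0x0314);      // hypothetical UNIX value

        System.out.println(fromLocalHeader + " " + isUnixSymlink(fromLocalHeader, external));
        System.out.println(unixMadeBy + " " + isUnixSymlink(unixMadeBy, external));
    }
}
```

With platform 0 the mode bits are never consulted, so the link flag cannot be seen regardless of what the external attributes contain.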