Jens Reimann created COMPRESS-459:
-------------------------------------

             Summary: CPIO fails decoding multibyte name entries
                 Key: COMPRESS-459
                 URL: https://issues.apache.org/jira/browse/COMPRESS-459
             Project: Commons Compress
          Issue Type: Bug
          Components: Compressors
    Affects Versions: 1.17, 1.9
            Reporter: Jens Reimann

When a CPIO archive uses a multi-byte encoding (e.g. UTF-8) and contains an entry whose name includes multi-byte characters, the decoder fails.

The problem, IMHO, is the "getHeaderPadCount" method, which assumes a single byte per character:

{code:java}
public int getHeaderPadCount(){
    if (this.alignmentBoundary == 0) { return 0; }
    int size = this.headerSize + 1;  // Name has terminating null
    if (name != null) {
        size += name.length();
    }
    final int remain = size % this.alignmentBoundary;
    if (remain > 0){
        return this.alignmentBoundary - remain;
    }
    return 0;
}
{code}

For UTF-8 this assumption holds only when the name is pure ASCII. As soon as the name contains a multi-byte character, "name.length()" is smaller than the number of bytes stored in the archive, so the computed padding can be wrong.

Calling "String#getBytes(…)" would not be enough either, as decoding and re-encoding the name might not reproduce the bytes that were originally in the stream.

The proper solution would be to take the name size as read from the CPIO stream and pass it to the entry.
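
To illustrate the mismatch, here is a minimal, self-contained sketch. It is not the Commons Compress code: the class "CpioPadSketch" and the helper "headerPadCount" are hypothetical, and the header size of 110 bytes with a 4-byte alignment boundary are assumed example values. The helper does the same arithmetic as "getHeaderPadCount", but takes the name size as a byte count supplied by the caller. The demo uses "getBytes" only to obtain a byte count for comparison; in a real reader that number would be the name size read from the stream.

```java
import java.nio.charset.StandardCharsets;

class CpioPadSketch {

    // Hypothetical helper: same arithmetic as getHeaderPadCount(), but the
    // name size (without the terminating null) is passed in as a byte count
    // instead of being derived from String#length().
    static int headerPadCount(int headerSize, long nameSizeInBytes, int alignmentBoundary) {
        if (alignmentBoundary == 0) {
            return 0;
        }
        long size = headerSize + 1 + nameSizeInBytes; // +1: name has terminating null
        int remain = (int) (size % alignmentBoundary);
        return remain > 0 ? alignmentBoundary - remain : 0;
    }

    public static void main(String[] args) {
        String name = "t\u00e4st.txt"; // "täst.txt": the 'ä' takes two bytes in UTF-8

        int charCount = name.length();                                // 8
        int byteCount = name.getBytes(StandardCharsets.UTF_8).length; // 9

        // Assumed example values: 110-byte header, 4-byte alignment.
        int padFromChars = headerPadCount(110, charCount, 4); // 1
        int padFromBytes = headerPadCount(110, byteCount, 4); // 0

        System.out.println("chars=" + charCount + " pad=" + padFromChars);
        System.out.println("bytes=" + byteCount + " pad=" + padFromBytes);
    }
}
```

With these values the character-based count yields one byte of padding where the byte-based count yields none, so a reader using the former would skip one byte too many after the name and misread whatever follows.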