[ 
https://issues.apache.org/jira/browse/COMPRESS-176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13216647#comment-13216647
 ] 

Stefan Bodewig commented on COMPRESS-176:
-----------------------------------------

OK, this means nobody except Commons Compress and the InfoZIP tools seems to 
read the Unicode extra field.

This is what I get when trying to extract the original ZIP on Linux:

{noformat}
stefan@birdy:~/Desktop$ unzip test-winzip.zip 
Archive:  test-winzip.zip
  inflating: doc.txt.gz              
 extracting: doc2.txt                
warning:  test-winzip.zip appears to use backslashes as path separators
   creating: ??/
  inflating: ??/??zip.zip            
 extracting: ??/??.txt  
{noformat}

and it creates an "ä" directory.  I'll look through InfoZIP's sources to see 
what it bases its heuristics on; maybe we can use the same approach in Commons 
Compress to turn backslashes into slashes.
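
To make the idea concrete, here is a minimal sketch of the simplest possible 
normalization: replace every backslash in an entry name with a slash and then 
decide "directory or not" from a trailing slash. This is an assumption for 
illustration only, not InfoZIP's actual heuristic and not existing Commons 
Compress code; the class EntryNames and both method names are hypothetical, 
and the sample names in main are made up.

```java
// Hypothetical helper, not part of Commons Compress.
final class EntryNames {
    private EntryNames() {}

    // Assumption: every backslash in the raw entry name is meant as a
    // path separator, so it is replaced unconditionally.
    static String normalize(String rawName) {
        return rawName.replace('\\', '/');
    }

    // After normalization, a trailing slash marks a directory entry.
    static boolean looksLikeDirectory(String rawName) {
        return normalize(rawName).endsWith("/");
    }

    public static void main(String[] args) {
        // Illustrative names only; \u00e4 is the umlaut "ä".
        String[] rawNames = {"doc2.txt", "\u00e4\\", "\u00e4\\doc.txt"};
        for (String raw : rawNames) {
            System.out.println(normalize(raw)
                    + " directory=" + looksLikeDirectory(raw));
        }
    }
}
```

A blanket replacement like this is too aggressive for general use, because a 
backslash is a legal character in Unix file names. That is why some heuristic 
for detecting archives that really use backslashes as separators would be 
needed before applying it.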

                
> ArchiveInputStream#getNextEntry(): Problems with WinZip directories with 
> Umlauts
> --------------------------------------------------------------------------------
>
>                 Key: COMPRESS-176
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-176
>             Project: Commons Compress
>          Issue Type: Bug
>          Components: Archivers
>    Affects Versions: 1.3
>         Environment: Windows 7
>            Reporter: Wurstbrot mit Senf
>         Attachments: test-7zip.zip, test-windows.zip, test-winzip.zip, 
> testzap-winzip.zip
>
>
> There is a problem when handling a WinZip-created zip with Umlauts in 
> directories.
> I'm accessing a zip file created with WinZip containing a directory with an 
> umlaut ("ä") with ArchiveInputStream. When creating the zip file the 
> Unicode flag of WinZip had been active.
> The following problem occurs when accessing the entries of the zip:
> the ArchiveEntry for a directory containing an umlaut is not marked as a 
> directory and the file names for the directory and all files contained in 
> that directory contain backslashes instead of slashes (i.e. completely 
> different to all other files in directories with no umlaut in their path).
> There is no difference when letting the ArchiveStreamFactory decide which 
> ArchiveInputStream to create or when using the ZipArchiveInputStream 
> constructor with the correct encoding (I've tried different encodings CP437, 
> CP850, ISO-8859-15, but still the problem persisted).
> This problem does not occur when using the very same zip file but compressed 
> by 7zip or the built-in Windows 7 zip functionality.


