DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG-RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT <http://issues.apache.org/bugzilla/show_bug.cgi?id=41455>. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.
http://issues.apache.org/bugzilla/show_bug.cgi?id=41455

           Summary: TarEntry.java: getName does not provide for non-ASCII encoded entry names
           Product: Ant
           Version: 1.7.0RC1
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Core
        AssignedTo: dev@ant.apache.org
        ReportedBy: [EMAIL PROTECTED]


If a tar file contains entries whose names use a non-ASCII encoding, for
example 'Shift_JIS', then the TarEntry getName method does not return the
entry's filename correctly. It is converted to ASCII. (The zip classes work
fine.)

One possible way to remove the limitation without causing any
incompatibility is to add a new method, getByteName, which returns the name
of the entry as a byte array. I have enclosed a patch to TarInputStream.java
which adds the ability to set and return the byte name for the entry's name.

This solution is probably incomplete. Ideally, the TarEntry class should
have a method getByteName, but more modifications are needed in order to do
that.

I do not have good reproduction steps, but basically:

    tar cvpf testfile.tar 'Shift_JIS filename'
    tar xvpf testfile.tar

I do not see a way to attach a patch file, so I include the patch to
TarInputStream.java below:

--- TarInputStream.java.orig    2007-01-24 10:30:04.321949000 -0500
+++ TarInputStream.java 2007-01-24 16:45:38.477672000 -0500
@@ -19,8 +19,34 @@
 /*
  * This package is based on the work done by Timothy Gerard Endres
  * ([EMAIL PROTECTED]) to whom the Ant project is very grateful for his great code.
+ *
+ */
+
+/*
+ * The class is modified from the original to provide the ability
+ * to return the byte name for the entry.  The byte name array can
+ * be used to correctly construct a filename with an appropriate
+ * non-ASCII encoding.  The reason for the modifications is the
+ * TarInputStream class does not return non-ASCII names in the
+ * getName method.
+ *
+ * The method getNextEntry was modified from the original version to add
+ * extraction and setting of the protected variable byteName, the byte
+ * array name for the entry.
+ *
+ * New methods to process the byte name array have been added:
+ *     append           Append a byte buffer to the appendTo byte array.
+ *     parseByteName    Parse the byte name from the header.
+ *     setByteName      Set this entry's byte name.
+ *     getByteName      Get this entry's byte name.
+ *
+ * The modifications were written by
+ *     Kelly G. Luetkemeyer
+ *     The MathWorks, Inc.
+ *     1/24/2007
  */
 
+
 package org.apache.tools.tar;
 
 import java.io.FilterInputStream;
@@ -42,6 +68,7 @@
     protected long entrySize;
     protected long entryOffset;
     protected byte[] readBuf;
+    protected byte[] byteName;
     protected TarBuffer buffer;
     protected TarEntry currEntry;
 
@@ -191,6 +218,8 @@
      * If there are no more entries in the archive, null will
      * be returned to indicate that the end of the archive has
      * been reached.
+     *
+     * The byteName is set from the TarEntry header buffer.
      *
      * @return The next TarEntry in the archive, or null.
      * @throws IOException on error
@@ -254,10 +283,17 @@
             StringBuffer longName = new StringBuffer();
             byte[] buf = new byte[256];
             int length = 0;
+            byte [] tmpname;
+            tmpname = null;
+
             while ((length = read(buf)) >= 0) {
                 longName.append(new String(buf, 0, length));
+                tmpname = append(tmpname, buf, length);
             }
+
             getNextEntry();
+            this.byteName = tmpname;
+
             if (this.currEntry == null) {
                 // Bugzilla: 40334
                 // Malformed tar file - long entry name not followed by entry
@@ -269,8 +305,9 @@
                 longName.deleteCharAt(longName.length() - 1);
             }
             this.currEntry.setName(longName.toString());
+        } else {
+            this.byteName = parseByteName(headerBuf, TarConstants.NAMELEN);
         }
-
         return this.currEntry;
     }
 
@@ -387,4 +424,66 @@
             out.write(buf, 0, numRead);
         }
     }
+    /**
+     * Append a byte buffer to the appendTo byte array.
+     *
+     * @param appendTo The byte buffer to append data to.
+     * @param buf The byte buffer containing the new data.
+     * @param buflength The number of bytes in buf to copy.
+     * @return The new array with appended data.
+     */
+    public byte [] append(byte[] appendTo, byte[] buf, int buflength) {
+
+        if (appendTo == null) {
+            byte [] results = new byte[buflength];
+            System.arraycopy(buf, 0, results, 0, buflength);
+            return results;
+
+        } else {
+            int length;
+            length = appendTo.length + buflength;
+            byte [] results = new byte[length];
+            System.arraycopy(appendTo, 0, results, 0, appendTo.length);
+            System.arraycopy(buf, 0, results, appendTo.length, buflength);
+            return results;
+        }
+    }
+
+    /**
+     * Parse the byte name from the header.
+     *
+     * @param header The TarEntry header array.
+     * @param length The number of bytes to parse.
+     * @return The byte array name.
+     */
+    public byte [] parseByteName(byte[] header, int length) {
+        int end;
+        for(end = 0; end < length; ++end) {
+            if (header[end] == 0) {
+                break;
+            }
+        }
+        byte [] results = new byte[end];
+        System.arraycopy(header, 0, results, 0, end);
+        return results;
+    }
+
+    /**
+     * Set this entry's byte name.
+     *
+     * @param name This entry's name as a byte array.
+     */
+    public void setByteName(byte [] name) {
+        this.byteName = name;
+    }
+
+    /**
+     * Get this entry's byte name.
+     *
+     * @return This entry's name as a byte array.
+     */
+    public byte[] getByteName() {
+        return this.byteName;
+    }
+
 }
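
For reference, here is a minimal sketch of how a caller might turn the byte name back into a readable filename once the raw bytes are available. It assumes the caller already knows which encoding the archive's names use (Shift_JIS in the report's example). The class ByteNameSketch and the method decodeName are hypothetical and not part of Ant. append and parseByteName mirror the helpers described in the report, with append copying the new data at offset appendTo.length. The 100-byte name field is an assumption based on the classic tar header layout.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of Ant. */
final class ByteNameSketch {

    /** Returns a new array: appendTo followed by the first buflength bytes of buf. */
    static byte[] append(byte[] appendTo, byte[] buf, int buflength) {
        if (appendTo == null) {
            return Arrays.copyOf(buf, buflength);
        }
        byte[] results = Arrays.copyOf(appendTo, appendTo.length + buflength);
        // New data starts right after the existing bytes, at index appendTo.length.
        System.arraycopy(buf, 0, results, appendTo.length, buflength);
        return results;
    }

    /** Copies the header bytes up to the first NUL, or up to length if none is found. */
    static byte[] parseByteName(byte[] header, int length) {
        int end = 0;
        while (end < length && header[end] != 0) {
            end++;
        }
        return Arrays.copyOf(header, end);
    }

    /** Decodes the raw name bytes with a charset chosen by the caller (hypothetical method). */
    static String decodeName(byte[] byteName, String charsetName) {
        return new String(byteName, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Shift_JIS ships with most full JDKs; fall back to UTF-8 if it is missing.
        String charsetName = Charset.isSupported("Shift_JIS") ? "Shift_JIS" : "UTF-8";
        String original = "\u30c6\u30b9\u30c8.txt"; // katakana "tesuto" plus ".txt"

        // Simulate a NUL-padded name field (assumed width: 100 bytes).
        byte[] header = Arrays.copyOf(original.getBytes(Charset.forName(charsetName)), 100);

        byte[] byteName = parseByteName(header, 100);
        System.out.println(decodeName(byteName, charsetName).equals(original)); // prints true
    }
}
```

The point of keeping the name as bytes is that the tar header itself carries no charset information, so the choice of decoder has to come from the caller rather than from the stream.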