Hi Roger, Thank you very much for your review and the feedback. Please find my comments below and a new patch attached. Philipp
On Wed, 2018-12-12 at 10:52 -0500, Roger Riggs wrote: > Hi Phillip, > > > > Sorry, got busy... > > > > Can you rebase the patch to the current repo, it did not apply > cleanly. > > > > I know you are focused on removing the deprecation, but a few > localized improvements > > would be good. > > > > In Attributes.java : 346-337, it uses StringBuffer, please change > it > to StringBuilder. > > Unless thread safety is an issue, StringBuilder is recommended, > it > is > > slightly more efficient since it does no synchronization. I did deliberately not touch the StringBuffer in the previous patch but fully agree now I know it has a chance to be accepted. Would you accept to replace StringBuffer with plain string concatenation after http://op enjdk.java.net/jeps/280 which was not in place at the time those StringBuffers were introduced? > > > - And the StringBuilder should be sized when it is created, to > avoid > needing to resize it later. > > Using a single StringBuilder all the entries, using > setLength(0), > would save allocating > > for each entry. Jep 280 would also avoid having to size the buffers far as I understand. > > > - check the indentation @line 308-20 and 311. The indentation was weird, 5 instead of 4 spaces on some lines and I re-indented only the lines I touched anyway in the previous patch. Now, some lines appear changed only due to the indentation. After the StringBuffer removal only two of them are left in the current patch and certainly don't add significantly many unrelated changed lines now any more. > > > In Manifest.java: > > - write72 method !String.isEmpty() is preferred over the > .length() > > 0. ok > - if the line is empty, should it write the LINE_BREAK_BYTES? > > A blank line in the manifest may be seen as significant. Before the patch, a line break was always added to the end of the StringBuffer after passing to make72Safe and before writing it. Now with the previous patch, write72 added it. Altogether makes no difference. But after reconsidering your point, I found a clearer approach, I hope, than passing an empty string for having a line break requested to be output which now also seems to me having been not the most obvious way and more like a kind of a hack before. In the course of that change I also renamed write72 to println and println72 and also added a test for it. Hope you also like it better that way. > > > - in the write method: Change StringBuffer to StringBuilder > > > > - The javadoc links to MANIFEST_VERSION and SIGNATURE_VERSION > should use "#". > > * {@link Attributes.Name#MANIFEST_VERSION} or > > * {@link Attributes.Name#SIGNATURE_VERSION} must be set in removed it again because it applies more to bug 8196371 or 6910466 > > > Thanks, Roger > > > > > > On 12/04/2018 03:34 AM, Philipp Kunz > wrote: > > > > > > > Hi Roger, > > > > > > > > I'm afraid the previous patch missed the tests, should be > > included this time. > > > > > > > > The intention of the patch is to solve only bug 8066619 about > > deprecation. I sincerely hope the changes are neutral. > > > > > > > > The new ValueUtf8Coding test heavily coincides/overlaps with > > 6202130 which is why I mentioned it. I'm however not > > satisfied > > that that test alone also completely solves 6202130 because > > 6202130 has or might have implications with breaking > > characters > > encoded in UTF-8 with more than one bytes across a line > > break > > onto a continuation line which is not part of the current > > patch > > proposed for 8066619. At some point I proposed > > ValueUtf8Coding > > with only removing the comments from the implementation in > > http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-October/0 > > 56166.html > > but I have changed my mind since then and think now 6202130 > > should also change the implementation not to break lines > > inside > > of multi-byte characters which is not part of the current > > patch > > and is probably easier after the current patch if necessary > > at > > all. Both 6202130 and the current patch for 8066619 here > > touch > > the UTF-8 coding of manifests and which ever is solved > > first > > should add a corresponding test because no such test exists > > yet > > I believe. Worth to mention are > > test/jdk/tools/launcher/DiacriticTest.java and > > test/jdk/tools/launcher/UnicodeTest.java both of which test > > the > > JVM launch and have a somewhat different purpose. I haven't > > found any other test for the specifically changed lines of > > code > > apart from probably many tests that use manifests > > indirectly in > > some form. > > > > > > > > Regards, > > Philipp > > > > > > > > > > > > > > On Mon, 2018-12-03 at 16:43 -0500, Roger Riggs wrote: > > > > > Hi Phillip, > > > > > > The amount detail obscures the general purpose. > > > And there appears to be more than 1. > > > The Jira issue IDs mentioned are 8066619 and 6202130. > > > > > > Is this functionally neutral and only fixes the deprecations? > > > > > > There is a mention that a test is needed for multi-byte chars, > > > but a test > > > is not included. Is there an existing test for that? > > > > > > Its probably best to identify the main functional improvement > > > (multi-byte) > > > and fix the deprecation as a side effect. > > > > > > Thanks for digging through the issues and the explanations; > > > it will take a bit of study to unravel and understand everything > > > in this > > > changeset. > > > > > > Regards, Roger > > > > > > > > > On 12/01/2018 06:49 AM, Philipp Kunz wrote: > > > > Find the proposed patch attached. Some comments and > > > > explanations, here: > > > > > > > > There is a quite interesting implementation in Manifest and > > > > Attributes > > > > worth quite some explanation. The way it used to work before > > > > was: > > > > > > > > 1. put manifest header name, colon and space into a > > > > StringBuffer > > > > -> the buffer now contains a string of characters each high- > > > > byte of > > > > which is zero as explained later why this is important. the > > > > high-bytes > > > > are zero because the set of allowed characters is very limited > > > > to ":", > > > > " ", "a" - "z", "A" - "Z", "0" - "9", "_", and "-" according to > > > > Attributes.Name#hash(String) so far with only the name and the > > > > separator and yet without the values. > > > > > > > > 2. if the value is not null, encode it in UTF-8 into a byte > > > > array and > > > > instantiate a String with it using deprecated > > > > String#String(byte[],int,int,int) resulting in a String with > > > > the same > > > > length as the byte array before holding one byte in each > > > > character's > > > > low-byte. This makes a difference for characters encoded with > > > > more than > > > > one byte in UTF-8. The new String is potentially longer than > > > > the > > > > original value. > > > > > > > > 3. if the value is not null, append value to buffer. The one > > > > UTF-8 > > > > encoded byte per character from the appended string is > > > > preserved also > > > > in the buffer along with the previous buffer contents. > > > > > > > > 3alt. if the value is null, add "null" to the buffer. See > > > > java.lang.AbstractStringBuilder#append(String). Neither of the > > > > characters of "null" has a non-zero high-byte encoded as UTF-16 > > > > chars. > > > > > > > > 4. make72Safe inserts line breaks with continuation spaces. > > > > Note that > > > > the buffer here contains only one byte per character because > > > > all high- > > > > bytes are still zero so that line.length() and > > > > line.insert(index, ...) > > > > effectively operate with byte offsets and not characters. > > > > > > > > 5. buffer.toString() > > > > > > > > 6. DataOutputStream#writeBytes(String). First of all read the > > > > JavaDoc > > > > comment for it, which explains it all: > > > > Writes out the string to the underlying output stream as a > > > > sequence of bytes. Each character in the string is written > > > > out, in > > > > sequence, by discarding its high eight bits. If no > > > > exception is > > > > thrown, the counter <code>written</code> is incremented by > > > > the > > > > length of <code>s</code> > > > > This restores the earlier UTF-8 encoding correctly. > > > > > > > > The topic has been discussed and mentioned already in > > > > http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-May/0 > > > > 52946.ht > > > > ml > > > > https://bugs.openjdk.java.net/browse/JDK-6202130 > > > > > > > > String(byte[],int,int,int) works "well" or "well enough" only > > > > together > > > > with DataOutputStream#writeBytes(String). When removing > > > > String(byte[],int,int,int) from Manifest and Attributes because > > > > deprecated, it makes no sense to keep using > > > > DataOutputStream#writeBytes(String) either. > > > > > > > > For the same reason as String#String(byte[],int,int,int) has > > > > been > > > > deprecated, I suggest to also deprecate > > > > java.io.DataOutput#writeBytes(String) as a separate issue. This > > > > might > > > > relate to https://bugs.openjdk.java.net/browse/JDK-6400767 but > > > > that one > > > > came to a different conclusion some ten years ago. > > > > > > > > I preferred to stick with the DataOutputStream even though not > > > > strictly > > > > necessary any more. It is and has been in the API of Attributes > > > > (unfortunately not private) and should better not be removed by > > > > changing the parameter type. Same for > > > > Manifest#make72Safe(StringBuffer) > > > > which I deprecated rather than having removed. Someone could > > > > have > > > > extended a class from Manifest and use such a method and when > > > > changing > > > > the signature it could no longer even compile in a far-fetched > > > > case. > > > > > > > > LINE_BREAK, CONTINUATION_SPACE, LINE_BREAK_BYTES, and > > > > LINE_BREAK_WITH_CONTINUATION_SPACE_BYTES should prevent having > > > > to > > > > invoke getBytes(UTF_8) over and over again on "\r\n" and "\r\n > > > > " with > > > > the idea to slightly improve performance this way. I figured it > > > > does > > > > not need JavaDoc comments but would be happy to add them if > > > > desired. > > > > > > > > I removed "XXX Need to handle UTF8 values." from Manifest#read > > > > after > > > > adding a test for it in ValueUtf8Coding. This change and test > > > > also > > > > relate to bug 6202130 but does not solve that one completely. > > > > ValueUtf8Coding demonstrates that Manifest can read UTF-8 > > > > encoded > > > > values which is a necessary test case to cover for this patch > > > > here. > > > > ValueUtf8Coding is the same test as already submitted and > > > > suggested > > > > earlier. See > > > > http://mail.openjdk.java.net/pipermail/core-libs-dev/2018-Octob > > > > er/threa > > > > d.html#55848 > > > > > > > > Indentation in Attributes#write(DataOutputStream) was five > > > > spaces on > > > > most lines. I fixed indentation only on the lines changed > > > > anyway. > > > > > > > > I replaced String#String(byte[],int,int,String) with > > > > String#String(byte[],int,int,java.nio.charset.StandardCharsets. > > > > UTF_8) > > > > which as a difference does not declare to throw a > > > > java.io.UnsupportedEncodingException. That also replaced "UTF8" > > > > as a > > > > charset name which I would consider not optimal regarding > > > > sun.nio.cs.UTF_8#UTF_8() and sun.nio.cs.UTF_8#historicalName(). > > > > > > > > In my opinion there is still some duplicated or at least very > > > > similar > > > > code in Manifest#write, Attributes#writeMain, and > > > > Attributes#write but > > > > I preferred to change less rather than more and not to further > > > > refactor > > > > and re-combine it. > > > > > > > > In EmptyKeysAndValues and NullKeysAndValues tests I tried to > > > > demonstrate that the changed implementation does not change > > > > behaviour > > > > also in edge cases. I would have expected not having to test > > > > all these > > > > cases but then I realized it was possible to test and is > > > > therefore > > > > possible in a real use case as well however far-fetched. At > > > > least the > > > > > > > > if (value != null) { > > > > > > > > lines (three times) most obviously demand to test the null > > > > value cases. > > > > > > > > I'm looking curiously forward to any kind of feedback or > > > > opinion. > > > > > > > > Philipp > > > > > > > > > > > > > > > > > >
diff -r 54aafb3ba9ab src/java.base/share/classes/java/util/jar/Attributes.java --- a/src/java.base/share/classes/java/util/jar/Attributes.java Mon Sep 24 22:12:07 2018 -0700 +++ b/src/java.base/share/classes/java/util/jar/Attributes.java Mon Dec 17 04:28:11 2018 +0100 @@ -36,6 +36,8 @@ import sun.util.logging.PlatformLogger; +import static java.nio.charset.StandardCharsets.UTF_8; + /** * The Attributes class maps Manifest attribute names to associated string * values. Valid attribute names are case-insensitive, are restricted to @@ -298,25 +300,12 @@ * Writes the current attributes to the specified data output stream. * XXX Need to handle UTF8 values and break up lines longer than 72 bytes */ - @SuppressWarnings("deprecation") - void write(DataOutputStream os) throws IOException { - for (Entry<Object, Object> e : entrySet()) { - StringBuffer buffer = new StringBuffer( - ((Name) e.getKey()).toString()); - buffer.append(": "); - - String value = (String) e.getValue(); - if (value != null) { - byte[] vb = value.getBytes("UTF8"); - value = new String(vb, 0, 0, vb.length); - } - buffer.append(value); - - Manifest.make72Safe(buffer); - buffer.append("\r\n"); - os.writeBytes(buffer.toString()); - } - os.writeBytes("\r\n"); + void write(DataOutputStream out) throws IOException { + for (Entry<Object, Object> e : entrySet()) { + Manifest.println72(out, + e.getKey().toString() + ": " + e.getValue()); + } + Manifest.println(out); // empty line after individual section } /* @@ -326,7 +315,6 @@ * * XXX Need to handle UTF8 values and break up lines longer than 72 bytes */ - @SuppressWarnings("deprecation") void writeMain(DataOutputStream out) throws IOException { // write out the *-Version header first, if it exists @@ -338,7 +326,8 @@ } if (version != null) { - out.writeBytes(vername+": "+version+"\r\n"); + out.write((vername + ": " + version).getBytes(UTF_8)); + Manifest.println(out); } // write out all attributes except for the version @@ -346,30 +335,15 @@ for (Entry<Object, Object> e : entrySet()) { String name = ((Name) e.getKey()).toString(); if ((version != null) && !(name.equalsIgnoreCase(vername))) { - - StringBuffer buffer = new StringBuffer(name); - buffer.append(": "); - - String value = (String) e.getValue(); - if (value != null) { - byte[] vb = value.getBytes("UTF8"); - value = new String(vb, 0, 0, vb.length); - } - buffer.append(value); - - Manifest.make72Safe(buffer); - buffer.append("\r\n"); - out.writeBytes(buffer.toString()); + Manifest.println72(out, name + ": " + e.getValue()); } } - out.writeBytes("\r\n"); + Manifest.println(out); // empty line after main attributes section } /* * Reads attributes from the specified input stream. - * XXX Need to handle UTF8 values. */ - @SuppressWarnings("deprecation") void read(Manifest.FastInputStream is, byte[] lbuf) throws IOException { String name = null, value; byte[] lastline = null; @@ -401,7 +375,7 @@ lastline = buf; continue; } - value = new String(buf, 0, buf.length, "UTF8"); + value = new String(buf, 0, buf.length, UTF_8); lastline = null; } else { while (lbuf[i++] != ':') { @@ -412,13 +386,13 @@ if (lbuf[i++] != ' ') { throw new IOException("invalid header field"); } - name = new String(lbuf, 0, 0, i - 2); + name = new String(lbuf, 0, i - 2, UTF_8); if (is.peek() == ' ') { lastline = new byte[len - i]; System.arraycopy(lbuf, i, lastline, 0, len - i); continue; } - value = new String(lbuf, i, len - i, "UTF8"); + value = new String(lbuf, i, len - i, UTF_8); } try { if ((putValue(name, value) != null) && (!lineContinued)) { diff -r 54aafb3ba9ab src/java.base/share/classes/java/util/jar/Manifest.java --- a/src/java.base/share/classes/java/util/jar/Manifest.java Mon Sep 24 22:12:07 2018 -0700 +++ b/src/java.base/share/classes/java/util/jar/Manifest.java Mon Dec 17 04:28:11 2018 +0100 @@ -34,6 +34,8 @@ import java.util.HashMap; import java.util.Iterator; +import static java.nio.charset.StandardCharsets.UTF_8; + /** * The Manifest class is used to maintain Manifest entry names and their * associated Attributes. There are main Manifest Attributes as well as @@ -143,31 +145,24 @@ * @exception IOException if an I/O error has occurred * @see #getMainAttributes */ - @SuppressWarnings("deprecation") public void write(OutputStream out) throws IOException { DataOutputStream dos = new DataOutputStream(out); // Write out the main attributes for the manifest attr.writeMain(dos); // Now write out the per-entry attributes for (Map.Entry<String, Attributes> e : entries.entrySet()) { - StringBuffer buffer = new StringBuffer("Name: "); - String value = e.getKey(); - if (value != null) { - byte[] vb = value.getBytes("UTF8"); - value = new String(vb, 0, 0, vb.length); - } - buffer.append(value); - make72Safe(buffer); - buffer.append("\r\n"); - dos.writeBytes(buffer.toString()); + println72(dos, "Name: " + e.getKey()); e.getValue().write(dos); } dos.flush(); } /** - * Adds line breaks to enforce a maximum 72 bytes per line. + * Adds line breaks to enforce a maximum of 72 bytes per line. + * + * @deprecation Replaced with {@link #println72}. */ + @Deprecated(since = "13") static void make72Safe(StringBuffer line) { int length = line.length(); int index = 72; @@ -176,7 +171,37 @@ index += 74; // + line width + line break ("\r\n") length += 3; // + line break ("\r\n") and space } - return; + } + + /** + * Writes {@code line} to {@code out} with line breaks and continuation + * spaces within the limits of 72 bytes of contents per line followed + * by a line break. + */ + static void println72(OutputStream out, String line) throws IOException { + if (!line.isEmpty()) { + byte[] lineBytes = line.getBytes(UTF_8); + int length = lineBytes.length; + // first line can hold one byte more than subsequent lines which + // start with a continuation line break space + out.write(lineBytes[0]); + int pos = 1; + while (length - pos > 71) { + out.write(lineBytes, pos, 71); + pos += 71; + println(out); + out.write(' '); + } + out.write(lineBytes, pos, length - pos); + } + println(out); + } + + /** + * Writes a line break to {@code out}. + */ + static void println(OutputStream out) throws IOException { + out.write(new byte[] {'\r', '\n'}); } /** @@ -238,7 +263,7 @@ lastline = buf; continue; } - name = new String(buf, 0, buf.length, "UTF8"); + name = new String(buf, 0, buf.length, UTF_8); lastline = null; } Attributes attr = getAttributes(name); @@ -264,7 +289,7 @@ toLower(lbuf[2]) == 'm' && toLower(lbuf[3]) == 'e' && lbuf[4] == ':' && lbuf[5] == ' ') { try { - return new String(lbuf, 6, len - 6, "UTF8"); + return new String(lbuf, 6, len - 6, UTF_8); } catch (Exception e) { } diff -r 54aafb3ba9ab test/jdk/java/util/jar/Attributes/NullAndEmptyKeysAndValues.java --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test/jdk/java/util/jar/Attributes/NullAndEmptyKeysAndValues.java Mon Dec 17 04:28:11 2018 +0100 @@ -0,0 +1,225 @@ +/* + * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved. + * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. + * + * This code is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 only, as + * published by the Free Software Foundation. + * + * This code is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * version 2 for more details (a copy is included in the LICENSE file that + * accompanied this code). + * + * You should have received a copy of the GNU General Public License version + * 2 along with this work; if not, write to the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. + * + * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA + * or visit www.oracle.com if you need additional information or have any + * questions. + */ + +import static java.nio.charset.StandardCharsets.UTF_8; + +import java.io.ByteArrayInputStream; +import java.io.ByteArrayOutputStream; +import java.io.IOException; +import java.util.jar.Attributes; +import java.util.jar.Manifest; +import java.util.jar.Attributes.Name; +import java.lang.reflect.Field; + +import org.testng.annotations.Test; +import static org.testng.Assert.*; + +/** + * @test + * @bug 8066619 + * @run testng/othervm --illegal-access=warn NullAndEmptyKeysAndValues + * @summary Tests manifests with {@code null} and empty string {@code ""} + * values as section name, header name, or value in both main and named + * attributes sections. + */ +/* + * Note to future maintainer: + * In order to actually being able to test all the cases where key and values + * are null normal manifest and attributes manipulation through their public + * api is not sufficient but then there were these null checks there before + * which may or may not have had their reason and this way it's ensured that + * the behavior does not change with that respect. + * Once module isolation is enforced some test cases will not any longer be + * possible and those now tested situations will be guaranteed not to occur + * any longer at all at which point the corresponding tests can be removed + * safely without replacement unless of course another way is found inject the + * tested null values. + * Another trick to access package private class members could be to use + * deserialization or adding a new class to the same package on the classpath. + * Here is not important how the values are set to null because it shows that + * the behavior remains unchanged. + */ +public class NullAndEmptyKeysAndValues { + + static final String SOME_KEY = "some-key"; + static final String SOME_VALUE = "some value"; + static final String STRNULL = "null"; + static final String EMPTY_STR = ""; + static final Name EMPTY_NAME = new Name("tmp") {{ + try { + Field name = Name.class.getDeclaredField("name"); + name.setAccessible(true); + name.set(this, EMPTY_STR); + } catch (Exception e) { + throw new RuntimeException(e.getMessage(), e); + } + }}; + + @Test + public void testMainAttributesHeaderNameNull() throws Exception { + Manifest mf = new Manifest(); + Field attr = mf.getClass().getDeclaredField("attr"); + attr.setAccessible(true); + Attributes mainAtts = new Attributes() {{ + super.put( /* in: */ null, SOME_VALUE); + }}; + attr.set(mf, mainAtts); + mf.getMainAttributes().put(Name.MANIFEST_VERSION, "1.0"); + try { + mf = writeAndRead(mf); + fail(); + } catch ( /* out: */ NullPointerException e) { + return; // ok, was always like this + } + } + + @Test + public void testMainAttributesHeaderNameEmpty() throws Exception { + Manifest mf = new Manifest(); + mf.getMainAttributes().put(Name.MANIFEST_VERSION, "1.0"); + mf.getMainAttributes().put(EMPTY_NAME, SOME_VALUE); + try { + mf = writeAndRead(mf); + fail(); + } catch ( /* out: */ IOException e) { + return; // ok, was always like this + } + } + + @Test + public void testMainAttributesHeaderValueNull() throws Exception { + Manifest mf = new Manifest(); + Field attr = mf.getClass().getDeclaredField("attr"); + attr.setAccessible(true); + Attributes mainAtts = new Attributes() {{ + map.put(new Name(SOME_KEY), /* in: */ null); + }}; + attr.set(mf, mainAtts); + mf.getMainAttributes().put(Name.MANIFEST_VERSION, "1.0"); + mf = writeAndRead(mf); + assertEquals(mf.getMainAttributes().getValue(SOME_KEY), + /* out: */ STRNULL); + } + + @Test + public void testMainAttributesHeaderValueEmpty() throws Exception { + Manifest mf = new Manifest(); + Field attr = mf.getClass().getDeclaredField("attr"); + attr.setAccessible(true); + Attributes mainAtts = new Attributes() {{ + map.put(new Name(SOME_KEY), /* in: */ EMPTY_STR); + }}; + attr.set(mf, mainAtts); + mf.getMainAttributes().put(Name.MANIFEST_VERSION, "1.0"); + mf = writeAndRead(mf); + assertEquals(mf.getMainAttributes().getValue(SOME_KEY), + /* out: */ EMPTY_STR); + } + + @Test + public void testSectionNameNull() throws IOException { + Manifest mf = new Manifest(); + mf.getMainAttributes().put(Name.MANIFEST_VERSION, "1.0"); + mf.getEntries().put( /* in: */ null, new Attributes()); + mf = writeAndRead(mf); + assertNotNull(mf.getEntries().get( /* out: */ STRNULL)); + } + + @Test + public void testSectionNameEmpty() throws IOException { + Manifest mf = new Manifest(); + mf.getMainAttributes().put(Name.MANIFEST_VERSION, "1.0"); + mf.getEntries().put( /* in: */ EMPTY_STR, new Attributes()); + mf = writeAndRead(mf); + assertNotNull(mf.getEntries().get( /* out: */ EMPTY_STR)); + } + + @Test + public void testNamedSectionHeaderNameNull() throws IOException { + Manifest mf = new Manifest(); + mf.getMainAttributes().put(Name.MANIFEST_VERSION, "1.0"); + mf.getEntries().put(SOME_KEY, new Attributes() {{ + map.put( /* in: */ null, SOME_VALUE); + }}); + try { + mf = writeAndRead(mf); + fail(); + } catch ( /* out: */ NullPointerException e) { + return; // ok + } + } + + @Test + public void testNamedSectionHeaderNameEmpty() throws IOException { + Manifest mf = new Manifest(); + mf.getMainAttributes().put(Name.MANIFEST_VERSION, "1.0"); + mf.getEntries().put(SOME_KEY, new Attributes() {{ + map.put( /* in: */ EMPTY_NAME, SOME_VALUE); + }}); + try { + mf = writeAndRead(mf); + fail(); + } catch ( /* out: */ IOException e) { + return; // ok + } + } + + @Test + public void testNamedSectionHeaderValueNull() throws IOException { + Manifest mf = new Manifest(); + mf.getMainAttributes().put(Name.MANIFEST_VERSION, "1.0"); + mf.getEntries().put(SOME_KEY, new Attributes() {{ + map.put(new Name(SOME_KEY), /* in: */ null); + }}); + mf = writeAndRead(mf); + assertEquals(mf.getEntries().get(SOME_KEY).getValue(SOME_KEY), + /* out: */ STRNULL); + } + + @Test + public void testNamedSectionHeaderValueEmpty() throws IOException { + Manifest mf = new Manifest(); + mf.getMainAttributes().put(Name.MANIFEST_VERSION, "1.0"); + mf.getEntries().put(SOME_KEY, new Attributes() {{ + map.put(new Name(SOME_KEY), /* in: */ EMPTY_STR); + }}); + mf = writeAndRead(mf); + assertEquals(mf.getEntries().get(SOME_KEY).getValue(SOME_KEY), + /* out: */ EMPTY_STR); + } + + static Manifest writeAndRead(Manifest mf) throws IOException { + ByteArrayOutputStream out = new ByteArrayOutputStream(); + mf.write(out); + byte[] mfBytes = out.toByteArray(); + System.out.println("-------------------------------------------" + + "-----------------------------"); + System.out.print(new String(mfBytes, UTF_8)); + System.out.println("-------------------------------------------" + + "-----------------------------"); + + ByteArrayInputStream in = new ByteArrayInputStream(mfBytes); + return new Manifest(in); + } + +} diff -r 54aafb3ba9ab test/jdk/java/util/jar/Manifest/ValueUtf8Coding.java --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test/jdk/java/util/jar/Manifest/ValueUtf8Coding.java Mon Dec 17 04:28:11 2018 +0100 @@ -0,0 +1,214 @@ +/* + * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved. + * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. + * + * This code is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 only, as + * published by the Free Software Foundation. + * + * This code is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * version 2 for more details (a copy is included in the LICENSE file that + * accompanied this code). + * + * You should have received a copy of the GNU General Public License version + * 2 along with this work; if not, write to the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. + * + * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA + * or visit www.oracle.com if you need additional information or have any + * questions. + */ + +import static java.nio.charset.StandardCharsets.UTF_8; + +import java.io.ByteArrayInputStream; +import java.io.ByteArrayOutputStream; +import java.io.IOException; +import java.util.jar.Attributes; +import java.util.jar.Attributes.Name; +import java.util.jar.Manifest; +import java.util.List; +import java.util.ArrayList; + +import org.testng.annotations.Test; +import static org.testng.Assert.*; + +/** + * @test + * @bug 8066619 + * @run testng ValueUtf8Coding + * @summary Tests encoding and decoding manifest header values to and from + * UTF-8 with the complete Unicode character set. + */ /* + * see also "../tools/launcher/UnicodeTest.java" for manifest attributes + * parsed during launch + */ +public class ValueUtf8Coding { + + /** + * Maximum number of bytes of UTF-8 encoded characters in one header value. + * <p> + * There are too many different Unicode code points (more than one million) + * to fit all into one manifest value. The specifications state: + * <q>Implementations should support 65535-byte (not character) header + * values, and 65535 headers per file. They might run out of memory, + * but there should not be hard-coded limits below these values.</q> + * + * @see <a + * href="{@docRoot}/../specs/jar/jar.html#Notes_on_Manifest_and_Signature_Files"> + * Notes on Manifest and Signature Files</a> + */ + static final int SUPPORTED_VALUE_LENGTH = 65535; + + /** + * Returns {@code true} if {@code codePoint} is known not to be a supported + * character in manifest header values. Explicitly forbidden in manifest + * header values are according to a statement from the specifications: + * <q>otherchar: any UTF-8 character except NUL, CR and LF</q>. + * {@code NUL} ({@code 0x0}), however, works just fine and might have been + * used and might still be. + * + * @see <a href="{@docRoot}/../specs/jar/jar.html#Section-Specification"> + * Jar File Specification</a> + */ + static boolean isUnsupportedManifestValueCharacter(int codePoint) { + return codePoint == '\r' /* CR */ || codePoint == '\n' /* LF */; + }; + + /** + * Produces a list of strings with all Unicode characters except those + * explicitly invalid in manifest header values. + * Each string is filled with as many characters as fit into + * {@link #SUPPORTED_VALUE_LENGTH} bytes with UTF-8 encoding except the + * last string which contains the remaining characters. Each of those + * strings becomes a header value the number of which 65535 should be + * supported per file. + * + * @see <a + * href="{@docRoot}/../specs/jar/jar.html#Notes_on_Manifest_and_Signature_Files"> + * Notes on Manifest and Signature Files</a> + */ + static List<String> produceValuesWithAllUnicodeCharacters() { + ArrayList<String> values = new ArrayList<>(); + byte[] valueBuf = new byte[SUPPORTED_VALUE_LENGTH]; + int pos = 0; + for (int codePoint = Character.MIN_CODE_POINT; + codePoint <= Character.MAX_CODE_POINT; codePoint++) { + if (isUnsupportedManifestValueCharacter(codePoint)) { + continue; + } + + byte[] charBuf = Character.toString(codePoint).getBytes(UTF_8); + if (pos + charBuf.length > valueBuf.length) { + values.add(new String(valueBuf, 0, pos, UTF_8)); + pos = 0; + } + System.arraycopy(charBuf, 0, valueBuf, pos, charBuf.length); + pos += charBuf.length; + } + if (pos > 0) { + values.add(new String(valueBuf, 0, pos, UTF_8)); + } + // minimum number of headers supported is the same as the minimum size + // of each header value in bytes + assertTrue(values.size() <= SUPPORTED_VALUE_LENGTH); + return values; + } + + /** + * Returns simple, valid, short, and distinct manifest header names. + * The returned name cannot collide with "{@code Manifest-Version}" because + * the returned string does not contain "{@code -}". + */ + static Name azName(int seed) { + StringBuffer name = new StringBuffer(); + do { + name.insert(0, (char) (seed % 26 + (seed < 26 ? 'A' : 'a'))); + seed = seed / 26 - 1; + } while (seed >= 0); + return new Name(name.toString()); + } + + /** + * Writes and reads a manifest with the complete Unicode character set. + * The characters are grouped into manifest header values with about as + * many bytes as allowed each, utilizing a single big manifest. + * <p> + * This test assumes that a manifest is encoded and decoded correctly if + * writing and then reading it again results in a manifest with identical + * values as the original. The test is not about other aspects of writing + * and reading manifests than only that, given the fact and the way it + * works for some characters such as the most widely and often used ones, + * it also works for the complete Unicode character set just the same. + * <p> + * Only header values are tested. The set of allowed characters for header + * names are much more limited and are a different topic entirely and most + * simple ones are used here as necessary just to get valid and different + * ones (see {@link #azName}). + * <p> + * Because the current implementation under test uses different portions + * of code depending on where the value occurs to read or write, each + * character is tested in each of the three positions:<ul> + * <li>main attribute header,</li> + * <li>named section name, and</li> + * <li>named sections header values</li> + * </ul> + * Implementation of writing the main section headers in + * {@link Attributes#writeMain(java.io.DataOutputStream)} differs from the + * one writing named section headers in + * {@link Attributes#write(java.io.DataOutputStream)} regarding the special + * order of {@link Name#MANIFEST_VERSION} and + * {@link Name#SIGNATURE_VERSION} and also + * {@link Manifest#read(java.io.InputStream)} at least potentially reads + * main sections differently than reading named sections names headers in + * {@link Attributes#read(Manifest.FastInputStream, byte[])}. + */ + @Test + public void testCompleteUnicodeCharacterSet() throws IOException { + Manifest mf = new Manifest(); + mf.getMainAttributes().put(Name.MANIFEST_VERSION, "1.0"); + + List<String> values = produceValuesWithAllUnicodeCharacters(); + for (int i = 0; i < values.size(); i++) { + Name name = azName(i); + String value = values.get(i); + + mf.getMainAttributes().put(name, value); + Attributes attributes = new Attributes(); + mf.getEntries().put(value, attributes); + attributes.put(name, value); + } + + mf = writeAndRead(mf); + + for (int i = 0; i < values.size(); i++) { + String value = values.get(i); + Name name = azName(i); + + assertEquals(mf.getMainAttributes().getValue(name), value, + "main attributes header value"); + Attributes attributes = mf.getAttributes(value); + assertNotNull(attributes, "named section"); + assertEquals(attributes.getValue(name), value, + "named section attributes value"); + } + } + + static Manifest writeAndRead(Manifest mf) throws IOException { + ByteArrayOutputStream out = new ByteArrayOutputStream(); + mf.write(out); + byte[] mfBytes = out.toByteArray(); + + String sepLine = "----------------------------------------------" + + "--------------------------"; + System.out.println(sepLine); + System.out.print(new String(mfBytes, UTF_8)); + System.out.println(sepLine); + + ByteArrayInputStream in = new ByteArrayInputStream(mfBytes); + return new Manifest(in); + } + +} diff -r 54aafb3ba9ab test/jdk/java/util/jar/Manifest/WriteBinaryStructure.java --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/test/jdk/java/util/jar/Manifest/WriteBinaryStructure.java Mon Dec 17 04:28:11 2018 +0100 @@ -0,0 +1,73 @@ +/* + * Copyright (c) 2018, Oracle and/or its affiliates. All rights reserved. + * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. + * + * This code is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 only, as + * published by the Free Software Foundation. + * + * This code is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + * version 2 for more details (a copy is included in the LICENSE file that + * accompanied this code). + * + * You should have received a copy of the GNU General Public License version + * 2 along with this work; if not, write to the Free Software Foundation, + * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. + * + * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA + * or visit www.oracle.com if you need additional information or have any + * questions. + */ + +import static java.nio.charset.StandardCharsets.UTF_8; + +import java.io.ByteArrayOutputStream; +import java.io.IOException; +import java.util.jar.Attributes; +import java.util.jar.Attributes.Name; +import java.util.jar.Manifest; + +import org.testng.annotations.Test; +import static org.testng.Assert.*; + +/** + * @test + * @bug 8066619 + * @run testng WriteBinaryStructure + * @summary Tests that jar manifests are written in a particular structure + */ +public class WriteBinaryStructure { + + @Test + public void testMainAttributes() throws IOException { + Manifest mf = new Manifest(); + mf.getMainAttributes().put(Name.MANIFEST_VERSION, "1.0"); + mf.getMainAttributes().put(new Name("Key"), "Value"); + ByteArrayOutputStream buf = new ByteArrayOutputStream(); + mf.write(buf); + assertEquals(buf.toByteArray(), ( + "Manifest-Version: 1.0\r\n" + + "Key: Value\r\n" + + "\r\n").getBytes(UTF_8)); + } + + @Test + public void testIndividualSection() throws IOException { + Manifest mf = new Manifest(); + mf.getMainAttributes().put(Name.MANIFEST_VERSION, "1.0"); + Attributes attributes = new Attributes(); + mf.getEntries().put("Individual-Section-Name", attributes); + attributes.put(new Name("Key"), "Value"); + ByteArrayOutputStream buf = new ByteArrayOutputStream(); + mf.write(buf); + assertEquals(buf.toByteArray(), ( + "Manifest-Version: 1.0\r\n" + + "\r\n" + + "Name: Individual-Section-Name\r\n" + + "Key: Value\r\n" + + "\r\n").getBytes(UTF_8)); + } + +}