Hi Ess,

I have been reminded that the syntax for "exotic identifiers" in the Java language, as proposed for JDK 7 but later withdrawn, used the '#' character as a prefix in front of a classical string literal:

http://mail.openjdk.java.net/pipermail/coin-dev/2009-March/001131.html

I mistakenly replaced it with the syntax for Obj-C NSString literals, which uses '@'.

On 01/07/2017 01:40 AM, Ess Kay wrote:
As far as I can tell, the complete string @"What a wonderful world!" is itself a valid module, package, class, field and method name.

And using the syntax for exotic identifiers, it would be expressed as:

#"@\"What a wonderful world!\""


The '@' character has a reserved status in a module name but the JVM spec says that it may appear with some yet to be published meaning. Almost every possible string of printable characters is a valid module, package, class, field and method name. For example, the string \u0022\" is a valid 8 character Java field or method name.

Written with exotic identifier syntax as:

#"\\u0022\\\""

The string \\u0022\\" is a valid 10 character Java field or method name. So a solution that uses escape characters is not as obvious as it may appear at first glance. You could even throw in a leading, embedded and trailing space and it would still be valid.

No problem. Any sequence of Unicode characters is expressible as a string literal and consequently as an exotic identifier when prefixed with #.
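
To make the two examples above concrete, here is a minimal sketch that checks them against the unqualified-name rule as I read it in JVMS §4.2.2 (at least one code point, none of '.', ';', '[', '/', and for methods no '<' or '>' apart from <init> and <clinit>). The class and method names are hypothetical, and this is not how the verifier itself is implemented:

```java
class UnqualifiedNames {
    // Assumed rule (JVMS 4.2.2): non-empty, and none of . ; [ /
    static boolean isValidFieldName(String name) {
        if (name.isEmpty()) {
            return false;
        }
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '.' || c == ';' || c == '[' || c == '/') {
                return false;
            }
        }
        return true;
    }

    // Method names additionally exclude < and >, except the two special names.
    static boolean isValidMethodName(String name) {
        if (name.equals("<init>") || name.equals("<clinit>")) {
            return true;
        }
        return isValidFieldName(name)
                && name.indexOf('<') < 0
                && name.indexOf('>') < 0;
    }

    public static void main(String[] args) {
        // The string-literal bodies are the same text that would follow '#'
        // in the exotic identifier syntax.
        String first = "@\"What a wonderful world!\"";
        String second = "\\u0022\\\"";
        System.out.println(first + " -> " + isValidFieldName(first));
        System.out.println(second.length() + " chars -> " + isValidMethodName(second));
    }
}
```

Under the same assumed rule, a name containing '.', such as Let's party..., would be rejected, so not quite every string qualifies.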


I haven't yet tested this but, prima facie, even non-printable characters such as backspaces and carriage returns are permitted in package, class, field and method names (but not module names). Does the JVM support some escaping scheme to allow such characters in JAR manifests and service provider specifications? If the answer is yes then what is it? If the answer is no then doesn't that demonstrate the absurdity of the situation?

It appears that NUL, CR, and LF can't be part of header values in JAR manifests, but other characters can:

http://docs.oracle.com/javase/7/docs/technotes/guides/jar/jar.html#Manifest_Specification

    Notes on Manifest and Signature Files

    Line length:
        No line may be longer than 72 bytes (not characters), in its UTF8-encoded form. If a value would make the initial line longer than this, it should be continued on extra lines (each starting with a single SPACE).

    Limitations:
        Because header names cannot be continued, the maximum length of a header name is 70 bytes (there must be a colon and a SPACE after the name).
        NUL, CR, and LF can't be embedded in header values, and NUL, CR, LF and ":" can't be embedded in header names.
        Implementations should support 65535-byte (not character) header values, and 65535 headers per file. They might run out of memory, but there should not be hard-coded limits below these values.
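
As a rough illustration of the limitation quoted above, the following sketch tests whether a name could be stored as a manifest header value and round-trips one through java.util.jar.Manifest. The class name, the method names and the choice of the Main-Class header are assumptions made for the example only:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

class ManifestValues {
    // Per the quoted notes: NUL, CR and LF can't be embedded in header values.
    static boolean isAllowedHeaderValue(String value) {
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '\0' || c == '\r' || c == '\n') {
                return false;
            }
        }
        return true;
    }

    // Writes the value under Main-Class and reads it back from the bytes.
    static String roundTrip(String value) {
        try {
            Manifest out = new Manifest();
            Attributes main = out.getMainAttributes();
            main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
            main.putValue("Main-Class", value);
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            out.write(bytes);
            Manifest in = new Manifest(new ByteArrayInputStream(bytes.toByteArray()));
            return in.getMainAttributes().getValue("Main-Class");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String name = "@\"What a wonderful world!\"";
        System.out.println(isAllowedHeaderValue(name));
        System.out.println(name.equals(roundTrip(name)));
        System.out.println(isAllowedHeaderValue("back\rspace"));
    }
}
```

A class name containing a carriage return fails the first check, which is the case for which the manifest format offers no escape.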

Regards, Peter


So at this point Alan's suggested initial 'do nothing' approach is attractive. For now, the flexibility that the JVM spec gives is totally gratuitous, in that no one as yet appears to have had any reason to make use of it.

On Sat, Jan 7, 2017 at 8:45 AM, Peter Levart <peter.lev...@gmail.com <mailto:peter.lev...@gmail.com>> wrote:

    Hi Ess,


    On 01/06/2017 05:27 AM, Ess Kay wrote:
    chances of meeting a module-info.class with funky module names is low
    When I raised the initial question, I had no idea that the Java verifier
    had been changed (with Java 6?) to allow "funky" package, class, field and
    method names. Somehow that change passed right under the radar. Yes - a
    possible option would be to simply ignore the broad character range allowed
    by the JVM specification and trust that in practice no one would actually
    use the unusual characters in package, class, field, method or module names.
    A downside to that option is that we will no longer be able to say to our
    users that we fully support the JVM specification which in some cases can
    be a problem. Anyway, I guess it is time to accept the overwhelming inertia
    of the status quo and move on to the next problem.

    If I remember correctly, there was a crazy proposal in the past to
    specify a syntax for arbitrary symbol names in Java. It went
    roughly like:

    @"the syntax of Java string in here"


    So you could write code like:


    public class @"What a wonderful world!" {
        public static void @"Let's party..."() {
        }
    }

    //
    @"What a wonderful world!".@"Let's party..."();


    You could adopt this in your tool, what do you think?

    Regards, Peter


