On Feb 19, 2019, at 2:50 PM, Brian Goetz <[email protected]> wrote:
>
> …we are still left with the same problem of finding the source file
> corresponding to com/foo/X.class, because it will not necessarily be the
> corresponding com/foo/X.java in the source path.

Yes, this is a key problem. A flattened "binary name" like pkg.X or pkg.X$Y is converted to a file system query on an internal name like pkg/X.class or pkg/X$Y.class.

If two classes, say pkg.X and pkg.Y, were to be defined in one bundle of bits, then (I think) one of the following conditions must hold: A reference to either class must converge to a reference to that one *.class file, or else (given that a reference to either class internalizes as a reference to a specific classfile name unique to it) both class file names must somehow converge to locate a single copy of the bits. More concisely, either this:

  pkg.X, pkg.Y =converge=> pkg/XY.class => bits for X, Y

or this:

  (pkg.X => pkg/X.class, pkg.Y => pkg/Y.class) =converge=> bits for X, Y

The first alternative appears to require a convergence mapping at the name level, while the second can also rely on a convergence mapping in the file system (sym. links) or in the files themselves (brief forwarding records).

The first alternative seems to me to split again into two ways, depending on whether the user of a class name has a burden to record the convergence. That is, if my source code refers to X or Y, does javac place an extra bit of information that helps locate their common definition XY? Or is it the job of the JVM and other implementors of the classpath mechanism to scan definitions like XY and "register" their willingness to define both names?

Leaning some more on the (odd but suggestive) term "convergence", the alternatives might be called:

1. def-site convergence
2. use-site convergence
3. class-path convergence

…based on where the primary responsibility of converging X, Y to XY occurs.

Use-site convergence is actually a pretty reasonable technique for nested classes, since Java mandates that a compiler which translates X.Y to pkg.X$Y at the source level must *also* record that X is the definer of X.Y in the InnerClasses attribute.
This gives a possible "hook" for extending class loaders to search pkg/X.class for a nearby definition of pkg.X$Y. This technique could probably be extended to associate "affiliated" classes which are not actually related by a nesting relation, but instead are located in the same source file.

So use-site convergence (via some InnerClasses-like stuffing) could help guide classpath searches. It would *not* help with source-path searches, however; those would have to crawl through package folders and peek inside of source files to find hidden class declarations.

In fact, the source-path mechanisms seem (to me) more resistant than classpath mechanisms to any notion of convergence, since we are talking about human-written source files, not classfiles which we have some control over. Nevertheless, the logic of the alternatives above applies somewhat to source-path considerations also:

1. def-site convergence = source path scanners need to peek inside all path files
2. use-site convergence = source files need an explicit "import X from Y" type statement to declare locations
3. path convergence = source paths need to be augmented with summaries (pre-compiled?) of what's stored where, perhaps rolled up in package-info.*.

It seems to me we might make progress with a mixed solution:

 - Inner classes use today's available hooks.
 - Affiliated classes use forwarding pointers (def-site convergence) in the file system, either sym-links if appropriate or stub classfiles which emulate sym-links on systems which lack them.

E.g., the stub classfile X.class would contain a zero-length constant pool and the unqualified name of the classfile XY.class which defines X in that same package & folder.

(Yep, Maarten, you sparked some musings.)

— John
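
For illustration only, here is a minimal Java sketch of the forwarding-stub lookup described above. It is not a real class loader or an existing JDK API. The class name ConvergenceSketch, the "FORWARD:" marker, and the use of an in-memory map to stand in for the package folder are all assumptions made for this example. A real stub would be a tiny classfile, not a text marker.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names exist in the JDK.
final class ConvergenceSketch {
    // Made-up marker standing in for a stub classfile's forwarding record.
    static final String FORWARD = "FORWARD:";

    // Binary name to internal classfile path, e.g. pkg.X$Y -> pkg/X$Y.class
    static String internalPath(String binaryName) {
        return binaryName.replace('.', '/') + ".class";
    }

    // Returns the path of the file that actually holds the bits for binaryName,
    // following at most one forwarding stub within the same folder.
    // Returns null if nothing is found or the stub dangles or chains.
    static String resolve(Map<String, String> files, String binaryName) {
        String path = internalPath(binaryName);
        String content = files.get(path);
        if (content == null) {
            return null;                       // no such classfile
        }
        if (!content.startsWith(FORWARD)) {
            return path;                       // ordinary classfile
        }
        // Stub: the payload is the unqualified name of the defining classfile.
        String target = content.substring(FORWARD.length());
        int slash = path.lastIndexOf('/');
        String dir = slash < 0 ? "" : path.substring(0, slash + 1);
        String targetPath = dir + target;
        String targetContent = files.get(targetPath);
        if (targetContent == null || targetContent.startsWith(FORWARD)) {
            return null;                       // dangling or chained stub
        }
        return targetPath;
    }

    public static void main(String[] args) {
        Map<String, String> files = new HashMap<>();
        files.put("pkg/XY.class", "bits for X, Y");
        files.put("pkg/X.class", FORWARD + "XY.class");
        files.put("pkg/Y.class", FORWARD + "XY.class");
        System.out.println(resolve(files, "pkg.X"));
        System.out.println(resolve(files, "pkg.Y"));
    }
}
```

In this sketch both pkg.X and pkg.Y keep their own internal names, and the convergence to pkg/XY.class happens in the files themselves, which is the second of the two alternatives drawn above. Limiting the lookup to a single hop within the same folder is a choice made for the example, not something the proposal specifies.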
