Currently, the compiler expects sources on a source path (i.e. --source-path) to be organized in a specific hierarchy corresponding to packages and classes. For the module source path, this is extended to include the enclosing module. Although we experimented with requiring the module directory to immediately enclose the package directory, this proved to be too onerous in practice, and we weakened the requirement so that the sources for a module need only be in one or more directories below a directory named for the module. The use case is complex projects that have variant forms of the source code for a module, such as baseline shared code and OS-specific variants, as you find in OpenJDK itself.

This means that each element on the module source path is composed of 3 parts:
1. one or more paths identifying the directories containing module directories
2. a directory named for the module
3. one or more paths identifying the subdirectories containing roots of the package hierarchies for the module's classes.

Thus, the '*' you see in the module source path should not be construed as a wildcard, so much as a token to indicate where in the overall path the module name is expected to appear.

The 3rd component may be empty, in which case you can drop the second component as well; it will be inferred to be at the end of the given paths.

Here are some examples of how this can be used.

If your project has a couple of modules m1, m2, the simplest organization is to have a src/ directory, and put the source for m1 in src/m1/ and the source for m2 in src/m2/. If you do that, your module source path could be like
    --module-source-path /Users/Me/MyProject/src
The compiler will be able to find everything it needs in this simple case under the src/ directory. If it needs to find class p1.C1 in m1 it can look in /Users/Me/MyProject/src/m1/p1/C1.java, etc.
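
To make the lookup in the simple case concrete, here is a minimal sketch of the mapping from module name and class name to a source file. The class and method names (SimpleModuleLayout, sourceFileFor) are hypothetical and used only for illustration; this is not the compiler's own code.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, for illustration only; not part of javac.
final class SimpleModuleLayout {
    /** Maps a module name and a fully qualified class name to the expected source file. */
    static Path sourceFileFor(Path srcRoot, String moduleName, String className) {
        // Packages map to nested directories, as on an ordinary source path.
        String relative = className.replace('.', '/') + ".java";
        // The module directory sits between the source root and the package hierarchy.
        return srcRoot.resolve(moduleName).resolve(relative);
    }

    public static void main(String[] args) {
        Path src = Paths.get("/Users/Me/MyProject/src");
        System.out.println(sourceFileFor(src, "m1", "p1.C1"));
    }
}
```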

Now suppose the project is a bit more complicated, and each module has some OS-specific code and some OS-independent code. You might want to put the Linux code in src/m1/linux, the Windows code in src/m1/windows, and the shared code in src/m1/shared, and ditto for m2. You can tell the compiler about that using a module source path like one of these:
    --module-source-path /Users/Me/MyProject/src/*/{linux,shared}
    --module-source-path /Users/Me/MyProject/src/*/{windows,shared}

Now, maybe the project gets even bigger, and you start generating some of the code for each module. You generate the code for m1 in build/gensrc/m1, and the code for m2 in build/gensrc/m2. You can describe that too:
    --module-source-path /Users/Me/MyProject/src/*/{linux,shared}:/Users/Me/MyProject/build/gensrc/*

At this point, it is important to realize that * is more than a wildcard. It stands for the same module name in all the places it appears. In other words, the source for m1 will be found in
    /Users/Me/MyProject/src/m1/{linux,shared}:/Users/Me/MyProject/build/gensrc/m1
and the source for m2 will likewise be found in
    /Users/Me/MyProject/src/m2/{linux,shared}:/Users/Me/MyProject/build/gensrc/m2
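
The substitution described here can be sketched as a small string expansion. This illustrates the rule only and is not javac's implementation: the names ModuleSourcePathSketch, expandBraces and rootsForModule are hypothetical, ':' is assumed to be the path separator, and nested brace groups are not handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the '*' token; not javac's implementation.
final class ModuleSourcePathSketch {
    /** Expands {a,b} alternation groups from left to right; nested groups are not handled. */
    static List<String> expandBraces(String element) {
        int open = element.indexOf('{');
        int close = element.indexOf('}', open + 1);
        if (open < 0 || close < 0) {
            return List.of(element);
        }
        String prefix = element.substring(0, open);
        String suffix = element.substring(close + 1);
        List<String> result = new ArrayList<>();
        for (String alternative : element.substring(open + 1, close).split(",")) {
            // Recurse so that a later group in the suffix is expanded as well.
            result.addAll(expandBraces(prefix + alternative + suffix));
        }
        return result;
    }

    /** Lists the package roots for one module, substituting its name for every '*'. */
    static List<String> rootsForModule(String moduleSourcePath, String moduleName) {
        List<String> roots = new ArrayList<>();
        for (String element : moduleSourcePath.split(":")) { // assumes ':' as the separator
            // An element without '*' is treated as if "/*" had been appended.
            String pattern = element.contains("*") ? element : element + "/*";
            for (String expanded : expandBraces(pattern)) {
                // The same module name is used at every position of the token.
                roots.add(expanded.replace("*", moduleName));
            }
        }
        return roots;
    }
}
```

Calling rootsForModule with the path above and "m1" yields the linux, shared and gensrc directories for m1; calling it with "m2" yields the corresponding directories for m2.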

Yes, this is complicated, but so is the use case. I go back to saying that the simple case is simple: if you arrange the code in your modules such that you put the code for a module in an enclosing directory named for the module, the module source path becomes more like a simple path, as in
    --module-source-path /Users/Me/MyProject/src
or if it is in multiple projects, use
    --module-source-path /Users/Me/MyProject/src:/Users/Me/MyOtherProject/src

The requirement that the source must be in/under a directory named for the module is a natural extension of the existing naming conventions for the directories and files that contain packages and classes.


Additional responses inline.



On 10/18/2016 12:09 AM, Eugene Zhuravlev wrote:
Hi dev. list members,

We at JetBrains are working on jigsaw-related javac features support in IntelliJ IDEA. Namely, the --module-source-path parameter. This option is important when multiple modules are compiled at the same time. While the IDE compiles modules one-by-one, there are certain situations where we have to use multi-module compilation. For example, the case when module-info files for different modules reference each other:

module-info.java in module A:
module a {
  exports a to b;
}


module-info.java in module B:
module b {
  requires a;
}

Here we have to compile sources for module A and module B together in one compile session and use --module-source-path parameter so that javac is able to resolve both module descriptors.

My recent investigations show that current javac implementation assumes certain disk layout for the source files that form a module. This leads to restrictions on the --module-source-path argument value. Currently this value is a list of paths where every path may optionally contain a "*" wildcard denoting any directory on particular file system level. The code responsible for --module-source-path option support is located in com.sun.tools.javac.file.Locations.ModuleSourcePathLocationHandler.init()

The code here works differently depending on whether the path element contains an optional '*' wildcard or not. If the path contains the wildcard, the directory name matching this wildcard will be assumed equal to module name (which is another problem) and the path to the module descriptor file is configured correctly. If there is no wildcard in the path, the path is not used "as is", but instead its direct sub-directories are analyzed and used as roots where module-info.java can be found. The latter looks more like a bug than intended behavior.

The different behavior is simply the compiler treating a path element without a '*' as equivalent to the path with '*' appended. In other words, a module source path of
    /Users/Me/MyProject/src
is equivalent to one like this
    /Users/Me/MyProject/src/*

So yes, the behavior you are seeing is intentional, and not a bug.
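
As a rough sketch of what that equivalence implies for a path element with no '*': each immediate subdirectory is taken as a candidate module, named after the directory. The names below (ImplicitModuleDirs, candidateModules) are hypothetical, and the code only illustrates the rule; it is not the code in Locations.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not the code in com.sun.tools.javac.file.Locations.
final class ImplicitModuleDirs {
    /** Treats "element" like "element/*": every immediate subdirectory names a module. */
    static Map<String, Path> candidateModules(Path element) throws IOException {
        Map<String, Path> modules = new TreeMap<>();
        // Only directories qualify; plain files directly under the element are ignored.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(element, Files::isDirectory)) {
            for (Path dir : entries) {
                // The directory name stands in the position of '*', so it is the module name.
                modules.put(dir.getFileName().toString(), dir);
            }
        }
        return modules;
    }
}
```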


From the IDE's point of view there is no need to use "*" wildcards, since the "too much typing" is not an issue for the program. Another reason is that the usage of wildcards is possible only for certain layouts of module A and B sources. In general case, when modules contain several source roots on different file system levels, the usage of wildcards is not possible.

'*' is not a wildcard, and it is not there for brevity. It is there to help the compiler coordinate the different directories containing the source code for the module. The '*' character could equally have been any other token, like '%' or "MODULE-NAME". It is not a shell character, nor is it a file system character, it is simply a special token in the syntax for complex module source paths.


So enumeration of absolute paths to source roots is the only option available for the IDE. Due to the problem mentioned above this does not work either. The IDE could have created the paths with wildcards and this would have worked for some project layouts, but the assumption that the directory name is equal to module name looks too strict and should not be true for many real-life project layouts.

Currently, if you are compiling the source code for many modules at the same time, it is a requirement that there be a directory named for the module in the path. This is partly to assist the compiler when looking up references to classes in other modules, and partly to coordinate the directories when the source code for a single module is spread across many directories.


So the questions are:
- Are there any changes planned for the command line interface to address these issues?

The simple answer is no. We have been discussing -when- to use the source path, but nothing regarding the syntax of the module source path.


- If current command line behavior is correct and intended for some certain situations only, we would kindly ask to consider making module source path configuration more flexible via the compiler tooling API, which is used by IDEs. However, keeping command line interface and tooling API consistent is a good idea too.

It is certainly the case that apart from a slight name change, --module-source-path is still the same as originally designed, and it is the case that many other new command line options have evolved since then. The most obvious suggestion would be to allow a command line option and API to set an explicit package-oriented path for each module individually. Staying clear of the naming bike-shed for now, this could be
        --new-module-source-path MODULE=PATH
e.g.
        --new-module-source-path m1=/Users/Me/MyProject/src/m1 --new-module-source-path m2=/Users/Me/MyProject/src/m2
with corresponding API
        public void setLocationForModule(Location location, Name moduleName, List<Path> paths) throws IOException
e.g.
        fileManager.setLocationForModule(StandardLocations.MODULE_SOURCE_PATH, "m1", List.of(Paths.get("/Users/Me/MyProject/src/m1")));
        fileManager.setLocationForModule(StandardLocations.MODULE_SOURCE_PATH, "m2", List.of(Paths.get("/Users/Me/MyProject/src/m2")));


But that is just an off-the-wall spur-of-the-moment suggestion.
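
For illustration, here is how a tool might call such a per-module API. The sketch assumes a JDK whose javax.tools.StandardJavaFileManager provides setLocationForModule(Location, String, Collection<? extends Path>); JDK 9 and later declare a method of that shape, taking a String module name rather than the Name suggested above. The class name PerModuleSourcePath and the method configure are hypothetical. The sketch expects directories that already exist, since the file manager may reject paths that do not.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileManager.Location;
import javax.tools.StandardJavaFileManager;
import javax.tools.StandardLocation;
import javax.tools.ToolProvider;

// Hypothetical usage sketch; assumes a JDK (9 or later) that includes the compiler.
final class PerModuleSourcePath {
    /** Registers one source root per module and reports whether both modules were recorded. */
    static boolean configure(Path m1Src, Path m2Src) throws IOException {
        // getSystemJavaCompiler() returns null on a runtime that does not include javac.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        try (StandardJavaFileManager fm = compiler.getStandardFileManager(null, null, null)) {
            // Each module gets its own package-oriented path; no directory naming rule is implied here.
            fm.setLocationForModule(StandardLocation.MODULE_SOURCE_PATH, "m1", List.of(m1Src));
            fm.setLocationForModule(StandardLocation.MODULE_SOURCE_PATH, "m2", List.of(m2Src));
            Location m1 = fm.getLocationForModule(StandardLocation.MODULE_SOURCE_PATH, "m1");
            Location m2 = fm.getLocationForModule(StandardLocation.MODULE_SOURCE_PATH, "m2");
            return m1 != null && m2 != null;
        }
    }
}
```

An IDE could call configure once per compilation with whatever source roots its project model holds for each module.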

Thanks in advance for any comments on the problem,


-- Jon
