On Mon, 1 Jul 2024 14:03:52 GMT, Chen Liang <li...@openjdk.org> wrote:
>> Please review this bugfix to the way the language of a snippet is determined
>> and processed.
>>
>> The language of a snippet affects the form of snippet markup and enables
>> external syntax highlighting, such as that provided by prism.js. The
>> language of a snippet is
>> [determined](https://docs.oracle.com/en/java/javase/22/docs/specs/javadoc/doc-comment-spec.html#snippet)
>> as follows:
>>
>>> A snippet may specify a `lang` attribute, which identifies the kind of
>>> content in the snippet. For an inline snippet, the default value is `java`.
>>> For an external snippet, the default value is derived from the extension of
>>> the name of the file containing the snippet's content.
>>
>> There are two issues that this PR fixes. The first issue is a specification
>> issue. The spec says nothing about the language of a hybrid snippet, which
>> has features of both inline and external snippets. It makes sense to
>> specify that in the absence of the `lang` attribute, the language of a
>> hybrid snippet is derived from the file extension. Put differently, when
>> determining the language, a hybrid snippet behaves like an external snippet,
>> not like an inline snippet.
>>
>> The second issue is an implementation issue. If the `lang` attribute or the
>> file extension is `java` or `properties`, then the form of markup
>> corresponds to that language and the HTML construct modelling the snippet is
>> attributed with `class=language-java` or `class=language-properties`
>> respectively. This is expected. However, if the `lang` attribute or the file
>> extension is neither of those, or the `lang` attribute is default, then the
>> form of markup is assumed to be that of `java`, but the HTML construct
>> modelling the snippet is not attributed, which means that the language is
>> not passed through to the 3rd party syntax highlighters.
>>
>> Stepping out of this PR for a moment, there is clearly a conflation between
>> the language of a snippet and the form of snippet markup. Those are linked
>> and controlled by a single knob. That and the design whereby every snippet
>> in an unsupported language can use markup for the Java language was
>> purposeful: it was considered simple and practical.
>>
>> This PR proposes that the language of a snippet is determined and processed
>> as follows:
>>
>> 1. If the `lang` attribute is present, then its value is the language; if
>>    that value is empty, then the language is undefined
>> 2. Otherwise,
>>    1. If the snippet...
>
> src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/taglets/SnippetTaglet.java
> line 498:
>
>> 496:             return null;
>> 497:         }
>> 498:         return (lastPeriod == fileName.length() - 1) ? null :
>> fileName.substring(lastPeriod + 1);
>
> Some files, like `.gitignore`, only have suffixes, yet they are valid
> languages. What do you think?

Sure, that's why there's `<= 0` and not `< 0` a few lines above that:

    int lastPeriod = fileName.lastIndexOf('.');
    if (lastPeriod <= 0) {
        return null;
    }

So, if the only `.` in `fileName` is the leading one, as in `.gitignore`, the
extension is null.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19971#discussion_r1661212422
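
For illustration only, the extension logic discussed in this thread can be pulled into a standalone sketch. The class name `ExtensionSketch` and the method name `extensionOf` are hypothetical and are not the names used in `SnippetTaglet.java`; only the two checks quoted above are taken from the PR, and the rest is an assumption made to get something runnable.

```java
public class ExtensionSketch {

    // Hypothetical helper that mirrors the two checks quoted in this thread.
    // Returns the text after the last period, or null if there is no usable extension.
    public static String extensionOf(String fileName) {
        int lastPeriod = fileName.lastIndexOf('.');
        // <= 0 covers both "no period at all" (-1) and "only a leading period" (0),
        // so a name such as ".gitignore" has no extension.
        if (lastPeriod <= 0) {
            return null;
        }
        // A trailing period, as in "Foo.", leaves nothing after it: no extension either.
        return (lastPeriod == fileName.length() - 1)
                ? null
                : fileName.substring(lastPeriod + 1);
    }

    public static void main(String[] args) {
        String[] names = {"Snippet.java", "app.properties", ".gitignore", "README", "Foo."};
        for (String name : names) {
            System.out.println(name + " -> " + extensionOf(name));
        }
    }
}
```

Note that under these checks a name like `.foo.bar` still yields `bar`, because its last period is not at index 0; only names whose sole period is the leading one are treated as having no extension.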