On Mon, 1 Jul 2024 13:33:28 GMT, Pavel Rappo <pra...@openjdk.org> wrote:

> Please review this bugfix to the way the language of a snippet is determined 
> and processed.
> 
> The language of a snippet affects the form of snippet markup and enables 
> external syntax highlighting, such as that provided by prism.js. The language 
> of a snippet is 
> [determined](https://docs.oracle.com/en/java/javase/22/docs/specs/javadoc/doc-comment-spec.html#snippet)
>  as follows:
> 
>> A snippet may specify a `lang` attribute, which identifies the kind of 
>> content in the snippet. For an inline snippet, the default value is `java`. 
>> For an external snippet, the default value is derived from the extension of 
>> the name of the file containing the snippet's content.
> 
> There are two issues that this PR fixes. The first issue is a specification 
> issue. The spec says nothing about the language of a hybrid snippet, which 
> has features of both inline and external snippets. It makes sense to 
> specify that in the absence of the `lang` attribute, the language of a hybrid 
> snippet is derived from the file extension. Put differently, when determining 
> the language, a hybrid snippet behaves like an external snippet, not like an 
> inline snippet.
> 
> The second issue is an implementation issue. If the `lang` attribute or the 
> file extension is `java` or `properties`, then the form of markup corresponds 
> to that language and the HTML construct modelling the snippet is attributed 
> with `class=language-java` or `class=language-properties` respectively. This 
> is expected. However, if the `lang` attribute or the file extension is 
> neither of those, or the `lang` attribute is default, then the form of markup 
> is assumed to be that of `java`, but the HTML construct modelling the snippet 
> is not attributed, which means that the language is not passed through to the 
> 3rd party syntax highlighters.
> 
> Stepping out of this PR for a moment, there is clearly a conflation between 
> the language of a snippet and the form of snippet markup. Those are linked 
> and controlled by a single knob. That and the design whereby every snippet in 
> an unsupported language can use markup for the Java language was purposeful: 
> it was considered simple and practical.
> 
> This PR proposes that the language of a snippet is determined and processed 
> as follows:
> 
> 1. If the `lang` attribute is present, then its value is the language; if 
> that value is empty, then the language is undefined
> 2. Otherwise, 
>    1. If the snippet is inline, then the language is ...

src/jdk.javadoc/share/classes/jdk/javadoc/internal/doclets/formats/html/taglets/SnippetTaglet.java line 498:

> 496:             return null;
> 497:         }
> 498:         return (lastPeriod == fileName.length() - 1) ? null : fileName.substring(lastPeriod + 1);

Some file names, like `.gitignore`, consist only of a suffix, yet they 
identify valid languages. What do you think?
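
To make the question concrete, here is a minimal sketch of the visible logic. 
It assumes the guard just before line 496 only checks for a missing period; 
that guard is not shown in the excerpt, and `ExtensionSketch` and 
`extensionOf` are hypothetical names, not the actual code in `SnippetTaglet`.

```java
public class ExtensionSketch {

    // Hypothetical helper mirroring the quoted lines 496-498. The first
    // guard is an assumption: the excerpt does not show its condition.
    static String extensionOf(String fileName) {
        int lastPeriod = fileName.lastIndexOf('.');
        if (lastPeriod == -1) { // assumed guard: no period, so no extension
            return null;
        }
        // A trailing period (as in "notes.") yields no extension either.
        return (lastPeriod == fileName.length() - 1) ? null
                : fileName.substring(lastPeriod + 1);
    }

    public static void main(String[] args) {
        String[] names = {"Foo.java", "app.properties", ".gitignore", "Makefile", "notes."};
        for (String name : names) {
            System.out.println(name + " -> " + extensionOf(name));
        }
    }
}
```

Under that assumption, `.gitignore` yields `gitignore`. If the guard were 
instead something like `lastPeriod <= 0`, the same name would yield `null`, 
so it may be worth a test that pins down the intended behavior.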

test/langtools/jdk/javadoc/doclet/testSnippetTag/TestSnippetTag.java line 1868:

> 1866:                         """, """
> 1867:                         # @highlight substring=hi:
> 1868:                         hi <span class="bold">there</span>

I see that these expected contents are duplicated in many places. Can we pull 
them out into local variables such as `propertiesEvaluated`, `javaEvaluated`, 
and `noneEvaluated` for clarity and ease of future maintenance?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/19971#discussion_r1661109580
PR Review Comment: https://git.openjdk.org/jdk/pull/19971#discussion_r1661106687
