Oops. Here is a link to the forgotten updated webrev: http://cr.openjdk.java.net/~prappo/8239487/webrev.02/
> On 10 Mar 2020, at 15:45, Pavel Rappo <[email protected]> wrote: > > The replies are inline. > >> On 10 Mar 2020, at 02:11, Jonathan Gibbons <[email protected]> >> wrote: >> >> Meta comment: >> This review is getting out of hand; I think it was a mistake for me to have >> ordered my comments according to your phases, and I apologize; I won't do it >> again. > > Sorry about that. > >> --- >> >> Ther's some naming stuff and minor feedback, but I think you're getting >> close. >> >> -- Jon >> >> >> On 03/09/2020 10:48 AM, Pavel Rappo wrote: >>> Jon, the replies are inline and the updated webrev is here: >>> >>> http://cr.openjdk.java.net/~prappo/8239487/webrev.01/ >>> >>>> On 7 Mar 2020, at 01:53, Jonathan Gibbons <[email protected]> >>>> wrote: >>>> >>>> phase-1 >>>> >>>> In AbstractIndexWriter, >>>> the use of "anyOf" with a singleton list seems a bit weird. >>> I've changed methods' names to "ofCategories" and "containsOfCategories". >>> I didn't change them to just "of" and "contains" because of the possibility >>> of having a richer selection of those in the near future. For instance, >>> ofModule(...), ofPackage(...), havingFirstLetter(...), etc. >> >> "containsOfCategories" doesn't make grammatical sense. Maybe you should go >> back to "anyOf" naming, which allows for the possibility of "allOf". > > I know that naming is hard. On the one hand, all code, both internal and > external, > should be of the same good quality. On the other hand, it's an internal API > inaccessible from the outside. How much effort do we want to put into it? > > Here's another perspective. On the one hand, the name should be a good > mnemonic > and ideally should be sufficient to see what's going on without having to > reach > for documentation. On the other hand, it is an idealistic and probably > unattainable goal. Sigh. > > I think that the "contains" method should have "any" in its name. The reason > is > that I can easily imagine reading the code and being puzzled by "Does this > mean > that true is returned when ALL of the listed categories or ANY of them are > found?" Methods that return Stream<SearchIndexItem> are different in that > respect. One glance at Category resolves any ambiguities. Indeed, an item may > be > of only one category. Thus, no item can be of many categories, thus the > implicit > semantics is "any/or" rather than "all/and". > >> It's just that both containsAnyOf and containsAllOf collapse to just >> 'contains' for a single argument. > > I can only say that this is not a unique case. > java.util.concurrent.CompletableFuture's > allOf and anyOf come to mind. > >>>> In SearchIndexItem: >>>> The word "index" is heavily overloaded in javadoc. Would it be better in >>>> this case to use "INDEX_TAG"? (Just asking). >>>> If nothing else, add doc-comments on the members of Category. >>> I've added comments instead of renaming that constant. >> OK, at least for now. ( I may change my mind in the future :-) ) >> >>> >>>> Folding "boolean systemProperty" into the Category seems like a good idea. >>>> >>>> In SearchIndexItems: ugh for having to put the apiNote after the scary >>>> comment. Nothing we can do, except maybe one day think about removing the >>>> scary comments, which have maybe outlived their usefulness. Maybe we could >>>> limit them to package-info.java fies. >>>> >>>> 120 * <p> The returned stream consists of all items {@code i} for >>>> which there's >>>> It's often considered better to avoid contractions (like "there's") in >>>> formal writing. >>>> >>>> 120 * <p> The returned stream consists of all items {@code i} for >>>> which there's >>>> 121 * a category {@code c}, from the specified categories, such that >>>> 122 * {@code i.getCategory().equals(c)}. The stream may be empty, if >>>> there are >>>> 123 * no such items. >>>> The commas are questionable around "from the specified categories". Don't >>>> use commas when the enclosed text is important to the understanding of the >>>> sentence. >>> Grammar has been fixed. Thanks. >>> >>>> `concat` seems overkill. For an internal method, I'd simplify the >>>> anyOfCategories, and either use overloads or a single varargs and just >>>> verify the length is not zero. By inspection, it seems to only ever use 1 >>>> or 2 args. Use overloads! >>> I still think this method has a value, and I do hope to reuse it really >>> soon. >>> The reason is, I hold compile-time checks in high regard. >> >> It's not the argument pattern (first, rest) as much as the implementation. >> I'm unconvinced of the need to detect duplicates, and concat is the wrong >> name if you do want to suppress duplicates. > > I see. I don't have a strong opinion about nulls and duplicates. However, I'm > convinced that those extra runtime checks have their value. > > That said, I don't want this thread to rathole into a discussion that has been > done many times in many places. It really is not that essential in this > context. > So I hope you'll welcome this "skimmed milk" version of the "concat" > functionality: > > /** > * Returns a concatenated stream of elements. > * > * <p> The elements of the returned stream are encountered in the > following order: > * {@code first, remaining[0], remaining[1], ..., > remaining[remaining.length - 1]}. > * > * @param first > * the first element > * @param remaining > * the remaining elements, if any > * @param <T> > * the type of elements > * > * @return the stream of elements > * > * @throws NullPointerException > * if {@code remaining} is {@code null} > */ > private static <T> Stream<T> concatenatedStreamOf(T first, T[] remaining) { > return Stream.concat(Stream.of(first), Stream.of(remaining)); > } > > Just bare-bones functionality of concatenation for internal consumption. > Duplicates and nulls are dealt with on the client side, if required. > >>> >>> Quick search [*] through the codebase shows there are other places where >>> this >>> pattern is used. Many of those ultimately push arguments down to either of >>> >>> java.util.EnumSet.of(E first, E... rest) >>> java.nio.file.FileSystem.getPath(String first, String... more) >>> >>> with a couple of exceptions. For example, these methods delegate to neither >>> of the above: >>> >>> java.net.http.WebSocket.Builder.subprotocols(String mostPreferred, >>> String... lesserPreferred) >>> com.sun.tools.javac.util.List.of(A x1, A x2, A x3, A... rest) >>> >>> Among the things I found are these: >>> >>> javax.tools.StandardJavaFileManager.PathFactory.getPath >>> (ahem) jdk.javadoc.internal.doclets.formats.html.markup.HtmlTree.UL >>> If you think of alternatives like overloads, as you suggested, or plain >>> vararg, >>> they have their own problems. Both will contain roughly the same code >>> addressing >>> duplicates and nulls, albeit a bit more simple. Simple, but not trivial. If >>> so, >>> we might as well provide this code once. It's not a performance-sensitive >>> place by any means. >>> >>> For overloads like >>> >>> ofCategories(Category category) >>> ofCategories(Category firstCategory, Category secondCategory) >>> ofCategories(Category firstCategory, Category secondCategory, Category >>> thirdCategory) >>> ... >>> the former method also has a naming problem: should "category" be plural >>> or singular? >>> >>> After I posted the initial webrev, I realized that there's a potential >>> problem >>> (but not yet a bug) in that `concat` method of mine. Namely, ordering. >>> Set.of() >>> is ordering-unfriendly by design. So a stream from that set would be of an >>> unknown encounter order. I fixed in a straightforward way: >>> >>> @SafeVarargs >>> @SuppressWarnings("varargs") >>> private static <T> Stream<T> concat(T required, T... optional) { >>> Set<T> set = new LinkedHashSet<>(optional.length + 1, 1.0f); >>> set.add(Objects.requireNonNull(required)); >>> for (T t : optional) { >>> if (!set.add(Objects.requireNonNull(t))) >>> throw new IllegalArgumentException(); >>> } >>> return set.stream(); >>> } >>> >>>> phase-2 >>>> >>>> AbstractIndexWriter >>>> line 511,512: horrible non-standard formatting: was this suggested by an >>>> IDE? >>> No, it was done by hand. Default IDE's style is not to blame. Fixed. >>> >>>> keyCharacter: when is this method called with an empty string? It looks >>>> like it is related to the SearchIndexItem having en empty label ... >>>> perhaps we should check/fix the problem there. >>>> Be careful of using Character.toUpperCase/toLowerCase ... they are >>>> locale-dependent. >>>> That being said, maybe we do want the locale-dependent version here, which >>>> is unusual. >>> One more of these things can be found here: >>> >>> jdk.javadoc.internal.doclets.toolkit.util.IndexBuilder#keyCharacter >>> >>> This is all preexisting code. That said, we should definitely investigate >>> it. >>> Do you want me to do it in the context of this task though? >> No >>> >>>> While it is reasonable to get the first character of a Java identifier, I >>>> wonder if it is reasonable >>>> to get the first character of an arbitrary string, i.e. from an {@index} >>>> tag. >>>> >>>> various: it's not wrong, but I personally don't like the style of >>>> importing static methods, like groupingBy and toList. Keep it if you want, >>>> but if I were writing the code, I would use Collectors.methodName >>> I share your dislike in this case. I guess I'm still trying to cram code >>> into >>> 80-character-long lines. Old habits die hard. >> The convention for this code is more like 100-characters or so. The days of >> punched cards and video display units with 80x24 character screens are long >> gone. > > ok, from now on I'll try to use that extra 25% for the benefit of the > jdk.javadoc codebase. > >>> >>>> phase-3 >>>> >>>> HtmlDocletWriter: this class is one instance per page ... do you really >>>> mean to put it here, as compared to Configuration (more global) or some >>>> more specific kind of page? (I see it moved in phase 4!) >>>> >>>> SystemPropertiesWriter:80 >>>> I don't know if this is the right or wrong place for the check, but for >>>> reference I would compare with other "optional" pages, like Constant >>>> Values and Serialized Form. >>> If in doubt, at least be consistent with what there is at the moment. That >>> makes >>> sense. As far as I understand you are pointing me to the "builders" >>> infrastructure, >>> right? >>> >>> SystemPropertyWriter has never had a builder. If we are to use it, may I >>> suggest >>> we address this in a follow-up issue? >> >> Uugh, builders, uugh. That's a big can of worms that we don't want to open >> right now. > > Right. I have put an extra comment there, if you don't mind. > >>> >>>> SystemPropertiesWriter:155 >>>> I like the intention, but I would have thought the link factory would have >>>> returned a link in code font. I guess I would check other places where >>>> links are generated. >>> Jon, I found >>> >>> jdk.javadoc.internal.doclets.formats.html.HtmlDocletWriter#plainOrCode >>> >>> and a family of overloaded createLink methods consuming some enigmatic >>> >>> @param strong whether to wrap the {@code label} in a SPAN element >>> >>> Is there anything in particular I should be looking for? >> I guess I was thinking to go look at other places where references to java >> elements are created. If you chase down code starting from >> AbstractMemberWriter.Signature, via methods like 'addParam', you end up at >> HtmlDocletWriter.getLink, which you can call with a suitably configured >> LinkInfoInfo. > > I'm looking into it and will update you soon. Thanks. > >>>> TagletWriterImpl:457 >>>> The comment is unclear: does "those" refer back to DocFileELement? >>> I've refined that comment. >>> >>>> I'm also curious why you think using identity equals and hashCode is >>>> inefficient. >>> It's not that those methods are inefficient, it's that this *cache* of >>> DocletElement instances *might* be inefficient. The worst-case scenario is >>> where >>> titles for the same HTML document represented by different DocletElement >>> objects >>> are repeatedly calculated and stored. The comment is just about that. I was >>> lazy >>> and didn't investigate where those instances come from, I also saw that >>> those >>> types do not override equals & hashCode, hence this comment. >> OK, so I guess you're concerned about space-inefficient, not time-inefficient >> because identity .equals and .hashCode are fast ;-) > > Well, kind of. It's when the cache behaves as if it weren't there and the > title > is parsed out of HTML each and every time. > >>>> Utils: uuugh ... I keep trying to get stuff *out* of the Utils dumping >>>> ground, and here you are adding to it. This is OK for now, but there may >>>> be a different utility class waiting to be written for processing >>>> DocTrees. This is not the only method to be working on DocTrees (although >>>> other work may be elsewhere, down in jdk.compiler module.) >>> Please propose a more appropriate place. >> It's OK for now. >> >>> >>>> test: the new convention is to generate files on the fly ... which will be >>>> even easier when we have text blocks. The ToolBox library class has code >>>> to help write out smal files like these ... and using text strings avoids >>>> the heavy copyright overhead. >>> Not that I'm opposing it, just asking: are there any other benefits or using >>> on-the-fly file generation beyond simply avoiding "the heavy copyright >>> overhead" >>> in this case? >> >> In all cases where I've had to create such sets of files, I find it's >> generally easier to visualize the overall pattern of the files when they are >> co-located. >> >>> >>> Both of the said techniques have their advantages and disadvantages. >>> On-the-fly >>> generation is convenient for example, when the output is small or >>> structurally >>> flat, or when it is parametric. Here it is neither. Pre-baked files can >>> benefit >>> from IDE support, which is quite convenient. I wouldn't like to give it up >>> this >>> lightly. >> OK. >>> >>>> phase-4 >>>> >>>> In SystemPropertiesWriter, there is no need to use a WeakHashMap compared >>>> to a standard HashMap. >>> Right. Using WeakHashMap is more of a declaration: it's cache, the whole >>> cache, >>> and nothing but the cache. I like this extra bit of information, but I can >>> swap >>> it to HashMap if you want. >> It's just an unnecessary overhead to be creating all those weak references. >> But, your point is taken. >>> >>>> Elements created by javac, which is the dominant collection of elements, >>>> will be permanent until the end of the javadoc run. And moreover, >>>> SystemPropertiesWriter should have a short lifetime, and no elements will >>>> be garbage-collectible while it is alive. >>> Right. (For the sake of the argument, those DocletElement instances are >>> created by javadoc.) >>> >>>> Line 172: I see the comment moved here from TagletWriterImpl. Same >>>> comments apply that I made for phase-3. >>>> >>>> In SearchIndexItem, if I understand it correctly, it seems strange to add >>>> a field of type DocletElement that is specific to certain types of search >>>> index item. And, there is a cognizant disparity between the name of the >>>> type (DocletElement) and the name of the method (getElement). Generally, I >>>> would expect a method named `getElement` to return an Element. >>> Understood and fixed. >>> >>>> It seems like all other serach-index items are associated with n element >>>> anyway! >>> Not sure what can be done here. The signature of `getElement` doesn't >>> require >>> any argument or `int` value for that matter, which could be mistaken for >>> some sort of index/position such as in List.get(int). >> I'm not sure I understand what you're saying here. I was trying to say that >> since all >> search index items seems to be abstractly associate with some element >> representing >> the "target" of the search item, then it was reasonable to have a general >> getter/setter >> for the associate element, and that seems to be what you've done, although >> maybe not >> so far as to set it in all cases for all SearchIndexItems. > > I see. > >>> *** >>> >>> What about feedback to the user in case no <title> in the document is found? >>> Do you have any suggestions? >> Which user? the one running javadoc, or the one reading the generated files. > > Bob the javadoc-builder. That is, the one running the javadoc tool. > >> The base name of the file (e.g. net-properties.html) > > This is how the fallback is performed in this change. > > -Pavel
