kinow commented on code in PR #146:
URL: https://github.com/apache/jena-site/pull/146#discussion_r1103840564
##########
layouts/_default/search.html:
##########
@@ -0,0 +1,200 @@
+{{ define "main" }}
+<!-- Source: https://makewithhugo.com/add-search-to-a-hugo-site/ -->
+<main>
+ <div id="search-results"></div>
+ <div class="search-loading">Loading...</div>
+
+ <script id="search-result-template" type="text/x-js-template">
+ <div id="summary-${key}">
+ <h3><a href="${link}">${title}</a></h3>
+ <p class="pb-0 mb-0">${snippet}</p>
+ <p class="opacity-50 pt-0 mt-0"><small>Score: ${score}</small></p>
+ <p>
+ <small>
+ ${ isset tags }Tags: ${tags}<br>${ end }
+ </small>
+ </p>
+ </div>
+ </script>
+
+ <script src="/js/fuse.min.js" type="text/javascript" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
+ <script src="/js/mark.min.js" type="text/javascript" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
+ <script type="text/javascript">
+ (function() {
+ const summaryInclude = 180;
+ // See: https://fusejs.io/api/options.html
+ const fuseOptions = {
+ // Indicates whether comparisons should be case sensitive.
+ isCaseSensitive: false,
+ // Whether the score should be included in the result set.
+ // A score of 0 indicates a perfect match, while a score of 1 indicates a complete mismatch.
+ includeScore: true,
+ // Whether the matches should be included in the result set.
+ // When true, each record in the result set will include the indices of the matched characters.
+ // These can consequently be used for highlighting purposes.
+ includeMatches: true,
+ // Only the matches whose length exceeds this value will be returned.
+ // (For instance, if you want to ignore single character matches in the result, set it to 2).
+ minMatchCharLength: 2,
+ // Whether to sort the result list, by score.
+ shouldSort: true,
+ // List of keys that will be searched.
+ // This supports nested paths, weighted search, searching in arrays of strings and objects.
+ keys: [
+ {name: "title", weight: 0.8},
+ {name: "contents", weight: 0.7},
+ // {name: "tags", weight: 0.95},
+ // {name: "categories", weight: 0.05}
+ ],
+ // --- Fuzzy Matching Options
+ // Determines approximately where in the text is the pattern expected to be found.
+ location: 0,
+ // At what point does the match algorithm give up.
+ // A threshold of 0.0 requires a perfect match (of both letters and location),
+ // a threshold of 1.0 would match anything.
+ threshold: 0.2,
Review Comment:
With a `threshold` of `0` I get `7` search results for SHACL, the same
number of files I get when grepping for it (case-insensitively):
```bash
kinow@ranma:~/Development/java/jena/jena-site/source$ grep -r -H -o -i SHACL | awk -F: '{ print $1 }' | sort -h | uniq
documentation/fuseki2/fuseki-config-endpoint.md
documentation/__index.md
documentation/javadoc.md
documentation/notes/system-initialization.md
documentation/shacl/__index.md
documentation/tools/__index.md
download/maven.md
```
But if I search for "shakl", it returns `0` results.
With `0.2`, both SHACL and SHAKL return 14 search results. The first 7
results have a score lower than `1` (in Fuse.js a higher score is worse), and
the other 7 have a score of `1` (I left the score displayed with the results
to help users).
So I decided to leave it at `0.2` so users still get some results if they
misspell their search query.
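A rough way to see why `0.2` is just enough to let "shakl" through: the sketch below assumes a simplified model in which, with location ignored, the score is the number of mismatched characters divided by the pattern length. It is not Fuse.js's Bitap implementation, and `minSubstitutions`, `fuzzyScore`, `isHit` and the sample text are hypothetical names made up for illustration.

```javascript
// Simplified model, NOT Fuse.js's own algorithm: the fewest character
// substitutions needed to align `pattern` with any same-length window of `text`.
function minSubstitutions(text, pattern) {
  const t = text.toLowerCase();
  const p = pattern.toLowerCase();
  if (p.length === 0) return 0;
  if (t.length < p.length) return p.length; // no window is long enough
  let best = p.length;
  for (let start = 0; start + p.length <= t.length; start++) {
    let errors = 0;
    for (let i = 0; i < p.length; i++) {
      if (t[start + i] !== p[i]) errors++;
    }
    best = Math.min(best, errors);
  }
  return best;
}

// Assumed score: errors / pattern length (0 = perfect match, 1 = complete mismatch).
function fuzzyScore(text, pattern) {
  if (pattern.length === 0) return 0;
  return minSubstitutions(text, pattern) / pattern.length;
}

// A document is a hit when its score does not exceed the threshold.
function isHit(text, pattern, threshold) {
  return fuzzyScore(text, pattern) <= threshold;
}

const sampleText = "Apache Jena SHACL validation"; // hypothetical page content
console.log(isHit(sampleText, "shacl", 0));   // exact match, score 0
console.log(isHit(sampleText, "shakl", 0));   // one wrong letter, rejected at threshold 0
console.log(isHit(sampleText, "shakl", 0.2)); // one wrong letter out of five, accepted at 0.2
```

Under this model a single wrong letter in a five-letter query scores 1/5 = 0.2, so a threshold of `0` rejects it while `0.2` accepts it, which is consistent with the behaviour described above.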
##########
layouts/_default/search.html:
##########
@@ -0,0 +1,200 @@
+{{ define "main" }}
+<!-- Source: https://makewithhugo.com/add-search-to-a-hugo-site/ -->
+<main>
+ <div id="search-results"></div>
+ <div class="search-loading">Loading...</div>
+
+ <script id="search-result-template" type="text/x-js-template">
+ <div id="summary-${key}">
+ <h3><a href="${link}">${title}</a></h3>
+ <p class="pb-0 mb-0">${snippet}</p>
+ <p class="opacity-50 pt-0 mt-0"><small>Score: ${score}</small></p>
+ <p>
+ <small>
+ ${ isset tags }Tags: ${tags}<br>${ end }
+ </small>
+ </p>
+ </div>
+ </script>
+
+ <script src="/js/fuse.min.js" type="text/javascript" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
+ <script src="/js/mark.min.js" type="text/javascript" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
+ <script type="text/javascript">
+ (function() {
+ const summaryInclude = 180;
+ // See: https://fusejs.io/api/options.html
+ const fuseOptions = {
+ // Indicates whether comparisons should be case sensitive.
+ isCaseSensitive: false,
+ // Whether the score should be included in the result set.
+ // A score of 0 indicates a perfect match, while a score of 1 indicates a complete mismatch.
+ includeScore: true,
+ // Whether the matches should be included in the result set.
+ // When true, each record in the result set will include the indices of the matched characters.
+ // These can consequently be used for highlighting purposes.
+ includeMatches: true,
+ // Only the matches whose length exceeds this value will be returned.
+ // (For instance, if you want to ignore single character matches in the result, set it to 2).
+ minMatchCharLength: 2,
+ // Whether to sort the result list, by score.
+ shouldSort: true,
+ // List of keys that will be searched.
+ // This supports nested paths, weighted search, searching in arrays of strings and objects.
+ keys: [
+ {name: "title", weight: 0.8},
+ {name: "contents", weight: 0.7},
+ // {name: "tags", weight: 0.95},
+ // {name: "categories", weight: 0.05}
+ ],
+ // --- Fuzzy Matching Options
+ // Determines approximately where in the text is the pattern expected to be found.
+ location: 0,
+ // At what point does the match algorithm give up.
+ // A threshold of 0.0 requires a perfect match (of both letters and location),
+ // a threshold of 1.0 would match anything.
+ threshold: 0.2,
+ // Determines how close the match must be to the fuzzy location (specified by location).
+ // An exact letter match which is distance characters away from the fuzzy location would
+ // score as a complete mismatch. A distance of 0 requires the match be at the exact
+ // location specified. A distance of 1000 would require a perfect match to be within 800
+ // characters of the location to be found using a threshold of 0.8.
+ distance: 100,
+ // When true, search will ignore location and distance, so it won't matter where in
+ // the string the pattern appears.
+ //
+ // NOTE: These settings are used to calculate the Fuzziness Score (Bitap algorithm) in Fuse.js.
+ // It calculates threshold (default 0.6) * distance (default 100), which gives 60 by
+ // default, meaning it will search for the query term within 60 characters of the location
+ // (default 0). Since Jena docs may have very long text that includes the query term anywhere,
+ // we disable it with ignoreLocation: true.
+ // For more: https://fusejs.io/concepts/scoring-theory.html#scoring-theory
+ ignoreLocation: true,
Review Comment:
@afs Fuse.js uses the location of the match in its algorithm, which IMO
doesn't make much sense for our use case. For example, by default it excludes
documents where the match appears more than 60 characters from the start
(threshold 0.6 * distance 100 = 60).
The setting above (`ignoreLocation: true`) disables that behaviour.
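To make the arithmetic concrete, here is a minimal sketch of the location penalty, assuming the score is modelled as `errors / patternLength + |matchIndex - location| / distance` as I read the scoring-theory page. It is an illustration only, not the library's code; `locationAwareScore` and its parameter names are made up.

```javascript
// Simplified model of a location-aware fuzzy score (an assumption for
// illustration, not Fuse.js's own code).
function locationAwareScore({
  errors,                 // mismatched characters in the best alignment
  patternLength,          // length of the search query
  matchIndex,             // where in the text the match starts
  location = 0,           // expected position of the match
  distance = 100,         // how fast the score degrades away from `location`
  ignoreLocation = false  // when true, position plays no part in the score
}) {
  const accuracy = errors / patternLength;
  if (ignoreLocation) return accuracy;
  const proximity = Math.abs(matchIndex - location);
  if (distance === 0) return proximity === 0 ? accuracy : 1;
  return accuracy + proximity / distance;
}

// An exact match for a five-letter query, 5000 characters into a long page.
const deepExactMatch = { errors: 0, patternLength: 5, matchIndex: 5000 };

// With the default threshold (0.6) and distance (100), the page is rejected.
console.log(locationAwareScore(deepExactMatch) <= 0.6); // false

// With ignoreLocation, the same match scores 0 and passes even at threshold 0.2.
console.log(locationAwareScore({ ...deepExactMatch, ignoreLocation: true }) <= 0.2); // true
```

In this model an exact match stops qualifying once it starts more than `threshold * distance` characters from `location`, which for long documentation pages would hide most legitimate hits.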