This is an automated email from the ASF dual-hosted git repository.
janhoy pushed a commit to branch branch_9_0
in repository https://gitbox.apache.org/repos/asf/solr.git
The following commit(s) were added to refs/heads/branch_9_0 by this push:
new 90f7114671d [RefGuide] Various Lucene/Solr, Solr/Lucene cleanups (#799)
90f7114671d is described below
commit 90f7114671df5eb7ee3ff93413c813e404e8da15
Author: Jan Høydahl <[email protected]>
AuthorDate: Sat Apr 9 01:32:04 2022 +0200
[RefGuide] Various Lucene/Solr, Solr/Lucene cleanups (#799)
* Use dep-version-lucene for lucene version links
* More nice wording from #800
Co-authored by @cpoerschke
(cherry picked from commit 67eb86aeaa73dcb57a18681847925c1f5de7df87)
---
.../modules/configuration-guide/pages/configuring-solr-xml.adoc | 2 +-
.../modules/configuration-guide/pages/index-segments-merging.adoc | 2 +-
.../modules/deployment-guide/pages/indexupgrader-tool.adoc | 2 +-
solr/solr-ref-guide/modules/deployment-guide/pages/jvm-settings.adoc | 2 +-
.../modules/deployment-guide/pages/taking-solr-to-production.adoc | 2 +-
.../modules/getting-started/pages/about-this-guide.adoc | 2 +-
solr/solr-ref-guide/modules/getting-started/pages/tutorial-aws.adoc | 2 +-
.../modules/indexing-guide/pages/currencies-exchange-rates.adoc | 2 +-
.../modules/indexing-guide/pages/indexing-with-tika.adoc | 2 +-
.../modules/indexing-guide/pages/luke-request-handler.adoc | 2 +-
.../modules/indexing-guide/pages/phonetic-matching.adoc | 2 +-
.../solr-ref-guide/modules/query-guide/pages/dismax-query-parser.adoc | 2 +-
solr/solr-ref-guide/modules/query-guide/pages/loading.adoc | 2 +-
solr/solr-ref-guide/modules/query-guide/pages/machine-learning.adoc | 2 +-
.../modules/query-guide/pages/standard-query-parser.adoc | 4 ++--
.../modules/query-guide/pages/stream-evaluator-reference.adoc | 2 +-
solr/solr-ref-guide/modules/query-guide/pages/term-vectors.adoc | 4 ++--
.../modules/upgrade-notes/pages/solr-upgrade-notes.adoc | 2 +-
18 files changed, 20 insertions(+), 20 deletions(-)
diff --git a/solr/solr-ref-guide/modules/configuration-guide/pages/configuring-solr-xml.adoc b/solr/solr-ref-guide/modules/configuration-guide/pages/configuring-solr-xml.adoc
index 9f1955082da..9744a7956ae 100644
--- a/solr/solr-ref-guide/modules/configuration-guide/pages/configuring-solr-xml.adoc
+++ b/solr/solr-ref-guide/modules/configuration-guide/pages/configuring-solr-xml.adoc
@@ -260,7 +260,7 @@ The directory under which configsets for Solr cores can be found.
Sets the maximum number of (nested) clauses allowed in any query.
+
This global limit provides a safety constraint on the total number of clauses
allowed in any query against any collection -- regardless of whether those
clauses were explicitly specified in a query string, or were the result of
query expansion/re-writing from a more complex type of query based on the terms
in the index.
-This limit is enforced at multiple points in Lucene, both to prevent
primitive query objects (mainly `BooleanQuery`) from being constructed with an
excessive number of clauses in a way that may exhaust the JVM heap, but also to
ensure that no composite query (made up of multiple primitive queries) can be
executed with an excessive _total_ number of nested clauses in a way that may
cause a search thread to use excessive CPU.
+This limit is enforced at multiple points in Lucene, both to prevent primitive
query objects (mainly `BooleanQuery`) from being constructed with an excessive
number of clauses in a way that may exhaust the JVM heap, but also to ensure
that no composite query (made up of multiple primitive queries) can be executed
with an excessive _total_ number of nested clauses in a way that may cause a
search thread to use excessive CPU.
+
In default configurations this property uses the value of the
`solr.max.booleanClauses` system property if specified.
This is the same system property used in the `_default` configset for the
xref:caches-warming.adoc#maxbooleanclauses-element[`<maxBooleanClauses>`
element of `solrconfig.xml`] making it easy for Solr administrators to increase
both values (in all collections) without needing to search through and update
all of their configs.
diff --git a/solr/solr-ref-guide/modules/configuration-guide/pages/index-segments-merging.adoc b/solr/solr-ref-guide/modules/configuration-guide/pages/index-segments-merging.adoc
index 9bf7cb4d74b..766e61cbaf7 100644
--- a/solr/solr-ref-guide/modules/configuration-guide/pages/index-segments-merging.adoc
+++ b/solr/solr-ref-guide/modules/configuration-guide/pages/index-segments-merging.adoc
@@ -250,7 +250,7 @@ This is not required for near real-time search, but will reduce search latency o
== Compound File Segments
Each Lucene segment is typically comprised of a dozen or so files.
-Lucene can be configured to bundle all of the files for a segment into a single compound file using a file extension of `.cfs`, for "Compound File Segment".
+Solr can be configured to bundle all of the files for a Lucene segment into a single compound file using a file extension of `.cfs`, for "Compound File Segment".
CFS segments may incur a minor performance hit for various reasons, depending
on the runtime environment.
For example, filesystem buffers are typically associated with open file
descriptors, which may limit the total cache space available to each index.
diff --git a/solr/solr-ref-guide/modules/deployment-guide/pages/indexupgrader-tool.adoc b/solr/solr-ref-guide/modules/deployment-guide/pages/indexupgrader-tool.adoc
index 3edffb1d023..e6aa84ff1da 100644
--- a/solr/solr-ref-guide/modules/deployment-guide/pages/indexupgrader-tool.adoc
+++ b/solr/solr-ref-guide/modules/deployment-guide/pages/indexupgrader-tool.adoc
@@ -36,7 +36,7 @@ You will need to include the `lucene-core-<version>.jar` and `lucene-backwards-c
[source,bash,subs="attributes"]
----
-java -cp lucene-core-{solr-full-version}.jar:lucene-backward-codecs-{solr-full-version}.jar org.apache.lucene.index.IndexUpgrader [-delete-prior-commits] [-verbose] /path/to/index
+java -cp lucene-core-{dep-version-lucene}.jar:lucene-backward-codecs-{dep-version-lucene}.jar org.apache.lucene.index.IndexUpgrader [-delete-prior-commits] [-verbose] /path/to/index
----
This tool keeps only the last commit in an index.
diff --git a/solr/solr-ref-guide/modules/deployment-guide/pages/jvm-settings.adoc b/solr/solr-ref-guide/modules/deployment-guide/pages/jvm-settings.adoc
index f2a73a418d2..d1217258a63 100644
--- a/solr/solr-ref-guide/modules/deployment-guide/pages/jvm-settings.adoc
+++ b/solr/solr-ref-guide/modules/deployment-guide/pages/jvm-settings.adoc
@@ -39,7 +39,7 @@ There are several points to keep in mind:
* Running Solr with too little "headroom" allocated for the heap can cause
excessive resources to be consumed by continual GC.
Thus the 25-50% recommendation above.
-* Lucene/Solr makes extensive use of MMapDirectory, which uses RAM _not_ reserved for the JVM for most of the Lucene index.
+* Solr makes extensive use of MMapDirectory, which uses RAM _not_ reserved for the JVM for most of the Lucene index.
Therefore, as much memory as possible should be left for the operating system
to use for this purpose.
* The heap allocated should be as small as possible while maintaining good
performance.
8-16Gb is quite common, and larger heaps are sometimes used.
diff --git a/solr/solr-ref-guide/modules/deployment-guide/pages/taking-solr-to-production.adoc b/solr/solr-ref-guide/modules/deployment-guide/pages/taking-solr-to-production.adoc
index 34f807423d2..1af02f47f70 100644
--- a/solr/solr-ref-guide/modules/deployment-guide/pages/taking-solr-to-production.adoc
+++ b/solr/solr-ref-guide/modules/deployment-guide/pages/taking-solr-to-production.adoc
@@ -386,7 +386,7 @@ Errors such as "too many open files", "connection error", and "max processes exc
=== Avoid Swapping (*nix Operating Systems)
-When running a Java application like Lucene/Solr, having the OS swap memory to disk is a very bad situation.
+When running a Java application like Solr, having the OS swap memory to disk is a very bad situation.
We usually prefer a hard crash so other healthy Solr nodes can take over,
instead of letting a Solr node swap, causing terrible performance, timeouts and
an unstable system.
So our recommendation is to disable swap on the host altogether or reduce the
"swappiness".
These instructions are valid for Linux environments.
diff --git a/solr/solr-ref-guide/modules/getting-started/pages/about-this-guide.adoc b/solr/solr-ref-guide/modules/getting-started/pages/about-this-guide.adoc
index 7ea258ce3cb..f60b414c0af 100644
--- a/solr/solr-ref-guide/modules/getting-started/pages/about-this-guide.adoc
+++ b/solr/solr-ref-guide/modules/getting-started/pages/about-this-guide.adoc
@@ -25,7 +25,7 @@ It is structured to address a broad spectrum of needs, ranging from new develope
It will be of use at any point in the application life cycle, for whenever you
need authoritative information about Solr.
The material as presented assumes that you are familiar with some basic search
concepts and that you can read XML.
-It does not assume that you are a Java programmer, although knowledge of Java is helpful when working directly with Lucene or when developing custom extensions to a Lucene/Solr installation.
+It does not assume that you are a Java programmer, although knowledge of Java is helpful when working directly with Lucene or when developing custom extensions to a Solr installation.
== Hosts and Port Examples
diff --git a/solr/solr-ref-guide/modules/getting-started/pages/tutorial-aws.adoc b/solr/solr-ref-guide/modules/getting-started/pages/tutorial-aws.adoc
index 6439c5f56fe..4a7d7bdee4a 100644
--- a/solr/solr-ref-guide/modules/getting-started/pages/tutorial-aws.adoc
+++ b/solr/solr-ref-guide/modules/getting-started/pages/tutorial-aws.adoc
@@ -140,7 +140,7 @@ $ java -version
+
[source,bash,subs="verbatim,attributes+"]
# download desired version of Solr
-$ wget http://archive.apache.org/dist/lucene/solr/{solr-full-version}/solr-{solr-full-version}.tgz
+$ wget http://archive.apache.org/dist/solr/solr/{solr-full-version}/solr-{solr-full-version}.tgz
# untar
$ tar -zxvf solr-{solr-full-version}.tgz
# set SOLR_HOME
diff --git a/solr/solr-ref-guide/modules/indexing-guide/pages/currencies-exchange-rates.adoc b/solr/solr-ref-guide/modules/indexing-guide/pages/currencies-exchange-rates.adoc
index 7d089ac9415..3853ce305cc 100644
--- a/solr/solr-ref-guide/modules/indexing-guide/pages/currencies-exchange-rates.adoc
+++ b/solr/solr-ref-guide/modules/indexing-guide/pages/currencies-exchange-rates.adoc
@@ -16,7 +16,7 @@
// specific language governing permissions and limitations
// under the License.
-The `currency` FieldType provides support for monetary values to Solr/Lucene with query-time currency conversion and exchange rates.
+The `currency` FieldType provides support for monetary values to Solr with query-time currency conversion and exchange rates.
The following features are supported:
* Point queries
diff --git a/solr/solr-ref-guide/modules/indexing-guide/pages/indexing-with-tika.adoc b/solr/solr-ref-guide/modules/indexing-guide/pages/indexing-with-tika.adoc
index 4c55ae950e5..73a38af2c96 100644
--- a/solr/solr-ref-guide/modules/indexing-guide/pages/indexing-with-tika.adoc
+++ b/solr/solr-ref-guide/modules/indexing-guide/pages/indexing-with-tika.adoc
@@ -46,7 +46,7 @@ Solr Cell supplies some metadata of its own too.
You can configure which elements should be included/ignored, and which should
map to another field.
* Solr Cell maps each piece of metadata onto a field.
By default it maps to the same name but several parameters control how this is
done.
-* When Solr Cell finishes creating the internal `SolrInputDocument`, the rest of the Lucene/Solr indexing stack takes over.
+* When Solr Cell finishes creating the internal `SolrInputDocument`, the rest of the indexing stack takes over.
The next step after any update handler is the
xref:configuration-guide:update-request-processors.adoc[Update Request
Processor] chain.
Solr Cell is a module, which means it's not automatically included with Solr
but must be configured.
diff --git a/solr/solr-ref-guide/modules/indexing-guide/pages/luke-request-handler.adoc b/solr/solr-ref-guide/modules/indexing-guide/pages/luke-request-handler.adoc
index bb0987d17dd..56ec50eebc8 100644
--- a/solr/solr-ref-guide/modules/indexing-guide/pages/luke-request-handler.adoc
+++ b/solr/solr-ref-guide/modules/indexing-guide/pages/luke-request-handler.adoc
@@ -17,7 +17,7 @@
// under the License.
The Luke Request Handler offers programmatic access to the information
provided on the xref:schema-browser-screen.adoc[] page of the Admin UI.
-It is modeled after Luke, the Lucene Index Browser by Andrzej Bialecki.
+It is modeled after https://github.com/apache/lucene/tree/releases/lucene/{dep-version-lucene}/lucene/luke[Luke], the Lucene Index Browser.
It is an implicit handler, so you don't need to define it in `solrconfig.xml`.
The Luke Request Handler accepts the following parameters:
diff --git a/solr/solr-ref-guide/modules/indexing-guide/pages/phonetic-matching.adoc b/solr/solr-ref-guide/modules/indexing-guide/pages/phonetic-matching.adoc
index 2f6fd1365b9..c68b7a9e2ac 100644
--- a/solr/solr-ref-guide/modules/indexing-guide/pages/phonetic-matching.adoc
+++ b/solr/solr-ref-guide/modules/indexing-guide/pages/phonetic-matching.adoc
@@ -25,7 +25,7 @@ For overviews of and comparisons between algorithms, see http://en.wikipedia.org
For examples of how to use this encoding in your analyzer, see
xref:filters.adoc#beider-morse-filter[Beider Morse Filter] in the Filter
Descriptions section.
Beider-Morse Phonetic Matching (BMPM) is a "soundalike" tool that lets you
search using a new phonetic matching system.
-BMPM helps you search for personal names (or just surnames) in a Solr/Lucene index, and is far superior to the existing phonetic codecs, such as regular soundex, metaphone, caverphone, etc.
+BMPM helps you search for personal names (or just surnames) in a Solr index, and is far superior to the existing phonetic codecs, such as regular soundex, metaphone, caverphone, etc.
In general, phonetic matching lets you search a name list for names that are
phonetically equivalent to the desired name.
BMPM is similar to a soundex search in that an exact spelling is not required.
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/dismax-query-parser.adoc b/solr/solr-ref-guide/modules/query-guide/pages/dismax-query-parser.adoc
index 52d209bafdf..c9d0db263f7 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/dismax-query-parser.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/dismax-query-parser.adoc
@@ -77,7 +77,7 @@ These boost factors make matches in `fieldOne` much more significant than matche
=== mm (Minimum Should Match) Parameter
-When processing queries, Lucene/Solr recognizes three types of clauses: mandatory, prohibited, and "optional" (also known as "should" clauses).
+When processing queries, there are three types of clauses: mandatory, prohibited, and "optional" (also known as "should" clauses).
By default, all words or phrases specified in the `q` parameter are treated as
"optional" clauses unless they are preceded by a "+" or a "-".
When dealing with these "optional" clauses, the `mm` parameter makes it
possible to say that a certain minimum number of those clauses must match.
The DisMax query parser offers great flexibility in how the minimum number can
be specified.
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/loading.adoc b/solr/solr-ref-guide/modules/query-guide/pages/loading.adoc
index 150f981ff2c..f2d03b10516 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/loading.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/loading.adoc
@@ -446,7 +446,7 @@ image::math-expressions/ifIsNull.png[]
=== Text Analysis
The `analyze` function can be used from inside a `select` function to analyze
-a text field with a Lucene/Solr analyzer.
+a text field with an available analyzer.
The output of `analyze` is a list of analyzed tokens which can be added to
each tuple as a multi-valued field.
The multi-valued field can then be sent to Solr for indexing or the
`cartesianProduct`
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/machine-learning.adoc b/solr/solr-ref-guide/modules/query-guide/pages/machine-learning.adoc
index 5f602ef04bc..55aaba56948 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/machine-learning.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/machine-learning.adoc
@@ -415,7 +415,7 @@ NOTE: The example below works with TF-IDF _term vectors_.
The section xref:term-vectors.adoc[] offers a full explanation of this
features.
In the example the `search` function returns documents where the `review_t`
field matches the phrase "star wars".
-The `select` function is run over the result set and applies the `analyze` function which uses the Lucene/Solr analyzer attached to the schema field `text_bigrams` to re-analyze the `review_t` field.
+The `select` function is run over the result set and applies the `analyze` function which uses the analyzer attached to the schema field `text_bigrams` to re-analyze the `review_t` field.
This analyzer returns bigrams which are then annotated to documents in a field
called `terms`.
The `termVectors` function then creates TD-IDF term vectors from the bigrams
stored in the `terms` field.
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/standard-query-parser.adoc b/solr/solr-ref-guide/modules/query-guide/pages/standard-query-parser.adoc
index fe9bb929bcd..06dd8112a11 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/standard-query-parser.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/standard-query-parser.adoc
@@ -220,7 +220,7 @@ However for float/double types that support `NaN` values, these two queries perf
=== Boosting a Term with "^"
-Lucene/Solr provides the relevance level of matching documents based on the terms found.
+Solr provides the relevance level of matching documents based on the terms found.
To boost a term use the caret symbol `^` with a boost factor (a number) at the
end of the term you are searching.
The higher the boost factor, the more relevant the term will be.
@@ -372,7 +372,7 @@ For example, to search for (1+1):2 without having Solr interpret the plus sign a
== Grouping Terms to Form Sub-Queries
-Lucene/Solr supports using parentheses to group clauses to form sub-queries.
+Solr supports using parentheses to group clauses to form sub-queries.
This can be very useful if you want to control the Boolean logic for a query.
The query below searches for either "jakarta" or "apache" and "website":
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/stream-evaluator-reference.adoc b/solr/solr-ref-guide/modules/query-guide/pages/stream-evaluator-reference.adoc
index 5e3accf0575..c6bb3f9a4a6 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/stream-evaluator-reference.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/stream-evaluator-reference.adoc
@@ -100,7 +100,7 @@ add(fieldA,if(gt(fieldA,fieldB),fieldA,fieldB)) // if fieldA > fieldB then field
== analyze
-The `analyze` function analyzes text using a Lucene/Solr analyzer and returns a list of tokens emitted by the analyzer.
+The `analyze` function analyzes text using an available analyzer and returns a list of tokens emitted by the analyzer.
The `analyze` function can be called on its own or within the
xref:stream-decorator-reference.adoc#select[`select`] and
xref:stream-decorator-reference.adoc#cartesianproduct[`cartesianProduct`]
streaming expressions.
=== analyze Parameters
diff --git a/solr/solr-ref-guide/modules/query-guide/pages/term-vectors.adoc b/solr/solr-ref-guide/modules/query-guide/pages/term-vectors.adoc
index 7ca5527fff0..5f297d050a8 100644
--- a/solr/solr-ref-guide/modules/query-guide/pages/term-vectors.adoc
+++ b/solr/solr-ref-guide/modules/query-guide/pages/term-vectors.adoc
@@ -119,7 +119,7 @@ The phrase query "Man on Fire" is searched for and the top 5000 results, by scor
A single field from the results is return which is the `review_t` field that
contains text of the movie review.
Then `cartesianProduct` function is run over the search results.
-The `cartesianProduct` function applies the `analyze` function, which takes the `review_t` field and analyzes it with the Lucene/Solr analyzer attached to the `text_bigrams` schema field.
+The `cartesianProduct` function applies the `analyze` function, which takes the `review_t` field and analyzes it with the analyzer attached to the `text_bigrams` schema field.
This analyzer emits the bigrams found in the text field.
The `cartesianProduct` function explodes each bigram into its own tuple with
the bigram stored in the field `term`.
@@ -132,7 +132,7 @@ Then Zeppelin-Solr is used to visualize the top 10 ten bigrams.
image::math-expressions/text-analytics.png[]
-Lucene/Solr analyzers can be configured in many different ways to support aggregations over NLP entities (people, places, companies, etc.) as well as tokens extracted with regular expressions or dictionaries.
+Analyzers can be configured in many different ways to support aggregations over NLP entities (people, places, companies, etc.) as well as tokens extracted with regular expressions or dictionaries.
== TF-IDF Term Vectors
diff --git a/solr/solr-ref-guide/modules/upgrade-notes/pages/solr-upgrade-notes.adoc b/solr/solr-ref-guide/modules/upgrade-notes/pages/solr-upgrade-notes.adoc
index ac9c8bcd4ff..617cd14a0e7 100644
--- a/solr/solr-ref-guide/modules/upgrade-notes/pages/solr-upgrade-notes.adoc
+++ b/solr/solr-ref-guide/modules/upgrade-notes/pages/solr-upgrade-notes.adoc
@@ -574,7 +574,7 @@ for an overview of the main new features of Solr 8.4.
When upgrading to 8.4.x users should be aware of the following major changes
from 8.3.
-*Reminder:* If you set the `postingsFormat` or `docValuesFormat` in the schema in order to use a non-default option, you risk preventing yourself from upgrading your Lucene/Solr software at future versions.
+*Reminder:* If you set the `postingsFormat` or `docValuesFormat` in the schema in order to use a non-default option, you risk preventing yourself from upgrading your Solr software at future versions, due to changed version of the Lucene library.
Multiple non-default postings formats changed in 8.4, thus rendering the index
data from a previous index.
This includes "FST50" which was recommended by the Solr TaggerHandler for
performance reasons.
There is now improved documentation to navigate this trade-off choice.