This is an automated email from the ASF dual-hosted git repository.
mawiesne pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/opennlp.git
The following commit(s) were added to refs/heads/main by this push:
new 7b362c84 OPENNLP-1791: Document the use of ClassPathModelProvider in dev manual (#919)
7b362c84 is described below
commit 7b362c843d6925af76fd3d65f5927c0687b9d454
Author: Martin Wiesner
AuthorDate: Mon Dec 29 14:09:55 2025 +0100
OPENNLP-1791: Document the use of ClassPathModelProvider in dev manual (#919)
* OPENNLP-1791: Document the use of ClassPathModelProvider in dev manual
- adds note on the JVM argument that may be required on Java 17 and later
- improves formatting of all "Note:" text fragments to better highlight helpful context to the reader
---
opennlp-docs/src/docbkx/doccat.xml | 2 +-
opennlp-docs/src/docbkx/langdetect.xml | 4 ++--
opennlp-docs/src/docbkx/model-loading.xml | 35 ++++++++++++++++++++++++++++++-
opennlp-docs/src/docbkx/namefinder.xml | 2 +-
opennlp-docs/src/docbkx/parser.xml | 2 +-
5 files changed, 39 insertions(+), 6 deletions(-)
diff --git a/opennlp-docs/src/docbkx/doccat.xml b/opennlp-docs/src/docbkx/doccat.xml
index 33ac04a2..5c486dd6 100644
--- a/opennlp-docs/src/docbkx/doccat.xml
+++ b/opennlp-docs/src/docbkx/doccat.xml
@@ -126,7 +126,7 @@ GMDecrease Major acquisitions that have a lower gross margin than the existing n
GMIncrease The upward movement of gross margin resulted from amounts pursuant to adjustments \
to obligations towards dealers .]]>
</screen>
- Note: The line breaks marked with a backslash are just inserted for formatting purposes and must not be
+ <emphasis role="strong">Note</emphasis>: The line breaks marked with a backslash are just inserted for formatting purposes and must not be
included in the training data.
</para>
<section id="tools.doccat.training.tool">
diff --git a/opennlp-docs/src/docbkx/langdetect.xml b/opennlp-docs/src/docbkx/langdetect.xml
index 7e901714..0e510c59 100644
--- a/opennlp-docs/src/docbkx/langdetect.xml
+++ b/opennlp-docs/src/docbkx/langdetect.xml
@@ -141,7 +141,7 @@ lav Egija Tri-Active procedūru īpaši iesaka izmantot siltākajos gadalaik
nedēļu laika Ekonomikas ministrijai, Finanšu ministrijai un Labklājības ministrijai, lai ar vienotu \
pozīciju atgrieztos pie jautājuma izskatīšanas.]]>
</screen>
- Note: The line breaks marked with a backslash are just inserted for formatting purposes and must not be
+ <emphasis role="strong">Note</emphasis>: The line breaks marked with a backslash are just inserted for formatting purposes and must not be
included in the training data.
</para>
<section id="tools.langdetect.training.tool">
@@ -153,7 +153,7 @@ lav Egija Tri-Active procedūru īpaši iesaka izmantot siltākajos gadalaik
$ bin/opennlp LanguageDetectorTrainer[.leipzig] -model modelFile [-params paramsFile] \
[-factory factoryName] -data sampleData [-encoding charsetName]]]>
</screen>
- Note: To customize the language detector, extend the class opennlp.tools.langdetect.LanguageDetectorFactory
+ <emphasis role="strong">Note</emphasis>: To customize the language detector, extend the class opennlp.tools.langdetect.LanguageDetectorFactory
add it to the classpath and pass it in the -factory argument.
</para>
</section>
diff --git a/opennlp-docs/src/docbkx/model-loading.xml b/opennlp-docs/src/docbkx/model-loading.xml
index aec161cd..e901abf7 100644
--- a/opennlp-docs/src/docbkx/model-loading.xml
+++ b/opennlp-docs/src/docbkx/model-loading.xml
@@ -96,6 +96,39 @@ for(ClassPathModelEntry entry : models) {
</programlisting>
</para>
+ <para>
+ Moreover, certain OpenNLP models can be obtained via a
+ <emphasis>ClassPathModelProvider</emphasis>, such as OpenNLP's
+ built-in <emphasis>DefaultClassPathModelProvider</emphasis> class.
+ It allows direct use of models available under a certain locale, given
+ that those are present in the classpath and can be loaded.
+
+ <programlisting language="java">
+ <![CDATA[
+final ClassPathModelProvider provider = new DefaultClassPathModelProvider(finder, loader);
+// Here: SentenceModel, other model types accordingly
+final SentenceModel sm = provider.load("en", opennlp.tools.models.ModelType.SENTENCE_DETECTOR, SentenceModel.class);
+if(sm != null) {
+ // do something with the (sentence) model
+}]]>
+ </programlisting>
+
+ In the above example, the finder and loader objects can be created or re-used as shown in the
+ previous code example.
+ </para>
+ <para>
+ <emphasis role="strong">Note</emphasis>:
+ When running on Java 17+, the JVM argument
+
+ <screen>--add-opens java.base/jdk.internal.loader=ALL-UNNAMED</screen>
+
+ may be required. Without this parameter, OpenNLP uses the JVM bootstrap classpath to locate models
+ rather than the UCP class loader.
+
+ For more advanced or non-standard class loading scenarios, using ClassGraph and implementing a
+ custom provider may cover additional cases beyond the default UCP class loader or
+ JVM bootstrap class path behavior.
+ </para>
</section>
@@ -106,7 +139,7 @@ for(ClassPathModelEntry entry : models) {
we recommend that you have a look at our setup in the <ulink
url="https://github.com/apache/opennlp-models">OpenNLP Models
repository</ulink>. We recommend to bundle one model per JAR file.
- Make sure you add a <emphasis>model.properties</emphasis> file with the following content
+ Make sure you add a <emphasis>model.properties</emphasis> file with the following content:
<programlisting language="java">
<![CDATA[
diff --git a/opennlp-docs/src/docbkx/namefinder.xml b/opennlp-docs/src/docbkx/namefinder.xml
index cdf77d3f..284a4a31 100644
--- a/opennlp-docs/src/docbkx/namefinder.xml
+++ b/opennlp-docs/src/docbkx/namefinder.xml
@@ -506,7 +506,7 @@ Precision: 0.8005071889818507
Recall: 0.7450581122145297
F-Measure: 0.7717879983140168]]>
</screen>
- Note: The command line interface does not support cross evaluation in the current version.
+ <emphasis role="strong">Note</emphasis>: The command line interface does not support cross evaluation in the current version.
</para>
</section>
<section id="tools.namefind.eval.api">
diff --git a/opennlp-docs/src/docbkx/parser.xml b/opennlp-docs/src/docbkx/parser.xml
index 2dc1ecd6..468a3ea8 100644
--- a/opennlp-docs/src/docbkx/parser.xml
+++ b/opennlp-docs/src/docbkx/parser.xml
@@ -200,7 +200,7 @@ $ opennlp ParserTrainer -model en-parser-chunking.bin -parserType CHUNKING \
tool replaces the tagger model inside the parser model with a new one.
</para>
<para>
- Note: The original parser model will be overwritten with the new parser model which
+ <emphasis role="strong">Note</emphasis>: The original parser model will be overwritten with the new parser model which
contains the replaced tagger model.
<screen>
<![CDATA[