(opennlp) branch main updated: OPENNLP-1791: Document the use of ClassPathModelProvider in dev manual (#919)

mawiesne Mon, 29 Dec 2025 05:10:09 -0800

This is an automated email from the ASF dual-hosted git repository.

mawiesne pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/opennlp.git



The following commit(s) were added to refs/heads/main by this push:
     new 7b362c84 OPENNLP-1791: Document the use of ClassPathModelProvider in dev manual (#919)
7b362c84 is described below

commit 7b362c843d6925af76fd3d65f5927c0687b9d454
Author: Martin Wiesner <[email protected]>
AuthorDate: Mon Dec 29 14:09:55 2025 +0100

    OPENNLP-1791: Document the use of ClassPathModelProvider in dev manual (#919)
    
    * OPENNLP-1791: Document the use of ClassPathModelProvider in dev manual
    - adds note on JVM argument since J17 and beyond
    - improves formatting of all "Note:" text fragments to better highlight helpful context to the reader
---
 opennlp-docs/src/docbkx/doccat.xml        |  2 +-
 opennlp-docs/src/docbkx/langdetect.xml    |  4 ++--
 opennlp-docs/src/docbkx/model-loading.xml | 35 ++++++++++++++++++++++++++++++-
 opennlp-docs/src/docbkx/namefinder.xml    |  2 +-
 opennlp-docs/src/docbkx/parser.xml        |  2 +-
 5 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/opennlp-docs/src/docbkx/doccat.xml b/opennlp-docs/src/docbkx/doccat.xml
index 33ac04a2..5c486dd6 100644
--- a/opennlp-docs/src/docbkx/doccat.xml
+++ b/opennlp-docs/src/docbkx/doccat.xml
@@ -126,7 +126,7 @@ GMDecrease Major acquisitions that have a lower gross margin than the existing n
 GMIncrease The upward movement of gross margin resulted from amounts pursuant to adjustments \
            to obligations towards dealers .]]>
                        </screen>
-                       Note: The line breaks marked with a backslash are just inserted for formatting purposes and must not be
+                       <emphasis role="strong">Note</emphasis>: The line breaks marked with a backslash are just inserted for formatting purposes and must not be
                        included in the training data.
                </para>
                <section id="tools.doccat.training.tool">
diff --git a/opennlp-docs/src/docbkx/langdetect.xml b/opennlp-docs/src/docbkx/langdetect.xml
index 7e901714..0e510c59 100644
--- a/opennlp-docs/src/docbkx/langdetect.xml
+++ b/opennlp-docs/src/docbkx/langdetect.xml
@@ -141,7 +141,7 @@ lav     Egija Tri-Active procedūru īpaši iesaka izmantot siltākajos gadalaik
                nedēļu laika Ekonomikas ministrijai, Finanšu ministrijai un Labklājības ministrijai, lai ar vienotu \
                pozīciju atgrieztos pie jautājuma izskatīšanas.]]>
                        </screen>
-                       Note: The line breaks marked with a backslash are just inserted for formatting purposes and must not be
+                       <emphasis role="strong">Note</emphasis>: The line breaks marked with a backslash are just inserted for formatting purposes and must not be
                        included in the training data.
                </para>
                <section id="tools.langdetect.training.tool">
@@ -153,7 +153,7 @@ lav     Egija Tri-Active procedūru īpaši iesaka izmantot siltākajos gadalaik
 $ bin/opennlp LanguageDetectorTrainer[.leipzig] -model modelFile [-params paramsFile] \
   [-factory factoryName] -data sampleData [-encoding charsetName]]]>
                                </screen>
-                               Note: To customize the language detector, extend the class opennlp.tools.langdetect.LanguageDetectorFactory
+                               <emphasis role="strong">Note</emphasis>: To customize the language detector, extend the class opennlp.tools.langdetect.LanguageDetectorFactory
                                add it to the classpath and pass it in the -factory argument.
                        </para>
                </section>
diff --git a/opennlp-docs/src/docbkx/model-loading.xml b/opennlp-docs/src/docbkx/model-loading.xml
index aec161cd..e901abf7 100644
--- a/opennlp-docs/src/docbkx/model-loading.xml
+++ b/opennlp-docs/src/docbkx/model-loading.xml
@@ -96,6 +96,39 @@ for(ClassPathModelEntry entry : models) {
             </programlisting>
 
         </para>
+        <para>
+          Moreover, certain OpenNLP models can be obtained via a
+          <emphasis>ClassPathModelProvider</emphasis>, such as OpenNLP's
+          built-in <emphasis>DefaultClassPathModelProvider</emphasis> class.
+          It allows direct use of models available under a certain locale, given
+          that those are present in the classpath and can be loaded.
+
+          <programlisting language="java">
+            <![CDATA[
+final ClassPathModelProvider provider = new DefaultClassPathModelProvider(finder, loader);
+// Here: SentenceModel, other model types accordingly
+final SentenceModel sm = provider.load("en", opennlp.tools.models.ModelType.SENTENCE_DETECTOR, SentenceModel.class);
+if(sm != null) {
+  // do something with the (sentence) model
+}]]>
+          </programlisting>
+
+          In the above example, the finder and loader objects can be created or re-used as shown in the
+          previous code example.
+        </para>
+        <para>
+          <emphasis role="strong">Note</emphasis>:
+          When running on Java 17+, the JVM argument
+          
+          <screen>--add-opens java.base/jdk.internal.loader=ALL-UNNAMED</screen>
+
+          may be required. Without this parameter, OpenNLP uses the JVM bootstrap classpath to locate models
+          rather than the UCP class loader.
+
+          For more advanced or non-standard class loading scenarios, using ClassGraph and implementing a
+          custom provider may cover additional cases beyond the default UCP class loader or
+          JVM bootstrap class path behavior.
+        </para>
        </section>
 
 
@@ -106,7 +139,7 @@ for(ClassPathModelEntry entry : models) {
             we recommend that you have a look at our setup in the <ulink url="https://github.com/apache/opennlp-models">OpenNLP Models
             repository</ulink>. We recommend to bundle one model per JAR file.
 
-            Make sure you add a <emphasis>model.properties</emphasis> file with the following content
+            Make sure you add a <emphasis>model.properties</emphasis> file with the following content:
 
        <programlisting language="java">
                 <![CDATA[
diff --git a/opennlp-docs/src/docbkx/namefinder.xml b/opennlp-docs/src/docbkx/namefinder.xml
index cdf77d3f..284a4a31 100644
--- a/opennlp-docs/src/docbkx/namefinder.xml
+++ b/opennlp-docs/src/docbkx/namefinder.xml
@@ -506,7 +506,7 @@ Precision: 0.8005071889818507
 Recall: 0.7450581122145297
 F-Measure: 0.7717879983140168]]>
          </screen>
-                        Note: The command line interface does not support cross evaluation in the current version.
+                        <emphasis role="strong">Note</emphasis>: The command line interface does not support cross evaluation in the current version.
                </para>
                </section>
                <section id="tools.namefind.eval.api">
diff --git a/opennlp-docs/src/docbkx/parser.xml b/opennlp-docs/src/docbkx/parser.xml
index 2dc1ecd6..468a3ea8 100644
--- a/opennlp-docs/src/docbkx/parser.xml
+++ b/opennlp-docs/src/docbkx/parser.xml
@@ -200,7 +200,7 @@ $ opennlp ParserTrainer -model en-parser-chunking.bin -parserType CHUNKING \
                tool replaces the tagger model inside the parser model with a new one. 
                </para>
                <para>
-               Note: The original parser model will be overwritten with the new parser model which
+               <emphasis role="strong">Note</emphasis>: The original parser model will be overwritten with the new parser model which
                contains the replaced tagger model.
         <screen>
                <![CDATA[
