This is an automated email from the ASF dual-hosted git repository.

mawiesne pushed a commit to branch opennlp-2.x
in repository https://gitbox.apache.org/repos/asf/opennlp.git
commit 94113f5bdca5719558432ba81558930349aae54c
Author: Martin Wiesner <[email protected]>
AuthorDate: Mon Dec 29 14:09:55 2025 +0100

    OPENNLP-1791: Document the use of ClassPathModelProvider in dev manual (#919)

    * OPENNLP-1791: Document the use of ClassPathModelProvider in dev manual
    - adds note on JVM argument since J17 and beyond
    - improves formatting of all "Note:" text fragments to better highlight helpful context to the reader

    (cherry picked from commit 7b362c843d6925af76fd3d65f5927c0687b9d454)
---
 opennlp-docs/src/docbkx/doccat.xml        |  2 +-
 opennlp-docs/src/docbkx/langdetect.xml    |  4 ++--
 opennlp-docs/src/docbkx/model-loading.xml | 35 ++++++++++++++++++++++++++++++-
 opennlp-docs/src/docbkx/namefinder.xml    |  2 +-
 opennlp-docs/src/docbkx/parser.xml        |  2 +-
 5 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/opennlp-docs/src/docbkx/doccat.xml b/opennlp-docs/src/docbkx/doccat.xml
index 33ac04a2..5c486dd6 100644
--- a/opennlp-docs/src/docbkx/doccat.xml
+++ b/opennlp-docs/src/docbkx/doccat.xml
@@ -126,7 +126,7 @@ GMDecrease Major acquisitions that have a lower gross margin than the existing n
 GMIncrease The upward movement of gross margin resulted from amounts pursuant to adjustments \
 to obligations towards dealers .]]>
     </screen>
-    Note: The line breaks marked with a backslash are just inserted for formatting purposes and must not be
+    <emphasis role="strong">Note</emphasis>: The line breaks marked with a backslash are just inserted for formatting purposes and must not be
     included in the training data.
   </para>
   <section id="tools.doccat.training.tool">
diff --git a/opennlp-docs/src/docbkx/langdetect.xml b/opennlp-docs/src/docbkx/langdetect.xml
index 7e901714..0e510c59 100644
--- a/opennlp-docs/src/docbkx/langdetect.xml
+++ b/opennlp-docs/src/docbkx/langdetect.xml
@@ -141,7 +141,7 @@ lav Egija Tri-Active procedūru īpaši iesaka izmantot siltākajos gadalaik
 nedēļu laika Ekonomikas ministrijai, Finanšu ministrijai un Labklājības ministrijai, lai ar vienotu \
 pozīciju atgrieztos pie jautājuma izskatīšanas.]]>
     </screen>
-    Note: The line breaks marked with a backslash are just inserted for formatting purposes and must not be
+    <emphasis role="strong">Note</emphasis>: The line breaks marked with a backslash are just inserted for formatting purposes and must not be
     included in the training data.
   </para>
   <section id="tools.langdetect.training.tool">
@@ -153,7 +153,7 @@ lav Egija Tri-Active procedūru īpaši iesaka izmantot siltākajos gadalaik
 $ bin/opennlp LanguageDetectorTrainer[.leipzig] -model modelFile [-params paramsFile] \
 [-factory factoryName] -data sampleData [-encoding charsetName]]]>
     </screen>
-    Note: To customize the language detector, extend the class opennlp.tools.langdetect.LanguageDetectorFactory
+    <emphasis role="strong">Note</emphasis>: To customize the language detector, extend the class opennlp.tools.langdetect.LanguageDetectorFactory
     add it to the classpath and pass it in the -factory argument.
   </para>
   </section>
diff --git a/opennlp-docs/src/docbkx/model-loading.xml b/opennlp-docs/src/docbkx/model-loading.xml
index aec161cd..e901abf7 100644
--- a/opennlp-docs/src/docbkx/model-loading.xml
+++ b/opennlp-docs/src/docbkx/model-loading.xml
@@ -96,6 +96,39 @@ for(ClassPathModelEntry entry : models) {
     </programlisting>
   </para>

+  <para>
+    Moreover, certain OpenNLP models can be obtained via a
+    <emphasis>ClassPathModelProvider</emphasis>, such as OpenNLP's
+    built-in <emphasis>DefaultClassPathModelProvider</emphasis> class.
+    It allows direct use of models available under a certain locale, given
+    that those are present in the classpath and can be loaded.
+
+    <programlisting language="java">
+      <![CDATA[
+final ClassPathModelProvider provider = new DefaultClassPathModelProvider(finder, loader);
+// Here: SentenceModel, other model types accordingly
+final SentenceModel sm = provider.load("en", opennlp.tools.models.ModelType.SENTENCE_DETECTOR, SentenceModel.class);
+if(sm != null) {
+  // do something with the (sentence) model
+}]]>
+    </programlisting>
+
+    In the above example, the finder and loader objects can be created or re-used as shown in the
+    previous code example.
+  </para>
+  <para>
+    <emphasis role="strong">Note</emphasis>:
+    When running on Java 17+, the JVM argument
+
+    <screen>--add-opens java.base/jdk.internal.loader=ALL-UNNAMED</screen>
+
+    may be required. Without this parameter, OpenNLP uses the JVM bootstrap classpath to locate models
+    rather than the UCP class loader.
+
+    For more advanced or non-standard class loading scenarios, using ClassGraph and implementing a
+    custom provider may cover additional cases beyond the default UCP class loader or
+    JVM bootstrap class path behavior.
+  </para>
 </section>


@@ -106,7 +139,7 @@ for(ClassPathModelEntry entry : models) {
     we recommend that you have a look at our setup in the
     <ulink url="https://github.com/apache/opennlp-models">OpenNLP Models repository</ulink>.
     We recommend to bundle one model per JAR file.
-    Make sure you add a <emphasis>model.properties</emphasis> file with the following content
+    Make sure you add a <emphasis>model.properties</emphasis> file with the following content:

     <programlisting language="java">
       <![CDATA[
diff --git a/opennlp-docs/src/docbkx/namefinder.xml b/opennlp-docs/src/docbkx/namefinder.xml
index cdf77d3f..284a4a31 100644
--- a/opennlp-docs/src/docbkx/namefinder.xml
+++ b/opennlp-docs/src/docbkx/namefinder.xml
@@ -506,7 +506,7 @@ Precision: 0.8005071889818507
 Recall: 0.7450581122145297
 F-Measure: 0.7717879983140168]]>
     </screen>
-    Note: The command line interface does not support cross evaluation in the current version.
+    <emphasis role="strong">Note</emphasis>: The command line interface does not support cross evaluation in the current version.
   </para>
   </section>
   <section id="tools.namefind.eval.api">
diff --git a/opennlp-docs/src/docbkx/parser.xml b/opennlp-docs/src/docbkx/parser.xml
index 2dc1ecd6..468a3ea8 100644
--- a/opennlp-docs/src/docbkx/parser.xml
+++ b/opennlp-docs/src/docbkx/parser.xml
@@ -200,7 +200,7 @@ $ opennlp ParserTrainer -model en-parser-chunking.bin -parserType CHUNKING \
     tool replaces the tagger model inside the parser model with a new one.
   </para>
   <para>
-    Note: The original parser model will be overwritten with the new parser model which
+    <emphasis role="strong">Note</emphasis>: The original parser model will be overwritten with the new parser model which
     contains the replaced tagger model.
     <screen>
     <