[ https://issues.apache.org/jira/browse/OPENNLP-1665?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17906117#comment-17906117 ]
ASF GitHub Bot commented on OPENNLP-1665:
-----------------------------------------
mawiesne commented on code in PR #197:
URL: https://github.com/apache/opennlp-sandbox/pull/197#discussion_r1887057418
##########
opennlp-grpc/README.md:
##########
@@ -0,0 +1,96 @@
+# OpenNLP gRPC - Proof of Concept
+
+This project demonstrates a proof of concept for creating a backend powered by Apache OpenNLP using gRPC. It comprises
+three main modules:
+
+- **opennlp-grpc-api**
+- **opennlp-grpc-service**
+- **examples**
+
+## Modules Overview
+
+1. **opennlp-grpc-api**:
+ - Contains the gRPC schema for OpenNLP services.
+ - Includes generated Java stubs.
+ - Provides a README with instructions on generating code stubs for various languages and auto-generated
+ documentation.
+
+2. **opennlp-grpc-service**:
+ - Features a server implementation.
+ - Offers an initial service implementation for POS tagging.
+
+3. **examples**:
+ - Provides a sample implementation for interacting with the OpenNLP server backend via gRPC in Python.
+
+## Getting Started
+
+Follow these steps to set up and run the OpenNLP gRPC proof of concept project:
+
+### Prerequisites
+Before you begin, ensure you have the following installed on your system:
+
+- Java Development Kit (JDK) 17 or later
+- Apache Maven (for building Java components)
+- Docker for running the gRPC tools if modifications to the .proto files are needed
+
+You can build the project by running
+
+```
+mvn clean install
+```
+
+### Running the gRPC Backend
+
+Start the Server: Use the following command to run the server with default settings:
+
+```bash
+java -jar target/opennlp-grpc-server-2.5.2-SNAPSHOT.jar
+```
+
+Configure Server Options:
Review Comment:
server -> lower case "s"
options -> lower case "o"
##########
opennlp-grpc/README.md:
##########
@@ -0,0 +1,96 @@
+Configure Server Options:
+
+The server supports several command-line options for customization:
+
+```bash
+-p or --port: Port on which the server will listen (default: 7071).
+-h or --hostname: Hostname to report (default: localhost).
Review Comment:
I don't understand "report" here. Should that be "to bind to"?
Second question: if IPv4 (or IPv6) is supported, a hint for callers would be helpful here.
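On the IPv4/IPv6 question, a caller can check which address family a `-h` value belongs to before passing it to the server. This is only a minimal sketch, assuming the value is an IP literal, for which `InetAddress.getByName` validates the format without a DNS lookup. The class and method names are made up for illustration and are not part of the PR:

```java
import java.net.Inet4Address;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

public class AddressFamilyCheck {

  // Hypothetical helper: classifies a host value as "IPv4", "IPv6" or "unknown".
  static String family(String host) {
    try {
      final InetAddress address = InetAddress.getByName(host);
      if (address instanceof Inet4Address) {
        return "IPv4";
      }
      if (address instanceof Inet6Address) {
        return "IPv6";
      }
      return "unknown";
    } catch (UnknownHostException e) {
      // Thrown for malformed literals or host names that cannot be resolved.
      return "unknown";
    }
  }

  public static void main(String[] args) {
    System.out.println(family("127.0.0.1"));
    System.out.println(family("::1"));
  }
}
```

Whether the server itself accepts both families depends on how it binds, which the README would still need to state.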
##########
opennlp-grpc/README.md:
##########
@@ -0,0 +1,96 @@
+```bash
+-p or --port: Port on which the server will listen (default: 7071).
+-h or --hostname: Hostname to report (default: localhost).
+-c or --config: Path to a configuration file.
+```
+
+Example with custom options:
+
+```bash
+java -jar target/opennlp-grpc-server-1.0-SNAPSHOT.jar -p 8080 -h 127.0.0.1 -c ./server-config.ini
+```
+
+Sample Configuration File:
+
+If using a configuration file, it should be in the format:
+
+```bash
+# Set to true to enable gRPC server reflection.
+server.enable_reflection = false
+
+# This is the folder to be used to search for models
+model.location=extlib
+# Set this to true to recursively search for models inside the model.location folder.
+model.recursive=true
+# A wildcard to search for models in the model.location folder.
+model.pos.wildcard=opennlp-models-pos-*.jar
+```
+
+#### Models
+
+To ensure the server automatically loads models, they must be placed in the `extlib` (or in the location configured via `model.location`) directory.
+
+## Building a Custom Client in another Programming Language
+
+Details can be found in the README of the opennlp-grpc-api module.
Review Comment:
Is it possible to use a link here? (for the README of the other module)
##########
opennlp-grpc/README.md:
##########
@@ -0,0 +1,96 @@
+```bash
+# Set to true to enable gRPC server reflection.
+server.enable_reflection = false
+
+# This is the folder to be used to search for models
+model.location=extlib
+# Set this to true to recursively search for models inside the model.location folder.
+model.recursive=true
+# A wildcard to search for models in the model.location folder.
Review Comment:
add "pattern" after "wildcard" here?
##########
opennlp-grpc/opennlp-grpc-api/opennlp.proto:
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+syntax = "proto3";
+
+option java_package = "opennlp";
+option java_outer_classname = "OpenNLPService";
+package opennlp;
+
+service PosTaggerService {
+ // Assigns the sentence of tokens pos tags.
Review Comment:
"pos" should be capitalized here: "POS" tags, as written below (in another
comment)
##########
opennlp-grpc/opennlp-grpc-service/src/main/java/opennlp/service/PosTaggerService.java:
##########
@@ -0,0 +1,201 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package opennlp.service;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.nio.file.Path;
+import java.util.Arrays;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+
+import com.google.rpc.Code;
+import com.google.rpc.Status;
+import io.grpc.protobuf.StatusProto;
+import io.grpc.stub.StreamObserver;
+import org.slf4j.LoggerFactory;
+
+import opennlp.OpenNLPService;
+import opennlp.PosTaggerServiceGrpc;
+import opennlp.service.classpath.DirectoryModelFinder;
+import opennlp.service.exception.ServiceException;
+import opennlp.tools.commons.ThreadSafe;
+import opennlp.tools.models.ClassPathModel;
+import opennlp.tools.models.ClassPathModelEntry;
+import opennlp.tools.models.ClassPathModelLoader;
+import opennlp.tools.postag.POSModel;
+import opennlp.tools.postag.POSTagFormat;
+import opennlp.tools.postag.POSTagger;
+import opennlp.tools.postag.ThreadSafePOSTaggerME;
+
+/**
+ * The {@code PosTaggerService} class implements a gRPC service for Part-of-Speech (POS) tagging
+ * using Apache OpenNLP models. It extends the auto-generated gRPC base class
+ * {@link PosTaggerServiceGrpc.PosTaggerServiceImplBase}.
+ *
+ * <p>This service provides functionality for:
+ * <ul>
+ * <li>Retrieving available POS models loaded from the classpath.</li>
+ * <li>Performing POS tagging on input sentences.</li>
+ * <li>Performing POS tagging with additional context.</li>
+ * </ul>
+ * </p>
+ *
+ * <p><b>Configuration:</b>
+ * <ul>
+ * <li>{@code model.location}: Directory to search for models (default: "extlib").</li>
+ * <li>{@code model.recursive}: Whether to scan subdirectories (default: {@code true}).</li>
+ * <li>{@code model.pos.wildcard}: Wildcard pattern to identify POS models (default: "opennlp-models-pos-*.jar").</li>
+ * </ul>
+ * </p>
+ */
+@ThreadSafe
+public class PosTaggerService extends PosTaggerServiceGrpc.PosTaggerServiceImplBase {
+
+ private static final org.slf4j.Logger logger =
+ LoggerFactory.getLogger(PosTaggerService.class);
+
+ private static final Map<String, ClassPathModel> MODEL_CACHE = new ConcurrentHashMap<>();
+ private static final Map<String, POSTagger> TAGGER_CACHE = new ConcurrentHashMap<>();
+
+ public PosTaggerService(Map<String, String> conf) {
+
+ try {
+ initializeModelCache(conf);
+ } catch (IOException e) {
+ logger.error(e.getLocalizedMessage(), e);
+ throw new RuntimeException(e);
+ }
+
+ }
+
+ public static void clearCaches() {
+ synchronized (TAGGER_CACHE) {
+ for (POSTagger t : TAGGER_CACHE.values()) {
+ if (t instanceof AutoCloseable a) {
+ try {
+ a.close();
+ } catch (Exception ignored) {
+
+ }
+ }
+ }
+ TAGGER_CACHE.clear();
+ MODEL_CACHE.clear();
+ }
+ }
+
+ private void initializeModelCache(Map<String, String> conf) throws IOException {
+ final String modelDir = conf.getOrDefault("model.location", "extlib");
+ final boolean recursive = Boolean.parseBoolean(conf.getOrDefault("model.recursive", "true"));
+ final String wildcardPattern = conf.getOrDefault("model.pos.wildcard", "opennlp-models-pos-*.jar");
+
+ final DirectoryModelFinder finder = new DirectoryModelFinder(wildcardPattern, Path.of(modelDir), recursive);
+ final ClassPathModelLoader loader = new ClassPathModelLoader();
+
+ final Set<ClassPathModelEntry> models = finder.findModels(false);
+ for (ClassPathModelEntry entry : models) {
+ final ClassPathModel model = loader.load(entry);
+ if (model != null) {
+ MODEL_CACHE.putIfAbsent(model.getModelSHA256(), model);
+ }
+ }
+
+ }
+
+ @Override
+ public void getAvailableModels(opennlp.OpenNLPService.Empty request,
+ io.grpc.stub.StreamObserver<opennlp.OpenNLPService.AvailableModels> responseObserver) {
+
+ try {
+ final OpenNLPService.AvailableModels.Builder response = OpenNLPService.AvailableModels.newBuilder();
+ for (ClassPathModel model : MODEL_CACHE.values()) {
+ final OpenNLPService.Model m = OpenNLPService.Model.newBuilder()
+ .setHash(model.getModelSHA256())
+ .setName(model.getModelName())
+ .setLocale(model.getModelLanguage())
+ .build();
+
+ response.addModels(m);
+
+ }
+
+ responseObserver.onNext(response.build());
+ responseObserver.onCompleted();
+ } catch (Exception e) {
+ handleException(e, responseObserver);
+ }
+ }
+
+ @Override
+ public void tag(opennlp.OpenNLPService.TagRequest request,
+ io.grpc.stub.StreamObserver<opennlp.OpenNLPService.PosTags> responseObserver) {
+ try {
+ final POSTagger tagger = getTagger(request.getModelHash(), request.getFormat());
+ final String[] tags = tagger.tag(request.getSentenceList().toArray(new String[0]));
+ responseObserver.onNext(OpenNLPService.PosTags.newBuilder().addAllTags(Arrays.asList(tags)).build());
+ responseObserver.onCompleted();
+ } catch (Exception e) {
+ handleException(e, responseObserver);
+ }
+
+ }
+
+ @Override
+ public void tagWithContext(opennlp.OpenNLPService.TagWithContextRequest request,
+ io.grpc.stub.StreamObserver<opennlp.OpenNLPService.PosTags> responseObserver) {
+
+ try {
+ final POSTagger tagger = getTagger(request.getModelHash(), request.getFormat());
+ final String[] tags = tagger.tag(request.getSentenceList().toArray(new String[0]), request.getAdditionalContextList().toArray(new String[0]));
Review Comment:
Line is too long; please reformat it to improve readability.
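One possible way to break up the line is to pull the two array conversions into local variables. The sketch below uses a stand-in `Tagger` interface and plain lists so that it compiles without OpenNLP or the generated gRPC classes. Only the shape of the two-argument `tag(...)` call mirrors the code under review; the class name and the dummy tagger are hypothetical:

```java
import java.util.Arrays;
import java.util.List;

public class TagWithContextSketch {

  // Stand-in for the two-argument tag method of opennlp.tools.postag.POSTagger.
  interface Tagger {
    String[] tag(String[] sentence, Object[] additionalContext);
  }

  // Converts both lists first, so the tag(...) call itself stays short.
  static List<String> tagWithContext(Tagger tagger, List<String> sentence, List<String> context) {
    final String[] tokens = sentence.toArray(new String[0]);
    final String[] additionalContext = context.toArray(new String[0]);
    final String[] tags = tagger.tag(tokens, additionalContext);
    return Arrays.asList(tags);
  }

  public static void main(String[] args) {
    // Dummy tagger that labels every token "X"; a real POSTagger would go here.
    final Tagger dummy = (tokens, ctx) -> {
      final String[] out = new String[tokens.length];
      Arrays.fill(out, "X");
      return out;
    };
    System.out.println(tagWithContext(dummy, List.of("Hello", "world"), List.of()));
  }
}
```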
##########
opennlp-grpc/opennlp-grpc-service/src/main/java/opennlp/service/PosTaggerService.java:
##########
@@ -0,0 +1,201 @@
+ @Override
+ public void tagWithContext(opennlp.OpenNLPService.TagWithContextRequest request,
+ io.grpc.stub.StreamObserver<opennlp.OpenNLPService.PosTags> responseObserver) {
+
+ try {
+ final POSTagger tagger = getTagger(request.getModelHash(), request.getFormat());
+ final String[] tags = tagger.tag(request.getSentenceList().toArray(new String[0]), request.getAdditionalContextList().toArray(new String[0]));
+ responseObserver.onNext(OpenNLPService.PosTags.newBuilder().addAllTags(Arrays.asList(tags)).build());
+ responseObserver.onCompleted();
+ } catch (Exception e) {
+ handleException(e, responseObserver);
+ }
+ }
+
+ private void handleException(Exception e, StreamObserver<?> responseObserver) {
+ final Status status = Status.newBuilder()
+ .setCode(Code.INTERNAL.getNumber())
+ .setMessage(e.getLocalizedMessage())
+ .build();
+ responseObserver.onError(StatusProto.toStatusRuntimeException(status));
+ }
+
+ private POSTagger getTagger(String hash, OpenNLPService.POSTagFormat posTagFormat) {
+ final POSTagFormat format = (posTagFormat == null) ? POSTagFormat.UD : POSTagFormat.valueOf(posTagFormat.name());
+
+ return TAGGER_CACHE.computeIfAbsent((hash + "-" + format), modelHash -> {
+ final ClassPathModel model = MODEL_CACHE.get(modelHash);
+
+ if (model == null) {
+ throw new ServiceException("Could not find the given model.");
+ }
+
+ try {
+ return new ThreadSafePOSTaggerME(new POSModel(new ByteArrayInputStream(model.model())), format);
Review Comment:
Should additionally be wrapped in a `BufferedInputStream` to speed up
reading of larger models.
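A minimal sketch of the suggested wrapping, using a placeholder byte array in place of `model.model()` so that it runs without a real model file. In the service, the wrapped stream would be the one passed to the `POSModel` constructor; the class and method names here are hypothetical:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class BufferedModelStreamSketch {

  // Wraps the raw model bytes in a buffered stream, as suggested in the review.
  static InputStream open(byte[] modelBytes) {
    return new BufferedInputStream(new ByteArrayInputStream(modelBytes));
  }

  // Reads the wrapped stream to the end and returns the number of bytes seen.
  static int countBytes(byte[] modelBytes) {
    try (InputStream in = open(modelBytes)) {
      return in.readAllBytes().length;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    final byte[] placeholder = {1, 2, 3}; // stands in for model.model()
    System.out.println(countBytes(placeholder));
  }
}
```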
##########
opennlp-grpc/pom.xml:
##########
@@ -0,0 +1,50 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xmlns="http://maven.apache.org/POM/4.0.0"
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+ <modelVersion>4.0.0</modelVersion>
+
+ <parent>
+ <groupId>org.apache.opennlp</groupId>
+ <artifactId>opennlp-sandbox</artifactId>
+ <version>2.5.2-SNAPSHOT</version>
+ </parent>
+
+ <artifactId>opennlp-grpc</artifactId>
+ <name>Apache OpenNLP gRPC Server</name>
+ <packaging>pom</packaging>
+
+ <modules>
+ <module>opennlp-grpc-api</module>
+ <module>opennlp-grpc-service</module>
+ </modules>
+
+ <properties>
+ <grpc.version>1.69.0</grpc.version>
+ <opennlp.version>2.5.1</opennlp.version>
Review Comment:
Shouldn't that property already be defined via the parent sandbox pom.xml file?
The same goes for junit, slf4j, and log4j2. Why don't we reuse these, given that this pom.xml
file points to a parent that already defines them?
##########
opennlp-grpc/opennlp-grpc-service/src/test/resources/models/opennlp-models-pos-en-1.2.0.jar:
##########
Review Comment:
Can we avoid adding (larger) binary model resources to the
/src/test/resources/models folder? This is a liability for the future and means
extra cost in transfer (scm checkouts) and disk space requirements.
##########
opennlp-grpc/README.md:
##########
@@ -0,0 +1,96 @@
+Configure Server Options:
+
+The server supports several command-line options for customization:
+
+```bash
+-p or --port: Port on which the server will listen (default: 7071).
Review Comment:
should read:
"will listen on"
> gRPC Backend for Multi-Language Support
> ---------------------------------------
>
> Key: OPENNLP-1665
> URL: https://issues.apache.org/jira/browse/OPENNLP-1665
> Project: OpenNLP
> Issue Type: New Feature
> Reporter: Richard Zowalla
> Assignee: Richard Zowalla
> Priority: Major
>
> Taken from the Slack discussion. The goal is to broaden the application of
> OpenNLP and its usage in different programming languages.
> To do so, we can build a gRPC-based component as a module in OpenNLP
> itself and implement the server-side component for the most common tasks (this
> can grow over time if the community likes it) in Java using
> [gRPC|https://grpc.io/]. With the gRPC spec for OpenNLP in place, we can
> simply generate appropriate clients for every supported language (Python, Go,
> etc.). That would make it easier to maintain, since we can just tell
> people how to generate the client based on the gRPC spec.