Hi Richard, Jeff, and OpenNLP Developers,

I hope you’re doing well.

I wanted to follow up on my previous message regarding the gRPC-based Python integration work. I completely understand that mentoring bandwidth is limited at the moment, and I appreciate the clarity you shared earlier.

That said, I remain genuinely interested in contributing to OpenNLP and plan to continue working on this integration independently outside of GSoC.

At this stage, I have a clean and working foundation:

- Python client communicating with the OpenNLP gRPC server
- End-to-end sentence detection working with proper model loading
- A structured branch with minimal examples and setup instructions

Branch for reference: https://github.com/JOBIN-SABU/opennlp-sandbox/tree/grpc

Before I expand further (e.g., POS tagging, NER, and SDK improvements), I would really value any brief guidance on:

- Whether the current structure is suitable for an initial PR
- Preferred placement for Python client/examples within the project
- Whether generated gRPC files should be committed or user-generated

Even a small pointer or confirmation would help me align better with project expectations. I’ll continue refining and extending the work in the meantime and will aim to contribute in a way that is useful to the community.

Thank you for your time, and I appreciate any feedback whenever convenient.

Best regards,
Jobin Sabu
[email protected]
https://github.com/JOBIN-SABU

On Wed, 1 Apr, 2026, 12:07 pm Jobin Sabu, <[email protected]> wrote:

> Hi Richard,
>
> I hope you're doing well.
>
> I’ve cleaned up my work and pushed the current state of the gRPC-based Python integration to a separate branch for review:
> https://github.com/JOBIN-SABU/opennlp-sandbox/tree/grpc
>
> This currently includes:
>
> - A working Python client using the existing proto definitions
> - End-to-end sentence detection via gRPC
> - A minimal example for testing the setup
> - A cleaned project structure (excluding target/, models/, and generated artifacts)
> - Updated README with step-by-step instructions
>
> At this stage, my focus has been to establish a clean and working foundation that demonstrates Python ↔ OpenNLP integration in a simple and reproducible way.
>
> My broader goal (as discussed earlier) is to extend this further by:
>
> - Adding additional services such as POS tagging and NER
> - Improving model handling and configuration
> - Developing a more complete Python SDK and documentation
>
> However, before expanding the scope, I wanted to first confirm that the current structure and direction align with project expectations.
>
> I would really appreciate your feedback on:
>
> - Whether this structure is suitable for an initial PR
> - If the placement of the Python example is appropriate
> - Whether generated Python gRPC files should be included or generated by users
>
> Based on your guidance, I will continue refining and expanding the implementation.
>
> Thank you again for your time and support.
>
> Best regards,
> Jobin Sabu
>
> On Sun, 29 Mar, 2026, 7:22 pm Jobin Sabu, <[email protected]> wrote:
>
>> Dear Richard, Jeff, and OpenNLP Developers,
>>
>> I hope you’re doing well.
>>
>> I wanted to share a quick update regarding the gRPC-based Python client integration. I’m happy to say that the setup is now working end-to-end: the server is running successfully, models are loading correctly, and I’m able to make RPC calls from the Python client with expected outputs.
>> I’ve also captured screenshots of the working setup for reference.
>>
>> This took longer than expected due to my academic commitments, but I’ve been consistently working on this for nearly a year now and have gained a solid understanding of the system.
>>
>> With GSoC 2026 approaching, I would love to continue this work and take it further, including improving the Python SDK, adding more services like NER and chunking, and refining documentation for broader adoption.
>>
>> I wanted to ask if anyone from the OpenNLP community might be available to mentor this effort for GSoC 2026. I’ll have significantly more availability this year and am fully committed to pushing this forward as a meaningful contribution to the project.
>>
>> Thank you again for your guidance and support throughout this journey. I’d really appreciate any feedback or direction.
>>
>> Best regards,
>> Jobin Sabu
>> [image: image.png]
>> [email protected]
>>
>> On Tue, 10 Mar 2026 at 00:58, Richard Zowalla <[email protected]> wrote:
>>
>>> Hi Jobin,
>>>
>>> Thanks for the detailed update, and apologies for the slow reply; as with most volunteer-driven projects, the day job occasionally takes priority!
>>>
>>> Regarding the model loading error: the server doesn't load raw .bin files directly from the filesystem. Instead, it expects a model JAR dropped into the location specified in the config. You can find pre-built model JARs for OpenNLP on Maven Central via the opennlp-models repository:
>>> https://github.com/apache/opennlp-models
>>>
>>> If you need to deploy a custom model, it needs to follow the packaging pattern shown in that same repo, so simply pointing to a .bin file won't work. The best reference for how to set this up correctly is the integration test in the sandbox repo, which shows the expected directory structure and configuration in a working example.
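
The packaging point above can be checked locally before starting the server. What follows is a minimal sketch using only the Python standard library; it is not part of OpenNLP. The function name `classify_model_files` is hypothetical, and the idea that a packaged model shows up as a `.bin` entry inside the JAR is an assumption that should be verified against the opennlp-models repository.

```python
import zipfile
from pathlib import Path


def classify_model_files(model_dir):
    """Split the files in model_dir into JAR candidates and raw .bin files.

    Hypothetical helper for illustration only; not part of OpenNLP.
    Returns (jars, raw_bins): jars maps each JAR file name to the .bin
    entries found inside it, raw_bins lists loose .bin file names.
    """
    jars = {}
    raw_bins = []
    for path in sorted(Path(model_dir).iterdir()):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        if suffix == ".jar" and zipfile.is_zipfile(path):
            # A JAR is a ZIP archive, so zipfile can list its entries.
            with zipfile.ZipFile(path) as jar:
                # Assumption: a packaged model appears as a .bin entry.
                jars[path.name] = [
                    name for name in jar.namelist() if name.endswith(".bin")
                ]
        elif suffix == ".bin":
            # Loose model file: per the reply above, the server does not
            # load these directly from the filesystem.
            raw_bins.append(path.name)
    return jars, raw_bins
```

Calling `classify_model_files` on the configured model directory and finding only entries in the second list would match the "Could not find the given model." symptom described later in this thread, although the script cannot confirm that the server actually reads that directory.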
>>>
>>> Regarding GSoC 2026: your proposal sounds well thought-out, and it's great to see the direction you have in mind (NER, Chunking, a PyPI SDK, and docs). However, I am currently unable to mentor due to time constraints in my day job.
>>>
>>> Best
>>> Richard
>>>
>>> > On 06.03.2026 at 05:41, Jobin Sabu <[email protected]> wrote:
>>> >
>>> > *Hi Richard and Jeff,*
>>> >
>>> > I hope you're both doing well.
>>> >
>>> > I would like to provide an update on the gRPC-based Python client integration. I have reached the stage where the client connects and RPC calls are being made, but I am consistently receiving a server-side error when attempting sentence detection.
>>> >
>>> > To assist with debugging, I have pushed the *entire raw state* of my environment to my repository:
>>> > https://github.com/JOBIN-SABU/opennlp-sandbox-experiments
>>> >
>>> > *Technical Context:*
>>> >
>>> > 1. *Working Directory:* tmp-opennlp-sandbox1/opennlp-sandbox/opennlp-grpc/target/
>>> > 2. *Server Command:* java -cp "opennlp-grpc-server-2.5.8-SNAPSHOT.jar:models/:" -Dopennlp.model.dir=. org.apache.opennlp.grpc.OpenNLPService
>>> > 3. *Model Location:* ./models/opennlp/tools/sentdetect/en-sent.bin
>>> > 4. *The Error:* grpc._channel._InactiveRpcError: status = StatusCode.INTERNAL, details = "Could not find the given model."
>>> >
>>> > I suspect the issue lies in how the gRPC wrapper handles resource loading, specifically whether it expects models on the *ClassPath* or supports *Relative/Absolute File System paths* via the config file. Since I’ve experimented with both flat and nested directory hierarchies without success, I would appreciate any insight into the "expected" pathing for the Sandbox server.
>>> >
>>> > *Regarding GSoC 2026:* As I have been contributing to this for nearly a year, my goal remains to establish a robust bridge between OpenNLP and the Python community. I would love to formally propose this as a project for the *GSoC 2026 cycle* to move these features from the sandbox into a production-ready state.
>>> >
>>> > Beyond fixing the current integration, my proposal includes:
>>> >
>>> > - *Expanding Services:* Implementing NER (Named Entity Recognition) and Chunking as gRPC services.
>>> > - *Pythonic SDK:* Developing a client library for distribution via PyPI/pip.
>>> > - *Documentation:* Creating comprehensive benchmarks and "Getting Started" guides.
>>> >
>>> > Given my deep involvement in the current implementation, *would either of you be interested in mentoring me for this project during the upcoming GSoC cycle?* I am eager to see this through to completion for the Apache OpenNLP community.
>>> >
>>> > Best regards,
>>> >
>>> > *Jobin Sabu*
>>> >
>>> > On Wed, 25 Feb 2026 at 10:45, Jobin Sabu <[email protected]> wrote:
>>> >
>>> >> Hi Richard and Jeff,
>>> >>
>>> >> I hope you're both doing well.
>>> >>
>>> >> I would like to provide a brief report on the gRPC-based Python client integration. I have reached a stage where the client connects and RPC calls are being made, but I always receive a server-side error when attempting sentence detection.
>>> >>
>>> >> I have pushed the latest state of my work, including the models/ folder and config.properties, to my repository:
>>> >> https://github.com/JOBIN-SABU/opennlp-sandbox-experiments
>>> >>
>>> >> *Technical Details:*
>>> >>
>>> >> *Server Command:*
>>> >>
>>> >> java -jar opennlp-grpc-server-2.5.8-SNAPSHOT.jar -c config.properties -p 7071
>>> >>
>>> >> *config.properties:*
>>> >>
>>> >> sentenceModel=en-sent.bin
>>> >>
>>> >> *The Error:*
>>> >>
>>> >> grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with: status = StatusCode.INTERNAL, details = "Could not find the given model.">
>>> >>
>>> >> Since OpenNLP is a library, I believe the problem lies in how the gRPC wrapper loads resources, i.e., whether it expects the model on the ClassPath or can resolve File System paths from the configuration file. Should I be using an absolute path, or does the server expect a particular directory hierarchy for external models?
>>> >>
>>> >> *Regarding GSoC 2026:*
>>> >>
>>> >> Since I have been working on this for almost a year now, my goal remains to establish a bridge between OpenNLP and the Python community. While I am committed to this regardless of GSoC, I would love to formally propose this as a project for the 2026 cycle to move it from the sandbox into a production-ready feature.
>>> >>
>>> >> Beyond repairing the existing integration, I plan to:
>>> >>
>>> >> - Implement *NER (Named Entity Recognition)* and *Chunking* as gRPC services.
>>> >> - Write a Pythonic client SDK that will be distributed through *PyPI/pip*.
>>> >> - Develop detailed documentation and performance benchmarks.
>>> >>
>>> >> Since I am already deep into the implementation, would either of you consider mentoring me for this project during the upcoming GSoC cycle? I’m eager to see this through to completion for the community.
>>> >>
>>> >> Best regards,
>>> >>
>>> >> *Jobin Sabu*
>>> >>
>>> >> https://www.linkedin.com/in/jobin-sabu-0b18bb2b8/
>>> >>
>>> >> https://github.com/JOBIN-SABU
