Hi Richard, Jeff, and OpenNLP Developers,
I hope you’re doing well.
I wanted to follow up on my previous message regarding the gRPC-based
Python integration work. I completely understand that mentoring bandwidth
is limited at the moment, and I appreciate the clarity you shared earlier.
That said, I remain genuinely interested in contributing to OpenNLP and
plan to continue working on this integration independently outside of GSoC.
At this stage, I have a clean and working foundation:
  - Python client communicating with the OpenNLP gRPC server
  - End-to-end sentence detection working with proper model loading
  - A structured branch with minimal examples and setup instructions
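To make the model-loading point concrete: per Richard's earlier reply (quoted further down), the server expects model JARs rather than raw .bin files. Below is a minimal sketch of a sanity check for a model directory. The helper name `classify_model_files` is hypothetical and not part of the sandbox code, and the check only looks at file extensions, not at JAR contents.

```python
from pathlib import Path


def classify_model_files(model_dir):
    """Group files under model_dir into model JARs and raw .bin files.

    Hypothetical helper for illustration only: it inspects file
    extensions and does not validate what is inside a JAR.
    """
    root = Path(model_dir)
    jars = sorted(p.name for p in root.rglob("*.jar"))
    raw_bins = sorted(p.name for p in root.rglob("*.bin"))
    return {"jars": jars, "raw_bins": raw_bins}


def needs_repackaging(report):
    """True when only raw .bin files were found and no model JAR is present."""
    return bool(report["raw_bins"]) and not report["jars"]
```

If `needs_repackaging(classify_model_files("models"))` returns True for the configured directory, that would be consistent with the "Could not find the given model." error seen earlier in this thread.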
Branch for reference:
https://github.com/JOBIN-SABU/opennlp-sandbox/tree/grpc
Before I expand further (e.g., POS tagging, NER, and SDK improvements), I
would really value any brief guidance on:
  - Whether the current structure is suitable for an initial PR
  - Preferred placement for Python client/examples within the project
  - Whether generated gRPC files should be committed or user-generated
Even a small pointer or confirmation would help me align better with
project expectations.
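On the generated-files question, the user-generated route could look roughly like the sketch below. It assumes the `grpcio-tools` package is installed (it provides the `grpc_tools.protoc` module). The helper name and the directory and file names in the usage note are placeholders, not the actual paths in my branch.

```python
import sys
from pathlib import Path


def build_protoc_command(proto_dir, proto_file, out_dir):
    """Return the grpc_tools.protoc invocation as an argument list.

    The command is only built here, not executed; the arguments are
    placeholders supplied by the caller.
    """
    return [
        sys.executable, "-m", "grpc_tools.protoc",
        f"-I{proto_dir}",                   # where to look for .proto files
        f"--python_out={out_dir}",          # message classes (*_pb2.py)
        f"--grpc_python_out={out_dir}",     # client/server stubs (*_pb2_grpc.py)
        str(Path(proto_dir) / proto_file),
    ]
```

A user would then run something like `subprocess.run(build_protoc_command("proto", "example.proto", "generated"), check=True)` once after cloning, which keeps generated code out of version control.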
I’ll continue refining and extending the work in the meantime and will aim
to contribute in a way that is useful to the community.
Thank you for your time, and I appreciate any feedback whenever convenient.
Best regards,
Jobin Sabu
[email protected]
https://github.com/JOBIN-SABU

On Wed, 1 Apr, 2026, 12:07 pm Jobin Sabu, <[email protected]> wrote:

> Hi Richard,
> I hope you're doing well.
> I’ve cleaned up my work and pushed the current state of the gRPC-based
> Python integration to a separate branch for review:
> https://github.com/JOBIN-SABU/opennlp-sandbox/tree/grpc
> This currently includes:
>   - A working Python client using the existing proto definitions
>   - End-to-end sentence detection via gRPC
>   - A minimal example for testing the setup
>   - A cleaned project structure (excluding target/, models/, and generated
>     artifacts)
>   - Updated README with step-by-step instructions
> At this stage, my focus has been to establish a clean and working
> foundation that demonstrates Python ↔ OpenNLP integration in a simple and
> reproducible way.
> My broader goal (as discussed earlier) is to extend this further by:
>   - Adding additional services such as POS tagging and NER
>   - Improving model handling and configuration
>   - Developing a more complete Python SDK and documentation
> However, before expanding the scope, I wanted to first confirm that the
> current structure and direction align with project expectations.
> I would really appreciate your feedback on:
>   - Whether this structure is suitable for an initial PR
>   - Whether the placement of the Python example is appropriate
>   - Whether generated Python gRPC files should be included or generated by
>     users
> Based on your guidance, I will continue refining and expanding the
> implementation.
> Thank you again for your time and support.
> Best regards,
> Jobin Sabu
>
> On Sun, 29 Mar, 2026, 7:22 pm Jobin Sabu, <[email protected]> wrote:
>
>> Dear Richard, Jeff, and OpenNLP Developers,
>>
>> I hope you’re doing well.
>>
>> I wanted to share a quick update regarding the gRPC-based Python client
>> integration. I’m happy to say that the setup is now working end-to-end —
>> the server is running successfully, models are loading correctly, and I’m
>> able to make RPC calls from the Python client with expected outputs. I’ve
>> also captured screenshots of the working setup for reference.
>>
>> This took longer than expected due to my academic commitments, but I’ve
>> been consistently working on this for nearly a year now and have gained a
>> solid understanding of the system.
>>
>> With GSoC 2026 approaching, I would love to continue this work and take
>> it further — including improving the Python SDK, adding more services like
>> NER and chunking, and refining documentation for broader adoption.
>>
>> I wanted to ask if anyone from the OpenNLP community might be available
>> to mentor this effort for GSoC 2026. I’ll have significantly more
>> availability this year and am fully committed to pushing this forward as a
>> meaningful contribution to the project.
>>
>> Thank you again for your guidance and support throughout this journey.
>> I’d really appreciate any feedback or direction.
>>
>> Best regards,
>> Jobin Sabu
>> [image: image.png]
>> [email protected]
>>
>> On Tue, 10 Mar 2026 at 00:58, Richard Zowalla <[email protected]> wrote:
>>
>>> Hi Jobin,
>>>
>>> Thanks for the detailed update, and apologies for the slow reply. As
>>> with most volunteer-driven projects, the day job occasionally takes
>>> priority!
>>>
>>> Regarding the model loading error: the server doesn't load raw .bin
>>> files directly from the filesystem. Instead, it expects a model JAR dropped
>>> into the location specified in the config. You can find pre-built model
>>> JARs for OpenNLP on Maven Central via the opennlp-models repository:
>>> https://github.com/apache/opennlp-models
>>>
>>> If you need to deploy a custom model, it needs to follow the packaging
>>> pattern shown in that same repo, so simply pointing to a .bin file won't
>>> work. The best reference for how to set this up correctly is the
>>> integration test in the sandbox repo, which shows the expected directory
>>> structure and configuration in a working example.
>>>
>>> Regarding GSoC 2026: your proposal sounds well thought-out, and it's
>>> great to see the direction you have in mind (NER, Chunking, a PyPI SDK, and
>>> docs). However, I am currently unable to mentor due to time constraints in
>>> my day job.
>>>
>>> Best
>>> Richard
>>>
>>> > On 06.03.2026 at 05:41, Jobin Sabu <[email protected]> wrote:
>>> >
>>> > *Hi Richard and Jeff,*
>>> >
>>> > I hope you're both doing well.
>>> >
>>> > I would like to provide an update on the gRPC-based Python client
>>> > integration. I have reached the stage where the client connects and RPC
>>> > calls are being made, but I am consistently receiving a server-side
>>> > error when attempting sentence detection.
>>> >
>>> > To assist with debugging, I have pushed the *entire raw state* of my
>>> > environment to my repository:
>>> > https://github.com/JOBIN-SABU/opennlp-sandbox-experiments
>>> >
>>> > *Technical Context:*
>>> >
>>> >   1. *Working Directory:*
>>> >      tmp-opennlp-sandbox1/opennlp-sandbox/opennlp-grpc/target/
>>> >   2. *Server Command:* java -cp
>>> >      "opennlp-grpc-server-2.5.8-SNAPSHOT.jar:models/:"
>>> >      -Dopennlp.model.dir=. org.apache.opennlp.grpc.OpenNLPService
>>> >   3. *Model Location:* ./models/opennlp/tools/sentdetect/en-sent.bin
>>> >   4. *The Error:* grpc._channel._InactiveRpcError: status =
>>> >      StatusCode.INTERNAL, details = "Could not find the given model."
>>> >
>>> > I suspect the issue lies in how the gRPC wrapper handles resource
>>> > loading, specifically whether it expects models on the *ClassPath* or
>>> > supports *Relative/Absolute File System paths* via the config file.
>>> > Since I’ve experimented with both flat and nested directory hierarchies
>>> > without success, I would appreciate any insight into the "expected"
>>> > pathing for the Sandbox server.
>>> >
>>> > *Regarding GSoC 2026:* As I have been contributing to this for nearly a
>>> > year, my goal remains to establish a robust bridge between OpenNLP and
>>> > the Python community. I would love to formally propose this as a project
>>> > for the *GSoC 2026 cycle* to move these features from the sandbox into a
>>> > production-ready state.
>>> >
>>> > Beyond fixing the current integration, my proposal includes:
>>> >
>>> >   - *Expanding Services:* Implementing NER (Named Entity Recognition) and
>>> >     Chunking as gRPC services.
>>> >   - *Pythonic SDK:* Developing a client library for distribution via
>>> >     PyPI/pip.
>>> >   - *Documentation:* Creating comprehensive benchmarks and "Getting
>>> >     Started" guides.
>>> >
>>> > Given my deep involvement in the current implementation, *would either
>>> > of you be interested in mentoring me for this project during the upcoming
>>> > GSoC cycle?* I am eager to see this through to completion for the Apache
>>> > OpenNLP community.
>>> >
>>> > Best regards,
>>> >
>>> > *Jobin Sabu*
>>> >
>>> >
>>> > On Wed, 25 Feb 2026 at 10:45, Jobin Sabu <[email protected]> wrote:
>>> >
>>> >> Hi Richard and Jeff,
>>> >>
>>> >> I hope you're both doing well.
>>> >>
>>> >> I would like to provide a brief report on the gRPC-based Python client
>>> >> integration. I have reached a stage where the client connects and RPC
>>> >> calls are being made, but I always receive a server-side error when
>>> >> attempting sentence detection.
>>> >>
>>> >> I have pushed the latest version of my work, including the models/
>>> >> folder and the config.properties, to my repository:
>>> >> https://github.com/JOBIN-SABU/opennlp-sandbox-experiments
>>> >>
>>> >> *Technical Details:*
>>> >>
>>> >> *Server Command:*
>>> >>
>>> >> java -jar opennlp-grpc-server-2.5.8-SNAPSHOT.jar -c config.properties -p 7071
>>> >>
>>> >> *config.properties:*
>>> >>
>>> >> sentenceModel=en-sent.bin
>>> >>
>>> >> *The Error:*
>>> >>
>>> >> grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated
>>> >> with: status = StatusCode.INTERNAL, details = "Could not find the given
>>> >> model.">
>>> >>
>>> >> As OpenNLP is a library, I believe the problem lies in how the gRPC
>>> >> wrapper loads resources, i.e., whether it expects the model on the
>>> >> ClassPath or can resolve File System paths from the configuration file.
>>> >> Should I be passing an absolute path, or does the server expect external
>>> >> models to follow a particular directory hierarchy?
>>> >>
>>> >> *Regarding GSoC 2026:*
>>> >>
>>> >> Since I have been working on this for almost a year now, my goal remains
>>> >> to establish a bridge between OpenNLP and the Python community. While I am
>>> >> committed to this regardless of GSoC, I would love to formally propose this
>>> >> as a project for the 2026 cycle to move it from the sandbox into a
>>> >> production-ready feature.
>>> >>
>>> >> Beyond repairing the existing integration, I plan to:
>>> >>
>>> >>   - Implement *NER (Named Entity Recognition)* and *Chunking* as gRPC
>>> >>     services.
>>> >>   - Write a Pythonic client SDK that will be distributed through
>>> >>     *PyPI/pip*.
>>> >>   - Develop detailed documentation and performance benchmarks.
>>> >>
>>> >> Since I am already deep into the implementation, would either of you
>>> >> consider mentoring me for this project during the upcoming GSoC cycle?
>>> >> I’m eager to see this through to completion for the community.
>>> >>
>>> >> Best regards,
>>> >>
>>> >> *Jobin Sabu*
>>> >>
>>> >> https://www.linkedin.com/in/jobin-sabu-0b18bb2b8/
>>> >>
>>> >> https://github.com/JOBIN-SABU
>>> >>
>>>
>>>
