Re: [PR] OPENNLP-1539 - Introduce parameter for POSTaggerME to configure output POS tag format (opennlp)

2024-05-28 Thread via GitHub
mawiesne merged PR #601: URL: https://github.com/apache/opennlp/pull/601 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@opennlp.apache.

Re: [DISCUSS] Version Scheme for OpenNLP Models

2024-05-28 Thread Jeff Zemerick
I favor option (a) because we likely won't release models as frequently but we will have to keep track of what's compatible with what. There is a manifest file inside the model files and it contains the version number of OpenNLP that trained the model. It's used to check if the version of OpenNLP

Re: [PR] OPENNLP-1563 Fix tokenization of words containing non-spacing letters. (opennlp)

2024-05-28 Thread via GitHub
jzonthemtn commented on PR #602: URL: https://github.com/apache/opennlp/pull/602#issuecomment-2135699051 Thanks @demq!

Re: [PR] OPENNLP-1563 Fix tokenization of words containing non-spacing letters. (opennlp)

2024-05-28 Thread via GitHub
mawiesne commented on PR #602: URL: https://github.com/apache/opennlp/pull/602#issuecomment-2134979949 Thx @demq!

Re: [PR] OPENNLP-1563 Fix tokenization of words containing non-spacing letters. (opennlp)

2024-05-28 Thread via GitHub
rzo1 merged PR #602: URL: https://github.com/apache/opennlp/pull/602

Re: [DISCUSS] Version Scheme for OpenNLP Models

2024-05-28 Thread Atita Arora
Hi, Thanks for initiating this discussion regarding the future version scheme for our OpenNLP Maven module distribution. Personally, I lean towards option (a) as it establishes a fresh starting point for our Maven module distribution. However, I'm open to hearing others' thoughts and considerations

Re: [PR] OPENNLP-1563 Fix tokenization of words containing non-spacing letters. (opennlp)

2024-05-28 Thread via GitHub
demq commented on code in PR #602: URL: https://github.com/apache/opennlp/pull/602#discussion_r1616698074 ## opennlp-tools/src/test/java/opennlp/tools/tokenize/SimpleTokenizerTest.java: ## @@ -128,4 +128,18 @@ void testTokenizationOfStringWithWindowsNewLineTokens() { Assert