Thanks for the PR! I just merged it. I'm glad this will be in the 2.1
release, which should go out for a vote next week.
Thanks,
Jeff
On Tue, Oct 25, 2022 at 2:31 AM Richard Zowalla wrote:
> Hi,
>
> here is a PR by my colleague Martin W.:
> https://github.com/apache/opennlp/pull/427
>
> Some more d
jzonthemtn merged PR #427:
URL: https://github.com/apache/opennlp/pull/427
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscr...@opennlp.apach
mawiesne commented on PR #427:
URL: https://github.com/apache/opennlp/pull/427#issuecomment-1294606427
FYI and for completeness: I tested the patch against the German treebank
resources available from: https://universaldependencies.org/#language.
- The resulting model files could be
kinow commented on PR #427:
URL: https://github.com/apache/opennlp/pull/427#issuecomment-1294547409
> @jzonthemtn Could you trigger & approve the (build) workflow, 'cause I
> can't. Thx in advance.
Done
mawiesne commented on PR #427:
URL: https://github.com/apache/opennlp/pull/427#issuecomment-1294518254
@jzonthemtn Could you trigger & approve the (build) workflow, 'cause I can't.
Thx in advance.
mawiesne commented on PR #427:
URL: https://github.com/apache/opennlp/pull/427#issuecomment-1292436503
@atarora The JUnit test has 🛩️ and gives a 💯% line coverage for
`ModelParameterChunker`.
mawiesne commented on PR #427:
URL: https://github.com/apache/opennlp/pull/427#issuecomment-1292384823
@atarora I'll provide a basic JUnit test for the `ModelParameterChunker`
class. The more complex scenario resembles a full system test; at least it is an
integration test setup that is qui
atarora commented on PR #427:
URL: https://github.com/apache/opennlp/pull/427#issuecomment-1292233015
The PR looks good, except for missing test cases! Can those be added?
jzonthemtn commented on code in PR #427:
URL: https://github.com/apache/opennlp/pull/427#discussion_r1005576541
##
opennlp-tools/src/main/java/opennlp/tools/ml/model/ModelParameterChunker.java:
##
@@ -0,0 +1,142 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under o
mawiesne commented on code in PR #427:
URL: https://github.com/apache/opennlp/pull/427#discussion_r1005268338
##
opennlp-tools/src/main/java/opennlp/tools/ml/model/ModelParameterChunker.java:
##
@@ -0,0 +1,142 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
jzonthemtn commented on PR #427:
URL: https://github.com/apache/opennlp/pull/427#issuecomment-1290958387
@mawiesne This looks great! Thanks for the PR!
jzonthemtn commented on code in PR #427:
URL: https://github.com/apache/opennlp/pull/427#discussion_r1004820481
##
opennlp-tools/src/main/java/opennlp/tools/ml/model/ModelParameterChunker.java:
##
@@ -0,0 +1,142 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under o
Hi,
here is a PR by my colleague Martin W.:
https://github.com/apache/opennlp/pull/427
Some more details are contained in
https://issues.apache.org/jira/browse/OPENNLP-1366
The change has been tested with the huge corpus on the HPC system.
Regards
Richard Z
On Friday, 14.10.2022 at 08:18 +0200 wrote
mawiesne opened a new pull request, #427:
URL: https://github.com/apache/opennlp/pull/427
Thank you for contributing to Apache OpenNLP.
In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:
### For all changes:
- [x
Hi Jeff,
just to drop a short notice on that one:
My colleague, who is affected by this, is preparing a PR (might take
some time though because of testing on the HPC system...), which will
hopefully solve reading / writing "large" models without breaking
existing ones in the process.
Regards
Richar
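
To illustrate what reading / writing "large" values could look like, here is a minimal sketch, assuming Java 11+. It is not the code from the PR: the class name `ChunkedUtfSketch`, the helper names, the chunk size, and the count prefix are all hypothetical. The idea shown is to split a long string into pieces that each stay under the 65535-byte limit of `DataOutputStream.writeUTF()`, write the pieces one by one, and join them again when reading.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and layout are assumptions, not the PR's implementation.
class ChunkedUtfSketch {

    // writeUTF() needs at most 3 bytes per Java char, so 20,000 chars
    // never exceed 60,000 encoded bytes (below the 65,535-byte limit).
    static final int MAX_CHARS_PER_CHUNK = 20_000;

    // Cut the value into consecutive pieces of at most MAX_CHARS_PER_CHUNK chars.
    static List<String> split(String value) {
        List<String> chunks = new ArrayList<>();
        for (int start = 0; start < value.length(); start += MAX_CHARS_PER_CHUNK) {
            int end = Math.min(value.length(), start + MAX_CHARS_PER_CHUNK);
            chunks.add(value.substring(start, end));
        }
        return chunks;
    }

    // Write the number of chunks, then each chunk with the regular writeUTF().
    static void writeChunked(DataOutputStream out, String value) throws IOException {
        List<String> chunks = split(value);
        out.writeInt(chunks.size());
        for (String chunk : chunks) {
            out.writeUTF(chunk);
        }
    }

    // Read the chunk count, then concatenate the chunks in order.
    static String readChunked(DataInputStream in) throws IOException {
        int count = in.readInt();
        StringBuilder joined = new StringBuilder();
        for (int i = 0; i < count; i++) {
            joined.append(in.readUTF());
        }
        return joined.toString();
    }

    // Convenience for trying it out: write to memory and read back.
    static String roundTrip(String value) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                writeChunked(out, value);
            }
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return readChunked(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String big = "ü".repeat(100_000);
        System.out.println("chunks: " + split(big).size());
        System.out.println("round trip ok: " + roundTrip(big).equals(big));
    }
}
```

Note that the count prefix in this sketch changes the byte layout, so it does not by itself keep existing model files readable; a real format would need some way to tell old and new entries apart.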
Hi Jeff,
no real updates from our side. We were quite busy in the last weeks
finishing and correcting student course work ;)
My last status in this matter is:
The change from writeUTF to writeShort worked. Training and writing the
MaxEnt model just worked for this huge corpus. No (runtime) errors
Hi Richard,
I know it's been a while but I wanted to circle back to this to see if
there are any updates.
Thanks,
Jeff
On Mon, Apr 25, 2022 at 1:48 PM Richard Eckart de Castilho
wrote:
> Hi,
>
> > On 11. Apr 2022, at 14:50, Zowalla, Richard <
> richard.zowa...@hs-heilbronn.de> wrote:
> >
> > T
Hi,
> On 11. Apr 2022, at 14:50, Zowalla, Richard
> wrote:
>
> This works fine for mid size corpora (just need a little bit of RAM and
> time). However, we are running into the exception mentioned in [1].
> Debugging into the DataOutputStream reveals that this is a limitation
> of the java.io.
Thanks for trying it and for all the info! I will check it out and let you
know.
Thanks,
Jeff
On Sun, Apr 17, 2022 at 12:51 PM Zowalla, Richard <
richard.zowa...@hs-heilbronn.de> wrote:
> Hi Jeff,
>
> he did the validation again and it showed that the IDE used an older
> version of OpenNLP.
>
>
Hi Jeff,
he did the validation again and it showed that the IDE used an older
version of OpenNLP.
After a clean build with the freshly created SNAPSHOT, the model load
resulted in another exception (which now looks reasonable to me).
He updated his comment in [1]. Maybe you have an idea :)
Th
Hi Jeff,
reading the stacktrace myself (now), I think that an outdated snapshot
was included for this test (as it doesn't fit the code).
I will report back if this is the case and Maven / Gradle / IDE did
something weird.
Sorry & regards
Richard
On Sunday, 17.04.2022 at 16:26 + wrote
Hi Jeff,
the task completed and we have some feedback.
My colleague directly commented in the related commit [1].
Writing the model seems to work but reading the resulting model fails.
Regards
Richard
[1]
https://github.com/apache/opennlp/commit/803f5a4f3a938b7e19ad0be6915097708348e702#commitcom
Hi Jeff,
thanks for the update.
We will give the change a try with a SNAPSHOT build including the
potential patch and start a run on the cluster with the Tübingen
Wikipedia Treebank. Guess we will have feedback in ~ 48 hours regarding
writeShort(...).
Regards
Richard
On Tuesday, 12.04.2022 at
Luckily, this looks like a problem with writeUTF() that has been common for
years [1]. According to other guidance and the function's javadocs [2],
writeUTF() writes a two-byte length (the number of encoded bytes) followed by
the string itself, so it cannot handle strings longer than 65535 bytes.
Changing it to manually write the length of the string followed by write()
allows the tra
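
A minimal sketch of the idea described above, assuming Java 11+. It is not the code from the actual patch: the class name `LengthPrefixedStrings`, the helper names, and the choice of a 4-byte `writeInt()` length prefix are assumptions for illustration. It shows that `DataOutputStream.writeUTF()` rejects a string whose encoded form is longer than 65535 bytes, while writing the length and the raw bytes separately has no such limit.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; names and the int length prefix are assumptions.
class LengthPrefixedStrings {

    // Write a 4-byte length followed by the raw UTF-8 bytes of the string.
    static void writeLongString(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Read the 4-byte length, then exactly that many bytes, and decode them.
    static String readLongString(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Returns true if writeUTF() refuses the string because it is too long.
    static boolean writeUtfFails(String s) {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.writeUTF(s);
            return false;
        } catch (UTFDataFormatException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Convenience for trying it out: write to memory and read back.
    static String roundTrip(String s) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                writeLongString(out, s);
            }
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                return readLongString(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String big = "a".repeat(70_000); // longer than writeUTF() can encode
        System.out.println("writeUTF fails: " + writeUtfFails(big));
        System.out.println("round trip ok: " + roundTrip(big).equals(big));
    }
}
```

Because `writeUTF()` uses a two-byte prefix and modified UTF-8, data written this way is not readable with `readUTF()`; files written in the old layout would need separate handling.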
Great, thanks. I was able to reproduce the problem. I'll take a look and
keep this thread updated.
Thanks,
Jeff
On Mon, Apr 11, 2022 at 10:22 AM Zowalla, Richard <
richard.zowa...@hs-heilbronn.de> wrote:
> Hi Jeff,
>
> thanks for the quick reply. Here it is:
> https://issues.apache.org/jira/brow
Hi Jeff,
thanks for the quick reply. Here it is:
https://issues.apache.org/jira/browse/OPENNLP-1366
Using the treebank from Tübingen might not be feasible as it consumes
around 2 TB RAM ;) - the mentioned link in the ticket points to a
smaller dataset, which should reproduce the issue with a fea
Hi Richard,
Thanks for reporting this. A Jira issue with steps to reproduce it would be
fantastic. https://issues.apache.org/jira/projects/OPENNLP
Please create one and reply back here with its ID once you do. I can take a
look and see what can be done.
Thanks,
Jeff
On Mon, Apr 11, 2022 at 8:47
Hi all,
we are working on training a large OpenNLP MaxEnt model for lemmatizing
German texts. We use a Wikipedia treebank from Tübingen.
This works fine for mid size corpora (just need a little bit of RAM and
time). However, we are running into the exception mentioned in [1].
Debugging into the