[
https://issues.apache.org/jira/browse/SOLR-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278001#comment-17278001
]
Markus Jelsma commented on SOLR-11735:
--
Updated patch for master.
> TransformerFactory to support
[
https://issues.apache.org/jira/browse/SOLR-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated SOLR-11735:
-
Attachment: SOLR-11735.patch
> TransformerFactory to support SolrCoreAware
>
[
https://issues.apache.org/jira/browse/LUCENE-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17277119#comment-17277119
]
Markus Jelsma commented on LUCENE-9636:
---
*
[
https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17250382#comment-17250382
]
Markus Jelsma commented on SOLR-14788:
--
Yes, those parts of stack traces break to the next line,
[
https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17250344#comment-17250344
]
Markus Jelsma edited comment on SOLR-14788 at 12/16/20, 3:13 PM:
-
I am
[
https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17250344#comment-17250344
]
Markus Jelsma commented on SOLR-14788:
--
I am not a Docker user, but I got ZK running in some tab
[
https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17250290#comment-17250290
]
Markus Jelsma commented on SOLR-14788:
--
I am still missing the internal ZooKeeper running at 9983
Markus Jelsma created LUCENE-9591:
-
Summary: StringIndexOutOfBoundsException in FastVectorHighlighter
Key: LUCENE-9591
URL: https://issues.apache.org/jira/browse/LUCENE-9591
Project: Lucene - Core
[
https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176237#comment-17176237
]
Markus Jelsma commented on SOLR-14636:
--
Well, adding a null check there fixes the problem, but the
[
https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176231#comment-17176231
]
Markus Jelsma commented on SOLR-14636:
--
Hi [~markrmiller], as curious as I was, I tried to compile
[
https://issues.apache.org/jira/browse/SOLR-7759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049368#comment-17049368
]
Markus Jelsma commented on SOLR-7759:
-
With TLOG, each shard replica is identical to all other TLOG or
[
https://issues.apache.org/jira/browse/SOLR-7759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049270#comment-17049270
]
Markus Jelsma commented on SOLR-7759:
-
Hello Jan. This is no longer an issue for me. Possibly because
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17027702#comment-17027702
]
Markus Jelsma commented on LUCENE-9112:
---
Hello Robert,
I agree, it is useful to have an
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026634#comment-17026634
]
Markus Jelsma commented on LUCENE-9112:
---
Updated patch so it uses the already existing .bin model
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated LUCENE-9112:
--
Attachment: LUCENE-9112.patch
> SegmentingTokenizerBase splits terms that occupy 1024th
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated LUCENE-9112:
--
Attachment: LUCENE-9112.patch
> SegmentingTokenizerBase splits terms that occupy 1024th
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026591#comment-17026591
]
Markus Jelsma commented on LUCENE-9112:
---
Hello Robert,
I asked my colleague Jurian Broertjes
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025816#comment-17025816
]
Markus Jelsma commented on LUCENE-9112:
---
I discovered a problem with my sentence detector model
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated LUCENE-9112:
--
Attachment: en-token.bin
en-sent.bin
> SegmentingTokenizerBase splits terms
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated LUCENE-9112:
--
Attachment: (was: en-sent.bin)
> SegmentingTokenizerBase splits terms that occupy 1024th
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated LUCENE-9112:
--
Attachment: (was: en-token.bin)
> SegmentingTokenizerBase splits terms that occupy 1024th
[
https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009612#comment-17009612
]
Markus Jelsma commented on SOLR-12743:
--
I cannot confirm whether it is fixed or not. The collection
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated LUCENE-9112:
--
Summary: SegmentingTokenizerBase splits terms that occupy 1024th positions
in text (was:
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17008871#comment-17008871
]
Markus Jelsma commented on LUCENE-9112:
---
SegmentingTokenizerBase works fine on texts smaller than
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006105#comment-17006105
]
Markus Jelsma commented on LUCENE-9112:
---
There it is:
{code}
usableLength = findSafeEnd();
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated LUCENE-9112:
--
Attachment: en-token.bin
en-sent.bin
> OpenNLP tokenizer is fooled by text
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated LUCENE-9112:
--
Attachment: LUCENE-9112-unittest.patch
> OpenNLP tokenizer is fooled by text containing
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006089#comment-17006089
]
Markus Jelsma edited comment on LUCENE-9112 at 12/31/19 1:22 PM:
-
I now
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006089#comment-17006089
]
Markus Jelsma commented on LUCENE-9112:
---
I now believe it is a problem in the Lucene code, namely
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006076#comment-17006076
]
Markus Jelsma commented on LUCENE-9112:
---
Hello [~sarowe],
I first spotted the issue with a Dutch
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated LUCENE-9112:
--
Description:
The OpenNLP tokenizer shows weird behaviour when text contains spurious
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated LUCENE-9112:
--
Attachment: LUCENE-9112-unittest.patch
> OpenNLP tokenizer is fooled by text containing
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated LUCENE-9112:
--
Attachment: (was: LUCENE-8740.patch)
> OpenNLP tokenizer is fooled by text containing
[
https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated LUCENE-9112:
--
Attachment: LUCENE-8740.patch
> OpenNLP tokenizer is fooled by text containing spurious
34 matches