[jira] [Commented] (SOLR-11735) TransformerFactory to support SolrCoreAware

2021-02-03 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/SOLR-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278001#comment-17278001 ] Markus Jelsma commented on SOLR-11735: -- Updated patch for master. > TransformerFactory to support

[jira] [Updated] (SOLR-11735) TransformerFactory to support SolrCoreAware

2021-02-03 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/SOLR-11735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated SOLR-11735: - Attachment: SOLR-11735.patch > TransformerFactory to support SolrCoreAware >

[jira] [Commented] (LUCENE-9636) Exact and operation to get a SIMD optimize

2021-02-02 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17277119#comment-17277119 ] Markus Jelsma commented on LUCENE-9636: --- *

[jira] [Commented] (SOLR-14788) Solr: The Next Big Thing

2020-12-16 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17250382#comment-17250382 ] Markus Jelsma commented on SOLR-14788: -- Yes, those parts of stack traces break to the next line,

[jira] [Comment Edited] (SOLR-14788) Solr: The Next Big Thing

2020-12-16 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17250344#comment-17250344 ] Markus Jelsma edited comment on SOLR-14788 at 12/16/20, 3:13 PM: - I am

[jira] [Commented] (SOLR-14788) Solr: The Next Big Thing

2020-12-16 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17250344#comment-17250344 ] Markus Jelsma commented on SOLR-14788: -- I am not a Docker user. But got ZK running in some tab

[jira] [Commented] (SOLR-14788) Solr: The Next Big Thing

2020-12-16 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/SOLR-14788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17250290#comment-17250290 ] Markus Jelsma commented on SOLR-14788: -- I am still missing the internal Zookeeper running at 9983

[jira] [Created] (LUCENE-9591) StringIndexOutOfBoundsException in FastVectorHighlighter

2020-10-29 Thread Markus Jelsma (Jira)
Markus Jelsma created LUCENE-9591: - Summary: StringIndexOutOfBoundsException in FastVectorHighlighter Key: LUCENE-9591 URL: https://issues.apache.org/jira/browse/LUCENE-9591 Project: Lucene - Core

[jira] [Commented] (SOLR-14636) Provide a reference implementation for SolrCloud that is stable and fast.

2020-08-12 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176237#comment-17176237 ] Markus Jelsma commented on SOLR-14636: -- Well, adding a null check there fixes the problem, but the

[jira] [Commented] (SOLR-14636) Provide a reference implementation for SolrCloud that is stable and fast.

2020-08-12 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/SOLR-14636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176231#comment-17176231 ] Markus Jelsma commented on SOLR-14636: -- Hi [~markrmiller], as curious as I was, I tried to compile

[jira] [Commented] (SOLR-7759) DebugComponent's explain should be implemented as a distributed query

2020-03-02 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/SOLR-7759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049368#comment-17049368 ] Markus Jelsma commented on SOLR-7759: - With TLOG, each shard replica is identical to all other TLOG or

[jira] [Commented] (SOLR-7759) DebugComponent's explain should be implemented as a distributed query

2020-03-02 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/SOLR-7759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17049270#comment-17049270 ] Markus Jelsma commented on SOLR-7759: - Hello Jan. This is for me no longer an issue. Possibly because

[jira] [Commented] (LUCENE-9112) SegmentingTokenizerBase splits terms that occupy 1024th positions in text

2020-01-31 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17027702#comment-17027702 ] Markus Jelsma commented on LUCENE-9112: --- Hello Robert, I agree, it is useful to have an

[jira] [Commented] (LUCENE-9112) SegmentingTokenizerBase splits terms that occupy 1024th positions in text

2020-01-30 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026634#comment-17026634 ] Markus Jelsma commented on LUCENE-9112: --- Updated patch so it uses the already existing .bin model

[jira] [Updated] (LUCENE-9112) SegmentingTokenizerBase splits terms that occupy 1024th positions in text

2020-01-30 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated LUCENE-9112: -- Attachment: LUCENE-9112.patch > SegmentingTokenizerBase splits terms that occupy 1024th

[jira] [Updated] (LUCENE-9112) SegmentingTokenizerBase splits terms that occupy 1024th positions in text

2020-01-30 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated LUCENE-9112: -- Attachment: LUCENE-9112.patch > SegmentingTokenizerBase splits terms that occupy 1024th

[jira] [Commented] (LUCENE-9112) SegmentingTokenizerBase splits terms that occupy 1024th positions in text

2020-01-30 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026591#comment-17026591 ] Markus Jelsma commented on LUCENE-9112: --- Hello Robert, I asked my colleague Jurian Broertjes

[jira] [Commented] (LUCENE-9112) SegmentingTokenizerBase splits terms that occupy 1024th positions in text

2020-01-29 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025816#comment-17025816 ] Markus Jelsma commented on LUCENE-9112: --- I discovered a problem with my sentence detector model

[jira] [Updated] (LUCENE-9112) SegmentingTokenizerBase splits terms that occupy 1024th positions in text

2020-01-29 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated LUCENE-9112: -- Attachment: en-token.bin en-sent.bin > SegmentingTokenizerBase splits terms

[jira] [Updated] (LUCENE-9112) SegmentingTokenizerBase splits terms that occupy 1024th positions in text

2020-01-29 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated LUCENE-9112: -- Attachment: (was: en-sent.bin) > SegmentingTokenizerBase splits terms that occupy 1024th

[jira] [Updated] (LUCENE-9112) SegmentingTokenizerBase splits terms that occupy 1024th positions in text

2020-01-29 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated LUCENE-9112: -- Attachment: (was: en-token.bin) > SegmentingTokenizerBase splits terms that occupy 1024th

[jira] [Commented] (SOLR-12743) Memory leak introduced in Solr 7.3.0

2020-01-07 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/SOLR-12743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17009612#comment-17009612 ] Markus Jelsma commented on SOLR-12743: -- I cannot confirm whether it is fixed or not. The collection

[jira] [Updated] (LUCENE-9112) SegmentingTokenizerBase splits terms that occupy 1024th positions in text

2020-01-06 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated LUCENE-9112: -- Summary: SegmentingTokenizerBase splits terms that occupy 1024th positions in text (was:

[jira] [Commented] (LUCENE-9112) OpenNLP tokenizer is fooled by text containing spurious punctuation

2020-01-06 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17008871#comment-17008871 ] Markus Jelsma commented on LUCENE-9112: --- SegmentingTokenizerBase works fine on texts smaller than

[jira] [Commented] (LUCENE-9112) OpenNLP tokenizer is fooled by text containing spurious punctuation

2019-12-31 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006105#comment-17006105 ] Markus Jelsma commented on LUCENE-9112: --- There it is: {code} usableLength = findSafeEnd();

[jira] [Updated] (LUCENE-9112) OpenNLP tokenizer is fooled by text containing spurious punctuation

2019-12-31 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated LUCENE-9112: -- Attachment: en-token.bin en-sent.bin > OpenNLP tokenizer is fooled by text

[jira] [Updated] (LUCENE-9112) OpenNLP tokenizer is fooled by text containing spurious punctuation

2019-12-31 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated LUCENE-9112: -- Attachment: LUCENE-9112-unittest.patch > OpenNLP tokenizer is fooled by text containing

[jira] [Comment Edited] (LUCENE-9112) OpenNLP tokenizer is fooled by text containing spurious punctuation

2019-12-31 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006089#comment-17006089 ] Markus Jelsma edited comment on LUCENE-9112 at 12/31/19 1:22 PM: - I now

[jira] [Commented] (LUCENE-9112) OpenNLP tokenizer is fooled by text containing spurious punctuation

2019-12-31 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006089#comment-17006089 ] Markus Jelsma commented on LUCENE-9112: --- I now believe it is a problem in the Lucene code, namely

[jira] [Commented] (LUCENE-9112) OpenNLP tokenizer is fooled by text containing spurious punctuation

2019-12-31 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006076#comment-17006076 ] Markus Jelsma commented on LUCENE-9112: --- Hello [~sarowe], I first spotted the issue with a Dutch

[jira] [Updated] (LUCENE-9112) OpenNLP tokenizer is fooled by text containing spurious punctuation

2019-12-30 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated LUCENE-9112: -- Description: The OpenNLP tokenizer shows weird behaviour when text contains spurious

[jira] [Updated] (LUCENE-9112) OpenNLP tokenizer is fooled by text containing spurious punctuation

2019-12-30 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated LUCENE-9112: -- Attachment: LUCENE-9112-unittest.patch > OpenNLP tokenizer is fooled by text containing

[jira] [Updated] (LUCENE-9112) OpenNLP tokenizer is fooled by text containing spurious punctuation

2019-12-30 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated LUCENE-9112: -- Attachment: (was: LUCENE-8740.patch) > OpenNLP tokenizer is fooled by text containing

[jira] [Updated] (LUCENE-9112) OpenNLP tokenizer is fooled by text containing spurious punctuation

2019-12-30 Thread Markus Jelsma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated LUCENE-9112: -- Attachment: LUCENE-8740.patch > OpenNLP tokenizer is fooled by text containing spurious