[GitHub] [lucene-solr] anshumg commented on issue #1335: SOLR-14316 Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
anshumg commented on issue #1335: SOLR-14316 Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method URL: https://github.com/apache/lucene-solr/pull/1335#issuecomment-597473846 LGTM. Thanks for adding the test. Can you please also add a CHANGELOG entry? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14316) Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
[ https://issues.apache.org/jira/browse/SOLR-14316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aroop updated SOLR-14316: - Description: There is an unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method. This change removes that warning by handling a checked conversion and also adds tests to a previously untested API. was:There is an unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method. > Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's > equals() method > - > > Key: SOLR-14316 > URL: https://issues.apache.org/jira/browse/SOLR-14316 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrJ >Affects Versions: 7.7.2, 8.4.1 >Reporter: Aroop >Priority: Minor > Labels: patch > Time Spent: 0.5h > Remaining Estimate: 0h > > There is an unchecked type conversion warning in JavaBinCodec's > readMapEntry's equals() method. > This change removes that warning by handling a checked conversion and also > adds tests to a previously untested API. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
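The kind of fix described in this issue can be sketched as follows. This is a hedged illustration, not the actual JavaBinCodec code: the class name `MapEntry` and its fields are simplified stand-ins. The point is that replacing a raw-type cast in `equals()` with an `instanceof` check and a wildcard-typed cast (`Map.Entry<?, ?>`) removes the "unchecked" compiler warning without changing behavior.

```java
import java.util.Map;

// Simplified stand-in for the kind of entry class discussed in the issue
// (hypothetical names; not the actual JavaBinCodec implementation).
final class MapEntry implements Map.Entry<Object, Object> {
  private final Object key;
  private final Object value;

  MapEntry(Object key, Object value) {
    this.key = key;
    this.value = value;
  }

  @Override public Object getKey() { return key; }
  @Override public Object getValue() { return value; }
  @Override public Object setValue(Object value) {
    throw new UnsupportedOperationException();
  }

  @Override
  public boolean equals(Object obj) {
    if (this == obj) return true;
    // Checked conversion: the instanceof test plus a cast to the wildcard
    // type Map.Entry<?, ?> is safe, so javac emits no "unchecked" warning.
    if (!(obj instanceof Map.Entry)) return false;
    Map.Entry<?, ?> entry = (Map.Entry<?, ?>) obj;
    return (key == null ? entry.getKey() == null : key.equals(entry.getKey()))
        && (value == null ? entry.getValue() == null : value.equals(entry.getValue()));
  }

  @Override
  public int hashCode() {
    return (key == null ? 0 : key.hashCode())
        ^ (value == null ? 0 : value.hashCode());
  }
}
```

The wildcard cast works because `equals()` only reads the other entry's key and value as `Object`; no element type needs to be assumed.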
[GitHub] [lucene-solr] aroopganguly commented on issue #1335: SOLR-14316 Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
aroopganguly commented on issue #1335: SOLR-14316 Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method URL: https://github.com/apache/lucene-solr/pull/1335#issuecomment-597466952 @noblepaul tests added, and some points to ponder as well, in the Tests section of the PR description above.
[jira] [Commented] (SOLR-13199) NPE due to unexpected null return value from QueryBitSetProducer.getBitSet
[ https://issues.apache.org/jira/browse/SOLR-13199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056691#comment-17056691 ] Munendra S N commented on SOLR-13199: - Thanks [~mkhl] for the review. The reason I thought of that scenario is that there are no enforcements or checks on what can be a parentFilter. As stated earlier, it is a bit of a stretch, and I agree it is not a valid case. [^SOLR-13199.patch] modified to throw exceptions. > NPE due to unexpected null return value from QueryBitSetProducer.getBitSet > -- > > Key: SOLR-13199 > URL: https://issues.apache.org/jira/browse/SOLR-13199 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: master (9.0) > Environment: h1. Steps to reproduce > * Use a Linux machine. > * Build commit {{ea2c8ba}} of Solr as described in the section below. > * Build the films collection as described below. > * Start the server using the command {{./bin/solr start -f -p 8983 -s > /tmp/home}} > * Request the URL given in the bug description. > h1. Compiling the server > {noformat} > git clone https://github.com/apache/lucene-solr > cd lucene-solr > git checkout ea2c8ba > ant compile > cd solr > ant server > {noformat} > h1. Building the collection > We followed [Exercise > 2|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html#exercise-2] from > the [Solr > Tutorial|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html]. 
The > attached file ({{home.zip}}) gives the contents of folder {{/tmp/home}} that > you will obtain by following the steps below: > {noformat} > mkdir -p /tmp/home > echo '' > > /tmp/home/solr.xml > {noformat} > In one terminal start a Solr instance in foreground: > {noformat} > ./bin/solr start -f -p 8983 -s /tmp/home > {noformat} > In another terminal, create a collection of movies, with no shards and no > replication, and initialize it: > {noformat} > bin/solr create -c films > curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": > {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' > http://localhost:8983/solr/films/schema > curl -X POST -H 'Content-type:application/json' --data-binary > '{"add-copy-field" : {"source":"*","dest":"_text_"}}' > http://localhost:8983/solr/films/schema > ./bin/post -c films example/films/films.json > {noformat} >Reporter: Johannes Kloos >Assignee: Munendra S N >Priority: Minor > Labels: diffblue, newdev > Attachments: SOLR-13199.patch, SOLR-13199.patch, home.zip > > > Requesting the following URL causes Solr to return an HTTP 500 error response: > {noformat} > http://localhost:8983/solr/films/select?fl=[child%20parentFilter=ge]&q=*:* > {noformat} > The error response seems to be caused by the following uncaught exception: > {noformat} > java.lang.NullPointerException > at > org.apache.solr.response.transform.ChildDocTransformer.transform(ChildDocTransformer.java:92) > at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:103) > at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:1) > at > org.apache.solr.response.TextResponseWriter.writeDocuments(TextResponseWriter.java:184) > at > org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:136) > at > org.apache.solr.common.util.JsonTextWriter.writeNamedListAsMapWithDups(JsonTextWriter.java:386) > at > org.apache.solr.common.util.JsonTextWriter.writeNamedList(JsonTextWriter.java:292) > at 
org.apache.solr.response.JSONWriter.writeResponse(JSONWriter.java:73) > {noformat} > In ChildDocTransformer.transform, we have the following lines: > {noformat} > final BitSet segParentsBitSet = parentsFilter.getBitSet(leafReaderContext); > final int segPrevRootId = segRootId==0? -1: > segParentsBitSet.prevSetBit(segRootId - 1); // can return -1 and that's okay > {noformat} > But getBitSet can return null if the set of DocIds is empty: > {noformat} > return docIdSet == DocIdSet.EMPTY ? null : ((BitDocIdSet) docIdSet).bits(); > {noformat} > We found this bug using [Diffblue Microservices > Testing|https://www.diffblue.com/labs/?utm_source=solr-br]. Find more > information on this [fuzz testing > campaign|https://www.diffblue.com/blog/2018/12/19/diffblue-microservice-testing-a-sneak-peek-at-our-early-product-and-results?utm_source=solr-br].
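The fix direction discussed in this thread (throw a clear exception instead of letting the NPE escape as an HTTP 500) can be sketched as below. This is a hedged illustration, not the actual ChildDocTransformer code or the attached patch: names are simplified, and `java.util.BitSet` (with its `previousSetBit` method) stands in for Lucene's `org.apache.lucene.util.BitSet`.

```java
import java.util.BitSet;

// Hypothetical null guard for the getBitSet() result: when the parent
// filter matches no documents in a segment, fail fast with a descriptive
// exception rather than dereferencing null later.
final class ParentBitSetGuard {
  static BitSet requireParents(BitSet segParentsBitSet) {
    if (segParentsBitSet == null) {
      throw new IllegalStateException(
          "Parent filter matched no documents in this segment; "
              + "check that parentFilter identifies only root documents");
    }
    return segParentsBitSet;
  }

  static int prevRootId(BitSet segParentsBitSet, int segRootId) {
    // Mirrors the original expression, now guarded against null.
    // Can still return -1, and that's okay (as the original comment notes).
    return segRootId == 0
        ? -1
        : requireParents(segParentsBitSet).previousSetBit(segRootId - 1);
  }
}
```

Whether to throw here or to treat a null bit set as "no parents" is exactly the design question debated later in the thread; this sketch follows the throw-an-exception approach mentioned in the comment above.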
[jira] [Updated] (SOLR-13199) NPE due to unexpected null return value from QueryBitSetProducer.getBitSet
[ https://issues.apache.org/jira/browse/SOLR-13199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Munendra S N updated SOLR-13199: Attachment: SOLR-13199.patch
[jira] [Commented] (SOLR-13199) NPE due to unexpected null return value from QueryBitSetProducer.getBitSet
[ https://issues.apache.org/jira/browse/SOLR-13199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056668#comment-17056668 ] Mikhail Khludnev commented on SOLR-13199: - {quote}{{cat_s:fantasy}} is the parentFilter {quote} [~munendrasn], here I would disagree. Such a query can never be a parent filter. {quote}bq. Solr flattens the nest then supplies the list to Lucene which +guarantees+ this. {quote} Thanks for the reminder, [~dsmiley]. I agree that having even two distinct parent types like {{type:ParentA}} and {{type:ParentB}} is not a valid case and shouldn't be considered. But couldn't a segment have all docs marked for deletion? Could it happen due to NRT or some other esoteric cases?
[jira] [Commented] (LUCENE-8103) QueryValueSource should use TwoPhaseIterator
[ https://issues.apache.org/jira/browse/LUCENE-8103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056624#comment-17056624 ] David Smiley commented on LUCENE-8103: -- {{ant test -Dtestcase=TestValueSources -Dtests.method=testQueryWrapedFuncWrapedQuery -Dtests.seed=625CF512BDD7BD01 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=fr-CA -Dtests.timezone=America/Phoenix -Dtests.asserts=true -Dtests.file.encoding=UTF-8}} {{java.lang.AssertionError: ITERATING}} {{ at __randomizedtesting.SeedInfo.seed([625CF512BDD7BD01:36E49DF7D50086EB]:0) at org.apache.lucene.search.AssertingScorer$3.matches(AssertingScorer.java:235) at org.apache.lucene.queries.function.valuesource.QueryDocValues.exists(QueryValueSource.java:156) at org.apache.lucene.queries.function.valuesource.QueryDocValues.floatVal(QueryValueSource.java:129) at org.apache.lucene.queries.function.FunctionQuery$AllScorer.score(FunctionQuery.java:120) at org.apache.lucene.search.AssertingScorer.score(AssertingScorer.java:102)}} > QueryValueSource should use TwoPhaseIterator > > > Key: LUCENE-8103 > URL: https://issues.apache.org/jira/browse/LUCENE-8103 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/other >Reporter: David Smiley >Priority: Minor > Attachments: LUCENE-8103.patch > > > QueryValueSource (in "queries" module) is a ValueSource representation of a > Query; the score is the value. It ought to try to use a TwoPhaseIterator > from the query if it can be offered. This will prevent possibly expensive > advancing beyond documents that we aren't interested in.
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1332: SOLR-14254: Docs for text tagger: FST50 trade-off
dsmiley commented on a change in pull request #1332: SOLR-14254: Docs for text tagger: FST50 trade-off URL: https://github.com/apache/lucene-solr/pull/1332#discussion_r390721357 ## File path: solr/solr-ref-guide/src/the-tagger-handler.adoc ## @@ -271,11 +271,12 @@ The response should be this (the QTime may vary): }} -== Tagger Tips +== Tagger Performance Tips -Performance Tips: - -* Follow the recommended configuration field settings, especially `postingsFormat=FST50`. +* Follow the recommended configuration field settings above. +Additionally, for the best tagger performance, set `postingsFormat=FST50`. +However, non-default postings formats have no backwards-compatibility guarantees, and so if you upgrade Solr then you may find a nasty exception on startup as it fails to read the older index. +If the input text to be tagged is small (e.g. you are tagging queries or tweets) then the postings format choice isn't as important. Review comment: FYI the SolrTextTagger was benchmarked a couple years ago to compare the old "Memory" PF and FST50 -- https://github.com/OpenSextant/SolrTextTagger/issues/38#issuecomment-385597248 we never tried the default (blocktree). I believe the input data in that experiment were whole articles, and thus would be impacted by the postings format choice.
[jira] [Comment Edited] (SOLR-10112) Prevent DBQs from getting reordered
[ https://issues.apache.org/jira/browse/SOLR-10112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056576#comment-17056576 ] Eugene Tenkaev edited comment on SOLR-10112 at 3/11/20, 1:44 AM: - [~ichattopadhyaya], [~rwhaddad] please help! We need to figure out the root problem in our case. I see you are pretty deep in this question, so I decided to ask for your help. We intensively send updates that contain partial modifications of document fields using *set*. We also send delete-by-query (DBQ) requests fairly intensively; we use them for garbage collection of documents that do not have the field *enabled* with value *true*. So if we want a doc to be removed, we set *enabled:false*. Our delete by query (DBQ) looks like: {code} -enabled:true {code} Documents represent, for example, a phone model. After some time a document can be added back to the search index, and it will contain: {code} enabled:true {code} In parallel, we enrich documents with additional info (enriching concurrently; the info comes from different sources), but we always make sure that the update with *enabled:true* is sent before the *other* updates. We have started to observe that we lose some of the *other* updates (updates that come after the first initial update with *enabled:true*). They seem not to be applied. Can DBQs create such problems? Should we send DBQs that mention a date and delete only documents that have not been updated for a day, for example? Thank you, I would greatly appreciate any help! > Prevent DBQs from getting reordered > --- > > Key: SOLR-10112 > URL: https://issues.apache.org/jira/browse/SOLR-10112 > Project: Solr > Issue Type: Bug >Reporter: Ishan Chattopadhyaya >Priority: Major > > Reordered DBQs are problematic for various reasons. We might be able to > prevent DBQs from getting re-ordered by making sure, at the leader, that all > updates before a DBQ have been written successfully on the replicas, and > block all updates after the DBQ until the DBQ is written successfully at the > replicas.
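The date-bounded DBQ idea raised in the comment above can be sketched as a query-string builder. This is a hedged illustration: the field name `last_modified` and the `NOW-1DAY` window are assumptions of this example (any date field the schema actually updates on each write would do), not something specified in the issue.

```java
// Hypothetical helper building a delete-by-query that only garbage-collects
// disabled documents which have also been stale for a while, leaving
// recently re-enabled docs (whose follow-up partial updates may still be
// in flight) untouched. The resulting string would be passed to SolrJ's
// SolrClient.deleteByQuery(...).
public class StaleDisabledQuery {
  /**
   * @param dateField a date field updated on every write (assumed to exist)
   * @param window    a Solr date-math window, e.g. "1DAY"
   */
  static String build(String dateField, String window) {
    return "-enabled:true AND " + dateField + ":[* TO NOW-" + window + "]";
  }
}
```

Whether this avoids the reordering problem entirely depends on the root cause discussed in this issue; it only narrows the window in which a reordered DBQ can clobber a newer update.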
[GitHub] [lucene-solr] noblepaul commented on issue #1335: SOLR-14316 Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
noblepaul commented on issue #1335: SOLR-14316 Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method URL: https://github.com/apache/lucene-solr/pull/1335#issuecomment-597392672 LGTM; a test can help.
[GitHub] [lucene-solr] dsmiley commented on issue #1191: SOLR-14197 Reduce API of SolrResourceLoader
dsmiley commented on issue #1191: SOLR-14197 Reduce API of SolrResourceLoader URL: https://github.com/apache/lucene-solr/pull/1191#issuecomment-597376064 I propose that in 8.x, I leave the methods that were moved but mark them as deprecated. Otherwise, the rest of the changes are rather internal and can be applied to 8x, thus helping reduce possible merge conflicts for our future work. I'll put an entry in Other Changes for 9.x: "SolrResourceLoader remove deprecated methods (numerous)." Simple and succinct. The commit message will have lots of info. On 8x I'll say "SolrResourceLoader: marked many methods as deprecated, and in some cases rerouted existing logic to avoid them". Since this is very internal stuff, people can go see for themselves.
[jira] [Commented] (SOLR-14310) Expose solr logs with basic filters via HTTP
[ https://issues.apache.org/jira/browse/SOLR-14310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056519#comment-17056519 ] Noble Paul commented on SOLR-14310: --- My idea is to just read the last n lines from the file and dump them in the API response. > Expose solr logs with basic filters via HTTP > > > Key: SOLR-14310 > URL: https://issues.apache.org/jira/browse/SOLR-14310 > Project: Solr > Issue Type: Sub-task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Priority: Major > > path {{/api/node/tools/logs}} > params > * lines : (default 100) no. of lines to be shown; the most recent lines are > shown > * collection : (multivalued) filter log lines by collection > * shard : (multivalued) filter log lines by shard name > * core : (multivalued) filter log lines by core > * startTime : timestamp start > * endTime : timestamp end > * className : (multivalued) name of the class in logs > * logLevel : (multivalued) INFO/DEBUG etc. > * threadNamePrefix : (multivalued) e.g. qtp, searchExecutor, > solrHandlerExecutor etc. > The output will be in plain text format
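The "read the last n lines" idea from the comment above can be sketched as below. This is a hedged illustration, not Solr code: the naive version reads the whole file and keeps the final n lines, which is fine for modest log files, while a production version would read backwards from the end to avoid loading the entire file (and, as noted later in the thread, containerized deployments may not have a log file on disk at all).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical tail helper: return at most the last n lines of a file.
public class LogTail {
  public static List<String> lastLines(Path logFile, int n) throws IOException {
    List<String> all = Files.readAllLines(logFile);
    // subList keeps only the trailing n lines (or everything, if shorter).
    return all.subList(Math.max(0, all.size() - n), all.size());
  }
}
```

Applying the proposed filters (collection, shard, logLevel, etc.) would then be a matter of filtering this list before returning it as plain text.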
[GitHub] [lucene-solr] janhoy merged pull request #1322: Remove some unused lines from addBackcompatIndexes.py related to svn
janhoy merged pull request #1322: Remove some unused lines from addBackcompatIndexes.py related to svn URL: https://github.com/apache/lucene-solr/pull/1322
[jira] [Commented] (SOLR-14310) Expose solr logs with basic filters via HTTP
[ https://issues.apache.org/jira/browse/SOLR-14310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056517#comment-17056517 ] Jan Høydahl commented on SOLR-14310: I agree that anything but simple show last-N lines is overkill. Did you plan to serve this from the actual file system or from some memory buffer? Remember that for some deployments like containers, Solr may very well log to stdout and not store any log files on disk locally.
[GitHub] [lucene-solr] msokolov commented on issue #1316: LUCENE-8929 parallel early termination in TopFieldCollector using minmin score
msokolov commented on issue #1316: LUCENE-8929 parallel early termination in TopFieldCollector using minmin score URL: https://github.com/apache/lucene-solr/pull/1316#issuecomment-597359816 I've done pretty extensive performance testing, results are good, and unit tests are passing, but I would really appreciate some eyeballs if anyone has the time: this is a pretty sensitive area, and the implementation I've tested extensively in the wild is not exactly the same as this one, although it implements the same strategy. I'm also particularly interested, @jimczi, if you can comment on whether the `MaxScoreAccumulator` still provides additional benefit alongside this optimization. I haven't tried removing it, but I wonder if it might be doing something redundant now; I'm not totally clear what impact setMinCompetitiveScore will have.
[GitHub] [lucene-solr] sarowe commented on issue #1322: Remove some unused lines from addBackcompatIndexes.py related to svn
sarowe commented on issue #1322: Remove some unused lines from addBackcompatIndexes.py related to svn URL: https://github.com/apache/lucene-solr/pull/1322#issuecomment-597354343 +1 LGTM
[jira] [Commented] (SOLR-14306) Refactor coordination code into separate module and evaluate using Curator
[ https://issues.apache.org/jira/browse/SOLR-14306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056457#comment-17056457 ] David Smiley commented on SOLR-14306: - Overall I'm really encouraged to finally see some momentum in this direction. Thanks so much Tomas!

> Refactor coordination code into separate module and evaluate using Curator
> Key: SOLR-14306
> URL: https://issues.apache.org/jira/browse/SOLR-14306
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrCloud
> Reporter: Tomas Eduardo Fernandez Lobbe
> Priority: Major
>
> This Jira issue is to discuss two changes that unfortunately are difficult to address separately:
> # Separate all ZooKeeper coordination logic into its own module, that can be tested in isolation
> # Evaluate using Apache Curator for coordination instead of our own logic.
> I drafted a [SIP|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=148640472], but this is very much WIP; I'd like to hear opinions before I spend too much time on something people hate.
> From the initial draft of the SIP:
> {quote}The main goal of this change is to allow better testing of the different ZooKeeper interactions related to coordination (leader election, queues, etc). There are already some abstractions in place for lower-level operations (set-data, get-data, etc; see DistribStateManager), so the idea is to have a new, related abstraction named CoordinationManager, where we could have some higher-level coordination-related classes, like LeaderRunner (Overseer), LeaderLatch (for shard leaders), etc. Curator comes into place because, in order to refactor the existing code into these new abstractions, we'd have to rework much of it, so we could instead consider using Curator, a library that was mentioned in the past many times.
> While I don't think this is required, it would make this transition and our code simpler (from what I could see; however, input from people with more Curator experience would be greatly appreciated).
> While it would be out of the scope of this change, if the abstractions/interfaces are correctly designed, this could lead to, in the future, being able to use something other than ZooKeeper for coordination, either etcd or maybe even some in-memory replacement for tests.
> {quote}
> There are still many open questions, and many questions I still don't know we'll have, but please let me know if you have any early feedback, especially if you've worked with Curator in the past.
[jira] [Commented] (SOLR-14310) Expose solr logs with basic filters via HTTP
[ https://issues.apache.org/jira/browse/SOLR-14310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056435#comment-17056435 ] David Smiley commented on SOLR-14310: - I agree with Ishan; I think this is scope-creep bloat.
[jira] [Commented] (SOLR-14173) Ref Guide Redesign
[ https://issues.apache.org/jira/browse/SOLR-14173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056433#comment-17056433 ] David Smiley commented on SOLR-14173: - I like the gray-themed one better than the blue/orange combo.

> Ref Guide Redesign
> Key: SOLR-14173
> URL: https://issues.apache.org/jira/browse/SOLR-14173
> Project: Solr
> Issue Type: Improvement
> Components: documentation
> Reporter: Cassandra Targett
> Assignee: Cassandra Targett
> Priority: Major
> Attachments: SOLR-14173.patch, blue-left-nav.png, gray-left-nav.png
>
> The current design of the Ref Guide was essentially copied from a Jekyll-based documentation theme (https://idratherbewriting.com/documentation-theme-jekyll/), which had a couple important benefits for that time:
> * It was well-documented, and since I had little experience with Jekyll and its Liquid templates and since I was the one doing it, I wanted to make it as easy on myself as possible
> * It was designed for documentation specifically, so took care of all the things like inter-page navigation, etc.
> * It helped us get from Confluence to our current system quickly
> It had some drawbacks, though:
> * It wasted a lot of space on the page
> * The theme was built for Markdown files, so did not take advantage of the features of the {{jekyll-asciidoc}} plugin we use (the in-page TOC being one big example - the plugin could create it at build time, but the theme included JS to do it as the page loads, so we use the JS)
> * It had a lot of JS and overlapping CSS files. While it used Bootstrap, it used a customized CSS on top of it for theming that made modifications complex (it was hard to figure out how exactly a change would behave)
> * With all the stuff I'd changed in my bumbling way just to get things to work back then, I broke a lot of the stuff Bootstrap is supposed to give us in terms of responsiveness and making the Guide usable even on smaller screen sizes.
> After upgrading the Asciidoctor components in SOLR-12786 and stopping the PDF (SOLR-13782), I wanted to try to set us up for a more flexible system. We need it for things like Joel's work on the visual guide for streaming expressions (SOLR-13105), and in order to implement other ideas we might have on how to present information in the future.
> I view this issue as phase 1 of an overall redesign that I've already started in a local branch. I'll explain in a comment the changes I've already made, and will use this issue to create and push a branch where we can discuss in more detail.
> Phase 1 here will be under-the-hood CSS/JS changes + overall page layout changes.
> Phase 2 (issue TBD) will be a wholesale re-organization of all the pages of the Guide.
> Phase 3 (issue TBD) will explore moving us from Jekyll to another static site generator that is better suited for our content format, file types, and build conventions.
[jira] [Resolved] (SOLR-14318) Missing dependency on commons-lang in solr-cell 8.4.1
[ https://issues.apache.org/jira/browse/SOLR-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Risden resolved SOLR-14318. - Assignee: Kevin Risden Resolution: Information Provided Marking this as information provided. Thanks for the feedback [~markus.guenther]; let us know if there are any other concerns.

> Missing dependency on commons-lang in solr-cell 8.4.1
> Key: SOLR-14318
> URL: https://issues.apache.org/jira/browse/SOLR-14318
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: contrib - Solr Cell (Tika extraction)
> Affects Versions: 8.4.1
> Reporter: Markus Günther
> Assignee: Kevin Risden
> Priority: Minor
>
> During a migration from Solr 7.x to Solr 8.4.1 we noticed that the commons-lang:commons-lang:2.6 dependency has been removed and thus is no longer part of org.apache.solr:solr-cell. solr-cell, however, comes bundled with Apache Tika Parsers (org.apache.tika:tika-parsers) in version 1.19.1, which - although it is not an explicit dependency - does require commons-lang:commons-lang:2.6.
> This raises an issue when trying to extract the content from Microsoft Access database files using Tika. See the stacktrace below.
>
> {code:java}
> java.lang.NoClassDefFoundError: org/apache/commons/lang/ObjectUtils
>   at com.healthmarketscience.jackcess.util.SimpleColumnMatcher.equals(SimpleColumnMatcher.java:74)
>   at com.healthmarketscience.jackcess.util.SimpleColumnMatcher.matches(SimpleColumnMatcher.java:46)
>   at com.healthmarketscience.jackcess.util.CaseInsensitiveColumnMatcher.matches(CaseInsensitiveColumnMatcher.java:49)
>   at com.healthmarketscience.jackcess.impl.CursorImpl.currentRowMatchesImpl(CursorImpl.java:571)
>   at com.healthmarketscience.jackcess.impl.CursorImpl.findAnotherRowImpl(CursorImpl.java:627)
>   at com.healthmarketscience.jackcess.impl.CursorImpl.findAnotherRow(CursorImpl.java:517)
>   at com.healthmarketscience.jackcess.impl.CursorImpl.findFirstRow(CursorImpl.java:494)
>   at com.healthmarketscience.jackcess.impl.DatabaseImpl$FallbackTableFinder.findRow(DatabaseImpl.java:2376)
>   at com.healthmarketscience.jackcess.impl.DatabaseImpl$TableFinder.findObjectId(DatabaseImpl.java:2176)
>   at com.healthmarketscience.jackcess.impl.DatabaseImpl.readSystemCatalog(DatabaseImpl.java:879)
>   at com.healthmarketscience.jackcess.impl.DatabaseImpl.<init>(DatabaseImpl.java:534)
>   at com.healthmarketscience.jackcess.impl.DatabaseImpl.open(DatabaseImpl.java:401)
>   at com.healthmarketscience.jackcess.DatabaseBuilder.open(DatabaseBuilder.java:252)
>   at org.apache.tika.parser.microsoft.JackcessParser.parse(JackcessParser.java:94)
>   at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>   at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>   at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>   at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72)
>   at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:102)
>   at org.apache.tika.parser.pkg.PackageParser.parseEntry(PackageParser.java:350)
>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:287)
>   at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>   at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>   at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>   at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228)
>   at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
>   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:211)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2596)
>   at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:799)
>   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:578)
>   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:419)
>   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:351)
>   at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
>   at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
>   at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
>   at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>   at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>   at org.eclipse.j
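For an application that embeds solr-cell and hits the NoClassDefFoundError above (rather than following the recommendation in this thread to run Tika outside Solr), one hedged stop-gap is to declare the missing artifact itself, using the exact coordinates named in the issue. This is a sketch of a pom.xml fragment, not an officially recommended fix:

```xml
<!-- Workaround sketch: restore the commons-lang 2.6 classes that
     tika-parsers 1.19.1 (via jackcess) expects on the classpath.
     The cleaner fix is moving to a Solr/Tika combination that no
     longer needs commons-lang (Tika 1.23+, see SOLR-14054). -->
<dependency>
  <groupId>commons-lang</groupId>
  <artifactId>commons-lang</artifactId>
  <version>2.6</version>
</dependency>
```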
[jira] [Commented] (SOLR-14054) Upgrade Tika to 1.23
[ https://issues.apache.org/jira/browse/SOLR-14054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056413#comment-17056413 ] Tim Allison commented on SOLR-14054: Would something like this be acceptable? https://stackoverflow.com/a/24497206

> Upgrade Tika to 1.23
> Key: SOLR-14054
> URL: https://issues.apache.org/jira/browse/SOLR-14054
> Project: Solr
> Issue Type: Task
> Components: contrib - DataImportHandler
> Reporter: Tim Allison
> Assignee: Tim Allison
> Priority: Minor
> Fix For: 8.5
> Attachments: test-documents.7z, tika-integration-example-9.0.0-SNAPSHOT.tgz
>
> We just released 1.23. Let's upgrade Tika.
[jira] [Comment Edited] (SOLR-14054) Upgrade Tika to 1.23
[ https://issues.apache.org/jira/browse/SOLR-14054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056403#comment-17056403 ] Tim Allison edited comment on SOLR-14054 at 3/10/20, 8:54 PM: -- We use xerces 2.12.0 which brings in xml-apis 1.4.01, which is needed by Java 8...see above. In master, we get rid of xml-apis because we don't need it with Java > 8. Any recommendations for a fix in 8.x when building with Java > 8? Is there an ant/ivy version of maven's profiles, activated by Java > 8, e.g.: https://github.com/apache/pdfbox/blob/trunk/parent/pom.xml#L176 ?
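Ant/Ivy has no direct analogue of Maven's JDK-activated profiles, but the question above can be approximated with Ant's `<condition>` task gating an inline Ivy resolution. This is a hedged sketch, not taken from the actual Lucene/Solr build; the property, target, and path names are invented:

```xml
<!-- Hedged sketch: gate the xml-apis dependency on the JVM running the
     build. ant.java.version reports "1.8" on Java 8 and "9", "10", ...
     on later releases, so equality with "1.8" approximates "Java 8 only". -->
<condition property="needs.xml.apis">
  <equals arg1="${ant.java.version}" arg2="1.8"/>
</condition>

<target name="resolve-xml-apis" if="needs.xml.apis">
  <!-- inline Ivy resolution of xml-apis 1.4.01, only when building on Java 8 -->
  <ivy:cachepath organisation="xml-apis" module="xml-apis"
                 revision="1.4.01" inline="true" pathid="xml.apis.classpath"/>
</target>
```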
[jira] [Comment Edited] (SOLR-14054) Upgrade Tika to 1.23
[ https://issues.apache.org/jira/browse/SOLR-14054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056403#comment-17056403 ] Tim Allison edited comment on SOLR-14054 at 3/10/20, 8:45 PM: -- We use xerces 2.12.0 which brings in xml-apis 1.4.01, which is needed by Java 8...see above. In master, we get rid of xml-apis because we don't need it with Java > 8. Any recommendations for a fix in 8.x when building with Java > 8?
[jira] [Commented] (SOLR-14054) Upgrade Tika to 1.23
[ https://issues.apache.org/jira/browse/SOLR-14054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056403#comment-17056403 ] Tim Allison commented on SOLR-14054: We use xerces 2.12.0 which brings in xml-apis 1.4.01, which is needed by Java 8...see above. In master, we get rid of xml-apis because we don't need it with Java > 8. Any recommendations for a fix in 8.x?
[jira] [Commented] (SOLR-14173) Ref Guide Redesign
[ https://issues.apache.org/jira/browse/SOLR-14173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056396#comment-17056396 ] Cassandra Targett commented on SOLR-14173: -- New branch created and pushed (without the color changes I mentioned yesterday...still waiting for some feedback there): named {{jira/solr-14173-2}}, in GH at: https://github.com/apache/lucene-solr/tree/jira/solr-14173-2
[jira] [Commented] (SOLR-14319) Add ability to select replicatype to admin ui collection creation
[ https://issues.apache.org/jira/browse/SOLR-14319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056392#comment-17056392 ] Lucene/Solr QA commented on SOLR-14319: ---

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s {color} | {color:red} SOLR-14319 does not apply to master. Rebase required? Wrong branch? See https://wiki.apache.org/solr/HowToContribute#Creating_the_patch_file for help. {color} |

|| Subsystem || Report/Notes ||
| JIRA Issue | SOLR-14319 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12996319/SOLR-14319.patch |
| Console output | https://builds.apache.org/job/PreCommit-SOLR-Build/708/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |

This message was automatically generated.

> Add ability to select replicatype to admin ui collection creation
> Key: SOLR-14319
> URL: https://issues.apache.org/jira/browse/SOLR-14319
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: Admin UI
> Affects Versions: 7.7.2
> Reporter: Richard Goodman
> Priority: Minor
> Attachments: SOLR-14319.patch, Screenshot 2020-03-10 at 16.26.28.png, Screenshot 2020-03-10 at 16.33.52.png
>
> This is just a small patch that allows you to select the replica type when creating a collection. I'm aware that a possible strategy for replica types of a collection can be {{'tlog + pull'}}; because of this, I'm open to feedback on a different way to display this feature. Currently I have a drop-down box defining the types of replicas, defaulting to nrt, and it will take the replication factor specified and create that many replicas of the given type.
[jira] [Commented] (SOLR-14318) Missing dependency on commons-lang in solr-cell 8.4.1
[ https://issues.apache.org/jira/browse/SOLR-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056344#comment-17056344 ] Tim Allison commented on SOLR-14318: Y. Confirmed we removed commons-lang from Tika in 1.23, so 8.5.
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #1332: SOLR-14254: Docs for text tagger: FST50 trade-off
dsmiley commented on a change in pull request #1332: SOLR-14254: Docs for text tagger: FST50 trade-off URL: https://github.com/apache/lucene-solr/pull/1332#discussion_r390581813 ## File path: solr/solr-ref-guide/src/the-tagger-handler.adoc ## @@ -271,11 +271,12 @@ The response should be this (the QTime may vary): }} -== Tagger Tips +== Tagger Performance Tips -Performance Tips: - -* Follow the recommended configuration field settings, especially `postingsFormat=FST50`. +* Follow the recommended configuration field settings above. +Additionally, for the best tagger performance, set `postingsFormat=FST50`. +However, non-default postings formats have no backwards-compatibility guarantees, and so if you upgrade Solr then you may find a nasty exception on startup as it fails to read the older index. +If the input text to be tagged is small (e.g. you are tagging queries or tweets) then the postings format choice isn't as important. Review comment: > I didn't realize that the FST50 vs default performance decreased the smaller the individual document size was The tagger works by looping over each token from the input and doing a term dictionary lookup on the local index. Logically, if your input text is small then there is less work to do than for large input text. Knowing this requires tagger knowledge but not how any particular postings format works. See? No I didn't benchmark this ;-). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
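The loop dsmiley describes above can be sketched in a few lines. This is an illustrative sketch only (not Solr's actual TaggerRequestHandler code, and the class and method names here are hypothetical): the tagger iterates over each token of the input and performs one term-dictionary lookup per token, so the amount of work scales with the input size, whatever the postings format. A plain `Set` stands in for the FST-backed term dictionary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the tagger's core loop: one dictionary lookup per
// input token, so short inputs (queries, tweets) mean very few lookups.
class TaggerSketch {
    static List<String> tag(String input, Set<String> dictionary) {
        List<String> tags = new ArrayList<>();
        for (String token : input.toLowerCase().split("\\s+")) {
            if (dictionary.contains(token)) { // term-dictionary lookup
                tags.add(token);
            }
        }
        return tags;
    }
}
```

This is why the postings-format choice matters less for small inputs: a request tagging a tweet triggers only a handful of lookups, regardless of how fast each lookup is.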
[jira] [Commented] (SOLR-14318) Missing dependency on commons-lang in solr-cell 8.4.1
[ https://issues.apache.org/jira/browse/SOLR-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056321#comment-17056321 ] Kevin Risden commented on SOLR-14318: - I'm guessing this was caused by SOLR-9079 in 8.1. Either way for the most part it makes sense to not run Tika as part of Solr itself and instead use Tika server or some other place to run Tika. Running Tika inside Solr can cause all sorts of issues for stability. > Missing dependency on commons-lang in solr-cell 8.4.1 > - > > Key: SOLR-14318 > URL: https://issues.apache.org/jira/browse/SOLR-14318 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: contrib - Solr Cell (Tika extraction) >Affects Versions: 8.4.1 >Reporter: Markus Günther >Priority: Minor > > During a migration from Solr 7.x to Solr 8.4.1 we noticed that the > commons-lang:commons-lang:2.6 dependency has been removed, and thus, no > longer is part of org.apache.solr:solr-cell. solr-cell however comes bundled > with Apache Tika Parsers (org.apache.tika:tika-parsers) in version 1.19.1 > which - although it is not an explicit dependency - does require > commons-lang:commons-lang:2.6. > This raises an issue when trying to extract the content from Microsoft Access > database files using Tika. See the stacktrace below. 
> {code:java} > java.lang.NoClassDefFoundError: > org/apache/commons/lang/ObjectUtilsjava.lang.NoClassDefFoundError: > org/apache/commons/lang/ObjectUtils at > com.healthmarketscience.jackcess.util.SimpleColumnMatcher.equals(SimpleColumnMatcher.java:74) > at > com.healthmarketscience.jackcess.util.SimpleColumnMatcher.matches(SimpleColumnMatcher.java:46) > at > com.healthmarketscience.jackcess.util.CaseInsensitiveColumnMatcher.matches(CaseInsensitiveColumnMatcher.java:49) > at > com.healthmarketscience.jackcess.impl.CursorImpl.currentRowMatchesImpl(CursorImpl.java:571) > at > com.healthmarketscience.jackcess.impl.CursorImpl.findAnotherRowImpl(CursorImpl.java:627) > at > com.healthmarketscience.jackcess.impl.CursorImpl.findAnotherRow(CursorImpl.java:517) > at > com.healthmarketscience.jackcess.impl.CursorImpl.findFirstRow(CursorImpl.java:494) > at > com.healthmarketscience.jackcess.impl.DatabaseImpl$FallbackTableFinder.findRow(DatabaseImpl.java:2376) > at > com.healthmarketscience.jackcess.impl.DatabaseImpl$TableFinder.findObjectId(DatabaseImpl.java:2176) > at > com.healthmarketscience.jackcess.impl.DatabaseImpl.readSystemCatalog(DatabaseImpl.java:879) > at > com.healthmarketscience.jackcess.impl.DatabaseImpl.(DatabaseImpl.java:534) > at > com.healthmarketscience.jackcess.impl.DatabaseImpl.open(DatabaseImpl.java:401) > at > com.healthmarketscience.jackcess.DatabaseBuilder.open(DatabaseBuilder.java:252) > at > org.apache.tika.parser.microsoft.JackcessParser.parse(JackcessParser.java:94) > at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) at > org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) at > org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143) at > org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72) at > org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:102) > at > 
org.apache.tika.parser.pkg.PackageParser.parseEntry(PackageParser.java:350) > at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:287) at > org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) at > org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) at > org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143) at > org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228) > at > org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:211) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2596) at > org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:799) at > org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:578) at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:419) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:351) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602) > at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540) at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146) > at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) > at > org.eclip
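The root cause of the stack trace above is that jackcess calls `org.apache.commons.lang.ObjectUtils` from commons-lang 2.x, which is no longer on the solr-cell classpath. A minimal probe for the condition (a hypothetical helper, not Solr code) is a `Class.forName` check:

```java
// Hypothetical probe for the missing dependency behind the stack trace above:
// loading org.apache.commons.lang.ObjectUtils fails unless commons-lang 2.x
// (e.g. commons-lang-2.6.jar) is on the classpath.
class CommonsLangProbe {
    static boolean isCommonsLangPresent() {
        try {
            Class.forName("org.apache.commons.lang.ObjectUtils");
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

On a bare JDK, as on a solr-cell 8.4.1 classpath per this report, the probe returns false. A plausible local workaround is adding commons-lang-2.6.jar to the server's lib directory until the Tika upgrade discussed in SOLR-14054 (which moves to commons-lang3) is released.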
[jira] [Commented] (LUCENE-9164) Should not consider ACE a tragedy if IW is closed
[ https://issues.apache.org/jira/browse/LUCENE-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056320#comment-17056320 ] ASF subversion and git services commented on LUCENE-9164: - Commit 845ee75e28b5bb73bd8ec5a8b1d79a46cff7737c in lucene-solr's branch refs/heads/branch_8x from Simon Willnauer [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=845ee75 ] LUCENE-9164: process all events before closing gracefully (#1319) IndexWriter must process all pending events before closing the writer during rollback to prevent AlreadyClosedExceptions from being thrown during event processing which can cause the writer to be closed with a tragic event. > Should not consider ACE a tragedy if IW is closed > - > > Key: LUCENE-9164 > URL: https://issues.apache.org/jira/browse/LUCENE-9164 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Affects Versions: master (9.0), 8.5, 8.4.2 >Reporter: Nhat Nguyen >Assignee: Nhat Nguyen >Priority: Major > Fix For: master (9.0), 8.5 > > Attachments: LUCENE-9164.patch, LUCENE-9164.patch > > Time Spent: 7h 20m > Remaining Estimate: 0h > > If IndexWriter is closed or being closed, AlreadyClosedException is expected. > We should not consider it a tragic event in this case. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
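The ordering described in the commit message above can be illustrated with a simplified sketch. This is not IndexWriter's actual code (the names below are illustrative, and the real rollback path is far more involved); it only shows the fix's invariant: drain all pending events before marking the writer closed, so no event ever observes an already-closed writer.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Simplified illustration of the LUCENE-9164 ordering: process every pending
// event *before* flipping the closed flag, so event processing cannot hit an
// AlreadyClosedException and escalate it into a tragic event.
class WriterSketch {
    private final Queue<Runnable> pendingEvents = new ArrayDeque<>();
    private boolean closed = false;
    private int processed = 0;

    void enqueue(Runnable event) {
        pendingEvents.add(event);
    }

    void rollback() {
        Runnable event;
        while ((event = pendingEvents.poll()) != null) {
            event.run();   // events still see an open writer here
            processed++;
        }
        closed = true;     // close only once the queue is empty
    }

    boolean isClosed() { return closed; }
    int processedCount() { return processed; }
}
```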
[jira] [Updated] (LUCENE-9164) Should not consider ACE a tragedy if IW is closed
[ https://issues.apache.org/jira/browse/LUCENE-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-9164: Fix Version/s: 8.5 master (9.0) Resolution: Fixed Status: Resolved (was: Patch Available) > Should not consider ACE a tragedy if IW is closed > - > > Key: LUCENE-9164 > URL: https://issues.apache.org/jira/browse/LUCENE-9164 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Affects Versions: master (9.0), 8.5, 8.4.2 >Reporter: Nhat Nguyen >Assignee: Nhat Nguyen >Priority: Major > Fix For: master (9.0), 8.5 > > Attachments: LUCENE-9164.patch, LUCENE-9164.patch > > Time Spent: 7h 20m > Remaining Estimate: 0h > > If IndexWriter is closed or being closed, AlreadyClosedException is expected. > We should not consider it a tragic event in this case. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14318) Missing dependency on commons-lang in solr-cell 8.4.1
[ https://issues.apache.org/jira/browse/SOLR-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056318#comment-17056318 ] Kevin Risden commented on SOLR-14318: - [~markus.guenther] I'm pretty sure this won't be addressed in 8.4.1 since Tika was upgraded in SOLR-14054 for 8.5 which should get underway for release soon. Tika upgrade as far as I can tell doesn't use commons-lang anymore and moves to commons-lang3. [~tallison] can you confirm/deny? > Missing dependency on commons-lang in solr-cell 8.4.1 > - > > Key: SOLR-14318 > URL: https://issues.apache.org/jira/browse/SOLR-14318 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: contrib - Solr Cell (Tika extraction) >Affects Versions: 8.4.1 >Reporter: Markus Günther >Priority: Minor
[jira] [Commented] (LUCENE-9164) Should not consider ACE a tragedy if IW is closed
[ https://issues.apache.org/jira/browse/LUCENE-9164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056316#comment-17056316 ] ASF subversion and git services commented on LUCENE-9164: - Commit 79feb93bd962aa65ede05ecf7cc86e9f5cec84a1 in lucene-solr's branch refs/heads/master from Simon Willnauer [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=79feb93 ] LUCENE-9164: process all events before closing gracefully (#1319) IndexWriter must process all pending events before closing the writer during rollback to prevent AlreadyClosedExceptions from being thrown during event processing which can cause the writer to be closed with a tragic event. > Should not consider ACE a tragedy if IW is closed > - > > Key: LUCENE-9164 > URL: https://issues.apache.org/jira/browse/LUCENE-9164 > Project: Lucene - Core > Issue Type: Bug > Components: core/index >Affects Versions: master (9.0), 8.5, 8.4.2 >Reporter: Nhat Nguyen >Assignee: Nhat Nguyen >Priority: Major > Attachments: LUCENE-9164.patch, LUCENE-9164.patch > > Time Spent: 7h 20m > Remaining Estimate: 0h > > If IndexWriter is closed or being closed, AlreadyClosedException is expected. > We should not consider it a tragic event in this case. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] s1monw merged pull request #1319: LUCENE-9164: process all events before closing gracefully
s1monw merged pull request #1319: LUCENE-9164: process all events before closing gracefully URL: https://github.com/apache/lucene-solr/pull/1319
[jira] [Commented] (SOLR-14254) Index backcompat break between 8.3.1 and 8.4.1
[ https://issues.apache.org/jira/browse/SOLR-14254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056306#comment-17056306 ] Jason Gerlowski commented on SOLR-14254: I agree: short of doing something to handle the old index format in Solr itself the best we can do here is a doc fix. Thanks for proposing docs, they look good to me. > Index backcompat break between 8.3.1 and 8.4.1 > -- > > Key: SOLR-14254 > URL: https://issues.apache.org/jira/browse/SOLR-14254 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Jason Gerlowski >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > I believe I found a backcompat break between 8.4.1 and 8.3.1. > I encountered this when a Solr 8.3.1 cluster was upgraded to 8.4.1. On 8.4. > nodes, several collections had cores fail to come up with > {{CorruptIndexException}}: > {code} > 2020-02-10 20:58:26.136 ERROR > (coreContainerWorkExecutor-2-thread-1-processing-n:192.168.1.194:8983_solr) [ > ] o.a.s.c.CoreContainer Error waiting for SolrCore to be loaded on startup > => org.apache.sol > r.common.SolrException: Unable to create core > [testbackcompat_shard1_replica_n1] > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1313) > org.apache.solr.common.SolrException: Unable to create core > [testbackcompat_shard1_replica_n1] > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1313) > ~[?:?] > at > org.apache.solr.core.CoreContainer.lambda$load$13(CoreContainer.java:788) > ~[?:?] > at > com.codahale.metrics.InstrumentedExecutorService$InstrumentedCallable.call(InstrumentedExecutorService.java:202) > ~[metrics-core-4.0.5.jar:4.0.5] > at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?] > at > org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:210) > ~[?:?] 
> at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > ~[?:?] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > ~[?:?] > at java.lang.Thread.run(Thread.java:834) [?:?] > Caused by: org.apache.solr.common.SolrException: Error opening new searcher > at org.apache.solr.core.SolrCore.(SolrCore.java:1072) ~[?:?] > at org.apache.solr.core.SolrCore.(SolrCore.java:901) ~[?:?] > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1292) > ~[?:?] > ... 7 more > Caused by: org.apache.solr.common.SolrException: Error opening new searcher > at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2182) > ~[?:?] > at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2302) > ~[?:?] > at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1132) > ~[?:?] > at org.apache.solr.core.SolrCore.(SolrCore.java:1013) ~[?:?] > at org.apache.solr.core.SolrCore.(SolrCore.java:901) ~[?:?] > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1292) > ~[?:?] > ... 7 more > Caused by: org.apache.lucene.index.CorruptIndexException: codec mismatch: > actual codec=Lucene50PostingsWriterDoc vs expected > codec=Lucene84PostingsWriterDoc > (resource=MMapIndexInput(path="/Users/jasongerlowski/run/solrdata/data/testbackcompat_shard1_replica_n1/data/index/_0_FST50_0.doc")) > at > org.apache.lucene.codecs.CodecUtil.checkHeaderNoMagic(CodecUtil.java:208) > ~[?:?] > at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:198) > ~[?:?] > at > org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:255) ~[?:?] > at > org.apache.lucene.codecs.lucene84.Lucene84PostingsReader.(Lucene84PostingsReader.java:82) > ~[?:?] > at > org.apache.lucene.codecs.memory.FSTPostingsFormat.fieldsProducer(FSTPostingsFormat.java:66) > ~[?:?] > at > org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.(PerFieldPostingsFormat.java:315) > ~[?:?] 
> at > org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:395) > ~[?:?] > at > org.apache.lucene.index.SegmentCoreReaders.(SegmentCoreReaders.java:114) > ~[?:?] > at > org.apache.lucene.index.SegmentReader.(SegmentReader.java:84) ~[?:?] > at > org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:177) > ~[?:?] > at > org.apache.lucene.index.ReadersAndUpdates.getReadOnlyClone(ReadersAndUpdates.java:219) > ~[?:?] > at > org.apache.lucene.index.StandardDirectoryReader.open(Standa
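The {{CorruptIndexException}} above ({{Lucene50PostingsWriterDoc}} vs {{Lucene84PostingsWriterDoc}} on an {{_0_FST50_0.doc}} file) is exactly the back-compat hazard the doc change in PR #1332 warns about: the field was indexed with the non-default FST50 postings format, which carries no backwards-compatibility guarantee. A hedged illustration of the kind of schema setting involved (the fieldType name here is hypothetical, not taken from this report):

```xml
<!-- Illustrative schema fragment; the fieldType name is hypothetical.
     postingsFormat="FST50" opts this field's terms into a non-default
     postings format. Non-default formats have no back-compat guarantee,
     so an index written this way on one version may fail to open after
     an upgrade (as in the stack trace above) until it is reindexed. -->
<fieldType name="tag" class="solr.TextField" postingsFormat="FST50"
           omitNorms="true" omitTermFreqAndPositions="true">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
</fieldType>
```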
[GitHub] [lucene-solr] gerlowskija commented on a change in pull request #1332: SOLR-14254: Docs for text tagger: FST50 trade-off
gerlowskija commented on a change in pull request #1332: SOLR-14254: Docs for text tagger: FST50 trade-off URL: https://github.com/apache/lucene-solr/pull/1332#discussion_r390556194 ## File path: solr/solr-ref-guide/src/the-tagger-handler.adoc ## @@ -271,11 +271,12 @@ The response should be this (the QTime may vary): }} -== Tagger Tips +== Tagger Performance Tips -Performance Tips: - -* Follow the recommended configuration field settings, especially `postingsFormat=FST50`. +* Follow the recommended configuration field settings above. +Additionally, for the best tagger performance, set `postingsFormat=FST50`. +However, non-default postings formats have no backwards-compatibility guarantees, and so if you upgrade Solr then you may find a nasty exception on startup as it fails to read the older index. +If the input text to be tagged is small (e.g. you are tagging queries or tweets) then the postings format choice isn't as important. Review comment: [Q] Interesting. I didn't realize that the FST50 vs default performance decreased the smaller the individual document size was. Did you do a particular performance test to bear this out, or are you just intuiting that behavior from knowing how postingsFormats work? Is the performance comparable even if numTweets or whatever gets large and the posting-lists grow due to the sheer number of tiny docs? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13199) NPE due to unexpected null return value from QueryBitSetProducer.getBitSet
[ https://issues.apache.org/jira/browse/SOLR-13199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056298#comment-17056298 ] Munendra S N commented on SOLR-13199: - {quote} I suspect you are not familiar with how nested documents work and/or this particular DocTransformer. The use-case for a child doc transformer is when the master query matches only parent documents but you want to see their children in the response, attached to it. What the user pages through are parent documents; they only see parent documents. {quote} I think I didn't explain the case properly. Let's consider http://yonik.com/solr-nested-objects/ Suppose the initial query is {{author_s:yonik}}, which matches a parent document, and {{cat_s:fantasy}} is the parentFilter used in childDocTransformer (the child documents are reviews). This works fine. Now say we add a new book by {{author_s:yonik}} which belongs to {{cat_s:biography}}. Once this is added, if I make the same query {{author_s:yonik}} with {{parentFilter=cat_s:fantasy}}, the parentFilter obviously won't match the new {{cat_s:biography}} parent, so getBitSet will return {{null}} and we get an NPE. This is what I was trying to explain, but I made it complicated by introducing pagination etc. I understand that in this case a better parentFilter would be type_s:book instead of cat_s, but in the beginning cat_s is also constant across the parent set, until a new category is introduced. After some more thought, and trying out scenarios like a childless parent (which works fine as long as the parentFilter is proper), it is better to fail the request instead of not returning any children, since the parent does contain children but we are unable to return them because of an improper parentFilter. I will make changes to throw an error in both cases. 
[~dsmiley] Thanks a lot for your patience and detailed explanation > NPE due to unexpected null return value from QueryBitSetProducer.getBitSet > -- > > Key: SOLR-13199 > URL: https://issues.apache.org/jira/browse/SOLR-13199 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: master (9.0) > Environment: h1. Steps to reproduce > * Use a Linux machine. > * Build commit {{ea2c8ba}} of Solr as described in the section below. > * Build the films collection as described below. > * Start the server using the command {{./bin/solr start -f -p 8983 -s > /tmp/home}} > * Request the URL given in the bug description. > h1. Compiling the server > {noformat} > git clone https://github.com/apache/lucene-solr > cd lucene-solr > git checkout ea2c8ba > ant compile > cd solr > ant server > {noformat} > h1. Building the collection > We followed [Exercise > 2|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html#exercise-2] from > the [Solr > Tutorial|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html]. 
The > attached file ({{home.zip}}) gives the contents of folder {{/tmp/home}} that > you will obtain by following the steps below: > {noformat} > mkdir -p /tmp/home > echo '' > > /tmp/home/solr.xml > {noformat} > In one terminal start a Solr instance in foreground: > {noformat} > ./bin/solr start -f -p 8983 -s /tmp/home > {noformat} > In another terminal, create a collection of movies, with no shards and no > replication, and initialize it: > {noformat} > bin/solr create -c films > curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": > {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' > http://localhost:8983/solr/films/schema > curl -X POST -H 'Content-type:application/json' --data-binary > '{"add-copy-field" : {"source":"*","dest":"_text_"}}' > http://localhost:8983/solr/films/schema > ./bin/post -c films example/films/films.json > {noformat} >Reporter: Johannes Kloos >Assignee: Munendra S N >Priority: Minor > Labels: diffblue, newdev > Attachments: SOLR-13199.patch, home.zip > > > Requesting the following URL causes Solr to return an HTTP 500 error response: > {noformat} > http://localhost:8983/solr/films/select?fl=[child%20parentFilter=ge]&q=*:* > {noformat} > The error response seems to be caused by the following uncaught exception: > {noformat} > java.lang.NullPointerException > at > org.apache.solr.response.transform.ChildDocTransformer.transform(ChildDocTransformer.java:92) > at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:103) > at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:1) > at > org.apache.solr.response.TextResponseWriter.writeDocuments(TextResponseWriter.java:184) > at > org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:136) > at > org.apache.solr.common.util.JsonTextWriter.writeNamedListAsMapWithDups(JsonTextWriter.java:386) > at > org.apache.s
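The direction settled on in the comment above, failing the request when the parentFilter matches nothing instead of letting the NPE escape as an HTTP 500, can be sketched as follows. This is illustrative only: the real code is ChildDocTransformer working with Lucene bitsets from QueryBitSetProducer.getBitSet, not {{java.util.BitSet}}, and the class and message here are hypothetical.

```java
import java.util.BitSet;

// Hypothetical sketch: when the parent-filter bitset is null (the filter
// matched no documents), throw a clear error rather than dereferencing it
// and surfacing a NullPointerException as an HTTP 500.
class ParentBitSetCheck {
    static int countParents(BitSet parentBits) {
        if (parentBits == null) {
            throw new IllegalArgumentException(
                "Parent filter matched no documents; check the parentFilter parameter");
        }
        return parentBits.cardinality();
    }
}
```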
[jira] [Commented] (SOLR-14265) Move to admin API to v2 completely
[ https://issues.apache.org/jira/browse/SOLR-14265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056283#comment-17056283 ] Cassandra Targett commented on SOLR-14265: -- The only thing I have is a doc from 2017 that Noble (and Steve also IIRC) wrote as the first documentation on the v2 APIs: https://docs.google.com/document/d/18n9IL6y82C8gnBred6lzG0GLaT3OsZZsBvJQ2YAt72I. At the end there is a mapping of v1 to v2 which was reasonably correct when it was written. There have been lots of changes since then; whole new APIs have been added, some with and some without v2 support. In some cases v2 support has been added for things that were missing back in 2017. Some of my comments in SOLR-11646 also point out a couple of additional gaps that may not be listed in the 2017 docs Noble wrote. > Move to admin API to v2 completely > --- > > Key: SOLR-14265 > URL: https://issues.apache.org/jira/browse/SOLR-14265 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Anshum Gupta >Assignee: Anshum Gupta >Priority: Major > > V2 admin API has been available in Solr for a very long time, making it > difficult for both users and developers to remember and understand which > format to use when. We should move to v2 API completely for all Solr Admin > calls for the following reasons: > # converge code - there are multiple ways of doing the same thing, there's > unwanted back-compat code, and we should get rid of that > # POJO all the way - no more NamedList. I know this would have split > opinions, but I strongly think we should move in this direction. I created > Jira about this specific task in the past and went half way but I think we > should just close this one out now. > # Automatic documentation > # Others > This is just an umbrella Jira for the task. 
Let's create sub-tasks and split > this up as it would require a bunch of rewriting of the code and it makes a > lot of sense to get this out with 9.0 so we don't have to support v1 forever! > There have been some conversations going on about this and it feels like most > folks are happy to go this route. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] janhoy merged pull request #1326: Remove unused scripts in dev-tools folder
janhoy merged pull request #1326: Remove unused scripts in dev-tools folder URL: https://github.com/apache/lucene-solr/pull/1326
[jira] [Commented] (SOLR-14054) Upgrade Tika to 1.23
[ https://issues.apache.org/jira/browse/SOLR-14054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056259#comment-17056259 ] Tim Allison commented on SOLR-14054: Looking... > Upgrade Tika to 1.23 > > > Key: SOLR-14054 > URL: https://issues.apache.org/jira/browse/SOLR-14054 > Project: Solr > Issue Type: Task > Components: contrib - DataImportHandler >Reporter: Tim Allison >Assignee: Tim Allison >Priority: Minor > Fix For: 8.5 > > Attachments: test-documents.7z, > tika-integration-example-9.0.0-SNAPSHOT.tgz > > Time Spent: 20m > Remaining Estimate: 0h > > We just released 1.23. Let's upgrade Tika. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14054) Upgrade Tika to 1.23
[ https://issues.apache.org/jira/browse/SOLR-14054?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056257#comment-17056257 ] Adrien Grand commented on SOLR-14054: - It looks like this issue is responsible for the smoketest build failures, see e.g. https://builds.apache.org/job/Lucene-Solr-SmokeRelease-8.x/369/console. > Upgrade Tika to 1.23 > > > Key: SOLR-14054 > URL: https://issues.apache.org/jira/browse/SOLR-14054 > Project: Solr > Issue Type: Task > Components: contrib - DataImportHandler >Reporter: Tim Allison >Assignee: Tim Allison >Priority: Minor > Fix For: 8.5 > > Attachments: test-documents.7z, > tika-integration-example-9.0.0-SNAPSHOT.tgz > > Time Spent: 20m > Remaining Estimate: 0h > > We just released 1.23. Let's upgrade Tika. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1294: LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches
jpountz commented on a change in pull request #1294: LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches URL: https://github.com/apache/lucene-solr/pull/1294#discussion_r390523437

## File path: lucene/core/src/java/org/apache/lucene/search/SliceExecutor.java
## @@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.lucene.search;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.List;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.Executor;
+import java.util.concurrent.Future;
+import java.util.concurrent.FutureTask;
+import java.util.concurrent.RejectedExecutionException;
+
+/**
+ * Executor which is responsible
+ * for execution of slices based on the current status
+ * of the system and current system load
+ */
+class SliceExecutor {
+  private final Executor executor;
+
+  public SliceExecutor(Executor executor) {
+    this.executor = executor;
+  }
+
+  public List<Future> invokeAll(Collection<FutureTask> tasks) {
Review comment: I wonder whether this is the right API. We could change the return type to `void` and use `Runnable` instead of `FutureTask` and that would still work, right?
The return value isn't really useful since it has the same content as the input collection? So what about making it just: `public void invokeAll(Collection<Runnable> tasks)`?
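The simplification jpountz suggests could look roughly like the following sketch. This is illustrative only — the class name SliceRunner and the caller-thread fallback details are assumptions, not the code that was merged:

```java
import java.util.Collection;
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;

// Sketch of the reviewer's suggested API: accept plain Runnables and return
// nothing, since the caller already holds the tasks it passed in.
class SliceRunner {
  private final Executor executor;

  SliceRunner(Executor executor) {
    this.executor = executor;
  }

  public void invokeAll(Collection<? extends Runnable> tasks) {
    int i = 0;
    for (Runnable task : tasks) {
      if (i == tasks.size() - 1) {
        task.run(); // run the last slice on the caller thread
      } else {
        try {
          executor.execute(task);
        } catch (RejectedExecutionException e) {
          task.run(); // pool saturated: fall back to the caller thread
        }
      }
      ++i;
    }
  }
}
```

A caller that needs the results can still pass `FutureTask`s (a `FutureTask` is a `Runnable`) and read them back from the collection it already holds, which is the reviewer's point.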
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1294: LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches
jpountz commented on a change in pull request #1294: LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches URL: https://github.com/apache/lucene-solr/pull/1294#discussion_r390522773

## File path: lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java
## @@ -933,6 +932,13 @@ public Executor getExecutor() {
     return executor;
   }

+  /**
+   * Returns this searcher's slice execution control plane or null if no executor was provided
+   */
+  public SliceExecutor getSliceExecutor() {
Review comment: we shouldn't make this method public if it returns a pkg-private class, let's make the method pkg-private too? Or even remove it entirely as I'm not seeing any call site for it?
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1294: LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches
jpountz commented on a change in pull request #1294: LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches URL: https://github.com/apache/lucene-solr/pull/1294#discussion_r390515024

## File path: lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java
## @@ -211,6 +213,18 @@ public IndexSearcher(IndexReaderContext context, Executor executor) {
     assert context.isTopLevel: "IndexSearcher's ReaderContext must be topLevel for reader" + context.reader();
     reader = context.reader();
     this.executor = executor;
+    this.sliceExecutor = executor == null ? null : getSliceExecutionControlPlane(executor);
+    this.readerContext = context;
+    leafContexts = context.leaves();
+    this.leafSlices = executor == null ? null : slices(leafContexts);
Review comment: maybe this should delegate to the below constructor?
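The delegation the reviewer suggests is the standard `this(...)` constructor-chaining idiom. A self-contained analogue (class and field names are illustrative, not the actual IndexSearcher):

```java
import java.util.concurrent.Executor;

// Constructor delegation: the two-argument constructor forwards to a fuller
// one via this(...), so the field assignments live in exactly one place.
class SearcherSketch {
  final Executor executor;
  final Object sliceExecutor;

  SearcherSketch(Executor executor) {
    // delegate instead of duplicating the field assignments
    this(executor, executor == null ? null : defaultSliceExecutor(executor));
  }

  SearcherSketch(Executor executor, Object sliceExecutor) {
    this.executor = executor;
    this.sliceExecutor = sliceExecutor;
  }

  static Object defaultSliceExecutor(Executor e) {
    return new Object(); // stand-in for getSliceExecutionControlPlane(e)
  }
}
```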
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1294: LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches
jpountz commented on a change in pull request #1294: LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches URL: https://github.com/apache/lucene-solr/pull/1294#discussion_r390521799

## File path: lucene/core/src/java/org/apache/lucene/search/SliceExecutor.java
## @@ -0,0 +1,105 @@
+package org.apache.lucene.search;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.List;
+import java.util.concurrent.CompletableFuture;
+import java.util.concurrent.Executor;
+import java.util.concurrent.Future;
+import java.util.concurrent.FutureTask;
+import java.util.concurrent.RejectedExecutionException;
+
+/**
+ * Executor which is responsible
+ * for execution of slices based on the current status
+ * of the system and current system load
+ */
+class SliceExecutor {
+  private final Executor executor;
+
+  public SliceExecutor(Executor executor) {
+    this.executor = executor;
+  }
+
+  public List<Future> invokeAll(Collection<FutureTask> tasks) {
+
+    if (tasks == null) {
+      throw new IllegalArgumentException("Tasks is null");
+    }
+
+    if (executor == null) {
+      throw new IllegalArgumentException("Executor is null");
+    }
+
+    List<Future> futures = new ArrayList();
+
+    int i = 0;
+
+    for (FutureTask task : tasks) {
+      boolean shouldExecuteOnCallerThread = false;
+
+      // Execute last task on caller thread
+      if (i == tasks.size() - 1) {
+        shouldExecuteOnCallerThread = true;
+      }
+
+      processTask(task, futures, shouldExecuteOnCallerThread);
+      ++i;
+    }
+
+    return futures;
+  }
+
+  // Helper method to execute a single task
+  protected void processTask(final FutureTask task, final List<Future> futures,
+      final boolean shouldExecuteOnCallerThread) {
+    if (task == null) {
+      throw new IllegalArgumentException("Input is null");
+    }
+
+    if (!shouldExecuteOnCallerThread) {
+      try {
+        executor.execute(task);
+        futures.add(task);
+
+        return;
+      } catch (RejectedExecutionException e) {
+        // Execute on caller thread
+      }
+    }
+
+    runTaskOnCallerThread(task);
+
+    try {
+      futures.add(CompletableFuture.completedFuture(task.get()));
+    } catch (Exception e) {
+      throw new RuntimeException(e);
+    }
+  }
+
+  // Private helper method to run a task on the caller thread
+  private void runTaskOnCallerThread(FutureTask task) {
+    try {
+      task.run();
+    } catch (Exception e) {
+      throw new RuntimeException(e);
+    }
Review comment: we don't need this catch block as task.run() doesn't declare any non-runtime exception?
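The reviewer's point holds even beyond declared exceptions: `FutureTask.run()` never propagates the task's exception at all. It captures any `Throwable` thrown by the wrapped `Callable` and rethrows it, wrapped in an `ExecutionException`, from `get()`:

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

// Demonstrates that FutureTask.run() completes normally even when the task
// throws; the exception surfaces only from get(), wrapped in an
// ExecutionException. The try/catch around task.run() above is dead weight.
public class FutureTaskRunDemo {
  public static boolean runCapturesException() {
    FutureTask<Integer> failing = new FutureTask<>(() -> {
      throw new IllegalStateException("boom");
    });
    failing.run(); // returns normally; the exception is stored, not thrown
    try {
      failing.get();
      return false; // unreachable: get() must throw
    } catch (ExecutionException e) {
      return e.getCause() instanceof IllegalStateException;
    } catch (InterruptedException e) {
      return false; // cannot happen for an already-completed task
    }
  }
}
```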
[jira] [Commented] (SOLR-13199) NPE due to unexpected null return value from QueryBitSetProducer.getBitSet
[ https://issues.apache.org/jira/browse/SOLR-13199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056243#comment-17056243 ] David Smiley commented on SOLR-13199: - {quote}what about a segment with no hits? Presumably it may occurs between regular ones {quote} No, a nested document set is committed atomically; it is not split to other segments. Solr flattens the nest then supplies the list to Lucene which +guarantees+ this. {quote}One case where query could be null even if parentFilter is specified filter is defined on text field and value is stopword {quote} The dev user is then not using an appropriate query. It's mandatory that at least one document in a non-empty index be a parent; otherwise we don't actually have a nested index which is also the dev user's fault if true. {quote}Suppose, user is also using pagination. Fist page returns properly, there is one such parent product which fits the bill and we throw an exception. {quote} I suspect you are not familiar with how nested documents work and/or this particular DocTransformer. The use-case for a child doc transformer is when the master query matches only parent documents but you want to see their children in the response, attached to it. What the user pages through are parent documents; they only see parent documents. If somehow there is some use case where throwing an exception would inhibit a use case we have never thought of, I insist we wait until such a use-case actually presents itself. Otherwise, we are failing to inform the user that they are very probably making a mistake. {quote}Also, I have question if someone uses nestPathField approach(defined in the schema) but doesn't have any children for parents what does childTransformer return? Does it fail the request with valid error or return just the parent products? {quote} What is the "it" in "Does it fail ..." refer to? You probably refer to ChildDocTransformer. 
It attaches no child documents to the parent because there aren't any (perfectly valid!). It would not "return parents" but I think you maybe mean would the master query return parents. What the master query returns is whatever your q/fq match and is not something the transformer affects. > NPE due to unexpected null return value from QueryBitSetProducer.getBitSet > -- > > Key: SOLR-13199 > URL: https://issues.apache.org/jira/browse/SOLR-13199 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: master (9.0) > Environment: h1. Steps to reproduce > * Use a Linux machine. > * Build commit {{ea2c8ba}} of Solr as described in the section below. > * Build the films collection as described below. > * Start the server using the command {{./bin/solr start -f -p 8983 -s > /tmp/home}} > * Request the URL given in the bug description. > h1. Compiling the server > {noformat} > git clone https://github.com/apache/lucene-solr > cd lucene-solr > git checkout ea2c8ba > ant compile > cd solr > ant server > {noformat} > h1. Building the collection > We followed [Exercise > 2|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html#exercise-2] from > the [Solr > Tutorial|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html]. 
The > attached file ({{home.zip}}) gives the contents of folder {{/tmp/home}} that > you will obtain by following the steps below: > {noformat} > mkdir -p /tmp/home > echo '' > > /tmp/home/solr.xml > {noformat} > In one terminal start a Solr instance in foreground: > {noformat} > ./bin/solr start -f -p 8983 -s /tmp/home > {noformat} > In another terminal, create a collection of movies, with no shards and no > replication, and initialize it: > {noformat} > bin/solr create -c films > curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": > {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' > http://localhost:8983/solr/films/schema > curl -X POST -H 'Content-type:application/json' --data-binary > '{"add-copy-field" : {"source":"*","dest":"_text_"}}' > http://localhost:8983/solr/films/schema > ./bin/post -c films example/films/films.json > {noformat} >Reporter: Johannes Kloos >Assignee: Munendra S N >Priority: Minor > Labels: diffblue, newdev > Attachments: SOLR-13199.patch, home.zip > > > Requesting the following URL causes Solr to return an HTTP 500 error response: > {noformat} > http://localhost:8983/solr/films/select?fl=[child%20parentFilter=ge]&q=*:* > {noformat} > The error response seems to be caused by the following uncaught exception: > {noformat} > java.lang.NullPointerException > at > org.apache.solr.response.transform.ChildDocTransformer.transform(ChildDocTransformer.java:92) > at org.apache.solr.response.DocsStrea
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1294: LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches
jpountz commented on a change in pull request #1294: LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches URL: https://github.com/apache/lucene-solr/pull/1294#discussion_r390523963

## File path: lucene/core/src/java/org/apache/lucene/search/SliceExecutor.java
## @@ -0,0 +1,105 @@
+  public List<Future> invokeAll(Collection<FutureTask> tasks) {
+
+    if (tasks == null) {
+      throw new IllegalArgumentException("Tasks is null");
+    }
+
+    if (executor == null) {
+      throw new IllegalArgumentException("Executor is null");
+    }
+
+    List<Future> futures = new ArrayList();
+
+    int i = 0;
+
+    for (FutureTask task : tasks) {
Review comment: we should never use generic types without type parameters, can you address all these compilation warnings?
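One way to address the raw-type warnings the reviewer flags — a sketch with an illustrative type parameter `T`, not the actual fix applied in the PR — is to parameterize the tasks end to end:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

// Minimal sketch of parameterizing the raw FutureTask/Future/List types so
// the unchecked warnings disappear. TypedSliceInvoker is a made-up name.
class TypedSliceInvoker {
  static <T> List<Future<T>> invokeAll(Collection<FutureTask<T>> tasks) {
    List<Future<T>> futures = new ArrayList<>();
    for (FutureTask<T> task : tasks) {
      task.run();        // run on the caller thread for this sketch
      futures.add(task); // a completed FutureTask<T> is itself a Future<T>
    }
    return futures;
  }
}
```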
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1294: LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches
jpountz commented on a change in pull request #1294: LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches URL: https://github.com/apache/lucene-solr/pull/1294#discussion_r390525903

## File path: lucene/core/src/java/org/apache/lucene/search/SliceExecutor.java
## @@ -0,0 +1,105 @@
+    if (!shouldExecuteOnCallerThread) {
+      try {
+        executor.execute(task);
+        futures.add(task);
+
+        return;
+      } catch (RejectedExecutionException e) {
+        // Execute on caller thread
+      }
+    }
+
+    runTaskOnCallerThread(task);
+
+    try {
+      futures.add(CompletableFuture.completedFuture(task.get()));
Review comment: this has the same effect as `futures.add(task)` unless I'm missing something
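The equivalence the reviewer points out can be checked directly: once `task.run()` has completed, the `FutureTask` itself already behaves as a completed `Future`, so wrapping its result in `CompletableFuture.completedFuture(...)` changes nothing observable for callers of `get()`:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

// Shows that adding the completed task directly and wrapping its result in a
// completed CompletableFuture yield futures with the same observable state.
public class CompletedFutureEquivalence {
  public static boolean equivalent() {
    FutureTask<Integer> task = new FutureTask<>(() -> 7);
    task.run();
    Future<Integer> direct = task; // already done after run()
    try {
      Future<Integer> wrapped = CompletableFuture.completedFuture(task.get());
      return direct.isDone() && wrapped.isDone() && direct.get().equals(wrapped.get());
    } catch (Exception e) {
      return false;
    }
  }
}
```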
[jira] [Commented] (SOLR-14265) Move to admin API to v2 completely
[ https://issues.apache.org/jira/browse/SOLR-14265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056226#comment-17056226 ] Anshum Gupta commented on SOLR-14265:
-
Great idea, Cassandra! I was kind of doing that in addition to working through and understanding how the new API is structured. There is more than one way the v2 stuff is written, so I'm also trying to converge on the best option there. Having a list of all the endpoints that we want to support with v2 would be great, and then I plan on moving those APIs over one after the other. If you already have a document or writeup on that mapping/accounting, please feel free to share; that would be a great anchor for this.
> Move to admin API to v2 completely
> ---
>
> Key: SOLR-14265
> URL: https://issues.apache.org/jira/browse/SOLR-14265
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Reporter: Anshum Gupta
> Assignee: Anshum Gupta
> Priority: Major
>
> V2 admin API has been available in Solr for a very long time, making it difficult for both users and developers to remember and understand which format to use when. We should move to v2 API completely for all Solr Admin calls for the following reasons:
> # converge code - there are multiple ways of doing the same thing, there's unwanted back-compat code, and we should get rid of that
> # POJO all the way - no more NamedList. I know this would have split opinions, but I strongly think we should move in this direction. I created Jira about this specific task in the past and went half way but I think we should just close this one out now.
> # Automatic documentation
> # Others
> This is just an umbrella Jira for the task. Let's create sub-tasks and split this up as it would require a bunch of rewriting of the code and it makes a lot of sense to get this out with 9.0 so we don't have to support v1 forever!
> There have been some conversations going on about this and it feels like most folks are happy to go this route.
[GitHub] [lucene-solr] danmuzi opened a new pull request #1336: LUCENE-9270: Update Javadoc about normalizeEntry in the Kuromoji DictionaryBuilder
danmuzi opened a new pull request #1336: LUCENE-9270: Update Javadoc about normalizeEntry in the Kuromoji DictionaryBuilder URL: https://github.com/apache/lucene-solr/pull/1336 The normalizeEntry option is missing from the Javadoc of the Kuromoji DictionaryBuilder. Without this explanation, users don't know what it means until they see the code. Also, if a user follows the usage in the Javadoc, it will not build. Please check the following JIRA issue: [LUCENE-9270](https://issues.apache.org/jira/browse/LUCENE-9270)
[jira] [Created] (LUCENE-9270) Update Javadoc about normalizeEntry in Kuromoji DictionaryBuilder
Namgyu Kim created LUCENE-9270:
--
Summary: Update Javadoc about normalizeEntry in Kuromoji DictionaryBuilder
Key: LUCENE-9270
URL: https://issues.apache.org/jira/browse/LUCENE-9270
Project: Lucene - Core
Issue Type: Improvement
Reporter: Namgyu Kim
Assignee: Namgyu Kim

The normalizeEntry option is missing from the Javadoc of the Kuromoji DictionaryBuilder. Without this explanation, users don't know what it means until they see the code. Also, if a user follows the usage in the Javadoc, it will not build. So the following changes need to be applied:

1) Change usage
before:
java -cp [lucene classpath] org.apache.lucene.analysis.ja.util.DictionaryBuilder \ ${inputDir} ${outputDir} ${encoding}
after:
java -cp [lucene classpath] org.apache.lucene.analysis.ja.util.DictionaryBuilder \ ${inputDir} ${outputDir} ${encoding} *${normalizeEntry}*

2) Add description about normalizeEntry
[GitHub] [lucene-solr] jpountz commented on a change in pull request #1294: LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches
jpountz commented on a change in pull request #1294: LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches URL: https://github.com/apache/lucene-solr/pull/1294#discussion_r390514257

## File path: lucene/core/src/java/org/apache/lucene/search/IndexSearcher.java
## @@ -211,6 +213,18 @@ public IndexSearcher(IndexReaderContext context, Executor executor) {
     assert context.isTopLevel: "IndexSearcher's ReaderContext must be topLevel for reader" + context.reader();
     reader = context.reader();
     this.executor = executor;
+    this.sliceExecutionControlPlane = executor == null ? null : getSliceExecutionControlPlane(executor);
+    this.readerContext = context;
+    leafContexts = context.leaves();
+    this.leafSlices = executor == null ? null : slices(leafContexts);
+  }
+
+  // Package private for testing
+  IndexSearcher(IndexReaderContext context, Executor executor, SliceExecutionControlPlane sliceExecutionControlPlane) {
+    assert context.isTopLevel: "IndexSearcher's ReaderContext must be topLevel for reader" + context.reader();
+    reader = context.reader();
+    this.executor = executor;
+    this.sliceExecutionControlPlane = executor == null ? null : sliceExecutionControlPlane;
Review comment: My point was that it sounds like a bug on the caller of this constructor to pass a null executor and a non-null sliceExecutionControlPlane? So I'd rather have validation around it rather than be lenient and ignore the provided sliceExecutionControlPlane if the executor is null?
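The stricter validation the reviewer prefers could be sketched as follows (names mirror the review thread and are illustrative only): reject the inconsistent argument combination up front rather than silently discarding the control plane.

```java
import java.util.concurrent.Executor;

// Sketch of fail-fast argument validation: a null executor combined with a
// non-null control plane is treated as a caller bug, per the review comment.
final class SearcherArgs {
  static void validate(Executor executor, Object sliceExecutionControlPlane) {
    if (executor == null && sliceExecutionControlPlane != null) {
      throw new IllegalArgumentException(
          "sliceExecutionControlPlane must be null when executor is null");
    }
  }
}
```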
[jira] [Commented] (LUCENE-9269) Blended queries with boolean rewrite can result in inconsistent scores
[ https://issues.apache.org/jira/browse/LUCENE-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056205#comment-17056205 ] Adrien Grand commented on LUCENE-9269:
--
For this particular issue, I think that the right fix would be to fix {{TermQuery#equals}} and {{hashCode}} to take {{perReaderTermState}} into account. Queries shouldn't be considered equal if they might return different scores. I don't think that this would have bad side-effects, as boolean rewrites are generally used for scoring queries, which are not cached (caching being the other typical call-site for Query#equals/hashCode).
> Blended queries with boolean rewrite can result in inconsistent scores
> --
>
> Key: LUCENE-9269
> URL: https://issues.apache.org/jira/browse/LUCENE-9269
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/search
> Affects Versions: 8.4
> Reporter: Michele Palmia
> Priority: Minor
> Attachments: LUCENE-9269-test.patch
>
> If two blended queries are should clauses of a boolean query and are built so that
> * some of their terms are the same
> * their rewrite method is BlendedTermQuery.BOOLEAN_REWRITE
> the docFreq for the overlapping terms used for scoring is picked as follows:
> # if the overlapping terms are not boosted, the df of the term in the first blended query is used
> # if any of the overlapping terms is boosted, the df is picked at (what looks like) random.
> A few examples using a field with 2 terms: f:a (df: 2), and f:b (df: 3).
{code:java}
a)
Blended(f:a f:b) Blended (f:a)
df: 3            df: 2
gets rewritten to:
(f:a)^2.0 (f:b)
df: 3     df: 2

b)
Blended(f:a) Blended(f:a f:b)
df: 2        df: 3
gets rewritten to:
(f:a)^2.0 (f:b)
df: 2     df: 2

c)
Blended(f:a f:b^0.66) Blended (f:a^0.75)
df: 3                 df: 2
gets rewritten to:
(f:a)^1.75 (f:b)^0.66
df: ?      df: 2
{code}
> with ? either 2 or 3, depending on the run.
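A self-contained analogue of the fix Adrien describes (the names PinnedTermQuery and pinnedDocFreq are illustrative stand-ins, not Lucene's actual TermQuery/TermStates): once equality includes the pinned per-reader statistics, two term queries carrying different statistics no longer compare equal, so a boolean rewrite cannot merge them into one clause.

```java
import java.util.Objects;

// Query-like value object whose equality includes its cached per-reader
// statistics. Instances pinned to different stats are never deduplicated.
final class PinnedTermQuery {
  final String term;
  final Integer pinnedDocFreq; // stand-in for TermStates; null = not pinned

  PinnedTermQuery(String term, Integer pinnedDocFreq) {
    this.term = term;
    this.pinnedDocFreq = pinnedDocFreq;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof PinnedTermQuery)) return false;
    PinnedTermQuery that = (PinnedTermQuery) o;
    return term.equals(that.term) && Objects.equals(pinnedDocFreq, that.pinnedDocFreq);
  }

  @Override
  public int hashCode() {
    return Objects.hash(term, pinnedDocFreq);
  }
}
```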
[jira] [Commented] (SOLR-14265) Move to admin API to v2 completely
[ https://issues.apache.org/jira/browse/SOLR-14265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056179#comment-17056179 ] Cassandra Targett commented on SOLR-14265:
--
I was thinking about this today as I had a moment to pick up SOLR-11646 again, and the first thing I noticed was a Collections API command (action=CLUSTERSTATUS) which does not have a v2 counterpart. It made me think that the first thing to do here is possibly an accounting of what does & doesn't have v2 coverage, get those added, and then work on removing v1.
[jira] [Commented] (SOLR-14007) Difference response format for percentile aggregation
[ https://issues.apache.org/jira/browse/SOLR-14007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056132#comment-17056132 ] Munendra S N commented on SOLR-14007: - Thanks [~mkhl] for the review. Thanks [~jbernste] for sharing your thoughts; the idea of a table view looks exciting. My idea was to provide a similar (key-value) view to the normal facet response for percentile. Thank you [~ysee...@gmail.com] for the detailed review. {code:java} Consistency should not be a goal since the Stats component should be deprecated {code} Huge +1 to deprecating the stats component. Having multiple components for the same functionality is unnecessary and a maintenance hassle. I have been working on this for the past few months (at a slower pace than preferred). I have created some tasks for it, though they are not exhaustive; the idea is not to fix all of them but only those which make sense. For example, returning distinctValues could lead to a potential OOM, and there are other ways to achieve the same (say, the terms component with a limit), so it need not be supported in JSON facets. Similarly, for avg on dates, I am not sure of the use case for finding the average of a date; if there is no such case, maybe failing rather than returning some date or double value makes more sense. {code:java} Regardless, what the Stats component currently does should really shouldn't have much bearing on what solution we chose here. {code} Completely agree with this. Even with the current patch, when there are no values the response does not return null for each percentile specified, unlike the stats component. {code:java} For percentile(), if the norm was a single argument, then representing the response as a single value would be natural and multiple values would be an extension (but an exception. {code} My understanding was always the other way around: I always thought of median as the case that could be supported via percentiles. {code:java} I also do question if this change actually makes anyones lives easier. The vast majority of clients would know what they are asking for and hence the form of answer they will get back? {code} I still think having a consistent response format, irrespective of the number of values specified, makes response processing cleaner, without if-else checks. The reason for going with NamedList (initially I built the patch with a list in mind, and I still have it locally) is to make the response as self-contained as possible. I have shared the reasoning behind the approach; irrespective of NamedList (which I would prefer for self-containedness) or list, I would prefer a consistent value type in the response. Let me know if there are any suggestions. > Difference response format for percentile aggregation > - > > Key: SOLR-14007 > URL: https://issues.apache.org/jira/browse/SOLR-14007 > Project: Solr > Issue Type: Sub-task > Components: Facet Module >Reporter: Munendra S N >Assignee: Munendra S N >Priority: Major > Attachments: SOLR-14007.patch > > > For percentile, > in the Stats component, the response format for percentile is {{NamedList}}, but > in JSON facet, the format is either an array or a single value depending on the number > of percentiles specified. > Even if JSON percentile doesn't use NamedList, the response format shouldn't > change based on the number of percentiles -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
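The if-else burden described in the comment above can be made concrete from the client's side: with the current JSON Facet behavior, a percentile value may arrive as a bare number (one percentile requested) or as an array (several requested). Below is a minimal, hypothetical client-side sketch of the resulting type-branching in plain Java; `PercentileResponse` and `normalize` are illustrative names, not SolrJ API.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper: normalize a percentile aggregation value that may be
// either a single number (one percentile asked) or a list (several asked).
class PercentileResponse {
  @SuppressWarnings("unchecked")
  static List<Double> normalize(Object value) {
    if (value instanceof Number) {
      // single percentile requested: response is a bare number
      return Collections.singletonList(((Number) value).doubleValue());
    }
    // multiple percentiles requested: response is already a list
    return (List<Double>) value;
  }
}
```

A consistent response format (always a list, or always a key-value structure per percentile) would make this kind of normalization unnecessary.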
[jira] [Updated] (SOLR-14319) Add ability to select replicatype to admin ui collection creation
[ https://issues.apache.org/jira/browse/SOLR-14319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Richard Goodman updated SOLR-14319: --- Status: Patch Available (was: Open) > Add ability to select replicatype to admin ui collection creation > - > > Key: SOLR-14319 > URL: https://issues.apache.org/jira/browse/SOLR-14319 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) > Components: Admin UI >Affects Versions: 7.7.2 >Reporter: Richard Goodman >Priority: Minor > Attachments: SOLR-14319.patch, Screenshot 2020-03-10 at 16.26.28.png, > Screenshot 2020-03-10 at 16.33.52.png > > > This is just a small patch that allows you to select the replica type when > creating a collection. I'm aware that a possible strategy for replica types > of a collection can be {{'tlog + pull'}}; because of this, I'm open to > feedback on a different way to display this feature. Currently I have a drop > down box defining the types of replicas, with it defaulting to nrt, and it > will take the replication factor specified and create that many replicas of a > given type.
[jira] [Updated] (SOLR-14319) Add ability to select replicatype to admin ui collection creation
[ https://issues.apache.org/jira/browse/SOLR-14319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Richard Goodman updated SOLR-14319: --- Attachment: Screenshot 2020-03-10 at 16.26.28.png Screenshot 2020-03-10 at 16.33.52.png
[jira] [Created] (SOLR-14319) Add ability to select replicatype to admin ui collection creation
Richard Goodman created SOLR-14319: -- Summary: Add ability to select replicatype to admin ui collection creation Key: SOLR-14319 URL: https://issues.apache.org/jira/browse/SOLR-14319 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Components: Admin UI Affects Versions: 7.7.2 Reporter: Richard Goodman Attachments: SOLR-14319.patch, Screenshot 2020-03-10 at 16.26.28.png, Screenshot 2020-03-10 at 16.33.52.png This is just a small patch that allows you to select the replica type when creating a collection. I'm aware that a possible strategy for replica types of a collection can be {{'tlog + pull'}}; because of this, I'm open to feedback on a different way to display this feature. Currently I have a drop-down box defining the types of replicas, defaulting to nrt, and it will take the replication factor specified and create that many replicas of a given type.
[jira] [Updated] (SOLR-14319) Add ability to select replicatype to admin ui collection creation
[ https://issues.apache.org/jira/browse/SOLR-14319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Richard Goodman updated SOLR-14319: --- Attachment: SOLR-14319.patch
[jira] [Comment Edited] (SOLR-13199) NPE due to unexpected null return value from QueryBitSetProducer.getBitSet
[ https://issues.apache.org/jira/browse/SOLR-13199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056103#comment-17056103 ] Munendra S N edited comment on SOLR-13199 at 3/10/20, 4:19 PM: --- [~dsmiley] Thanks for the review. This is in addition to what Mikhail has shared. Initially I was thinking to raise/throw an exception, but then I considered a few cases. {code:java} Likewise if we parse the query and get null, the query is in error. {code} One case where the query could be null even if parentFilter is specified: the filter is defined on a text field and the value is a stopword. I have seen the query resolve to null in many cases, but currently this is the one I can think of. Using a text field for parentFilter is not the right choice, but I don't think we can control usage. So, when the user has specified a perfectly fine filter which resolves to null, should we throw an exception? {code:java} If parentsFilter.getBitSet returns null, then we should throw an error that the user didn't supply a parentFilter matching parent documents {code} parentFilter could be something that matches a smaller parent set rather than the whole parent set. The suggestion to throw an error is good if there is an enforcement that a unique parent condition be part of each document. Suppose the user is also using pagination: the first page returns properly, there is one such parent product which fits the bill, and we throw an exception. The same query then throws an exception depending on the limit and start parameters; not sure if that would be the right choice. I understand both cases are either a bit of a stretch or corner cases, but I'm sharing my reasoning behind the above approach. Let me know if these corner cases don't make much sense and it's okay to fail the request; then I will modify the patch accordingly. Also, I have a question: if someone uses the nestPathField approach (defined in the schema) but doesn't have any children for parents, what does childTransformer return? Does it fail the request with a valid error or return just the parent products? I haven't yet tried nestPathField for indexing parent-children, so just curious. > NPE due to unexpected null return value from QueryBitSetProducer.getBitSet > -- > > Key: SOLR-13199 > URL: https://issues.apache.org/jira/browse/SOLR-13199 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: master (9.0) > Environment: h1. Steps to reproduce > * Use a Linux machine. > * Build commit {{ea2c8ba}} of Solr as described in the section below. > * Build the films collection as described below. > * Start the server using the command {{./bin/solr start -f -p 8983 -s > /tmp/home}} > * Request the URL given in the bug description. > h1. Compiling the server > {noformat} > git clone https://github.com/apache/lucene-solr > cd lucene-solr > git
[jira] [Commented] (SOLR-13199) NPE due to unexpected null return value from QueryBitSetProducer.getBitSet
[ https://issues.apache.org/jira/browse/SOLR-13199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056103#comment-17056103 ] Munendra S N commented on SOLR-13199: - [~dsmiley] Thanks for the review. This is in addition to what Mikhail has shared. Initially I was thinking to raise/throw an exception, but then I considered a few cases. {code:java} Likewise if we parse the query and get null, the query is in error. {code} One case where the query could be null even if parentFilter is specified: the filter is defined on a text field and the value is a stopword. I have seen the query resolve to null in many cases, but currently this is the one I can think of. Using a text field for parentFilter is not the right choice, but I don't think we can control usage. So, when the user has specified a perfectly fine filter which resolves to null, should we throw an exception? {code:java} If parentsFilter.getBitSet returns null, then we should throw an error that the user didn't supply a parentFilter matching parent documents {code} parentFilter could be something that matches a smaller parent set rather than the whole parent set. The suggestion to throw an error is good if there is an enforcement that a unique parent condition be part of each document. Suppose the user is also using pagination: the first page returns properly, there is one such parent product which fits the bill, and we throw an exception. The same query then throws an exception depending on the limit and start parameters; not sure if that would be the right choice. I understand both cases are either a bit of a stretch or corner cases, but I'm sharing my reasoning behind the above approach. Let me know if these corner cases don't make much sense and it's okay to fail the request; then I will modify the patch accordingly. Also, I have a question: if someone uses the nestPathField approach (defined in the schema) but doesn't have any children for parents, what does childTransformer return? 
Does it fail the request with valid error or return just the parent products? I haven't yet tried nestPathField for indexing parent-children. So, just curious. > NPE due to unexpected null return value from QueryBitSetProducer.getBitSet > -- > > Key: SOLR-13199 > URL: https://issues.apache.org/jira/browse/SOLR-13199 > Project: Solr > Issue Type: Bug > Components: search >Affects Versions: master (9.0) > Environment: h1. Steps to reproduce > * Use a Linux machine. > * Build commit {{ea2c8ba}} of Solr as described in the section below. > * Build the films collection as described below. > * Start the server using the command {{./bin/solr start -f -p 8983 -s > /tmp/home}} > * Request the URL given in the bug description. > h1. Compiling the server > {noformat} > git clone https://github.com/apache/lucene-solr > cd lucene-solr > git checkout ea2c8ba > ant compile > cd solr > ant server > {noformat} > h1. Building the collection > We followed [Exercise > 2|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html#exercise-2] from > the [Solr > Tutorial|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html]. 
The > attached file ({{home.zip}}) gives the contents of folder {{/tmp/home}} that > you will obtain by following the steps below: > {noformat} > mkdir -p /tmp/home > echo '' > > /tmp/home/solr.xml > {noformat} > In one terminal start a Solr instance in foreground: > {noformat} > ./bin/solr start -f -p 8983 -s /tmp/home > {noformat} > In another terminal, create a collection of movies, with no shards and no > replication, and initialize it: > {noformat} > bin/solr create -c films > curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": > {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' > http://localhost:8983/solr/films/schema > curl -X POST -H 'Content-type:application/json' --data-binary > '{"add-copy-field" : {"source":"*","dest":"_text_"}}' > http://localhost:8983/solr/films/schema > ./bin/post -c films example/films/films.json > {noformat} >Reporter: Johannes Kloos >Assignee: Munendra S N >Priority: Minor > Labels: diffblue, newdev > Attachments: SOLR-13199.patch, home.zip > > > Requesting the following URL causes Solr to return an HTTP 500 error response: > {noformat} > http://localhost:8983/solr/films/select?fl=[child%20parentFilter=ge]&q=*:* > {noformat} > The error response seems to be caused by the following uncaught exception: > {noformat} > java.lang.NullPointerException > at > org.apache.solr.response.transform.ChildDocTransformer.transform(ChildDocTransformer.java:92) > at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:103) > at org.apache.solr.response.DocsStreamer.next(DocsStreamer.java:1) > at > org.apache.solr.response.TextResponseWrit
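The two null cases debated in this thread — a parentFilter that parses to null (e.g. a stopword-only value on a text field) and a bitset producer that returns null because no parent matched — can be sketched in isolation. This is a hypothetical standalone analogue of the fail-fast option under discussion, not Solr's actual code; `ParentBitSetLookup` and the plain `Function`/`BitSet` types stand in for `QueryBitSetProducer` and Lucene's bitsets.

```java
import java.util.BitSet;
import java.util.function.Function;

// Hypothetical sketch: fail fast with clear messages instead of letting a
// null query or null bitset surface later as a NullPointerException.
class ParentBitSetLookup {
  private final Function<String, BitSet> producer; // stand-in for QueryBitSetProducer

  ParentBitSetLookup(Function<String, BitSet> producer) {
    this.producer = producer;
  }

  BitSet getParents(String parentFilterQuery) {
    if (parentFilterQuery == null) {
      // the filter parsed to null, e.g. a stopword-only value on a text field
      throw new IllegalArgumentException("parentFilter resolved to no query");
    }
    BitSet bits = producer.apply(parentFilterQuery);
    if (bits == null) {
      // the producer found no parent documents for this filter
      throw new IllegalStateException(
          "parentFilter matched no parent documents: " + parentFilterQuery);
    }
    return bits;
  }
}
```

Whether throwing here is right for paginated requests (where an early page may succeed and a later one fail) is exactly the trade-off the comment raises.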
[jira] [Commented] (SOLR-13807) Caching for term facet counts
[ https://issues.apache.org/jira/browse/SOLR-13807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056100#comment-17056100 ] Michael Gibney commented on SOLR-13807: --- Thanks for responding on these points, [~hossman]! Apologies for my delay in responding, but it's taken me a while to dig into actually addressing some of the issues uncovered by testing (just pushed to [PR #751|https://github.com/apache/lucene-solr/pull/751]). Before embarking on a potential major refactor of PR code that is I believe essentially sound, I first wanted to address the test failures in the existing PR and then see where we are with things. The changes required were not large in terms of number of lines of code. Aside from some trivial bug fixes, the substantive issues addressed fell into three categories, broadly speaking: # UIF caching was an afterthought in the initial patch. I knew this at the time I opened the PR (and should have called it out more explicitly) but although I had roughed in some of the cache-entry-building logic as a POC, nothing was ever actually getting inserted in the cache \(!) and not all code branches were covered. It was fairly straightforward to bring UIF into line (and I re-enabled the UIF cases from your initial test). # Cache entry compatibility across different methods of facet processing. I had to clarify that term counts are only eligible for caching when {{prefix==null}} (or {{prefix.isEmpty()}}). (It would be possible to use no-prefix cached term counts to process prefixed facet requests, but I think it makes sense to leave that for later, if at all). Aside from that, missing buckets are collected _inline_ and cached for {{FacetFieldProcessorByArrayDV}}, but are _not_ collected (nor cached) for {{FacetFieldProcessorByArrayUIF}} (or legacy {{DocValuesFacets}}) processing. 
In practice, it's unlikely that the same field would be processed both as UIF (no cached "missing" count) _and_ as DV (cached "missing" count), but the case did come up in testing, and I addressed it by detecting and re-processing with {{*ByArrayDV}}, and replacing the cache entry with the new one that includes "missing" count. The resulting "missing"-inclusive cache-entry is backward-compatible with (may be used by) {{*ByArrayUIF}} and legacy {{DocValuesFacets}} processing implementations. Incidentally, I wonder whether this "inline" collection of "missing" counts is something like what you had in mind with the comment "{{TODO: it would be more efficient to build up a missing DocSet if we need it here anyway.}}"? # Cache key compatibility across blockJoin domain changes. The extant "nested facet" implementation only passes the {{base}} DocSet domain down from parent to child. One of the things this PR had to do was to also track corresponding changes to the {{baseFilters}} – the queries used to generate the {{base}} DocSet domain – because these queries are required for use in facet cache keys. The initial PR punted on the question of blockJoin domain changes, and simply set {{baseFilters = null}}, with a comment in code: "{{unusual case; TODO: can we make a cache key for this base domain?}}". Well I meant "unusual _for me, at the moment_" :); I just had to put the effort into building proper ({{baseFilter}} query) cache keys for these domain changes. In the process, I also realized that tracking {{baseFilters}} down the nested facet tree should probably address "{{TODO: somehow remove responsebuilder dependency}}" – I put a {{nocommit}} comment to that effect (and temporarily throw an {{AssertionError}} to highlight what I think can now be dead code following). I also found myself wondering how exclusion of ancestor tagged filters would affect descendent join/graph/blockjoin domain changes ... but that's a separate issue. 
> Caching for term facet counts > - > > Key: SOLR-13807 > URL: https://issues.apache.org/jira/browse/SOLR-13807 > Project: Solr > Issue Type: New Feature > Components: Facet Module >Affects Versions: master (9.0), 8.2 >Reporter: Michael Gibney >Priority: Minor > Attachments: SOLR-13807__SOLR-13132_test_stub.patch > > > Solr does not have a facet count cache; so for _every_ request, term facets > are recalculated for _every_ (facet) field, by iterating over _every_ field > value for _every_ doc in the result domain, and incrementing the associated > count. > As a result, subsequent requests end up redoing a lot of the same work, > including all associated object allocation, GC, etc. This situation could > benefit from integrated caching. > Because of the domain-based, serial/iterative nature of term facet > calculation, latency is proportional to the size of the result domain. > Consequently, one common/clear manifestation of this issue is
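The cache-key requirements described above — counts are reusable only when both the facet field and the queries defining the base domain match, and DV-built entries additionally carry a "missing" count — can be sketched with a small standalone cache. All names here (`FacetCountCache`, `Key`) are illustrative, not Solr's internal API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: term counts cached per facet field *and* per the
// filter queries that produced the base domain, so a blockJoin or other
// domain change yields a distinct key instead of a stale hit.
class FacetCountCache {
  static final class Key {
    final String field;
    final List<String> baseFilters;   // queries defining the base DocSet domain
    final boolean includesMissing;    // DV-built entries also carry "missing"
    Key(String field, List<String> baseFilters, boolean includesMissing) {
      this.field = field;
      this.baseFilters = baseFilters;
      this.includesMissing = includesMissing;
    }
    @Override public boolean equals(Object o) {
      if (!(o instanceof Key)) return false;
      Key k = (Key) o;
      return field.equals(k.field) && baseFilters.equals(k.baseFilters)
          && includesMissing == k.includesMissing;
    }
    @Override public int hashCode() {
      return field.hashCode() * 31 + baseFilters.hashCode() + (includesMissing ? 1 : 0);
    }
  }

  private final Map<Key, int[]> cache = new HashMap<>();

  // compute on miss; later requests with the same field + domain reuse counts
  int[] getOrCompute(Key key, Supplier<int[]> compute) {
    return cache.computeIfAbsent(key, k -> compute.get());
  }
}
```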
[GitHub] [lucene-solr] aroopganguly opened a new pull request #1335: SOLR-14316 Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method
aroopganguly opened a new pull request #1335: SOLR-14316 Remove unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method URL: https://github.com/apache/lucene-solr/pull/1335 # Description There was an unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method. # Solution Fixed the unchecked type conversion warning in JavaBinCodec's readMapEntry's equals() method. # Tests No new tests added, but the entire existing test suite succeeds with the warning now gone. # Checklist Please review the following and check all that apply: - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability. - [x] I have created a Jira issue and added the issue ID to my pull request title. - [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended) - [x] I have developed this patch against the `master` branch. - [x] I have run `ant precommit` and the appropriate test suite. - [x] I have run `gradlew precommit` and the appropriate test suite. - [ ] I have added tests for my changes. - [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
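The kind of fix the PR describes can be sketched in isolation: an `equals()` implementation that replaces an unchecked cast to a parameterized `Map.Entry<K,V>` with an `instanceof` check and wildcard types, which compiles without warnings. This is a hypothetical standalone analogue, not the actual JavaBinCodec code.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical read-only entry: equals() uses an instanceof check and a
// wildcard cast (Map.Entry<?, ?>) instead of an unchecked cast, then
// compares key and value per the Map.Entry contract.
class ReadOnlyEntry implements Map.Entry<Object, Object> {
  private final Object key;
  private final Object value;

  ReadOnlyEntry(Object key, Object value) {
    this.key = key;
    this.value = value;
  }

  public Object getKey() { return key; }
  public Object getValue() { return value; }
  public Object setValue(Object v) { throw new UnsupportedOperationException(); }

  @Override
  public boolean equals(Object obj) {
    if (!(obj instanceof Map.Entry)) return false;
    Map.Entry<?, ?> other = (Map.Entry<?, ?>) obj;  // wildcard cast: no warning
    return Objects.equals(key, other.getKey())
        && Objects.equals(value, other.getValue());
  }

  @Override
  public int hashCode() {
    // matches the Map.Entry contract: key hash XOR value hash
    return Objects.hashCode(key) ^ Objects.hashCode(value);
  }
}
```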
[jira] [Comment Edited] (LUCENE-8103) QueryValueSource should use TwoPhaseIterator
[ https://issues.apache.org/jira/browse/LUCENE-8103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056067#comment-17056067 ] Michele Palmia edited comment on LUCENE-8103 at 3/10/20, 3:37 PM: -- Thanks a lot - I had not fully grasped the approximation mechanism and the {{TPI.asDocIdSetIterator(tpi)}} implementation. I uploaded an updated patch. was (Author: micpalmia): Thanks a lot - I had not grasped the approximation mechanism and the {{TPI.asDocIdSetIterator(tpi)}} implementation. I uploaded an updated patch. > QueryValueSource should use TwoPhaseIterator > > > Key: LUCENE-8103 > URL: https://issues.apache.org/jira/browse/LUCENE-8103 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/other >Reporter: David Smiley >Priority: Minor > Attachments: LUCENE-8103.patch > > > QueryValueSource (in "queries" module) is a ValueSource representation of a > Query; the score is the value. It ought to try to use a TwoPhaseIterator > from the query if it can be offered. This will prevent possibly expensive > advancing beyond documents that we aren't interested in.
[jira] [Commented] (LUCENE-8103) QueryValueSource should use TwoPhaseIterator
[ https://issues.apache.org/jira/browse/LUCENE-8103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056067#comment-17056067 ] Michele Palmia commented on LUCENE-8103: Thanks a lot - I had not grasped the approximation mechanism and the {{TPI.asDocIdSetIterator(tpi)}} implementation. I uploaded an updated patch. > QueryValueSource should use TwoPhaseIterator > > > Key: LUCENE-8103 > URL: https://issues.apache.org/jira/browse/LUCENE-8103 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/other >Reporter: David Smiley >Priority: Minor > Attachments: LUCENE-8103.patch > > > QueryValueSource (in "queries" module) is a ValueSource representation of a > Query; the score is the value. It ought to try to use a TwoPhaseIterator > from the query if it can be offered. This will prevent possibly expensive > advancing beyond documents that we aren't interested in.
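The approximation mechanism mentioned in this thread can be sketched outside Lucene: a two-phase iterator exposes a cheap approximation (a superset of matching docs) and runs the expensive `matches()` confirmation only for candidates the caller actually inspects, so advancing past uninteresting docs skips the expensive check. The class below is a hypothetical standalone analogue, not Lucene's `TwoPhaseIterator` API.

```java
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical two-phase sketch: `approximation` is a cheap ascending list of
// candidate doc ids; `matches` is the expensive per-doc confirmation.
class TwoPhaseSketch {
  final List<Integer> approximation;
  final IntPredicate matches;

  TwoPhaseSketch(List<Integer> approximation, IntPredicate matches) {
    this.approximation = approximation;
    this.matches = matches;
  }

  // Advance to the first confirmed match at or beyond target. Candidates
  // before target are skipped without ever running the expensive check.
  int advanceConfirmed(int target) {
    for (int doc : approximation) {
      if (doc >= target && matches.test(doc)) {
        return doc;
      }
    }
    return Integer.MAX_VALUE; // exhausted, like DocIdSetIterator.NO_MORE_DOCS
  }
}
```

A value source backed by a plain iterator would instead confirm every candidate it crosses, which is the cost the issue aims to avoid.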
[jira] [Updated] (LUCENE-8103) QueryValueSource should use TwoPhaseIterator
[ https://issues.apache.org/jira/browse/LUCENE-8103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michele Palmia updated LUCENE-8103: --- Attachment: (was: LUCENE-8103.patch)
[jira] [Updated] (LUCENE-8103) QueryValueSource should use TwoPhaseIterator
[ https://issues.apache.org/jira/browse/LUCENE-8103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michele Palmia updated LUCENE-8103: --- Attachment: LUCENE-8103.patch
[GitHub] [lucene-solr] atris commented on issue #1294: LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches
atris commented on issue #1294: LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches URL: https://github.com/apache/lucene-solr/pull/1294#issuecomment-597139106 @jpountz Any thoughts on this one?
[jira] [Commented] (LUCENE-9266) ant nightly-smoke fails due to presence of build.gradle
[ https://issues.apache.org/jira/browse/LUCENE-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056023#comment-17056023 ] Mike Drob commented on LUCENE-9266: --- Gradle itself is master only, but there are other failures on 8x nightly currently ([https://builds.apache.org/job/Lucene-Solr-SmokeRelease-8.x/372/]) that I'm sure I will discover on master once I get through whatever else is failing. > ant nightly-smoke fails due to presence of build.gradle > --- > > Key: LUCENE-9266 > URL: https://issues.apache.org/jira/browse/LUCENE-9266 > Project: Lucene - Core > Issue Type: Sub-task >Reporter: Mike Drob >Priority: Major > Time Spent: 20m > Remaining Estimate: 0h > > Seen on Jenkins - > [https://builds.apache.org/job/Lucene-Solr-SmokeRelease-master/1617/console] > > Reproduced locally.
[jira] [Commented] (LUCENE-9236) Having a modular Doc Values format
[ https://issues.apache.org/jira/browse/LUCENE-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056013#comment-17056013 ] juan camilo rodriguez duran commented on LUCENE-9236: - [~jpountz] This is step 2 of the Jira issue; I want to know what you think about step one: only splitting the big classes and then making the reader and writer parts more symmetric. > Having a modular Doc Values format > -- > > Key: LUCENE-9236 > URL: https://issues.apache.org/jira/browse/LUCENE-9236 > Project: Lucene - Core > Issue Type: Improvement > Components: core/index >Reporter: juan camilo rodriguez duran >Priority: Minor > Labels: docValues > > Today, DocValues Consumer/Producer require overriding 5 different methods, even > if you only want to use one, given that a field can only support > one doc values type at a time. > > In the attached PR I’ve implemented a new modular version of those classes > (consumer/producer), each one having a single responsibility and writing to > the same unique file. > This is mainly a refactor of the existing format, opening the possibility to > override or implement the sub-format you need. > > I’ll do it in 3 steps: > # Create a CompositeDocValuesFormat, moving the code of > Lucene80DocValuesFormat into separate classes without modifying the inner > code. At the same time I created a Lucene85CompositeDocValuesFormat based on > these changes. > # I’ll introduce some basic components for writing doc values in general, > such as: > ## DocumentIdSetIterator Serializer: used in each type of field, based on an > IndexedDISI. > ## Document Ordinals Serializer: used in Sorted and SortedSet to > deduplicate values using a dictionary. > ## Document Boundaries Serializer (optional, used only for multivalued > fields: SortedNumeric and SortedSet) > ## TermsEnum Serializer: used to write and read the terms dictionary for > Sorted and SortedSet doc values. 
> # I’ll create the new Sub-DocValues format using the previous components. > > PR: [https://github.com/apache/lucene-solr/pull/1282]
[jira] [Comment Edited] (LUCENE-8674) UnsupportedOperationException due to call to o.a.l.q.f.FunctionValues.floatVal
[ https://issues.apache.org/jira/browse/LUCENE-8674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055990#comment-17055990 ] Michele Palmia edited comment on LUCENE-8674 at 3/10/20, 2:20 PM: -- The problematic query ( {{?fq=\{!frange l=10 u=100}or_version_s,directed_by}} ) specifies two value sources separated by a comma ({{or_version_s,directed_by}}). These are parsed as a {{VectorValueSource}} embedding the two individual ValueSources corresponding to the two fields (see [FunctionQParser.java:115|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/search/FunctionQParser.java#L115]). was (Author: micpalmia): The problematic query ( {{?fq={!frange%20l=10%20u=100}or_version_s,directed_by}} ) specifies two value sources separated by a comma ({{or_version_s,directed_by}}). These are parsed as a {{VectorValueSource}} embedding the two individual ValueSources corresponding to the two fields (see [FunctionQParser.java:115|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/search/FunctionQParser.java#L115]). > UnsupportedOperationException due to call to o.a.l.q.f.FunctionValues.floatVal > -- > > Key: LUCENE-8674 > URL: https://issues.apache.org/jira/browse/LUCENE-8674 > Project: Lucene - Core > Issue Type: Bug > Components: core/query/scoring >Affects Versions: master (9.0) > Environment: h1. Steps to reproduce > * Use a Linux machine. > * Build commit {{ea2c8ba}} of Solr as described in the section below. > * Build the films collection as described below. > * Start the server using the command {{./bin/solr start -f -p 8983 -s > /tmp/home}} > * Request the URL given in the bug description. > h1. Compiling the server > {noformat} > git clone https://github.com/apache/lucene-solr > cd lucene-solr > git checkout ea2c8ba > ant compile > cd solr > ant server > {noformat} > h1. 
Building the collection and reproducing the bug > We followed [Exercise > 2|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html#exercise-2] from > the [Solr > Tutorial|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html]. > {noformat} > mkdir -p /tmp/home > echo '<solr></solr>' > > /tmp/home/solr.xml > {noformat} > In one terminal start a Solr instance in foreground: > {noformat} > ./bin/solr start -f -p 8983 -s /tmp/home > {noformat} > In another terminal, create a collection of movies, with no shards and no > replication, and initialize it: > {noformat} > bin/solr create -c films > curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": > {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' > http://localhost:8983/solr/films/schema > curl -X POST -H 'Content-type:application/json' --data-binary > '{"add-copy-field" : {"source":"*","dest":"_text_"}}' > http://localhost:8983/solr/films/schema > ./bin/post -c films example/films/films.json > curl -v “URL_BUG” > {noformat} > Please check the issue description below to find the “URL_BUG” that will > allow you to reproduce the issue reported. 
>Reporter: Johannes Kloos >Priority: Minor > Labels: diffblue, newdev > > Requesting the following URL causes Solr to return an HTTP 500 error response: > {noformat} > http://localhost:8983/solr/films/select?fq={!frange%20l=10%20u=100}or_version_s,directed_by > {noformat} > The error response seems to be caused by the following uncaught exception: > {noformat} > java.lang.UnsupportedOperationException > at > org.apache.lucene.queries.function.FunctionValues.floatVal(FunctionValues.java:47) > at > org.apache.lucene.queries.function.FunctionValues$3.matches(FunctionValues.java:188) > at > org.apache.lucene.queries.function.ValueSourceScorer$1.matches(ValueSourceScorer.java:53) > at > org.apache.lucene.search.TwoPhaseIterator$TwoPhaseIteratorAsDocIdSetIterator.doNext(TwoPhaseIterator.java:89) > at > org.apache.lucene.search.TwoPhaseIterator$TwoPhaseIteratorAsDocIdSetIterator.nextDoc(TwoPhaseIterator.java:77) > at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:261) > at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:214) > at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39) > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:652) > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443) > at org.apache.solr.search.DocSetUtil.createDocSetGeneric(DocSetUtil.java:151) > at org.apache.solr.search.DocSetUtil.createDocSet(DocSetUtil.java:140) > at > org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:1177) > at > org.apache.solr.search.SolrIndexSearcher.getPositiveDocSet(SolrIndexSearcher.java:817) > at > org.apach
[jira] [Commented] (LUCENE-8674) UnsupportedOperationException due to call to o.a.l.q.f.FunctionValues.floatVal
[ https://issues.apache.org/jira/browse/LUCENE-8674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055990#comment-17055990 ] Michele Palmia commented on LUCENE-8674: The problematic query ( {{?fq={!frange%20l=10%20u=100}or_version_s,directed_by}} ) specifies two value sources separated by a comma ({{or_version_s,directed_by}}). These are parsed as a {{VectorValueSource}} embedding the two individual ValueSources corresponding to the two fields (see [FunctionQParser.java:115|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/search/FunctionQParser.java#L115]). > UnsupportedOperationException due to call to o.a.l.q.f.FunctionValues.floatVal > -- > > Key: LUCENE-8674 > URL: https://issues.apache.org/jira/browse/LUCENE-8674 > Project: Lucene - Core > Issue Type: Bug > Components: core/query/scoring >Affects Versions: master (9.0) > Environment: h1. Steps to reproduce > * Use a Linux machine. > * Build commit {{ea2c8ba}} of Solr as described in the section below. > * Build the films collection as described below. > * Start the server using the command {{./bin/solr start -f -p 8983 -s > /tmp/home}} > * Request the URL given in the bug description. > h1. Compiling the server > {noformat} > git clone https://github.com/apache/lucene-solr > cd lucene-solr > git checkout ea2c8ba > ant compile > cd solr > ant server > {noformat} > h1. Building the collection and reproducing the bug > We followed [Exercise > 2|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html#exercise-2] from > the [Solr > Tutorial|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html]. 
> {noformat} > mkdir -p /tmp/home > echo '' > > /tmp/home/solr.xml > {noformat} > In one terminal start a Solr instance in foreground: > {noformat} > ./bin/solr start -f -p 8983 -s /tmp/home > {noformat} > In another terminal, create a collection of movies, with no shards and no > replication, and initialize it: > {noformat} > bin/solr create -c films > curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": > {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' > http://localhost:8983/solr/films/schema > curl -X POST -H 'Content-type:application/json' --data-binary > '{"add-copy-field" : {"source":"*","dest":"_text_"}}' > http://localhost:8983/solr/films/schema > ./bin/post -c films example/films/films.json > curl -v “URL_BUG” > {noformat} > Please check the issue description below to find the “URL_BUG” that will > allow you to reproduce the issue reported. >Reporter: Johannes Kloos >Priority: Minor > Labels: diffblue, newdev > > Requesting the following URL causes Solr to return an HTTP 500 error response: > {noformat} > http://localhost:8983/solr/films/select?fq={!frange%20l=10%20u=100}or_version_s,directed_by > {noformat} > The error response seems to be caused by the following uncaught exception: > {noformat} > java.lang.UnsupportedOperationException > at > org.apache.lucene.queries.function.FunctionValues.floatVal(FunctionValues.java:47) > at > org.apache.lucene.queries.function.FunctionValues$3.matches(FunctionValues.java:188) > at > org.apache.lucene.queries.function.ValueSourceScorer$1.matches(ValueSourceScorer.java:53) > at > org.apache.lucene.search.TwoPhaseIterator$TwoPhaseIteratorAsDocIdSetIterator.doNext(TwoPhaseIterator.java:89) > at > org.apache.lucene.search.TwoPhaseIterator$TwoPhaseIteratorAsDocIdSetIterator.nextDoc(TwoPhaseIterator.java:77) > at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:261) > at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:214) > at 
org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39) > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:652) > at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443) > at org.apache.solr.search.DocSetUtil.createDocSetGeneric(DocSetUtil.java:151) > at org.apache.solr.search.DocSetUtil.createDocSet(DocSetUtil.java:140) > at > org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:1177) > at > org.apache.solr.search.SolrIndexSearcher.getPositiveDocSet(SolrIndexSearcher.java:817) > at > org.apache.solr.search.SolrIndexSearcher.getProcessedFilter(SolrIndexSearcher.java:1025) > at > org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1540) > at > org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1420) > at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:567) > at > org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1434) > {noformat} > Sadly, I can't understand the logic of this code
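The failure mode can be sketched without Lucene on the classpath. The classes below are illustrative stand-ins, not the real Lucene classes: the abstract base's {{floatVal}} throws by default, and a vector-of-sources value source never overrides it (it has no single scalar per document), so anything that asks it for one float — as frange does — hits {{UnsupportedOperationException}}.

```java
// Illustrative sketch (NOT the real Lucene classes) of why frange over
// "or_version_s,directed_by" fails: the base implementation throws, and a
// vector source has no scalar value to return, so it never overrides floatVal.
abstract class FunctionValuesSketch {
    float floatVal(int doc) {
        throw new UnsupportedOperationException();
    }
}

class SingleFieldValues extends FunctionValuesSketch {
    @Override
    float floatVal(int doc) {
        return 42.0f; // a single field can yield one float per document
    }
}

class VectorValues extends FunctionValuesSketch {
    // Intentionally no floatVal override: a pair of fields has no scalar value,
    // so the inherited throwing implementation is what callers get.
    float[] vectorVal(int doc) {
        return new float[] {1.0f, 2.0f};
    }
}
```

In this model, frange's per-document match check corresponds to calling {{floatVal}}, which works for a single-field source and throws for the vector one.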
[jira] [Created] (SOLR-14318) Missing dependency on commons-lang in solr-cell 8.4.1
Markus Günther created SOLR-14318: - Summary: Missing dependency on commons-lang in solr-cell 8.4.1 Key: SOLR-14318 URL: https://issues.apache.org/jira/browse/SOLR-14318 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: contrib - Solr Cell (Tika extraction) Affects Versions: 8.4.1 Reporter: Markus Günther During a migration from Solr 7.x to Solr 8.4.1 we noticed that the commons-lang:commons-lang:2.6 dependency has been removed and thus is no longer part of org.apache.solr:solr-cell. solr-cell, however, comes bundled with Apache Tika Parsers (org.apache.tika:tika-parsers) in version 1.19.1, which - although it is not an explicit dependency - does require commons-lang:commons-lang:2.6. This raises an issue when trying to extract the content from Microsoft Access database files using Tika. See the stacktrace below. {code:java} java.lang.NoClassDefFoundError: org/apache/commons/lang/ObjectUtils at com.healthmarketscience.jackcess.util.SimpleColumnMatcher.equals(SimpleColumnMatcher.java:74) at com.healthmarketscience.jackcess.util.SimpleColumnMatcher.matches(SimpleColumnMatcher.java:46) at com.healthmarketscience.jackcess.util.CaseInsensitiveColumnMatcher.matches(CaseInsensitiveColumnMatcher.java:49) at com.healthmarketscience.jackcess.impl.CursorImpl.currentRowMatchesImpl(CursorImpl.java:571) at com.healthmarketscience.jackcess.impl.CursorImpl.findAnotherRowImpl(CursorImpl.java:627) at com.healthmarketscience.jackcess.impl.CursorImpl.findAnotherRow(CursorImpl.java:517) at com.healthmarketscience.jackcess.impl.CursorImpl.findFirstRow(CursorImpl.java:494) at com.healthmarketscience.jackcess.impl.DatabaseImpl$FallbackTableFinder.findRow(DatabaseImpl.java:2376) at com.healthmarketscience.jackcess.impl.DatabaseImpl$TableFinder.findObjectId(DatabaseImpl.java:2176) at 
com.healthmarketscience.jackcess.impl.DatabaseImpl.readSystemCatalog(DatabaseImpl.java:879) at com.healthmarketscience.jackcess.impl.DatabaseImpl.&lt;init&gt;(DatabaseImpl.java:534) at com.healthmarketscience.jackcess.impl.DatabaseImpl.open(DatabaseImpl.java:401) at com.healthmarketscience.jackcess.DatabaseBuilder.open(DatabaseBuilder.java:252) at org.apache.tika.parser.microsoft.JackcessParser.parse(JackcessParser.java:94) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143) at org.apache.tika.parser.DelegatingParser.parse(DelegatingParser.java:72) at org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor.parseEmbedded(ParsingEmbeddedDocumentExtractor.java:102) at org.apache.tika.parser.pkg.PackageParser.parseEntry(PackageParser.java:350) at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:287) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280) at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143) at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:228) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:211) at org.apache.solr.core.SolrCore.execute(SolrCore.java:2596) at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:799) at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:578) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:419) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:351) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602) at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1711) at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1347) at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.ja
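A workaround until the packaging is fixed is to declare the missing artifact explicitly in the consuming build. This assumes a Maven build; the coordinates and version are the ones the report names as what Tika 1.19.1 expects.

```xml
<!-- Work around the missing transitive dependency by declaring it directly
     next to solr-cell. Version 2.6 is the one required per the report. -->
<dependency>
  <groupId>commons-lang</groupId>
  <artifactId>commons-lang</artifactId>
  <version>2.6</version>
</dependency>
```

Gradle users would add the equivalent `commons-lang:commons-lang:2.6` coordinate to their runtime dependencies.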
[jira] [Commented] (SOLR-8306) Enhance ExpandComponent to allow expand.hits=0
[ https://issues.apache.org/jira/browse/SOLR-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055977#comment-17055977 ] Amelia Henderson commented on SOLR-8306: Added a Github PR that includes Marshall's work and some tests. > Enhance ExpandComponent to allow expand.hits=0 > -- > > Key: SOLR-8306 > URL: https://issues.apache.org/jira/browse/SOLR-8306 > Project: Solr > Issue Type: Improvement >Affects Versions: 5.3.1 >Reporter: Marshall Sanders >Priority: Minor > Labels: expand > Fix For: 5.5 > > Attachments: SOLR-8306.patch, SOLR-8306.patch, > SOLR-8306_branch_5x@1715230.patch > > Time Spent: 10m > Remaining Estimate: 0h > > This enhancement allows the ExpandComponent to allow expand.hits=0 for those > who don't want an expanded document returned and only want the numFound from > the expand section. > This is useful for "See 54 more like this" use cases, but without the > performance hit of gathering an entire expanded document. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] ameliahenderson opened a new pull request #1334: SOLR-8306: Enhance ExpandComponent to allow expand.hits=0
ameliahenderson opened a new pull request #1334: SOLR-8306: Enhance ExpandComponent to allow expand.hits=0 URL: https://github.com/apache/lucene-solr/pull/1334 # Description Please provide a short description of the changes you're making with this pull request. # Solution Please provide a short description of the approach taken to implement your solution. # Tests Please describe the tests you've developed or run to confirm this patch implements the feature or solves the problem. # Checklist Please review the following and check all that apply: - [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability. - [x] I have created a Jira issue and added the issue ID to my pull request title. - [ ] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended) - [x] I have developed this patch against the `master` branch. - [x] I have run `ant precommit` and the appropriate test suite. - [x] I have added tests for my changes. - [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-9269) Blended queries with boolean rewrite can result in inconsistent scores
[ https://issues.apache.org/jira/browse/LUCENE-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055927#comment-17055927 ] Michele Palmia edited comment on LUCENE-9269 at 3/10/20, 1:15 PM: -- I was actually just looking at a [user report|https://mail-archives.apache.org/mod_mbox/lucene-dev/202003.mbox/%3CCALyzSEn%2BQFoT3MpNYkxw-dEK9jc59mSTvXqccuUVMMDAgOMMmA%40mail.gmail.com%3E] that came to lucene-dev and looked interesting. In their use case, they were using fuzzy queries, that in turn generate blended queries that are affected by this issue. Maybe users of BlendedQuery/FuzzyQuery should be able to find some form of warning in the docs (while [LUCENE-8840|https://issues.apache.org/jira/browse/LUCENE-8840] is not fixed)? was (Author: micpalmia): I was actually just looking at a [user report|https://mail-archives.apache.org/mod_mbox/lucene-dev/202003.mbox/%3CCALyzSEn%2BQFoT3MpNYkxw-dEK9jc59mSTvXqccuUVMMDAgOMMmA%40mail.gmail.com%3E] that came to lucene-dev and looked interesting. In their use case, they were using fuzzy queries, that in turn generate blended queries that are affected by this issue. Maybe users of BlendedQuery/FuzzyQuery should be able to find some form of warning in the docs? 
> Blended queries with boolean rewrite can result in inconstitent scores > -- > > Key: LUCENE-9269 > URL: https://issues.apache.org/jira/browse/LUCENE-9269 > Project: Lucene - Core > Issue Type: Bug > Components: core/search >Affects Versions: 8.4 >Reporter: Michele Palmia >Priority: Minor > Attachments: LUCENE-9269-test.patch > > > If two blended queries are should clauses of a boolean query and are built so > that > * some of their terms are the same > * their rewrite method is BlendedTermQuery.BOOLEAN_REWRITE > the docFreq for the overlapping terms used for scoring is picked as follow: > # if the overlapping terms are not boosted, the df of the term in the first > blended query is used > # if any of the overlapping terms is boosted, the df is picked at (what > looks like) random. > A few examples using a field with 2 terms: f:a (df: 2), and f:b (df: 3). > {code:java} > a) > Blended(f:a f:b) Blended (f:a) > df: 3 df: 2 > gets rewritten to: > (f:a)^2.0 (f:b) > df: 3 df:2 > b) > Blended(f:a) Blended(f:a f:b) > df: 2df: 3 > gets rewritten to: > (f:a)^2.0 (f:b) > df: 2 df:2 > c) > Blended(f:a f:b^0.66) Blended (f:a^0.75) > df: 3 df: 2 > gets rewritten to: > (f:a)^1.75 (f:b)^0.66 > df:? df:2 > {code} > with ? either 2 or 3, depending on the run. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
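The order dependence in cases a) and b) above can be modeled without Lucene: if deduplication keeps the stats of the first occurrence of a term, the surviving df depends purely on which blended query contributed the term first. The class below is an illustrative model of that keep-first behavior, not the actual BlendedTermQuery/BooleanQuery code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative dedup model (NOT the actual BlendedTermQuery code): when two
// clauses carry the same term with different df stats, keep-first semantics
// make the surviving df depend on clause order, matching examples a) and b).
// When clause order is itself arbitrary (e.g. served from a hash-based set,
// as with boosted overlapping terms in case c), the df becomes arbitrary too.
class TermStats {
    static Map<String, Integer> dedupeKeepFirst(String[] terms, int[] dfs) {
        Map<String, Integer> kept = new LinkedHashMap<>();
        for (int i = 0; i < terms.length; i++) {
            kept.putIfAbsent(terms[i], dfs[i]); // first occurrence wins
        }
        return kept;
    }
}
```

With ordered input this is deterministic, which is why cases a) and b) give stable but different df values for f:a; the randomness in case c) only appears once the clause iteration order stops being stable.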
[jira] [Resolved] (SOLR-14139) Support backtick phrase queries in Streaming Expressions
[ https://issues.apache.org/jira/browse/SOLR-14139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Bernstein resolved SOLR-14139. --- Fix Version/s: 8.5 Resolution: Resolved > Support backtick phrase queries in Streaming Expressions > > > Key: SOLR-14139 > URL: https://issues.apache.org/jira/browse/SOLR-14139 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Priority: Major > Fix For: 8.5 > > Attachments: SOLR-14139.patch, SOLR-14139.patch > > > Currently in order to make phrase queries in Streaming Expressions you must > escape the quotes as follows: > {code:java} > search(collection1, q="fieldA:\"hello world\" AND fieldB:two"){code} > This ticket will allow phrase queries to be entered with back ticks as > follows: > {code:java} > search(collection1, q="fieldA:`hello world` AND fieldB:two") {code} > Back ticks are nice because they are infrequently searched on and people in > the SQL world are used to back ticks meaning "take the literal value of this > string". > Under the covers back ticks will be translated to double quotes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
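The described translation ("back ticks will be translated to double quotes") is, at its core, a character substitution applied to the expression's query string before it reaches the query parser. The helper below is an illustrative sketch of that behavior, not the actual Streaming Expressions parser code.

```java
// Illustrative sketch (not the actual Solr parser code): backtick-delimited
// phrases in a streaming expression's q parameter are rewritten to
// double-quoted phrases before the string is handed to the query parser.
class BacktickPhrases {
    static String translate(String q) {
        return q.replace('`', '"');
    }
}
```

So `search(collection1, q="fieldA:`hello world` AND fieldB:two")` would carry the same query to the parser as the escaped-quote form shown above.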
[jira] [Commented] (SOLR-14139) Support backtick phrase queries in Streaming Expressions
[ https://issues.apache.org/jira/browse/SOLR-14139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055939#comment-17055939 ] ASF subversion and git services commented on SOLR-14139: Commit d3c2afec4fbf9501279034e8d6aca4c5af797616 in lucene-solr's branch refs/heads/branch_8_5 from Joel Bernstein [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d3c2afe ] SOLR-14139: Update CHANGE.txt > Support backtick phrase queries in Streaming Expressions > > > Key: SOLR-14139 > URL: https://issues.apache.org/jira/browse/SOLR-14139 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Priority: Major > Attachments: SOLR-14139.patch, SOLR-14139.patch > > > Currently in order to make phrase queries in Streaming Expressions you must > escape the quotes as follows: > {code:java} > search(collection1, q="fieldA:\"hello world\" AND fieldB:two"){code} > This ticket will allow phrase queries to be entered with back ticks as > follows: > {code:java} > search(collection1, q="fieldA:`hello world` AND fieldB:two") {code} > Back ticks are nice because they are infrequently searched on and people in > the SQL world are used to back ticks meaning "take the literal value of this > string". > Under the covers back ticks will be translated to double quotes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14139) Support backtick phrase queries in Streaming Expressions
[ https://issues.apache.org/jira/browse/SOLR-14139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055938#comment-17055938 ] ASF subversion and git services commented on SOLR-14139: Commit c179ab66e4facd9d342c33c6fda021f27165941a in lucene-solr's branch refs/heads/branch_8x from Joel Bernstein [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c179ab6 ] SOLR-14139: Update CHANGE.txt > Support backtick phrase queries in Streaming Expressions > > > Key: SOLR-14139 > URL: https://issues.apache.org/jira/browse/SOLR-14139 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Priority: Major > Attachments: SOLR-14139.patch, SOLR-14139.patch > > > Currently in order to make phrase queries in Streaming Expressions you must > escape the quotes as follows: > {code:java} > search(collection1, q="fieldA:\"hello world\" AND fieldB:two"){code} > This ticket will allow phrase queries to be entered with back ticks as > follows: > {code:java} > search(collection1, q="fieldA:`hello world` AND fieldB:two") {code} > Back ticks are nice because they are infrequently searched on and people in > the SQL world are used to back ticks meaning "take the literal value of this > string". > Under the covers back ticks will be translated to double quotes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14139) Support backtick phrase queries in Streaming Expressions
[ https://issues.apache.org/jira/browse/SOLR-14139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055936#comment-17055936 ] ASF subversion and git services commented on SOLR-14139: Commit 193e4a64234b2f76036d8f018a7478d61e5a0fab in lucene-solr's branch refs/heads/master from Joel Bernstein [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=193e4a6 ] SOLR-14139: Update CHANGE.txt > Support backtick phrase queries in Streaming Expressions > > > Key: SOLR-14139 > URL: https://issues.apache.org/jira/browse/SOLR-14139 > Project: Solr > Issue Type: Improvement >Reporter: Joel Bernstein >Priority: Major > Attachments: SOLR-14139.patch, SOLR-14139.patch > > > Currently in order to make phrase queries in Streaming Expressions you must > escape the quotes as follows: > {code:java} > search(collection1, q="fieldA:\"hello world\" AND fieldB:two"){code} > This ticket will allow phrase queries to be entered with back ticks as > follows: > {code:java} > search(collection1, q="fieldA:`hello world` AND fieldB:two") {code} > Back ticks are nice because they are infrequently searched on and people in > the SQL world are used to back ticks meaning "take the literal value of this > string". > Under the covers back ticks will be translated to double quotes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-9269) Blended queries with boolean rewrite can result in inconsistent scores
[ https://issues.apache.org/jira/browse/LUCENE-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055927#comment-17055927 ] Michele Palmia edited comment on LUCENE-9269 at 3/10/20, 1:07 PM: -- I was actually just looking at a [user report|https://mail-archives.apache.org/mod_mbox/lucene-dev/202003.mbox/%3CCALyzSEn%2BQFoT3MpNYkxw-dEK9jc59mSTvXqccuUVMMDAgOMMmA%40mail.gmail.com%3E] that came to lucene-dev and looked interesting. In their use case, they were using fuzzy queries, that in turn generate blended queries that are affected by this issue. Maybe users of BlendedQuery/FuzzyQuery should be able to find some form of warning in the docs? was (Author: micpalmia): I was actually just looking at a [user report|https://mail-archives.apache.org/mod_mbox/lucene-dev/202003.mbox/browser] that came to lucene-dev and looked interesting. In their use case, they were using fuzzy queries, that in turn generate blended queries that are affected by this issue. Maybe users of BlendedQuery/FuzzyQuery should be able to find some form of warning in the docs? > Blended queries with boolean rewrite can result in inconstitent scores > -- > > Key: LUCENE-9269 > URL: https://issues.apache.org/jira/browse/LUCENE-9269 > Project: Lucene - Core > Issue Type: Bug > Components: core/search >Affects Versions: 8.4 >Reporter: Michele Palmia >Priority: Minor > Attachments: LUCENE-9269-test.patch > > > If two blended queries are should clauses of a boolean query and are built so > that > * some of their terms are the same > * their rewrite method is BlendedTermQuery.BOOLEAN_REWRITE > the docFreq for the overlapping terms used for scoring is picked as follow: > # if the overlapping terms are not boosted, the df of the term in the first > blended query is used > # if any of the overlapping terms is boosted, the df is picked at (what > looks like) random. > A few examples using a field with 2 terms: f:a (df: 2), and f:b (df: 3). 
> {code:java} > a) > Blended(f:a f:b) Blended (f:a) > df: 3 df: 2 > gets rewritten to: > (f:a)^2.0 (f:b) > df: 3 df:2 > b) > Blended(f:a) Blended(f:a f:b) > df: 2df: 3 > gets rewritten to: > (f:a)^2.0 (f:b) > df: 2 df:2 > c) > Blended(f:a f:b^0.66) Blended (f:a^0.75) > df: 3 df: 2 > gets rewritten to: > (f:a)^1.75 (f:b)^0.66 > df:? df:2 > {code} > with ? either 2 or 3, depending on the run. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-9269) Blended queries with boolean rewrite can result in inconsistent scores
[ https://issues.apache.org/jira/browse/LUCENE-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055927#comment-17055927 ] Michele Palmia edited comment on LUCENE-9269 at 3/10/20, 1:05 PM: -- I was actually just looking at a [user report|https://mail-archives.apache.org/mod_mbox/lucene-dev/202003.mbox/browser] that came to lucene-dev and looked interesting. In their use case, they were using fuzzy queries, that in turn generate blended queries that are affected by this issue. Maybe users of BlendedQuery/FuzzyQuery should be able to find some form of warning in the docs? was (Author: micpalmia): I was actually just looking at a [user report|[https://mail-archives.apache.org/mod_mbox/lucene-dev/202003.mbox/browser]] that came to lucene-dev and looked interesting. In their use case, they were using fuzzy queries, that in turn generate blended queries that are affected by this issue. Maybe users of BlendedQuery/FuzzyQuery should be able to find some form of warning in the docs? > Blended queries with boolean rewrite can result in inconstitent scores > -- > > Key: LUCENE-9269 > URL: https://issues.apache.org/jira/browse/LUCENE-9269 > Project: Lucene - Core > Issue Type: Bug > Components: core/search >Affects Versions: 8.4 >Reporter: Michele Palmia >Priority: Minor > Attachments: LUCENE-9269-test.patch > > > If two blended queries are should clauses of a boolean query and are built so > that > * some of their terms are the same > * their rewrite method is BlendedTermQuery.BOOLEAN_REWRITE > the docFreq for the overlapping terms used for scoring is picked as follow: > # if the overlapping terms are not boosted, the df of the term in the first > blended query is used > # if any of the overlapping terms is boosted, the df is picked at (what > looks like) random. > A few examples using a field with 2 terms: f:a (df: 2), and f:b (df: 3). 
> {code:java} > a) > Blended(f:a f:b) Blended (f:a) > df: 3 df: 2 > gets rewritten to: > (f:a)^2.0 (f:b) > df: 3 df:2 > b) > Blended(f:a) Blended(f:a f:b) > df: 2df: 3 > gets rewritten to: > (f:a)^2.0 (f:b) > df: 2 df:2 > c) > Blended(f:a f:b^0.66) Blended (f:a^0.75) > df: 3 df: 2 > gets rewritten to: > (f:a)^1.75 (f:b)^0.66 > df:? df:2 > {code} > with ? either 2 or 3, depending on the run. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-9269) Blended queries with boolean rewrite can result in inconsistent scores
[ https://issues.apache.org/jira/browse/LUCENE-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055927#comment-17055927 ] Michele Palmia commented on LUCENE-9269: I was actually just looking at a [user report|[https://mail-archives.apache.org/mod_mbox/lucene-dev/202003.mbox/browser]] that came to lucene-dev and looked interesting. In their use case, they were using fuzzy queries, that in turn generate blended queries that are affected by this issue. Maybe users of BlendedQuery/FuzzyQuery should be able to find some form of warning in the docs? > Blended queries with boolean rewrite can result in inconstitent scores > -- > > Key: LUCENE-9269 > URL: https://issues.apache.org/jira/browse/LUCENE-9269 > Project: Lucene - Core > Issue Type: Bug > Components: core/search >Affects Versions: 8.4 >Reporter: Michele Palmia >Priority: Minor > Attachments: LUCENE-9269-test.patch > > > If two blended queries are should clauses of a boolean query and are built so > that > * some of their terms are the same > * their rewrite method is BlendedTermQuery.BOOLEAN_REWRITE > the docFreq for the overlapping terms used for scoring is picked as follow: > # if the overlapping terms are not boosted, the df of the term in the first > blended query is used > # if any of the overlapping terms is boosted, the df is picked at (what > looks like) random. > A few examples using a field with 2 terms: f:a (df: 2), and f:b (df: 3). > {code:java} > a) > Blended(f:a f:b) Blended (f:a) > df: 3 df: 2 > gets rewritten to: > (f:a)^2.0 (f:b) > df: 3 df:2 > b) > Blended(f:a) Blended(f:a f:b) > df: 2df: 3 > gets rewritten to: > (f:a)^2.0 (f:b) > df: 2 df:2 > c) > Blended(f:a f:b^0.66) Blended (f:a^0.75) > df: 3 df: 2 > gets rewritten to: > (f:a)^1.75 (f:b)^0.66 > df:? df:2 > {code} > with ? either 2 or 3, depending on the run. 
[jira] [Commented] (LUCENE-9266) ant nightly-smoke fails due to presence of build.gradle
[ https://issues.apache.org/jira/browse/LUCENE-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055922#comment-17055922 ]

Dawid Weiss commented on LUCENE-9266:
-------------------------------------

I don't think the plan is to backport the gradle build to 8x - at least I don't plan to invest time in doing this (and it's hard for me to say how much work it'd be).

> ant nightly-smoke fails due to presence of build.gradle
> -------------------------------------------------------
>
>                 Key: LUCENE-9266
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9266
>             Project: Lucene - Core
>          Issue Type: Sub-task
>            Reporter: Mike Drob
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Seen on Jenkins - https://builds.apache.org/job/Lucene-Solr-SmokeRelease-master/1617/console
>
> Reproduced locally.
[jira] [Comment Edited] (LUCENE-9269) Blended queries with boolean rewrite can result in inconsistent scores
[ https://issues.apache.org/jira/browse/LUCENE-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055891#comment-17055891 ]

Michele Palmia edited comment on LUCENE-9269 at 3/10/20, 12:57 PM:
-------------------------------------------------------------------

I added a very simple test (with my very limited Lucene testing skills) that emulates example c) above and checks the score of the top document. As there is no "right" score, I just check for one of the two possible scores and have the test fail on the other. I'm having a hard time wrapping my head around what the right behavior should be in this case (and thus coming up with a more sensible test and fix).

In case that's useful, I should probably add that the randomness in the scoring behavior is due to the HashMap underlying MultiSet: when should clauses are processed for deduplication, they're served in an arbitrary order (see [BooleanQuery.java:370|https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/BooleanQuery.java#L370]).
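The deduplication behavior Michele describes (clauses served in an arbitrary, hash-based order, with the first-served occurrence deciding which df is kept) can be modeled with a small, self-contained Java sketch. The Clause and DedupOrder names are hypothetical, not Lucene's actual classes:

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model (not Lucene code): each clause carries a term, the docFreq its
// blended TermStates holds, and a boost. Deduplication keeps the first-served
// clause's df per term and sums the boosts. In Lucene, the serving order comes
// from a hash-based multiset, not from the order the clauses were added in,
// which is why the retained df can look random.
class Clause {
    final String term; final int df; final float boost;
    Clause(String term, int df, float boost) { this.term = term; this.df = df; this.boost = boost; }
}

public class DedupOrder {
    // Dedup clauses in whatever order the collection serves them.
    static Map<String, Clause> dedup(Collection<Clause> served) {
        Map<String, Clause> byTerm = new HashMap<>();
        for (Clause c : served) {
            Clause kept = byTerm.get(c.term);
            if (kept == null) {
                byTerm.put(c.term, c);                   // first occurrence wins the df
            } else {                                     // later duplicates only add boost
                byTerm.put(c.term, new Clause(c.term, kept.df, kept.boost + c.boost));
            }
        }
        return byTerm;
    }

    public static void main(String[] args) {
        // f:a appears with df 3 (from one blended query) and df 2 (from the other).
        Map<String, Clause> merged =
            dedup(List.of(new Clause("f:a", 3, 1.0f), new Clause("f:a", 2, 0.75f)));
        Clause a = merged.get("f:a");
        // The merged boost is always 1.75; the retained df is whichever
        // occurrence happened to be served first (here deterministically the df-3 one,
        // because a List preserves order; a hash-based collection would not).
        System.out.println("boost=" + a.boost + " df=" + a.df);
    }
}
```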
[jira] [Commented] (LUCENE-9236) Having a modular Doc Values format
[ https://issues.apache.org/jira/browse/LUCENE-9236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055907#comment-17055907 ]

Adrien Grand commented on LUCENE-9236:
--------------------------------------

[~juan.duran] In my opinion it introduces complexity because it introduces more abstractions: CompositeFieldMetadata, DocValuesConsumerSupplier, and so on.

> Having a modular Doc Values format
> ----------------------------------
>
>                 Key: LUCENE-9236
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9236
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: juan camilo rodriguez duran
>            Priority: Minor
>              Labels: docValues
>
> Today the DocValues Consumer/Producer require overriding 5 different methods, even if you only want to use one, and a given field can only support one doc values type at a time.
>
> In the attached PR I've implemented a new modular version of those classes (consumer/producer), each one having a single responsibility and writing to the same unique file.
> This is mainly a refactor of the existing format, opening the possibility to override or implement only the sub-format you need.
>
> I'll do it in 3 steps:
> # Create a CompositeDocValuesFormat and move the code of Lucene80DocValuesFormat into separate classes, without modifying the inner code. At the same time I created a Lucene85CompositeDocValuesFormat based on these changes.
> ## DocumentIdSetIterator Serializer: used in each type of field, based on an IndexedDISI.
> ## Document Ordinals Serializer: used in Sorted and SortedSet to deduplicate values using a dictionary.
> ## Document Boundaries Serializer (optional, used only for multivalued fields: SortedNumeric and SortedSet)
> ## TermsEnum Serializer: useful to write and read the terms dictionary for sorted and sorted set doc values.
> # I'll create the new sub-DocValues formats using the previous components.
> PR: [https://github.com/apache/lucene-solr/pull/1282]
[jira] [Commented] (LUCENE-9269) Blended queries with boolean rewrite can result in inconsistent scores
[ https://issues.apache.org/jira/browse/LUCENE-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055901#comment-17055901 ]

Adrien Grand commented on LUCENE-9269:
--------------------------------------

We should remove BlendedTermQuery eventually. It tries to solve cross-field search and synonym search at the same time, which introduces complications... Since you seem to be using it for the synonym case, you can look at SynonymQuery, which can already deal with multiple synonyms that have different boosts. For cross-field search, we have a BM25FQuery, though I hope we'll find ways to make it easier to use in the future, e.g. by moving the scoring logic to Similarity.
[jira] [Commented] (LUCENE-9269) Blended queries with boolean rewrite can result in inconsistent scores
[ https://issues.apache.org/jira/browse/LUCENE-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055891#comment-17055891 ]

Michele Palmia commented on LUCENE-9269:
----------------------------------------

I added a very simple test (with my very limited Lucene testing skills) that simply emulates example c) above and checks the score of the top document. As there is no "right" score, I just check for one of the two possible scores and have the test fail on the other. I'm having a hard time wrapping my head around what the right behavior should be in this case (and thus coming up with a more sensible test and fix).
[jira] [Updated] (LUCENE-9269) Blended queries with boolean rewrite can result in inconsistent scores
[ https://issues.apache.org/jira/browse/LUCENE-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michele Palmia updated LUCENE-9269:
-----------------------------------
    Attachment: LUCENE-9269-test.patch
[jira] [Updated] (LUCENE-9269) Blended queries with boolean rewrite can result in inconsistent scores
[ https://issues.apache.org/jira/browse/LUCENE-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michele Palmia updated LUCENE-9269:
-----------------------------------
    Description: 
If two blended queries are should clauses of a boolean query and are built so that
* some of their terms are the same
* their rewrite method is BlendedTermQuery.BOOLEAN_REWRITE

the docFreq for the overlapping terms used for scoring is picked as follows:
* if the overlapping terms are not boosted, the df of the term in the first blended query is used
* if any of the overlapping terms is boosted, the df is picked at (what looks like) random.

A few examples using a field with 2 terms: f:a (df: 2), and f:b (df: 3).
{code:java}
1.
Blended(f:a f:b) Blended (f:a)
df: 3 df: 2
gets rewritten to:
(f:a)^2.0 (f:b)
df: 3 df:2

Blended(f:a) Blended(f:a f:b)
df: 2 df: 3
gets rewritten to:
(f:a)^2.0 (f:b)
df: 2 df:2

Blended(f:a f:b^0.66) Blended (f:a^0.75)
df: 3 df: 2
gets rewritten to:
(f:a)^1.75 (f:b)^0.66
df:? df:2
{code}
with ? either 2 or 3, depending on the run.

  was:
If two blended queries are built so that
* some of their terms are the same
* their rewrite method is BlendedTermQuery.BOOLEAN_REWRITE

the docFreq for the overlapping terms used for scoring is picked as follows:
* if the overlapping terms are not boosted, the df of the term in the first blended query is used
* if any of the overlapping terms is boosted, the df is picked at (what looks like) random.

A few examples using a field with 2 terms: f:a (df: 2), and f:b (df: 3).
{code:java}
1.
Blended(f:a f:b) Blended (f:a)
df: 3 df: 2
gets rewritten to:
(f:a)^2.0 (f:b)
df: 3 df:2

Blended(f:a) Blended(f:a f:b)
df: 2 df: 3
gets rewritten to:
(f:a)^2.0 (f:b)
df: 2 df:2

Blended(f:a f:b^0.66) Blended (f:a^0.75)
df: 3 df: 2
gets rewritten to:
(f:a)^1.75 (f:b)^0.66
df:? df:2
{code}
with ? either 2 or 3, depending on the run.
[jira] [Updated] (LUCENE-9269) Blended queries with boolean rewrite can result in inconsistent scores
[ https://issues.apache.org/jira/browse/LUCENE-9269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michele Palmia updated LUCENE-9269:
-----------------------------------
    Description: 
If two blended queries are should clauses of a boolean query and are built so that
* some of their terms are the same
* their rewrite method is BlendedTermQuery.BOOLEAN_REWRITE

the docFreq for the overlapping terms used for scoring is picked as follows:
# if the overlapping terms are not boosted, the df of the term in the first blended query is used
# if any of the overlapping terms is boosted, the df is picked at (what looks like) random.

A few examples using a field with 2 terms: f:a (df: 2), and f:b (df: 3).
{code:java}
a)
Blended(f:a f:b) Blended (f:a)
df: 3 df: 2
gets rewritten to:
(f:a)^2.0 (f:b)
df: 3 df:2

b)
Blended(f:a) Blended(f:a f:b)
df: 2 df: 3
gets rewritten to:
(f:a)^2.0 (f:b)
df: 2 df:2

c)
Blended(f:a f:b^0.66) Blended (f:a^0.75)
df: 3 df: 2
gets rewritten to:
(f:a)^1.75 (f:b)^0.66
df:? df:2
{code}
with ? either 2 or 3, depending on the run.

  was:
If two blended queries are should clauses of a boolean query and are built so that
* some of their terms are the same
* their rewrite method is BlendedTermQuery.BOOLEAN_REWRITE

the docFreq for the overlapping terms used for scoring is picked as follows:
* if the overlapping terms are not boosted, the df of the term in the first blended query is used
* if any of the overlapping terms is boosted, the df is picked at (what looks like) random.

A few examples using a field with 2 terms: f:a (df: 2), and f:b (df: 3).
{code:java}
1.
Blended(f:a f:b) Blended (f:a)
df: 3 df: 2
gets rewritten to:
(f:a)^2.0 (f:b)
df: 3 df:2

Blended(f:a) Blended(f:a f:b)
df: 2 df: 3
gets rewritten to:
(f:a)^2.0 (f:b)
df: 2 df:2

Blended(f:a f:b^0.66) Blended (f:a^0.75)
df: 3 df: 2
gets rewritten to:
(f:a)^1.75 (f:b)^0.66
df:? df:2
{code}
with ? either 2 or 3, depending on the run.
[jira] [Created] (LUCENE-9269) Blended queries with boolean rewrite can result in inconsistent scores
Michele Palmia created LUCENE-9269:
--------------------------------------

             Summary: Blended queries with boolean rewrite can result in inconsistent scores
                 Key: LUCENE-9269
                 URL: https://issues.apache.org/jira/browse/LUCENE-9269
             Project: Lucene - Core
          Issue Type: Bug
          Components: core/search
    Affects Versions: 8.4
            Reporter: Michele Palmia

If two blended queries are built so that
* some of their terms are the same
* their rewrite method is BlendedTermQuery.BOOLEAN_REWRITE

the docFreq for the overlapping terms used for scoring is picked as follows:
* if the overlapping terms are not boosted, the df of the term in the first blended query is used
* if any of the overlapping terms is boosted, the df is picked at (what looks like) random.

A few examples using a field with 2 terms: f:a (df: 2), and f:b (df: 3).
{code:java}
1.
Blended(f:a f:b) Blended (f:a)
df: 3 df: 2
gets rewritten to:
(f:a)^2.0 (f:b)
df: 3 df:2

Blended(f:a) Blended(f:a f:b)
df: 2 df: 3
gets rewritten to:
(f:a)^2.0 (f:b)
df: 2 df:2

Blended(f:a f:b^0.66) Blended (f:a^0.75)
df: 3 df: 2
gets rewritten to:
(f:a)^1.75 (f:b)^0.66
df:? df:2
{code}
with ? either 2 or 3, depending on the run.
[jira] [Commented] (LUCENE-9263) Geo3D distance query computes the radius wrongly
[ https://issues.apache.org/jira/browse/LUCENE-9263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055745#comment-17055745 ]

ASF subversion and git services commented on LUCENE-9263:
---------------------------------------------------------

Commit f4737e5974d75decf14f8217b99176431dfa055c in lucene-solr's branch refs/heads/branch_8_5 from Ignacio Vera
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=f4737e5 ]

LUCENE-9263: Fix wrong transformation of distance in meters to radians in Geo3DPoint (#1318)

> Geo3D distance query computes the radius wrongly
> ------------------------------------------------
>
>                 Key: LUCENE-9263
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9263
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Ignacio Vera
>            Assignee: Ignacio Vera
>            Priority: Major
>             Fix For: 8.6
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This is a side effect of LUCENE-9150: the transformation of a radius in meters to radians is totally wrong, as it does not take into account the mean radius of the earth.
[jira] [Commented] (LUCENE-9266) ant nightly-smoke fails due to presence of build.gradle
[ https://issues.apache.org/jira/browse/LUCENE-9266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055714#comment-17055714 ]

Alan Woodward commented on LUCENE-9266:
---------------------------------------

I think gradle is master only at the moment?
[GitHub] [lucene-solr] dweiss commented on a change in pull request #1333: LUCENE-9266 Update smoke test for gradle
dweiss commented on a change in pull request #1333: LUCENE-9266 Update smoke test for gradle
URL: https://github.com/apache/lucene-solr/pull/1333#discussion_r390134272

## File path: lucene/common-build.xml
@@ -598,9 +599,13 @@ (diff hunk content not preserved in this archive)

Review comment:
Why is this needed? Is this related to gradle?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[jira] [Commented] (SOLR-14317) HttpClusterStateProvider throws exception when only one node down
[ https://issues.apache.org/jira/browse/SOLR-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055662#comment-17055662 ]

Ishan Chattopadhyaya commented on SOLR-14317:
---------------------------------------------

Feel free to submit a patch, with tests if possible.

> HttpClusterStateProvider throws exception when only one node down
> -----------------------------------------------------------------
>
>                 Key: SOLR-14317
>                 URL: https://issues.apache.org/jira/browse/SOLR-14317
>             Project: Solr
>          Issue Type: Bug
>   Security Level: Public(Default Security Level. Issues are Public)
>          Components: SolrJ
>    Affects Versions: 7.7.1, 7.7.2
>            Reporter: Lyle
>            Assignee: Ishan Chattopadhyaya
>            Priority: Major
>
> When creating a CloudSolrClient with solrUrls, if the first url in the solrUrls list is invalid or its server is down, the client throws an exception directly rather than trying the remaining urls.
> In [https://github.com/apache/lucene-solr/blob/branch_7_7/solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpClusterStateProvider.java#L65], if fetchLiveNodes(initialClient) hits any IOException, then in [https://github.com/apache/lucene-solr/blob/branch_7_7/solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpSolrClient.java#L648] the exception is caught and a SolrServerException is thrown to the upper caller, while no IOException is caught in HttpClusterStateProvider.fetchLiveNodes(HttpClusterStateProvider.java:200).
> The SolrServerException should be caught as well in [https://github.com/apache/lucene-solr/blob/branch_7_7/solr/solrj/src/java/org/apache/solr/client/solrj/impl/HttpClusterStateProvider.java#L69], so that if the first node provided in solrUrls is down, we can try the second to fetch live nodes.
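The fix suggested in the description amounts to a try-each-URL loop that swallows per-URL failures (including the wrapping SolrServerException) and only gives up once every configured URL has failed. A hypothetical, stdlib-only Java sketch; fetchFirstReachable and the fetch callback are illustrative, not the actual SolrJ API:

```java
import java.util.List;
import java.util.function.Function;

// Sketch of the proposed failover: try each configured Solr URL in turn and
// fall through to the next one when fetching live nodes fails, instead of
// letting the first failure escape to the caller.
public class LiveNodesFailover {
    static List<String> fetchFirstReachable(List<String> solrUrls,
                                            Function<String, List<String>> fetchLiveNodes) {
        Exception last = null;
        for (String url : solrUrls) {
            try {
                return fetchLiveNodes.apply(url);  // success: stop at the first reachable node
            } catch (RuntimeException e) {         // stands in for IOException / SolrServerException
                last = e;                          // remember the failure, keep trying
            }
        }
        throw new IllegalStateException("No live node could be reached from " + solrUrls, last);
    }

    public static void main(String[] args) {
        // The first URL fails, so the second is tried and its node list is returned.
        List<String> nodes = fetchFirstReachable(
            List.of("http://down:8983/solr", "http://up:8983/solr"),
            url -> {
                if (url.contains("down")) throw new RuntimeException("connection refused");
                return List.of("up:8983_solr");
            });
        System.out.println(nodes);
    }
}
```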
[jira] [Assigned] (SOLR-14317) HttpClusterStateProvider throws exception when only one node down
[ https://issues.apache.org/jira/browse/SOLR-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ishan Chattopadhyaya reassigned SOLR-14317:
-------------------------------------------
    Assignee: Ishan Chattopadhyaya
[jira] [Commented] (SOLR-14317) HttpClusterStateProvider throws exception when only one node down
[ https://issues.apache.org/jira/browse/SOLR-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17055661#comment-17055661 ]

Ishan Chattopadhyaya commented on SOLR-14317:
---------------------------------------------

Thanks for reporting, I'll take a look.