[jira] [Commented] (SOLR-13403) Terms component fails for DatePointField
[ https://issues.apache.org/jira/browse/SOLR-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956713#comment-16956713 ] Munendra S N commented on SOLR-13403:
--------------------------------------

Thanks [~hossman]. For now, I have commented out the distributed-case test. I was able to reproduce the failure in both master and 8x in distributed mode; there are no failures in standalone mode. The exception is thrown [here|https://github.com/apache/lucene-solr/blob/597241a412a0a27fa3a915df2934de3fdb5a376f/solr/core/src/java/org/apache/solr/search/PointMerger.java#L60]. That codepath was not changed in this issue, and I was able to reproduce the problem for other point fields too. It looks like the test added here exposed an already-existing issue, since none of the previous test cases covered this codepath. I will debug further and attach a patch soon.

> Terms component fails for DatePointField
> ----------------------------------------
>
>                 Key: SOLR-13403
>                 URL: https://issues.apache.org/jira/browse/SOLR-13403
>             Project: Solr
>          Issue Type: Bug
>          Components: SearchComponents - other
>            Reporter: Munendra S N
>            Assignee: Munendra S N
>            Priority: Major
>             Fix For: 8.4
>         Attachments: SOLR-13403.patch, SOLR-13403.patch, SOLR-13403.patch
>
> Getting terms works for all PointFields except DatePointField. For DatePointField, the request fails with an NPE.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13403) Terms component fails for DatePointField
[ https://issues.apache.org/jira/browse/SOLR-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956706#comment-16956706 ] ASF subversion and git services commented on SOLR-13403:
---------------------------------------------------------

Commit c5d91017d0f0cee22c1167ca9634672274f17621 in lucene-solr's branch refs/heads/branch_8x from Munendra S N
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c5d9101 ]

SOLR-13403: disable distrib test for point fields in terms
[jira] [Commented] (SOLR-13403) Terms component fails for DatePointField
[ https://issues.apache.org/jira/browse/SOLR-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956699#comment-16956699 ] ASF subversion and git services commented on SOLR-13403:
---------------------------------------------------------

Commit 597241a412a0a27fa3a915df2934de3fdb5a376f in lucene-solr's branch refs/heads/master from Munendra S N
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=597241a ]

SOLR-13403: disable distrib test for point fields in terms
[jira] [Created] (SOLR-13857) QueryParser.jj produces code that will not compile
Gus Heck created SOLR-13857:
-------------------------------

             Summary: QueryParser.jj produces code that will not compile
                 Key: SOLR-13857
                 URL: https://issues.apache.org/jira/browse/SOLR-13857
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: query parsers
            Reporter: Gus Heck
            Assignee: Gus Heck

There are a few problems that have crept into the parser generation system.
# SOLR-8764 removed deprecated methods that are part of a generated interface (and the implementation thereof). It's kind of stinky that JavaCC generates an interface that includes deprecated methods, but deleting them from the generated class means that regeneration causes compile errors, so this should probably be reverted.
# SOLR-11662 changed the signature of org.apache.solr.parser.QueryParser#newFieldQuery to add a parameter, but did not update the corresponding portion of the QueryParser.jj file, so the method signature reverts upon regeneration, causing compile errors.
# There are a few places where string concatenation was turned into .append() calls.

The pull request to be attached soon fixes these issues such that running ant javacc-QueryParser will once again produce code that compiles.
[jira] [Commented] (SOLR-13857) QueryParser.jj produces code that will not compile
[ https://issues.apache.org/jira/browse/SOLR-13857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956619#comment-16956619 ] Gus Heck commented on SOLR-13857:
----------------------------------

There also seem to be a variety of edits, some of which are trivial IDE-warning-type things (generics, pointless casts, etc.) that should probably be allowed to revert to the generated form. But I see the following three edits that appear to be optimizations and probably need to be retained, which raises the question of how we avoid losing them if anyone ever updates the parser enough to want to regenerate the code.
# org/apache/solr/parser/ParseException.java:176 - concatenation inside an append in a loop
# org/apache/solr/parser/QueryParserTokenManager.java:1405 - StringBuilder instead of StringBuffer
# org/apache/solr/parser/TokenMgrError.java:85 - concatenation inside an append in a loop
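The append-related hand edits listed above can be sketched in isolation. This is an illustrative toy (class and method names invented here, not the generated parser code):

```java
// Illustrative sketch, not the actual generated parser code: why the
// hand-applied edits listed above are worth preserving.
public class AppendSketch {
    // Generated style: string concatenation *inside* an append call builds a
    // throwaway intermediate String on every loop iteration.
    static String concatInsideAppend(char[] chars) {
        StringBuilder sb = new StringBuilder();
        for (char c : chars) {
            sb.append("0x" + Integer.toHexString(c)); // extra allocation per iteration
        }
        return sb.toString();
    }

    // Hand-edited style: chained appends write straight into the builder.
    static String chainedAppend(char[] chars) {
        // Also the other listed edit: StringBuilder (unsynchronized) instead
        // of StringBuffer, since the buffer never escapes this method.
        StringBuilder sb = new StringBuilder();
        for (char c : chars) {
            sb.append("0x").append(Integer.toHexString(c));
        }
        return sb.toString();
    }
}
```

Both variants produce identical output; the second simply avoids per-iteration garbage, which is the behavior a regeneration would silently revert.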
[jira] [Commented] (SOLR-13855) DistributedZkUpdateProcessor isn't propagating finish()
[ https://issues.apache.org/jira/browse/SOLR-13855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956598#comment-16956598 ] Bar Rotstein commented on SOLR-13855:
--------------------------------------

Oh, that sounds like a nasty bug. I'll investigate further tomorrow, and hopefully have a patch ready with a test to ensure this bug does not recur.

> DistributedZkUpdateProcessor isn't propagating finish()
> --------------------------------------------------------
>
>                 Key: SOLR-13855
>                 URL: https://issues.apache.org/jira/browse/SOLR-13855
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: UpdateRequestProcessors
>    Affects Versions: 8.1
>            Reporter: David Smiley
>            Priority: Major
>
> In SOLR-12955, DistributedUpdateProcessorFactory was split up into a subclass, DistributedZkUpdateProcessor. This refactoring has a bug in which finish() is not propagated to the remaining URPs in the chain when DistributedZkUpdateProcessor is in play. This is noticeable when LogUpdateProcessorFactory is later down the line.
>
> CC [~barrotsteindev]
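A simplified model of the propagation contract at issue (hypothetical classes, not the real Solr URP API): each processor in a chain must forward finish() to the next one, or downstream processors such as a logging processor never see it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified processor chain illustrating the finish()
// contract described above; names are invented for this sketch.
class ChainedProcessor {
    final ChainedProcessor next;
    final String name;
    final List<String> finished; // records which processors saw finish()

    ChainedProcessor(String name, ChainedProcessor next, List<String> finished) {
        this.name = name;
        this.next = next;
        this.finished = finished;
    }

    // Correct behavior: do local cleanup, then propagate down the chain.
    // The reported bug amounts to omitting the propagation step, so a
    // later log-style processor never observes finish().
    void finish() {
        finished.add(name);
        if (next != null) {
            next.finish();
        }
    }
}
```

With a chain distrib -> log, calling distrib.finish() must record both names; a test asserting that order is exactly the kind of regression guard mentioned in the comment.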
[jira] [Commented] (SOLR-13268) Clean up any test failures resulting from defaulting to async logging
[ https://issues.apache.org/jira/browse/SOLR-13268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956541#comment-16956541 ] Chris M. Hostetter commented on SOLR-13268:
--------------------------------------------

FWIW, while looking for something else I noticed this static block in {{LuceneTestCase}}...
{code:java}
/**
 * Try to capture streams early so that other classes don't have a chance to steal references
 * to them (as is the case with ju.logging handlers).
 */
static {
  TestRuleLimitSysouts.checkCaptureStreams();
  Logger.getGlobal().getHandlers();
}
{code}
I haven't dug into this or thought about it in depth, but in the back of my head I wonder whether what happens in this {{TestRuleLimitSysouts.checkCaptureStreams();}} call is related to why/how/when we're seeing test logging bleed over from one test class to another.

> Clean up any test failures resulting from defaulting to async logging
> ----------------------------------------------------------------------
>
>                 Key: SOLR-13268
>                 URL: https://issues.apache.org/jira/browse/SOLR-13268
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>            Priority: Major
>         Attachments: SOLR-13268-flushing.patch, SOLR-13268.patch, SOLR-13268.patch, SOLR-13268.patch
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> This is a catch-all for test failures due to the async logging changes. So far, I see a couple of failures on JDK13 only. I'll collect a "starter set" here; these are likely systemic, so once the root cause is found and fixed, the others are likely fixed as well.
>
> JDK13:
> ant test -Dtestcase=TestJmxIntegration -Dtests.seed=54B30AC62A2D71E -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=lv-LV -Dtests.timezone=Asia/Riyadh -Dtests.asserts=true -Dtests.file.encoding=UTF-8
> ant test -Dtestcase=TestDynamicURP -Dtests.seed=54B30AC62A2D71E -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=rwk -Dtests.timezone=Australia/Brisbane -Dtests.asserts=true -Dtests.file.encoding=UTF-8
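The effect of capturing streams early, as that static block does, can be demonstrated in a standalone sketch (this is an illustration of the mechanism, not the LuceneTestCase internals): a reference grabbed before any redirection still points at the original stream even after System.setOut() swaps it.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Illustrative sketch of early stream capture, not LuceneTestCase itself.
public class StreamCapture {
    // Captured once at class-init time, analogous to the static block above.
    static final PrintStream CAPTURED = System.out;

    static boolean capturedSurvivesRedirect() {
        PrintStream original = System.out;
        try {
            // Redirect stdout, as a test framework (or a rogue class) might.
            System.setOut(new PrintStream(new ByteArrayOutputStream()));
            // The early capture still refers to the original stream,
            // not to the redirected one.
            return CAPTURED == original && CAPTURED != System.out;
        } finally {
            System.setOut(original); // restore
        }
    }
}
```

This is also why classes that grab stream or handler references *after* a redirect can cause output to bleed across test classes: they hold onto whichever stream happened to be installed when they initialized.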
[jira] [Reopened] (SOLR-13403) Terms component fails for DatePointField
[ https://issues.apache.org/jira/browse/SOLR-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris M. Hostetter reopened SOLR-13403:
---------------------------------------

These changes have introduced reproducible failures in DistributedTermsComponentTest.

From master...
{noformat}
...
[junit4] 2> 13697 INFO (qtp1725601784-240) [x:collection1 ] o.a.s.c.S.Request [collection1] webapp=/_d path=/terms params={df=text&distrib=false&qt=/terms&shards.purpose=1024&terms.sort=index&shard.url=http://127.0.0.1:33661/_d/collection1|[ff01::213]:2/_d|[ff01::114]:2/_d|[ff01::083]:2/_d&version=2&shards.qt=/terms&terms=true&omitHeader=false&terms.fl=foo_date_p&terms.limit=-1&NOW=1571700672083&isShard=true&wt=javabin} status=500 QTime=2
[junit4] 2> 13698 ERROR (qtp1725601784-240) [x:collection1 ] o.a.s.s.HttpSolrCall null:java.lang.IndexOutOfBoundsException: Index -1 out of bounds for length 0
[junit4] 2>   at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
[junit4] 2>   at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
[junit4] 2>   at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:248)
[junit4] 2>   at java.base/java.util.Objects.checkIndex(Objects.java:372)
[junit4] 2>   at java.base/java.util.ArrayList.get(ArrayList.java:458)
[junit4] 2>   at java.base/java.util.Collections$UnmodifiableList.get(Collections.java:1310)
[junit4] 2>   at org.apache.solr.search.PointMerger$ValueIterator.<init>(PointMerger.java:60)
[junit4] 2>   at org.apache.solr.search.PointMerger$ValueIterator.<init>(PointMerger.java:54)
[junit4] 2>   at org.apache.solr.handler.component.TermsComponent.process(TermsComponent.java:246)
[junit4] 2>   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:304)
[junit4] 2>   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:198)
[junit4] 2>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:2565)
[junit4] 2>   at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:803)
[junit4] 2>   at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:582)
[junit4] 2>   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:424)
...
[junit4] ERROR 12.0s | DistributedTermsComponentTest.test <<<
[junit4] > Throwable #1: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://127.0.0.1:33661/_d/collection1: org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request:[http://[ff01::083]:2/_d, http://[ff01::213]:2/_d, http://[ff01::114]:2/_d, http://127.0.0.1:33661/_d/collection1]
[junit4] >   at __randomizedtesting.SeedInfo.seed([F0A05BD2319D056:875E3A678DE5BDAE]:0)
[junit4] >   at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:665)
[junit4] >   at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:265)
[junit4] >   at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:248)
[junit4] >   at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:207)
[junit4] >   at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1003)
[junit4] >   at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:1018)
[junit4] >   at org.apache.solr.BaseDistributedSearchTestCase.queryServer(BaseDistributedSearchTestCase.java:626)
[junit4] >   at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:678)
[junit4] >   at org.apache.solr.handler.component.DistributedTermsComponentTest.query(DistributedTermsComponentTest.java:111)
[junit4] >   at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:656)
[junit4] >   at org.apache.solr.BaseDistributedSearchTestCase.query(BaseDistributedSearchTestCase.java:635)
[junit4] >   at org.apache.solr.handler.component.DistributedTermsComponentTest.query(DistributedTermsComponentTest.java:106)
[junit4] >   at org.apache.solr.handler.component.DistributedTermsComponentTest.test(DistributedTermsComponentTest.java:82)
[junit4] >   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit4] >   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit4] >   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit4] >   at java.base/java.lang.ref
{noformat}
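The root exception in that trace ("Index -1 out of bounds for length 0", raised from ArrayList.get via the PointMerger$ValueIterator constructor) is the classic failure mode of indexing into an empty list. A minimal standalone illustration of the failure class and a defensive guard (hypothetical helper, not the actual PointMerger fix):

```java
import java.util.Collections;
import java.util.List;

// Illustrative only: the same exception class seen in the trace above comes
// from computing an index like size - 1 against an empty list.
public class LastElement {
    // Naive code would call values.get(values.size() - 1) unconditionally,
    // which throws IndexOutOfBoundsException ("Index -1 out of bounds for
    // length 0") when the list is empty. Guarding first avoids it.
    static Integer lastOrNull(List<Integer> values) {
        return values.isEmpty() ? null : values.get(values.size() - 1);
    }
}
```

The distributed terms codepath evidently hit the unguarded variant of this pattern when a shard contributed no values.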
[jira] [Updated] (SOLR-13856) 8.x HdfsWriteToMultipleCollectionsTest jenkins failures due to TImeoutException
[ https://issues.apache.org/jira/browse/SOLR-13856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris M. Hostetter updated SOLR-13856:
--------------------------------------

    Attachment: apache_Lucene-Solr-NightlyTests-8.3_25.log.txt
                apache_Lucene-Solr-repro_3681.log.txt
                8.x.fail3.log.txt
                8.x.fail2.log.txt
                8.x.fail1.log.txt
                8.3.fail3.log.txt
                8.3.fail2.log.txt
                8.3.fail1.log.txt
        Status: Open (was: Open)

I'm attaching some logs from jenkins, as well as from my own local runs, that show the same failure behavior on 8x and 83 with a variety of seeds.

The specifics of the failure look like this...
{noformat}
[junit4] 2> NOTE: reproduce with: ant test -Dtestcase=HdfsWriteToMultipleCollectionsTest -Dtests.method=test -Dtests.seed=7DB11BE46316786B -Dtests.multiplier=2 -Dtests.nightly=true -Dtests.slow=true -Dtests.linedocsfile=/home/jenkins/jenkins-slave/workspace/Lucene-Solr-NightlyTests-8.3/test-data/enwiki.random.lines.txt -Dtests.locale=el-CY -Dtests.timezone=Asia/Yekaterinburg -Dtests.asserts=true -Dtests.file.encoding=ISO-8859-1
[junit4] FAILURE 287s J0 | HdfsWriteToMultipleCollectionsTest.test <<<
[junit4] > Throwable #1: java.lang.AssertionError: expected:<680> but was:<595>
[junit4] >   at __randomizedtesting.SeedInfo.seed([7DB11BE46316786B:F5E5243ECDEA1593]:0)
[junit4] >   at org.apache.solr.cloud.hdfs.HdfsWriteToMultipleCollectionsTest.test(HdfsWriteToMultipleCollectionsTest.java:129)
[junit4] >   at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsFixedStatement.callStatement(BaseDistributedSearchTestCase.java:1082)
[junit4] >   at org.apache.solr.BaseDistributedSearchTestCase$ShardsRepeatRule$ShardsStatement.evaluate(BaseDistributedSearchTestCase.java:1054)
[junit4] >   at java.lang.Thread.run(Thread.java:748)
{noformat}
...the exact "expected" number will vary by seed, and the exact number of "actual" docs found will vary by run.

This assertion happens after all of the (parallel) indexing threads have finished, when the test is simply trying to count documents across all of the various collections it has indexed to, before it makes any HDFS-specific assertions.

The root cause of the failure to get the expected number of docs seems to be some failures in forwarding updates between nodes -- evidently due to a lock in the HDFS layer? We'll see lots of log messages showing that updates are screaming along, and then suddenly there will be a big gap in time where only "metrics"-related requests are being logged, followed by several updateExecutors logging that they (eventually) timed out waiting for a response from remote nodes to the forwarded update requests (sometimes TOLEADER, sometimes FROMLEADER). These are accompanied by WARNing messages from the HDFS DataStreamer class indicating it caught an InterruptedException while writing to some index file...
{noformat}
[junit4] 2> 1918294 ERROR (updateExecutor-6515-thread-1-processing-n:127.0.0.1:37445_awn%2Fps x:acollection0_shard2_replica_n3 c:acollection0 s:shard2 r:core_node8) [n:127.0.0.1:37445_awn%2Fps c:acollection0 s:shard2 r:core_node8 x:acollection0_shard2_replica_n3 ] o.a.s.u.SolrCmdDistributor org.apache.solr.client.solrj.SolrServerException: Timeout occured while waiting response from server at: http://127.0.0.1:34527/awn/ps/acollection0_shard1_replica_n1/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F127.0.0.1%3A37445%2Fawn%2Fps%2Facollection0_shard2_replica_n3%2F
[junit4] 2>   at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:408)
[junit4] 2>   at org.apache.solr.client.solrj.impl.Http2SolrClient.request(Http2SolrClient.java:754)
[junit4] 2>   at org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient.request(ConcurrentUpdateHttp2SolrClient.java:364)
[junit4] 2>   at org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1290)
[junit4] 2>   at org.apache.solr.update.SolrCmdDistributor.doRequest(SolrCmdDistributor.java:342)
[junit4] 2>   at org.apache.solr.update.SolrCmdDistributor.lambda$submit$0(SolrCmdDistributor.java:331)
[junit4] 2>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[junit4] 2>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
[junit4] 2>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
[junit4] 2>   at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:181)
[junit4] 2>   at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
[junit4] 2>   at java.util.concurrent.Threa
{noformat}
[jira] [Created] (SOLR-13856) 8.x HdfsWriteToMultipleCollectionsTest jenkins failures due to TImeoutException
Chris M. Hostetter created SOLR-13856:
-----------------------------------------

             Summary: 8.x HdfsWriteToMultipleCollectionsTest jenkins failures due to TImeoutException
                 Key: SOLR-13856
                 URL: https://issues.apache.org/jira/browse/SOLR-13856
             Project: Solr
          Issue Type: Test
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Chris M. Hostetter

I've noticed a trend in jenkins failures where HdfsWriteToMultipleCollectionsTest...
* does _NOT_ ever seem to fail on master, even w/ heavy beasting
* fails easily on 8.x (28c1049a258bbd060a80803c72e1c6cadc784dab) and 8.3 (25968e3b75e5e9a4f2a64de10500aae10a257bdd)
** failing seeds frequently reproduce, but not 100%
** seeds reproduce even when tested using newer (ie: java11) JVMs
** doesn't fail when commenting out the HDFS aspects of the test
*** suggests the failure cause is somehow specific to HDFS, not to differences in the 8x/master HTTP/Solr indexing stack...

*However:* There are currently zero differences between the *.hdfs.* packaged Solr code (src or test) on branch_8x vs master; likewise, 8x and master also use the exact same hadoop jars. So what the hell is different?
[jira] [Created] (SOLR-13855) DistributedZkUpdateProcessor isn't propagating finish()
David Smiley created SOLR-13855:
-----------------------------------

             Summary: DistributedZkUpdateProcessor isn't propagating finish()
                 Key: SOLR-13855
                 URL: https://issues.apache.org/jira/browse/SOLR-13855
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: UpdateRequestProcessors
    Affects Versions: 8.1
            Reporter: David Smiley

In SOLR-12955, DistributedUpdateProcessorFactory was split up into a subclass, DistributedZkUpdateProcessor. This refactoring has a bug in which finish() is not propagated to the remaining URPs in the chain when DistributedZkUpdateProcessor is in play. This is noticeable when LogUpdateProcessorFactory is later down the line.

CC [~barrotsteindev]
[jira] [Commented] (SOLR-13841) Add jackson databind annotations to SolrJ classpath
[ https://issues.apache.org/jira/browse/SOLR-13841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956460#comment-16956460 ] Noble Paul commented on SOLR-13841:
------------------------------------

bq. It is my understanding that SolrJ uses NamedList for its internal representation. JSON might be the over-the-wire transport...

The strategy is as follows.
* Start using strongly typed objects for certain requests.
* These objects will be serializable in JSON as well as javabin (javabin will support all data types; JSON will support fewer).

NamedList is not a requirement at all.

> Add jackson databind annotations to SolrJ classpath
> ----------------------------------------------------
>
>                 Key: SOLR-13841
>                 URL: https://issues.apache.org/jira/browse/SOLR-13841
>             Project: Solr
>          Issue Type: Task
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>            Priority: Major
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> We can start using annotations in SolrJ to minimize the amount of code we write & improve readability. Jackson is a widely used library and everyone is already familiar with it.
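The strongly-typed-object strategy can be sketched without any Jackson dependency (class and field names below are invented for illustration; with the proposed annotations, such fields would carry Jackson databind annotations like @JsonProperty and serialize to JSON or javabin):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical request object, not an actual SolrJ class: the same typed
// class could back both server and client, avoiding duplicated code.
class CreateCollectionRequest {
    final String name;
    final int numShards;

    CreateCollectionRequest(String name, int numShards) {
        this.name = name;
        this.numShards = numShards;
    }

    // Untyped view, roughly what a NamedList/Map-based API exposes today:
    // every consumer must re-know the key names and cast the values.
    Map<String, Object> toUntyped() {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("name", name);
        m.put("numShards", numShards);
        return m;
    }
}
```

The typed form gives compile-time checking of field names and types; the untyped form defers every such error to runtime, which is the duplication-and-casting burden the comment argues against.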
[jira] [Commented] (SOLR-13851) SolrIndexSearcher.getFirstMatch trips assertion if multiple matches
[ https://issues.apache.org/jira/browse/SOLR-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956455#comment-16956455 ] David Smiley commented on SOLR-13851:
--------------------------------------

Sounds reasonable. WDYT [~yo...@apache.org]? I suspect you added these long ago.

> SolrIndexSearcher.getFirstMatch trips assertion if multiple matches
> --------------------------------------------------------------------
>
>                 Key: SOLR-13851
>                 URL: https://issues.apache.org/jira/browse/SOLR-13851
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Chris M. Hostetter
>            Priority: Major
>
> The documentation for {{SolrIndexSearcher.getFirstMatch}} says...
> {quote}
> Returns the first document number containing the term t. Returns -1 if no document was found. This method is primarily intended for clients that want to fetch documents using a unique identifier.
> @return the first document number containing the term
> {quote}
> But SOLR-12366 refactored {{SolrIndexSearcher.getFirstMatch}} to eliminate its previous implementation and replace it with a call to (a refactored version of) {{SolrIndexSearcher.lookupId}} -- but the code in {{lookupId}} was always designed *explicitly* for dealing with a uniqueKey field, and has an assertion that once it finds a match _there will be no other matches in the index_.
>
> This means that even though {{getFirstMatch}} is _intended_ for fields that are unique between documents, if it's used on a field that is not unique, it can trip an assertion.
>
> At a minimum we need to either "fix" {{getFirstMatch}} to behave as documented, or fix its documentation.
>
> Given that the current behavior has now been in place since Solr 7.4, and given that all existing uses in "core" Solr code are for looking up docs by uniqueKey, it's probably best to simply fix the documentation. But we should also consider replacing the assertion with an IllegalStateException or SolrException -- anything not dependent on having assertions enabled -- to prevent silent bugs.
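The hardening suggested above, an explicit check instead of an {{assert}} so the failure does not depend on running with -ea, might look like this simplified sketch (hypothetical helper over a plain list, not the actual SolrIndexSearcher code):

```java
import java.util.List;

// Simplified stand-in for a docId lookup on a supposedly-unique field.
public class FirstMatch {
    // Returns the first matching doc id, or -1 if none. Fails loudly --
    // with or without assertions enabled -- if the field is not unique.
    static int firstMatch(List<String> values, String term) {
        int found = -1;
        for (int docId = 0; docId < values.size(); docId++) {
            if (values.get(docId).equals(term)) {
                if (found != -1) {
                    // Instead of `assert found == -1`, which is silently
                    // skipped when assertions are disabled:
                    throw new IllegalStateException(
                        "non-unique field: term '" + term + "' matches multiple docs");
                }
                found = docId;
            }
        }
        return found;
    }
}
```

The key difference from the current code: a caller who violates the uniqueness contract gets a deterministic exception in production, rather than either a tripped assertion (test runs) or silent wrong answers (assertion-disabled runs).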
[jira] [Commented] (SOLR-13824) JSON Request API ignores prematurely closing curly brace.
[ https://issues.apache.org/jira/browse/SOLR-13824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956433#comment-16956433 ] ASF subversion and git services commented on SOLR-13824:
---------------------------------------------------------

Commit 0b8b1438e9c8105425ebb7d155a8b0a7bc47692f in lucene-solr's branch refs/heads/branch_8x from Mikhail Khludnev
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=0b8b143 ]

SOLR-13824: reject prematurely closed curly bracket in JSON.

> JSON Request API ignores prematurely closing curly brace.
> ----------------------------------------------------------
>
>                 Key: SOLR-13824
>                 URL: https://issues.apache.org/jira/browse/SOLR-13824
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: JSON Request API
>            Reporter: Mikhail Khludnev
>            Priority: Major
>         Attachments: SOLR-13824.patch, SOLR-13824.patch, SOLR-13824.patch, SOLR-13824.patch
>
> {code:java}
> json={query:"content:foo", facet:{zz:{field:id}}}
> {code}
> This works fine, but if we mistype {{}}} instead of {{,}}:
> {code:java}
> json={query:"content:foo"} facet:{zz:{field:id}}}
> {code}
> It's captured only partially; here's what we have under debug:
> {code:java}
> "json":{"query":"content:foo"},
> {code}
> I suppose it should throw an error with a 400 code.
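Conceptually, the fix rejects input where the first complete JSON value is followed by anything other than whitespace. A minimal brace-depth illustration of that check (purely illustrative, not the actual Solr JSON parser; string literals and escapes are deliberately ignored here):

```java
// Illustrative sketch: detect the "premature closing brace" case by
// tracking brace depth and rejecting trailing non-whitespace.
public class StrictJson {
    // Returns true if the input is exactly one brace-balanced object with
    // nothing but whitespace after it.
    static boolean singleObject(String s) {
        int depth = 0;
        int end = -1; // index just past the first top-level object
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}') {
                depth--;
                if (depth < 0) return false;      // closes more than it opened
                if (depth == 0) { end = i + 1; break; }
            }
        }
        if (end == -1) return false;              // object never closed
        return s.substring(end).trim().isEmpty(); // reject trailing garbage
    }
}
```

Applied to the examples in the issue description, the well-formed request passes, while the mistyped one closes the object after {{"content:foo"}} and leaves {{facet:...}} dangling, which is exactly the case that should now produce a 400.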
[jira] [Commented] (SOLR-13841) Add jackson databind annotations to SolrJ classpath
[ https://issues.apache.org/jira/browse/SOLR-13841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956430#comment-16956430 ] Shawn Heisey commented on SOLR-13841:
--------------------------------------

bq. We would like to have Strongly typed POJO based upstream/downstream data from SolrJ. We should be able to use the same objects in Server/Client so as to avoid duplication of code.

It is my understanding that SolrJ uses NamedList for its internal representation. JSON might be the over-the-wire transport (if the user changes it from javabin), but JSON itself, as far as I am aware, has a very limited number of data representations. NamedList, even though it can probably work with any Java object type in conjunction with javabin, should probably also be limited in practice, like JSON, to basic and well-known Java types provided by the JVM itself.

Users might extend Solr to send their own arbitrary objects via javabin and expect to get them back unchanged in their client applications. But I cannot think of anything to be gained, for general uses, by working with objects outside the basic ones provided by the JVM; such data would be impossible to encode with XML or JSON as the wire transport, which we must also support until such time as we declare them unsupported, and I don't think that is going to happen.

I had been imagining this ticket as bringing Jackson encoding/decoding for JSON to SolrJ, but apparently that's not the idea.
[jira] [Commented] (SOLR-13824) JSON Request API ignores prematurely closing curly brace.
[ https://issues.apache.org/jira/browse/SOLR-13824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956420#comment-16956420 ] ASF subversion and git services commented on SOLR-13824:
---------------------------------------------------------

Commit afdb80069cc7a7972411b90dd08847cac574e3dd in lucene-solr's branch refs/heads/master from Mikhail Khludnev
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=afdb800 ]

SOLR-13824: reject prematurely closed curly bracket in JSON.
[jira] [Comment Edited] (SOLR-13841) Add jackson databind annotations to SolrJ classpath
[ https://issues.apache.org/jira/browse/SOLR-13841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956397#comment-16956397 ] Noble Paul edited comment on SOLR-13841 at 10/21/19 8:04 PM: - {quote} * Could you please be more specific within the SolrJ codebase as to what classes would benefit{quote} We would like to have Strongly typed POJO based upstream/downstream data from SolrJ. I would like it to support JSON/Javabin or whatever format that is possible using the same annotated POJOs.We should be able to use the same objects in Server/Client so as to avoid duplication of code. The same objects can be used by users of SolrJ to serialize their data downstream So we will have to decide what JSON framework to use. The options are * Use an existing library . [see this list|https://simplesolution.dev/top-5-libraries-for-serialization-and-deserialization-json-in-java/] * Use a shaded version of an existing library * Roll your own was (Author: noble.paul): {quote} * Could you please be more specific within the SolrJ codebase as to what classes would benefit{quote} We would like to have Strongly typed POJO based upstream/downstream data from SolrJ. We should be able to use the same objects in Server/Client so as to avoid duplication of code. The same objects can be used by users of SolrJ to serialize their data downstream So we will have to decide what JSON framework to use. The options are * Use an existing library . [see this list|https://simplesolution.dev/top-5-libraries-for-serialization-and-deserialization-json-in-java/] * Use a shaded version of an existing library * Roll your own > Add jackson databind annotations to SolrJ classpath > --- > > Key: SOLR-13841 > URL: https://issues.apache.org/jira/browse/SOLR-13841 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. 
Issues are Public) >Reporter: Noble Paul >Assignee: Noble Paul >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > We can start using annotations in SolrJ to minimize the amount of code we > write & improve readability. Jackson is a widely used library and everyone is > already familiar with it -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
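The "roll your own" option above could look roughly like this: a minimal `@JsonProperty`-style field annotation plus a reflection-based writer, using only the JDK. All names here (`PojoJson`, `CollectionStatus`) are illustrative and do not exist in SolrJ:

```java
// Hypothetical sketch of the "roll your own" option: an annotated POJO that
// could be shared by client and server, serialized without any third-party
// dependency. Not SolrJ code; names are made up for illustration.
import java.lang.annotation.*;
import java.lang.reflect.Field;
import java.util.StringJoiner;

public class PojoJson {
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface JsonProperty { String value() default ""; }

    /** Serializes annotated fields of any POJO; numbers unquoted, everything else quoted. */
    public static String toJson(Object o) {
        StringJoiner sj = new StringJoiner(",", "{", "}");
        try {
            for (Field f : o.getClass().getDeclaredFields()) {
                JsonProperty p = f.getAnnotation(JsonProperty.class);
                if (p == null) continue;                     // unannotated fields are skipped
                String name = p.value().isEmpty() ? f.getName() : p.value();
                f.setAccessible(true);
                Object v = f.get(o);
                String val = (v instanceof Number) ? v.toString() : "\"" + v + "\"";
                sj.add("\"" + name + "\":" + val);
            }
        } catch (IllegalAccessException e) {
            throw new RuntimeException(e);
        }
        return sj.toString();
    }

    // example POJO usable on both client and server
    static class CollectionStatus {
        @JsonProperty("name") String name = "techproducts";
        @JsonProperty int numShards = 2;
        String internal = "not serialized";                  // no annotation -> omitted
    }

    public static void main(String[] args) {
        // prints the two annotated fields as JSON; "internal" is omitted
        System.out.println(toJson(new CollectionStatus()));
    }
}
```

With Jackson the annotation and writer already exist; the trade-off discussed in the thread is exactly whether that convenience is worth the extra client-side dependency.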
[jira] [Comment Edited] (SOLR-13841) Add jackson databind annotations to SolrJ classpath
[ https://issues.apache.org/jira/browse/SOLR-13841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956397#comment-16956397 ] Noble Paul edited comment on SOLR-13841 at 10/21/19 7:49 PM: - {quote} * Could you please be more specific within the SolrJ codebase as to what classes would benefit{quote} We would like to have strongly typed, POJO-based upstream/downstream data from SolrJ. We should be able to use the same objects in Server/Client so as to avoid duplication of code. The same objects can be used by users of SolrJ to serialize their data downstream. So we will have to decide what JSON framework to use. The options are: * Use an existing library [see this list|https://simplesolution.dev/top-5-libraries-for-serialization-and-deserialization-json-in-java/] * Use a shaded version of an existing library * Roll your own was (Author: noble.paul): {quote} * Could you please be more specific within the SolrJ codebase as to what classes would benefit{quote} We would like to have strongly typed, POJO-based upstream/downstream data from SolrJ. We should be able to use the same objects in Server/Client so as to avoid duplication of code. The same objects can be used by users of SolrJ to serialize their data downstream. So we will have to decide what JSON framework to use. The options are: * Use an existing library * Use a shaded version of an existing library * Roll your own > Add jackson databind annotations to SolrJ classpath > --- > > Key: SOLR-13841 > URL: https://issues.apache.org/jira/browse/SOLR-13841 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Assignee: Noble Paul >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > We can start using annotations in SolrJ to minimize the amount of code we > write & improve readability. 
Jackson is a widely used library and everyone is > already familiar with it -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13841) Add jackson databind annotations to SolrJ classpath
[ https://issues.apache.org/jira/browse/SOLR-13841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956397#comment-16956397 ] Noble Paul commented on SOLR-13841: --- {quote} * Could you please be more specific within the SolrJ codebase as to what classes would benefit{quote} We would like to have strongly typed, POJO-based upstream/downstream data from SolrJ. We should be able to use the same objects in Server/Client so as to avoid duplication of code. The same objects can be used by users of SolrJ to serialize their data downstream. So we will have to decide what JSON framework to use. The options are: * Use an existing library * Use a shaded version of an existing library * Roll your own > Add jackson databind annotations to SolrJ classpath > --- > > Key: SOLR-13841 > URL: https://issues.apache.org/jira/browse/SOLR-13841 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Assignee: Noble Paul >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > We can start using annotations in SolrJ to minimize the amount of code we > write & improve readability. Jackson is a widely used library and everyone is > already familiar with it -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-11087) Get rid of jar duplicates in release
[ https://issues.apache.org/jira/browse/SOLR-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956354#comment-16956354 ] Shawn Heisey commented on SOLR-11087: - One long-term goal that I have (and I think it's shared by others) is to make the precise way of providing network services into an implementation detail. Solr should become a standalone application, and one way to do that is to embed Jetty into the application, so it's completely under our control and hidden from the user. We historically have left classpath management mostly up to the container, but we're going to have to take that over if we want the goal above to succeed. Simplification and good separation will be important for that. On the hardlink idea: It's really only viable in the tarball. I don't think we could do it in the zip version, and in truth I don't like it any more than you do. But if we do proper classpath management with our scripting, the whole notion is moot anyway, because we will have solved the problem. > Get rid of jar duplicates in release > > > Key: SOLR-11087 > URL: https://issues.apache.org/jira/browse/SOLR-11087 > Project: Solr > Issue Type: Sub-task > Components: Build >Reporter: Jan Høydahl >Priority: Major > Fix For: 8.1, master (9.0) > > Attachments: SOLR-11087.patch > > > Sub task of SOLR-6806 > The {{dist/}} folder contains many duplicate jar files, totalling 10,5M: > {noformat} > 4,6M ./dist/solr-core-6.6.0.jar (WEB-INF/lib) > 1,2M ./dist/solr-solrj-6.6.0.jar (WEB-INF/lib) > 4,7M ./dist/solrj-lib/* (WEB-INF/lib and server/lib/ext) > {noformat} > The rest of the files in dist/ are contrib jars and test-framework. > To weed out the duplicates and save 10,5M, we can simply add a > {{dist/README.md}} file listing what jar files are located where. The file > could also contain a bash one-liner to copy them to the dist folder. 
Another > possibility is to ship the binary release tarball with symlinks in the dist > folder, and advise people to use {{cp -RL dist mydist}} which will make a > copy with the real files. Downside is that this won't work for ZIP archives > that do not preserve symlinks, and neither on Windows. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13851) SolrIndexSearcher.getFirstMatch trips assertion if multiple matches
[ https://issues.apache.org/jira/browse/SOLR-13851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956343#comment-16956343 ] Chris M. Hostetter commented on SOLR-13851: --- {quote}And as you note, we always use the ID. Perhaps these might be renamed... {quote} Yeah, I think the key changes should be: * both existing methods should be deprecated in 8x and removed in master ** getFirstMatch should be documented to note its current peculiar state re: assertions and non-unique field usage * a new method should replace them that takes in _only_ the BytesRef and fails with a (non-assert) error if: ** the current schema doesn't use a uniqueKey field ** more than one doc is found matching the specified BytesRef in the uniqueKey field The exact return value/semantics of the new method should probably be something less likely to be misunderstood than a long with two parts that you have to bitshift to extract? I get the efficiency value in returning the per-segment docId for use cases that are already dealing with per-segment readers, but there's also a lot of super-simple cases (like the existing callers of getFirstMatch) that just want the (top-level) docId and don't care about the segments, and I worry that two methods with similar names, but one that returns an "int" (global) docId and another that returns a "long" (bitshifted) segId+docId, might lead plugin writers to confusing bugs (that could silently "work" on test indexes with only one segment). Maybe a simple container object with easy-to-understand accessor methods like: * {{boolean exists()}} * {{int getDocId()}} * {{int getLeafContextId()}} * {{int getDocIdInLeafContext()}} ? > SolrIndexSearcher.getFirstMatch trips assertion if multiple matches > --- > > Key: SOLR-13851 > URL: https://issues.apache.org/jira/browse/SOLR-13851 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Chris M. 
Hostetter >Priority: Major > > the documentation for {{SolrIndexSearcher.getFirstMatch}} says... > {quote} > Returns the first document number containing the term t Returns > -1 if no document was found. This method is primarily intended for clients > that want to fetch documents using a unique identifier." > @return the first document number containing the term > {quote} > But SOLR-12366 refactored {{SolrIndexSearcher.getFirstMatch}} to eliminate > its previous implementation and replace it with a call to (a refactored > version of) {{SolrIndexSearcher.lookupId}} -- but the code in {{lookupId}} > was always designed *explicitly* for dealing with a uniqueKey field, and has > an assertion that once it finds a match _there will be no other matches in > the index_ > This means that even though {{getFirstMatch}} is _intended_ for fields that > are unique between documents, if it's used on a field that is not unique, it > can trip an assertion. > At a minimum we need to either "fix" {{getFirstMatch}} to behave as > documented, or fix its documentation. > Given that the current behavior has now been in place since Solr 7.4, and > given that all existing uses in "core" solr code are for looking up docs by > uniqueKey, it's probably best to simply fix the documentation, but we should > also consider replacing the assertion with an IllegalStateException, or > SolrException -- anything not dependent on having assertions enabled -- to > prevent silent bugs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
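The container object suggested in the comment might be sketched as below. The accessor names come from the comment itself; the class, its constructor, and the bit layout (leaf index in the upper 32 bits, within-segment docid in the lower 32, matching what bitshifting callers of `lookupId`-style code do today) are hypothetical, not existing Solr API:

```java
// Hypothetical container wrapping the bitshifted long so callers never
// touch the bit layout directly. Nothing like this exists in Solr yet.
public class DocLookupResult {
    private final long packed;   // (leafIndex << 32) | docIdInLeaf, or -1 if absent
    private final int docBase;   // docBase of the leaf, used for the global docid

    public DocLookupResult(long packed, int docBase) {
        this.packed = packed;
        this.docBase = docBase;
    }

    /** Sentinel for "no document matched". */
    public static DocLookupResult notFound() { return new DocLookupResult(-1L, 0); }

    public boolean exists()            { return packed != -1L; }
    public int getLeafContextId()      { return (int) (packed >>> 32); }
    public int getDocIdInLeafContext() { return (int) (packed & 0xFFFFFFFFL); }
    public int getDocId()              { return docBase + getDocIdInLeafContext(); }  // top-level docid

    public static void main(String[] args) {
        // doc 7 in leaf 2, whose docBase is 100 -> global docid 107
        DocLookupResult r = new DocLookupResult((2L << 32) | 7L, 100);
        System.out.println(r.exists() + " " + r.getLeafContextId() + " "
            + r.getDocIdInLeafContext() + " " + r.getDocId());   // true 2 7 107
    }
}
```

This keeps the per-segment efficiency available via `getLeafContextId()`/`getDocIdInLeafContext()` while the simple callers just use `getDocId()`, avoiding the int-vs-long confusion the comment worries about.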
[jira] [Updated] (SOLR-13783) NamedList.toString() ought to be consistent with AbstractMap
[ https://issues.apache.org/jira/browse/SOLR-13783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Munendra S N updated SOLR-13783: Labels: newdev (was: ) > NamedList.toString() ought to be consistent with AbstractMap > > > Key: SOLR-13783 > URL: https://issues.apache.org/jira/browse/SOLR-13783 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: David Smiley >Priority: Minor > Labels: newdev > > NamedList.toString() does not put a space after the joining commas, whereas > AbstractMap does. I think it ought to be consistent. This can matter if you > write tests that toString() a piece of a SolrResponse that varies in its use > of EmbeddedSolrServer versus other SolrClients that serialize the response. > Some custom SearchComponents or whatever might prefer to use a Map and that > should ultimately be consistent. I know the assumption is a little brittle > but still. > I think 9.0 without a back-port is safe. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
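The inconsistency described in SOLR-13783 is easy to see side by side. The `namedListStyle` method below is a sketch that reproduces the no-space joining behavior described in the issue; it is not Solr's actual `NamedList` code:

```java
// AbstractMap (via LinkedHashMap) joins entries with ", " while a
// NamedList-style toString (sketched here) omits the space after the comma.
import java.util.LinkedHashMap;
import java.util.Map;

public class ToStringConsistency {
    /** NamedList-style rendering: comma with no trailing space. */
    public static String namedListStyle(Map<String, Object> entries) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : entries.entrySet()) {
            if (!first) sb.append(',');          // <- no space: the inconsistency
            first = false;
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put("numFound", 10);
        m.put("start", 0);
        System.out.println(m.toString());        // {numFound=10, start=0}
        System.out.println(namedListStyle(m));   // {numFound=10,start=0}
    }
}
```

A test that string-compares a `toString()`-ed response piece will pass or fail depending on whether a Map or a NamedList backed it, which is exactly the brittleness the issue calls out.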
[jira] [Commented] (SOLR-11087) Get rid of jar duplicates in release
[ https://issues.apache.org/jira/browse/SOLR-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956269#comment-16956269 ] Uwe Schindler commented on SOLR-11087: -- But nevertheless, we can get rid of the duplicates if we do some classpath magic. If we move all SolrJ JAR files up the tree and add them to Jetty's classpath, we can still keep the rest in the webapp folder. Webapps see JAR files from the context, too. > Get rid of jar duplicates in release > > > Key: SOLR-11087 > URL: https://issues.apache.org/jira/browse/SOLR-11087 > Project: Solr > Issue Type: Sub-task > Components: Build >Reporter: Jan Høydahl >Priority: Major > Fix For: 8.1, master (9.0) > > Attachments: SOLR-11087.patch > > > Sub task of SOLR-6806 > The {{dist/}} folder contains many duplicate jar files, totalling 10,5M: > {noformat} > 4,6M ./dist/solr-core-6.6.0.jar (WEB-INF/lib) > 1,2M ./dist/solr-solrj-6.6.0.jar (WEB-INF/lib) > 4,7M ./dist/solrj-lib/* (WEB-INF/lib and server/lib/ext) > {noformat} > The rest of the files in dist/ are contrib jars and test-framework. > To weed out the duplicates and save 10,5M, we can simply add a > {{dist/README.md}} file listing what jar files are located where. The file > could also contain a bash one-liner to copy them to the dist folder. Another > possibility is to ship the binary release tarball with symlinks in the dist > folder, and advise people to use {{cp -RL dist mydist}} which will make a > copy with the real files. Downside is that this won't work for ZIP archives > that do not preserve symlinks, and neither on Windows. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Created] (SOLR-13854) Remove deprecated SolrMetricProducer.initializeMetrics API
Andrzej Bialecki created SOLR-13854: --- Summary: Remove deprecated SolrMetricProducer.initializeMetrics API Key: SOLR-13854 URL: https://issues.apache.org/jira/browse/SOLR-13854 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Reporter: Andrzej Bialecki Assignee: Andrzej Bialecki Fix For: master (9.0) SOLR-13677 introduced an improved API for registration and cleanup of metrics for Solr components. The previous API has been deprecated in 8x and it should be removed in 9.0. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-11087) Get rid of jar duplicates in release
[ https://issues.apache.org/jira/browse/SOLR-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956263#comment-16956263 ] Uwe Schindler edited comment on SOLR-11087 at 10/21/19 4:44 PM: Hi, I think the issue here should focus on getting rid of the web application and having a single lib folder directly below the root dir of the distribution. Then we have a solr-main.jar (without solrj) and this one also contains a Main.class to bootstrap Jetty. This would make deployment much easier. As said before, the tons of HTML/Javascript should also be packaged into a JAR file to get rid of tons of small files making unzipping damn slow and consuming lots of space (block size). Jetty is able to deliver static content from a JAR file directly; I use this all the time for microservice-like stuff. Once we are at that place we could maybe split the root lib folder into two: one with SolrJ and one with the remaining stuff to start up the server. The contrib modules can be linked into cores the usual way with {{}}. If I had some more time, I could start refactoring all this, but my knowledge of Solr is limited. I'd really like to get rid of SolrDispatchFilter and replace it with a simple servlet or better a Jetty Handler directly added to the root context in the Solr Bootstrap class. About your comment: hardlinks are a problem on Windows, and symlinks are not better. I'd not do this. was (Author: thetaphi): Hi, I think the issue here should focus on getting rid of the web application and having a single lib folder directly below the root dir of the distribution. Then we have a solr-main.jar (without solrj) and this one also contains a Main.class to bootstrap Jetty. This would make deployment much easier. As said before, the tons of HTML/Javascript should also be packaged into a JAR file to get rid of tons of small files making unzipping damn slow and consuming lots of space (block size). 
Jetty is able to deliver static content from a JAR file directly; I use this all the time for microservice-like stuff. Once we are at that place we could maybe split the root lib folder into two: one with SolrJ and one with the remaining stuff to start up the server. The contrib modules can be linked into cores the usual way with {{}}. If I had some more time, I could start refactoring all this, but my knowledge of Solr is limited. I'd really like to get rid of SolrDispatchFilter and replace it with a simple servlet directly added to the root context in the Solr Bootstrap class. About your comment: hardlinks are a problem on Windows, and symlinks are not better. I'd not do this. > Get rid of jar duplicates in release > > > Key: SOLR-11087 > URL: https://issues.apache.org/jira/browse/SOLR-11087 > Project: Solr > Issue Type: Sub-task > Components: Build >Reporter: Jan Høydahl >Priority: Major > Fix For: 8.1, master (9.0) > > Attachments: SOLR-11087.patch > > > Sub task of SOLR-6806 > The {{dist/}} folder contains many duplicate jar files, totalling 10,5M: > {noformat} > 4,6M ./dist/solr-core-6.6.0.jar (WEB-INF/lib) > 1,2M ./dist/solr-solrj-6.6.0.jar (WEB-INF/lib) > 4,7M ./dist/solrj-lib/* (WEB-INF/lib and server/lib/ext) > {noformat} > The rest of the files in dist/ are contrib jars and test-framework. > To weed out the duplicates and save 10,5M, we can simply add a > {{dist/README.md}} file listing what jar files are located where. The file > could also contain a bash one-liner to copy them to the dist folder. Another > possibility is to ship the binary release tarball with symlinks in the dist > folder, and advise people to use {{cp -RL dist mydist}} which will make a > copy with the real files. Downside is that this won't work for ZIP archives > that do not preserve symlinks, and neither on Windows. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-11087) Get rid of jar duplicates in release
[ https://issues.apache.org/jira/browse/SOLR-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956263#comment-16956263 ] Uwe Schindler commented on SOLR-11087: -- Hi, I think the issue here should focus on getting rid of the web application and having a single lib folder directly below the root dir of the distribution. Then we have a solr-main.jar (without solrj) and this one also contains a Main.class to bootstrap Jetty. This would make deployment much easier. As said before, the tons of Java/Javascript should also be packaged into a JAR file to get rid of tons of small files making unzipping damn slow and consuming lots of space (block size). Jetty is able to deliver static content from a JAR file directly; I use this all the time for microservice-like stuff. Once we are at that place we could maybe split the root lib folder into two: one with SolrJ and one with the remaining stuff to start up the server. The contrib modules can be linked into cores the usual way with {{}}. If I had some more time, I could start refactoring all this, but my knowledge of Solr is limited. I'd really like to get rid of SolrDispatchFilter and replace it with a simple servlet directly added to the root context in the Solr Bootstrap class. About your comment: hardlinks are a problem on Windows, and symlinks are not better. I'd not do this. 
> Get rid of jar duplicates in release > > > Key: SOLR-11087 > URL: https://issues.apache.org/jira/browse/SOLR-11087 > Project: Solr > Issue Type: Sub-task > Components: Build >Reporter: Jan Høydahl >Priority: Major > Fix For: 8.1, master (9.0) > > Attachments: SOLR-11087.patch > > > Sub task of SOLR-6806 > The {{dist/}} folder contains many duplicate jar files, totalling 10,5M: > {noformat} > 4,6M ./dist/solr-core-6.6.0.jar (WEB-INF/lib) > 1,2M ./dist/solr-solrj-6.6.0.jar (WEB-INF/lib) > 4,7M ./dist/solrj-lib/* (WEB-INF/lib and server/lib/ext) > {noformat} > The rest of the files in dist/ are contrib jars and test-framework. > To weed out the duplicates and save 10,5M, we can simply add a > {{dist/README.md}} file listing what jar files are located where. The file > could also contain a bash one-liner to copy them to the dist folder. Another > possibility is to ship the binary release tarball with symlinks in the dist > folder, and advise people to use {{cp -RL dist mydist}} which will make a > copy with the real files. Downside is that this won't work for ZIP archives > that do not preserve symlinks, and neither on Windows. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-11087) Get rid of jar duplicates in release
[ https://issues.apache.org/jira/browse/SOLR-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956263#comment-16956263 ] Uwe Schindler edited comment on SOLR-11087 at 10/21/19 4:43 PM: Hi, I think the issue here should focus on getting rid of the web application and having a single lib folder directly below the root dir of the distribution. Then we have a solr-main.jar (without solrj) and this one also contains a Main.class to bootstrap Jetty. This would make deployment much easier. As said before, the tons of HTML/Javascript should also be packaged into a JAR file to get rid of tons of small files making unzipping damn slow and consuming lots of space (block size). Jetty is able to deliver static content from a JAR file directly; I use this all the time for microservice-like stuff. Once we are at that place we could maybe split the root lib folder into two: one with SolrJ and one with the remaining stuff to start up the server. The contrib modules can be linked into cores the usual way with {{}}. If I had some more time, I could start refactoring all this, but my knowledge of Solr is limited. I'd really like to get rid of SolrDispatchFilter and replace it with a simple servlet directly added to the root context in the Solr Bootstrap class. About your comment: hardlinks are a problem on Windows, and symlinks are not better. I'd not do this. was (Author: thetaphi): Hi, I think the issue here should focus on getting rid of the web application and having a single lib folder directly below the root dir of the distribution. Then we have a solr-main.jar (without solrj) and this one also contains a Main.class to bootstrap Jetty. This would make deployment much easier. As said before, the tons of Java/Javascript should also be packaged into a JAR file to get rid of tons of small files making unzipping damn slow and consuming lots of space (block size). 
Jetty is able to deliver static content from a JAR file directly; I use this all the time for microservice-like stuff. Once we are at that place we could maybe split the root lib folder into two: one with SolrJ and one with the remaining stuff to start up the server. The contrib modules can be linked into cores the usual way with {{}}. If I had some more time, I could start refactoring all this, but my knowledge of Solr is limited. I'd really like to get rid of SolrDispatchFilter and replace it with a simple servlet directly added to the root context in the Solr Bootstrap class. About your comment: hardlinks are a problem on Windows, and symlinks are not better. I'd not do this. > Get rid of jar duplicates in release > > > Key: SOLR-11087 > URL: https://issues.apache.org/jira/browse/SOLR-11087 > Project: Solr > Issue Type: Sub-task > Components: Build >Reporter: Jan Høydahl >Priority: Major > Fix For: 8.1, master (9.0) > > Attachments: SOLR-11087.patch > > > Sub task of SOLR-6806 > The {{dist/}} folder contains many duplicate jar files, totalling 10,5M: > {noformat} > 4,6M ./dist/solr-core-6.6.0.jar (WEB-INF/lib) > 1,2M ./dist/solr-solrj-6.6.0.jar (WEB-INF/lib) > 4,7M ./dist/solrj-lib/* (WEB-INF/lib and server/lib/ext) > {noformat} > The rest of the files in dist/ are contrib jars and test-framework. > To weed out the duplicates and save 10,5M, we can simply add a > {{dist/README.md}} file listing what jar files are located where. The file > could also contain a bash one-liner to copy them to the dist folder. Another > possibility is to ship the binary release tarball with symlinks in the dist > folder, and advise people to use {{cp -RL dist mydist}} which will make a > copy with the real files. Downside is that this won't work for ZIP archives > that do not preserve symlinks, and neither on Windows. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-11087) Get rid of jar duplicates in release
[ https://issues.apache.org/jira/browse/SOLR-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956243#comment-16956243 ] Shawn Heisey commented on SOLR-11087: - I like the idea of replacing the duplicates in dist with a README describing exactly which jars to copy out of the webapp if they are needed and where to find them. One thing I wonder is whether we expect hardlinks to be supported across all operating systems that natively support tarballs -- if we do, we could use hardlinks to share files in dist and the webapp, and mention this fact in the README. If we think that hardlinks might be a specialty item, then just the README would be appropriate. While looking at SOLR-13841 I found that we have duplicate jars other than the ones in dist, which currently are expected. My question for the moment: Should I open a new issue for those other duplicates, or would we want to expand this issue to cover a full audit of the jars in the binary release? I almost started with a new issue ... glad I searched first. > Get rid of jar duplicates in release > > > Key: SOLR-11087 > URL: https://issues.apache.org/jira/browse/SOLR-11087 > Project: Solr > Issue Type: Sub-task > Components: Build >Reporter: Jan Høydahl >Priority: Major > Fix For: 8.1, master (9.0) > > Attachments: SOLR-11087.patch > > > Sub task of SOLR-6806 > The {{dist/}} folder contains many duplicate jar files, totalling 10,5M: > {noformat} > 4,6M ./dist/solr-core-6.6.0.jar (WEB-INF/lib) > 1,2M ./dist/solr-solrj-6.6.0.jar (WEB-INF/lib) > 4,7M ./dist/solrj-lib/* (WEB-INF/lib and server/lib/ext) > {noformat} > The rest of the files in dist/ are contrib jars and test-framework. > To weed out the duplicates and save 10,5M, we can simply add a > {{dist/README.md}} file listing what jar files are located where. The file > could also contain a bash one-liner to copy them to the dist folder. 
Another > possibility is to ship the binary release tarball with symlinks in the dist > folder, and advise people to use {{cp -RL dist mydist}} which will make a > copy with the real files. Downside is that this won't work for ZIP archives > that do not preserve symlinks, and neither on Windows. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13841) Add jackson databind annotations to SolrJ classpath
[ https://issues.apache.org/jira/browse/SOLR-13841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956213#comment-16956213 ] Andrzej Bialecki commented on SOLR-13841: - The issue here is also adding more dependencies to SolrJ, this should be discussed first - most users won't care about added / changed dependencies on the server-side but they may easily run into conflicts in their client-side code. > Add jackson databind annotations to SolrJ classpath > --- > > Key: SOLR-13841 > URL: https://issues.apache.org/jira/browse/SOLR-13841 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Assignee: Noble Paul >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > We can start using annotations in SolrJ to minimize the amount of code we > write & improve readability. Jackson is a widely used library and everyone is > already familiar with it -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13841) Add jackson databind annotations to SolrJ classpath
[ https://issues.apache.org/jira/browse/SOLR-13841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956166#comment-16956166 ] Shawn Heisey commented on SOLR-13841: - Jackson is already included in Solr (the server side). In the webapp for version 8.2.0, you can find three jackson jars, including jackson-annotations. I found that we have some jar duplication. The lib directory for the prometheus-exporter contrib contains three jackson jars which are already included in Solr. It also contains log4j jars. The lib directory for the clustering contrib contains two jackson jars. If we do this right, we can make the Solr download a few megabytes smaller. Assuming I understand how everything fits together, we would remove the jackson-annotations dependency from solr-core, all jackson dependencies from clustering, and all jackson dependencies except jackson-jq from prometheus-exporter. Then we would add jackson-annotations to solrj. I checked the script for starting the prometheus exporter as a separate process, and it includes Solr's WEB-INF/lib directory, so the jackson jars are already available without needing to put them in the contrib lib directory. If the jetty lib/ext directory is added to the classpath in the scripts, then we can also remove the log4j jars from the prometheus-exporter module ... although we might want to tackle that in a separate issue. I do not know whether we can remove slf4j-api from the prometheus-exporter module ... I guess that depends on whether the module can be used without the rest of Solr like SolrJ can. My guess is that it can't be used in this way and that we can remove slf4j-api. > Add jackson databind annotations to SolrJ classpath > --- > > Key: SOLR-13841 > URL: https://issues.apache.org/jira/browse/SOLR-13841 > Project: Solr > Issue Type: Task > Security Level: Public(Default Security Level. 
Issues are Public) >Reporter: Noble Paul >Assignee: Noble Paul >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > We can start using annotations in SolrJ to minimize the amount of code we > write & improve readability. Jackson is a widely used library and everyone is > already familiar with it -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8992) Share minimum score across segments in concurrent search
[ https://issues.apache.org/jira/browse/LUCENE-8992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956145#comment-16956145 ] ASF subversion and git services commented on LUCENE-8992: - Commit cfa49401671b5f9958d46c04120df8c7e3f358be in lucene-solr's branch refs/heads/master from jimczi [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=cfa4940 ] LUCENE-8992: Update CHANGES after backport to 8x > Share minimum score across segments in concurrent search > > > Key: LUCENE-8992 > URL: https://issues.apache.org/jira/browse/LUCENE-8992 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Jim Ferenczi >Priority: Minor > Fix For: master (9.0), 8.4 > > Time Spent: 6h 10m > Remaining Estimate: 0h > > As a follow up of LUCENE-8978 we should share the minimum score in > concurrent search > for top field collectors that sort on relevance first. The same logic should > be applicable with the only condition that the primary sort is by relevance. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-8992) Share minimum score across segments in concurrent search
[ https://issues.apache.org/jira/browse/LUCENE-8992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jim Ferenczi resolved LUCENE-8992. -- Fix Version/s: 8.4 master (9.0) Resolution: Fixed Thanks [~atris] and [~jpountz]. I merged in master and 8x.
[jira] [Commented] (LUCENE-8978) "Max Bottom" Based Early Termination For Concurrent Search
[ https://issues.apache.org/jira/browse/LUCENE-8978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956139#comment-16956139 ] ASF subversion and git services commented on LUCENE-8978: - Commit 1c23a3c14e78ab0840633bbfbbbad924bf7faefe in lucene-solr's branch refs/heads/branch_8x from jimczi [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1c23a3c ] LUCENE-8992: Share minimum score across segment in concurrent search This is a follow up of LUCENE-8978 that introduces shared minimum score across segment in concurrent search for top field collectors that sort by relevance first. > "Max Bottom" Based Early Termination For Concurrent Search > -- > > Key: LUCENE-8978 > URL: https://issues.apache.org/jira/browse/LUCENE-8978 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Atri Sharma >Assignee: Atri Sharma >Priority: Major > Time Spent: 8.5h > Remaining Estimate: 0h > > When running a search concurrently, collectors which have collected the > number of hits requested locally i.e. their local priority queue is full can > then globally publish their bottom hit's score, and other collectors can then > use that score as the filter. If multiple collectors have full priority > queues, the maximum of all bottom scores will be considered as the global > bottom score.
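The "max bottom" mechanism described in the issue can be sketched in miniature (plain Python, not Lucene's actual TopFieldCollector/accumulator code; the names and the skip condition are simplified assumptions of this sketch):

```python
# Miniature sketch of "max bottom" early termination: a collector that has
# filled its local top-k queue publishes its bottom score, and every
# collector skips hits scoring strictly below the maximum of all published
# bottoms -- such hits cannot appear in the final merged top-k.
import heapq

class BottomScoreAccumulator:
    """Shared across concurrent collectors."""
    def __init__(self):
        self.global_bottom = float("-inf")

    def publish(self, bottom_score):
        self.global_bottom = max(self.global_bottom, bottom_score)

def collect(hits, k, acc):
    """Collect the local top-k of (doc, score) pairs, skipping
    globally non-competitive hits."""
    queue = []  # min-heap of (score, doc); queue[0] is the local bottom
    for doc, score in hits:
        if score < acc.global_bottom:
            continue  # some full queue already holds k hits >= this score
        if len(queue) < k:
            heapq.heappush(queue, (score, doc))
        elif score > queue[0][0]:
            heapq.heapreplace(queue, (score, doc))
        if len(queue) == k:
            acc.publish(queue[0][0])  # local bottom: candidate global bottom
    return sorted(queue, reverse=True)
```

A second collector that starts after the first has filled its queue will skip every hit below the first collector's bottom score without touching its own heap.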
[jira] [Commented] (LUCENE-8992) Share minimum score across segments in concurrent search
[ https://issues.apache.org/jira/browse/LUCENE-8992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956138#comment-16956138 ] ASF subversion and git services commented on LUCENE-8992: - Commit 1c23a3c14e78ab0840633bbfbbbad924bf7faefe in lucene-solr's branch refs/heads/branch_8x from jimczi [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=1c23a3c ] LUCENE-8992: Share minimum score across segment in concurrent search This is a follow up of LUCENE-8978 that introduces shared minimum score across segment in concurrent search for top field collectors that sort by relevance first.
[jira] [Commented] (SOLR-13568) Expand component should not cache group queries in the filter cache
[ https://issues.apache.org/jira/browse/SOLR-13568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956089#comment-16956089 ] KHADIDJA MESSAOUDI commented on SOLR-13568: --- +1 > Expand component should not cache group queries in the filter cache > --- > > Key: SOLR-13568 > URL: https://issues.apache.org/jira/browse/SOLR-13568 > Project: Solr > Issue Type: Bug >Affects Versions: 7.7.2, 8.1.1 >Reporter: Ludovic Boutros >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > Currently the expand component is creating queries (bit sets) from the > current page document ids. > These queries are sadly put in the filter cache. > This behavior floods the filter cache and it becomes inefficient. > Therefore, the group query should be wrapped in a query with its cache flag > disabled. > This is a tiny little thing to do. The PR will follow very soon.
[jira] [Commented] (SOLR-13568) Expand component should not cache group queries in the filter cache
[ https://issues.apache.org/jira/browse/SOLR-13568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956076#comment-16956076 ] Ludovic Boutros commented on SOLR-13568: I just updated the code to resolve the conflict. Should be easy now :)
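The flooding effect this issue describes can be demonstrated with a toy LRU cache (plain Python, not Solr's actual filter cache): per-page group queries are one-off keys, so every page evicts filters that would otherwise be reused.

```python
# Toy model of SOLR-13568 (not Solr code): an LRU filter cache holds
# reusable filters, but every page of expand results inserts a one-off
# "group query" entry, evicting the entries that would actually be hit again.
from collections import OrderedDict

class LRUFilterCache:
    """Minimal LRU cache that tracks hits and misses."""
    def __init__(self, max_size):
        self.max_size = max_size
        self.data = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key, compute):
        if key in self.data:
            self.data.move_to_end(key)  # mark as recently used
            self.hits += 1
            return self.data[key]
        self.misses += 1
        value = compute()
        self.data[key] = value
        if len(self.data) > self.max_size:
            self.data.popitem(last=False)  # evict least recently used
        return value
```

Keeping the group query's cache flag off (the fix proposed in the issue description) keeps these one-off entries out of the cache entirely, so the reusable filters survive.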
[jira] [Commented] (SOLR-13841) Add jackson databind annotations to SolrJ classpath
[ https://issues.apache.org/jira/browse/SOLR-13841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956071#comment-16956071 ] David Smiley commented on SOLR-13841: - Could you please be more specific within the SolrJ codebase as to what classes would benefit / how? Would this be purely internal (I suppose it would). Might this be an alternative?: https://github.com/FasterXML/jackson-docs/wiki/JacksonMixInAnnotations
[jira] [Commented] (LUCENE-8992) Share minimum score across segments in concurrent search
[ https://issues.apache.org/jira/browse/LUCENE-8992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16956043#comment-16956043 ] ASF subversion and git services commented on LUCENE-8992: - Commit 066d324006507e9830179a9801bf8860d2ffc9b2 in lucene-solr's branch refs/heads/master from Jim Ferenczi [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=066d324 ] Merge pull request #904 from jimczi/shared_min_score LUCENE-8992: Share minimum score across segment in concurrent search
[GitHub] [lucene-solr] jimczi merged pull request #904: LUCENE-8992: Share minimum score across segment in concurrent search
jimczi merged pull request #904: LUCENE-8992: Share minimum score across segment in concurrent search URL: https://github.com/apache/lucene-solr/pull/904 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-13808) Query DSL should let to cache filter
[ https://issues.apache.org/jira/browse/SOLR-13808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952969#comment-16952969 ] Mikhail Khludnev edited comment on SOLR-13808 at 10/21/19 11:55 AM: Ok. It seems like the plan is to # -create \{!cache} query parser to hook it up by existing DSL. Caveat for users is loosing scoring-. # enable cache by default for \{!bool filter=... filter=..} # make sure that it's sensitive for \{!cache=false} local param for enclosing queries I'm fine with it and patches are welcome. was (Author: mkhludnev): Ok. It seems like the plan is to # create \{!cache} query parser to hook it up by existing DSL. Caveat for users is loosing scoring. # enable cache by default for \{!bool filter=... filter=..} # make sure that it sensitive for \{!cache=false} local param for enclosing queries I'm fine with it and patches are welcome. > Query DSL should let to cache filter > > > Key: SOLR-13808 > URL: https://issues.apache.org/jira/browse/SOLR-13808 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Mikhail Khludnev >Priority: Major > > Query DSL let to express Lucene BQ's filter > > {code:java} > { query: {bool: { filter: {term: {f:name,query:"foo bar"}}} }}{code} > However, it might easily catch the need in caching it in filter cache. This > might rely on ExtensibleQuery and QParser: > {code:java} > { query: {bool: { filter: {term: {f:name,query:"foo bar", cache:true}}} }} > {code} >
[jira] [Commented] (SOLR-13850) Atomic Updates with PreAnalyzedField
[ https://issues.apache.org/jira/browse/SOLR-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955997#comment-16955997 ] Mikhail Khludnev commented on SOLR-13850: - {quote}I'm not even sure it's meaningful to have a pre-analyzed field be "stored". {quote} [Looks like|https://lucene.apache.org/solr/guide/6_6/working-with-external-files-and-processes.html#WorkingwithExternalFilesandProcesses-JsonPreAnalyzedParser] it is. Let's embrace the failure. Patch is welcome. > Atomic Updates with PreAnalyzedField > > > Key: SOLR-13850 > URL: https://issues.apache.org/jira/browse/SOLR-13850 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 7.7.2, 8.2 > Environment: Ubuntu 16.04 LTS / Java 8 (Zulu), Windows 10 / Java 11 > (Oracle) >Reporter: Oleksandr Drapushko >Priority: Critical > Labels: AtomicUpdate > > If you try to update non pre-analyzed fields in a document using atomic > updates, data in pre-analyzed fields (if there is any) will be lost. > *Steps to reproduce* > 1. Index this document into techproducts > {code:json} > { > "id": "a", > "n_s": "s1", > "pre": > "{\"v\":\"1\",\"str\":\"Alaska\",\"tokens\":[{\"t\":\"alaska\",\"s\":0,\"e\":6,\"i\":1}]}" > } > {code} > 2. Query the document > {code:json} > { > "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[ > { > "id":"a", > "n_s":"s1", > "pre":"Alaska", > "_version_":1647475215142223872}] > }} > {code} > 3. Update using atomic syntax > {code:json} > { > "add": { > "doc": { > "id": "a", > "n_s": {"set": "s2"} > }}} > {code} > 4. 
Observe the warning in solr log > UI: > {noformat} > WARN x:techproducts_shard2_replica_n6 PreAnalyzedField Error parsing > pre-analyzed field 'pre' > {noformat} > solr.log: > {noformat} > WARN (qtp1384454980-23) [c:techproducts s:shard2 r:core_node8 > x:techproducts_shard2_replica_n6] o.a.s.s.PreAnalyzedField Error parsing > pre-analyzed field 'pre' => java.io.IOException: Invalid JSON type > java.lang.String, expected Map > at > org.apache.solr.schema.JsonPreAnalyzedParser.parse(JsonPreAnalyzedParser.java:86) > {noformat} > 5. Query the document again > {code:json} > { > "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[ > { > "id":"a", > "n_s":"s2", > "_version_":1647475461695995904}] > }} > {code} > *Result*: There is no 'pre' field in the document anymore. > _My thoughts on it_ > 1. Data loss can be prevented if the warning will be replaced with error > (re-throwing exception). Atomic updates for such documents still won't work, > but updates will be explicitly rejected. > 2. Solr tries to read the document from index, merge it with input document > and re-index the document, but when it reads indexed pre-analyzed fields the > format is different, so Solr cannot parse and re-index those fields properly.
[jira] [Commented] (SOLR-13850) Atomic Updates with PreAnalyzedField
[ https://issues.apache.org/jira/browse/SOLR-13850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955961#comment-16955961 ] Oleksandr Drapushko commented on SOLR-13850: [~dsmiley], For Atomic Updates all fields must be configured as stored or docValues, except for copyFields. Since PreAnalyzedField does not support docValues, it must be stored. If you define it as stored=false, which you shouldn't, it will result in data loss too.
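The failure mode in point 2 of the report can be sketched: the value read back from the stored field during the atomic update is the bare analyzed text, not the original pre-analyzed JSON map, so the parser rejects it (illustrative Python mimicking the reported error, not Solr's JsonPreAnalyzedParser):

```python
# Sketch of the SOLR-13850 failure: indexing accepts a pre-analyzed JSON
# map, but the stored value read back for an atomic update is bare text,
# which a map-only parser rejects -- and a mere warning turns that into
# silent data loss.
import json

def parse_preanalyzed(value):
    """Accept only a JSON object (map), like the pre-analyzed JSON format."""
    try:
        obj = json.loads(value)
    except ValueError:
        obj = value  # not JSON at all: plain stored text
    if not isinstance(obj, dict):
        raise IOError("Invalid JSON type %s, expected Map"
                      % type(obj).__name__)
    return obj

original = '{"v":"1","str":"Alaska","tokens":[{"t":"alaska","s":0,"e":6,"i":1}]}'
stored_value = "Alaska"  # what gets read back for the atomic update
```

Parsing `original` succeeds, parsing `stored_value` raises, which is exactly the point where re-throwing (instead of warning) would reject the update rather than drop the field.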
[GitHub] [lucene-solr] sigram commented on issue #959: SOLR-13677 fix & cleanup
sigram commented on issue #959: SOLR-13677 fix & cleanup URL: https://github.com/apache/lucene-solr/pull/959#issuecomment-544458628 Committed.
[GitHub] [lucene-solr] sigram closed pull request #959: SOLR-13677 fix & cleanup
sigram closed pull request #959: SOLR-13677 fix & cleanup URL: https://github.com/apache/lucene-solr/pull/959
[GitHub] [lucene-solr] sigram edited a comment on issue #959: SOLR-13677 fix & cleanup
sigram edited a comment on issue #959: SOLR-13677 fix & cleanup URL: https://github.com/apache/lucene-solr/pull/959#issuecomment-544458628 Committed to branch_8_3, branch_8x and master.
[jira] [Commented] (SOLR-13568) Expand component should not cache group queries in the filter cache
[ https://issues.apache.org/jira/browse/SOLR-13568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955928#comment-16955928 ] Ludovic Boutros commented on SOLR-13568: Hello, anyone on this issue please ?
[jira] [Commented] (SOLR-13125) Optimize Queries when sorting by router.field
[ https://issues.apache.org/jira/browse/SOLR-13125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16955861#comment-16955861 ] Jan Høydahl commented on SOLR-13125: Hi, following up on this. [~gus] I have not looked into the code yet wrt new SearchComponent hook or some other design. I think mosh may be correct that it would be better if this is decoupled from searchHandler and more a feature of TRA. > Optimize Queries when sorting by router.field > - > > Key: SOLR-13125 > URL: https://issues.apache.org/jira/browse/SOLR-13125 > Project: Solr > Issue Type: Sub-task >Reporter: mosh >Assignee: Gus Heck >Priority: Minor > Attachments: SOLR-13125-no-commit.patch, SOLR-13125.patch, > SOLR-13125.patch, SOLR-13125.patch > > Time Spent: 10m > Remaining Estimate: 0h > > We are currently testing TRA using Solr 7.7, having >300 shards in the alias, > with much growth in the coming months. > The "hot" data(in our case, more recent) will be stored on stronger > nodes(SSD, more RAM, etc). > A proposal of optimizing queries sorted by router.field(the field which TRA > uses to route the data to the correct collection) has emerged. > Perhaps, in queries which are sorted by router.field, Solr could be smart > enough to wait for the more recent collections, and in case the limit was > reached cancel other queries(or just not block and wait for the results)? > For example: > When querying a TRA which with a filter on a different field than > router.field, but sorting by router.field desc, limit=100. > Since this is a TRA, solr will issue queries for all the collections in the > alias. > But to optimize this particular type of query, Solr could wait for the most > recent collection in the TRA, see whether the result set matches or exceeds > the limit. If so, the query could be returned to the user without waiting for > the rest of the shards. 
If not, the issuing node will block until the second > query returns, and so forth, until the limit of the request is reached. > This might also be useful for deep paging, querying each collection and only > skipping to the next once there are no more results in the specified > collection. > Thoughts or inputs are always welcome. > This is just my two cents, and I'm always happy to brainstorm. > Thanks in advance.
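The proposal above can be sketched as follows (illustrative Python, not Solr code; `search_fn` and the newest-first collection ordering are assumptions of the sketch). Because a TRA partitions documents by router.field, every document in a newer collection sorts before every document in an older one when sorting by router.field descending, so the fan-out can stop early:

```python
# Sketch of the proposed TRA optimization: for a query sorted by
# router.field descending, visit collections newest-first and stop as soon
# as `limit` docs are gathered -- older collections cannot contribute
# higher sort values.
def query_tra_newest_first(collections, search_fn, limit):
    """collections: newest-first list of collection names.
    search_fn(name): returns that collection's matches, already sorted
    by router.field descending."""
    results = []
    for name in collections:
        results.extend(search_fn(name))
        if len(results) >= limit:
            break  # remaining (older) collections can be skipped entirely
    return results[:limit]
```

The same newest-first walk also covers the deep-paging variant mentioned above: move to the next (older) collection only once the current one is exhausted.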
[jira] [Created] (LUCENE-9019) Patch for making some variable names more consistent with the other parts of the code.
Yusuke Shinyama created LUCENE-9019: --- Summary: Patch for making some variable names more consistent with the other parts of the code. Key: LUCENE-9019 URL: https://issues.apache.org/jira/browse/LUCENE-9019 Project: Lucene - Core Issue Type: Improvement Reporter: Yusuke Shinyama Fix For: trunk Hello, we're developing an automated system that detects inconsistent variable names in a large software project. Our system checks if each variable name is consistent with other variables in the project in its usage pattern, and proposes correct candidates if inconsistency is detected. This is a part of academic research that we hope to publish soon, but as a part of the evaluation, we applied our systems to your projects and got a few interesting results. We carefully reviewed our system output and manually created a patch to correct a few variable names. We would be delighted if this patch is found to be useful. If you have a question or suggestion regarding this patch, we'd happily answer. Thank you. P.S. our patch is purely for readability purposes and does not change any functionality. A couple of issues that we've noticed were left untouched. For example, mixed use of variable names "len" and "length" were fairly widespread, but we have only corrected a few notable instances to minimize our impact.
[GitHub] [lucene-solr] euske opened a new pull request #961: [LUCENE-9019] Patch for making some variable names more consistent with the other parts of the code.
euske opened a new pull request #961: [LUCENE-9019] Patch for making some variable names more consistent with the other parts of the code. URL: https://github.com/apache/lucene-solr/pull/961
[GitHub] [lucene-solr] atris commented on issue #916: LUCENE-8213: Asynchronous Caching in LRUQueryCache
atris commented on issue #916: LUCENE-8213: Asynchronous Caching in LRUQueryCache URL: https://github.com/apache/lucene-solr/pull/916#issuecomment-544379560 @jpountz Please let me know your thoughts on this