[GitHub] [lucene-solr] thomaswoeckinger commented on a change in pull request #911: SOLR-13802: Write analyzer property luceneMatchVersion to managed schema
thomaswoeckinger commented on a change in pull request #911: SOLR-13802: Write analyzer property luceneMatchVersion to managed schema
URL: https://github.com/apache/lucene-solr/pull/911#discussion_r329888692

## File path: solr/core/src/test/org/apache/solr/analysis/TestWordDelimiterFilterFactory.java
## @@ -222,7 +223,7 @@ public void testCustomTypes() throws Exception {
 /* custom behavior */
 args = new HashMap<>(); // use a custom type mapping
-args.put("luceneMatchVersion", Version.LATEST.toString());
+args.put(IndexSchema.LUCENE_MATCH_VERSION_PARAM, Version.LATEST.toString());

Review comment: later comment?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] dsmiley commented on issue #910: SOLR-13661: A package management system for Solr
dsmiley commented on issue #910: SOLR-13661: A package management system for Solr
URL: https://github.com/apache/lucene-solr/pull/910#issuecomment-536862873

The scope here is too much to review at once; reviewing sub-system by sub-system / component by component would be more approachable. For example, the new blob store ought to be its own issue/PR.
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #911: SOLR-13802: Write analyzer property luceneMatchVersion to managed schema
dsmiley commented on a change in pull request #911: SOLR-13802: Write analyzer property luceneMatchVersion to managed schema
URL: https://github.com/apache/lucene-solr/pull/911#discussion_r329877193

## File path: solr/core/src/test/org/apache/solr/analysis/TestWordDelimiterFilterFactory.java
## @@ -222,7 +223,7 @@ public void testCustomTypes() throws Exception {
 /* custom behavior */
 args = new HashMap<>(); // use a custom type mapping
-args.put("luceneMatchVersion", Version.LATEST.toString());
+args.put(IndexSchema.LUCENE_MATCH_VERSION_PARAM, Version.LATEST.toString());

Review comment: not ugly (see later comment)
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #911: SOLR-13802: Write analyzer property luceneMatchVersion to managed schema
dsmiley commented on a change in pull request #911: SOLR-13802: Write analyzer property luceneMatchVersion to managed schema
URL: https://github.com/apache/lucene-solr/pull/911#discussion_r329877133

## File path: solr/core/src/test/org/apache/solr/rest/schema/TestSerializedLuceneMatchVersion.java
## @@ -44,13 +45,13 @@ public void testExplicitLuceneMatchVersions() throws Exception {
 "count(/response/lst[@name='fieldType']) = 1",
 "//lst[str[@name='class'][.='org.apache.solr.analysis.MockCharFilterFactory']]"
-+" [str[@name='luceneMatchVersion'][.='4.0.0']]",
++" [str[@name='" + IndexSchema.LUCENE_MATCH_VERSION_PARAM + "'][.='4.0.0']]",

Review comment: again; here in particular it's ugly
[GitHub] [lucene-solr] dsmiley commented on a change in pull request #911: SOLR-13802: Write analyzer property luceneMatchVersion to managed schema
dsmiley commented on a change in pull request #911: SOLR-13802: Write analyzer property luceneMatchVersion to managed schema
URL: https://github.com/apache/lucene-solr/pull/911#discussion_r329876510

## File path: solr/core/src/test/org/apache/solr/rest/schema/TestBulkSchemaAPI.java
## @@ -125,7 +126,7 @@ public void testAnalyzerClass() throws Exception {
 "'name' : 'myNewTextFieldWithAnalyzerClass',\n" +
 "'class':'solr.TextField',\n" +
 "'analyzer' : {\n" +
-"'luceneMatchVersion':'5.0.0',\n" +
+"'" + IndexSchema.LUCENE_MATCH_VERSION_PARAM + "':'5.0.0',\n" +

Review comment: Using constants in tests like this goes a bit too far, IMO. It slightly obscures readability, and there's a point to be made that changing an input/output constant _should_ break a test. Subjective, I know. I don't mean to suggest removing all constants from tests, but here in particular it's ugly.
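The trade-off dsmiley describes (constant vs. string literal in test assertions) can be illustrated with a self-contained toy; the class and constant below are stand-ins, not Solr's actual IndexSchema:

```java
import java.util.HashMap;
import java.util.Map;

public class ConstantVsLiteral {
    // stand-in for IndexSchema.LUCENE_MATCH_VERSION_PARAM (hypothetical, for illustration)
    static final String LUCENE_MATCH_VERSION_PARAM = "luceneMatchVersion";

    public static void main(String[] args) {
        Map<String, String> viaConstant = new HashMap<>();
        viaConstant.put(LUCENE_MATCH_VERSION_PARAM, "9.0.0");

        Map<String, String> viaLiteral = new HashMap<>();
        // A test written with the literal still fails if the constant's value
        // ever changes -- exactly the safety net the review comment argues for.
        viaLiteral.put("luceneMatchVersion", "9.0.0");

        System.out.println(viaConstant.equals(viaLiteral)); // prints "true"
    }
}
```

Both styles produce the same map; the difference is only whether a renamed constant silently updates the test's expectations.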
[jira] [Assigned] (SOLR-13802) Analyzer property luceneMatchVersion is not written to managed schema
[ https://issues.apache.org/jira/browse/SOLR-13802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Wayne Smiley reassigned SOLR-13802:
Assignee: David Wayne Smiley

> Analyzer property luceneMatchVersion is not written to managed schema
>
> Key: SOLR-13802
> URL: https://issues.apache.org/jira/browse/SOLR-13802
> Project: Solr
> Issue Type: Bug
> Security Level: Public (Default Security Level. Issues are Public)
> Components: Schema and Analysis
> Affects Versions: 7.7.2, master (9.0), 8.2
> Reporter: Thomas Wöckinger
> Assignee: David Wayne Smiley
> Priority: Major
> Labels: easy-fix, pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> The analyzer property luceneMatchVersion is not written to the managed schema;
> it is simply not handled by the code.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (SOLR-13804) ant precommit fails on OpenJDK11 Corretto
Koen De Groote created SOLR-13804:

Summary: ant precommit fails on OpenJDK11 Corretto
Key: SOLR-13804
URL: https://issues.apache.org/jira/browse/SOLR-13804
Project: Solr
Issue Type: Bug
Reporter: Koen De Groote
Attachments: ant_precommit_fails.txt

Noticed this while preparing another pull request. I've attached a file with the output of the command; the errors start at point 4. I'm not sure whether this is specific to Corretto, or to anything other than the Oracle JDK. At the time of writing, the latest commit was 67f4c7f36eef2ae75fb80859dfc0e612675cb94d. My knowledge does not extend far enough to say anything meaningful about this, so I'm asking people here to take a look.

Corretto is Amazon's OpenJDK distribution: https://docs.aws.amazon.com/corretto/latest/corretto-11-ug/what-is-corretto-11.html
[jira] [Commented] (SOLR-13771) Add -v and -m to ulimit section of reference guide and bin/solr checks
[ https://issues.apache.org/jira/browse/SOLR-13771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941372#comment-16941372 ]

ASF subversion and git services commented on SOLR-13771:

Commit 2f0dc888f51ff5b763d1f49aa7b2e621c274d00e in lucene-solr's branch refs/heads/branch_8x from Erick Erickson
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2f0dc88 ]

SOLR-13771: Add -v and -m to ulimit section of reference guide and bin/solr checks. Forgot CHANGES.txt entry
(cherry picked from commit 67f4c7f36eef2ae75fb80859dfc0e612675cb94d)

> Add -v and -m to ulimit section of reference guide and bin/solr checks
>
> Key: SOLR-13771
> URL: https://issues.apache.org/jira/browse/SOLR-13771
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Erick Erickson
> Assignee: Erick Erickson
> Priority: Major
> Attachments: SOLR-13771.patch
>
> I just noticed these bits in MMapDirectory.java:
> {code}
> if (!Constants.JRE_IS_64BIT) {
>   moreInfo = "MMapDirectory should only be used on 64bit platforms, because the address space on 32bit operating systems is too small. ";
> } else if (Constants.WINDOWS) {
>   moreInfo = "Windows is unfortunately very limited on virtual address space. If your index size is several hundred Gigabytes, consider changing to Linux. ";
> } else if (Constants.LINUX) {
>   moreInfo = "Please review 'ulimit -v', 'ulimit -m' (both should return 'unlimited'), and 'sysctl vm.max_map_count'. ";
> } else {
>   moreInfo = "Please review 'ulimit -v', 'ulimit -m' (both should return 'unlimited'). ";
> }
> {code}
> We should add this info to the ref guide, particularly the bits about -v and -m. We already mention ulimits, but only in relation to file handles and processes.
> What about restructuring that section a bit into something like "operating system settings", so we can include some of the information above?
[jira] [Commented] (SOLR-13771) Add -v and -m to ulimit section of reference guide and bin/solr checks
[ https://issues.apache.org/jira/browse/SOLR-13771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941369#comment-16941369 ]

ASF subversion and git services commented on SOLR-13771:

Commit 67f4c7f36eef2ae75fb80859dfc0e612675cb94d in lucene-solr's branch refs/heads/master from Erick Erickson
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=67f4c7f ]

SOLR-13771: Add -v and -m to ulimit section of reference guide and bin/solr checks. Forgot CHANGES.txt entry
[jira] [Commented] (SOLR-13771) Add -v and -m to ulimit section of reference guide and bin/solr checks
[ https://issues.apache.org/jira/browse/SOLR-13771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941367#comment-16941367 ]

ASF subversion and git services commented on SOLR-13771:

Commit a1f3d2c29a1b61ac01e5defcb097695c43aaadd9 in lucene-solr's branch refs/heads/master from Erick Erickson
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=a1f3d2c ]

SOLR-13771: Add -v and -m to ulimit section of reference guide and bin/solr checks
[jira] [Updated] (SOLR-13771) Add -v and -m to ulimit section of reference guide and bin/solr checks
[ https://issues.apache.org/jira/browse/SOLR-13771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson updated SOLR-13771:
Summary: Add -v and -m to ulimit section of reference guide and bin/solr checks (was: Add -v and -m to ulimit section of reference guide)
[jira] [Updated] (SOLR-13771) Add -v and -m to ulimit section of reference guide
[ https://issues.apache.org/jira/browse/SOLR-13771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erick Erickson updated SOLR-13771:
Attachment: SOLR-13771.patch
Status: Open (was: Open)

Doc change; I also changed bin/solr to check these two ulimits. Committing momentarily.
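The bin/solr change discussed in this thread is shell-script logic, but the check itself (MMapDirectory's advice that 'ulimit -v' and 'ulimit -m' both return 'unlimited' on Linux) can be sketched for illustration by shelling out from Java. This is purely an illustrative stand-in, not the actual bin/solr code, and it assumes a POSIX sh is available:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class UlimitCheck {

    // Run a shell builtin such as "ulimit -v" and return its first output line.
    static String run(String cmd) throws IOException, InterruptedException {
        Process p = new ProcessBuilder("sh", "-c", cmd).start();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line = r.readLine();
            p.waitFor();
            return line == null ? "" : line.trim();
        }
    }

    public static void main(String[] args) throws Exception {
        for (String flag : new String[] {"-v", "-m"}) {
            String value = run("ulimit " + flag);
            // MMapDirectory's recommendation: both limits should be "unlimited" on Linux
            if (!"unlimited".equals(value)) {
                System.out.println("WARNING: ulimit " + flag + " is '" + value + "'; 'unlimited' is recommended");
            } else {
                System.out.println("ulimit " + flag + " ok");
            }
        }
    }
}
```

The equivalent one-liner in a terminal is simply `ulimit -v; ulimit -m; sysctl vm.max_map_count`, which is closer to what a startup script would actually do.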
[jira] [Commented] (SOLR-13101) Shared storage support in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-13101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941362#comment-16941362 ]

Ishan Chattopadhyaya commented on SOLR-13101:

Can we collaborate over the ASF Slack to discuss harmonizing the 3 blob stores? I am okay with having all three if they serve different use cases; we just need a cohesive and consistent story around them in terms of documentation.

bq. I plan on creating a branch jira/SOLR-13101 soon for future work on this issue.

How complete do you think it is? Do you foresee a lot more work going in here? Or do you suggest we start reviewing it and attempt to merge it soon (in a week or so)?

> Shared storage support in SolrCloud
>
> Key: SOLR-13101
> URL: https://issues.apache.org/jira/browse/SOLR-13101
> Project: Solr
> Issue Type: New Feature
> Components: SolrCloud
> Reporter: Yonik Seeley
> Priority: Major
> Time Spent: 4h
> Remaining Estimate: 0h
>
> Solr should have first-class support for shared storage (blob/object stores like S3, Google Cloud Storage, etc., and shared filesystems like HDFS, NFS, etc.).
> The key component will likely be a new replica type for shared storage. It would have many of the benefits of the current "pull" replicas (not indexing on all replicas, all shards identical with no shards getting out of sync, etc.), but would have additional benefits:
> - Any shard could become leader (the blob store always has the index)
> - Better elasticity scaling down
>   - durability not linked to number of replicas... a single replica could be common for write workloads
>   - could drop to 0 replicas for a shard when not needed (blob store always has the index)
> - Allow for higher-performance write workloads by skipping the transaction log
>   - don't pay for what you don't need
>   - a commit will be necessary to flush to stable storage (blob store)
> - A lot of the complexity and failure modes go away
> An additional component is a Directory implementation that will work well with blob stores. We probably want one that treats local disk as a cache, since the latency to remote storage is so large. I think there are still some "locking" issues to be solved here (ensuring that more than one writer to the same index won't corrupt it). This should probably be pulled out into a different JIRA issue.
[jira] [Commented] (SOLR-13661) A package management system for Solr
[ https://issues.apache.org/jira/browse/SOLR-13661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941355#comment-16941355 ]

Ishan Chattopadhyaya commented on SOLR-13661:

As per a discussion in the 8.3 release thread, branch cutting has been delayed for a week. Let us do the needful (either revert or review/merge this PR) by then.

> A package management system for Solr
>
> Key: SOLR-13661
> URL: https://issues.apache.org/jira/browse/SOLR-13661
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Reporter: Noble Paul
> Assignee: Ishan Chattopadhyaya
> Priority: Blocker
> Labels: package
> Attachments: plugin-usage.png, repos.png
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Here's the design doc:
> https://docs.google.com/document/d/15b3m3i3NFDKbhkhX_BN0MgvPGZaBj34TKNF2-UNC3U8/edit?usp=sharing
[jira] [Updated] (SOLR-13764) Parse Interval Query from JSON API
[ https://issues.apache.org/jira/browse/SOLR-13764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ishan Chattopadhyaya updated SOLR-13764:
Fix Version/s: 8.3

> Parse Interval Query from JSON API
>
> Key: SOLR-13764
> URL: https://issues.apache.org/jira/browse/SOLR-13764
> Project: Solr
> Issue Type: Improvement
> Security Level: Public (Default Security Level. Issues are Public)
> Components: query parsers
> Reporter: Mikhail Khludnev
> Priority: Blocker
> Fix For: 8.3
>
> h2. Context
> Lucene has an Intervals query (LUCENE-8196). Note: these are a kind of healthy man's Spans/Phrases. Note: it's not about ranges nor facets.
> h2. Problem
> There's no way to search by IntervalQuery via the JSON Query DSL.
> h2. Suggestion
> * Create a classic QParser {{ {!interval df=text_content}a_json_param}}, i.e. one can combine a few such refs in {{json.query.bool}}
> * It accepts just the name of a JSON param; nothing like this happens yet.
> * This param carries plain JSON, which is accessible via {{req.getJSON()}}
> Please examine https://cwiki.apache.org/confluence/display/SOLR/SOLR-13764+Discussion+-+Interval+Queries+in+JSON for the syntax proposal.
> h2. Challenges
> * I have no idea about a particular JSON DSL for these queries; the Lucene API seems easily JSON-able. Proposals are welcome.
> * Another awkward thing is combining analysis and the low-level query API, e.g. what if one requests a term for one word and analysis yields two tokens; and vice versa, requesting a phrase might end up with a single token stream.
> * Putting JSON into the Jira ticket description
> h2. Q: Why don't we...
> ...put the intervals DSL right into {{json.query}}, avoiding these odd param refs?
> A: It requires heavy lifting in {{JsonQueryConverter}}, which is streamlined for handling good old HTTP parametrized queries.
[jira] [Commented] (SOLR-13764) Parse Interval Query from JSON API
[ https://issues.apache.org/jira/browse/SOLR-13764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941353#comment-16941353 ]

Ishan Chattopadhyaya commented on SOLR-13764:

I'm marking this as an 8.3 blocker just to keep track of the issues that we discussed for 8.3. If this doesn't finish up in the 8.3 timeframe, I'll downgrade the priority. It will be good to have it out with 8.3, so hopefully we can have it in a week. Thanks [~mkhl].
[jira] [Updated] (SOLR-13764) Parse Interval Query from JSON API
[ https://issues.apache.org/jira/browse/SOLR-13764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ishan Chattopadhyaya updated SOLR-13764:
Priority: Blocker (was: Major)
[GitHub] [lucene-solr] magibney commented on a change in pull request #892: LUCENE-8972: Add ICUTransformCharFilter, to support pre-tokenizer ICU text transformation
magibney commented on a change in pull request #892: LUCENE-8972: Add ICUTransformCharFilter, to support pre-tokenizer ICU text transformation URL: https://github.com/apache/lucene-solr/pull/892#discussion_r329744248 ## File path: lucene/analysis/icu/src/java/org/apache/lucene/analysis/icu/ICUTransformCharFilter.java ## @@ -0,0 +1,322 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.lucene.analysis.icu; + +import java.io.IOException; +import java.io.Reader; + +import com.ibm.icu.text.ReplaceableString; +import com.ibm.icu.text.Transliterator; +import com.ibm.icu.text.Transliterator.Position; +import com.ibm.icu.text.UTF16; + +import org.apache.lucene.analysis.CharFilter; +import org.apache.lucene.analysis.charfilter.BaseCharFilter; +import org.apache.lucene.util.ArrayUtil; + +/** + * A {@link CharFilter} that transforms text with ICU. + * + * ICU provides text-transformation functionality via its Transliteration API. + * Although script conversion is its most common use, a Transliterator can + * actually perform a more general class of tasks. In fact, Transliterator + * defines a very general API which specifies only that a segment of the input + * text is replaced by new text. 
The particulars of this conversion are + * determined entirely by subclasses of Transliterator. + * + * + * Some useful transformations for search are built-in: + * + * Conversion from Traditional to Simplified Chinese characters + * Conversion from Hiragana to Katakana + * Conversion from Fullwidth to Halfwidth forms. + * Script conversions, for example Serbian Cyrillic to Latin + * + * + * Example usage: stream = new ICUTransformCharFilter(reader, + * Transliterator.getInstance("Traditional-Simplified")); + * + * For more details, see the http://userguide.icu-project.org/transforms/general;>ICU User + * Guide. + */ +public final class ICUTransformCharFilter extends BaseCharFilter { + + // Transliterator to transform the text + private final Transliterator transform; + + // Reusable position object + private final Position position = new Position(); + + private static final int READ_BUFFER_SIZE = 1024; + private final char[] tmpBuffer = new char[READ_BUFFER_SIZE]; + + private static final int INITIAL_TRANSLITERATE_BUFFER_SIZE = 1024; + private final StringBuffer buffer = new StringBuffer(INITIAL_TRANSLITERATE_BUFFER_SIZE); + private final ReplaceableString replaceable = new ReplaceableString(buffer); + + private static final int BUFFER_PRUNE_THRESHOLD = 1024; + + private int outputCursor = 0; + private boolean inputFinished = false; + private int charCount = 0; + + static final int DEFAULT_MAX_ROLLBACK_BUFFER_CAPACITY = 8192; + private final int maxRollbackBufferCapacity; + + private static final int DEFAULT_INITIAL_ROLLBACK_BUFFER_CAPACITY = 4; // must be power of 2 + private char[] rollbackBuffer; + private int rollbackBufferSize = 0; + + ICUTransformCharFilter(Reader in, Transliterator transform) { +this(in, transform, DEFAULT_MAX_ROLLBACK_BUFFER_CAPACITY); + } + + /** + * Construct new {@link ICUTransformCharFilter} with the specified {@link Transliterator}, backed by + * the specified {@link Reader}. 
+ * @param in input source + * @param transform used to perform transliteration + * @param maxRollbackBufferCapacityHint used to control the maximum size to which this + * {@link ICUTransformCharFilter} will buffer and rollback partial transliteration of input sequences. + * The provided hint will be converted to an enforced limit of "the greatest power of 2 (excluding '1') + * less than or equal to the specified value". Specifying a negative value allows the rollback buffer to Review comment: +1. This was here more as a convenience targeted at the external schema config API, so I've preserved that convenience, but moved it out to the `ICUTransformCharFilterFactory` instead (to allow users who essentially want to configure "no limit" to avoid having to explicitly write something like "2147483647" in their config files). Does that sound ok? I also noticed that there is no power of 2 greater than or equal to hint, for hint greater than
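The "greatest power of 2 (excluding '1') less than or equal to the specified value" normalization discussed above can be sketched in a few lines. The class and method names below are invented for illustration and are not the actual factory code:

```java
// Hypothetical sketch of the rollback-capacity hint normalization described
// above: negative => unlimited, 0 or 1 => rollback disabled, otherwise the
// greatest power of 2 <= hint. Not the actual ICUTransformCharFilter code.
public class RollbackCapacity {
    static int normalize(int hint) {
        if (hint < 0) {
            return Integer.MAX_VALUE; // rollback buffer may grow indefinitely
        }
        if (hint <= 1) {
            return 0; // rollback disabled
        }
        return Integer.highestOneBit(hint); // greatest power of 2 <= hint
    }

    public static void main(String[] args) {
        System.out.println(normalize(8192)); // the default capacity is already a power of 2
    }
}
```

Note how `Integer.highestOneBit` makes the "round down to a power of 2" conversion a single call, which also explains why there is no power of 2 greater than or equal to the hint being produced: the rounding is always downward.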
[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions
[ https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941226#comment-16941226 ] ASF subversion and git services commented on SOLR-13105: Commit e54a792e4c835cf0eb55d319250ebad23a0274b3 in lucene-solr's branch refs/heads/SOLR-13105-visual from Joel Bernstein [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e54a792 ] SOLR-13105: Update regression docs 2 > A visual guide to Solr Math Expressions and Streaming Expressions > - > > Key: SOLR-13105 > URL: https://issues.apache.org/jira/browse/SOLR-13105 > Project: Solr > Issue Type: New Feature >Reporter: Joel Bernstein >Assignee: Joel Bernstein >Priority: Major > Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot > 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, > Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 > AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png > > > Visualization is now a fundamental element of Solr Streaming Expressions and > Math Expressions. This ticket will create a visual guide to Solr Math > Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* > visualization examples. > It will also cover using the JDBC expression to *analyze* and *visualize* > results from any JDBC compliant data source. > Intro from the guide: > {code:java} > Streaming Expressions exposes the capabilities of Solr Cloud as composable > functions. These functions provide a system for searching, transforming, > analyzing and visualizing data stored in Solr Cloud collections. > At a high level there are four main capabilities that will be explored in the > documentation: > * Searching, sampling and aggregating results from Solr. > * Transforming result sets after they are retrieved from Solr. > * Analyzing and modeling result sets using probability and statistics and > machine learning libraries. 
> * Visualizing result sets, aggregations and statistical models of the data. > {code} > > A few sample visualizations are attached to the ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13790) LRUStatsCache size explosion
[ https://issues.apache.org/jira/browse/SOLR-13790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941213#comment-16941213 ] Andrzej Bialecki commented on SOLR-13790: - This patch is a work in progress - it fixes the error described above but it also tries to fix an existential problem in LRUStatsCache - namely, as it is now it would always send requests for fetching stats (thus adding a round-trip to every query), even for repeated queries, consequently defeating the point of LRU caching. Changes in this patch: * consistently use shard name instead of the full shard URL lists as caching keys, both in SolrCloud mode and in standalone distributed mode * optimized serialization of stats in order to minimize the size of data and to prevent serialization errors when terms contain separators or url-unsafe characters * added SolrCloud unit tests, which still need much improvement * added some logic in LRUStatsCache that tries to avoid sending a stats request if all global data is already available in cache. This part is a little bit shaky but I don't have any better idea at the moment how to address this problem. Basically, it rewrites a query locally to see if there are any missing stats to be fetched - but the answer "none" is not 100% fool-proof because queries may be rewritten differently based on the available terms and fields in the local vs. remote index. The code tries to fix it post-factum by detecting missing global stats and forcing a fetch+cache of the missing stats with the next request. > LRUStatsCache size explosion > > > Key: SOLR-13790 > URL: https://issues.apache.org/jira/browse/SOLR-13790 > Project: Solr > Issue Type: Bug > Security Level: Public (Default Security Level.
Issues are Public) >Affects Versions: 7.7.2, 8.2, 8.3 >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Critical > Fix For: 7.7.3, 8.3 > > Attachments: SOLR-13790.patch > > > On a sizeable cluster with multi-shard multi-replica collections, when > {{LRUStatsCache}} was in use we encountered excessive memory usage, which > consequently led to severe performance problems. > On closer examination of the heap dumps it became apparent that when > {{LRUStatsCache.addToPerShardTermStats}} is called it creates instances of > {{FastLRUCache}} using the passed {{shard}} argument - however, the value of > this argument is not a simple shard name but instead it's a randomly ordered > list of ALL replica URLs for this shard. > As a result, due to the combinatoric number of possible keys, over time the > map in {{LRUStatsCache.perShardTermStats}} grew to contain ~2 million entries. > The fix seems to be simply to extract the shard name and cache using this > name instead of the full string value of the {{shard}} parameter. Existing > unit tests also need much improvement.
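The fix described above amounts to changing the caching key. A minimal illustration of why keying by the stable shard name bounds the map size, while keying by a randomly ordered replica URL list does not (names below are invented for illustration, not Solr's actual LRUStatsCache API):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch (invented names, not Solr's LRUStatsCache API): each
// distinct ordering of the replica URL list creates a fresh map entry for
// the same logical shard, while the shard name gives exactly one entry.
public class ShardKeyDemo {
    final Map<String, Map<String, Integer>> perShardTermStats = new HashMap<>();

    void addToPerShardTermStats(String key, String term, int docFreq) {
        perShardTermStats.computeIfAbsent(key, k -> new HashMap<>()).put(term, docFreq);
    }

    int cachedShardCount() {
        return perShardTermStats.size();
    }
}
```

With the URL-list key, two requests for the same shard that happen to list its replicas in different orders produce two cache entries; with the shard-name key they share one.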
[jira] [Updated] (SOLR-13790) LRUStatsCache size explosion
[ https://issues.apache.org/jira/browse/SOLR-13790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated SOLR-13790: Attachment: SOLR-13790.patch > LRUStatsCache size explosion > > > Key: SOLR-13790 > URL: https://issues.apache.org/jira/browse/SOLR-13790 > Project: Solr > Issue Type: Bug > Security Level: Public (Default Security Level. Issues are Public) >Affects Versions: 7.7.2, 8.2, 8.3 >Reporter: Andrzej Bialecki >Assignee: Andrzej Bialecki >Priority: Critical > Fix For: 7.7.3, 8.3 > > Attachments: SOLR-13790.patch > > > On a sizeable cluster with multi-shard multi-replica collections, when > {{LRUStatsCache}} was in use we encountered excessive memory usage, which > consequently led to severe performance problems. > On closer examination of the heap dumps it became apparent that when > {{LRUStatsCache.addToPerShardTermStats}} is called it creates instances of > {{FastLRUCache}} using the passed {{shard}} argument - however, the value of > this argument is not a simple shard name but instead it's a randomly ordered > list of ALL replica URLs for this shard. > As a result, due to the combinatoric number of possible keys, over time the > map in {{LRUStatsCache.perShardTermStats}} grew to contain ~2 million entries. > The fix seems to be simply to extract the shard name and cache using this > name instead of the full string value of the {{shard}} parameter. Existing > unit tests also need much improvement.
[jira] [Comment Edited] (SOLR-13101) Shared storage support in SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-13101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16939809#comment-16939809 ] Yonik Seeley edited comment on SOLR-13101 at 9/30/19 5:21 PM: -- I plan on creating a branch jira/SOLR-13101 soon for future work on this issue. edit: this has been done. Please use this branch for future pull requests: https://github.com/apache/lucene-solr/tree/jira/SOLR-13101 was (Author: ysee...@gmail.com): I plan on creating a branch jira/SOLR-13101 soon for future work on this issue. > Shared storage support in SolrCloud > --- > > Key: SOLR-13101 > URL: https://issues.apache.org/jira/browse/SOLR-13101 > Project: Solr > Issue Type: New Feature > Components: SolrCloud >Reporter: Yonik Seeley >Priority: Major > Time Spent: 3h 50m > Remaining Estimate: 0h > > Solr should have first-class support for shared storage (blob/object stores > like S3, google cloud storage, etc. and shared filesystems like HDFS, NFS, > etc). > The key component will likely be a new replica type for shared storage. It > would have many of the benefits of the current "pull" replicas (not indexing > on all replicas, all shards identical with no shards getting out-of-sync, > etc), but would have additional benefits: > - Any shard could become leader (the blob store always has the index) > - Better elasticity scaling down >- durability not linked to number of replicas... a single replica could be > common for write workloads >- could drop to 0 replicas for a shard when not needed (blob store always > has index) > - Allow for higher performance write workloads by skipping the transaction > log >- don't pay for what you don't need >- a commit will be necessary to flush to stable storage (blob store) > - A lot of the complexity and failure modes go away > An additional component is a Directory implementation that will work well with > blob stores. We probably want one that treats local disk as a cache since > the latency to remote storage is so large.
I think there are still some > "locking" issues to be solved here (ensuring that more than one writer to the > same index won't corrupt it). This should probably be pulled out into a > different JIRA issue.
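The "local disk as a cache" Directory described above is essentially a read-through cache. A toy byte-store version (invented names, not Lucene's Directory API) shows why repeated reads avoid the remote round-trip:

```java
import java.util.HashMap;
import java.util.Map;

// Toy read-through cache (invented names, not Lucene's Directory API):
// reads hit the local map first and fall back to the remote blob store,
// caching what they fetch so remote latency is paid once per file.
public class ReadThroughStore {
    private final Map<String, byte[]> localCache = new HashMap<>();
    private final Map<String, byte[]> remoteBlobStore;
    int remoteReads = 0; // exposed so the caching behavior is observable

    public ReadThroughStore(Map<String, byte[]> remoteBlobStore) {
        this.remoteBlobStore = remoteBlobStore;
    }

    public byte[] read(String fileName) {
        byte[] cached = localCache.get(fileName);
        if (cached != null) {
            return cached; // local-disk hit: no remote latency
        }
        remoteReads++; // cache miss: one remote round-trip
        byte[] fetched = remoteBlobStore.get(fileName);
        localCache.put(fileName, fetched);
        return fetched;
    }
}
```

The locking concern mentioned above is exactly what this sketch omits: nothing here prevents two writers from pushing conflicting files to the shared store.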
[GitHub] [lucene-solr] magibney commented on a change in pull request #892: LUCENE-8972: Add ICUTransformCharFilter, to support pre-tokenizer ICU text transformation
magibney commented on a change in pull request #892: LUCENE-8972: Add ICUTransformCharFilter, to support pre-tokenizer ICU text transformation URL: https://github.com/apache/lucene-solr/pull/892#discussion_r329684574 ## File path: lucene/analysis/icu/src/java/org/apache/lucene/analysis/icu/ICUTransformCharFilter.java ## @@ -0,0 +1,322 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.lucene.analysis.icu; + +import java.io.IOException; +import java.io.Reader; + +import com.ibm.icu.text.ReplaceableString; +import com.ibm.icu.text.Transliterator; +import com.ibm.icu.text.Transliterator.Position; +import com.ibm.icu.text.UTF16; + +import org.apache.lucene.analysis.CharFilter; +import org.apache.lucene.analysis.charfilter.BaseCharFilter; +import org.apache.lucene.util.ArrayUtil; + +/** + * A {@link CharFilter} that transforms text with ICU. + * + * ICU provides text-transformation functionality via its Transliteration API. + * Although script conversion is its most common use, a Transliterator can + * actually perform a more general class of tasks. In fact, Transliterator + * defines a very general API which specifies only that a segment of the input + * text is replaced by new text. 
The particulars of this conversion are + * determined entirely by subclasses of Transliterator. + * + * + * Some useful transformations for search are built-in: + * + * Conversion from Traditional to Simplified Chinese characters + * Conversion from Hiragana to Katakana + * Conversion from Fullwidth to Halfwidth forms. + * Script conversions, for example Serbian Cyrillic to Latin + * + * + * Example usage: stream = new ICUTransformCharFilter(reader, + * Transliterator.getInstance("Traditional-Simplified")); + * + * For more details, see the http://userguide.icu-project.org/transforms/general;>ICU User + * Guide. + */ +public final class ICUTransformCharFilter extends BaseCharFilter { + + // Transliterator to transform the text + private final Transliterator transform; + + // Reusable position object + private final Position position = new Position(); + + private static final int READ_BUFFER_SIZE = 1024; + private final char[] tmpBuffer = new char[READ_BUFFER_SIZE]; + + private static final int INITIAL_TRANSLITERATE_BUFFER_SIZE = 1024; + private final StringBuffer buffer = new StringBuffer(INITIAL_TRANSLITERATE_BUFFER_SIZE); + private final ReplaceableString replaceable = new ReplaceableString(buffer); + + private static final int BUFFER_PRUNE_THRESHOLD = 1024; + + private int outputCursor = 0; + private boolean inputFinished = false; + private int charCount = 0; + + static final int DEFAULT_MAX_ROLLBACK_BUFFER_CAPACITY = 8192; + private final int maxRollbackBufferCapacity; + + private static final int DEFAULT_INITIAL_ROLLBACK_BUFFER_CAPACITY = 4; // must be power of 2 + private char[] rollbackBuffer; + private int rollbackBufferSize = 0; + + ICUTransformCharFilter(Reader in, Transliterator transform) { +this(in, transform, DEFAULT_MAX_ROLLBACK_BUFFER_CAPACITY); + } + + /** + * Construct new {@link ICUTransformCharFilter} with the specified {@link Transliterator}, backed by + * the specified {@link Reader}. 
+ * @param in input source + * @param transform used to perform transliteration + * @param maxRollbackBufferCapacityHint used to control the maximum size to which this + * {@link ICUTransformCharFilter} will buffer and rollback partial transliteration of input sequences. + * The provided hint will be converted to an enforced limit of "the greatest power of 2 (excluding '1') + * less than or equal to the specified value". Specifying a negative value allows the rollback buffer to + * grow indefinitely (equivalent to specifying {@link Integer#MAX_VALUE}). Specifying "0" (or "1", in practice) + * disables rollback. Larger values can in some cases yield more accurate transliteration, at the cost of + * performance and resolution/accuracy of offset correction. + * This is intended primarily as a failsafe, with a relatively large default value of {@value ICUTransformCharFilter#DEFAULT_MAX_ROLLBACK_BUFFER_CAPACITY}. + * See comments "To understand the need for
[jira] [Commented] (SOLR-13399) compositeId support for shard splitting
[ https://issues.apache.org/jira/browse/SOLR-13399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941130#comment-16941130 ] ASF subversion and git services commented on SOLR-13399: Commit 7775e17414b83508f40ee2e440914177951b5882 in lucene-solr's branch refs/heads/jira/SOLR-13101 from Yonik Seeley [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7775e17 ] SOLR-13399: fix splitByPrefix default to be false > compositeId support for shard splitting > --- > > Key: SOLR-13399 > URL: https://issues.apache.org/jira/browse/SOLR-13399 > Project: Solr > Issue Type: New Feature >Reporter: Yonik Seeley >Assignee: Yonik Seeley >Priority: Major > Fix For: 8.3 > > Attachments: SOLR-13399.patch, SOLR-13399.patch, > SOLR-13399_testfix.patch, SOLR-13399_useId.patch, > ShardSplitTest.master.seed_AE04B5C9BA6E9A4.log.txt > > Time Spent: 1h > Remaining Estimate: 0h > > Shard splitting does not currently have a way to automatically take into > account the actual distribution (number of documents) in each hash bucket > created by using compositeId hashing. > We should probably add a parameter *splitByPrefix* to the *SPLITSHARD* > command that would look at the number of docs sharing each compositeId prefix > and use that to create roughly equal sized buckets by document count rather > than just assuming an equal distribution across the entire hash range. > Like normal shard splitting, we should bias against splitting within hash > buckets unless necessary (since that leads to larger query fanout.) . Perhaps > this warrants a parameter that would control how much of a size mismatch is > tolerable before resorting to splitting within a bucket. > *allowedSizeDifference*? > To more quickly calculate the number of docs in each bucket, we could index > the prefix in a different field. Iterating over the terms for this field > would quickly give us the number of docs in each (i.e lucene keeps track of > the doc count for each term already.) 
Perhaps the implementation could be a > flag on the *id* field... something like *indexPrefixes* and poly-fields that > would cause the indexing to be automatically done and alleviate having to > pass in an additional field during indexing and during the call to > *SPLITSHARD*. This whole part is an optimization though and could be split > off into its own issue if desired.
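The bucket-by-prefix idea above can be sketched as a simple greedy cut over per-prefix document counts (which the per-term doc counts Lucene already tracks would supply without scanning documents). This is an illustration, not Solr's SPLITSHARD implementation:

```java
// Hedged sketch of splitByPrefix bucket selection (not Solr's SPLITSHARD
// code): given doc counts per compositeId prefix, choose the cut so the two
// new shards hold roughly equal document counts, instead of assuming an
// equal distribution and splitting the hash range in half.
public class SplitByPrefixSketch {
    /** Returns the index of the first prefix assigned to the second shard. */
    static int chooseCutPoint(int[] docCountsPerPrefix) {
        long total = 0;
        for (int c : docCountsPerPrefix) {
            total += c;
        }
        long running = 0;
        for (int i = 0; i < docCountsPerPrefix.length; i++) {
            if (running >= total / 2) {
                return i; // cut before prefix i, keeping buckets whole
            }
            running += docCountsPerPrefix[i];
        }
        return docCountsPerPrefix.length;
    }
}
```

Because the cut falls on a prefix boundary, no hash bucket is split unless a single prefix dominates, which matches the bias against splitting within buckets described above.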
[jira] [Commented] (SOLR-13399) compositeId support for shard splitting
[ https://issues.apache.org/jira/browse/SOLR-13399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941131#comment-16941131 ] ASF subversion and git services commented on SOLR-13399: Commit c169c182ffc2f9cfad2c175a26504d0c116a8ccd in lucene-solr's branch refs/heads/jira/SOLR-13101 from Megan Carey [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=c169c18 ] SOLR-13399: Adding splitByPrefix param to IndexSizeTrigger; some splitByPrefix test and code cleanup > compositeId support for shard splitting > --- > > Key: SOLR-13399 > URL: https://issues.apache.org/jira/browse/SOLR-13399 > Project: Solr > Issue Type: New Feature >Reporter: Yonik Seeley >Assignee: Yonik Seeley >Priority: Major > Fix For: 8.3 > > Attachments: SOLR-13399.patch, SOLR-13399.patch, > SOLR-13399_testfix.patch, SOLR-13399_useId.patch, > ShardSplitTest.master.seed_AE04B5C9BA6E9A4.log.txt > > Time Spent: 1h > Remaining Estimate: 0h > > Shard splitting does not currently have a way to automatically take into > account the actual distribution (number of documents) in each hash bucket > created by using compositeId hashing. > We should probably add a parameter *splitByPrefix* to the *SPLITSHARD* > command that would look at the number of docs sharing each compositeId prefix > and use that to create roughly equal sized buckets by document count rather > than just assuming an equal distribution across the entire hash range. > Like normal shard splitting, we should bias against splitting within hash > buckets unless necessary (since that leads to larger query fanout.) . Perhaps > this warrants a parameter that would control how much of a size mismatch is > tolerable before resorting to splitting within a bucket. > *allowedSizeDifference*? > To more quickly calculate the number of docs in each bucket, we could index > the prefix in a different field. 
Iterating over the terms for this field > would quickly give us the number of docs in each (i.e. Lucene keeps track of > the doc count for each term already.) Perhaps the implementation could be a > flag on the *id* field... something like *indexPrefixes* and poly-fields that > would cause the indexing to be automatically done and alleviate having to > pass in an additional field during indexing and during the call to > *SPLITSHARD*. This whole part is an optimization though and could be split > off into its own issue if desired.
[jira] [Commented] (SOLR-13105) A visual guide to Solr Math Expressions and Streaming Expressions
[ https://issues.apache.org/jira/browse/SOLR-13105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941086#comment-16941086 ] ASF subversion and git services commented on SOLR-13105: Commit 2966f1308832ed10e4db38af1815df099e9419b2 in lucene-solr's branch refs/heads/SOLR-13105-visual from Joel Bernstein [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=2966f13 ] SOLR-13105: Update machine learning docs 2 > A visual guide to Solr Math Expressions and Streaming Expressions > - > > Key: SOLR-13105 > URL: https://issues.apache.org/jira/browse/SOLR-13105 > Project: Solr > Issue Type: New Feature >Reporter: Joel Bernstein >Assignee: Joel Bernstein >Priority: Major > Attachments: Screen Shot 2019-01-14 at 10.56.32 AM.png, Screen Shot > 2019-02-21 at 2.14.43 PM.png, Screen Shot 2019-03-03 at 2.28.35 PM.png, > Screen Shot 2019-03-04 at 7.47.57 PM.png, Screen Shot 2019-03-13 at 10.47.47 > AM.png, Screen Shot 2019-03-30 at 6.17.04 PM.png > > > Visualization is now a fundamental element of Solr Streaming Expressions and > Math Expressions. This ticket will create a visual guide to Solr Math > Expressions and Solr Streaming Expressions that includes *Apache Zeppelin* > visualization examples. > It will also cover using the JDBC expression to *analyze* and *visualize* > results from any JDBC compliant data source. > Intro from the guide: > {code:java} > Streaming Expressions exposes the capabilities of Solr Cloud as composable > functions. These functions provide a system for searching, transforming, > analyzing and visualizing data stored in Solr Cloud collections. > At a high level there are four main capabilities that will be explored in the > documentation: > * Searching, sampling and aggregating results from Solr. > * Transforming result sets after they are retrieved from Solr. > * Analyzing and modeling result sets using probability and statistics and > machine learning libraries. 
> * Visualizing result sets, aggregations and statistical models of the data. > {code} > > A few sample visualizations are attached to the ticket.
[GitHub] [lucene-solr] thomaswoeckinger commented on issue #911: SOLR-13802: Write analyzer property luceneMatchVersion to managed schema
thomaswoeckinger commented on issue #911: SOLR-13802: Write analyzer property luceneMatchVersion to managed schema URL: https://github.com/apache/lucene-solr/pull/911#issuecomment-536620261 @dsmiley: Please review This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] thomaswoeckinger opened a new pull request #911: SOLR-13802: Write analyzer property luceneMatchVersion to managed schema
thomaswoeckinger opened a new pull request #911: SOLR-13802: Write analyzer property luceneMatchVersion to managed schema URL: https://github.com/apache/lucene-solr/pull/911 # Description Please provide a short description of the changes you're making with this pull request. # Solution Please provide a short description of the approach taken to implement your solution. # Tests Please describe the tests you've developed or run to confirm this patch implements the feature or solves the problem. # Checklist Please review the following and check all that apply: - [ ] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability. - [ ] I have created a Jira issue and added the issue ID to my pull request title. - [ ] I am authorized to contribute this code to the ASF and have removed any code I do not have a license to distribute. - [ ] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended) - [ ] I have developed this patch against the `master` branch. - [ ] I have run `ant precommit` and the appropriate test suite. - [ ] I have added tests for my changes. - [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only).
[jira] [Updated] (SOLR-13793) HTTPSolrCall makes cascading calls even when all replicas are down for a collection
[ https://issues.apache.org/jira/browse/SOLR-13793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kesharee Nandan Vishwakarma updated SOLR-13793: --- Affects Version/s: master (9.0) > HTTPSolrCall makes cascading calls even when all replicas are down for a > collection > --- > > Key: SOLR-13793 > URL: https://issues.apache.org/jira/browse/SOLR-13793 > Project: Solr > Issue Type: Bug > Security Level: Public (Default Security Level. Issues are Public) > Components: SolrCloud >Affects Versions: 6.6, master (9.0) >Reporter: Kesharee Nandan Vishwakarma >Priority: Major > > REMOTEQUERY action in HTTPSolrCall ends up making too many cascading > remoteQuery calls when all the replicas of a collection are in down > state. > This results in an increase in thread count, unresponsive solr nodes and > eventually nodes (the ones which have this collection) going out of live nodes. > *Example scenario*: Consider a cluster with 3 nodes (solr1, solrw1, > solr-overseer1). A collection is present on solr1, solrw1 but both replicas > are in down state. When a search request is made to solr-overseer1, since > the replica is not present locally a remote query is made to solr1 (we also > consider inactive slices/coreUrls); solr1 also doesn't see an active replica > present locally, so it forwards to solrw1, and again solrw1 will forward the request to > solr1. This goes on till both solr1 and solrw1 become unresponsive. Attached > logs for this. > This is happening because we are considering [inactive > slices|https://github.com/apache/lucene-solr/blob/68fa249034ba8b273955f20097700dc2fbb7a800/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L913 > ], [inactive coreUrl| > https://github.com/apache/lucene-solr/blob/68fa249034ba8b273955f20097700dc2fbb7a800/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L929] > while forwarding requests to nodes.
> *Steps to reproduce*: > # Bring down all replicas of a collection but ensure nodes containing them > are up > # Make any search call to any of solr nodes for this collection. > > *Possible fixes*: > # Ensure we select only active slices/coreUrls before making remote queries > # Put a limit on cascading calls probably limit to number of replicas > > {noformat} > solrw1_1 | > solrw1_1 | 2019-09-24 09:35:14.458 ERROR (qtp762152757-8772) [ ] > o.a.s.s.HttpSolrCall null:org.apache.solr.common.SolrException: Error trying > to proxy request for url: http://solr1:8983/solr/kg3/select > solrw1_1 |at > org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:660) > solrw1_1 |at > org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:514) > solrw1_1 |at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361) > solrw1_1 |at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305) > solrw1_1 |at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691) > solrw1_1 |at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582) > solrw1_1 |at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) > solrw1_1 |at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548) > solrw1_1 |at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226) > solrw1_1 |at > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180) > solrw1_1 |at > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512) > solrw1_1 |at > org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185) > solrw1_1 |at > org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112) > solrw1_1 |at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) > solrw1_1 |at > 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213) > solrw1_1 |at > org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119) > solrw1_1 |at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) > solrw1_1 |at > org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335) > solrw1_1 |at > org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134) > solrw1_1 |at > org.eclipse.jetty.server.Server.handle(Server.java:534) > solrw1_1 |at >
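The two fixes proposed in the issue above can be sketched in isolation. The following is a hypothetical simplification, not Solr's actual HttpSolrCall code: `ReplicaInfo`, `State`, and `MAX_REMOTE_HOPS` are illustrative names standing in for Solr's real cluster-state types and whatever limit a patch would actually pick.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of the two proposed fixes for SOLR-13793:
// (1) only consider ACTIVE replicas as proxy targets, and
// (2) cap the number of cascading remote hops.
// ReplicaInfo/State/MAX_REMOTE_HOPS are illustrative, not Solr's real API.
public class RemoteQuerySketch {

  enum State { ACTIVE, DOWN, RECOVERING }

  static class ReplicaInfo {
    final String coreUrl;
    final State state;
    ReplicaInfo(String coreUrl, State state) {
      this.coreUrl = coreUrl;
      this.state = state;
    }
  }

  // Assumed bound; the ticket suggests limiting to the number of replicas.
  static final int MAX_REMOTE_HOPS = 2;

  // Fix 1: filter to active coreUrls before choosing a remote target.
  static List<String> activeCoreUrls(List<ReplicaInfo> replicas) {
    return replicas.stream()
        .filter(r -> r.state == State.ACTIVE)
        .map(r -> r.coreUrl)
        .collect(Collectors.toList());
  }

  // Fix 2: fail fast once the hop budget is spent instead of bouncing
  // between nodes forever.
  static String chooseRemoteTarget(List<ReplicaInfo> replicas, int hopCount) {
    if (hopCount >= MAX_REMOTE_HOPS) {
      return null;
    }
    List<String> urls = activeCoreUrls(replicas);
    return urls.isEmpty() ? null : urls.get(0);
  }
}
```

With both replicas DOWN, `chooseRemoteTarget` returns null immediately, so the request fails locally instead of ping-ponging between solr1 and solrw1 until both are unresponsive.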
[jira] [Commented] (SOLR-13798) SSL: Adding Enabling/Disabling client's hostname verification config
[ https://issues.apache.org/jira/browse/SOLR-13798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941071#comment-16941071 ] ASF subversion and git services commented on SOLR-13798: Commit 494d823e9d2f3dae7587cc9824cae9fbd900e4e1 in lucene-solr's branch refs/heads/branch_8x from Cao Manh Dat [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=494d823 ] SOLR-13798: SSL: Adding Enabling/Disabling client's hostname verification config > SSL: Adding Enabling/Disabling client's hostname verification config > > > Key: SOLR-13798 > URL: https://issues.apache.org/jira/browse/SOLR-13798 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.2 >Reporter: Cao Manh Dat >Assignee: Cao Manh Dat >Priority: Major > Attachments: SOLR-13709.patch, SOLR-13709.patch > > > The problem appeared after upgrading to Jetty 9.4.19 (SOLR-13541): > {{endpointIdentificationAlgorithm}} changed from null → HTTPS. As a result, > the client's hostname (identity) always gets verified when connecting to Solr. > This change improved the security level of Solr, since it requires two-way > identity verification (the client verifies the server's identity and vice versa). It > leads to a problem for users when certificate verification alone is enough (the client's > hostname is not known in advance). > We should introduce a flag in {{solr.in.sh}} to disable client hostname > verification when needed. > More about this at: > * https://tools.ietf.org/html/rfc2818#section-3 > * https://github.com/eclipse/jetty.project/issues/3454 > * https://www.cs.utexas.edu/~shmat/shmat_ccs12.pdf -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-13798) SSL: Adding Enabling/Disabling client's hostname verification config
[ https://issues.apache.org/jira/browse/SOLR-13798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cao Manh Dat resolved SOLR-13798. - Fix Version/s: 8.3 Resolution: Fixed > SSL: Adding Enabling/Disabling client's hostname verification config > > > Key: SOLR-13798 > URL: https://issues.apache.org/jira/browse/SOLR-13798 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.2 >Reporter: Cao Manh Dat >Assignee: Cao Manh Dat >Priority: Major > Fix For: 8.3 > > Attachments: SOLR-13709.patch, SOLR-13709.patch > > > The problem appeared after upgrading to Jetty 9.4.19 (SOLR-13541): > {{endpointIdentificationAlgorithm}} changed from null → HTTPS. As a result, > the client's hostname (identity) always gets verified when connecting to Solr. > This change improved the security level of Solr, since it requires two-way > identity verification (the client verifies the server's identity and vice versa). It > leads to a problem for users when certificate verification alone is enough (the client's > hostname is not known in advance). > We should introduce a flag in {{solr.in.sh}} to disable client hostname > verification when needed. > More about this at: > * https://tools.ietf.org/html/rfc2818#section-3 > * https://github.com/eclipse/jetty.project/issues/3454 > * https://www.cs.utexas.edu/~shmat/shmat_ccs12.pdf -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
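For readers unfamiliar with the mechanism described above: the switch Jetty flips lives at the JDK level in `javax.net.ssl.SSLParameters`. The sketch below shows only that mechanism; it is not the actual `solr.in.sh` flag or Jetty wiring that SOLR-13798 adds, and the helper method name is illustrative.

```java
import javax.net.ssl.SSLParameters;

// Minimal sketch of the JDK-level switch behind SOLR-13798. Jetty 9.4.19
// started defaulting the endpoint identification algorithm to "HTTPS",
// which enables client-side hostname verification per RFC 2818; setting it
// back to null disables that check (certificate verification still applies).
public class HostnameVerificationSketch {
  static SSLParameters withHostnameVerification(boolean enabled) {
    SSLParameters params = new SSLParameters();
    // "HTTPS" -> verify the server's hostname against its certificate;
    // null   -> no endpoint identity check
    params.setEndpointIdentificationAlgorithm(enabled ? "HTTPS" : null);
    return params;
  }
}
```

A Solr-level flag would ultimately toggle exactly this value on the client's SSL context before connections are opened.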
[jira] [Created] (SOLR-13802) Analyzer property luceneMatchVersion is not written to managed schema
Thomas Wöckinger created SOLR-13802: --- Summary: Analyzer property luceneMatchVersion is not written to managed schema Key: SOLR-13802 URL: https://issues.apache.org/jira/browse/SOLR-13802 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: Schema and Analysis Affects Versions: 8.2, 7.7.2, master (9.0) Reporter: Thomas Wöckinger The analyzer property luceneMatchVersion is not written to the managed schema; it is simply not handled by the code. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
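The shape of the bug above can be sketched abstractly: when the analyzer's attributes are serialized back to the managed schema, the luceneMatchVersion entry has to be carried along like any other attribute. The map below stands in for the persisted analyzer attributes; it is not Solr's actual persistence code, and the method name is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of SOLR-13802: serializing an analyzer's attributes
// must include luceneMatchVersion instead of silently dropping it. The
// constant mirrors IndexSchema.LUCENE_MATCH_VERSION_PARAM; the rest is
// illustrative, not Solr's managed-schema writer.
public class AnalyzerPersistenceSketch {
  static final String LUCENE_MATCH_VERSION_PARAM = "luceneMatchVersion";

  static Map<String, String> analyzerProperties(String matchVersion, String className) {
    Map<String, String> props = new LinkedHashMap<>();
    props.put("class", className);
    if (matchVersion != null) {
      // the previously missing step: persist luceneMatchVersion too
      props.put(LUCENE_MATCH_VERSION_PARAM, matchVersion);
    }
    return props;
  }
}
```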
[jira] [Commented] (SOLR-13798) SSL: Adding Enabling/Disabling client's hostname verification config
[ https://issues.apache.org/jira/browse/SOLR-13798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941070#comment-16941070 ] ASF subversion and git services commented on SOLR-13798: Commit 7350c5031635317c531c2f9249325d304a900772 in lucene-solr's branch refs/heads/master from Cao Manh Dat [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7350c50 ] SOLR-13798: SSL: Adding Enabling/Disabling client's hostname verification config > SSL: Adding Enabling/Disabling client's hostname verification config > > > Key: SOLR-13798 > URL: https://issues.apache.org/jira/browse/SOLR-13798 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Affects Versions: 8.2 >Reporter: Cao Manh Dat >Assignee: Cao Manh Dat >Priority: Major > Attachments: SOLR-13709.patch, SOLR-13709.patch > > > The problem appeared after upgrading to Jetty 9.4.19 (SOLR-13541): > {{endpointIdentificationAlgorithm}} changed from null → HTTPS. As a result, > the client's hostname (identity) always gets verified when connecting to Solr. > This change improved the security level of Solr, since it requires two-way > identity verification (the client verifies the server's identity and vice versa). It > leads to a problem for users when certificate verification alone is enough (the client's > hostname is not known in advance). > We should introduce a flag in {{solr.in.sh}} to disable client hostname > verification when needed. > More about this at: > * https://tools.ietf.org/html/rfc2818#section-3 > * https://github.com/eclipse/jetty.project/issues/3454 > * https://www.cs.utexas.edu/~shmat/shmat_ccs12.pdf -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13661) A package management system for Solr
[ https://issues.apache.org/jira/browse/SOLR-13661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941047#comment-16941047 ] Ishan Chattopadhyaya commented on SOLR-13661: - bq. We revert those changes. But they seem complicated enough that I don't want to attempt it. The commits are here: https://issues.apache.org/jira/browse/SOLR-13710 > A package management system for Solr > > > Key: SOLR-13661 > URL: https://issues.apache.org/jira/browse/SOLR-13661 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Assignee: Ishan Chattopadhyaya >Priority: Blocker > Labels: package > Attachments: plugin-usage.png, repos.png > > Time Spent: 40m > Remaining Estimate: 0h > > Here's the design doc: > https://docs.google.com/document/d/15b3m3i3NFDKbhkhX_BN0MgvPGZaBj34TKNF2-UNC3U8/edit?usp=sharing -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-13661) A package management system for Solr
[ https://issues.apache.org/jira/browse/SOLR-13661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ishan Chattopadhyaya updated SOLR-13661: Priority: Blocker (was: Major) > A package management system for Solr > > > Key: SOLR-13661 > URL: https://issues.apache.org/jira/browse/SOLR-13661 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Assignee: Ishan Chattopadhyaya >Priority: Blocker > Labels: package > Attachments: plugin-usage.png, repos.png > > Time Spent: 40m > Remaining Estimate: 0h > > Here's the design doc: > https://docs.google.com/document/d/15b3m3i3NFDKbhkhX_BN0MgvPGZaBj34TKNF2-UNC3U8/edit?usp=sharing -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-9458) DocumentDictionaryFactory StackOverflowError on many documents
[ https://issues.apache.org/jira/browse/SOLR-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-9458. -- Resolution: Fixed I think this is fixed by LUCENE-7914. We can re-open if it's still a problem here. There'll still be an exception thrown, but at least one that's controlled. That said, how to fix this problem is unclear. Add a filter that limits the length of a token? > DocumentDictionaryFactory StackOverflowError on many documents > -- > > Key: SOLR-9458 > URL: https://issues.apache.org/jira/browse/SOLR-9458 > Project: Solr > Issue Type: Bug > Components: Suggester >Affects Versions: 6.1, 6.2 >Reporter: Chris de Kok >Priority: Major > > When using the FuzzyLookupFactory in combination with the > DocumentDictionaryFactory it will throw a StackOverflowError while trying to build the > dictionary. > Using the HighFrequencyDictionaryFactory works OK but behaves very differently. > ``` > > > suggest > suggestions > suggestions > FuzzyLookupFactory > DocumentDictionaryFactory > suggest_fuzzy > true > false > false > true > 0 > > > null:java.lang.StackOverflowError > at > org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1311) > at > org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1311) > at > org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1311) > at > org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1311) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
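The "limit the length of a token" idea floated above is what Lucene's LengthFilter/LengthFilterFactory provide in an analysis chain. The sketch below simulates that filter over plain strings to stay self-contained; it is not the Lucene TokenFilter API itself.

```java
import java.util.List;
import java.util.stream.Collectors;

// Self-contained sketch of the token-length-limit idea from SOLR-9458.
// In a real schema this would be LengthFilterFactory in the analyzer chain;
// here we just drop tokens longer than maxLen from a list of strings.
public class TokenLengthLimitSketch {
  static List<String> limitTokenLength(List<String> tokens, int maxLen) {
    return tokens.stream()
        .filter(t -> t.length() <= maxLen)
        .collect(Collectors.toList());
  }
}
```

Dropping pathologically long tokens before the suggester builds its automaton bounds the automaton's size, which is the quantity the recursive topological sort was blowing the stack on.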
[jira] [Commented] (SOLR-13661) A package management system for Solr
[ https://issues.apache.org/jira/browse/SOLR-13661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941012#comment-16941012 ] Ishan Chattopadhyaya commented on SOLR-13661: - bq. I want to be clear on one thing: The concern/frustration that Jan and I have on peer review is because this issue is not some ordinary JIRA issue. It's highly impactful to Solr. As-such, IMO peer review is required for at least the major ideas / high level, naming, CLI, release-plan. Getting into some small details, no, not needed. Thankfully the peer review is here now Thanks David and Jan for your help with reviews. I am glad to receive the reviews and the improvement they bring to our design and implementation. I shall update the design document with all the details that we discussed offline (on Slack), which will make it easier to understand the workflows involved here. > A package management system for Solr > > > Key: SOLR-13661 > URL: https://issues.apache.org/jira/browse/SOLR-13661 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Assignee: Ishan Chattopadhyaya >Priority: Major > Labels: package > Attachments: plugin-usage.png, repos.png > > Time Spent: 40m > Remaining Estimate: 0h > > Here's the design doc: > https://docs.google.com/document/d/15b3m3i3NFDKbhkhX_BN0MgvPGZaBj34TKNF2-UNC3U8/edit?usp=sharing -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13661) A package management system for Solr
[ https://issues.apache.org/jira/browse/SOLR-13661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941010#comment-16941010 ] Ishan Chattopadhyaya commented on SOLR-13661: - We are in a difficult situation here. There are some commits concerning the class loader changes and new blob store that are already in master and branch_8x. But I am supposed to cut the 8.3 branch today. Here are the options we have: # We revert those changes. But they seem complicated enough that I don't want to attempt it. # We review and merge https://github.com/apache/lucene-solr/pull/910 which will complete the blob store and class loader work, but might take 1-2 days to review? # We leave the unfinished code in the branch and disable the feature and cut the branch. [~noble.paul], [~dsmiley], [~janhoy] any thoughts? > A package management system for Solr > > > Key: SOLR-13661 > URL: https://issues.apache.org/jira/browse/SOLR-13661 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Assignee: Ishan Chattopadhyaya >Priority: Major > Labels: package > Attachments: plugin-usage.png, repos.png > > Time Spent: 40m > Remaining Estimate: 0h > > Here's the design doc: > https://docs.google.com/document/d/15b3m3i3NFDKbhkhX_BN0MgvPGZaBj34TKNF2-UNC3U8/edit?usp=sharing -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13661) A package management system for Solr
[ https://issues.apache.org/jira/browse/SOLR-13661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941007#comment-16941007 ] David Wayne Smiley commented on SOLR-13661: --- The design document is pretty fantastic in its overall scope (not too much or too little) and structure (easy to consume). Of course I have things inside to debate but it was a breath of fresh air to consume. I want to be clear on one thing: The concern/frustration that Jan and I have on peer review is because this issue is not some ordinary JIRA issue. It's highly impactful to Solr. As-such, IMO peer review is _required_ for at least the major ideas / high level, naming, CLI, release-plan. Getting into some small details, no, not needed. Thankfully the peer review is here now :-) > A package management system for Solr > > > Key: SOLR-13661 > URL: https://issues.apache.org/jira/browse/SOLR-13661 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Assignee: Ishan Chattopadhyaya >Priority: Major > Labels: package > Attachments: plugin-usage.png, repos.png > > Time Spent: 40m > Remaining Estimate: 0h > > Here's the design doc: > https://docs.google.com/document/d/15b3m3i3NFDKbhkhX_BN0MgvPGZaBj34TKNF2-UNC3U8/edit?usp=sharing -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] noblepaul opened a new pull request #910: SOLR-13661 : SOLR-13661 A package management system for Solr
noblepaul opened a new pull request #910: SOLR-13661 : SOLR-13661 A package management system for Solr URL: https://github.com/apache/lucene-solr/pull/910 # Description Please refer to the design doc for details https://docs.google.com/document/d/15b3m3i3NFDKbhkhX_BN0MgvPGZaBj34TKNF2-UNC3U8/edit?ts=5d86a8ad# # Tests TestPackages has the tests required for this # Checklist Please review the following and check all that apply: - [ ] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability. - [ ] I have created a Jira issue and added the issue ID to my pull request title. - [ ] I am authorized to contribute this code to the ASF and have removed any code I do not have a license to distribute. - [ ] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended) - [ ] I have developed this patch against the `master` branch. - [ ] I have run `ant precommit` and the appropriate test suite. - [ ] I have added tests for my changes. - [ ] I have added documentation for the [Ref Guide](https://github.com/apache/lucene-solr/tree/master/solr/solr-ref-guide) (for Solr changes only). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-13290) Prometheus metric exporter AsyncLogger: java.lang.NoClassDefFoundError
[ https://issues.apache.org/jira/browse/SOLR-13290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-13290. --- Resolution: Fixed [~kstoney] Looking over old JIRAs I've assigned to myself and saw this. I'm assuming it's fixed, probably by the linked JIRA. Do you agree? If not we can re-open. > Prometheus metric exporter AsyncLogger: java.lang.NoClassDefFoundError > -- > > Key: SOLR-13290 > URL: https://issues.apache.org/jira/browse/SOLR-13290 > Project: Solr > Issue Type: Bug > Components: metrics >Affects Versions: 8.0, 8.1 >Reporter: Karl Stoney >Assignee: Erick Erickson >Priority: Major > > Since this > commit:[https://github.com/apache/lucene-solr/commit/02eb9d34404b8fc7225ee7c5c867e194afae17a0] > The metrics exporter in branch_8x no longer starts > {code:java} > 2019-03-04 16:06:01,070 main ERROR Unable to invoke factory method in class > org.apache.logging.log4j.core.async.AsyncLoggerConfig for element > AsyncLogger: java.lang.NoClassDefFoundError > : com/lmax/disruptor/EventFactory java.lang.reflect.InvocationTargetException > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.logging.log4j.core.config.plugins.util.PluginBuilder.build(PluginBuilder.java:136) > at > org.apache.logging.log4j.core.config.AbstractConfiguration.createPluginObject(AbstractConfiguration.java:964) > at > org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:904) > at > org.apache.logging.log4j.core.config.AbstractConfiguration.createConfiguration(AbstractConfiguration.java:896) > at > org.apache.logging.log4j.core.config.AbstractConfiguration.doConfigure(AbstractConfiguration.java:514) > at > 
org.apache.logging.log4j.core.config.AbstractConfiguration.initialize(AbstractConfiguration.java:238) > at > org.apache.logging.log4j.core.config.AbstractConfiguration.start(AbstractConfiguration.java:250) > at > org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:548) > at > org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:620) > at > org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:637) > at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:231) > at > org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:153) > at > org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:45) > at org.apache.logging.log4j.LogManager.getContext(LogManager.java:194) > at > org.apache.logging.log4j.spi.AbstractLoggerAdapter.getContext(AbstractLoggerAdapter.java:121) > at > org.apache.logging.slf4j.Log4jLoggerFactory.getContext(Log4jLoggerFactory.java:43) > at > org.apache.logging.log4j.spi.AbstractLoggerAdapter.getLogger(AbstractLoggerAdapter.java:46) > at > org.apache.logging.slf4j.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:29) > at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:358) > at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:383) > at > org.apache.solr.prometheus.exporter.SolrExporter.(SolrExporter.java:48) > Caused by: java.lang.NoClassDefFoundError: com/lmax/disruptor/EventFactory > at > org.apache.logging.log4j.core.config.AbstractConfiguration.getAsyncLoggerConfigDelegate(AbstractConfiguration.java:203) > at > org.apache.logging.log4j.core.async.AsyncLoggerConfig.(AsyncLoggerConfig.java:91) > at > org.apache.logging.log4j.core.async.AsyncLoggerConfig.createLogger(AsyncLoggerConfig.java:273) > ... 
25 more > Caused by: java.lang.ClassNotFoundException: com.lmax.disruptor.EventFactory > at java.net.URLClassLoader.findClass(URLClassLoader.java:382) > at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) > at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > ... 28 more{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jimczi commented on issue #904: LUCENE-8992: Share minimum score across segment in concurrent search
jimczi commented on issue #904: LUCENE-8992: Share minimum score across segment in concurrent search URL: https://github.com/apache/lucene-solr/pull/904#issuecomment-536520766 Thanks for reviewing @atris . I pushed some changes to address your comments and add unit tests for the bottom value checker, can you take another look ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] jimczi commented on a change in pull request #904: LUCENE-8992: Share minimum score across segment in concurrent search
jimczi commented on a change in pull request #904: LUCENE-8992: Share minimum score across segment in concurrent search URL: https://github.com/apache/lucene-solr/pull/904#discussion_r329528877 ## File path: lucene/core/src/java/org/apache/lucene/search/TopFieldCollector.java ## @@ -423,22 +440,24 @@ static TopFieldCollector create(Sort sort, int numHits, FieldDoc after, throw new IllegalArgumentException("after.fields has " + after.fields.length + " values but sort has " + sort.getSort().length); } - return new PagingFieldCollector(sort, queue, after, numHits, hitsThresholdChecker); + return new PagingFieldCollector(sort, queue, after, numHits, hitsThresholdChecker, bottomValueChecker); } } /** * Create a CollectorManager which uses a shared hit counter to maintain number of hits + * and a shared bottom value checker to propagate the minimum score accross segments if + * the primary sort is by relevancy. */ - public static CollectorManager createSharedManager(Sort sort, int numHits, FieldDoc after, - int totalHitsThreshold) { + public static CollectorManager createSharedManager(Sort sort, int numHits, FieldDoc after, int totalHitsThreshold) { Review comment: good catch, this shouldn't be changed This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (LUCENE-8213) Cache costly subqueries asynchronously
[ https://issues.apache.org/jira/browse/LUCENE-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940798#comment-16940798 ] Atri Sharma commented on LUCENE-8213: - Interesting – I did not realise that testLRUEviction could also cause LRUQueryCache to cache asynchronously, hence did not update it to handle the same (in the manner testLRUConcurrentLoadAndEviction does). I have pushed a test fix now – beasted the test 50 times with the seed you provided, and also beasted the entire TestLRUQueryCache suite 20 times with the seed. Ran the entire Lucene test suite – came in clean. It is curious to note that I could not reproduce the test failure without the seed even after running multiple times – kudos to the CI! > Cache costly subqueries asynchronously > -- > > Key: LUCENE-8213 > URL: https://issues.apache.org/jira/browse/LUCENE-8213 > Project: Lucene - Core > Issue Type: Improvement > Components: core/query/scoring >Affects Versions: 7.2.1 >Reporter: Amir Hadadi >Priority: Minor > Labels: performance > Time Spent: 9h 40m > Remaining Estimate: 0h > > IndexOrDocValuesQuery allows to combine costly range queries with a selective > lead iterator in an optimized way. However, the range query at some point > gets cached by a querying thread in LRUQueryCache, which negates the > optimization of IndexOrDocValuesQuery for that specific query. > It would be nice to see an asynchronous caching implementation in such cases, > so that queries involving IndexOrDocValuesQuery would have consistent > performance characteristics. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13661) A package management system for Solr
[ https://issues.apache.org/jira/browse/SOLR-13661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940750#comment-16940750 ] ASF subversion and git services commented on SOLR-13661: Commit 7779be1017c93166cf10d8debc8765e7d121037c in lucene-solr's branch refs/heads/jira/SOLR-13661 from Noble Paul [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=7779be1 ] SOLR-13661: Changed the blob store to use sha256- as the blob id instead of just sha256 > A package management system for Solr > > > Key: SOLR-13661 > URL: https://issues.apache.org/jira/browse/SOLR-13661 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Noble Paul >Assignee: Ishan Chattopadhyaya >Priority: Major > Labels: package > Attachments: plugin-usage.png, repos.png > > Time Spent: 0.5h > Remaining Estimate: 0h > > Here's the design doc: > https://docs.google.com/document/d/15b3m3i3NFDKbhkhX_BN0MgvPGZaBj34TKNF2-UNC3U8/edit?usp=sharing -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-13793) HTTPSolrCall makes cascading calls even when all replicas are down for a collection
[ https://issues.apache.org/jira/browse/SOLR-13793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940694#comment-16940694 ] Kesharee Nandan Vishwakarma commented on SOLR-13793: As per https://issues.apache.org/jira/browse/SOLR-4553 we are attempting to proxy requests more aggressively; this leads to the scenarios mentioned in this bug. [~markrmil...@gmail.com] Can we improve accuracy in getting active [slices|https://github.com/apache/lucene-solr/blob/e7522297a70674662f1083f9942403bac3119693/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L978] / [coreUrls |https://github.com/apache/lucene-solr/blob/68fa249034ba8b273955f20097700dc2fbb7a800/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L929] itself? Or else we can put a limit on cascading remote queries when considering dead replicas/slices. > HTTPSolrCall makes cascading calls even when all replicas are down for a > collection > --- > > Key: SOLR-13793 > URL: https://issues.apache.org/jira/browse/SOLR-13793 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: SolrCloud >Affects Versions: 6.6 >Reporter: Kesharee Nandan Vishwakarma >Priority: Major > > The REMOTEQUERY action in HTTPSolrCall ends up making too many cascading > remoteQuery calls when all the replicas of a collection are in the down > state. > This results in an increase in thread count, unresponsive Solr nodes, and > eventually nodes (the ones which host this collection) dropping out of live nodes. > *Example scenario*: Consider a cluster with 3 nodes (solr1, solrw1, > solr-overseer1). A collection is present on solr1 and solrw1, but both replicas > are in the down state. When a search request is made to solr-overseer1, since the > replica is not present locally a remote query is made to solr1 (we also > consider inactive slices/coreUrls); solr1 also doesn't see an active replica > present locally, so it forwards to solrw1, and solrw1 again forwards the request to > solr1. 
> This goes on until both solr1 and solrw1 become unresponsive. Logs for this are attached.
> This is happening because we consider [inactive slices|https://github.com/apache/lucene-solr/blob/68fa249034ba8b273955f20097700dc2fbb7a800/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L913] and [inactive coreUrls|https://github.com/apache/lucene-solr/blob/68fa249034ba8b273955f20097700dc2fbb7a800/solr/core/src/java/org/apache/solr/servlet/HttpSolrCall.java#L929] while forwarding requests to nodes.
>
> *Steps to reproduce*:
> # Bring down all replicas of a collection but ensure the nodes containing them are up
> # Make any search call for this collection to any of the Solr nodes
>
> *Possible fixes*:
> # Ensure we select only active slices/coreUrls before making remote queries
> # Put a limit on cascading calls, probably capped at the number of replicas
>
> {noformat}
> solrw1_1 |
> solrw1_1 | 2019-09-24 09:35:14.458 ERROR (qtp762152757-8772) [ ] o.a.s.s.HttpSolrCall null:org.apache.solr.common.SolrException: Error trying to proxy request for url: http://solr1:8983/solr/kg3/select
> solrw1_1 |    at org.apache.solr.servlet.HttpSolrCall.remoteQuery(HttpSolrCall.java:660)
> solrw1_1 |    at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:514)
> solrw1_1 |    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:361)
> solrw1_1 |    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:305)
> solrw1_1 |    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1691)
> solrw1_1 |    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
> solrw1_1 |    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> solrw1_1 |    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> solrw1_1 |    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> solrw1_1 |    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
> solrw1_1 |    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
> solrw1_1 |    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> solrw1_1 |    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
> solrw1_1 |    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> solrw1_1 |    at
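The second proposed fix (capping cascading calls at the number of replicas) can be sketched as a small simulation. This is a hypothetical, self-contained illustration, not Solr's actual HttpSolrCall code: the node names, cluster map, and hop-counter mechanism are all assumptions; in real Solr the hop count could travel as a request parameter.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch (not Solr's actual proxying code): bound the number of
// remote-query forwards so a request bounced between nodes that host only
// "down" replicas fails fast instead of ping-ponging until nodes hang.
public class RemoteQueryHopLimit {

    // Simulated cluster state: node name -> whether it hosts an ACTIVE replica.
    static final Map<String, Boolean> ACTIVE = Map.of(
        "solr1", false,           // replica present but down
        "solrw1", false,          // replica present but down
        "solr-overseer1", false); // no replica at all

    // Nodes that host (possibly down) replicas of the collection.
    static final List<String> NODES = List.of("solr1", "solrw1");

    /**
     * Serve locally if an active replica exists; otherwise forward,
     * incrementing a hop counter capped at maxHops (e.g. replica count).
     */
    static String query(String node, int hops, int maxHops) {
        if (Boolean.TRUE.equals(ACTIVE.get(node))) {
            return "served by " + node;
        }
        if (hops >= maxHops) {
            return "503: no active replica after " + hops + " hops";
        }
        // Forward to the next candidate node (round-robin for the sketch).
        String next = NODES.get(hops % NODES.size());
        return query(next, hops + 1, maxHops);
    }

    public static void main(String[] args) {
        // With every replica down, the request terminates after maxHops
        // forwards instead of looping between solr1 and solrw1 forever.
        System.out.println(query("solr-overseer1", 0, NODES.size()));
    }
}
```

Combined with the first fix (filtering to active slices/coreUrls before forwarding), the worst case becomes a bounded number of hops rather than an unbounded cascade.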
[jira] [Commented] (LUCENE-8213) Cache costly subqueries asynchronously
[ https://issues.apache.org/jira/browse/LUCENE-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940677#comment-16940677 ] Ignacio Vera commented on LUCENE-8213:

Here is the failure: https://elasticsearch-ci.elastic.co/job/apache+lucene-solr+master/5526/console

I had a look and it is a race condition: we check whether the query has been cached just after executing it, but when caching happens asynchronously the entry may not be there yet.

> Cache costly subqueries asynchronously
> --------------------------------------
>
>                 Key: LUCENE-8213
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8213
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/query/scoring
>    Affects Versions: 7.2.1
>            Reporter: Amir Hadadi
>            Priority: Minor
>              Labels: performance
>          Time Spent: 9h 40m
>   Remaining Estimate: 0h
>
> IndexOrDocValuesQuery allows combining costly range queries with a selective lead iterator in an optimized way. However, the range query at some point gets cached by a querying thread in LRUQueryCache, which negates the optimization of IndexOrDocValuesQuery for that specific query. It would be nice to see an asynchronous caching implementation in such cases, so that queries involving IndexOrDocValuesQuery would have consistent performance characteristics.

--
This message was sent by Atlassian Jira (v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
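The race described above can be shown in miniature. The sketch below is hypothetical plain Java, not Lucene's LRUQueryCache API: a background task populates a cache asynchronously, so an immediate `containsKey` assertion is racy, while a deadline-bounded polling check of the kind a test could use is deterministic.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative sketch (names are assumptions, not Lucene's API): when cache
// population is asynchronous, "assert cached immediately after the query"
// is a race; polling with a deadline removes the flakiness.
public class AsyncCacheRace {

    static final Map<String, Object> cache = new ConcurrentHashMap<>();

    /** Simulates a costly subquery being cached on another thread. */
    static void runQuery(ExecutorService pool, String query) {
        pool.submit(() -> {
            try {
                Thread.sleep(50); // the async caching has not happened yet
            } catch (InterruptedException ignored) {
            }
            cache.put(query, new Object());
        });
    }

    /** Poll until the entry shows up, or the deadline passes. */
    static boolean waitUntilCached(String query, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (System.nanoTime() < deadline) {
            if (cache.containsKey(query)) {
                return true;
            }
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return cache.containsKey(query);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        runQuery(pool, "costlyRange");
        // An immediate cache.containsKey("costlyRange") here may be false,
        // which is the kind of race the CI build tripped over; the polling
        // check below is stable.
        System.out.println("cached eventually: " + waitUntilCached("costlyRange", 2000));
        pool.shutdown();
    }
}
```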
[jira] [Commented] (LUCENE-8213) Cache costly subqueries asynchronously
[ https://issues.apache.org/jira/browse/LUCENE-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16940673#comment-16940673 ] Atri Sharma commented on LUCENE-8213:

Taking a look – although I am unable to reproduce it with a simple ant test. Can you point me to the CI link so that I can dive deeper into the error output?