[JENKINS] Lucene-Solr-trunk-Windows (32bit/jdk1.8.0_20-ea-b21) - Build # 4191 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Windows/4191/ Java: 32bit/jdk1.8.0_20-ea-b21 -client -XX:+UseParallelGC All tests passed Build Log: [...truncated 51837 lines...] BUILD FAILED C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\build.xml:467: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\build.xml:406: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\extra-targets.xml:87: The following error occurred while executing this line: C:\Users\JenkinsSlave\workspace\Lucene-Solr-trunk-Windows\extra-targets.xml:181: Source checkout is dirty after running tests!!! Offending files: * ./solr/licenses/log4j-1.2.16.jar.sha1 Total time: 174 minutes 31 seconds Build step 'Invoke Ant' marked build as failure [description-setter] Description set: Java: 32bit/jdk1.8.0_20-ea-b21 -client -XX:+UseParallelGC Archiving artifacts Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-trunk-Linux (64bit/jdk1.7.0_60) - Build # 10827 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/10827/ Java: 64bit/jdk1.7.0_60 -XX:-UseCompressedOops -XX:+UseParallelGC All tests passed Build Log: [...truncated 59454 lines...] BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:467: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/build.xml:406: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:87: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-trunk-Linux/extra-targets.xml:181: Source checkout is dirty after running tests!!! Offending files: * ./solr/licenses/log4j-1.2.16.jar.sha1 Total time: 98 minutes 46 seconds Build step 'Invoke Ant' marked build as failure [description-setter] Description set: Java: 64bit/jdk1.7.0_60 -XX:-UseCompressedOops -XX:+UseParallelGC Archiving artifacts Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.7.0_60) - Build # 10705 - Failure!
Tim has already fixed these. On Wed, Jul 16, 2014 at 6:49 AM, Policeman Jenkins Server < jenk...@thetaphi.de> wrote: > Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/10705/ > Java: 32bit/jdk1.7.0_60 -server -XX:+UseSerialGC > > All tests passed > > Build Log: > [...truncated 32600 lines...] > -check-forbidden-all: > [forbidden-apis] Reading bundled API signatures: jdk-unsafe-1.7 > [forbidden-apis] Reading bundled API signatures: jdk-deprecated-1.7 > [forbidden-apis] Reading bundled API signatures: commons-io-unsafe-2.3 > [forbidden-apis] Reading API signatures: > /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/tools/forbiddenApis/base.txt > [forbidden-apis] Reading API signatures: > /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/tools/forbiddenApis/servlet-api.txt > [forbidden-apis] Loading classes to check... > [forbidden-apis] Scanning for API signatures and dependencies... > [forbidden-apis] Forbidden method invocation: > java.util.concurrent.Executors#newFixedThreadPool(int) [Spawns threads with > vague names; use a custom thread factory (Lucene's NamedThreadFactory, > Solr's DefaultSolrThreadFactory) and name threads so that you can tell (by > its name) which executor it is associated with] > [forbidden-apis] in > org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServerTest > (ConcurrentUpdateSolrServerTest.java:167) > [forbidden-apis] Forbidden method invocation: > javax.servlet.ServletRequest#getParameterMap() [Servlet API method is > parsing request parameters without using the correct encoding if no extra > configuration is given in the servlet container] > [forbidden-apis] in > org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServerTest$TestServlet > (ConcurrentUpdateSolrServerTest.java:85) > [forbidden-apis] Scanned 368 (and 504 related) class file(s) for forbidden > API invocations (in 0.15s), 2 error(s). 
> > BUILD FAILED > /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:467: The > following error occurred while executing this line: > /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:70: The > following error occurred while executing this line: > /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/build.xml:271: The > following error occurred while executing this line: > /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/common-build.xml:479: > Check for forbidden API calls failed, see log. > > Total time: 105 minutes 44 seconds > Build step 'Invoke Ant' marked build as failure > [description-setter] Description set: Java: 32bit/jdk1.7.0_60 -server > -XX:+UseSerialGC > Archiving artifacts > Recording test results > Email was triggered for: Failure - Any > Sending email for trigger: Failure - Any > > > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > -- Regards, Shalin Shekhar Mangar.
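[Editor's note] The forbidden-apis failure above is about Executors.newFixedThreadPool(int) spawning anonymously named threads. A minimal sketch of the suggested fix — a custom ThreadFactory in the spirit of Lucene's NamedThreadFactory (the class name and thread-name prefix here are illustrative, not the actual Lucene/Solr classes):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class NamedPoolExample {
    // Minimal named ThreadFactory: every thread gets a prefix plus a counter,
    // so a thread dump tells you which executor a thread belongs to.
    static final class PrefixedThreadFactory implements ThreadFactory {
        private final String prefix;
        private final AtomicInteger counter = new AtomicInteger(1);

        PrefixedThreadFactory(String prefix) { this.prefix = prefix; }

        @Override public Thread newThread(Runnable r) {
            return new Thread(r, prefix + "-" + counter.getAndIncrement());
        }
    }

    public static void main(String[] args) throws Exception {
        // Instead of the flagged Executors.newFixedThreadPool(4):
        ExecutorService pool = Executors.newFixedThreadPool(4,
                new PrefixedThreadFactory("concurrentUpdateTest"));
        Future<String> name = pool.submit(() -> Thread.currentThread().getName());
        System.out.println(name.get()); // concurrentUpdateTest-1
        pool.shutdown();
    }
}
```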
[jira] [Created] (LUCENE-5826) Support proper hunspell case handling and related options
Robert Muir created LUCENE-5826: --- Summary: Support proper hunspell case handling and related options Key: LUCENE-5826 URL: https://issues.apache.org/jira/browse/LUCENE-5826 Project: Lucene - Core Issue Type: Improvement Reporter: Robert Muir Attachments: LUCENE-5826.patch When ignoreCase=false, we should accept title-cased/upper-cased forms just like hunspell -m. Furthermore there are some options around this: * LANG: can turn on alternate casing for turkish/azeri * KEEPCASE: can prevent acceptance of title/upper cased forms for words While we are here setting up the same logic anyway, add support for similar options: * NEEDAFFIX/PSEUDOROOT: form is invalid without being affixed * ONLYINCOMPOUND: form/affixes only make sense inside compounds. This stuff is unrelated to the ignoreCase=true option. If you use that option, though, it does use correct alternate casing for tr_TR/az_AZ now. I didn't yet implement CHECKSHARPS because it seems more complicated; I have to figure out what the logic there should be first. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
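[Editor's note] The LANG/alternate-casing point above is easiest to see with the JDK's own locale-sensitive casing: in Turkish (and Azeri), uppercase 'I' lowercases to dotless 'ı', and dotted 'İ' lowercases to 'i'. A small illustration using plain java.util.Locale, independent of the hunspell patch itself:

```java
import java.util.Locale;

public class TurkishCasing {
    public static void main(String[] args) {
        Locale tr = new Locale("tr", "TR");
        // Turkish lowercasing: 'I' -> dotless 'ı', not 'i'.
        System.out.println("I".toLowerCase(tr));           // ı
        // Root-locale lowercasing for comparison: 'I' -> 'i'.
        System.out.println("I".toLowerCase(Locale.ROOT));  // i
        // Dotted capital 'İ' (U+0130) lowercases to plain 'i' in Turkish.
        System.out.println("\u0130".toLowerCase(tr));      // i
    }
}
```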
[jira] [Updated] (LUCENE-5826) Support proper hunspell case handling and related options
[ https://issues.apache.org/jira/browse/LUCENE-5826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5826: Attachment: LUCENE-5826.patch Patch with tests for these options and casing behavior. > Support proper hunspell case handling and related options > - > > Key: LUCENE-5826 > URL: https://issues.apache.org/jira/browse/LUCENE-5826 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Robert Muir > Attachments: LUCENE-5826.patch > > > When ignoreCase=false, we should accept title-cased/upper-cased forms just > like hunspell -m. Furthermore there are some options around this: > * LANG: can turn on alternate casing for turkish/azeri > * KEEPCASE: can prevent acceptance of title/upper cased forms for words > While we are here setting up the same logic anyway, add support for similar > options: > * NEEDAFFIX/PSEUDOROOT: form is invalid without being affixed > * ONLYINCOMPOUND: form/affixes only make sense inside compounds. > This stuff is unrelated to the ignoreCase=true option. If you use that option > though, it does use correct alternate casing for tr_TR/az_AZ now though. > I didn't yet implement CHECKSHARPS because it seems more complicated, I have > to figure out what the logic there should be first. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-5825) Allowing the benchmarking algorithm to choose PostingsFormat
Varun V Shenoy created LUCENE-5825: --- Summary: Allowing the benchmarking algorithm to choose PostingsFormat Key: LUCENE-5825 URL: https://issues.apache.org/jira/browse/LUCENE-5825 Project: Lucene - Core Issue Type: Improvement Components: modules/benchmark Affects Versions: 5.0 Reporter: Varun V Shenoy Priority: Minor Fix For: 5.0 The algorithm file for benchmarking should allow PostingsFormat to be configurable. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses
[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063089#comment-14063089 ] Da Huang commented on LUCENE-4396: -- Thank you, Mike! {quote} It looks like this gave some nice gains with the many-not cases {quote} Yes, but many-not cases may not be a usual case. Therefore, this method might be used in the final method. {quote} Curiously some of the tasks are really hurt by the larger sizes ... maybe 1<<9 is a good compromise? {quote} Yeah. Finally, I will just focus on those \*Some\* cases. "size9" is better for HighAndSomeHighOr case, while "size5" is better for LowAndSomeHighOr, LowAndSomeLowNot and LowAndSomeLowOr cases. I think it would be better to detect the case type and adjust the SIZE of bucketTable in BNS's constructor. > BooleanScorer should sometimes be used for MUST clauses > --- > > Key: LUCENE-4396 > URL: https://issues.apache.org/jira/browse/LUCENE-4396 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Michael McCandless > Attachments: And.tasks, AndOr.tasks, AndOr.tasks, LUCENE-4396.patch, > LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, > LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, > SIZE.perf, luceneutil-score-equal.patch, luceneutil-score-equal.patch, > stat.cpp > > > Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT. > If there is one or more MUST clauses we always use BooleanScorer2. > But I suspect that unless the MUST clauses have very low hit count compared > to the other clauses, that BooleanScorer would perform better than > BooleanScorer2. BooleanScorer still has some vestiges from when it used to > handle MUST so it shouldn't be hard to bring back this capability ... I think > the challenging part might be the heuristics on when to use which (likely we > would have to use firstDocID as proxy for total hit count). 
> Likely we should also have BooleanScorer sometimes use .advance() on the subs > in this case, eg if suddenly the MUST clause skips 100 docs then you want > to .advance() all the SHOULD clauses. > I won't have near term time to work on this so feel free to take it if you > are inspired! -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
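[Editor's note] For readers unfamiliar with the bucketTable SIZE being tuned above ("size5" = 1&lt;&lt;5, "size9" = 1&lt;&lt;9): BooleanScorer-style scorers process documents in fixed-size windows, mapping each doc ID to a bucket with a bit mask, which is why SIZE is a power of two. A hypothetical sketch of the accumulation step (not Lucene's actual BooleanScorer code):

```java
public class BucketTableSketch {
    public static void main(String[] args) {
        final int SIZE = 1 << 9;   // 512 buckets, the "size9" variant
        final int MASK = SIZE - 1; // power-of-two size makes (doc & MASK) a cheap modulo
        float[] scores = new float[SIZE];
        int[] counts = new int[SIZE];

        // Two sub-scorer hits for doc 37 inside the current window [0, 512):
        int doc = 37;
        scores[doc & MASK] += 1.5f;
        counts[doc & MASK]++;
        scores[doc & MASK] += 2.0f;
        counts[doc & MASK]++;

        // doc 37 matched 2 clauses with accumulated score 3.5
        System.out.println(counts[37] + " " + scores[37]); // 2 3.5
    }
}
```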
[jira] [Comment Edited] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses
[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063089#comment-14063089 ] Da Huang edited comment on LUCENE-4396 at 7/16/14 3:53 AM: --- Thank you, Mike! {quote} It looks like this gave some nice gains with the many-not cases {quote} Yes, but many-not cases might not be a usual case. Therefore, this method might not be used in the final method. {quote} Curiously some of the tasks are really hurt by the larger sizes ... maybe 1<<9 is a good compromise? {quote} Yeah. Finally, I will just focus on those \*Some\* cases. "size9" is better for HighAndSomeHighOr case, while "size5" is better for LowAndSomeHighOr, LowAndSomeLowNot and LowAndSomeLowOr cases. I think it would be better to detect the case type and adjust the SIZE of bucketTable in BNS's constructor. was (Author: dhuang): Thank you, Mike! {quote} It looks like this gave some nice gains with the many-not cases {quote} Yes, but many-not cases may not be a usual case. Therefore, this method might be used in the final method. {quote} Curiously some of the tasks are really hurt by the larger sizes ... maybe 1<<9 is a good compromise? {quote} Yeah. Finally, I will just focus on those \*Some\* cases. "size9" is better for HighAndSomeHighOr case, while "size5" is better for LowAndSomeHighOr, LowAndSomeLowNot and LowAndSomeLowOr cases. I think it would be better to detect the case type and adjust the SIZE of bucketTable in BNS's constructor. 
> BooleanScorer should sometimes be used for MUST clauses > --- > > Key: LUCENE-4396 > URL: https://issues.apache.org/jira/browse/LUCENE-4396 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Michael McCandless > Attachments: And.tasks, AndOr.tasks, AndOr.tasks, LUCENE-4396.patch, > LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, > LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, > SIZE.perf, luceneutil-score-equal.patch, luceneutil-score-equal.patch, > stat.cpp > > > Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT. > If there is one or more MUST clauses we always use BooleanScorer2. > But I suspect that unless the MUST clauses have very low hit count compared > to the other clauses, that BooleanScorer would perform better than > BooleanScorer2. BooleanScorer still has some vestiges from when it used to > handle MUST so it shouldn't be hard to bring back this capability ... I think > the challenging part might be the heuristics on when to use which (likely we > would have to use firstDocID as proxy for total hit count). > Likely we should also have BooleanScorer sometimes use .advance() on the subs > in this case, eg if suddenly the MUST clause skips 100 docs then you want > to .advance() all the SHOULD clauses. > I won't have near term time to work on this so feel free to take it if you > are inspired! -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-4.x-Linux (64bit/ibm-j9-jdk7) - Build # 10706 - Still Failing!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/10706/ Java: 64bit/ibm-j9-jdk7 -Xjit:exclude={org/apache/lucene/util/fst/FST.pack(IIF)Lorg/apache/lucene/util/fst/FST;} All tests passed Build Log: [...truncated 32538 lines...] -check-forbidden-all: [forbidden-apis] Reading bundled API signatures: jdk-unsafe-1.7 [forbidden-apis] Reading bundled API signatures: jdk-deprecated-1.7 [forbidden-apis] Reading bundled API signatures: commons-io-unsafe-2.3 [forbidden-apis] Reading API signatures: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/tools/forbiddenApis/base.txt [forbidden-apis] Reading API signatures: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/tools/forbiddenApis/servlet-api.txt [forbidden-apis] Loading classes to check... [forbidden-apis] Scanning for API signatures and dependencies... [forbidden-apis] Forbidden method invocation: java.util.concurrent.Executors#newFixedThreadPool(int) [Spawns threads with vague names; use a custom thread factory (Lucene's NamedThreadFactory, Solr's DefaultSolrThreadFactory) and name threads so that you can tell (by its name) which executor it is associated with] [forbidden-apis] in org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServerTest (ConcurrentUpdateSolrServerTest.java:167) [forbidden-apis] Forbidden method invocation: javax.servlet.ServletRequest#getParameterMap() [Servlet API method is parsing request parameters without using the correct encoding if no extra configuration is given in the servlet container] [forbidden-apis] in org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServerTest$TestServlet (ConcurrentUpdateSolrServerTest.java:85) [forbidden-apis] Scanned 368 (and 503 related) class file(s) for forbidden API invocations (in 0.26s), 2 error(s). 
BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:467: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:70: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/build.xml:271: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/common-build.xml:479: Check for forbidden API calls failed, see log. Total time: 66 minutes 37 seconds Build step 'Invoke Ant' marked build as failure [description-setter] Description set: Java: 64bit/ibm-j9-jdk7 -Xjit:exclude={org/apache/lucene/util/fst/FST.pack(IIF)Lorg/apache/lucene/util/fst/FST;} Archiving artifacts Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
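[Editor's note] The other forbidden call in this log, ServletRequest#getParameterMap(), is flagged because the servlet container decodes request parameters with its default charset unless explicitly configured. The failure mode can be reproduced with plain java.net.URLDecoder, no servlet container needed — decoding the same percent-encoded bytes with two charsets:

```java
import java.net.URLDecoder;

public class ParamEncodingExample {
    public static void main(String[] args) throws Exception {
        String raw = "caf%C3%A9"; // the UTF-8 bytes of "café", percent-encoded

        // Decoded with the charset the client actually used:
        System.out.println(URLDecoder.decode(raw, "UTF-8"));      // café

        // Decoded with a wrong container default: classic mojibake.
        System.out.println(URLDecoder.decode(raw, "ISO-8859-1")); // cafÃ©
    }
}
```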
[jira] [Updated] (SOLR-5746) solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; expects odd type for "shareSchema"
[ https://issues.apache.org/jira/browse/SOLR-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-5746: --- Attachment: SOLR-5746.patch bq. Since I stored "raw" config values, I used SolrParam to do the type conversion, but I didn’t find any API for a parameter removal. That’s why I’m keeping the original NamedList, so that I can remove correctly read values and keep track of the unknown ones. my previous suggestion about using SolrParams was vague and misguided. i _think_ i had in mind the idea of using SolrParams instead of the {{Map propMap}} -- your latest patch that eliminates the SolrParams and just uses the NamedList is definitely a better call. bq. ... However, I didn't realise that solr.xml files are versioned the same way as schema.xml files are. Should I bump the schema version to 1.6? It's not explicitly versioned -- just heuristically versioned based on whether ConfigSolr detects by introspection that it's the "old style" or the "new style" ... i think Jack may have just been confused about what this issue was when he posted his comment. I'm attaching some updates to your patch... * skimming your changes made me realize there's a lot of cruft in this code related to deferred sys prop substitution that's no longer needed at all - so I ripped that out. * I'm not really a fan of the way you added "excludedElements" to the DOMUtils method -- particularly since it still required the {{namedList.removeAll(null);}} call which seemed sloppy. I'd much rather have a tighter XPath that is very explicit about what we want out of the dom and handle the exclusions that way ... so i changed that. * i added some explicit testing of {{}} since i wasn't completely convinced that your new code would work correctly. 
* I think i misled you a bit when i said we should validate configs being declared multiple times -- it's not a good idea to check up front that nothing is declared more than once, because a week from now someone may in fact want to add something to solr.xml that can be specified multiple times. The better place for this type of validation is in storeConfigProperty, because at that point we know we expect there to be a single value. ** this does unfortunately mean it aborts early the first time it finds a duplicated key, so some of your tests had to be changed. * I switched the check for unknown options to be per section so the error msgs could include the section details as well. * String.format must be used with Locale.ROOT to prevent locale sensitive behavior. ** {{ant check-forbidden-apis}} will point out stuff like this for you in the lucene/solr code base * relaxed the int parsing so that small {{}} values are fine, but large longs still throw an error ** added test for both cases * added some checks that the sections themselves weren't being duplicated (ie: if a user adds a section to their solr.xml, we want to give them an error if another section already existed higher up in the file) * some general test refactoring... ** no need to construct new Random instances -- just use random() ** eliminated a lot of unnecessary file creation in the tests by using {{ConfigSolr.fromString(loader, solrXml);}} instead of {{FileUtils.writeFile(...)}} and {{ConfigSolr.fromSolrHome(...)}} ** switched to the lucene convention of test method naming to eliminate ~20 lines of {{@Test}} annotations (the verbosity is why our test runner explicitly lets us continue to use the JUnit3 convention) I think this is probably ready to go -- but it would be nice to get some review from [~romseygeek] and/or [~erickerickson] since they know this code the best ... 
and of course, [~maciej.zasada]: you've clearly been looking at this code a lot the last few days, do you have any additional thoughts on my revised patch? > solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; > expects odd type for "shareSchema" > -- > > Key: SOLR-5746 > URL: https://issues.apache.org/jira/browse/SOLR-5746 > Project: Solr > Issue Type: Bug >Affects Versions: 4.3, 4.4, 4.5, 4.6 >Reporter: Hoss Man > Attachments: SOLR-5746.patch, SOLR-5746.patch, SOLR-5746.patch, > SOLR-5746.patch, SOLR-5746.patch > > > A comment in the ref guide got me looking at ConfigSolrXml.java and noticing > that the parsing of solr.xml options here is very brittle and confusing. In > particular: > * if a boolean option "foo" is expected along the lines of {{<bool name="foo">true</bool>}} it will silently ignore {{<str name="foo">true</str>}} > * likewise for an int option {{<int name="bar">32</int>}} vs {{<str name="bar">32</str>}} > ... this is inconsistent with the way solrconfig.xml is parsed. In > solrconfig.xml, the xml nodes are parsed into a NamedList, and the abov
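[Editor's note] The "relaxed int parsing" bullet above (small int values parse fine, values that only fit in a long throw an error) can be sketched as follows. This is a hypothetical illustration, not Solr's actual ConfigSolr code; the method and option names are made up:

```java
public class SolrXmlIntParsing {
    // Parse a raw solr.xml option value as an int: accept anything in int
    // range, reject long-only values with a descriptive error.
    static int parseIntOption(String name, String raw) {
        final long v;
        try {
            v = Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                    "Option '" + name + "' is not a number: " + raw, e);
        }
        if (v < Integer.MIN_VALUE || v > Integer.MAX_VALUE) {
            throw new IllegalArgumentException(
                    "Option '" + name + "' is out of int range: " + raw);
        }
        return (int) v;
    }

    public static void main(String[] args) {
        System.out.println(parseIntOption("transientCacheSize", "32")); // fine
        try {
            parseIntOption("transientCacheSize", "9999999999"); // > Integer.MAX_VALUE
        } catch (IllegalArgumentException e) {
            System.out.println("rejected");
        }
    }
}
```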
[JENKINS] Lucene-Solr-4.x-Linux (32bit/jdk1.7.0_60) - Build # 10705 - Failure!
Build: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/10705/ Java: 32bit/jdk1.7.0_60 -server -XX:+UseSerialGC All tests passed Build Log: [...truncated 32600 lines...] -check-forbidden-all: [forbidden-apis] Reading bundled API signatures: jdk-unsafe-1.7 [forbidden-apis] Reading bundled API signatures: jdk-deprecated-1.7 [forbidden-apis] Reading bundled API signatures: commons-io-unsafe-2.3 [forbidden-apis] Reading API signatures: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/tools/forbiddenApis/base.txt [forbidden-apis] Reading API signatures: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/lucene/tools/forbiddenApis/servlet-api.txt [forbidden-apis] Loading classes to check... [forbidden-apis] Scanning for API signatures and dependencies... [forbidden-apis] Forbidden method invocation: java.util.concurrent.Executors#newFixedThreadPool(int) [Spawns threads with vague names; use a custom thread factory (Lucene's NamedThreadFactory, Solr's DefaultSolrThreadFactory) and name threads so that you can tell (by its name) which executor it is associated with] [forbidden-apis] in org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServerTest (ConcurrentUpdateSolrServerTest.java:167) [forbidden-apis] Forbidden method invocation: javax.servlet.ServletRequest#getParameterMap() [Servlet API method is parsing request parameters without using the correct encoding if no extra configuration is given in the servlet container] [forbidden-apis] in org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServerTest$TestServlet (ConcurrentUpdateSolrServerTest.java:85) [forbidden-apis] Scanned 368 (and 504 related) class file(s) for forbidden API invocations (in 0.15s), 2 error(s). 
BUILD FAILED /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:467: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/build.xml:70: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/build.xml:271: The following error occurred while executing this line: /mnt/ssd/jenkins/workspace/Lucene-Solr-4.x-Linux/solr/common-build.xml:479: Check for forbidden API calls failed, see log. Total time: 105 minutes 44 seconds Build step 'Invoke Ant' marked build as failure [description-setter] Description set: Java: 32bit/jdk1.7.0_60 -server -XX:+UseSerialGC Archiving artifacts Recording test results Email was triggered for: Failure - Any Sending email for trigger: Failure - Any - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062909#comment-14062909 ] Nathan Neulinger commented on SOLR-6251: Additionally - since this works 99.9% of the time - I would surely think that as blatant a problem as that would have been more visible. The incremental updates work normally without issue, and just randomly fail. > incorrect 'missing required field' during update - document definitely has it > - > > Key: SOLR-6251 > URL: https://issues.apache.org/jira/browse/SOLR-6251 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.8 > Environment: 4.8.0. Two nodes, SolrCloud, external ZK ensemble. All > on EC2. The two hosts are round-robin'd behind an ELB. >Reporter: Nathan Neulinger > Labels: replication > Attachments: schema.xml > > > Document added on solr1. We can see the distribute take place from solr1 to > solr2 and returning a success. Subsequent searches returning document, > clearly showing the field as being there. Later on, an update is done to add > to an element of the document - and the update fails. The update was sent to > solr2 instance. > Schema marks the 'timestamp' field as required, so the initial insert should > not work if the field isn't present. > Symptom is intermittent - we're seeing this randomly, with no warning or > triggering that we can see, but in all cases, it's getting the error in > response to an update when the instance tries to distribute the change to the > other node. > Searches that were run AFTER the update also show the field as being present > in the document. > Will add full trace of operations in the comments shortly. pcap captures of > ALL traffic for the two nodes on 8983 is available if requested. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062907#comment-14062907 ] Nathan Neulinger commented on SOLR-6251: Leaving closed, but adding more information in case Hoss Man will comment additionally. 'timestamp' is: stored=true indexed=false That seems to meet all of the requirements stated for partial updates unless 'indexed=true' is also required and not documented. > incorrect 'missing required field' during update - document definitely has it > - > > Key: SOLR-6251 > URL: https://issues.apache.org/jira/browse/SOLR-6251 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.8 > Environment: 4.8.0. Two nodes, SolrCloud, external ZK ensemble. All > on EC2. The two hosts are round-robin'd behind an ELB. >Reporter: Nathan Neulinger > Labels: replication > Attachments: schema.xml > > > Document added on solr1. We can see the distribute take place from solr1 to > solr2 and returning a success. Subsequent searches returning document, > clearly showing the field as being there. Later on, an update is done to add > to an element of the document - and the update fails. The update was sent to > solr2 instance. > Schema marks the 'timestamp' field as required, so the initial insert should > not work if the field isn't present. > Symptom is intermittent - we're seeing this randomly, with no warning or > triggering that we can see, but in all cases, it's getting the error in > response to an update when the instance tries to distribute the change to the > other node. > Searches that were run AFTER the update also show the field as being present > in the document. > Will add full trace of operations in the comments shortly. pcap captures of > ALL traffic for the two nodes on 8983 is available if requested. 
-- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nathan Neulinger updated SOLR-6251: --- Attachment: schema.xml schema attached > incorrect 'missing required field' during update - document definitely has it > - > > Key: SOLR-6251 > URL: https://issues.apache.org/jira/browse/SOLR-6251 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.8 > Environment: 4.8.0. Two nodes, SolrCloud, external ZK ensemble. All > on EC2. The two hosts are round-robin'd behind an ELB. >Reporter: Nathan Neulinger > Labels: replication > Attachments: schema.xml > > > Document added on solr1. We can see the distribute take place from solr1 to > solr2 and returning a success. Subsequent searches returning document, > clearly showing the field as being there. Later on, an update is done to add > to an element of the document - and the update fails. The update was sent to > solr2 instance. > Schema marks the 'timestamp' field as required, so the initial insert should > not work if the field isn't present. > Symptom is intermittent - we're seeing this randomly, with no warning or > triggering that we can see, but in all cases, it's getting the error in > response to an update when the instance tries to distribute the change to the > other node. > Searches that were run AFTER the update also show the field as being present > in the document. > Will add full trace of operations in the comments shortly. pcap captures of > ALL traffic for the two nodes on 8983 is available if requested. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062874#comment-14062874 ] Hoss Man commented on SOLR-2894: Hey Andrew, I probably won't have a chance to review this issue/patches again until monday - but some quick replies... bq. With the OVERREQUEST options uncommented we do not get the proper bbc value and so the distributed version diverges from the non-distrib. Your second comment on this issue is exactly on point. Just to clarify: you are saying that bbc isn't included in the "top" set in the distrib call because overrequest is so low, which is inconsistent with the control where bbc is in the top -- but all of the values returned by the distrib call do in fact have accurate refined counts ... correct? The point of that check is to definitely ensure that refinement works properly on facet.missing -- that's why i added it, because it wasn't before and the test didn't catch it because of the default overrequest -- so we can't eliminate those OVERREQUEST params. what we can do is explicitly call {{queryServer(...)}} instead of {{query(...)}} to hit a random distributed server but bypass the comparison with the control server -- in that case though we want a lot of tight assertions to ensure that we aren't missing anything. (of course: we can also include another check of the same facet.missing request with the overrequest disabled if you want -- no one ever complained about too many assertions in a test) bq. Should facet.missing respect the mincount (in this case it's 1)? I think so? .. if that's what the non-distrib code is doing, that's what the distrib code should do as well. 
> Implement distributed pivot faceting > > > Key: SOLR-2894 > URL: https://issues.apache.org/jira/browse/SOLR-2894 > Project: Solr > Issue Type: Improvement >Reporter: Erik Hatcher >Assignee: Hoss Man > Fix For: 4.9, 5.0 > > Attachments: SOLR-2894-mincount-minification.patch, > SOLR-2894-reworked.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894_cloud_test.patch, > dateToObject.patch, pivot_mincount_problem.sh > > > Following up on SOLR-792, pivot faceting currently only supports > undistributed mode. Distributed pivot faceting needs to be implemented. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
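The overrequest interaction Hoss describes can be illustrated with a small standalone simulation (hypothetical code, not Solr's actual two-phase faceting implementation): a term that is globally frequent but never in any single shard's top list only becomes a refinement candidate when enough extra terms are overrequested from each shard.

```python
# Hypothetical simulation (not Solr's actual code) of two-phase
# distributed faceting: with too little overrequest, a globally
# frequent term never becomes a refinement candidate, so the merged
# "top" list diverges from the single-node control even though every
# count that IS returned is accurate after refinement.
from collections import Counter

def shard_top(counts, n):
    """Top-n terms from one shard's local facet counts."""
    return [term for term, _ in counts.most_common(n)]

def distributed_facet(shards, limit, overrequest):
    # Phase 1: collect each shard's top (limit + overrequest) terms.
    candidates = set()
    for shard in shards:
        candidates.update(shard_top(shard, limit + overrequest))
    # Phase 2 (refinement): get exact per-shard counts for candidates.
    merged = Counter()
    for term in candidates:
        for shard in shards:
            merged[term] += shard.get(term, 0)
    return merged.most_common(limit)

# 'bbc' has the highest total count but is never any shard's #1 term.
shard1 = Counter({"aaa": 10, "bbc": 9})
shard2 = Counter({"ccc": 10, "bbc": 9})

control = (shard1 + shard2).most_common(1)            # [('bbc', 18)]
no_over = distributed_facet([shard1, shard2], 1, 0)   # misses 'bbc'
with_over = distributed_facet([shard1, shard2], 1, 1) # finds 'bbc'
```

This is also why simply dropping the test's OVERREQUEST params wouldn't help: the counts returned with low overrequest are individually accurate, so only tight explicit assertions (rather than a control comparison) can verify that case.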
[jira] [Commented] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062806#comment-14062806 ] Nathan Neulinger commented on SOLR-6251: We are open to diagnostic suggestions on this, but are at a loss since this appears to be very intermittent and non-reproducible other than by waiting. Looking at solrconfig.xml compared to what is currently in the 4.8.0 example - there are a variety of differences, most of which appear to be due to this config originally being based on the 4.4 solrconfig.xml example. > incorrect 'missing required field' during update - document definitely has it > - > > Key: SOLR-6251 > URL: https://issues.apache.org/jira/browse/SOLR-6251 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.8 > Environment: 4.8.0. Two nodes, SolrCloud, external ZK ensemble. All > on EC2. The two hosts are round-robin'd behind an ELB. >Reporter: Nathan Neulinger > Labels: replication > > Document added on solr1. We can see the distribute take place from solr1 to > solr2 and returning a success. Subsequent searches returning document, > clearly showing the field as being there. Later on, an update is done to add > to an element of the document - and the update fails. The update was sent to > solr2 instance. > Schema marks the 'timestamp' field as required, so the initial insert should > not work if the field isn't present. > Symptom is intermittent - we're seeing this randomly, with no warning or > triggering that we can see, but in all cases, it's getting the error in > response to an update when the instance tries to distribute the change to the > other node. > Searches that were run AFTER the update also show the field as being present > in the document. > Will add full trace of operations in the comments shortly. pcap captures of > ALL traffic for the two nodes on 8983 is available if requested. 
[jira] [Resolved] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-6251. Resolution: Not a Problem Resolving as not a problem - referred the user to the solr-user@lucene mailing list. Can reopen if more details indicate an actual bug. > incorrect 'missing required field' during update - document definitely has it > - > > Key: SOLR-6251 > URL: https://issues.apache.org/jira/browse/SOLR-6251 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.8 > Environment: 4.8.0. Two nodes, SolrCloud, external ZK ensemble. All > on EC2. The two hosts are round-robin'd behind an ELB. >Reporter: Nathan Neulinger > Labels: replication > > Document added on solr1. We can see the distribute take place from solr1 to > solr2 and returning a success. Subsequent searches returning document, > clearly showing the field as being there. Later on, an update is done to add > to an element of the document - and the update fails. The update was sent to > solr2 instance. > Schema marks the 'timestamp' field as required, so the initial insert should > not work if the field isn't present. > Symptom is intermittent - we're seeing this randomly, with no warning or > triggering that we can see, but in all cases, it's getting the error in > response to an update when the instance tries to distribute the change to the > other node. > Searches that were run AFTER the update also show the field as being present > in the document. > Will add full trace of operations in the comments shortly. pcap captures of > ALL traffic for the two nodes on 8983 is available if requested. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062802#comment-14062802 ] Hoss Man commented on SOLR-6251: Please ask questions about things like this on the solr-user@lucene list prior to filing a bug. You have not provided details about your schema, but based on the details you have provided, it appears that your timestamp field is not "stored", therefore you are probably hitting a documented limitation of using partial updates... https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents {panel} All original source fields must be stored for field modifiers to work correctly, which is the Solr default. {panel} > incorrect 'missing required field' during update - document definitely has it > - > > Key: SOLR-6251 > URL: https://issues.apache.org/jira/browse/SOLR-6251 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.8 > Environment: 4.8.0. Two nodes, SolrCloud, external ZK ensemble. All > on EC2. The two hosts are round-robin'd behind an ELB. >Reporter: Nathan Neulinger > Labels: replication > > Document added on solr1. We can see the distribute take place from solr1 to > solr2 and returning a success. Subsequent searches returning document, > clearly showing the field as being there. Later on, an update is done to add > to an element of the document - and the update fails. The update was sent to > solr2 instance. > Schema marks the 'timestamp' field as required, so the initial insert should > not work if the field isn't present. > Symptom is intermittent - we're seeing this randomly, with no warning or > triggering that we can see, but in all cases, it's getting the error in > response to an update when the instance tries to distribute the change to the > other node. > Searches that were run AFTER the update also show the field as being present > in the document. > Will add full trace of operations in the comments shortly. 
pcap captures of > ALL traffic for the two nodes on 8983 is available if requested. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
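The limitation Hoss quotes from the wiki can be sketched as a toy model (all function and variable names here are hypothetical; this is not Solr's real code): an atomic update rebuilds the document from its *stored* field values, so a field that is indexed but not stored silently disappears during the rebuild, and the required-field check then fails even though the field was present at insert time.

```python
# Toy model (hypothetical names, not Solr's real implementation) of
# the documented atomic-update limitation: the updated document is
# rebuilt from *stored* field values only, so an indexed-but-not-stored
# field vanishes during the rebuild and the required-field check fails.
REQUIRED_FIELDS = {"id", "timestamp"}

def atomic_update(existing_doc, stored_flags, modifiers):
    # Only stored fields survive into the rebuilt document
    # (lists are copied so the caller's document is not mutated).
    doc = {f: (list(v) if isinstance(v, list) else v)
           for f, v in existing_doc.items() if stored_flags.get(f)}
    for field, op in modifiers.items():
        if "add" in op:              # append to a multi-valued field
            doc.setdefault(field, []).append(op["add"])
        elif "set" in op:            # replace the field value
            doc[field] = op["set"]
    missing = REQUIRED_FIELDS - doc.keys()
    if missing:
        raise ValueError("missing required field: " + ", ".join(sorted(missing)))
    return doc

doc = {"id": "4b2c4d09", "timestamp": 1405027721000, "channel": ["dev"]}

# If timestamp were indexed but NOT stored, the rebuild would drop it:
# atomic_update(doc, {"id": True, "timestamp": False, "channel": True},
#               {"channel": {"add": "preet"}})   # raises ValueError

# With timestamp stored, the same {"add": ...} update succeeds.
updated = atomic_update(doc, {"id": True, "timestamp": True, "channel": True},
                        {"channel": {"add": "preet"}})
```

This also matches the reported symptom pattern: searches keep showing the field (it is still in the index), while the partial update fails because the rebuild sees only the stored view.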
[jira] [Comment Edited] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062800#comment-14062800 ] Nathan Neulinger edited comment on SOLR-6251 at 7/15/14 10:51 PM: -- and here's an update in that same debug log from shortly before the error (the distribute from the insert of the document on solr1): {noformat} 2014-07-10 21:29:49,313 INFO qtp1599863753-30844 [solr.update.processor.LogUpdateProcessor] - [d-_v22_shard1_replica2] webapp=/solr path=/update params={distrib.from=http://10.220.16.204:8983/solr/d-_v22_shard1_replica1/&update.distrib=TOLEADER&wt=javabin&version=2} {add=[4b2c4d09-31e2-4fe2-b767-3868efbdcda1 (1473278419196182528)]} 0 11 2014-07-10 21:29:49,416 INFO qtp1599863753-30844 [org.apache.solr.update.UpdateHandler] - start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} {noformat} was (Author: nneul): and here's an update in that same debug log from shortly before the error (the distribute from the insert of the document on solr1): 2014-07-10 21:29:49,313 INFO qtp1599863753-30844 [solr.update.processor.LogUpdateProcessor] - [d-_v22_shard1_replica2] webapp=/solr path=/update params={distrib.from=http://10.220.16.204:8983/solr/d-_v22_shard1_replica1/&update.distrib=TOLEADER&wt=javabin&version=2} {add=[4b2c4d09-31e2-4fe2-b767-3868efbdcda1 (1473278419196182528)]} 0 11 2014-07-10 21:29:49,416 INFO qtp1599863753-30844 [org.apache.solr.update.UpdateHandler] - start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} > incorrect 'missing required field' during update - document definitely has it > - > > Key: SOLR-6251 > URL: https://issues.apache.org/jira/browse/SOLR-6251 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.8 > Environment: 4.8.0. Two nodes, SolrCloud, external ZK ensemble. All > on EC2. 
The two hosts are round-robin'd behind an ELB. >Reporter: Nathan Neulinger > Labels: replication > > Document added on solr1. We can see the distribute take place from solr1 to > solr2 and returning a success. Subsequent searches returning document, > clearly showing the field as being there. Later on, an update is done to add > to an element of the document - and the update fails. The update was sent to > solr2 instance. > Schema marks the 'timestamp' field as required, so the initial insert should > not work if the field isn't present. > Symptom is intermittent - we're seeing this randomly, with no warning or > triggering that we can see, but in all cases, it's getting the error in > response to an update when the instance tries to distribute the change to the > other node. > Searches that were run AFTER the update also show the field as being present > in the document. > Will add full trace of operations in the comments shortly. pcap captures of > ALL traffic for the two nodes on 8983 is available if requested. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062793#comment-14062793 ] Nathan Neulinger edited comment on SOLR-6251 at 7/15/14 10:52 PM: -- {noformat} 16.24 = POD SRV 16.204 = SOLR 1 16.207 = SOLR 2 16.24 ⇒ 16.204 CAP 1 11344 14:29:49.299883 POST /solr/d-_v22/update/json?commit=true HTTP/1.1 host: d01-solr.srv.hivepoint.com Accept-Encoding: gzip,deflate Content-Type: application/json; charset=UTF-8 request_id: null 8677c2fb-8b92-4220-bb73-1e4c610d95be 2057 User-Agent: HivePoint (Factory JSON client:null:2056) X-Forwarded-For: 10.220.16.229 X-Forwarded-Port: 80 X-Forwarded-Proto: http Content-Length: 1555 Connection: keep-alive { "add": { "commitWithin" : 5000, "doc" : {"hive":"vdates","at":"2014-07-10T21:28:41Z","timestamp":1405027721000,"type":"MESSAGE","channel":["dev"],"from":"pr...@sevogle.com","to":["a...@sevogle.com","vi...@sevogle.com","d...@sevogle.com","s...@hive.sevogle.com"],"subject":"Re: Deployments - B and then C","body":"eve.SNIP...stem. ","id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1","message_id":"2014-07-10-77a6614c-66e4-4ddb-8566-dff4bfb743d1"} } } 16.204 ⇒ 16.207 CAP 1 POST /solr/d-_v22_shard1_replica2/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F10.220.16.204%3A8983%2Fsolr%2Fd-_v22_shard1_replica1%2F&wt=javabin&version=2 HTTP/1.1 User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0 Content-Type: application/javabin Transfer-Encoding: chunked Host: 10.220.16.207:8983 Connection: Keep-Alive 64c ...¶ms...update.distrib(TOLEADER.,distrib.from?.http://10.220.16.204:8983/solr/d-_v22_shard1_replica1/.&delByQ..'docsMap.?$hive&vdates."at42014-07-10T21:28:41Z.)timestampx...$type'MESSAGE.'channel.#dev.$from1pr...@sevogle.com."to.0adam@sevogle.com1vi...@sevogle.com/dev@sevogle.com4...@hive.sevogle.com.'subject>Re: Deployments - B and then C.$body?#eve.SNIP...tem. 
."id?.4b2c4d09-31e2-4fe2-b767-3868efbdcda1.*message_id?.2014-07-10-77a6614c-66e4-4ddb-8566-dff4bfb743d1 .."ow.."cwX... 0 16.207 ⇒ 16.204 CAP 1 11368 14:29:49.495301 HTTP/1.1 200 OK Content-Type: application/octet-stream Content-Length: 40 responseHeader..&status..%QTimeK 16.24 ⇒ 16.204 CAP 1 11371 14:29:49.496308 INDEX COMPLETE HTTP/1.1 200 OK Content-Type: text/plain;charset=UTF-8 Transfer-Encoding: chunked 2C {"responseHeader":{"status":0,"QTime":195}} 0 16.24 ⇒ 16.207 CAP 2 9218 14:29:57.065156 9232 14:29”57.099274 Search (two different search results to two servers?) that show the timestamp is set. POST /solr/d-_v22/select?indent=on&wt=json HTTP/1.1 host: d01-solr.srv.hivepoint.com Accept-Encoding: gzip,deflate Content-Type: application/x-www-form-urlencoded; charset=UTF-8 request_id: null 957d1ca5-7200-4058-9c70-16a17fc64c19 2069 User-Agent: HivePoint (Factory JSON client:null:2068) X-Forwarded-For: 10.220.16.229 X-Forwarded-Port: 80 X-Forwarded-Proto: http Content-Length: 244 Connection: keep-alive q=%2B%28*%29&fq=%2Bhive%3Avdates+AND+%2Bchannel%3A%28adam+bethany+dev+notifications+preet+share%29+AND+at%3A%5B2014-07-10T21%3A27%3A56Z+TO+*%5D&start=0&rows=300&sort=at+desc%2C+id+desc&fl=id,hive,timestamp,type,message_id,file_instance_id,scoreHTTP/1.1 200 OK Content-Type: text/plain;charset=UTF-8 Transfer-Encoding: chunked 2BB { "responseHeader":{ "status":0, "QTime":3, "params":{ "fl":"id,hive,timestamp,type,message_id,file_instance_id,score", "sort":"at desc, id desc", "indent":"on", "start":"0", "q":"+(*)", "wt":"json", "fq":"+hive:vdates AND +channel:(adam bethany dev notifications preet share) AND at:[2014-07-10T21:27:56Z TO *]", "rows":"300"}}, "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[ { "hive":"vdates", "timestamp":1405027721000, "type":"MESSAGE", "id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1", "message_id":"2014-07-10-77a6614c-66e4-4ddb-8566-dff4bfb743d1", "score":1.0}] }} 0 16.24 ⇒ 16.207 CAP 2 9415 14:30:00.310995 Update Channel POST 
/solr/d-_v22/update?commit=true HTTP/1.1 host: d01-solr.srv.hivepoint.com Accept-Encoding: gzip,deflate Content-Type: application/json; charset=UTF-8 request_id: null 92fa6c11-78d8-44cc-a143-9ff3e4c132f4 2115 User-Agent: HivePoint (Factory JSON client:null:2114) X-Forwarded-For: 10.220.16.229 X-Forwarded-Port: 80 X-Forwarded-Proto: http Content-Length: 102 Connection: keep-alive [{"id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1","channel": {"add": "preet"},"channel": {"add": "adam"}}]HTTP/1.1 400 Bad Request Content-Type: text/plain;charset=UTF-8 Transfer-Encoding: chunked 96 {"responseHeader":{"status":400,"QTime":1},"error":{"msg":"[doc=4b2c4d09-31e2-4fe2-b767-3868efbdcda1] missing required field: timestamp","code":400}} 0 CAP 2 9602 14:30:08.082758 Subsequent search, after update POST /
[jira] [Commented] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062800#comment-14062800 ] Nathan Neulinger commented on SOLR-6251: and here's an update in that same debug log from shortly before the error (the distribute from the insert of the document on solr1): 2014-07-10 21:29:49,313 INFO qtp1599863753-30844 [solr.update.processor.LogUpdateProcessor] - [d-_v22_shard1_replica2] webapp=/solr path=/update params={distrib.from=http://10.220.16.204:8983/solr/d-_v22_shard1_replica1/&update.distrib=TOLEADER&wt=javabin&version=2} {add=[4b2c4d09-31e2-4fe2-b767-3868efbdcda1 (1473278419196182528)]} 0 11 2014-07-10 21:29:49,416 INFO qtp1599863753-30844 [org.apache.solr.update.UpdateHandler] - start commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false} > incorrect 'missing required field' during update - document definitely has it > - > > Key: SOLR-6251 > URL: https://issues.apache.org/jira/browse/SOLR-6251 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.8 > Environment: 4.8.0. Two nodes, SolrCloud, external ZK ensemble. All > on EC2. The two hosts are round-robin'd behind an ELB. >Reporter: Nathan Neulinger > Labels: replication > > Document added on solr1. We can see the distribute take place from solr1 to > solr2 and returning a success. Subsequent searches returning document, > clearly showing the field as being there. Later on, an update is done to add > to an element of the document - and the update fails. The update was sent to > solr2 instance. > Schema marks the 'timestamp' field as required, so the initial insert should > not work if the field isn't present. > Symptom is intermittent - we're seeing this randomly, with no warning or > triggering that we can see, but in all cases, it's getting the error in > response to an update when the instance tries to distribute the change to the > other node. 
> Searches that were run AFTER the update also show the field as being present > in the document. > Will add full trace of operations in the comments shortly. pcap captures of > ALL traffic for the two nodes on 8983 is available if requested. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062796#comment-14062796 ] Nathan Neulinger commented on SOLR-6251: this is the occurrence of the error on the server the update ran on 2014-07-10 21:30:00,313 ERROR qtp1599863753-30801 [org.apache.solr.core.SolrCore ] - org.apache.solr.common.SolrException: [doc=4b2c4d09-31e2-4fe2-b767-3868efbdcda1] missing required field: timestamp at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:189) at org.apache.solr.update.AddUpdateCommand.getLuceneDocument(AddUpdateCommand.java:77) at org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:234) at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:160) at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69) at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51) at org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:704) at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:858) at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:557) at org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100) at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.handleAdds(JsonLoader.java:393) at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.processUpdate(JsonLoader.java:118) at org.apache.solr.handler.loader.JsonLoader$SingleThreadedJsonLoader.load(JsonLoader.java:102) at org.apache.solr.handler.loader.JsonLoader.load(JsonLoader.java:66) at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92) at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:368) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953) at 
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:745) > incorrect 'missi
[jira] [Commented] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
[ https://issues.apache.org/jira/browse/SOLR-6251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062793#comment-14062793 ] Nathan Neulinger commented on SOLR-6251: 16.24 = POD SRV 16.204 = SOLR 1 16.207 = SOLR 2 16.24 ⇒ 16.204 CAP 1 11344 14:29:49.299883 POST /solr/d-_v22/update/json?commit=true HTTP/1.1 host: d01-solr.srv.hivepoint.com Accept-Encoding: gzip,deflate Content-Type: application/json; charset=UTF-8 request_id: null 8677c2fb-8b92-4220-bb73-1e4c610d95be 2057 User-Agent: HivePoint (Factory JSON client:null:2056) X-Forwarded-For: 10.220.16.229 X-Forwarded-Port: 80 X-Forwarded-Proto: http Content-Length: 1555 Connection: keep-alive { "add": { "commitWithin" : 5000, "doc" : {"hive":"vdates","at":"2014-07-10T21:28:41Z","timestamp":1405027721000,"type":"MESSAGE","channel":["dev"],"from":"pr...@sevogle.com","to":["a...@sevogle.com","vi...@sevogle.com","d...@sevogle.com","s...@hive.sevogle.com"],"subject":"Re: Deployments - B and then C","body":"eve.SNIP...stem. ","id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1","message_id":"2014-07-10-77a6614c-66e4-4ddb-8566-dff4bfb743d1"} } } 16.204 ⇒ 16.207 CAP 1 POST /solr/d-_v22_shard1_replica2/update?update.distrib=TOLEADER&distrib.from=http%3A%2F%2F10.220.16.204%3A8983%2Fsolr%2Fd-_v22_shard1_replica1%2F&wt=javabin&version=2 HTTP/1.1 User-Agent: Solr[org.apache.solr.client.solrj.impl.HttpSolrServer] 1.0 Content-Type: application/javabin Transfer-Encoding: chunked Host: 10.220.16.207:8983 Connection: Keep-Alive 64c ...¶ms...update.distrib(TOLEADER.,distrib.from?.http://10.220.16.204:8983/solr/d-_v22_shard1_replica1/.&delByQ..'docsMap.?$hive&vdates."at42014-07-10T21:28:41Z.)timestampx...$type'MESSAGE.'channel.#dev.$from1pr...@sevogle.com."to.0adam@sevogle.com1vi...@sevogle.com/dev@sevogle.com4...@hive.sevogle.com.'subject>Re: Deployments - B and then C.$body?#eve.SNIP...tem. 
."id?.4b2c4d09-31e2-4fe2-b767-3868efbdcda1.*message_id?.2014-07-10-77a6614c-66e4-4ddb-8566-dff4bfb743d1 .."ow.."cwX... 0 16.207 ⇒ 16.204 CAP 1 11368 14:29:49.495301 HTTP/1.1 200 OK Content-Type: application/octet-stream Content-Length: 40 responseHeader..&status..%QTimeK 16.24 ⇒ 16.204 CAP 1 11371 14:29:49.496308 INDEX COMPLETE HTTP/1.1 200 OK Content-Type: text/plain;charset=UTF-8 Transfer-Encoding: chunked 2C {"responseHeader":{"status":0,"QTime":195}} 0 16.24 ⇒ 16.207 CAP 2 9218 14:29:57.065156 9232 14:29”57.099274 Search (two different search results to two servers?) that show the timestamp is set. POST /solr/d-_v22/select?indent=on&wt=json HTTP/1.1 host: d01-solr.srv.hivepoint.com Accept-Encoding: gzip,deflate Content-Type: application/x-www-form-urlencoded; charset=UTF-8 request_id: null 957d1ca5-7200-4058-9c70-16a17fc64c19 2069 User-Agent: HivePoint (Factory JSON client:null:2068) X-Forwarded-For: 10.220.16.229 X-Forwarded-Port: 80 X-Forwarded-Proto: http Content-Length: 244 Connection: keep-alive q=%2B%28*%29&fq=%2Bhive%3Avdates+AND+%2Bchannel%3A%28adam+bethany+dev+notifications+preet+share%29+AND+at%3A%5B2014-07-10T21%3A27%3A56Z+TO+*%5D&start=0&rows=300&sort=at+desc%2C+id+desc&fl=id,hive,timestamp,type,message_id,file_instance_id,scoreHTTP/1.1 200 OK Content-Type: text/plain;charset=UTF-8 Transfer-Encoding: chunked 2BB { "responseHeader":{ "status":0, "QTime":3, "params":{ "fl":"id,hive,timestamp,type,message_id,file_instance_id,score", "sort":"at desc, id desc", "indent":"on", "start":"0", "q":"+(*)", "wt":"json", "fq":"+hive:vdates AND +channel:(adam bethany dev notifications preet share) AND at:[2014-07-10T21:27:56Z TO *]", "rows":"300"}}, "response":{"numFound":1,"start":0,"maxScore":1.0,"docs":[ { "hive":"vdates", "timestamp":1405027721000, "type":"MESSAGE", "id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1", "message_id":"2014-07-10-77a6614c-66e4-4ddb-8566-dff4bfb743d1", "score":1.0}] }} 0 16.24 ⇒ 16.207 CAP 2 9415 14:30:00.310995 Update Channel POST 
/solr/d-_v22/update?commit=true HTTP/1.1 host: d01-solr.srv.hivepoint.com Accept-Encoding: gzip,deflate Content-Type: application/json; charset=UTF-8 request_id: null 92fa6c11-78d8-44cc-a143-9ff3e4c132f4 2115 User-Agent: HivePoint (Factory JSON client:null:2114) X-Forwarded-For: 10.220.16.229 X-Forwarded-Port: 80 X-Forwarded-Proto: http Content-Length: 102 Connection: keep-alive [{"id":"4b2c4d09-31e2-4fe2-b767-3868efbdcda1","channel": {"add": "preet"},"channel": {"add": "adam"}}]HTTP/1.1 400 Bad Request Content-Type: text/plain;charset=UTF-8 Transfer-Encoding: chunked 96 {"responseHeader":{"status":400,"QTime":1},"error":{"msg":"[doc=4b2c4d09-31e2-4fe2-b767-3868efbdcda1] missing required field: timestamp","code":400}} 0 CAP 2 9602 14:30:08.082758 Subsequent search, after update POST /solr/d-_v22/select?indent=on&wt=json HTTP/1.1 host: d01-solr.s
[jira] [Commented] (LUCENE-5681) Fix RAMDirectory's IndexInput to not double-buffer on slice()
[ https://issues.apache.org/jira/browse/LUCENE-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062795#comment-14062795 ] Robert Muir commented on LUCENE-5681: - Looks good, thanks Uwe! > Fix RAMDirectory's IndexInput to not double-buffer on slice() > - > > Key: LUCENE-5681 > URL: https://issues.apache.org/jira/browse/LUCENE-5681 > Project: Lucene - Core > Issue Type: Bug > Components: core/store >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 5.0, 4.10 > > Attachments: LUCENE-5681.patch, LUCENE-5681.patch, LUCENE-5681.patch > > > After LUCENE-4371, we still have a non-optimal implementation of > IndexInput#slice() in RAMDirectory. We should fix that to use the cloning > approach like other directories do -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6251) incorrect 'missing required field' during update - document definitely has it
Nathan Neulinger created SOLR-6251: -- Summary: incorrect 'missing required field' during update - document definitely has it Key: SOLR-6251 URL: https://issues.apache.org/jira/browse/SOLR-6251 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.8 Environment: 4.8.0. Two nodes, SolrCloud, external ZK ensemble. All on EC2. The two hosts are round-robin'd behind an ELB. Reporter: Nathan Neulinger Document added on solr1. We can see the distribute take place from solr1 to solr2 and returning a success. Subsequent searches returning document, clearly showing the field as being there. Later on, an update is done to add to an element of the document - and the update fails. The update was sent to solr2 instance. Schema marks the 'timestamp' field as required, so the initial insert should not work if the field isn't present. Symptom is intermittent - we're seeing this randomly, with no warning or triggering that we can see, but in all cases, it's getting the error in response to an update when the instance tries to distribute the change to the other node. Searches that were run AFTER the update also show the field as being present in the document. Will add full trace of operations in the comments shortly. pcap captures of ALL traffic for the two nodes on 8983 is available if requested. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5681) Fix RAMDirectory's IndexInput to not double-buffer on slice()
[ https://issues.apache.org/jira/browse/LUCENE-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5681: -- Fix Version/s: 4.10 > Fix RAMDirectory's IndexInput to not double-buffer on slice() > - > > Key: LUCENE-5681 > URL: https://issues.apache.org/jira/browse/LUCENE-5681 > Project: Lucene - Core > Issue Type: Bug > Components: core/store >Reporter: Uwe Schindler > Fix For: 5.0, 4.10 > > Attachments: LUCENE-5681.patch, LUCENE-5681.patch, LUCENE-5681.patch > > > After LUCENE-4371, we still have a non-optimal implementation of > IndexInput#slice() in RAMDirectory. We should fix that to use the cloning > approach like other directories do -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (LUCENE-5681) Fix RAMDirectory's IndexInput to not double-buffer on slice()
[ https://issues.apache.org/jira/browse/LUCENE-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-5681: - Assignee: Uwe Schindler > Fix RAMDirectory's IndexInput to not double-buffer on slice() > - > > Key: LUCENE-5681 > URL: https://issues.apache.org/jira/browse/LUCENE-5681 > Project: Lucene - Core > Issue Type: Bug > Components: core/store >Reporter: Uwe Schindler >Assignee: Uwe Schindler > Fix For: 5.0, 4.10 > > Attachments: LUCENE-5681.patch, LUCENE-5681.patch, LUCENE-5681.patch > > > After LUCENE-4371, we still have a non-optimal implementation of > IndexInput#slice() in RAMDirectory. We should fix that to use the cloning > approach like other directories do -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5681) Fix RAMDirectory's IndexInput to not double-buffer on slice()
[ https://issues.apache.org/jira/browse/LUCENE-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5681: -- Attachment: LUCENE-5681.patch Improve IllegalArgumentExceptions, be more strict on out-of-bounds slice. I will commit this tomorrow and backport to 4.10. > Fix RAMDirectory's IndexInput to not double-buffer on slice() > - > > Key: LUCENE-5681 > URL: https://issues.apache.org/jira/browse/LUCENE-5681 > Project: Lucene - Core > Issue Type: Bug > Components: core/store >Reporter: Uwe Schindler > Fix For: 5.0 > > Attachments: LUCENE-5681.patch, LUCENE-5681.patch, LUCENE-5681.patch > > > After LUCENE-4371, we still have a non-optimal implementation of > IndexInput#slice() in RAMDirectory. We should fix that to use the cloning > approach like other directories do -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
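The "be more strict on out-of-bounds slice" part of the patch description can be illustrated with a small sketch (hypothetical names, not the actual LUCENE-5681 patch): a slice shares the underlying buffer, in the spirit of the cloning approach, and rejects out-of-range requests with an IllegalArgumentException instead of silently mis-reading.

```java
// Sketch only: a slice that shares the backing array (no double-buffering)
// and validates its bounds eagerly. Names are illustrative, not Lucene's.
class SliceDemo {
    static final class Slice {
        final byte[] data;
        final int offset, length;

        Slice(byte[] data, int offset, int length) {
            if (offset < 0 || length < 0 || offset + length > data.length) {
                throw new IllegalArgumentException(
                    "slice(offset=" + offset + ", length=" + length
                    + ") is out of bounds: [0, " + data.length + ")");
            }
            this.data = data;   // shared, not copied
            this.offset = offset;
            this.length = length;
        }

        byte byteAt(int pos) {
            return data[offset + pos];
        }
    }

    public static void main(String[] args) {
        byte[] file = {10, 20, 30, 40, 50};
        Slice s = new Slice(file, 1, 3);
        System.out.println(s.byteAt(0)); // 20
        try {
            new Slice(file, 3, 5); // 3 + 5 > 5: rejected
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```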
[jira] [Resolved] (SOLR-6179) ManagedResource repeatedly logs warnings when not used
[ https://issues.apache.org/jira/browse/SOLR-6179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter resolved SOLR-6179. -- Resolution: Fixed Fix Version/s: 4.10 5.0 > ManagedResource repeatedly logs warnings when not used > -- > > Key: SOLR-6179 > URL: https://issues.apache.org/jira/browse/SOLR-6179 > Project: Solr > Issue Type: Bug >Affects Versions: 4.8, 4.8.1 > Environment: >Reporter: Hoss Man >Assignee: Timothy Potter > Fix For: 5.0, 4.10 > > > These messages are currently logged as WARNings, and should either be > switched to INFO level (or made more sophisticated so that it can tell when > solr is setup for managed resources but the data isn't available)... > {noformat} > 2788 [coreLoadExecutor-5-thread-1] WARN org.apache.solr.rest.ManagedResource > – No stored data found for /rest/managed > 2788 [coreLoadExecutor-5-thread-1] WARN org.apache.solr.rest.ManagedResource > – No registered observers for /rest/managed > {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-6136) ConcurrentUpdateSolrServer includes a Spin Lock
[ https://issues.apache.org/jira/browse/SOLR-6136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter resolved SOLR-6136. -- Resolution: Fixed Fix Version/s: 4.10 5.0 > ConcurrentUpdateSolrServer includes a Spin Lock > --- > > Key: SOLR-6136 > URL: https://issues.apache.org/jira/browse/SOLR-6136 > Project: Solr > Issue Type: Bug > Components: SolrCloud >Affects Versions: 4.6, 4.6.1, 4.7, 4.7.1, 4.7.2, 4.8, 4.8.1 >Reporter: Brandon Chapman >Assignee: Timothy Potter >Priority: Critical > Fix For: 5.0, 4.10 > > Attachments: SOLR-6136.patch, wait___notify_all.patch > > > ConcurrentUpdateSolrServer.blockUntilFinished() includes a Spin Lock. This > causes an extremely high amount of CPU to be used on the Cloud Leader during > indexing. > Here is a summary of our system testing. > Importing data on Solr4.5.0: > Throughput gets as high as 240 documents per second. > [tomcat@solr-stg01 logs]$ uptime > 09:53:50 up 310 days, 23:52, 1 user, load average: 3.33, 3.72, 5.43 > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 9547 tomcat 21 0 6850m 1.2g 16m S 86.2 5.0 1:48.81 java > Importing data on Solr4.7.0 with no replicas: > Throughput peaks at 350 documents per second. > [tomcat@solr-stg01 logs]$ uptime > 10:03:44 up 311 days, 2 min, 1 user, load average: 4.57, 2.55, 4.18 > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 9728 tomcat 23 0 6859m 2.2g 28m S 62.3 9.0 2:20.20 java > Importing data on Solr4.7.0 with replicas: > Throughput peaks at 30 documents per second because the Solr machine is out > of CPU. > [tomcat@solr-stg01 logs]$ uptime > 09:40:04 up 310 days, 23:38, 1 user, load average: 30.54, 12.39, 4.79 > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 9190 tomcat 17 0 7005m 397m 15m S 198.5 1.6 7:14.87 java -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
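The load averages above are characteristic of a busy-wait. A minimal sketch of the kind of change the attached wait___notify_all.patch name suggests (a hypothetical class, not the real ConcurrentUpdateSolrServer internals): instead of spinning on a condition, the blocking thread parks in wait() and is woken by notifyAll() when the pending count drops.

```java
// Illustrative only: replacing a spin loop like
//   while (pending > 0) { /* burn CPU */ }
// with monitor wait()/notifyAll(). Not the actual SOLR-6136 patch.
class QueueDrain {
    private int pending = 0;

    synchronized void submit() {
        pending++;
    }

    synchronized void complete() {
        pending--;
        notifyAll(); // wake any thread blocked in blockUntilFinished()
    }

    synchronized int pending() {
        return pending;
    }

    synchronized void blockUntilFinished() throws InterruptedException {
        while (pending > 0) {
            wait(); // releases the monitor; no CPU used while waiting
        }
    }
}
```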
[jira] [Updated] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Muldowney updated SOLR-2894: --- Attachment: SOLR-2894.patch I've uploaded a new file with my {{facet.missing}} changes. It's got the small and longtail working. {code:title=DistributedFacetPivotLargeTest.java}rsp = query( "q", "*:*", "fq", "-place_s:0placeholder", "rows", "0", "facet","true", "facet.limit","1", "facet.missing","true", //FacetParams.FACET_OVERREQUEST_RATIO, "0", // force refine //FacetParams.FACET_OVERREQUEST_COUNT, "0", // force refine "facet.pivot","special_s,company_t");{code} This test gets whacky when the {{OVERREQUEST}} options are uncommented. With the {{OVERREQUEST}} options uncommented we do not get the proper {{bbc}} value and so the distributed version diverges from the non-distrib. Your second comment on this issue is exactly on point. Another variance in that test is that on the distrib side we get {code} {field=special_s,value=,count=3,pivot=[ {field=company_t,value=microsoft,count=2}, {field=company_t,value=null,count=0}]} {code} whereas for the non-distrib we just get {code} {field=special_s,value=,count=3,pivot=[ {field=company_t,value=microsoft,count=2}]} {code} Should {{facet.missing}} respect the {{mincount}} (in this case it's 1)? 
> Implement distributed pivot faceting > > > Key: SOLR-2894 > URL: https://issues.apache.org/jira/browse/SOLR-2894 > Project: Solr > Issue Type: Improvement >Reporter: Erik Hatcher >Assignee: Hoss Man > Fix For: 4.9, 5.0 > > Attachments: SOLR-2894-mincount-minification.patch, > SOLR-2894-reworked.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894_cloud_test.patch, > dateToObject.patch, pivot_mincount_problem.sh > > > Following up on SOLR-792, pivot faceting currently only supports > undistributed mode. Distributed pivot faceting needs to be implemented. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6163) special chars and ManagedSynonymFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-6163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062671#comment-14062671 ] Timothy Potter commented on SOLR-6163: -- I'll take a look but I think the fix should be upstream from the managed resource implementations, seems like Restlet should have already done the decoding? > special chars and ManagedSynonymFilterFactory > - > > Key: SOLR-6163 > URL: https://issues.apache.org/jira/browse/SOLR-6163 > Project: Solr > Issue Type: Bug >Affects Versions: 4.8 >Reporter: Wim Kumpen > > Hey, > I was playing with the ManagedSynonymFilterFactory to create a synonym list > with the API. But I have difficulties when my keys contains special > characters (or spaces) to delete them... > I added a key ééé that matches with some other words. It's saved in the > synonym file as ééé. > When I try to delete it, I do: > curl -X DELETE > "http://localhost/solr/mycore/schema/analysis/synonyms/english/ééé"; > error message: %C3%A9%C3%A9%C3%A9%C2%B5 not found in > /schema/analysis/synonyms/english > A wild guess from me is that %C3%A9 isn't decoded back to ééé. And that's why > he can't find the keyword? -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
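The reporter's guess can be checked directly: the percent-encoded sequence from the error message does decode back to the UTF-8 key, so a missing decode step on the path segment would explain the lookup failure.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Demonstrates the decoding step the bug report suspects is missing:
// the percent-encoded path segment decodes back to the UTF-8 synonym key.
class DecodeDemo {
    public static void main(String[] args) {
        String encoded = "%C3%A9%C3%A9%C3%A9";
        String decoded = URLDecoder.decode(encoded, StandardCharsets.UTF_8);
        System.out.println(decoded); // ééé
    }
}
```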
[jira] [Updated] (SOLR-3617) Consider adding start scripts.
[ https://issues.apache.org/jira/browse/SOLR-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter updated SOLR-3617: - Attachment: SOLR-3617.patch Previous patch was hosed ... here's a good one. > Consider adding start scripts. > -- > > Key: SOLR-3617 > URL: https://issues.apache.org/jira/browse/SOLR-3617 > Project: Solr > Issue Type: New Feature >Reporter: Mark Miller > Attachments: SOLR-3617.patch, SOLR-3617.patch > > > I've always found that starting Solr with java -jar start.jar is a little odd > if you are not a java guy, but I think there are bigger pros than looking > less odd in shipping some start scripts. > Not only do you get a cleaner start command: > sh solr.sh or solr.bat or something > But you also can do a couple other little nice things: > * it becomes fairly obvious for a new casual user to see how to start the > system without reading doc. > * you can make the working dir the location of the script - this lets you > call the start script from another dir and still have all the relative dir > setup work. > * have an out of the box place to save startup params like -Xmx. > * we could have multiple start scripts - say solr-dev.sh that logged to the > console and default to sys default for RAM - and also solr-prod which was > fully configured for logging, pegged Xms and Xmx at some larger value (1GB?) > etc. > You would still of course be able to make the java cmd directly - and that is > probably what you would do when it's time to run as a service - but these > could be good starter scripts to get people on the right track and improve > the initial user experience. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3617) Consider adding start scripts.
[ https://issues.apache.org/jira/browse/SOLR-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter updated SOLR-3617: - Attachment: (was: SOLR-3617.patch) > Consider adding start scripts. > -- > > Key: SOLR-3617 > URL: https://issues.apache.org/jira/browse/SOLR-3617 > Project: Solr > Issue Type: New Feature >Reporter: Mark Miller > Attachments: SOLR-3617.patch, SOLR-3617.patch > > > I've always found that starting Solr with java -jar start.jar is a little odd > if you are not a java guy, but I think there are bigger pros than looking > less odd in shipping some start scripts. > Not only do you get a cleaner start command: > sh solr.sh or solr.bat or something > But you also can do a couple other little nice things: > * it becomes fairly obvious for a new casual user to see how to start the > system without reading doc. > * you can make the working dir the location of the script - this lets you > call the start script from another dir and still have all the relative dir > setup work. > * have an out of the box place to save startup params like -Xmx. > * we could have multiple start scripts - say solr-dev.sh that logged to the > console and default to sys default for RAM - and also solr-prod which was > fully configured for logging, pegged Xms and Xmx at some larger value (1GB?) > etc. > You would still of course be able to make the java cmd directly - and that is > probably what you would do when it's time to run as a service - but these > could be good starter scripts to get people on the right track and improve > the initial user experience. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-3617) Consider adding start scripts.
[ https://issues.apache.org/jira/browse/SOLR-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Timothy Potter updated SOLR-3617: - Attachment: SOLR-3617.patch Here's an updated patch with the following: 1) bin/solr.cmd For our Windows users ;-) I did my best to emulate the behavior of the Linux script (bin/solr). The main difference between the Windows version and the Linux version is that I don't know how to implement the stop all in the .cmd version, so currently, the user needs to do: solr stop -p PORT. In the Linux, if you don't pass the port, it stops all running Solrs, which may be too heavy handed anyway so easiest might be to fix the Linux version to require the port too. In general, I'm not a Windows user so there may be some things in this implementation that can be done cleaner / easier. Happy to have suggestions on how to improve it. 2) Added a restart mode, which stops and starts the Solr server again. Thus, if you re-issue start when a server is running, it complains vs. stop / start as it did before. Lastly, I'm still sorting out how to implement the cloud example. Specifically, I'm wondering if we should walk the user through using prompts and reading from stdin, that way it's a little more interactive, something like: $ cd solr-5.0.0/bin $ ./solr -e cloud Welcome to the SolrCloud example! Please enter the number of local nodes you would like to run? [2] <-- default 2 Ok, let's start up 2 Solr nodes for your example SolrCloud cluster. Enter the port for node1: [8983] 8983 Enter the port for node2: [7574] 7574 Oops! It looks like there is already something running on port 7574, please choose another port: [7575] 7575 Ok, starting node1 on port 8983 with embedded ZooKeeper listening on localhost:9983 Now starting node2 on port 7575 Success! Found 2 active nodes in your cluster. Do you want to create a new collection? Y/n Y Collection name? [collection1] collection1 How many shards? [2] 2 How many replicas per shard? 
[1] 1 Ok, collection1 created successfully, do you want to index some documents? Y/n Y Path to file to index: [example/exampledocs/mem.xml] ... We could also just support a -q flag to indicate "quiet" mode and just accept all defaults from the interactive session. And of course -V will activate a verbose mode that probably shows more of the commands being run during the interactive session. > Consider adding start scripts. > -- > > Key: SOLR-3617 > URL: https://issues.apache.org/jira/browse/SOLR-3617 > Project: Solr > Issue Type: New Feature >Reporter: Mark Miller > Attachments: SOLR-3617.patch, SOLR-3617.patch > > > I've always found that starting Solr with java -jar start.jar is a little odd > if you are not a java guy, but I think there are bigger pros than looking > less odd in shipping some start scripts. > Not only do you get a cleaner start command: > sh solr.sh or solr.bat or something > But you also can do a couple other little nice things: > * it becomes fairly obvious for a new casual user to see how to start the > system without reading doc. > * you can make the working dir the location of the script - this lets you > call the start script from another dir and still have all the relative dir > setup work. > * have an out of the box place to save startup params like -Xmx. > * we could have multiple start scripts - say solr-dev.sh that logged to the > console and default to sys default for RAM - and also solr-prod which was > fully configured for logging, pegged Xms and Xmx at some larger value (1GB?) > etc. > You would still of course be able to make the java cmd directly - and that is > probably what you would do when it's time to run as a service - but these > could be good starter scripts to get people on the right track and improve > the initial user experience. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5473) Split clusterstate.json per collection and watch states selectively
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062608#comment-14062608 ] Jessica Cheng commented on SOLR-5473: - I think there can be a race condition in CloudSolrServer's state-caching if the state is fetched just when a collection is created but none of the replicas have been added. If this state is cached, then until it expires, all the requests will fail-fast with an empty theUrlList. We may need to optionally skip caching if any of the shards has no replicas at all. > Split clusterstate.json per collection and watch states selectively > > > Key: SOLR-5473 > URL: https://issues.apache.org/jira/browse/SOLR-5473 > Project: Solr > Issue Type: Sub-task > Components: SolrCloud >Reporter: Noble Paul >Assignee: Noble Paul > Fix For: 5.0 > > Attachments: SOLR-5473-74 .patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74_POC.patch, SOLR-5473-configname-fix.patch, SOLR-5473.patch, > SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, > SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, > SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, > SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, > SOLR-5473.patch, SOLR-5473_undo.patch, ec2-23-20-119-52_solr.log, > ec2-50-16-38-73_solr.log > > > As defined in the parent 
issue, store the states of each collection under > /collections/collectionname/state.json node -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
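The guard proposed in the comment above ("optionally skip caching if any of the shards has no replicas at all") could look roughly like the following sketch (hypothetical types, not the real CloudSolrServer code): only cache a fetched state when every shard already has at least one replica, so a just-created collection is re-fetched instead of poisoning the cache.

```java
import java.util.List;
import java.util.Map;

// Sketch of the proposed cache guard; types are simplified stand-ins for
// the real cluster-state structures.
class StateCacheGuard {
    static boolean safeToCache(Map<String, List<String>> shardToReplicas) {
        if (shardToReplicas.isEmpty()) {
            return false; // collection just created, no shards published yet
        }
        for (List<String> replicas : shardToReplicas.values()) {
            if (replicas.isEmpty()) {
                return false; // would produce an empty theUrlList until expiry
            }
        }
        return true;
    }
}
```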
[jira] [Commented] (LUCENE-5823) recognize hunspell FULLSTRIP option in affix file
[ https://issues.apache.org/jira/browse/LUCENE-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062527#comment-14062527 ] Ryan Ernst commented on LUCENE-5823: LGTM. > recognize hunspell FULLSTRIP option in affix file > - > > Key: LUCENE-5823 > URL: https://issues.apache.org/jira/browse/LUCENE-5823 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Robert Muir > Fix For: 5.0, 4.10 > > Attachments: LUCENE-5823.patch > > > With LUCENE-5818 we fixed stripping to be correct (ensuring it doesnt strip > the entire word before applying an affix). This is usually true, but there is > an option in the affix file to allow this. > Its used by several languages (french, latvian, swedish, etc) > {noformat} > FULLSTRIP > With FULLSTRIP, affix rules can strip full words, not only one > less characters, before adding the affixes > {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-6137) Managed Schema / Schemaless and SolrCloud concurrency issues
[ https://issues.apache.org/jira/browse/SOLR-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe resolved SOLR-6137. -- Resolution: Fixed Fix Version/s: 4.10 5.0 Assignee: Steve Rowe Resolving - remaining issues will be dealt with on the issues Gregory raised. Thanks Gregory! > Managed Schema / Schemaless and SolrCloud concurrency issues > > > Key: SOLR-6137 > URL: https://issues.apache.org/jira/browse/SOLR-6137 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis, SolrCloud >Reporter: Gregory Chanan >Assignee: Steve Rowe > Fix For: 5.0, 4.10 > > Attachments: SOLR-6137.patch, SOLR-6137.patch, SOLR-6137v2.patch, > SOLR-6137v3.patch, SOLR-6137v4.patch > > > This is a follow up to a message on the mailing list, linked here: > http://mail-archives.apache.org/mod_mbox/lucene-dev/201406.mbox/%3CCAKfebOOcMeVEb010SsdcH8nta%3DyonMK5R7dSFOsbJ_tnre0O7w%40mail.gmail.com%3E > The Managed Schema integration with SolrCloud seems pretty limited. > The issue I'm running into is variants of the issue that schema changes are > not pushed to all shards/replicas synchronously. So, for example, I can make > the following two requests: > 1) add a field to the collection on server1 using the Schema API > 2) add a document with the new field, the document is routed to a core on > server2 > Then, there appears to be a race between when the document is processed by > the core on server2 and when the core on server2, via the > ZkIndexSchemaReader, gets the new schema. If the document is processed > first, I get a 400 error because the field doesn't exist. This is easily > reproducible by adding a sleep to the ZkIndexSchemaReader's processing. > I hit a similar issue with Schemaless: the distributed request handler sends > out the document updates, but there is no guarantee that the other > shards/replicas see the schema changes made by the update.chain. 
> Another issue I noticed today: making multiple schema API calls concurrently > can block; that is, one may get through and the other may infinite loop. > So, for reference, the issues include: > 1) Schema API changes return success before all cores are updated; subsequent > calls attempting to use new schema may fail > 2) Schemaless changes may fail on replicas/other shards for the same reason > 3) Concurrent Schema API changes may block > From Steve Rowe on the mailing list: > {quote} > For Schema API users, delaying a couple of seconds after adding fields before > using them should workaround this problem. While not ideal, I think schema > field additions are rare enough in the Solr collection lifecycle that this is > not a huge problem. > For schemaless users, the picture is worse, as you noted. Immediate > distribution of documents triggering schema field addition could easily prove > problematic. Maybe we need a schema update blocking mode, where after the ZK > schema node watch is triggered, all new request processing is halted until > the schema is finished downloading/parsing/swapping out? (Such a mode should > help Schema API users too.) > {quote} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6137) Managed Schema / Schemaless and SolrCloud concurrency issues
[ https://issues.apache.org/jira/browse/SOLR-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062508#comment-14062508 ] Gregory Chanan commented on SOLR-6137: -- Thanks [~sar...@syr.edu]! Your changes make sense. bq. Schema API changes return success before all cores are updated; subsequent calls attempting to use new schema may fail I filed SOLR-6249 for this. bq. One small issue I noticed is that there is a race between parsing and schema addition. I filed SOLR-6250 for this bq. Anything else? Nope. > Managed Schema / Schemaless and SolrCloud concurrency issues > > > Key: SOLR-6137 > URL: https://issues.apache.org/jira/browse/SOLR-6137 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis, SolrCloud >Reporter: Gregory Chanan > Attachments: SOLR-6137.patch, SOLR-6137.patch, SOLR-6137v2.patch, > SOLR-6137v3.patch, SOLR-6137v4.patch > > > This is a follow up to a message on the mailing list, linked here: > http://mail-archives.apache.org/mod_mbox/lucene-dev/201406.mbox/%3CCAKfebOOcMeVEb010SsdcH8nta%3DyonMK5R7dSFOsbJ_tnre0O7w%40mail.gmail.com%3E > The Managed Schema integration with SolrCloud seems pretty limited. > The issue I'm running into is variants of the issue that schema changes are > not pushed to all shards/replicas synchronously. So, for example, I can make > the following two requests: > 1) add a field to the collection on server1 using the Schema API > 2) add a document with the new field, the document is routed to a core on > server2 > Then, there appears to be a race between when the document is processed by > the core on server2 and when the core on server2, via the > ZkIndexSchemaReader, gets the new schema. If the document is processed > first, I get a 400 error because the field doesn't exist. This is easily > reproducible by adding a sleep to the ZkIndexSchemaReader's processing. 
> I hit a similar issue with Schemaless: the distributed request handler sends > out the document updates, but there is no guarantee that the other > shards/replicas see the schema changes made by the update.chain. > Another issue I noticed today: making multiple schema API calls concurrently > can block; that is, one may get through and the other may infinite loop. > So, for reference, the issues include: > 1) Schema API changes return success before all cores are updated; subsequent > calls attempting to use new schema may fail > 2) Schemaless changes may fail on replicas/other shards for the same reason > 3) Concurrent Schema API changes may block > From Steve Rowe on the mailing list: > {quote} > For Schema API users, delaying a couple of seconds after adding fields before > using them should workaround this problem. While not ideal, I think schema > field additions are rare enough in the Solr collection lifecycle that this is > not a huge problem. > For schemaless users, the picture is worse, as you noted. Immediate > distribution of documents triggering schema field addition could easily prove > problematic. Maybe we need a schema update blocking mode, where after the ZK > schema node watch is triggered, all new request processing is halted until > the schema is finished downloading/parsing/swapping out? (Such a mode should > help Schema API users too.) > {quote} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6250) Schemaless parsing does not work on a consistent schema
Gregory Chanan created SOLR-6250: Summary: Schemaless parsing does not work on a consistent schema Key: SOLR-6250 URL: https://issues.apache.org/jira/browse/SOLR-6250 Project: Solr Issue Type: Improvement Components: Schema and Analysis Reporter: Gregory Chanan See this comment (https://issues.apache.org/jira/browse/SOLR-6137?focusedCommentId=14044366&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14044366), reproduced here: bq. One small issue I noticed is that there is a race between parsing and schema addition. The AddSchemaFieldsUpdateProcessor handles this by only working on a fixed schema, so the schema doesn't change underneath it. If it decides on a schema addition and that fails (because another addition beat it), it will grab the latest schema and retry. But the parsers don't do that so the core's schema can change in the middle of parsing. It may make sense to defend against that by moving the retry code from the AddSchemaFieldsUpdateProcessor to some processor that runs before all the parsers. The downside is if the schema addition fails, you have to rerun all the parsers, but that may be a minor concern. bq. This may not actually matter. Consider the case tested at the end of the test: two documents are simultaneously inserted with the same field having a Long and Date value. Assume the Date wins the schema "race" and is updated first. While parsing the Long, each parser may see the schema as having a date field or no field. If a valid parser (that is, one that can modify the field value) sees a date field, it won't do any modifications because shouldMutate will fail, leaving the object in whatever state the serializer left it (either Long or String). If it sees no field, it will mutate the object to create a Long object. In either case, we should get an error at the point we actually create the lucene document, because neither a Long nor String-representation-of-a-long can be stored in a Date field. 
This is pretty difficult to reason about though. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
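The retry behavior the comment attributes to AddSchemaFieldsUpdateProcessor is an optimistic-concurrency loop, sketched here with a simplified, hypothetical API (not the actual processor code): work against a schema snapshot, and if the addition loses the race, grab the latest schema and retry.

```java
// Sketch of optimistic retry for schema additions; the Store interface is a
// hypothetical stand-in for "the current schema plus conditional update".
class OptimisticRetry {
    interface Store {
        int currentVersion();
        // Applies the addition only if the schema is still at expectedVersion.
        boolean tryAddField(int expectedVersion, String field);
    }

    static void addFieldWithRetry(Store store, String field) {
        while (true) {
            int snapshot = store.currentVersion(); // re-read latest schema
            if (store.tryAddField(snapshot, field)) {
                return; // our addition won
            }
            // another addition beat us: loop and retry against the new schema
        }
    }
}
```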
[jira] [Updated] (LUCENE-5681) Fix RAMDirectory's IndexInput to not double-buffer on slice()
[ https://issues.apache.org/jira/browse/LUCENE-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5681: -- Attachment: LUCENE-5681.patch Added ineffectivity warning to {{BufferedIndexInput#wrap()}} and cleaned up sliceDescription to be consistent. > Fix RAMDirectory's IndexInput to not double-buffer on slice() > - > > Key: LUCENE-5681 > URL: https://issues.apache.org/jira/browse/LUCENE-5681 > Project: Lucene - Core > Issue Type: Bug > Components: core/store >Reporter: Uwe Schindler > Fix For: 5.0 > > Attachments: LUCENE-5681.patch, LUCENE-5681.patch > > > After LUCENE-4371, we still have a non-optimal implementation of > IndexInput#slice() in RAMDirectory. We should fix that to use the cloning > approach like other directories do -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6249) Schema API changes return success before all cores are updated
Gregory Chanan created SOLR-6249: Summary: Schema API changes return success before all cores are updated Key: SOLR-6249 URL: https://issues.apache.org/jira/browse/SOLR-6249 Project: Solr Issue Type: Improvement Components: Schema and Analysis, SolrCloud Reporter: Gregory Chanan See SOLR-6137 for more details. The basic issue is that Schema API changes return success when the first core is updated, but other cores asynchronously read the updated schema from ZooKeeper. So a client application could make a Schema API change and then index some documents based on the new schema that may fail on other nodes. Possible fixes: 1) Make the Schema API calls synchronous 2) Give the client some ability to track the state of the schema. They can already do this to a certain extent by checking the Schema API on all the replicas and verifying that the field has been added, though this is pretty cumbersome. Maybe it makes more sense to do this sort of thing on the collection level, i.e. Schema API changes return the zk version to the client. We add an API to return the current zk version. On a replica, if the zk version is >= the version the client has, the client knows that replica has at least seen the schema change. We could also provide an API to do the distribution and checking across the different replicas of the collection so that clients don't need to do that themselves. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
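Option 2 above, having the client verify that every replica has caught up to a given zk version, might be sketched like this (all names hypothetical; the version-fetching call is an assumption, not an existing Solr API):

```java
import java.util.List;
import java.util.function.Function;

// Sketch of the client-side check: after a schema change returns zk version
// N, confirm each replica reports a schema version >= N before indexing.
class SchemaVersionWait {
    static boolean allReplicasAtLeast(List<String> replicaUrls,
                                      Function<String, Integer> fetchVersion,
                                      int expected) {
        for (String url : replicaUrls) {
            if (fetchVersion.apply(url) < expected) {
                return false; // this replica has not seen the change yet
            }
        }
        return true; // safe to index documents that use the new field
    }
}
```

A caller would poll this in a loop (with a backoff) until it returns true, which is the cumbersome manual workaround the issue describes automating.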
[jira] [Commented] (LUCENE-5681) Fix RAMDirectory's IndexInput to not double-buffer on slice()
[ https://issues.apache.org/jira/browse/LUCENE-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062475#comment-14062475 ] Uwe Schindler commented on LUCENE-5681: --- This improvement is especially important for slices of NRTCachingDirectory, because it uses RAMDirectory internally, too! > Fix RAMDirectory's IndexInput to not double-buffer on slice() > - > > Key: LUCENE-5681 > URL: https://issues.apache.org/jira/browse/LUCENE-5681 > Project: Lucene - Core > Issue Type: Bug > Components: core/store >Reporter: Uwe Schindler > Fix For: 5.0 > > Attachments: LUCENE-5681.patch > > > After LUCENE-4371, we still have a non-optimal implementation of > IndexInput#slice() in RAMDirectory. We should fix that to use the cloning > approach like other directories do -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5681) Fix RAMDirectory's IndexInput to not double-buffer on slice()
[ https://issues.apache.org/jira/browse/LUCENE-5681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-5681: -- Attachment: LUCENE-5681.patch Patch, including a new test. The default impl is now only used by Solr. We should fix that too, and remove BufferedIndexInput.wrap() completely. > Fix RAMDirectory's IndexInput to not double-buffer on slice() > - > > Key: LUCENE-5681 > URL: https://issues.apache.org/jira/browse/LUCENE-5681 > Project: Lucene - Core > Issue Type: Bug > Components: core/store >Reporter: Uwe Schindler > Fix For: 5.0 > > Attachments: LUCENE-5681.patch > > > After LUCENE-4371, we still have a non-optimal implementation of > IndexInput#slice() in RAMDirectory. We should fix that to use the cloning > approach like other directories do -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
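The "cloning approach" the issue asks for can be illustrated with a simplified, self-contained view type (this is not Lucene's IndexInput code, just the idea): a slice is a cheap offset/length view over the same underlying bytes, rather than a second buffered reader layered on top of the first.

```java
// Simplified sketch of slice-by-cloning: no copy, no extra buffer layer,
// just a re-based offset/length over the same backing array.
public class ByteView {
    private final byte[] data;
    private final int offset;
    private final int length;

    public ByteView(byte[] data, int offset, int length) {
        if (offset < 0 || length < 0 || offset + length > data.length) {
            throw new IllegalArgumentException("slice out of bounds");
        }
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    /** Slicing creates another view over the same bytes; nothing is copied. */
    public ByteView slice(int sliceOffset, int sliceLength) {
        if (sliceOffset < 0 || sliceLength < 0 || sliceOffset + sliceLength > length) {
            throw new IllegalArgumentException("slice out of bounds");
        }
        return new ByteView(data, offset + sliceOffset, sliceLength);
    }

    public byte readByte(int pos) {
        return data[offset + pos];
    }

    public int length() {
        return length;
    }
}
```

Double-buffering, by contrast, would wrap the slice in its own read buffer even though the parent already buffers, which is the overhead the patch removes.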
[jira] [Comment Edited] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062438#comment-14062438 ] Andrew Muldowney edited comment on SOLR-2894 at 7/15/14 6:28 PM: - I've been making generally good headway on the .missing problem. We've got a new {{PivotFacetFieldValueCollection}} that should deal with the {{null}} values properly. Right now the Small and LongTail tests pass but the Long fails on the new {{facet.limit=1}} and {{facet.missing=true}} case with {{SPECIAL}}. The control response doesn't include the {{null}} and the distributed response doesn't get the count of {{bbc}} right, it only gets 150 and I'm sure the 298 it gets for {{microsoft}} is wrong too. There is something on the shard side code that is not happy with our "" and {{null}} values. I'm working on that right now. My assumption is that the {{facet.missing}} request makes it out to all the shards so we never need to refine on it since all shards responded with the full information, but I guess that isn't always the case since other fields under that null value might have limits that would need to be refined on? was (Author: andrew.muldowney): I've been making generally good headway on the .missing problem. We've got a new {{PivotFacetFieldValueCollection}} that should deal with the {{null}} values properly. Right now the Small and LongTail tests pass but the Long fails on the new {{facet.limit=1}} and {{facet.missing=true}} case with {{SPECIAL}}. The control response doesn't include the {{null}} and the distributed response doesn't get the count of {{bbc}} right, it only gets 150 and I'm sure the 298 it gets for {{microsoft}} is wrong too. There is something on the shard side code that is not happy with our "" and {{null}} values. I'm working on that right now. 
> Implement distributed pivot faceting > > > Key: SOLR-2894 > URL: https://issues.apache.org/jira/browse/SOLR-2894 > Project: Solr > Issue Type: Improvement >Reporter: Erik Hatcher >Assignee: Hoss Man > Fix For: 4.9, 5.0 > > Attachments: SOLR-2894-mincount-minification.patch, > SOLR-2894-reworked.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894_cloud_test.patch, dateToObject.patch, > pivot_mincount_problem.sh > > > Following up on SOLR-792, pivot faceting currently only supports > undistributed mode. Distributed pivot faceting needs to be implemented. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062438#comment-14062438 ] Andrew Muldowney commented on SOLR-2894: I've been making generally good headway on the .missing problem. We've got a new {{PivotFacetFieldValueCollection}} that should deal with the {{null}} values properly. Right now the Small and LongTail tests pass but the Long fails on the new {{facet.limit=1}} and {{facet.missing=true}} case with {{SPECIAL}}. The control response doesn't include the {{null}} and the distributed response doesn't get the count of {{bbc}} right, it only gets 150 and I'm sure the 298 it gets for {{microsoft}} is wrong too. There is something on the shard side code that is not happy with our "" and {{null}} values. I'm working on that right now. > Implement distributed pivot faceting > > > Key: SOLR-2894 > URL: https://issues.apache.org/jira/browse/SOLR-2894 > Project: Solr > Issue Type: Improvement >Reporter: Erik Hatcher >Assignee: Hoss Man > Fix For: 4.9, 5.0 > > Attachments: SOLR-2894-mincount-minification.patch, > SOLR-2894-reworked.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, > SOLR-2894.patch, SOLR-2894_cloud_test.patch, dateToObject.patch, > pivot_mincount_problem.sh > > > Following up on SOLR-792, pivot faceting currently only supports > undistributed mode. Distributed pivot faceting needs to be implemented. 
[jira] [Updated] (SOLR-6248) MoreLikeThis Query Parser
[ https://issues.apache.org/jira/browse/SOLR-6248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Rowe updated SOLR-6248: - Description: MLT Component doesn't let people highlight/paginate and the handler comes with an cost of maintaining another piece in the config. Also, any changes to the default (number of results to be fetched etc.) /select handler need to be copied/synced with this handler too. Having an MLT QParser would let users get back docs based on a query for them to paginate, highlight etc. It would also give them the flexibility to use this anywhere i.e. q,fq,bq etc. A bit of history about MLT (thanks to Hoss) MLT Handler pre-dates the existence of QParsers and was meant to take an arbitrary query as input, find docs that match that query, club them together to find interesting terms, and then use those terms as if they were my main query to generate a main result set. This result would then be used as the set to facet, highlight etc. The flow: Query -> DocList(m) -> Bag (terms) -> Query -> DocList\(y) The MLT component on the other hand solved a very different purpose of augmenting the main result set. It is used to get similar docs for each of the doc in the main result set. DocSet\(n) -> n * Bag (terms) -> n * (Query) -> n * DocList(m) The new approach: All of this can be done better and cleaner (and makes more sense too) using an MLT QParser. An important thing to handle here is the case where the user doesn't have TermVectors, in which case, it does what happens right now i.e. parsing stored fields. Also, in case the user doesn't have a field (to be used for MLT) indexed, the field would need to be a TextField with an index analyzer defined. This analyzer will then be used to extract terms for MLT. In case of SolrCloud mode, '/get-termvectors' can be used after looking at the schema (if TermVectors are enabled for the field). If not, a /get call can be used to fetch the field and parse it. 
was: MLT Component doesn't let people highlight/paginate and the handler comes with an cost of maintaining another piece in the config. Also, any changes to the default (number of results to be fetched etc.) /select handler need to be copied/synced with this handler too. Having an MLT QParser would let users get back docs based on a query for them to paginate, highlight etc. It would also give them the flexibility to use this anywhere i.e. q,fq,bq etc. A bit of history about MLT (thanks to Hoss) MLT Handler pre-dates the existence of QParsers and was meant to take an arbitrary query as input, find docs that match that query, club them together to find interesting terms, and then use those terms as if they were my main query to generate a main result set. This result would then be used as the set to facet, highlight etc. The flow: Query -> DocList(m) -> Bag (terms) -> Query -> DocList(y) The MLT component on the other hand solved a very different purpose of augmenting the main result set. It is used to get similar docs for each of the doc in the main result set. DocSet(n) -> n * Bag (terms) -> n * (Query) -> n * DocList(m) The new approach: All of this can be done better and cleaner (and makes more sense too) using an MLT QParser. An important thing to handle here is the case where the user doesn't have TermVectors, in which case, it does what happens right now i.e. parsing stored fields. Also, in case the user doesn't have a field (to be used for MLT) indexed, the field would need to be a TextField with an index analyzer defined. This analyzer will then be used to extract terms for MLT. In case of SolrCloud mode, '/get-termvectors' can be used after looking at the schema (if TermVectors are enabled for the field). If not, a /get call can be used to fetch the field and parse it. 
> MoreLikeThis Query Parser > - > > Key: SOLR-6248 > URL: https://issues.apache.org/jira/browse/SOLR-6248 > Project: Solr > Issue Type: New Feature >Reporter: Anshum Gupta > > MLT Component doesn't let people highlight/paginate and the handler comes > with an cost of maintaining another piece in the config. Also, any changes to > the default (number of results to be fetched etc.) /select handler need to be > copied/synced with this handler too. > Having an MLT QParser would let users get back docs based on a query for them > to paginate, highlight etc. It would also give them the flexibility to use > this anywhere i.e. q,fq,bq etc. > A bit of history about MLT (thanks to Hoss) > MLT Handler pre-dates the existence of QParsers and was meant to take an > arbitrary query as input, find docs that match that > query, club them together to find interesting terms, and then use those > terms as if they were my main query to generate a main result set. > This result would then be used as the set to facet, highli
[jira] [Commented] (SOLR-5480) Make MoreLikeThisHandler distributable
[ https://issues.apache.org/jira/browse/SOLR-5480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062358#comment-14062358 ] Anshum Gupta commented on SOLR-5480: [~vzhovtiuk] This is very different from what this JIRA talks about and not in line with the existing patches/intent. I have created a new JIRA (SOLR-6248) that is fit for this approach. It should be able to (functionally) solve the issue that this JIRA talks about. > Make MoreLikeThisHandler distributable > -- > > Key: SOLR-5480 > URL: https://issues.apache.org/jira/browse/SOLR-5480 > Project: Solr > Issue Type: Improvement >Reporter: Steve Molloy >Assignee: Noble Paul > Attachments: SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, > SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, SOLR-5480.patch, > SOLR-5480.patch > > > The MoreLikeThis component, when used in the standard search handler supports > distributed searches. But the MoreLikeThisHandler itself doesn't, which > prevents from say, passing in text to perform the query. I'll start looking > into adapting the SearchHandler logic to the MoreLikeThisHandler. If anyone > has some work done already and want to share, or want to contribute, any help > will be welcomed. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-6137) Managed Schema / Schemaless and SolrCloud concurrency issues
[ https://issues.apache.org/jira/browse/SOLR-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062352#comment-14062352 ] Steve Rowe commented on SOLR-6137: -- [~gchanan], I've committed your patch to trunk and branch_4x with minor modifications (see previous comment). I think what's left are: bq. Schema API changes return success before all cores are updated; subsequent calls attempting to use new schema may fail and bq. One small issue I noticed is that there is a race between parsing and schema addition. A new issue for this one seems like a good idea. Anything else? > Managed Schema / Schemaless and SolrCloud concurrency issues > > > Key: SOLR-6137 > URL: https://issues.apache.org/jira/browse/SOLR-6137 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis, SolrCloud >Reporter: Gregory Chanan > Attachments: SOLR-6137.patch, SOLR-6137.patch, SOLR-6137v2.patch, > SOLR-6137v3.patch, SOLR-6137v4.patch > > > This is a follow up to a message on the mailing list, linked here: > http://mail-archives.apache.org/mod_mbox/lucene-dev/201406.mbox/%3CCAKfebOOcMeVEb010SsdcH8nta%3DyonMK5R7dSFOsbJ_tnre0O7w%40mail.gmail.com%3E > The Managed Schema integration with SolrCloud seems pretty limited. > The issue I'm running into is variants of the issue that schema changes are > not pushed to all shards/replicas synchronously. So, for example, I can make > the following two requests: > 1) add a field to the collection on server1 using the Schema API > 2) add a document with the new field, the document is routed to a core on > server2 > Then, there appears to be a race between when the document is processed by > the core on server2 and when the core on server2, via the > ZkIndexSchemaReader, gets the new schema. If the document is processed > first, I get a 400 error because the field doesn't exist. This is easily > reproducible by adding a sleep to the ZkIndexSchemaReader's processing. 
> I hit a similar issue with Schemaless: the distributed request handler sends > out the document updates, but there is no guarantee that the other > shards/replicas see the schema changes made by the update.chain. > Another issue I noticed today: making multiple schema API calls concurrently > can block; that is, one may get through and the other may infinite loop. > So, for reference, the issues include: > 1) Schema API changes return success before all cores are updated; subsequent > calls attempting to use new schema may fail > 2) Schemaless changes may fail on replicas/other shards for the same reason > 3) Concurrent Schema API changes may block > From Steve Rowe on the mailing list: > {quote} > For Schema API users, delaying a couple of seconds after adding fields before > using them should workaround this problem. While not ideal, I think schema > field additions are rare enough in the Solr collection lifecycle that this is > not a huge problem. > For schemaless users, the picture is worse, as you noted. Immediate > distribution of documents triggering schema field addition could easily prove > problematic. Maybe we need a schema update blocking mode, where after the ZK > schema node watch is triggered, all new request processing is halted until > the schema is finished downloading/parsing/swapping out? (Such a mode should > help Schema API users too.) > {quote} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
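The "schema update blocking mode" floated in the quoted text can be sketched with a read-write lock: requests read the schema under a shared lock, and the ZK watch handler swaps it under an exclusive lock, so new request processing halts until the download/parse/swap finishes. The names here are illustrative, not Solr's actual classes.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hedged sketch of a blocking schema holder: readers (request threads) block
// while a swap (triggered by the ZK schema node watch) is in progress.
public class BlockingSchemaHolder<S> {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private S schema;

    public BlockingSchemaHolder(S initial) {
        this.schema = initial;
    }

    /** Request path: blocks until any in-progress swap completes. */
    public S currentSchema() {
        lock.readLock().lock();
        try {
            return schema;
        } finally {
            lock.readLock().unlock();
        }
    }

    /** ZK watch path: new readers wait until the new schema is fully in place. */
    public void swap(Supplier<S> downloadAndParse) {
        lock.writeLock().lock();
        try {
            schema = downloadAndParse.get(); // download/parse happens under the lock
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

This would help schemaless users most, since their document updates are what trigger the field additions in the first place.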
[jira] [Created] (SOLR-6248) MoreLikeThis Query Parser
Anshum Gupta created SOLR-6248: -- Summary: MoreLikeThis Query Parser Key: SOLR-6248 URL: https://issues.apache.org/jira/browse/SOLR-6248 Project: Solr Issue Type: New Feature Reporter: Anshum Gupta MLT Component doesn't let people highlight/paginate and the handler comes with a cost of maintaining another piece in the config. Also, any changes to the default (number of results to be fetched etc.) /select handler need to be copied/synced with this handler too. Having an MLT QParser would let users get back docs based on a query for them to paginate, highlight etc. It would also give them the flexibility to use this anywhere i.e. q,fq,bq etc. A bit of history about MLT (thanks to Hoss) MLT Handler pre-dates the existence of QParsers and was meant to take an arbitrary query as input, find docs that match that query, club them together to find interesting terms, and then use those terms as if they were my main query to generate a main result set. This result would then be used as the set to facet, highlight etc. The flow: Query -> DocList(m) -> Bag (terms) -> Query -> DocList(y) The MLT component on the other hand solved a very different purpose of augmenting the main result set. It is used to get similar docs for each of the docs in the main result set. DocSet(n) -> n * Bag (terms) -> n * (Query) -> n * DocList(m) The new approach: All of this can be done better and cleaner (and makes more sense too) using an MLT QParser. An important thing to handle here is the case where the user doesn't have TermVectors, in which case, it does what happens right now i.e. parsing stored fields. Also, in case the user doesn't have a field (to be used for MLT) indexed, the field would need to be a TextField with an index analyzer defined. This analyzer will then be used to extract terms for MLT. In case of SolrCloud mode, '/get-termvectors' can be used after looking at the schema (if TermVectors are enabled for the field).
If not, a /get call can be used to fetch the field and parse it. -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
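The handler flow described above (Query -> DocList(m) -> Bag (terms) -> Query -> DocList(y)) can be sketched in miniature. This is an illustrative toy, not Solr's MLT code: docs are plain strings, the "bag" is a term-frequency map, and the follow-up query is a simple disjunction of the top terms.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Toy sketch of the MLT handler flow: club matched docs' terms into a bag,
// then turn the most frequent ("interesting") terms into the next query.
public class MltSketch {

    /** Bag (terms): count term frequencies across the matched docs. */
    public static Map<String, Integer> bagOfTerms(List<String> matchedDocs) {
        Map<String, Integer> counts = new HashMap<>();
        for (String doc : matchedDocs) {
            for (String term : doc.toLowerCase().split("\\s+")) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Build the follow-up query from the top-k terms (count desc, then term). */
    public static String interestingTermsQuery(List<String> matchedDocs, int k) {
        return bagOfTerms(matchedDocs).entrySet().stream()
                .sorted((a, b) -> {
                    int c = Integer.compare(b.getValue(), a.getValue());
                    return c != 0 ? c : a.getKey().compareTo(b.getKey());
                })
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.joining(" OR "));
    }
}
```

The MLT component flow differs only in that this bag-and-requery step runs once per document of the main result set rather than once overall.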
[jira] [Commented] (LUCENE-5819) Add block tree postings format that supports term ords
[ https://issues.apache.org/jira/browse/LUCENE-5819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062343#comment-14062343 ] Michael McCandless commented on LUCENE-5819: The gist of the change here is that the terms index FST, via a new custom Outputs impl FSTOrdsOutputs, now also stores the start and end ord range for each block. The end ord is also necessary because the terms don't neatly fall into just the leaf blocks: "straggler" terms can easily fall inside inner blocks, and in this case we need the end ord of the lower blocks to realize the term is a "straggler". The on-disk blocks themselves are nearly the same; the only difference is when a block writes a pointer to a sub-block, it now also writes (vlong) how many terms are in that sub-block. This way when we are seeking by ord and skip that sub-block we know how many ords were just skipped. I made a custom getByOutput to handle the ranges, falling back to the last range that included the target ord while recursing. Otherwise the terms dict is basically the same as the normal block tree, including optimized intersect (w/o ord() implemented: not sure we need it), except all seek/next operations also compute the term ord. Floor blocks also store the term ord each one starts on. > Add block tree postings format that supports term ords > -- > > Key: LUCENE-5819 > URL: https://issues.apache.org/jira/browse/LUCENE-5819 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/other >Reporter: Michael McCandless >Assignee: Michael McCandless > Fix For: 5.0, 4.10 > > Attachments: LUCENE-5819.patch, LUCENE-5819.patch > > > BlockTree is our default terms dictionary today, but it doesn't > support term ords, which is an optional API in the postings format to > retrieve the ordinal for the currently seek'd term, and also later > seek by that ordinal e.g. to lookup the term. > This can possibly be useful for e.g. 
faceting, and maybe at some point > we can share the postings terms dict with the one used by sorted/set > DV for cases when app wants to invert and facet on a given field. > The older (3.x) block terms dict can easily support ords, and we have > a Lucene41OrdsPF in test-framework, but it's not as fast / compact as > block-tree, and doesn't (can't easily) implement an optimized > intersect, but it could be for fields we'd want to facet on, these > tradeoffs don't matter. It's nice to have options... -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
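The comment's point about storing per-sub-block term counts can be shown with a stripped-down example (not Lucene's block tree code): when seeking by ord, a sub-block whose range doesn't contain the target can be skipped wholesale, and its stored term count says exactly how many ords were skipped.

```java
// Simplified illustration of seek-by-ord over blocks that carry term counts.
// Real block tree blocks are nested and FST-indexed; here they are just a
// flat array of counts.
public class OrdSeekSketch {

    /** Returns the index of the block containing targetOrd, or -1 if out of range. */
    public static int blockContaining(int[] blockTermCounts, long targetOrd) {
        long skipped = 0;
        for (int i = 0; i < blockTermCounts.length; i++) {
            if (targetOrd < skipped + blockTermCounts[i]) {
                return i; // target falls inside this block
            }
            // The stored count lets us skip the whole block and still know
            // exactly how many ords we passed over.
            skipped += blockTermCounts[i];
        }
        return -1; // ord beyond the last term
    }
}
```

The end-ord bookkeeping for "straggler" terms in inner blocks is the extra wrinkle the FSTOrdsOutputs ranges handle on top of this.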
Solr checkIfIAmLeader usage from ZK event thread
Currently when a replica is watching the current leader's ephemeral node and the leader disappears, it runs the leadership check along with its two-way peer sync, ZK update etc. on the ZK event thread where the watch was fired. What this means is that for instances with lots of cores, you would be serializing leadership elections, and the last in the list could take a long time to have a replacement elected (during which you will have no leader). I did a quick change to make the checkIfIAmLeader call async, but Solr cloud tests being what they are (thanks Shalin for cleaning them up btw :) ), I wanted to check if I am doing something stupid. If not, I will raise a JIRA. One concern could be that you might end up with two elections for the same shard, but I can't see how that might happen.
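The async idea can be sketched like this (names are made up, not Solr's): instead of running the leadership check on the ZK event thread, each shard hands it to its own single-threaded executor. Elections for different shards then proceed in parallel, while elections for the same shard stay serialized, which addresses the two-elections-per-shard concern.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: offload leadership checks from the ZK event thread to
// per-shard single-threaded executors.
public class AsyncElection {
    private final Map<String, ExecutorService> perShard = new ConcurrentHashMap<>();

    /** Called from the ZK watcher when the leader's ephemeral node vanishes. */
    public Future<?> onLeaderGone(String shard, Runnable checkIfIAmLeader) {
        ExecutorService exec = perShard.computeIfAbsent(
                shard, s -> Executors.newSingleThreadExecutor());
        // The ZK event thread returns immediately; same-shard checks queue up
        // behind each other, so no two run concurrently for one shard.
        return exec.submit(checkIfIAmLeader);
    }

    public void shutdown() {
        perShard.values().forEach(ExecutorService::shutdown);
    }
}
```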
[jira] [Resolved] (SOLR-6247) Can't delete utf-8 word in ManagedStopFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-6247. Resolution: Fixed duplicate bug report: SOLR-6163 > Can't delete utf-8 word in ManagedStopFilterFactory > --- > > Key: SOLR-6247 > URL: https://issues.apache.org/jira/browse/SOLR-6247 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 4.9 > Environment: MacOS, Solr started locally >Reporter: Patryk Maryniok > > Request: > bq. curl -X DELETE > "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/się"; > or > bq. curl -X DELETE > "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/si%C4%99"; > Response: > bq. {"responseHeader":{"status":404, "QTime":3}, "error":{ "msg":"si%C4%99 > not found in /schema/analysis/stopwords/polish", "code":404}} > I can't delete this word, encoding doesn't affect. Am I doing something wrong > or is it bug? It also happens in ManagedSynonymFilterFactory. > Response for GET: > {code:xml} > { > "responseHeader":{ > "status":0, > "QTime":195}, > "wordSet":{ > "initArgs":{"ignoreCase":true}, > "initializedOn":"2014-07-15T14:52:53.859Z", > "managedList":["a", > "i", > "się", > "w", > "z"]} > } > {code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
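The symptom in the report points at the likely fix direction: the error message echoes "si%C4%99", i.e. the percent-encoded path segment was never decoded before the lookup. A minimal sketch (illustrative, not Solr's actual handler code) is to URL-decode the segment as UTF-8 before matching it against the managed word set:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.Set;
import java.util.TreeSet;

// Sketch: decode the resource path segment as UTF-8 before lookup, so
// "si%C4%99" becomes "się" and matches the stored stopword.
public class ManagedWordLookup {
    private final Set<String> words = new TreeSet<>();

    public ManagedWordLookup(String... initial) {
        for (String w : initial) {
            words.add(w);
        }
    }

    /** Deletes the word named by a percent-encoded path segment. */
    public boolean delete(String rawPathSegment) {
        try {
            String decoded = URLDecoder.decode(rawPathSegment, "UTF-8");
            return words.remove(decoded);
        } catch (UnsupportedEncodingException e) {
            throw new AssertionError("UTF-8 is always supported", e);
        }
    }
}
```

Whether the actual SOLR-6163 fix lands in the request dispatcher or the managed resource handler, the decode-with-UTF-8 step is the crux.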
[jira] [Commented] (SOLR-5986) Don't allow runaway queries from harming Solr cluster health or search performance
[ https://issues.apache.org/jira/browse/SOLR-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062250#comment-14062250 ] Steve Davids commented on SOLR-5986: bq. I wonder why you would have to restart the replica? I presume this is because that is your only recourse to stop a query that might take days to complete? Yes, that is correct, that is the easiest way to kill a run-away thread. bq. If a query takes that long and is ignoring a specified timeout, that seems like it's own issue that needs resolution. The Solr instance that is distributing the requests to other shards honors the timeout value and stops the collection process once the threshold is met (and returns to the client with partial results if any are available), though the queries remain running on all of the shards that were initially searched in the overall distributed request. If the timeout value is honored on each shard that was used in the distributed request that would probably take care of the problem. bq. IMHO, the primary goal should be to make SolrCloud clusters more resilient to performance degradations caused by such nasty queries described above. +1 resiliency to performance degradations is always a good thing :) bq. The circuit-breaker approach in the linked ES tickets is clever, but it does not seem to be as generally applicable as the ability to view all running queries with an option to stop them. +1 I actually prefer the BLUR route, though being able to see the current queries plus the ability to kill them off across the cluster would be great. Although it is crucial to be able to automatically have queries be killed off after a certain threshold (ideally the timeout value). This is necessary because I don't want to be monitoring the Solr admin page at all hours during the day (though I could create scripts to do the work if an API call is available, but not preferred). bq. 
My preference would be to have a response mechanism that 1) applies broadly and 2) a dev-ops guy can execute in a UI like Solr Admin, or even by API. +1 if "applied broadly" means ability to specify a threshold to start killing off queries. > Don't allow runaway queries from harming Solr cluster health or search > performance > -- > > Key: SOLR-5986 > URL: https://issues.apache.org/jira/browse/SOLR-5986 > Project: Solr > Issue Type: Improvement > Components: search >Reporter: Steve Davids >Priority: Critical > Fix For: 4.9 > > > The intent of this ticket is to have all distributed search requests stop > wasting CPU cycles on requests that have already timed out or are so > complicated that they won't be able to execute. We have come across a case > where a nasty wildcard query within a proximity clause was causing the > cluster to enumerate terms for hours even though the query timeout was set to > minutes. This caused a noticeable slowdown within the system which made us > restart the replicas that happened to service that one request, the worst > case scenario are users with a relatively low zk timeout value will have > nodes start dropping from the cluster due to long GC pauses. > [~amccurry] Built a mechanism into Apache Blur to help with the issue in > BLUR-142 (see commit comment for code, though look at the latest code on the > trunk for newer bug fixes). > Solr should be able to either prevent these problematic queries from running > by some heuristic (possibly estimated size of heap usage) or be able to > execute a thread interrupt on all query threads once the time threshold is > met. 
This issue mirrors what others have discussed on the mailing list: > http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200903.mbox/%3c856ac15f0903272054q2dbdbd19kea3c5ba9e105b...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
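The "thread interrupt on all query threads once the time threshold is met" idea can be sketched with standard java.util.concurrent machinery (illustrative only, not Solr's implementation): run the query on a worker, and on timeout cancel it with interruption so the worker stops burning CPU instead of enumerating terms for hours.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch: enforce a hard timeout on a query task and interrupt the runaway
// thread, returning a caller-supplied partial result instead.
public class QueryTimeoutGuard {

    public static <T> T runWithTimeout(Callable<T> query, long timeoutMs, T partialResult) {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        Future<T> future = exec.submit(query);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the runaway query thread
            return partialResult;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            exec.shutdownNow();
        }
    }
}
```

The catch, as the ticket notes, is that interruption only helps if the query code actually checks its interrupt status; term enumeration loops that never check it are exactly the runaways that survive a timeout today.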
Re: [JENKINS] Lucene-Solr-Tests-4.x-Java7 - Build # 2031 - Still Failing
: 2) chain that broke involves WDF which has historically been known to : be problematic -- but in theory LUCENE-3843 fixed WDF already? cut/paste error -- I meant LUCENE-5111 -Hoss http://www.lucidworks.com/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-Solr-Tests-4.x-Java7 - Build # 2031 - Still Failing
NOTE: 1) seed reproduces for me on branch_4x, linux 64bit, java7 2) chain that broke involves WDF which has historically been known to be problematic -- but in theory LUCENE-3843 fixed WDF already? ant test -Dtestcase=TestRandomChains -Dtests.method=testRandomChains -Dtests.seed=2871E466A5044906 -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=no_NO -Dtests.timezone=Atlantic/South_Georgia -Dtests.file.encoding=US-ASCII Perhaps the problem is specific to the combination of of filters? [junit4] 2> TEST FAIL: useCharFilter=false text='v|p({ Exception from random analyzer: [junit4] 2> charfilters= [junit4] 2> org.apache.lucene.analysis.fa.PersianCharFilter(java.io.StringReader@7cc398b8) [junit4] 2> org.apache.lucene.analysis.fa.PersianCharFilter(org.apache.lucene.analysis.fa.PersianCharFilter@7ef5b8c5) [junit4] 2> tokenizer= [junit4] 2> org.apache.lucene.analysis.path.PathHierarchyTokenizer(org.apache.lucene.analysis.core.TestRandomChains$CheckThatYouDidntReadAnythingReaderWrapper@690c7d5) [junit4] 2> filters= [junit4] 2> org.apache.lucene.analysis.miscellaneous.WordDelimiterFilter(LUCENE_4_10, ValidatingTokenFilter@587d7793 term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word, 7, [vxaqklts, arraatqvg]) [junit4] 2> org.apache.lucene.analysis.shingle.ShingleFilter(ValidatingTokenFilter@6bbaa8d8 term=,bytes=[],startOffset=0,endOffset=0,positionIncrement=1,positionLength=1,type=word) [junit4] 2> offsetsAreCorrect=false : Date: Tue, 15 Jul 2014 12:42:31 + (UTC) : From: Apache Jenkins Server : Reply-To: dev@lucene.apache.org : To: dev@lucene.apache.org : Subject: [JENKINS] Lucene-Solr-Tests-4.x-Java7 - Build # 2031 - Still Failing : : Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-Java7/2031/ : : 1 tests failed. 
: REGRESSION: org.apache.lucene.analysis.core.TestRandomChains.testRandomChains : : Error Message: : startOffset must be non-negative, and endOffset must be >= startOffset, startOffset=2,endOffset=1 : : Stack Trace: : java.lang.IllegalArgumentException: startOffset must be non-negative, and endOffset must be >= startOffset, startOffset=2,endOffset=1 : at __randomizedtesting.SeedInfo.seed([2871E466A5044906:1590CD07E21654C6]:0) : at org.apache.lucene.analysis.tokenattributes.PackedTokenAttributeImpl.setOffset(PackedTokenAttributeImpl.java:107) : at org.apache.lucene.analysis.shingle.ShingleFilter.incrementToken(ShingleFilter.java:345) : at org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:68) : at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:703) : at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:614) : at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:513) : at org.apache.lucene.analysis.core.TestRandomChains.testRandomChains(TestRandomChains.java:927) : at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) : at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) : at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) : at java.lang.reflect.Method.invoke(Method.java:606) : at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) : at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) : at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) : at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) : at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) : at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) : at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) : at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) : at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) : at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) : at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) : at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) : at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) : at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) : at com.carrotsearch.randomizedtesting.ThreadLeakControl$3
[jira] [Updated] (SOLR-6247) Can't delete utf-8 word in ManagedStopFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patryk Maryniok updated SOLR-6247: -- Description: Request: bq. curl -X DELETE "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/się"; or bq. curl -X DELETE "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/si%C4%99"; Response: bq. {"responseHeader":{"status":404, "QTime":3}, "error":{ "msg":"si%C4%99 not found in /schema/analysis/stopwords/polish", "code":404}} I can't delete this word, encoding doesn't affect. Am I doing something wrong or is it bug? It also happens in ManagedSynonymFilterFactory. Response for GET: {code:xml} { "responseHeader":{ "status":0, "QTime":195}, "wordSet":{ "initArgs":{"ignoreCase":true}, "initializedOn":"2014-07-15T14:52:53.859Z", "managedList":["a", "i", "się", "w", "z"]} } {code} was: Request: bq. curl -X DELETE "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/się"; or bq. curl -X DELETE "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/si%C4%99"; Response: bq. {"responseHeader":{"status":404, "QTime":3}, "error":{ "msg":"si%C4%99 not found in /schema/analysis/stopwords/polish", "code":404}} I can't delete this word, encoding doesn't affect. Am I doing something wrong or is it bug? It also happens in ManagedSynonymFilterFactory. Response for GET: { "responseHeader":{ "status":0, "QTime":195}, "wordSet":{ "initArgs":{"ignoreCase":true}, "initializedOn":"2014-07-15T14:52:53.859Z", "managedList":["a", "i", "się", "w", "z"]}} > Can't delete utf-8 word in ManagedStopFilterFactory > --- > > Key: SOLR-6247 > URL: https://issues.apache.org/jira/browse/SOLR-6247 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 4.9 > Environment: MacOS, Solr started locally >Reporter: Patryk Maryniok > > Request: > bq. 
curl -X DELETE > "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/się"; > or > bq. curl -X DELETE > "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/si%C4%99"; > Response: > bq. {"responseHeader":{"status":404, "QTime":3}, "error":{ "msg":"si%C4%99 > not found in /schema/analysis/stopwords/polish", "code":404}} > I can't delete this word, encoding doesn't affect. Am I doing something wrong > or is it bug? It also happens in ManagedSynonymFilterFactory. > Response for GET: > {code:xml} > { > "responseHeader":{ > "status":0, > "QTime":195}, > "wordSet":{ > "initArgs":{"ignoreCase":true}, > "initializedOn":"2014-07-15T14:52:53.859Z", > "managedList":["a", > "i", > "się", > "w", > "z"]} > } > {code} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
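The encoding round-trip in this report can be sanity-checked with the JDK alone. The sketch below (it assumes nothing about Solr's internals; the class name is made up for illustration) shows that "się" percent-encodes to "si%C4%99", so a handler that compares the still-encoded path segment against the stored word can never match and will 404. Note that URLEncoder targets query strings (it would turn spaces into '+'), which is close enough here since the word contains no spaces.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class StopwordUrlEncoding {

    // Percent-encode a word as UTF-8, the form curl sends on the wire.
    static String encode(String word) {
        return URLEncoder.encode(word, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "się" (ę = U+0119, UTF-8 bytes C4 99) travels as "si%C4%99".
        String wire = encode("się");
        System.out.println(wire);
        // A server that never decodes the segment compares "si%C4%99"
        // against the stored "się" and reports the word as not found.
        System.out.println(wire.equals("się"));
    }
}
```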
[jira] [Updated] (SOLR-6247) Can't delete utf-8 word in ManagedStopFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patryk Maryniok updated SOLR-6247: -- Description: Request: bq. curl -X DELETE "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/się"; or bq. curl -X DELETE "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/si%C4%99"; Response: bq. {"responseHeader":{"status":404, "QTime":3}, "error":{ "msg":"si%C4%99 not found in /schema/analysis/stopwords/polish", "code":404}} I can't delete this word, encoding doesn't affect. Am I doing something wrong or is it bug? It also happens in ManagedSynonymFilterFactory. Response for GET: { "responseHeader":{ "status":0, "QTime":195}, "wordSet":{ "initArgs":{"ignoreCase":true}, "initializedOn":"2014-07-15T14:52:53.859Z", "managedList":["a", "i", "się", "w", "z"]}} was: Request: bq. curl -X DELETE "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/się"; or bq. curl -X DELETE "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/si%C4%99"; Response: bq. {"responseHeader":{"status":404, "QTime":3}, "error":{ "msg":"si%C4%99 not found in /schema/analysis/stopwords/polish", "code":404}} I can't delete this word, encoding doesn't affect. Am I doing something wrong or is it bug? It also happens in ManagedSynonymFilterFactory. {quote} { "responseHeader":{ "status":0, "QTime":195}, "wordSet":{ "initArgs":{"ignoreCase":true}, "initializedOn":"2014-07-15T14:52:53.859Z", "managedList":["a", "i", "się", "w", "z"]}} {/quote} > Can't delete utf-8 word in ManagedStopFilterFactory > --- > > Key: SOLR-6247 > URL: https://issues.apache.org/jira/browse/SOLR-6247 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 4.9 > Environment: MacOS, Solr started locally >Reporter: Patryk Maryniok > > Request: > bq. curl -X DELETE > "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/się"; > or > bq. 
curl -X DELETE > "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/si%C4%99"; > Response: > bq. {"responseHeader":{"status":404, "QTime":3}, "error":{ "msg":"si%C4%99 > not found in /schema/analysis/stopwords/polish", "code":404}} > I can't delete this word, encoding doesn't affect. Am I doing something wrong > or is it bug? It also happens in ManagedSynonymFilterFactory. > Response for GET: > { > "responseHeader":{ > "status":0, > "QTime":195}, > "wordSet":{ > "initArgs":{"ignoreCase":true}, > "initializedOn":"2014-07-15T14:52:53.859Z", > "managedList":["a", > "i", > "się", > "w", > "z"]}} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-6247) Can't delete utf-8 word in ManagedStopFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patryk Maryniok updated SOLR-6247: -- Description: Request: bq. curl -X DELETE "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/się"; or bq. curl -X DELETE "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/si%C4%99"; Response: bq. {"responseHeader":{"status":404, "QTime":3}, "error":{ "msg":"si%C4%99 not found in /schema/analysis/stopwords/polish", "code":404}} I can't delete this word, encoding doesn't affect. Am I doing something wrong or is it bug? It also happens in ManagedSynonymFilterFactory. {quote} { "responseHeader":{ "status":0, "QTime":195}, "wordSet":{ "initArgs":{"ignoreCase":true}, "initializedOn":"2014-07-15T14:52:53.859Z", "managedList":["a", "i", "się", "w", "z"]}} {/quote} was: Request: bq. curl -X DELETE "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/się"; or bq. curl -X DELETE "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/si%C4%99"; Response: bq. {"responseHeader":{"status":404, "QTime":3}, "error":{ "msg":"si%C4%99 not found in /schema/analysis/stopwords/polish", "code":404}} I can't delete this word, encoding doesn't affect. Am I doing something wrong or is it bug? It also happens in ManagedSynonymFilterFactory. > Can't delete utf-8 word in ManagedStopFilterFactory > --- > > Key: SOLR-6247 > URL: https://issues.apache.org/jira/browse/SOLR-6247 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 4.9 > Environment: MacOS, Solr started locally >Reporter: Patryk Maryniok > > Request: > bq. curl -X DELETE > "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/się"; > or > bq. curl -X DELETE > "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/si%C4%99"; > Response: > bq. 
{"responseHeader":{"status":404, "QTime":3}, "error":{ "msg":"si%C4%99 > not found in /schema/analysis/stopwords/polish", "code":404}} > I can't delete this word, encoding doesn't affect. Am I doing something wrong > or is it bug? It also happens in ManagedSynonymFilterFactory. > {quote} > { > "responseHeader":{ > "status":0, > "QTime":195}, > "wordSet":{ > "initArgs":{"ignoreCase":true}, > "initializedOn":"2014-07-15T14:52:53.859Z", > "managedList":["a", > "i", > "się", > "w", > "z"]}} > {/quote} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-6247) Can't delete utf-8 word in ManagedStopFilterFactory
Patryk Maryniok created SOLR-6247: - Summary: Can't delete utf-8 word in ManagedStopFilterFactory Key: SOLR-6247 URL: https://issues.apache.org/jira/browse/SOLR-6247 Project: Solr Issue Type: Bug Components: Schema and Analysis Affects Versions: 4.9 Environment: MacOS, Solr started locally Reporter: Patryk Maryniok Request: bq. curl -X DELETE "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/się"; or bq. curl -X DELETE "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/si%C4%99"; Response: bq. {"responseHeader":{"status":404, "QTime":3}, "error":{ "msg":"si%C4%99 not found in /schema/analysis/stopwords/polish", "code":404}} I can't delete this word; the encoding makes no difference. Am I doing something wrong, or is it a bug?
[jira] [Updated] (SOLR-6247) Can't delete utf-8 word in ManagedStopFilterFactory
[ https://issues.apache.org/jira/browse/SOLR-6247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patryk Maryniok updated SOLR-6247: -- Description: Request: bq. curl -X DELETE "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/się"; or bq. curl -X DELETE "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/si%C4%99"; Response: bq. {"responseHeader":{"status":404, "QTime":3}, "error":{ "msg":"si%C4%99 not found in /schema/analysis/stopwords/polish", "code":404}} I can't delete this word, encoding doesn't affect. Am I doing something wrong or is it bug? It also happens in ManagedSynonymFilterFactory. was: Request: bq. curl -X DELETE "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/się"; or bq. curl -X DELETE "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/si%C4%99"; Response: bq. {"responseHeader":{"status":404, "QTime":3}, "error":{ "msg":"si%C4%99 not found in /schema/analysis/stopwords/polish", "code":404}} I can't delete this word, encoding doesn't affect. Am I doing something wrong or is it bug? > Can't delete utf-8 word in ManagedStopFilterFactory > --- > > Key: SOLR-6247 > URL: https://issues.apache.org/jira/browse/SOLR-6247 > Project: Solr > Issue Type: Bug > Components: Schema and Analysis >Affects Versions: 4.9 > Environment: MacOS, Solr started locally >Reporter: Patryk Maryniok > > Request: > bq. curl -X DELETE > "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/się"; > or > bq. curl -X DELETE > "http://localhost:8983/solr/collection1/schema/analysis/stopwords/polish/si%C4%99"; > Response: > bq. {"responseHeader":{"status":404, "QTime":3}, "error":{ "msg":"si%C4%99 > not found in /schema/analysis/stopwords/polish", "code":404}} > I can't delete this word, encoding doesn't affect. Am I doing something wrong > or is it bug? It also happens in ManagedSynonymFilterFactory. 
Re: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0_05) - Build # 10820 - Failure!
FYI: the seed reproduces for me on my 64-bit Linux Java 7...

ant test -Dtestcase=TestFieldCacheSort -Dtests.method=testEmptyStringVsNullStringSort -Dtests.seed=A82FCB68C76B741B -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=ru -Dtests.timezone=Africa/Lusaka -Dtests.file.encoding=UTF-8

The failing assert is "assert fi != null":

public SortedDocValues getSortedDocValues(String field) throws IOException {
  SortedDocValues dv = super.getSortedDocValues(field);
  FieldInfo fi = getFieldInfos().fieldInfo(field);
  if (dv != null) {
    assert fi != null;
    assert fi.getDocValuesType() == FieldInfo.DocValuesType.SORTED;
    return new AssertingSortedDocValues(dv, maxDoc());
  } else {
    assert fi == null || fi.getDocValuesType() != FieldInfo.DocValuesType.SORTED;
    return null;
  }
}

: Date: Tue, 15 Jul 2014 05:36:03 + (UTC) : From: Policeman Jenkins Server : Reply-To: dev@lucene.apache.org : To: dev@lucene.apache.org : Subject: [JENKINS] Lucene-Solr-trunk-Linux (32bit/jdk1.8.0_05) - Build # 10820 : - Failure! : : Build: http://jenkins.thetaphi.de/job/Lucene-Solr-trunk-Linux/10820/ : Java: 32bit/jdk1.8.0_05 -client -XX:+UseParallelGC : : 1 tests failed. 
: REGRESSION: org.apache.lucene.uninverting.TestFieldCacheSort.testEmptyStringVsNullStringSort : : Error Message: : : : Stack Trace: : java.lang.AssertionError : at __randomizedtesting.SeedInfo.seed([A82FCB68C76B741B:C9CCF9BFBF0A72AB]:0) : at org.apache.lucene.index.AssertingAtomicReader.getSortedDocValues(AssertingAtomicReader.java:638) : at org.apache.lucene.index.FilterAtomicReader.getSortedDocValues(FilterAtomicReader.java:414) : at org.apache.lucene.index.AssertingAtomicReader.getSortedDocValues(AssertingAtomicReader.java:635) : at org.apache.lucene.index.DocValues.getSorted(DocValues.java:273) : at org.apache.lucene.search.FieldComparator$TermOrdValComparator.getSortedDocValues(FieldComparator.java:821) : at org.apache.lucene.search.FieldComparator$TermOrdValComparator.setNextReader(FieldComparator.java:826) : at org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.doSetNextReader(TopFieldCollector.java:97) : at org.apache.lucene.search.SimpleCollector.getLeafCollector(SimpleCollector.java:33) : at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:605) : at org.apache.lucene.search.AssertingIndexSearcher.search(AssertingIndexSearcher.java:94) : at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:573) : at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:525) : at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:502) : at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:318) : at org.apache.lucene.uninverting.TestFieldCacheSort.testEmptyStringVsNullStringSort(TestFieldCacheSort.java:1029) : at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) : at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) : at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) : at java.lang.reflect.Method.invoke(Method.java:483) : at 
com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) : at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) : at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) : at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) : at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) : at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) : at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) : at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) : at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) : at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) : at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) : at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) : at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) : at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) : at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) : at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) :
[jira] [Commented] (SOLR-5986) Don't allow runaway queries from harming Solr cluster health or search performance
[ https://issues.apache.org/jira/browse/SOLR-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062228#comment-14062228 ] Jim Walker commented on SOLR-5986: -- Steve, I wonder why you would have to restart the replica? I presume this is because that is your only recourse to stop a query that might take days to complete? If a query takes that long and is ignoring a specified timeout, that seems like its own issue that needs resolution. IMHO, the primary goal should be to make SolrCloud clusters more resilient to performance degradations caused by the nasty queries described above. The circuit-breaker approach in the linked ES tickets is clever, but it does not seem to be as generally applicable as the ability to view all running queries with an option to stop them. For example, it seems the linked ES circuit breaker will only trigger for issues deriving from loading too much field data. The problem described above may result from this cause, or any number of other causes. My preference would be to have a response mechanism that 1) applies broadly and 2) can be executed by an ops engineer in a UI like Solr Admin, or even via an API. > Don't allow runaway queries from harming Solr cluster health or search > performance > -- > > Key: SOLR-5986 > URL: https://issues.apache.org/jira/browse/SOLR-5986 > Project: Solr > Issue Type: Improvement > Components: search >Reporter: Steve Davids >Priority: Critical > Fix For: 4.9 > > > The intent of this ticket is to have all distributed search requests stop > wasting CPU cycles on requests that have already timed out or are so > complicated that they won't be able to execute. We have come across a case > where a nasty wildcard query within a proximity clause was causing the > cluster to enumerate terms for hours even though the query timeout was set to > minutes. 
This caused a noticeable slowdown within the system which made us > restart the replicas that happened to service that one request, the worst > case scenario are users with a relatively low zk timeout value will have > nodes start dropping from the cluster due to long GC pauses. > [~amccurry] Built a mechanism into Apache Blur to help with the issue in > BLUR-142 (see commit comment for code, though look at the latest code on the > trunk for newer bug fixes). > Solr should be able to either prevent these problematic queries from running > by some heuristic (possibly estimated size of heap usage) or be able to > execute a thread interrupt on all query threads once the time threshold is > met. This issue mirrors what others have discussed on the mailing list: > http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200903.mbox/%3c856ac15f0903272054q2dbdbd19kea3c5ba9e105b...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
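The "thread interrupt once the time threshold is met" idea in the ticket amounts to a cooperative deadline check inside long-running work. The sketch below is a hypothetical stand-in for term enumeration, not Solr's actual code: the loop checks a wall-clock deadline every so often and bails out instead of running for hours past the requested timeout.

```java
public class DeadlineCheck {

    /**
     * Runs a stand-in workload of {@code items} steps, aborting once
     * {@code timeoutMillis} has elapsed. Returns true if the work
     * completed within the budget, false if it was cut off.
     */
    static boolean enumerateWithDeadline(int items, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        for (int i = 0; i < items; i++) {
            // ... real term-enumeration work would happen here ...
            // Check the clock only every 1024 iterations: frequent enough
            // to stop promptly, cheap enough not to dominate the loop.
            if ((i & 0x3FF) == 0 && System.currentTimeMillis() > deadline) {
                return false; // timed out: stop wasting CPU cycles
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A small job inside a generous budget completes normally.
        System.out.println(enumerateWithDeadline(1000, 5000));
    }
}
```

The same pattern works whether the trigger is a timeout, an estimated-cost heuristic, or an explicit "kill this query" request from an admin UI — the work just needs a cheap, periodic check of some abort condition.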
[jira] [Commented] (SOLR-5986) Don't allow runaway queries from harming Solr cluster health or search performance
[ https://issues.apache.org/jira/browse/SOLR-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062187#comment-14062187 ] Steve Davids commented on SOLR-5986: In an ideal world it would attempt to provide results for the shards that may be okay, but the end goal is to maintain the health of the cluster for queries that get out of hand. If you can know up front that there is no possible way that a query could complete then it would be reasonable to error out immediately (though that metric may be squishy to know if it will/will not complete). Hopefully that makes sense... > Don't allow runaway queries from harming Solr cluster health or search > performance > -- > > Key: SOLR-5986 > URL: https://issues.apache.org/jira/browse/SOLR-5986 > Project: Solr > Issue Type: Improvement > Components: search >Reporter: Steve Davids >Priority: Critical > Fix For: 4.9 > > > The intent of this ticket is to have all distributed search requests stop > wasting CPU cycles on requests that have already timed out or are so > complicated that they won't be able to execute. We have come across a case > where a nasty wildcard query within a proximity clause was causing the > cluster to enumerate terms for hours even though the query timeout was set to > minutes. This caused a noticeable slowdown within the system which made us > restart the replicas that happened to service that one request, the worst > case scenario are users with a relatively low zk timeout value will have > nodes start dropping from the cluster due to long GC pauses. > [~amccurry] Built a mechanism into Apache Blur to help with the issue in > BLUR-142 (see commit comment for code, though look at the latest code on the > trunk for newer bug fixes). 
> Solr should be able to either prevent these problematic queries from running > by some heuristic (possibly estimated size of heap usage) or be able to > execute a thread interrupt on all query threads once the time threshold is > met. This issue mirrors what others have discussed on the mailing list: > http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200903.mbox/%3c856ac15f0903272054q2dbdbd19kea3c5ba9e105b...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5819) Add block tree postings format that supports term ords
[ https://issues.apache.org/jira/browse/LUCENE-5819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-5819: --- Attachment: LUCENE-5819.patch New patch, fixes last nocommit, fixes ant precommit ... I think it's ready. > Add block tree postings format that supports term ords > -- > > Key: LUCENE-5819 > URL: https://issues.apache.org/jira/browse/LUCENE-5819 > Project: Lucene - Core > Issue Type: Improvement > Components: modules/other >Reporter: Michael McCandless >Assignee: Michael McCandless > Fix For: 5.0, 4.10 > > Attachments: LUCENE-5819.patch, LUCENE-5819.patch > > > BlockTree is our default terms dictionary today, but it doesn't > support term ords, which is an optional API in the postings format to > retrieve the ordinal for the currently seek'd term, and also later > seek by that ordinal e.g. to lookup the term. > This can possibly be useful for e.g. faceting, and maybe at some point > we can share the postings terms dict with the one used by sorted/set > DV for cases when app wants to invert and facet on a given field. > The older (3.x) block terms dict can easily support ords, and we have > a Lucene41OrdsPF in test-framework, but it's not as fast / compact as > block-tree, and doesn't (can't easily) implement an optimized > intersect, but it could be for fields we'd want to facet on, these > tradeoffs don't matter. It's nice to have options... -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-3881) frequent OOM in LanguageIdentifierUpdateProcessor
[ https://issues.apache.org/jira/browse/SOLR-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062157#comment-14062157 ] Steve Rowe edited comment on SOLR-3881 at 7/15/14 2:58 PM: --- bq. See http://language-detection.googlecode.com/svn/trunk/doc/com/cybozu/labs/langdetect/Detector.html#setMaxTextLength(int) - the default is 10K chars - we can pass the configured max total chars here. -The default is actually 10K *bytes*, not chars, so we'd need to divide by two when passing the configured max total chars.- *edit* Disregard the above comment; the javadocs refer to "10KB" as the default max text length, but [{{Detector.append()}}|https://code.google.com/p/language-detection/source/browse/src/com/cybozu/labs/langdetect/Detector.java#141] uses the {{max_text_length}} config as a max number of chars. was (Author: steve_rowe): bq. See http://language-detection.googlecode.com/svn/trunk/doc/com/cybozu/labs/langdetect/Detector.html#setMaxTextLength(int) - the default is 10K chars - we can pass the configured max total chars here. The default is actually 10K *bytes*, not chars, so we'd need to divide by two when passing the configured max total chars. > frequent OOM in LanguageIdentifierUpdateProcessor > - > > Key: SOLR-3881 > URL: https://issues.apache.org/jira/browse/SOLR-3881 > Project: Solr > Issue Type: Bug > Components: update >Affects Versions: 4.0 > Environment: CentOS 6.x, JDK 1.6, (java -server -Xms2G -Xmx2G > -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=) >Reporter: Rob Tulloh > Fix For: 4.9, 5.0 > > Attachments: SOLR-3881.patch, SOLR-3881.patch > > > We are seeing frequent failures from Solr causing it to OOM. 
Here is the > stack trace we observe when this happens: > {noformat} > Caused by: java.lang.OutOfMemoryError: Java heap space > at java.util.Arrays.copyOf(Arrays.java:2882) > at > java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100) > at > java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390) > at java.lang.StringBuffer.append(StringBuffer.java:224) > at > org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.concatFields(LanguageIdentifierUpdateProcessor.java:286) > at > org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.process(LanguageIdentifierUpdateProcessor.java:189) > at > org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:171) > at > org.apache.solr.handler.BinaryUpdateRequestHandler$2.update(BinaryUpdateRequestHandler.java:90) > at > org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:140) > at > org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:120) > at > org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:221) > at > org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:105) > at > org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:186) > at > org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:112) > at > org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:147) > at > org.apache.solr.handler.BinaryUpdateRequestHandler.parseAndLoadDocs(BinaryUpdateRequestHandler.java:100) > at > org.apache.solr.handler.BinaryUpdateRequestHandler.access$000(BinaryUpdateRequestHandler.java:47) > at > org.apache.solr.handler.BinaryUpdateRequestHandler$1.load(BinaryUpdateRequestHandler.java:58) > at > 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:59) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540) > at > org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337) > at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484) > at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119) > at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524) > at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233) > at
[jira] [Commented] (SOLR-3881) frequent OOM in LanguageIdentifierUpdateProcessor
[ https://issues.apache.org/jira/browse/SOLR-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062157#comment-14062157 ] Steve Rowe commented on SOLR-3881: -- bq. See http://language-detection.googlecode.com/svn/trunk/doc/com/cybozu/labs/langdetect/Detector.html#setMaxTextLength(int) - the default is 10K chars - we can pass the configured max total chars here. The default is actually 10K *bytes*, not chars, so we'd need to divide by two when passing the configured max total chars. > frequent OOM in LanguageIdentifierUpdateProcessor > - > > Key: SOLR-3881 > URL: https://issues.apache.org/jira/browse/SOLR-3881 > Project: Solr > Issue Type: Bug > Components: update >Affects Versions: 4.0 > Environment: CentOS 6.x, JDK 1.6, (java -server -Xms2G -Xmx2G > -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=) >Reporter: Rob Tulloh > Fix For: 4.9, 5.0 > > Attachments: SOLR-3881.patch, SOLR-3881.patch > > > We are seeing frequent failures from Solr causing it to OOM. 
Here is the > stack trace we observe when this happens: > {noformat} > Caused by: java.lang.OutOfMemoryError: Java heap space > at java.util.Arrays.copyOf(Arrays.java:2882) > at > java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100) > at > java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390) > at java.lang.StringBuffer.append(StringBuffer.java:224) > at > org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.concatFields(LanguageIdentifierUpdateProcessor.java:286) > at > org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.process(LanguageIdentifierUpdateProcessor.java:189) > at > org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:171) > at > org.apache.solr.handler.BinaryUpdateRequestHandler$2.update(BinaryUpdateRequestHandler.java:90) > at > org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:140) > at > org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:120) > at > org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:221) > at > org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:105) > at > org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:186) > at > org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:112) > at > org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:147) > at > org.apache.solr.handler.BinaryUpdateRequestHandler.parseAndLoadDocs(BinaryUpdateRequestHandler.java:100) > at > org.apache.solr.handler.BinaryUpdateRequestHandler.access$000(BinaryUpdateRequestHandler.java:47) > at > org.apache.solr.handler.BinaryUpdateRequestHandler$1.load(BinaryUpdateRequestHandler.java:58) > at > 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:59) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540) > at > org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337) > at > org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484) > at > org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119) > at > org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524) > at > org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233) > at > org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065) > at > org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413) > at > org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192) > at > org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999) > {noformat} -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
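The OOM above comes from concatFields growing a StringBuffer without bound; the max-total-chars discussion suggests capping it up front, since langdetect's max_text_length is a char count anyway. A minimal sketch of that idea — the signature and the maxTotalChars parameter are illustrative, not Solr's actual API:

```java
public class BoundedConcat {

    /**
     * Concatenates field values for language detection, but never lets
     * the buffer grow past maxTotalChars, so a huge document cannot
     * exhaust the heap during the append loop.
     */
    static String concatFields(String[] fieldValues, int maxTotalChars) {
        StringBuilder sb = new StringBuilder(Math.min(maxTotalChars, 1024));
        for (String v : fieldValues) {
            if (sb.length() >= maxTotalChars) {
                break; // budget spent: ignore the remaining values
            }
            int room = maxTotalChars - sb.length();
            // Append at most 'room' chars of this value.
            sb.append(v, 0, Math.min(v.length(), room));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Only the first 4 chars survive; "ghij" is never appended.
        System.out.println(concatFields(new String[]{"abcdef", "ghij"}, 4));
    }
}
```

Truncating the input this way also keeps detection fast: the detector only ever needs the first few kilobytes of text to pick a language.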
[jira] [Updated] (SOLR-5473) Split clusterstate.json per collection and watch states selectively
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noble Paul updated SOLR-5473: - Attachment: SOLR-5473-74.patch > Split clusterstate.json per collection and watch states selectively > > > Key: SOLR-5473 > URL: https://issues.apache.org/jira/browse/SOLR-5473 > Project: Solr > Issue Type: Sub-task > Components: SolrCloud >Reporter: Noble Paul >Assignee: Noble Paul > Fix For: 5.0 > > Attachments: SOLR-5473-74 .patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch, > SOLR-5473-74_POC.patch, SOLR-5473-configname-fix.patch, SOLR-5473.patch, > SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, > SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, > SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, > SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, > SOLR-5473.patch, SOLR-5473_undo.patch, ec2-23-20-119-52_solr.log, > ec2-50-16-38-73_solr.log > > > As defined in the parent issue, store the states of each collection under > /collections/collectionname/state.json node -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5473) Split clusterstate.json per collection and watch states selectively
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul updated SOLR-5473:
-
Attachment: (was: SOLR-5473-74.patch)
[jira] [Commented] (SOLR-3881) frequent OOM in LanguageIdentifierUpdateProcessor
[ https://issues.apache.org/jira/browse/SOLR-3881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062122#comment-14062122 ] Steve Rowe commented on SOLR-3881: -- bq. Added string size calculation as string builder capacity. Used to prevent multiple array allocation on append. (Maybe also need to be configurable - for large documents only) [~vzhovtiuk], I agree - I think we should have two configurable limits: max chars per field value (already in [~tomasflobbe] and your updated patches), and a max total chars (not there yet). Tomás wrote: bq. Do you think it would make more sense to limit each append (for the different fields) or to limit the total size of the buffer/builder (stop appending fields when the maximum was reached)? Both ways would prevent OOM, however they could give different results. I think we should have *both* limits. I think it's more important, though, to do as [~rcmuir] said earlier in this issue: {quote} The langdetect implementation can append each piece at a time. It can also take reader: append(Reader), but that is really just syntactic sugar forwarding to append(String) and not exceeding the Detector.max_text_length. Seems like the concatenating stuff should be pushed out of the base class into the Tika impl. {quote} See http://language-detection.googlecode.com/svn/trunk/doc/com/cybozu/labs/langdetect/Detector.html#setMaxTextLength(int) - the default is 10K chars - we can pass the configured max total chars here. We should also set default maxima for both per-value and total chars, rather than MAX_INT, as in the current patch. 
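The two limits discussed above can be sketched in isolation. This is an illustrative sketch, not the actual LanguageIdentifierUpdateProcessor code; the parameter names maxFieldValueChars and maxTotalChars are hypothetical (the eventual Solr names may differ). Pre-sizing the StringBuilder to a bounded capacity also avoids the repeated array growth visible in the OOM stack trace:

```java
// Hypothetical sketch of capped field concatenation: a per-field-value cap
// plus a total cap on the text handed to the language detector.
public class CappedConcat {
    public static String concatFields(String[] values, int maxFieldValueChars, int maxTotalChars) {
        // bounded pre-allocation instead of growing from the default capacity
        StringBuilder sb = new StringBuilder(Math.min(maxTotalChars, 4096));
        for (String v : values) {
            int sep = sb.length() > 0 ? 1 : 0;               // one space between field values
            int remaining = maxTotalChars - sb.length() - sep;
            if (remaining <= 0) break;                       // total budget exhausted
            int take = Math.min(Math.min(v.length(), maxFieldValueChars), remaining);
            if (take == 0) continue;                         // skip empty values
            if (sep == 1) sb.append(' ');
            sb.append(v, 0, take);                           // copy at most the per-value cap
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concatFields(new String[]{"aaaa", "bbbb"}, 3, 5)); // prints "aaa b"
    }
}
```

The total cap could then also be passed to langdetect's setMaxTextLength as suggested above.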
> frequent OOM in LanguageIdentifierUpdateProcessor > - > > Key: SOLR-3881 > URL: https://issues.apache.org/jira/browse/SOLR-3881 > Project: Solr > Issue Type: Bug > Components: update >Affects Versions: 4.0 > Environment: CentOS 6.x, JDK 1.6, (java -server -Xms2G -Xmx2G > -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=) >Reporter: Rob Tulloh > Fix For: 4.9, 5.0 > > Attachments: SOLR-3881.patch, SOLR-3881.patch > > > We are seeing frequent failures from Solr causing it to OOM. Here is the > stack trace we observe when this happens: > {noformat} > Caused by: java.lang.OutOfMemoryError: Java heap space > at java.util.Arrays.copyOf(Arrays.java:2882) > at > java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100) > at > java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:390) > at java.lang.StringBuffer.append(StringBuffer.java:224) > at > org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.concatFields(LanguageIdentifierUpdateProcessor.java:286) > at > org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.process(LanguageIdentifierUpdateProcessor.java:189) > at > org.apache.solr.update.processor.LanguageIdentifierUpdateProcessor.processAdd(LanguageIdentifierUpdateProcessor.java:171) > at > org.apache.solr.handler.BinaryUpdateRequestHandler$2.update(BinaryUpdateRequestHandler.java:90) > at > org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:140) > at > org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:120) > at > org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:221) > at > org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:105) > at > org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:186) > at > org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:112) > at > 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:147) > at > org.apache.solr.handler.BinaryUpdateRequestHandler.parseAndLoadDocs(BinaryUpdateRequestHandler.java:100) > at > org.apache.solr.handler.BinaryUpdateRequestHandler.access$000(BinaryUpdateRequestHandler.java:47) > at > org.apache.solr.handler.BinaryUpdateRequestHandler$1.load(BinaryUpdateRequestHandler.java:58) > at > org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:59) > at > org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1540) > at > org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:435) > at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:256) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.
[jira] [Comment Edited] (SOLR-5473) Split clusterstate.json per collection and watch states selectively
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062112#comment-14062112 ]

Noble Paul edited comment on SOLR-5473 at 7/15/14 2:27 PM:
---
bq. wouldn't say I am still convinced that caching till you fail is the same as watching

You are right. Caching till you fail is just an optimization in CloudSolrServer. In my view, the client has no business watching the state at all; the cost of an extra request per stale state is negligible IMHO.

bq. That's why I am saying that at least in the simplistic case this should be left to configuration – watch none, all, or selected.

Yes, I'm inclined to add this (selective watch) as an option which kicks in only if the no. of collections is greater than a certain threshold (say 10); below that threshold, all Solr nodes will watch all states.

To sum it up, my preference is:
# Have SolrJ do caching till it fails or times out (no watching whatsoever). Please enlighten me with a case where that is risky.
# Solr nodes should choose to watch all states or a selection, based on the no. of collections or a configurable cluster-wide property.

was (Author: noble.paul):
bq. wouldn't say I am still convinced that caching till you fail is the same as watching

You are right. Caching till you fail is just an optimization in CloudSolrServer. In my view, the client has no business watching the state at all; the cost of an extra request per stale state is negligible IMHO.

bq. That's why I am saying that at least in the simplistic case this should be left to configuration – watch none, all, or selected.

Yes, I'm inclined to add this as an option which kicks in only if the no. of collections is greater than a certain threshold (say 10); below that threshold, all Solr nodes will watch all states.

To sum it up, my preference is:
# Have SolrJ do caching till it fails or times out (no watching whatsoever). Please enlighten me with a case where that is risky.
# Solr nodes should choose to watch all states or a selection, based on the no. of collections or a configurable cluster-wide property.
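The "cache till it fails" idea above can be modeled in a few lines. This is a toy sketch with made-up names, not SolrJ code: the client reuses its last-seen collection state and performs a ZooKeeper read only when a request bounces with a stale-state error, so the cost is one extra round trip per state change rather than a standing watch.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of cache-until-stale (hypothetical names, not SolrJ).
public class CachingStateClient {
    private final Map<String, Integer> cache = new HashMap<>(); // collection -> state version
    int zkReads = 0;                                            // cost counter: "expensive" lookups

    private int readFromZk(String collection, int liveVersion) {
        zkReads++;   // stands in for fetching the collection's state.json
        return liveVersion;
    }

    /** Serve a request against a collection whose live state version is liveVersion. */
    public int request(String collection, int liveVersion) {
        Integer v = cache.get(collection);
        if (v == null) {                              // first contact: fetch once, then cache
            v = readFromZk(collection, liveVersion);
            cache.put(collection, v);
        }
        if (v != liveVersion) {                       // server rejected the request as stale:
            v = readFromZk(collection, liveVersion);  // pay exactly one extra round trip
            cache.put(collection, v);
        }
        return v;
    }

    public static void main(String[] args) {
        CachingStateClient c = new CachingStateClient();
        c.request("coll1", 1);
        c.request("coll1", 1);  // cache hit, no ZK read
        c.request("coll1", 2);  // state changed: one extra read, then cached again
        System.out.println("ZK reads: " + c.zkReads); // prints "ZK reads: 2"
    }
}
```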
[jira] [Commented] (SOLR-5473) Split clusterstate.json per collection and watch states selectively
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062112#comment-14062112 ]

Noble Paul commented on SOLR-5473:
--
bq. wouldn't say I am still convinced that caching till you fail is the same as watching

You are right. Caching till you fail is just an optimization in CloudSolrServer. In my view, the client has no business watching the state at all; the cost of an extra request per stale state is negligible IMHO.

bq. That's why I am saying that at least in the simplistic case this should be left to configuration – watch none, all, or selected.

Yes, I'm inclined to add this as an option which kicks in only if the no. of collections is greater than a certain threshold (say 10); below that threshold, all Solr nodes will watch all states.

To sum it up, my preference is:
# Have SolrJ do caching till it fails or times out (no watching whatsoever). Please enlighten me with a case where that is risky.
# Solr nodes should choose to watch all states or a selection, based on the no. of collections or a configurable cluster-wide property.
[jira] [Resolved] (LUCENE-5824) hunspell FLAG LONG implemented incorrectly
[ https://issues.apache.org/jira/browse/LUCENE-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir resolved LUCENE-5824.
-
Resolution: Fixed
Fix Version/s: 4.10
               5.0

> hunspell FLAG LONG implemented incorrectly
> --
> Key: LUCENE-5824
> URL: https://issues.apache.org/jira/browse/LUCENE-5824
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
> Fix For: 5.0, 4.10
> Attachments: LUCENE-5824.patch
>
> If you have more than 256 flags, you run out of 8-bit characters, so you have to use another flag type to get 64k:
> * UTF-8: 16-bit BMP flags
> * long: two-character flags like 'AB'
> * num: decimal numbers like '10234'
> But our implementation for 'long' is wrong: it encodes as 'A+B', which means it can't distinguish between 'AB' and 'BA' and causes overgeneration.
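The overgeneration mechanism is easy to see in isolation. This is an illustrative sketch, not the Lucene implementation: summing the two flag characters is order-insensitive, so 'AB' and 'BA' collide, while packing them into the two bytes of a single 16-bit value keeps them distinct.

```java
// Sketch of the FLAG long encoding bug (illustrative, not the Lucene code).
public class LongFlags {
    // buggy scheme: 'A' + 'B' == 'B' + 'A'
    static char sumEncode(char a, char b) {
        return (char) (a + b);
    }

    // order-preserving scheme: high byte is the first char, low byte the second
    // (valid because FLAG long flags are single-byte characters)
    static char packEncode(char a, char b) {
        return (char) ((a << 8) | b);
    }

    public static void main(String[] args) {
        System.out.println("sum  AB==BA? " + (sumEncode('A', 'B') == sumEncode('B', 'A')));  // true: collision
        System.out.println("pack AB==BA? " + (packEncode('A', 'B') == packEncode('B', 'A'))); // false: distinct
    }
}
```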
[jira] [Commented] (LUCENE-4396) BooleanScorer should sometimes be used for MUST clauses
[ https://issues.apache.org/jira/browse/LUCENE-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062106#comment-14062106 ] Michael McCandless commented on LUCENE-4396: This is good progress, thanks Da! bq. In this patch, I just compact the array as I go through the MUST_NOT docs. It looks like this gave some nice gains with the many-not cases bq. It seems that we can get a better result on *Some* tasks if we combine size9 with size5. Curiously some of the tasks are really hurt by the larger sizes ... maybe 1<<9 is a good compromise? > BooleanScorer should sometimes be used for MUST clauses > --- > > Key: LUCENE-4396 > URL: https://issues.apache.org/jira/browse/LUCENE-4396 > Project: Lucene - Core > Issue Type: Improvement >Reporter: Michael McCandless > Attachments: And.tasks, AndOr.tasks, AndOr.tasks, LUCENE-4396.patch, > LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, > LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, LUCENE-4396.patch, > SIZE.perf, luceneutil-score-equal.patch, luceneutil-score-equal.patch, > stat.cpp > > > Today we only use BooleanScorer if the query consists of SHOULD and MUST_NOT. > If there is one or more MUST clauses we always use BooleanScorer2. > But I suspect that unless the MUST clauses have very low hit count compared > to the other clauses, that BooleanScorer would perform better than > BooleanScorer2. BooleanScorer still has some vestiges from when it used to > handle MUST so it shouldn't be hard to bring back this capability ... I think > the challenging part might be the heuristics on when to use which (likely we > would have to use firstDocID as proxy for total hit count). > Likely we should also have BooleanScorer sometimes use .advance() on the subs > in this case, eg if suddenly the MUST clause skips 100 docs then you want > to .advance() all the SHOULD clauses. 
> I won't have near term time to work on this so feel free to take it if you > are inspired! -- This message was sent by Atlassian JIRA (v6.2#6252) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5473) Split clusterstate.json per collection and watch states selectively
[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Noble Paul updated SOLR-5473:
-
Attachment: SOLR-5473-74.patch

Full patch with all tests passing. The changes are minimal, with the addition of a constructor to ClusterState.
[jira] [Commented] (SOLR-6241) HttpPartitionTest.testRf3WithLeaderFailover fails sometimes
[ https://issues.apache.org/jira/browse/SOLR-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062064#comment-14062064 ]

Shalin Shekhar Mangar commented on SOLR-6241:
-
I still see some exceptions such as:
{code}
No registered leader was found after waiting for 6ms , collection: c8n_1x3_lf slice: shard1

Stacktrace
org.apache.solr.common.SolrException: No registered leader was found after waiting for 6ms , collection: c8n_1x3_lf slice: shard1
  at __randomizedtesting.SeedInfo.seed([CBCC4F6420498B0C:4A2AC17C5716EB30]:0)
  at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:567)
  at org.apache.solr.cloud.HttpPartitionTest.testRf3WithLeaderFailover(HttpPartitionTest.java:370)
  at org.apache.solr.cloud.HttpPartitionTest.doTest(HttpPartitionTest.java:150)
  at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:863)
{code}

> HttpPartitionTest.testRf3WithLeaderFailover fails sometimes
> ---
> Key: SOLR-6241
> URL: https://issues.apache.org/jira/browse/SOLR-6241
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud, Tests
> Reporter: Shalin Shekhar Mangar
> Assignee: Shalin Shekhar Mangar
> Priority: Minor
> Fix For: 4.10
>
> This test fails sometimes locally as well as on jenkins.
> {code}
> Expected 2 of 3 replicas to be active but only found 1
>   at org.junit.Assert.fail(Assert.java:93)
>   at org.junit.Assert.assertTrue(Assert.java:43)
>   at org.apache.solr.cloud.HttpPartitionTest.testRf3WithLeaderFailover(HttpPartitionTest.java:367)
>   at org.apache.solr.cloud.HttpPartitionTest.doTest(HttpPartitionTest.java:148)
>   at org.apache.solr.BaseDistributedSearchTestCase.testDistribSearch(BaseDistributedSearchTestCase.java:863)
> {code}
[jira] [Resolved] (SOLR-6245) Socket and Connection configuration are ignored in HttpSolrServer when passing in HttpClient
[ https://issues.apache.org/jira/browse/SOLR-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar resolved SOLR-6245.
-
Resolution: Fixed
Fix Version/s: 4.10
               5.0

Thanks Patanachai!

> Socket and Connection configuration are ignored in HttpSolrServer when passing in HttpClient
> 
> Key: SOLR-6245
> URL: https://issues.apache.org/jira/browse/SOLR-6245
> Project: Solr
> Issue Type: Bug
> Components: clients - java
> Affects Versions: 4.7, 4.8, 4.9
> Reporter: Patanachai Tangchaisin
> Assignee: Shalin Shekhar Mangar
> Fix For: 5.0, 4.10
> Attachments: SOLR-6245.patch, SOLR-6245.patch, SOLR-6245.patch
>
> I spent time debugging our HttpSolrServer and HttpClient. We construct our HttpClient (we have some requirements regarding connectionTimeout, soTimeout, etc.) and then pass it to HttpSolrServer. I found out that all our socket-level and connection-level configuration is ignored when creating an HTTP connection.
> The problem is that HttpClient 4.3.x allows overriding these parameters per request, i.e. one request can have socketTimeout=100ms and another request can have socketTimeout=200ms. The logic[1] that decides whether to build a per-request config depends on whether any of these parameters is set.
> {code}
> protected NamedList executeMethod(HttpRequestBase method, final ResponseParser processor) throws SolrServerException {
>   // XXX client already has this set, is this needed?
>   method.getParams().setParameter(ClientPNames.HANDLE_REDIRECTS, followRedirects);
>   method.addHeader("User-Agent", AGENT);
> {code}
> In HttpSolrServer.java, only one parameter (HANDLE_REDIRECTS) is set, but that triggers the logic in HttpClient to initialize a default per-request config, which eventually overrides any socket and connection configuration we did via HttpClientBuilder.
> To conclude, a solution would be to remove these lines:
> {code}
> // XXX client already has this set, is this needed?
> method.getParams().setParameter(ClientPNames.HANDLE_REDIRECTS, followRedirects);
> {code}
> [1] - http://svn.apache.org/viewvc/httpcomponents/httpclient/trunk/httpclient/src/main/java/org/apache/http/impl/client/InternalHttpClient.java?revision=1603745&view=markup [LINE:172]
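The override behavior described in the report can be modeled in a few lines. This is a simplified, hypothetical model of the decision referenced as [1], not the actual HttpClient classes: once any per-request parameter is present, the client builds a fresh per-request config whose unset fields fall back to library hard defaults rather than to the client-level configuration.

```java
// Hypothetical model of per-request config precedence (not real HttpClient code).
public class ConfigModel {
    static final int HARD_DEFAULT_SO_TIMEOUT = -1;  // "infinite", the library fallback

    static int effectiveSoTimeout(Integer builderSoTimeout,
                                  boolean requestHasAnyParam,
                                  Integer requestSoTimeout) {
        if (requestHasAnyParam) {
            // per-request config wins wholesale; missing fields do NOT fall
            // back to the builder-level config
            return requestSoTimeout != null ? requestSoTimeout : HARD_DEFAULT_SO_TIMEOUT;
        }
        return builderSoTimeout != null ? builderSoTimeout : HARD_DEFAULT_SO_TIMEOUT;
    }

    public static void main(String[] args) {
        // builder configured soTimeout=5000; HttpSolrServer sets only HANDLE_REDIRECTS:
        System.out.println(effectiveSoTimeout(5000, true, null));   // builder value is lost
        System.out.println(effectiveSoTimeout(5000, false, null));  // without the param it survives
    }
}
```

This is why removing the single setParameter call restores the builder-level settings.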
NumericFacets assumptions
I'm looking at org.apache.solr.request.NumericFacets.getCounts(), trying to get an idea of how it works. Specifically looking at this bit of code:

  final List<AtomicReaderContext> leaves = searcher.getIndexReader().leaves();
  final Iterator<AtomicReaderContext> ctxIt = leaves.iterator();
  for (DocIterator docsIt = docs.iterator(); docsIt.hasNext(); ) {
    final int doc = docsIt.nextDoc();
    if (ctx == null || doc >= ctx.docBase + ctx.reader().maxDoc()) {
      do {
        ctx = ctxIt.next();
      } while (ctx == null || doc >= ctx.docBase + ctx.reader().maxDoc());
      switch (numericType) {
        case LONG:
          longs = DocValues.getNumeric(ctx.reader(), fieldName);

Am I right that it is assuming that the docs are in order? This confused me because the javadoc for DocIterator says "The order of the documents returned by this iterator is non-deterministic". For most of the DocIterator implementations, the docs are returned in order, which I guess is how this works?

Also, am I right that it is assuming the index reader leaves are in order? That is, each leaf has higher doc ids than the previous leaf? I'm wondering too if that is always true, or if that is just how the implementation works now.

(I'm asking these questions because I want to do something very similar in a custom bit of code I'm writing, and it would be nice if I could safely make these same assumptions.)

-Michael
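The advance-through-leaves pattern in that snippet can be reproduced with plain collections to see why both assumptions matter. This is a stdlib-only sketch where the made-up Leaf class stands in for AtomicReaderContext (leaves() does hand back segments in reader order with cumulative docBase): the cursor only moves forward, so the loop is correct exactly when the doc ids ascend and the leaves are ordered by docBase.

```java
import java.util.Iterator;
import java.util.List;

// Stdlib-only sketch of the forward-only leaf cursor (Leaf is a stand-in type).
public class LeafWalk {
    public static class Leaf {
        final int docBase, maxDoc;
        public Leaf(int docBase, int maxDoc) { this.docBase = docBase; this.maxDoc = maxDoc; }
    }

    /** For each ascending global doc id, return the index of the leaf containing it. */
    public static int[] leafForDocs(List<Leaf> leaves, int[] ascendingDocs) {
        int[] out = new int[ascendingDocs.length];
        Iterator<Leaf> ctxIt = leaves.iterator();
        Leaf ctx = null;
        int leafIdx = -1;
        for (int i = 0; i < ascendingDocs.length; i++) {
            final int doc = ascendingDocs[i];
            // advance while doc lies past the current leaf; the cursor never
            // rewinds, so out-of-order docs or leaves would make next() fail
            while (ctx == null || doc >= ctx.docBase + ctx.maxDoc) {
                ctx = ctxIt.next();
                leafIdx++;
            }
            out[i] = leafIdx;
        }
        return out;
    }

    public static void main(String[] args) {
        List<Leaf> leaves = List.of(new Leaf(0, 10), new Leaf(10, 5), new Leaf(15, 20));
        System.out.println(java.util.Arrays.toString(
            leafForDocs(leaves, new int[]{0, 9, 10, 14, 15, 34}))); // prints [0, 0, 1, 1, 2, 2]
    }
}
```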
[jira] [Updated] (SOLR-6245) Socket and Connection configuration are ignored in HttpSolrServer when passing in HttpClient
[ https://issues.apache.org/jira/browse/SOLR-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar updated SOLR-6245:
-
Attachment: SOLR-6245.patch

Moved the test case into a new test so that we can use the SuppressSSL annotation on it. I cannot reconcile the SSL configuration with the new HttpComponents API and I don't have time to work on it further. But coverage-wise we're good because the fix is adequately tested.
[jira] [Commented] (LUCENE-5819) Add block tree postings format that supports term ords
[ https://issues.apache.org/jira/browse/LUCENE-5819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062020#comment-14062020 ]

Michael McCandless commented on LUCENE-5819:

I ran a quick perf test of Lucene41 vs OrdsLucene41, on wikimediumall:

{noformat}
Report after iter 19:
Task               QPS base          QPS comp          Pct diff
PKLookup           153.33 (8.7%)     131.17 (8.5%)     -14.4% ( -29% -  3%)
Respell             35.40 (5.4%)      31.41 (7.9%)     -11.3% ( -23% -  2%)
AndHighLow         241.05 (3.3%)     224.00 (14.7%)     -7.1% ( -24% - 11%)
Fuzzy2              69.73 (6.3%)      65.30 (5.5%)      -6.3% ( -17% -  5%)
Fuzzy1              44.32 (9.4%)      41.90 (11.8%)     -5.5% ( -24% - 17%)
LowTerm            313.68 (2.4%)     296.93 (10.8%)     -5.3% ( -18% -  8%)
Wildcard            39.40 (5.7%)      37.35 (9.7%)      -5.2% ( -19% - 10%)
IntNRQ               3.57 (9.3%)       3.41 (14.5%)     -4.6% ( -26% - 21%)
MedSloppyPhrase      4.98 (3.3%)       4.76 (12.7%)     -4.4% ( -19% - 12%)
MedPhrase            6.18 (3.8%)       5.95 (13.1%)     -3.7% ( -19% - 13%)
HighTerm            27.78 (5.8%)      26.75 (10.1%)     -3.7% ( -18% - 12%)
AndHighHigh         13.51 (2.0%)      13.02 (9.9%)      -3.6% ( -15% -  8%)
LowSloppyPhrase    134.71 (3.3%)     130.50 (12.1%)     -3.1% ( -17% - 12%)
Prefix3              8.88 (9.7%)       8.65 (15.6%)     -2.7% ( -25% - 25%)
LowPhrase           49.67 (3.1%)      48.38 (11.4%)     -2.6% ( -16% - 12%)
MedTerm            117.97 (4.5%)     115.01 (6.9%)      -2.5% ( -13% -  9%)
HighPhrase           7.87 (6.0%)       7.73 (13.3%)     -1.8% ( -19% - 18%)
HighSpanNear         4.68 (6.6%)       4.61 (14.7%)     -1.4% ( -21% - 21%)
AndHighMed          49.48 (1.6%)      48.95 (5.0%)      -1.1% (  -7% -  5%)
LowSpanNear         23.70 (4.6%)      23.55 (10.4%)     -0.7% ( -14% - 15%)
HighSloppyPhrase     5.90 (4.4%)       5.87 (11.2%)     -0.5% ( -15% - 15%)
OrNotHighLow        36.90 (12.3%)     37.07 (12.9%)      0.5% ( -22% - 29%)
OrHighHigh           4.16 (15.2%)      4.19 (16.7%)      0.8% ( -27% - 38%)
OrHighNotHigh       11.86 (13.8%)     11.98 (18.4%)      0.9% ( -27% - 38%)
MedSpanNear          4.32 (5.3%)       4.39 (10.7%)      1.5% ( -13% - 18%)
OrHighNotMed        26.10 (14.7%)     26.60 (12.8%)      1.9% ( -22% - 34%)
OrHighNotLow        19.61 (15.8%)     20.08 (13.9%)      2.4% ( -23% - 38%)
OrNotHighMed        13.84 (15.9%)     14.19 (16.7%)      2.6% ( -25% - 41%)
OrHighMed           27.09 (18.5%)     27.87 (19.4%)      2.9% ( -29% - 50%)
OrHighLow           36.24 (15.4%)     37.42 (15.3%)      3.2% ( -23% - 40%)
OrNotHighHigh        9.70 (16.6%)     10.11 (15.5%)      4.2% ( -23% - 43%)
{noformat}

Net/net the terms-dict-heavy operations (PKLookup, Respell, Fuzzy, maybe IntNRQ) take some hit, since there is added cost to decode ordinals from the FST; I think the other changes are likely noise. Also, the net terms index (size of FSTs that are loaded into RAM, *.tip/*.tipo) grew from 31M to 46M (~48% larger)...

> Add block tree postings format that supports term ords
> --
> Key: LUCENE-5819
> URL: https://issues.apache.org/jira/browse/LUCENE-5819
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/other
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: 5.0, 4.10
> Attachments: LUCENE-5819.patch
>
> BlockTree is our default terms dictionary today, but it doesn't
> support term ords, which is an optional API in the postings format to
> retrieve the ordinal for the currently seek'd term, and also later
> seek by that ordinal e.g. to lookup the term.
> This can possibly be useful for e.g. faceting, and maybe at some point
> we can share the postings terms dict with the one used by sorted/set
> DV for cases when app wants to invert and facet on a given field.
> The older (3.x) block terms dict can easily support ords, and we have
> a Lucene41OrdsPF in test-framework, but it's not as fast / compact as
> block-tree, and doesn't (can't easily) implement
[JENKINS] Lucene-Solr-Tests-4.x-Java7 - Build # 2031 - Still Failing
Build: https://builds.apache.org/job/Lucene-Solr-Tests-4.x-Java7/2031/ 1 tests failed. REGRESSION: org.apache.lucene.analysis.core.TestRandomChains.testRandomChains Error Message: startOffset must be non-negative, and endOffset must be >= startOffset, startOffset=2,endOffset=1 Stack Trace: java.lang.IllegalArgumentException: startOffset must be non-negative, and endOffset must be >= startOffset, startOffset=2,endOffset=1 at __randomizedtesting.SeedInfo.seed([2871E466A5044906:1590CD07E21654C6]:0) at org.apache.lucene.analysis.tokenattributes.PackedTokenAttributeImpl.setOffset(PackedTokenAttributeImpl.java:107) at org.apache.lucene.analysis.shingle.ShingleFilter.incrementToken(ShingleFilter.java:345) at org.apache.lucene.analysis.ValidatingTokenFilter.incrementToken(ValidatingTokenFilter.java:68) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkAnalysisConsistency(BaseTokenStreamTestCase.java:703) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:614) at org.apache.lucene.analysis.BaseTokenStreamTestCase.checkRandomData(BaseTokenStreamTestCase.java:513) at org.apache.lucene.analysis.core.TestRandomChains.testRandomChains(TestRandomChains.java:927) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1618) at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:827) at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:863) at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:877) at org.apache.lucene.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:50) at 
org.apache.lucene.util.TestRuleFieldCacheSanity$1.evaluate(TestRuleFieldCacheSanity.java:51) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at org.apache.lucene.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:49) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:365) at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:798) at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:458) at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:836) at com.carrotsearch.randomizedtesting.RandomizedRunner$3.evaluate(RandomizedRunner.java:738) at com.carrotsearch.randomizedtesting.RandomizedRunner$4.evaluate(RandomizedRunner.java:772) at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:783) at org.apache.lucene.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:46) at org.apache.lucene.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:42) at com.carrotsearch.randomizedtesting.rules.SystemPropertiesInvariantRule$1.evaluate(SystemPropertiesInvariantRule.java:55) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:39) at 
com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:43) at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:48) at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:65) at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuit
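The failure above trips the offset invariant every TokenStream attribute must maintain: offsets are non-negative and endOffset never precedes startOffset. A standalone sketch of the same validation (not the actual PackedTokenAttributeImpl code, which lives in lucene-core):

```java
public class OffsetCheck {
    // Mirrors the invariant that PackedTokenAttributeImpl.setOffset() enforces:
    // startOffset >= 0 and endOffset >= startOffset.
    static void setOffset(int startOffset, int endOffset) {
        if (startOffset < 0 || endOffset < startOffset) {
            throw new IllegalArgumentException(
                "startOffset must be non-negative, and endOffset must be >= startOffset, "
                + "startOffset=" + startOffset + ",endOffset=" + endOffset);
        }
    }

    public static void main(String[] args) {
        setOffset(0, 5);     // valid
        try {
            setOffset(2, 1); // the combination ShingleFilter produced in this failure
        } catch (IllegalArgumentException expected) {
            System.out.println(expected.getMessage());
        }
    }
}
```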
[jira] [Updated] (LUCENE-5824) hunspell FLAG LONG implemented incorrectly
[ https://issues.apache.org/jira/browse/LUCENE-5824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5824:

Attachment: LUCENE-5824.patch

Simple patch and test to encode as A << 8 + B (and also check the values are really within range: they should be two ASCII characters). This bug currently impacts the more complicated dictionaries using this encoding type (Russian, Arabic, Hebrew, etc.)

> hunspell FLAG LONG implemented incorrectly
> --
>
> Key: LUCENE-5824
> URL: https://issues.apache.org/jira/browse/LUCENE-5824
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
> Attachments: LUCENE-5824.patch
>
>
> If you have more than 256 flags, you run out of 8-bit characters, so you have
> to use another flag type to get 64k:
> * UTF-8: 16-bit BMP flags
> * long: two-character flags like 'AB'
> * num: decimal numbers like '10234'
> But our implementation for 'long' is wrong, it encodes as 'A+B', which means
> it can't distinguish between 'AB' and 'BA' and causes overgeneration.

--
This message was sent by Atlassian JIRA (v6.2#6252)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
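The difference between the broken additive encoding and the patch's packed encoding fits in a few lines. A standalone sketch (not the patch itself; the real code operates on hunspell flag strings):

```java
public class LongFlags {
    // Broken: addition is commutative, so the two-character flags
    // 'AB' and 'BA' map to the same value and overgenerate.
    static char encodeBroken(char a, char b) {
        return (char) (a + b);
    }

    // Patch approach: pack the two 8-bit characters into one 16-bit flag,
    // after checking both really fit in a single byte.
    static char encode(char a, char b) {
        if (a > 0xFF || b > 0xFF) {
            throw new IllegalArgumentException("FLAG long requires two 8-bit characters");
        }
        return (char) ((a << 8) | b);
    }

    public static void main(String[] args) {
        System.out.println(encodeBroken('A', 'B') == encodeBroken('B', 'A')); // true: collision
        System.out.println(encode('A', 'B') == encode('B', 'A'));             // false: distinct
    }
}
```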
[jira] [Created] (LUCENE-5824) hunspell FLAG LONG implemented incorrectly
Robert Muir created LUCENE-5824:
---

Summary: hunspell FLAG LONG implemented incorrectly
Key: LUCENE-5824
URL: https://issues.apache.org/jira/browse/LUCENE-5824
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir

If you have more than 256 flags, you run out of 8-bit characters, so you have to use another flag type to get 64k:
* UTF-8: 16-bit BMP flags
* long: two-character flags like 'AB'
* num: decimal numbers like '10234'

But our implementation for 'long' is wrong, it encodes as 'A+B', which means it can't distinguish between 'AB' and 'BA' and causes overgeneration.
[jira] [Commented] (SOLR-5865) Provide a MiniSolrCloudCluster to enable easier testing
[ https://issues.apache.org/jira/browse/SOLR-5865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061967#comment-14061967 ] ASF GitHub Bot commented on SOLR-5865:
--

Github user asfgit closed the pull request at:

https://github.com/apache/camel/pull/218

> Provide a MiniSolrCloudCluster to enable easier testing
> ---
>
> Key: SOLR-5865
> URL: https://issues.apache.org/jira/browse/SOLR-5865
> Project: Solr
> Issue Type: Improvement
> Components: SolrCloud
> Affects Versions: 4.7, 5.0
> Reporter: Gregory Chanan
> Assignee: Mark Miller
> Fix For: 4.8, 5.0
>
> Attachments: SOLR-5865.patch, SOLR-5865.patch,
> SOLR-5865addendum.patch, SOLR-5865addendum2.patch, SOLR-5865wait.patch
>
>
> Today, the SolrCloud tests are based on the LuceneTestCase class hierarchy,
> which has a couple of issues around support for downstream projects:
> - It's difficult to test SolrCloud support in a downstream project that may
> have its own test framework. For example, some projects have support for
> different storage backends (e.g. Solr/ElasticSearch/HBase) and want tests
> against each of the different backends. This is difficult to do cleanly,
> because the Solr tests require derivation from LuceneTestCase, while the
> other don't
> - The LuceneTestCase class hierarchy is really designed for internal solr
> tests (e.g. it randomizes a lot of parameters to get test coverage, but a
> downstream project probably doesn't care about that). It's also quite
> complicated and dense, much more so than a downstream project would want.
> Given these reasons, it would be nice to provide a simple
> "MiniSolrCloudCluster", similar to how HDFS provides a MiniHdfsCluster or
> HBase provides a MiniHBaseCluster.
[jira] [Comment Edited] (SOLR-5746) solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; expects odd type for "shareSchema"
[ https://issues.apache.org/jira/browse/SOLR-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14060685#comment-14060685 ] Maciej Zasada edited comment on SOLR-5746 at 7/15/14 11:08 AM: --- Hi [~hossman], I've attached updated patch file: * removed {{DOMUtil.readNamedChildrenAsNamedList}} method and used (slightly modified) existing API of {{DOMUtil}} instead. * removed reading values from {{SolrParam}} - they are being read directly from the {{NamedList<>}} * added reporting of duplicated config options ({{DEBUG}} level) per parent node, as well as exception message containing list of duplicated parameters, e.g. {code} 1 STRING-1 … 2 STRING-2 {code} will cause an exception: {code} Duplicated 2 config parameter(s) in solr.xml file: [int-param, str-param] {code} However, if parameters with a same name are attached to different parent nodes everything will pass just fine, e.g. {code} 1 STRING-1 … … 2 STRING-2 {code} In this case no exception will be thrown. Some examples to sum it up: |{{solr.xml}} file fragment|Expected type|Parsing result| |{{44}}|{{Integer}}|(/) |{{44}}|{{Integer}}|(/) |{{44}}|{{Integer}}|(x) |{{44}}|{{Integer}}|(x) |{{44}}|{{Integer}}|(x) |{{44}}|{{Integer}}|(x) |{{true}}|{{Boolean}}|(x) |{{true}}|{{Boolean}}|(/) |{{true}}|{{Boolean}}|(x) |{{true}}|{{Boolean}}|(/) |{{true}}|{{Boolean}}|(x) |{{true}}|{{Boolean}}|(x) {{ant clean test}} shows that there's no regression. [~jkrupan] this change clearly is not backward compatible with the existing {{solr.xml}} files. For instance - unknown config values won't be silently ignored - an exception will be thrown instead. However, I didn't realise that {{solr.xml}} files are versioned the same way as {{schema.xml}} files are. Should I bump the schema version to 1.6? 
Cheers, Maciej was (Author: maciej.zasada): Hi [~hossman], I've attached updated patch file: * removed {{DOMUtil.readNamedChildrenAsNamedList}} method and used (slightly modified) existing API of {{DOMUtil}} instead. * removed reading values from {{SolrParam}} - they are being read directly from the {{NamedList<>}} * added reporting of duplicated config options ({{DEBUG}} level) per parent node, as well as exception message containing list of duplicated parameters, e.g. {code} 1 STRING-1 … 2 STRING-2 {code} will cause an exception: {code} Duplicated 2 config parameter(s) in solr.xml file: [int-param, str-param] {code} However, if parameters with a same name are attached to different parent nodes everything will pass just fine, e.g. {code} 1 STRING-1 … … 2 STRING-2 {code} In this case no exception will be thrown. Some examples to sum it up: |{{solr.xml}} file fragment|Expected type|Parsing result| |{{44}}|{{Integer}}|(/) |{{44}}|{{Integer}}|(/) |{{44}}|{{Integer}}|(/) |{{44}}|{{Integer}}|(x) |{{44}}|{{Integer}}|(x) |{{44}}|{{Integer}}|(x) |{{true}}|{{Boolean}}|(x) |{{true}}|{{Boolean}}|(/) |{{true}}|{{Boolean}}|(x) |{{true}}|{{Boolean}}|(/) |{{true}}|{{Boolean}}|(x) |{{true}}|{{Boolean}}|(x) {{ant clean test}} shows that there's no regression. [~jkrupan] this change clearly is not backward compatible with the existing {{solr.xml}} files. For instance - unknown config values won't be silently ignored - an exception will be thrown instead. However, I didn't realise that {{solr.xml}} files are versioned the same way as {{schema.xml}} files are. Should I bump the schema version to 1.6? 
Cheers,
Maciej

> solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently;
> expects odd type for "shareSchema"
> --
>
> Key: SOLR-5746
> URL: https://issues.apache.org/jira/browse/SOLR-5746
> Project: Solr
> Issue Type: Bug
> Affects Versions: 4.3, 4.4, 4.5, 4.6
> Reporter: Hoss Man
> Attachments: SOLR-5746.patch, SOLR-5746.patch, SOLR-5746.patch,
> SOLR-5746.patch
>
>
> A comment in the ref guide got me looking at ConfigSolrXml.java and noticing
> that the parsing of solr.xml options here is very brittle and confusing. In
> particular:
> * if a boolean option "foo" is expected along the lines of {{<bool name="foo">true</bool>}} it will silently ignore {{<str name="foo">true</str>}}
> * likewise for an int option {{<int name="bar">32</int>}} vs {{<str name="bar">32</str>}}
> ... this is inconsistent with the way solrconfig.xml is parsed. In
> solrconfig.xml, the xml nodes are parsed into a NamedList, and the above
> options will work in either form, but an invalid value such as {{<bool name="foo">NOT A BOOLEAN</bool>}} will generate an error earlier (when
> parsing config) than {{<str name="foo">NOT A BOOLEAN</str>}} (attempt to
> parse the string as a bool the first time the config value is needed)
> In addition, i notice this really confusing line...
> {code}
> propMap.put(CfgProp.SOLR_SHARESCHEMA,
> doSub("solr/str[@na
[jira] [Updated] (SOLR-5746) solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently; expects odd type for "shareSchema"
[ https://issues.apache.org/jira/browse/SOLR-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Zasada updated SOLR-5746:

Attachment: SOLR-5746.patch

Added some more unit tests to the patch.

> solr.xml parsing of "str" vs "int" vs "bool" is brittle; fails silently;
> expects odd type for "shareSchema"
> --
>
> Key: SOLR-5746
> URL: https://issues.apache.org/jira/browse/SOLR-5746
> Project: Solr
> Issue Type: Bug
> Affects Versions: 4.3, 4.4, 4.5, 4.6
> Reporter: Hoss Man
> Attachments: SOLR-5746.patch, SOLR-5746.patch, SOLR-5746.patch,
> SOLR-5746.patch
>
>
> A comment in the ref guide got me looking at ConfigSolrXml.java and noticing
> that the parsing of solr.xml options here is very brittle and confusing. In
> particular:
> * if a boolean option "foo" is expected along the lines of {{<bool name="foo">true</bool>}} it will silently ignore {{<str name="foo">true</str>}}
> * likewise for an int option {{<int name="bar">32</int>}} vs {{<str name="bar">32</str>}}
> ... this is inconsistent with the way solrconfig.xml is parsed. In
> solrconfig.xml, the xml nodes are parsed into a NamedList, and the above
> options will work in either form, but an invalid value such as {{<bool name="foo">NOT A BOOLEAN</bool>}} will generate an error earlier (when
> parsing config) than {{<str name="foo">NOT A BOOLEAN</str>}} (attempt to
> parse the string as a bool the first time the config value is needed)
> In addition, i notice this really confusing line...
> {code}
> propMap.put(CfgProp.SOLR_SHARESCHEMA,
> doSub("solr/str[@name='shareSchema']"));
> {code}
> "shareSchema" is used internally as a boolean option, but as written the
> parsing code will ignore it unless the user explicitly configures it as a
> {{<str/>}}
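The duplicate-parameter reporting described in the comment above can be illustrated with a small standalone DOM walk. This is a sketch under assumptions, not the patch's actual DOMUtil code: the element names and the `<solr>` snippet are reconstructed from the exception message quoted in the thread.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.*;
import org.w3c.dom.*;

public class DupCheck {
    // Parse a solr.xml-style snippet and return param names that occur more
    // than once among the direct children of the root element.
    static List<String> duplicatedNames(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
            Set<String> seen = new HashSet<>();
            Set<String> dups = new LinkedHashSet<>();  // preserve first-seen order
            NodeList kids = root.getChildNodes();
            for (int i = 0; i < kids.getLength(); i++) {
                if (kids.item(i) instanceof Element) {
                    String name = ((Element) kids.item(i)).getAttribute("name");
                    if (!seen.add(name)) {
                        dups.add(name);
                    }
                }
            }
            return new ArrayList<>(dups);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<solr>"
            + "<int name='int-param'>1</int><str name='str-param'>STRING-1</str>"
            + "<int name='int-param'>2</int><str name='str-param'>STRING-2</str>"
            + "</solr>";
        List<String> dups = duplicatedNames(xml);
        System.out.println("Duplicated " + dups.size()
            + " config parameter(s) in solr.xml file: " + dups);
    }
}
```

As in the patch's described behavior, parameters repeated under *different* parent nodes would not be flagged, since the check is scoped to one parent's children.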
[jira] [Resolved] (LUCENE-5823) recognize hunspell FULLSTRIP option in affix file
[ https://issues.apache.org/jira/browse/LUCENE-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-5823.
-

Resolution: Fixed
Fix Version/s: 4.10
               5.0

> recognize hunspell FULLSTRIP option in affix file
> -
>
> Key: LUCENE-5823
> URL: https://issues.apache.org/jira/browse/LUCENE-5823
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Robert Muir
> Fix For: 5.0, 4.10
>
> Attachments: LUCENE-5823.patch
>
>
> With LUCENE-5818 we fixed stripping to be correct (ensuring it doesn't strip
> the entire word before applying an affix). This is usually true, but there is
> an option in the affix file to allow this.
> It's used by several languages (French, Latvian, Swedish, etc.)
> {noformat}
> FULLSTRIP
> With FULLSTRIP, affix rules can strip full words, not only one
> less characters, before adding the affixes
> {noformat}
[jira] [Created] (LUCENE-5823) recognize hunspell FULLSTRIP option in affix file
Robert Muir created LUCENE-5823:
---

Summary: recognize hunspell FULLSTRIP option in affix file
Key: LUCENE-5823
URL: https://issues.apache.org/jira/browse/LUCENE-5823
Project: Lucene - Core
Issue Type: Improvement
Reporter: Robert Muir
Attachments: LUCENE-5823.patch

With LUCENE-5818 we fixed stripping to be correct (ensuring it doesn't strip the entire word before applying an affix). This is usually true, but there is an option in the affix file to allow this.

It's used by several languages (French, Latvian, Swedish, etc.)

{noformat}
FULLSTRIP
With FULLSTRIP, affix rules can strip full words, not only one
less characters, before adding the affixes
{noformat}
[jira] [Updated] (LUCENE-5823) recognize hunspell FULLSTRIP option in affix file
[ https://issues.apache.org/jira/browse/LUCENE-5823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-5823:

Attachment: LUCENE-5823.patch

Simple patch with a test.

> recognize hunspell FULLSTRIP option in affix file
> -
>
> Key: LUCENE-5823
> URL: https://issues.apache.org/jira/browse/LUCENE-5823
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Robert Muir
> Attachments: LUCENE-5823.patch
>
>
> With LUCENE-5818 we fixed stripping to be correct (ensuring it doesn't strip
> the entire word before applying an affix). This is usually true, but there is
> an option in the affix file to allow this.
> It's used by several languages (French, Latvian, Swedish, etc.)
> {noformat}
> FULLSTRIP
> With FULLSTRIP, affix rules can strip full words, not only one
> less characters, before adding the affixes
> {noformat}
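The FULLSTRIP behavior can be sketched with a simplified standalone model of suffix-rule application. This is an illustration of the option's semantics, not the actual Lucene hunspell code (real rules also carry conditions and flag checks omitted here):

```java
public class Fullstrip {
    // Apply a suffix rule: remove `strip` from the end of `word`, then append
    // `affix`. Without FULLSTRIP, a rule may not strip the entire word.
    static String applySuffix(String word, String strip, String affix, boolean fullstrip) {
        if (!word.endsWith(strip)) {
            return null;  // rule does not apply to this word
        }
        if (!fullstrip && word.length() == strip.length()) {
            return null;  // would strip the full word; only legal with FULLSTRIP
        }
        return word.substring(0, word.length() - strip.length()) + affix;
    }

    public static void main(String[] args) {
        // Normal case: strip "k" from "walk", add "king" -> "walking".
        System.out.println(applySuffix("walk", "k", "king", false));
        // Full-word strip is rejected without FULLSTRIP, allowed with it.
        System.out.println(applySuffix("a", "a", "x", false));
        System.out.println(applySuffix("a", "a", "x", true));
    }
}
```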
[jira] [Comment Edited] (SOLR-4647) Grouping is broken on docvalues-only fields
[ https://issues.apache.org/jira/browse/SOLR-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061877#comment-14061877 ] Modassar Ather edited comment on SOLR-4647 at 7/15/14 9:40 AM:
---

Hi,

I am also seeing this issue while doing grouping on a docValues-enabled field.

I checked the createField(...) method of FieldType, which returns null if the field is not indexed and stored. And when the returned field (which is null) gets passed to the fieldType.toObject(...) method from the finish() method of Grouping.java, it causes the NullPointerException.

Kindly provide inputs on whether indexed/stored needs to be set to true while creating a docValues field, or whether this is an issue.

Thanks,
Modassar

was (Author: modassar):
Hi,

I am also seeing this issue while doing grouping on a docValues-enabled field.

I checked the createField(...) method of FieldType, which returns null if the field is not indexed and stored.

Kindly provide inputs on whether indexed/stored needs to be set to true while creating a docValues field, or whether this is an issue.

Thanks,
Modassar

> Grouping is broken on docvalues-only fields
> ---
>
> Key: SOLR-4647
> URL: https://issues.apache.org/jira/browse/SOLR-4647
> Project: Solr
> Issue Type: Bug
> Affects Versions: 4.2
> Reporter: Adrien Grand
> Labels: newdev
>
> There are a few places where grouping uses
> FieldType.toObject(SchemaField.createField(String, float)) to translate a
> String field value to an Object. The problem is that createField returns null
> when the field is neither stored nor indexed, even if it has doc values.
> An option to fix it could be to use the ValueSource instead to resolve the
> Object value (similarly to NumericFacets).
[jira] [Commented] (SOLR-4647) Grouping is broken on docvalues-only fields
[ https://issues.apache.org/jira/browse/SOLR-4647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061877#comment-14061877 ] Modassar Ather commented on SOLR-4647:
--

Hi,

I am also seeing this issue while doing grouping on a docValues-enabled field.

I checked the createField(...) method of FieldType, which returns null if the field is not indexed and stored.

Kindly provide inputs on whether indexed/stored needs to be set to true while creating a docValues field, or whether this is an issue.

Thanks,
Modassar

> Grouping is broken on docvalues-only fields
> ---
>
> Key: SOLR-4647
> URL: https://issues.apache.org/jira/browse/SOLR-4647
> Project: Solr
> Issue Type: Bug
> Affects Versions: 4.2
> Reporter: Adrien Grand
> Labels: newdev
>
> There are a few places where grouping uses
> FieldType.toObject(SchemaField.createField(String, float)) to translate a
> String field value to an Object. The problem is that createField returns null
> when the field is neither stored nor indexed, even if it has doc values.
> An option to fix it could be to use the ValueSource instead to resolve the
> Object value (similarly to NumericFacets).