[jira] [Commented] (SOLR-5104) Remove Default Core
[ https://issues.apache.org/jira/browse/SOLR-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13727053#comment-13727053 ] Mark Miller commented on SOLR-5104: --- We certainly want to do this, and I don't consider it removing a feature. It's purely an improvement IMO. The main reason I have not pushed for it yet was that it killed the admin UI - but now that that is fixed, this is the next step. This is a relic from the pre-multi-core days - when Solr was one index and that is it. Backcompat sludge is what has kept it around IMO - we want to act like most systems and start empty. It should be very simple for a user to create his first collection, but he should be the one to name it. As Grant mentions, this is certainly what you want for scriptability, and it's more consistent with other systems for users as well. A new user should: 1. Start Solr 2. Name and create their first collection. When they want more collections, repeat step 2, a step they learned right away. Remove Default Core --- Key: SOLR-5104 URL: https://issues.apache.org/jira/browse/SOLR-5104 Project: Solr Issue Type: Sub-task Reporter: Grant Ingersoll Fix For: 5.0 I see no reason to maintain the notion of a default Core/Collection. We can either default to Collection1, or just simply create a core on the fly based on the client's request. Thus, all APIs that access a core would require the core to be in the address path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5104) Remove Default Core
[ https://issues.apache.org/jira/browse/SOLR-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13727060#comment-13727060 ] Mark Miller commented on SOLR-5104: --- Took me a moment to realize that this is not referring to removing the core that ships with Solr, but to the default core feature. I want to remove the actual default core that is set up, so certainly +1 on dropping this. I think we already discussed it some for 5.0. Remove Default Core --- Key: SOLR-5104 URL: https://issues.apache.org/jira/browse/SOLR-5104 Project: Solr Issue Type: Sub-task Reporter: Grant Ingersoll Fix For: 5.0 I see no reason to maintain the notion of a default Core/Collection. We can either default to Collection1, or just simply create a core on the fly based on the client's request. Thus, all APIs that are accessing a core would require the core to be in the address path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Cassandra Targett as Lucene/Solr committer
Welcome! - Mark On Jul 31, 2013, at 6:47 PM, Robert Muir rcm...@gmail.com wrote: I'm pleased to announce that Cassandra Targett has accepted to join our ranks as a committer. Cassandra worked on the donation of the new Solr Reference Guide [1] and getting things in order for its first official release [2]. Cassandra, it is tradition that you introduce yourself with a brief bio. Welcome! P.S. As soon as your SVN access is setup, you should then be able to add yourself to the committers list on the website as well. [1] https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/
[jira] [Created] (LUCENE-5156) CompressingTermVectors termsEnum should probably not support seek-by-ord
Robert Muir created LUCENE-5156: --- Summary: CompressingTermVectors termsEnum should probably not support seek-by-ord Key: LUCENE-5156 URL: https://issues.apache.org/jira/browse/LUCENE-5156 Project: Lucene - Core Issue Type: Bug Reporter: Robert Muir Just like term vectors before it, it has an O(n) seek-by-term. But this one also advertises a seek-by-ord, only this is also O(n). This could cause e.g. CheckIndex to be very slow, because if the termsEnum supports ord, it does a bunch of seeking tests. (Another solution would be to leave it, and add a boolean so CheckIndex never does seeking tests for term vectors, only real fields.) However, I think it's also kind of a trap; in my opinion, if seek-by-ord is supported anywhere, you kind of expect it to be faster than linear time...? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
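To make the shape of the change concrete: a TermsEnum that opts out of ord support typically just throws UnsupportedOperationException from the ord-based methods, which is what keeps callers like CheckIndex from running the linear-time seeking tests. The sketch below is a hypothetical illustration (class name invented), not the actual patch.

{code}
// Hypothetical sketch: a term-vectors TermsEnum that opts out of seek-by-ord
// instead of advertising an O(n) implementation. Only the ord-related methods
// are shown; a real subclass would also implement next(), term(), docFreq(), etc.
public abstract class NoOrdTermVectorsTermsEnum extends org.apache.lucene.index.TermsEnum {

  @Override
  public void seekExact(long ord) {
    // Callers probe for ord support; throwing here signals "not supported"
    // so they skip ord-based seeking entirely.
    throw new UnsupportedOperationException();
  }

  @Override
  public long ord() {
    throw new UnsupportedOperationException();
  }
}
{code}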
Re: Welcome Cassandra Targett as Lucene/Solr committer
Awesome! Congrats Welcome Cassandra! Thanks Regards, Kranti K Parisa http://www.linkedin.com/in/krantiparisa On Thu, Aug 1, 2013 at 7:23 PM, Mark Miller markrmil...@gmail.com wrote: Welcome! - Mark On Jul 31, 2013, at 6:47 PM, Robert Muir rcm...@gmail.com wrote: I'm pleased to announce that Cassandra Targett has accepted to join our ranks as a committer. Cassandra worked on the donation of the new Solr Reference Guide [1] and getting things in order for its first official release [2]. Cassandra, it is tradition that you introduce yourself with a brief bio. Welcome! P.S. As soon as your SVN access is setup, you should then be able to add yourself to the committers list on the website as well. [1] https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/
[jira] [Updated] (SOLR-2570) randomize indexwriter settings in solr tests
[ https://issues.apache.org/jira/browse/SOLR-2570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man updated SOLR-2570: --- Attachment: SOLR-2570.patch Updated patch taking into account the work already done in SOLR-4942 and SOLR-4951. In my limited testing so far, I haven't seen any obvious failures -- so I'd like to commit soon and then move forward with using the XML include snippet in more configs (SOLR-4952). randomize indexwriter settings in solr tests Key: SOLR-2570 URL: https://issues.apache.org/jira/browse/SOLR-2570 Project: Solr Issue Type: Sub-task Components: Build Reporter: Robert Muir Assignee: Robert Muir Fix For: 4.5, 5.0 Attachments: SOLR-2570.patch, SOLR-2570.patch We should randomize IndexWriter settings like the Lucene tests do, to vary the # of segments and such. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
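For readers unfamiliar with what "randomizing indexwriter settings" means in practice, here is a minimal, hypothetical sketch of the kind of per-run variation the Lucene test framework applies (the actual patch wires this through solrconfig XML includes rather than Java code; the class and method names below are invented, and Lucene 4.4 APIs are assumed).

{code}
// Hypothetical sketch of randomized IndexWriter settings, in the spirit of
// LuceneTestCase: each test run gets different flush and merge behavior so
// tests exercise many small segments on some runs and few large ones on others.
import java.util.Random;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.LogDocMergePolicy;
import org.apache.lucene.index.TieredMergePolicy;
import org.apache.lucene.util.Version;

public class RandomizedIndexWriterSettings {
  public static IndexWriterConfig newRandomConfig(Random random) {
    IndexWriterConfig iwc =
        new IndexWriterConfig(Version.LUCENE_44, new StandardAnalyzer(Version.LUCENE_44));
    // Vary how often segments are flushed.
    iwc.setMaxBufferedDocs(2 + random.nextInt(100));
    iwc.setRAMBufferSizeMB(0.1 + random.nextDouble() * 64);
    // Alternate merge policies to exercise different segment geometries.
    iwc.setMergePolicy(random.nextBoolean() ? new TieredMergePolicy() : new LogDocMergePolicy());
    return iwc;
  }
}
{code}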
Re: Welcome Cassandra Targett as Lucene/Solr committer
Welcome onboard Cassandra! Tommaso 2013/8/1 Robert Muir rcm...@gmail.com I'm pleased to announce that Cassandra Targett has accepted to join our ranks as a committer. Cassandra worked on the donation of the new Solr Reference Guide [1] and getting things in order for its first official release [2]. Cassandra, it is tradition that you introduce yourself with a brief bio. Welcome! P.S. As soon as your SVN access is setup, you should then be able to add yourself to the committers list on the website as well. [1] https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/
Re: Welcome Cassandra Targett as Lucene/Solr committer
Welcome Cassandra! Christian On Aug 1, 2013, at 7:47 AM, Robert Muir rcm...@gmail.com wrote: I'm pleased to announce that Cassandra Targett has accepted to join our ranks as a committer. Cassandra worked on the donation of the new Solr Reference Guide [1] and getting things in order for its first official release [2]. Cassandra, it is tradition that you introduce yourself with a brief bio. Welcome! P.S. As soon as your SVN access is setup, you should then be able to add yourself to the committers list on the website as well. [1] https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/
[jira] [Created] (SOLR-5100) java.lang.OutOfMemoryError: Requested array size exceeds VM limit
Grzegorz Sobczyk created SOLR-5100: -- Summary: java.lang.OutOfMemoryError: Requested array size exceeds VM limit Key: SOLR-5100 URL: https://issues.apache.org/jira/browse/SOLR-5100 Project: Solr Issue Type: Bug Affects Versions: 4.2.1 Environment: Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64 GNU/Linux Java 7, Tomcat, ZK standalone Reporter: Grzegorz Sobczyk Today I found exception in log (lmsiprse01): {code} sie 01, 2013 5:27:26 AM org.apache.solr.core.SolrCore execute INFO: [products] webapp=/solr path=/select params={facet=truestart=0q=facet.limit=-1facet.field=attribute_u-typfacet.field=attribute_u-gama-kolorystycznafacet.field=brand_namewt=javabinfq=node_id:1056version=2rows=0} hits=1241 status=0 QTime=33 sie 01, 2013 5:27:26 AM org.apache.solr.common.SolrException log SEVERE: null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Requested array size exceeds VM limit at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:653) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:366) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489) at java.lang.Thread.run(Thread.java:724) Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit at org.apache.lucene.util.PriorityQueue.init(PriorityQueue.java:64) at org.apache.lucene.util.PriorityQueue.init(PriorityQueue.java:37) at org.apache.solr.handler.component.ShardFieldSortedHitQueue.init(ShardDoc.java:113) at org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:766) at org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:625) at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:604) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:311) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) ... 
13 more {code} We have: * 3x standalone zK * 3x Solr 4.2.1 on Tomcat Exception shows up after leader was stopped: * lmsiprse01: [2013-08-01 05:23:43]: /etc/init.d/tomcat6-1 stop [2013-08-01 05:25:09]: /etc/init.d/tomcat6-1 start * lmsiprse02 (leader): 2013-08-01 05:27:21]: /etc/init.d/tomcat6-1 stop 2013-08-01 05:29:31]: /etc/init.d/tomcat6-1 start * lmsiprse03: [2013-08-01 05:25:48]: /etc/init.d/tomcat6-1 stop [2013-08-01 05:26:42]: /etc/init.d/tomcat6-1 start -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
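As an aside on the error message itself (an illustration only, not a diagnosis of this report): Lucene's PriorityQueue allocates its entire backing array in the constructor, so if the queue size computed during shard-response merging ends up absurdly large, the allocation fails immediately with "Requested array size exceeds VM limit" regardless of how much heap is configured. A minimal demonstration, with the oversized value hard-coded:

{code}
// Illustration only: an array request near Integer.MAX_VALUE produces
// "java.lang.OutOfMemoryError: Requested array size exceeds VM limit"
// on HotSpot, independent of the configured heap size.
public class ArraySizeLimitDemo {
  public static void main(String[] args) {
    int requestedQueueSize = Integer.MAX_VALUE; // stand-in for an oversized merge queue size
    Object[] heap = new Object[requestedQueueSize]; // throws the error above
    System.out.println(heap.length);
  }
}
{code}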
[jira] [Updated] (SOLR-5084) new field type - EnumField
[ https://issues.apache.org/jira/browse/SOLR-5084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elran Dvir updated SOLR-5084: - Attachment: Solr-5084.patch new field type - EnumField -- Key: SOLR-5084 URL: https://issues.apache.org/jira/browse/SOLR-5084 Project: Solr Issue Type: New Feature Reporter: Elran Dvir Attachments: enumsConfig.xml, schema_example.xml, Solr-5084.patch, Solr-5084.patch We have encountered a use case in our system where we have a few fields (Severity, Risk, etc.) with a closed set of values, where the sort order for these values is pre-determined but not lexicographic (Critical is higher than High). Generically this is very close to how enums work. To implement, I have prototyped a new type of field: EnumField, where the inputs are a closed, predefined set of strings in a special configuration file (similar to currency.xml). The code is based on 4.2.1. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
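As a rough illustration of the idea (not the attached patch), an enum-like field boils down to mapping each configured label to an integer that reflects the configured order, indexing and sorting on that integer, and mapping back to the label for display. The class below is hypothetical, with the labels hard-coded instead of read from a config file like the attached enumsConfig.xml.

{code}
// Hypothetical sketch of the core idea behind an EnumField: labels get a
// configured ordinal (not lexicographic order), sorting happens on the
// ordinal, and the ordinal is translated back to the label for display.
import java.util.LinkedHashMap;
import java.util.Map;

public class EnumMapping {
  private final Map<String, Integer> labelToOrd = new LinkedHashMap<>();
  private final Map<Integer, String> ordToLabel = new LinkedHashMap<>();

  public EnumMapping(String... labelsInOrder) {
    for (int ord = 0; ord < labelsInOrder.length; ord++) {
      labelToOrd.put(labelsInOrder[ord], ord);
      ordToLabel.put(ord, labelsInOrder[ord]);
    }
  }

  public int toOrdinal(String label) { return labelToOrd.get(label); }
  public String toLabel(int ord) { return ordToLabel.get(ord); }

  public static void main(String[] args) {
    EnumMapping severity = new EnumMapping("Low", "Medium", "High", "Critical");
    // "Critical" sorts above "High" even though it is lexicographically smaller.
    System.out.println(severity.toOrdinal("Critical") > severity.toOrdinal("High")); // true
  }
}
{code}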
[jira] [Commented] (SOLR-5084) new field type - EnumField
[ https://issues.apache.org/jira/browse/SOLR-5084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726151#comment-13726151 ] Elran Dvir commented on SOLR-5084: -- I reformatted the code. I hope it's OK now. Thanks. new field type - EnumField -- Key: SOLR-5084 URL: https://issues.apache.org/jira/browse/SOLR-5084 Project: Solr Issue Type: New Feature Reporter: Elran Dvir Attachments: enumsConfig.xml, schema_example.xml, Solr-5084.patch, Solr-5084.patch We have encountered a use case in our system where we have a few fields (Severity. Risk etc) with a closed set of values, where the sort order for these values is pre-determined but not lexicographic (Critical is higher than High). Generically this is very close to how enums work. To implement, I have prototyped a new type of field: EnumField where the inputs are a closed predefined set of strings in a special configuration file (similar to currency.xml). The code is based on 4.2.1. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Cassandra Targett as Lucene/Solr committer
Welcome Cassandra! -- Adrien - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Cassandra Targett as Lucene/Solr committer
Welcome! On 1 August 2013 09:06, Adrien Grand jpou...@gmail.com wrote: Welome Cassandra! -- Adrien - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Met vriendelijke groet, Martijn van Groningen
Re: Welcome Cassandra Targett as Lucene/Solr committer
welcome! On Thu, Aug 1, 2013 at 9:18 AM, Martijn v Groningen martijn.v.gronin...@gmail.com wrote: Welcome! On 1 August 2013 09:06, Adrien Grand jpou...@gmail.com wrote: Welome Cassandra! -- Adrien - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org -- Met vriendelijke groet, Martijn van Groningen - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726191#comment-13726191 ] Joern Kottmann commented on LUCENE-2899: Stanford NLP is licensed under GPLv2, this license is not compatible with the AL 2.0 and therefore such a component can't be contributed to an Apache project directly. Add OpenNLP Analysis capabilities as a module - Key: LUCENE-2899 URL: https://issues.apache.org/jira/browse/LUCENE-2899 Project: Lucene - Core Issue Type: New Feature Components: modules/analysis Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Fix For: 5.0, 4.5 Attachments: LUCENE-2899-current.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899-RJN.patch, LUCENE-2899-x.patch, LUCENE-2899-x.patch, LUCENE-2899-x.patch, OpenNLPFilter.java, OpenNLPFilter.java, OpenNLPTokenizer.java, opennlp_trunk.patch Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a submodule (under analysis) that exposed capabilities for it. Drew Farris, Tom Morton and I have code that does: * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it would have to change slightly to buffer tokens) * NamedEntity recognition as a TokenFilter We are also planning a Tokenizer/TokenFilter that can put parts of speech as either payloads (PartOfSpeechAttribute?) on a token or at the same position. I'd propose it go under: modules/analysis/opennlp -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Cassandra Targett as Lucene/Solr committer
Welcome Cassandra! Alan Woodward www.flax.co.uk On 31 Jul 2013, at 23:47, Robert Muir wrote: I'm pleased to announce that Cassandra Targett has accepted to join our ranks as a committer. Cassandra worked on the donation of the new Solr Reference Guide [1] and getting things in order for its first official release [2]. Cassandra, it is tradition that you introduce yourself with a brief bio. Welcome! P.S. As soon as your SVN access is setup, you should then be able to add yourself to the committers list on the website as well. [1] https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/
[jira] [Assigned] (SOLR-5099) The core.properties not created during collection creation
[ https://issues.apache.org/jira/browse/SOLR-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Woodward reassigned SOLR-5099: --- Assignee: Alan Woodward The core.properties not created during collection creation -- Key: SOLR-5099 URL: https://issues.apache.org/jira/browse/SOLR-5099 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5, 5.0 Reporter: Herb Jiang Assignee: Alan Woodward Priority: Critical Attachments: CorePropertiesLocator.java.patch When using the new solr.xml structure. The core auto discovery mechanism trying to find core.properties. But I found the core.properties cannot be create when I dynamically create a collection. The root issue is the CorePropertiesLocator trying to create properties before the instanceDir is created. And collection creation process will done and looks fine at runtime, but it will cause issues (cores are not auto discovered after server restart). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2899) Add OpenNLP Analysis capabilities as a module
[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726221#comment-13726221 ] Andrew Janowczyk commented on LUCENE-2899: -- ahhh thanks for the info. i found a relevant link discussing the licenses which clearly explains the details [here|http://www.apache.org/licenses/GPL-compatibility.html]. oh well, it was worth a try :) Add OpenNLP Analysis capabilities as a module - Key: LUCENE-2899 URL: https://issues.apache.org/jira/browse/LUCENE-2899 Project: Lucene - Core Issue Type: New Feature Components: modules/analysis Reporter: Grant Ingersoll Assignee: Grant Ingersoll Priority: Minor Fix For: 5.0, 4.5 Attachments: LUCENE-2899-current.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899.patch, LUCENE-2899-RJN.patch, LUCENE-2899-x.patch, LUCENE-2899-x.patch, LUCENE-2899-x.patch, OpenNLPFilter.java, OpenNLPFilter.java, OpenNLPTokenizer.java, opennlp_trunk.patch Now that OpenNLP is an ASF project and has a nice license, it would be nice to have a submodule (under analysis) that exposed capabilities for it. Drew Farris, Tom Morton and I have code that does: * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it would have to change slightly to buffer tokens) * NamedEntity recognition as a TokenFilter We are also planning a Tokenizer/TokenFilter that can put parts of speech as either payloads (PartOfSpeechAttribute?) on a token or at the same position. I'd propose it go under: modules/analysis/opennlp -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: VInt block length in Lucene 4.1 postings format
Hi Aleksandra, The PostingsReader uses a skip list to determine the start file pointer of each block (both FOR packed and vInt encoded). The information is currently maintained by Lucene41SkipReader. The tricky part is that, for each term, the skip data is exactly at the end of the TermFreqs blocks, so if you fetch the startFP for the vInt block, and know the docTermStartOffset + skipOffset for the current term, you can calculate what you need. http://lucene.apache.org/core/4_4_0/core/org/apache/lucene/codecs/lucene41/Lucene41PostingsFormat.html#Frequencies On Thu, Aug 1, 2013 at 4:20 PM, Aleksandra Woźniak aleksandra.k.wozn...@gmail.com wrote: Hi all, recently I wanted to try out some modifications of Lucene's postings format (namely, copying blocks that have no deletions without int-decoding/encoding -- this is similar to what was described here: https://issues.apache.org/jira/browse/LUCENE-2082). I started with changing the Lucene 4.1 postings format to check what can be done there. I came across the following problem: in Lucene41PostingsReader the length (number of bytes) of the last, vInt-encoded block of postings is not known before all individual postings are read and decoded. When reading this block we only know the number of postings that should be read and decoded -- since vInts have different sizes by definition. If I want to copy the whole block without vInt decoding/encoding, I need to know how many bytes I have to read from the postings index input. So, my question is: is there a clean way to determine the length of this block (i.e. the number of bytes that this block has)? Is the number of bytes in a posting list tracked somewhere in the Lucene 4.1 postings format? Thanks, Aleksandra -- Han Jiang Team of Search Engine and Web Mining, School of Electronic Engineering and Computer Science, Peking University, China - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
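For what it's worth, here is a sketch of the arithmetic being described, assuming the per-term .doc layout is packed FOR blocks, then the vInt-encoded tail block, then the skip data (which is only present for terms long enough to have skip data). This is an interpretation of the comment above, not code from Lucene, and all names are hypothetical.

{code}
// Sketch of the calculation described above - an interpretation of the
// comment, not actual Lucene code; all names are hypothetical.
//
// Assumed per-term layout of the .doc file in the Lucene 4.1 postings format:
//   [packed FOR blocks ...][vInt-encoded tail block][skip data]
public final class VIntTailBlockLength {

  /**
   * @param docTermStartFP   file pointer where this term's postings begin
   * @param skipOffset       offset of the skip data, relative to docTermStartFP
   * @param vIntBlockStartFP file pointer where the vInt tail block begins
   *                         (known once the packed blocks have been read or skipped)
   * @return number of bytes in the vInt tail block, so it could be copied
   *         without decoding the individual vInts
   */
  public static long tailBlockLength(long docTermStartFP, long skipOffset, long vIntBlockStartFP) {
    long skipDataStartFP = docTermStartFP + skipOffset;
    return skipDataStartFP - vIntBlockStartFP;
  }
}
{code}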
[jira] [Commented] (SOLR-5099) The core.properties not created during collection creation
[ https://issues.apache.org/jira/browse/SOLR-5099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726244#comment-13726244 ] Alan Woodward commented on SOLR-5099: - This is because creating a core in normal mode requires that the instance dir is already present, but creation via SolrCloud allows you to load all config from zookeeper, and so doesn't need an actual instance dir. Nice catch. I'll add a test for the Collections API as well. The core.properties not created during collection creation -- Key: SOLR-5099 URL: https://issues.apache.org/jira/browse/SOLR-5099 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.5, 5.0 Reporter: Herb Jiang Assignee: Alan Woodward Priority: Critical Attachments: CorePropertiesLocator.java.patch When using the new solr.xml structure, the core auto-discovery mechanism tries to find core.properties. But I found that core.properties cannot be created when I dynamically create a collection. The root issue is that CorePropertiesLocator tries to create the properties file before the instanceDir is created. The collection creation process completes and looks fine at runtime, but it will cause issues (cores are not auto-discovered after server restart). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
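The attached patch is not reproduced here, but the essence of the fix described above is presumably to make sure the instance directory exists before writing core.properties into it. A minimal, hypothetical sketch (class and method names invented):

{code}
// Hypothetical sketch of the kind of fix described above, not the attached
// CorePropertiesLocator.java.patch: create the instance directory (and any
// missing parents) before trying to write core.properties into it.
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Properties;

public class CorePropertiesWriter {
  public static void writeCoreProperties(File instanceDir, Properties coreProps) throws IOException {
    // Collection creation via SolrCloud can reach this point before the
    // instance dir exists on disk, so create it first.
    if (!instanceDir.exists() && !instanceDir.mkdirs()) {
      throw new IOException("Could not create instance dir " + instanceDir);
    }
    File propsFile = new File(instanceDir, "core.properties");
    try (OutputStream out = new FileOutputStream(propsFile)) {
      coreProps.store(out, "Written by CorePropertiesWriter (sketch)");
    }
  }
}
{code}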
[jira] [Commented] (LUCENE-5155) Add OrdinalValueResolver in favor of FacetRequest.getValueOf
[ https://issues.apache.org/jira/browse/LUCENE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726262#comment-13726262 ] Gilad Barkai commented on LUCENE-5155: -- Patch looks good. +1 for commit. Perhaps also document that FRNode is now comparable? Add OrdinalValueResolver in favor of FacetRequest.getValueOf Key: LUCENE-5155 URL: https://issues.apache.org/jira/browse/LUCENE-5155 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Attachments: LUCENE-5155.patch FacetRequest.getValueOf is responsible for resolving an ordinal's value. It is given FacetArrays, and typically does something like {{arrays.getIntArray()[ord]}} -- for every ordinal! The purpose of this method is to allow special requests, e.g. average, to do some post processing on the values, that couldn't be done during aggregation. I feel that getValueOf is in the wrong place -- the calls to getInt/FloatArray are really redundant. Also, if an aggregator maintains some statistics by which it needs to correct the aggregated values, it's not trivial to pass it from the aggregator to the request. Therefore I would like to make the following changes: * Remove FacetRequest.getValueOf and .getFacetArraysSource * Add FacetsAggregator.createOrdinalValueResolver which takes the FacetArrays and has a simple API .valueOf(ordinal). * Modify the FacetResultHandlers to use OrdValResolver. This allows an OVR to initialize the right array instance(s) in the ctor, and return the value of the requested ordinal, without doing arrays.getArray() calls. Will post a patch shortly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
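To make the proposed API concrete, here is a rough, hypothetical sketch of what an OrdinalValueResolver produced by a FacetsAggregator might look like; it is an illustration of the description above, not the committed patch. The point is that the backing array is fetched once in the constructor, so valueOf(ordinal) is a plain array lookup rather than a per-ordinal getIntArray()/getFloatArray() call.

{code}
// Hypothetical sketch of the proposed API (not the committed patch): the
// resolver captures the backing array once, up front, so valueOf(ordinal)
// needs no further calls into FacetArrays.
public abstract class OrdinalValueResolver {

  /** Returns the aggregated value of the given ordinal. */
  public abstract double valueOf(int ordinal);

  /** Resolver over an int[] of counts, e.g. from a counting aggregator. */
  public static final class IntValueResolver extends OrdinalValueResolver {
    private final int[] values;

    public IntValueResolver(int[] values) {
      // Capture the array once instead of fetching it per ordinal.
      this.values = values;
    }

    @Override
    public double valueOf(int ordinal) {
      return values[ordinal];
    }
  }
}
{code}

A post-processing aggregator (average, for example) could return a resolver that divides by a statistic it tracked during aggregation, which is exactly the value-correction problem the description says was awkward to express through FacetRequest.getValueOf.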
[jira] [Commented] (LUCENE-5155) Add OrdinalValueResolver in favor of FacetRequest.getValueOf
[ https://issues.apache.org/jira/browse/LUCENE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726279#comment-13726279 ] ASF subversion and git services commented on LUCENE-5155: - Commit 1509152 from [~shaie] in branch 'dev/trunk' [ https://svn.apache.org/r1509152 ] LUCENE-5155: add OrdinalValueResolver Add OrdinalValueResolver in favor of FacetRequest.getValueOf Key: LUCENE-5155 URL: https://issues.apache.org/jira/browse/LUCENE-5155 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Attachments: LUCENE-5155.patch FacetRequest.getValueOf is responsible for resolving an ordinal's value. It is given FacetArrays, and typically does something like {{arrays.getIntArray()[ord]}} -- for every ordinal! The purpose of this method is to allow special requests, e.g. average, to do some post processing on the values, that couldn't be done during aggregation. I feel that getValueOf is in the wrong place -- the calls to getInt/FloatArray are really redundant. Also, if an aggregator maintains some statistics by which it needs to correct the aggregated values, it's not trivial to pass it from the aggregator to the request. Therefore I would like to make the following changes: * Remove FacetRequest.getValueOf and .getFacetArraysSource * Add FacetsAggregator.createOrdinalValueResolver which takes the FacetArrays and has a simple API .valueOf(ordinal). * Modify the FacetResultHandlers to use OrdValResolver. This allows an OVR to initialize the right array instance(s) in the ctor, and return the value of the requested ordinal, without doing arrays.getArray() calls. Will post a patch shortly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-5155) Add OrdinalValueResolver in favor of FacetRequest.getValueOf
[ https://issues.apache.org/jira/browse/LUCENE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-5155. Resolution: Fixed Fix Version/s: 4.5 5.0 Thanks Gilad, added a comment and committed. Add OrdinalValueResolver in favor of FacetRequest.getValueOf Key: LUCENE-5155 URL: https://issues.apache.org/jira/browse/LUCENE-5155 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Fix For: 5.0, 4.5 Attachments: LUCENE-5155.patch FacetRequest.getValueOf is responsible for resolving an ordinal's value. It is given FacetArrays, and typically does something like {{arrays.getIntArray()[ord]}} -- for every ordinal! The purpose of this method is to allow special requests, e.g. average, to do some post processing on the values, that couldn't be done during aggregation. I feel that getValueOf is in the wrong place -- the calls to getInt/FloatArray are really redundant. Also, if an aggregator maintains some statistics by which it needs to correct the aggregated values, it's not trivial to pass it from the aggregator to the request. Therefore I would like to make the following changes: * Remove FacetRequest.getValueOf and .getFacetArraysSource * Add FacetsAggregator.createOrdinalValueResolver which takes the FacetArrays and has a simple API .valueOf(ordinal). * Modify the FacetResultHandlers to use OrdValResolver. This allows an OVR to initialize the right array instance(s) in the ctor, and return the value of the requested ordinal, without doing arrays.getArray() calls. Will post a patch shortly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5155) Add OrdinalValueResolver in favor of FacetRequest.getValueOf
[ https://issues.apache.org/jira/browse/LUCENE-5155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726283#comment-13726283 ] ASF subversion and git services commented on LUCENE-5155: - Commit 1509154 from [~shaie] in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1509154 ] LUCENE-5155: add OrdinalValueResolver Add OrdinalValueResolver in favor of FacetRequest.getValueOf Key: LUCENE-5155 URL: https://issues.apache.org/jira/browse/LUCENE-5155 Project: Lucene - Core Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Attachments: LUCENE-5155.patch FacetRequest.getValueOf is responsible for resolving an ordinal's value. It is given FacetArrays, and typically does something like {{arrays.getIntArray()[ord]}} -- for every ordinal! The purpose of this method is to allow special requests, e.g. average, to do some post processing on the values, that couldn't be done during aggregation. I feel that getValueOf is in the wrong place -- the calls to getInt/FloatArray are really redundant. Also, if an aggregator maintains some statistics by which it needs to correct the aggregated values, it's not trivial to pass it from the aggregator to the request. Therefore I would like to make the following changes: * Remove FacetRequest.getValueOf and .getFacetArraysSource * Add FacetsAggregator.createOrdinalValueResolver which takes the FacetArrays and has a simple API .valueOf(ordinal). * Modify the FacetResultHandlers to use OrdValResolver. This allows an OVR to initialize the right array instance(s) in the ctor, and return the value of the requested ordinal, without doing arrays.getArray() calls. Will post a patch shortly. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5091) Clean up Servlets APIs, Kill SolrDispatchFilter, simplify API creation
[ https://issues.apache.org/jira/browse/SOLR-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726285#comment-13726285 ] Markus Jelsma commented on SOLR-5091: - Can you include SOLR-4018 if you're replacing the dispatch filter or i'll have to keep updating it as trunk progresses :) Clean up Servlets APIs, Kill SolrDispatchFilter, simplify API creation -- Key: SOLR-5091 URL: https://issues.apache.org/jira/browse/SOLR-5091 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll Assignee: Grant Ingersoll Fix For: 5.0 This is an issue to track a series of sub issues related to deprecated and crufty Servlet/REST API code. I'll create sub-tasks to manage them. # Clean up all the old UI stuff (old redirects) # Kill/Simplify SolrDispatchFilter -- for instance, why not make the user always have a core name in 5.0? i.e. /collection1 is the default core ## I'd like to move to just using Guice's servlet extension to do this, which, I think will also make it easier to run Solr in other containers (i.e. non-servlet environments) due to the fact that you don't have to tie the request handling logic specifically to a Servlet. # Simplify the creation and testing of REST and other APIs via Guice + Restlet, which I've done on a number of occasions. ## It might be also possible to move all of the APIs onto Restlet and maintain back compat through a simple restlet proxy (still exploring this). This would also have the benefit of abstracting the core request processing out of the Servlet context and make that an implementation detail. ## Moving to Guice, IMO, will make it easier to isolate and test individual components by being able to inject mocks easier. I am close to a working patch for some of this. I will post incremental updates/issues as I move forward on this, but I think we should take 5.x as an opportunity to be more agnostic of container and I believe the approach I have in mind will do so. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5091) Clean up Servlets APIs, Kill SolrDispatchFilter, simplify API creation
[ https://issues.apache.org/jira/browse/SOLR-5091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726287#comment-13726287 ] Grant Ingersoll commented on SOLR-5091: --- I'll see what I can do. Clean up Servlets APIs, Kill SolrDispatchFilter, simplify API creation -- Key: SOLR-5091 URL: https://issues.apache.org/jira/browse/SOLR-5091 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll Assignee: Grant Ingersoll Fix For: 5.0 This is an issue to track a series of sub issues related to deprecated and crufty Servlet/REST API code. I'll create sub-tasks to manage them. # Clean up all the old UI stuff (old redirects) # Kill/Simplify SolrDispatchFilter -- for instance, why not make the user always have a core name in 5.0? i.e. /collection1 is the default core ## I'd like to move to just using Guice's servlet extension to do this, which, I think will also make it easier to run Solr in other containers (i.e. non-servlet environments) due to the fact that you don't have to tie the request handling logic specifically to a Servlet. # Simplify the creation and testing of REST and other APIs via Guice + Restlet, which I've done on a number of occasions. ## It might be also possible to move all of the APIs onto Restlet and maintain back compat through a simple restlet proxy (still exploring this). This would also have the benefit of abstracting the core request processing out of the Servlet context and make that an implementation detail. ## Moving to Guice, IMO, will make it easier to isolate and test individual components by being able to inject mocks easier. I am close to a working patch for some of this. I will post incremental updates/issues as I move forward on this, but I think we should take 5.x as an opportunity to be more agnostic of container and I believe the approach I have in mind will do so. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5102) Simplify Solr Home
Grant Ingersoll created SOLR-5102: - Summary: Simplify Solr Home Key: SOLR-5102 URL: https://issues.apache.org/jira/browse/SOLR-5102 Project: Solr Issue Type: Bug Reporter: Grant Ingersoll Assignee: Grant Ingersoll Fix For: 5.0 I think for 5.0, we should re-think some of the variations we support around things like Solr Home, etc. We have a fair bit of code, I suspect that could just go away if make it easier by assuming there is a single solr home where everything lives. The notion of making that stuff configurable has outlived its usefulness -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5103) Plugin Improvements
Grant Ingersoll created SOLR-5103: - Summary: Plugin Improvements Key: SOLR-5103 URL: https://issues.apache.org/jira/browse/SOLR-5103 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll Assignee: Grant Ingersoll Fix For: 5.0 I think for 5.0, we should make it easier to add plugins by defining a plugin package, à la a Hadoop job jar, which is a self-contained archive of a plugin that can be easily installed (even from the UI!) and configured programmatically. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5104) Remove Default Core
Grant Ingersoll created SOLR-5104: - Summary: Remove Default Core Key: SOLR-5104 URL: https://issues.apache.org/jira/browse/SOLR-5104 Project: Solr Issue Type: Sub-task Reporter: Grant Ingersoll Fix For: 5.0 I see no reason to maintain the notion of a default Core/Collection. We can either default to Collection1, or just simply create a core on the fly based on the client's request. Thus, all APIs that are accessing a core would require the core to be in the address path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4860) MoreLikeThisHandler doesn't work with numeric or date fields in 4.x
[ https://issues.apache.org/jira/browse/SOLR-4860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726314#comment-13726314 ] Mike commented on SOLR-4860: I came across this issue as well, I wanted to use _val_ hook and numeric field values for boosting mlt query via mlt.fl parameter. For regular search (via bf) this approach works just fine. Do you plan to fix this, or I should start working on different solution for my mlt query? What's the probability it will be fixed this year? :) MoreLikeThisHandler doesn't work with numeric or date fields in 4.x --- Key: SOLR-4860 URL: https://issues.apache.org/jira/browse/SOLR-4860 Project: Solr Issue Type: Bug Components: MoreLikeThis Affects Versions: 4.2 Reporter: Thomas Seidl After upgrading to Solr 4.2 (from 3.x), I realized that my MLT queries no longer work. It happens if I pass an integer ({{solr.TrieIntField}}), float ({{solr.TrieFloatField}}) or date ({{solr.DateField}}) field as part of the {{mlt.fl}} parameter. The field's {{multiValued}} setting doesn't seem to matter. This is the error I get: {noformat} NumericTokenStream does not support CharTermAttribute. java.lang.IllegalArgumentException: NumericTokenStream does not support CharTermAttribute. at org.apache.lucene.analysis.NumericTokenStream$NumericAttributeFactory.createAttributeInstance(NumericTokenStream.java:136) at org.apache.lucene.util.AttributeSource.addAttribute(AttributeSource.java:271) at org.apache.lucene.queries.mlt.MoreLikeThis.addTermFrequencies(MoreLikeThis.java:781) at org.apache.lucene.queries.mlt.MoreLikeThis.retrieveTerms(MoreLikeThis.java:724) at org.apache.lucene.queries.mlt.MoreLikeThis.like(MoreLikeThis.java:578) at org.apache.solr.handler.MoreLikeThisHandler$MoreLikeThisHelper.getMoreLikeThis(MoreLikeThisHandler.java:348) at org.apache.solr.handler.MoreLikeThisHandler.handleRequestBody(MoreLikeThisHandler.java:167) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:365) at 
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:926) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:988) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:635) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:679) {noformat} The
[jira] [Commented] (SOLR-5103) Plugin Improvements
[ https://issues.apache.org/jira/browse/SOLR-5103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726362#comment-13726362 ] Grant Ingersoll commented on SOLR-5103: --- https://code.google.com/p/google-guice/wiki/Multibindings has some baseline good ideas in it, see SOLR-5091 as well for how Guice gets brought in. Plugin Improvements --- Key: SOLR-5103 URL: https://issues.apache.org/jira/browse/SOLR-5103 Project: Solr Issue Type: Improvement Reporter: Grant Ingersoll Assignee: Grant Ingersoll Fix For: 5.0 I think for 5.0, we should make it easier to add plugins by defining a plugin package, ala a Hadoop Job jar, which is a self--contained archive of a plugin that can be easily installed (even from the UI!) and configured programmatically. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
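For context on the Multibindings link above: Guice's Multibinder lets independently installed modules contribute implementations to a shared set, which is one plausible way for a dropped-in plugin jar to register its components without touching core configuration. The sketch below is hypothetical; SolrPlugin and the plugin classes are invented names, not Solr APIs.

{code}
// Minimal, hypothetical sketch of Guice Multibindings for plugin registration;
// SolrPlugin and the concrete plugin classes are invented names.
import com.google.inject.AbstractModule;
import com.google.inject.Guice;
import com.google.inject.Injector;
import com.google.inject.Key;
import com.google.inject.TypeLiteral;
import com.google.inject.multibindings.Multibinder;
import java.util.Set;

interface SolrPlugin { String name(); }

class HighlightingPlugin implements SolrPlugin { public String name() { return "highlighting"; } }
class SpellCheckPlugin implements SolrPlugin { public String name() { return "spellcheck"; } }

// Each plugin package could ship a module like this; installing it adds the
// plugins to the shared Set<SolrPlugin> without any central registry edits.
class PluginModule extends AbstractModule {
  @Override protected void configure() {
    Multibinder<SolrPlugin> plugins = Multibinder.newSetBinder(binder(), SolrPlugin.class);
    plugins.addBinding().to(HighlightingPlugin.class);
    plugins.addBinding().to(SpellCheckPlugin.class);
  }
}

public class PluginBootstrap {
  public static void main(String[] args) {
    Injector injector = Guice.createInjector(new PluginModule());
    Set<SolrPlugin> plugins =
        injector.getInstance(Key.get(new TypeLiteral<Set<SolrPlugin>>() {}));
    for (SolrPlugin p : plugins) {
      System.out.println("registered plugin: " + p.name());
    }
  }
}
{code}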
Re: Measuring SOLR performance
Hi Roman, When I try to run with -q /home/dmitry/projects/lab/solrjmeter/queries/demo/demo.queries here what is reported: Traceback (most recent call last): File solrjmeter.py, line 1390, in module main(sys.argv) File solrjmeter.py, line 1309, in main tests = find_tests(options) File solrjmeter.py, line 461, in find_tests with changed_dir(pattern): File /usr/lib/python2.7/contextlib.py, line 17, in __enter__ return self.gen.next() File solrjmeter.py, line 229, in changed_dir os.chdir(new) OSError: [Errno 20] Not a directory: '/home/dmitry/projects/lab/solrjmeter/queries/demo/demo.queries' Best, Dmitry On Wed, Jul 31, 2013 at 7:21 PM, Roman Chyla roman.ch...@gmail.com wrote: Hi Dmitry, probably mistake in the readme, try calling it with -q /home/dmitry/projects/lab/solrjmeter/queries/demo/demo.queries as for the base_url, i was testing it on solr4.0, where it tries contactin /solr/admin/system - is it different for 4.3? I guess I should make it configurable (it already is, the endpoint is set at the check_options()) thanks roman On Wed, Jul 31, 2013 at 10:01 AM, Dmitry Kan solrexp...@gmail.com wrote: Ok, got the error fixed by modifying the base solr ulr in solrjmeter.py (added core name after /solr part). Next error is: WARNING: no test name(s) supplied nor found in: ['/home/dmitry/projects/lab/solrjmeter/demo/queries/demo.queries'] It is a 'slow start with new tool' symptom I guess.. :) On Wed, Jul 31, 2013 at 4:39 PM, Dmitry Kan solrexp...@gmail.com wrote: Hi Roman, What version and config of SOLR does the tool expect? Tried to run, but got: **ERROR** File solrjmeter.py, line 1390, in module main(sys.argv) File solrjmeter.py, line 1296, in main check_prerequisities(options) File solrjmeter.py, line 351, in check_prerequisities error('Cannot contact: %s' % options.query_endpoint) File solrjmeter.py, line 66, in error traceback.print_stack() Cannot contact: http://localhost:8983/solr complains about URL, clicking which leads properly to the admin page... solr 4.3.1, 2 cores shard Dmitry On Wed, Jul 31, 2013 at 3:59 AM, Roman Chyla roman.ch...@gmail.com wrote: Hello, I have been wanting some tools for measuring performance of SOLR, similar to Mike McCandles' lucene benchmark. so yet another monitor was born, is described here: http://29min.wordpress.com/2013/07/31/measuring-solr-query-performance/ I tested it on the problem of garbage collectors (see the blogs for details) and so far I can't conclude whether highly customized G1 is better than highly customized CMS, but I think interesting details can be seen there. Hope this helps someone, and of course, feel free to improve the tool and share! roman
[jira] [Resolved] (SOLR-5100) java.lang.OutOfMemoryError: Requested array size exceeds VM limit
[ https://issues.apache.org/jira/browse/SOLR-5100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-5100. -- Resolution: Invalid Please raise this on the user's list. OOM errors are common when one has not allocated enough heap to the JVM or otherwise tries to do too much with too few resources. The user's list will offer lots of help to change your setup to no longer OOM. java.lang.OutOfMemoryError: Requested array size exceeds VM limit - Key: SOLR-5100 URL: https://issues.apache.org/jira/browse/SOLR-5100 Project: Solr Issue Type: Bug Affects Versions: 4.2.1 Environment: Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64 GNU/Linux Java 7, Tomcat, ZK standalone Reporter: Grzegorz Sobczyk Today I found exception in log (lmsiprse01): {code} sie 01, 2013 5:27:26 AM org.apache.solr.core.SolrCore execute INFO: [products] webapp=/solr path=/select params={facet=truestart=0q=facet.limit=-1facet.field=attribute_u-typfacet.field=attribute_u-gama-kolorystycznafacet.field=brand_namewt=javabinfq=node_id:1056version=2rows=0} hits=1241 status=0 QTime=33 sie 01, 2013 5:27:26 AM org.apache.solr.common.SolrException log SEVERE: null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Requested array size exceeds VM limit at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:653) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:366) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489) at java.lang.Thread.run(Thread.java:724) Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit at org.apache.lucene.util.PriorityQueue.init(PriorityQueue.java:64) at org.apache.lucene.util.PriorityQueue.init(PriorityQueue.java:37) at org.apache.solr.handler.component.ShardFieldSortedHitQueue.init(ShardDoc.java:113) at org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:766) at org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:625) at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:604) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:311) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) ... 
13 more {code} We have: * 3x standalone zK * 3x Solr 4.2.1 on Tomcat Exception shows up after leader was stopped: * lmsiprse01: [2013-08-01 05:23:43]: /etc/init.d/tomcat6-1 stop [2013-08-01 05:25:09]: /etc/init.d/tomcat6-1 start * lmsiprse02 (leader): 2013-08-01 05:27:21]: /etc/init.d/tomcat6-1 stop 2013-08-01 05:29:31]: /etc/init.d/tomcat6-1 start * lmsiprse03: [2013-08-01 05:25:48]: /etc/init.d/tomcat6-1 stop [2013-08-01 05:26:42]: /etc/init.d/tomcat6-1 start -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-5101) Invalid UTF-8 character 0xfffe during shard update
[ https://issues.apache.org/jira/browse/SOLR-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erick Erickson resolved SOLR-5101. -- Resolution: Invalid Please raise this on the user's list and verify that it is indeed a bug before raising a JIRA. Offhand this sounds like a configuration error in your servlet container, but that's just a guess. Invalid UTF-8 character 0xfffe during shard update -- Key: SOLR-5101 URL: https://issues.apache.org/jira/browse/SOLR-5101 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.3 Environment: Ubuntu 12.04.2 java version 1.6.0_27 OpenJDK Runtime Environment (IcedTea6 1.12.5) (6b27-1.12.5-0ubuntu0.12.04.1) OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode) Reporter: Federico Chiacchiaretta On data import from a PostgreSQL db, I get the following error in solr.log: ERROR - 2013-08-01 09:51:00.217; org.apache.solr.common.SolrException; shard update error RetryNode: http://172.16.201.173:8983/solr/archive/:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Invalid UTF-8 character 0xfffe at char #416, byte #127) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:402) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:332) at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:306) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:679) This prevents the document from being successfully added to the index, and a few documents targeting the same shard are also missing. This happens silently, because data import completes successfully, and the whole number of documents reported as Added includes those who failed (and are actually lost). Is there a known workaround for this issue? Regards, Federico Chiacchiaretta -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
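As an aside for readers hitting the same error: this is not a fix for whatever mangles the data upstream, but a common client-side workaround is to strip characters that are not legal in XML payloads (such as U+FFFE) from field values before sending documents. A minimal, hypothetical sketch:

{code}
// Hypothetical client-side workaround sketch (not a Solr fix): drop
// non-characters such as U+FFFE/U+FFFF from field values before they are
// sent to Solr in an XML update request.
public class XmlSafeStrings {
  public static String stripInvalidChars(String value) {
    if (value == null) {
      return null;
    }
    StringBuilder cleaned = new StringBuilder(value.length());
    for (int i = 0; i < value.length(); i++) {
      char c = value.charAt(i);
      if (c != '\uFFFE' && c != '\uFFFF') {
        cleaned.append(c);
      }
    }
    return cleaned.toString();
  }

  public static void main(String[] args) {
    String dirty = "broken\uFFFEvalue";
    System.out.println(stripInvalidChars(dirty)); // prints "brokenvalue"
  }
}
{code}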
Re: Welcome Cassandra Targett as Lucene/Solr committer
Welcome Cassandra! Mike McCandless http://blog.mikemccandless.com On Wed, Jul 31, 2013 at 6:47 PM, Robert Muir rcm...@gmail.com wrote: I'm pleased to announce that Cassandra Targett has accepted to join our ranks as a committer. Cassandra worked on the donation of the new Solr Reference Guide [1] and getting things in order for its first official release [2]. Cassandra, it is tradition that you introduce yourself with a brief bio. Welcome! P.S. As soon as your SVN access is setup, you should then be able to add yourself to the committers list on the website as well. [1] https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5105) Merge CoreAdmin and Collections API
Alan Woodward created SOLR-5105: --- Summary: Merge CoreAdmin and Collections API Key: SOLR-5105 URL: https://issues.apache.org/jira/browse/SOLR-5105 Project: Solr Issue Type: Improvement Reporter: Alan Woodward Fix For: 5.0 For 5.0, we should remove the distinction between the Core Admin API and the Collections API. It's confusing for users, and adds unnecessary complexity and duplication to the core code. * Under the hood, the AdminHandlers should just be deserializing the various core parameters and then passing them onto the CoreContainer to do the actual work. * The CoreContainer API can be cleaned up (need a distinction between loading existing cores and creating new ones, remove the various 'registerCore' methods) * ZkContainer should become a subclass of CoreContainer (maybe CloudCoreContainer?) and deal with the zookeeper interactions, while the base class deals with local cores. * The CoreContainer should be dealing with all core name logic (aliases, collections, etc). This will have the nice side-effect of simplifying the core dispatch logic as well. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5152) Lucene FST is not immutale
[ https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-5152: Attachment: LUCENE-5152.patch Here is a patch that adds a #deepCopy method to Outputs that allows me to do a deep copy if the actual arc that is returned is a cached root arc. I think we should never return a pointer into the root arcs though. This is way too dangerous! I haven't run any perf tests; will do once I am on my workstation again.. if somebody beats me go ahead! Lucene FST is not immutale -- Key: LUCENE-5152 URL: https://issues.apache.org/jira/browse/LUCENE-5152 Project: Lucene - Core Issue Type: Bug Components: core/FSTs Affects Versions: 4.4 Reporter: Simon Willnauer Priority: Blocker Fix For: 5.0, 4.5 Attachments: LUCENE-5152.patch, LUCENE-5152.patch a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned output from and FST (BytesRef) which caused sideffects in later execution. I added an assertion into the FST that checks if a cached root arc is modified and in-fact this happens for instance in our MemoryPostingsFormat and I bet we find more places. We need to think about how to make this less trappy since it can cause bugs that are super hard to find. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
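As a rough illustration of the defensive-copy idea above (a sketch under assumed names, not the attached LUCENE-5152 patch, which adds #deepCopy to Outputs): hand callers their own bytes instead of a pointer into the cached root arcs.
{code}
// Sketch only: 'arc' is a hypothetical cached root arc whose output is a BytesRef.
BytesRef cached = arc.output;                 // bytes shared with the FST's root-arc cache
BytesRef copy = BytesRef.deepCopyOf(cached);  // fresh byte[] owned by the caller
copy.bytes[copy.offset] = (byte) 0x7f;        // mutating the copy...
// ...leaves 'cached' untouched, so later lookups still see the original output
{code}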
[jira] [Commented] (SOLR-5101) Invalid UTF-8 character 0xfffe during shard update
[ https://issues.apache.org/jira/browse/SOLR-5101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726408#comment-13726408 ] Federico Chiacchiaretta commented on SOLR-5101: --- Hi Erick, I'll post this on the user's list and I'll be back here when I have an update. Regarding servlet container config, I'm using included jetty stock configuration. Invalid UTF-8 character 0xfffe during shard update -- Key: SOLR-5101 URL: https://issues.apache.org/jira/browse/SOLR-5101 Project: Solr Issue Type: Bug Components: clients - java Affects Versions: 4.3 Environment: Ubuntu 12.04.2 java version 1.6.0_27 OpenJDK Runtime Environment (IcedTea6 1.12.5) (6b27-1.12.5-0ubuntu0.12.04.1) OpenJDK 64-Bit Server VM (build 20.0-b12, mixed mode) Reporter: Federico Chiacchiaretta On data import from a PostgreSQL db, I get the following error in solr.log: ERROR - 2013-08-01 09:51:00.217; org.apache.solr.common.SolrException; shard update error RetryNode: http://172.16.201.173:8983/solr/archive/:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Invalid UTF-8 character 0xfffe at char #416, byte #127) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:402) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:332) at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:306) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:679) This prevents the document from being successfully added to the index, and a few documents targeting the same shard are also missing. This happens silently, because data import completes successfully, and the whole number of documents reported as Added includes those who failed (and are actually lost). Is there a known workaround for this issue? Regards, Federico Chiacchiaretta -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
FlushPolicy and maxBufDelTerm
Hi I'm a little confused about FlushPolicy and IndexWriterConfig.setMaxBufferedDeleteTerms documentation. FlushPolicy jdocs say: * Segments are traditionally flushed by: * <ul> * <li>RAM consumption - configured via ... * <li>*Number of buffered delete terms/queries* - configured via * {@link IndexWriterConfig#setMaxBufferedDeleteTerms(int)}</li> * </ul> Yet IWC.setMaxBufDelTerm says: NOTE: This setting won't trigger a segment flush. And FlushByRamOrCountPolicy says: * <li>{@link #onDelete(DocumentsWriterFlushControl, DocumentsWriterPerThreadPool.ThreadState)} - flushes * based on the global number of buffered delete terms iff * {@link IndexWriterConfig#getMaxBufferedDeleteTerms()} is enabled</li> Confused, I wrote a short unit test: public void testMaxBufDelTerm() throws Exception { Directory dir = new RAMDirectory(); IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random())); conf.setMaxBufferedDeleteTerms(1); conf.setMaxBufferedDocs(10); conf.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH); conf.setInfoStream(new PrintStreamInfoStream(System.out)); IndexWriter writer = new IndexWriter(dir, conf); int numDocs = 4; for (int i = 0; i < numDocs; i++) { Document doc = new Document(); doc.add(new StringField("id", "doc-" + i, Store.NO)); writer.addDocument(doc); } System.out.println("before delete"); for (String f : dir.listAll()) System.out.println(f); writer.deleteDocuments(new Term("id", "doc-0")); writer.deleteDocuments(new Term("id", "doc-1")); System.out.println("\nafter delete"); for (String f : dir.listAll()) System.out.println(f); writer.close(); dir.close(); } When InfoStream is turned on, I can see messages regarding terms flushing (vs if I comment the .setMaxBufDelTerm line), so I know this setting takes effect. Yet both before and after the delete operations, the dir.list() returns only the fdx and fdt files. So is this a bug that a segment isn't flushed? If not (and I'm ok with that), is it a documentation inconsistency? Strangely, I think, if the delTerms RAM accounting exhausts max-RAM-buffer size, a new segment will be deleted? Slightly unrelated to FlushPolicy, but do I understand correctly that maxBufDelTerm does not apply to delete-by-query operations? BufferedDeletes doesn't increment any counter on addQuery(), so is it correct to assume that if I only delete-by-query, this setting has no effect? And the delete queries are buffered until the next segment is flushed due to other operations (constraints, commit, NRT-reopen)? Shai
Re: FlushPolicy and maxBufDelTerm
bq. a new segment will be deleted? I mean a new segment will be flushed :). Shai On Thu, Aug 1, 2013 at 4:03 PM, Shai Erera ser...@gmail.com wrote: Hi I'm a little confused about FlushPolicy and IndexWriterConfig.setMaxBufferedDeleteTerms documentation. FlushPolicy jdocs say: * Segments are traditionally flushed by: * ul * liRAM consumption - configured via ... * li*Number of buffered delete terms/queries* - configured via * {@link IndexWriterConfig#setMaxBufferedDeleteTerms(int)}/li * /ul Yet IWC.setMaxBufDelTerm says: NOTE: This setting won't trigger a segment flush. And FlushByRamOrCountPolicy says: * li{@link #onDelete(DocumentsWriterFlushControl, DocumentsWriterPerThreadPool.ThreadState)} - flushes * based on the global number of buffered delete terms iff * {@link IndexWriterConfig#getMaxBufferedDeleteTerms()} is enabled/li Confused, I wrote a short unit test: public void testMaxBufDelTerm() throws Exception { Directory dir = new RAMDirectory(); IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random())); conf.setMaxBufferedDeleteTerms(1); conf.setMaxBufferedDocs(10); conf.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH); conf.setInfoStream(new PrintStreamInfoStream(System.out)); IndexWriter writer = new IndexWriter(dir, conf ); int numDocs = 4; for (int i = 0; i numDocs; i++) { Document doc = new Document(); doc.add(new StringField(id, doc- + i, Store.NO)); writer.addDocument(doc); } System.out.println(before delete); for (String f : dir.listAll()) System.out.println(f); writer.deleteDocuments(new Term(id, doc-0)); writer.deleteDocuments(new Term(id, doc-1)); System.out.println(\nafter delete); for (String f : dir.listAll()) System.out.println(f); writer.close(); dir.close(); } When InfoStream is turned on, I can see messages regarding terms flushing (vs if I comment the .setMaxBufDelTerm line), so I know this settings takes effect. Yet both before and after the delete operations, the dir.list() returns only the fdx and fdt files. So is this a bug that a segment isn't flushed? If not (and I'm ok with that), is it a documentation inconsistency? Strangely, I think, if the delTerms RAM accounting exhausts max-RAM-buffer size, a new segment will be deleted? Slightly unrelated to FlushPolicy, but do I understand correctly that maxBufDelTerm does not apply to delete-by-query operations? BufferedDeletes doesn't increment any counter on addQuery(), so is it correct to assume that if I only delete-by-query, this setting has no effect? And the delete queries are buffered until the next segment is flushed due to other operations (constraints, commit, NRT-reopen)? Shai
[jira] [Updated] (SOLR-5057) queryResultCache should not related with the order of fq's list
[ https://issues.apache.org/jira/browse/SOLR-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huangfeihong updated SOLR-5057: --- Attachment: SOLR-5057.patch queryResultCache should not related with the order of fq's list --- Key: SOLR-5057 URL: https://issues.apache.org/jira/browse/SOLR-5057 Project: Solr Issue Type: Improvement Components: search Affects Versions: 4.0, 4.1, 4.2, 4.3 Reporter: Feihong Huang Assignee: Erick Erickson Priority: Minor Attachments: SOLR-5057.patch, SOLR-5057.patch, SOLR-5057.patch Original Estimate: 48h Remaining Estimate: 48h There are two case query with the same meaning below. But the case2 can't use the queryResultCache when case1 is executed. case1: q=*:*fq=field1:value1fq=field2:value2 case2: q=*:*fq=field2:value2fq=field1:value1 I think queryResultCache should not be related with the order of fq's list. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5057) queryResultCache should not be related with the order of fq's list
[ https://issues.apache.org/jira/browse/SOLR-5057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726425#comment-13726425 ] huangfeihong commented on SOLR-5057: Patch attached. Just renamed several variable names, using Yonik's code. queryResultCache should not be related with the order of fq's list --- Key: SOLR-5057 URL: https://issues.apache.org/jira/browse/SOLR-5057 Project: Solr Issue Type: Improvement Components: search Affects Versions: 4.0, 4.1, 4.2, 4.3 Reporter: Feihong Huang Assignee: Erick Erickson Priority: Minor Attachments: SOLR-5057.patch, SOLR-5057.patch, SOLR-5057.patch Original Estimate: 48h Remaining Estimate: 48h There are two cases of queries with the same meaning below. But case2 can't use the queryResultCache when case1 is executed. case1: q=*:*&fq=field1:value1&fq=field2:value2 case2: q=*:*&fq=field2:value2&fq=field1:value1 I think queryResultCache should not be related with the order of fq's list. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
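Very roughly, the idea behind the fix (an illustrative sketch with hypothetical helper names, assuming java.util.List and org.apache.lucene.search.Query; the attached patch may differ in detail) is to hash and compare the fq list as an unordered collection, so that q=*:*&fq=field1:value1&fq=field2:value2 and q=*:*&fq=field2:value2&fq=field1:value1 produce the same queryResultCache key:
{code}
// Sketch only: order-insensitive hash and equality over the filter list.
static int unorderedFiltersHash(List<Query> filters) {
  int h = 0;
  for (Query fq : filters) {
    h += fq.hashCode(); // addition is commutative, so fq order is irrelevant
  }
  return h;
}

static boolean unorderedFiltersEqual(List<Query> a, List<Query> b) {
  // fq lists are small and normally duplicate-free, so size plus containment
  // is enough for a sketch; a strict multiset comparison would be safer.
  return a.size() == b.size() && a.containsAll(b);
}
{code}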
[jira] [Commented] (LUCENE-5152) Lucene FST is not immutale
[ https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726429#comment-13726429 ] Jack Krupansky commented on LUCENE-5152: bq. immutale Is that the Latin term for immutable?? (spelling in summary line) Lucene FST is not immutale -- Key: LUCENE-5152 URL: https://issues.apache.org/jira/browse/LUCENE-5152 Project: Lucene - Core Issue Type: Bug Components: core/FSTs Affects Versions: 4.4 Reporter: Simon Willnauer Priority: Blocker Fix For: 5.0, 4.5 Attachments: LUCENE-5152.patch, LUCENE-5152.patch a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned output from and FST (BytesRef) which caused sideffects in later execution. I added an assertion into the FST that checks if a cached root arc is modified and in-fact this happens for instance in our MemoryPostingsFormat and I bet we find more places. We need to think about how to make this less trappy since it can cause bugs that are super hard to find. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2894) Implement distributed pivot faceting
[ https://issues.apache.org/jira/browse/SOLR-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726461#comment-13726461 ] Stein J. Gran commented on SOLR-2894: - I have now re-tested the scenarios I used on April 10th (see my comment above from that date), and all of those issues I found then are now resolved :-) I applied the July 25th patch to the lucene_solr_4_4 branch (Github) and performed the tests on this version. Well done Andrew :-) Thumbs up from me. Implement distributed pivot faceting Key: SOLR-2894 URL: https://issues.apache.org/jira/browse/SOLR-2894 Project: Solr Issue Type: Improvement Reporter: Erik Hatcher Fix For: 4.5 Attachments: SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894.patch, SOLR-2894-reworked.patch Following up on SOLR-792, pivot faceting currently only supports undistributed mode. Distributed pivot faceting needs to be implemented. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5106) Grouping on multi-valued fields
Reinier Battenberg created SOLR-5106: Summary: Grouping on multi-valued fields Key: SOLR-5106 URL: https://issues.apache.org/jira/browse/SOLR-5106 Project: Solr Issue Type: Improvement Reporter: Reinier Battenberg Priority: Minor The Wiki page for FieldCollapsing mentions that Support for grouping on a multi-valued field has not yet been implemented. This issue is to document that implementation. http://wiki.apache.org/solr/FieldCollapsing#line-158 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5104) Remove Default Core
[ https://issues.apache.org/jira/browse/SOLR-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726483#comment-13726483 ] Jack Krupansky commented on SOLR-5104: -- Minor procedural nit... If you intend to remove a feature, deprecate it first (like, in 4.5.) Thanks! Remove Default Core --- Key: SOLR-5104 URL: https://issues.apache.org/jira/browse/SOLR-5104 Project: Solr Issue Type: Sub-task Reporter: Grant Ingersoll Fix For: 5.0 I see no reason to maintain the notion of a default Core/Collection. We can either default to Collection1, or just simply create a core on the fly based on the client's request. Thus, all APIs that are accessing a core would require the core to be in the address path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: FlushPolicy and maxBufDelTerm
First off, it's bad that you don't see .del files when conf.setMaxBufferedDeleteTerms is 1. But, it could be that newIndexWriterConfig turned on readerPooling which would mean the deletes are held in the SegmentReader and not flushed to disk. Can you make sure that's off? Second off, I think the doc is correct: a segment will not be flushed; rather, new .del files should appear against older segments. And yes, if RAM usage of the buffered del Term/Query s is too high, then a segment is flushed along with the deletes being applied (creating the .del files). I think buffered delete Querys are not counted towards setMaxBufferedDeleteTerms; so they are only flushed by RAM usage (rough rough estimate) or by other ops (merging, NRT reopen, commit, etc.). Mike McCandless http://blog.mikemccandless.com On Thu, Aug 1, 2013 at 9:03 AM, Shai Erera ser...@gmail.com wrote: Hi I'm a little confused about FlushPolicy and IndexWriterConfig.setMaxBufferedDeleteTerms documentation. FlushPolicy jdocs say: * Segments are traditionally flushed by: * ul * liRAM consumption - configured via ... * liNumber of buffered delete terms/queries - configured via * {@link IndexWriterConfig#setMaxBufferedDeleteTerms(int)}/li * /ul Yet IWC.setMaxBufDelTerm says: NOTE: This setting won't trigger a segment flush. And FlushByRamOrCountPolicy says: * li{@link #onDelete(DocumentsWriterFlushControl, DocumentsWriterPerThreadPool.ThreadState)} - flushes * based on the global number of buffered delete terms iff * {@link IndexWriterConfig#getMaxBufferedDeleteTerms()} is enabled/li Confused, I wrote a short unit test: public void testMaxBufDelTerm() throws Exception { Directory dir = new RAMDirectory(); IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random())); conf.setMaxBufferedDeleteTerms(1); conf.setMaxBufferedDocs(10); conf.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH); conf.setInfoStream(new PrintStreamInfoStream(System.out)); IndexWriter writer = new IndexWriter(dir, conf ); int numDocs = 4; for (int i = 0; i numDocs; i++) { Document doc = new Document(); doc.add(new StringField(id, doc- + i, Store.NO)); writer.addDocument(doc); } System.out.println(before delete); for (String f : dir.listAll()) System.out.println(f); writer.deleteDocuments(new Term(id, doc-0)); writer.deleteDocuments(new Term(id, doc-1)); System.out.println(\nafter delete); for (String f : dir.listAll()) System.out.println(f); writer.close(); dir.close(); } When InfoStream is turned on, I can see messages regarding terms flushing (vs if I comment the .setMaxBufDelTerm line), so I know this settings takes effect. Yet both before and after the delete operations, the dir.list() returns only the fdx and fdt files. So is this a bug that a segment isn't flushed? If not (and I'm ok with that), is it a documentation inconsistency? Strangely, I think, if the delTerms RAM accounting exhausts max-RAM-buffer size, a new segment will be deleted? Slightly unrelated to FlushPolicy, but do I understand correctly that maxBufDelTerm does not apply to delete-by-query operations? BufferedDeletes doesn't increment any counter on addQuery(), so is it correct to assume that if I only delete-by-query, this setting has no effect? And the delete queries are buffered until the next segment is flushed due to other operations (constraints, commit, NRT-reopen)? Shai - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5107) LukeRequestHandler throws NullPointerException when numTerms=0
Ahmet Arslan created SOLR-5107: -- Summary: LukeRequestHandler throws NullPointerException when numTerms=0 Key: SOLR-5107 URL: https://issues.apache.org/jira/browse/SOLR-5107 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Ahmet Arslan Priority: Minor Defaults example http://localhost:8983/solr/collection1/admin/luke?fl=catnumTerms=0 yields {code} ERROR org.apache.solr.core.SolrCore – java.lang.NullPointerException at org.apache.solr.handler.admin.LukeRequestHandler.getDetailedFieldInfo(LukeRequestHandler.java:610) at org.apache.solr.handler.admin.LukeRequestHandler.getIndexedFieldsInfo(LukeRequestHandler.java:378) at org.apache.solr.handler.admin.LukeRequestHandler.handleRequestBody(LukeRequestHandler.java:160) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1845) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:666) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:369) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:368) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:722) {code} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5107) LukeRequestHandler throws NullPointerException when numTerms=0
[ https://issues.apache.org/jira/browse/SOLR-5107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ahmet Arslan updated SOLR-5107: --- Attachment: SOLR-5107.patch LukeRequestHandler throws NullPointerException when numTerms=0 -- Key: SOLR-5107 URL: https://issues.apache.org/jira/browse/SOLR-5107 Project: Solr Issue Type: Bug Affects Versions: 4.4 Reporter: Ahmet Arslan Priority: Minor Attachments: SOLR-5107.patch Defaults example http://localhost:8983/solr/collection1/admin/luke?fl=catnumTerms=0 yields {code} ERROR org.apache.solr.core.SolrCore – java.lang.NullPointerException at org.apache.solr.handler.admin.LukeRequestHandler.getDetailedFieldInfo(LukeRequestHandler.java:610) at org.apache.solr.handler.admin.LukeRequestHandler.getIndexedFieldsInfo(LukeRequestHandler.java:378) at org.apache.solr.handler.admin.LukeRequestHandler.handleRequestBody(LukeRequestHandler.java:160) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1845) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:666) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:369) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:368) at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543) at java.lang.Thread.run(Thread.java:722) {code} -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Welcome Cassandra Targett as Lucene/Solr committer
On 7/31/2013 4:47 PM, Robert Muir wrote: I'm pleased to announce that Cassandra Targett has accepted to join our ranks as a committer. Welcome to the project! - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5104) Remove Default Core
[ https://issues.apache.org/jira/browse/SOLR-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726495#comment-13726495 ] Jack Krupansky commented on SOLR-5104: -- bq. I see no reason to maintain the notion of a default Core/Collection. Okay, here is the reason... It is a great convenience and shortens URLs to make them more readable and easier to type. It greatly facilitates prototyping and experimentation and learning of the basics of Solr. And... compatibility with existing apps. So, this notion that there isn't any reason is complete nonsense. OTOH, maybe you are trying to suggest that there is some reason or valuable benefit to be gained by requiring explicit collection/core name in the URL path. But, you have not done so. Not a hint of any reason or benefit. So, if you do have a reason or perceived benefit for eliminating a great convenience feature, please disclose it. Or... is this not so much an issue of reason as because some code or tool change you are contemplating does not support the kind of flexible URL syntax that Solr supports? Well, if the benefits of the change in technology outweigh the loss of a valuable feature, then that is worth considering, but as of this moment no positive tradeoff has been proposed or established. OTOH, if there were a determined effort to give Solr a full-blown true REST API and THAT was the motive for explicit collection name, I'd be 100% all for it. Side note: Maybe collection1 should become example to make it clear that real apps should assign an app-meaningful name rather than leaving it as collection1. Remove Default Core --- Key: SOLR-5104 URL: https://issues.apache.org/jira/browse/SOLR-5104 Project: Solr Issue Type: Sub-task Reporter: Grant Ingersoll Fix For: 5.0 I see no reason to maintain the notion of a default Core/Collection. We can either default to Collection1, or just simply create a core on the fly based on the client's request. Thus, all APIs that are accessing a core would require the core to be in the address path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
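For readers following along, the convenience under debate looks like this with the stock 4.x example configuration (exact URLs depend on your setup; collection1 is the shipped default core):
{code}
# with a default core configured, both forms reach the same core
http://localhost:8983/solr/select?q=*:*
http://localhost:8983/solr/collection1/select?q=*:*

# what SOLR-5104 proposes: only the explicit form stays valid
http://localhost:8983/solr/collection1/select?q=*:*
{code}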
Re: Welcome Cassandra Targett as Lucene/Solr committer
Thanks everyone. I'm very excited to join you all. I don't know how brief this is, but here's a bit about me: I discovered Solr through my work at LucidWorks. I've had a few different roles there, but most recently I've been the tech writer. The Solr Reference Guide became part of the stuff I work on and I had to learn Solr. I like to figure as much out on my own as I can, so to learn I tried things out, I read the Jira issues, I tried to interpret the Javadocs (sometimes following the trail deep into darkness and getting it wrong). It's the same way many people passionate about Solr get started, I think - we had a job to do, in one way or another, and that's how we learned. I'm technical but not a developer (I think the last real program I wrote was in computer camp for girls in 1984, where we wrote Basic in the morning and Jazzercized in the afternoon), but even though I don't write code, I can understand very technical concepts and can sometimes read code. I'm a librarian, so I spend time thinking about how to organize information. As an undergrad I got a BA in creative writing, and tech writing has become a really lovely pairing of two skills and passions. What else? I grew up in New Hampshire and after school, I moved to Boston and at some point I decided that a) I wanted to work on the internet and b) the best way to do that was to get an MS in Library Science. It sounds sort of random, now, but that's what I did. A couple years ago I left Boston and now live in Northwest Florida (vaguely halfway between Pensacola and Tallahassee), only a couple miles from the beach. Until that point, I (loudly and often) vowed I would never live in Florida. But it turns out that I really love being able to go to the beach every day of the year, and on my second day in town I met my boyfriend and we're now sort of officially engaged. So, even though right now it's hotter-than-Hades and wetter-than-sunken-Atlantis, I stay. Lastly, in my spare time, I make mosaic art. I still have more ideas than pieces, but I'm getting there. Eventually I'll get some photos of my stuff online for all to see. On Wed, Jul 31, 2013 at 3:47 PM, Robert Muir rcm...@gmail.com wrote: I'm pleased to announce that Cassandra Targett has accepted to join our ranks as a committer. Cassandra worked on the donation of the new Solr Reference Guide [1] and getting things in order for its first official release [2]. Cassandra, it is tradition that you introduce yourself with a brief bio. Welcome! P.S. As soon as your SVN access is setup, you should then be able to add yourself to the committers list on the website as well. [1] https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide [2] https://www.apache.org/dyn/closer.cgi/lucene/solr/ref-guide/ - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5104) Remove Default Core
[ https://issues.apache.org/jira/browse/SOLR-5104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726503#comment-13726503 ] Grant Ingersoll commented on SOLR-5104: --- My reason is b/c SolrDispatchFilter is filled with legacy cruft, this being one of them. The simpler and more standard we can make all path handling, the better. I don't really care much about shorter URLs and I don't buy the prototyping/learning factor. In fact, I'd argue that it is harder b/c of it, since you have a magic core and then all of your other cores. If you just make the name of the collection part of the path always, there is no more guessing. The less legacy code for plumbing we carry forward in 5, the better off Solr will be. And yes, I am working on making a full blown REST API possible. See SOLR-5091. Remove Default Core --- Key: SOLR-5104 URL: https://issues.apache.org/jira/browse/SOLR-5104 Project: Solr Issue Type: Sub-task Reporter: Grant Ingersoll Fix For: 5.0 I see no reason to maintain the notion of a default Core/Collection. We can either default to Collection1, or just simply create a core on the fly based on the client's request. Thus, all APIs that are accessing a core would require the core to be in the address path. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: FlushPolicy and maxBufDelTerm
I think the doc is correct Wait, one of the docs is wrong. I guess according to what you write, it's FlushPolicy, as a new segment is not flushed per this setting? Or perhaps they should be clarified that the deletes are flushed == applied on existing segments? I disabled reader pooling and I still don't see .del files. But I think that's explained due to there are no segments in the index yet. All documents are still in the RAM buffer, and according to what you write, I shouldn't see any segment cause of delTerms? Shai On Thu, Aug 1, 2013 at 5:40 PM, Michael McCandless luc...@mikemccandless.com wrote: First off, it's bad that you don't see .del files when conf.setMaxBufferedDeleteTerms is 1. But, it could be that newIndexWriterConfig turned on readerPooling which would mean the deletes are held in the SegmentReader and not flushed to disk. Can you make sure that's off? Second off, I think the doc is correct: a segment will not be flushed; rather, new .del files should appear against older segments. And yes, if RAM usage of the buffered del Term/Query s is too high, then a segment is flushed along with the deletes being applied (creating the .del files). I think buffered delete Querys are not counted towards setMaxBufferedDeleteTerms; so they are only flushed by RAM usage (rough rough estimate) or by other ops (merging, NRT reopen, commit, etc.). Mike McCandless http://blog.mikemccandless.com On Thu, Aug 1, 2013 at 9:03 AM, Shai Erera ser...@gmail.com wrote: Hi I'm a little confused about FlushPolicy and IndexWriterConfig.setMaxBufferedDeleteTerms documentation. FlushPolicy jdocs say: * Segments are traditionally flushed by: * ul * liRAM consumption - configured via ... * liNumber of buffered delete terms/queries - configured via * {@link IndexWriterConfig#setMaxBufferedDeleteTerms(int)}/li * /ul Yet IWC.setMaxBufDelTerm says: NOTE: This setting won't trigger a segment flush. And FlushByRamOrCountPolicy says: * li{@link #onDelete(DocumentsWriterFlushControl, DocumentsWriterPerThreadPool.ThreadState)} - flushes * based on the global number of buffered delete terms iff * {@link IndexWriterConfig#getMaxBufferedDeleteTerms()} is enabled/li Confused, I wrote a short unit test: public void testMaxBufDelTerm() throws Exception { Directory dir = new RAMDirectory(); IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random())); conf.setMaxBufferedDeleteTerms(1); conf.setMaxBufferedDocs(10); conf.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH); conf.setInfoStream(new PrintStreamInfoStream(System.out)); IndexWriter writer = new IndexWriter(dir, conf ); int numDocs = 4; for (int i = 0; i numDocs; i++) { Document doc = new Document(); doc.add(new StringField(id, doc- + i, Store.NO)); writer.addDocument(doc); } System.out.println(before delete); for (String f : dir.listAll()) System.out.println(f); writer.deleteDocuments(new Term(id, doc-0)); writer.deleteDocuments(new Term(id, doc-1)); System.out.println(\nafter delete); for (String f : dir.listAll()) System.out.println(f); writer.close(); dir.close(); } When InfoStream is turned on, I can see messages regarding terms flushing (vs if I comment the .setMaxBufDelTerm line), so I know this settings takes effect. Yet both before and after the delete operations, the dir.list() returns only the fdx and fdt files. So is this a bug that a segment isn't flushed? If not (and I'm ok with that), is it a documentation inconsistency? 
Strangely, I think, if the delTerms RAM accounting exhausts max-RAM-buffer size, a new segment will be deleted? Slightly unrelated to FlushPolicy, but do I understand correctly that maxBufDelTerm does not apply to delete-by-query operations? BufferedDeletes doesn't increment any counter on addQuery(), so is it correct to assume that if I only delete-by-query, this setting has no effect? And the delete queries are buffered until the next segment is flushed due to other operations (constraints, commit, NRT-reopen)? Shai - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: FlushPolicy and maxBufDelTerm
I set maxBufDocs=2 so that I get a segment flushed, and indeed after delete I see _0.del. So I guess this is just docs inconsistency. I'll clarify FlushPolicy docs. Shai On Thu, Aug 1, 2013 at 6:24 PM, Shai Erera ser...@gmail.com wrote: I think the doc is correct Wait, one of the docs is wrong. I guess according to what you write, it's FlushPolicy, as a new segment is not flushed per this setting? Or perhaps they should be clarified that the deletes are flushed == applied on existing segments? I disabled reader pooling and I still don't see .del files. But I think that's explained due to there are no segments in the index yet. All documents are still in the RAM buffer, and according to what you write, I shouldn't see any segment cause of delTerms? Shai On Thu, Aug 1, 2013 at 5:40 PM, Michael McCandless luc...@mikemccandless.com wrote: First off, it's bad that you don't see .del files when conf.setMaxBufferedDeleteTerms is 1. But, it could be that newIndexWriterConfig turned on readerPooling which would mean the deletes are held in the SegmentReader and not flushed to disk. Can you make sure that's off? Second off, I think the doc is correct: a segment will not be flushed; rather, new .del files should appear against older segments. And yes, if RAM usage of the buffered del Term/Query s is too high, then a segment is flushed along with the deletes being applied (creating the .del files). I think buffered delete Querys are not counted towards setMaxBufferedDeleteTerms; so they are only flushed by RAM usage (rough rough estimate) or by other ops (merging, NRT reopen, commit, etc.). Mike McCandless http://blog.mikemccandless.com On Thu, Aug 1, 2013 at 9:03 AM, Shai Erera ser...@gmail.com wrote: Hi I'm a little confused about FlushPolicy and IndexWriterConfig.setMaxBufferedDeleteTerms documentation. FlushPolicy jdocs say: * Segments are traditionally flushed by: * ul * liRAM consumption - configured via ... * liNumber of buffered delete terms/queries - configured via * {@link IndexWriterConfig#setMaxBufferedDeleteTerms(int)}/li * /ul Yet IWC.setMaxBufDelTerm says: NOTE: This setting won't trigger a segment flush. And FlushByRamOrCountPolicy says: * li{@link #onDelete(DocumentsWriterFlushControl, DocumentsWriterPerThreadPool.ThreadState)} - flushes * based on the global number of buffered delete terms iff * {@link IndexWriterConfig#getMaxBufferedDeleteTerms()} is enabled/li Confused, I wrote a short unit test: public void testMaxBufDelTerm() throws Exception { Directory dir = new RAMDirectory(); IndexWriterConfig conf = newIndexWriterConfig(TEST_VERSION_CURRENT, new MockAnalyzer(random())); conf.setMaxBufferedDeleteTerms(1); conf.setMaxBufferedDocs(10); conf.setRAMBufferSizeMB(IndexWriterConfig.DISABLE_AUTO_FLUSH); conf.setInfoStream(new PrintStreamInfoStream(System.out)); IndexWriter writer = new IndexWriter(dir, conf ); int numDocs = 4; for (int i = 0; i numDocs; i++) { Document doc = new Document(); doc.add(new StringField(id, doc- + i, Store.NO)); writer.addDocument(doc); } System.out.println(before delete); for (String f : dir.listAll()) System.out.println(f); writer.deleteDocuments(new Term(id, doc-0)); writer.deleteDocuments(new Term(id, doc-1)); System.out.println(\nafter delete); for (String f : dir.listAll()) System.out.println(f); writer.close(); dir.close(); } When InfoStream is turned on, I can see messages regarding terms flushing (vs if I comment the .setMaxBufDelTerm line), so I know this settings takes effect. 
Yet both before and after the delete operations, the dir.list() returns only the fdx and fdt files. So is this a bug that a segment isn't flushed? If not (and I'm ok with that), is it a documentation inconsistency? Strangely, I think, if the delTerms RAM accounting exhausts max-RAM-buffer size, a new segment will be deleted? Slightly unrelated to FlushPolicy, but do I understand correctly that maxBufDelTerm does not apply to delete-by-query operations? BufferedDeletes doesn't increment any counter on addQuery(), so is it correct to assume that if I only delete-by-query, this setting has no effect? And the delete queries are buffered until the next segment is flushed due to other operations (constraints, commit, NRT-reopen)? Shai - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
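A minimal variant of the test that matches this outcome (a sketch with an assumed analyzer and Version, inside a throws-Exception test method; not a committed test): keep maxBufferedDocs low so a segment is flushed first, then delete, and the buffered delete term is applied to that segment as a .del file rather than flushing a new segment.
{code}
// Sketch only: deletes are *applied* to an already-flushed segment.
Directory dir = new RAMDirectory();
IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_44, new KeywordAnalyzer());
conf.setMaxBufferedDocs(2);          // force a segment flush after two docs
conf.setMaxBufferedDeleteTerms(1);   // apply deletes as soon as one term is buffered
conf.setReaderPooling(false);        // make sure .del files reach the Directory
IndexWriter writer = new IndexWriter(dir, conf);
for (int i = 0; i < 4; i++) {
  Document doc = new Document();
  doc.add(new StringField("id", "doc-" + i, Field.Store.NO));
  writer.addDocument(doc);
}
writer.deleteDocuments(new Term("id", "doc-0")); // targets an already-flushed segment
for (String f : dir.listAll()) {
  if (f.endsWith(".del")) {
    System.out.println("deletes applied: " + f); // the delete shows up against the old segment
  }
}
writer.close();
dir.close();
{code}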
[jira] [Commented] (LUCENE-5152) Lucene FST is not immutale
[ https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726569#comment-13726569 ] Robert Muir commented on LUCENE-5152: - I guess one question would be if it's the FST's job to defend against bytesref bugs. This issue was driven because there was a bytesref bug for suggester payloads. The same kind of bug could happen, e.g. if someone uses DirectPostings and modifies the payload coming back from the postings lists. Should we clone payload bytes in the postings lists too? What about term dictionaries? At some point then BytesRef is useless as a reference class because of a few bad apples trying to use it as a ByteBuffer. Ideally we would remove code that abuses BytesRef as a ByteBuffer instead. I don't mean to pick on your issue Simon, and it doesn't mean I object to the patch (though I wonder about performance implications), I just see this as one of many in a larger issue. Lucene FST is not immutale -- Key: LUCENE-5152 URL: https://issues.apache.org/jira/browse/LUCENE-5152 Project: Lucene - Core Issue Type: Bug Components: core/FSTs Affects Versions: 4.4 Reporter: Simon Willnauer Priority: Blocker Fix For: 5.0, 4.5 Attachments: LUCENE-5152.patch, LUCENE-5152.patch a spinnoff from LUCENE-5120 where the analyzing suggester modified a returned output from and FST (BytesRef) which caused sideffects in later execution. I added an assertion into the FST that checks if a cached root arc is modified and in-fact this happens for instance in our MemoryPostingsFormat and I bet we find more places. We need to think about how to make this less trappy since it can cause bugs that are super hard to find. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: FlushPolicy and maxBufDelTerm
On Thu, Aug 1, 2013 at 11:24 AM, Shai Erera ser...@gmail.com wrote: I think the doc is correct Wait, one of the docs is wrong. I guess according to what you write, it's FlushPolicy, as a new segment is not flushed per this setting? Or perhaps they should be clarified that the deletes are flushed == applied on existing segments? Ahh, right. OK I think we should fix FlushPolicy to say deletes are applied? Let's try to leave the verb flushed to mean a new segment is written to disk, I think? I disabled reader pooling and I still don't see .del files. But I think that's explained due to there are no segments in the index yet. All documents are still in the RAM buffer, and according to what you write, I shouldn't see any segment cause of delTerms? Right! OK so that explains it. Mike McCandless http://blog.mikemccandless.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: FlushPolicy and maxBufDelTerm
thanks for clarifying this - I agree the wording is tricky here and we should use the term apply here! sorry for the confusion! simon On Thu, Aug 1, 2013 at 7:39 PM, Michael McCandless luc...@mikemccandless.com wrote: On Thu, Aug 1, 2013 at 11:24 AM, Shai Erera ser...@gmail.com wrote: I think the doc is correct Wait, one of the docs is wrong. I guess according to what you write, it's FlushPolicy, as a new segment is not flushed per this setting? Or perhaps they should be clarified that the deletes are flushed == applied on existing segments? Ahh, right. OK I think we should fix FlushPolicy to say deletes are applied? Let's try to leave the verb flushed to mean a new segment is written to disk, I think? I disabled reader pooling and I still don't see .del files. But I think that's explained due to there are no segments in the index yet. All documents are still in the RAM buffer, and according to what you write, I shouldn't see any segment cause of delTerms? Right! OK so that explains it. Mike McCandless http://blog.mikemccandless.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-4953) Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found
[ https://issues.apache.org/jira/browse/SOLR-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726695#comment-13726695 ] ASF subversion and git services commented on SOLR-4953: --- Commit 1509359 from hoss...@apache.org in branch 'dev/trunk' [ https://svn.apache.org/r1509359 ] SOLR-4953: Make XML Configuration parsing fail if an xpath matches multiple nodes when only a single value is expected. Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found Key: SOLR-4953 URL: https://issues.apache.org/jira/browse/SOLR-4953 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Hoss Man Attachments: SOLR-4953.patch, SOLR-4953.patch while reviewing some code i think i noticed that if there are multiple {{<indexConfig/>}} blocks in solrconfig.xml, one just wins and the rest are ignored. this should be a hard failure situation, and we should have a TestBadConfig method to verify it. --- broadened goal of issue to fail if configuration contains multiple nodes/values for any option where only one value is expected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
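In rough terms, the committed check does something like the following (a hypothetical snippet over javax.xml.xpath and an assumed DOM Document 'doc'; names and error wording are illustrative, not the actual SolrConfig code): evaluate the xpath as a node set and fail hard when more than one node matches instead of silently taking the first.
{code}
// Sketch only: enforce "at most one node" for a config xpath.
XPath xpath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xpath.evaluate("/config/indexConfig", doc, XPathConstants.NODESET);
if (nodes.getLength() > 1) {
  throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
      "Found " + nodes.getLength() + " occurrences of /config/indexConfig when at most one is allowed");
}
Node node = (nodes.getLength() == 1) ? nodes.item(0) : null; // null means "use defaults"
{code}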
[JENKINS-MAVEN] Lucene-Solr-Maven-4.x #404: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-4.x/404/ 1 tests failed. REGRESSION: org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testDistribSearch Error Message: IOException occured when talking to server at: http://127.0.0.1:28478/sqt Stack Trace: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://127.0.0.1:28478/sqt at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:129) at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180) at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294) at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testCustomCollectionsAPI(CollectionsAPIDistributedZkTest.java:764) at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.doTest(CollectionsAPIDistributedZkTest.java:159) Build Log: [...truncated 24519 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5108) plugin loading should fail if more than one instance of a singleton plugin is found
Hoss Man created SOLR-5108: -- Summary: plugin loading should fail if more than one instance of a singleton plugin is found Key: SOLR-5108 URL: https://issues.apache.org/jira/browse/SOLR-5108 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Hoss Man Continuing from the config parsing/validation work done in SOLR-4953, we should improve SolrConfig so that parsing fails if multiple instances of a plugin are found for types of plugins where only one is allowed to be used at a time. at the moment, {{SolrConfig.loadPluginInfo}} happily initializes a {{List<PluginInfo>}} for whatever xpath it's given, and then later code can either call {{List<PluginInfo> getPluginInfos(String)}} or {{PluginInfo getPluginInfo(String)}} (the latter just being shorthand for getting the first item in the list). we could make {{getPluginInfo(String)}} throw an error if the list has multiple items, but i think we should also change the signature of {{loadPluginInfo}} to be explicit about how many instances we expect to find, so we can error earlier, and have a redundant check. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
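A sketch of the stricter accessor described in the issue (method shapes taken from the description above; the body is hypothetical rather than actual SolrConfig code):
{code}
// Sketch only: refuse to silently pick the first of several plugin sections.
public PluginInfo getPluginInfo(String type) {
  List<PluginInfo> infos = getPluginInfos(type);
  if (infos.isEmpty()) {
    return null;
  }
  if (infos.size() > 1) {
    throw new SolrException(SolrException.ErrorCode.SERVER_ERROR,
        "Found " + infos.size() + " configuration sections for " + type
            + " when at most one is allowed");
  }
  return infos.get(0);
}
{code}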
[jira] [Commented] (SOLR-5108) plugin loading should fail if more than one instance of a singleton plugin is found
[ https://issues.apache.org/jira/browse/SOLR-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726739#comment-13726739 ] Jack Krupansky commented on SOLR-5108: -- Sounds like this might resolve SOLR-4304 - NPE in Solr SpellCheckComponent if more than one QueryConverter. plugin loading should fail if more than one instance of a singleton plugin is found -- Key: SOLR-5108 URL: https://issues.apache.org/jira/browse/SOLR-5108 Project: Solr Issue Type: Bug Reporter: Hoss Man Assignee: Hoss Man Continuing from the config parsing/validation work done in SOLR-4953, we should improve SolrConfig so that parsing fails if multiple instances of a plugin are found for types of plugins where only one is allowed to be used at a time. at the moment, {{SolrConfig.loadPluginInfo}} happily initializes a {{List<PluginInfo>}} for whatever xpath it's given, and then later code can either call {{List<PluginInfo> getPluginInfos(String)}} or {{PluginInfo getPluginInfo(String)}} (the latter just being shorthand for getting the first item in the list). we could make {{getPluginInfo(String)}} throw an error if the list has multiple items, but i think we should also change the signature of {{loadPluginInfo}} to be explicit about how many instances we expect to find, so we can error earlier, and have a redundant check. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726746#comment-13726746 ] Noble Paul commented on SOLR-5081: -- [~mikeschrag] COuld you get any more thread dumps? Highly parallel document insertion hangs SolrCloud -- Key: SOLR-5081 URL: https://issues.apache.org/jira/browse/SOLR-5081 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.3.1 Reporter: Mike Schrag Attachments: threads.txt If I do a highly parallel document load using a Hadoop cluster into an 18 node solrcloud cluster, I can deadlock solr every time. The ulimits on the nodes are: core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 1031181 max locked memory (kbytes, -l) unlimited max memory size (kbytes, -m) unlimited open files (-n) 32768 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 10240 cpu time (seconds, -t) unlimited max user processes (-u) 515590 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited The open file count is only around 4000 when this happens. If I bounce all the servers, things start working again, which makes me think this is Solr and not ZK. I'll attach the stack trace from one of the servers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: FlushPolicy and maxBufDelTerm
OK, I committed some improvements there and in some other places. Thanks guys for clarifying this! Shai On Thu, Aug 1, 2013 at 8:55 PM, Simon Willnauer simon.willna...@gmail.com wrote: thanks for clarifying this - I agree the wording is tricky here and we should use the term apply here! sorry for the confusion! simon On Thu, Aug 1, 2013 at 7:39 PM, Michael McCandless luc...@mikemccandless.com wrote: On Thu, Aug 1, 2013 at 11:24 AM, Shai Erera ser...@gmail.com wrote: I think the doc is correct Wait, one of the docs is wrong. I guess according to what you write, it's FlushPolicy, as a new segment is not flushed per this setting? Or perhaps they should be clarified that the deletes are flushed == applied on existing segments? Ahh, right. OK I think we should fix FlushPolicy to say deletes are applied? Let's try to leave the verb flushed to mean a new segment is written to disk, I think? I disabled reader pooling and I still don't see .del files. But I think that's explained by the fact that there are no segments in the index yet. All documents are still in the RAM buffer, and according to what you write, I shouldn't see any segment because of delTerms? Right! OK so that explains it. Mike McCandless http://blog.mikemccandless.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
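A minimal sketch against the Lucene 4.x API of the setting under discussion; the threshold values below are arbitrary. As clarified in the thread, reaching maxBufferedDeleteTerms applies the buffered deletes; it does not by itself flush a new segment, so no .del files appear while everything is still in the RAM buffer.

{code:java}
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class DeleteTermBuffering {
  public static void main(String[] args) throws Exception {
    IndexWriterConfig iwc =
        new IndexWriterConfig(Version.LUCENE_44, new StandardAnalyzer(Version.LUCENE_44));
    iwc.setMaxBufferedDeleteTerms(10); // buffered deletes are applied once 10 terms pile up
    iwc.setRAMBufferSizeMB(64.0);      // segment flushing is governed separately

    IndexWriter writer = new IndexWriter(new RAMDirectory(), iwc);
    try {
      // Each call buffers a delete term; crossing the threshold applies the
      // deletes, but no segments (and hence no .del files) are written until
      // there is actually something to flush or commit.
      for (int i = 0; i < 25; i++) {
        writer.deleteDocuments(new Term("id", Integer.toString(i)));
      }
      writer.commit();
    } finally {
      writer.close();
    }
  }
}
{code}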
[jira] [Commented] (SOLR-4953) Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found
[ https://issues.apache.org/jira/browse/SOLR-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726782#comment-13726782 ] ASF subversion and git services commented on SOLR-4953: --- Commit 1509390 from hoss...@apache.org in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1509390 ] SOLR-4953: Make XML Configuration parsing fail if an xpath matches multiple nodes when only a single value is expected. (merge r1509359) Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found Key: SOLR-4953 URL: https://issues.apache.org/jira/browse/SOLR-4953 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Hoss Man Attachments: SOLR-4953.patch, SOLR-4953.patch while reviewing some code i think i noticed that if there are multiple {{<indexConfig/>}} blocks in solrconfig.xml, one just wins and the rest are ignored. this should be a hard failure situation, and we should have a TestBadConfig method to verify it. --- broadened goal of issue to fail if configuration contains multiple nodes/values for any option where only one value is expected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
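The committed change lives in SolrConfig itself; purely as an illustration of the fail-hard semantics (this is not the actual Solr code), an xpath lookup that is expected to match at most one node can reject multiple matches like this:

{code:java}
// Illustration only: throw when an xpath that should match at most one node
// (e.g. <indexConfig/> in solrconfig.xml) matches several.
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class StrictXPathLookup {

  /** Returns the single matching node, null if none matched, or throws if several matched. */
  public static Node getSingleNode(Document config, String xpathExpr) throws Exception {
    XPath xpath = XPathFactory.newInstance().newXPath();
    NodeList nodes = (NodeList) xpath.evaluate(xpathExpr, config, XPathConstants.NODESET);
    if (nodes.getLength() > 1) {
      throw new RuntimeException(xpathExpr + " matched " + nodes.getLength()
          + " nodes, but at most one is allowed");
    }
    return nodes.getLength() == 1 ? nodes.item(0) : null;
  }
}
{code}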
[jira] [Resolved] (SOLR-4953) Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found
[ https://issues.apache.org/jira/browse/SOLR-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hoss Man resolved SOLR-4953. Resolution: Fixed Fix Version/s: 5.0 4.5 Config XML parsing should fail hard if an xpath is expected to match at most one node/string/int/boolean and multiple values are found Key: SOLR-4953 URL: https://issues.apache.org/jira/browse/SOLR-4953 Project: Solr Issue Type: Improvement Reporter: Hoss Man Assignee: Hoss Man Fix For: 4.5, 5.0 Attachments: SOLR-4953.patch, SOLR-4953.patch while reviewing some code i think i noticed that if there are multiple {{<indexConfig/>}} blocks in solrconfig.xml, one just wins and the rest are ignored. this should be a hard failure situation, and we should have a TestBadConfig method to verify it. --- broadened goal of issue to fail if configuration contains multiple nodes/values for any option where only one value is expected. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726801#comment-13726801 ] Mike Schrag commented on SOLR-5081: --- I grabbed more and they all look basically the same as the attached, which is to say, it sort of looks like Solr isn't doing ANYTHING. I'm going to look into whether I'm crushing ZooKeeper, and maybe my requests aren't even getting to Solr. Highly parallel document insertion hangs SolrCloud -- Key: SOLR-5081 URL: https://issues.apache.org/jira/browse/SOLR-5081 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.3.1 Reporter: Mike Schrag Attachments: threads.txt If I do a highly parallel document load using a Hadoop cluster into an 18 node solrcloud cluster, I can deadlock solr every time. The ulimits on the nodes are: core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 1031181 max locked memory (kbytes, -l) unlimited max memory size (kbytes, -m) unlimited open files (-n) 32768 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 10240 cpu time (seconds, -t) unlimited max user processes (-u) 515590 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited The open file count is only around 4000 when this happens. If I bounce all the servers, things start working again, which makes me think this is Solr and not ZK. I'll attach the stack trace from one of the servers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726831#comment-13726831 ] Erick Erickson commented on SOLR-5081: -- Yeah, that is odd. The stack traces you sent basically showed no deadlocks, nothing interesting at all. I suspect pursuing whether anything is getting to Solr or not is a good idea. Hmmm, a blunt-instrument test when the cluster is hung: what happens if you, say, submit a query directly to one of the nodes? Does it respond, or do you see anything in the Solr log on that node? Tip: adding distrib=false to the _query_ will not try to send sub-queries to other shards. And I wonder what happens if you, say, use post.jar (comes with the example) to try to send a doc to Solr when it's hung, anything? Clearly I'm grasping at straws here, but I'm kind of out of good ideas. Highly parallel document insertion hangs SolrCloud -- Key: SOLR-5081 URL: https://issues.apache.org/jira/browse/SOLR-5081 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.3.1 Reporter: Mike Schrag Attachments: threads.txt If I do a highly parallel document load using a Hadoop cluster into an 18 node solrcloud cluster, I can deadlock solr every time. The ulimits on the nodes are: core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 1031181 max locked memory (kbytes, -l) unlimited max memory size (kbytes, -m) unlimited open files (-n) 32768 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 10240 cpu time (seconds, -t) unlimited max user processes (-u) 515590 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited The open file count is only around 4000 when this happens. If I bounce all the servers, things start working again, which makes me think this is Solr and not ZK. I'll attach the stack trace from one of the servers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
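A sketch of the non-distributed probe suggested above, written against the SolrJ 4.x HttpSolrServer client; the node URL and collection name are placeholders, not values from this issue.

{code:java}
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class DirectNodeProbe {
  public static void main(String[] args) throws Exception {
    // Placeholder URL: point this at one node and one core/collection directly.
    HttpSolrServer node = new HttpSolrServer("http://node01:8983/solr/collection1");
    try {
      SolrQuery q = new SolrQuery("*:*");
      q.set("distrib", "false"); // do not fan sub-queries out to other shards
      QueryResponse rsp = node.query(q);
      System.out.println("numFound on this node: " + rsp.getResults().getNumFound());
    } finally {
      node.shutdown();
    }
  }
}
{code}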
[jira] [Updated] (LUCENE-5152) Lucene FST is not immutable
[ https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-5152: Summary: Lucene FST is not immutable (was: Lucene FST is not immutale) Lucene FST is not immutable --- Key: LUCENE-5152 URL: https://issues.apache.org/jira/browse/LUCENE-5152 Project: Lucene - Core Issue Type: Bug Components: core/FSTs Affects Versions: 4.4 Reporter: Simon Willnauer Priority: Blocker Fix For: 5.0, 4.5 Attachments: LUCENE-5152.patch, LUCENE-5152.patch a spin-off from LUCENE-5120 where the analyzing suggester modified a returned output from an FST (BytesRef), which caused side effects in later execution. I added an assertion into the FST that checks if a cached root arc is modified and in fact this happens for instance in our MemoryPostingsFormat and I bet we find more places. We need to think about how to make this less trappy since it can cause bugs that are super hard to find. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-5152) Lucene FST is not immutable
[ https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726847#comment-13726847 ] Simon Willnauer commented on LUCENE-5152: - bq. Should we clone payload bytes in the postings lists too? what about term dictionaries? I agree we can be less conservative here and just use the payload and copy it into a new BytesRef or whatever is needed. I will bring up a new patch. bq. At some point then BytesRef is useless as a reference class because of a few bad apples trying to use it as a ByteBuffer. Ideally we would remove code that abuses BytesRef as a ByteBuffer instead. agreed again. We just need to make sure that we have asserts in place that check for that. bq. I don't mean to pick on your issue Simon, and it doesn't mean I object to the patch (though I wonder about performance implications), I just see this as one of many in a larger issue. no worries. I am really concerned about this since it took me forever to figure out the problems this caused. I just wanna have an infra in place that catches those problems. I am more concerned about users that get bitten by this. I agree we should figure out the bigger problem eventually but let's make sure that we fix the bad apples first. Lucene FST is not immutable --- Key: LUCENE-5152 URL: https://issues.apache.org/jira/browse/LUCENE-5152 Project: Lucene - Core Issue Type: Bug Components: core/FSTs Affects Versions: 4.4 Reporter: Simon Willnauer Priority: Blocker Fix For: 5.0, 4.5 Attachments: LUCENE-5152.patch, LUCENE-5152.patch a spin-off from LUCENE-5120 where the analyzing suggester modified a returned output from an FST (BytesRef), which caused side effects in later execution. I added an assertion into the FST that checks if a cached root arc is modified and in fact this happens for instance in our MemoryPostingsFormat and I bet we find more places. We need to think about how to make this less trappy since it can cause bugs that are super hard to find. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
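The general remedy discussed here is to copy before modifying. Sketched below is that idiom using BytesRef.deepCopyOf; this illustrates the idiom only and is not the code from the attached patch.

{code:java}
import org.apache.lucene.util.BytesRef;

public class CopyBeforeModify {

  /** A private, modifiable copy; the shared FST output (or cached root arc) stays untouched. */
  public static BytesRef privateCopy(BytesRef sharedOutput) {
    return BytesRef.deepCopyOf(sharedOutput);
  }

  /** Appends a byte into brand-new storage instead of scribbling on the shared bytes. */
  public static BytesRef appendByte(BytesRef sharedOutput, byte extra) {
    byte[] bytes = new byte[sharedOutput.length + 1];
    System.arraycopy(sharedOutput.bytes, sharedOutput.offset, bytes, 0, sharedOutput.length);
    bytes[sharedOutput.length] = extra;
    return new BytesRef(bytes);
  }
}
{code}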
[jira] [Commented] (SOLR-5081) Highly parallel document insertion hangs SolrCloud
[ https://issues.apache.org/jira/browse/SOLR-5081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726848#comment-13726848 ] Mike Schrag commented on SOLR-5081: --- I actually did this exact test when I was in this state originally, and the insert _worked_, which totally confused the situation for me. However, in light of seeing nothing in the traces, it supports the theory that the cluster isn't hung, but rather I'm somehow not even getting that far in the Hadoop cluster. ZK was my best guess as something that maybe could be an earlier stage failure, but even that I would expect to have hang the test-insert. So I need to do a little more forensics here and see if I can get a better picture of wtf is going on. Highly parallel document insertion hangs SolrCloud -- Key: SOLR-5081 URL: https://issues.apache.org/jira/browse/SOLR-5081 Project: Solr Issue Type: Bug Components: SolrCloud Affects Versions: 4.3.1 Reporter: Mike Schrag Attachments: threads.txt If I do a highly parallel document load using a Hadoop cluster into an 18 node solrcloud cluster, I can deadlock solr every time. The ulimits on the nodes are: core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 1031181 max locked memory (kbytes, -l) unlimited max memory size (kbytes, -m) unlimited open files (-n) 32768 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 10240 cpu time (seconds, -t) unlimited max user processes (-u) 515590 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited The open file count is only around 4000 when this happens. If I bounce all the servers, things start working again, which makes me think this is Solr and not ZK. I'll attach the stack trace from one of the servers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-5152) Lucene FST is not immutable
[ https://issues.apache.org/jira/browse/LUCENE-5152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-5152: Attachment: LUCENE-5152.patch this patch only adds the assert and fixes the problems in MemoryPostings. This could solve the immediate issue and adds some more asserts to make sure we realise if something modifies the arcs' outputs. Lucene FST is not immutable --- Key: LUCENE-5152 URL: https://issues.apache.org/jira/browse/LUCENE-5152 Project: Lucene - Core Issue Type: Bug Components: core/FSTs Affects Versions: 4.4 Reporter: Simon Willnauer Priority: Blocker Fix For: 5.0, 4.5 Attachments: LUCENE-5152.patch, LUCENE-5152.patch, LUCENE-5152.patch a spin-off from LUCENE-5120 where the analyzing suggester modified a returned output from an FST (BytesRef), which caused side effects in later execution. I added an assertion into the FST that checks if a cached root arc is modified and in fact this happens for instance in our MemoryPostingsFormat and I bet we find more places. We need to think about how to make this less trappy since it can cause bugs that are super hard to find. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #926: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/926/ 1 tests failed. FAILED: org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testDistribSearch Error Message: IOException occured when talking to server at: http://127.0.0.1:26547 Stack Trace: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://127.0.0.1:26547 at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:129) at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:180) at org.apache.http.impl.conn.ManagedClientConnectionImpl.open(ManagedClientConnectionImpl.java:294) at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:645) at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:480) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805) at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365) at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180) at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.testCustomCollectionsAPI(CollectionsAPIDistributedZkTest.java:764) at org.apache.solr.cloud.CollectionsAPIDistributedZkTest.doTest(CollectionsAPIDistributedZkTest.java:159) Build Log: [...truncated 24120 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x
jamon camisso created SOLR-5109: --- Summary: Solr 4.4 will not deploy in Glassfish 4.x Key: SOLR-5109 URL: https://issues.apache.org/jira/browse/SOLR-5109 Project: Solr Issue Type: Bug Affects Versions: 4.4 Environment: Glassfish 4.x Reporter: jamon camisso Priority: Blocker The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x. This failure is a known issue with upstream Guava and is described here: https://code.google.com/p/guava-libraries/issues/detail?id=1433 Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr allows for a successful deployment. Until the Guava developers release version 15 using their HEAD or even an RC tag seems like the only way to resolve this. This is frustrating since it was proposed that Guava be removed as a dependency before Solr 4.0 was released and yet it remains and blocks upgrading: https://issues.apache.org/jira/browse/SOLR-3601 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x
[ https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jamon camisso updated SOLR-5109: Attachment: guava-15.0-SNAPSHOT.jar Solr 4.4 will not deploy in Glassfish 4.x - Key: SOLR-5109 URL: https://issues.apache.org/jira/browse/SOLR-5109 Project: Solr Issue Type: Bug Affects Versions: 4.4 Environment: Glassfish 4.x Reporter: jamon camisso Priority: Blocker Labels: guava Attachments: guava-15.0-SNAPSHOT.jar The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x. This failure is a known issue with upstream Guava and is described here: https://code.google.com/p/guava-libraries/issues/detail?id=1433 Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr allows for a successful deployment. Until the Guava developers release version 15 using their HEAD or even an RC tag seems like the only way to resolve this. This is frustrating since it was proposed that Guava be removed as a dependency before Solr 4.0 was released and yet it remains and blocks upgrading: https://issues.apache.org/jira/browse/SOLR-3601 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x
[ https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726976#comment-13726976 ] Uwe Schindler commented on SOLR-5109: - Hi, we cannot bundle JAR files with our source code, and in the case of releasing the binary Solr package we need to download all dependencies from Maven Central. So we cannot solve this problem. Guava has to release a newer version first. Solr 4.4 will not deploy in Glassfish 4.x - Key: SOLR-5109 URL: https://issues.apache.org/jira/browse/SOLR-5109 Project: Solr Issue Type: Bug Affects Versions: 4.4 Environment: Glassfish 4.x Reporter: jamon camisso Priority: Blocker Labels: guava Attachments: guava-15.0-SNAPSHOT.jar The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x. This failure is a known issue with upstream Guava and is described here: https://code.google.com/p/guava-libraries/issues/detail?id=1433 Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr allows for a successful deployment. Until the Guava developers release version 15 using their HEAD or even an RC tag seems like the only way to resolve this. This is frustrating since it was proposed that Guava be removed as a dependency before Solr 4.0 was released and yet it remains and blocks upgrading: https://issues.apache.org/jira/browse/SOLR-3601 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x
[ https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726978#comment-13726978 ] jamon camisso commented on SOLR-5109: - Can Guava be removed as a core dependency, as was proposed in SOLR-3601? Solr 4.4 will not deploy in Glassfish 4.x - Key: SOLR-5109 URL: https://issues.apache.org/jira/browse/SOLR-5109 Project: Solr Issue Type: Bug Affects Versions: 4.4 Environment: Glassfish 4.x Reporter: jamon camisso Priority: Blocker Labels: guava Attachments: guava-15.0-SNAPSHOT.jar The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x. This failure is a known issue with upstream Guava and is described here: https://code.google.com/p/guava-libraries/issues/detail?id=1433 Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr allows for a successful deployment. Until the Guava developers release version 15 using their HEAD or even an RC tag seems like the only way to resolve this. This is frustrating since it was proposed that Guava be removed as a dependency before Solr 4.0 was released and yet it remains and blocks upgrading: https://issues.apache.org/jira/browse/SOLR-3601 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-5109) Solr 4.4 will not deploy in Glassfish 4.x
[ https://issues.apache.org/jira/browse/SOLR-5109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13726992#comment-13726992 ] Uwe Schindler commented on SOLR-5109: - Please reopen the corresponding issue and maybe provide a patch removing this dependency. I would be happy to remove Guava, but other developers may have other plans. In general, I would not run Solr inside Glassfish, as Solr has very different resource usage than conventional enterprise webapps. This is one reason why Solr may no longer be a WAR in the future. Solr is a separate server like MySQL and should run in an isolated process. Solr 4.4 will not deploy in Glassfish 4.x - Key: SOLR-5109 URL: https://issues.apache.org/jira/browse/SOLR-5109 Project: Solr Issue Type: Bug Affects Versions: 4.4 Environment: Glassfish 4.x Reporter: jamon camisso Priority: Blocker Labels: guava Attachments: guava-15.0-SNAPSHOT.jar The bundled Guava 14.0.1 JAR blocks deploying Solr 4.4 in Glassfish 4.x. This failure is a known issue with upstream Guava and is described here: https://code.google.com/p/guava-libraries/issues/detail?id=1433 Building Guava guava-15.0-SNAPSHOT.jar from master and bundling it in Solr allows for a successful deployment. Until the Guava developers release version 15 using their HEAD or even an RC tag seems like the only way to resolve this. This is frustrating since it was proposed that Guava be removed as a dependency before Solr 4.0 was released and yet it remains and blocks upgrading: https://issues.apache.org/jira/browse/SOLR-3601 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org