Solr capacity planning and disaster recovery
CAVEAT: I'm a newbie with respect to Solr (some Lucene experience, but not Solr itself) and am trying to come up to speed. What have you all done for Solr capacity planning and disaster recovery? I am curious about the following metrics:
- File handles and other ulimit/profile concerns
- Space calculations (particularly with respect to optimizes, etc.)
- Taxonomy considerations
- Single core vs. multi-core
- ?
Also, has anyone planned disaster recovery for Solr across non-metro data centers? It's currently not an issue for me, but it will be shortly.
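On the file-handle and space points, here is a hedged sketch of the kind of checks and arithmetic involved. The limit check uses Python's standard resource module; the index size and the 2x/3x multipliers are illustrative rules of thumb for optimizes, not guarantees.

```python
# Sketch: inspect the file-descriptor limits this process inherits.
# Solr keeps many segment files open, so a low soft limit is a common
# cause of "Too many open files" on large or multi-core installs.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-file soft limit: {soft}, hard limit: {hard}")

# Disk head-room rule of thumb: an optimize (merging down to one
# segment) can transiently need roughly 2x the index size on disk,
# up to ~3x in the worst case if indexing continues during the merge.
index_gb = 40  # hypothetical index size
print(f"plan for up to {index_gb * 3} GB free during an optimize")
```

The same numbers apply per core, so a multi-core box needs the headroom summed across whichever cores might optimize concurrently.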
Re: Doing facet count using group truncating with distributed search
I think you're confusing cores and shards. The comments about distributed functionality are for a _sharded_ index. A sharded index simply breaks up a single logical index into parts (shards); when the configuration is set up, queries are automatically sent to all shards and the results collated. Even in this case, though, care must be taken that no documents in separate shards have the same ID (uniqueKey), or your results will be wonky. So your example won't work well even in a sharded situation.

But what you describe is separate cores. Cores are simply completely distinct indexes that happen to be served by a single Solr instance; they have no knowledge of each other, so I don't see how you'd be able to get what you're looking for.

Best
Erick

P.S. I'm cheating a little when I say cores have no knowledge of each other; there is the restricted case of cross-core joins, but that's not germane to your problem.

On Sat, Oct 20, 2012 at 3:49 PM, Kenneth Vindum k...@industry-supply.dk wrote:

Hi Solr users!

Could any of you tell me how to do a facet count across several cores, excluding duplicates? E.g.:

Core A:
Page 1: Id=a Text=hello world
Page 2: Id=b Text=hello again

Core B:
Page 1: Id=a Text=Hej verden
Id=c Text=Ny besked

Doing a facet count on core A gives me 2 elements. Doing a facet count on core B gives me 2 elements as well. Counting across both cores using shards should return 3 elements when doing group.truncate on the element with Id=a. This would work on a single core, but doing so on more than one core always gives me a facet count of 4.

I've read the Solr wiki page (http://wiki.apache.org/solr/FieldCollapsing#Known_Limitations) saying: "Grouping is also supported for distributed searches from Solr 3.5 (http://wiki.apache.org/solr/Solr3.5) and from Solr 4.0 (http://wiki.apache.org/solr/Solr4.0). Currently group.truncate and group.func are the only parameters that aren't supported for distributed searches."

Is this because it's not possible to implement this feature, or because nobody has needed it yet?

Thanks, guys :)

Kind regards,
Kenneth Vindum
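Erick's point about duplicate uniqueKeys can be illustrated with a toy model in plain Python sets (this is not Solr code; the ids and texts come from Kenneth's example):

```python
# Toy model of the example: each "core" maps uniqueKey -> text.
core_a = {"a": "hello world", "b": "hello again"}
core_b = {"a": "Hej verden", "c": "Ny besked"}

# Faceting each core independently and summing the counts double-counts
# the document that appears in both cores under id "a".
naive_count = len(core_a) + len(core_b)
print(naive_count)  # 4, matching the observed distributed facet count

# Counting distinct ids across both cores is what a working
# group.truncate over the combined index would effectively do.
deduped_count = len(set(core_a) | set(core_b))
print(deduped_count)  # 3, the expected answer
```

The distributed collation has no way to perform that deduplication when the "shards" are really independent cores that may legitimately reuse ids.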
Does SolrCloud support distributed IDFs?
Hi folks,

A known limitation of the old distributed search feature is the lack of distributed/global IDFs (SOLR-1632). Does SolrCloud bring any improvements in this direction?

Best regards,
Sascha
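For readers unfamiliar with the issue: with per-shard IDFs, identical documents can score differently depending on which shard they live in. A toy sketch using a simplified idf = ln(N/df) (not Lucene's exact formula):

```python
import math

def idf(num_docs, doc_freq):
    # Simplified IDF; Lucene's actual formula differs slightly.
    return math.log(num_docs / doc_freq)

# A term that is rare on shard 1 but common on shard 2.
shard1_docs, shard1_df = 1000, 10
shard2_docs, shard2_df = 1000, 500

local1 = idf(shard1_docs, shard1_df)  # IDF shard 1 uses today
local2 = idf(shard2_docs, shard2_df)  # IDF shard 2 uses today
global_idf = idf(shard1_docs + shard2_docs, shard1_df + shard2_df)

print(local1, local2, global_idf)
# local1 != local2: the same document scores differently per shard.
# A global IDF (what SOLR-1632 proposes) would score consistently.
```

The skew only matters when term distribution is uneven across shards, which is why random document routing often masks the problem in practice.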
Re: Open Source Social (London) - 23rd Oct
Last reminder... come along on Tuesday if you can! We'd love to meet you and share search/NLP/scaling war stories. On 11 October 2012 21:59, Richard Marr richard.m...@gmail.com wrote: Hi all, The next Open Source Search Social is on the 23rd Oct at The Plough, in Bloomsbury. We usually get a good mix of regulars and newcomers, and a good mix of backgrounds and experience levels, so please come along if you can. As usual the format is completely open so we'll be talking about whatever is most interesting at any one particular moment... ooo, a shiny thing... Details and RSVP options on the Meetup page: http://www.meetup.com/london-search-social/events/86580442/ Hope to see you there, Richard @richmarr -- Richard Marr
Why does SolrIndexSearcher.java enforce mutual exclusion of filter and filterList?
Greetings,

I'm wondering if somebody would please explain why SolrIndexSearcher.java enforces mutual exclusion of filter and filterList (e.g., see: https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L2039).

For a custom application we have been using this functionality successfully, and I have been maintaining patches against base releases from 1.4 through 3.6.1; I am now finally looking at 4.0. Since I am yet again revisiting this custom patch, I am wondering why this functionality is prevented out of the box, for two reasons:
1) It would be great if I didn't have to maintain a custom internal branch of Solr for this tiny little change.
2) I am worried that the purposeful prevention of this functionality implies there is a downside to doing this.

Is there a downside to utilizing both a DocSet-based filter and a query-based filterList? If not, once I migrate this patch to 4.0, what would be the best way to get this functionality incorporated into the base?

For additional info, you may find the now-two-year-old issue with patches addressing this up through 3.6.1 here: https://issues.apache.org/jira/browse/SOLR-2052

Any insight appreciated, as always,
Aaron
Re: Why does SolrIndexSearcher.java enforce mutual exclusion of filter and filterList?
On Sun, Oct 21, 2012 at 3:57 PM, Aaron Daubman daub...@gmail.com wrote:

Greetings, I'm wondering if somebody would please explain why SolrIndexSearcher.java enforces mutual exclusion of filter and filterList (e.g., see: https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L2039).

AFAIK, it's only prevented because it's not implemented (or at least it wasn't in the past). Things may have changed since I restructured filtering to implement post filters, though. If you remove the checks, does everything actually work today?

It was never implemented because Solr didn't use it itself (and the vast majority of people use Solr as-is, without internal customization). It should be fine to support as long as it doesn't complicate the implementation too much.

-Yonik
http://lucidworks.com
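For anyone following the thread, the semantics being asked for amount to set intersection. A toy model with Python sets (hypothetical names, nothing from the real SolrIndexSearcher):

```python
# Toy model: a "DocSet" is a set of internal doc ids, and each filter
# query in the filterList also resolves to such a set. Combining a
# pre-computed DocSet filter with a list of filter queries is then
# just intersecting everything before scoring the main query.
def combine(filter_docset, filter_list_docsets):
    result = set(filter_docset)
    for ds in filter_list_docsets:
        result &= ds
    return result

docset_filter = {1, 2, 3, 4, 5}          # e.g. from a custom component
fq_docsets = [{2, 3, 4, 9}, {3, 4, 7}]   # resolved fq= queries
print(sorted(combine(docset_filter, fq_docsets)))  # [3, 4]
```

The real implementation is more involved (cached DocSets, post filters, leap-frogging), but the intended result is this intersection.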
Re: Does SolrCloud support distributed IDFs?
On 10/21/2012 01:27 PM, Sascha Szott wrote:

Hi folks, a known limitation of the old distributed search feature is the lack of distributed/global IDFs (SOLR-1632). Does SolrCloud bring any improvements in this direction? Best regards, Sascha

Still waiting on that issue. I think Andrzej should just update it to trunk and commit; it's optional and defaults to off. Go vote :)

- Mark
IOException occured when talking to server
Hi, I'm encountering the error below repeatedly when trying out distributed search. At the time, none of the servers were stalled. Does anyone know what the problem is?

2012-10-18 09:09:54,813 [http-8080-exec-8819] ERROR org.apache.solr.core.SolrCore - org.apache.solr.common.SolrException: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://203.242.170.141:8080/solr_jt/jtp00
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:298)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1561)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:445)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:266)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:470)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
    at org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:889)
    at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:732)
    at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2262)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://203.242.170.141:8080/solr_jt/jtp00
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:409)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:182)
    at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:165)
    at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:132)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    ... 3 more
Caused by: java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:168)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:149)
    at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:111)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:264)
    at org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:98)
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:252)
    at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:282)
    at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:247)
    at org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:216)
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:298)
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
    at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:647)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:464)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:351)
    ... 11 more
RAMDirectory - still stores some docs on disk?
Hello,

I am using RAMDirectory for running some experiments and ran into a weird (well, for me) situation. After indexing into RAM, I killed the JVM and then restarted it after some time. I can still see some documents as indexed and searchable. I had indexed more than 2M docs before shutting down, and after restart there were around 15K docs still in the index.

How could this happen? Is there some caching mechanism that backs up some amount(?) of the total index, or is it written directly to disk? (If so, how?)

- Zeki ama calismiyor... Calissa yapar...
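One thing worth checking (an assumption, since the setup isn't shown): if this is Solr rather than raw Lucene, the Directory implementation comes from solrconfig.xml, and the stock factories persist index files under dataDir on disk; those files would survive a JVM kill. Only an explicit configuration along these lines keeps the index purely in memory:

```xml
<!-- Hypothetical solrconfig.xml fragment. Without an explicit
     RAMDirectoryFactory, Solr's default directory factory writes
     index segments to <dataDir>/index on disk. -->
<directoryFactory name="DirectoryFactory"
                  class="solr.RAMDirectoryFactory"/>
```

In Solr 4 the disk-backed transaction log (updateLog) can also replay recent updates on restart, which might account for a partial set of documents reappearing.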
Re: SOLR 4 BETA facet.pivot and cloud
When I run Solr 4 BETA with ZooKeeper, even if I specify shards=1, pivoting does not seem to work. FYI, facet.pivot does work in 4.0 on a per-shard basis if you include the distrib=false argument.
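For concreteness, a sketch of what such a per-shard request might look like (host, core, and field names here are made up):

```python
# Build a pivot-facet request aimed directly at one core, bypassing
# the distributed search path with distrib=false.
from urllib.parse import urlencode

params = {
    "q": "*:*",
    "facet": "true",
    "facet.pivot": "category,manufacturer",  # hypothetical fields
    "distrib": "false",
    "wt": "json",
}
url = "http://localhost:8983/solr/collection1/select?" + urlencode(params)
print(url)
```

Because distrib=false restricts the query to the core that receives it, the pivot counts reflect only that core's documents, not the whole collection.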