RE: Frequent deletions
Well, we are doing the same thing (in a way). We have to do frequent deletions in bulk; at a time we are deleting around 20M+ documents. All I am doing is, after deletion, firing the below command on each of our Solr nodes and being patient, as it takes quite a long time.

curl -vvv "http://node1.solr.x.com/collection1/update?optimize=true&distrib=false" > /tmp/__solr_clener_log

After finishing the optimisation, curl returns the below XML:

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader"><int name="status">0</int><int name="QTime">10268995</int></lst>
</response>

Regards,
Amey

Date: Wed, 31 Dec 2014 02:32:37 -0700
From: inna.gel...@elbitsystems.com
To: solr-user@lucene.apache.org
Subject: Frequent deletions

Hello,
We perform frequent deletions from our index, which greatly increases the index size. How can we perform an optimization in order to reduce the size?
Please advise,
Thanks.

--
View this message in context: http://lucene.472066.n3.nabble.com/Frequent-deletions-tp4176689.html
Sent from the Solr - User mailing list archive at Nabble.com.
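In case it helps, the rough loop I run to fire this on each node looks like the following; the node hostnames and the log path here are just placeholders for our own setup:

    # fire a non-distributed optimize on each node, one at a time,
    # keeping the verbose curl output for later inspection
    for node in node1.solr.x.com node2.solr.x.com node3.solr.x.com; do
        echo "optimizing $node ..."
        curl -vvv "http://$node/collection1/update?optimize=true&distrib=false" \
            > "/tmp/__solr_optimize_${node}.log" 2>&1
    done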
RE: Solr Suggestion not working in solr PLZ HELP
Hi Vaibhav,

Could you check whether the directory for *suggest.dictionary* mySuggester is present or not? Try creating it with mkdir; if the problem still persists, try giving the full path. I found a good article at the link below, check that too.

[http://romiawasthy.blogspot.com/2014/06/configure-solr-suggester.html]

Regards,
Amey

Date: Wed, 17 Sep 2014 00:03:33 -0700
From: vaibhav.h.pa...@gmail.com
To: solr-user@lucene.apache.org
Subject: Solr Suggestion not working in solr PLZ HELP

Suggestion in solrconfig.xml:

<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">content</str>
    <str name="weightField"></str>
    <str name="suggestAnalyzerFieldType">string</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">mySuggester</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

Suggestion request:

localhost:28080/solr/suggest?q=foobat

The above throws the exception below:

<response>
  <lst name="responseHeader"><int name="status">500</int><int name="QTime">12</int></lst>
  <lst name="error">
    <str name="msg">No suggester named default was configured</str>
    <str name="trace">java.lang.IllegalArgumentException: No suggester named default was configured
      at org.apache.solr.handler.component.SuggestComponent.getSuggesters(SuggestComponent.java:353)
      at org.apache.solr.handler.component.SuggestComponent.prepare(SuggestComponent.java:158)
      at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:197)
      at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
      at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:241)
      at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)
      at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
      at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
      at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:246)
      at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:214)
      at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:230)
      at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:149)
      at org.jboss.as.web.security.SecurityContextAssociationValve.invoke(SecurityContextAssociationValve.java:169)
      at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:145)
      at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:97)
      at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:559)
      at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:102)
      at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:336)
      at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:856)
      at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:653)
      at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:926)
      at java.lang.Thread.run(Thread.java:745)
    </str>
    <int name="code">500</int>
  </lst>
</response>

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Suggestion-not-working-in-solr-PLZ-HELP-tp4159351.html
Sent from the Solr - User mailing list archive at Nabble.com.
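A follow-up note for the archives: since the 500 above complains that no suggester named "default" is configured, it may also help to build and query the suggester by name. Something along these lines should work with the config quoted above; the host/port come from the original mail, and suggest.build, suggest.dictionary and suggest.q are standard parameters of the /suggest handler:

    # build mySuggester once, then query it explicitly by name
    curl "http://localhost:28080/solr/suggest?suggest=true&suggest.dictionary=mySuggester&suggest.build=true"
    curl "http://localhost:28080/solr/suggest?suggest=true&suggest.dictionary=mySuggester&suggest.q=foo&wt=json"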
RE: Moving to HDFS, How to merge indices from 8 servers ?
Thanks for the reply Erick,

I think I had some confusion about how Solr works with HDFS, and the solution I am thinking of could be reviewed and corrected by the user community :) Here is the actual situation and the solution implemented by me.

*Usecase*: I need a Google-like search engine which should work in a distributed and fault-tolerant mode. We are collecting health-related URLs from a third-party system in large volume, approx 1 million/hour, and we want to build an inventory which contains all of their details. Right now I fetch the URL data, break it into H1, P, Div and similar tags with the help of the Jsoup lib, and put it into Solr as documents with different boosts on different fields. After putting this data in, I have a custom program with which we categorise all the data. For example, for all the cancer-related pages I query Solr, fetch all URLs related to cancer with cursorMark, and put them in a file for further use by our system.

*Old solution*: For this I have built 8 Solr servers with 3 ZooKeepers on individual AWS EC2 instances, with one collection / 8 shards. The problem with this solution is that whenever any instance goes down I lose that data for a while. Link to the current solution: http://postimg.org/image/luli3ybtj/

*New _OR_ could-be-faulty solution*: I am thinking that if I use HDFS, which is effectively a single shared file system, it will be better: if one server goes down, that data is still available through another server. Below are the steps I am thinking of:

1. Merge all the 8 servers' indices into one somewhere.
2. Set up HDFS on the same 8 servers.
3. Put the merged index folder in HDFS so it will be distributed across the 8 servers physically by itself.
4. Restart the 8 servers, pointing to HDFS on each instance.
5. Now I am ready to go: put data on the 8 servers and fetch through any one Solr node; if that one is down, choose another, so it is guaranteed to get all the data.

So does this solution sound good, or would you guys suggest another, better solution?

Regards,
Amey

Date: Thu, 11 Sep 2014 14:41:48 -0700
Subject: Re: Moving to HDFS, How to merge indices from 8 servers ?
From: erickerick...@gmail.com
To: solr-user@lucene.apache.org

Um, I really think this is pretty likely to not be a great solution. When you say merge indexes, I'm thinking you want to go from 8 shards to 1 shard. Now, this can be done with the merge indexes core admin API, see:
https://wiki.apache.org/solr/MergingSolrIndexes

BUT.
1. This will break all things SolrCloud-ish, assuming you created your 8 shards under SolrCloud.
2. Solr is usually limited by memory, so trying to fit enough of your single huge index into memory may be problematical.

This feels like an XY problem, _why_ are you asking about this? What is the use-case you want to handle by this?

Best,
Erick

On Thu, Sep 11, 2014 at 7:44 AM, Amey Jadiye ameyjad...@codeinventory.com wrote:
FYI, I searched Google for this problem but didn't find any satisfactory answer. Here is the current situation: I have 8 shards in my Solr cloud backed by 3 ZooKeepers, all set up on AWS EC2 instances; all 8 are leaders with no replicas. I have only 1 collection, say collection1, divided into 8 shards. I have configured the index and tlog folders on each server to point to a 1TB EBS disk attached to each server; all 8 servers have around 100GB in the index folder each, so the total index size I have is ~800GB. Now I want to move all the data to HDFS, so I am going to set up HDFS on all 8 servers, merge all the indexes from the 8 servers, put them in HDFS, then stop and start all my Solr servers on HDFS to access that common index data, setting the below classpath parameters and a few more: -Dsolr.directoryFactory=HdfsDirectoryFactory -Dsolr.lock.type=hdfs -Dsolr.data.dir=hdfs://host:port/path -Dsolr.updatelog=hdfs://host:port/path -jar. Now could you tell me whether this is the correct approach? If yes, how can I merge all indices from the 8 servers? Regards, Amey
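A note on step 1 above (the actual merge): Erick's MergingSolrIndexes link describes the core admin mergeindexes action, which in rough outline looks like the call below. The core name and index paths here are only placeholders for my layout, and, as Erick warns, this only makes sense outside of SolrCloud's own bookkeeping:

    # merge the on-disk indexes of two source shards into the target core "core0"
    curl "http://localhost:8983/solr/admin/cores?action=mergeindexes&core=core0&indexDir=/data/shard1/index&indexDir=/data/shard2/index"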
How to make solr fault tolerant for query?
Just a dumb question, but how can I make Solr Cloud fault tolerant for queries? The reason I am asking is that I have 12 different physical servers and I am running 12 Solr shards on them; whenever any one of them goes down, for whatever reason, it gives me the error below. I have 3 ZooKeepers for the 12 servers; all shards are leaders with no replicas in this Solr cloud. I have the option of using shards.tolerant=true, but this is slow and doesn't give all results.

Best,
Amey

{
  "responseHeader": {
    "status": 503,
    "QTime": 7,
    "params": {
      "sort": "last_modified asc",
      "indent": "true",
      "q": "+links:[* TO *]",
      "_": "1410512274068",
      "wt": "json"
    }
  },
  "error": {
    "msg": "no servers hosting shard: ",
    "code": 503
  }
}
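For the archives: the usual fix here is to give every shard at least one replica, so another node can serve the shard when its leader goes down. A rough sketch using the Collections API ADDREPLICA action (available in later 4.x releases, if I remember right; the collection, shard and node names below are placeholders):

    # add a replica of shard1 on a spare node so queries survive the leader going down
    curl "http://node1:8983/solr/admin/collections?action=ADDREPLICA&collection=collection1&shard=shard1&node=node13:8983_solr"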
Moving to HDFS, How to merge indices from 8 servers ?
FYI, I searched Google for this problem but didn't find any satisfactory answer. Here is the current situation: I have 8 shards in my Solr cloud backed by 3 ZooKeepers, all set up on AWS EC2 instances; all 8 are leaders with no replicas. I have only 1 collection, say collection1, divided into 8 shards. I have configured the index and tlog folders on each server to point to a 1TB EBS disk attached to each server; all 8 servers have around 100GB in the index folder each, so the total index size I have is ~800GB.

Now I want to move all the data to HDFS, so I am going to:
1. Set up HDFS on all 8 servers.
2. Merge all the indexes from the 8 servers.
3. Put them in HDFS.
4. Stop and start all my Solr servers on HDFS to access that common index data, setting the below classpath parameters and a few more (see the sketch after this list).

-Dsolr.directoryFactory=HdfsDirectoryFactory -Dsolr.lock.type=hdfs -Dsolr.data.dir=hdfs://host:port/path -Dsolr.updatelog=hdfs://host:port/path -jar

Now could you tell me whether this is the correct approach? If yes, how can I merge all indices from the 8 servers?

Regards,
Amey
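To make the last step concrete, the start command I have in mind would look roughly like the following; the namenode host/port and HDFS paths are placeholders, and this assumes the old "java -jar start.jar" way of starting Solr:

    # start Solr with its data dir and transaction log on HDFS instead of local disk
    java -Dsolr.directoryFactory=HdfsDirectoryFactory \
         -Dsolr.lock.type=hdfs \
         -Dsolr.data.dir=hdfs://namenode:8020/solr/collection1/data \
         -Dsolr.updatelog=hdfs://namenode:8020/solr/collection1/tlog \
         -jar start.jar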