RE: Frequent deletions

2015-01-06 Thread Amey Jadiye
Well, we are doing the same thing (in a way). We have to do frequent mass deletions; at a time we are deleting around 20M+ documents. All I do after a deletion is fire the command below on each of our Solr nodes and keep some patience, as it takes quite a long time.

curl -vvv "http://node1.solr.x.com/collection1/update?optimize=true&distrib=false" > /tmp/__solr_clener_log

After the optimisation finishes, curl returns the XML below (note the QTime: 10268995 ms is roughly 2.85 hours, which is why the patience is needed):

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">10268995</int></lst>
</response>
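
As an aside, and only a sketch I have not verified on this setup: a commit with the expungeDeletes flag merges just the segments that contain deleted documents, which can reclaim much of the space without the cost of a full optimize. Host and collection below are placeholders:

curl "http://node1.solr.x.com/collection1/update?commit=true&expungeDeletes=true"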

Regards,
Amey

 Date: Wed, 31 Dec 2014 02:32:37 -0700
 From: inna.gel...@elbitsystems.com
 To: solr-user@lucene.apache.org
 Subject: Frequent deletions
 
 Hello,
 We perform frequent deletions from our index, which greatly increases the
 index size.
 How can we perform an optimization in order to reduce the size?
 Please advise,
 Thanks.
 
 
 
 
  

RE: Solr Suggestion not working in solr PLZ HELP

2014-09-17 Thread Amey Jadiye
Hi Vaibhav,

Could you check whether the directory for *suggest.dictionary* mySuggester is present or not? Try creating it with mkdir, and if the problem still persists, try giving the full path.

I found a good article at the link below; check that too.
[http://romiawasthy.blogspot.com/2014/06/configure-solr-suggester.html]
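
By the way, the "No suggester named default" error in your trace usually means the request did not pick up suggest.dictionary, so as a quick sanity check you can name the dictionary directly on the URL. A sketch only; host, port, and core name are placeholders:

curl "http://localhost:28080/solr/<core>/suggest?suggest=true&suggest.dictionary=mySuggester&suggest.q=foo"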

Regards,
Amey

 Date: Wed, 17 Sep 2014 00:03:33 -0700
 From: vaibhav.h.pa...@gmail.com
 To: solr-user@lucene.apache.org
 Subject: Solr Suggestion not working in solr PLZ HELP
 
 Suggestion
 In solrconfig.xml:
 <searchComponent name="suggest" class="solr.SuggestComponent">
   <lst name="suggester">
     <str name="name">mySuggester</str>
     <str name="lookupImpl">FuzzyLookupFactory</str>
     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
     <str name="field">content</str>
     <str name="weightField"></str>
     <str name="suggestAnalyzerFieldType">string</str>
   </lst>
 </searchComponent>

 <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
   <lst name="defaults">
     <str name="suggest">true</str>
     <str name="suggest.count">10</str>
     <str name="suggest.dictionary">mySuggester</str>
   </lst>
   <arr name="components">
     <str>suggest</str>
   </arr>
 </requestHandler>
 
 
 --
 
 
 Suggestion: localhost:28080/solr/suggest?q=foobat
 
 The above throws the exception below:
 
 
 <response><lst name="responseHeader"><int name="status">500</int><int name="QTime">12</int></lst><lst name="error"><str name="msg">No suggester named default was configured</str><str name="trace">java.lang.IllegalArgumentException: No suggester named default was configured
   at org.apache.solr.handler.component.SuggestComponent.getSuggesters(SuggestComponent.java:353)
   at org.apache.solr.handler.component.SuggestComponent.prepare(SuggestComponent.java:158)
   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:197)
   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
   at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:241)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)
   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774)
   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
   at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:246)
   at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:214)
   at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:230)
   at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:149)
   at org.jboss.as.web.security.SecurityContextAssociationValve.invoke(SecurityContextAssociationValve.java:169)
   at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:145)
   at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:97)
   at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:559)
   at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:102)
   at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:336)
   at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:856)
   at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:653)
   at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:926)
   at java.lang.Thread.run(Thread.java:745)
 </str><int name="code">500</int></lst></response>
 
 
 
 
  

RE: Moving to HDFS, How to merge indices from 8 servers?

2014-09-15 Thread Amey Jadiye
Thanks for the reply, Erick.
I think I had some confusion about how Solr works with HDFS, and the solution I am thinking of could be sanity-checked by the user community :)
Here is the actual situation, and the solution implemented by me.
*Usecase*: I need a Google-like search engine that works in distributed and fault-tolerant mode. We are collecting health-related URLs from a third-party system in large volume, approx 1 million/hour, and we want to build an inventory that contains all of their details. I fetch the URL data, break it into H1, P, Div and similar tags with the help of the Jsoup lib, and put it into Solr as documents, with different boosts for different fields.
After putting in this data, I have a custom program with which we categorise all of it. Example: for all the cancer-related pages, I query Solr, fetch all URLs related to cancer with cursorMark, and put them in a file for further use by our system.
*Old solution*: For this I built 8 Solr servers with 3 ZooKeepers on individual AWS EC2 instances, with one collection of 8 shards. The problem with this solution is that whenever any instance goes down, I lose that data for a while. Link to the current solution: http://postimg.org/image/luli3ybtj/
*New _OR_ possibly faulty solution*: I am thinking that HDFS, which is effectively a single file system, would be better, because if a server goes down its data is still available through another server. Below are the steps I am thinking of:
1. Merge all the indices from the 8 servers into one.
2. Set up HDFS on the same 8 servers.
3. Put the merged index folder into HDFS so that it is physically distributed across the 8 servers.
4. Restart the 8 servers, each instance pointing to HDFS.
5. Then I am ready to put data on the 8 servers and fetch through any one Solr node; if that one is down, I choose another, so getting all the data is guaranteed.
Does this solution sound good, or would you suggest a better one?
Regards,
Amey


 Date: Thu, 11 Sep 2014 14:41:48 -0700
 Subject: Re: Moving to HDFS, How to merge indices from 8 servers?
 From: erickerick...@gmail.com
 To: solr-user@lucene.apache.org
 
 Um, I really think this is pretty likely to not be a great solution.
 When you say merge indexes, I'm thinking you want to go from 8
 shards to 1 shard. Now, this can be done with the merge indexes core
 admin API, see:
 https://wiki.apache.org/solr/MergingSolrIndexes
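 
 To make that concrete, a sketch of the core-admin merge call from that page
 (host, target core, and index paths are placeholders I made up; the target
 core must exist and should not receive updates while merging):
 
 curl "http://host:8983/solr/admin/cores?action=MERGEINDEXES&core=core0&indexDir=/path/shard1/index&indexDir=/path/shard2/index"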
 
 BUT.
 1) This will break all things SolrCloud-ish, assuming you created your
 8 shards under SolrCloud.
 2) Solr is usually limited by memory, so trying to fit enough of your
 single huge index into memory may be problematical.
 
 This feels like an XY problem, _why_ are you asking about this? What
 is the use-case you want to handle by this?
 
 Best,
 Erick
 
 On Thu, Sep 11, 2014 at 7:44 AM, Amey Jadiye
 ameyjad...@codeinventory.com wrote:
  FYI, I searched Google for this problem but didn't find any satisfactory
  answer. Here is the current situation: I have 8 shards in my SolrCloud,
  backed by 3 ZooKeepers, all set up on AWS EC2 instances; all 8 are leaders
  with no replicas. I have only 1 collection, say collection1, divided into
  8 shards. I have configured the index and tlog folders on each server to
  point to a 1TB EBS disk attached to each server, and all 8 servers have
  around 100GB in their index folder each, so in total I have ~800GB of
  index files. Now I want to move all the data to HDFS, so I am going to:
  set up HDFS on all 8 servers; merge all the indexes from the 8 servers;
  put them in HDFS; stop and start all my Solr servers on HDFS to access
  that common index data, setting the parameters below (and a few more):
  -Dsolr.directoryFactory=HdfsDirectoryFactory -Dsolr.lock.type=hdfs
  -Dsolr.data.dir=hdfs://host:port/path
  -Dsolr.updatelog=hdfs://host:port/path -jar
  Now could you tell me, is this the correct approach? If yes, how can I
  merge all indices from the 8 servers?
  Regards,
  Amey
  

How to make solr fault tolerant for query?

2014-09-12 Thread Amey Jadiye
Just a dumb question, but how can I make SolrCloud fault tolerant for queries?
I am asking because I have 12 different physical servers running 12 Solr
shards, and whenever any one of them goes down for any reason it gives me the
error below. I have 3 ZooKeepers for the 12 servers; all shards are leaders
with no replicas in this SolrCloud.
I have the option of using shards.tolerant=true, but this is slow and doesn't
give all results.
Best,
Amey
{
  "responseHeader": {
    "status": 503,
    "QTime": 7,
    "params": {
      "sort": "last_modified asc",
      "indent": "true",
      "q": "+links:[* TO *]",
      "_": "1410512274068",
      "wt": "json"
    }
  },
  "error": {
    "msg": "no servers hosting shard: ",
    "code": 503
  }
}
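
For reference, a sketch of the tolerant form of that query (host and
collection are placeholders; with shards.tolerant=true, Solr returns whatever
the live shards can provide instead of the 503, which is why some results go
missing):

curl "http://node1:8983/solr/collection1/select?q=%2Blinks%3A%5B*%20TO%20*%5D&sort=last_modified%20asc&shards.tolerant=true&wt=json"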

Moving to HDFS, How to merge indices from 8 servers?

2014-09-11 Thread Amey Jadiye
FYI, I searched Google for this problem but didn't find any satisfactory
answer. Here is the current situation: I have 8 shards in my SolrCloud,
backed by 3 ZooKeepers, all set up on AWS EC2 instances; all 8 are leaders
with no replicas. I have only 1 collection, say collection1, divided into 8
shards. I have configured the index and tlog folders on each server to point
to a 1TB EBS disk attached to each server, and all 8 servers have around
100GB in their index folder each, so in total I have ~800GB of index files.
Now I want to move all the data to HDFS, so I am going to:
- set up HDFS on all 8 servers
- merge all the indexes from the 8 servers
- put them in HDFS
- stop and start all my Solr servers on HDFS to access that common index
  data, setting the parameters below (and a few more):
-Dsolr.directoryFactory=HdfsDirectoryFactory -Dsolr.lock.type=hdfs
-Dsolr.data.dir=hdfs://host:port/path
-Dsolr.updatelog=hdfs://host:port/path -jar
Now could you tell me, is this the correct approach? If yes, how can I merge
all indices from the 8 servers?
Regards,
Amey
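
For what it's worth, a sketch of the full startup line those flags imply,
assuming the old java -jar start.jar style of launching Solr; the namenode
host, port, and paths are placeholders:

java -Dsolr.directoryFactory=HdfsDirectoryFactory \
     -Dsolr.lock.type=hdfs \
     -Dsolr.data.dir=hdfs://namenode:8020/solr \
     -Dsolr.updatelog=hdfs://namenode:8020/solr \
     -jar start.jar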