Re: language specific fields of text
You should use the language detection processor factory, like below:

<processor class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
  <str name="langid.fl">content</str>
  <str name="langid.langField">language</str>
  <str name="langid.fallback">en</str>
  <str name="langid.map">true</str>
  <str name="langid.map.fl">content,fullname</str>
  <str name="langid.map.keepOrig">true</str>
  <str name="langid.whitelist">en,fr,de,es,ru,it</str>
  <str name="langid.threshold">0.7</str>
</processor>

Once you have defined fields like content_en, content_fr etc., they will be filled in automatically according to the recognized language. See http://wiki.apache.org/solr/LanguageDetection -- View this message in context: http://lucene.472066.n3.nabble.com/language-specific-fields-of-text-tp3698985p4031180.html Sent from the Solr - User mailing list archive at Nabble.com.
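The per-language fields the reply mentions can be declared in schema.xml, for example as below (a sketch; the `text_en`/`text_fr`/`text_general` field type names are assumptions and must exist in your schema):

```xml
<!-- One explicit field per mapped language; langid.map routes content here -->
<field name="content_en" type="text_en" indexed="true" stored="true"/>
<field name="content_fr" type="text_fr" indexed="true" stored="true"/>
<!-- Or cover all mapped languages at once with a dynamic field -->
<dynamicField name="content_*" type="text_general" indexed="true" stored="true"/>
```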
Getting Lucene Query from Solr query (Or converting Solr Query to Lucene's query)
Is there a way to get Lucene's query from a Solr query? I have a requirement to search for terms in multiple heterogeneous indices. Presently, I am using the following approach:

try {
    Directory directory1 = FSDirectory.open(new File("E:\\database\\patient\\index"));
    Directory directory2 = FSDirectory.open(new File("E:\\database\\study\\index"));
    BooleanQuery myQuery = new BooleanQuery();
    myQuery.add(new TermQuery(new Term("PATIENT_GENDER", "Male")), BooleanClause.Occur.SHOULD);
    myQuery.add(new TermQuery(new Term("STUDY_DIVISION", "Cancer Center")), BooleanClause.Occur.SHOULD);
    int indexCount = 2;
    IndexReader[] indexReader = new IndexReader[indexCount];
    indexReader[0] = DirectoryReader.open(directory1);
    indexReader[1] = DirectoryReader.open(directory2);
    IndexSearcher searcher = new IndexSearcher(new MultiReader(indexReader));
    TopDocs col = searcher.search(myQuery, 10); // results
    ScoreDoc[] docs = col.scoreDocs;
} catch (IOException e) {
    e.printStackTrace();
}

Here, I need to create a TermQuery based on field names and their values. If I can get this boolean query directly from the Solr query q=PATIENT_GENDER:Male OR STUDY_DIVISION:"Cancer Center", that will save my coding effort. This is a simple example, but when we need to create more complex queries it will be a time-consuming and error-prone activity. So, is there a way to get the Lucene query from a Solr query? -- View this message in context: http://lucene.472066.n3.nabble.com/Getting-Lucense-Query-from-Solr-query-Or-converting-Solr-Query-to-Lucense-s-query-tp4031187.html
Re: theory of sets (first solution)
Hi, I found my own hack. It's based on a free interpretation of the function strdist(). Have:
- one multivalued field 'part_of'
- one unique field 'groupsort'

Index each item:
  For each group membership:
    add the groupid to 'part_of'
    concat the groupid and sortstring to a new string
    add this string to a csv list
  End
  add the csv list to 'groupsort'
End

Also have your own class that implements org.apache.lucene.search.spell.StringDistance, to generate a custom distance value. This class should:
- split the csv list
- find the element/string that starts with the given group id
- translate the rest (sortstring) to a float value

.../select?q=part_of:X&sort=strdist(X, groupsort, FQN) asc

FQN is the fully qualified name of your own class. (Remember to place the jar in a 'lib' defined in solrconfig.xml or add your own 'lib' entry.) Uwe (still looking for a smarter solution)
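The core of the distance class Uwe describes might look like this (a standalone sketch without the Lucene StringDistance interface; the "groupid_sortvalue" CSV entry layout is an assumption about his index format):

```java
// Sketch of the per-document "distance" used for group-local sorting.
// Assumes 'groupsort' holds entries like "X_0001,Y_0420", where the
// digits after '_' encode the sort position within that group.
public class GroupSortDistance {

    // Returns a float derived from the sort string of the matching group,
    // or Float.MAX_VALUE when the document carries no entry for that group
    // (so non-members sort last in ascending order).
    public static float distance(String groupId, String groupsortCsv) {
        for (String entry : groupsortCsv.split(",")) {
            if (entry.startsWith(groupId + "_")) {
                String sortPart = entry.substring(groupId.length() + 1);
                return Float.parseFloat(sortPart);
            }
        }
        return Float.MAX_VALUE; // not a member of this group
    }

    public static void main(String[] args) {
        System.out.println(distance("X", "X_0001,Y_0420")); // entry for group X
        System.out.println(distance("Z", "X_0001,Y_0420")); // no entry for Z
    }
}
```

In the real plugin this logic would live inside `getDistance(String s1, String s2)` of a `StringDistance` implementation, with s1 the group id and s2 the stored csv list.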
Re: Max number of core in Solr multi-core
Thank you for your responses. I have one more question related to Solr multi-core. Using SolrJ, I create a new core for each application. When a user wants to add data or make a query on his application, I create a new HttpSolrServer for this core. In this scenario there will be many running HttpSolrServer instances. Is there a better solution? Does running many instances at the same time cause a problem? On Wed, Jan 2, 2013 at 5:35 PM, Per Steffensen st...@designware.dk wrote: g a collection per application instead of a core
Re: Problem occurred in solr cloud set up org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request
This is all quite strange, lots of people are using SolrCloud, some with very large clusters, so I'm guessing it's something in your setup that isn't obvious. How certain are you that your network between the two machines is reliable? And have you tried with a nightly build? I'm grasping at straws because there's nothing obvious in what you've told us so far, and I'm certain others aren't encountering this problem... Sorry I can't be more help, Erick On Sun, Jan 6, 2013 at 9:58 PM, yayati yayatirajpa...@gmail.com wrote: No live SolrServers
Re: Max number of core in Solr multi-core
This might help: https://wiki.apache.org/solr/Solrj#HttpSolrServer Note that the associated SolrRequest takes the path, I presume relative to the base URL you initialized the HttpSolrServer with. Best Erick On Mon, Jan 7, 2013 at 7:02 AM, Parvin Gasimzade parvin.gasimz...@gmail.com wrote: Thank you for your responses. I have one more question related to Solr multi-core. By using SolrJ I create new core for each application. When user wants to add data or make query on his application, I create new HttpSolrServer for this core. In this scenario there will be many running HttpSolrServer instances. Is there a better solution? Does it cause a problem to run many instances at the same time? On Wed, Jan 2, 2013 at 5:35 PM, Per Steffensen st...@designware.dk wrote: g a collection per application instead of a core
Re: custom solr sort
On 06.01.2013 02:32, andy wrote: I want to custom solr sort and pass solr param from client to solr server, Hi Andy, not an answer to your question, but maybe another approach to solve your initial problem. Instead of writing a new SearchComponent I decided to (mis)use the function http://wiki.apache.org/solr/FunctionQuery#strdist 'strdist' seems to have everything you need: - a parameter 's1' - a fieldname 's2' - a slot to plug in your own algo. How to use this to sort on multivalued attributes, I've described in this list in the thread "theory of sets". Uwe
Re: custom solr sort
Can you explain why you want to implement a different sort first? There may be other ways of achieving the same thing. Upayavira On Sun, Jan 6, 2013, at 01:32 AM, andy wrote: Hi, Maybe this is an old thread or maybe it's different from the previous one. I want to customize Solr sorting and pass a Solr param from the client to the Solr server, so I implemented a SearchComponent named MySortComponent in my code, and also implemented FieldComparatorSource and FieldComparator. When I use the "mysearch" request handler (see the following code), I found that the custom sort only affects the current page when I get multi-page results, but the sort is as expected when I set rows to contain all the results. Does anybody know how to solve this, or the reason? Code snippet:

public class MySortComponent extends SearchComponent implements SolrCoreAware {

    private RestTemplate restTemplate = new RestTemplate();

    public void inform(SolrCore arg0) {
    }

    @Override
    public void prepare(ResponseBuilder rb) throws IOException {
        SolrParams params = rb.req.getParams();
        String uid = params.get("uid");
        MyComparatorSource comparator = new MyComparatorSource(uid);
        SortSpec sortSpec = rb.getSortSpec();
        if (sortSpec.getSort() == null) {
            sortSpec.setSort(new Sort(new SortField[] {
                new SortField("relation", comparator), SortField.FIELD_SCORE }));
        } else {
            SortField[] current = sortSpec.getSort().getSort();
            ArrayList<SortField> sorts = new ArrayList<SortField>(current.length + 1);
            sorts.add(new SortField("relation", comparator));
            for (SortField sf : current) {
                sorts.add(sf);
            }
            sortSpec.setSort(new Sort(sorts.toArray(new SortField[sorts.size()])));
        }
    }

    @Override
    public void process(ResponseBuilder rb) throws IOException {
    }

    // -
    // SolrInfoMBean
    // -
    @Override
    public String getDescription() {
        return "Custom Sorting";
    }

    @Override
    public String getSource() {
        return "";
    }

    @Override
    public URL[] getDocs() {
        try {
            return new URL[] { new URL("http://wiki.apache.org/solr/QueryComponent") };
        } catch (MalformedURLException e) {
            throw new RuntimeException(e);
        }
    }

    public class MyComparatorSource extends FieldComparatorSource {
        private BitSet dg1;
        private BitSet dg2;
        private BitSet dg3;

        public MyComparatorSource(String uid) throws IOException {
            SearchResponse responseBody = restTemplate.postForObject(
                "http://search.test.com/userid/search/" + uid, null, SearchResponse.class);
            String d1 = responseBody.getOneDe();
            String d2 = responseBody.getTwoDe();
            String d3 = responseBody.getThreeDe();
            if (StringUtils.hasLength(d1)) {
                byte[] bytes = Base64.decodeBase64(d1);
                dg1 = BitSetHelper.loadFromBzip2ByteArray(bytes);
            }
            if (StringUtils.hasLength(d2)) {
                byte[] bytes = Base64.decodeBase64(d2);
                dg2 = BitSetHelper.loadFromBzip2ByteArray(bytes);
            }
            if (StringUtils.hasLength(d3)) {
                byte[] bytes = Base64.decodeBase64(d3);
                dg3 = BitSetHelper.loadFromBzip2ByteArray(bytes);
            }
        }

        @Override
        public FieldComparator newComparator(String fieldname, final int numHits,
                int sortPos, boolean reversed) throws IOException {
            return new RelationComparator(fieldname, numHits);
        }

        class RelationComparator extends FieldComparator {
            private int[] uidDoc;
            private float[] values;
            private float bottom;
            String fieldName;

            public RelationComparator(String fieldName, int numHits) throws IOException {
                values = new float[numHits];
                this.fieldName = fieldName;
            }

            @Override
            public int compare(int slot1, int slot2) {
                if (values[slot1] < values[slot2]) return -1;
                if (values[slot1] > values[slot2]) return 1;
                return 0;
            }

            @Override
            public int compareBottom(int doc) throws IOException {
                float docDistance = getRelation(doc);
Re: Max number of core in Solr multi-core
I know that but my question is different. Let me ask it in this way. I have a Solr with base url localhost:8998/solr and two Solr cores at localhost:8998/solr/core1 and localhost:8998/solr/core2. I have one base SolrServer instance initialized as: SolrServer server = new HttpSolrServer( url ); I have also created SolrServers for each core as: SolrServer core1 = new HttpSolrServer( url + "/core1" ); SolrServer core2 = new HttpSolrServer( url + "/core2" ); Since there are many cores, I have to initialize a SolrServer as shown above. Is there a way to create only one SolrServer with the base url and access each core using it? If it is possible, then I don't need to create a new SolrServer for each core. On Mon, Jan 7, 2013 at 2:39 PM, Erick Erickson erickerick...@gmail.com wrote: This might help: https://wiki.apache.org/solr/Solrj#HttpSolrServer Note that the associated SolrRequest takes the path, I presume relative to the base URL you initialized the HttpSolrServer with. Best Erick On Mon, Jan 7, 2013 at 7:02 AM, Parvin Gasimzade parvin.gasimz...@gmail.com wrote: Thank you for your responses. I have one more question related to Solr multi-core. By using SolrJ I create a new core for each application. When a user wants to add data or make a query on his application, I create a new HttpSolrServer for this core. In this scenario there will be many running HttpSolrServer instances. Is there a better solution? Does it cause a problem to run many instances at the same time? On Wed, Jan 2, 2013 at 5:35 PM, Per Steffensen st...@designware.dk wrote: g a collection per application instead of a core
Re: Getting Lucene Query from Solr query (Or converting Solr Query to Lucene's query)
If you are inside Solr, as it seems to be the case, you can do this:

QParserPlugin qplug = req.getCore().getQueryPlugin(LuceneQParserPlugin.NAME);
QParser parser = qplug.createParser("PATIENT_GENDER:Male OR STUDY_DIVISION:\"Cancer Center\"", null, req.getParams(), req);
Query q = parser.parse();

Maybe there is a one-line call to get the parser from the Solr core, but I can't find it now. Have a look at one of the subclasses of QParser. --roman On Mon, Jan 7, 2013 at 4:27 AM, Sabeer Hussain shuss...@del.aithent.com wrote: Is there a way to get Lucene's query from a Solr query? I have a requirement to search for terms in multiple heterogeneous indices. Presently, I am using the following approach:

try {
    Directory directory1 = FSDirectory.open(new File("E:\\database\\patient\\index"));
    Directory directory2 = FSDirectory.open(new File("E:\\database\\study\\index"));
    BooleanQuery myQuery = new BooleanQuery();
    myQuery.add(new TermQuery(new Term("PATIENT_GENDER", "Male")), BooleanClause.Occur.SHOULD);
    myQuery.add(new TermQuery(new Term("STUDY_DIVISION", "Cancer Center")), BooleanClause.Occur.SHOULD);
    int indexCount = 2;
    IndexReader[] indexReader = new IndexReader[indexCount];
    indexReader[0] = DirectoryReader.open(directory1);
    indexReader[1] = DirectoryReader.open(directory2);
    IndexSearcher searcher = new IndexSearcher(new MultiReader(indexReader));
    TopDocs col = searcher.search(myQuery, 10); // results
    ScoreDoc[] docs = col.scoreDocs;
} catch (IOException e) {
    e.printStackTrace();
}

Here, I need to create a TermQuery based on field names and their values. If I can get this boolean query directly from the Solr query q=PATIENT_GENDER:Male OR STUDY_DIVISION:"Cancer Center", that will save my coding effort. This is a simple example, but when we need to create more complex queries it will be a time-consuming and error-prone activity. So, is there a way to get the Lucene query from a Solr query?
RE: Max number of core in Solr multi-core
This is the exact approach we use in our multithreaded env. One server per core. I think this is the recommended approach. -Original Message- From: Parvin Gasimzade [mailto:parvin.gasimz...@gmail.com] Sent: Monday, January 07, 2013 7:00 AM To: solr-user@lucene.apache.org Subject: Re: Max number of core in Solr multi-core I know that but my question is different. Let me ask it in this way. I have a Solr with base url localhost:8998/solr and two Solr cores at localhost:8998/solr/core1 and localhost:8998/solr/core2. I have one base SolrServer instance initialized as: SolrServer server = new HttpSolrServer( url ); I have also created SolrServers for each core as: SolrServer core1 = new HttpSolrServer( url + "/core1" ); SolrServer core2 = new HttpSolrServer( url + "/core2" ); Since there are many cores, I have to initialize a SolrServer as shown above. Is there a way to create only one SolrServer with the base url and access each core using it? If it is possible, then I don't need to create a new SolrServer for each core. On Mon, Jan 7, 2013 at 2:39 PM, Erick Erickson erickerick...@gmail.com wrote: This might help: https://wiki.apache.org/solr/Solrj#HttpSolrServer Note that the associated SolrRequest takes the path, I presume relative to the base URL you initialized the HttpSolrServer with. Best Erick On Mon, Jan 7, 2013 at 7:02 AM, Parvin Gasimzade parvin.gasimz...@gmail.com wrote: Thank you for your responses. I have one more question related to Solr multi-core. By using SolrJ I create a new core for each application. When a user wants to add data or make a query on his application, I create a new HttpSolrServer for this core. In this scenario there will be many running HttpSolrServer instances. Is there a better solution? Does it cause a problem to run many instances at the same time? On Wed, Jan 2, 2013 at 5:35 PM, Per Steffensen st...@designware.dk wrote: g a collection per application instead of a core
RE: RE: Max number of core in Solr multi-core
This should be clarified some. In the client API, SolrServer represents a connection to a single server backend/endpoint and should be re-used where possible. The approach being discussed is to have one client connection (represented by the SolrServer class) per Solr core, all residing in a single Solr server (as is the case below, but not required).

--- Original Message --- On 1/7/2013 08:06 AM Jay Parashar wrote: This is the exact approach we use in our multithreaded env. One server per core. I think this is the recommended approach. -Original Message- From: Parvin Gasimzade [mailto:parvin.gasimz...@gmail.com] Sent: Monday, January 07, 2013 7:00 AM To: solr-user@lucene.apache.org Subject: Re: Max number of core in Solr multi-core I know that but my question is different. Let me ask it in this way. I have a Solr with base url localhost:8998/solr and two Solr cores at localhost:8998/solr/core1 and localhost:8998/solr/core2. I have one base SolrServer instance initialized as: SolrServer server = new HttpSolrServer( url ); I have also created SolrServers for each core as: SolrServer core1 = new HttpSolrServer( url + "/core1" ); SolrServer core2 = new HttpSolrServer( url + "/core2" ); Since there are many cores, I have to initialize a SolrServer as shown above. Is there a way to create only one SolrServer with the base url and access each core using it? If it is possible, then I don't need to create a new SolrServer for each core. On Mon, Jan 7, 2013 at 2:39 PM, Erick Erickson erickerick...@gmail.com wrote: This might help: https://wiki.apache.org/solr/Solrj#HttpSolrServer Note that the associated SolrRequest takes the path, I presume relative to the base URL you initialized the HttpSolrServer with. Best Erick On Mon, Jan 7, 2013 at 7:02 AM, Parvin Gasimzade parvin.gasimz...@gmail.com wrote: Thank you for your responses. I have one more question related to Solr multi-core. By using SolrJ I create a new core for each application. When a user wants to add data or make a query on his application, I create a new HttpSolrServer for this core. In this scenario there will be many running HttpSolrServer instances. Is there a better solution? Does it cause a problem to run many instances at the same time? On Wed, Jan 2, 2013 at 5:35 PM, Per Steffensen st...@designware.dk wrote: g a collection per application instead of a core
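The reuse pattern being discussed can be sketched as a lazy per-core client cache (a sketch with a stub client class standing in for SolrJ's HttpSolrServer, which is thread-safe and meant to be shared rather than created per request; the class and method names here are illustrative assumptions):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Sketch: create one client per core lazily and reuse it for every request,
// instead of constructing a new client on each user action.
public class CoreClientCache {

    // Stand-in for SolrJ's HttpSolrServer: one instance per base URL.
    static class SolrClientStub {
        final String baseUrl;
        SolrClientStub(String baseUrl) { this.baseUrl = baseUrl; }
    }

    private final String baseUrl;
    private final ConcurrentMap<String, SolrClientStub> clients = new ConcurrentHashMap<>();

    public CoreClientCache(String baseUrl) { this.baseUrl = baseUrl; }

    // Returns the cached client for a core, creating it on first use only.
    public SolrClientStub forCore(String core) {
        return clients.computeIfAbsent(core, c -> new SolrClientStub(baseUrl + "/" + c));
    }

    public static void main(String[] args) {
        CoreClientCache cache = new CoreClientCache("http://localhost:8998/solr");
        SolrClientStub a = cache.forCore("core1");
        SolrClientStub b = cache.forCore("core1"); // same cached instance
        System.out.println(a == b);                // true
        System.out.println(a.baseUrl);
    }
}
```

With SolrJ the cached value would simply be `new HttpSolrServer(baseUrl + "/" + core)`, so each core still gets its own client object, but only one per core for the life of the application.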
Re: Sorting on multivalued fields still impossible?
Hi Jack, thank you for the hint. Since I already have a solrj client to do the preprocessing, mapping to sort fields isn't my problem. I will try to explain better in my reply to Erick. Uwe (Sorry for the late reaction) On 30.08.2012 16:04, Jack Krupansky wrote: You can also use a Field Mutating Update Processor to do a smart copy of a multi-valued field to a sortable single-valued field. See: http://wiki.apache.org/solr/UpdateRequestProcessor#Field_Mutating_Update_Processors Such as using the maximum value via MaxFieldValueUpdateProcessorFactory. See: http://lucene.apache.org/solr/api-4_0_0-BETA/org/apache/solr/update/processor/MaxFieldValueUpdateProcessorFactory.html Which value of a multi-valued field do you wish to sort by? -- Jack Krupansky
Re: Sorting on multivalued fields still impossible?
On 31.08.2012 13:35, Erick Erickson wrote: ... what would the correct behavior be for sorting on a multivalued field Hi Erick, in general you are right: the question with multivalued fields is which value is the reference. But there are thousands of cases where this question is implicitly answered. See my example ...sort=max(datefield) desc It is obvious that the newest date should win. I see no reason why simple functions like max can't handle multivalued fields. Now, four months later, I still wonder why there is no pluggable function to map multivalued fields to a single value, e.g. ...sort=sqrt(mapMultipleToOne(FQN, fieldname)) asc... Uwe (Sorry for the late reaction)
Re: Sorting on multivalued fields still impossible?
If the multiple-to-one mapping would be stable (e.g. independent of a query), why not implement it as a custom update.chain processor with a copy to a separate field? There are already a couple of implementations under FieldValueMutatingUpdateProcessor (first, last, max, min). Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Mon, Jan 7, 2013 at 8:19 AM, Uwe Reh r...@hebis.uni-frankfurt.de wrote: On 31.08.2012 13:35, Erick Erickson wrote: ... what would the correct behavior be for sorting on a multivalued field Hi Erick, in general you are right: the question with multivalued fields is which value is the reference. But there are thousands of cases where this question is implicitly answered. See my example ...sort=max(datefield) desc It is obvious that the newest date should win. I see no reason why simple functions like max can't handle multivalued fields. Now, four months later, I still wonder why there is no pluggable function to map multivalued fields to a single value, e.g. ...sort=sqrt(mapMultipleToOne(FQN, fieldname)) asc... Uwe (Sorry for the late reaction)
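The update-chain approach Alexandre describes might be wired up in solrconfig.xml roughly like this (a sketch for Solr 4.x; the chain name and the `date`/`date_max` field names are assumptions):

```xml
<updateRequestProcessorChain name="multivalued-sort-copy">
  <!-- copy the multivalued 'date' field into 'date_max' at index time -->
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">date</str>
    <str name="dest">date_max</str>
  </processor>
  <!-- keep only the maximum of the copied values, so 'date_max' becomes
       single-valued and sortable -->
  <processor class="solr.MaxFieldValueUpdateProcessorFactory">
    <str name="fieldName">date_max</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```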
Re: Sorting on multivalued fields still impossible?
Hi, like I just wrote in my reply to the similar suggestion from Jack: I'm not looking for a way to preprocess my data. My question is why I need two redundant fields to sort a multivalued field ('date_max' and 'date_min' for 'date'). For me it's just a waste of space, poisoning the fieldcache. There is also another class of problems where a filter function like 'mapMultipleToOne' may be helpful. In the thread 'theory of sets' (this list) I described a hack with the function strdist, my own class, and the mapping of multiple values as a csv list in a single-value field. Uwe On 07.01.2013 14:54, Alexandre Rafalovitch wrote: If the multiple-to-one mapping would be stable (e.g. independent of a query), why not implement it as a custom update.chain processor with a copy to a separate field? There are already a couple of implementations under FieldValueMutatingUpdateProcessor (first, last, max, min). Regards, Alex.
SOLR Cloud : what is the best backup/restore strategy ?
Hello, Using a SOLR Cloud architecture, what is the best procedure to back up and restore the SOLR index and configuration? Thanks, Guillaume
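One commonly used building block on Solr 4 is the replication handler's backup command (a sketch, not a complete SolrCloud-wide procedure: it snapshots a single core, so it must be repeated per shard leader, and the host, core name, and backup location below are assumptions):

```shell
#!/bin/sh
# Build the per-core backup call for the Solr 4 replication handler.
# On SolrCloud, configuration (solrconfig.xml, schema.xml) lives in
# ZooKeeper and must be backed up separately from the index.
SOLR_HOST="http://127.0.0.1:8983"
CORE="collection1"
BACKUP_URL="${SOLR_HOST}/solr/${CORE}/replication?command=backup&location=/backups/solr"
echo "$BACKUP_URL"
# To actually trigger the snapshot (commented out here):
# curl "$BACKUP_URL"
```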
RE: theory of sets
Hi Uwe, We have hundreds of dynamic fields, but since most of our docs only use some of them it doesn't seem to be a performance drag. They can be viewed as a sparse matrix of fields in your indexed docs. Then if you make the sortinfo_for_groupx an int, it could be used in a function query to perform your sorting. See http://wiki.apache.org/solr/FunctionQuery Robi -Original Message- From: Uwe Reh [mailto:r...@hebis.uni-frankfurt.de] Sent: Thursday, January 03, 2013 1:10 PM To: solr-user@lucene.apache.org Subject: theory of sets Hi, I'm looking for a tricky solution to a common problem. I have to handle a lot of items, and each could be a member of several groups. - OK, just add a field called 'member_of'. No, that's not enough, because each group is sorted and each member has a sortstring for this group. - OK, still easy: add a dynamic field 'sortinfo_for_*' and fill it for each group membership. Yes, this works, but there are thousands of different groups; that many dynamic fields are probably a serious performance issue. - Well ... I'm looking for a smart way to answer the question "Find the members of group X and sort them by the sortstring for this group." One idea I had was to fill the 'member_of' field with composed entries (groupname + "_" + sortstring). Finding the members is easy with wildcards, but there seems to be no way to use the sortstring as a boost factor. Has anybody solved this problem? Any hints are welcome. Uwe
Re: theory of sets
Hi Robi, thank you for the contribution. It's exciting to read that your index isn't contaminated by the number of fields. I can't exclude other mistakes, but my first experience with extensive use of dynamic fields was very poor response times. Even though I found another solution, I should give the straightforward solution a second chance. Uwe On 07.01.2013 17:40, Petersen, Robert wrote: Hi Uwe, We have hundreds of dynamic fields but since most of our docs only use some of them it doesn't seem to be a performance drag. They can be viewed as a sparse matrix of fields in your indexed docs. Then if you make the sortinfo_for_groupx an int then that could be used in a function query to perform your sorting. See http://wiki.apache.org/solr/FunctionQuery
No live SolrServers Solr 4 exceptions on trying to create a collection
Any clue to why this is happening will be greatly appreciated. This has become a blocker for me. I can use the HttpSolrServer to create a core/make requests etc., but then it behaves like Solr 3.6 (http://host:port/solr/admin/cores and not http://host:port/solr/admin/collections). With my setup (4 servers running at localhost 8983, 8900, 7574 and 7500), when I manually do a http://127.0.0.1:7500/solr/admin/cores?action=CREATE&name=myColl1&instanceDir=default&dataDir=myColl1Data&collection=myColl1&numShards=2 it creates the collection only at the 7500 server. This is similar to when I use HttpSolrServer (Solr 3.6 behavior). And of course when I initiate a http://127.0.0.1:7500/solr/admin/collections?action=CREATE&name=myColl2&instanceDir=default&dataDir=myColl2Data&collection=myColl2&numShards=2 as expected it creates the collection spread on 2 servers. I am failing to achieve the same with SolrJ. As in the code at the bottom of the mail, I use CloudSolrServer and get the "No live SolrServers" exception. Any help or direction on how to create collections (using the Collections API) with SolrJ will be highly appreciated. Regards Jay -Original Message- From: Jay Parashar [mailto:jparas...@itscape.com] Sent: Sunday, January 06, 2013 7:42 PM To: solr-user@lucene.apache.org Subject: RE: Solr 4 exceptions on trying to create a collection The exception "No live SolrServers" is being thrown when trying to create a new collection (code at the end of this mail). On the CloudSolrServer request method, we have this line: ClientUtils.appendMap(coll, slices, clusterState.getSlices(coll)); where coll is the new collection I am trying to create, and hence clusterState.getSlices(coll) is returning null. And then the loop over the slices which adds to the urlList never happens, and hence the LBHttpSolrServer created in the CloudSolrServer has a null url list in the constructor. This is giving the "No live SolrServers" exception. What am I missing?
Instead of passing the CloudSolrServer to create.process, if I pass the LBHttpSolrServer (server.getLbServer()), the collection gets created but only on one server. My code to create a new cloud server and new collection:

String[] urls = {"http://127.0.0.1:8983/solr/", "http://127.0.0.1:8900/solr/", "http://127.0.0.1:7500/solr/", "http://127.0.0.1:7574/solr/"};
CloudSolrServer server = new CloudSolrServer("127.0.0.1:2181", new LBHttpSolrServer(urls));
server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 5000);
server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.SO_TIMEOUT, 2);
server.setDefaultCollection(collectionName);
server.connect();
CoreAdminRequest.Create create = new CoreAdminRequest.Create();
create.setCoreName("myColl");
create.setCollection("myColl");
create.setInstanceDir(defaultDir);
create.setDataDir("myCollData");
create.setNumShards(2);
create.process(server); // Exception "No live SolrServers" is thrown here

Thanks Jay -Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: Friday, January 04, 2013 6:08 PM To: solr-user@lucene.apache.org Subject: Re: Solr 4 exceptions on trying to create a collection Tried Wireshark yet to see what host/port it is trying to connect to and why it fails? It is a complex tool, but well worth learning. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Fri, Jan 4, 2013 at 6:58 PM, Jay Parashar jparas...@itscape.com wrote: Thanks! I had a different version of httpclient in the classpath.
So the 2nd exception is gone but now I am back to the first one org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request -Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: Friday, January 04, 2013 4:21 PM To: solr-user@lucene.apache.org Subject: Re: Solr 4 exceptions on trying to create a collection For the second one: Wrong version of library on a classpath or multiple versions of library on the classpath which causes wrong classes with missing fields/variables? Or library interface baked in and the implementation is newer. Some sort of mismatch basically. Most probably in Apache http library. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Fri, Jan 4, 2013 at 4:34 PM, Jay Parashar jparas...@itscape.com wrote: Hi All, I am getting exceptions on trying to create a collection. Any help is appreciated. While trying to create a collection, I got this
Re: No live SolrServers Solr 4 exceptions on trying to create a collection
Hello! Can you share the command you use to start all four Solr servers? -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Any clue to why this is happening will be greatly appreciated. This has become a blocker for me. I can use the HttpSolrServer to create a core/make requests etc., but then it behaves like Solr 3.6 (http://host:port/solr/admin/cores and not http://host:port/solr/admin/collections). With my setup (4 servers running at localhost 8983, 8900, 7574 and 7500), when I manually do a http://127.0.0.1:7500/solr/admin/cores?action=CREATE&name=myColl1&instanceDir=default&dataDir=myColl1Data&collection=myColl1&numShards=2 it creates the collection only at the 7500 server. This is similar to when I use HttpSolrServer (Solr 3.6 behavior). And of course when I initiate a http://127.0.0.1:7500/solr/admin/collections?action=CREATE&name=myColl2&instanceDir=default&dataDir=myColl2Data&collection=myColl2&numShards=2 as expected it creates the collection spread on 2 servers. I am failing to achieve the same with SolrJ. As in the code at the bottom of the mail, I use CloudSolrServer and get the "No live SolrServers" exception. Any help or direction on how to create collections (using the Collections API) with SolrJ will be highly appreciated. Regards Jay -Original Message- From: Jay Parashar [mailto:jparas...@itscape.com] Sent: Sunday, January 06, 2013 7:42 PM To: solr-user@lucene.apache.org Subject: RE: Solr 4 exceptions on trying to create a collection The exception "No live SolrServers" is being thrown when trying to create a new collection (code at the end of this mail). On the CloudSolrServer request method, we have this line: ClientUtils.appendMap(coll, slices, clusterState.getSlices(coll)); where coll is the new collection I am trying to create, and hence clusterState.getSlices(coll) is returning null.
And then the loop over the slices which adds to the urlList never happens, and hence the LBHttpSolrServer created in the CloudSolrServer has a null url list in the constructor. This is giving the "No live SolrServers" exception. What am I missing? Instead of passing the CloudSolrServer to create.process, if I pass the LBHttpSolrServer (server.getLbServer()), the collection gets created but only on one server. My code to create a new cloud server and new collection:

String[] urls = {"http://127.0.0.1:8983/solr/", "http://127.0.0.1:8900/solr/", "http://127.0.0.1:7500/solr/", "http://127.0.0.1:7574/solr/"};
CloudSolrServer server = new CloudSolrServer("127.0.0.1:2181", new LBHttpSolrServer(urls));
server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 5000);
server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.SO_TIMEOUT, 2);
server.setDefaultCollection(collectionName);
server.connect();
CoreAdminRequest.Create create = new CoreAdminRequest.Create();
create.setCoreName("myColl");
create.setCollection("myColl");
create.setInstanceDir(defaultDir);
create.setDataDir("myCollData");
create.setNumShards(2);
create.process(server); // Exception "No live SolrServers" is thrown here

Thanks Jay -Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: Friday, January 04, 2013 6:08 PM To: solr-user@lucene.apache.org Subject: Re: Solr 4 exceptions on trying to create a collection Tried Wireshark yet to see what host/port it is trying to connect to and why it fails? It is a complex tool, but well worth learning. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Fri, Jan 4, 2013 at 6:58 PM, Jay Parashar jparas...@itscape.com wrote: Thanks!
I had a different version of httpclient in the classpath. So the 2nd exception is gone, but now I am back to the first one: org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request

-Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: Friday, January 04, 2013 4:21 PM To: solr-user@lucene.apache.org Subject: Re: Solr 4 exceptions on trying to create a collection

For the second one: a wrong version of a library on the classpath, or multiple versions of the library on the classpath, which causes wrong classes with missing fields/variables? Or the library interface is baked in and the implementation is newer. Some sort of mismatch, basically. Most probably in the Apache HTTP library. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
Re: Will SolrCloud always slice by ID hash?
Thanks guys. Yeah, separate rolling collections seem like the better way to go. -Scott On Sat, Dec 29, 2012 at 1:30 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: https://issues.apache.org/jira/browse/SOLR-4237
Re: No live SolrServers Solr 4 exceptions on trying to create a collection
On Jan 7, 2013, at 12:33 PM, Jay Parashar jparas...@itscape.com wrote: With my setup (4 servers running at localhost 8983, 8900, 7574 and 7500) when I manually do a http://127.0.0.1:7500/solr/admin/cores?action=CREATE&name=myColl1&instanceDir=default&dataDir=myColl1Data&collection=myColl1&numShards=2 it creates the collection only at the 7500 server. This is similar to when I use HttpSolrServer (Solr 3.6 behavior).

This only starts one core. If you want to use the CoreAdmin API you would need to make four calls, one to each server. If you want this done for you, you must use the Collections API - see the wiki: http://wiki.apache.org/solr/SolrCloud#Managing_collections_via_the_Collections_API - Mark
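The distinction Mark draws can be seen in the URLs themselves. A small sketch in plain Java (the class and its `buildUrl` helper are hypothetical; hostnames and parameter values are taken from the thread, parameter names from the SolrCloud wiki) that assembles the single Collections API call, with the `&` separators the archive above lost:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AdminUrls {
    // Joins a base URL and parameters with '?' and '&' -- the separators
    // that were stripped from the URLs quoted in this thread.
    static String buildUrl(String base, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(base).append('?');
        boolean first = true;
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (!first) sb.append('&');
            sb.append(e.getKey()).append('=').append(e.getValue());
            first = false;
        }
        return sb.toString();
    }

    static String collectionsCreate(String node, String name, int numShards) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("action", "CREATE");
        p.put("name", name);
        p.put("numShards", String.valueOf(numShards));
        return buildUrl(node + "/solr/admin/collections", p);
    }

    public static void main(String[] args) {
        // One Collections API call to any node fans out across the cluster,
        // whereas CoreAdmin (/solr/admin/cores) needs one call per server.
        System.out.println(collectionsCreate("http://127.0.0.1:7500", "myColl2", 2));
    }
}
```

Any live node can receive the Collections API request; the cluster distributes the core creation itself.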
RE: No live SolrServers Solr 4 exceptions on trying to create a collection
Hi Rafał, The following are the scripts, started in the same order (external ZooKeeper, 1 instance running at localhost:2181). I also tried with the embedded zk, with the same result.

#Start of Server 1
export SOLR_HOME=/home/apache-solr-4.0.0
cd shard1A
java \
  -Djetty.port=8983 \
  -Djetty.home=$SOLR_HOME/example/ \
  -Dsolr.solr.home=multicore \
  -Dbootstrap_confdir=./multicore/defaultCore/conf \
  -Dcollection.configName=defaultConfig \
  -DzkHost=localhost:2181 \
  -DnumShards=2 \
  -jar $SOLR_HOME/example/start.jar

#Start of Server 2
export SOLR_HOME=/home/apache-solr-4.0.0
cd shard2A
java \
  -Djetty.port=8900 \
  -Djetty.home=$SOLR_HOME/example/ \
  -Dsolr.solr.home=multicore \
  -DzkHost=localhost:2181 \
  -jar $SOLR_HOME/example/start.jar

#Start of Server 3
export SOLR_HOME=/home/apache-solr-4.0.0
cd shard1B
java \
  -Djetty.port=7574 \
  -Djetty.home=$SOLR_HOME/example/ \
  -Dsolr.solr.home=multicore \
  -DzkHost=localhost:2181 \
  -jar $SOLR_HOME/example/start.jar

#Start of Server 4
export SOLR_HOME=/home/apache-solr-4.0.0
cd shard2B
java \
  -Djetty.port=7500 \
  -Djetty.home=$SOLR_HOME/example/ \
  -Dsolr.solr.home=multicore \
  -DzkHost=localhost:2181 \
  -jar $SOLR_HOME/example/start.jar

Regards Jay

-Original Message- From: Rafał Kuć [mailto:r@solr.pl] Sent: Monday, January 07, 2013 11:44 AM To: solr-user@lucene.apache.org Subject: Re: No live SolrServers Solr 4 exceptions on trying to create a collection

Hello! Can you share the command you use to start all four Solr servers? -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch

Any clue to why this is happening will be greatly appreciated. This has become a blocker for me.
I can use the HttpSolrServer to create a core/make requests etc., but then it behaves like Solr 3.6 (http://host:port/solr/admin/cores and not http://host:port/solr/admin/collections). With my setup (4 servers running at localhost 8983, 8900, 7574 and 7500), when I manually do a http://127.0.0.1:7500/solr/admin/cores?action=CREATE&name=myColl1&instanceDir=default&dataDir=myColl1Data&collection=myColl1&numShards=2 it creates the collection only at the 7500 server. This is similar to when I use HttpSolrServer (Solr 3.6 behavior). And of course when I initiate a http://127.0.0.1:7500/solr/admin/collections?action=CREATE&name=myColl2&instanceDir=default&dataDir=myColl2Data&collection=myColl2&numShards=2 as expected it creates the collection spread on 2 servers. I am failing to achieve the same with SolrJ. As in the code at the bottom of the mail, I use CloudSolrServer and get the "No live SolrServers" exception. Any help or direction on how to create collections (using the Collections API) with SolrJ will be highly appreciated. Regards Jay

-Original Message- From: Jay Parashar [mailto:jparas...@itscape.com] Sent: Sunday, January 06, 2013 7:42 PM To: solr-user@lucene.apache.org Subject: RE: Solr 4 exceptions on trying to create a collection

The exception "No live SolrServers" is being thrown when trying to create a new collection (code at the end of this mail). On the CloudSolrServer request method we have this line: ClientUtils.appendMap(coll, slices, clusterState.getSlices(coll)); where coll is the new collection I am trying to create, and hence clusterState.getSlices(coll) is returning null. And then the loop over the slices which adds to the urlList never happens, so the LBHttpSolrServer created in the CloudSolrServer has a null url list in the constructor. This is giving the "No live SolrServers" exception. What am I missing?
Instead of passing the CloudSolrServer to create.process, if I pass the LBHttpSolrServer (server.getLbServer()), the collection gets created but only on one server.

My code to create a new cloud server and new collection:

String[] urls = {"http://127.0.0.1:8983/solr/", "http://127.0.0.1:8900/solr/", "http://127.0.0.1:7500/solr/", "http://127.0.0.1:7574/solr/"};
CloudSolrServer server = new CloudSolrServer("127.0.0.1:2181", new LBHttpSolrServer(urls));
server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 5000);
server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.SO_TIMEOUT, 2);
server.setDefaultCollection(collectionName);
server.connect();
CoreAdminRequest.Create create = new CoreAdminRequest.Create();
create.setCoreName("myColl");
create.setCollection("myColl");
create.setInstanceDir("defaultDir");
create.setDataDir("myCollData");
create.setNumShards(2);
create.process(server); // Exception "No live SolrServers" is thrown here

Thanks Jay

-Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: Friday, January 04, 2013 6:08 PM To: solr-user@lucene.apache.org Subject: Re: Solr 4 exceptions on trying to create a collection
RE: No live SolrServers Solr 4 exceptions on trying to create a collection
Right Mark, I am accessing the Collections API using SolrJ. This is where I am stuck. If I just use the Collections API over HTTP through the browser, the behavior is as expected. Is there an example of using the Collections API with SolrJ? My code looks like:

String[] urls = {"http://127.0.0.1:8983/solr/", "http://127.0.0.1:8900/solr/", "http://127.0.0.1:7500/solr/", "http://127.0.0.1:7574/solr/"};
CloudSolrServer server = new CloudSolrServer("127.0.0.1:2181", new LBHttpSolrServer(urls));
server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 5000);
server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.SO_TIMEOUT, 2);
server.setDefaultCollection(collectionName);
server.connect();
CoreAdminRequest.Create create = new CoreAdminRequest.Create();
create.setCoreName("myColl");
create.setCollection("myColl");
create.setInstanceDir("defaultDir");
create.setDataDir("myCollData");
create.setNumShards(2);
create.process(server); // Exception "No live SolrServers" is thrown here

Regards Jay

-Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Monday, January 07, 2013 11:57 AM To: solr-user@lucene.apache.org Subject: Re: No live SolrServers Solr 4 exceptions on trying to create a collection

On Jan 7, 2013, at 12:33 PM, Jay Parashar jparas...@itscape.com wrote: With my setup (4 servers running at localhost 8983, 8900, 7574 and 7500) when I manually do a http://127.0.0.1:7500/solr/admin/cores?action=CREATE&name=myColl1&instanceDir=default&dataDir=myColl1Data&collection=myColl1&numShards=2 it creates the collection only at the 7500 server. This is similar to when I use HttpSolrServer (Solr 3.6 behavior).

This only starts one core. If you want to use the CoreAdmin API you would need to make four calls, one to each server. If you want this done for you, you must use the Collections API - see the wiki: http://wiki.apache.org/solr/SolrCloud#Managing_collections_via_the_Collections_API - Mark
Re: No live SolrServers Solr 4 exceptions on trying to create a collection
Can you run the SolrJ client from another machine (so you go over the network) and put Wireshark in between? It will tell you if something is actually trying to connect or if the problem is even earlier. Otherwise, if you are on U*ix style machines, look into dtrace/truss to see the activity. On Windows machines look at Process Monitor from Sysinternals. These are all 'hammer' size tools, but if you are truly stuck, they could be a way forward. Good luck, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)

On Mon, Jan 7, 2013 at 1:20 PM, Jay Parashar jparas...@itscape.com wrote: Right Mark, I am accessing the Collections API using SolrJ. This is where I am stuck. If I just use the Collections API over HTTP through the browser, the behavior is as expected. Is there an example of using the Collections API with SolrJ?
My code looks like:

String[] urls = {"http://127.0.0.1:8983/solr/", "http://127.0.0.1:8900/solr/", "http://127.0.0.1:7500/solr/", "http://127.0.0.1:7574/solr/"};
CloudSolrServer server = new CloudSolrServer("127.0.0.1:2181", new LBHttpSolrServer(urls));
server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 5000);
server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.SO_TIMEOUT, 2);
server.setDefaultCollection(collectionName);
server.connect();
CoreAdminRequest.Create create = new CoreAdminRequest.Create();
create.setCoreName("myColl");
create.setCollection("myColl");
create.setInstanceDir("defaultDir");
create.setDataDir("myCollData");
create.setNumShards(2);
create.process(server); // Exception "No live SolrServers" is thrown here

Regards Jay

-Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Monday, January 07, 2013 11:57 AM To: solr-user@lucene.apache.org Subject: Re: No live SolrServers Solr 4 exceptions on trying to create a collection

On Jan 7, 2013, at 12:33 PM, Jay Parashar jparas...@itscape.com wrote: With my setup (4 servers running at localhost 8983, 8900, 7574 and 7500) when I manually do a http://127.0.0.1:7500/solr/admin/cores?action=CREATE&name=myColl1&instanceDir=default&dataDir=myColl1Data&collection=myColl1&numShards=2 it creates the collection only at the 7500 server. This is similar to when I use HttpSolrServer (Solr 3.6 behavior).

This only starts one core. If you want to use the CoreAdmin API you would need to make four calls, one to each server. If you want this done for you, you must use the Collections API - see the wiki: http://wiki.apache.org/solr/SolrCloud#Managing_collections_via_the_Collections_API - Mark
Re: How to size a SOLR Cloud
Hello FF, Something like SPM for Solr will help you understand what's making Solr slow - CPU maxed? Disk IO? Swapping? Caches too small? ... There are no general rules/recipes, but once you see what is going on we can provide guidance. Yes, you can have 1 or more replicas of a shard. Otis -- Solr & ElasticSearch Support http://sematext.com/

On Mon, Jan 7, 2013 at 12:14 PM, f.fourna...@gibmedia.fr f.fourna...@gibmedia.fr wrote: Hello, I'm new to Solr and I have a collection with 25 million records. I want to run this collection on SolrCloud (Solr 4.0) on Amazon EC2 instances. Currently I've configured 2 shards and 2 replicas per shard with Medium instances (4 GB, 1 CPU core), and response times are very long. How do I size the cloud (sharding, replicas, memory, CPU, ...) to get acceptable response times in my situation? More memory? More CPU? More shards? Do rules for sizing a Solr cloud exist? Is it possible to have more than 2 replicas on one shard? Is it relevant? Best regards FF
Re: SOLR Cloud : what is the best backup/restore strategy ?
Hi, There may be a better way, but stopping indexing and then using http://master_host:port/solr/replication?command=backup on each node may do the backup trick. I'd love to see how/if others do it. Otis -- Solr ElasticSearch Support http://sematext.com/ On Mon, Jan 7, 2013 at 10:33 AM, LEFEBVRE Guillaume guillaume.lefeb...@cegedim.fr wrote: Hello, Using a SOLR Cloud architecture, what is the best procedure to backup and restore SOLR index and configuration ? Thanks, Guillaume
Re: Sorting on mutivalued fields still impossible?
: My question is, why do i need two redundant fields to sort a multivalued field : ('date_max' and 'date_min' for 'date') : For me it's just a waste of space, poisoning the fieldcache.

how do two fields poison the fieldcache? ... if there was a function that could find the min or max value of a multi-valued field, it would need to construct an UnInvertedField of all N of the field values of each doc in order to find the min/max at query time -- by pre-computing a min_field and max_field at indexing time you only need FieldCaches for those 2 fields (where 2 <= N, and N may be very big). Generally speaking: most Solr use cases are willing to pay a slightly higher indexing cost (time/cpu) to have faster searches -- which answers your earlier question...

: Now four months later i still wonder, why there is no pluggable function to map multivalued fields into a single value.

...because no one has written/contributed these functions (because most people would rather pay that cost at indexing time)

-Hoss
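Hoss's trade-off — pay at indexing time so that query time stays cheap — can be sketched as follows. A hypothetical indexing-side step in plain Java (the field names `date`, `date_min` and `date_max` come from the thread; the `Map` stands in for a document about to be sent to Solr):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MinMaxFields {
    // Derive single-valued date_min/date_max from the multivalued date field,
    // so that sorting only ever needs caches for these two fields instead of
    // un-inverting all N values of every document at query time.
    static Map<String, Object> withMinMax(List<String> dates) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("date", dates);                      // multivalued, kept as-is
        // ISO-8601 UTC strings sort correctly lexicographically.
        doc.put("date_min", Collections.min(dates));
        doc.put("date_max", Collections.max(dates));
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = withMinMax(Arrays.asList(
                "2012-03-01T00:00:00Z", "2011-07-15T00:00:00Z", "2012-12-31T00:00:00Z"));
        System.out.println(doc.get("date_min") + " .. " + doc.get("date_max"));
    }
}
```

The same derivation could equally live in an UpdateRequestProcessor inside Solr; doing it in the client keeps the example self-contained.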
Re: SOLR Cloud : what is the best backup/restore strategy ?
You should be able to continue indexing fine - it will just keep a point-in-time snapshot around until the copy is done. So you can trigger a backup at any time to create a backup for that specific time, and keep indexing away, and the next night do the same thing. You will always have backed up to the point in time the backup command is received. - Mark On Jan 7, 2013, at 1:45 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, There may be a better way, but stopping indexing and then using http://master_host:port/solr/replication?command=backup on each node may do the backup trick. I'd love to see how/if others do it. Otis -- Solr & ElasticSearch Support http://sematext.com/ On Mon, Jan 7, 2013 at 10:33 AM, LEFEBVRE Guillaume guillaume.lefeb...@cegedim.fr wrote: Hello, Using a SOLR Cloud architecture, what is the best procedure to backup and restore SOLR index and configuration? Thanks, Guillaume
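The per-node approach Otis describes amounts to hitting the replication handler on every node. A minimal sketch in plain Java (the class is hypothetical; the node list matches the four-server setup discussed elsewhere in this digest, and `command=backup` is the replication-handler parameter from the URL above):

```java
public class BackupUrls {
    // Build the replication backup URL for one node; in practice each URL
    // would then be fetched (e.g. with HttpURLConnection) to trigger a snapshot.
    static String backupUrl(String node) {
        return node + "/solr/replication?command=backup";
    }

    public static void main(String[] args) {
        String[] nodes = {"http://127.0.0.1:8983", "http://127.0.0.1:8900",
                          "http://127.0.0.1:7574", "http://127.0.0.1:7500"};
        for (String node : nodes) {
            System.out.println(backupUrl(node));
        }
    }
}
```

Per Mark's follow-up, the command can be issued while indexing continues; the snapshot reflects the index at the moment the command is received.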
Re: No live SolrServers Solr 4 exceptions on trying to create a collection
http://127.0.0.1:7500/solr/admin/cores? Why did you paste that as the example then :) ?

4.0 has problems using the Collections API with the CloudSolrServer. You will be able to do it in 4.1, but for 4.0 you have to use an HttpSolrServer and pick a node to talk to. For 4.0, CloudSolrServer is just good for querying and updating. - Mark

On Jan 7, 2013, at 1:20 PM, Jay Parashar jparas...@itscape.com wrote: Right Mark, I am accessing the Collections API using SolrJ. This is where I am stuck. If I just use the Collections API over HTTP through the browser, the behavior is as expected. Is there an example of using the Collections API with SolrJ? My code looks like:

String[] urls = {"http://127.0.0.1:8983/solr/", "http://127.0.0.1:8900/solr/", "http://127.0.0.1:7500/solr/", "http://127.0.0.1:7574/solr/"};
CloudSolrServer server = new CloudSolrServer("127.0.0.1:2181", new LBHttpSolrServer(urls));
server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 5000);
server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.SO_TIMEOUT, 2);
server.setDefaultCollection(collectionName);
server.connect();
CoreAdminRequest.Create create = new CoreAdminRequest.Create();
create.setCoreName("myColl");
create.setCollection("myColl");
create.setInstanceDir("defaultDir");
create.setDataDir("myCollData");
create.setNumShards(2);
create.process(server); // Exception "No live SolrServers" is thrown here

Regards Jay

-Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Monday, January 07, 2013 11:57 AM To: solr-user@lucene.apache.org Subject: Re: No live SolrServers Solr 4 exceptions on trying to create a collection

On Jan 7, 2013, at 12:33 PM, Jay Parashar jparas...@itscape.com wrote: With my setup (4 servers running at localhost 8983, 8900, 7574 and 7500) when I manually do a http://127.0.0.1:7500/solr/admin/cores?action=CREATE&name=myColl1&instanceDir=default&dataDir=myColl1Data&collection=myColl1&numShards=2 it creates the collection only at the 7500 server. This is similar to when I use HttpSolrServer (Solr 3.6 behavior).

This only starts one core. If you want to use the CoreAdmin API you would need to make four calls, one to each server. If you want this done for you, you must use the Collections API - see the wiki: http://wiki.apache.org/solr/SolrCloud#Managing_collections_via_the_Collections_API - Mark
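The 4.0-era workaround Mark describes — pick one node and call the Collections API over plain HTTP instead of going through CloudSolrServer — could look roughly like this in plain Java (no SolrJ; the class, node address, and collection name are illustrative, and the request will only succeed against a running cluster):

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class CollectionsApiClient {
    static String createUrl(String node, String name, int numShards) {
        return node + "/solr/admin/collections?action=CREATE&name=" + name
                + "&numShards=" + numShards;
    }

    // Fire the request at the chosen node; any node in the cluster will do,
    // since the Collections API distributes the core creation itself.
    static int send(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(5000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) {
        String url = createUrl("http://127.0.0.1:7500", "myColl2", 2);
        System.out.println(url);
        try {
            System.out.println("HTTP " + send(url));
        } catch (IOException e) {
            System.out.println("no Solr node listening: " + e.getMessage());
        }
    }
}
```

With SolrJ one would do the equivalent through an HttpSolrServer pointed at that node; the HTTP request is the same either way.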
Re: SOLR Cloud : what is the best backup/restore strategy ?
Is it possible to restore an index (previously backed up) using the same kind of HTTP REST-like request? Something like ...solr/replication?command=restore ? On Mon, Jan 7, 2013 at 2:12 PM, Mark Miller markrmil...@gmail.com wrote: You should be able to continue indexing fine - it will just keep a point-in-time snapshot around until the copy is done. So you can trigger a backup at any time to create a backup for that specific time, and keep indexing away, and the next night do the same thing. You will always have backed up to the point in time the backup command is received. - Mark On Jan 7, 2013, at 1:45 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, There may be a better way, but stopping indexing and then using http://master_host:port/solr/replication?command=backup on each node may do the backup trick. I'd love to see how/if others do it. Otis -- Solr & ElasticSearch Support http://sematext.com/ On Mon, Jan 7, 2013 at 10:33 AM, LEFEBVRE Guillaume guillaume.lefeb...@cegedim.fr wrote: Hello, Using a SOLR Cloud architecture, what is the best procedure to backup and restore SOLR index and configuration? Thanks, Guillaume
RE: No live SolrServers Solr 4 exceptions on trying to create a collection
Thanks Mark! I will wait for 4.1 then. Actually I pasted both /admin/cores and /admin/collections to highlight that the problem was only with SolrJ, and both /admin/cores and /admin/collections were working as expected. Sorry for the confusion. Regards Jay

-Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Monday, January 07, 2013 1:14 PM To: solr-user@lucene.apache.org Subject: Re: No live SolrServers Solr 4 exceptions on trying to create a collection

http://127.0.0.1:7500/solr/admin/cores? Why did you paste that as the example then :) ? 4.0 has problems using the Collections API with the CloudSolrServer. You will be able to do it in 4.1, but for 4.0 you have to use an HttpSolrServer and pick a node to talk to. For 4.0, CloudSolrServer is just good for querying and updating. - Mark

On Jan 7, 2013, at 1:20 PM, Jay Parashar jparas...@itscape.com wrote: Right Mark, I am accessing the Collections API using SolrJ. This is where I am stuck. If I just use the Collections API over HTTP through the browser, the behavior is as expected. Is there an example of using the Collections API with SolrJ?
My code looks like:

String[] urls = {"http://127.0.0.1:8983/solr/", "http://127.0.0.1:8900/solr/", "http://127.0.0.1:7500/solr/", "http://127.0.0.1:7574/solr/"};
CloudSolrServer server = new CloudSolrServer("127.0.0.1:2181", new LBHttpSolrServer(urls));
server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 5000);
server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.SO_TIMEOUT, 2);
server.setDefaultCollection(collectionName);
server.connect();
CoreAdminRequest.Create create = new CoreAdminRequest.Create();
create.setCoreName("myColl");
create.setCollection("myColl");
create.setInstanceDir("defaultDir");
create.setDataDir("myCollData");
create.setNumShards(2);
create.process(server); // Exception "No live SolrServers" is thrown here

Regards Jay

-Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Monday, January 07, 2013 11:57 AM To: solr-user@lucene.apache.org Subject: Re: No live SolrServers Solr 4 exceptions on trying to create a collection

On Jan 7, 2013, at 12:33 PM, Jay Parashar jparas...@itscape.com wrote: With my setup (4 servers running at localhost 8983, 8900, 7574 and 7500) when I manually do a http://127.0.0.1:7500/solr/admin/cores?action=CREATE&name=myColl1&instanceDir=default&dataDir=myColl1Data&collection=myColl1&numShards=2 it creates the collection only at the 7500 server. This is similar to when I use HttpSolrServer (Solr 3.6 behavior).

This only starts one core. If you want to use the CoreAdmin API you would need to make four calls, one to each server. If you want this done for you, you must use the Collections API - see the wiki: http://wiki.apache.org/solr/SolrCloud#Managing_collections_via_the_Collections_API - Mark
Re: SOLR Cloud : what is the best backup/restore strategy ?
Not to my knowledge. You could do a delete-all and then merge the index in with the core admin API, but that would basically be a less efficient copy, rather than a straight file move. There is not currently a restore command, though. Also, keep in mind that unless you back up to a network store (or, I suppose, another disk drive or something), your backup is pretty precarious. - Mark On Jan 7, 2013, at 2:21 PM, Michel Dion diom...@gmail.com wrote: Is it possible to restore an index (previously backed up) using the same kind of HTTP REST-like request? Something like ...solr/replication?command=restore ? On Mon, Jan 7, 2013 at 2:12 PM, Mark Miller markrmil...@gmail.com wrote: You should be able to continue indexing fine - it will just keep a point-in-time snapshot around until the copy is done. So you can trigger a backup at any time to create a backup for that specific time, and keep indexing away, and the next night do the same thing. You will always have backed up to the point in time the backup command is received. - Mark On Jan 7, 2013, at 1:45 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, There may be a better way, but stopping indexing and then using http://master_host:port/solr/replication?command=backup on each node may do the backup trick. I'd love to see how/if others do it. Otis -- Solr & ElasticSearch Support http://sematext.com/ On Mon, Jan 7, 2013 at 10:33 AM, LEFEBVRE Guillaume guillaume.lefeb...@cegedim.fr wrote: Hello, Using a SOLR Cloud architecture, what is the best procedure to backup and restore SOLR index and configuration? Thanks, Guillaume
Solr cloud not starting properly. Only starts leaders.
Every time I stop my SolrCloud (3 shards, 1 replica each, total 6 servers) and then restart it I get the following error:

SEVERE: Error getting leader from zk
org.apache.solr.common.SolrException: Could not get leader props
    at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:709)
    at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:673)
    at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:638)
    at org.apache.solr.cloud.ZkController.register(ZkController.java:577)
    at org.apache.solr.cloud.ZkController.register(ZkController.java:532)
    at org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:709)
    at org.apache.solr.core.CoreContainer.register(CoreContainer.java:693)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:535)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:356)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:308)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:107)
    at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:278)
    at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:259)
    at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:383)
    at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:104)
    at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4650)
    at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5306)
    at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
    at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:901)
    at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:877)
    at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:633)
    at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:977)
    at org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1655)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /collections/productindex/leaders/shard1
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:102)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
    at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:244)
    at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:241)
    at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:63)
    at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:241)
    at org.apache.solr.cloud.ZkController.getLeaderProps(ZkController.java:687)
    ... 28 more

Jan 07, 2013 1:23:50 PM org.apache.solr.common.SolrException log
SEVERE: :org.apache.solr.common.SolrException: Error getting leader from zk
    at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:662)
    at org.apache.solr.cloud.ZkController.register(ZkController.java:577)
    at org.apache.solr.cloud.ZkController.register(ZkController.java:532)
    at org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:709)
    at org.apache.solr.core.CoreContainer.register(CoreContainer.java:693)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:535)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:356)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:308)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:107)
    at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:278)
    at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:259)
    at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:383)
    at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:104)
    at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4650)
    at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5306)
Re: SOLR Cloud : what is the best backup/restore strategy ?
There's no problem with indexing while taking a snapshot. The only issue I found is a problem with the index directory: https://issues.apache.org/jira/browse/SOLR-4170 It looks like Solr always looks in the .../data/index/ directory without reading the index.properties file (sometimes your index dir name can be like index.date). So far it's easy to find a workaround for this, and it should work in 4.1 hopefully. As far as I checked, the snapshotting process is harmless for Solr performance, so it is reliable. In case of index recovery you can use files from your last snapshot and just send updates newer than this. At least, that's what I do and it works pretty fine. Regards. On 7 January 2013 20:27, Mark Miller markrmil...@gmail.com wrote: Not to my knowledge. You could do a delete-all and then merge the index in with the core admin API, but that would basically be a less efficient copy, rather than a straight file move. There is not currently a restore command, though. Also, keep in mind that unless you back up to a network store (or, I suppose, another disk drive or something), your backup is pretty precarious. - Mark On Jan 7, 2013, at 2:21 PM, Michel Dion diom...@gmail.com wrote: Is it possible to restore an index (previously backed up) using the same kind of HTTP REST-like request? Something like ...solr/replication?command=restore ? On Mon, Jan 7, 2013 at 2:12 PM, Mark Miller markrmil...@gmail.com wrote: You should be able to continue indexing fine - it will just keep a point-in-time snapshot around until the copy is done. So you can trigger a backup at any time to create a backup for that specific time, and keep indexing away, and the next night do the same thing. You will always have backed up to the point in time the backup command is received.
- Mark On Jan 7, 2013, at 1:45 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, There may be a better way, but stopping indexing and then using http://master_host:port/solr/replication?command=backup on each node may do the backup trick. I'd love to see how/if others do it. Otis -- Solr ElasticSearch Support http://sematext.com/ On Mon, Jan 7, 2013 at 10:33 AM, LEFEBVRE Guillaume guillaume.lefeb...@cegedim.fr wrote: Hello, Using a SOLR Cloud architecture, what is the best procedure to backup and restore SOLR index and configuration ? Thanks, Guillaume
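The SOLR-4170 gotcha mentioned in this thread is easy to guard against when restoring by hand: read index.properties instead of assuming data/index. A small sketch in plain Java (the `index` property key is my understanding of what Solr writes into index.properties to name the active directory — treat that as an assumption to verify against your version; the file contents below are simulated):

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

public class IndexDirResolver {
    // Returns the active index directory named in index.properties,
    // falling back to the conventional "index" when the file is absent.
    static String resolve(Reader indexProperties) throws IOException {
        Properties props = new Properties();
        if (indexProperties != null) {
            props.load(indexProperties);
        }
        return props.getProperty("index", "index");
    }

    public static void main(String[] args) throws IOException {
        // Simulated contents of data/index.properties after a recovery.
        String contents = "index=index.20130107192000";
        System.out.println(resolve(new StringReader(contents)));
        System.out.println(resolve(null)); // no file -> default "index" dir
    }
}
```

Restoring into whichever directory the properties file names (rather than always data/index) sidesteps the mismatch described in the JIRA issue.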
Re: theory of sets
Dynamic fields resulted in poor response times? How many fields did each document have? I can't see how a dynamic field should make any difference from any other field in terms of response time. Or are you querying across a large number of dynamic fields concurrently? I can imagine that slowing things down. Upayavira

On Mon, Jan 7, 2013, at 05:18 PM, Uwe Reh wrote: Hi Robi, thank you for the contribution. It's exciting to read that your index isn't contaminated by the number of fields. I can't exclude other mistakes, but my first experience with extensive use of dynamic fields was very poor response times. Even though I found another solution, I should give the straightforward solution a second chance. Uwe

On 07.01.2013 17:40, Petersen, Robert wrote: Hi Uwe, We have hundreds of dynamic fields, but since most of our docs only use some of them it doesn't seem to be a performance drag. They can be viewed as a sparse matrix of fields in your indexed docs. Then if you make the sortinfo_for_groupx an int, that could be used in a function query to perform your sorting. See http://wiki.apache.org/solr/FunctionQuery
RE: theory of sets
Hi, Just a thought on this possibility: I think the dynamic field is a Solr concept; at the Lucene level all fields are the same, but at initial startup Lucene should load all field information into memory (not field data, but the schema). If we have too many fields (like *_my_fields, * = a1, a2, ...), does this take too much memory and slow down performance (even if very few fields are really used)? Best regards, Lisheng

-Original Message- From: Upayavira [mailto:u...@odoko.co.uk] Sent: Monday, January 07, 2013 2:57 PM To: solr-user@lucene.apache.org Subject: Re: theory of sets

Dynamic fields resulted in poor response times? How many fields did each document have? I can't see how a dynamic field should make any difference from any other field in terms of response time. Or are you querying across a large number of dynamic fields concurrently? I can imagine that slowing things down. Upayavira

On Mon, Jan 7, 2013, at 05:18 PM, Uwe Reh wrote: Hi Robi, thank you for the contribution. It's exciting to read that your index isn't contaminated by the number of fields. I can't exclude other mistakes, but my first experience with extensive use of dynamic fields was very poor response times. Even though I found another solution, I should give the straightforward solution a second chance. Uwe

On 07.01.2013 17:40, Petersen, Robert wrote: Hi Uwe, We have hundreds of dynamic fields, but since most of our docs only use some of them it doesn't seem to be a performance drag. They can be viewed as a sparse matrix of fields in your indexed docs. Then if you make the sortinfo_for_groupx an int, that could be used in a function query to perform your sorting. See http://wiki.apache.org/solr/FunctionQuery
Re: When does Solr actually convert textual representation into non-text formats (e.g. Date)
: Subject: When does Solr actually convert textual representation into non-text : formats (e.g. Date) The short answer is: any place you want. At the lowest level, FieldTypes are required to support converting (legal) String values into whatever native java object best represents their type -- but they are also allowed/encouraged to accept objects of that native type directly and use them as is. In some cases, like with the XmlUpdateRequestHandler code, the raw string input is left as is and passed down to the FieldType, because the RequestHandler's parsing code shouldn't make assumptions about the field types -- in other cases, like the JavaBinUpdateRequestHandler, the type info comes along with the data, so it can easily pass the Integer/Date/Whatever on to the FieldType. In between, things like UpdateRequestProcessors can convert from String to Date or vice versa as they see fit. As for DIH: i'm not entirely sure of all of the places where a String might be converted to a Date ... i think there are special transformers for that, but when dealing with things like jdbc datasources you frequently get a true Date object back from the jdbc connection and i *think* DIH uses those Date objects as is. : 4) copyField copyField is not something i've ever considered in this context ... i genuinely don't know what would happen if you copyField'd from a TrieDateField to a TextField and your indexing code was providing a true Date object ... i suspect you'd get a simple date.toString() in the text field. -Hoss
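The string-to-native conversion Hoss describes can be seen concretely with Solr's canonical date form, which is ISO-8601 in UTC with a trailing Z. A stdlib-only sketch of the round trip (no Solr classes involved; this only illustrates the string/Date conversion a date FieldType performs, not Solr's actual implementation):

```java
import java.time.Instant;
import java.util.Date;

public class DateRoundTrip {
    public static void main(String[] args) {
        // The string form as it would appear in an XML update message
        // (canonical Solr date format: ISO-8601, UTC, trailing 'Z').
        String raw = "2013-01-07T17:40:00Z";

        // Parsed into the kind of native object a date FieldType ultimately wants.
        Date asDate = Date.from(Instant.parse(raw));

        // And rendered back to the canonical string form.
        System.out.println(asDate.toInstant().toString()); // 2013-01-07T17:40:00Z
    }
}
```

Hoss's copyField guess at the end is about exactly this rendering step: a true Date object copied into a text field would most likely land as some toString() form rather than being re-analyzed as a date.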
Re: Solr cloud not starting properly. Only starts leaders.
On Jan 7, 2013, at 4:26 PM, davers dboych...@improvementdirect.com wrote: KeeperErrorCode = NoNode for /collections/productindex/leaders/shard1 Odd - offhand I don't recall something like this being brought up before. Is this new for you, or always existed? Solr 4.0? As far as a key for the colors, there is an open JIRA issue for 4.1, and I think even a patch. - Mark
Re: custom solr sort
Hi Upayavira, The custom sort field is not stored in the index. I want to achieve a requirement that different search users will get different search results when they search the same keyword in my search engine; the search users have a relationship with each result document in Solr. But the relationship is provided by another team's REST service. So the search sequence is as follows: 1. I add the search user's id to the Solr query (i.e.: query.setParam("uid", vo.getUserId());) and specify my own request handler *mysearch*: query.setParam("qt", "mysearch"); 2. MySortComponent sets the custom sort as the first sort. 3. MyComparatorSource gets the uid and sends a request to a REST service, getting the relationship according to the uid. 4. Sort the results. Do you have any suggestions? Upayavira wrote Can you explain why you want to implement a different sort first? There may be other ways of achieving the same thing. Upayavira On Sun, Jan 6, 2013, at 01:32 AM, andy wrote: Hi, Maybe this is an old thread or maybe it's different from the previous one. I want to customize Solr sorting and pass a Solr param from the client to the Solr server, so I implemented a SearchComponent named MySortComponent in my code, and also implemented FieldComparatorSource and FieldComparator. When I use the mysearch request handler (see following code), I found that the custom sort only affects the current page when I get multiple pages of results, but the sort is as expected when I set the rows to contain all the results. Does anybody know how to solve this, or the reason? 
code snippet:

public class MySortComponent extends SearchComponent implements SolrCoreAware {

    private RestTemplate restTemplate = new RestTemplate();

    public void inform(SolrCore arg0) {
    }

    @Override
    public void prepare(ResponseBuilder rb) throws IOException {
        SolrParams params = rb.req.getParams();
        String uid = params.get("uid");
        MyComparatorSource comparator = new MyComparatorSource(uid);
        SortSpec sortSpec = rb.getSortSpec();
        if (sortSpec.getSort() == null) {
            sortSpec.setSort(new Sort(new SortField[] {
                    new SortField("relation", comparator), SortField.FIELD_SCORE }));
        } else {
            SortField[] current = sortSpec.getSort().getSort();
            ArrayList<SortField> sorts = new ArrayList<SortField>(current.length + 1);
            sorts.add(new SortField("relation", comparator));
            for (SortField sf : current) {
                sorts.add(sf);
            }
            sortSpec.setSort(new Sort(sorts.toArray(new SortField[sorts.size()])));
        }
    }

    @Override
    public void process(ResponseBuilder rb) throws IOException {
    }

    // -------------------------
    // SolrInfoMBean
    // -------------------------

    @Override
    public String getDescription() {
        return "Custom Sorting";
    }

    @Override
    public String getSource() {
        return "";
    }

    @Override
    public URL[] getDocs() {
        try {
            return new URL[] { new URL("http://wiki.apache.org/solr/QueryComponent") };
        } catch (MalformedURLException e) {
            throw new RuntimeException(e);
        }
    }

    public class MyComparatorSource extends FieldComparatorSource {
        private BitSet dg1;
        private BitSet dg2;
        private BitSet dg3;

        public MyComparatorSource(String uid) throws IOException {
            SearchResponse responseBody = restTemplate.postForObject(
                    "http://search.test.com/userid/search/" + uid, null,
                    SearchResponse.class);
            String d1 = responseBody.getOneDe();
            String d2 = responseBody.getTwoDe();
            String d3 = responseBody.getThreeDe();
            if (StringUtils.hasLength(d1)) {
                byte[] bytes = Base64.decodeBase64(d1);
                dg1 = BitSetHelper.loadFromBzip2ByteArray(bytes);
            }
            if (StringUtils.hasLength(d2)) {
                byte[] bytes = Base64.decodeBase64(d2);
                dg2 = BitSetHelper.loadFromBzip2ByteArray(bytes);
            }
            if (StringUtils.hasLength(d3)) {
                byte[] bytes = Base64.decodeBase64(d3);
                dg3 = BitSetHelper.loadFromBzip2ByteArray(bytes);
            }
        }

        @Override
        public FieldComparator newComparator(String fieldname, final int numHits,
                int sortPos, boolean reversed) throws IOException {
            return new RelationComparator(fieldname, numHits);
        }

        class
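The two-level ordering the component is trying to set up (relation first, score as tiebreaker) can be sanity-checked without any Lucene classes. A stdlib re-creation of that ordering logic only; the Hit record, the relation values, and the "lower relation = closer to the user" convention are invented for illustration, standing in for whatever the REST service actually returns:

```java
import java.util.*;

public class RelationThenScore {

    /** A search hit with a per-user relation value fetched up front. */
    record Hit(String id, int relation, float score) {}

    /** Order hits by relation first (lower = closer), then by score descending. */
    static List<String> order(List<Hit> hits) {
        hits.sort(Comparator.comparingInt(Hit::relation)
                .thenComparing(Comparator.comparingDouble(Hit::score).reversed()));
        List<String> ids = new ArrayList<>();
        for (Hit h : hits) ids.add(h.id());
        return ids;
    }

    public static void main(String[] args) {
        List<Hit> hits = new ArrayList<>(List.of(
                new Hit("x", 2, 0.9f),
                new Hit("y", 1, 0.1f),
                new Hit("z", 1, 0.8f)));
        System.out.println(order(hits)); // [z, y, x]
    }
}
```

The important property is that the comparator must give a total, consistent ordering over all hits, not just those on the current page; as Hoss notes in the reply below this thread, a comparator method that always returns 0 breaks exactly that.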
Re: custom solr sort
: mysearch requesthandler(see following codes), I found that custom sort : just effect on the current page when I got multiple page results, but the : sort is expected when I sets the rows which contains all the results. Does : anybody know how to solve it or the reason? I haven't familiarized myself with the lucene sort code in a while, and much of your custom sort code is greek to me, but this one method does jump out at me... : @Override : public int compareDocToValue(int arg0, Object arg1) : throws IOException { : // TODO Auto-generated method stub : return 0; : } ...i'm pretty sure you need to implement that method correctly to get a meaningful sort ordering. FWIW: If i was in your place, and had an external REST service that provided me with the sort values to use for each doc's unique key, given a user's unique id, my first inclination would not be to implement it as a custom SearchComponent. My first inclination would be to implement it as a custom ValueSourceParser (returning a custom ValueSource), and then leverage the function query syntax in the sort (ie: sort=myFunction(the_user_id) asc) ... that should mean a lot less non-sort-related code you have to write. (or if i was still using Solr 3.6.x, i might implement a special FieldType -- using RandomField as inspiration -- and then register it with a UID__* dynamicField so sort=UID__the_user_id asc called my REST service using 'the_user_id' as input) -Hoss
Re: Solr Cloud not electing leader properly
Please see: http://lucene.472066.n3.nabble.com/Attention-Solr-4-0-SolrCloud-users-td4024998.html - Mark On Jan 7, 2013, at 9:16 PM, davers dboych...@improvementdirect.com wrote: I have a SolrCloud as seen here: http://d.pr/i/ya86 When I stop solr-shard-1, solr-shard-4 should become the new leader. Instead it does not. Here is the output from the logs.
INFO: A cluster state change has occurred - updating...
Jan 07, 2013 6:11:54 PM org.apache.solr.cloud.ShardLeaderElectionContext runLeaderProcess
INFO: Running the leader process.
Jan 07, 2013 6:11:54 PM org.apache.solr.cloud.ShardLeaderElectionContext shouldIBeLeader
INFO: Checking if I should try and be the leader.
Jan 07, 2013 6:11:54 PM org.apache.solr.cloud.ShardLeaderElectionContext shouldIBeLeader
INFO: My last published State was Active, it's okay to be the leader.
Jan 07, 2013 6:11:54 PM org.apache.solr.cloud.ShardLeaderElectionContext runLeaderProcess
INFO: I may be the new leader - try and sync
Jan 07, 2013 6:11:54 PM org.apache.solr.cloud.RecoveryStrategy close
WARNING: Stopping recovery for zkNodeName=solr-shard-4.sys.id.build.com:8080_solr_productindexcore=productindex
Jan 07, 2013 6:11:54 PM org.apache.solr.cloud.SyncStrategy sync
INFO: Sync replicas to http://solr-shard-4.sys.id.build.com:8080/solr/productindex/
Jan 07, 2013 6:11:54 PM org.apache.solr.update.PeerSync sync
INFO: PeerSync: core=productindex url=http://solr-shard-4.sys.id.build.com:8080/solr START replicas=[http://solr-shard-1.sys.id.build.com:8080/solr/productindex/] nUpdates=100
Jan 07, 2013 6:11:54 PM org.apache.solr.update.PeerSync handleResponse
WARNING: PeerSync: core=productindex url=http://solr-shard-4.sys.id.build.com:8080/solr exception talking to http://solr-shard-1.sys.id.build.com:8080/solr/productindex/, failed
org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://solr-shard-1.sys.id.build.com:8080/solr
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:413)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
    at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:166)
    at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:133)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:189)
    at java.net.SocketInputStream.read(SocketInputStream.java:121)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:149)
    at org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:111)
    at org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:264)
    at org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:98)
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:252)
    at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:282)
    at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:247)
    at org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:216)
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:298)
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
    at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:647)
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:464)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
    at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732)
    at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:352)
    ... 11 more
Jan 07, 2013 6:11:54 PM org.apache.solr.update.PeerSync sync
INFO: PeerSync: core=productindex url=http://solr-shard-4.sys.id.build.com:8080/solr DONE. sync failed
Jan
Re: SOLR Cloud : what is the best backup/restore strategy ?
Hi, Right, you can continue indexing, but if you need to run http://master_host:port/solr/replication?command=backup on each node and you want a snapshot that represents a specific index state, then you need to stop indexing (and hard commit). That's what I had in mind. But if one just wants *some* snapshot and it doesn't matter that the snapshot on each node is from a slightly different time with slightly different index makeup, so to speak, then yes, just continue indexing. Otis -- Solr ElasticSearch Support http://sematext.com/ On Mon, Jan 7, 2013 at 2:12 PM, Mark Miller markrmil...@gmail.com wrote: You should be able to continue indexing fine - it will just keep a point in time snapshot around until the copy is done. So you can trigger a backup at anytime to create a backup for that specific time, and keep indexing away, and the next night do the same thing. You will always have backed up to the point in time the backup command is received. - Mark On Jan 7, 2013, at 1:45 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, There may be a better way, but stopping indexing and then using http://master_host:port/solr/replication?command=backup on each node may do the backup trick. I'd love to see how/if others do it. Otis -- Solr ElasticSearch Support http://sematext.com/ On Mon, Jan 7, 2013 at 10:33 AM, LEFEBVRE Guillaume guillaume.lefeb...@cegedim.fr wrote: Hello, Using a SOLR Cloud architecture, what is the best procedure to backup and restore SOLR index and configuration ? Thanks, Guillaume
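A cluster-wide backup along these lines is just the replication backup command issued to every node. A minimal sketch that only builds the per-node URLs to fire (host names and the core name here are hypothetical; actually sending the requests would use any HTTP client, and the exact URL layout should be checked against your Solr version):

```java
import java.util.*;

public class BackupUrls {

    /** Build the replication backup URL for each node hosting the given core. */
    static List<String> backupUrls(List<String> hosts, String core) {
        List<String> urls = new ArrayList<>();
        for (String host : hosts) {
            // Same command the thread quotes, once per node in the cluster.
            urls.add("http://" + host + "/solr/" + core + "/replication?command=backup");
        }
        return urls;
    }

    public static void main(String[] args) {
        // Host names and core name are made up for illustration.
        for (String url : backupUrls(List.of("solr1:8983", "solr2:8983"), "productindex")) {
            System.out.println(url);
        }
    }
}
```

Per Otis's caveat above, firing these while indexing continues yields per-node snapshots from slightly different moments; pause indexing and hard-commit first if you need one consistent index state.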
Re: Atomicity of commits (soft OR hard) across replicas - Solr Cloud
Thanks Tomás!! This was useful. On Mon, Dec 31, 2012 at 6:03 PM, Tomás Fernández Löbbe tomasflo...@gmail.com wrote: If by cronned commit you mean auto-commit: auto-commits are local to each node and are not distributed, so there is nothing like cluster-wide atomicity there. The commit may be performed on one node now, and on other nodes in 5 minutes (depending on the maxTime you have configured). If you mean that you are issuing commits from outside Solr, those are going to be distributed to all the nodes by default. The operation will succeed only if all nodes succeed, but if one of the nodes fails, the operation will fail. However, the nodes that did succeed WILL have a new view of the index at this point. (I'm not sure if anything is done in this situation with the failing node.) The local commit operation on one node *is* atomic. Tomás On Mon, Dec 31, 2012 at 7:04 AM, samarth s samarth.s.seksa...@gmail.com wrote: Tried reading articles online, but could not find one that confirmed the same 100% :). Does a cronned soft commit complete its commit cycle only after all the replicas have the newest data visible? -- Regards, Samarth
Re: How to size a SOLR Cloud
Hi I have some experience with practical limits. We have several setups we have tried to run with high load for a long time: 1) * 20 shards in one collection spread over 5 nodes (4 shards of the collection per node), no redundancy (only one replica per shard) * Indexing 35-50 million documents per day and searching a little along the way * We do not have detailed measurements on searching, but my impression is that search response times are fairly ok (below 5 secs for non-complicated searches) - at least for the first 15 days, up to about 500 million documents * We have very detailed measurements on indexing times, though. They are good for the first 15-17 days, up to 500-600 million documents. Then we see a temporary slow-down in indexing times. This is because major merges happen at the same time across all shards. The indexing times speed up when this is over, though. After about 20 days everything stops running - things just get too slow and eventually nothing happens. 2) * Same as 1), except 40 shards in one collection spread over 10 nodes, no redundancy * The slowdown points seem to move linearly - slow-down around 1 billion docs and complete stop at 1.3-1.4 billion docs. Therefore it seems a little strange to me that you have problems with 25 million docs in two shards. One major difference is the redundancy, though. We are running with only one replica per shard. We started out trying to run with redundancy (2 replicas per shard), but that involved a lot of problems. Things never successfully recover when recovery situations occur, and we see something like 4x indexing times compared to non-redundancy (even though at most 2x should be expected). Regards, Per Steffensen On 1/7/13 6:14 PM, f.fourna...@gibmedia.fr wrote: Hello, I'm new to SOLR and I have a collection with 25 million records. I want to run this collection on SOLR Cloud (Solr 4.0) on Amazon EC2 instances. 
Currently I've configured 2 shards and 2 replicas per shard on Medium instances (4 GB RAM, 1 CPU core) and response times are very long. How do I size the cloud (sharding, replicas, memory, CPU, ...) to get acceptable response times in my situation? More memory? More CPU? More shards? Do rules for sizing a Solr cloud exist? Is it possible to have more than 2 replicas of one shard? Is it relevant? Best regards FF
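Per's numbers give a rough rule of thumb for FF's sizing question: in his setups indexing stayed healthy up to roughly 25-30 million documents per shard, and the limits moved linearly when the shard count doubled. A back-of-the-envelope sketch built on that one observation (the 25M-per-shard budget comes from this thread only, not any general Solr constant, and says nothing about RAM or CPU per node):

```java
public class ShardSizing {

    /** Shards needed so no shard exceeds the given per-shard document budget. */
    static int shardsNeeded(long totalDocs, long docsPerShard) {
        return (int) ((totalDocs + docsPerShard - 1) / docsPerShard); // ceiling division
    }

    public static void main(String[] args) {
        long budget = 25_000_000L; // where Per's setups began to degrade, per shard
        System.out.println(shardsNeeded(500_000_000L, budget)); // 20 -- matches setup 1
        System.out.println(shardsNeeded(25_000_000L, budget));  // 1  -- FF's whole corpus
    }
}
```

By this yardstick FF's 25 million documents fit comfortably in the existing two shards, which supports Per's point that the slowness is more likely down to the small instances or the replication setup than to shard count.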
Re: SOLR Cloud : what is the best backup/restore strategy ?
Definitely, I agree. It's good to stop loading before a snapshot. Anyway, doing an index snapshot say every 1 hour and re-indexing documents newer than the last 1-1.5 hours should reduce your index recovery time. On 8 January 2013 07:36, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, Right, you can continue indexing, but if you need to run http://master_host:port/solr/replication?command=backup on each node and you want a snapshot that represents a specific index state, then you need to stop indexing (and hard commit). That's what I had in mind. But if one just wants *some* snapshot and it doesn't matter that the snapshot on each node is from a slightly different time with slightly different index makeup, so to speak, then yes, just continue indexing. Otis -- Solr ElasticSearch Support http://sematext.com/ On Mon, Jan 7, 2013 at 2:12 PM, Mark Miller markrmil...@gmail.com wrote: You should be able to continue indexing fine - it will just keep a point in time snapshot around until the copy is done. So you can trigger a backup at anytime to create a backup for that specific time, and keep indexing away, and the next night do the same thing. You will always have backed up to the point in time the backup command is received. - Mark On Jan 7, 2013, at 1:45 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, There may be a better way, but stopping indexing and then using http://master_host:port/solr/replication?command=backup on each node may do the backup trick. I'd love to see how/if others do it. Otis -- Solr ElasticSearch Support http://sematext.com/ On Mon, Jan 7, 2013 at 10:33 AM, LEFEBVRE Guillaume guillaume.lefeb...@cegedim.fr wrote: Hello, Using a SOLR Cloud architecture, what is the best procedure to backup and restore SOLR index and configuration ? Thanks, Guillaume
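The recovery recipe above - restore the last snapshot, then re-index only documents newer than it - needs a safe cutoff time. A stdlib sketch computing it with the 1.5-hour overlap suggested in this message (the helper and its names are illustrative; the overlap should be tuned to your own snapshot cadence and indexing lag):

```java
import java.time.Duration;
import java.time.Instant;

public class RecoveryCutoff {

    /** Everything indexed at or after this instant gets re-indexed after a restore. */
    static Instant cutoff(Instant snapshotTime, Duration overlap) {
        // The overlap absorbs documents indexed while the snapshot was being taken.
        return snapshotTime.minus(overlap);
    }

    public static void main(String[] args) {
        Instant snap = Instant.parse("2013-01-08T07:00:00Z"); // example hourly snapshot
        System.out.println(cutoff(snap, Duration.ofMinutes(90))); // 2013-01-08T05:30:00Z
    }
}
```

Re-indexing from a cutoff slightly before the snapshot is deliberately redundant: documents indexed twice just overwrite themselves by unique key, while documents missed by too tight a cutoff would be lost.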