Re: Search with punctuations
Hi Erick, Thanks for your reply! I have tried both of the suggestions you mentioned, i.e., 1. Using WhitespaceTokenizerFactory 2. Using WordDelimiterFilterFactory with catenateWords=1 But I still face the same issue. Do the tokenizers/factories used have to be the same for both the query and index analyzers? In my scenario, when I search for INTL, I want Solr to return the records containing both INTL and INT'L. Please suggest other alternatives to achieve this. Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/Search-with-punctuations-tp4077510p4077973.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Custom processing in Solr Request Handler plugin and its debugging ?
Ok, thanks Erick, for your help. Tony.

On Sun, Jul 14, 2013 at 5:12 PM, Erick Erickson erickerick...@gmail.com wrote:

Not sure how to do the "pass to another request handler" thing, but the debugging part is pretty straightforward. I use IntelliJ, but as far as I know Eclipse has very similar capabilities. First, I cheat and path to the jar that's the output from my IDE; that saves copying the jar around. So my solrconfig.xml file has a lib directive like <lib dir="../../../../../eoe/project/out/artifact/jardir"/>, where this is wherever your IDE wants to put it. It can sometimes be tricky to get enough ../../../ in there. Second, edit config, select "remote" and a form comes up. Fill in host and port, something like localhost and 5900 (this latter is whatever you want). In IntelliJ that'll give you the specific command to use to start Solr so you can attach. This looks like the following for my setup:

java -Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=5900 -jar start.jar

Now just fire up Solr as above. Fire up your remote debugging session in IntelliJ. Set breakpoints as you wish. NOTE: the suspend=y bit above means that Solr will do _nothing_ until you attach the debugger and hit "go". HTH, Erick

On Sat, Jul 13, 2013 at 6:57 AM, Tony Mullins tonymullins...@gmail.com wrote:

Please, any help on how to pass the search request to a different RequestHandler from within the custom RequestHandler, and how to debug the custom RequestHandler plugin? Thanks, Tony

On Fri, Jul 12, 2013 at 4:41 PM, Tony Mullins tonymullins...@gmail.com wrote:

Hi, I have defined my new Solr RequestHandler plugin like this in solrconfig.xml

<requestHandler name="/myendpoint" class="com.abc.MyRequestPlugin"> </requestHandler>

and it's working fine. Now I want to do some custom processing from this plugin by making a search query to the regular '/select' handler:

<requestHandler name="/select" class="solr.SearchHandler"> </requestHandler>

I then want to receive the results back from the '/select' handler, perform some custom processing on those results, and send the response back from my custom /myendpoint handler. For this I need help on how to make a call to the '/select' handler from within the MyRequestPlugin class and perform some calculation on the results. I also need some help on how to debug my plugin. As its .jar is deployed to solr_home/lib, how can I attach my plugin's code in Eclipse to the Solr process so I can debug it when a user sends a request to my plugin? Thanks, Tony
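One approach that is sometimes used for the "call /select from within my handler" part (it is not confirmed anywhere in this thread, so treat it as an untested sketch; the parameter forwarding and the post-processing are placeholders) is to look the target handler up on the SolrCore and execute it with a local request:

import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.core.SolrCore;
import org.apache.solr.handler.RequestHandlerBase;
import org.apache.solr.request.LocalSolrQueryRequest;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.request.SolrRequestHandler;
import org.apache.solr.response.SolrQueryResponse;

public class MyRequestPlugin extends RequestHandlerBase {
    @Override
    public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
        SolrCore core = req.getCore();
        SolrRequestHandler select = core.getRequestHandler("/select");
        // Build the parameters for the inner search; here we just forward q.
        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("q", req.getParams().get("q", "*:*"));
        SolrQueryRequest selectReq = new LocalSolrQueryRequest(core, params);
        try {
            SolrQueryResponse selectRsp = new SolrQueryResponse();
            core.execute(select, selectReq, selectRsp);
            // Post-process selectRsp.getValues() here, then add whatever you need to rsp.
            rsp.add("wrapped", selectRsp.getValues());
        } finally {
            selectReq.close();
        }
    }
    @Override public String getDescription() { return "wrapper around /select"; }
    @Override public String getSource() { return ""; }
}

The jar still goes into a lib directory picked up by solrconfig.xml as Erick describes, and the remote-debugging setup above applies unchanged.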
Doc's FunctionQuery result field in my custom SearchComponent class ?
Hi, I have extended Solr's SearchComponent class and I am iterating through all the docs in the ResponseBuilder in my overridden process() method. Here I want to get the value of the FunctionQuery result, but in the Document object I only see the standard fields of the document, not the FunctionQuery result. This is my query http://localhost:8080/solr/collection2/demoendpoint?q=spider&wt=xml&indent=true&fl=*,freq:termfreq%28product,%27spider%27%29 The result of the above query in the browser shows me that 'freq' is part of the doc, but it's not there in the Document object in my overridden process() method. How can I get the value of the FunctionQuery result in my custom SearchComponent? Thanks, Tony
facet filtering
Hi, How can I have faceting on a subset of the query docset e.g. with something akin to: SimpleFacets.base = SolrIndexSearcher.getDocSet( Query mainQuery, SolrIndexSearcher.getDocSet(Query filter) ) Is there anything like facet.fq? Cheers, Dan
Getting numDocs and pendingDocs in Solr4.3
Hi, I'm trying to write a validation test that reads some statistics by querying Solr 4.3 via HTTP, namely the number of indexed documents (`numDocs`) and the number of pending documents (`pendingDocs`), from the Solr4 cluster. I believe that in Solr3 there was a `stats.jsp` page that offered both numbers. Is there a way to get both fields in Solr4? Best regards, Federico
Re: SolrCloud group.query error shard X did not set sort field values or how i can set fillFields=true on IndexSearcher.search
Thank you! I really need to eventually increase the number of shards, so I can not directly use numshards = X and the only way out - splitshards, but then I encountered the following problem: 1. run empty node1 java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -jar start.jar -DnumShards=1 2. run empty node2 java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar 3. cluster is - collection1 - shard1 - master (node1) and replica (node2) 4. add some data (10 docs) 5. http://node1:8983/solr/collection1/select?q=*:* response lst name=responseHeader int name=status0/int int name=QTime5/int lst name=params str name=q*:*/str /lst /lst result name=response numFound=10 start=0 doc.../doc doc.../doc /result /response 6. try group.query http://node1:8983/solr/collection1/select?q=*:*group=truegroup.query=street:%D0%9A%D0%BE%D1%80%D0%BE%D0%BB%D0%B5%D0%B2%D0%B0 response lst name=responseHeader int name=status0/int int name=QTime13/int lst name=params str name=q*:*/str str name=group.querystreet:Королева/str str name=grouptrue/str /lst /lst lst name=grouped lst name=street:Королева int name=matches10/int result name=doclist numFound=10 start=0 doc str name=idcdb1c990-d00c-4d2c-95ba-4f496e559be3/str str name=streetКоролева/str str name=house7/str int name=number62/int str name=ownerСидоров/str str name=noteДела отлично!/str long name=_version_1440614179417358336/long /doc /result /lst /lst /response 7. try split shard1 http://node1:8983/solr/admin/collections?action=SPLITSHARDcollection=collection1shard=shard1 response lst name=responseHeader int name=status0/int int name=QTime9288/int /lst lst name=success lst lst name=responseHeader int name=status0/int int name=QTime2441/int /lst str name=corecollection1_shard1_1_replica1/str str name=saved/home/evgenysalnikov/solrtest/node1/example/solr/solr.xml/str /lst lst lst name=responseHeader int name=status0/int int name=QTime2479/int /lst str name=corecollection1_shard1_0_replica1/str str name=saved/home/evgenysalnikov/solrtest/node1/example/solr/solr.xml/str /lst lst lst name=responseHeader int name=status0/int int name=QTime5002/int /lst /lst lst lst name=responseHeader int name=status0/int int name=QTime5002/int /lst /lst lst lst name=responseHeader int name=status0/int int name=QTime141/int /lst /lst lst lst name=responseHeader int name=status0/int int name=QTime0/int /lst str name=corecollection1_shard1_0_replica1/str str name=statusEMPTY_BUFFER/str /lst lst lst name=responseHeader int name=status0/int int name=QTime1/int /lst str name=corecollection1_shard1_1_replica1/str str name=statusEMPTY_BUFFER/str /lst lst lst name=responseHeader int name=status0/int int name=QTime2515/int /lst str name=corecollection1_shard1_1_replica2/str str name=saved/home/evgenysalnikov/solrtest/node2/example/solr/solr.xml/str /lst lst lst name=responseHeader int name=status0/int int name=QTime2554/int /lst str name=corecollection1_shard1_0_replica2/str str name=saved/home/evgenysalnikov/solrtest/node2/example/solr/solr.xml/str /lst lst lst name=responseHeader int name=status0/int int name=QTime4001/int /lst /lst lst lst name=responseHeader int name=status0/int int name=QTime4002/int /lst /lst /lst /response 8. Claster state change to shard1 - master (inactive), shard1 - slave (inactive) shard1_0 - master, shard1_0 - slave, shard1_1 - master, shard1_1 - slave 9. Commit http://node1:8983/solr/collection1/update?commit=true 10. 
Reloading http://node1:8983/solr/collection1/select?q=*:* gives me different results: numFound 5, 0, 10 (I added 10 docs). Node2 core info is: collection1 - shard1 - 10 docs, collection1_shard1_0_replica2 - 0 docs, collection1_shard1_1_replica2 - 0 docs. 11. I restart node2. Node2 core info is: collection1 - shard1 - 10 docs, collection1_shard1_0_replica2 - 5 docs, collection1_shard1_1_replica2 - 5 docs. 12. http://node1:8983/solr/collection1/select?q=*:* always gives the correct result - 10 documents. But http://node1:8983/solr/collection1/select?q=*:*&group=true&group.query=street:%D0%9A%D0%BE%D1%80%D0%BE%D0%BB%D0%B5%D0%B2%D0%B0 returns the familiar error: shard 0 did not set sort field values (FieldDoc.fields is null); you must pass fillFields=true to IndexSearcher.search on each shard. Did I somehow run splitshard incorrectly? Also, I tried once to specify the number of shards as 2: 1. run empty node1 java
Re: Is it possible to find a leader from a list of cores in solr via java code
Hi, I got the solution to the above problem. Sharing the code so that it can help people in the future:

PoolingClientConnectionManager poolingClientConnectionManager = new PoolingClientConnectionManager();
poolingClientConnectionManager.setMaxTotal(2);
poolingClientConnectionManager.setDefaultMaxPerRoute(1);
HttpClient httpClient = (HttpClient) new DefaultHttpClient(poolingClientConnectionManager);
LBHttpSolrServer lbServer = new LBHttpSolrServer(httpClient);
server = new CloudSolrServer(zkhost, lbServer);
server.setDefaultCollection(collectionName);

Thanks a lot to everyone in the thread chain. Your suggestions helped a lot -- View this message in context: http://lucene.472066.n3.nabble.com/Is-it-possible-to-find-a-leader-from-a-list-of-cores-in-solr-via-java-code-tp4074994p4078012.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr is not responding on deployment in tomcat
Hi, maybe someone here can help me with my solr-4.3.1 issue. I've successfully deployed the solr.war on a Tomcat 7 instance. Starting the Tomcat with only the solr.war deployed works nicely: I can see the admin interface and the logs are clean. If I deploy my wicket-spring-data-solr based app (using the HttpSolrServer) after the solr app, without restarting the Tomcat, all is fine too. I've implemented a ping to see if the server is up.

code
private void waitUntilSolrIsAvailable(int i) {
    if (i == 0) {
        logger.info("Check solr state...");
    }
    if (i > 5) {
        throw new RuntimeException("Solr is not available after more than 25 secs. Going down now.");
    }
    if (i > 0) {
        try {
            logger.info("Wait for solr to get alive.");
            Thread.currentThread().wait(5000);
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }
    try {
        i++;
        SolrPingResponse r = solrServer.ping();
        if (r.getStatus() != 0) {
            waitUntilSolrIsAvailable(i);
        }
        logger.info("Solr is alive.");
    } catch (SolrServerException | IOException e) {
        throw new RuntimeException(e);
    }
}
/code

Here I can see

log
54295 [localhost-startStop-2] INFO org.apache.wicket.Application – [wicket.project] init: Wicket extensions initializer
INFO - 2013-07-15 12:07:45.261; de.company.service.SolrServerInitializationService; Check solr state...
54505 [localhost-startStop-2] INFO de.company.service.SolrServerInitializationService – Check solr state...
INFO - 2013-07-15 12:07:45.768; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/ping params={wt=javabin&version=2} hits=0 status=0 QTime=20
55012 [http-bio-8080-exec-1] INFO org.apache.solr.core.SolrCore – [collection1] webapp=/solr path=/admin/ping params={wt=javabin&version=2} hits=0 status=0 QTime=20
INFO - 2013-07-15 12:07:45.770; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/ping params={wt=javabin&version=2} status=0 QTime=22
55014 [http-bio-8080-exec-1] INFO org.apache.solr.core.SolrCore – [collection1] webapp=/solr path=/admin/ping params={wt=javabin&version=2} status=0 QTime=22
INFO - 2013-07-15 12:07:45.854; de.company.service.SolrServerInitializationService; Solr is alive.
55098 [localhost-startStop-2] INFO de.company.service.SolrServerInitializationService – Solr is alive.
/log

But if I restart the Tomcat with both webapps (solr and wicket), solr is not responding to the ping request.

log
INFO - 2013-07-15 12:02:27.634; org.apache.wicket.Application; [wicket.project] init: Wicket extensions initializer
11932 [localhost-startStop-1] INFO org.apache.wicket.Application – [wicket.project] init: Wicket extensions initializer
INFO - 2013-07-15 12:02:27.787; de.company.service.SolrServerInitializationService; Check solr state...
12085 [localhost-startStop-1] INFO de.company.service.SolrServerInitializationService – Check solr state...
/log

What could that be, or how can I get info on where this is stopping? Thanks for your support Per
Solr Zookeeper - Too Many file descriptors on network failure
Hi, I am having an issue with a network failure to one of the nodes (or many). When the network is down, the number of sockets on that machine keeps increasing, until at some point it throws a "too many file descriptors" exception. If the network comes back before that exception, all the open sockets are closed, and hence the node is able to rejoin the cloud. But if the network comes back after that exception, the node cannot rejoin the cloud. Thanks in advance RANJITH VENKATESAN -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Zookeeper-Too-Many-file-descriptors-on-network-failure-tp4077979.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr-Max connections
Hi, I am using solr-4.3.0 with zookeeper-3.4.5. My scenario is that users will communicate with Solr via the ZooKeeper ports. *My question is: how many users can simultaneously access Solr?* In ZooKeeper I configured maxClientCnxns, but that is the maximum number of connections from a single host (user?). Note: my assumption is that max connections will be determined by Jetty or Tomcat. Is that so? If not, how do I configure max connections in Solr and ZooKeeper? In my case 1000 users may search simultaneously, and indexing will also happen at the same time. Thanks in advance Ranjith Venkatesan -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Max-connections-tp4078008.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Doc's FunctionQuery result field in my custom SearchComponent class ?
Please, any help on how to get the value of the 'freq' field in my custom SearchComponent?

http://localhost:8080/solr/collection2/demoendpoint?q=spider&wt=xml&indent=true&fl=*,freq:termfreq%28product,%27spider%27%29

<doc><str name="id">11</str><str name="type">Video Games</str><str name="format">xbox 360</str><str name="product">The Amazing Spider-Man</str><int name="popularity">11</int><long name="_version_">1439994081345273856</long><int name="freq">1</int></doc>

Here is my code

DocList docs = rb.getResults().docList;
DocIterator iterator = docs.iterator();
int sumFreq = 0;
String id = null;
for (int i = 0; i < docs.size(); i++) {
    try {
        int docId = iterator.nextDoc();
        // Document doc = searcher.doc(docId, fieldSet);
        Document doc = searcher.doc(docId);

In the doc object I can see the schema fields like 'id', 'type', 'format' etc., but I cannot find the field 'freq' which I need. Is there any way to get the FunctionQuery fields in the doc object? Thanks, Tony

On Mon, Jul 15, 2013 at 1:16 PM, Tony Mullins tonymullins...@gmail.com wrote:

Hi, I have extended Solr's SearchComponent class and I am iterating through all the docs in the ResponseBuilder in my overridden process() method. Here I want to get the value of the FunctionQuery result, but in the Document object I only see the standard fields of the document, not the FunctionQuery result. This is my query http://localhost:8080/solr/collection2/demoendpoint?q=spider&wt=xml&indent=true&fl=*,freq:termfreq%28product,%27spider%27%29 The result of the above query in the browser shows me that 'freq' is part of the doc, but it's not there in the Document object in my overridden process() method. How can I get the value of the FunctionQuery result in my custom SearchComponent? Thanks, Tony
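The stored Document never contains fl pseudo-fields like freq; those are computed while the response is written. One way to get an equivalent number inside a component (not shown in this thread, so this is a rough sketch only; the field and term are the ones from the query above, and whether it matches termfreq()'s exact semantics for a given analyzer is worth verifying) is to read the postings for that term directly:

import java.io.IOException;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.index.DocsEnum;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;

// Returns the frequency of termText in field for a global docId,
// i.e. roughly what fl=freq:termfreq(product,'spider') reports for that document.
static int termFreq(IndexReader reader, int docId, String field, String termText) throws IOException {
    for (AtomicReaderContext leaf : reader.leaves()) {
        int localId = docId - leaf.docBase;
        if (localId < 0 || localId >= leaf.reader().maxDoc()) {
            continue; // docId belongs to another segment
        }
        DocsEnum de = leaf.reader().termDocsEnum(new Term(field, termText));
        if (de != null && de.advance(localId) == localId) {
            return de.freq();
        }
        return 0; // owning segment found, term not present in this doc
    }
    return 0;
}

Inside process() this could be called as termFreq(rb.req.getSearcher().getIndexReader(), docId, "product", "spider") for each docId from the DocIterator.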
Nested query in SOLR filter query (fq)
Hi all, I have the following case. Solr documents have the fields id and status. Id is not unique; unique is the combination of these two fields. Documents with the same id have different statuses. List of documents:

-ID- -STATUS-
id1 1
id1 2
id1 3
id1 4
id2 1
id2 2
id3 1

I need to make a query that takes all documents with a specific status and excludes documents whose id also appears with another specific status. As an example, I need to get all documents with status 2 that don't have status 3. The expected result should be the document: id2 2. Another example: all documents with status 1 that don't have status 3. Then the result should be: id2 1, id3 1. Here is my query that doesn't work http://192.168.130.14:13080/solr/select/?q=status:1&version=2.2&start=0&rows=10&indent=on&fl=id,status&fq=-id:(*:*%20AND%20status:2) The problem is in the filter query (fq) part. In fq there should be the ids of the documents with status 2, and if the current document's id is in this list it should be excluded. I guess some subquery must be used in the fq part, or something else. Just for information, we are using Apache Solr 3.6 and the document count is around 100k. Thanks in advance! -- View this message in context: http://lucene.472066.n3.nabble.com/Nested-query-in-SOLR-filter-query-fq-tp4078020.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr caching clarifications
Manuel: First off, anything that Mike McCandless says about low-level details should override anything I say. The memory savings he's talking about there are actually something he tutored me in once on a chat. The savings there, as I understand it, aren't huge. For large sets I think it's a 25% savings (if I calculated right). But consider that even without those savings, 8 filter cache entries will be more than the entire structure that JIRA talks about. As to your fq question, absolutely! Any yes/no clause that, as you say, doesn't contribute to the score is a candidate to be moved to an fq clause. There are a couple of things to be aware of though. 1> Be a little careful of using NOW. If you don't use it correctly, fq clauses will not be re-used. See: http://searchhub.org/2012/02/23/date-math-now-and-filter-queries/ 2> How you usually do this is through the UI, not the users entering a query. For instance, if you have a date-range picker, your app constructs the fq clause from that. Or you append fq clauses to the links you create when you display facets. No, there's no automatic tool for this. There's not likely to be one, since there's no way to infer the intent. Say you put in a clause like q=a AND b. That scores things. It would give the same result set as q=*:*&fq=a&fq=b, which would compute no scores. How could a tool infer when this was or wasn't OK? Best Erick On Sun, Jul 14, 2013 at 6:10 PM, Manuel Le Normand manuel.lenorm...@gmail.com wrote: Alright, thanks Erick. For the question about memory usage of merges, taken from Mike McCandless' blog: The big thing that stays in RAM is a logical int[] mapping old docIDs to new docIDs, but in more recent versions of Lucene (4.x) we use a much more efficient structure than a simple int[] ... see https://issues.apache.org/jira/browse/LUCENE-2357 How much RAM is required is mostly a function of how many documents (lots of tiny docs use more RAM than fewer huge docs). A related clarification: as my users are not aware of the fq possibility, I was wondering how to make the best out of this field cache. Would it be efficient to implicitly transform their query into a filter query on fields that are boolean searches (date range etc. that do not affect the score of a document)? Is this a good practice? Is there any plugin for a query parser that does it? Inline On Thu, Jul 11, 2013 at 8:36 AM, Manuel Le Normand manuel.lenorm...@gmail.com wrote: Hello, As a result of frequent java OOM exceptions, I am trying to investigate more into the solr jvm memory heap usage. Please correct me if I am mistaken, this is my understanding of usages for the heap (per replica on a solr instance): 1. Buffers for indexing - bounded by ramBufferSize 2. Solr caches 3. Segment merges 4. Miscellaneous - buffers for tlogs, servlet overhead etc. Particularly I'm concerned by Solr caches and segment merges. 1. How much memory consuming (bytes per doc) are filterCaches (bitDocSet) and queryResultCaches (DocList)? I understand it is related to the skip spaces between doc ids that match (so it's not saved as a bitmap). But basically, is every id saved as a java int? Different beasts. filterCache consumes, essentially, maxDoc/8 bytes (you can get the maxDoc number from your Solr admin page). Plus some overhead for storing the fq text, but that's usually not much. This is for each entry, up to the cache's configured size. queryResultCache is usually trivial unless you've configured it extravagantly. It's the query string length + queryResultWindowSize integers per entry (queryResultWindowSize is from solrconfig.xml). 2.
queryResultMaxDocsCached - (for example = 100) means that any query resulting in more than 100 docs will not be cached (at all) in the queryResultCache? Or does it have to do with the documentCache? It's just a limit on the queryResultCache entry size as far as I can tell. But again, this cache is relatively small; I'd be surprised if it used significant resources. 3. documentCache - the wiki says it should be greater than max_results*concurrent_queries. Max results is just the number of rows displayed (rows-start), right? Not the queryResultWindow. Yes. This is a cache (I think) for the _contents_ of the documents you'll be returning, to be manipulated by various components during the life of the query. 4. enableLazyFieldLoading=true - when querying for ids only (fl=id) will this cache be used? (at the expense of eviction of docs that were already loaded with stored fields) Not sure, but I don't think this will contribute much to memory pressure. This is about how many fields are loaded to get a single value from a doc in the results list, and since one is usually working with 20 or so docs this is usually a small amount of memory. 5. How large is the heap used by merges? Assuming we have a merge of 10 segments of 500MB each (half inverted files - *.pos *.doc etc, half
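To put a rough number on the maxDoc/8 estimate above (an illustration, not measured data): with maxDoc = 100 million, one filterCache entry is about 100,000,000 / 8 = 12.5 MB, so a filterCache sized at 512 entries could grow to roughly 512 * 12.5 MB, or about 6.4 GB of heap, once it fills up.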
Re: SolrCloud group.query error shard X did not set sort field values or how i can set fillFields=true on IndexSearcher.search
I'm going to let someone who knows the splitting details take over G... Best Erick On Mon, Jul 15, 2013 at 5:19 AM, Evgeny Salnikov evg...@salnikoff.com wrote: Thank you! I really need to eventually increase the number of shards, so I can not directly use numshards = X and the only way out - splitshards, but then I encountered the following problem: 1. run empty node1 java -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -DzkRun -jar start.jar -DnumShards=1 2. run empty node2 java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar 3. cluster is - collection1 - shard1 - master (node1) and replica (node2) 4. add some data (10 docs) 5. http://node1:8983/solr/collection1/select?q=*:* response lst name=responseHeader int name=status0/int int name=QTime5/int lst name=params str name=q*:*/str /lst /lst result name=response numFound=10 start=0 doc.../doc doc.../doc /result /response 6. try group.query http://node1:8983/solr/collection1/select?q=*:*group=truegroup.query=street:%D0%9A%D0%BE%D1%80%D0%BE%D0%BB%D0%B5%D0%B2%D0%B0 response lst name=responseHeader int name=status0/int int name=QTime13/int lst name=params str name=q*:*/str str name=group.querystreet:Королева/str str name=grouptrue/str /lst /lst lst name=grouped lst name=street:Королева int name=matches10/int result name=doclist numFound=10 start=0 doc str name=idcdb1c990-d00c-4d2c-95ba-4f496e559be3/str str name=streetКоролева/str str name=house7/str int name=number62/int str name=ownerСидоров/str str name=noteДела отлично!/str long name=_version_1440614179417358336/long /doc /result /lst /lst /response 7. try split shard1 http://node1:8983/solr/admin/collections?action=SPLITSHARDcollection=collection1shard=shard1 response lst name=responseHeader int name=status0/int int name=QTime9288/int /lst lst name=success lst lst name=responseHeader int name=status0/int int name=QTime2441/int /lst str name=corecollection1_shard1_1_replica1/str str name=saved/home/evgenysalnikov/solrtest/node1/example/solr/solr.xml/str /lst lst lst name=responseHeader int name=status0/int int name=QTime2479/int /lst str name=corecollection1_shard1_0_replica1/str str name=saved/home/evgenysalnikov/solrtest/node1/example/solr/solr.xml/str /lst lst lst name=responseHeader int name=status0/int int name=QTime5002/int /lst /lst lst lst name=responseHeader int name=status0/int int name=QTime5002/int /lst /lst lst lst name=responseHeader int name=status0/int int name=QTime141/int /lst /lst lst lst name=responseHeader int name=status0/int int name=QTime0/int /lst str name=corecollection1_shard1_0_replica1/str str name=statusEMPTY_BUFFER/str /lst lst lst name=responseHeader int name=status0/int int name=QTime1/int /lst str name=corecollection1_shard1_1_replica1/str str name=statusEMPTY_BUFFER/str /lst lst lst name=responseHeader int name=status0/int int name=QTime2515/int /lst str name=corecollection1_shard1_1_replica2/str str name=saved/home/evgenysalnikov/solrtest/node2/example/solr/solr.xml/str /lst lst lst name=responseHeader int name=status0/int int name=QTime2554/int /lst str name=corecollection1_shard1_0_replica2/str str name=saved/home/evgenysalnikov/solrtest/node2/example/solr/solr.xml/str /lst lst lst name=responseHeader int name=status0/int int name=QTime4001/int /lst /lst lst lst name=responseHeader int name=status0/int int name=QTime4002/int /lst /lst /lst /response 8. Claster state change to shard1 - master (inactive), shard1 - slave (inactive) shard1_0 - master, shard1_0 - slave, shard1_1 - master, shard1_1 - slave 9. 
Commit http://node1:8983/solr/collection1/update?commit=true 10. Reload http://node1:8983/solr/collection1/select?q=*:* gives me different results numFound 5,0,10 (i add 10 docs) Node2 core info is collection1 - shard1 - 10 docs collection1_shard1_0_replica2 - 0 docs collection1_shard1_1_replica2 - 0 docs 11. I restart node2 Node2 core info is collection1 - shard1 - 10 docs collection1_shard1_0_replica2 - 5 docs collection1_shard1_1_replica2 - 5 docs 12. http://node1:8983/solr/collection1/select?q=*:* always gives the correct result - 10 documents But
How to change extracted directory
Hi, I'm trying to change the default tempDir that the solr.war file is extracted to. If I change the context or webapps XML it works, but I need to do it from the command line and don't know how. I tried to run: java -Djava.io.tmpdir=/path/to/my/dir -jar start.jar or java -Djavax.servlet.context.tempdir=/path/to/my/dir -jar start.jar ...without success. I always get the default directory: [main] WARN org.eclipse.jetty.xml.XmlConfiguration – Config error at <Set name="tempDirectory"><Property name="jetty.home" default="."/>/solr-webapp</Set> ... Caused by: java.lang.IllegalArgumentException: Bad temp directory: /opt/solr/app/solr-webapp at org.eclipse.jetty.webapp.WebAppContext.setTempDirectory(WebAppContext.java:1127) Any suggestions? regards -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-change-extracted-directory-tp4078024.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Search with punctuations
1> You have to re-index after changing your schema, did you? 2> The admin/analysis page is your friend. It'll show you exactly what transformations are applied both at query and index time. 3> WhitespaceTokenizerFactory is only _part_ of the solution, it just breaks up the incoming text. WordDelimiterFilterFactory would then be applied to each token. 4> Yes, you must have the index and query time analysis chains be compatible. For the time being, identical is probably best as I'm guessing you're not entirely familiar with the process. Best Erick On Mon, Jul 15, 2013 at 2:40 AM, kobe.free.wo...@gmail.com kobe.free.wo...@gmail.com wrote: Hi Erick, Thanks for your reply! I have tried both of the suggestions you mentioned, i.e., 1. Using WhitespaceTokenizerFactory 2. Using WordDelimiterFilterFactory with catenateWords=1 But I still face the same issue. Do the tokenizers/factories used have to be the same for both the query and index analyzers? In my scenario, when I search for INTL, I want Solr to return the records containing both INTL and INT'L. Please suggest other alternatives to achieve this. Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/Search-with-punctuations-tp4077510p4077973.html Sent from the Solr - User mailing list archive at Nabble.com.
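For reference, a fieldType along the lines Erick describes might look like the sketch below (not taken from this thread; the type name is made up and the exact flags are worth checking on the admin/analysis page). With catenateWords=1 the token INT'L also produces INTL at index time, so a query for INTL should match both forms:

<fieldType name="text_punct" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" catenateWords="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" catenateWords="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Re-indexing is required after switching a field to this type.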
Re: Nested query in SOLR filter query (fq)
Hello, it sounds like a FieldCollapsing or Join scenario, but given only the information you provided, it can be solved by indexing the statuses as a multivalued field:

-ID- -STATUS-
id1 (1 2 3 4)
id2 (1 2)
id3 (1)

q=*:*&fq=STATUS:1&fq=NOT STATUS:3

On Mon, Jul 15, 2013 at 3:19 PM, EquilibriumCST valeri_ho...@abv.bg wrote: Hi all, I have the following case. Solr documents have the fields id and status. Id is not unique; unique is the combination of these two fields. Documents with the same id have different statuses. List of documents: -ID- -STATUS- id1 1, id1 2, id1 3, id1 4, id2 1, id2 2, id3 1. I need to make a query that takes all documents with a specific status and excludes documents whose id also appears with another specific status. As an example, I need to get all documents with status 2 that don't have status 3. The expected result should be the document: id2 2. Another example: all documents with status 1 that don't have status 3. Then the result should be: id2 1, id3 1. Here is my query that doesn't work http://192.168.130.14:13080/solr/select/?q=status:1&version=2.2&start=0&rows=10&indent=on&fl=id,status&fq=-id:(*:*%20AND%20status:2) The problem is in the filter query (fq) part. In fq there should be the ids of the documents with status 2, and if the current document's id is in this list it should be excluded. I guess some subquery must be used in the fq part, or something else. Just for information, we are using Apache Solr 3.6 and the document count is around 100k. Thanks in advance! -- View this message in context: http://lucene.472066.n3.nabble.com/Nested-query-in-SOLR-filter-query-fq-tp4078020.html Sent from the Solr - User mailing list archive at Nabble.com. -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
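A minimal sketch of what that multivalued approach could look like in schema.xml and at query time (the field and type names are illustrative, not from the thread); each logical id becomes one Solr document carrying all of its statuses:

<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="status" type="int" indexed="true" stored="true" multiValued="true"/>

A document would then be indexed as:

<doc>
  <field name="id">id1</field>
  <field name="status">1</field>
  <field name="status">2</field>
  <field name="status">3</field>
  <field name="status">4</field>
</doc>

and "status 1 but not status 3" becomes q=*:*&fq=status:1&fq=-status:3.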
Running Solr in a cluster - high availability only
Hi, I would like to run two Solr instances on different computers as a cluster. My main interest is high availability - meaning, in case one server crashes or is down, there will always be another one. (My performance on a single instance is great. I do not need to split the data across two servers.) Questions: 1. What is the best practice? Is it different than clustering for index splitting? Do I need shards? 2. Do I need ZooKeeper? 3. Is it a container based configuration (different for Jetty and Tomcat)? 4. Do I need an external NLB for that? 5. When one computer comes up after crashing, how does it update its index?
Re: HTTP Status 503 - Server is shutting down
Hello, I am able to configure solr 4.3.1 version with tomcat6. I followed these steps:
1. Extract the solr431 package. In my case I did it in E:\solr-4.3.1\example\solr
2. Now copy the solr dir from the extracted package (E:\solr-4.3.1\example\solr) into the TOMCAT_HOME dir. In my case TOMCAT_HOME points to E:\Apache\Tomcat 6.0.
3. I can now refer to SOLR_HOME as E:\Apache\Tomcat 6.0\solr (please remember this)
4. Copy the solr.war file from the extracted package to the SOLR_HOME dir, i.e. E:\Apache\Tomcat 6.0\solr. This is required to create the context, as I do not want to pass this as JAVA_OPTS.
5. Create a solr1.xml file in TOMCAT_HOME\conf\Catalina\localhost (I gave the file name solr1.xml):
<?xml version="1.0" encoding="utf-8"?>
<Context docBase="E:\Apache\Tomcat 6.0\solr\solr.war" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String" value="E:\Apache\Tomcat 6.0\solr" override="true"/>
</Context>
6. Also copy the solr.war file into TOMCAT_HOME\webapps for deployment purposes.
7. If you start Tomcat you will get errors as mentioned by Shawn. So you need to copy all 5 jar files from the solr extracted package (E:\solr-4.3.1\example\lib\ext) to the TOMCAT_HOME\lib dir (jul-to-slf4j-1.6.6, jcl-over-slf4j-1.6.6, slf4j-log4j12-1.6.6, slf4j-api-1.6.6, log4j-1.2.16).
8. Also copy the log4j.properties file from the E:\solr-4.3.1\example\resources dir to the TOMCAT_HOME\lib dir.
9. Now if you start Tomcat you won't have any problem.
10. On my side I am using an additional jar for the data import request handler. So for this, please modify the solrconfig.xml file to point to the location of the data import jar.
11. What I did: in the solrconfig.xml file, in the section
<!-- <lib/> directives can be used to instruct Solr to load any Jars. <lib dir="./lib" /> -->
I added one line after this section (if I used the above line then I would need to create a lib dir inside the Collection1 dir):
<lib dir="../lib" />
12. In SOLR_HOME (E:\Apache\Tomcat 6.0\solr) I created a lib folder, because in my solrconfig.xml file I am referring to this lib dir, and copied all the dataimport related jar files there (solr-dataimporthandler-4.3.1***). I did it this way because I do not want to use TOMCAT_HOME\lib.
13. Now restart Tomcat; I am sure there should not be any problem. If there is some problem, refer to the solr.log file which is in the TOMCAT_HOME\logs dir. As I said in point 12, I do not want to put jar files related to solr into the TOMCAT_HOME\lib dir, but for the logging mechanism I have to. I tried to put all 5 jars into this folder and removed them from the Tomcat lib... but then I got the error.
In Ideal scenario, we should not put all the jar files related to solr into TOMCAT lib dir Regards Sandeep On Mon, Jul 15, 2013 at 12:27 AM, PeterKerk vettepa...@hotmail.com wrote: Ok, still getting the same error HTTP Status 503 - Server is shutting down, so here's what I did now: - reinstalled tomcat - deployed solr-4.3.1.war in C:\Program Files\Apache Software Foundation\Tomcat 6.0\webapps - copied log4j-1.2.16.jar,slf4j-api-1.6.6.jar,slf4j-log4j12-1.6.6.jar to C:\Program Files\Apache Software Foundation\Tomcat 6.0\webapps\solr-4.3.1\WEB-INF\lib - copied log4j.properties from C:\Dropbox\Databases\solr-4.3.1\example\resources to C:\Dropbox\Databases\solr-4.3.1\example\lib - restarted tomcat Now this shows in my Tomcat console: 14-jul-2013 20:54:38 org.apache.catalina.core.AprLifecycleListener init INFO: The APR based Apache Tomcat Native library which allows optimal performanc e in production environments was not found on the java.library.path: C:\Program Files\Apache Software Foundation\Tomcat 6.0\bin;C:\Windows\Sun\Java\bin;C:\Windo ws\system32;C:\Windows;C:\Program Files\Common Files\Microsoft Shared\Windows Li ve;C:\Program Files (x86)\Common Files\Microsoft Shared\Windows Live;C:\Windows\ system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShe ll\v1.0\;C:\Program Files\TortoiseSVN\bin;c:\msxsl;C:\Program Files (x86)\Window s Live\Shared;C:\Program Files\Microsoft\Web Platform Installer\;C:\Program File s (x86)\Microsoft ASP.NET\ASP.NET Web Pages\v1.0\;C:\Program Files (x86)\Windows Kits\8.0\Windows Performance Toolkit\;C:\Program Files\Microsoft SQL Server\110 \Tools\Binn\;C:\Program Files (x86)\Microsoft SQL Server\110\Tools\Binn\;C:\Prog ram Files\Microsoft SQL Server\110\DTS\Binn\;C:\Program Files (x86)\Microsoft SQ L Server\110\Tools\Binn\ManagementStudio\;C:\Program Files (x86)\Microsoft SQL S erver\110\DTS\Binn\;C:\Program Files (x86)\Java\jre6\bin;C:\Program Files\Java\j re631\bin;. 14-jul-2013 20:54:39 org.apache.coyote.http11.Http11Protocol init INFO: Initializing Coyote HTTP/1.1 on http-8080 14-jul-2013 20:54:39 org.apache.catalina.startup.Catalina load INFO: Initialization processed in 287 ms 14-jul-2013 20:54:39 org.apache.catalina.core.StandardService start INFO: Starting service Catalina 14-jul-2013 20:54:39 org.apache.catalina.core.StandardEngine start INFO: Starting
Re: Running Solr in a cluster - high availability only
* Go with SolrCloud - unless you think you're smarter than Yonik and Mark Miller. * Replicas are used for both query capacity and resilience (HA). * Shards are used for increased index capacity (number of documents) and to reduce query latency (parallel processing of portions of a query.) * You need at least three zookeepers for HA. They need to be external to the cluster in production. * Load balancing - you need to do your own testing to confirm whether you need it. If so, that is outside of Solr. * SolrCloud automatically recovers nodes when they come back up. -- Jack Krupansky -Original Message- From: Mysurf Mail Sent: Monday, July 15, 2013 8:32 AM To: solr-user@lucene.apache.org Subject: Running Solr in a cluster - high availability only Hi, I would like to run two Solr instances on different computers as a cluster. My main interest is High availability - meaning, in case one server crashes or is down there will be always another one. (my performances on a single instance are great. I do not need to split the data to two servers.) Questions: 1. What is the best practice? Is it different than clustering for index splitting? Do I need Shards? 2. Do I need zoo keeper? 3. Is it a container based configuration (different for jetty and tomcat) 4, Do I need an external NLB for that ? 5. When one computer is up after crashing. how dows it updates its index?
How to Indicate Solr That: Both Ascified and Non-Ascii versions of tokens are same?
When I search something which has non-ASCII characters at Google, it returns results for both the original and ascified versions and *highlights both of them*. For example if I search *çiğli* at Google the first result is: *Çiğli* Belediyesi www.*cigli*.bel.tr/ How can I do that with Solr? How can I indicate to Solr that *both ascified and non-ASCII versions of tokens are the same?*
Re: How to Indicate Solr That: Both Ascified and Non-Ascii versions of tokens are same?
Either do a custom highlighter or preprocess the query and generate an OR of the accented and unaccented terms. Solr has no magic feature to do both. Sure, you could do a token filter that duplicated each term and included both the accented and unaccented versions, but... it gets messy and is a pain with phrases. It is worth a Jira though. -- Jack Krupansky -Original Message- From: Furkan KAMACI Sent: Monday, July 15, 2013 9:06 AM To: solr-user@lucene.apache.org Subject: How to Indicate Solr That: Both Ascified and Non-Ascii versions of tokens are same? When I search something which has non ASCII characters at Google it returns me results both original and ascified versions and *highlights both of them*. For example if I search *çiğli* at Google first result is that: *Çiğli* Belediyesi www.*cigli*.bel.tr/ How can I do that at Solr? How can I indicate that to Solr: *Both Ascified and Non-Ascii versions of tokens are same?** *
Re: How to Indicate Solr That: Both Ascified and Non-Ascii versions of tokens are same?
Hi Furkan, Using MappingCharFilterFactory with mapping-FoldToASCII.txt or mapping-ISOLatin1Accent.txt http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.MappingCharFilterFactory From: Furkan KAMACI furkankam...@gmail.com To: solr-user@lucene.apache.org Sent: Monday, July 15, 2013 4:06 PM Subject: How to Indicate Solr That: Both Ascified and Non-Ascii versions of tokens are same? When I search something which has non ASCII characters at Google it returns me results both original and ascified versions and *highlights both of them*. For example if I search *çiğli* at Google first result is that: *Çiğli* Belediyesi www.*cigli*.bel.tr/ How can I do that at Solr? How can I indicate that to Solr: *Both Ascified and Non-Ascii versions of tokens are same?** *
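As an illustration of the MappingCharFilterFactory suggestion (an assumed field type, not something posted in this thread), the char filter runs before tokenization on both the index and query side, so ç/c and ğ/g end up as the same indexed terms while highlighting still works on the original stored text:

<fieldType name="text_folded" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

mapping-FoldToASCII.txt ships in the example conf directory; solr.ASCIIFoldingFilterFactory is a token-filter alternative if folding after tokenization is preferred.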
Re: How to Indicate Solr That: Both Ascified and Non-Ascii versions of tokens are same?
Actually, on second thought, I think you should be able to do this directly, but I don't have the highlighter magic at my fingertips. The field type analyzer simply needs to map the accented characters; the character positions of the accented and unaccented tokens should line up fine. Really, it is no different that highlighting tokens that have differences in upper and lower case. -- Jack Krupansky -Original Message- From: Jack Krupansky Sent: Monday, July 15, 2013 9:13 AM To: solr-user@lucene.apache.org Subject: Re: How to Indicate Solr That: Both Ascified and Non-Ascii versions of tokens are same? Either do a custom highlighter or preprocess the query and generate an OR of the accented and unaccented terms. Solr has no magic feature to do both. Sure, you could do a token filter that duplicated each term and included both the accented and unaccented versions, but... it gets messy and is a pain with phrases. It is worth a Jira though. -- Jack Krupansky -Original Message- From: Furkan KAMACI Sent: Monday, July 15, 2013 9:06 AM To: solr-user@lucene.apache.org Subject: How to Indicate Solr That: Both Ascified and Non-Ascii versions of tokens are same? When I search something which has non ASCII characters at Google it returns me results both original and ascified versions and *highlights both of them*. For example if I search *çiğli* at Google first result is that: *Çiğli* Belediyesi www.*cigli*.bel.tr/ How can I do that at Solr? How can I indicate that to Solr: *Both Ascified and Non-Ascii versions of tokens are same?** *
Re: Nested query in SOLR filter query (fq)
Yes, I know about that, but the schema design cannot be changed. This is not my decision :) -- View this message in context: http://lucene.472066.n3.nabble.com/Nested-query-in-SOLR-filter-query-fq-tp4078020p4078047.html Sent from the Solr - User mailing list archive at Nabble.com.
Facet sorting seems weird
Hello, first time writing to the list. I am a developer for a company where we recently switched all of our search cores from Sphinx to Solr with very great results. In general we've been very happy with the switch, and everything seems to work just as we want it to. Today however we've run into a bit of an issue regarding faceted sort. For example we have a field called brand in our core, defined as the text_en datatype from the example Solr core. This field is copied into facet_brand with the datatype string (since we don't really need to do much with it except show it for faceted navigation). Now, given these two entries in the field on different documents, LEGO and bObles, and given facet.sort=index, it appears that LEGO is sorted as being before bObles. I assume this is because of casing differences. My question then is, how do we define a decent datatype in our schema, where the casing is exact, but we are able to sort it without casing mattering? Thank you :) Best regards, Henrik Ossipoff
Re: HTTP Status 503 - Server is shutting down
Hi Sandeep, Thank you for your extensive answer :) Before I go through all your steps, I noticed you mentioned something about a data import handler. After I've completed the basic setup of Tomcat 6 and Solr 4.3.1, I want to migrate my Solr 3.5.0 cores (now running on Cygwin) to that environment: C:\Dropbox\Databases\apache-solr-3.5.0\example\example-DIH\solr\tt C:\Dropbox\Databases\apache-solr-3.5.0\example\example-DIH\solr\shop C:\Dropbox\Databases\apache-solr-3.5.0\example\example-DIH\solr\homes Will all your steps still apply with the above requirements, or is a different approach needed when migrating from the example-DIH with multiple cores? Many thanks again! :) -- View this message in context: http://lucene.472066.n3.nabble.com/HTTP-Status-503-Server-is-shutting-down-tp4065958p4078059.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Facet sorting seems weird
Hi Henrik, Try setting up a copyfield in your schema and set the copied field to use something like 'text_ws' which implements LowerCaseFilterFactory. Then sort on the copyfield. Regards, DQ -Original Message- From: Henrik Ossipoff Hansen [mailto:h...@entertainment-trading.com] Sent: 15 July 2013 15:08 To: solr-user@lucene.apache.org Subject: Facet sorting seems weird Hello, first time writing to the list. I am a developer for a company where we recently switched all of our search core from Sphinx to Solr with very great results. In general we've been very happy with the switch, and everything seems to work just as we want it to. Today however we've run into a bit of a issue regarding faceted sort. For example we have a field called brand in our core, defined as the text_en datatype from the example Solr core. This field is copied into facet_brand with the datatype string (since we don't really need to do much with it except show it for faceted navigation). Now, given these two entries into the field on different documents, LEGO and bObles, and given facet.sort=index, it appears that LEGO is sorted as being before bObles. I assume this is because of casing differences. My question then is, how do we define a decent datatype in our schema, where the casing is exact, but we are able to sort it without casing mattering? Thank you :) Best regards, Henrik Ossipoff
Aggregating data with Solr, getting group stats
Hi, I see there are a few ways in Solr which can almost be used for my use case, but all of them appear to fall short eventually. Here is what I am trying to do: consider the following document structure (there are many more fields in play, but this is enough for the example): Manufacturer, ProductType, Color, Size, Price, CountAvailableItems. Based on user parameters (search string, some filters), I would fetch a set of documents. What I need is to group the resulting documents by different attribute combinations (say Manufacturer + Color, or ProductType + Color + Size, or ...) and get stats (max price, avg price, number of available items) for those groups. Possible solutions in Solr: 1) StatsComponent - provides all the stats I would need, but its grouping functionality is basic - it can group on a single field (stats.field + stats.facet) while I need field combinations. There is an issue https://issues.apache.org/jira/browse/SOLR-2472 which tried to deal with that, but it looks like it got stuck in the past. 2) Pivot Faceting - seems like it would provide all the grouping logic I need, and in combination with https://issues.apache.org/jira/browse/SOLR-3583 (Percentiles for facets, pivot facets, and distributed pivot facets) would bring percentiles and averages. However, I would still miss things like max/min/sum, and the issue is not committed yet anyway. I would also depend on another yet-to-be-committed issue https://issues.apache.org/jira/browse/SOLR-2894 for distributed support. 3) Configurable Collectors - https://issues.apache.org/jira/browse/SOLR-4465 - seems promising, but it allows grouping by just one field and, probably a bigger problem, it seems it was just a POC and will need overhauling before it is anywhere near ready for commit. Are there any other options I missed? Thanks, Bojan
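For reference, the single-field grouping that StatsComponent does support looks roughly like the request below (the field names are the ones from the example above; the single-field limit of stats.facet is exactly the shortcoming being described):

http://localhost:8983/solr/collection1/select?q=*:*&rows=0&stats=true&stats.field=Price&stats.facet=Manufacturer&stats.facet=Color

This returns min/max/sum/mean/stddev of Price per Manufacturer value and per Color value separately, but not per Manufacturer + Color combination.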
Re: Getting numDocs and pendingDocs in Solr4.3
On 7/15/2013 3:08 AM, Federico Ragona wrote: Hi, I'm trying to write a validation test that reads some statistics by querying Solr 4.3 via HTTP, namely the number of indexed documents (`numDocs`) and the number of pending documents (`pendingDocs`) from the Solr4 cluster. I believe that in Solr3 there was a `stats.jsp` page that offered both numbers. Is there a way to get both fields in Solr4? Solr4 should have all the stats that Solr3 has and then some. If you select your core from the core selector, then click on Plugins / Stats, click on UPDATEHANDLER, then open updateHandler on the right, I think you'll find at least some of what you were looking for. Other parts of what you were looking for might be found on the Overview for the core. If you have the default core named collection1 then a URL like this one will get you there. You can replace collection1 with the name of your core. The /#/ in this URL indicates that it is part of the admin UI, not something you'd want to query in a program: http://server:port/solr/#/collection1/plugins/updatehandler?entry=updateHandler The admin UI gathers most of its core-level information from the mbeans handler found in the core itself. The following URL is suitable for querying in a program. Note the collection1 in this URL as well: http://server:port/solr/collection1/admin/mbeans?stats=true This will default to XML output. Like most things in Solr, if you add wt=json to the URL, you'll get JSON format. You can also add indent=true for human readability. Thanks, Shawn
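Putting that together, a request along these lines should return both numbers in one call (the parameters are standard, but the exact stat keys are worth verifying against your own output): numDocs sits under the searcher bean and docsPending under the updateHandler bean.

curl "http://server:port/solr/collection1/admin/mbeans?stats=true&wt=json&indent=true"

In the JSON, look under solr-mbeans > CORE > searcher > stats > numDocs and solr-mbeans > UPDATEHANDLER > updateHandler > stats > docsPending.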
RE: Facet sorting seems weird
Hello, thank you for the quick reply! But given that facet.sort=index just sorts by the faceted index (and I don't want the facet itself to be in lower-case), would that really work? Regards, Henrik Ossipoff -Original Message- From: David Quarterman [mailto:da...@corexe.com] Sent: 15. juli 2013 16:46 To: solr-user@lucene.apache.org Subject: RE: Facet sorting seems weird Hi Henrik, Try setting up a copyfield in your schema and set the copied field to use something like 'text_ws' which implements LowerCaseFilterFactory. Then sort on the copyfield. Regards, DQ -Original Message- From: Henrik Ossipoff Hansen [mailto:h...@entertainment-trading.com] Sent: 15 July 2013 15:08 To: solr-user@lucene.apache.org Subject: Facet sorting seems weird Hello, first time writing to the list. I am a developer for a company where we recently switched all of our search core from Sphinx to Solr with very great results. In general we've been very happy with the switch, and everything seems to work just as we want it to. Today however we've run into a bit of a issue regarding faceted sort. For example we have a field called brand in our core, defined as the text_en datatype from the example Solr core. This field is copied into facet_brand with the datatype string (since we don't really need to do much with it except show it for faceted navigation). Now, given these two entries into the field on different documents, LEGO and bObles, and given facet.sort=index, it appears that LEGO is sorted as being before bObles. I assume this is because of casing differences. My question then is, how do we define a decent datatype in our schema, where the casing is exact, but we are able to sort it without casing mattering? Thank you :) Best regards, Henrik Ossipoff
Re: How to change extracted directory
On 7/15/2013 5:45 AM, wolbi wrote: I'm trying to change default tempDir where solr.war file is extracted to. If I change context or webbaps XML it works, but I need to do it from commandline and don't know how. I tried to run: java -Djava.io.tmpdir=/path/to/my/dir -jar start.jar or java -Djavax.servlet.context.tempdir=/path/to/my/dir -jar start.jar ..without success. I always get default directory: [main] WARN org.eclipse.jetty.xml.XmlConfiguration – Config error at Set name=tempDirectoryProperty name=jetty.home default=.//solr-webapp/Set The temp directory location is specified by the Solr context fragment for the example Jetty, which you can find in example/contexts/solr-jetty-context.xml. If you have 4.1 or 4.0, that will be named example/contexts/solr.xml instead. The easiest thing to do is to edit that file. This overrides what you specify on the commandline. I saw that you asked this same question in #solr on IRC. You have to be patient in an IRC tech channel. I didn't even see your question until long after you had disconnected. It can literally take hours before anyone is at their keyboard. According to the time on your email, it's taken me a few hours for this, too. Thanks, Shawn
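For illustration, the relevant line in example/contexts/solr-jetty-context.xml is the tempDirectory setter quoted in the error above; changing it to an absolute path (a sketch only, check it against the file shipped with your version) looks like:

<Configure class="org.eclipse.jetty.webapp.WebAppContext">
  ...
  <Set name="tempDirectory">/path/to/my/dir</Set>
</Configure>

Jetty will then unpack solr.war into that directory on startup.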
RE: Facet sorting seems weird
Hi Henrik, We did something related to this that I'll share. I'm rather new to Solr so take this idea cautiously :-) Our requirement was to show exact values but have case-insensitive sorting and facet filtering (prefix filtering). We created an index field (type=string) for creating facets so that the values are indexed as-is. The values we indexed were given the format lowercase value|exact value So for example, given the value bObles, we would index the string bobles|bObles. When displaying the facet we split the facet value from Solr in half and display the second half to the user. Of course the caveat is that you could have 2 facets that differ only in case, but to me that's a data cleansing issue. James -Original Message- From: Henrik Ossipoff Hansen [mailto:h...@entertainment-trading.com] Sent: Monday, July 15, 2013 10:57 AM To: solr-user@lucene.apache.org Subject: RE: Facet sorting seems weird Hello, thank you for the quick reply! But given that facet.sort=index just sorts by the faceted index (and I don't want the facet itself to be in lower-case), would that really work? Regards, Henrik Ossipoff -Original Message- From: David Quarterman [mailto:da...@corexe.com] Sent: 15. juli 2013 16:46 To: solr-user@lucene.apache.org Subject: RE: Facet sorting seems weird Hi Henrik, Try setting up a copyfield in your schema and set the copied field to use something like 'text_ws' which implements LowerCaseFilterFactory. Then sort on the copyfield. Regards, DQ -Original Message- From: Henrik Ossipoff Hansen [mailto:h...@entertainment-trading.com] Sent: 15 July 2013 15:08 To: solr-user@lucene.apache.org Subject: Facet sorting seems weird Hello, first time writing to the list. I am a developer for a company where we recently switched all of our search core from Sphinx to Solr with very great results. In general we've been very happy with the switch, and everything seems to work just as we want it to. Today however we've run into a bit of a issue regarding faceted sort. For example we have a field called brand in our core, defined as the text_en datatype from the example Solr core. This field is copied into facet_brand with the datatype string (since we don't really need to do much with it except show it for faceted navigation). Now, given these two entries into the field on different documents, LEGO and bObles, and given facet.sort=index, it appears that LEGO is sorted as being before bObles. I assume this is because of casing differences. My question then is, how do we define a decent datatype in our schema, where the casing is exact, but we are able to sort it without casing mattering? Thank you :) Best regards, Henrik Ossipoff
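A tiny illustration of that display-side split (a hypothetical helper, not James' actual code): the stored facet value "bobles|bObles" sorts by its lowercase prefix, and the label shown to the user is everything after the first pipe.

String facetValue = "bobles|bObles";
String label = facetValue.substring(facetValue.indexOf('|') + 1); // "bObles"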
Re: Solr is not responding on deployment in tomcat
Sounds like Wicket and Solr are using the same port(s)... If you start Wicket first then look at the Solr logs, you might see some message about port already in use or some such. If this is SolrCloud, there are also the ZooKeeper ports to wonder about. Best Erick On Mon, Jul 15, 2013 at 6:49 AM, Per Newgro per.new...@gmx.ch wrote: Hi, maybe someone here can help me with my solr-4.3.1 issue. I've successful deployed the solr.war on a tomcat7 instance. Starting the tomcat with only the solr.war deployed - works nicely. I can see the admin interface and logs are clean. If i deploy my wicket-spring-data-solr based app (using the HttpSolrServer) after the solr app without restarting the tomcat = all is fine to. I've implemented a ping to see if server is up. code private void waitUntilSolrIsAvailable(int i) { if (i == 0) { logger.info(Check solr state...); } if (i 5) { throw new RuntimeException(Solr is not avaliable after more than 25 secs. Going down now.); } if (i 0) { try { logger.info(Wait for solr to get alive.); Thread.currentThread().wait(5000); } catch (InterruptedException e) { throw new RuntimeException(e); } } try { i++; SolrPingResponse r = solrServer.ping(); if (r.getStatus() 0) { waitUntilSolrIsAvailable(i); } logger.info(Solr is alive.); } catch (SolrServerException | IOException e) { throw new RuntimeException(e); } } /code Here i can see log log 54295 [localhost-startStop-2] INFO org.apache.wicket.Application – [wicket.project] init: Wicket extensions initializer INFO - 2013-07-15 12:07:45.261; de.company.service.SolrServerInitializationService; Check solr state... 54505 [localhost-startStop-2] INFO de.company.service.SolrServerInitializationService – Check solr state... INFO - 2013-07-15 12:07:45.768; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/ping params={wt=javabinversion=2} hits=0 status=0 QTime=20 55012 [http-bio-8080-exec-1] INFO org.apache.solr.core.SolrCore – [collection1] webapp=/solr path=/admin/ping params={wt=javabinversion=2} hits=0 status=0 QTime=20 INFO - 2013-07-15 12:07:45.770; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/ping params={wt=javabinversion=2} status=0 QTime=22 55014 [http-bio-8080-exec-1] INFO org.apache.solr.core.SolrCore – [collection1] webapp=/solr path=/admin/ping params={wt=javabinversion=2} status=0 QTime=22 INFO - 2013-07-15 12:07:45.854; de.company.service.SolrServerInitializationService; Solr is alive. 55098 [localhost-startStop-2] INFO de.company.service.SolrServerInitializationService – Solr is alive. /log But if i restart the tomcat with both webapps (solr and wicket) the solr is not responding on the ping request. log INFO - 2013-07-15 12:02:27.634; org.apache.wicket.Application; [wicket.project] init: Wicket extensions initializer 11932 [localhost-startStop-1] INFO org.apache.wicket.Application – [wicket.project] init: Wicket extensions initializer INFO - 2013-07-15 12:02:27.787; de.company.service.SolrServerInitializationService; Check solr state... 12085 [localhost-startStop-1] INFO de.company.service.SolrServerInitializationService – Check solr state... /log What could that be or how can i get infos where this is stopping? Thanks for your support Per
Re: Running Solr in a cluster - high availability only
With only two instances, replication may be the way to go. Or send updates to both. Solr Cloud is much more tightly coupled, requires Zookeeper, etc. There are more ways for two Solr Cloud nodes to fail, compared with two Solr nodes using old-style replication. In general, a loosely-coupled system will be more robust. You should look at Solr Cloud if you need sharding or near real time. wunder On Jul 15, 2013, at 5:54 AM, Jack Krupansky wrote: * Go with SolrCloud - unless you think you're smarter than Yonik and Mark Miller. * Replicas are used for both query capacity and resilience (HA). * Shards are used for increased index capacity (number of documents) and to reduce query latency (parallel processing of portions of a query.) * You need at least three zookeepers for HA. They need to be external to the cluster in production. * Load balancing - you need to do your own testing to confirm whether you need it. If so, that is outside of Solr. * SolrCloud automatically recovers nodes when they come back up. -- Jack Krupansky -Original Message- From: Mysurf Mail Sent: Monday, July 15, 2013 8:32 AM To: solr-user@lucene.apache.org Subject: Running Solr in a cluster - high availability only Hi, I would like to run two Solr instances on different computers as a cluster. My main interest is High availability - meaning, in case one server crashes or is down there will be always another one. (my performances on a single instance are great. I do not need to split the data to two servers.) Questions: 1. What is the best practice? Is it different than clustering for index splitting? Do I need Shards? 2. Do I need zoo keeper? 3. Is it a container based configuration (different for jetty and tomcat) 4, Do I need an external NLB for that ? 5. When one computer is up after crashing. how dows it updates its index? -- Walter Underwood wun...@wunderwood.org
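For reference, the old-style master/slave replication Walter Underwood mentions only needs a ReplicationHandler on each side; a minimal sketch (host, core name and poll interval are placeholders, not values from this thread):

    <!-- master solrconfig.xml -->
    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="master">
        <str name="replicateAfter">commit</str>
        <str name="confFiles">schema.xml,stopwords.txt</str>
      </lst>
    </requestHandler>

    <!-- slave solrconfig.xml -->
    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="slave">
        <str name="masterUrl">http://master-host:8983/solr/collection1</str>
        <str name="pollInterval">00:10:00</str>
      </lst>
    </requestHandler>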
Re: Facet sorting seems weird
Hi Henrik, If I understand the question correctly (case-insensitive sorting of the facet values), then this is the limitation of the current Facet component. You can see the full implementation at: https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java#L818 If you are comfortable with Java code, the easiest thing might be to copy/fix the component and use your own one for faceting. The components are defined in solrconfig.xml and FacetComponent is in a default chain. See: https://github.com/apache/lucene-solr/blob/trunk/solr/example/solr/collection1/conf/solrconfig.xml#L1194 If you do manage to do this (I would recommend doing it as an extra option), it would be nice to have it contributed back to Solr. I think you are not the only one with this requirement. Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Mon, Jul 15, 2013 at 10:08 AM, Henrik Ossipoff Hansen h...@entertainment-trading.com wrote: Hello, first time writing to the list. I am a developer for a company where we recently switched all of our search core from Sphinx to Solr with very great results. In general we've been very happy with the switch, and everything seems to work just as we want it to. Today however we've run into a bit of a issue regarding faceted sort. For example we have a field called brand in our core, defined as the text_en datatype from the example Solr core. This field is copied into facet_brand with the datatype string (since we don't really need to do much with it except show it for faceted navigation). Now, given these two entries into the field on different documents, LEGO and bObles, and given facet.sort=index, it appears that LEGO is sorted as being before bObles. I assume this is because of casing differences. My question then is, how do we define a decent datatype in our schema, where the casing is exact, but we are able to sort it without casing mattering? Thank you :) Best regards, Henrik Ossipoff
Re: Apache Solr 4 - after 1st commit the index does not grow
Ok, I have removed the OutOfMemory problem by increasing the JVM parameters... and now I have another problem. My index had been working since yesterday evening... the number of documents increased (I run the bin/crawl script every 3 hours and I have 27040 documents now).. but the last increase was 6 hours ago... why did it stop growing again? You can look at my Solr here: http://ir-dev.lmcloud.vse.cz:8082/solr/#/~logging The log says: java.lang.RuntimeException: [was class java.io.CharConversionException] Invalid UTF-8 character 0x at char #2800441, byte #3096524) What is it? How can I solve it? Does anyone have any idea? -- View this message in context: http://lucene.472066.n3.nabble.com/Apache-Solr-4-after-1st-commit-the-index-does-not-grow-tp4077913p4078077.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Apache Solr 4 - after 1st commit the index does not grow
As far as I can see, this is the same problem as one from an older post - http://lucene.472066.n3.nabble.com/strange-utf-8-problem-td3094473.html ...but it went without any response. -- View this message in context: http://lucene.472066.n3.nabble.com/Apache-Solr-4-after-1st-commit-the-index-does-not-grow-tp4077913p4078079.html Sent from the Solr - User mailing list archive at Nabble.com.
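No fix came up in this thread; one blunt workaround (an assumption on the editor's part, not something suggested here) is to strip characters that are not legal in XML 1.0 from the crawled text before it is posted to Solr, for example:

    import java.util.regex.Pattern;

    public class XmlCharCleaner {
        // Allowed by XML 1.0: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
        // (note: this simple filter also drops supplementary-plane characters)
        private static final Pattern ILLEGAL =
                Pattern.compile("[^\\x09\\x0A\\x0D\\x20-\\uD7FF\\uE000-\\uFFFD]");

        public static String clean(String raw) {
            return raw == null ? null : ILLEGAL.matcher(raw).replaceAll("");
        }
    }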
How to pass null OR empty values to fq?
Hi, I am trying to pass empty values to the fq parameter but passing null (or empty) doesn't seem to work for fq. Something like... q=*:*&fq=(field1:test OR null) We are trying to make fq more tolerant by making it not fail whenever a particular variable value is not passed.. Ex: /select?q=*:*&fq=lname:$lname -- lname is empty here and I don't want the query to fail; it should rather just do a pass-through and return everything (returned by q). I can't really use the switch plugin directly as I have a larger number of cases to handle, hence I am trying to handle it by creating a custom component extending the QParserPlugin.. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-pass-null-OR-empty-values-to-fq-tp4078081.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to pass null OR empty values to fq?
I'm more than a little skeptical about your intentions here... just clean up your code and pass clean parameters ONLY!!! Why is that so difficult? You should have an application layer between your application client and Solr, anyway, so what's the difficulty? I mean, why are you just trying so hard just to avoid a few conditional statements in your app layer?? -- Jack Krupansky -Original Message- From: SolrLover Sent: Monday, July 15, 2013 11:43 AM To: solr-user@lucene.apache.org Subject: How to pass null OR empty values to fq? Hi, I am trying to pass empty values to fq parameter but passing null (or empty) doesn't seem to work for fq. Something like... q=*:*fq=(field1:test OR null) We are trying to make fq more tolerant by making not fail whenever a particular variable value is not passed.. Ex: /select?q=*:*fq=lname:$lname -- lname is empty here and I dont want the query to fail rather than just do a pass through and return everything (returned by q). I can't really use swich plugin directly as I have more number of cases to handle hence I am trying to handle it by creating a custom component extending the Qparserplugin.. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-pass-null-OR-empty-values-to-fq-tp4078081.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Replication process on Master/Slave slowing down slave read/search performance
Walter, Could you provide some more details about your staggered replication approach? We are currently running into similar issues and it looks like staggered replication is a better approach to address the performance issues on the slaves. thanks Aditya -- View this message in context: http://lucene.472066.n3.nabble.com/Replication-process-on-Master-Slave-slowing-down-slave-read-search-performance-tp707934p4078090.html Sent from the Solr - User mailing list archive at Nabble.com.
Clearing old nodes from zookeeper without restarting solrcloud cluster
Hi, Is there an easy way to clear zookeeper of all offline solr nodes without restarting the cluster? We are having some stability issues and we think it may be due to the leader querying old offline nodes. thank you, Luis Guerrero
Re: How to pass null OR empty values to fq?
Jack, First, thanks a lot for your response. We hardcode certain queries directly in search component as its easy for us to make changes to the query from SOLR side compared to changing in applications (as many applications - mobile, desktop etc.. use single SOLR instance). We don't want to change the code which forms the query every time the query changes rather just changing the query in SOLR should do the job...Search team controls the boost and other matching criteria hence search team changes the boost more often without affecting the application...Now whenever a particular value is not passed in the query, we are trying to do a pass through so that the entire query doesn't fail (we pass through only when the custom plugin is used along with the query - for ex: !optional is the custom plugin that shouldn't throw any error if a value for any particular variable is not present)... requestHandler name=find class=solr.SearchHandler default=true str name=q ( _query_:{!dismax qf=lname_i v=$lname}^8.3 OR _query_:{!dismax qf=lname_phonetic v=$lname}^8.6 ) ( _query_:{!optional df='addr' qs=1 v=$where}^6.2 OR _query_:{!optional df='addr_i' qs=1 v=$where}^6.2 ) ( _query_:{!dismax qf=person_name v=$fname}^3.9 OR _query_:{!dismax qf=name_phonetic_i v=$fname}^0.9 OR ) /str / -- View this message in context: http://lucene.472066.n3.nabble.com/Re-How-to-pass-null-OR-empty-values-to-fq-tp4078085p4078094.html Sent from the Solr - User mailing list archive at Nabble.com.
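For the blank-value pass-through on its own, the stock switch parser already covers it without a custom QParserPlugin; a sketch of one request (field and parameter names are illustrative, and this does not address the boosting setup above):

    /select?q=*:*&fq={!switch case='*:*' default=$lname_fq v=$lname}&lname_fq={!field f=lname v=$lname}&lname=

When lname is blank, as here, the filter collapses to *:* and everything passes through; when a value is supplied, the default branch applies the real lname filter. In a request handler you would give lname an empty default so the parameter is never missing.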
Re: Replication process on Master/Slave slowing down slave read/search performance
We ran replication at ten minute intervals. One master, five slaves, and replication on the hour on the first slave, ten minutes after the hour on the second, twenty minutes after on the third, and so on. You could do this with a single crontab on the master. Send requests to each slave to replicate. We had a small index (about 250K docs) that was updated once per day. The replication ran every hour, just in case we had to make a mid-day change, which did happen sometimes. wunder On Jul 15, 2013, at 9:20 AM, adityab wrote: Walter, Could you provide some more details about your staggered replication approach? We are currently running into similar issues and looks like staggered replication is a better approach to address the performance issues on Slaves. thanks Aditya -- View this message in context: http://lucene.472066.n3.nabble.com/Replication-process-on-Master-Slave-slowing-down-slave-read-search-performance-tp707934p4078090.html Sent from the Solr - User mailing list archive at Nabble.com.
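For completeness, the per-slave trigger in such a staggered setup is just the replication handler's fetchindex command over HTTP, issued to each slave at its scheduled time (host and core name are placeholders):

    http://slave1-host:8983/solr/collection1/replication?command=fetchindex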
Re: Solr caching clarifications
Great explanation and article. Yes, this buffer for merges seems very small, and still optimized. That's impressive.
Velocity Example: Where is #url_for_home defined?
I am new to using Velocity esp. with Solr. In the Velocity example provided, I am curious where #url_for_home is set i.e. its value assigned? (It is used a lot in the macros defined in VM_global_library.vm.) Thank you in advance, O. O. -- View this message in context: http://lucene.472066.n3.nabble.com/Velocity-Example-Where-is-url-for-home-defined-tp4078104.html Sent from the Solr - User mailing list archive at Nabble.com.
solr 4.3, autocommit, maxdocs
I have a solr 4.3 instance I am in the process of standing up. It started out with an empty index. I have in it's solrconfig.xml, updateHandler class=solr.DirectUpdateHandler2 autoCommit maxDocs10/maxDocs openSearcherfalse/openSearcher /autoCommit updateHandler I have an index process running, that has currently added around 400k documents to Solr. I had expected that a 'commit' would be run every 100k documents, from the above configuration, so 4 commits would have been run by now, and I'd see documents in the index. However, when I look in the Solr admin interface, at my core's 'overview' page, it still says num docs 0, segment count 0. When I expected num docs 400k at this point. Is there something I'm misunderstanding about the configuration or the admin interface? Or am I right in my expectations, but something else must be going wrong? Thanks for any advice, Jonathan
Re: solr 4.3, autocommit, maxdocs
Jonathan, Please note the openSearcher=false part of your configuration. This is why you don't see documents. The commits are occurring, and being written to segments on disk, but they are not visible to the search engine because a Solr searcher class has not opened them for visibility. You can either change the value to true, or alternatively call a deterministic commit call at the end of your load (a solr/update?commit=true will default to openSearcher=true). Hope that's of use! Jason On Jul 15, 2013, at 9:52 AM, Jonathan Rochkind rochk...@jhu.edu wrote: I have a solr 4.3 instance I am in the process of standing up. It started out with an empty index. I have in it's solrconfig.xml, updateHandler class=solr.DirectUpdateHandler2 autoCommit maxDocs10/maxDocs openSearcherfalse/openSearcher /autoCommit updateHandler I have an index process running, that has currently added around 400k documents to Solr. I had expected that a 'commit' would be run every 100k documents, from the above configuration, so 4 commits would have been run by now, and I'd see documents in the index. However, when I look in the Solr admin interface, at my core's 'overview' page, it still says num docs 0, segment count 0. When I expected num docs 400k at this point. Is there something I'm misunderstanding about the configuration or the admin interface? Or am I right in my expectations, but something else must be going wrong? Thanks for any advice, Jonathan
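Sketched as config, Jason's first option plus (as an aside, not mentioned above) the common soft-commit variant that keeps hard commits with openSearcher=false; the values are illustrative, not taken from this thread:

    <!-- hard commit that also opens a new searcher -->
    <autoCommit>
      <maxDocs>100000</maxDocs>
      <openSearcher>true</openSearcher>
    </autoCommit>

    <!-- or: keep openSearcher=false and add a soft commit for visibility -->
    <autoSoftCommit>
      <maxTime>60000</maxTime>
    </autoSoftCommit>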
Change Velocity Template Directory
Is there any way to change the default Velocity directory where the Velocity templates are stored? In the example download, I modified the solrconfig.xml under the Solr Request Handler to add: str name=v.base_dirconf/mycustom//str I have a mycustom directory under the conf directory for the example core, but I still get the “Unable to find resource 'browse.vm'” exception/error. I actually renamed the velocity directory to mycustom. So it has all the template files that Velocity needs - at least that’s what I figured. Thank you in advance for any help, O. O. -- View this message in context: http://lucene.472066.n3.nabble.com/Change-Velocity-Template-Directory-tp4078120.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Change Velocity Template Directory
Try supplying an absolute path. I'm away from my computer so can't check just yet, but it is probably coded to consider that value absolute since moving it generally means you want templates outside of your Solr conf/. Erik On Jul 15, 2013, at 13:25, O. Olson olson_...@yahoo.it wrote: Is there any way to change the default Velocity directory where the Velocity templates are stored? In the example download, I modified the solrconfig.xml under the Solr Request Handler to add: str name=v.base_dirconf/mycustom//str I have a mycustom directory under the conf directory for the example core, but I still get the “Unable to find resource 'browse.vm'” exception/error. I actually renamed the velocity directory to mycustom. So it has all the template files that Velocity needs - at least that’s what I figured. Thank you in advance for any help, O. O. -- View this message in context: http://lucene.472066.n3.nabble.com/Change-Velocity-Template-Directory-tp4078120.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: solr 4.3, autocommit, maxdocs
Ah, thanks for this explanation. Although I don't entirely understand it, I am glad there is an expected explanation! This Solr instance is actually set up to be a replication master. It never gets searched itself, it just replicates to slaves that get searched. Perhaps some time in the past (I am migrating from an already set up Solr 1.4 instance), I set this value to false, figuring it was not neccesary to actually open a searcher, since the master does not get searched itself ordinarily. Despite the opensearcher=false... once committed, are the committed docs still going to be sent via replication to a slave, is the index used for replication actually changed, even though a searcher hasn't been opened to take account of it? Or will the opensearcher=false keep the commits from being seen by replication slaves too? Thanks for any tips, Jonathan On 7/15/13 12:57 PM, Jason Hellman wrote: Jonathan, Please note the openSearcher=false part of your configuration. This is why you don't see documents. The commits are occurring, and being written to segments on disk, but they are not visible to the search engine because a Solr searcher class has not opened them for visibility. You can either change the value to true, or alternatively call a deterministic commit call at the end of your load (a solr/update?commit=true will default to openSearcher=true). Hope that's of use! Jason On Jul 15, 2013, at 9:52 AM, Jonathan Rochkind rochk...@jhu.edu wrote: I have a solr 4.3 instance I am in the process of standing up. It started out with an empty index. I have in it's solrconfig.xml, updateHandler class=solr.DirectUpdateHandler2 autoCommit maxDocs10/maxDocs openSearcherfalse/openSearcher /autoCommit updateHandler I have an index process running, that has currently added around 400k documents to Solr. I had expected that a 'commit' would be run every 100k documents, from the above configuration, so 4 commits would have been run by now, and I'd see documents in the index. However, when I look in the Solr admin interface, at my core's 'overview' page, it still says num docs 0, segment count 0. When I expected num docs 400k at this point. Is there something I'm misunderstanding about the configuration or the admin interface? Or am I right in my expectations, but something else must be going wrong? Thanks for any advice, Jonathan
Example for DIH data source through query string
Hi, I want to dynamically specify the data source in the URL when invoking data import handler. I'm looking at this : http://wiki.apache.org/solr/DataImportHandler#solrconfigdatasource requestHandler name=/dataimport class=org.apache.solr.handler.dataimport.DataImportHandlerlst name=defaults str name=config/home/username/data-config.xml/str lst name=datasource str name=drivercom.mysql.jdbc.Driver/str str name=urljdbc:mysql://localhost/dbname/str str name=userdb_username/str str name=passworddb_password/str /lst/lst /requestHandler Can anyone give me a good example ? ie http://localhost:8983/solr/dataimport?datasource=what goes here ? Your help is much appreciated. Thanks
Re: Doc's FunctionQuery result field in my custom SearchComponent class ?
any help plz !!! On Mon, Jul 15, 2013 at 4:13 PM, Tony Mullins tonymullins...@gmail.comwrote: Please any help on how to get the value of 'freq' field in my custom SearchComponent ? http://localhost:8080/solr/collection2/demoendpoint?q=spiderwt=xmlindent=truefl=*,freq:termfreq%28product,%27spider%27%29 docstr name=id11/strstr name=typeVideo Games/strstr name=formatxbox 360/strstr name=productThe Amazing Spider-Man/strint name=popularity11/intlong name=_version_1439994081345273856/longint name=freq1/int/doc Here is my code DocList docs = rb.getResults().docList; DocIterator iterator = docs.iterator(); int sumFreq = 0; String id = null; for (int i = 0; i docs.size(); i++) { try { int docId = iterator.nextDoc(); // Document doc = searcher.doc(docId, fieldSet); Document doc = searcher.doc(docId); In doc object I can see the schema fields like 'id', 'type','format' etc. but I cannot find the field 'freq' which I needed. Is there any way to get the FunctionQuery fields in doc object ? Thanks, Tony On Mon, Jul 15, 2013 at 1:16 PM, Tony Mullins tonymullins...@gmail.comwrote: Hi, I have extended Solr's SearchComonent class and I am iterating through all the docs in ResponseBuilder in @overrider Process() method. Here I want to get the value of FucntionQuery result but in Document object I am only seeing the standard field of document not the FucntionQuery result. This is my query http://localhost:8080/solr/collection2/demoendpoint?q=spiderwt=xmlindent=truefl=*,freq:termfreq%28product,%27spider%27%29 Result of above query in browser shows me that 'freq' is part of doc but its not there in Document object in my @overrider Process() method. How can I get the value of FunctionQuery result in my custom SearchComponent ? Thanks, Tony
Re: Example for DIH data source through query string
I don't think you can get there from here. But you can specify config file on a query line. If you only have a couple of configurations, you could have them in different files and switch that way. Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Mon, Jul 15, 2013 at 2:56 PM, Kiran J kiranjuni...@gmail.com wrote: Hi, I want to dynamically specify the data source in the URL when invoking data import handler. I'm looking at this : http://wiki.apache.org/solr/DataImportHandler#solrconfigdatasource requestHandler name=/dataimport class=org.apache.solr.handler.dataimport.DataImportHandlerlst name=defaults str name=config/home/username/data-config.xml/str lst name=datasource str name=drivercom.mysql.jdbc.Driver/str str name=urljdbc:mysql://localhost/dbname/str str name=userdb_username/str str name=passworddb_password/str /lst/lst /requestHandler Can anyone give me a good example ? ie http://localhost:8983/solr/dataimport?datasource=what goes here ? Your help is much appreciated. Thanks
Re: Velocity Example: Where is #url_for_home defined?
#url_for_home is defined in conf/velocity/VM_global_library.vm. Note that it builds upon #url_root defined just above it, so maybe that's what you want to adjust if you need to tinker with it. Erik On Jul 15, 2013, at 12:49, O. Olson olson_...@yahoo.it wrote: I am new to using Velocity esp. with Solr. In the Velocity example provided, I am curious where #url_for_home is set i.e. its value assigned? (It is used a lot in the macros defined in VM_global_library.vm.) Thank you in advance, O. O. -- View this message in context: http://lucene.472066.n3.nabble.com/Velocity-Example-Where-is-url-for-home-defined-tp4078104.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Doc's FunctionQuery result field in my custom SearchComponent class ?
Hi, I think the process of retrieving a stored field (through fl) is happens after SearchComponent. One solution: If you wrap a q params with function your score will be a result of the function. For example, http://localhost:8080/solr/collection2/demoendpoint?q=termfreq%28product,%27spider%27%29wt=xmlindent=truefl=*,score Now your score is going to be a result of termfreq(product,'spider') -- Patanachai Tangchaisin On 07/15/2013 12:01 PM, Tony Mullins wrote: any help plz !!! On Mon, Jul 15, 2013 at 4:13 PM, Tony Mullins tonymullins...@gmail.comwrote: Please any help on how to get the value of 'freq' field in my custom SearchComponent ? http://localhost:8080/solr/collection2/demoendpoint?q=spiderwt=xmlindent=truefl=*,freq:termfreq%28product,%27spider%27%29 docstr name=id11/strstr name=typeVideo Games/strstr name=formatxbox 360/strstr name=productThe Amazing Spider-Man/strint name=popularity11/intlong name=_version_1439994081345273856/longint name=freq1/int/doc Here is my code DocList docs = rb.getResults().docList; DocIterator iterator = docs.iterator(); int sumFreq = 0; String id = null; for (int i = 0; i docs.size(); i++) { try { int docId = iterator.nextDoc(); // Document doc = searcher.doc(docId, fieldSet); Document doc = searcher.doc(docId); In doc object I can see the schema fields like 'id', 'type','format' etc. but I cannot find the field 'freq' which I needed. Is there any way to get the FunctionQuery fields in doc object ? Thanks, Tony On Mon, Jul 15, 2013 at 1:16 PM, Tony Mullins tonymullins...@gmail.comwrote: Hi, I have extended Solr's SearchComonent class and I am iterating through all the docs in ResponseBuilder in @overrider Process() method. Here I want to get the value of FucntionQuery result but in Document object I am only seeing the standard field of document not the FucntionQuery result. This is my query http://localhost:8080/solr/collection2/demoendpoint?q=spiderwt=xmlindent=truefl=*,freq:termfreq%28product,%27spider%27%29 Result of above query in browser shows me that 'freq' is part of doc but its not there in Document object in my @overrider Process() method. How can I get the value of FunctionQuery result in my custom SearchComponent ? Thanks, Tony CONFIDENTIALITY NOTICE == This email message and any attachments are for the exclusive use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message along with any attachments, from your computer system. If you are the intended recipient, please be advised that the content of this message is subject to access, review and disclosure by the sender's Email System Administrator.
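If the value has to be computed inside process() itself, one option (a sketch, not something proposed in this thread) is to read the term frequency straight from the index for each hit, e.g. for termfreq(product,'spider'):

    import java.io.IOException;
    import org.apache.lucene.index.DocsEnum;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.MultiFields;
    import org.apache.lucene.util.BytesRef;
    import org.apache.solr.search.SolrIndexSearcher;

    // Term frequency of 'term' in 'field' for one document (top-level Lucene docId,
    // as returned by rb.getResults().docList). Returns 0 if the term is absent.
    private int termFreqForDoc(SolrIndexSearcher searcher, int docId,
                               String field, String term) throws IOException {
        IndexReader reader = searcher.getIndexReader();
        DocsEnum de = MultiFields.getTermDocsEnum(reader, MultiFields.getLiveDocs(reader),
                                                  field, new BytesRef(term));
        // advance() positions the enum at docId if the term occurs in that document
        return (de != null && de.advance(docId) == docId) ? de.freq() : 0;
    }

Creating a fresh DocsEnum per document is the simple (not the fastest) way; in the loop from the earlier code it could be called as termFreqForDoc(searcher, docId, "product", "spider").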
MorphlineSolrSink
Newbie question: I have a Flume server, where I am writing to a sink which is a RollingFile Sink. I have to take these files from the sink and send them to Solr, which can index them and provide search. Do I need to configure MorphlineSolrSink? What is the mechanism to do this, i.e. to send this data over to Solr? Thanks, Rajesh
Different 'fl' for first X results
How to get a different field list in the first X results? For example, in the first 5 results I want fields A, B, C, and on the next results I need only fields A, and B. -- View this message in context: http://lucene.472066.n3.nabble.com/Different-fl-for-first-X-results-tp4078178.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Different 'fl' for first X results
It is not really possible. Why do you actually need it? Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Mon, Jul 15, 2013 at 4:58 PM, Weber solrmaill...@fluidolabs.com wrote: How to get a different field list in the first X results? For example, in the first 5 results I want fields A, B, C, and on the next results I need only fields A, and B. -- View this message in context: http://lucene.472066.n3.nabble.com/Different-fl-for-first-X-results-tp4078178.html Sent from the Solr - User mailing list archive at Nabble.com.
Solr 4.3.1: Errors When Attempting to Index LatLon Fields
I'm trying to index documents containing geo-spatial coordinates using Solr 4.3.1 and am running into some difficulties. Whenever I attempt to index a particular document containing a geospatial coordinate pair (using post.jar), the operation fails as follows: SimplePostTool version 1.5 Posting files to base url http://localhost:8080/solr/update using content-type application/xml.. POSTing file rib1.xml SimplePostTool: WARNING: Solr returned an error #400 Bad Request SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8080/solr/update 1 files indexed. COMMITting Solr index changes to http://localhost:8080/solr/update.. Time spent: 0:00:00.063 The solr log shows the following: 08:30:39 ERROR SolrCore org.apache.solr.common.SolrException: undefined field: geoFindspot_0_coordinate There relevant parts of my schema.xml are: field name=geoFindspot type=location indexed=true stored=true multiValued=true/ ... fieldType name=location class=solr.LatLonType subFieldSuffix=_coordinate/ dynamicField name=*_coordinate type=tdouble indexed=true stored=false / The document I am attempting to index has this field: field name=geoFindspot51.512332,-0.090588/field As far as I can tell, my configuration complies with the instructions on the relevant Wiki page (http://wiki.apache.org/solr/SpatialSearch) and I can see nothing amiss. Any suggestions as to why this is failing would be greatly appreciated. Thank you!
Re: Velocity Example: Where is #url_for_home defined?
Thank you very much Erik. That’s exactly what I was looking for. I can swear I looked into VM_global_library.vm. I'm not sure how I missed it :-( O. O. Erik Hatcher-4 wrote #url_for_home is defined in conf/velocity/VM_global_library.vm. Note that it builds upon #url_root defined just above it, so maybe that's what you want to adjust if you need to tinker with it. Erik -- View this message in context: http://lucene.472066.n3.nabble.com/Velocity-Example-Where-is-url-for-home-defined-tp4078104p4078186.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Different 'fl' for first X results
1. Request all fields needed for all results and simply ignore the extra field(s) (which can be empty or missing and will automatically be ignored by Solr anyway). 2. Two separate query requests. 3. A custom search component. 4. Wait for the new scripted query request handler that gives you full control in a custom script. -- Jack Krupansky -Original Message- From: Weber Sent: Monday, July 15, 2013 4:58 PM To: solr-user@lucene.apache.org Subject: Different 'fl' for first X results How to get a different field list in the first X results? For example, in the first 5 results I want fields A, B, C, and on the next results I need only fields A, and B. -- View this message in context: http://lucene.472066.n3.nabble.com/Different-fl-for-first-X-results-tp4078178.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Change Velocity Template Directory
Thank you Erik. I did not think the Windows file/directory path format would work for Solr. For others the following worked for me: str name=v.base_dirC:\Users\MyUsername\Solr\example\example-DIH\solr\db\conf\mycustom\/str Erik Hatcher-4 wrote Try supplying an absolute path. I'm away from my computer so can't check just yet, but it is probably coded to consider that value absolute since moving it generally means you want templates outside of your Solr conf/. Erik -- View this message in context: http://lucene.472066.n3.nabble.com/Change-Velocity-Template-Directory-tp4078120p4078188.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr 4.3.1: Errors When Attempting to Index LatLon Fields
Make sure that dynamicFields are within fields rather than types. Solr tends to ignore misplaced configuration elements. -- Jack Krupansky -Original Message- From: Scott Vanderbilt Sent: Monday, July 15, 2013 5:10 PM To: solr-user@lucene.apache.org Subject: Solr 4.3.1: Errors When Attempting to Index LatLon Fields I'm trying to index documents containing geo-spatial coordinates using Solr 4.3.1 and am running into some difficulties. Whenever I attempt to index a particular document containing a geospatial coordinate pair (using post.jar), the operation fails as follows: SimplePostTool version 1.5 Posting files to base url http://localhost:8080/solr/update using content-type application/xml.. POSTing file rib1.xml SimplePostTool: WARNING: Solr returned an error #400 Bad Request SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8080/solr/update 1 files indexed. COMMITting Solr index changes to http://localhost:8080/solr/update.. Time spent: 0:00:00.063 The solr log shows the following: 08:30:39 ERROR SolrCore org.apache.solr.common.SolrException: undefined field: geoFindspot_0_coordinate There relevant parts of my schema.xml are: field name=geoFindspot type=location indexed=true stored=true multiValued=true/ ... fieldType name=location class=solr.LatLonType subFieldSuffix=_coordinate/ dynamicField name=*_coordinate type=tdouble indexed=true stored=false / The document I am attempting to index has this field: field name=geoFindspot51.512332,-0.090588/field As far as I can tell, my configuration complies with the instructions on the relevant Wiki page (http://wiki.apache.org/solr/SpatialSearch) and I can see nothing amiss. Any suggestions as to why this is failing would be greatly appreciated. Thank you!
Re: Solr 4.3.1: Errors When Attempting to Index LatLon Fields
Brilliant. That's precisely what the issue was. The Wiki didn't give a context for where the dynamicField element was supposed to go and I assumed (incorrectly) that it was in types. Of course, I should not have assumed that and verified it independently. Mea culpa. Thanks the gentle application of the clue stick. g On 7/15/2013 2:25 PM, Jack Krupansky wrote: Make sure that dynamicFields are within fields rather than types. Solr tends to ignore misplaced configuration elements. -- Jack Krupansky -Original Message- From: Scott Vanderbilt Sent: Monday, July 15, 2013 5:10 PM To: solr-user@lucene.apache.org Subject: Solr 4.3.1: Errors When Attempting to Index LatLon Fields I'm trying to index documents containing geo-spatial coordinates using Solr 4.3.1 and am running into some difficulties. Whenever I attempt to index a particular document containing a geospatial coordinate pair (using post.jar), the operation fails as follows: SimplePostTool version 1.5 Posting files to base url http://localhost:8080/solr/update using content-type application/xml.. POSTing file rib1.xml SimplePostTool: WARNING: Solr returned an error #400 Bad Request SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8080/solr/update 1 files indexed. COMMITting Solr index changes to http://localhost:8080/solr/update.. Time spent: 0:00:00.063 The solr log shows the following: 08:30:39 ERROR SolrCore org.apache.solr.common.SolrException: undefined field: geoFindspot_0_coordinate There relevant parts of my schema.xml are: field name=geoFindspot type=location indexed=true stored=true multiValued=true/ ... fieldType name=location class=solr.LatLonType subFieldSuffix=_coordinate/ dynamicField name=*_coordinate type=tdouble indexed=true stored=false / The document I am attempting to index has this field: field name=geoFindspot51.512332,-0.090588/field As far as I can tell, my configuration complies with the instructions on the relevant Wiki page (http://wiki.apache.org/solr/SpatialSearch) and I can see nothing amiss. Any suggestions as to why this is failing would be greatly appreciated. Thank you!
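For anyone hitting the same error, the layout Jack describes looks like this (field names from the thread; the tdouble type is the one from the example schema):

    <fields>
      <field name="geoFindspot" type="location" indexed="true" stored="true" multiValued="true"/>
      <dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>
    </fields>

    <types>
      <fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
      <fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" positionIncrementGap="0"/>
    </types>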
Re: ACL implementation: Pseudo-join performance Atomic Updates
On Sun, Jul 14, 2013 at 1:45 PM, Oleg Burlaca oburl...@gmail.com wrote: Hello Erick, Join performance is most sensitive to the number of values in the field being joined on. So if you have lots and lots of distinct values in the corpus, join performance will be affected. Yep, we have a list of unique Id's that we get by first searching for records where loggedInUser IS IN (userIDs) This corpus is stored in memory I suppose? (not a problem) and then the bottleneck is to match this huge set with the core where I'm searching? Somewhere in maillist archive people were talking about external list of Solr unique IDs but didn't find if there is a solution. Back in 2010 Yonik posted a comment: http://find.searchhub.org/document/363a4952446b3cd#363a4952446b3cd sorry, haven't the previous thread in its entirety, but few weeks back that Yonik's proposal got implemented, it seems ;) http://search-lucene.com/m/Fa3Dg14mqoj/bitsetsubj=Re+Solr+large+boolean+filter You could use this to send very large bitset filter (which can be translated into any integers, if you can come up with a mapping function). roman bq: I suppose the delete/reindex approach will not change soon There is ongoing work (search the JIRA for Stacked Segments) Ah, ok, I was feeling it affects the architecture, ok, now the only hope is Pseudo-Joins )) One way to deal with this is to implement a post filter, sometimes called a no cache filter. thanks, will have a look, but as you describe it, it's not the best option. The approach too many documents, man. Please refine your query. Partial results below means faceting will not work correctly? ... I have in mind a hybrid approach, comments welcome: Most of the time users are not searching, but browsing content, so our virtual filesystem stored in SOLR will use only the index with the Id of the file and the list of users that have access to it. i.e. not touching the fulltext index at all. Files may have metadata (EXIF info for images for ex) that we'd like to filter by, calculate facets. Meta will be stored in both indexes. In case of a fulltext query: 1. search FT index (the fulltext index), get only the number of search results, let it be Rf 2. search DAC index (the index with permissions), get number of search results, let it be Rd let maxR be the maximum size of the corpus for the pseudo-join. *That was actually my question: what is a reasonable number? 10, 100, 1000 ? * if (Rf maxR) or (Rd maxR) then use the smaller corpus to join onto the second one. this happens when (only a few documents contains the search query) OR (user has access to a small number of files). In case none of these happens, we can use the too many documents, man. Please refine your query. Partial results below but first searching the FT index, because we want relevant results first. What do you think? Regards, Oleg On Sun, Jul 14, 2013 at 7:42 PM, Erick Erickson erickerick...@gmail.com wrote: Join performance is most sensitive to the number of values in the field being joined on. So if you have lots and lots of distinct values in the corpus, join performance will be affected. bq: I suppose the delete/reindex approach will not change soon There is ongoing work (search the JIRA for Stacked Segments) on actually doing something about this, but it's been under consideration for at least 3 years so your guess is as good as mine. bq: notice that the worst situation is when everyone has access to all the files, it means the first filter will be the full index. 
One way to deal with this is to implement a post filter, sometimes called a no cache filter. The distinction here is that 1 it is not cached (duh!) 2 it is only called for documents that have made it through all the other lower cost filters (and the main query of course). 3 lower cost means the filter is either a standard, cached filters and any no cache filters with a cost (explicitly stated in the query) lower than this one's. Critically, and unlike normal filter queries, the result set is NOT calculated for all documents ahead of time You _still_ have to deal with the sysadmin doing a *:* query as you are well aware. But one can mitigate that by having the post-filter fail all documents after some arbitrary N, and display a message in the app like too many documents, man. Please refine your query. Partial results below. Of course this may not be acceptable, but HTH Erick On Sun, Jul 14, 2013 at 12:05 PM, Jack Krupansky j...@basetechnology.com wrote: Take a look at LucidWorks Search and its access control: http://docs.lucidworks.com/display/help/Search+Filters+for+Access+Control Role-based security is an easier nut to crack. Karl Wright of ManifoldCF had a Solr patch for document access control at one point: SOLR-1895 - ManifoldCF SearchComponent plugin for enforcing
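A rough sketch of the post filter Erick describes, using the Solr 4.x PostFilter/DelegatingCollector API (the ACL check and the hard cap are placeholders, and the small QParserPlugin that would produce this query is omitted):

    import java.io.IOException;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.solr.search.DelegatingCollector;
    import org.apache.solr.search.ExtendedQueryBase;
    import org.apache.solr.search.PostFilter;

    public class AclPostFilter extends ExtendedQueryBase implements PostFilter {

        public AclPostFilter() {
            setCache(false); // post filters must not be cached
            setCost(100);    // cost >= 100 runs this after the main query and cheaper filters
        }

        @Override
        public DelegatingCollector getFilterCollector(IndexSearcher searcher) {
            return new DelegatingCollector() {
                private int passed = 0;

                @Override
                public void collect(int doc) throws IOException {
                    // Only docs that survived the main query and the cheaper filters reach here.
                    if (passed < 100000 && userMayRead(doc)) { // arbitrary cap, per the "too many documents" idea
                        passed++;
                        super.collect(doc);
                    }
                }
            };
        }

        private boolean userMayRead(int doc) {
            return true; // placeholder for the real per-document ACL lookup
        }
    }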
Re: Different 'fl' for first X results
Is there a JIRA number for the last one? Regards, Alex On 15 Jul 2013 17:21, Jack Krupansky j...@basetechnology.com wrote: 1. Request all fields needed for all results and simply ignore the extra field(s) (which can be empty or missing and will automatically be ignored by Solr anyway). 2. Two separate query requests. 3. A custom search component. 4. Wait for the new scripted query request handler that gives you full control in a custom script. -- Jack Krupansky -Original Message- From: Weber Sent: Monday, July 15, 2013 4:58 PM To: solr-user@lucene.apache.org Subject: Different 'fl' for first X results How to get a different field list in the first X results? For example, in the first 5 results I want fields A, B, C, and on the next results I need only fields A, and B. -- View this message in context: http://lucene.472066.n3.** nabble.com/Different-fl-for-**first-X-results-tp4078178.htmlhttp://lucene.472066.n3.nabble.com/Different-fl-for-first-X-results-tp4078178.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Different 'fl' for first X results
SOLR-5005 - JavaScriptRequestHandler https://issues.apache.org/jira/browse/SOLR-5005 -- Jack Krupansky -Original Message- From: Alexandre Rafalovitch Sent: Monday, July 15, 2013 6:56 PM To: solr-user@lucene.apache.org Subject: Re: Different 'fl' for first X results Is there a JIRA number for the last one? Regards, Alex On 15 Jul 2013 17:21, Jack Krupansky j...@basetechnology.com wrote: 1. Request all fields needed for all results and simply ignore the extra field(s) (which can be empty or missing and will automatically be ignored by Solr anyway). 2. Two separate query requests. 3. A custom search component. 4. Wait for the new scripted query request handler that gives you full control in a custom script. -- Jack Krupansky -Original Message- From: Weber Sent: Monday, July 15, 2013 4:58 PM To: solr-user@lucene.apache.org Subject: Different 'fl' for first X results How to get a different field list in the first X results? For example, in the first 5 results I want fields A, B, C, and on the next results I need only fields A, and B. -- View this message in context: http://lucene.472066.n3.** nabble.com/Different-fl-for-**first-X-results-tp4078178.htmlhttp://lucene.472066.n3.nabble.com/Different-fl-for-first-X-results-tp4078178.html Sent from the Solr - User mailing list archive at Nabble.com.
Changes in DirectSpellChecker configuration cause hang on startup
Hi All, I changed the name of the queryAnalyzerFieldType for my spellcheck component and the corresponding field and now when solr starts up, it hangs at this point: 5797 [searcherExecutor-4-thread-1] INFO org.apache.solr.core.SolrCore – QuerySenderListener sending requests to Searcher@153d12bfmain{StandardDirectoryReader(segments_k9p:127340 _1cz(4.3):C387286/120 _2u1(4.3):C405320/146 _4pl(4.3):C493017/136 _65a(4.3):C322122/160 _7ky(4.3):C312296/147 _936(4.3):C326967/135 _b9j(4.3):C474140/229 _cyy(4.3):C298811/88428 _124m(4.3):C622322/137649 My config for the spellcheckcomponent: searchComponent name=spellcheck class=solr.SpellCheckComponent str name=queryAnalyzerFieldTypemarkup/str !-- Multiple Spell Checkers can be declared and used by this component -- !-- a spellchecker built from a field of the main index -- lst name=spellchecker str name=namedefault/str str name=fieldmarkup_texts/str str name=classnamesolr.DirectSolrSpellChecker/str !-- the spellcheck distance measure used, the default is the internal levenshtein -- str name=distanceMeasureinternal/str !-- minimum accuracy needed to be considered a valid spellcheck suggestion -- float name=accuracy0.5/float !-- the maximum #edits we consider when enumerating terms: can be 1 or 2 -- int name=maxEdits1/int !-- the minimum shared prefix when enumerating terms -- int name=minPrefix1/int !-- maximum number of inspections per result. -- int name=maxInspections5/int !-- minimum length of a query term to be considered for correction -- int name=minQueryLength4/int !-- maximum threshold of documents a query term can appear to be considered for correction -- float name=maxQueryFrequency0.01/float !-- uncomment this to require suggestions to occur in 1% of the documents float name=thresholdTokenFrequency.01/float -- /lst Has anyone got some insight? Thanks
How to use joins in solr 4.3.1
Hello, I am trying to join data between two cores: merchant and location This is my query: http://_server_.com:8983/solr/location/select?q={!join from=merchantId to=merchantId fromIndex=merchant}walgreens Ref: http://wiki.apache.org/solr/Join Merchants core has documents for the query: walgreens with an merchantId 1 A simple query: http://_server_.com:8983/solr/location/select?q=walgreens returns documents called walgreens with merchantId=1 Location core has documents with merchantId=1 too. But my join query returns no documents. This is the response I get: { responseHeader:{ status:0, QTime:5, params:{ debugQuery:true, indent:true, q:{!join from=merchantId to=merchantId fromIndex=merchant}walgreens, wt:json}}, response:{numFound:0,start:0,maxScore:0.0,docs:[] }, debug:{ rawquerystring:{!join from=merchantId to=merchantId fromIndex=merchant}walgreens, querystring:{!join from=merchantId to=merchantId fromIndex=merchant}walgreens, parsedquery:JoinQuery({!join from=merchantId to=merchantId fromIndex=merchant}allText:walgreens), parsedquery_toString:{!join from=merchantId to=merchantId fromIndex=merchant}allText:walgreens, QParser:, explain:{}}} Any suggestions? -- Thanks, -Utkarsh
Re: How to use joins in solr 4.3.1
I have also tried these queries (as per this SO answer: http://stackoverflow.com/questions/12665797/is-solr-4-0-capable-of-using-join-for-multiple-core ) 1. http://_server_.com:8983/solr/location/select?q=:fq={!join from=merchantId to=merchantId fromIndex=merchant}walgreens And I get this: { responseHeader:{ status:400, QTime:1, params:{ indent:true, q::, wt:json, fq:{!join from=merchantId to=merchantId fromIndex=merchant}walgreens}}, error:{ msg:org.apache.solr.search.SyntaxError: Cannot parse ':': Encountered \ \:\ \: \\ at line 1, column 0.\nWas expecting one of:\nNOT ...\n\+\ ...\n\-\ ...\nBAREOPER ...\n \(\ ...\n\*\ ...\nQUOTED ...\nTERM ...\n PREFIXTERM ...\nWILDTERM ...\nREGEXPTERM ...\n\[\ ...\n\{\ ...\nLPARAMS ...\nNUMBER ...\nTERM ...\n\*\ ...\n, code:400}} And this: 2.http://_server_.com:8983/solr/location/select?q=walgreensfq={!join from=merchantId to=merchantId fromIndex=merchant} { responseHeader:{ status:500, QTime:5, params:{ indent:true, q:walgreens, wt:json, fq:{!join from=merchantId to=merchantId fromIndex=merchant}}}, error:{ msg:Server at http://_SERVER_:8983/solr/location returned non ok status:500, message:Server Error, trace:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Server at http://_SERVER_:8983/solr/location returned non ok status:500, message:Server Error\n\tat org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:372)\n\tat org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)\n\tat org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:156)\n\tat org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:119)\n\tat java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:138)\n\tat java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)\n\tat java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:138)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)\n\tat java.lang.Thread.run(Thread.java:662)\n, code:500}} Thanks, -Utkarsh On Mon, Jul 15, 2013 at 4:27 PM, Utkarsh Sengar utkarsh2...@gmail.comwrote: Hello, I am trying to join data between two cores: merchant and location This is my query: http://_server_.com:8983/solr/location/select?q={!join from=merchantId to=merchantId fromIndex=merchant}walgreens Ref: http://wiki.apache.org/solr/Join Merchants core has documents for the query: walgreens with an merchantId 1 A simple query: http://_server_.com:8983/solr/location/select?q=walgreens returns documents called walgreens with merchantId=1 Location core has documents with merchantId=1 too. But my join query returns no documents. This is the response I get: { responseHeader:{ status:0, QTime:5, params:{ debugQuery:true, indent:true, q:{!join from=merchantId to=merchantId fromIndex=merchant}walgreens, wt:json}}, response:{numFound:0,start:0,maxScore:0.0,docs:[] }, debug:{ rawquerystring:{!join from=merchantId to=merchantId fromIndex=merchant}walgreens, querystring:{!join from=merchantId to=merchantId fromIndex=merchant}walgreens, parsedquery:JoinQuery({!join from=merchantId to=merchantId fromIndex=merchant}allText:walgreens), parsedquery_toString:{!join from=merchantId to=merchantId fromIndex=merchant}allText:walgreens, QParser:, explain:{}}} Any suggestions? 
-- Thanks, -Utkarsh -- Thanks, -Utkarsh
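For what it's worth, the cross-core join is usually written with the search term qualified against a field of the from core and the join kept in fq; a sketch using the allText field that appears in the debug output above (the rest is an assumption, not a verified fix):

    http://_server_.com:8983/solr/location/select?q=*:*&fq={!join from=merchantId to=merchantId fromIndex=merchant}allText:walgreens

Both cores have to live in the same Solr instance and the merchantId field types should match on both sides; the fq attempt above returned a 500 because the join parser was given no query to run against the merchant core.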
Re: SolrCloud: how to index documents into a specific core and how to search against that core?
Yandong, have you figured out if it works for you to use one collection per customer? We have the same use-case as yours: customer ids are used as core names. That was the reason our company did not upgrade to SolrCloud ... I might remember it wrong, but I vaguely remember looking into using a collection for each customer, and it seems the number of collections in the current release is fixed, isn't it? thanks Jie -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-how-to-index-documents-into-a-specific-core-and-how-to-search-against-that-core-tp3985262p4078210.html Sent from the Solr - User mailing list archive at Nabble.com.
SolrCloud: Collection API question and problem with core loading
Hi there, I run 2 Solr instances (Tomcat 7, Solr 4.3.0, one shard), one external Zookeeper instance, and have lots of cores. I use the Collections API to create new cores dynamically after the configuration for the core is uploaded to Zookeeper, and it all works fine. As there are so many cores it takes a very long time to load them at startup, so I would like to start the server quickly and load the cores on demand. When a core is created via the Collections API it is created with the default parameter loadOnStartup=true (this can be seen in solr.xml). Question: is there a way to specify this parameter so it can be set to 'false' via the Collections API? Problem: If I manually set loadOnStartup=false for the core, I get the exception below when I use CloudSolrServer to query the core: Error: org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request Seems to me that CloudSolrServer will not trigger the core to be loaded. Is it possible to get the core loaded using CloudSolrServer? Regards, Patrick
Re: Example for DIH data source through query string
Thank you Alex. On Mon, Jul 15, 2013 at 12:37 PM, Alexandre Rafalovitch arafa...@gmail.comwrote: I don't think you can get there from here. But you can specify config file on a query line. If you only have a couple of configurations, you could have them in different files and switch that way. Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Mon, Jul 15, 2013 at 2:56 PM, Kiran J kiranjuni...@gmail.com wrote: Hi, I want to dynamically specify the data source in the URL when invoking data import handler. I'm looking at this : http://wiki.apache.org/solr/DataImportHandler#solrconfigdatasource requestHandler name=/dataimport class=org.apache.solr.handler.dataimport.DataImportHandlerlst name=defaults str name=config/home/username/data-config.xml/str lst name=datasource str name=drivercom.mysql.jdbc.Driver/str str name=urljdbc:mysql://localhost/dbname/str str name=userdb_username/str str name=passworddb_password/str /lst/lst /requestHandler Can anyone give me a good example ? ie http://localhost:8983/solr/dataimport?datasource=what goes here ? Your help is much appreciated. Thanks
Book contest idea - feedback requested
Hello, Packt Publishing has kindly agreed to let me run a contest with e-copies of my book as prizes: http://www.packtpub.com/apache-solr-for-indexing-data/book Since my book is about learning Solr and targeted at beginners and early intermediates, here is what I would like to do. I am asking for feedback on whether people on the mailing list like the idea or have specific objections to it. 1) The basic idea is to get Solr users and write and vote on what they find hard with Solr, especially in understanding the features (as contrasted with just missing ones). 2) I'll probably set it up as a User Voice forum, which has all the mechanisms for suggesting and voting on ideas. With an easier interface than Jira 3) The top N voted ideas will get the books as prizes and I will try to fix/document/create JIRAs for those issues. 4) I am hoping to specifically reach out to the communities where Solr is a component and where they don't necessarily hang out on our mailing list. I am thinking SolrNet, Drupal, project Blacklight, Cloudera, CrafterCMS, SiteCore, Typo3, SunSpot, Nutch. Obviously, anybody and everybody from this list would be absolutely welcome to participate as well. Yes? No? Suggestions? Also, if you are maintainer of one of the products/services/libraries that has Solr in it and want to reach out to your community yourself, I think it would be a lot better than If I did it. Contact me directly and I will let you know what template/FAQ I want you to include in the announcement message when it is ready. Thank you all in advance for the comments and suggestions. Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
Re: MorphlineSolrSink
Rajesh, I think this question is better suited for the FLUME user mailing list. You will need to configure the sink with the expected values so that the events from the channels can head to the right place. On Mon, Jul 15, 2013 at 4:49 PM, Rajesh Jain rjai...@gmail.com wrote: Newbie question: I have a Flume server, where I am writing to sink which is a RollingFile Sink. I have to take this files from the sink and send it to Solr which can index and provide search. Do I need to configure MorphineSolrSink? What is the mechanism's to do this or send this data over to Solr. Thanks, Rajesh -- °O° Good Enough is not good enough. To give anything less than your best is to sacrifice the gift. Quality First. Measure Twice. Cut Once. http://www.israelekpo.com/
Re: Clearing old nodes from zookeeper without restarting solrcloud cluster
I know that you can clear zookeeper's data directory using the CLI with the clear command; I just want to know if it's possible to update the cluster's state without wiping everything out. Anyone have any ideas/suggestions? On Mon, Jul 15, 2013 at 11:21 AM, Luis Carlos Guerrero Covo lcguerreroc...@gmail.com wrote: Hi, Is there an easy way to clear zookeeper of all offline solr nodes without restarting the cluster? We are having some stability issues and we think it may be due to the leader querying old offline nodes. thank you, Luis Guerrero -- Luis Carlos Guerrero Covo M.S. Computer Engineering (57) 3183542047
Re: Book contest idea - feedback requested
Hello Alex, This sounds like an excellent idea! :) Saqib On Mon, Jul 15, 2013 at 8:11 PM, Alexandre Rafalovitch arafa...@gmail.comwrote: Hello, Packt Publishing has kindly agreed to let me run a contest with e-copies of my book as prizes: http://www.packtpub.com/apache-solr-for-indexing-data/book Since my book is about learning Solr and targeted at beginners and early intermediates, here is what I would like to do. I am asking for feedback on whether people on the mailing list like the idea or have specific objections to it. 1) The basic idea is to get Solr users and write and vote on what they find hard with Solr, especially in understanding the features (as contrasted with just missing ones). 2) I'll probably set it up as a User Voice forum, which has all the mechanisms for suggesting and voting on ideas. With an easier interface than Jira 3) The top N voted ideas will get the books as prizes and I will try to fix/document/create JIRAs for those issues. 4) I am hoping to specifically reach out to the communities where Solr is a component and where they don't necessarily hang out on our mailing list. I am thinking SolrNet, Drupal, project Blacklight, Cloudera, CrafterCMS, SiteCore, Typo3, SunSpot, Nutch. Obviously, anybody and everybody from this list would be absolutely welcome to participate as well. Yes? No? Suggestions? Also, if you are maintainer of one of the products/services/libraries that has Solr in it and want to reach out to your community yourself, I think it would be a lot better than If I did it. Contact me directly and I will let you know what template/FAQ I want you to include in the announcement message when it is ready. Thank you all in advance for the comments and suggestions. Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
Re: Clearing old nodes from zookeper without restarting solrcloud cluster
Hello Luis, I don't think that is possible. If you delete clusterstate.json from zookeeper, you will need to restart the nodes. I could be very wrong about this. Saqib On Mon, Jul 15, 2013 at 8:50 PM, Luis Carlos Guerrero Covo lcguerreroc...@gmail.com wrote: I know that you can clear zookeeper's data directory using the CLI with the clear command; I just want to know if it's possible to update the cluster's state without wiping everything out. Anyone have any ideas/suggestions? On Mon, Jul 15, 2013 at 11:21 AM, Luis Carlos Guerrero Covo lcguerreroc...@gmail.com wrote: Hi, Is there an easy way to clear zookeeper of all offline solr nodes without restarting the cluster? We are having some stability issues and we think it may be due to the leader querying old offline nodes. thank you, Luis Guerrero -- Luis Carlos Guerrero Covo M.S. Computer Engineering (57) 3183542047
Re: MorphlineSolrSink
On Tue, Jul 16, 2013 at 2:19 AM, Rajesh Jain rjai...@gmail.com wrote: Newbie question: I have a Flume server where I am writing to a RollingFile sink. I have to take these files from the sink and send them to Solr, which can index them and provide search. Do I need to configure MorphlineSolrSink? Yes What is the mechanism to do this or to send this data over to Solr? More details here http://flume.apache.org/FlumeUserGuide.html#morphlinesolrsink As suggested, please move further related questions to the Flume user mailing list. Thanks, Rajesh -- thanks ashish Blog: http://www.ashishpaliwal.com/blog My Photo Galleries: http://www.pbase.com/ashishpaliwal
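For anyone landing on this thread later, a rough sketch of what the sink section of a Flume agent configuration could look like; the agent/channel names and the morphline file path below are made up for illustration, and the authoritative property list is in the user guide linked above:

agent.sinks.solrSink.type = org.apache.flume.sink.solr.morphline.MorphlineSolrSink
agent.sinks.solrSink.channel = fileChannel
agent.sinks.solrSink.morphlineFile = /etc/flume/conf/morphline.conf
agent.sinks.solrSink.batchSize = 100
agent.sinks.solrSink.batchDurationMillis = 1000

The referenced morphline.conf is what actually parses each event and loads it into Solr (typically ending in a loadSolr command), so with this sink in place the events do not need to pass through a RollingFile sink first.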
Re: Facet sorting seems weird
Alex, You could submit a JIRA ticket and add an option like facet.sort=insensitive, along with the per-field f.<fieldname>. syntax. Then we all get the benefit of the new feature. On Mon, Jul 15, 2013 at 9:16 AM, Alexandre Rafalovitch arafa...@gmail.com wrote: Hi Henrik, If I understand the question correctly (case-insensitive sorting of the facet values), then this is a limitation of the current FacetComponent. You can see the full implementation at: https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/handler/component/FacetComponent.java#L818 If you are comfortable with Java code, the easiest thing might be to copy/fix the component and use your own one for faceting. The components are defined in solrconfig.xml and FacetComponent is in the default chain. See: https://github.com/apache/lucene-solr/blob/trunk/solr/example/solr/collection1/conf/solrconfig.xml#L1194 If you do manage to do this (I would recommend doing it as an extra option), it would be nice to have it contributed back to Solr; I think you are not the only one with this requirement. Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Mon, Jul 15, 2013 at 10:08 AM, Henrik Ossipoff Hansen h...@entertainment-trading.com wrote: Hello, first time writing to the list. I am a developer for a company where we recently switched all of our search from Sphinx to Solr with very great results. In general we've been very happy with the switch, and everything seems to work just as we want it to. Today, however, we've run into a bit of an issue regarding faceted sorting. For example, we have a field called brand in our core, defined with the text_en datatype from the example Solr core. This field is copied into facet_brand with the datatype string (since we don't really need to do much with it except show it for faceted navigation). Now, given these two entries in the field on different documents, LEGO and bObles, and given facet.sort=index, it appears that LEGO is sorted before bObles. I assume this is because of casing differences. My question then is: how do we define a decent datatype in our schema where the casing is preserved exactly, but we are able to sort without the casing mattering? Thank you :) Best regards, Henrik Ossipoff -- Bill Bell billnb...@gmail.com cell 720-256-8076
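There is no such option today, but as a rough illustration of the copy-or-wrap route Alex describes, something along these lines could be registered as a custom last-components SearchComponent that re-sorts the returned facet values case-insensitively after FacetComponent has run. This is only a sketch; the class name and registration details are hypothetical, and it simply reorders whatever values FacetComponent already put in the response (so facet.limit still applies against the original index order):

import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.solr.common.util.NamedList;
import org.apache.solr.common.util.SimpleOrderedMap;
import org.apache.solr.handler.component.ResponseBuilder;
import org.apache.solr.handler.component.SearchComponent;

// Hypothetical component: declared with <searchComponent> in solrconfig.xml and added to
// the request handler's last-components list so it runs after FacetComponent.
public class CaseInsensitiveFacetSortComponent extends SearchComponent {

  @Override
  public void prepare(ResponseBuilder rb) throws IOException {
    // nothing to prepare
  }

  @Override
  @SuppressWarnings("unchecked")
  public void process(ResponseBuilder rb) throws IOException {
    NamedList<Object> facetCounts = (NamedList<Object>) rb.rsp.getValues().get("facet_counts");
    if (facetCounts == null) return;
    NamedList<Object> facetFields = (NamedList<Object>) facetCounts.get("facet_fields");
    if (facetFields == null) return;

    for (int i = 0; i < facetFields.size(); i++) {
      NamedList<Object> counts = (NamedList<Object>) facetFields.getVal(i);
      List<String> values = new ArrayList<String>();
      for (int j = 0; j < counts.size(); j++) {
        values.add(counts.getName(j));
      }
      // Sort the facet values ignoring case, then rebuild the value/count list in that order.
      Collections.sort(values, String.CASE_INSENSITIVE_ORDER);
      NamedList<Object> sorted = new SimpleOrderedMap<Object>();
      for (String value : values) {
        sorted.add(value, counts.get(value));
      }
      facetFields.setVal(i, sorted);
    }
  }

  @Override
  public String getDescription() {
    return "Re-sorts facet_fields values case-insensitively";
  }

  public String getSource() {
    return ""; // abstract in Solr 4.x's SolrInfoMBean, unused here
  }
}

With this in place the facet field can stay a plain string type, so the original casing (LEGO, bObles) is still what the client sees.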
Re: Book contest idea - feedback requested
Hi Alex, great please go ahead.. -Sandeep On Tue, Jul 16, 2013 at 9:40 AM, Ali, Saqib docbook@gmail.com wrote: Hello Alex, This sounds like an excellent idea! :) Saqib On Mon, Jul 15, 2013 at 8:11 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: Hello, Packt Publishing has kindly agreed to let me run a contest with e-copies of my book as prizes: http://www.packtpub.com/apache-solr-for-indexing-data/book ...
Re: Doc's FunctionQuery result field in my custom SearchComponent class ?
No sorry, I am still not getting the termfreq() field in my 'doc' object. I do get the _version_ field in my 'doc' object, which I think is realValue=StoredField. At which point does termfreq() or any other FunctionQuery field become part of the doc object in Solr? And at that point, can I perform some custom logic and append it to the response? Thanks, Tony

On Tue, Jul 16, 2013 at 1:34 AM, Patanachai Tangchaisin patanachai.tangchai...@wizecommerce.com wrote: Hi, I think the process of retrieving a stored field (through fl) happens after the SearchComponents run. One solution: if you wrap the q param with the function, your score will be the result of the function. For example, http://localhost:8080/solr/collection2/demoendpoint?q=termfreq%28product,%27spider%27%29&wt=xml&indent=true&fl=*,score Now your score is going to be the result of termfreq(product,'spider'). -- Patanachai Tangchaisin

On 07/15/2013 12:01 PM, Tony Mullins wrote: any help plz !!!

On Mon, Jul 15, 2013 at 4:13 PM, Tony Mullins tonymullins...@gmail.com wrote: Please any help on how to get the value of the 'freq' field in my custom SearchComponent? http://localhost:8080/solr/collection2/demoendpoint?q=spider&wt=xml&indent=true&fl=*,freq:termfreq%28product,%27spider%27%29

<doc>
  <str name="id">11</str>
  <str name="type">Video Games</str>
  <str name="format">xbox 360</str>
  <str name="product">The Amazing Spider-Man</str>
  <int name="popularity">11</int>
  <long name="_version_">1439994081345273856</long>
  <int name="freq">1</int>
</doc>

Here is my code:

DocList docs = rb.getResults().docList;
DocIterator iterator = docs.iterator();
int sumFreq = 0;
String id = null;
for (int i = 0; i < docs.size(); i++) {
    try {
        int docId = iterator.nextDoc();
        // searcher was obtained earlier via rb.req.getSearcher()
        // Document doc = searcher.doc(docId, fieldSet);
        Document doc = searcher.doc(docId);
        // only stored schema fields are visible on 'doc' here
    } catch (IOException e) {
        // handle the exception
    }
}

In the doc object I can see the schema fields like 'id', 'type', 'format' etc., but I cannot find the 'freq' field which I need. Is there any way to get the FunctionQuery fields in the doc object? Thanks, Tony

On Mon, Jul 15, 2013 at 1:16 PM, Tony Mullins tonymullins...@gmail.com wrote: ...
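A minimal sketch of Patanachai's suggestion, for illustration only (the class and response key names below are made up, and it is untested): if the request is sent with q=termfreq(product,'spider') and fl=*,score, the function value comes back as each document's score, so a custom SearchComponent can read it from the DocList iterator instead of looking for a 'freq' stored field on the Document, and then append whatever it computes to the response:

import java.io.IOException;

import org.apache.solr.handler.component.ResponseBuilder;
import org.apache.solr.handler.component.SearchComponent;
import org.apache.solr.search.DocIterator;
import org.apache.solr.search.DocList;

public class TermFreqSumComponent extends SearchComponent {

  @Override
  public void prepare(ResponseBuilder rb) throws IOException {
    // nothing to prepare
  }

  @Override
  public void process(ResponseBuilder rb) throws IOException {
    if (rb.getResults() == null) return;
    DocList docs = rb.getResults().docList;
    if (docs == null || !docs.hasScores()) {
      return; // scores were not requested (fl must include 'score'), nothing to read
    }
    float sumFreq = 0f;
    DocIterator iterator = docs.iterator();
    while (iterator.hasNext()) {
      iterator.nextDoc();            // advances to the next internal Lucene doc id
      sumFreq += iterator.score();   // with q=termfreq(...), the score is the term frequency
    }
    // Append the computed value to the response so the client can read it.
    rb.rsp.add("sumFreq", sumFreq);
  }

  @Override
  public String getDescription() {
    return "Sums function-query scores over the current result page";
  }

  public String getSource() {
    return ""; // abstract in Solr 4.x's SolrInfoMBean, unused here
  }
}

The 'freq' pseudo-field itself only materializes when the response writer resolves fl, after the SearchComponents have run, which is why it never shows up on the Document objects fetched from the searcher.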