Re: using PositionIncrementAttribute to increment certain term positions to large values
Hi, answering my own question for the record: the experiments show that the described functionality is achievable with a TokenFilter implementation. The only caveat is that the Highlighter component stops working properly if the match position goes beyond the length of the text field. As for performance, no major delays compared to the original proximity search implementation have been noticed. Best, Dmitry Kan

On Wed, Dec 19, 2012 at 10:53 AM, Dmitry Kan solrexp...@gmail.com wrote: Dear list, We are currently evaluating proximity searches (term1 term2 ~slop) for a specific use case. In particular, each document contains artificial delimiter characters (one character between each pair of sentences in the text). Our goal is to hit the sentences individually for any proximity search and to avoid matches that cross sentence boundaries. We figured that by using PositionIncrementAttribute as a field in a descendant of the TokenFilter class, it is possible to set the position increment of each artificial character (which is a term in Lucene/Solr notation) to an arbitrarily large number. Thus any proximity search with a reasonably small slop value should automatically hit within the sentence boundaries. Does this sound like the right way to tackle the problem? Are there any performance costs involved? Thanks in advance for any input, Dmitry Kan
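For anyone reading along, the position-gap trick can be illustrated outside of Lucene. The sketch below is plain Python, not the Lucene TokenFilter/PositionIncrementAttribute API; the gap value, tokens, and delimiter are invented for the example. It mimics what the filter achieves: a delimiter token bumps the position counter by a large increment, so a proximity match with a small slop can never span a sentence boundary.

```python
# Sketch of the position-gap idea (illustrative Python, not Lucene):
# a sentence-delimiter token advances the position counter by a large
# increment, so small-slop proximity matches stay within one sentence.

SENTENCE_GAP = 10_000  # arbitrary large position increment for delimiters

def assign_positions(tokens, delimiter="|"):
    """Return {term: [positions]} with a big position gap at each delimiter."""
    positions = {}
    pos = -1
    for tok in tokens:
        if tok == delimiter:
            pos += SENTENCE_GAP          # the delimiter itself is not indexed
        else:
            pos += 1
            positions.setdefault(tok, []).append(pos)
    return positions

def within_slop(positions, t1, t2, slop):
    """True if some occurrence of t1 and t2 lie within `slop` positions."""
    return any(abs(p1 - p2) - 1 <= slop
               for p1 in positions.get(t1, [])
               for p2 in positions.get(t2, []))

# "foo bar" share a sentence; "bar baz" straddle the delimiter.
pos = assign_positions(["foo", "bar", "|", "baz"])
```

With this data, `within_slop(pos, "foo", "bar", 0)` holds, while `within_slop(pos, "bar", "baz", 5)` does not, which is exactly the cross-boundary behavior described above.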
Re: Which token filter can combine 2 terms into 1?
Hi, Have a look at TokenFilter. Extending it will give you access to the TokenStream. Regards, Dmitry Kan

On Fri, Dec 21, 2012 at 9:05 AM, Xi Shen davidshe...@gmail.com wrote: Hi, I am looking for a token filter that can combine 2 terms into 1. E.g. the input has been tokenized by whitespace: t1 t2 t2a t3. I want a filter that outputs: t1 t2t2a t3. I know it is a very special case, and I am thinking about developing a filter of my own, but I cannot figure out which API I should use to look for terms in a TokenStream. -- Regards, David Shen http://about.me/davidshen https://twitter.com/#!/davidshen84
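As a language-agnostic illustration of what such a combining filter does (this is plain Python, not the Lucene TokenFilter/TokenStream API; the merge rule and token names are invented for the example):

```python
# Sketch of a "combining" token filter: walk the token stream and, when a
# predicate says two adjacent tokens belong together, emit them merged.

def combine_adjacent(tokens, should_merge):
    """Yield tokens, merging each adjacent pair for which should_merge is true."""
    it = iter(tokens)
    prev = next(it, None)
    for tok in it:
        if should_merge(prev, tok):
            prev = prev + tok    # merged term, e.g. "t2" + "t2a" -> "t2t2a"
            continue
        yield prev
        prev = tok
    if prev is not None:
        yield prev

# Hypothetical merge rule: join when the second token starts with the first.
merged = list(combine_adjacent(["t1", "t2", "t2a", "t3"],
                               lambda a, b: b.startswith(a)))
```

With the rule above, the input `t1 t2 t2a t3` comes out as `t1 t2t2a t3`, matching the desired output in the question. In a real Lucene filter the same loop would live in `incrementToken()`, buffering the previous term attribute.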
search with spaces
Hi, I have a text field with the value O O Jaane Jaane. When I search with *q=Jaane Jaane* it gives results, but if I give *q=O O Jaane Jaane* it does not work. What could be the reason? Thanks, Sangeetha -- View this message in context: http://lucene.472066.n3.nabble.com/search-with-spaces-tp4029265.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: search with spaces
Which Analyzer is being used in the field that was indexed? Maybe you can use the Solr admin analysis page to see how your text is analyzed at index time. Thanks

On Thu, Dec 27, 2012 at 2:30 PM, Sangeetha sangeetha...@gmail.com wrote: Hi, I have a text field with value O O Jaane Jaane. When i search with *q=Jaane Jaane* it is giving the results. But if i give *q=O O Jaane Jaane* it is not working? What could be the reason? Thanks, Sangeetha -- View this message in context: http://lucene.472066.n3.nabble.com/search-with-spaces-tp4029265.html -- Chandan Tamrakar
solr + jetty deployment issue
Hi, I am having trouble getting Solr + Jetty to work. I am following all the instructions to the letter from http://wiki.apache.org/solr/SolrJetty. I also created a work folder, /opt/solr/work, and I am setting tmpdir to a new path in /etc/default/jetty. I have confirmed that tmpdir is set to the new path from the admin dashboard, under args. It works like a charm. But when I restart Jetty multiple times, after 3-4 such restarts it starts hanging. The admin pages just don't load, and my app fails to acquire a connection with Solr. What might I be missing? Should I rather be looking at my code to see whether I am committing correctly? Please let me know if you have faced a similar issue in the past and how to tackle it. Thank you. -- Best Regards, Sushrut
Re: Reindex ALL Solr CORES in one GO..
Thanks Gora, I can definitely trigger the full re-indexing using curl for multiple cores, although if I try to index many cores (more than 4-5) simultaneously, the re-indexing fails due to DB connection pool problems (Connection not available). Thus I need to schedule each indexing run once the previous one is over. Unfortunately, to track the status of indexing for a core one needs to keep pinging the server to check the completion status. Is there a way to get a response from Solr once the indexing is complete? How can I increase the connection pool size in Solr? Regards Anupam

On Wed, Dec 26, 2012 at 7:06 PM, Gora Mohanty g...@mimirtech.com wrote: On 26 December 2012 18:06, Anupam Bhattacharya anupam...@gmail.com wrote: Hello Everyone, Is it possible to schedule full reindexing of all Solr cores without going individually to the DIH screen of each core? One could quite easily write a wrapper around Solr's URLs for indexing. You could use a tool like curl, a simple shell script, or pretty much any programming language to do this. Regards, Gora -- Thanks & Regards Anupam Bhattacharya
Re: Reindex ALL Solr CORES in one GO..
Unfortunately, to track the status of indexing for a core one needs to keep pinging the server to check the completion status. Is there a way to get a response from Solr once the indexing is complete? Yes, it is possible: http://wiki.apache.org/solr/DataImportHandler#EventListeners
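If you do poll instead of using event listeners, the check amounts to reading the status field of the DIH status response (`/dataimport?command=status`). A minimal sketch in Python; the sample XML below is trimmed and illustrative (real responses carry more fields), and how you fetch it (curl, urllib, etc.) is up to you:

```python
# Sketch: decide whether a DataImportHandler run has finished by reading
# the "status" field of the command=status response. SAMPLE is a trimmed,
# illustrative response body, not a complete real one.
import xml.etree.ElementTree as ET

def dih_is_idle(response_xml):
    """True when the DIH status response reports 'idle' (import finished)."""
    root = ET.fromstring(response_xml)
    for node in root.iter("str"):
        if node.get("name") == "status":
            return node.text == "idle"
    return False  # no status field -> assume the import is still running

SAMPLE = """<response>
  <lst name="responseHeader"><int name="status">0</int></lst>
  <str name="status">idle</str>
</response>"""
```

A wrapper script could fire `command=full-import` on one core, loop on `dih_is_idle` with a short sleep, and only then start the next core, which sidesteps the DB connection pool exhaustion described above.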
Re: Dynamic collections in SolrCloud for log indexing
Added https://issues.apache.org/jira/browse/SOLR-4237 Otis -- Performance Monitoring - http://sematext.com/spm/index.html Search Analytics - http://sematext.com/search-analytics/index.html

On Tue, Dec 25, 2012 at 9:13 PM, Mark Miller markrmil...@gmail.com wrote: I've been thinking about aliases for a while as well. They seem very handy and fairly easy to implement. So far there have just always been higher-priority things (need to finish collection API responses this week…) but this is something I'd definitely help work on. - Mark

On Dec 25, 2012, at 1:49 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, Right, this is not really about routing in the ElasticSearch sense. What's handy for indexing logs are index aliases, which I thought I had added to JIRA a while back, but it looks like I have not. Index aliases would let you keep a last 7 days alias fixed while underneath you push and pop an index every day, without the client app having to adjust. Otis -- Performance Monitoring - http://sematext.com/spm/index.html Search Analytics - http://sematext.com/search-analytics/index.html

On Mon, Dec 24, 2012 at 4:30 AM, Per Steffensen st...@designware.dk wrote: I believe it is a misunderstanding to use custom routing (or sharding, as Erick calls it) for this kind of stuff. Custom routing is nice if you want to control which slice/shard under a collection a specific document goes to - mainly to be able to control that two (or more) documents are indexed on the same slice/shard, but also just to control on which slice/shard a specific document is indexed. Knowing/controlling this kind of stuff can be used for a lot of nice purposes. But you don't want to move slices/shards around among collections or delete/add slices from/to a collection - unless it's for elasticity reasons. I think you should fill a collection every week/month and just keep those collections as they are. Instead of ending up with one big historic collection containing many slices/shards/cores (one for each historic week/month), you will end up with many historic collections (one for each historic week/month). Searching historic data, you will have to cross-search those historic collections, but that is no problem at all. If Solr Cloud is made as it is supposed to be made (and I believe it is), it shouldn't require more resources or be harder in any way to cross-search X slices across many collections than it is to cross-search X slices under the same collection. Besides that, see my answer for the topic Will SolrCloud always slice by ID hash? a few days back. Regards, Per Steffensen

On 12/24/12 1:07 AM, Erick Erickson wrote: I think this is one of the primary use-cases for custom sharding. Solr 4.0 doesn't really lend itself to this scenario, but I _believe_ that the patch for custom sharding has been committed... That said, I'm not quite sure how you drop off the old shard if you don't need to keep old data. I'd guess it's possible, but haven't implemented anything like that myself. FWIW, Erick

On Fri, Dec 21, 2012 at 12:17 PM, Upayavira u...@odoko.co.uk wrote: I'm working on a system for indexing logs. We're probably looking at filling one core every month. We'll maintain a short-term index containing the last 7 days - that one is easy to handle. For the longer-term stuff, we'd like to maintain a collection that will query across all the historic data, but that means every month we need to add another core to an existing collection, which as I understand it is not possible in 4.0. How do people handle this sort of situation where you have rolling new content arriving? I'm sure I've heard of people using SolrCloud for this sort of thing. Given it is logs, distributed IDF has no real bearing. Upayavira
Re: Which token filter can combine 2 terms into 1?
Hi Guys, I also worked on a CombiningTokenFilter, see: https://issues.apache.org/jira/browse/LUCENE-3413 Patch has been up and available for a while. HTH! Cheers, Chris On 12/27/12 12:26 AM, Dmitry Kan solrexp...@gmail.com wrote: Hi, Have a look onto TokenFilter. Extending it will give you access to a TokenStream. Regards, Dmitry Kan On Fri, Dec 21, 2012 at 9:05 AM, Xi Shen davidshe...@gmail.com wrote: Hi, I am looking for a token filter that can combine 2 terms into 1? E.g. the input has been tokenized by white space: t1 t2 t2a t3 I want a filter that output: t1 t2t2a t3 I know it is a very special case, and I am thinking about develop a filter of my own. But I cannot figure out which API I should use to look for terms in a Token Stream. -- Regards, David Shen http://about.me/davidshen https://twitter.com/#!/davidshen84
Re: Converting fq params to Filter object
Hi Lance, Thanks for the response. I didn't quite understand how to issue the queries from DirectSpellChecker with the fq params applied like you were suggesting - could you point me to the API that can be used for this? Also, we haven't benchmarked the DirectSpellChecker against the IndexBasedSpellChecker. I considered issuing one large OR query with all corrections but that doesn't ensure that *every* correction would return some hits with the fq params applied, it only tells us that some correction returned hits so this isn't restrictive enough for us. And ANDing the corrections together becomes too restrictive since it requires that *all* corrections existed in the same documents instead of checking that they individually exist in some docs (which satisfy the filter queries of course). Thanks, Nalini On Wed, Dec 26, 2012 at 9:32 PM, Lance Norskog goks...@gmail.com wrote: A Solr facet query does a boolean query, caches the Lucene facet data structure, and uses it as a Lucene filter. After that until you do a full commit, using the same fq=string (you must match the string exactly) fetches the cached data structure and uses it again as a Lucene filter. Have you benchmarked the DirectSpellChecker against IndexBasedSpellChecker? If you use the fq= filter query as the spellcheck.q= query it should use the cached filter. Also, since you are checking all words against the same filter query, can you just do one large OR query with all of the words? On 12/26/2012 03:10 PM, Nalini Kartha wrote: Hi Otis, Sorry, let me be more specific. The end goal is for the DirectSpellChecker to make sure that the corrections it is returning will return some results taking into account the fq params included in the original query. 
This is a follow-up question to another question I had posted earlier - http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201212.mbox/%3ccamqozyftgiwyrbvwsdf0hfz1sznkq9gnbjfdb_obnelsmvr...@mail.gmail.com%3E Initially, the way I was thinking of implementing this was to call one of the SolrIndexSearcher.getDocSet() methods for every correction, passing in the correction as the Query and a DocSet created from the fq queries. But I didn't think that calling a SolrIndexSearcher method in Lucene code (DirectSpellChecker) was a good idea. So I started looking at which method on IndexSearcher would accomplish this. That's where I'm stuck, trying to figure out how to convert the fq params into a Filter object. Does this approach make sense? Also, I realize that this implementation is probably non-performant, but I wanted to give it a try and measure how it does. Any advice on what the perf overhead of issuing such queries for, say, 50 corrections would be? Note that the filter from the fq params is the same for every query - would that be cached and help speed things up? Thanks, Nalini

On Wed, Dec 26, 2012 at 3:34 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, The fq *is* for filtering. What is your end goal, what are you trying to achieve? Otis Solr & ElasticSearch Support http://sematext.com/ On Dec 26, 2012 11:22 AM, Nalini Kartha nalinikar...@gmail.com wrote: Hi, I'm trying to figure out how to convert the fq params that are being passed to Solr into something that can be used to filter the results of a query that's being issued against the Lucene IndexSearcher (I'm modifying some Lucene code to issue the query so calling through to one of the SolrIndexSearcher methods would be ugly). Looks like one of the IndexSearcher.search(Query query, Filter filter, ...)
methods would do what I want but I'm wondering if there's any easy way of converting the fq params into a Filter? Or is there a better way of doing all of this? Thanks, Nalini
Re: Converting fq params to Filter object
I think the answer is yes, that there's a better way of doing all of this. But I'm not yet sure what this all entails in your situation. What are you overriding with the Lucene searches? I imagine Solr has the flexibility to handle what you're trying to do without overriding anything core in SolrIndexSearcher. Generally, the way to get a custom filter in place is to create a custom query parser and use that for your fq parameter, like fq={!myparser param1='some value'}possible+expression+if+needed, so maybe that helps? Tell us more about what you're doing specifically, and maybe we can guide you to a more elegant way to plug in any custom logic you want. Erik

On Dec 26, 2012, at 11:21 , Nalini Kartha wrote: Hi, I'm trying to figure out how to convert the fq params that are being passed to Solr into something that can be used to filter the results of a query that's being issued against the Lucene IndexSearcher (I'm modifying some Lucene code to issue the query so calling through to one of the SolrIndexSearcher methods would be ugly). Looks like one of the IndexSearcher.search(Query query, Filter filter, ...) methods would do what I want but I'm wondering if there's any easy way of converting the fq params into a Filter? Or is there a better way of doing all of this? Thanks, Nalini
Re: Converting fq params to Filter object
Hi Erik, Sorry, I think I wasn't very clear in explaining what we need to do. We don't really need to do any complicated overriding; we just want to change the DirectSpellChecker to issue a query for every correction it finds *with the fq params from the original query taken into account*, so that we can check whether the correction would actually result in some hits. I was thinking of implementing this using the IndexSearcher.search(Query query, Filter filter, int n) method, where 'query' is a regular TermQuery (the term is the correction) and 'filter' would represent the fq params. What I'm not sure about is how to convert the fq params from Solr into a Filter object, and whether this is something we need to build ourselves or if there's an existing API for this. Also, I'm new to this code, so I'm not sure if I'm approaching this the wrong way. Any advice/pointers are much appreciated. Thanks, Nalini

On Thu, Dec 27, 2012 at 12:53 PM, Erik Hatcher erik.hatc...@gmail.com wrote: I think the answer is yes, that there's a better way to doing all of this. But I'm not yet sure what this all entails in your situation. What are you overriding with the Lucene searches? I imagine Solr has the flexibility to handle what you're trying to do without overriding anything core in SolrIndexSearcher. Generally, the way to get a custom filter in place is to create a custom query parser and use that for your fq parameter, like fq={!myparser param1='some value'}possible+expression+if+needed, so maybe that helps? Tell us more about what you're doing specifically, and maybe we can guide you to a more elegant way to plug in any custom logic you want.
Erik On Dec 26, 2012, at 11:21 , Nalini Kartha wrote: Hi, I'm trying to figure out how to convert the fq params that are being passed to Solr into something that can be used to filter the results of a query that's being issued against the Lucene IndexSearcher (I'm modifying some Lucene code to issue the query so calling through to one of the SolrIndexSearcher methods would be ugly). Looks like one of the IndexSearcher.search(Query query, Filter filter, ...) methods would do what I want but I'm wondering if there's any easy way of converting the fq params into a Filter? Or is there a better way of doing all of this? Thanks, Nalini
Re: Converting fq params to Filter object
Apologies for misunderstanding. Does what you're trying to do already work using the maxCollationTries feature of the spellcheck component (http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.maxCollationTries)? It even looks like it passes through the fq's, so that the hit count in the extended results is inclusive of the filters. Maybe I'm missing something though, sorry. Erik

On Dec 27, 2012, at 14:09 , Nalini Kartha wrote: Hi Eric, Sorry, I think I wasn't very clear in explaining what we need to do. We don't really need to do any complicated overriding, just want to change the DirectSpellChecker to issue a query for every correction it finds *with fq params from the original query taken into account* so that we can check if the correction would actually result in some hits. I was thinking of implementing this using the IndexSearcher.search(Query query, Filter filter, int n) method where 'query' is a regular TermQuery (the term is the correction) and 'filter' would represent the fq params. What I'm not sure about is how to convert the fq params from Solr into a Filter object and whether this is something we need to build ourselves or if there's an existing API for this. Also, I'm new to this code so not sure if I'm approaching this the wrong way. Any advice/pointers are much appreciated. Thanks, Nalini

On Thu, Dec 27, 2012 at 12:53 PM, Erik Hatcher erik.hatc...@gmail.com wrote: I think the answer is yes, that there's a better way to doing all of this. But I'm not yet sure what this all entails in your situation. What are you overriding with the Lucene searches? I imagine Solr has the flexibility to handle what you're trying to do without overriding anything core in SolrIndexSearcher. Generally, the way to get a custom filter in place is to create a custom query parser and use that for your fq parameter, like fq={!myparser param1='some value'}possible+expression+if+needed, so maybe that helps?
Tell us more about what you're doing specifically, and maybe we can guide you to a more elegant way to plug in any custom logic you want. Erik On Dec 26, 2012, at 11:21 , Nalini Kartha wrote: Hi, I'm trying to figure out how to convert the fq params that are being passed to Solr into something that can be used to filter the results of a query that's being issued against the Lucene IndexSearcher (I'm modifying some Lucene code to issue the query so calling through to one of the SolrIndexSearcher methods would be ugly). Looks like one of the IndexSearcher.search(Query query, Filter filter, ...) methods would do what I want but I'm wondering if there's any easy way of converting the fq params into a Filter? Or is there a better way of doing all of this? Thanks, Nalini
RE: Converting fq params to Filter object
Nalini, You could take the code from SpellCheckCollator#collate and have it issue a test query for each word individually instead of for each collation. This would do exactly what you want. See http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x/solr/core/src/java/org/apache/solr/spelling/SpellCheckCollator.java If you are concerned this isn't low-level enough and that performance would suffer, then see https://issues.apache.org/jira/browse/SOLR-3240 , which has a patch that uses a collector that quits after finding one document. This makes each test query faster at the expense of not getting exact hit-counts. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Nalini Kartha [mailto:nalinikar...@gmail.com] Sent: Thursday, December 27, 2012 1:09 PM To: solr-user@lucene.apache.org Subject: Re: Converting fq params to Filter object Hi Eric, Sorry, I think I wasn't very clear in explaining what we need to do. We don't really need to do any complicated overriding, just want to change the DirectSpellChecker to issue a query for every correction it finds *with fq params from the original query taken into account* so that we can check if the correction would actually result in some hits. I was thinking of implementing this using the IndexSearcher.search(Query query, Filter filter, int n) method where 'query' is a regular TermQuery (the term is the correction) and 'filter' would represent the fq params. What I'm not sure about is how to convert the fq params from Solr into a Filter object and whether this is something we need to build ourselves or if there's an existing API for this. Also, I'm new to this code so not sure if I'm approaching this the wrong way. Any advice/pointers are much appreciated. Thanks, Nalini On Thu, Dec 27, 2012 at 12:53 PM, Erik Hatcher erik.hatc...@gmail.comwrote: I think the answer is yes, that there's a better way to doing all of this. But I'm not yet sure what this all entails in your situation. 
What are you overriding with the Lucene searches? I imagine Solr has the flexibility to handle what you're trying to do without overriding anything core in SolrIndexSearcher. Generally, the way to get a custom filter in place is to create a custom query parser and use that for your fq parameter, like fq={!myparser param1='some value'}possible+expression+if+needed, so maybe that helps? Tell us more about what you're doing specifically, and maybe we can guide you to a more elegant way to plug in any custom logic you want. Erik On Dec 26, 2012, at 11:21 , Nalini Kartha wrote: Hi, I'm trying to figure out how to convert the fq params that are being passed to Solr into something that can be used to filter the results of a query that's being issued against the Lucene IndexSearcher (I'm modifying some Lucene code to issue the query so calling through to one of the SolrIndexSearcher methods would be ugly). Looks like one of the IndexSearcher.search(Query query, Filter filter, ...) methods would do what I want but I'm wondering if there's any easy way of converting the fq params into a Filter? Or is there a better way of doing all of this? Thanks, Nalini
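The per-correction test query James describes boils down to asking, for each correction, whether its posting list intersects the filter's doc set. A toy model of that check (plain Python with invented data, not Lucene structures or the SpellCheckCollator API):

```python
# Model of the per-correction test query: a correction survives only if
# at least one document containing it also satisfies the filter (fq).
# Postings and the filter doc set are toy data, not Lucene structures.

def corrections_with_hits(corrections, postings, filter_docs):
    """Keep corrections whose posting list intersects the filter doc set."""
    return [c for c in corrections
            if not postings.get(c, set()).isdisjoint(filter_docs)]

postings = {
    "solr": {1, 2, 3},
    "sold": {7},
    "sole": {2, 9},
}
filter_docs = {2, 3, 4}  # docs matching the fq params
good = corrections_with_hits(["solr", "sold", "sole"], postings, filter_docs)
# "sold" is dropped: its only doc (7) fails the filter.
```

Note that `isdisjoint` stops at the first common element, which parallels the SOLR-3240 optimization of a collector that quits after finding one matching document instead of computing exact hit counts.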
Re: Converting fq params to Filter object
Hi James, Yup, that was what I tried to do initially, but it seems like calling through to those Solr methods from DirectSpellChecker was not a good idea - am I wrong? And like you mentioned, this seemed like it wasn't low-level enough. Erik: Unfortunately, the collate functionality does not work for our use case since the queries we're correcting are default OR. Here's the original thread about this - http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201212.mbox/%3ccamqozyftgiwyrbvwsdf0hfz1sznkq9gnbjfdb_obnelsmvr...@mail.gmail.com%3E Thanks, Nalini On Thu, Dec 27, 2012 at 2:46 PM, Dyer, James james.d...@ingramcontent.com wrote: https://issues.apache.org/jira/browse/SOLR-3240
RE: Converting fq params to Filter object
Nalini, Assuming that you're using Solr, the hook into the collate functionality is in SpellCheckComponent#addCollationsToResponse . To do what you want, you would have to modify the call to SpellCheckCollator to issue test queries against the individual words instead of the collations. See http://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x/solr/core/src/java/org/apache/solr/handler/component/SpellCheckComponent.java Of course if you're using Lucene directly and not Solr, then you would want to build a series of queries that each query one word with the filters applied. DirectSpellChecker#suggestSimilar returns an array of SuggestWord instances that contain the individual words you would want to try. To optimize this, you can use the same approach as in SOLR-3240, implementing a Collector that only looks for 1 document then quits. James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Nalini Kartha [mailto:nalinikar...@gmail.com] Sent: Thursday, December 27, 2012 2:31 PM To: solr-user@lucene.apache.org Subject: Re: Converting fq params to Filter object Hi James, Yup, that was what I tried to do initially but it seems like calling through to those Solr methods from DirectSpellChecker was not a good idea - am I wrong? And like you mentioned, this seemed like it wasn't low-level enough. Eric: Unfortunately the collate functionality does not work for our use case since the queries we're correcting are default OR. Here's the original thread about this - http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201212.mbox/%3ccamqozyftgiwyrbvwsdf0hfz1sznkq9gnbjfdb_obnelsmvr...@mail.gmail.com%3E Thanks, Nalini On Thu, Dec 27, 2012 at 2:46 PM, Dyer, James james.d...@ingramcontent.comwrote: https://issues.apache.org/jira/browse/SOLR-3240
Re: search with spaces
That's debugQuery=true or debug=query. -- Jack Krupansky -Original Message- From: Otis Gospodnetic Sent: Thursday, December 27, 2012 10:56 AM To: solr-user@lucene.apache.org Subject: Re: search with spaces Hi, Add debugQuery=query to your search requests. That will point you in the right direction. Otis -- Performance Monitoring - http://sematext.com/spm/index.html Search Analytics - http://sematext.com/search-analytics/index.html On Thu, Dec 27, 2012 at 3:45 AM, Sangeetha sangeetha...@gmail.com wrote: Hi, I have a text field with value O O Jaane Jaane. When i search with *q=Jaane Jaane* it is giving the results. But if i give *q=O O Jaane Jaane* it is not working? What could be the reason? Thanks, Sangeetha -- View this message in context: http://lucene.472066.n3.nabble.com/search-with-spaces-tp4029265.html Sent from the Solr - User mailing list archive at Nabble.com.
Frequent OOM - (Unknown source in logs).
Hello, I am seeing frequent OOMs for the past 2 days on a SolrCloud cluster (Solr 4.0 with a patch from SOLR-2592) setup (3 shards, each shard with 2 instances; each instance is running CentOS with 30GB memory and 500GB disk space), with a separate ZooKeeper ensemble of 3. Here is the stacktrace: http://pastebin.com/cV5DxD4N I also saw a JIRA issue which looks similar, the difference being that in the stacktrace I get, I cannot see which code is doing the expandCapacity: /java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)/ whereas the stacktrace mentioned in that issue (https://issues.apache.org/jira/browse/SOLR-3881) is /at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:100)/ Has anyone seen this issue before? Any fixes for this? -- View this message in context: http://lucene.472066.n3.nabble.com/Frequent-OOM-Unknown-source-in-logs-tp4029361.html
old index not cleaned up on the slave
Hi, I'm using master/slave replication on Solr 4.0. Replication runs successfully, but the old index is not cleaned up. Is that a bug or not? My slave index directory is below:

$ ls -l solr_kr/krg01/data/index/
total 23472512
-rw-r--r--. 1 tomcat tomcat  563722625 Dec 24 21:48 _15.fdt
-rw-r--r--. 1 tomcat tomcat    4855210 Dec 24 21:48 _15.fdx
-rw-r--r--. 1 tomcat tomcat       4155 Dec 24 22:01 _15.fnm
-rw-r--r--. 1 tomcat tomcat 3367203143 Dec 24 22:01 _15_Lucene40_0.frq
-rw-r--r--. 1 tomcat tomcat 6951612380 Dec 24 22:01 _15_Lucene40_0.prx
-rw-r--r--. 1 tomcat tomcat 1096591353 Dec 24 22:01 _15_Lucene40_0.tim
-rw-r--r--. 1 tomcat tomcat   26026916 Dec 24 22:01 _15_Lucene40_0.tip
-rw-r--r--. 1 tomcat tomcat        388 Dec 24 22:01 _15.si
-rw-r--r--. 1 tomcat tomcat         98 Nov 30 13:43 segments_3
-rw-r--r--. 1 tomcat tomcat         99 Dec 24 22:01 segments_4
-rw-r--r--. 1 tomcat tomcat         20 Aug 12 07:21 segments.gen
-rw-r--r--. 1 tomcat tomcat  563742324 Nov 30 13:32 _t.fdt
-rw-r--r--. 1 tomcat tomcat    4855210 Nov 30 13:32 _t.fdx
-rw-r--r--. 1 tomcat tomcat       4155 Nov 30 13:43 _t.fnm
-rw-r--r--. 1 tomcat tomcat 3382846438 Nov 30 13:43 _t_Lucene40_0.frq
-rw-r--r--. 1 tomcat tomcat 6951620034 Nov 30 13:43 _t_Lucene40_0.prx
-rw-r--r--. 1 tomcat tomcat 1096654275 Nov 30 13:43 _t_Lucene40_0.tim
-rw-r--r--. 1 tomcat tomcat   26027222 Nov 30 13:43 _t_Lucene40_0.tip
-rw-r--r--. 1 tomcat tomcat        379 Nov 30 13:43 _t.si

-- View this message in context: http://lucene.472066.n3.nabble.com/old-index-not-cleaned-up-on-the-slave-tp4029370.html
RE: solr + jetty deployment issue
Do you see any errors coming in on the console, stderr? I start Solr this way and redirect stdout and stderr to log files; when I have a problem, stderr generally has the answer:

java \
 -server \
 -Djetty.port=8080 \
 -Dsolr.solr.home=/opt/solr \
 -Dsolr.data.dir=/mnt/solr_data \
 -jar /opt/solr/start.jar > /opt/solr/logs/stdout.log 2> /opt/solr/logs/stderr.log

-Original Message- From: Sushrut Bidwai [mailto:bidwai.sush...@gmail.com] Sent: Thursday, December 27, 2012 7:40 PM To: solr-user@lucene.apache.org Subject: solr + jetty deployment issue Hi, I am having trouble with getting solr + jetty to work. I am following all instructions to the letter from - http://wiki.apache.org/solr/SolrJetty. I also created a work folder - /opt/solr/work. I am also setting tmpdir to a new path in /etc/default/jetty . I am confirming the tmpdir is set to the new path from admin dashboard, under args. It works like a charm. But when I restart jetty multiple times, after 3/4 such restarts it starts hanging. Admin pages just dont load and my app fails to acquire a connection with solr. What I might be missing? Should I be rather looking at my code and see if I am not committing correctly? Please let me know if you have faced similar issue in the past and how to tackle it. Thank you. -- Best Regards, Sushrut
MoreLikeThis only returns 1 result
I'm doing a query like this for MoreLikeThis, sending it a document ID. But the only result I ever get back is the document ID I sent it. The debug response is below. If I read it correctly, it's taking id:1004401713626 as the term (not the document ID) and only finding it once. But I want it to match the document with ID 1004401713626 of course. I tried q=id[1004410713626], but that generates an exception:

Caused by: org.apache.lucene.queryParser.ParseException: Cannot parse 'id:[1004401713626]': Encountered "]" at line 1, column 17. Was expecting one of: "TO" ... <RANGEIN_QUOTED> ... <RANGEIN_GOOP> ...

This must be easy, but the documentation is minimal. My Query:

http://107.23.102.164:8080/solr/select/?qt=mlt&q=id:[1004401713626]&rows=10&mlt.fl=item_name,item_brand,short_description,long_description,catalog_names,categories,keywords,attributes,facetime&mlt.mintf=2&mlt.mindf=5&mlt.maxqt=100&mlt.boost=false&debugQuery=true

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
    <lst name="params">
      <str name="mlt.mindf">5</str>
      <str name="mlt.fl">item_name,item_brand,short_description,long_description,catalog_names,categories,keywords,attributes,facetime</str>
      <str name="mlt.boost">false</str>
      <str name="debugQuery">true</str>
      <str name="q">id:1004401713626</str>
      <str name="mlt.mintf">2</str>
      <str name="mlt.maxqt">100</str>
      <str name="qt">mlt</str>
      <str name="rows">10</str>
    </lst>
  </lst>
  <result name="response" numFound="1" start="0">
    <doc>
      <long name="facetime">0</long>
      <str name="id">1004401713626</str>
    </doc>
  </result>
  <lst name="debug">
    <str name="rawquerystring">id:1004401713626</str>
    <str name="querystring">id:1004401713626</str>
    <str name="parsedquery">id:1004401713626</str>
    <str name="parsedquery_toString">id:1004401713626</str>
    <lst name="explain">
      <str name="1004401713626">
18.29481 = (MATCH) fieldWeight(id:1004401713626 in 2843152), product of:
  1.0 = tf(termFreq(id:1004401713626)=1)
  18.29481 = idf(docFreq=1, maxDocs=64873893)
  1.0 = fieldNorm(field=id, doc=2843152)
      </str>
    </lst>
Re: solr + jetty deployment issue
Hi David, From what I see in the log and threaddump, it seems that the getSearcher method in SolrCore is not able to acquire the required lock, and because of that it's blocking startup of the server. Here is the threaddump - http://pastebin.com/GPnAzF1q . On Fri, Dec 28, 2012 at 8:01 AM, David Parks davidpark...@yahoo.com wrote: Do you see any errors coming in on the console, stderr? I start solr this way and redirect the stdout and stderr to log files, when I have a problem stderr generally has the answer: java \ -server \ -Djetty.port=8080 \ -Dsolr.solr.home=/opt/solr \ -Dsolr.data.dir=/mnt/solr_data \ -jar /opt/solr/start.jar > /opt/solr/logs/stdout.log 2> /opt/solr/logs/stderr.log -Original Message- From: Sushrut Bidwai [mailto:bidwai.sush...@gmail.com] Sent: Thursday, December 27, 2012 7:40 PM To: solr-user@lucene.apache.org Subject: solr + jetty deployment issue Hi, I am having trouble with getting solr + jetty to work. I am following all instructions to the letter from - http://wiki.apache.org/solr/SolrJetty. I also created a work folder - /opt/solr/work. I am also setting tmpdir to a new path in /etc/default/jetty . I am confirming the tmpdir is set to the new path from admin dashboard, under args. It works like a charm. But when I restart jetty multiple times, after 3/4 such restarts it starts hanging. Admin pages just dont load and my app fails to acquire a connection with solr. What I might be missing? Should I be rather looking at my code and see if I am not committing correctly? Please let me know if you have faced similar issue in the past and how to tackle it. Thank you. -- Best Regards, Sushrut -- Best Regards, Sushrut http://sushrutbidwai.com
Re: MoreLikeThis only returns 1 result
Sounds like it is simply dispatching to the normal search request handler. Although you specified qt=mlt, make sure you enable the legacy select handler dispatching in solrconfig.xml. Change: <requestDispatcher handleSelect="false"> to <requestDispatcher handleSelect="true"> Or, simply address the MLT handler directly: http://107.23.102.164:8080/solr/mlt?q=... Or, use the MoreLikeThis search component: http://localhost:8983/solr/select?q=...&mlt=true&... See: http://wiki.apache.org/solr/MoreLikeThis -- Jack Krupansky -Original Message- From: David Parks Sent: Thursday, December 27, 2012 9:59 PM To: solr-user@lucene.apache.org Subject: MoreLikeThis only returns 1 result I'm doing a query like this for MoreLikeThis, sending it a document ID. But the only result I ever get back is the document ID I sent it. The debug response is below. If I read it correctly, it's taking id:1004401713626 as the term (not the document ID) and only finding it once. But I want it to match the document with ID 1004401713626 of course. I tried q=id:[1004410713626], but that generates an exception: Caused by: org.apache.lucene.queryParser.ParseException: Cannot parse 'id:[1004401713626]': Encountered "]" at line 1, column 17. Was expecting one of: TO ... RANGEIN_QUOTED ... RANGEIN_GOOP ... This must be easy, but the documentation is minimal.
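Mangled separators like those in the pasted URL are easy to avoid by letting a URL library build the query string. A sketch in Python, reusing the host and parameters from the message (illustrative only, not a full Solr client):

```python
from urllib.parse import urlencode

# Rebuild the MLT query from the thread programmatically; urlencode
# supplies the '&' separators and percent-encodes ':' in the q value.
params = {
    'qt': 'mlt',
    'q': 'id:1004401713626',  # plain term query: no [] brackets needed
    'rows': 10,
    'mlt.fl': 'item_name,item_brand,short_description,long_description',
    'mlt.mintf': 2,
    'mlt.mindf': 5,
    'mlt.maxqt': 100,
    'mlt.boost': 'false',
    'debugQuery': 'true',
}
url = 'http://107.23.102.164:8080/solr/select/?' + urlencode(params)
print(url)
```

The same dict can be pointed at the dedicated /mlt handler path instead of /select once handleSelect dispatching is out of the picture.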
RE: MoreLikeThis only returns 1 result
Ok, that worked, I had the /mlt request handler misconfigured (forgot a '/'). It's working now. Thanks! -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Friday, December 28, 2012 11:38 AM To: solr-user@lucene.apache.org Subject: Re: MoreLikeThis only returns 1 result Sounds like it is simply dispatching to the normal search request handler. Although you specified qt=mlt, make sure you enable the legacy select handler dispatching in solrconfig.xml. Change: <requestDispatcher handleSelect="false"> to <requestDispatcher handleSelect="true"> Or, simply address the MLT handler directly: http://107.23.102.164:8080/solr/mlt?q=... Or, use the MoreLikeThis search component: http://localhost:8983/solr/select?q=...&mlt=true&... See: http://wiki.apache.org/solr/MoreLikeThis -- Jack Krupansky -Original Message- From: David Parks Sent: Thursday, December 27, 2012 9:59 PM To: solr-user@lucene.apache.org Subject: MoreLikeThis only returns 1 result I'm doing a query like this for MoreLikeThis, sending it a document ID. But the only result I ever get back is the document ID I sent it. The debug response is below. If I read it correctly, it's taking id:1004401713626 as the term (not the document ID) and only finding it once. But I want it to match the document with ID 1004401713626 of course. I tried q=id:[1004410713626], but that generates an exception: Caused by: org.apache.lucene.queryParser.ParseException: Cannot parse 'id:[1004401713626]': Encountered "]" at line 1, column 17. Was expecting one of: TO ... RANGEIN_QUOTED ... RANGEIN_GOOP ... This must be easy, but the documentation is minimal.
RE: MoreLikeThis supporting multiple document IDs as input?
I'm somewhat new to Solr (it's running, I've been through the books, but I'm no master). What I hear you say is that MLT *can* accept, say, 5 documents and provide results, but the results would essentially be the same as running the query 5 times, once for each document? If that's the case, I might accept it. I would just have to merge them together at the end (perhaps I'd take the top 2 of each result, for example). Being somewhat new, I'm a little confused by the difference between a Search Component and a Handler. I've got the /mlt handler working and I'm using that. But how's that different from a Search Component? Is that referring to the default /solr/select?q=... style query? And if what I said about multiple documents above is correct, what's the syntax to try that out? Thanks very much for the great help! Dave -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Wednesday, December 26, 2012 12:07 PM To: solr-user@lucene.apache.org Subject: Re: MoreLikeThis supporting multiple document IDs as input? MLT has both a request handler and a search component. The MLT handler returns similar documents only for the first document that the query matches. The MLT search component returns similar documents for each of the documents in the search results, but processes each search result base document one at a time and keeps its similar documents segregated by each of the base documents. It sounds like you wanted to merge the base search results and then find documents similar to that merged super-document. Is that what you were really seeking, as opposed to what the MLT component does? Unfortunately, you can't do that with the components as they are. You would have to manually merge the values from the base documents and then you could POST that text back to the MLT handler and find similar documents using the posted text rather than a query. Kind of messy, but in theory that should work.
-- Jack Krupansky -Original Message- From: David Parks Sent: Tuesday, December 25, 2012 5:04 AM To: solr-user@lucene.apache.org Subject: MoreLikeThis supporting multiple document IDs as input? I'm unclear on this point from the documentation. Is it possible to give Solr X # of document IDs and tell it that I want documents similar to those X documents? Example: - The user is browsing 5 different articles - I send Solr the IDs of these 5 articles so I can present the user other similar articles I see this example for sending it 1 document ID: http://localhost:8080/solr/select/?qt=mlt&q=id:[document id]&mlt.fl=[field1],[field2],[field3]&fl=id&rows=10 But can I send it 2+ document IDs as the query?
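Since MLT processes one base document at a time, the merge David describes — take the top 2 similar documents from each seed's MLT result and combine them — has to live client-side. A hedged sketch of that merge step (data shapes and function name are illustrative, not a Solr API):

```python
def merge_mlt_results(per_seed_results, top_per_seed=2):
    """Interleave the top-N similar doc IDs from each seed document,
    skipping duplicates while preserving per-seed ranking order."""
    seen, merged = set(), []
    for ranked in per_seed_results:
        for doc_id in ranked[:top_per_seed]:
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged

# One ranked list of similar-document IDs per browsed article:
print(merge_mlt_results([['a', 'b', 'c'], ['b', 'd'], ['e']]))
# -> ['a', 'b', 'd', 'e']
```

De-duplication matters here: documents similar to several of the user's articles would otherwise appear multiple times in the combined list.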
Re: solr + jetty deployment issue
Here is the latest threaddump, taken after setting up the latest nightly build version - apache-solr-4.1-2012-12-27_04-32-37 - http://pastebin.com/eum7CxX4 Kind of stuck with this for a few days now, so could use a little help. Here are more details on the issue -
1. Setting up jetty + solr using instructions - http://wiki.apache.org/solr/SolrJetty
2. Initial install with clean data dirs goes smoothly.
3. I can connect to the server and index 10K+ documents without any issues. I use 10 threads in my app to do so. Not experiencing any concurrency/deadlock issues.
4. When I stop my app and then restart jetty, after a few restarts I get the above-mentioned threaddump and startup of the server stays blocked forever.
5. If I delete the data dir and start again, the problem goes away. But it reappears on server restarts.
On Fri, Dec 28, 2012 at 9:03 AM, Sushrut Bidwai bidwai.sush...@gmail.com wrote: Hi David, From what I see in the log and threaddump it seems that getSearcher method in SolrCore is not able to acquire required lock and because of that its blocking startup of the server. Here is threaddump - http://pastebin.com/GPnAzF1q . On Fri, Dec 28, 2012 at 8:01 AM, David Parks davidpark...@yahoo.com wrote: Do you see any errors coming in on the console, stderr? I start solr this way and redirect the stdout and stderr to log files, when I have a problem stderr generally has the answer: java \ -server \ -Djetty.port=8080 \ -Dsolr.solr.home=/opt/solr \ -Dsolr.data.dir=/mnt/solr_data \ -jar /opt/solr/start.jar > /opt/solr/logs/stdout.log 2> /opt/solr/logs/stderr.log -Original Message- From: Sushrut Bidwai [mailto:bidwai.sush...@gmail.com] Sent: Thursday, December 27, 2012 7:40 PM To: solr-user@lucene.apache.org Subject: solr + jetty deployment issue Hi, I am having trouble with getting solr + jetty to work. I am following all instructions to the letter from - http://wiki.apache.org/solr/SolrJetty. I also created a work folder - /opt/solr/work. I am also setting tmpdir to a new path in /etc/default/jetty . I am confirming the tmpdir is set to the new path from admin dashboard, under args. It works like a charm. But when I restart jetty multiple times, after 3/4 such restarts it starts hanging. Admin pages just dont load and my app fails to acquire a connection with solr. What I might be missing? Should I be rather looking at my code and see if I am not committing correctly? Please let me know if you have faced similar issue in the past and how to tackle it. Thank you. -- Best Regards, Sushrut -- Best Regards, Sushrut http://sushrutbidwai.com -- Best Regards, Sushrut http://sushrutbidwai.com
RE: MoreLikeThis supporting multiple document IDs as input?
Hi Dave, Think of search components as a chain of Java classes that get executed during each search request. If you open solrconfig.xml you will see how they are defined and used. HTH Otis -- Solr & ElasticSearch Support http://sematext.com/ On Dec 28, 2012 12:06 AM, David Parks davidpark...@yahoo.com wrote: I'm somewhat new to Solr (it's running, I've been through the books, but I'm no master). What I hear you say is that MLT *can* accept, say 5, documents and provide results, but the results would essentially be the same as running the query 5 times for each document? If that's the case, I might accept it. I would just have to merge them together at the end (perhaps I'd take the top 2 of each result, for example). Being somewhat new I'm a little confused by the difference between a Search Component and a Handler. I've got the /mlt handler working and I'm using that. But how's that different from a Search Component? Is that referring to the default /solr/select?q=... style query? And if what I said about multiple documents above is correct, what's the syntax to try that out? Thanks very much for the great help! Dave -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Wednesday, December 26, 2012 12:07 PM To: solr-user@lucene.apache.org Subject: Re: MoreLikeThis supporting multiple document IDs as input? MLT has both a request handler and a search component. The MLT handler returns similar documents only for the first document that the query matches. The MLT search component returns similar documents for each of the documents in the search results, but processes each search result base document one at a time and keeps its similar documents segregated by each of the base documents. It sounds like you wanted to merge the base search results and then find documents similar to that merged super-document. Is that what you were really seeking, as opposed to what the MLT component does?
Unfortunately, you can't do that with the components as they are. You would have to manually merge the values from the base documents and then you could POST that text back to the MLT handler and find similar documents using the posted text rather than a query. Kind of messy, but in theory that should work. -- Jack Krupansky -Original Message- From: David Parks Sent: Tuesday, December 25, 2012 5:04 AM To: solr-user@lucene.apache.org Subject: MoreLikeThis supporting multiple document IDs as input? I'm unclear on this point from the documentation. Is it possible to give Solr X # of document IDs and tell it that I want documents similar to those X documents? Example: - The user is browsing 5 different articles - I send Solr the IDs of these 5 articles so I can present the user other similar articles I see this example for sending it 1 document ID: http://localhost:8080/solr/select/?qt=mlt&q=id:[document id]&mlt.fl=[field1],[field2],[field3]&fl=id&rows=10 But can I send it 2+ document IDs as the query?
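Jack's workaround — manually merge the base documents' field values and post that text back to the MLT handler — might be sketched as below. The field names, handler URL, and data are illustrative, and `stream.body` requires remote streaming to be enabled in solrconfig.xml, so treat this as an outline rather than a definitive recipe (the request is only constructed, never sent):

```python
from urllib.parse import urlencode

# Field values previously fetched for the user's browsed articles
# (illustrative data and field names, not from the thread's schema).
docs = [
    {'item_name': 'red kettle', 'short_description': 'steel stovetop kettle'},
    {'item_name': 'blue kettle', 'short_description': 'enamel kettle'},
]

# 1. Merge the base documents into one "super-document" text blob.
merged_text = ' '.join(value for doc in docs for value in doc.values())

# 2. Hand the merged text to the MLT handler as the content to match
#    against; stream.body needs remote streaming enabled in solrconfig.xml.
url = 'http://localhost:8080/solr/mlt?' + urlencode({
    'mlt.fl': 'item_name,short_description',
    'rows': 10,
    'stream.body': merged_text,
})
print(url.split('?')[0])
# -> http://localhost:8080/solr/mlt
```

As Jack says, this is messy: the merged blob skews term frequencies toward whichever base document has the most text, so results may lean toward that document's vocabulary.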
Re: solr + jetty deployment issue
If I comment out the /browse requesthandler from solrconfig.xml, the problem goes away. So the issue is definitely with the way I am configuring solrconfig.xml. I will debug it on my side. On Fri, Dec 28, 2012 at 11:55 AM, Sushrut Bidwai bidwai.sush...@gmail.com wrote: Here is latest threaddump taken after setting up latest nightly build version - apache-solr-4.1-2012-12-27_04-32-37 - http://pastebin.com/eum7CxX4 Kind of stuck with this from few days now, so can use little help. Here is more details on the issue - 1. Setting up jetty + solr using instructions - http://wiki.apache.org/solr/SolrJetty 2. Initial install with clean data dirs goes smoothly. 3. I can connect to server and index 10K+ documents with out any issues. I use 10 threads in my app to do so. Not experiencing any concurrency/deadlock issues. 4. When stop my app and then restart jetty, after few restarts - I get above mentioned threaddump and startup of server stays blocked forever. 5. If I delete data dir and start again, problem goes away. But reappears on server restarts. On Fri, Dec 28, 2012 at 9:03 AM, Sushrut Bidwai bidwai.sush...@gmail.com wrote: Hi David, From what I see in the log and threaddump it seems that getSearcher method in SolrCore is not able to acquire required lock and because of that its blocking startup of the server. Here is threaddump - http://pastebin.com/GPnAzF1q . On Fri, Dec 28, 2012 at 8:01 AM, David Parks davidpark...@yahoo.com wrote: Do you see any errors coming in on the console, stderr? I start solr this way and redirect the stdout and stderr to log files, when I have a problem stderr generally has the answer: java \ -server \ -Djetty.port=8080 \ -Dsolr.solr.home=/opt/solr \ -Dsolr.data.dir=/mnt/solr_data \ -jar /opt/solr/start.jar > /opt/solr/logs/stdout.log 2> /opt/solr/logs/stderr.log -Original Message- From: Sushrut Bidwai [mailto:bidwai.sush...@gmail.com] Sent: Thursday, December 27, 2012 7:40 PM To: solr-user@lucene.apache.org Subject: solr + jetty deployment issue Hi, I am having trouble with getting solr + jetty to work. I am following all instructions to the letter from - http://wiki.apache.org/solr/SolrJetty. I also created a work folder - /opt/solr/work. I am also setting tmpdir to a new path in /etc/default/jetty . I am confirming the tmpdir is set to the new path from admin dashboard, under args. It works like a charm. But when I restart jetty multiple times, after 3/4 such restarts it starts hanging. Admin pages just dont load and my app fails to acquire a connection with solr. What I might be missing? Should I be rather looking at my code and see if I am not committing correctly? Please let me know if you have faced similar issue in the past and how to tackle it. Thank you. -- Best Regards, Sushrut -- Best Regards, Sushrut http://sushrutbidwai.com -- Best Regards, Sushrut http://sushrutbidwai.com -- Best Regards, Sushrut http://sushrutbidwai.com
RE: MoreLikeThis supporting multiple document IDs as input?
So the Search Components are executed in series on _every_ request. I presume then that they look at the request parameters and decide what and whether to take action. So in the case of the MLT component this was said: The MLT search component returns similar documents for each of the documents in the search results, but processes each search result base document one at a time and keeps its similar documents segregated by each of the base documents. So what I think I understand is that the Query Component (presumably this guy: org.apache.solr.handler.component.QueryComponent) takes the input from the q parameter and returns a result (the q=id:123456 ensures that the Query Component will return just this one document). The MltComponent then looks at the result from the QueryComponent and generates its results. The part that is still confusing is understanding the difference between these two comments: - The MLT search component returns similar documents for each of the documents in the search results - The MLT handler returns similar documents only for the first document that the query matches. -Original Message- From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com] Sent: Friday, December 28, 2012 1:26 PM To: solr-user@lucene.apache.org Subject: RE: MoreLikeThis supporting multiple document IDs as input? Hi Dave, Think of search components as a chain of Java classes that get executed during each search request. If you open solrconfig.xml you will see how they are defined and used. HTH Otis -- Solr & ElasticSearch Support http://sematext.com/ On Dec 28, 2012 12:06 AM, David Parks davidpark...@yahoo.com wrote: I'm somewhat new to Solr (it's running, I've been through the books, but I'm no master). What I hear you say is that MLT *can* accept, say 5, documents and provide results, but the results would essentially be the same as running the query 5 times for each document? If that's the case, I might accept it.
I would just have to merge them together at the end (perhaps I'd take the top 2 of each result, for example). Being somewhat new I'm a little confused by the difference between a Search Component and a Handler. I've got the /mlt handler working and I'm using that. But how's that different from a Search Component? Is that referring to the default /solr/select?q=... style query? And if what I said about multiple documents above is correct, what's the syntax to try that out? Thanks very much for the great help! Dave -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent: Wednesday, December 26, 2012 12:07 PM To: solr-user@lucene.apache.org Subject: Re: MoreLikeThis supporting multiple document IDs as input? MLT has both a request handler and a search component. The MLT handler returns similar documents only for the first document that the query matches. The MLT search component returns similar documents for each of the documents in the search results, but processes each search result base document one at a time and keeps its similar documents segregated by each of the base documents. It sounds like you wanted to merge the base search results and then find documents similar to that merged super-document. Is that what you were really seeking, as opposed to what the MLT component does? Unfortunately, you can't do that with the components as they are. You would have to manually merge the values from the base documents and then you could POST that text back to the MLT handler and find similar documents using the posted text rather than a query. Kind of messy, but in theory that should work. -- Jack Krupansky -Original Message- From: David Parks Sent: Tuesday, December 25, 2012 5:04 AM To: solr-user@lucene.apache.org Subject: MoreLikeThis supporting multiple document IDs as input? I'm unclear on this point from the documentation. Is it possible to give Solr X # of document IDs and tell it that I want documents similar to those X documents? 
Example: - The user is browsing 5 different articles - I send Solr the IDs of these 5 articles so I can present the user other similar articles I see this example for sending it 1 document ID: http://localhost:8080/solr/select/?qt=mlt&q=id:[document id]&mlt.fl=[field1],[field2],[field3]&fl=id&rows=10 But can I send it 2+ document IDs as the query?
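David's reading — components executed in series, each inspecting the request parameters and contributing to a shared response — can be illustrated with a toy chain. This is purely schematic; the class names and behavior are mine, not Solr's actual internals:

```python
class QueryComponent:
    def process(self, params, response):
        # Resolve the q parameter into a base result set (toy version:
        # only understands "id:<value>" term queries).
        if params.get('q', '').startswith('id:'):
            response['results'] = [params['q'][3:]]

class MltComponent:
    def process(self, params, response):
        # Acts only when mlt=true: similar docs per base result, kept
        # segregated by base document, as described in the thread.
        if params.get('mlt') == 'true':
            response['moreLikeThis'] = {
                doc: ['similar-to-' + doc] for doc in response['results']}

def handle(params, chain):
    response = {'results': []}
    for component in chain:  # the chain runs in series on every request
        component.process(params, response)
    return response

out = handle({'q': 'id:123', 'mlt': 'true'}, [QueryComponent(), MltComponent()])
print(out)
# -> {'results': ['123'], 'moreLikeThis': {'123': ['similar-to-123']}}
```

The MLT request handler, by contrast, is a separate endpoint rather than a link in this chain, which is why it behaves differently from the component.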