Re: Indexing tweet and searching @keyword OR #keyword
I tried tweaking WordDelimiterFilterFactory, but it won't accept # or @ symbols and ignores them entirely. I need a solution, please suggest. On 4 August 2011 21:08, Jonathan Rochkind rochk...@jhu.edu wrote: It's the WordDelimiterFilterFactory in your filter chain that's removing the punctuation entirely from your index, I think. Read up on what the WordDelimiter filter does and what its settings are; decide how you want things to be tokenized in your index to get the behavior you want; either get WordDelimiter to do it that way by passing it different arguments, or stop using WordDelimiter; come back with any questions after trying that! On 8/4/2011 11:22 AM, Mohammad Shariq wrote: I have indexed around 1 million tweets (using the text dataType). When I search the tweets with # or @ I don't get the exact result, e.g. when I search for #ipad or @ipad I get results where ipad is mentioned, skipping the # and @. Please suggest how to tune this, or which filter factories to use, to get the desired result. I am indexing the tweets as text; below is the text fieldType from my schema.xml.
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" minShingleSize="3" maxShingleSize="3" ignoreCase="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" protected="protwords.txt" language="English"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" minShingleSize="3" maxShingleSize="3" ignoreCase="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" protected="protwords.txt" language="English"/>
  </analyzer>
</fieldType>
-- Thanks and Regards Mohammad Shariq
Re: Indexing tweet and searching @keyword OR #keyword
Do you really want a search on ipad to *fail* to match input of #ipad? Or vice versa? My requirement is: for q='ipad' I want to match both '#ipad' and 'ipad', BUT for q='#ipad' I want to match ONLY '#ipad', excluding 'ipad'. On 10 August 2011 19:49, Erick Erickson erickerick...@gmail.com wrote: Please look more carefully at the documentation for WDFF, specifically: split on intra-word delimiters (all non-alphanumeric characters). WordDelimiterFilterFactory will always throw away non-alphanumeric characters; you can't tell it to do otherwise. Try some of the other tokenizers/analyzers to get what you want, and also look at the admin/analysis page to see the exact effects of your fieldType definitions. Here's a great place to start: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters You probably want something like WhitespaceTokenizerFactory followed by LowerCaseFilterFactory, or some such... But I really question whether this is what you want either. Do you really want a search on ipad to *fail* to match input of #ipad? Or vice versa? KeywordTokenizerFactory is probably not the place you want to start; the tokenization process doesn't break anything up, and you only happen to be getting separate tokens because of WDFF, which as you see can't process things the way you want. Best Erick On Wed, Aug 10, 2011 at 3:09 AM, Mohammad Shariq shariqn...@gmail.com wrote: I tried tweaking WordDelimiterFilterFactory, but it won't accept # or @ symbols and ignores them entirely. I need a solution, please suggest. On 4 August 2011 21:08, Jonathan Rochkind rochk...@jhu.edu wrote: It's the WordDelimiterFilterFactory in your filter chain that's removing the punctuation entirely from your index, I think.
-- Thanks and Regards Mohammad Shariq
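Erick's suggestion above (WhitespaceTokenizerFactory followed by LowerCaseFilterFactory) can be sanity-checked with a toy sketch. This is not Solr's actual analysis code; the two functions below are rough Python stand-ins for whitespace-only tokenization versus delimiter-based splitting, to show why '#ipad' survives the first but not the second.

```python
import re

def whitespace_tokens(text):
    # Rough analogue of WhitespaceTokenizerFactory + LowerCaseFilterFactory:
    # split on whitespace only, so '#ipad' and '@apple' survive as tokens.
    return text.lower().split()

def delimiter_split_tokens(text):
    # Rough analogue of WordDelimiterFilterFactory's splitting behavior:
    # non-alphanumeric characters become token boundaries and are discarded,
    # so '#ipad' collapses to 'ipad'.
    return re.findall(r"[a-z0-9]+", text.lower())

tweet = "Loving the new #iPad from @Apple"
print(whitespace_tokens(tweet))       # '#ipad' and '@apple' preserved
print(delimiter_split_tokens(tweet))  # reduced to 'ipad' and 'apple'
```

With the first scheme a query for '#ipad' and a query for 'ipad' produce different tokens, which is exactly the distinction the original poster wants.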
Indexing tweet and searching @keyword OR #keyword
I have indexed around 1 million tweets (using the text dataType). When I search the tweets with # or @ I don't get the exact result, e.g. when I search for #ipad or @ipad I get results where ipad is mentioned, skipping the # and @. Please suggest how to tune this, or which filter factories to use, to get the desired result. I am indexing the tweets as text; below is the text fieldType from my schema.xml.
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" minShingleSize="3" maxShingleSize="3" ignoreCase="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" protected="protwords.txt" language="English"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" minShingleSize="3" maxShingleSize="3" ignoreCase="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" protected="protwords.txt" language="English"/>
  </analyzer>
</fieldType>
-- Thanks and Regards Mohammad Shariq
Delete by range query
Hi, I want to delete a bunch of docs from my Solr index using a range query. I have one field called 'time' which is a tint. I am deleting using the query:
<delete><query>time:[1296777600+TO+1296778000]</query></delete>
but Solr is returning an error, saying bad request. However, I am able to delete one by one using the delete query below:
<delete><query>time:1296777600</query></delete>
Please suggest a solution to this problem. -- Thanks and Regards Mohammad Shariq
Re: Delete by range query
Thanks Koji, it's working now. On 27 July 2011 19:30, Koji Sekiguchi k...@r.email.ne.jp wrote: <delete><query>time:[1296777600+TO+1296778000]</query></delete> Should be <delete><query>time:[1296777600 TO 1296778000]</query></delete> ? koji -- http://www.rondhuit.com/en/ -- Thanks and Regards Mohammad Shariq
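The fix above comes down to how the range is written inside the XML message: it needs a literal space around TO, because '+' is only a space escape inside URL query strings, not inside a POSTed XML body. As a sketch (the helper names and update-handler URL pattern are assumptions, not from the thread), a client could build and send the delete message like this:

```python
import urllib.request

def build_delete_by_range(field, start, end):
    # Literal spaces around TO inside the XML body; '+' would be sent
    # verbatim and make the range query unparsable.
    return "<delete><query>%s:[%s TO %s]</query></delete>" % (field, start, end)

def post_delete(solr_base_url, body):
    # Hypothetical helper: POST the delete message to Solr's update handler.
    req = urllib.request.Request(
        solr_base_url + "/update?commit=true",
        data=body.encode("utf-8"),
        headers={"Content-Type": "text/xml"},
    )
    return urllib.request.urlopen(req)

body = build_delete_by_range("time", 1296777600, 1296778000)
print(body)  # <delete><query>time:[1296777600 TO 1296778000]</query></delete>
```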
Re: How to find whether solr server is running or not
Check the HTTP response code; if it is anything other than 200, the service is not OK. On 19 July 2011 14:39, Romi romijain3...@gmail.com wrote: I am running an application that gets search results from a Solr server, but when the server is not running I get no response from it. Is there any way I can find out that my server is not running, so that I can give a proper error message about it? - Thanks Regards Romi -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-find-whether-solr-server-is-running-or-not-tp3181870p3181870.html Sent from the Solr - User mailing list archive at Nabble.com. -- Thanks and Regards Mohammad Shariq
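A minimal health-check sketch along those lines, assuming a standard /solr/admin/ping URL (the function names here are made up for illustration):

```python
import urllib.error
import urllib.request

def status_is_ok(code):
    # Only HTTP 200 counts as healthy, per the advice above.
    return code == 200

def solr_is_up(ping_url, timeout=5):
    # Returns False on connection errors or any non-200 response,
    # so the caller can show a proper error message.
    try:
        with urllib.request.urlopen(ping_url, timeout=timeout) as resp:
            return status_is_ok(resp.getcode())
    except (urllib.error.URLError, OSError):
        return False
```

Usage would be something like `solr_is_up("http://localhost:8983/solr/admin/ping")` before issuing the real search request.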
Re: Need Suggestion
Below are some things to do to reduce search latency: 1) Do bulk inserts. 2) Commit after every ~5000 docs. 3) Optimize once a day. 4) In search queries, use the fq parameter. What is the size of the JVM heap you are using? On 15 July 2011 17:44, Rohit ro...@in-rev.com wrote: I am facing some performance issues on my Solr installation (3-core server). I am indexing live Twitter data based on certain keywords; as you can imagine, the rate at which documents are received is very high, so updates to the cores are very frequent and regular. Given below are the document counts on my three cores. Twitter - 26874747 Core2 - 3027800 Core3 - 6074253 My server has 8GB RAM, but now we are experiencing a performance drop. What can be done to improve this? Also, I have a few questions. 1. Does a high number of commits take a lot of memory? Will reducing the number of commits per hour help? 2. Most of my queries are field- or date-faceting based; how do I improve those? Regards, Rohit Mobile: +91-9901768202 About Me: http://about.me/rohitg -- Thanks and Regards Mohammad Shariq
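Points 1 and 2 above (bulk inserts, commit every ~5000 docs) can be sketched as a simple batching loop; the `solr` client object in the commented loop is hypothetical:

```python
def batches(docs, size=5000):
    # Yield the document list in fixed-size chunks, so the client can
    # bulk-add each chunk and commit once per chunk instead of once per doc.
    for i in range(0, len(docs), size):
        yield docs[i:i + size]

# Hypothetical indexing loop (the `solr` client object is assumed):
# for chunk in batches(all_docs):
#     solr.add(chunk)    # bulk insert
#     solr.commit()      # one commit per ~5000 docs
```

The point of the design is that commits (and the cache flushes and searcher reopens they trigger) scale with the number of batches, not the number of documents.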
how to do ExactMatch for PhraseQuery
I need an exact match on phrase queries. When I search for the phrase "call it spring" I get results for: 1) "It's spring" 2) "The spring" but my requirement is an exact match for phrase queries. My search field is text. Along with phrase queries I am doing regular queries too; how do I tune Solr to do exact match for phrase queries without affecting regular queries? Below is the text field of my schema.xml.
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" maxShingleSize="3" ignoreCase="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.CommonGramsFilterFactory" words="stopwords.txt" maxShingleSize="3" ignoreCase="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>
Thanks Shariq
Re: what s the optimum size of SOLR indexes
There are solutions for indexing huge data sets, e.g. SolrCloud, ZooKeeper integration, multi-core, and multi-shard setups; depending on your requirements you can choose one or the other. On 4 July 2011 17:21, Jame Vaalet jvaa...@capitaliq.com wrote: Hi, what would be the maximum size of a single SOLR index file for optimum search time? In case I have to index all the documents in my repository (which is TBs in size), what would be the ideal architecture to follow: distributed SOLR? Regards, JAME VAALET Software Developer EXT: 8108 Capital IQ -- Thanks and Regards Mohammad Shariq
How to disable Phonetic search
I am using Solr 1.4. When I search for the keyword "ansys" I get a lot of posts, but when I search for "ansys NOT ansi" I get nothing. I guess it's because of phonetic search: "ansys" is converted into "ansi" (which is the NOT keyword) and nothing returns. How do I handle this kind of problem? -- Thanks and Regards Mohammad Shariq
Re: How to disable Phonetic search
I was using SnowballPorterFilterFactory for stemming, and that stemmer was stemming the words. I added the keyword "ansys" to the file protwords.txt; now stemming does not happen for "ansys" and it's OK. On 29 June 2011 17:12, Ahmet Arslan iori...@yahoo.com wrote: [...] Find and remove occurrences of solr.PhoneticFilterFactory from your schema.xml file. -- Thanks and Regards Mohammad Shariq
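A toy illustration of the protwords.txt effect. The stemmer below is NOT the real Snowball algorithm, just a two-rule stand-in that happens to reduce "ansys" to "ansi" the same way; the point is only that protected terms bypass stemming entirely.

```python
def toy_stem(word):
    # Toy stand-in for an English stemmer (NOT real Snowball/Porter):
    # strip a plural-looking trailing 's', then normalise a final 'y' to 'i'.
    if word.endswith("s") and not word.endswith("ss"):
        word = word[:-1]
    if word.endswith("y"):
        word = word[:-1] + "i"
    return word

def stem_with_protected(word, protected):
    # Mirrors the effect of protwords.txt: protected terms skip the stemmer.
    return word if word in protected else toy_stem(word)

print(stem_with_protected("ansys", set()))      # collapses to 'ansi'
print(stem_with_protected("ansys", {"ansys"}))  # left untouched
```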
Re: Removing duplicate documents from search results
I also have the problem of duplicate docs. I am indexing news articles; every news article has a source URL, and if two news articles have the same URL, only one should be indexed: removal of duplicates at index time. On 23 June 2011 21:24, simon mtnes...@gmail.com wrote: Have you checked out the deduplication process that's available at indexing time? It includes a fuzzy hash algorithm. http://wiki.apache.org/solr/Deduplication -Simon On Thu, Jun 23, 2011 at 5:55 AM, Pranav Prakash pra...@gmail.com wrote: This approach would definitely work if the two documents are *exactly* the same, but it is very fragile: even if one extra space has been added, the whole hash would change. What I am really looking for is some percentage similarity between documents, removing those documents which are more than 95% similar. *Pranav Prakash* temet nosce Twitter http://twitter.com/pranavprakash | Blog http://blog.myblive.com | Google http://www.google.com/profiles/pranny On Thu, Jun 23, 2011 at 15:16, Omri Cohen o...@yotpo.com wrote: What you need to do is calculate some hash (using any message digest algorithm you want: md5, sha-1, and so on), then do some reading on Solr's field collapsing capabilities. Should not be too complicated.. *Omri Cohen* Co-founder @ yotpo.com | o...@yotpo.com | +972-50-7235198 | +972-3-6036295 IMPORTANT: The contents of this email and any attachments are confidential. They are intended for the named recipient(s) only. If you have received this email by mistake, please notify the sender immediately and do not disclose the contents to anyone or make copies thereof.
-- Forwarded message -- From: Pranav Prakash pra...@gmail.com Date: Thu, Jun 23, 2011 at 12:26 PM Subject: Removing duplicate documents from search results To: solr-user@lucene.apache.org How can I remove very similar documents from search results? My scenario is that there are documents in the index which are almost identical (people submitting the same stuff multiple times, sometimes different people submitting the same stuff). When a search is performed for a keyword, quite frequently the same document comes up multiple times in the top N results. I want to remove those duplicate (or probable duplicate) documents, very much like what Google does when it says "In order to show you the most relevant results, duplicates have been removed". How can I achieve this functionality using Solr? Does Solr have a built-in mechanism or plugin which could help me with it? *Pranav Prakash* temet nosce Twitter http://twitter.com/pranavprakash | Blog http://blog.myblive.com | Google http://www.google.com/profiles/pranny -- Thanks and Regards Mohammad Shariq
Re: Removing duplicate documents from search results
I am making the hash from the URL, but I can't use this as the uniqueKey because I am using a UUID as the uniqueKey. Since I am using Solr as the index engine only, with Riak (key-value storage) as the storage engine, I don't want to overwrite on duplicate; I just need to discard the duplicates. 2011/6/28 François Schiettecatte fschietteca...@gmail.com: Create a hash from the url and use that as the unique key, md5 or sha1 would probably be good enough. Cheers François [...] -- Thanks and Regards Mohammad Shariq
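Since the requirement is discard-on-duplicate with Solr used only as the index, one option is to filter on the client before indexing at all. A sketch of that idea, assuming each article is a dict with a 'url' key (the helper names are made up):

```python
import hashlib

def url_signature(url):
    # Normalise then hash; md5 is fine here since this is deduplication,
    # not a security boundary.
    return hashlib.md5(url.strip().lower().encode("utf-8")).hexdigest()

def discard_duplicates(articles, seen=None):
    # Client-side "discard on duplicate": only articles whose URL signature
    # has not been seen before are passed through to the indexer.
    seen = set() if seen is None else seen
    unique = []
    for art in articles:
        sig = url_signature(art["url"])
        if sig not in seen:
            seen.add(sig)
            unique.append(art)
    return unique
```

Persisting the `seen` set (e.g. in Riak itself, keyed by the signature) would make the check survive restarts.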
Re: Removing duplicate documents from search results
Hey François, thanks for your suggestion. I followed the same link (http://wiki.apache.org/solr/Deduplication); they offer two solutions: either make the hash the uniqueKey, or overwrite on duplicate. I need neither; I need discard-on-duplicate. I have not used it but it looks like it will do the trick. François On Jun 28, 2011, at 8:44 AM, Pranav Prakash wrote: I found the deduplication thing really useful. Although I have not yet started to work on it, as there are some other low-hanging fruits I have to capture. Will share my thoughts soon. *Pranav Prakash* temet nosce Twitter http://twitter.com/pranavprakash | Blog http://blog.myblive.com | Google http://www.google.com/profiles/pranny 2011/6/28 François Schiettecatte fschietteca...@gmail.com: Maybe there is a way to get Solr to reject documents that already exist in the index, but I doubt it; maybe someone else can chime in here. You could do a search for each document prior to indexing it to see if it is already in the index, but that is probably non-optimal; maybe it is easiest to check whether the document exists in your Riak repository: if not, add it and index it, and drop it if it already exists. François On Jun 28, 2011, at 8:24 AM, Mohammad Shariq wrote: [...] -- Thanks and Regards Mohammad Shariq
Solr PhraseSearch and ExactMatch
Hello, I am using Solr 1.4 on Ubuntu 10.10. I have a requirement to do exact match for phrase search. I tried googling but I didn't find an exact solution. I am doing the search on the 'text' field. If I give the search query: http://localhost:8983/solr/select/?q="the search agency" it applies the stop-words filter and removes the word 'the' from the query, but my requirement is an exact match. Please suggest the right solution to this problem. -- Thanks and Regards Mohammad Shariq
Re: Solr PhraseSearch and ExactMatch
I can use 'string' instead of 'text' for exact match, but I need exact match only on phrase search. On 27 June 2011 16:29, Gora Mohanty g...@mimirtech.com wrote: On Mon, Jun 27, 2011 at 3:42 PM, Mohammad Shariq shariqn...@gmail.com wrote: [...] Use a string field which does not have any analyzers or tokenizers. Regards, Gora -- Thanks and Regards Mohammad Shariq
Re: Analyzer creates PhraseQuery
I guess 'to' may be listed in your stopwords. On 28 June 2011 08:27, entdeveloper cameron.develo...@gmail.com wrote: I have an analyzer set up in my schema like so:
<analyzer>
  <tokenizer class="solr.KeywordTokenizerFactory"/>
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="2"/>
</analyzer>
What's happening is, if I index a term like "toys and dolls" and then search for "to", I get no matches. The debug output in Solr gives me:
<str name="rawquerystring">to</str>
<str name="querystring">to</str>
<str name="parsedquery">PhraseQuery(autocomplete:"t o to")</str>
<str name="parsedquery_toString">autocomplete:"t o to"</str>
which means the Lucene query parser is turning it into a PhraseQuery for some reason. The explain seems to confirm that this PhraseQuery is what's causing my document not to match:
0.0 = (NON-MATCH) weight(autocomplete:"t o to" in 82), product of:
  1.0 = queryWeight(autocomplete:"t o to"), product of:
    6.684934 = idf(autocomplete: t=60 o=68 to=14)
    0.1495901 = queryNorm
  0.0 = fieldWeight(autocomplete:"t o to" in 82), product of:
    0.0 = tf(phraseFreq=0.0)
    6.684934 = idf(autocomplete: t=60 o=68 to=14)
    0.1875 = fieldNorm(field=autocomplete, doc=82)
But why? This seems like it should match to me, and indeed the Solr analysis tool highlights the matches (see image), so something isn't lining up right. http://lucene.472066.n3.nabble.com/file/n3116288/Screen_shot_2011-06-27_at_7.55.49_PM.png In case you're wondering, I'm trying to implement a semi-advanced autocomplete feature that goes beyond what a simple EdgeNGram analyzer could do. -- View this message in context: http://lucene.472066.n3.nabble.com/Analyzer-creates-PhraseQuery-tp3116288p3116288.html Sent from the Solr - User mailing list archive at Nabble.com. -- Thanks and Regards Mohammad Shariq
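What the NGramFilterFactory does to the query term can be reproduced with a small sketch (a simplified re-implementation for illustration, not Lucene's code). The query 'to' with minGramSize=1 and maxGramSize=2 itself becomes three tokens, and when an analyzer emits multiple tokens for one query string the query parser can wrap them into the phrase query "t o to" seen in the debug output, which the indexed positions then fail to satisfy.

```python
def ngrams(term, min_size, max_size):
    # Emit every substring of each length from min_size to max_size,
    # mimicking NGramFilterFactory's minGramSize/maxGramSize.
    out = []
    for n in range(min_size, max_size + 1):
        for i in range(len(term) - n + 1):
            out.append(term[i:i + n])
    return out

# The query term 'to' with minGramSize=1, maxGramSize=2 becomes:
print(ngrams("to", 1, 2))  # ['t', 'o', 'to']
```

This is why the usual advice is to apply ngrams only on the index side and keep the query-side analyzer simple for autocomplete fields.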
Re: how to index data in solr form database automatically
First write a script in Python (or Java, PHP, or any language) which reads the data from the database and indexes it into Solr. Then set up this script as a cron job to run automatically at a certain interval. On 24 June 2011 17:23, Romi romijain3...@gmail.com wrote: Would you please tell me how I can use cron to auto-index my database tables in Solr? - Thanks Regards Romi -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-index-data-in-solr-form-database-automatically-tp3102893p3103768.html Sent from the Solr - User mailing list archive at Nabble.com. -- Thanks and Regards Mohammad Shariq
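A minimal sketch of such a script, using SQLite and a JSON POST to Solr's update handler. The table name, column names, field mapping, and URL are all assumptions for illustration; newer Solr versions accept JSON updates, while older ones may need the XML update format instead.

```python
import json
import sqlite3
import urllib.request

def fetch_rows(db_path):
    # Pull the rows to index; table and column names are made up.
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT id, title, body FROM articles").fetchall()
    finally:
        conn.close()

def rows_to_docs(rows):
    # Map DB columns onto Solr field names (assumed schema).
    return [{"id": str(r[0]), "title": r[1], "text": r[2]} for r in rows]

def post_docs(solr_update_url, docs):
    # Hypothetical: POST the docs as JSON and commit in one request.
    req = urllib.request.Request(
        solr_update_url + "?commit=true",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)
```

A crontab entry such as `*/15 * * * * python3 /path/to/index_db.py` would then run the sync every 15 minutes.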
Search is taking long-long time.
I am running two Solr shards. I have indexed 100 million docs in each shard (each is 50 GB, and only 'id' is stored). My search has become very slow, taking around 2-3 seconds. Below is my query: http://solrHost1:8080/solr/select?shards=solrHost1:8080/solr,solrHost2:8080/solr&q=QUERY&fq=FilterQuery&fl=id&start=0&rows=100&indent=on&sort=time desc where QUERY and FilterQuery are: QUERY = "Online Shopping" AND (Amex OR "Am ex" OR "American express" OR americanexpress) FilterQuery = time:[1308659371 TO 1308745771] AND category:news AND lang:English How do I boost query performance? The default search field is title (text). -- Thanks and Regards Mohammad Shariq
Re: Search is taking long-long time.
This is how my 'time' field looks in the schema: <field name="time" type="tint" indexed="true" stored="false"/> and also, I am doing frequent updates to Solr (every 5 minutes). On 22 June 2011 18:41, Ahmet Arslan iori...@yahoo.com wrote: [...] If the fieldType of time is not trie-based, you can change it to tdate, tint, etc. for range queries. If you don't update your index frequently, you can use separate filter queries (fq) for your clauses, to benefit from caching: fq=category:news&fq=lang:English http://wiki.apache.org/solr/SolrPerformanceFactors http://wiki.apache.org/lucene-java/ImproveSearchingSpeed -- Thanks and Regards Mohammad Shariq
Re: fq vs adding to query
fq is a filter query: search based on category, timestamp, language, etc. But I don't see any performance improvement if I use a keyword in fq. Use cases: fq=lang:English&q=camera AND digital OR fq=time:[13023567 TO 13023900]&q=camera AND digital On 19 June 2011 20:17, Jamie Johnson jej2...@gmail.com wrote: Are there any hard and fast rules about when to use fq vs adding to the query? For instance if I started with a search of camera, then wanted to add another keyword, say digital, is it better to do q=camera AND digital, or q=camera&fq=digital? I know that fq isn't taken into account when doing highlighting, so what I am currently doing is using fqs for facet-based queries and adding everything else to the query; in the case above I would have done q=camera AND digital. If however there was a field called category with values standard or digital, I would have done q=camera&fq=category:digital. Any guidance would be appreciated. -- Thanks and Regards Mohammad Shariq
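One concrete consequence of the advice in this thread: each filter should go in its own fq parameter, so Solr can cache each filter's document set independently of the main relevance query. A small sketch of building such a request URL (the helper name is made up):

```python
from urllib.parse import urlencode

def build_search_url(base, q, fqs):
    # One ("fq", ...) pair per filter: Solr caches each fq's DocSet
    # separately in the filterCache, while q stays a pure relevance query.
    params = [("q", q)] + [("fq", f) for f in fqs]
    return base + "/select?" + urlencode(params)

url = build_search_url(
    "http://localhost:8983/solr",
    "camera AND digital",
    ["lang:English", "category:news"],
)
print(url)
```

Keyword terms that affect ranking belong in q; fq clauses are unscored, which is why putting a relevance keyword in fq changes results rather than speeding them up.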
Re: Weird optimize performance degradation
I also have a Solr index with around 100mn docs. I do optimize once a week, and it takes around 1 hour 30 mins. On 19 June 2011 20:02, Santiago Bazerque sbazer...@gmail.com wrote: Hello Erick, thanks for your answer! Yes, our over-optimization is mainly due to paranoia over these strange commit times. The long optimize time persisted in all the subsequent commits, and this is consistent with what we are seeing in other production indexes that have the same problem. Once the anomaly shows up, it never commits quickly again. I combed through the last 50k documents that were added before the first slow commit. I found one with a larger than usual number of fields (I didn't write down the number, but it was a few thousand). I deleted it, and the following optimize was normal again (110 seconds). So I'm pretty sure a document with lots of fields is the cause of the slowdown. If it would be useful, I can do some further testing to confirm this hypothesis and send the document to the list. Thanks again for your answer. Best, Santiago On Sun, Jun 19, 2011 at 10:21 AM, Erick Erickson erickerick...@gmail.com wrote: First, there's absolutely no reason to optimize this often, if at all. Older versions of Lucene would search faster on an optimized index, but this is no longer necessary. Optimize will reclaim data from deleted documents, but it is generally recommended to be performed fairly rarely, often at off-peak hours. Note that optimize will re-write your entire index into a single new segment, so following your pattern it'll take longer and longer each time. But the speed change happening at 500,000 documents is suspiciously close to the default mergeFactor of 10 X 50,000. Do subsequent optimizes (i.e. on the 750,000th document) still take that long? But this doesn't make sense, because if you're optimizing instead of committing, each optimize should reduce your index to 1 segment and you'll never hit a merge. So I'm a little confused.
If you're really optimizing every 50K docs, what I'd expect to see is successively longer times, and at the end of each optimize I'd expect there to be only one segment in your index. Are you sure you're not just seeing successively longer times on each optimize and just noticing it after 10? Best Erick On Sun, Jun 19, 2011 at 6:04 AM, Santiago Bazerque sbazer...@gmail.com wrote: Hello! Here is a puzzling experiment: I build an index of about 1.2MM documents using SOLR 3.1. The index has a large number of dynamic fields (about 15.000). Each document has about 100 fields. I add the documents in batches of 20, and every 50.000 documents I optimize the index. The first 10 optimizes (up to exactly 500k documents) take less than a minute and a half. But the 11th and all subsequent commits take north of 10 minutes. The commit logs look identical (in the INFOSTREAM.txt file), but what used to be Jun 19, 2011 4:03:59 AM IW 13 [Sun Jun 19 04:03:59 EDT 2011; Lucene Merge Thread #0]: merge: total 50 docs Jun 19, 2011 4:04:37 AM IW 13 [Sun Jun 19 04:04:37 EDT 2011; Lucene Merge Thread #0]: merge store matchedCount=2 vs 2 now eats a lot of time: Jun 19, 2011 4:37:06 AM IW 14 [Sun Jun 19 04:37:06 EDT 2011; Lucene Merge Thread #0]: merge: total 55 docs Jun 19, 2011 4:46:42 AM IW 14 [Sun Jun 19 04:46:42 EDT 2011; Lucene Merge Thread #0]: merge store matchedCount=2 vs 2 What could be happening between those two lines that takes 10 minutes at full CPU? (and with 50k docs less used to take so much less?). Thanks in advance, Santiago -- Thanks and Regards Mohammad Shariq
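For readers hitting the same pattern: the mergeFactor Erick refers to lives in solrconfig.xml. A minimal sketch (the value shown is the Solr 3.x default, given here as an illustration, not taken from Santiago's actual config):

```xml
<!-- solrconfig.xml (Solr 3.x style): merge policy knob -->
<indexDefaults>
  <!-- a merge is considered once 10 segments of the same level accumulate -->
  <mergeFactor>10</mergeFactor>
</indexDefaults>
```

An explicit optimize is triggered by posting <optimize/> to the update handler (http://localhost:8983/solr/update). Since optimize rewrites the whole index into one segment, running it after every 50k-document batch makes each run operate on an ever larger index, which is why Erick expects successively longer times.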
Re: Solr and Tag Cloud
I am also looking for the same. Is there any way to get the tag cloud of all the documents matching a specific query? On 18 June 2011 09:42, Jamie Johnson jej2...@gmail.com wrote: Does anyone have details of how to generate a tag cloud of popular terms across an entire data set and then also across a query? -- Thanks and Regards Mohammad Shariq
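One approach worth trying (a sketch, not an official recipe: it assumes your terms live in a tokenized field named text) is Solr's faceting. Facet counts over the matching document set are exactly the term-frequency data a tag cloud needs; the client then scales the counts into font sizes:

```
http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=text&facet.limit=50&facet.mincount=2
```

Replace q=*:* with any query to get the cloud for just that result set. Be aware that faceting on a high-cardinality tokenized full-text field can be memory-hungry; a dedicated multiValued keywords/tags field usually behaves better.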
Re: Is it true that I cannot delete stored content from the index?
I have defined a uniqueKey in my Solr and am deleting docs using this uniqueKey, then optimizing once a day. Is this the right way to delete? On 19 June 2011 05:14, Erick Erickson erickerick...@gmail.com wrote: Yep, you've got to delete and re-add. Although if you have a uniqueKey defined you can just re-add that document and Solr will automatically delete the underlying document. You might have to optimize the index afterwards to get the data to really disappear, since the deletion process just marks the document as deleted. Best Erick On Sat, Jun 18, 2011 at 1:20 PM, Gabriele Kahlout gabri...@mysimpatico.com wrote: Hello, I've been indexing with the content field stored. Now I'd like to delete all stored content; is there a way to do that without re-indexing? It seems not, from the Lucene FAQ http://wiki.apache.org/lucene-java/LuceneFAQ#How_do_I_update_a_document_or_a_set_of_documents_that_are_already_indexed.3F : How do I update a document or a set of documents that are already indexed? There is no direct update procedure in Lucene. To update an index incrementally you must first *delete* the documents that were updated, and *then re-add* them to the index. -- Regards, K. Gabriele --- unchanged since 20/9/10 --- P.S. If the subject contains [LON] or the addressee acknowledges the receipt within 48 hours then I don't resend the email. subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧ time(x) < Now + 48h) ⇒ ¬resend(I, this). If an email is sent by a sender that is not a trusted contact or the email does not contain a valid code then the email is not received. A valid code starts with a hyphen and ends with X. ∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈ L(-[a-z]+[0-9]X)). -- Thanks and Regards Mohammad Shariq
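For reference, deleting by uniqueKey is done by posting an XML update message to the update handler; the id value below is hypothetical:

```xml
<delete><id>tweet-12345</id></delete>
```

posted to http://localhost:8983/solr/update?commit=true (or followed by a separate <commit/>). As Erick says, this only marks the document deleted; the space is actually reclaimed when the affected segments get merged or the index is optimized, so a periodic optimize after bulk deletes is a reasonable, if heavyweight, pattern.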
Search failed even if it has the keyword .
Hello, Solr search failed even though the index has the keyword. I am using Solr (Solr 3.1 on Ubuntu 10.10) for indexing tweets. I am indexing certain tweets, but Solr doesn't return any result when I search for any keyword from a tweet. In Solr, the tweet is stored as 'text'. Below is the tweet which I indexed: RT @Khan_KK: DescribeYourImageWithAMovieTitle Khan Rais. Once this tweet is indexed, I search with: http://127.0.0.1:8983/solr/select/?q=DescribeYourImageWithAMovieTitle&version=2.2&start=0&rows=10&indent=on and nothing is returned from Solr even though this tweet is there in the index. I tried searching many keywords e.g. describe, image, movie but nothing is returned. I am using the 'text' field of Solr 3.1. Am I using the right tokenizer? Please help me. -- Thanks and Regards Mohammad Shariq
Re: Search failed even if it has the keyword .
Hi Pravesh, this is how my schema looks for the 'text' field:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>

My default search field is 'defaultquery' and my copyField declarations are:

<copyField source="title" dest="defaultquery"/>
<copyField source="description" dest="defaultquery"/>
<copyField source="siteurl" dest="defaultquery"/>

And my tweet is indexed into 'title'. On 17 June 2011 15:46, pravesh suyalprav...@yahoo.com wrote: First check, in your schema.xml, which is your default search field. Also look at whether you are using WordDelimiterFilterFactory for the specific field. This would tokenize your words on every capital letter, so the word DescribeYourImageWithAMovieTitle will be broken into multiple tokens and each will be searchable. -- View this message in context: http://lucene.472066.n3.nabble.com/Search-failed-even-if-it-has-the-keyword-tp3075626p3075644.html Sent from the Solr - User mailing list archive at Nabble.com. 
-- Thanks and Regards Mohammad Shariq
Re: Search failed even if it has the keyword .
Many thanks, Pravesh. My 'title' was of type 'text' whereas 'defaultquery' was 'query_text'. I changed my 'defaultquery' to 'text' and the problem is solved. Thanks again. On 17 June 2011 16:57, pravesh suyalprav...@yahoo.com wrote: What is the type for the field's defaultquery title in your schema.xml ? -- View this message in context: http://lucene.472066.n3.nabble.com/Search-failed-even-if-it-has-the-keyword-tp3075626p3075797.html Sent from the Solr - User mailing list archive at Nabble.com. -- Thanks and Regards Mohammad Shariq
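To summarize the fix for anyone searching the archives: the copyField destination used as the default search field must share the source field's analysis, otherwise index-time and query-time tokens no longer line up and matches silently fail. A minimal schema.xml sketch (field names from this thread; the attribute values are illustrative assumptions, not the poster's exact config):

```xml
<field name="title" type="text" indexed="true" stored="true"/>
<field name="defaultquery" type="text" indexed="true" stored="false" multiValued="true"/>
<copyField source="title" dest="defaultquery"/>
<defaultSearchField>defaultquery</defaultSearchField>
```

With 'defaultquery' typed as 'query_text' (a different analyzer chain than 'text'), the terms produced at query time did not match the terms indexed from 'title'.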
Re: how to Index and Search non-Eglish Text in solr
Can I specify multiple languages in the filter tags in schema.xml, like below? fieldType name=text class=solr.TextField positionIncrementGap=100 analyzer type=index tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=0 splitOnCaseChange=1/ filter class=solr.SnowballPorterFilterFactory language=Dutch/ filter class=solr.SnowballPorterFilterFactory language=English/ filter class=solr.SnowballPorterFilterFactory language=Chinese/ tokenizer class=solr.WhitespaceTokenizerFactory/ tokenizer class=solr.CJKTokenizerFactory/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.SnowballPorterFilterFactory language=Hungarian/ On 8 June 2011 18:47, Erick Erickson erickerick...@gmail.com wrote: This page is a handy reference for individual languages... http://wiki.apache.org/solr/LanguageAnalysis But the usual approach, especially for Chinese/Japanese/Korean (CJK), is to index the content in different fields with language-specific analyzers, then spread your search across the language-specific fields (e.g. title_en, title_fr, title_ar). Stemming and stopwords particularly give surprising results if you put words from different languages in the same field. Best Erick On Wed, Jun 8, 2011 at 8:34 AM, Mohammad Shariq shariqn...@gmail.com wrote: Hi, I had set up Solr (solr-1.4 on Ubuntu 10.10) for indexing news articles in English, but my requirements extend to indexing news in other languages too. 
This is how my schema looks:

<field name="news" type="text" indexed="true" stored="false" required="false"/>

And the text field in schema.xml looks like:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>

My problem is: now I want to index news articles in other languages too, e.g. Chinese, Japanese. How can I modify my text field so that I can index the news in other languages too and make it searchable? Thanks Shariq -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-Index-and-Search-non-Eglish-Text-in-solr-tp3038851p3038851.html Sent from the Solr - User mailing list archive at Nabble.com. -- Thanks and Regards Mohammad Shariq
Re: how to Index and Search non-Eglish Text in solr
Thanks Erick for your help. I have another silly question. Suppose I create multiple fieldTypes, e.g. news_English, news_Chinese, news_Japnese etc. After creating these fields, can I copy all of them to the copyField destination 'defaultquery' like below:

<copyField source="news_English" dest="defaultquery"/>
<copyField source="news_Chinese" dest="defaultquery"/>
<copyField source="news_Japnese" dest="defaultquery"/>

and my defaultquery looks like:

<field name="defaultquery" type="query_text" indexed="false" stored="false" multiValued="true"/>

Is this the right way to deal with multi-language indexing and searching? On 9 June 2011 19:06, Erick Erickson erickerick...@gmail.com wrote: No, you'd have to create multiple fieldTypes, one for each language. Best Erick On Thu, Jun 9, 2011 at 5:26 AM, Mohammad Shariq shariqn...@gmail.com wrote: Can I specify multiple languages in the filter tags in schema.xml, like below? fieldType name=text class=solr.TextField positionIncrementGap=100 analyzer type=index tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=0 splitOnCaseChange=1/ filter class=solr.SnowballPorterFilterFactory language=Dutch/ filter class=solr.SnowballPorterFilterFactory language=English/ filter class=solr.SnowballPorterFilterFactory language=Chinese/ tokenizer class=solr.WhitespaceTokenizerFactory/ tokenizer class=solr.CJKTokenizerFactory/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.SnowballPorterFilterFactory language=Hungarian/ On 8 June 2011 18:47, Erick Erickson erickerick...@gmail.com wrote: This page is a handy reference for individual languages... 
http://wiki.apache.org/solr/LanguageAnalysis But the usual approach, especially for Chinese/Japanese/Korean (CJK) is to index the content in different fields with language-specific analyzers then spread your search across the language-specific fields (e.g. title_en, title_fr, title_ar). Stemming and stopwords particularly give surprising results if you put words from different languages in the same field. Best Erick On Wed, Jun 8, 2011 at 8:34 AM, Mohammad Shariq shariqn...@gmail.com wrote: Hi, I had setup solr( solr-1.4 on Ubuntu 10.10) for indexing news articles in English, but my requirement extend to index the news of other languages too. This is how my schema looks : field name=news type=text indexed=true stored=false required=false/ And the text Field in schema.xml looks like : fieldType name=text class=solr.TextField positionIncrementGap=100 analyzer type=index tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=1 catenateNumbers=1 catenateAll=0 splitOnCaseChange=1/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.SnowballPorterFilterFactory language=English protected=protwords.txt/ /analyzer analyzer type=query tokenizer class=solr.WhitespaceTokenizerFactory/ filter class=solr.SynonymFilterFactory synonyms=synonyms.txt ignoreCase=true expand=true/ filter class=solr.StopFilterFactory ignoreCase=true words=stopwords.txt enablePositionIncrements=true/ filter class=solr.WordDelimiterFilterFactory generateWordParts=1 generateNumberParts=1 catenateWords=0 catenateNumbers=0 catenateAll=0 splitOnCaseChange=1/ filter class=solr.LowerCaseFilterFactory/ filter class=solr.SnowballPorterFilterFactory language=English protected=protwords.txt/ /analyzer /fieldType My Problem is : Now I want to index the news articles in other languages to e.g. Chinese,Japnese. 
How I can I modify my text field so that I can Index the news in other lang too and make it searchable ?? Thanks Shariq -- View this message in context: http://lucene.472066.n3.nabble.com/how-to-Index-and-Search-non-Eglish-Text-in-solr-tp3038851p3038851.html Sent from the Solr - User mailing list archive at Nabble.com. -- Thanks and Regards Mohammad Shariq -- Thanks and Regards Mohammad Shariq
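Erick's advice can be sketched concretely. The fieldType and field names below (text_en, text_cjk, news_en, news_zh) are hypothetical; CJKTokenizerFactory is used for Chinese/Japanese because there is no Snowball stemmer for Chinese. A minimal schema.xml sketch:

```xml
<!-- one analysis chain per language -->
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>
<fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.CJKTokenizerFactory"/>
  </analyzer>
</fieldType>

<!-- one field per language -->
<field name="news_en" type="text_en" indexed="true" stored="false"/>
<field name="news_zh" type="text_cjk" indexed="true" stored="false"/>
```

At query time you then search across the language fields, e.g. q=news_en:ipad OR news_zh:ipad, or list both in a dismax qf parameter. Copying differently-analyzed languages into one destination field (as proposed above with defaultquery) re-introduces the mixed-language problems Erick warns about, since the destination field has a single analyzer.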
Re: SolrCloud questions
I am also planning to move to SolrCloud; since it's still under development, I am not sure about its behavior in production. Please update us once you find it stable. On 10 June 2011 03:56, Upayavira u...@odoko.co.uk wrote: I'm exploring SolrCloud for a new project, and have some questions based upon what I've found so far. The setup I'm planning is going to have a number of multicore hosts, with cores being moved between hosts, and potentially with cores merging as they get older (cores are time based, so once today has passed, they don't get updated). First question: The solr/conf dir gets uploaded to ZooKeeper when you first start up, and using system properties you can specify a name to be associated with those conf files. How do you handle it when you have a multicore setup, and different configs for each core on your host? Second question: Can you query collections when using multicore? On single core, I can query: http://localhost:8983/solr/collection1/select?q=blah On a multicore system I can query: http://localhost:8983/solr/core1/select?q=blah but I cannot work out a URL to query collection1 when I have multiple cores. Third question: For replication, I'm assuming that replication in SolrCloud is still managed in the same way as non-cloud Solr, that is as ReplicationHandler config in solrconfig? In which case, I need a different config setup for each slave, as each slave has a different master (or can I delegate the decision as to which host/core is its master to ZooKeeper?) Thanks for any pointers. Upayavira --- Enterprise Search Consultant at Sourcesense UK, Making Sense of Open Source -- Thanks and Regards Mohammad Shariq
Re: Re: Can I update a specific field in solr?
Solr doesn't support partial updates. On 8 June 2011 16:04, ZiLi dangld...@163.com wrote: Thanks very much, I'll re-index a whole document : ) From: Chandan Tamrakar Sent: 2011-06-08 18:25:37 To: solr-user Subject: Re: Can I update a specific field in solr? I think you can do that, but you need to re-index the whole document again. Note that there is nothing like update; it's usually delete and then add. thanks On Wed, Jun 8, 2011 at 4:00 PM, ZiLi dangld...@163.com wrote: Hi, I tried to update a specific field in Solr, but I didn't find any way to implement this. Does anyone know how to? Any suggestions will be appreciated : ) 2011-06-08 ZiLi -- Chandan Tamrakar -- Thanks and Regards Mohammad Shariq
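The delete-then-re-add cycle Chandan describes looks like this as an XML update message; the field names and values below are made up for illustration, and every field of the document must be re-sent, since Solr (as of 3.x) has no field-level update:

```xml
<add>
  <doc>
    <field name="id">doc-1</field>
    <field name="title">the corrected title</field>
    <!-- ...re-send every other field of the document... -->
  </doc>
</add>
```

Because 'id' is the uniqueKey here, posting this to /update replaces the previous version of doc-1 automatically (an implicit delete of the old document with the same key); a <commit/> then makes the change visible to searchers.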
Re: solr speed issues..
How frequently do you optimize your Solr index? Optimization also helps in reducing search latency. -- View this message in context: http://lucene.472066.n3.nabble.com/solr-speed-issues-tp2254823p3038794.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: problem: zooKeeper Integration with solr
How is this method (http://localhost:8983/solr/select?shards=Machine:Port/Solr Path,Machine:Port/Solr Path&indent=true&q=query) better than ZooKeeper? Could you please point me to any performance doc? On 7 June 2011 08:18, bmdakshinamur...@gmail.com bmdakshinamur...@gmail.com wrote: Instead of integrating ZooKeeper, you could create shards over multiple machines and specify the shards while you are querying Solr. E.g.: http://localhost:8983/solr/select?shards=Machine:Port/Solr Path,Machine:Port/Solr Path&indent=true&q=query On Mon, Jun 6, 2011 at 5:59 PM, Mohammad Shariq shariqn...@gmail.com wrote: Hi folks, I am using Solr to index around 100mn docs. Now I am planning to move to cluster-based Solr, so that I can scale the indexing and searching process. Since SolrCloud is in the development stage, I am trying to index in a shard-based environment using ZooKeeper. I followed the steps from http://wiki.apache.org/solr/ZooKeeperIntegration but still I am not able to do distributed search. Once I index the docs in one shard, I am not able to query from the other shard and vice versa (using the query http://localhost:8180/solr/select/?q=itunes&version=2.2&start=0&rows=10&indent=on ). I am running Solr 3.1 on Ubuntu 10.10. Please help me. -- Thanks and Regards Mohammad Shariq -- Thanks and Regards, DakshinaMurthy BM -- Thanks and Regards Mohammad Shariq
problem: zooKeeper Integration with solr
Hi folks, I am using Solr to index around 100mn docs. Now I am planning to move to cluster-based Solr, so that I can scale the indexing and searching process. Since SolrCloud is in the development stage, I am trying to index in a shard-based environment using ZooKeeper. I followed the steps from http://wiki.apache.org/solr/ZooKeeperIntegration but still I am not able to do distributed search. Once I index the docs in one shard, I am not able to query from the other shard and vice versa (using the query http://localhost:8180/solr/select/?q=itunes&version=2.2&start=0&rows=10&indent=on ). I am running Solr 3.1 on Ubuntu 10.10. Please help me. -- Thanks and Regards Mohammad Shariq
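For comparison with the ZooKeeper route, a plain distributed search needs only the shards parameter on the query; the hostnames below are hypothetical, and each document must live in exactly one shard (partitioning documents across shards is up to your indexing code):

```
http://shard1:8983/solr/select?shards=shard1:8983/solr,shard2:8983/solr&q=itunes&start=0&rows=10&indent=on
```

Any shard can act as the aggregating node. The usual Solr 3.x caveats apply: the uniqueKey must be unique across all shards, and distributed IDF is not computed, so scores can differ slightly from a single-index setup.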