Re: Scoring of DisMax in Solr
This seems like a bug to me.

On 10/4/11 6:52 PM, David Ryan help...@gmail.com wrote:
> Hi,
> When I examine the score calculation of DisMax in Solr, it looks to me that DisMax is using tf x idf^2 instead of tf x idf. Does anyone have insight into why tf x idf is not used here?
> Here is the score contribution from one field:
>   score(q,c) = queryWeight x fieldWeight = tf x idf x idf x queryNorm x fieldNorm
> Here is the example that I used to derive the formula above. Clearly, idf is multiplied twice in the score calculation.
> http://localhost:8983/solr/select/?q=GB&version=2.2&start=0&rows=10&indent=on&debugQuery=true&fl=id,score
> <str name="6H500F0">
> 0.18314168 = (MATCH) sum of:
>   0.18314168 = (MATCH) weight(text:gb in 1), product of:
>     0.35845062 = queryWeight(text:gb), product of:
>       2.3121865 = idf(docFreq=6, numDocs=26)
>       0.15502669 = queryNorm
>     0.5109258 = (MATCH) fieldWeight(text:gb in 1), product of:
>       1.4142135 = tf(termFreq(text:gb)=2)
>       2.3121865 = idf(docFreq=6, numDocs=26)
>       0.15625 = fieldNorm(field=text, doc=1)
> </str>
> Thanks!
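The arithmetic in the quoted explain output can be re-checked with a short script. This is only a sketch: the four constants are copied from the explain output, while the helper names and the idf/tf formulas (1 + ln(numDocs / (docFreq + 1)) and sqrt(termFreq)) are assumptions based on how Lucene's classic TF-IDF similarity is usually described. As far as I understand that similarity, idf enters once through the query weight and once through the field weight, which is where the idf^2 comes from.

```python
import math

# Values copied from the debugQuery explain output in the quoted message.
IDF = 2.3121865
QUERY_NORM = 0.15502669
TF = 1.4142135
FIELD_NORM = 0.15625


def query_weight(idf, query_norm):
    """queryWeight = idf x queryNorm (first appearance of idf)."""
    return idf * query_norm


def field_weight(tf, idf, field_norm):
    """fieldWeight = tf x idf x fieldNorm (second appearance of idf)."""
    return tf * idf * field_norm


def score(tf, idf, query_norm, field_norm):
    """score = queryWeight x fieldWeight, so idf is squared overall."""
    return query_weight(idf, query_norm) * field_weight(tf, idf, field_norm)


def classic_idf(doc_freq, num_docs):
    """Assumed classic formula: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def classic_tf(term_freq):
    """Assumed classic formula: sqrt(termFreq)."""
    return math.sqrt(term_freq)


if __name__ == "__main__":
    print("queryWeight:", query_weight(IDF, QUERY_NORM))
    print("fieldWeight:", field_weight(TF, IDF, FIELD_NORM))
    print("score:      ", score(TF, IDF, QUERY_NORM, FIELD_NORM))
    print("idf(6, 26): ", classic_idf(6, 26))
    print("tf(2):      ", classic_tf(2))
```

The printed values should agree with the explain output up to single-precision rounding.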
Re: A simple query?
Hi, Set your default operator to OR, i.e. <solrQueryParser defaultOperator="OR"/> in schema.xml. Also keep your field's type as text, i.e. <field name="myfield" type="text" indexed="true" stored="true"/>, since you would want whitespace tokenization, and try your query with parentheses, i.e. /select/?q=myfields:(a b)&version=2.2&start=0&rows=2&indent=on This should hopefully solve your problem. -- View this message in context: http://lucene.472066.n3.nabble.com/A-simple-query-tp3395465p3395735.html Sent from the Solr - User mailing list archive at Nabble.com.
is there a way to know which mm value was used?
Hello, I'd like to be able to know programmatically what value mm was set to for a given request (to avoid having to parse the query, identify stopwords, and calculate mm based on solrconfig.xml). Is there a way to get the mm value in the Solr response? Thanks, Elisabeth
Re: boosting and relevancy options from solr extensibility points -java-
> in a certain time period (say christmas) I will promote a doc in christmas keyword

You might check the QueryElevation component in Solr.

> or based on users interest I will boost a specific category of products. or (I am not sure how can I do this one) I will boost docs that current user's friends (source:facebook) purchased/used/...

You can check Apache Mahout for this purpose: https://cwiki.apache.org/confluence/display/MAHOUT/Recommender+Documentation It has a recommendation engine that works pretty well.

Thanx Pravesh
Re: is there a way to know which mm value was used?
You can explicitly pass mm for every search and get it back in your response; otherwise use debugQuery=true, which will give you all implicitly used defaults (but you wouldn't want to use this in production). Thanx Pravesh
Re: is there a way to know which mm value was used?
Hi, since this isn't logged anywhere, as far as I can tell, there are two ways: either you pass mm in your URL call, so that you get the whole mm param back per request and calculate the applied mm from that information (sounds bad), or you recalculate it within your own custom search component. The latter approach is better in my opinion. The third option requires you to modify/fork existing code. Keep in mind that this means you have to maintain your modification/fork on every update. Regards, Em

On 05.10.2011 09:01, elisabeth benoit wrote:
> Hello, I'd like to be able to know programmatically what value mm was set to for a given request (to avoid having to parse the query, identify stopwords, and calculate mm based on solrconfig.xml). Is there a way to get the mm value in the Solr response? Thanks, Elisabeth
Re: Hierarchical faceting with Date
You could index the date as a text field (or use a new text field to store the date as text) and then try it on this new field. Thanx Pravesh
Re: Indexing PDF
It seems unreasonable that, to index a local file, I have to reference it by a URL. This isn't a strange file; it is a file downloaded from the Lucid web portal called "Starting a Search Application.pdf". This may be an encoding or character set problem. I can open the file with a PDF reader without problems, and I don't know why referencing it by a URL would fix this. Can you help me? I'm working with SolrJ, from Java. Does anyone have the same problem with SolrJ? Thanks to Paul Libbrecht for the suggestion. Best regards

2011/10/4 Paul Libbrecht p...@hoplahup.net:
> full of boxes for me.
> Héctor, you need another way to reference these! (e.g. a URL)
> paul
>
> On 4 Oct 2011 at 16:49, Héctor Trujillo wrote:
>> Hi all, I'm indexing PDF files with SolrJ, and most of them work. But with some files I have problems because strange characters get stored. This is the content that was stored:
>> +++ Starting a Search Application Abstract Starting a Search Application A Lucid Imagination White Paper ¥ April 2009 Page i Starting a Search Application A Lucid Imagination White Paper ¥ April 2009 Page ii Do You Need Full-text Search? ∞ ∞ ∞ Starting a Search Application A Lucid Imagination White Paper ¥ April 2009 Page 1
Re: Indexing PDF
Sorry, you are right: this file was indexed with a .NET web service client that calls a Java application (a web service), which in turn calls Solr using SolrJ. I will try to index it in a different way; maybe that will resolve the problem. Thanks. Best regards
How do i get results for quering with separated words?
Hello, i have configured a catchall searchword field. In this i copy the value of field name. Name value = Star Wars. Now i try to find this document by searchword starwars. But it's not found. Vice versa same problem. Name value = SuperRTL, searchword is super rtl. Replacing all whitespaces (in index and query) leads to unsatisfiying results. Can someone give me please a small description how i can solve this. Maybe there is already a blog on this. Thanks for help Mike
Re: How do i get results for quering with separated words?
Which field type in schema.xml do you use? Try WordDelimiterFilterFactory or some of the other filters from this page: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory

--- System --- One Server, 12 GB RAM, 2 Solr Instances, 8 Cores, 1 Core with 45 Million Documents, other Cores 200.000 - Solr1 for Search-Requests - commit every Minute - 5GB Xmx - Solr2 for Update-Request - delta every Minute - 4GB Xmx
Re: Indexing PDF
Héctor, I meant that you need another way to reference the file *to the mailing list*. Sorry for the confusion. I do not think there's anything special about the set of interfaces you're using if the delivery is the same for the Solr client and the Acrobat plugin. To make sure of it, you could try to request static files served by a faithful server. paul
Re: indexing FTP documet with solrj
Hello, To crawl the documents you can use Apache Tika before sending the content to Solr (via SolrJ). Regards, Marc.

On Wed, Oct 5, 2011 at 1:16 AM, Chris Hostetter hossman_luc...@fucit.org wrote:
> : I want to index some documents with the SolrJ APIs, but the URL of these
> : documents is FTP.
> : How do I set the username and password for the FTP account in SolrJ?
> :
> : In the SolrJ API there is the CommonsHttpSolrServer method, but I do not find any
> : method for FTP configuration.
>
> It sounds like you are getting confused between using SolrJ to talk to *Solr* and using SolrJ to index arbitrary URLs. SolrJ doesn't do any crawling -- if you have data that you want to index, then your client code needs to decide what that data is (and where it comes from) and feed that data to SolrJ as documents to index. The only URLs that SolrJ knows about are:
> * the URL for talking to Solr
> * strings that SolrJ passes to Solr as document fields that may just so happen to be URLs (SolrJ doesn't know/care)
> -Hoss
Re: How do i get results for quering with separated words?
Thanks stockii, but WDFF only splits on numerics or case changes. For "Star Wars" in the index and "starwars" in the query, this means that the two are not equal, right? Thanks Mike
Re: Indexing PDF
Can you attach this PDF to an email sent to the list? Or is it too large for that? Or, you can try running Tika directly on the PDF to see if it's able to extract the text. Mike McCandless http://blog.mikemccandless.com
Re: How do i get results for quering with separated words?
index this field without whitespaces ? XD
Re: A simple query?
Thanks, but unfortunately that will not solve the problem, since it will bring back both the first and the second doc. Besides, the query terms are: a b y z, not just: a b
Re: Indexing PDF
Hmm, no attachment; maybe it's too large? Can you send it directly to me? Mike McCandless http://blog.mikemccandless.com

2011/10/5 Héctor Trujillo hecto...@gmail.com:
> This is the file that gives me errors.
Re: Indexing PDF
I've uploaded the file here: http://www.filesonic.com/file/2342166624/Starting_a_Search_Application.pdf Try this, thanks.
Re: is there a way to know which mm value was used?
On 10/5/2011 1:01 AM, elisabeth benoit wrote:
> Hello, I'd like to be able to know programmatically what value mm was set to for a given request (to avoid having to parse the query, identify stopwords, and calculate mm based on solrconfig.xml). Is there a way to get the mm value in the Solr response?

To supplement the other answers you've gotten: if you set echoParams to all, either on the URL or in the solrconfig.xml request handler definition, each request should give you whatever value of mm is used, along with all the other parameters, which might be useful information. If mm is not present in the response when you do this, then it probably was not specified anywhere. That would indicate that it is set to the default, 100%.

http://wiki.apache.org/solr/CoreQueryParameters#echoParams
http://wiki.apache.org/solr/DisMaxQParserPlugin#mm_.28Minimum_.27Should.27_Match.29

Thanks, Shawn
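A client-side sketch of the echoParams approach might look like the following. The base URL, the helper names, and the use of wt=json are assumptions for illustration; the only Solr pieces relied on are the echoParams=all parameter and the responseHeader/params section that it populates. Note that this reads back the mm spec as a string; it does not resolve that spec against the number of clauses in the query.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, extra=None):
    """Build a /select URL that asks Solr to echo every request parameter."""
    params = {"q": query, "echoParams": "all", "wt": "json"}
    if extra:
        params.update(extra)
    # urlencode percent-escapes values, so "50%" is sent as "50%25".
    return base_url.rstrip("/") + "/select?" + urlencode(params)


def echoed_mm(response):
    """Return the echoed mm spec from a parsed JSON response, or None."""
    return response.get("responseHeader", {}).get("params", {}).get("mm")


if __name__ == "__main__":
    # Hypothetical host and core path, for illustration only.
    print(build_select_url("http://localhost:8983/solr", "star wars", {"mm": "50%"}))
    print(echoed_mm({"responseHeader": {"params": {"mm": "50%"}}}))
```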
Re: How do i get results for quering with separated words?
Isn't this more a problem of the query string? Let's assume I have a game name like Nintendo 3DS - 'Star Wars - Clone Wars'. Can I copy that name to a field, cutting the - and ', lowercasing the resulting string and removing the whitespace, so that I have nintendo3dsstarwarsclonewars? Is that findable with my starwars query string? Thanks for helping me, Mike
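To make the question concrete, here is a small sketch of the collapsing step described above. The function name is made up for illustration, and the code only models string normalization, not Solr's analysis chain. It suggests that the collapsed name contains "starwars" as a substring but is not equal to it, so a plain term match on a single collapsed token would presumably not find it without something like wildcards or n-grams.

```python
import re


def collapse(name):
    """Lowercase the text and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


if __name__ == "__main__":
    indexed = collapse("Nintendo 3DS - 'Star Wars - Clone Wars'")
    query = collapse("starwars")
    print(indexed)            # the single collapsed token
    print(indexed == query)   # whole-token equality
    print(query in indexed)   # substring containment
```

Collapsing does help for the simpler cases from the original question, where the whole name is the search term ("Star Wars" versus "starwars", "SuperRTL" versus "super rtl").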
Re: schema changes changes 3.3 to 3.4?
Okay, I used the analysis tool and it made me notice a few things. More importantly, I found what changed: there is no longer a field type named text in the new schema; there is only text_en, which is odd since the text field is the default when doing a query. Anyway, when I used the analysis tool and made the stemmers and everything else match between the old schema and the new schema, I got a match in the analysis tool but not in the query. I should say that in the past I have been using Solr with the default schema without any changes, but since I upgraded to 3.4.0 I have this problem with plurals not being displayed.
Sorting by article title
Hi all! I have documents, all of which have a title, and I would like to sort by that title. The catch is that I wish to sort ignoring any "A" or "The" at the beginning of the title. My first (and only) attempt is to create a type that looks like:

<fieldType name="titleSort" class="solr.TextField" sortMissingLast="true" omitNorms="true">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TrimFilterFactory"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="([^a-z])" replacement="" replace="all"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^the\s" replacement="" replace="first"/>
    <filter class="solr.PatternReplaceFilterFactory" pattern="^a\s" replacement="" replace="first"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
</fieldType>

Also, the StopFilter should do the same thing, I think, so there is some redundancy here too, right?

and a field that looks like:

<field name="title.main" type="stringSort" indexed="true" maxChars="32" stored="true" multiValued="false"/>

I copyField my original title to this field at index time. However, when I add sort=title.main asc to my query, the original sort is what I see. Clearly, I'm either doing something wrong or misunderstanding something. Can anybody explain what's up and suggest a way to accomplish what I need to do? Thanks in advance!!
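The intended sort key can be sketched outside Solr to see what the filter chain is supposed to produce. This is a plain-Python model with made-up function names, not Solr's analysis code. It assumes the filters run in the order listed, in which case the "remove everything that is not a-z" step also removes the whitespace that the later ^the\s and ^a\s patterns need, so those patterns might never match. It may also be worth checking that the field is declared with type stringSort rather than titleSort.

```python
import re

_LEADING_ARTICLE = re.compile(r"^(the|a)\s+")
_NON_LETTERS = re.compile(r"[^a-z]")


def title_sort_key(title):
    """Lowercase, trim, strip one leading article, then drop non-letters."""
    key = title.strip().lower()
    key = _LEADING_ARTICLE.sub("", key, count=1)
    return _NON_LETTERS.sub("", key)


def listed_order_key(title):
    """Same steps in the order the post lists them (non-letters removed first)."""
    key = title.strip().lower()
    key = _NON_LETTERS.sub("", key)            # whitespace is gone after this
    key = re.sub(r"^the\s", "", key, count=1)  # needs whitespace to match
    return re.sub(r"^a\s", "", key, count=1)   # needs whitespace to match


if __name__ == "__main__":
    titles = ["The Zebra", "A Mouse", "Apple"]
    print(sorted(titles, key=title_sort_key))
    print(title_sort_key("The Hobbit"), listed_order_key("The Hobbit"))
```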
Re: is there a way to know which mm value was used?
Thanks for answering. echoParams just echoes the mm value from solrconfig.xml (in my case mm = 4<-1 6<-2), not the actual value of mm for one particular request. I think it would be very useful to be able to know which mm value was effectively used, in particular for requests with stopwords. It's of course possible to calculate mm in my own code, but this would require staying synchronized with the mm default value in solrconfig.xml and with stopwords.txt, and identifying all stopwords in the request. Best regards, Elisabeth
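For the "calculate mm in my own code" route, a rough sketch of resolving an mm spec against a clause count is shown below, assuming the configured spec is the conditional form 4<-1 6<-2. The function is my reading of the documented mm semantics (plain integers, negative values, percentages, and conditional n<value specs), not Solr's own code, so edge cases such as rounding or whitespace around < should be checked against the Solr source. The clause count passed in has to be the number of optional clauses left after stopword removal, which is exactly the synchronization problem described above.

```python
def min_should_match(clause_count, spec):
    """Resolve an mm spec to a required clause count (illustrative sketch)."""
    result = clause_count
    spec = spec.strip()
    if "<" in spec:
        # Conditional specs such as "4<-1 6<-2", applied left to right.
        for part in spec.split():
            bound_text, value = part.split("<", 1)
            if clause_count <= int(bound_text):
                return result
            result = min_should_match(clause_count, value)
        return result
    if spec.endswith("%"):
        raw = clause_count * int(spec[:-1]) / 100
        # Negative percentages say how many clauses may be missing.
        result = clause_count + int(raw) if raw < 0 else int(raw)
    else:
        value = int(spec)
        # Negative integers say how many clauses may be missing.
        result = clause_count + value if value < 0 else value
    # Clamp to the range 0..clause_count.
    return max(0, min(clause_count, result))


if __name__ == "__main__":
    for count in (3, 5, 6, 7):
        print(count, min_should_match(count, "4<-1 6<-2"))
```

Under this reading, queries with up to four clauses require all of them, five or six clauses allow one to be missing, and more than six allow two to be missing.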
Re: How do i get results for quering with separated words?
I think you could define "star wars" and "starwars" as synonyms in synonyms.txt... maybe not generic enough?
How to empty SolR Cache
Hello, everybody. First, I should warn you that I'm a newbie with mailing lists and a Froggie, so please excuse what may look like obvious errors, in both computing and language. I'm currently trying to benchmark my Solr install with a custom script, but this benchmark must be run with all Solr caches empty. Is there a way to clear the Solr caches with a command, or to restart Solr with an option that avoids cache autowarming? Thank you in advance. Kind regards. -- David GUYOT Sys admin Europe Camions Interactive Moulin Collot F-88500 AMBACOURT Tel.:03.29.30.47.85
Re: is there a way to know which mm value was used?
On 10/5/2011 9:06 AM, elisabeth benoit wrote:
> Thanks for answering. echoParams just echoes the mm value from solrconfig.xml (in my case mm = 4<-1 6<-2), not the actual value of mm for one particular request. I think it would be very useful to be able to know which mm value was effectively used, in particular for requests with stopwords. It's of course possible to calculate mm in my own code, but this would require staying synchronized with the mm default value in solrconfig.xml and with stopwords.txt, and identifying all stopwords in the request.

Just tried this on a Solr 3.4.0 server. I have an edismax handler that includes echoParams, set to all, as well as an mm parameter, set to 2<-1 4<-50%. If I send a request with no mm parameter, that value is reflected in the response. When I add mm=50%25 to the URL in my browser (%25 being the URL encoding for the percent symbol), the response changes the mm value to 50% as expected, overriding the value in solrconfig.xml. I have not tried it with SolrJ or any of the other client APIs, just a browser. Is this not happening for you? Thanks, Shawn
Re: How to empty SolR Cache
On 10/5/2011 9:18 AM, David GUYOT wrote:
> I'm currently trying to benchmark my Solr install with a custom script, but this benchmark must be run with all Solr caches empty; is there a way to erase Solr caches by a command or to restart Solr with an option to avoid cache autowarming?

Remove any firstSearcher or newSearcher warming queries, and set autowarmCount to 0 on all caches. Once you've done that, all caches for that core will be empty if you restart Solr, reload the core, or do any kind of index update. That will let you get worst-case scenario benchmarks. You may already know this, but I'll say it anyway: those benchmarks will not reflect real-world performance with a typical well-tuned configuration. Thanks, Shawn
more like this
Hi, for my application, I would like to be able to create web queries (wget/curl) that get "more like this" results for either a single arbitrarily specified URL or for the first x terms in a search query. I want to return the results to myself as a CSV file using wt=csv. How can I accomplish the MLT piece of it? Fred Z.
Re: Scoring of DisMax in Solr
Thanks! What's the procedure to report this if it's a bug? EDisMax has similar behavior.

On Tue, Oct 4, 2011 at 11:24 PM, Bill Bell billnb...@gmail.com wrote:
> This seems like a bug to me.
Re: How do i get results for quering with separated words?
Have you tried correcting the spaces with a spelling dictionary? If you build your dictionary from non-tokenized terms, you'll get corrections such as starwars -> Star Wars and super rtl -> superrtl. WDYT? -- Sincerely yours Mikhail (Mike) Khludnev Developer Grid Dynamics tel. 1-415-738-8644 Skype: mkhludnev http://www.griddynamics.com mkhlud...@griddynamics.com
Re: CopyField copying to self
On Thu, Oct 6, 2011 at 1:49 AM, Jamie Johnson jej2...@gmail.com wrote:
> I have a field named test_txt which I am populating in some cases, and not in others. I also have a copy field directive to copy data from _txt to text_txt. Things seem to work, except I believe the field is also copying to itself. Is there any way to avoid this behavior?

Sorry, what do you mean by "the field is also copying to itself"? What are you seeing that leads you to think so? Regards, Gora