Re: stopwords not working in multicore setup
Ahh, thank you for the hints, Martin... German stopwords without umlauts work correctly. So I'm trying to figure out where the UTF-8 characters are getting mangled.

Using the Solr admin web UI, I searched for title:für. The XML (or JSON) output in the browser shows the query with the proper encoding, but the Solr log shows this:

INFO: [page_30d_de] webapp=/solr path=/select params={explainOther=&fl=*,score&indent=on&start=0&q=title:f?r&hl.fl=&qt=standard&wt=xml&fq=&version=2.2&rows=10} hits=76 status=0 QTime=2

Notice the title:f?r. How do I fix that? I'm using Jetty, by the way.

Thanks for the help.

On Fri, Mar 25, 2011 at 3:05 AM, Martin Rödig r...@shi-gmbh.com wrote:

I have some questions about your config:

Is stopwords-de.txt in the same directory as schema.xml?
Is the title field of type text?
Do you have the same problem with German stopwords without umlauts (ü, ö, ä), such as the word denn?

One possible problem is that stopwords-de.txt is not saved as UTF-8, so the filter cannot read the umlaut ü in the file.

Best regards,
M.Sc. Dipl.-Inf. (FH) Martin Rödig
SHI Elektronische Medien GmbH

-----Original Message-----
From: Christopher Bottaro [mailto:cjbott...@onespot.com]
Sent: Friday, 25 March 2011 05:37
To: solr-user@lucene.apache.org
Subject: stopwords not working in multicore setup

Hello,

I'm running a Solr server with 5 cores. Three are for English content and two are for German content. The default stopwords setup works fine for the English cores, but the German stopwords aren't working.

The German stopwords file is stopwords-de.txt and resides in the same directory as stopwords.txt. The German cores use a different schema (named schema.page.de.xml), which has the following text field definition: http://pastie.org/1711866

The stopwords-de.txt file looks like this: http://pastie.org/1711869

The query I'm running is this:

q = title:für

It returns documents with für in the title. Title is a text field that should use stopwords-de.txt, as seen in the aforementioned pastie.

Any ideas? Thanks for the help.
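The two encoding suspects in this thread (a stopwords file not saved as UTF-8, and a query term not transmitted as UTF-8) can be checked outside Solr. The following is a minimal sketch, not a diagnosis of this particular setup; the helper names is_valid_utf8 and encode_query are made up for illustration.

```python
from urllib.parse import quote


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def encode_query(term: str, charset: str) -> str:
    """Percent-encode a query term using the given charset (hypothetical helper)."""
    return quote(term, encoding=charset)


# The same word as bytes in the two encodings most likely to be involved.
utf8_bytes = "für".encode("utf-8")      # b'f\xc3\xbcr'
latin1_bytes = "für".encode("latin-1")  # b'f\xfcr'

print(is_valid_utf8(utf8_bytes))    # True
print(is_valid_utf8(latin1_bytes))  # False

# How the query term looks on the wire under each charset.
print(encode_query("title:für", "utf-8"))    # title%3Af%C3%BCr
print(encode_query("title:für", "latin-1"))  # title%3Af%FCr
```

To check the actual file, read it in binary mode, for example is_valid_utf8(open("stopwords-de.txt", "rb").read()). Comparing the percent-encoded forms against what the browser actually sends can help narrow down whether the term is damaged before it reaches Solr or only when it is written to the log.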
stopwords not working in multicore setup
Hello,

I'm running a Solr server with 5 cores. Three are for English content and two are for German content. The default stopwords setup works fine for the English cores, but the German stopwords aren't working.

The German stopwords file is stopwords-de.txt and resides in the same directory as stopwords.txt. The German cores use a different schema (named schema.page.de.xml), which has the following text field definition: http://pastie.org/1711866

The stopwords-de.txt file looks like this: http://pastie.org/1711869

The query I'm running is this:

q = title:für

It returns documents with für in the title. Title is a text field that should use stopwords-de.txt, as seen in the aforementioned pastie.

Any ideas? Thanks for the help.
Re: multicore replication slave
Answered my own question. Instead of naming each core in the replication handler, you use a variable:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://solr.mydomain.com:8983/solr/${solr.core.name}/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>

That will get all of your cores replicating.

-- C

On Mon, Oct 11, 2010 at 6:25 PM, Christopher Bottaro cjbott...@onespot.com wrote:

Hello,

I can't get my multicore slave to replicate from the master. The master is set up properly, and the following URLs return OK with "No command", as expected:

http://solr.mydomain.com:8983/solr/core1/replication
http://solr.mydomain.com:8983/solr/core2/replication
http://solr.mydomain.com:8983/solr/core3/replication

The following pastie shows how my slave is set up: http://pastie.org/1214209

But it's not working (i.e. I see no replication attempts in the slave's log).

Any ideas? Thanks for the help.
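To make the effect of the variable concrete, here is a small sketch that mimics the substitution for the three cores named in the original question. Solr performs this expansion itself when it loads each core; the function names below are hypothetical and exist only for illustration.

```python
MASTER_URL_TEMPLATE = (
    "http://solr.mydomain.com:8983/solr/${solr.core.name}/replication"
)


def expand_master_url(template: str, core_name: str) -> str:
    """Substitute the core name for the ${solr.core.name} placeholder."""
    return template.replace("${solr.core.name}", core_name)


def expand_all(template: str, core_names: list[str]) -> dict[str, str]:
    """Map each core name to the master URL that core would poll."""
    return {name: expand_master_url(template, name) for name in core_names}


# Core names taken from the URLs in the original question.
for core, url in expand_all(MASTER_URL_TEMPLATE, ["core1", "core2", "core3"]).items():
    print(core, "->", url)
```

Each slave core ends up polling the replication handler of the master core with the same name, which is why a single shared handler definition is enough.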
multicore replication slave
Hello,

I can't get my multicore slave to replicate from the master. The master is set up properly, and the following URLs return OK with "No command", as expected:

http://solr.mydomain.com:8983/solr/core1/replication
http://solr.mydomain.com:8983/solr/core2/replication
http://solr.mydomain.com:8983/solr/core3/replication

The following pastie shows how my slave is set up: http://pastie.org/1214209

But it's not working (i.e. I see no replication attempts in the slave's log).

Any ideas? Thanks for the help.
How to see the query generated by MoreLikeThisHandler?
Hello,

Is there a way to see exactly what query the MoreLikeThisHandler generates? If I send debugQuery=true, the response contains a key called parsedquery, but it doesn't look quite right. When I make the MoreLikeThis query, I set mlt.fl to title,content, but the query shown in parsedquery does not query title at all, only content.

Furthermore, the query looks something like this:

content:word1 content:word2 content:word3

If I copy and paste that into a standard query, nothing comes back because the default term operator is AND. If I change the query to content:word1 OR content:word2 OR content:word3, I get results, but they are not the same as what the MLT query returns.

Is there a way to see the generated query without actually running it? As of now, I'm making an MLT query with rows=0, but I think it's still running the query, because it takes a nontrivial amount of time and the response still shows numFound.

Thanks for the help,
-- Christopher
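For reference, the request described above can be assembled as follows. This is only a sketch of the parameters mentioned in the message (mlt.fl, rows, debugQuery); the base URL, the /mlt handler path, and the seed query id:123 are assumptions, not details from the original setup.

```python
from urllib.parse import urlencode


def build_mlt_url(base_url: str, seed_query: str, fields: list[str],
                  rows: int = 0, debug: bool = True) -> str:
    """Build a MoreLikeThis request URL with rows and debugQuery set."""
    params = [
        ("q", seed_query),
        ("mlt.fl", ",".join(fields)),
        ("rows", str(rows)),
    ]
    if debug:
        params.append(("debugQuery", "true"))
    return f"{base_url}?{urlencode(params)}"


def or_query(field: str, words: list[str]) -> str:
    """Join terms with an explicit OR, as in the manual rewrite described above."""
    return " OR ".join(f"{field}:{word}" for word in words)


# Hypothetical handler URL and seed document.
url = build_mlt_url("http://localhost:8983/solr/mlt", "id:123", ["title", "content"])
print(url)
print(or_query("content", ["word1", "word2", "word3"]))
```

The explicit OR form matches any of the terms, but it carries no per-term weighting, which may be one reason its results differ from what the handler returns.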
DisMaxRequestHandler questions about bf and bq
Hello,

I have a couple of questions regarding the bf and bq params to the DisMaxRequestHandler.

1) Can I specify them more than once? Ex: bf=log(popularity)&bf=log(comment_count)

2) When using bq, how can I specify what score to use for documents not returned by the query? In other words, how do I mimic this behavior using bq: bf=query($qq, 0.1)&qq=site:news.yahoo.com

Thanks for the help!
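The two parameter combinations from the questions above look like this once they are encoded as separate, ampersand-delimited request parameters. The sketch shows only the encoding and does not demonstrate how the handler scores the result; the main query term obama and the function name are placeholders.

```python
from urllib.parse import urlencode


def build_query_string(q: str, extra_params: list[tuple[str, str]]) -> str:
    """Encode q plus extra parameters; a list of pairs allows repeated keys."""
    return urlencode([("q", q)] + extra_params)


# Question 1: the same parameter name given twice.
repeated_bf = build_query_string(
    "obama",
    [("bf", "log(popularity)"), ("bf", "log(comment_count)")],
)
print(repeated_bf)

# Question 2: a function query that dereferences a second parameter, $qq.
dereferenced = build_query_string(
    "obama",
    [("bf", "query($qq, 0.1)"), ("qq", "site:news.yahoo.com")],
)
print(dereferenced)
```

Using a list of pairs rather than a dict matters here, because a dict would silently keep only one of the two bf values.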
Boost a document score via query using MoreLikeThisHandler
Hello,

Is it possible to boost a document's score based on something like fq=site(com.google*)? In other words, I want to boost the score of documents whose site field starts with com.google. I'm using the MoreLikeThisHandler.

Thanks for the help,
-- Christopher
Re: Boost a document score via query using MoreLikeThisHandler
On Mon, Mar 1, 2010 at 7:36 PM, Christopher Bottaro cjbott...@onespot.com wrote:

Hello,

Is it possible to boost a document's score based on something like fq=site(com.google*)? In other words, I want to boost the score of documents whose site field starts with com.google. I'm using the MoreLikeThisHandler.

Thanks for the help,
-- Christopher

OK, I think I need to do this with BoostQParserPlugin and nested queries, but I can't quite figure it out.

So this works:

q={!boost b=log(popularity)}(title:barack OR title:obama)

But instead of boosting by popularity, I want to boost by site:

q={!boost b=query({ !query q='site:*.yahoo.com' })}(title:barack OR title:obama)

That doesn't work. This is the exception I get:

org.apache.lucene.queryParser.ParseException: Expected identifier at pos 18 str='{!boost b=query({ !query q='site:*.yahoo.com' })}(title:barack OR title:obama)'

Any tips? Thanks.
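One thing that might be worth trying is moving the nested query into its own request parameter and referencing it as $qq, the same dereferencing pattern used in the earlier bf=query($qq, 0.1) example. The sketch below only builds and round-trips such a request; it has not been verified against the setup in this thread, and the default value of 1.0 for non-matching documents is an assumption chosen so that a multiplicative boost leaves them unchanged.

```python
from urllib.parse import urlencode, parse_qs


def build_boosted_request(main_query: str, boost_query: str,
                          default: float = 1.0) -> str:
    """Build q and qq parameters, with the boost query passed by reference."""
    # Plain concatenation avoids clashing with the braces of the local params.
    q = "{!boost b=query($qq," + str(default) + ")}" + main_query
    return urlencode({"q": q, "qq": boost_query})


encoded = build_boosted_request("(title:barack OR title:obama)", "site:*.yahoo.com")
print(encoded)

# Decode again to confirm what the server would receive.
decoded = parse_qs(encoded)
print(decoded["q"][0])
print(decoded["qq"][0])
```

Keeping the site query in a separate parameter avoids nesting one set of local params inside another, which is where the parser reported the error.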