Ahh, thank you for the hints Martin... German stopwords without Umlaut work correctly.
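To rule out Martin's other suggestion (that stopwords-de.txt might not be saved as UTF-8), something like the following Python sketch could be used. The function name and the file path in the usage note are placeholders, and treating lines that start with # as comments is an assumption about the file's layout, not something confirmed in this thread.

```python
from pathlib import Path


def check_stopwords_encoding(path):
    """Return (is_utf8, words) for a stopwords file.

    is_utf8 is False when the bytes are not valid UTF-8, in which case
    words is empty. Lines starting with '#' are skipped as comments
    (an assumption about the file format).
    """
    raw = Path(path).read_bytes()
    try:
        # "utf-8-sig" accepts UTF-8 with or without a leading byte order mark.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        return False, []
    words = [
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    return True, words
```

Calling check_stopwords_encoding("conf/stopwords-de.txt") (hypothetical path) and checking that the first value is True and that "für" appears in the returned list would show whether the file itself is the problem.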
So I'm trying to figure out where the UTF-8 characters are getting messed up. Using the Solr admin web UI, I searched for title:für. The XML (or JSON) output in the browser shows the query with the proper encoding, but the Solr log shows this:

INFO: [page_30d_de] webapp=/solr path=/select params={explainOther=&fl=*,score&indent=on&start=0&q=title:f?r&hl.fl=&qt=standard&wt=xml&fq=&version=2.2&rows=10} hits=76 status=0 QTime=2

Notice the title:f?r. How do I fix that? I'm using Jetty, by the way.

Thanks for the help.

On Fri, Mar 25, 2011 at 3:05 AM, Martin Rödig <r...@shi-gmbh.com> wrote:
> I have some questions about your config:
>
> Is stopwords-de.txt in the same directory as schema.xml?
> Is the title field of type text?
> Do you have the same problem with German stopwords without an umlaut
> (ü, ö, ä), like the word "denn"?
>
> One problem could be that stopwords-de.txt is not saved as UTF-8, so the
> filter cannot read the umlaut ü in the file.
>
> Kind regards,
> M.Sc. Dipl.-Inf. (FH) Martin Rödig
> SHI Elektronische Medien GmbH
>
> -----Original Message-----
> From: Christopher Bottaro [mailto:cjbott...@onespot.com]
> Sent: Friday, March 25, 2011 05:37
> To: solr-user@lucene.apache.org
> Subject: stopwords not working in multicore setup
>
> Hello,
>
> I'm running a Solr server with 5 cores. Three are for English content and
> two are for German content. The default stopwords setup works fine for the
> English cores, but the German stopwords aren't working.
>
> The German stopwords file is stopwords-de.txt and resides in the same
> directory as stopwords.txt. The German cores use a different schema (named
> schema.page.de.xml) which has the following text field definition:
> http://pastie.org/1711866
>
> The stopwords-de.txt file looks like this: http://pastie.org/1711869
>
> The query I'm doing is this: q => "title:für"
>
> And it's returning documents with für in the title. Title is a text field
> which should use the stopwords-de.txt, as seen in the aforementioned pastie.
>
> Any ideas? Thanks for the help.
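
P.S. To illustrate what I think the title:f?r in the log might mean, here is a small Python sketch of how the same query looks under different charsets. It only shows the general mechanics of percent-encoding and charset mismatches. It does not reproduce anything Jetty or Solr actually does, and I can't tell from the log alone whether the query is decoded wrongly or whether only the log writer is replacing the ü.

```python
from urllib.parse import quote, unquote

query = "title:für"

# The same query percent-encoded under two different charsets.
as_utf8 = quote(query, encoding="utf-8")      # 'title%3Af%C3%BCr'
as_latin1 = quote(query, encoding="latin-1")  # 'title%3Af%FCr'

# Decoding with a charset other than the one used for encoding garbles the umlaut.
utf8_read_as_latin1 = unquote(as_utf8, encoding="latin-1")  # 'title:fÃ¼r'
latin1_read_as_utf8 = unquote(as_latin1, encoding="utf-8")  # 'title:f\ufffdr'

# A correctly decoded string written through an ASCII-only writer
# also ends up with a "?" in place of the umlaut.
logged_as_ascii = query.encode("ascii", errors="replace").decode("ascii")  # 'title:f?r'

for label, value in [
    ("utf-8 encoded", as_utf8),
    ("latin-1 encoded", as_latin1),
    ("utf-8 read as latin-1", utf8_read_as_latin1),
    ("latin-1 read as utf-8", latin1_read_as_utf8),
    ("written as ascii", logged_as_ascii),
]:
    print(f"{label}: {value!r}")
```

The last case produces exactly the f?r seen in the log, so the "?" by itself does not prove the query reached Solr in the wrong encoding. The charset used to decode request parameters and the charset used to write the log would both need checking.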