Here's the sample code. Incidentally, this is in C#. I am using Lucene.NET, but I assume the problem is not specific to the .NET port, so the question seems best put to the collective wisdom of the Java user group.
The default list of stop words, several of which coincide with ISO country codes:

    public string[] DEFAULT_STOP_WORDS = {
        "a", "and", "are", "as", "at", "be", "but", "by", "for", "if",
        "in", "into", "is", "no", "not", "of", "on", "or", "s", "such",
        "t", "that", "the", "their", "then", "there", "these", "they",
        "this", "to", "was", "will", "with",
        "inc", "incorporated", "co.", "ltd", "ltd."
    };

Create an array containing the stop words, but with the ISO country code equivalents omitted:

    public string[] MY_STOP_WORDS = {
        "a", "and", "are", "as", "but", "by", "for", "if",
        "in", "into", "is", "no", "not", "of", "on", "or", "s", "such",
        "t", "that", "the", "their", "then", "there", "these", "they",
        "this", "to", "was", "will", "with",
        "inc", "incorporated", "co.", "ltd", "ltd."
    };

Next, create the query and submit it to the searcher, passing the MY_STOP_WORDS array as a parameter to the standard analyzer:

    Query query = QueryParser.Parse(strQuery, "company_name", new StandardAnalyzer(MY_STOP_WORDS));
    Hits hits = searcher.Search(query);

Note that the default field for the query object is company_name. However, multi-field queries are submitted to the query object in the variable "strQuery". For example:

    +(company_name:widgets ^10~ international ^5~ incorporated~ ) +(country_iso:US)

There is a bit of logic elsewhere in my application that constructs this syntax based on field names and values submitted via the UI. However, if one of those country code values is "AT", "BE", "IT", "IN", etc., then the query is erroneously constructed as follows, with the country code missing:

    +(company_name:belgium ^10~ telecom ^5~ ) +(country_iso:)

Note that the country_iso clause is empty. If a query is submitted to the search object in this way, then I receive the following exception at runtime.

    Lucene.Net.QueryParsers.ParseException was unhandled by user code
      Message="Encountered \")\" at line 1, column 60.\r\nWas expecting one of:\r\n \"(\" ...\r\n <QUOTED> ...\r\n <TERM> ...\r\n <PREFIXTERM> ...\r\n <WILDTERM> ...\r\n \"[\" ...\r\n \"{\" ...\r\n <NUMBER> ...\r\n "
      Source="Lucene.Net"
      StackTrace:
        at Lucene.Net.QueryParsers.QueryParser.jj_consume_token(Int32 kind)
        at Lucene.Net.QueryParsers.QueryParser.Clause(String field)
        at Lucene.Net.QueryParsers.QueryParser.Query(String field)

And finally, here is how I am creating my index:

    doc.Add(Field.Keyword("rec_id", entity_id.Trim()));
    doc.Add(Field.Text("aaa", ob10_account_id.Trim()));
    doc.Add(Field.Text("company_name", entity_name.Trim()));
    doc.Add(Field.Text("VAT_reg", VAT_reg.Trim()));
    doc.Add(Field.Text("account_type_description", account_type_description.Trim()));
    doc.Add(Field.Text("account_type", account_type.Trim()));
    doc.Add(Field.Text("add_line1", add_line1.Trim()));
    doc.Add(Field.Text("add_line2", add_line2.Trim()));
    doc.Add(Field.Text("add_line3", add_line3.Trim()));
    doc.Add(Field.Text("add_line4", add_line4.Trim()));
    doc.Add(Field.Text("add_line5", add_line5.Trim()));
    doc.Add(Field.Text("add_line6", add_line6.Trim()));
    doc.Add(Field.Keyword("country_iso", country_iso.Trim()));
    doc.Add(Field.Text("country_name", country_name.Trim()));
    doc.Add(Field.Text("entity_status_desc", entity_status_desc.Trim()));
    doc.Add(Field.Text("acct_status_desc", acct_status_desc.Trim()));
    doc.Add(Field.Text("firstname", firstname.Trim()));
    doc.Add(Field.Text("lastname", lastname.Trim()));
    writer.AddDocument(doc);

Have I submitted my custom stop words incorrectly? Should I somehow use a per-field analyzer for the country_iso field? If so, which?

Thanks so much in advance for your help.