[ https://issues.apache.org/jira/browse/SOLR-248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12498492 ]
Ryan McKinley commented on SOLR-248:
------------------------------------

> 1) would it make sense for the keep option to refer to a file, using the same
> format as StopFilter ... that way it's easy to reuse the same file (which
> seems like it would be a common case).

Probably. That is a good idea.

> 2) what is the point of forceFirstLetter="true" ? ... if you want to force
> capitalization, what's the point of making the keep list?

This is one that came of necessity! With keep="the ..." and input:
  "Grand army of the Republic", "the arts"
I want:
  "Grand Army of the Republic" and "The Arts"

"forceFirstLetter" only applies to the first character in the token, not to each word.

> 3) is okPrefix going to force the case for things that have that prefix in an
> alternate case, or only allow that casing to remain (ie: if i index McKeen,
> Mckeen, mckeen and MCKEEN what tokens do i wind up with?)

As written, if the prefix matches, it assumes the word's capitalization is correct. For my input data, this is sufficient -- but it should probably do something smarter.

So, if you index "McKeen, Mckeen, mckeen, MCKEEN and McKEEN", you would get:
  "McKeen, Mckeen, Mckeen, Mckeen And McKEEN"

If "okPrefix" were treated as *the* capitalization for input where the lowercased prefix matches "mck", it would give:
  "McKeen, McKeen, McKeen, McKeen And McKeen"

> Capitalization Filter Factory
> -----------------------------
>
>                 Key: SOLR-248
>                 URL: https://issues.apache.org/jira/browse/SOLR-248
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Ryan McKinley
>            Priority: Minor
>         Attachments: SOLR-248-CapitalizationFilter.patch
>
>
> For tokens that are used in faceting, it is nice to have standard capitalization.
> I want "Aerial views" and "Aerial Views" to both be: "Aerial Views"
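The keep, forceFirstLetter and okPrefix behaviour discussed in the comment can be sketched roughly as follows. This is only a minimal Python illustration of the rules as described, not the code in SOLR-248-CapitalizationFilter.patch. The function names, the `prefix_is_canonical` switch (standing in for the "smarter" okPrefix variant), and the keep list `("the", "of")` are all assumptions made for the example.

```python
def _capitalize_word(word, keep_set, ok_prefixes, prefix_is_canonical):
    """Capitalize one whitespace-delimited word (hypothetical helper)."""
    # Words on the keep list stay lowercase.
    if word.lower() in keep_set:
        return word.lower()
    for prefix in ok_prefixes:
        if prefix_is_canonical:
            # Proposed variant: the prefix defines *the* capitalization
            # whenever the lowercased word starts with the lowercased prefix.
            if word.lower().startswith(prefix.lower()):
                return prefix + word[len(prefix):].lower()
        elif word.startswith(prefix):
            # Behaviour "as written": an exact-case prefix match means the
            # word's existing capitalization is trusted and left alone.
            return word
    # Default: upper-case the first character, lower-case the rest.
    return word[:1].upper() + word[1:].lower()


def capitalize_token(token, keep=(), ok_prefixes=(),
                     force_first_letter=True, prefix_is_canonical=False):
    """Normalize the capitalization of a whole token (e.g. a facet value)."""
    keep_set = {w.lower() for w in keep}
    words = [
        _capitalize_word(w, keep_set, ok_prefixes, prefix_is_canonical)
        for w in token.split()
    ]
    result = " ".join(words)
    # forceFirstLetter applies to the first character of the token only,
    # so a leading keep word such as "the" still becomes "The".
    if force_first_letter and result:
        result = result[0].upper() + result[1:]
    return result


if __name__ == "__main__":
    keep = ("the", "of")  # assumed keep list for illustration
    print(capitalize_token("Grand army of the Republic", keep=keep))
    print(capitalize_token("the arts", keep=keep))
    names = "McKeen, Mckeen, mckeen, MCKEEN and McKEEN"
    print(capitalize_token(names, ok_prefixes=("McK",)))
    print(capitalize_token(names, ok_prefixes=("McK",), prefix_is_canonical=True))
```

Run as a script, the four calls reproduce the strings quoted in the comment: "Grand Army of the Republic", "The Arts", "McKeen, Mckeen, Mckeen, Mckeen And McKEEN", and, with the canonical-prefix variant, "McKeen, McKeen, McKeen, McKeen And McKeen".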