This may not be a practically solvable problem, but the company I work for has a large number of lengthy mixed-language documents - for example, scholarly articles about Islam written in English but containing lengthy passages of Arabic. Ideally, we would like users to be able to search both the English and Arabic portions of the text, using the full complement of language-processing tools such as stemming and stopword removal.
The problem, of course, is that these two languages co-occur in the same field. Is there any way to apply different processing to different words or paragraphs within a single field through language detection? Is this to all intents and purposes impossible within Solr? Or is another approach (using language detection to split the single large field into language-differentiated smaller fields, for example) possible/recommended?

Thanks,
Tim Hill
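
As a rough illustration of the field-splitting idea mentioned above, here is a minimal sketch of a pre-indexing step. It rests on several assumptions: it guesses by script (Unicode ranges for Arabic) rather than doing real language detection, it works at paragraph granularity, and the names `split_by_script`, `text_en`, and `text_ar` are hypothetical, not anything defined by Solr.

```python
import re

# Arabic script blocks: Arabic, Arabic Supplement, Arabic Extended-A,
# and Arabic Presentation Forms A and B.
ARABIC = re.compile(
    r"[\u0600-\u06FF\u0750-\u077F\u08A0-\u08FF\uFB50-\uFDFF\uFE70-\uFEFF]"
)


def is_arabic(paragraph, threshold=0.5):
    """Return True if at least `threshold` of the letters are Arabic script."""
    letters = [ch for ch in paragraph if ch.isalpha()]
    if not letters:
        return False
    arabic = sum(1 for ch in letters if ARABIC.match(ch))
    return arabic / len(letters) >= threshold


def split_by_script(text):
    """Route each blank-line-separated paragraph to a per-language list.

    The keys are hypothetical field names; each list could be sent to a
    separate field that has its own analysis chain.
    """
    fields = {"text_en": [], "text_ar": []}
    for paragraph in re.split(r"\n\s*\n", text):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        key = "text_ar" if is_arabic(paragraph) else "text_en"
        fields[key].append(paragraph)
    return fields
```

A paragraph that is mostly English with a few quoted Arabic words would land in `text_en` under this heuristic, so short inline quotations would not get Arabic-specific processing; whether that is acceptable depends on the documents.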