Those are interesting results...
I'm not a Unicode specialist, but a Japanese query cannot match Arabic
documents if both of them are correctly encoded.
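
As a rough illustration of that point (a plain Python sketch, not Lucene
code): the helper name `scripts` and the sample strings are made up for
this example, and it assumes the first word of a character's Unicode name
is a good enough stand-in for its script.

```python
import unicodedata


def scripts(text):
    """Return rough script labels for the letters in `text`.

    Assumption: the first word of the Unicode character name
    (e.g. "HIRAGANA", "CJK", "ARABIC") approximates the script.
    """
    labels = set()
    for ch in text:
        if ch.isalpha():
            labels.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return labels


japanese = "日本語のクエリ"  # hypothetical query text
arabic = "مرحبا"            # hypothetical document text

# Correctly encoded text: no characters (and no scripts) in common,
# so a term-level match between the two should not be possible.
no_shared_chars = set(japanese).isdisjoint(set(arabic))
no_shared_scripts = scripts(japanese).isdisjoint(scripts(arabic))

# Wrongly decoded text (UTF-8 bytes read as Latin-1) turns into
# different characters, which is where surprising matches could start.
garbled = japanese.encode("utf-8").decode("latin-1")
garbled_scripts = scripts(garbled)
```

If the two strings share no characters here but still match in the index,
the encoding or the analysis chain is the more likely suspect.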

I can't recommend that use case (a single field for all languages),
but you may want to inspect the "indexed" (analyzed) tokens rather than
the "stored" data.
Are there any CharFilters / TokenFilters that change (or corrupt) tokens
unexpectedly?

Thanks,
Tomoko
