GitHub user xristy commented on the issue:
https://github.com/apache/jena/pull/489
@rvesse, that's correct. Some of our configurations have 4 or 5
language-tagged fields for different encodings, e.g., sa-x-iast, sa-alalc97,
sa-x-aux-ndia, and so forth. Our dataset of ~33M triples shows a savings in the
Lucene index of ~15%.
