: Since Edge N-gram tokens are a subset of N-gram tokens, I was wondering if
: I could be a bit more space efficient.

Whether to use edge n-grams really depends on your goal, and on whether you actually want terms for those overlapping n-gram tokens. If you just want to match the terms you already have, in a way that favors documents where those terms appear at a lower term position, then something like a "SpanFirstQuery" is probably more appropriate.

Off the top of my head, I'm not sure there is an easy way to generate a SpanFirstQuery in Solr at the moment. The "surround" parser supports SpanQueries through a special syntax, but I don't think it supports SpanFirstQuery, so writing your own quick and dirty SpanFirstQParser (based on TermQParser) might be the best way to go.

-Hoss
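
To make the position constraint concrete, here is a minimal plain-Java sketch of the match rule a SpanFirstQuery wrapped around a single term applies: the term matches only if it ends at or before a given position. It does not use Lucene or Solr at all; the class name `SpanFirstSketch`, the method `matchesFirst`, and the sample tokens are hypothetical names made up for illustration, and scoring is left out entirely.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not Lucene or Solr code.
final class SpanFirstSketch {

    // Mimics the match rule of a SpanFirstQuery around a single term:
    // a term at 0-based position p covers the span [p, p + 1), and it
    // matches only when that span ends at or before `end` (p + 1 <= end).
    static boolean matchesFirst(List<String> tokens, String term, int end) {
        int limit = Math.min(end, tokens.size());
        for (int p = 0; p < limit; p++) {
            if (tokens.get(p).equals(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed output of an analyzer for one field value.
        List<String> doc = Arrays.asList("solr", "edge", "ngram", "filter");
        System.out.println(matchesFirst(doc, "edge", 2));   // "edge" is at position 1
        System.out.println(matchesFirst(doc, "filter", 2)); // "filter" is at position 3
    }
}
```

In a real SpanFirstQParser the same constraint would presumably be expressed by wrapping a SpanTermQuery in a SpanFirstQuery with the desired end position, rather than by scanning tokens by hand.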