If you're not familiar with the edismax query parser, it's what people
often use to run a query against more than one field without the users
being aware of it. That means you could ngram the e-mail field and,
when a user types something in the search box, search against both the
"all" and "email" fields without them having to know those fields
exist.
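
To make that concrete, here is a rough sketch in Python of the two ideas
together. It only illustrates the concept and is not how Solr implements
it: in Solr the ngramming would happen in the field's index-time analysis
chain (for example with solr.NGramFilterFactory), not in client code. The
gram sizes (3 to 5) and the helper names ngrams() and edismax_params() are
assumptions made up for this example; the "all" and "email" field names
are the ones mentioned above.

```python
from urllib.parse import urlencode


def ngrams(text, min_size=3, max_size=5):
    """Return every substring of text whose length is in [min_size, max_size].

    Illustration only: the sizes are assumed values, not a recommendation.
    """
    text = text.lower()
    grams = set()
    for size in range(min_size, max_size + 1):
        for start in range(len(text) - size + 1):
            grams.add(text[start:start + size])
    return grams


def edismax_params(user_input, fields=("all", "email")):
    """Build request parameters that search several fields with one query."""
    return {
        "defType": "edismax",    # select the edismax query parser
        "q": user_input,         # raw text from the search box
        "qf": " ".join(fields),  # fields to query, space-separated
    }


# Grams that an ngrammed email field would hold for this local part.
indexed = ngrams("sexylover42")
print("lover" in indexed)  # True: "lover" is one of the 5-character grams

# The user never sees the field names; they only type "lover".
print(urlencode(edismax_params("lover")))
```

Note that with these assumed sizes a search term longer than five
characters would not match any single gram, so the maximum gram size
needs to cover the longest substring you expect people to type.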

Best,
Erick

On Wed, Jul 25, 2018 at 6:23 AM, Christopher Schultz
<ch...@christopherschultz.net> wrote:
> Chris,
>
> On 7/24/18 4:46 PM, Chris Hostetter wrote:
>>
>> : We are using Solr as a user index, and users have email addresses.
>> :
>> : Our old search behavior used a SQL substring match for any search
>> : terms entered, and so users are used to being able to search for
>> : e.g. "chr" and finding my email address
>> : ("ch...@christopherschultz.net").
>> :
>> : By default, Solr doesn't perform substring matches, and it might be
>> : difficult to re-train users to use *chr* to find email addresses by
>> : substring.
>>
>> In the past, were you really doing arbitrary substring matching, or
>> just prefix matching? I.e., would a search for "sto" match
>> "ch...@christopherschultz.net"?
>
> Yes. Searching for "sto" would result in a SQL query with a " WHERE
> ... LIKE '%sto%'" clause. So it was slow as hell, of course.
>
>> Personally, if you know you have an email field, I would suggest
>> using a custom tokenizer that splits on "@" and "." (and maybe
>> other punctuation characters like "-") and then take your raw user
>> input and feed it to the prefix parser (instead of requiring your
>> users to add the "*")...
>>
>> q={!prefix f=email v=$user_input}&user_input=chr
>>
>> ...which would match ch...@gmail.com, f...@chris.com, f...@bar.chr
>> etc.
>>
>> (this wouldn't help you though if you *really* want arbitrary
>> substring matching -- as Erick suggested, ngrams are pretty much your
>> best bet for something like that)
>>
>> Bear in mind, you can combine that "forced prefix" query against
>> the (tokenized) email field with other queries that could parse
>> your input in other ways...
>>
>> user_input=...
>> q=({!prefix f=email v=$user_input} OR {!dismax qf="first_name last_name" ..etc.. v=$user_input})
>>
>> so if your user input is "chris" you'll get term matches on the
>> first_name field, or the last_name field as well as prefix matches
>> on the email field.
>
> The problem is that our users (admins) sometimes need to locate users
> by their email address, and people often forget the exact spelling. So
> they'll call and say "I can't get in" and we have to search for "chris
> schultz" and then "chris" and then it turns out that their email
> address was actually sexylove...@yahoo.com, so they often have to try
> a bunch of searches before finding the right user record. Having to
> search for "sexylover42", a complete-match word, isn't going to work
> for their use-case. They need to be able to search for "lover" and
> have it work. I think n-grams sound like the only way to get this
> done. I'll have to play around with it a little bit to see how it
> behaves.
>
> Thanks,
> -chris
