Re: fieldtype for name
Thanks. I don't necessarily need to match 'dick' to 'robert'; I need to search for:

'name surname'
'name, surname'
'surname name'
'surname, name'

and nothing else. I don't need to worry about nicknames or abbreviations of a name, just the above variations. I think I might use text_ws.
Re: fieldtype for name
Also, I'm allowing users to enter a name in quotes to search for an exact name. At the moment only "smith, robert" returns any results, whereas *robert smith* returns all variations, including 'smith, herbert'.
Re: fieldtype for name
Hi,

Without seeing the configs, I would guess that the default query operator is OR (also check the docs for the mm parameter on the Wiki), or that ngrams are involved. The former is more likely.

Otis
Solr ElasticSearch Support
http://sematext.com/
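To illustrate the ngram guess, here is a minimal Python sketch of how character n-grams could make 'bob' and 'rob' overlap on 'ob'. The gram sizes are assumptions, since the thread never shows the analyzer settings, and the functions only mimic what an n-gram token filter does; they are not Solr code.

```python
def ngrams(term, min_n, max_n):
    """All character n-grams of term with lengths min_n..max_n."""
    return {
        term[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(term) - n + 1)
    }


def edge_ngrams(term, min_n, max_n):
    """Only the n-grams anchored at the front of term (prefixes)."""
    return {term[:n] for n in range(min_n, min(max_n, len(term)) + 1)}


# Plain n-grams of sizes 2..3 (assumed sizes) share the gram 'ob'.
print(ngrams("bob", 2, 3) & ngrams("rob", 2, 3))            # {'ob'}

# Front-edge n-grams of sizes 1..3 share nothing, since the terms start differently.
print(edge_ngrams("bob", 1, 3) & edge_ngrams("rob", 1, 3))  # set()
```

If a field indexes grams like these, a query for bob can hit any document containing a term that shares a gram, which would be consistent with the 'ob' match described in the original question.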
Re: fieldtype for name
Hi,

My schema file is here: http://pastebin.com/ArY7xVUJ

Query (name:'ian paisley') returns ~3000 results.
Query (name:'paisley, ian') returns ~250 results. That is how the name is stored, so it returns just the results for that person. I need all variations to return these 250 results.
Query (name:*ian paisley*) returns ~8000 results, which is acceptable as I know it has a wildcard.

Thanks
Re: fieldtype for name
Try q=name:(ian paisley)&q.op=AND

Does that work better for you? It would also match Ian James Paisley, but not Ian Jackson.

Upayavira
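For anyone sending this query from application code, here is a minimal Python sketch that builds the request URL with q and q.op=AND. The host, port, core name (collection1) and the helper name build_select_url are assumptions for illustration; only the two query parameters come from this thread.

```python
from urllib.parse import urlencode


def build_select_url(base_url, field, terms):
    """Build a Solr /select URL that requires every term to match in field."""
    params = {
        # Group the terms in parentheses so they all apply to the same field.
        "q": f"{field}:({' '.join(terms)})",
        # Require all terms instead of the default OR behaviour.
        "q.op": "AND",
    }
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical base URL; replace with the real Solr core location.
url = build_select_url("http://localhost:8983/solr/collection1", "name", ["ian", "paisley"])
print(url)
```

Because the terms are ANDed rather than matched as a phrase, word order and the comma in the stored form should not matter, provided the field's tokenizer drops the punctuation.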
Re: fieldtype for name
Brilliant! Thank you!

On Wed, Jan 9, 2013 at 1:37 PM, Upayavira u...@odoko.co.uk wrote:
q=name:(ian paisley)&q.op=AND
Re: fieldtype for name
Or if synonyms are involved, which they likely aren't in this case. For name matching, though, I'd think one would want them, perhaps on another copy of the name field to allow strict vs. nickname matching.

Otis
Solr ElasticSearch Support
http://sematext.com/

On Jan 8, 2013 9:35 AM, Shawn Heisey s...@elyograg.org wrote:

On 1/8/2013 7:30 AM, Michael Jones wrote:
Hi, What would be the best fieldtype for a person's name? At the moment I'm using text_general, but if I search for bob smith, some of the results I get back might be rob thomas, in that it has matched 'ob'. But I only really want results that are either 'bob smith', 'bob, smith', 'smith, bob' or 'smith bob'.

A search for bob smith could only match rob thomas if the fieldtype includes the edge ngram filter, or if you do a fuzzy term search, or if the fieldtype includes a stemming filter that turns bob, robert, and rob into the same root word. Eliminate all those things and it should work as you expect.

Thanks,
Shawn
Re: fieldtype for name
Hi Michael,

in our index of bibliographic metadata, we see the need for at least three fields:

- name_facet: String as type, because the facet should represent the original inverted format from our data.
- name: TextField for searching. This field is heavily analyzed to match different orders, synonyms, phonetic similarity, German umlauts and other European stuff.
- name_lc: TextField. This field is just mapped to lower case. It's used to boost docs written in the same style as the user's input.

Uwe

On 08.01.2013 at 15:30, Michael Jones wrote:
Hi, What would be the best fieldtype for a person's name? At the moment I'm using text_general, but if I search for bob smith, some of the results I get back might be rob thomas, in that it has matched 'ob'. But I only really want results that are either 'bob smith', 'bob, smith', 'smith, bob' or 'smith bob'.

Thanks
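Uwe's three-field layout can be pictured with a small Python sketch that derives one value per field from a single stored name. In Solr this work would normally be done by the field types' analyzers in the schema, not by application code. The function derive_name_fields is hypothetical, and the synonym, phonetic and umlaut handling mentioned for the name field is deliberately left out.

```python
import re


def derive_name_fields(raw_name):
    """Mimic the three fields: untouched facet value, order-free search tokens, lowercased copy."""
    # Lowercase, split on non-word characters, and drop empty pieces (e.g. after a comma).
    tokens = [t for t in re.split(r"\W+", raw_name.lower()) if t]
    return {
        "name_facet": raw_name,        # original inverted format, kept verbatim
        "name": sorted(tokens),        # order-independent tokens for searching
        "name_lc": raw_name.lower(),   # lowercased only, usable for boosting
    }


print(derive_name_fields("Paisley, Ian"))
```

With the tokens sorted, 'Paisley, Ian' and 'ian paisley' reduce to the same search value, while the facet value still shows the name exactly as it was stored.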