Re: fieldtype for name

2013-01-09 Thread Michael Jones
Thanks. It isn't necessarily the need to match 'dick' to 'robert', but to
search for:
'name surname'
'name, surname'
'surname name'
'surname, name'

And nothing else. I don't need to worry about nicknames or abbreviations
of a name, just the above variations. I think I might use text_ws.
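
One thing worth checking before switching to text_ws: assuming it is the stock whitespace-only field type from the Solr example schema, the comma in 'smith, bob' would stay attached to the token. The Python sketch below only illustrates that behaviour. It is not Solr code, and the helper names are made up for the example.

```python
import re


def whitespace_tokens(text):
    """Split on whitespace only, roughly what a whitespace-only tokenizer does."""
    return text.split()


def name_tokens(text):
    """Lowercase, then split on commas and whitespace, dropping empty pieces."""
    return [t for t in re.split(r"[,\s]+", text.lower()) if t]


variants = ["bob smith", "bob, smith", "smith, bob", "smith bob"]

# Whitespace-only splitting keeps the comma glued to the surname token.
print(whitespace_tokens("smith, bob"))

# Comma-aware splitting reduces all four variants to one and the same token set.
print({frozenset(name_tokens(v)) for v in variants})
```

If that assumption holds, the comma would have to be stripped at both index and query time (or a different tokenizer used) before all four variations can match.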


On Tue, Jan 8, 2013 at 9:39 PM, Uwe Reh r...@hebis.uni-frankfurt.de wrote:

 Hi Michael,

 in our index of bibliographic metadata, we see the need for at least three
 fields:
 - name_facet: String as type, because the facet should represent
 the original inverted format from our data.
 - name: TextField for searching. This field is heavily analyzed to match
 different orders, to match synonyms, phonetic similarity, German umlauts
 and other European stuff.
 - name_lc: TextField. This field is just mapped to lower case. It's used
 to boost docs with the same style of writing as the user's input.

 Uwe

 Am 08.01.2013 15:30, schrieb Michael Jones:

  Hi,

 What would be the best fieldtype for a person's name? At the moment I'm
 using text_general, but if I search for bob smith, some results I get back
 might be rob thomas, in that it has matched 'ob'.

 But I only really want results that are either

 'bob smith'
 'bob, smith'
 'smith, bob'
 'smith bob'

 Thanks





Re: fieldtype for name

2013-01-09 Thread Michael Jones
Also, I'm allowing users to enter a name in quotes to search for an
exact name. So at the moment only 'smith, robert' returns any results,
whereas *robert smith* returns all variations, including 'smith, herbert'.


On Wed, Jan 9, 2013 at 11:09 AM, Michael Jones michaelj...@gmail.com wrote:

 Thanks. It isn't necessarily the need to match 'dick' to 'robert', but to
 search for:
 'name surname'
 'name, surname'
 'surname name'
 'surname, name'

 And nothing else. I don't need to worry about nicknames or abbreviations
 of a name, just the above variations. I think I might use text_ws.



Re: fieldtype for name

2013-01-09 Thread Otis Gospodnetic
Hi,

Without seeing the configs, I would guess that the default query operator is
OR (also check the docs for the mm parameter on the Wiki), or that ngrams are
involved. The former is more likely.

Otis
Solr & ElasticSearch Support
http://sematext.com/
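
The effect of the default operator can be sketched with plain token sets. This is a toy Python model rather than Solr's query parser, and the stored names are invented sample data loosely based on the names mentioned in this thread.

```python
def tokens(name):
    """Lowercase a name and split it on commas and whitespace."""
    return set(name.lower().replace(",", " ").split())


def matches(query, stored, operator="OR"):
    """OR: any query term is present. AND: every query term is present."""
    q, d = tokens(query), tokens(stored)
    if operator == "AND":
        return q <= d
    return bool(q & d)


# Invented sample data, for illustration only.
docs = ["smith, robert", "smith, herbert", "jones, robert", "thomas, rob"]

or_hits = [d for d in docs if matches("robert smith", d, "OR")]
and_hits = [d for d in docs if matches("robert smith", d, "AND")]

print(or_hits)   # every doc sharing at least one term with the query
print(and_hits)  # only docs containing both terms
```

With OR, 'smith, herbert' qualifies simply because it shares the surname, which would be in line with the inflated result counts described here.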
On Jan 9, 2013 6:16 AM, Michael Jones michaelj...@gmail.com wrote:

 Also, I'm allowing users to enter a name in quotes to search for an
 exact name. So at the moment only 'smith, robert' returns any results,
 whereas *robert smith* returns all variations, including 'smith, herbert'.



Re: fieldtype for name

2013-01-09 Thread Michael Jones
Hi,

My schema file is here http://pastebin.com/ArY7xVUJ

Query (name:'ian paisley') returns ~3000 results
Query (name:'paisley, ian') returns ~250 results. That is how the name is
stored, so it returns just the results for that person.

I need all variations to return those 250 results.

Query (name:*ian paisley*) returns ~8000 results, which is acceptable as I
know it has a wildcard.

Thanks


On Wed, Jan 9, 2013 at 12:56 PM, Otis Gospodnetic 
otis.gospodne...@gmail.com wrote:

 Hi,

 Without seeing the configs, I would guess that the default query operator is
 OR (also check the docs for the mm parameter on the Wiki), or that ngrams are
 involved. The former is more likely.

 Otis
 Solr & ElasticSearch Support
 http://sematext.com/



Re: fieldtype for name

2013-01-09 Thread Upayavira
Try q=name:(ian paisley)&q.op=AND

Does that work better for you?

It would also match Ian James Paisley, but not Ian Jackson.

Upayavira
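
For anyone sending this query from a client, the two parameters could be assembled roughly as follows. The host, port, and core name in the URL are placeholders rather than details from this thread, and the sketch does not escape special characters in user input.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; adjust host, port, and core name to the real instance.
SOLR_SELECT_URL = "http://localhost:8983/solr/collection1/select"


def build_name_query(first, last, field="name"):
    """Build a select URL that requires both terms to occur in the field."""
    params = {
        "q": f"{field}:({first} {last})",  # grouped terms, any order
        "q.op": "AND",                     # every term must match
    }
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


url = build_name_query("ian", "paisley")
print(url)
```

Because the terms are grouped rather than quoted as a phrase, word order should not matter, so 'paisley ian' would be treated the same as 'ian paisley'.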

On Wed, Jan 9, 2013, at 01:30 PM, Michael Jones wrote:
 Hi,
 
 My schema file is here http://pastebin.com/ArY7xVUJ
 
 Query (name:'ian paisley') returns ~3000 results
 Query (name:'paisley, ian') returns ~250 results. That is how the name is
 stored, so it returns just the results for that person.
 
 I need all variations to return those 250 results.
 
 Query (name:*ian paisley*) returns ~8000 results, which is acceptable as I
 know it has a wildcard.
 
 Thanks
 
 


Re: fieldtype for name

2013-01-09 Thread Michael Jones
Brilliant! Thank you!

On Wed, Jan 9, 2013 at 1:37 PM, Upayavira u...@odoko.co.uk wrote:

 q=name:(ian paisley)&q.op=AND


Re: fieldtype for name

2013-01-08 Thread Otis Gospodnetic
Or if synonyms are involved, which they likely aren't in this case,
although for name matching I'd think one would want them, perhaps on
another copy of the name field to allow strict vs. nickname matching.

Otis
Solr & ElasticSearch Support
http://sematext.com/
On Jan 8, 2013 9:35 AM, Shawn Heisey s...@elyograg.org wrote:

 On 1/8/2013 7:30 AM, Michael Jones wrote:

 Hi,

 What would be the best fieldtype for a person's name? At the moment I'm
 using text_general, but if I search for bob smith, some results I get back
 might be rob thomas, in that it has matched 'ob'.

 But I only really want results that are either

 'bob smith'
 'bob, smith'
 'smith, bob'
 'smith bob'


 A search for bob smith could only match rob thomas if the fieldtype
 includes the edge ngram filter, or if you do a fuzzy term search, or if the
 fieldtype includes a stemming filter that turns bob, robert, and rob into
 the same root word.  Eliminate all those things and it should work like you
 expect.

 Thanks,
 Shawn
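
The reported match on 'ob' would be consistent with n-gram style indexing. The Python sketch below is a simplified stand-in for Solr's n-gram filters, with gram sizes 2 to 3 picked arbitrarily for the example. It shows which fragments 'bob' and 'rob' would share under plain n-grams and under front-edge n-grams.

```python
def ngrams(term, min_n=2, max_n=3):
    """All contiguous substrings of length min_n..max_n."""
    return {
        term[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(term) - n + 1)
    }


def edge_ngrams(term, min_n=2, max_n=3):
    """Prefixes of length min_n..max_n (front edge only)."""
    return {term[:n] for n in range(min_n, min(max_n, len(term)) + 1)}


# Plain n-grams: the two names share the fragment 'ob'.
print(ngrams("bob") & ngrams("rob"))

# Front-edge n-grams: the names start differently, so nothing is shared.
print(edge_ngrams("bob") & edge_ngrams("rob"))
```

In this simplified model only the plain n-gram case produces the overlap, so it may be worth checking exactly which n-gram filter, if any, the field type uses.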




Re: fieldtype for name

2013-01-08 Thread Uwe Reh

Hi Michael,

in our index of bibliographic metadata, we see the need for at least
three fields:
- name_facet: String as type, because the facet should represent
the original inverted format from our data.
- name: TextField for searching. This field is heavily analyzed to match
different orders, to match synonyms, phonetic similarity, German umlauts
and other European stuff.
- name_lc: TextField. This field is just mapped to lower case. It's used
to boost docs with the same style of writing as the user's input.


Uwe
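
The three-field layout can be pictured as one source value fanned out into three differently processed copies. The Python sketch below is only an illustration: the function name is hypothetical, and the 'name' entry merely lowercases and sorts the tokens, standing in for the much heavier analysis (synonyms, phonetics, umlaut handling) described above.

```python
def build_name_fields(raw_name):
    """Derive three differently processed copies from one stored name."""
    parts = [p for p in raw_name.replace(",", " ").split() if p]
    return {
        # Untouched original, suitable for faceting and display.
        "name_facet": raw_name,
        # Simplified stand-in for the analyzed search field:
        # lowercased tokens with word order removed.
        "name": sorted(p.lower() for p in parts),
        # Lowercased only, keeps the original order and punctuation.
        "name_lc": raw_name.lower(),
    }


print(build_name_fields("Paisley, Ian"))
```

In a real schema this fan-out would more likely be done with copyField rules than in client code.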

Am 08.01.2013 15:30, schrieb Michael Jones:

Hi,

What would be the best fieldtype for a person's name? At the moment I'm
using text_general, but if I search for bob smith, some results I get back
might be rob thomas, in that it has matched 'ob'.

But I only really want results that are either

'bob smith'
'bob, smith'
'smith, bob'
'smith bob'

Thanks