from:"\"Lee Gee\""

Folding of accented to non-accented only — leaving symbols

2014-10-13 Thread Lee Gee

I know the asciifolding filter docs are really very clear on this, but it 
took me an embarrassingly long time to realise I was losing my currency 
symbol (£) to the ASCII folding filter.

Other than creating my own character map with the char map filter, does 
there exist something of production quality that would translate accented 
UTF-8 characters of the Latin alphabet into non-accented characters in the 
ASCII range?

TIA
Lee
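
For illustration only, here is a minimal Python sketch of the behaviour being asked for: fold accented Latin letters to their base letters while leaving symbols such as £ untouched. It works outside Elasticsearch, using Unicode decomposition, and the function name `fold_accents` is made up for this example. Letters that have no canonical decomposition (for instance ø or ß) pass through unchanged, so this is narrower than what the asciifolding filter does.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Strip combining marks from letters while leaving symbols such as £ alone."""
    # NFD splits "é" into "e" followed by a combining acute accent (U+0301).
    decomposed = unicodedata.normalize("NFD", text)
    # Drop only the combining marks; every other character passes through.
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose whatever is left so the result is in NFC form.
    return unicodedata.normalize("NFC", kept)


print(fold_accents("Café crème £5"))  # Cafe creme £5
```

The same idea could be applied on the client before indexing, or used as a reference when building a mapping character filter by hand.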

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ff95c6ec-7907-454e-bd58-774ee173f4e3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Pattern replace apostrophes?

2014-10-09 Thread Lee Gee

The problem was that it was not an apostrophe, but an opening single quote. 

Have increased editor font size to address this issue.
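
A larger font helps, but the code points can also be checked directly. The sketch below is a hedged illustration in Python, not part of the original configuration: it prints the Unicode name of each lookalike character and strips all of them, which is roughly what a pattern-replace rule would need to cover. The names `LOOKALIKES`, `describe` and `strip_quote_marks` are invented for this example.

```python
import unicodedata

# Characters that are easily mistaken for one another at small font sizes.
LOOKALIKES = ["'", "\u2018", "\u2019", "`"]


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def strip_quote_marks(text: str) -> str:
    """Remove the apostrophe and its common lookalikes from a string."""
    return "".join(ch for ch in text if ch not in LOOKALIKES)


for ch in LOOKALIKES:
    print(describe(ch))

print(strip_quote_marks("aaa\u2018s"))  # aaas
```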

On Tuesday, October 7, 2014 8:00:13 PM UTC+1, Ivan Brusic wrote:
>
> What type of query are you using? Perhaps the query you are using is not 
> using the same analyzer at search time.
>
> -- 
> Ivan
>
> On Tue, Oct 7, 2014 at 6:06 AM, Lee Gee wrote:
>
>> My users have issues with apostrophes: I need to index and search "aaa's" 
>> as it is, and without the apostrophe, as "aaas".
>>
>> If I use a char_filter to remove apostrophes when indexing and when 
>> searching, the _analyze endpoint shows me that they produce 'words' without 
>> apostrophes like this (respectively):
>>
>>   {...   {
>>   end_offset => 5,
>>   position => 1,
>>   start_offset => 0,
>>   token => "aaas",
>>   type => "word",
>>   }  }
>>
>>   {
>>   end_offset => 5,
>>   position => 1,
>>   start_offset => 0,
>>   token => "aaas",
>>   type => "word",
>> },
>>
>> But there seems to be nothing I can do to find "aaas" / "aaa's" when 
>> searching!
>>
>> Is this expected?  
>>
>> TIA
>> Lee
>>
>
>


Ampersand synonym in YAML?

2014-10-07 Thread Lee Gee

  name_synonyms:
    type: synonym
    synonyms:
      - "1,one"
      # - "&,and,+=>and"
      - '& => and'
  
How can I use YAML to correctly configure a synonym for ampersands and the 
'plus' symbol and the word 'and'?

The above synonym for 1/one seems to work.

Thanks
Lee
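
One thing that is certain on the YAML side: an unquoted scalar cannot start with `&`, because YAML reads that as an anchor, so the rule has to be quoted (single or double quotes both work). As a hedged alternative, the sketch below builds the same filter definition as a JSON settings body in Python, which avoids YAML quoting altogether. The rule `"&, + => and"` is an assumption about the intended mapping, and whether `&` and `+` survive as tokens depends on the tokenizer in use.

```python
import json

# Filter definition mirroring the YAML above; the second rule is an assumed
# explicit mapping from "&" and "+" to the word "and".
settings = {
    "analysis": {
        "filter": {
            "name_synonyms": {
                "type": "synonym",
                "synonyms": [
                    "1, one",
                    "&, + => and",
                ],
            }
        }
    }
}

# JSON has no anchor syntax, so "&" needs no special treatment here.
body = json.dumps(settings, indent=2)
print(body)
```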


Re: Pattern replace apostrophes?

2014-10-07 Thread Lee Gee

The index uses the keyword tokenizer, with edge-ngram (and other) filters — 
it only wants to match from the start of the string, for autocomplete.

The search analyser is also keyword, with various filters.

The pattern-replace filter for apostrophes is applied to both.

On Tuesday, October 7, 2014 8:00:13 PM UTC+1, Ivan Brusic wrote:
>
> What type of query are you using? Perhaps the query you are using is not 
> using the same analyzer at search time.
>
> -- 
> Ivan
>
> On Tue, Oct 7, 2014 at 6:06 AM, Lee Gee wrote:
>
>> My users have issues with apostrophes: I need to index and search "aaa's" 
>> as it is, and without the apostrophe, as "aaas".
>>
>> If I use a char_filter to remove apostrophes when indexing and when 
>> searching, the _analyze endpoint shows me that they produce 'words' without 
>> apostrophes like this (respectively):
>>
>>   {...   {
>>   end_offset => 5,
>>   position => 1,
>>   start_offset => 0,
>>   token => "aaas",
>>   type => "word",
>>   }  }
>>
>>   {
>>   end_offset => 5,
>>   position => 1,
>>   start_offset => 0,
>>   token => "aaas",
>>   type => "word",
>> },
>>
>> But there seems to be nothing I can do to find "aaas" / "aaa's" when 
>> searching!
>>
>> Is this expected?  
>>
>> TIA
>> Lee
>>
>
>


Pattern replace apostrophes?

2014-10-07 Thread Lee Gee

My users have issues with apostrophes: I need to index and search "aaa's" 
as it is, and without the apostrophe, as "aaas".

If I use a char_filter to remove apostrophes when indexing and when 
searching, the _analyze endpoint shows me that they produce 'words' without 
apostrophes like this (respectively):

  {...   {
      end_offset => 5,
      position => 1,
      start_offset => 0,
      token => "aaas",
      type => "word",
  }  }

  {
      end_offset => 5,
      position => 1,
      start_offset => 0,
      token => "aaas",
      type => "word",
  },

But there seems to be nothing I can do to find "aaas" / "aaa's" when 
searching!

Is this expected?  

TIA
Lee


Re: Sorting equal scores by field length?

2014-10-07 Thread Lee Gee

Thank you, Vineeth!

On Thursday, October 2, 2014 2:41:42 PM UTC+1, vineeth mohan wrote:
>
> Hello Lee , 
>
> There is a workaround for that.
> You need to enable word count and then use that in the script.
>
> Word count - 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html
> Script in scoring - 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-advanced-scripting.html#_term_statistics_2
>
> HTH.
>
> Thanks
>   Vineeth
>
> On Thu, Oct 2, 2014 at 4:33 AM, Lee Gee wrote:
>
>> Is it possible to sort equally-scored results by the length of the field?
>>
>> Or am I doing something else incorrectly?
>>
>> With an edge_ngram filter on a keyword field, with search term S, I see 
>> SUPER comes before S in my results.
>>
>> As a last resort, I could add a field to reflect the length of the 
>> keyword field, or even turn on dynamic scripting, but I imagine I am 
>> missing something vital.
>>
>> Thanks
>> Lee
>>
>
>


Sorting equal scores by field length?

2014-10-02 Thread Lee Gee

Is it possible to sort equally-scored results by the length of the field?

Or am I doing something else incorrectly?

With an edge_ngram filter on a keyword field, with search term S, I see 
SUPER comes before S in my results.

As a last resort, I could add a field to reflect the length of the keyword 
field, or even turn on dynamic scripting, but I imagine I am missing 
something vital.

Thanks
Lee
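
The "extra length field" option mentioned above can be sketched as follows. This is a hedged illustration: `name_length` is a hypothetical field that would have to be populated at index time, and the Python function only mimics the intended ordering on the client (score descending, then shorter values first).

```python
# Sort clause for a search request body: score first, then the hypothetical
# numeric field "name_length" as a tie-breaker.
SORT_CLAUSE = ["_score", {"name_length": {"order": "asc"}}]


def rank(hits):
    """Order hits by descending score, breaking ties by shorter name."""
    return sorted(hits, key=lambda h: (-h["score"], len(h["name"])))


hits = [
    {"name": "SUPER", "score": 1.0},
    {"name": "S", "score": 1.0},
    {"name": "SUN", "score": 1.0},
    {"name": "STAR", "score": 2.0},
]

print([h["name"] for h in rank(hits)])  # ['STAR', 'S', 'SUN', 'SUPER']
```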


Re: edge_ngram results

2014-10-02 Thread Lee Gee

The problem was that my test script did not pause between 
creating/populating the index, and searching on it. Even though there are 
very few documents (10), Elasticsearch still needs a second or two to catch 
its breath and mop its brow before it is ready to search.
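
Rather than sleeping, a test script can ask for an explicit refresh, since newly indexed documents only become searchable after the next refresh (by default about once a second). The sketch below only builds the request. The host and the index name `autocomplete_test` are placeholders, not taken from the original script.

```python
import urllib.request


def refresh_request(base_url: str, index: str) -> urllib.request.Request:
    """Build a POST request for the index's _refresh endpoint."""
    return urllib.request.Request(f"{base_url}/{index}/_refresh", method="POST")


# Placeholder host and index name, assumed for this example.
req = refresh_request("http://localhost:9200", "autocomplete_test")
print(req.get_method(), req.full_url)

# In a real test script, urllib.request.urlopen(req) would be called here,
# after populating the index and before running the first search.
```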

Now to find a way to rank shorter strings higher than longer ones, but 
that's another question.

thanks
Lee

On Wednesday, October 1, 2014 11:24:17 AM UTC+1, Lee Gee wrote:
>
> I have an Elasticsearch string field configured for autocomplete like this:
>
> autocomplete_analyzer:
>   type: custom
>   tokenizer: whitespace
>   filter: [ lowercase, asciifolding, ending_synonym, name_synonyms, autocomplete_filter ]
>
> autocomplete_filter:
>   type: edge_ngram
>   min_gram: 1
>   max_gram: 20
>   token_chars: [ letter, digit, whitespace, punctuation, symbol ]
>
> search_analyzer:
>   type: custom
>   tokenizer: whitespace
>   filter: [ lowercase, asciifolding, standard, name_synonyms, ending_synonym ]
>
>
>
> I have a record where the field contains 'S XYZ', and lots of other 
> records where the field contains other words beginning S.
>
> I do not understand why, when I search for 'S XYZ', it is not the first 
> result.
>
> Could someone please explain ?
>
> Many thanks in anticipation
> lee
>
>


Re: edge_ngram results

2014-10-02 Thread Lee Gee

'explain' shows only two differences between the two results:

Hit on 'S' vs. hit on 'DqWjDCcsh S'

* idf(docFreq=1, maxDocs=1) vs. idf(docFreq=10, maxDocs=10)

* fieldNorm(doc=0) vs. fieldNorm(doc=9)

My possibly flawed understanding is that IDF is the inverse document 
frequency of the search term across the whole index — what confuses me is 
that these are results for the same term in the same index, so shouldn't 
the IDF be the same...?
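
One possible explanation, offered as an assumption rather than a diagnosis: by default term statistics are gathered per shard, so two hits living on different shards can be scored against different `docFreq` and `maxDocs` values. The differing `maxDocs` (1 vs. 10) in the output above would be consistent with that. The sketch below evaluates the IDF formula of Lucene's classic TF-IDF similarity, idf = 1 + ln(maxDocs / (docFreq + 1)), for both pairs.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """IDF as computed by Lucene's classic TF-IDF similarity."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


# The two (docFreq, maxDocs) pairs reported by 'explain' above.
print(round(classic_idf(1, 1), 4))    # 0.3069
print(round(classic_idf(10, 10), 4))  # 0.9047
```

If per-shard statistics are indeed the cause, running the search with `search_type=dfs_query_then_fetch`, or testing against a single-shard index, should make the two IDF values agree.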

tia
lee

On Wednesday, October 1, 2014 11:24:17 AM UTC+1, Lee Gee wrote:
>
> I have an Elasticsearch string field configured for autocomplete like this:
>
> autocomplete_analyzer:
>   type: custom
>   tokenizer: whitespace
>   filter: [ lowercase, asciifolding, ending_synonym, name_synonyms, autocomplete_filter ]
>
> autocomplete_filter:
>   type: edge_ngram
>   min_gram: 1
>   max_gram: 20
>   token_chars: [ letter, digit, whitespace, punctuation, symbol ]
>
> search_analyzer:
>   type: custom
>   tokenizer: whitespace
>   filter: [ lowercase, asciifolding, standard, name_synonyms, ending_synonym ]
>
>
>
> I have a record where the field contains 'S XYZ', and lots of other 
> records where the field contains other words beginning S.
>
> I do not understand why, when I search for 'S XYZ', it is not the first 
> result.
>
> Could someone please explain ?
>
> Many thanks in anticipation
> lee
>
>


edge_ngram results

2014-10-01 Thread Lee Gee

I have an Elasticsearch string field configured for autocomplete like this:

autocomplete_analyzer:
  type: custom
  tokenizer: whitespace
  filter: [ lowercase, asciifolding, ending_synonym, name_synonyms, autocomplete_filter ]

autocomplete_filter:
  type: edge_ngram
  min_gram: 1
  max_gram: 20
  token_chars: [ letter, digit, whitespace, punctuation, symbol ]

search_analyzer:
  type: custom
  tokenizer: whitespace
  filter: [ lowercase, asciifolding, standard, name_synonyms, ending_synonym ]



I have a record where the field contains 'S XYZ', and lots of other records 
where the field contains other words beginning S.

I do not understand why, when I search for 'S XYZ', it is not the first 
result.

Could someone please explain ?

Many thanks in anticipation
lee
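
To see which terms the two analyzers would produce, here is a simplified Python imitation of the chain above. It is a sketch under assumptions: only whitespace tokenizing, lowercasing and edge n-grams are modelled, and the asciifolding and synonym filters are ignored. The function names are invented for this example.

```python
def edge_ngrams(token: str, min_gram: int = 1, max_gram: int = 20) -> list:
    """Return the prefixes of a token between min_gram and max_gram characters."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(text: str) -> list:
    """Index side: whitespace tokenizer, lowercase, then edge n-grams."""
    terms = []
    for token in text.lower().split():
        terms.extend(edge_ngrams(token))
    return terms


def search_terms(text: str) -> list:
    """Search side: whitespace tokenizer and lowercase only."""
    return text.lower().split()


print(index_terms("S XYZ"))   # ['s', 'x', 'xy', 'xyz']
print(index_terms("Super"))   # ['s', 'su', 'sup', 'supe', 'super']
print(search_terms("S XYZ"))  # ['s', 'xyz']
```

In this simplified model the search term `s` matches every record containing a word that starts with S, so only `xyz` separates 'S XYZ' from the rest.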


Re: Faceted navigation with totals, drilling down

2014-09-12 Thread Lee Gee

Thank you, all.

On Sunday, September 7, 2014 1:57:40 PM UTC+1, Lee Gee wrote:
>
> I have two 'types' in an index, or two indices of different types (I'd 
> prefer the latter but can live with the former).
>
> I'm running an aggregation by type to implement what my UX people refer to 
> as faceted search, which makes Googling for ES help quite tricky.
>
> UX would like to filter by type but retain a count for total hits in each 
> aggregation bucket, that is, the total number of each type of record that 
> matches the query.
>
> Can this be done in one query?
>
> Failing that, can two queries be supplied/run in parallel?
>
> Thanks
> Lee
>


Faceted navigation with totals, drilling down

2014-09-07 Thread Lee Gee

I have two 'types' in an index, or two indices of different types (I'd 
prefer the latter but can live with the former).

I'm running an aggregation by type to implement what my UX people refer to 
as faceted search, which makes Googling for ES help quite tricky.

UX would like to filter by type but retain a count for total hits in each 
aggregation bucket, that is, the total number of each type of record that 
matches the query.

Can this be done in one query?

Failing that, can two queries be supplied/run in parallel?

Thanks
Lee
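
One way this is commonly done in a single request is with `post_filter`: aggregations are computed over everything the query matches, and the filter is applied to the returned hits afterwards. The sketch below builds such a request body in Python. The field `title` and the type name `book` are hypothetical, chosen only for the example.

```python
import json


def faceted_search(query_text: str, selected_type: str) -> dict:
    """Build a search body whose buckets ignore the type filter."""
    return {
        # "title" is a hypothetical field name.
        "query": {"match": {"title": query_text}},
        # Bucket counts cover every document matched by the query.
        "aggs": {"by_type": {"terms": {"field": "_type"}}},
        # Hits are narrowed to the selected type after aggregation.
        "post_filter": {"term": {"_type": selected_type}},
    }


body = faceted_search("elasticsearch", "book")
print(json.dumps(body, indent=2))
```

If two separate queries are still preferred, the multi search endpoint (`_msearch`) accepts several search requests in one call.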


Weight by position in string?

2014-08-27 Thread Lee Gee

Is it possible to boost the weight of a result if the match is closer to the 
start of a string in the index?  So that searching for 'bar' would weight 
'foo bar baz' higher than 'foo baz bar'?

I'm working with ngrams, if that helps.

Thanks
Lee
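
To make the desired ranking concrete, here is a small Python sketch of a position-based weight. It is an illustration of the idea only, not an Elasticsearch feature: the 1 / (1 + position) formula and the function name are assumptions. On the server side, the `span_first` query, which restricts matches to the first N positions of a field, may be one place to start.

```python
def position_weight(field_value: str, term: str) -> float:
    """Weight a match by how early the matching word appears in the field."""
    words = field_value.lower().split()
    for position, word in enumerate(words):
        # Prefix match, in the spirit of an n-gram based autocomplete.
        if word.startswith(term.lower()):
            return 1.0 / (1 + position)
    return 0.0  # no word starts with the term


print(position_weight("foo bar baz", "bar"))  # 0.5
print(position_weight("bar foo baz", "bar"))  # 1.0
```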


Re: _suggest suggestion/question

2014-08-26 Thread Lee Gee

Thank you, Vineeth.

On Sunday, August 17, 2014 12:04:20 PM UTC+1, vineeth mohan wrote:
>
> Hello Lee ,
>
> You will need to use context suggester for this purpose - 
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/suggester-context.html
>
> Also this difference stems from the fact that , both actual data and auto 
> completion data are stored in different data structures.
> This is to make sure that the auto completion data is memory resident and 
> thus super fast.
>
> Thanks
>   Vineeth
>
>
> On Sun, Aug 17, 2014 at 3:32 PM, Lee Gee wrote:
>
>> My reading, which may not be accurate, of this [1] clear and concise post, 
>> is that it is not possible to use a reference to an existing field as an 
>> argument to a suggester's 'input' or 'payload' fields.
>>
>> Please would you clarify if I have missed something?
>>
>> If I was correct, would it be much work to add these features?
>>
>> TIA
>> Lee
>>
>> [1] http://www.elasticsearch.org/blog/you-complete-me/
>>
>
>


Re: Swap indexes?

2014-08-26 Thread Lee Gee

I was looking for the index alias, thanks all.
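
For reference, an alias swap can be expressed as a single `_aliases` request whose actions are applied together, so searches against the alias never see a gap. The sketch below builds that body in Python. The alias and index names are placeholders.

```python
import json


def swap_alias_actions(alias: str, old_index: str, new_index: str) -> dict:
    """Build the body for POST /_aliases that repoints an alias."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Placeholder names, assumed for this example.
body = swap_alias_actions("products", "products_v1", "products_v2")
print(json.dumps(body, indent=2))
```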

On Tuesday, June 17, 2014 9:31:00 AM UTC+1, Lee Gee wrote:
>
> Is it possible to have one ES instance create an index and then have a 
> second instance use that created index, without downtime?
>
> tia
> lee
>


_suggest suggestion/question

2014-08-17 Thread Lee Gee

My reading, which may not be accurate, of this [1] clear and concise post, 
is that it is not possible to use a reference to an existing field as an 
argument to a suggester's 'input' or 'payload' fields.

Please would you clarify if I have missed something?

If I was correct, would it be much work to add these features?

TIA
Lee

[1] http://www.elasticsearch.org/blog/you-complete-me/
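
If the suggester cannot reference another field, a client-side workaround is to copy the value into the suggest field before indexing. The sketch below is hedged accordingly: the field names `name`, `id` and `suggest` are hypothetical, and it assumes the completion field is mapped to accept payloads.

```python
def add_suggest_field(doc: dict, source_field: str = "name",
                      suggest_field: str = "suggest") -> dict:
    """Return a copy of doc with a completion entry built from an existing field."""
    enriched = dict(doc)  # shallow copy; the caller's dict is left untouched
    enriched[suggest_field] = {
        "input": [doc[source_field]],
        "payload": {"id": doc["id"]},
    }
    return enriched


doc = {"id": 42, "name": "Nevermind"}
print(add_suggest_field(doc))
```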


Re: Support for Anchoring in Elasticsearch Regex

2014-07-29 Thread Lee Gee

Lucene and Elasticsearch both anchor regular expressions by default.

"Lucene’s patterns are always anchored. The pattern provided must match the 
entire string. "

— 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html#regexp-syntax
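
Because the whole string must match, a user-supplied pattern that relies on `^` and `$` can be rewritten for an always-anchored engine by padding the unanchored sides with `.*`. The sketch below is a naive illustration: it only handles a leading `^` and a trailing `$` that apply to the whole pattern, and Python's `re.fullmatch` is used purely as a stand-in for Lucene's whole-string matching, whose syntax differs in other respects.

```python
import re


def to_fully_anchored(user_pattern: str) -> str:
    """Rewrite a pattern with optional ^/$ anchors for an always-anchored engine."""
    body = user_pattern
    prefix, suffix = ".*", ".*"
    if body.startswith("^"):
        body, prefix = body[1:], ""   # anchored at the start: no leading padding
    if body.endswith("$") and not body.endswith("\\$"):
        body, suffix = body[:-1], ""  # anchored at the end: no trailing padding
    # Group the body so alternations keep their meaning next to the padding.
    return f"{prefix}({body}){suffix}"


print(to_fully_anchored("^foo"))  # (foo).*
print(to_fully_anchored("bar$"))  # .*(bar)
print(to_fully_anchored("baz"))   # .*(baz).*
print(bool(re.fullmatch(to_fully_anchored("bar$"), "foobar")))  # True
```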


On Wednesday, December 18, 2013 7:19:48 AM UTC, Vaidik Kapoor wrote:
>
> Hi Folks,
>
> I see that Elasticsearch supports Regex. But that is limited to Lucene's 
> Regex Engine which does not support anchoring i.e. the entire string will 
> always be anchored. This works as long as you have fixed regular 
> expressions to run, but in cases where the regex query is taken from the 
> user, this becomes very limiting.
>
> Is there an alternative regex engine for Elasticsearch that at least 
> supports $ and ^ for anchoring? Quick Google and Github search did not get 
> me anything. If not, then is anybody doing something similar or have a work 
> around? One possible solution that I can think of is converting user's 
> entered regex to Lucene compatible regex. But that gets really complex to 
> do correctly with all the grouping and alternation in regex.
>
> I don't want the entire Perl regex kind of support. Just the anchoring bit 
> is important. Has anybody tried to solve this problem before?
>
> Thanks,
> Vaidik Kapoor
> vaidikkapoor.info
>  


Swap indexes?

2014-06-17 Thread Lee Gee

Is it possible to have one ES instance create an index and then have a 
second instance use that created index, without downtime?

tia
lee

