Folding of accented to non-accented only — leaving symbols

2014-10-13 Thread Lee Gee

I know the asciifolding filter docs are really very clear on this, but it 
took me an embarrassingly long time to realise I was losing my currency 
symbol (£) to the ASCII folding filter.

Other than creating my own character map with the mapping char filter, is 
there something of production quality that would translate accented 
UTF-8 characters of the Latin alphabet into non-accented characters in the 
ASCII range?

TIA
Lee
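
To picture the behaviour asked for above, here is a minimal Python sketch that works outside Elasticsearch: it decomposes each character and drops only the combining marks, so symbols such as £ survive. It is an illustration of the idea, not a drop-in analysis filter. The function name fold_accents is made up for this example, and letters with no canonical decomposition (ø, ß) pass through unchanged.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Strip combining accents but leave symbols such as the pound sign alone."""
    # NFD splits 'é' into 'e' plus a combining acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop only the combining marks; every other character is kept as-is.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose anything that is still composable.
    return unicodedata.normalize("NFC", stripped)


print(fold_accents("Café crème £5"))  # Cafe creme £5
```

Whether this matches what asciifolding does for every Latin letter is an open question; the sketch only demonstrates the accents-only rule.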

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ff95c6ec-7907-454e-bd58-774ee173f4e3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Pattern replace apostrophes?

2014-10-09 Thread Lee Gee

The problem was that it was not an apostrophe, but an opening single quote.

Have increased editor font size to address this issue.
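
For anyone hitting the same thing, a hedged sketch: one pattern that covers the ASCII apostrophe and the typographic single quotes (U+2018, U+2019), usable in a pattern_replace char filter. The filter name strip_quotes is hypothetical, and the regex is checked here with Python's re module rather than against a live node.

```python
import json
import re

# ASCII apostrophe plus left/right typographic single quotes.
QUOTE_PATTERN = "['\u2018\u2019]"

# Settings fragment for a pattern_replace char filter (name is illustrative).
settings = {
    "analysis": {
        "char_filter": {
            "strip_quotes": {
                "type": "pattern_replace",
                "pattern": QUOTE_PATTERN,
                "replacement": "",
            }
        }
    }
}


def strip_quotes(text: str) -> str:
    """Local stand-in for the char filter, to test the pattern."""
    return re.sub(QUOTE_PATTERN, "", text)


print(strip_quotes("aaa's"), strip_quotes("aaa\u2018s"))  # aaas aaas
print(json.dumps(settings, ensure_ascii=False, indent=2))
```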

On Tuesday, October 7, 2014 8:00:13 PM UTC+1, Ivan Brusic wrote:

What type of query are you using? Perhaps the query you are using is not
using the same analyzer at search time.

--
Ivan

On Tue, Oct 7, 2014 at 6:06 AM, Lee Gee lee...@gmail.com wrote:

My users have issues with apostrophes: I need to index and search aaa's
as it is, and without the apostrophe, as aaas.

If I use a char_filter to remove apostrophes when indexing and when
searching, the _analyze endpoint shows me that they produce 'words' without
apostrophes like this (respectively):

{... {
end_offset = 5,
position = 1,
start_offset = 0,
token = aaas,
type = word,
} }

{
end_offset = 5,
position = 1,
start_offset = 0,
token = aaas,
type = word,
},

But there seems to be nothing I can do to find aaas / aaa's when
searching!

Is this expected?

TIA
Lee


Re: Pattern replace apostrophes?

2014-10-08 Thread Lee Gee

The index uses the keyword tokenizer, with edge-ngram (and other) filters —
it only wants to match from the start of the string, for autocomplete.

The search analyser is also keyword, with various filters.

The pattern-replace filter for apostrophes is applied to both.

On Tuesday, October 7, 2014 8:00:13 PM UTC+1, Ivan Brusic wrote:

What type of query are you using? Perhaps the query you are using is not
using the same analyzer at search time.

--
Ivan

On Tue, Oct 7, 2014 at 6:06 AM, Lee Gee lee...@gmail.com wrote:

My users have issues with apostrophes: I need to index and search aaa's
as it is, and without the apostrophe, as aaas.

If I use a char_filter to remove apostrophes when indexing and when
searching, the _analyze endpoint shows me that they produce 'words' without
apostrophes like this (respectively):

{... {
end_offset = 5,
position = 1,
start_offset = 0,
token = aaas,
type = word,
} }

{
end_offset = 5,
position = 1,
start_offset = 0,
token = aaas,
type = word,
},

But there seems to be nothing I can do to find aaas / aaa's when
searching!

Is this expected?

TIA
Lee


Ampersand synonym in YAML?

2014-10-08 Thread Lee Gee

  name_synonyms:
    type: synonym
    synonyms:
      - 1,one
      # - ,and,+=and
      - ' = and'
  
How can I use YAML to correctly configure a synonym for ampersands and the 
'plus' symbol and the word 'and'?

The above synonym for 1/one seems to work.

Thanks
Lee
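
A possible angle on the quoting, offered as a sketch rather than a confirmed fix: in YAML a plain scalar cannot start with &, because & introduces an anchor, and a # after whitespace starts a comment, so rules containing those characters are safest inside quotes. The example assumes the Solr-style "a, b => c" rule syntax and a tokenizer that keeps & and + as tokens. It uses Python's json module only because a JSON string is also a valid YAML double-quoted scalar.

```python
import json

# Illustrative rules; the second maps the ampersand and plus sign to 'and'.
synonyms = ["1, one", "&, + => and"]

# Emit each rule as a quoted YAML list item.
for rule in synonyms:
    print("- " + json.dumps(rule))
```

The printed lines can be pasted under the synonyms: key with the appropriate indentation.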


Pattern replace apostrophes?

2014-10-07 Thread Lee Gee

My users have issues with apostrophes: I need to index and search aaa's 
as it is, and without the apostrophe, as aaas.

If I use a char_filter to remove apostrophes when indexing and when 
searching, the _analyze endpoint shows me that they produce 'words' without 
apostrophes like this (respectively):

  {...   {
  end_offset = 5,
  position = 1,
  start_offset = 0,
  token = aaas,
  type = word,
  }  }

  {
  end_offset = 5,
  position = 1,
  start_offset = 0,
  token = aaas,
  type = word,
},

But there seems to be nothing I can do to find aaas / aaa's when 
searching!

Is this expected?  

TIA
Lee


Re: edge_ngram results

2014-10-02 Thread Lee Gee

'explain' shows only two differences between the two results:

Hit on 'S' vs. hit on 'DqWjDCcsh S'

* idf(docFreq=1, maxDocs=1) vs. idf(docFreq=10, maxDocs=10)

* fieldNorm(doc=0) vs. fieldNorm(doc=9)

My possibly flawed understanding is that IDF is the inverse document
frequency of the search term across the whole index — what confuses me is
that these are results for the same term in the same index, so shouldn't
the IDF be the same...?

tia
lee
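
As a worked check on those two idf values, here is a small sketch assuming the classic Lucene TF-IDF similarity, where idf = 1 + ln(maxDocs / (docFreq + 1)). The statistics in an explain output come from whichever shard (or index state) scored that document, so the same term can legitimately show different docFreq and maxDocs values.

```python
import math


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Classic Lucene idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


print(round(classic_idf(1, 1), 4))    # 0.3069
print(round(classic_idf(10, 10), 4))  # 0.9047
```

So a hit scored against one document gets roughly a third of the idf weight of a hit scored against ten.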

On Wednesday, October 1, 2014 11:24:17 AM UTC+1, Lee Gee wrote:

I have an ElasticSearch string field configured for autocomplete like this:

autocomplete_analyzer:
  type: custom
  tokenizer: whitespace
  filter: [ lowercase, asciifolding, ending_synonym, name_synonyms, autocomplete_filter ]

autocomplete_filter:
  type: edge_ngram
  min_gram: 1
  max_gram: 20
  token_chars: [ letter, digit, whitespace, punctuation, symbol ]

search_analyzer:
  type: custom
  tokenizer: whitespace
  filter: [ lowercase, asciifolding, standard, name_synonyms, ending_synonym ]

I have a record where the field contains 'S XYZ', and lots of other
records where the field contains other words beginning S.

I do not understand why, when I search for 'S XYZ', it is not the first
result.

Could someone please explain ?

Many thanks in anticipation
lee


Re: edge_ngram results

2014-10-02 Thread Lee Gee

The problem was that my test script did not pause between 
creating/populating the index, and searching on it. Even though there are 
very few documents (10), ElasticSearch still needs a second or two to catch 
its breath and mop its brow before it is ready to search.

Now to find a way to rank shorter strings higher than longer ones, but 
that's another question.

thanks
Lee
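
Instead of sleeping in a test script, an explicit refresh between indexing and searching should do the job. A minimal sketch using only the standard library: the host and the index name autocomplete_test are assumptions, and the request is built but not sent, so it runs without a node.

```python
import urllib.request


def refresh_request(base_url: str, index: str) -> urllib.request.Request:
    """Build a POST to the index's _refresh endpoint."""
    return urllib.request.Request(f"{base_url}/{index}/_refresh", method="POST")


req = refresh_request("http://localhost:9200", "autocomplete_test")
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req)  # uncomment when a node is listening
```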

On Wednesday, October 1, 2014 11:24:17 AM UTC+1, Lee Gee wrote:

 I have an ElasticSearch string field configured for autocomplete like this:

 autocomplete_analyzer:
   type: custom
   tokenizer: whitespace
   filter: [ lowercase, asciifolding, ending_synonym, name_synonyms, autocomplete_filter ]

 autocomplete_filter:
   type: edge_ngram
   min_gram: 1
   max_gram: 20
   token_chars: [ letter, digit, whitespace, punctuation, symbol ]

 search_analyzer:
   type: custom
   tokenizer: whitespace
   filter: [ lowercase, asciifolding, standard, name_synonyms, ending_synonym ]



 I have a record where the field contains 'S XYZ', and lots of other 
 records where the field contains other words beginning S.

 I do not understand why, when I search for 'S XYZ', it is not the first 
 result.

 Could someone please explain ?

 Many thanks in anticipation
 lee




Sorting equal scores by field length?

2014-10-02 Thread Lee Gee

Is it possible to sort equally-scored results by the length of the field?

Or am I doing something else incorrectly?

With an edge_ngram filter on a keyword field, with search term S, I see 
SUPER comes before S in my results.

As a last resort, I could add a field to reflect the length of the keyword 
field, or even turn on dynamic scripting, but I imagine I am missing 
something vital.

Thanks
Lee
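
The "extra length field" fallback could look roughly like this. It is a sketch, with the field names name and name_length chosen purely for illustration: the length is stored at index time and used as a secondary sort key after _score.

```python
import json


def with_length(doc: dict, field: str = "name") -> dict:
    """Return a copy of the document with a <field>_length value added."""
    enriched = dict(doc)
    enriched[field + "_length"] = len(doc[field])
    return enriched


# Sort by relevance first, then break ties with the shorter value.
search_body = {
    "query": {"match": {"name": "S"}},
    "sort": ["_score", {"name_length": {"order": "asc"}}],
}

print(with_length({"name": "SUPER"}))
print(json.dumps(search_body))
```

The tie-break only takes effect when scores are exactly equal, which is the case described in this message.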


edge_ngram results

2014-10-01 Thread Lee Gee

I have an ElasticSearch string field configured for autocomplete like this:

autocomplete_analyzer:
  type: custom
  tokenizer: whitespace
  filter: [ lowercase, asciifolding, ending_synonym, name_synonyms, autocomplete_filter ]

autocomplete_filter:
  type: edge_ngram
  min_gram: 1
  max_gram: 20
  token_chars: [ letter, digit, whitespace, punctuation, symbol ]

search_analyzer:
  type: custom
  tokenizer: whitespace
  filter: [ lowercase, asciifolding, standard, name_synonyms, ending_synonym ]



I have a record where the field contains 'S XYZ', and lots of other records 
where the field contains other words beginning S.

I do not understand why, when I search for 'S XYZ', it is not the first 
result.

Could someone please explain ?

Many thanks in anticipation
lee
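
To make the setup concrete, here is a rough Python imitation of what the whitespace tokenizer plus lowercase and edge n-gram steps would emit for 'S XYZ'. It deliberately ignores the asciifolding and synonym filters, and the helper names are invented for the sketch.

```python
def edge_ngrams(token: str, min_gram: int = 1, max_gram: int = 20) -> list:
    """Prefixes of the token, from min_gram up to max_gram characters."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(text: str) -> list:
    """Whitespace-split, lowercase, then edge n-gram every token."""
    terms = []
    for token in text.lower().split():
        terms.extend(edge_ngrams(token))
    return terms


print(index_terms("S XYZ"))  # ['s', 'x', 'xy', 'xyz']
```

Every record with a word beginning with S also produces the term 's', so the first half of the query matches all of them and only the 'xyz' term separates the intended record from the rest.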


Faceted navigation with totals, drilling down

2014-09-07 Thread Lee Gee

I have two 'types' in an index, or two indices of different types (I'd 
prefer the latter but can live with the former).

I'm running an aggregation by type to implement what my UX people refer to 
as faceted search, which makes Googling for ES help quite tricky.

UX would like to filter by type but retain a count of total hits in each 
aggregation bucket, i.e. the total number of each type of record that 
matches the query.

Can this be done in one query?

Failing that, can two queries be supplied/run in parallel?

Thanks
Lee
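
One approach that seems to fit is post_filter, which narrows the returned hits after the aggregations have been computed, so the bucket counts still reflect the whole query. A sketch of such a request body follows; the title field and the type name book are assumptions, as is being able to aggregate on _type in the version in use.

```python
import json


def faceted_search(query_text: str, selected_type: str) -> dict:
    """Aggregate over all matches, but return hits of one type only."""
    return {
        "query": {"match": {"title": query_text}},
        # Bucket counts are computed over everything the query matches.
        "aggs": {"by_type": {"terms": {"field": "_type"}}},
        # Applied to the hits only, after the aggregations.
        "post_filter": {"term": {"_type": selected_type}},
    }


print(json.dumps(faceted_search("elastic", "book"), indent=2))
```

For the fallback of running two queries together, the multi search API (_msearch) accepts several searches in one request.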


Weight by position in string?

2014-08-27 Thread Lee Gee

Is it possible to boost the weight of a result if the match is closer to 
the start of a string in the index?  So that searching for 'bar' would 
weight 'foo bar baz' higher than 'foo baz bar'?

I'm working with ngrams, if that helps.

Thanks
Lee
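
One candidate worth testing is a span_first clause, which only matches when the term ends within the first N positions of the field. The sketch below wraps it in a bool query as an optional boost. It assumes the field is indexed as whole-word tokens (how it interacts with ngram terms is untested here), and the field name title is hypothetical.

```python
import json


def position_boosted(field: str, term: str, window: int = 2, boost: float = 2.0) -> dict:
    """Require the term anywhere; boost documents where it ends within `window` positions."""
    return {
        "bool": {
            "must": [{"term": {field: term}}],
            "should": [
                {
                    "span_first": {
                        "match": {"span_term": {field: term}},
                        "end": window,
                        "boost": boost,
                    }
                }
            ],
        }
    }


print(json.dumps({"query": position_boosted("title", "bar")}, indent=2))
```

With window=2, 'bar' in 'foo bar baz' falls inside the window while 'bar' in 'foo baz bar' does not, so only the former picks up the extra score.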


Re: Swap indexes?

2014-08-26 Thread Lee Gee

I was looking for the index alias, thanks all.
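
For reference, a sketch of the alias swap: a single request to the _aliases endpoint removes the alias from the old index and adds it to the new one, and the actions in one request are applied together. The alias and index names are placeholders.

```python
import json


def swap_alias(alias: str, old_index: str, new_index: str) -> dict:
    """Body for POST /_aliases that repoints an alias in one step."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


print(json.dumps(swap_alias("products", "products_v1", "products_v2"), indent=2))
```

Clients keep searching the alias name throughout, so they never need to know which concrete index is live.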

On Tuesday, June 17, 2014 9:31:00 AM UTC+1, Lee Gee wrote:

 Is it possible to have one ES instance create an index and then have a 
 second instance use that created index, without downtime?

 tia
 lee



Re: _suggest suggestion/question

2014-08-26 Thread Lee Gee

Thank you, Vineeth.

On Sunday, August 17, 2014 12:04:20 PM UTC+1, vineeth mohan wrote:

Hello Lee ,

You will need to use context suggester for this purpose -
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/suggester-context.html

Also this difference stems from the fact that , both actual data and auto
completion data are stored in different data structures.
This is to make sure that the auto completion data is memory resident and
thus super fast.

Thanks
Vineeth

On Sun, Aug 17, 2014 at 3:32 PM, Lee Gee lee...@gmail.com wrote:

My reading, which may not be accurate, of this [1] clear and concise
post, is that it is not possible to use a reference to an existing field
as an argument to a suggester's 'input' or 'payload' fields.

Please would you clarify if I have missed something?

If I was correct, would it be much work to add these features?

TIA
Lee

[1] http://www.elasticsearch.org/blog/you-complete-me/


_suggest suggestion/question

2014-08-17 Thread Lee Gee

My reading, which may not be accurate, of this [1] clear and concise post, 
is that it is not possible to use a reference to an existing field as an 
argument to a suggester's 'input' or 'payload' fields.

Please would you clarify if I have missed something?

If I was correct, would it be much work to add these features?

TIA
Lee

[1] http://www.elasticsearch.org/blog/you-complete-me/
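
Until something like that exists, the copying can be done client-side before indexing. A minimal sketch, assuming a completion field called suggest and source fields called name and id (all three names are illustrative); it builds the input, output and payload values from the document's own fields.

```python
def with_suggest(doc: dict, source_field: str = "name", id_field: str = "id") -> dict:
    """Return a copy of the document with a completion 'suggest' entry derived from it."""
    enriched = dict(doc)
    enriched["suggest"] = {
        # Each word of the source field becomes a possible completion input.
        "input": doc[source_field].split(),
        # What the suggester returns to the user.
        "output": doc[source_field],
        # Carried along so the client can look the record up.
        "payload": {id_field: doc[id_field]},
    }
    return enriched


print(with_suggest({"id": 7, "name": "Nevermind Nirvana"}))
```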


Re: Support for Anchoring in Elasticsearch Regex

2014-07-29 Thread Lee Gee

Lucene and Elasticsearch both anchor regular expressions by default.

Lucene’s patterns are always anchored. The pattern provided must match the
entire string.

—
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html#regexp-syntax
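
Given that, a user-supplied pattern with optional ^ and $ can be approximated by dropping the anchors and padding the unanchored side with .* instead. The helper below is a naive sketch of that conversion. Its name is invented, it only looks at the very first and last characters, and it does not handle top-level alternation such as ^a|b$.

```python
def _escaped(text: str, index: int) -> bool:
    """True if the character at `index` is preceded by an odd number of backslashes."""
    backslashes = 0
    i = index - 1
    while i >= 0 and text[i] == "\\":
        backslashes += 1
        i -= 1
    return backslashes % 2 == 1


def to_lucene_regex(user_pattern: str) -> str:
    """Turn a pattern with optional ^/$ anchors into an always-anchored Lucene pattern."""
    body = user_pattern
    if body.startswith("^"):
        body, prefix = body[1:], ""
    else:
        prefix = ".*"
    if body.endswith("$") and not _escaped(body, len(body) - 1):
        body, suffix = body[:-1], ""
    else:
        suffix = ".*"
    return prefix + body + suffix


for pattern in ["^foo", "bar$", "baz", "^qux$"]:
    print(pattern, "->", to_lucene_regex(pattern))
```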

On Wednesday, December 18, 2013 7:19:48 AM UTC, Vaidik Kapoor wrote:

Hi Folks,

I see that Elasticsearch supports Regex. But that is limited to Lucene's
Regex Engine which does not support anchoring i.e. the entire string will
always be anchored. This works as long as you have fixed regular
expressions to run, but in cases where the regex query is taken from the
user, this becomes very limiting.

Is there an alternative regex engine for Elasticsearch that at least
supports $ and ^ for anchoring? Quick Google and Github search did not get
me anything. If not, then is anybody doing something similar or have a work
around? One possible solution that I can think of is converting user's
entered regex to Lucene compatible regex. But that gets really complex to
do correctly with all the grouping and alternation in regex.

I don't want the entire Perl regex kind of support. Just the anchoring bit
is important. Has anybody tried to solve this problem before?

Thanks,
Vaidik Kapoor
vaidikkapoor.info


Swap indexes?

2014-06-17 Thread Lee Gee

Is it possible to have one ES instance create an index and then have a 
second instance use that created index, without downtime?

tia
lee

