Elasticsearch 1.5.2 and 1.4.5 released

2015-04-27 Thread Clinton Gormley
Hi All, We're happy to announce that Elasticsearch 1.5.2 and Elasticsearch 1.4.5 have been released! THESE VERSIONS CONTAIN A SECURITY FIX FOR A DIRECTORY TRAVERSAL VULNERABILITY. WE ADVISE ALL USERS TO UPGRADE. The blog post at [1] describes the release content at a high level. The release

Re: partial update elasticsearch-perl

2015-01-29 Thread Clinton Gormley
Hi Jorge The `doc` should be passed in the `body` parameter: $e->update( index => 'myindex', type => 'mytype', id => "mykey", body => { doc => { link => "http://www.nw-kicoso.com", sortierung => 5 } } ); On 8 December 2014 at 10:29, Jorge von Rudno
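
For readers not using the Perl client, here is a minimal Python sketch of the same request shape. It only builds the path and JSON body for the update API and sends nothing; the index, type, id and field values are taken from the message above, and `build_partial_update` is a hypothetical helper name, not part of any client library.

```python
import json


def build_partial_update(index, doc_type, doc_id, partial_doc):
    """Return (path, body) for a partial update request.

    The changed fields go under a top-level "doc" key in the request body.
    """
    path = "/{}/{}/{}/_update".format(index, doc_type, doc_id)
    body = json.dumps({"doc": partial_doc}, sort_keys=True)
    return path, body


path, body = build_partial_update(
    "myindex", "mytype", "mykey",
    {"link": "http://www.nw-kicoso.com", "sortierung": 5},
)
print("POST", path)
print(body)
```

The helper keeps the partial document separate from the routing parameters, which mirrors the split between `index`/`type`/`id` and `body` in the Perl call.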

Re: Perl client: Cannot combine params and body?

2015-01-21 Thread Clinton Gormley
Hi Andrew The code looks correct. You have send_get_body_as POST commented out - I'm guessing that is the problem. Probably the service you're using does not allow GET requests with bodies. I'd uncomment that and try again. Ping me on https://github.com/elasticsearch/elasticsearch-perl/issues

Re: [IMPORTANT] Issues using Perl API client installation

2014-12-30 Thread Clinton Gormley
On Monday, 29 December 2014 22:26:24 UTC+1, Vilas Reddy wrote: Hi, I am trying to use Perl API for retrieving data from Elasticsearch. I am using Elasticsearch in windows cygwin. I need help with installing perl api and using it. I tried the following: *1. Installed cpan in cygwin and

Re: Connecting to ES via a http proxy in perl client

2014-10-28 Thread Clinton Gormley
Hi Kevin On Friday, 24 October 2014 18:24:00 UTC+2, Kevin Van Workum wrote: I'm trying to connect to my ES via a proxy using a client written in perl. What's the best way to do this? Here's what I have, and it works, but I suspect there's a more straight forward approach: $e =

Re: ES error while creating index on Solaris

2014-10-20 Thread Clinton Gormley
Hi Abhinav It would be good to know exactly where this problem is coming from. Is it the way that Logstash adds the template, or is it in the Elasticsearch layer. Please could you try something: * Delete the existing template and index in Elasticsearch * Take the Logstash template and create it

Re: ES error while creating index on Solaris

2014-10-20 Thread Clinton Gormley
I think this is fixed in v1.3.5 with https://github.com/elasticsearch/elasticsearch/pull/7468 On Monday, 20 October 2014 17:08:55 UTC+2, Jörg Prante wrote: I have added a comment to https://github.com/elasticsearch/elasticsearch/issues/6962 Jörg -- You received this message because you

[ANN] Elasticsearch 1.4.0.Beta1 released

2014-10-01 Thread Clinton Gormley
We're happy to announce that Elasticsearch 1.4.0.Beta1 has been released! The blog post at [1] describes the release content at a high level. The release notes at [2] give details and a direct link to download. Please download and try it out. Feedback, bug reports and patches are all

[ANN] Elasticsearch 1.3.4 released

2014-09-30 Thread Clinton Gormley
We're happy to announce that Elasticsearch 1.3.4 has been released! This is a bug fix release, especially for users with large numbers of shards. The blog post at [1] describes the release content at a high level. The release notes at [2] give details and a direct link to download. Please

[ANN] Elasticsearch 1.3.3 released

2014-09-29 Thread Clinton Gormley
We're happy to announce that Elasticsearch 1.3.3 has been released! This is a bug fix release. The blog post at [1] describes the release content at a high level. The release notes at [2] give details and a direct link to download. Please download and try it out. Feedback, bug reports

Re: ES default - async or sync

2014-09-07 Thread Clinton Gormley
Hiya OK I see where the confusion is coming in. I used the word asynchronously in slightly different contexts there. I will try to reword in the Definitive Guide. Replication is sync by default, in other words: the primary waits for indexing to happen on the replica before it returns to the

Re: Topics/Entities with relevancy scores and searching

2014-08-25 Thread Clinton Gormley
On 24 August 2014 19:46, Scott Decker sc...@publishthis.com wrote: Have you done this? any concerns to performance with this sort of scoring, or, it is just as fast if you were doing base lucene scoring if we override the score function and just use our own? -- we will of course try it and

Re: Parent/Child query performance in version 1.1.2

2014-08-25 Thread Clinton Gormley
Something else to note: parent-child now uses global ordinals to make queries 3x faster than they were previously, but global ordinals need to be rebuilt after the index has refreshed (assuming some data has changed). Currently there is no way to refresh p/c global ordinals eagerly (ie during the

Re: Can't find unit tests for reserved characters

2014-08-23 Thread Clinton Gormley
On 21 August 2014 23:38, ben billumi...@gmail.com wrote: I was trying to demonstrate the escaping of a space not a slash. According to the ES documentation (copied in my original post) that says a space must be escaped. Thanks! On Thursday, August 21, 2014 12:31:29 PM UTC-7, Clinton Gormley

Re: Topics/Entities with relevancy scores and searching

2014-08-23 Thread Clinton Gormley
Have a look at: * http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-delimited-payload-tokenfilter.html * http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-advanced-scripting.html On 23 August 2014 15:04, Scott Decker

Re: Can't find unit tests for reserved characters

2014-08-21 Thread Clinton Gormley
That's the JSON parsing, not the query_string parsing. You need to use a double backslash in JSON in order to pass a single backslash, ie: { "query": { "query_string": { "query": "name:exampleof\\ bug" } } } Also, re the reserved characters in the query string - that is all handled by Lucene,
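
The two layers of escaping described above can be checked with any JSON library. The following is a small Python sketch, not tied to any Elasticsearch client: it serialises a query string containing one backslash and shows that the JSON text carries two, while the decoded value still carries one.

```python
import json

# One literal backslash followed by a space, as query_string should receive it.
# (Python source also needs "\\" to write a single backslash.)
query_text = "name:exampleof\\ bug"

# Serialising to JSON escapes the backslash, so the wire format has two.
payload = json.dumps({"query": {"query_string": {"query": query_text}}})
print(payload)

# Decoding the JSON restores the single backslash that the query parser sees.
decoded = json.loads(payload)["query"]["query_string"]["query"]
print(decoded)
```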

Re: Example needed for Perl Search::Elasticsearch

2014-08-13 Thread Clinton Gormley
Hiya Simple question, but there seems to be a lack of detailed examples for using the otherwise very useful Search::Elasticsearch CPAN module ! The idea was that the API of the module maps very closely to all of the REST APIs in Elasticsearch, so that anything that works with raw curl

Re: clarity for shard allocation disable/enable during upgrade

2014-08-12 Thread Clinton Gormley
On Monday, 11 August 2014 15:31:28 UTC+2, bitsof...@gmail.com wrote: I have 8 data nodes and 6 coordinator nodes in an active cluster running 1.2.1 I want to upgrade to 1.3.1 When reading http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-upgrade.html the

Re: final request when scroll-scanning has 0 successful shards

2014-08-04 Thread Clinton Gormley
This is correct. On the last request, no hits are returned because all shards have already been drained of results. If you look at shards.total and shards.failed, you'll see they are also 0 clint On 4 August 2014 12:54, Tim S timsti...@gmail.com wrote: When scroll-scanning
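
A client loop can treat that empty final page as the normal stop condition. The sketch below is in Python and assumes a hypothetical `fetch_page` callable that stands in for one scroll request and returns a dict shaped like a search response; the stub data is invented for illustration.

```python
def drain_scroll(fetch_page, scroll_id):
    """Collect hits until a scroll request returns an empty page."""
    collected = []
    while True:
        response = fetch_page(scroll_id)
        hits = response["hits"]["hits"]
        if not hits:
            # Final request: every shard is drained, so this is not an error.
            break
        collected.extend(hits)
        scroll_id = response["_scroll_id"]
    return collected


# Stub standing in for the server: two pages of hits, then an empty page.
_pages = iter([
    {"_scroll_id": "s2", "hits": {"hits": [{"_id": "1"}, {"_id": "2"}]}},
    {"_scroll_id": "s3", "hits": {"hits": [{"_id": "3"}]}},
    {"_scroll_id": "s4", "hits": {"hits": []}},
])


def fake_fetch(scroll_id):
    return next(_pages)


all_hits = drain_scroll(fake_fetch, "s1")
print(len(all_hits))
```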

Re: Pull Requests mounting?

2014-08-01 Thread Clinton Gormley
Hi James A cursory glance at the GitHub pages shows a build-up of pull requests: ElasticSearch: 135 Kibana: 64 Is there a movement to get these merged or otherwise cleaned up? We are waiting on at least one of these but lack hope in the face of the volumes pending. We have a

Re: slow filter execution

2014-07-31 Thread Clinton Gormley
On 31 July 2014 20:25, Kireet Reddy kir...@feedly.com wrote: Quick update, I found that if I explicitly set _cache to true, things seem to work more as expected, i.e. subsequent executions of the query sped up. I looked at DateFieldMapper.rangeFilter() and to me it looks like if a number is

Re: slow filter execution

2014-07-30 Thread Clinton Gormley
Don't use the `and` filter - use the `bool` filter instead. They have different execution modes and the `bool` filter works best with bitset filters (but also knows how to handle non-bitset filters like geo etc). Just remove the `and`, `or` and `not` filters from your DSL vocabulary. Also, not
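
As an illustration of that advice, here is a hedged Python sketch that rewrites simple `and`, `or` and `not` filters into `bool` filters. `to_bool_filter` is a hypothetical helper and the field names in the example are invented; it covers only the plain array and object forms, and it assumes the usual mapping of `and` to `must`, `or` to `should` and `not` to `must_not`.

```python
def to_bool_filter(flt):
    """Rewrite simple and/or/not filters as bool filters.

    Any filter that is not an and/or/not wrapper is returned unchanged.
    """
    if isinstance(flt.get("and"), list):
        return {"bool": {"must": [to_bool_filter(f) for f in flt["and"]]}}
    if isinstance(flt.get("or"), list):
        return {"bool": {"should": [to_bool_filter(f) for f in flt["or"]]}}
    if isinstance(flt.get("not"), dict):
        inner = flt["not"]
        # Accept both {"not": {...}} and {"not": {"filter": {...}}}.
        inner = inner.get("filter", inner)
        return {"bool": {"must_not": [to_bool_filter(inner)]}}
    return flt


old = {"and": [
    {"term": {"status": "active"}},
    {"not": {"term": {"type": "draft"}}},
]}
new = to_bool_filter(old)
print(new)
```

Nested wrappers produce nested `bool` filters here; flattening them into a single `bool` with both `must` and `must_not` would be a further simplification.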

Re: Sort Order when relevance is equal

2014-07-22 Thread Clinton Gormley
Hi Erich On 14 May 2014 02:49, Erich Lin e...@onekingslane.com wrote: 1) will they always be in the same order if we set the preference parameter to an arbitrary string like the user’s session ID. They will be, until a merge happens (eg from indexing, updating, deleting, or just because...)

Re: Heap / GC Issues

2014-07-19 Thread Clinton Gormley
Your filter cache is only taking up 3GB of the heap, which fits with the default limit of 10% of heap space. So the filter cache is not at fault here. I would look at the two usual suspects: * field data - how much space is this consuming? Try: curl

Re: problem index date yyyy-MM-dd'T'HH:mm:ss.SSS

2014-07-02 Thread Clinton Gormley
What you can do is to set the mapping for the date field to have: { "type": "date", "format": "yyyy-MM-dd HH:mm:ss", "ignore_malformed": true } then it will just ignore those invalid dates rather than throwing an error

Re: Dealing with spam in this forum

2014-07-02 Thread Clinton Gormley
I've received in my mailbox at least 49 spams just for the 06/30. I won't call this a few spam email. I'm subscribed for years on many mailing lists, and I'm pretty sure that it would take years to get as much spam on those lists as I get in 1 day on ES mailing list. That's

Dealing with spam in this forum

2014-07-01 Thread Clinton Gormley
Hi all Recently we've had a few spam emails that have made it through Google's filters, and there have been calls for us to change to a moderate-first-post policy. I am reluctant to adopt this policy for the following reasons: We get about 30 new users every day from all over the world,

Re: Clarification on has_child filter memory requirements

2014-06-21 Thread Clinton Gormley
I've updated the docs on memory usage with parent-child. Hopefully more understandable: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-has-child-filter.html?1#_memory_considerations_8 On 21 June 2014 07:32, Drew Kutcharian d...@venarc.com wrote: Thanks Alex.

Re: boolean multi-field silently ignored in 1.2.1

2014-06-20 Thread Clinton Gormley
heya bruce that looks like a bug - please open an issue clint On 20 June 2014 19:41, Bruce Ritchie bruce.ritc...@gmail.com wrote: I'm seeing multi-fields of type boolean silently being reduced to a normal boolean field in 1.2.1 which wasn't the behavior in 0.90.9. See

Re: How to find the number of authors who have written between 2-3 books?

2014-06-20 Thread Clinton Gormley
Alternatively, if you model this with parent-child, then you can use min_children/max_children which is available in the next release http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-has-child-filter.html#_min_max_children_2 clint On 20 June 2014 17:15, Mike

Re: guarding from double-start

2014-06-20 Thread Clinton Gormley
And in your config file, set: node.max_local_storage_nodes: 1 that way you won't start two nodes on a single instance On 20 June 2014 16:54, Andrew Gaydenko andrew.gayde...@gmail.com wrote: On Friday, June 20, 2014 6:49:04 PM UTC+4, Maciej Dziardziel wrote: use start-stop-daemon or

Re: problem indexing with my analyzer

2014-06-20 Thread Clinton Gormley
You seriously don't want 3..250 length ngrams. That's ENORMOUS. Typically set min/max to 3 or 4, and that's it: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_ngrams_for_partial_matching.html#_ngrams_for_partial_matching On 20 June 2014 16:05, Tanguy Bernard
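
To put a rough number on the difference, the Python sketch below counts how many ngrams a single token produces for a given min/max gram length. It is plain arithmetic, not Elasticsearch code, and the 50-character token length is an arbitrary example.

```python
def ngram_count(token_length, min_gram, max_gram):
    """Count the ngrams of every length from min_gram to max_gram in one token."""
    total = 0
    for n in range(min_gram, max_gram + 1):
        if n > token_length:
            break  # no ngram can be longer than the token itself
        total += token_length - n + 1
    return total


wide = ngram_count(50, 3, 250)   # min/max of 3..250
narrow = ngram_count(50, 3, 4)   # min/max of 3..4
print(wide, narrow)
```

For one 50-character token the wide setting yields 1176 terms against 95 for the narrow one, and the gap keeps growing with token length.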

Re: Best cluster environment for search

2014-06-11 Thread Clinton Gormley
On Thursday, 5 June 2014 00:54:15 UTC+2, Jörg Prante wrote: Why do you use terms on _id field and not the ids filter? ids filter is more efficient since it reuses the _uid field which is cached by default. So does the terms filter. The only advantage of the _ids filter is that you

Re: How to get just certain fields on query time?

2014-05-24 Thread Clinton Gormley
Yes, with source filtering: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-source-filtering.html#search-request-source-filtering On 24 May 2014 20:54, Tom t.opp...@superreal.de wrote: Hi, is there a way to get just parts of _source on query time? Thx
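
A minimal Python sketch of a search body that uses source filtering follows. The field names `title` and `price` are hypothetical, and the body is only built and serialised, not sent anywhere.

```python
import json

# Ask for only two fields of _source instead of the whole document.
search_body = {
    "_source": ["title", "price"],   # hypothetical field names
    "query": {"match_all": {}},
}

encoded = json.dumps(search_body, sort_keys=True)
print(encoded)
```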

Re: Changing simple type mapping

2014-05-23 Thread Clinton Gormley
Hi Brian What you can do is to update the mapping of the numeric field to { "ignore_malformed": true }, then it will just ignore the bad data, but still use the multi-field: PUT /my_index/my_type/1 { "number": 123 } PUT /my_index/_mapping/my_type { "properties": { "number":

Re: Changing simple type mapping

2014-05-23 Thread Clinton Gormley
Thank you for your suggestion. What will that do for the existing data? Will I still be able to store copyrightYear as either a number or a string? It won't change any existing data. However, for data you index in the future it will index either a number and a string, or (if it can't coerce

Re: Query routing inside of a cluster

2014-05-14 Thread Clinton Gormley
Hi Savva I presume you're using cluster.routing.allocation.awareness? If so, then shards on nodes with the same node attributes are preferred: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html#_automatic_preference_when_searching_geting On 14 May

Re: Matching on sibling json nodes ?

2014-05-12 Thread Clinton Gormley
Hi Kristian You can use nested objects and set include_in_parent to true (it's like using type:nested and type:object on the same field), then highlight on the fields in the parent object. clint On 12 May 2014 13:42, Kristian Rosenvold kristian.rosenv...@gmail.com wrote: We're submitting a

Re: Confused as to what works - Percolator

2014-05-10 Thread Clinton Gormley
It helps to provide the exact steps that you were trying, otherwise we are left to guess. I tried your example and it worked, but then I probably did it in a slightly different way. I think the likeliest problem is that you are creating the percolator queries before you index a document or

Re: Elastic search ignoring refresh_interval setting from elasticsearch.yml

2014-04-24 Thread Clinton Gormley
Hi Arjit When you set it via the config file, it isn't included in the output of GET /_settings, but it is being honoured. That said, I much prefer using the API for these things instead of setting them in the config file. clint On 24 April 2014 14:27, Kartavya pulkitdot...@gmail.com wrote:

Re: MVEL scripting to return 0 or 1 for a boolean

2014-04-20 Thread Clinton Gormley
On 19 April 2014 20:57, Shane Neeley snee...@molecularmatch.com wrote: script: log(_score * (doc['field1'].value == doc['field2'].value)) script: log(_score * ((doc['field1'].value == doc['field2'].value) ? 1 : 0))

Re: array of strings vs string

2014-04-20 Thread Clinton Gormley
Also have a read about position_offset_gap: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_multi_value_fields_2.html On 17 April 2014 14:42, Aleh Aleshka olegl...@gmail.com wrote: Thanks! On Wednesday, April 16, 2014 6:39:00 PM UTC+3, vineeth mohan wrote: Hello Aleh ,

Re: Splunk vs. Elastic search performance?

2014-04-19 Thread Clinton Gormley
Goldman Sachs gave a talk about how they're using Elasticsearch to index 5TB of log data per day. I can't find the video of the talk, but from a blogpost about it: Next was Indy Tharmakumar from our hosts Goldman Sachs (http://www.goldmansachs.com/), showing how his team have built powerful support

Re: Testing for an Empty String

2014-04-19 Thread Clinton Gormley
Hi Paul You need to use a missing filter: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-missing-filter.html#query-dsl-missing-filter You can read the section of the guide about Dealing with null values here:

Re: Is ElasticSearch the Right Tool for This

2014-04-18 Thread Clinton Gormley
Hiya It's a bit more verbose, but yes you can do queries like that easily. I've assumed that all of your fields are exact value not_analyzed string fields, rather than full text fields: GET /_search { "_source": [ "col1", "col2" ], "query": { "filtered": { "filter": { "bool": {

Spam emails in the mailing list

2014-03-31 Thread Clinton Gormley
Hi all Recently we've had a number of spam emails sent to the mailing list, which Google hasn't caught. We apologise for this, but we ask you to just ignore them. As soon as we spot the ones that make it through Google's net, we delete and ban the user. Unfortunately, the only other option

Re: OpenSearch for elasticsearch.org docs?

2014-03-19 Thread Clinton Gormley
Hmm I tried this on google.com and cnn.com on Chrome and Safari and didn't see what you describe at all. Tab just selected the first option from the list of suggestions under the URL field. What am I missing? On 18 March 2014 19:25, Ivan Brusic i...@brusic.com wrote: That is not how OpenSearch

Re: integration test issues with elasticsearch

2014-03-18 Thread Clinton Gormley
Please could you open an issue and try to provide the steps to reproduce this issue. thanks clint

Re: Delete by query fails often with HTTP 503

2014-03-18 Thread Clinton Gormley
Do you have lots of shards on just a few nodes? Delete by query is handled by the `index` thread pool, but those threads are shared across all shards on a node. Delete by query can produce a large number of changes, which can fill up the thread pool queue and result in rejections. You can either

Re: Corrupted ElasticSearch index ?

2014-03-17 Thread Clinton Gormley
Are you sure you didn't run out of disk space or file handles at some stage, or have an OOM exception? On 16 March 2014 16:37, bizzorama bizzor...@gmail.com wrote: Hi, it turned out that it was not a problem of ES version (we tested on both 0.90.10 and 0.90.9) but just a ES bug ... after

Re: Mechanism of internal search with multiple indices

2014-03-17 Thread Clinton Gormley
Hi Mohit All documents stored in a single index are stored at the same level, regardless of their type. The _type is just a hidden field in each document. So if you do a search like: GET /index_one,index_two/_search { "query": { "match": { "field_foo": "some search terms" }}} then it queries

Re: OutOfMemoryError: Direct buffer memory

2014-03-17 Thread Clinton Gormley
Are you sending an enormous bulk indexing request? If so, try to send fewer docs at a time, eg 1,000 On 17 March 2014 10:39, Daniel Guo daniel5...@gmail.com wrote: I use elasticsearch as an index server. And I deploy a web project to create index and search result from my es server. I got
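
One way to follow that suggestion is to split the documents into fixed-size batches before building each bulk request. The Python sketch below shows only the batching step; `batches` is a hypothetical helper, and the default of 1,000 comes from the figure mentioned above.

```python
from itertools import islice


def batches(docs, size=1000):
    """Yield successive lists of at most `size` documents."""
    iterator = iter(docs)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# 2,500 placeholder documents end up in three bulk-sized batches.
sizes = [len(batch) for batch in batches(range(2500))]
print(sizes)
```

Because the helper works on any iterable, the full document set never has to be held in memory at once.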

Re: elasticsearch memory usage

2014-03-17 Thread Clinton Gormley
On 17 March 2014 09:18, Amit Soni amitson...@gmail.com wrote: Hello team - I see the recommendation here in this thread to use JDK 1.7 update 25. However in the website (http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/current/requirements.html) update 51 is recommended. Hi Amit

Re: Timeouts on Node Stats API?

2014-03-17 Thread Clinton Gormley
Good to hear! Thanks for reporting back On 17 March 2014 15:04, Xiao Yu m...@xyu.io wrote: I still don't have any definitive logs or traces that point to the exact cause of this situation but it appears to be some weird scheduling bug with hyper threading. Our nodes are running on OpenJDK

Re: Using machine learning and TF-IDF for record linkage, fuzzy grouping, and deduplication?

2014-03-17 Thread Clinton Gormley
I'd start with the more_like_this query and see how far that takes you. clint On 17 March 2014 18:28, Shrin King aoi...@gmail.com wrote: Given a new big department merged from three departments. A few employees worked for two or three departments before merging. That means, the attributes

Re: Creating dynamic fields from a field

2014-03-15 Thread Clinton Gormley
To add to what Binh said, you really shouldn't add field names like this: On 14 March 2014 21:20, Pablo Musa pablitom...@gmail.com wrote: { "title": "The greatest band ever - Urban Legion", "greatest_x": 1, "band_x": 1, "ever_x": 1, "Urban_x": 1, "Legion_x": 1, "greatest_y": [],

Re: Low priority queries or query throttling?

2014-03-14 Thread Clinton Gormley
Adding to what Zach said, I'd also be interested in looking at what causes these queries to be so slow. Potentially their performance could be greatly improved. clint On 14 March 2014 01:29, Zachary Tong zacharyjt...@gmail.com wrote: What's the nature of the queries? There may be some

Re: max_score is not coming for query with filter search

2014-03-14 Thread Clinton Gormley
My first question is: why do you want the score? The score is used only for sorting, and you're sorting on NODE_ID. If you really want it (and there is a cost to computing the score) then you can set track_scores to true.
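
A minimal Python sketch of a search body that sorts on a field and still asks for scores is shown below. `NODE_ID` is the sort field named in the message; the query clause and its field name are hypothetical placeholders.

```python
import json

search_body = {
    "query": {"match": {"title": "example"}},   # hypothetical query
    "sort": [{"NODE_ID": {"order": "asc"}}],
    # Without this flag, scores are not computed when sorting on a field.
    "track_scores": True,
}

encoded = json.dumps(search_body, sort_keys=True)
print(encoded)
```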

Re: bool query with filter giving error

2014-03-14 Thread Clinton Gormley
You need to pass the search request a query, so just change the above to: GET /_search { "query": { "filtered": ... }, "from": 0, "size": 3 ... } On 14 March 2014 14:55, Subhadip Bagui i.ba...@gmail.com wrote: Hi, I'm trying to run the below bool query with filter range to fetch all the node

Re: Timeouts on Node Stats API?

2014-03-14 Thread Clinton Gormley
Anything in the logs or slow logs? You're sure slow GCs aren't impacting performance? On 14 March 2014 15:17, Xiao Yu m...@xyu.io wrote: Can you do a hot_threads while this is happening? Just for good measure I also checked hot threads for blocking and waiting, nothing interesting there

Re: Calculating rolling average using aggregations

2014-03-13 Thread Clinton Gormley
I rethought this problem last night. The solutions I've presented already are a lot less efficient than they could be, as they increase the work per doc by a factor of the number of buckets (ie 24h * 28d = 672). It'd be much more efficient to calculate this rolling average client side in a single
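
As a hedged illustration of the client-side approach, the Python sketch below computes a rolling average over a sequence of per-bucket values in one pass. It assumes the hourly values have already been fetched (for example from a date histogram); the numbers are invented, and for the case discussed here the window would be 672 buckets (24h * 28d).

```python
from collections import deque


def rolling_average(values, window):
    """Yield the mean of the most recent `window` values, in a single pass."""
    recent = deque()
    running_sum = 0.0
    for value in values:
        recent.append(value)
        running_sum += value
        if len(recent) > window:
            # Drop the oldest value once the window is full.
            running_sum -= recent.popleft()
        yield running_sum / len(recent)


hourly = [10, 20, 30, 40]                     # made-up hourly bucket values
averages = list(rolling_average(hourly, 2))   # small window for readability
print(averages)
```

Each value is added and removed once, so the work per bucket is constant regardless of the window size.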

Re: Source filtering of nested object data

2014-03-13 Thread Clinton Gormley
On 12 March 2014 21:55, Ben Hirsch benhir...@gmail.com wrote: I will know the 5-10 id's needed to be fetched at run-time. With script_fields how would I access the children with those specific id's? With script fields, you have access to the whole _source field, so you would need to write a

Re: precise field matching without defining a not_analyzed extra field possible?

2014-03-13 Thread Clinton Gormley
Appreciate that Clint. But I was asking whether I could do without having to modify mappings - see ref to another post seemingly alluding to that That post refers to using the keyword_repeat token filter to index stemmed and unstemmed tokens in the same positions. It won't work for your use

Re: boosting result documents that have a certain key-value combination in a nested structure

2014-03-13 Thread Clinton Gormley
On 12 March 2014 23:32, Michael Schlenzka mich...@schlenzka.com wrote: I do not want the sum of all the values of the key-value-pairs. I want to boost each document (with a specific key) only with the value for the matching key/color (e.g. if searching for documents with blue as color each

Re: Auto-created spam index

2014-03-13 Thread Clinton Gormley
On 13 March 2014 05:15, Ivan Brusic i...@brusic.com wrote: That said, your Elasticsearch server is still accessible to anyone over the internet. I Or somebody on your network is infected with a bot.

Re: Node not joining cluster on boot

2014-03-13 Thread Clinton Gormley
Can you telnet from each box to port 9300 on the other box? Does your bridge support multicast? If not, you could use unicast instead. clint On 13 March 2014 10:31, Guillaume Loetscher sterfi...@gmail.com wrote: Sure Node # 1: root@es_node1:~# grep -E '^[^#]'

Re: Term query to get children based on parent id

2014-03-12 Thread Clinton Gormley
Hi Rukshan Very nicely laid out question. Thanks for providing all of the steps. I agree that it doesn't work and (at least according to the docs) it should, so I've opened an issue here: https://github.com/elasticsearch/elasticsearch/issues/5399 Curiously, performing the same lookup using a

Re: Lucene query parser to ES Java API

2014-03-12 Thread Clinton Gormley
You may want to look at using the simple query string query instead: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-simple-query-string-query.html It has the benefit of not throwing syntax errors, but trying to do the right thing. On 12 March 2014 01:53, Ivan

Re: What to do if ES document count continuously increases - with zero indexing ongoing

2014-03-12 Thread Clinton Gormley
Have you looked at the docs? http://www.elasticsearch.org/guide/en/marvel/current/index.html#configuration On 12 March 2014 06:42, Swaroop CH swaroo...@yandex.com wrote: The source of the problem is Marvel - is there anyway to disable Marvel indexing? Trying to set `marvel.agent.indices:

Re: max_score anyone?

2014-03-12 Thread Clinton Gormley
Hi Yuri First, here's a query that will work for you: https://gist.github.com/clintongormley/9505141 I'm using a has_parent query to sum up all the values of the revenue field, then using a function_score query to ensure that those values fall within a range. However, your data model feels like

Re: Calculating rolling average using aggregations

2014-03-12 Thread Clinton Gormley
, Clinton Gormley wrote: Yes, easily. Aggregations are really powerful. Here's an example: # First insert some data curl -XPUT "http://localhost:9200/myindex/mytype/1" -d' { "created": "2014/03/10 12:05:00", "somefield": 10 }' curl -XPUT "http://localhost:9200/myindex/mytype/2" -d' { created

Re: Calculating rolling average using aggregations

2014-03-12 Thread Clinton Gormley
Heya Bihn The part I'm not getting is this: the rolling average for every hour in the last 28 days. ie what period should each bucket/rolling avg cover? an hour? 28 days? You can still do rolling averages with aggregations, but they require a bit more work. I wanted to get the exact specs before

Re: Calculating rolling average using aggregations

2014-03-12 Thread Clinton Gormley
. See the demo here: https://gist.github.com/clintongormley/9515005 clint On 12 March 2014 16:29, Clinton Gormley cl...@traveljury.com wrote: Heya Bihn The part I'm not getting is this: the rolling average for every hour in the last 28 days. ie what period should each bucket/rolling avg

Re: Source filtering of nested object data

2014-03-12 Thread Clinton Gormley
You could use script_fields to generate the values you want: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-script-fields.html may be a bit tricky though :) On 12 March 2014 16:59, Ben Hirsch benhir...@gmail.com wrote: Is it possible to filter the _source

Re: What to do if ES document count continuously increases - with zero indexing ongoing

2014-03-12 Thread Clinton Gormley
On 12 March 2014 18:28, Swaroop CH swaroo...@yandex.com wrote: Thanks for the reply. I did look at the docs, I tried setting `marvel.agent.enabled: false` via the API, but ES logs an error saying it is not dynamically updateable, and so on. Yes, you have to set it in the config file (at

Re: are tokens produced by an analyzer == terms?

2014-03-12 Thread Clinton Gormley
Yes. Tokens and terms are used as synonyms, although officially there is a difference: http://nlp.stanford.edu/IR-book/html/htmledition/tokenization-1.html But for our purposes, wherever you read "token" think "term". On 12 March 2014 20:40, Nikita Tovstoles nikita.tovsto...@gmail.com wrote:

Re: Issue with bettermap / kibana

2014-03-12 Thread Clinton Gormley
Hi Romain This issue is fixed in master. cloudmade turned off public access, so we have switched to the mapquest servers. Clint On 12 March 2014 17:52, Romain NIO rom...@gmail.com wrote: Hi, I'm facing some issues with the plugin bettermap in Kibana. Kibana is not able to load the

Re: boosting result documents that have a certain key-value combination in a nested structure

2014-03-12 Thread Clinton Gormley
Hiya Michael The problem with your current attempts is that the nested filter matches on the color blue, but then returns the PARENT document, so when you try to access the colors.value field it is not available. Instead, you need to run a nested filter, and return the sum of the colors.value

Re: dynamically add/delete clusters on tribe node

2014-03-12 Thread Clinton Gormley
Not currently, no. Open an issue if you'd like to see it supported clint On 12 March 2014 21:00, cleesmith cleesmith2...@gmail.com wrote: Is it possible to dynamically add or delete clusters on a tribe node ? ... i.e. without editing elasticsearch.yml and then having to stop/start the tribe

Re: precise field matching without defining a not_analyzed extra field possible?

2014-03-12 Thread Clinton Gormley
You're almost there with: On 12 March 2014 21:06, Nikita Tovstoles nikita.tovsto...@gmail.com wrote: { "user": { "properties": { "name": { "type": "string" }, "name.raw": { "type": "string", "index": "not_analyzed" } } } } Instead, use

Re: multichar delimiter in path_hierarchy tokenizer

2014-03-11 Thread Clinton Gormley
You could fake it using the pattern_replace character filter: curl -XPUT "http://localhost:9200/myindex" -d' { "settings": { "analysis": { "analyzer": { "arrow": { "tokenizer": "path_hierarchy", "char_filter": [ "arrow_to_slash" ] } },

Re: Dynamic template (key/value mapping)

2014-03-11 Thread Clinton Gormley
No. You'd have to change the document before indexing it. On 10 March 2014 08:14, Michael Gulliksen michael.gullik...@gmail.com wrote: If i have a document like this: document: { id:11513, title:this is the title sections: [ { value:This is the

Re: DateRange aggregation semantics - include_lower/include_upper?

2014-03-11 Thread Clinton Gormley
On 7 March 2014 12:46, mooky nick.minute...@gmail.com wrote: So the previous, current and next period-end dates are: 2014-02-19, 2014-03-19 2014-04-16. I define the ranges therefore as: Overdue: date *2014-02-19* March: 2014-02-20 date *2014-03-19* April: 2014-03-20 date *2014-04-16*

Re: help with setting up mappings (Perl)

2014-03-10 Thread Clinton Gormley
Hi Dom First, make sure you're using the new Search::Elasticsearch client https://metacpan.org/pod/Search::Elasticsearch - we've just renamed it to avoid namespace clashes with older clients. Then: to configure the mapping yourself, you need to do it before you index any data (using the bulk

Re: Details on snapshot and restore in ES 1.0

2014-03-05 Thread Clinton Gormley
Here is a link to the docs: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-snapshots.html Without knowing what you are missing it is difficult to provide more information Clint On 5 March 2014 11:01, Hari Prasad iamhari1...@gmail.com wrote: Hi I am try to know

Re: sending failed shard error

2014-03-05 Thread Clinton Gormley
It sounds like you have multiple versions of Elasticsearch running with the same cluster name. clint On 5 March 2014 12:29, Hari Prasad iamhari1...@gmail.com wrote: Hi i have installed the new Elasticsearch 1.0. I am trying to setup cluster with multicast disabled and unicast ip set. i was

Re: Details on snapshot and restore in ES 1.0

2014-03-05 Thread Clinton Gormley
repository. What is its use and possible use cases? On Wednesday, 5 March 2014 15:35:53 UTC+5:30, Clinton Gormley wrote: Here is a link to the docs: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-snapshots.html Without knowing what you are missing

Re: Help me debug CPU use issues

2014-03-05 Thread Clinton Gormley
Look at the output from the hot threads API to see what is consuming the CPU. Also, I'd check your garbage collection times (look in the logs and in the nodes stats output), and make sure that you have zero bytes in swap. On 5 March 2014 11:20, Aivars Irmejs aiv...@gmail.com wrote: Recently

Re: sending failed shard error

2014-03-05 Thread Clinton Gormley
If they have the same cluster name, they will try to join each other. On 5 March 2014 12:36, Hari Prasad iamhari1...@gmail.com wrote: But the different ES have different clusters. will that still cause this problem? On Wednesday, 5 March 2014 17:02:27 UTC+5:30, Clinton Gormley wrote

Re: sending failed shard error

2014-03-05 Thread Clinton Gormley
On 5 March 2014 12:47, Hari Prasad iamhari1...@gmail.com wrote: My clusters have different names, also each of their multicast is false. There is another node somewhere that is interfering. Have a look in the logs, or the output from: curl 'localhost:9200/_nodes?all&pretty' clint

Re: node not allowed to joined cluster?

2014-03-05 Thread Clinton Gormley
On 5 March 2014 08:30, Lukáš Vlček lukas.vl...@gmail.com wrote: You can provide a list of nodes (IP addresses or DNS names) that are allowed to join the cluster. Other nodes will not be allowed. Where? I'm not aware of this option?

Re: River documentation vanished?

2014-03-05 Thread Clinton Gormley
On 4 March 2014 10:27, Lukáš Vlček lukas.vl...@gmail.com wrote: we try to keep documentation for our rivers updated but I noticed that the general river documentation page [1] is probably no longer available on Elasticsearch.org site? Is this intentional? I can see twitter river is still

Re: multi_match boolean across fields

2014-03-05 Thread Clinton Gormley
On 5 March 2014 00:24, Thibaut thibaut.pa...@gmail.com wrote: Is it possible to keep the boosting applied to the individual fields when computing the score ? No. Field-level index time boosts will not be preserved with copy_to. Coming very soon in 1.1.0 is the `cross_fields` type of

Re: elasticsearch suggest by middle words by making preserve_position_increments: false

2014-03-05 Thread Clinton Gormley
you're using the simple analyzer at index time, which means that it is indexing [the,beatles]. If you change it to use the stop analyzer at both search and index time then it should work. clint On 4 March 2014 15:38, Shams Haque shams...@gmail.com wrote: Hi, I am trying to implement middle

Re: Question on script scoring

2014-03-03 Thread Clinton Gormley
It could be that the precedence of || ? : is not what you expect. I've run into issues with that in mvel before. Try using parentheses to make the script unambiguous clint On 3 March 2014 20:51, Amit Soni amitson...@gmail.com wrote: Hi everyone - Since we were recommended to move away from

Re: What are good combinations of search analyzer, index analyzer and query for implementing an effective autocompleter using ElasticSearch ?

2014-02-03 Thread Clinton Gormley
On 1 February 2014 20:19, joergpra...@gmail.com joergpra...@gmail.com wrote: No, suggester is not restricted to prefix, in 0.90.4 fuzziness was added, as documented. Fuzzy suggest completion means your query may contain errors within an edit distance. But, it is still a prefix suggester...

Re: Corrupt index creation when elasticsearch is killed just after index is created

2013-12-26 Thread Clinton Gormley
If you kill Elasticsearch immediately after creating the index, you interrupt the process of shard allocation. When you restart Elasticsearch, it assumes that the shards have been allocated somewhere and so doesn't try to assign new shards to prevent any data loss. The index itself isn't corrupt,