Elasticsearch 1.5.2 and 1.4.5 released

2015-04-27 Thread Clinton Gormley
Hi All, We're happy to announce that Elasticsearch 1.5.2 and Elasticsearch 1.4.5 have been released! THESE VERSIONS CONTAIN A SECURITY FIX FOR A DIRECTORY TRAVERSAL VULNERABILITY. WE ADVISE ALL USERS TO UPGRADE. The blog post at [1] describes the release content at a high level. The release n

Re: partial update elasticsearch-perl

2015-01-29 Thread Clinton Gormley
Hi Jorge The `doc` should be passed in the `body` parameter: $e->update( index => 'myindex', type => 'mytype', id => "mykey", body => { doc => { link => "http://www.nw-kicoso.com", sortierung => "5" } } ); On 8 December 2014 at 10:29, Jorge von Rudno <

Re: Perl client: Cannot combine params and body?

2015-01-21 Thread Clinton Gormley
Hi Andrew The code looks correct. You have send_get_body_as POST commented out - I'm guessing that is the problem. Probably the service you're using does not allow GET requests with bodies. I'd uncomment that and try again. Ping me on https://github.com/elasticsearch/elasticsearch-perl/issues

Re: [IMPORTANT] Issues using Perl API client installation

2014-12-30 Thread Clinton Gormley
On Monday, 29 December 2014 22:26:24 UTC+1, Vilas Reddy wrote: > > Hi, > > I am trying to use Perl API for retrieving data from Elasticsearch. > I am using Elasticsearch in windows cygwin. > > I need help with installing perl api and using it. I tried the following: > > *1. Installed cpan in cygw

Re: Connecting to ES via a http proxy in perl client

2014-10-28 Thread Clinton Gormley
Hi Kevin On Friday, 24 October 2014 18:24:00 UTC+2, Kevin Van Workum wrote: > > I'm trying to connect to my ES via a proxy using a client written in perl. > What's the best way to do this? > > Here's what I have, and it works, but I suspect there's a more straight > forward approach: > > $e = Se

Re: ES error while creating index on Solaris

2014-10-20 Thread Clinton Gormley
I think this is fixed in v1.3.5 with https://github.com/elasticsearch/elasticsearch/pull/7468 On Monday, 20 October 2014 17:08:55 UTC+2, Jörg Prante wrote: > > I have added a comment to > https://github.com/elasticsearch/elasticsearch/issues/6962 > > Jörg > -- You received this message because

Re: ES error while creating index on Solaris

2014-10-20 Thread Clinton Gormley
Hi Abhinav It would be good to know exactly where this problem is coming from. Is it the way that Logstash adds the template, or is it in the Elasticsearch layer? Please could you try something: * Delete the existing template and index in Elasticsearch * Take the Logstash template and create it

[ANN] Elasticsearch 1.4.0.Beta1 released

2014-10-01 Thread Clinton Gormley
We're happy to announce that Elasticsearch 1.4.0.Beta1 has been released! The blog post at [1] describes the release content at a high level. The release notes at [2] give details and a direct link to download. Please download and try it out. Feedback, bug reports and patches are all welco

[ANN] Elasticsearch 1.3.4 released

2014-09-30 Thread Clinton Gormley
We're happy to announce that Elasticsearch 1.3.4 has been released! This is a bug fix release, especially for users with large numbers of shards. The blog post at [1] describes the release content at a high level. The release notes at [2] give details and a direct link to download. Please dow

[ANN] Elasticsearch 1.3.3 released

2014-09-29 Thread Clinton Gormley
We're happy to announce that Elasticsearch 1.3.3 has been released! This is a bug fix release. The blog post at [1] describes the release content at a high level. The release notes at [2] give details and a direct link to download. Please download and try it out. Feedback, bug reports an

Re: ES default - async or sync

2014-09-07 Thread Clinton Gormley
Hiya OK I see where the confusion is coming in. I used the word asynchronously in slightly different contexts there. I will try to reword in the Definitive Guide. Replication is sync by default, in other words: the primary waits for indexing to happen on the replica before it returns to the use

Re: Parent/Child query performance in version 1.1.2

2014-08-25 Thread Clinton Gormley
Something else to note: parent-child now uses global ordinals to make queries 3x faster than they were previously, but global ordinals need to be rebuilt after the index has refreshed (assuming some data has changed). Currently there is no way to refresh p/c global ordinals "eagerly" (ie during th

Re: Topics/Entities with relevancy scores and searching

2014-08-25 Thread Clinton Gormley
On 24 August 2014 19:46, Scott Decker wrote: > Have you done this? any concerns to performance with this sort of scoring, > or, it is just as fast if you were doing base lucene scoring if we override > the score function and just use our own? > -- we will of course try it and run our own performa

Re: Topics/Entities with relevancy scores and searching

2014-08-23 Thread Clinton Gormley
Have a look at: * http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-delimited-payload-tokenfilter.html * http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-advanced-scripting.html On 23 August 2014 15:04, Scott Decker wrote: > Hey all,

Re: Can't find unit tests for reserved characters

2014-08-23 Thread Clinton Gormley
in: "foo\\ bar" On 21 August 2014 23:38, ben wrote: > I was trying to demonstrate the escaping of a space not a slash. > > According to the ES documentation (copied in my original post) that says a > space must be escaped. > > Thanks! > > > On Thursday, Augu

Re: Can't find unit tests for reserved characters

2014-08-21 Thread Clinton Gormley
That's the JSON parsing, not the query_string parsing. You need to use a double slash in JSON in order to pass a single slash, ie: { "query": { "query_string": { "query": "name:exampleof\\ bug" } } } Also, re the reserved characters in the query string - that is all handled by
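The two layers of escaping described above can be checked with a small sketch. Python is used here purely for illustration; the field name and value are taken from the example in the message. `json.dumps` shows that a single backslash in the query string becomes a double backslash in the JSON that is sent:

```python
import json

# What the query_string parser should receive:  name:exampleof\ bug
# (one backslash, escaping the space).
query_text = "name:exampleof\\ bug"  # Python literal for ONE real backslash

body = {"query": {"query_string": {"query": query_text}}}

# Serialising to JSON doubles the backslash; this is the form to send.
encoded = json.dumps(body)
print(encoded)

# Parsing the JSON again gives back the single-backslash string.
decoded = json.loads(encoded)["query"]["query_string"]["query"]
```

Building the request body as a data structure and letting a JSON library serialise it avoids having to count backslashes by hand.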

Re: Example needed for Perl Search::Elasticsearch

2014-08-13 Thread Clinton Gormley
Hiya > Simple question, but there seems to be a lack of detailed examples for using the otherwise very useful Search::Elasticsearch CPAN module ! The idea was that the API of the module maps very closely to all of the REST APIs in Elasticsearch, so that anything that works with raw curl statem

Re: clarity for shard allocation disable/enable during upgrade

2014-08-12 Thread Clinton Gormley
On Monday, 11 August 2014 15:31:28 UTC+2, bitsof...@gmail.com wrote: > > I have 8 data nodes and 6 coordinator nodes in an active cluster running > 1.2.1 > > I want to upgrade to 1.3.1 > > When reading > http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-upgrade.html >

Re: final request when scroll-scanning has 0 successful shards

2014-08-04 Thread Clinton Gormley
This is correct. On the last request, no hits are returned because all shards have already been drained of results. If you look at shards.total and shards.failed, you'll see they are also 0 clint On 4 August 2014 12:54, Tim S wrote: > When scroll-scanning >
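A client loop can use that empty final page as its stop condition. The following is a minimal sketch in Python, assuming a hypothetical `fetch_page` callable that performs one scroll request and returns the parsed response; only the `_scroll_id` and `hits.hits` response fields are relied on:

```python
def drain_scroll(fetch_page, scroll_id):
    """Collect all hits from a scroll, stopping at the first empty page.

    `fetch_page` is a hypothetical callable: it takes a scroll id and
    returns the parsed JSON response of one scroll request.
    """
    docs = []
    while True:
        page = fetch_page(scroll_id)
        hits = page["hits"]["hits"]
        if not hits:
            # All shards are drained: this is the expected final response.
            break
        docs.extend(hits)
        # Always continue with the scroll id from the latest response.
        scroll_id = page["_scroll_id"]
    return docs
```

The loop treats "no hits" as normal termination rather than as an error, which matches the behaviour described above.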

Re: Pull Requests mounting?

2014-08-01 Thread Clinton Gormley
Hi James > A cursory glance at the GitHub pages shows a build-up of pull requests: > > ElasticSearch: 135 > Kibana: 64 > > Is there a movement to get these merged or otherwise cleaned up? We are > waiting on at least one of these but lack hope in the face of the volumes > pending. > > > We hav

Re: slow filter execution

2014-07-31 Thread Clinton Gormley
On 31 July 2014 20:25, Kireet Reddy wrote: > Quick update, I found that if I explicitly set _cache to true, things seem > to work more as expected, i.e. subsequent executions of the query sped up. > I looked at DateFieldMapper.rangeFilter() and to me it looks like if a > number is passed, caching

Re: slow filter execution

2014-07-30 Thread Clinton Gormley
Don't use the `and` filter - use the `bool` filter instead. They have different execution modes and the `bool` filter works best with bitset filters (but also knows how to handle non-bitset filters like geo etc). Just remove the `and`, `or` and `not` filters from your DSL vocabulary. Also, not s
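As an illustration of that advice, here is a rough Python sketch that rewrites `and`, `or` and `not` filters as `bool` filters (`must`, `should` and `must_not` respectively). The helper name `to_bool_filter` is made up for this example, and the sketch only handles the plain array and `{"filters": [...]}` / `{"filter": {...}}` forms:

```python
def to_bool_filter(f):
    """Recursively rewrite and/or/not filters as an equivalent bool filter."""
    if not isinstance(f, dict) or len(f) != 1:
        return f
    (name, body), = f.items()
    if name in ("and", "or"):
        # Accept both the bare list form and the {"filters": [...]} form.
        clauses = body["filters"] if isinstance(body, dict) else body
        key = "must" if name == "and" else "should"
        return {"bool": {key: [to_bool_filter(c) for c in clauses]}}
    if name == "not":
        # Accept both {"not": {...}} and {"not": {"filter": {...}}}.
        inner = body.get("filter", body)
        return {"bool": {"must_not": [to_bool_filter(inner)]}}
    # Any other filter (term, range, geo, ...) is left unchanged.
    return f


old = {"and": [{"term": {"a": 1}},
               {"or": [{"term": {"b": 2}}, {"not": {"term": {"c": 3}}}]}]}
new = to_bool_filter(old)
```

Nested `bool` filters produced this way could be flattened further, but the sketch keeps a one-to-one mapping so the result is easy to compare with the original.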

Re: Sort Order when relevance is equal

2014-07-21 Thread Clinton Gormley
Hi Erich On 14 May 2014 02:49, Erich Lin wrote: > 1) will they always be in the same order if we set the preference > parameter to an arbitrary string like the user’s session ID. > They will be, until a merge happens (eg from indexing, updating, deleting, or just because...) > 2) If so, is th

Re: Heap / GC Issues

2014-07-19 Thread Clinton Gormley
Your filter cache is only taking up 3GB of the heap, which fits with the default limit of 10% of heap space. So the filter cache is not at fault here. I would look at the two usual suspects: * field data - how much space is this consuming? Try: curl 'localhost:9200/_nodes/stats/indices/fieldd

Re: Dealing with spam in this forum

2014-07-02 Thread Clinton Gormley
> > I've received in my mailbox at least 49 spams just for the 06/30. I won't > call this "a few spam email". I'm subscribed for years on many mailing > lists, and I'm pretty sure that it would take years to get as much spam on > those lists as I get in 1 day on ES mailing list. > That's inte

Re: problem index date yyyy-MM-dd’T'HH:mm:ss.SSS

2014-07-02 Thread Clinton Gormley
What you can do is to set the mapping for the date field to have: { "type": "date", "format": "yyyy-MM-dd HH:mm:ss", "ignore_malformed": true } then it will just ignore those invalid dates rather than throwing an error -- You received this message because you are subscribed to the Google Gr

Dealing with spam in this forum

2014-07-01 Thread Clinton Gormley
Hi all Recently we've had a few spam emails that have made it through Google's filters, and there have been calls for us to change to a moderate-first-post policy. I am reluctant to adopt this policy for the following reasons: We get about 30 new users every day from all over the world, many

Re: Clarification on has_child filter memory requirements

2014-06-21 Thread Clinton Gormley
I've updated the docs on memory usage with parent-child. Hopefully more understandable: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-has-child-filter.html?1#_memory_considerations_8 On 21 June 2014 07:32, Drew Kutcharian wrote: > Thanks Alex. What do you mea

Re: problem indexing with my analyzer

2014-06-20 Thread Clinton Gormley
You seriously don't want 3..250 length ngrams. That's ENORMOUS. Typically set min/max to 3 or 4, and that's it: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_ngrams_for_partial_matching.html#_ngrams_for_partial_matching On 20 June 2014 16:05, Tanguy Bernard wrote: > Than

Re: guarding from double-start

2014-06-20 Thread Clinton Gormley
And in your config file, set: node.max_local_storage_nodes: 1 that way you won't start two nodes on a single instance On 20 June 2014 16:54, Andrew Gaydenko wrote: > On Friday, June 20, 2014 6:49:04 PM UTC+4, Maciej Dziardziel wrote: >> >> use start-stop-daemon or adapt /etc/init.d/elasti

Re: How to find the number of authors who have written between 2-3 books?

2014-06-20 Thread Clinton Gormley
Alternatively, if you model this with parent-child, then you can use min_children/max_children, which is available in the next release: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-has-child-filter.html#_min_max_children_2 clint On 20 June 2014 17:15, Mike wrote

Re: boolean multi-field silently ignored in 1.2.1

2014-06-20 Thread Clinton Gormley
heya bruce that looks like a bug - please open an issue clint On 20 June 2014 19:41, Bruce Ritchie wrote: > I'm seeing multi-fields of type boolean silently being reduced to a normal > boolean field in 1.2.1 which wasn't the behavior in 0.90.9. See > https://gist.github.com/Omega359/0c2a9369

Re: Best cluster environment for search

2014-06-11 Thread Clinton Gormley
On Thursday, 5 June 2014 00:54:15 UTC+2, Jörg Prante wrote: > > Why do you use terms on _id field and not the ids filter? ids filter > is more efficient since it reuses the _uid field which is cached by default. > > So does the terms filter. The only advantage of the _ids filter is that yo

Re: How to get just certain fields on query time?

2014-05-24 Thread Clinton Gormley
Yes, with source filtering: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-source-filtering.html#search-request-source-filtering On 24 May 2014 20:54, Tom wrote: > Hi, > > is there a way to get just parts of _source on query time? > > Thx > Tom > > -- > Yo

Re: Changing simple type mapping

2014-05-23 Thread Clinton Gormley
> Thank you for your suggestion. What will that do for the existing data? > > Will I still be able to store copyrightYear as either a number or a string? > It won't change any existing data. However, for data you index in the future it will index either a number and a string, or (if it can't coer

Re: Changing simple type mapping

2014-05-23 Thread Clinton Gormley
Hi Brian What you can do is to update the mapping of the numeric field to { ignore_malformed: true }, then it will just ignore the bad data, but still use the multi-field: PUT /my_index/my_type/1 {"number": 123} PUT /my_index/_mapping/my_type { "properties": { "numb

Re: Query routing inside of a cluster

2014-05-14 Thread Clinton Gormley
Hi Savva I presume you're using cluster.routing.allocation.awareness? If so, then shards on nodes with the same node attributes are preferred: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-cluster.html#_automatic_preference_when_searching_geting On 14 May 2014

Re: Matching on sibling json nodes ?

2014-05-12 Thread Clinton Gormley
Hi Kristian You can use nested objects and set "include_in_parent" to true (it's like using type:nested and type:object on the same field), then highlight on the fields in the parent object. clint On 12 May 2014 13:42, Kristian Rosenvold wrote: > We're submitting a json document that looks lik

Re: Confused as to what works - Percolator

2014-05-10 Thread Clinton Gormley
It helps to provide the exact steps that you were trying, otherwise we are left to guess. I tried your example and it worked, but then I probably did it in a slightly different way. I think the likeliest problem is that you are creating the percolator queries before you index a document or setup

Re: Elastic search ignoring refresh_interval setting from elasticsearch.yml

2014-04-24 Thread Clinton Gormley
Hi Arjit When you set it via the config file, it isn't included in the output of GET /_settings, but it is being honoured. That said, I much prefer using the API for these things instead of setting them in the config file clint On 24 April 2014 14:27, Kartavya wrote: > You can check your suc

Re: array of strings vs string

2014-04-20 Thread Clinton Gormley
Also have a read about position_offset_gap: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/_multi_value_fields_2.html On 17 April 2014 14:42, Aleh Aleshka wrote: > Thanks! > > > On Wednesday, April 16, 2014 6:39:00 PM UTC+3, vineeth mohan wrote: > >> Hello Aleh , >> >> Both s

Re: MVEL scripting to return 0 or 1 for a boolean

2014-04-20 Thread Clinton Gormley
On 19 April 2014 20:57, Shane Neeley wrote: > "script": "log(_score * (doc['field1'].value == doc['field2'].value)" "script": "log(_score * ((doc['field1'].value == doc['field2'].value) ? 1 : 0))" -- You received this message because you are subscribed to the Google Groups "elasticsearch"

Re: Testing for an Empty String

2014-04-19 Thread Clinton Gormley
Hi Paul You need to use a "missing" filter: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-missing-filter.html#query-dsl-missing-filter You can read the section of the guide about "Dealing with null values" here: http://www.elasticsearch.org/guide/en/elasticsearch

Re: Splunk vs. Elastic search performance?

2014-04-19 Thread Clinton Gormley
Goldman Sachs gave a talk about how they're using Elasticsearch to index 5TB of log data per day. I can't find the video of the talk, but from a blogpost about it: Next was Indy Tharmakumar from our hosts Goldman Sachs, showing how his team have built powerful support

Re: Is ElasticSearch the Right Tool for This

2014-04-18 Thread Clinton Gormley
Hiya It's a bit more verbose, but yes you can do queries like that easily. I've assumed that all of your fields are "exact value" not_analyzed string fields, rather than full text fields: GET /_search { "_source": [ "col1", "col2" ], "query": { "filtered": { "filter": { "bo

Spam emails in the mailing list

2014-03-31 Thread Clinton Gormley
Hi all Recently we've had a number of spam emails sent to the mailing list, which Google hasn't caught. We apologise for this, but we ask you to just ignore them. As soon as we spot the ones that make it through Google's net, we delete and ban the user. Unfortunately, the only other option would

Re: OpenSearch for elasticsearch.org docs?

2014-03-19 Thread Clinton Gormley
Hmm I tried this on google.com and cnn.com on Chrome and Safari and didn't see what you describe at all. Tab just selected the first option from the list of suggestions under the URL field. What am I missing? On 18 March 2014 19:25, Ivan Brusic wrote: > That is not how OpenSearch works. With Op

Re: Mechanism of internal search with multiple indices

2014-03-18 Thread Clinton Gormley
If the field doesn't exist in the mapping, then the index is not searched. clint On 18 March 2014 09:56, golchhamohit wrote: > Thanks for explaining clearly how the query modifies itself when a query is > given with multiple indices and types. > > My another doubt is that which I represent her

Re: Delete by query fails often with HTTP 503

2014-03-18 Thread Clinton Gormley
Do you have lots of shards on just a few nodes? Delete by query is handled by the `index` thread pool, but those threads are shared across all shards on a node. Delete by query can produce a large number of changes, which can fill up the thread pool queue and result in rejections. You can either

Re: integration test issues with elasticsearch

2014-03-18 Thread Clinton Gormley
Please could you open an issue and try to provide the steps to reproduce this issue. thanks clint -- You received this message because you are subscribed to the Google Groups "elasticsearch" group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearc

Re: Using machine learning and TF-IDF for record linkage, fuzzy grouping, and deduplication?

2014-03-17 Thread Clinton Gormley
I'd start with the more_like_this query and see how far that takes you. clint On 17 March 2014 18:28, Shrin King wrote: > Given a new big department merged from three departments. A few employees > worked for two or three departments before merging. That means, the > attributes of one person m

Re: Corrupted ElasticSearch index ?

2014-03-17 Thread Clinton Gormley
zzorama a écrit : >> >> No, when things are running everything is ok, indexes break during >> restart/powerdown >> 17-03-2014 13:11, "Clinton Gormley" napisał(a): >> >>> Are you sure you didn't run out of disk space or file handles at some &g

Re: Timeouts on Node Stats API?

2014-03-17 Thread Clinton Gormley
Good to hear! Thanks for reporting back On 17 March 2014 15:04, Xiao Yu wrote: > I still don't have any definitive logs or traces that point to the exact > cause of this situation but it appears to be some weird scheduling bug with > hyper threading. Our nodes are running on OpenJDK 7u25 with h

Re: elasticsearch memory usage

2014-03-17 Thread Clinton Gormley
On 17 March 2014 09:18, Amit Soni wrote: > Hello team - I see the recommendation here in this thread to use JDK 1.7 > update 25. However in the > websiteupdate > 51 is recommended. > Hi Amit You should use u

Re: OutOfMemoryError: Direct buffer memory

2014-03-17 Thread Clinton Gormley
Are you sending an enormous bulk indexing request? If so, try to send fewer docs at a time, eg 1,000 On 17 March 2014 10:39, Daniel Guo wrote: > I use elasticsearch as an index server. And I deploy a web project to > create index and search result from my es server. > > I got the following err
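One way to follow that advice is to split the documents into fixed-size batches on the client. The sketch below uses Python for illustration; `send_bulk` is a hypothetical callable standing in for whatever function actually posts one bulk request, and the batch size of 1,000 is the figure suggested above:

```python
from itertools import islice


def batches(docs, size=1000):
    """Yield successive lists of at most `size` documents."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def bulk_index(docs, send_bulk, size=1000):
    """Send `docs` in batches; `send_bulk` is a hypothetical callable
    that posts one bulk request containing the given documents."""
    sent = 0
    for chunk in batches(docs, size):
        send_bulk(chunk)
        sent += len(chunk)
    return sent
```

Because `batches` works on any iterable, the full document set never has to be held in memory at once on the client either.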

Re: bool query with filter giving error

2014-03-17 Thread Clinton Gormley
Try this instead: { "query": { "filtered": { "query": { "bool": { "should": [ { "match": { "NODE_HOSTNAME": "cloudserver.aricent.com" } }, { "match": { "NODE

Re: Mechanism of internal search with multiple indices

2014-03-17 Thread Clinton Gormley
On 17 March 2014 13:26, Clinton Gormley wrote: > "query": { > "dis_max": { > "queries": [ > { > "filtered": { > "query": { "match": {"field_f

Re: Mechanism of internal search with multiple indices

2014-03-17 Thread Clinton Gormley
Hi Mohit All documents stored in a single index are stored at the same "level", regardless of their type. The "_type" is just a hidden field in each document. So if you do a search like: GET /index_one,index_two/_search { "query": { "match": { "field_foo": "some search terms" }}} then

Re: Corrupted ElasticSearch index ?

2014-03-17 Thread Clinton Gormley
Are you sure you didn't run out of disk space or file handles at some stage, or have an OOM exception? On 16 March 2014 16:37, bizzorama wrote: > Hi, > > it turned out that it was not a problem of ES version (we tested on both > 0.90.10 and 0.90.9) but just a ES bug ... > after restarting pc or

Re: Creating dynamic fields from a field

2014-03-15 Thread Clinton Gormley
To add to what Binh said, you really shouldn't add field names like this: On 14 March 2014 21:20, Pablo Musa wrote: > { > "title":"The greatest band ever - Urban Legion", > "greatest_x" : 1, > "band_x" : 1, > "ever_x" : 1, > "Urban_x": 1, > "Legion_x" : 1, > "greatest_y" : [], > "

Re: Timeouts on Node Stats API?

2014-03-14 Thread Clinton Gormley
Anything in the logs or slow logs? You're sure slow GCs aren't impacting performance? On 14 March 2014 15:17, Xiao Yu wrote: > > Can you do a hot_threads while this is happening? >> > > Just for good measure I also checked hot threads for blocking and waiting, > nothing interesting there eithe

Re: bool query with filter giving error

2014-03-14 Thread Clinton Gormley
You need to pass the search request a "query", so just change the above to: GET /_search { "query": { "filtered": }, from: 0, size: 3 ...} On 14 March 2014 14:55, Subhadip Bagui wrote: > Hi, > > I'm trying to run the below bool query with filter range to fetch all the > node data with C

Re: max_score is not coming for query with filter search

2014-03-14 Thread Clinton Gormley
My first question is: why do you want the score? The score is used only for sorting, and you're sorting on NODE_ID. If you really want it (and there is a cost to computing the score) then you can set track_scores to true. http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search

Re: Low priority queries or query throttling?

2014-03-14 Thread Clinton Gormley
Adding to what Zach said, I'd also be interested in looking at what causes these queries to be so slow. Potentially their performance could be greatly improved. clint On 14 March 2014 01:29, Zachary Tong wrote: > What's the nature of the queries? There may be some optimizations that > can be

Re: boosting result documents that have a certain key-value combination in a nested structure

2014-03-13 Thread Clinton Gormley
On 13 March 2014 11:35, joergpra...@gmail.com wrote: > Have you tried term boosting? > > > http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_boosting > This won't help him boost by the value in the value field associated with each color=blue

Re: Node not joining cluster on boot

2014-03-13 Thread Clinton Gormley
Can you telnet from each box to port 9300 on the other box? Does your bridge support multicast? If not, you could use unicast instead. clint On 13 March 2014 10:31, Guillaume Loetscher wrote: > Sure > > Node # 1: > root@es_node1:~# grep -E '^[^#]' /etc/elasticsearch/elasticsearch.yml > clus

Re: Auto-created spam index

2014-03-13 Thread Clinton Gormley
On 13 March 2014 05:15, Ivan Brusic wrote: > That said, your Elasticsearch server is still accessible to anyone over > the internet. I Or somebody on your network is infected with a bot. -- You received this message because you are subscribed to the Google Groups "elasticsearch" group. To un

Re: boosting result documents that have a certain key-value combination in a nested structure

2014-03-13 Thread Clinton Gormley
On 12 March 2014 23:32, Michael Schlenzka wrote: > I do not want the sum of all the values of the key-value-pairs. I want to > boost each document (with a specific key) only with the value for the > matching key/color (e.g. if searching for documents with blue as color each > document should be b

Re: precise field matching without defining a not_analyzed extra field possible?

2014-03-13 Thread Clinton Gormley
> Appreciate that Clint. But I was asking whether I could do without having > to modify mappings - see ref to another post seemingly alluding to that That post refers to using the keyword_repeat token filter to index stemmed and unstemmed tokens in the same positions. It won't work for your use c

Re: Source filtering of nested object data

2014-03-13 Thread Clinton Gormley
On 12 March 2014 21:55, Ben Hirsch wrote: > I will know the 5-10 id's needed to be fetched at run-time. With > script_fields how would I access the children with those specific id's? > With script fields, you have access to the whole _source field, so you would need to write a script to step thr

Re: Calculating rolling average using aggregations

2014-03-13 Thread Clinton Gormley
I rethought this problem last night. The solutions I've presented already are a lot less efficient than they could be, as they increase the work per doc by a factor of the number of buckets (ie 24h * 28d = 672). It'd be much more efficient to calculate this rolling average client side in a single
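A client-side calculation along those lines might look like the following Python sketch. It assumes the hourly values have already been fetched (for example, one value per hourly bucket) and are in chronological order; the function name is made up for this example. A running sum keeps the work per bucket constant, regardless of the window size of 24h * 28d = 672:

```python
from collections import deque


def rolling_average(values, window):
    """Trailing mean over the last `window` values, one result per input.

    Early results average over fewer values until the window has filled.
    """
    buf = deque()
    total = 0.0
    out = []
    for v in values:
        buf.append(v)
        total += v
        if len(buf) > window:
            # Drop the oldest value so the sum covers `window` items only.
            total -= buf.popleft()
        out.append(total / len(buf))
    return out


HOURS_IN_28_DAYS = 24 * 28  # 672 hourly buckets, as in the message above
```

Called as `rolling_average(hourly_values, HOURS_IN_28_DAYS)`, this makes a single pass over the buckets instead of recomputing each 28-day window from scratch.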

Re: precise field matching without defining a not_analyzed extra field possible?

2014-03-12 Thread Clinton Gormley
You're almost there with: On 12 March 2014 21:06, Nikita Tovstoles wrote: > { > "user": { > "properties": { > "name": { > "type": "string" > }, > "name.raw": { > "type" : "string", > "index": "not_analyzed" > } > } > } > } > Instea

Re: dynamically add/delete clusters on tribe node

2014-03-12 Thread Clinton Gormley
Not currently, no. Open an issue if you'd like to see it supported clint On 12 March 2014 21:00, cleesmith wrote: > Is it possible to dynamically add or delete clusters on a tribe node ? > ... i.e. without editing elasticsearch.yml and then having to stop/start > the tribe node > > -- > You re

Re: boosting result documents that have a certain key-value combination in a nested structure

2014-03-12 Thread Clinton Gormley
Hiya Michael The problem with your current attempts is that the nested filter matches on the color "blue", but then returns the PARENT document, so when you try to access the colors.value field it is not available. Instead, you need to run a nested filter, and return the sum of the colors.value f

Re: Issue with bettermap / kibana

2014-03-12 Thread Clinton Gormley
Hi Romain This issue is fixed in master. cloudmade turned off public access, so we have switched to the mapquest servers. Clint On 12 March 2014 17:52, Romain NIO wrote: > Hi, > > I'm facing some issues with the plugin "bettermap" in Kibana. Kibana is > not able to load the background of the

Re: are tokens produced by an analyzer == terms?

2014-03-12 Thread Clinton Gormley
Yes. Tokens and terms are used as synonyms, although officially there is a difference: http://nlp.stanford.edu/IR-book/html/htmledition/tokenization-1.html But for our purposes, wherever you read "token" think "term" On 12 March 2014 20:40, Nikita Tovstoles wrote: > Reference says that an Anal

Re: What to do if ES document count continuously increases - with zero indexing ongoing

2014-03-12 Thread Clinton Gormley
On 12 March 2014 18:28, Swaroop CH wrote: > Thanks for the reply. I did look at the docs, I tried setting > `marvel.agent.enabled: false` via the API, but ES logs an error saying it > is "not dynamically updateable", and so on. > Yes, you have to set it in the config file (at least for now) --

Re: Source filtering of nested object data

2014-03-12 Thread Clinton Gormley
You could use script_fields to generate the values you want: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-script-fields.html may be a bit tricky though :) On 12 March 2014 16:59, Ben Hirsch wrote: > Is it possible to filter the _source of nested data con

Re: Calculating rolling average using aggregations

2014-03-12 Thread Clinton Gormley
more. See the demo here: https://gist.github.com/clintongormley/9515005 clint On 12 March 2014 16:29, Clinton Gormley wrote: > Heya Bihn > > The part I'm not getting is this: "the rolling average for every hour in > the last 28 days". ie what period should each buc

Re: Calculating rolling average using aggregations

2014-03-12 Thread Clinton Gormley
Heya Bihn The part I'm not getting is this: "the rolling average for every hour in the last 28 days". ie what period should each bucket/rolling avg cover? an hour? 28 days? You can still do rolling averages with aggregations, but they require a bit more work. I wanted to get the exact specs befor

Re: Calculating rolling average using aggregations

2014-03-12 Thread Clinton Gormley
ast 28 days. What I am looking for more precisely is to calculate a 28 > rolling (or moving average) using the last 28 days of data and redoing that > calculation every hour. > > I suppose what could be done is to do an average on a field that is a > hourly count. > > > > On

Re: max_score anyone?

2014-03-12 Thread Clinton Gormley
Hi Yuri First, here's a query that will work for you: https://gist.github.com/clintongormley/9505141 I'm using a has_parent query to sum up all the values of the revenue field, then using a function_score query to ensure that those values fall within a range. However, your data model feels like

Re: What to do if ES document count continuously increases - with zero indexing ongoing

2014-03-12 Thread Clinton Gormley
Have you looked at the docs? http://www.elasticsearch.org/guide/en/marvel/current/index.html#configuration On 12 March 2014 06:42, Swaroop CH wrote: > The source of the problem is Marvel - is there anyway to disable Marvel > indexing? > > Trying to set `marvel.agent.indices: "-*"` says "ignori

Re: Lucene query parser to ES Java API

2014-03-12 Thread Clinton Gormley
You may want to look at using the simple query string query instead: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-simple-query-string-query.html It has the benefit of not throwing syntax errors, but trying to do the right thing. On 12 March 2014 01:53, Ivan Bru

Re: Term query to get children based on parent id

2014-03-12 Thread Clinton Gormley
Hi Rukshan Very nicely laid out question. Thanks for providing all of the steps. I agree that it doesn't work and (at least according to the docs) it should, so I've opened an issue here: https://github.com/elasticsearch/elasticsearch/issues/5399 Curiously, performing the same lookup using a `ha

Re: DateRange aggregation semantics - include_lower/include_upper?

2014-03-11 Thread Clinton Gormley
On 7 March 2014 12:46, mooky wrote: > So the previous, current and next period-end dates are: > 2014-02-19, 2014-03-19 & 2014-04-16. > I define the ranges therefore as: > Overdue: date < *2014-02-19* > March: 2014-02-20 < date < *2014-03-19* > April: 2014-03-20 < date < *2014-04-16* > Actually,

Re: Dynamic template (key/value mapping)

2014-03-11 Thread Clinton Gormley
No. You'd have to change the document before indexing it. On 10 March 2014 08:14, Michael Gulliksen wrote: > If i have a document like this: > > document: > { >id:"11513", >title:"this is the title" >sections: > [ > { > value:"This is the headline", >

Re: multichar delimiter in path_hierarchy tokenizer

2014-03-11 Thread Clinton Gormley
You could fake it using the pattern_replace character filter: curl -XPUT "http://localhost:9200/myindex" -d' { "settings": { "analysis": { "analyzer": { "arrow": { "tokenizer": "path_hierarchy", "char_filter": [ "arrow_to_slash" ]

Re: Calculating rolling average using aggregations

2014-03-11 Thread Clinton Gormley
Yes, easily. Aggregations are really powerful. Here's an example: # First insert some data curl -XPUT "http://localhost:9200/myindex/mytype/1" -d' { "created": "2014/03/10 12:05:00", "somefield": 10 }' curl -XPUT "http://localhost:9200/myindex/mytype/2" -d' { "created": "2014/03/10 12:0

Re: help with setting up mappings (Perl)

2014-03-10 Thread Clinton Gormley
Hi Dom You need to consult the Elasticsearch documentation for mapping. See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#string What you're looking for is: properties => { somefield => { type => 'string',

Re: Can a replica be updated with the deltas only?

2014-03-10 Thread Clinton Gormley
If a replica recovers from the primary, then the node hosting it is shut down shortly thereafter, when it comes back up it will only copy the segments that have changed in the interim period. However, merges happen independently on the primary and the replica. When a replica has been running for a

Re: help with setting up mappings (Perl)

2014-03-10 Thread Clinton Gormley
Hi Dom First, make sure you're using the new Search::Elasticsearch client https://metacpan.org/pod/Search::Elasticsearch - we've just renamed it to avoid namespace clashes with older clients. Then: to configure the mapping yourself, you need to do it before you index any data (using the bulk meth

Re: Error "array index out of bounds java.lang.OutOfMemoryError: Java heap space"

2014-03-10 Thread Clinton Gormley
Also, are you using Ubuntu 10.04? I see you have slow young generation GC, and that version of Ubuntu had a bug in that area. On 10 March 2014 10:24, prashy wrote: > Hi gkwelding, > > I have checked explicitly on my box and the value for MAX_OPEN_FILES and > MAX_MAP_COUNT has been set to 65535

Re: DateRange aggregation semantics - include_lower/include_upper?

2014-03-05 Thread Clinton Gormley
In filters we prefer using gt/gte/lt/lte because there you can be specific about including or excluding, but for aggregations you want "to" to exclude otherwise you end up with overlapping ranges. What use case do you have where you want to change that? clint On 5 March 2014 14:26, mooky wrote
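To illustrate why an exclusive "to" keeps ranges from overlapping, here is a minimal Python sketch of half-open buckets, where "from" is inclusive and "to" is exclusive. The boundary dates and the `bucket_for` helper are made up for illustration; this is not how Elasticsearch implements the aggregation.

```python
from datetime import date

# Hypothetical bucket boundaries, chosen only for illustration.
# Each entry is (name, from, to); None means "unbounded on that side".
RANGES = [
    ("overdue", None, date(2014, 2, 20)),
    ("march", date(2014, 2, 20), date(2014, 3, 20)),
    ("april", date(2014, 3, 20), date(2014, 4, 17)),
]


def bucket_for(d):
    """Return the name of the half-open range [from, to) that contains d."""
    for name, lower, upper in RANGES:
        above_lower = lower is None or d >= lower  # "from" is inclusive
        below_upper = upper is None or d < upper   # "to" is exclusive
        if above_lower and below_upper:
            return name
    return None  # d falls outside every range
```

Because one range's "to" is the next range's "from", a date that sits exactly on a boundary lands in exactly one bucket. With inclusive upper bounds it would be counted twice.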

Re: elasticsearch suggest by middle words by making preserve_position_increments: false

2014-03-05 Thread Clinton Gormley
You're using the "simple" analyzer at index time, which means that it is indexing ["the","beatles"]. If you change it to use the "stop" analyzer at both search and index time, then it should work. clint On 4 March 2014 15:38, Shams Haque wrote: > Hi, > > I am trying to implement middle word se
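As a rough illustration of the difference, the Python sketch below mimics the two analyzers. It is only an approximation: it handles ASCII letters only, the stop word set is a small made-up subset rather than the real list, and the function names are hypothetical.

```python
import re

# Small illustrative subset of English stop words; the real list is longer.
STOP_WORDS = {"a", "an", "and", "of", "the", "to"}


def simple_analyze(text):
    """Approximate the "simple" analyzer: lowercase, split on non-letters."""
    return [tok for tok in re.split(r"[^a-z]+", text.lower()) if tok]


def stop_analyze(text):
    """Approximate the "stop" analyzer: simple analysis minus stop words."""
    return [tok for tok in simple_analyze(text) if tok not in STOP_WORDS]
```

Under these assumptions, `simple_analyze("The Beatles")` keeps the token "the", while `stop_analyze("The Beatles")` leaves only "beatles".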

Re: multi_match boolean across fields

2014-03-05 Thread Clinton Gormley
On 5 March 2014 00:24, Thibaut wrote: > Is it possible to keep the boosting applied to the individual fields when > computing the score ? > No. Field-level index time boosts will not be preserved with copy_to. Coming very soon in 1.1.0 is the `cross_fields` type of multi_match query, which is d

Re: River documentation vanished?

2014-03-05 Thread Clinton Gormley
On 4 March 2014 10:27, Lukáš Vlček wrote: > we try to keep documentation for our rivers updated but I noticed that the > general river documentation page [1] is probably no longer available on > Elasticsearch.org site? > Is this intentional? I can see twitter river is still pointing to it as > wel

Re: Error "array index out of bounds java.lang.OutOfMemoryError: Java heap space"

2014-03-05 Thread Clinton Gormley
On 5 March 2014 06:35, prashy wrote: > I am using the bulk API for indexing the data. How much data are you indexing in a single request? If you send all the docs at once, you will run out of memory. The idea is to have bulk requests of e.g. 1,000-5,000 docs at a time (but it depends how big your do
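A minimal Python sketch of the batching idea, assuming the documents come from any iterable. The `batches` helper and the default size of 1000 are illustrative choices. Sending each batch is left out, since that depends on the client you use.

```python
from itertools import islice


def batches(docs, size=1000):
    """Yield successive lists of at most `size` documents from any iterable."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))  # pull the next `size` items, or fewer
        if not chunk:
            return  # the iterable is exhausted
        yield chunk
```

Each yielded list would then go out as one bulk request, so only one batch needs to be held in memory at a time. The right size depends on how large the documents are.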

Re: node not allowed to joined cluster?

2014-03-05 Thread Clinton Gormley
On 5 March 2014 08:30, Lukáš Vlček wrote: > You can provide a list of nodes (IP addresses or DNS names) that are > allowed to join the cluster. Other nodes will not be allowed. Where? I'm not aware of this option? -- You received this message because you are subscribed to the Google Groups "

Re: sending failed shard error

2014-03-05 Thread Clinton Gormley
On 5 March 2014 12:47, Hari Prasad wrote: > My clusters have different names, also each of their multicast is false. > There is another node somewhere that is interfering. Have a look in the logs, or the output from: curl 'localhost:9200/_nodes?all&pretty' clint
