inconsistent paging

2014-08-18 Thread Ron Sher
Hi, We've noticed a strange behavior in elasticsearch during paging. In one case we use a paging size of 60 and we have 63 documents. So the first page is using size 60 and offset 0. The second page is using size 60 and offset 60. What we see is that the result is inconsistent. Meaning, on the

Re: inconsistent paging

2014-08-18 Thread David Pilato
You need to use scroll if you have that requirement. See: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-scroll.html#search-request-scroll -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 18 August 2014 at 08:02, Ron Sher

Re: inconsistent paging

2014-08-18 Thread Adrien Grand
Hi Ron, The cause of this issue is that Elasticsearch uses Lucene's internal doc IDs as tie-breakers. Internal doc IDs might be completely different across replicas of the same data, so this explains why documents that have the same sort values are not consistently ordered. There are 2 potential
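To picture the effect described above, here is a small Python simulation. It is only a sketch under assumptions: the two lists stand in for two replicas whose internal doc IDs differ, the data is made up, and the unique-field tie-breaker shown is a commonly used workaround, not necessarily one of the two options the message goes on to list.

```python
def page(docs, sort_key, offset, size):
    """Return one page of docs after sorting with the given key."""
    return sorted(docs, key=sort_key)[offset:offset + size]

# 63 hypothetical documents that all share the same sort value.
docs = [{"id": i, "score": 1.0} for i in range(63)]

# Two replicas hold the same documents in a different internal order,
# standing in for differing Lucene doc IDs.
replica_a = list(docs)
replica_b = list(reversed(docs))

# Sorting on the tied value alone keeps each replica's internal order
# (Python's sort is stable), so pages served by different replicas overlap.
def by_score(d):
    return d["score"]

first = page(replica_a, by_score, 0, 60)
second = page(replica_b, by_score, 60, 60)
seen = {d["id"] for d in first} | {d["id"] for d in second}
missing_without_tiebreak = len(docs) - len(seen)

# Adding a unique field as a secondary key makes the order deterministic.
def by_score_then_id(d):
    return (d["score"], d["id"])

first = page(replica_a, by_score_then_id, 0, 60)
second = page(replica_b, by_score_then_id, 60, 60)
seen = {d["id"] for d in first} | {d["id"] for d in second}
missing_with_tiebreak = len(docs) - len(seen)

print(missing_without_tiebreak, missing_with_tiebreak)
```

With the tied sort, the second page repeats documents already seen on the first one and some documents never show up; with the secondary key, both pages together cover all 63 documents.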

Re: Help with the percentiles aggregation

2014-08-18 Thread Adrien Grand
Hi John, You should be able to do something like: { aggs: { verb: { terms: { field: verb }, aggs: { load_time_outliers: { percentiles: { field: responsetime } } } } } } This will first break down your
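The request body quoted above lost its quotes in the archive. A minimal sketch of the same structure, built as a Python dict and serialized to JSON, could look like this. The helper name `percentiles_by_term` is made up for illustration; the field names `verb` and `responsetime` and the aggregation name `load_time_outliers` come from the message.

```python
import json

def percentiles_by_term(term_field, metric_field):
    """Build a terms aggregation with a percentiles sub-aggregation."""
    return {
        "aggs": {
            term_field: {
                "terms": {"field": term_field},
                "aggs": {
                    "load_time_outliers": {
                        "percentiles": {"field": metric_field}
                    }
                },
            }
        }
    }

body = percentiles_by_term("verb", "responsetime")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a search request; how it is sent (curl, a client library) is left open here.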

Re: impact of stored fields on performance

2014-08-18 Thread Adrien Grand
Hi Ashish, On Thu, Aug 14, 2014 at 12:35 AM, Ashish Mishra laughingbud...@gmail.com wrote: That sounds possible. We are using spindle disks. I have ~36Gb free for the filesystem cache, and the previous data size (without the added field) was 60-65Gb per node. So it's likely that 50% of

Re: accessing field data faster in script

2014-08-18 Thread Adrien Grand
Script filters are inherently slow due to the fact that they cannot leverage the inverted index in order to skip efficiently over non-matching documents. Even if they were written in assembly, this would likely still be slow. What kind of filtering are you trying to do with scripts? On Thu, Aug

Re: Return selected fields from aggregation?

2014-08-18 Thread Adrien Grand
Can you elaborate more on what you are after? On Wed, Aug 13, 2014 at 5:16 PM, project2501 darreng5...@gmail.com wrote: The old facet DSL was very nice and easy to understand. I could declare only which fields I wanted returned. how is this done with aggregations? The docs do not say. I

inconsistent paging

2014-08-18 Thread ronsher
We've noticed a strange behavior in elasticsearch during paging. In one case we use a paging size of 60 and we have 63 documents. So the first page is using size 60 and offset 0. The second page is using size 60 and offset 60. What we see is that the result is inconsistent. Meaning, on the 2nd

Re: Access to AbstractAggregationBuilder.name

2014-08-18 Thread Adrien Grand
Hi Phil, We would indeed consider a PR for that change if it makes things easier for you. Feel free to ping me when you open it so that I don't miss it. On Wed, Aug 13, 2014 at 3:55 PM, Phil Wills otherp...@gmail.com wrote: Hello, In the Java API AbstractAggregationBuilder's name property is

Re: inconsistent paging

2014-08-18 Thread vineeth mohan
You have asked the same question from another Gmail ID. Please refer to the answers over there. Thanks Vineeth On Mon, Aug 18, 2014 at 10:08 AM, ronsher rons...@gmail.com wrote: We've noticed a strange behavior in elasticsearch during paging. In one case we use a paging size of

Re: accessing field data faster in script

2014-08-18 Thread avacados
Thanks Adrien for reply. My script filter was, === { script: { script: xyz, params: { startRange: 1407939675, // Timestamp in milliseconds ... keep changing on all queries

Using a char_filter in combination with a lowercase filter

2014-08-18 Thread Matthias Hogerheijde
Hi, We're using Elasticsearch with an Analyzer to map the `y` character to `ij` (*char_filter* named char_mapper), since in Dutch these two are somewhat interchangeable. We're also using a *lowercase filter*. This is the configuration: { analysis: { analyzer: { index: {

Re: accessing field data faster in script

2014-08-18 Thread Adrien Grand
Your filter would be faster if you used range filters on the start/end dates instead of using a script. On Mon, Aug 18, 2014 at 10:52 AM, avacados kotadia.ak...@gmail.com wrote: _cache: true // I removed this caching and i found significant performance improvement...
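As a rough illustration of the suggestion above, the sketch below builds a filtered query with two range filters in Python. The script `xyz` is not shown in the thread, so the field names `start_date` and `end_date` and the "timestamp falls between start and end" semantics are assumptions; only the timestamp value is taken from the earlier message.

```python
import json

def active_at(timestamp, start_field="start_date", end_field="end_date"):
    """Match documents whose [start, end] interval contains `timestamp`.

    The field names are hypothetical placeholders.
    """
    return {
        "query": {
            "filtered": {
                "filter": {
                    "bool": {
                        "must": [
                            {"range": {start_field: {"lte": timestamp}}},
                            {"range": {end_field: {"gte": timestamp}}},
                        ]
                    }
                }
            }
        }
    }

body = active_at(1407939675)
print(json.dumps(body, indent=2))
```

Range filters can use the index to skip non-matching documents, which is the point made in the reply.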

Excheption in suggester responce

2014-08-18 Thread makr
Hi! I am trying to test the Elasticsearch suggester, but I got a strange error. user@user:/user/esconfig # curl -X POST 'localhost:9200/dwh_direct/_suggest?pretty' -d @suggester { _shards : { total : 5, successful : 0, failed : 5, failures : [ { index : dwh_direct, shard : 0,

Enhancing perf for my cluster

2014-08-18 Thread Pierrick Boutruche
Hi everyone ! I'm currently working on a tool with *ES and Twitter Streaming API*, in which I try to find interesting profiles on Twitter, based on what they tweet, RT and which of their interactions are shared/RT. Anyway, I use ES to index and search among tweets. To do that, I get Twitter

elasticsearch php api with multiple hosts

2014-08-18 Thread Niv Penso
I followed this link to create a 2-node Elasticsearch cluster on Azure: http://thomasardal.com/running-elasticsearch-in-a-cluster-on-azure/ The installation and configuration went well. When I started to check the cluster I found strange behaviour from the PHP client. I declared

How to update nest from 0.12 to 1.0

2014-08-18 Thread Dmitriy Bashkalin
Hello. Does anyone use NEST for .NET? Please help me. Some time ago I asked how to get part of a text field. I wanted to do it with the Highlight param no_match_size, but it's only supported since NEST version 1.0RC1. After updating nest.dll from 0.12 to 1.0, nothing works. Looking GitHub

Re: A few questions about node types + usage

2014-08-18 Thread Alex
Hello again Mark, Thanks for your response. Your answers really are very helpful. As with our previous conversation https://groups.google.com/d/topic/elasticsearch/ZouS4NVsTJw/discussion I am confused about how to make a client node also be master eligible. This is what I posted there, I

river-csv plugin

2014-08-18 Thread HansPeterSloot
Hi, This is for elasticsearch : elasticsearch-1.3.2-1.noarch There are 2 nodes in the cluster. I have installed the river-csv plugin. When loading a file with 5 million rows loading stops after 477400 rows. I load with : curl -XPUT localhost:9200/_river/my_csv_river/_meta -d ' { type :

Re: Excheption in suggester responce

2014-08-18 Thread vineeth mohan
Hello Maxim , Can you show the schema and a sample data that you have indexed. Thanks Vineeth On Mon, Aug 18, 2014 at 3:31 PM, m...@ciklum.com wrote: Hi! I try test elasticsearch suggester, but i got strange error. user@user:/user/esconfig # curl -X POST

Re: EsRejectedExecutionException: rejected execution (queue capacity 1000)

2014-08-18 Thread Sávio S . Teles de Oliveira
You can put *threadpool.search.type: cached* in elasticsearch.yml for an unbounded queue for reads. 2014-08-10 9:52 GMT-03:00 James digital...@gmail.com: On Sat, 2014-08-09 at 23:53 -0700, Deep wrote: Hi, Elastic search internally has thread pool and a queue size is associated with

Re: Query problem

2014-08-18 Thread Luc Evers
David hi, How can I configure the mapping so that the default analyzer will be the whitespace one? On Wed, Aug 13, 2014 at 2:46 PM, David Pilato da...@pilato.fr wrote: Having no answer is not good. I think something goes wrong here. May be you should see something in logs. That

ThreadPool reject_policy

2014-08-18 Thread Sávio S . Teles de Oliveira
How does the threadpool work when using reject_policy *caller*? Can I catch the exception EsRejectedExecutionException (using the Java API) during heavy writes? -- Atenciosamente, Sávio S. Teles de Oliveira voice: +55 62 9136 6996 http://br.linkedin.com/in/savioteles Mestrando em Ciências da

Re: Query problem

2014-08-18 Thread David Pilato
I think this could help you: http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/custom-dynamic-mapping.html -- David Pilato | Technical Advocate | Elasticsearch.com @dadoonet | @elasticsearchfr On 18 August 2014 at 15:39:36, Luc Evers (lucev...@gmail.com) wrote: David hi, How

How to normalize score when combining regular query and function_score?

2014-08-18 Thread JohnnyM
First of all kudos on the awesome job everyone here is doing! I was wondering if you guys can help me solve this puzzle: Also available on stack overflow: http://stackoverflow.com/questions/25361795/elasticsearch-how-to-normalize-score-when-combining-regular-query-and-function Ideally what I

help with a grok filter

2014-08-18 Thread Kevin M
Could someone help me write a grok filter for this log real quick here is what the log looks like: Aug 18 09:40:39 server01 webmin_log: 172.16.16.96 - username *[18/Aug/2014:09:40:39 -0400]* GET /right.cgi?open=systemopen=status HTTP/1.1 200 3228 here is what I have so far: match = [

Re: Help with the percentiles aggregation

2014-08-18 Thread John Ogden
That's spot on. Thanks! On 18 Aug 2014 09:08, Adrien Grand adrien.gr...@elasticsearch.com wrote: Hi John, You should be able to do something like: { aggs: { verb: { terms: { field: verb }, aggs: { load_time_outliers: { percentiles: {

ES ignores index.query.bool.max_clause_count in elasticsearch.yml

2014-08-18 Thread l . daedelow
It seems to me that ES ignores the index.query.bool.max_clause_count argument in elasticsearch.yml Setting index.query.bool.max_clause_count: 5000 results in the following error: Caused by: org.apache.lucene.search.BooleanQuery$TooManyClauses: maxClauseCount is set to 1024 Any solution whats

Re: Help with the percentiles aggregation

2014-08-18 Thread John Ogden
Slight follow on - do you know if returning this sort of stuff via Kibana is on the cards? Just looking for an easy way to graph the results. Thanks. On Friday, 15 August 2014 10:23:16 UTC+1, John Ogden wrote: Hi, Am trying to run a single command which calculates percentiles for

Help with multiple data ranges in a single query

2014-08-18 Thread John Ogden
I've been given a requirement to produce a single kibana dashboard showing app response times for multiple date ranges, and am stumped at how to proceed. The user wants to see today's graph, along with the previous working day, day -7, day -28 and day -364 on the same screen - ideally, all 4

Re: Help with the percentiles aggregation

2014-08-18 Thread Adrien Grand
Support for aggregations is indeed something that is on the roadmap for the next version of Kibana (Kibana 4), see this message from Rashid: https://groups.google.com/forum/?utm_medium=emailutm_source=footer#!msg/elasticsearch/I7um1mX4GSk/aUsT2EmyxysJ On Mon, Aug 18, 2014 at 4:33 PM, John Ogden

Aggregates - include source data

2014-08-18 Thread John D. Ament
Hi, From looking at the docs, didn't seem overly clear. Is it possible to include the data in an aggregate, or is it counts only? John -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe from this group and stop receiving emails

Re: Aggregates - include source data

2014-08-18 Thread Adrien Grand
Aggregations only report counts or various metrics (see the metrics aggregations: stats, min, max, sum, percentiles, cardinality, top_hits, ...). Maybe top_hits is what you are looking for?
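If top_hits is indeed what is wanted, a request body might look like the sketch below, built as a Python dict. The field names `category` and `title` and the aggregation names `by_group` and `sample_docs` are hypothetical, and the body is only an illustration of nesting top_hits under a terms aggregation.

```python
import json

def terms_with_top_hits(group_field, hits_per_bucket, source_fields):
    """Build a terms aggregation that returns a few documents per bucket."""
    return {
        "size": 0,  # skip the regular hits, keep only the aggregation
        "aggs": {
            "by_group": {
                "terms": {"field": group_field},
                "aggs": {
                    "sample_docs": {
                        "top_hits": {
                            "size": hits_per_bucket,
                            "_source": {"include": source_fields},
                        }
                    }
                },
            }
        },
    }

body = terms_with_top_hits("category", 3, ["title"])
print(json.dumps(body, indent=2))
```

Each bucket in the response would then carry up to three source documents next to its count.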

Re: Optimization Questions

2014-08-18 Thread Andrew Selden
Hi Greg, I believe max_num_segments is technically a hint that can be overridden by the merge algorithm if it decides to. You might try simply re-running the optimize again to get from ~25 down closer to 1. Sorry but I don't know of any way to see when the optimize is finished - it's really

Re: Enhancing perf for my cluster

2014-08-18 Thread Pierrick Boutruche
Hey guys, Finally I changed all my queries to constantscorequeries. It's way better, but still, certain pages take a lot of time running... I don't understand why, and I don't have anything in my ES logs... Now the average time for search 20 users and their mentions/timeline + scoring them

indexing problem when using logstash

2014-08-18 Thread vitaly . bulgakov
I am using the following config file filter{ grok{ match=[ message, (?:\?|\)C\=%{DATA:kw}\%{DATA}\sT\s%{DATA:town}\sS\s%{WORD:state}\s%{DATA}%{IP:ip} ] } grok{ match=[

Re: help with a grok filter

2014-08-18 Thread vitaly
On Monday, August 18, 2014 9:57:41 AM UTC-4, Kevin M wrote: Could someone help me write a grok filter for this log real quick here is what the log looks like: Aug 18 09:40:39 server01 webmin_log: 172.16.16.96 - username *[18/Aug/2014:09:40:39 -0400]* GET

Re: help with a grok filter

2014-08-18 Thread Kevin M
I dont see your post - what I am stuck with is whenever the date changes on that log example: *[18/Aug/2014:09:40:39 -0400]* *[20/Aug/2014:11:40:39 -0104]* *[19/Aug/2014:08:40:39 -0500]* the filter will not match it On Monday, August 18, 2014 1:53:37 PM UTC-4, vitaly wrote: On Monday,
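The three timestamps quoted above all share one fixed layout (day/month/year:time, then a numeric offset), so a pattern written against the layout and not against one literal date should match all of them. The Python sketch below illustrates this with a regular expression and `datetime.strptime`. It is not a grok filter, and it assumes English month abbreviations.

```python
import re
from datetime import datetime, timedelta

# Bracketed timestamp: dd/Mon/yyyy:HH:MM:SS +/-zzzz
TIMESTAMP = re.compile(
    r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]"
)

samples = [
    "[18/Aug/2014:09:40:39 -0400]",
    "[20/Aug/2014:11:40:39 -0104]",
    "[19/Aug/2014:08:40:39 -0500]",
]

parsed = []
for line in samples:
    match = TIMESTAMP.search(line)
    if match:
        # %b expects an English month abbreviation in the default locale.
        parsed.append(
            datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        )

for stamp in parsed:
    print(stamp.isoformat())
```

A grok expression built from generic date components would follow the same idea: describe the shape of the field, not one day's value.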

[ANN] Experimental Highlighter 0.0.11 Released

2014-08-18 Thread Nikolas Everett
I released version 0.0.11 of the Experimental Highlighter https://github.com/wikimedia/search-highlighter we've been using. It's compatible with Elasticsearch 1.3.x and has a few new features: 1. Conditional highlighting - skip highlighting fields you aren't going to use! Save time and IO

[ANN] Elasticsearch Mapper Attachment plugin 2.2.1 released

2014-08-18 Thread David Pilato
Heya, We are pleased to announce the release of the Elasticsearch Mapper Attachment plugin, version 2.2.1. The mapper attachments plugin adds the attachment type to Elasticsearch using Apache Tika. Release Notes - Version 2.2.1 Earlier today there was an Apache POI release to address a

[ANN] Elasticsearch Mapper Attachment plugin 2.3.1 released

2014-08-18 Thread David Pilato
Heya, We are pleased to announce the release of the Elasticsearch Mapper Attachment plugin, version 2.3.1. The mapper attachments plugin adds the attachment type to Elasticsearch using Apache Tika. Release Notes - Version 2.3.1 Earlier today there was an Apache POI release to address a

Unassigned Node and shards

2014-08-18 Thread IronMan2014
I saw this problem twice now. I start with a Green two-node cluster, default 5 shards/node, I index about 50,000 docs, shards/replicas look great and well balanced across the 2 nodes. I try the same test with 8 million docs. I come back when it's done, and I see all primary shards on node1 and

[ANN] swift-repository-plugin v0.5 released

2014-08-18 Thread Chad Horohoe
Hi all, Just released to Central the v0.5 of the swift-repository plugin. Mainly contains documentation updates but also built against 1.3.2 instead of 1.1.0. https://github.com/wikimedia/search-repository-swift -Chad

Re: How to increase memory

2014-08-18 Thread joergpra...@gmail.com
What version of ES do you use? Jörg On Mon, Aug 18, 2014 at 9:42 PM, rookie7799 pavelbara...@gmail.com wrote: Hello there, We are having the same exact problem with a really resource hungry query: 5 nodes with 16GB ES_HEAP_SIZE 1.2 Billion records inside 1 index with 5 shards Whenever

Top hits aggregation default sort

2014-08-18 Thread Dan Tuffery
I am using the top hits aggregation with a has_child query. In the top_hits aggregation documentation it says '*By default the hits are sorted by the score of the main query*', but I'm not seeing that in the results for my query { from: 0, size: 3, query: { has_child: {

Re: How to increase memory

2014-08-18 Thread rookie7799
Hi, it's 1.3.2 On Monday, August 18, 2014 5:49:03 PM UTC-4, Jörg Prante wrote: What version of ES do you use? Jörg On Mon, Aug 18, 2014 at 9:42 PM, rookie7799 pavelb...@gmail.com javascript: wrote: Hello there, We are having the same exact problem with a really resource hungry query:

How to safely migrate from one mount to another mount in Elasticsearch to store the data

2014-08-18 Thread shriyansh jain
Hi, I have an Elasticsearch cluster of 2 nodes. I have configured them to store data at the location which is /auto/share. I want to point one of the two nodes in the cluster to some other location to store the data, say /auto/foo. What would be the best way of achieving the above task without

Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data

2014-08-18 Thread Mark Walkom
Do you want to copy the existing data in /auto/share to /auto/foo, or start with no data? Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 19 August 2014 08:23, shriyansh jain shriyanshaj...@gmail.com wrote: Hi,

Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data

2014-08-18 Thread Mark Walkom
If you want no data in /auto/foo then just create the directory, give it the right permissions and then update the config to point to it. It's the same process you did for /auto/share. Do you have replicas set on your indexes? Regards, Mark Walkom Infrastructure Engineer Campaign Monitor

Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data

2014-08-18 Thread shriyansh jain
Yes, I have set *index.number_of_replicas: 1*. If I just point one of the 2 nodes to some other location, won't it lose the data stored by that node? Thank you, Shriyansh On Monday, August 18, 2014 3:34:48 PM UTC-7, Mark Walkom wrote: If you want no data in /auto/foo then just create the

Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data

2014-08-18 Thread shriyansh jain
Yes, I have set *index.number_of_replicas: 1*. If I just point one of the 2 nodes to some other location, won't it lose the data stored by that node? Thank you, Shriyansh On Monday, August 18, 2014 3:34:48 PM UTC-7, Mark Walkom wrote: If you want no data in /auto/foo then just create the

Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data

2014-08-18 Thread Mark Walkom
If you point the instance to a new data location then yes, it will startup with no data, but it won't lose the data completely as it will still be located in your original /auto/share directory. However given you have replicas set what will happen is when the node starts up pointing to the new

Re: how to get char_filter to work?

2014-08-18 Thread Ivan Brusic
Sorry if I have not replied sooner, but I was on vacation. I would use the two fields solution, especially since you simply cannot store a stripped version. The source field is compressed, so the additional index size is content dependent. Never used highlighting, so I cannot recommend

Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data

2014-08-18 Thread Mark Walkom
Why do you want to do this if you are worried about data loss? Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 19 August 2014 11:50, shriyansh jain shriyanshaj...@gmail.com wrote: As you mentioned the node will

Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data

2014-08-18 Thread shriyansh jain
Just to make sure if /auto/share goes down I have data in /auto/foo. Thanks, Shriyansh On Monday, August 18, 2014 6:55:59 PM UTC-7, Mark Walkom wrote: Why do you want to do this if you are worried about data loss? Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email:

Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data

2014-08-18 Thread shriyansh jain
To make sure if /auto/share goes down, I have data in /auto/foo. And I am short of space on /auto/share. Mainly because of these 2 reasons. Thanks, Shriyansh On Monday, August 18, 2014 6:55:59 PM UTC-7, Mark Walkom wrote: Why do you want to do this if you are worried about data loss? Regards,

Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data

2014-08-18 Thread Mark Walkom
This is why you have replicas: they give you redundancy at a higher level than the filesystem. If you are still concerned then you should add another node and increase your replicas. Playing around on the FS to create replicas is only extra management overhead and likely to end up causing more

Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data

2014-08-18 Thread Mark Walkom
Apart from replicas, that's really outside the scope of what ES provides. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 19 August 2014 12:12, shriyansh jain shriyanshaj...@gmail.com wrote: I got your point sir,

Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data

2014-08-18 Thread shriyansh jain
Thank you for helping me out. I really appreciate it. Regards, Shriyansh On Monday, August 18, 2014 7:23:50 PM UTC-7, Mark Walkom wrote: Apart from replica's, that's really outside the scope of what ES provides. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email:

Re: How to safely migrate from one mount to another mount in Elasticsearch to store the data

2014-08-18 Thread shriyansh jain
I would like to know one more thing: what would be the steps if I want to copy the data from /auto/share to /auto/foo for a particular node? Thanks, Shriyansh On Monday, August 18, 2014 3:26:39 PM UTC-7, Mark Walkom wrote: Do you want to copy the existing data in /auto/share to /auto/foo, or

Re: Using a char_filter in combination with a lowercase filter

2014-08-18 Thread Ivan Brusic
Char filters are applied before the text is tokenized, and therefore they are applied before the normal filters are used, which is why they are a separate class of filter. With Lucene, the order is: char filters - tokenizer - filters Have you looked into the ICU analyzer?
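The ordering described above can be mimicked in a few lines of Python. This is a toy model, not Elasticsearch code: the whitespace tokenizer, the sample word, and the two-case mapping at the end are assumptions chosen to show why an uppercase `Y` escapes a `y` to `ij` mapping when lowercasing only happens afterwards.

```python
def analyze(text, char_mapping, token_filters):
    """Toy analysis chain: char filters -> tokenizer -> token filters."""
    # 1. Char filters run on the raw text, before tokenization.
    for source, target in char_mapping.items():
        text = text.replace(source, target)
    # 2. Tokenizer (whitespace splitting is an assumption here).
    tokens = text.split()
    # 3. Token filters run on each token, in order.
    for token_filter in token_filters:
        tokens = [token_filter(token) for token in tokens]
    return tokens

# Only lowercase `y` is mapped; `Y` is still uppercase when the char
# filter runs, so it is lowercased later without ever being mapped.
only_lower = analyze("Yzer yzer", {"y": "ij"}, [str.lower])

# One conceivable workaround (an assumption, not from the thread):
# map both cases in the char filter.
both_cases = analyze("Yzer yzer", {"y": "ij", "Y": "IJ"}, [str.lower])

print(only_lower)
print(both_cases)
```

In the first call the two spellings end up as different tokens; in the second they converge on the same token.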

Re: A few questions about node types + usage

2014-08-18 Thread Mark Walkom
Master, data and client are really just abstractions of different combinations of node.data and node.master values. A node.master=true, node.data=false can handle both cluster management and queries. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com
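The four combinations of the two settings can be written out as a small lookup. The Python sketch below is only an illustration; the role descriptions are informal labels, not official terminology from the thread.

```python
def describe_node(master, data):
    """Describe a node from its node.master / node.data settings."""
    if master and data:
        return "master-eligible data node"
    if master and not data:
        return "master-eligible node without data (can still handle queries)"
    if data:
        return "data-only node"
    return "client node"

for master in (True, False):
    for data in (True, False):
        print(f"node.master={master} node.data={data}: "
              f"{describe_node(master, data)}")
```

The second case is the one discussed above: a node with node.master=true and node.data=false holds no shards but can take part in cluster management and serve queries.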