Re: Aggregations across multiple indices

2015-03-14 Thread Christian Rohling
Karl, thank you. That does solve the problem. -Christian On Mar 12, 2015 5:35 PM, "Karl Putland" wrote: > you might look at > http://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html#search-aggregations-metrics-cardinality-aggregatio

Re: What configuration is available to control MemoryMapDirectory

2015-03-14 Thread Mark Walkom
Can you provide more info on what the error/problem is? Logs might help. On 14 March 2015 at 10:12, joergpra...@gmail.com wrote: > I'm out - no experience with EC2. I avoid foreign servers at all cost. > Maybe 120G RAM is affected by swap/memory overcommit. Do not forget to > check memlock and

Re: Field names with the same name across types having different index/type in Elasticsearch

2015-03-14 Thread joergpra...@gmail.com
If you have thousands of tenants with thousands of potentially overlapping mappings that should operate independently, the hardware sizing of a cluster is a challenge, yes. OTOH you can play tricks at your search/index front end API if you can hide ES internals from the customers, e.g. prefixing f

Re: Field names with the same name across types having different index/type in Elasticsearch

2015-03-14 Thread shahshi15
Wouldn't that be a bit too much, though? I mean, if we have thousands of customers (tenants), we would have to create an index for each of them? Wouldn't it affect performance, and wouldn't maintaining that many indexes in the cluster be a bit too much? On Saturday, March 14, 2015 at 10:48:35 AM UTC-

Re: Field names with the same name across types having different index/type in Elasticsearch

2015-03-14 Thread joergpra...@gmail.com
You are right, I suggest using different indices for tenants 1 and 2; this is also good for separating other concerns (like index term statistics, scoring, field faceting, deleting docs, etc.). As a matter of fact, it is not Lucene that stands in the way. Internally, ES keeps a hash map of field nam

Re: What configuration is available to control MemoryMapDirectory

2015-03-14 Thread joergpra...@gmail.com
I'm out - no experience with EC2. I avoid foreign servers at all costs. Maybe 120G RAM is affected by swap/memory overcommit. Do not forget to check memlock and memory ballooning. Chances are slim that you can control host settings as a guest in a virtual server environment. Jörg On Sat, Mar 14, 20

Re: Searching ES nested data using Hive

2015-03-14 Thread Nolan Grace
Haha, I was able to figure it out. As long as the Hive external table is created, you can reference the nested fields as if the struct column were its own table in the SELECT statement. For example, after the band table is created, directly referencing lat in Hive is as easy as SELECT location.lat

Re: What configuration is available to control MemoryMapDirectory

2015-03-14 Thread Lindsey Poole
btw - we're on EC2 I2-4xl hosts, so we have ~120g ram and SSDs. On Saturday, March 14, 2015 at 9:04:34 AM UTC-7, Lindsey Poole wrote: > > I did see the ES_DIRECT_SIZE, but it seems to be ineffective. > > I will try setting -XX:MaxDirectMemorySize directly. > > On Saturday, March 14, 2015 at 4:43:2

Re: What configuration is available to control MemoryMapDirectory

2015-03-14 Thread Lindsey Poole
I did see the ES_DIRECT_SIZE, but it seems to be ineffective. I will try setting -XX:MaxDirectMemorySize directly. On Saturday, March 14, 2015 at 4:43:22 AM UTC-7, Jörg Prante wrote: > > You may try limit direct memory on JVM level by > using -XX:MaxDirectMemorySize (default is unlimited). See a

Re: What configuration is available to control MemoryMapDirectory

2015-03-14 Thread joergpra...@gmail.com
You may try limiting direct memory at the JVM level by using -XX:MaxDirectMemorySize (default is unlimited). See also ES_DIRECT_SIZE in http://www.elastic.co/guide/en/elasticsearch/reference/current/setup-service.html#_linux I recommend at least 2GB. Jörg On Sat, Mar 14, 2015 at 1:03 AM, Lindsey Poole

Re: Is there limitation how many indices could I create in ES cluster? and performance?

2015-03-14 Thread joergpra...@gmail.com
You may use a single index with enough shards for all users and use routing to reach the shard where a given user ID's docs are indexed. See also shard overallocation http://www.elastic.co/guide/en/elasticsearch/guide/current/overallocation.html and https://groups.google.com/forum/#!msg/elasticsearch/
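To illustrate the routing idea, here is a minimal Python sketch, not Elasticsearch's actual implementation. It assumes the shard is chosen as a hash of the routing value modulo the number of primary shards; `zlib.crc32` is only a stand-in for whatever hash function Elasticsearch uses internally, and `NUM_SHARDS`, `shard_for`, `place_docs`, and the user IDs are hypothetical names.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 8  # hypothetical number of primary shards in the shared index


def shard_for(routing: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a routing value to a shard number.

    Assumption: crc32 stands in for Elasticsearch's internal hash function.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_shards


def place_docs(docs):
    """Group (user_id, doc) pairs by the shard their user ID routes to."""
    shards = defaultdict(list)
    for user_id, doc in docs:
        shards[shard_for(user_id)].append((user_id, doc))
    return dict(shards)


# Hypothetical documents, each routed by its user ID.
docs = [("user-1", "login"), ("user-2", "login"), ("user-1", "logout")]
placement = place_docs(docs)
```

Because every document of one user shares the same routing value, a search that passes that routing value only has to visit a single shard instead of all of them.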

Re: Is there limitation how many indices could I create in ES cluster? and performance?

2015-03-14 Thread David Pilato
Each index comes with a cost, and having millions of indices will probably require a lot of machines. Also, the cluster state will be way too big, so it could affect cluster stability. At the end of the day, you will probably have a lot of small indices. I mean: don't do this! :) Share indices be

Is there limitation how many indices could I create in ES cluster? and performance?

2015-03-14 Thread zehong yin
Hi all, is there a limit on how many indices I can create in an ES cluster? And does the number of indices affect performance? I have used the date as the index for logs from MMO game servers. That gives me the chance to remove old data. But right now, I'm considering using the userid as the index; that means there

Re: Counting the frequency of a term

2015-03-14 Thread Christoffer Vig
You can do this, but it involves scripting and is perhaps not very simple. The frequency of a term in a document is given as _index['FIELD']['TERM'].tf() http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-advanced-scripting.html#_term_statistics_2 Combine this with a script f
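As a rough sketch of how the `_index['FIELD']['TERM'].tf()` expression might be placed in a search request, the Python below builds a request body as a plain dict. Using `script_fields` with a 1.x-style inline script string is an assumption here (the message above is cut off before it names the mechanism), and the field `body`, the term `elasticsearch`, and the helper names are hypothetical.

```python
import json


def tf_script(field: str, term: str) -> str:
    """Build the term-frequency expression quoted in the message.

    Note: no escaping is done, so field and term must not contain quotes.
    """
    return "_index['{0}']['{1}'].tf()".format(field, term)


def tf_search_body(field: str, term: str) -> dict:
    """Search body that asks for the term frequency of each hit.

    Assumption: script_fields with an inline script string (1.x-style syntax).
    """
    return {
        "query": {"match_all": {}},
        "script_fields": {
            "term_frequency": {"script": tf_script(field, term)}
        },
    }


# Hypothetical field and term, only for illustration.
body = tf_search_body("body", "elasticsearch")
print(json.dumps(body, indent=2))
```

The dict would be sent as the JSON body of a search request; whether scripting is enabled and which script syntax applies depends on the Elasticsearch version in use.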

Search parents by latest child

2015-03-14 Thread Rauan Maemirov
Here's the gist of my data scheme: https://gist.github.com/rauanmaemirov/7b3af9106ccc2963d2a5 There is a collection of entities as parents and a collection of events as child documents. What I need to do is search documents by *the latest event of a particular type.* If you run that script on

Re: children aggregation

2015-03-14 Thread Adrien Grand
Hi, This aggregation works with parent/child functionality, which requires that parents and children are in the same shard. So having parents and children in different indexes is not possible. See http://www.elastic.co/guide/en/elasticsearch/guide/current/parent-child.html On Tue, Mar 10, 2015 at
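The same-shard requirement can be pictured with a small Python sketch. It assumes that a child document is routed by its parent's ID, so parent and children hash to the same shard of one index; `zlib.crc32` is a stand-in hash rather than Elasticsearch's internal one, and all function names and document IDs are hypothetical.

```python
import zlib
from typing import Optional

NUM_SHARDS = 5  # hypothetical number of primary shards in the index


def routing_key(doc_id: str, parent_id: Optional[str] = None) -> str:
    """A child is routed by its parent's ID; a parent by its own ID."""
    return parent_id if parent_id is not None else doc_id


def shard_for(doc_id: str, parent_id: Optional[str] = None,
              num_shards: int = NUM_SHARDS) -> int:
    """Pick a shard from the routing key (crc32 is an assumed stand-in hash)."""
    key = routing_key(doc_id, parent_id)
    return zlib.crc32(key.encode("utf-8")) % num_shards


# One parent and three of its children, all within the same index.
parent_shard = shard_for("entity-1")
child_shards = [shard_for("event-%d" % i, parent_id="entity-1") for i in range(3)]
```

A second index has its own separate set of shards, so this co-location cannot hold when parents and children live in different indexes.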