Re: Aggregations across multiple indices

2015-03-14 Thread Christian Rohling
Karl, thank you. That does solve the problem.

-Christian
On Mar 12, 2015 5:35 PM, "Karl Putland"  wrote:

> you might look at
> http://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html#search-aggregations-metrics-cardinality-aggregation
>
> --K
>
> Karl Putland
> Senior Engineer
> *SimpleSignal*
> Anywhere: 303-242-8608
> 
>
>
> On Thu, Mar 12, 2015 at 10:04 AM, Christian Rohling 
> wrote:
>
>> Hello Everyone,
>> I am attempting to use aggregations to count the number of documents
>> matching a given query across multiple indices. What I would like to do is
>> make those counts on distinct keys. Say I had the following document in 2
>> different indices, aliased together.
>> ```
>> {
>>   "_index": "myindex",
>>   "_type": "mytype",
>>   "_id": "1",
>>   "_version": 1,
>>   "_score": 1,
>>   "_source": {
>>     "country": "MEXICO"
>>   }
>> }
>> ```
>>
>> When I run a terms aggregation on the field "country", I would like it to
>> return only a single count for the document with id=1 (which exists in both
>> indices). The actual use case is a bit more complicated than what's
>> described above; this is just an example of the functionality that I am
>> looking for. I cannot find any info in the docs, and have asked in the IRC
>> channel to no avail.
>>
>> -Christian Rohling
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to elasticsearch+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/CALsYvrzV-PyUNUHcUHWNCDBQKz5jV9%3DTPoQ2hW1me8q%2BhBgKDg%40mail.gmail.com
>> 
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
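
A minimal sketch of the approach Karl points to, assuming the documents carry a field that identifies the logical document across indices (called `doc_key` here, a hypothetical name; the thread does not say which field would serve). The request body nests a `cardinality` sub-aggregation under a `terms` aggregation on `country`, so each bucket reports a distinct count next to the raw `doc_count`. The `distinct_counts` helper only simulates the intended result on in-memory data and is not part of any Elasticsearch client. Note that the cardinality aggregation returns approximate counts.

```python
import json
from collections import defaultdict


def build_distinct_count_request(bucket_field, key_field):
    """Build a search body: terms buckets with a distinct count per bucket."""
    return {
        "size": 0,  # only aggregation results are needed
        "aggs": {
            "by_bucket": {
                "terms": {"field": bucket_field},
                "aggs": {
                    "distinct_docs": {"cardinality": {"field": key_field}}
                },
            }
        },
    }


def distinct_counts(hits, bucket_field, key_field):
    """Simulate the intended result locally: distinct keys per bucket value."""
    seen = defaultdict(set)
    for hit in hits:
        source = hit["_source"]
        seen[source[bucket_field]].add(source[key_field])
    return {bucket: len(keys) for bucket, keys in seen.items()}


# The same logical document stored in two indices behind one alias.
# "doc_key" is a hypothetical field name used only for this illustration.
hits = [
    {"_index": "myindex_a", "_source": {"country": "MEXICO", "doc_key": "1"}},
    {"_index": "myindex_b", "_source": {"country": "MEXICO", "doc_key": "1"}},
]

body = build_distinct_count_request("country", "doc_key")
print(json.dumps(body, indent=2))
print(distinct_counts(hits, "country", "doc_key"))
```

In the response, the `distinct_docs` value of each bucket would be the number to read instead of `doc_count`.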



Re: What configuration is available to control MemoryMapDirectory

2015-03-14 Thread Mark Walkom
Can you provide more info on what the error/problem is? Logs might help.

On 14 March 2015 at 10:12, joergpra...@gmail.com 
wrote:

> I'm out - no experience with EC2. I avoid foreign servers at all cost.
> Maybe 120G RAM is affected by swap/memory overcommit.  Do not forget to
> check memlock and memory ballooning. The chances are few you can control
> host settings as a guest in a virtual server environment.
>
> Jörg
>
> On Sat, Mar 14, 2015 at 5:06 PM, Lindsey Poole  wrote:
>
>> btw - we're on EC2 I2-4xl hosts, so we have ~120g ram and SSDs.
>>
>>
>> On Saturday, March 14, 2015 at 9:04:34 AM UTC-7, Lindsey Poole wrote:
>>>
>>> I did see the ES_DIRECT_SIZE, but it seems to be ineffective.
>>>
>>> I will try setting -XX:MaxDirectMemorySize directly.
>>>
>>> On Saturday, March 14, 2015 at 4:43:22 AM UTC-7, Jörg Prante wrote:

>>>> You may try limit direct memory on JVM level by
>>>> using -XX:MaxDirectMemorySize (default is unlimited). See also
>>>> ES_DIRECT_SIZE in
>>>> http://www.elastic.co/guide/en/elasticsearch/reference/current/setup-service.html#_linux
>>>>
>>>> I recommend at least 2GB
>>>>
>>>> Jörg
>>>>
>>>> On Sat, Mar 14, 2015 at 1:03 AM, Lindsey Poole wrote:
>>>>
>>>>> Hey guys,
>>>>>
>>>>> We're running into some problems under heavy write, nominal read
>>>>> volume when the Lucene memory mapped files have exhausted available
>>>>> physical memory, and segments from disk must be paged into memory.
>>>>>
>>>>> Are there any configs available to control how much physical memory is
>>>>> available to MemoryMapDirectory?
>>>>>
>>>>> Thanks!
>>>>>


Re: Field names with the same name across types having different index/type in Elasticsearch

2015-03-14 Thread joergpra...@gmail.com
If you have thousands of tenants with thousands of potentially overlapping
mappings that should operate independently, the hardware sizing of a
cluster is a challenge, yes.

OTOH you can play tricks at your search/index front end API if you can hide
ES internals from the customers, e.g. prefixing field names with the tenant
ID so field names become unique. This should not be a recommended method,
though - because ES should be able to handle overlapping mappings in a more
feasible way.
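
The prefixing trick could look roughly like the sketch below. It assumes a front end that rewrites documents before indexing and strips the prefix again before returning results; the function names and the `__` separator are hypothetical and not part of Elasticsearch.

```python
SEPARATOR = "__"  # hypothetical separator between tenant ID and field name


def prefix_fields(tenant_id, document):
    """Rename every top-level field so it is unique per tenant."""
    return {
        f"{tenant_id}{SEPARATOR}{name}": value
        for name, value in document.items()
    }


def strip_prefix(tenant_id, document):
    """Undo prefix_fields for documents returned to the tenant."""
    prefix = f"{tenant_id}{SEPARATOR}"
    return {
        (name[len(prefix):] if name.startswith(prefix) else name): value
        for name, value in document.items()
    }


# Two tenants use "someField" with different types without clashing.
doc1 = prefix_fields("tenantId1", {"someField": "some text"})
doc2 = prefix_fields("tenantId2", {"someField": 42})
print(doc1)
print(doc2)
print(strip_prefix("tenantId2", doc2))
```

Queries, sorts and aggregations would need the same rewriting on the way in, which is part of why this stays a workaround.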

Jörg

On Sat, Mar 14, 2015 at 7:38 PM,  wrote:

> Wouldn't that be a bit too much though ? I mean if we have thousands of
> customers (tenants) we will have to create index for each of them ?
> Wouldn't it affect performance and wouldn't maintaining those many indexes
> in the cluster a bit too much ?
>
> On Saturday, March 14, 2015 at 10:48:35 AM UTC-7, Jörg Prante wrote:
>>
>> You are right, I suggest to use different indices for tenant 1 and 2,
>> this is also good for separating other concerns (like index term
>> statistics, scoring, field faceting, deleting docs, etc.)
>>
>> As a matter of fact it is not Lucene that stands in the way. Internally,
>> ES keeps a hash map of field names across types, i.e. correct field name
>> lookup is a challenge if a field name denotes two different field
>> specifications in an index.
>>
>> Jörg
>>
>> On Fri, Mar 13, 2015 at 9:47 PM,  wrote:
>>
>>> I have detailed my question on stackoverflow here:
>>> http://stackoverflow.com/questions/29041509/field-names-with-the-same-name-across-types-having-different-index-type-in-elast
>>>
>>> Briefing it here as well :
>>>
>>> I have been reading a lot on mappings in Elasticsearch and here's
>>> something interesting that I found
>>>
>>> Field names with the same name across types are highly recommended to have
>>>  the same type and same mapping characteristics (analysis settings for
>>> example). There is an effort to allow to explicitly "choose" which field to
>>> use by using type prefix (my_type.my_field), but it’s not complete, and there
>>> are places where it will never work (like faceting on the field).
>>>
>>> I found the above quote from the documentation here
>>> 
>>>
>>> Now my use case is exactly that .. Here's an example. Suppose that some
>>> field in tenant1 has to have the following mapping (for a given entity
>>> user):
>>>
>>> {
>>>   "tenantId1_user": {
>>>     "properties": {
>>>       "someField": {
>>>         "type": "string",
>>>         "index": "analyzed"
>>>       }
>>>     }
>>>   }
>>> }
>>>
>>> Now, for the same field in a different tenant (for the same entity type,
>>> let's say user), the type has to change like this:
>>>
>>> {
>>>   "tenantId2_user": {
>>>     "properties": {
>>>       "someField": {
>>>         "type": "integer",
>>>         "index": "analyzed"
>>>       }
>>>     }
>>>   }
>>> }
>>>
>>> Now from what I understand from the above quote, it means that
>>> technically even though I can provide this mapping, it is not recommended
>>> because deep down Lucene handles them in the same way.
>>>
>>> My questions are:
>>>
>>> 1) How can I handle my usecase ? Should I just separate out each tenant
>>> in a different index so I don't have to worry about this mapping ?
>>>
>>> 2) Is there any other workaround ? Considering the fact that if I have
>>> too many tenants that means I will have too many indices?
>>>
>>> 3) What's the recommended way for this usecase?
>>>
>>> All help appreciated!
>>>

Re: Field names with the same name across types having different index/type in Elasticsearch

2015-03-14 Thread shahshi15
Wouldn't that be a bit too much, though? I mean, if we have thousands of
customers (tenants), we will have to create an index for each of them?
Wouldn't it affect performance, and wouldn't maintaining that many indices
in the cluster be a bit too much?

On Saturday, March 14, 2015 at 10:48:35 AM UTC-7, Jörg Prante wrote:
>
> You are right, I suggest to use different indices for tenant 1 and 2, this 
> is also good for separating other concerns (like index term statistics, 
> scoring, field faceting, deleting docs, etc.)
>
> As a matter of fact it is not Lucene that stands in the way. Internally, 
> ES keeps a hash map of field names across types, i.e. correct field name 
> lookup is a challenge if a field name denotes two different field 
> specifications in an index.
>
> Jörg
>
> On Fri, Mar 13, 2015 at 9:47 PM, > wrote:
>
>> I have detailed my question on stackoverflow here: 
>>
>> http://stackoverflow.com/questions/29041509/field-names-with-the-same-name-across-types-having-different-index-type-in-elast
>>
>> Briefing it here as well : 
>>
>> I have been reading a lot on mappings in Elasticsearch and here's 
>> something interesting that I found
>>
>> Field names with the same name across types are highly recommended to have
>>  the same type and same mapping characteristics (analysis settings for 
>> example). There is an effort to allow to explicitly "choose" which field to 
>> use by using type prefix (my_type.my_field), but it’s not complete, and there
>> are places where it will never work (like faceting on the field).
>>
>> I found the above quote from the documentation here 
>> 
>>
>> Now my use case is exactly that .. Here's an example. Suppose that some 
>> field in tenant1 has to have the following mapping (for a given entity 
>> user):
>>
>> {
>>   "tenantId1_user": {
>>     "properties": {
>>       "someField": {
>>         "type": "string",
>>         "index": "analyzed"
>>       }
>>     }
>>   }
>> }
>>
>> Now, for the same field in a different tenant (for the same entity type,
>> let's say user), the type has to change like this:
>>
>> {
>>   "tenantId2_user": {
>>     "properties": {
>>       "someField": {
>>         "type": "integer",
>>         "index": "analyzed"
>>       }
>>     }
>>   }
>> }
>>
>> Now from what I understand from the above quote, it means that 
>> technically even though I can provide this mapping, it is not recommended 
>> because deep down Lucene handles them in the same way. 
>>
>> My questions are: 
>>
>> 1) How can I handle my usecase ? Should I just separate out each tenant 
>> in a different index so I don't have to worry about this mapping ? 
>>
>> 2) Is there any other workaround ? Considering the fact that if I have 
>> too many tenants that means I will have too many indices?
>>
>> 3) What's the recommended way for this usecase?
>>
>> All help appreciated!
>>


Re: Field names with the same name across types having different index/type in Elasticsearch

2015-03-14 Thread joergpra...@gmail.com
You are right, I suggest using different indices for tenant 1 and 2; this
is also good for separating other concerns (like index term statistics,
scoring, field faceting, deleting docs, etc.)

As a matter of fact, it is not Lucene that stands in the way. Internally, ES
keeps a hash map of field names across types, i.e. correct field name
lookup is a challenge if a field name denotes two different field
specifications in an index.

Jörg

On Fri, Mar 13, 2015 at 9:47 PM,  wrote:

> I have detailed my question on stackoverflow here:
>
> http://stackoverflow.com/questions/29041509/field-names-with-the-same-name-across-types-having-different-index-type-in-elast
>
> Briefing it here as well :
>
> I have been reading a lot on mappings in Elasticsearch and here's
> something interesting that I found
>
> Field names with the same name across types are highly recommended to have
>  the same type and same mapping characteristics (analysis settings for
> example). There is an effort to allow to explicitly "choose" which field to
> use by using type prefix (my_type.my_field), but it’s not complete, and there
> are places where it will never work (like faceting on the field).
>
> I found the above quote from the documentation here
> 
>
> Now my use case is exactly that .. Here's an example. Suppose that some
> field in tenant1 has to have the following mapping (for a given entity
> user):
>
> {
>   "tenantId1_user": {
>     "properties": {
>       "someField": {
>         "type": "string",
>         "index": "analyzed"
>       }
>     }
>   }
> }
>
> Now, for the same field in a different tenant (for the same entity type,
> let's say user), the type has to change like this:
>
> {
>   "tenantId2_user": {
>     "properties": {
>       "someField": {
>         "type": "integer",
>         "index": "analyzed"
>       }
>     }
>   }
> }
>
> Now from what I understand from the above quote, it means that technically
> even though I can provide this mapping, it is not recommended because deep
> down Lucene handles them in the same way.
>
> My questions are:
>
> 1) How can I handle my usecase ? Should I just separate out each tenant in
> a different index so I don't have to worry about this mapping ?
>
> 2) Is there any other workaround ? Considering the fact that if I have too
> many tenants that means I will have too many indices?
>
> 3) What's the recommended way for this usecase?
>
> All help appreciated!
>


Re: What configuration is available to control MemoryMapDirectory

2015-03-14 Thread joergpra...@gmail.com
I'm out - no experience with EC2. I avoid foreign servers at all costs.
Maybe the 120G RAM is affected by swap/memory overcommit. Do not forget to
check memlock and memory ballooning. Chances are slim that you can control
host settings as a guest in a virtual server environment.

Jörg

On Sat, Mar 14, 2015 at 5:06 PM, Lindsey Poole  wrote:

> btw - we're on EC2 I2-4xl hosts, so we have ~120g ram and SSDs.
>
>
> On Saturday, March 14, 2015 at 9:04:34 AM UTC-7, Lindsey Poole wrote:
>>
>> I did see the ES_DIRECT_SIZE, but it seems to be ineffective.
>>
>> I will try setting -XX:MaxDirectMemorySize directly.
>>
>> On Saturday, March 14, 2015 at 4:43:22 AM UTC-7, Jörg Prante wrote:
>>>
>>> You may try limit direct memory on JVM level by
>>> using -XX:MaxDirectMemorySize (default is unlimited). See also
>>> ES_DIRECT_SIZE in
>>> http://www.elastic.co/guide/en/elasticsearch/reference/current/setup-service.html#_linux
>>>
>>> I recommend at least 2GB
>>>
>>> Jörg
>>>
>>> On Sat, Mar 14, 2015 at 1:03 AM, Lindsey Poole  wrote:
>>>
>>>> Hey guys,
>>>>
>>>> We're running into some problems under heavy write, nominal read volume
>>>> when the Lucene memory mapped files have exhausted available physical
>>>> memory, and segments from disk must be paged into memory.
>>>>
>>>> Are there any configs available to control how much physical memory is
>>>> available to MemoryMapDirectory?
>>>>
>>>> Thanks!



Re: Searching ES nested data using Hive

2015-03-14 Thread Nolan Grace
Haha, I was able to figure it out. As long as the Hive external table is
created, you can reference the nested fields in the select statement as if
the struct column were its own table. For example, after the band table was
created, directly referencing lat in Hive is as easy as SELECT location.lat
FROM Band;

On Friday, March 6, 2015 at 10:22:19 AM UTC-6, nolan grace wrote:
>
> Hello, forgive me if I am under-informed, but I am looking for a way to
> search complex data in Elasticsearch using Hive. The structure I am
> looking at has a nested location object that contains longitude and latitude,
> and I am looking for a way to create a Hive table that flattens out that
> data, or to create a separate table with the longitude and latitude in it
> that I can use to join to the rest of my dataset in Hive. Please let me know
> if you need more information or if this use case is documented somewhere I
> can reference. Thanks for your help.
>
> band: {
>   name: "nolan",
>   location: {
>     lat: 101,
>     long: 101
>   }
> }
>
> Hive table band
> columns - name lat long
>
>
> Nolan Grace
>
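
For comparison, the flattening that the Hive table is meant to provide can be sketched outside Hive. The snippet below is only an illustration in Python with a hypothetical `flatten` helper; it does not show how the Elasticsearch Hive integration maps documents, and the dotted column names are an assumption chosen to mirror `location.lat`.

```python
def flatten(document, parent_key="", separator="."):
    """Flatten nested dicts into one level, joining keys with a separator."""
    flat = {}
    for key, value in document.items():
        full_key = f"{parent_key}{separator}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects such as "location".
            flat.update(flatten(value, full_key, separator))
        else:
            flat[full_key] = value
    return flat


band = {"name": "nolan", "location": {"lat": 101, "long": 101}}
row = flatten(band)
print(row)
```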



Re: What configuration is available to control MemoryMapDirectory

2015-03-14 Thread Lindsey Poole
btw - we're on EC2 I2-4xl hosts, so we have ~120g ram and SSDs.

On Saturday, March 14, 2015 at 9:04:34 AM UTC-7, Lindsey Poole wrote:
>
> I did see the ES_DIRECT_SIZE, but it seems to be ineffective.
>
> I will try setting -XX:MaxDirectMemorySize directly.
>
> On Saturday, March 14, 2015 at 4:43:22 AM UTC-7, Jörg Prante wrote:
>>
>> You may try limit direct memory on JVM level by 
>> using -XX:MaxDirectMemorySize (default is unlimited). See also 
>> ES_DIRECT_SIZE in 
>> http://www.elastic.co/guide/en/elasticsearch/reference/current/setup-service.html#_linux
>>  
>>
>> I recommend at least 2GB
>>
>> Jörg
>>
>> On Sat, Mar 14, 2015 at 1:03 AM, Lindsey Poole  wrote:
>>
>>> Hey guys,
>>>
>>> We're running into some problems under heavy write, nominal read volume 
>>> when the Lucene memory mapped files have exhausted available physical 
>>> memory, and segments from disk must be paged into memory.
>>>
>>> Are there any configs available to control how much physical memory is 
>>> available to MemoryMapDirectory?
>>>
>>> Thanks!
>>>


Re: What configuration is available to control MemoryMapDirectory

2015-03-14 Thread Lindsey Poole
I did see the ES_DIRECT_SIZE, but it seems to be ineffective.

I will try setting -XX:MaxDirectMemorySize directly.

On Saturday, March 14, 2015 at 4:43:22 AM UTC-7, Jörg Prante wrote:
>
> You may try limit direct memory on JVM level by 
> using -XX:MaxDirectMemorySize (default is unlimited). See also 
> ES_DIRECT_SIZE in 
> http://www.elastic.co/guide/en/elasticsearch/reference/current/setup-service.html#_linux
>  
>
> I recommend at least 2GB
>
> Jörg
>
> On Sat, Mar 14, 2015 at 1:03 AM, Lindsey Poole wrote:
>
>> Hey guys,
>>
>> We're running into some problems under heavy write, nominal read volume 
>> when the Lucene memory mapped files have exhausted available physical 
>> memory, and segments from disk must be paged into memory.
>>
>> Are there any configs available to control how much physical memory is 
>> available to MemoryMapDirectory?
>>
>> Thanks!
>>


Re: What configuration is available to control MemoryMapDirectory

2015-03-14 Thread joergpra...@gmail.com
You may try to limit direct memory at the JVM level by
using -XX:MaxDirectMemorySize (default is unlimited). See also
ES_DIRECT_SIZE in
http://www.elastic.co/guide/en/elasticsearch/reference/current/setup-service.html#_linux

I recommend at least 2GB.

Jörg

On Sat, Mar 14, 2015 at 1:03 AM, Lindsey Poole  wrote:

> Hey guys,
>
> We're running into some problems under heavy write, nominal read volume
> when the Lucene memory mapped files have exhausted available physical
> memory, and segments from disk must be paged into memory.
>
> Are there any configs available to control how much physical memory is
> available to MemoryMapDirectory?
>
> Thanks!
>


Re: Is there limitation how many indices could I create in ES cluster? and performance?

2015-03-14 Thread joergpra...@gmail.com
You may use a single index with enough shards for all users and use routing to
access the shard where a given user's docs are indexed. See also shard
overallocation
http://www.elastic.co/guide/en/elasticsearch/guide/current/overallocation.html
and
https://groups.google.com/forum/#!msg/elasticsearch/49q-_AgQCp8/MRol0t9asEcJ
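
A rough sketch of the routing idea, assuming a single shared index named `users` and a `user_id` field (both hypothetical). Elasticsearch accepts a `routing` parameter on index and search requests; the shard-selection function below only illustrates the principle of hashing the routing value modulo the number of primary shards, using CRC32 as a stand-in rather than the hash Elasticsearch actually uses.

```python
import zlib
from urllib.parse import quote, urlencode


def pick_shard(routing_value, number_of_shards):
    """Illustrative only: hash the routing value onto one primary shard."""
    return zlib.crc32(routing_value.encode("utf-8")) % number_of_shards


def index_path(index, doc_type, doc_id, user_id):
    """Path for indexing a document routed by user ID."""
    base = f"/{quote(index)}/{quote(doc_type)}/{quote(doc_id)}"
    return base + "?" + urlencode({"routing": user_id})


def search_request(index, user_id):
    """Path and body for a search that only touches the user's shard."""
    path = f"/{quote(index)}/_search?" + urlencode({"routing": user_id})
    # Several users share a shard, so the query still filters on the user.
    body = {"query": {"term": {"user_id": user_id}}}
    return path, body


print(index_path("users", "log", "1", "user42"))
print(search_request("users", "user42"))
# The same routing value always lands on the same shard.
print(pick_shard("user42", 20) == pick_shard("user42", 20))
```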

Jörg

On Sat, Mar 14, 2015 at 10:46 AM, zehong yin  wrote:

> Hi, all
>
> Is there limitation how many indices could I create in ES cluster? and
> Does the number of indices affect performance?
> I have used DATE as indice for logs from MMO game servers. That give me
> chance to remove old data.
>
> But right now, I'm considering use userid as indice, that means there
> might be milliion users, as well as million indices created.
>
>


Re: Is there limitation how many indices could I create in ES cluster? and performance?

2015-03-14 Thread David Pilato
Each index comes with a cost, and having millions of indices will probably
require a lot of machines.
Also, the cluster state will be way too big, so it could affect cluster
stability.

You will probably have at the end of the day a lot of small indices.

I mean: don't do this! :)

Share indices between users but use routing to make sure all docs for a given 
user go to the same shard.

Having rolling indices is a good idea though.
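
Rolling indices could be handled along these lines. This is a sketch assuming one index per day named `logs-YYYY.MM.DD` (a hypothetical naming scheme) and a fixed retention period; it only computes which index to write to and which index names have expired, and leaves the actual delete calls out.

```python
from datetime import date, timedelta


def index_for(day, prefix="logs-"):
    """Name of the daily index that receives documents for the given day."""
    return f"{prefix}{day:%Y.%m.%d}"


def expired_indices(existing, today, keep_days, prefix="logs-"):
    """Return the daily indices older than the retention window."""
    cutoff = index_for(today - timedelta(days=keep_days), prefix)
    # Zero-padded dates sort lexicographically in chronological order.
    return sorted(
        name for name in existing
        if name.startswith(prefix) and name < cutoff
    )


today = date(2015, 3, 14)
existing = [
    "logs-2015.03.14",
    "logs-2015.03.13",
    "logs-2015.02.01",
    "logs-2015.01.15",
]
print(index_for(today))
print(expired_indices(existing, today, keep_days=30))
```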

Here are my first thoughts about this.
Hope this helps.

--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs

> On 14 March 2015 at 02:46, zehong yin wrote:
> 
> Hi, all
> 
> Is there limitation how many indices could I create in ES cluster? and Does 
> the number of indices affect performance?
> I have used DATE as indice for logs from MMO game servers. That give me 
> chance to remove old data.
> 
> But right now, I'm considering use userid as indice, that means there might 
> be milliion users, as well as million indices created.
>  


Is there limitation how many indices could I create in ES cluster? and performance?

2015-03-14 Thread zehong yin
Hi, all

Is there a limit on how many indices I can create in an ES cluster? And does 
the number of indices affect performance?
I have used the date as the index name for logs from MMO game servers. That 
gives me the chance to remove old data.

But right now I'm considering using the user ID as the index name, which means 
there might be a million users, and therefore a million indices created.
 



Re: Counting the frequency of a term

2015-03-14 Thread Christoffer Vig
You can do this, but it involves scripting and is perhaps not very simple. 
The frequency of a term in a document is given by 
_index['FIELD']['TERM'].tf()
http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-advanced-scripting.html#_term_statistics_2

Combine this with a script field:
http://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-script-fields.html
If you are willing to take the risk of enabling dynamic scripting on your 
box, it can be done like this:

{
   "query": {
      "match_all": {}
   },
   "script_fields": {
      "qcount": {
         "script": "_index['FIELDNAME']['QUERYTERM'].tf()"
      },
      "qfield": {
         "script": "doc['FIELDNAME']"
      }
   }
}
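
To sanity-check the number that tf() returns, the same count can be reproduced
locally. This is only a sketch: it assumes a simplistic tokenizer (lowercase,
split on non-word characters), whereas the value Elasticsearch reports depends
on the analyzer configured for the field, so the two can differ.

```python
import re
from collections import Counter


def term_frequencies(text):
    """Count how often each token occurs in a single document's text."""
    # Simplified stand-in for an analyzer: lowercase and split on non-word chars.
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)


def term_frequency(text, term):
    """Return the number of occurrences of `term` in `text` (0 if absent)."""
    return term_frequencies(text)[term.lower()]
```

For example, term_frequency(document_text, "QUERYTERM") plays the role of
"qcount" in the request above for one document.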


On Friday, 13 March 2015 at 19:32:00 UTC+1, reza sadoddin wrote:
>
>
> I am fairly new to Elasticsearch and need to do a simple task. I want to 
> count the frequency of a word in a single document (not the number of 
> documents containing a term). Assume that my index consists of a single 
> document.
>
> Thanks
>



Search parents by latest child

2015-03-14 Thread Rauan Maemirov
Here's the gist of my data 
scheme: https://gist.github.com/rauanmaemirov/7b3af9106ccc2963d2a5

There is a collection of entities as parents and a collection of events as 
child documents.
What I need to do is search documents by *the latest event of a particular 
type.*

If you run that script on localhost (test index), you can see that the search 
request returns all 4 test entities, even though the latest event for doc 2 
doesn't match the query terms.
I need a way to filter by the latest events and then apply the query terms. Is 
that possible?
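
The intended semantics can be spelled out in plain Python, independent of how
they would be expressed in Elasticsearch. This is only a sketch of the
requirement, not a solution from the thread; the field names "parent_id",
"type", and "timestamp" are hypothetical and are not taken from the gist.

```python
def parents_by_latest_event(events, event_type, matches):
    """Return sorted parent ids whose latest event of `event_type` matches.

    events:     iterable of dicts with "parent_id", "type", "timestamp" keys
    event_type: only events of this type are considered
    matches:    predicate applied to the latest event of each parent
    """
    latest = {}
    for event in events:
        if event["type"] != event_type:
            continue
        current = latest.get(event["parent_id"])
        if current is None or event["timestamp"] > current["timestamp"]:
            latest[event["parent_id"]] = event
    # The predicate sees only the newest event, so an older matching event
    # does not pull the parent into the results.
    return sorted(pid for pid, event in latest.items() if matches(event))
```

A parent whose older event matches but whose newest event does not is excluded,
which is the behaviour described for doc 2 above.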



Re: children aggregation

2015-03-14 Thread Adrien Grand
Hi,

This aggregation works with the parent/child functionality, which requires that
parents and children are in the same shard. So having parents and children
in different indexes is not possible.

See
http://www.elastic.co/guide/en/elasticsearch/guide/current/parent-child.html
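
The same-shard requirement follows from how documents are routed: the shard is
chosen as hash(routing) % number_of_primary_shards, and a child document is
routed by its parent's id rather than its own. The sketch below illustrates
this with zlib.crc32 as a stand-in hash, which is an assumption for the example
and not the hash function Elasticsearch actually uses.

```python
import zlib


def shard_for(routing_value, num_primary_shards):
    """Pick a shard number from a routing value (stand-in hash, see above)."""
    return zlib.crc32(routing_value.encode("utf-8")) % num_primary_shards


def shard_for_parent(parent_id, num_primary_shards):
    """A parent document is routed by its own id."""
    return shard_for(parent_id, num_primary_shards)


def shard_for_child(child_id, parent_id, num_primary_shards):
    """A child document is routed by its parent's id; its own id is not used."""
    return shard_for(parent_id, num_primary_shards)
```

Because the shard number is only meaningful within one index, a parent in one
index and a child in another cannot be brought together this way.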

On Tue, Mar 10, 2015 at 5:22 AM, kazoompa  wrote:

> Hi,
>
> I was wondering whether this aggregation will work with parent documents
> residing in different indexes than the child documents? Are there any
> limitations with respect to shards? I remember parent-child relationships
> have a shard limitation, as stated here: http://goo.gl/gU6Mcx.
>
> In short, I would like to know the limitations of this aggregation type
> since there were none described in the children agg documentation:
> http://goo.gl/Y3yfhc.
>
>
> Thanks a lot.
>



-- 
Adrien Grand
