Re: Split child query to query and filter?

2014-08-14 Thread Adam Porat
I will answer my own question:
Any queried type (the parent or any of its children) that needs a filter 
should be queried using a *filtered query* clause. Any field that is only 
filtered on should go under the filtered query's *filter* clause, and any 
analyzed or score-affecting field should go under its *query* clause.
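
As an illustration of the split described above, here is a minimal sketch 
that builds such a request body as a Python dict, using the Elasticsearch 
1.x query DSL that was current at the time of this thread. The function name, 
the child type and the field names are hypothetical and only serve the example.

```python
def filtered_child_query(child_type, text_field, text, date_field, since):
    """Build a top_children query whose inner query is a filtered query.

    All type and field names passed in are illustrative assumptions.
    """
    return {
        "top_children": {
            "type": child_type,
            "query": {
                "filtered": {
                    # Analyzed, score-affecting part goes under "query".
                    "query": {"match": {text_field: text}},
                    # Date/numeric constraints need no score, so they go
                    # under "filter".
                    "filter": {"range": {date_field: {"gte": since}}},
                }
            },
        }
    }


body = {"query": filtered_child_query(
    "comment", "body", "elasticsearch", "created", "2014-01-01")}
```

The resulting dict can be serialized with json.dumps and sent as the search 
request body.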

On Tuesday, 10 June 2014 12:05:20 UTC+3, Adam Porat wrote:
>
> Hi,
>
>   I need to perform a query + filter on child documents. For the query, 
> I'm using TopChildren.
>   Now I wonder what would be more efficient regarding query on *date/numeric 
> (no score needed) fields* of this child -
>   should I query on these fields using a HasChild filter in a bool query 
> with the TopChildren query,
>   or should I just combine those date/numeric fields within a single 
> TopChildren query?
>
> Thank you
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/ecc523ba-4a55-4470-beaf-0fd2ff9f58aa%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: Using facets/aggregations on parent document, queried by TopChildren

2014-08-14 Thread Adam Porat
I will answer my own question: 
The facets are counted in relation to the total_hits in the response. This 
holds equally when using TopChildren.


On Tuesday, 5 August 2014 11:11:02 UTC+3, Adam Porat wrote:
>
> Hi,
>
>   The TopChildren query works with an estimated hit size, and the 
> TotalHits might be incorrect if there are more child docs matching the 
> required hits.
>   How does that affect facets or aggregations defined on the parent 
> document? Would their counts likewise be incorrect? Or would they 
> cause the TopChildren query to actually expand and look for all possible children?
>
> Thank you.
>



Re: Index size on node VS heap size

2014-08-14 Thread Adam Porat
Thanks Mark.

On Tuesday, 12 August 2014 09:20:43 UTC+3, Mark Walkom wrote:
>
> No it can be more, it depends on what sort of queries you are doing and 
> what data structures/types you are indexing.
>
> Best bet is to keep throwing data at the index until the server can't take 
> it, then you know the limit.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com 
> web: www.campaignmonitor.com
>
>
> On 12 August 2014 16:18, Adam Porat wrote:
>
>> Hi,
>>
>> Let's say my machine has 40G of RAM, and so I set HEAP_SIZE to 20G, 
>> as recommended.
>>
>> And let's say I have a single index on the machine.
>>
>> Roughly, how large can the index be to maintain good performance? Must 
>> it be somewhat less than 20G?
>>
>> Thank you.
>>
>
>



Index size on node VS heap size

2014-08-11 Thread Adam Porat
Hi,

Let's say my machine has 40G of RAM, and so I set HEAP_SIZE to 20G, as 
recommended.

And let's say I have a single index on the machine.

Roughly, how large can the index be to maintain good performance? Must 
it be somewhat less than 20G?

Thank you.



Re: Nest - range filter in a form of BaseFilter

2014-08-05 Thread Adam Porat
Any ideas anyone?

On Sunday, 3 August 2014 14:47:05 UTC+3, Adam Porat wrote:
>
> Hi,
> I'm using Nest version 0.12.0.
> I need to get a range filter in a form of BaseFilter.
> However, this line of code creates a faulty BaseFilter which doesn't 
> contain the actual condition:
> *agg.ElasticsearchFilter = Nest.Filter.Range(i => 
> i.GreaterOrEquals(filteredUpdateDate));*
> Is this a bug, or am I doing something wrong?
> Any way to cast from a RangeFilterDescriptor to a BaseFilter will do. 
> This is because .FacetFilter( i => i.Bool(j => j.Must(...))) accepts a 
> BaseFilter[].
> Thank you.
>



Using facets/aggregations on parent document, queried by TopChildren

2014-08-05 Thread Adam Porat
Hi,

   The TopChildren query works with an estimated hit size, and the 
TotalHits might be incorrect if there are more child docs matching the 
required hits.
   How does that affect facets or aggregations defined on the parent 
document? Would their counts likewise be incorrect? Or would they 
cause the TopChildren query to actually expand and look for all possible children?

Thank you.



Nest - range filter in a form of BaseFilter

2014-08-03 Thread Adam Porat
Hi,
I'm using Nest version 0.12.0.
I need to get a range filter in a form of BaseFilter.
However, this line of code creates a faulty BaseFilter which doesn't 
contain the actual condition:
*agg.ElasticsearchFilter = Nest.Filter.Range(i => 
i.GreaterOrEquals(filteredUpdateDate));*
Is this a bug, or am I doing something wrong?
Any way to cast from a RangeFilterDescriptor to a BaseFilter will do. 
This is because .FacetFilter( i => i.Bool(j => j.Must(...))) accepts a 
BaseFilter[].
Thank you.



Re: Term aggregation/facet NOT INCLUDING the faceted field

2014-07-29 Thread Adam Porat
Hi Adrien, it looks like your link gives the best answer, basically use the 
construct:

{
  "query": {
    "filtered": {
      "query": "your query goes here",
      "filter": "filters to take into account for top-hits and aggs"
    }
  },
  "post_filter": "filters to take into account for top-hits only",
  "aggs": {
    "my_filter": {
      "filter": "filter to take into account for aggs only"
    }
  }
}
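
To make the construct above concrete for the multi-facet case raised in this 
thread, here is a rough sketch in Python. It is not taken from the linked 
discussion: the function and field names are hypothetical, and it assumes the 
1.x-era terms, bool and match_all filters. Each facet gets its own filter 
aggregation carrying every selected constraint except the one on its own 
field, while post_filter applies all of them to the hits.

```python
def faceted_request(base_query, selections, facet_fields):
    """Build one search body for faceted navigation.

    selections maps a field name to the list of values currently checked
    for that field. All names here are illustrative assumptions.
    """
    def all_except(skip):
        # Combine the selected constraints of every field other than `skip`.
        clauses = [
            {"terms": {field: values}}
            for field, values in selections.items()
            if field != skip and values
        ]
        return {"bool": {"must": clauses}} if clauses else {"match_all": {}}

    aggs = {}
    for field in facet_fields:
        aggs[field] = {
            # Counts for this facet ignore the constraint on its own field.
            "filter": all_except(field),
            "aggs": {"values": {"terms": {"field": field}}},
        }

    return {
        "query": base_query,
        # Hits are restricted by every selected constraint.
        "post_filter": all_except(None),
        "aggs": aggs,
    }


request = faceted_request(
    {"match": {"title": "engineer"}},
    {"location": ["Canada"], "industry": ["Software"]},
    ["location", "industry"],
)
```

With this shape a single request returns both the filtered hits and the 
per-facet counts, instead of one query per faceted field.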


On Tuesday, 29 July 2014 12:42:13 UTC+3, Adrien Grand wrote:
>
> Hi Adam, you can do it today in a single query using post_filter, but it 
> is true that it is not very convenient. See discussion on 
> https://github.com/elasticsearch/elasticsearch/pull/7020 for more 
> information.
>
>
> On Tue, Jul 29, 2014 at 11:32 AM, Adam Porat wrote:
>
>> Hi,
>>
>> Faceted navigation, as seen on *LinkedIn* and *Booking.com*, has the 
>> fundamental characteristic that the value count of each faceted field is 
>> constrained by the query (the set of matched documents), *excluding* any 
>> constraint on the given field itself.
>>  
>> For example, in LinkedIn, If the "locations" facet is currently 
>> filtered with "Canada" (the "Canada" check-box is selected), the facet will 
>> still include counts for the other locations.
>>
>> It seems that Elasticsearch lacks the feature to count the terms in 
>> the set of matched documents, AS IF the faceted field itself was not a part 
>> of the query. Am I correct?
>>
>> And thus, if I have *n* faceted fields in my faceted navigation, 
>> each clicked checkbox involves executing *n* separate queries - one to 
>> modify the actual result set, and *n-1* to rebuild the other facets 
>> (each one with a query which excludes its field).
>>
>> Did I get this right? Are there any better ways to do this?
>>
>> Thank you.
>>
>>
>>
>
>
>
> -- 
> Adrien Grand
>  



Re: Term aggregation/facet NOT INCLUDING the faceted field

2014-07-29 Thread Adam Porat
Hi David,
   A post filter would be a solution for a scenario with a single faceted 
field. However, in a multi-facet scenario each facet needs to be affected 
by the constraints on the other faceted fields.

On Tuesday, 29 July 2014 12:40:24 UTC+3, David Pilato wrote:
>
> In that case, you should look at post filters instead of adding filters to 
> the query.
>
> My 2 cents
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
> On 29 July 2014 at 11:32, Adam Porat wrote:
>
> Hi,
>
> Faceted navigation, as seen on *LinkedIn* and *Booking.com*, has the 
> fundamental characteristic that the value count of each faceted field is 
> constrained by the query (the set of matched documents), *excluding* any 
> constraint on the given field itself.
>  
> For example, in LinkedIn, If the "locations" facet is currently 
> filtered with "Canada" (the "Canada" check-box is selected), the facet will 
> still include counts for the other locations.
>
> It seems that Elasticsearch lacks the feature to count the terms in 
> the set of matched documents, AS IF the faceted field itself was not a part 
> of the query. Am I correct?
>
> And thus, if I have *n* faceted fields in my faceted navigation, each 
> clicked checkbox involves executing *n* separate queries - one to modify 
> the actual result set, and *n-1* to rebuild the other facets (each one 
> with a query which excludes its field).
>
> Did I get this right? Are there any better ways to do this?
>
> Thank you.
>
>
>
>



Term aggregation/facet NOT INCLUDING the faceted field

2014-07-29 Thread Adam Porat
Hi,

Faceted navigation, as seen on *LinkedIn* and *Booking.com*, has the 
fundamental characteristic that the value count of each faceted field is 
constrained by the query (the set of matched documents), *excluding* any 
constraint on the given field itself.
 
For example, in LinkedIn, If the "locations" facet is currently 
filtered with "Canada" (the "Canada" check-box is selected), the facet will 
still include counts for the other locations.

It seems that Elasticsearch lacks the feature to count the terms in the 
set of matched documents, AS IF the faceted field itself was not a part of 
the query. Am I correct?

And thus, if I have *n* faceted fields in my faceted navigation, each 
clicked checkbox involves executing *n* separate queries - one to modify 
the actual result set, and *n-1* to rebuild the other facets (each one with 
a query which excludes its field).

Did I get this right? Are there any better ways to do this?

Thank you.

   



Split child query to query and filter?

2014-06-10 Thread Adam Porat
Hi,

  I need to perform a query + filter on child documents. For the query, I'm 
using TopChildren.
  Now I wonder what would be more efficient regarding query on *date/numeric 
(no score needed) fields* of this child -
   should I query on these fields using a HasChild filter in a bool query 
with the TopChildren query,
   or should I just combine those date/numeric fields within a single 
TopChildren query?

Thank you



Re: Maximum number of indices per machine

2014-03-11 Thread Adam Porat
Hi Joerg & Itamar,
Thanks for your reply. What I am trying to get is an order of magnitude. 
Suppose a server has 20GB RAM. Each index has about 20K documents, each 
document is about 500 words, with an average term distribution. The question 
is whether the server would be able to smoothly handle tens, hundreds, or 
thousands (or more?) of open indices. Suppose each index gets about 10 new 
documents per minute, and serves about 30 queries per minute.
Thank you!
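
For reference, the per-index figures above can be scaled to a whole server 
with simple arithmetic. The sketch below only multiplies the numbers stated 
in this message. It says nothing about what a given machine can actually 
sustain, and the function name is made up for the example.

```python
# Per-index figures taken from the message above.
DOCS_PER_INDEX = 20_000
NEW_DOCS_PER_MINUTE = 10
QUERIES_PER_MINUTE = 30


def aggregate_load(num_indices):
    """Scale the per-index figures to num_indices open indices."""
    return {
        "total_docs": num_indices * DOCS_PER_INDEX,
        "indexed_docs_per_second": num_indices * NEW_DOCS_PER_MINUTE / 60,
        "queries_per_second": num_indices * QUERIES_PER_MINUTE / 60,
    }


for n in (10, 100, 1000):
    print(n, aggregate_load(n))
```

At 1000 indices this comes to 20 million documents and 500 queries per 
second in total, which is the load the question is really about.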

On Monday, 3 March 2014 19:04:22 UTC+2, Jörg Prante wrote:
>
> Sure, you can create hundreds of thousands of indices, without docs, or 
> with just one doc. Closing an index even frees the resources of the index 
> management. This is not useful, just an edge case. 
>
> I'm afraid this is not the answer to the question. 
>
> The number of open indices (shards) is limited by disk space and maximum 
> number of file descriptors. But that is also theoretical. Using all indices 
> at once will use more resources than file descriptors. It depends on the 
> index / query characteristics (RAM for shards, fields, term count, term 
> distribution etc.) That is not directly related to "powerful" servers. Even 
> the weakest server can *create* as many indices as the strongest one. 
> Putting workload on indices is a different story.
>
> Jörg
>
>
>
> On Mon, Mar 3, 2014 at 11:21 AM, Itamar Syn-Hershko wrote:
>
>> Joerg, I think what Adam means is how many open indexes an ES instance 
>> can hold before lagging or crashing (assuming max open files is set 
>> correctly etc).
>>
>> Small indexes usually mean fewer segments per index, so the question may 
>> boil down mostly to the number of open IR/IW?
>>
>>



child documents - getting results & highlights of

2014-03-11 Thread Adam Porat
Hi,

   Regarding the has_child query, is there a plan for when it will be 
possible to get the child documents back, as well as highlights?

Thank you



Maximum number of indices per machine

2014-03-03 Thread Adam Porat
Hi,

  How many *indices per machine* (roughly) can ElasticSearch handle 
smoothly? Let's say average # of documents per index is 20 thousand, and 
the machine is a powerful server.

Thank you.
