LocalManualCache is a component of Guava's LRU cache
https://code.google.com/p/guava-libraries/source/browse/guava-gwt/src-super/com/google/common/cache/super/com/google/common/cache/CacheBuilder.java,
which is used by Elasticsearch for both the filter and field data cache.
Based on your node
Glad you got it working! Just a note, I don't monitor the ES mailing list
very closely...but if you have any more problems/questions with the PHP
client in the future, feel free to just open a ticket at the repo and I would
be happy to help :)
-Z
On Wednesday, February 11, 2015 at 7:09:57 AM
You can also set it as a parameter in the body, just like a regular query:
$searchParams['index'] = 'my_index';
$searchParams['type'] = 'my_type';
$searchParams['body'] = [
    'query' => [
        'match' => [
            'test_field' => 'abc'
        ]
    ],
    'size' => 20
];
$queryResponse =
and your time.
By the way, which tool do you use for monitoring the ES cluster, and what do you monitor?
Thanks
Rajan
On Thursday, March 20, 2014 2:05:52 PM UTC-7, Zachary Tong wrote:
Unfortunately, there is no way that we can tell you an optimal number.
But there is a way that you can perform some capacity
at a point!
Hmm...that's interesting. I would have recommended those two exact
methods. I'll do some digging and see why they didn't work...
-Z
On Thursday, March 20, 2014 1:23:48 AM UTC-5, Ivan Brusic wrote:
Responses inline.
On Wed, Mar 19, 2014 at 7:25 PM, Zachary Tong
You are correct in your analysis of the fuzzy scoring. Fuzzy variants are
scored (relatively) the same as the exact match, because they are treated
the same when executed internally.
If you want to score exact matches higher, I would use a boolean
combination of an exact match and a fuzzy
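As a rough sketch of that kind of boolean combination (the field and value here are made up; `fuzziness` is the match-query parameter):

```json
{
  "query": {
    "bool": {
      "should": [
        { "match": { "title": "quick" } },
        { "match": { "title": { "query": "quick", "fuzziness": "AUTO" } } }
      ]
    }
  }
}
```

An exact match satisfies both clauses and so scores higher, while a fuzzy-only variant satisfies just one.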
Unfortunately, there is no way that we can tell you an optimal number. But
there is a way that you can perform some capacity tests, and arrive at
usable numbers that you can extrapolate from. The process is very simple:
- Create a single index, with a single shard, on a single
Sorry for the delay between my twitter response and my reply here.
Basically, sorting first and then performing query/filter matches is not
really a tenable solution, due to memory constraints. If you were to sort
first, you would need to sort the documents (which may be very expensive
over
Why limit shards to 5gb? Have you capacity tested your hardware + data and
determined the max size is around 5gb?
If the answer is no, I would encourage you to perform some capacity
planning first. It may be that your system can handle 15 or 20gb per
shard, or put a different way, 50m
Could you provide a small recreation of the problem in a gist? Just a few
sample documents and two searches with different sorts that don't work?
It could be your mapping that is incorrect, or a syntax issue with the
query/doc/mapping. It may be a typo, but you are sorting on
title.value,
es_log file the warnings of monitor.jvm are still present.
On Monday, 17 March 2014 at 14:32:29 UTC+1, Zachary Tong wrote:
Ah, sorry, I misread your JVM stats dump (thought it was one long list,
instead of multiple calls to the same API). With a single node cluster, 20
concurrent bulks may
have 12 available processors...
Should we maybe slow down the bulk loader by adding a wait of a few seconds?
On Tuesday, 18 March 2014 at 13:22:57 UTC+1, Zachary Tong wrote:
My observations from your Node Stats
- Your node tends to have around 20-25 merges happening at any given
time
Are you running searches at the same time, or only indexing? Are you bulk
indexing? How big (in physical kb/mb) are your bulk requests?
Can you attach the output of these APIs (preferably during memory buildup
but before the OOM):
- curl -XGET 'localhost:9200/_nodes/'
- curl -XGET
know :)
Have a good day all,
Erdal.
On Wednesday, 12 March 2014 at 20:05:11 UTC+1, Zachary Tong wrote:
For the record, this array syntax should work as well:
$qry = array(
    'query' => array(
        'function_score' => array(
            'functions' => array(
                array
Can you gist up the output of these two commands?
curl -XGET 'http://localhost:9200/_nodes/stats'
curl -XGET 'http://localhost:9200/_nodes'
Those are my first-stop APIs for determining where memory is being
allocated.
By the way, these settings don't do anything anymore (they were deprecated
I believe you are just witnessing the OS caching files in memory. Lucene
(and therefore by extension Elasticsearch) uses a large number of files to
represent segments. TTL + updates will cause even higher file turnover
than usual.
The OS manages all of this caching and will reclaim it for
that though.
On Thursday, March 13, 2014 5:17:20 PM UTC-7, Zachary Tong wrote:
I believe you are just witnessing the OS caching files in memory. Lucene
(and therefore by extension Elasticsearch) uses a large number of files to
represent segments. TTL + updates will cause even higher file
Also, are there other processes running which may be causing the problem?
Does the behavior only happen when ES is running?
On Thursday, March 13, 2014 8:31:18 PM UTC-4, Zachary Tong wrote:
Cool, curious to see what happens. As an aside, I would recommend
downgrading to Java 1.7.0_u25
Hey there. It's possible, but it requires explicit object creation via the
PHP \stdClass() object. PHP isn't very good at deciding between arrays and
objects when using json_encode, unless you use explicit objects.
Untested, but try this:
$scriptScore = new \stdClass();
$scriptScore->script =
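The array-vs-object ambiguity is easy to see directly with json_encode (values here are just illustrative):

```php
<?php
// An empty PHP array encodes as a JSON array...
echo json_encode(array());         // []

// ...while an explicit object encodes as a JSON object,
// which is what Elasticsearch expects for map-like fields.
echo json_encode(new \stdClass()); // {}
```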
For the record, this array syntax should work as well:
$qry = array(
    'query' => array(
        'function_score' => array(
            'functions' => array(
                array('script_score' => array(
                    'script' => "doc['boostfield'].value"
                ))
            ),
            'query' => array(
1. I agree with your assessment, and there is no mechanism to optimize
for this. There will be some overhead to filtering so many terms (although
it's difficult to say how much), it will generate more churn on the filter
cache, and updates to that single document may become
The `took` parameter is the number of milliseconds that the query took to
execute on the Elasticsearch server. It's basically the time required to
parse the query, broadcast it to the shards and collect the results. It
doesn't include network time going to and from Elasticsearch itself (since
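For reference, `took` is reported at the top of every search response; a trimmed, illustrative example (values made up):

```json
{
  "took": 12,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": { "total": 0, "max_score": null, "hits": [] }
}
```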
If you simply want to decrease the amount of memory that Elasticsearch is
using, you need to change your heap size (via the ES_HEAP_SIZE environment
variable). That controls the total memory allocated to Elasticsearch.
Echoing what Binh said...try not to change the field-data settings unless
you
Hi Christian, I looked into the issue and it appears there isn't really an
alternative at the moment. The property was removed because Lucene removed
the underlying functionality, since it can potentially break token streams
(usually in regards to synonyms). You can see the issue
here:
You can monitor the filter cache at three different levels: index, node, and
cluster. The output is similar at all three levels; you'll see a size in
bytes and an eviction count.
- Per-index:
curl -XGET http://localhost:9200/my_index/_stats
- Per-node:
curl -XGET
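For completeness, hedged versions of those calls (1.x endpoints; the index name is made up) might look like:

```
# Per-index
curl -XGET 'http://localhost:9200/my_index/_stats'
# Per-node
curl -XGET 'http://localhost:9200/_nodes/stats'
# Cluster-wide
curl -XGET 'http://localhost:9200/_cluster/stats'
```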
So the root cause is that your query is structured incorrectly. The
match_all should be inside of a query element, inside the filtered
query:
curl -XGET 'http://localhost:9200/test_search/_search?pretty=true' -d '
{
    "query": {
        "filtered": {
            "query": {"match_all": {}},
Just wanted to add a quick note: long recovery times (due to divergence of
shards between primary/replica) are an issue that we will be addressing.
No ETA as of yet, but something that is on the roadmap. :)
-Zach
On Wednesday, December 4, 2013 7:48:04 PM UTC-5, Greg Brown wrote:
Thanks