What would you like to do with this filter? Only compute aggregations on
documents that match this filter?
On Mon, May 5, 2014 at 10:05 AM, Niv Penso n...@toonimo.com wrote:
Hey Adrien, thank you for the quick response.
I don't understand how to combine this filter condition in the aggregation:
I am trying to count the number of documents that match the filter
separately for each country.
On Monday, May 5, 2014 11:14:32 AM UTC+3, Adrien Grand wrote:
What would you like to do with this filter? Only compute aggregations on
documents that match this filter?
On Mon, May 5, 2014
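One way to sketch this (field names here are assumptions, not from the thread) is a filter aggregation wrapping a terms sub-aggregation on the country field:

```shell
# Count documents matching a filter, separately per country.
# The "status" filter and "country" field are hypothetical placeholders.
curl -XGET 'localhost:9200/myindex/_search?search_type=count' -d '{
  "aggs": {
    "matching_docs": {
      "filter": { "term": { "status": "active" } },
      "aggs": {
        "per_country": {
          "terms": { "field": "country" }
        }
      }
    }
  }
}'
```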
Our cluster has been running for about 6 months now and we've collected a
few settings in /_cluster/settings; even the transients aren't transient, as
I hardly ever take the entire cluster down. I vaguely remember reading that
entries could be deleted now, but I can't find the article again. Any
Hey,
very hard to tell what has been happening here without more information.
There has been a fix for this added recently, see
https://github.com/elasticsearch/elasticsearch/commit/3ea1d00869a33b1e36984bc8c38e03fa16829778
However this only happens when you get rejected requests in your thread
Hey,
did you make sure that every node has access to the directory specified
in the repository settings? It has to be a network share that every
node in your running cluster is allowed to write to.
--Alex
On Mon, Apr 28, 2014 at 6:03 PM, Hasitha hasitha...@gmail.com wrote:
Hi
I've a
using cached for both index and search thread pool?
curl -XPUT localhost:9200/_cluster/settings -d '{
  "persistent" : {
    "threadpool.index.type" : "cached",
    "threadpool.search.type" : "cached"
  }
}'
On Monday, April 28, 2014 5:40:14 PM UTC+8, Le Thanh Nguyen wrote:
Hi,
we have the
Hey,
the problem is that you are trying to search over all indices. One of
those indices contains your kibana dashboards - which is just fine. However
that index (named kibana-int) does not contain a timestamp field.
The simplest solution might be to change your search to not search in
that
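One sketch of that approach (index pattern assumed): point the search at the data indices only, so the kibana-int index is never touched.

```shell
# Search only the logstash data indices instead of _all; kibana-int
# (which has no timestamp field) is excluded by the index pattern.
curl -XGET 'localhost:9200/logstash-*/_search' -d '{
  "query": { "match_all": {} }
}'
```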
Hey,
in order for others to follow, it might make a lot of sense to provide a
complete example which others can use to reproduce your problem
(instead of just supplying a query), see http://www.elasticsearch.org/help
Thanks!
--Alex
On Tue, Apr 29, 2014 at 4:08 PM, Brian Lamb
Hey,
this is very hard to tell without further debugging/information. Can you
check your log files for any exceptions popping up? Can you check the
log file of your master node for node joins/leaves or even master
elections? Also, you should add monitoring to your infrastructure
Hey,
you could change your default log level (to either DEBUG or TRACE) to find
out whether those settings are actually applied. Depending on the
elasticsearch version you are using, you might want to try a lower-cased
setting of max_bytes_per_sec and set it to 250mb. Also, can you show the
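As a sketch of the suggested change (the exact setting key depends on the elasticsearch version in use):

```shell
# Apply the lower-cased recovery throttle as a transient cluster setting.
curl -XPUT localhost:9200/_cluster/settings -d '{
  "transient": {
    "indices.recovery.max_bytes_per_sec": "250mb"
  }
}'
```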
Hey,
there is no predefined formula. The reason is that it is very
use-case dependent. If your data is really important, you might want to put
it onto every node of your cluster. If your data is too big to fit onto
a single node, you need to split it among several nodes, but still
Hello.
I have an index populated with people and their birth dates. I want to
create a histogram of their current ages. Is this possible?
So I have tried using a date histogram aggregation with a year interval,
which gives me the years the users were born. This is a good start. To get
exact
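One possible sketch, assuming a birth_date date field and that a scripted histogram is acceptable (the field name and script syntax are assumptions; verify against your mapping and scripting language):

```shell
# Bucket people by current age, computed in the script from the
# birth_date value (milliseconds since epoch); 31536000000 ms ~ 1 year.
curl -XGET 'localhost:9200/people/_search?search_type=count' -d '{
  "aggs": {
    "ages": {
      "histogram": {
        "script": "(System.currentTimeMillis() - doc[\"birth_date\"].value) / 31536000000L",
        "interval": 5
      }
    }
  }
}'
```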
Hi,
I'm new to Elasticsearch and am trying to integrate it into our project,
but I have hit a problem. Our data source has two fields, mode and schedule,
which are both integers. Through the UI, it should be possible to query
records based on these two fields like:
*SELECT * FROM doc WHERE mode = 1 AND (schedule = 1
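The SQL above is truncated, but a bool filter is one way to sketch the AND/OR combination (the second schedule value is a made-up placeholder):

```shell
# mode = 1 AND (schedule = 1 OR schedule = 2) as a filtered query;
# the value 2 is hypothetical since the original SQL is cut off.
curl -XGET 'localhost:9200/myindex/_search' -d '{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            { "term": { "mode": 1 } },
            { "terms": { "schedule": [1, 2] } }
          ]
        }
      }
    }
  }
}'
```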
Hey,
can you include a full example including data you have been indexing? Or
include error messages, as you wrote you cannot index any data anymore?
Also, your JSON is not valid, as the mapping for the class field contains
an extra comma.
--Alex
On Thu, May 1, 2014 at 1:53 AM, Deepak Jha
Hey,
you could take a look at the terms aggregation using the _type field, which
contains the type information. See
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#search-aggregations-bucket-terms-aggregation
--Alex
On
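A minimal sketch of that suggestion (index name assumed):

```shell
# Count documents per mapping type via a terms aggregation on _type.
curl -XGET 'localhost:9200/myindex/_search?search_type=count' -d '{
  "aggs": {
    "types": { "terms": { "field": "_type" } }
  }
}'
```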
WOW, thanks, you helped me so much.
Do you know when version 1.2.0 is going to be released? (weeks, months,
over half a year?)
Niv
On Sunday, May 4, 2014 7:54:10 PM UTC+3, Niv Penso wrote:
Hey guys,
I have this mappings:
{
  "ckdocuments": {
    "mappings": {
      "msv": {
Hey,
this would be a bit more tricky, as it requires you to merge two events
(the external dataset and your live visitor stats) into a single event as a
sort of preprocessing step. I think I would start with the geoip support
from logstash and use your apache logs, which at least allows you to
Hey,
using different analyzers in one document, where the analyzer is extracted
from the document itself, is not possible. However, if you have one document
per language, you could use the _analyzer capability for this, see
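The URL is cut off, but a minimal mapping sketch of the 1.x _analyzer capability might look like this (the field name is a hypothetical placeholder):

```shell
# Pick the analyzer per document from a field of the document itself;
# "lang_analyzer" is an assumed field holding an analyzer name.
curl -XPUT 'localhost:9200/myindex/doc/_mapping' -d '{
  "doc": {
    "_analyzer": { "path": "lang_analyzer" }
  }
}'
```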
I'm running a cluster with 4 nodes, all with identical server specs and ES
versions (1.1.0)
I keep running into this problem when running this query:
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "name": "general"
          }
        },
        {
          "term": {
Hi
Hoping there are some nest users here who could help me out.
I am trying to figure out how to specify custom field analyzers on a POCO
class, so that when it gets indexed automatically via NEST the fields
will be appropriately analyzed.
But I can't seem to find a way to specify a custom
Check your effective mappings. Although you declared type date on dt, it
might actually contain some strings or non-parseable date strings.
Jörg
On Mon, May 5, 2014 at 2:40 PM, Jan van Vlimmeren
jan.vanvlimme...@gmail.com wrote:
I'm running a cluster with 4 nodes, all with identical server
Hello,
I have a requirement to retrieve the term frequencies (TF) from all recently
indexed documents (last 24 hours).
So in a query I would supply the time range and expect to get the TFs of
all terms in the given time range.
Is this possible in ES? If yes, please refer me to the
Hi mark:
That was just one example. The documents were news articles, hence the
broad coverage rather than specific on-topic documents. Since this is news
from third-party sources, I do not have control over what comes into the
index (i.e. separating the machine-generated from manually
I've got some problems getting fsriver to index my files. I've got a disk
image mounted on /mnt/ and it contains a folder named files containing files.
I am running this on a single server for testing purposes.
When I try to index the files in the folder with:
curl -XPUT
That was it, many thanks.
Strange that only 2 shards failed and the output seemed correct except for
the data on those shards.
On Monday, May 5, 2014 3:28:02 PM UTC+2, Jörg Prante wrote:
Check your effective mappings. Although you declared type date on dt, it
might actually contain some strings or
Do you see anything in logs?
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
On 5 May 2014 at 15:50, John Ottosson john.a.ottos...@gmail.com wrote:
I've got some problems getting fsriver to index my files. I've got a disk
image mounted on /mnt/ and it contains a folder
I made a post on StackOverflow, but I think the visibility there is low for
ElasticSearch problems. I've copied the text here
I am using *Logstash*, *ElasticSearch* and *Kibana* to allow multiple users
to log in and view the log data they have forwarded. I have created index
aliases for each
Hi, I tried reinstalling elasticsearch and fsriver; when trying to connect
to localhost it failed, perhaps some setting I missed. It said curl: (7)
Failed to connect to localhost port 9200: Connection refused. I used the IP
address instead and got this in /var/log/elasticsearch/elasticsearch.log
more..
[2014-05-05 18:16:35,335][INFO ][node ] [Valerie Cooper] version[1.1.1],
pid[1258], build[f1585f0/2014-04-16T14:27:12Z]
[2014-05-05 18:16:35,336][INFO ][node ] [Valerie Cooper] initializing ...
[2014-05-05 18:16:35,897][INFO ][plugins ] [Valerie Cooper] loaded
[river-fs], sites []
the url is not the issue, i renamed it in this text.
2014-05-05 18:39 GMT+02:00 John Ottosson john.a.ottos...@gmail.com:
more..
[2014-05-05 18:16:35,335][INFO ][node ] [Valerie Cooper] version[1.1.1],
pid[1258], build[f1585f0/2014-04-16T14:27:12Z]
[2014-05-05 18:16:35,336][INFO ][node ]
This should work:
[ElasticProperty(Analyzer = "my_analyzer")]
public string Content { get; set;}
You can also specify Index/SearchAnalyzer this way (for fine-grained
control)
--
Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer
Thanks Itamar
Would my_analyzer in this case be an analyzer I defined in code myself,
or would it have to have been already created in ES?
On Monday, May 5, 2014 5:44:18 PM UTC+1, Itamar Syn-Hershko wrote:
This should work:
[ElasticProperty(Analyzer = "my_analyzer")]
public
It would have to be defined in ES; the analyzer name has to be one that
ES recognizes (as the plugin defines it).
--
Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer Consultant
Author of RavenDB in Action http://manning.com/synhershko/
This should be possible to do using script fields:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-terms-facet.html#_script_field
However, you will need to figure out how to do date manipulation there
--
Itamar Syn-Hershko
http://code972.com | @synhershko
Hello Alex,
Thanks for your reply. I have already looked at the _analyzer capability,
but it doesn't help me. I will continue using the structure I described
above.
Loïc
On Monday, May 5, 2014 1:21:30 PM UTC+2, Alexander Reelsen wrote:
Hey,
using different analyzers in one document, which
fsriver 1.0.0 does not work with elasticsearch 0.90.7
Look here for older versions: https://github.com/dadoonet/fsriver/tree/es-0.90
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
On 5 May 2014 at 18:41:03, John Ottosson (john.a.ottos...@gmail.com) wrote:
Ah OK, so if I understand correctly, there is no way to define a custom
analyzer in code and use that to index a POCO?
--
You received this message because you are subscribed to the Google Groups
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an
There are 2 ways to define a custom analyzer - one is via configuration
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/custom-analyzers.html,
and the other is via code.
If your custom analyzer is written in code, it will be Java code that has
to be deployed to ES as a plugin
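For the configuration route, a minimal sketch of defining the analyzer at index creation (tokenizer and filters are placeholders), so NEST can then reference it by name:

```shell
# Configuration-based custom analyzer; "my_analyzer" becomes a name
# the ElasticProperty attribute can point at.
curl -XPUT 'localhost:9200/myindex' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  }
}'
```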
Hi Alex,
Thank you for your explanation. It makes sense now. However, I'm not sure I
understood your proposal.
So I would adjust the mlt_fields accordingly, and possibly extract the
relevant portions of texts manually
What do you mean by adjusting mlt_fields? The only shared field that is
OK, I installed the 0.90.10 version of elasticsearch instead, rebooted,
mounted the image again and tried to index the files.
It seems to fail to get _meta from fs mydocs; not sure what I'm doing wrong.
I'm running:
java version 1.7.0_55
elasticsearch 0.90.10
fsriver 1.0.0
curl 7.35.0
php 5.5.9
Did you read the link I sent about versions?
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
On 5 May 2014 at 20:19:56, John Ottosson (john.a.ottos...@gmail.com) wrote:
OK, I installed the 0.90.10 version of elasticsearch instead and rebooted,
mounted the
How many indexes do you have? It almost looks like the system itself can't
allocate the RAM needed.
You might try jacking up the nofile limit to something like 99 as well? I'd
definitely go with a 31g heap size.
As for moving indexes, you might be able to copy the entire data store,
then remove
Currently I have 279 indexes on a single node, and elasticsearch starts,
runs for a few minutes, and dies; I only have 60G RAM and as far as I know
60% is the max that one should allocate to elasticsearch; I tried allocating
38G and it lasted a few more minutes before it died.
*(I think
You might turn off the bootstrap.mlockall flag just for now - it'll make ES
swap a ton, but your error message looks like an OS-level issue. Make sure
you have lots of swap available and grab some coffee.
What I'd also try if turning off bootstrap.mlockall doesn't work:
- Tarball the entire data
- Fire up both nodes, make sure they both have the same cluster name
This is exactly the point where, as I wrote in my second message,
Elasticsearch is messing up. When I move the index to a new node, delete
that index from the master, and then start the master node and the other
data node, it (the master)
Hi,
Reading data in JSON format from ES (which I think is what you are interested
in doing) is not available out of the box, simply because you can do the
same thing directly from the command line with curl or any http-like client.
One of the reasons behind hadoop-streaming is to allow native
Get node2 up and running first. Then issue a disable_allocation and then
bring up node1.
curl -XPUT localhost:9200/_cluster/settings -d '{
  "transient": {
    "cluster.routing.allocation.disable_allocation": true
  }
}'
From there, adjust the replica settings on the indexes down to 0 so they
don't copy. Once that's
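A sketch of the replica adjustment step (applied to all indexes here; it can also target specific index names):

```shell
# Drop replicas to 0 so nothing is copied while allocation is disabled.
curl -XPUT 'localhost:9200/_settings' -d '{
  "index": { "number_of_replicas": 0 }
}'
```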
This is a simple example to illustrate the point. The real use case:
1. Is a rather large amount of data which I'd like to handle in parallel,
also taking advantage of es-hadoop's handling of shards.
2. Uses an existing job execution framework toolset based on
hadoop-streaming; I would rather
I've been following a tutorial here:
http://blog.qbox.io/multi-field-partial-word-autocomplete-in-elasticsearch-using-ngrams
but when I query for results I only get matches if I use the entire word.
Partial word searches always give zero results. How do I debug this?
A query for 'aragon'
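A first debugging step might be the _analyze API, to see which tokens the ngram analyzer actually emits (the index and analyzer names are assumptions):

```shell
# Inspect the tokens produced for the search text; if the ngram
# analyzer is applied, partial fragments like 'arag' should appear.
curl -XGET 'localhost:9200/myindex/_analyze?analyzer=my_ngram_analyzer&text=aragon'
```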
Hi,
Working on setting up some custom console for ES clusters.
We are having some nodes acting as clients only and some as data nodes only.
Just wondering how to retrieve the type of the node using REST or Java API.
Sincerely,
Arkadiy
How would the JSON format help you though? If you plan to parse the output,
using a simpler text representation (such as TextOutputFormat) should be a
lot easier.
On 5/5/14 11:35 PM, Peter Sheridan wrote:
This is a simple example to illustrate the point. The real use case:
1. Is a rather large
I have a number of fields (15K) that were created inadvertently.
How would I delete them all from all indexes based on a pattern of the
field name?
E.g. all the fields I want to delete are in indexes named with the standard
logstash naming and are
calculus_data.calculus_*
Thanks,
-Chris
Any return?
Hi Alex,
I am using logstash to index my data to elasticsearch. When I do not use
this template I am able to push data.
Here are a few lines from the log file that I have:
2014-05-05 14:51:20,248 (main) a_class INFO: message1
2014-05-05 14:51:20,249 (main) a_class INFO: message2
2014-05-05
Check out this page -
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-nodes-info.html
There is also the cat API -
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cat-nodes.html
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
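As a quick sketch of the cat route (column names may vary by version):

```shell
# Show each node's role (d = data, c = client) and master flag.
curl -XGET 'localhost:9200/_cat/nodes?v&h=name,node.role,master'
```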
You have only two nodes it seems. Adding nodes may help.
Besides the data nodes that do the heavy work, set up 3 master-eligible
nodes (data-less nodes, with a reasonably smaller heap size for cluster
state and mappings). Set the data nodes to be non-eligible for becoming
master.
Jörg
On Mon, May
I have 11 nodes. 3 are dedicated masters and the other 8 are data nodes.
On May 5, 2014 4:03 PM, joergpra...@gmail.com joergpra...@gmail.com
wrote:
You have only two nodes it seems. Adding nodes may help.
Beside data nodes that do the heavy work, set up 3 master eligible nodes
(data-less
Then you need more nodes, more heap on existing nodes or less data.
You've reached the limit of what your current cluster can handle, that is
why this is happening.
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 6
Can anybody help, please?
On Monday, May 5, 2014 10:24:09 AM UTC+12, JGL wrote:
Hi Martijn,
The percolator query in the 1st post above is what we registered to the
percolator, and it is kinda working; it consolidates all IDs in one query
string for a match query, which seems not quite an elegant
Is there any way to prevent ES from blowing up just by selecting too much
data? This is my biggest concern.
Is it because bootstrap.mlockall is on, so we give ES/JVM a specified
amount of memory and that's all that node will receive? If we turned that
off and had gobs more swap available for
Thanks Nate, but this doesn't work. node2 is not the master, so starting it
first didn't make sense; anyway, I tried it and I couldn't execute anything
on a non-master node (node2) unless the master was started.
I started node2 (non-master) and ran this: curl -XPUT
localhost:9200/_cluster/settings -d
Moving data at the OS level without making ES aware of it can cause
difficulties, as you are seeing.
A few suggestions on how to resolve this and improve things in general:
1. Set your heap size to 31GB.
2. Use Oracle's java, not OpenJDK.
3. Set bootstrap.mlockall to true, you don't want to
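As a sketch, those suggestions map to settings like the following (paths and exact mechanism depend on how elasticsearch is installed):

```shell
# 1. Heap size via environment (31GB keeps compressed object pointers):
export ES_HEAP_SIZE=31g

# 3. In elasticsearch.yml:
#    bootstrap.mlockall: true
```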
You can find an example reverse proxy config for nginx here -
https://github.com/elasticsearch/kibana/blob/master/sample/nginx.conf
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 6 May 2014 11:34, Nish
Thank you Mark!!!
_cat/nodes is exactly what I need.
_nodes, as far as I can see, only provides master status but not client/data
status.
On Monday, May 5, 2014 6:37:41 PM UTC-4, Mark Walkom wrote:
Check out this page -
Hey Mark,
Thanks for the response. I have currently created two new medium test
instances (1 master 1 data only) because I didn't want to mess with the
main dataset. In my test setup, I have about 600MB of data ; 7 indexes
After looking around a lot I saw that the directory organization is
Yeah, I figured it out; actually somehow I deleted TCP:80 from my security
group (I guess lack of sleep made me a zombie).
On Mon, May 5, 2014 at 9:36 PM, Mark Walkom ma...@campaignmonitor.com wrote:
You can find an example reverse proxy config for nginx here -
Don't copy indexes on the OS level!
Is your new cluster balancing the shards?
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 6 May 2014 12:46, Nishchay Shah electron...@gmail.com wrote:
Hey Mark,
Thanks for the
Probably not.
I deleted all data from slave and restarted both servers and I see this:
*Master: *
[root@ip-10-169-36-251 logstash-2013.12.22]# du -h --max-depth=1
16M   ./0
16M   ./1
8.0K  ./_state
15M   ./4
15M   ./3
15M   ./2
75M   .
*Data: *
[root@ip-10-186-152-19
Hi David, thanks a lot for the reply; you are absolutely correct.
I tested carefully (waited for master2 to be up and running before starting
master1), and the documents are now synced.
But I have a new question now: in real failover, when the 2 servers may
go down by themselves (not shut down manually),
FYI settings:
*Master*:
[root@ip-10-169-36-251 logstash-2013.12.05]# grep -vE '^$|^#'
/xx/elasticsearch-1.1.1/config/elasticsearch.yml
cluster.name: elasticsearchtest
node.name: node1
node.master: true
node.data: true
index.number_of_replicas: 0
discovery.zen.ping.multicast.enabled: false
I'm not sure I understand this - *write requests can be sent to any node,
which in turn will do proxying; we can avoid this and only hit the
primaries. This avoids the proxying, rerouting*.
Even if you hit a primary node, ES will still have to re-route the document
to the primary shard handling the
Having 3 nodes makes sure you can easily maintain a majority quorum.
Once you get to larger sizes, it may/does make sense to have separate
data-only and master-only nodes.
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 6
You need to install a monitoring plugin to gain better insight into what is
happening; it makes it a lot easier to visually see cluster/node/index
state, and it replaces your shell commands, which may not be 100%
representative of what ES is actually doing.
I suggest elastichq and marvel.
Regards,
I am getting similar error with :
ES : 1.1.1
Hadoop: Hadoop 1.0.2
JDK: 1.6.0_24
elasticsearch-hadoop-1.3.0.M3
/hadoopApps/wordcount$ hadoop jar WordCount.jar WordCount /user/himanshu/inputdir
Exception in thread "main" java.lang.NoClassDefFoundError: