Re: JDBC River missing documents??

2015-04-22 Thread joergpra...@gmail.com
There are log messages on the ES cluster side; you should look there to see why bulk indexing failed. Jörg On Thu, Apr 23, 2015 at 5:45 AM, GWired wrote: > Found this in the logs: > > [2015-04-22 22:01:25,063][ERROR][river.jdbc.BulkNodeClient] bulk [15] > failed with 945 failed items, failure message = fa

Re: FreeBSD 10.1 install elasticsearch plugin fails

2015-04-22 Thread David Pilato
The command you wrote is totally wrong. You set a URL, you define the wrong version... And it sounds like you renamed the plugin script, but that's not an issue. Whatever. Doc says: bin/plugin install elasticsearch/elasticsearch-mapper-attachments/2.5.0 Try bin/plugin install elasticsearch/elasticsearch-

Re: Suspicious connections on ES

2015-04-22 Thread Jason Zhang
Yes, and those processes continue to DDoS other IPs... I've stopped those processes and deleted the binary files. I've also disabled dynamic scripting. On Thursday, April 23, 2015 at 11:50:19 AM UTC+8, Mark Walkom wrote: > > It looks like your instance has been breached. > > You may want to take a

Re: Suspicious connections on ES

2015-04-22 Thread Mark Walkom
It looks like your instance has been breached. You may want to take a look at https://www.elastic.co/blog/scripting-security/ On 23 April 2015 at 11:46, Jason Zhang wrote: > Yes, but I've configured iptables to avoid those foreign unknown > connections like: > > ``` > $ sudo iptables -I INPUT -

Re: JDBC River missing documents??

2015-04-22 Thread GWired
Found this in the logs: [2015-04-22 22:01:25,063][ERROR][river.jdbc.BulkNodeClient] bulk [15] failed with 945 failed items, failure message = failure in bulk execution: On Wednesday, April 22, 2015 at 7:53:25 PM UTC-5, GWired wrote: > > Hi All, > > I've just been informed that i'm off by up to

FreeBSD 10.1 install elasticsearch plugin fails

2015-04-22 Thread Pccom Frank
Hi, Please help to install plugins for elasticsearch on FreeBSD. I tried different ways, I always fail. The following is one of them: root@mail:/usr/local/lib/elasticsearch/plugins # elasticsearch-plugin --url https://github.com/elastic/elasticsearch-mapper-attachments.git --install elasticsear

Re: Suspicious connections on ES

2015-04-22 Thread Jason Zhang
Yes, but I've configured iptables to avoid those foreign unknown connections like: ``` $ sudo iptables -I INPUT -p tcp -s my_ip --dport 9200:9400 -j ACCEPT $ sudo iptables -P INPUT -j DROP ``` I forgot to say that I set `script.disable_dynamic: false` to run some external js scripts. At that ti

Re: Elasticsearch puppet module's problem

2015-04-22 Thread Mark Walkom
The module works with the notion of an instance and when you setup an instance it creates /etc/init.d/elasticsearch-$instancename On 22 April 2015 at 17:42, Sergey Zemlyanoy wrote: > Update > > > When I commented t

Re: Grouping/extracting results uploaded to Elasticsearch

2015-04-22 Thread Mark Walkom
You ideally want to restructure your data and split the fields out. If you can't do it in your code then Logstash would be able to do something. On 22 April 2015 at 23:05, KT SSP wrote: > Hello > > We have a build process using ant that is externally monitored (web page) > and shows the current

Re: 30 billion unique documents (and counting)

2015-04-22 Thread Kimbro Staken
Running ES at scale is all about balance and sizing right. Like the 3 bears, not too big and not too small, just right. Big boxes will just be wasted and too small of boxes will have you hitting limits too soon. Given the way java works with heaps above 30GB-ish the best size for a node right now s

Re: Is there a way to know memory required

2015-04-22 Thread Mark Walkom
See the docs for more info on doc values - http://www.elastic.co/guide/en/elasticsearch/guide/current/doc-values.html On 23 April 2015 at 05:56, bvnrwork wrote: > > 35 fields - each contains 10 KB of text and what is doc values you said ? > > On Saturday, 18 April 2015 22:41:18 UTC-4, Mark Walko

Re: 30 billion unique documents (and counting)

2015-04-22 Thread Mark Walkom
Not really, the smaller server you mentioned before would be suitable. On 23 April 2015 at 11:04, Jack Park wrote: > That starts to argue for lots of smaller servers maybe even with smaller > SSD's. Say, a low power i3 with 16 or 23gb ram, and a 128gb SSD. Is that > right? > > On Wed, Apr 22,

Re: 30 billion unique documents (and counting)

2015-04-22 Thread Jack Park
That starts to argue for lots of smaller servers maybe even with smaller SSD's. Say, a low power i3 with 16 or 23gb ram, and a 128gb SSD. Is that right? On Wed, Apr 22, 2015 at 2:56 PM, Mark Walkom wrote: > If you are using time series data then you should be using time series > indices. As Fred

Re: 30 billion unique documents (and counting)

2015-04-22 Thread Jack Park
I would certainly like to see that book, or at least a draft of it ;-) On Wed, Apr 22, 2015 at 10:12 AM, Kimbro Staken wrote: > Hello Fred, > > I have clusters as large as 200billion documents/130TB. Sharing > experiences on that would require a book, but a couple quick things that > jumped out

Re: Suspicious connections on ES

2015-04-22 Thread Mark Walkom
Is your ES instance open to the world? Check your ES logs as well. On 22/04/2015 8:44 pm, "Jason Zhang" wrote: > Also, I've noticed there're many suspicious files in /tmp, like: > > ``` > $ ls -al /tmp > 26000 > 32 > 991linux > conf.n > elasticsearch/ > gates.lock > git > icp > Intelip > Intelips

JDBC River missing documents??

2015-04-22 Thread GWired
Hi All, I've just been informed that i'm off by up to 100k records or so in my jdbc river fed index. I am using the column strategy using a createddate and lastmodified date. Kibana is reporting an entirely different # than what i see reported in the DB.. Table A has 978634 in SQL, 934646 sh

Re: Elasticsearch ingest performance

2015-04-22 Thread Kimbro Staken
Hello Brian, Many things will affect the rate of ingest, the biggest one is making sure the load gets spread around. But are you sure ES is what's bottlenecking here? With only 5 shards you're only using half your cluster but I'm willing to bet your 20 threads on the importer isn't maxing that out

Elasticsearch ingest performance

2015-04-22 Thread bparkison
We are running a 10-node Elasticsearch 1.4.2 cluster, and getting cluster wide throughput of 18161 docs/sec, or about 18MB/sec. We'd like to improve this as much as we can, without impacting query times too much. Our hardware: RAM: 128GB Disks: 8 disks, 7200 RPM, 1TB in a RAID 0 array CPU: Int

Re: enabling filter cache

2015-04-22 Thread Eddie Kim
I'm a bit confused, is terms filter slower because it has to iterate through a list of bitsets whereas lucene already has access to the list of matching documents via inverted index? Also, if I set cache=true for each individual filter, does it allow me to create any permutation of my bool filter

Re: Data too large error

2015-04-22 Thread Mark Walkom
This original thread is nearly a year old! You'd be better off starting a new one :) On 23 April 2015 at 01:36, Spencer Owen wrote: > Did you figure this out? I'm running into the same problem. > > On Thursday, July 31, 2014 at 3:22:28 AM UTC-6, Rhys Campbell wrote: >> >> I occasionally get the

Re: 30 billion unique documents (and counting)

2015-04-22 Thread Mark Walkom
If you are using time series data then you should be using time series indices. As Fred pointed out, routing an entire month's worth of data to a single shard is not going to scale. Also, we recommend that you keep shard size below 50GB, this helps with recovery and distribution. There is also a h
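The sizing advice above comes down to simple arithmetic. The following is a minimal sketch, not an official sizing tool: the index prefix `events`, the monthly naming pattern, and the 420 GB figure are all hypothetical, and the helper names are made up for illustration. Only the 50GB ceiling comes from the message.

```python
import math
from datetime import date


def monthly_index_name(prefix: str, day: date) -> str:
    """Build a time-series index name such as 'events-2015.04'.

    The 'prefix-YYYY.MM' pattern is an assumption chosen for this sketch.
    """
    return f"{prefix}-{day.year:04d}.{day.month:02d}"


def min_primary_shards(index_size_gb: float, max_shard_gb: float = 50.0) -> int:
    """Smallest primary shard count that keeps every shard at or below max_shard_gb."""
    if index_size_gb <= 0:
        return 1  # an index always has at least one primary shard
    return math.ceil(index_size_gb / max_shard_gb)


if __name__ == "__main__":
    # Hypothetical volume: 420 GB of primary data in one month.
    print(monthly_index_name("events", date(2015, 4, 22)), min_primary_shards(420))
```

With one index per time period, the shard count can be revised for each new index as volumes grow, instead of being fixed once for all data.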

Nested Filter & Nested Aggregation don't work together

2015-04-22 Thread BradVido
Am I misunderstanding the Nested Filter? I expected it to exclude any nested objects that didn't match it (and subsequently not match them in any nested aggregations for the same path). Example: I have a field "foo" with a mapping type of "nested". I execute a match-all query with a Nested Term

Re: Is there a way to know memory required

2015-04-22 Thread bvnrwork
35 fields - each contains 10 KB of text and what is doc values you said ? On Saturday, 18 April 2015 22:41:18 UTC-4, Mark Walkom wrote: > > Not really; how large are your fields, are they analysed, are you using > doc values? > You really need to test this with your own data set. > > On 18 Apri

Consultant Needed for initial setup / audit / tuning /etc.

2015-04-22 Thread Brian Gruber
Looking to see if we have setup the system efficiently and correctly as well as general guidance. I'm looking for someone to provide some initial consulting (not very long) to give my current setup a nice audit and make sure I've set things up efficiently/correctly. I'm having trouble finding a

Re: enabling filter cache

2015-04-22 Thread Nikolas Everett
With term queries I imagine it's nanoseconds to a net loss to use the filter cache. You should really test it though because I'm not 100% sure. There was talk of elasticsearch being more intelligent about which filters it decides to cache but I don't know where that's gone. Nik On Wed, Apr 22, 2

Re: enabling filter cache

2015-04-22 Thread Eddie Kim
In terms of performance, are we talking nanoseconds saved by using term filters, or possibly a few milliseconds? Given the performance requirements for this query, even saving a few milliseconds is a lot. Also, it looks like I should cache at the individual filter level, as they will be bundled diff

Re: enabling filter cache

2015-04-22 Thread Nikolas Everett
On Wed, Apr 22, 2015 at 2:41 PM, Ed Kim wrote: > Hi, I have a dynamic query built via java api that assembles a filtered > query depending on the parameter input. I have about a dozen filters > (mostly term filters) that may or may not be used, and had a couple > questions: > > 1. Is it ok to sim

Re: FIQL for abstraction of Query Syntax

2015-04-22 Thread joergpra...@gmail.com
I implemented CQL for Elasticsearch https://github.com/xbib/elasticsearch-plugin-sru I do not recommend it for the general case because CQL is inferior to the power and expressiveness of Elasticsearch DSL. If you have an audience that prefers old school boolean search and does not want ES-specific feat

enabling filter cache

2015-04-22 Thread Ed Kim
Hi, I have a dynamic query built via java api that assembles a filtered query depending on the parameter input. I have about a dozen filters (mostly term filters) that may or may not be used, and had a couple questions: 1. Is it ok to simply set the parent boolFilterBuilder cache setting to tr

more_like_this POST/JSON equivalent of GET

2015-04-22 Thread Aleem B
My first day with ElasticSearch and it's wonderful. The following query works fine: curl -XGET ' http://localhost:9200/demo/news/1177421/_mlt?mlt_fields=title,content&fields=id&min_doc_freq=1&pretty=true ' But the following returns incorrect results: curl -XPOST 'http://localhost:9200/demo/new

FIQL for abstraction of Query Syntax

2015-04-22 Thread Pradeep B
Hi I am responsible for provisioning a search service to the enterprise and would like to provision a querying syntax in a vendor agnostic fashion. Has anyone tried this before ? I am looking at FIQL and CQL

Re: 30 billion unique documents (and counting)

2015-04-22 Thread Kimbro Staken
Hello Fred, I have clusters as large as 200 billion documents/130TB. Sharing experiences on that would require a book, but a couple quick things that jumped out at me. 1. Do not go the huge server route. Elasticsearch works best when you scale it horizontally. The 64GB route is a much better opti

Re: Question about Query DSL in ElasticSearch

2015-04-22 Thread Tiago Filipe
Thanks for the reply. I already found another solution though (not related to ElasticSearch). Thanks again. On Wednesday, April 22, 2015 at 4:07:54 PM UTC+1, Adrien Grand wrote: > > This SQL query is a join and in general elasticsearch does not support > joins. > > If the id field is your

Re: Data too large error

2015-04-22 Thread Spencer Owen
Did you figure this out? I'm running into the same problem. On Thursday, July 31, 2014 at 3:22:28 AM UTC-6, Rhys Campbell wrote: > > I occasionally get the following error in Kibana from elasticsearch > > *1. * > > > *Oops!ElasticsearchException[org.elasticsearch.common.breaker.CircuitBreak

Re: Question about Query DSL in ElasticSearch

2015-04-22 Thread Adrien Grand
This SQL query is a join and in general elasticsearch does not support joins. If the id field is your PK, you might be able to do it by indexing B as a child of A (using parent/child) and then searching for all documents in A that have a child in B. On Wed, Apr 22, 2015 at 4:11 PM, Tiago Filipe
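The parent/child approach suggested above can be sketched as request bodies. This is a minimal illustration assuming the Elasticsearch 1.x query DSL: the type names `a` and `b` are hypothetical stand-ins for TableA and TableB, and the code only builds and prints the JSON rather than calling a cluster.

```python
import json

# Hypothetical type names: rows of TableA are indexed as type "a",
# rows of TableB as type "b" with "a" declared as their parent type.
child_mapping = {"b": {"_parent": {"type": "a"}}}

# Search body: return every "a" document that has at least one "b" child.
search_body = {
    "query": {
        "has_child": {
            "type": "b",
            "query": {"match_all": {}},
        }
    }
}

print(json.dumps(child_mapping, indent=2))
print(json.dumps(search_body, indent=2))
```

For this to behave like the SQL join, each `b` document would need to be indexed with its parent set to the matching `a` id, which is why the suggestion only works when the id field is the primary key.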

Re: I have got a little Problem with my synonym filter ....

2015-04-22 Thread Ste Phan
Ok, I found my error ... the structure of the index definition was wrong ... sorry.

Question about Query DSL in ElasticSearch

2015-04-22 Thread Tiago Filipe
I'm new to ElasticSearch and I'm struggling with this question. Basically what I want to do is sort of like this (SQL Example): SELECT A.id FROM TableA A, TableB B WHERE A.id = B.id; I want a Query that returns all of the info from TableA, but only if the id from TableA is equal to an id from

Re: I have got a little Problem with my synonym filter ....

2015-04-22 Thread Ste Phan
I tried multi_match queries. The little example seems to work meanwhile ... don't know why?! But my original index has the same problem. I am posting the synonyms file, as well as the create statement. Analyzing this via: GET /myindex/_analyze?field=article.authors {gumbel} results in: { "toke

Percentile based on aggregated data.

2015-04-22 Thread imarkoffko
My problem: There are a lot of documents with the following structure (example): {"someId" : 1, "someCounter": 2} {"someId" : 1, "someCounter": 1} {"someId" : 2, "someCounter": 3} {"someId" : 3, "someCounter": 5} {"someId" : 3, "someCounter": 1} {"someId" : 3, "someCounter": 3} I want to calculat
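The question is cut off in this listing, so the following is only a guess at the intent: it assumes the goal is to sum `someCounter` per `someId` and then take a percentile over those sums. It is a client-side sketch using the nearest-rank definition, not an Elasticsearch aggregation, and the helper names are made up for illustration.

```python
import math
from collections import defaultdict

# The six example documents from the message.
docs = [
    {"someId": 1, "someCounter": 2},
    {"someId": 1, "someCounter": 1},
    {"someId": 2, "someCounter": 3},
    {"someId": 3, "someCounter": 5},
    {"someId": 3, "someCounter": 1},
    {"someId": 3, "someCounter": 3},
]


def totals_per_id(documents):
    """Sum someCounter for each someId (the assumed aggregation step)."""
    totals = defaultdict(int)
    for doc in documents:
        totals[doc["someId"]] += doc["someCounter"]
    return dict(totals)


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of values at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[rank - 1]


totals = totals_per_id(docs)
print(totals)
print(nearest_rank_percentile(list(totals.values()), 50))
```

The point of the two-step shape is that the percentile is computed over per-id totals rather than over the raw documents, which is the part a single percentile over `someCounter` would not give.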

Re: SHIELD terms lookup filter : AuthorizationException BUG

2015-04-22 Thread Bert Vermeiren
Hi Jay, Thanks for acknowledging! Is there any way to work around this issue? We definitely need a kind of "join" filter for limiting the returned data based on some permissions/tokens. We are also starting discussions for a support and re-distribution license with both your and our marketing

Re: Query boost values available in script_score?

2015-04-22 Thread Nikolas Everett
You may want to write your question in json form. Like with a little arrow saying "this value is the one I want". On Wed, Apr 22, 2015 at 9:04 AM, Kevin Reilly wrote: > Bump. > > On Monday, April 20, 2015 at 2:48:51 PM UTC-4, Kevin Reilly wrote: >> >> Hi. Are query boost values available in scri

Grouping/extracting results uploaded to Elasticsearch

2015-04-22 Thread KT SSP
Hello We have a build process using ant that is externally monitored (web page) and shows the current goal in a group of goals. The active goal is being uploaded to elasticsearch every 60 seconds and is allowing us to monitor a builds progress. So for three goals (A,B & C) we would see somethin

Re: Query boost values available in script_score?

2015-04-22 Thread Kevin Reilly
Bump. On Monday, April 20, 2015 at 2:48:51 PM UTC-4, Kevin Reilly wrote: > > Hi. Are query boost values available in script_score? > Read the documentation with no success but perhaps I overlooked something. > > Thanks in advance.

Re: Master node refuse to accept its role

2015-04-22 Thread David Pilato
I just ran a small test on a ec2 instance. I just set node.master: true node.data: false on a node without and then with cloud-aws-plugin and it worked well in both cases. Could you try to simplify your settings to see what setting is actually causing this? cluster.name: testingelastic clou

Re: SHIELD terms lookup filter : AuthorizationException BUG

2015-04-22 Thread Jay Modi
Hi Bert, Thank you for the detailed report and reproduction of this issue. This is a known limitation with Shield and certain operations in elasticsearch. We're working to resolve this in a future release. We will be documenting this limitation and all of the operations affected shortly; this

Re: Evaluating Moving to Discourse - Feedback Wanted

2015-04-22 Thread Zhongxing Xu
This is really important for us in China!! Please switch to whatever service is accessible in China. Thank you. On Thursday, April 2, 2015 at 11:36:33 PM UTC+8, leslie.hawthorn wrote: > > Hello everyone, > > As we’ve begun to scale up development on three different open source > projects, we’ve found Google Group

Re: Master node refuse to accept its role

2015-04-22 Thread Zaid Amir
Both nodes have: ElasticSearch: 1.5.0 Cloud-AWS: 2.5.0 On Wednesday, April 22, 2015 at 2:29:39 PM UTC+3, David Pilato wrote: > > Which versions for: > > Elasticsearch > Cloud-aws-plugin > > ? > > -- > David ;-) > Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs > > On 22 Apr 2015 at 12:41,

Re: Analyzers in Elastic search

2015-04-22 Thread David Pilato
You could start here http://www.elastic.co/guide/en/elasticsearch/guide/current/mapping-analysis.html ? -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs > On 22 Apr 2015 at 12:58, Sharath Chandra wrote: > > Hi every one, > I am very new to Elastic Search.Now i am reading

Re: Master node refuse to accept its role

2015-04-22 Thread David Pilato
Which versions for: Elasticsearch Cloud-aws-plugin ? -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs > On 22 Apr 2015 at 12:41, Zaid Amir wrote: > > Hi, > > I am starting a new cluster and I want it to be set with two nodes. One is a > data only node and the other is

Analyzers in Elastic search

2015-04-22 Thread Sharath Chandra
Hi everyone, I am very new to Elasticsearch. I am now reading about analyzers. I read that *an analyzer is used at index time and at search time.* What is index time, and what is search time? I would like some basic information about analyzers. Please advise. Thank you

Re: Suspicious connections on ES

2015-04-22 Thread Jason Zhang
Also, I've noticed there are many suspicious files in /tmp, like: ``` $ ls -al /tmp 26000 32 991linux conf.n elasticsearch/ gates.lock git icp Intelip Intelips Intelnet Intelnets jrtj log .lz1429583673 xudp xx32 zlwanby ``` Has my machine been hacked? On Wednesday, April 22, 2015 at 6:16:15 PM UTC+8

Master node refuse to accept its role

2015-04-22 Thread Zaid Amir
Hi, I am starting a new cluster and I want it to be set with two nodes. One is a data only node and the other is a master only node. Both nodes are hosted on Amazon EC2 and are in the same region Here is my configuration for both nodes. Master Node: cluster.name: testingelastic cloud.aws.acce

Suspicious connections on ES

2015-04-22 Thread Jason Zhang
Hi, Recently I found something odd using lsof: ``` $ sudo lsof -p pid | grep -i tcp | awk '{print $1, $10}' | sort | uniq freeBSD my_ip:random_port->unknown_ip:port Intelnets my_ip:random_port->unknown_ip:port .lz142958 my_ip:random_port->unknown_ip:port service (ESTABLISHED) sh (ESTABLISHED) xu

Re: upgrade java for elasticsearch node

2015-04-22 Thread Jason Wee
Thanks Jörg, I'm fully aware of the Java 7 EOL. By "index will remain as is" I meant that, after Java is upgraded to 7 on all nodes, clients can query/index without any problem. If the Lucene index needs to be upgraded internally, so be it, as long as everything is okay. Well yea, backup is very important too, that'

Fault tolerant tribe nodes?

2015-04-22 Thread Espen Wang Andreassen
Does anyone know if it is possible to get a tribe-node to _not_ go dark if a configured child-cluster goes offline? We would like to use it to provide federated search across multiple clusters (in different data centers) - but as it is now the tribe node will not return _anything_ if one of the

Re: upgrade java for elasticsearch node

2015-04-22 Thread joergpra...@gmail.com
Please note, Java 7 has reached end of life, and will no longer receive updates https://www.java.com/en/download/faq/java_7.xml I recommend Java 8. ES is sensitive to JVM changes (hash codes for hash maps are computed differently in Java 8) but this exposes only in rare cases. I am not sure wha

Re: upgrade java for elasticsearch node

2015-04-22 Thread Jason Wee
Thanks david, my follow up questions. > So, basically shutdown all nodes and clients. Then upgrade your JVM. That sounds to me no rolling upgrade :( the users will experience down time, but with your recommendation, when jvm is upgraded on all nodes and clients, the es instance in the es node just

Re: What web server Elasticsearch use on windows?

2015-04-22 Thread David Pilato
Elasticsearch provides its own web server, so you don’t need to provide anything other than a JVM. Netty is used BTW.

Re: org.elasticsearch.index.mapper.MapperParsingException: failed to parse -- NEED HELP

2015-04-22 Thread David Pilato
I think your mapping expects an object for the error field, but you sent a string in it.
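One way to confirm this kind of conflict before indexing is to scan the JSON log lines and see which JSON kinds a field takes. A minimal sketch: the two sample log lines are hypothetical, the helper names are made up for illustration, and only the `error` field name comes from the thread.

```python
import json


def json_kind(value):
    """Name the JSON kind of a decoded value."""
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    if isinstance(value, str):
        return "string"
    if isinstance(value, bool):  # checked before numbers: bool is a subclass of int
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    return "null"


def kinds_for_field(lines, field):
    """Collect every JSON kind that `field` takes across the given JSON lines."""
    kinds = set()
    for line in lines:
        doc = json.loads(line)
        if field in doc:
            kinds.add(json_kind(doc[field]))
    return kinds


# Hypothetical log lines: the same field is an object in one and a string in the other.
sample_logs = [
    '{"message": "request failed", "error": {"code": 500, "reason": "timeout"}}',
    '{"message": "request failed", "error": "timeout"}',
]

print(sorted(kinds_for_field(sample_logs, "error")))
```

If more than one kind shows up for a field, the second shape to arrive will be rejected by the mapping created from the first, so the logs need a consistent shape or a separate field name.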

Re: upgrade java for elasticsearch node

2015-04-22 Thread David Pilato
You need to upgrade both at the same time. Otherwise, you might get non serializable exceptions and the cluster might not behave correctly. So, basically shutdown all nodes and clients. Then upgrade your JVM. That said, you might be able to upgrade clients after that but I’d be super conservati

org.elasticsearch.index.mapper.MapperParsingException: failed to parse -- NEED HELP

2015-04-22 Thread Tony Chong
Hi, Hello. I have read about similar problems online but haven't really figured out what the solution is, and was hoping somebody can point me in the right direction. I'm using ELK. ES 1.5.0 Logstash 1.5.0rc2 Kibana 4.0.1 I have all type of application logs that are written out as JSON, but

What web server Elasticsearch use on windows?

2015-04-22 Thread Xudong You
I deployed ES on my machine and can access it via an HTTP URL, but I have not installed IIS on my Windows machine yet, so what web server does Elasticsearch use on Windows?

What web server does Elasticsearch use on Windows?

2015-04-22 Thread Xudong You
I deployed ES on a Windows machine and can access it via HTTP requests, but I have not installed IIS yet, so what web server does ES use by default on Windows?

30 billion unique documents (and counting)

2015-04-22 Thread fdevillamil
Hi list, I've been using ES in production since 0.17.6 with clusters up to 64 virtual machines and 20T data (including 3 replica). We're now thinking about pushing things a bit further and I wondered if people here had similar experience / needs as we do. Our current index is 1.1 billion uniqu

Re: maxDocs different between primary and replica shards

2015-04-22 Thread christian . dahlqvist
Hi, Merging of segments and the resulting removal of deleted documents is not coordinated across nodes in Elasticsearch, meaning that the amount of deleted documents can differ between primary and replica shards. Optimising an index down to a single segment does resolve this, but can as noted b

Re: Elasticsearch Version Upgrade

2015-04-22 Thread Norberto Meijome
Yup thanks, that's what I thought. On 22/04/2015 2:49 pm, "David Pilato" wrote: > Only post 1.0 > > > -- > David ;-) > Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs > > On 22 Apr 2015 at 01:14, Norberto Meijome wrote: > > David, is this the case with older versions (both client and s

Re: Elasticsearch puppet module's problem

2015-04-22 Thread Sergey Zemlyanoy
Update When I commented this part of config.pp the service and configs appeared on host /*# Removal of files that are provided with the package which we don't use file { '/etc/init.d/elasticsearch': ensure => 'absent' } file { '/usr/lib/systemd/system/elasticsearch.servic

Re: maxDocs different between primary and replica shards

2015-04-22 Thread chris
Same problem... Bueller? Bueller? On Thursday, May 8, 2014 at 1:07:44 PM UTC-7, Csaba Dezsényi wrote: > > I have exactly the same issue! > Does someone have a solution for this? > > Thanks, > Csaba > > On Thursday, November 28, 2013 at 2:26:51 PM UTC+1, Klaus Brunner > wrote: >> >

upgrade java for elasticsearch node

2015-04-22 Thread Jason Wee
Hello, We are using Java 6 for our Elasticsearch 0.90 nodes. I googled, but found nothing that specifically discusses how to upgrade the Java version deployed in an Elasticsearch cluster. The closest I found is this: https://svn.apache.org/repos/asf/lucene/dev/trunk/lucene/JRE_VERSION_MIGRATION.t

Re: Script to return array for scripted metric aggregation from combine

2015-04-22 Thread vineeth mohan
Hello Colin , You are the man :). Seems i have a lot to learn in groovy. Thanks a ton man , it really helped me. Thanks Vineeth On Tue, Apr 21, 2015 at 9:39 PM, Colin Goodheart-Smithe < coling...@elastic.co> wrote: > Vineeth, > > You can return any standard groovy object (by this i m