Re: Threadpool sizes

2015-04-17 Thread Otis Gospodnetic
Yes, merges can hurt, but you can throttle them: http://search-lucene.com/?q=throttle+merge&fc_project=ElasticSearch You can easily correlate search latency with merges, flushes, and refreshes with something like SPM for Elasticsearch. This could help you figure out how much you need to throttl

Jest alternative for Java client for ES HTTP API?

2014-11-26 Thread Otis Gospodnetic
Hi, Are there other Java clients that talk to the ES HTTP API that people like to use, other than Jest? Thanks, Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ -- You received this message because you are subscr

Re: Node stats, since when?

2014-11-24 Thread Otis Gospodnetic
November 16, 2014 9:20:29 PM UTC+1, Otis Gospodnetic wrote: >> >> Hi, >> >> If it's total, then they are from the start. >> >> Otis >> -- >> Monitoring * Alerting * Anomaly Detection * Centralized Log Management >> Solr & Elasticsearch S

Re: Is there any way of tracking request id?

2014-11-22 Thread Otis Gospodnetic
Hi Krysztof, I'm not 100% sure if this will help, but we are adding tracing and error capture to SPM. I *think* that may do what you are after. In the meantime you could try setting up something like Zipkin. Otis -- Monitoring * Alerting * Anomaly Detection * Centr

Re: Tribe Nodes

2014-11-22 Thread Otis Gospodnetic
I believe so, yes. Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Wednesday, November 19, 2014 11:10:38 AM UTC-5, Hari Kosaraju wrote: > > Hi, > > Is it possible to have multiple tribe nodes connect to the sa

Re: cost of automatics refresh

2014-11-22 Thread Otis Gospodnetic
See http://blog.sematext.com/2013/07/08/elasticsearch-refresh-interval-vs-indexing-performance/ Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Wednesday, November 19, 2014 3:04:29 PM UTC-5, Jinyuan Zhou wrot

Re: problem with heap space overusage

2014-11-22 Thread Otis Gospodnetic
It could be a number of things. Check your various ES caches. Full? Correlated with GC activity increase and eventual OOM. Then check your queries - are they big? Expensive aggregations? (the other day I saw one of our clients using agg queries 10K lines in size) I could keep asking questi

Re: how to migrate lucene index into elasticsearch

2014-11-22 Thread Otis Gospodnetic
You didn't say why you can't just reindex data from original source, but that would be the cleanest way and likely the fastest in terms of human time (and $) you'll *likely* spend if you try using a "shortcut". Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr &

Re: Marvel / ES query document count major discrepancy

2014-11-22 Thread Otis Gospodnetic
Maybe it's counting replicas? Or its own docs? See SPM for an alternative. Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Thursday, November 20, 2014 2:36:42 PM UTC-5, Mike Seid wr

Re: Why ES node starts recovering all the data from other nodes after reboot?

2014-11-22 Thread Otis Gospodnetic
Hi Konstantin, Check out http://gibrown.com/2014/11/19/elasticsearch-the-broken-bits/ Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Thursday, November 20, 2014 9:48:56 PM UTC-5, Konstantin Erman wrote: > >

Re: Index load distribution

2014-11-22 Thread Otis Gospodnetic
To add to Mark's comment - you'll obviously want to make sure your cluster is more or less balanced (in terms of shards, their sizes, etc.). Should happen automatically, but we've seen a number of situations where things were not working well because shards were not quite balanced, so you may

Re: Node spikes to 1000 threads and hangs, once or twice a day. Help?

2014-11-22 Thread Otis Gospodnetic
Hi, Look at query rates and see if they correlate. I'm guessing they jumped, too. SPM will help with that. Once you confirm you can trace the source of queries further upstream. Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & El

Re: Are rivers still being deprecated?

2014-11-16 Thread Otis Gospodnetic
rious data import handlers of >> SolrCloud demonstrate where this leads to. >> >> I don't know what stance the Elasticsearch folks take towards rivers, but >> from an architectural point of view, the clear separation from >> Elasticsearch seems like a good mid- to

Re: Are rivers still being deprecated?

2014-11-16 Thread Otis Gospodnetic
We've seen issues with Rivers in the past and no longer use them in our engagements. Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Thursday, November 6, 2014 9:17:36 PM UTC-5, Alexandre Rafalovitch wrote: >

Re: Node stats, since when?

2014-11-16 Thread Otis Gospodnetic
Hi, If it's total, then they are from the start. Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Wednesday, November 12, 2014 7:40:52 AM UTC-5, Lasse Schou wrote: > > Hi, > > In the Node Stats API, there are s

To HTTP chunk or not to chunk

2014-11-16 Thread Otis Gospodnetic
Hi, http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.4/modules-http.html recommends NOT to use HTTP Chunking and to use Keep-Alive connections. Yet, the HTTP chunking link there points to http://en.wikipedia.org/wiki/Chunked_transfer_encoding , which implies HTTP Chunking is goo

Re: Heavy load on a small Elasticsearch cluster

2014-10-25 Thread Otis Gospodnetic
Hi, On Friday, October 24, 2014 3:30:01 PM UTC-4, Mathieu Lecarme wrote: > > > > Le vendredi 24 octobre 2014 19:59:01 UTC+2, Jörg Prante a écrit : >> >> You're doomed :) >> >> What monitoring tool do you use? Try to reduce the frequency at least. >> >> Jörg >> >> > New Relic monitor the OS but do

Re: Multiple TCP between ES servers

2014-10-23 Thread Otis Gospodnetic
Hi, Have a look at https://www.found.no/foundation/elasticsearch-networking/ Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Thursday, October 23, 2014 2:40:47 PM UTC-4, daaku gee wrote: > > > ES version 1.1

Re: how to retrieve cluster and node stats on data node when disable http (http.enabled: false)

2014-10-23 Thread Otis Gospodnetic
Hello, This works for SPM: https://sematext.atlassian.net/wiki/display/PUBSPM/SPM+FAQ#SPMFAQ-CanSPMcollectmetricsevenwhenElasticsearchHTTPAPIisdisabled? so maybe it will work for you, too. Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management

Re: Upper bounds on the number of indexes in an elastic search cluster

2014-09-26 Thread Otis Gospodnetic
Hi Todd, Maybe I missed it or you never said what sort of number of applications/indexes you are thinking - dozens, hundreds, a few thousand, 10K+? Some of our SPM users have Elasticsearch clusters with a dozen or so nodes and 5-10K indexes. Our own Logsene

Re: msgpack support

2014-09-26 Thread Otis Gospodnetic
Hi, I don't know of it, although I see https://github.com/elasticsearch/logstash/blob/master/lib/logstash/codecs/msgpack.rb Did you maybe implement this already? Thanks, Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematex

Re: Rollup data in ES

2014-09-23 Thread Otis Gospodnetic
Hi, We aggregate outside of ES, in memory, and push in bulk. We could still roll up the data stored in ES later on if we wanted to, but reading from ES could get expensive. Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sem
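A minimal sketch of the "aggregate in memory, then push in bulk" idea from the message above. This is not the poster's implementation: the event shape `(timestamp, metric, value)`, the 60-second bucket, and the output field names are all assumptions made for illustration. The resulting documents would then be sent to ES in a bulk request.

```python
from collections import defaultdict


def roll_up(events, bucket_seconds=60):
    """Aggregate (timestamp, metric, value) events into per-bucket summaries.

    The event tuple layout and the output field names are hypothetical.
    """
    buckets = defaultdict(
        lambda: {"count": 0, "sum": 0.0, "min": float("inf"), "max": float("-inf")}
    )
    for ts, metric, value in events:
        # Align the timestamp to the start of its bucket.
        bucket_start = int(ts) - int(ts) % bucket_seconds
        agg = buckets[(bucket_start, metric)]
        agg["count"] += 1
        agg["sum"] += value
        agg["min"] = min(agg["min"], value)
        agg["max"] = max(agg["max"], value)
    # One document per (bucket, metric), ready to be indexed in bulk.
    return [
        {"timestamp": start, "metric": metric, **agg}
        for (start, metric), agg in sorted(buckets.items())
    ]
```

Rolling up before indexing keeps the raw events out of ES entirely, which is the trade-off the message describes: re-aggregating later from ES is possible but reads may be expensive.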

Elasticsearch on EC2 R3 instances ==> trouble

2014-09-23 Thread Otis Gospodnetic
Hi, Here is something worth sharing, I think. For Logsene, we've been trying out EC2 R3 instances (specifically r3.large) with Elasticsearch (1.1.x and 1.3.x including 1.3.2) with the latest Java 7 update 67. We tried this on Ubuntu 14.04 AMI, as well as Amazon

Re: Unnecessary Cache Eviction Explained

2014-09-22 Thread Otis Gospodnetic
Hi, It sounds like every single ES deployment out there suffers from this, or am I missing something? Is there an ES issue where this could be tracked (even if the problem is in Guava)? Thanks, Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch

Re: powerful cluster is not able to handle 1.5Tb of data, how to optimize?

2014-09-17 Thread Otis Gospodnetic
Hi Pavel, When you open Kibana and things are slow, what's happening with your server? Is/are the CPUs maxed out for a minute? Do you see heavy disk IO? Swapping? You can use our SPM to see all this and various other ES metrics. Show/tell us what you see and people wil

Re: Query pre-processing before execution?

2014-08-25 Thread Otis Gospodnetic
you want field specific mangling, I would use the field mapper to >>> create a new field type. There, you have read access to the whole >>> (immutable) document source and you can pre-process the field input data in >>> the given document context before indexing. >&

Query pre-processing before execution?

2014-08-20 Thread Otis Gospodnetic
Hi, What is the best way to pre-process a query a bit before ES executes it? (e.g. I want to shingle the query string a bit and expand/rewrite a query before letting ES execute it) I can create a custom Rest Action and a new API endpoint, but I'd prefer to hide custom query pre-processing behi

Re: Large Scale elastic Search Logstash collection system

2014-08-17 Thread Otis Gospodnetic
Hi Robert, Or maybe it's worth rethinking the architecture to avoid having to do tricks like no-replicas for 1h. Kafka in front of ES comes to mind. We use this setup for Logsene and don't have the problem with log loss, so it may work well for you, too. I thin

Re: Elastic + Kibana Server Specs Recommendation

2014-08-17 Thread Otis Gospodnetic
Hi AK, It's a little hard to make a recommendation like this because it really depends on how you've structured your logical and physical index, how much historical data you want to keep and query, what sort of queries you run, how fast you need things to be, etc. Something like SPM for Elasti

Re: How would you compare ES, Lucene with Enterprise Search ?

2014-08-17 Thread Otis Gospodnetic
tis, > I know this is old, but what do you mean by "taller"? > > On Friday, May 24, 2013 12:28:17 PM UTC-4, Otis Gospodnetic wrote: >> >> Hi, >> >> Short answer: I don't know. >> Medium answer: I'm sure each vendor would claim superiority

Re: Recommendations needed for large ELK system design

2014-07-31 Thread Otis Gospodnetic
You can further simplify your architecture by using rsyslog with omelasticsearch instead of LS. This might be handy: http://blog.sematext.com/2013/07/01/recipe-rsyslog-elasticsearch-kibana/ Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://s

Re: Diagnosing a slow query

2014-07-31 Thread Otis Gospodnetic
ORs certainly tend to be slower than simpler/shorter term queries, but I'd suspect that cross-DC part because your index is tiny and servers are beefy and plentiful. Maybe you can look at your network and query metrics and correlate a drop or spike in traffic or packet loss with slow queries?

Re: Handling node failure in ES cluster

2014-07-21 Thread Otis Gospodnetic
Lots of things could be the source of problems here. Maybe you can tune the JVM params. We don't know what you are using or what your GC activity looks like. Can you share GC metrics graphs? If you don't have any GC monitoring, you can use SPM. Why do you have 5

Re: Search query memory usage metrics

2014-07-20 Thread Otis Gospodnetic
Hi, As far as I know, not doable on a per-query level. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Sunday, July 20, 2014 2:02:22 PM UTC-4, kmoore.cce wrote: > > Hi, > > I was wondering if there was a means to gather s

Re: Splunk vs. Elastic search performance?

2014-07-18 Thread Otis Gospodnetic
Hi Oneti, Yes, you can use omelasticsearch and index logs directly from rsyslog to ES. No need for Logstash. We have some documentation about how to index logs into Logsene over at https://sematext.atlassian.net/wiki/display/PUBLOGSENE/Sending+Events+to+Logsene and the piece that sounds like

Re: Any experience with ES and Data Compressing Filesystems?

2014-07-18 Thread Otis Gospodnetic
Hi Horst, I wouldn't bother with this for the reasons Joerg mentioned, but should you try it anyway, I'd love to hear your findings/observations. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Wednesday, July 16, 2014 6

Re: tiering storage / Curator

2014-07-18 Thread Otis Gospodnetic
Hi, On Tuesday, July 15, 2014 1:20:39 AM UTC-4, Patrick Proniewski wrote: > > Hello, > > Curator makes is possible to migrate an index to another storage > programmatically, and that's very nice to keep old indices on cheap > storage. But if I understand correctly, a unique ES cluster cannot ha

Re: Certified platforms for ElasticSearch

2014-07-18 Thread Otis Gospodnetic
It runs just fine under OSX and I suspect everything else that runs Java. What other platforms are you interested in or had trouble with? Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Wednesday, July 16, 2014 2:27:14 AM

Re: How logs stored in Logstash/Elastisearch

2014-07-18 Thread Otis Gospodnetic
Hi, On Wednesday, July 16, 2014 8:04:09 AM UTC-4, Sandip Bankewar wrote: > > Hello Mark, > > Thanks for your response. > > 1. one log entry in the Logstash is a document what do you mean by that? > Look at Lucene. ES uses Lucene. Lucene has the notion of a "document". In case of Logstash+ES,

Re: using only ES or combined combined with Mongodb or Cassandra

2014-07-18 Thread Otis Gospodnetic
Some people simply feel more comfortable using some of these other technologies. Others may really have different access patterns, some of which ES may not handle well (this is theory - in our work with many, many companies using ES I've seen many cases of this double-storage approach, but I d

Re: Cassandra + Elasticsearch or Just Elasticsearch for Primary data store.

2014-07-18 Thread Otis Gospodnetic
It doesn't sound like Cassandra adds any value. You could have asked the same question, but substituting Cassandra with HBase or HDFS or MySQL, or any other type of storage. But if your main goal is to search it, ES will do just fine. You can always do snapshots to make backups, feed ES throu

Re: Get Perf Counters for ElasticSearch Nodes using JSON

2014-07-18 Thread Otis Gospodnetic
Are you looking for something like http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster-stats.html ? Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Wednesday, July 16, 2014 8:14:00 PM UTC-4, K

Re: IP geolocation without Logstash

2014-07-18 Thread Otis Gospodnetic
Hi, On Thursday, July 17, 2014 10:20:55 AM UTC-4, Justin Koehler wrote: > > I'm working on a system to record usage data for an application that > submits its data to an ES cluster. I would like to record the location of > each data point based on IP geolocation. I found the Logstash plugin that

2 JOBs: Elasticsearch engineer @ Sematext

2014-07-08 Thread Otis Gospodnetic
Hi, At Sematext we have 2 interesting openings. 1) We are looking for an engineer who knows Elasticsearch (or Solr or both) and wants to use these technologies to implement search and analytics solutions for both Sematext's own products such as SPM

Re: Alerting in ELK stack?

2014-07-07 Thread Otis Gospodnetic
We have and use SPM for all our metrics (ES, Kafka, Apache, MySQL, Hadoop, everything) and we feed our logs to Logsene (it has a Kibana UI and a "native" UI). SPM has alerting and anomaly detection, so we use that to get out of bed early

Re: What does it take to make a custom stemmer for ES?

2014-07-07 Thread Otis Gospodnetic
Hi Nandiya, Have a look at Lucene and its source-code for token filters. You'd implement a custom stemmer at Lucene level, and then just use that in ES. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Monday, July 7, 20

Re: Migration from Solr to ElasticSearch

2014-06-02 Thread Otis Gospodnetic
Hi, You could migrate from Solr to ES without reindexing because at the end of the day it is Lucene that writes data to index. You'd want to make sure your ES mappings match your Solr schema. You'd want to create the matching number of shards and replicas you had in Solr(Cloud?). You'd manually

Re: alerts from Kibana/ES

2014-06-02 Thread Otis Gospodnetic
alytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ tel: +1 347 480 1610 fax: +1 718 679 9190 On Friday, May 30, 2014 2:31:07 AM UTC-4, NF wrote: > > That's right, Otis. > > On Friday, May 30, 2014 7:20:27 AM UTC+2, Otis Gospodnetic wrote: &

Re: alerts from Kibana/ES

2014-05-29 Thread Otis Gospodnetic
Hi, There's no alerting in Kibana. Have a look at SPM - it has ES monitoring, threshold and heartbeat alerting, anomaly detection, and a number of other features. Actually, re-reading your email - you are looking to get notified when a certain event is captured? B

Re: Optimizations for nested aggregations

2014-05-29 Thread Otis Gospodnetic
Massive JSON responses could indeed be a problem. I think you can easily see if CPU, Disk, or Network are the bottleneck using really any monitoring tool. Even dstat --cpu --mem --disk --net will give you an idea. :) Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elas

Re: Managing Snapshot Files from Outside ES

2014-05-29 Thread Otis Gospodnetic
Hi, I believe once you take a snapshot, it's up to you what you do with them... if I understood your question correctly. What you describe - sending old snapshots to Glacier, deletion of old snapshot files sounds interesting. Would be interesting to see info about how to do this published som

Re: looking for heavy write optimization

2014-05-29 Thread Otis Gospodnetic
Hi, I see Jorg already provided a nice list of suggestions. But check that FS type - ext2! That's ancient! Try ext3, ext4, or xfs. If you turn off journaling things will be faster. You are using UDP you said, so you must be OK with some data loss. Btw. SPM can m

Re: Print nearby lines after query result

2014-05-29 Thread Otis Gospodnetic
I'd be curious to hear if anyone has any clever suggestions for this, too. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Thursday, May 29, 2014 11:29:18 AM UTC-4, Senthil Raja wrote: > > > In unix we are using grep -A 5 -

Re: ANN: new elasticsearch discovery plugin - eskka

2014-05-08 Thread Otis Gospodnetic
Cool, Shikhar, At Sematext we use both ES and Akka (in SPM), so this is interesting for me to see... Would it make sense to add a bit more to the README? Things like: * why? is something wrong with Zen? * pros and cons of this vs. Zen vs. ZK Thanks, Otis -- Perfo

Re: Create new index in busy cluster

2014-04-28 Thread Otis Gospodnetic
Hi, In short, yes. How much exactly depends on how intensive your indexing is. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Monday, April 28, 2014 11:28:41 AM UTC-4, Martin Forssen wrote: > > Assuming I have a heavily

Misleading response latency in Stats API?

2014-04-28 Thread Otis Gospodnetic
Hi, Does ES expose the "total latency" stats - meaning, the latencies that the user sees as "took" in responses? The only latency stats I see ES exposing in the stats API are the latencies from requests between individual nodes, which can often be much shorter than the total latency and thus m

Re: Performance tuning ES for in-memory

2014-04-26 Thread Otis Gospodnetic
or single node, with the > index on 4*2TB spindle disks. > > I have not much SSD for long runs available now. More intensive tests over > days would take synthetic data (docs with random data in random fields with > statistical models). > > Jörg > > > On Fri, Apr 2

Re: Performance tuning ES for in-memory

2014-04-25 Thread Otis Gospodnetic
erge settings of ES 1.1.0 (with serial segment > merge, 1.1.1 has concurrent segment merge again), same routine runs for > many hours, it starts fast, but gets slower then. > > Jörg > > > On Fri, Apr 25, 2014 at 4:13 PM, Otis Gospodnetic > > > wrote: > >> &

Re: Performance tuning ES for in-memory

2014-04-25 Thread Otis Gospodnetic
Hi J On Wednesday, April 23, 2014 12:04:05 PM UTC-4, Jörg Prante wrote: > > High sustainable bulk indexing is very stable here. > > I have 3x HP DL165 G7 32 core machines and can index for hours at same > speed with this settings > > https://gist.github.com/jprante/10666960 > At what rate, thoug

Re: Sense on github abandoned?

2014-04-05 Thread Otis Gospodnetic
Hi, Couldn't Boaz go in and just add the LICENSE to his old Sense Github repo now and solve the license part of the problem? Otis On Tuesday, April 1, 2014 12:14:44 PM UTC-4, kimchy wrote: > > Sense started as a weekend project, and Boaz did not place a license on > it. As you mentioned, this

Re: Disable merge

2014-03-25 Thread Otis Gospodnetic
Hey Vineeth, Don't think so. But you can play with flush and merge throttle params. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Tuesday, March 25, 2014 11:04:05 AM UTC-4, vineeth mohan wrote: > > Hello , > > I am doi

Re: word hit maximum time by users

2014-03-25 Thread Otis Gospodnetic
Hi Robin, Are you looking for something like http://www.sematext.com/search-analytics/ ? Check out the Top Queries report there. You can see it in this live demo, actually: https://apps.sematext.com/demo#sa Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearc

Re: GROUP BY query in ElasticSearch

2014-03-24 Thread Otis Gospodnetic
> Indeed sounds like > https://github.com/elasticsearch/elasticsearch/issues/256 > > Not easy to do. > > > On Tuesday, March 25, 2014 2:20:42 AM UTC, Otis Gospodnetic wrote: >> >> Hi David, >> >> I looked around and haven't been able to find any exampl

Re: GROUP BY query in ElasticSearch

2014-03-24 Thread Otis Gospodnetic
Hi David, I looked around and haven't been able to find any examples of bucket aggregations other than terms one. I think the way this aggregation is described makes it sound like group by in RDBMS world, but it looks like that's not the case. Take this pseudo-example: SELECT category, city,

Re: Logstash logs

2014-03-24 Thread Otis Gospodnetic
There is no such thing as default log level in Logstash, as far as I know. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ tel: +1 347 480 1610 fax: +1 718 679 9190 On Friday, March 21, 2014 2:20:06 PM UTC-4, Ivan Brusic wr

Re: Optimal number of Shards per node

2014-03-24 Thread Otis Gospodnetic
Hi Rajan, http://www.elasticsearch.org/guide/en/elasticsearch/client/community/current/health.html Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ tel: +1 347 480 1610 fax: +1 718 679 9190 On Friday, March 21, 2014 12:56:5

Re: ES instead of Cassandra.

2014-03-24 Thread Otis Gospodnetic
Hi Tim, It depends on the details - "analytics platform" is a bit too abstract. For what it's worth, we run a big performance monitoring and search analytics service (see signature) on top of HBase as persistent data store. This data store layer is pluggable, so we plugged in Solr and then E

Re: Scaling to 150k/sec

2014-03-19 Thread Otis Gospodnetic
Hi, These are nice and small and require no analysis. Turn off _all, tweak merge rate, use high refresh interval, give ES/Lucene a good buffer, look at xa log flush settings, etc. and you should be able to get to 150K/sec without requiring dozens of servers. Otis -- Performance Monitoring * Lo
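A hedged sketch of what some of the tuning advice above might look like as an index-creation body for the ES 1.x line discussed in this thread. Assumptions: "xa log" is read here as the translog, the values ("30s", "1gb") are placeholders rather than recommendations, and the type name is hypothetical. Merge throttling and the indexing buffer are node-level settings in that era (for example `indices.memory.index_buffer_size` in elasticsearch.yml), so they are not covered by this body.

```python
import json


def high_throughput_index_body(type_name="event", refresh_interval="30s"):
    """Build a create-index body for heavy bulk indexing (ES 1.x style).

    All values are illustrative placeholders, not tested recommendations.
    """
    return {
        "settings": {
            "index": {
                # A longer refresh interval means fewer, larger segments.
                "refresh_interval": refresh_interval,
                # Flush the translog less often (assumed meaning of "xa log").
                "translog": {"flush_threshold_size": "1gb"},
            }
        },
        "mappings": {
            type_name: {
                # Skip the catch-all field when it is never searched.
                "_all": {"enabled": False},
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(high_throughput_index_body(), indent=2))
```

The body would be sent as the payload of an index-creation request. Whether these settings get anywhere near 150K/sec depends on hardware and document shape, so measure before and after each change.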

Re: Best way to duplicate data across clusters live?

2014-03-12 Thread Otis Gospodnetic
Consider Kafka 0.8.1. It comes with a MirrorMaker tool that mirrors Kafka data (to multiple DCs). Once data is local, you can feed your ES from the local Kafka broker. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Wed

Re: Multilingual Search Plugin

2014-03-12 Thread Otis Gospodnetic
Can I join the game? Nobody at Wikimedia uses a Mac or Winblows? ;) OK, sorry, couldn't resist, please carry on. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Wednesday, March 12, 2014 2:11:27 PM UTC-4, Nikolas Everett

Re: 3,000 events/sec Architecture

2014-03-12 Thread Otis Gospodnetic
g protocol socket > receiver and index log messages into ES, for example by reusing syslog4j. > Similar to the UDP bulk indexing. > > Jörg > > > On Wed, Mar 12, 2014 at 4:16 AM, Otis Gospodnetic > > > wrote: > >> Hi, >> >> Is that Logstash inst

Re: scalability and creating 1 index per user

2014-03-11 Thread Otis Gospodnetic
Hi, I know some users of SPM for Elasticsearch have clusters with many thousands of indexes (and growing), each with 5+ shards. They've been using SPM for many months now, plus they are our clients, so I've had the chance to see their servers and metrics and can tell you that I don't see any othe

Re: Does the server support streaming?

2014-03-11 Thread Otis Gospodnetic
Hi, The key is to find the ideal bulk size and the ideal bulk request concurrency level, and then make sure the client always feeds ES enough data to achieve (close to) ideal utilization and minimize idling on either side. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr
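The two knobs named above, bulk size and bulk request concurrency, can be sketched roughly as follows. This is an illustration under assumptions, not a client implementation: `send` is a hypothetical callable that would POST a body to the `_bulk` endpoint, and the default index name, type name, batch size, and worker count are placeholders to be tuned by measurement.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def batches(docs, bulk_size):
    """Yield lists of at most bulk_size documents."""
    batch = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == bulk_size:
            yield batch
            batch = []
    if batch:
        yield batch


def bulk_body(batch, index, doc_type):
    """Render one batch as a newline-delimited bulk request body."""
    action = json.dumps({"index": {"_index": index, "_type": doc_type}})
    lines = []
    for doc in batch:
        lines.append(action)           # action/metadata line
        lines.append(json.dumps(doc))  # document source line
    return "\n".join(lines) + "\n"     # the bulk body must end with a newline


def index_all(docs, send, bulk_size=1000, concurrency=4,
              index="logs", doc_type="event"):
    """Send all docs using `concurrency` parallel bulk requests.

    `send` is a hypothetical function taking a bulk body string.
    """
    bodies = (bulk_body(b, index, doc_type) for b in batches(docs, bulk_size))
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(send, bodies))
```

In practice one would vary `bulk_size` and `concurrency` while watching indexing rate and rejections on the ES side, and stop increasing either once throughput levels off.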

Re: Why not auto release the memory

2014-03-11 Thread Otis Gospodnetic
Hi, Are you seeing OOMs? Or long GC pauses? If not, don't worry - let the JVM make use of the memory you let it use with -Xmx. Conceptually similar to http://www.linuxatemyram.com/ Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext

Re: 3,000 events/sec Architecture

2014-03-11 Thread Otis Gospodnetic
Hi, Is that Logstash instance reading files that are produced by syslog-ng servers? Maybe not but if yes, have you considered using Rsyslog with omelasticsearch instead to simplify the architecture? Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Supp

Re: Elasticsearch Performance Analysis

2014-03-11 Thread Otis Gospodnetic
Hi, I doubt you'll have trouble with 500 events/second unless you are doing it on a Spectrum. Here are some more numbers for log events: http://blog.sematext.com/2014/01/20/rsyslog-8-1-elasticsearch-output-performance/ Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & El

Re: [Ann] Elasticsearch Image Plugin 1.0.0 released

2014-03-11 Thread Otis Gospodnetic
Cool! I've been meaning to write something to index all my photos and..very/most importantly, find (near) dupes, so I can clean up my piles of digital pictures. I have a feeling I'm not the only person in a need for something like this and if you've used LIRE (still alive? Amazing!) + E

Re: Securing Kibana + Elasticsearch for public use

2014-03-11 Thread Otis Gospodnetic
Hi, Re ES and VPSs with up to 512 MB RAM. Sure, doable, but very very small nodes/indexes. :) Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Thursday, March 6, 2014 1:50:23 AM UTC-5, qu...@hakase.org wrote: > > Hi, >

Re: ES alerting mechanism for failure scenarios, high latency situations

2014-03-06 Thread Otis Gospodnetic
Hi, We use our own SPM for Elasticsearch. It has classic threshold-based alerts as well as alerts based on automatic anomaly detection - http://blog.sematext.com/2013/10/15/introducing-algolerts-anomaly-detection-alerts/ . It's a SaaS, not a plugin, but maybe it would work for you. Otis --

JOB @ Sematext: Professional Services Lead => Head

2014-02-18 Thread Otis Gospodnetic
Hello, We have what I think is a great opening at Sematext. Ideal candidate would be in New York, but that's not an absolute must. More info below + on http://blog.sematext.com/2014/02/18/job-professional-services-lead-solr-and-elasticsearch/ in job-ad-speak, but I'd be happy to describe what we a

Re: Fault Tolerance Fallacy

2014-02-05 Thread Otis Gospodnetic
Support * http://sematext.com/ On Wed, Feb 5, 2014 at 2:58 PM, Ivan Brusic wrote: > Otis, you can listen to the comment here at 1:04:20: > http://player.vimeo.com/video/85255909 > > > On Mon, Jan 27, 2014 at 11:15 AM, Otis Gospodnetic < > otis.gospodne...@gmail.com> wrote: > >

Re: How to monitor for filter cache churn?

2014-01-27 Thread Otis Gospodnetic
Hi Tikitu, Re 1. and filter cache size + eviction monitoring, here is an example: https://apps.sematext.com/spm-reports/s/b5g0cSyGm0 Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Monday, January 27, 2014 5:03:47 PM UTC-5

Re: Fault Tolerance Fallacy

2014-01-27 Thread Otis Gospodnetic
FYI, from Shay: https://twitter.com/otisg/status/427866316444553216 Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Thursday, January 23, 2014 6:49:05 PM UTC-5, Ken Edwards wrote: > > A few months ago when my team was deci

Re: What components to track for performance monitoring?

2014-01-25 Thread Otis Gospodnetic
n merges(?) > > I haven't looked at 1.0 so not sure if already addressed but would be > loverly to have a more granular breakdown. The main issue we have is > tracking timings on individual queries. Maybe there is already some > setting to allow this? > > On Tuesday

Re: CPU LOAD GETTING HIGH (0.90.3)

2014-01-06 Thread Otis Gospodnetic
atency increase with the Query count (see attached graphs). > This is all expected. The only thing that concerns me the most is that > Query response time is starting to slow down significantly (~200 ms) and > the Load is going above the number of cores (6) during peak traffic... > >

Re: which web ui is mostly used for ElasticSearch?

2014-01-06 Thread Otis Gospodnetic
Hi, For user-facing applications people tend to develop their own. For logs people tend to use Kibana directly or via services like http://sematext.com/logsene For backend operations people tend to use tools like elasticsearch-head, kopf, and such. Otis -- Performance Monitoring * Log Analytics

Re: Moving from Sphinx to ES

2014-01-03 Thread Otis Gospodnetic
Hi Brian, We helped a client successfully migrate from Sphinx to Solr last year. Migration to ES should be similar to what we did for them for Solr. Inline... On Friday, January 3, 2014 12:08:55 PM UTC-5, Brian Lovett wrote: > > Hi everyone, we're considering moving from Sphinx to Elastic Sear

Re: Cluster not able to keep up?

2014-01-03 Thread Otis Gospodnetic
Hi, I bet it's Lucene segment merges. You have more machines so you can sustain high input rates longer, but I bet you'll hit the moment when the indexing rate drops again. Check this graph: https://apps.sematext.com/spm-reports/s/eUgWhPqZrg (just look at the last big "tooth") Or instead of lo

Re: CPU LOAD GETTING HIGH (0.90.3)

2014-01-03 Thread Otis Gospodnetic
Hi Greg, The CPU usage is high? Can you share some graphs that show trends? Is the CPU wait time high by some chance? user? system? Can you correlate CPU usage with disk IO or GC? You can easily look at this sort of stuff with SPM for ES and send any graphs you want directly to this list, s

Re: Reports and Notifications.

2013-12-30 Thread Otis Gospodnetic
asticsearch Support * http://sematext.com/ On Monday, December 30, 2013 1:05:44 PM UTC-5, CP wrote: > > Is there a way to install Logsene on premises, looking for something > opensource (no budget at this time)? > > > On Thu, Dec 26, 2013 at 11:42 AM, Otis Gospodnetic >

Re: Shard relocation progress

2013-12-29 Thread Otis Gospodnetic
Hi Mohit, On Thursday, December 19, 2013 12:38:36 PM UTC-5, Mo wrote: > > Thanks! that helps, I'll look at these settings. > > Regarding monitoring is there a way to look at the progress. > Of the migration state/%/ETA for a single shard? I don't think there's anything that will show you that

Field updates (via LUCENE-5189?)

2013-12-27 Thread Otis Gospodnetic
Hi, Does ES offer support for field updates (vs. having to do a complete doc delete + add), say via https://issues.apache.org/jira/browse/LUCENE-5189 ? Thanks, Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ -- You received

Re: Reports and Notifications.

2013-12-26 Thread Otis Gospodnetic
; triggered/sent. We are looking for simuliar functionality. I will check > out what you sent. > On Dec 24, 2013 7:31 PM, "Otis Gospodnetic" > > > wrote: > >> Hi, >> >> Could you please describe what you mean by "reports"? Are you looking

Re: Several questions on ES in production environment

2013-12-24 Thread Otis Gospodnetic
Hi, On Tuesday, December 24, 2013 8:47:54 AM UTC-5, Han JU wrote: > > Hi, > > We're approaching the first release of our product and we use > ElasticSearch as a key component in our system. But there's still some > questions and doubts so I'd like to listen to the more experienced users > and E

Re: Compared to Solr (with Solr Cloud), what is the advantage(s) of Elasticsearch?

2013-12-24 Thread Otis Gospodnetic
Hi Daniel, Here is an unbiased 6-part series on this very topic: http://blog.sematext.com/2012/08/23/solr-vs-elasticsearch-part-1-overview/ Note that SolrCloud has improved a lot since then and ES also got a number of new features. Sometimes one's requirements and must-have features determi

Re: Reports and Notifications.

2013-12-24 Thread Otis Gospodnetic
Hi, Could you please describe what you mean by "reports"? Are you looking for daily/weekly email with graphs or something else? We have that in SPM (monitoring) and Logsene (log analytics) is getting it, too. Kibana has this as well via phantomjs, I believe, though I'm not sure how/if it's h