concerns on possible load of aggregation

2015-02-25 Thread Seungjin Lee
We are running a PaaS built with Elasticsearch, and we want to provide a multi-column count aggregation feature through ES aggregations. Let's take the request below as an example: POST /INDEX_PATTREN-*/_search { query:{match:{project:dummyProject}}, size:0, aggs: { col1: { terms: {

questions regarding elasticsearch-spark

2015-01-15 Thread Seungjin Lee
Hi all, I'm quite familiar with Elasticsearch but new to Spark and elasticsearch-spark. My idea at the moment is that by using Spark together with Elasticsearch, it might be possible to increase search performance when the time interval is fixed. The question is: does Hadoop need to be set up first to

how to use regexp within phrase querystring?

2014-10-19 Thread Seungjin Lee
Let's assume we store data somewhat like the following (in the body field): dataSize : 32, dataSize : 45, dataSize : 567. In a query string, body:dataSize 567 works, and body:dataSize AND body:/[0-9]{3}/ works. Now we want to know how to combine those two; body:dataSize /[0-9]{3}/ never works. what

scaling out percolator performance?

2014-07-20 Thread Seungjin Lee
Hi all, for testing, I have one index with only 1 shard and 4 replicas. In the .percolator type of that index, there are 1.5k queries to be percolated. There are 5 modern machines in total, each with 48G of RAM, with 12G assigned to Elasticsearch on each node. What I'm seeing now is, it rarely scales out in performance

too many open files

2014-07-17 Thread Seungjin Lee
Hello, I'm using Elasticsearch with Storm and the Java TransportClient. I have a total of 128 threads across machines which communicate with the Elasticsearch cluster. From time to time, the error below occurs: org.elasticsearch.common.netty.channel.ChannelException: Failed to create a selector. at

Re: percolator throughput decreases as time passes

2014-07-17 Thread Seungjin Lee
Not really, the number of queries was the same throughout the process lifecycle. 2014-07-16 19:04 GMT+09:00 Martijn v Groningen martijn.v.gronin...@gmail.com: Do the amount of registered percolate queries also increase? On 15 July 2014 12:02, Seungjin Lee sweetest0...@gmail.com wrote: hi all

percolator throughput decreases as time passes

2014-07-15 Thread Seungjin Lee
Hi all, we use Elasticsearch with Storm, continuously making percolation requests. As you see above, percolator throughput decreases as time passes, but we are not seeing any other problematic statistics, except that CPU usage also decreases as throughput decreases. Can you guess any reason

percolator throughput is not stable

2014-07-14 Thread Seungjin Lee
This is the statistic I'm seeing from Marvel in my dev cluster. I started a new process around 11:00 and, as you see, throughput fluctuates and eventually gets stuck at less than 10k/s, which is worse than the throughput in its early phase. It seems very weird to me. I looked up resource usage (cpu, mem)

Re: increase query performance by adding more machines, shouldn't it be linear to # of machines?

2014-07-02 Thread Seungjin Lee
Yes, all the same machines, on which only ES is running with the same configuration. 2014-07-02 14:55 GMT+09:00 David Pilato da...@pilato.fr: Are you using same physical machine for all your VMs? -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs Le 2 juil. 2014 à 07:09, Seungjin

increase query performance by adding more machines, shouldn't it be linear to # of machines?

2014-07-01 Thread Seungjin Lee
Hi all, I'm testing percolator performance; 50k/s is the required throughput with 3~4k rules. For now I only have 1 simple rule, and 5 ES VMs with 1 shard and 4 replicas, and I'm using the Java transport client like below: new TransportClient(settings)

Response time of Java percolate API is unstable

2014-06-26 Thread Seungjin Lee
Hi, we're now in performance testing and seeing some unexpected results. We use the Java percolate API: client.preparePercolate().setIndices(index).setDocumentType(projectName).setSource(log).execute().actionGet(); LOGGER.info(duration+ms for percolation, es time +response.getTookInMillis()+ ms for log

getting no node available exception when there are many requests

2014-06-18 Thread Seungjin Lee
Hello all, I'm using Storm with the Elasticsearch percolator, and normally it works as expected. But when there are too many simultaneous requests, some of them succeed and some fail with a no node available exception. I think there should be a configuration about maximum requests per second or