Sorry for the delayed response.
I am using version 1.3. I was able to change the field data circuit breaker limit; I changed it to 80%. This is a nice setting to know about.
But it doesn't work. Maybe heap size is my problem, but I have very limited heap space.
Thank you.
On Friday, September 5, 2014 2:19:25
Could you turn on debug?
See https://github.com/dadoonet/fsriver#debug-mode
Also, which versions are you using?
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
On 17 Sept 2014 at 01:49, Preeti Jain wrote:
Hi,
I'm using elasticsearch version 1.0.1 that is installed on l
Looks like a bug in Hive, which passes a null Progressable to the underlying
task. I would recommend upgrading Hive to 0.10 or later (and Hadoop to
1.2.1 while at it).
On Sep 16, 2014 8:27 PM, "ibmuser1" wrote:
> Hi, my hadoop version is 1.1.1 and hive version is 0.9.0 (biginsights
> installatio
Please upgrade to version 2.0.1
On 9/17/14 1:18 AM, Jinyuan Zhou wrote:
I have confirmed with both elasticsearch-hive and elasticsearch-mr: if both
of the situations below occur, EsOutputFormat
produces an invalid header for bulk indexing.
1. es.resource contains data to be extracted from the document
2.
Hi,
I'm using elasticsearch version 1.0.1 that is installed on linux machine.
I have created fs river to index content from file system. The river
definition is
curl -XPUT 'http://localhost:9200/_river/riverTest/_meta' -d '{
"type": "fs",
"fs": {
"url": " /data01/test/NewVehicleFiles1/",
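For comparison, a complete registration for a river of type fs generally takes this shape (a hedged sketch based on the fsriver project's conventions; the update_rate value is illustrative, and the path is the one from the snippet without the leading space):

```shell
# Register an fs river that polls the directory every 15 minutes (900000 ms)
curl -XPUT 'http://localhost:9200/_river/riverTest/_meta' -d '{
  "type": "fs",
  "fs": {
    "url": "/data01/test/NewVehicleFiles1/",
    "update_rate": 900000
  }
}'
```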
I have observed that Elasticsearch defaults the search thread pool to 3 ×
the number of CPUs, and even if you increase this to a fixed number it does
not really help, as the threads start sharing the CPU cycles.
Does this mean that to get the same performance results for more concurrent
searches
I either have to s
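For reference, in the 1.x line the search pool can be pinned in elasticsearch.yml (a sketch; the numbers are placeholders, not recommendations):

```yaml
# elasticsearch.yml -- fixed-size search thread pool (1.x settings syntax)
threadpool.search.type: fixed
threadpool.search.size: 24
threadpool.search.queue_size: 1000
```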
I have confirmed with both elasticsearch-hive and elasticsearch-mr: if both
of the situations below occur, EsOutputFormat produces an invalid header for
bulk indexing.
1. es.resource contains data to be extracted from the document
2. es.mapping.id is set to one of the fields in the document
I looked at the code
I have a query with a nested boolean (boolean within a boolean) filter with
a should clause that performs really terribly. But if I move the nested
query up to top level, it performs as much as 50x faster. I am struggling
to understand why this is the case. Here are the 2 forms:
https://gist.g
I have logstash indices that go back thirty days. I have logs in those
indices from today.
If I do a search with:
"size": 500,
"sort": [
  {
    "@timestamp": {
      "order": "desc",
      "ignore_unmapped": true
    }
  }
]
I don't get any logs from today. If I limit the s
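For context, the snippet above fits into a full request body shaped like this (a sketch; match_all stands in for whatever query was actually used):

```json
{
  "size": 500,
  "query": { "match_all": {} },
  "sort": [
    { "@timestamp": { "order": "desc", "ignore_unmapped": true } }
  ]
}
```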
Hi Matt,
Thanks for your quick response. However, neither worked for us. In our case,
we set shard_size to 50K (option 1), and it is still missing documents. The
cluster became unstable if we try to further increase it. We cannot use
the shard_min_doc_count value, because even if it is one hit, its value u
Hi Yifan,
Nothing dynamic, but you can increase the number of terms collected on each
shard to increase the accuracy [1]. Might also want to play with the
shard_min_doc_count value if you know certain shards have a low hit count
and are throwing off the aggregations [2].
[1]
http://www.elasticse
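For reference, both knobs sit directly on the terms aggregation (a sketch; the field name and numbers are placeholders):

```json
{
  "aggs": {
    "top_categories": {
      "terms": {
        "field": "category",
        "size": 10,
        "shard_size": 100,
        "shard_min_doc_count": 1
      }
    }
  }
}
```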
Thanks Mark. Thought this was that list. :-)
On Tuesday, September 16, 2014 3:08:45 PM UTC-5, Mark Walkom wrote:
>
> You should ask this over on the logstash list -
> https://groups.google.com/forum/?hl=en-GB#!forum/logstash-users :)
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Camp
You should ask this over on the logstash list -
https://groups.google.com/forum/?hl=en-GB#!forum/logstash-users :)
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 17 September 2014 06:04, Marty Hillman wrote:
> I ev
I even bought the book and rebuilt my test environment servers from
scratch, but I still have the same issues.
On the central server, I have redis, logstash 1.4 and elasticsearch 1.3
installed - all from apt repositories. I verified that all services are
started and I can curl results from the
By default, ES uses a discovery method that allows any node with a given
cluster name to join an existing node with the same cluster name, thereby
forming one cluster.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-zen.html
and you want to look at unicast di
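A unicast setup along those lines might look like this in elasticsearch.yml (a sketch; the hostnames are placeholders):

```yaml
# Disable multicast and list seed nodes explicitly (zen discovery, 1.x)
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["es-node1.example.com", "es-node2.example.com"]
```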
It seems to be a common problem that the top N results returned from an
aggregation query are inaccurate due to the uneven distribution of matching
documents on different shards, because ES will collect the top N buckets from
each shard no matter how many hits are actually on each shard. It is very
ofte
I'm trying to track down an issue where 2 simple documents I'm testing are
being ranked quite a bit differently.
For testing purposes, I'm only searching against one field, "keywords". The
only word in that field for both documents is "jefferson". However, when I
search for the word "jefferson"
Ah, I just found the n/2+1 recommendation, so I expect I need to set it to
3.
On Tuesday, September 16, 2014 11:30:38 AM UTC-7, Tim Heikell wrote:
>
> Thanks for the reply Jörg. I have discovery.zen.minimum_master_nodes=2.
> Should it be something different?
>
> On Tuesday, September 16, 2014 11
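For a cluster where all 4 nodes are master-eligible, the quorum recommendation works out to floor(4/2) + 1 = 3, which would be set like this (a sketch; assumes all 4 nodes can become master):

```yaml
# elasticsearch.yml -- require a quorum of master-eligible nodes for election
discovery.zen.minimum_master_nodes: 3
```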
Thanks for the reply Jörg. I have discovery.zen.minimum_master_nodes=2.
Should it be something different?
On Tuesday, September 16, 2014 11:21:16 AM UTC-7, Jörg Prante wrote:
>
> It looks like you did not configure minimum_master_nodes
>
> Jörg
>
> On Tue, Sep 16, 2014 at 8:00 PM, Tim Heikell >
It looks like you did not configure minimum_master_nodes
Jörg
On Tue, Sep 16, 2014 at 8:00 PM, Tim Heikell
wrote:
> We are prepping to launch our app into production and seem to be having
> some stability issues. We have a cluster of 4 VMs on Azure that all use the
> Azure plugin for discovery.
We are prepping to launch our app into production and seem to be having
some stability issues. We have a cluster of 4 VMs on Azure that all use the
Azure plugin for discovery. Most of the time it works as expected, but
sometimes it loses its mind. This morning, for example, I made adjustments
t
Thank you! I got that working.
On Tuesday, September 16, 2014 7:25:39 AM UTC-7, pawansharma2045 wrote:
>
> So you need to restart that node.
>
> On Tue, Sep 16, 2014 at 12:46 AM, shriyansh jain > wrote:
>
>> Hi,
>>
>> I am getting the following error in the elasticsearch log file. I have a
>> c
I am working with analytics of events. I use Hadoop to process the logs and
store some results in MySQL. This does not work now due to scalability
issues, as logs keep coming daily.
We need to show stats per year, month, week, day, and hour, along with
filtering capability.
Our samples can grow for
Hi, my hadoop version is 1.1.1 and hive version is 0.9.0 (biginsights
installation). I am trying to push data from existing hive table(s) into
elasticsearch. My job fails with the following error. I copied hive
script as well below the error. Not sure what I am doing wrong. Can you
help?
Oh. Sorry :-)
On Mon, Sep 15, 2014 at 3:27 AM, Mark Walkom
wrote:
> You probably want to put this in your own thread :)
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com
> web: www.campaignmonitor.com
>
> On 15 September 2014 06:55, SAUR
Hello,
I have the exact same issue. I wonder how to get full strings instead of
their stems, which is not what I expect from a "suggest" query. I don't have
any solution yet.
A solution was found here
http://stackoverflow.com/questions/22071198/adding-mapping-to-a-type-from-java-how-do-i-do-it
On Mon, Sep 15, 2014 at 4:16 PM, Jack Park wrote:
> I got this on 1.2.2 and found on the web that it was a bug. So, I upgraded
> to 1.3.2 and got the same bug.
>
> There was a
Hadn't looked at Jackson for a while, but it seems to do both XML and CSV
(limited to JSON that represents tabular data).
On Tuesday, 16 September 2014 10:48:58 UTC-4, John Smith wrote:
>
> Yep, already doing that part actually...
>
> Was just wondering I guess the best way to deserialize from json
Hi guys,
I have objects with 3 fields of type array containing a large amount of
integers.
These integers are mutually exclusive between fields: if an integer is in
field1, it can't be in field2 or field3, and vice versa.
For instance:
object_1: {
field1: [1,4,5,8],
field2: [2,6,7
When it comes to JSON, Jackson should be at the top of your list. It's an excellent library and it has plenty of support
for XML [1]
[1] https://github.com/FasterXML/jackson-dataformat-xml
On 9/16/14 5:48 PM, John Smith wrote:
Yep, already doing that part actually...
Was just wondering I gue
Yep, already doing that part actually...
Was just wondering, I guess, the best way to deserialize from JSON to XML,
for instance.
I suppose it's slightly off topic, but what are some good JSON to XML
converters?
On Tuesday, 16 September 2014 10:23:05 UTC-4, David Pilato wrote:
>
> You need to use
Hi,
(let me know if this is not the right place to post ElasticSearch.Net
questions).
I'm indexing a document of type "User" through ElasticSearch.Net with this
command (key is a string guid):
client.Index(index, "user", key, user);
This invokes the serializer and stores the json in my ES clu
So you need to restart that node.
On Tue, Sep 16, 2014 at 12:46 AM, shriyansh jain
wrote:
> Hi,
>
> I am getting the following error in the elasticsearch log file. I have a
> cluster of 2 elasticsearch nodes, and have a setup of ELK stack with redis
> as a buffer. Everything was running fine but
Sorry for bumping this, but I'm a little stumped here.
We have some nodes that are evicting fielddata cache entries for seemingly
no reason:
1) we've set indices.fielddata.cache.size to 10gb
2) the metrics from the node stats endpoint show that the
indices.fielddata.memory_size_in_bytes never ex
You need to use the scan and scroll API for that.
See
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-scroll.html#scroll-scan
This class could help you in Java:
https://github.com/elasticsearch/elasticsearch/blob/master/src/test/java/org/elasticsearch/search
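The REST flow behind scan and scroll looks roughly like this (a hedged sketch for 1.x; the index name and sizes are placeholders, and the scroll_id must be copied out of each response):

```shell
# 1) Open a scan: the first response returns a scroll_id but no hits (1.x)
curl -XGET 'localhost:9200/myindex/_search?search_type=scan&scroll=1m' -d '{
  "query": { "match_all": {} },
  "size": 100
}'

# 2) Replay the scroll_id repeatedly until a response comes back with no hits
curl -XGET 'localhost:9200/_search/scroll?scroll=1m' -d 'SCROLL_ID_FROM_PREVIOUS_RESPONSE'
```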
We are also facing this kind of issue in ES version 1.1.1.
Some node gets disconnected, and while analyzing the logs on that
disconnected node we got a lot of connection timeout errors. Sometimes
this issue gets solved by restarting the master node, but sometimes we may
need to restart the whole c
On GitHub under issues [1] or in the release notes for the 2.0.1 release. Most
likely, you are facing issue #210.
[1]
https://github.com/elasticsearch/elasticsearch-hadoop/issues?q=is%3Aissue+label%3Av2.0.1+is%3Aclosed
On 9/16/14 4:52 PM, Joe,Yu wrote:
On Tue, Sep 16, 2014 at 7:01 PM, Costin
Also it has to be done on the back end so JAVA it is...
On Tuesday, 16 September 2014 10:04:44 UTC-4, John Smith wrote:
>
> Hi, building some sort of internal tool to export data from Elasticsearch
> and I would like to offer CSV or XML.
>
> Just wondering what options there are...
>
>
> Bassical
Hi, building some sort of internal tool to export data from Elasticsearch,
and I would like to offer CSV or XML.
Just wondering what options there are...
Basically, a user can log in to a front end (no, I cannot use what is out
there; it's only a small portion of a larger tool within the organiza
Hi,
I'm a newbie with Elasticsearch. I'm validating Elasticsearch regarding our
needs.
Let's say I want to monitor the disk usage of my VMs.
- vm1 and vm2 are in Platform PF_A, vm3 is in platform PF_B
The mapping I declared (can be pasted in sense)
PUT /example_201408/vm/_mapping
{
"_timestamp
On Tue, Sep 16, 2014 at 7:01 PM, Costin Leau wrote:
> Hi,
>
> Upgrade to es-hadoop 2.0.1.
> The error is caused by the fact that you have nodes within the ES cluster
> without a HTTP/REST point. These are now properly excluded though note, it
> means they will not be used by es-hadoop.
> As an a
So I have 1 ELK server set up and working just fine; its IP is 172.16.40.28. We
wanted to build a second one to log different servers and for several
reasons keep the data separate. So I built the new server and set up ELK
again; all seems fine. The IP of the new server is 172.16.40.29. When I go
to t
I have the following scenario:
SHOP1 sells: apple laptop, apple ipad, apple phone
SHOP2 sells: apple laptop
SHOP3 sells: HP laptop
I want to generate keywords for what each shop sells, such that "apple ipad"
or "ipad apple" should show only SHOP1, not SHOP2.
How can I generate searchable keyword by
Hi List,
From the looks of it, everything is possible, but I still have some
questions. My application consists of events being upserted that expire
after 30 seconds, and doing aggregations on those. I always filter on
user_id, which is also the routing value.
event_fields =
{"user_id","timestamp
Hello!
I'm trying to create a query that would return the last (sorted by
timestamp) 10 hits. I'm using logstash to parse and index my log files...
I tried 2 different queries:
{
"query" : {
"filtered" : {
"query": {"match" : {"user" : "abc"}},
"query":
I have two types stored in an index: locations and activities. An activity
has a 'relation' to a location, i.e. an activity takes place at a location.
Is it possible to get a location search result set that includes the count
of activities at each location? Sort of like annotating each locatio
You need to wrap the has_parent query in the query part of the filtered
query:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-filtered-query.html#query-dsl-filtered-query
I don't see how this query could have worked in 0.90.5, since the format is
incorrect, but if
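Concretely, the wrapped form has this shape (a sketch; the parent type, fields, and values are placeholders):

```json
{
  "query": {
    "filtered": {
      "query": {
        "has_parent": {
          "parent_type": "blog",
          "query": { "term": { "tag": "elasticsearch" } }
        }
      },
      "filter": { "term": { "status": "published" } }
    }
  }
}
```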
Any suggestions?
On Monday, 15 September 2014 23:31:26 UTC+5:30, Roopendra Vishwakarma wrote:
>
> In Elasticsearch 1.3.0, *filtered* is not working with *has_parent*. In
> *Elasticsearch
> 0.90.5* it works fine.
>
> I am using the query below. In this query I need to add filtered inside
> *has_parent-
Hi,
Upgrade to es-hadoop 2.0.1.
The error is caused by the fact that you have nodes within the ES cluster without an HTTP/REST endpoint. These are now
properly excluded; note, though, that this means they will not be used by es-hadoop.
As an alternative, consider enabling HTTP on all your data nodes.
On 9
Hello list,
I have a 4-node ES cluster and a 6-node CDH cluster running in the lab.
The Hive job is as below:
hive job===
CREATE TABLE logs (type STRING, time STRING, ext STRING, ip STRING, req
STRING, res INT, bytes INT, phpmem INT, agent STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED B
Can you manually test all of that using telnet?
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 16 September 2014 20:09, Abhishek Aggarwal wrote:
> Thanks for the reply. I am facing this error intermittently. Transp
Thanks for the reply. I am facing this error intermittently. Transport
Client works fine sometimes - so it rules out firewall or port related
issues.
I have only one ES node (version 1.1.1)
- Firewall is not configured
- TCP and UDP on port 9300 are open
- sniff is disabled (I'm using default t
Maybe you use a network filter / firewall which is misconfigured - no
connection is possible, everything seems to time out.
You must open TCP and UDP on port 9300 on all the hosts of the cluster
nodes if you use TransportClient.
Also check if your network can operate regarding other nodes, if you
I am connecting to a single instance of an Elasticsearch server remotely via
the Transport client.
In my web application, which makes use of the Transport client, I am seeing
the following messages in the logs:
I have checked that my network connection is proper and the ES server is up.
But I am still getting these messages i
If you want to use the filter parser plugin - I think you mean
https://github.com/lmenezes/elasticsearch-terms-fetch-filter-plugin - then
why don't you simply extend the plugin and build a new plugin from that
codebase?
From what I understand, you somehow want to modify the search action core
c
Hello,
I've picked up a great little utility called wirbelsturm
(https://github.com/miguno/wirbelsturm). With it I've managed to automate
the creation of Vagrant backed VMs for a large chunk of my infrastructure
without much pain. Then I got to elasticsearch. I've tried a few variations
of the
Just saw that the query profiler cannot show what the shard execution
times are, so maybe this is not a big help.
Jörg
On Tue, Sep 16, 2014 at 9:24 AM, joergpra...@gmail.com <
joergpra...@gmail.com> wrote:
> If you are sure the spikes are caused by the JVM, I recommend to attach a
> profiler to
If you are sure the spikes are caused by the JVM, I recommend to attach a
profiler to the JVM, then you can monitor the code.
On JVM level, it is hard to trace queries, so maybe you want to test out
bleeding edge? Here is a query profiler:
https://github.com/elasticsearch/elasticsearch/pull/6699
You cannot bind the same port to 2 IPs.
This should work:
network.host: 192.168.1.213
See details at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-network.html#modules-network
HTH
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticse
You could just check for the response code 500, and you're done, no need to
capture streams.
Jörg
On Tue, Sep 16, 2014 at 12:53 AM, Alex Roytman wrote:
> I guess I could but it would mean passing a response wrapper to capture
> output stream and then copy it to real request or discard it in cas
Hi,
I have 2 nodes with each 2 network interfaces.
One of the networks is public and the other is private.
I want to use elasticsearch only on the private network and for convenience
also on the loopback devices.
I have tried multiple ways in the yml file:
network.bind_host: [ "192.168.1.213" ,