Disk space usage categorization based field

2014-07-30 Thread pmanvi
Is it technically feasible to drill down to the disk usage incurred by each field? http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-stats.html provides the storage cost at the index level ('store'). I would like to know the storage cost incurred by each of the

Update a field if _source is disabled

2014-07-30 Thread 'Sandeep Ramesh Khanzode' via elasticsearch
Hi, I read it here (http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-update.html) that the _source field needs to be enabled for Update API to work. Does it mean that from Java or REST API, I cannot update any field defined in the type mapping unless the _source is

Re: The number of types a index can handle

2014-07-30 Thread joergpra...@gmail.com
There is no limit in ES. Each type uses a certain amount of heap for caching ids and the mapping. You can create types / mappings until heap explodes. Each modification of a mapping is propagated through the cluster, which is not a cheap operation. You have to test by yourself if your design

Re: Update a field if _source is disabled

2014-07-30 Thread David Pilato
No, you can't: behind the scenes, the full document is removed and reinserted with the new values (new version). -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 30 Jul 2014, at 08:52, 'Sandeep Ramesh Khanzode' via elasticsearch elasticsearch@googlegroups.com wrote: Hi, I
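
To illustrate the point above, here is a toy model of why an update needs the stored _source. It is not Elasticsearch code: the function name simulate_update and the dict layout are invented for this sketch, which only assumes that an update means "fetch the old source, merge the change, reindex as a new version".

```python
# Toy model only: a plain dict stands in for one stored document.
def simulate_update(stored, partial):
    """Return the re-indexed document, or raise if the original source is unavailable."""
    if stored.get("_source") is None:
        # Without the stored source there is nothing to merge the change into.
        raise ValueError("cannot update: _source is not available")
    merged = dict(stored["_source"])
    merged.update(partial)
    # The old document is replaced wholesale and the version is incremented.
    return {"_source": merged, "_version": stored["_version"] + 1}


doc = {"_source": {"title": "hello", "views": 1}, "_version": 1}
updated = simulate_update(doc, {"views": 2})
```

With _source disabled, the merge step has no original document to start from, which is the situation the reply describes.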

Re: Status red on Marvel overview raising shardFailures

2014-07-30 Thread Idan
Hi, I wonder if anyone has any clue about this? Maybe additional logs are needed to nail this one? Thanks. On Tuesday, July 29, 2014 10:14:42 AM UTC+3, Idan wrote: I have status red on marvel dashboard. If I check the 'Shared allocation' tab on the overview I see this error: Oops!

changing number of shards for new indices

2014-07-30 Thread Kingdom Joy
Hello, Right now I have two nodes in my ES (part of ELK stack) cluster and 1 shard for each index. I would like to change number of shards to two for future indices. Can I do this by changing config file and restarting logstash? Will it change number of shards for indices created after

Re: changing number of shards for new indices

2014-07-30 Thread Kingdom Joy
Typo in my previous message, here's corrected post: Hello, Right now I have two nodes in my ES (part of ELK stack) cluster and 1 shard for each index. I would like to change number of shards to two for future indices. Can I do this by changing config file and restarting ES? Will it change

Re: changing number of shards for new indices

2014-07-30 Thread Mark Walkom
It doesn't change existing indexes, only new ones. You can make the setting change either via the API or in the config; if you choose the latter you will need a restart. See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-create-index.html Regards, Mark Walkom
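
As a rough sketch of the API route mentioned above, the snippet below builds the JSON body for a create-index request that asks for two shards. The index name my_new_index and the helper create_index_body are placeholders for this example; the config-file route would instead set index.number_of_shards in elasticsearch.yml and restart the node.

```python
import json


def create_index_body(shards, replicas=1):
    """Settings body for a create-index request (PUT /<index>)."""
    return {"settings": {"number_of_shards": shards, "number_of_replicas": replicas}}


body = json.dumps(create_index_body(2))
# Hypothetical usage against a local node:
#   curl -XPUT 'http://localhost:9200/my_new_index' -d '<body>'
print(body)
```

Either way, only indices created afterwards pick up the new shard count; existing indices keep theirs.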

Re: sum-aggregation script doesn't allow negative values?

2014-07-30 Thread Colin Goodheart-Smithe
Would you be able to re-run your query and post the stack trace from the Elasticsearch server logs? This might help to work out what's going on. Thanks Colin On Tuesday, 29 July 2014 12:29:00 UTC+1, Valentin wrote: Ok. I think I found the problem. As soon as I try to sort on the script

Re: sum-aggregation script doesn't allow negative values?

2014-07-30 Thread Colin Goodheart-Smithe
Also, your shard_size parameter should always be greater than the size parameter. So if you are asking for size of 10 then I would try setting shard_size to 20 or 30. On Wednesday, 30 July 2014 09:22:16 UTC+1, Colin Goodheart-Smithe wrote: Would you be able to re-run your query and post the
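
A minimal sketch of that advice, assuming a terms aggregation: the helper below refuses a shard_size that is not larger than size. The aggregation name top_terms and the field name category are made up for the example; the original query is not shown in this thread.

```python
def terms_agg(field, size, shard_size):
    """Build a terms aggregation body, insisting that shard_size exceeds size."""
    if shard_size <= size:
        raise ValueError("shard_size should be greater than size")
    return {
        "aggs": {
            "top_terms": {
                "terms": {"field": field, "size": size, "shard_size": shard_size}
            }
        }
    }


# size of 10 with shard_size of 30, as suggested in the message above
request = terms_agg("category", size=10, shard_size=30)
```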

Re: slow filter execution

2014-07-30 Thread Kireet Reddy
For my test case it's the same every time. In the real query it will change every time, but I planned to not cache this filter and have a less granular date filter in the bool filter that would be cached. However while debugging I noticed slowness with the date range filters even while testing

Elasticsearch always uses the default mapping

2014-07-30 Thread Amirah
Hello, I am trying to create an index using the CSV River Plugin for ElasticSearch https://github.com/AgileWorksOrg/elasticsearch-river-csv. My CSV file contains *String*, *long* and *date* values. My problem is: - ElasticSearch always considers *long* values as *string* (with the default mapping)

Exception when using ES 1.3.1 Caused by: java.lang.IncompatibleClassChangeError: Implementing class

2014-07-30 Thread gregorymaertens via elasticsearch
Hello, I have a project using Play framework version 1.2.7 which used ES 1.1.1. I wanted to update it to the latest and greatest (1.3.1), but encountered the following exception when running the unit tests within the play framework: An unexpected error occured caused by exception

Re: [java api] Trying to use groovy instead of mvel

2014-07-30 Thread Laurent T.
Hi David, I tried, as you suggested, to activate dynamic scripting and to force groovy as a default_lang but the results stay unchanged. And yeah, no other node on the cluster.. Here's the test's output logs: TestClient: Loading config files... TestClient: Creating local node... juil. 30,

Re: Elasticsearch always uses the default mapping

2014-07-30 Thread David Pilato
I think you are doing something wrong. If you defined a mapping it should not be overwritten by the CSV river as far as I know. --  David Pilato | Technical Advocate | Elasticsearch.com @dadoonet | @elasticsearchfr On 30 July 2014 at 10:31:07, Amirah (beldjilal...@gmail.com) wrote: Hello,

Re: slow filter execution

2014-07-30 Thread David Pilato
Maybe a stupid question: why did you put that filter inside a query and not within the same filter you have at the end? For my test case it's the same every time. In the real query it will change every time, but I planned to not cache this filter and have a less granular date filter in the

Re: Elasticsearch always uses the default mapping

2014-07-30 Thread Amirah
Thanks for the answer. I am creating and defining my mapping (and index) as follows: PUT /newindex/ PUT /newindex/_mapping { newindex : { properties: { MyStringValue: {type: string}, MyLongValue: {type: long}, MyDateValue:{type: date} } } }

Integration testing a native script

2014-07-30 Thread Nick T
Is there a way to have a native java script accessible in integration tests? In my integration tests I am creating a test node in the /tmp folder. I've tried copying the script to /tmp/plugins/scripts but that was quite hopeful and unfortunately does not work. Desperate for help. Thanks --

Re: Integration testing a native script

2014-07-30 Thread David Pilato
I might be wrong but I think that scripts should be located in config/scripts, right?  --  David Pilato | Technical Advocate | Elasticsearch.com @dadoonet | @elasticsearchfr On 30 July 2014 at 11:31:10, Nick T (nttod...@gmail.com) wrote: Is there a way to have a native java script

Re: [java api] Trying to use groovy instead of mvel

2014-07-30 Thread joergpra...@gmail.com
You should try to add the groovy jar to your classpath. It is not in the dependencies in Maven's pom.xml. Example: <dependency> <groupId>org.codehaus.groovy</groupId> <artifactId>groovy-all</artifactId> <version>2.3.5</version>

Re: Exception when using ES 1.3.1 Caused by: java.lang.IncompatibleClassChangeError: Implementing class

2014-07-30 Thread joergpra...@gmail.com
This is a dependency problem. Check your classpath if you have clean dependencies to ES 1.3.1 code only. Jörg On Wed, Jul 30, 2014 at 10:41 AM, gregorymaertens via elasticsearch elasticsearch@googlegroups.com wrote: Hello, I have a project using Play framework version 1.2.7 which used ES

Re: sum-aggregation script doesn't allow negative values?

2014-07-30 Thread Valentin
Hi Colin, I tried increasing it up to 40 but nothing changed. I would post the stack trace but I don't know how to find it. Thanks Valentin On Wednesday, July 30, 2014 10:24:09 AM UTC+2, Colin Goodheart-Smithe wrote: Also, your shard_size parameter should always be greater than the size

Re: [java api] Trying to use groovy instead of mvel

2014-07-30 Thread Laurent T.
Nice catch Jörg, that indeed did the trick. @David Shouldn't groovy be bundled in the ES jar if it's the new default ? Will it be provided by ES when i run on a live cluster ? Thanks! On Wednesday, July 30, 2014 11:41:23 AM UTC+2, Jörg Prante wrote: You should try to add groovy jar to your

Re: [java api] Trying to use groovy instead of mvel

2014-07-30 Thread joergpra...@gmail.com
The ES team decided to postpone groovy as default to Elasticsearch 1.4 version. In 1.3, mvel is still the default, so authors have some time to rewrite their scripts if they prefer to. So I think it is ok to not include groovy jar by default, and make this optional to those who want to switch

Cloud-aws version for 1.3.1 of elasticsearch

2014-07-30 Thread Thomas
Hi, I wanted to ask whether the version of cloud-aws plugin is 2.1.1 for elasticsearch 1.3.1, by looking at the github page: https://github.com/elasticsearch/elasticsearch-cloud-aws/tree/es-1.3 How come the plugin version for 1.3.1 of elasticsearch goes backwards? For elasticsearch 1.2.x the

Re: Integration testing a native script

2014-07-30 Thread Thomas
Hi, I have tried the same approach and it worked for me, meaning to copy the script I want to perform an integration test and run my IT. I do the following steps 1) Setup the required paths for elasticsearch final Settings settings = settingsBuilder()

Re: [java api] Trying to use groovy instead of mvel

2014-07-30 Thread David Pilato
Ha! Right!  Thanks Jörg! I forgot that I ran into the same issue recently. I should add more memory to my brain cluster :) --  David Pilato | Technical Advocate | Elasticsearch.com @dadoonet | @elasticsearchfr On 30 July 2014 at 12:08:58, joergpra...@gmail.com (joergpra...@gmail.com) wrote:

Re: Integration testing a native script

2014-07-30 Thread Thomas
I have noticed that you mention a native java script, so have you implemented it as a plugin? If so, try the following in your settings: final Settings settings = settingsBuilder() ... .put("plugin.types", YourPlugin.class.getName()) Thomas On Wednesday, 30

Re: Elasticsearch always uses the default mapping

2014-07-30 Thread David Pilato
This looks strange to me PUT /newindex/_mapping  {       newindex : {                      properties: {     MyStringValue: {type: string},     MyLongValue: {type: long},     MyDateValue:{type: date}   }       }   }       } What is your type name? --  David Pilato | Technical Advocate | 

Re: [java api] Trying to use groovy instead of mvel

2014-07-30 Thread Laurent T.
Ok well, anyway i think you may want to update the docs about this cause i think i won't be the only one facing this :) Thanks again to both of you. On Wednesday, July 30, 2014 12:30:09 PM UTC+2, David Pilato wrote: Ha! Right! Thanks Jörg! I forgot that I run the same issue recently. I

Re: slow filter execution

2014-07-30 Thread Clinton Gormley
Don't use the `and` filter - use the `bool` filter instead. They have different execution modes and the `bool` filter works best with bitset filters (but also knows how to handle non-bitset filters like geo etc). Just remove the `and`, `or` and `not` filters from your DSL vocabulary. Also, not
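
One way to picture the rewrite suggested above is a small helper that turns list-form `and` / `or` filters into the equivalent `bool` filter. This is only a sketch under assumptions: the helper name to_bool_filter and the field names status and created are invented here, and only the simple list form of `and` / `or` is handled.

```python
def to_bool_filter(f):
    """Rewrite list-form `and` / `or` filters as an equivalent `bool` filter."""
    if "and" in f:
        return {"bool": {"must": [to_bool_filter(c) for c in f["and"]]}}
    if "or" in f:
        return {"bool": {"should": [to_bool_filter(c) for c in f["or"]]}}
    return f  # leaf filters (term, range, ...) pass through unchanged


old_style = {
    "and": [
        {"term": {"status": "active"}},
        {"range": {"created": {"gte": "2014-07-01"}}},
    ]
}
new_style = to_bool_filter(old_style)
```

The resulting filter can be placed wherever the `and` filter used to sit, for example as the filter part of a filtered query.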

Re: Cosine Similarity ElasticSearch

2014-07-30 Thread peter SP
I am also interested in this question. I have found a fairly old code snippet [1] to calculate the cosine similarity in lucene, but I was wondering if elasticsearch provided an easier API to access this information. [1]

Re: Elasticsearch always uses the default mapping

2014-07-30 Thread Amirah
there is a missing part ( copy paste error) /_river/ So, yes i use this PUT /_river/newindex/_mapping { newindex : { properties: { MyStringValue: {type: string}, MyLongValue: {type: long}, MyDateValue:{type: date} } } } } to create the mapping, my

Re: Elasticsearch always uses the default mapping

2014-07-30 Thread David Pilato
That's the problem. A River creates documents in another index than _river. If I look at the river documentation, you can set it using: index : {         index : my_csv_data,         type : csv_type,         bulk_size : 100,         bulk_threshold : 10     } So basically, you need to define

log index creation API requests

2014-07-30 Thread bitsofinfo . g
Hi - any tips for how I should configure the logging.yml file to give me more verbose output, including source ip address if possible, to give more info when an index is created? -- You received this message because you are subscribed to the Google Groups elasticsearch group. To unsubscribe

Re: cluster.routing.allocation.enable behavior (sticky shard allocation not working as expected)

2014-07-30 Thread Andrew Davidoff
On Tuesday, July 29, 2014 3:27:13 PM UTC-4, Ivan Brusic wrote: Have you changed your gateway settings? http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-gateway.html#recover-after It still remains a bit of black magic to me. Sometimes it works, sometimes it

Re: bulk indexing - optimal refresh_interval

2014-07-30 Thread shikhar
Thanks for the explanation! I'll switch over for the next time I need to reindex. On Tue, Jul 29, 2014 at 6:35 PM, Michael McCandless m...@elasticsearch.com wrote: Disabling refresh (-1) is a good choice if you are fully maximizing your cluster's CPU/IO resources (using enough bulk client
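
For readers following along, a sketch of the two settings bodies involved: one that disables refresh for the duration of a bulk load and one that puts it back afterwards. The helper name refresh_settings is invented for this example, and restoring to "1s" assumes the index was using the default interval before.

```python
import json


def refresh_settings(interval):
    """Body for an update-settings request (PUT /<index>/_settings)."""
    return {"index": {"refresh_interval": interval}}


disable_refresh = json.dumps(refresh_settings("-1"))  # send before the bulk load
restore_refresh = json.dumps(refresh_settings("1s"))  # send once indexing is done
print(disable_refresh)
print(restore_refresh)
```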

Re: Geo distance filter exceptions

2014-07-30 Thread Madhavan Ramachandran
Thanks for your response. I am using Nest dll (.Net) to index the data in ES (in windows as a service). How to add the geopoint to my index columns? Regards Madhavan.TR On Tuesday, July 29, 2014 3:35:56 PM UTC-5, David Pilato wrote: No you can't out of the box. If you want to use built

Re: Elasticsearch always uses the default mapping

2014-07-30 Thread Amirah
I don't really see the problem, I selected my newindex (it exists in my mapping with my types) PUT /newindex/ PUT /_river/newindex/_mapping { newindex : { properties: { marques: {type: string}, ventes: {type: long}, mois:{type: date} } } } PUT

ORA-01882: timezone region not found

2014-07-30 Thread George DRAGU
Hello, Is there any possibility to pass a parameter value to the java command line behind the JDBC River? I am thinking of -Duser.timezone=Europe/Istanbul, for example. When I try to create a JDBC River for an Oracle database (with the jprante plugin) I get this error. Thanks

Re: Elasticsearch always uses the default mapping

2014-07-30 Thread David Pilato
You applied a mapping to index _river and type newindex. This is not what I said. You need to apply your mapping to newindex index and newindex type. Basically something like: PUT /newindex/ PUT /newindex/newindex/_mapping {         newindex : {    properties: {    marques: {type: string},    
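
As a sketch of the corrected request, the helper below builds the path and a properly quoted JSON body for a put-mapping call on index newindex and type newindex, using the field names from the earlier message (marques, ventes, mois). The helper name put_mapping_request is invented for this example.

```python
import json


def put_mapping_request(index, doc_type, properties):
    """Return (path, body) for a put-mapping request on the given index and type."""
    path = "/{0}/{1}/_mapping".format(index, doc_type)
    body = {doc_type: {"properties": properties}}
    return path, json.dumps(body)


path, body = put_mapping_request(
    "newindex",
    "newindex",
    {
        "marques": {"type": "string"},
        "ventes": {"type": "long"},
        "mois": {"type": "date"},
    },
)
# path is /newindex/newindex/_mapping, i.e. the target index, not _river
print(path)
print(body)
```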

Re: Elasticsearch always uses the default mapping

2014-07-30 Thread Amira BELDJILALI
ah, yes, i didn't specify the type, thank you so much for your help On 30 July 2014 16:03, David Pilato da...@pilato.fr wrote: You applied a mapping to index _river and type newindex. This is not what I said. You need to apply your mapping to newindex index and newindex type. Basically

Re: Elasticsearch always uses the default mapping

2014-07-30 Thread Amirah
ah, yes, i didn't specify the type, thank you so much for your help On Wednesday, July 30, 2014 4:04:18 PM UTC+2, David Pilato wrote: You applied a mapping to index _river and type newindex. This is not what I said. You need to apply your mapping to newindex index and newindex type.

Re: Geo distance filter exceptions

2014-07-30 Thread Madhavan Ramachandran
Hi, I have updated the mapping for my index.. added a column with geopoint.. locationgeo_point When I search for data without the geo filter, I am able to see the location information. { _index: offlocations_geo, _type: officelocations, _id: 21,

Re: Logging of percolator reverse queries

2014-07-30 Thread Arkadiy Rudin
Just checking if anybody knows the answer... On Monday, July 28, 2014 4:14:59 PM UTC-4, Arkadiy Rudin wrote: Looks like the percolator queries are not getting recorded in any of existing slow query logs. Is it something that I am missing in configuration or logging for percolator is not

Re: Geo distance filter exceptions

2014-07-30 Thread Joffrey Hercule
Hi ! Use query. ex : { query : { filtered : { query : { match_all : {} }, filter : { geo_distance : { distance : 50km, city.location : { lat : 43.4, lon : 5.4
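
The example above, written out with its JSON quoting restored, might look like the sketch below. The helper name geo_distance_query is invented here; city.location is simply the field name used in the message and would need to be replaced by a field that is mapped as geo_point in your own index.

```python
import json


def geo_distance_query(field, lat, lon, distance):
    """Filtered query that keeps documents within `distance` of the given point."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        field: {"lat": lat, "lon": lon},
                    }
                },
            }
        }
    }


request = geo_distance_query("city.location", 43.4, 5.4, "50km")
print(json.dumps(request, indent=2))
```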

best way to parse deeply nested aggregations using client API

2014-07-30 Thread birjupat
I am using Java client API to get aggregations back. Following is the structure which I am dealing with. aggregations top_models buckets key : BMW doc_count : 3 top_models buckets key : X5 doc_count :

Recommendations needed for large ELK system design

2014-07-30 Thread Alex
Hello, We wish to set up an entire ELK system with the following features: - Input from Logstash shippers located on 400 Linux VMs. Only a handful of log sources on each VM. - Data retention for 30 days, which is roughly 2TB of data in indexed ES JSON form (not including replica

Java transport client, which hosts to add?

2014-07-30 Thread Andrew Gaydenko
The client has addTransportAddress(). So, I can add all cluster nodes. Is that the intended way? Or what considerations must be taken into account when adding hosts?

Re: slow filter execution

2014-07-30 Thread Kireet Reddy
Thanks for the detailed reply. I am a bit confused about and vs bool filter execution. I read this post http://www.elasticsearch.org/blog/all-about-elasticsearch-filter-bitsets/ on the elasticsearch blog. From that, I thought the bool filter would work by basically creating a bitset for the

Re: Java transport client, which hosts to add?

2014-07-30 Thread Ivan Brusic
You should add as many nodes as possible. If you enable client.transport.sniff, then the transport client will ask the nodes it does connect to about the other nodes in the cluster, which means you potentially only need to specify a single node (not ideal in case that node is down). -- Ivan

Re: Geo distance filter exceptions

2014-07-30 Thread Madhavan Ramachandran
Nope.. it did not work..got exception as QueryParsingException[[offlocations_geo] failed to find geo_point field [city.location Regards Madhavan.TR On Wednesday, July 30, 2014 10:08:45 AM UTC-5, Joffrey Hercule wrote: Hi ! Use query. ex : { query : { filtered : { query :

Re: [java api] Trying to use groovy instead of mvel

2014-07-30 Thread Laurent T.
Just FYI, if anyone else runs into the same troubles, Groovy seems to be provided on a real cluster and it's in version 2.3.2. On Wednesday, July 30, 2014 1:19:17 PM UTC+2, Laurent T. wrote: Ok well, anyway i think you may want to update the docs about this cause i think i won't be the only

Re: Java transport client, which hosts to add?

2014-07-30 Thread Andrew Gaydenko
On Wednesday, July 30, 2014 8:13:28 PM UTC+4, Ivan Brusic wrote: You should add as many nodes as possible. If you enable client.transport.sniff, then the transport client will ask the nodes it does connect to about the other nodes in the cluster, which means you can potentially only need to

How to use Curator to manage old data and avoid running out of storage space?

2014-07-30 Thread David Reagan
I've been implementing an ELK stack for the past year or so. I had thought that we would have plenty of space, but recently added a log source that increased the number of log entries a day by around 30x. That prompted me to start looking into ways of managing ES's data storage in order to keep

Re: log index creation API requests

2014-07-30 Thread Ivan Brusic
The logging.yml file only controls which logging statements get output, not the amount of information they may contain. The log line in question does not have the source IP, which is long gone by the time the service gets the request.

Logging every query

2014-07-30 Thread Alejandro de la Viña
I've got an environment set up on Dev that should keep a log of every query run, but it's not writing anything. I'm using the slow-log feature for it... These are my thresholds in the elasticsearch.yml: http://pastebin.com/raw.php?i=qfwnruhD And this is my whole logging.yml:

Re: cluster.routing.allocation.enable behavior (sticky shard allocation not working as expected)

2014-07-30 Thread Ivan Brusic
The idea is that the cluster should be delayed when a cluster rebalance occurs, but even with these settings, I often find that shards are moved immediately. Are you using the default stores throttling settings? I found them to be quite low. Cheers, Ivan On Wed, Jul 30, 2014 at 6:02 AM,

Re: ORA-01882: timezone region not found

2014-07-30 Thread joergpra...@gmail.com
This is ES related, but, what Oracle JDBC version is this and what Oracle Database Server version? Jörg On Wed, Jul 30, 2014 at 3:59 PM, George DRAGU george.gd.dr...@gmail.com wrote: Hello, Is it any possibility to specify a parameter value to java command line behind the JDBC River? I

Re: [java api] Trying to use groovy instead of mvel

2014-07-30 Thread David Pilato
It is! The issue you ran into is just a Java dependency issue. Clients don't need for example to have Groovy. That's the reason it's marked as optional dependency. Best. -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 30 Jul 2014, at 17:55, Laurent T.

Re: Elasticsearch always uses the default mapping

2014-07-30 Thread David Pilato
You did specify the type. But you sent the put mapping request to the wrong index. -- David ;-) Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs On 30 Jul 2014, at 16:08, Amira BELDJILALI beldjilal...@gmail.com wrote: ah, yes, i didn't specify the type, thank you so much for your

Failed to configure logging error on start up.

2014-07-30 Thread Peter Li
I have a setup with multiple servers. The file tree for each is like the following: /data/ configs/ elastic-1.yml logging-1.yml scripts/ (empty) elastic-core/ (from distribution) bin/... config/... lib/... logs/... elastic-1/ bin --

Re: ORA-01882: timezone region not found

2014-07-30 Thread joergpra...@gmail.com
Ups, I mean of course, this is *not* ES related ... Jörg On Wed, Jul 30, 2014 at 7:59 PM, joergpra...@gmail.com joergpra...@gmail.com wrote: This is ES related, but, what Oracle JDBC version is this and what Oracle Database Server version? Jörg On Wed, Jul 30, 2014 at 3:59 PM, George

Re: Failed to configure logging error on start up.

2014-07-30 Thread Peter Li
Did more experiments. If I use a real scripts directory instead of a symbolic link, there is no error message. But does this mean that I will have to drop the same script into all my servers' config/scripts directories? It would be nice to use symbolic links for this. Any suggestions? On

is there a query that can return a combined parent child

2014-07-30 Thread Stephen Ward
I've got my rivers working and my parent/child mapping done. I've written some has_child queries, but I'm a noob to ES. Is there any way to join the data, i.e. aggs and bucketing the children? If so, does anyone have an example?

Unique data by referencing 2 fields

2014-07-30 Thread Cameron Barker
Hi all, Would it be possible to get unique items from an elastic search database that reference 2 fields for uniqueness, all while using only elastic search or a plug in? *I.E.* *Initial Data:* { provider: tumblr text: I need to get this. } { provider: twitter text: I need to get this. } {

ip type and support for a port?

2014-07-30 Thread Chris Neal
Hi all, I'm trying to use the ip type in ES, but my IPs also have ports. That doesn't seem to be supported, which was a bit of a surprise! Does anyone know of a way to do this? Or does it sound like a good feature to add support for to this type? Thanks! Chris

Re: How to use Curator to manage old data and avoid running out of storage space?

2014-07-30 Thread Aaron Mildenstein
Hi David, Backing up indices to a repository is a great way to conserve space in your cluster. Curator provides a helper script called es_repo_mgr that will aid in creation of a repository. There is more information about snapshot creation here: modules-snapshots.html

Re: ip type and support for a port?

2014-07-30 Thread joergpra...@gmail.com
Can you give an example of what you mean by IP ports? Transport protocols like TCP have ports, but IP (Internet addresses) is used to address hosts on a network. Jörg On Wed, Jul 30, 2014 at 11:02 PM, Chris Neal chris.n...@derbysoft.net wrote: Hi all, I'm trying to use the ip type in ES, but

Re: How to know if my curator instance is running fine ?

2014-07-30 Thread Aaron Mildenstein
Sorry this never got responded to. Unless your indices are hourly, and in a format that curator recognizes, it will not delete anything. What are your index names, or your naming schema? --Aaron On Thursday, May 15, 2014 8:49:00 AM UTC-5, Guillaume boufflers wrote: Hello buds ! I've

Configuration Brain Wobbles

2014-07-30 Thread Christopher Ambler
I have a cluster with six nodes. The nodes are in different data centers, but I don't think that matters, as the connectivity is beefy and thick. I have turned multicast off and unicast on. Each node knows about all the others explicitly. When I bring up a visualization of the cluster using the

Using ES for gmail-like application

2014-07-30 Thread Maxim Kramarenko
I am going to use ES to implement gmail-like app, distributed over multiple cluster nodes. Questions are: 1) Messages will be compressed, so I need to store binaries in index, not plain text. Is it possible to store binary data with minimal overhead, without base64 encoding ? 2) I need to

Remote access through SSH

2014-07-30 Thread Chia-Eng Chang
About the HTTP API, I wonder what I should include in my HTTP REST command if I want to remotely access a cluster on an SSH server. Taking mapping as an example: curl -XGET ' http://localhost:9200/ index /_mapping/ type ' I tried something like the below but it failed: curl -XGET -u user_name:

Re: Remote access through SSH

2014-07-30 Thread Mark Walkom
You need to use SSH directly for it, curl won't work. ssh user@host -i ~/.ssh/id_rsa Assuming you have a public key on the server. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 31 July 2014 08:47, Chia-Eng

Re: Configuration Brain Wobbles

2014-07-30 Thread Mark Walkom
The standard response to this is that ES is not built for multi-DC clustering, but as long as you are aware of that then it's fine. Have you looked at http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-allocation.html ? Regards, Mark Walkom Infrastructure

Re: cluster.routing.allocation.enable behavior (sticky shard allocation not working as expected)

2014-07-30 Thread Mark Walkom
I've seen this as well Ivan, and have also had a few people on IRC comment on the same thing - shards that are local are not simply being initialised, but being reallocated elsewhere. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web:

more like this vs. mlt

2014-07-30 Thread Peter Li
I ran a query: curl -XGET "$url/ease/RadiologyResult/90642/_mlt?routing=07009409&mlt_fields=Observation.Value&min_term_freq=1&min_doc_freq=1&pretty" It worked and returned several documents. But if I ran this: curl -XGET "$url/ease/RadiologyResult/_search?routing=07009409&pretty" -d ' { query

Re: Remote access through SSH

2014-07-30 Thread Mark Walkom
You may want to look at http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search.html If you are just learning ES, then check out http://exploringelasticsearch.com/ Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web:

Memory Explosion: Heap Dump in less than one minute

2014-07-30 Thread Tom Wilson
Help! Elasticsearch was working fine, but now it's using up all its heap space in a matter of a few minutes. I uninstalled the river and am performing no queries. How do I diagnose the problem? 2-3 minutes after starting, it runs out of heap space, and I'm not sure how to find out why. Here

Re: Memory Explosion: Heap Dump in less than one minute

2014-07-30 Thread Mark Walkom
What java version? How much heap have you allocated and how much RAM on the server? Basically you have too much data for the heap size, so increasing it will help. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 31

Re: Memory Explosion: Heap Dump in less than one minute

2014-07-30 Thread Tom Wilson
JDK 1.7.0_51 It has 512MB of heap, which was enough -- I've been running it like that for the past few months, and I only have two indexes and around 300-400 documents. This is a development instance I'm running on my local machine. This only happened when I started it today. -tom On

Re: Memory Explosion: Heap Dump in less than one minute

2014-07-30 Thread Mark Walkom
Up that to 1GB and see if it starts. 512MB is pretty tiny, you're better off starting at 1-2GB if you can. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 31 July 2014 10:28, Tom Wilson twilson...@gmail.com wrote:

Re: Remote access through SSH

2014-07-30 Thread Chia-Eng Chang
Thank you for the links. Yeah, I am new to ES. (and http rest) What I understand is that if I want to get the index documents on my SSH server, I can SSH log in the server. And then rest http get from localhost:9200. Could you explain more about use SSH directly for it? I think what I want to

Re: Memory Explosion: Heap Dump in less than one minute

2014-07-30 Thread Tom Wilson
Upping to 1GB, memory usage seems to level off at 750MB, but there's a problem in there somewhere. I'm getting a failure message, and the marvel dashboard isn't able to fetch. C:\elasticsearch-1.1.1\bin>elasticsearch Picked up _JAVA_OPTIONS: -Djava.net.preferIPv4Stack=true [2014-07-30

Re: Memory Explosion: Heap Dump in less than one minute

2014-07-30 Thread Mark Walkom
Unless you are attached to the stats you have in the marvel index for today it might be easier to delete them than try to recover the unavailable shards. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email: ma...@campaignmonitor.com web: www.campaignmonitor.com On 31 July 2014

Re: Remote access through SSH

2014-07-30 Thread Mark Walkom
You can also curl from your local machine to the server, without having to SSH to it - curl -XGET http://IPADDRESS:9200/ You don't need to provide SSH credentials for that transport client example. Regards, Mark Walkom Infrastructure Engineer Campaign Monitor email:

Re: The number of types a index can handle

2014-07-30 Thread panfei
Thanks for the information 2014-07-30 14:55 GMT+08:00 joergpra...@gmail.com joergpra...@gmail.com: There is no limit in ES. Each type uses a certain amount of heap for caching ids and the mapping. You can create types / mappings until heap explodes. Each modification of a mapping is

Help needed understanding analyzer behavior

2014-07-30 Thread Neko Escondido
Hello community, I'm having problem understanding how analyzer should work. The result is different from what I expect. :( I have created a custom analyzer to index phone number as below: analysis : { analyzer : { phone : {

Re: Help needed understanding analyzer behavior

2014-07-30 Thread Nikolas Everett
It's probably easier to do a char filter to remove all non digits. On the other hand if you want to normalize numbers that sometimes contain area and country code to numbers you'll probably want to do that outside of elasticsearch or with a plugin. That gets difficult when you need to handle non
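
A sketch of the char-filter idea, assuming a pattern_replace char filter in front of a keyword tokenizer. The names digits_only and phone are chosen for this example, and the re.sub call only mimics the effect of the filter on a made-up number; it is not how Elasticsearch runs the analysis.

```python
import re

# Index settings: strip every non-digit before the value is tokenized.
settings = {
    "analysis": {
        "char_filter": {
            "digits_only": {
                "type": "pattern_replace",
                "pattern": "[^0-9]",
                "replacement": "",
            }
        },
        "analyzer": {
            "phone": {
                "type": "custom",
                "char_filter": ["digits_only"],
                "tokenizer": "keyword",
            }
        },
    }
}


def strip_non_digits(text):
    """Local stand-in for the char filter, using the same pattern."""
    pattern = settings["analysis"]["char_filter"]["digits_only"]["pattern"]
    return re.sub(pattern, "", text)


normalized = strip_non_digits("+1 (555) 010-9999")
```

Because the same analyzer runs at search time, a query typed with or without punctuation would reduce to the same digit string, though handling optional country or area codes still needs logic elsewhere, as the reply notes.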

Elasticsearch still scan all types in a index even if I specify a type

2014-07-30 Thread panfei
First, put some sample data: curl -XPUT 'localhost:9200/testindex/action1/1?pretty' -d ' { title: jumping tom, val: 101 }' curl -XPUT 'localhost:9200/testindex/action2/1?pretty' -d ' { title: jumping jerry, val: test }' as you can see, and the mapping is : { action1 : {

Re: more like this vs. mlt

2014-07-30 Thread vineeth mohan
Hello Peter, You have set these variables for the API and not for the query, that is why it's working - min_term_freq=1, min_doc_freq=1 Thanks Vineeth On Thu, Jul 31, 2014 at 5:02 AM, Peter Li jenli.pe...@gmail.com wrote: I ran a query: curl -XGET
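
To make the point concrete, here is a sketch of a more_like_this query body that carries the same two thresholds the _mlt URL carried. The helper name mlt_query and the sample like_text are invented for this example; Observation.Value is the field from the original message. When the two parameters are left out, the query falls back to its own defaults.

```python
import json


def mlt_query(fields, like_text, min_term_freq=1, min_doc_freq=1):
    """Search body with the frequency thresholds set explicitly in the query."""
    return {
        "query": {
            "more_like_this": {
                "fields": fields,
                "like_text": like_text,
                "min_term_freq": min_term_freq,
                "min_doc_freq": min_doc_freq,
            }
        }
    }


request = mlt_query(["Observation.Value"], "sample report text")
print(json.dumps(request, indent=2))
```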

Re: Help needed understanding analyzer behavior

2014-07-30 Thread Neko Escondido
Hi Nikolas Thank you very much for your feedback. I was hoping to be able to search against the phone number field in normalized, original, number parts format. If I modify the input into normalized format, then, search using original/number parts will not return the desired result... Or am I

How many Java clients do I need?

2014-07-30 Thread Andrew Gaydenko
As far as I understand, a Java client instance is stateless, and its methods are pure functions (I mean the operating methods rather than those related to initial configuration just after instantiation). As a result, it is sufficient to have only one client per cluster per JVM. Is that true?

Re: Recommendations needed for large ELK system design

2014-07-30 Thread Mark Walkom
1 - Looks ok, but why two replicas? You're chewing up disk for what reason? Extra comments below. 2 - It's personal preference really and depends on how your end points send to redis. 3 - 4GB for redis will cache quite a lot of data if you're only doing 50 events p/s (ie hours or even days based