Re: Elasticsearch Version Upgrade

2015-04-22 Thread Norberto Meijome
Yup, thanks, that's what I thought.
On 22/04/2015 2:49 pm, "David Pilato"  wrote:

> Only post 1.0
>
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
> Le 22 avr. 2015 à 01:14, Norberto Meijome  a écrit :
>
> David, is this the case with older versions (both client and server on
> 0.90.x versions using java client), and across the 0.90 to 1.x boundary, or
> only post 1.x?
> On 22/04/2015 12:03 am, "David Pilato"  wrote:
>
>> This should work both ways.
>>
>> The client knows the node's version.
>> The node knows the client's version.
>>
>> So basically, if one side knows it should not send new data because the
>> other side is too old, it simply does not send it.
>> Same for reading: if your node is newer, it knows that the client won’t
>> provide value X or Y, so it won’t try to read it.
>>
>>
>> That said, the best thing to do is to test it! :D
>>
>>
>>
>> --
>> *David Pilato* - Developer | Evangelist
>> *elastic.co <http://elastic.co>*
>> @dadoonet <https://twitter.com/dadoonet> | @elasticsearchfr
>> <https://twitter.com/elasticsearchfr> | @scrutmydocs
>> <https://twitter.com/scrutmydocs>
>>
>>
>>
>>
>>
>> Le 21 avr. 2015 à 15:39, Costya Regev  a écrit :
>>
>> Another question: if I upgrade my Elasticsearch client to version
>> 1.5.1 and my Elasticsearch servers stay on version 1.4.2, will it work?
>> Is there backward compatibility?
>>
>> On Tuesday, April 21, 2015 at 4:21:38 PM UTC+3, Costya Regev wrote:
>>>
>>> Just checking:
>>>
>>> so you are sure that there is forward compatibility, and my system
>>> will work fine with ES client version 1.4.1 when the server version
>>> is 1.5.1, right?
>>>
>>> On Tuesday, April 21, 2015 at 3:12:44 PM UTC+3, David Pilato wrote:
>>>>
>>>> It should work fine.
>>>>
>>>> --
>>>> David ;-)
>>>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>>>
>>>> Le 21 avr. 2015 à 14:08, Costya Regev  a écrit :
>>>>
>>>> Hi,
>>>>
>>>> We have Elasticsearch servers running ES version 1.4.2; our client
>>>> version is 1.4.1.
>>>>
>>>> We are about to upgrade our ES cluster to version 1.5.1. My question
>>>> is:
>>>>
>>>> Do we need to upgrade the client to 1.5.1, or should our current
>>>> version be compatible with the new version?
>>>>
>>>>
>>>> Thanks,
>>>> Costya.
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "elasticsearch" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to elasticsearc...@googlegroups.com.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/elasticsearch/b82263a6-e35b-4a38-b4ac-d1b570e233a9%40googlegroups.com
>>>> <https://groups.google.com/d/msgid/elasticsearch/b82263a6-e35b-4a38-b4ac-d1b570e233a9%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>>

Re: Elasticsearch Version Upgrade

2015-04-21 Thread Norberto Meijome
David, is this the case with older versions (both client and server on
0.90.x versions using java client), and across the 0.90 to 1.x boundary, or
only post 1.x?
On 22/04/2015 12:03 am, "David Pilato"  wrote:

> This should work both ways.
>
> The client knows the node's version.
> The node knows the client's version.
>
> So basically, if one side knows it should not send new data because the
> other side is too old, it simply does not send it.
> Same for reading: if your node is newer, it knows that the client won’t
> provide value X or Y, so it won’t try to read it.
>
>
> That said, the best thing to do is to test it! :D
>
>
>
> --
> *David Pilato* - Developer | Evangelist
> *elastic.co*
> @dadoonet | @elasticsearchfr | @scrutmydocs
>
>
>
>
>
> Le 21 avr. 2015 à 15:39, Costya Regev  a écrit :
>
> Another question: if I upgrade my Elasticsearch client to version
> 1.5.1 and my Elasticsearch servers stay on version 1.4.2, will it work?
> Is there backward compatibility?
>
> On Tuesday, April 21, 2015 at 4:21:38 PM UTC+3, Costya Regev wrote:
>>
>> Just checking:
>>
>> so you are sure that there is forward compatibility, and my system will
>> work fine with ES client version 1.4.1 when the server version is
>> 1.5.1, right?
>>
>> On Tuesday, April 21, 2015 at 3:12:44 PM UTC+3, David Pilato wrote:
>>>
>>> It should work fine.
>>>
>>> --
>>> David ;-)
>>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>>
>>> Le 21 avr. 2015 à 14:08, Costya Regev  a écrit :
>>>
>>> Hi,
>>>
>>> We have Elasticsearch servers running ES version 1.4.2; our client
>>> version is 1.4.1.
>>>
>>> We are about to upgrade our ES cluster to version 1.5.1. My question
>>> is:
>>>
>>> Do we need to upgrade the client to 1.5.1, or should our current version
>>> be compatible with the new version?
>>>
>>>
>>> Thanks,
>>> Costya.
>>>


Re: Index Size and Replica Impact

2015-04-20 Thread Norberto Meijome
Replica = 3 means 4 copies of your data (for each shard, 1 primary and 3
replicas).
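
The arithmetic can be sketched roughly as below. This is only an illustration: the function names are made up for this reply, and it assumes the 6 GB in the question is the size of the primaries alone. The last helper reflects the fact that two copies of the same shard are never placed on the same node.

```python
def copies_per_shard(replicas):
    """Each shard has one primary plus `replicas` replica copies."""
    return 1 + replicas


def total_store_gb(primary_gb, replicas):
    """Approximate cluster-wide size: primary data times the number of copies."""
    return primary_gb * copies_per_shard(replicas)


def assignable_copies(replicas, data_nodes):
    """At most one copy of a given shard can live on each data node."""
    return min(copies_per_shard(replicas), data_nodes)


if __name__ == "__main__":
    # Hypothetical figures: 6 GB of primaries on a 3-node cluster.
    for replicas in (1, 2, 3):
        print(replicas, total_store_gb(6, replicas), assignable_copies(replicas, 3))
```

On a 3-node cluster, replicas = 2 already puts a full copy of every shard on every node; a third replica would have nowhere to go and would stay unassigned.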
On 21/04/2015 7:54 am, "TB"  wrote:

> My index size is currently 6 GB, with replicas set to 1.
> I have a 3-node cluster; my understanding is that, in order to utilize the
> whole cluster, I would have to set replicas to 3.
> If I do that, would my index size grow beyond 6 GB on each node?
>


Re: CAT API

2015-04-04 Thread Norberto Meijome
Replica = 1 means you have 1 replica of each primary shard, i.e. you have 2
copies of the data in total.
On 02/04/2015 2:06 pm, "Nishad Karekar"  wrote:

> I am unable to understand the results from the CAT API
>
> curl 'http://hdpdncwy0001.global.shareddev.acxiom.net:9200/_cat/indices?v'
>
> health index pri rep docs.count docs.deleted store.size pri.store.size
>
> green  ce  5   1  770000  4.7tb  2.3tb
>
>
>
> From the result I see that the replication is 1, yet my
> total store size is two times my primary store size. Can someone please
> explain why that is the case?
>
>
>


Re: Getting XML into ES efficiently

2015-04-04 Thread Norberto Meijome
Hi,
My gut feel is: don't add this to the ES setup itself. Horses for courses -
have your script (Python +1) running somewhere, taking care of the
processing, dealing with issues on the FTP side, etc. Let ES do its
thing... especially if the XML parsing will take so much memory and you need
external services.

The script can be run, managed, and designed in many ways: from a simple
cron job to a Celery task, a service under Chronos/Mesos, or a consumer
getting messages and publishing to ES (though if you have 1 large XML file to
process once a day, I wouldn't go with a consumer...)
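
To make the idea concrete, here is a minimal sketch of such a script's core, using only the standard library. It streams the XML instead of loading it whole, which is one way to keep memory down. The record tag "item", the index name "jobs" and the type "job" are placeholders I made up, and the output is just the newline-delimited text that the bulk API expects; sending it to ES is left out.

```python
import io
import json
import xml.etree.ElementTree as ET


def iter_records(xml_source, record_tag="item"):
    """Yield one dict per <record_tag> element while streaming the file."""
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == record_tag:
            # Trim whitespace on every child field of the record.
            yield {child.tag: (child.text or "").strip() for child in elem}
            elem.clear()  # drop the record's children once they are consumed


def to_bulk_lines(records, index="jobs", doc_type="job"):
    """Turn records into bulk lines: an action line, then the document."""
    for rec in records:
        yield json.dumps({"index": {"_index": index, "_type": doc_type}})
        yield json.dumps(rec)


if __name__ == "__main__":
    # Hypothetical sample feed standing in for the file fetched over FTP.
    sample = io.BytesIO(
        b"<feed><item><title> Hello </title><city>Sydney</city></item></feed>"
    )
    print("\n".join(to_bulk_lines(iter_records(sample))))
```

Per-record work such as language detection or geocoding would slot in between the two generators, so it stays outside ES entirely.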

Good luck,
B
On 03/04/2015 3:47 pm, "Employ"  wrote:

> Thank you for the reply. I do need to do work on the data before importing,
> such as language detection and geocoding using third-party libraries, and I
> feel that while Logstash may be great for getting me some of the way, it
> won't be able to get me all the way.
>
> A custom plugin may be my only option in that regard, but is it really
> going to provide me any benefits over something like scrapy? Any feedback
> would be appreciated.
>
> Sent from my iPhone
>
> On 3 Apr 2015, at 00:44, Mark Walkom  wrote:
>
> You can do data transformation on the fly, yes.
>
> Language detection can't be done in LS that I know of, but you can
> definitely trim things.
>
> On 3 April 2015 at 13:16, Employ  wrote:
>
>> Thank you for the reply. I've seen that mentioned but does it have the
>> capability to modify the XML content before it is imported? For example,
>> adding the ability to do language detection and trimming via custom scripts?
>>
>> On 2 Apr 2015, at 19:44, Mark Walkom  wrote:
>>
>> Logstash can handle XML, it has a filter specifically for it -
>> http://www.elastic.co/guide/en/logstash/current/plugins-filters-xml.html
>>
>> On 3 April 2015 at 09:33, James  wrote:
>>
>>> Hi,
>>>
>>> Currently I am using scrapy to parse an XML file from an ftp server into
>>> elasticsearch. It works but seems quite a heavy weight solution and it uses
>>> a lot of memory too.
>>>
>>> I am wondering if I am better off writing a plugin for ES instead.
>>>
>>> I have some questions:
>>>
>>> A) It seems writing it in Python (since I'm a python guy) as a push
>>> plugin rather than a pull river makes sense, unless anyone has a reason why
>>> pull is better?
>>>
>>> B) For simple importing (and slight modification such as trimming,
>>> language check etc) is it likely that an ES plugin is likely going to be a
>>> better solution to importing fairly large XML files or should I just leave
>>> scrapy to do it as it is doing at the moment?
>>>
>>> Any help and advice would be appreciated as I start on this journey.
>>>
>>> James
>>>

Re: Why does creating a repository fail?

2015-03-18 Thread Norberto Meijome
Yes, that's the difference between block storage exposed over the network,
like iSCSI, and a network file system like NFS (or GlusterFS or
Lustre...).

I don't see why on a local device (iSCSI) you'd have any issue with the
numeric uid not matching the 'name' of the user, unless, of course, you were
to detach the volume from host/vm-1 and attach it to another system which
had a different user table.

Centralised authentication systems like LDAP would solve this for you too
:) (though it seems overkill when all you'd have to do is plan your uid
space properly across clients + NFS servers...). Configuration systems like
Puppet / Chef / Ansible / Salt should also help you keep uids consistent
across multiple systems...
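
If you just want to spot the mismatch before fixing it, something like the sketch below could compare the user tables of two machines. It is only an illustration: the function names are mine, and it works on the text of /etc/passwd that you have already copied from each host, rather than querying LDAP or the hosts themselves.

```python
def parse_passwd(text):
    """Map user name -> numeric uid from /etc/passwd-style text."""
    uids = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # Field 0 is the user name, field 2 the numeric uid.
        if len(fields) >= 3 and fields[2].isdigit():
            uids[fields[0]] = int(fields[2])
    return uids


def uid_mismatches(client_passwd, server_passwd, users):
    """Return {user: (client_uid, server_uid)} for users whose uids differ."""
    client = parse_passwd(client_passwd)
    server = parse_passwd(server_passwd)
    return {
        user: (client.get(user), server.get(user))
        for user in users
        if client.get(user) != server.get(user)
    }
```

An empty result for the elasticsearch user means the numeric uid is the same on both sides, which is what NFS needs.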

Anyway, Magnus' suggestion is on the money for the '1 server problem'.

On Wed, Mar 18, 2015 at 3:00 AM, David Reagan  wrote:

> @Mark Walkom, So, I'm looking into iscsi. From what I have learned so far,
> you actually format the LUN with whatever file system you want. So,
> wouldn't the gid/uid issue show up there as well, if I formatted to ext3 or
> ext4? Since Ubuntu would treat it like a normal partition and use typical
> linux file perms on it.
>
> --David Reagan
>
> On Mon, Mar 16, 2015 at 5:37 PM, David Reagan  wrote:
>
>> If I were manually creating the elasticsearch user, that'd be easy. But
>> I'm relying on apt to do the job for me. So, yeah...
>>
>> Hmm... I suppose I could manually create an elasticsearch2 user, then
>> modify the defaults files to use it when running ES. Still seems clunky...
>>
>> --David Reagan
>>
>> On Mon, Mar 16, 2015 at 5:20 PM, Andrew Selden  wrote:
>>
>>> I’m not that familiar with iSCSI so I hesitate to say for sure, but
>>> anytime you are cross-mounting filesystems on Linux you have to take
>>> uid/gid consistency into account.
>>>
>>> - Andrew
>>>
>>> On Mar 16, 2015, at 4:46 PM, David Reagan  wrote:
>>>
>>> Would an iSCSI mount have the same issue? I believe our SAN supports
>>> both.
>>>
>>> --David Reagan
>>>
>>> On Mon, Mar 16, 2015 at 4:40 PM, Andrew Selden 
>>> wrote:
>>>
>>>> Hi David,
>>>>
>>>> This is a common problem with NFS. Unfortunately the protocol assumes
>>>> identical uid/gid mappings across all machines. It’s just one of those
>>>> annoying sys-admin tasks that one has to take into account when using NFS.
>>>> To get your permissions back to less permissive settings you will have to
>>>> edit the /etc/passwd and /etc/group files to keep them in sync.
>>>>
>>>> See http://www.tldp.org/HOWTO/NFS-HOWTO/troubleshooting.html#SYMPTOM4
>>>> for more context.
>>>>
>>>> - Andrew
>>>>
>>>>
>>>> On Mar 16, 2015, at 4:04 PM, David Reagan  wrote:
>>>>
>>>> First, it is a file permissions issue. I did get snapshots to run when
>>>> I chmoded to 777. As you can see from the ls output, /mounts/prod_backup is
>>>> 777. Prior to that it was 775 or 755. So, I could revise my question to
>>>> "How can I get snapshots working without using insecure file permissions?"
>>>>
>>>> root@log-elasticsearch-01:~# mount
>>>> /dev/mapper/ws--template--01-root on / type ext4 (rw,errors=remount-ro)
>>>> proc on /proc type proc (rw,noexec,nosuid,nodev)
>>>> sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
>>>> none on /sys/fs/fuse/connections type fusectl (rw)
>>>> none on /sys/kernel/debug type debugfs (rw)
>>>> none on /sys/kernel/security type securityfs (rw)
>>>> udev on /dev type devtmpfs (rw,mode=0755)
>>>> devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
>>>> tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
>>>> none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
>>>> none on /run/shm type tmpfs (rw,nosuid,nodev)
>>>> /dev/sda1 on /boot type ext2 (rw)
>>>> rpc_pipefs on /run/rpc_pipefs type rpc_pipefs (rw)
>>>> nfsip:/vol/Logs/prod_backup on /mounts/prod_backup type nfs
>>>> (rw,nfsvers=3,hard,intr,tcp,actimeo=3,addr=nfsip)
>>>> nfsip:/vol/Logs/log-elasticsearch-01 on /mounts/log-elasticsearch-01
>>>> type nfs (rw,nfsvers=3,hard,intr,tcp,actimeo=3,addr=nfsip)
>>>>
>>>> root@log-elasticsearch-01:~# ls -ld /mounts
>>>> drwxr-xr-x 6 root root 4096 Oct  1 13:43 /mounts
>>>>
>>>> root@log-elasticsearch-01:~# ls -ld /mounts/prod_backup/
>>>> drwxrwxrwx 4 elasticsearch elasticsearch 4096 Mar 16 13:41
>>>> /mounts/prod_backup/
>>>>
>>>> --David Reagan
>>>>
>>>> On Mon, Mar 16, 2015 at 3:47 PM, Mark Walkom 
>>>> wrote:
>>>>
>>>>> Can you post the output from *mount* and *ls -ld /mounts
>>>>> /mounts/prod_backup*?
>>>>>
>>>>> On 16 March 2015 at 13:33, David Reagan  wrote:
>>>>>
>>>>>> Why does this happen?
>>>>>>
>>>>>>
>>>>>> curl -XPUT 'http://localhost:9200/_snapshot/my_backup?pretty=true'
>>>>>>> -d '{
>>>>>>> > "type": "fs",
>>>>>>> > "settings": {
>>>>>>> > "location": "/mounts/prod_backup/my_backup",
>>>>>>> > "compress": true
>>>>>>> > }
>>>>>>> > }'
>>>>>>> {
>>>>>>>   "error" :
>>>>>>> "RemoteTransportException[[log-elasticsearch-02][inet[/10.x.x.83:9300]][cluster:admin/repository/put]]

Re: logstash failed to send ping to elasticsearch

2015-03-11 Thread Norberto Meijome
TCP/9200 is for the REST interface; Zen ping should be on 9300. I suspect
you have a config setting wrong...
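
As a rough sanity check, a snippet like the following could flag host entries that point at the REST port by mistake. It is a sketch with made-up names, and it assumes the default ports (9200 for HTTP, 9300 for transport) have not been changed on your nodes.

```python
REST_PORT = 9200       # default HTTP/REST port
TRANSPORT_PORT = 9300  # default node-to-node transport port


def suspicious_unicast_hosts(hosts):
    """Return the "host:port" entries that point at the REST port."""
    flagged = []
    for entry in hosts:
        _host, sep, port = entry.rpartition(":")
        # Entries without an explicit port are left alone.
        if sep and port.isdigit() and int(port) == REST_PORT:
            flagged.append(entry)
    return flagged
```

Any entry it returns should most likely use the transport port instead, or drop the port altogether.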
On 11/03/2015 4:33 pm, "Monika Bhadauria"  wrote:

> Hi guys,
>
> I have my Elasticsearch on one server and logstash on another.
>
> I am getting the following error in my logstash, will need your inputs:
>
> log4j, [2015-03-11T05:26:32.662]  WARN:
> org.elasticsearch.discovery.zen.ping.unicast:
> [logstash-ip-172-xx-xxx-7-8623-2016] failed to send ping to
> [[#zen_unicast_1#][ip-172-xx-xx0-7][inet[/172.xx.xxx.71:9200]]]
> org.elasticsearch.transport.ReceiveTimeoutTransportException:
> [][inet[/172.31.100.71:9200]][discovery/zen/unicast] request_id [402]
> timed out after [3750ms]
> at
> org.elasticsearch.transport.TransportService$TimeoutHandler.run(TransportService.java:356)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
>
> Any help would be highly appreciated.
>


Re: EC2 cluster storage question

2015-02-25 Thread Norberto Meijome
Yes, of course, using EBS all the time would help for storage, but it can't
compete with local SSD in speed.
On 25/02/2015 9:31 pm, "Mark Walkom"  wrote:

> Fair point. The rsync option could work, but then why not just use EBS and
> then shut the nodes down to save the rsync work?
> Tagging nodes probably won't help in this instance.
>
> Basically if you want to shut everything down you need to go through
> recovery, and depending on how long that takes it may not be worth the
> cost. This is something you need to test.
>
> On 25 February 2015 at 18:14, Norberto Meijome  wrote:
>
>> OP points out he is using ephemeral storage, hence a shutdown will destroy
>> the data. But it can be rsynced to EBS as part of the shutdown
>> process, and then the reverse done when starting things up again.
>>
>> Though I guess you could let ES take care of it by tagging nodes
>> accordingly and updating the index settings (hope that makes sense...).
>> On 25/02/2015 4:58 pm, "Mark Walkom"  wrote:
>>
>>> Why not just shut the cluster down, disable allocation first and then
>>> just gracefully power things off?
>>>


Re: EC2 cluster storage question

2015-02-24 Thread Norberto Meijome
OP points out he is using ephemeral storage, hence a shutdown will destroy
the data. But it can be rsynced to EBS as part of the shutdown
process, and then the reverse done when starting things up again.

Though I guess you could let ES take care of it by tagging nodes
accordingly and updating the index settings (hope that makes sense...).
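
To spell out the tagging idea a little: if the EBS-backed nodes carried a custom node attribute, an index-level allocation filter could pull the shards onto them before the ephemeral nodes go away. The sketch below only builds the settings body for PUT /<index>/_settings; the attribute name "storage" and the value "ebs" are made up for illustration, and the nodes would need that attribute set in their own config first.

```python
import json


def allocation_filter_body(attribute, value, mode="require"):
    """Settings body that restricts an index's shards to matching nodes."""
    if mode not in ("require", "include", "exclude"):
        raise ValueError("mode must be require, include or exclude")
    key = "index.routing.allocation.%s.%s" % (mode, attribute)
    return json.dumps({key: value})


if __name__ == "__main__":
    # Hypothetical attribute/value pair marking the EBS-backed nodes.
    print(allocation_filter_body("storage", "ebs"))
```

Whether the relocation is quicker than an rsync would need testing on real data.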
On 25/02/2015 4:58 pm, "Mark Walkom"  wrote:

> Why not just shut the cluster down, disable allocation first and then just
> gracefully power things off?
>


Re: ES 0.9 on EC2 - Processor load maximizes on 100% of 1 core on multi core processor.

2015-02-21 Thread Norberto Meijome
BTW, are you reducing or disabling the refresh rate while bulk indexing?
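
In case it helps, this is roughly what I mean, as a sketch that only builds the two bodies for PUT /<index>/_settings: one to switch periodic refresh off before the bulk run and one to restore it afterwards. The helper names are mine, and "1s" is simply the default interval; use whatever your index had before.

```python
import json


def refresh_settings(interval):
    """Settings body for refresh_interval; "-1" disables periodic refresh."""
    return json.dumps({"index": {"refresh_interval": interval}})


def bulk_refresh_plan(normal_interval="1s"):
    """Return the (before_bulk, after_bulk) settings bodies."""
    return refresh_settings("-1"), refresh_settings(normal_interval)


if __name__ == "__main__":
    before, after = bulk_refresh_plan()
    print(before)
    print(after)
```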
On 22/02/2015 10:08 am, "Norberto Meijome"  wrote:

> OK, so what you have is resource  contention between searches and
> indexing...
> On 22/02/2015 12:44 am, "Maik Broxterman"  wrote:
>
>> Ok, I seem to have found a trick.
>>
>> - Step 1: remove the Route53 DNS record for the cluster for 1 node, so it
>> has no incoming client traffic
>> - Step 2: remove all replicas for the index
>> - Step 3: use the reroute api to move the shard to the node without
>> traffic
>> - Step 4: run the bulk requests again
>>
>> Thanks both for your time and effort!
>>
>> Gr, Maik
>>
>> On Friday, February 20, 2015 at 9:59:30 PM UTC+1, Mark Walkom wrote:
>>>
>>> I'd try more than 1K, more like 2-3K, see if that helps.
>>>
>>> On 21 February 2015 at 04:49, Maik Broxterman 
>>> wrote:
>>>
>>>> Hi Norberto,
>>>>
>>>> Thanks. Yes I've tried that, between 100 and 1000. It does not matter.
>>>> The strange thing is that if I do exactly the same in the 1.4 cluster that
>>>> has 4 cores processors, it just blows fully to 395% (~4*100%).
>>>>
>>>>
>>>> On Friday, February 20, 2015 at 1:20:16 PM UTC+1, Norberto Meijome
>>>> wrote:
>>>>>
>>>>> Hi Maik,
>>>>> Have you tried changing bulk size? May also be worth seeing if
>>>>> separating masters to their own nodes makes a difference...
>>>>> On 20/02/2015 8:22 pm, "Maik Broxterman"  wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> We are currently in the process of moving from an ES 0.9 cluster to
>>>>>> an ES 1.4 cluster. Both clusters are in Amazon Ec2.
>>>>>>
>>>>>> Before doing so, we need to index a lot of indexes to the ES 0.9
>>>>>> cluster first. The nodes in this cluster are all m3.2xlarge machines (8
>>>>>> cores, 30G of memory). In general the nodes in this cluster are having an
>>>>>> average processor load of 3% (so no problems at all there). The nodes are
>>>>>> newly created from the image, so we can assume that they are clean.
>>>>>>
>>>>>> The problem arises when we are going to do bulk requests. Whenever
>>>>>> the distribution of the threads on one node is around 1/8 of the total of
>>>>>> the processors, latency on the cluster goes up from 3,5ms to 100's of ms
>>>>>> in average.
>>>>>>
>>>>>> When I do a *top *the threads are all divided over all the
>>>>>> processors. All processors can have 800% of load if you add it up, but
>>>>>> whenever the addition of percentages of all cores reaches 100%, it
>>>>>> immediately starts throttling (making other requests very slow).
>>>>>>
>>>>>> *Question*
>>>>>>
>>>>>> Does anybody have experience with this situation and if yes, is there
>>>>>> a way to easily fix this?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Example of what I see in top:
>>>>>>
>>>>>> Cpu0  :  3.7%us,  0.3%sy,  0.0%ni, 96.0%id,  0.0%wa,  0.0%hi,
>>>>>> 0.0%si,  0.0%st
>>>>>>
>>>>>> Cpu1  :  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,
>>>>>> 0.0%si,  0.0%st
>>>>>>
>>>>>> Cpu2  :  1.0%us,  0.3%sy,  0.0%ni, 98.7%id,  0.0%wa,  0.0%hi,
>>>>>> 0.0%si,  0.0%st
>>>>>>
>>>>>> Cpu3  :  0.7%us,  0.0%sy,  0.0%ni, 99.0%id,  0.3%wa,  0.0%hi,
>>>>>> 0.0%si,  0.0%st
>>>>>>
>>>>>> Cpu4  :  1.7%us,  0.0%sy,  0.0%ni, 98.3%id,  0.0%wa,  0.0%hi,
>>>>>> 0.0%si,  0.0%st
>>>>>>
>>>>>> Cpu5  :  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,
>>>>>> 0.0%si,  0.0%st
>>>>>>
>>>>>> Cpu6  :  3.0%us,  0.3%sy,  0.0%ni, 96.7%id,  0.0%wa,  0.0%hi,
>>>>>> 0.0%si,  0.0%st
>>>>>>
>>>>>> Cpu7  :  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,
>>>>>> 0.0%si,  0.0%st
>>>>>>
>>>>>> Mem:  30764132k total, 30613104k used,   151028k free,   12

Re: ES 0.9 on EC2 - Processor load maximizes on 100% of 1 core on multi core processor.

2015-02-21 Thread Norberto Meijome
OK, so what you have is resource contention between searches and
indexing...
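
For the archive, steps 2 and 3 of the workaround quoted below might look roughly like this as request bodies. It is a sketch only: the first body is for PUT /<index>/_settings, the second for POST /_cluster/reroute, and the index and node names you pass in are whatever your cluster uses.

```python
import json


def drop_replicas_body():
    """Settings body that removes all replicas from an index."""
    return json.dumps({"index": {"number_of_replicas": 0}})


def move_shard_body(index, shard, from_node, to_node):
    """Reroute body that moves one shard from one node to another."""
    command = {
        "move": {
            "index": index,
            "shard": shard,
            "from_node": from_node,
            "to_node": to_node,
        }
    }
    return json.dumps({"commands": [command]})
```

Remember to put the replicas back once the bulk run is done.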
On 22/02/2015 12:44 am, "Maik Broxterman"  wrote:

> Ok, I seem to have found a trick.
>
> - Step 1: remove the Route53 DNS record for the cluster for 1 node, so it
> has no incoming client traffic
> - Step 2: remove all replicas for the index
> - Step 3: use the reroute api to move the shard to the node without traffic
> - Step 4: run the bulk requests again
>
> Thanks both for your time and effort!
>
> Gr, Maik
>
> On Friday, February 20, 2015 at 9:59:30 PM UTC+1, Mark Walkom wrote:
>>
>> I'd try more than 1K, more like 2-3K, see if that helps.
>>
>> On 21 February 2015 at 04:49, Maik Broxterman  wrote:
>>
>>> Hi Norberto,
>>>
>>> Thanks. Yes, I've tried that, between 100 and 1000. It makes no difference.
>>> The strange thing is that if I do exactly the same on the 1.4 cluster, which
>>> has 4-core processors, CPU usage climbs all the way to 395% (~4*100%).
>>>
>>>
>>> On Friday, February 20, 2015 at 1:20:16 PM UTC+1, Norberto Meijome wrote:
>>>>
>>>> Hi Maik,
>>>> Have you tried changing bulk size? May also be worth seeing if
>>>> separating masters to their own nodes makes a difference...
>>>> On 20/02/2015 8:22 pm, "Maik Broxterman"  wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> We are currently in the process of moving from an ES 0.9 cluster to an
>>>>> ES 1.4 cluster. Both clusters are in Amazon Ec2.
>>>>>
>>>>> Before doing so, we need to load a lot of indices into the ES 0.9
>>>>> cluster first. The nodes in this cluster are all m3.2xlarge machines (8
>>>>> cores, 30G of memory). In general the nodes in this cluster have an
>>>>> average processor load of 3% (so no problems at all there). The nodes are
>>>>> newly created from the image, so we can assume that they are clean.
>>>>>
>>>>> The problem arises when we do bulk requests. Whenever the combined
>>>>> thread load on one node reaches around 1/8 of the total processor
>>>>> capacity, average latency on the cluster goes up from 3.5 ms to hundreds
>>>>> of ms.
>>>>>
>>>>> When I run *top*, the threads are spread over all the processors.
>>>>> Added up, the processors could carry 800% of load, but whenever the
>>>>> sum of the per-core percentages reaches 100%, the node immediately
>>>>> starts throttling (making other requests very slow).
>>>>>
>>>>> *Question*
>>>>>
>>>>> Does anybody have experience with this situation and if yes, is there
>>>>> a way to easily fix this?
>>>>>
>>>>>
>>>>>
>>>>> Example of what I see in top:
>>>>>
>>>>> Cpu0  :  3.7%us,  0.3%sy,  0.0%ni, 96.0%id,  0.0%wa,  0.0%hi,
>>>>> 0.0%si,  0.0%st
>>>>>
>>>>> Cpu1  :  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,
>>>>> 0.0%si,  0.0%st
>>>>>
>>>>> Cpu2  :  1.0%us,  0.3%sy,  0.0%ni, 98.7%id,  0.0%wa,  0.0%hi,
>>>>> 0.0%si,  0.0%st
>>>>>
>>>>> Cpu3  :  0.7%us,  0.0%sy,  0.0%ni, 99.0%id,  0.3%wa,  0.0%hi,
>>>>> 0.0%si,  0.0%st
>>>>>
>>>>> Cpu4  :  1.7%us,  0.0%sy,  0.0%ni, 98.3%id,  0.0%wa,  0.0%hi,
>>>>> 0.0%si,  0.0%st
>>>>>
>>>>> Cpu5  :  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,
>>>>> 0.0%si,  0.0%st
>>>>>
>>>>> Cpu6  :  3.0%us,  0.3%sy,  0.0%ni, 96.7%id,  0.0%wa,  0.0%hi,
>>>>> 0.0%si,  0.0%st
>>>>>
>>>>> Cpu7  :  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,
>>>>> 0.0%si,  0.0%st
>>>>>
>>>>> Mem:  30764132k total, 30613104k used,   151028k free,   129224k
>>>>> buffers
>>>>>
>>>>> Swap:        0k total,        0k used,        0k free, 12410696k cached
>>>>>
>>>>>
>>>>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>>>
>>>>>
>>>>> 24584 elastics  20   0 19.5g  15g 119m S 14.6 53.2   4351:41 java
>>>>>
>>>>>
>>>>> *Other cases*
>>>>>
>>>>> This problem appears in exa

Re: ES 0.9 on EC2 - Processor load maximizes on 100% of 1 core on multi core processor.

2015-02-20 Thread Norberto Meijome
Hi Maik,
Have you tried changing bulk size? May also be worth seeing if separating
masters to their own nodes makes a difference...
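
In case it helps with experimenting, here is a minimal sketch of how the bulk size could be made a single knob on the client side. It assumes Python and only builds the newline-delimited bodies for the `_bulk` endpoint; the index and type names are placeholders, and sending the batches is left to whatever client you already use.

```python
import json


def bulk_batches(docs, index, doc_type, batch_size=1000):
    """Yield _bulk request bodies holding at most batch_size documents each.

    Each document becomes two lines: an action line and the source line.
    The index and doc_type values are hypothetical and come from the caller.
    """
    lines = []
    count = 0
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_type": doc_type}}))
        lines.append(json.dumps(doc))
        count += 1
        if count == batch_size:
            # The bulk format requires a trailing newline after the last line.
            yield "\n".join(lines) + "\n"
            lines, count = [], 0
    if lines:
        yield "\n".join(lines) + "\n"
```

With this in place you can rerun the same load with batch_size set to 100, 1000 or 3000 and compare search latency for each run.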
On 20/02/2015 8:22 pm, "Maik Broxterman"  wrote:

> Hello,
>
> We are currently in the process of moving from an ES 0.9 cluster to an ES
> 1.4 cluster. Both clusters are in Amazon Ec2.
>
> Before doing so, we need to load a lot of indices into the ES 0.9 cluster
> first. The nodes in this cluster are all m3.2xlarge machines (8 cores, 30G
> of memory). In general the nodes in this cluster have an average
> processor load of 3% (so no problems at all there). The nodes are newly
> created from the image, so we can assume that they are clean.
>
> The problem arises when we do bulk requests. Whenever the combined thread
> load on one node reaches around 1/8 of the total processor capacity,
> average latency on the cluster goes up from 3.5 ms to hundreds of ms.
>
> When I run *top*, the threads are spread over all the processors.
> Added up, the processors could carry 800% of load, but whenever the
> sum of the per-core percentages reaches 100%, the node immediately starts
> throttling (making other requests very slow).
>
> *Question*
>
> Does anybody have experience with this situation and if yes, is there a
> way to easily fix this?
>
>
>
> Example of what I see in top:
>
> Cpu0  :  3.7%us,  0.3%sy,  0.0%ni, 96.0%id,  0.0%wa,  0.0%hi,  0.0%si,
> 0.0%st
>
> Cpu1  :  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,
> 0.0%st
>
> Cpu2  :  1.0%us,  0.3%sy,  0.0%ni, 98.7%id,  0.0%wa,  0.0%hi,  0.0%si,
> 0.0%st
>
> Cpu3  :  0.7%us,  0.0%sy,  0.0%ni, 99.0%id,  0.3%wa,  0.0%hi,  0.0%si,
> 0.0%st
>
> Cpu4  :  1.7%us,  0.0%sy,  0.0%ni, 98.3%id,  0.0%wa,  0.0%hi,  0.0%si,
> 0.0%st
>
> Cpu5  :  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,
> 0.0%st
>
> Cpu6  :  3.0%us,  0.3%sy,  0.0%ni, 96.7%id,  0.0%wa,  0.0%hi,  0.0%si,
> 0.0%st
>
> Cpu7  :  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,
> 0.0%st
>
> Mem:  30764132k total, 30613104k used,   151028k free,   129224k buffers
>
> Swap:        0k total,        0k used,        0k free, 12410696k cached
>
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>
>
> 24584 elastics  20   0 19.5g  15g 119m S 14.6 53.2   4351:41 java
>
>
> *Other cases*
>
> This problem appears in exactly the same way on 4 core instances and 2
> core instances. There, a total load of 1/4 and 1/2 of the processors,
> respectively, causes really high latency.
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/a63eb62e-8437-477c-b379-c3fdf8a21a37%40googlegroups.com
> 
> .
> For more options, visit https://groups.google.com/d/optout.
>



Re: Discovery on EC2 - unicast, separate VPCs, public IPs

2015-02-08 Thread Norberto Meijome
Sure... the interesting point in the OP is that the two servers are in
different VPCs - not sure whether names should even resolve across VPCs
...
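
A quick way to see which answer a given host gets is to resolve the name there and check whether the result is a private address. This is a minimal sketch, assuming Python 3 and its standard `ipaddress` and `socket` modules; the hostname you pass in is whatever you put in your unicast list.

```python
import ipaddress
import socket


def is_private_address(ip):
    """Return True if the IPv4/IPv6 literal falls in a private range."""
    return ipaddress.ip_address(ip).is_private


def resolve_and_classify(hostname):
    """Resolve hostname from this machine and report (ip, is_private).

    Run it on both hosts: differing answers point at split-horizon DNS.
    """
    ip = socket.gethostbyname(hostname)
    return ip, is_private_address(ip)
```

If the host inside the other VPC gets a private address back, it will try to connect to an IP it cannot route to.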
On 08/02/2015 7:04 pm, "Ivan G"  wrote:

>
> DNS queries made inside a VPC are resolved to the internal IP by the AWS
> servers.
>
> One simple way is to use an Elastic IP for this machine and then point an
> A record at that IP.
> On 07/02/2015 23:43, "Eugen Paraschiv"  wrote:
>
>> Hi,
>> I have the following simple EC2 topology:
>> - a VPC with my entire cluster, running in a public subnet
>> - a new slave in another VPC (also a public subnet)
>> - I'm using unicast - the slave has the following config:
>> discovery.zen.ping.multicast.enabled: false
>> discovery.zen.ping.unicast.hosts: ["master_elastic_ip:9300"]
>> So - the slave points to the public IP of the master - not the private
>> one.
>>
>> However - this new slave tries to connect to the master on the private IP
>> instead of the public one - and I'm getting:
>> org.elasticsearch.common.netty.channel.ConnectTimeoutException:
>> connection timed out: /172.61.51.253:9300
>> Where 172.61.51.253 is the private IP.
>> Not sure what that is - do I need to configure anything on the slave to
>> make sure it uses the public IP to reach the master?
>> Thanks,
>> Eugen.
>>


Re: Discovery on EC2 - unicast, separate VPCs, public IPs

2015-02-07 Thread Norberto Meijome
Are you referring to the master server by FQDN or IP? If FQDN, don't
forget about EC2's split-horizon DNS (though I don't think the name should
be resolvable across 2 separate VPCs...).
Can you open a socket from host 1 to host 2 manually (with nc or telnet) on
TCP/9300?
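
If nc or telnet are not installed, the same check can be sketched in a few lines of Python (standard library only). The host and port are whatever you would have given to telnet; 9300 is just the transport port from this thread.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Try a plain TCP connect and report whether it succeeded."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

Running `can_connect("<master address>", 9300)` from the new node against both the public and the private address should show which of the two is actually reachable.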
On 08/02/2015 9:43 am, "Eugen Paraschiv"  wrote:

> Hi,
> I have the following simple EC2 topology:
> - a VPC with my entire cluster, running in a public subnet
> - a new slave in another VPC (also a public subnet)
> - I'm using unicast - the slave has the following config:
> discovery.zen.ping.multicast.enabled: false
> discovery.zen.ping.unicast.hosts: ["master_elastic_ip:9300"]
> So - the slave points to the public IP of the master - not the private
> one.
>
> However - this new slave tries to connect to the master on the private IP
> instead of the public one - and I'm getting:
> org.elasticsearch.common.netty.channel.ConnectTimeoutException: connection
> timed out: /172.61.51.253:9300
> Where 172.61.51.253 is the private IP.
> Not sure what that is - do I need to configure anything on the slave to
> make sure it uses the public IP to reach the master?
> Thanks,
> Eugen.
>


Re: Is re-election/assignment of the master node possible?

2014-11-27 Thread Norberto Meijome
The load issue affecting master detection / election shouldn't happen if
you have dedicated masters... At least that is the case with 0.90.x.

(With my limited knowledge of ES implementation details, there seems to be
a lock or priority issue when serving a large # of requests (http / thrift),
affecting cluster / metadata updates... I would think these metadata tasks
ought to take priority over queries in some cases...)
On 26/11/2014 6:11 pm, "Erik theRed"  wrote:

> Thanks, Nik -
>
> There's no data on the node so it sounds like master reelection should
> fail over fairly quickly.
>
> On Wednesday, November 26, 2014 2:58:43 PM UTC-6, Nikolas Everett wrote:
>>
>>
>>
>> On Wed, Nov 26, 2014 at 3:47 PM, Erik theRed  wrote:
>>
>>> Is there any notion of triggering a re-election of the master node?
>>>
>>> I'm currently running 1.2.4, and I have an instance that is scheduled
>>> for retirement (my favorite!) and it just so happens that it's my master
>>> node.  What can I do to avoid the dreaded "RED" state?  Is there some
>>> mechanism that can allow me to re-assign the current master to one of the
>>> other available two dedicated master nodes so I can reboot the current
>>> master?
>>>
>>
>> Move all the shards off of the node using allocation include/exclude
>> settings.  If you shoot the master one of the other master eligible nodes
>> will take over quickly and there won't be any interruptions.
>>
>>
>>> I ask because I'm a bit gun-shy due to my experience when an elected
>>> master node has gone unresponsive (before I created dedicated masters) due
>>> to excessive HTTP connections, master re-election seemed to never occur and
>>> everything comes crumbling down.
>>>
>>
>> I've never had that problem.  My cluster is pretty small though - only 31
>> nodes.
>>
>> Nik
>>


Re: If I use EC2 Discovery Plugin do I necessarily give internet access to my instances?

2014-11-20 Thread Norberto Meijome
Yes... but this might not be an option if your instance is in a private
subnet... It also means handling all your IPs like this (though in theory
you don't need internal IPs; a security group id/name would do as well...) -
there are limits to how many rules you can add to a security group.

At the same time, adding an EIP would complicate the OP's apparent security
requirements...
On 20/11/2014 12:04 pm,  wrote:

> I had the same problem yesterday. What I did was create an Elastic IP and
> associate it with the EC2 instance. In the security group you need to open
> both the private IP and the Elastic IP. Try it.
>
> On Wednesday, November 19, 2014 8:01:48 AM UTC-5, David Vasquez wrote:
>>
>> Hi everyone!
>>
>> I'm trying to configure tight security rules to my elasticsearch cluster
>> meaning that the network access rules must be exactly what is needed. Now
>> I've found that the EC2 Discovery plugin does a call to AWS (
>> ec2.us-east-1.amazonaws.com:443) and for that I would need to give
>> internet access to my elasticsearch instances.
>>
>> That said, this is a big drawback for my security configuration because
>> I cannot tie the call to a fixed IP or to a fixed port, and hence my
>> access rules would be wide open.
>>
>> Can you please tell me how you manage this security issue on AWS?
>>
>> Thank you very much!
>>


Re: If I use EC2 Discovery Plugin do I necessarily give internet access to my instances?

2014-11-19 Thread Norberto Meijome
Hi David,
Indeed, the plugin makes AWS API calls (EC2 DescribeInstances) in order
to find candidates to cluster with. Unfortunately, if memory serves me
right, those go to external IPs...

Hint - tinyproxy with a whitelist on your NAT gateway, and proper environment
configuration so that the client side (Java, in this case) is aware of the
proxy.
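
To make the "env configuration" part a bit more concrete, here is a small sketch that builds the standard JVM proxy system properties (`https.proxyHost`, `https.proxyPort`, `http.nonProxyHosts`). It is written in Python only to generate the option string. The proxy address is a placeholder, and whether the plugin's AWS client honours JVM-wide proxy properties is an assumption you would need to verify for your version.

```python
def jvm_proxy_flags(proxy_host, proxy_port, no_proxy=("localhost", "127.0.0.1")):
    """Return a string of -D flags that point the JVM's HTTPS traffic at a proxy.

    proxy_host and proxy_port are hypothetical values for your own proxy;
    hosts listed in no_proxy bypass it (entries are separated by '|').
    """
    flags = [
        "-Dhttps.proxyHost={}".format(proxy_host),
        "-Dhttps.proxyPort={}".format(proxy_port),
    ]
    if no_proxy:
        flags.append("-Dhttp.nonProxyHosts={}".format("|".join(no_proxy)))
    return " ".join(flags)
```

The resulting string would go wherever your startup script passes extra JVM options. Quote it in the shell, since it contains `|`.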

Cheers,
B
On 19/11/2014 10:01 am, "David Vasquez"  wrote:

> Hi everyone!
>
> I'm trying to configure tight security rules to my elasticsearch cluster
> meaning that the network access rules must be exactly what is needed. Now
> I've found that the EC2 Discovery plugin does a call to AWS (
> ec2.us-east-1.amazonaws.com:443) and for that I would need to give
> internet access to my elasticsearch instances.
>
> That said, this is a big drawback for my security configuration because I
> cannot tie the call to a fixed IP or to a fixed port, and hence my
> access rules would be wide open.
>
> Can you please tell me how you manage this security issue on AWS?
>
> Thank you very much!
>


Re: ES filling up the 'old' GC pool

2014-11-18 Thread Norberto Meijome
FWIW, we saw many long-running GC events using the default garbage
collector - changing to G1 solved most of the problems (at the expense of
slightly higher CPU usage all the time). After that you can take the longer
road to debugging memory allocation for your use case :-)
On 18/11/2014 6:21 am, "Wilfred Hughes"  wrote:

> We're running elasticsearch 1.2.4 on Java 1.7.0_40, for what it's worth.
>


Re: Cluster discovery on Amazon EC2 problem - need urgent help

2014-10-17 Thread Norberto Meijome
I am pretty sure you can open the ports for the security group the ELB
belongs to, regardless of the AZ (AZ, not region), unless you are using
network ACLs.

Anyway, this isn't really ES... PM me if you want to continue the AWS
discussion :-)

On 16/10/2014 3:37 pm, "Zoran Jeremic"  wrote:
>

> For the zone availability, I had to go with everything in one zone. The main
> reason was the problem of connecting ELB-controlled application instances with
> backend instances (MySQL, MongoDB and Elasticsearch). It's not possible to
> add a rule with port + ELB security group to the backend instances if the
> instances are in different zones, so I had to keep everything in one zone.



Re: running on EC2 S3 vs EBS

2014-10-13 Thread Norberto Meijome
Or, if your use case allows for it, have a very well oiled rebuild process
(data included).
On 14/10/2014 8:36 am, "Itamar Syn-Hershko"  wrote:

> Yes, you don't want to use anything other than local storage for
> Elasticsearch. Not EBS and definitely not S3. You can use the
> snapshot/restore API to continuously back up to S3 and get all the data
> protection you need.
>
> --
>
> Itamar Syn-Hershko
> http://code972.com | @synhershko 
> Freelance Developer & Consultant
> Author of RavenDB in Action 
>
> On Tue, Oct 14, 2014 at 12:17 AM, Matthias Johnson 
> wrote:
>
>> We've begun deploying to AWS EC2. I've seen references in the group to
>> the S3 gateway and its being deprecated. That seems to be confirmed by
>> the docs, which don't seem to list the S3 Gateway specifically
>> after 0.90.x.
>>
>> We are also using the elasticsearch-cloud-aws plugin, which does a
>> nice job of helping with auto discovery. It also shows settings for using S3.
>>
>> After some reading my understanding is that the plugin is basically just
>> snapshots that are stored in S3. Is that understanding correct? Is this
>> much different from the original gateway?
>>
>> That suggests that unless we take frequent snapshots we would run a risk
>> of data loss if the entire cluster went down (right now we are using
>> instance storage). Is that right?
>>
>> Switching to EBS would give us better protection against data loss, since
>> the data is stored on a more permanent basis as well as improved recovery
>> after an entire cluster going down?
>>
>> Are there any good guides on configuring this sort of setup with
>> cloudformation and templates and/or tying EBS volumes for ES use to
>> machines when a cluster is resurrected?
>>
>> \@matthias
>>


Re: Cluster discovery on Amazon EC2 problem - need urgent help

2014-10-12 Thread Norberto Meijome
Inline below ...

On Sun, Oct 12, 2014 at 5:28 AM, Zoran Jeremic 
wrote:

> Hi Norberto,
>
> Thank you for your advice. This is really helpful, since I have never
> used elasticsearch in a cluster before, and have never gone live with a
> large number of users. My previous experience was with a single ES node and
> a very small number of users, so I'm still concerned about how this will
> work. The main problem is that I don't know how many users to expect, so I
> should be ready to expand the cluster if necessary.
>

Sure - that's one of the nice things about ES, and AWS - you can keep
tuning as you go...


>
> So far, I created a cluster of 3 m3.large instances with 3 indexes (5
> shards and 2 replicas).
> I couldn't manage to connect it with ec2 autodiscovery. The only option
> that worked for me is having one node that the other nodes refer to as
> the unicast host. I think it might work if I have one node that will
> always be on.
>

Build for failure.


>
> You were right about having keys in the config. I didn't need them. Can I
> also remove them from my java application? I guess they could be removed if
> the launch configuration contains an IAM instance profile.
>

I don't know why your app needs AWS credentials, so I cannot really answer
that - but, in general, if the AWS library you use supports IAM profiles
then you should be able to remove hardcoded creds. YMMV.


> I also decreased zen discovery timeout to 3s.
>
>  - your master config shows master false... You want the master with
> master = true and data = false... Obviously you want more than one master
> (if you don't have too much load, start with all nodes available as data and
> master, then separate functionality as needed). Don't forget to set the
> minimum number of master nodes to (master-eligible nodes / 2) + 1 to prevent
> split brain scenarios.
>
> I've set all 3 nodes as master and data, but I'm not sure that I
> understand the advantage of having nodes that are not master nodes.
> I know these nodes will not be elected as master, but what is the idea behind
> that, and what would I gain if I set the master not to hold data? Would it
> increase performance?
>

TL;DR - scalability, performance: there are certain operations which need
to be performed by the master node in a timely manner. If your node is
already too busy handling searches, 'master operations' will suffer (and your
whole cluster will slow down).

It is much cheaper to run smaller master (and load balancer) nodes,
separate from your data nodes, than to scale up + out your data
nodes to handle all the operations.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-node.html


>
>
>  It should work pretty well with ec2 auto discovery - unicast is a good
> starting point but unless you are statically assigning them via cloud
> formation (or manually?), it may not be worth the trouble (and it stops you
> from dynamically scaling your cluster)
> How will an ES node behave under Amazon auto scaling, and could it be used
> the way I'm using auto scaling to meet high load? If I have already set 5
> shards and 2 replicas on the previous 3 nodes, will these shards and replicas
> be moved to the new nodes, and how long might that take? If that is what
> happens, I guess it's not a good idea to auto-scale a new ES node while ES
> is under heavy use, and then turn it off later.
>

Yeah, that's definitely not something that will always work with
autoscaling.
- You can use autoscaling to ensure the minimum # of nodes is defined (i.e.,
automatic rebuild of a killed node).
- If you know you have, say, 8 hours with 50% more traffic, you can
increase the number of nodes some time before the peak and increase the # of
replicas; after the peak, reduce the replica # and remove nodes... Not
autoscaling per se, but building from the get go without hardcoded hostnames
will help you do things like this.
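
As a rough illustration of the replica part, here is a sketch that prepares the index settings update for a scheduled job to send before and after the peak. It assumes Python 3's standard `urllib`; the base URL and index name are placeholders, and the function only builds the request for `PUT /<index>/_settings` without sending it.

```python
import json
import urllib.request


def replica_settings_request(base_url, index, replicas):
    """Prepare (but do not send) a request that sets number_of_replicas.

    base_url and index are hypothetical values for your own cluster.
    """
    body = json.dumps({"index": {"number_of_replicas": replicas}}).encode("utf-8")
    return urllib.request.Request(
        "{}/{}/_settings".format(base_url.rstrip("/"), index),
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

A cron job could pass the result to `urllib.request.urlopen` with a higher replica count before the peak and the normal count afterwards, once the extra nodes have joined.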

BTW, you also want to play with routing awareness, so your replicas are
distributed across different AZs.

AND beware of the cost of inter-AZ traffic :) (yes, it conflicts with the 'AZ
routing awareness')


> Sorry if these questions are too naive.
>

:) not at all!

good luck


>
> Thanks,
> Zoran
>
>
>
> On Friday, 10 October 2014 20:43:02 UTC-7, Norberto Meijome wrote:
>>
>> Zoran, good to hear it is working now.
>>
>> It should work pretty well with ec2 auto discovery - unicast is a good
>> starting point but unless you are statically assigning them via cloud
>> formation (or manually?), it may not be worth the trouble (and it stops you
>> from dynamically scaling your cluster)
>>
>> - make sure you have the ec2 plugin installed.
>> - if you use iam profiles, you don't  need a key sp

Re: Cluster discovery on Amazon EC2 problem - need urgent help

2014-10-10 Thread Norberto Meijome
Zoran, good to hear it is working now.

It should work pretty well with ec2 auto discovery - unicast is a good
starting point but unless you are statically assigning them via cloud
formation (or manually?), it may not be worth the trouble (and it stops you
from dynamically scaling your cluster)

- make sure you have the ec2 plugin installed.
- if you use IAM profiles, you don't need a key specified in the config
(a key in the config will override the one from the profile). Also make sure
you manually test that your profile is applied properly (the AWS CLI is a good
agnostic tool for this).
- reduce the zen discovery timeout - it seems that it will always start with
zen then fail over to ec2, and it can take 30 secs or so to time out... (maybe
it was my bad config, I used to have zen when I was moving from unicast to
ec2 discovery... I don't remember finding an option to disable zen discovery).

- the default logs should show you enough info to debug any of this.

- your master config shows master false... You want the master with master
= true and data = false... Obviously you want more than one master (if you
don't have too much load, start with all nodes available as data and master,
then separate functionality as needed). Don't forget to set the minimum
number of master nodes to (master-eligible nodes / 2) + 1 to prevent split
brain scenarios.
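
The quorum rule above (master-eligible nodes divided by 2, plus 1) uses integer division. Here is a tiny Python sketch of the arithmetic; the result is what you would put in `discovery.zen.minimum_master_nodes`.

```python
def minimum_master_nodes(master_eligible):
    """Return the quorum size for a given number of master-eligible nodes."""
    if master_eligible < 1:
        raise ValueError("need at least one master-eligible node")
    # Integer division: 3 nodes -> 2, 4 nodes -> 3, 5 nodes -> 3.
    return master_eligible // 2 + 1
```

Note that with only 2 master-eligible nodes the quorum is 2, so losing either node leaves the cluster without a master. That is why 3 is the usual starting point.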
On 11/10/2014 1:38 pm, "Zoran Jeremic"  wrote:

> Hi David,
>
> Thank you for your advice. It really helped me solve the issue and
> get it working.
> In the end I had to keep these two:
>  discovery.zen.ping.multicast.enabled: false
>  discovery.zen.ping.unicast.hosts:
> ["10.185.210.54[9300-9400]","10.101.176.236[9300-9400]"]
>
> and to remove:
> network.publish_host: 255.255.255.255
>
> And it finally worked. The biggest problem turned out to be what you
> mentioned at the beginning: missing spaces after ":", missing spaces at the
> beginning of lines, and some extra spaces after #. I thought that ":" was
> just a delimiter and didn't have to be followed by a space. The strange thing
> is that when elasticsearch.yml has such problems, there are no logs that
> indicate a problem. Elasticsearch either logs nothing and fails to
> start, or just ignores the malformed properties.
>
> Thanks,
> Zoran
>
> On Friday, 10 October 2014 14:11:00 UTC-7, David Pilato wrote:
>>
>> Not sure, but it may be related to public/private IP.
>> Maybe debug logs will give you more insight?
>>
>> --
>> David ;-)
>> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>>
>> On 10 Oct 2014 at 22:40, Zoran Jeremic  wrote:
>>
>> Hi David,
>>
>> Thank you for your quick response. That was a great guess about the space
>> after ":". It really was what caused the problem, so I'm now a step
>> forward. It seems that it's trying to establish the connection, but there
>> are plenty of exceptions stating that the network is unreachable. Why this
>> exception if I can telnet between nodes on 9300?
>>
>> [2014-10-10 20:22:12,184][WARN ][transport.netty  ] [Joey Bailey] exception caught on transport layer [[id: 0x5541474b]], closing connection
>> java.net.SocketException: Network is unreachable
>>     at sun.nio.ch.Net.connect0(Native Method)
>>     at sun.nio.ch.Net.connect(Net.java:465)
>>     at sun.nio.ch.Net.connect(Net.java:457)
>>     at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:670)
>>     at org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink.connect(NioClientSocketPipelineSink.java:108)
>>     at org.elasticsearch.common.netty.channel.socket.nio.NioClientSocketPipelineSink.eventSunk(NioClientSocketPipelineSink.java:70)
>>     at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendDownstream(DefaultChannelPipeline.java:574)
>>     at org.elasticsearch.common.netty.channel.Channels.connect(Channels.java:634)
>>     at org.elasticsearch.common.netty.channel.AbstractChannel.connect(AbstractChannel.java:207)
>>     at org.elasticsearch.common.netty.bootstrap.ClientBootstrap.connect(ClientBootstrap.java:229)
>>     at org.elasticsearch.common.netty.bootstrap.ClientBootstrap.connect(ClientBootstrap.java:182)
>>     at org.elasticsearch.transport.netty.NettyTransport.connectToChannels(NettyTransport.java:705)
>>     at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:647)
>>     at org.elasticsearch.transport.netty.NettyTransport.connectToNode(NettyTransport.java:615)
>>     at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:129)
>>     at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:404)
>>     at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:134)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.ru

Re: alerts from Kibana/ES

2014-05-27 Thread Norberto Meijome
Hi, not sure tbh.
Kibana is a js interface so I don't think it makes sense to alert from it.
You could monitor the results stored in ES with nagios/zabbix/your
monitoring of choice, parse the json result and alert based on that.
We've used logstash's statsd module to send the data we are interested in - we
have standard checks against a lot of statsd data points so this was a
simple way to integrate it all.
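
To make the "parse the json result and alert based on that" idea concrete,
here is a rough Python sketch of a Nagios-style check. It is only an
illustration: the function name, thresholds and sample response are invented,
and it assumes a search response in which hits.total is a plain integer (as in
the ES versions discussed on this list). Fetching the response from ES is left
out on purpose.

```python
import json

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_hits(response_body: str, warn: int, crit: int):
    """Map the hit count of a search response to a (status, message) pair."""
    try:
        total = json.loads(response_body)["hits"]["total"]
        if not isinstance(total, int):
            raise TypeError("hits.total is not an integer")
    except (ValueError, KeyError, TypeError):
        return UNKNOWN, "UNKNOWN - could not parse search response"
    if total >= crit:
        return CRITICAL, f"CRITICAL - {total} matching events"
    if total >= warn:
        return WARNING, f"WARNING - {total} matching events"
    return OK, f"OK - {total} matching events"


# Invented sample: the body a query for "error" events might return.
sample = '{"took": 3, "hits": {"total": 7, "hits": []}}'
status, message = check_hits(sample, warn=5, crit=10)
print(status, message)
```

A real check would pass the status to sys.exit() so that nagios (or a similar
tool) can pick it up.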
On 27/05/2014 7:02 pm, "NF"  wrote:

> Hi,
>
> We’re using Kibana/Elasticsearch to visualize different kinds of logs in
> our company. Now, we would need a feature that would allow us to send an
> alert/notification (email or other) when a certain event/trigger is
> captured.
>
> I’d like to know whether such a feature is planned in the Kibana/Elasticsearch
> backlog. If so, when might we expect it to be available?
>
> If not, could you please suggest any (open source) solution to satisfy our
> need?
>
> Thanks,
>
> Natalia
>
> --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/0107a345-9eb2-431f-8639-3bcc526dbaea%40googlegroups.com
> .
> For more options, visit https://groups.google.com/d/optout.
>



Re: Elasticsearch Deployment architecture

2014-04-30 Thread Norberto Meijome
Sending indexing requests to the SLB - is this less optimal, or would it
outright fail?
On 30/04/2014 9:04 am, "Mark Walkom"  wrote:

> For searches, yes. You'd want the indexing to go to the masters.
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com
> web: www.campaignmonitor.com
>
>
> On 30 April 2014 09:02, Norberto Meijome  wrote:
>
>> On a related note, if you have separate slb and master, your main LB
>> (say, haproxy) would be pointing to the slb , not the master , right?
>>  On 29/04/2014 8:40 pm, "Dinesh Chandra" 
>> wrote:
>>
>>> Hi,
>>>
>>> I am very new to elasticsearch, I am trying to deploy elasticsearch in
>>> my dev environment - While there are many ways in which Elasticsearch can
>>> be deployed, I and my team have arrived at this architecture
>>>
>>> 4 Data Nodes
>>> 3 Master Nodes
>>> 2 Search Load Balancers (SLB)
>>>
>>> Now my question is:
>>>  - Does it make sense to have SLB at all?
>>>  - Can I just have master nodes and have them perform the JOB of SLB too?
>>>
>>> Please enlighten me on a sensible Elasticsearch Architecture!
>>>


Re: Elasticsearch Deployment architecture

2014-04-29 Thread Norberto Meijome
On a related note, if you have separate slb and master, your main LB (say,
haproxy) would be pointing to the slb , not the master , right?
 On 29/04/2014 8:40 pm, "Dinesh Chandra"  wrote:

> Hi,
>
> I am very new to elasticsearch, I am trying to deploy elasticsearch in my
> dev environment - While there are many ways in which Elasticsearch can be
> deployed, I and my team have arrived at this architecture
>
> 4 Data Nodes
> 3 Master Nodes
> 2 Search Load Balancers (SLB)
>
> Now my question is:
>  - Does it make sense to have SLB at all?
>  - Can I just have master nodes and have them perform the JOB of SLB too?
>
> Please enlighten me on a sensible Elasticsearch Architecture!
>


Re: EC2 Discovery

2014-03-21 Thread Norberto Meijome
Don't try ec2 discovery until you have tested that:
- you can connect from one machine to another on port 9300 (nc as client
and server; basic networking/firewalling)
- a simple aws ec2 describe-instances call, run with the API key you plan to
use, shows the machines you need. Bonus points for filtering based on the
rules you intend to use (sec group, tags). This is to ensure your API keys
have the access needed.

Once you have those basic steps working, use them in the ES config.

Make sure you enable ec2 discovery and disable the zen discovery (it will
run first and likely time out, and ec2 disco won't get to exec).

The other thing to watch out for is contacting nodes which are too busy to
ack your new node's request for cluster info... but that would be a problem
with zen disco too.
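
If nc is not at hand, the port 9300 connectivity test from the first bullet
can be approximated with a few lines of Python. This is only a sketch: the
function name is invented, and it checks nothing more than whether a TCP
connection can be opened.

```python
import socket


def can_connect(host: str, port: int = 9300, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and applies the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

Run it from each node against the private address of every other node, for
example can_connect("10.0.0.12") where the address is a placeholder. A False
result points at security groups or firewalling rather than at ES itself.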
On 21/03/2014 12:31 PM, "Raphael Miranda"  wrote:

> are both machines in the same security group?
>


Re: 2 clusters versus 1 big cluster?

2014-03-21 Thread Norberto Meijome
@mauri, thank you for such an interesting analysis.
On 21/03/2014 1:01 PM, "Mauri"  wrote:

> Hi Brad
>
> I agree with what Mark and Zachary have said and will expand on these.
>
> Firstly, shard and index level operations in ElasticSearch are
> peer-to-peer. Single-shard operations will affect at most 2 nodes, the node
> receiving the request and the node hosting an instance of the shard
> (primary or replica depending on the operation). Multi-shard operations
> (such as searches) will affect from one to (N + 1) nodes, where N is the
> number of shards in the index.
>
> So from an index/shard operation perspective there is no reason to split
> into two clusters. The key issue with index/shard operations is that the
> cluster is able to handle the traffic volume. So if you do decide to split
> out into two clusters you will need to look at the load profile for each
> of your client types to determine how much raw processing power you need in
> each cluster. It may be that a 10:20 split is better than a 15:15
> split between clusters to balance request traffic, and therefore CPU
> utilisation, across all nodes. If you go with one cluster this is not an
> issue because you can move shards between nodes to balance the request
> traffic.
>
> Larger clusters also imply more work for the cluster master in managing
> the cluster. This comes down to the number of nodes that the master has to
> communicate with, and manage, and the size of the cluster state. A cluster
> with 30 nodes is not too large for a master to keep track of. There will be
> an increase in network traffic associated with the increase in volume of
> master-to-worker and worker-to-master pings used to detect the
> presence/absence of nodes. This can be offset by reducing the respective
> ping intervals.
>
> In a large cluster it is good practice to have a group of dedicated master
> nodes, say 3, from which the master is elected. These nodes do not host any
> user data meaning that cluster management is not compromised by high user
> request traffic.
>
> The size of the cluster state may be more of an issue. The cluster state
> comprises all of the information about the cluster configuration. The
> cluster state has records for each node, index, document mapping, shard,
> etc. Whenever there is a change to the cluster state it is first made by
> the master which then sends the updated cluster state to each worker node.
> Note that the entire cluster state is sent, not just the changes! It is
> therefore highly desirable to limit the frequency of changes to the
> cluster state, primarily by minimizing dynamic field mapping updates, and
> the overall size of the cluster state, primarily by minimizing the number
> of indices.
>
> In your proposed model the size of the cluster state associated with the set of
> 60 shared month indices will be larger than that of one set of 60 dedicated
> month indices by virtue of having 100 shards to 6. However, it may not be
> much bigger because there will be much more metadata associated with
> defining the index structure, notably the field mappings for all document
> types in the index, than the metadata defining the shards of the index. So
> it may well be that the size of the cluster state associated with 60
> "shared" month indices plus N sets of 60 "dedicated" indices is not much
> more than that of (N + 1) sets of 60 "dedicated" indices. So there may not
> be much point in splitting to two clusters. A quick way to look at this for
> your actual data model is to:
>   1. Set up an index in ES with mappings for all document types and 6
> shards and 0 replicas,
>   2. Retrieve the index metadata JSON using ES admin API,
>   3. Increase the number of replicas to 16 (102 shards total),
>   4. Retrieve the index metadata JSON using ES admin API,
>   5. Compare the two JSON documents from 2 and 4.
>
> As stated above, it is desirable to minimize the number of indices. Each
> shard is a Lucene index which consumes memory and requires open file
> descriptors from the OS for segment data files and Lucene index level
> files. You may find yourself running out of memory and/or file descriptors
> if you are not careful.
>
> I understand you are looking for a design that will cater for on disc data
> volume. Given that your data is split into monthly indices it may well be
> that no one index, either "shared" or "dedicated" will reach that volume in
> one month. There may also be seasonal factors to consider whereby one or
> two months have much higher volumes than others. I have read/heard about
> cases where a monthly index architecture was implemented but later scrapped
> for a single index approach because the month-to-month variation in volume
> was detrimental to overall system resource utilisation and performance.
>
> In your case, think about whether monthly indices are really appropriate. An
> alternative model is to partition one year's worth of data into a set of
> indices bounded by size rather than time. In this 
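
The comparison in steps 1 to 5 of Mauri's message above (retrieve the index
metadata JSON before and after changing the replica count, then compare the
two) could be done with a small Python sketch like the following. The function
names and the two sample documents are invented for illustration; real
metadata would come from the ES admin API and be much larger.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into {"a.b.0": value} pairs."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(obj, list):
        for position, value in enumerate(obj):
            flat.update(flatten(value, f"{prefix}{position}."))
    else:
        flat[prefix[:-1]] = obj  # drop the trailing dot
    return flat


def diff_metadata(before_json: str, after_json: str):
    """Return {path: (before, after)} for every leaf value that differs."""
    before = flatten(json.loads(before_json))
    after = flatten(json.loads(after_json))
    # A path missing on one side shows up as None on that side.
    return {
        path: (before.get(path), after.get(path))
        for path in sorted(before.keys() | after.keys())
        if before.get(path) != after.get(path)
    }


# Invented stand-ins for the documents retrieved in steps 2 and 4.
before_json = '{"settings": {"index": {"number_of_shards": "6", "number_of_replicas": "0"}}}'
after_json = '{"settings": {"index": {"number_of_shards": "6", "number_of_replicas": "16"}}}'

changes = diff_metadata(before_json, after_json)
print(changes)
print("size in bytes:", len(before_json), "->", len(after_json))
```

Comparing the byte sizes alongside the changed paths gives a rough feel for
how much of the metadata is mappings versus shard information.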

Re: Elasticsearch 1.0.0 is now GA

2014-02-19 Thread Norberto Meijome
Agreed, it is bad form to force a reinstall... but surely you would have your
yml in a code/cfg repository?
On 18/02/2014 9:14 AM, "Tony Su"  wrote:

> What?!
>
> Removing and re-installing the ES package either removes the original or
>  the existing elasticsearch.yml
>
> This is contrary to conventional packaging from what I've generally seen.
> Typically, when a package is removed, the configuration file is left alone
> and must be removed manually if desired.
>
> No big deal in my case, I've been working on elasticsearch.yml heavily for
> several days so can remember all the customizations I've made, but IMO this
> is a disaster waiting to happen for clusters with new Admins or those who
> attempt to fix a problem by removing and re-installing.
>
> Leaving the config file alone and re-using is the  option.
>
> IMO,
> Tony
>
>
>
>
>
> On Wednesday, February 12, 2014 3:44:34 PM UTC-8, Mark Walkom wrote:
>
>> I didn't see anything on the list, but there's a blog post about 1.0.0
>> hitting general availability!
>>
>> http://www.elasticsearch.org/blog/1-0-0-released/
>>
>> Regards,
>> Mark Walkom
>>
>> Infrastructure Engineer
>> Campaign Monitor
>> email: ma...@campaignmonitor.com
>> web: www.campaignmonitor.com
>>


Re: ec2 discovery

2014-01-23 Thread Norberto Meijome
As I understand it, the ec2 plugin simply does ec2 API calls to list
instances, filtered as per your config. It plays no actual part in the
connectivity or clustering part - just discovery.
So yes, what you saw makes sense.
On 23/01/2014 7:43 PM, "barak"  wrote:

> Hi,
>
> I have 3 nodes on ec2, each running 0.90.9. With the default config, none of
> them discovers the others. Now I want 2 of them to form a cluster, so I
> installed the aws-cloud plugin on those 2 nodes, changed the related
> configuration and restarted. The 2 nodes indeed discovered each other as
> expected, but also the 3rd node... Only after I changed the cluster name of
> that 3rd node did it leave the cluster. Is it OK that a node without the
> aws-cloud plugin got discovered by the others?
>
> Thanks!
>


Re: Node will not shut down

2014-01-21 Thread Norberto Meijome
Brad,
What version of ES?
What was ES doing on the problem node? (Hot threads call / log file /
strace.) Any related OS info (was it IO bound?)
If it was really hung, I am not sure why the shutdown would work after
moving the shards off it (i.e. cluster was green...)... it sounds to me
like it was too busy doing something...

The other nodes had already decided your problem node was out of the
cluster , right?
On 22/01/2014 9:55 AM, "Brad Jordan"  wrote:

> Just an update... I waited for the "unassigned_shards" number to reach
> zero at which point the cluster_state reported GREEN but still only had 3
> nodes. I was then able to execute the curl to shut down node 2 and then
> restart it. It joined the cluster again and everyone was happy. I guess the
> moral of the story is to just wait and ES will fix itself? Patience is a
> virtue? Not sure but ES did eventually fix itself :-)
>
> -Brad
>
> On Tuesday, January 21, 2014 3:42:51 PM UTC-7, Brad Jordan wrote:
>>
>> This is a DEV env. I've got 24G of RAM on all 4 machines. 12G for ES and
>> 12G for the OS. I believe the machines are quad core HP Z-800's.
>>
>> I will not be inserting at this rate very often. My question is more
>> operational. How do you recover from the place I am in? If I kill -9 the ES
>> process on node 2 I believe I will put my cluster in the red state.
>>
>> I did get into this unhappy spot once before. After trying to shut down
>> ES on node 2 I eventually kill -9'd it. At that point my cluster was in the
>> red state and unable to service requests. The "unassigned_shards" number
>> was not changing. I have daily indexes so I simply deleted the most recent
>> daily index and rebuilt it. At this point my cluster had all 4 nodes and
>> was green again. In production this approach is not popular with mgmt. so
>> I'm trying to understand a less heavy handed approach ;-)
>>
>> -Brad
>>
>> On Tuesday, January 21, 2014 2:54:52 PM UTC-7, Ben Hundley wrote:
>>>
>>> 2 questions:
>>>
>>> 1. What size servers are you using?  Knowing how much RAM and # cores would
>>> be very helpful.
>>>
>>> 2. Definitely sounds like a massive load.  Are you going to continually be
>>> inserting 3k docs per sec?  ~260mil documents a day?
>>>
>>>
>>>
>>> -
>>>
>>> --
>>> View this message in context:
>>> http://elasticsearch-users.115913.n3.nabble.com/Node-will-not-shut-down-tp4047940p4047942.html
>>> Sent from the ElasticSearch Users mailing list archive at Nabble.com.
>>>


Re: Cluster state yellow

2014-01-16 Thread Norberto Meijome
Gotcha, my bad.
On 17/01/2014 12:43 AM, "joergpra...@gmail.com" 
wrote:

> minimum_master_nodes is a dynamic cluster setting, that means, it can be
> set via cluster update API.
>
> Jörg
>


Re: Cluster state yellow

2014-01-16 Thread Norberto Meijome
Great thread, thanks.

Some points: just because you know how many (master) nodes you have
doesn't mean you know, or should care about, their hostnames (ec2, servers
are cattle not pets, etc.).

One thing I am not sure about: would it be possible (i.e., safe) to make
the quorum threshold a runtime-configurable value, rather than having to
restart all the nodes for the change to take effect? We'd have to put some
code around this for safety of course (what happens if you set a number >
N, for example...).

Also, can anyone comment on using zookeeper for master election (and cfg
updates)? I saw a plugin for zk but haven't had time to test it.

Thanks !
Beto
On 16/01/2014 8:55 AM, "joergpra...@gmail.com" 
wrote:

> Using quorum consensus (another name for the 'minimum_master_node'
> approach) as default is not possible, since the quorum count is only known
> by the admin.
>
> There are perfect solutions for consensus but they are not easy to
> implement, see Byzantine fault tolerance
>
> http://en.wikipedia.org/wiki/Byzantine_fault_tolerance
>
> or Paxos
>
> http://research.google.com/archive/paxos_made_live.html
>
> or a more promising approach started lately, RAFT
>
> http://raftconsensus.github.io/
>
> Jörg
>


Re: Strategy for keeping Elasticsearch updated with MySQL

2014-01-10 Thread Norberto Meijome
+1, having a queue and consumers between your source of truth and ES is a
great approach. You can decouple and independently scale (and stop when
needed, as DP said) the different components, minimising impact to your
users.
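
A toy Python sketch of the queue-and-consumer shape described above, using
only the standard library. Everything here is a stand-in: an in-process
queue.Queue plays the message queue, a plain list plays ES, and the MySQL
write is only a comment. A real setup would use an external broker so the
queue survives application restarts.

```python
import queue
import threading


def consume(events, index_document):
    """Pull events off the queue and index them until a None sentinel arrives."""
    while True:
        event = events.get()
        try:
            if event is None:
                return
            index_document(event)
        finally:
            events.task_done()


events = queue.Queue()
indexed = []  # stand-in for the ES index

worker = threading.Thread(target=consume, args=(events, indexed.append))
worker.start()

for row in ({"id": 1, "title": "first"}, {"id": 2, "title": "second"}):
    # The application would write the row to MySQL here, then enqueue it.
    events.put(row)

events.put(None)  # tell the consumer to stop
worker.join()
print(indexed)
```

Because the producer only touches the queue, the consumer (and the ES cluster
behind it) can be stopped or scaled without the application noticing, which is
the decoupling the thread is about.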
On 09/01/2014 7:35 AM, "David Pilato"  wrote:

> I would do 1/ to have a more near-real-time search.
> Also, the idea is that I have an object in memory and I simply push it to
> MySQL and to ES at the same time. No need to read the object again from
> MySQL to index it in another process (proposition 2).
>
> That said you could use also a Message Queue in the middle if you want to
> be able at some point to stop your ES cluster without stopping your
> application.
> This is what I did in the past.
>
> My 2 cents
>
> --
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet  | 
> @elasticsearchfr
>
>
> On 8 January 2014 at 20:13:40, arthurX (fc28...@gmail.com) wrote:
>
> Hello! I use MySQL as my primary datastore and use Elasticsearch to
> further index the documents.
> My problem is keeping the data in ES in sync with MySQL.
>
> Currently I have two methods in mind:
> 1. Whenever I add or update an entry in MySQL, do the same action in ES.
> 2. Do some cron jobs that periodically keep ES in sync with the data in
> MySQL.
>
> For method 2 I wonder how can I check if an entry is already indexed in
> Elasticsearch. And would it be efficient at all if I have to check every
> entry to see if it is updated?
>
> I am new to the technology and I am afraid I have missed some really
> obvious and established solutions here. Otherwise, what is the "normal" way
> this situation is handled?