Re: [Logging] searchguard configuration issue? ["warning", "elasticsearch"], "pid":1, "message":"Unable to revive connection: https://logging-es:9200/"}
2017-07-12 15:41 GMT+02:00 Peter Portante:
>> Ok, I'll try that, how can I execute sgadmin script manually ?
>
> You can see it in the run.sh script in each pod, look for the invocation
> of sgadmin there.

Ok, I have executed:

/usr/share/elasticsearch/plugins/search-guard-2/tools/sgadmin.sh \
    -cd ${HOME}/sgconfig \
    -i .searchguard.${HOSTNAME} \
    -ks /etc/elasticsearch/secret/searchguard.key \
    -kst JKS \
    -kspass kspass \
    -ts /etc/elasticsearch/secret/searchguard.truststore \
    -tst JKS \
    -tspass tspass \
    -nhnv \
    -icl

on ES nodes 1 and 2 at the same time, but I needed to run it a second time on node 2. Now I have this message:

Will connect to localhost:9300 ... done
Contacting elasticsearch cluster 'elasticsearch' and wait for YELLOW clusterstate ...
Clustername: logging-es
Clusterstate: GREEN
Number of nodes: 2
Number of data nodes: 2
.searchguard.logging-es-x39myqbs-1-s5g7c index already exists, so we do not need to create one.
Populate config from /opt/app-root/src/sgconfig/
Will update 'config' with /opt/app-root/src/sgconfig/sg_config.yml
   SUCC: Configuration for 'config' created or updated
Will update 'roles' with /opt/app-root/src/sgconfig/sg_roles.yml
   SUCC: Configuration for 'roles' created or updated
Will update 'rolesmapping' with /opt/app-root/src/sgconfig/sg_roles_mapping.yml
   SUCC: Configuration for 'rolesmapping' created or updated
Will update 'internalusers' with /opt/app-root/src/sgconfig/sg_internal_users.yml
   SUCC: Configuration for 'internalusers' created or updated
Will update 'actiongroups' with /opt/app-root/src/sgconfig/sg_action_groups.yml
   SUCC: Configuration for 'actiongroups' created or updated
Done with success

Fixed, thanks.

___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users
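The fix above required running sgadmin in both ES pods at the same time and rerunning it wherever it failed. That coordination can be sketched with a small shell helper. This is not part of the logging image: the `run_sgadmin` wrapper and its `oc exec` wiring are assumptions for illustration, though the sgadmin flags themselves are the ones quoted above.

```shell
# Run a command once per pod, all in parallel, then print the pods whose
# run failed so they can be retried together.
run_in_all() {
  local cmd="$1"; shift
  local pids=() pods=() failed=()
  for pod in "$@"; do
    "$cmd" "$pod" &                 # launch one run per pod, concurrently
    pids+=("$!"); pods+=("$pod")
  done
  local i
  for i in "${!pids[@]}"; do
    wait "${pids[$i]}" || failed+=("${pods[$i]}")   # collect the failures
  done
  echo "${failed[@]}"
}

# Hypothetical wrapper: run the sgadmin command from this thread inside
# pod "$1" (flags copied from the invocation quoted above).
run_sgadmin() {
  oc exec "$1" -- bash -c '/usr/share/elasticsearch/plugins/search-guard-2/tools/sgadmin.sh \
    -cd ${HOME}/sgconfig -i .searchguard.${HOSTNAME} \
    -ks /etc/elasticsearch/secret/searchguard.key -kst JKS -kspass kspass \
    -ts /etc/elasticsearch/secret/searchguard.truststore -tst JKS -tspass tspass \
    -nhnv -icl'
}
```

For example, `run_in_all run_sgadmin <pod-1> <pod-2>` launches both runs at once and prints the pods that still need a rerun; repeat until it prints nothing.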
On Wed, Jul 12, 2017 at 9:28 AM, Stéphane Klein wrote:
>> I found that I had to run the sgadmin script in each ES pod at the same
>> time, and when one succeeds and one fails, just run it again and it worked.
>
> Ok, I'll try that, how can I execute sgadmin script manually ?

You can see it in the run.sh script in each pod, look for the invocation of sgadmin there.

-peter
2017-07-12 15:20 GMT+02:00 Peter Portante:
> This looks a lot like this BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1449378,
> "Timeout after 30SECONDS while retrieving configuration"
>
> What version of Origin are you using?

Logging image: origin-logging-elasticsearch:v1.5.0

$ oc version
oc v1.4.1+3f9807a
kubernetes v1.4.0+776c994
features: Basic-Auth

Server https://console.tech-angels.net:443
openshift v1.5.0+031cbe4
kubernetes v1.5.2+43a9be4

(and with 1.4 nodes because of this crazy bug: https://github.com/openshift/origin/issues/14092)

> I found that I had to run the sgadmin script in each ES pod at the same
> time, and when one succeeds and one fails, just run it again and it worked.

Ok, I'll try that. How can I execute the sgadmin script manually?

Best regards,
Stéphane
This looks a lot like this BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1449378, "Timeout after 30SECONDS while retrieving configuration"

What version of Origin are you using?

I found that I had to run the sgadmin script in each ES pod at the same time, and when one succeeds and one fails, just run it again and it worked.

It seems to have to do with the sgadmin script trying to be sure that all nodes can see the searchguard index; but since we create one per node, if another node does not have searchguard successfully set up, the current node's setup will fail. Retrying at the same time until they all work seems to be the fix. :(

-peter

On Wed, Jul 12, 2017 at 9:03 AM, Stéphane Klein wrote:
> Hi,
>
> Since one day, after ES cluster pods restart, I have this error message
> when I launch logging-es:
> [...]
[Logging] searchguard configuration issue? ["warning", "elasticsearch"], "pid":1, "message":"Unable to revive connection: https://logging-es:9200/"}
Hi,

For about a day now, after the ES cluster pods restart, I get this error message when I launch logging-es:

$ oc logs -f logging-es-ne81bsny-5-jdcdk
Comparing the specificed RAM to the maximum recommended for ElasticSearch...
Inspecting the maximum RAM available...
ES_JAVA_OPTS: '-Dmapper.allow_dots_in_name=true -Xms128M -Xmx4096m'
Checking if Elasticsearch is ready on https://localhost:9200
..Will connect to localhost:9300 ... done
Contacting elasticsearch cluster 'elasticsearch' and wait for YELLOW clusterstate ...
Clustername: logging-es
Clusterstate: YELLOW
Number of nodes: 2
Number of data nodes: 2
.searchguard.logging-es-ne81bsny-5-jdcdk index does not exists, attempt to create it ...
done (with 1 replicas, auto expand replicas is off)
Populate config from /opt/app-root/src/sgconfig/
Will update 'config' with /opt/app-root/src/sgconfig/sg_config.yml
   SUCC: Configuration for 'config' created or updated
Will update 'roles' with /opt/app-root/src/sgconfig/sg_roles.yml
   SUCC: Configuration for 'roles' created or updated
Will update 'rolesmapping' with /opt/app-root/src/sgconfig/sg_roles_mapping.yml
   SUCC: Configuration for 'rolesmapping' created or updated
Will update 'internalusers' with /opt/app-root/src/sgconfig/sg_internal_users.yml
   SUCC: Configuration for 'internalusers' created or updated
Will update 'actiongroups' with /opt/app-root/src/sgconfig/sg_action_groups.yml
   SUCC: Configuration for 'actiongroups' created or updated
Timeout (java.util.concurrent.TimeoutException: Timeout after 30SECONDS while retrieving configuration for [config, roles, rolesmapping, internalusers, actiongroups](index=.searchguard.logging-es-x39myqbs-1-s5g7c))
Done with failures

After some time, my ES cluster (2 nodes) is green:

stephane$ oc rsh logging-es-x39myqbs-1-s5g7c bash
$ curl [...] /etc/elasticsearch/secret/admin-cert https://localhost:9200/_cluster/health?pretty=true
{
  "cluster_name" : "logging-es",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 1643,
  "active_shards" : 3286,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

I have this error in the kibana container:

$ oc logs -f -c kibana logging-kibana-1-jblhl
{"type":"log","@timestamp":"2017-07-12T12:54:54Z","tags":["warning","elasticsearch"],"pid":1,"message":"No living connections"}
{"type":"log","@timestamp":"2017-07-12T12:54:57Z","tags":["warning","elasticsearch"],"pid":1,"message":"Unable to revive connection: https://logging-es:9200/"}

But from the Kibana container I can reach the elasticsearch server:

$ oc rsh -c kibana logging-kibana-1-jblhl bash
$ curl https://logging-es:9200/ --cacert /etc/kibana/keys/ca --key /etc/kibana/keys/key --cert /etc/kibana/keys/cert
{
  "name" : "Adri Nital",
  "cluster_name" : "logging-es",
  "cluster_uuid" : "iRo3wOHWSq2bTZskrIs6Zg",
  "version" : {
    "number" : "2.4.4",
    "build_hash" : "fcbb46dfd45562a9cf00c604b30849a6dec6b017",
    "build_timestamp" : "2017-01-03T11:33:16Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.2"
  },
  "tagline" : "You Know, for Search"
}

How can I fix this error?

Best regards,
Stéphane
--
Stéphane Klein
blog: http://stephane-klein.info
cv: http://cv.stephane-klein.info
Twitter: http://twitter.com/klein_stephane