This looks a lot like this BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=1449378, "Timeout after
30SECONDS while retrieving configuration"

What version of Origin are you using?

I found that I had to run the sgadmin script in each ES pod at the same
time. When it succeeded in one pod and failed in the other, I just ran it
again in both and it worked.

It seems to come from the sgadmin script trying to make sure that all nodes
can see the searchguard index. Since we create one index per node, if
another node does not have searchguard set up successfully, the current
node's setup will fail.  Retrying on all nodes at the same time until they
all succeed seems to be the fix. :(
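
A rough sketch of that "retry everywhere at once" loop, in case it helps.
Everything named here is my own placeholder: run_setup, retry_all, SETUP_CMD
and RETRY_DELAY are hypothetical, and SETUP_CMD must be set to whatever
sgadmin invocation your image actually uses (I am not spelling it out here).
The only real command assumed is `oc exec POD -- CMD`.

```shell
# Run the setup command inside one pod.
# SETUP_CMD is a placeholder for the real sgadmin invocation (assumption).
run_setup() {
    oc exec "$1" -- sh -c "$SETUP_CMD"
}

# retry_all MAX_ROUNDS POD...
# Start the setup in every pod at the same time, wait for all of them,
# and repeat the whole round if any pod failed.
retry_all() {
    max_rounds=$1; shift
    round=1
    while [ "$round" -le "$max_rounds" ]; do
        pids=""
        for pod in "$@"; do
            run_setup "$pod" &          # all pods start concurrently
            pids="$pids $!"
        done
        failed=0
        for pid in $pids; do
            wait "$pid" || failed=1     # one failure means another round
        done
        if [ "$failed" -eq 0 ]; then
            echo "all pods succeeded in round $round"
            return 0
        fi
        round=$((round + 1))
        sleep "${RETRY_DELAY:-5}"       # pause between rounds (seconds)
    done
    echo "still failing after $max_rounds rounds" >&2
    return 1
}
```

You would call it with the names of your ES pods, for example
`retry_all 5 logging-es-ne81bsny-5-jdcdk logging-es-x39myqbs-1-s5g7c`.
Rerunning in the pods that already succeeded is deliberate, since each
node's run depends on the other nodes being set up at the same moment.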

-peter

On Wed, Jul 12, 2017 at 9:03 AM, Stéphane Klein <cont...@stephane-klein.info> wrote:

> Hi,
>
> For the past day, after the ES cluster pods restart, I have been getting
> this error message when I launch logging-es:
>
> $ oc logs -f logging-es-ne81bsny-5-jdcdk
> Comparing the specificed RAM to the maximum recommended for
> ElasticSearch...
> Inspecting the maximum RAM available...
> ES_JAVA_OPTS: '-Dmapper.allow_dots_in_name=true -Xms128M -Xmx4096m'
> Checking if Elasticsearch is ready on https://localhost:9200
> ......................................Will connect to localhost:9300 ...
> done
> Contacting elasticsearch cluster 'elasticsearch' and wait for YELLOW
> clusterstate ...
> Clustername: logging-es
> Clusterstate: YELLOW
> Number of nodes: 2
> Number of data nodes: 2
> .searchguard.logging-es-ne81bsny-5-jdcdk index does not exists, attempt
> to create it ... done (with 1 replicas, auto expand replicas is off)
> Populate config from /opt/app-root/src/sgconfig/
> Will update 'config' with /opt/app-root/src/sgconfig/sg_config.yml
>    SUCC: Configuration for 'config' created or updated
> Will update 'roles' with /opt/app-root/src/sgconfig/sg_roles.yml
>    SUCC: Configuration for 'roles' created or updated
> Will update 'rolesmapping' with /opt/app-root/src/sgconfig/sg_roles_mapping.yml
>    SUCC: Configuration for 'rolesmapping' created or updated
> Will update 'internalusers' with /opt/app-root/src/sgconfig/sg_internal_users.yml
>    SUCC: Configuration for 'internalusers' created or updated
> Will update 'actiongroups' with /opt/app-root/src/sgconfig/sg_action_groups.yml
>    SUCC: Configuration for 'actiongroups' created or updated
> Timeout (java.util.concurrent.TimeoutException: Timeout after 30SECONDS
> while retrieving configuration for [config, roles, rolesmapping,
> internalusers, actiongroups](index=.searchguard.logging-es-x39myqbs-1-s5g7c))
> Done with failures
>
> After some time, my ES cluster (2 nodes) is green:
>
> stephane$ oc rsh logging-es-x39myqbs-1-s5g7c bash
> [...]asticsearch/secret/admin-cert [...] https://localhost:9200/_cluster/health?pretty=true
> {
>   "cluster_name" : "logging-es",
>   "status" : "green",
>   "timed_out" : false,
>   "number_of_nodes" : 2,
>   "number_of_data_nodes" : 2,
>   "active_primary_shards" : 1643,
>   "active_shards" : 3286,
>   "relocating_shards" : 0,
>   "initializing_shards" : 0,
>   "unassigned_shards" : 0,
>   "delayed_unassigned_shards" : 0,
>   "number_of_pending_tasks" : 0,
>   "number_of_in_flight_fetch" : 0,
>   "task_max_waiting_in_queue_millis" : 0,
>   "active_shards_percent_as_number" : 100.0
> }
>
> I see this error in the Kibana container:
>
> $ oc logs -f -c kibana logging-kibana-1-jblhl
> {"type":"log","@timestamp":"2017-07-12T12:54:54Z","tags":["warning","elasticsearch"],"pid":1,"message":"No living connections"}
> {"type":"log","@timestamp":"2017-07-12T12:54:57Z","tags":["warning","elasticsearch"],"pid":1,"message":"Unable to revive connection: https://logging-es:9200/"}
>
> But from inside the Kibana container I can reach the Elasticsearch server:
>
> $ oc rsh -c kibana logging-kibana-1-jblhl bash
> $ curl https://logging-es:9200/ --cacert /etc/kibana/keys/ca --key /etc/kibana/keys/key --cert /etc/kibana/keys/cert
> {
>   "name" : "Adri Nital",
>   "cluster_name" : "logging-es",
>   "cluster_uuid" : "iRo3wOHWSq2bTZskrIs6Zg",
>   "version" : {
>     "number" : "2.4.4",
>     "build_hash" : "fcbb46dfd45562a9cf00c604b30849a6dec6b017",
>     "build_timestamp" : "2017-01-03T11:33:16Z",
>     "build_snapshot" : false,
>     "lucene_version" : "5.5.2"
>   },
>   "tagline" : "You Know, for Search"
> }
>
> How can I fix this error?
>
> Best regards,
> Stéphane
> --
> Stéphane Klein <cont...@stephane-klein.info>
> blog: http://stephane-klein.info
> cv : http://cv.stephane-klein.info
> Twitter: http://twitter.com/klein_stephane
>
> _______________________________________________
> users mailing list
> users@lists.openshift.redhat.com
> http://lists.openshift.redhat.com/openshiftmm/listinfo/users
>
>