Re: [Logging] searchguard configuration issue? ["warning", "elasticsearch"], "pid":1, "message":"Unable to revive connection: https://logging-es:9200/"}

2017-07-12 Thread Stéphane Klein
2017-07-12 15:41 GMT+02:00 Peter Portante :

>
>
> On Wed, Jul 12, 2017 at 9:28 AM, Stéphane Klein <
> cont...@stephane-klein.info> wrote:
>
>>
>> 2017-07-12 15:20 GMT+02:00 Peter Portante :
>>
>>> This looks a lot like this BZ:
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1449378, "Timeout after
>>> 30SECONDS while retrieving configuration"
>>>
>>> What version of Origin are you using?
>>>
>>>
>> Logging image : origin-logging-elasticsearch:v1.5.0
>>
>> $ oc version
>> oc v1.4.1+3f9807a
>> kubernetes v1.4.0+776c994
>> features: Basic-Auth
>>
>> Server https://console.tech-angels.net:443
>> openshift v1.5.0+031cbe4
>> kubernetes v1.5.2+43a9be4
>>
>> (and with 1.4 nodes because of this crazy bug:
>> https://github.com/openshift/origin/issues/14092)
>>
>>
>>> I found that I had to run the sgadmin script in each ES pod at the same
>>> time, and when one succeeds and one fails, just run it again and it worked.
>>>
>>>
>> OK, I'll try that. How can I execute the sgadmin script manually?
>>
>
> You can see it in the run.sh script in each pod; look for the invocation
> of sgadmin there.
>
>
OK, I ran:

/usr/share/elasticsearch/plugins/search-guard-2/tools/sgadmin.sh \
-cd ${HOME}/sgconfig \
-i .searchguard.${HOSTNAME} \
-ks /etc/elasticsearch/secret/searchguard.key \
-kst JKS \
-kspass kspass \
-ts /etc/elasticsearch/secret/searchguard.truststore \
-tst JKS \
-tspass tspass \
-nhnv \
-icl

I ran it on ES nodes 1 and 2 at the same time, but I had to run it a
second time on node 2.

Now I have this message:

Will connect to localhost:9300 ... done
Contacting elasticsearch cluster 'elasticsearch' and wait for YELLOW clusterstate ...
Clustername: logging-es
Clusterstate: GREEN
Number of nodes: 2
Number of data nodes: 2
.searchguard.logging-es-x39myqbs-1-s5g7c index already exists, so we do not need to create one.
Populate config from /opt/app-root/src/sgconfig/
Will update 'config' with /opt/app-root/src/sgconfig/sg_config.yml
   SUCC: Configuration for 'config' created or updated
Will update 'roles' with /opt/app-root/src/sgconfig/sg_roles.yml
   SUCC: Configuration for 'roles' created or updated
Will update 'rolesmapping' with /opt/app-root/src/sgconfig/sg_roles_mapping.yml
   SUCC: Configuration for 'rolesmapping' created or updated
Will update 'internalusers' with /opt/app-root/src/sgconfig/sg_internal_users.yml
   SUCC: Configuration for 'internalusers' created or updated
Will update 'actiongroups' with /opt/app-root/src/sgconfig/sg_action_groups.yml
   SUCC: Configuration for 'actiongroups' created or updated
Done with success

Fixed, thanks.
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: [Logging] searchguard configuration issue? ["warning", "elasticsearch"], "pid":1, "message":"Unable to revive connection: https://logging-es:9200/"}

2017-07-12 Thread Peter Portante
On Wed, Jul 12, 2017 at 9:28 AM, Stéphane Klein  wrote:

>
> 2017-07-12 15:20 GMT+02:00 Peter Portante :
>
>> This looks a lot like this BZ:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1449378, "Timeout after
>> 30SECONDS while retrieving configuration"
>>
>> What version of Origin are you using?
>>
>>
> Logging image : origin-logging-elasticsearch:v1.5.0
>
> $ oc version
> oc v1.4.1+3f9807a
> kubernetes v1.4.0+776c994
> features: Basic-Auth
>
> Server https://console.tech-angels.net:443
> openshift v1.5.0+031cbe4
> kubernetes v1.5.2+43a9be4
>
> (and with 1.4 nodes because of this crazy bug:
> https://github.com/openshift/origin/issues/14092)
>
>
>> I found that I had to run the sgadmin script in each ES pod at the same
>> time, and when one succeeds and one fails, just run it again and it worked.
>>
>>
> OK, I'll try that. How can I execute the sgadmin script manually?
>

You can see it in the run.sh script in each pod; look for the invocation
of sgadmin there.

-peter



>
> Best regards,
> Stéphane
>


Re: [Logging] searchguard configuration issue? ["warning", "elasticsearch"], "pid":1, "message":"Unable to revive connection: https://logging-es:9200/"}

2017-07-12 Thread Stéphane Klein
2017-07-12 15:20 GMT+02:00 Peter Portante :

> This looks a lot like this BZ:
> https://bugzilla.redhat.com/show_bug.cgi?id=1449378, "Timeout after
> 30SECONDS while retrieving configuration"
>
> What version of Origin are you using?
>
>
Logging image : origin-logging-elasticsearch:v1.5.0

$ oc version
oc v1.4.1+3f9807a
kubernetes v1.4.0+776c994
features: Basic-Auth

Server https://console.tech-angels.net:443
openshift v1.5.0+031cbe4
kubernetes v1.5.2+43a9be4

(and with 1.4 nodes because of this crazy bug:
https://github.com/openshift/origin/issues/14092)


> I found that I had to run the sgadmin script in each ES pod at the same
> time, and when one succeeds and one fails, just run it again and it worked.
>
>
OK, I'll try that. How can I execute the sgadmin script manually?

Best regards,
Stéphane


Re: [Logging] searchguard configuration issue? ["warning", "elasticsearch"], "pid":1, "message":"Unable to revive connection: https://logging-es:9200/"}

2017-07-12 Thread Peter Portante
This looks a lot like this BZ:
https://bugzilla.redhat.com/show_bug.cgi?id=1449378, "Timeout after
30SECONDS while retrieving configuration"

What version of Origin are you using?

I found that I had to run the sgadmin script in each ES pod at the same
time, and when one succeeds and one fails, just run it again and it worked.

It seems to be caused by the sgadmin script trying to make sure that all
nodes can see the searchguard index. Since we create one index per node, if
another node does not have searchguard set up successfully, the current
node's setup will fail.  Retrying on all nodes at the same time until they
all succeed seems to be the fix. :(
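
The "run it everywhere at once and retry" workaround described above could be scripted roughly as follows. This is only a sketch under several assumptions: the pod names are passed in by hand, `oc exec` is used to reach each pod, success is detected by looking for the "Done with success" line that sgadmin prints in this thread (rather than relying on its exit code), and the function names, the attempt limit and the delay are all made up for illustration. The sgadmin arguments are the ones quoted elsewhere in this thread.

```shell
#!/bin/sh
# Sketch only: run sgadmin in several ES pods at once and retry each one
# until its output contains "Done with success".

# The sgadmin invocation quoted in this thread. Single quotes keep ${HOME}
# and ${HOSTNAME} unexpanded here so that they are resolved inside the pod.
SGADMIN_CMD='/usr/share/elasticsearch/plugins/search-guard-2/tools/sgadmin.sh \
    -cd ${HOME}/sgconfig \
    -i .searchguard.${HOSTNAME} \
    -ks /etc/elasticsearch/secret/searchguard.key -kst JKS -kspass kspass \
    -ts /etc/elasticsearch/secret/searchguard.truststore -tst JKS -tspass tspass \
    -nhnv -icl'

# Run sgadmin once in the given pod; succeed only if it reports success.
run_sgadmin_in_pod() {
    local pod="$1"
    local output
    output=$(oc exec "$pod" -- bash -c "$SGADMIN_CMD" 2>&1) || true
    printf '%s\n' "$output"
    case "$output" in
        *"Done with success"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Call a command until it succeeds: at most $1 attempts, $2 seconds apart.
retry() {
    local max_attempts="$1"
    local delay="$2"
    shift 2
    local attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Start one retry loop per pod in the background, then wait for all of them.
# Returns non-zero if any pod never reported success.
run_sgadmin_everywhere() {
    local pids=""
    local status=0
    local pod pid
    for pod in "$@"; do
        retry "${SGADMIN_MAX_ATTEMPTS:-5}" "${SGADMIN_RETRY_DELAY:-10}" \
            run_sgadmin_in_pod "$pod" &
        pids="$pids $!"
    done
    for pid in $pids; do
        wait "$pid" || status=1
    done
    return "$status"
}
```

With the two pods from this thread it would be called as `run_sgadmin_everywhere logging-es-ne81bsny-5-jdcdk logging-es-x39myqbs-1-s5g7c`. Output from the pods is interleaved, which is usually good enough for a one-off repair.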

-peter

On Wed, Jul 12, 2017 at 9:03 AM, Stéphane Klein  wrote:

> Hi,
>
> For the past day, since the ES cluster pods restarted, I have been getting
> this error message when logging-es starts:
>
> $ oc logs -f logging-es-ne81bsny-5-jdcdk
> Comparing the specificed RAM to the maximum recommended for
> ElasticSearch...
> Inspecting the maximum RAM available...
> ES_JAVA_OPTS: '-Dmapper.allow_dots_in_name=true -Xms128M -Xmx4096m'
> Checking if Elasticsearch is ready on https://localhost:9200
> ..Will connect to localhost:9300 ...
> done
> Contacting elasticsearch cluster 'elasticsearch' and wait for YELLOW clusterstate ...
> Clustername: logging-es
> Clusterstate: YELLOW
> Number of nodes: 2
> Number of data nodes: 2
> .searchguard.logging-es-ne81bsny-5-jdcdk index does not exists, attempt to create it ... done (with 1 replicas, auto expand replicas is off)
> Populate config from /opt/app-root/src/sgconfig/
> Will update 'config' with /opt/app-root/src/sgconfig/sg_config.yml
>    SUCC: Configuration for 'config' created or updated
> Will update 'roles' with /opt/app-root/src/sgconfig/sg_roles.yml
>    SUCC: Configuration for 'roles' created or updated
> Will update 'rolesmapping' with /opt/app-root/src/sgconfig/sg_roles_mapping.yml
>    SUCC: Configuration for 'rolesmapping' created or updated
> Will update 'internalusers' with /opt/app-root/src/sgconfig/sg_internal_users.yml
>    SUCC: Configuration for 'internalusers' created or updated
> Will update 'actiongroups' with /opt/app-root/src/sgconfig/sg_action_groups.yml
>    SUCC: Configuration for 'actiongroups' created or updated
> Timeout (java.util.concurrent.TimeoutException: Timeout after 30SECONDS while retrieving configuration for [config, roles, rolesmapping, internalusers, actiongroups](index=.searchguard.logging-es-x39myqbs-1-s5g7c))
> Done with failures
>
> After some time, my ES cluster (2 nodes) is green:
>
> stephane$ oc rsh logging-es-x39myqbs-1-s5g7c bash
> [...]asticsearch/secret/admin-cert https://localhost:9200/_cluster/health?pretty=true
> {
>   "cluster_name" : "logging-es",
>   "status" : "green",
>   "timed_out" : false,
>   "number_of_nodes" : 2,
>   "number_of_data_nodes" : 2,
>   "active_primary_shards" : 1643,
>   "active_shards" : 3286,
>   "relocating_shards" : 0,
>   "initializing_shards" : 0,
>   "unassigned_shards" : 0,
>   "delayed_unassigned_shards" : 0,
>   "number_of_pending_tasks" : 0,
>   "number_of_in_flight_fetch" : 0,
>   "task_max_waiting_in_queue_millis" : 0,
>   "active_shards_percent_as_number" : 100.0
> }
>
> I see this error in the Kibana container:
>
> $ oc logs -f -c kibana logging-kibana-1-jblhl
> {"type":"log","@timestamp":"2017-07-12T12:54:54Z","tags":["warning","elasticsearch"],"pid":1,"message":"No living connections"}
> {"type":"log","@timestamp":"2017-07-12T12:54:57Z","tags":["warning","elasticsearch"],"pid":1,"message":"Unable to revive connection: https://logging-es:9200/"}
>
> But from inside the Kibana container I can reach the Elasticsearch server:
>
> $ oc rsh -c kibana logging-kibana-1-jblhl bash
> $ curl https://logging-es:9200/ --cacert /etc/kibana/keys/ca \
>     --key /etc/kibana/keys/key --cert /etc/kibana/keys/cert
> {
>   "name" : "Adri Nital",
>   "cluster_name" : "logging-es",
>   "cluster_uuid" : "iRo3wOHWSq2bTZskrIs6Zg",
>   "version" : {
>     "number" : "2.4.4",
>     "build_hash" : "fcbb46dfd45562a9cf00c604b30849a6dec6b017",
>     "build_timestamp" : "2017-01-03T11:33:16Z",
>     "build_snapshot" : false,
>     "lucene_version" : "5.5.2"
>   },
>   "tagline" : "You Know, for Search"
> }
>
> How can I fix this error?
>
> Best regards,
> Stéphane
> --
> Stéphane Klein 
> blog: http://stephane-klein.info
> cv : http://cv.stephane-klein.info
> Twitter: http://twitter.com/klein_stephane
>


[Logging] searchguard configuration issue? ["warning", "elasticsearch"], "pid":1, "message":"Unable to revive connection: https://logging-es:9200/"}

2017-07-12 Thread Stéphane Klein
Hi,

For the past day, since the ES cluster pods restarted, I have been getting
this error message when logging-es starts:

$ oc logs -f logging-es-ne81bsny-5-jdcdk
Comparing the specificed RAM to the maximum recommended for ElasticSearch...
Inspecting the maximum RAM available...
ES_JAVA_OPTS: '-Dmapper.allow_dots_in_name=true -Xms128M -Xmx4096m'
Checking if Elasticsearch is ready on https://localhost:9200
..Will connect to localhost:9300 ...
done
Contacting elasticsearch cluster 'elasticsearch' and wait for YELLOW clusterstate ...
Clustername: logging-es
Clusterstate: YELLOW
Number of nodes: 2
Number of data nodes: 2
.searchguard.logging-es-ne81bsny-5-jdcdk index does not exists, attempt to create it ... done (with 1 replicas, auto expand replicas is off)
Populate config from /opt/app-root/src/sgconfig/
Will update 'config' with /opt/app-root/src/sgconfig/sg_config.yml
   SUCC: Configuration for 'config' created or updated
Will update 'roles' with /opt/app-root/src/sgconfig/sg_roles.yml
   SUCC: Configuration for 'roles' created or updated
Will update 'rolesmapping' with /opt/app-root/src/sgconfig/sg_roles_mapping.yml
   SUCC: Configuration for 'rolesmapping' created or updated
Will update 'internalusers' with /opt/app-root/src/sgconfig/sg_internal_users.yml
   SUCC: Configuration for 'internalusers' created or updated
Will update 'actiongroups' with /opt/app-root/src/sgconfig/sg_action_groups.yml
   SUCC: Configuration for 'actiongroups' created or updated
Timeout (java.util.concurrent.TimeoutException: Timeout after 30SECONDS while retrieving configuration for [config, roles, rolesmapping, internalusers, actiongroups](index=.searchguard.logging-es-x39myqbs-1-s5g7c))
Done with failures

After some time, my ES cluster (2 nodes) is green:

stephane$ oc rsh logging-es-x39myqbs-1-s5g7c bash
[...]asticsearch/secret/admin-cert https://localhost:9200/_cluster/health?pretty=true
{
  "cluster_name" : "logging-es",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 1643,
  "active_shards" : 3286,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
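
For scripting, the "status" field of a health response like the one above can be pulled out with standard tools. The sketch below assumes the JSON is piped in on stdin; the function names are hypothetical and the pattern is a plain text match, not a real JSON parser. As this thread suggests, a green cluster status does not by itself mean the Search Guard configuration was loaded, so it is only one of the things to check.

```shell
#!/bin/sh
# Print the value of the first "status" field found in a _cluster/health
# response read from stdin (handles the pretty-printed form shown above).
cluster_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Succeed only when the health response on stdin reports "green".
cluster_is_green() {
    [ "$(cluster_status)" = "green" ]
}
```

The health request itself would be piped into `cluster_status` or `cluster_is_green`, using whatever certificate options the curl call inside the pod needs.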

I see this error in the Kibana container:

$ oc logs -f -c kibana logging-kibana-1-jblhl
{"type":"log","@timestamp":"2017-07-12T12:54:54Z","tags":["warning","elasticsearch"],"pid":1,"message":"No living connections"}
{"type":"log","@timestamp":"2017-07-12T12:54:57Z","tags":["warning","elasticsearch"],"pid":1,"message":"Unable to revive connection: https://logging-es:9200/"}

But from inside the Kibana container I can reach the Elasticsearch server:

$ oc rsh -c kibana logging-kibana-1-jblhl bash
$ curl https://logging-es:9200/ --cacert /etc/kibana/keys/ca \
    --key /etc/kibana/keys/key --cert /etc/kibana/keys/cert
{
  "name" : "Adri Nital",
  "cluster_name" : "logging-es",
  "cluster_uuid" : "iRo3wOHWSq2bTZskrIs6Zg",
  "version" : {
    "number" : "2.4.4",
    "build_hash" : "fcbb46dfd45562a9cf00c604b30849a6dec6b017",
    "build_timestamp" : "2017-01-03T11:33:16Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.2"
  },
  "tagline" : "You Know, for Search"
}

How can I fix this error?

Best regards,
Stéphane
-- 
Stéphane Klein 
blog: http://stephane-klein.info
cv : http://cv.stephane-klein.info
Twitter: http://twitter.com/klein_stephane