Re: Metrics - Could not connect to Cassandra cluster

2016-10-28 Thread Philippe Lafoucrière
The problem is that we removed everything in the project and tried to
reinstall with Ansible.
Since we can't deploy the metrics (even with the v1.2.1 image; note that
v1.2.2 doesn't exist:
https://hub.docker.com/r/openshift/origin-metrics-deployer/tags/), we're
screwed :(
We'll try this on another 1.3.1 cluster.

Thanks for the tip anyway!
___
users mailing list
users@lists.openshift.redhat.com
http://lists.openshift.redhat.com/openshiftmm/listinfo/users


Re: Node triggered evacuation

2016-10-28 Thread Clayton Coleman
Only via the API on the masters.  I do not think it is unreasonable
that you'd be able to do so via the node client's credentials, but
policy may not allow the node to gather all the info that node drain
and manage-node use.

Try openshift admin manage-node ... --config=PATH_TO_NODE_CRED
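
A minimal sketch of how that might be scripted from the node itself. It
assumes that manage-node accepts `--schedulable=false` and `--evacuate`
in your release, and that policy actually lets the node credentials do
this, which (as noted above) it may not. The function names and the
dry-run default are hypothetical; nothing is executed unless you ask
for it.

```python
import subprocess


def manage_node_commands(node_name, node_cred_path):
    """Build the two manage-node invocations used to empty a node.

    Returns a list of argument lists; nothing is executed here.
    """
    base = ["openshift", "admin", "manage-node", node_name,
            "--config=" + node_cred_path]
    return [
        # Step 1: stop new pods from being scheduled onto the node.
        base + ["--schedulable=false"],
        # Step 2: move the existing pods off the node.
        base + ["--evacuate"],
    ]


def evacuate(node_name, node_cred_path, dry_run=True):
    """Print each command; run it only when dry_run is False."""
    for cmd in manage_node_commands(node_name, node_cred_path):
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises if the command exits non-zero,
            # e.g. because policy denies the node credentials.
            subprocess.run(cmd, check=True)
```

Calling `evacuate("node1.example.com", PATH_TO_NODE_CRED)` with the
default dry_run=True only prints the two commands, so you can check
them before running anything for real.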

> On Oct 28, 2016, at 2:09 AM, Andrew Lau wrote:
>
> Hi,
>
> Is there any facility to trigger a node evacuation from within the node? eg. 
> if we are in the console of a particular node or the node receives a signal 
> (eg. spot termination notice).
>
> Thanks


Re: Node triggered evacuation

2016-10-28 Thread Alex Wauck
Non-master nodes don't seem to have any built-in privileges.  I suggest
creating a service account with the necessary permissions and using its
token to perform actions on the nodes.
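
A rough sketch of the token approach, assuming the service account has
been granted permission to update nodes. It only builds a request that
marks the node unschedulable through the Kubernetes API (a PATCH on
/api/v1/nodes/<name>); moving the existing pods off the node would be a
separate step. The function name, master URL, and token variable are
hypothetical placeholders.

```python
import json
import urllib.request


def build_unschedulable_request(api_url, node_name, token):
    """Build (but do not send) a PATCH that marks a node unschedulable."""
    body = json.dumps({"spec": {"unschedulable": True}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{api_url}/api/v1/nodes/{node_name}",
        data=body,
        method="PATCH",
        headers={
            # The service account token is sent as a bearer token.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/strategic-merge-patch+json",
        },
    )
```

To actually send it you would pass the request to
urllib.request.urlopen() with an SSL context that trusts the master's
CA certificate.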

On Fri, Oct 28, 2016 at 1:07 AM, Andrew Lau wrote:

> Hi,
>
> Is there any facility to trigger a node evacuation from within the node?
> eg. if we are in the console of a particular node or the node receives a
> signal (eg. spot termination notice).
>
> Thanks


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com*

Making Machines More Human.


Re: Metrics - Could not connect to Cassandra cluster

2016-10-28 Thread Alex Wauck
This happened to us.  The problem is probably that you have your metrics
replication controllers set to pull the latest versions of the images.  (I
think this is the default.  Bad!)  The current latest version needs
different configuration, so your existing configuration no longer works.
You probably had this problem for a long time but didn't notice until some
component of the system restarted for some reason, triggering a new image
pull.

We fixed this by changing the images specified in the replication
controllers.  For example, in rc/hawkular-metrics, we changed

image: openshift/origin-metrics-hawkular-metrics:latest

to

image: openshift/origin-metrics-hawkular-metrics:v1.2.1
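
If you want to script that substitution rather than edit each
replication controller by hand, the tag rewrite itself can be sketched
like this. The helper name is hypothetical, and it only handles
name:tag references, not images pinned by digest (@sha256:...).

```python
def pin_image_tag(image, tag):
    """Return `image` with its tag replaced by (or set to) `tag`."""
    # Only look for a tag after the last "/", so a registry port such as
    # "registry.example.com:5000/..." is not mistaken for a tag.
    prefix, slash, name = image.rpartition("/")
    name = name.split(":", 1)[0]
    return f"{prefix}{slash}{name}:{tag}"


if __name__ == "__main__":
    # Prints openshift/origin-metrics-hawkular-metrics:v1.2.1
    print(pin_image_tag(
        "openshift/origin-metrics-hawkular-metrics:latest", "v1.2.1"))
```

Applying the result to the replication controller (for example with
oc edit) is still a separate step.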

While I was debugging, I restarted hawkular-cassandra, so it got upgraded,
too.  (I don't know whether it had already been upgraded before that; if
yours hasn't been, you can avoid losing data.)  As a result, I had to set
the :v1.2.1 tag on all three components (hawkular-cassandra,
hawkular-metrics, and heapster) and also delete all data (both the data
directory and the commitlog directory) on the hawkular-cassandra PV.
Because hawkular-cassandra was failing, I was unable to use `oc rsh` to get
in, so to delete that data I had to find the mountpoint on the node where
the hawkular-cassandra pod was running and delete the files from the host
side.
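
For what it's worth, the host-side cleanup could be sketched roughly as
below. This is an illustration, not exactly what I ran: it assumes the
two directories are named "data" and "commitlog" directly under the PV
mountpoint, it empties them rather than removing the directories
themselves, and it defaults to a dry run because the deletion is
irreversible. The function name is hypothetical.

```python
import shutil
from pathlib import Path


def clear_cassandra_dirs(mount_point, subdirs=("data", "commitlog"),
                         dry_run=True):
    """Empty the given subdirectories of the volume at mount_point.

    Returns the paths that were (or, in dry-run mode, would be) removed.
    """
    removed = []
    for sub in subdirs:
        directory = Path(mount_point) / sub
        if not directory.is_dir():
            continue  # nothing to clean for this subdirectory
        for entry in sorted(directory.iterdir()):
            removed.append(entry)
            if dry_run:
                continue
            if entry.is_dir() and not entry.is_symlink():
                shutil.rmtree(entry)
            else:
                entry.unlink()
    return removed
```

Run it once with the default dry_run=True and read the returned list
before calling it again with dry_run=False.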

On Sat, Oct 22, 2016 at 2:32 PM, Miloslav Vlach wrote:

> Hi,
>
> I don't know why one server has a problem connecting to the Cassandra
> database.
>
> Hawkular writes:
>
> 19:27:15,354 WARN [org.hawkular.alerts.engine.impl.CassCluster]
> (ServerService Thread Pool -- 75) Could not connect to Cassandra cluster -
> assuming is not up yet. Cause: 
> com.datastax.driver.core.exceptions.NoHostAvailableException:
> All host(s) tried for query failed (tried: /127.0.0.1:9042
> (com.datastax.driver.core.exceptions.TransportException: [/127.0.0.1]
> Cannot connect))
>
>
> But the endpoint is not 127.0.0.1:9042.
>
> On the other server, outside the cluster:
>
> 19:26:54,909 WARN [org.hawkular.metrics.api.jaxrs.MetricsServiceLifecycle]
> (metricsservice-lifecycle-thread) HAWKMETRICS23: Could not connect to
> Cassandra cluster - assuming its not up yet: All host(s) tried for query
> failed (tried: hawkular-cassandra/172.30.155.228:9042
> (com.datastax.driver.core.exceptions.TransportException:
> [hawkular-cassandra/172.30.155.228] Cannot connect))
>
> but after a few seconds it connects to Cassandra.
>
> Does anybody know where the problem is?
>
> Installation was performed via Ansible. Everything worked before the
> restart.
>
> Thanks Mila


-- 

Alex Wauck // DevOps Engineer

*E X O S I T E*
*www.exosite.com*

Making Machines More Human.


Node triggered evacuation

2016-10-28 Thread Andrew Lau
Hi,

Is there any facility to trigger a node evacuation from within the node?
eg. if we are in the console of a particular node or the node receives a
signal (eg. spot termination notice).

Thanks