----- Original Message ----- > From: "Srinivas Naga Kotaru (skotaru)" <skot...@cisco.com> > To: "Matt Wringe" <mwri...@redhat.com> > Cc: users@lists.openshift.redhat.com > Sent: Monday, June 13, 2016 7:26:06 PM > Subject: Re: Metrics deployment > > Matt > > PV issue resolved. I was able to see the PV bound successfully and the Cassandra > container is running. However, it seems the puzzle is not fully solved yet.
Are you sure the OpenShift DNS server is running? If you are running OSE 3.1, can you please follow this https://access.redhat.com/solutions/2329131 and see if you are now seeing errors in the Hawkular Metrics logs (essentially just run `oc exec hawkular-metrics-xxxxx cat /opt/eap/standalone/log/server.log`) > > I could see other container(heapster) not coming up, and seeing below errors > > [skotaru@l3imas-id2-01 metrics]$ oc logs -f heapster-fnkdc > Endpoint Check in effect. Checking > https://hawkular-metrics:443/hawkular/metrics/status > Could not connect to https://hawkular-metrics:443/hawkular/metrics/status. > Curl exit code: 6. Status Code 000 > 'https://hawkular-metrics:443/hawkular/metrics/status' is not accessible > [HTTP status code: 000. Curl exit code 6]. Retrying. > Could not connect to https://hawkular-metrics:443/hawkular/metrics/status. > Curl exit code: 6. Status Code 000 > 'https://hawkular-metrics:443/hawkular/metrics/status' is not accessible > [HTTP status code: 000. Curl exit code 6]. Retrying. > > > # oc get pv > pv-5gb-0011 5Gi RWO Bound > openshift-infra/metrics-cassandra-1 22m > > > $ oc get pods > NAME READY STATUS RESTARTS AGE > hawkular-cassandra-1-2pzd7 1/1 Running 0 20m > hawkular-metrics-mf5qf 0/1 Running 7 20m > heapster-fnkdc 0/1 Error 6 20m > metrics-deployer-cvep0 0/1 Completed 0 21m > > # oc logs -f hawkular-metrics-mf5qf > > 19:20:00,819 INFO [org.xnio] (MSC service thread 1-2) XNIO Version > 3.0.14.GA-redhat-1 > 19:20:00,831 INFO [org.jboss.as.server] (Controller Boot Thread) JBAS015888: > Creating http management service using socket-binding (management-http) > 19:20:00,834 INFO [org.xnio.nio] (MSC service thread 1-2) XNIO NIO > Implementation Version 3.0.14.GA-redhat-1 > 19:20:00,844 INFO [org.jboss.remoting] (MSC service thread 1-2) JBoss > Remoting version 3.3.5.Final-redhat-1 > > $ oc logs -f heapster-fnkdc > Endpoint Check in effect. 
Checking > https://hawkular-metrics:443/hawkular/metrics/status > Could not connect to https://hawkular-metrics:443/hawkular/metrics/status. > Curl exit code: 6. Status Code 000 > 'https://hawkular-metrics:443/hawkular/metrics/status' is not accessible > [HTTP status code: 000. Curl exit code 6]. Retrying. > Could not connect to https://hawkular-metrics:443/hawkular/metrics/status. > Curl exit code: 6. Status Code 000 > 'https://hawkular-metrics:443/hawkular/metrics/status' is not accessible > [HTTP status code: 000. Curl exit code 6]. Retrying. > Could not connect to https://hawkular-metrics:443/hawkular/metrics/status. > Curl exit code: 6. Status Code 000 > > $ oc logs -f hawkular-cassandra-1-2pzd7 > INFO 23:00:24 Starting listening for CQL clients on > hawkular-cassandra-1-2pzd7/10.1.6.2:9042... > INFO 23:00:24 Binding thrift service to > hawkular-cassandra-1-2pzd7/10.1.6.2:9160 > INFO 23:00:24 enabling encrypted thrift connections between client and > server > INFO 23:00:24 Listening for thrift clients... > INFO 23:00:26 Created default superuser role 'cassandra' > > # oc get svc > NAME CLUSTER-IP EXTERNAL-IP PORT(S) > AGE > hawkular-cassandra 172.30.2.13 <none> > 9042/TCP,9160/TCP,7000/TCP,7001/TCP 25m > hawkular-cassandra-nodes None <none> > 9042/TCP,9160/TCP,7000/TCP,7001/TCP 25m > hawkular-metrics 172.30.117.176 <none> 443/TCP > 25m > heapster 172.30.107.135 <none> 80/TCP > 25m > > #curl -I 172.30.117.176:443//hawkular/metrics/status > > HTTP/1.1 504 Gateway Timeout > Mime-Version: 1.0 > Date: Mon, 13 Jun 2016 23:25:47 GMT > Content-Type: text/html > Connection: keep-alive > Proxy-Connection: keep-alive > Content-Length: 1572 > > -- > Srinivas Kotaru > > On 6/13/16, 2:33 PM, "Srinivas Naga Kotaru (skotaru)" <skot...@cisco.com> > wrote: > > >Matt > > > >That is good catch. I ran without USE_PERSISTENT_STORAGE=false and working > > > >I adjusted PV to 5Gi and reran. Will update progress. > > > >Thanks you for your help so far. 
> > > >-- > >Srinivas Kotaru > > > >On 6/13/16, 2:27 PM, "Matt Wringe" <mwri...@redhat.com> wrote: > > > >> > >> > >>----- Original Message ----- > >>> From: "Srinivas Naga Kotaru (skotaru)" <skot...@cisco.com> > >>> To: "Matt Wringe" <mwri...@redhat.com> > >>> Cc: users@lists.openshift.redhat.com > >>> Sent: Monday, June 13, 2016 5:21:01 PM > >>> Subject: Re: Metrics deployment > >>> > >>> Oh ok > >>> > >>> Am using PV for metrics > >>> > >>> description: "The persistent volume size for each of the Cassandra nodes" > >>> name: CASSANDRA_PV_SIZE > >>> value: "10Gi" > >>> > >>> oc get pv > >>> NAME CAPACITY ACCESSMODES STATUS CLAIM > >>> REASON > >>> AGE > >>> pv-1gb-001 1Gi RWO Available > >>> 4d > >>> pv-1gb-002 1Gi RWO Available > >>> 4d > >>> pv-1gb-003 1Gi RWO Available > >>> 4d > >>> pv-1gb-004 1Gi RWO Bound thlatt/mongodb > >>> 4d > >>> pv-1gb-005 1Gi RWO Available > >>> 4d > >>> pv-2gb-0010 2Gi RWO Available > >>> 4d > >>> pv-2gb-006 2Gi RWO Available > >>> 4d > >>> pv-2gb-007 2Gi RWO Available > >>> 4d > >>> pv-2gb-008 2Gi RWO Available > >>> 4d > >>> pv-2gb-009 2Gi RWO Available > >>> 4d > >>> pv-5gb-0011 5Gi RWO Available > >>> 4d > >>> pv-5gb-0012 5Gi RWO Available > >>> 4d > >>> pv-5gb-0013 5Gi RWO Available > >>> 4d > >>> pv-5gb-0014 5Gi RWO Available > >>> 4d > >>> pv-5gb-0015 5Gi RWO Available > >>> 4d > >>> > >>> am running with below command > >>> > >>> $ oc new-app -f metrics-deployer.yaml ( hardcoded HOSTNAME, MASTER_API > >>> and > >>> PV info so not passing any parameters) > >>> > >> > >>I would suspect that Cassandra is blocked because it's waiting for a 10Gi PV > >>to become available, and none of the PVs listed above is big enough.
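The size mismatch Matt points out can be sketched in a few lines of Python. This is a simplified illustration, not OpenShift's actual matching code: the helper names `to_bytes` and `candidates` are hypothetical, the PV list is a subset of the `oc get pv` output quoted above, and access-mode matching is reduced to an exact string comparison.

```python
# Binary suffixes used by the capacities shown in the thread.
UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def to_bytes(quantity):
    """Convert a quantity such as '5Gi' to a number of bytes."""
    for suffix, factor in UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: treat as a plain byte count


def candidates(pvs, request, access_mode="RWO"):
    """Return the names of Available PVs that are large enough for the request."""
    needed = to_bytes(request)
    return [
        name
        for name, capacity, mode, status in pvs
        if status == "Available"
        and mode == access_mode  # simplification: exact match only
        and to_bytes(capacity) >= needed
    ]


# Subset of the PVs listed in the thread: (name, capacity, access mode, status).
PVS = [
    ("pv-1gb-001", "1Gi", "RWO", "Available"),
    ("pv-1gb-004", "1Gi", "RWO", "Bound"),
    ("pv-2gb-006", "2Gi", "RWO", "Available"),
    ("pv-5gb-0011", "5Gi", "RWO", "Available"),
    ("pv-5gb-0012", "5Gi", "RWO", "Available"),
]

print(candidates(PVS, "10Gi"))  # nothing is big enough for CASSANDRA_PV_SIZE=10Gi
print(candidates(PVS, "5Gi"))   # the 5Gi volumes qualify once the request is lowered
```

With the 10Gi request the list comes back empty, which matches the Cassandra pod staying in ContainerCreating; with a 5Gi request the 5Gi volumes qualify.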
> >> > >>> > >>> -- > >>> Srinivas Kotaru > >>> > >>> On 6/13/16, 2:12 PM, "Matt Wringe" <mwri...@redhat.com> wrote: > >>> > >>> >----- Original Message ----- > >>> >> From: "Srinivas Naga Kotaru (skotaru)" <skot...@cisco.com> > >>> >> To: "Matt Wringe" <mwri...@redhat.com> > >>> >> Cc: users@lists.openshift.redhat.com > >>> >> Sent: Monday, June 13, 2016 4:55:55 PM > >>> >> Subject: Re: Metrics deployment > >>> >> > >>> >> Matt > >>> >> > >>> >> Thanks for looking into this. I reran the setup, but had the same issue > >>> >> > >>> >> # oc get pods > >>> >> NAME READY STATUS RESTARTS > >>> >> AGE > >>> >> hawkular-cassandra-1-y2egy 0/1 ContainerCreating 0 > >>> >> 5m > >>> >> hawkular-metrics-4b16f 0/1 Running 1 > >>> >> 4m > >>> >> heapster-x2gj2 0/1 Running 2 > >>> >> 4m > >>> >> metrics-deployer-9v7vc 0/1 Completed 0 > >>> >> 6m > >>> >> > >>> >> $ oc logs -f hawkular-cassandra-1-y2egy > >>> >> Error from server: container "hawkular-cassandra-1" in pod > >>> >> "hawkular-cassandra-1-y2egy" is waiting to start: ContainerCreating > >>> > > >>> >Ok, so it looks like something is blocking the Cassandra pod from > >>> >starting. > >>> > > >>> >If you are using persistent storage, Cassandra will not start until the > >>> >PV > >>> >is available. There may be some more information about Cassandra in the > >>> >pod > >>> >section of the console under events. > >>> > > >>> >What command did you use when deploying the deployer?
> >>> > > >>> >> > >>> >> $ oc logs -f hawkular-metrics-4b16f > >>> >> > >>> >> 16:54:25,703 DEBUG [org.jboss.as.config] (MSC service thread 1-4) VM > >>> >> Arguments: -Duser.home=/home/jboss -Duser.name=jboss -D[Standalone] > >>> >> -XX:+UseCompressedOops -verbose:gc > >>> >> -Xloggc:/opt/eap/standalone/log/gc.log > >>> >> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation > >>> >> -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=3M -XX:-TraceClassUnloading > >>> >> -Xms1303m -Xmx1303m -XX:MaxPermSize=256m > >>> >> -Djava.net.preferIPv4Stack=true > >>> >> -Djboss.modules.system.pkgs=org.jboss.logmanager > >>> >> -Djava.awt.headless=true > >>> >> -Djboss.modules.policy-permissions=true > >>> >> -Xbootclasspath/p:/opt/eap/jboss-modules.jar:/opt/eap/modules/system/layers/base/org/jboss/logmanager/main/jboss-logmanager-1.5.4.Final-redhat-1.jar:/opt/eap/modules/system/layers/base/org/jboss/logmanager/ext/main/javax.json-1.0.4.jar:/opt/eap/modules/system/layers/base/org/jboss/logmanager/ext/main/jboss-logmanager-ext-1.0.0.Alpha2-redhat-1.jar > >>> >> -Djava.util.logging.manager=org.jboss.logmanager.LogManager > >>> >> -javaagent:/opt/eap/jolokia.jar=port=8778,protocol=https,caCert=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt,clientPrincipal=cn=system:master-proxy,useSslClientAuthentication=true,extraClientCheck=true,host=0.0.0.0,discoveryEnabled=false > >>> >> -Djava.security.egd=file:/dev/./urandom > >>> >> -Dorg.jboss.boot.log.file=/opt/eap/standalone/log/server.log > >>> >> -Dlogging.configuration=file:/opt/eap/standalone/configuration/logging.properties > >>> >> 16:54:27,079 INFO [org.xnio] (MSC service thread 1-3) XNIO Version > >>> >> 3.0.14.GA-redhat-1 > >>> >> 16:54:27,083 INFO [org.xnio.nio] (MSC service thread 1-3) XNIO NIO > >>> >> Implementation Version 3.0.14.GA-redhat-1 > >>> >> 16:54:27,101 INFO [org.jboss.as.server] (Controller Boot Thread) > >>> >> JBAS015888: > >>> >> Creating http management service using socket-binding > >>> >> 
(management-http) > >>> >> 16:54:27,104 INFO [org.jboss.remoting] (MSC service thread 1-3) JBoss > >>> >> Remoting version 3.3.5.Final-redhat-1 > >>> >> > >>> >> $ oc logs -f heapster-x2gj2 > >>> >> Endpoint Check in effect. Checking > >>> >> https://hawkular-metrics:443/hawkular/metrics/status > >>> >> Could not connect to > >>> >> https://hawkular-metrics:443/hawkular/metrics/status. > >>> >> Curl exit code: 6. Status Code 000 > >>> >> 'https://hawkular-metrics:443/hawkular/metrics/status' is not > >>> >> accessible > >>> >> [HTTP status code: 000. Curl exit code 6]. Retrying. > >>> >> Could not connect to > >>> >> https://hawkular-metrics:443/hawkular/metrics/status. > >>> >> Curl exit code: 6. Status Code 000 > >>> >> 'https://hawkular-metrics:443/hawkular/metrics/status' is not > >>> >> accessible > >>> >> [HTTP status code: 000. Curl exit code 6]. Retrying. > >>> >> Could not connect to > >>> >> https://hawkular-metrics:443/hawkular/metrics/status. > >>> >> Curl exit code: 6. Status Code 000 > >>> >> > >>> >> > >>> >> $ oc logs -f metrics-deployer-9v7vc > >>> >> > >>> >> ++ oc create -f - > >>> >> serviceaccount "heapster" created > >>> >> service "heapster" created > >>> >> replicationcontroller "heapster" created > >>> >> + echo 'Success!' > >>> >> Success! > >>> >> > >>> >> -- > >>> >> Srinivas Kotaru > >>> >> > >>> >> On 6/13/16, 1:49 PM, "Matt Wringe" <mwri...@redhat.com> wrote: > >>> >> > >>> >> > > >>> >> > > >>> >> >----- Original Message ----- > >>> >> >> From: "Srinivas Naga Kotaru (skotaru)" <skot...@cisco.com> > >>> >> >> To: users@lists.openshift.redhat.com > >>> >> >> Sent: Monday, June 13, 2016 3:58:12 PM > >>> >> >> Subject: Metrics deployment > >>> >> >> > >>> >> >> > >>> >> >> > >>> >> >> Hi > >>> >> >> > >>> >> >> > >>> >> >> > >>> >> >> Am trying to configure metrics in our newly installed clusters. Am > >>> >> >> seeing > >>> >> >> below errors once metrics-deploy script was successful. 
I used our > >>> >> >> environment specific HAWKULAR_METRICS_HOSTNAME and MASTER_URL > >>> >> >> > >>> >> >> > >>> >> >> > >>> >> >> # oc new-app -f metrics-deployer.yaml > >>> >> >> > >>> >> >> > >>> >> >> > >>> >> >> Note: customized, CASSANDARA PV, MASTER_URL, and > >>> >> >> HAWKULAR_METRICS_HOSTNAME > >>> >> >> ( > >>> >> >> hard coded as values) > >>> >> >> > >>> >> >> > >>> >> >> > >>> >> >> template "hawkular-heapster" created > >>> >> >> > >>> >> >> Deploying the Heapster component > >>> >> >> > >>> >> >> ++ echo 'Deploying the Heapster component' > >>> >> >> > >>> >> >> ++ '[' -n '' ']' > >>> >> >> > >>> >> >> ++ oc create -f - > >>> >> >> > >>> >> >> ++ oc process hawkular-heapster -v > >>> >> >> IMAGE_PREFIX=registry.access.redhat.com/openshift3/,IMAGE_VERSION=latest,MASTER_URL=https://lae3-alln-int-idev01.cisco.com:443,NODE_ID=nodename > >>> >> >> > >>> >> >> serviceaccount "heapster" created > >>> >> >> > >>> >> >> service "heapster" created > >>> >> >> > >>> >> >> replicationcontroller "heapster" created > >>> >> >> > >>> >> >> + echo 'Success!' > >>> >> >> > >>> >> >> Success! > >>> >> >> > >>> >> >> > >>> >> >> > >>> >> >> # oc get pods > >>> >> >> > >>> >> >> NAME READY STATUS RESTARTS AGE > >>> >> >> > >>> >> >> hawkular-cassandra-1-9nzio 0/1 ContainerCreating 0 4m > >>> >> >> > >>> >> >> hawkular-metrics-hi7mb 0/1 Running 1 4m > >>> >> >> > >>> >> >> heapster-e8gbu 0/1 Running 2 4m > >>> >> >> > >>> >> >> metrics-deployer-64703 0/1 ContainerCreating 0 3s > >>> >> >> > >>> >> >> metrics-deployer-cd1nf 0/1 Completed 0 5m > >>> >> >> > >>> >> > > >>> >> >It looks like none of your containers are fully up and running yet. > >>> >> > > >>> >> >Without Cassandra running, Hawkular Metrics will not run, and > >>> >> >Heapster > >>> >> >will > >>> >> >wait until Hawkular Metrics is fully running. > >>> >> > > >>> >> >Do you see anything in the Cassandra logs? The first step will be to > >>> >> >get > >>> >> >Cassandra running properly. 
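The start-up ordering described here (Cassandra, then Hawkular Metrics, then Heapster) suggests a simple way to decide which pod to debug first. The sketch below is only an illustration of that reasoning: the function name `first_blocker` is hypothetical, and the readiness values are copied by hand from the `oc get pods` output above (READY 0/1 is recorded as `False`).

```python
# Start-up order as described in the thread: each component needs the previous one.
ORDER = ["hawkular-cassandra", "hawkular-metrics", "heapster"]


def first_blocker(pods):
    """Return the first component in ORDER that has no ready pod, or None.

    pods maps a pod name to a (ready, status) tuple.
    """
    for component in ORDER:
        has_ready_pod = any(
            name.startswith(component) and ready
            for name, (ready, _status) in pods.items()
        )
        if not has_ready_pod:
            return component
    return None


# Pod states from the first `oc get pods` listing in the thread.
PODS = {
    "hawkular-cassandra-1-9nzio": (False, "ContainerCreating"),
    "hawkular-metrics-hi7mb": (False, "Running"),
    "heapster-e8gbu": (False, "Running"),
    "metrics-deployer-cd1nf": (False, "Completed"),
}

print(first_blocker(PODS))  # hawkular-cassandra
```

Checking in dependency order avoids chasing the Heapster connection errors while the real problem is further upstream.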
> >>> >> > > >>> >> >> > >>> >> >> > >>> >> >> > >>> >> >> $ oc logs -f heapster-e8gbu > >>> >> >> > >>> >> >> Endpoint Check in effect. Checking > >>> >> >> https://hawkular-metrics:443/hawkular/metrics/status > >>> >> >> > >>> >> >> Could not connect to > >>> >> >> https://hawkular-metrics:443/hawkular/metrics/status. > >>> >> >> Curl exit code: 6. Status Code 000 > >>> >> >> > >>> >> >> 'https://hawkular-metrics:443/hawkular/metrics/status' is not > >>> >> >> accessible > >>> >> >> [HTTP status code: 000. Curl exit code 6]. Retrying. > >>> >> >> > >>> >> >> Could not connect to > >>> >> >> https://hawkular-metrics:443/hawkular/metrics/status. > >>> >> >> Curl exit code: 6. Status Code 000 > >>> >> > > >>> >> >Heapster waits until Hawkular Metrics is started before trying to > >>> >> >push > >>> >> >metrics to it. The issue that you are seeing is because Heapster > >>> >> >could > >>> >> >not > >>> >> >properly connect to Hawkular Metrics. Until the Hawkular Metrics > >>> >> >service > >>> >> >is > >>> >> >fully up, Heapster will not be able to connect to it. > >>> >> > > >>> >> > > >>> >> >> > >>> >> >> > >>> >> >> > >>> >> >> > >>> >> >> What is wrong? Why is it checking just hawkular-metrics rather > >>> >> >> than the full > >>> >> >> routing > >>> >> >> URL that was provided as HAWKULAR_METRICS_HOSTNAME? > >>> >> > > >>> >> >The Hawkular Metrics service has two hostnames: the internal hostname > >>> >> >used > >>> >> >by the internal components (e.g. 'hawkular-metrics') and the external > >>> >> >hostname (e.g. what is configured via HAWKULAR_METRICS_HOSTNAME). The > >>> >> >OpenShift DNS server resolves the names of services as hostnames, > >>> >> >which > >>> >> >is where the internal 'hawkular-metrics' comes from.
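Curl exit code 6 means the host name could not be resolved, so the Heapster errors above point at name resolution of the internal service name and not at the external route. A minimal Python sketch of that check follows. The function name `can_resolve` and the injectable `resolver` parameter are illustrative assumptions; by default the check uses the standard library's `socket.getaddrinfo`, and it is only meaningful when run from inside a pod in the same project as the metrics components.

```python
import socket


def can_resolve(hostname, port=443, resolver=socket.getaddrinfo):
    """Return True if the hostname resolves to at least one address.

    A failed lookup raises socket.gaierror, which is the situation
    curl reports as exit code 6.
    """
    try:
        return len(resolver(hostname, port)) > 0
    except socket.gaierror:
        return False


# Example, to be run from inside a pod in the metrics project:
#   can_resolve("hawkular-metrics")
```

If the short service name does not resolve from inside the pod, the cluster DNS setup is the next thing to check; the external HAWKULAR_METRICS_HOSTNAME is not involved in this lookup.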
> >>> >> > > >>> >> >> -- > >>> >> >> Srinivas Kotaru > >>> >> >> > >>> >> >> _______________________________________________ > >>> >> >> users mailing list > >>> >> >> users@lists.openshift.redhat.com > >>> >> >> http://lists.openshift.redhat.com/openshiftmm/listinfo/users