[ovirt-users] Re: ovirt metrics ansible error

2019-05-29 Thread Jayme
Shirly,

No problem, I understand.  I will provide all of the requested info in a
bug report.  Thanks again for your help!


On Wed, May 29, 2019 at 11:44 AM Shirly Radco  wrote:

> Hi Jayme,
>
> It is getting hard to debug your issue over the mailing list.
> Can you please open a bug in Bugzilla and attach all the information you
> have? The versions you are using, the config files from the engine machine,
> and the ansible log, vars.yaml, and integ.ini from the /root directory of
> the master0 VM.
>
> Sorry for the inconvenience.
>
> Best,
>
>
> --
>
> Shirly Radco
>
> BI Senior Software Engineer
>
> Red Hat 
>
> 
>
>
> On Tue, May 28, 2019 at 10:47 PM Jayme  wrote:
>
>> I actually see the pods running on master0 if I do this:
>>
>> @master0 master]# oc project kube-system
>> Now using project "kube-system" on server
>> "https://openshift-master.cloud.xxx.com:8443".
>> [root@master0 master]# oc get pods
>> NAME                                      READY   STATUS    RESTARTS   AGE
>> master-api-master0.cloud..com             1/1     Running   0          22m
>> master-controllers-master0.cloud..com     1/1     Running   0          22m
>> master-etcd-master0.cloud.xx              1/1     Running   0          22m
>>
>> So I wonder why the ansible "Wait for control plane pods to appear" task
>> is looping
>>
>> - name: Wait for control plane pods to appear
>>   oc_obj:
>> state: list
>> kind: pod
>> name: "master-{{ item }}-{{ l_kubelet_node_name | lower }}"
>> namespace: kube-system
>>   register: control_plane_pods
>>   until:
>>   - "'results' in control_plane_pods"
>>   - "'results' in control_plane_pods.results"
>>   - control_plane_pods.results.results | length > 0
>>   retries: 60
>>   delay: 5
>>   with_items:
>>   - "{{ 'etcd' if inventory_hostname in groups['oo_etcd_to_config'] else
>> omit }}"
>>   - api
>>   - controllers
>>   ignore_errors: true
>>
>> On Tue, May 28, 2019 at 4:23 PM Jayme  wrote:
>>
>>> I just tried again from scratch this time making sure a proper wildcard
>>> DNS entry existed and without using the set /etc/hosts option and am still
>>> running in to the pods issue.  Can anyone confirm if this requires a public
>>> external IP to work?  I am working on an internal DNS zone here and natted
>>> ips.
>>>
>>> On Tue, May 28, 2019 at 3:28 PM Edward Berger 
>>> wrote:
>>>
 In my case it was a single bare metal host, so that would be equivalent
 to disabling iptables on the master0 VM you're installing to, in your ovirt
 scenario.

 On Tue, May 28, 2019 at 1:25 PM Jayme  wrote:

> Do you mean the iptables firewall on the server being installed to
> i.e. master0 or the actual oVirt host that the master0 VM is running on?  
> I
> did try flushing iptables rules on master0 VM then ran plays again from
> installer VM but fail at the same point.
>
> Does this log message have anything to do with the issue, /etc/cni
> directory does not even exist on master0 VM.
>
> May 28 17:23:35 master0 origin-node: W0528 17:23:35.012902   10434
> cni.go:172] Unable to update cni config: No networks found in 
> /etc/cni/net.d
> May 28 17:23:35 master0 origin-node: E0528 17:23:35.013398   10434
> kubelet.go:2101] Container runtime network not ready: NetworkReady=false
> reason:NetworkPluginNotReady message:docker: network plugin is not ready:
> cni config uninitialized
>
>
>
> On Tue, May 28, 2019 at 1:19 PM Edward Berger 
> wrote:
>
>> > TASK [openshift_control_plane : Wait for control plane pods to
>> appear] *
>> > Monday 27 May 2019  13:31:54 + (0:00:00.180)   0:14:33.857
>> 
>> > FAILED - RETRYING: Wait for control plane pods to appear (60
>> retries left).
>> > FAILED - RETRYING: Wait for control plane pods to appear (59
>> retries left).
>> >It eventually counts all the way down to zero and fails.
>>
>> This looks a lot like the issues I saw when the host firewall
>> (iptables) was blocking another OKD all-in-one-host install script [1].
>> Disabling iptables allowed the installation to continue for my proof
>> of concept "cluster".
>>
>> [1]https://github.com/gshipley/installcentos
>>
>> The other error I had with [1] was it was trying to install a couple
>> of packages (zile and python2-pip) from EPEL with the repo disabled.
>>
>>
>>
>> On Tue, May 28, 2019 at 10:41 AM Jayme  wrote:
>>
>>> Shirly,
>>>
>>> Oh and I should mention that I did verify that NetworkManager was
>>> installed on the master0 VM and enabled/started the second go around.  
>>> So
>>> that service is there and running.
>>>
>>> # systemctl list-unit-files | grep Network
>>> dbus-org.freedesktop.NetworkManager.service
>>> enabled
>>> NetworkManager-dispat

[ovirt-users] Re: ovirt metrics ansible error

2019-05-27 Thread Shirly Radco
Hi Jayme,

Thank you for reaching out.
Please try rerunning the ansible playbook.
If this doesn't work, try adding the following line to integ.ini on the
metrics VM and rerun the ansible playbook:
openshift_disable_check=docker_storage
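
For example, assuming integ.ini lives in /root on that VM (adjust the path
to your setup), a minimal sketch:

# append the check override, then rerun the playbook
echo 'openshift_disable_check=docker_storage' >> /root/integ.ini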

Please let us know how it goes.

Best regards,

--

Shirly Radco

BI Senior Software Engineer

Red Hat 




On Sun, May 26, 2019 at 9:34 PM Jayme  wrote:

> I'm running into this ansible error during the oVirt metrics installation
> (following the procedures at:
> https://ovirt.org/documentation/metrics-install-guide/Installing_Metrics_Store.html
> )
>
> This is happening late in the process, after successfully deploying the
> installation VM and then running the second step from the metrics VM.
>
> CHECK [memory_availability : master0.xx.com]
> *
> fatal: [master0.xxx.com]: FAILED! => {"changed": true, "checks":
> {"disk_availability": {}, "docker_image_availability": {"changed": true},
> "docker_storage": {"failed": true, "failures": [["OpenShiftCheckException",
> "Could not find imported module support code for docker_info.  Looked for
> either AnsibleDockerClient.py or docker_common.py\nTraceback (most recent
> call last):\n  File
> \"/usr/share/ansible/openshift-ansible/roles/openshift_health_checker/action_plugins/openshift_health_check.py\",
> line 225, in run_check\nresult = check.run()\n  File
> \"/usr/share/ansible/openshift-ansible/roles/openshift_health_checker/openshift_checks/docker_storage.py\",
> line 53, in run\ndocker_info = self.execute_module(\"docker_info\",
> {})\n  File
> \"/usr/share/ansible/openshift-ansible/roles/openshift_health_checker/openshift_checks/__init__.py\",
> line 211, in execute_module\nresult = self._execute_module(module_name,
> module_args, self.tmp, self.task_vars)\n  File
> \"/usr/lib/python2.7/site-packages/ansible/plugins/action/__init__.py\",
> line 809, in _execute_module\n(module_style, shebang, module_data,
> module_path) = self._configure_module(module_name=module_name,
> module_args=module_args, task_vars=task_vars)\n  File
> \"/usr/lib/python2.7/site-packages/ansible/plugins/action/__init__.py\",
> line 203, in _configure_module\nenvironment=final_environment)\n  File
> \"/usr/lib/python2.7/site-packages/ansible/executor/module_common.py\",
> line 1023, in modify_module\nenvironment=environment)\n  File
> \"/usr/lib/python2.7/site-packages/ansible/executor/module_common.py\",
> line 859, in _find_module_utils\nrecursive_finder(module_name,
> b_module_data, py_module_names, py_module_cache, zf)\n  File
> \"/usr/lib/python2.7/site-packages/ansible/executor/module_common.py\",
> line 621, in recursive_finder\nraise AnsibleError('
> '.join(msg))\nAnsibleError: Could not find imported module support code for
> docker_info.  Looked for either AnsibleDockerClient.py or
> docker_common.py\n"]], "msg": "Could not find imported module support code
> for docker_info.  Looked for either AnsibleDockerClient.py or
> docker_common.py\nTraceback (most recent call last):\n  File
> \"/usr/share/ansible/openshift-ansible/roles/openshift_health_checker/action_plugins/openshift_health_check.py\",
> line 225, in run_check\nresult = check.run()\n  File
> \"/usr/share/ansible/openshift-ansible/roles/openshift_health_checker/openshift_checks/docker_storage.py\",
> line 53, in run\ndocker_info = self.execute_module(\"docker_info\",
> {})\n  File
> \"/usr/share/ansible/openshift-ansible/roles/openshift_health_checker/openshift_checks/__init__.py\",
> line 211, in execute_module\nresult = self._execute_module(module_name,
> module_args, self.tmp, self.task_vars)\n  File
> \"/usr/lib/python2.7/site-packages/ansible/plugins/action/__init__.py\",
> line 809, in _execute_module\n(module_style, shebang, module_data,
> module_path) = self._configure_module(module_name=module_name,
> module_args=module_args, task_vars=task_vars)\n  File
> \"/usr/lib/python2.7/site-packages/ansible/plugins/action/__init__.py\",
> line 203, in _configure_module\nenvironment=final_environment)\n  File
> \"/usr/lib/python2.7/site-packages/ansible/executor/module_common.py\",
> line 1023, in modify_module\nenvironment=environment)\n  File
> \"/usr/lib/python2.7/site-packages/ansible/executor/module_common.py\",
> line 859, in _find_module_utils\nrecursive_finder(module_name,
> b_module_data, py_module_names, py_module_cache, zf)\n  File
> \"/usr/lib/python2.7/site-packages/ansible/executor/module_common.py\",
> line 621, in recursive_finder\nraise AnsibleError('
> '.join(msg))\nAnsibleError: Could not find imported module support code for
> docker_info.  Looked for either AnsibleDockerClient.py or
> docker_common.py\n"}, "memory_availability": {}, "package_availability":
> {"changed": false, "invocation": {"module_args": {"packages": ["PyYAML",
> "bash-completion", "bind", "ceph-common",

[ovirt-users] Re: ovirt metrics ansible error

2019-05-27 Thread Jayme
I managed to get past that, but am running into another problem later in
the process, at the "Wait for control plane pods to appear" task.  I thought
it was perhaps a glitch left over from the earlier failed docker step, so
after a few more runs I tried killing everything and restarting the metrics
process again from the very beginning, and I end up hitting the same issue
with the control plane pods even though all other steps/tasks seem to be
working.

I'm just getting this:

TASK [openshift_control_plane : Wait for control plane pods to appear]
*
Monday 27 May 2019  13:31:54 + (0:00:00.180)   0:14:33.857

FAILED - RETRYING: Wait for control plane pods to appear (60 retries left).
FAILED - RETRYING: Wait for control plane pods to appear (59 retries left).
FAILED - RETRYING: Wait for control plane pods to appear (58 retries left).
FAILED - RETRYING: Wait for control plane pods to appear (57 retries left).
FAILED - RETRYING: Wait for control plane pods to appear (56 retries left).

It eventually counts all the way down to zero and fails.

In syslog of the master0 server I'm seeing some errors related to cni config

May 27 13:39:07 master0 ansible-oc_obj: Invoked with files=None kind=pod
force=False all_namespaces=None field_selector=None namespace=kube-system
delete_after=False kubeconfig=/etc/origin/master/admin.kubeconfig
content=None state=list debug=False selector=None name=
master-api-master0.xx.com
May 27 13:39:09 master0 origin-node: W0527 13:39:09.064230   20150
cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
May 27 13:39:09 master0 origin-node: E0527 13:39:09.064670   20150
kubelet.go:2101] Container runtime network not ready: NetworkReady=false
reason:NetworkPluginNotReady message:docker: network plugin is not ready:
cni config uninitialized
May 27 13:39:13 master0 ansible-oc_obj: Invoked with files=None kind=pod
force=False all_namespaces=None field_selector=None namespace=kube-system
delete_after=False kubeconfig=/etc/origin/master/admin.kubeconfig
content=None state=list debug=False selector=None name=
master-api-master0.xx.com
May 27 13:39:14 master0 origin-node: W0527 13:39:14.066911   20150
cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
May 27 13:39:14 master0 origin-node: E0527 13:39:14.067321   20150
kubelet.go:2101] Container runtime network not ready: NetworkReady=false
reason:NetworkPluginNotReady message:docker: network plugin is not ready:
cni config uninitialized
May 27 13:39:14 master0 origin-node: E0527 13:39:14.814705   20150
summary.go:102] Failed to get system container stats for
"/system.slice/origin-node.service": failed to get cgroup stats for
"/system.slice/origin-node.service": failed to get container info for
"/system.slice/origin-node.service": unknown container
"/system.slice/origin-node.service"
May 27 13:39:19 master0 origin-node: W0527 13:39:19.069450   20150
cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
May 27 13:39:19 master0 origin-node: E0527 13:39:19.069850   20150
kubelet.go:2101] Container runtime network not ready: NetworkReady=false
reason:NetworkPluginNotReady message:docker: network plugin is not ready:
cni config uninitialized
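
A couple of checks that seem relevant here, for reference (the
openshift-sdn namespace is an assumption based on origin/OKD 3.x defaults):

# does any CNI config exist yet on master0?
ls -l /etc/cni/net.d
# the SDN pod normally writes that config; is it running?
oc get pods -n openshift-sdn -o wide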

On Mon, May 27, 2019 at 9:35 AM Shirly Radco  wrote:

> Hi Jayme,
>
> Thank you for reaching out.
> Please try rerunning the ansible playbook.
> If this doesn't work, try adding to the integ.ini in the metrics vm
> openshift_disable_check=docker_storage
> and rerun the ansible playbook again.
>
> Please update how it goes.
>
> Best regards,
>
> --
>
> Shirly Radco
>
> BI Senior Software Engineer
>
> Red Hat 
>
> 
>
>
> On Sun, May 26, 2019 at 9:34 PM Jayme  wrote:
>
>> I'm running in to this ansible error during oVirt metrics installation
>> (following procedures at:
>> https://ovirt.org/documentation/metrics-install-guide/Installing_Metrics_Store.html
>>  )
>>
>> This is happening late in the process, after successfully deploying the
>> installation VM and then running second step from the metrics VM.
>>
>> CHECK [memory_availability : master0.xx.com]
>> *
>> fatal: [master0.xxx.com]: FAILED! => {"changed": true, "checks":
>> {"disk_availability": {}, "docker_image_availability": {"changed": true},
>> "docker_storage": {"failed": true, "failures": [["OpenShiftCheckException",
>> "Could not find imported module support code for docker_info.  Looked for
>> either AnsibleDockerClient.py or docker_common.py\nTraceback (most recent
>> call last):\n  File
>> \"/usr/share/ansible/openshift-ansible/roles/openshift_health_checker/action_plugins/openshift_health_check.py\",
>> line 225, in run_check\nresult = check.run()\n  File
>> \"/usr/share/ansible/openshift-ansible/roles/openshift_health_checker/openshift_checks/docker_storage.py\",
>> line 53, in run\ndocker_info = self.

[ovirt-users] Re: ovirt metrics ansible error

2019-05-28 Thread Shirly Radco
Hi,

The latest release of 4.3.z should already include a fix for this issue,
the ovirt-engine-metrics-1.3.1 rpm.

The issue is that it requires NetworkManager to be installed, running,
and enabled in order to work.

You can install it manually on the master0 VM and start and enable it, or
you can install the updated rpm from the nightly builds if your
environment is oVirt 4.2.z:
https://resources.ovirt.org/pub/ovirt-4.2-snapshot/rpm/el7/noarch/ovirt-engine-metrics-1.2.3-0.0.master.20190523112218.gitbc6e4fa.el7.noarch.rpm
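
For the manual option, something along these lines on the master0 VM
(standard EL7 commands):

yum install -y NetworkManager
systemctl enable NetworkManager
systemctl start NetworkManager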

Relevant patches are:
https://gerrit.ovirt.org/#/c/99959/
https://gerrit.ovirt.org/#/c/99718/

Best regards,

--

Shirly Radco

BI Senior Software Engineer

Red Hat 




On Mon, May 27, 2019 at 4:41 PM Jayme  wrote:

> I managed to get past that but am running in to another problem later in
> the process on the control plane pods to appear task.   I thought perhaps a
> glitch in the process from the failed docker step previously so after a few
> more runs I tried killing everything and restarting the metrics process
> again from the very beginning and end up hitting the same issue with
> control plane pods even though all other steps/tasks seem to be working.
>
> I'm just getting this:
>
> TASK [openshift_control_plane : Wait for control plane pods to appear]
> *
> Monday 27 May 2019  13:31:54 + (0:00:00.180)   0:14:33.857
> 
> FAILED - RETRYING: Wait for control plane pods to appear (60 retries left).
> FAILED - RETRYING: Wait for control plane pods to appear (59 retries left).
> FAILED - RETRYING: Wait for control plane pods to appear (58 retries left).
> FAILED - RETRYING: Wait for control plane pods to appear (57 retries left).
> FAILED - RETRYING: Wait for control plane pods to appear (56 retries left).
>
> It eventually counts all the way down to zero and fails.
>
> In syslog of the master0 server I'm seeing some errors related to cni
> config
>
> May 27 13:39:07 master0 ansible-oc_obj: Invoked with files=None kind=pod
> force=False all_namespaces=None field_selector=None namespace=kube-system
> delete_after=False kubeconfig=/etc/origin/master/admin.kubeconfig
> content=None state=list debug=False selector=None name=
> master-api-master0.xx.com
> May 27 13:39:09 master0 origin-node: W0527 13:39:09.064230   20150
> cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
> May 27 13:39:09 master0 origin-node: E0527 13:39:09.064670   20150
> kubelet.go:2101] Container runtime network not ready: NetworkReady=false
> reason:NetworkPluginNotReady message:docker: network plugin is not ready:
> cni config uninitialized
> May 27 13:39:13 master0 ansible-oc_obj: Invoked with files=None kind=pod
> force=False all_namespaces=None field_selector=None namespace=kube-system
> delete_after=False kubeconfig=/etc/origin/master/admin.kubeconfig
> content=None state=list debug=False selector=None name=
> master-api-master0.xx.com
> May 27 13:39:14 master0 origin-node: W0527 13:39:14.066911   20150
> cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
> May 27 13:39:14 master0 origin-node: E0527 13:39:14.067321   20150
> kubelet.go:2101] Container runtime network not ready: NetworkReady=false
> reason:NetworkPluginNotReady message:docker: network plugin is not ready:
> cni config uninitialized
> May 27 13:39:14 master0 origin-node: E0527 13:39:14.814705   20150
> summary.go:102] Failed to get system container stats for
> "/system.slice/origin-node.service": failed to get cgroup stats for
> "/system.slice/origin-node.service": failed to get container info for
> "/system.slice/origin-node.service": unknown container
> "/system.slice/origin-node.service"
> May 27 13:39:19 master0 origin-node: W0527 13:39:19.069450   20150
> cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
> May 27 13:39:19 master0 origin-node: E0527 13:39:19.069850   20150
> kubelet.go:2101] Container runtime network not ready: NetworkReady=false
> reason:NetworkPluginNotReady message:docker: network plugin is not ready:
> cni config uninitialized
>
> On Mon, May 27, 2019 at 9:35 AM Shirly Radco  wrote:
>
>> Hi Jayme,
>>
>> Thank you for reaching out.
>> Please try rerunning the ansible playbook.
>> If this doesn't work, try adding to the integ.ini in the metrics vm
>> openshift_disable_check=docker_storage
>> and rerun the ansible playbook again.
>>
>> Please update how it goes.
>>
>> Best regards,
>>
>> --
>>
>> Shirly Radco
>>
>> BI Senior Software Engineer
>>
>> Red Hat 
>>
>> 
>>
>>
>> On Sun, May 26, 2019 at 9:34 PM Jayme  wrote:
>>
>>> I'm running in to this ansible error during oVirt metrics installation
>>> (following procedures at:
>>> https://ovirt.org/documentation/metrics-install-guide/Installing_Metrics_Store.html
>>>  )
>>>
>>> This is happening late in the process, after successfully deploying the
>>> installation VM 

[ovirt-users] Re: ovirt metrics ansible error

2019-05-28 Thread Jayme
Shirly,

I appreciate the help with this.  Unfortunately I am still running into
the same problem.  So far I've tried to install/enable/start NetworkManager
on the existing "master0" server and re-ran the plays from the installer
VM.  I ran into the same problem waiting for control plane pods, and the
same errors in syslog.

So I wiped everything out, killed the template along with the installer and
master VMs.  On oVirt engine (I am running 4.3.3.7-1 stable) I did have
ovirt-engine-metrics-1.3.0x rpm installed, no yum updates available on an
update check.  So I installed
http://resources.ovirt.org/pub/yum-repo/ovirt-release43-pre.rpm then
proceeded to install the latest version of ovirt-engine-metrics which gave
me: ovirt-engine-metrics-1.3.1-1.el7.noarch on hosted engine.
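
For anyone following along, the upgrade amounted to roughly this on the
hosted engine (a sketch from memory):

yum install http://resources.ovirt.org/pub/yum-repo/ovirt-release43-pre.rpm
yum update ovirt-engine-metrics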

After that package was installed I proceeded to follow the steps from the
beginning, outlined at:
https://ovirt.org/documentation/metrics-install-guide/Installing_Metrics_Store.html
I ran into the docker check issue again (same as my initial email), so I
disabled that check and again got as far as starting the control plane pods
before the failure.

Not sure where to go from here at this point.  The only thing I can think
of that I did differently vs. the instructions outlined above is that I have
not created the wildcard DNS record; however, I did set the configs to
create /etc/hosts entries, and the /etc/hosts files on the machines have the
proper IPs assigned for all hostnames (automatically added by the ansible
plays).

Any ideas how I can get past the control plane pods issue?

Thanks!

On Tue, May 28, 2019 at 4:23 AM Shirly Radco  wrote:

> Hi,
>
> The latest release of 4.3.z should already include a fix for this issue,
> ovirt-engine-metrics-1.3.1 rpm.
>
> The issue is that it requires the NetworkManagar to be installed, running
> and enabled for it to work.
>
> You can install it manually on the master0 vm , start and enable it or you
> can also install the updated rpm from the nightly builds if your
> environment is oVirt 4.2.z:
>
> https://resources.ovirt.org/pub/ovirt-4.2-snapshot/rpm/el7/noarch/ovirt-engine-metrics-1.2.3-0.0.master.20190523112218.gitbc6e4fa.el7.noarch.rpm
>
> Relevant patches are:
> https://gerrit.ovirt.org/#/c/99959/
> https://gerrit.ovirt.org/#/c/99718/
>
> Best regards,
>
> --
>
> Shirly Radco
>
> BI Senior Software Engineer
>
> Red Hat 
>
> 
>
>
> On Mon, May 27, 2019 at 4:41 PM Jayme  wrote:
>
>> I managed to get past that but am running in to another problem later in
>> the process on the control plane pods to appear task.   I thought perhaps a
>> glitch in the process from the failed docker step previously so after a few
>> more runs I tried killing everything and restarting the metrics process
>> again from the very beginning and end up hitting the same issue with
>> control plane pods even though all other steps/tasks seem to be working.
>>
>> I'm just getting this:
>>
>> TASK [openshift_control_plane : Wait for control plane pods to appear]
>> *
>> Monday 27 May 2019  13:31:54 + (0:00:00.180)   0:14:33.857
>> 
>> FAILED - RETRYING: Wait for control plane pods to appear (60 retries
>> left).
>> FAILED - RETRYING: Wait for control plane pods to appear (59 retries
>> left).
>> FAILED - RETRYING: Wait for control plane pods to appear (58 retries
>> left).
>> FAILED - RETRYING: Wait for control plane pods to appear (57 retries
>> left).
>> FAILED - RETRYING: Wait for control plane pods to appear (56 retries
>> left).
>>
>> It eventually counts all the way down to zero and fails.
>>
>> In syslog of the master0 server I'm seeing some errors related to cni
>> config
>>
>> May 27 13:39:07 master0 ansible-oc_obj: Invoked with files=None kind=pod
>> force=False all_namespaces=None field_selector=None namespace=kube-system
>> delete_after=False kubeconfig=/etc/origin/master/admin.kubeconfig
>> content=None state=list debug=False selector=None name=
>> master-api-master0.xx.com
>> May 27 13:39:09 master0 origin-node: W0527 13:39:09.064230   20150
>> cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
>> May 27 13:39:09 master0 origin-node: E0527 13:39:09.064670   20150
>> kubelet.go:2101] Container runtime network not ready: NetworkReady=false
>> reason:NetworkPluginNotReady message:docker: network plugin is not ready:
>> cni config uninitialized
>> May 27 13:39:13 master0 ansible-oc_obj: Invoked with files=None kind=pod
>> force=False all_namespaces=None field_selector=None namespace=kube-system
>> delete_after=False kubeconfig=/etc/origin/master/admin.kubeconfig
>> content=None state=list debug=False selector=None name=
>> master-api-master0.xx.com
>> May 27 13:39:14 master0 origin-node: W0527 13:39:14.066911   20150
>> cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
>> May 27 13:39:14 master0 origin-node: E0527 13:39:14.067321   20150
>> kubelet.go:2101] Container runtime network not ready: NetworkReady=false
>> 

[ovirt-users] Re: ovirt metrics ansible error

2019-05-28 Thread Jayme
Shirly,

Oh, and I should mention that I did verify that NetworkManager was installed
on the master0 VM and enabled/started on the second go-around.  So that
service is there and running.

# systemctl list-unit-files | grep Network
dbus-org.freedesktop.NetworkManager.service   enabled
NetworkManager-dispatcher.service             enabled
NetworkManager-wait-online.service            enabled
NetworkManager.service                        enabled
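
And to confirm it is actually active, not just enabled:

# systemctl is-active NetworkManager
active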

On Tue, May 28, 2019 at 11:13 AM Jayme  wrote:

> Shirly,
>
> I appreciate the help with this.  Unfortunately I am still running in to
> the same problem.  So far I've tried to install/enable/start NetworkManager
> on the existing "master0" server and re-ran the plans from the installer
> VM.  I ran in to the same problem waiting for control plane pods and same
> errors in syslog.
>
> So I wiped everything out, killed the template along with the installer
> and master VMs.  On oVirt engine (I am running 4.3.3.7-1 stable) I did have
> ovirt-engine-metrics-1.3.0x rpm installed, no yum updates available on an
> update check.  So I installed
> http://resources.ovirt.org/pub/yum-repo/ovirt-release43-pre.rpm then
> proceeded to install the latest version of ovirt-engine-metrics which gave
> me: ovirt-engine-metrics-1.3.1-1.el7.noarch on hosted engine.
>
> After that package was installed I proceeded to follow steps from
> beginning outlined at:
> https://ovirt.org/documentation/metrics-install-guide/Installing_Metrics_Store.html
>  --
> I ran in to the docker check issue again (same as my initial email) so I
> disabled that and again got as far as starting control plane pods before
> failure.
>
> Not sure where to go from here at this point.  The only thing I can think
> of that I did differently vs the instructions outlined above is that I have
> not crated the wildcard DNS record, however I did set configs to create
> /etc/hosts entries and they /etc/hosts on the machines have the proper IPs
> assigned for all hostnames (automatically added by the ansible plays).
>
> Any ideas how I can get past the plane pods issue?
>
> Thanks!
>
> On Tue, May 28, 2019 at 4:23 AM Shirly Radco  wrote:
>
>> Hi,
>>
>> The latest release of 4.3.z should already include a fix for this issue,
>> ovirt-engine-metrics-1.3.1 rpm.
>>
>> The issue is that it requires the NetworkManagar to be installed, running
>> and enabled for it to work.
>>
>> You can install it manually on the master0 vm , start and enable it or
>> you can also install the updated rpm from the nightly builds if your
>> environment is oVirt 4.2.z:
>>
>> https://resources.ovirt.org/pub/ovirt-4.2-snapshot/rpm/el7/noarch/ovirt-engine-metrics-1.2.3-0.0.master.20190523112218.gitbc6e4fa.el7.noarch.rpm
>>
>> Relevant patches are:
>> https://gerrit.ovirt.org/#/c/99959/
>> https://gerrit.ovirt.org/#/c/99718/
>>
>> Best regards,
>>
>> --
>>
>> Shirly Radco
>>
>> BI Senior Software Engineer
>>
>> Red Hat 
>>
>> 
>>
>>
>> On Mon, May 27, 2019 at 4:41 PM Jayme  wrote:
>>
>>> I managed to get past that but am running in to another problem later in
>>> the process on the control plane pods to appear task.   I thought perhaps a
>>> glitch in the process from the failed docker step previously so after a few
>>> more runs I tried killing everything and restarting the metrics process
>>> again from the very beginning and end up hitting the same issue with
>>> control plane pods even though all other steps/tasks seem to be working.
>>>
>>> I'm just getting this:
>>>
>>> TASK [openshift_control_plane : Wait for control plane pods to appear]
>>> *
>>> Monday 27 May 2019  13:31:54 + (0:00:00.180)   0:14:33.857
>>> 
>>> FAILED - RETRYING: Wait for control plane pods to appear (60 retries
>>> left).
>>> FAILED - RETRYING: Wait for control plane pods to appear (59 retries
>>> left).
>>> FAILED - RETRYING: Wait for control plane pods to appear (58 retries
>>> left).
>>> FAILED - RETRYING: Wait for control plane pods to appear (57 retries
>>> left).
>>> FAILED - RETRYING: Wait for control plane pods to appear (56 retries
>>> left).
>>>
>>> It eventually counts all the way down to zero and fails.
>>>
>>> In syslog of the master0 server I'm seeing some errors related to cni
>>> config
>>>
>>> May 27 13:39:07 master0 ansible-oc_obj: Invoked with files=None kind=pod
>>> force=False all_namespaces=None field_selector=None namespace=kube-system
>>> delete_after=False kubeconfig=/etc/origin/master/admin.kubeconfig
>>> content=None state=list debug=False selector=None name=
>>> master-api-master0.xx.com
>>> May 27 13:39:09 master0 origin-node: W0527 13:39:09.064230   20150
>>> cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
>>> May 27 13:39:09 master0 origin-node: E0527 13:39:09.064670   20150
>>> kubelet.go:2101] Container runtime network not ready: NetworkReady=false
>>> reason:NetworkPluginNotReady message:docker: network plugin is not ready:
>>> cni config uninitialized
>>> May 27 13:39:13 master0 

[ovirt-users] Re: ovirt metrics ansible error

2019-05-28 Thread Edward Berger
> TASK [openshift_control_plane : Wait for control plane pods to appear]
*
> Monday 27 May 2019  13:31:54 + (0:00:00.180)   0:14:33.857

> FAILED - RETRYING: Wait for control plane pods to appear (60 retries
left).
> FAILED - RETRYING: Wait for control plane pods to appear (59 retries
left).
> It eventually counts all the way down to zero and fails.

This looks a lot like the issues I saw when the host firewall (iptables)
was blocking another OKD all-in-one-host install script [1].
Disabling iptables allowed the installation to continue for my proof of
concept "cluster".

[1]https://github.com/gshipley/installcentos
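
In my case "disabling" amounted to roughly the following (from memory; if
firewalld is in use instead, stop/disable that service):

systemctl stop iptables
systemctl disable iptables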

The other error I had with [1] was that it tried to install a couple of
packages (zile and python2-pip) from EPEL with the repo disabled.



On Tue, May 28, 2019 at 10:41 AM Jayme  wrote:

> Shirly,
>
> Oh and I should mention that I did verify that NetworkManager was
> installed on the master0 VM and enabled/started the second go around.  So
> that service is there and running.
>
> # systemctl list-unit-files | grep Network
> dbus-org.freedesktop.NetworkManager.service
> enabled
> NetworkManager-dispatcher.service
> enabled
> NetworkManager-wait-online.service
>  enabled
> NetworkManager.service
>  enabled
>
> On Tue, May 28, 2019 at 11:13 AM Jayme  wrote:
>
>> Shirly,
>>
>> I appreciate the help with this.  Unfortunately I am still running in to
>> the same problem.  So far I've tried to install/enable/start NetworkManager
>> on the existing "master0" server and re-ran the plans from the installer
>> VM.  I ran in to the same problem waiting for control plane pods and same
>> errors in syslog.
>>
>> So I wiped everything out, killed the template along with the installer
>> and master VMs.  On oVirt engine (I am running 4.3.3.7-1 stable) I did have
>> ovirt-engine-metrics-1.3.0x rpm installed, no yum updates available on an
>> update check.  So I installed
>> http://resources.ovirt.org/pub/yum-repo/ovirt-release43-pre.rpm then
>> proceeded to install the latest version of ovirt-engine-metrics which gave
>> me: ovirt-engine-metrics-1.3.1-1.el7.noarch on hosted engine.
>>
>> After that package was installed I proceeded to follow steps from
>> beginning outlined at:
>> https://ovirt.org/documentation/metrics-install-guide/Installing_Metrics_Store.html
>>  --
>> I ran in to the docker check issue again (same as my initial email) so I
>> disabled that and again got as far as starting control plane pods before
>> failure.
>>
>> Not sure where to go from here at this point.  The only thing I can think
>> of that I did differently vs the instructions outlined above is that I have
>> not crated the wildcard DNS record, however I did set configs to create
>> /etc/hosts entries and they /etc/hosts on the machines have the proper IPs
>> assigned for all hostnames (automatically added by the ansible plays).
>>
>> Any ideas how I can get past the plane pods issue?
>>
>> Thanks!
>>
>> On Tue, May 28, 2019 at 4:23 AM Shirly Radco  wrote:
>>
>>> Hi,
>>>
>>> The latest release of 4.3.z should already include a fix for this issue,
>>> ovirt-engine-metrics-1.3.1 rpm.
>>>
>>> The issue is that it requires the NetworkManagar to be installed,
>>> running and enabled for it to work.
>>>
>>> You can install it manually on the master0 vm , start and enable it or
>>> you can also install the updated rpm from the nightly builds if your
>>> environment is oVirt 4.2.z:
>>>
>>> https://resources.ovirt.org/pub/ovirt-4.2-snapshot/rpm/el7/noarch/ovirt-engine-metrics-1.2.3-0.0.master.20190523112218.gitbc6e4fa.el7.noarch.rpm
>>>
>>> Relevant patches are:
>>> https://gerrit.ovirt.org/#/c/99959/
>>> https://gerrit.ovirt.org/#/c/99718/
>>>
>>> Best regards,
>>>
>>> --
>>>
>>> Shirly Radco
>>>
>>> BI Senior Software Engineer
>>>
>>> Red Hat 
>>>
>>> 
>>>
>>>
>>> On Mon, May 27, 2019 at 4:41 PM Jayme  wrote:
>>>
 I managed to get past that but am running in to another problem later
 in the process on the control plane pods to appear task.   I thought
 perhaps a glitch in the process from the failed docker step previously so
 after a few more runs I tried killing everything and restarting the metrics
 process again from the very beginning and end up hitting the same issue
 with control plane pods even though all other steps/tasks seem to be
 working.

 I'm just getting this:

 TASK [openshift_control_plane : Wait for control plane pods to appear]
 *
 Monday 27 May 2019  13:31:54 + (0:00:00.180)   0:14:33.857
 
 FAILED - RETRYING: Wait for control plane pods to appear (60 retries
 left).
 FAILED - RETRYING: Wait for control plane pods to appear (59 retries
 left).
 FAILED - RETRYING: Wait for control plane pods to appear (58 retries
 left).
 FAILED - RETRYING: Wait for control plane pods to appear (57 retries
 left).
 FAILED - RETRYING: 

[ovirt-users] Re: ovirt metrics ansible error

2019-05-28 Thread Jayme
Do you mean the iptables firewall on the server being installed to, i.e.
master0, or the actual oVirt host that the master0 VM is running on?  I did
try flushing the iptables rules on the master0 VM and then ran the plays
again from the installer VM, but it fails at the same point.

Does this log message have anything to do with the issue?  The /etc/cni
directory does not even exist on the master0 VM.

May 28 17:23:35 master0 origin-node: W0528 17:23:35.012902   10434
cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
May 28 17:23:35 master0 origin-node: E0528 17:23:35.013398   10434
kubelet.go:2101] Container runtime network not ready: NetworkReady=false
reason:NetworkPluginNotReady message:docker: network plugin is not ready:
cni config uninitialized
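
For what it's worth, the node config can also be checked for which network
plugin it references (path assumed from origin 3.x):

grep -i networkPlugin /etc/origin/node/node-config.yaml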



On Tue, May 28, 2019 at 1:19 PM Edward Berger  wrote:

> > TASK [openshift_control_plane : Wait for control plane pods to appear]
> *
> > Monday 27 May 2019  13:31:54 + (0:00:00.180)   0:14:33.857
> 
> > FAILED - RETRYING: Wait for control plane pods to appear (60 retries
> left).
> > FAILED - RETRYING: Wait for control plane pods to appear (59 retries
> left).
> >It eventually counts all the way down to zero and fails.
>
> This looks a lot like the issues I saw when the host firewall (iptables)
> was blocking another OKD all-in-one-host install script [1].
> Disabling iptables allowed the installation to continue for my proof of
> concept "cluster".
>
> [1]https://github.com/gshipley/installcentos
>
> The other error I had with [1] was it was trying to install a couple of
> packages (zile and python2-pip) from EPEL with the repo disabled.
>
>
>
> On Tue, May 28, 2019 at 10:41 AM Jayme  wrote:
>
>> Shirly,
>>
>> Oh and I should mention that I did verify that NetworkManager was
>> installed on the master0 VM and enabled/started the second go around.  So
>> that service is there and running.
>>
>> # systemctl list-unit-files | grep Network
>> dbus-org.freedesktop.NetworkManager.service
>> enabled
>> NetworkManager-dispatcher.service
>> enabled
>> NetworkManager-wait-online.service
>>  enabled
>> NetworkManager.service
>>  enabled
>>
>> On Tue, May 28, 2019 at 11:13 AM Jayme  wrote:
>>
>>> Shirly,
>>>
>>> I appreciate the help with this.  Unfortunately I am still running in to
>>> the same problem.  So far I've tried to install/enable/start NetworkManager
>>> on the existing "master0" server and re-ran the plans from the installer
>>> VM.  I ran in to the same problem waiting for control plane pods and same
>>> errors in syslog.
>>>
>>> So I wiped everything out, killed the template along with the installer
>>> and master VMs.  On oVirt engine (I am running 4.3.3.7-1 stable) I did have
>>> ovirt-engine-metrics-1.3.0x rpm installed, no yum updates available on an
>>> update check.  So I installed
>>> http://resources.ovirt.org/pub/yum-repo/ovirt-release43-pre.rpm then
>>> proceeded to install the latest version of ovirt-engine-metrics which gave
>>> me: ovirt-engine-metrics-1.3.1-1.el7.noarch on hosted engine.
>>>
>>> After that package was installed I proceeded to follow steps from
>>> beginning outlined at:
>>> https://ovirt.org/documentation/metrics-install-guide/Installing_Metrics_Store.html
>>>  --
>>> I ran in to the docker check issue again (same as my initial email) so I
>>> disabled that and again got as far as starting control plane pods before
>>> failure.
>>>
>>> Not sure where to go from here at this point.  The only thing I can
>>> think of that I did differently vs the instructions outlined above is that
>>> I have not crated the wildcard DNS record, however I did set configs to
>>> create /etc/hosts entries and they /etc/hosts on the machines have the
>>> proper IPs assigned for all hostnames (automatically added by the ansible
>>> plays).
>>>
>>> Any ideas how I can get past the plane pods issue?
>>>
>>> Thanks!
>>>
>>> On Tue, May 28, 2019 at 4:23 AM Shirly Radco  wrote:
>>>
 Hi,

 The latest release of 4.3.z should already include a fix for this
 issue, ovirt-engine-metrics-1.3.1 rpm.

 The issue is that it requires the NetworkManagar to be installed,
 running and enabled for it to work.

 You can install it manually on the master0 vm , start and enable it or
 you can also install the updated rpm from the nightly builds if your
 environment is oVirt 4.2.z:

 https://resources.ovirt.org/pub/ovirt-4.2-snapshot/rpm/el7/noarch/ovirt-engine-metrics-1.2.3-0.0.master.20190523112218.gitbc6e4fa.el7.noarch.rpm

 Relevant patches are:
 https://gerrit.ovirt.org/#/c/99959/
 https://gerrit.ovirt.org/#/c/99718/

 Best regards,

 --

 Shirly Radco

 BI Senior Software Engineer

 Red Hat 

 


 On Mon, May 27, 2019 at 4:41 PM Jayme  wrote:

> I managed to get past that but am running in to another problem later
> in the process on the control plane 

[ovirt-users] Re: ovirt metrics ansible error

2019-05-28 Thread Jayme
I should also mention one more thing: I am attempting to install on an
internal domain that is not externally accessible.

On Tue, May 28, 2019 at 2:25 PM Jayme  wrote:

> Do you mean the iptables firewall on the server being installed to i.e.
> master0 or the actual oVirt host that the master0 VM is running on?  I did
> try flushing iptables rules on master0 VM then ran plays again from
> installer VM but fail at the same point.
>
> Does this log message have anything to do with the issue, /etc/cni
> directory does not even exist on master0 VM.
>
> May 28 17:23:35 master0 origin-node: W0528 17:23:35.012902   10434
> cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
> May 28 17:23:35 master0 origin-node: E0528 17:23:35.013398   10434
> kubelet.go:2101] Container runtime network not ready: NetworkReady=false
> reason:NetworkPluginNotReady message:docker: network plugin is not ready:
> cni config uninitialized
>
>
>
> On Tue, May 28, 2019 at 1:19 PM Edward Berger  wrote:
>
>> > TASK [openshift_control_plane : Wait for control plane pods to appear]
>> *
>> > Monday 27 May 2019  13:31:54 + (0:00:00.180)   0:14:33.857
>> 
>> > FAILED - RETRYING: Wait for control plane pods to appear (60 retries
>> left).
>> > FAILED - RETRYING: Wait for control plane pods to appear (59 retries
>> left).
>> >It eventually counts all the way down to zero and fails.
>>
>> This looks a lot like the issues I saw when the host firewall (iptables)
>> was blocking another OKD all-in-one-host install script [1].
>> Disabling iptables allowed the installation to continue for my proof of
>> concept "cluster".
>>
>> [1]https://github.com/gshipley/installcentos
>>
>> The other error I had with [1] was it was trying to install a couple of
>> packages (zile and python2-pip) from EPEL with the repo disabled.
>>
>>
>>
>> On Tue, May 28, 2019 at 10:41 AM Jayme  wrote:
>>
>>> Shirly,
>>>
>>> Oh and I should mention that I did verify that NetworkManager was
>>> installed on the master0 VM and enabled/started the second go around.  So
>>> that service is there and running.
>>>
>>> # systemctl list-unit-files | grep Network
>>> dbus-org.freedesktop.NetworkManager.service
>>> enabled
>>> NetworkManager-dispatcher.service
>>> enabled
>>> NetworkManager-wait-online.service
>>>  enabled
>>> NetworkManager.service
>>>  enabled
>>>
>>> On Tue, May 28, 2019 at 11:13 AM Jayme  wrote:
>>>
 Shirly,

 I appreciate the help with this.  Unfortunately I am still running in
 to the same problem.  So far I've tried to install/enable/start
 NetworkManager on the existing "master0" server and re-ran the plans from
 the installer VM.  I ran in to the same problem waiting for control plane
 pods and same errors in syslog.

 So I wiped everything out, killed the template along with the installer
 and master VMs.  On oVirt engine (I am running 4.3.3.7-1 stable) I did have
 ovirt-engine-metrics-1.3.0x rpm installed, no yum updates available on an
 update check.  So I installed
 http://resources.ovirt.org/pub/yum-repo/ovirt-release43-pre.rpm then
 proceeded to install the latest version of ovirt-engine-metrics which gave
 me: ovirt-engine-metrics-1.3.1-1.el7.noarch on hosted engine.

 After that package was installed I proceeded to follow steps from
 beginning outlined at:
 https://ovirt.org/documentation/metrics-install-guide/Installing_Metrics_Store.html
  --
 I ran in to the docker check issue again (same as my initial email) so I
 disabled that and again got as far as starting control plane pods before
 failure.

 Not sure where to go from here at this point.  The only thing I can
 think of that I did differently vs the instructions outlined above is that
 I have not crated the wildcard DNS record, however I did set configs to
 create /etc/hosts entries and they /etc/hosts on the machines have the
 proper IPs assigned for all hostnames (automatically added by the ansible
 plays).

 Any ideas how I can get past the plane pods issue?

 Thanks!

 On Tue, May 28, 2019 at 4:23 AM Shirly Radco  wrote:

> Hi,
>
> The latest release of 4.3.z should already include a fix for this
> issue, ovirt-engine-metrics-1.3.1 rpm.
>
> The issue is that it requires the NetworkManagar to be installed,
> running and enabled for it to work.
>
> You can install it manually on the master0 vm , start and enable it or
> you can also install the updated rpm from the nightly builds if your
> environment is oVirt 4.2.z:
>
> https://resources.ovirt.org/pub/ovirt-4.2-snapshot/rpm/el7/noarch/ovirt-engine-metrics-1.2.3-0.0.master.20190523112218.gitbc6e4fa.el7.noarch.rpm
>
> Relevant patches are:
> https://gerrit.ovirt.org/#/c/99959/
> https://gerrit.ovirt.org/#/c/99718/
>
> Best regards,
>
> --
>
> Shirly Radco
>

[ovirt-users] Re: ovirt metrics ansible error

2019-05-28 Thread Jayme
I just tried again from scratch, this time making sure a proper wildcard DNS
entry existed and without using the set-/etc/hosts option, and am still
running into the pods issue.  Can anyone confirm whether this requires a
public external IP to work?  I am working with an internal DNS zone here and
NATed IPs.

On Tue, May 28, 2019 at 3:28 PM Edward Berger  wrote:

> In my case it was a single bare metal host, so that would be equivalent to
> disabling iptables on the master0 VM you're installing to, in your ovirt
> scenario.
>
> On Tue, May 28, 2019 at 1:25 PM Jayme  wrote:
>
>> Do you mean the iptables firewall on the server being installed to i.e.
>> master0 or the actual oVirt host that the master0 VM is running on?  I did
>> try flushing iptables rules on master0 VM then ran plays again from
>> installer VM but fail at the same point.
>>
>> Does this log message have anything to do with the issue, /etc/cni
>> directory does not even exist on master0 VM.
>>
>> May 28 17:23:35 master0 origin-node: W0528 17:23:35.012902   10434
>> cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
>> May 28 17:23:35 master0 origin-node: E0528 17:23:35.013398   10434
>> kubelet.go:2101] Container runtime network not ready: NetworkReady=false
>> reason:NetworkPluginNotReady message:docker: network plugin is not ready:
>> cni config uninitialized
>>
>>
>>
>> On Tue, May 28, 2019 at 1:19 PM Edward Berger 
>> wrote:
>>
>>> > TASK [openshift_control_plane : Wait for control plane pods to appear]
>>> *
>>> > Monday 27 May 2019  13:31:54 + (0:00:00.180)   0:14:33.857
>>> 
>>> > FAILED - RETRYING: Wait for control plane pods to appear (60 retries
>>> left).
>>> > FAILED - RETRYING: Wait for control plane pods to appear (59 retries
>>> left).
>>> >It eventually counts all the way down to zero and fails.
>>>
>>> This looks a lot like the issues I saw when the host firewall (iptables)
>>> was blocking another OKD all-in-one-host install script [1].
>>> Disabling iptables allowed the installation to continue for my proof of
>>> concept "cluster".
>>>
>>> [1]https://github.com/gshipley/installcentos
>>>
>>> The other error I had with [1] was it was trying to install a couple of
>>> packages (zile and python2-pip) from EPEL with the repo disabled.
>>>
>>>
>>>
>>> On Tue, May 28, 2019 at 10:41 AM Jayme  wrote:
>>>
 Shirly,

 Oh and I should mention that I did verify that NetworkManager was
 installed on the master0 VM and enabled/started the second go around.  So
 that service is there and running.

 # systemctl list-unit-files | grep Network
 dbus-org.freedesktop.NetworkManager.service
 enabled
 NetworkManager-dispatcher.service
 enabled
 NetworkManager-wait-online.service
  enabled
 NetworkManager.service
  enabled

 On Tue, May 28, 2019 at 11:13 AM Jayme  wrote:

> Shirly,
>
> I appreciate the help with this.  Unfortunately I am still running in
> to the same problem.  So far I've tried to install/enable/start
> NetworkManager on the existing "master0" server and re-ran the plans from
> the installer VM.  I ran in to the same problem waiting for control plane
> pods and same errors in syslog.
>
> So I wiped everything out, killed the template along with the
> installer and master VMs.  On oVirt engine (I am running 4.3.3.7-1 stable)
> I did have ovirt-engine-metrics-1.3.0x rpm installed, no yum updates
> available on an update check.  So I installed
> http://resources.ovirt.org/pub/yum-repo/ovirt-release43-pre.rpm then
> proceeded to install the latest version of ovirt-engine-metrics which gave
> me: ovirt-engine-metrics-1.3.1-1.el7.noarch on hosted engine.
>
> After that package was installed I proceeded to follow steps from
> beginning outlined at:
> https://ovirt.org/documentation/metrics-install-guide/Installing_Metrics_Store.html
>  --
> I ran in to the docker check issue again (same as my initial email) so I
> disabled that and again got as far as starting control plane pods before
> failure.
>
> Not sure where to go from here at this point.  The only thing I can
> think of that I did differently vs the instructions outlined above is that
> I have not crated the wildcard DNS record, however I did set configs to
> create /etc/hosts entries and they /etc/hosts on the machines have the
> proper IPs assigned for all hostnames (automatically added by the ansible
> plays).
>
> Any ideas how I can get past the plane pods issue?
>
> Thanks!
>
> On Tue, May 28, 2019 at 4:23 AM Shirly Radco 
> wrote:
>
>> Hi,
>>
>> The latest release of 4.3.z should already include a fix for this
>> issue, ovirt-engine-metrics-1.3.1 rpm.
>>
>> The issue is that it requires the NetworkManagar to be installed,
>> running and enabled for it to work.
>>
>

[ovirt-users] Re: ovirt metrics ansible error

2019-05-28 Thread Jayme
I actually see the pods running on master0 if I do this:

@master0 master]# oc project kube-system
Now using project "kube-system" on server
"https://openshift-master.cloud.xxx.com:8443".
[root@master0 master]# oc get pods
NAME                                      READY   STATUS    RESTARTS   AGE
master-api-master0.cloud..com             1/1     Running   0          22m
master-controllers-master0.cloud..com     1/1     Running   0          22m
master-etcd-master0.cloud.xx              1/1     Running   0          22m

So I wonder why the ansible "Wait for control plane pods to appear" task is
looping

- name: Wait for control plane pods to appear
  oc_obj:
state: list
kind: pod
name: "master-{{ item }}-{{ l_kubelet_node_name | lower }}"
namespace: kube-system
  register: control_plane_pods
  until:
  - "'results' in control_plane_pods"
  - "'results' in control_plane_pods.results"
  - control_plane_pods.results.results | length > 0
  retries: 60
  delay: 5
  with_items:
  - "{{ 'etcd' if inventory_hostname in groups['oo_etcd_to_config'] else
omit }}"
  - api
  - controllers
  ignore_errors: true
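
As a sanity check, the name the task polls for should line up with the pods
listed above; something like this (hostname redacted here as elsewhere):

# the node name that gets lowercased and appended by the task
oc get nodes
# the exact pod name the oc_obj lookup expects
oc get pod master-api-master0.cloud.xxx.com -n kube-system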

On Tue, May 28, 2019 at 4:23 PM Jayme  wrote:

> I just tried again from scratch this time making sure a proper wildcard
> DNS entry existed and without using the set /etc/hosts option and am still
> running in to the pods issue.  Can anyone confirm if this requires a public
> external IP to work?  I am working on an internal DNS zone here and natted
> ips.
>
> On Tue, May 28, 2019 at 3:28 PM Edward Berger  wrote:
>
>> In my case it was a single bare metal host, so that would be equivalent
>> to disabling iptables on the master0 VM you're installing to, in your ovirt
>> scenario.
>>
>> On Tue, May 28, 2019 at 1:25 PM Jayme  wrote:
>>
>>> Do you mean the iptables firewall on the server being installed to i.e.
>>> master0 or the actual oVirt host that the master0 VM is running on?  I did
>>> try flushing iptables rules on master0 VM then ran plays again from
>>> installer VM but fail at the same point.
>>>
>>> Does this log message have anything to do with the issue, /etc/cni
>>> directory does not even exist on master0 VM.
>>>
>>> May 28 17:23:35 master0 origin-node: W0528 17:23:35.012902   10434
>>> cni.go:172] Unable to update cni config: No networks found in /etc/cni/net.d
>>> May 28 17:23:35 master0 origin-node: E0528 17:23:35.013398   10434
>>> kubelet.go:2101] Container runtime network not ready: NetworkReady=false
>>> reason:NetworkPluginNotReady message:docker: network plugin is not ready:
>>> cni config uninitialized
>>>
>>>
>>>
>>> On Tue, May 28, 2019 at 1:19 PM Edward Berger 
>>> wrote:
>>>
 > TASK [openshift_control_plane : Wait for control plane pods to
 appear] *
 > Monday 27 May 2019  13:31:54 + (0:00:00.180)   0:14:33.857
 
 > FAILED - RETRYING: Wait for control plane pods to appear (60 retries
 left).
 > FAILED - RETRYING: Wait for control plane pods to appear (59 retries
 left).
 >It eventually counts all the way down to zero and fails.

 This looks a lot like the issues I saw when the host firewall
 (iptables) was blocking another OKD all-in-one-host install script [1].
 Disabling iptables allowed the installation to continue for my proof of
 concept "cluster".

 [1]https://github.com/gshipley/installcentos

 The other error I had with [1] was it was trying to install a couple of
 packages (zile and python2-pip) from EPEL with the repo disabled.



 On Tue, May 28, 2019 at 10:41 AM Jayme  wrote:

> Shirly,
>
> Oh and I should mention that I did verify that NetworkManager was
> installed on the master0 VM and enabled/started the second go around.  So
> that service is there and running.
>
> # systemctl list-unit-files | grep Network
> dbus-org.freedesktop.NetworkManager.service
>   enabled
> NetworkManager-dispatcher.service
>   enabled
> NetworkManager-wait-online.service
>  enabled
> NetworkManager.service
>  enabled
>
> On Tue, May 28, 2019 at 11:13 AM Jayme  wrote:
>
>> Shirly,
>>
>> I appreciate the help with this.  Unfortunately I am still running in
>> to the same problem.  So far I've tried to install/enable/start
>> NetworkManager on the existing "master0" server and re-ran the plans from
>> the installer VM.  I ran in to the same problem waiting for control plane
>> pods and same errors in syslog.
>>
>> So I wiped everything out, killed the template along with the
>> installer and master VMs.  On oVirt engine (I am running 4.3.3.7-1 
>> stable)
>> I did have ovirt-engine-metrics-1.3.0x rpm installed, no yum updates
>> available on an update check.  So I installed
>> http://resources.ovirt.org/pub/yum-repo/ovirt-release43-pre.rpm then
>> proceeded to install the latest version of ovirt-engin

[ovirt-users] Re: ovirt metrics ansible error

2019-05-29 Thread Shirly Radco
Hi Jayme,

It is getting hard to debug your issue over the mailing list.
Can you please open a bug in Bugzilla and attach all the information you
have? The versions you are using, the config files from the engine machine,
and the ansible log, vars.yaml, and integ.ini from the /root directory of
the master0 VM.

Sorry for the inconvenience.

Best,


--

Shirly Radco

BI Senior Software Engineer

Red Hat 




On Tue, May 28, 2019 at 10:47 PM Jayme  wrote:

> I actually see the pods running on master0 if I do this:
>
> @master0 master]# oc project kube-system
> Now using project "kube-system" on server
> "https://openshift-master.cloud.xxx.com:8443".
> [root@master0 master]# oc get pods
> NAME  READY STATUS
>RESTARTS   AGE
> master-api-master0.cloud..com   1/1   Running   0
>  22m
> master-controllers-master0.cloud..com   1/1   Running   0
>  22m
> master-etcd-master0.cloud.xx  1/1   Running   0
>22m
>
> So I wonder why the ansible "Wait for control plane pods to appear" task
> is looping
>
> - name: Wait for control plane pods to appear
>   oc_obj:
> state: list
> kind: pod
> name: "master-{{ item }}-{{ l_kubelet_node_name | lower }}"
> namespace: kube-system
>   register: control_plane_pods
>   until:
>   - "'results' in control_plane_pods"
>   - "'results' in control_plane_pods.results"
>   - control_plane_pods.results.results | length > 0
>   retries: 60
>   delay: 5
>   with_items:
>   - "{{ 'etcd' if inventory_hostname in groups['oo_etcd_to_config'] else
> omit }}"
>   - api
>   - controllers
>   ignore_errors: true
>
> On Tue, May 28, 2019 at 4:23 PM Jayme  wrote:
>
>> I just tried again from scratch this time making sure a proper wildcard
>> DNS entry existed and without using the set /etc/hosts option and am still
>> running in to the pods issue.  Can anyone confirm if this requires a public
>> external IP to work?  I am working on an internal DNS zone here and natted
>> ips.
>>
>> On Tue, May 28, 2019 at 3:28 PM Edward Berger 
>> wrote:
>>
>>> In my case it was a single bare metal host, so that would be equivalent
>>> to disabling iptables on the master0 VM you're installing to, in your ovirt
>>> scenario.
>>>
>>> On Tue, May 28, 2019 at 1:25 PM Jayme  wrote:
>>>
 Do you mean the iptables firewall on the server being installed to i.e.
 master0 or the actual oVirt host that the master0 VM is running on?  I did
 try flushing iptables rules on master0 VM then ran plays again from
 installer VM but fail at the same point.

 Does this log message have anything to do with the issue, /etc/cni
 directory does not even exist on master0 VM.

 May 28 17:23:35 master0 origin-node: W0528 17:23:35.012902   10434
 cni.go:172] Unable to update cni config: No networks found in 
 /etc/cni/net.d
 May 28 17:23:35 master0 origin-node: E0528 17:23:35.013398   10434
 kubelet.go:2101] Container runtime network not ready: NetworkReady=false
 reason:NetworkPluginNotReady message:docker: network plugin is not ready:
 cni config uninitialized



 On Tue, May 28, 2019 at 1:19 PM Edward Berger 
 wrote:

> > TASK [openshift_control_plane : Wait for control plane pods to
> appear] *
> > Monday 27 May 2019  13:31:54 + (0:00:00.180)   0:14:33.857
> 
> > FAILED - RETRYING: Wait for control plane pods to appear (60 retries
> left).
> > FAILED - RETRYING: Wait for control plane pods to appear (59 retries
> left).
> >It eventually counts all the way down to zero and fails.
>
> This looks a lot like the issues I saw when the host firewall
> (iptables) was blocking another OKD all-in-one-host install script [1].
> Disabling iptables allowed the installation to continue for my proof
> of concept "cluster".
>
> [1]https://github.com/gshipley/installcentos
>
> The other error I had with [1] was it was trying to install a couple
> of packages (zile and python2-pip) from EPEL with the repo disabled.
>
>
>
> On Tue, May 28, 2019 at 10:41 AM Jayme  wrote:
>
>> Shirly,
>>
>> Oh and I should mention that I did verify that NetworkManager was
>> installed on the master0 VM and enabled/started the second go around.  So
>> that service is there and running.
>>
>> # systemctl list-unit-files | grep Network
>> dbus-org.freedesktop.NetworkManager.service
>>   enabled
>> NetworkManager-dispatcher.service
>>   enabled
>> NetworkManager-wait-online.service
>>enabled
>> NetworkManager.service
>>enabled
>>
>> On Tue, May 28, 2019 at 11:13 AM Jayme  wrote:
>>
>>> Shirly,
>>>
>>> I appreciate the help with this.  Unfortunately I am still running
>>> in to the same problem.  So far I've