Alright. By diffing against the similar postgres tasks, whose sudo/become settings work, the salient difference is that `Copy configuration archive to storage` is missing `become: true`. Testing now.
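If that pans out, the corrected task would presumably look like the following (task text and variable names taken verbatim from the playbook excerpt quoted further down; only the marked line is new — in Ansible, `become_user`/`become_method` on their own have no effect unless escalation is actually enabled):

```yaml
- name: Copy configuration archive to storage
  command: dd bs=20480 count=1 oflag=direct if="{{ LOCAL_VM_DIR }}/{{ he_conf_disk_details.disk.image_id }}" of="{{ he_conf_disk_path }}"
  become: true            # the missing line: without it, become_user/become_method are ignored
  become_user: vdsm
  become_method: sudo
  changed_when: True
  tags: [ 'skip_ansible_lint' ]
```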
The similar postgres commands have become:true while this one does not.

~Kyle

On Tue, Apr 3, 2018 at 1:19 PM, Simone Tiraboschi <stira...@redhat.com> wrote:
>
> On Tue, Apr 3, 2018 at 5:21 PM, RabidCicada <rabidcic...@gmail.com> wrote:
>
>> I've attached a full debug packet below. I include the log file from
>> ovirt-hosted-engine-setup, relevant cmd line info, and info from the
>> command line where epdb has a breakpoint in playbook.py from ansible
>> itself. I also include info from commands I ran after it failed, and
>> I attach the ferried-over script in /root/.ansible/tmp that is run.
>>
>> *Output on cmd line:*
>>
>> [ INFO  ] TASK [Copy configuration archive to storage]
>> [ ERROR ] fatal: [localhost]: FAILED! => {
>>     "changed": true,
>>     "cmd": [
>>         "dd",
>>         "bs=20480",
>>         "count=1",
>>         "oflag=direct",
>>         "if=/var/tmp/localvmbCDQIR/5ef881f5-c992-48d2-b969-a0b6156bdf7c",
>>         "of=/rhev/data-center/mnt/node.local:_srv_data/81292f3f-11d3-4e38-9afa-62e133aa8017/images/c5510e77-1ee0-479c-b6cf-24c179313a45/5ef881f5-c992-48d2-b969-a0b6156bdf7c"
>>     ],
>>     "delta": "0:00:00.004336",
>>     "end": "2018-04-03 15:01:55.581823",
>>     "invocation": {
>>         "module_args": {
>>             "_raw_params": "dd bs=20480 count=1 oflag=direct if=\"/var/tmp/localvmbCDQIR/5ef881f5-c992-48d2-b969-a0b6156bdf7c\" of=\"/rhev/data-center/mnt/node.local:_srv_data/81292f3f-11d3-4e38-9afa-62e133aa8017/images/c5510e77-1ee0-479c-b6cf-24c179313a45/5ef881f5-c992-48d2-b969-a0b6156bdf7c\"",
>>             "_uses_shell": false,
>>             "chdir": null,
>>             "creates": null,
>>             "executable": null,
>>             "removes": null,
>>             "stdin": null,
>>             "warn": true
>>         }
>>     },
>>     "msg": "non-zero return code",
>>     "rc": 1,
>>     "start": "2018-04-03 15:01:55.577487",
>>     "stderr": "dd: failed to open ‘/rhev/data-center/mnt/node.local:_srv_data/81292f3f-11d3-4e38-9afa-62e133aa8017/images/c5510e77-1ee0-479c-b6cf-24c179313a45/5ef881f5-c992-48d2-b969-a0b6156bdf7c’: Permission denied",
"stderr_lines": [ >> "dd: failed to open ā/rhev/data-center/mnt/node.lo >> cal:_srv_data/81292f3f-11d3-4e38-9afa-62e133aa8017/images/c5 >> 510e77-1ee0-479c-b6cf-24c179313a45/5ef881f5-c992-48d2-b969-a0b6156bdf7cā: >> Permission denied" >> ], >> "stdout": "", >> "stdout_lines": [] >> } >> [ ERROR ] Failed to execute stage 'Closing up': Failed executing >> ansible-playbook >> > > In the playbook we have on that task: > become_user: vdsm > become_method: sudo > > but I fear it got somehow ignored. > I'll investigate it. > > >> >> *Output from ansible epdb tracepoint:* >> >> Using module file /usr/lib/python2.7/site-packag >> es/ansible/modules/commands/command.py >> <localhost> ESTABLISH LOCAL CONNECTION FOR USER: root >> <localhost> EXEC /bin/sh -c 'echo ~ && sleep 0' >> <localhost> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo >> /root/.ansible/tmp/ansible-tmp-1522767715.36-81496549401055 `" && echo >> ansible-tmp-1522767715.36-81496549401055="` echo >> /root/.ansible/tmp/ansible-tmp-1522767715.36-81496549401055 `" ) && >> sleep 0' >> <localhost> PUT /tmp/tmpGMGdjh TO /root/.ansible/tmp/ansible-tmp >> -1522767715.36-81496549401055/command.py >> <localhost> EXEC /bin/sh -c 'chmod u+x >> /root/.ansible/tmp/ansible-tmp-1522767715.36-81496549401055/ >> /root/.ansible/tmp/ansible-tmp-1522767715.36-81496549401055/command.py >> && sleep 0' >> <localhost> EXEC /bin/sh -c '/usr/bin/python >> /root/.ansible/tmp/ansible-tmp-1522767715.36-81496549401055/command.py >> && sleep 0' >> to retry, use: --limit @/usr/share/ovirt-hosted-engin >> e-setup/ansible/create_target_vm.retry >> >> The above command.py is the on I have attached as problematic_command.py >> >> *Investigation After Failure:* >> [root@node ~]# ls -al '/rhev/data-center/mnt/node.lo >> cal:_srv_data/81292f3f-11d3-4e38-9afa-62e133aa8017/images/c5 >> 510e77-1ee0-479c-b6cf-24c179313a45/5ef881f5-c992-48d2-b969-a0b6156bdf7c' >> -rw-rw----. 
1 vdsm kvm 20480 Apr 3 15:01 /rhev/data-center/mnt/node.loc >> al:_srv_data/81292f3f-11d3-4e38-9afa-62e133aa8017/images/c55 >> 10e77-1ee0-479c-b6cf-24c179313a45/5ef881f5-c992-48d2-b969-a0b6156bdf7c >> >> sudo -u vdsm dd bs=20480 count=1 oflag=direct >> if="/var/tmp/localvmbCDQIR/5ef881f5-c992-48d2-b969-a0b6156bdf7c" >> of="/rhev/data-center/mnt/node.local:_srv_data/81292f3f-11d3 >> -4e38-9afa-62e133aa8017/images/c5510e77-1ee0-479c-b6cf-24c17 >> 9313a45/5ef881f5-c992-48d2-b969-a0b6156bdf7c >> 1+0 records in >> 1+0 records out >> >> >> It seems to me that somehow it is not getting the right permissions even >> though the playbook has: >> - name: Copy configuration archive to storage >> command: dd bs=20480 count=1 oflag=direct if="{{ LOCAL_VM_DIR }}/{{ >> he_conf_disk_details.disk.image_id }}" of="{{ he_conf_disk_path }}" >> become_user: vdsm >> become_method: sudo >> changed_when: True >> tags: [ 'skip_ansible_lint' ] >> >> >> On Tue, Apr 3, 2018 at 8:51 AM, RabidCicada <rabidcic...@gmail.com> >> wrote: >> >>> I am now also running with: >>> >>> export ANSIBLE_VERBOSITY=5 >>> >>> export ANSIBLE_FORKS=1 >>> >>> export ANSIBLE_KEEP_REMOTE_FILES=1 >>> >>> >>> On Tue, Apr 3, 2018 at 8:49 AM, RabidCicada <rabidcic...@gmail.com> >>> wrote: >>> >>>> Here's the log. >>>> >>>> So the command that it says it ran is: >>>> dd bs=20480 count=1 oflag=direct if=/var/tmp/localvmHaWb6G/1cce >>>> 8df2-1810-4063-b4e2-e19a2c5b1909 of=/rhev/data-center/mnt/node. 
>>>> local:_srv_data/3c7485ea-14e3-40c1-b627-f89a819ed1d6/images/ >>>> 2c1f7c2f-b8f7-46d4-ac66-8ff1e9649e29/1cce8df2-1810-4063-b4e2 >>>> -e19a2c5b1909 >>>> >>>> But we all know that was done with: >>>> >>>> - name: Copy configuration archive to storage >>>> command: dd bs=20480 count=1 oflag=direct if="{{ LOCAL_VM_DIR }}/{{ >>>> he_conf_disk_details.disk.image_id }}" of="{{ he_conf_disk_path }}" >>>> become_user: vdsm >>>> become_method: sudo >>>> changed_when: True >>>> tags: [ 'skip_ansible_lint' ] >>>> >>>> So I manually replicated with `sudo vdsm dd bs=20480 count=1 >>>> if=/var/tmp/localvmHaWb6G/1cce8df2-1810-4063-b4e2-e19a2c5b1909 >>>> of=/rhev/data-center/mnt/node.local:_srv_data/3c7485ea-14e3- >>>> 40c1-b627-f89a819ed1d6/images/2c1f7c2f-b8f7-46d4-ac66-8ff1e9 >>>> 649e29/1cce8df2-1810-4063-b4e2-e19a2c5b1909` >>>> >>>> And it works when I manually do it. Though I think I didn't use the >>>> oflag=direct (Just realised this) >>>> >>>> I eventually put a pause task directly preceeding it with debug output >>>> that showed the file paths. I manually ran the command and it worked. >>>> Then let it do it....failed. I checked all the permissions e.g. >>>> vdsm:kvm. All looks good from a filesystem point of view. I'm beginning >>>> (naively) to suspect a race condition for the file access problem...but >>>> have come nowhere close to solving it. >>>> >>>> *Can you suggest a good way to continue executing the install process >>>> from the create_target_vm.yml playbook (with proper variables and context >>>> from otopi etc)?* I currently have to restart the entire process over >>>> again and wait quite a while for it to circle back around. >>>> >>>> I have since discovered epdb and I've set breakpoints directly in >>>> playbook.py of ansible just to see better log output. I insert epdb.serve >>>> and use netcat to connect since epdb on python 2.7.5 and up seems to have >>>> problems using epdb.connect() itself. 
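As an aside on the dd invocation itself: the parameters copy exactly one 20480-byte block. The pattern can be reproduced anywhere with throwaway temp files (paths here are hypothetical; oflag=direct is omitted since it needs O_DIRECT support from the filesystem holding the temp dir):

```shell
# Create a 20 KiB source file, then copy it with the same bs/count
# the playbook uses; the destination ends up exactly 20480 bytes.
src=$(mktemp)
dst=$(mktemp)
dd if=/dev/zero of="$src" bs=20480 count=1 2>/dev/null
dd bs=20480 count=1 if="$src" of="$dst" 2>/dev/null
wc -c < "$dst"
```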
>>>> ~Kyle
>>>>
>>>> On Tue, Apr 3, 2018 at 4:06 AM, Simone Tiraboschi <stira...@redhat.com> wrote:
>>>>>
>>>>> On Mon, Apr 2, 2018 at 4:52 PM, RabidCicada <rabidcic...@gmail.com> wrote:
>>>>>
>>>>>> Heyo everyone. I'm trying to debug hosted-engine --deploy. It is
>>>>>> failing in `Copy configuration archive to storage` in
>>>>>> `create_target_vm.yml`. My general and most important query here is
>>>>>> how to get good debug output from ansible through hosted-engine. I'm
>>>>>> running hosted-engine through an ssh session.
>>>>>>
>>>>>> I can't figure out how to get good debug output from ansible within
>>>>>> that workflow. I see it's running through otopi. I tried setting the
>>>>>> typical `debugger: on_failed` hooks etc. and tried many incantations
>>>>>> on the command line and in config files to get ansible to help me
>>>>>> out. The `debugger:` directive and other debugger-related ansible
>>>>>> config settings wouldn't result in any debugger popping up. I also
>>>>>> can't seem to pass the normal -vvvv flags through hosted-engine to
>>>>>> ansible. Ultimately I tried to use a `pause` directive and it
>>>>>> complained that it was in a non-interactive shell. I figured that
>>>>>> might be the result of my ssh session, so I enabled tty allocation
>>>>>> with -t -t. It did not resolve the issue.
>>>>>>
>>>>>> I eventually wrote-my-own/stole a callback plugin that checks an
>>>>>> environment variable and sets `display.verbosity = int(v)`, since I
>>>>>> can't seem to pass the typical -vvvv flags to ansible through
>>>>>> `hosted-engine --deploy`. It gives me the best info that I have so
>>>>>> far, but it won't give me enough to debug issues around Gathering
>>>>>> Facts or what looks like a sudo/permission problem in `Copy
>>>>>> configuration archive to storage` in `create_target_vm.yml`. I took
>>>>>> the exact command that they use and it works when I run it manually
>>>>>> (but I can't get debug output to show me the exact sudo command being
>>>>>> executed), hence my interest in passing -vvvv or equivalent to
>>>>>> ansible through `hosted-engine`. I intentionally disabled the VM
>>>>>> directory cleanup so that I could execute the same stuff.
>>>>>>
>>>>>> So... after all that... what is a good way to get deep debug info
>>>>>> from the hosted-engine ansible stuff?
>>>>>
>>>>> You should already find all the relevant log entries in a file called
>>>>> /var/log/ovirt-hosted-engine-setup/ovirt-hosted-engine-setup-ansible-create_target_vm-{timestamp}-{hash}.log
>>>>>
>>>>> Can you please share it?
>>>>>
>>>>>> Or does anyone have intuition for the possible sudo problem?
>>>>>> ~Kyle
>>>>>>
>>>>>> _______________________________________________
>>>>>> Users mailing list
>>>>>> Users@ovirt.org
>>>>>> http://lists.ovirt.org/mailman/listinfo/users
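For anyone curious, the environment-variable callback plugin described in the quoted thread can be sketched roughly as follows. This is a hypothetical reconstruction, not the author's actual plugin: the variable name HE_ANSIBLE_VERBOSITY, the helper function, and the ImportError fallback (which lets the parsing logic run outside an Ansible install) are all my own additions.

```python
# callback_plugins/env_verbosity.py -- hypothetical sketch of a callback
# plugin that raises Ansible's display verbosity from an environment
# variable, for cases where -vvvv cannot be passed on the command line.
import os

try:
    from ansible.plugins.callback import CallbackBase
except ImportError:  # fallback so the parsing helper is usable standalone
    CallbackBase = object


def verbosity_from_env(environ, var='HE_ANSIBLE_VERBOSITY', default=0):
    """Return the verbosity level requested via the environment,
    falling back to `default` on a missing or non-integer value."""
    try:
        return int(environ.get(var, default))
    except (TypeError, ValueError):
        return default


class CallbackModule(CallbackBase):
    """Applies the environment-requested verbosity when Ansible loads
    the plugin (e.g. from a callback_plugins dir next to the playbook)."""
    CALLBACK_VERSION = 2.0
    CALLBACK_TYPE = 'aggregate'
    CALLBACK_NAME = 'env_verbosity'

    def __init__(self):
        super(CallbackModule, self).__init__()
        # _display is provided by CallbackBase inside a real Ansible run
        self._display.verbosity = verbosity_from_env(os.environ)
```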