Re: [Users] Clarify message: Failed to connect Host to Storage Pool Default
- Original Message -
> From: Gianluca Cecchi gianluca.cec...@gmail.com
> To: Omer Frenkel ofren...@redhat.com
> Cc: users users@ovirt.org
> Sent: Tuesday, August 27, 2013 4:57:45 PM
> Subject: Re: [Users] Clarify message: Failed to connect Host to Storage Pool Default
>
> On Tue, Aug 27, 2013 at 3:14 PM, Omer Frenkel wrote:
> > - Original Message -
> > > From: Gianluca Cecchi gianluca.cec...@gmail.com
> > > To: users users@ovirt.org
> > > Sent: Monday, August 26, 2013 6:44:39 PM
> > > Subject: [Users] Clarify message: Failed to connect Host to Storage Pool Default
> > >
> > > Hello,
> > > after an induced failure of a whole site for testing reaction and restart, what would be the sequence of actions to do from a physical point of view and from a gui point of view after powering on the hw components?
> > > [snip]
> > > I'm going to eventually send full logs, but I would like to ask if it is possible to send clearer messages inside the gui, for example what are the SDs that the host cannot access in case there are many of them.
> >
> > afair, the SDs are not logged in the audit log since you might have 20 domains or more, and it would not look good, so the full information is in the log, and the audit log just gives you general information about what is wrong.
>
> OK. What about recording the first one (say SD1) and putting in the audit log (does this term mean what is displayed in the web admin gui?) something like:
>
> Host XXX cannot access at least storage domain SD1 attached to the Data Center Default. See logfile (which one? put the path in message) for full log. Setting Host state to Non-Operational.
>
> Does this mean that if only one out of 20 SDs is not able to be reconnected all the DC is automatically impacted?
>
> Questions:
> 1) suppose one out of 20 SDs is not able to be reconnected (hw failure caused by power fault): what are the steps to correct/acknowledge the failure and at least let the VMs not depending on it start, while one analyzes the problem and resolves it?

if only one (or a few) domains are problematic then the dc should be able to recover to up state, and only these domains will be in 'inactive' status. vms that do not depend on these should work ok. this should happen automatically, no manual steps needed.

> 2) suppose that the particular faulty SD is the one that was the SD Master before the crash: does this mean I am forced to use some db commands to switch it to an available SD, or can I follow the steps in 1) (if there are...) and another SD will be automatically elected as the new Master?

no, there is a mechanism to change the master domain to some other available domain, assuming there is one. this is the reconstruct master that you see in the logs.

> > sounds like error connecting to your storage
>
> Yes, in my simulation I have an IBM DS6800 where I can formally reach the SAN disks from hosts but the TUR command configured in multipath fails (and for example the command fdisk -l /dev/sdb, where sdb is one disk on the san, exits with error invalid parameter due to DS6800 incorrect configuration)
>
> Thanks in advance.
> Gianluca

___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users
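The recovery behaviour described in the reply above can be pictured with a small toy model. This is only a sketch of the logic as stated in the thread (problematic domains become inactive, and if the master is among them another available domain takes over); it is not oVirt engine code, and the names StorageDomain, recover_data_center, the reachable flag and the status strings are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StorageDomain:
    name: str
    reachable: bool          # hypothetical flag: can the hosts access this domain?
    is_master: bool = False
    status: str = "unknown"


def recover_data_center(domains: List[StorageDomain]) -> Optional[StorageDomain]:
    """Toy model of the recovery described in the thread.

    Marks unreachable domains 'inactive' and the rest 'active', then makes
    sure the master role sits on a reachable domain. Returns the master
    after recovery, or None if no domain is reachable at all.
    """
    for domain in domains:
        domain.status = "active" if domain.reachable else "inactive"

    master = next((d for d in domains if d.is_master), None)
    if master is not None and master.reachable:
        return master  # master is fine, nothing to reconstruct

    # "Reconstruct master": hand the role to the first reachable domain.
    candidate = next((d for d in domains if d.reachable), None)
    if candidate is None:
        return None  # nothing reachable; the data center cannot come up
    if master is not None:
        master.is_master = False
    candidate.is_master = True
    return candidate


# Example: the old master SD1 is the faulty domain.
domains = [
    StorageDomain("SD1", reachable=False, is_master=True),
    StorageDomain("SD2", reachable=True),
    StorageDomain("SD3", reachable=True),
]
new_master = recover_data_center(domains)
print(new_master.name, [(d.name, d.status) for d in domains])
```

In this model a single faulty domain, even the master, leaves the other domains active, which matches the answers to questions 1) and 2). When every domain is unreachable, as in the 6-domain Fibre Channel case reported in the thread, the function returns None, which corresponds to the failed master reconstruction.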
Re: [Users] Clarify message: Failed to connect Host to Storage Pool Default
- Original Message -
> From: Gianluca Cecchi gianluca.cec...@gmail.com
> To: users users@ovirt.org
> Sent: Monday, August 26, 2013 6:44:39 PM
> Subject: [Users] Clarify message: Failed to connect Host to Storage Pool Default
>
> Hello,
> after an induced failure of a whole site for testing reaction and restart, what would be the sequence of actions to do from a physical point of view and from a gui point of view after powering on the hw components?
>
> In webadmin portal I see these 3 messages in host view events after SAN and then host and engine restart:
>
> Detected new Host XXX. Host state was set to Up.
> Host XXX cannot access one of the Storage Domains attached to the Data Center Default. Setting Host state to Non-Operational.
> Failed to connect Host XXX to Storage Pool Default
>
> hosts and engine are fedora 18 with 3.2.2-1.1.fc18 and there are 6 fibrechannel storage domains, all down and in Unknown crossdatacenter status. one of them is flagged as (master)...
>
> I'm going to eventually send full logs, but I would like to ask if it is possible to send clearer messages inside the gui, for example what are the SDs that the host cannot access in case there are many of them.

afair, the SDs are not logged in the audit log since you might have 20 domains or more, and it would not look good, so the full information is in the log, and the audit log just gives you general information about what is wrong.

> What does Storage Pool Default mean? Is the term Default used because my DC (and my cluster) are named Default? What exactly does Storage Pool mean: the Master Data Domain, or the combination of the defined storage domains?

Storage pool is the Data center.
Default is the name of your data center.

> btw: In storage view I see:
> Failed to Reconstruct Master Domain for Data Center Default

sounds like error connecting to your storage

> Thanks in advance
> Gianluca
Re: [Users] Clarify message: Failed to connect Host to Storage Pool Default
On Tue, Aug 27, 2013 at 3:14 PM, Omer Frenkel wrote:
> - Original Message -
> > From: Gianluca Cecchi gianluca.cec...@gmail.com
> > To: users users@ovirt.org
> > Sent: Monday, August 26, 2013 6:44:39 PM
> > Subject: [Users] Clarify message: Failed to connect Host to Storage Pool Default
> >
> > Hello,
> > after an induced failure of a whole site for testing reaction and restart, what would be the sequence of actions to do from a physical point of view and from a gui point of view after powering on the hw components?
> > [snip]
> > I'm going to eventually send full logs, but I would like to ask if it is possible to send clearer messages inside the gui, for example what are the SDs that the host cannot access in case there are many of them.
>
> afair, the SDs are not logged in the audit log since you might have 20 domains or more, and it would not look good, so the full information is in the log, and the audit log just gives you general information about what is wrong.

OK. What about recording the first one (say SD1) and putting in the audit log (does this term mean what is displayed in the web admin gui?) something like:

Host XXX cannot access at least storage domain SD1 attached to the Data Center Default. See logfile (which one? put the path in message) for full log. Setting Host state to Non-Operational.

Does this mean that if only one out of 20 SDs is not able to be reconnected all the DC is automatically impacted?

Questions:
1) suppose one out of 20 SDs is not able to be reconnected (hw failure caused by power fault): what are the steps to correct/acknowledge the failure and at least let the VMs not depending on it start, while one analyzes the problem and resolves it?
2) suppose that the particular faulty SD is the one that was the SD Master before the crash: does this mean I am forced to use some db commands to switch it to an available SD, or can I follow the steps in 1) (if there are...) and another SD will be automatically elected as the new Master?

> sounds like error connecting to your storage

Yes, in my simulation I have an IBM DS6800 where I can formally reach the SAN disks from hosts but the TUR command configured in multipath fails (and for example the command fdisk -l /dev/sdb, where sdb is one disk on the san, exits with error invalid parameter due to DS6800 incorrect configuration)

Thanks in advance.
Gianluca
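The message format proposed in this mail (name the first inaccessible storage domain and point to the full log) could be built along these lines. This is a minimal sketch, not the engine's audit-log code: the function name format_non_operational_message and its parameters are hypothetical, and the log path is left to the caller because the thread does not say which file it should be.

```python
from typing import Sequence


def format_non_operational_message(
    host: str,
    data_center: str,
    inaccessible_domains: Sequence[str],
    log_path: str,
) -> str:
    """Build a one-line event text that names the first inaccessible domain.

    All parameter names are illustrative; log_path is whatever log file the
    caller wants to point the administrator to.
    """
    if not inaccessible_domains:
        raise ValueError("at least one inaccessible storage domain is required")

    first = inaccessible_domains[0]
    others = len(inaccessible_domains) - 1
    # Keep the event short even with 20+ domains: name one, count the rest.
    extra = f" (and {others} more)" if others else ""
    return (
        f"Host {host} cannot access at least storage domain {first}{extra} "
        f"attached to the Data Center {data_center}. "
        f"See {log_path} for the full list. "
        "Setting Host state to Non-Operational."
    )


# Example with a placeholder log path.
print(format_non_operational_message("XXX", "Default", ["SD1", "SD4"], "<log path>"))
```

Naming only the first domain and counting the rest keeps the event text short, which addresses the concern about listing 20 or more domains in the audit log.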
[Users] Clarify message: Failed to connect Host to Storage Pool Default
Hello,
after an induced failure of a whole site for testing reaction and restart, what would be the sequence of actions to do from a physical point of view and from a gui point of view after powering on the hw components?

In webadmin portal I see these 3 messages in host view events after SAN and then host and engine restart:

Detected new Host XXX. Host state was set to Up.
Host XXX cannot access one of the Storage Domains attached to the Data Center Default. Setting Host state to Non-Operational.
Failed to connect Host XXX to Storage Pool Default

hosts and engine are fedora 18 with 3.2.2-1.1.fc18 and there are 6 fibrechannel storage domains, all down and in Unknown crossdatacenter status. one of them is flagged as (master)...

I'm going to eventually send full logs, but I would like to ask if it is possible to send clearer messages inside the gui, for example what are the SDs that the host cannot access in case there are many of them.

What does Storage Pool Default mean? Is the term Default used because my DC (and my cluster) are named Default? What exactly does Storage Pool mean: the Master Data Domain, or the combination of the defined storage domains?

btw: In storage view I see:
Failed to Reconstruct Master Domain for Data Center Default

Thanks in advance
Gianluca