Re: [Users] Clarify message: Failed to connect Host to Storage Pool Default

2013-08-28 Thread Omer Frenkel


- Original Message -
 From: Gianluca Cecchi gianluca.cec...@gmail.com
 To: Omer Frenkel ofren...@redhat.com
 Cc: users users@ovirt.org
 Sent: Tuesday, August 27, 2013 4:57:45 PM
 Subject: Re: [Users] Clarify message: Failed to connect Host to Storage Pool 
 Default
 
 On Tue, Aug 27, 2013 at 3:14 PM, Omer Frenkel  wrote:
 
 
  - Original Message -
  From: Gianluca Cecchi gianluca.cec...@gmail.com
  To: users users@ovirt.org
  Sent: Monday, August 26, 2013 6:44:39 PM
  Subject: [Users] Clarify message: Failed to connect Host to Storage Pool
  Default
 
  Hello,
  after an induced failure of a whole site for testing reaction and
  restart, what would be the sequence of actions to take from a physical
  point of view and from a GUI point of view after powering on the hw
  components?
 
 [snip]
 
  I'm going to eventually send full logs, but I would like to ask if it
  is possible to send clearer messages inside the gui, for example what
  are the SDs that the host cannot access in case there are many of
  them.
 
  afair, the SDs are not listed in the audit log since you might have 20
  domains or more and it would not look good, so the full information is
  in the log, and the audit log just gives you general information about
  what is wrong.
 
 OK. What about recording the first one (say SD1) and putting in the audit
 log (does this term mean what is displayed in the webadmin GUI?) something
 like
 
 Host XXX cannot access at least storage domain SD1 attached to the
 Data Center Default. See logfile (which one? put the path in message)
 for full log. Setting Host state to Non-Operational.
 
 Does this mean that if only one out of 20 SDs cannot be reconnected,
 the whole DC is automatically impacted?
 
 Questions:
 1) suppose one out of 20 SDs cannot be reconnected (hw failure
 caused by power fault):
 what are the steps to correct/acknowledge the failure and at least
 start the VMs that do not depend on it, while one analyzes the
 problem and resolves it?
 

if only one (or a few) domains are problematic, the DC should be able to
recover to the Up state, and only those domains will be in 'inactive' status.
VMs that do not depend on them should work OK.
this should happen automatically; no manual steps are needed.
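
The behaviour described above can be pictured with a small sketch. This is not engine code: the class, the function names and the "Not Up" label are hypothetical, and the rule "the DC is Up while at least one domain is reachable" is a simplifying assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class StorageDomain:
    name: str        # e.g. "SD1" (illustrative name)
    reachable: bool  # whether the host can currently access this domain


def summarize_data_center(domains):
    """Return (dc_status, inactive_names) under a simplified rule.

    Assumption: the DC counts as "Up" while at least one domain is
    reachable; "Not Up" is a placeholder label, not a real engine status.
    """
    inactive = [d.name for d in domains if not d.reachable]
    dc_status = "Up" if any(d.reachable for d in domains) else "Not Up"
    return dc_status, inactive


def runnable_vms(vm_domains, inactive):
    """Return the VMs whose domains are all outside the inactive list."""
    blocked = set(inactive)
    return [vm for vm, used in vm_domains.items() if not blocked & set(used)]
```

With 20 domains of which only SD1 is unreachable, the sketch reports the DC as "Up", lists SD1 as inactive, and keeps every VM that does not use SD1 in the runnable list.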

 2) suppose that the faulty SD is the one that was the Master SD
 before the crash: does this mean I am forced to use some db
 commands to switch the master to an available SD, or can I follow the
 steps in 1) (if there are any) and another SD will be automatically
 elected as the new Master?
 

no, there is a mechanism that changes the master domain to another available
domain, assuming there is one.
this is the 'reconstruct master' operation that you see in the logs.
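
As a rough illustration of the idea (not the engine's actual algorithm, whose selection criteria are not described in this thread), a master re-election could look like the sketch below. The function name and the "first reachable domain in name order" rule are assumptions.

```python
def pick_new_master(domains, current_master):
    """Pick a master domain name, or None if no domain is reachable.

    domains: dict mapping domain name -> True if reachable.
    Assumption: any reachable domain may take over, and ties are
    broken by name order purely to keep the example deterministic.
    """
    if domains.get(current_master):
        return current_master  # the current master is fine, keep it
    for name in sorted(domains):
        if name != current_master and domains[name]:
            return name
    return None  # nothing available: the reconstruct cannot succeed
```

In this sketch, when every domain is unreachable the function returns None, which would be consistent with a failed reconstruct when all storage is down.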

 
  sounds like an error connecting to your storage
 
 Yes, in my simulation I have an IBM DS6800 where I can formally reach
 the SAN disks from the hosts, but the TUR (Test Unit Ready) command
 configured in multipath fails (and, for example, the command
 fdisk -l /dev/sdb, where sdb is one disk on the SAN, exits with an
 invalid parameter error due to the incorrect DS6800 configuration)
 
 Thanks in advance.
 
 Gianluca
 
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Clarify message: Failed to connect Host to Storage Pool Default

2013-08-27 Thread Omer Frenkel


- Original Message -
 From: Gianluca Cecchi gianluca.cec...@gmail.com
 To: users users@ovirt.org
 Sent: Monday, August 26, 2013 6:44:39 PM
 Subject: [Users] Clarify message: Failed to connect Host to Storage Pool 
 Default
 
 Hello,
 after an induced failure of a whole site for testing reaction and
 restart, what would be the sequence of actions to take from a physical
 point of view and from a GUI point of view after powering on the hw
 components?
 In the webadmin portal I see these 3 messages in the host view events
 after restarting the SAN and then the hosts and the engine:
 
 Detected new Host XXX. Host state was set to Up.
 Host XXX cannot access one of the Storage Domains attached to the Data
 Center Default. Setting Host state to Non-Operational.
 Failed to connect Host XXX to Storage Pool Default
 
 hosts and engine are Fedora 18 with 3.2.2-1.1.fc18 and there are 6
 Fibre Channel storage domains, all down and in Unknown cross data center
 status. One of them is flagged as (master)...
 
 I'm going to eventually send full logs, but I would like to ask if it
 is possible to send clearer messages inside the gui, for example what
 are the SDs that the host cannot access in case there are many of
 them.

afair, the SDs are not listed in the audit log since you might have 20
domains or more and it would not look good, so the full information is
in the log, and the audit log just gives you general information about
what is wrong.

 
 What does Storage Pool Default mean?
 Is the term Default used because my DC (and my cluster) are named Default?
 What exactly does Storage Pool mean? The Master Data Domain or
 the combination of the defined storage domains?

Storage Pool means Data Center.
Default is the name of your data center.

 
 btw: In storage view I see:
 Failed to Reconstruct Master Domain for Data Center Default

sounds like an error connecting to your storage

 
 Thanks in advance
 
 Gianluca


Re: [Users] Clarify message: Failed to connect Host to Storage Pool Default

2013-08-27 Thread Gianluca Cecchi
On Tue, Aug 27, 2013 at 3:14 PM, Omer Frenkel  wrote:


 - Original Message -
 From: Gianluca Cecchi gianluca.cec...@gmail.com
 To: users users@ovirt.org
 Sent: Monday, August 26, 2013 6:44:39 PM
 Subject: [Users] Clarify message: Failed to connect Host to Storage Pool
  Default

 Hello,
 after an induced failure of a whole site for testing reaction and
 restart, what would be the sequence of actions to take from a physical
 point of view and from a GUI point of view after powering on the hw
 components?

[snip]

 I'm going to eventually send full logs, but I would like to ask if it
 is possible to send clearer messages inside the gui, for example what
 are the SDs that the host cannot access in case there are many of
 them.

 afair, the SDs are not listed in the audit log since you might have 20
 domains or more and it would not look good, so the full information is
 in the log, and the audit log just gives you general information about
 what is wrong.

OK. What about recording the first one (say SD1) and putting in the audit
log (does this term mean what is displayed in the webadmin GUI?) something
like

Host XXX cannot access at least storage domain SD1 attached to the
Data Center Default. See logfile (which one? put the path in message)
for full log. Setting Host state to Non-Operational.
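
The proposed wording can be sketched as a small formatting helper. This is only an illustration of the suggestion, not how the engine builds its audit log entries; the function name, its parameters and the "(and N more)" suffix are hypothetical, and the log path is left as a parameter because the thread does not say which file it is.

```python
def build_audit_message(host, data_center, inaccessible, log_path):
    """Build one audit line naming only the first inaccessible domain.

    inaccessible: list of domain names the host cannot reach.
    log_path: where the full details live (caller-supplied placeholder).
    Returns None when every domain is accessible.
    """
    if not inaccessible:
        return None
    first = inaccessible[0]
    extra = len(inaccessible) - 1
    # Hypothetical addition: hint at how many other domains are affected.
    more = f" (and {extra} more)" if extra else ""
    return (
        f"Host {host} cannot access at least storage domain {first}{more} "
        f"attached to the Data Center {data_center}. "
        f"See {log_path} for full log. "
        "Setting Host state to Non-Operational."
    )
```

Naming one domain keeps the line short even with 20 or more domains, while the optional count hints that the log holds further entries.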

Does this mean that if only one out of 20 SDs cannot be reconnected,
the whole DC is automatically impacted?

Questions:
1) suppose one out of 20 SDs cannot be reconnected (hw failure
caused by power fault):
what are the steps to correct/acknowledge the failure and at least
start the VMs that do not depend on it, while one analyzes the
problem and resolves it?

2) suppose that the faulty SD is the one that was the Master SD
before the crash: does this mean I am forced to use some db
commands to switch the master to an available SD, or can I follow the
steps in 1) (if there are any) and another SD will be automatically
elected as the new Master?


 sounds like an error connecting to your storage

Yes, in my simulation I have an IBM DS6800 where I can formally reach
the SAN disks from the hosts, but the TUR (Test Unit Ready) command
configured in multipath fails (and, for example, the command
fdisk -l /dev/sdb, where sdb is one disk on the SAN, exits with an
invalid parameter error due to the incorrect DS6800 configuration)
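
A disk that is visible but not usable can be told apart from a healthy one with a plain read probe. The sketch below is not a TUR check and does not involve multipath; it only tries to open a path and read one sector. The function name is hypothetical, and reading a real device such as /dev/sdb normally requires root.

```python
import os


def probe_read(path, nbytes=512):
    """Try to read nbytes from path; return (ok, detail).

    A device node can exist (open succeeds) while reads still fail,
    so the open and read steps are reported separately.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return False, f"open failed: {exc.strerror}"
    try:
        data = os.read(fd, nbytes)
    except OSError as exc:
        return False, f"read failed: {exc.strerror}"
    finally:
        os.close(fd)
    return True, f"read {len(data)} bytes"
```

Calling probe_read("/dev/sdb") on each SAN disk would show which ones fail at open time and which ones fail only when data is actually read.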

Thanks in advance.

Gianluca


[Users] Clarify message: Failed to connect Host to Storage Pool Default

2013-08-26 Thread Gianluca Cecchi
Hello,
after an induced failure of a whole site for testing reaction and
restart, what would be the sequence of actions to take from a physical
point of view and from a GUI point of view after powering on the hw
components?
In the webadmin portal I see these 3 messages in the host view events
after restarting the SAN and then the hosts and the engine:

Detected new Host XXX. Host state was set to Up.
Host XXX cannot access one of the Storage Domains attached to the Data
Center Default. Setting Host state to Non-Operational.
Failed to connect Host XXX to Storage Pool Default

hosts and engine are Fedora 18 with 3.2.2-1.1.fc18 and there are 6
Fibre Channel storage domains, all down and in Unknown cross data center
status. One of them is flagged as (master)...

I'm going to eventually send full logs, but I would like to ask if it
is possible to send clearer messages inside the gui, for example what
are the SDs that the host cannot access in case there are many of
them.

What does Storage Pool Default mean?
Is the term Default used because my DC (and my cluster) are named Default?
What exactly does Storage Pool mean? The Master Data Domain or
the combination of the defined storage domains?

btw: In storage view I see:
Failed to Reconstruct Master Domain for Data Center Default

Thanks in advance

Gianluca