Michael,

first of all: no need to worry.

This message indicates a data underrun, which is normal and happens from
time to time. A data underrun is not necessarily a problem - but failing to
deal with it appropriately would be.

A good example is the INQUIRY command, which software issues during
SCSI device detection. INQUIRY can return different amounts of INQUIRY
data - depending on which so-called INQUIRY pages the SCSI device
supports. Since software can't know exactly how much data will be
delivered, it provides an INQUIRY data buffer that is supposedly large
enough - and that is often larger than the data actually returned. This
situation results in a data underrun. That's fine and software deals with
it. In this case software even expects it to some degree.
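
To make the terminology concrete, here is a minimal Python sketch of how
a data underrun can be recognized by comparing the requested length with
the length actually delivered. The function name and the status strings
are made up for illustration; this is not zfcp code.

```python
def classify_transfer(requested, transferred):
    """Return (status, residual) for a completed data-in command.

    requested   -- size of the buffer the caller offered, in bytes
    transferred -- number of bytes the device actually delivered
    The residual is the part of the buffer that was left unfilled.
    """
    if requested < 0 or transferred < 0:
        raise ValueError("lengths must not be negative")
    if transferred > requested:
        # More data than the buffer can hold: an overrun, not an underrun.
        return ("overrun", 0)
    residual = requested - transferred
    if residual > 0:
        return ("underrun", residual)
    return ("complete", 0)


# The lengths from the reported message: 768 requested, 756 delivered.
print(classify_transfer(768, 756))  # ('underrun', 12)
```

An underrun by itself only says that the buffer was not filled
completely. Whether that matters depends on the command and on what the
caller does with the residual.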

There are other cases - usually during normal I/O activity - when software
doesn't expect a data underrun to occur. Again, a data underrun is fine as
long as software first detects it, and then deals with it appropriately.
That is, the partially succeeded command is retried, or only the missing
data is requested separately.
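
The second strategy, requesting only the missing data, can be sketched
as follows. This is an illustrative Python model under stated
assumptions: read_at is a hypothetical callback standing in for the
device, and max_attempts is an arbitrary limit. It does not show how the
Linux SCSI stack implements this.

```python
def read_fully(read_at, offset, length, max_attempts=5):
    """Read `length` bytes starting at `offset`, tolerating short reads.

    read_at(offset, length) is a hypothetical callback that returns at
    most `length` bytes and may return fewer (a data underrun).
    """
    data = bytearray()
    attempts = 0
    while len(data) < length:
        if attempts == max_attempts:
            raise IOError("still %d bytes missing after %d attempts"
                          % (length - len(data), attempts))
        # Ask only for the part that has not been delivered yet.
        chunk = read_at(offset + len(data), length - len(data))
        attempts += 1
        data += chunk
    return bytes(data)
```

The important point is that the caller looks at how much data really
arrived instead of assuming that the whole buffer was filled.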

Linux and zfcp handle data underruns correctly. However, a bug related to
data underruns was discovered and fixed in the zfcp driver at the end of
last year. It could actually lead to data integrity problems (it applied to
any SCSI device attached through zfcp and was fixed for both 2.4 and 2.6):

http://www-128.ibm.com/developerworks/linux/linux390/linux-2.4.21-s390-23-june2003.html
Description: zfcp: Data Miscompares with FCP attached FAStT and cable pull.
Symptom:     BLAST reported "File ID Miscompare" during test runs.
Problem:     During cable pull data underruns occurred for SCSI commands.
             The data underruns were not correctly reported to the SCSI
             stack. So the SCSI stack handled those commands as completed
             successfully.
Solution:    Set DID_ERROR as result for SCSI commands with data underrun
             and correctly set resid field for such commands.
Problem-ID:  10999
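
The idea behind this fix can be modelled in a few lines of Python. This
is only a sketch of the principle, not the actual driver change: the
Command class and complete_command are invented for illustration. The
DID_* values and the position of the host byte in bits 16-23 of the
result follow the Linux SCSI headers.

```python
DID_OK = 0x00      # host byte: no error
DID_ERROR = 0x07   # host byte: internal error


class Command:
    """Hypothetical stand-in for a SCSI command with a data buffer."""

    def __init__(self, bufflen):
        self.bufflen = bufflen  # bytes requested by the upper layer
        self.result = 0         # host byte lives in bits 16-23
        self.resid = 0          # bytes that were NOT transferred


def complete_command(cmd, transferred):
    """Record the outcome of a transfer on the command."""
    if transferred < cmd.bufflen:
        # Data underrun: report it instead of claiming success.
        cmd.resid = cmd.bufflen - transferred
        cmd.result = DID_ERROR << 16
    else:
        cmd.resid = 0
        cmd.result = DID_OK << 16
    return cmd


cmd = complete_command(Command(768), 756)
print(cmd.resid, hex(cmd.result))  # 12 0x70000
```

With the result and the residual set this way, the upper layers can see
that the command did not complete in full and can react, rather than
treating a partially filled buffer as valid data.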

Your SLES kernel comes with this fix.

In the course of debugging and fixing the above issue we changed the
logging level (urgency) of the message you have encountered. It has
unintentionally remained that way for some time, at least in the 2.4
version of zfcp. The 2.6 incarnation of zfcp doesn't print this message at
the default zfcp logging level.

I acknowledge that it doesn't make much sense to scare users with this
message as long as we are sure that the underrun is handled correctly. I am
going to change the message for 2.4 as well.

Thanks for reporting this. I am sorry for the confusion caused by the
message.


With kind regards

Martin Peschke

IBM Deutschland Entwicklung GmbH
Linux for zSeries Development
Phone: +49-(0)7031-16-2349

----- Forwarded by Martin Peschke3/Germany/IBM on 22/07/2005 13:29 -----
Heiko Carstens
21/07/2005 21:32

To:       Martin Peschke3/Germany/[EMAIL PROTECTED],
          Maxim Shchetynin/Germany/[EMAIL PROTECTED],
          Andreas Herrmann/Germany/[EMAIL PROTECTED]
cc:
Subject:  Fw: SLES 8 FCP/hwscan error message



Best regards,
Heiko Carstens

Linux for zSeries Development
IBM Deutschland Entwicklung GmbH

----- Forwarded by Heiko Carstens/Germany/IBM on 21.07.2005 21:27 -----
                                                                       
Michael Lambert <[EMAIL PROTECTED]>
Sent by: Linux on 390 Port <[EMAIL PROTECTED]IST.EDU>
21.07.2005 21:08
Please respond to Linux on 390 Port

To:       LINUX-390@VM.MARIST.EDU
cc:
Subject:  SLES 8 FCP/hwscan error message




Hello, everyone.

I've noticed a somewhat disturbing error message that is generated on
our SLES 8 (31 bit) guests using FCP SCSI disks whenever the
command "hwscan --disk" is run. The errors are always reproducible if
the zfcp, scsi_mod and sd_mod drivers are loaded and an FCP disk is
defined, regardless of whether the disk is mounted or not. We're running
on a z800 (model 2066) under z/VM 5.1, but this error also occurred under
4.4.

The message, generated by the zfcp module, is as follows:

 zfcp: FSF: zfcp_fsf_send_fcp_command_task_handler: A data underrun was
detected for a command. This happened for a command to the unit with
FCP_LUN 0x5052000000000000 connected to the port with WWPN
0x5005076300ca15e5 at the adapter with devno 0x5000. The response data
length is 756, the original length was 768.

We've reproduced this message using LUNs provided by an F20 model ESS
and an XP512 model HP storage array. We can assign the exact same LUNs
to SLES 9 (31 bit) guests and we don't see the message at all.

Our SLES 8 guests are current on patches and are running kernel version
2.4.21-292-default.

I'm currently working this problem with IBM support but they are
focusing on the microcode level of the F20 ESS. It is under the
recommended level and we are planning an upgrade in the near future but
this does nothing to resolve the errors that are generated whenever an
HP lun is attached.

We haven't noticed any data integrity issues, but this message is quite
disconcerting to see considering the criticality of the data that is
being stored on some of these FCP disks. What I'd really like to know is
how serious this message is. There have been no apparent problems,
but is this an indicator of trouble on the horizon? Could anyone
familiar with the zfcp module as shipped by SUSE with SLES 8 comment on
this?

Thanks,

Michael L.

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or
visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
