Re: [PATCH v2] scsi: 3w-xxxx: fix a missing-check bug
On Mon, May 7, 2018 at 5:54 PM, Wenwen Wang <wang6...@umn.edu> wrote: > In tw_chrdev_ioctl(), the length of the data buffer is firstly copied from > the userspace pointer 'argp' and saved to the kernel object > 'data_buffer_length'. Then a security check is performed on it to make sure > that the length is not more than 'TW_MAX_IOCTL_SECTORS * 512'. Otherwise, > an error code -EINVAL is returned. If the security check is passed, the > entire ioctl command is copied again from the 'argp' pointer and saved to > the kernel object 'tw_ioctl'. Then, various operations are performed on > 'tw_ioctl' according to the 'cmd'. Given that the 'argp' pointer resides in > userspace, a malicious userspace process can race to change the buffer > length between the two copies. This way, the user can bypass the security > check and inject invalid data buffer length. This can cause potential > security issues in the following execution. > > This patch checks for capable(CAP_SYS_ADMIN) in tw_chrdev_open() to avoid > the above issues. > > Signed-off-by: Wenwen Wang <wang6...@umn.edu> > --- > drivers/scsi/3w-.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/scsi/3w-.c b/drivers/scsi/3w-.c > index 33261b6..f6179e3 100644 > --- a/drivers/scsi/3w-.c > +++ b/drivers/scsi/3w-.c > @@ -1033,6 +1033,9 @@ static int tw_chrdev_open(struct inode *inode, struct > file *file) > > dprintk(KERN_WARNING "3w-: tw_ioctl_open()\n"); > > + if (!capable(CAP_SYS_ADMIN)) > + return -EACCES; > + > minor_number = iminor(inode); > if (minor_number >= tw_device_extension_count) > return -ENODEV; > -- > 2.7.4 > Acked-by: Adam Radford <aradf...@gmail.com>
Re: [PATCH v2] scsi: 3w-xxxx: fix a missing-check bug
On Mon, May 7, 2018 at 5:54 PM, Wenwen Wang wrote: > In tw_chrdev_ioctl(), the length of the data buffer is firstly copied from > the userspace pointer 'argp' and saved to the kernel object > 'data_buffer_length'. Then a security check is performed on it to make sure > that the length is not more than 'TW_MAX_IOCTL_SECTORS * 512'. Otherwise, > an error code -EINVAL is returned. If the security check is passed, the > entire ioctl command is copied again from the 'argp' pointer and saved to > the kernel object 'tw_ioctl'. Then, various operations are performed on > 'tw_ioctl' according to the 'cmd'. Given that the 'argp' pointer resides in > userspace, a malicious userspace process can race to change the buffer > length between the two copies. This way, the user can bypass the security > check and inject invalid data buffer length. This can cause potential > security issues in the following execution. > > This patch checks for capable(CAP_SYS_ADMIN) in tw_chrdev_open() to avoid > the above issues. > > Signed-off-by: Wenwen Wang > --- > drivers/scsi/3w-.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/scsi/3w-.c b/drivers/scsi/3w-.c > index 33261b6..f6179e3 100644 > --- a/drivers/scsi/3w-.c > +++ b/drivers/scsi/3w-.c > @@ -1033,6 +1033,9 @@ static int tw_chrdev_open(struct inode *inode, struct > file *file) > > dprintk(KERN_WARNING "3w-: tw_ioctl_open()\n"); > > + if (!capable(CAP_SYS_ADMIN)) > + return -EACCES; > + > minor_number = iminor(inode); > if (minor_number >= tw_device_extension_count) > return -ENODEV; > -- > 2.7.4 > Acked-by: Adam Radford
Re: [PATCH v2] scsi: 3w-9xxx: fix a missing-check bug
On Mon, May 7, 2018 at 5:46 PM, Wenwen Wang <wang6...@umn.edu> wrote: > In twa_chrdev_ioctl(), the ioctl driver command is firstly copied from the > userspace pointer 'argp' and saved to the kernel object 'driver_command'. > Then a security check is performed on the data buffer size indicated by > 'driver_command', which is 'driver_command.buffer_length'. If the security > check is passed, the entire ioctl command is copied again from the 'argp' > pointer and saved to the kernel object 'tw_ioctl'. Then, various operations > are performed on 'tw_ioctl' according to the 'cmd'. Given that the 'argp' > pointer resides in userspace, a malicious userspace process can race to > change the buffer size between the two copies. This way, the user can > bypass the security check and inject invalid data buffer size. This can > cause potential security issues in the following execution. > > This patch checks for capable(CAP_SYS_ADMIN) in twa_chrdev_open()t o avoid > the above issues. > > Signed-off-by: Wenwen Wang <wang6...@umn.edu> > --- > drivers/scsi/3w-9xxx.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c > index b42c9c4..99ba4a7 100644 > --- a/drivers/scsi/3w-9xxx.c > +++ b/drivers/scsi/3w-9xxx.c > @@ -882,6 +882,11 @@ static int twa_chrdev_open(struct inode *inode, struct > file *file) > unsigned int minor_number; > int retval = TW_IOCTL_ERROR_OS_ENODEV; > > + if (!capable(CAP_SYS_ADMIN)) { > + retval = -EACCES; > + goto out; > + } > + > minor_number = iminor(inode); > if (minor_number >= twa_device_extension_count) > goto out; > -- > 2.7.4 > Acked-by: Adam Radford <aradf...@gmail.com>
Re: [PATCH v2] scsi: 3w-9xxx: fix a missing-check bug
On Mon, May 7, 2018 at 5:46 PM, Wenwen Wang wrote: > In twa_chrdev_ioctl(), the ioctl driver command is firstly copied from the > userspace pointer 'argp' and saved to the kernel object 'driver_command'. > Then a security check is performed on the data buffer size indicated by > 'driver_command', which is 'driver_command.buffer_length'. If the security > check is passed, the entire ioctl command is copied again from the 'argp' > pointer and saved to the kernel object 'tw_ioctl'. Then, various operations > are performed on 'tw_ioctl' according to the 'cmd'. Given that the 'argp' > pointer resides in userspace, a malicious userspace process can race to > change the buffer size between the two copies. This way, the user can > bypass the security check and inject invalid data buffer size. This can > cause potential security issues in the following execution. > > This patch checks for capable(CAP_SYS_ADMIN) in twa_chrdev_open()t o avoid > the above issues. > > Signed-off-by: Wenwen Wang > --- > drivers/scsi/3w-9xxx.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c > index b42c9c4..99ba4a7 100644 > --- a/drivers/scsi/3w-9xxx.c > +++ b/drivers/scsi/3w-9xxx.c > @@ -882,6 +882,11 @@ static int twa_chrdev_open(struct inode *inode, struct > file *file) > unsigned int minor_number; > int retval = TW_IOCTL_ERROR_OS_ENODEV; > > + if (!capable(CAP_SYS_ADMIN)) { > + retval = -EACCES; > + goto out; > + } > + > minor_number = iminor(inode); > if (minor_number >= twa_device_extension_count) > goto out; > -- > 2.7.4 > Acked-by: Adam Radford
Re: [PATCH] scsi: 3w-xxxx: fix a missing-check bug
On Sat, May 5, 2018 at 10:48 PM, Wenwen Wangwrote: > In tw_chrdev_ioctl(), the length of the data buffer is firstly copied from > the userspace pointer 'argp' and saved to the kernel object > 'data_buffer_length'. Then a security check is performed on it to make sure > that the length is not more than 'TW_MAX_IOCTL_SECTORS * 512'. Otherwise, > an error code -EINVAL is returned. If the security check is passed, the > entire ioctl command is copied again from the 'argp' pointer and saved to > the kernel object 'tw_ioctl'. Then, various operations are performed on > 'tw_ioctl' according to the 'cmd'. Given that the 'argp' pointer resides in > userspace, a malicious userspace process can race to change the buffer > length between the two copies. This way, the user can bypass the security > check and inject invalid data buffer length. This can cause potential > security issues in the following execution. > > This patch checks the buffer length obtained in the second copy. An error > code -EINVAL will be returned if it is not same as the original one in the > first copy. > > Signed-off-by: Wenwen Wang > --- > drivers/scsi/3w-.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/drivers/scsi/3w-.c b/drivers/scsi/3w-.c > index 33261b6..ef79194 100644 > --- a/drivers/scsi/3w-.c > +++ b/drivers/scsi/3w-.c > @@ -919,6 +919,10 @@ static long tw_chrdev_ioctl(struct file *file, unsigned > int cmd, unsigned long a > /* Now copy down the entire ioctl */ > if (copy_from_user(tw_ioctl, argp, data_buffer_length + > sizeof(TW_New_Ioctl) - 1)) > goto out2; > + if (tw_ioctl->data_buffer_length != data_buffer_length) { > + retval = -EINVAL; > + goto out2; > + } > > passthru = (TW_Passthru *)_ioctl->firmware_command; > > -- > 2.7.4 > I would drop this patch and check for !capable(CAP_SYS_ADMIN) in tw_chrdev_open() instead. -Adam
Re: [PATCH] scsi: 3w-xxxx: fix a missing-check bug
On Sat, May 5, 2018 at 10:48 PM, Wenwen Wang wrote: > In tw_chrdev_ioctl(), the length of the data buffer is firstly copied from > the userspace pointer 'argp' and saved to the kernel object > 'data_buffer_length'. Then a security check is performed on it to make sure > that the length is not more than 'TW_MAX_IOCTL_SECTORS * 512'. Otherwise, > an error code -EINVAL is returned. If the security check is passed, the > entire ioctl command is copied again from the 'argp' pointer and saved to > the kernel object 'tw_ioctl'. Then, various operations are performed on > 'tw_ioctl' according to the 'cmd'. Given that the 'argp' pointer resides in > userspace, a malicious userspace process can race to change the buffer > length between the two copies. This way, the user can bypass the security > check and inject invalid data buffer length. This can cause potential > security issues in the following execution. > > This patch checks the buffer length obtained in the second copy. An error > code -EINVAL will be returned if it is not same as the original one in the > first copy. > > Signed-off-by: Wenwen Wang > --- > drivers/scsi/3w-.c | 4 > 1 file changed, 4 insertions(+) > > diff --git a/drivers/scsi/3w-.c b/drivers/scsi/3w-.c > index 33261b6..ef79194 100644 > --- a/drivers/scsi/3w-.c > +++ b/drivers/scsi/3w-.c > @@ -919,6 +919,10 @@ static long tw_chrdev_ioctl(struct file *file, unsigned > int cmd, unsigned long a > /* Now copy down the entire ioctl */ > if (copy_from_user(tw_ioctl, argp, data_buffer_length + > sizeof(TW_New_Ioctl) - 1)) > goto out2; > + if (tw_ioctl->data_buffer_length != data_buffer_length) { > + retval = -EINVAL; > + goto out2; > + } > > passthru = (TW_Passthru *)_ioctl->firmware_command; > > -- > 2.7.4 > I would drop this patch and check for !capable(CAP_SYS_ADMIN) in tw_chrdev_open() instead. -Adam
Re: [PATCH] scsi: 3w-9xxx: fix a missing-check bug
On Sat, May 5, 2018 at 8:43 PM, Wenwen Wangwrote: > In twa_chrdev_ioctl(), the ioctl driver command is firstly copied from the > userspace pointer 'argp' and saved to the kernel object 'driver_command'. > Then a security check is performed on the data buffer size indicated by > 'driver_command', which is 'driver_command.buffer_length'. If the security > check is passed, the entire ioctl command is copied again from the 'argp' > pointer and saved to the kernel object 'tw_ioctl'. Then, various operations > are performed on 'tw_ioctl' according to the 'cmd'. Given that the 'argp' > pointer resides in userspace, a malicious userspace process can race to > change the buffer size between the two copies. This way, the user can > bypass the security check and inject invalid data buffer size. This can > cause potential security issues in the following execution. > > This patch checks the buffer size obtained in the second copy. An error > code -EINVAL will be returned if it is not same as the original one in the > first copy. > > Signed-off-by: Wenwen Wang > --- > drivers/scsi/3w-9xxx.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c > index b42c9c4..8bc43db 100644 > --- a/drivers/scsi/3w-9xxx.c > +++ b/drivers/scsi/3w-9xxx.c > @@ -684,6 +684,12 @@ static long twa_chrdev_ioctl(struct file *file, unsigned > int cmd, unsigned long > if (copy_from_user(tw_ioctl, argp, driver_command.buffer_length + > sizeof(TW_Ioctl_Buf_Apache) - 1)) > goto out3; > > + if (tw_ioctl->driver_command.buffer_length > +!= driver_command.buffer_length) { > + retval = TW_IOCTL_ERROR_OS_EINVAL; > + goto out3; > + } > + > /* See which ioctl we are doing */ > switch (cmd) { > case TW_IOCTL_FIRMWARE_PASS_THROUGH: > -- > 2.7.4 > Drop this patch and create a new one that checks for: if !capable(CAP_SYS_ADMIN) in twa_chrdev_ioctl() (like 3w-sas.c does) and I'll ack it. -Adam
Re: [PATCH] scsi: 3w-9xxx: fix a missing-check bug
On Sat, May 5, 2018 at 8:43 PM, Wenwen Wang wrote: > In twa_chrdev_ioctl(), the ioctl driver command is firstly copied from the > userspace pointer 'argp' and saved to the kernel object 'driver_command'. > Then a security check is performed on the data buffer size indicated by > 'driver_command', which is 'driver_command.buffer_length'. If the security > check is passed, the entire ioctl command is copied again from the 'argp' > pointer and saved to the kernel object 'tw_ioctl'. Then, various operations > are performed on 'tw_ioctl' according to the 'cmd'. Given that the 'argp' > pointer resides in userspace, a malicious userspace process can race to > change the buffer size between the two copies. This way, the user can > bypass the security check and inject invalid data buffer size. This can > cause potential security issues in the following execution. > > This patch checks the buffer size obtained in the second copy. An error > code -EINVAL will be returned if it is not same as the original one in the > first copy. > > Signed-off-by: Wenwen Wang > --- > drivers/scsi/3w-9xxx.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c > index b42c9c4..8bc43db 100644 > --- a/drivers/scsi/3w-9xxx.c > +++ b/drivers/scsi/3w-9xxx.c > @@ -684,6 +684,12 @@ static long twa_chrdev_ioctl(struct file *file, unsigned > int cmd, unsigned long > if (copy_from_user(tw_ioctl, argp, driver_command.buffer_length + > sizeof(TW_Ioctl_Buf_Apache) - 1)) > goto out3; > > + if (tw_ioctl->driver_command.buffer_length > +!= driver_command.buffer_length) { > + retval = TW_IOCTL_ERROR_OS_EINVAL; > + goto out3; > + } > + > /* See which ioctl we are doing */ > switch (cmd) { > case TW_IOCTL_FIRMWARE_PASS_THROUGH: > -- > 2.7.4 > Drop this patch and create a new one that checks for: if !capable(CAP_SYS_ADMIN) in twa_chrdev_ioctl() (like 3w-sas.c does) and I'll ack it. -Adam
Re: [PATCH] scsi: 3ware: fix a missing-check bug
On Sat, May 5, 2018 at 10:50 PM, Wenwen Wangwrote: > In twl_chrdev_ioctl(), the ioctl driver command is firstly copied from the > userspace pointer 'argp' and saved to the kernel object 'driver_command'. > Then a security check is performed on the data buffer size indicated by > 'driver_command', which is 'driver_command.buffer_length'. If the security > check is passed, the entire ioctl command is copied again from the 'argp' > pointer and saved to the kernel object 'tw_ioctl'. Then, various operations > are performed on 'tw_ioctl' according to the 'cmd'. Given that the 'argp' > pointer resides in userspace, a malicious userspace process can race to > change the buffer size between the two copies. This way, the user can > bypass the security check and inject invalid data buffer size. This can > cause potential security issues in the following execution. > > This patch checks the buffer size obtained in the second copy. An error > code -EINVAL will be returned if it is not same as the original one in the > first copy. > > Signed-off-by: Wenwen Wang > --- > drivers/scsi/3w-sas.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/drivers/scsi/3w-sas.c b/drivers/scsi/3w-sas.c > index cf9f2a0..ea41969 100644 > --- a/drivers/scsi/3w-sas.c > +++ b/drivers/scsi/3w-sas.c > @@ -757,6 +757,11 @@ static long twl_chrdev_ioctl(struct file *file, unsigned > int cmd, unsigned long > /* Now copy down the entire ioctl */ > if (copy_from_user(tw_ioctl, argp, driver_command.buffer_length + > sizeof(TW_Ioctl_Buf_Apache) - 1)) > goto out3; > + if (tw_ioctl->driver_command.buffer_length != > + driver_command.buffer_length) { > + retval = -EINVAL; > + goto out3; > + } > > /* See which ioctl we are doing */ > switch (cmd) { > -- > 2.7.4 > 1. Returning -EINVAL after the copy_from_user() doesn't prevent any invalid copy down to kernel mode from happening. 2. twl_chrdev_open() checks for capable(CAP_SYS_ADMIN): https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/3w-sas.c#n834 I don't see the point in this patch. -Adam
Re: [PATCH] scsi: 3ware: fix a missing-check bug
On Sat, May 5, 2018 at 10:50 PM, Wenwen Wang wrote: > In twl_chrdev_ioctl(), the ioctl driver command is firstly copied from the > userspace pointer 'argp' and saved to the kernel object 'driver_command'. > Then a security check is performed on the data buffer size indicated by > 'driver_command', which is 'driver_command.buffer_length'. If the security > check is passed, the entire ioctl command is copied again from the 'argp' > pointer and saved to the kernel object 'tw_ioctl'. Then, various operations > are performed on 'tw_ioctl' according to the 'cmd'. Given that the 'argp' > pointer resides in userspace, a malicious userspace process can race to > change the buffer size between the two copies. This way, the user can > bypass the security check and inject invalid data buffer size. This can > cause potential security issues in the following execution. > > This patch checks the buffer size obtained in the second copy. An error > code -EINVAL will be returned if it is not same as the original one in the > first copy. > > Signed-off-by: Wenwen Wang > --- > drivers/scsi/3w-sas.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/drivers/scsi/3w-sas.c b/drivers/scsi/3w-sas.c > index cf9f2a0..ea41969 100644 > --- a/drivers/scsi/3w-sas.c > +++ b/drivers/scsi/3w-sas.c > @@ -757,6 +757,11 @@ static long twl_chrdev_ioctl(struct file *file, unsigned > int cmd, unsigned long > /* Now copy down the entire ioctl */ > if (copy_from_user(tw_ioctl, argp, driver_command.buffer_length + > sizeof(TW_Ioctl_Buf_Apache) - 1)) > goto out3; > + if (tw_ioctl->driver_command.buffer_length != > + driver_command.buffer_length) { > + retval = -EINVAL; > + goto out3; > + } > > /* See which ioctl we are doing */ > switch (cmd) { > -- > 2.7.4 > 1. Returning -EINVAL after the copy_from_user() doesn't prevent any invalid copy down to kernel mode from happening. 2. twl_chrdev_open() checks for capable(CAP_SYS_ADMIN): https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/scsi/3w-sas.c#n834 I don't see the point in this patch. -Adam
Re: [PATCH 2/3] scsi: 3ware: use 64-bit times for FW time sync
On Fri, Nov 10, 2017 at 7:58 AM, Arnd Bergmann <a...@arndb.de> wrote: > The calculation of the number of seconds since Sunday 00:00:00 overflows > in 2106, meaning that we instead will return the seconds since Wednesday > 06:28:16 afterwards. > > Using 64-bit time stamps avoids this slight inconsistency, and the > deprecated do_gettimeofday(), replacing it with the simpler > ktime_get_real_seconds(). > > Signed-off-by: Arnd Bergmann <a...@arndb.de> > --- > drivers/scsi/3w-9xxx.c | 8 +++- > drivers/scsi/3w-sas.c | 10 -- > 2 files changed, 7 insertions(+), 11 deletions(-) > > diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c > index cb9af3f7b653..b1c9bd9c1bfd 100644 > --- a/drivers/scsi/3w-9xxx.c > +++ b/drivers/scsi/3w-9xxx.c > @@ -472,11 +472,10 @@ static char *twa_aen_severity_lookup(unsigned char > severity_code) > static void twa_aen_sync_time(TW_Device_Extension *tw_dev, int request_id) > { > u32 schedulertime; > - struct timeval utc; > TW_Command_Full *full_command_packet; > TW_Command *command_packet; > TW_Param_Apache *param; > - u32 local_time; > + time64_t local_time; > > /* Fill out the command packet */ > full_command_packet = tw_dev->command_packet_virt[request_id]; > @@ -498,9 +497,8 @@ static void twa_aen_sync_time(TW_Device_Extension > *tw_dev, int request_id) > > /* Convert system time in UTC to local time seconds since last > Sunday 12:00AM */ > - do_gettimeofday(); > - local_time = (u32)(utc.tv_sec - (sys_tz.tz_minuteswest * 60)); > - schedulertime = local_time - (3 * 86400); > + local_time = (ktime_get_real_seconds() - (sys_tz.tz_minuteswest * > 60)); > + div_u64_rem(local_time - (3 * 86400), 604800, ); > schedulertime = cpu_to_le32(schedulertime % 604800); > > memcpy(param->data, , sizeof(u32)); > diff --git a/drivers/scsi/3w-sas.c b/drivers/scsi/3w-sas.c > index c283fdb3cb24..cf9f2a09b47d 100644 > --- a/drivers/scsi/3w-sas.c > +++ b/drivers/scsi/3w-sas.c > @@ -407,11 +407,10 @@ static int twl_aen_read_queue(TW_Device_Extension > *tw_dev, int request_id) > static void twl_aen_sync_time(TW_Device_Extension *tw_dev, int request_id) > { > u32 schedulertime; > - struct timeval utc; > TW_Command_Full *full_command_packet; > TW_Command *command_packet; > TW_Param_Apache *param; > - u32 local_time; > + time64_t local_time; > > /* Fill out the command packet */ > full_command_packet = tw_dev->command_packet_virt[request_id]; > @@ -433,10 +432,9 @@ static void twl_aen_sync_time(TW_Device_Extension > *tw_dev, int request_id) > > /* Convert system time in UTC to local time seconds since last > Sunday 12:00AM */ > - do_gettimeofday(); > - local_time = (u32)(utc.tv_sec - (sys_tz.tz_minuteswest * 60)); > - schedulertime = local_time - (3 * 86400); > - schedulertime = cpu_to_le32(schedulertime % 604800); > + local_time = (ktime_get_real_seconds() - (sys_tz.tz_minuteswest * > 60)); > + div_u64_rem(local_time - (3 * 86400), 604800, ); > + schedulertime = cpu_to_le32(schedulertime); > > memcpy(param->data, , sizeof(u32)); > > -- > 2.9.0 > Acked-by: Adam Radford <aradf...@gmail.com>
Re: [PATCH 2/3] scsi: 3ware: use 64-bit times for FW time sync
On Fri, Nov 10, 2017 at 7:58 AM, Arnd Bergmann wrote: > The calculation of the number of seconds since Sunday 00:00:00 overflows > in 2106, meaning that we instead will return the seconds since Wednesday > 06:28:16 afterwards. > > Using 64-bit time stamps avoids this slight inconsistency, and the > deprecated do_gettimeofday(), replacing it with the simpler > ktime_get_real_seconds(). > > Signed-off-by: Arnd Bergmann > --- > drivers/scsi/3w-9xxx.c | 8 +++- > drivers/scsi/3w-sas.c | 10 -- > 2 files changed, 7 insertions(+), 11 deletions(-) > > diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c > index cb9af3f7b653..b1c9bd9c1bfd 100644 > --- a/drivers/scsi/3w-9xxx.c > +++ b/drivers/scsi/3w-9xxx.c > @@ -472,11 +472,10 @@ static char *twa_aen_severity_lookup(unsigned char > severity_code) > static void twa_aen_sync_time(TW_Device_Extension *tw_dev, int request_id) > { > u32 schedulertime; > - struct timeval utc; > TW_Command_Full *full_command_packet; > TW_Command *command_packet; > TW_Param_Apache *param; > - u32 local_time; > + time64_t local_time; > > /* Fill out the command packet */ > full_command_packet = tw_dev->command_packet_virt[request_id]; > @@ -498,9 +497,8 @@ static void twa_aen_sync_time(TW_Device_Extension > *tw_dev, int request_id) > > /* Convert system time in UTC to local time seconds since last > Sunday 12:00AM */ > - do_gettimeofday(); > - local_time = (u32)(utc.tv_sec - (sys_tz.tz_minuteswest * 60)); > - schedulertime = local_time - (3 * 86400); > + local_time = (ktime_get_real_seconds() - (sys_tz.tz_minuteswest * > 60)); > + div_u64_rem(local_time - (3 * 86400), 604800, ); > schedulertime = cpu_to_le32(schedulertime % 604800); > > memcpy(param->data, , sizeof(u32)); > diff --git a/drivers/scsi/3w-sas.c b/drivers/scsi/3w-sas.c > index c283fdb3cb24..cf9f2a09b47d 100644 > --- a/drivers/scsi/3w-sas.c > +++ b/drivers/scsi/3w-sas.c > @@ -407,11 +407,10 @@ static int twl_aen_read_queue(TW_Device_Extension > *tw_dev, int request_id) > static void twl_aen_sync_time(TW_Device_Extension *tw_dev, int request_id) > { > u32 schedulertime; > - struct timeval utc; > TW_Command_Full *full_command_packet; > TW_Command *command_packet; > TW_Param_Apache *param; > - u32 local_time; > + time64_t local_time; > > /* Fill out the command packet */ > full_command_packet = tw_dev->command_packet_virt[request_id]; > @@ -433,10 +432,9 @@ static void twl_aen_sync_time(TW_Device_Extension > *tw_dev, int request_id) > > /* Convert system time in UTC to local time seconds since last > Sunday 12:00AM */ > - do_gettimeofday(); > - local_time = (u32)(utc.tv_sec - (sys_tz.tz_minuteswest * 60)); > - schedulertime = local_time - (3 * 86400); > - schedulertime = cpu_to_le32(schedulertime % 604800); > + local_time = (ktime_get_real_seconds() - (sys_tz.tz_minuteswest * > 60)); > + div_u64_rem(local_time - (3 * 86400), 604800, ); > + schedulertime = cpu_to_le32(schedulertime); > > memcpy(param->data, , sizeof(u32)); > > -- > 2.9.0 > Acked-by: Adam Radford
Re: [PATCH 1/3] scsi: 3ware: fix 32-bit time calculations
On Fri, Nov 10, 2017 at 7:58 AM, Arnd Bergmann <a...@arndb.de> wrote: > twl_aen_queue_event/twa_aen_queue_event, we use do_gettimeofday() > to read the lower 32 bits of the current time in seconds, to pass > them to the TW_IOCTL_GET_NEXT_EVENT ioctl or the 3ware_aen_read > sysfs file. > > This will overflow on all architectures in year 2106, there is > not much we can do about that without breaking the ABI. User > space has 90 years to learn to deal with it, so it's probably ok. > > I'm changing it to use ktime_get_real_seconds() with a comment > to document what happens when. > > Signed-off-by: Arnd Bergmann <a...@arndb.de> > --- > drivers/scsi/3w-9xxx.c | 5 ++--- > drivers/scsi/3w-sas.c | 5 ++--- > 2 files changed, 4 insertions(+), 6 deletions(-) > > diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c > index 00e7968a1d70..cb9af3f7b653 100644 > --- a/drivers/scsi/3w-9xxx.c > +++ b/drivers/scsi/3w-9xxx.c > @@ -369,7 +369,6 @@ static int twa_aen_drain_queue(TW_Device_Extension > *tw_dev, int no_check_reset) > static void twa_aen_queue_event(TW_Device_Extension *tw_dev, > TW_Command_Apache_Header *header) > { > u32 local_time; > - struct timeval time; > TW_Event *event; > unsigned short aen; > char host[16]; > @@ -392,8 +391,8 @@ static void twa_aen_queue_event(TW_Device_Extension > *tw_dev, TW_Command_Apache_H > memset(event, 0, sizeof(TW_Event)); > > event->severity = TW_SEV_OUT(header->status_block.severity__reserved); > - do_gettimeofday(); > - local_time = (u32)(time.tv_sec - (sys_tz.tz_minuteswest * 60)); > + /* event->time_stamp_sec overflows in y2106 */ > + local_time = (u32)(ktime_get_real_seconds() - (sys_tz.tz_minuteswest > * 60)); > event->time_stamp_sec = local_time; > event->aen_code = aen; > event->retrieved = TW_AEN_NOT_RETRIEVED; > diff --git a/drivers/scsi/3w-sas.c b/drivers/scsi/3w-sas.c > index b150e131b2e7..c283fdb3cb24 100644 > --- a/drivers/scsi/3w-sas.c > +++ b/drivers/scsi/3w-sas.c > @@ -221,7 +221,6 @@ static char *twl_aen_severity_lookup(unsigned char > severity_code) > static void twl_aen_queue_event(TW_Device_Extension *tw_dev, > TW_Command_Apache_Header *header) > { > u32 local_time; > - struct timeval time; > TW_Event *event; > unsigned short aen; > char host[16]; > @@ -240,8 +239,8 @@ static void twl_aen_queue_event(TW_Device_Extension > *tw_dev, TW_Command_Apache_H > memset(event, 0, sizeof(TW_Event)); > > event->severity = TW_SEV_OUT(header->status_block.severity__reserved); > - do_gettimeofday(); > - local_time = (u32)(time.tv_sec - (sys_tz.tz_minuteswest * 60)); > + /* event->time_stamp_sec overflows in y2106 */ > + local_time = (u32)(ktime_get_real_seconds() - (sys_tz.tz_minuteswest > * 60)); > event->time_stamp_sec = local_time; > event->aen_code = aen; > event->retrieved = TW_AEN_NOT_RETRIEVED; > -- > 2.9.0 > Acked-by: Adam Radford <aradf...@gmail.com>
Re: [PATCH 1/3] scsi: 3ware: fix 32-bit time calculations
On Fri, Nov 10, 2017 at 7:58 AM, Arnd Bergmann wrote: > twl_aen_queue_event/twa_aen_queue_event, we use do_gettimeofday() > to read the lower 32 bits of the current time in seconds, to pass > them to the TW_IOCTL_GET_NEXT_EVENT ioctl or the 3ware_aen_read > sysfs file. > > This will overflow on all architectures in year 2106, there is > not much we can do about that without breaking the ABI. User > space has 90 years to learn to deal with it, so it's probably ok. > > I'm changing it to use ktime_get_real_seconds() with a comment > to document what happens when. > > Signed-off-by: Arnd Bergmann > --- > drivers/scsi/3w-9xxx.c | 5 ++--- > drivers/scsi/3w-sas.c | 5 ++--- > 2 files changed, 4 insertions(+), 6 deletions(-) > > diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c > index 00e7968a1d70..cb9af3f7b653 100644 > --- a/drivers/scsi/3w-9xxx.c > +++ b/drivers/scsi/3w-9xxx.c > @@ -369,7 +369,6 @@ static int twa_aen_drain_queue(TW_Device_Extension > *tw_dev, int no_check_reset) > static void twa_aen_queue_event(TW_Device_Extension *tw_dev, > TW_Command_Apache_Header *header) > { > u32 local_time; > - struct timeval time; > TW_Event *event; > unsigned short aen; > char host[16]; > @@ -392,8 +391,8 @@ static void twa_aen_queue_event(TW_Device_Extension > *tw_dev, TW_Command_Apache_H > memset(event, 0, sizeof(TW_Event)); > > event->severity = TW_SEV_OUT(header->status_block.severity__reserved); > - do_gettimeofday(); > - local_time = (u32)(time.tv_sec - (sys_tz.tz_minuteswest * 60)); > + /* event->time_stamp_sec overflows in y2106 */ > + local_time = (u32)(ktime_get_real_seconds() - (sys_tz.tz_minuteswest > * 60)); > event->time_stamp_sec = local_time; > event->aen_code = aen; > event->retrieved = TW_AEN_NOT_RETRIEVED; > diff --git a/drivers/scsi/3w-sas.c b/drivers/scsi/3w-sas.c > index b150e131b2e7..c283fdb3cb24 100644 > --- a/drivers/scsi/3w-sas.c > +++ b/drivers/scsi/3w-sas.c > @@ -221,7 +221,6 @@ static char *twl_aen_severity_lookup(unsigned char > severity_code) > static void twl_aen_queue_event(TW_Device_Extension *tw_dev, > TW_Command_Apache_Header *header) > { > u32 local_time; > - struct timeval time; > TW_Event *event; > unsigned short aen; > char host[16]; > @@ -240,8 +239,8 @@ static void twl_aen_queue_event(TW_Device_Extension > *tw_dev, TW_Command_Apache_H > memset(event, 0, sizeof(TW_Event)); > > event->severity = TW_SEV_OUT(header->status_block.severity__reserved); > - do_gettimeofday(); > - local_time = (u32)(time.tv_sec - (sys_tz.tz_minuteswest * 60)); > + /* event->time_stamp_sec overflows in y2106 */ > + local_time = (u32)(ktime_get_real_seconds() - (sys_tz.tz_minuteswest > * 60)); > event->time_stamp_sec = local_time; > event->aen_code = aen; > event->retrieved = TW_AEN_NOT_RETRIEVED; > -- > 2.9.0 > Acked-by: Adam Radford
Re: [PATCH 3/3] scsi: 3w-9xxx: rework lock timeouts
On Fri, Nov 10, 2017 at 7:58 AM, Arnd Bergmann <a...@arndb.de> wrote: > The TW_IOCTL_GET_LOCK ioctl uses do_gettimeofday() to check whether > a lock has expired. This can misbehave due to a concurrent > settimeofday() call, as it is based on 'real' time, and it > will overflow in y2038 on 32-bit architectures, producing > unexpected results when used across the overflow time. > > This changes it to using monotonic time, using ktime_get() > to simplify the code. > > Signed-off-by: Arnd Bergmann <a...@arndb.de> > --- > drivers/scsi/3w-9xxx.c | 13 ++--- > drivers/scsi/3w-9xxx.h | 2 +- > 2 files changed, 7 insertions(+), 8 deletions(-) > > diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c > index b1c9bd9c1bfd..b42c9c479d4b 100644 > --- a/drivers/scsi/3w-9xxx.c > +++ b/drivers/scsi/3w-9xxx.c > @@ -645,8 +645,7 @@ static long twa_chrdev_ioctl(struct file *file, unsigned > int cmd, unsigned long > TW_Command_Full *full_command_packet; > TW_Compatibility_Info *tw_compat_info; > TW_Event *event; > - struct timeval current_time; > - u32 current_time_ms; > + ktime_t current_time; > TW_Device_Extension *tw_dev = > twa_device_extension_list[iminor(inode)]; > int retval = TW_IOCTL_ERROR_OS_EFAULT; > void __user *argp = (void __user *)arg; > @@ -837,17 +836,17 @@ static long twa_chrdev_ioctl(struct file *file, > unsigned int cmd, unsigned long > break; > case TW_IOCTL_GET_LOCK: > tw_lock = (TW_Lock *)tw_ioctl->data_buffer; > - do_gettimeofday(_time); > - current_time_ms = (current_time.tv_sec * 1000) + > (current_time.tv_usec / 1000); > + current_time = ktime_get(); > > - if ((tw_lock->force_flag == 1) || (tw_dev->ioctl_sem_lock == > 0) || (current_time_ms >= tw_dev->ioctl_msec)) { > + if ((tw_lock->force_flag == 1) || (tw_dev->ioctl_sem_lock == > 0) || > + ktime_after(current_time, tw_dev->ioctl_time)) { > tw_dev->ioctl_sem_lock = 1; > - tw_dev->ioctl_msec = current_time_ms + > tw_lock->timeout_msec; > + tw_dev->ioctl_time = ktime_add_ms(current_time, > tw_lock->timeout_msec); > tw_ioctl->driver_command.status = 0; > tw_lock->time_remaining_msec = tw_lock->timeout_msec; > } else { > tw_ioctl->driver_command.status = > TW_IOCTL_ERROR_STATUS_LOCKED; > - tw_lock->time_remaining_msec = tw_dev->ioctl_msec - > current_time_ms; > + tw_lock->time_remaining_msec = > ktime_ms_delta(tw_dev->ioctl_time, current_time); > } > break; > case TW_IOCTL_RELEASE_LOCK: > diff --git a/drivers/scsi/3w-9xxx.h b/drivers/scsi/3w-9xxx.h > index b6c208cc474f..d88cd3499bd5 100644 > --- a/drivers/scsi/3w-9xxx.h > +++ b/drivers/scsi/3w-9xxx.h > @@ -666,7 +666,7 @@ typedef struct TAG_TW_Device_Extension { > unsigned char event_queue_wrapped; > unsigned interror_sequence_id; > int ioctl_sem_lock; > - u32 ioctl_msec; > + ktime_t ioctl_time; > int chrdev_request_id; > wait_queue_head_t ioctl_wqueue; > struct mutexioctl_lock; > -- > 2.9.0 > Looks good... Thanks for this fix! Acked-by: Adam Radford <aradf...@gmail.com>
Re: [PATCH 3/3] scsi: 3w-9xxx: rework lock timeouts
On Fri, Nov 10, 2017 at 7:58 AM, Arnd Bergmann wrote: > The TW_IOCTL_GET_LOCK ioctl uses do_gettimeofday() to check whether > a lock has expired. This can misbehave due to a concurrent > settimeofday() call, as it is based on 'real' time, and it > will overflow in y2038 on 32-bit architectures, producing > unexpected results when used across the overflow time. > > This changes it to using monotonic time, using ktime_get() > to simplify the code. > > Signed-off-by: Arnd Bergmann > --- > drivers/scsi/3w-9xxx.c | 13 ++--- > drivers/scsi/3w-9xxx.h | 2 +- > 2 files changed, 7 insertions(+), 8 deletions(-) > > diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c > index b1c9bd9c1bfd..b42c9c479d4b 100644 > --- a/drivers/scsi/3w-9xxx.c > +++ b/drivers/scsi/3w-9xxx.c > @@ -645,8 +645,7 @@ static long twa_chrdev_ioctl(struct file *file, unsigned > int cmd, unsigned long > TW_Command_Full *full_command_packet; > TW_Compatibility_Info *tw_compat_info; > TW_Event *event; > - struct timeval current_time; > - u32 current_time_ms; > + ktime_t current_time; > TW_Device_Extension *tw_dev = > twa_device_extension_list[iminor(inode)]; > int retval = TW_IOCTL_ERROR_OS_EFAULT; > void __user *argp = (void __user *)arg; > @@ -837,17 +836,17 @@ static long twa_chrdev_ioctl(struct file *file, > unsigned int cmd, unsigned long > break; > case TW_IOCTL_GET_LOCK: > tw_lock = (TW_Lock *)tw_ioctl->data_buffer; > - do_gettimeofday(_time); > - current_time_ms = (current_time.tv_sec * 1000) + > (current_time.tv_usec / 1000); > + current_time = ktime_get(); > > - if ((tw_lock->force_flag == 1) || (tw_dev->ioctl_sem_lock == > 0) || (current_time_ms >= tw_dev->ioctl_msec)) { > + if ((tw_lock->force_flag == 1) || (tw_dev->ioctl_sem_lock == > 0) || > + ktime_after(current_time, tw_dev->ioctl_time)) { > tw_dev->ioctl_sem_lock = 1; > - tw_dev->ioctl_msec = current_time_ms + > tw_lock->timeout_msec; > + tw_dev->ioctl_time = ktime_add_ms(current_time, > tw_lock->timeout_msec); > tw_ioctl->driver_command.status = 0; > tw_lock->time_remaining_msec = tw_lock->timeout_msec; > } else { > tw_ioctl->driver_command.status = > TW_IOCTL_ERROR_STATUS_LOCKED; > - tw_lock->time_remaining_msec = tw_dev->ioctl_msec - > current_time_ms; > + tw_lock->time_remaining_msec = > ktime_ms_delta(tw_dev->ioctl_time, current_time); > } > break; > case TW_IOCTL_RELEASE_LOCK: > diff --git a/drivers/scsi/3w-9xxx.h b/drivers/scsi/3w-9xxx.h > index b6c208cc474f..d88cd3499bd5 100644 > --- a/drivers/scsi/3w-9xxx.h > +++ b/drivers/scsi/3w-9xxx.h > @@ -666,7 +666,7 @@ typedef struct TAG_TW_Device_Extension { > unsigned char event_queue_wrapped; > unsigned interror_sequence_id; > int ioctl_sem_lock; > - u32 ioctl_msec; > + ktime_t ioctl_time; > int chrdev_request_id; > wait_queue_head_t ioctl_wqueue; > struct mutexioctl_lock; > -- > 2.9.0 > Looks good... Thanks for this fix! Acked-by: Adam Radford
Re: 4.6.4, 3w_sas: timeout too small?
>> Oct 16 06:41:33 nasl003b kernel: [4903207.577484] 3w-sas: scsi0: ERROR: >> (0x06:0x000D): Microcontroller Error: clearing. Microcontroller Error = FW crashed and will be reset on the next I/O timeout or ioctl() timeout, whichever happens first. It could be the iSCSI target code or initiator sent a SCSI CDB not supported by 3ware 9750 firmware (or malformed CDB), or your 3ware card is going bad, most likely the later. >> Oct 16, 2016 06:45:03AM (0x04:0x005E): Cache synchronization completed: >> unit=0 Cache sync happens after an online controller reset. Previously acknowledged write back cache entries are flushed to disk at this time. >> The 3ware web interface says the controller was busy to write the cache to >> disk It is not the case that the controller was performing some long cache flush interval, > 60 seconds, such that you would need to adjust any timeout values in the controller reset sequence code, i.e. twl_reset_sequence(). -Adam On Mon, Oct 17, 2016 at 6:09 AM, Harald Dunkelwrote: > Hi folks, > > Question about the reset timeout of the 3w_sas: > > For an unknown reason my 3ware 9750 RAID controller became > unresponsive. kernel.log: > > Oct 16 06:30:42 nasl003b kernel: [4902556.775272] 3w-sas: scsi0: AEN: INFO > (0x04:0x002B): Verify completed:unit=1. > Oct 16 06:38:29 nasl003b kernel: [4903023.388357] 3w-sas: scsi0: AEN: INFO > (0x04:0x002B): Verify completed:unit=2. > Oct 16 06:41:33 nasl003b kernel: [4903207.577484] 3w-sas: scsi0: ERROR: > (0x06:0x000D): Microcontroller Error: clearing. > Oct 16 06:42:11 nasl003b kernel: [4903245.701070] TMR Opcode > TARGET_WARM_RESET authorization failed for Initiator Node: > iqn.2014-01.com.example.ac.srva001 > Oct 16 06:42:11 nasl003b kernel: [4903245.706778] TMR Opcode > TARGET_WARM_RESET authorization failed for Initiator Node: > iqn.2014-01.com.example.ac.srva001 > Oct 16 06:42:15 nasl003b kernel: [4903249.298778] TMR Opcode > TARGET_WARM_RESET authorization failed for Initiator Node: > iqn.2014-01.com.example.ac.srva002 > Oct 16 06:42:15 nasl003b kernel: [4903249.304369] TMR Opcode > TARGET_WARM_RESET authorization failed for Initiator Node: > iqn.2014-01.com.example.ac.srva002 > Oct 16 06:42:15 nasl003b kernel: [4903249.318734] TMR Opcode > TARGET_WARM_RESET authorization failed for Initiator Node: > iqn.2014-01.com.example.ac.srva003 > Oct 16 06:42:15 nasl003b kernel: [4903249.324122] TMR Opcode > TARGET_WARM_RESET authorization failed for Initiator Node: > iqn.2014-01.com.example.ac.srva003 > Oct 16 06:42:41 nasl003b kernel: [4903275.739327] 3w-sas: scsi0: WARNING: > (0x06:0x0006): Character ioctl (0x108) timed out, resetting card. > Oct 16 06:42:41 nasl003b kernel: [4903275.739402] sd 0:0:0:0: WARNING: > (0x06:0x002C): Command (0x2a) timed out, resetting card. > Oct 16 06:43:36 nasl003b kernel: [4903330.757153] iSCSI Login timeout on > Network Portal 172.19.96.217:3260 > Oct 16 06:43:36 nasl003b kernel: [4903330.762124] iSCSI Login negotiation > failed. > Oct 16 06:43:42 nasl003b kernel: [4903336.720892] 3w-sas: scsi0: ERROR: > (0x06:0x0011): Controller not ready during reset sequence. > Oct 16 06:43:51 nasl003b kernel: [4903345.796529] iSCSI Login timeout on > Network Portal 172.19.96.217:3260 > Oct 16 06:43:51 nasl003b kernel: [4903345.801462] iSCSI Login negotiation > failed. > Oct 16 06:44:06 nasl003b kernel: [4903360.835920] iSCSI Login timeout on > Network Portal 172.19.96.217:3260 > Oct 16 06:44:06 nasl003b kernel: [4903360.840843] iSCSI Login negotiation > failed. > Oct 16 06:44:13 nasl003b kernel: [4903367.919662] INFO: task jbd2/sda1-8:235 > blocked for more than 120 seconds. > Oct 16 06:44:13 nasl003b kernel: [4903367.924633] Tainted: G > E 4.6.0-0.bpo.1-amd64 #1 > Oct 16 06:44:13 nasl003b kernel: [4903367.929624] "echo 0 > > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > Oct 16 06:44:13 nasl003b kernel: [4903367.934751] jbd2/sda1-8 D > 88042d896000 0 235 2 0x > Oct 16 06:44:13 nasl003b kernel: [4903367.934756] 880424c2c440 > 88042b43a140 8804249e 8804249dfbc0 > Oct 16 06:44:13 nasl003b kernel: [4903367.934760] 7fff > 815c50a0 8804249dfc40 8804000daf00 > Oct 16 06:44:13 nasl003b kernel: [4903367.934763] 815c4821 > 815c785f 7fff > Oct 16 06:44:13 nasl003b kernel: [4903367.934766] Call Trace: > Oct 16 06:44:13 nasl003b kernel: [4903367.934783] [] ? > bit_wait_timeout+0xa0/0xa0 > Oct 16 06:44:13 nasl003b kernel: [4903367.934786] [] ? > schedule+0x31/0x80 > Oct 16 06:44:13 nasl003b kernel: [4903367.934789] [] ? > schedule_timeout+0x22f/0x2c0 > Oct 16 06:44:13 nasl003b kernel: [4903367.934794] [] ? > blk_peek_request+0x4d/0x280 > Oct 16 06:44:13 nasl003b kernel: [4903367.934797] [] ? > bit_wait_timeout+0xa0/0xa0 > : > : > > The interesting point is that apparently the reset didn't succeed. >
Re: 4.6.4, 3w_sas: timeout too small?
>> Oct 16 06:41:33 nasl003b kernel: [4903207.577484] 3w-sas: scsi0: ERROR: >> (0x06:0x000D): Microcontroller Error: clearing. Microcontroller Error = FW crashed and will be reset on the next I/O timeout or ioctl() timeout, whichever happens first. It could be the iSCSI target code or initiator sent a SCSI CDB not supported by 3ware 9750 firmware (or malformed CDB), or your 3ware card is going bad, most likely the later. >> Oct 16, 2016 06:45:03AM (0x04:0x005E): Cache synchronization completed: >> unit=0 Cache sync happens after an online controller reset. Previously acknowledged write back cache entries are flushed to disk at this time. >> The 3ware web interface says the controller was busy to write the cache to >> disk It is not the case that the controller was performing some long cache flush interval, > 60 seconds, such that you would need to adjust any timeout values in the controller reset sequence code, i.e. twl_reset_sequence(). -Adam On Mon, Oct 17, 2016 at 6:09 AM, Harald Dunkel wrote: > Hi folks, > > Question about the reset timeout of the 3w_sas: > > For an unknown reason my 3ware 9750 RAID controller became > unresponsive. kernel.log: > > Oct 16 06:30:42 nasl003b kernel: [4902556.775272] 3w-sas: scsi0: AEN: INFO > (0x04:0x002B): Verify completed:unit=1. > Oct 16 06:38:29 nasl003b kernel: [4903023.388357] 3w-sas: scsi0: AEN: INFO > (0x04:0x002B): Verify completed:unit=2. > Oct 16 06:41:33 nasl003b kernel: [4903207.577484] 3w-sas: scsi0: ERROR: > (0x06:0x000D): Microcontroller Error: clearing. > Oct 16 06:42:11 nasl003b kernel: [4903245.701070] TMR Opcode > TARGET_WARM_RESET authorization failed for Initiator Node: > iqn.2014-01.com.example.ac.srva001 > Oct 16 06:42:11 nasl003b kernel: [4903245.706778] TMR Opcode > TARGET_WARM_RESET authorization failed for Initiator Node: > iqn.2014-01.com.example.ac.srva001 > Oct 16 06:42:15 nasl003b kernel: [4903249.298778] TMR Opcode > TARGET_WARM_RESET authorization failed for Initiator Node: > iqn.2014-01.com.example.ac.srva002 > Oct 16 06:42:15 nasl003b kernel: [4903249.304369] TMR Opcode > TARGET_WARM_RESET authorization failed for Initiator Node: > iqn.2014-01.com.example.ac.srva002 > Oct 16 06:42:15 nasl003b kernel: [4903249.318734] TMR Opcode > TARGET_WARM_RESET authorization failed for Initiator Node: > iqn.2014-01.com.example.ac.srva003 > Oct 16 06:42:15 nasl003b kernel: [4903249.324122] TMR Opcode > TARGET_WARM_RESET authorization failed for Initiator Node: > iqn.2014-01.com.example.ac.srva003 > Oct 16 06:42:41 nasl003b kernel: [4903275.739327] 3w-sas: scsi0: WARNING: > (0x06:0x0006): Character ioctl (0x108) timed out, resetting card. > Oct 16 06:42:41 nasl003b kernel: [4903275.739402] sd 0:0:0:0: WARNING: > (0x06:0x002C): Command (0x2a) timed out, resetting card. > Oct 16 06:43:36 nasl003b kernel: [4903330.757153] iSCSI Login timeout on > Network Portal 172.19.96.217:3260 > Oct 16 06:43:36 nasl003b kernel: [4903330.762124] iSCSI Login negotiation > failed. > Oct 16 06:43:42 nasl003b kernel: [4903336.720892] 3w-sas: scsi0: ERROR: > (0x06:0x0011): Controller not ready during reset sequence. > Oct 16 06:43:51 nasl003b kernel: [4903345.796529] iSCSI Login timeout on > Network Portal 172.19.96.217:3260 > Oct 16 06:43:51 nasl003b kernel: [4903345.801462] iSCSI Login negotiation > failed. > Oct 16 06:44:06 nasl003b kernel: [4903360.835920] iSCSI Login timeout on > Network Portal 172.19.96.217:3260 > Oct 16 06:44:06 nasl003b kernel: [4903360.840843] iSCSI Login negotiation > failed. > Oct 16 06:44:13 nasl003b kernel: [4903367.919662] INFO: task jbd2/sda1-8:235 > blocked for more than 120 seconds. > Oct 16 06:44:13 nasl003b kernel: [4903367.924633] Tainted: G > E 4.6.0-0.bpo.1-amd64 #1 > Oct 16 06:44:13 nasl003b kernel: [4903367.929624] "echo 0 > > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > Oct 16 06:44:13 nasl003b kernel: [4903367.934751] jbd2/sda1-8 D > 88042d896000 0 235 2 0x > Oct 16 06:44:13 nasl003b kernel: [4903367.934756] 880424c2c440 > 88042b43a140 8804249e 8804249dfbc0 > Oct 16 06:44:13 nasl003b kernel: [4903367.934760] 7fff > 815c50a0 8804249dfc40 8804000daf00 > Oct 16 06:44:13 nasl003b kernel: [4903367.934763] 815c4821 > 815c785f 7fff > Oct 16 06:44:13 nasl003b kernel: [4903367.934766] Call Trace: > Oct 16 06:44:13 nasl003b kernel: [4903367.934783] [] ? > bit_wait_timeout+0xa0/0xa0 > Oct 16 06:44:13 nasl003b kernel: [4903367.934786] [] ? > schedule+0x31/0x80 > Oct 16 06:44:13 nasl003b kernel: [4903367.934789] [] ? > schedule_timeout+0x22f/0x2c0 > Oct 16 06:44:13 nasl003b kernel: [4903367.934794] [] ? > blk_peek_request+0x4d/0x280 > Oct 16 06:44:13 nasl003b kernel: [4903367.934797] [] ? > bit_wait_timeout+0xa0/0xa0 > : > : > > The interesting point is that apparently the reset didn't succeed. > "Controller not ready during reset
Re: [PATCH] scsi: 3w-9xxx.c: Cleaning up missing null-terminate in conjunction with strncpy
On Mon, Dec 22, 2014 at 2:52 PM, Rickard Strandqvist wrote: > 2014-08-01 0:19 GMT+02:00 adam radford : >> On Sun, Jul 27, 2014 at 8:11 AM, Rickard Strandqvist >> wrote: >>> Replacing strncpy with strlcpy to avoid strings that lacks null terminate. >>> And use the sizeof on the to string rather than strlen on the from string. >>> >>> Signed-off-by: Rickard Strandqvist >>> --- >>> drivers/scsi/3w-9xxx.c |3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c >>> index 0a73253..f4d2331 100644 >>> --- a/drivers/scsi/3w-9xxx.c >>> +++ b/drivers/scsi/3w-9xxx.c >>> @@ -621,7 +621,8 @@ static int twa_check_srl(TW_Device_Extension *tw_dev, >>> int *flashed) >>> } >>> >>> /* Load rest of compatibility struct */ >>> - strncpy(tw_dev->tw_compat_info.driver_version, TW_DRIVER_VERSION, >>> strlen(TW_DRIVER_VERSION)); >>> + strlcpy(tw_dev->tw_compat_info.driver_version, TW_DRIVER_VERSION, >>> + sizeof(tw_dev->tw_compat_info.driver_version)); >>> tw_dev->tw_compat_info.driver_srl_high = TW_CURRENT_DRIVER_SRL; >>> tw_dev->tw_compat_info.driver_branch_high = >>> TW_CURRENT_DRIVER_BRANCH; >>> tw_dev->tw_compat_info.driver_build_high = TW_CURRENT_DRIVER_BUILD; >>> -- >>> 1.7.10.4 >>> >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in >>> the body of a message to majord...@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> Please read the FAQ at http://www.tux.org/lkml/ >> >> Rickard, >> >> This patch looks fine. >> >> Acked-by: Adam Radford >> >> -Adam > > > Hi Adam! > > What happened to this patch? It just didn't get picked up into scsi.git/for-next for some reason. All I can do is Ack it, which I already did :) -Adam -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] scsi: 3w-9xxx.c: Cleaning up missing null-terminate in conjunction with strncpy
On Mon, Dec 22, 2014 at 2:52 PM, Rickard Strandqvist rickard_strandqv...@spectrumdigital.se wrote: 2014-08-01 0:19 GMT+02:00 adam radford aradf...@gmail.com: On Sun, Jul 27, 2014 at 8:11 AM, Rickard Strandqvist rickard_strandqv...@spectrumdigital.se wrote: Replacing strncpy with strlcpy to avoid strings that lacks null terminate. And use the sizeof on the to string rather than strlen on the from string. Signed-off-by: Rickard Strandqvist rickard_strandqv...@spectrumdigital.se --- drivers/scsi/3w-9xxx.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c index 0a73253..f4d2331 100644 --- a/drivers/scsi/3w-9xxx.c +++ b/drivers/scsi/3w-9xxx.c @@ -621,7 +621,8 @@ static int twa_check_srl(TW_Device_Extension *tw_dev, int *flashed) } /* Load rest of compatibility struct */ - strncpy(tw_dev-tw_compat_info.driver_version, TW_DRIVER_VERSION, strlen(TW_DRIVER_VERSION)); + strlcpy(tw_dev-tw_compat_info.driver_version, TW_DRIVER_VERSION, + sizeof(tw_dev-tw_compat_info.driver_version)); tw_dev-tw_compat_info.driver_srl_high = TW_CURRENT_DRIVER_SRL; tw_dev-tw_compat_info.driver_branch_high = TW_CURRENT_DRIVER_BRANCH; tw_dev-tw_compat_info.driver_build_high = TW_CURRENT_DRIVER_BUILD; -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ Rickard, This patch looks fine. Acked-by: Adam Radford aradf...@gmail.com -Adam Hi Adam! What happened to this patch? It just didn't get picked up into scsi.git/for-next for some reason. All I can do is Ack it, which I already did :) -Adam -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] scsi: 3w-9xxx.c: Cleaning up missing null-terminate in conjunction with strncpy
On Sun, Jul 27, 2014 at 8:11 AM, Rickard Strandqvist wrote: > Replacing strncpy with strlcpy to avoid strings that lacks null terminate. > And use the sizeof on the to string rather than strlen on the from string. > > Signed-off-by: Rickard Strandqvist > --- > drivers/scsi/3w-9xxx.c |3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c > index 0a73253..f4d2331 100644 > --- a/drivers/scsi/3w-9xxx.c > +++ b/drivers/scsi/3w-9xxx.c > @@ -621,7 +621,8 @@ static int twa_check_srl(TW_Device_Extension *tw_dev, int > *flashed) > } > > /* Load rest of compatibility struct */ > - strncpy(tw_dev->tw_compat_info.driver_version, TW_DRIVER_VERSION, > strlen(TW_DRIVER_VERSION)); > + strlcpy(tw_dev->tw_compat_info.driver_version, TW_DRIVER_VERSION, > + sizeof(tw_dev->tw_compat_info.driver_version)); > tw_dev->tw_compat_info.driver_srl_high = TW_CURRENT_DRIVER_SRL; > tw_dev->tw_compat_info.driver_branch_high = TW_CURRENT_DRIVER_BRANCH; > tw_dev->tw_compat_info.driver_build_high = TW_CURRENT_DRIVER_BUILD; > -- > 1.7.10.4 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ Rickard, This patch looks fine. Acked-by: Adam Radford -Adam -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] scsi: 3w-9xxx.c: Cleaning up missing null-terminate in conjunction with strncpy
On Sun, Jul 27, 2014 at 8:11 AM, Rickard Strandqvist rickard_strandqv...@spectrumdigital.se wrote: Replacing strncpy with strlcpy to avoid strings that lacks null terminate. And use the sizeof on the to string rather than strlen on the from string. Signed-off-by: Rickard Strandqvist rickard_strandqv...@spectrumdigital.se --- drivers/scsi/3w-9xxx.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c index 0a73253..f4d2331 100644 --- a/drivers/scsi/3w-9xxx.c +++ b/drivers/scsi/3w-9xxx.c @@ -621,7 +621,8 @@ static int twa_check_srl(TW_Device_Extension *tw_dev, int *flashed) } /* Load rest of compatibility struct */ - strncpy(tw_dev-tw_compat_info.driver_version, TW_DRIVER_VERSION, strlen(TW_DRIVER_VERSION)); + strlcpy(tw_dev-tw_compat_info.driver_version, TW_DRIVER_VERSION, + sizeof(tw_dev-tw_compat_info.driver_version)); tw_dev-tw_compat_info.driver_srl_high = TW_CURRENT_DRIVER_SRL; tw_dev-tw_compat_info.driver_branch_high = TW_CURRENT_DRIVER_BRANCH; tw_dev-tw_compat_info.driver_build_high = TW_CURRENT_DRIVER_BUILD; -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ Rickard, This patch looks fine. Acked-by: Adam Radford aradf...@gmail.com -Adam -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 13/39] megaraid: simplify internal command handling
403,6 @@ static void scsi_put_host_cmd_pool(gfp_t gfp_mask) > } > > /** > - * scsi_allocate_command - get a fully allocated SCSI command > - * @gfp_mask: allocation mask > - * > - * This function is for use outside of the normal host based pools. > - * It allocates the relevant command and takes an additional reference > - * on the pool it used. This function *must* be paired with > - * scsi_free_command which also has the identical mask, otherwise the > - * free pool counts will eventually go wrong and you'll trigger a bug. > - * > - * This function should *only* be used by drivers that need a static > - * command allocation at start of day for internal functions. > - */ > -struct scsi_cmnd *scsi_allocate_command(gfp_t gfp_mask) > -{ > - struct scsi_host_cmd_pool *pool = scsi_get_host_cmd_pool(gfp_mask); > - > - if (!pool) > - return NULL; > - > - return scsi_pool_alloc_command(pool, gfp_mask); > -} > -EXPORT_SYMBOL(scsi_allocate_command); > - > -/** > - * scsi_free_command - free a command allocated by scsi_allocate_command > - * @gfp_mask: mask used in the original allocation > - * @cmd: command to free > - * > - * Note: using the original allocation mask is vital because that's > - * what determines which command pool we use to free the command. Any > - * mismatch will cause the system to BUG eventually. > - */ > -void scsi_free_command(gfp_t gfp_mask, struct scsi_cmnd *cmd) > -{ > - struct scsi_host_cmd_pool *pool = scsi_get_host_cmd_pool(gfp_mask); > - > - /* > -* this could trigger if the mask to scsi_allocate_command > -* doesn't match this mask. Otherwise we're guaranteed that this > -* succeeds because scsi_allocate_command must have taken a reference > -* on the pool > -*/ > - BUG_ON(!pool); > - > - scsi_pool_free_command(pool, cmd); > - /* > -* scsi_put_host_cmd_pool is called twice; once to release the > -* reference we took above, and once to release the reference > -* originally taken by scsi_allocate_command > -*/ > - scsi_put_host_cmd_pool(gfp_mask); > - scsi_put_host_cmd_pool(gfp_mask); > -} > -EXPORT_SYMBOL(scsi_free_command); > - > -/** > * scsi_setup_command_freelist - Setup the command freelist for a scsi host. > * @shost: host to allocate the freelist for. > * > diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h > index 414edf9..dd7c998 100644 > --- a/include/scsi/scsi_cmnd.h > +++ b/include/scsi/scsi_cmnd.h > @@ -155,9 +155,6 @@ extern void scsi_release_buffers(struct scsi_cmnd *cmd); > extern int scsi_dma_map(struct scsi_cmnd *cmd); > extern void scsi_dma_unmap(struct scsi_cmnd *cmd); > > -struct scsi_cmnd *scsi_allocate_command(gfp_t gfp_mask); > -void scsi_free_command(gfp_t gfp_mask, struct scsi_cmnd *cmd); > - > static inline unsigned scsi_sg_count(struct scsi_cmnd *cmd) > { > return cmd->sdb.table.nents; > -- > 1.7.10.4 > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ Christoph/James, I have reviewed this patch, and it looks good to me. Please consider this ACK'd by the Megaraid driver team. Acked-by: Adam Radford -Adam -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 13/39] megaraid: simplify internal command handling
the original allocation mask is vital because that's - * what determines which command pool we use to free the command. Any - * mismatch will cause the system to BUG eventually. - */ -void scsi_free_command(gfp_t gfp_mask, struct scsi_cmnd *cmd) -{ - struct scsi_host_cmd_pool *pool = scsi_get_host_cmd_pool(gfp_mask); - - /* -* this could trigger if the mask to scsi_allocate_command -* doesn't match this mask. Otherwise we're guaranteed that this -* succeeds because scsi_allocate_command must have taken a reference -* on the pool -*/ - BUG_ON(!pool); - - scsi_pool_free_command(pool, cmd); - /* -* scsi_put_host_cmd_pool is called twice; once to release the -* reference we took above, and once to release the reference -* originally taken by scsi_allocate_command -*/ - scsi_put_host_cmd_pool(gfp_mask); - scsi_put_host_cmd_pool(gfp_mask); -} -EXPORT_SYMBOL(scsi_free_command); - -/** * scsi_setup_command_freelist - Setup the command freelist for a scsi host. * @shost: host to allocate the freelist for. * diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h index 414edf9..dd7c998 100644 --- a/include/scsi/scsi_cmnd.h +++ b/include/scsi/scsi_cmnd.h @@ -155,9 +155,6 @@ extern void scsi_release_buffers(struct scsi_cmnd *cmd); extern int scsi_dma_map(struct scsi_cmnd *cmd); extern void scsi_dma_unmap(struct scsi_cmnd *cmd); -struct scsi_cmnd *scsi_allocate_command(gfp_t gfp_mask); -void scsi_free_command(gfp_t gfp_mask, struct scsi_cmnd *cmd); - static inline unsigned scsi_sg_count(struct scsi_cmnd *cmd) { return cmd-sdb.table.nents; -- 1.7.10.4 -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/ Christoph/James, I have reviewed this patch, and it looks good to me. Please consider this ACK'd by the Megaraid driver team. Acked-by: Adam Radford aradf...@gmail.com -Adam -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] [SCSI] megaraid: Remove local (struct pci_dev) pdev's
On Tue, Jul 9, 2013 at 11:10 PM, James Bottomley wrote: > On Tue, 2013-07-09 at 15:12 -0700, adam radford wrote: >> On Tue, Jul 9, 2013 at 2:18 PM, James Bottomley >> > Adam, you do drive by coding on this for LSI ... ack or reject, please. > >> I have just now located my box of MegaRAID Parallel SCSI controllers. >> I will review and test the patch series from Myron and respond by next >> Monday. > > Thanks, > > James > > Unfortunately my box of MegaRAID Parallel SCSI controllers only contains only cards intended for megaraid_mbox.c (I tested all 20 of them), and does not contain any of the following really old Symbios based megaraid cards: static struct pci_device_id megaraid_pci_tbl[] = { {PCI_VENDOR_ID_AMI, PCI_DEVICE_ID_AMI_MEGARAID, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0}, {PCI_VENDOR_ID_AMI, PCI_DEVICE_ID_AMI_MEGARAID2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0}, {PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_AMI_MEGARAID3, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0}, {0,} }; which I had located previously before the company headquarters moved. I cannot currently locate any of the above 3 controllers anywhere at LSI headquarters after an exhaustive search, so I cannot test the patches to megaraid.c from Myron @ RedHat. Myron, do you actually have the hardware and have you tested the patches yourself ? -Adam -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] [SCSI] megaraid: Remove local (struct pci_dev) pdev's
On Tue, Jul 9, 2013 at 11:10 PM, James Bottomley james.bottom...@hansenpartnership.com wrote: On Tue, 2013-07-09 at 15:12 -0700, adam radford wrote: On Tue, Jul 9, 2013 at 2:18 PM, James Bottomley Adam, you do drive by coding on this for LSI ... ack or reject, please. I have just now located my box of MegaRAID Parallel SCSI controllers. I will review and test the patch series from Myron and respond by next Monday. Thanks, James Unfortunately my box of MegaRAID Parallel SCSI controllers only contains only cards intended for megaraid_mbox.c (I tested all 20 of them), and does not contain any of the following really old Symbios based megaraid cards: static struct pci_device_id megaraid_pci_tbl[] = { {PCI_VENDOR_ID_AMI, PCI_DEVICE_ID_AMI_MEGARAID, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0}, {PCI_VENDOR_ID_AMI, PCI_DEVICE_ID_AMI_MEGARAID2, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0}, {PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_AMI_MEGARAID3, PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0}, {0,} }; which I had located previously before the company headquarters moved. I cannot currently locate any of the above 3 controllers anywhere at LSI headquarters after an exhaustive search, so I cannot test the patches to megaraid.c from Myron @ RedHat. Myron, do you actually have the hardware and have you tested the patches yourself ? -Adam -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] [SCSI] megaraid: Remove local (struct pci_dev) pdev's
On Tue, Jul 9, 2013 at 2:18 PM, James Bottomley wrote: > On Tue, 2013-07-09 at 14:39 -0600, Myron Stowe wrote: >> Is the "megaraid" driver still actively used and maintained? I originally >> posted this series on 06.07.2013 and after receiving no comments, pinged >> the list again on 06.17.2013 and still received no comments/feedback. >> >> Trying again as I believe there is a real issue here, which I'd like >> confirmation on, and we really should remove the local copy/usage of >> 'struct pci_dev' that this driver currently maintains. >> >> >> While the megaraid device itself may be 64-bit DMA capable, 32-bit address >> restricted DMA buffers are apparently required for "internal commands" as >> is denoted by a couple of comments - "For all internal commands, the >> buffer must be allocated in <4GB address range" - within the driver. >> >> If the device is 64-bit DMA capable then, once it is setup, any subsequent >> DMA allocations for "internal commands" would not be properly restricted >> due to megaraid_probe_one() having called pci_set_dma_mask() on pdev with >> DMA_BIT_MASK(64). The driver attempts to solve this by using >> make_local_pdev() to dynamically create local pci_dev structures which are >> then set and used for allocating 32-bit address space restricted DMA >> buffers[1] but I don't believe that the implementation works as intended. >> >> >> Assume that the megaraid device is 64-bit DMA capable. While probing the >> device and attaching the megaraid driver, pci_set_dma_mask() is called >> with the "originating pdev" and a DMA_BIT_MASK of 64. As a result, any >> subsequent dynamic DMA related allocations associated with the >> "originating pdev" will acquire 64-bit based buffers, which do not meet >> the addressing restrictions for internal commands. >> >> megaraid_probe_one(struct pci_dev *pdev, ...) >> ... >> pci_set_dma_mask(pdev, DMA_BIT_MASK(64)); >> >> As mentioned, the driver attempts to solve this by using make_local_pdev() >> to dynamically create local pci_dev structures - "local pdev's" - which >> are set with a DMA_BIT_MASK of 32. >> >> make_local_pdev >> alloc_pci_dev >> memcpy >> pci_set_dma_mask >> dma_set_mask >> *dev->dma_mask = mask; >> >> The "local pdev" is then used in allocating a DMA buffer in an attempt to >> meet the < 4 GB restriction. >> >> For a 64-bit DMA capable device, the "originating pdev" will have its >> 'dma_mask' set to 0x after the driver attaches. >> Subsequently, when an internal command is initiated, make_local_pdev() is >> called. make_local_pdev() uses the PCI's core to allocate a "local pdev" >> and then copies the "originating pdev" content into the newly allocated >> "local pdev". As a result of copying the "originating pdev" content into >> the "local pdev", pdev->dev.dma_mask will be pointing back to the >> "originating pdev's" 'dma_mask' member, not the "local pdev's" as >> intended. Thus, when make_local_pdev() calls pci_set_dma_mask() in an >> attempt to set the "local pdev's" DMA mask to 32 it will instead overwrite >> the "originating pdev's" DMA mask. Thus, after any user initiated >> commands are issued, all subsequent DMA allocations will be 32-bit >> restricted from that point onward regardless of whether they are internal >> commands or otherwise. >> >> >> This patch fixes the issue by removing the setup of DMA_BIT_MASK to 64 in >> megaraid_probe_one(), leaving the driver with default 32-bit DMA >> capabilities, as it currently ends up in such a state anyway after any >> internal commands are initiated. >> >> >> [1] It seems strange that both mega_buffer/buf_dma_handle and >> make_local_pdev() both exist for internal commands but this has been >> the case for a long time - at least since 2.6.12-rc2. Perhaps there >> is some coalescing that could be done. >> --- >> Myron Stowe (3): >> [SCSI] megaraid: Remove 64-bit DMA related dead code >> [SCSI] megaraid: Remove local pdev's >> [SCSI] megaraid: Remove 64-bit DMA_BIT_MASK capability > > Adam, you do drive by coding on this for LSI ... ack or reject, please. > > James > > James, I have just now located my box of MegaRAID Parallel SCSI controllers. I will review and test the patch series from Myron and respond by next Monday. -Adam -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 0/3] [SCSI] megaraid: Remove local (struct pci_dev) pdev's
On Tue, Jul 9, 2013 at 2:18 PM, James Bottomley james.bottom...@hansenpartnership.com wrote: On Tue, 2013-07-09 at 14:39 -0600, Myron Stowe wrote: Is the megaraid driver still actively used and maintained? I originally posted this series on 06.07.2013 and after receiving no comments, pinged the list again on 06.17.2013 and still received no comments/feedback. Trying again as I believe there is a real issue here, which I'd like confirmation on, and we really should remove the local copy/usage of 'struct pci_dev' that this driver currently maintains. While the megaraid device itself may be 64-bit DMA capable, 32-bit address restricted DMA buffers are apparently required for internal commands as is denoted by a couple of comments - For all internal commands, the buffer must be allocated in 4GB address range - within the driver. If the device is 64-bit DMA capable then, once it is setup, any subsequent DMA allocations for internal commands would not be properly restricted due to megaraid_probe_one() having called pci_set_dma_mask() on pdev with DMA_BIT_MASK(64). The driver attempts to solve this by using make_local_pdev() to dynamically create local pci_dev structures which are then set and used for allocating 32-bit address space restricted DMA buffers[1] but I don't believe that the implementation works as intended. Assume that the megaraid device is 64-bit DMA capable. While probing the device and attaching the megaraid driver, pci_set_dma_mask() is called with the originating pdev and a DMA_BIT_MASK of 64. As a result, any subsequent dynamic DMA related allocations associated with the originating pdev will acquire 64-bit based buffers, which do not meet the addressing restrictions for internal commands. megaraid_probe_one(struct pci_dev *pdev, ...) ... pci_set_dma_mask(pdev, DMA_BIT_MASK(64)); As mentioned, the driver attempts to solve this by using make_local_pdev() to dynamically create local pci_dev structures - local pdev's - which are set with a DMA_BIT_MASK of 32. make_local_pdev alloc_pci_dev memcpy pci_set_dma_mask dma_set_mask *dev-dma_mask = mask; The local pdev is then used in allocating a DMA buffer in an attempt to meet the 4 GB restriction. For a 64-bit DMA capable device, the originating pdev will have its 'dma_mask' set to 0x after the driver attaches. Subsequently, when an internal command is initiated, make_local_pdev() is called. make_local_pdev() uses the PCI's core to allocate a local pdev and then copies the originating pdev content into the newly allocated local pdev. As a result of copying the originating pdev content into the local pdev, pdev-dev.dma_mask will be pointing back to the originating pdev's 'dma_mask' member, not the local pdev's as intended. Thus, when make_local_pdev() calls pci_set_dma_mask() in an attempt to set the local pdev's DMA mask to 32 it will instead overwrite the originating pdev's DMA mask. Thus, after any user initiated commands are issued, all subsequent DMA allocations will be 32-bit restricted from that point onward regardless of whether they are internal commands or otherwise. This patch fixes the issue by removing the setup of DMA_BIT_MASK to 64 in megaraid_probe_one(), leaving the driver with default 32-bit DMA capabilities, as it currently ends up in such a state anyway after any internal commands are initiated. [1] It seems strange that both mega_buffer/buf_dma_handle and make_local_pdev() both exist for internal commands but this has been the case for a long time - at least since 2.6.12-rc2. Perhaps there is some coalescing that could be done. --- Myron Stowe (3): [SCSI] megaraid: Remove 64-bit DMA related dead code [SCSI] megaraid: Remove local pdev's [SCSI] megaraid: Remove 64-bit DMA_BIT_MASK capability Adam, you do drive by coding on this for LSI ... ack or reject, please. James James, I have just now located my box of MegaRAID Parallel SCSI controllers. I will review and test the patch series from Myron and respond by next Monday. -Adam -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: "WRITE SAME failed. Manually zeroing" with 3w-xxxx driver
On Mon, Apr 29, 2013 at 9:13 AM, Martin K. Petersen wrote: >>>>>> "Florian" == Florian Westphal writes: > > Florian> After update to 3.8 dmesg is spammed with: kernel: [ > Florian> 280.272094] 3w-: scsi8: Unknown scsi opcode: 0x41 kernel: [ > Florian> 280.272107] sd 8:0:0:0: [sda] Unhandled error code kernel: > > Interesting. It looks like the 3ware handles this at the driver level > instead of passing the command through to the disk and letting it > fail. That in turn means that the logic we have in place to disable WS > when the disk does not support it does not get triggered. > > The driver should really fill out the sense buffer in that case. > > Could you please test the patch below? > > > Florian> This goes on and on. > > The second question is what it is that's issuing these zeroouts at boot? > Which filesystem are you using? What's your DM/MD config? > > -- > Martin K. Petersen Oracle Linux Engineering > > > 3w-: Create sense buffer for unsupported commands > > Make the driver return appropriate sense data when an unsupported > operation is queued. This will cause the SCSI layer to stop issuing the > offending command. > > Reported-by: Florian Westphal > CC: adam radford > Signed-off-by: Martin K. Petersen > > diff --git a/drivers/scsi/3w-.c b/drivers/scsi/3w-.c > index 56662ae..b9276d1 100644 > --- a/drivers/scsi/3w-.c > +++ b/drivers/scsi/3w-.c > @@ -216,6 +216,7 @@ > #include > #include > #include > +#include > #include "3w-.h" > > /* Globals */ > @@ -2009,7 +2010,8 @@ static int tw_scsi_queue_lck(struct scsi_cmnd *SCpnt, > void (*done)(struct scsi_c > printk(KERN_NOTICE "3w-: scsi%d: Unknown scsi > opcode: 0x%x\n", tw_dev->host->host_no, *command); > tw_dev->state[request_id] = TW_S_COMPLETED; > tw_state_request_finish(tw_dev, request_id); > - SCpnt->result = (DID_BAD_TARGET << 16); > + SCpnt->result = (DRIVER_SENSE << 24) | > SAM_STAT_CHECK_CONDITION; > + scsi_build_sense_buffer(1, SCpnt->sense_buffer, > ILLEGAL_REQUEST, 0x20, 0); > done(SCpnt); > retval = 0; > } Thanks Martin. This patch looks good. Acked-by: Adam Radford -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: WRITE SAME failed. Manually zeroing with 3w-xxxx driver
On Mon, Apr 29, 2013 at 9:13 AM, Martin K. Petersen martin.peter...@oracle.com wrote: Florian == Florian Westphal f...@strlen.de writes: Florian After update to 3.8 dmesg is spammed with: kernel: [ Florian 280.272094] 3w-: scsi8: Unknown scsi opcode: 0x41 kernel: [ Florian 280.272107] sd 8:0:0:0: [sda] Unhandled error code kernel: Interesting. It looks like the 3ware handles this at the driver level instead of passing the command through to the disk and letting it fail. That in turn means that the logic we have in place to disable WS when the disk does not support it does not get triggered. The driver should really fill out the sense buffer in that case. Could you please test the patch below? Florian This goes on and on. The second question is what it is that's issuing these zeroouts at boot? Which filesystem are you using? What's your DM/MD config? -- Martin K. Petersen Oracle Linux Engineering 3w-: Create sense buffer for unsupported commands Make the driver return appropriate sense data when an unsupported operation is queued. This will cause the SCSI layer to stop issuing the offending command. Reported-by: Florian Westphal f...@strlen.de CC: adam radford aradf...@gmail.com Signed-off-by: Martin K. Petersen martin.peter...@oracle.com diff --git a/drivers/scsi/3w-.c b/drivers/scsi/3w-.c index 56662ae..b9276d1 100644 --- a/drivers/scsi/3w-.c +++ b/drivers/scsi/3w-.c @@ -216,6 +216,7 @@ #include scsi/scsi_host.h #include scsi/scsi_tcq.h #include scsi/scsi_cmnd.h +#include scsi/scsi_eh.h #include 3w-.h /* Globals */ @@ -2009,7 +2010,8 @@ static int tw_scsi_queue_lck(struct scsi_cmnd *SCpnt, void (*done)(struct scsi_c printk(KERN_NOTICE 3w-: scsi%d: Unknown scsi opcode: 0x%x\n, tw_dev-host-host_no, *command); tw_dev-state[request_id] = TW_S_COMPLETED; tw_state_request_finish(tw_dev, request_id); - SCpnt-result = (DID_BAD_TARGET 16); + SCpnt-result = (DRIVER_SENSE 24) | SAM_STAT_CHECK_CONDITION; + scsi_build_sense_buffer(1, SCpnt-sense_buffer, ILLEGAL_REQUEST, 0x20, 0); done(SCpnt); retval = 0; } Thanks Martin. This patch looks good. Acked-by: Adam Radford aradf...@gmail.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch,v3,repost 07/10] megaraid_sas: use scsi_host_alloc_node
On Tue, Nov 27, 2012 at 8:46 AM, Jeff Moyer wrote: > > Signed-off-by: Jeff Moyer > --- > drivers/scsi/megaraid/megaraid_sas_base.c |5 +++-- > 1 files changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c > b/drivers/scsi/megaraid/megaraid_sas_base.c > index d2c5366..707a6cd 100644 > --- a/drivers/scsi/megaraid/megaraid_sas_base.c > +++ b/drivers/scsi/megaraid/megaraid_sas_base.c > @@ -4020,8 +4020,9 @@ megasas_probe_one(struct pci_dev *pdev, const struct > pci_device_id *id) > if (megasas_set_dma_mask(pdev)) > goto fail_set_dma_mask; > > - host = scsi_host_alloc(_template, > - sizeof(struct megasas_instance)); > + host = scsi_host_alloc_node(_template, > + sizeof(struct megasas_instance), > + dev_to_node(>dev)); > > if (!host) { > printk(KERN_DEBUG "megasas: scsi_host_alloc failed\n"); Acked-by: Adam Radford -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 4/6] scsi: megaraid: remove a useless call to memset().
On Sat, Dec 1, 2012 at 6:40 PM, Cyril Roelandt wrote: > This call is followed by a call to memcpy() on the same memory area, so it can > be safely removed. > > Signed-off-by: Cyril Roelandt > --- > drivers/scsi/megaraid/megaraid_sas_fusion.c |2 -- > 1 file changed, 2 deletions(-) > > diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c > b/drivers/scsi/megaraid/megaraid_sas_fusion.c > index 74030af..71cc3eb 100644 > --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c > +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c > @@ -1028,8 +1028,6 @@ map_cmd_status(struct megasas_cmd_fusion *cmd, u8 > status, u8 ext_status) > > cmd->scmd->result = (DID_OK << 16) | ext_status; > if (ext_status == SAM_STAT_CHECK_CONDITION) { > - memset(cmd->scmd->sense_buffer, 0, > - SCSI_SENSE_BUFFERSIZE); > memcpy(cmd->scmd->sense_buffer, cmd->sense, >SCSI_SENSE_BUFFERSIZE); > cmd->scmd->result |= DRIVER_SENSE << 24; > -- Acked-by: Adam Radford -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 4/6] scsi: megaraid: remove a useless call to memset().
On Sat, Dec 1, 2012 at 6:40 PM, Cyril Roelandt tipec...@gmail.com wrote: This call is followed by a call to memcpy() on the same memory area, so it can be safely removed. Signed-off-by: Cyril Roelandt tipec...@gmail.com --- drivers/scsi/megaraid/megaraid_sas_fusion.c |2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c index 74030af..71cc3eb 100644 --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c @@ -1028,8 +1028,6 @@ map_cmd_status(struct megasas_cmd_fusion *cmd, u8 status, u8 ext_status) cmd-scmd-result = (DID_OK 16) | ext_status; if (ext_status == SAM_STAT_CHECK_CONDITION) { - memset(cmd-scmd-sense_buffer, 0, - SCSI_SENSE_BUFFERSIZE); memcpy(cmd-scmd-sense_buffer, cmd-sense, SCSI_SENSE_BUFFERSIZE); cmd-scmd-result |= DRIVER_SENSE 24; -- Acked-by: Adam Radford aradf...@gmail.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch,v3,repost 07/10] megaraid_sas: use scsi_host_alloc_node
On Tue, Nov 27, 2012 at 8:46 AM, Jeff Moyer jmo...@redhat.com wrote: Signed-off-by: Jeff Moyer jmo...@redhat.com --- drivers/scsi/megaraid/megaraid_sas_base.c |5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c index d2c5366..707a6cd 100644 --- a/drivers/scsi/megaraid/megaraid_sas_base.c +++ b/drivers/scsi/megaraid/megaraid_sas_base.c @@ -4020,8 +4020,9 @@ megasas_probe_one(struct pci_dev *pdev, const struct pci_device_id *id) if (megasas_set_dma_mask(pdev)) goto fail_set_dma_mask; - host = scsi_host_alloc(megasas_template, - sizeof(struct megasas_instance)); + host = scsi_host_alloc_node(megasas_template, + sizeof(struct megasas_instance), + dev_to_node(pdev-dev)); if (!host) { printk(KERN_DEBUG megasas: scsi_host_alloc failed\n); Acked-by: Adam Radford aradf...@gmail.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] megaraid_sas: combine kmalloc+memset into kzalloc
On Fri, Aug 24, 2012 at 8:27 AM, Fengguang Wu wrote: > Use kzalloc rather than kmalloc followed by memset with 0. > > Generated by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci > > Signed-off-by: Fengguang Wu > --- > drivers/scsi/megaraid/megaraid_sas_fusion.c |7 ++- > 1 file changed, 2 insertions(+), 5 deletions(-) > > --- linux.orig/drivers/scsi/megaraid/megaraid_sas_fusion.c 2012-06-07 > 05:39:57.418846382 +0800 > +++ linux/drivers/scsi/megaraid/megaraid_sas_fusion.c 2012-08-24 > 23:25:02.261560445 +0800 > @@ -461,8 +461,8 @@ megasas_alloc_cmds_fusion(struct megasas > * Allocate the dynamic array first and then allocate individual > * commands. > */ > - fusion->cmd_list = kmalloc(sizeof(struct megasas_cmd_fusion *) > - *max_cmd, GFP_KERNEL); > + fusion->cmd_list = kzalloc(sizeof(struct megasas_cmd_fusion *) > + * max_cmd, GFP_KERNEL); > > if (!fusion->cmd_list) { > printk(KERN_DEBUG "megasas: out of memory. Could not alloc " > @@ -470,9 +470,6 @@ megasas_alloc_cmds_fusion(struct megasas > goto fail_cmd_list; > } > > - memset(fusion->cmd_list, 0, sizeof(struct megasas_cmd_fusion *) > - *max_cmd); > - > max_cmd = instance->max_fw_cmds; > for (i = 0; i < max_cmd; i++) { > fusion->cmd_list[i] = kmalloc(sizeof(struct > megasas_cmd_fusion), Acked-by: Adam Radford -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] megaraid_sas: combine kmalloc+memset into kzalloc
On Fri, Aug 24, 2012 at 8:27 AM, Fengguang Wu fengguang...@intel.com wrote: Use kzalloc rather than kmalloc followed by memset with 0. Generated by: scripts/coccinelle/api/alloc/kzalloc-simple.cocci Signed-off-by: Fengguang Wu fengguang...@intel.com --- drivers/scsi/megaraid/megaraid_sas_fusion.c |7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) --- linux.orig/drivers/scsi/megaraid/megaraid_sas_fusion.c 2012-06-07 05:39:57.418846382 +0800 +++ linux/drivers/scsi/megaraid/megaraid_sas_fusion.c 2012-08-24 23:25:02.261560445 +0800 @@ -461,8 +461,8 @@ megasas_alloc_cmds_fusion(struct megasas * Allocate the dynamic array first and then allocate individual * commands. */ - fusion-cmd_list = kmalloc(sizeof(struct megasas_cmd_fusion *) - *max_cmd, GFP_KERNEL); + fusion-cmd_list = kzalloc(sizeof(struct megasas_cmd_fusion *) + * max_cmd, GFP_KERNEL); if (!fusion-cmd_list) { printk(KERN_DEBUG megasas: out of memory. Could not alloc @@ -470,9 +470,6 @@ megasas_alloc_cmds_fusion(struct megasas goto fail_cmd_list; } - memset(fusion-cmd_list, 0, sizeof(struct megasas_cmd_fusion *) - *max_cmd); - max_cmd = instance-max_fw_cmds; for (i = 0; i max_cmd; i++) { fusion-cmd_list[i] = kmalloc(sizeof(struct megasas_cmd_fusion), Acked-by: Adam Radford aradf...@gmail.com -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [RFC] [SCSI] mpt fusion: add support for 0x1000/0x0055
On 7/27/12, Jiri Kosina wrote: ... > > So, what is the alternative? > > The only thing I know is that it works at least in basic mode (haven't > tested performance at all). The driver for your card is a closed source driver called 'megasr'. Here is a link to the LSI download page for this card/driver: http://www.lsi.com/support/Pages/Download-Results.aspx?productcode=P00041=0=Storage%20Component=Legacy%20RAID%20Controllers=MegaRAID%20SAS%208208XLP -Adam -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [RFC] [SCSI] mpt fusion: add support for 0x1000/0x0055
On 7/27/12, Jiri Kosina wrote: > On Sat, 21 Jul 2012, Jiri Kosina wrote: > >> The device identifies itself as >> >> 0d:05.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X >> Fusion-MPT SAS (rev 01) Subsystem: NEC Corporation SAS1068 >> >> and seems to be functionally compatible with 0x0054 PID. >> >> The request for support of this device has been raised on mailinglists >> several >> times in the past (see [1] [2] and more), but aparently the PCI ID never >> made it >> to mptsas_pci_table[]. >> >> [1] http://comments.gmane.org/gmane.linux.scsi/63836 >> [2] http://lkml.indiana.edu/hypermail/linux/kernel/0701.2/1715.html >> >> Signed-off-by: Jiri Kosina >> --- >> >> >> I guess the "Subsystem: NEC Corporation" is telling us some rebranding >> story, including the PID change ... ? > > Hi guys, > > any feedback on this please? > > Thanks. NACK. Vendor 0x1000, Device id 0x0055 is actually an old LSI MegaRAID 1068 based software raid board. This device was never qualified nor intended to be used with the mpt fusion driver. -Adam -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [RFC] [SCSI] mpt fusion: add support for 0x1000/0x0055
On 7/27/12, Jiri Kosina jkos...@suse.cz wrote: On Sat, 21 Jul 2012, Jiri Kosina wrote: The device identifies itself as 0d:05.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068 PCI-X Fusion-MPT SAS (rev 01) Subsystem: NEC Corporation SAS1068 and seems to be functionally compatible with 0x0054 PID. The request for support of this device has been raised on mailinglists several times in the past (see [1] [2] and more), but aparently the PCI ID never made it to mptsas_pci_table[]. [1] http://comments.gmane.org/gmane.linux.scsi/63836 [2] http://lkml.indiana.edu/hypermail/linux/kernel/0701.2/1715.html Signed-off-by: Jiri Kosina jkos...@suse.cz --- I guess the Subsystem: NEC Corporation is telling us some rebranding story, including the PID change ... ? Hi guys, any feedback on this please? Thanks. NACK. Vendor 0x1000, Device id 0x0055 is actually an old LSI MegaRAID 1068 based software raid board. This device was never qualified nor intended to be used with the mpt fusion driver. -Adam -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [RFC] [SCSI] mpt fusion: add support for 0x1000/0x0055
On 7/27/12, Jiri Kosina jkos...@suse.cz wrote: ... So, what is the alternative? The only thing I know is that it works at least in basic mode (haven't tested performance at all). The driver for your card is a closed source driver called 'megasr'. Here is a link to the LSI download page for this card/driver: http://www.lsi.com/support/Pages/Download-Results.aspx?productcode=P00041assettype=0component=Storage%20Componentproductfamily=Legacy%20RAID%20Controllersproductname=MegaRAID%20SAS%208208XLP -Adam -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Kernel crash with lspci -vv and SAS Controller LSI Logic / Symbios Logic MegaRAID SAS 9240
On 7/18/12, stepping stone GmbH wrote: ... > > a tail of lspci -vv http://pastebin.com/kjh8ig9q > PCI Config reads from lspci -vvv don't go through the megaraid_sas driver itself. It looks like your system hung up while trying to do a PCI Config read of Capabilities 0xd0: VPD (Vital Product Data). I have a 9240 controller, and with kernel 3.5.0-rc1, I can do lspci -vvv : Capabilities: [d0] Vital Product Data Unknown small resource type 00, will not decode more. Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+ Address: Data: Capabilities: [c0] MSI-X: Enable+ Count=15 Masked- Vector table: BAR=1 offset=2000 PBA: BAR=1 offset=3800 Capabilities: [100] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn- Capabilities: [138] Power Budgeting Kernel driver in use: megaraid_sas Kernel modules: megaraid_sas Can you reproduce with kernel 3.5.0-rcX ? I would also try upgrading your controller firmware. -Adam -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Kernel crash with lspci -vv and SAS Controller LSI Logic / Symbios Logic MegaRAID SAS 9240
On 7/18/12, stepping stone GmbH supp...@stepping-stone.ch wrote: ... a tail of lspci -vv http://pastebin.com/kjh8ig9q PCI Config reads from lspci -vvv don't go through the megaraid_sas driver itself. It looks like your system hung up while trying to do a PCI Config read of Capabilities 0xd0: VPD (Vital Product Data). I have a 9240 controller, and with kernel 3.5.0-rc1, I can do lspci -vvv : Capabilities: [d0] Vital Product Data Unknown small resource type 00, will not decode more. Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+ Address: Data: Capabilities: [c0] MSI-X: Enable+ Count=15 Masked- Vector table: BAR=1 offset=2000 PBA: BAR=1 offset=3800 Capabilities: [100] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn- Capabilities: [138] Power Budgeting ? Kernel driver in use: megaraid_sas Kernel modules: megaraid_sas Can you reproduce with kernel 3.5.0-rcX ? I would also try upgrading your controller firmware. -Adam -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Patch submission question [not in the FAQ]
On Dec 5, 2007 3:36 AM, Gabriele Gorla <[EMAIL PROTECTED]> wrote: > Hello, > I have submitted a patch for the 3x- driver on > alpha several months ago to both the driver maintainer > and the linux-scsi mailing list. > I have read all the FAQ and I tried to stick to the > instructions to the letter. > However the patch has been completely ignored. No > reply, no comment, no flames, absolutely nothing... > > the original email submission is at the end of the > email. > > could anyone please explain what I am doing wrong? > > thanks, > GG Gabriele, I ignored your patch because: 1. I do not believe you have the 3w- driver running on an alpha SMP system. 2. I removed the bitfields from the 3w- driver but I have yet to add full big endian support due to lack of demand. I have such a patch for this driver (which already includes the unpacking of the wait_queue_head_t variable) but I have not submitted it to the main-line kernel. The in-kernel 3w- driver is still missing the byte-swaps. The 3w-9xxx (9000 series 3ware driver) has full big endian support. 3. Your patch was garbled. Is this an official request for big endian support for the 3w- driver or are you looking for anybody who has a packed 'wait_queue_head_t' and submitting a patch to fix it? -Adam -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Patch submission question [not in the FAQ]
On Dec 5, 2007 3:36 AM, Gabriele Gorla [EMAIL PROTECTED] wrote: Hello, I have submitted a patch for the 3x- driver on alpha several months ago to both the driver maintainer and the linux-scsi mailing list. I have read all the FAQ and I tried to stick to the instructions to the letter. However the patch has been completely ignored. No reply, no comment, no flames, absolutely nothing... the original email submission is at the end of the email. could anyone please explain what I am doing wrong? thanks, GG Gabriele, I ignored your patch because: 1. I do not believe you have the 3w- driver running on an alpha SMP system. 2. I removed the bitfields from the 3w- driver but I have yet to add full big endian support due to lack of demand. I have such a patch for this driver (which already includes the unpacking of the wait_queue_head_t variable) but I have not submitted it to the main-line kernel. The in-kernel 3w- driver is still missing the byte-swaps. The 3w-9xxx (9000 series 3ware driver) has full big endian support. 3. Your patch was garbled. Is this an official request for big endian support for the 3w- driver or are you looking for anybody who has a packed 'wait_queue_head_t' and submitting a patch to fix it? -Adam -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: SCSI opcode 0x80 and 3ware Escalade 7000 ATA RAID
Make sure you are are using the 3ware character ioctl interface at /dev/twe0 (dynamic major, controller number minor) for your smartmontools, not /dev/sda. i.e. smartctl -a -d 3ware,X /dev/tweY where X = port # of the drive whose information you want, Y=controller #. The old interface from smartmontools used SCSI_IOCTL_SEND_COMMAND ioctls with a special passthru opcode of 0x80 that would get passed to the driver. This interface is deprecated in the driver and the kernel. -Adam On 4/15/05, Kyle Moffett <[EMAIL PROTECTED]> wrote: > I've been getting the following message in syslog on a couple of my > servers > recently: > > > Apr 15 16:41:18 king kernel: scsi: unknown opcode 0x80 > > I've tracked it down to the SCSI opcode verification patch that went in > a > while back. I also determined that the trigger was our smartd/smartctl > runs, > which execute tests and get status on a regular basis. Our kernel is > latest > Debian kernel-image-2.6.8-686-smp, although I've verified identical > behavior > with a recent kernel.org kernel. > > The below strace run generates exactly 8 warnings every time. > > On another (unrelated) note, we also get the following messages at the > rate > at which our Cisco does IPv6 router-announcement broadcasts. This is on > every kernel that we have, from 2.4 through 2.6. Is there something > wrong > with our setup, or is this just a spurious error? (NOTE: I don't know > for > sure that our Cisco's set up properly, we're all new at IPv6 here, and > we're > not actually relying on the advertising yet, so...) > > > Apr 15 18:01:27 king kernel: IPv6 addrconf: prefix with wrong length 49 > > I'll get a packet trace of necessary. > > Thanks for your help! > > Here's an example smartctl log (Without the console warnings). > > # strace -o smart.trace smartctl /dev/sda -d 3ware,0 -a > > smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen > > Home page is http://smartmontools.sourceforge.net/ > > > > === START OF INFORMATION SECTION === > > Device Model: Maxtor 6Y200P0 > > Serial Number:Y60PVCZE > > Firmware Version: YAR41BW0 > > Device is:In smartctl database [for details use: -P show] > > ATA Version is: 7 > > ATA Standard is: ATA/ATAPI-7 T13 1532D revision 0 > > Local Time is:Fri Apr 15 17:25:49 2005 EDT > > SMART support is: Available - device has SMART capability. > > SMART support is: Enabled > > > > === START OF READ SMART DATA SECTION === > > SMART overall-health self-assessment test result: PASSED > > > > General SMART Values: > > Offline data collection status: (0x82) Offline data collection > > activity > > was completed without error. > > Auto Offline Data Collection: > > Enabled. > > Self-test execution status: ( 0) The previous self-test routine > > completed > > without error or no self-test > > has ever > > been run. > > Total time to complete Offline > > data collection: ( 243) seconds. > > Offline data collection > > capabilities:(0x5b) SMART execute Offline > > immediate. > > Auto Offline data collection > > on/off support. > > Suspend Offline collection > > upon new > > command. > > Offline surface scan supported. > > Self-test supported. > > No Conveyance Self-test > > supported. > > Selective Self-test supported. > > SMART capabilities:(0x0003) Saves SMART data before > > entering > > power-saving mode. > > Supports SMART auto save timer. > > Error logging capability:(0x01) Error logging supported. > > No General Purpose Logging > > support. > > Short self-test routine > > recommended polling time:( 2) minutes. > > Extended self-test routine > > recommended polling time:( 91) minutes. > > > > SMART Attributes Data Structure revision number: 16 > > Vendor Specific SMART Attributes with Thresholds: > > ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE > > UPDATED WHEN_FAILED RAW_VALUE > > 3 Spin_Up_Time0x0027 252 252 063Pre-fail > > Always - 5945 > > 4 Start_Stop_Count0x0032 253 253 000Old_age > > Always - 17 > > 5 Reallocated_Sector_Ct 0x0033 253 253 063Pre-fail > > Always - 1 > > 6 Read_Channel_Margin 0x0001 253 253 100Pre-fail > > Offline - 0 > > 7 Seek_Error_Rate 0x000a 253 252 000Old_age > > Always - 0 > > 8
Re: SCSI opcode 0x80 and 3ware Escalade 7000 ATA RAID
Make sure you are are using the 3ware character ioctl interface at /dev/twe0 (dynamic major, controller number minor) for your smartmontools, not /dev/sda. i.e. smartctl -a -d 3ware,X /dev/tweY where X = port # of the drive whose information you want, Y=controller #. The old interface from smartmontools used SCSI_IOCTL_SEND_COMMAND ioctls with a special passthru opcode of 0x80 that would get passed to the driver. This interface is deprecated in the driver and the kernel. -Adam On 4/15/05, Kyle Moffett [EMAIL PROTECTED] wrote: I've been getting the following message in syslog on a couple of my servers recently: Apr 15 16:41:18 king kernel: scsi: unknown opcode 0x80 I've tracked it down to the SCSI opcode verification patch that went in a while back. I also determined that the trigger was our smartd/smartctl runs, which execute tests and get status on a regular basis. Our kernel is latest Debian kernel-image-2.6.8-686-smp, although I've verified identical behavior with a recent kernel.org kernel. The below strace run generates exactly 8 warnings every time. On another (unrelated) note, we also get the following messages at the rate at which our Cisco does IPv6 router-announcement broadcasts. This is on every kernel that we have, from 2.4 through 2.6. Is there something wrong with our setup, or is this just a spurious error? (NOTE: I don't know for sure that our Cisco's set up properly, we're all new at IPv6 here, and we're not actually relying on the advertising yet, so...) Apr 15 18:01:27 king kernel: IPv6 addrconf: prefix with wrong length 49 I'll get a packet trace of necessary. Thanks for your help! Here's an example smartctl log (Without the console warnings). # strace -o smart.trace smartctl /dev/sda -d 3ware,0 -a smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen Home page is http://smartmontools.sourceforge.net/ === START OF INFORMATION SECTION === Device Model: Maxtor 6Y200P0 Serial Number:Y60PVCZE Firmware Version: YAR41BW0 Device is:In smartctl database [for details use: -P show] ATA Version is: 7 ATA Standard is: ATA/ATAPI-7 T13 1532D revision 0 Local Time is:Fri Apr 15 17:25:49 2005 EDT SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x82) Offline data collection activity was completed without error. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: ( 243) seconds. Offline data collection capabilities:(0x5b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. No Conveyance Self-test supported. Selective Self-test supported. SMART capabilities:(0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability:(0x01) Error logging supported. No General Purpose Logging support. Short self-test routine recommended polling time:( 2) minutes. Extended self-test routine recommended polling time:( 91) minutes. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 3 Spin_Up_Time0x0027 252 252 063Pre-fail Always - 5945 4 Start_Stop_Count0x0032 253 253 000Old_age Always - 17 5 Reallocated_Sector_Ct 0x0033 253 253 063Pre-fail Always - 1 6 Read_Channel_Margin 0x0001 253 253 100Pre-fail Offline - 0 7 Seek_Error_Rate 0x000a 253 252 000Old_age Always - 0 8 Seek_Time_Performance 0x0027 251 248 187Pre-fail Always - 47575 9 Power_On_Minutes0x0032 220 220 000Old_age Always - 849h+05m 10
RE: Bug in 3w-xxxx.c (Notifiers STILL broken)
This is due to a bug in kernel/sys.c in the function notifier_chain_unregister(). where the 'notifier_lock' can't be acquired while reboot is running. I suspect any other drivers that call this function on shutdown from unregister_reboot_notifier() (in the case where the root filesystem is mounted through the driver will also have this problem), i.e. DAC960.c (Mylex) and gdth.c (ICP). The fix for now is to modify kernel/sys.c, the function "notifier_chain_unregister", and remove the write_lock(_lock), and write_unlock(_lock) calls from this function and recompile your kernel. -- Adam Radford Software Engineer 3ware, Inc. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Friday, September 15, 2000 12:49 AM To: [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Bug in 3w-.c Hi, i have discovered a problem with a 3ware 5400 controller inside my SMP system (Dual PIII800, kernel 2.4.0-test7, 3w-.c version 1.02.00.002). Problem: System will not reboot or halt. Last message on console comes form within 3w-.c. Solution: remove call to function unregister_reboot_notifier() in function tw_halt() solves the problem. Regards, Frank Koeck --- Frank Koeck Max-Planck-Institut fuer Kernphysik, Saupfercheckweg 1, D-69117 Heidelberg Phone: +49-6221-516-518 Fax: +49-6221-516-602 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
RE: Bug in 3w-xxxx.c (Notifiers STILL broken)
This is due to a bug in kernel/sys.c in the function notifier_chain_unregister(). where the 'notifier_lock' can't be acquired while reboot is running. I suspect any other drivers that call this function on shutdown from unregister_reboot_notifier() (in the case where the root filesystem is mounted through the driver will also have this problem), i.e. DAC960.c (Mylex) and gdth.c (ICP). The fix for now is to modify kernel/sys.c, the function "notifier_chain_unregister", and remove the write_lock(notifier_lock), and write_unlock(notifier_lock) calls from this function and recompile your kernel. -- Adam Radford Software Engineer 3ware, Inc. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] Sent: Friday, September 15, 2000 12:49 AM To: [EMAIL PROTECTED]; [EMAIL PROTECTED] Subject: Bug in 3w-.c Hi, i have discovered a problem with a 3ware 5400 controller inside my SMP system (Dual PIII800, kernel 2.4.0-test7, 3w-.c version 1.02.00.002). Problem: System will not reboot or halt. Last message on console comes form within 3w-.c. Solution: remove call to function unregister_reboot_notifier() in function tw_halt() solves the problem. Regards, Frank Koeck --- Frank Koeck Max-Planck-Institut fuer Kernphysik, Saupfercheckweg 1, D-69117 Heidelberg Phone: +49-6221-516-518 Fax: +49-6221-516-602 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/