[Patch][RFC] st: provide tape statistics via sysfs

2013-02-21 Thread Seymour, Shane M
First, forgive me for using Outlook for this; if there are any issues with what
I sent, let me know and I'll send it again from Gmail. This is also my first
attempt at a kernel patch, so please be gentle.

This patch was written to enable tape statistics via sysfs for the st driver
and is based on kernel 3.8.0-rc6. It creates two new files in sysfs and builds
on work done in 2005 by Kai Mäkisara. Any feedback would be greatly
appreciated.

Assuming sysfs is mounted at /sys, the first file is
/sys/bus/scsi/drivers/st/drives. It contains a single number that bounds the
tape drive instance numbers assigned by st_probe in the st module. If it's 4,
it's possible that st0, st1, st2, and st3 exist on the system. Since tape
drives can later be disconnected, they don't all have to exist; the count is a
hint that makes it possible to gather statistics in a loop with an upper
bound. This makes it easier to gather statistics in iostat.
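
A minimal user-space sketch of the bounded loop described above, assuming the
two sysfs paths proposed in this patch. The function name and the sysfs_root
parameter are hypothetical; the parameter exists only so the loop can be
pointed at a test directory instead of /sys.

```python
from pathlib import Path


def existing_tape_stats(sysfs_root="/sys"):
    """Map each present stN name to the path of its stat file.

    The number in the drives file is treated as an exclusive upper bound,
    so instances 0 .. bound-1 are probed and missing ones are skipped.
    """
    root = Path(sysfs_root)
    drives_file = root / "bus/scsi/drivers/st/drives"
    upper_bound = int(drives_file.read_text().strip())

    found = {}
    for index in range(upper_bound):
        name = f"st{index}"
        stat_path = root / "class/scsi_tape" / name / "stat"
        # A drive may have been disconnected since it was probed.
        if stat_path.is_file():
            found[name] = stat_path
    return found
```

With a drives value of 4 and only st0 and st2 present, the result would hold
just those two entries.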

The second file is /sys/class/scsi_tape/stxx/stat, where xx is the instance of
the tape drive. The file contents are almost the same as the stat file for
disks, except that the merge statistics are always 0 (tape drives are
sequential, so merged I/Os don't make sense) and the inflight value is either
0 or 1, since the st module only ever has one read or write outstanding.
I've also added one field to the end of the file: a count of other I/Os. These
could be commands issued by the driver within the kernel (e.g. rewind) or via
an ioctl from user space. For tape drives, some commands involving actions like
tape movement can take a long time, so it's important to keep track of SCSI
requests sent to the tape drive other than reads and writes; that way, delays
can be explained when they happen.
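
A short sketch of how a tool might parse such a file. It assumes the first
eleven fields follow the order of the disk stat file (as documented in
Documentation/block/stat.txt) and that the new count of other I/Os is the
twelfth; the field names and the function name are illustrative, not taken
from the patch.

```python
from collections import namedtuple

# Assumed order: the eleven disk stat fields, then the proposed extra field.
FIELDS = (
    "read_ios", "read_merges", "read_sectors", "read_ticks",
    "write_ios", "write_merges", "write_sectors", "write_ticks",
    "in_flight", "io_ticks", "time_in_queue",
    "other_ios",
)
TapeStat = namedtuple("TapeStat", FIELDS)


def parse_tape_stat(text):
    """Parse the whitespace-separated counters of one tape stat file."""
    values = [int(v) for v in text.split()]
    if len(values) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} fields, got {len(values)}")
    return TapeStat(*values)
```

A caller could then check, for example, that in_flight is 0 or 1 and that both
merge counters are 0, as described above.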

With some future patches to iostat this figure will be reported and used to 
calculate an average wait for all I/Os (a_await and oio/s in this output):

tape:    wr/s  KiB_write/s    rd/s  KiB_read/s  r_await  w_await  a_await  oio/s
st0    186.50        46.75    0.00        0.00    0.000    0.276    0.276   0.00
st1    186.00        93.00    0.00        0.00    0.000    0.180    0.180   0.00
st2      0.00         0.00  181.50       45.50    0.347    0.000    0.347   0.00
st3      0.00         0.00  183.00       45.75    0.224    0.000    0.224   0.00
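
As a rough illustration of where such columns could come from, the sketch
below derives rates and average waits from two samples of the counters. It is
not the planned iostat change: the dictionary keys are hypothetical names, and
a_await is assumed to be the combined read and write wait time divided by the
combined number of reads and writes, since the stat file only carries a count
(no timing) for other I/Os.

```python
def tape_rates(prev, curr, interval_s):
    """Derive iostat-style figures from two counter samples.

    prev and curr are dicts with the keys read_ios, read_ticks,
    write_ios, write_ticks and other_ios; interval_s is the time
    between the samples in seconds.
    """
    keys = ("read_ios", "read_ticks", "write_ios", "write_ticks",
            "other_ios")
    delta = {k: curr[k] - prev[k] for k in keys}

    def average(ticks, ios):
        # Avoid dividing by zero when nothing completed in the interval.
        return ticks / ios if ios else 0.0

    reads, writes = delta["read_ios"], delta["write_ios"]
    return {
        "rd/s": reads / interval_s,
        "wr/s": writes / interval_s,
        "r_await": average(delta["read_ticks"], reads),
        "w_await": average(delta["write_ticks"], writes),
        # Assumption: average over reads and writes together.
        "a_await": average(delta["read_ticks"] + delta["write_ticks"],
                           reads + writes),
        "oio/s": delta["other_ios"] / interval_s,
    }
```

The waits come out in whatever unit the tick counters use. For a write-only
interval, a_await equals w_await, which matches the st0 and st1 rows above.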

Q: Does anyone have strong objections to extending the stat format to include
another field (a count of SCSI commands issued to the target other than reads
or writes)? Or should the format stay in common with disks, with a new device
class specific file created to provide extra statistics that may be useful only
for a specific class of SCSI device, for example one called stat-tape, stat-st
or something else?

On to justification: we have a customer using virtual tape libraries (lots of
drives) who wanted to be able to monitor the activity and performance of
their backups. Because of the lack of functionality they resorted to using a
publicly available SystemTap script (created by Red Hat, presumably when they
received similar requests from other customers):

http://sourceware.org/systemtap/wiki/WSiostatSCSI

Unfortunately, using this script occasionally results in kernel panics on
older kernels. Those issues have been addressed, but most customers still
don't run the SystemTap script unless they have to, and they still want to
monitor the performance of their tape drives.

Just googling "linux tape throughput statistics" is enough to yield many hits
on the topic, including these:

1. http://www.ibm.com/developerworks/forums/thread.jspa?messageID=14775056
2. http://h30499.www3.hp.com/t5/System-Administration/How-to-get-tape-drive-performance-stats/td-p/3880235#.UKoJxNGloUo
3. http://docs.oracle.com/cd/E19455-01/816-3319/6m9k06r58/index.html

The first two ask about getting tape stats on Linux. The reply to 1. is that
you can get the information on AIX; the reply to 2. is similar, except that
you can get the information on HP-UX 11.31. The last one is the iostat manual
page for Solaris, which can report tape stats as well. Together, the three show
that iostat can print tape statistics on the largest of the commercial Unix
operating systems.

Q: Does anyone have any general feedback about things that need to change, or
about changes to the implementation that would be required before it could be
accepted?

The checkpatch.pl script generates warnings for the diffs because of
CamelCase; however, the CamelCase warnings are there because I wanted to stay
consistent with the module (look for things like STp).

Signed-off-by: Shane Seymour 
Signed-off-by: Darren Lavender 
Tested-by: Shane Seymour 
Tested-by: Darren Lavender 
---
diff -uprN -X linux-3.8-rc6-vanilla/Documentation/dontdiff linux-3.8-rc6-vanilla/drivers/scsi/st.c linux-3.8-rc6/drivers/scsi/st.c
--- linux-3.8-rc6-vanilla/drivers/scsi/st.c 2013-02-08 14:35:27.0 +
+++ linux-3.8-rc6/drivers/scsi/st.c 2013-02-22 00:06:50.0 +
@@ -174,6 +174,9 @@ static int debugging = DEBUG;
 stat

Re: Issue with mini-SaS to eSATA to USB 3.0 setup

2013-02-21 Thread Sarah Sharp
On Thu, Feb 21, 2013 at 05:27:00PM -0300, Fabio David wrote:
> On Thu, Feb 21, 2013 at 4:26 PM, Sarah Sharp wrote:
> > On Tue, Jan 29, 2013 at 12:56:02PM -0200, Fabio David wrote:
> > > Do you have any suggestions?
> >
> > A couple possible root causes come to mind:
> >
> > 1. Perhaps the USB 3.0 hub is interfering with communication to your
> > eSATA to USB 3.0 adapters.
> >
> > 2. Maybe USB device suspend is to blame.  Do you have USB device suspend
> > enabled for the eSATA to USB adapters?
> 
> I am not sure, I thought it was disabled by default. How can I check?

It is disabled by default.  I just wanted to make sure an installed udev
script wasn't enabling auto-suspend.

You can check whether auto-suspend is enabled by running powertop and
looking for the lines that correspond to the USB 3.0 to eSATA adapters.
If they say 'Bad', device suspend is disabled.  If they say 'Good',
device suspend is enabled.

Or you can find the power/control entries for the devices in
/sys/bus/usb/devices/ and make sure they say 'on' rather than 'auto'.

E.g.

sarah@xanatos:~$ lsusb
Bus 001 Device 002: ID 050d:0413 Belkin Components 
Bus 003 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
Bus 004 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 003: ID 045e:0750 Microsoft Corp. Wired Keyboard 600
Bus 001 Device 004: ID 046d:c018 Logitech, Inc. Optical Wheel Mouse
Bus 003 Device 004: ID 04f2:b2ea Chicony Electronics Co., Ltd 
sarah@xanatos:~$ lsusb -t
/:  Bus 04.Port 1: Dev 1, Class=root_hub, Driver=ehci_hcd/3p, 480M
|__ Port 1: Dev 2, If 0, Class=hub, Driver=hub/8p, 480M
/:  Bus 03.Port 1: Dev 1, Class=root_hub, Driver=ehci_hcd/3p, 480M
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 5000M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/4p, 480M
|__ Port 1: Dev 2, If 0, Class=hub, Driver=hub/4p, 480M
|__ Port 3: Dev 3, If 0, Class=HID, Driver=usbhid, 1.5M
|__ Port 3: Dev 3, If 1, Class=HID, Driver=usbhid, 1.5M
|__ Port 4: Dev 4, If 0, Class=HID, Driver=usbhid, 1.5M
sarah@xanatos:~$ cd /sys/bus/usb/devices/
sarah@xanatos:/sys/bus/usb/devices$ ls
1-0:1.0  1-1  1-1:1.0  1-1.3  1-1.3:1.0  1-1.3:1.1  1-1.4  1-1.4:1.0  2-0:1.0  
3-0:1.0  3-1  3-1:1.0  3-1.6  3-1.6:1.0  3-1.6:1.1  4-0:1.0  4-1  4-1:1.0  usb1 
 usb2  usb3  usb4
sarah@xanatos:/sys/bus/usb/devices$ cat 1-1.4/idVendor 
046d
sarah@xanatos:/sys/bus/usb/devices$ cat 1-1.4/power/control 
on
sarah@xanatos:/sys/bus/usb/devices$ 

That means my USB mouse is 'on', so device auto-suspend is disabled.
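
The manual check above can also be scripted. This is only a sketch: it relies
on the power/control attribute shown in the transcript, while the function
name and the devices_dir parameter are made up for illustration (the parameter
lets it run against a test directory).

```python
from pathlib import Path


def usb_power_control(devices_dir="/sys/bus/usb/devices"):
    """Return {device name: contents of power/control} for USB devices.

    Entries without a power/control file (interfaces such as 1-1:1.0,
    for example) are skipped.
    """
    states = {}
    for entry in sorted(Path(devices_dir).iterdir()):
        control = entry / "power" / "control"
        if control.is_file():
            states[entry.name] = control.read_text().strip()
    return states


def autosuspend_enabled(states):
    """List the devices whose control value is 'auto'."""
    return [name for name, value in states.items() if value == "auto"]
```

An empty list from autosuspend_enabled() would mean every device it found is
set to 'on', i.e. auto-suspend is disabled for all of them.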

Sarah Sharp
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [usb-storage] Re: Issue with mini-SaS to eSATA to USB 3.0 setup

2013-02-21 Thread Vojtech Pavlik
On Thu, Feb 21, 2013 at 03:48:42PM -0500, Douglas Gilbert wrote:
> On 13-02-21 02:26 PM, Sarah Sharp wrote:
> >Cc-ing the SCSI and USB storage list.
> >
> >Folks, does the attached picture look like a sane setup?  I've never
> >used mini-SaS to eSATA adapter before, let alone with four eSATA to USB
> >3.0 adapters.
> 
> Well SAS to eSATA is okay (works for me: LSI SAS9212-4i4e HBA
> via a SATA to eSATA cable to a SATA disk caddy with an eSATA
> port).

This seems to be all just SATA signalling, no SAS involved at all, just
the physical shape of the connector is miniSAS.

-- 
Vojtech Pavlik
Director SuSE Labs


Re: Read I/O starvation with writeback RAID controller

2013-02-21 Thread Nicholas A. Bellinger
Hi Martin,

On Thu, 2013-02-21 at 12:43 +0100, Martin Svec wrote:
> I'm sorry, I forgot to mention hardware details. It isn't aacraid, it
> is megaraid-based Dell PERC H700 w/ 1GB NVRAM and 12x 450GB 15k SAS
> drives in RAID-10. All in Dell R510 server.
> 

Jan Engelhardt (CC'ed) mentioned the currently out-of-tree ROW scheduler
worked for him:

https://lkml.org/lkml/2012/12/11/534

Perhaps this would be worth a shot..?

--nab

> Thanks,
> 
> Martin
> 
> Dne 20.2.2013 21:48, Nicholas A. Bellinger napsal(a):
> > Hi Martin,
> >
> > CC'ing linux-scsi here, as aacraid doesn't have an official maintainer
> > atm.
> >
> > --nab
> >
> > On Wed, 2013-02-20 at 16:38 +0100, Martin Svec wrote:
> >> Hello,
> >>
> >> I've noticed read I/O starvation problems of LIO iSCSI target when
> >> used on top of writeback-enabled HW RAID controller (PERC H700 with
> >> 1GB cache). For intensive mixed read-write workload in virtualized
> >> environments, writes are able to consume over 95% of the IOPS
> >> throughput and cause starvation of reads.
> >>
> >> After a number of tests it seems to me it's a general issue of block
> >> layer I/O scheduling when running on top of a writeback device. If
> >> there is a write-intensive task, all writes go to the writeback cache
> >> with near-zero latency. This allows writer to quickly saturate the
> >> device with thousands of writes while using only a minimal fraction of
> >> queue depth. However, non-cached reads depend on spinning drive
> >> latencies which are orders of magnitude higher than writeback cache
> >> latencies, and so readers cannot submit so many requests per second as
> >> writers. Consequently, I guess the controller has totally wrong view
> >> of the incoming workload pattern, tries to satisfy the write flood
> >> first and the net result is unacceptable starvation of reads, with
> >> latencies up to hundreds of milliseconds.
> >>
> >> A simple fio test with 1TiB block device where one thread does 4k
> >> random sync writes with iodepth=32 and one thread does 4k random reads
> >> with iodepth=32 shows that instead of the theoretical 50:50 IOPS
> >> ratio, the block device runs with 95:5 ratio in favor of writes. In
> >> fact, the imbalance is so high that even write iodepth=2 is enough to
> >> achieve the same numbers.
> >>
> >> Real workloads that tend to exhibit this problem are: initial zeroing
> >> of a virtual machine disk, virtual machine migration, virtual machine
> >> cloning, intensive swapping of one virtual machine etc.
> >>
> >> I tried to set WCE=1 on target iblock device, played with queue
> >> depths, tested all three I/O schedulers and their parameters,
> >> controller's parameters, but with no luck. To achieve reasonably good
> >> fairness, the only solution is to set nr_requests to 1 or disable
> >> controller's writeback cache at all -- at the expense of degraded
> >> overall performance :-(
> >>
> >> Regarding nr_requests, there's obvious relation between iodepths and
> >> read starvation: if (nr_requests >= workload iodepth) then starvation
> >> surely occurs. Lowering nr_requests below this threshold slowly starts
> >> improving fairness and for every rd+wr iodepths pair, there exists
> >> sufficiently low nr_requests value at which IOPS ratio is finally
> >> balanced according to rd:wr iodepth ratio. Unfortunately it means
> >> there is no minimal nr_requests value suitable for all workloads. For
> >> iodepths around 2 to 8, only nr_requests=1 provides fair load balancing.
> >>
> >> Is this a known problem? Has anybody found block layer parameters that
> >> eliminate this problem for iscsi-target storage in mixed random
> >> read-write environments like virtualization? Or should I start writing
> >> my own I/O scheduler? ;-)
> >>
> >> Update: I've just found https://lkml.org/lkml/2012/12/10/550 (Read
> >> starvation by sync writes), where Jan Kara describes identical
> >> symptoms. But setting nr_requests=1 doesn't help in my case.
> >> CC'ing LKML too (I'm not LKML subscriber).
> >>
> >> Thanks,
> >>
> >> Martin
> >>
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe target-devel" in
> >> the body of a message to majord...@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> 
> --
> To unsubscribe from this list: send the line "unsubscribe target-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
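
The latency argument in Martin's quoted report can be put into numbers with a
toy closed-loop model: a thread that always keeps iodepth requests outstanding
completes roughly iodepth / latency requests per second (Little's law). The
latencies below are made-up values chosen only for illustration, not
measurements from the thread, and the model ignores the controller and the
shared queue entirely.

```python
def closed_loop_iops(iodepth, latency_s):
    """Little's law for a closed loop: throughput = concurrency / latency."""
    return iodepth / latency_s


def iops_share(write_depth, write_latency_s, read_depth, read_latency_s):
    """Return the (write, read) fractions of the combined IOPS."""
    write_iops = closed_loop_iops(write_depth, write_latency_s)
    read_iops = closed_loop_iops(read_depth, read_latency_s)
    total = write_iops + read_iops
    return write_iops / total, read_iops / total


# Hypothetical latencies: 0.2 ms for a cached write, 4 ms for a disk read.
write_share, read_share = iops_share(32, 0.0002, 32, 0.004)
print(f"writes {write_share:.1%}, reads {read_share:.1%}")
```

With equal iodepths the split is set by the latency ratio alone, so a 20x
latency gap already gives writes about 95% of the completed requests.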




Re: Issue with mini-SaS to eSATA to USB 3.0 setup

2013-02-21 Thread Douglas Gilbert

On 13-02-21 02:26 PM, Sarah Sharp wrote:
> Cc-ing the SCSI and USB storage list.
>
> Folks, does the attached picture look like a sane setup?  I've never
> used mini-SaS to eSATA adapter before, let alone with four eSATA to USB
> 3.0 adapters.

Well SAS to eSATA is okay (works for me: LSI SAS9212-4i4e HBA
via a SATA to eSATA cable to a SATA disk caddy with an eSATA
port).

eSATA to USB 3.0 adapters sound pretty dodgy, especially when
no mention is made of UAS(P).

Doug Gilbert


> On Tue, Jan 29, 2013 at 12:56:02PM -0200, Fabio David wrote:
>> Hi Sarah,
>>
>> My name is Fabio David and I am from Brazil. I've seen your posts on
>> several forums and read articles about you. I really admire your work.
>>
>> Maybe you can help me. I'm trying to connect a PC running Centos 6.3
>> to a CRU dataport 4-bay storage device. This device only has a miniSaS
>> port.
>>
>> Here is my scenario:
>>
>> - DataCRU device with 4 hot-swapables bays.
>> http://www.cru-inc.com/slideshow.php?dir=//Digital-Cinema//&sel=5
>> - MiniSaS cable connects to the DataCRU device and on the other side
>> there are 4 eSata connectors
>> http://www.elpeus.com/sas-mini-sas/external-mini-sas-cables/sff-8088-to-4-esata/3m-mini-sas-sff-8088-to-4-esata-cable/
>> - 4 eSata<->USB3.0 adaptors connected to each eSata connector
>> - Adaptors connected to a USB3.0 HUB
>> - USB3.0 hub connected to PC
>>
>> Everything works ok, I can mount/read the HDs, but sometimes the
>> system does not detect when a hard drive is inserted/removed from a
>> DataCru bay. No events are generated, nothing appears in
>> /proc/partitions nor udev
>> is called to apply my rules.
>
> Do you lose only hard drive insertion events, or do you lose remove
> events as well?
>
> For example, what happens when you do this:
>
> 1. Unplug the eSATA to USB adapters from the USB 3.0 hub.
> 2. Insert a hard drive into the bay.
> 3. Connect the eSATA to USB adapter to the USB 3.0 hub.
> 4. Wait for hard drive detection, then hot-remove the drive from the
> bay.
>
>> However, everything works fine when connected directly to PC's USB
>> port. Please look at the attached picture.
>
> It looks like you're only attaching one eSATA to USB adapter to the
> roothub.  Do you only have one USB 3.0 port on the host, or can you try
> plugging in multiple eSATA to USB adapters into the roothub?
>
> Does the setup work when only one eSATA to USB adapter is plugged into
> the USB 3.0 hub?
>
>> Do you have any suggestions?
>
> A couple possible root causes come to mind:
>
> 1. Perhaps the USB 3.0 hub is interfering with communication to your
> eSATA to USB 3.0 adapters.
>
> 2. Maybe USB device suspend is to blame.  Do you have USB device suspend
> enabled for the eSATA to USB adapters?
>
> 3. Perhaps the SATA adapters aren't responding with a Medium Changed
> status when the USB storage device is plugged in.
>
> Can you send me dmesg, starting from just before you insert a hard drive
> into the drive bays?  I need dmesg for both when the SATA adapter is
> connected directly to the roothub, and when it's connected to the USB
> 3.0 hub.
>
> A usbmon trace might also be useful for the USB storage developers.
> Documentation on how to take that trace is here:
>
> http://lxr.linux.no/#linux/Documentation/usb/usbmon.txt
>
> Sarah Sharp


>> ===
>>
>> lsusb returns
>>
>> Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
>> Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
>> Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
>> Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
>> Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
>> Bus 001 Device 002: ID 13d3:3323 IMC Networks
>> Bus 001 Device 009: ID 2109:3431  <  HUB 3.0
>> Bus 006 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
>> Bus 007 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
>> Bus 007 Device 040: ID 2109:0810  <  HUB 3.0
>> Bus 007 Device 041: ID 1234:5678 Brain Actuated Technologies
>> Bus 007 Device 042: ID 1234:5678 Brain Actuated Technologies
>> Bus 007 Device 043: ID 1234:5678 Brain Actuated Technologies
>> Bus 007 Device 044: ID 1234:5678 Brain Actuated Technologies
>>
>> /var/log/messages
>>
>> Jan 27 18:00:28 localhost kernel: usb 7-1: New USB device found,
>> idVendor=2109, idProduct=0810
>> Jan 27 18:00:28 localhost kernel: usb 7-1: New USB device strings:
>> Mfr=1, Product=2, SerialNumber=0
>> Jan 27 18:00:28 localhost kernel: usb 7-1: Product: 4-Port USB 3.0 Hub
>> Jan 27 18:00:28 localhost kernel: usb 7-1: Manufacturer: VIA Labs, Inc.
>> Jan 27 18:00:28 localhost kernel: usb 7-1: configuration #1 chosen from 1 choice
>> Jan 27 18:00:28 localhost kernel: hub 7-1:1.0: USB hub found
>> Jan 27 18:00:28 localhost kernel: hub 7-1:1.0: 4 ports detected
>>
>> Jan 28 21:32:02 localhost kernel: usb 7-1.1: new SuperSpeed USB device
>> number 9 using xhci_hcd
>> Jan 28 21:32:56 localhost kernel: xhci_hcd :01:00.0: PCI INT A ->
>> GSI 16 (level, low) -> IRQ 16
>> Jan 28 21:32:56 localhost kernel: xhci_hcd :01:00.0: xHCI Host Controller
>> Jan 28 21:32:56 localhost kernel: xhci_hcd 000

Re: [usb-storage] Issue with mini-SaS to eSATA to USB 3.0 setup

2013-02-21 Thread Matthew Dharm
I highly doubt hot-insert and hot-remove of HDDs from the 4-bay
container (without removing the corresponding USB/eSATA adaptor) will
work.

The USB/eSATA adaptor does not have a way to inform the host that the
eSATA side has been disconnected from the HDD.  That functionality
isn't in the usb-storage protocol.

This type of functionality *might* be supported in the UAS protocol,
but I don't know.

Matt

On Thu, Feb 21, 2013 at 11:26 AM, Sarah Sharp wrote:
> Cc-ing the SCSI and USB storage list.
>
> Folks, does the attached picture look like a sane setup?  I've never
> used mini-SaS to eSATA adapter before, let alone with four eSATA to USB
> 3.0 adapters.
>
> On Tue, Jan 29, 2013 at 12:56:02PM -0200, Fabio David wrote:
>> Hi Sarah,
>>
>> My name is Fabio David and I am from Brazil. I've seen your posts on
>> several forums and read articles about you. I really admire your work.
>>
>> Maybe you can help me. I'm trying to connect a PC running Centos 6.3
>> to a CRU dataport 4-bay storage device. This device only has a miniSaS
>> port.
>>
>> Here is my scenario:
>>
>> - DataCRU device with 4 hot-swapables bays.
>> http://www.cru-inc.com/slideshow.php?dir=//Digital-Cinema//&sel=5
>> - MiniSaS cable connects to the DataCRU device and on the other side
>> there are 4 eSata connectors
>>   
>> http://www.elpeus.com/sas-mini-sas/external-mini-sas-cables/sff-8088-to-4-esata/3m-mini-sas-sff-8088-to-4-esata-cable/
>> - 4 eSata<->USB3.0 adaptors connected to each eSata connector
>> - Adaptors connected to a USB3.0 HUB
>> - USB3.0 hub connected to PC
>>
>> Everything works ok, I can mount/read the HDs, but sometimes the
>> system does not detect when a hard drive is inserted/removed from a
>> DataCru bay. No events are generated, nothing appears in
>> /proc/partitions nor udev
>> is called to apply my rules.
>
> Do you lose only hard drive insertion events, or do you lose remove
> events as well?
>
> For example, what happens when you do this:
>
> 1. Unplug the eSATA to USB adapters from the USB 3.0 hub.
> 2. Insert a hard drive into the bay.
> 3. Connect the eSATA to USB adapter to the USB 3.0 hub.
> 4. Wait for hard drive detection, then hot-remove the drive from the
> bay.
>
>> However, everything works fine when connected directly to PC's USB
>> port. Please look at the attached picture.
>
> It looks like you're only attaching one eSATA to USB adapter to the
> roothub.  Do you only have one USB 3.0 port on the host, or can you try
> plugging in multiple eSATA to USB adapters into the roothub?
>
> Does the setup work when only one eSATA to USB adapter is plugged into
> the USB 3.0 hub?
>
>> Do you have any suggestions?
>
> A couple possible root causes come to mind:
>
> 1. Perhaps the USB 3.0 hub is interfering with communication to your
> eSATA to USB 3.0 adapters.
>
> 2. Maybe USB device suspend is to blame.  Do you have USB device suspend
> enabled for the eSATA to USB adapters?
>
> 3. Perhaps the SATA adapters aren't responding with a Medium Changed
> status when the USB storage device is plugged in.
>
> Can you send me dmesg, starting from just before you insert a hard drive
> into the drive bays?  I need dmesg for both when the SATA adapter is
> connected directly to the roothub, and when it's connected to the USB
> 3.0 hub.
>
> A usbmon trace might also be useful for the USB storage developers.
> Documentation on how to take that trace is here:
>
> http://lxr.linux.no/#linux/Documentation/usb/usbmon.txt
>
> Sarah Sharp
>
>> ===
>>
>> lsusb returns
>>
>> Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
>> Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
>> Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
>> Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
>> Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
>> Bus 001 Device 002: ID 13d3:3323 IMC Networks
>> Bus 001 Device 009: ID 2109:3431  <  HUB 3.0
>> Bus 006 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
>> Bus 007 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
>> Bus 007 Device 040: ID 2109:0810  <  HUB 3.0
>> Bus 007 Device 041: ID 1234:5678 Brain Actuated Technologies
>> Bus 007 Device 042: ID 1234:5678 Brain Actuated Technologies
>> Bus 007 Device 043: ID 1234:5678 Brain Actuated Technologies
>> Bus 007 Device 044: ID 1234:5678 Brain Actuated Technologies
>>
>> /var/log/messages
>> 
>> Jan 27 18:00:28 localhost kernel: usb 7-1: New USB device found,
>> idVendor=2109, idProduct=0810
>> Jan 27 18:00:28 localhost kernel: usb 7-1: New USB device strings:
>> Mfr=1, Product=2, SerialNumber=0
>> Jan 27 18:00:28 localhost kernel: usb 7-1: Product: 4-Port USB 3.0 Hub
>> Jan 27 18:00:28 localhost kernel: usb 7-1: Manufacturer: VIA Labs, Inc.
>> Jan 27 18:00:28 localhost kernel: usb 7-1: configuration #1 chosen from 1 
>> choice
>> Jan 27 18:00:28 localhos

[PATCH RESEND 2/4] scsi: storvsc: avoid usage of WRITE_SAME

2013-02-21 Thread K. Y. Srinivasan
From: Olaf Hering 

Set scsi_device->no_write_same because the host does not support it.
Also blacklist WRITE_SAME to avoid (and log) accidental usage.

If the guest uses the ext4 filesystem, storvsc hangs while it prints
these messages in an endless loop:
...
[  161.459523] hv_storvsc vmbus_0_1: cmd 0x41 scsi status 0x2 srb status 0x6
[  161.462157] sd 2:0:0:0: [sda]
[  161.463135] Sense Key : No Sense [current]
[  161.464983] sd 2:0:0:0: [sda]
[  161.465899] Add. Sense: No additional sense information
[  161.468211] hv_storvsc vmbus_0_1: cmd 0x41 scsi status 0x2 srb status 0x6
[  161.475766] sd 2:0:0:0: [sda]
[  161.476728] Sense Key : No Sense [current]
[  161.478284] sd 2:0:0:0: [sda]
[  161.479441] Add. Sense: No additional sense information
...

This happens with a guest running on Windows Server 2012, but happens to
work while running on Windows Server 2008. WRITE_SAME isn't really
supported by either version, so disable use of the command globally.

Signed-off-by: Olaf Hering 
Cc: KY Srinivasan 
Cc: 
Signed-off-by: K. Y. Srinivasan 
---
 drivers/scsi/storvsc_drv.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 5ada1d0..2060509 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1156,6 +1156,8 @@ static int storvsc_device_configure(struct scsi_device *sdevice)
 
 	blk_queue_bounce_limit(sdevice->request_queue, BLK_BOUNCE_ANY);
 
+	sdevice->no_write_same = 1;
+
 	return 0;
 }
 
@@ -1238,6 +1240,8 @@ static bool storvsc_scsi_cmd_ok(struct scsi_cmnd *scmnd)
 	u8 scsi_op = scmnd->cmnd[0];
 
 	switch (scsi_op) {
+	/* the host does not handle WRITE_SAME, log accidental usage */
+	case WRITE_SAME:
 	/*
 	 * smartd sends this command and the host does not handle
 	 * this. So, don't send it.
-- 
1.7.4.1



[PATCH 4/4] Drivers: scsi: storvsc: Handle dynamic resizing of the device

2013-02-21 Thread K. Y. Srinivasan
Handle LUN size changes by re-scanning the device.

Signed-off-by: K. Y. Srinivasan 
Reviewed-by: Haiyang Zhang 
---
 drivers/scsi/storvsc_drv.c |   31 +++++++++++++++++++++++++++++++
 1 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 16d5aac..16a3a0c 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -201,6 +201,7 @@ enum storvsc_request_type {
 #define SRB_STATUS_AUTOSENSE_VALID	0x80
 #define SRB_STATUS_INVALID_LUN	0x20
 #define SRB_STATUS_SUCCESS	0x01
+#define SRB_STATUS_ABORTED	0x02
 #define SRB_STATUS_ERROR	0x04
 
 /*
@@ -295,6 +296,25 @@ struct storvsc_scan_work {
 	uint lun;
 };
 
+static void storvsc_device_scan(struct work_struct *work)
+{
+	struct storvsc_scan_work *wrk;
+	uint lun;
+	struct scsi_device *sdev;
+
+	wrk = container_of(work, struct storvsc_scan_work, work);
+	lun = wrk->lun;
+
+	sdev = scsi_device_lookup(wrk->host, 0, 0, lun);
+	if (!sdev)
+		goto done;
+	scsi_rescan_device(&sdev->sdev_gendev);
+	scsi_device_put(sdev);
+
+done:
+	kfree(wrk);
+}
+
 static void storvsc_bus_scan(struct work_struct *work)
 {
 	struct storvsc_scan_work *wrk;
@@ -791,7 +811,18 @@ static void storvsc_handle_error(struct vmscsi_request *vm_srb,
 		do_work = true;
 		process_err_fn = storvsc_remove_lun;
 		break;
+	case (SRB_STATUS_ABORTED | SRB_STATUS_AUTOSENSE_VALID):
+		if ((asc == 0x2a) && (ascq == 0x9)) {
+			do_work = true;
+			process_err_fn = storvsc_device_scan;
+			/*
+			 * Retry the I/O that triggered this.
+			 */
+			set_host_byte(scmnd, DID_REQUEUE);
+		}
+		break;
 	}
+
 	if (!do_work)
 		return;
 
 
-- 
1.7.4.1
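
For readers following the control flow, here is a simplified user-space model
of the dispatch that storvsc_handle_error performs after this patch. The
constant values and the 0x2a/0x09 sense pair come from the diff above; the
function name and the returned action strings are invented labels, and the
real driver sets host bytes and schedules work items rather than returning
strings. ASC 0x2a with ASCQ 0x09 is listed in SPC as "capacity data has
changed".

```python
SRB_STATUS_SUCCESS = 0x01
SRB_STATUS_ABORTED = 0x02
SRB_STATUS_ERROR = 0x04
SRB_STATUS_INVALID_LUN = 0x20
SRB_STATUS_AUTOSENSE_VALID = 0x80

# Additional sense code / qualifier the patch treats as a resize.
ASC_PARAMETERS_CHANGED = 0x2A
ASCQ_CAPACITY_CHANGED = 0x09


def classify_srb_status(srb_status, asc=0, ascq=0):
    """Return a label for the action taken for a completed request."""
    if srb_status == SRB_STATUS_ERROR:
        return "fail-command"
    if srb_status == SRB_STATUS_INVALID_LUN:
        return "remove-lun"
    if srb_status == (SRB_STATUS_ABORTED | SRB_STATUS_AUTOSENSE_VALID):
        if (asc, ascq) == (ASC_PARAMETERS_CHANGED, ASCQ_CAPACITY_CHANGED):
            # The driver rescans the device and requeues the I/O.
            return "rescan-and-requeue"
    return "no-action"
```

Only the exact combination of aborted status, valid autosense and the 0x2a/0x09
pair leads to a rescan; any other sense data on an aborted command falls
through without scheduling work.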



[PATCH RESEND 1/4] Drivers: scsi: storvsc: Initialize the sglist

2013-02-21 Thread K. Y. Srinivasan
Initialize sglist before using it.

Signed-off-by: K. Y. Srinivasan 
Cc: 
---
 drivers/scsi/storvsc_drv.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 270b3cf..5ada1d0 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -467,6 +467,7 @@ static struct scatterlist *create_bounce_buffer(struct scatterlist *sgl,
 	if (!bounce_sgl)
 		return NULL;
 
+	sg_init_table(bounce_sgl, num_pages);
 	for (i = 0; i < num_pages; i++) {
 		page_buf = alloc_page(GFP_ATOMIC);
 		if (!page_buf)
-- 
1.7.4.1



[PATCH 3/4] Drivers: scsi: storvsc: Restructure error handling code on command completion

2013-02-21 Thread K. Y. Srinivasan
In preparation for handling additional sense codes, restructure and cleanup
the error handling code in the command completion code path.

Signed-off-by: K. Y. Srinivasan 
Reviewed-by: Haiyang Zhang 
---
 drivers/scsi/storvsc_drv.c |  101 +--
 1 files changed, 59 insertions(+), 42 deletions(-)

diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index 2060509..16d5aac 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -761,6 +761,55 @@ cleanup:
 	return ret;
 }
 
+static void storvsc_handle_error(struct vmscsi_request *vm_srb,
+				struct scsi_cmnd *scmnd,
+				struct Scsi_Host *host,
+				u8 asc, u8 ascq)
+{
+	struct storvsc_scan_work *wrk;
+	void (*process_err_fn)(struct work_struct *work);
+	bool do_work = false;
+
+	switch (vm_srb->srb_status) {
+	case SRB_STATUS_ERROR:
+		/*
+		 * If there is an error; offline the device since all
+		 * error recovery strategies would have already been
+		 * deployed on the host side. However, if the command
+		 * were a pass-through command deal with it appropriately.
+		 */
+		switch (scmnd->cmnd[0]) {
+		case ATA_16:
+		case ATA_12:
+			set_host_byte(scmnd, DID_PASSTHROUGH);
+			break;
+		default:
+			set_host_byte(scmnd, DID_TARGET_FAILURE);
+		}
+		break;
+	case SRB_STATUS_INVALID_LUN:
+		do_work = true;
+		process_err_fn = storvsc_remove_lun;
+		break;
+	}
+	if (!do_work)
+		return;
+
+	/*
+	 * We need to schedule work to process this error; schedule it.
+	 */
+	wrk = kmalloc(sizeof(struct storvsc_scan_work), GFP_ATOMIC);
+	if (!wrk) {
+		set_host_byte(scmnd, DID_TARGET_FAILURE);
+		return;
+	}
+
+	wrk->host = host;
+	wrk->lun = vm_srb->lun;
+	INIT_WORK(&wrk->work, process_err_fn);
+	schedule_work(&wrk->work);
+}
+
 
 static void storvsc_command_completion(struct storvsc_cmd_request *cmd_request)
 {
@@ -769,8 +818,13 @@ static void storvsc_command_completion(struct storvsc_cmd_request *cmd_request)
 	void (*scsi_done_fn)(struct scsi_cmnd *);
 	struct scsi_sense_hdr sense_hdr;
 	struct vmscsi_request *vm_srb;
-	struct storvsc_scan_work *wrk;
 	struct stor_mem_pools *memp = scmnd->device->hostdata;
+	struct Scsi_Host *host;
+	struct storvsc_device *stor_dev;
+	struct hv_device *dev = host_dev->dev;
+
+	stor_dev = get_in_stor_device(dev);
+	host = stor_dev->host;
 
 	vm_srb = &cmd_request->vstor_packet.vm_srb;
 	if (cmd_request->bounce_sgl_count) {
@@ -783,55 +837,18 @@ static void storvsc_command_completion(struct storvsc_cmd_request *cmd_request)
 					cmd_request->bounce_sgl_count);
 	}
 
-	/*
-	 * If there is an error; offline the device since all
-	 * error recovery strategies would have already been
-	 * deployed on the host side. However, if the command
-	 * were a pass-through command deal with it appropriately.
-	 */
 	scmnd->result = vm_srb->scsi_status;
 
-	if (vm_srb->srb_status == SRB_STATUS_ERROR) {
-		switch (scmnd->cmnd[0]) {
-		case ATA_16:
-		case ATA_12:
-			set_host_byte(scmnd, DID_PASSTHROUGH);
-			break;
-		default:
-			set_host_byte(scmnd, DID_TARGET_FAILURE);
-		}
-	}
-
-
-	/*
-	 * If the LUN is invalid; remove the device.
-	 */
-	if (vm_srb->srb_status == SRB_STATUS_INVALID_LUN) {
-		struct storvsc_device *stor_dev;
-		struct hv_device *dev = host_dev->dev;
-		struct Scsi_Host *host;
-
-		stor_dev = get_in_stor_device(dev);
-		host = stor_dev->host;
-
-		wrk = kmalloc(sizeof(struct storvsc_scan_work),
-				GFP_ATOMIC);
-		if (!wrk) {
-			scmnd->result = DID_TARGET_FAILURE << 16;
-		} else {
-			wrk->host = host;
-			wrk->lun = vm_srb->lun;
-			INIT_WORK(&wrk->work, storvsc_remove_lun);
-			schedule_work(&wrk->work);
-		}
-	}
-
 	if (scmnd->result) {
 		if (scsi_normalize_sense(scmnd->sense_buffer,
 				SCSI_SENSE_BUFFERSIZE, &sense_hdr))
 			scsi_print_sense_hdr("storvsc", &sense_hdr);
 	}
 
+	if (vm_srb->srb_status != SRB_STATUS_SUCCESS)
+ 

[PATCH 0/4] Drivers: scsi: storvsc

2013-02-21 Thread K. Y. Srinivasan
This patch set (two of the patches are being resent) fixes and enhances
the functionality of the Hyper-V storage driver.

K. Y. Srinivasan (3):
  Drivers: scsi: storvsc: Initialize the sglist
  Drivers: scsi: storvsc: Restructure error handling code on command
completion
  Drivers: scsi: storvsc: Handle dynamic resizing of the device

Olaf Hering (1):
  scsi: storvsc: avoid usage of WRITE_SAME

 drivers/scsi/storvsc_drv.c |  137 ++-
 1 files changed, 95 insertions(+), 42 deletions(-)

-- 
1.7.4.1

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/4] scsi: 64-bit LUN support

2013-02-21 Thread James Bottomley
On Thu, 2013-02-21 at 16:15 +0000, Elliott, Robert (Server Storage)
wrote:
> Regarding changes like this:
> - printk(MYIOC_s_NOTE_FMT "[%d:%d:%d:%d] "
> + printk(MYIOC_s_NOTE_FMT "[%d:%d:%d:%llu] "
>   "FCP_ResponseInfo=%08xh\n", ioc->name,
>   sc->device->host->host_no, sc->device->channel,
>   sc->device->id, sc->device->lun,
> 
> It might be preferable to print the LUN values in hex rather than
> decimal, particularly if they are large values.  SAM-5 includes some
> guidance for displaying LUNs, shown below. 

We can't really change from decimal to hex without causing confusion and
possibly breaking ABIs.  All the existing SCSI references look like
h:c:t:l and all expect l to be a simple decimal.  It's not just in the
logs, we have active use of this form in all the /sys/class/scsi_*/
directories and some tools may parse this value.

> One important goal is to match the format, if any, that the user must
> use in a configuration file or command line argument, so
> cutting-and-pasting the LUN value works.  So, the answer might differ
> for prints from different drivers.  If a driver expects decimal input
> values, then print decimal.
> 
> SAM-5 excerpt:
> 4.7.2 Logical unit representation format
[...]

We're a bit bound by kernel convention here as well.  To retain
compatibility with SPI and flat addressing schemes, we really need to
show the 8 and 16 bit flat addresses as simple decimal numerics.
However, we *might* be free to move to a more hierarchical scheme with
the multi-level LUNs, since I don't think there are too many people who've
got arrays that output them (yet).

James



RE: [PATCH 0/4] scsi: 64-bit LUN support

2013-02-21 Thread Elliott, Robert (Server Storage)
Regarding changes like this:
-   printk(MYIOC_s_NOTE_FMT "[%d:%d:%d:%d] "
+   printk(MYIOC_s_NOTE_FMT "[%d:%d:%d:%llu] "
"FCP_ResponseInfo=%08xh\n", ioc->name,
sc->device->host->host_no, sc->device->channel,
sc->device->id, sc->device->lun,

It might be preferable to print the LUN values in hex rather than decimal, 
particularly if they are large values.  SAM-5 includes some guidance for 
displaying LUNs, shown below. 

One important goal is to match the format, if any, that the user must use in a 
configuration file or command line argument, so cutting-and-pasting the LUN 
value works.  So, the answer might differ for prints from different drivers.  
If a driver expects decimal input values, then print decimal.

SAM-5 excerpt:
4.7.2 Logical unit representation format
When an application client displays or otherwise makes a 64-bit LUN value 
visible, the application client should
display it in hexadecimal format with byte 0 first (i.e., on the left) and byte 
7 last (i.e., on the right), regardless of
the internal representation of the LUN value (e.g., a single level LUN with an 
ADDRESS METHOD field set to 01b
(i.e., flat space addressing) and a FLAT SPACE LUN field set to 0001h should be 
displayed as 40 01 00 00 00 00 00
00h, not 00 00 00 00 00 00 01 40h). A separator (e.g., space, dash, or colon) 
may be included between each
byte, each two bytes (e.g., 4001-0000-0000-0000h), or each four bytes (e.g., 
40010000 00000000h).

[The trailing h is just the T10 documentation convention... a 0x prefix is fine 
too]
[The next three paragraphs allow stripping off unnecessary trailing zeros:]

When displaying a single level LUN structure using the peripheral device 
addressing method (see table 11) or a
single level LUN structure using the flat space addressing method (see table 
12), an application client may
display the value as a single 2-byte value representing only the first level 
LUN (e.g., 40 01h). A separator (e.g.,
space, dash, or colon) may be included between each byte.

When displaying a single level LUN structure using the extended flat space 
addressing method (see table 13), an
application client may display the value as a single 4-byte value representing 
only the first level LUN (e.g., D2 00
00 01h). A separator (e.g., space, dash, or colon) may be included between each 
byte, or between each two
bytes (e.g., D200 0001h).

When displaying a single level LUN structure using the long extended flat space 
addressing method (see table
14), an application client may display the value as a single 6-byte value 
representing only the first level LUN
(e.g., E2 00 00 01 00 01h). A separator (e.g. space, dash, or colon) may be 
included between each byte, or
between each two bytes (e.g., E200 0001 0001h).

When displaying a 16-bit LUN value, an application client should display the 
value as a single 2-byte value (e.g.,
40 01h). A separator (e.g., space, dash, or colon) may be included between each 
byte.
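
As an illustration of the display rules in the excerpt above, here is a
minimal userspace sketch in C. It is not taken from any kernel patch. The
helper names format_lun_hex() and format_lun_short() are hypothetical, and
the short form assumes that a single level LUN is one whose bytes 2 to 7 are
all zero and that the ADDRESS METHOD field is the top two bits of byte 0.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Full form: all 8 bytes in hex, byte 0 on the left, a dash between each
 * two bytes and a trailing 'h' (the T10 documentation convention).
 * Returns the string length, or -1 if buf is too small.
 */
static int format_lun_hex(const uint8_t lun[8], char *buf, size_t len)
{
	int n;

	assert(lun != NULL && buf != NULL);
	n = snprintf(buf, len, "%02X%02X-%02X%02X-%02X%02X-%02X%02Xh",
		     lun[0], lun[1], lun[2], lun[3],
		     lun[4], lun[5], lun[6], lun[7]);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Short form: for peripheral device (00b) or flat space (01b) addressing
 * with nothing beyond the first level, show only the first 2 bytes.
 * Anything else falls back to the full 8-byte form.
 */
static int format_lun_short(const uint8_t lun[8], char *buf, size_t len)
{
	static const uint8_t zero[6];
	unsigned int method = lun[0] >> 6;	/* ADDRESS METHOD (assumed layout) */
	int n;

	if (method > 1 || memcmp(lun + 2, zero, sizeof(zero)) != 0)
		return format_lun_hex(lun, buf, len);

	n = snprintf(buf, len, "%02X%02Xh", lun[0], lun[1]);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

For the flat space example from the excerpt (bytes 40 01 00 00 00 00 00 00),
format_lun_hex() yields "4001-0000-0000-0000h" and format_lun_short() yields
"4001h". Whether a driver should print either form, rather than the decimal
value users already type, is exactly the open question in this thread.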


> -----Original Message-----
> From: Hannes Reinecke [mailto:h...@suse.de]
> Sent: Tuesday, 19 February, 2013 2:18 AM
> To: linux-scsi@vger.kernel.org
> Cc: James Bottomley; Jeremy Linton; Elliott, Robert (Server Storage); Bart Van
> Assche; Hannes Reinecke
> Subject: [PATCH 0/4] scsi: 64-bit LUN support
> 
> This patchset updates the SCSI midlayer to use 64-bit LUNs internally.
> It eliminates the need to limit the number of LUNs artificially to
> avoid aliasing issues; the SCSI midlayer can now accept any LUN presented
> to it.
> 
> The LLDD specific settings for 'max_lun' have been left untouched;
> it should be raised to '~0' if the HBA supports 64-bit LUNs internally.
> However, it is up to the driver maintainer to raise that limit.




[PATCH] [SCSI]: megaraid: avoid sleeping on spinlock

2013-02-21 Thread Denis Efremov
GFP_KERNEL may cause pci_pool_alloc() to sleep while pool->lock is held,
so we need to use GFP_ATOMIC instead of GFP_KERNEL.

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Denis Efremov 
---
 drivers/scsi/megaraid/megaraid_mm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/megaraid/megaraid_mm.c 
b/drivers/scsi/megaraid/megaraid_mm.c
index 25506c7..4b2f336 100644
--- a/drivers/scsi/megaraid/megaraid_mm.c
+++ b/drivers/scsi/megaraid/megaraid_mm.c
@@ -568,7 +568,7 @@ mraid_mm_attach_buf(mraid_mmadp_t *adp, uioc_t *kioc, int 
xferlen)
 
kioc->pool_index= right_pool;
kioc->free_buf  = 1;
-   kioc->buf_vaddr = pci_pool_alloc(pool->handle, GFP_KERNEL,
+   kioc->buf_vaddr = pci_pool_alloc(pool->handle, GFP_ATOMIC,
&kioc->buf_paddr);
spin_unlock_irqrestore(&pool->lock, flags);
 
-- 
1.8.1.2



Re: Read I/O starvation with writeback RAID controller

2013-02-21 Thread Martin Svec
I'm sorry, I forgot to mention hardware details. It isn't aacraid, it
is megaraid-based Dell PERC H700 w/ 1GB NVRAM and 12x 450GB 15k SAS
drives in RAID-10. All in Dell R510 server.

Thanks,

Martin

Dne 20.2.2013 21:48, Nicholas A. Bellinger napsal(a):
> Hi Martin,
>
> CC'ing linux-scsi here, as aacraid doesn't have an official maintainer
> atm.
>
> --nab
>
> On Wed, 2013-02-20 at 16:38 +0100, Martin Svec wrote:
>> Hello,
>>
>> I've noticed read I/O starvation problems of LIO iSCSI target when
>> used on top of writeback-enabled HW RAID controller (PERC H700 with
>> 1GB cache). For intensive mixed read-write workload in virtualized
>> environments, writes are able to consume over 95% of the IOPS
>> throughput and cause starvation of reads.
>>
>> After a number of tests it seems to me it's a general issue of block
>> layer I/O scheduling when running on top of a writeback device. If
>> there is a write-intensive task, all writes go to the writeback cache
>> with near-zero latency. This allows a writer to quickly saturate the
>> device with thousands of writes while using only a minimal fraction of
>> queue depth. However, non-cached reads depend on spinning drive
>> latencies which are orders of magnitude higher than writeback cache
>> latencies, and so readers cannot submit as many requests per second as
>> writers. Consequently, I guess the controller has a totally wrong view
>> of the incoming workload pattern, tries to satisfy the write flood
>> first and the net result is unacceptable starvation of reads, with
>> latencies up to hundreds of milliseconds.
>>
>> A simple fio test with 1TiB block device where one thread does 4k
>> random sync writes with iodepth=32 and one thread does 4k random reads
>> with iodepth=32 shows that instead of the theoretical 50:50 IOPS
>> ratio, the block device runs with 95:5 ratio in favor of writes. In
>> fact, the imbalance is so high that even write iodepth=2 is enough to
>> achieve the same numbers.
>>
>> Real workloads that tend to exhibit this problem are: initial zeroing
>> of a virtual machine disk, virtual machine migration, virtual machine
>> cloning, intensive swapping of one virtual machine etc.
>>
>> I tried to set WCE=1 on target iblock device, played with queue
>> depths, tested all three I/O schedulers and their parameters,
>> controller's parameters, but with no luck. To achieve reasonably good
>> fairness, the only solution is to set nr_requests to 1 or disable the
>> controller's writeback cache altogether -- at the expense of degraded
>> overall performance :-(
>>
>> Regarding nr_requests, there's an obvious relation between iodepths and
>> read starvation: if (nr_requests >= workload iodepth) then starvation
>> surely occurs. Lowering nr_requests below this threshold slowly starts
>> improving fairness and for every rd+wr iodepths pair, there exists a
>> sufficiently low nr_requests value at which the IOPS ratio is finally
>> balanced according to the rd:wr iodepth ratio. Unfortunately it means
>> there is no minimal nr_requests value suitable for all workloads. For
>> iodepths around 2 to 8, only nr_requests=1 provides fair load balancing.
>>
>> Is this a known problem? Has anybody found block layer parameters that
>> eliminate this problem for iscsi-target storage in mixed random
>> read-write environments like virtualization? Or should I start writing
>> my own I/O scheduler? ;-)
>>
>> Update: I've just found https://lkml.org/lkml/2012/12/10/550 (Read
>> starvation by sync writes), where Jan Kara describes identical
>> symptoms. But setting nr_requests=1 doesn't help in my case.
>> CC'ing LKML too (I'm not LKML subscriber).
>>
>> Thanks,
>>
>> Martin
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe target-devel" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>



[PATCH v2] [SCSI] aacraid: suppress two GCC warnings

2013-02-21 Thread Paul Bolle
Building src.o for a 32 bit system triggers two GCC warnings:
drivers/scsi/aacraid/src.c: In function ‘aac_src_deliver_message’:
drivers/scsi/aacraid/src.c:410:3: warning: right shift count >= width of 
type [enabled by default]
drivers/scsi/aacraid/src.c:434:2: warning: right shift count >= width of 
type [enabled by default]

These warnings are caused by a right shift of 32. Use upper_32_bits() to
suppress them.

Signed-off-by: Paul Bolle 
---
0) Instead of a cast to u64, this version uses upper_32_bits() as James
suggested. I also stopped changing 0L to 0UL, because I keep having
doubts about the cargo cult.

1) Still compile tested only, but now on v3.8.

 drivers/scsi/aacraid/src.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/aacraid/src.c b/drivers/scsi/aacraid/src.c
index 3b021ec..e2e3492 100644
--- a/drivers/scsi/aacraid/src.c
+++ b/drivers/scsi/aacraid/src.c
@@ -407,7 +407,7 @@ static int aac_src_deliver_message(struct fib *fib)
fib->hw_fib_va->header.StructType = FIB_MAGIC2;
fib->hw_fib_va->header.SenderFibAddress = (u32)address;
fib->hw_fib_va->header.u.TimeStamp = 0;
-   BUG_ON((u32)(address >> 32) != 0L);
+   BUG_ON(upper_32_bits(address) != 0L);
address |= fibsize;
} else {
/* Calculate the amount to the fibsize bits */
@@ -431,7 +431,7 @@ static int aac_src_deliver_message(struct fib *fib)
address |= fibsize;
}
 
-   src_writel(dev, MUnit.IQ_H, (address >> 32) & 0xffffffff);
+   src_writel(dev, MUnit.IQ_H, upper_32_bits(address) & 0xffffffff);
src_writel(dev, MUnit.IQ_L, address & 0xffffffff);
 
return 0;
-- 
1.8.1.2



Re: [PATCH] SCSI: amd_iommu dma_boundary overflow

2013-02-21 Thread Joerg Roedel
Hi Eddie,

> On Tue, 2013-02-19 at 18:30 -0800, Eddie Wai wrote:
> > The code seems correct as it makes sense to impose the same hardware
> > segment boundary limit on both the blk queue and the DMA code.  It would
> > be an easy alternative to simply prevent the shost->dma_boundary from
> > being set to DMA_BIT_MASK(64), but it seems more correct to fix the
> > amd_iommu code itself to detect and handle this max 64-bit mask condition.

Thanks for tracking this problem down. It turns out that this code does
not only exist in the AMD IOMMU driver but also in other ones (Calgary
and GART at least, haven't checked all).

> > --- a/drivers/iommu/amd_iommu.c
> > +++ b/drivers/iommu/amd_iommu.c
> > @@ -1526,11 +1526,14 @@ static unsigned long dma_ops_area_alloc(struct 
> > device *dev,
> > unsigned long boundary_size;
> > unsigned long address = -1;
> > unsigned long limit;
> > +   unsigned long mask;
> >  
> > next_bit >>= PAGE_SHIFT;
> >  
> > -   boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
> > -   PAGE_SIZE) >> PAGE_SHIFT;

Given that there is a BUG_ON() in the iommu-helpers which checks for
!is_power_of_2(boundary_size) I think we can simplify this macro and
avoid the overflow in a more clever way:

boundary_size = (dma_get_seg_boundary(dev) >> PAGE_SHIFT) + 1;

This should work because dma_get_seg_boundary(dev) really needs to be a
bitmask which becomes a power_of_2 on incrementing.
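
The difference between the two expressions can be checked in userspace. The
sketch below is only an illustration: it assumes 4 KiB pages, uses DEMO_
prefixed stand-ins for PAGE_SHIFT, PAGE_SIZE and ALIGN(), and takes the
segment boundary as a plain argument instead of calling
dma_get_seg_boundary(). The assert() plays the role of the BUG_ON() in the
iommu helpers.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_PAGE_SHIFT	12
#define DEMO_PAGE_SIZE	(1UL << DEMO_PAGE_SHIFT)
#define DEMO_ALIGN(x, a)	(((x) + ((a) - 1)) & ~((a) - 1))

static int is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Current code: "+ 1" wraps to 0 when the boundary mask is all ones. */
static unsigned long boundary_size_old(unsigned long seg_boundary)
{
	return DEMO_ALIGN(seg_boundary + 1, DEMO_PAGE_SIZE) >> DEMO_PAGE_SHIFT;
}

/* Proposed form: shift first, then add 1, so nothing can overflow. */
static unsigned long boundary_size_new(unsigned long seg_boundary)
{
	unsigned long size = (seg_boundary >> DEMO_PAGE_SHIFT) + 1;

	assert(is_power_of_2(size));	/* holds for any 2^n - 1 mask */
	return size;
}
```

For a 64 KiB boundary (mask 0xffff) both functions return 16 pages. For an
all-ones mask, which is what DMA_BIT_MASK(64) is on a 64-bit build, the old
form returns 0 and the new one returns a valid power of two.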


Regards,

Joerg




Re: [PATCH v10 0/4] block layer runtime pm

2013-02-21 Thread Aaron Lu
On Wed, Feb 20, 2013 at 10:43:50AM -0500, Alan Stern wrote:
> On Wed, 20 Feb 2013, Aaron Lu wrote:
> 
> > In August 2010, Jens and Alan discussed "Runtime PM and the block
> > layer". http://marc.info/?t=12825910841&r=1&w=2
> > And then Alan has given a detailed implementation guide:
> > http://marc.info/?l=linux-scsi&m=133727953625963&w=2
> 
> > v10:
> > - Add link of Alan Stern's ideas on block layer runtime PM to patch 2
> >   and 3's changelog;
> > - Add back code to schedule device suspend if scsi driver returns -EBUSY.
> 
> This all looks okay now.  You can add
> 
> Acked-by: Alan Stern 
> 
> to each of the patches.

Great, thanks a lot for your kind help.

Hi James,
Can I have your ack for patch 1 and 4?

And Jens,
Do you have any comments for this series?

Thanks,
Aaron



[PATCH] block: modify __bio_add_page check to accept pages that don't start a new segment

2013-02-21 Thread Jan Vesely

The original behavior was to refuse all pages after the maximum number of
segments has been reached. However, some drivers (like st) craft their buffers
to potentially require exactly max segments and multiple pages in the last
segment. This patch modifies the check to allow pages that can be merged into
the last segment.

This change fixes EBUSY failures when using a large (1 MB) tape block size
under high memory fragmentation.

Signed-off-by: Jan Vesely 
---
 fs/bio.c |   26 --
 1 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/fs/bio.c b/fs/bio.c
index b96fc6c..02efbd5 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -500,7 +500,6 @@ static int __bio_add_page(struct request_queue *q, struct 
bio *bio, struct page

  *page, unsigned int len, unsigned int offset,
  unsigned short max_sectors)
 {
-   int retried_segments = 0;
struct bio_vec *bvec;

/*
@@ -551,18 +550,12 @@ static int __bio_add_page(struct request_queue *q, struct 
bio *bio, struct page

return 0;

/*
-* we might lose a segment or two here, but rather that than
-* make this too complex.
+* prepare segment count check, reduce segment count if possible
 */

-   while (bio->bi_phys_segments >= queue_max_segments(q)) {
-
-   if (retried_segments)
-   return 0;
-
-   retried_segments = 1;
+   if (bio->bi_phys_segments >= queue_max_segments(q))
blk_recount_segments(q, bio);
-   }
+

/*
 * setup the new entry, we might clear it again later if we
@@ -572,6 +565,19 @@ static int __bio_add_page(struct request_queue *q, struct 
bio *bio, struct page

bvec->bv_page = page;
bvec->bv_len = len;
bvec->bv_offset = offset;
+   
+   /*
+* the other part of the segment count check, allow mergeable pages
+*/
+   if ((bio->bi_phys_segments > queue_max_segments(q)) ||
+   ( (bio->bi_phys_segments == queue_max_segments(q)) &&
+   !BIOVEC_PHYS_MERGEABLE(bvec - 1, bvec))) {
+   bvec->bv_page = NULL;
+   bvec->bv_len = 0;
+   bvec->bv_offset = 0;
+   return 0;
+   }
+

/*
 * if queue has other restrictions (eg varying max sector size
--
1.7.1
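
The decision the patch changes can be modelled in a few lines of userspace
C. This is a simplified sketch, not the fs/bio.c code: page_allowed_old()
and page_allowed_new() are hypothetical names, the segment count is assumed
to have been recounted already, and mergeable_with_prev stands in for the
BIOVEC_PHYS_MERGEABLE(bvec - 1, bvec) test.

```c
#include <assert.h>
#include <stdbool.h>

/* Old rule: once the segment limit is reached, refuse every further page. */
static bool page_allowed_old(unsigned int phys_segments,
			     unsigned int max_segments)
{
	return phys_segments < max_segments;
}

/*
 * New rule: at exactly the limit, still accept a page that merges into
 * the last segment, because it does not start a new segment.
 */
static bool page_allowed_new(unsigned int phys_segments,
			     unsigned int max_segments,
			     bool mergeable_with_prev)
{
	assert(max_segments > 0);

	if (phys_segments > max_segments)
		return false;
	if (phys_segments == max_segments && !mergeable_with_prev)
		return false;
	return true;
}
```

With a limit of 4 segments and 4 already in use, the old rule rejects any
page, while the new rule accepts one that is physically contiguous with the
previous page. That is the case a driver such as st hits when its buffer
needs exactly the maximum number of segments with several pages in the last
one.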