Re: [RFC 1/2] target: Add documentation on the target userspace pass-through driver

2014-08-30 Thread Richard W.M. Jones
On Tue, Jul 01, 2014 at 12:11:14PM -0700, Andy Grover wrote:
 Describes the driver and its interface to make it possible for user
 programs to back a LIO-exported LUN.
 
 Signed-off-by: Andy Grover agro...@redhat.com
 ---
  Documentation/target/tcmu-design.txt | 210 +++
  1 file changed, 210 insertions(+)
  create mode 100644 Documentation/target/tcmu-design.txt
 
 diff --git a/Documentation/target/tcmu-design.txt b/Documentation/target/tcmu-design.txt
 new file mode 100644
 index 000..200ff3e
 --- /dev/null
 +++ b/Documentation/target/tcmu-design.txt
 @@ -0,0 +1,210 @@
 +TCM Userspace Design
 +
 +
 +
 +Background:
 +
 +In addition to modularizing the transport protocol used for carrying
 +SCSI commands (fabrics), the Linux kernel target, LIO, also modularizes
 +the actual data storage as well. These are referred to as backstores
 +or storage engines. The target comes with backstores that allow a
 +file, a block device, RAM, or another SCSI device to be used for the
 +local storage needed for the exported SCSI LUN. Like the rest of LIO,
 +these are implemented entirely as kernel code.
 +
 +These backstores cover the most common use cases, but not all. One new
 +use case that other non-kernel target solutions, such as tgt, are able
 +to support is using Gluster's GLFS or Ceph's RBD as a backstore. The
 +target then serves as a translator, allowing initiators to store data
 +in these non-traditional networked storage systems, while still only
 +using standard protocols themselves.
 +
 +If the target is a userspace process, supporting these is easy. tgt,
 +for example, needs only a small adapter module for each, because the
 +modules just use the available userspace libraries for RBD and GLFS.
 +
 +Adding support for these backstores in LIO is considerably more
 +difficult, because LIO is entirely kernel code. Instead of undertaking
 +the significant work to port the GLFS or RBD APIs and protocols to the
 +kernel, another approach is to create a userspace pass-through
 +backstore for LIO, TCMU.

It has to be said that this documentation is terrible.

Jumping in medias res[1] is great for fiction, awful for technical
documentation.

I would recommend the Economist Style Guide[2].  They always say
"Barack Obama, President of the United States" the first time he is
mentioned in an article, even though almost everyone knows who Barack
Obama is.

In this case you're leaping into something .. fabrics, LIO,
backstores, target solutions, ... aargh.  Explain what you mean by
each term and how it all fits together.

Thanks,
Rich.

[1] https://en.wikipedia.org/wiki/In_medias_res

[2] http://www.economist.com/styleguide/introduction

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html
--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/5] kexec: Export kexec_in_progress

2014-08-30 Thread Eric W. Biederman
Brian King brk...@linux.vnet.ibm.com writes:

 On 08/04/2014 09:21 AM, Brian King wrote:
 On 07/28/2014 03:28 PM, Brian King wrote:

 Export kexec_in_progress for use by device drivers and other modules
 to optimize kexec boot.

 Signed-off-by: Brian King brk...@linux.vnet.ibm.com
 ---

  kernel/kexec.c |2 ++
  1 file changed, 2 insertions(+)

 diff -puN kernel/kexec.c~kexec_export_in_prog kernel/kexec.c
 --- linux/kernel/kexec.c~kexec_export_in_prog	2014-07-23 17:05:24.851887935 -0500
 +++ linux-bjking1/kernel/kexec.c	2014-07-23 17:05:24.856887970 -0500
 @@ -1716,3 +1716,5 @@ int kernel_kexec(void)
  	mutex_unlock(&kexec_mutex);
  	return error;
  }
 +
 +EXPORT_SYMBOL_GPL(kexec_in_progress);
 
 Eric,
 
 Can I get an ack on this so we can take this entire series through the SCSI 
 tree?

 Eric,

 Any issues with this patch?

No huge issues except that the patch description is largely uninformative and
actually wrong.  kexec_in_progress is never set at boot/kernel startup.

Eric


scsi_debug module deadlock on 3.17-rc2

2014-08-30 Thread Milan Broz
Hi,

I am using scsi_debug in the cryptsetup testsuite, and with a recent 3.17-rc
kernel it deadlocks on rmmod of the scsi_debug module.

For me even this simple reproducer causes the deadlock:
  modprobe scsi_debug dev_size_mb=16 sector_size=512 num_tgts=1
  DEV=/dev/$(grep -l -e scsi_debug /sys/block/*/device/model | cut -f4 -d /)
  mkfs -t ext4 $DEV
  rmmod scsi_debug

(adding a small delay before rmmod obviously helps here)

Bisect tracked it to commit
  commit cbf67842c3d9e7af8ccc031332b79e88d9cca592
  Author: Douglas Gilbert dgilb...@interlog.com
  Date:   Sat Jul 26 11:55:35 2014 -0400
  scsi_debug: support scsi-mq, queues and locks

I guess that with the introduction of mq, del_timer_sync() must not be
called with queued_arr_lock held.
(To me it looks like the situation described in the comment before
del_timer_sync() in kernel/time/timer.c...)
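
As a rough illustration of the cycle suspected above, here is a standalone
userspace sketch in C. It is not kernel code and not how lockdep is
implemented; it only records "lock B was acquired while A was held" edges and
refuses an edge that would close a loop. The type and function names
(lock_graph, record_acquire, and so on) are made up for this sketch. The two
lock classes mirror the names in the report: the timer callback takes
queued_arr_lock, while the rmmod path holds queued_arr_lock and then waits on
the timer in del_timer_sync().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes for this sketch; names mirror the report. */
enum lock_class { QUEUED_ARR_LOCK, CMND_TIMER, NUM_CLASSES };

/* dep[a][b] is true once "b was acquired while a was held" has been seen. */
struct lock_graph {
	bool dep[NUM_CLASSES][NUM_CLASSES];
};

static void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Is 'to' reachable from 'from' by following recorded dependencies? */
static bool reachable(const struct lock_graph *g, int from, int to,
		      bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NUM_CLASSES; next++)
		if (g->dep[from][next] && !seen[next] &&
		    reachable(g, next, to, seen))
			return true;
	return false;
}

/*
 * Record "acquire 'wanted' while holding 'held'".
 * Returns false if the new edge would close a cycle, i.e. some earlier
 * path already reached 'held' starting from 'wanted'.
 */
static bool record_acquire(struct lock_graph *g, int held, int wanted)
{
	bool seen[NUM_CLASSES] = { false };

	assert(held != wanted);
	if (reachable(g, wanted, held, seen))
		return false;
	g->dep[held][wanted] = true;
	return true;
}
```

Recording the timer-callback edge first (CMND_TIMER held, QUEUED_ARR_LOCK
wanted) succeeds; recording the rmmod edge afterwards (QUEUED_ARR_LOCK held,
CMND_TIMER wanted) is rejected, which is the same shape as the dependency
chain in the log.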

Here is the log (running on vmware VM and i686 arch):

[   67.916472] scsi_debug: host protection
[   67.916483] scsi host3: scsi_debug, version 1.84 [20140706], dev_size_mb=16, opts=0x0
[   67.917446] scsi 3:0:0:0: Direct-Access     Linux    scsi_debug       0184 PQ: 0 ANSI: 5
[   67.920539] sd 3:0:0:0: Attached scsi generic sg8 type 0
[   67.940542] sd 3:0:0:0: [sdh] 32768 512-byte logical blocks: (16.7 MB/16.0 MiB)
[   67.940548] sd 3:0:0:0: [sdh] 4096-byte physical blocks
[   67.950705] sd 3:0:0:0: [sdh] Write Protect is off
[   67.950715] sd 3:0:0:0: [sdh] Mode Sense: 73 00 10 08
[   67.970514] sd 3:0:0:0: [sdh] Write cache: enabled, read cache: enabled, supports DPO and FUA
[   68.040566]  sdh: unknown partition table
[   68.090618] sd 3:0:0:0: [sdh] Attached SCSI disk
[   68.799699]  sdh: unknown partition table
[   69.072314] 
[   69.072387] ======================================================
[   69.072433] [ INFO: possible circular locking dependency detected ]
[   69.072487] 3.17.0-rc2+ #80 Not tainted
[   69.072518] -------------------------------------------------------
[   69.072560] rmmod/2890 is trying to acquire lock:
[   69.072595]  ((sqcp->cmnd_timerp)){+.-...}, at: [<c10846c0>] del_timer_sync+0x0/0xb0
[   69.072704] 
[   69.072704] but task is already holding lock:
[   69.072743]  (queued_arr_lock){..-...}, at: [<e1271887>] stop_all_queued+0x17/0xc0 [scsi_debug]
[   69.072852] 
[   69.072852] which lock already depends on the new lock.
[   69.072852] 
[   69.072902] 
[   69.072902] the existing dependency chain (in reverse order) is:
[   69.072949] 
[   69.072949] -> #1 (queued_arr_lock){..-...}:
[   69.073045]        [<c1072689>] lock_acquire+0x59/0xa0
[   69.073114]        [<c1465cb1>] _raw_spin_lock_irqsave+0x31/0x70
[   69.073438]        [<e1271cf7>] sdebug_q_cmd_complete+0x27/0x190 [scsi_debug]
[   69.073515]        [<c108434b>] call_timer_fn+0x5b/0xd0
[   69.073581]        [<c10848d1>] run_timer_softirq+0x161/0x200
[   69.073649]        [<c103dbc9>] __do_softirq+0x119/0x230
[   69.073726]        [<c1003c77>] do_softirq_own_stack+0x27/0x30
[   69.073811]        [<c103de1e>] irq_exit+0x7e/0xa0
[   69.073889]        [<c102a893>] smp_apic_timer_interrupt+0x33/0x40
[   69.073969]        [<c14672ae>] apic_timer_interrupt+0x32/0x38
[   69.074254]        [<c100a529>] arch_cpu_idle+0x9/0x10
[   69.074318]        [<c10674ec>] cpu_startup_entry+0x22c/0x280
[   69.074381]        [<c145e1ec>] rest_init+0x9c/0xb0
[   69.074441]        [<c16799e0>] start_kernel+0x2e9/0x2ee
[   69.074504]        [<c16792ab>] i386_start_kernel+0x79/0x7d
[   69.074567] 
[   69.074567] -> #0 ((sqcp->cmnd_timerp)){+.-...}:
[   69.074794]        [<c1071b94>] __lock_acquire+0x16e4/0x1c30
[   69.074859]        [<c1072689>] lock_acquire+0x59/0xa0
[   69.074919]        [<c10846e9>] del_timer_sync+0x29/0xb0
[   69.074981]        [<e12718fa>] stop_all_queued+0x8a/0xc0 [scsi_debug]
[   69.075050]        [<e1276f85>] scsi_debug_exit+0x16/0xac [scsi_debug]
[   69.075117]        [<c109a7bd>] SyS_delete_module+0xfd/0x180
[   69.075181]        [<c1466b2e>] syscall_after_call+0x0/0x4
[   69.075243] 
[   69.075243] other info that might help us debug this:
[   69.075243] 
[   69.075321]  Possible unsafe locking scenario:
[   69.075321] 
[   69.075380]        CPU0                    CPU1
[   69.075424]        ----                    ----
[   69.075468]   lock(queued_arr_lock);
[   69.075534]                                lock((sqcp->cmnd_timerp));
[   69.075613]                                lock(queued_arr_lock);
[   69.075690]   lock((sqcp->cmnd_timerp));
[   69.075758] 
[   69.075758]  *** DEADLOCK ***
[   69.075758] 
[   69.075827] 1 lock held by rmmod/2890:
[   69.075867]  #0:  (queued_arr_lock){..-...}, at: [<e1271887>] stop_all_queued+0x17/0xc0 [scsi_debug]
[   69.076009] 
[   69.076009] stack backtrace:
[   69.076064] CPU: 1 PID: 2890 Comm: rmmod Not tainted 3.17.0-rc2+ #80
[   69.076117] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/31/2013
[   69.076200]  c1c93200  da25fe30 c146081f c1c93330 da25fe60 c145fbf6 
c158bbfc
[   69.076375]  c158bb99 c158bb7c c158bb91 c158bb7c da25fe9c 

Re: Problem with USB-to-SATA adapters (was: AS2105-based enclosure size issues with 2TB HDDs)

2014-08-30 Thread Alan Stern
On Fri, 29 Aug 2014, Matthew Dharm wrote:

 Is there an 'easy' way to override the detected size of a storage
 device from userspace?  If we had that, someone could write a helper
 application which looked for this particular fubar and try to Do The
 Right Thing(tm), or at least offer the user some options.

You mean, force a Media Change event and override the capacity reported 
by the hardware?  I'm not aware of any API for doing that, although it 
probably wouldn't be too hard to add one.

How would the user know what value to put in for the capacity?  Unless 
the drive had been hooked up to a different computer and the user 
manually noted the correct capacity and typed it in, it would have to 
be guesswork.

Alan Stern



Re: Problem with USB-to-SATA adapters (was: AS2105-based enclosure size issues with 2TB HDDs)

2014-08-30 Thread Douglas Gilbert

On 14-08-30 05:15 PM, Alan Stern wrote:

On Fri, 29 Aug 2014, Matthew Dharm wrote:


Is there an 'easy' way to override the detected size of a storage
device from userspace?  If we had that, someone could write a helper
application which looked for this particular fubar and try to Do The
Right Thing(tm), or at least offer the user some options.


You mean, force a Media Change event and override the capacity reported
by the hardware?  I'm not aware of any API for doing that, although it
probably wouldn't be too hard to add one.

How would the user know what value to put in for the capacity?  Unless
the drive had been hooked up to a different computer and the user
manually noted the correct capacity and typed it in, it would have to
be guesswork.


Might another possibility be using the SAT layer to issue
the appropriate ATA command via the SCSI ATA PASS-THROUGH
(12 or 16) command to find out the disk's size?  This might
be a possible strategy if READ CAPACITY(10) yields 0xffffffff
for the last sector's LBA and the follow-up READ CAPACITY(16)
fails or yields a truncated value.
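
The decision described above could be sketched roughly as follows, in
standalone C rather than kernel code. The function and enum names are invented
for this sketch. It relies on the standard layout of the response data: READ
CAPACITY(10) returns a 4-byte big-endian last LBA followed by a 4-byte block
length, and READ CAPACITY(16) begins with an 8-byte big-endian last LBA. What
counts as a "truncated" 16-byte answer is an assumption here: since the
10-byte command only reports all ones when the last LBA does not fit in 32
bits, a 16-byte answer below that value is treated as suspect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Read a big-endian 32-bit value, as used in SCSI parameter data. */
static uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read a big-endian 64-bit value. */
static uint64_t get_be64(const uint8_t *p)
{
	return ((uint64_t)get_be32(p) << 32) | get_be32(p + 4);
}

/* Hypothetical result codes for this sketch. */
enum capacity_source {
	USE_READ_CAPACITY_10,
	USE_READ_CAPACITY_16,
	TRY_ATA_PASS_THROUGH,
};

/*
 * rc10: 8-byte READ CAPACITY(10) data (last LBA, then block length).
 * rc16: READ CAPACITY(16) data (8-byte last LBA first), or NULL if the
 *       command failed.
 * On the first two results, *last_lba holds the value to trust.
 */
static enum capacity_source pick_capacity_source(const uint8_t *rc10,
						 const uint8_t *rc16,
						 uint64_t *last_lba)
{
	uint32_t lba10;

	assert(rc10 && last_lba);
	lba10 = get_be32(rc10);
	if (lba10 != 0xffffffffu) {
		*last_lba = lba10;
		return USE_READ_CAPACITY_10;
	}
	/* Assumption: a 16-byte answer that fits in 32 bits is truncated. */
	if (rc16 && get_be64(rc16) >= 0xffffffffull) {
		*last_lba = get_be64(rc16);
		return USE_READ_CAPACITY_16;
	}
	return TRY_ATA_PASS_THROUGH;
}
```

Actually issuing the ATA PASS-THROUGH command and decoding its reply is left
out; the sketch only covers when one might fall back to it.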

Doug Gilbert


BTW Been looking at a USB-to-SATA adapter that uses the
UAS(P) transport. I thought nothing could have worse
SCSI compliance than USB mass storage devices. I was
wrong ...



Re: scsi_debug module deadlock on 3.17-rc2

2014-08-30 Thread Douglas Gilbert

On 14-08-30 04:56 PM, Milan Broz wrote:

Hi,

I am using scsi_debug in the cryptsetup testsuite, and with a recent 3.17-rc
kernel it deadlocks on rmmod of the scsi_debug module.

For me even this simple reproducer causes the deadlock:
   modprobe scsi_debug dev_size_mb=16 sector_size=512 num_tgts=1
   DEV=/dev/$(grep -l -e scsi_debug /sys/block/*/device/model | cut -f4 -d /)
   mkfs -t ext4 $DEV
   rmmod scsi_debug

(adding a small delay before rmmod obviously helps here)


So I used this slight variation for testing:

modprobe scsi_debug dev_size_mb=16 sector_size=512 num_tgts=1 num_parts=1
DEV=/dev/$(grep -l -e scsi_debug /sys/block/*/device/model | cut -f4 -d /)1
echo mkfs -t ext4 ${DEV}
mkfs -t ext4 ${DEV}
sleep 0.1
rmmod scsi_debug


Bisect tracked it to commit
   commit cbf67842c3d9e7af8ccc031332b79e88d9cca592
   Author: Douglas Gilbert dgilb...@interlog.com
   Date:   Sat Jul 26 11:55:35 2014 -0400
   scsi_debug: support scsi-mq, queues and locks

I guess that with the introduction of mq, del_timer_sync() must not be
called with queued_arr_lock held.
(To me it looks like the situation described in the comment before
del_timer_sync() in kernel/time/timer.c...)


Looks like something a lawyer would write.


Here is the log (running on vmware VM and i686 arch):

[   67.916472] scsi_debug: host protection
[   67.916483] scsi host3: scsi_debug, version 1.84 [20140706], dev_size_mb=16, opts=0x0
[   67.917446] scsi 3:0:0:0: Direct-Access     Linux    scsi_debug       0184 PQ: 0 ANSI: 5
[   67.920539] sd 3:0:0:0: Attached scsi generic sg8 type 0
[   67.940542] sd 3:0:0:0: [sdh] 32768 512-byte logical blocks: (16.7 MB/16.0 MiB)
[   67.940548] sd 3:0:0:0: [sdh] 4096-byte physical blocks
[   67.950705] sd 3:0:0:0: [sdh] Write Protect is off
[   67.950715] sd 3:0:0:0: [sdh] Mode Sense: 73 00 10 08
[   67.970514] sd 3:0:0:0: [sdh] Write cache: enabled, read cache: enabled, supports DPO and FUA
[   68.040566]  sdh: unknown partition table
[   68.090618] sd 3:0:0:0: [sdh] Attached SCSI disk
[   68.799699]  sdh: unknown partition table
[   69.072314]
[   69.072387] ======================================================
[   69.072433] [ INFO: possible circular locking dependency detected ]
[   69.072487] 3.17.0-rc2+ #80 Not tainted
[   69.072518] -------------------------------------------------------
[   69.072560] rmmod/2890 is trying to acquire lock:
[   69.072595]  ((sqcp->cmnd_timerp)){+.-...}, at: [<c10846c0>] del_timer_sync+0x0/0xb0
[   69.072704]
[   69.072704] but task is already holding lock:
[   69.072743]  (queued_arr_lock){..-...}, at: [<e1271887>] stop_all_queued+0x17/0xc0 [scsi_debug]
[   69.072852]
[   69.072852] which lock already depends on the new lock.
[   69.072852]


snip


[   69.075321]  Possible unsafe locking scenario:
[   69.075321]
[   69.075380]        CPU0                    CPU1
[   69.075424]        ----                    ----
[   69.075468]   lock(queued_arr_lock);
[   69.075534]                                lock((sqcp->cmnd_timerp));
[   69.075613]                                lock(queued_arr_lock);
[   69.075690]   lock((sqcp->cmnd_timerp));
[   69.075758]
[   69.075758]  *** DEADLOCK ***


Interesting analysis, somewhat confusing because cmnd_timerp
is a pointer. Also my guess is the sqcp pointers in the
two threads were different.

Anyway, the attached patch removes the lock(queued_arr_lock)
from around the del_timer calls. Could you try it and report
back?
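
For readers who want the general pattern rather than the driver details, here
is a single-threaded model in standalone C. It is only a sketch, not the
scsi_debug code: the "lock" is a flag, the struct and function names are
invented, and re-taking the lock before moving on to the next slot is an
assumption of the sketch. The idea it illustrates is the one described above:
do the bookkeeping for a slot while holding the lock, drop the lock, and only
then call the synchronous cancel, which may have to wait for a callback that
wants the same lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_SLOTS 4

/* Single-threaded model: the "lock" is just a flag we can inspect. */
struct model {
	bool lock_held;
	bool in_use[NUM_SLOTS];      /* like a bit in queued_in_use_bm */
	bool has_cmnd[NUM_SLOTS];    /* like sqcp->a_cmnd != NULL */
	bool timer_armed[NUM_SLOTS];
	int cancels_under_lock;      /* counts the unsafe pattern */
};

static void model_lock(struct model *m)
{
	assert(!m->lock_held);
	m->lock_held = true;
}

static void model_unlock(struct model *m)
{
	assert(m->lock_held);
	m->lock_held = false;
}

/*
 * Stand-in for a synchronous cancel such as del_timer_sync(): it may
 * have to wait for a callback that itself wants the lock, so calling it
 * with the lock held is recorded as a violation.
 */
static void model_cancel_sync(struct model *m, size_t k)
{
	if (m->lock_held)
		m->cancels_under_lock++;
	m->timer_armed[k] = false;
}

/* Claim each busy slot under the lock, drop the lock, then cancel. */
static void model_stop_all(struct model *m)
{
	model_lock(m);
	for (size_t k = 0; k < NUM_SLOTS; k++) {
		if (!m->in_use[k] || !m->has_cmnd[k])
			continue;
		m->has_cmnd[k] = false;  /* bookkeeping under the lock */
		model_unlock(m);
		model_cancel_sync(m, k); /* safe: lock is not held here */
		m->in_use[k] = false;
		model_lock(m);           /* assumption: re-take for next slot */
	}
	model_unlock(m);
}
```

Moving the model_cancel_sync() call above model_unlock() makes
cancels_under_lock non-zero, which corresponds to the ordering lockdep
complained about.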

Doug Gilbert


--- a/drivers/scsi/scsi_debug.c	2014-08-26 13:24:51.646948507 -0400
+++ b/drivers/scsi/scsi_debug.c	2014-08-30 18:04:54.589226679 -0400
@@ -2743,6 +2743,13 @@ static int stop_queued_cmnd(struct scsi_
 		if (test_bit(k, queued_in_use_bm)) {
 			sqcp = &queued_arr[k];
 			if (cmnd == sqcp->a_cmnd) {
+				devip = (struct sdebug_dev_info *)
+					cmnd->device->hostdata;
+				if (devip)
+					atomic_dec(&devip->num_in_q);
+				sqcp->a_cmnd = NULL;
+				spin_unlock_irqrestore(&queued_arr_lock,
+						   iflags);
 				if (scsi_debug_ndelay > 0) {
 					if (sqcp->sd_hrtp)
 						hrtimer_cancel(
@@ -2755,18 +2762,13 @@ static int stop_queued_cmnd(struct scsi_
 					if (sqcp->tletp)
 						tasklet_kill(sqcp->tletp);
 				}
-				__clear_bit(k, queued_in_use_bm);
-				devip = (struct sdebug_dev_info *)
-					cmnd->device->hostdata;
-				if (devip)
-					atomic_dec(&devip->num_in_q);
-				sqcp->a_cmnd = NULL;
-				break;
+				clear_bit(k, queued_in_use_bm);
+				return 1;
 			}
 		}
 	}
 	spin_unlock_irqrestore(&queued_arr_lock, iflags);
-	return (k < qmax) ? 1 : 0;
+	return 0;
 }
 
 /* Deletes (stops) timers or tasklets of all queued commands */
@@ -2782,6 +2784,13 @@ static void stop_all_queued(void)
 		if (test_bit(k, queued_in_use_bm)) {
 			sqcp = &queued_arr[k];
 			if (sqcp->a_cmnd) {
+				devip = (struct sdebug_dev_info *)
+					sqcp->a_cmnd->device->hostdata;
+				if (devip)
+					atomic_dec(&devip->num_in_q);
+				sqcp->a_cmnd = NULL;
+				spin_unlock_irqrestore(&queued_arr_lock,
+						   iflags);
 				if (scsi_debug_ndelay > 0) {
 					if (sqcp->sd_hrtp)

Re: Problem with USB-to-SATA adapters (was: AS2105-based enclosure size issues with 2TB HDDs)

2014-08-30 Thread Matthew Dharm
On Sat, Aug 30, 2014 at 2:15 PM, Alan Stern st...@rowland.harvard.edu wrote:
 On Fri, 29 Aug 2014, Matthew Dharm wrote:

 Is there an 'easy' way to override the detected size of a storage
 device from userspace?  If we had that, someone could write a helper
 application which looked for this particular fubar and try to Do The
 Right Thing(tm), or at least offer the user some options.

 You mean, force a Media Change event and override the capacity reported
 by the hardware?  I'm not aware of any API for doing that, although it
 probably wouldn't be too hard to add one.

 How would the user know what value to put in for the capacity?  Unless
 the drive had been hooked up to a different computer and the user
 manually noted the correct capacity and typed it in, it would have to
 be guesswork.

I didn't say it would be easy to figure out the right value, but at
least it would be possible.

I was thinking of something that could notice a USB device which is
formatted NTFS and has a partition table and filesystem that indicate
a much bigger capacity than what the drive reports.  Under these
circumstances, you could do something like pop up a dialog box saying
"this drive is confused -- is it 2TB or 3TB?"

Well, maybe that would say "Drive capacity is not consistent with
partition table.  This can happen with certain USB drives designed for
use with Windows.  Override drive capacity (emulating Windows)?"

You could imagine increasingly complex heuristics to try to detect this
scenario.  Even without an automated helper program to do it, if there
was a sysfs interface then when we got the periodic e-mails reporting
this same type of problem, we could offer a quick-and-clean solution.
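
The simplest version of such a heuristic might look like the standalone C
sketch below. It is only an illustration under stated assumptions: the
function name is invented, it looks only at the four primary entries of a
classic MBR (446-byte offset, 16-byte entries, little-endian start LBA and
sector count, 0x55 0xAA signature), and it assumes the partition table and the
reported capacity count sectors of the same size. GPT, extended partitions and
the NTFS superblock are not examined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MBR_SIZE          512
#define MBR_TABLE_OFFSET  446
#define MBR_ENTRY_SIZE    16
#define MBR_NUM_ENTRIES   4

/* Read a little-endian 32-bit value, as stored in MBR entries. */
static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Returns true if any primary partition in the MBR ends beyond the
 * number of sectors the device reports.
 */
static bool mbr_exceeds_capacity(const uint8_t mbr[MBR_SIZE],
				 uint64_t reported_sectors)
{
	assert(mbr);
	/* No boot signature: not an MBR, so nothing to compare. */
	if (mbr[510] != 0x55 || mbr[511] != 0xaa)
		return false;

	for (int i = 0; i < MBR_NUM_ENTRIES; i++) {
		const uint8_t *e = mbr + MBR_TABLE_OFFSET + i * MBR_ENTRY_SIZE;
		uint8_t type = e[4];
		uint64_t start = get_le32(e + 8);
		uint64_t count = get_le32(e + 12);

		if (type == 0 || count == 0)
			continue;	/* unused entry */
		if (start + count > reported_sectors)
			return true;
	}
	return false;
}
```

A helper could read the first sector of the device, call this with the
capacity the kernel reports, and only then raise the question with the user.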

Matt


-- 
Matthew Dharm
Maintainer, USB Mass Storage driver for Linux