Re: [linux-usb-devel] oops on usb storage device disconnect with 2.6.14-rc1

2005-09-15 Thread Mike Anderson
James Bottomley [EMAIL PROTECTED] wrote:
 On Thu, 2005-09-15 at 15:57 -0400, James Bottomley wrote:
  I haven't had time to review the eh changes, but I was going to reply to
  the other one (basically there's a better way to try to close the device
  add/host remove race using the host state model).
  
  Let me complete the SCSI process and I'll take them through the scsi-rc-
  fixes tree.
 
 Well, I think the symptoms are racing scsi_remove_host() calls and the
 solution is to enforce the state model on removal (as in if the host is
 already in the remove state, don't try to remove it again).
 
 Could you try the patch here:
 
 http://marc.theaimsgroup.com/?l=linux-scsim=112613077011571
 
 And see if it will fix the problem?
 

A side effect of not applying Alan's previous patch that added
SHOST_RECOVERY to the SHOST_CANCEL: state is that we will not move to the
SHOST_CANCEL and subsequently not to SHOST_DEL state if the eh is active
during the start of scsi_remove_host. I sent mail on the 7th indicating to
include that state change hunk of the diff, but I guess that overlapped
with your newer state changes.
http://marc.theaimsgroup.com/?l=linux-scsim=112238726326927w=2

In looking at the state model introduced by your patch I believe there may
still be a state model race issue if the recovery completes just after
the if (!scsi_host_set_state(shost, SHOST_CANCEL)) call in
scsi_remove_host (maybe I am just looking to quickly at the state
updates).

I still do not understand as I asked in a previous comment why we are not
shutting down the eh_thread in scsi_remove_host and also why simpler state
model updates could not solve the problem.

I believe I also indicated that we could enhance scsi_error to shutdown
faster during this state which should only be a performance improvement.

-andmike
--
Michael Anderson
[EMAIL PROTECTED]


---
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server. Download
it for free - -and be entered to win a 42 plasma tv or your very own
Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
___
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


Re: [linux-usb-devel] oops on usb storage device disconnect with 2.6.14-rc1

2005-09-15 Thread Mike Anderson
James Bottomley [EMAIL PROTECTED] wrote:
 On Thu, 2005-09-15 at 15:19 -0700, Mike Anderson wrote:
  A side effect of not applying Alan's previous patch that added
  SHOST_RECOVERY to the SHOST_CANCEL: state is that we will not move to the
  SHOST_CANCEL and subsequently not to SHOST_DEL state if the eh is active
  during the start of scsi_remove_host. I sent mail on the 7th indicating to
  include that state change hunk of the diff, but I guess that overlapped
  with your newer state changes.
  http://marc.theaimsgroup.com/?l=linux-scsim=112238726326927w=2
 
 Yes, but that's not really legitimate since it introduces a bifurcation
 in the state machine ... when the eh terminates it will come back to
 running even if it went in from cancel.

Clarification here as I do not see the split in the state machine or the
transition back to running from cancel. If the above patch is applied we
can transition to cancel from recovery if eh already has started. When eh
is complete scsi_restart_operations transition to running will fail as we
are in the cancel state. 

That said I like the idea below of waiting / terminating the eh thread
prior to transitioning to cancel. There is some introduction of asymmetry
here in scsi_remove_host as the eh thread is created in scsi_host_alloc,
but possibly later patches could move the eh creation to scsi_add_host
(unless I forgot the reason it needed to be earlier).

 
  In looking at the state model introduced by your patch I believe there may
  still be a state model race issue if the recovery completes just after
  the if (!scsi_host_set_state(shost, SHOST_CANCEL)) call in
  scsi_remove_host (maybe I am just looking to quickly at the state
  updates).
 
 No, that's true; there's a tiny race that can be mediated by doing
 locking around the state changes ... that was one of the feedback
 comments from Alan.

ok.

 
  I still do not understand as I asked in a previous comment why we are not
  shutting down the eh_thread in scsi_remove_host and also why simpler state
  model updates could not solve the problem.
 
 Well, it goes back to whether we wait for the thread or not.  To shut
 the thread down, we also need to wait for it to complete.
 
 As far as the state model goes, we either need to wait for the eh thread
 before transitioning to cancel or introduce the extra states that
 reflect the parallel eh transitions.

I like waiting for the eh thread to complete / shutdown prior to
transitioning to cancel.

 
  I believe I also indicated that we could enhance scsi_error to shutdown
  faster during this state which should only be a performance improvement.
 
 Yes, we could ... patches?


ok, I will try, but IBM non-coding activities has made me very
unproductive in the patch department lately :-).

-andmike
--
Michael Anderson
[EMAIL PROTECTED]


---
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server. Download
it for free - -and be entered to win a 42 plasma tv or your very own
Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
___
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


Re: [linux-usb-devel] 2.6.11-rc3-bk5, oops in scsi_try_bus_reset

2005-02-14 Thread Mike Anderson
Alan Stern [EMAIL PROTECTED] wrote:
 And Mike Anderson's response was
 
 http://marc.theaimsgroup.com/?l=linux-scsim=110538854224319w=2
 
 His explanation was Currently scsi_host_cancel being called from
 scsi_remove_host appears to not do anything as scsi_forget_host removes
 the devices from the list it iterates over.  I don't know whether this 
 problem has been fixed yet.
 


No, this has not been fixed yet, but needs to be. 

In discussing this with Mike C today he had a patch that would drain the
devices queues prior to scsi_forget_host returning which may help in this
case. 

Along with Mike C's change we could remove the scsi_host_cancel code
(since it does not do anything and when it did it was racey), and shutdown
the error handler thread in scsi_remove_host (reduce possible callers into
the LLDD post scsi_remove_host).

Mike C or I will have something out soon on this.

-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
SF email is sponsored by - The IT Product Guide
Read honest  candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now.
http://ads.osdn.com/?ad_id=6595alloc_id=14396op=click
___
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


Re: [linux-usb-devel] Fw: [BUG] USB Storage OOPS and a D state process in 2.6.10

2005-01-10 Thread Mike Anderson

Alan Stern [EMAIL PROTECTED] wrote:
 Jan 10 20:49:08 desktop kernel: scsi113 : SCSI emulation for USB Mass Storage 
 devices
 Jan 10 20:49:08 desktop kernel: usb-storage: device found at 113
 Jan 10 20:49:08 desktop kernel: usb-storage: waiting for device to settle 
 before scanning
 Jan 10 20:49:12 desktop kernel: In bus_reset, srb: 010037d71640
 Jan 10 20:49:12 desktop kernel:   device: 010038e3d000
 Jan 10 20:49:12 desktop kernel:   host: 01002b500c00
 Jan 10 20:49:12 desktop kernel:   hostdata: 01002b549c00
 
 This is the first indication of something funny.  I can't think how a bus
 reset would get initiated by the kernel -- that only happens after a
 timeout.  Maybe it came from a user program?

Well it can get initiated in the scsi error handler. This can be verified
by setting the logging_level (sysctl -w dev.scsi.logging_level=0x4).

You probably do not need to do this as I assume what is happening is that
with all the plug and unplug that is being initiated that a IO is in
flight when the scsi_remove_host functions is called. It looks like we
still have to decide on the removal cleanup. Currently scsi_host_cancel
being called from scsi_remove_host appears to not do anything as
scsi_forget_host removes the devices from the list it iterates over. Also
as previously mentioned by you there are races with scsi_mid canceling IOs
that might be completing.  Both of these issue where mentioned in an old
thread (link provided below), but not completely resolved.

http://marc.theaimsgroup.com/?t=10963092065r=1w=2

What is needed is for the LLDD to ensure all IO sent through the LLDD's
queuecommand functions gets a scsi_done call (even a fail call). The scsi
mid applies choke points to stop IO being set to the LLDDs queuecommand,
but there can be races during a host removal (you previously mentioned
this). This would also imply that scsi mid should not be making the call
to scsi_host_cancel (which appears to not do anything anyway).


-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
The SF.Net email is sponsored by: Beat the post-holiday blues
Get a FREE limited edition SourceForge.net t-shirt from ThinkGeek.
It's fun and FREE -- well, almosthttp://www.thinkgeek.com/sfshirt
___
linux-usb-devel@lists.sourceforge.net
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


Re: [linux-usb-devel] BUG when removing USB flash drive

2004-05-14 Thread Mike Anderson
Alan Stern [EMAIL PROTECTED] wrote:
 On Thu, 13 May 2004, Mike Anderson wrote:
 
  The LLDD queuecommand should not be called after the return of
  scsi_remove_host (unless we have a bug). The LLDD should not free its
  resources until the release function is called on the struct device
  passed in during the scsi_add_host call.
 
 Ah.  We have a problem here, because usb-storage doesn't own that struct 
 device.  It is owned by the USB core, and usb-storage has no way to know 
 when its release function has been called.
 
 Even worse, the release function may never be called at all.  The device
 itself may still exist and be attached to the system even though the
 usb-storage driver has been unbound from it.
 
 There must be many other LLDDs with the same problem.  Any driver that 
 manages a PCI host controller, for example.  The struct device it receives 
 from the PCI layer isn't owned by the LLDD, yet that's what it must pass 
 to scsi_add_host().
 
 How should we handle this?
 

I was incorrect in my previous statement. Well we may hold a reference
count on the struct device passed into the scsi_add_host until all
references on the host instance are gone and there maybe a positive ref
count on the hostt module until all devices are closed a instance of
LLDD should be able to do cleanup post return of scsi_remove_host. This
needs to be true or SCSI mid-layer would not honor the requirement for
the pci remove call.

I believe in the past you or someone else has mentioned that post the
return of scsi_remove_host a Scsi_Host instance still has a hostt and a
transportt which may be dangerous if all code paths are not correctly
handling a host in the SHOST_DEL state.

I guess we still have this current issue on what is happening in the
usb storage that it believes SCSI mid is still using the device. Is
something not cleaning up, or is something in SCSI mid not honoring the
SHOST_DEL state.

-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
This SF.Net email is sponsored by: SourceForge.net Broadband
Sign-up now for SourceForge Broadband and get the fastest
6.0/768 connection for only $19.95/mo for the first 3 months!
http://ads.osdn.com/?ad_id=2562alloc_id=6184op=click
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


Re: [linux-usb-devel] BUG when removing USB flash drive

2004-05-13 Thread Mike Anderson
Alan Stern [EMAIL PROTECTED] wrote:
 This BUG happened because the SCSI layer was still using the drive after 
 usb-storage had called scsi_remove_host().

In this case the scsi_remove_host is being called in a unexpected
disconnect case. Any IOs in flight will be canceled in the mid-layer
(i.e., they will not be chased down with calls to eh_abort_handler).

 
 This is an aspect of SCSI midlayer - low-level driver interaction that
 I'm not very clear about.  How long after the host is removed can the
 midlayer continue to use it?  How does the driver know when the midlayer
 is finished using the host so the driver can exit (and unload from
 memory)?

The LLDD queuecommand should not be called after the return of
scsi_remove_host (unless we have a bug). The LLDD should not free its
resources until the release function is called on the struct device
passed in during the scsi_add_host call. This release function could be
called post the scsi_host_dev_release function being called as it does a
put on this struct device. This chain of release calls is undefined as a
hotplug goes out on the remove calls, but if user space never umounts /
closes the device things will not fully cleanup.

-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
This SF.Net email is sponsored by: SourceForge.net Broadband
Sign-up now for SourceForge Broadband and get the fastest
6.0/768 connection for only $19.95/mo for the first 3 months!
http://ads.osdn.com/?ad_id=2562alloc_id=6184op=click
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


[linux-usb-devel] Re: bug 2400

2004-04-06 Thread Mike Anderson

PS, I am traveling today so future comments will be delayed a bit.

Alan Stern [EMAIL PROTECTED] wrote:
 All right, let's look at sd.c.  I'll show you that _it_ doesn't obey the 
 object lifetime rules.  In sd_open we see this code (lightly edited):
 
 
   static int sd_open(struct inode *inode, struct file *filp)
   {
   struct gendisk *disk = inode-i_bdev-bd_disk;
   struct scsi_disk *sdkp = scsi_disk(disk);
   struct scsi_device *sdev;
   int retval;
 
   retval = scsi_disk_get(sdkp);
   if (retval)
   return retval;
 
   sdev = sdkp-device;
 
 
 As it turns out, the block layer guarantees that when sd_open runs the
 bd_disk pointer will be valid.  It does this by following the pattern I
 mentioned in an earlier message -- drivers/base/map.c uses a
 subsystem-wide semaphore, domain_sem, to properly synchronize lookups and
 deletes.
 
 Next, the scsi_disk inline function returns:
 
   container_of(disk-private_data, struct scsi_disk, driver);
 
 How do you know that the scsi_disk pointed to by disk-private_data still 
 exists?  So far as I can see, the gendisk doesn't take any references to 
 it.  Correct me if I'm wrong, but there doesn't seem to be anything 
 preventing a disconnect event from arriving after the open() call has got 
 a valid reference to the gendisk, and succeeding in deallocating the 
 scsi_disk before this code executes.  There's only one reference between 
 the scsi_disk and the gendisk, and it goes the wrong way: the scsi_disk 
 owns a reference to the gendisk.
 
 But let's suppose that works okay, so sdkp is a valid pointer.  Then 
 the code calls scsi_disk_get(), which in turn calls scsi_device_get() for 
 sdkp-device.  How do you know that this doesn't point to deallocated 
 storage?  The only reference to the scsi_device is taken (in a rather 
 convoluted way) by the gendisk, and it is dropped during del_gendisk() -- 
 not when the gendisk is released.  Hence it is entirely possible for a 
 disconnect event to have freed the scsi_device when this code executes.
 
 
 There's two potential oopses for you.  I don't have a full grasp of the
 web of interlocking references (and interlocking code) in the SCSI,
 gendisk, and block layers, but it seems likely that at least one of
 these might actually happen.
 
 The object lifetime rules require that in your disconnect() routine, you
 must tell all your users that your structure is going away, but you must
 not free the structure until your users have notified you that they won't
 try to use it any more.  When the scsi_disk is on its way out, sd.c tells
 the gendisk but doesn't wait for a notification in return.  When the
 scsi_device is on its way out the SCSI core tells sd.c, but sd.c doesn't
 send back its notification at the right time.

As I previously stated there is no notification back from the block
layer that users are complete with a structure. Currently the only
method to prevent an oops looks like sd_remove would need to use
lock_kernel so that scsi could ensure that a user would be through open
which means the sd_remove would not take references to zero or the user
has not made it far enough through do open that they have received a
gendisk structure so that del_gendisk will ensure they do not call
sd_open.

I would like another method, but this looks like it would need to be
another shared sync mechanism between scsi layer (an other blk
interfaces) and block layer or a lookup method similar to kobj_lookup
in scsi open routines so that a object can be unmapped atomically.

-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of
GenToo technologies. Learn everything from fundamentals to system
administration.http://ads.osdn.com/?ad_id=1470alloc_id=3638op=click
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


[linux-usb-devel] Re: bug 2400

2004-04-02 Thread Mike Anderson
James Bottomley [EMAIL PROTECTED] wrote:
 
 Well, I know why this happens, but I'm not entirely clear how to fix it.
 
 The problem comes because the cdrom open and close take and release
 references to the SCSI generic device (as they're supposed to).
 
 However, Upper level Drivers like sr are implemented as generic device
 drivers in the driverfs model.
 
 When a USB unplug comes along, it calls scsi_remove_device, which
 eventually calls device_del().  The problem is that device_del triggers
 the -remove methods of all the attached drivers and the sr_remove
 method calls cdrom_unregister which throws away the cdrom device state,
 even though the actual device has active references.

Yes, we reordered some of this in sd. As your comment down below
indicates reordering will reduce the window but not eliminate it.

 
 Some time later, the device is closed but there's now bogus state
 because the sr_remove method has kfreed the struct scsi_cd which
 contains the struct cdrom_device_info.
 
 Now, the questions are, whose issue is this and how do we fix it?  I can
 see that a driver needs early notification of unplugs so it can deny all
 access to a gone device.  On the other hand, for a user land open where
 we still have to hold resources in the driver, we'd like the driver to
 have a notify when the device reference count drops to zero so we can
 clean up.
 
 This problem, by the way, exists in a lesser form for sd: the sd remove
 method will free the device for reattachment even though it might have
 active references.
 

I have looked at the sd issue off and on due to the previous open race
reported by Alan Stern. While the window can be narrowed inside SCSI you
need help for the calling subsystem to know when it is OK to cleanup and
your routine will not be called anymore. A similar problem also showed
up in the tear down the host directory entry in /proc/scsi but was only
fixed up so far due to its depreciated status.

http://marc.theaimsgroup.com/?t=10554517591r=1w=2

I believe as indicated above that all cross subsystem registrations need
a release / put callback. This would allow the release chain to be
called from block - ULD - scsi core - LLDD. 

Recently I have not been spending the proper time looking at this, but
last look it appeared that we needed to  add a release / put method call
to the gendisk disk_release routine. The release function or object to do
the put on would need to be set prior to the call to add_disk.

-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of
GenToo technologies. Learn everything from fundamentals to system
administration.http://ads.osdn.com/?ad_id=1470alloc_id=3638op=click
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


Re: [linux-usb-devel] Re: Unregistering interfaces

2004-04-02 Thread Mike Anderson
Alan Stern [EMAIL PROTECTED] wrote:
  You can search the lkml archives for the bugs that my kobject patch
  fixed, as there was some discussion there (otherwise I would have never
  written the patch...)
 
 I searched and found one thread that looks relevant:
 
 http://marc.theaimsgroup.com/?t=10635469474r=1w=2
 
 In the final message Mike said that the patch did fix the problem but that 
 a better answer would be to fix the SCSI sd driver.  If that's been taken 
 care of then there's no more need for the kobject patch, so far as SCSI is 
 concerned.  My test yesterday showed everything working okay without the 
 patch.  Mike, can you tell us the current state of the SCSI code?
 

I did change the sd scsi code to correct the order of sysfs calls on
tear down. I tried to do it all in sd / scsi core code, but there is
still a small window for a open / tear down race in sd. I believe the only
way to correct this is to add a release callback from the block layer.

There is also another thread that I believe the usb-devel list is copied
on that is a similar tear down issue.

-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of
GenToo technologies. Learn everything from fundamentals to system
administration.http://ads.osdn.com/?ad_id=1470alloc_id=3638op=click
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


[linux-usb-devel] Re: bug 2400

2004-04-02 Thread Mike Anderson
James Bottomley [EMAIL PROTECTED] wrote:
  Recently I have not been spending the proper time looking at this, but
  last look it appeared that we needed to  add a release / put method call
  to the gendisk disk_release routine. The release function or object to do
  the put on would need to be set prior to the call to add_disk.
 
 Actually, I can't believe this problem to be local entirely to SCSI. 
 So, a simpler mechanism (and more globally useful) might be to have a
 two phase driver release in sysfs:

I would agree this should be more general than SCSI, but if kobjects are
being passed during the registration / setup in other subsystem
interfaces than this could already be handled.

 
 - the current -remove would stay where it is (as a notify on
 device_del).  On receving this the driver begins clean up enough to drop
 any internal references to the device it is holding.
 - then introduce a -release which is called as part of dropping the
 last device reference where the driver cleans up any resources the
 driver was keeping to service the removed but not released device.
 
 Then, we'd obviously not call unregister_cdrom and kfree the scsi_cd
 structure until -release time.

Maybe some clarification here as I am unsure if we both think there
needs to be a notification (a put call) from outside SCSI. We have
release functions available on most objects in SCSI now. The issue is
that when we register (add_disk, dev_set_drvdata, etc.) or pass a handle
to another subsystem we need a reference count agreement to know when
the other subsystem is done with the the object. Something like the
put_device(parent) used in scsi_host_dev_release.

-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of
GenToo technologies. Learn everything from fundamentals to system
administration.http://ads.osdn.com/?ad_id=1470alloc_id=3638op=click
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


[linux-usb-devel] Re: bug 2400

2004-04-02 Thread Mike Anderson
James Bottomley [EMAIL PROTECTED] wrote:
 On Fri, 2004-04-02 at 11:45, Mike Anderson wrote:
  Maybe some clarification here as I am unsure if we both think there
  needs to be a notification (a put call) from outside SCSI. We have
  release functions available on most objects in SCSI now. The issue is
  that when we register (add_disk, dev_set_drvdata, etc.) or pass a handle
  to another subsystem we need a reference count agreement to know when
  the other subsystem is done with the the object. Something like the
  put_device(parent) used in scsi_host_dev_release.
 
 Actually, no, that's not the issue here, if I understand you.  The
 reference counting model on the sdev-sdev_gendev seems to be working
 correctly because sr.c takes a reference to the sdev_gendev on open and
 drops it on close.
 
 The problem is that ULDs are implemented as struct device_drivers and as
 such, their -remove method gets called *not* on last put of sdev_gendev
 but on device_del (when there are still active references).

The remove can do as much or as little as the implementor wishes, but I
believe there is still a under lying issue here (see comment below).
 
 sr.c frees the cdinfo structure on -remove, but still has its own
 reference to sdev_gendev (because the device is still open). On final
 close, the generic cdrom code tries to use cdinfo to close the device
 and references a kfree'd structure.  Really what sr.c wants to be doing
 is freeing the cdinfo structure on last put, not on device_del.
 

Where does the last put come from? How do you close the open race or
know when the final put_disk has been called? SCSI cannot do this alone
as we have created and registered an object in another subsystem
(alloc_disk and add_disk) and we have no indication when that objects
ref count has reached zero. 

Or

As I previous stated in the thread below I have the gendisk /
block layer locking mis-understood and there is something that SCSI can
do.

For reference you can look at this thread sent by Alan about a sd race.
http://marc.theaimsgroup.com/?t=10759118583r=1w=2

While freeing in sr could be rearranged more like what sd does there is
still the issue of a cross subsystem put to know that the ULDs open
function will not be called again.

-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of
GenToo technologies. Learn everything from fundamentals to system
administration.http://ads.osdn.com/?ad_id=1470alloc_id=3638op=click
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


[linux-usb-devel] Re: bug 2400

2004-04-02 Thread Mike Anderson
James Bottomley [EMAIL PROTECTED] wrote:
 = drivers/scsi/sr.c 1.103 vs edited =
 --- 1.103/drivers/scsi/sr.c   Fri Apr  2 11:30:44 2004
 +++ edited/drivers/scsi/sr.c  Fri Apr  2 17:29:06 2004
 @@ -424,8 +424,19 @@
  
  static int sr_block_release(struct inode *inode, struct file *file)
  {
 + int ret;
   struct scsi_cd *cd = scsi_cd(inode-i_bdev-bd_disk);
 - return cdrom_release(cd-cdi, file);
 + struct scsi_device *sdev = cd-device;
 + ret = cdrom_release(cd-cdi, file);
 + if(ret)
 + return ret;

It looks like cdrom_release always returns 0? Is there some other patch
outstanding that changes this.

-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of
GenToo technologies. Learn everything from fundamentals to system
administration.http://ads.osdn.com/?ad_id=1470alloc_id=3638op=click
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


[linux-usb-devel] Re: Problem with an USB Stick and kernel 2.6.1

2004-01-16 Thread Mike Anderson
Alan Stern [EMAIL PROTECTED] wrote:
 
 Seeing no sense key, usb-storage returns 0 indicating the command
 succeeded.  But scsi_finish_command() sees that valid sense data is
 present and sets the driver_byte to DRIVER_SENSE.
 
  SCSI error : 1 0 0 0 return code = 0x800
  Current sda: sense key No Sense
 
 This causes scsi_io_completion() to call scsi_end_request() with uptodate 
 = 0 and results in this error:
 
  end_request: I/O error, dev sda, sector 32
  FAT: unable to read boot sector
 
 How should we handle this?  Should we zero out sense_buffer[0] when 
 auto-sense shows sense key = NO_SENSE?
 

I would think the upper level driver should handle this. We decode
RECOVERED_ERROR in sd. sd could just handle NO_SENSE the same with /
without the print_sense. I noticed that st and osst under some
circumstances look at the NO_SENSE case (the tape people will need to
comment as this looks like a EOM workaround) so it would seem incorrect
for the LLDD as a general policy to say it received no sense when it
did.

-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
The SF.Net email is sponsored by EclipseCon 2004
Premiere Conference on Open Tools Development and Integration
See the breadth of Eclipse activity. February 3-5 in Anaheim, CA.
http://www.eclipsecon.org/osdn
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


[linux-usb-devel] Re: Problem with an USB Stick and kernel 2.6.1

2004-01-16 Thread Mike Anderson

This matches what I thought should be added to sd, but I did not hear an
answer from James / others if this is the way we should handle it.

Alan Stern [EMAIL PROTECTED] wrote:
 On Fri, 16 Jan 2004, Mike Anderson wrote:
 
  Alan Stern [EMAIL PROTECTED] wrote:
   
   Seeing no sense key, usb-storage returns 0 indicating the command
   succeeded.  But scsi_finish_command() sees that valid sense data is
   present and sets the driver_byte to DRIVER_SENSE.
   
SCSI error : 1 0 0 0 return code = 0x800
Current sda: sense key No Sense
   
   This causes scsi_io_completion() to call scsi_end_request() with uptodate 
   = 0 and results in this error:
   
end_request: I/O error, dev sda, sector 32
FAT: unable to read boot sector
   
   How should we handle this?  Should we zero out sense_buffer[0] when 
   auto-sense shows sense key = NO_SENSE?
   
  
  I would think the upper level driver should handle this. We decode
  RECOVERED_ERROR in sd. sd could just handle NO_SENSE the same with /
  without the print_sense. I noticed that st and osst under some
  circumstances look at the NO_SENSE case (the tape people will need to
  comment as this looks like a EOM workaround) so it would seem incorrect
  for the LLDD as a general policy to say it received no sense when it
  did.
 
 Florian and Alexander, does this patch fix your problem?  Mike, how does 
 it look to you?
 
 Alan Stern
 
 
 = sd.c 1.59 vs edited =
 --- 1.59/drivers/scsi/sd.cFri Oct 24 14:53:37 2003
 +++ edited/drivers/scsi/sd.c  Fri Jan 16 15:49:48 2004
 @@ -730,6 +730,14 @@
* hard error.
*/
   print_sense(sd, SCpnt);
 + /* FALLS THROUGH */
 +
 + case NO_SENSE:
 + /*
 +  * The low-level driver got the sense data but
 +  * everything was all right.  Don't treat this
 +  * an an error.
 +  */
   SCpnt-result = 0;
   SCpnt-sense_buffer[0] = 0x0;
   good_sectors = this_count;
 
-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
The SF.Net email is sponsored by EclipseCon 2004
Premiere Conference on Open Tools Development and Integration
See the breadth of Eclipse activity. February 3-5 in Anaheim, CA.
http://www.eclipsecon.org/osdn
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


Re: [linux-usb-devel] Re: [PATCH] oops in sd_shutdown

2003-08-14 Thread Mike Anderson
Greg KH [EMAIL PROTECTED] wrote:
 On Tue, Aug 12, 2003 at 07:53:53AM +0100, Christoph Hellwig wrote:
  On Tue, Aug 12, 2003 at 12:28:44AM +0200, Andries Brouwer wrote:
   I see an Oops in the SCSI code, caused by the fact that sdkp is NULL
   in sd_shutdown. How can that be?, you will ask - dev-driver_data was set
   in sd_probe. But in my case sd_probe never finished. An insmod usb-storage
   hangs forever, or at least for more than six hours, giving ample opportunity
   to observe this race between sd_probe and sd_shutdown.
   (Of course sd_probe hangs in sd_revalidate disk.)
  
  Well, this same problem could show upb in any other driver.  Could
  you instead send a patch to Pat that the driver model never calls
  the shutdown method for a driver that hasn't finished -probe?
 
 I think it already will not do that due to taking the bus-subsys.rwsem
 before calling either probe() or remove().
 

Is the shutdown being called directly? The shutdown call is protected by
a different rwsem. Depending on the call graph setting dev-driver on
return of probe may provide a solution. I have not looked at all probe
routines to understand if this would cause any bad side effects.

Andries,
Can you send the oops output?

-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
This SF.Net email sponsored by: Free pre-built ASP.NET sites including
Data Reports, E-commerce, Portals, and Forums are available now.
Download today and enter to win an XBOX or Visual Studio .NET.
http://aspnet.click-url.com/go/psa0013ave/direct;at.aspnet_072303_01/01
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


[linux-usb-devel] Re: [PATCH / RFC] scsi_set_host_offline

2003-03-25 Thread Mike Anderson
Matthew Dharm [EMAIL PROTECTED] wrote:
 Mike --
 
 As far as I can tell, this patch was never accepted.  Any idea when this
 will be resolved?
 
 usb-storage is still waiting to implement proper device unplug...
 

I reposted the patch with refreshed offsets for the newer kernel
version.

-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
This SF.net email is sponsored by:
The Definitive IT and Networking Event. Be There!
NetWorld+Interop Las Vegas 2003 -- Register today!
http://ads.sourceforge.net/cgi-bin/redirect.pl?keyn0001en
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58

2003-02-26 Thread Mike Anderson
Matthew,
Sorry for the delay in replying (non coding activities are
consuming to many hours).

Matthew Dharm [EMAIL PROTECTED] wrote:
 Okay, I see Linus has now accepted this into his tree.  It should propagate
 to the USB development trees soon.
 
 One question: What else is needed?  We set the device offline,
 error/complete all pending commands, and the need to (somehow) make certain
 we're in a good state for calling scsi_remove_host().  How do we make that
 final guarantee?
 
 There was talk that scsi_set_device_offline() would take care of that for
 us by waking up the error handler... there seems to be code to do that
 

Yes,

The scsi_set_device_offline will wake up the error handler to abort
outstanding commands.

 There was talk of using the release() function from the SCSI template to
 actually release resources
 
 So, what's the plan?

There still are a few things on the to do list, but should not effect the
LLDD interface (at least this is the goal).
- scsi_request_fn needs a fix for device offline that will
  handle all request types.
- scsi_remove_host needs to call template release at the correct
  time (ref counting ??).
- need fix for offline hotplug event.
- Should do_hotplug be exported or should device states
  be added / fixed ??

Cleanups
- Change scsi_remove_host for int to void function.


-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
This SF.net email is sponsored by: Scholarships for Techies!
Can't afford IT training? All 2003 ictp students receive scholarships.
Get hands-on training in Microsoft, Cisco, Sun, Linux/UNIX, and more.
www.ictp.com/training/sourceforge.asp
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58

2003-02-26 Thread Mike Anderson
Matthew Dharm [EMAIL PROTECTED] wrote:
 Right... but I removed the release() function because that was marked (in
 the documentation) as only for the old-style drivers.  So I'll need to
 re-introduce it -- but it looks like all it has to do is free some memory.
 Does that sound about right?

Yes it was previously removed, but IIRC this is the direction
discussed on this thread. For a idle device the release functionality could
be done in the context of the scsi_remove_host call, but for a busy
device we need to have this call to clean up later.

-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
This SF.net email is sponsored by: Scholarships for Techies!
Can't afford IT training? All 2003 ictp students receive scholarships.
Get hands-on training in Microsoft, Cisco, Sun, Linux/UNIX, and more.
www.ictp.com/training/sourceforge.asp
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel


Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58

2003-02-17 Thread Mike Anderson
Matthew Dharm [[EMAIL PROTECTED]] wrote:
 Any updates on this?  I saw some patches, but they don't seem to be in my
 tree (the usb tree, which is synced from Linus' tree).
 
 People are starting to reports OOPSes to me because of this being
 missing
 
 Matt
 

The scsi_set_device_offline interface is part of the last patch (scsi
error) I sent to linux-scsi. I updated my patch post some comments from
the list, but I am working on issue with the patch before I resend.

-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
This sf.net email is sponsored by:ThinkGeek
Welcome to geek heaven.
http://thinkgeek.com/sf
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel



Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58

2003-02-03 Thread Mike Anderson
Sorry Matthew I got side tracked on some issues for the last week. The
scsi_set_device_offline(); function has not been added to any of James
linux-scsi trees. You could add a ifndef in your code until we get the
interface in the tree. 

Would scsi_set_device_offline() do more than sdev-online = 0? It
depends on the state of the device at the calling of the function.

I currently have scsi_set_device_offline() trying to do the following:
1.) set device offline and mark host in_recovery.
2.) mark all outstanding commands to be canceled and wake up
error handler.
3.) flush the request queue.
4.) Once device is really offline send hotplug event.

(2) needs some changes in the error handler which are needed in other
cases. This the large part of the change, but it is not directly part of
your request.

(3) need a cleaner way to flush request specials off the queue . It
would also be nice if there was a method to stop the incoming side of the
request queue.

(4) need export of do_hotplug interface or a method to generate a call
to it for an offline event.

I am working on these changes pretty much in order.

Doug suggested a scsi_schedule_host_removal(), but I thought we could
just change scsi_remove_host() to handle this task unless there is a
side effect that all callers would not want???.

-andmike

Matthew Dharm [[EMAIL PROTECTED]] wrote:
 Willem Riede [EMAIL PROTECTED] suggested to me that I simply set
 sdev-online = 0 for scsi_set_device_offline()
 
 Any reason that isn't good enough?
 
 On Sun, Feb 02, 2003 at 10:13:17AM -0800, Matthew Dharm wrote:
  So, was any of this ever implemented?  As far as I can tell, the required
  changes were:
  
  (o) addition of scsi_schedule_host_removal() (possibly optional)
  (o) implementation of scsi_set_device_offline() (possibly optional)
  (o) change the behavior of the 'hotplug initialization model' to call my
  release function
  
  Matt
   then call the set_scsi_device_offline(), then send back any outstanding
   commands via scsi_done(), then possibly call the
   scsi_schedule_host_removal() if Mike adds that function.  Then return.  
   Don't to anything else.  Take all the remaining code you would normally
   run at this point and put it into a function in your source called
   usb_release() and in your Scsi_Host_Template struct that you pass to the
   scsi layer, init the release pointer with the address of your
   usb_release() routine.  That way, the scsi layer can do what it's best at,
   taking care of the clean up details we've been talking about, and when
   it's all done, it will call your usb_release() routine with a single
   argument of the host struct you are wanting released.  At that point, you
   can do all the freeing you would have done in that khubd loop (at least I
   think that's the context you are doing the freeing from now) and know for
   a fact that not only are you freeing everything, but so is the scsi mid
   layer.  I think this will solve all the issues you've had, because this
   *won't* leak, it won't block your other actions, and it lets the scsi
   subsystem clean up properly.


-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
This SF.NET email is sponsored by:
SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
http://www.vasoftware.com
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel



Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58

2003-01-24 Thread Mike Anderson
Doug,
I started writing the interface you put forth in your email. I
am currently debugging it in UML so I can generate the error
conditions in a control manner. I still have some stuff to look
at in the error handler with it running in this mode as it
previously expected no one else to be possibly doing operations
on the host. This could be the case if other LLDD's use this
interface and have another device that happens to timeout an IO
post a device being set offline.

A clarification question below.


Doug Ledford [[EMAIL PROTECTED]] wrote:

 So, what you should be doing in 
 order to be both a nice scsi host that plays well with the generic 
 mechanism we have in place is when you get this removal event, you should 
 be free'ing all the state you needed about the usb bus and such and taking 
 this usb device off line or whatever you do.  Then let the scsi mid layer 
 clean up at it's leisure.  You don't need to worry about it because the 
 only thing you will have left is to wait for the scsi subsys to call you 
 when it's time to delete things.  You don't even have to keep device 
 references around because we pass those in to your deletion routines 
 anyway.
 
  I want to be able to inform the SCSI mid-layer, which will then inform
  higher layers, that the device is gone.
 
 scsi_set_device_offline() as we've been discussing.
 

I assumed that the hotplug event would only come from this function if
no commands where outstanding. If there where commands outstanding the
event would not be generated until the error handler gained ownership of
all the commands.

-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
This SF.NET email is sponsored by:
SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
http://www.vasoftware.com
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel



Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58

2003-01-24 Thread Mike Anderson
Alan Stern [[EMAIL PROTECTED]] wrote:
 On Fri, 24 Jan 2003, Luben Tuikov wrote:
 
  A LLDD should and must *not* call scsi_unregister_host().  This brakes
  all hierarchy.
   
  
  What I probably meant is the detect()/release() pair; release() itself
  normally calls scsi_unregister(host); the point is that it got nudged
  from ``above'', i.e. SCSI Core.
  
  How can a LLDD be certain that it can safely call scsi_unregister_host()
  whenever it wishes?  As Doug pointed out this leads to problems.
 
 Apparently it can't.  I don't mean to say that this was the right thing to 
 do; I just meant that this is what Matt's currently-proposed patch does.  
 Personally, I'm not very familiar with the details of the SCSI subsystem, 
 and I don't know what preconditions are required for calling the various 
 API's.
 
  Furhtermore, are we talking about scsi_unregister_host() or
  scsi_unregister(host)?  The former does drivers and the latter
  does hosts.  This would mean that my original statement was
  nevertheless correct, how can a LLDD decide to unload itself safely?
 
 I did indeed type it wrong.  The code first calls scsi_remove_host(host)  
 and then it calls scsi_unregister(host).
 

I probably should have looked at Matt's patch closer, sorry. If a LLDD is
going to be using scsi_add_host and scsi_remove_host the driver should
not use scsi_register_host / scsi_unregister_host. 

If the driver is updated to the sysfs driver model then:

1.) The drivers probe routine should call into the scsi mid with.
scsi_register(...);
scsi_add_host(...);

2.) The drivers remove routine should call into the scsi mid with.

scsi_remove_host(...);
scsi_unregister(...);

(scsi_remove_host is part of this current discussion).

The event that calls probe / remove is the device_register /
device_unregister of the the adapter device.

The LLDD's device_initcall / module_exit routines will call
driver_register and driver_unregister to cause device to driver
binding. Which will cause probe / remove to be called.

-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
This SF.NET email is sponsored by:
SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
http://www.vasoftware.com
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel



Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58

2003-01-24 Thread Mike Anderson
Luben Tuikov [[EMAIL PROTECTED]] wrote:
 Mike Anderson wrote:
 Doug,
  I started writing the interface you put forth in your email.
 
 Do you mind clarifying?  Either it was a private email, or
 one posted here, in which case there was an interpretation.

It was posted here at the bottom of this email
http://marc.theaimsgroup.com/?l=linux-scsim=104335366403485w=2

It is a starting point. 

-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
This SF.NET email is sponsored by:
SourceForge Enterprise Edition + IBM + LinuxWorld = Something 2 See!
http://www.vasoftware.com
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel



Re: [linux-usb-devel] Re: [PATCH] USB changes for 2.5.58

2003-01-17 Thread Mike Anderson
Oliver and Alan I am trying to catch up on this thread so I did not
reply directly to your concerns, but I think they are covered below.

Matthew Dharm [[EMAIL PROTECTED]] wrote:
 On Fri, Jan 17, 2003 at 11:55:36AM +0100, Oliver Neukum wrote:
  That is simply wrong. Reporting somebody having pulled a plug must
  not fail. What are you supposed to do with an error here?
  
  There must be a way for a LLD to report that reliably.
  If the answer is, take that lock, call that function, error all pending
  requests, release that lock and call that function, it's OK.
  
  But it must work in all cases.
 
 I absolutely agree.  The device is gone.  I can't do anything about it.
 If the SCSI layer decides it can't let go, what am I supposed to do about
 it?
 
 In a separate discussion with Mike, he mentioned that you can't
 scsi_remove_device() unless there are no pending commands.
 
 How the hell is an LLD supposed to assure that!?!?
 

I believe that the scsi_remove_host function the way it is currently is
not the correct function. The SCSI needs to separate the device gone
from freeing. There maybe some unbounded cleanup as the request_queue
that is exported to the block layer is part of the scsi_device which is
a child of the virtual usb SCSI adapter. The only way to reduce the
unbounded time is possibly we reorganizing some sysfs tree object
layouts.

 The minute I error a command and call scsi_done(), I can get a new one.
 Unless I lock out requests with scsi_block_requests(), but that comes with
 major warnings about needing to get unblocked.
 

Well in the case of the device really being gone does the LLD need to be
worried about being unblocked. I get the feeling from this thread that
this is probably the wrong interface.

 The way this should work is that the LLD calls scsi_remove_device(), and
 that cuts off the flow of commands.  The LLD can promise to error-out any
 pending commands in the device command queue.
 
 That is, unless scsi_block_requests() and scsi_unblock_requests() are more
 useful than the documentation suggests... block(), error all commands, 
 unregister()... that would make some sense.  We could call
 scsi_block_request() as soon as we know the unit is gone, and unregister()
 as soon as the queue is empty.


We should really ensure that we have good separation between stopping device
IO, device gone, and release resources.

- SCSI seems to have the flags to stop the IO, but instead of
  scsi_block_requests we may want to export the setting of
  device online. This can be done from sysfs now, but
  not from the driver ( the driver does have a handle to the
  device, but it would be better to have an interface in case we
  need to do something addition operations).

- Possibly add a scsi_remove_device that would always succeed and
  a version of scsi_remove_host that calls scsi_remove_device for
  all devices. Though with the recent change to SCSI remove host to
  allow non sysfs device registration I do not believe we could
  ensure devices would be cleaned up.

- SCSI would need to support ref counting so that resources are
  not removed to soon.


-andmike
--
Michael Anderson
[EMAIL PROTECTED]



---
This SF.NET email is sponsored by: Thawte.com - A 128-bit supercerts will
allow you to extend the highest allowed 128 bit encryption to all your 
clients even if they use browsers that are limited to 40 bit encryption. 
Get a guide here:http://ads.sourceforge.net/cgi-bin/redirect.pl?thaw0030en
___
[EMAIL PROTECTED]
To unsubscribe, use the last form field at:
https://lists.sourceforge.net/lists/listinfo/linux-usb-devel