Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-14 Thread Boaz Harrosh
On Wed, Feb 13 2008 at 19:36 +0200, James Bottomley [EMAIL PROTECTED] wrote:
snip
 ---
 From: Boaz Harrosh [EMAIL PROTECTED]
 Subject: [PATCH] gdth: bugfix for the at-exit problems

 gdth_exit would first remove all cards then stop the timer
 and would not sync with the timer function. This caused a crash
 in gdth_timer() when module was unloaded.
 So del_timer_sync the timer before we delete the cards.

 also the reboot notifier function would crash. So unify
 the exit and halt functions with a gdth_shutdown() that's
 called by both.

 Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
 ---
snip
 +static struct notifier_block gdth_notifier = {
 +gdth_halt, NULL, 0
 +};
 +
 +bool gdth_shutdown_done;
 

right forgot the static. But I use it in gdth_init(), so it
must be external. Unless you promise me that gdth_init() will
never ever be called after a call to shutdown.
Any way the hot-plug patch changes all that. This is only
for 2.6.24 bugfixs.

 Static police alert!  Just make it static and move it into
 gdth_shutdown()
 
 +static void gdth_shutdown()
 +{
 +gdth_ha_str *ha;
 +if (gdth_shutdown_done)
 +return;
 +
 +gdth_shutdown_done = true;
 +unregister_chrdev(major,gdth);
 +unregister_reboot_notifier(gdth_notifier);
 
 I'm not sure you can do this, aren't reboot notifiers called with the
 rwsem held?  In which case the unregister which also takes the rwsem
 will hang the system.
 
humm, can't remove a notifier from within the notifier. Thanks James for 
the catch, it's what happens when you don't test your own patches.

I have moved unregister_reboot_notifier to gdth_exit.
 James
 

Will send a new version for review. Please note that this is a bugfix patch
on top of 2.6.24. It is not needed for Jeff's hot-plug path.

There will be one more bugfix patch for a crash at the user-mode ioctl code.

Boaz


-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-14 Thread Boaz Harrosh
On Wed, Feb 13 2008 at 21:38 +0200, Jan Engelhardt [EMAIL PROTECTED] wrote:
 On Feb 13 2008 11:03, Boaz Harrosh wrote:
 I've tested this patch now - and it works fine. Now rmmod, halt and 
 reboot also works.

 Stefan Priebe

 This is grate news Stefan. Thank you very much for all your time
 and effort, with out we could not have fixed all this.
 
 Do you have a git tree with the latest pieces?
No, scsi-misc I guess ;)

I could put it here:
git://git.bhalevy.com/open-osd gdth

branch give me an hours

Boaz
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-13 Thread Boaz Harrosh
On Wed, Feb 13 2008 at 9:06 +0200, Stefan Priebe - allied internet ag [EMAIL 
PROTECTED] wrote:
 Hello!
 
 I've tested this patch now - and it works fine. Now rmmod, halt and 
 reboot also works.
 
 Stefan Priebe
 
This is grate news Stefan. Thank you very much for all your time
and effort, with out we could not have fixed all this.

Boaz
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-13 Thread Boaz Harrosh
On Tue, Feb 12 2008 at 19:40 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote:
 gdth _exit would first remove all cards then stop the timer
 and would not sync with the timer function. This caused a crash
 in gdth_timer() when module was unloaded.
 
 del_timer_sync the timer before we delete the cards.
 
 NOT YET TESTED
 
 Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]

Tested-by: Stefan Priebe [EMAIL PROTECTED]

 ---
  drivers/scsi/gdth.c |   15 ---
  1 files changed, 8 insertions(+), 7 deletions(-)
 
 diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
 index 8eb78be..103280e 100644
 --- a/drivers/scsi/gdth.c
 +++ b/drivers/scsi/gdth.c
 @@ -3793,6 +3793,8 @@ static void gdth_timeout(ulong data)
  gdth_ha_str *ha;
  ulong flags;
  
 +BUG_ON(list_empty(gdth_instances));
 +
  ha = list_first_entry(gdth_instances, gdth_ha_str, list);
  spin_lock_irqsave(ha-smp_lock, flags);
  
 @@ -5146,8 +5148,6 @@ static void gdth_remove_one(gdth_ha_str *ha)
   ha-sdev = NULL;
   }
  
 - gdth_flush(ha);
 -
   if (shp-irq)
   free_irq(shp-irq,ha);
  
 @@ -5245,14 +5245,15 @@ static void __exit gdth_exit(void)
  {
   gdth_ha_str *ha;
  
 - list_for_each_entry(ha, gdth_instances, list)
 - gdth_remove_one(ha);
 + unregister_chrdev(major,gdth);
 + unregister_reboot_notifier(gdth_notifier);
  
  #ifdef GDTH_STATISTICS
 - del_timer(gdth_timer);
 + del_timer_sync(gdth_timer);
  #endif
 - unregister_chrdev(major,gdth);
 - unregister_reboot_notifier(gdth_notifier);
 +
 + list_for_each_entry(ha, gdth_instances, list)
 + gdth_remove_one(ha);
  }
  
  module_init(gdth_init);

James please put this patch in rc-fixes also. It has now been tested
by few people, and it solves a reproducible problem in the unloading
of the driver.

It was not yet confirmed by Andrew's reporter with the:
+if (list_empty(gdth_instances))
+   return;

at gdth_timer() In -mm tree. In my patch I have converted the if() to a 
BUG_ON because now it should not happen. But I figure it is not worse then 
what there is now, which is nothing.

With your recommendation I will push both patches to the stable branches
People have emailed me requesting it.

Thanks
Boaz
 
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-13 Thread James Bottomley
On Tue, 2008-02-12 at 19:40 +0200, Boaz Harrosh wrote:
 - gdth_flush(ha);
 -

This piece doesn't look right.  gdth_flush() forces the internal cache
to disk backing.  If you remove it, you're taking the chance that the
machine will be powered off without a writeback which can cause data
corruption.

James


-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-13 Thread Boaz Harrosh
On Wed, Feb 13 2008 at 17:44 +0200, James Bottomley [EMAIL PROTECTED] wrote:
 On Tue, 2008-02-12 at 19:40 +0200, Boaz Harrosh wrote:
 -gdth_flush(ha);
 -
 
 This piece doesn't look right.  gdth_flush() forces the internal cache
 to disk backing.  If you remove it, you're taking the chance that the
 machine will be powered off without a writeback which can cause data
 corruption.
 
 James
 
Yes. 
I have more problems reported, with exit, and am just sending one more patch 
that puts
this back in. Which was tested.

So I will resend this one plus one new one.

Boaz
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-13 Thread James Bottomley

On Wed, 2008-02-13 at 18:33 +0200, Boaz Harrosh wrote:
 On Wed, Feb 13 2008 at 17:54 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote:
  On Wed, Feb 13 2008 at 17:44 +0200, James Bottomley [EMAIL PROTECTED] 
  wrote:
  On Tue, 2008-02-12 at 19:40 +0200, Boaz Harrosh wrote:
  - gdth_flush(ha);
  -
  This piece doesn't look right.  gdth_flush() forces the internal cache
  to disk backing.  If you remove it, you're taking the chance that the
  machine will be powered off without a writeback which can cause data
  corruption.
 
  James
 
  Yes. 
  I have more problems reported, with exit, and am just sending one more 
  patch that puts
  this back in. Which was tested.
  
  So I will resend this one plus one new one.
  
  Boaz
  
 
 The gdth driver would do a register_reboot_notifier(gdth_notifier);
 to a gdth_halt() function, which would then redo half of what gdth_exit
 does, and wrongly so, and crash.  
 
 Are we guaranteed in todays kernel that modules .exit function be called
 on an halt or reboot? If so then there is no need for duplications and
 the gdth_halt() should go.

No.  The __exit section is actually discardable if you promise never to
remove the module.

James


-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-13 Thread Boaz Harrosh
On Wed, Feb 13 2008 at 17:54 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote:
 On Wed, Feb 13 2008 at 17:44 +0200, James Bottomley [EMAIL PROTECTED] wrote:
 On Tue, 2008-02-12 at 19:40 +0200, Boaz Harrosh wrote:
 -   gdth_flush(ha);
 -
 This piece doesn't look right.  gdth_flush() forces the internal cache
 to disk backing.  If you remove it, you're taking the chance that the
 machine will be powered off without a writeback which can cause data
 corruption.

 James

 Yes. 
 I have more problems reported, with exit, and am just sending one more patch 
 that puts
 this back in. Which was tested.
 
 So I will resend this one plus one new one.
 
 Boaz
 

The gdth driver would do a register_reboot_notifier(gdth_notifier);
to a gdth_halt() function, which would then redo half of what gdth_exit
does, and wrongly so, and crash.  

Are we guaranteed in todays kernel that modules .exit function be called
on an halt or reboot? If so then there is no need for duplications and
the gdth_halt() should go.

Submitted a patch that replaces the previous one I submitted with a deeper
fix. 
[PATCH] gdth: bugfix for the at-exit problems

If you ask me this all gdth_flush() is a crackup. sd and scsi-ml are doing 
scsi FLUSH commands when ever is needed. The controller as no business caching
data in memory longer then what is stated in standard. Raid controller or no 
raid
controller. Virtual or not virtual device. Data on Plate means data on plate.
What if there is a power outage? what the driver can do then?

Boaz
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-13 Thread Boaz Harrosh
On Wed, Feb 13 2008 at 18:45 +0200, James Bottomley [EMAIL PROTECTED] wrote:
 On Wed, 2008-02-13 at 18:33 +0200, Boaz Harrosh wrote:
 On Wed, Feb 13 2008 at 17:54 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote:
 On Wed, Feb 13 2008 at 17:44 +0200, James Bottomley [EMAIL PROTECTED] 
 wrote:
 On Tue, 2008-02-12 at 19:40 +0200, Boaz Harrosh wrote:
 - gdth_flush(ha);
 -
 This piece doesn't look right.  gdth_flush() forces the internal cache
 to disk backing.  If you remove it, you're taking the chance that the
 machine will be powered off without a writeback which can cause data
 corruption.

 James

 Yes. 
 I have more problems reported, with exit, and am just sending one more 
 patch that puts
 this back in. Which was tested.

 So I will resend this one plus one new one.

 Boaz

 The gdth driver would do a register_reboot_notifier(gdth_notifier);
 to a gdth_halt() function, which would then redo half of what gdth_exit
 does, and wrongly so, and crash.  

 Are we guaranteed in todays kernel that modules .exit function be called
 on an halt or reboot? If so then there is no need for duplications and
 the gdth_halt() should go.
 
 No.  The __exit section is actually discardable if you promise never to
 remove the module.
 
I don't understand please explain. 
What does a driver need to do if it needs a consistent shutdown retine?
module or built in? unload or shutdown?


 James
 
 

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-13 Thread Boaz Harrosh
On Wed, Feb 13 2008 at 19:03 +0200, James Bottomley [EMAIL PROTECTED] wrote:
 On Wed, 2008-02-13 at 18:50 +0200, Boaz Harrosh wrote:
 On Wed, Feb 13 2008 at 18:45 +0200, James Bottomley [EMAIL PROTECTED] 
 wrote:
 On Wed, 2008-02-13 at 18:33 +0200, Boaz Harrosh wrote:
 On Wed, Feb 13 2008 at 17:54 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote:
 On Wed, Feb 13 2008 at 17:44 +0200, James Bottomley [EMAIL PROTECTED] 
 wrote:
 On Tue, 2008-02-12 at 19:40 +0200, Boaz Harrosh wrote:
 -   gdth_flush(ha);
 -
 This piece doesn't look right.  gdth_flush() forces the internal cache
 to disk backing.  If you remove it, you're taking the chance that the
 machine will be powered off without a writeback which can cause data
 corruption.

 James

 Yes. 
 I have more problems reported, with exit, and am just sending one more 
 patch that puts
 this back in. Which was tested.

 So I will resend this one plus one new one.

 Boaz

 The gdth driver would do a register_reboot_notifier(gdth_notifier);
 to a gdth_halt() function, which would then redo half of what gdth_exit
 does, and wrongly so, and crash.  

 Are we guaranteed in todays kernel that modules .exit function be called
 on an halt or reboot? If so then there is no need for duplications and
 the gdth_halt() should go.
 No.  The __exit section is actually discardable if you promise never to
 remove the module.

 I don't understand please explain. 
 What does a driver need to do if it needs a consistent shutdown retine?
 module or built in? unload or shutdown?
 
 It needs to register a reboot notifier, which gdth does.
 
 However, the notifier is only called on reboot, so it also needs to
 clean up correctly on module exit as well.
 
 The alternative for GDTH would be to process the SCSI SYNCHRONIZE CACHE
 command.  That's done by a shutdown notifier from sd, so the correct
 thing would always get done; however it does mean the driver has to be
 in a condition to process the last sync cache command.
 
 For the quick fix, just keep the current infrastructure and put back the
 gdth_flush() command where it can be effective.
 
 James
 
 
Totally untested.

---
From: Boaz Harrosh [EMAIL PROTECTED]
Subject: [PATCH] gdth: bugfix for the at-exit problems

gdth_exit would first remove all cards then stop the timer
and would not sync with the timer function. This caused a crash
in gdth_timer() when module was unloaded.
So del_timer_sync the timer before we delete the cards.

also the reboot notifier function would crash. So unify
the exit and halt functions with a gdth_shutdown() that's
called by both.

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/gdth.c |   99 --
 1 files changed, 40 insertions(+), 59 deletions(-)

diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
index 8eb78be..7bb9b45 100644
--- a/drivers/scsi/gdth.c
+++ b/drivers/scsi/gdth.c
@@ -183,7 +183,6 @@ static int gdth_ioctl(struct inode *inode, struct file 
*filep,
   unsigned int cmd, unsigned long arg);
 
 static void gdth_flush(gdth_ha_str *ha);
-static int gdth_halt(struct notifier_block *nb, ulong event, void *buf);
 static int gdth_queuecommand(Scsi_Cmnd *scp,void (*done)(Scsi_Cmnd *));
 static int __gdth_queuecommand(gdth_ha_str *ha, struct scsi_cmnd *scp,
struct gdth_cmndinfo *cmndinfo);
@@ -418,12 +417,6 @@ static inline void gdth_set_sglist(struct scsi_cmnd *cmd,
 #include gdth_proc.h
 #include gdth_proc.c
 
-/* notifier block to get a notify on system shutdown/halt/reboot */
-static struct notifier_block gdth_notifier = {
-gdth_halt, NULL, 0
-};
-static int notifier_disabled = 0;
-
 static gdth_ha_str *gdth_find_ha(int hanum)
 {
gdth_ha_str *ha;
@@ -3793,6 +3786,8 @@ static void gdth_timeout(ulong data)
 gdth_ha_str *ha;
 ulong flags;
 
+BUG_ON(list_empty(gdth_instances));
+
 ha = list_first_entry(gdth_instances, gdth_ha_str, list);
 spin_lock_irqsave(ha-smp_lock, flags);
 
@@ -4668,45 +4663,6 @@ static void gdth_flush(gdth_ha_str *ha)
 }
 }
 
-/* shutdown routine */
-static int gdth_halt(struct notifier_block *nb, ulong event, void *buf)
-{
-gdth_ha_str *ha;
-#ifndef __alpha__
-gdth_cmd_strgdtcmd;
-charcmnd[MAX_COMMAND_SIZE];   
-#endif
-
-if (notifier_disabled)
-return NOTIFY_OK;
-
-TRACE2((gdth_halt() event %d\n,(int)event));
-if (event != SYS_RESTART  event != SYS_HALT  event != SYS_POWER_OFF)
-return NOTIFY_DONE;
-
-notifier_disabled = 1;
-printk(GDT-HA: Flushing all host drives .. );
-list_for_each_entry(ha, gdth_instances, list) {
-gdth_flush(ha);
-
-#ifndef __alpha__
-/* controller reset */
-memset(cmnd, 0xff, MAX_COMMAND_SIZE);
-gdtcmd.BoardNode = LOCALBOARD;
-gdtcmd.Service = CACHESERVICE;
-gdtcmd.OpCode = GDT_RESET;
-TRACE2((gdth_halt(): reset controller %d\n, ha-hanum));
-gdth_execute(ha-shost, gdtcmd, 

Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-13 Thread James Bottomley
On Wed, 2008-02-13 at 19:18 +0200, Boaz Harrosh wrote:
 On Wed, Feb 13 2008 at 19:03 +0200, James Bottomley [EMAIL PROTECTED] wrote:
  It needs to register a reboot notifier, which gdth does.
  
  However, the notifier is only called on reboot, so it also needs to
  clean up correctly on module exit as well.
  
  The alternative for GDTH would be to process the SCSI SYNCHRONIZE CACHE
 
 Why would we think that the controller does not support this command
 is it not in the mandatory section of the standard?

Um, because the controller isn't a SCSI device.  It's an emulated device
which means the SCSI comands are processed in the driver.  It does look
like the driver-HBA communication is some sort of translated dialect
of SCSI.

  command.  That's done by a shutdown notifier from sd, so the correct
  thing would always get done; however it does mean the driver has to be
  in a condition to process the last sync cache command.
 
 Why would it not be ready? what do other drivers do?
 The drivers is ready until the very last module's .exit. Is that good
 enough?

shutdown is called as part of device removal and module unload ...
usually from scsi_remove_host().  So you can't tear down command
processing before that point.

  
  For the quick fix, just keep the current infrastructure and put back the
  gdth_flush() command where it can be effective.
  
 
 Just did. But if needed I would prefer to emulate the SCSI SYNCHRONIZE CACHE
 command and not that boot notifier thing. Please advise.

I think such a change, though desirable, would be too large to count as
a bug fix.

James


-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-13 Thread James Bottomley
On Wed, 2008-02-13 at 18:50 +0200, Boaz Harrosh wrote:
 On Wed, Feb 13 2008 at 18:45 +0200, James Bottomley [EMAIL PROTECTED] wrote:
  On Wed, 2008-02-13 at 18:33 +0200, Boaz Harrosh wrote:
  On Wed, Feb 13 2008 at 17:54 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote:
  On Wed, Feb 13 2008 at 17:44 +0200, James Bottomley [EMAIL PROTECTED] 
  wrote:
  On Tue, 2008-02-12 at 19:40 +0200, Boaz Harrosh wrote:
  -   gdth_flush(ha);
  -
  This piece doesn't look right.  gdth_flush() forces the internal cache
  to disk backing.  If you remove it, you're taking the chance that the
  machine will be powered off without a writeback which can cause data
  corruption.
 
  James
 
  Yes. 
  I have more problems reported, with exit, and am just sending one more 
  patch that puts
  this back in. Which was tested.
 
  So I will resend this one plus one new one.
 
  Boaz
 
  The gdth driver would do a register_reboot_notifier(gdth_notifier);
  to a gdth_halt() function, which would then redo half of what gdth_exit
  does, and wrongly so, and crash.  
 
  Are we guaranteed in todays kernel that modules .exit function be called
  on an halt or reboot? If so then there is no need for duplications and
  the gdth_halt() should go.
  
  No.  The __exit section is actually discardable if you promise never to
  remove the module.
  
 I don't understand please explain. 
 What does a driver need to do if it needs a consistent shutdown retine?
 module or built in? unload or shutdown?

It needs to register a reboot notifier, which gdth does.

However, the notifier is only called on reboot, so it also needs to
clean up correctly on module exit as well.

The alternative for GDTH would be to process the SCSI SYNCHRONIZE CACHE
command.  That's done by a shutdown notifier from sd, so the correct
thing would always get done; however it does mean the driver has to be
in a condition to process the last sync cache command.

For the quick fix, just keep the current infrastructure and put back the
gdth_flush() command where it can be effective.

James


-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-13 Thread Boaz Harrosh
On Wed, Feb 13 2008 at 19:03 +0200, James Bottomley [EMAIL PROTECTED] wrote:
 On Wed, 2008-02-13 at 18:50 +0200, Boaz Harrosh wrote:
 On Wed, Feb 13 2008 at 18:45 +0200, James Bottomley [EMAIL PROTECTED] 
 wrote:
 On Wed, 2008-02-13 at 18:33 +0200, Boaz Harrosh wrote:
 On Wed, Feb 13 2008 at 17:54 +0200, Boaz Harrosh [EMAIL PROTECTED] wrote:
 On Wed, Feb 13 2008 at 17:44 +0200, James Bottomley [EMAIL PROTECTED] 
 wrote:
 On Tue, 2008-02-12 at 19:40 +0200, Boaz Harrosh wrote:
 -   gdth_flush(ha);
 -
 This piece doesn't look right.  gdth_flush() forces the internal cache
 to disk backing.  If you remove it, you're taking the chance that the
 machine will be powered off without a writeback which can cause data
 corruption.

 James

 Yes. 
 I have more problems reported, with exit, and am just sending one more 
 patch that puts
 this back in. Which was tested.

 So I will resend this one plus one new one.

 Boaz

 The gdth driver would do a register_reboot_notifier(gdth_notifier);
 to a gdth_halt() function, which would then redo half of what gdth_exit
 does, and wrongly so, and crash.  

 Are we guaranteed in todays kernel that modules .exit function be called
 on an halt or reboot? If so then there is no need for duplications and
 the gdth_halt() should go.
 No.  The __exit section is actually discardable if you promise never to
 remove the module.

 I don't understand please explain. 
 What does a driver need to do if it needs a consistent shutdown retine?
 module or built in? unload or shutdown?
 
 It needs to register a reboot notifier, which gdth does.
 
 However, the notifier is only called on reboot, so it also needs to
 clean up correctly on module exit as well.
 
 The alternative for GDTH would be to process the SCSI SYNCHRONIZE CACHE

Why would we think that the controller does not support this command
is it not in the mandatory section of the standard?

 command.  That's done by a shutdown notifier from sd, so the correct
 thing would always get done; however it does mean the driver has to be
 in a condition to process the last sync cache command.

Why would it not be ready? what do other drivers do?
The drivers is ready until the very last module's .exit. Is that good
enough?

 
 For the quick fix, just keep the current infrastructure and put back the
 gdth_flush() command where it can be effective.
 

Just did. But if needed I would prefer to emulate the SCSI SYNCHRONIZE CACHE
command and not that boot notifier thing. Please advise.

 James
 
 

Boaz
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-13 Thread James Bottomley
On Wed, 2008-02-13 at 19:12 +0200, Boaz Harrosh wrote:
 On Wed, Feb 13 2008 at 19:03 +0200, James Bottomley [EMAIL PROTECTED] wrote:
  On Wed, 2008-02-13 at 18:50 +0200, Boaz Harrosh wrote:
  On Wed, Feb 13 2008 at 18:45 +0200, James Bottomley [EMAIL PROTECTED] 
  wrote:
  On Wed, 2008-02-13 at 18:33 +0200, Boaz Harrosh wrote:
  On Wed, Feb 13 2008 at 17:54 +0200, Boaz Harrosh [EMAIL PROTECTED] 
  wrote:
  On Wed, Feb 13 2008 at 17:44 +0200, James Bottomley [EMAIL PROTECTED] 
  wrote:
  On Tue, 2008-02-12 at 19:40 +0200, Boaz Harrosh wrote:
  - gdth_flush(ha);
  -
  This piece doesn't look right.  gdth_flush() forces the internal cache
  to disk backing.  If you remove it, you're taking the chance that the
  machine will be powered off without a writeback which can cause data
  corruption.
 
  James
 
  Yes. 
  I have more problems reported, with exit, and am just sending one more 
  patch that puts
  this back in. Which was tested.
 
  So I will resend this one plus one new one.
 
  Boaz
 
  The gdth driver would do a register_reboot_notifier(gdth_notifier);
  to a gdth_halt() function, which would then redo half of what gdth_exit
  does, and wrongly so, and crash.  
 
  Are we guaranteed in todays kernel that modules .exit function be called
  on an halt or reboot? If so then there is no need for duplications and
  the gdth_halt() should go.
  No.  The __exit section is actually discardable if you promise never to
  remove the module.
 
  I don't understand please explain. 
  What does a driver need to do if it needs a consistent shutdown retine?
  module or built in? unload or shutdown?
  
  It needs to register a reboot notifier, which gdth does.
  
  However, the notifier is only called on reboot, so it also needs to
  clean up correctly on module exit as well.
  
  The alternative for GDTH would be to process the SCSI SYNCHRONIZE CACHE
  command.  That's done by a shutdown notifier from sd, so the correct
  thing would always get done; however it does mean the driver has to be
  in a condition to process the last sync cache command.
  
  For the quick fix, just keep the current infrastructure and put back the
  gdth_flush() command where it can be effective.
  
  James
  
  
 Totally untested.
 
 ---
 From: Boaz Harrosh [EMAIL PROTECTED]
 Subject: [PATCH] gdth: bugfix for the at-exit problems
 
 gdth_exit would first remove all cards then stop the timer
 and would not sync with the timer function. This caused a crash
 in gdth_timer() when module was unloaded.
 So del_timer_sync the timer before we delete the cards.
 
 also the reboot notifier function would crash. So unify
 the exit and halt functions with a gdth_shutdown() that's
 called by both.
 
 Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
 ---
  drivers/scsi/gdth.c |   99 --
  1 files changed, 40 insertions(+), 59 deletions(-)
 
 diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
 index 8eb78be..7bb9b45 100644
 --- a/drivers/scsi/gdth.c
 +++ b/drivers/scsi/gdth.c
 @@ -183,7 +183,6 @@ static int gdth_ioctl(struct inode *inode, struct file 
 *filep,
unsigned int cmd, unsigned long arg);
  
  static void gdth_flush(gdth_ha_str *ha);
 -static int gdth_halt(struct notifier_block *nb, ulong event, void *buf);
  static int gdth_queuecommand(Scsi_Cmnd *scp,void (*done)(Scsi_Cmnd *));
  static int __gdth_queuecommand(gdth_ha_str *ha, struct scsi_cmnd *scp,
   struct gdth_cmndinfo *cmndinfo);
 @@ -418,12 +417,6 @@ static inline void gdth_set_sglist(struct scsi_cmnd *cmd,
  #include gdth_proc.h
  #include gdth_proc.c
  
 -/* notifier block to get a notify on system shutdown/halt/reboot */
 -static struct notifier_block gdth_notifier = {
 -gdth_halt, NULL, 0
 -};
 -static int notifier_disabled = 0;
 -
  static gdth_ha_str *gdth_find_ha(int hanum)
  {
   gdth_ha_str *ha;
 @@ -3793,6 +3786,8 @@ static void gdth_timeout(ulong data)
  gdth_ha_str *ha;
  ulong flags;
  
 +BUG_ON(list_empty(gdth_instances));
 +
  ha = list_first_entry(gdth_instances, gdth_ha_str, list);
  spin_lock_irqsave(ha-smp_lock, flags);
  
 @@ -4668,45 +4663,6 @@ static void gdth_flush(gdth_ha_str *ha)
  }
  }
  
 -/* shutdown routine */
 -static int gdth_halt(struct notifier_block *nb, ulong event, void *buf)
 -{
 -gdth_ha_str *ha;
 -#ifndef __alpha__
 -gdth_cmd_strgdtcmd;
 -charcmnd[MAX_COMMAND_SIZE];   
 -#endif
 -
 -if (notifier_disabled)
 -return NOTIFY_OK;
 -
 -TRACE2((gdth_halt() event %d\n,(int)event));
 -if (event != SYS_RESTART  event != SYS_HALT  event != SYS_POWER_OFF)
 -return NOTIFY_DONE;
 -
 -notifier_disabled = 1;
 -printk(GDT-HA: Flushing all host drives .. );
 -list_for_each_entry(ha, gdth_instances, list) {
 -gdth_flush(ha);
 -
 -#ifndef __alpha__
 -/* controller reset */
 -memset(cmnd, 0xff, MAX_COMMAND_SIZE);
 -gdtcmd.BoardNode = 

Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-13 Thread Jan Engelhardt

On Feb 13 2008 11:03, Boaz Harrosh wrote:
 
 I've tested this patch now - and it works fine. Now rmmod, halt and 
 reboot also works.
 
 Stefan Priebe
 
This is grate news Stefan. Thank you very much for all your time
and effort, with out we could not have fixed all this.

Do you have a git tree with the latest pieces?
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-13 Thread Christoph Hellwig
On Wed, Feb 13, 2008 at 11:03:45AM -0600, James Bottomley wrote:
  I don't understand please explain. 
  What does a driver need to do if it needs a consistent shutdown retine?
  module or built in? unload or shutdown?
 
 It needs to register a reboot notifier, which gdth does.

Well, for crappy legacy driver that's the way, but it's not really
recommended.  As soon as a driver uses the proper driver models,
e.g. gdth for pci using Jeff's pci hotplug patches it can just
implement the -shutdown method that is called before shutdown/kexec
and can do the right thing.

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUGFIX 2/2] gdth: bugfix for the Timer at exit crash

2008-02-12 Thread Stefan Priebe - allied internet ag

Hello!

I've tested this patch now - and it works fine. Now rmmod, halt and 
reboot also works.


Stefan Priebe


Boaz Harrosh schrieb:

gdth _exit would first remove all cards then stop the timer
and would not sync with the timer function. This caused a crash
in gdth_timer() when module was unloaded.

del_timer_sync the timer before we delete the cards.

NOT YET TESTED

Signed-off-by: Boaz Harrosh [EMAIL PROTECTED]
---
 drivers/scsi/gdth.c |   15 ---
 1 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
index 8eb78be..103280e 100644
--- a/drivers/scsi/gdth.c
+++ b/drivers/scsi/gdth.c
@@ -3793,6 +3793,8 @@ static void gdth_timeout(ulong data)
 gdth_ha_str *ha;
 ulong flags;
 
+BUG_ON(list_empty(gdth_instances));

+
 ha = list_first_entry(gdth_instances, gdth_ha_str, list);
 spin_lock_irqsave(ha-smp_lock, flags);
 
@@ -5146,8 +5148,6 @@ static void gdth_remove_one(gdth_ha_str *ha)

ha-sdev = NULL;
}
 
-	gdth_flush(ha);

-
if (shp-irq)
free_irq(shp-irq,ha);
 
@@ -5245,14 +5245,15 @@ static void __exit gdth_exit(void)

 {
gdth_ha_str *ha;
 
-	list_for_each_entry(ha, gdth_instances, list)

-   gdth_remove_one(ha);
+   unregister_chrdev(major,gdth);
+   unregister_reboot_notifier(gdth_notifier);
 
 #ifdef GDTH_STATISTICS

-   del_timer(gdth_timer);
+   del_timer_sync(gdth_timer);
 #endif
-   unregister_chrdev(major,gdth);
-   unregister_reboot_notifier(gdth_notifier);
+
+   list_for_each_entry(ha, gdth_instances, list)
+   gdth_remove_one(ha);
 }
 
 module_init(gdth_init);

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html