[jira] Issue Comment Edited: (DERBY-5108) Intermittent failure in AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows

Mike Matrigali (JIRA) Fri, 11 Mar 2011 16:15:23 -0800

    [ 
https://issues.apache.org/jira/browse/DERBY-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13005899#comment-13005899
 ]


Mike Matrigali edited comment on DERBY-5108 at 3/12/11 12:14 AM:
-----------------------------------------------------------------

The more I look at this issue, the more I think the problem is that the istat 
daemon should shut down, and not return until it has completed this 
shutdown, when indexRefresher.stop() is called from the DataDictionary's 
stop() call.  For a clean shutdown of the system the store
needs all its clients shut down first; then it can cleanly shut down and 
force the database files and transaction logs, ensuring 
a clean shutdown with no recovery work necessary on the next boot.

By leaving the istat daemon running we can run into a number of errors that I 
don't think can be solved.  We might fix a specific one shown
up by this test but the system is just not designed to handle clean shutdown 
while stuff is still running without first waiting for the running
stuff to stop somehow.

Kristian noted in DERBY-5037:
> I think Mike's comments/observations above agree pretty much with my thinking 
> when writing the code. Seems there are several error-handling issues to iron 
> out though...
>A few specific comments:
>o I decided to not make Derby wait for the background thread to finish on 
>shutdown, as it might potentially be scanning a very large table.
>o Logging is rather verbose now during testing, but I agree it should be less 
>verbose (or maybe turned off completely) when released.
>o I'm logging a lot of exceptions to aid testing/debugging. These should also 
>go away, or be enabled by a property if the user wishes to do so. 

I now think that it was wrong to not wait for the background thread.  Waiting 
would match the behavior of the rawStoreDaemon thread, which is "owned"
by the raw store module: the module stops the daemon, the daemon waits 
for its work to stop or complete before returning from the stop, and
then the raw store continues with its data and transaction file cleanup prior 
to stopping.   I agree it would be a nice optimization to somehow stop the 
background thread in the middle of a big scan, and it seems like with the 
better interrupt support this should be much easier than was the case 
before 10.8.   I would like some feedback before proceeding from those more 
knowledgeable about the istat work.
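
To make the proposal concrete, here is a rough sketch of a stop() that 
interrupts the background thread and does not return until that thread has 
exited.  This is not Derby's code: the class BlockingStopDaemon, its fields, 
and the sleep loop standing in for an index scan are all hypothetical, and 
the real daemon would release store resources where the sketch only sets a 
flag.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a background statistics daemon; not Derby's class. */
class BlockingStopDaemon implements Runnable {
    private final Thread worker = new Thread(this, "istat-sketch");
    private final AtomicBoolean stopRequested = new AtomicBoolean(false);
    private volatile boolean resourcesReleased = false;

    void start() {
        worker.start();
    }

    @Override
    public void run() {
        try {
            while (!stopRequested.get()) {
                // Simulated unit of scan work; sleep() responds to interrupt.
                Thread.sleep(50);
            }
        } catch (InterruptedException e) {
            // Interrupted in the middle of the "scan": fall through to cleanup.
        } finally {
            // Stands in for closing containers and files held by the daemon.
            resourcesReleased = true;
        }
    }

    /** Requests a stop and does not return until the worker thread has exited. */
    void stop() {
        stopRequested.set(true);
        worker.interrupt();
        boolean interrupted = false;
        while (true) {
            try {
                worker.join();
                break;
            } catch (InterruptedException e) {
                // Keep waiting; remember to restore the caller's interrupt flag.
                interrupted = true;
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }

    boolean isRunning() {
        return worker.isAlive();
    }

    boolean resourcesReleased() {
        return resourcesReleased;
    }
}
```

With this shape, the caller of stop() can assume that the daemon holds no 
resources once the call returns, so lower-level modules can then proceed with 
their own shutdown.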

I do think that the work Rick did for DERBY-5037 is still valuable as it will 
handle much better the non-clean shutdowns that Derby can experience.  For
a non-clean shutdown we might have to just live with a file left open until the 
thread or JVM exits.   But for a requested orderly shutdown of the system I 
think we should go with the top-down shutdown supported by the architecture 
rather than try to fix errors encountered when top-level modules are still
running while lower-level modules are trying to shut down.

> Intermittent failure in 
> AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete on Windows
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5108
>                 URL: https://issues.apache.org/jira/browse/DERBY-5108
>             Project: Derby
>          Issue Type: Bug
>          Components: Test
>    Affects Versions: 10.8.0.0
>         Environment: Windows platforms.
>            Reporter: Kristian Waagan
>            Assignee: Mike Matrigali
>            Priority: Blocker
>         Attachments: javacore.20110309.125807.4048.0001.txt
>
>
> The test AutomaticIndexStatisticsTest.testShutdownWhileScanningThenDelete 
> fails intermittently on Windows platforms because the test is unable to 
> delete a database directory.
> Even after several retries and sleeps (the formula should be (attempt -1) * 
> 2000, resulting in a total sleep time of 12 seconds), the conglomerate 
> system\singleUse\copyShutdown\seg0\c481.dat cannot be deleted.
> For instance from 
> http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/w2003/1078855-suitesAll_diff.txt
>  :
> (truncated paths)
> testShutdownWhileScanningThenDelete <assertDirectoryDeleted> attempt 1 left 3 
> files/dirs behind: 0=system\singleUse\copyShutdown\seg0\c481.dat 
> 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 2 left 3 files/dirs behind: 
> 0=system\singleUse\copyShutdown\seg0\c481.dat 
> 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 3 left 3 files/dirs behind: 
> 0=system\singleUse\copyShutdown\seg0\c481.dat 
> 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> <assertDirectoryDeleted> attempt 4 left 3 files/dirs behind: 
> 0=system\singleUse\copyShutdown\seg0\c481.dat 
> 1=system\singleUse\copyShutdown\seg0 2=system\singleUse\copyShutdown
> used 205814 ms F.
> Maybe the database isn't shut down, or some specific timing of events causes 
> a file to be reopened when it shouldn't have been (i.e. after the database 
> shutdown has been initiated).
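
As a sanity check on the retry arithmetic in the description: sleeping 
(attempt - 1) * 2000 ms before each of the four attempts shown in the log 
gives 0 + 2000 + 4000 + 6000 = 12000 ms.  The sketch below illustrates that 
kind of retry loop.  It is not the actual test utility: the class 
RetryDeleteSketch and its method names are hypothetical, and the attempt 
count of four is taken from the log output above.

```java
import java.io.File;

/** Hypothetical helper illustrating delete-with-retries; not Derby's test code. */
class RetryDeleteSketch {

    /** Sleep before the given 1-based attempt: (attempt - 1) * baseMillis. */
    static long sleepBeforeAttempt(int attempt, long baseMillis) {
        return (attempt - 1) * baseMillis;
    }

    /** Total time slept across all attempts. */
    static long totalSleep(int attempts, long baseMillis) {
        long total = 0;
        for (int attempt = 1; attempt <= attempts; attempt++) {
            total += sleepBeforeAttempt(attempt, baseMillis);
        }
        return total;
    }

    /** Deletes a file or directory tree; returns true if nothing is left behind. */
    static boolean deleteRecursively(File f) {
        File[] children = f.listFiles(); // null for plain or missing files
        if (children != null) {
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        return f.delete() || !f.exists();
    }

    /** Retries the delete, sleeping longer before each successive attempt. */
    static boolean deleteWithRetries(File dir, int attempts, long baseMillis) {
        for (int attempt = 1; attempt <= attempts; attempt++) {
            try {
                Thread.sleep(sleepBeforeAttempt(attempt, baseMillis));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            if (deleteRecursively(dir)) {
                return true;
            }
        }
        return false;
    }
}
```

On Windows a file that is still open in the same JVM cannot be deleted, so a 
loop like this keeps failing for as long as some thread holds the file open, 
no matter how long it sleeps.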

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
