Re: Some idea about automatic checkpointing issue
RR == Raymond Raymond [EMAIL PROTECTED] writes:

RR> Last time, when I discussed the automatic checkpointing issue
RR> with you and Mike, I suggested that we could maintain a dirty
RR> page list in which dirty pages are sorted in ascending order of
RR> the time when they were first updated. I don't mean we need to
RR> copy the whole page to the list, just some identification of
RR> the page that lets us find the corresponding page later. When a
RR> page is first updated, it is linked to the list, and when it is
RR> flushed out to disk, it is released from the list. In this way,
RR> the oldest dirty pages will be at the head and the latest at
RR> the tail. When we do checkpointing, we scan from the head to
RR> the end of the list. I think that will guarantee that the
RR> oldest dirty pages are written in a checkpoint.

Sounds like a good idea.

--
Øystein
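To make the proposed list concrete, here is a minimal sketch in Python. It is not Derby code: the class and method names are hypothetical, and it assumes log instants only ever increase, so insertion order is the same as age order. It stores only a page identifier and the log instant of the first update, as suggested above, and shows how the head of the list could also give the redoLWM.

```python
from collections import OrderedDict


class DirtyPageList:
    """Dirty pages ordered by the log instant of their first update.

    Hypothetical sketch: only page identifiers are stored, never page data.
    """

    def __init__(self):
        # page_id -> log instant of the first update since the last flush
        self._pages = OrderedDict()

    def mark_dirty(self, page_id, log_instant):
        # Only the first update links the page; later updates keep its place.
        if page_id not in self._pages:
            self._pages[page_id] = log_instant

    def mark_flushed(self, page_id):
        # A page written to disk is released from the list.
        self._pages.pop(page_id, None)

    def oldest_first(self, limit=None):
        # The head of the list holds the oldest dirty pages.
        ids = list(self._pages)
        return ids if limit is None else ids[:limit]

    def redo_lwm(self, current_instant):
        # Redo has to start at the first update of the oldest dirty page.
        # With no dirty pages it can start at the current log instant.
        for instant in self._pages.values():
            return instant
        return current_instant
```

A checkpoint would call `oldest_first()` to pick pages to write, call `mark_flushed()` as each write completes, and could then read `redo_lwm()` without scanning the whole page cache.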
Re: Some idea about automatic checkpointing issue
From: [EMAIL PROTECTED] (Øystein Grøvlen)
Reply-To: derby-dev@db.apache.org
To: derby-dev@db.apache.org
Subject: Re: Some idea about automatic checkpointing issue
Date: Mon, 31 Oct 2005 15:36:39 +0100

RR == Raymond Raymond [EMAIL PROTECTED] writes:

RR> Oystein wrote:
RR>
RR> I would like to suggest the following:
RR>
RR> - 1. The user may be able to configure a certain recovery time
RR>      that Derby should try to satisfy. (An appropriate default
RR>      must be determined.)
RR> - 2. During initialization of Derby, we run some measurement
RR>      that determines the performance of the system and maps the
RR>      recovery time into some X megabytes of log.
RR> - 3. A checkpoint is made by default every X megabytes of log.
RR> - 4. One tries to dynamically adjust the write rate of the
RR>      checkpoint so that the writing takes an entire checkpoint
RR>      interval. (E.g., write Y pages, then pause for some time.)
RR> - 5. If data reads or log writes (if log in default location)
RR>      start to have long response times, one can increase the
RR>      checkpoint interval. The user should be able to turn this
RR>      feature off in case longer recovery times are not
RR>      acceptable.
RR>
RR> Hope this rambling has some value,
RR> --
RR> Øystein

RR> Thanks for Oystein's comments. I agree with them, and I have
RR> another thought about them. To make it easier to explain, I
RR> added sequence numbers to your comments.
RR> For steps 3 and 4 I have another idea. Generally, we do
RR> checkpointing from the earliest useful log record, which is
RR> determined by the repPoint and the undoLWM, whichever is
RR> earlier, to the current log instant (redoLWM), and then update
RR> the derby control file
RR> (ref. http://db.apache.org/derby/papers/recovery.html).

I am not sure I understand what you mean by "do checkpointing". Are
you talking about writing the checkpoint log record to the log?

RR> I agree with you about spreading the writes out over the
RR> checkpoint interval, but the trade-off is that we have to do
RR> recovery from the penultimate checkpoint (Am I right
RR> here? ^_^). If the log is long, recovery will take a long time.

From the perspective of recovery, it will still be the checkpoint
reflected in the log control file. It is true that a new checkpoint
had probably been started when the crash occurred, but that may
happen today also. It is less likely, but the principles are the
same. I agree that there will be more log to redo during recovery.
The advantage of my proposal is that the recovery time will be more
deterministic, since it will be less dependent on how long it takes
to clean the page cache. The average log size for recovery will
always be 1.5 checkpoint intervals with my proposal. The maximum log
size will be 2 checkpoint intervals, and this is also true for the
current solution. If the goal is to guarantee a maximum recovery
time, I think my proposal is better. There is no point in reducing
performance in order to be able to do recovery in 30 seconds if the
user is willing to accept recovery times of 2 minutes.

RR> How about updating the derby control file periodically instead
RR> of updating it only when the whole checkpoint is done? (E.g.,
RR> write several pages; if we detect that the system is busy, we
RR> update the derby control file and pause for some time. Or we
RR> update the control file once every several minutes.)

I guess that is possible, but in that case you will need some way of
determining the redoLWM for the checkpoint. It will no longer be the
current log instant when the checkpoint starts. I guess you can do
this by either scanning the entire page cache or by keeping the
pages sorted by age.

RR> It seems that we would do part of a checkpoint at a time if the
RR> system becomes busy. In this way, if the system crashes, the
RR> last checkpoint mark (the log address that the last checkpoint
RR> reached) will be closer to the tail of the log than if we
RR> update the control file when the whole checkpoint is done.
RR> Maybe we can call it Incremental Checkpointing.

Unless each checkpoint cleans the entire cache, the redoLWM may be
much older than the last checkpoint mark. Hence, updating the
control file more often does not reduce recovery times by itself.
However, making sure that the oldest dirty pages are written at a
checkpoint should advance the redoLWM and reduce recovery times.

--
Øystein

Last time, when I discussed the automatic checkpointing issue with
you and Mike, I suggested that we could maintain a dirty page list
in which dirty pages are sorted in ascending order of the time when
they were first updated. I don't mean we need to copy the whole page
to the list, just some identification of the page that lets us find
the corresponding page later. When a page is first updated, it is
linked to the list and when
Re: Some idea about automatic checkpointing issue
RR == Raymond Raymond [EMAIL PROTECTED] writes:

RR> Oystein wrote:
RR>
RR> I would like to suggest the following:
RR>
RR> - 1. The user may be able to configure a certain recovery time
RR>      that Derby should try to satisfy. (An appropriate default
RR>      must be determined.)
RR> - 2. During initialization of Derby, we run some measurement
RR>      that determines the performance of the system and maps the
RR>      recovery time into some X megabytes of log.
RR> - 3. A checkpoint is made by default every X megabytes of log.
RR> - 4. One tries to dynamically adjust the write rate of the
RR>      checkpoint so that the writing takes an entire checkpoint
RR>      interval. (E.g., write Y pages, then pause for some time.)
RR> - 5. If data reads or log writes (if log in default location)
RR>      start to have long response times, one can increase the
RR>      checkpoint interval. The user should be able to turn this
RR>      feature off in case longer recovery times are not
RR>      acceptable.
RR>
RR> Hope this rambling has some value,
RR> --
RR> Øystein

RR> Thanks for Oystein's comments. I agree with them, and I have
RR> another thought about them. To make it easier to explain, I
RR> added sequence numbers to your comments.
RR> For steps 3 and 4 I have another idea. Generally, we do
RR> checkpointing from the earliest useful log record, which is
RR> determined by the repPoint and the undoLWM, whichever is
RR> earlier, to the current log instant (redoLWM), and then update
RR> the derby control file
RR> (ref. http://db.apache.org/derby/papers/recovery.html).

I am not sure I understand what you mean by "do checkpointing". Are
you talking about writing the checkpoint log record to the log?

RR> I agree with you about spreading the writes out over the
RR> checkpoint interval, but the trade-off is that we have to do
RR> recovery from the penultimate checkpoint (Am I right
RR> here? ^_^). If the log is long, recovery will take a long time.

From the perspective of recovery, it will still be the checkpoint
reflected in the log control file. It is true that a new checkpoint
had probably been started when the crash occurred, but that may
happen today also. It is less likely, but the principles are the
same. I agree that there will be more log to redo during recovery.
The advantage of my proposal is that the recovery time will be more
deterministic, since it will be less dependent on how long it takes
to clean the page cache. The average log size for recovery will
always be 1.5 checkpoint intervals with my proposal. The maximum log
size will be 2 checkpoint intervals, and this is also true for the
current solution. If the goal is to guarantee a maximum recovery
time, I think my proposal is better. There is no point in reducing
performance in order to be able to do recovery in 30 seconds if the
user is willing to accept recovery times of 2 minutes.

RR> How about updating the derby control file periodically instead
RR> of updating it only when the whole checkpoint is done? (E.g.,
RR> write several pages; if we detect that the system is busy, we
RR> update the derby control file and pause for some time. Or we
RR> update the control file once every several minutes.)

I guess that is possible, but in that case you will need some way of
determining the redoLWM for the checkpoint. It will no longer be the
current log instant when the checkpoint starts. I guess you can do
this by either scanning the entire page cache or by keeping the
pages sorted by age.

RR> It seems that we would do part of a checkpoint at a time if the
RR> system becomes busy. In this way, if the system crashes, the
RR> last checkpoint mark (the log address that the last checkpoint
RR> reached) will be closer to the tail of the log than if we
RR> update the control file when the whole checkpoint is done.
RR> Maybe we can call it Incremental Checkpointing.

Unless each checkpoint cleans the entire cache, the redoLWM may be
much older than the last checkpoint mark. Hence, updating the
control file more often does not reduce recovery times by itself.
However, making sure that the oldest dirty pages are written at a
checkpoint should advance the redoLWM and reduce recovery times.

--
Øystein
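The arithmetic behind steps 2 to 4 and the 1.5 and 2 interval figures can be sketched as follows. This is an illustration in Python, not Derby code. All function names and parameters are hypothetical. It assumes that a checkpoint starts every X megabytes of log, that its redoLWM is the log position where it started, and that it completes (control file updated) exactly one interval later. Halving the redoable log volume to get X is my own assumption, taken from the 2-interval maximum.

```python
import math


def checkpoint_interval_mb(target_recovery_s, redo_rate_mb_per_s):
    """Map a recovery-time target to a checkpoint interval X in MB of log.

    Assumption: at most two intervals of log are redone after a crash,
    so X is half the log volume that can be redone in the target time.
    """
    return target_recovery_s * redo_rate_mb_per_s / 2.0


def log_to_redo_mb(crash_pos_mb, interval_mb):
    """Log volume to redo if the crash happens at log position crash_pos_mb.

    Assumption: a checkpoint starts every interval_mb of log and its
    writes are spread over exactly one interval before it completes.
    """
    if crash_pos_mb < interval_mb:
        return crash_pos_mb  # no checkpoint completed yet: redo from the start
    started = math.floor(crash_pos_mb / interval_mb)
    last_completed_start = (started - 1) * interval_mb
    return crash_pos_mb - last_completed_start


def pause_per_batch_s(dirty_pages, batch_pages, interval_s, write_s_per_batch):
    """Pause after each batch of Y pages so writing fills the interval (step 4)."""
    batches = math.ceil(dirty_pages / batch_pages)
    if batches == 0:
        return 0.0
    return max(0.0, interval_s / batches - write_s_per_batch)
```

Under these assumptions a crash anywhere in an interval leaves between one and two intervals of log to redo, so a uniformly distributed crash point gives the 1.5 interval average. The measured redo rate from step 2 would feed `checkpoint_interval_mb`.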