On Thursday, June 19, 2014 1:51:30 AM UTC+2, Nikolaus Rath wrote:
> PA Nilsson <p...@zid.nu> writes:
> > On Tuesday, June 17, 2014 10:16:23 PM UTC+2, Nikolaus Rath wrote:
> >> PA Nilsson <p...@zid.nu> writes:
> >> >> > fsck.s3ql --ssl-ca-path ${capath} --cachedir ${s3ql_cachedir} --log $log_file --authfile ${auth_file} $storage_url
> >> >> >
> >> >> > "
> >> >> > Starting fsck of xxxxxxxxx
> >> >> > Ignoring locally cached metadata (outdated).
> >> >> > Backend reports that file system is still mounted elsewhere. Either
> >> >> > the file system has not been unmounted cleanly or the data has not yet
> >> >> > propagated through the backend. In the latter case, waiting for a while
> >> >> > should fix the problem; in the former case you should try to run fsck
> >> >> > on the computer where the file system has been mounted most recently.
> >> >> > Enter "continue" to use the outdated data anyway:
> >> >> > "
> >> >> >
> >> >> > In this case, it is true that the file system was not cleanly unmounted,
> >> >> > but what are my options here?
> >> >>
> >> >> You should find out why you are losing your local metadata copy.
> >> >>
> >> >> Is your $s3ql_cachedir on a journaling file system? What happened to
> >> >> this file system on the power cycle? Did it lose data?
> >> >>
> >> >> What are the contents of $s3ql_cachedir when you run fsck.s3ql?
> >> >>
> >> >> Are you running fsck.s3ql with the same $s3ql_cachedir as mount.s3ql?
> >> >> Are you *absolutely* sure about that?
> >> >>
> >> > I can only trigger this when the system is powered off during an actual
> >> > transfer of data. If I let the data transfer finish and then power cycle
> >> > with the fs mounted, the FS recovers when running fsck.
> >> >
> >> > The system is running on an ext4 filesystem. The filesystem does not seem
> >> > to have lost any data.
> >> > The cachedir is read from the same config file and works otherwise, so
> >> > yes, I am sure about that.
> >> > Contents of cachedir when failing is:
> >> > -rw-r--r-- 1 root root     0 Jun 16 13:06 mount.s3ql_crit.log
> >> > -rw------- 1 root root 77824 Jun 17 07:21 s3c:=storageurl.db
> >> > -rw-r--r-- 1 root root   217 Jun 17 07:21 s3c:=storageurl.params
> >>
> >> There is something very wrong here. While mount.s3ql is running, there
> >> will always be a directory ending in -cache in the cache directory. This
> >> directory is only removed after mount.s3ql exits, so if you reboot the
> >> computer, it *must* still be there.
> >>
> >> Can you confirm that the directory exists while mount.s3ql is running?
> >>
> >> What happens if, instead of rebooting, you just kill -9 the mount.s3ql
> >> process? Does the -cache directory exist? Does fsck.s3ql work in that
> >> case?
> >>
> > The -cache dir is there while the FS is mounted and is only removed when
> > mount.s3ql finishes. After a kill -9, the -cache is still there. Then,
> > rebooting the system, the -cache is still there and fsck completes.
> >
> > However, when closely monitoring the system, the -cache is created when
> > the FS is mounted, but if the system is immediately reset, it is not there
> > after a reboot. So my thinking is that this is a problem that we have with
> > our flash-based file system. The file is simply not yet written to flash.
>
> It's a directory, not a file, and it is created when mount.s3ql
> starts. If this directory (with its contents) disappears if you reboot
> the system several minutes later, you have a real problem.

If I reboot several minutes later, the directory is there and everything works. I need to force the reboot within seconds of starting the mount process, or maybe even during it, while the metadata is being read from or written to the server.
> > This will be running on a "non-maintained" system with no possibility for
> > user interaction.
> > What is the drawback of always continuing the fsck operation?
>
> You will lose any data that has not been written to the backend, and
> you will lose all metadata updates since the last metadata upload -
> which can imply that you lose some data even though the pure data has
> already been written to the backend.
>
> More importantly, though, you are ignoring a big problem with your flash
> file system. Rebooting the system might affect recently written files,
> but it should not result in the loss of an entire directory, with its
> contents, that was created an arbitrary amount of time before the
> reboot. In addition, it looks as if the .params file reverts to an
> earlier state (this is the reason why fsck.s3ql thinks that the remote
> metadata is newer).

Thank you for your support, Nikolaus. We will look into this further on our own.

/PA
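[Editor's note: on running fsck without user interaction, if the data-loss trade-off Nikolaus describes is acceptable, the "continue" prompt can in principle be answered from a script. A minimal sketch; the helper names are ours, the flags are the ones from the command quoted at the top of the thread, and the assumption that fsck.s3ql reads the confirmation from standard input should be verified against your S3QL version before deploying:

```python
import subprocess

def build_fsck_cmd(storage_url, capath, cachedir, authfile, logfile):
    # Same invocation as the script quoted at the top of the thread.
    return ["fsck.s3ql",
            "--ssl-ca-path", capath,
            "--cachedir", cachedir,
            "--log", logfile,
            "--authfile", authfile,
            storage_url]

def run_fsck_unattended(storage_url, capath, cachedir, authfile, logfile):
    """Answer the 'Enter "continue"' prompt automatically.

    Assumption: fsck.s3ql reads the confirmation from stdin. Accepting
    outdated metadata discards everything written since the last
    metadata upload, so only use this where that loss is acceptable.
    """
    cmd = build_fsck_cmd(storage_url, capath, cachedir, authfile, logfile)
    return subprocess.run(cmd, input=b"continue\n").returncode
```

As the thread makes clear, this only papers over the symptom; the underlying flash durability problem still needs fixing.]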