On Thursday, June 19, 2014 1:51:30 AM UTC+2, Nikolaus Rath wrote:
> PA Nilsson <p...@zid.nu> writes:
> > On Tuesday, June 17, 2014 10:16:23 PM UTC+2, Nikolaus Rath wrote:
> >> PA Nilsson <p...@zid.nu> writes:
> >> >> > fsck.s3ql --ssl-ca-path ${capath} --cachedir ${s3ql_cachedir} --log $log_file --authfile ${auth_file} $storage_url
> >> >> >
> >> >> > "
> >> >> > Starting fsck of xxxxxxxxx
> >> >> > Ignoring locally cached metadata (outdated).
> >> >> > Backend reports that file system is still mounted elsewhere. Either
> >> >> > the file system has not been unmounted cleanly or the data has not yet
> >> >> > propagated through the backend. In the latter case, waiting for a while
> >> >> > should fix the problem; in the former case you should try to run fsck
> >> >> > on the computer where the file system has been mounted most recently.
> >> >> > Enter "continue" to use the outdated data anyway:
> >> >> > "
> >> >> >
> >> >> > In this case, it is true that the file system was not cleanly unmounted,
> >> >> > but what are my options here?
> >> >>
> >> >> You should find out why you are losing your local metadata copy.
> >> >>
> >> >> Is your $s3ql_cachedir on a journaling file system? What happened to
> >> >> this file system on the power cycle? Did it lose data?
> >> >>
> >> >> What are the contents of $s3ql_cachedir when you run fsck.s3ql?
> >> >>
> >> >> Are you running fsck.s3ql with the same $s3ql_cachedir as mount.s3ql?
> >> >> Are you *absolutely* sure about that?
> >> >>
> >> > I can only trigger this when the system is powered off during an actual
> >> > transfer of data. If I let the data transfer finish and then power cycle
> >> > with the fs mounted, the FS recovers when running fsck.
> >> >
> >> > The system is running on an ext4 filesystem. The filesystem does not seem
> >> > to have lost any data.
> >> > The cachedir is read from the same config file and works otherwise, so
> >> > yes, I am sure about that.
> >> > Contents of cachedir when failing is:
> >> > -rw-r--r-- 1 root root     0 Jun 16 13:06 mount.s3ql_crit.log
> >> > -rw------- 1 root root 77824 Jun 17 07:21 s3c:=storageurl.db
> >> > -rw-r--r-- 1 root root   217 Jun 17 07:21 s3c:=storageurl.params
> >>
> >> There is something very wrong here. While mount.s3ql is running, there
> >> will always be a directory ending in -cache in the cache directory. This
> >> directory is only removed after mount.s3ql exits, so if you reboot the
> >> computer, it *must* still be there.
> >>
> >> Can you confirm that the directory exists while mount.s3ql is running?
> >>
> >> What happens if, instead of rebooting, you just kill -9 the mount.s3ql
> >> process? Does the -cache directory exist? Does fsck.s3ql work in that
> >> case?
> >>
> > The -cache dir is there while the FS is mounted and is only removed when
> > mount.s3ql finishes. After a kill -9, the -cache is still there. Then,
> > rebooting the system, the -cache is still there and fsck completes.
> >
> > However, when closely monitoring the system, the -cache is created when
> > the FS is mounted, but if the system is immediately reset, it is not there
> > after a reboot. So my thinking is that this is a problem that we have with
> > our flash-based file system. The file is simply not yet written to flash.
>
> It's a directory, not a file, and it is created when mount.s3ql
> starts. If this directory (with its contents) disappears if you reboot
> the system several minutes later, you have a real problem.

If I reboot several minutes later, the directory is there and everything works. I need to force the reboot within seconds of starting the mount process, or maybe even during it, while the metadata is being read from or written to the server.
> > This will be running on a "non-maintained" system with no possibility for
> > user interaction.
> > What is the drawback of always continuing the fsck operation?
>
> You will lose any data that has not been written to the backend, and
> you will lose all metadata updates since the last metadata upload -
> which can imply that you lose some data even though the pure data has
> already been written to the backend.
>
> More importantly, though, you are ignoring a big problem with your flash
> file system. Rebooting the system might affect recently written files,
> but it should not result in the loss of an entire directory, with its
> contents, that was created an arbitrary amount of time before the
> reboot. In addition, it looks as if the .params file reverts to an
> earlier state (this is the reason why fsck.s3ql thinks that the remote
> metadata is newer).

Thank you for your support, Nikolaus. We will look into this further on our own.

/PA
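[Editor's note: on running fsck without user interaction, if the data-loss trade-off Nikolaus describes is acceptable, the "continue" prompt can in principle be answered from a script. A minimal sketch; the helper names are ours, the flags are the ones from the command quoted at the top of the thread, and the assumption that fsck.s3ql reads the confirmation from standard input should be verified against your S3QL version before deploying:

```python
import subprocess

def build_fsck_cmd(storage_url, capath, cachedir, authfile, logfile):
    # Same invocation as the script quoted at the top of the thread.
    return ["fsck.s3ql",
            "--ssl-ca-path", capath,
            "--cachedir", cachedir,
            "--log", logfile,
            "--authfile", authfile,
            storage_url]

def run_fsck_unattended(storage_url, capath, cachedir, authfile, logfile):
    """Answer the 'Enter "continue"' prompt automatically.

    Assumption: fsck.s3ql reads the confirmation from stdin. Accepting
    outdated metadata discards everything written since the last
    metadata upload, so only use this where that loss is acceptable.
    """
    cmd = build_fsck_cmd(storage_url, capath, cachedir, authfile, logfile)
    return subprocess.run(cmd, input=b"continue\n").returncode
```

As the thread makes clear, this only papers over the symptom; the underlying flash durability problem still needs fixing.]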