GIT get corrupted on lustre

2012-12-24 Thread Eric Chamberland
Hi, we have been using git since May and everything works fine for all of us (almost 20 people) on our workstations. However, when we clone our repositories to the cluster, there and only there, we run into many problems similar to this post: http://thread.gmane.org/gmane.comp.file-systems.lustre.u

Re: GIT get corrupted on lustre

2012-12-24 Thread Andreas Schwab
Eric Chamberland writes: > #1) However, how can we *test* the filesystem (lustre) compatibility with > git? (Is there a unit test we can run?) Have you considered running git's testsuite? Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44

Re: GIT get corrupted on lustre

2012-12-24 Thread Brian J. Murrell
On 12-12-24 09:08 AM, Eric Chamberland wrote: > Hi, Hi, > Doing a "git clone" always works fine, but when we "git pull" or "git gc" > or "git fsck", often (1/5) the local repository gets corrupted. Have you tried adding a "-q" to the git command line to quiet down git's "feedback" messages? I dis

Re: GIT get corrupted on lustre

2012-12-24 Thread Greg Troxel
we are using git since May and all is working fine for all of us (almost 20 people) on our workstations. However, when we clone our repositories to the cluster, only and only there we are having many problems similar to this post: What filesystem tests have you run on lustre? I would r

Re: GIT get corrupted on lustre

2012-12-26 Thread Jeff King
On Mon, Dec 24, 2012 at 09:08:46AM -0500, Eric Chamberland wrote: > Doing a "git clone" always works fine, but when we "git pull" or "git > gc" or "git fsck", often (1/5) the local repository gets corrupted. > For example, I got this error two days ago while doing "git gc": > > error: index file >

Re: GIT get corrupted on lustre

2013-01-08 Thread Eric Chamberland
On 12/24/2012 10:11 AM, Brian J. Murrell wrote: Have you tried adding a "-q" to the git command line to quiet down git's "feedback" messages? Ok, I have modified my crontab to use "-q" and I will wait to see whether the problem still occurs from now on. I discovered other oddities with using git on Lust

Re: GIT get corrupted on lustre

2013-01-09 Thread Eric Chamberland
Hi Brian, On 01/08/2013 11:11 AM, Eric Chamberland wrote: On 12/24/2012 10:11 AM, Brian J. Murrell wrote: Have you tried adding a "-q" to the git command line to quiet down git's "feedback" messages? I moved to git 1.8.1 and added the "-q" to the command "git gc" but it occurred to return

Re: GIT get corrupted on lustre

2013-01-17 Thread Eric Chamberland
Hi! I still have the corruption problems. We just compiled git without threads to try it... (by the way, --without-pthreads doesn't work; you have to use --disable-pthreads instead). And to get rid of the warnings about threads when running "git gc", I did: git config --local pack.threads

Re: GIT get corrupted on lustre

2013-01-17 Thread Eric Chamberland
On 01/17/2013 09:23 AM, Philippe Vaucher wrote: Anyone has a new idea? Did you try Jeff King's code to confirm his idea? Philippe Yes I did, but it ran without any problem. I find that my test case is "simple" (a fresh git clone, then "git gc" in a crontab); I bet anyone who has a
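
The test case described in this message (a fresh clone followed by "git gc", repeated from cron) could be scripted roughly as follows. This is only a sketch under stated assumptions: the function names, the repository URL and the base directory are hypothetical, and the final "git fsck" is added as an integrity check because it is one of the commands named earlier in the thread, not because the original crontab is known to have run it.

```python
import subprocess
import tempfile
from pathlib import Path


def build_commands(repo_url, clone_dir):
    """Return (argv, cwd) pairs for one round: clone, gc, fsck.

    repo_url and clone_dir are placeholders supplied by the caller.
    """
    clone_dir = str(clone_dir)
    return [
        (["git", "clone", repo_url, clone_dir], None),
        (["git", "gc", "-q"], clone_dir),   # "-q" as suggested in the thread
        (["git", "fsck"], clone_dir),       # assumption: extra integrity check
    ]


def run_round(repo_url, base_dir):
    """Run one round in a fresh directory below base_dir.

    Returns None if every command succeeded, otherwise the argv of the
    first command that exited with a non-zero status.
    """
    with tempfile.TemporaryDirectory(dir=base_dir) as tmp:
        clone_dir = Path(tmp) / "clone"
        for argv, cwd in build_commands(repo_url, clone_dir):
            if subprocess.run(argv, cwd=cwd).returncode != 0:
                return argv
    return None
```

Calling run_round() in a loop with base_dir on the Lustre mount, and again with base_dir on a local disk, would show whether failures only appear on Lustre, which is what the thread reports.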

RE: GIT get corrupted on lustre

2013-01-17 Thread Pyeron, Jason J CTR (US)
> -----Original Message----- > From: Eric Chamberland > Sent: Thursday, January 17, 2013 11:31 AM > > On 01/17/2013 09:23 AM, Philippe Vaucher wrote: > >> Anyone has a new idea? > > > > Did you try Jeff King's code to confirm his idea? > > > > Philippe > > > > Yes I did, but it was running withou

Re: GIT get corrupted on lustre

2013-01-17 Thread Maxime Boissonneault
I don't know of any lustre filesystem that is used on Windows. Barely anybody uses Windows in the HPC industry. This is a Linux cluster. Maxime Boissonneault On 2013-01-17 11:40, Pyeron, Jason J CTR (US) wrote: -----Original Message----- From: Eric Chamberland Sent: Thursday, January 17, 20

RE: GIT get corrupted on lustre

2013-01-17 Thread Pyeron, Jason J CTR (US)
Jason J CTR (US) > Cc: Eric Chamberland; Philippe Vaucher; git@vger.kernel.org; Sébastien > Boisvert > Subject: Re: GIT get corrupted on lustre > > I don't know of any lustre filesystem that is used on Windows. Barely > anybody uses Windows in the HPC industry. > This is a

Re: GIT get corrupted on lustre

2013-01-18 Thread Eric Chamberland
: GIT get corrupted on lustre I don't know of any lustre filesystem that is used on Windows. Barely anybody uses Windows in the HPC industry. This is a Linux cluster. Maxime Boissonneault On 2013-01-17 11:40, Pyeron, Jason J CTR (US) wrote: -----Original Message----- From: Eric Chamberland

Re: GIT get corrupted on lustre

2013-01-21 Thread Erik Faye-Lund
On Fri, Jan 18, 2013 at 6:50 PM, Eric Chamberland wrote: > Good idea! > > I did a strace and here is the output with the error: > > http://www.giref.ulaval.ca/~ericc/strace_git_error.txt > > Hope it will be insightful! This trace doesn't seem to contain child-processes, but instead having their s

Re: GIT get corrupted on lustre

2013-01-21 Thread Thomas Rast
Erik Faye-Lund writes: > On Fri, Jan 18, 2013 at 6:50 PM, Eric Chamberland > wrote: >> Good idea! >> >> I did a strace and here is the output with the error: >> >> http://www.giref.ulaval.ca/~ericc/strace_git_error.txt >> >> Hope it will be insightful! > > This trace doesn't seem to contain chil

Re: GIT get corrupted on lustre

2013-01-21 Thread Maxime Boissonneault
Hi Thomas, Can you tell me which version of Lustre the servers and the clients are running? Thanks, Maxime Boissonneault HPC specialist @ Calcul Québec. On 2013-01-21 11:11, Thomas Rast wrote: Erik Faye-Lund writes: On Fri, Jan 18, 2013 at 6:50 PM, Eric Chamberland wrote: Good idea

Re: GIT get corrupted on lustre

2013-01-21 Thread Thomas Rast
Maxime Boissonneault writes: > Hi Thomas, > Can you tell me what is the version of the lustre servers and the > lustre clients ? $ uname -a Linux brutus4.ethz.ch 2.6.32-279.14.1.el6.x86_64 #1 SMP Tue Nov 6 23:43:09 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux $ cat /proc/fs/lustre/version lustre: 2.

Re: GIT get corrupted on lustre

2013-01-21 Thread Eric Chamberland
Hi, It just happened again. Now I have the "strace -f" output gzipped here: http://www.giref.ulaval.ca/~ericc/strace-f_git_error.txt.gz thanks, Eric On 01/21/2013 08:29 AM, Erik Faye-Lund wrote: On Fri, Jan 18, 2013 at 6:50 PM, Eric Chamberland wrote: Good idea! I did a strace and here i

Re: GIT get corrupted on lustre

2013-01-21 Thread Eric Chamberland
On 01/21/2013 12:07 PM, Eric Chamberland wrote: Hi, It just happened again. Now I have the "strace -f" output gzipped here: http://www.giref.ulaval.ca/~ericc/strace-f_git_error.txt.gz I added the "strace -f" output for a run where no error occurs... http://www.giref.ulaval.ca/~ericc/strace-f_git_no_
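
With one trace from a failing run and one from a clean run, a quick way to compare them is to tally which system calls failed and with which error code. The sketch below is an illustration only: the helper name and the sample lines in the usage note are hypothetical, and the regular expression assumes the usual one-line strace format (it deliberately ignores calls that strace splits into "unfinished" and "resumed" halves).

```python
import re
from collections import Counter

# Matches lines such as
#   [pid 1234] utime("f", NULL) = -1 EINTR (Interrupted system call)
# or the "1234 utime(...)" prefix that strace writes with -f and -o.
_FAILED_CALL = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?"      # optional pid prefix
    r"(?P<name>\w+)\(.*\)\s+=\s+-1\s+"    # syscall name, arguments, "= -1"
    r"(?P<errno>E[A-Z0-9]+)\b"            # symbolic error code
)


def count_failed_syscalls(lines):
    """Count (syscall, errno) pairs over an iterable of strace lines."""
    counts = Counter()
    for line in lines:
        match = _FAILED_CALL.match(line)
        if match:
            counts[(match.group("name"), match.group("errno"))] += 1
    return counts
```

For a gzipped trace, the lines can be read with gzip.open(path, "rt", errors="replace") and passed straight to count_failed_syscalls(); comparing the two resulting counters highlights error codes that only show up in the failing run.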

Re: GIT get corrupted on lustre

2013-01-21 Thread Brian J. Murrell
On 13-01-21 11:11 AM, Thomas Rast wrote: > > What's odd is that while I cannot reproduce the original problem, there > seems to be another issue/bug with utime(): I wonder if this is related to http://jira.whamcloud.com/browse/LU-305. That was reported as fixed in Lustre 2.0.0 and 2.1.0 but I th

Re: GIT get corrupted on lustre

2013-01-21 Thread Thomas Rast
Please don't drop the Cc list! "Brian J. Murrell" writes: >> What's odd is that while I cannot reproduce the original problem, there >> seems to be another issue/bug with utime(): > > I wonder if this is related to http://jira.whamcloud.com/browse/LU-305. > That was reported as fixed in Lustre

Re: GIT get corrupted on lustre

2013-01-22 Thread Eric Chamberland
So, hum, do we have some sort of conclusion? Should there be a fix in git to work around that lustre "behavior"? If something can be done in git it would be great: it is a *lot* easier to change git than the lustre filesystem software for a cluster running in production mode... (words from clus

Re: GIT get corrupted on lustre

2013-01-22 Thread Junio C Hamano
Eric Chamberland writes: > So, hum, do we have some sort of conclusion? > > Shall it be a fix for git to get around that lustre "behavior"? > > If something can be done in git it would be great: it is a *lot* > easier to change git than the lustre filesystem software for a cluster > running in

Re: GIT get corrupted on lustre

2013-01-22 Thread Thomas Rast
Eric Chamberland writes: > So, hum, do we have some sort of conclusion? > > Shall it be a fix for git to get around that lustre "behavior"? > > If something can be done in git it would be great: it is a *lot* > easier to change git than the lustre filesystem software for a cluster > running in

Re: GIT get corrupted on lustre

2013-01-22 Thread Eric Chamberland
On 01/22/2013 05:14 PM, Thomas Rast wrote: Eric Chamberland writes: So, hum, do we have some sort of conclusion? Shall it be a fix for git to get around that lustre "behavior"? If something can be done in git it would be great: it is a *lot* easier to change git than the lustre filesystem so

Re: GIT get corrupted on lustre

2013-01-23 Thread Sébastien Boisvert
On 01/22/2013 05:14 PM, Thomas Rast wrote: Eric Chamberland writes: So, hum, do we have some sort of conclusion? Shall it be a fix for git to get around that lustre "behavior"? If something can be done in git it would be great: it is a *lot* easier to change git than the lustre filesystem so

Re: GIT get corrupted on lustre

2013-01-23 Thread Sébastien Boisvert
[I forgot to subscribe to the git mailing list, sorry for that] On 01/22/2013 05:14 PM, Thomas Rast wrote: Eric Chamberland writes: So, hum, do we have some sort of conclusion? Shall it be a fix for git to get around that lustre "behavior"? If something can be done in git it would be great:

Re: GIT get corrupted on lustre

2013-01-23 Thread Erik Faye-Lund
On Tue, Jan 22, 2013 at 11:14 PM, Thomas Rast wrote: > Eric Chamberland writes: > > Other than that I agree with Junio, from what we've seen so far, Lustre > returns EINTR on all sorts of calls that simply aren't allowed to do so. I don't think this analysis is 100% accurate, POSIX allows error

Re: GIT get corrupted on lustre

2013-01-23 Thread Thomas Rast
Erik Faye-Lund writes: > On Tue, Jan 22, 2013 at 11:14 PM, Thomas Rast wrote: >> Eric Chamberland writes: >> >> Other than that I agree with Junio, from what we've seen so far, Lustre >> returns EINTR on all sorts of calls that simply aren't allowed to do so. > > I don't think this analysis is

Re: GIT get corrupted on lustre

2013-01-23 Thread Erik Faye-Lund
On Wed, Jan 23, 2013 at 4:32 PM, Thomas Rast wrote: > Erik Faye-Lund writes: > >> On Tue, Jan 22, 2013 at 11:14 PM, Thomas Rast wrote: >>> Eric Chamberland writes: >>> >>> Other than that I agree with Junio, from what we've seen so far, Lustre >>> returns EINTR on all sorts of calls that simply

Re: GIT get corrupted on lustre

2013-01-23 Thread Thomas Rast
Erik Faye-Lund writes: > On Wed, Jan 23, 2013 at 4:32 PM, Thomas Rast wrote: >> Erik Faye-Lund writes: >> >>> POSIX allows error codes >>> to be generated other than those defined. From >>> http://pubs.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_03.html: >>> >>> "Implementations may

Re: GIT get corrupted on lustre

2013-01-23 Thread Erik Faye-Lund
On Wed, Jan 23, 2013 at 4:44 PM, Thomas Rast wrote: > Erik Faye-Lund writes: > >> On Wed, Jan 23, 2013 at 4:32 PM, Thomas Rast wrote: >>> Erik Faye-Lund writes: >>> POSIX allows error codes to be generated other than those defined. From http://pubs.opengroup.org/onlinepubs/009695

Re: GIT get corrupted on lustre

2013-01-23 Thread Jonathan Nieder
Thomas Rast wrote: > Taken together this should mean that the bug is in fact simply that the > calls do not *restart*. They are (like you say) allowed to return EINTR > despite not being specified to, *but* SA_RESTART should restart it. > > Now, does that make it a lustre bug or a glibc bug? :-)
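
SA_RESTART asks the kernel to restart an interrupted call transparently; when that does not happen, the common application-level fallback is to retry the call whenever it fails with EINTR. The following is a minimal sketch of that pattern, not code from git or from any patch in this thread. The helper name and the retry limit are hypothetical. In Python, EINTR surfaces as InterruptedError, and recent Python versions already retry most system calls themselves, so the wrapper is shown purely to illustrate the idea.

```python
import errno


def retry_on_eintr(func, *args, max_retries=100):
    """Call func(*args), retrying as long as it fails with EINTR.

    max_retries is an arbitrary safety limit so that a call which is
    interrupted every single time cannot loop forever.
    """
    for _ in range(max_retries):
        try:
            return func(*args)
        except InterruptedError:
            # The call was interrupted by a signal before completing;
            # nothing was done, so it is safe to issue it again.
            continue
    raise OSError(errno.EINTR,
                  "still interrupted after %d retries" % max_retries)
```

A call such as retry_on_eintr(os.utime, path) would then survive a stray interruption. The cost of this approach is that every call site that may see EINTR has to be wrapped, which is why fixing the restart behaviour underneath is the more attractive option when it is available.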

Re: GIT get corrupted on lustre

2013-01-23 Thread Sébastien Boisvert
Hello, Here is a patch (with git format-patch) that removes any timer if NO_SETITIMER is set. Éric: To test it with your workflow: $ module load apps/git/1.8.1.1.348.g78eb407-NO_SETITIMER-patch $ git clone ... Sébastien On 01/22/2013 05:14 PM, Thomas Rast w
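
For background on why removing the timer might matter: an interval timer set with setitimer() delivers SIGALRM periodically, and every delivery is an opportunity for a blocking system call to be interrupted. The sketch below only demonstrates that mechanism on a Unix system; the function name, interval and limits are hypothetical, and it is an assumption on my part (not something stated in this message) that git's progress display relies on a timer of this kind, which would fit the earlier "-q" suggestion.

```python
import signal
import time


def count_timer_ticks(interval=0.05, wanted=3, deadline=2.0):
    """Arm a periodic ITIMER_REAL timer and count SIGALRM deliveries.

    Returns the number of ticks seen once `wanted` ticks have arrived
    or `deadline` seconds have passed, whichever comes first.
    Must be called from the main thread on a Unix system.
    """
    ticks = 0

    def on_alarm(signum, frame):
        nonlocal ticks
        ticks += 1

    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, interval, interval)
    try:
        end = time.monotonic() + deadline
        while ticks < wanted and time.monotonic() < end:
            time.sleep(0.001)
    finally:
        # Disarm the timer first, then put the old handler back.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM,
                      previous if previous is not None else signal.SIG_DFL)
    return ticks
```

With no timer armed, no SIGALRM arrives at all, which is the situation a build without the timer is trying to create.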

Re: GIT get corrupted on lustre

2013-02-04 Thread Eric Chamberland
Hi, On 01/23/2013 01:34 PM, Sébastien Boisvert wrote: Hello, Here is a patch (with git format-patch) that removes any timer if NO_SETITIMER is set. Even with the patch, I finally got an error... :-/ Here are the logs (strace -f) of a clean execution and of one with the error: http://www.giref.