On Wed, Sep 2, 2015 at 8:47 PM, Wahl, Edward <ew...@osc.edu> wrote:

> I've seen this kind of error before when using samba to do something
> stupid (and let's face it, that's most everything with samba). It was a
> locking issue, I think. Things were being changed/deleted (unlinked, in
> actuality) as the client was trying to do something with them.
>
> Is the Apache process or its spawned app(s) still working on the files
> in question while serving them up?

Not as far as I know; these are result files that were generated days ago
(possibly more) and should be static by now. But I'll double-check with the
people behind the app.

> That would be my guess here. Any chance this is across NFS? I've seen that
> a great deal with this error; it used to cause crashes.

Strictly speaking it is not, but it may be because a part of the path the
server 'sees'/'knows' is a symlink to the lustre filesystem which lives on
NFS.

Thanks,
Eli

>
> Ed Wahl
> OSC
>
> ------------------------------
> *From:* lustre-discuss [lustre-discuss-boun...@lists.lustre.org] on
> behalf of E.S. Rosenberg [esr+lus...@mail.hebrew.edu]
> *Sent:* Wednesday, September 02, 2015 7:57 AM
> *To:* lustre-discuss@lists.lustre.org
> *Subject:* [lustre-discuss] refresh file layout error
>
> Hi all,
>
> I am seeing an interesting/annoying problem with lustre and am not really
> sure what to look for or where.
>
> When a webserver (galaxy using wsgi/apache2) tries to serve (large) files
> stored on lustre, it fails to send the full file and I see the following
> errors in syslog:
>
> Sep 2 11:50:17 hm-02 kernel: LustreError: 6973:0:(vvp_io.c:1197:vvp_io_init()) fs01: refresh file layout [0x200008815:0x217e:0x0] error -13.
> Sep 2 11:50:17 hm-02 kernel: LustreError: 6973:0:(file.c:179:ll_close_inode_openhandle()) inode 144115772543738238 mdc close failed: rc = -13
>
> If I try to access the files through their direct path (copying to
> tmp/md5sum/sha512sum) it seems to work without a problem (the full file is
> copied and the sums agree, from different nodes).
>
> When we switched the storage backend to NFS the server worked fine, so my
> guess is that there is an issue with the way python tries to read from the
> 'disk'...
>
> Is anyone familiar with the error above?
>
> Thanks,
> Eli
>
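For anyone who finds this thread later: on Linux, errno 13 is EACCES ("Permission denied"), so the "-13" in both log lines reads as a permission failure. Below is a minimal Python sketch, not a confirmed diagnosis of the Lustre error above. It decodes the return code and then walks the symlink-resolved path to list any component the current user cannot traverse or read. The helper names are made up for illustration, and it assumes you run it as the same user the web server runs as.

```python
import errno
import os


def describe_kernel_rc(rc):
    """Translate a negative kernel return code such as -13 into (name, message)."""
    code = abs(rc)
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


def path_components(path):
    """Return every ancestor of the symlink-resolved path, from the root down."""
    current = os.path.realpath(path)  # follows symlinks, e.g. an NFS link into lustre
    parts = []
    while True:
        parts.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return list(reversed(parts))


def unreadable_components(path):
    """List resolved path components the current (real) user cannot use."""
    blocked = []
    for component in path_components(path):
        # Directories need search (execute) permission; the file itself needs read.
        needed = os.X_OK if os.path.isdir(component) else os.R_OK
        if not os.access(component, needed):
            blocked.append(component)
    return blocked


if __name__ == "__main__":
    import sys

    print(describe_kernel_rc(-13))
    for target in sys.argv[1:]:
        print(target, "->", unreadable_components(target) or "no blocked components")
```

Run it as the web server's user with one of the affected files as an argument. An empty result only says the ordinary permission bits look fine from that node; it does not rule out group or identity mapping differences on the Lustre side.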
_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org