Re: [OpenAFS] Re: Advice on a use case

2012-11-08 Thread Timothy Balcer
On Tue, Nov 6, 2012 at 8:49 AM, Andrew Deason wrote: > On Tue, 6 Nov 2012 00:06:53 -0800 > Timothy Balcer wrote: > > > I have a need to think about replicating large volumes (multigigabyte) > > of large number (many terabytes of data total), to at least two other > > servers besides the read writ

RE: [OpenAFS] 1.4.x, select() and recent RHEL kernels beware

2012-11-08 Thread Renata Maria Dart
Hi Dan, thanks for your efforts in researching this problem and posting it. And thanks to Arne for his response as well. Renata On Thu, 8 Nov 2012, Dan Van Der Ster wrote: >Just run > >ulimit -Hn > >If it says 4096 your AFS will probably crash. If it says 1024 you are safe (as >far as we'

Re: [OpenAFS] 1.4.x, select() and recent RHEL kernels beware

2012-11-08 Thread Stephan Wiesand
Hi Dan, On Nov 8, 2012, at 16:41 , Dan Van Der Ster wrote: [...] > All of the nasty details of this incident here: >https://afs.web.cern.ch/afs/reports/html/afs200SegFaults.html > > We're now running with a workaround, > ulimit -Hn 1024; ulimit -Sn 1024 > in our init scripts until we manag
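For sites that start the server from a wrapper rather than a shell init script, the quoted `ulimit -Hn 1024; ulimit -Sn 1024` workaround can be approximated in Python. This is only a sketch and not something posted in the thread: the function names are hypothetical, the cap of 1024 is taken from the quoted workaround, and unlike `ulimit` it never tries to raise a limit that is already lower than the cap.

```python
import resource

def capped_limits(soft, hard, cap=1024):
    """Return (soft, hard) lowered to at most `cap`; limits are never raised."""
    def clamp(value):
        # RLIM_INFINITY means "unlimited", so it always needs capping.
        if value == resource.RLIM_INFINITY or value > cap:
            return cap
        return value

    new_hard = clamp(hard)
    new_soft = min(clamp(soft), new_hard)
    return new_soft, new_hard

def cap_nofile(cap=1024):
    """Apply the cap to this process; child processes inherit the result."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    new_limits = capped_limits(soft, hard, cap)
    resource.setrlimit(resource.RLIMIT_NOFILE, new_limits)
    return new_limits
```

Calling `cap_nofile()` before launching the server process would give the child the lowered limits, since resource limits are inherited across fork and exec.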

Re: [OpenAFS] 1.4.x, select() and recent RHEL kernels beware

2012-11-08 Thread Arne Wiebalck
From what I see on our most recent RHEL derived SLC kernels this change is only in 6. Cheers, Arne On Nov 8, 2012, at 5:46 PM, Renata Maria Dart wrote: > Hi, does this issue apply to both rhel5 and 6? > > Thanks, > > Renata > > >> Unless you manually set HAVE_POLL, you may not have it

RE: [OpenAFS] 1.4.x, select() and recent RHEL kernels beware

2012-11-08 Thread Dan Van Der Ster
Just run ulimit -Hn If it says 4096 your AFS will probably crash. If it says 1024 you are safe (as far as we've seen). Cheers, Dan From: Renata Maria Dart [ren...@slac.stanford.edu] Sent: 08 November 2012 17:46 To: Derrick Brashear Cc: Dan Van De
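The same check can be scripted. The following is a minimal Python sketch, not anything from the thread: it reads the hard open-file limit (the value `ulimit -Hn` prints) and flags anything above 1024. The threshold is an assumption matching glibc's usual `FD_SETSIZE`, and the function names are hypothetical.

```python
import resource

# Assumed value of FD_SETSIZE; 1024 is the usual glibc default on Linux.
ASSUMED_FD_SETSIZE = 1024

def hard_nofile_limit():
    """Return the hard open-file limit, i.e. what `ulimit -Hn` reports."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return hard

def exceeds_select_limit(hard, fd_setsize=ASSUMED_FD_SETSIZE):
    """True if a process could be handed a descriptor select() cannot track."""
    if hard == resource.RLIM_INFINITY:
        return True  # "unlimited" is certainly above FD_SETSIZE
    return hard > fd_setsize

if __name__ == "__main__":
    hard = hard_nofile_limit()
    state = "at risk" if exceeds_select_limit(hard) else "within FD_SETSIZE"
    print(f"hard limit {hard}: {state}")
```

A hard limit of exactly 1024 counts as safe here because the highest descriptor a process can then receive is 1023.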

Re: [OpenAFS] 1.4.x, select() and recent RHEL kernels beware

2012-11-08 Thread Renata Maria Dart
Hi, does this issue apply to both rhel5 and 6? Thanks, Renata >Unless you manually set HAVE_POLL, you may not have it enabled in 1.6: >we didn't actually do the configure test for it. It will be fixed in 1.6.2. > >Incidentally, of note, currently salvsync unlike fssync doesn't ever try >poll()

[OpenAFS] Re: 1.4.x, select() and recent RHEL kernels beware

2012-11-08 Thread Andrew Deason
On Thu, 8 Nov 2012 15:41:57 + Dan Van Der Ster wrote: > Finally we realised this was due to fssync.c in 1.4's use of > select()/FD_SET and the corrupting behaviour of those functions when > using >1024 file descriptors per process. Until quite recently this > hadn't been a problem, since RHEL
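To see why descriptors above 1023 corrupt memory rather than fail cleanly, it helps to model what `FD_SET` does. The sketch below is illustrative Python, not OpenAFS code. It assumes glibc's layout on 64-bit Linux (an `FD_SETSIZE` of 1024 stored as 16 words of 64 bits), and the helper names are hypothetical.

```python
# Assumptions: glibc on 64-bit Linux, where fd_set is an array of 64-bit words.
FD_SETSIZE = 1024
BITS_PER_WORD = 64
WORDS_IN_FD_SET = FD_SETSIZE // BITS_PER_WORD  # 16 words, 128 bytes in total

def fd_set_slot(fd):
    """Return (word index, bit index) that FD_SET(fd, &set) would touch."""
    return divmod(fd, BITS_PER_WORD)

def fits_in_fd_set(fd):
    """True if the word FD_SET would write to lies inside the fd_set."""
    word, _bit = fd_set_slot(fd)
    return fd >= 0 and word < WORDS_IN_FD_SET

if __name__ == "__main__":
    for fd in (0, 1023, 1024, 4095):
        word, bit = fd_set_slot(fd)
        where = "inside" if fits_in_fd_set(fd) else "past the end of"
        print(f"fd {fd}: word {word}, bit {bit}, {where} the fd_set")
```

Under these assumptions descriptor 1024 maps to word 16, the first word beyond the structure, so `FD_SET` performs no bounds check and flips a single bit in whatever happens to sit next to the `fd_set` in memory.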

Re: [OpenAFS] 1.4.x, select() and recent RHEL kernels beware

2012-11-08 Thread Derrick Brashear
On Thu, Nov 8, 2012 at 10:41 AM, Dan Van Der Ster wrote: > Dear OpenAFS 1.4.x Users, > > At CERN we just suffered from a confusing problem where the fileserver > process would regularly segfault (on only one new server just put into > production). Since a gdb of the fileserver core file was show

[OpenAFS] 1.4.x, select() and recent RHEL kernels beware

2012-11-08 Thread Dan Van Der Ster
Dear OpenAFS 1.4.x Users, At CERN we just suffered from a confusing problem where the fileserver process would regularly segfault (on only one new server just put into production). Since a gdb of the fileserver core file was showing random bit flips here and there, we initially suspected a bad

Re: [OpenAFS] Re: [OpenAFS-announce] OpenAFS 1.7.18 released for Microsoft Windows - Win 8 and Server 2012

2012-11-08 Thread Jeffrey Altman
On Thursday, November 08, 2012 10:11:24 AM, Steve Simmons wrote: > I have no clue how much work that would be, but it's a helluva idea. Start by porting RX to C#.

Re: [OpenAFS] Re: [OpenAFS-announce] OpenAFS 1.7.18 released for Microsoft Windows - Win 8 and Server 2012

2012-11-08 Thread Steve Simmons
On Nov 6, 2012, at 2:26 PM, Gary Buhrmaster wrote: > On Mon, Nov 5, 2012 at 1:32 PM, Jeffrey Altman wrote: >> OpenAFS 1.7.18 is the next in a series of OpenAFS clients for the Microsoft >> Windows platform that is implemented as a native file system. > > I am not asking for it, just curious if Ope