[OpenAFS] kernel crash linux-4.4.15 openafs-1.6.18.2

2016-10-17 Thread Hans-Werner Paulsen

Dear All,
Ten days ago I reported a problem with OpenAFS on Linux to 
openafs-bugs (#133467). I have now found more results and a way to 
work around this problem.

I have a simple program executing the following calls again and again:
  fopen afs-file mode append
  fwrite 64MiB
  fclose
The machine has a disk cache with 16GiB and afsd is started with
  -afsdb -blocks 14862564 -stat 65536 -volumes 128 -chunksize 18
The first 40 64MiB writes took about 0.7s each; the time to
append 64MiB then rises to 1.2s for write #41 and grows to 11s for write #154. 
After I run "fs flushall" the write time drops to 0.8s, but it starts 
growing again. When I kill the process, I get a kernel crash most of the 
time.


When I use afsd -dcache 32768, I see the following: for the first 128 
writes (= 128*64MiB = 32768*chunksize) the time for each write grows 
from 0.7s to 1.5s; it then jumps to 6.5s and grows to ~10s for 
write #154 and ~19s for write #220.


For our production machines I have set the chunksize to 2^21 and use the 
default dcache size (1?). Now we do not see any problems with our 
current jobs.


It seems that the algorithm to look for free dcache entries is very 
slow. Any hints?


Best regards,
Hans-Werner





Re: [OpenAFS] Re: accessing R/O volume becomes slow

2014-11-28 Thread Hans-Werner Paulsen


On 11/27/2014 01:11 PM, Stephan Wiesand wrote:

On 27 Nov 2014, at 11:26, Hans-Werner Paulsen  wrote:

Yesterday, on another machine, I created and deleted 4 million files on AFS. The 
number of afs_inode_cache slab objects grew from 1 million to 5 million. Today there 
are still 5 million entries.

It should shrink when there's memory pressure. If you're still worried, there's 
the -disable-dynamic-vcaches switch for afsd.
On my desktop PC (Linux 3.16.5 x86_64, OpenAFS 1.6.10) I set the 
-disable-dynamic-vcaches option; the -stat option has a value of 65536. 
When I create 100,000 files, I see 100,000 more afs_inode_cache slab 
objects. But the fileserver does see the effect of this option: there are only 65253 
nFEs and 65253 nCBs (4194304 nblks). Without -disable-dynamic-vcaches the 
number of CBs is about the number of created files. And if I try to 
create more files than there are nCBs on the fileserver, the fileserver 
(dafileserver) hangs for about 15 minutes (dafileserver at 100-120% CPU!), 
and I get a "connection timeout" on the client.


___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Re: accessing R/O volume becomes slow

2014-11-27 Thread Hans-Werner Paulsen


On 11/26/2014 09:15 PM, Andrew Deason wrote:

On Wed, 26 Nov 2014 10:51:00 +0100
Hans-Werner Paulsen  wrote:


Checking the machine I see more than 5 million of afs_inode_cache slab
entries. Is this normal? Any hint how to proceed?

That's not unusual if you are accessing a lot of files (say, about 5
million recently accessed). But having a lot of vcaches in memory can
cause certain operations to be slow; there was a fix just added in
1.6.10 to improve speed for a background cleanup process with lots of
files (well, and PAGs): 94f1d4.
Yesterday, on another machine, I created and deleted 4 million files on 
AFS. The number of afs_inode_cache slab objects grew from 1 million to 5 
million. Today there are still 5 million entries.



Re: [OpenAFS] accessing R/O volume becomes slow

2014-11-27 Thread Hans-Werner Paulsen


On 11/26/2014 07:54 PM, Benjamin Kaduk wrote:

On Wed, 26 Nov 2014, Hans-Werner Paulsen wrote:


Hello,
this is on Linux 3.14.8 x86_64, and OpenAFS 1.6.9. The machine is running
normally for several months, and then accessing a specific R/O volume (e.g. ls
-lR ) becomes slow. Accessing the R/W version of this volume
works normally. Accessing other R/O volumes, which have the same size and
number of files, works normally. Accessing the R/O version of the problem
volume from other clients works normally.
The command "fs flushall" does not solve the problem. The (second) "ls -lR"
command needs 10 seconds on the R/O, and 2 seconds on the R/W version of this
volume.
Accessing the R/O version from other fileservers (using fs setserverprefs)
does not change anything.
Checking the machine I see more than 5 million of afs_inode_cache slab
entries. Is this normal? Any hint how to proceed?

Are accesses from the same client to other volumes on the same (slow)
fileserver slow or fast?  Maybe you are getting throttled for too many
failed RPCs...

I have 2 volumes A and B (similar in size and number of files) with R/W 
on machine X, and R/O on machines X, Y and Z. With "fs setserverprefs" I 
am accessing only the fileserver on X.
On the problem machine the first "ls -lR" takes about 1 to 2 minutes. 
The following "ls -lR"s take 4 to 5 seconds on A(R/W), B(R/W), B(R/O), 
and 15 seconds on A(R/O).
On another client machine all four values are identical and much better 
(under 1.5 seconds).



Re: [OpenAFS] accessing R/O volume becomes slow

2014-11-26 Thread Hans-Werner Paulsen


On 11/26/2014 11:07 AM, Jan Iven wrote:

On 11/26/2014 10:51 AM, Hans-Werner Paulsen wrote:

Hello,
this is on Linux 3.14.8 x86_64, and OpenAFS 1.6.9. The machine is
running normally for several months, and then accessing a specific R/O
volume (e.g. ls -lR ) becomes slow. Accessing the R/W
version of this volume works normally. Accessing other R/O volumes,
which have the same size and number of files, works normally. Accessing
the R/O version of the problem volume from other clients works normally.
The command "fs flushall" does not solve the problem. The (second) "ls
-lR" command needs 10 seconds on the R/O, and 2 seconds on the R/W
version of this volume.
Accessing the R/O version from other fileservers (using fs
setserverprefs) does not change anything.
Checking the machine I see more than 5 million of afs_inode_cache slab
entries. Is this normal? Any hint how to proceed?


suggest to check for local disk errors (incl SMART reallocation) on 
the partition hosting your AFS cache.


Cheers
jan


Thank you for your help, but:
No, there are no local disk errors. We had the same problem (but did not 
further analyse it) a few months ago. Rebooting solved that. And if 
there is a failure with the local disk, we should see slow performance 
with other volumes, too.

HW


[OpenAFS] accessing R/O volume becomes slow

2014-11-26 Thread Hans-Werner Paulsen

Hello,
this is on Linux 3.14.8 x86_64, and OpenAFS 1.6.9. The machine is 
running normally for several months, and then accessing a specific R/O 
volume (e.g. ls -lR ) becomes slow. Accessing the R/W 
version of this volume works normally. Accessing other R/O volumes, 
which have the same size and number of files, works normally. Accessing 
the R/O version of the problem volume from other clients works normally.
The command "fs flushall" does not solve the problem. The (second) "ls 
-lR" command needs 10 seconds on the R/O, and 2 seconds on the R/W 
version of this volume.
Accessing the R/O version from other fileservers (using fs 
setserverprefs) does not change anything.
Checking the machine I see more than 5 million of afs_inode_cache slab 
entries. Is this normal? Any hint how to proceed?

Best regards,
HW


Re: [OpenAFS] Re: No buffer space available

2013-07-24 Thread Hans-Werner Paulsen

On 07/23/2013 09:09 PM, Andrew Deason wrote:

On Tue, 23 Jul 2013 11:40:53 -0500
Andrew Deason  wrote:


On Tue, 23 Jul 2013 13:11:17 +0200
Hans-Werner Paulsen  wrote:


sometimes creating a file (using different programs) fails with the
error message "No buffer space available". This is on amd64_linux26
with OpenAFS 1.6.2 (both client and server). Any idea?

I don't see anywhere we'd be generating that error code (ENOBUFS), and
I can't see how it would show up that way if we got it back from a
socket or something. Do you have any idea if the machine is under a
lot of load or memory pressure, or if the directory has a lot of
entries in it?

Oh, and if you wanted a "next step" involving less guesswork, get an
fstrace dump if you can reproduce the problem. That's something like:

fstrace clear cm
fstrace setlog cmfx -buffers 1024
fstrace sets cm -active
# do something to trigger the error, then:
fstrace dump cm > /tmp/whatever
fstrace sets cm -inactive

Providing that to a developer will let us see a lower level trace of
what's going on, and should say where that error is coming from. But it
can contain some sensitive information like filenames.

Unfortunately I cannot reproduce the problem. Thank you for the fstrace 
usage instructions anyway.



Re: [OpenAFS] Re: No buffer space available

2013-07-24 Thread Hans-Werner Paulsen

On 07/23/2013 06:40 PM, Andrew Deason wrote:

On Tue, 23 Jul 2013 13:11:17 +0200
Hans-Werner Paulsen  wrote:


sometimes creating a file (using different programs) fails with the
error message "No buffer space available". This is on amd64_linux26
with OpenAFS 1.6.2 (both client and server). Any idea?

I don't see anywhere we'd be generating that error code (ENOBUFS), and I
can't see how it would show up that way if we got it back from a socket
or something. Do you have any idea if the machine is under a lot of load
or memory pressure, or if the directory has a lot of entries in it?


The client was not under heavy load, but the fileserver and the network were.


[OpenAFS] No buffer space available

2013-07-23 Thread Hans-Werner Paulsen
Hello,
sometimes creating a file (using different programs) fails with the error
message "No buffer space available". This is on amd64_linux26 with
OpenAFS 1.6.2 (both client and server). Any idea?

Best regards,
HW

-- 
Hans-Werner Paulsen h...@mpa-garching.mpg.de
MPI für Astrophysik Tel 089-3-2602
Karl-Schwarzschild-Str. 1   Fax 089-3-2235  
D-85741 Garching


Re: [OpenAFS] Re: Mixing 1.4 and 1.6 fileservers

2013-02-14 Thread Hans-Werner Paulsen
On Mon, Sep 24, 2012 at 12:14:52PM -0500, Andrew Deason wrote:
> ... You can mix any fileserver versions in a cell, and any client
> version can use any fileserver version, etc etc. If the servers are
> running database server processes (vlserver, ptserver, etc) then mixing
> versions is not supported, but should still work with those versions.

What happens when I run 1.4.14 and 1.6.1 db servers simultaneously?
Is it safe to upgrade one db server while the db servers with the old
version are still running, provided I do NOT modify the databases during the
upgrade process?

Best regards,
HW



Re: [OpenAFS] sqlite on AFS will not work, even with whole-file locking

2010-04-22 Thread Hans-Werner Paulsen
On Thu, Apr 22, 2010 at 11:19:57AM +0100, Simon Wilkinson wrote:
> Okay, I understand now. And you're right, this is somewhat strange  
> behaviour, which has been there for years. And it won't help in the  
> cooperative locking case, sadly.
> 
> When a file is opened RW, and is marked as being dirty, the Unix cache  
> manager stores all of that files details locally - it won't update  
> stat information in response to callback breaks, nor will it flush  
> pages that are already in memory, or invalidate chunks on disk. It  
> does this in an attempt to prevent locally made changes from being  
> overwritten by those on the server (because AFS is write-on-close, and  
> our conflict resolution strategy is last-closer-wins).

If you use "flock" to coordinate access to AFS files from programs
running on different machines, it is necessary to fetch an up-to-date
copy from the fileserver as the last step of the "lock" call, and to
flush the local cache back to the fileserver before the file is "unlocked".
If that flush were the first step of the flock(LOCK_UN) call, existing programs
using "flock" would work without modification.

Best regards,
HW



Re: [OpenAFS] sqlite on AFS will not work, even with whole-file locking

2010-04-22 Thread Hans-Werner Paulsen
On Wed, Apr 21, 2010 at 09:48:27AM -0400, Derrick Brashear wrote:
> if you opened it O_RDWR on this client, it better have a valid callback.
> 
> if it's modified and you still have it open, the callback is broken.
> if the client doesn't refetch, it's a bug, and it has nothing to do
> with locking particularly.

I do not know the exact semantics of the AFS filesystem, and therefore I
do not know whether it is a bug. Is it really a bug?
Running the following program on machine A

  fd = open("xxx", O_RDONLY);
  while (1) {
      ret = fstat(fd, &buf);
      printf("size: %d  mtime: %s", (int)buf.st_size, ctime(&buf.st_mtime));
      sleep(1);
  }

and modifying "xxx" on machine B (e.g. echo "J" >>xxx) shows the
up-to-date information on machine A.
But when I open the file with
  fd = open("xxx",O_RDWR);
the stat information is never updated.
This is on i386_linux26 with 1.4.12, but I have seen this behavior for as long as I can remember.

Best regards,
HW



Re: [OpenAFS] sqlite on AFS will not work, even with whole-file locking

2010-04-21 Thread Hans-Werner Paulsen
On Wed, Apr 21, 2010 at 08:46:54AM -0400, Derrick Brashear wrote:
> if you have a valid callback, the file better be up to date. uh
Hm, I do not understand. I have the following code on one client:
(1) fd = open("afs-file",O_RDWR)
(2) flock(fd,LOCK_EX)
...
When the file is modified on the fileserver after (1) and before (2)
the copy on the client is NOT up to date (the file is opened O_RDWR).

HW



Re: [OpenAFS] sqlite on AFS will not work, even with whole-file locking

2010-04-21 Thread Hans-Werner Paulsen
On Wed, Apr 21, 2010 at 12:49:39PM +0100, Simon Wilkinson wrote:
> >When there are two processes (on different machines) executing that
> >code, the (2) flock call has to update the local copy of the afs-file,
> >otherwise locking is useless. And the (3) flock call has to sync the
> >local copy with the fileserver.
> >Writing a small test program I see that this synchronization isn't  
> >done.
> >How can I use the flock(2) call on AFS files?
> 
> Are you saying that the locks don't make it to the fileserver (so two  
> processes on different machines can flock() the same file). Or that  
> the file isn't flushed to the server when it is unlocked, so the  
> second machine doesn't see the changes that the first machine has made?
> 
The second one. To be honest, today I only checked that the local copy
(the cache) is not updated from the fileserver on "flock(fd,LOCK_EX)".

HW



Re: [OpenAFS] sqlite on AFS will not work, even with whole-file locking

2010-04-21 Thread Hans-Werner Paulsen
On Mon, Apr 12, 2010 at 12:34:23AM -0400, Derrick Brashear wrote:
> On Sun, Apr 11, 2010 at 11:13 PM, Adam Megacz  wrote:
> >
> > Brandon Simmons  writes:
> >> Thanks for the response. It seems like whole-file locking in sqlite
> >> would be a good choice for me in any case,
> >
> >> In a situation where the whole-file locking scheme is used, would AFS
> >> be an acceptable choice? Would it be better than NFS?
> >
> > I had the same idea, and tried it.  It does not work.  Your databases
> > will get corrupted.  I never figured out why, although I did confirm
> > that sqlite was in fact requesting only whole-file locks.
> >
> > It would be nice if it worked, though.  There are a lot of applications
> > out there where writes to the database are extremely rare, so
> > invalidating all the clients' caches is not a problem.
> 
> do you happen to know what the corruption looked like (blocks of
> zeroes, just not readable, something else)
> 
> -- 
> Derrick

On 27 Oct 2008 I had a question about flock on AFS, because I did not
understand how flock should work on AFS at all. May I repeat this question:

Hello,
today I am totally confused how the flock(2) call should work on
AFS files.
Normally locking works in the following way:
(1)  fd = open("afs-file", O_RDWR)
     do something
(2)  flock(fd, LOCK_EX)
     do something with "afs-file"
(3)  flock(fd, LOCK_UN)
     do something
(4)  close(fd)

When there are two processes (on different machines) executing that
code, the (2) flock call has to update the local copy of the afs-file,
otherwise locking is useless. And the (3) flock call has to sync the
local copy with the fileserver.
Writing a small test program I see that this synchronization isn't done.
How can I use the flock(2) call on AFS files?
Thank you for any help,



[OpenAFS] Re: [OpenAFS-announce] OpenAFS 1.4.12 release candidate 4 available

2010-03-12 Thread Hans-Werner Paulsen
Dear all,
on i386_linux26 with linux-2.6.32.9 and gcc-4.4.3 I compiled
openafs-1.4.12pre4 without any problems.
The client runs fine for about 2 weeks. Server not tested.

Kind regards,
Hans-Werner Paulsen



[OpenAFS] flock on AFS files

2008-10-27 Thread Hans-Werner Paulsen
Hello,
today I am totally confused how the flock(2) call should work on
AFS files.
Normally locking works in the following way:
(1)  fd = open("afs-file", O_RDWR)
     do something
(2)  flock(fd, LOCK_EX)
     do something with "afs-file"
(3)  flock(fd, LOCK_UN)
     do something
(4)  close(fd)
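For reference, the four-step sequence above as a compilable sketch (the file name, the write, and the error handling are illustrative; this shows only the client-side calls, not the cache synchronization the question is about):

```c
#include <assert.h>
#include <stdio.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Steps (1)-(4): open, lock exclusively, modify, unlock, close.
 * Returns 0 on success, -1 on any failure. */
int locked_append(const char *path, const void *buf, size_t n)
{
    int fd = open(path, O_RDWR | O_CREAT | O_APPEND, 0644);  /* (1) */
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX) != 0) {                           /* (2) */
        close(fd);
        return -1;
    }
    ssize_t w = write(fd, buf, n);  /* do something with "afs-file" */
    fsync(fd);      /* flush local changes before dropping the lock */
    flock(fd, LOCK_UN);                                      /* (3) */
    close(fd);                                               /* (4) */
    return w == (ssize_t)n ? 0 : -1;
}
```

On a local filesystem the fsync before unlocking makes the writer's data durable for the next locker; the open question in this thread is whether the AFS cache manager performs the analogous fetch-on-lock and store-on-unlock.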

When there are two processes (on different machines) executing that
code, the (2) flock call has to update the local copy of the afs-file,
otherwise locking is useless. And the (3) flock call has to sync the
local copy with the fileserver.
Writing a small test program I see that this synchronization isn't done.
How can I use the flock(2) call on AFS files?
Thank you for any help,

HW



[OpenAFS] 1.4.7pre2+patch success on i386

2008-04-04 Thread Hans-Werner Paulsen
Hello,
I added the following modifications to OpenAFS-1.4.7pre2
reported on the list:
--- openafs-1.4.7pre2/src/afs/LINUX/osi_vnodeops.c.orig Wed Mar 26 05:17:32 2008
+++ openafs-1.4.7pre2/src/afs/LINUX/osi_vnodeops.c  Thu Apr  3 14:30:36 2008
@@ -570,8 +570,10 @@
 
 AFS_GLOCK();
 
-if (fp->f_flags | O_RDONLY) /* readers dont flush */
+if ((fp->f_flags & O_ACCMODE) == O_RDONLY) {
+AFS_GUNLOCK();
return 0;
+}
 
 credp = crref();
 vcp = VTOAFS(FILE_INODE(fp));

and the OpenAFS client does not show any problems with
linux-2.6.24.4
gcc-4.2.3
on a i386 system.
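The bug this hunk fixes is easy to miss: O_RDONLY is defined as 0, so `fp->f_flags | O_RDONLY` is just `f_flags` and says nothing about the access mode. A standalone illustration (the helper names are mine):

```c
#include <assert.h>
#include <fcntl.h>

/* The original test: since O_RDONLY == 0, ORing it in is a no-op,
 * so this is just "flags != 0" -- true for O_WRONLY and O_RDWR too. */
int is_reader_buggy(int flags)
{
    return (flags | O_RDONLY) != 0;
}

/* The fix: mask out everything but the access-mode bits first. */
int is_reader_fixed(int flags)
{
    return (flags & O_ACCMODE) == O_RDONLY;
}
```

With the buggy test, a file opened O_RDWR was classified as a reader and skipped the flush on close; the patch also adds the AFS_GUNLOCK() that was missing on the early-return path.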

HW



[OpenAFS] Re: [OpenAFS-announce] OpenAFS 1.4.5 release candidate 1 available

2007-10-15 Thread Hans-Werner Paulsen
On Sun, Oct 14, 2007 at 10:16:07PM -0400, Derrick J Brashear wrote:
> Notable changes include updates for newer Linux kernels (including 
> 2.6.23) ...

Does not compile on i386 and x86_64 with kernel 2.6.23. I think that
linux-2623-support-20071004 is missing. With this patch applied I was
able to compile OpenAFS and start the client on i386 and x86_64.

HW



Re: [OpenAFS] Strange access problems on one client

2007-10-11 Thread Hans-Werner Paulsen
On Sun, Oct 07, 2007 at 01:15:00PM -0400, Marc Dionne wrote:
> Anyone care to test out the attached patch for src/dir/dir.c
> 
> The hashing code in the DirHash() function relies on integer overflow to 
> make the hval value turn into a negative value.  gcc 4.2 assumes that 
> this value can never go negative and optimizes out the (hval < 0) test.
> 
> Marc
> 
> diff -u -r1.24 dir.c
> --- src/dir/dir.c 13 Oct 2005 15:12:12 -  1.24
> +++ src/dir/dir.c 7 Oct 2007 17:10:37 -
> @@ -478,8 +478,9 @@
>  {
>  /* Hash a string to a number between 0 and NHASHENT. */
>  register unsigned char tc;
> -register int hval;
> +unsigned long hval;
>  register int tval;
> +
>  hval = 0;
>  while ((tc = (*string++))) {
>   hval *= 173;
> @@ -488,7 +489,7 @@
>  tval = hval & (NHASHENT - 1);
>  if (tval == 0)
>   return tval;
> -else if (hval < 0)
> +else if (hval >= 1<<31)
>   tval = NHASHENT - tval;
>  return tval;
>  }

this patch is fine for architectures where the size of "unsigned long"
is 4 bytes, but on the x86_64 architecture it will not work, because
there the size is 8 bytes. One can use "unsigned int" instead.
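The suggestion amounts to using a fixed-width unsigned type, so that overflow stays well defined and the width does not change between i386 and x86_64. A standalone sketch (NHASHENT is assumed to be 128 here, the dir package's power-of-two table size):

```c
#include <assert.h>
#include <stdint.h>

#define NHASHENT 128   /* assumed table size; must be a power of two */

/* DirHash with a fixed 32-bit unsigned accumulator: unsigned overflow
 * wraps predictably, unlike signed overflow, and uint32_t keeps the
 * same width on x86_64 where "unsigned long" is 8 bytes. */
int dir_hash(const char *string)
{
    uint32_t hval = 0;
    unsigned char tc;
    int tval;

    while ((tc = (unsigned char)*string++) != 0) {
        hval *= 173;
        hval += tc;
    }
    tval = (int)(hval & (NHASHENT - 1));
    if (tval == 0)
        return tval;
    if (hval >= 0x80000000u)  /* replaces the old signed (hval < 0) test */
        tval = NHASHENT - tval;
    return tval;
}
```

The 0x80000000u comparison checks the top bit of the 32-bit value, which is exactly what the original signed `(hval < 0)` test relied on before gcc 4.2's optimizer removed it.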

HW



Re: [OpenAFS] gcc-4.2.1, afs-client not working

2007-10-09 Thread Hans-Werner Paulsen
On Tue, Oct 09, 2007 at 12:23:44PM +0200, Hans-Werner Paulsen wrote:
> blabla...
Sorry, this is the same problem as "Strange access problems on one client" and is already 
solved.

HW



Re: [OpenAFS] gcc-4.2.1, afs-client not working

2007-10-09 Thread Hans-Werner Paulsen
On Fri, Oct 05, 2007 at 10:12:10AM -0400, Derrick Brashear wrote:
> If you can tell us which file or files when compiled with 4.2.1 are screwing
> up, it would make it easier to be able to figure out what's going on.
Of course, but I did not understand how the different Makefiles work to
compile the kernel module, and how to specify different compiler flags
for different source files.
Now I have some results:
Garance A Drosihn suggested trying the gcc option -fno-tree-vrp, and indeed
the kernel module compiled with this option runs fine.
In fact, only the source file src/libafs/MODLOAD-2.6.22.9-MP/afs_dir.c,
which is src/dir/dir.c, needs to be compiled with this option.

HW



[OpenAFS] gcc-4.2.1, afs-client not working

2007-10-05 Thread Hans-Werner Paulsen
Hello,
on i386_linux26 I compiled the kernel 2.6.22.9 and OpenAFS 1.4.4
using the gcc 4.2.1.
Now I get:
===
# ls -l /afs 
ls: cannot access /afs/mpa-garching.mpg.de: No such file or directory
ls: cannot access /afs/ipp-garching.mpg.de: No such file or directory
ls: cannot access /afs/world: No such file or directory
ls: cannot access /afs/rzg.mpg.de: No such file or directory
ls: cannot access /afs/andrew.cmu.edu: No such file or directory
total 14
d?  ? ??   ?? andrew.cmu.edu
drwxr-xr-x 11 root root 6144 Sep 21 07:18 ipp
??  ? ??   ?? ipp-garching.mpg.de
drwxr-xr-x  3 root root 2048 Mar 30  2007 mpa
??  ? ??   ?? mpa-garching.mpg.de
drwxr-xr-x 11 root root 6144 Sep 21 07:18 rzg
d?  ? ??   ?? rzg.mpg.de
??  ? ??   ?? world
===

When I recompiled the OpenAFS software using gcc 4.1.2 everything is fine.

Any idea or help?

Hans-Werner



Re: [OpenAFS] Maximum size /vicepX partition

2006-01-24 Thread Hans-Werner Paulsen
On Tue, Jan 24, 2006 at 02:59:00PM +0100, Horst Birthelmer wrote:
> On Jan 24, 2006, at 2:26 PM, Hans-Werner Paulsen wrote:
> >the size of a /vicepX partition on an OpenAFS-1.4.0 fileserver seems
> >to be limited to 4TByte.
> 
> Are you sure, it's the fileserver and not your partitions filesystem  
> that is limited?
> 
> Can you write data to the partition directly on the fileserver?
> I mean writing directly on the filesystem of the fileserver (in this  
> case /vicepd).

Yes.
ls -l /vicepd
total 0
drwx--2 root root   72 Jan 24 12:43 AFSIDat
drwx--2 root root   72 Jan 24 12:40 Lock

HW



[OpenAFS] Maximum size /vicepX partition

2006-01-24 Thread Hans-Werner Paulsen
Hello,
the size of a /vicepX partition on an OpenAFS-1.4.0 fileserver seems
to be limited to 4TByte.
I can create volumes on this fileserver/partition, but when I try to
create a file on this volume I get:
$ touch xxx
touch: creating `xxx': No space left on device
The fileserver writes the following lines to the log file:
Tue Jan 24 12:40:10 2006 File Server started Tue Jan 24 12:40:10 2006
Tue Jan 24 12:42:53 2006 Partition /vicepd that contains volume 536920876 is full
Tue Jan 24 12:44:41 2006 Shutting down file server at Tue Jan 24 12:44:41 2006
Tue Jan 24 12:44:41 2006 Partition /vicepd: -5451620 available 1K blocks (minfree=0),
Tue Jan 24 12:44:41 2006 overallocated by 10008912 blocks
...



[OpenAFS] fileserver monitor

2005-10-20 Thread Hans-Werner Paulsen
Hello,
one of our OpenAFS fileserver is very busy, and I want to check
who is doing the top I/O. Neither "afsmonitor" nor "scout" can tell
me who (client) is using which data (volume). Are there any other
tools around to identify the top usage?

HW



Re: [OpenAFS] Callback/Cache Issues with 1.3.82 on FC3

2005-05-04 Thread Hans-Werner Paulsen
On Wed, May 04, 2005 at 08:41:19AM +0200, Stephan Wiesand wrote:
> On Tue, 3 May 2005, chas williams - CONTRACTOR wrote:
> 
> >i cant seem to duplicate this failure.  deleting on the 1.2 client
> >makes the file disappear on the 1.3 client as well.  do you have
> >a little more info?  cache types for the clients?

We have the same problem, one machine 1.3.81, the other one 1.3.82:
1.3.82:~ >echo a > a
1.3.82:~ >cat a
a
1.3.81:~ >cat a
a
1.3.81:~ >rm a
1.3.81:~ >cat a
cat: a: No such file or directory
1.3.82:~ >cat a
a

On both machines we are using identical software, compiled using
gcc-3.3.2, only OpenAFS versions are different.
linux   2.6.11.7
glibc   2.3.2
cache   ext2



Re: [OpenAFS] Re: Problem with openafs-1.3.81 on kernel 2.6.11.7

2005-04-15 Thread Hans-Werner Paulsen
On Fri, Apr 15, 2005 at 12:02:39PM +0100, Dr A V Le Blanc wrote:
>...
> rebooted the system and gone into afs and done a
> 
>  for i in `find . -type f -noleaf`;do cat $i >/dev/null;done
> 
> The cache partition quickly gets overfull and input/output errors
> appear.  This looks like a real problem.  I am now using the
> Debian packages for 1.3.81, but previously I compiled openafs
> from source and had the same problem, so it does not appear to
> be specific to the Debian packages.
> 
> Is anyone else running 1.3.81 on 2.6.11 of some form, who is willing
> to try the same test?
> 

I have the same problem with my machine:
linux-2.6.11.5
openafs-1.3.81
gcc-3.3.2
glibc-2.3.2
This is a self-made system using unmodified sources.

The three commands "df", "du" and "fs getcache" show different
results (unit = 1K):
df: 669132 of 1035692
fs: 432785 of 50 (I reduced the max size to avoid problems)
du: 2520660 (!!!)

And the cache partition is ext2.



Re: [OpenAFS] Fileserver Problem with OpenAFS 1.3.73 on Debian Sarge

2004-11-10 Thread Hans-Werner Paulsen
On Tue, Nov 09, 2004 at 11:44:48AM -0500, Derrick J Brashear wrote:
> On Tue, 9 Nov 2004, Lars Schimmer wrote:
> 
> >It seems to work correctly, but now the problems appear:
> >VLLog on the backup database server:
> >Tue Nov  9 12:55:24 2004 ubik: A Remote Server has addresses: @(#) OpenAFS
> >1.2.11 built  2004-01-11
> >~  [127.0.0.1 134.169.37.177]
> >~   It will replace the following existing entry in the VLDB (same uuid):
> >~  entry 4: [134.169.37.177]
> 
> vos changeaddr -remove 127.0.0.1 and put 127.0.0.1 in NetRestrict; this is 
> a bug which will be fixed in 1.3.74

The problem is NOT fixed in version 1.3.74. I checked this on i386_linux24.
I was not able to remove the localhost entry with this command:
vos changeaddr -remove 127.0.0.1
  Could not remove server 127.0.0.1 from the VLDB
  vlserver does not support the remove flag or VLDB: no such entry
But creating the NetRestrict file, and restarting the fileserver/volserver
removes this entry.
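For reference, the workaround file is just an address list, one address per line; the path shown assumes a Transarc-style server installation (packaged builds may keep it under /etc/openafs/server instead). The file /usr/afs/local/NetRestrict containing:

```
127.0.0.1
```

followed by a restart of the fileserver/volserver instance, so that the interface list and sysid are regenerated without the loopback address.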



Re: [OpenAFS] Fileserver Problem with OpenAFS 1.3.73 on Debian Sarge

2004-11-09 Thread Hans-Werner Paulsen
On Tue, Nov 09, 2004 at 01:18:44PM +0100, Lars Schimmer wrote:
> I used the debian experimental packages of OpenAFS 1.3.73,
> openafs-client - The AFS distributed filesystem- client support
> openafs-fileserver - The AFS distributed filesystem- file server
> openafs-modules-source - The AFS distributed filesystem- Module Sources
> buitl by Maintainer: Sam Hartman <[EMAIL PROTECTED]>
> 
> It seems to work correctly, but now the problems appear:
> VLLog on the backup database server:
> Tue Nov  9 12:55:24 2004 ubik: A Remote Server has addresses: @(#) OpenAFS
> 1.2.11 built  2004-01-11
> ~  [127.0.0.1 134.169.37.177]
> ~   It will replace the following existing entry in the VLDB (same uuid):
> ~  entry 4: [134.169.37.177]
> 
> Hell, what went wrong?
> Yes, the fileserver sees my server as 127.0.0.1, but why?

Hello,
on an amd64_linux24 system with OpenAFS 1.3.73 I have the same problem:
127.0.0.1 is added to "sysid" and sent to the VLDB server.
Creating a NetRestrict file with this IP address solves the problem.

Hans-Werner
