Re: [OpenAFS] Re: Performance issues with Git repositories (or in general with many small files workloads)

2020-12-17 Thread Ciprian Dorin Craciun
On Thu, Dec 17, 2020 at 11:44 AM  wrote:
> 1. You could use git repack to transfer less files without losing the ability 
> of incremental updates.

For some reason Git (on the receiving side), although it receives a
pack, unpacks it into loose objects.
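
(If I understand the Git documentation correctly -- I have not verified this
on my setup -- this behaviour is controlled on the receiving repository by
`receive.unpackLimit` / `transfer.unpackLimit`;  e.g. something like:)

git -C /afs/.cell/some-path/repository.git config receive.unpackLimit 1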

However my question also relates to other use-cases where one has to
handle a large number of files.  (Git was just the latest use-case I
faced these days where the performance issue popped up.)


> 2. You could turn off sync-after-close in the cache manager, see fs 
> storebehind. This should increase upfront performance but may degrade again, 
> should your cache run out of file handles. So, you'd have to play with cache 
> parameters, as well.

I've already set storebehind to 16 MiB, which is well above the
average size of a Git object.
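
(For reference, this is the exact command I've used:)

fs storebehind -allfiles 16384 -verbose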

Moreover, I've even tried passing `-sync never` to the fileserver, and
that didn't make much difference either.

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Performance issues with Git repositories (or in general with many small files workloads)

2020-12-16 Thread Ciprian Dorin Craciun
On Thu, Dec 17, 2020 at 1:51 AM Douglas E Engert  wrote:
> If you are just backing up, consider "git bundle" that creates one file,
> and git clone can read the bundle.
>
> https://stackoverflow.com/questions/5578270/fully-backup-a-git-repo


Thank you for the suggestion.

I know about `git-bundle`, however I don't want only a backup;  I may
also need a "working" Git repository, hence my choice of `git push
--mirror`.

But indeed, if one only wants to back up a Git repository, then `git
bundle` is the best option, as it results in only one file, which
OpenAFS handles flawlessly.
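
(For completeness, the workflow I mean is roughly the following -- a sketch
with placeholder paths:)

git bundle create /afs/.cell/some-path/repository.bundle --all
git clone /afs/.cell/some-path/repository.bundle /tmp/restored-repository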

On the downside, however, bundles offer no support for incremental
backups;  i.e. each new backup will be a full dump.

Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] Re: Performance issues with Git repositories (or in general with many small files workloads)

2020-12-16 Thread Ciprian Dorin Craciun
After fiddling with the `fileserver` arguments, I think the
problematic ones were `-p 128` and `-vhandle-max-cachesize 32768`, and
perhaps the too-large `-b`, `-l` and `-s`.  (Also, I've switched back
to the non-demand-attach variant of the servers.)

The new arguments I'm using are:

/usr/lib/openafs/fileserver -syslog \
-sync onclose \
-p 16 \
-udpsize 67108864 -sendsize 67108864 \
-rxpck 65536 -rxmaxmtu 1400 \
-cb 1048576 -busyat 65536 \
-vc 4096 -b 4096 -l 65536 -s 262144


With these new options things work much better:  I now get ~500 KiB/s
where previously I had only ~20 KiB/s of throughput.  Depending on the
repository, I can even obtain ~10 or ~20 MiB/s if it contains larger
files.

Now, regarding the arguments, what exactly is a `vhandle`?  The
documentation hints at "file handles";  are these the actual OS
file-handles?

Is there perhaps a bottleneck for large values of the block and vnode caches?

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] Performance issues with Git repositories (or in general with many small files workloads)

2020-12-16 Thread Ciprian Dorin Craciun
Hello all!

I'm trying to use AFS to backup various Git repositories.  By "backup"
I actually mean `git push --mirror /afs/.cell/some-path/repository.git`,
which has the following behaviour:  it writes many small files in the
`.git/objects` folder, fanned out by the first two hex digits of the
object hash.

In fact this pattern can be found in many applications that handle
lots of small files.  For example `rsync`, build systems, etc.
Moreover the pattern I'm describing is single-threaded, as in these
files are not created concurrently by multiple threads / processes.
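
(A quick way to reproduce the pattern outside of Git -- a hedged sketch in
`bash`, with hypothetical paths -- is to create many small files fanned out
by the first two hex digits of a hash, single-threaded:)

cd /afs/.cell/some-path/fanout-test.d
for i in $(seq 1 10000) ; do
    h=$(printf '%d' "$i" | sha1sum | cut -d ' ' -f 1)
    mkdir -p "${h:0:2}"
    head -c 4096 /dev/urandom > "${h:0:2}/${h:2}"
done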

Unfortunately the performance is abysmal:  what should take perhaps
1-2 seconds on a normal drive takes up to a minute on AFS;  for
example `git-push` reports a bandwidth of only ~20 KiB/s.

Looking at the CPU usage, the `dafileserver` seems to be at ~95%,
although the system has 4 cores and is lightly used.

I can eliminate the following causes:
* network issues (both bandwidth and latency), because this behaviour
occurs even if I mount AFS on the same server where the file server
lives, thus everything happens over loopback;
* encryption -- it is off;
* synchronous close -- I've tried to set `fs storebehind -allfiles
16384 -verbose`;
* disks backing the AFS cache -- it's an NVMe disk capable of ~3 GiB/s;
* disks backing the AFS file server -- it's a RAID5 of 3 top-of-the-line
(Gold) WD S-ATA drives;
* I can achieve good throughput for large files, or if accessing
medium sized files from multiple threads / processes;

My OpenAFS deployment is on Linux 5.3.18, OpenSUSE Leap 15.2, and the
following are the arguments of the file server and cache manager:


/usr/lib/openafs/dafileserver -syslog -sync onclose \
-p 128 -b 524288 -l 524288 -s 1048576 -vc 4096 \
-cb 1048576 -vhandle-max-cachesize 32768 \
-udpsize 67108864 -sendsize 67108864 \
-rxpck 4096 -rxmaxmtu 1400 -busyat 65536



/usr/sbin/afsd -blocks 67108864 -chunksize 17 -files 524288 \
-files_per_subdir 4096 -dcache 524288 \
-stat 524288 -volumes 4096 \
-splitcache 90/10 \
-afsdb -dynroot-sparse -fakestat-all \
-inumcalc md5 -backuptree \
-daemons 8 -rxmaxfrags 8 -rxmaxmtu 1400 \
-rxpck 4096 -nosettime


BTW, initially I was using the old `fileserver`-based setup, and
even after switching to `dafileserver` the performance seemed to
stay unchanged.

Thanks for the help,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Git throwing bus error when pack files not entirely cached

2020-04-22 Thread Ciprian Dorin Craciun
Sorry for reviving this old thread, but this happened to me also.


My details:
* the same computer is used as both server and client;
* openSUSE 15.0 as distribution;
* Linux 4.12.14-lp150.12.79-default x86_64
* OpenAFS 1.8.0-lp150.2.2.1 (both client and server packages);
* OpenAFS `kmp-default` 1.8.0_k4.12.14_lp150.12.13-lp150.2.2.1;
* Git 2.16.4;
* large Git repository:
  * ~30 GiB;
  * most objects are packed;
  * largest pack is ~2.5 GiB;
  * a few are 200-400 MiB;
  * most are under 128 MiB;
* contents of `/etc/openafs/cacheinfo`:
  /afs:/var/cache/openafs:33554432
* `/var/cache/openafs` is on Ext4 and still has a lot of space free;
* nothing useful in either `dmesg` or the logs;
* `afsd` is started as:

/usr/sbin/afsd -blocks 33554432 -chunksize 17 -files 524288 \
-files_per_subdir 4096 -dcache 524288 -stat 524288 -volumes 4096 \
-splitcache 90/10 -afsdb -dynroot-sparse -fakestat-all -inumcalc md5 \
-backuptree -daemons 8 -rxmaxfrags 8 -rxmaxmtu 1400 -rxpck 4096 \
-nosettime



How to provoke it:  run one of the following:


git fsck --root --tags --no-reflogs --full --connectivity-only \
--unreachable --dangling --name-objects

git fsck --root --tags --no-reflogs --full --strict --unreachable \
--dangling --name-objects


At random times I get:

Bus error (core dumped)



If one wants to try this and doesn't have a large enough repository, I
would recommend:

https://github.com/cdnjs/cdnjs


I haven't yet tried to preload the files.
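
(By "preload" I mean something along these lines, i.e. forcing the cache
manager to fetch everything before running `git fsck`:)

find . -type f -print0 | xargs -0 -n 64 -- cat > /dev/null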


Hope it helps find the issue,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] OpenAFS client softlockup on highly concurrential file-system patterns (100% CPU in kernel mode)

2019-11-25 Thread Ciprian Dorin Craciun
On Mon, Nov 25, 2019 at 2:53 AM Benjamin Kaduk  wrote:
> > * I suspect that perhaps the issue is due to the latest kernel version,
> > because I have run similar patterns a few weeks ago on an older kernel (but
> > still from the `5.x` family), but can't say for sure;
>
> I see the diagnostics and further data points later in the thread, but are
> you in a position to boot an older kernel to attempt to confirm/refute this
> hypothesis?


The issue was on my personal laptop, thus I can try to install an
older kernel and retry.

(However, given that OpenSUSE Tumbleweed is a rolling release, I
think I'll have a hard time finding an older kernel...)

I'll report back if I manage to do this.

Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] OpenAFS client softlockup on concurrential file-system patterns (100% CPU in kernel mode)

2019-11-20 Thread Ciprian Dorin Craciun
On Wed, Nov 20, 2019 at 9:37 PM Ciprian Dorin Craciun
 wrote:
> Now the client works OK, however if I start the `afsd` client on the
> server itself (i.e. over `loopback` network), where previously (with
> `-jumbo`) I was able to max-out the disks (~300 MiB/s), now seems to
> be capped at around ~120 MiB/s.  (The packets-per-second rate is around
> ~120K...)


Minor correction (only to the item above;  the rest still stands):
after restarting the `afsd` cache on the server itself (thus over
`loopback`), I am once again able to max out the disks (read-only).
(For some reason this wasn't the case in the previous test...)  (It's
not the benchmark either, as I read large ~20 MiB files with 16
concurrent readers, and the same command was used in both test cases.)

Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] OpenAFS client softlockup on concurrential file-system patterns (100% CPU in kernel mode)

2019-11-20 Thread Ciprian Dorin Craciun
Before replying, I want to note that I think I've stumbled upon three
(perhaps related) issues (some of which might just be configuration
errors):
* AFS file access getting stuck;  (seems to be solved by increasing
the number of `fileserver` threads from `-p 4` to `-p 128`;)
* trying to `SIGTERM` or `SIGKILL` a "stuck" process, takes Linux (in
kernel code) to 100% CPU;
* having `-jumbo -rxmaxmtu 9000` on the server, but not on the client,
yields poor performance;

This new thread (which I was just going to open myself) is related to
the third problem of mismatch between server and client jumbo frames
setting.




On Wed, Nov 20, 2019 at 8:59 PM Kostas Liakakis  wrote:
> (Yesterday over wireless I didn't use Jumbo frames, but the day
> before, where the same thing happened, I was using them.)
>
> Does this mean that "the other day with jumbo frames" was over GigE?  Does
> this happen over GigE with jumbo frames disabled as well?


So, apparently having `-jumbo -rxmaxmtu 9000` on the server, but not
configuring jumbo frames on the client yields poor performance.

(Also the "getting stuck" issue happens regardless of this other problem.)

Without touching the `fileserver` parameters, none of the following
seem to work:
* `afsd` with `-rxmaxmtu 9000` but without jumbo frames configured on
the network card;  (a clear misconfiguration on my part;)
* `afsd` with `-rxmaxmtu 1500` but over GigaBit Ethernet (and without
jumbo frames configured on the network card);  (a usual client on the
same network, without jumbo frames support;)
* `afsd` with `-rxmaxmtu 1500` but over Wifi (which is capable of ~14
MiB/s receive);  (clearly no jumbo frames are supported;)
* as mentioned, only matching the server configuration seems to solve
the issue;
* (encryption is disabled;)


I've changed the `fileserver` parameters by removing `-jumbo` and
setting `-rxmaxmtu 1400` (I also intend to use this over WAN, thus
over PPPoE and VPN, which will further reduce the usable MTU).
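
(To double-check the negotiated MTU I use the same commands as in my earlier
MTU tuning session, with placeholder addresses:)

cmdebug -server 192.168.0.2 -addrs        # on the client;  shows the MTU the cache manager picked up
rxdebug -server 192.168.0.1 -peer -long   # on the server;  shows ifMTU / natMTU / maxMTU per peer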

Now the client works OK, however if I start the `afsd` client on the
server itself (i.e. over `loopback` network), where previously (with
`-jumbo`) I was able to max-out the disks (~300 MiB/s), now seems to
be capped at around ~120 MiB/s.  (The packets-per-second rate is around
~120K...)




> I've seen problems finally attributed to jumbo frames, where some
> configuration change on a switch someplace along the path rendered them
> unusable.


I don't think this is the case here.  I have only one switch between
the client and the server (no other network equipment), and I haven't
encountered performance problems (even with regard to jumbo frames).


Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] OpenAFS client softlockup on highly concurrential file-system patterns (100% CPU in kernel mode)

2019-11-20 Thread Ciprian Dorin Craciun
On Wed, Nov 20, 2019 at 7:49 PM Mark Vitale  wrote:
> > The following are the arguments of `fileserver`:
> > -syslog -sync always -p 4 -b 524288 -l 524288 -s 1048576 -vc 4096 -cb
> > 1048576 -vhandle-max-cachesize 32768 -jumbo -udpsize 67108864
> > -sendsize 67108864 -rxmaxmtu 9000 -rxpck 4096 -busyat 65536
>
> I see some areas of concern here.  First of all, many of your parameters
> indicate that you expect to run relatively high load through this fileserver.
> Yet there are only -p 4 server threads defined.  The fileserver will 
> automatically
> increase this to the minimum of 6, but that still seems quite low.


These parameters (at least most of them) were empirically identified
for a highly concurrent access pattern, over a large number of 16 KiB
to 20 MiB files, from a low number of users (2-3), over a low-latency
network (wired, GigaBit, same LAN).  (I also had an IRC discussion
with Jeffrey about this topic.)

There is a thread on this mailing list from 9th March 2019, with the
subject <>, where I've also listed the IRC discussion with Jeffrey
about this topic.  The `-p` argument is explicitly present in that
discussion.

The main use-case of my setup is a home / SOHO file server acting as a
NAS.  Therefore all my parameters are tuned towards low-latency and
high-bandwidth access, at the expense of server RAM (thus the large
number for buffers count and sizes).




> This low thread number, combined with a very large -busyat value,
> means that this fileserver will queue a very large backlog before returning
> VBUSY to the client.  Is there a reason you need to keep the fileserver
> threads so low?  Would it be possible for you to increase it dramatically
> (perhaps 100) and try the test again?


I've just increased this number to `-p 128`, and re-executed the
build.  (I haven't restarted the client, but I did restart the
server.)

Under initial parameters (i.e. 8 parallel builds) I wasn't able to
replicate the issue in 10 tries.

(The solution for this item seemed to be removing `-jumbo` and setting
`-rxmaxmtu 1500` instead of `9000`.)
Thus I've deleted around ~2K output files and increased the
parallelism to 32.  Under these conditions, although the build didn't
block, the bandwidth (over wireless) was around 500 KiB/s (receive),
when I would have expected more (the input files are much larger than
the output files, for instance ~300 KiB in to ~25 KiB out), and the
task completion rate seemed very jagged (i.e. no progress for a while,
then all of a sudden 10 tasks would finish).  (I mention that the
workload is not CPU-bound;  average CPU on the client is around ~20%.)

I've tried this second scenario (with the no-Jumbo settings) a few
times and still nothing got stuck.


However, even if the case of "stuck process for 20 minutes" is solved,
there is still the issue of trying to `SIGTERM` those waiting
processes, which sends the kernel to 100% CPU.

If I can try other experiments, please let me know.

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] OpenAFS client softlockup on highly concurrential file-system patterns (100% CPU in kernel mode)

2019-11-20 Thread Ciprian Dorin Craciun
On Wed, Nov 20, 2019 at 7:03 PM Mark Vitale  wrote:
> Thank you for the backtraces.  I agree that 'gm' is the problematic thread;
> it appears to be stuck in rxi_WriteProc waiting for the Rx packet transmit 
> window
> to advance.  That is, it's waiting for acknowledgments - probably from the 
> fileserver.


It's true that the test was performed over wireless, however the same
behaviour was encountered even over GigaBit LAN.
(This is a personal setup -- server, network and client alike -- and
there was little to no other usage on the client, the server or the
network.)

> Unfortunately the rest of the backtrace seems muddled and so we can't tell 
> exactly
> what the client was doing.  In fact, many of the backtraces are incomplete.

I haven't deleted anything from any particular process's stack trace,
although I have removed processes that have nothing to do with AFS,
i.e. whose stacks don't contain `afs`.

(If you think it would be useful I can send you privately a complete,
uncensored, output.)


> If I have some time later this week, I may try to reproduce this issue.
> However, there's no guarantee I will be able to do so, so it would be better
> if we could either obtain more information from your site, or if you could
> narrow the problem down to a simpler test case.

I'll try to reproduce this without the actual build system.  (Using
say `stat`, `cp` and `xargs`.)
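
(Something along these lines -- a sketch with placeholder paths, run from
inside an AFS volume:)

find . -type f -print0 | xargs -0 -n 64 -P 8 -- stat > /dev/null
find . -type f -print0 | xargs -0 -P 8 -I{} -- cp -- {} {}.tmp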


> Do you have FileLogs and/or fileserver audit logs for the time in question?

Yes, I do have access to them.

The following is the syslog output from the OpenAFS server in a
5-minute time-window around the stacktrace sent yesterday:

FindClient: stillborn client 0x7fe9b0012dc0(77749fe8); conn
0x7fe9d800e390 (host 172.30.214.35:7001) had client
0x7fe9b00131d0(77749fe8)
FindClient: stillborn client 0x7fe9b00132a0(77749fec); conn
0x7fe9d800e660 (host 172.30.214.35:7001) had client
0x7fe9b0012dc0(77749fec)
FindClient: stillborn client 0x7fe9b0013030(77749fec); conn
0x7fe9d800e660 (host 172.30.214.35:7001) had client
0x7fe9b0012dc0(77749fec)
FindClient: stillborn client 0x7fe9b0012cf0(77749fec); conn
0x7fe9d800e660 (host 172.30.214.35:7001) had client
0x7fe9b0012dc0(77749fec)


No information is present in `/var/log/openafs` in that timeframe.

The following are the arguments of `fileserver`:

-syslog -sync always -p 4 -b 524288 -l 524288 -s 1048576 -vc 4096 -cb
1048576 -vhandle-max-cachesize 32768 -jumbo -udpsize 67108864
-sendsize 67108864 -rxmaxmtu 9000 -rxpck 4096 -busyat 65536


(Yesterday over wireless I didn't use Jumbo frames, but the day
before, when the same thing happened, I was using them.)

Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] OpenAFS client softlockup on highly concurrential file-system patterns (100% CPU in kernel mode)

2019-11-19 Thread Ciprian Dorin Craciun
On Tue, Nov 19, 2019 at 10:38 PM Ciprian Dorin Craciun
 wrote:
> At the following link you can find an extract of `dmesg` after the
> sysrq trigger.
>
>   
> https://scratchpad.volution.ro/ciprian/f89fc32a0bbd0ae6d6f3edbbc3ee111c/b9c3bc4f795bbe9e7eaca93b0a57bea0.txt


I forgot to mention that in this case the CPU didn't go up to 100%, in
fact it was quite "quiet".  (The 100% CPU seems to happen only after a
process "blocks" and I try to `SIGTERM` or `SIGKILL` it.)

Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] OpenAFS client softlockup on highly concurrential file-system patterns (100% CPU in kernel mode)

2019-11-19 Thread Ciprian Dorin Craciun
On Tue, Nov 19, 2019 at 5:10 PM Ciprian Dorin Craciun
 wrote:
> > # echo t > /proc/sysrq-trigger


At the following link you can find an extract of `dmesg` after the
sysrq trigger.

  
https://scratchpad.volution.ro/ciprian/f89fc32a0bbd0ae6d6f3edbbc3ee111c/b9c3bc4f795bbe9e7eaca93b0a57bea0.txt

(I have filtered processes that don't have `afs` in their name, mainly
because it exposes all my workstation's processes.  However I can
provide privately a complete file.)




The following is the process which gets stuck (it took almost ~25
minutes to complete, and it is not input file related):


gm  S0 27572  27562 0x8000
Call Trace:
 ? __schedule+0x2be/0x6d0
 schedule+0x39/0xa0
 afs_cv_wait+0x10a/0x300 [libafs]
 ? wake_up_q+0x60/0x60
 rxi_WriteProc+0x21d/0x410 [libafs]
 ? rxfs_storeUfsWrite+0x55/0xb0 [libafs]
 ? afs_GenericStoreProc+0x11a/0x1f0 [libafs]
 ? afs_CacheStoreDCaches+0x1a9/0x5b0 [libafs]
 ? afs_CacheStoreVCache+0x32c/0x680 [libafs]
 ? __filemap_fdatawrite_range+0xca/0x100
 ? afs_osi_Wakeup+0xb/0x60 [libafs]
 ? afs_UFSGetDSlot+0xf6/0x4f0 [libafs]
 ? afs_StoreAllSegments+0x725/0xc20 [libafs]
 ? afs_linux_flush+0x486/0x4e0 [libafs]
 ? filp_close+0x32/0x70
 ? __x64_sys_close+0x1e/0x50
 ? do_syscall_64+0x6e/0x200
 ? entry_SYSCALL_64_after_hwframe+0x49/0xbe



On a second try (that also locks up) the following is the stack-trace
(only for the blocked process);  they look almost identical:


gm  S0 30548  30545 0x80004000
Call Trace:
 ? __schedule+0x2be/0x6d0
 schedule+0x39/0xa0
 afs_cv_wait+0x10a/0x300 [libafs]
 ? wake_up_q+0x60/0x60
 rxi_WriteProc+0x21d/0x410 [libafs]
 ? rxfs_storeUfsWrite+0x55/0xb0 [libafs]
 ? afs_GenericStoreProc+0x11a/0x1f0 [libafs]
 ? afs_CacheStoreDCaches+0x1a9/0x5b0 [libafs]
 ? afs_CacheStoreVCache+0x32c/0x680 [libafs]
 ? __filemap_fdatawrite_range+0xca/0x100
 ? afs_osi_Wakeup+0xb/0x60 [libafs]
 ? afs_UFSGetDSlot+0xf6/0x4f0 [libafs]
 ? afs_StoreAllSegments+0x725/0xc20 [libafs]
 ? afs_linux_flush+0x486/0x4e0 [libafs]
 ? filp_close+0x32/0x70
 ? __x64_sys_close+0x1e/0x50
 ? do_syscall_64+0x6e/0x200
 ? entry_SYSCALL_64_after_hwframe+0x49/0xbe



I can reliably trigger the issue almost 50% of the time, by just
doing the following:
* remove a few files (in my case ~15), which triggers the rebuild of
roughly twice as many;
* start the build with a maximum concurrency of 8 processes;
* all the processes execute similar jobs, with similarly sized inputs,
outputs and used CPU time;


Based on `htop` I would say that neither `ninja`, which does the heavy
`stat`-ing, nor `gm` (an ImageMagick alternative) is
multi-threaded.

The build procedure involves the following AFS-related operations
(roughly sketched below):
* check if the output exists, and if so `rm`;
* create an `output.tmp` file;
* move the `output.tmp` to `output`;
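
(A hedged sketch of a single job as seen by AFS;  `generate` stands in for
the actual tool (`gm` in my case), and the file names are placeholders:)

rm -f -- output
generate input > output.tmp
mv -- output.tmp output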

No other processes are actively using AFS (except `mc` and a couple of
`bash` instances which have their `cwd` in an AFS volume).  (The `[nodaemon]`
process is a simple tool that uses `prctl (PR_SET_CHILD_SUBREAPER)` to
catch double-forking processes, and also has its `cwd` in AFS.)

Hope it helps,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] OpenAFS client softlockup on highly concurrential file-system patterns (100% CPU in kernel mode)

2019-11-19 Thread Ciprian Dorin Craciun
On Tue, Nov 19, 2019 at 5:06 PM Mark Vitale  wrote:
> If you had a true soft lockup, there should be some information in the syslog.

I don't think it was a "soft lockup" as per the Linux kernel terminology, as
it would have been detected by the kernel.  (But it still took all my
cores to 100% in kernel space, as mentioned.)


> If you don't see anything there, you could try this while the hang is 
> occurring:

There wasn't anything in either `journald` (i.e. the syslog
replacement) or in `dmesg`.  (And the system was freshly rebooted
after each occurrence.)


> # echo t > /proc/sysrq-trigger

I'll try to re-trigger that issue later today, and report back the findings.

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] OpenAFS client softlockup on highly concurrential file-system patterns (100% CPU in kernel mode)

2019-11-19 Thread Ciprian Dorin Craciun
A few days ago I encountered a very strange OpenAFS client issue that
basically exhibits itself in two ways:

* either the processes accessing the file-system get "stuck" reading (or
perhaps opening) the files;  (although if one waits "long" enough, sometimes
those processes will finally complete their job;)  (in this case the CPU
doesn't go to 100%;)

* or, if one tries to `SIGTERM` the stuck processes, the CPU goes to 100%
(on multiple cores) in kernel mode;  (again, sometimes if one waits long
enough, the system settles;)


The usage pattern is as follows:

* it is a typical "build" scenario, where a `make`-like tool (in this case
`ninja`) heavily stats all files it knows about to find changed or missing
ones;  (in my case there are about 90k files, all hosted on AFS;  moreover
I suspect `ninja` tries to stat these on multiple threads;)

* there are a few processes that do CPU-bound tasks, reading a file (from
AFS) and writing the output to another one (also on AFS);  (the concurrency
level doesn't seem to change much, from 128 processes in parallel to 4;)


I was able to replicate this issue each time I tried to run the build
and then send `SIGTERM`;  after letting the whole build process run
for a night, it eventually completed.




My setup is as follows:

* OpenSUSE Tumbleweed, kernel 5.3.9-1-default, client package
`openafs-client` and `openafs-kmp-default` at `1.8.5_k5.3.9_1-1.3` as
provided by OpenSUSE;

* `afsd` parameters (neither a memory cache (on `tmpfs`) nor a disk cache
seems to help;  nor does varying the daemons from 4 to 1;  encryption is off):


-verbose -blocks 7864320 -chunksize 17 -files 524288 -files_per_subdir 128
-dcache 524288 -stat 524288 -volumes 128 -splitcache 90/10 -afsdb
-dynroot-sparse -fakestat-all -inumcalc md5 -backuptree -daemons 1
-rxmaxfrags 8 -rxmaxmtu 1500 -rxpck 4096 -nosettime

-verbose -memcache -blocks 1048576 -chunksize 17 -stat 524288 -volumes 128
-splitcache 90/10 -afsdb -dynroot-sparse -fakestat-all -inumcalc md5
-backuptree -daemons 1 -rxmaxfrags 8 -rxmaxmtu 1500 -rxpck 4096 -nosettime


* the server is also on OpenSUSE Leap 15.0, with `openafs-server` package at
`1.8.0-lp150.2.2.1` as provided by OpenSUSE;

* I suspect that perhaps the issue is due to the latest kernel version,
because I ran similar patterns a few weeks ago on an older kernel (but
still from the `5.x` family), but I can't say for sure;




I also tried the following:

* `fs flushall` seems to block just like the processes accessing the
file-system;

* the only way to "kill" the stuck processes is to disconnect the network,
and let them timeout;


Any pointers on how to diagnose this?

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] About `dafileserver` vs `fileserver` differences (for small cells)

2019-03-10 Thread Ciprian Dorin Craciun
On Mon, Mar 11, 2019 at 12:35 AM Benjamin Kaduk  wrote:
> > Thus I think that when one would modify the code, in large part the
> > code is common, and where it isn't at least the "switch" is visible in
> > there.  Therefore I'm confident that the `fileserver` is still a
> > viable solution.  :)
>
> I won't really dispute that it is viable at present, but it's pretty clear
> to me that it's no longer a *recommended* solution, and I don't really
> understand your attachment to it.  Is this just because you continue to
> investigate running a simple fileserver without the bosserver and
> demand-attach has more moving parts in that respect?


Exactly.  I want to simplify the OpenAFS deployment as much as
possible.  (Especially since the simpler it is, the better the chance
I actually understand what happens with my data.)


I see OpenAFS as a viable solution for a WAN-enabled NAS, that one
could quickly deploy (the "server" part) in a VM (or even a
container), and just use it.  (I'm really amazed that to this day no other
WAN-enabled NAS solution exists, especially one that allows
user-defined ACL's, and one that works both on Linux and Windows...)

However as it stands today OpenAFS is geared towards large and static
deployments, and less for "experimental" ones.  I would really love if
I managed to "put together" a very lightweight VM that has just the
bare minimum services and moving parts.  (And this is really
achievable once one understands the "underlying" workings of managing a file
server.)


Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] About `dafileserver` vs `fileserver` differences (for small cells)

2019-03-10 Thread Ciprian Dorin Craciun
On Mon, Mar 11, 2019 at 12:06 AM Benjamin Kaduk  wrote:
> To be clear, they do share a great bit of code (dafs was not "from
> scratch"), but there are many places that do get differential treatment in
> the source -- look for AFS_DEMAND_ATTACH_FS preprocessor conditionals.


Based on what I see:

  https://github.com/openafs/openafs/search?q=AFS_DEMAND_ATTACH_FS
  
https://github.com/openafs/openafs/blob/c1d39153da00d5525b2f7874b2d214a7f1b1bb86/src/viced/Makefile.in#L15
  
https://github.com/openafs/openafs/blob/c1d39153da00d5525b2f7874b2d214a7f1b1bb86/src/dviced/Makefile.in#L15

I would assume that most of the code is common (in terms of files),
and at compile time the sources are re-built with different defines.
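
(One can get a rough idea of how widespread the conditional is from a source
checkout, for example:)

git grep -c AFS_DEMAND_ATTACH_FS -- src/viced src/vol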

Thus I think that when one would modify the code, in large part the
code is common, and where it isn't at least the "switch" is visible in
there.  Therefore I'm confident that the `fileserver` is still a
viable solution.  :)

Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Re: Starting an server (both DB and FS) without `BOS` (e.g. on Linux with systemd)

2019-03-10 Thread Ciprian Dorin Craciun
On Sat, Mar 9, 2019 at 11:16 PM Jeffrey Altman  wrote:
> The BOS Overseer Service plays a number of roles:


Just wanted to stress that `bos` is wonderful in a distributed
deployment, and I'm quite surprised that to this day we don't have
other "general purpose" alternatives.

However as stated in the previous email I'm using OpenAFS in a home /
small office environment, where I'll never have more than one server.
Moreover the deployment will in the end be done in a dedicated VM.
Thus the need of `bos` seems to be superfluous.



> 2. The bosserver is responsible for managing the content of many
>configuration files including BosConfig, UserList, and
>the server version of the CellServDB file.  The KeyFile can
>also be updated via bosserver.  The files other than BosConfig
>are shared with the AFS services.


These files are configured only once, and from what I gather (and
have experimented) they can easily be created by hand, without the `bos`
toolchain.  (Perhaps only the `KeyFile` requires `bos` commands, but
even that does not require the `bos` daemon to be running.)



>c. fs - a bnode which defines the process group for [...]
>
>d. dafs - a bnode which defines the process group for the
>   demand attach fileserver.  The bosserver has special knowledge
>   related to process restart in case of failure and integration
>   with the "bos salvage" command.
>
> 3. The bosserver is used to request manual salvages of individual
>volumes or whole partitions.  When the "fs" bnode is in use,
>the bnode will be stopped and started while the salvage takes
>place.  With the "dafs" bnode, single volume salvages do not
>require the "dafs" bnode to be halted but full partition
>salvages do.
>
> [...]
>
> > Does the `fileserver` / `dafileserver` actually start the salvage
> > process, or do they communicate this to the `bos` to restart only that
> > service?
>
> Most but not all of these functions could be performed with other tools.
>  Managing the special inter-dependencies of the "fs" and "dafs" bnode
> processes and salvaging are the two exceptions.


And this is where things get "opaque", and the documentation doesn't
give many internal details.

When you say <>, by "failure" do you mean "the `fileserver` process just dies",
or that the `fileserver` process somehow "signals" this to the `bos`
server?

Because, from what I gather from what you say, a simplified file server
startup might look like the following (a rough sketch follows):
* run `salvager` / `dasalvager` and wait for it to terminate;
* run `volserver` / `davolserver` and, in parallel,
* run `fileserver` / `dafileserver`;  and,
* if either the volume or the file server fails, stop them both and restart
from the first step;
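
(In shell terms, purely as a hedged and untested sketch -- the paths are
from my setup, and the `da` variants are assumed to live next to the
others:)

#!/bin/bash
while true ; do
    /usr/lib/openafs/dasalvager -orphans attach -salvagedirs
    /usr/lib/openafs/davolserver -syslog &
    /usr/lib/openafs/dafileserver -syslog -sync onclose &
    wait -n                           # returns as soon as one of the two exits
    kill %1 %2 2> /dev/null ; wait    # stop the other one, then restart from the salvage step
done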

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] Re: Starting an server (both DB and FS) without `BOS` (e.g. on Linux with systemd)

2019-03-09 Thread Ciprian Dorin Craciun
[I'm adding to the previous question also the issue of salvaging.  I'm
quoting what I've asked on a previous thread.]


BTW, on the topic of volume salvaging, when I define my DAFS / FS node
I start a `salvager` node (for FS), or `dasalvager` and
`salvageserver` (for DAFS).  However, looking at the running processes, the
`salvager` and `dasalvager` don't seem to be running after the initial
startup.  Thus I wonder how the salvage process actually happens?

Does the `fileserver` / `dafileserver` actually start the salvage
process, or do they communicate this to the `bos` to restart only that
service?

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] About `dafileserver` vs `fileserver` differences (for small cells)

2019-03-09 Thread Ciprian Dorin Craciun
On Sat, Mar 9, 2019 at 11:43 AM Harald Barth  wrote:
> > However is it still "safe" and "advised" (outside of these
> > disadvantages) to run the old `fileserver` component?
>
> I would recommend everyone to migrate to "da" and not recommend to
> start with anything old. For obvious reasons, all the big
> installations will migrate to "da" and you don't want to run another
> codebase, don't you?


Thanks Harald for the feedback.

This is exactly what I wanted to find out, namely if the `fileserver`
and `dafileserver` have different code bases.  (And you've confirmed
my hunch that the DAFS codebase is the currently maintained one.)

Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] About `dafileserver` vs `fileserver` differences (for small cells)

2019-03-08 Thread Ciprian Dorin Craciun
On Sat, Mar 9, 2019 at 4:10 AM Mark Vitale  wrote:
> DAFS main benefit is the reduced impact of restarting a fileserver, especially
> fileserver with thousands or even millions of volumes.  DAFS fileservers
> are able to restart more quickly, are able to avoid restarts formerly 
> required for
> volume salvages, and are able to reduce the negative effects of restarts on 
> clients.
> Here are some details about how these benefits are achieved:


Thanks Mark for explaining the advantages of DAFS, especially number
(4) (i.e. saving of the client "states").

However is it still "safe" and "advised" (outside of these
disadvantages) to run the old `fileserver` component?

(More specifically, from a source code point of view, outside of the
demand-attach, are there any other performance / stability
improvements in DAFS as compared with FS?)


BTW, on the topic of volume salvaging, when I define my DAFS / FS node
I start a `salvager` node (for FS), or `dasalvager` and
`salvageserver` (for DAFS).  However, looking at the running processes, the
`salvager` and `dasalvager` don't seem to be running after the initial
startup.  Thus I wonder how the salvage process actually happens?

Does the `fileserver` / `dafileserver` actually start the salvage
process, or do they communicate this to the `bos` to restart only that
service?

(My main reason for asking is in anticipation of my other email,
which tries to identify whether I can safely run the fileserver
processes directly from `systemd`, outside the control of `bos`.)

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Questions regarding `afsd` caching arguments (`-dcache` and `-files`)

2019-03-08 Thread Ciprian Dorin Craciun
On Fri, Mar 8, 2019 at 11:39 PM Ciprian Dorin Craciun
 wrote:
> On Fri, Mar 8, 2019 at 11:11 PM Jeffrey Altman  wrote:
> > The performance issues could be anywhere and everywhere between the
> > application being used for testing and the disk backing the vice partition.


OK, so first of all I want to thank Jeffrey for the support via IRC,
as we've solved the issue.


Basically it boils down to:

* lower the number of threads from the `fileserver` to a proper value
based on available CPU's / cores;  (in my case `-p 4` or `-p 8`;)

* properly configure jumbo frames on the network cards `ip link set
dev eth0 mtu 9000`;  (this configuration has to be made in the
"proper" place else it will be lost after restart;)
* (after changing MTU restart both server and clients;)

* disable encryption with `fs setcrypt -crypt off`;  (based on what I
understood it's not too strong anyway, and given that I'll use it
mostly on LAN it's not an issue;  moreover, over WAN I don't need to
saturate a GigaBit network;)
* (after changing it, re-authenticate, i.e. `unlog && klog`;)


In order to check the correct configuration one has to:

* `cmdebug -server 192.168.0.2 -addrs` (on the client) to see if the
MTU is correctly picked up;  (else restart the cache manager;)

* `rxdebug -server 192.168.0.1 -peer -long` (on the server) to see if
the `ifMTU / natMTU / maxMTU` for the client connection have proper
values;  (in my case they were `8524 / 7108 / 7108`;)

* use `top -H` and check whether the kernel thread `afs_rxlistener` (on the
client) or any of the `fileserver` threads (on the server) are
maxed out (i.e. > ~90%);  if so, that is the bottleneck (after
encryption is disabled and jumbo frames are enabled);


A note about the benchmark:  in order to saturate the link I've tested
only with the large files (i.e. ~20 MiB each), otherwise I would end up
"thrashing" the disk, and that would become the bottleneck.



BTW, I've taken the liberty of copy-pasting the log from the IRC channel
(I've kept only the relevant lines, and also grouped and reordered some of
them), because it is very insightful for OpenAFS performance
tuning.

So once more, thanks Jeffrey for the help,
Ciprian.




23:43 < auristor> first question, when you are writing to the
fileserver, does "top -H" show a fileserver thread at or near 100%
cpu?
23:45 < auristor> -H will break them out by process thread instead
providing one value for the fileserver as a whole

23:46 < auristor> I ask because one thread is the RX listener thread
and that thread is the data pump.  If that thread reaches 100% then
you are out of capacity to receive and transmit packets


00:00 < auristor> Since you have a single client and 8 processor
threads on the fileserver, I would recommend lowering the -p
configuration value to reduce lock contention.

23:55 < auristor> there are two major bottlenecks in the OpenAFS.
First, the rx listener thread which does all of the work associated
with packet allocation, population, transmission, restransmission, and
freeing on the sender and packet allocation, population, application
queuing, acknowledging, and freeing on the receiver.

23:56 < auristor> In OpenAFS this process is not as efficient as it
could be and its architecture limits it to using a single processor
thread which means that its ability to scale correlates to the
processor clock speed


23:58 < auristor> Second, there are many global locks in play.  On the
fileserver, there is one global lock for each fileserver subsystem
required to process an RPC.  For directories there are 8 global locks
that must be acquired and 7 for non-directories.
23:59 < auristor> These global locks in the fileserver result in
serialization of calls received in parallel.

00:00 < ciprian_craciun> (Even if they are for different directories / files?)
00:00 < ciprian_craciun> (I.e. is there some sort of actual "global
lock" that basically serializes all requests from all clients?)

00:01 < auristor> The global locks I mentioned do serialize the
startup and shutdown of calls even when the calls touch different
objects.


00:02 < auristor> Note that an afs family fileserver is really an
object store.  unlike a nfs or cifs fileserver, an afs fileserver does
not perform path evaluation.   path evaluation to object id is
performed by the cache managers.

00:04 < auristor> The Linux cache manager also has a single global
lock that protects all other locks and data structures.  This lock is
dropped frequently to permit parallel processing but it does severely
limit the amount of a parallel execution


00:09 < ciprian_craciun> Trying now with `-p 4` seems to yield ~35
MiB/s of `cat` throughput.

00:11 < auristor> that would imply that the fileserver is not
releasing worker threads from the call channel fast enough 

Re: [OpenAFS] Questions regarding `afsd` caching arguments (`-dcache` and `-files`)

2019-03-08 Thread Ciprian Dorin Craciun
[Replying also to the list, just to mention the benchmarking technique.]


On Fri, Mar 8, 2019 at 11:11 PM Jeffrey Altman  wrote:
> The performance issues could be anywhere and everywhere between the
> application being used for testing and the disk backing the vice partition.


The issue is not the backing disk, as using the same benchmarking
technique (see below) I get around ~270 MiB/s from the actual
`/vicepX` files.

The technique is simple (i.e. list all files, randomize them, and then
`cat` them 128 at a time to `/dev/null`):

  find . -type f | sort -R | xargs -P 64 -n 128 -- cat > /dev/null

Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] Re: Regarding OpenAFS performance (on a small / home single node deployment)

2019-03-08 Thread Ciprian Dorin Craciun
Small correction to the previous email:  the `-chunksize` for the
server `afsd` was `20` (i.e. 1MiB) at the time of the experiment.  And
the `-dcache` on the LAN client was `65536`.

(The values in my initial email were based on some notes I had while I
was trying various parameters.)

Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] Regarding OpenAFS performance (on a small / home single node deployment)

2019-03-08 Thread Ciprian Dorin Craciun
[I've changed the subject to reflect the new topic.]


On Fri, Mar 8, 2019 at 9:58 PM Mark Vitale  wrote:
> >>> (I'm struggling to get AFS to go over the 50MB/s, i.e. half a GigaBit,
> >>> bandwidth...  My target is to saturate a full GigaBit link...)
> >
> > Perhaps you know:  what is the maximum bandwidth that one has achieved
> > with OpenAFS?  (Not a "record" but in the sense "usually in enterprise
> > deployments we see zzz MB/s".)
>
> I think this may be a question like "how long is a piece of string?".
> The answer is "it depends".  Could you be more specific about your use cases,
> and what you are seeing (or need to see) in terms of OpenAFS performance?


So my use-case is pretty simple:
* small (home / office) single node deployment on Linux (OpenSUSE Leap
15.0) running OpenAFS 1.8;
* three `/vicepX` partitions on the same Ext4 over RAID5, backed by
rotational HDD's, capable of ~300 MiB/s sequential I/O (in total per
RAID);  (these are migrated from three old disks, and I mean to merge
them into a single one;)
* 1x GigaBit network, 32 GiB RAM, Core i7, currently not used for anything else;
* I have around 600 GiB of personal files, in ~20 volumes;  some
(around 50%) of these files are largish ~20 MiB files (in one volume),
while the rest are "usual" smallish ~128 KiB to mediumish ~4 MiB
(these last figures are an assumption) (all in 2 or 3 volumes);


My intention is to saturate the GigaBit network card from one client
(in the same LAN) (both with 9k Jumbo frames support), while accessing
these files read-only.  (The client has a 6 GiB cache over TMPFS,
with 8 GiB RAM and 64 GiB swap.  I know this last one is not
"advisable", but the cache is not swapped, thus it is not impacting
the performance.)


I've tried to read all the available files both sequentially and in
parallel (from 8 to 64 processes), either sorted by path or randomly,
and I never get over 40-50 MiB/s of network traffic.  (I've done this
test both from the server, thus over `lo`, and from the networked
client, with almost the same performance.)


The following is my current configuration:

* for the `fileserver`:
/usr/lib/openafs/fileserver -syslog -sync always -p 128 -b 524288 -l
524288 -s 1048576 -vc 4096 -cb 1048576 -vhandle-max-cachesize 32768
-jumbo -udpsize 67108864 -sendsize 67108864 -rxmaxmtu 8192 -rxpck 4096
-busyat 65536

* for the `volserver`:
/usr/lib/openafs/volserver -syslog -sync always -p 16 -jumbo -udpsize 67108864

* for the server `afsd`:
-memcache -blocks 4194304 -chunksize 17 -stat 524288 -volumes 4096
-splitcache 25/75 -afsdb -dynroot-sparse -fakestat-all -inumcalc md5
-backuptree -daemons 8 -rxmaxfrags 8 -rxmaxmtu 8192 -rxpck 4096
-nosettime

* for the LAN client `afsd`:
-blocks 7864320 -afsdb -chunksize 20 -files 262144 -files_per_subdir
1024 -dcache 128 -splitcache 25/75 -volumes 256 -stat 262144
-dynroot-sparse -fakestat-all -backuptree -daemons 8 -rxmaxfrags 8
-rxmaxmtu 8192 -rxpck 4096 -nosettime



> > (I think my issue is with the file-server not the cache-manager...)
>
> It is easy to get bottlenecks on both.  One way to help characterize this
> is to use some of the OpenAFS test programs and see how they perform against 
> your fileservers:
> - afscp  (tests/afscp)
> - afsio  (src/venus/afsio)
>
> There is also the test server/client pair for checking raw rx network 
> throughput:
> - rxperf  (src/tools/rxperf)


I'll try to look at them.  (None of them seem to be part of the
OpenSUSE RPM, thus I'll have to build them.)

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Questions regarding `afsd` caching arguments (`-dcache` and `-files`)

2019-03-08 Thread Ciprian Dorin Craciun
On Fri, Mar 8, 2019 at 9:30 PM Mark Vitale  wrote:
> But now on more careful reading, I see this only applies when -dcache has not 
> been explicitly specified.
> (Which, to be fair, is the normal case).

Thanks for the insight.


> > (I'm struggling to get AFS to go over the 50MB/s, i.e. half a GigaBit,
> > bandwidth...  My target is to saturate a full GigaBit link...)
>
> Here are some helpful commands for examining the results of your 
> configuration experiments:
>
> cmdebug  -cache
> fs getcacheparms -excessive

Perhaps you know:  what is the maximum bandwidth that one has achieved
with OpenAFS?  (Not a "record" but in the sense "usually in enterprise
deployments we see zzz MB/s".)

(I think my issue is with the file-server not the cache-manager...)

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Questions regarding `afsd` caching arguments (`-dcache` and `-files`)

2019-03-08 Thread Ciprian Dorin Craciun
On Fri, Mar 8, 2019 at 9:11 PM Mark Vitale  wrote:
> The -dcache option for a disk-based cache does set the number of dcaches in 
> memory.
> It has a minimum value of 2000 and a maximum of 100000.


Is the 100K maximum a hard limit imposed in code, or a
"best-practice"?  (I've looked in a few places and it seems that it is
not a hard limit.)



> In addition, many of the options interact with each other.
> The best guide for how all this _really_ works is the source code - however, 
> the
> source itself is quite confusing at times, so I feel your pain.


Currently I go with a trial-and-error approach.  :)

(I'm struggling to get AFS to go over the 50MB/s, i.e. half a GigaBit,
bandwidth...  My target is to saturate a full GigaBit link...)

Thanks Mark for the info,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] Re: Questions regarding `afsd` caching arguments (`-dcache` and `-files`)

2019-03-08 Thread Ciprian Dorin Craciun
On Fri, Mar 8, 2019 at 6:19 PM Ciprian Dorin Craciun
 wrote:
> (B)  Using `-files` and `-chunksize` so that their product is larger
> than `-blocks` means that the cache can hold up to as many `-files`
> actual AFS files, but their total size can't be larger than `-blocks`?
>  (I.e. if one has a cell with lots of small files, it is OK to
> configure a largish `-chunksize` and `-files` because they will be
> cached up to `-blocks`.)


I've found the http://docs.openafs.org/Reference/5/afs_cache.html
documentation that states:

Vn files expand and contract to accommodate the size of the AFS
directory listing or file they temporarily house. As mentioned, by
default each Vn file holds up to 64 KB (65,536 bytes) of a cached AFS
element. AFS elements larger than 64 KB are divided among multiple Vn
files. If an element is smaller than 64 KB, the Vn file expands only
to the required size. A Vn file accommodates only a single element, so
if there many small cached elements, it is possible to exhaust the
available Vn files without reaching the maximum cache size.


This would imply that:

* there is no 1-to-1 relation between "chunks" and "Vn files", one
chunk could be stored in multiple "Vn files";  (however one "Vn file"
never stores multiple chunks in case the chunk size is below 64K?)

* by explicitly setting `-files` one can set a limit to the maximum
number of actual AFS files to cache;  (i.e. if all files are smaller
than 64K and the `-blocks` is larger than `-files * 64K`, then no more
than `-files` AFS files would be stored;)
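
(To take a concrete, hypothetical example, and assuming `-blocks` is
expressed in 1 KiB units:  with `-files 524288` and only sub-64 KiB files,
at most 524288 files would be cached, occupying at most
524288 * 64 KiB = 32 GiB, which is below `-blocks 67108864`, i.e. 64 GiB;
thus `-files`, and not `-blocks`, would be the effective limit.)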

Am I correct?

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] Questions regarding `afsd` caching arguments (`-dcache` and `-files`)

2019-03-08 Thread Ciprian Dorin Craciun
I have two small questions about the cache management of `afsd`.  (The
documentation isn't very explicit.)

(In both cases I'm speaking about disk-based cache.)

(A) Using `-dcache 128` with a `-chunksize 10` (i.e. 1MiB) for a
disk-based cache, would actually allocate 128 MiB from kernel memory
(i.e. the product of the two)?  It is unclear from the documentation.
(Although I would infer yes, based on the description of memory based
cache.)

(B)  Using `-files` and `-chunksize` so that their product is larger
than `-blocks` means that the cache can hold up to as many `-files`
actual AFS files, but their total size can't be larger than `-blocks`?
 (I.e. if one has a cell with lots of small files, it is OK to
configure a largish `-chunksize` and `-files` because they will be
cached up to `-blocks`.)

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] Documentation for `afsd` argument `files_per_subdir` is wrong (out of sync with the implementation)

2019-03-08 Thread Ciprian Dorin Craciun
Hello all!

I'm using OpenAFS on OpenSUSE, version 1.8.x (in fact 1.8.0 and 1.8.2
on two nodes), and although the documentation for the `afsd` daemon
states for `files_per_subdir` that:

files_per_subdir -- Limits the number of cache files in each
subdirectory of the cache directory. The value of the option should be
the base-two log of the number of cache files per cache subdirectory
(so 10 for 1024 files, 14 for 16384 files, and so forth).


It is in fact used without the exponential transformation.  I.e.
setting `-files_per_subdir 10` will actually result in exactly 10
files per directory, meanwhile `-files_per_subdir 1024` would
correctly result in 1K files.

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] Starting an server (both DB and FS) without `BOS` (e.g. on Linux with systemd)

2019-03-08 Thread Ciprian Dorin Craciun
I understand that for large deployments `bos` is useful because it
allows administering the AFS services remotely without resorting to
SSH.

However for small deployments (like for example a single server) could
it be removed completely, letting the services be started without
it?  (Like for example as plain systemd services.)  (My assumption,
based on the snippet in the documentation, seems to be "yes".)

And if it is possible, then the various services should be started
just as they are listed inside `/etc/openafs/BosConfig`?

Are there other environment variables (or similar "configuration")
that must be configured?

Also is there a particular ordering or (hard) dependency between the
services?  (Or they can be started in parallel.)
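
(To make the question concrete, what I have in mind is something along these
lines -- a rough, untested sketch, where the unit layout, paths and flags are
my own guesses;  the volserver and the DB servers would get similar units:)

[Unit]
Description=OpenAFS demand-attach file server (without bosserver)
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
ExecStartPre=/usr/lib/openafs/dasalvager -orphans attach
ExecStart=/usr/lib/openafs/dafileserver -syslog -sync onclose
Restart=on-failure

[Install]
WantedBy=multi-user.target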

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Administrators with a slash

2019-03-06 Thread Ciprian Dorin Craciun
On Wed, Mar 6, 2019 at 7:16 AM Benjamin Kaduk  wrote:
> To a large extent, getting Kerberos set up is pretty much drop it in and
> switch it on, but there's a lot of flexibility about principal names,
> especially for administrative operations.  Getting it integrated with
> OpenAFS is mostly about having the right 'pts createuser's happen to
> register users, and creating the afs/cellname.fqdn principal to go in the
> rxkad.keytab and/or KeyFileExt -- at this point, AFS is just a regular
> kerberized service and doesn't require special treatment on the Kerberos
> side for the service principals.

Indeed this was my experience also, the Kerberos deployment was quite
trivial (once I've done it);  however it seemed (and still seems) that
I've "lost" something along the way because I lack the proper know-how
and expertise with Kerberos.


> I don't know of specific documentation for this, no.
> I think that many sites running Kerberos+AFS have some homegrown database
> management system that handles both and keeps them synchronized.

And this is unfortunate, especially since deploying OpenAFS "seems" a
daunting task for the small cell operator, or one that just wants to
"play" with the technology.  I say "seems" because deploying an
OpenAFS server can be done quite quickly with a couple of copy-pastes.

Perhaps (if I have time) I will prepare a small hands-on tutorial
on deploying OpenAFS on a Linux server.  (I know that there already
exists the "Quick Start UNIX Guide", however it is far from
"quick"...)  :)


> > > Of course, rxgk will let us use fancier names for things, so we'll have to
> > > get used to a whole new world order when that finishes landing...
> >
> > Could you elaborate more on this?
>
> The short form is that we'll be able to use (encoded) GSS principal
> names in the UserList file.  It looks like the details haven't made it into
> the UserList.pod documentation yet (unsurprising, since the code to
> authenticate as them isn't in place yet), but the format includes a base64
> encoded version of the GSS exported name.

Basically it means one could use something alternative to Kerberos for
authentication?  (Something that is GSS-compliant?)

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] About `dafileserver` vs `fileserver` differences (for small cells)

2019-03-05 Thread Ciprian Dorin Craciun
Hello all!

I understand from the documentation that the main difference between
`dafileserver` and `fileserver` is the "on-demand-attach" of volumes.
However I wonder if there are other advantages / differences between
the two, especially with regard to:
* performance -- is `dafileserver` more performant than `fileserver`?
* reliability -- because (I assume) many cells have migrated to
`dafileserver` the "old" `fileserver` gets less used, thus less tested
in real deployments;
* maintenance -- is the `fileserver` still actively developed and maintained?

I ask this also from the perspective of a small cell operator (for
personal purposes), where attach-on-demand is not an issue, and in
fact I think I would prefer all my volumes to be attached as early as
possible.

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Administrators with a slash

2019-03-04 Thread Ciprian Dorin Craciun
On Mon, Mar 4, 2019 at 3:35 AM Benjamin Kaduk  wrote:
> > Perhaps the OpenAFS Quick Start UNIX chapters touching the Kerberos
> > integration (http://docs.openafs.org/QuickStartUnix/HDRWQ53.html)
> > should clearly state this issue with principals containing dots and
> > using at the same time instances (i.e. slashes)...
>
> Patches welcome!  (XML sources browseable at
> http://git.openafs.org/?p=openafs.git;a=tree;f=doc/xml/QuickStartUnix;h=9e4fbd3f23b81696d98b1fcb68519364fe365d3f;hb=HEAD
> ; preferred submissions are as gerrit changes (docs on that at
> https://wiki.openafs.org/devel/GitDevelopers/) but mailed patches and
> similar are fine.


I'll try to provide a patch to the documentation.

(I am aware that OpenAFS is an open-source, volunteer-based project,
thus I was not "demanding" the update.)  :)

However on the same subject, is there a document describing how one
should configure Kerberos (from MIT) to work flawlessly with OpenAFS?
(I've tried searching for such a document, but found none, and
moreover even "plain" Kerberos deployment tutorials are very
scarce...)



> > Moreover it's still unclear to me if in `pts createuser` I should use
> > the `username.admin` or `username/admin` variants?  (It lets me do
> > both, but I think only the former actually works.)  Could someone tell
> > me the "correct" syntax for OpenAFS usernames?
>
> You should pts createuser the username.admin variants.


I'll try to include this in that patch also.



> Of course, rxgk will let us use fancier names for things, so we'll have to
> get used to a whole new world order when that finishes landing...

Could you elaborate more on this?

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Offline migration of AFS partition files (i.e. the contents of `/vicepX`)

2019-03-04 Thread Ciprian Dorin Craciun
On Wed, Dec 5, 2018 at 3:29 PM Harald Barth  wrote:
> > Can I safely `rsync` the files from the old partition to the new one?
>
> For Linux (The "new" server partition layout):
>
> If the file tree really is exactly copied (including all permissions
> and chmod-bits) then you have everything you need. This was not true
> for the old file system layout for example in SunOS UFS.


Just for reference (perhaps it will help others in the future) I've
used something along these lines:
(please note that it would erase everything from the destination
folder;  thus it must be used only for migration purposes;)
(also note that it is safe to run this multiple times, perhaps to
resume a failed synchronization, or perhaps to roll back and retry the
salvage operation with various options, etc.)

rsync \
--recursive --one-file-system \
--delete \
--ignore-times \
--checksum --checksum-choice md5 \
--links --safe-links \
--hard-links \
--perms --times \
--owner --group --numeric-ids \
--whole-file --no-compress \
--preallocate \
--verbose --progress --itemize-changes \
-- \
/mnt/old-disk/vicepX/ \
/mnt/new-disk/vicepX/ \
#



> I would copy to a not-yet used partition, mount it then as /vicepY
> (where Y is a new unused letter) and then as the first thing when
> startting the server run a salvage with the options
> -orphans attach -salvagedirs -force


I've run the following before starting any OpenAFS services (i.e.
without `bos` running.)

/usr/lib/openafs/dasalvager \
-partition /vicepX \
-orphans attach \
-salvagedirs \
-force \
#

Apparently, apart from a few lines like:
Vnode 18800: version < inode version; fixed (old status)
no other "strange" lines appear in the `SalvageLog` file.



And finally I've created a completely new OpenAFS (1.8.0) deployment
from scratch, initializing the protection database with the same users
and groups as in the previous deployment, making sure I've kept the
same UID's.  (Hopefully this is enough to keep ACL's and ownership
from the old volumes intact.)
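
(For reference, keeping the same IDs boils down to something along
these lines for each user and group;  the names and IDs below are only
placeholders:)

pts createuser -name someuser -id 1001 -localauth
pts creategroup -name somegroup -owner someuser -id -210 -localauth
pts adduser -user someuser -group somegroup -localauth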

Afterwards I've started the OpenAFS `bos` service, and run the following:

vos syncvldb \
-server 172.xx.xx.xx \
-partition X \
-verbose \
-localauth \
#

vos syncserv \
-server 172.xx.xx.xx \
-partition X \
-verbose \
-localauth \
#



Hopefully this was enough to "migrate" my old AFS deployment to the new server.
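
(For anyone double-checking a similar migration, listing the volumes
on the partition afterwards should confirm that everything is attached
and visible again;  something along the lines of:)

vos listvol \
-server 172.xx.xx.xx \
-partition X \
-localauth \
#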

Thanks all for the help,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Administrators with a slash

2019-03-03 Thread Ciprian Dorin Craciun
On Tue, Jan 10, 2012 at 3:20 PM Bobb Crosbie
 wrote:
> I now recall reading about the slash -> dot remapping in the docs, but I had 
> forgotten about it.
>
> I think perhaps the tools might have done a better job of indicating that 
> there was a problem, and what it might be ?
>
> If slashes are remapped to dots, then perhaps ``pts createuser'' should issue 
> a warning message if you try to create a user with a slash ?
> As it stands (1.4.12 & 1.6.0), pts happily creates the user with the slash 
> and also includes it in the list of entries.


Sorry for reviving such an old thread, but I've just wasted about 4
hours randomly trying things out in order to get OpenAFS (1.8.0) with
Kerberos to actually work...  And fortunately (?!) I've managed to
find the solution through this random process;  thus I've searched the
mailing lists to see if anyone had the same issue...

Perhaps the OpenAFS Quick Start UNIX chapters touching the Kerberos
integration (http://docs.openafs.org/QuickStartUnix/HDRWQ53.html)
should clearly state this issue with principals containing dots and
using at the same time instances (i.e. slashes)...

Moreover as Bobb observed almost 10 years ago, none of the OpenAFS
tools (not even in 1.8.0) give any hint about what is happening, not
in the logs, nor on stderr...

Moreover it's still unclear to me whether in `pts createuser` I should
use the `username.admin` or the `username/admin` variant.  (It lets me
do both, but I think only the former actually works.)  Could someone tell
me the "correct" syntax for OpenAFS usernames?

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Offline migration of AFS partition files (i.e. the contents of `/vicepX`)

2019-03-02 Thread Ciprian Dorin Craciun
On Sun, Mar 3, 2019 at 12:29 AM Jeffrey Altman
 wrote:
> On 3/2/2019 3:42 PM, Ciprian Dorin Craciun wrote:
> > (A)  When you state `exactly copied` you mean only the following
> > (based on the `struct stat` members):
> > [...]
>
> The vice partition directory hierarchy is used to create a private
> object store.   The reason that Harald said "exact copy" is because
> OpenAFS leverages the fact that the "dafs" or "fs" bnode services
> execute as "root" to encode information in the inode's metadata that is
> not guaranteed to be a valid state from the perspective of normal file
> tooling.


I understand that OpenAFS "reuses" the inode metadata for its own
purposes, and that one shouldn't touch it outside the OpenAFS tools.

However, is it enough if, while migrating, I make sure to keep
**only** the following file metadata:
* `st_uid` and `st_gid`;
* `st_mode`;
* `st_atim`, `st_mtim` and `st_ctim`;

Can I assume that no other meta-data is required?  (Like for example
Linux file-system ACL's or extended user attributes?)  (I would assume
not, however I wanted to make sure.)

Moreover I am curious whether the timestamps are actually required?
(Especially the access and change timestamps.)



> For many years there was discussion of creating a plug-in interface for
> the vice partition object storage.  This would permit separate formats
> depending on the underlying file system capabilities and use of non-file
> system object stores.


Although this is a little bit off-topic, I am quite happy that OpenAFS
decided to just reuse a "proper" file-system and lay out its own
"objects" on top of it, instead of going with opaque "object stores"...

I understand that from a performance and scalability point of view a
more advanced format would help, however for small deployments, I
think the plain file-system approach provides more reliability and
reassurance that in case something happens one can easily recover
files.  (See below for more about this.)



> OpenAFS stores each AFS3 File ID data stream in a single file in
> the current format.
>
> > I.e. formalizing the last one:  if one would take any file accessible
> > under `/afs` and would compute its SHA1, then by looking into all
> > `/vicepX` partitions belonging to that cell, one would definitively
> > find a matching file with that SHA1.
>
> This is true for the current format.


Continuing my "reliability" idea of plain file-systems, I for example
maintain MD5 checksums for all my AFS stored files (i.e. those in
`/afs/cell`), which means that in case something goes wrong with the
AFS directories or meta-data, I can always just MD5 the actual
`/vicepX` files, and pick my data out of there.
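
(The "pick my data out" part would be something along these lines;
the manifest format and paths are whatever one happens to keep around,
so everything here is only a placeholder sketch:)

# index the vice partition data files by their MD5
find /vicepX/AFSIDat -type f -exec md5sum {} + > /tmp/vicepX.md5
# take the checksum recorded in the manifest for the wanted file,
# then locate the matching object inside the partition
checksum="$( grep 'some/wanted/file$' /backups/afs-manifest.md5 | cut -d ' ' -f 1 )"
grep "^${checksum}" /tmp/vicepX.md5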

In fact, given that I have deployed OpenAFS for personal use, that
most of my "archived" files are on it, and that I don't have too much
time to invest in it, just knowing that I can always easily get my
data out gives me almost blind trust in OpenAFS.

(This, and the lack of WAN and ACL support, is why I don't use Lustre,
Ceph or other "modern" distributed / parallel file-systems.)



> > My curiosity into all this is because I want to prepare some `rsync`
> > and `cpio` snippets that perhaps could help others in a similar
> > endeavor.  Moreover (although I know there are at least two other
> > "official" ways to achieve this) it can serve as an alternative backup
> > mechanism.
>
> The vice partition format should be considered to be private to the
> fileserver processes.  It is not portable and should not be used as a
> backup or transfer mechanism.


I understand this, however I'm thinking more of "disaster recovery"
scenarios, and of those cases when the OpenAFS services are not
capable of running.  (As in my case, where I don't yet have OpenAFS
installed on my "new" server, and my "old" server OS is unusable;
I just have my `/vicepX` partitions...  Moreover I intend to create a
`cpio` archive in `newc` format of my old `/vicepX` partitions and
keep it for a while...  And the fact that `cpio` has limited metadata
support is why I asked which metadata is required.)
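
(Concretely, the archival part I have in mind is roughly the
following, run as root so that ownership is preserved;  the
destination path is a placeholder.  Note that the `newc` format keeps
uid/gid/mode/mtime but not atime/ctime, which is exactly why I'm
asking about the timestamps:)

cd /mnt/old-disk/vicepX
find . -xdev -depth -print0 \
| cpio --null --create --format=newc \
> /mnt/backups/vicepX.cpio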


Thanks Jeffrey for the information,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Offline migration of AFS partition files (i.e. the contents of `/vicepX`)

2019-03-02 Thread Ciprian Dorin Craciun
On Wed, Dec 5, 2018 at 3:29 PM Harald Barth  wrote:
> > Can I safely `rsync` the files from the old partition to the new one?
>
> For Linux (The "new" server partition layout):
>
> If the file tree really is exactly copied (including all permissions
> and chmod-bits) then you have everything you need.



I would like to follow up on this with some additional questions,
which I'll try to keep as succinct as possible.  (Nothing critical,
however I would like to have a little bit more insight into this.)

(A)  When you state `exactly copied` you mean only the following
(based on the `struct stat` members):
* `st_uid` / `st_gid`;
* `st_mode`;  (i.e. permissions;)
* `st_atim`, `st_mtim` and `st_ctim`?  (i.e. timestamps)
* no ACL's, no `xattr` (or `user_xattr`);
* anything else?

(B)  Also (based on what I gathered by "peeking" into the `/vicepX`
partition) there are only plain folders and plain files, without any
symlinks or hard-links.

(C)  Moreover, based on the same observations, I guess that the
metadata (i.e. uid/gid/permissions/timestamps) for the actual folders
inside `/vicepX` doesn't matter much.  (Only the metadata for the
actual files does.)
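
(To make (A) and (C) concrete on my side, I was planning to dump those
fields for both trees and `diff` the listings, roughly as below;  the
field list is only my own guess at what matters, and I've left ctime
out since normal tools cannot copy it anyway:)

( cd /mnt/old-disk/vicepX && find . -printf '%p %U %G %m %T@\n' | sort ) > /tmp/old.meta
( cd /mnt/new-disk/vicepX && find . -printf '%p %U %G %m %T@\n' | sort ) > /tmp/new.meta
diff -u /tmp/old.meta /tmp/new.meta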



(D)  (Not really related to migration)  Am I to assume that some of
the files inside `AFSIDat` are identical in contents to the actual
files on the `/afs` structure?  (Disregarding all meta-data, including
filenames.)  Moreover am I to assume that all the files accessible
from `/afs` are found somewhere inside `AFSIDat` with identical
contents?

I.e. formalizing the last one:  if one would take any file accessible
under `/afs` and would compute its SHA1, then by looking into all
`/vicepX` partitions belonging to that cell, one would definitively
find a matching file with that SHA1.



My curiosity about all this stems from wanting to prepare some `rsync`
and `cpio` snippets that could perhaps help others in a similar
endeavor.  Moreover (although I know there are at least two other
"official" ways to achieve this) it could serve as an alternative
backup mechanism.

BTW, is there a document that outlines the actual layout of the
`/vicepX` structure?  I've searched a bit but found nothing useful.

Thanks for the feedback,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Offline migration of AFS partition files (i.e. the contents of `/vicepX`)

2018-12-06 Thread Ciprian Dorin Craciun
On Wed, Dec 5, 2018 at 3:29 PM Harald Barth  wrote:
> > Can I safely `rsync` the files from the old partition to the new one?
>
> For Linux (The "new" server partition layout):
>
> If the file tree really is exactly copied (including all permissions
> and chmod-bits) then you have everything you need.


Am I safe to assume that on Linux only the "new" partition layout is
used?  (Is there a way to check which layout I am using?)

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] Offline migration of AFS partition files (i.e. the contents of `/vicepX`)

2018-12-04 Thread Ciprian Dorin Craciun
Quick question regarding the following situation:  one of my `/vicepX`
AFS partitions is currently stored on an old disk (with JFS as
file-system) and thus I need to move all my AFS data from that
partition to a fresh one (Ext4);  moreover during this movement
OpenAFS is not running (and I intend to upgrade the server version
also from 1.6.5 to the latest one).

Can I safely `rsync` the files from the old partition to the new one?

Is there another alternative that doesn't require actually starting OpenAFS?

(I know about `voldump`, however it requires me to execute it for each
volume, and thus I might "forget" something.  Moreover it requires
extra storage space for the resulting archive.)

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Multi-homed server and NAT-ed client issues

2013-07-17 Thread Ciprian Dorin Craciun
On Wed, Jul 17, 2013 at 10:23 PM, Harald Barth  wrote:
>
>> services:  `kaserver`, 
>
> Please consider running a KDC (or use the KDC in your AD if you have
> one) instead of kaserver. kaserver is so last century.
>
> Harald.


Yes... The `kaserver` thingy...  :)

The problem is that when I started using OpenAFS (for personal
purposes), the "stable" version was 1.4 (or at least what was labeled
"stable" in my distribution).  And at that time `kaserver` was simple
to install and manage, and I still use it today, mainly due to
laziness.  Moreover I only have a few users, thus migrating to a full
Kerberos stack seems like overkill to me...

On the same topic, are there any serious concerns related to
`kaserver`?  Or is it more about other aspects (like, say,
scalability, integration, the future, etc.)?

Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Multi-homed server and NAT-ed client issues

2013-07-17 Thread Ciprian Dorin Craciun
Problem solved!  Thanks to both posters for pointing me in the
right direction: adding the `-rxbind` option to the following
services:  `kaserver`, `ptserver`, `vlserver`, `fileserver`, and
`volserver`.  This was simply done by editing the `BosConfig` file in
the `/etc/openafs` folder and adding that token to the lines starting
with `parm`.
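
(For reference, the edited `BosConfig` entries end up looking roughly
like the following;  the binary paths are the ones of my packaging,
and the remaining bnodes follow the same pattern:)

bnode simple ptserver 1
parm /usr/lib/openafs/ptserver -rxbind
end
bnode fs fs 1
parm /usr/lib/openafs/fileserver -rxbind
parm /usr/lib/openafs/volserver -rxbind
parm /usr/lib/openafs/salvager
end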

(I must confess I feel quite dumb for not finding this option
myself...  I did add it to the `bosserver` invocation, but that didn't
seem to work;  I should have added it to each individual service.)

However, a note for the documentation maintainers: it seems that the
`-rxbind` option is missing from the manuals of the following services
(at least in the HTML version on http://docs.openafs.org ): `kaserver`
and `volserver`.


As for the words that follow below:  because they were written
while I was reading the replies, I'll leave them there, for those who
will one day have similar issues with multi-homed servers.


On Wed, Jul 17, 2013 at 6:28 PM, Jeffrey Hutzelman  wrote:
> On Wed, 2013-07-17 at 17:43 +0300, Ciprian Dorin Craciun wrote:
>> Hello all!  I've encountered quite a blocking issue in my OpenAFS
>> setup...  I hope someone is able to help me... :)
>>
>>
>> The setup is as follows:
>> * multi-homed server with, say S-IP-1 (i.e. x.x.x.5) and S-IP-2
>> (i.e. x.x.x.7), multiple IP addresses, all from the public range;
>
> Things get much easier if you just use the actual names and addresses,
> instead of making up placeholders.

Indeed it could seem that I've obscured the situation by providing
placeholders for the actual IP's.  (The reason revolves mainly around
the fact that all these emails are public knowledge.)

However the values I've chosen as placeholders have been carefully selected:
* both are addresses for the same interface (configured with `ip
addr add ...`, thus not with alias interfaces like Debian once had);
* both are addresses from the same network (thus are routed identically);
* the second IP (the one OpenAFS should use) is marked as
`secondary` by `ip addr show`;

Below is the output of `ip -4 addr show ethX` (blanking only the
interface name and the network address):

ethX:  mtu 1500 qdisc noqueue state UP
inet x.x.x.5/27 brd x.x.x.31 scope global ethX
inet x.x.x.7/27 brd x.x.x.31 scope global secondary ethX


And the output of `ip -4 route show`:

x.x.x.0/27 dev ethX  proto kernel  scope link  src x.x.x.5
127.0.0.0/8 via 127.0.0.1 dev lo
default via x.x.x.1 dev ethX


The full output of both `ip addr` and `ip route` includes a few
more bridges and interfaces.  However none share the same IP range
with the addresses above, there are no other default routes except the
one above, and moreover the OpenAFS clients aren't on any of the
"extra" networks (i.e. the packets to them should go through the
default route above).


> Frequently, doing that sort of thing
> hides critical information that may point to the source of the problem.

I hope that the details above are sufficient to depict the overall context.


> For example, in this case, Linux's choice of source IP address on an
> outgoing UDP packet sent from an unbound socket (or one bound to
> INADDR_ANY) will depend on the interface it chooses, which will depend
> on the route taken, which depends on the server's actual addresses and
> the network topology, particularly with respect to the client (or in
> this case, to the public address of the NAT the client is behind).

As said, the reply packet leaves the server with the source set to
the first IP (x.x.x.5).  And thus the behaviour is consistent with a
socket bound to `INADDR_ANY`, replying towards a peer that is reached
via the default route.


> You also haven't said what version of OpenAFS you're using, so I'll
> assume it's some relatively recent 1.6.x.

Indeed, my fault.  Being in a hurry to leave for home, I forgot to
mention that I have Linux on both the client and the server, and the
OpenAFS version is 1.6.2.


>> * the second IP, S-IP-2 (i.e. x.x.x.7), is the one listed in
>> `NetInfo` and DNS record (and correctly listed when queried via `vos
>> listaddrs`);
>> * the first IP, S-IP-1 (i.e. x.x.x.5), is listed in
>> `NetRestricted` (and doesn't appear in `vos listaddrs`);
>
> So, the machine the fileserver runs on is multi-homed, but you're only
> interested in actually using one of those interfaces to provide AFS
> service?

Exactly:  the server is multi-homed and I want it to use the
secondary IP address.  (In fact all OpenAFS services run on exactly
the same server.)


> In that case, you use the -rxbind option, which tells the
> servers to bind to a specific address instead of INADDR_ANY.  That
> option needs 

[OpenAFS] Multi-homed server and NAT-ed client issues

2013-07-17 Thread Ciprian Dorin Craciun
Hello all!  I've encountered quite a blocking issue in my OpenAFS
setup...  I hope someone is able to help me... :)


The setup is as follows:
* a multi-homed server with multiple IP addresses, all from the
public range, say S-IP-1 (i.e. x.x.x.5) and S-IP-2 (i.e. x.x.x.7);
* the second IP, S-IP-2 (i.e. x.x.x.7), is the one listed in
`NetInfo` and DNS record (and correctly listed when queried via `vos
listaddrs`);
* the first IP, S-IP-1 (i.e. x.x.x.5), is listed in
`NetRestricted` (and doesn't appear in `vos listaddrs`);
* NAT-ed client (no multi-home on the client side);

The actual problem is:
* the client sends the authentication request to S-IP-2;
* the client's router source-NAT's the IP to its own public IP,
and adds the UDP "connection" with S-IP-2 as the other peer to its
conntrack table;
* the server receives the request on S-IP-2;
* !!! however it replies from S-IP-1 (i.e. x.x.x.5) !!!  (probably
because the UDP socket is bound to `0.0.0.0`...)
* the client's router receives the packet and can't find it in its
conntrack table (because it expects the packet to come from S-IP-2);

As a note, everything works perfectly with non-NAT-ed clients.
Moreover on these public-IP-ed clients, I can clearly see via
`tcpdump` that outgoing packets go towards S-IP-2, but the replies
come from S-IP-1.  (The same asymmetry is visible also on the server.)
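
(For the record, a capture along these lines, on either end, makes the
asymmetry obvious;  the interface name and the port range are only my
guesses at covering the relevant AFS server ports:)

tcpdump -n -i ethX 'udp portrange 7000-7009 and (host x.x.x.5 or x.x.x.7)'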


Thus my question is how can I resolve such an issue?


I must say I've tried to `iptables -j SNAT ...` the outgoing packets
to the right S-IP-2, however this doesn't work because SNAT also
changes the source port.  I've also tried to `-j NETMAP` these
packets, but it doesn't work because NETMAP in the `OUTPUT` or
`POSTROUTING` chains actually touches the destination...  Thus if
someone knows of an `iptables` trick for this, I'm all ears...
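
(For completeness, the SNAT attempt was roughly of the following
shape;  the output interface and the AFS server port range are my own
guesses at the relevant bits, and, as said, it did not behave as
hoped:)

iptables -t nat -A POSTROUTING -o ethX -p udp --sport 7000:7009 -j SNAT --to-source x.x.x.7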

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Question for OpenAFS

2013-04-19 Thread Ciprian Dorin Craciun
On Thu, Apr 18, 2013 at 10:10 AM, Lars Schimmer
 wrote:
> On 18.04.2013 08:46, 강신덕 wrote:
>> I wonder whether Openafs File Server & DB Server could work on vmware as
>> a virtual machine.
>>
>> If it is possible, We want to migrate our openafs system from physical
>> server to virtual machine using VMWare.
>
> Sure that is possible, some cells do run complete out of VMWare VMs.
> But remind: this setup does cost some performance overhead.


About this OpenAFS-in-a-VM topic, I tried a small experiment some
time ago and I had some issues...

Basically there are two approaches to virtualization:

* each VM gets its own "public" IP address (practically it is just
like other hosts on the same LAN); in this case I have no doubts that
OpenAFS works flawlessly, minus the performance issues;

* the host has a single IP, and the VM's get a "private" IP
address, that is NAT-ed; in this case I remember I had issues with
properly configuring OpenAFS to handle such a scenario;

Could someone comment on their success / failure in the second,
NAT-ed scenario? I remember I gave up and just moved OpenAFS onto
the host. (I remember that 2-3 years ago there was some sketchy
documentation on this, but I haven't checked lately...)

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Where to find the 1.6.1a source code Git repository?

2012-12-18 Thread Ciprian Dorin Craciun
On Tue, Dec 18, 2012 at 10:17 PM, Derrick Brashear  wrote:
> On Tue, Dec 18, 2012 at 3:13 PM, Ciprian Dorin Craciun
>  wrote:
>> (Why this confusion? Because currently OpenAFS 1.6.1 fails to
>> build on latest 3.6 Linux kernel, and I was hopping that there is a
>> 1.6.1b version in the working in Git which solves the issues...)
>>
>
> There's a 1.6.2pre1 in git, which as it happens fixes that.


:) Indeed it fixes it... :)

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Where to find the 1.6.1a source code Git repository?

2012-12-18 Thread Ciprian Dorin Craciun
On Tue, Dec 18, 2012 at 10:07 PM, Derrick Brashear  wrote:
> 1.6.1a is macos-only. if you're not building a macos client, you don't
> care. if you are building a macos client, apply the 1.6.1a patch in
> the macos release directory to the 1.6.1 source.

Thanks for the quick reply, it clarifies some things now.

But then shouldn't this information (that the version 1.6.1a is
OSX only) be clearly written somewhere on the download site? (Indeed
if I look where that version appears I can find it only under the Mac
section.)

(Why this confusion? Because currently OpenAFS 1.6.1 fails to
build against the latest 3.6 Linux kernel, and I was hoping that there
is a 1.6.1b version in the works in Git which solves the issue...)

Thanks again,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] Where to find the 1.6.1a source code Git repository?

2012-12-18 Thread Ciprian Dorin Craciun
Hello all!

I've seen that on the download site there is an OpenAFS 1.6.1a
version, but in the git repository I don't seem to find any tag or
branch relating to such a version... (Indeed on the download site
there is a patch that applies cleanly over the branch
`openafs-stable-1_6_1-branch`.)
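
(For reference, this is roughly how I've been looking for it;  I'm
assuming the public clone URL lives on the same host as the gitweb
interface:)

git ls-remote --tags git://git.openafs.org/openafs.git | grep -i '1_6_1'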

Thus my question is in which repository (or under which reference)
can I find the code for the 1.6.1a release?

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Re: OpenAFS 1.6.0pre4 kernel panic

2011-04-16 Thread Ciprian Dorin Craciun
On Sat, Apr 16, 2011 at 14:16, Simon Wilkinson  wrote:
> Could you run gdb against your kernel module (either openafs.ko or 
> libafs.ko), and run
> list *afs_GetDownD.clone.5+0x1d0
>
> This will let us know exactly where in your kernel module the fault is 
> occurring.
>
> Cheers,
>
> Simon


Unfortunately it complains that it can't find symbols...

gdb ./src/libafs/MODLOAD-2.6.38.3-erebus+-MP/libafs.ko
...
(gdb) list *afs_GetDownD.clone.5+0x1d0
No symbol table is loaded.  Use the "file" command.


I've tried reconfiguring and rebuilding as below, but with the same result:

./configure \
--prefix=/packages/openafs/1.6.0-pre4--1 \
--with-afs-sysname=i386_linux26 \
--enable-kernel-module \
--disable-transarc-paths \
--disable-linux-syscall-probing \
--with-linux-kernel-headers=/tmp/linux--2.6.38.3-erebus+--modules \
--with-linux-kernel-build=/tmp/linux--2.6.38.3-erebus+--modules \
--enable-debug \
--enable-debug-kernel


Have I missed some configuration options, or should I change the
kernel config?

Thanks,
Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] Re: OpenAFS 1.6.0pre4 kernel panic

2011-04-16 Thread Ciprian Dorin Craciun
On Sat, Apr 16, 2011 at 12:58, Ciprian Dorin Craciun
 wrote:
>   (I've resent this email as the attached image and config file are
> too large and were rejected by the mailing list.)
>
>    Hello all!
>
>    I've successfully run OpenAFS v1.4.12 with kernels up-to 2.6.34.x.
> But unfortunately lately I'm unable to make neither 1.4.14.1 nor
> 1.6.0pre4 work with either 2.6.37.x or 2.6.38.x kernels.
>
>    Attached I put my kernel config file and a picture of my laptop's
> screen when "panic"-ed. (How could I easily capture the panic error
> after I reboot?) Also attached I put my (custom, but Debian based)
> `init.d` script and OpenAFS related config files. (I'm using
> ArchLinux.)
>
>   Panic picture:
>       
> http://data.volution.ro/ciprian/8f75abb6c3c12be3375206fa1cdea065/dscf0902-small.jpg
>   Kernel config:
>       http://data.volution.ro/ciprian/8f75abb6c3c12be3375206fa1cdea065/config
>
>    Any pointers?
>    Thank you,
>    Ciprian.


Ok.

I've tracked the problem down and identified that the cause is:

afsd -dynroot -afsdb -memcache -dcache 8192 -chunksize 14 -stat 32768
-fakestat-all -daemons 6 -volumes 256 -nosettime


Because if I remove all the "fine-tuning" it works as:

afsd -dynroot -afsdb -memcache -fakestat-all -nosettime


But still, should the cache manager crash the system if badly configured?

Ciprian.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] OpenAFS 1.6.0pre4 kernel panic

2011-04-16 Thread Ciprian Dorin Craciun
   (I've resent this email as the attached image and config file are
too large and were rejected by the mailing list.)

   Hello all!

   I've successfully run OpenAFS v1.4.12 with kernels up to 2.6.34.x.
But unfortunately, lately I'm unable to make either 1.4.14.1 or
1.6.0pre4 work with either the 2.6.37.x or the 2.6.38.x kernels.

   Attached I put my kernel config file and a picture of my laptop's
screen when "panic"-ed. (How could I easily capture the panic error
after I reboot?) Also attached I put my (custom, but Debian based)
`init.d` script and OpenAFS related config files. (I'm using
ArchLinux.)

   Panic picture:
   
http://data.volution.ro/ciprian/8f75abb6c3c12be3375206fa1cdea065/dscf0902-small.jpg
   Kernel config:
   http://data.volution.ro/ciprian/8f75abb6c3c12be3375206fa1cdea065/config

   Any pointers?
   Thank you,
   Ciprian.

