Re: [OpenAFS] solaris 10 versions supporting inode fileservers

2009-05-13 Thread Hartmut Reuter
David R Boldt wrote:
 
 We use Solaris 10 SPARC exclusively for our AFS servers.
 After upgrading from 1.4.8 to 1.4.10 we had a few volumes
 that started spontaneously going off-line, recovering, and
 then going off-line again until they needed to be salvaged.
 
 Hearing that this might be related to inode, we moved these
 volumes to a set of little-used fileservers that were running
 namei at 1.4.10. It made no discernible difference.
 
 Two volumes in particular accounted for 90% of our off-line
 volume issues.
 
 FileLog:
 Mon Apr 27 10:56:09 2009 Volume 2023867468 now offline, must be salvaged.
 Mon Apr 27 10:56:15 2009 Volume 2023867468 now offline, must be salvaged.
 Mon Apr 27 10:56:15 2009 Volume 2023867468 now offline, must be salvaged.
 Mon Apr 27 10:56:22 2009 fssync: volume 2023867469 restored; breaking all call backs
 (the restored volume above is the R/O copy of the R/W volume in need of salvage)
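
A quick way to see which volumes account for most of the off-line events is to tally the "now offline, must be salvaged" lines in FileLog per volume ID. The following is a minimal Python sketch, assuming the line layout shown in the excerpt above. The function name `count_offline_events` is made up for illustration, and real FileLog lines may differ between versions.

```python
import re
from collections import Counter

# Matches lines such as:
#   Mon Apr 27 10:56:09 2009 Volume 2023867468 now offline, must be salvaged.
OFFLINE_RE = re.compile(r"Volume (\d+) now offline, must be salvaged")


def count_offline_events(lines):
    """Return a Counter mapping volume ID -> number of off-line messages."""
    counts = Counter()
    for line in lines:
        match = OFFLINE_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Sample input copied from the FileLog excerpt in this message.
sample = [
    "Mon Apr 27 10:56:09 2009 Volume 2023867468 now offline, must be salvaged.",
    "Mon Apr 27 10:56:15 2009 Volume 2023867468 now offline, must be salvaged.",
    "Mon Apr 27 10:56:15 2009 Volume 2023867468 now offline, must be salvaged.",
    "Mon Apr 27 10:56:22 2009 fssync: volume 2023867469 restored; breaking all call backs",
]

# Print volumes from most to least affected.
for volume_id, n in count_offline_events(sample).most_common():
    print(volume_id, n)
```

In practice the list of lines would come from reading the FileLog file itself, for example by passing an open file object to `count_offline_events`.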

That's interesting: I saw similar behavior on some of our volumes,
although on AFS/OSD fileservers. I then made the ViceLog messages more
verbose and found that this always happened when IH_OPEN failed.
IH_OPEN can fail if the handle in the vnode is missing. To prevent that,
I added some lines to VGetVnode_r: when an already existing vnode
structure is found, it checks whether the handle is in place and, if not,
does a new IH_INIT (and writes a message to the log). I found about 100
cases per day in our cell, but not all of them would have ended in the
volume being taken off-line, because in many cases the handle would never
have been used (all the GetStatus RPCs, for example). Since then I have
never again seen volumes going off-line.

Hartmut
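
The guard described in the reply above can be modelled in a few lines. The sketch below is Python, not the OpenAFS C source: `Vnode`, `VnodeCache`, `init_handle` and `get_vnode` are hypothetical stand-ins for the vnode structure, the vnode cache, IH_INIT and VGetVnode_r. It only illustrates the idea of re-creating a missing handle when an already cached vnode is found.

```python
import logging
from dataclasses import dataclass
from typing import Dict, Optional

log = logging.getLogger("fileserver-sketch")


@dataclass
class Vnode:
    number: int
    handle: Optional[str] = None  # stands in for the inode handle


def init_handle(vnode_number: int) -> str:
    """Hypothetical stand-in for IH_INIT: build a fresh handle."""
    return f"handle-for-{vnode_number}"


class VnodeCache:
    def __init__(self) -> None:
        self._vnodes: Dict[int, Vnode] = {}
        self.repaired = 0  # how often a missing handle had to be re-created

    def get_vnode(self, number: int) -> Vnode:
        """Hypothetical stand-in for VGetVnode_r."""
        vnode = self._vnodes.get(number)
        if vnode is None:
            # Cache miss: create the vnode together with its handle.
            vnode = Vnode(number, init_handle(number))
            self._vnodes[number] = vnode
        elif vnode.handle is None:
            # Cache hit, but the handle is gone: re-create it and log the
            # event instead of letting a later open fail.
            vnode.handle = init_handle(number)
            self.repaired += 1
            log.warning("vnode %d had no handle; re-initialized", number)
        return vnode


cache = VnodeCache()
cache.get_vnode(7).handle = None   # simulate a cached vnode losing its handle
print(cache.get_vnode(7).handle)   # the handle has been re-created
print(cache.repaired)              # number of repairs so far
```

Counting the repairs, as `repaired` does here, mirrors the observation that the missing-handle case can be logged and tallied without every occurrence leading to a volume going off-line.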
 
 Both of the most frequently affected volumes have their content
 completely rewritten roughly every 20 minutes while being on
 an automated replication schedule of 15 minutes. One of them
 is 25 MB, the other 95 MB, both at about 80% of quota.
 
 We downgraded just the fileserver binary to 1.4.8 on all of
 our servers and have not seen a single off-line message in
 36 hours.
 
 
 -- David Boldt
 dbo...@usgs.gov


-- 
Hartmut Reuter                  e-mail  reu...@rzg.mpg.de
                                phone   +49-89-3299-1328
                                fax     +49-89-3299-1301
RZG (Rechenzentrum Garching)    web     http://www.rzg.mpg.de/~hwr
Computing Center of the Max-Planck-Gesellschaft (MPG) and the
Institut fuer Plasmaphysik (IPP)
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] /afs area is hanging

2009-05-13 Thread Jason Edgecombe

Mark Henry wrote:



This is a happy day.  Derrick, it looks like your skills have pointed us to the
solution.  I mentioned that we use a loopback device to mount our AFS cache
filesystem.  That device was /dev/loop0.  Following the direction you gave us,
we found that /dev/loop0 was also being used to restrict the font cache of a
different app.  Whenever that app ran, the AFS cache got clobbered and the AFS
hang followed.  We have now moved the AFS cache to a new place and it looks
like this problem has been solved.  Thank you to everyone on openafs.org who
helped us with this issue.  Thank you, Derrick, for the key piece of info that
solved this one.
  
I'm curious what the backing store for /dev/loop0 is in your setup. What
advantages do you get from running this way?


Is this so you can store the cache in a ramdisk?

Thanks,
Jason


[OpenAFS] OpenAFS Newsletter, Issue 1, May 2009

2009-05-13 Thread Jason Edgecombe

[The HTML version of this text is attached]

OpenAFS Newsletter, Issue 1, May 2009
   This is the first issue of what will hopefully be a monthly summary of
   the activity that's happening in the OpenAFS community.

   As always, volunteers, patches, bug reports, or any other type of help
   is greatly appreciated.

   Feedback on this newsletter is welcome. The goal is to summarize the
   various development efforts and news of OpenAFS for the community.
   Please let Jason know what you would like to see out of this newsletter.

 Upcoming Events
   The Sixth Annual International AFS & Kerberos Best Practices Workshop
   will be held at Stanford University on June 1-5, 2009.

   Ref: http://workshop.openafs.org/afsbpw09/index.html

 Projects
  Disconnected AFS support
   Project Contacts:

   *   Simon Wilkinson s...@inf.ed.ac.uk

   *   Dragos Tatulea dragos.tatu...@gmail.com

   Support for disconnected operation on Unix has been integrated into the
   1.5.x branch. There are no currently known data loss bugs, and further
   testing would be greatly appreciated. Currently, files written whilst
   disconnected do not persist across restarts, and there is no user
   interface to specify which files should be pinned in the cache to ensure
   they're available whilst disconnected. Simon Wilkinson and Dragos
   Tatulea (respectively) are working to resolve these.

  Security Releases
   Security Officer:

   *   Simon Wilkinson s...@inf.ed.ac.uk

   OpenAFS 1.4.9, 1.4.10 and 1.5.59 were security releases that resolve two
   independent issues in the OpenAFS cache manager for Unix, one of which
   is a potential remote root exploit. All Unix platforms, excluding Mac OS
   X 10.4 and 10.5, are affected, and upgrading is strongly recommended.

  Fedora 11 and Linux 2.6.29 support
   Project Contact:

   *   Simon Wilkinson s...@inf.ed.ac.uk

   OpenAFS 1.4.10 does not build on Fedora 11, and crashes in some
   situations when running on a 2.6.29 kernel. Fixes will be available
   shortly.

  Newsletter
   Project Contact:

   *   Jason Edgecombe ja...@rampaginggeek.com

   Derrick Brashear, Jason Edgecombe, Jeff Altman, and Simon Wilkinson
   discussed how to keep the community better informed through the web
   and a monthly email newsletter. Jason volunteered to write the
   newsletter. This document is the result.

   Ref:
   http://jabber.openafs.org/open...@conference.openafs.org/2009-05-04.txt

  Google Summer of Code 2009
   OpenAFS received 4 slots for the 2009 Google Summer of Code.

   Go to http://socghop.appspot.com/org/home/google/gsoc2009/openafs for
   more information about the GSoC projects.

   The student projects are:

  OpenAFS server preference based on network conditions
   Student Developer: Jake Thebault-Spieker summatusmen...@gmail.com

   Mentor: Derrick Brashear sha...@gmail.com

   This is Jake's second year with GSoC for OpenAFS.

   Abstract:

   The OpenAFS cache manager keeps two lists of which servers host the
   files required. Currently, these lists are ordered based on antiquated
   network architecture assumptions that no longer apply to current network
   architectures. This project seeks to change the way these lists are
   ordered by taking into account network conditions that can be estimated
   based on the Rx peer statistics gathering functionality built into
   OpenAFS.

   Ref:
   https://lists.openafs.org/pipermail/openafs-devel/2009-April/016590.html
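
As an illustration of the idea in the abstract, the sketch below orders a list of servers by measured network conditions rather than by a static rank. It is Python and purely hypothetical: the `PeerStats` fields, the penalty weight and the `rank_servers` function are assumptions made for this example, not the Rx peer statistics structure or the algorithm the project will use.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PeerStats:
    address: str
    rtt_ms: float        # assumed smoothed round-trip time
    timeout_rate: float  # assumed fraction of calls that timed out (0..1)


def score(peer: PeerStats, timeout_penalty_ms: float = 1000.0) -> float:
    """Lower is better: RTT plus a penalty proportional to the timeout rate."""
    return peer.rtt_ms + peer.timeout_rate * timeout_penalty_ms


def rank_servers(peers: List[PeerStats]) -> List[str]:
    """Return server addresses ordered from most to least preferred."""
    return [p.address for p in sorted(peers, key=score)]


# Example addresses come from the documentation range 192.0.2.0/24.
peers = [
    PeerStats("192.0.2.10", rtt_ms=40.0, timeout_rate=0.00),
    PeerStats("192.0.2.20", rtt_ms=5.0, timeout_rate=0.10),
    PeerStats("192.0.2.30", rtt_ms=12.0, timeout_rate=0.00),
]
print(rank_servers(peers))
```

Here the server with the lowest round-trip time is ranked last because its assumed timeout rate outweighs its latency advantage, which is the kind of trade-off a static ordering cannot express.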

  OpenAFS Management Console on Windows
   Student Developer: Brant Gurganus br...@gurganus.name

   Mentor: Jeffrey Altman jalt...@secure-endpoints.com

   Abstract:

   I propose creating a Microsoft Management Console snap-in for Windows.
   This will better integrate OpenAFS with existing Windows management
   technology and fits in an overall strategy of improved Windows
   integration.

   Ref:
   https://lists.openafs.org/pipermail/openafs-devel/2009-April/016591.html

  Implementing OpenAFS features in Red Hat's kafs kernel module
   Student Developer: Wang Lei wang840...@gmail.com

   Mentor: David Howells

   Abstract:

   My project is to implement some OpenAFS features in the Linux kafs
   kernel module. The first feature is DNS AFSDB support, which may be
   implemented by sharing the mechanism used by CIFS, as in my mentor's
   current design. The second is to implement more functions of the pioctl
   system call, building on the work of last year's GSoC student. The other
   two are to implement some OpenAFS fs commands that are not yet
   implemented in kafs, and keyring compatibility that works well with both
   the OpenAFS client and kafs. I have begun to read the pioctl/ioctl
   documentation of OpenAFS, and I will start my work with the
   implementation of this system call. There are a lot of pioctls to
   implement, and I think that can help me understand OpenAFS well.

   Ref:
   https://lists.openafs.org/pipermail/openafs-devel/2009-April/016595.html

  Adding Searching and Indexing