Re: [OpenAFS] PAM configuration?

2006-05-26 Thread Brady Catherman

Okay, I have no clue what I did but it is working now =)

I was walking through each of the suggestions, making sure that
everything was in place. I don't remember what I changed or tweaked,
but now it appears to be working! Thank you all for your help
with this.


Now I just need to finish getting SGE to work properly =)

- Brady


___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] OpenAFS implementation questions.

2006-05-26 Thread Christopher D. Clausen
There are per-IP ACLs.  While not the best solution, they might work if
you have a limited set of users who are generally trusted enough not to
mess with other people's stuff.

http://www.duke.edu/~jhv/answers/afs-ip-acls.html

Derek Atkins <[EMAIL PROTECTED]> wrote:
> Well, you should be able to get tickets/tokens through ssh, either
> via kerberos ticket passing or typing in a password.  In those cases
> your users can still run re-auth.
>
> However for batch processes, well, there's just not much you can do.
>
> -derek
>
> Brady Catherman <[EMAIL PROTECTED]> writes:
>
>> Thanks for the quick reply! =)
>>
>> These users are coming in through SSH and often launching jobs that
>> run in the background. There really isn't room for running reauth and
>> such =/
>>
>> Plus the hope is to put this on the cluster where jobs sit queued for
>> ages before running. The user would have no ability to authenticate
>> later.
>>
>> Hope this clarifies things a bit.
>>
>>
>> On May 25, 2006, at 12:32 PM, Derek Atkins wrote:
>>
>>> Have your users run reauth?  That will automatically get them new
>>> tickets and tokens..  Or tie into the screensaver!
>>>
>>> -derek
>>>
>>> Quoting Brady Catherman <[EMAIL PROTECTED]>:
>>>
>>>> I am currently considering moving our environment to OpenAFS but
>>>> before I can switch I need to make sure a few things are going to
>>>> keep working.
>>>>
>>>> We have users that use our systems for months on end without
>>>> logging off and I am concerned that the kerberos ticket they are
>>>> being issued will expire. Having them log back into
>>>> kerberos/openafs isn't really a good option for us (I am having a
>>>> hard enough time selling even the basic conversion, let alone
>>>> anything that requires user action!)
>>>>
>>>> Is there an easy way around this with OpenAFS? I have tried
>>>> setting the length of our kerberos tickets higher but the most it
>>>> will give me is 7 days. Is there another way to do this?




Re: [OpenAFS] Solaris 10 inode server

2006-05-26 Thread John Tang Boyland
] John Tang Boyland <[EMAIL PROTECTED]> wrote:
] > (Since I get openafs-info in digest form, a direct cc is appreciated.)
] >
] > We are expanding our small AFS cell at UWM.  We have new SPARC
] > blades running Solaris 10 and the inode fileserver.  But we've
] > found it impossible to create or release volumes to the new machine.
] > (The new machine is called jeremiah.cs.uwm.edu.)
] >
] > I can add a readonly site, but releasing the volume to the server
] > hangs, as does trying to create a volume.
] 
] Are you using the binaries from openafs.org or did you compile yourself?

openafs.org

] Is this the first volume on this partition on this server?  

Yes, although we had tried before and got different erroneous results:
vos release wouldn't hang, it would simply report that it failed and
ask to salvage the disk, which didn't help.

But now it was a more serious error.

] Can you manually create files on /vicepX?  Using touch or cat or 
] something?  (You probably don't want to attempt this while the 
] fileserver is running.)

'touch' hung too.  We rebooted the server and then the disk itself
didn't appear (fsck gave No Such File or Directory).  It looks
like it's a problem with the fibre channel RAID.  Nothing to do with
AFS.  Sorry.

John


Re: [OpenAFS] file corruption redux

2006-05-26 Thread Derrick J Brashear

On Fri, 26 May 2006, Miles Davis wrote:


> I think I can get it to happen pretty regularly on several clients. It seems to
> be just this server, though...I'm still stumped. Another point of possible
> interest -- I see no corruption if I go through an afs client on the server
> itself.


So is it possible that hardware is in some (non-disk-controller) way 
corrupting the transfers?


It might be worthwhile to run tcpdump -s (larger than MTU) -w (some file)
host (a client you know will lose) and port 7000. Then, given the cmp -l
output for the good and corrupt files, perhaps we can see whether there's
garbage on the wire.





Re: [OpenAFS] file corruption redux

2006-05-26 Thread Miles Davis
On Fri, May 26, 2006 at 01:48:53PM -0400, Derrick J Brashear wrote:
> On Fri, 26 May 2006, Miles Davis wrote:
> 
> >
> >OK, by replacing a RAID controller and memory, I've managed to modify the
> >behaviour of my little file corruption issue, but magically it remains.
> >Now it seems to be isolated to the client side -- I have yet to find a bad
> >md4sum of a file on the server. The client-side corruption also comes and
> >goes -- with a well-placed fs flush or flushvol, it's gone, only to
> >reappear later. I've verified that it happens on 1.4.1 and 1.3.86 (I don't
> >have any 1.2.x clients left to try, but I can open up the volume to the
> >world if anybody wants to try).
> 
> So I'll leave your bug open. Hm
> 
> When I have recovered from lunch I should code you up a patch. How 
> reproducible is it?

I think I can get it to happen pretty regularly on several clients. It seems to
be just this server, though...I'm still stumped. Another point of possible
interest -- I see no corruption if I go through an afs client on the server
itself.

I wasn't ever able to reproduce the corruption on other servers, either.


-- 
// Miles Davis - [EMAIL PROTECTED] - http://www.cs.stanford.edu/~miles
// Computer Science Department - Computer Facilities
// Stanford University


Re: [OpenAFS] file corruption redux

2006-05-26 Thread Derrick J Brashear

On Fri, 26 May 2006, Miles Davis wrote:



> OK, by replacing a RAID controller and memory, I've managed to modify the
> behaviour of my little file corruption issue, but magically it remains. Now it
> seems to be isolated to the client side -- I have yet to find a bad md4sum of a
> file on the server. The client-side corruption also comes and goes -- with a
> well-placed fs flush or flushvol, it's gone, only to reappear later. I've
> verified that it happens on 1.4.1 and 1.3.86 (I don't have any 1.2.x clients
> left to try, but I can open up the volume to the world if anybody wants to
> try).


So I'll leave your bug open. Hm

When I have recovered from lunch I should code you up a patch. How 
reproducible is it?



[OpenAFS] file corruption redux

2006-05-26 Thread Miles Davis

OK, by replacing a RAID controller and memory, I've managed to modify the 
behaviour of my little file corruption issue, but magically it remains. Now it 
seems to be isolated to the client side -- I have yet to find a bad md4sum of a 
file on the server. The client-side corruption also comes and goes -- with a 
well-placed fs flush or flushvol, it's gone, only to reappear later. I've 
verified that it happens on 1.4.1 and 1.3.86 (I don't have any 1.2.x clients 
left to try, but I can open up the volume to the world if anybody wants to 
try).

cmp -l on a bad client-side file and known good file doesn't enlighten me, but 
maybe it will for somebody else:

cmp -l aspell-pl-0.51-3.i386.rpm /tmp/aspell-pl-0.51-3.i386.rpm
13468782 337 317
cmp -l eclipse-platform-3.1.0_fc-0.M6.22.i386.rpm /tmp/eclipse-platform-3.1.0_fc-0.M6.22.i386.rpm
 2883589 357 257

I can do whatever debugging will help on the file server, as it has exactly one 
volume on it right now.
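
For anyone reading along, here is a minimal sketch of how the cmp -l lines
above can be decoded. cmp -l prints a decimal, 1-based byte offset followed
by the two differing byte values in octal. The function names and the
SAMPLE constant are made up for illustration. For the two lines quoted
above, each pair of bytes turns out to differ in exactly one bit (0x10 and
0x40). That is an observation from the arithmetic, not a diagnosis.

```python
def parse_cmp_l(text):
    """Parse `cmp -l` output lines of the form: OFFSET OCTAL1 OCTAL2.

    Lines that do not have exactly three fields (for example the cmp
    command lines themselves) are skipped.
    """
    diffs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        offset = int(fields[0])        # decimal, 1-based byte offset
        first = int(fields[1], 8)      # byte value in the first file (octal)
        second = int(fields[2], 8)     # byte value in the second file (octal)
        diffs.append((offset, first, second))
    return diffs


def describe(diffs):
    """For each differing byte, compute the XOR and how many bits changed."""
    rows = []
    for offset, first, second in diffs:
        xor = first ^ second
        rows.append({
            "offset": offset,
            "first": first,
            "second": second,
            "xor": xor,
            "bits_flipped": bin(xor).count("1"),
        })
    return rows


# The two difference lines from the message above.
SAMPLE = """\
13468782 337 317
 2883589 357 257
"""

for row in describe(parse_cmp_l(SAMPLE)):
    print("offset %d: 0x%02x vs 0x%02x, xor 0x%02x, %d bit(s) differ" % (
        row["offset"], row["first"], row["second"],
        row["xor"], row["bits_flipped"]))
```

Feeding it the cmp -l output from more corrupted files would show whether
the single-bit pattern holds or is a coincidence of these two samples.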


-- 
// Miles Davis - [EMAIL PROTECTED] - http://www.cs.stanford.edu/~miles
// Computer Science Department - Computer Facilities
// Stanford University


Re: [OpenAFS] Solaris 10 inode server

2006-05-26 Thread Christopher D. Clausen
John Tang Boyland <[EMAIL PROTECTED]> wrote:
> (Since I get openafs-info in digest form, a direct cc is appreciated.)
>
> We are expanding our small AFS cell at UWM.  We have new SPARC
> blades running Solaris 10 and the inode fileserver.  But we've
> found it impossible to create or release volumes to the new machine.
> (The new machine is called jeremiah.cs.uwm.edu.)
>
> I can add a readonly site, but releasing the volume to the server
> hangs, as does trying to create a volume.

Are you using the binaries from openafs.org or did you compile yourself?

Is this the first volume on this partition on this server?  Did vos 
listvol return an empty list prior to attempting the release / create? 
You did remember to run newfs on the vice partitions before attempting 
to use them, correct?

Can you manually create files on /vicepX?  Using touch or cat or 
something?  (You probably don't want to attempt this while the 
fileserver is running.)

If everything is hung, yes, you will probably have to power-cycle the 
server to get it back up.  Or at least I have been in situations where 
such action was required.

-

For the record, I will state that I have had nearly perfect success
using 1.4.1 with the inode server on Solaris 10 on four of my machines.
(There was a slight, unreproducible problem with two client machines
appearing to have the same UUIDs.)



[OpenAFS] Solaris 10 inode server

2006-05-26 Thread John Tang Boyland
(Since I get openafs-info in digest form, a direct cc is appreciated.)

We are expanding our small AFS cell at UWM.  We have new SPARC
blades running Solaris 10 and the inode fileserver.  But we've
found it impossible to create or release volumes to the new machine.
(The new machine is called jeremiah.cs.uwm.edu.)

I can add a readonly site, but releasing the volume to the server
hangs, as does trying to create a volume.  For example:

% vos release root.cell -verbose

root.cell 
RWrite: 536870915 ROnly: 536870916 RClone: 536870916 
number of sites -> 4
   server solomons.cs.uwm.edu partition /vicepa RW Site 
   server solomons.cs.uwm.edu partition /vicepa RO Site  -- New release
   server afs2.cs.uwm.edu partition /vicepa RO Site  -- New release
   server jeremiah.cs.uwm.edu partition /vicepa RO Site  -- Old release
This is a completion of the previous release
[HANG]
(Control-C followed by "vos unlock root.cell".)

The FileLog says:

Fri May 26 06:56:04 2006 File server starting
Fri May 26 06:56:04 2006 afs_krb_get_lrealm failed, using cs.uwm.edu.
Fri May 26 06:56:04 2006 Set thread id 14 for FSYNC_sync
Fri May 26 06:56:04 2006 Partition /vicepa: attaching volumes
Fri May 26 06:56:04 2006 Partition /vicepa: attached 0 volumes; 0 volumes not attached
Fri May 26 06:56:04 2006 Getting FileServer name...
Fri May 26 06:56:04 2006 FileServer host name is 'jeremiah'
Fri May 26 06:56:04 2006 Getting FileServer address...
Fri May 26 06:56:04 2006 FileServer jeremiah has address 129.89.143.70 (0x468f5981 or 0x81598f46 in host byte order)
Fri May 26 06:56:04 2006 File Server started Fri May 26 06:56:04 2006
Fri May 26 06:56:04 2006 Set thread id 15 for 'FiveMinuteCheckLWP'
Fri May 26 06:56:04 2006 Set thread id 16 for 'HostCheckLWP'
Fri May 26 06:56:04 2006 Set thread id 17 for 'FsyncCheckLWP'
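
As an aside, the two hex values in the "has address" line are the same
four address bytes read in the two possible byte orders. A minimal sketch
using only the Python standard library (the function name is made up for
illustration):

```python
import socket
import struct


def address_words(dotted):
    """Return the IPv4 address as (big-endian, little-endian) 32-bit integers."""
    packed = socket.inet_aton(dotted)          # 4 bytes in network order
    big = struct.unpack(">I", packed)[0]       # bytes read most significant first
    little = struct.unpack("<I", packed)[0]    # same bytes read least significant first
    return big, little


big, little = address_words("129.89.143.70")
print("0x%08x 0x%08x" % (big, little))
```

For 129.89.143.70 the bytes are 0x81, 0x59, 0x8f and 0x46, so the two
readings are 0x81598f46 and 0x468f5981, the same pair the log prints.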

While the release operation is hanging, the VolserLog says:

Fri May 26 06:56:07 2006 Starting AFS Volserver 2.0 (/usr/afs/bin/volserver)
Fri May 26 09:17:07 2006 trans 1 on volume 536870916 is older than 300 seconds
Fri May 26 09:17:37 2006 trans 1 on volume 536870916 is older than 330 seconds
Fri May 26 09:18:07 2006 trans 1 on volume 536870916 is older than 360 seconds
Fri May 26 09:18:37 2006 trans 1 on volume 536870916 is older than 390 seconds
...
Even after killing the release, it still prints out messages:
...
Fri May 26 09:48:08 2006 trans 1 on volume 536870916 is older than 2160 seconds
...
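
A minimal sketch of pulling the stuck transaction out of VolserLog lines
like the ones above. It assumes the message format is exactly as shown in
this excerpt; the function name and the regular expression are made up for
illustration and are not part of OpenAFS.

```python
import re

# Matches the message text shown in the VolserLog excerpt above.
TRANS_RE = re.compile(r"trans (\d+) on volume (\d+) is older than (\d+) seconds")


def oldest_transactions(log_text):
    """Map (transaction id, volume id) to the largest age reported, in seconds."""
    oldest = {}
    for line in log_text.splitlines():
        match = TRANS_RE.search(line)
        if match is None:
            continue  # startup banner, "..." and other lines are ignored
        trans, volume, age = (int(group) for group in match.groups())
        key = (trans, volume)
        oldest[key] = max(age, oldest.get(key, 0))
    return oldest


EXCERPT = """\
Fri May 26 06:56:07 2006 Starting AFS Volserver 2.0 (/usr/afs/bin/volserver)
Fri May 26 09:17:07 2006 trans 1 on volume 536870916 is older than 300 seconds
Fri May 26 09:17:37 2006 trans 1 on volume 536870916 is older than 330 seconds
Fri May 26 09:48:08 2006 trans 1 on volume 536870916 is older than 2160 seconds
"""

for (trans, volume), age in oldest_transactions(EXCERPT).items():
    print("trans %d on volume %d has been open for at least %d seconds"
          % (trans, volume, age))
```

Here the same transaction on volume 536870916 (the ROnly ID in the vos
release output) keeps aging after the release was killed.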

The volserver refuses any other requests: rxdebug shows it's still
alive, but 'vos listvol' hangs, as does 'vos create'.  Stopping the 'fs'
instance hangs too:
(from bos status -long)
Instance fs, (type is fs) disabled, has core file, currently shutting down.
Auxiliary status is: file server shutting down.
Process last started at Fri May 26 06:56:04 2006 (2 proc starts)
Last exit at Fri May 26 09:53:35 2006
Command 1 is '/usr/afs/bin/fileserver'
Command 2 is '/usr/afs/bin/volserver'
Command 3 is '/usr/afs/bin/salvager'
(But the FileLog says:
Fri May 26 09:53:35 2006 Shutting down file server at Fri May 26 09:53:35 2006
Fri May 26 09:53:35 2006 Vice was last started at Fri May 26 06:56:04 2006

Fri May 26 09:53:35 2006 Large vnode cache, 400 entries, 0 allocs, 0 gets (0 reads), 0 writes
Fri May 26 09:53:35 2006 Small vnode cache,400 entries, 0 allocs, 0 gets (0 reads), 0 writes
Fri May 26 09:53:35 2006 Volume header cache, 400 entries, 0 gets, 0 replacements
Fri May 26 09:53:35 2006 Partition /vicepa: 418562775 available 1K blocks (minfree=4227906), Fri May 26 09:53:35 2006 418562766 free blocks
Fri May 26 09:53:35 2006 With 90 directory buffers; 0 reads resulted in 0 read I/Os
Fri May 26 09:53:35 2006 Total Client entries = 0, blocks = 0; Host entries = 0, blocks = 0
Fri May 26 09:53:35 2006 There are 0 connections, process size 137858
Fri May 26 09:53:35 2006 There are 0 workstations, 0 are active (req in < 15 mins), 0 marked "down"
Fri May 26 09:53:35 2006 VShutdown:  shutting down on-line volumes...
Fri May 26 09:53:35 2006 VShutdown:  complete.
Fri May 26 09:53:35 2006 File server has terminated normally at Fri May 26 09:53:35 2006
)
And the VolserLog has stopped adding lines.

Am I going to have to reboot to get the system to respond?

I don't see any indication of what the problem is.  Is this a problem
with the inode fileserver?  The entry from vfstab for the partition
is:

/dev/dsk/c0t600C0FF0098C96204C3F4A00d0s7 /dev/rdsk/c0t600C0FF0098C96204C3F4A00d0s7 /vicepa afs 3 yes nologging

(Before trying to stop the fs process.)
% rxdebug jeremiah -port 7005
Trying 129.89.143.70 (port 7005):
Free packets: 193, packet reclaims: 0, calls: 3, used FDs: 6
not waiting for packets.
0 calls waiting for a thread
8 threads are idle
Connection from host 129.89.38.129, port 48474, Cuid a15f602d/cd4f1f5c
  serial 70,  natMTU 1444, flags pktCksum, security index 2, server conn
  rxkad: level clear, flags authenticated pktCksum, expi

Re: [OpenAFS] need help writing to AFS without tokens

2006-05-26 Thread Franco Milicchio


On May 26, 2006, at 04:56pm, Edward Quick wrote:

> I have 2 boxes with AFS clients installed. On one box, I can log on
> as user jpyxcom1 and write to a directory on AFS without having to
> get any tokens. On the other box though, I can't, I have to do klog
> jpyxcom1 and get the tokens, before it'll let me write, or cd to
> the directory.
>
> I'm trying to set up a cronjob on the other box, which will copy a
> file to AFS, but I'm stuck at the moment because of this
> permissions problem. Could someone tell me where I'm going wrong
> please?


Can you explain the situation better? How do you log in? Locally, I
suppose (no kaserver interaction), and so with no token at all. The user
writes to a directory. What are the ACLs?


There are some differences between the two hosts...


--
Franco Milicchio <[EMAIL PROTECTED]>

The optimist thinks this is the best of all possible worlds.
The pessimist fears it is true.  [J. Robert Oppenheimer]




Re: [OpenAFS] need help writing to AFS without tokens

2006-05-26 Thread Anne . Salemme
Look at the access permissions on the directory where you can write
("fs la") and you'll probably be able to figure it out.
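
A minimal sketch of reading that output programmatically, assuming the
usual layout of "fs listacl" ("Normal rights:" and "Negative rights:"
headings followed by one "name rights" pair per line). The sample text,
the path in it and the function names are hypothetical, not taken from
Ed's cell.

```python
def parse_listacl(text):
    """Split `fs listacl` style output into normal and negative rights."""
    rights = {"normal": {}, "negative": {}}
    section = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "Normal rights:":
            section = "normal"
        elif stripped == "Negative rights:":
            section = "negative"
        elif section is not None and stripped:
            parts = stripped.split()
            if len(parts) == 2:
                name, letters = parts
                rights[section][name] = letters
    return rights


def who_can_write(rights):
    """Names whose normal rights include insert (i) or write (w).

    Negative rights are parsed but not subtracted here; this is only a
    first look at the ACL, not a full access check.
    """
    return sorted(name for name, letters in rights["normal"].items()
                  if "i" in letters or "w" in letters)


# Hypothetical sample output, for illustration only.
SAMPLE = """\
Access list for /afs/example.org/drop is
Normal rights:
  system:administrators rlidwka
  jpyxcom1 rlidwk
  system:anyuser rl
"""

print(who_can_write(parse_listacl(SAMPLE)))
```

If an entry other than the user shows up with i or w rights (a
host-based entry or system:anyuser, say), that would explain writing
without tokens from one box and not the other.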


anne



Quoting Edward Quick <[EMAIL PROTECTED]>:


> Hi,
>
> I have 2 boxes with AFS clients installed. On one box, I can log on as
> user jpyxcom1 and write to a directory on AFS without having to get any
> tokens. On the other box though, I can't, I have to do klog jpyxcom1
> and get the tokens, before it'll let me write, or cd to the directory.
>
> I'm trying to set up a cronjob on the other box, which will copy a file
> to AFS, but I'm stuck at the moment because of this permissions
> problem. Could someone tell me where I'm going wrong please?
>
> Many thanks,
>
> Ed.




[OpenAFS] need help writing to AFS without tokens

2006-05-26 Thread Edward Quick

Hi,

I have 2 boxes with AFS clients installed. On one box, I can log on as user 
jpyxcom1 and write to a directory on AFS without having to get any tokens. 
On the other box though, I can't, I have to do klog jpyxcom1 and get the 
tokens, before it'll let me write, or cd to the directory.


I'm trying to set up a cronjob on the other box, which will copy a file to 
AFS, but I'm stuck at the moment because of this permissions problem. Could 
someone tell me where I'm going wrong please?


Many thanks,

Ed.




Re: [OpenAFS] OpenAFS implementation questions.

2006-05-26 Thread Frank Burkhardt
Hi,

On Thu, May 25, 2006 at 12:23:01PM -0700, Brady Catherman wrote:
> I am currently considering moving our environment to OpenAFS but before I
> can switch I need to make sure a few things are going to keep working..
> 
> We have users that use our systems for months on end without logging off
> and I am concerned that the kerberos ticket they are being issued will
> expire. Having them log back into kerberos/openafs isn't really a good
> option for us (I am having a hard enough time selling even the basic
> conversion, let alone anything that requires user action!)

Use some kind of reauthentication. On one of my AFS clients there are 4
processes that run *always* (they start when the computer boots up and
terminate only when the computer is about to reboot). I'm using a
self-written tool, "tokenmgr", which knows how to execute kinit, aklog and
some other programs in the right way to ensure that a valid token is always
available. In most cases, I'm using keytabs to provide the necessary
Kerberos credentials.
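
A minimal sketch of the keytab idea, not of tokenmgr itself (its internals
are not shown here). It assumes MIT-style "kinit -k -t KEYTAB PRINCIPAL"
and a plain "aklog" are on the PATH; the principal, keytab path and
function names below are hypothetical.

```python
import subprocess
import time


def renewal_commands(principal, keytab):
    """Commands to obtain a fresh ticket from a keytab, then an AFS token."""
    return [
        ["kinit", "-k", "-t", keytab, principal],  # ticket from keytab, no password prompt
        ["aklog"],                                 # turn the ticket into an AFS token
    ]


def renew_once(principal, keytab, run=subprocess.run):
    """Run the renewal commands once; raises if a command exits non-zero."""
    for command in renewal_commands(principal, keytab):
        run(command, check=True)


def renew_forever(principal, keytab, interval_seconds=3600):
    """Renew at a fixed interval, which should be well under the ticket lifetime."""
    while True:
        renew_once(principal, keytab)
        time.sleep(interval_seconds)


# Hypothetical usage, e.g. started alongside a long-running job:
#   renew_forever("batchuser@EXAMPLE.ORG", "/etc/security/batchuser.keytab")
```

The keytab has to be readable only by the account that owns the job,
since anyone who can read it can obtain that principal's tickets.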

A different method can be used for interactive or "semi-interactive"
sessions. When someone logs in by ssh, he would just type 'tokenmgr -R' (and
enter his password twice) to get an arbitrary number of virtual terminals
(using the almighty 'screen' command). All programs run in those terminals
will always have a valid token.

Regards,

Frank