Re: [Gluster-users] View from one client's gone stubbornly bad

2011-07-22 Thread Whit Blauvelt
On Fri, Jul 22, 2011 at 10:26:13AM +0530, Anand Avati wrote:
 Can you post gluster client logs and check if there are any core dumps?

Okay, they are at http://transpect.com/gluster/ - they may be a bit jumbled
because of system reboots and Gluster being shut down and restarted while I
was working on it, but there you are.

No core dumps.
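For reference, this is the kind of scan that surfaces client-side failures. The log path and line format below are assumptions about a 3.1.x client log, not taken from this thread, and the sample line is a stand-in for a real log:

```shell
# Sketch: pull error-level (E) lines out of a glusterfs client log.
# 3.1.x client logs normally live under /var/log/glusterfs/; here we
# fabricate a two-line sample so the grep pattern can be shown.
cat > /tmp/client-sample.log <<'EOF'
[2011-07-21 18:02:11] E [client-handshake.c] 0-kvm-client-1: disconnected
[2011-07-21 18:02:11] I [afr-common.c] 0-kvm-replicate-0: self-heal triggered
EOF
grep ' E ' /tmp/client-sample.log
```

Against a real client log, the same `grep ' E '` narrows things to the disconnect and handshake errors worth posting.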

I'm wondering if Gluster might sometimes get into trouble when there's more
than one replicated storage pair between the same two machines. These
machines have three such pairs, but mostly only one was active. Earlier
yesterday I was doing work that more heavily used another of them, and it was
in the midst of that work that the more-used one went bad.

Last night, Me:
 And what do I do to wake that disconnected endpoint in the morning?

Strangely, it's now working again for that one. It didn't display the files
at first this morning. I tried "gluster volume start kvm", which only
returned a statement that the volume was already running. I then restarted
autofs, and the files became available and have remained so.

Whit
___
Gluster-users mailing list
Gluster-users@gluster.org
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users


Re: [Gluster-users] View from one client's gone stubbornly bad

2011-07-21 Thread Whit Blauvelt
 The client on the other system in the pair continues through this to have
 normal access. The system with the Gluster client problem shows no other
 symptoms.

Update: Now the second system's having the same symptoms. 

Maybe I need to just go back to 3.1.3? 

Whit


Re: [Gluster-users] View from one client's gone stubbornly bad

2011-07-21 Thread Whit Blauvelt
Okay ...

Finally got that one replicated partition back in line. A few of the
recommended

  find /mnt/point -print0 | xargs --null stat

from each side seems to have done some good. Then while I'm away a second
replicated partition on the same two systems ends up with a 

  Transport endpoint is not connected

and even totally shutting down all the Gluster processes on that box and
restarting them does nothing for this - doesn't even create more entries in
the log for it. 
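For anyone following along, the stat sweep above can be tried on an ordinary
directory first - the demo path below is a throwaway, not a real Gluster
mount. On an actual 3.1 mountpoint, the same walk makes the replicate
translator check, and where needed heal, each file it touches:

```shell
# Demo of the self-heal trigger pattern: walk a tree and stat every entry.
# /tmp/healdemo is a stand-in for the real Gluster mountpoint.
mkdir -p /tmp/healdemo
touch /tmp/healdemo/a /tmp/healdemo/b
find /tmp/healdemo -print0 | xargs --null stat --format='%n' | sort
```

The `-print0` / `--null` pairing keeps filenames with spaces from splitting,
which matters once the sweep is run over a populated volume.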

The other two replicated Gluster shares between these machines are still
operating - including the one I first had trouble with today. But this third
one that decided it would be disconnected seems intent on staying that way -
even though it's the same physical connection between the machines - which is
fine - and the same Gluster daemons running on both.

Again, this was all happy for many weeks with 3.1.3. So I'd give pretty good
odds that 3.1.5 has some deep bugs. Should I go back, or do things finally
look better going forward? And what do I do to wake that disconnected
endpoint in the morning?
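One hedged sketch of an answer to that last question (the mountpoint and
volume names here are placeholders, nothing from this thread): test the
mountpoint with stat - a wedged glusterfs FUSE client answers with
"Transport endpoint is not connected" - and if it's stale, lazy-unmount and
remount from the client side:

```shell
# check_mount: report whether a mountpoint still answers stat().
# On a wedged Gluster FUSE mount, stat fails with ENOTCONN; the usual
# client-side fix is a lazy umount plus a remount, e.g.
#   umount -l /mnt/point && mount -t glusterfs server:/kvm /mnt/point
check_mount() {
    if stat "$1" >/dev/null 2>&1; then
        echo "OK: $1"
    else
        echo "STALE: $1"
    fi
}
check_mount /tmp   # a healthy path, for demonstration
```

The lazy (`-l`) unmount detaches the dead mount even while processes still
hold it open, which is why it's the usual first step before remounting.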

Thanks,
Whit
 


Re: [Gluster-users] View from one client's gone stubbornly bad

2011-07-21 Thread Anand Avati
Can you post gluster client logs and check if there are any core dumps?

Avati

On Fri, Jul 22, 2011 at 9:05 AM, Whit Blauvelt
whit.glus...@transpect.com wrote: