Re: [ceph-users] Help needed porting Ceph to RSockets

2013-08-12 Thread Matthew Anderson
Moving this conversation to ceph-devel where the devs might be able to shed some light on this. I've added some additional debug output to my code to narrow the issue down a bit, and the reader thread appears to be getting locked by tcp_read_wait() because rpoll never returns an event when the socket is

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-08-12 Thread Andreas Bluemle
Hi Matthew, I can confirm the behaviour which you describe. I too believe that the problem is on the client side (ceph command). My log files show the very same symptom, i.e. the client side not being able to shut down the pipes properly. (Q: I had problems yesterday sending a mail to ceph-users

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-08-13 Thread Andreas Bluemle
Hi Matthew, I found a workaround for my (our) problem: in the librdmacm code, rsocket.c, there is a global constant polling_time, which is set to 10 microseconds at the moment. I raise this to 1 - and all of a sudden things work nicely. I think we are looking at two issues here: 1. the threa

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-08-13 Thread Atchley, Scott
On Aug 13, 2013, at 10:06 AM, Andreas Bluemle wrote: > Hi Matthew, > > I found a workaround for my (our) problem: in the librdmacm > code, rsocket.c, there is a global constant polling_time, which > is set to 10 microseconds at the moment. > > I raise this to 1 - and all of a sudden things

RE: [ceph-users] Help needed porting Ceph to RSockets

2013-08-13 Thread Hefty, Sean
> > I found a workaround for my (our) problem: in the librdmacm > > code, rsocket.c, there is a global constant polling_time, which > > is set to 10 microseconds at the moment. > > > > I raise this to 1 - and all of a sudden things work nicely. > > I am adding the linux-rdma list to CC so Sean

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-08-14 Thread Andreas Bluemle
Hi, maybe some information about the environment I am working in: - CentOS 6.4 with custom kernel 3.8.13 - librdmacm / librspreload from git, tag 1.0.17 - application started with librspreload in LD_PRELOAD environment Currently, I have increased the value of the spin time by setting the default

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-08-14 Thread Atchley, Scott
On Aug 14, 2013, at 3:21 AM, Andreas Bluemle wrote: > Hi, > > maybe some information about the environment I am > working in: > > - CentOS 6.4 with custom kernel 3.8.13 > - librdmacm / librspreload from git, tag 1.0.17 > - application started with librspreload in LD_PRELOAD environment > > Cu

RE: [ceph-users] Help needed porting Ceph to RSockets

2013-08-14 Thread Hefty, Sean
> The first question I would have is: why is the rpoll() split into > these two pieces? There must have been some reason to do a busy > loop on some local state information rather than just call the > real poll() directly. As Scott mentioned in his email, this is done for performance reasons. The
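The two-piece structure asked about above (a short busy loop on local state, then a real poll()) can be sketched as follows. This is only an illustration under stated assumptions, not the librdmacm implementation: `spin_then_poll` and `is_ready` are hypothetical names, the "local state" is reduced to a caller-supplied predicate, and an ordinary kernel socket stands in for an rsocket. The 10 microsecond default mirrors the polling_time value mentioned earlier in the thread.

```python
import select
import socket
import time


def spin_then_poll(is_ready, sock, spin_us=10, timeout_ms=1000):
    """Busy-check a cheap local predicate for spin_us, then fall back to poll().

    is_ready is a hypothetical stand-in for the "local state information"
    mentioned in the thread; it is not part of any real rsockets API.
    """
    deadline = time.monotonic() + spin_us / 1_000_000
    while True:
        # Phase 1: spin on local state, avoiding a system call if possible.
        if is_ready():
            return "spin"
        if time.monotonic() >= deadline:
            break
    # Phase 2: give up spinning and block in the real poll().
    poller = select.poll()
    poller.register(sock.fileno(), select.POLLIN)
    return "poll" if poller.poll(timeout_ms) else "timeout"
```

The spin phase trades CPU time for latency: an event that arrives within the spin window is seen without entering the kernel, while a longer spin window delays the point at which the thread actually sleeps in poll().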

RE: [ceph-users] Help needed porting Ceph to RSockets

2013-08-16 Thread Hefty, Sean
> I am looking at a multithreaded application here, and I believe that > the race is between thread A calling the rpoll() for POLLIN event and > thread B calling the shutdown(SHUT_RDWR) for reading and writing of > the (r)socket almost immediately afterwards. I modified a test program, and I can r
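The access pattern described in the quoted text, one thread waiting in poll() for POLLIN while another thread calls shutdown(SHUT_RDWR) on the same socket, can be sketched with ordinary kernel sockets as below. This is a minimal sketch, assuming a Linux host and an AF_UNIX socketpair in place of rsockets; `poll_until_event` and `run_shutdown_race` are hypothetical helper names. It shows the behaviour an application such as Ceph relies on, namely that the poller is woken by the shutdown, and says nothing about how rpoll() or rshutdown() behave.

```python
import select
import socket
import threading
import time


def poll_until_event(sock, results, timeout_ms=2000):
    """Thread A: block in poll() waiting for POLLIN on sock."""
    poller = select.poll()
    poller.register(sock.fileno(), select.POLLIN)
    # poll() returns a list of (fd, eventmask); an empty list means timeout.
    results.extend(poller.poll(timeout_ms))


def run_shutdown_race(delay_s=0.05):
    """Start a poller thread, then shut the same socket down from this thread."""
    a, b = socket.socketpair()
    results = []
    reader = threading.Thread(target=poll_until_event, args=(a, results))
    reader.start()
    time.sleep(delay_s)            # give the reader time to enter poll()
    a.shutdown(socket.SHUT_RDWR)   # Thread B: shut down reading and writing
    reader.join()
    a.close()
    b.close()
    return results
```

If `run_shutdown_race()` returns an empty list, the poller timed out instead of being woken, which corresponds to the hang reported in this thread.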

RE: [ceph-users] Help needed porting Ceph to RSockets

2013-08-19 Thread Hefty, Sean
Can you see if the patch below fixes the hang? Signed-off-by: Sean Hefty --- src/rsocket.c | 11 ++- 1 files changed, 10 insertions(+), 1 deletions(-) diff --git a/src/rsocket.c b/src/rsocket.c index d544dd0..e45b26d 100644 --- a/src/rsocket.c +++ b/src/rsocket.c @@ -2948,10 +2948,12

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-08-20 Thread Andreas Bluemle
Hi Sean, I will re-check until the end of the week; there is some test scheduling issue with our test system, which affects my access times. Thanks Andreas On Mon, 19 Aug 2013 17:10:11 + "Hefty, Sean" wrote: > Can you see if the patch below fixes the hang? > > Signed-off-by: Sean Hefty

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-08-20 Thread Andreas Bluemle
Hi, I have added the patch and re-tested: I still encounter hangs of my application. I am not quite sure whether I hit the same error on the shutdown, because now I don't hit the error always, but only every now and then. When adding the patch to my code base (git tag v1.0.17) I notice an offs

RE: [ceph-users] Help needed porting Ceph to RSockets

2013-08-20 Thread Hefty, Sean
> I have added the patch and re-tested: I still encounter > hangs of my application. I am not quite sure whether > I hit the same error on the shutdown because now I don't hit > the error always, but only every now and then. I guess this is at least some progress... :/ > When adding the patc

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-08-21 Thread Matthew Anderson
Hi Sean, I tested out the patch and unfortunately had the same results as Andreas. About 50% of the time the rpoll() thread in Ceph still hangs when rshutdown() is called. I saw a similar behaviour when increasing the poll time on the pre-patched version if that's of any relevance. Thanks On Tue

RE: [ceph-users] Help needed porting Ceph to RSockets

2013-08-22 Thread Hefty, Sean
> I tested out the patch and unfortunately had the same results as > Andreas. About 50% of the time the rpoll() thread in Ceph still hangs > when rshutdown() is called. I saw a similar behaviour when increasing > the poll time on the pre-patched version if that's of any relevance. I'm not optimist

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-09-10 Thread Andreas Bluemle
Hi, after some more analysis and debugging, I found workarounds for my problems; I have added these workarounds to the last version of the patch for the poll problem by Sean; see the attachment to this posting. The shutdown() operations below are all SHUT_RDWR. 1. shutdown() on side A of a conne

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-09-12 Thread Gandalf Corvotempesta
2013/9/10 Andreas Bluemle : > Since I have added these workarounds to my version of the librdmacm > library, I can at least start up ceph using LD_PRELOAD and end up in > a healthy ceph cluster state. Have you seen any performance improvement by using LD_PRELOAD with ceph? Which throughput are you

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-09-12 Thread Andreas Bluemle
On Thu, 12 Sep 2013 12:20:03 +0200 Gandalf Corvotempesta wrote: > 2013/9/10 Andreas Bluemle : > > Since I have added these workarounds to my version of the librdmacm > > library, I can at least start up ceph using LD_PRELOAD and end up in > > a healthy ceph cluster state. > > Have you seen any p

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-09-16 Thread Gandalf Corvotempesta
2013/9/12 Andreas Bluemle : > I have not yet done any performance testing. > > The next step I have to take is more related to setting up > a larger cluster with sth. like 150 osd's without hitting any > resource limitations. How do you manage failover? Will you use multiple HBA (or dual-port HBA

RE: [ceph-users] Help needed porting Ceph to RSockets

2013-09-20 Thread Hefty, Sean
> I would not call these workarounds a real fix, but they should point > out the problems which I am trying to solve. Thanks for the update. I haven't had the time to investigate this, but did want to at least acknowledge that this hasn't gotten lost. - Sean

RE: [ceph-users] Help needed porting Ceph to RSockets

2013-10-30 Thread Hefty, Sean
Sorry it took so long to get to this. > after some more analysis and debugging, I found > workarounds for my problems; I have added these workarounds > to the last version of the patch for the poll problem by Sean; > see the attachment to this posting. > > The shutdown() operations below are all

Re: [ceph-users] Help needed porting Ceph to RSockets

2014-02-05 Thread Gandalf Corvotempesta
2013-10-31 Hefty, Sean : > Can you please try the attached patch in place of all previous patches? Any updates on ceph with rsockets?