Re: [ceph-users] Help needed porting Ceph to RSockets

2014-02-05 Thread Gandalf Corvotempesta
2013-10-31 Hefty, Sean sean.he...@intel.com: Can you please try the attached patch in place of all previous patches? Any updates on ceph with rsockets? -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info

RE: [ceph-users] Help needed porting Ceph to RSockets

2013-09-20 Thread Hefty, Sean
I would not call these workarounds a real fix, but they should point out the problems which I am trying to solve. Thanks for the update. I haven't had the time to investigate this, but did want to at least acknowledge that this hasn't gotten lost. - Sean

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-09-12 Thread Gandalf Corvotempesta
2013/9/10 Andreas Bluemle andreas.blue...@itxperts.de: Since I have added these workarounds to my version of the librdmacm library, I can at least start up ceph using LD_PRELOAD and end up in a healthy ceph cluster state. Have you seen any performance improvement by using LD_PRELOAD with ceph?

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-09-12 Thread Andreas Bluemle
On Thu, 12 Sep 2013 12:20:03 +0200 Gandalf Corvotempesta gandalf.corvotempe...@gmail.com wrote: 2013/9/10 Andreas Bluemle andreas.blue...@itxperts.de: Since I have added these workarounds to my version of the librdmacm library, I can at least start up ceph using LD_PRELOAD and end up in a

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-09-10 Thread Andreas Bluemle
Hi, after some more analysis and debugging, I found workarounds for my problems; I have added these workarounds to the last version of the patch for the poll problem by Sean; see the attachment to this posting. The shutdown() operations below are all SHUT_RDWR. 1. shutdown() on side A of a

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-08-21 Thread Matthew Anderson
Hi Sean, I tested out the patch and unfortunately had the same results as Andreas. About 50% of the time the rpoll() thread in Ceph still hangs when rshutdown() is called. I saw a similar behaviour when increasing the poll time on the pre-patched version if that's of any relevance. Thanks On

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-08-20 Thread Andreas Bluemle
Hi Sean, I will re-check by the end of the week; there is a test scheduling issue with our test system, which affects my access times. Thanks Andreas On Mon, 19 Aug 2013 17:10:11 + Hefty, Sean sean.he...@intel.com wrote: Can you see if the patch below fixes the hang?

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-08-20 Thread Andreas Bluemle
Hi, I have added the patch and re-tested: I still encounter hangs of my application. I am not quite sure whether I hit the same error on the shutdown, because now I don't always hit the error, but only every now and then. When adding the patch to my code base (git tag v1.0.17) I notice an

RE: [ceph-users] Help needed porting Ceph to RSockets

2013-08-20 Thread Hefty, Sean
I have added the patch and re-tested: I still encounter hangs of my application. I am not quite sure whether I hit the same error on the shutdown, because now I don't always hit the error, but only every now and then. I guess this is at least some progress... :/ When adding the patch to

RE: [ceph-users] Help needed porting Ceph to RSockets

2013-08-19 Thread Hefty, Sean
Can you see if the patch below fixes the hang? Signed-off-by: Sean Hefty sean.he...@intel.com --- src/rsocket.c | 11 ++- 1 files changed, 10 insertions(+), 1 deletions(-) diff --git a/src/rsocket.c b/src/rsocket.c index d544dd0..e45b26d 100644 --- a/src/rsocket.c +++ b/src/rsocket.c

RE: [ceph-users] Help needed porting Ceph to RSockets

2013-08-16 Thread Hefty, Sean
I am looking at a multithreaded application here, and I believe that the race is between thread A calling the rpoll() for POLLIN event and thread B calling the shutdown(SHUT_RDWR) for reading and writing of the (r)socket almost immediately afterwards. I modified a test program, and I can
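
For readers who want to reproduce the pattern described here, the following is a minimal sketch and not the test program mentioned in the message. It uses ordinary kernel sockets (socketpair(), poll(), shutdown()) rather than rsockets, and the names poll_result, poll_thread and run_shutdown_race are hypothetical. Thread A polls for POLLIN while the caller, acting as thread B, calls shutdown(SHUT_RDWR) on the same socket right away. With Linux kernel sockets the poller is expected to wake up with POLLIN and/or POLLHUP set; the 2000 ms timeout is an assumption chosen so that a missed wakeup is reported as ret == 0 instead of a hang.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

struct poll_result {
    int fd;        /* socket polled by thread A */
    int ret;       /* return value of poll() */
    short revents; /* events reported by poll() */
};

/* Thread A: wait for POLLIN with a bounded timeout, so that a missed
 * wakeup shows up as ret == 0 instead of a permanent hang. */
static void *poll_thread(void *arg)
{
    struct poll_result *res = arg;
    struct pollfd pfd = { .fd = res->fd, .events = POLLIN };

    res->ret = poll(&pfd, 1, 2000);
    res->revents = pfd.revents;
    return NULL;
}

/* Hypothetical driver: returns 0 if the experiment ran, -1 on setup error. */
static int run_shutdown_race(struct poll_result *res)
{
    int sv[2];
    pthread_t tid;

    assert(res != NULL);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    res->fd = sv[0];
    res->ret = -1;
    res->revents = 0;

    if (pthread_create(&tid, NULL, poll_thread, res) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Thread B (here: the caller) shuts the socket down almost at once.
     * It may run before or after thread A has entered poll(). */
    shutdown(sv[0], SHUT_RDWR);

    pthread_join(tid, NULL);
    close(sv[0]);
    close(sv[1]);
    return 0;
}
```

To try the same pattern against rsockets, the program can be started with librspreload in LD_PRELOAD, as described elsewhere in this thread; a result of ret == 0 would then indicate that the shutdown did not wake the poller.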

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-08-14 Thread Andreas Bluemle
Hi, maybe some information about the environment I am working in:
- CentOS 6.4 with custom kernel 3.8.13
- librdmacm / librspreload from git, tag 1.0.17
- application started with librspreload in LD_PRELOAD environment
Currently, I have increased the value of the spin time by setting the

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-08-14 Thread Atchley, Scott
On Aug 14, 2013, at 3:21 AM, Andreas Bluemle andreas.blue...@itxperts.de wrote: Hi, maybe some information about the environment I am working in:
- CentOS 6.4 with custom kernel 3.8.13
- librdmacm / librspreload from git, tag 1.0.17
- application started with librspreload in LD_PRELOAD

RE: [ceph-users] Help needed porting Ceph to RSockets

2013-08-14 Thread Hefty, Sean
The first question I would have is: why is the rpoll() split into these two pieces? There must have been some reason to do a busy loop on some local state information rather than just call the real poll() directly. As Scott mentioned in his email, this is done for performance reasons. The
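
The two-piece structure asked about here (a short busy loop followed by a real poll()) can be illustrated with a small sketch. This is not the librdmacm implementation: it uses the plain poll() call on kernel file descriptors, and the names elapsed_usec, spin_then_poll and the spin_usec parameter are hypothetical. The idea is that a brief spin catches events that arrive within a few microseconds without paying for a sleep and wakeup, while the blocking call avoids burning CPU when nothing arrives.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <time.h>

/* Microseconds elapsed since `start` on the monotonic clock. */
static long elapsed_usec(const struct timespec *start)
{
    struct timespec now;

    clock_gettime(CLOCK_MONOTONIC, &now);
    return (now.tv_sec - start->tv_sec) * 1000000L +
           (now.tv_nsec - start->tv_nsec) / 1000L;
}

/*
 * Hypothetical helper, not librdmacm code: spin on a non-blocking check
 * for up to spin_usec microseconds, then fall back to a blocking poll().
 */
static int spin_then_poll(struct pollfd *fds, nfds_t nfds,
                          int timeout_ms, long spin_usec)
{
    struct timespec start;
    int ret;

    assert(fds != NULL);
    clock_gettime(CLOCK_MONOTONIC, &start);

    /* Phase 1: busy loop; poll() with timeout 0 returns immediately. */
    do {
        ret = poll(fds, nfds, 0);
        if (ret != 0)
            return ret; /* an event is ready, or an error occurred */
    } while (elapsed_usec(&start) < spin_usec);

    /* Phase 2: nothing arrived during the spin, so block in the kernel. */
    return poll(fds, nfds, timeout_ms);
}
```

In this sketch, a larger spin_usec keeps the caller in the busy loop longer before it blocks, which trades CPU time for lower wakeup latency.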

Re: [ceph-users] Help needed porting Ceph to RSockets

2013-08-13 Thread Atchley, Scott
On Aug 13, 2013, at 10:06 AM, Andreas Bluemle andreas.blue...@itxperts.de wrote: Hi Matthew, I found a workaround for my (our) problem: in the librdmacm code, rsocket.c, there is a global constant polling_time, which is set to 10 microseconds at the moment. I raise this to 1 - and

RE: [ceph-users] Help needed porting Ceph to RSockets

2013-08-13 Thread Hefty, Sean
I found a workaround for my (our) problem: in the librdmacm code, rsocket.c, there is a global constant polling_time, which is set to 10 microseconds at the moment. I raise this to 1 - and all of a sudden things work nicely. I am adding the linux-rdma list to CC so Sean might see