Re: Seg Fault on rgw 0.61.1 with cluster in 0.61

2013-05-16 Thread Sage Weil
On Tue, 14 May 2013, Faidon Liambotis wrote:
 On 05/10/13 19:02, Yehuda Sadeh wrote:
  Sounds to me like a package versioning mismatch. Could it be that one
  of the ceph packages was on a different version (e.g., librados)?
 
 I attempted to install and run radosgw 0.61.1 on a system with a 0.56.4
 librados and it segfaulted with the same backtrace as the one in this thread.
 
 If a newer radosgw can't work with an older librados, this should be reflected
 in the package relationships -- hopefully without nasty Breaks/Conflicts, but
 with a proper librados SONAME bump that will allow co-installability between
 librados2 and e.g. librados3. Or symbol versioning could be employed to
 provide backwards compatibility.
 
 This installed-but-segfaulting combination of packages shouldn't be allowed by
 apt to exist on the system. FWIW, if these were packages in Debian (and,
 presumably, Ubuntu), that would be a severity: serious/release critical bug.

I believe this is actually a problem with radosgw statically linking some 
of the same stuff that librados includes, and not with the librados ABI 
changes.  We need to fix that somehow. In the meantime, though, setting 
the radosgw package to require a matching librados2 ought to do the trick.

 It'd also be nice to be able to do things like mixing newer radosgw while also
 keeping the old librados2 on the system. My use case is that I have monitors
 and radosgw on the same boxes and I'd like to keep monitors on bobtail, while
 at the same time using some of the much-needed radosgw cuttlefish features.

ceph-common needs to match the librados2 version, but ceph (which contains 
ceph-mon) does not, so you should be able to have different ceph-mon and 
radosgw versions if we do the above.
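
Until the packaging expresses these constraints, they can be checked by hand. The sketch below is illustrative only: it parses the output of `dpkg-query -W -f='${Package} ${Version}\n'` and reports pairs that should be at the same version but are not. The function names, the pair list, and the sample version strings are assumptions made for this example, not anything shipped with Ceph.

```python
# Hypothetical helper: flag package pairs whose installed versions differ.
# Input is the text printed by:
#   dpkg-query -W -f='${Package} ${Version}\n' radosgw librados2 ceph-common ceph

# Pairs that, per the discussion above, should be at the same version.
# The "ceph" package (which contains ceph-mon) is deliberately not listed.
MUST_MATCH = [("radosgw", "librados2"), ("ceph-common", "librados2")]


def parse_versions(dpkg_output):
    """Turn 'package version' lines into a {package: version} dict."""
    versions = {}
    for line in dpkg_output.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank lines and packages with no version
            versions[parts[0]] = parts[1]
    return versions


def find_mismatches(versions, pairs=MUST_MATCH):
    """Return (pkg_a, ver_a, pkg_b, ver_b) for each installed pair that differs."""
    mismatches = []
    for a, b in pairs:
        if a in versions and b in versions and versions[a] != versions[b]:
            mismatches.append((a, versions[a], b, versions[b]))
    return mismatches


if __name__ == "__main__":
    # Made-up version strings mirroring the combination reported in this thread.
    sample = "radosgw 0.61.1\nlibrados2 0.56.4\nceph-common 0.56.4\nceph 0.56.4\n"
    for a, va, b, vb in find_mismatches(parse_versions(sample)):
        print(f"{a} {va} does not match {b} {vb}")
```

A strict string comparison is used on purpose: the interim fix discussed above is an exact version match, so no ordering of versions is needed.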

sage
--
To unsubscribe from this list: send the line unsubscribe ceph-devel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Seg Fault on rgw 0.61.1 with cluster in 0.61

2013-05-14 Thread Faidon Liambotis

On 05/10/13 19:02, Yehuda Sadeh wrote:

Sounds to me like a package versioning mismatch. Could it be that one
of the ceph packages was on a different version (e.g., librados)?


I attempted to install and run radosgw 0.61.1 on a system with a 0.56.4 
librados and it segfaulted with the same backtrace as the one in this 
thread.


If a newer radosgw can't work with an older librados, this should be 
reflected in the package relationships -- hopefully without nasty 
Breaks/Conflicts, but with a proper librados SONAME bump that will allow 
co-installability between librados2 and e.g. librados3. Or symbol versioning 
could be employed to provide backwards compatibility.
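
To make the SONAME point concrete: under the Debian shared-library naming convention, the runtime package name is derived from the SONAME, so a bump from `librados.so.2` to a hypothetical `librados.so.3` yields a differently named package that can sit alongside the old one. The sketch below shows only that mapping; the function name is made up for this example and `librados.so.3` is an assumed future SONAME, not an existing one.

```python
import re


def debian_lib_package(soname):
    """Map a SONAME such as 'librados.so.2' to its conventional package name."""
    m = re.fullmatch(r"(.+)\.so\.(\d+)", soname)
    if m is None:
        raise ValueError(f"unexpected SONAME: {soname!r}")
    name, soversion = m.groups()
    # Convention: add a hyphen when the library name itself ends in a digit,
    # turn underscores into hyphens, and lowercase the result.
    sep = "-" if name[-1].isdigit() else ""
    return f"{name}{sep}{soversion}".replace("_", "-").lower()


if __name__ == "__main__":
    for soname in ("librados.so.2", "librados.so.3"):
        print(soname, "->", debian_lib_package(soname))
```

Because the two package names differ, apt would treat them as separate packages and both library versions could be installed together.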


This installed-but-segfaulting combination of packages shouldn't be 
allowed by apt to exist on the system. FWIW, if these were packages in 
Debian (and, presumably, Ubuntu), that would be a severity: 
serious/release critical bug.


It'd also be nice to be able to do things like mixing newer radosgw 
while also keeping the old librados2 on the system. My use case is that 
I have monitors and radosgw on the same boxes and I'd like to keep 
monitors on bobtail, while at the same time using some of the much-needed 
radosgw cuttlefish features.


Regards,
Faidon


RE: Seg Fault on rgw 0.61.1 with cluster in 0.61

2013-05-11 Thread Yann ROBIN
Hi,

We're using Ubuntu 12.04.2 with kernel 3.2.0-41-virtual.
I'll try removing the dbg package and installing your branch on Monday.


-Original Message-
From: Sage Weil [mailto:s...@inktank.com]
Sent: Friday, 10 May 2013 17:30
To: Yann ROBIN
Cc: ceph-devel@vger.kernel.org
Subject: RE: Seg Fault on rgw 0.61.1 with cluster in 0.61

On Fri, 10 May 2013, Yann ROBIN wrote:
 I've downgraded the rgw to 0.60 and I still had the same issue.

Yeah, the radosgw code is essentially identical between the two versions (there 
is a new config option but it is unused).
 
 I returned to 0.61.1 and noticed that sometimes (about 1 time in 10) radosgw
 starts normally.
 I've installed the debug package of librados to do some debugging and now it
 always works...

That is disconcerting.

I've pushed wip-rgw-crash, which prints some debug information in
ceph::crypto::init... do you mind installing that (without debug packages :)
and seeing if you can reproduce the problem?  It should print out a couple
of lines to stdout, but you need to run radosgw with the '-f' option (which
prevents fork).  Hopefully the problem is reproducible in that case.

What distro are you running?
sage


  
 -Original Message-
 From: ceph-devel-ow...@vger.kernel.org 
 [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Yann ROBIN
 Sent: Friday, 10 May 2013 10:51
 To: ceph-devel@vger.kernel.org
 Subject: Seg Fault on rgw 0.61.1 with cluster in 0.61
 
 Hi,
 
 I've tried to update the rgw to 0.61.1 and now I have a segfault while 
 connecting to the 0.61 cluster.
 The rgw with version 0.61 runs fine.
 
 *** Caught signal (Segmentation fault) **
  in thread 7fc1fec79780
  ceph version 0.61.1 (56c4847ba82a92023700e2d4920b59cdaf23428d)
  1: /usr/bin/radosgw() [0x4f19da]
  2: (()+0xfcb0) [0x7fc1fcf0dcb0]
  3: (ceph::crypto::init(CephContext*)+0xf) [0x7fc1fdfeb2ef]
  4: (common_init_finish(CephContext*)+0x23) [0x7fc1fdfc33f3]
  5: (librados::RadosClient::connect()+0x1d) [0x7fc1fde1d48d]
  6: (RGWRados::initialize()+0x53) [0x5b5c03]
  7: (RGWStoreManager::init_storage_provider(CephContext*, bool)+0x2c9) [0x5b9b39]
  8: (main()+0x2d7) [0x4b4ed7]
  9: (__libc_start_main()+0xed) [0x7fc1fb90176d]
  10: /usr/bin/radosgw() [0x4b6db1]
 2013-05-10 10:36:39.749439 7fc1fec79780 -1 *** Caught signal (Segmentation fault) **
  in thread 7fc1fec79780
 
 --
 Yann ROBIN
 YouScribe
 
 


RE: Seg Fault on rgw 0.61.1 with cluster in 0.61

2013-05-10 Thread Yann ROBIN
I've downgraded the rgw to 0.60 and I still had the same issue.

I returned to 0.61.1 and noticed that sometimes (about 1 time in 10) radosgw
starts normally.
I've installed the debug package of librados to do some debugging and now it
always works...

-Original Message-
From: ceph-devel-ow...@vger.kernel.org 
[mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Yann ROBIN
Sent: Friday, 10 May 2013 10:51
To: ceph-devel@vger.kernel.org
Subject: Seg Fault on rgw 0.61.1 with cluster in 0.61

Hi,

I've tried to update the rgw to 0.61.1 and now I have a segfault while 
connecting to the 0.61 cluster.
The rgw with version 0.61 runs fine.

*** Caught signal (Segmentation fault) **
 in thread 7fc1fec79780
 ceph version 0.61.1 (56c4847ba82a92023700e2d4920b59cdaf23428d)
 1: /usr/bin/radosgw() [0x4f19da]
 2: (()+0xfcb0) [0x7fc1fcf0dcb0]
 3: (ceph::crypto::init(CephContext*)+0xf) [0x7fc1fdfeb2ef]
 4: (common_init_finish(CephContext*)+0x23) [0x7fc1fdfc33f3]
 5: (librados::RadosClient::connect()+0x1d) [0x7fc1fde1d48d]
 6: (RGWRados::initialize()+0x53) [0x5b5c03]
 7: (RGWStoreManager::init_storage_provider(CephContext*, bool)+0x2c9) [0x5b9b39]
 8: (main()+0x2d7) [0x4b4ed7]
 9: (__libc_start_main()+0xed) [0x7fc1fb90176d]
 10: /usr/bin/radosgw() [0x4b6db1]
2013-05-10 10:36:39.749439 7fc1fec79780 -1 *** Caught signal (Segmentation fault) **
 in thread 7fc1fec79780

--
Yann ROBIN
YouScribe
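
The frame addresses in the backtrace above hint at where the crash sits: frames 3 to 5 carry high addresses of the kind normally used for shared libraries, while frames 6 to 8 carry low addresses of the kind used by a non-relocatable executable. The sketch below parses the frames and applies that rule of thumb. The threshold is an assumption about a typical x86-64 Linux address layout, not something stated in the report, and the helper names are made up for this example.

```python
import re

# Frames copied from the report above (frame 7 rejoined onto one line).
BACKTRACE = """\
 1: /usr/bin/radosgw() [0x4f19da]
 2: (()+0xfcb0) [0x7fc1fcf0dcb0]
 3: (ceph::crypto::init(CephContext*)+0xf) [0x7fc1fdfeb2ef]
 4: (common_init_finish(CephContext*)+0x23) [0x7fc1fdfc33f3]
 5: (librados::RadosClient::connect()+0x1d) [0x7fc1fde1d48d]
 6: (RGWRados::initialize()+0x53) [0x5b5c03]
 7: (RGWStoreManager::init_storage_provider(CephContext*, bool)+0x2c9) [0x5b9b39]
 8: (main()+0x2d7) [0x4b4ed7]
 9: (__libc_start_main()+0xed) [0x7fc1fb90176d]
 10: /usr/bin/radosgw() [0x4b6db1]
"""

# "<frame number>: <symbol text> [0x<hex address>]"
FRAME_RE = re.compile(r"^\s*(\d+): (.*) \[0x([0-9a-f]+)\]$")

# Assumption: addresses at or above this value are mapped shared objects.
SHARED_OBJECT_FLOOR = 0x7F0000000000


def parse_frames(text):
    """Return a list of (frame_number, symbol_text, address) tuples."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            number, symbol, address = m.groups()
            frames.append((int(number), symbol, int(address, 16)))
    return frames


def origin(address):
    """Guess whether an address lies in a shared object or the executable."""
    return "shared object" if address >= SHARED_OBJECT_FLOOR else "executable"


if __name__ == "__main__":
    for number, symbol, address in parse_frames(BACKTRACE):
        print(f"{number:2d} {origin(address):13s} {symbol}")
```

Read this way, the faulting call to ceph::crypto::init is reached through librados::RadosClient::connect in a shared object rather than through code in the radosgw binary itself, which fits the later suggestion that the installed library version matters.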




RE: Seg Fault on rgw 0.61.1 with cluster in 0.61

2013-05-10 Thread Sage Weil
On Fri, 10 May 2013, Yann ROBIN wrote:
 I've downgraded the rgw to 0.60 and I still had the same issue.

Yeah, the radosgw code is essentially identical between the two versions 
(there is a new config option but it is unused).
 
 I returned to 0.61.1 and noticed that sometimes (about 1 time in 10) radosgw
 starts normally.
 I've installed the debug package of librados to do some debugging and now it
 always works...

That is disconcerting.

I've pushed wip-rgw-crash, which prints some debug information in 
ceph::crypto::init... do you mind installing that (without debug packages 
:) and seeing if you can reproduce the problem?  It should print out a 
couple of lines to stdout, but you need to run radosgw with the '-f' 
option (which prevents fork).  Hopefully the problem is reproducible in 
that case.
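
Since the crash is intermittent, it may help to start the daemon in the foreground several times and tally how each attempt ends. The following is a rough sketch of such a loop, not a tested reproduction recipe: the helper names and the timeout are assumptions, and the demo uses a harmless stand-in command. For the real case the command would be radosgw with '-f' plus whatever options the local setup needs.

```python
import subprocess
import sys


def classify(returncode):
    """Describe a return code; subprocess reports death by signal N as -N."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        return f"killed by signal {-returncode}"  # SIGSEGV is 11 on Linux
    return f"exit status {returncode}"


def run_once(cmd, timeout=None):
    """Run cmd in the foreground; return (outcome, captured stdout lines)."""
    try:
        proc = subprocess.run(cmd, stdout=subprocess.PIPE, text=True,
                              timeout=timeout)
    except subprocess.TimeoutExpired:
        # A daemon that started normally keeps running until the timeout,
        # at which point subprocess.run() kills it and raises.
        return "still running at timeout", []
    return classify(proc.returncode), proc.stdout.splitlines()


def tally(cmd, attempts, timeout=None):
    """Run cmd repeatedly and count how often each outcome occurs."""
    counts = {}
    for _ in range(attempts):
        outcome, _lines = run_once(cmd, timeout=timeout)
        counts[outcome] = counts.get(outcome, 0) + 1
    return counts


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere.
    demo = [sys.executable, "-c", "print('started')"]
    print(tally(demo, attempts=5, timeout=10))
```

With a timeout set, an attempt that is still running when the timeout expires counts as a normal start, and an attempt killed by signal 11 counts as the segfault.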

What distro are you running?
sage


  
 -Original Message-
 From: ceph-devel-ow...@vger.kernel.org 
 [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Yann ROBIN
 Sent: Friday, 10 May 2013 10:51
 To: ceph-devel@vger.kernel.org
 Subject: Seg Fault on rgw 0.61.1 with cluster in 0.61
 
 Hi,
 
 I've tried to update the rgw to 0.61.1 and now I have a segfault while 
 connecting to the 0.61 cluster.
 The rgw with version 0.61 runs fine.
 
 *** Caught signal (Segmentation fault) **
  in thread 7fc1fec79780
  ceph version 0.61.1 (56c4847ba82a92023700e2d4920b59cdaf23428d)
  1: /usr/bin/radosgw() [0x4f19da]
  2: (()+0xfcb0) [0x7fc1fcf0dcb0]
  3: (ceph::crypto::init(CephContext*)+0xf) [0x7fc1fdfeb2ef]
  4: (common_init_finish(CephContext*)+0x23) [0x7fc1fdfc33f3]
  5: (librados::RadosClient::connect()+0x1d) [0x7fc1fde1d48d]
  6: (RGWRados::initialize()+0x53) [0x5b5c03]
  7: (RGWStoreManager::init_storage_provider(CephContext*, bool)+0x2c9) [0x5b9b39]
  8: (main()+0x2d7) [0x4b4ed7]
  9: (__libc_start_main()+0xed) [0x7fc1fb90176d]
  10: /usr/bin/radosgw() [0x4b6db1]
 2013-05-10 10:36:39.749439 7fc1fec79780 -1 *** Caught signal (Segmentation fault) **
  in thread 7fc1fec79780
 
 --
 Yann ROBIN
 YouScribe
 
 