Re: [ceph-users] segmentation fault when using librbd interface

2015-11-02 Thread Jason Dillaman
I'd recommend running your program through valgrind first to see if something 
pops out immediately.
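For example, a minimal memcheck run (the binary name here is only a placeholder for
your test program) would be:

$ valgrind --tool=memcheck --track-origins=yes ./your_rbd_test

Out-of-bounds writes to the malloc'd read buffer, for instance, usually show up
immediately as "Invalid write" errors.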

-- 

Jason Dillaman 


- Original Message - 

> From: "min fang" 
> To: ceph-users@lists.ceph.com
> Sent: Saturday, October 31, 2015 10:43:22 PM
> Subject: Re: [ceph-users] segmentation fault when using librbd interface

> The segmentation fault seems to happen in the rbd_read function: I can see my code
> call this function and then get the segmentation fault, which means rbd_read
> had not completed successfully when the segmentation fault happened.

> 2015-11-01 10:34 GMT+08:00 min fang < louisfang2...@gmail.com > :

> > Hi, my code gets a segmentation fault when using librbd to do synchronous read IO.
> > From the trace, I can see that several read IOs completed successfully, but the
> > last read IO (2015-10-31 08:56:34.804383) never returned and my code got a
> > segmentation fault. I used the rbd_read interface and malloc'd a buffer for the
> > read data.
> 

> > Can anybody help with this? Thanks.
> 

> > 2015-10-31 08:56:34.750411 7f04bcbdc7c0 20 librbd: read 0x17896d0 off = 0
> > len
> > = 4096
> 
> > 2015-10-31 08:56:34.750436 7f04bcbdc7c0 20 librbd: aio_read 0x17896d0
> > completion 0x1799440 [0,4096]
> 
> > 2015-10-31 08:56:34.750442 7f04bcbdc7c0 20 librbd: ictx_check 0x17896d0
> 
> > 2015-10-31 08:56:34.750451 7f04bcbdc7c0 20 librbd::AsyncOperation:
> > 0x1799570
> > start_op
> 
> > 2015-10-31 08:56:34.750453 7f04bcbdc7c0 20 librbd: oid
> > rb.0.8597.2ae8944a. 0~4096 from [0,4096]
> 
> > 2015-10-31 08:56:34.750457 7f04bcbdc7c0 10 librbd::ImageCtx:
> > prune_parent_extents image overlap 0, object overlap 0 from image extents
> > []
> 
> > 2015-10-31 08:56:34.750462 7f04bcbdc7c0 20 librbd::AioRequest: send
> > 0x1799c60
> > rb.0.8597.2ae8944a. 0~4096
> 
> > 2015-10-31 08:56:34.750498 7f04bcbdc7c0 1 -- 192.168.90.240:0/1006544 -->
> > 192.168.90.253:6801/2041 -- osd_op(client.34253.0:92
> > rb.0.8597.2ae8944a. [sparse-read 0~4096] 2.7cf90552
> > ack+read+known_if_redirected e30) v5 -- ?+0 0x179b890 con 0x17877b0
> 
> > 2015-10-31 08:56:34.750526 7f04bcbdc7c0 20 librbd::AioCompletion:
> > AioCompletion::finish_adding_requests 0x1799440 pending 1
> 
> > 2015-10-31 08:56:34.780308 7f04b0bb5700 1 -- 192.168.90.240:0/1006544 <==
> > osd.0 192.168.90.253:6801/2041 5  osd_op_reply(92
> > rb.0.8597.2ae8944a. [sparse-read 0~4096] v0'0 uv8 ondisk = 0)
> > v6
> >  198+0+4120 (3153096351 0 1287205638) 0x7f0494001ce0 con 0x17877b0
> 
> > 2015-10-31 08:56:34.780408 7f04b14b7700 20 librbd::AioRequest:
> > should_complete 0x1799c60 rb.0.8597.2ae8944a. 0~4096 r = 0
> 
> > 2015-10-31 08:56:34.780418 7f04b14b7700 20 librbd::AioRequest:
> > should_complete 0x1799c60 READ_FLAT
> 
> > 2015-10-31 08:56:34.780420 7f04b14b7700 20 librbd::AioRequest: complete
> > 0x1799c60
> 
> > 2015-10-31 08:56:34.780421 7f04b14b7700 10 librbd::AioCompletion:
> > C_AioRead::finish() 0x1793710 r = 0
> 
> > 2015-10-31 08:56:34.780422 7f04b14b7700 10 librbd::AioCompletion: got
> > {0=4096} for [0,4096] bl 4096
> 
> > 2015-10-31 08:56:34.780432 7f04b14b7700 20 librbd::AioCompletion:
> > AioCompletion::complete_request() 0x1799440 complete_cb=0x7f04ba2b1240
> > pending 1
> 
> > 2015-10-31 08:56:34.780434 7f04b14b7700 20 librbd::AioCompletion:
> > AioCompletion::finalize() 0x1799440 rval 4096 read_buf 0x179a5e0 read_bl 0
> 
> > 2015-10-31 08:56:34.780440 7f04b14b7700 20 librbd::AioCompletion:
> > AioCompletion::finalize() copied resulting 4096 bytes to 0x179a5e0
> 
> > 2015-10-31 08:56:34.780442 7f04b14b7700 20 librbd::AsyncOperation:
> > 0x1799570
> > finish_op
> 
> > 2015-10-31 08:56:34.780766 7f04bcbdc7c0 20 librbd: read 0x17896d0 off =
> > 4096
> > len = 4096
> 
> > 2015-10-31 08:56:34.780778 7f04bcbdc7c0 20 librbd: aio_read 0x17896d0
> > completion 0x1799440 [4096,4096]
> 
> > 2015-10-31 08:56:34.780781 7f04bcbdc7c0 20 librbd: ictx_check 0x17896d0
> 
> > 2015-10-31 08:56:34.780786 7f04bcbdc7c0 20 librbd::AsyncOperation:
> > 0x1799570
> > start_op
> 
> > 2015-10-31 08:56:34.780788 7f04bcbdc7c0 20 librbd: oid
> > rb.0.8597.2ae8944a. 4096~4096 from [0,4096]
> 
> > 2015-10-31 08:56:34.780790 7f04bcbdc7c0 10 librbd::ImageCtx:
> > prune_parent_extents image overlap 0, object overlap 0 from image extents
> > []
> 
> > 2015-10-31 08:56:34.780793 7f04bcbdc7c0 20 librbd::AioRequest: send
> > 0x179bcc0
> > rb.0.8597.2ae8944a. 4096~4096
> 
> > 2015-10-31 08:56:34.780813 7f04bcbdc7c0 1 -- 192.168.90.240:0/1006544 -->
> > 192.168.90.253:6801/2041 -- osd_op(client.34253.0:93
> > rb.0.8597.2ae8944a. [sparse-read 4096~4096] 2.7cf90552
> > ack+read+known_if_redirected e30) v5 -- ?+0 0x179b5f0 con 0x17877b0
> 
> > 2015-10-31 08:56:34.780833 7f04bcbdc7c0 20 librbd::AioCompletion:
> > AioCompletion::finish_adding_requests 0x1799440 pending 1
> 
> > 2015-10-31 08:56:34.800847 7f04b0bb5700 1 -- 192.168.90.240:0/1006544 <==
> > osd.0 

Re: [ceph-users] Changing CRUSH map ids

2015-11-02 Thread Loris Cuoghi

Thanks Greg :)

For the OSDs, I understand. On the other hand, for intermediate 
abstractions like hosts, racks and rooms, do you agree that it should 
currently be possible to change the IDs (always under the "one change at 
a time, I promise mom" rule)?


Clearly, a good amount of shuffling should be expected as a consequence.
Basically I was inquiring whether changing the id of a single host would 
shuffle the entirety (or a relatively big chunk) of the cluster data, or 
if the shuffling was limited to a direct proportion of the item's weight.


I just --test-ed with crushtool. I changed a host's id, and tested the 
two maps with:


crushtool -i crush.map --test --show-statistics --rule 0 --num-rep 3 
--min-x 1 --max-x $N --show-mappings


(with $N varying from as little as 32 to "big numbers"TM) shows that 
nearly 50% of the mappings changed in a 10-host cluster.
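
In case it is useful to others, a rough way to count the changed mappings (the 
file names here are placeholders for the original and the edited map) is:

$ crushtool -i crush.map --test --rule 0 --num-rep 3 --min-x 1 --max-x 1024 --show-mappings > before.txt
$ crushtool -i crush.new.map --test --rule 0 --num-rep 3 --min-x 1 --max-x 1024 --show-mappings > after.txt
$ diff before.txt after.txt | grep -c '^>'

The last command prints how many of the 1024 test values ended up mapped to a 
different set of OSDs.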


Thanks All :)


On 02/11/2015 16:14, Gregory Farnum wrote:

Regardless of what the crush tool does, I wouldn't muck around with the
IDs of the OSDs. The rest of Ceph will probably not handle it well if
the crush IDs don't match the OSD numbers.
-Greg

On Monday, November 2, 2015, Loris Cuoghi wrote:

On 02/11/2015 12:47, Wido den Hollander wrote:



On 02-11-15 12:30, Loris Cuoghi wrote:

Hi All,

We're currently on version 0.94.5 with three monitors and 75 OSDs.

I've peeked at the decompiled CRUSH map, and I see that all ids are
commented with '# Here be dragons!', or more literally: '# do not
change unnecessarily'.

Now, what would happen if an incautious user were to put his chubby
fingers on these ids, totally disregarding the warning at the entrance
of the cave, and change one of them?

Data shuffle? (Relative to the allocation of PGs for the
OSD/host/other item?)

A *big* data shuffle? (ALL data would need to have its position
recalculated, with immediate end-of-the-world data shuffle?)

Nothing at all? (And the big fat warning is there only to make fun of
the uninstructed ones? Not plausible...)


Give it a try! Download the CRUSHMap and run tests on it with
crushtool:

$ crushtool -i mycrushmap --test --rule 0 --num-rep 3
--show-statistics

Now, change the map, compile it and run again:

$ crushtool -i mycrushmap.new --test --rule 0 --num-rep 3
--show-statistics

Check the differences and you get the idea of how much has changed.

Wido


Thanks Wido ! :)

Thanks !

Loris


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] all three mons segfault at same time

2015-11-02 Thread Arnulf Heimsbakk
When I did an "unset noout" on the cluster, all three mons got a
segmentation fault, then continued as if nothing had happened. Regular
segmentation faults started on the mons after upgrading to 0.94.5. Ubuntu
Trusty LTS. Has anyone seen anything similar?

-Arnulf

Backtraces:

mon1:

#0  0x7f0b2969120b in raise (sig=11)
at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1  0x009adfbd in reraise_fatal (signum=11)
at global/signal_handler.cc:59
#2  handle_fatal_signal (signum=11) at global/signal_handler.cc:109
#3  <signal handler called>
#4  0x006518e5 in std::_Rb_tree,
std::less, std::allocator >::find (this=this@entry=0x47dac90, __k=...)
at /usr/include/c++/4.8/bits/stl_tree.h:1805
#5  0x008a002e in find (__x=..., this=)
at /usr/include/c++/4.8/bits/stl_map.h:837
#6  get_str_map_key (str_map=..., key=...,
fallback_key=fallback_key@entry=0xd1d210
<_ZL23CLOG_CONFIG_DEFAULT_KEY>)
at common/str_map.cc:120
#7  0x006b0a5a in get_facility (channel=..., this=0x47dac30)
at mon/LogMonitor.h:79
#8  LogMonitor::update_from_paxos (this=0x47dab40,
need_bootstrap=) at mon/LogMonitor.cc:141
#9  0x0060432a in PaxosService::refresh (this=0x47dab40,
need_bootstrap=need_bootstrap@entry=0x7f0b208b9f3f)
at mon/PaxosService.cc:128
#10 0x005b03db in Monitor::refresh_from_paxos (this=0x4968000,
need_bootstrap=need_bootstrap@entry=0x7f0b208b9f3f) at
mon/Monitor.cc:788
#11 0x005eea5e in Paxos::do_refresh (this=this@entry=0x4874dc0)
at mon/Paxos.cc:1008
#12 0x005f5c83 in Paxos::handle_commit
(this=this@entry=0x4874dc0,
commit=commit@entry=0x73a7480) at mon/Paxos.cc:933
#13 0x005fd7bb in Paxos::dispatch (this=0x4874dc0,
m=m@entry=0x73a7480) at mon/Paxos.cc:1399
#14 0x005cf9e3 in Monitor::dispatch (this=this@entry=0x4968000,
s=s@entry=0x47d7f80, m=m@entry=0x73a7480,
src_is_mon=src_is_mon@entry=true) at mon/Monitor.cc:3567
#15 0x005cfe36 in Monitor::_ms_dispatch
(this=this@entry=0x4968000,
m=m@entry=0x73a7480) at mon/Monitor.cc:3376
#16 0x005edb43 in Monitor::ms_dispatch (this=0x4968000,
m=0x73a7480)
at mon/Monitor.h:833
#17 0x00929679 in ms_deliver_dispatch (m=0x73a7480,
this=0x49be700)
at ./msg/Messenger.h:567
#18 DispatchQueue::entry (this=0x49be8c8) at
msg/simple/DispatchQueue.cc:185
#19 0x007c99cd in DispatchQueue::DispatchThread::entry (
this=) at msg/simple/DispatchQueue.h:103
#20 0x7f0b29689182 in start_thread (arg=0x7f0b208bb700)
at pthread_create.c:312
#21 0x7f0b27bf447d in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111


mon2:

#0  0x7fd27c06520b in raise () from
/lib/x86_64-linux-gnu/libpthread.so.0
#1  0x009adfbd in reraise_fatal (signum=11)
at global/signal_handler.cc:59
#2  handle_fatal_signal (signum=11) at global/signal_handler.cc:109
#3  <signal handler called>
#4  0x006518e5 in std::_Rb_tree,
std::less, std::allocator >::find (this=this@entry=0x36a6390, __k=...)
at /usr/include/c++/4.8/bits/stl_tree.h:1805
#5  0x008a002e in find (__x=..., this=)
at /usr/include/c++/4.8/bits/stl_map.h:837
#6  get_str_map_key (str_map=..., key=...,
fallback_key=fallback_key@entry=0xd1d210
<_ZL23CLOG_CONFIG_DEFAULT_KEY>)
at common/str_map.cc:120
#7  0x006b0a5a in get_facility (channel=..., this=0x36a6330)
at mon/LogMonitor.h:79
#8  LogMonitor::update_from_paxos (this=0x36a6240,
need_bootstrap=) at mon/LogMonitor.cc:141
#9  0x0060432a in PaxosService::refresh (this=0x36a6240,
need_bootstrap=need_bootstrap@entry=0x7fd276f5d6af)
at mon/PaxosService.cc:128
#10 0x005b03db in Monitor::refresh_from_paxos (this=0x37feb00,
need_bootstrap=need_bootstrap@entry=0x7fd276f5d6af) at
mon/Monitor.cc:788
#11 0x005eea5e in Paxos::do_refresh (this=this@entry=0x3740dc0)
at mon/Paxos.cc:1008
#12 0x005fbf39 in Paxos::commit_finish (this=0x3740dc0)
at mon/Paxos.cc:903
#13 0x0060038b in C_Committed::finish (this=0x4600ad0,
r=) at mon/Paxos.cc:807
#14 0x005d4d89 in Context::complete (this=0x4600ad0,
r=) at ./include/Context.h:65
#15 0x005ff4bc in MonitorDBStore::C_DoTransaction::finish (
this=0x38258c0, r=) at mon/MonitorDBStore.h:326
#16 0x005d4d89 in Context::complete (this=0x38258c0,
r=) at ./include/Context.h:65
#17 0x00717e88 in Finisher::finisher_thread_entry (this=0x3683350)
at common/Finisher.cc:59
#18 0x7fd27c05d182 in start_thread ()
   from /lib/x86_64-linux-gnu/libpthread.so.0
#19 0x7fd27a5c847d in clone () from /lib/x86_64-linux-gnu/libc.so.6

mon3:

#0  0x7f4f0cfce20b in raise () from
/lib/x86_64-linux-gnu/libpthread.so.0
#1  0x009adfbd in reraise_fatal (signum=11)
at global/signal_handler.cc:59
#2  handle_fatal_signal (signum=11) at global/signal_handler.cc:109
#3  
#4  

Re: [ceph-users] retrieving quota of ceph pool using librados or python API

2015-11-02 Thread Alex Leake
John,

Thank you very much! Works exactly as expected.


Kind Regards,
Alex.

From: John Spray 
Sent: 02 November 2015 13:19
To: Alex Leake
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] retrieving quota of ceph pool using librados or 
python API

On Mon, Nov 2, 2015 at 9:39 PM, Alex Leake  wrote:
> Hello all,
>
>
> I'm attempting to use the python API to get the quota of a pool, but I can't
> see it in the documentation
> (http://docs.ceph.com/docs/v0.94/rados/api/python/).

The call you're looking for is Rados.mon_command, which seems to be
missing in the documentation for some reason.  This is the same
interface that the ceph CLI uses.

>>> r = rados.Rados(conffile="./ceph.conf")
>>> r.connect()
>>> json.loads(r.mon_command(json.dumps({"prefix": "osd pool get-quota", 
>>> "pool": "rbd", "format": "json-pretty"}), "")[1])
{u'pool_name': u'rbd', u'quota_max_objects': 0, u'quota_max_bytes': 0,
u'pool_id': 0}

Cheers,
John
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] testing simple rebalancing

2015-11-02 Thread Mulpur, Sudha
Hi,

I am a new user of ceph. I am using version 0.94.5. Can someone help me with 
this rebalancing question?

I have 1 mon and 4 OSDs, and 100 pgs in my pool with 2+1 erasure coding. I have 
listed the steps that I did below. Am I doing these correctly? Why are UP and 
acting showing different sets in step 10?

To test rebalancing, I did the following steps:


1.   When I put an object, the osd map showed the following OSDs in UP and 
ACTING - 1,0,3 (health is OK)

2.   I manually stopped my osd.1 and the map changed to NONE,0,3

3.   After 5 min, map changed to 2,0,3 (health is OK)

4.   I manually started my osd.1 and the map changed to 1,2,3 (health is OK)

5.   Then I stopped my osd.2 and the map changed to 1,NONE,3

6.   After 5 min, map changed to 1,0,3 (health is OK)

7.   I manually started my osd.2 and the map changed to 2,0,3 (health is OK)

8.   Then I stopped my osd.3 and the map changed to 2,0,NONE

9.   At this point even after waiting for 15 min, the map did not change, 
still stuck at NONE

10.   Then I started my osd.3 and the map still did not change, and health shows 
warnings, with 37 pgs degraded, 100 pgs unclean and 37 pgs stuck undersized. 
The map shows UP (3,NONE,0) and acting (3,0,0)
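
For reference, the up/acting sets and stuck PGs above can be inspected with 
standard commands such as these (the pool and object names are placeholders):

$ ceph osd map ecpool myobject     # PG for the object plus its up and acting OSD sets
$ ceph health detail               # which PGs are degraded / undersized / unclean
$ ceph pg dump_stuck unclean       # list PGs stuck unclean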

Thanks
Sudha

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Changing CRUSH map ids

2015-11-02 Thread Gregory Farnum
Regardless of what the crush tool does, I wouldn't muck around with the IDs
of the OSDs. The rest of Ceph will probably not handle it well if the crush
IDs don't match the OSD numbers.
-Greg

On Monday, November 2, 2015, Loris Cuoghi  wrote:

> On 02/11/2015 12:47, Wido den Hollander wrote:
>
>>
>>
>> On 02-11-15 12:30, Loris Cuoghi wrote:
>>
>>> Hi All,
>>>
>>> We're currently on version 0.94.5 with three monitors and 75 OSDs.
>>>
>>> I've peeked at the decompiled CRUSH map, and I see that all ids are
>>> commented with '# Here be dragons!', or more literally : '# do not
>>> change unnecessarily'.
>>>
>>> Now, what would happen if an incautious user would happen to put his
>>> chubby fingers on these ids, totally disregarding the warning at the
>>> entrance of the cave, and change one of them?
>>>
>>> Data shuffle? (Relative to the allocation of PGs for the OSD/host/other
>>> item?)
>>>
>>> A *big* data shuffle? (ALL data would need to have its position
>>> recalculated, with immediate end-of-the-world data shuffle?)
>>>
>>> Nothing at all? (And the big fat warning is there only to make fun of
>>> the uninstructed ones? Not plausible...)
>>>
>>>
>> Give it a try! Download the CRUSHMap and run tests on it with crushtool:
>>
>> $ crushtool -i mycrushmap --test --rule 0 --num-rep 3 --show-statistics
>>
>> Now, change the map, compile it and run again:
>>
>> $ crushtool -i mycrushmap.new --test --rule 0 --num-rep 3
>> --show-statistics
>>
>> Check the differences and you get the idea of how much has changed.
>>
>> Wido
>>
>>
> Thanks Wido ! :)
>
> Thanks !
>>>
>>> Loris
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Creating RGW Zone System Users Fails with "couldn't init storage provider"

2015-11-02 Thread Daniel Schneller

Hi!


I am trying to set up a Rados Gateway, prepared for multiple regions 
and zones, according to the documentation at 
http://docs.ceph.com/docs/hammer/radosgw/federated-config/.

Ceph version is 0.94.3 (Hammer).

I am stuck at the "Create zone users" step 
(http://docs.ceph.com/docs/hammer/radosgw/federated-config/#create-zone-users). 



Running the user create command I get this:

$ sudo radosgw-admin user create --uid="eu-zone1" 
--display-name="Region-EU Zone-zone1" --client-id 
client.radosgw.eu-zone1-1 --system

couldn't init storage provider
$ echo $?
5



I have found this in a Documentation bug ticket, but unfortunately 
there is no indication of what was actually going on there: 
http://tracker.ceph.com/issues/10848#note-21


I am at a loss. I have even tried to figure out what was going on by 
reading the radosgw-admin source, but I could not find any strong hints.
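
One thing that may be worth trying (an assumption, not a confirmed fix): run the 
same command with the gateway's cephx name passed via --name and with verbose 
debugging, so the zone/region pool lookups that fail become visible:

$ radosgw-admin user create --uid="eu-zone1" --display-name="Region-EU Zone-zone1" \
    --name client.radosgw.eu-zone1-1 --system --debug-rgw=20 --debug-ms=1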


Ideas?

Thanks,
Daniel


Find all relevant(?) bits of configuration below:


Ceph.conf has this for the RGW instances:


[client.radosgw.eu-zone1-1]
 host = dec-b1-d7-73-f0-04
 admin socket = /var/run/ceph-radosgw/client.radosgw.dec-b1-d7-73-f0-04.asok
 pid file = /var/run/ceph-radosgw/$name.pid
 rgw region = eu
 rgw region root pool = .eu.rgw.root
 rgw zone = eu-zone1
 rgw zone root pool = .eu-zone1.rgw.root
 rgw_print_continue = false
 keyring = /etc/ceph/ceph.client.radosgw.keyring
 rgw_socket_path = /var/run/ceph-radosgw/client.radosgw.eu-zone1-1.sock
 log_file = /var/log/radosgw/radosgw.log
 rgw_enable_ops_log = false
 rgw_gc_max_objs = 31
 rgw_frontends = fastcgi
 debug_rgw = 20


Keyring:
[client.radosgw.eu-zone1-1]
   key = 
   caps mon = "allow rwx"
   caps osd = "allow rwx"


ceph auth list has the same key and these caps:

client.radosgw.eu-zone1-1
key: 
caps: [mon] allow rwx
caps: [osd] allow rwx



I have followed the instructions on that page and have created Region 
and Zone configurations as follows:




{ "name": "eu",
 "api_name": "eu",
 "is_master": "true",
 "endpoints": [
   "https:\/\/rgw-eu-zone1.mydomain.net:443\/",
   "http:\/\/rgw-eu-zone1.mydomain.net:80\/"],
 "master_zone": "eu-zone1",
 "zones": [
   { "name": "eu-zone1",
 "endpoints": [
   "https:\/\/rgw-eu-zone1.mydomain.net:443\/",
   "http:\/\/rgw-eu-zone1.mydomain.net:80\/"],
 "log_meta": "true",
 "log_data": "true"}
 ],
 "placement_targets": [
  {
"name": "default-placement",
"tags": []
  }
 ],
 "default_placement": "default-placement"}



{ "domain_root": ".eu-zone1.domain.rgw",
 "control_pool": ".eu-zone1.rgw.control",
 "gc_pool": ".eu-zone1.rgw.gc",
 "log_pool": ".eu-zone1.log",
 "intent_log_pool": ".eu-zone1.intent-log",
 "usage_log_pool": ".eu-zone1.usage",
 "user_keys_pool": ".eu-zone1.users",
 "user_email_pool": ".eu-zone1.users.email",
 "user_swift_pool": ".eu-zone1.users.swift",
 "user_uid_pool": ".eu-zone1.users.uid",
 "system_key": { "access_key": "", "secret_key": ""},
 "placement_pools": [
   { "key": "default-placement",
 "val": { "index_pool": ".eu-zone1.rgw.buckets.index",
  "data_pool": ".eu-zone1.rgw.buckets"}
   }
 ]
}


These pools are defined:

rbd
images
volumes
.eu-zone1.rgw.root
.eu-zone1.rgw.control
.eu-zone1.rgw.gc
.eu-zone1.rgw.buckets
.eu-zone1.rgw.buckets.index
.eu-zone1.rgw.buckets.extra
.eu-zone1.log
.eu-zone1.intent-log
.eu-zone1.usage
.eu-zone1.users
.eu-zone1.users.email
.eu-zone1.users.swift
.eu-zone1.users.uid
.eu.rgw.root
.eu-zone1.domain.rgw
.rgw
.rgw.root
.rgw.gc
.users.uid
.users
.rgw.control
.log
.intent-log
.usage
.users.email
.users.swift



--
Daniel Schneller
Principal Cloud Engineer

CenterDevice GmbH
https://www.centerdevice.de


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] retrieving quota of ceph pool using librados or python API

2015-11-02 Thread John Spray
On Mon, Nov 2, 2015 at 9:39 PM, Alex Leake  wrote:
> Hello all,
>
>
> I'm attempting to use the python API to get the quota of a pool, but I can't
> see it in the documentation
> (http://docs.ceph.com/docs/v0.94/rados/api/python/).

The call you're looking for is Rados.mon_command, which seems to be
missing in the documentation for some reason.  This is the same
interface that the ceph CLI uses.

>>> r = rados.Rados(conffile="./ceph.conf")
>>> r.connect()
>>> json.loads(r.mon_command(json.dumps({"prefix": "osd pool get-quota", 
>>> "pool": "rbd", "format": "json-pretty"}), "")[1])
{u'pool_name': u'rbd', u'quota_max_objects': 0, u'quota_max_bytes': 0,
u'pool_id': 0}
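
For comparison, the equivalent CLI command (which goes through the same monitor
command path) is:

$ ceph osd pool get-quota rbd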

Cheers,
John
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Ceph-cn] librados: Objecter returned from getxattrs r=-2

2015-11-02 Thread Zhou, Yuan
Hi,

So in RGW there are no hive* objects now; could you please check whether any 
still exist from the S3 perspective? That is, check the object listing of the bucket 
'olla' via the S3 API (boto or s3cmd could do the job).
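
For example, with s3cmd (assuming the bucket owner's credentials are already 
configured in ~/.s3cfg):

$ s3cmd ls s3://olla/hive/                 # what S3 still lists under the prefix
$ s3cmd del --recursive s3://olla/hive/    # remove the stale entries (back up first)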

I've hit a similar issue with Hadoop over SwiftFS before. Some OSDs were down in 
the Ceph cluster, and afterwards the file listings in Hadoop and Swift did not 
match. I don't know the detailed cause of the failures, though. I was simply running 
some benchmarks, so the data was not important; by manually deleting the 
objects/buckets and regenerating the data the issue was fixed.

hope this can help. 

thanks, -yuan

-Original Message-
From: 张绍文 [mailto:zhangshao...@btte.net] 
Sent: Tuesday, November 3, 2015 1:45 PM
To: Zhou, Yuan
Cc: ceph...@lists.ceph.com; ceph-us...@ceph.com
Subject: Re: [Ceph-cn] librados: Objecter returned from getxattrs r=-2

On Tue, 3 Nov 2015 05:32:27 +
"Zhou, Yuan"  wrote:

> Hi,
> 
> The directory there should be some simulated hierarchical structure 
> with '/' in the object names. Do you mind checking the remaining objects in 
> the ceph pool .rgw.buckets?
> 
> $ rados ls -p .rgw.buckets | grep default.157931.5_hive
> 
> If objects still come out, you might try to delete them from 
> the 'olla' bucket with the S3 API. (Note I'm not sure how your Hive data 
> is generated, so please do a backup first if it's important.)
> 

Thanks for your reply. I dumped object list yesterday:

# rados -p .rgw.buckets ls >obj-list
# ls -lh obj-list
-rw-r--r-- 1 root root 1.2G Nov  2 15:51 obj-list
# grep default.157931.5_hive obj-list
# 

There's no such object.

> 
> -Original Message-
> From: Ceph-cn [mailto:ceph-cn-boun...@lists.ceph.com] On Behalf Of ???
> Sent: Tuesday, November 3, 2015 12:22 PM
> To: ceph...@lists.ceph.com; ceph-us...@ceph.com
> Subject: Re: [Ceph-cn] librados: Objecter returned from getxattrs r=-2
> 
> With debug_objecter = 20/0 I get this, I guess the thing is: the 
> object has been removed, but "directory" info still exists.
> 
> 2015-11-03 12:07:22.264704 7f03c42f3700 10 client.214496.objecter 
> ms_dispatch 0x2c18840 osd_op_reply(81 
> default.157931.5_hive/staging_hive_2015-11-01_14-57-40_861_37977797652
> 10222008-1/_tmp.-ext-1/ [getxattrs,stat] v0'0 uv0 ack = -2 ((2) No 
> such file or directory)) v6
> 
> So, how can I safely remove the "directory" info?
> 
> On Tue, 3 Nov 2015 10:10:26 +0800
> 张绍文  wrote:
> 
> > On Mon, 2 Nov 2015 16:47:11 +0800
> > 张绍文  wrote:
> >   
> > > On Mon, 2 Nov 2015 16:36:57 +0800
> > > 张绍文  wrote:
> > > 
> > > > Hi, all:
> > > > 
> > > > I'm using hive via s3a, but it's not usable after I removed some 
> > > > temp files with:
> > > > 
> > > > /opt/hadoop/bin/hdfs dfs -rm -r -f s3a://olla/hive/
> > > > 
> > > > With debug_radosgw = 10/0, I got these messages repeatedly:
> > > > 
> > > > 2015-11-02 14:30:44.547271 7f08ef7fe700 10 librados: Objecter 
> > > > returned from getxattrs r=-2 2015-11-02 14:30:44.549117
> > > > 7f08ef7fe700 10 librados: getxattrs
> > > > oid=default.157931.5_hive/staging_hive_2015-11-01_14-57-40_861_3
> > > > 79 7779765210222008-1/_tmp.-ext-1/
> > > > nspace=
> > > > 
> > > > I dumped the whole object list, and there's no object whose name starts 
> > > > with hive/...; hive is not usable now, please help.
> > > >   
> > > 
> > > Sorry, I forgot this:
> > > 
> > > # ceph -v
> > > ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
> > > 
> > > Other known "directories" under the same bucket are readable.
> > > 
> > > 
> > 
> > This also happened to others on the ceph-users mailing list; it seems unresolved:
> > 
> > http://article.gmane.org/gmane.comp.file-systems.ceph.user/7653/matc
> > h=
> > objecter+returned+getxattrs
> > 
> >   
> 
> 
> 
> --
> 张绍文
> ___
> Ceph-cn mailing list
> ceph...@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-cn-ceph.com



--
张绍文
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph new osd addition and client disconnected

2015-11-02 Thread gjprabu
Hi Taylor,



  I have checked the DNS names and all hosts resolve to the correct IPs. The MTU 
size is 1500 and the switch-level configuration is done. There is no firewall/SELinux 
running currently. 



 Also, we would like answers to the queries below, which are already in the thread.



Regards

Prabu



  On Tue, 03 Nov 2015 11:20:07 +0530 Chris Taylor 
ctay...@eyonic.com wrote 




I would double check the network configuration on the new node. Including hosts 
files and DNS names. Do all the host names resolve to the correct IP addresses 
from all hosts?

"... 192.168.112.231:6800/49908  192.168.113.42:0/599324131 ..."

Looks like the communication between subnets is a problem. Is xxx.xxx.113.xxx a 
typo? If that's correct, check MTU sizes. Are they configured correctly on the 
switch and all NICs?

Is there any iptables/firewall rules that could be blocking traffic between 
hosts?



Hope that helps,

Chris





On 2015-11-02 9:18 pm, gjprabu wrote:

Hi,



Can anybody please help me with this issue?



Regards

Prabu





 On Mon, 02 Nov 2015 17:54:27 +0530 gjprabu gjpr...@zohocorp.com 
wrote 






Hi Team,



   We have a ceph setup with 2 OSDs and replica 2, and it is mounted by ocfs2 
clients and working. When we added a new osd, all the clients' rbd-mapped 
devices disconnected and hung when running the rbd ls or rbd map commands. We 
waited for long hours for the new osd to fill, but peering had not completed even 
after the data sync finished; the client-side issue persisted, so we tried 
stopping/starting the old osd services, and after some time rbd mapped 
automatically using the existing map script.



   After the service stop/start on the old osds, the 3rd OSD rebuilt and 
backfilling started again, and after some time the clients' rbd-mapped devices 
disconnected and hung when running the rbd ls or rbd map commands. We decided to 
wait until the data sync on the 3rd OSD finished, and it completed, but the 
client side still could not map rbd. After we restarted all mon and osd services, 
the client-side issue was fixed and rbd mounted. We suspect some issue in our 
setup; logs are also attached for your reference.



  We don't know what we are missing in our setup; any help in solving this 
issue would be highly appreciated.





Before new osd.2 addition :



osd.0 - size : 13T  and used 2.7 T

osd.1 - size : 13T  and used 2.7 T



After new osd addition :

osd.0  size : 13T  and used  1.8T

osd.1  size : 13T  and used  2.1T

osd.2  size : 15T  and used  2.5T



rbd ls

repo / integrepository  (pg_num: 126)

rbd / integdownloads (pg_num: 64)







Also, we would like a few clarifications.



If a new osd is added, will all clients be unmounted automatically?



While adding a new osd, can we access (read/write) from the client machines?



How much data will be moved to the new osd, without changing any replica / pg_num settings?



How long does this process take to finish?



If we missed any common configuration, please share it.





ceph.conf

[global]

fsid = 944fa0af-b7be-45a9-93ff-b9907cfaee3f

mon_initial_members = integ-hm5, integ-hm6, integ-hm7

mon_host = 192.168.112.192,192.168.112.193,192.168.112.194

auth_cluster_required = cephx

auth_service_required = cephx

auth_client_required = cephx

filestore_xattr_use_omap = true

osd_pool_default_size = 2



[mon]

mon_clock_drift_allowed = .500



[client]

rbd_cache = false



Current Logs from new osd also attached old logs.



2015-11-02 12:47:48.481641 7f386f691700  0 bad crc in data 3889133030 != exp 
2857248268

2015-11-02 12:47:48.482230 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x170d2000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0xc510580).accept peer addr is really 192.168.113.42:0/599324131 (socket 
is 192.168.113.42:42530/0)

2015-11-02 12:47:48.483951 7f386f691700  0 bad crc in data 3192803598 != exp 
1083014631

2015-11-02 12:47:48.484512 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x170ea000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0xc516f60).accept peer addr is really 192.168.113.42:0/599324131 (socket 
is 192.168.113.42:42531/0)

2015-11-02 12:47:48.486284 7f386f691700  0 bad crc in data 133120597 != exp 
393328400

2015-11-02 12:47:48.486777 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x16a18000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0xc514620).accept peer addr is really 192.168.113.42:0/599324131 (socket 
is 192.168.113.42:42532/0)

2015-11-02 12:47:48.488624 7f386f691700  0 bad crc in data 3299720069 != exp 
211350069

2015-11-02 12:47:48.489100 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x170d2000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0xc513860).accept peer addr is really 192.168.113.42:0/599324131 (socket 
is 192.168.113.42:42533/0)

2015-11-02 12:47:48.490911 7f386f691700  0 bad crc in data 2381447347 != exp 
1177846878

2015-11-02 12:47:48.491390 7f386f691700  0 -- 192.168.112.231:6800/49908 
 

Re: [ceph-users] ceph new osd addition and client disconnected

2015-11-02 Thread Chris Taylor

On 2015-11-02 10:19 pm, gjprabu wrote:


Hi Taylor,

I have checked DNS name and all host resolve to the correct IP. MTU 
size is 1500 in switch level configuration done. There is no firewall/ 
selinux is running currently.


Also we would like to know below query's which already in the thread.

Regards
Prabu

 On Tue, 03 Nov 2015 11:20:07 +0530 CHRIS TAYLOR 
 wrote 


I would double check the network configuration on the new node. 
Including hosts files and DNS names. Do all the host names resolve to 
the correct IP addresses from all hosts?


"... 192.168.112.231:6800/49908 >> 192.168.113.42:0/599324131 ..."

Looks like the communication between subnets is a problem. Is 
xxx.xxx.113.xxx a typo? If that's correct, check MTU sizes. Are they 
configured correctly on the switch and all NICs?


Is there any iptables/firewall rules that could be blocking traffic 
between hosts?


Hope that helps,

Chris

On 2015-11-02 9:18 pm, gjprabu wrote:

Hi,

Anybody please help me on this issue.

Regards
Prabu

 On Mon, 02 Nov 2015 17:54:27 +0530 GJPRABU  
wrote 


Hi Team,

We have ceph setup with 2 OSD and replica 2 and it is mounted with 
ocfs2 clients and its working. When we added new osd all the clients 
rbd mapped device disconnected and got hanged by running rbd ls or rbd 
map command. We waited for long hours to scale the new osd size but 
peering not completed event data sync finished, but client side issue 
was persist and thought to try old osd service stop/start, after some 
time rbd mapped automatically using existing map script.


After service stop/start in old osd again 3rd OSD rebuild and back 
filling started and after some time clients rbd mapped device 
disconnected and got hanged by running rbd ls or rbd map command. We 
thought to wait till to finished data sync in 3'rd OSD and its 
completed, even though client side rbd not mapped. After we restarted 
all mon and osd service and client side issue got fixed and mounted 
rbd. We suspected some issue in our setup. also attached logs for your 
reference.




What does 'ceph -s' look like? Is the cluster HEALTH_OK?



Something we are missing in our setup i don't know, highly appreciated 
if anybody help us to solve this issue.


Before new osd.2 addition :

osd.0 - size : 13T and used 2.7 T
osd.1 - size : 13T and used 2.7 T

After new osd addition :
osd.0 size : 13T and used 1.8T
osd.1 size : 13T and used 2.1T
osd.2 size : 15T and used 2.5T

rbd ls
repo / integrepository (pg_num: 126)
rbd / integdownloads (pg_num: 64)

Also we would like to know few clarifications .

If any new osd will be added whether all client will be unmounted 
automatically .




Clients do not need to unmount images when OSDs are added.


While add new osd can we access ( read / write ) from client machines ?



Clients still have read/write access to RBD images in the cluster while 
adding OSDs and during recovery.


How much data will be added in new osd - without change any repilca / 
pg_num ?




The data will re-balance between OSDs automatically. I found that having more 
PGs helps distribute the load more evenly.



How long to take finish this process ?


It depends greatly on the hardware and configuration: whether journals are on 
SSDs or spinning disks, network connectivity, max_backfills, etc.




If we missed any common configuration - please share the same .


I don't see any configuration for public and cluster networks. If you 
are sharing the same network for client traffic and object replication/recovery, 
the cluster re-balancing data between OSDs could cause problems for the 
client traffic.


Take a look at: 
http://docs.ceph.com/docs/master/rados/configuration/network-config-ref/
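
A minimal sketch of such a split (the subnets here are only examples, not taken 
from your setup):

[global]
public network = 192.168.112.0/24
cluster network = 192.168.115.0/24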




ceph.conf
[global]
fsid = 944fa0af-b7be-45a9-93ff-b9907cfaee3f
mon_initial_members = integ-hm5, integ-hm6, integ-hm7
mon_host = 192.168.112.192,192.168.112.193,192.168.112.194
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd_pool_default_size = 2

[mon]
mon_clock_drift_allowed = .500

[client]
rbd_cache = false

Current Logs from new osd also attached old logs.

2015-11-02 12:47:48.481641 7f386f691700 0 bad crc in data 3889133030 != 
exp 2857248268
2015-11-02 12:47:48.482230 7f386f691700 0 -- 192.168.112.231:6800/49908 
>> 192.168.113.42:0/599324131 pipe(0x170d2000 sd=28 :6800 s=0 pgs=0 cs=0 l=0 c=0xc510580).accept peer addr is really 192.168.113.42:0/599324131 (socket is 192.168.113.42:42530/0)
2015-11-02 12:47:48.483951 7f386f691700 0 bad crc in data 3192803598 != 
exp 1083014631
2015-11-02 12:47:48.484512 7f386f691700 0 -- 192.168.112.231:6800/49908 
>> 192.168.113.42:0/599324131 pipe(0x170ea000 sd=28 :6800 s=0 pgs=0 cs=0 l=0 c=0xc516f60).accept peer addr is really 192.168.113.42:0/599324131 (socket is 192.168.113.42:42531/0)
2015-11-02 12:47:48.486284 7f386f691700 0 bad crc in data 133120597 != 
exp 393328400
2015-11-02 12:47:48.486777 7f386f691700 0 -- 

Re: [ceph-users] Changing CRUSH map ids

2015-11-02 Thread Gregory Farnum
On Mon, Nov 2, 2015 at 7:42 AM, Loris Cuoghi  wrote:
> Thanks Greg :)
>
> For the OSDs, I understand, on the other hand for intermediate abstractions
> like hosts, racks and rooms, do you agree that it should currently be
> possible to change the IDs (always under the "one change at a time, I
> promise mom" rule)?

Yeah, that should be fine. Actually changing a bunch of IDs shouldn't
matter, but I haven't played with actually changing them so no
promises.

>
> Clearly, a good amount of shuffling should be expected as a consequence.
> Basically I was inquiring whether changing the id of a single host would
> shuffle the entirety (or a relatively big chunk) of the cluster data, or if
> the shuffling was limited to a direct proportion of the item's weight.
>
> I just --test-ed with crushtool. I changed a host's id, and testing the two
> maps with :
>
> crushtool -i crush.map --test --show-statistics --rule 0 --num-rep 3 --min-x
> 1 --max-x $N --show-mappings
>
> (with $N varying from as little as 32 to "big numbers"TM) shows that nearly
> 50% of the mappings changed in a 10-host cluster.
>
> Thanks All :)
>
>
> On 02/11/2015 16:14, Gregory Farnum wrote:
>>
>> Regardless of what the crush tool does, I wouldn't muck around with the
>> IDs of the OSDs. The rest of Ceph will probably not handle it well if
>> the crush IDs don't match the OSD numbers.
>> -Greg
>>
>> On Monday, November 2, 2015, Loris Cuoghi wrote:
>>
>> On 02/11/2015 12:47, Wido den Hollander wrote:
>>
>>
>>
>> On 02-11-15 12:30, Loris Cuoghi wrote:
>>
>> Hi All,
>>
>> We're currently on version 0.94.5 with three monitors and 75
>> OSDs.
>>
>> I've peeked at the decompiled CRUSH map, and I see that all
>> ids are
>> commented with '# Here be dragons!', or more literally : '#
>> do not
>> change unnecessarily'.
>>
>> Now, what would happen if an incautious user would happen to
>> put his
>> chubby fingers on these ids, totally disregarding the warning
>> at the
>> entrance of the cave, and change one of them?
>>
>> Data shuffle? (Relative to the allocation of PGs for the
>> OSD/host/other
>> item?)
>>
>> A *big* data shuffle? (ALL data would need to have its
>> position
>> recalculated, with immediate end-of-the-world data shuffle?)
>>
>> Nothing at all? (And the big fat warning is there only to
>> make fun of
>> the uninstructed ones? Not plausible...)
>>
>>
>> Give it a try! Download the CRUSHMap and run tests on it with
>> crushtool:
>>
>> $ crushtool -i mycrushmap --test --rule 0 --num-rep 3
>> --show-statistics
>>
>> Now, change the map, compile it and run again:
>>
>> $ crushtool -i mycrushmap.new --test --rule 0 --num-rep 3
>> --show-statistics
>>
>> Check the differences and you get the idea of how much has
>> changed.
>>
>> Wido
>>
>>
>> Thanks Wido ! :)
>>
>> Thanks !
>>
>> Loris
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] who is using radosgw with civetweb?

2015-11-02 Thread Derek Yarnell
On 2/25/15 2:31 PM, Sage Weil wrote:
> Hey,
> 
> We are considering switching to civetweb (the embedded/standalone rgw web 
> server) as the primary supported RGW frontend instead of the current 
> apache + mod-fastcgi or mod-proxy-fcgi approach.  "Supported" here means 
> both the primary platform the upstream development focuses on and what the 
> downstream Red Hat product will officially support.
> 
> How many people are using RGW standalone using the embedded civetweb 
> server instead of apache?  In production?  At what scale?  What 
> version(s) (civetweb first appeared in firefly and we've backported most 
> fixes).
> 
> Have you seen any problems?  Any other feedback?  The hope is to (vastly) 
> simplify deployment.

Hi,

We have been using civetweb proxied by Apache on RHEL7 on both our RGW
clusters and have been very happy with performance and setup. This has
been our default since we upgraded to Hammer.

The only thing we had to make sure of was that we were specifying nocanon
on our ProxyPass; otherwise the proxy pass would mangle the HTTP encoding.
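
A minimal sketch of such a ProxyPass line (host and port are illustrative; 
civetweb listens on 7480 by default):

ProxyPass / http://127.0.0.1:7480/ nocanon
ProxyPassReverse / http://127.0.0.1:7480/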

The reason we use Apache on the front end is so we can colocate a Django
web front-end application for the object store, to get around the
need for CORS (we designed this before RGW was CORS-aware anyway).

Thanks,
derek

-- 
Derek T. Yarnell
University of Maryland
Institute for Advanced Computer Studies
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph new osd addition and client disconnected

2015-11-02 Thread gjprabu
Hi,



Can anybody please help me with this issue?



Regards

Prabu




  On Mon, 02 Nov 2015 17:54:27 +0530 gjprabu gjpr...@zohocorp.com 
wrote 






Hi Team,



   We have a ceph setup with 2 OSDs and replica 2, and it is mounted by ocfs2 
clients and working. When we added a new osd, all the clients' rbd-mapped 
devices disconnected and hung when running the rbd ls or rbd map commands. We 
waited for long hours for the new osd to fill, but peering had not completed even 
after the data sync finished; the client-side issue persisted, so we tried 
stopping/starting the old osd services, and after some time rbd mapped 
automatically using the existing map script.



   After the service stop/start on the old osds, the 3rd OSD rebuilt and 
backfilling started again, and after some time the clients' rbd-mapped devices 
disconnected and hung when running the rbd ls or rbd map commands. We decided to 
wait until the data sync on the 3rd OSD finished, and it completed, but the 
client side still could not map rbd. After we restarted all mon and osd services, 
the client-side issue was fixed and rbd mounted. We suspect some issue in our 
setup; logs are also attached for your reference.



  We don't know what we are missing in our setup; any help in solving this 
issue would be highly appreciated.





Before new osd.2 addition :



osd.0 - size : 13T  and used 2.7 T

osd.1 - size : 13T  and used 2.7 T



After new osd addition :

osd.0  size : 13T  and used  1.8T

osd.1  size : 13T  and used  2.1T

osd.2  size : 15T  and used  2.5T



rbd ls

repo / integrepository  (pg_num: 126)

rbd / integdownloads (pg_num: 64)







Also, we would like a few clarifications.



If a new osd is added, will all clients be unmounted automatically?



While adding a new osd, can we access (read/write) from the client machines?



How much data will be moved to the new osd, without changing any replica / pg_num settings?



How long does this process take to finish?



If we missed any common configuration, please share it.





ceph.conf

[global]

fsid = 944fa0af-b7be-45a9-93ff-b9907cfaee3f

mon_initial_members = integ-hm5, integ-hm6, integ-hm7

mon_host = 192.168.112.192,192.168.112.193,192.168.112.194

auth_cluster_required = cephx

auth_service_required = cephx

auth_client_required = cephx

filestore_xattr_use_omap = true

osd_pool_default_size = 2



[mon]

mon_clock_drift_allowed = .500



[client]

rbd_cache = false



Current Logs from new osd also attached old logs.



2015-11-02 12:47:48.481641 7f386f691700  0 bad crc in data 3889133030 != exp 
2857248268

2015-11-02 12:47:48.482230 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x170d2000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0xc510580).accept peer addr is really 192.168.113.42:0/599324131 (socket 
is 192.168.113.42:42530/0)

2015-11-02 12:47:48.483951 7f386f691700  0 bad crc in data 3192803598 != exp 
1083014631

2015-11-02 12:47:48.484512 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x170ea000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0xc516f60).accept peer addr is really 192.168.113.42:0/599324131 (socket 
is 192.168.113.42:42531/0)

2015-11-02 12:47:48.486284 7f386f691700  0 bad crc in data 133120597 != exp 
393328400

2015-11-02 12:47:48.486777 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x16a18000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0xc514620).accept peer addr is really 192.168.113.42:0/599324131 (socket 
is 192.168.113.42:42532/0)

2015-11-02 12:47:48.488624 7f386f691700  0 bad crc in data 3299720069 != exp 
211350069

2015-11-02 12:47:48.489100 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x170d2000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0xc513860).accept peer addr is really 192.168.113.42:0/599324131 (socket 
is 192.168.113.42:42533/0)

2015-11-02 12:47:48.490911 7f386f691700  0 bad crc in data 2381447347 != exp 
1177846878

2015-11-02 12:47:48.491390 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x170ea000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0xc513700).accept peer addr is really 192.168.113.42:0/599324131 (socket 
is 192.168.113.42:42534/0)

2015-11-02 12:47:48.493167 7f386f691700  0 bad crc in data 2093712440 != exp 
2175112954

2015-11-02 12:47:48.493682 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x16a18000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0xc514200).accept peer addr is really 192.168.113.42:0/599324131 (socket 
is 192.168.113.42:42535/0)

2015-11-02 12:47:48.495150 7f386f691700  0 bad crc in data 3047197039 != exp 
38098198

2015-11-02 12:47:48.495679 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x170d2000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0xc510b00).accept peer addr is really 192.168.113.42:0/599324131 (socket 
is 192.168.113.42:42536/0)

2015-11-02 12:47:48.497259 7f386f691700  0 bad crc in data 1400444622 != exp 
2648291990

2015-11-02 12:47:48.497756 

Re: [ceph-users] [Ceph-cn] librados: Objecter returned from getxattrs r=-2

2015-11-02 Thread Zhou, Yuan
Hi,

The directory there should be some simulated hierarchical structure with '/' in 
the object names. Do you mind checking the remaining objects in the ceph pool 
.rgw.buckets?

$ rados ls -p .rgw.buckets | grep default.157931.5_hive

If objects still come out, you might try to delete them from the 
'olla' bucket with the S3 API.
(Note I'm not sure how your Hive data is generated, so please do a backup first if 
it's important.)

thanks, -yuan

-Original Message-
From: Ceph-cn [mailto:ceph-cn-boun...@lists.ceph.com] On Behalf Of ???
Sent: Tuesday, November 3, 2015 12:22 PM
To: ceph...@lists.ceph.com; ceph-us...@ceph.com
Subject: Re: [Ceph-cn] librados: Objecter returned from getxattrs r=-2

With debug_objecter = 20/0 I get this, I guess the thing is: the object has 
been removed, but "directory" info still exists.

2015-11-03 12:07:22.264704 7f03c42f3700 10 client.214496.objecter ms_dispatch 
0x2c18840 osd_op_reply(81 
default.157931.5_hive/staging_hive_2015-11-01_14-57-40_861_3797779765210222008-1/_tmp.-ext-1/
[getxattrs,stat] v0'0 uv0 ack = -2 ((2) No such file or directory)) v6

So, how can I safely remove the "directory" info?

On Tue, 3 Nov 2015 10:10:26 +0800
张绍文  wrote:

> On Mon, 2 Nov 2015 16:47:11 +0800
> 张绍文  wrote:
> 
> > On Mon, 2 Nov 2015 16:36:57 +0800
> > 张绍文  wrote:
> >   
> > > Hi, all:
> > > 
> > > I'm using hive via s3a, but it's not usable after I removed some 
> > > temp files with:
> > > 
> > > /opt/hadoop/bin/hdfs dfs -rm -r -f s3a://olla/hive/
> > > 
> > > With debug_radosgw = 10/0, I got these messages repeatly:
> > > 
> > > 2015-11-02 14:30:44.547271 7f08ef7fe700 10 librados: Objecter 
> > > returned from getxattrs r=-2 2015-11-02 14:30:44.549117
> > > 7f08ef7fe700 10 librados: getxattrs 
> > > oid=default.157931.5_hive/staging_hive_2015-11-01_14-57-40_861_379
> > > 7779765210222008-1/_tmp.-ext-1/
> > > nspace=
> > > 
> > > I dumped whole object list, and there's no object named starts 
> > > with hive/..., and hive is not usable now, please help.
> > > 
> > 
> > Sorry, I forgot this:
> > 
> > # ceph -v
> > ceph version 0.94.2 (5fb85614ca8f354284c713a2f9c610860720bbf3)
> > 
> > Other known "directories" under THE same bucket is readable.
> > 
> >   
> 
> Also happened to others on ceph-users maillist, seems not resolved:
> 
> http://article.gmane.org/gmane.comp.file-systems.ceph.user/7653/match=
> objecter+returned+getxattrs
> 
> 



--
张绍文
___
Ceph-cn mailing list
ceph...@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-cn-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph new osd addition and client disconnected

2015-11-02 Thread Chris Taylor
 

I would double check the network configuration on the new node.
Including hosts files and DNS names. Do all the host names resolve to
the correct IP addresses from all hosts? 

"... 192.168.112.231:6800/49908 >> 192.168.113.42:0/599324131 ..." 

Looks like the communication between subnets is a problem. Is
xxx.xxx.113.xxx a typo? If that's correct, check MTU sizes. Are they
configured correctly on the switch and all NICs? 

Is there any iptables/firewall rules that could be blocking traffic
between hosts? 

Hope that helps, 

Chris 

On 2015-11-02 9:18 pm, gjprabu wrote: 

> Hi, 
> 
> Anybody please help me on this issue. 
> 
> Regards 
> Prabu 
> 
>  On Mon, 02 Nov 2015 17:54:27 +0530 GJPRABU  wrote 
>  
> 
>> Hi Team, 
>> 
>> We have ceph setup with 2 OSD and replica 2 and it is mounted with ocfs2 
>> clients and its working. When we added new osd all the clients rbd mapped 
>> device disconnected and got hanged by running rbd ls or rbd map command. We 
>> waited for long hours to scale the new osd size but peering not completed 
>> event data sync finished, but client side issue was persist and thought to 
>> try old osd service stop/start, after some time rbd mapped automatically 
>> using existing map script. 
>> 
>> After service stop/start in old osd again 3rd OSD rebuild and back filling 
>> started and after some time clients rbd mapped device disconnected and got 
>> hanged by running rbd ls or rbd map command. We thought to wait till to 
>> finished data sync in 3'rd OSD and its completed, even though client side 
>> rbd not mapped. After we restarted all mon and osd service and client side 
>> issue got fixed and mounted rbd. We suspected some issue in our setup. also 
>> attached logs for your reference. 
>> 
>> Something we are missing in our setup i don't know, highly appreciated if 
>> anybody help us to solve this issue. 
>> 
>> Before new osd.2 addition : 
>> 
>> osd.0 - size : 13T and used 2.7 T 
>> osd.1 - size : 13T and used 2.7 T 
>> 
>> After new osd addition : 
>> osd.0 size : 13T and used 1.8T 
>> osd.1 size : 13T and used 2.1T 
>> osd.2 size : 15T and used 2.5T 
>> 
>> rbd ls 
>> repo / integrepository (pg_num: 126) 
>> rbd / integdownloads (pg_num: 64) 
>> 
>> Also we would like to know few clarifications . 
>> 
>> If any new osd will be added whether all client will be unmounted 
>> automatically . 
>> 
>> While add new osd can we access ( read / write ) from client machines ? 
>> 
>> How much data will be added in new osd - without change any repilca / pg_num 
>> ? 
>> 
>> How long to take finish this process ? 
>> 
>> If we missed any common configuration - please share the same . 
>> 
>> ceph.conf 
>> [global] 
>> fsid = 944fa0af-b7be-45a9-93ff-b9907cfaee3f 
>> mon_initial_members = integ-hm5, integ-hm6, integ-hm7 
>> mon_host = 192.168.112.192,192.168.112.193,192.168.112.194 
>> auth_cluster_required = cephx 
>> auth_service_required = cephx 
>> auth_client_required = cephx 
>> filestore_xattr_use_omap = true 
>> osd_pool_default_size = 2 
>> 
>> [mon] 
>> mon_clock_drift_allowed = .500 
>> 
>> [client] 
>> rbd_cache = false 
>> 
>> Current Logs from new osd also attached old logs. 
>> 
>> 2015-11-02 12:47:48.481641 7f386f691700 0 bad crc in data 3889133030 != exp 
>> 2857248268 
>> 2015-11-02 12:47:48.482230 7f386f691700 0 -- 192.168.112.231:6800/49908 >> 
>> 192.168.113.42:0/599324131 pipe(0x170d2000 sd=28 :6800 s=0 pgs=0 cs=0 l=0 
>> c=0xc510580).accept peer addr is really 192.168.113.42:0/599324131 (socket 
>> is 192.168.113.42:42530/0) 
>> 2015-11-02 12:47:48.483951 7f386f691700 0 bad crc in data 3192803598 != exp 
>> 1083014631 
>> 2015-11-02 12:47:48.484512 7f386f691700 0 -- 192.168.112.231:6800/49908 >> 
>> 192.168.113.42:0/599324131 pipe(0x170ea000 sd=28 :6800 s=0 pgs=0 cs=0 l=0 
>> c=0xc516f60).accept peer addr is really 192.168.113.42:0/599324131 (socket 
>> is 192.168.113.42:42531/0) 
>> 2015-11-02 12:47:48.486284 7f386f691700 0 bad crc in data 133120597 != exp 
>> 393328400 
>> 2015-11-02 12:47:48.486777 7f386f691700 0 -- 192.168.112.231:6800/49908 >> 
>> 192.168.113.42:0/599324131 pipe(0x16a18000 sd=28 :6800 s=0 pgs=0 cs=0 l=0 
>> c=0xc514620).accept peer addr is really 192.168.113.42:0/599324131 (socket 
>> is 192.168.113.42:42532/0) 
>> 2015-11-02 12:47:48.488624 7f386f691700 0 bad crc in data 3299720069 != exp 
>> 211350069 
>> 2015-11-02 12:47:48.489100 7f386f691700 0 -- 192.168.112.231:6800/49908 >> 
>> 192.168.113.42:0/599324131 pipe(0x170d2000 sd=28 :6800 s=0 pgs=0 cs=0 l=0 
>> c=0xc513860).accept peer addr is really 192.168.113.42:0/599324131 (socket 
>> is 192.168.113.42:42533/0) 
>> 2015-11-02 12:47:48.490911 7f386f691700 0 bad crc in data 2381447347 != exp 
>> 1177846878 
>> 2015-11-02 12:47:48.491390 7f386f691700 0 -- 192.168.112.231:6800/49908 >> 
>> 192.168.113.42:0/599324131 pipe(0x170ea000 sd=28 :6800 s=0 pgs=0 cs=0 l=0 
>> c=0xc513700).accept peer addr is really 192.168.113.42:0/599324131 (socket 
>> is 

[ceph-users] rgw max-buckets

2015-11-02 Thread Derek Yarnell
Hi,

We want to have users that can authenticate to our RGW but have no quota
and cannot create buckets.  The former is working by setting the
user_quota max_size_kb to 0.  The latter does not seem to be working:
setting max_buckets to 0 means no limit on the number of buckets that
can be created, and this field does not seem to support something like
-1.  Setting max_buckets to 1 does correctly return 400 TooManyBuckets
when trying to create the 2nd bucket.
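
For reference, the settings described above map onto commands like these (the uid 
is a placeholder):

$ radosgw-admin quota set --quota-scope=user --uid=someuser --max-size-kb=0
$ radosgw-admin quota enable --quota-scope=user --uid=someuser
$ radosgw-admin user modify --uid=someuser --max-buckets=1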

Is there anyway to not allow any bucket creation?

Thanks,
derek

-- 
Derek T. Yarnell
University of Maryland
Institute for Advanced Computer Studies
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] data size less than 4 mb

2015-11-02 Thread mad Engineer
thanks Gregory

On Sun, Nov 1, 2015 at 12:06 AM, Gregory Farnum  wrote:

> On Friday, October 30, 2015, mad Engineer 
> wrote:
>
>> I am learning ceph block storage and read that each object's size is 4 MB. I
>> am still not clear about the concepts of object storage: what will happen if
>> the actual size of data written to the block is less than 4 MB, let's say 1
>> MB? Will it still create a 4 MB object and keep the rest of the
>> space free and unusable?
>>
>
> No, it will only take up as much space as you write (plus some metadata).
> Although I think RBD passes down io hints suggesting the object's final
> size will be 4MB so that the underlying storage (eg xfs) can prevent
> fragmentation.
> -Greg
>
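
A quick way to see this at the plain RADOS level (the temp file and object name 
are throwaway examples; RBD's internal data objects behave the same way):

$ dd if=/dev/zero of=/tmp/1mb bs=1M count=1
$ rados -p rbd put testobj /tmp/1mb
$ rados -p rbd stat testobj    # reports size 1048576, i.e. only what was written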
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] two or three replicas?

2015-11-02 Thread Wah Peng

Hello,

For a production application (for example, OpenStack's block storage), is 
it better to set up data to be stored with two replicas, or three 
replicas? Do two replicas give better performance and lower cost?
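
For context, the replica count is a per-pool setting, so both options can be tried 
and measured; e.g. (the pool name is an example):

$ ceph osd pool set volumes size 3
$ ceph osd pool set volumes min_size 2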


Thanks.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Cloudstack agent crashed JVM with exception in librbd

2015-11-02 Thread Jason Dillaman
Most likely not going to be related to 13045 since you aren't actively 
exporting an image diff.  The most likely problem is that the RADOS IO context 
is being closed prior to closing the RBD image.

-- 

Jason Dillaman 


- Original Message - 

> From: "Voloshanenko Igor" 
> To: "Ceph Users" 
> Sent: Thursday, October 29, 2015 5:27:17 PM
> Subject: Re: [ceph-users] Cloudstack agent crashed JVM with exception in
> librbd

> From all we analyzed, it looks like it's this issue:
> http://tracker.ceph.com/issues/13045

> PR: https://github.com/ceph/ceph/pull/6097

> Can anyone help us to confirm this? :)

> 2015-10-29 23:13 GMT+02:00 Voloshanenko Igor < igor.voloshane...@gmail.com >
> :

> > Additional trace:
> 

> > #0 0x7f30f9891cc9 in __GI_raise (sig=sig@entry=6) at
> > ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> 
> > #1 0x7f30f98950d8 in __GI_abort () at abort.c:89
> 
> > #2 0x7f30f87b36b5 in __gnu_cxx::__verbose_terminate_handler() () from
> > /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> 
> > #3 0x7f30f87b1836 in ?? () from
> > /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> 
> > #4 0x7f30f87b1863 in std::terminate() () from
> > /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> 
> > #5 0x7f30f87b1aa2 in __cxa_throw () from
> > /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> 
> > #6 0x7f2fddb50778 in ceph::__ceph_assert_fail
> > (assertion=assertion@entry=0x7f2fdddeca05 "sub < m_subsys.size()",
> 
> > file=file@entry=0x7f2fdddec9f0 "./log/SubsystemMap.h", line=line@entry=62,
> 
> > func=func@entry=0x7f2fdddedba0
> > <_ZZN4ceph3log12SubsystemMap13should_gatherEjiE19__PRETTY_FUNCTION__> "bool
> > ceph::log::SubsystemMap::should_gather(unsigned int, int)") at
> > common/assert.cc:77
> 
> > #7 0x7f2fdda1fed2 in ceph::log::SubsystemMap::should_gather
> > (level=, sub=, this=)
> 
> > at ./log/SubsystemMap.h:62
> 
> > #8 0x7f2fdda3b693 in ceph::log::SubsystemMap::should_gather
> > (this=, sub=, level=)
> 
> > at ./log/SubsystemMap.h:61
> 
> > #9 0x7f2fddd879be in ObjectCacher::flusher_entry (this=0x7f2ff80b27a0)
> > at
> > osdc/ObjectCacher.cc:1527
> 
> > #10 0x7f2fddd9851d in ObjectCacher::FlusherThread::entry
> > (this= > out>) at osdc/ObjectCacher.h:374
> 
> > #11 0x7f30f9c28182 in start_thread (arg=0x7f2e1a7fc700) at
> > pthread_create.c:312
> 
> > #12 0x7f30f995547d in clone () at
> > ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
> 

> > 2015-10-29 17:38 GMT+02:00 Voloshanenko Igor < igor.voloshane...@gmail.com
> > >
> > :
> 

> > > Hi Wido and all community.
> > 
> 

> > > We catched very idiotic issue on our Cloudstack installation, which
> > > related
> > > to ceph and possible to java-rados lib.
> > 
> 

> > > So, we have constantly agent crashed (which cause very big problem for
> > > us...
> > > ).
> > 
> 

> > > When agent crashed - it's crash JVM. And no event in logs at all.
> > 
> 
> > > We enabled crush dump, and after crash we see next picture:
> > 
> 

> > > #grep -A1 "Problematic frame" < /hs_err_pid30260.log
> > 
> 
> > > Problematic frame:
> > 
> 
> > > C [librbd.so.1.0.0+0x5d681]
> > 
> 

> > > # gdb /usr/lib/librbd.so.1.0.0 /var/tmp/cores/jsvc.25526.0.core
> > 
> 
> > > (gdb) bt
> > 
> 
> > > ...
> > 
> 
> > > #7 0x7f30b9a1fed2 in ceph::log::SubsystemMap::should_gather
> > > (level=, sub=, this=)
> > 
> 
> > > at ./log/SubsystemMap.h:62
> > 
> 
> > > #8 0x7f30b9a3b693 in ceph::log::SubsystemMap::should_gather
> > > (this=, sub=, level=)
> > 
> 
> > > at ./log/SubsystemMap.h:61
> > 
> 
> > > #9 0x7f30b9d879be in ObjectCacher::flusher_entry
> > > (this=0x7f2fb4017910)
> > > at
> > > osdc/ObjectCacher.cc:1527
> > 
> 
> > > #10 0x7f30b9d9851d in ObjectCacher::FlusherThread::entry
> > > (this= > > out>) at osdc/ObjectCacher.h:374
> > 
> 

> > > From ceph code, this part executed when flushing cache object... And we
> > > don;t
> > > understand why. Becasue we have absolutely different race condition to
> > > reproduce it.
> > 
> 

> > > As cloudstack have not good implementation yet of snapshot lifecycle,
> > > sometime, it's happen, that some volumes already marked as EXPUNGED in DB
> > > and then cloudstack try to delete bas Volume, before it's try to
> > > unprotect
> > > it.
> > 
> 

> > > Sure, unprotecting fail, normal exception returned back (fail because
> > > snap
> > > has childs... )
> > 
> 

> > > 2015-10-29 09:02:19,401 DEBUG [kvm.resource.KVMHAMonitor]
> > > (Thread-1304:null)
> > > Executing:
> > > /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/kvmheartbeat.sh -i
> > > 10.44.253.13 -p /var/lib/libvirt/PRIMARY -m
> > > /mnt/93655746-a9ef-394d-95e9-6e62471dd39f -h 10.44.253.11
> > 
> 
> > > 2015-10-29 09:02:19,412 DEBUG [kvm.resource.KVMHAMonitor]
> > > (Thread-1304:null)
> > > Execution is successful.
> > 
> 
> > > 2015-10-29 09:02:19,554 INFO [kvm.storage.LibvirtStorageAdaptor]
> > > (agentRequest-Handler-5:null) Unprotecting and Removing RBD snapshots of
> > > 

Re: [ceph-users] Cloudstack agent crashed JVM with exception in librbd

2015-11-02 Thread Voloshanenko Igor
Dear all, can anybody help?

2015-10-30 10:37 GMT+02:00 Voloshanenko Igor :

> It's pain, but not... :(
> We already used your updated lib in dev env... :(
>
> 2015-10-30 10:06 GMT+02:00 Wido den Hollander :
>
>>
>>
>> On 29-10-15 16:38, Voloshanenko Igor wrote:
>> > Hi Wido and all community.
>> >
>> > We catched very idiotic issue on our Cloudstack installation, which
>> > related to ceph and possible to java-rados lib.
>> >
>>
>> I think you ran into this one:
>> https://issues.apache.org/jira/browse/CLOUDSTACK-8879
>>
>> Cleaning up RBD snapshots for volumes didn't go well and caused the JVM
>> to crash.
>>
>> Wido
>>
>> > So, we have constantly agent crashed (which cause very big problem for
>> > us... ).
>> >
>> > When agent crashed - it's crash JVM. And no event in logs at all.
>> > We enabled crush dump, and after crash we see next picture:
>> >
>> > #grep -A1 "Problematic frame" < /hs_err_pid30260.log
>> >  Problematic frame:
>> >  C  [librbd.so.1.0.0+0x5d681]
>> >
>> > # gdb /usr/lib/librbd.so.1.0.0 /var/tmp/cores/jsvc.25526.0.core
>> > (gdb)  bt
>> > ...
>> > #7  0x7f30b9a1fed2 in ceph::log::SubsystemMap::should_gather
>> > (level=, sub=, this=)
>> > at ./log/SubsystemMap.h:62
>> > #8  0x7f30b9a3b693 in ceph::log::SubsystemMap::should_gather
>> > (this=, sub=, level=)
>> > at ./log/SubsystemMap.h:61
>> > #9  0x7f30b9d879be in ObjectCacher::flusher_entry
>> > (this=0x7f2fb4017910) at osdc/ObjectCacher.cc:1527
>> > #10 0x7f30b9d9851d in ObjectCacher::FlusherThread::entry
>> > (this=) at osdc/ObjectCacher.h:374
>> >
>> > From ceph code, this part executed when flushing cache object... And we
>> > don;t understand why. Becasue we have absolutely different race
>> > condition to reproduce it.
>> >
>> > As cloudstack have not good implementation yet of snapshot lifecycle,
>> > sometime, it's happen, that some volumes already marked as EXPUNGED in
>> > DB and then cloudstack try to delete bas Volume, before it's try to
>> > unprotect it.
>> >
>> > Sure, unprotecting fail, normal exception returned back (fail because
>> > snap has childs... )
>> >
>> > 2015-10-29 09:02:19,401 DEBUG [kvm.resource.KVMHAMonitor]
>> > (Thread-1304:null) Executing:
>> > /usr/share/cloudstack-common/scripts/vm/hypervisor/kvm/kvmheartbeat.sh
>> > -i 10.44.253.13 -p /var/lib/libvirt/PRIMARY -m
>> > /mnt/93655746-a9ef-394d-95e9-6e62471dd39f -h 10.44.253.11
>> > 2015-10-29 09:02:19,412 DEBUG [kvm.resource.KVMHAMonitor]
>> > (Thread-1304:null) Execution is successful.
>> > 2015-10-29 09:02:19,554 INFO  [kvm.storage.LibvirtStorageAdaptor]
>> > (agentRequest-Handler-5:null) Unprotecting and Removing RBD snapshots of
>> > image 6789/71b1e2e9-1985-45ca-9ab6-9e5016b86b7c prior to removing the
>> image
>> > 2015-10-29 09:02:19,571 DEBUG [kvm.storage.LibvirtStorageAdaptor]
>> > (agentRequest-Handler-5:null) Succesfully connected to Ceph cluster at
>> > cephmon.anolim.net:6789 
>> > 2015-10-29 09:02:19,608 DEBUG [kvm.storage.LibvirtStorageAdaptor]
>> > (agentRequest-Handler-5:null) Unprotecting snapshot
>> > cloudstack/71b1e2e9-1985-45ca-9ab6-9e5016b86b7c@cloudstack-base-snap
>> > 2015-10-29 09:02:19,627 DEBUG [kvm.storage.KVMStorageProcessor]
>> > (agentRequest-Handler-5:null) Failed to delete volume:
>> > com.cloud.utils.exception.CloudRuntimeException:
>> > com.ceph.rbd.RbdException: Failed to unprotect snapshot
>> cloudstack-base-snap
>> > 2015-10-29 09:02:19,628 DEBUG [cloud.agent.Agent]
>> > (agentRequest-Handler-5:null) Seq 4-1921583831:  { Ans: , MgmtId:
>> > 161344838950, via: 4, Ver: v1, Flags: 10,
>> >
>> [{"com.cloud.agent.api.Answer":{"result":false,"details":"com.cloud.utils.exception.CloudRuntimeException:
>> > com.ceph.rbd.RbdException: Failed to unprotect snapshot
>> > cloudstack-base-snap","wait":0}}] }
>> > 2015-10-29 09:02:25,722 DEBUG [cloud.agent.Agent]
>> > (agentRequest-Handler-2:null) Processing command:
>> > com.cloud.agent.api.GetHostStatsCommand
>> > 2015-10-29 09:02:25,722 DEBUG [kvm.resource.LibvirtComputingResource]
>> > (agentRequest-Handler-2:null) Executing: /bin/bash -c idle=$(top -b -n
>> > 1| awk -F, '/^[%]*[Cc]pu/{$0=$4; gsub(/[^0-9.,]+/,""); print }'); echo
>> $idle
>> > 2015-10-29 09:02:26,249 DEBUG [kvm.resource.LibvirtComputingResource]
>> > (agentRequest-Handler-2:null) Execution is successful.
>> > 2015-10-29 09:02:26,250 DEBUG [kvm.resource.LibvirtComputingResource]
>> > (agentRequest-Handler-2:null) Executing: /bin/bash -c
>> > freeMem=$(free|grep cache:|awk '{print $4}');echo $freeMem
>> > 2015-10-29 09:02:26,254 DEBUG [kvm.resource.LibvirtComputingResource]
>> > (agentRequest-Handler-2:null) Execution is successful.
>> >
>> > BUT, after 20 minutes - agent crashed... If we remove all childs and
>> > create conditions for cloudstack to delete volume - all OK - no agent
>> > crash in 20 minutes...
>> >
>> > We can't connect this action - Volume delete with agent crashe... Also
>> > we don't 

Re: [ceph-users] data size less than 4 mb

2015-11-02 Thread Jan Schermer
Can those hints be disabled somehow? I was battling XFS preallocation the other 
day, and the mount option didn't make any difference - maybe because those 
hints have precedence (which could mean they aren't working as they should), 
maybe not.

In particular, when you fallocate a file, some number of blocks will be 
reserved without actually being allocated. When you then dirty a block with a 
write and flush, metadata needs to be written (in the journal, synchronously) 
<- this is slow on all drives, and extremely slow on sh*tty drives (a 
benchmark on such a file will yield just 100 write IOPS, but if you allocate 
the file beforehand with dd if=/dev/zero it will do 6000 IOPS!) - and there 
doesn't seem to be a way to disable it in XFS. Not sure if the hints should help 
or if they are actually causing the problem (I am not clear on whether they 
preallocate metadata blocks or just reserve a block count). Ext4 does the same thing.
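
To make that concrete, a rough sketch (not Ceph-specific; paths and sizes are
arbitrary) of the two preallocation styles being compared - it only prepares
the files, the IOPS gap then shows up when you run a sync random-write fio job
against each of them:

    # Rough sketch of the two preallocation styles; Python 3 on Linux assumed.
    import os

    SIZE = 1 << 30              # 1 GiB test file
    CHUNK = b"\0" * (4 << 20)   # write in 4 MiB pieces

    # fallocate-style: space is reserved, but the extents stay "unwritten",
    # so the first write to each block still forces a journalled metadata update.
    with open("/tmp/prealloc_fallocate", "wb") as f:
        os.posix_fallocate(f.fileno(), 0, SIZE)

    # dd-style: every block is written once up front (like dd if=/dev/zero),
    # so later random writes land on already-initialised extents.
    with open("/tmp/prealloc_written", "wb") as f:
        for _ in range(SIZE // len(CHUNK)):
            f.write(CHUNK)
        f.flush()
        os.fsync(f.fileno())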

Might be worth looking into?

Jan


> On 31 Oct 2015, at 19:36, Gregory Farnum  wrote:
> 
> On Friday, October 30, 2015, mad Engineer  > wrote:
> i am learning ceph,block storage and read that each object size is 4 Mb.I am 
> not clear about the concepts of object storage still what will happen if the 
> actual size of data written to block is less than 4 Mb lets say 1 Mb.Will it 
> still create object with 4 mb size and keep the rest of the space free and 
> unusable?
> 
> No, it will only take up as much space as you write (plus some metadata). 
> Although I think RBD passes down io hints suggesting the object's final size 
> will be 4MB so that the underlying storage (eg xfs) can prevent fragmentation.
> -Greg
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] data size less than 4 mb

2015-11-02 Thread Jan Schermer

> On 02 Nov 2015, at 11:59, Wido den Hollander  wrote:
> 
> 
> 
> On 02-11-15 11:56, Jan Schermer wrote:
>> Can those hints be disabled somehow? I was battling XFS preallocation
>> the other day, and the mount option didn't make any difference - maybe
>> because those hints have precedence (which could mean they aren't
>> working as they should), maybe not.
>> 
> 
> This config option?
> 
> OPTION(rbd_enable_alloc_hint, OPT_BOOL, true) // when writing a object,
> it will issue a hint to osd backend to indicate the expected size object
> need
> 
> Found in src/common/config_opts.h
> 
> Wido
> 

Thanks, but can this option be set for a whole OSD by default?

Jan

>> In particular, when you fallocate a file, some number of blocks will be
>> reserved without actually allocating the blocks. When you then dirty a
>> block with write and flush, metadata needs to be written (in journal,
>> synchronously) <- this is slow with all drives, and extremely slow with
>> sh*tty drives (doing benchmark on such a file will yield just 100 write
>> IOPs, but when you allocate the file previously with dd if=/dev/zero it
>> will have 6000 IOPs!) - and there doesn't seem to be a way to disable it
>> in XFS. Not sure if hints should help or if they are actually causing
>> the problem (I am not clear on whether they preallocate metadata blocks
>> or just block count). Ext4 does the same thing.
>> 
>> Might be worth looking into?
>> 
>> Jan
>> 
>> 
>>> On 31 Oct 2015, at 19:36, Gregory Farnum >> > wrote:
>>> 
>>> On Friday, October 30, 2015, mad Engineer >> > wrote:
>>> 
>>>i am learning ceph,block storage and read that each object size is
>>>4 Mb.I am not clear about the concepts of object storage still
>>>what will happen if the actual size of data written to block is
>>>less than 4 Mb lets say 1 Mb.Will it still create object with 4 mb
>>>size and keep the rest of the space free and unusable?
>>> 
>>> 
>>> No, it will only take up as much space as you write (plus some
>>> metadata). Although I think RBD passes down io hints suggesting the
>>> object's final size will be 4MB so that the underlying storage (eg
>>> xfs) can prevent fragmentation.
>>> -Greg
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com 
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> 
>> 
>> 
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Changing CRUSH map ids

2015-11-02 Thread Wido den Hollander


On 02-11-15 12:30, Loris Cuoghi wrote:
> Hi All,
> 
> We're currently on version 0.94.5 with three monitors and 75 OSDs.
> 
> I've peeked at the decompiled CRUSH map, and I see that all ids are
> commented with '# Here be dragons!', or more literally : '# do not
> change unnecessarily'.
> 
> Now, what would happen if an incautious user would happen to put his
> chubby fingers on this ids, totally disregarding the warning at the
> entrance of the cave, and change one of them?
> 
> Data shuffle? (Relative to the allocation of PGs for the OSD/host/other
> item?)
> 
> A *big* data shuffle? (ALL data would need to have its position
> recalculated, with immediate end-of-the-world data shuffle?)
> 
> Nothing at all? (And the big fat warning is there only to take fun on
> the uninstructed ones? Not plausible...)
> 

Give it a try! Download the CRUSHMap and run tests on it with crushtool:

$ crushtool -i mycrushmap --test --rule 0 --num-rep 3 --show-statistics

Now, change the map, compile it and run again:

$ crushtool -i mycrushmap.new --test --rule 0 --num-rep 3 --show-statistics

Check the differences and you get the idea of how much has changed.

Wido

> Thanks !
> 
> Loris
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] data size less than 4 mb

2015-11-02 Thread Wido den Hollander


On 02-11-15 11:56, Jan Schermer wrote:
> Can those hints be disabled somehow? I was battling XFS preallocation
> the other day, and the mount option didn't make any difference - maybe
> because those hints have precedence (which could mean they aren't
> working as they should), maybe not.
> 

This config option?

OPTION(rbd_enable_alloc_hint, OPT_BOOL, true) // when writing a object,
it will issue a hint to osd backend to indicate the expected size object
need

Found in src/common/config_opts.h

Wido

> In particular, when you fallocate a file, some number of blocks will be
> reserved without actually allocating the blocks. When you then dirty a
> block with write and flush, metadata needs to be written (in journal,
> synchronously) <- this is slow with all drives, and extremely slow with
> sh*tty drives (doing benchmark on such a file will yield just 100 write
> IOPs, but when you allocate the file previously with dd if=/dev/zero it
> will have 6000 IOPs!) - and there doesn't seem to be a way to disable it
> in XFS. Not sure if hints should help or if they are actually causing
> the problem (I am not clear on whether they preallocate metadata blocks
> or just block count). Ext4 does the same thing.
> 
> Might be worth looking into?
> 
> Jan
> 
> 
>> On 31 Oct 2015, at 19:36, Gregory Farnum > > wrote:
>>
>> On Friday, October 30, 2015, mad Engineer > > wrote:
>>
>> i am learning ceph,block storage and read that each object size is
>> 4 Mb.I am not clear about the concepts of object storage still
>> what will happen if the actual size of data written to block is
>> less than 4 Mb lets say 1 Mb.Will it still create object with 4 mb
>> size and keep the rest of the space free and unusable?
>>
>>
>> No, it will only take up as much space as you write (plus some
>> metadata). Although I think RBD passes down io hints suggesting the
>> object's final size will be 4MB so that the underlying storage (eg
>> xfs) can prevent fragmentation.
>> -Greg
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com 
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Changing CRUSH map ids

2015-11-02 Thread Loris Cuoghi

Hi All,

We're currently on version 0.94.5 with three monitors and 75 OSDs.

I've peeked at the decompiled CRUSH map, and I see that all ids are 
commented with '# Here be dragons!', or more literally : '# do not 
change unnecessarily'.


Now, what would happen if an incautious user were to put his 
chubby fingers on these ids, totally disregarding the warning at the 
entrance of the cave, and change one of them?


Data shuffle? (Relative to the allocation of PGs for the OSD/host/other 
item?)


A *big* data shuffle? (ALL data would need to have its position 
recalculated, with immediate end-of-the-world data shuffle?)


Nothing at all? (And the big fat warning is there only to make fun of 
the uninstructed? Not plausible...)


Thanks !

Loris
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Cloudstack agent crashed JVM with exception in librbd

2015-11-02 Thread Voloshanenko Igor
Thank you, Jason!

Any advice for troubleshooting?

I'm looking in the code, and right now I don't see anything bad :(

2015-11-03 1:32 GMT+02:00 Jason Dillaman :

> Most likely not going to be related to 13045 since you aren't actively
> exporting an image diff.  The most likely problem is that the RADOS IO
> context is being closed prior to closing the RBD image.
>
> --
>
> Jason Dillaman
>
>
> - Original Message -
>
> > From: "Voloshanenko Igor" 
> > To: "Ceph Users" 
> > Sent: Thursday, October 29, 2015 5:27:17 PM
> > Subject: Re: [ceph-users] Cloudstack agent crashed JVM with exception in
> > librbd
>
> > From all we analyzed - look like - it's this issue
> > http://tracker.ceph.com/issues/13045
>
> > PR: https://github.com/ceph/ceph/pull/6097
>
> > Can anyone help us to confirm this? :)
>
> > 2015-10-29 23:13 GMT+02:00 Voloshanenko Igor <
> igor.voloshane...@gmail.com >
> > :
>
> > > Additional trace:
> >
>
> > > #0 0x7f30f9891cc9 in __GI_raise (sig=sig@entry=6) at
> > > ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> >
> > > #1 0x7f30f98950d8 in __GI_abort () at abort.c:89
> >
> > > #2 0x7f30f87b36b5 in __gnu_cxx::__verbose_terminate_handler() ()
> from
> > > /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> >
> > > #3 0x7f30f87b1836 in ?? () from
> > > /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> >
> > > #4 0x7f30f87b1863 in std::terminate() () from
> > > /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> >
> > > #5 0x7f30f87b1aa2 in __cxa_throw () from
> > > /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> >
> > > #6 0x7f2fddb50778 in ceph::__ceph_assert_fail
> > > (assertion=assertion@entry=0x7f2fdddeca05 "sub < m_subsys.size()",
> >
> > > file=file@entry=0x7f2fdddec9f0 "./log/SubsystemMap.h", line=line@entry
> =62,
> >
> > > func=func@entry=0x7f2fdddedba0
> > > <_ZZN4ceph3log12SubsystemMap13should_gatherEjiE19__PRETTY_FUNCTION__>
> "bool
> > > ceph::log::SubsystemMap::should_gather(unsigned int, int)") at
> > > common/assert.cc:77
> >
> > > #7 0x7f2fdda1fed2 in ceph::log::SubsystemMap::should_gather
> > > (level=, sub=, this=)
> >
> > > at ./log/SubsystemMap.h:62
> >
> > > #8 0x7f2fdda3b693 in ceph::log::SubsystemMap::should_gather
> > > (this=, sub=, level=)
> >
> > > at ./log/SubsystemMap.h:61
> >
> > > #9 0x7f2fddd879be in ObjectCacher::flusher_entry
> (this=0x7f2ff80b27a0)
> > > at
> > > osdc/ObjectCacher.cc:1527
> >
> > > #10 0x7f2fddd9851d in ObjectCacher::FlusherThread::entry
> > > (this= > > out>) at osdc/ObjectCacher.h:374
> >
> > > #11 0x7f30f9c28182 in start_thread (arg=0x7f2e1a7fc700) at
> > > pthread_create.c:312
> >
> > > #12 0x7f30f995547d in clone () at
> > > ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
> >
>
> > > 2015-10-29 17:38 GMT+02:00 Voloshanenko Igor <
> igor.voloshane...@gmail.com
> > > >
> > > :
> >
>
> > > > Hi Wido and all community.
> > >
> >
>
> > > > We catched very idiotic issue on our Cloudstack installation, which
> > > > related
> > > > to ceph and possible to java-rados lib.
> > >
> >
>
> > > > So, we have constantly agent crashed (which cause very big problem
> for
> > > > us...
> > > > ).
> > >
> >
>
> > > > When agent crashed - it's crash JVM. And no event in logs at all.
> > >
> >
> > > > We enabled crush dump, and after crash we see next picture:
> > >
> >
>
> > > > #grep -A1 "Problematic frame" < /hs_err_pid30260.log
> > >
> >
> > > > Problematic frame:
> > >
> >
> > > > C [librbd.so.1.0.0+0x5d681]
> > >
> >
>
> > > > # gdb /usr/lib/librbd.so.1.0.0 /var/tmp/cores/jsvc.25526.0.core
> > >
> >
> > > > (gdb) bt
> > >
> >
> > > > ...
> > >
> >
> > > > #7 0x7f30b9a1fed2 in ceph::log::SubsystemMap::should_gather
> > > > (level=, sub=, this=)
> > >
> >
> > > > at ./log/SubsystemMap.h:62
> > >
> >
> > > > #8 0x7f30b9a3b693 in ceph::log::SubsystemMap::should_gather
> > > > (this=, sub=, level=)
> > >
> >
> > > > at ./log/SubsystemMap.h:61
> > >
> >
> > > > #9 0x7f30b9d879be in ObjectCacher::flusher_entry
> > > > (this=0x7f2fb4017910)
> > > > at
> > > > osdc/ObjectCacher.cc:1527
> > >
> >
> > > > #10 0x7f30b9d9851d in ObjectCacher::FlusherThread::entry
> > > > (this= > > > out>) at osdc/ObjectCacher.h:374
> > >
> >
>
> > > > From ceph code, this part executed when flushing cache object... And
> we
> > > > don;t
> > > > understand why. Becasue we have absolutely different race condition
> to
> > > > reproduce it.
> > >
> >
>
> > > > As cloudstack have not good implementation yet of snapshot lifecycle,
> > > > sometime, it's happen, that some volumes already marked as EXPUNGED
> in DB
> > > > and then cloudstack try to delete bas Volume, before it's try to
> > > > unprotect
> > > > it.
> > >
> >
>
> > > > Sure, unprotecting fail, normal exception returned back (fail because
> > > > snap
> > > > has childs... )
> > >
> >
>
> > > > 2015-10-29 09:02:19,401 DEBUG [kvm.resource.KVMHAMonitor]
> > > > (Thread-1304:null)
> > > > Executing:
> > > >

Re: [ceph-users] [performance] why rbd_aio_write latency increase from 4ms to 7.3ms after the same test

2015-11-02 Thread hzwuli...@gmail.com
Hi,

Thank you, that makes sense for testing, but I'm afraid not in my case.
Even if I test on a volume that has already been tested many times, the IOPS 
will not go back up. Yeah, I mean, this VM is broken; the IOPS of the VM will 
never go back up.

Thanks!



hzwuli...@gmail.com
 
From: Chen, Xiaoxi
Date: 2015-11-02 14:11
To: hzwulibin; ceph-devel; ceph-users
Subject: RE: [performance] why rbd_aio_write latency increase from 4ms to 7.3ms 
after the same test
Pre-allocate the volume with "dd" across the entire RBD before you do any 
performance test :).
 
In this case, you may want to re-create the RBD, pre-allocate and try again.
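
For reference, a hypothetical way to do that thick-provisioning from a client
host with the python rbd bindings (pool and image names are placeholders) -
inside the guest, plain dd if=/dev/zero against the block device achieves the
same effect:

    # Hypothetical prefill of an RBD image so later random writes hit
    # already-allocated objects; equivalent in effect to dd'ing zeroes
    # across the whole device.
    import rados
    import rbd

    CHUNK = 4 << 20   # 4 MiB steps, matching the default RBD object size

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('volumes')
    image = rbd.Image(ioctx, 'test-volume')
    try:
        size = image.size()
        for off in range(0, size, CHUNK):
            image.write(b'\0' * min(CHUNK, size - off), off)
    finally:
        image.close()
        ioctx.close()
        cluster.shutdown()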
 
> -Original Message-
> From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-
> ow...@vger.kernel.org] On Behalf Of hzwulibin
> Sent: Monday, November 2, 2015 1:24 PM
> To: ceph-devel; ceph-users
> Subject: [performance] why rbd_aio_write latency increase from 4ms to
> 7.3ms after the same test
> 
> Hi,
> same environment: after a test script, the io latency (from sudo ceph --
> admin-daemon /run/ceph/guests/ceph-client.*.asok perf dump) increased
> from about 4ms to 7.3ms
> 
> qemu version: debian 2.1.2
> kernel:3.10.45-openstack-amd64
> system: debian 7.8
> ceph: 0.94.5
> VM CPU number: 4  (cpu MHz : 2599.998)
> VM memory size: 16GB
> 9 OSD storage servers, with 4 SSD OSD on each, total 36 OSDs.
> 
> Test scripts in VM:
> # cat reproduce.sh
> #!/bin/bash
> 
> times=20
> for((i=1;i<=$times;i++))
> do
> tmpdate=`date "+%F-%T"`
> echo
> "===$tmpdate($i/$times)===
> "
> tmp=$((i%2))
> if [[ $tmp -eq 0 ]];then
> echo "### fio /root/vdb.cfg ###"
> fio /root/vdb.cfg
> else
> echo "### fio /root/vdc.cfg ###"
> fio /root/vdc.cfg
> fi
> done
> 
> 
> tmpdate=`date "+%F-%T"`
> echo "### [$tmpdate] fio /root/vde.cfg ###"
> fio /root/vde.cfg
> 
> 
> # cat vdb.cfg
> [global]
> rw=randwrite
> direct=1
> numjobs=64
> ioengine=sync
> bsrange=4k-4k
> runtime=180
> group_reporting
> 
> [disk01]
> filename=/dev/vdb
> 
> 
> # cat vdc.cfg
> [global]
> rw=randwrite
> direct=1
> numjobs=64
> ioengine=sync
> bsrange=4k-4k
> runtime=180
> group_reporting
> 
> [disk01]
> filename=/dev/vdc
> 
> # cat vdd.cfg
> [global]
> rw=randwrite
> direct=1
> numjobs=64
> ioengine=sync
> bsrange=4k-4k
> runtime=180
> group_reporting
> 
> [disk01]
> filename=/dev/vdd
> 
> # cat vde.cfg
> [global]
> rw=randwrite
> direct=1
> numjobs=64
> ioengine=sync
> bsrange=4k-4k
> runtime=180
> group_reporting
> 
> [disk01]
> filename=/dev/vde
> 
> After run the scripts reproduce.sh, the disks in the VM's IOPS cutdown from
> 12k to 5k, the latency increase from 4ms to 7.3ms.
> 
> run steps:
> 1. create a VM
> 2. create four volumes and attatch to the VM 3. sh reproduce.sh 4. in the
> runtime of  reproduce.sh, run "fio vdd.cfg" or "fio vde.cfg" to checkt the
> performance
> 
> After reproduce.sh finished, performance down.
> 
> 
> Anyone has the same problem or has some ideas about this?
> 
> Thanks!
> --
> hzwulibin
> 2015-11-02
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Cloudstack agent crashed JVM with exception in librbd

2015-11-02 Thread Jason Dillaman
I can't say that I know too much about Cloudstack's integration with RBD to 
offer much assistance.  Perhaps if Cloudstack is receiving an exception for 
something, it is not properly handling object lifetimes in this case.

-- 

Jason Dillaman 


- Original Message - 

> From: "Voloshanenko Igor" 
> To: "Jason Dillaman" 
> Cc: "Ceph Users" 
> Sent: Monday, November 2, 2015 7:54:23 PM
> Subject: Re: [ceph-users] Cloudstack agent crashed JVM with exception in
> librbd

> Thank you, Jason!

> Any advice, for troubleshooting

> I'm looking in code, and right now don;t see any bad things :(

> 2015-11-03 1:32 GMT+02:00 Jason Dillaman < dilla...@redhat.com > :

> > Most likely not going to be related to 13045 since you aren't actively
> > exporting an image diff. The most likely problem is that the RADOS IO
> > context is being closed prior to closing the RBD image.
> 

> > --
> 

> > Jason Dillaman
> 

> > - Original Message -
> 

> > > From: "Voloshanenko Igor" < igor.voloshane...@gmail.com >
> 
> > > To: "Ceph Users" < ceph-users@lists.ceph.com >
> 
> > > Sent: Thursday, October 29, 2015 5:27:17 PM
> 
> > > Subject: Re: [ceph-users] Cloudstack agent crashed JVM with exception in
> 
> > > librbd
> 

> > > From all we analyzed - look like - it's this issue
> 
> > > http://tracker.ceph.com/issues/13045
> 

> > > PR: https://github.com/ceph/ceph/pull/6097
> 

> > > Can anyone help us to confirm this? :)
> 

> > > 2015-10-29 23:13 GMT+02:00 Voloshanenko Igor <
> > > igor.voloshane...@gmail.com
> > > >
> 
> > > :
> 

> > > > Additional trace:
> 
> > >
> 

> > > > #0 0x7f30f9891cc9 in __GI_raise (sig=sig@entry=6) at
> 
> > > > ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> 
> > >
> 
> > > > #1 0x7f30f98950d8 in __GI_abort () at abort.c:89
> 
> > >
> 
> > > > #2 0x7f30f87b36b5 in __gnu_cxx::__verbose_terminate_handler() ()
> > > > from
> 
> > > > /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> 
> > >
> 
> > > > #3 0x7f30f87b1836 in ?? () from
> 
> > > > /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> 
> > >
> 
> > > > #4 0x7f30f87b1863 in std::terminate() () from
> 
> > > > /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> 
> > >
> 
> > > > #5 0x7f30f87b1aa2 in __cxa_throw () from
> 
> > > > /usr/lib/x86_64-linux-gnu/libstdc++.so.6
> 
> > >
> 
> > > > #6 0x7f2fddb50778 in ceph::__ceph_assert_fail
> 
> > > > (assertion=assertion@entry=0x7f2fdddeca05 "sub < m_subsys.size()",
> 
> > >
> 
> > > > file=file@entry=0x7f2fdddec9f0 "./log/SubsystemMap.h",
> > > > line=line@entry=62,
> 
> > >
> 
> > > > func=func@entry=0x7f2fdddedba0
> 
> > > > <_ZZN4ceph3log12SubsystemMap13should_gatherEjiE19__PRETTY_FUNCTION__>
> > > > "bool
> 
> > > > ceph::log::SubsystemMap::should_gather(unsigned int, int)") at
> 
> > > > common/assert.cc:77
> 
> > >
> 
> > > > #7 0x7f2fdda1fed2 in ceph::log::SubsystemMap::should_gather
> 
> > > > (level=, sub=, this=)
> 
> > >
> 
> > > > at ./log/SubsystemMap.h:62
> 
> > >
> 
> > > > #8 0x7f2fdda3b693 in ceph::log::SubsystemMap::should_gather
> 
> > > > (this=, sub=, level=)
> 
> > >
> 
> > > > at ./log/SubsystemMap.h:61
> 
> > >
> 
> > > > #9 0x7f2fddd879be in ObjectCacher::flusher_entry
> > > > (this=0x7f2ff80b27a0)
> 
> > > > at
> 
> > > > osdc/ObjectCacher.cc:1527
> 
> > >
> 
> > > > #10 0x7f2fddd9851d in ObjectCacher::FlusherThread::entry
> 
> > > > (this= 
> > > > out>) at osdc/ObjectCacher.h:374
> 
> > >
> 
> > > > #11 0x7f30f9c28182 in start_thread (arg=0x7f2e1a7fc700) at
> 
> > > > pthread_create.c:312
> 
> > >
> 
> > > > #12 0x7f30f995547d in clone () at
> 
> > > > ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
> 
> > >
> 

> > > > 2015-10-29 17:38 GMT+02:00 Voloshanenko Igor <
> > > > igor.voloshane...@gmail.com
> 
> > > > >
> 
> > > > :
> 
> > >
> 

> > > > > Hi Wido and all community.
> 
> > > >
> 
> > >
> 

> > > > > We catched very idiotic issue on our Cloudstack installation, which
> 
> > > > > related
> 
> > > > > to ceph and possible to java-rados lib.
> 
> > > >
> 
> > >
> 

> > > > > So, we have constantly agent crashed (which cause very big problem
> > > > > for
> 
> > > > > us...
> 
> > > > > ).
> 
> > > >
> 
> > >
> 

> > > > > When agent crashed - it's crash JVM. And no event in logs at all.
> 
> > > >
> 
> > >
> 
> > > > > We enabled crush dump, and after crash we see next picture:
> 
> > > >
> 
> > >
> 

> > > > > #grep -A1 "Problematic frame" < /hs_err_pid30260.log
> 
> > > >
> 
> > >
> 
> > > > > Problematic frame:
> 
> > > >
> 
> > >
> 
> > > > > C [librbd.so.1.0.0+0x5d681]
> 
> > > >
> 
> > >
> 

> > > > > # gdb /usr/lib/librbd.so.1.0.0 /var/tmp/cores/jsvc.25526.0.core
> 
> > > >
> 
> > >
> 
> > > > > (gdb) bt
> 
> > > >
> 
> > >
> 
> > > > > ...
> 
> > > >
> 
> > >
> 
> > > > > #7 0x7f30b9a1fed2 in ceph::log::SubsystemMap::should_gather
> 
> > > > > (level=, sub=, this=)
> 
> > > >
> 
> > >
> 
> > > > > at ./log/SubsystemMap.h:62
> 
> > > >

[ceph-users] ceph new osd addition and client disconnected

2015-11-02 Thread gjprabu


Hi Team,



   We have a ceph setup with 2 OSDs and replica 2; it is mounted by ocfs2 
clients and it is working. When we added a new OSD, all the clients' rbd-mapped 
devices disconnected, and running the rbd ls or rbd map command hung. We waited 
for many hours for the new OSD to fill, but peering did not complete even after 
the data sync finished; the client-side issue persisted, so we tried a service 
stop/start on the old OSDs, and after some time rbd mapped automatically again 
using our existing map script.



   After the service stop/start on the old OSDs, the 3rd OSD rebuilt and 
backfilling started again, and after some time the clients' rbd-mapped devices 
disconnected and running the rbd ls or rbd map command hung again. We decided to 
wait until the data sync on the 3rd OSD finished, and it completed, but the 
client-side rbd still did not map. After we restarted all mon and osd services, 
the client-side issue was fixed and rbd mounted again. We suspect some issue in 
our setup; logs are also attached for your reference.



  We don't know what we are missing in our setup; help in solving this issue 
would be highly appreciated.





Before new osd.2 addition :



osd.0 - size : 13T  and used 2.7 T

osd.1 - size : 13T  and used 2.7 T



After new osd addition :

osd.0  size : 13T  and used  1.8T

osd.1  size : 13T  and used  2.1T

osd.2  size : 15T  and used  2.5T



rbd ls

repo / integrepository  (pg_num: 126)

rbd / integdownloads (pg_num: 64)







Also, we would like a few clarifications.



When a new OSD is added, will all clients be unmounted automatically?



While adding a new OSD, can we access (read/write) from the client machines?



How much data will be moved to the new OSD, without changing the replica count / pg_num?



How long does this process take to finish?



If we missed any common configuration, please share it.





ceph.conf

[global]

fsid = 944fa0af-b7be-45a9-93ff-b9907cfaee3f

mon_initial_members = integ-hm5, integ-hm6, integ-hm7

mon_host = 192.168.112.192,192.168.112.193,192.168.112.194

auth_cluster_required = cephx

auth_service_required = cephx

auth_client_required = cephx

filestore_xattr_use_omap = true

osd_pool_default_size = 2



[mon]

mon_clock_drift_allowed = .500



[client]

rbd_cache = false



Current Logs from new osd also attached old logs.



2015-11-02 12:47:48.481641 7f386f691700  0 bad crc in data 3889133030 != exp 
2857248268

2015-11-02 12:47:48.482230 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x170d2000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0xc510580).accept peer addr is really 192.168.113.42:0/599324131 (socket 
is 192.168.113.42:42530/0)

2015-11-02 12:47:48.483951 7f386f691700  0 bad crc in data 3192803598 != exp 
1083014631

2015-11-02 12:47:48.484512 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x170ea000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0xc516f60).accept peer addr is really 192.168.113.42:0/599324131 (socket 
is 192.168.113.42:42531/0)

2015-11-02 12:47:48.486284 7f386f691700  0 bad crc in data 133120597 != exp 
393328400

2015-11-02 12:47:48.486777 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x16a18000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0xc514620).accept peer addr is really 192.168.113.42:0/599324131 (socket 
is 192.168.113.42:42532/0)

2015-11-02 12:47:48.488624 7f386f691700  0 bad crc in data 3299720069 != exp 
211350069

2015-11-02 12:47:48.489100 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x170d2000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0xc513860).accept peer addr is really 192.168.113.42:0/599324131 (socket 
is 192.168.113.42:42533/0)

2015-11-02 12:47:48.490911 7f386f691700  0 bad crc in data 2381447347 != exp 
1177846878

2015-11-02 12:47:48.491390 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x170ea000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0xc513700).accept peer addr is really 192.168.113.42:0/599324131 (socket 
is 192.168.113.42:42534/0)

2015-11-02 12:47:48.493167 7f386f691700  0 bad crc in data 2093712440 != exp 
2175112954

2015-11-02 12:47:48.493682 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x16a18000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0xc514200).accept peer addr is really 192.168.113.42:0/599324131 (socket 
is 192.168.113.42:42535/0)

2015-11-02 12:47:48.495150 7f386f691700  0 bad crc in data 3047197039 != exp 
38098198

2015-11-02 12:47:48.495679 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x170d2000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0xc510b00).accept peer addr is really 192.168.113.42:0/599324131 (socket 
is 192.168.113.42:42536/0)

2015-11-02 12:47:48.497259 7f386f691700  0 bad crc in data 1400444622 != exp 
2648291990

2015-11-02 12:47:48.497756 7f386f691700  0 -- 192.168.112.231:6800/49908 
 192.168.113.42:0/599324131 pipe(0x170ea000 sd=28 :6800 s=0 pgs=0 cs=0 
l=0 c=0x17f7b700).accept peer addr is really 

[ceph-users] retrieving quota of ceph pool using librados or python API

2015-11-02 Thread Alex Leake
Hello all,


I'm attempting to use the python API to get the quota of a pool, but I can't 
see it in the documentation (http://docs.ceph.com/docs/v0.94/rados/api/python/).


Does anyone know how to get the quota (python or C), without making a call to 
"ceph osd pool get-quota"?


I am using ceph 0.94.2, under Ubuntu 14.04.
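
(One possible route, as a sketch rather than a confirmed API: the python rados
binding can send the same mon command the CLI would, without exec'ing the ceph
binary. The pool name below is a placeholder, and the command schema and JSON
field names are assumed from the hammer-era "osd pool get-quota" output.)

    # Sketch: read a pool's quota through librados' mon_command interface.
    import json
    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        cmd = json.dumps({'prefix': 'osd pool get-quota',
                          'pool': 'rbd',
                          'format': 'json'})
        ret, outbuf, outs = cluster.mon_command(cmd, b'')
        if ret != 0:
            raise RuntimeError(outs)
        quota = json.loads(outbuf)
        print(quota.get('quota_max_bytes'), quota.get('quota_max_objects'))
    finally:
        cluster.shutdown()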



Kind Regards,

Alex.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Changing CRUSH map ids

2015-11-02 Thread Loris Cuoghi

On 02/11/2015 12:47, Wido den Hollander wrote:



On 02-11-15 12:30, Loris Cuoghi wrote:

Hi All,

We're currently on version 0.94.5 with three monitors and 75 OSDs.

I've peeked at the decompiled CRUSH map, and I see that all ids are
commented with '# Here be dragons!', or more literally : '# do not
change unnecessarily'.

Now, what would happen if an incautious user would happen to put his
chubby fingers on this ids, totally disregarding the warning at the
entrance of the cave, and change one of them?

Data shuffle? (Relative to the allocation of PGs for the OSD/host/other
item?)

A *big* data shuffle? (ALL data would need to have its position
recalculated, with immediate end-of-the-world data shuffle?)

Nothing at all? (And the big fat warning is there only to take fun on
the uninstructed ones? Not plausible...)



Give it a try! Download the CRUSHMap and run tests on it with crushtool:

$ crushtool -i mycrushmap --test --rule 0 --num-rep 3 --show-statistics

Now, change the map, compile it and run again:

$ crushtool -i mycrushmap.new --test --rule 0 --num-rep 3 --show-statistics

Check the differences and you get the idea of how much has changed.

Wido



Thanks Wido ! :)


Thanks !

Loris
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com