Re: [ceph-users] memory usage of: radosgw-admin bucket rm

2019-07-21 Thread Paul Emmerich
For mailing list archive readers in the future:

On Tue, Jul 9, 2019 at 1:22 PM Paul Emmerich  wrote:

> Try to add "--inconsistent-index" (caution: will obviously leave your
> bucket in a broken state during the deletion, so don't try to use the
> bucket)
>

this was bad advice: as long as https://tracker.ceph.com/issues/40700 is not
fixed, don't do that.



>
> You can also speed up the deletion with "--max-concurrent-ios" (default
> 32). The documentation claims that "--max-concurrent-ios" applies only
> to other operations, but that's wrong: it is used for most bucket
> operations, including deletion.
>

this, however, is a good idea to speed up deletion of large buckets.

Try to wrap the deletion command in timeout or something similar so it
doesn't keep running into OOM and affecting other services
(or use cgroups to limit its RAM).
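
A rough sketch of both approaches (the 6h period, the 128 concurrent IOs
and the 16G cap are arbitrary example values, adjust to your cluster):

# restart the removal periodically so the leak never reaches OOM;
# "timeout" exits non-zero when it kills the command, so the loop retries
while ! timeout 6h radosgw-admin bucket rm --bucket=$BUCKET \
    --bypass-gc --purge-objects --max-concurrent-ios=128; do
    sleep 10
done

# or cap the RAM of a single run via cgroups, e.g. with systemd-run
# (MemoryMax needs cgroup v2; on cgroup v1 use MemoryLimit instead)
systemd-run --scope -p MemoryMax=16G -- \
    radosgw-admin bucket rm --bucket=$BUCKET --bypass-gc --purge-objects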


Paul


>
>
> Paul
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
>
> On Tue, Jul 9, 2019 at 1:11 PM Harald Staub 
> wrote:
>
>> Currently removing a bucket with a lot of objects:
>> radosgw-admin bucket rm --bucket=$BUCKET --bypass-gc --purge-objects
>>
>> This process was killed by the out-of-memory killer. Then looking at the
>> graphs, we see a continuous increase of memory usage for this process,
>> about +24 GB per day. Removal rate is about 3 M objects per day.
>>
>> It is not the fastest hardware, and this index pool is still without
>> SSDs. The bucket is sharded, 1024 shards. We are on Nautilus 14.2.1, now
>> about 500 OSDs.
>>
>> So for this bucket with 60 M objects, we would need about 480 GB of RAM
>> to get through (20 days at +24 GB per day). Or is there a workaround?
>> Should I open a tracker issue?
>>
>> The killed remove command can just be called again, but it will be
>> killed again before it finishes. Also, it has to run for some time
>> before it resumes actually removing objects. This "wait time" is also
>> increasing: last time, after about 16 M objects had already been
>> removed, the wait time was nearly 9 hours. During this time there is
>> also a memory ramp, but not as steep.
>>
>> BTW it feels strange that removing objects is about 3 times slower
>> than adding them.
>>
>>   Harry
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] memory usage of: radosgw-admin bucket rm

2019-07-11 Thread Harald Staub

Created https://tracker.ceph.com/issues/40700 (sorry, forgot to mention).

On 11.07.19 16:41, Matt Benjamin wrote:

I don't think one has been created yet.  Eric Ivancich and Mark Kogan
of my team are investigating this behavior.

Matt

On Thu, Jul 11, 2019 at 10:40 AM Paul Emmerich  wrote:


Is there already a tracker issue?

I'm seeing the same problem here. Started deletion of a bucket with a few 
hundred million objects a week ago or so and I've now noticed that it's also 
leaking memory and probably going to crash.
Going to investigate this further...

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Tue, Jul 9, 2019 at 1:26 PM Matt Benjamin  wrote:


Hi Harald,

Please file a tracker issue, yes.  (Deletes do tend to be slower,
presumably due to rocksdb compaction.)

Matt

On Tue, Jul 9, 2019 at 7:12 AM Harald Staub  wrote:


Currently removing a bucket with a lot of objects:
radosgw-admin bucket rm --bucket=$BUCKET --bypass-gc --purge-objects

This process was killed by the out-of-memory killer. Then looking at the
graphs, we see a continuous increase of memory usage for this process,
about +24 GB per day. Removal rate is about 3 M objects per day.

It is not the fastest hardware, and this index pool is still without
SSDs. The bucket is sharded, 1024 shards. We are on Nautilus 14.2.1, now
about 500 OSDs.

So for this bucket with 60 M objects, we would need about 480 GB of RAM
to get through (20 days at +24 GB per day). Or is there a workaround?
Should I open a tracker issue?

The killed remove command can just be called again, but it will be
killed again before it finishes. Also, it has to run for some time
before it resumes actually removing objects. This "wait time" is also
increasing: last time, after about 16 M objects had already been
removed, the wait time was nearly 9 hours. During this time there is
also a memory ramp, but not as steep.

BTW it feels strange that removing objects is about 3 times slower
than adding them.

   Harry
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





--

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] memory usage of: radosgw-admin bucket rm [EXT]

2019-07-11 Thread Matthew Vernon

On 11/07/2019 15:40, Paul Emmerich wrote:

Is there already a tracker issue?

I'm seeing the same problem here. Started deletion of a bucket with a 
few hundred million objects a week ago or so and I've now noticed that 
it's also leaking memory and probably going to crash.

Going to investigate this further...


We had a bucket rm on a machine that OOM'd (and killed the relevant 
process), but I wasn't watching at the time to see if it was the thing 
eating all the RAM.


If someone's giving the bucket rm code some love, it'd be nice if
https://tracker.ceph.com/issues/40587 (and associated PR) got looked at 
- missing shadow objects shouldn't really cause a bucket rm to give up...


Regards,

Matthew


--
The Wellcome Sanger Institute is operated by Genome Research 
Limited, a charity registered in England with number 1021457 and a 
company registered in England with number 2742969, whose registered 
office is 215 Euston Road, London, NW1 2BE. 
___

ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] memory usage of: radosgw-admin bucket rm

2019-07-11 Thread Matt Benjamin
I don't think one has been created yet.  Eric Ivancich and Mark Kogan
of my team are investigating this behavior.

Matt

On Thu, Jul 11, 2019 at 10:40 AM Paul Emmerich  wrote:
>
> Is there already a tracker issue?
>
> I'm seeing the same problem here. Started deletion of a bucket with a few 
> hundred million objects a week ago or so and I've now noticed that it's also 
> leaking memory and probably going to crash.
> Going to investigate this further...
>
> Paul
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
>
> On Tue, Jul 9, 2019 at 1:26 PM Matt Benjamin  wrote:
>>
>> Hi Harald,
>>
>> Please file a tracker issue, yes.  (Deletes do tend to be slower,
>> presumably due to rocksdb compaction.)
>>
>> Matt
>>
>> On Tue, Jul 9, 2019 at 7:12 AM Harald Staub  wrote:
>> >
>> > Currently removing a bucket with a lot of objects:
>> > radosgw-admin bucket rm --bucket=$BUCKET --bypass-gc --purge-objects
>> >
>> > This process was killed by the out-of-memory killer. Then looking at the
>> > graphs, we see a continuous increase of memory usage for this process,
>> > about +24 GB per day. Removal rate is about 3 M objects per day.
>> >
>> > It is not the fastest hardware, and this index pool is still without
>> > SSDs. The bucket is sharded, 1024 shards. We are on Nautilus 14.2.1, now
>> > about 500 OSDs.
>> >
>> > So for this bucket with 60 M objects, we would need about 480 GB of RAM
>> > to get through (20 days at +24 GB per day). Or is there a workaround?
>> > Should I open a tracker issue?
>> >
>> > The killed remove command can just be called again, but it will be
>> > killed again before it finishes. Also, it has to run for some time
>> > before it resumes actually removing objects. This "wait time" is also
>> > increasing: last time, after about 16 M objects had already been
>> > removed, the wait time was nearly 9 hours. During this time there is
>> > also a memory ramp, but not as steep.
>> >
>> > BTW it feels strange that removing objects is about 3 times slower
>> > than adding them.
>> >
>> >   Harry
>> > ___
>> > ceph-users mailing list
>> > ceph-users@lists.ceph.com
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>> >
>>
>>
>> --
>>
>> Matt Benjamin
>> Red Hat, Inc.
>> 315 West Huron Street, Suite 140A
>> Ann Arbor, Michigan 48103
>>
>> http://www.redhat.com/en/technologies/storage
>>
>> tel.  734-821-5101
>> fax.  734-769-8938
>> cel.  734-216-5309
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] memory usage of: radosgw-admin bucket rm

2019-07-11 Thread Paul Emmerich
Is there already a tracker issue?

I'm seeing the same problem here. Started deletion of a bucket with a few
hundred million objects a week ago or so and I've now noticed that it's
also leaking memory and probably going to crash.
Going to investigate this further...

Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Tue, Jul 9, 2019 at 1:26 PM Matt Benjamin  wrote:

> Hi Harald,
>
> Please file a tracker issue, yes.  (Deletes do tend to be slower,
> presumably due to rocksdb compaction.)
>
> Matt
>
> On Tue, Jul 9, 2019 at 7:12 AM Harald Staub 
> wrote:
> >
> > Currently removing a bucket with a lot of objects:
> > radosgw-admin bucket rm --bucket=$BUCKET --bypass-gc --purge-objects
> >
> > This process was killed by the out-of-memory killer. Then looking at the
> > graphs, we see a continuous increase of memory usage for this process,
> > about +24 GB per day. Removal rate is about 3 M objects per day.
> >
> > It is not the fastest hardware, and this index pool is still without
> > SSDs. The bucket is sharded, 1024 shards. We are on Nautilus 14.2.1, now
> > about 500 OSDs.
> >
> > So for this bucket with 60 M objects, we would need about 480 GB of RAM
> > to get through (20 days at +24 GB per day). Or is there a workaround?
> > Should I open a tracker issue?
> >
> > The killed remove command can just be called again, but it will be
> > killed again before it finishes. Also, it has to run for some time
> > before it resumes actually removing objects. This "wait time" is also
> > increasing: last time, after about 16 M objects had already been
> > removed, the wait time was nearly 9 hours. During this time there is
> > also a memory ramp, but not as steep.
> >
> > BTW it feels strange that removing objects is about 3 times slower
> > than adding them.
> >
> >   Harry
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >
>
>
> --
>
> Matt Benjamin
> Red Hat, Inc.
> 315 West Huron Street, Suite 140A
> Ann Arbor, Michigan 48103
>
> http://www.redhat.com/en/technologies/storage
>
> tel.  734-821-5101
> fax.  734-769-8938
> cel.  734-216-5309
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] memory usage of: radosgw-admin bucket rm

2019-07-09 Thread Matt Benjamin
Hi Harald,

Please file a tracker issue, yes.  (Deletes do tend to be slower,
presumably due to rocksdb compaction.)

Matt

On Tue, Jul 9, 2019 at 7:12 AM Harald Staub  wrote:
>
> Currently removing a bucket with a lot of objects:
> radosgw-admin bucket rm --bucket=$BUCKET --bypass-gc --purge-objects
>
> This process was killed by the out-of-memory killer. Then looking at the
> graphs, we see a continuous increase of memory usage for this process,
> about +24 GB per day. Removal rate is about 3 M objects per day.
>
> It is not the fastest hardware, and this index pool is still without
> SSDs. The bucket is sharded, 1024 shards. We are on Nautilus 14.2.1, now
> about 500 OSDs.
>
> So for this bucket with 60 M objects, we would need about 480 GB of RAM
> to get through (20 days at +24 GB per day). Or is there a workaround?
> Should I open a tracker issue?
>
> The killed remove command can just be called again, but it will be
> killed again before it finishes. Also, it has to run for some time
> before it resumes actually removing objects. This "wait time" is also
> increasing: last time, after about 16 M objects had already been
> removed, the wait time was nearly 9 hours. During this time there is
> also a memory ramp, but not as steep.
>
> BTW it feels strange that removing objects is about 3 times slower
> than adding them.
>
>   Harry
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] memory usage of: radosgw-admin bucket rm

2019-07-09 Thread Paul Emmerich
Try to add "--inconsistent-index" (caution: will obviously leave your
bucket in a broken state during the deletion, so don't try to use the
bucket)

You can also speed up the deletion with "--max-concurrent-ios" (default
32). The documentation claims that "--max-concurrent-ios" applies only
to other operations, but that's wrong: it is used for most bucket
operations, including deletion.
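
For example (a sketch; 128 is just an arbitrary value to illustrate
raising the default of 32):

radosgw-admin bucket rm --bucket=$BUCKET --bypass-gc --purge-objects \
    --max-concurrent-ios=128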


Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Tue, Jul 9, 2019 at 1:11 PM Harald Staub  wrote:

> Currently removing a bucket with a lot of objects:
> radosgw-admin bucket rm --bucket=$BUCKET --bypass-gc --purge-objects
>
> This process was killed by the out-of-memory killer. Then looking at the
> graphs, we see a continuous increase of memory usage for this process,
> about +24 GB per day. Removal rate is about 3 M objects per day.
>
> It is not the fastest hardware, and this index pool is still without
> SSDs. The bucket is sharded, 1024 shards. We are on Nautilus 14.2.1, now
> about 500 OSDs.
>
> So for this bucket with 60 M objects, we would need about 480 GB of RAM
> to get through (20 days at +24 GB per day). Or is there a workaround?
> Should I open a tracker issue?
>
> The killed remove command can just be called again, but it will be
> killed again before it finishes. Also, it has to run for some time
> before it resumes actually removing objects. This "wait time" is also
> increasing: last time, after about 16 M objects had already been
> removed, the wait time was nearly 9 hours. During this time there is
> also a memory ramp, but not as steep.
>
> BTW it feels strange that removing objects is about 3 times slower
> than adding them.
>
>   Harry
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] memory usage of: radosgw-admin bucket rm

2019-07-09 Thread Harald Staub

Currently removing a bucket with a lot of objects:
radosgw-admin bucket rm --bucket=$BUCKET --bypass-gc --purge-objects

This process was killed by the out-of-memory killer. Then looking at the 
graphs, we see a continuous increase of memory usage for this process, 
about +24 GB per day. Removal rate is about 3 M objects per day.


It is not the fastest hardware, and this index pool is still without 
SSDs. The bucket is sharded, 1024 shards. We are on Nautilus 14.2.1, now 
about 500 OSDs.


So for this bucket with 60 M objects, we would need about 480 GB of RAM
to get through (20 days at +24 GB per day). Or is there a workaround?
Should I open a tracker issue?


The killed remove command can just be called again, but it will be
killed again before it finishes. Also, it has to run for some time
before it resumes actually removing objects. This "wait time" is also
increasing: last time, after about 16 M objects had already been
removed, the wait time was nearly 9 hours. During this time there is
also a memory ramp, but not as steep.


BTW it feels strange that removing objects is about 3 times slower
than adding them.


 Harry
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] memory usage ceph jewel OSDs

2017-03-24 Thread Manuel Lausch
Hello,

over the last few days I have been trying to figure out why my OSDs
need a huge amount of RAM (1.2 - 4 GB). This pushes my system memory to
its limit. At first I thought it was because of the huge amount of
backfilling (some disks died), but everything has been fine for a few
days now and the memory stays at that level. Restarting the OSDs did
nothing to change this behaviour.

I am running Ceph Jewel (10.2.6) on RedHat 7. The cluster has 8 hosts
with 36 4 TB OSDs each and 4 hosts with 15 4 TB OSDs.

I tried to profile the used memory as documented here:
http://docs.ceph.com/docs/jewel/rados/troubleshooting/memory-profiling/

But the output of these commands didn't help me, and I am confused
about the reported memory usage.
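
The cycle from that document is roughly (a sketch; osd.98 as in the
dump below):

ceph tell osd.98 heap start_profiler   # start the tcmalloc heap profiler
ceph tell osd.98 heap dump             # write a snapshot of the heap profile
ceph tell osd.98 heap stop_profiler    # stop profiling again
ceph tell osd.98 heap release          # return tcmalloc freelist memory to the OS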

From "ceph tell osd.98 heap dump" I get the following output:

# ceph tell osd.98 heap dump
osd.98 dumping heap profile now.
------------------------------------------------
MALLOC:     1290458456 ( 1230.7 MiB) Bytes in use by application
MALLOC: +            0 (    0.0 MiB) Bytes in page heap freelist
MALLOC: +     63583000 (   60.6 MiB) Bytes in central cache freelist
MALLOC: +      5896704 (    5.6 MiB) Bytes in transfer cache freelist
MALLOC: +    102784400 (   98.0 MiB) Bytes in thread cache freelists
MALLOC: +     11350176 (   10.8 MiB) Bytes in malloc metadata
MALLOC:   ------------
MALLOC: =   1474072736 ( 1405.8 MiB) Actual memory used (physical + swap)
MALLOC: +    129064960 (  123.1 MiB) Bytes released to OS (aka unmapped)
MALLOC:   ------------
MALLOC: =   1603137696 ( 1528.9 MiB) Virtual address space used
MALLOC:
MALLOC:          88305              Spans in use
MALLOC:           1627              Thread heaps in use
MALLOC:           8192              Tcmalloc page size
------------------------------------------------
Call ReleaseFreeMemory() to release freelist memory to the OS (via
madvise()). Bytes released to the OS take up virtual address space but
no physical memory.


I would say the application needs 1230.7 MiB of RAM. But if I analyse
the corresponding dump with pprof, only a few megabytes are accounted
for. Following are the first few lines of the pprof output:

# pprof --text /usr/bin/ceph-osd osd.98.profile.0002.heap 
Using local file /usr/bin/ceph-osd.
Using local file osd.98.profile.0002.heap.
Total: 8.9 MB
 3.3  36.7%  36.7%  3.3  36.7% ceph::log::Log::create_entry
 2.3  25.5%  62.2%  2.3  25.5% ceph::buffer::list::append@a1f280
 1.1  12.1%  74.3%  2.0  23.1% SimpleMessenger::add_accept_pipe
 0.9  10.4%  84.7%  0.9  10.5% Pipe::Pipe
 0.2   2.8%  87.5%  0.2   2.8% std::map::operator[]
 0.2   2.2%  89.7%  0.2   2.2% std::vector::_M_default_append
 0.2   1.8%  91.5%  0.2   1.8% std::_Rb_tree::_M_copy
 0.1   0.8%  92.4%  0.1   0.8% ceph::buffer::create_aligned
 0.1   0.8%  93.2%  0.1   0.8% std::string::_Rep::_S_create


Is this normal? Am I doing something wrong? Is there a bug? Why do my
OSDs need so much RAM?

Thanks for your help

Regards,
Manuel

-- 
Manuel Lausch

Systemadministrator
Cloud Services

1&1 Mail & Media Development & Technology GmbH | Brauerstraße 48 |
76135 Karlsruhe | Germany
Phone: +49 721 91374-1847
E-Mail: manuel.lau...@1und1.de | Web: www.1und1.de

Amtsgericht Montabaur, HRB 5452

Geschäftsführer: Frank Einhellinger, Thomas Ludwig, Jan Oetjen


Member of United Internet

This e-mail may contain confidential and/or privileged information. If
you are not the intended recipient of this e-mail, you are hereby
notified that saving, distribution or use of the content of this e-mail
in any way is prohibited. If you have received this e-mail in error,
please notify the sender and delete the e-mail.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Memory-Usage

2015-08-18 Thread Gregory Farnum
On Mon, Aug 17, 2015 at 8:21 PM, Patrik Plank pat...@plank.me wrote:
 Hi,


 have a ceph cluster with three nodes and 32 osds.

 The three nodes have 16 GB of memory but only 5 GB is in use.

 Nodes are Dell Poweredge R510.


 my ceph.conf:


 [global]
 mon_initial_members = ceph01
 mon_host = 10.0.0.20,10.0.0.21,10.0.0.22
 auth_cluster_required = cephx
 auth_service_required = cephx
 auth_client_required = cephx
 filestore_xattr_use_omap = true
 filestore_op_threads = 32
 public_network = 10.0.0.0/24
 cluster_network = 10.0.1.0/24
 osd_pool_default_size = 3
 osd_pool_default_min_size = 1
 osd_pool_default_pg_num = 4096
 osd_pool_default_pgp_num = 4096
 osd_max_write_size = 200
 osd_map_cache_size = 1024
 osd_map_cache_bl_size = 128
 osd_recovery_op_priority = 1
 osd_recovery_max_active = 1
 osd_recovery_max_backfills = 1
 osd_op_threads = 32
 osd_disk_threads = 8

 Is that normal, or a bottleneck?

Any memory not used by the OSD processes directly will be used by
Linux for page caching. That's what we want to have happen! So it's
not a problem that it's using only 5 GB. Keep in mind that the
memory usage might spike dramatically if the OSDs need to deal with an
outage, though — your normal-state usage ought to be lower than our
recommended values for that reason.
-Greg
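
A quick way to see this (generic Linux, nothing Ceph-specific; the
exact column names depend on your procps version):

# memory held by processes vs. reclaimable page cache
# (reported as "buffers"/"cached", or "buff/cache" on newer procps)
free -h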



 best regards

 Patrik


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Memory-Usage

2015-08-17 Thread Patrik Plank
Hi,



have a ceph cluster with three nodes and 32 osds.

The three nodes have 16 GB of memory but only 5 GB is in use.

Nodes are Dell Poweredge R510.



my ceph.conf:



[global]
mon_initial_members = ceph01
mon_host = 10.0.0.20,10.0.0.21,10.0.0.22
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
filestore_op_threads = 32
public_network = 10.0.0.0/24
cluster_network = 10.0.1.0/24
osd_pool_default_size = 3
osd_pool_default_min_size = 1
osd_pool_default_pg_num = 4096
osd_pool_default_pgp_num = 4096
osd_max_write_size = 200
osd_map_cache_size = 1024
osd_map_cache_bl_size = 128
osd_recovery_op_priority = 1
osd_recovery_max_active = 1
osd_recovery_max_backfills = 1
osd_op_threads = 32
osd_disk_threads = 8


Is that normal, or a bottleneck?



best regards

Patrik

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Memory usage of librados

2013-12-25 Thread Potapov Sergey
Hi!

I am accessing a Ceph cluster via librados and found that memory usage
(VSZ in ps) increases dramatically when writing/reading/removing
objects. If I repeatedly write+read+remove an object with the same
name, memory usage does not increase, but if I do the same operations
with different object names, VSZ grows to 4 GB and stops at that level.
It seems that some internal buffering is enabled; can I configure its
maximum size? Memory usage is very critical for me and I want to reduce
it as much as possible.

Operations write+read+stat+remove in loop with different object names:
lion@kubuntu-12:~$ ps aux | grep RADOS
lion  7890  0.2  0.0 4289264 3988 ?Sl   10:28   0:00 
/home/lion/Projects/RMS/TestRADOS/TestRADOS

Operations write+read+stat+remove in loop with one object name:
lion@kubuntu-12:~$ ps aux | grep RADOS
lion  8149  0.1  0.0 556424  3520 ?Sl   10:30   0:00 
/home/lion/Projects/RMS/TestRADOS/TestRADOS

Listing of test program: http://pastebin.com/idqt1PEv
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Memory usage of librados

2013-12-25 Thread Sage Weil
On Wed, 25 Dec 2013, Potapov Sergey wrote:
 Hi!
 
 I am accessing a Ceph cluster via librados and found that memory usage
 (VSZ in ps) increases dramatically when writing/reading/removing
 objects. If I repeatedly write+read+remove an object with the same
 name, memory usage does not increase, but if I do the same operations
 with different object names, VSZ grows to 4 GB and stops at that level.
 It seems that some internal buffering is enabled; can I configure its
 maximum size? Memory usage is very critical for me and I want to reduce
 it as much as possible.
 
 Operations write+read+stat+remove in loop with different object names:
 lion@kubuntu-12:~$ ps aux | grep RADOS
 lion  7890  0.2  0.0 4289264 3988 ?Sl   10:28   0:00 
 /home/lion/Projects/RMS/TestRADOS/TestRADOS
 
 Operations write+read+stat+remove in loop with one object name:
 lion@kubuntu-12:~$ ps aux | grep RADOS
 lion  8149  0.1  0.0 556424  3520 ?Sl   10:30   0:00 
 /home/lion/Projects/RMS/TestRADOS/TestRADOS
 
 Listing of test program: http://pastebin.com/idqt1PEv

There will be some memory usage that is proportional to the size of the
cluster because of the sockets open to the various OSDs.  I'm not sure
what level you should expect, though: it depends on the size of the
cluster, your architecture... and I haven't really measured it.

In any case, if you can run your program through valgrind massif, that 
will tell us exactly where the memory is going, and whether there is 
anything obviously wrong or easy to fix!
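
For example (generic valgrind usage; TestRADOS here stands for the
test binary from the pastebin above):

# record heap usage over the program's lifetime
valgrind --tool=massif ./TestRADOS
# massif writes massif.out.<pid>; render it as a text report
ms_print massif.out.<pid>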

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Memory usage of librados

2013-12-24 Thread Potapov Sergey
Hi!

I am accessing a Ceph cluster via librados and found that memory usage
(VSZ in ps) increases dramatically when writing/reading/removing
objects. If I repeatedly write+read+remove an object with the same
name, memory usage does not increase, but if I do the same operations
with different object names, VSZ grows to 4 GB and stops at that level.
It seems that some internal buffering is enabled; can I configure its
maximum size? Memory usage is very critical for me and I want to reduce
it as much as possible.

Operations write+read+stat+remove in loop with different object names:
lion@kubuntu-12:~$ ps aux | grep RADOS
lion  7890  0.2  0.0 4289264 3988 ?Sl   10:28   0:00 
/home/lion/Projects/RMS/TestRADOS/TestRADOS

Operations write+read+stat+remove in loop with one object name:
lion@kubuntu-12:~$ ps aux | grep RADOS
lion  8149  0.1  0.0 556424  3520 ?Sl   10:30   0:00 
/home/lion/Projects/RMS/TestRADOS/TestRADOS

Listing of test program: http://pastebin.com/idqt1PEv

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com