[ceph-users] luminous - 12.2.1 - stale RBD locks after client crash

2017-11-21 Thread Nikola Ciprich
Hello ceph users and developers,

I've stumbled upon a somewhat strange problem with Luminous.

One of our servers running multiple QEMU clients crashed.
When we tried restarting those on another cluster node,
we got lots of fsck errors, disks seemed to return "physical"
block errors. I figured this out to be stale RBD locks on volumes
from the crashed machine. When I removed the locks, everything
started to work. (For some volumes, I was fixing those the next
day after the crash, so it was more than 10-15 hours later.)
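
(In case it helps anyone hitting the same thing: the manual cleanup is basically a lock list plus a lock remove; the pool/image, lock id and client id below are just example values taken from the "lock list" output.

rbd lock list rbd/vm-disk-1
rbd lock remove rbd/vm-disk-1 "auto 140341430839424" client.254123
)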

My question is: is this a bug or a feature? I mean, after the client
crashes, should the locks somehow expire, or do they need to be removed
by hand? I don't remember having this issue with older Ceph versions,
but I suppose we didn't have the exclusive-lock feature enabled..

I'll be very grateful for any reply

with best regards

nik
-- 
-
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:   +420 591 166 214
fax:    +420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph - SSD cluster

2017-11-21 Thread Christian Balzer
On Tue, 21 Nov 2017 11:34:51 +0100 Ronny Aasen wrote:

> On 20. nov. 2017 23:06, Christian Balzer wrote:
> > On Mon, 20 Nov 2017 15:53:31 +0100 Ansgar Jazdzewski wrote:
> >   
> >> Hi *,
> >>
> >> just one note because we hit it: take a look at your discard options and
> >> make sure discard does not run on all OSDs at the same time.
> >>  
> > Any SSD that actually _requires_ the use of TRIM/DISCARD to maintain
> > either speed or endurance I'd consider unfit for Ceph to boot.
> >   
> 
> 
> hello
> 
> is there some sort of hardware compatibility list for this part?
> perhaps community maintained on the wiki or similar.
> 
> there are some older blog posts covering some devices, but hard to find 
> ceph related for current devices.
> 
Current devices tend to follow in the footsteps of older ones.

Thus the Intel DC S ones tend to be suitable; however, their endurance and
performance vary and need to match the expected load.
The 37xx ones can deal with anything you throw at them, the x6xx ones with
3DWPD endurance will do the job for many people, the 35xx ones are
certainly fast enough for many use cases but should only be deployed by
people who know exactly what they're doing and for read-mostly use cases. 

The same is true in principle for the Samsung DC-level SSDs (SM863a these
days); they certainly perform well enough, sometimes even better than the
Intel DC S36xx to which they are mostly equivalent.
Samsung has a history of "firmware (and bug) of the week" issues, so
diligent testing and a good return/fix/replace policy from your vendor is
always something to make sure of.
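
For the "diligent testing" part, the usual sanity check is a single-job sync-write fio run against a spare, data-free device (a sketch only; /dev/sdX is a placeholder and the run overwrites whatever is on it):

fio --name=journal-test --filename=/dev/sdX --direct=1 --sync=1 --rw=write \
    --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting

Roughly speaking, the DC-class drives discussed here sustain this pattern well,
while most consumer drives without power-loss protection collapse to a few
hundred IOPS.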

Christian

-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Rakuten Communications
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph - SSD cluster

2017-11-21 Thread Eric Nelson
Plus one here, the EVOs are terrible.

On Tue, Nov 21, 2017 at 6:10 AM Phil Schwarz 
wrote:

> Hi,
> not a real HCL, but keeping this list [1] in mind is mandatory.
>
> According to me, use roughly any kind of Intel SSD: 3750 in SATA or, better,
> 3700 in NVMe.
> Avoid any Samsung Pro or EVO of nearly any kind. (Haven't found a link,
> sorry.)
>
> My 2 cents
>
> [1] :
>
> https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
>
>
> On 21/11/2017 at 11:34, Ronny Aasen wrote:
> > On 20. nov. 2017 23:06, Christian Balzer wrote:
> >> On Mon, 20 Nov 2017 15:53:31 +0100 Ansgar Jazdzewski wrote:
> >>
> >>> Hi *,
> >>>
> >>> just one note because we hit it: take a look at your discard options and
> >>> make sure discard does not run on all OSDs at the same time.
> >>>
> >> Any SSD that actually _requires_ the use of TRIM/DISCARD to maintain
> >> either speed or endurance I'd consider unfit for Ceph to boot.
> >>
> >
> >
> > hello
> >
> > is there some sort of hardware compatibility list for this part?
> > perhaps community maintained on the wiki or similar.
> >
> > there are some older blog posts covering some devices, but hard to find
> > ceph related for current devices.
> >
> > kind regards
> > Ronny Aasen
> >
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to set osd_max_backfills in Luminous

2017-11-21 Thread Jean-Charles Lopez
Hi,

To check a current value, use the following command on the machine where the OSD
you want to check is running:

ceph daemon osd.{id} config show | grep {parameter}
or
ceph daemon osd.{id} config get {parameter}

What you are seeing is actually a known glitch where you are told the change has
no effect when in fact it does. See the capture below:
[root@luminous ceph-deploy]# ceph daemon osd.0 config get osd_max_backfills
{
"osd_max_backfills": "1"
}
[root@luminous ceph-deploy]# ceph tell osd.* injectargs '--osd_max_backfills 2'
osd.0: osd_max_backfills = '2' rocksdb_separate_wal_dir = 'false' (not 
observed, change may require restart)
osd.1: osd_max_backfills = '2' rocksdb_separate_wal_dir = 'false' (not 
observed, change may require restart)
osd.2: osd_max_backfills = '2' rocksdb_separate_wal_dir = 'false' (not 
observed, change may require restart)
[root@luminous ceph-deploy]# ceph daemon osd.0 config get osd_max_backfills
{
"osd_max_backfills": "2"
}
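
If you want to verify the value on every OSD on a node in one go, a small loop over the admin sockets also works (a sketch, assuming the default socket path):

for sock in /var/run/ceph/ceph-osd.*.asok; do
    echo -n "${sock}: "
    ceph --admin-daemon "${sock}" config get osd_max_backfills
done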

Regards
JC

> On Nov 21, 2017, at 15:17, Karun Josy  wrote:
> 
> Hello,
> 
> We added a couple of OSDs to the cluster and the recovery is taking a long time.
> 
> So I tried to increase the osd_max_backfills value dynamically. But it's
> saying the change may need a restart.
> 
> $ ceph tell osd.* injectargs '--osd-max-backfills 5'
> osd.0: osd_max_backfills = '5' osd_objectstore = 'bluestore' (not observed, 
> change may require restart) rocksdb_separate_wal_dir = 'false' (not observed, 
> change may require restart)
> 
> 
> =
> 
> The value doesn't seem to have changed either.
> 
> [cephuser@ceph-las-admin-a1 home]$  ceph -n osd.0 --show-config | grep 
> osd_max_backfills
> osd_max_backfills = 1
> 
> Do I really have to restart all the OSD daemons?
> 
> 
> 
> Karun 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] How to set osd_max_backfills in Luminous

2017-11-21 Thread Karun Josy
Hello,

We added a couple of OSDs to the cluster and the recovery is taking a long time.

So I tried to increase the osd_max_backfills value dynamically. But it's
saying the change may need a restart.

$ ceph tell osd.* injectargs '--osd-max-backfills 5'
osd.0: osd_max_backfills = '5' osd_objectstore = 'bluestore' (not observed,
change may require restart) rocksdb_separate_wal_dir = 'false' (not
observed, change may require restart)


=

The value doesn't seem to have changed either.

[cephuser@ceph-las-admin-a1 home]$  ceph -n osd.0 --show-config | grep
osd_max_backfills
osd_max_backfills = 1

Do I really have to restart all the OSD daemons?



Karun
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] I/O stalls when doing fstrim on large RBD

2017-11-21 Thread Brendan Moloney
Hi,

So I dug into this a bit.  Apparently with XFS the fstrim command will ignore
the provided "length" option once it hits a large contiguous block of free
space and just keeps going until there is a non-empty block.  Most of my larger
filesystems end up with the XFS allocation group being 1TB in size, so the
metadata from the next allocation group ends up stopping the fstrim command at
about the 1TB mark.

I did capture an fstrim call with blktrace and attached the results. I did this 
test on a smaller 2TB FS where the allocation groups are 512GB.  I found an 
offset which hit a large contiguous block of empty space, so even though I only 
requested a length of 4GB it ended up trimming ~487GB.

# fstrim -v -o 549032275968 -l 4294967296 /data/bulk
/data/bulk: 487.3 GiB (523262476288 bytes) trimmed

Looking through the blktrace I see some CFQ related stuff, so maybe it is 
actually helping to reduce starvation for other processes?

These large fstrim runs can actually complete quite quickly (10-20 seconds for 
1TB), but they can also be quite slow if the FS is busy (a few minutes).
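
For reference, a chunked pass over the filesystem can be scripted along these lines (a sketch; the paths, the 256 GiB chunk size and the 30 s pause are arbitrary, and as described above XFS may still overshoot the requested length):

#!/bin/bash
FS=/data/bulk
CHUNK=$((256 * 1024 * 1024 * 1024))               # bytes per fstrim call
TOTAL=$(df -B1 --output=size "$FS" | tail -n 1 | tr -d ' ')
off=0
while [ "$off" -lt "$TOTAL" ]; do
    fstrim -v -o "$off" -l "$CHUNK" "$FS"
    off=$((off + CHUNK))
    sleep 30
done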

I have heard that the ATA "trim" command can cause many problems because it is
not "queueable". However, I understand that the SCSI "unmap" command does not
have this shortcoming.  Could the virtio-scsi driver and/or librbd be handling
these better?

Thanks for the help!

Brendan


From: Jason Dillaman [jdill...@redhat.com]
Sent: Saturday, November 18, 2017 5:08 AM
To: Brendan Moloney
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] I/O stalls when doing fstrim on large RBD

Can you capture a blktrace while perform fstrim to record the discard
operations? A 1TB trim extent would cause a huge impact since it would
translate to approximately 262K IO requests to the OSDs (assuming 4MB
backing files).
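
A capture along these lines would be enough (a sketch; /dev/sdX stands for the device backing the filesystem and the offset/length are placeholders):

blktrace -d /dev/sdX -o fstrim_trace -w 60 &
fstrim -v -o 0 -l 4294967296 /data/bulk
wait
blkparse -i fstrim_trace > fstrim_trace.txt   # discards show up with 'D' in the RWBS column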

On Fri, Nov 17, 2017 at 6:19 PM, Brendan Moloney  wrote:
> Hi,
>
> I guess this isn't strictly about Ceph, but I feel like other folks here
> must have run into the same issues.
>
> I am trying to keep my thinly provisioned RBD volumes thin.  I use
> virtio-scsi to attach the RBD volumes to my VMs with the "discard=unmap"
> option. The RBD is formatted as XFS and some of them can be quite large
> (16TB+).  I have a cron job that runs "fstrim" commands twice a week in the
> evenings.
>
> The issue is that I see massive I/O stalls on the VM during the fstrim.  To
> the point where I am getting kernel panics from hung tasks and other
> timeouts.  I have tried a number of things to lessen the impact:
>
> - Switching from deadline to CFQ (initially I thought this helped, but
> now I am not convinced)
> - Running fstrim with "ionice -c idle" (this doesn't seem to make a
> difference)
> - Chunking the fstrim with the offset/length options (helps reduce worst
> case, but I can't trim less than 1TB at a time and that can still cause a
> pause for several minutes)
>
> Is there anything else I can do to avoid this issue?
>
> Thanks,
> Brendan
>



--
Jason


fstrim_blktrace.tar.gz
Description: fstrim_blktrace.tar.gz
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ubuntu upgrade Zesty => Aardvark, Implications for Ceph?

2017-11-21 Thread Ken Dreyer
As a tangent, it's a problem for download.ceph.com packages that
"xenial" happens to sort alphabetically after "bionic", because
do-release-upgrade will consider (for example) "ceph_12.2.1-1xenial"
to be newer than "ceph_12.2.1-1bionic". I think we need to switch to
using "ubuntu16.04", "ubuntu18.04" suffixes instead.

- Ken

On Mon, Nov 13, 2017 at 4:41 AM, Ranjan Ghosh  wrote:
> Hi everyone,
>
> In January, support for Ubuntu Zesty will run out and we're planning to
> upgrade our servers to Aardvark. We have a two-node-cluster (and one
> additional monitoring-only server) and we're using the packages that come
> with the distro. We have mounted CephFS on the same server with the kernel
> client in FSTab. AFAIK, Aardvark includes Ceph 12.0. What would happen if we
> used the usual "do-release-upgrade" to upgrade the servers one-by-one? I
> assume the procedure described here
> "http://ceph.com/releases/v12-2-0-luminous-released/"; (section "Upgrade from
> Jewel or Kraken") probably won't work for us, because "do-release-upgrade"
> will upgrade all packages (including the ceph ones) at once and then reboots
> the machine. So we cannot really upgrade only the monitoring nodes. And I'd
> rather avoid switching to PPAs beforehand. So, what are the real
> consequences if we upgrade all servers one-by-one with "do-release-upgrade"
> and then reboot all the nodes? Is it only the downtime that makes this not
> recommended, or could we lose data? Any other recommendations on how to tackle
> this?
>
> Thank you / BR
>
> Ranjan
>
>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] how to improve performance

2017-11-21 Thread ulembke

On 2017-11-21 13:12, Rudi Ahlers wrote:
On Tue, Nov 21, 2017 at 10:46 AM, Christian Balzer  
wrote:



On Tue, 21 Nov 2017 09:21:58 +0200 Rudi Ahlers wrote:

> On Mon, Nov 20, 2017 at 2:36 PM, Christian Balzer  wrote:
>...
>
>
> Ok, so I have 4 physical servers and need to setup a highly redundant
> cluster. How else would you have done it? There is no budget for a SAN,
let
> alone a highly available SAN.
>
As I said, I'd be fine doing it with Ceph, if that was a good match.
It's easy to starve resources with hyperconverged clusters.

Since you're using proxmox, DRBD would be an obvious alternative,
especially if you're not planning on growing this cluster.

You only mentioned 3 servers so far, is the fourth one non-Ceph?



From what I have read, DRBD isn't very stable?

The 4th one will be for backups.


Hi,
unfortunately DRBD isn't really usable in Proxmox VE anymore.
Linbit changed the licence and Proxmox dropped DRBD support.
Linbit created their own repository and changed the licence back, but there
is no real user base.



Udo
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD failure test freezes the cluster

2017-11-21 Thread Bishoy Mikhael
Bump!


On Tue, Nov 21, 2017 at 12:34 AM Gmail  wrote:

> Hi All,
>
> I was performing an OSD failure test on a 3 node Ceph cluster configured
> as follows:
> Ceph version 12.2.1
> 3 x MDS (one per node)
> 3 x MON (one per node)
> 3 x MGR (one per node)
> 15 x OSDs (5 per node)
>
> Ceph filesystem was mounted on a different node using Ceph kernel block
> device driver.
>
> I failed one of the OSD drives by removing it from the SCSI bus, while
> consistently writing 1MB files to the cluster.
>
> Once the drive was failed, the write process stopped, ceph -s command
> timeout, the three MON, MDS and MGR daemons status shows that they received
> a sigkill.
>
> Any clue on what’s going on on the cluster?
>
> Regards,
> Bishoy
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw bucket rename and change owner

2017-11-21 Thread David Turner
User and bucket operations have more to do with what is providing the S3
API.  In this case you're using Swift for that.  The Ceph tools for this
would apply if you were using RGW to provide the S3 API.

The answers you're looking for would be in how to do this with SWIFT, if
I'm not mistaken.  Ceph in this case is only providing the storage, but all
of the data and user management would be done through non-Ceph tools.

That said, there are probably several people on this list who use Swift
with a Ceph storage backend.  They may be able to help you with this, but
hopefully you can find the answers you're looking for in their documentation.

On Tue, Nov 21, 2017 at 11:43 AM Kim-Norman Sahm  wrote:

> Or is there a way to move the bucket content into another bucket?
>
> Am Montag, den 20.11.2017, 17:36 +0100 schrieb Kim-Norman Sahm:
> > is it possible to rename a radosgw bucket and change the owner?
> > i'm using ceph as swift backend in openstack and want to move an old
> > bucket to a keystone based user.
> >
> > br Kim
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw bucket rename and change owner

2017-11-21 Thread Kim-Norman Sahm
Or is there a way to move the bucket content into another bucket?

Am Montag, den 20.11.2017, 17:36 +0100 schrieb Kim-Norman Sahm:
> is it possible to rename a radosgw bucket and change the owner?
> i'm using ceph as swift backend in openstack and want to move an old
> bucket to a keystone based user.
> 
> br Kim
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HEALTH_ERR pgs are stuck inactive for more than 300 seconds

2017-11-21 Thread David Turner
All you have to do is figure out why osd.0, osd.1, and osd.2 are down and
get the daemons running.  They have PGs assigned to them, but since they
are not up and running those PGs are in a down state.  You can check the
logs for them in /var/log/ceph/.  Did you have any errors when deploying
these OSDs?
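
A minimal sketch of where to look, assuming a systemd-based install and using osd.0 as the example (repeat for osd.1 and osd.2):

tail -n 100 /var/log/ceph/ceph-osd.0.log
systemctl status ceph-osd@0
systemctl start ceph-osd@0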

On Tue, Nov 21, 2017 at 10:25 AM Traiano Welcome  wrote:

> Hi List
>
> I've just begun using ceph and installed a small cluster on ubuntu
> 16.04 nodes using the process in this guide:
>
>
> https://www.howtoforge.com/tutorial/how-to-install-a-ceph-cluster-on-ubuntu-16-04/
>
> However, once the installation is complete, I see the newly installed
> cluster is not healthy, and complaining about pgs stuck in inactive:
>
> ---
> root@lol-045:~# ceph -s
>
> cluster 220c92fb-2daa-4860-b511-d65ec88d6060
>  health HEALTH_ERR
> 448 pgs are stuck inactive for more than 300 seconds
> 64 pgs degraded
> 256 pgs stale
> 64 pgs stuck degraded
> 192 pgs stuck inactive
> 256 pgs stuck stale
> 256 pgs stuck unclean
> 64 pgs stuck undersized
> 64 pgs undersized
> noout flag(s) set
>  monmap e1: 1 mons at {lol-045=17.16.2.20:6789/0}
> election epoch 4, quorum 0 lol-045
>  osdmap e66: 7 osds: 4 up, 4 in; 55 remapped pgs
> flags noout,sortbitwise,require_jewel_osds
>   pgmap v526: 256 pgs, 1 pools, 0 bytes data, 0 objects
> 134 MB used, 6120 GB / 6121 GB avail
>  192 stale+creating
>   64 stale+active+undersized+degraded
>
> ---
>
> Why is this, and how can I troubleshoot and fix it? (I've googled
> extensively but couldn't find a solution to this.)
>
>
> My osd tree looks like this:
>
> 
> ID WEIGHT   TYPE NAME         UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -1 10.46080 root default
> -2  2.98880 host anx-dp02-046
>  0  1.49440     osd.0            down        0              1.0
>  4  1.49440     osd.4              up      1.0              1.0
> -3  2.98880 host anx-dp02-047
>  1  1.49440     osd.1            down        0              1.0
>  5  1.49440     osd.5              up      1.0              1.0
> -4  2.98880 host anx-dp02-048
>  2  1.49440     osd.2            down        0              1.0
>  6  1.49440     osd.6              up      1.0              1.0
> -5  1.49440 host anx-dp02-049
>  7  1.49440     osd.7              up      1.0              1.0
> 
>
> Many thanks in advance,
> Traiano
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD is near full and slow in accessing storage from client

2017-11-21 Thread David Turner
Your rbd pool can be removed (unless you're planning to use it) which will
delete those PGs from your cluster/OSDs.  Also all of your backfilling
finished and has settled.  Now you just need to work on balancing the
weights for the OSDs in your cluster.

There are multiple ways to balance the usage of the cluster.  Changing the
crush weight of the OSD, changing the reweight of the OSD, doing that by
using `ceph osd reweight-by-utilization`, doing that by using Cern's
modified version of that which can weight things up as well as down, etc.
I use a method that changes the crush weight of the OSD, but does so by
downloading the crush map and using the crushtool to generate a balanced
map and do it in one go.  A very popular method on the list is to create a
cron that does very small modifications in the background and keeps things
balanced by utilization.

You should be able to find a lot of references in the ML or in blog posts
about doing these various options.  The take away is that the CRUSH
algorithm is putting too much data on osd.4 and not enough data on osd.2
(those are the extremes, but there are others not quite as extreme) and you
need to modify the weight and/or reweight of the osd to help the algorithm
balance that out.
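
As a starting point, the built-in version can be dry-run before applying (a sketch; 120 means only OSDs more than 20% above the mean utilization get touched):

ceph osd test-reweight-by-utilization 120
ceph osd reweight-by-utilization 120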

On Tue, Nov 21, 2017 at 12:11 AM gjprabu  wrote:

> Hi David,
>
>This is our current status.
>
>
> ~]# ceph status
> cluster b466e09c-f7ae-4e89-99a7-99d30eba0a13
>  health HEALTH_WARN
> mds0: Client integ-hm3 failing to respond to cache pressure
> mds0: Client integ-hm9-bkp failing to respond to cache pressure
> mds0: Client me-build1-bkp failing to respond to cache pressure
>  monmap e2: 3 mons at {intcfs-mon1=
> 192.168.113.113:6789/0,intcfs-mon2=192.168.113.114:6789/0,intcfs-mon3=192.168.113.72:6789/0
> }
> election epoch 16, quorum 0,1,2
> intcfs-mon3,intcfs-mon1,intcfs-mon2
>   fsmap e177798: 1/1/1 up {0=intcfs-osd1=up:active}, 1 up:standby
>  osdmap e4388: 8 osds: 8 up, 8 in
> flags sortbitwise
>   pgmap v24129785: 564 pgs, 3 pools, 6885 GB data, 17138 kobjects
> 14023 GB used, 12734 GB / 26757 GB avail
>  560 active+clean
>3 active+clean+scrubbing
>1 active+clean+scrubbing+deep
>   client io 47187 kB/s rd, 965 kB/s wr, 125 op/s rd, 525 op/s wr
>
> ]# ceph df
> GLOBAL:
> SIZE   AVAIL  RAW USED %RAW USED
> 26757G 12735G   14022G 52.41
> POOLS:
> NAME               ID USED    %USED MAX AVAIL  OBJECTS
> rbd                 0       0     0     3787G         0
> downloads_data      3   6885G 51.46     3787G  16047944
> downloads_metadata  4  84773k     0     3787G   1501805
>
>
> Regards
> Prabu GJ
>
>  On Mon, 20 Nov 2017 21:35:17 +0530, David Turner wrote:
>
> What is your current `ceph status` and `ceph df`? The status of your
> cluster has likely changed a bit in the last week.
>
> On Mon, Nov 20, 2017 at 6:00 AM gjprabu  wrote:
>
>
> Hi David,
>
> Sorry for the late reply and its completed OSD Sync and more
> ever still fourth OSD available size is keep reducing. Is there any option
> to check or fix .
>
>
> ID WEIGHT  REWEIGHT SIZE   USEAVAIL  %USE  VAR  PGS
>
> 0 3.29749  1.0  3376G  2320G  1056G 68.71 1.10 144
> 1 3.26869  1.0  3347G  1871G  1475G 55.92 0.89 134
> 2 3.27339  1.0  3351G  1699G  1652G 50.69 0.81 134
> 3 3.24089  1.0  3318G  1865G  1452G 56.22 0.90 142
> 4 3.24089  1.0  3318G  2839G   478G 85.57 1.37 158
> 5 3.32669  1.0  3406G  2249G  1156G 66.04 1.06 136
> 6 3.27800  1.0  3356G  1924G  1432G 57.33 0.92 139
> 7 3.20470  1.0  3281G  1949G  1331G 59.42 0.95 141
>   TOTAL 26757G 16720G 10037G 62.49
> MIN/MAX VAR: 0.81/1.37  STDDEV: 10.26
>
>
> Regards
> Prabu GJ
>
>
>
>  On Mon, 13 Nov 2017 00:27:47 +0530, David Turner wrote:
>
> You cannot reduce the PG count for a pool.  So there isn't anything you
> can really do for this unless you create a new FS with better PG counts and
> migrate your data into it.
>
> The problem with having more PGs than you need is in the memory footprint
> for the osd daemon. There are warning thresholds for having too many PGs
> per osd.  Also in future expansions, if you need to add pools, you might
> not be able to create the pools with the proper amount of PGs due to older
> pools that have way too many PGs.
>
> It would still be nice to see the output from those commands I asked about.
>
> The built-in reweighting scripts might help your data distribution.
> reweight-by-utilization
>
> On Sun, Nov 12, 2017, 11:41 AM gjprabu  wrote:
>
>
> Hi David,
>
> Thanks for your valuable reply. Once the backfilling for the new OSD is
> complete, we will consider increasing the replica value ASAP. Is it possible to
> decrease the metadata pg count?  if the pg count for metadata for value
>

[ceph-users] HEALTH_ERR pgs are stuck inactive for more than 300 seconds

2017-11-21 Thread Traiano Welcome
Hi List

I've just begun using ceph and installed a small cluster on ubuntu
16.04 nodes using the process in this guide:

https://www.howtoforge.com/tutorial/how-to-install-a-ceph-cluster-on-ubuntu-16-04/

However, once the installation is complete, I see the newly installed
cluster is not healthy, and complaining about pgs stuck in inactive:

---
root@lol-045:~# ceph -s

cluster 220c92fb-2daa-4860-b511-d65ec88d6060
 health HEALTH_ERR
448 pgs are stuck inactive for more than 300 seconds
64 pgs degraded
256 pgs stale
64 pgs stuck degraded
192 pgs stuck inactive
256 pgs stuck stale
256 pgs stuck unclean
64 pgs stuck undersized
64 pgs undersized
noout flag(s) set
 monmap e1: 1 mons at {lol-045=17.16.2.20:6789/0}
election epoch 4, quorum 0 lol-045
 osdmap e66: 7 osds: 4 up, 4 in; 55 remapped pgs
flags noout,sortbitwise,require_jewel_osds
  pgmap v526: 256 pgs, 1 pools, 0 bytes data, 0 objects
134 MB used, 6120 GB / 6121 GB avail
 192 stale+creating
  64 stale+active+undersized+degraded

---

Why is this, and how can I troubleshoot and fix it? (I've googled
extensively but couldn't find a solution to this.)


My osd tree looks like this:


ID WEIGHT   TYPE NAME         UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 10.46080 root default
-2  2.98880 host anx-dp02-046
 0  1.49440     osd.0            down        0              1.0
 4  1.49440     osd.4              up      1.0              1.0
-3  2.98880 host anx-dp02-047
 1  1.49440     osd.1            down        0              1.0
 5  1.49440     osd.5              up      1.0              1.0
-4  2.98880 host anx-dp02-048
 2  1.49440     osd.2            down        0              1.0
 6  1.49440     osd.6              up      1.0              1.0
-5  1.49440 host anx-dp02-049
 7  1.49440     osd.7              up      1.0              1.0


Many thanks in advance,
Traiano
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph - SSD cluster

2017-11-21 Thread Phil Schwarz
Hi,
not a real HCL, but keeping this list [1] in mind is mandatory.

According to me, use roughly any kind of Intel SSD: 3750 in SATA or, better,
3700 in NVMe.
Avoid any Samsung Pro or EVO of nearly any kind. (Haven't found a link,
sorry.)

My 2 cents

[1] :
https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/


On 21/11/2017 at 11:34, Ronny Aasen wrote:
> On 20. nov. 2017 23:06, Christian Balzer wrote:
>> On Mon, 20 Nov 2017 15:53:31 +0100 Ansgar Jazdzewski wrote:
>>
>>> Hi *,
>>>
>>> just one note because we hit it: take a look at your discard options and
>>> make sure discard does not run on all OSDs at the same time.
>>>
>> Any SSD that actually _requires_ the use of TRIM/DISCARD to maintain
>> either speed or endurance I'd consider unfit for Ceph to boot.
>>
> 
> 
> hello
> 
> is there some sort of hardware compatibility list for this part?
> perhaps community maintained on the wiki or similar.
> 
> there are some older blog posts covering some devices, but hard to find
> ceph related for current devices.
> 
> kind regards
> Ronny Aasen
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] how to improve performance

2017-11-21 Thread Rudi Ahlers
On Tue, Nov 21, 2017 at 10:46 AM, Christian Balzer  wrote:

> On Tue, 21 Nov 2017 09:21:58 +0200 Rudi Ahlers wrote:
>
> > On Mon, Nov 20, 2017 at 2:36 PM, Christian Balzer  wrote:
> >
> > > On Mon, 20 Nov 2017 14:02:30 +0200 Rudi Ahlers wrote:
> > >
> > > > We're planning on installing 12X Virtual Machines with some heavy
> loads.
> > > >
> > > > the SSD drives are  INTEL SSDSC2BA400G4
> > > >
> > > Interesting, where did you find those?
> > > Or did you have them lying around?
> > >
> > > I've been unable to get DC S3710 SSDs for nearly a year now.
> > >
> >
> > In South Africa, one of our suppliers had some in stock. They're still
> > fairly new, about 2 months old now.
> >
> >
> Odd, oh well.
>
> >
> >
> > > The SATA drives are ST8000NM0055-1RM112
> > > >
> > > Note that these (while fast) have an internal flash cache, limiting
> them to
> > > something like 0.2 DWPD.
> > > Probably not an issue with the WAL/DB on the Intels, but something to
> keep
> > > in mind.
> > >
> >
> >
> > I don't quite understand what you want to say, please explain?
> >
> See the other mails in this thread after the one above.
> In short, probably nothing to worry about.
>
> >
> >
> > > > Please explain your comment, "b) will find a lot of people here who
> don't
> > > > approve of it."
> > > >
> > > Read the archives.
> > > Converged clusters are complex and debugging Ceph when tons of other
> > > things are going on at the same time on the machine even more so.
> > >
> >
> >
> > Ok, so I have 4 physical servers and need to setup a highly redundant
> > cluster. How else would you have done it? There is no budget for a SAN,
> let
> > alone a highly available SAN.
> >
> As I said, I'd be fine doing it with Ceph, if that was a good match.
> It's easy to starve resources with hyperconverged clusters.
>
> Since you're using proxmox, DRBD would be an obvious alternative,
> especially if you're not planning on growing this cluster.
>
> You only mentioned 3 servers so far, is the fourth one non-Ceph?
>

From what I have read, DRBD isn't very stable?

The 4th one will be for backups.



>
> >
> >
> > >
> > > > I don't have access to the switches right now, but they're new so
> > > whatever
> > > > default config ships from factory would be active. Though iperf shows
> > > 10.5
> > > > GBytes  / 9.02 Gbits/sec throughput.
> > > >
> > > Didn't think it was the switches, but completeness sake and all that.
> > >
> > > > What speeds would you expect?
> > > > "Though with your setup I would have expected something faster, but
> NOT
> > > the
> > > > theoretical 600MB/s 4 HDDs will do in sequential writes."
> > > >
> > > What I wrote.
> > > A 7200RPM HDD, even these, can not sustain writes much over 170MB/s, in
> > > the most optimal circumstances.
> > > So your cluster can NOT exceed about 600MB/s sustained writes with the
> > > effective bandwidth of 4 HDDs.
> > > Smaller writes/reads that can be cached by RAM, DB, onboard caches on
> the
> > > HDDs of course can and will be faster.
> > >
> > > But again, you're missing the point, even if you get 600MB/s writes
> out of
> > > your cluster, the number of 4k IOPS will be much more relevant to your
> VMs.
> > >
> > >
> > hdparm shows about 230MB/s:
> >
> > ^Croot@virt2:~# hdparm -Tt /dev/sda
> >
> > /dev/sda:
> >  Timing cached reads:   20250 MB in  2.00 seconds = 10134.81 MB/sec
> >  Timing buffered disk reads: 680 MB in  3.00 seconds = 226.50 MB/sec
> >
> That's read and a very optimized sequential one at that.
> >
> >
> > 600MB/s would be super nice, but in reality even 400MB/s would be nice.
> Do you really need to write that amount of data in a short time?
> Typical VMs are IOPS bound, as pointed out several times.
>

We have 10x physical servers which are quite busy and two of them are slow
in terms of disk speed so I am looking at getting better performance.


>
> > Would it not be achievable?
> >
> Maybe, but you need to find out what, if anything makes your cluster
> slower than this.
> iostat, atop, etc can help with that.
> How busy are your CPUs, HDDs and SSDs when you run that benchmark?
>

The CPU and RAM are fairly "idle" during any of my tests.


>
> >
> >
> > > >
> > > >
> > > > On this, "If an OSD has no fast WAL/DB, it will drag the overall
> speed
> > > > down. Verify and if so fix this and re-test.": how?
> > > >
> > > No idea, I don't do bluestore.
> > > You noticed the lack of a WAL/DB for sda, go and fix it.
> > > If in in doubt by destroying and re-creating.
> > >
> > > And if you're looking for a less invasive procedure, docs and the ML
> > > archive, but AFAIK there is nothing but re-creation at this time.
> > >
> >
> >
> > Since I use Proxmox, which set up a DB device but not a WAL device.
> >
> Again, I don't do bluestore.
> But AFAIK, WAL will live on the fastest device, which is the SSD you've
> put the DB on, unless specified separately.
> So nothing to be done here.
>


I have re-created the CEPH pool with a DB and WAL device this time and
performance is sli

Re: [ceph-users] Ceph - SSD cluster

2017-11-21 Thread Ronny Aasen

On 20. nov. 2017 23:06, Christian Balzer wrote:

On Mon, 20 Nov 2017 15:53:31 +0100 Ansgar Jazdzewski wrote:


Hi *,

just one note because we hit it: take a look at your discard options and
make sure discard does not run on all OSDs at the same time.


Any SSD that actually _requires_ the use of TRIM/DISCARD to maintain
either speed or endurance I'd consider unfit for Ceph to boot.




hello

is there some sort of hardware compatibility list for this part?
perhaps community maintained on the wiki or similar.

there are some older blog posts covering some devices, but hard to find 
ceph related for current devices.


kind regards
Ronny Aasen

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] measure performance / latency in blustore

2017-11-21 Thread Stefan Priebe - Profihost AG
Hello,

to measure performance / latency for filestore we used:
filestore:apply_latency
filestore:commitcycle_latency
filestore:journal_latency
filestore:queue_transaction_latency_avg

What are the correct ones for bluestore?
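
(For reference, the candidate counters can be pulled from the OSD admin socket and filtered for latency entries; the exact names in the "bluestore" section vary between releases, so this is only a way to see what your build exposes:

ceph daemon osd.0 perf dump | python -m json.tool | grep -i lat
)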

Greets,
Stefan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Prefer ceph monitor

2017-11-21 Thread ceph
Hi Erik,

On 10 November 2017 16:59:40 CET, Erik Schwalbe wrote:
>Hello,
>
>is it possible to prefer a monitor/mgr, so that it become the leader
>monitor?

The leader will be the MON server with the lowest IP address.

When you have 4 mgr servers running, you can stop 3 of them to make the fourth
the active mgr, and then start the other 3 again.
  
>What are the important params during monitor election? 
>Rank could be an parameter, but is it possible to set the rank, perhaps
>in ceph.conf?
No, I don't believe so.

- Mehmet 
>
>Thanks for your answer.
>
>Regards,
>Erik
>
>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] how to test journal?

2017-11-21 Thread Loris Cuoghi
On Tue, 21 Nov 2017 10:52:43 +0200,
Rudi Ahlers wrote:

> [snip, snap]
> 
> Maybe I'm confusing the terminology. I have created a DB and WAL
> device for my Bluestore, but I presume that's not a journal
> (anymore?).

They never were, please read further in the documentation and mailing
list archives about their respective functions. For instance:
http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/

> 
> This is how I set it up this time: ceph-disk prepare
> --bluestore /dev/sdc --block.wal /dev/sde --block.db /dev/sde

Putting the WAL and DB on the same device is a non sequitur once you
carefully read the available documentation and ML. The whole purpose of
a WAL is in it being on a faster device than the DB/main storage one(s).
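
If the SSD is the only fast device available, one possible layout (a sketch, reusing the device names from your example) is to give ceph-disk just the DB and let the WAL co-locate with it on that SSD:

ceph-disk prepare --bluestore /dev/sdc --block.db /dev/sde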

> 
> Apart from "ceph grafana", which I can't install at this stage since
> the cluster isn't ready yet, how else can I test the SSD cache drives?
> 
> On Tue, Nov 21, 2017 at 10:37 AM, Christian Balzer 
> wrote:
> 
> > On Tue, 21 Nov 2017 10:23:20 +0200 Rudi Ahlers wrote:
> >  
> > > Hi,
> > >
> > > Is it possible to test the CEPH journal to see if it performs
> > > optimally? 
> > You're using bluestore, as I said before, there is no journal.
> >
> > The SSDs you're using are top of the line, perfectly suited for what
> > you're using them for.
> >  
> > > And, how can I see how much data is being cached?
> > >  
> > Nothing gets cached per se with bluestore.
> > As for utilization, there were values for the journal, AFAIK there
> > isn't anything for the DB stuff at this point.
> >
> > In general, google "ceph grafana", you will REALLY want to monitor
> > a ceph cluster this way.
> >
> > --
> > Christian Balzer        Network/Systems Engineer
> > ch...@gol.com   Rakuten Communications
> >  
> 
> 
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] how to test journal?

2017-11-21 Thread Rudi Ahlers
Hi,

I didn't see where you said there is no journal on bluestore.

Maybe I'm confusing the terminology. I have created a DB and WAL device for
my Bluestore, but I presume that's not a journal (anymore?).

This is how I set it up this time: ceph-disk prepare --bluestore /dev/sdc
--block.wal /dev/sde --block.db /dev/sde

Apart from "ceph grafana", which I can't install at this stage since the
cluster isn't ready yet, how else can I test the SSD cache drives?

On Tue, Nov 21, 2017 at 10:37 AM, Christian Balzer  wrote:

> On Tue, 21 Nov 2017 10:23:20 +0200 Rudi Ahlers wrote:
>
> > Hi,
> >
> > Is it possible to test the CEPH journal to see if it performs optimally?
> >
> You're using bluestore, as I said before, there is no journal.
>
> The SSDs you're using are top of the line, perfectly suited for what
> you're using them for.
>
> > And, how can I see how much data is being cached?
> >
> Nothing gets cached per se with bluestore.
> As for utilization, there were values for the journal, AFAIK there isn't
> anything for the DB stuff at this point.
>
> In general, google "ceph grafana", you will REALLY want to monitor a ceph
> cluster this way.
>
> --
> Christian Balzer        Network/Systems Engineer
> ch...@gol.com   Rakuten Communications
>



-- 
Kind Regards
Rudi Ahlers
Website: http://www.rudiahlers.co.za
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] how to improve performance

2017-11-21 Thread Christian Balzer
On Tue, 21 Nov 2017 09:21:58 +0200 Rudi Ahlers wrote:

> On Mon, Nov 20, 2017 at 2:36 PM, Christian Balzer  wrote:
> 
> > On Mon, 20 Nov 2017 14:02:30 +0200 Rudi Ahlers wrote:
> >  
> > > We're planning on installing 12X Virtual Machines with some heavy loads.
> > >
> > > the SSD drives are  INTEL SSDSC2BA400G4
> > >  
> > Interesting, where did you find those?
> > Or did you have them lying around?
> >
> > I've been unable to get DC S3710 SSDs for nearly a year now.
> >  
> 
> In South Africa, one of our suppliers had some in stock. They're still
> fairly new, about 2 months old now.
> 
> 
Odd, oh well.

> 
> 
> > The SATA drives are ST8000NM0055-1RM112  
> > >  
> > Note that these (while fast) have an internal flash cache, limiting them to
> > something like 0.2 DWPD.
> > Probably not an issue with the WAL/DB on the Intels, but something to keep
> > in mind.
> >  
> 
> 
> I don't quite understand what you want to say, please explain?
> 
See the other mails in this thread after the one above.
In short, probably nothing to worry about.

> 
> 
> > > Please explain your comment, "b) will find a lot of people here who don't
> > > approve of it."
> > >  
> > Read the archives.
> > Converged clusters are complex and debugging Ceph when tons of other
> > things are going on at the same time on the machine even more so.
> >  
> 
> 
> Ok, so I have 4 physical servers and need to setup a highly redundant
> cluster. How else would you have done it? There is no budget for a SAN, let
> alone a highly available SAN.
>
As I said, I'd be fine doing it with Ceph, if that was a good match.
It's easy to starve resources with hyperconverged clusters.

Since you're using proxmox, DRBD would be an obvious alternative,
especially if you're not planning on growing this cluster. 
 
You only mentioned 3 servers so far, is the fourth one non-Ceph?

> 
> 
> >  
> > > I don't have access to the switches right now, but they're new so  
> > whatever  
> > > default config ships from factory would be active. Though iperf shows  
> > 10.5  
> > > GBytes  / 9.02 Gbits/sec throughput.
> > >  
> > Didn't think it was the switches, but completeness sake and all that.
> >  
> > > What speeds would you expect?
> > > "Though with your setup I would have expected something faster, but NOT  
> > the  
> > > theoretical 600MB/s 4 HDDs will do in sequential writes."
> > >  
> > What I wrote.
> > A 7200RPM HDD, even these, can not sustain writes much over 170MB/s, in
> > the most optimal circumstances.
> > So your cluster can NOT exceed about 600MB/s sustained writes with the
> > effective bandwidth of 4 HDDs.
> > Smaller writes/reads that can be cached by RAM, DB, onboard caches on the
> > HDDs of course can and will be faster.
> >
> > But again, you're missing the point, even if you get 600MB/s writes out of
> > your cluster, the number of 4k IOPS will be much more relevant to your VMs.
> >
> >  
> hdparm shows about 230MB/s:
> 
> ^Croot@virt2:~# hdparm -Tt /dev/sda
> 
> /dev/sda:
>  Timing cached reads:   20250 MB in  2.00 seconds = 10134.81 MB/sec
>  Timing buffered disk reads: 680 MB in  3.00 seconds = 226.50 MB/sec
>
That's read and a very optimized sequential one at that.  
> 
> 
> 600MB/s would be super nice, but in reality even 400MB/s would be nice.
Do you really need to write that amount of data in a short time?
Typical VMs are IOPS bound, as pointed out several times.

> Would it not be achievable?
> 
Maybe, but you need to find out what, if anything makes your cluster
slower than this.
iostat, atop, etc can help with that.
How busy are your CPUs, HDDs and SSDs when you run that benchmark?

> 
> 
> > >
> > >
> > > On this, "If an OSD has no fast WAL/DB, it will drag the overall speed
> > > down. Verify and if so fix this and re-test.": how?
> > >  
> > No idea, I don't do bluestore.
> > You noticed the lack of a WAL/DB for sda, go and fix it.
> > If in in doubt by destroying and re-creating.
> >
> > And if you're looking for a less invasive procedure, docs and the ML
> > archive, but AFAIK there is nothing but re-creation at this time.
> >  
> 
> 
> Since I use Proxmox, which set up a DB device but not a WAL device.
> 
Again, I don't do bluestore.
But AFAIK, WAL will live on the fastest device, which is the SSD you've
put the DB on, unless specified separately. 
So nothing to be done here.

Christian
> 
> 
> 
> > Christian  
> > >
> > > On Mon, Nov 20, 2017 at 1:44 PM, Christian Balzer  wrote:
> > >  
> > > > On Mon, 20 Nov 2017 12:38:55 +0200 Rudi Ahlers wrote:
> > > >  
> > > > > Hi,
> > > > >
> > > > > Can someone please help me, how do I improve performance on ou CEPH  
> > > > cluster?  
> > > > >
> > > > > The hardware in use are as follows:
> > > > > 3x SuperMicro servers with the following configuration
> > > > > 12Core Dual XEON 2.2Ghz  
> > > > Faster cores is better for Ceph, IMNSHO.
> > > > Though with main storage on HDDs, this will do.
> > > >  
> > > > > 128GB RAM  
> > > > Overkill for Ceph but I see something else below

Re: [ceph-users] how to test journal?

2017-11-21 Thread Christian Balzer
On Tue, 21 Nov 2017 10:23:20 +0200 Rudi Ahlers wrote:

> Hi,
> 
> Is it possible to test the CEPH journal to see if it performs optimally?
> 
You're using bluestore, as I said before, there is no journal.

The SSDs you're using are top of the line, perfectly suited for what
you're using them for.

> And, how can I see how much data is being cached?
> 
Nothing gets cached per se with bluestore. 
As for utilization, there were values for the journal, AFAIK there isn't
anything for the DB stuff at this point.

In general, google "ceph grafana", you will REALLY want to monitor a ceph
cluster this way.

-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Rakuten Communications
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] OSD failure test freezes the cluster

2017-11-21 Thread Gmail
Hi All,

I was performing an OSD failure test on a 3 node Ceph cluster configured as 
follows:
Ceph version 12.2.1
3 x MDS (one per node)
3 x MON (one per node)
3 x MGR (one per node)
15 x OSDs (5 per node)

Ceph filesystem was mounted on a different node using Ceph kernel block device 
driver.

I failed one of the OSD drives by removing it from the SCSI bus, while 
consistently writing 1MB files to the cluster.

Once the drive was failed, the write process stopped, ceph -s command timeout, 
the three MON, MDS and MGR daemons status shows that they received a sigkill.

Any clue on what’s going on on the cluster?

Regards,
Bishoy

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] how to test journal?

2017-11-21 Thread Gmail
The only test I’ve performed on the journal was running dstat on the journal 
and OSDs.
I know the sustained throughput performance for the journal drive and OSD 
drives, then I keep writing small files (1MB) to Ceph cluster mounting it using 
Ceph kernel block device driver.
If the journal throughput measured by dstat is close to the theoretical 
throughput but the OSDs are not, that means that the journal is throttling the 
cluster performance, and if the journal and OSDs are pushed to their limits, 
that means that the journal is doing OK.
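
(A dstat invocation along these lines is enough for that; the device names are only examples, pick your journal/DB SSD and the data disks:

dstat -td --disk-util -D sde,sdc,sdd 5
)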

Hope that helps!

Regards,
Bishoy

> On Nov 21, 2017, at 12:23 AM, Rudi Ahlers  wrote:
> 
> Hi, 
> 
> Is it possible to test the CEPH journal to see if it performs optimally?
> 
> And, how can I see how much data is being cached?
> 
> -- 
> Kind Regards
> Rudi Ahlers
> Website: http://www.rudiahlers.co.za 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] how to test journal?

2017-11-21 Thread Rudi Ahlers
Hi,

Is it possible to test the CEPH journal to see if it performs optimally?

And, how can I see how much data is being cached?

-- 
Kind Regards
Rudi Ahlers
Website: http://www.rudiahlers.co.za
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com