The message went away, but obviously I still don't get the stats showing in the
dashboard (I am guessing this isn't currently a known bug, and that they
should be working?).
Everything works fine apart from the dashboard not showing the live I/O
stats.
Nothing is mentioned in the mgr logs at the default l
On 2/8/19 8:38 AM, Ashley Merrick wrote:
> So I was adding a new host using ceph-deploy; for the first OSD I
> accidentally ran it against the hostname of the external IP and not the
> internal network.
>
> I stopped / deleted the OSD from the new host and then re-created the
> OSD using the in
On 2/8/19 8:13 AM, Ashley Merrick wrote:
> I have had issues on a mimic cluster (latest release) where the
> dashboard does not display any read or write ops under the pools
> section on the main dashboard page.
>
> I have just noticed while restarting the mgr service that the following
> shows un
So I was adding a new host using ceph-deploy; for the first OSD I
accidentally ran it against the hostname of the external IP and not the
internal network.
I stopped / deleted the OSD from the new host and then re-created the OSD
using the internal hostname along with the rest of the OSDs.
They
I have had issues on a mimic cluster (latest release) where the dashboard
does not display any read or write ops under the pools section on the main
dashboard page.
I have just noticed while restarting the mgr service that the following shows
under "Cluster Logs", nothing else, just the following: "F
Hi all, I created a problem when moving data to Ceph and I would be grateful
for some guidance before I do something dumb.
I started with the 4x 6TB source disks that came together as a single XFS
filesystem via software RAID. The goal is to have the same data on a cephfs
volume, but with these
Hi All. I was on luminous 12.2.0 as I do *not* enable repo updates for critical
software (e.g. openstack / ceph). Upgrades need to occur on an intentional
basis!
So I first upgraded to luminous 12.2.11 following the guide and release
notes.
[root@lvtncephx110 ~]# ceph version
ceph version
On Thu, Feb 7, 2019 at 10:50 AM Dan van der Ster wrote:
>
> On Fri, Feb 1, 2019 at 10:18 PM Neha Ojha wrote:
> >
> > On Fri, Feb 1, 2019 at 1:09 PM Robert Sander
> > wrote:
> > >
> > > Am 01.02.19 um 19:06 schrieb Neha Ojha:
> > >
> > > > If you would have hit the bug, you should have seen failu
Ceph has massive overhead, so it seems it maxes out at ~1 (at most
15000) write IOPS per SSD with a queue depth of 128, and ~1000 IOPS with a
queue depth of 1 (1 ms latency). Or maybe 2000-2500 write IOPS (0.4-0.5 ms)
with the best possible hardware. Micron has only squeezed ~8750 iops from eac
Just to add that a more general formula is that the number of nodes should be
greater than or equal to k+m+m, i.e. N >= k+m+m, for full recovery.
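For example, with the k=4 m=2 pool discussed in this thread that works out to
N >= k + m + m = 4 + 2 + 2 = 8 nodes,
so a 6-node cluster meets the bare k+m placement minimum but has no spare
failure domains to recover into.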
-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Eugen
Block
Sent: Thursday, February 7, 2019 8:47 AM
To
rados bench is garbage; it creates and benches a very small number of objects.
If you want RBD, better test it with fio's ioengine=rbd.
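A minimal invocation along those lines might be (a sketch; pool and image
names are placeholders, and the image must already exist):
fio --name=rbd-bench --ioengine=rbd --clientname=admin --pool=rbd \
    --rbdname=fio_test --rw=randwrite --bs=4k --iodepth=128 \
    --runtime=60 --time_based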
On 7 February 2019 15:16:11 GMT+03:00, Ryan wrote:
>I just ran your test on a cluster with 5 hosts 2x Intel 6130, 12x 860 Evo
>2TB SSD per host (6 per SAS3008), 2x
> That's a useful conclusion to take back.
Last question - We have our SSD pool set to 3x replication. Micron states
that NVMe is good at 2x - is this "taste and safety" or are there any
general thoughts about SSD robustness in a Ceph setup?
Jesper
On Fri, Feb 1, 2019 at 10:18 PM Neha Ojha wrote:
>
> On Fri, Feb 1, 2019 at 1:09 PM Robert Sander
> wrote:
> >
> > Am 01.02.19 um 19:06 schrieb Neha Ojha:
> >
> > > If you would have hit the bug, you should have seen failures like
> > > https://tracker.ceph.com/issues/36686.
> > > Yes, pglog_hard
> On 07/02/2019 17:07, jes...@krogh.cc wrote:
> Thanks for your explanation. In your case, you have low concurrency
> requirements, so focusing on latency rather than total iops is your
> goal. Your current setup gives 1.9 ms latency for writes and 0.6 ms for
> read. These are considered good, it i
You need to run a full deep scrub before continuing the upgrade; the
reason for this is that the deep scrub migrates the format of some
snapshot-related on-disk data structures.
It looks like you only tried a normal scrub, not a deep scrub.
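A simple way to kick that off on every OSD (a sketch; on a busy cluster you
may want to pace these):
for osd in $(ceph osd ls); do ceph osd deep-scrub "$osd"; done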
Paul
--
Paul Emmerich
Looking for help with your Ceph clus
Hi Francois,
Is it correct that recovery will be forbidden by the crush rule if
a node is down?
yes, that is correct: failure-domain=host means no two chunks of the
same PG can be on the same host. So if your PG is divided into 6
chunks, they're all on different hosts, and no recovery is po
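For reference, a k=4 m=2 host-failure-domain profile like the one described
is typically created along these lines (the profile name is a placeholder):
ceph osd erasure-code-profile set ec42 k=4 m=2 crush-failure-domain=host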
Dear All
We created an erasure-coded pool with k=4 m=2 and failure-domain=host, but
have only 6 OSD nodes.
Is it correct that recovery will be forbidden by the crush rule if a node is
down?
After rebooting all nodes we noticed that the recovery was slow, maybe half an
hour, but all
On 07/02/2019 17:07, jes...@krogh.cc wrote:
Hi Maged
Thanks for your reply.
6k is low as a max write IOPS value, even for a single client. For a cluster
of 3 nodes, we see from 10k to 60k write IOPS depending on hardware.
Can you increase your threads to 64 or 128 via the -t parameter?
I can absolu
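The suggested rerun would be something like (pool name is a placeholder):
rados bench -p <pool> -b 4096 -t 64 10 write --no-cleanup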
Alternatively, we will increase mon_data_size_warn to 30G (from 15G).
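At runtime that would presumably be something like this (30 GiB expressed in
bytes; it should also be persisted in ceph.conf):
ceph tell mon.* injectargs '--mon_data_size_warn=32212254720'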
Thanks
Swami
On Thu, Feb 7, 2019 at 8:44 PM Dan van der Ster wrote:
>
> On Thu, Feb 7, 2019 at 4:12 PM M Ranga Swami Reddy
> wrote:
> >
> > >Compaction isn't necessary -- you should only need to restart all
> > >peons, then the
Hi,
could it be a missing 'ceph osd require-osd-release luminous' on your cluster?
When I check a luminous cluster I get this:
host1:~ # ceph osd dump | grep recovery
flags sortbitwise,recovery_deletes,purged_snapdirs
The flags in the code you quote seem related to that.
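If they are missing, the command in question is simply:
ceph osd require-osd-release luminous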
Can you check that out
On 2/7/2019 6:06 PM, Eugen Block wrote:
At first - you should upgrade to 12.2.11 (or bring the mentioned
patch in by other means) to fix the rename procedure, which will avoid new
inconsistent objects appearing in the DB. This should at least reduce
the OSD crash frequency.
We'll have to wait until
Hello All! Yesterday I started the upgrade from luminous to mimic with one of
my 3 MONs.
After applying the mimic yum repo and updating, a restart reports the
following error from the MON log file:
==> /var/log/ceph/ceph-mon.lvtncephx121.log <==
2019-02-07 10:02:40.110 7fc8283ed700 -1 mon.lvtncephx121@0
On Thu, Feb 7, 2019 at 4:12 PM M Ranga Swami Reddy wrote:
>
> >Compaction isn't necessary -- you should only need to restart all
> >peons, then the leader. A few minutes later the DBs should start
> >trimming.
>
> As we are on a production cluster, it may not be safe to restart the
> ceph-mon; inste
>Compaction isn't necessary -- you should only need to restart all
>peons, then the leader. A few minutes later the DBs should start
>trimming.
As we are on a production cluster, it may not be safe to restart the
ceph-mon; instead we'd prefer to do the compaction on the non-leader mons.
Is this ok?
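That is, something like this on each non-leader mon (the mon name is a
placeholder):
ceph tell mon.<id> compact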
Thanks
Swam
Hi Maged
Thanks for your reply.
> 6k is low as a max write IOPS value, even for a single client. For a cluster
> of 3 nodes, we see from 10k to 60k write IOPS depending on hardware.
>
> Can you increase your threads to 64 or 128 via the -t parameter?
I can absolutely get it higher by increasing the paral
At first - you should upgrade to 12.2.11 (or bring the mentioned
patch in by other means) to fix the rename procedure, which will avoid
new inconsistent objects appearing in the DB. This should at least
reduce the OSD crash frequency.
We'll have to wait until 12.2.11 is available for openSUSE, I'm
On 07/02/2019 09:17, jes...@krogh.cc wrote:
Hi List
We are in the process of moving to the next use case for our ceph cluster.
Bulk, cheap, slow, erasure-coded cephfs storage was the first - and
that works fine.
We're currently on luminous / bluestore, if upgrading is deemed to
change what we
Eugen,
At first - you should upgrade to 12.2.11 (or bring the mentioned patch
in by other means) to fix the rename procedure, which will avoid new
inconsistent objects appearing in the DB. This should at least reduce the
OSD crash frequency.
At second - theoretically previous crashes could result in
I read here [0] that to get strays removed, you have to 'touch' them or
'getattr on all the remote links'. Is this still necessary in luminous
12.2.11?
Or is there now a manual option to force purging of strays?
[@~]# ceph daemon mds.c perf dump | grep strays
"num_strays": 7474
Hi Igor,
thanks for the quick response!
Just to make sure I don't misunderstand, and because it's a production
cluster:
before anything else I should run fsck on that OSD? Depending on the
result we'll decide how to continue, right?
Is there anything else to be enabled for that command or can
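For what it's worth, I assume the offline check would be run along these
lines (with the OSD stopped; the OSD id is a placeholder):
systemctl stop ceph-osd@<id>
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-<id>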
Hi Eugen,
looks like this isn't [1] but rather
https://tracker.ceph.com/issues/38049
and
https://tracker.ceph.com/issues/36541 (=
https://tracker.ceph.com/issues/36638 for luminous)
Hence it's not fixed in 12.2.10, target release is 12.2.11
Also please note the patch allows avoiding new o
On Thu, Feb 7, 2019 at 12:17 PM M Ranga Swami Reddy
wrote:
>
> Hi Dan,
> >During backfilling scenarios, the mons keep old maps and grow quite
> >quickly. So if you have balancing, pg splitting, etc. ongoing for
> >awhile, the mon stores will eventually trigger that 15GB alarm.
> >But the intended
Hi list,
I found this thread [1] about crashing SSD OSDs, although that was
about an upgrade to 12.2.7, we just hit (probably) the same issue
after our update to 12.2.10 two days ago in a production cluster.
Just half an hour ago I saw one OSD (SSD) crashing (for the first time):
2019-02-07
I just ran your test on a cluster with 5 hosts 2x Intel 6130, 12x 860 Evo
2TB SSD per host (6 per SAS3008), 2x bonded 10GB NIC, 2x Arista switches.
Pool with 3x replication
rados bench -p scbench -b 4096 10 write --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4096 bytes to objects of
On 07/02/2019 20:21, Marc Roos wrote:
I also do not exactly know how many I have. It is sort of a test setup and
the bash script creates a snapshot every day. So with 100 dirs it will be
a maximum of 700. But the script first checks if there is any data with
getfattr --only-values --absolute-names
On 07/02/2019 19:47, Marc Roos wrote:
Is this difference not related to caching? And are you filling up some
cache/queue at some point? If you do a sync after each write, do you
still have the same results?
No, the slow operations are slow from the very beginning. It's not about
filling a buff
Hello - We are using ceph OSD nodes with a controller cache of 1G in size.
Are there any recommendations for using the cache for reads and writes?
Here we are using HDDs with colocated journals.
For the SSD journal - 0% cache and 100% write.
Thanks
On Mon, Feb 4, 2019 at 6:07 PM M Ranga Swami Red
>
>>
>>
>> Hmmm, I am having a daily cron job creating these only on maybe 100
>> directories. I am removing the snapshot if it exists with a rmdir.
>> Should I do this differently? Maybe eg use snap-20190101, snap-20190102,
>> snap-20190103, then I will always create unique directories
Hi Sage
Sure, we will increase mon_data_size_warn to 30G to avoid this type of
warning. And currently we are using a 500G disk here, which I guess
should be good enough.
Thanks
Swami
On Wed, Feb 6, 2019 at 5:56 PM Sage Weil wrote:
>
> Hi Swami
>
> The limit is somewhat arbitrary, based on cluste
Hi Dan,
>During backfilling scenarios, the mons keep old maps and grow quite
>quickly. So if you have balancing, pg splitting, etc. ongoing for
>awhile, the mon stores will eventually trigger that 15GB alarm.
>But the intended behavior is that once the PGs are all active+clean,
>the old maps should
Is this difference not related to caching? And are you filling up some
cache/queue at some point? If you do a sync after each write, do you
still have the same results?
-Original Message-
From: Hector Martin [mailto:hec...@marcansoft.com]
Sent: 07 February 2019 06:51
To: ceph-users@li
On 07/02/2019 19:19, Marc Roos wrote:
Hmmm, I am having a daily cron job creating these only on maybe 100
directories. I am removing the snapshot if it exists with a rmdir.
Should I do this differently? Maybe eg use snap-20190101, snap-20190102,
snap-20190103, then I will always create unique
Hmmm, I am having a daily cron job creating these only on maybe 100
directories. I am removing the snapshot if it exists with a rmdir.
Should I do this differently? Maybe eg use snap-20190101, snap-20190102,
snap-20190103, then I will always create unique directories and the ones
removed will
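A sketch of the rotation being described (path and retention window are
placeholders; cephfs snapshots are created/removed via the .snap directory):
d=$(date +%Y%m%d)
old=$(date -d '7 days ago' +%Y%m%d)
mkdir /mnt/cephfs/somedir/.snap/snap-$d      # create today's snapshot
rmdir /mnt/cephfs/somedir/.snap/snap-$old    # drop the expired one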
4x nodes, around 100GB, 2x 2660, 10Gbit, 2x LSI Logic SAS2308
Thanks for the confirmation Marc
Can you put in a bit more hardware/network details?
Jesper
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi
Thanks for the confirmation Marc
Can you put in a bit more hardware/network details?
Jesper
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
On 07/02/2019 18:17, Marc Roos wrote:
250~1,2252~1,2254~1,2256~1,2258~1,225a~1,225c~1,225e~1,2260~1,2262~1,226
4~1,2266~1,2268~1,226a~1,226c~1,226e~1,2270~1,2272~1,2274~1,2276~1,2278~
1,227a~1,227c~1,227e~1,2280~1,2282~1,2284~1,2286~1,2288~1,228a~1,228c~1,
228e~1,2290~1,2292~1,2294~1,2296~1,2298~
I did your rados bench test on our sm863a pool 3x rep, got similar
results.
[@]# rados bench -p fs_data.ssd -b 4096 10 write --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4096 bytes to objects of size 4096
for up to 10 seconds or 0 objects
Object prefix: benchmark_data_c04_1337712
Also on pools that are empty; it looks like it happens on all cephfs data pools.
pool 55 'fs_data.ec21.ssd' erasure size 3 min_size 3 crush_rule 6
object_hash rjenkins pg_num 8 pgp_num 8 last_change 29032 flags
hashpspool,ec_overwrites stripe_width 8192 application cephfs
removed_snaps
[57f~1,583~
ceph osd pool ls detail
pool 20 'fs_data' replicated size 3 min_size 2 crush_rule 0 object_hash
rjenkins pg_num 64 pgp_num 64 last_change 29032 flags hashpspool
stripe_width 0 application cephfs
removed_snaps
[3~1,5~31,37~768,7a0~3,7a4~b10,12b5~3,12b9~3,12bd~22c,14ea~22e,1719~b04,
2
> On 2/7/19 8:41 AM, Brett Chancellor wrote:
>> This seems right. You are doing a single benchmark from a single client.
>> Your limiting factor will be the network latency. For most networks this
>> is between 0.2 and 0.3 ms. If you're trying to test the potential of
>> your cluster, you'll need
> On Thu, 7 Feb 2019 08:17:20 +0100 jes...@krogh.cc wrote:
>> Hi List
>>
>> We are in the process of moving to the next use case for our ceph cluster.
>> Bulk, cheap, slow, erasure-coded cephfs storage was the first - and
>> that works fine.
>>
>> We're currently on luminous / bluestore, if upgradi
On 2/7/19 8:41 AM, Brett Chancellor wrote:
> This seems right. You are doing a single benchmark from a single client.
> Your limiting factor will be the network latency. For most networks this
> is between 0.2 and 0.3 ms. If you're trying to test the potential of
> your cluster, you'll need multi