acpi_cpufreq was the driver I used.
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Robert
LeBlanc
Sent: 02 September 2015 22:34
To: Nick Fisk
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Ceph SSD CPU Frequency Benchmarks
-----BEGIN PGP SIGNED MESSAGE-----
Hi!
I'm running Ceph Hammer 0.94.3 with 72 OSDs on 6 nodes. My problem is that
/var/log/messages is constantly filling with messages like these:
Sep 3 11:16:31 slpeah001 ceph-osd: 2015-09-03 11:16:31.393234
7f5a6bfd3700 -1 osd.34 2991 heartbeat_check: no reply from osd.68
since back 2015-09-03 11:16:
> On 02 Sep 2015, at 17:50, Robert LeBlanc wrote:
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> Thanks for the responses.
>
> I forgot to include the fio test for completeness:
>
> 8 job QD=8
> [ext4-test]
> runtime=150
> name=ext4-test
> readwrite=randrw
> size=15G
> blocksize=4k
On Thu, Sep 3, 2015 at 7:48 AM, Chris Taylor wrote:
> I removed the latest OSD that was respawning (osd.23) and now I'm having the
> same problem with osd.30. It looks like they both have pg 3.f9 in common. I
> tried "ceph pg repair 3.f9" but the OSD is still respawning.
>
> Does anyone have any idea
Can you post the output of
ceph daemon osd.xx config show? (probably as an attachment).
There are several things that I've seen cause it:
1) too many PGs but too few degraded objects make it seem "slow" (if you
just have 2 degraded objects but restarted a host with 10K PGs, it will have to
sc
You're like the 5th person here (including me) that was hit by this.
Could I get some input from someone using Ceph with RBD and thousands of OSDs?
How high did you have to go?
I only have ~200 OSDs and I had to bump the limit up to 1 for VMs that have
multiple volumes attached, this doesn'
Just a word of warning.
I had multiple simultaneous node failures from running "i7z" monitoring tool
while investigating latency issues. It does nothing more than reading MSRs from
the CPU.
That was on a CentOS 6.5 kernel.
cpu_dma_latency was opened with "1" with an occasional run of cyclictest fr
EnhanceIO? I'd say get rid of that first and then try reproducing it.
Jan
> On 03 Sep 2015, at 03:14, Alex Gorbachev wrote:
>
> We have experienced a repeatable issue when performing the following:
>
> Ceph backend with no issues, we can repeat any time at will in lab and
> production. Cloning
Hi,
>>As of yesterday we are now ready to start providing Debian Jessie packages.
>>They will be present by default for the upcoming Ceph release (Infernalis).
>>For other releases (e.g. Firefly, Hammer, Giant) it means that there will be
>>a Jessie package for them for new versions only.
Ok
And what to do for those with systemd? Because systemd totally ignores
limits.conf and manages limits on per-service basis...
What actual services should be tuned WRT LimitNOFILE?
Or should the DefaultLimitNOFILE increased in /etc/systemd/system.conf?
Thanks in advance!
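One hedged option (unit name and value illustrative, using systemd's documented `LimitNOFILE=` directive) is a per-service drop-in rather than raising the global default:

```ini
# /etc/systemd/system/ceph-osd@.service.d/limits.conf  (path illustrative)
[Service]
LimitNOFILE=131072
```

Alternatively, `DefaultLimitNOFILE=` in /etc/systemd/system.conf raises the default for every service; either way a `systemctl daemon-reload` and a restart of the affected daemons is needed before the new limit takes effect.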
2015-09-03 17:46 GMT+08:00
/etc/libvirt/qemu.conf:
max_files=
I expect this should always work, even on systemd b0rked systems...
Only solves the problem for QEMU, not for other librbd users.
Jan
> On 03 Sep 2015, at 14:48, Vasiliy Angapov wrote:
>
> And what to do for those with systemd? Because systemd totally ign
Hello,
In response to an rbd map command, we are getting a "Device or resource
busy".
$ rbd -p platform map ceph:pzejrbegg54hi-stage-4ac9303161243dc71c75--php
rbd: sysfs write failed
rbd: map failed: (16) Device or resource busy
We currently have over 200 rbds mapped on a single host. Can
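When chasing an EBUSY from `rbd map`, a reasonable first step is to see what is already mapped and what still holds the device open (a sketch; the device path is illustrative):

```shell
# List images already mapped on this host (id, pool, image, snap, device)
rbd showmapped

# See whether a process still holds a given rbd device open
lsof /dev/rbd0

# Unmap a device that is no longer in use
rbd unmap /dev/rbd0
```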
Hi, Experts
I am a fresh user of the Ceph cluster.
After my installation of the Ceph gateway, HTTP transactions never work! I think
there is something wrong in the configuration, but I don't know where.
The apache2 log shows the message was sent to the protocol handler already and
replied with 403; seems i
Hi!
Actually, it looks like O_APPEND does not work even if the file is kept open
read-only (reader + writer). Test:
in one session
> less /mnt/ceph/test
in another session
> echo "start or end" >> /mnt/ceph/test
"start or end" is written to the start of the file.
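The same check can be scripted; on a POSIX-compliant filesystem the appended line lands at the end of the file, which is what the CephFS behavior above deviates from (a minimal sketch against a local temp directory, not a CephFS mount):

```python
# O_APPEND check analogous to the `less` + `echo >>` test above,
# run against a local temp file (paths are illustrative).
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test")
with open(path, "w") as f:
    f.write("original\n")

reader = open(path, "r")   # analogous to keeping `less /mnt/ceph/test` open
writer = open(path, "a")   # O_APPEND, like `echo "start or end" >> file`
writer.write("start or end\n")
writer.close()

data = open(path).read()
reader.close()
print(data)  # appended line is at the end, not the start
```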
J.
On 02.09.2015 11:50,
Hi,
I am wondering if anybody in the community is running ceph cluster with
high density machines e.g. Supermicro SYS-F618H-OSD288P (288 TB),
Supermicro SSG-6048R-OSD432 (432 TB) or some other high density
machines. I am assuming that the installation will be of petabyte scale
as you would want to
It's funny because, in my mind, such dense servers seem like a bad idea
for exactly the reason you mention: what if one fails? Losing 400+TB
of storage is going to have quite some impact, 40G interfaces or not, and
no matter what options you tweak.
Sure it'll be cost effective per TB, but that is
It's not exactly a single system
SSG-F618H-OSD288P*
4U-FatTwin, 4x 1U 72TB per node, Ceph-OSD-Storage Node
This could actually be pretty good, it even has decent CPU power.
I'm not a big fan of blades and blade-like systems - sooner or later a
backplane will die and you'll need to power off ev
Echoing what Jan said, the 4U Fat Twin is the better choice of the two options,
as it is very difficult to get long-term reliable and efficient operation of
many OSDs when they are serviced by just one or two CPUs.
I don’t believe the FatTwin design has much of a backplane, primarily sharing
pow
> On 03 Sep 2015, at 16:49, Paul Evans wrote:
>
> Echoing what Jan said, the 4U Fat Twin is the better choice of the two
> options, as it is very difficult to get long-term reliable and efficient
> operation of many OSDs when they are serviced by just one or two CPUs.
> I don’t believe the Fa
Those Fat twins are not blades in the classical sense, they are what are often
referred to as un-blades.
They only share power, i.e. about 4-6 pins which are connected by solid bits of
copper to the PSUs. I can't see any way of this going wrong. If you take out
all the sleds you are just left
My take is that you really only want to do these kinds of systems if you
have massive deployments. At least 10 of them, but probably more like
20-30+. You do get massive density with them, but I think if you are
considering 5 of these, you'd be better off with 10 of the 36 drive
units. An ev
Rewording to remove confusion...
Config 1: set up a cluster with 1 node with 6 OSDs
Config 2: identical hardware, set up a cluster with 2 nodes with 3 OSDs each
In each case I do the following:
1) rados bench write --no-cleanup the same number of 4M size objects
2) drop caches on all osd no
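On the command line, that procedure might look like this (a sketch; pool name and durations are illustrative):

```shell
# Step 1: write 4M objects and keep them around for the read test
rados bench -p testpool 60 write -b 4M --no-cleanup

# Step 2: on every OSD node, drop the page cache before reading back
sync; echo 3 > /proc/sys/vm/drop_caches

# Then read the same objects back sequentially
rados bench -p testpool 60 seq
```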
On 09/03/2015 10:39 AM, Deneau, Tom wrote:
Rewording to remove confusion...
Config 1: set up a cluster with 1 node with 6 OSDs
Config 2: identical hardware, set up a cluster with 2 nodes with 3 OSDs each
In each case I do the following:
1) rados bench write --no-cleanup the same number of 4
On 09/03/2015 02:22 AM, Gregory Farnum wrote:
On Thu, Sep 3, 2015 at 7:48 AM, Chris Taylor wrote:
I removed the latest OSD that was respawning (osd.23) and now I'm having the
same problem with osd.30. It looks like they both have pg 3.f9 in common. I
tried "ceph pg repair 3.f9" but the OSD is stil
We also just started having our 850 Pros die one after the other after
about 9 months of service. 3 down, 11 to go... No warning at all, the drive
is fine, and then it's not even visible to the machine. According to the
stats in hdparm and the calcs I did they should have had years of life
left, so
Hey Mark / Community
This is the sequence of changes that seems to have fixed the Ceph problem:
1# Upgrading Disk controller firmware from 6.34 to 6.64 ( latest )
2# Rebooting all nodes in order to make new firmware into effect
Read and write operations are now normal as well as system load
Am I the only one who finds it funny that the "ceph problem" was fixed by
an update to the disk controller firmware? :-)
Ian
On Thu, Sep 3, 2015 at 11:13 AM, Vickey Singh
wrote:
> Hey Mark / Community
>
> This is the sequence of changes that seems to have fixed the Ceph
> problem:
>
> 1# Upg
This crash is what happens if a clone is missing from SnapSet (internal
data) for an object in the ObjectStore. If you had out of space issues,
this could possibly have been caused by being able to rename or create
files in a directory, but not being able to update SnapSet.
I've completely
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Just about how funny "ceph problems" are fixed by changing network
configurations.
-
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Thu, Sep 3, 2015 at 11:16 AM, Ian Colle wrote:
> Am I the on
On 09/03/2015 10:20 AM, David Zafman wrote:
This crash is what happens if a clone is missing from SnapSet
(internal data) for an object in the ObjectStore. If you had out of
space issues, this could possibly have been caused by being able to
rename or create files in a directory, but not bei
It's always possible it was the reboot (seriously!) :)
Mark
On 09/03/2015 12:16 PM, Ian Colle wrote:
Am I the only one who finds it funny that the "ceph problem" was fixed
by an update to the disk controller firmware? :-)
Ian
On Thu, Sep 3, 2015 at 11:13 AM, Vickey Singh
mailto:vickey.singh22
I found a place to paste my output of `ceph daemon osd.xx config show` for
all my OSDs:
https://www.zerobin.net/?743bbbdea41874f4#FNk5EjsfRxvkX1JuTp52fQ4CXW6VOIEB0Lj0Icnyr4Q=
If you want it in a gzip'd txt file, you can download here:
https://mega.nz/#!oY5QAByC!JEWhHRms0WwbYbwG4o4RdTUWtFwFjUDLWh
Thanks everybody for the feedback.
On 09/03/2015 05:09 PM, Mark Nelson wrote:
> My take is that you really only want to do these kinds of systems if you
> have massive deployments. At least 10 of them, but probably more like
> 20-30+. You do get massive density with them, but I think if you are
>
On 09/03/2015 02:49 PM, Gurvinder Singh wrote:
Thanks everybody for the feedback.
On 09/03/2015 05:09 PM, Mark Nelson wrote:
My take is that you really only want to do these kinds of systems if you
have massive deployments. At least 10 of them, but probably more like
20-30+. You do get massi
I really advise removing the bastards before they die... no rebalancing
happening, just a temporary OSD down while replacing journals...
What size and model are your Samsungs?
On Sep 3, 2015 7:10 PM, "Quentin Hartman"
wrote:
> We also just started having our 850 Pros die one after the other after
> about
Yeah, we've ordered some S3700's to replace them already. Should be here
early next week. Hopefully they arrive before we have multiple nodes die at
once and can no longer rebalance successfully.
Most of the drives I have are the 850 Pro 128GB (specifically MZ7KE128HMGA)
There are a couple 120GB 8
Hey Mark,
I've just tweaked these filestore settings for my cluster -- after
changing this, is there a way to make ceph move existing objects
around to new filestore locations, or will this only apply to newly
created objects? (i would assume the latter..)
thanks,
-Ben
On Wed, Jul 8, 2015 at 6:
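For reference, a hedged sketch of the filestore split/merge settings under discussion, as they might appear in ceph.conf (values illustrative; a filestore PG subdirectory splits once it holds roughly `filestore split multiple * abs(filestore merge threshold) * 16` objects):

```ini
[osd]
# Merge subdirectories back together below this object count
filestore merge threshold = 40
# Split a subdirectory when it exceeds the derived threshold
filestore split multiple = 8
```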
Chris,
WARNING: Do this at your own risk. You are deleting one of the
snapshots of a specific portion of an rbd image. I'm not sure how rbd
will react. Maybe you should repair the SnapSet instead of remove the
inconsistency. However, as far as I know there isn't a tool to do it.
If you ar
On Thu, Sep 3, 2015 at 6:58 AM, Jan Schermer wrote:
> EnhanceIO? I'd say get rid of that first and then try reproducing it.
Jan, EnhanceIO has not been used in this case, in fact we have never
had a problem with it in read cache mode.
Thank you,
Alex
>
> Jan
>
>> On 03 Sep 2015, at 03:14, Alex
Hrm, I think it will follow the merge/split rules if it's out of whack
given the new settings, but I don't know that I've ever tested it on an
existing cluster to see that it actually happens. I guess let it sit
for a while and then check the OSD PG directories to see if the object
counts make
I was wondering if anybody could give me some insight as to how CephFS does
its caching - read-caching in particular.
We are using CephFS with an EC pool on the backend with a replicated cache
pool in front of it. We're seeing some very slow read times. Trying to
compute an md5sum on a 15GB file t
On Thu, Sep 3, 2015 at 3:20 AM, Nicholas A. Bellinger
wrote:
> (RESENDING)
>
> On Wed, 2015-09-02 at 21:14 -0400, Alex Gorbachev wrote:
>> We have experienced a repeatable issue when performing the following:
>>
>> Ceph backend with no issues, we can repeat any time at will in lab and
>> production
After running some other experiments, I see now that the high single-node
bandwidth only occurs when ceph-mon is also running on that same node.
(In these small clusters I only had one ceph-mon running).
If I compare to a single-node where ceph-mon is not running, I see
basically identical performa
If you have ceph-dencoder installed or can build v0.94.3 to get the
binary, you can dump the SnapSet for the problem object. Once you
understand the removal procedure you could do the following to get a
look at the SnapSet information.
Find the object from --op list with snapid -2 and cu
I'm about to change it on a big cluster too. It totals around 30 million, so
I'm a bit nervous on changing it. As far as I understood, it would indeed move
them around, if you can get underneath the threshold, but it may be hard to do.
Two more settings that I highly recommend changing on a big
In the minority on this one. We have a number of the big SM 72 drive units w/
40 Gbe. Definitely not as fast as even the 36 drive units, but it isn't awful
for our average mixed workload. We can exceed all available performance with
some workloads though.
So while we can't extract all the perfo
On 09/03/2015 02:44 PM, David Zafman wrote:
Chris,
WARNING: Do this at your own risk. You are deleting one of the
snapshots of a specific portion of an rbd image. I'm not sure how rbd
will react. Maybe you should repair the SnapSet instead of remove the
inconsistency. However, as far as
Hiya. Playing with a small Ceph setup from the Quick Start documentation.
Seeing an issue running rbd bench-write. The initial trace is provided
below; let me know if you need other information. FWIW the rados bench
works just fine.
Any idea what is causing this? Is it a parsing issue in the rbd