Hi,
Since many ceph clusters use Intel SSDs and admins do recommend them,
they are probably very good drives. My own experience with them, however,
is not so good. (About 70% of our Intel drives ran into the 8 MB bug
at my previous job, 5xx and DC35xx series both, latest firmware at that
time
Hi,
If the problem is not severe and you can wait, then according to this:
http://ceph.com/community/new-luminous-pg-overdose-protection/
there is a pg merge feature coming.
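In the meantime you can check the per-pool PG counts like this (just a
sketch, <pool> is a placeholder):
ceph osd pool ls
ceph osd pool get <pool> pg_num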
Regards,
Denes.
On 12/18/2017 02:18 PM, Jens-U. Mozdzen wrote:
Hi *,
facing the problem of reducing the number of PGs
Hi,
This is just a tip, I do not know if this actually applies to you, but
some SSDs deliberately decrease their write throughput so they do not
wear out the cells before the warranty period is over.
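If you suspect this, the drive's own wear counters might confirm it (a
sketch; the attribute names vary by vendor):
smartctl -A /dev/sdX | grep -iE 'wear|wearout|percent'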
Denes.
On 12/17/2017 06:45 PM, shadow_lin wrote:
Hi All,
I am testing luminous 12.2.
Hi,
I found another possible cause for your problem:
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#placement-group-down-peering-failure
I hope this helps,
Denes.
On 12/11/2017 03:43 PM, Denes Dolhay wrote:
Hi Aaron!
There is a previous post about safely
Hi Aaron!
There is a previous post about safely shutting down and restarting a
cluster:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-April/017378.html
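The commonly recommended flags from that thread look something like this
(a sketch):
ceph osd set noout
ceph osd set norebalance
# ...shut down in order: clients, then mds/rgw, then osds, then mons;
# reverse the order on startup, then:
ceph osd unset norebalance
ceph osd unset noout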
To the problems at hand:
What size were you using?
Ceph can only obey the failure domain if it knows exactly which osd is
on
Hi,
The ceph mds keeps all the capabilities for the files, however the
clients modify the rados data pool objects directly (they do not do
the content modification through the mds).
IMHO if the file (really) gets corrupted because of a client write (not
some corruption from the mds / osd) th
Hi,
So for this to happen you have to lose another osd before backfilling is
done.
Thank you! This clarifies it!
Denes
On 12/05/2017 03:32 PM, Ronny Aasen wrote:
On 05. des. 2017 10:26, Denes Dolhay wrote:
Hi,
This question popped up a few times already under filestore and
bluestore
Hello!
I can only answer some of your questions:
-The backfill process obeys a "nearfull_ratio" limit (I think it defaults
to 85%); above that the system stops repairing itself, so it won't fill
up to 100% (see the sketch below for checking the ratios).
-The normal write ops obey a full_ratio too, I think it defaults to 95%;
above that no write io w
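You can check and, on Luminous, adjust the ratios like this (a sketch):
ceph osd dump | grep -i ratio
ceph osd set-nearfull-ratio 0.85
ceph osd set-full-ratio 0.95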
Hi,
This question popped up a few times already under filestore and
bluestore too, but please help me understand why this is:
"when you have 2 different objects, both with correct digests, in your
cluster, the cluster can not know which of the 2 objects is the correct
one."
Doesn't it us
Yep, you are correct, thanks!
On 12/04/2017 07:31 PM, David Turner wrote:
"The journals can only be moved back by a complete rebuild of that osd
as to my knowledge."
I'm assuming that since this is a cluster that he's inherited and that
it's configured like this that it's probably not runnin
Hi,
I would not rip out the disks, but I would reweight the osd to 0, wait
for the cluster to rebalance, and when it is done, you can remove the
disk / raid pair without ever going down to 1 copy only.
The journals can only be moved back by a complete rebuild of that osd,
to my knowledge.
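A sketch of that sequence (osd.N is a placeholder):
ceph osd crush reweight osd.N 0
# wait until "ceph -s" reports HEALTH_OK again, then:
ceph osd out N
ceph osd crush remove osd.N
ceph auth del osd.N
ceph osd rm N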
Hi,
On 12/04/2017 12:12 PM, tim taler wrote:
Hi,
thnx a lot for the quick response
and for laying out some of the issues
I'm also new, but I'll try to help. IMHO most of the pros here would be quite
worried about this cluster if it is production:
thought so ;-/
-A prod ceph cluster shoul
Hi,
I'm also new, but I'll try to help. IMHO most of the pros here would be
quite worried about this cluster if it is production:
-A prod ceph cluster should not be run with size=2 min_size=1 (see the
sketch below for checking and raising these), because:
--In case of a downed osd / host the cluster could have problems
determining which data i
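To check and raise those settings on an existing pool (a sketch; <pool>
is a placeholder):
ceph osd pool get <pool> size
ceph osd pool set <pool> size 3
ceph osd pool set <pool> min_size 2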
As per your ceph status it seems that you have 19 pools, all of them
erasure coded as 3+2?
It seems that when you took the node offline ceph could move some of
the PGs to other nodes (it seems that one or more pools do not
require all 5 osds to be healthy. Maybe they are replicated,
Hello,
You might consider checking the iowait (during the problem), and the
dmesg (after it recovered). Maybe an issue with the given sata/sas/nvme
port?
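For example (a sketch):
iostat -x 1        # watch %iowait and per-device utilization during the problem
dmesg -T | grep -iE 'ata|scsi|nvme|reset|error'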
Regards,
Denes
On 11/29/2017 06:24 PM, Matthew Vernon wrote:
Hi,
We have a 3,060 OSD ceph cluster (running Jewel
10.2.7-0ubuntu0.16.0
So you are using a 40 / 100 gbit connection all the way to your client?
John's question is valid because 10 gbit = 1.25 GB/s ... subtract some
ethernet, IP, TCP and protocol overhead, take into account some
additional network factors, and you are about there...
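A raw iperf3 run between the client and an osd node would show what the
link actually delivers (a sketch; the hostname is a placeholder):
iperf3 -s              # on one node
iperf3 -c osd-node1    # on the other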
Denes
On 11/10/2017 05:10 PM, R
-sorry, wrong address
Hi Richard,
I have seen a few lectures about bluestore, and they made it abundantly
clear that bluestore is superior to filestore in the sense that it
writes data to the disk only once (this is how they could achieve a
2x-3x speed increase).
So this is true if there
-osd 2 comes back up.
bang! osd 1 and 2 now have different views of pg "A" but both claim to
have current data.
In this case, OSD 1 will not accept IO precisely because it cannot
prove it has the current data. That is the basic purpose of OSD
peering and holds in all
Hello,
I have a trick question for Mr. Turner's scenario:
Let's assume size=2, min_size=1
-We are looking at pg "A" acting [1, 2]
-osd 1 goes down, OK
-osd 1 comes back up, backfill of pg "A" commences from osd 2 to osd 1, OK
-osd 2 goes down (and therefore pg "A"'s backfill to osd 1 is
incompl
Hello,
I'm not sure where, but I think I read that you can specify
"80+443s" to accomplish this.
Would you please try that?
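If it is the civetweb frontend, I believe it would go into ceph.conf
something like this (a sketch; section name and cert path are
placeholders):
[client.rgw.gateway]
rgw_frontends = civetweb port=80+443s ssl_certificate=/etc/ceph/rgw.pem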
Kind regards,
Denes.
On 10/30/2017 05:42 PM, alastair.dewhu...@stfc.ac.uk wrote:
Hello
We have a dual stack test machine running RadosGW. It is current
that which copy should be used when
repairing?
There is a command, ceph pg force-recovery, but I cannot find
documentation for it.
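For reference, the usage seems to be (a sketch; <pgid> is something
like 2.1f):
ceph pg force-recovery <pgid>
ceph pg force-backfill <pgid>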
Kind regards,
Denes Dolhay.
On 10/28/2017 01:05 PM, Mario Giammarco wrote:
Hello,
we recently upgraded two clusters to Ceph luminous with bluestore and
we disco
oo:
http://docs.ceph.com/docs/master/rados/configuration/auth-config-ref/
You will probably have to go through all the steps, because you are
missing the mds, osd, rgw, mgr keyrings too.
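With ceph-deploy the keyrings are normally created and collected like
this (a sketch; the node name is a placeholder):
ceph-deploy mon create-initial
ceph-deploy gatherkeys ceph-node1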
Cheers,
Denes.
On 10/27/2017 04:07 AM, GiangCoi Mr wrote:
Hi Denes Dolhay,
This is the error when I run com
On Oct 26, 2017, at 10:34 PM, Denes Dolhay <de...@denkesys.com> wrote:
Hi,
Did you create a cluster first?
ceph-deploy new {initial-monitor-node(s)}
Cheers, Denes.
On 10/26/2017 05:25 PM, GiangCoi Mr wrote:
Dear Alan Johnson,
I installed with the command: ceph-deplo
Hi,
Did you create a cluster first?
ceph-deploy new {initial-monitor-node(s)}
Cheers, Denes.
On 10/26/2017 05:25 PM, GiangCoi Mr wrote:
Dear Alan Johnson,
I installed with the command: ceph-deploy install ceph-node1 --no-adjust-repos.
When the install succeeded, I ran the command: ceph-deploy mon ceph-nod
I think you are searching for this:
osd scrub sleep
http://docs.ceph.com/docs/master/rados/configuration/osd-config-ref/
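You can try it on a running cluster without a restart (a sketch; the
value is seconds to sleep between scrub chunks):
ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'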
Denes.
On 10/25/2017 06:06 PM, Alejandro Comisario wrote:
any comment on this one?
Interesting what to do in this situation.
On Wed, Jul 5, 2017 at 10:51 PM, Adrian
Hi,
There was a thread about this not long ago, please check:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/021676.html
Denes.
On 10/24/2017 11:48 AM, shadow_lin wrote:
Hi All,
The cluster has 24 osds with 24 8TB hdds.
Each osd server has 2GB ram and runs 2 osds with 2 8TB hdds.
- H.S. Amiata wrote:
Hi
I used the pveceph tool provided with Proxmox to initialize ceph. I can
change it, but in that case should I put only the public network or only
the cluster network in ceph.conf?
Thanks
On 23/10/2017 17:33, Denes Dolhay wrote:
Hi,
So, you are running both the public an
's everything.
Thanks
On 23/10/2017 15:42, Denes Dolhay wrote:
Hi,
Maybe some routing issue?
"CEPH has public and cluster network on 10.10.10.0/24"
Does this mean that the nodes have separate public and cluster networks,
both on 10.10.10.0/24, or that you did not specify
Hi,
Maybe some routing issue?
"CEPH has public and cluster network on 10.10.10.0/24"
Does this mean that the nodes have separate public and cluster networks,
both on 10.10.10.0/24, or that you did not specify a separate cluster
network?
Please provide route table, ifconfig, ceph.conf
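If you do intend separate networks, the ceph.conf entries would look
something like this (a sketch; the subnets are placeholders):
[global]
public network = 10.10.10.0/24
cluster network = 10.10.20.0/24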
Regards
Hi,
If you want to split your data into 10 pieces (stripes) and hold 4
parity pieces on top (so your cluster can handle the loss of any 4
osds), then you need a minimum of 14 osds to hold your data.
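A sketch of such a profile and pool (the names are placeholders):
ceph osd erasure-code-profile set k10m4 k=10 m=4
ceph osd pool create ecpool 128 128 erasure k10m4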
Denes.
On 10/19/2017 04:24 PM, Josy wrote:
Hi,
I would like to set up an erasure code pr
Hi All,
The linked document is for filestore, which in your case is correct as I
understand it, but I wonder if a similar document exists for bluestore?
Thanks,
Denes.
On 10/18/2017 02:56 PM, Stijn De Weirdt wrote:
hi all,
we have a ceph 10.2.7 cluster with an 8+3 EC pool.
in that pool, the
Hello,
Could you include the monitors and the osds in your clock skew test as well?
How did you create the osds? ceph-deploy osd create osd1:/dev/sdX
osd2:/dev/sdY osd3:/dev/sdZ ?
Some log from one of the osds would be great!
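For the clock check, something like this on every node might help (a
sketch):
ntpq -p                  # on every mon / osd node
ceph time-sync-status    # the mons' own view of the skew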
Kind regards,
Denes.
On 10/14/2017 07:39 PM, dE wrote:
On 10
data cache would only be populated on a timer (between 1s
and 5s), which is never reached because of the repeated watch ls query.
Just a blind shot in the dark...
Thanks:
Denes.
On 10/13/2017 01:32 PM, Burkhard Linke wrote:
Hi,
On 10/13/2017 12:36 PM, Denes Dolhay wrote:
Dear All
Dear All,
First of all, this is my first post, so please be lenient :)
For the last few days I have been testing ceph and cephfs, deploying a
PoC cluster.
I was testing the cephfs kernel client caching when I came across
something strange, and I cannot decide if it is a bug or I ju