Re: [ceph-users] Cluster is empty but it still uses 1Gb of data

2018-03-02 Thread Gonzalo Aguilar Delgado
Hi Max, No, that's not normal. 9GB for an empty cluster. Maybe you reserved some space or you have another service that's taking the space. But it seems way too much to me. On 02/03/18 at 12:09, Max Cuttins wrote: I don't care about getting that space back. I just want to know if it's expecte
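
A quick way to see where that space sits — a sketch, assuming any recent release; the "used" column counts journals and filesystem overhead even on an empty cluster:

    ceph df detail   # cluster-wide and per-pool usage
    ceph osd df      # per-OSD used/available; overhead shows up here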

Re: [ceph-users] PG::peek_map_epoch assertion fail

2017-12-06 Thread Gonzalo Aguilar Delgado
Hi, Since my email server went down because of the error, I have to reply this way. I added more logs:   int r = store->omap_get_values(coll, pgmeta_oid, keys, &values);   if (r == 0) {     assert(values.size() == 2); -- 0> 2017-12-03 13:39:29.497091 7f467ba0b8c0 -1 osd/PG.cc: I

[ceph-users] I cannot make the OSD work, journal always breaks 100% of the time

2017-12-06 Thread Gonzalo Aguilar Delgado
Hi, Another OSD went down. And it's pretty scary how easy it is to break the cluster. This time it's something related to the journal. /usr/bin/ceph-osd -f --cluster ceph --id 6 --setuser ceph --setgroup ceph starting osd.6 at :/0 osd_data /var/lib/ceph/osd/ceph-6 /var/lib/ceph/osd/ceph-6/journ
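
If only the journal is damaged, a filestore journal can sometimes be flushed and recreated in place. A hedged sketch, assuming osd.6 from the log above, default paths, and the OSD stopped; if the journal itself is corrupt, --flush-journal may fail, and recreating it can lose in-flight writes:

    systemctl stop ceph-osd@6
    ceph-osd -i 6 --flush-journal   # replay pending journal entries into the store
    ceph-osd -i 6 --mkjournal       # write a fresh, empty journal
    systemctl start ceph-osd@6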

Re: [ceph-users] Another OSD broken today. How can I recover it?

2017-12-05 Thread Gonzalo Aguilar Delgado
Hi, I created this: http://paste.debian.net/999172/ But the expiration date is too short, so I did this too: https://pastebin.com/QfrE71Dg. What I want to mention is that there's no known cause for what's happening. It's true that time desync happens on reboot because of a few millis of skew. But ntp cor

Re: [ceph-users] Another OSD broken today. How can I recover it?

2017-12-04 Thread Gonzalo Aguilar Delgado
have something to do with the map values showing in ceph?   pgmap v71223952: 764 pgs, 6 pools, 561 GB data, 141 kobjects     1124 GB used, 1514 GB / 2639 GB avail     20266198323167232/288980 objects degraded (7013010700798.405%) Best regards On 03/12/17 13:31, Gonzalo Agui
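
A degraded ratio in the trillions of percent points to an underflowed counter rather than a real object count. To see which PGs actually carry degraded objects — a sketch:

    ceph health detail            # per-PG detail behind the summary line
    ceph pg dump_stuck degraded   # PGs stuck in the degraded state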

[ceph-users] PG::peek_map_epoch assertion fail

2017-12-03 Thread Gonzalo Aguilar Delgado
Hello, What can make this assertion fail?   int r = store->omap_get_values(coll, pgmeta_oid, keys, &values);   if (r == 0) {     assert(values.size() == 2); -- 0> 2017-12-03 13:39:29.497091 7f467ba0b8c0 -1 osd/PG.cc: In function 'static int PG::peek_map_epoch(ObjectStore*, spg_t, e
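
The assert fires when the pgmeta object's omap does not return exactly the two keys peek_map_epoch asks for. One way to look at what is actually stored — a sketch, assuming osd.6 is stopped, default filestore paths, and a hypothetical pg 0.5:

    # list objects in the PG; the pgmeta object shows up in the output
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-6 \
        --journal-path /var/lib/ceph/osd/ceph-6/journal \
        --pgid 0.5 --op list
    # dump the omap keys of the pgmeta object found above
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-6 \
        --journal-path /var/lib/ceph/osd/ceph-6/journal \
        '<pgmeta-object-json-from-the-list-op>' list-omap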

Re: [ceph-users] Another OSD broken today. How can I recover it?

2017-12-03 Thread Gonzalo Aguilar Delgado
ests this seems > to go fine automatically. Are you doing something that is not advised? > > > > > -----Original Message- > From: Gonzalo Aguilar Delgado [mailto:gagui...@aguilardelgado.com] > Sent: Saturday 25 November 2017 20:44 > To: 'ceph-users' >

[ceph-users] Another OSD broken today. How can I recover it?

2017-11-25 Thread Gonzalo Aguilar Delgado
Hello, I had another blackout with ceph today. It seems that ceph OSDs fall from time to time and they are unable to recover. I have 3 OSDs down now: 1 removed from the cluster and 2 down because I'm unable to recover them. We really need a recovery tool. It's not normal that an OSD breaks and
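
There is a (blunt) recovery tool for exactly this: ceph-objectstore-tool can export whole PGs from a stopped OSD's store and import them into another. A sketch, assuming default filestore paths and a hypothetical pg 0.5:

    # on the broken, stopped OSD: copy the PG out of the store
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-6 \
        --journal-path /var/lib/ceph/osd/ceph-6/journal \
        --op export --pgid 0.5 --file /root/pg.0.5.export
    # on a healthy, stopped OSD: bring it back in
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2 \
        --journal-path /var/lib/ceph/osd/ceph-2/journal \
        --op import --file /root/pg.0.5.export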

Re: [ceph-users] Infinite degraded objects

2017-10-25 Thread Gonzalo Aguilar Delgado
rent Jewel release is 10.2.10 - I don't know if the problem > you're seeing is fixed in there but I'd upgrade to 10.2.10 and then > open a tracker ticket if the problem still persists. > > On Thu, Oct 26, 2017 at 9:13 AM, Gonzalo Aguilar Delgado > wrote: >> Hello, >

Re: [ceph-users] Infinite degraded objects

2017-10-25 Thread Gonzalo Aguilar Delgado
fixed a week or so ago: > http://tracker.ceph.com/issues/21803 > > On Mon, Oct 23, 2017 at 5:10 AM, Gonzalo Aguilar Delgado > wrote: >> Hello, >> >> Since we upgraded the ceph cluster we have been facing a lot of problems. Most of them >> due to osd crashing. Wh

[ceph-users] Infinite degraded objects

2017-10-22 Thread Gonzalo Aguilar Delgado
Hello, Since we upgraded the ceph cluster we have been facing a lot of problems. Most of them due to osd crashing. What can cause this? This morning I woke up with this message: root@red-compute:~# ceph -w     cluster 9028f4da-0d77-462b-be9b-dbdf7fa57771 health HEALTH_ERR     1 pgs are stuck

[ceph-users] Ceph OSD get blocked and start to make inconsistent pg from time to time

2017-09-29 Thread Gonzalo Aguilar Delgado
Hi, I discovered that my cluster starts to make slow requests and all disk activity gets blocked. This happens once a day, and the ceph OSD goes to 100% CPU. In the ceph health I get something like: 2017-09-29 10:49:01.227257 [INF] pgmap v67494428: 764 pgs: 1 active+recovery_wait+degraded+inc
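
When one OSD pins a CPU and requests stall, its admin socket usually shows what it is stuck on. A sketch, assuming the busy daemon is osd.3:

    ceph daemon osd.3 dump_ops_in_flight   # requests currently blocked inside this OSD
    ceph daemon osd.3 dump_historic_ops    # slowest recently completed operations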

Re: [ceph-users] Ceph OSD crash starting up

2017-09-20 Thread Gonzalo Aguilar Delgado
al dangers of some of the commands. Googling for `ceph-users scrub errors inconsistent pgs` is a good place to start. On Tue, Sep 19, 2017 at 11:28 AM Gonzalo Aguilar Delgado <gagui...@aguilardelgado.com> wrote: Hi David, What I want is to add the OSD back with i

Re: [ceph-users] Ceph OSD crash starting up

2017-09-19 Thread Gonzalo Aguilar Delgado
ith its data or add it back in as a fresh osd. What is your `ceph status`? On Tue, Sep 19, 2017, 5:23 AM Gonzalo Aguilar Delgado <gagui...@aguilardelgado.com> wrote: Hi David, Thank you for the great explanation of the weights, I thought that ceph was adjusting them

Re: [ceph-users] Ceph OSD crash starting up

2017-09-19 Thread Gonzalo Aguilar Delgado
er is health_ok without any missing objects, then there is nothing that you need off of OSD1 and ceph recovered from the lost disk successfully. On Thu, Sep 14, 2017 at 4:39 PM Gonzalo Aguilar Delgado <gagui...@aguilardelgado.com> wrote: Hello, I was on an old version of c

Re: [ceph-users] Ceph OSD crash starting up

2017-09-14 Thread Gonzalo Aguilar Delgado
a copy of your crush map and `ceph osd df`? On Wed, Sep 13, 2017 at 6:39 AM Gonzalo Aguilar Delgado <gagui...@aguilardelgado.com> wrote: Hi, I recently updated the crush map to 1 and did all the relocation of the pgs. At the end I found that one of the OSDs is not start
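
To produce both things asked for here — a readable crush map and per-OSD utilization — something like:

    ceph osd crush dump   # full crush map as JSON
    ceph osd df           # weight, reweight and use% per OSD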

[ceph-users] Scrub failing all the time, new inconsistencies keep appearing

2017-09-14 Thread Gonzalo Aguilar Delgado
Hello, I've been using ceph for a long time. A day ago I added the jewel requirement for OSDs and upgraded the crush map. Since then I've had all kinds of errors, maybe because disks are failing because of rebalances, or because of a problem I don't know about. I have some pgs active+clean+inconsistent, from di
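
On Jewel the damage can be enumerated before repairing anything. A sketch, with a hypothetical pool name rbd and pg 0.5:

    rados list-inconsistent-pg rbd                        # PGs flagged inconsistent in that pool
    rados list-inconsistent-obj 0.5 --format=json-pretty  # which objects/shards disagree
    ceph pg repair 0.5                                    # only once the cause is understood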

[ceph-users] Ceph OSD crash starting up

2017-09-13 Thread Gonzalo Aguilar Delgado
Hi, I recently updated the crush map to 1 and did all the relocation of the pgs. At the end I found that one of the OSDs is not starting. This is what it shows: 2017-09-13 10:37:34.287248 7f49cbe12700 -1 *** Caught signal (Aborted) ** in thread 7f49cbe12700 thread_name:filestore_sync ceph version

Re: [ceph-users] Ceph mount rbd

2017-07-14 Thread Gonzalo Aguilar Delgado
Hi, Why would you want to maintain the copies yourself? You replicate on ceph and then again across different files inside ceph? Let ceph take care of the counting. Create a pool with 3 or more copies and let ceph take care of what's stored and where. Best regards, On 13/07/17 at 17:06, li...@marcelof
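
In the spirit of that advice — let the pool keep the copies — a sketch with a hypothetical pool name:

    ceph osd pool create mypool 128 128   # new pool with 128 placement groups
    ceph osd pool set mypool size 3       # keep 3 replicas of each object
    ceph osd pool set mypool min_size 2   # keep serving I/O with one replica down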

Re: [ceph-users] Ceph OSD not going up and joining the cluster. OSD does not go up. ceph version 10.1.2

2016-05-11 Thread Gonzalo Aguilar Delgado
ed to work again. I suppose it was in a stale situation. On Wed, 2016-05-11 at 09:37 +0200, Gonzalo Aguilar Delgado wrote: > Hello again, > > I was looking at the patches sent to the repository and I found a > patch that made the OSD check for cluster health before starting > up

Re: [ceph-users] Ceph OSD not going up and joining the cluster. OSD does not go up. ceph version 10.1.2

2016-05-11 Thread Gonzalo Aguilar Delgado
Hello again, I was looking at the patches sent to the repository and I found a patch that made the OSD check for cluster health before starting up. Can this patch be the source of all my problems? Best regards, On Tue, May 10, 2016 at 6:07 PM, Gonzalo Aguilar Delgado < gaguilar.d

Re: [ceph-users] Ceph OSD not going up and joining the cluster. OSD does not go up. ceph version 10.1.2

2016-05-10 Thread Gonzalo Aguilar Delgado
881 7eff735598c0 0 probe_block_device_fsid /dev/sdf2 is filestore, fd069e6a-9a62-4286-99cb-d8a523bd946a r On Tue, May 10, 2016 at 6:07 PM, Gonzalo Aguilar Delgado < gaguilar.delg...@gmail.com> wrote: > Hello, > > I just upgraded my cluster to version 10.1.2 and it worked well for a > while

Re: [ceph-users] Ceph OSD not going up and joining the cluster. OSD does not go up. ceph version 10.1.2

2016-05-10 Thread Gonzalo Aguilar Delgado
"9028f4da-0d77-462b-be9b-dbdf7fa57771", "osd_fsid": "8dd085d4-0b50-4c80-a0ca-c5bc4ad972f7", "whoami": 3, "state": "preboot", "oldest_map": 1764, "newest_map": 2504, "num_pgs": 150 } 3 is up
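
The JSON above is what the OSD admin socket reports; it can be polled while the daemon sits in preboot. A sketch for osd.3, the whoami shown:

    ceph daemon osd.3 status    # whoami, state, oldest_map/newest_map, num_pgs
    ceph daemon osd.3 version   # confirm which binary is running after the upgrade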

[ceph-users] Ceph OSD not going up and joining the cluster. OSD does not go up. ceph version 10.1.2

2016-05-10 Thread Gonzalo Aguilar Delgado
Hello, I just upgraded my cluster to version 10.1.2 and it worked well for a while, until I saw that the systemctl ceph-disk@dev-sdc1.service had failed and I reran it. From there the OSD stopped working. This is ubuntu 16.04. I connected to the IRC looking for help, where people pointed me to

Re: [ceph-users] Slow/Hung IOs

2015-01-06 Thread Gonzalo Aguilar Delgado
Hi, I just ran this test and found my system is no better. But I use commodity hardware. The only difference is latency; you should look at it. Total time run: 62.412381 Total writes made: 919 Write size: 4194304 Bandwidth (MB/sec): 58.899 Stddev Bandwidth:
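
Those figures match the output of rados bench, so the comparison is easy to reproduce. A sketch with a hypothetical test pool:

    rados bench -p test 60 write --no-cleanup   # 60 s of 4 MB writes; keep objects for reads
    rados bench -p test 60 seq                  # sequential read pass over the same objects
    rados -p test cleanup                       # remove the benchmark objects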

[ceph-users] ceph-mon is taking too much memory. It's a bug?

2014-04-30 Thread Gonzalo Aguilar Delgado
Hi, I found my system with memory almost full. In top I see: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2317 root 20 0 824860 647856 3532 S 0,7 5,3 29:46.51 ceph-mon I think it's too much. But what do you think? Best regards,
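
With tcmalloc builds the monitor can report, and release, its heap, which helps tell a real leak from pages the allocator has simply not returned. A sketch, using the blue-compute mon from this cluster's config:

    ceph tell mon.blue-compute heap stats     # allocator's view of that ~630 MB RSS
    ceph tell mon.blue-compute heap release   # hand freed pages back to the OS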

[ceph-users] Configuration file

2014-04-22 Thread Gonzalo Aguilar Delgado
Hi, I don't add anything to the configuration file and the cluster seems to discover each host's config. My current ceph.conf is fairly simple: - [global] fsid = 9028f4da-0d77-462b-be9b-dbdf7fa57771 mon_initial_members = blue-compute, red-compute mon_host = 172.16.0.119, 172.16.0.100 auth_cl
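
The preview cuts off at the auth settings; a complete minimal ceph.conf of that era usually looks like the following (a sketch — the auth lines are the common defaults, not necessarily what this cluster used):

    [global]
    fsid = 9028f4da-0d77-462b-be9b-dbdf7fa57771
    mon_initial_members = blue-compute, red-compute
    mon_host = 172.16.0.119, 172.16.0.100
    auth_cluster_required = cephx
    auth_service_required = cephx
    auth_client_required = cephx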

Re: [ceph-users] Ceph not replicating

2014-04-22 Thread Gonzalo Aguilar Delgado
o re-deploy now than when you have data serving in production. Thanks, Michael J. Kidd Sr. Storage Consultant Inktank Professional Services On Sat, Apr 19, 2014 at 5:51 PM, Gonzalo Aguilar Delgado wrote: Hi Michael, It worked. I didn't realize this because the docs install two

[ceph-users] rbd - huge disk - slow ceph

2014-04-22 Thread Gonzalo Aguilar Delgado
Hi, I made my first really big mistake... I created an rbd disk of about 300 TB, yes 300 TB. rbd info test-disk -p high_value rbd image 'test-disk': size 300 TB in 78643200 objects order 22 (4096 kB objects) block_name_prefix: rb.0.18d7.2ae8944a format: 1 but even more. I
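
Undoing this is either a shrink or a delete, and on a format 1 image both walk the image object-by-object, so they take a long time. A sketch; --allow-shrink is required on newer rbd clients, an assumption here:

    rbd resize test-disk -p high_value --size 102400 --allow-shrink   # shrink to ~100 GB
    rbd rm test-disk -p high_value                                    # or drop the image entirely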

Re: [ceph-users] Ceph not replicating

2014-04-22 Thread Gonzalo Aguilar Delgado
e advice. I will surely do it. Thank you a lot, Michael. Thanks, Michael J. Kidd Sr. Storage Consultant Inktank Professional Services On Sat, Apr 19, 2014 at 5:51 PM, Gonzalo Aguilar Delgado wrote: Hi Michael, It worked. I didn't realize this because the docs install two osd node

[ceph-users] Journal partition on prepare

2014-04-20 Thread Gonzalo Aguilar Delgado
Hi, when I create an osd with ceph-deploy with: ceph-deploy osd prepare --fs-type btrfs red-compute:sdb I see the system creates two partitions on the disk, one for data and one for journal. This is right, since I want to use an SSD disk for journals, but I want to follow the bcache path. So one SSD w
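
ceph-deploy also takes an explicit journal device in the HOST:DISK:JOURNAL form, which is the usual way to place journals on an SSD. A sketch with a hypothetical SSD partition:

    # data on sdb, journal on a pre-partitioned SSD (hypothetical /dev/sdg1)
    ceph-deploy osd prepare --fs-type btrfs red-compute:sdb:/dev/sdg1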

Re: [ceph-users] Ceph not replicating

2014-04-19 Thread Gonzalo Aguilar Delgado
g efficiently and most options available, functionally speaking, deploying a cluster with 3 nodes, 3 OSDs each is my best practice. Or make 1 node with 3 OSDs, modifying your crushmap to "choose type osd" in your rulesets. JC On Saturday, April 19, 2014, Gonzalo Aguilar Delgado w
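
The "choose type osd" change JC describes is made by round-tripping the crush map through crushtool. A sketch:

    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # in crush.txt, inside the replicated ruleset, change
    #   step chooseleaf firstn 0 type host
    # to
    #   step chooseleaf firstn 0 type osd
    crushtool -c crush.txt -o crush.new
    ceph osd setcrushmap -i crush.new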

Re: [ceph-users] Ceph not replicating

2014-04-19 Thread Gonzalo Aguilar Delgado
r make 1 node with 3 OSDs, modifying your crushmap to "choose type osd" in your rulesets. JC On Saturday, April 19, 2014, Gonzalo Aguilar Delgado wrote: Hi, I'm building a cluster where two nodes replicate objects inside. I found that shutting down just one of the nodes (th

[ceph-users] Ceph not replicating

2014-04-19 Thread Gonzalo Aguilar Delgado
Hi, I'm building a cluster where two nodes replicate objects inside. I found that shutting down just one of the nodes (the second one) makes everything "incomplete". I cannot find out why, since the crushmap looks good to me. After shutting down one node: cluster 9028f4da-0d77-462b-be9b-dbdf7f