I have now also discovered that someone mistakenly put production
data on a virtual machine in this cluster. I need Ceph to resume I/O so that I
can boot that virtual machine.
Can I mark the incomplete PGs as valid?
If necessary, where can I buy paid support?
Thanks again,
Mario

On Wed, 29 Jun 2016 at 08:02, Mario Giammarco <mgiamma...@gmail.com> wrote:

> pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 512 pgp_num 512 last_change 9313 flags hashpspool
> stripe_width 0
>        removed_snaps [1~3]
> pool 1 'rbd2' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 512 pgp_num 512 last_change 9314 flags hashpspool
> stripe_width 0
>        removed_snaps [1~3]
> pool 2 'rbd3' replicated size 2 min_size 1 crush_ruleset 0 object_hash
> rjenkins pg_num 512 pgp_num 512 last_change 10537 flags hashpspool
> stripe_width 0
>        removed_snaps [1~3]
>
>
> ID WEIGHT  REWEIGHT SIZE   USE   AVAIL %USE  VAR
> 5 1.81000  1.00000  1857G  984G  872G 53.00 0.86
> 6 1.81000  1.00000  1857G 1202G  655G 64.73 1.05
> 2 1.81000  1.00000  1857G 1158G  698G 62.38 1.01
> 3 1.35999  1.00000  1391G  906G  485G 65.12 1.06
> 4 0.89999  1.00000   926G  702G  223G 75.88 1.23
> 7 1.81000  1.00000  1857G 1063G  793G 57.27 0.93
> 8 1.81000  1.00000  1857G 1011G  846G 54.44 0.88
> 9 0.89999  1.00000   926G  573G  352G 61.91 1.01
> 0 1.81000  1.00000  1857G 1227G  629G 66.10 1.07
> 13 0.45000  1.00000   460G  307G  153G 66.74 1.08
>              TOTAL 14846G 9136G 5710G 61.54
> MIN/MAX VAR: 0.86/1.23  STDDEV: 6.47
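The VAR column in the table above can be sanity-checked with a short Python sketch. It assumes that VAR is each OSD's utilization divided by the cluster-wide utilization (total USE over total SIZE), which matches the reported figures to within rounding. The tuple layout and the function name are illustrative and not part of any Ceph tooling.

```python
# (osd_id, size_gb, used_gb) copied from the `ceph osd df` output above.
OSDS = [
    (5, 1857, 984),
    (6, 1857, 1202),
    (2, 1857, 1158),
    (3, 1391, 906),
    (4, 926, 702),
    (7, 1857, 1063),
    (8, 1857, 1011),
    (9, 926, 573),
    (0, 1857, 1227),
    (13, 460, 307),
]


def utilization_variance(osds):
    """Return (cluster %USE, {osd_id: VAR}) under the assumption stated above."""
    total_size = sum(size for _, size, _ in osds)
    total_used = sum(used for _, _, used in osds)
    cluster_pct = 100.0 * total_used / total_size
    var = {
        osd_id: (100.0 * used / size) / cluster_pct
        for osd_id, size, used in osds
    }
    return cluster_pct, var


if __name__ == "__main__":
    cluster_pct, var = utilization_variance(OSDS)
    print(f"cluster %USE: {cluster_pct:.2f}")
    # Sort from fullest to emptiest relative to the cluster average.
    for osd_id, value in sorted(var.items(), key=lambda kv: kv[1], reverse=True):
        print(f"osd.{osd_id}: VAR {value:.2f}")
```

Because the input sizes are already rounded to whole gigabytes, the recomputed values can differ from the reported ones in the last digit.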
>
>
>
> ceph version 0.94.7 (d56bdf93ced6b80b07397d57e3fa68fe68304432)
>
> http://pastebin.com/SvGfcSHb
> http://pastebin.com/gYFatsNS
> http://pastebin.com/VZD7j2vN
>
> I do not understand why I/O on the ENTIRE cluster is blocked when only a few PGs
> are incomplete.
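One way to reason about this question is the sketch below, under stated assumptions. An RBD image is split into objects (4 MiB by default) that are hashed across the PGs of its pool, so a large image touches most PGs. The sketch assumes uniform, independent placement and treats the 19 incomplete PGs out of 1024 from the status output as belonging to a single pool, which is a simplification; the function names are hypothetical.

```python
def objects_in_image(size_gib, object_mib=4):
    """Number of RADOS objects in a fully allocated image (4 MiB default object size)."""
    return size_gib * 1024 // object_mib


def p_image_unaffected(n_objects, blocked_pgs=19, total_pgs=1024):
    """Probability that none of n_objects lands in a blocked PG.

    Assumes each object maps to a PG uniformly and independently.
    """
    return (1.0 - blocked_pgs / total_pgs) ** n_objects


if __name__ == "__main__":
    for size_gib in (1, 8, 32):
        n = objects_in_image(size_gib)
        print(f"{size_gib} GiB image, {n} objects: "
              f"P(no blocked PG) = {p_image_unaffected(n):.3g}")
```

Under these assumptions even a small, fully written image almost certainly has at least one object in an incomplete PG, and any request to such an object blocks until the PG becomes active again.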
>
> Many thanks,
> Mario
>
>
> On Tue, 28 Jun 2016 at 19:34, Stefan Priebe - Profihost AG <s.pri...@profihost.ag> wrote:
>
>> And ceph health detail
>>
>> Stefan
>>
>> Excuse my typo sent from my mobile phone.
>>
>> On 28.06.2016 at 19:28, Oliver Dzombic <i...@ip-interactive.de> wrote:
>>
>> Hi Mario,
>>
>> please give some more details:
>>
>> Please post the output of:
>>
>> ceph osd pool ls detail
>> ceph osd df
>> ceph --version
>>
>> ceph -w for 10 seconds ( use http://pastebin.com/ please )
>>
>> ceph osd crush dump ( also pastebin pls )
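The requested outputs can be gathered in one pass with a small Python sketch. It runs the commands listed above, plus `ceph health detail` from Stefan's follow-up, and stops `ceph -w` after 10 seconds. The `run` and `collect` helpers are illustrative names, and the 60-second limit for the other commands is an arbitrary assumption.

```python
import subprocess

# One-shot commands requested in the thread.
COMMANDS = [
    ["ceph", "osd", "pool", "ls", "detail"],
    ["ceph", "osd", "df"],
    ["ceph", "--version"],
    ["ceph", "health", "detail"],
    ["ceph", "osd", "crush", "dump"],
]
# `ceph -w` streams until interrupted, so it gets its own short timeout.
WATCH = ["ceph", "-w"]


def run(cmd, timeout=None):
    """Run cmd and return its stdout, keeping partial output if it times out."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
        return proc.stdout
    except subprocess.TimeoutExpired as exc:
        out = exc.stdout or ""
        # Partial output may arrive as bytes even when text=True was requested.
        return out.decode(errors="replace") if isinstance(out, bytes) else out


def collect(runner=run):
    """Return a dict mapping each command line to its captured output."""
    report = {}
    for cmd in COMMANDS:
        report[" ".join(cmd)] = runner(cmd, timeout=60)
    report[" ".join(WATCH)] = runner(WATCH, timeout=10)
    return report


if __name__ == "__main__":
    for name, output in collect().items():
        print(f"===== {name} =====")
        print(output)
```

The `runner` parameter exists only so that the collection logic can be exercised without a live cluster.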
>>
>> --
>> Mit freundlichen Gruessen / Best regards
>>
>> Oliver Dzombic
>> IP-Interactive
>>
>> mailto:i...@ip-interactive.de <i...@ip-interactive.de>
>>
>> Address:
>>
>> IP Interactive UG ( haftungsbeschraenkt )
>> Zum Sonnenberg 1-3
>> 63571 Gelnhausen
>>
>> HRB 93402, Amtsgericht Hanau (district court)
>> Managing Director: Oliver Dzombic
>>
>> Tax No.: 35 236 3622 1
>> VAT ID: DE274086107
>>
>>
>> On 28.06.2016 at 18:59, Mario Giammarco wrote:
>>
>> Hello,
>>
>> this is the second time this has happened to me; I hope someone can
>> explain what I can do.
>>
>> Proxmox Ceph cluster with 8 servers and 11 HDDs. min_size=1, size=2.
>>
>> One HDD went down due to bad sectors.
>> Ceph recovered, but it ended up in this state:
>>
>>
>> cluster f2a8dd7d-949a-4a29-acab-11d4900249f4
>>     health HEALTH_WARN
>>            3 pgs down
>>            19 pgs incomplete
>>            19 pgs stuck inactive
>>            19 pgs stuck unclean
>>            7 requests are blocked > 32 sec
>>     monmap e11: 7 mons at {0=192.168.0.204:6789/0,1=192.168.0.201:6789/0,2=192.168.0.203:6789/0,3=192.168.0.205:6789/0,4=192.168.0.202:6789/0,5=192.168.0.206:6789/0,6=192.168.0.207:6789/0}
>>            election epoch 722, quorum 0,1,2,3,4,5,6 1,4,2,0,3,5,6
>>     osdmap e10182: 10 osds: 10 up, 10 in
>>      pgmap v3295880: 1024 pgs, 2 pools, 4563 GB data, 1143 kobjects
>>            9136 GB used, 5710 GB / 14846 GB avail
>>                1005 active+clean
>>                  16 incomplete
>>                   3 down+incomplete
>>
>>
>> Unfortunately, "7 requests are blocked" means that no virtual machine can boot,
>> because Ceph has stopped I/O.
>>
>> I can accept losing some data, but not ALL of it!
>>
>> Can you help me, please?
>>
>> Thanks,
>> Mario
>>
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>