[ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does not sync olds data

2021-04-25 Thread Szabo, Istvan (Agoda)
Hi,

No, it doesn't work. I gave up; we will now write our own sync app for Ceph.

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---

From: 特木勒 
Sent: Friday, April 23, 2021 7:50 PM
To: Szabo, Istvan (Agoda) 
Cc: ceph-users@ceph.io
Subject: Re: [Suspicious newsletter] [ceph-users] RGW: Multiple Site does not 
sync olds data

Hi Istvan:

We just upgraded the whole cluster to 15.2.10, and multisite still cannot
sync all objects to the secondary cluster.

Do you have any suggestions? I also opened another issue on the Ceph tracker:
https://tracker.ceph.com/issues/50474

I hope someone can take a look at this issue.

Thanks

特木勒 <twl...@gmail.com> wrote on Mon, Mar 22, 2021 at 9:08 PM:
Thank you~

I will try to upgrade the cluster too. It seems like this is the only way for now.

I will let you know once I complete testing. :)

Have a good day

Szabo, Istvan (Agoda) <istvan.sz...@agoda.com> wrote on Mon, Mar 22, 2021 at 3:38 PM:
Yeah, it doesn't work. Last week they fixed the ticket for the problem that
caused the crashes, and the crashes were what stopped the replication. I'll
try again this week after the update. If the daemon doesn't crash, maybe it
will work, because whenever no crash happened, the data was synced. Fingers
crossed ;) Don't give up.

From: 特木勒 <twl...@gmail.com>
Sent: Monday, March 22, 2021 1:38 PM
To: Szabo, Istvan (Agoda) <istvan.sz...@agoda.com>
Cc: ceph-users@ceph.io

Subject: Re: [Suspicious newsletter] [ceph-users] RGW: Multiple Site does not 
sync olds data

Hi Istvan:

Do you have any update on directional sync?

I am trying to upgrade the cluster to 15.2.10 to see if the problem is solved. :(

Thanks

Szabo, Istvan (Agoda) <istvan.sz...@agoda.com> wrote on Mon, Mar 1, 2021 at 10:01 AM:

So-so. I had an interruption, so it failed on one site, but the other is kind
of working. This is the first time I have seen data caught up in the
radosgw-admin data sync status output on one side.

Today I will finish the other, problematic site. I'll let you know whether it
works or not.



Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---



From: 特木勒 <twl...@gmail.com>
Sent: Sunday, February 28, 2021 1:34 PM
To: Szabo, Istvan (Agoda) <istvan.sz...@agoda.com>
Cc: ceph-users@ceph.io
Subject: Re: [Suspicious newsletter] [ceph-users] RGW: Multiple Site does not 
sync olds data



Email received from outside the company. If in doubt don't click links nor open 
attachments!



Hi Istvan:



Thanks for your reply.



Does directional sync solve the problem? I tried to run
`radosgw-admin sync init`, but it still did not work. :(



Thanks



Szabo, Istvan (Agoda) <istvan.sz...@agoda.com> wrote on Fri, Feb 26, 2021 at 7:47 AM:

Same for me, also on 15.2.8.
I'm trying directional sync now; it looks like symmetrical sync has an issue.

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---

On Feb 26, 2021, at 1:03, 特木勒 <twl...@gmail.com> wrote:



Hi all:

ceph version: 15.2.7 (88e41c6c49beb18add4fdb6b4326ca466d931db8)

I have a strange question. I just created a multisite setup for a Ceph
cluster, but I noticed that the old data of the source cluster is not synced.
Only new data is synced into the second zone's cluster.

Is there anything I need to do to enable full sync for a bucket, or is this a
bug?

Thanks
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


This message is confidential and is for the sole use of the intended 
recipient(s). It may also be privileged or otherwise protected by copyright or 
other legal rules. If you have received it by mistake please let us know by 
reply email and delete it from your system. It is prohibited to copy this 
message or disclose its content to anyone. Any confidentiality or privilege is 
not waived or lost by any mistaken delivery or unauthorized disclosure of the 
message. All messages sent to and from Agoda may be monitored to ensure 
compliance with company policies, to protect the company's interests and to 
remove potential malware. 

[ceph-users] PG can't deep and simple scrub after unfound data delete

2021-04-25 Thread Szabo, Istvan (Agoda)
Hi,

I have a PG on which the following command was run:
ceph pg 44.1aa mark_unfound_lost delete

After that, the cluster no longer reported the unknown PGs, which was the
goal of running it.

However, this PG is now inconsistent and can't be deep-scrubbed.

ceph health detail
HEALTH_ERR 214275 scrub errors; Possible data damage: 1 pg inconsistent; 1 pgs not deep-scrubbed in time
[ERR] OSD_SCRUB_ERRORS: 214275 scrub errors
[ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
    pg 44.1aa is active+clean+inconsistent, acting [59,128,127,43]
[WRN] PG_NOT_DEEP_SCRUBBED: 1 pgs not deep-scrubbed in time
    pg 44.1aa not deep-scrubbed since 2021-01-14T05:50:23.852626+0100

ceph pg dump pgs_brief|grep 'ACTING_PRIMARY\|44.1aa'
dumped pgs_brief
PG_STAT  STATE                      UP               UP_PRIMARY  ACTING           ACTING_PRIMARY
44.1aa   active+clean+inconsistent  [59,128,127,43]  59          [59,128,127,43]  59

Any idea what to do with it?
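
When more than one PG is affected, it may help to pull the inconsistent PG ids out of the health output programmatically. The following is only a minimal sketch: it parses the `ceph health detail` text quoted above, and the function name `inconsistent_pgs` and the regular expression are illustrative assumptions, not part of any Ceph tooling.

```python
import re

# Sample text copied from the `ceph health detail` output quoted in this message.
HEALTH_DETAIL = """\
HEALTH_ERR 214275 scrub errors; Possible data damage: 1 pg inconsistent; 1 pgs not deep-scrubbed in time
[ERR] OSD_SCRUB_ERRORS: 214275 scrub errors
[ERR] PG_DAMAGED: Possible data damage: 1 pg inconsistent
    pg 44.1aa is active+clean+inconsistent, acting [59,128,127,43]
[WRN] PG_NOT_DEEP_SCRUBBED: 1 pgs not deep-scrubbed in time
    pg 44.1aa not deep-scrubbed since 2021-01-14T05:50:23.852626+0100
"""

# Matches lines such as:
#   pg 44.1aa is active+clean+inconsistent, acting [59,128,127,43]
# The pattern is an assumption based only on the sample above.
PG_LINE = re.compile(
    r"^\s*pg\s+(?P<pgid>[0-9]+\.[0-9a-f]+)\s+is\s+(?P<state>\S+),"
    r"\s+acting\s+\[(?P<acting>[0-9,]*)\]"
)


def inconsistent_pgs(text):
    """Return {pgid: [acting OSD ids]} for PGs whose state includes 'inconsistent'."""
    found = {}
    for line in text.splitlines():
        match = PG_LINE.match(line)
        if match and "inconsistent" in match.group("state").split("+"):
            osds = [int(x) for x in match.group("acting").split(",") if x]
            found[match.group("pgid")] = osds
    return found


if __name__ == "__main__":
    for pgid, acting in inconsistent_pgs(HEALTH_DETAIL).items():
        print(pgid, acting)
```

The extracted PG id could then be passed to inspection tooling such as `rados list-inconsistent-obj`; that step is a suggestion and was not tried in this thread.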





[ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does not sync olds data

2021-04-25 Thread 特木勒
I noticed another problem: for a new bucket, the first object in the bucket
is not synced. Sync starts with the second object. I tried to fix the index
on the bucket and manually rerun bucket sync, but the first object still does
not sync to the secondary cluster.

Do you have any ideas for this issue?

Thanks


[ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does not sync olds data

2021-04-25 Thread 特木勒
Hi Istvan:

Thanks to Amit for his suggestion.

I followed his suggestion to fix the bucket index and redo the sync on the
buckets, but it still did not work for me.

Then I tried the bucket rewrite command to rewrite all the objects in the
buckets, and that worked for me. I think the reason is that something was
wrong with the bucket index and the rewrite rebuilt it.

Here's the command I used:
`sudo radosgw-admin bucket rewrite -b BUCKET-NAME --min-rewrite-size 0`

Maybe you can try this to fix the sync issues.
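
For anyone with many buckets, the same command can be generated per bucket and reviewed before it is run. This is only a sketch built around the command quoted above: the helper names `rewrite_command` and `render` and the bucket names are hypothetical, and nothing here executes `radosgw-admin`.

```python
import shlex


def rewrite_command(bucket, min_rewrite_size=0, use_sudo=True):
    """Build the argv for the bucket rewrite invocation quoted in this message."""
    if not bucket:
        raise ValueError("bucket name must not be empty")
    argv = [
        "radosgw-admin", "bucket", "rewrite",
        "-b", bucket,
        "--min-rewrite-size", str(min_rewrite_size),
    ]
    return ["sudo"] + argv if use_sudo else argv


def render(buckets):
    """Return one shell-quoted command line per bucket, for review before running."""
    return [shlex.join(rewrite_command(b)) for b in buckets]


if __name__ == "__main__":
    # "photos" and "logs-2021" are hypothetical bucket names.
    for line in render(["photos", "logs-2021"]):
        print(line)
```

Printing the commands first, rather than running them directly, leaves room to check the bucket list, since a rewrite touches every object in the bucket.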

@Amit Ghadge, thanks for your suggestions. Without them, I would not have
noticed that something was wrong with the index.

Thanks :)


[ceph-users] Re: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping

2021-04-25 Thread Ilya Dryomov
On Sun, Apr 25, 2021 at 12:37 AM Markus Kienast  wrote:
>
> I am seeing these messages when booting from RBD and booting hangs there.
>
> libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated
> 131072, skipping
>
> However, Ceph Health is OK, so I have no idea what is going on. I
> reboot my 3 node cluster and it works again for about two weeks.
>
> How can I find out more about this issue, how can I dig deeper? Also
> there has been at least one report about this issue before on this
> mailing list - "[ceph-users] Strange Data Issue - Unexpected client
> hang on OSD I/O Error" - but no solution has been presented.
>
> This report was from 2018, so no idea if this is still an issue for
> Dyweni the original reporter. If you read this, I would be happy to
> hear how you solved the problem.

Hi Markus,

What versions of ceph and the kernel are in use?

Are you also seeing I/O errors and "missing primary copy of ..., will
try copies on ..." messages in the OSD logs (in this case osd2)?
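
To correlate these kernel messages with the OSD logs, the fields of the libceph line can be extracted from dmesg or journal output. The following is a minimal sketch that assumes the message always has the shape quoted in this thread; the function name `parse_skipped_reply` and the field names are illustrative.

```python
import re

# The message as quoted in this thread, joined back onto one line.
LINE = "libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping"

# Pattern derived only from the single sample above (an assumption).
PATTERN = re.compile(
    r"libceph: get_reply osd(?P<osd>\d+) tid (?P<tid>\d+) "
    r"data (?P<data_len>\d+) > preallocated (?P<prealloc>\d+), skipping"
)


def parse_skipped_reply(line):
    """Return the numeric fields of a 'get_reply ... skipping' line, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    fields = {name: int(value) for name, value in match.groupdict().items()}
    # How many bytes the reply exceeded the preallocated buffer by.
    fields["excess"] = fields["data_len"] - fields["prealloc"]
    return fields


if __name__ == "__main__":
    print(parse_skipped_reply(LINE))
```

The `osd` field identifies which OSD log to check (osd2 in this case), and `tid` can help match the request across logs.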

Thanks,

Ilya