Re: [Openstack] [Swift] Delete handling with md5 collisions

2015-08-27 Thread Shrinand Javadekar
Thanks Sam, Anthony! It's a bit scary (even though the probability is
low) to know that there could be data loss in Swift.

On Wed, Aug 26, 2015 at 3:30 PM, Chow, Anthony T (Anthony)** CTR **
 wrote:
> Shri,
>
> Will these 2 discussions help resolve your doubt?
>
> https://answers.launchpad.net/swift/+question/156307
>
> http://stackoverflow.com/questions/28379809/how-are-hash-collisions-handled
>
>
> Anthony.
>
> -----Original Message-----
> From: Shrinand Javadekar [mailto:shrin...@maginatics.com]
> Sent: Wednesday, August 26, 2015 1:37 PM
> To: openstack@lists.openstack.org
> Subject: [Openstack] [Swift] Delete handling with md5 collisions
>
> Hi,
>
> I have a question about how object deletes are handled with md5 collisions. I 
> looked at the code and here's my understanding of how things will work.
>
> If I have two objects that have the same md5 hash, they will go to the same 
> hash directory. Say, they go to /srv/node/r1/object/1024/eef/deadbeef/t1.data 
> and /srv/node/r1/object/1024/eef/deadbeef/t2.data.
>
> Now, if I delete object t1, Swift will create a new file called t3.ts and
> put it in the hash directory.
> /srv/node/r1/object/1024/eef/deadbeef/t3.ts.
>
> When the replicator runs, it will delete all files with timestamp less than 
> t3. So will it delete both t1 and t2?
>
> Thanks in advance.
> -Shri
>
> ___
> Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
> Post to : openstack@lists.openstack.org
> Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack



Re: [Openstack] [Swift] Delete handling with md5 collisions

2015-08-26 Thread Chow, Anthony T (Anthony)** CTR **
Shri,

Will these 2 discussions help resolve your doubt? 

https://answers.launchpad.net/swift/+question/156307

http://stackoverflow.com/questions/28379809/how-are-hash-collisions-handled


Anthony.

-----Original Message-----
From: Shrinand Javadekar [mailto:shrin...@maginatics.com] 
Sent: Wednesday, August 26, 2015 1:37 PM
To: openstack@lists.openstack.org
Subject: [Openstack] [Swift] Delete handling with md5 collisions

Hi,

I have a question about how object deletes are handled with md5 collisions. I 
looked at the code and here's my understanding of how things will work.

If I have two objects that have the same md5 hash, they will go to the same 
hash directory. Say, they go to /srv/node/r1/object/1024/eef/deadbeef/t1.data 
and /srv/node/r1/object/1024/eef/deadbeef/t2.data.

Now, if I delete object t1, Swift will create a new file called t3.ts and put
it in the hash directory.
/srv/node/r1/object/1024/eef/deadbeef/t3.ts.

When the replicator runs, it will delete all files with timestamp less than t3. 
So will it delete both t1 and t2?

Thanks in advance.
-Shri



Re: [Openstack] [Swift] Delete handling with md5 collisions

2015-08-26 Thread Shrinand Javadekar
Actually, I'm confused now. I used to think that Swift does HTTP
deletes by synchronously truncating the object file and renaming it
with a .ts extension. But the current code simply creates a new file
with the request timestamp and .ts extension.

On Wed, Aug 26, 2015 at 1:37 PM, Shrinand Javadekar
 wrote:
> Hi,
>
> I have a question about how object deletes are handled with md5
> collisions. I looked at the code and here's my understanding of how
> things will work.
>
> If I have two objects that have the same md5 hash, they will go to the
> same hash directory. Say, they go to
> /srv/node/r1/object/1024/eef/deadbeef/t1.data and
> /srv/node/r1/object/1024/eef/deadbeef/t2.data.
>
> Now, if I delete object t1, Swift will create a new file called t3.ts
> and put it in the hash directory.
> /srv/node/r1/object/1024/eef/deadbeef/t3.ts.
>
> When the replicator runs, it will delete all files with timestamp less
> than t3. So will it delete both t1 and t2?
>
> Thanks in advance.
> -Shri



Re: [Openstack] [Swift] Delete handling with md5 collisions

2015-08-26 Thread Samuel Merritt

On 8/26/15 1:37 PM, Shrinand Javadekar wrote:

> Hi,
>
> I have a question about how object deletes are handled with md5
> collisions. I looked at the code and here's my understanding of how
> things will work.
>
> If I have two objects that have the same md5 hash, they will go to the
> same hash directory. Say, they go to
> /srv/node/r1/object/1024/eef/deadbeef/t1.data and
> /srv/node/r1/object/1024/eef/deadbeef/t2.data.


That's two objects whose *names* have the same MD5 hash. The objects' 
contents are irrelevant when determining placement.
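
As a rough illustration of that placement rule, here is a minimal Python sketch. It is not Swift's actual code: the salt values, the function names, and the default partition power are assumptions made up for the example, and the directory layout simply mirrors the example paths quoted in this thread. The point is only that the path is derived from the object's name, so two names with the same MD5 land in the same hash directory.

```python
import hashlib

# Hypothetical stand-ins for a cluster-wide salt; a real cluster would
# configure its own secret values.
HASH_PREFIX = b""
HASH_SUFFIX = b"changeme"


def name_hash(account, container, obj):
    """MD5 of the salted object *name*; the object's contents never enter it."""
    name = "/%s/%s/%s" % (account, container, obj)
    salted = HASH_PREFIX + name.encode("utf-8") + HASH_SUFFIX
    return hashlib.md5(salted).hexdigest()


def hash_dir(device, account, container, obj, part_power=10):
    """Build an on-disk directory shaped like the paths quoted in this thread."""
    digest = name_hash(account, container, obj)
    # Use the top `part_power` bits of the hash as the partition number.
    partition = int(digest[:8], 16) >> (32 - part_power)
    # Layout: <partition>/<last three hex chars of the hash>/<full hash>
    return "/srv/node/%s/object/%d/%s/%s" % (
        device, partition, digest[-3:], digest)


print(hash_dir("r1", "AUTH_test", "photos", "cat.jpg"))
```

Uploading different bytes under the same name yields the same directory, which is why only the timestamped file names inside it tell versions apart.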



> Now, if I delete object t1, Swift will create a new file called t3.ts
> and put it in the hash directory.
> /srv/node/r1/object/1024/eef/deadbeef/t3.ts.
>
> When the replicator runs, it will delete all files with timestamp less
> than t3. So will it delete both t1 and t2?


Correct. Two objects whose names have the same MD5 hash are considered 
equivalent by Swift.
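
To make the consequence concrete, here is a simplified Python model of the "newest file wins" behavior described above. It is a sketch under assumptions, not Swift's implementation: the function name `cleanup_hash_dir` and the sample timestamps are invented for illustration, and details such as metadata files or how long tombstones are retained are ignored.

```python
def cleanup_hash_dir(filenames):
    """Return (kept, removed) for the files in one hash directory.

    Simplified rule: only the file with the newest timestamp survives,
    whether it is a .data file or a .ts tombstone.
    """
    def timestamp(name):
        # "1440621430.00000.ts" -> 1440621430.0
        return float(name.rsplit(".", 1)[0])

    if not filenames:
        return [], []
    newest = max(filenames, key=timestamp)
    removed = sorted(f for f in filenames if f != newest)
    return [newest], removed


# t1.data and t2.data belong to two colliding names; t3.ts is the tombstone
# written by the DELETE of the first object.
files = [
    "1440621420.00000.data",  # t1
    "1440621425.00000.data",  # t2
    "1440621430.00000.ts",    # t3
]
kept, removed = cleanup_hash_dir(files)
print("kept:", kept)
print("removed:", removed)
```

In this model both .data files are removed, because nothing in the directory records which name each file was written under.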


If I remember correctly, since MD5 has a 128-bit output, the birthday
bound means you have roughly a 50% probability of having a collision
once your cluster reaches on the order of 2^64 objects.
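
For anyone who wants to check the numbers, the standard birthday approximation p ≈ 1 - exp(-n(n-1)/(2N)), with N = 2^128 possible hashes, can be evaluated in a few lines of Python. This is only a back-of-the-envelope sketch that assumes MD5 outputs are uniformly distributed; the function name is made up for the example.

```python
import math

HASH_SPACE = 2 ** 128  # number of possible MD5 outputs


def collision_probability(n, space=HASH_SPACE):
    """Birthday approximation: chance of any collision among n uniform hashes."""
    exponent = n * (n - 1) / (2 * space)
    # -expm1(-x) computes 1 - exp(-x) without losing precision for small x.
    return -math.expm1(-exponent)


# Probability of at least one collision at exactly 2^64 objects.
p_at_2_64 = collision_probability(2 ** 64)

# Object count at which the approximation reaches 50%.
n_half = math.sqrt(2 * HASH_SPACE * math.log(2))

print("p(collision) at 2^64 objects: %.4f" % p_at_2_64)
print("objects for 50%%: %.4f * 2^64" % (n_half / 2 ** 64))
```

Under this approximation the chance is a bit under 40% at 2^64 objects and reaches 50% at about 1.18 * 2^64, so the order of magnitude quoted above holds.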

