On 8/26/15 1:37 PM, Shrinand Javadekar wrote:
Hi,

I have a question about how object deletes are handled with md5
collisions. I looked at the code and here's my understanding of how
things will work.

If I have two objects that have the same md5 hash, they will go to the
same hash directory. Say, they go to
/srv/node/r1/object/1024/eef/deadbeef/t1.data and
/srv/node/r1/object/1024/eef/deadbeef/t2.data.

That's two objects whose *names* have the same MD5 hash. The objects' contents are irrelevant when determining placement.
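
As a rough sketch of that placement (not Swift's actual code): the hash is taken over the object's path, and the hash directory plus its three-character parent directory follow from the hex digest. The helper names, the salt values, and the account/container/object names below are made up for illustration; a real cluster mixes in its own configured salt and takes the partition from the ring, so here the partition is simply passed in.

```python
import hashlib

# Hypothetical salts for illustration; a real cluster uses its own configured values.
HASH_PATH_PREFIX = b""
HASH_PATH_SUFFIX = b"changeme"


def name_hash(account, container, obj):
    """MD5 hex digest of the object's *name* (its path), never its contents."""
    path = "/" + "/".join((account, container, obj))
    salted = HASH_PATH_PREFIX + path.encode("utf-8") + HASH_PATH_SUFFIX
    return hashlib.md5(salted).hexdigest()


def hash_dir(device_root, partition, digest):
    """Build <root>/object/<partition>/<last 3 hex chars>/<digest>,
    mirroring the layout of the example paths quoted above."""
    return "/".join((device_root, "object", str(partition), digest[-3:], digest))


# Two uploads of different data under the same name land in the same directory.
digest = name_hash("AUTH_test", "photos", "cat.jpg")
print(hash_dir("/srv/node/r1", 1024, digest))

# The shortened digest from the example paths:
print(hash_dir("/srv/node/r1", 1024, "deadbeef"))
# -> /srv/node/r1/object/1024/eef/deadbeef
```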

Now, if I delete object t1, Swift will create a new file called t3.ts
and put it in the hash directory.
/srv/node/r1/object/1024/eef/deadbeef/t3.ts.

When the replicator runs, it will delete all files with timestamp less
than t3. So will it delete both t1 and t2?

Correct. Two objects whose names have the same MD5 hash are considered equivalent by Swift.
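
To make the consequence concrete, here is a simplified model of that cleanup rule. It is only an illustration, not Swift's real cleanup code: the function name is hypothetical, it looks only at the timestamp in each file name, and it ignores other file types. The point is that nothing in the hash directory records which object name a file belonged to, so a tombstone obsoletes every older file there.

```python
def files_to_remove(filenames):
    """Return the files in one hash directory that are older than the
    newest tombstone (.ts). File names are '<timestamp>.<ext>'."""

    def timestamp(name):
        # '300.00000.ts' -> 300.0
        return float(name.rsplit(".", 1)[0])

    tombstones = [f for f in filenames if f.endswith(".ts")]
    if not tombstones:
        return []  # no delete recorded, nothing is obsoleted by this rule
    newest = max(timestamp(f) for f in tombstones)
    return sorted(f for f in filenames if timestamp(f) < newest)


# Numeric stand-ins for t1 < t2 < t3 from the example above.
listing = ["100.00000.data", "200.00000.data", "300.00000.ts"]
print(files_to_remove(listing))
# -> ['100.00000.data', '200.00000.data']
```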

If I remember correctly, since MD5 has a 128-bit output, the birthday bound means you only approach a 50% probability of a collision once your cluster holds on the order of 2^64 objects.
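
To put numbers on that estimate, here is a quick calculation using the standard birthday approximation, p ≈ 1 - exp(-n^2 / (2 * 2^128)). It assumes object names hash uniformly at random (nobody deliberately constructing colliding names), and the function names are just for this sketch.

```python
import math


def collision_probability(n_objects, hash_bits=128):
    """Approximate chance that at least two of n_objects share a hash."""
    space = 2 ** hash_bits
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-(n_objects ** 2) / (2 * space))


def objects_for_probability(p, hash_bits=128):
    """Approximate number of objects at which the collision chance reaches p."""
    space = 2 ** hash_bits
    return math.sqrt(2 * space * math.log(1 / (1 - p)))


print(round(collision_probability(2 ** 64), 4))            # -> 0.3935
print(round(objects_for_probability(0.5) / 2 ** 64, 4))    # -> 1.1774
print(collision_probability(10 ** 9) < 1e-20)              # -> True
```

So 2^64 is the right order of magnitude: about 39% at exactly 2^64 objects, 50% at roughly 1.18 * 2^64, and effectively zero for a cluster holding a billion objects.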

_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
