Re: problem with applying md5 to data

Wilker Wed, 29 Jun 2011 12:28:05 -0700

Nice :)

This is my latest code; I think it should be right now, hehe (and yes, I'm
using GC):


// At the top of the file:
#import <CommonCrypto/CommonDigest.h>

+ (NSString *)generateHashFromPath:(NSString *)path {
    const NSUInteger CHUNK_SIZE = 65536;

    NSError *error = nil;
    NSData *fileData = [NSData dataWithContentsOfFile:path
                                              options:NSDataReadingMapped | NSDataReadingUncached
                                                error:&error];

    // Check the return value, not the error pointer, to detect failure.
    if (fileData == nil) {
        return nil;
    }

    const uint8_t* buffer = [fileData bytes];

    NSUInteger byteLength = [fileData length];
    NSUInteger byteOffset = 0;

    if (byteLength > CHUNK_SIZE) {
        byteOffset = byteLength - CHUNK_SIZE;
        byteLength = CHUNK_SIZE;
    }

    CC_MD5_CTX md5;
    CC_MD5_Init(&md5);

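    // Hash the first chunk, then the last chunk. For files no larger than
    // CHUNK_SIZE, both calls cover the whole file.
    CC_MD5_Update(&md5, buffer, (CC_LONG) byteLength);
    CC_MD5_Update(&md5, buffer + byteOffset, (CC_LONG) byteLength);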

    unsigned char digest[CC_MD5_DIGEST_LENGTH];

    CC_MD5_Final(digest, &md5);

    NSMutableString *ret =
        [NSMutableString stringWithCapacity:CC_MD5_DIGEST_LENGTH * 2];

    for (int i = 0; i < CC_MD5_DIGEST_LENGTH; i++) {
        [ret appendFormat:@"%02x", digest[i]];
    }

    return [ret lowercaseString];
}
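
The chunk selection in the method above can be sketched in plain C. This is only an illustration of the offset arithmetic; the `chunk_plan` struct and `plan_chunks` function are hypothetical names, not part of the posted code or of any library.

```c
#include <stddef.h>

#define CHUNK_SIZE 65536u

/* Hypothetical helper type: the two byte ranges the method above hashes. */
typedef struct {
    size_t first_offset; /* start of the leading chunk, always 0 */
    size_t last_offset;  /* start of the trailing chunk */
    size_t length;       /* number of bytes hashed per chunk */
} chunk_plan;

/* Mirrors the byteOffset/byteLength logic: small files are hashed whole
 * (twice), larger files contribute their first and last CHUNK_SIZE bytes. */
chunk_plan plan_chunks(size_t file_length) {
    chunk_plan plan = { 0, 0, file_length };
    if (file_length > CHUNK_SIZE) {
        plan.last_offset = file_length - CHUNK_SIZE;
        plan.length = CHUNK_SIZE;
    }
    return plan;
}
```

Note that for files between one and two chunks long the two ranges overlap, and that only the first and last chunks influence the hash, so this is a quick fingerprint rather than an MD5 of the whole file.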

I tried it on an 8.5 GB video stored on an external drive (over Wi-Fi through
an AirPort Extreme), and it's blazing fast :)
Thanks a lot
---
Wilker Lúcio
http://about.me/wilkerlucio/bio
Kajabi Consultant
+55 81 82556600



On Wed, Jun 29, 2011 at 4:18 PM, Quincey Morris <quinceymor...@earthlink.net> wrote:

> On Jun 29, 2011, at 11:32, Wilker wrote:
>
> > Just out of curiosity, if I do the line:
> >
> >  const uint8_t* buffer = [fileData bytes];
> >
> > will it not read the entire file? Or do these addresses point directly to
> > the disk, so they are loaded on demand?
>
> [fileData bytes] is a pointer to some memory address or other, in your
> application's virtual memory address space. The actual pages of data don't
> exist yet, they are indeed "loaded" on demand. The demand will happen when
> CC_MD5_Update tries to retrieve bytes to update its calculations. As its
> internal pointer increments into each new page, its data accesses will cause
> VM faults, which will cause the pages to be read from disk, which in this
> case is your video file.
>
> That's why this is efficient. A "normal" read will transfer pages of data
> into the system's disk cache, then transfer some of it again into your
> application's address space, and if those pages in your address space happen
> to be swapped out sometime, your data is written *back* to disk (in the
> system VM backing store) for later rereading. Using mapped reads with no
> caching avoids all of that.
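
The on-demand behavior described above can be sketched with the POSIX mmap call. This assumes NSData's mapped reads behave like an ordinary read-only file mapping, which the thread does not confirm in detail; `peek_first_and_last` is a hypothetical helper name.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: maps a file read-only and touches only its first and
 * last byte. Returns 0 on success, -1 on failure (including an empty file). */
int peek_first_and_last(const char *path, unsigned char *first, unsigned char *last) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return -1;
    }
    size_t length = (size_t)st.st_size;

    /* Mapping sets up address space; it does not copy the file's
     * contents into a buffer. */
    const unsigned char *bytes = mmap(NULL, length, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (bytes == MAP_FAILED) return -1;

    /* Each access faults in the page that holds the requested byte. */
    *first = bytes[0];
    *last = bytes[length - 1];

    munmap((void *)bytes, length);
    return 0;
}
```

However large the file is, only the pages that are actually touched need to come off the disk, which is consistent with hashing two 64 KB chunks of an 8.5 GB file being fast.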
>
> > Also, just to check that I understand this:
> >
> > CC_MD5_Update(&md5, buffer + byteOffset, byteLength);
> >
> > In the sum "buffer + byteOffset", does adding a number to an array pointer
> > change its offset?
>
> 'buffer' is not an array, it's just a pointer. It's a pointer to an array
> of bytes, if you choose to think of it that way, but so are all pointers, if
> you choose to think of them that way.
>
> Think of it this way. You were using
> 'getBytes:range:NSMakeRange(offset,length)' to copy the bytes to a local
> buffer. That buffer must have copied the bytes *from* somewhere. The place
> where the bytes are being copied from is, literally, '[fileData bytes] +
> offset', and NSData's API makes it perfectly legal to go to the source
> yourself.
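
The difference between copying a range and pointing at it directly can be sketched in plain C. The function names here are hypothetical and only illustrate the pointer arithmetic; they are not NSData's actual implementation.

```c
#include <stddef.h>
#include <string.h>

/* Copying approach: duplicate `length` bytes starting at `offset` into `out`,
 * roughly what a getBytes:range: style call has to do. */
void copy_range(const unsigned char *base, size_t offset, size_t length,
                unsigned char *out) {
    memcpy(out, base + offset, length);
}

/* Direct approach: no copy, just the address of the byte at `offset`. */
const unsigned char *range_start(const unsigned char *base, size_t offset) {
    return base + offset;
}
```

Both approaches see the same bytes; the second simply skips the intermediate buffer, which is what passing `buffer + byteOffset` to CC_MD5_Update does.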
_______________________________________________

Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)

