Nice :) This is my latest code, and I think it should be right now (and yes, I'm using GC):

#import <CommonCrypto/CommonDigest.h>

+ (NSString *)generateHashFromPath:(NSString *)path
{
    const NSUInteger CHUNK_SIZE = 65536;

    // Map the file instead of reading it; pages are loaded only when touched.
    NSError *error = nil;
    NSData *fileData = [NSData dataWithContentsOfFile:path
                                              options:NSDataReadingMapped | NSDataReadingUncached
                                                error:&error];
    if (!fileData) {
        // Test the return value, not the error pointer, to detect failure.
        return nil;
    }

    const uint8_t *buffer = [fileData bytes];
    NSUInteger byteLength = [fileData length];
    NSUInteger byteOffset = 0;

    // For large files, hash only the first and the last CHUNK_SIZE bytes.
    // For smaller files, both updates below cover the whole file.
    if (byteLength > CHUNK_SIZE) {
        byteOffset = byteLength - CHUNK_SIZE;
        byteLength = CHUNK_SIZE;
    }

    CC_MD5_CTX md5;
    CC_MD5_Init(&md5);
    CC_MD5_Update(&md5, buffer, (unsigned int) byteLength);
    CC_MD5_Update(&md5, buffer + byteOffset, (unsigned int) byteLength);

    unsigned char digest[CC_MD5_DIGEST_LENGTH];
    CC_MD5_Final(digest, &md5);

    // Render the digest as a hexadecimal string.
    NSMutableString *ret = [NSMutableString stringWithCapacity:CC_MD5_DIGEST_LENGTH * 2];
    for (int i = 0; i < CC_MD5_DIGEST_LENGTH; i++) {
        [ret appendFormat:@"%02x", digest[i]];
    }

    return [ret lowercaseString];
}

I tried it against an 8.5 GB video stored on an external drive (over Wi-Fi, through an AirPort Extreme), and it's blazing fast :)

Thanks a lot
---
Wilker Lúcio
http://about.me/wilkerlucio/bio
Kajabi Consultant

On Wed, Jun 29, 2011 at 4:18 PM, Quincey Morris <quinceymor...@earthlink.net> wrote:

> On Jun 29, 2011, at 11:32, Wilker wrote:
>
> > Just out of curiosity, if I write the line:
> >
> > const uint8_t* buffer = [fileData bytes];
> >
> > will it not read the entire file? Or do these addresses point directly
> > at the disk, so that the data is loaded on demand?
>
> [fileData bytes] is a pointer to some memory address or other, in your
> application's virtual memory address space. The actual pages of data don't
> exist yet, they are indeed "loaded" on demand. The demand will happen when
> CC_MD5_Update tries to retrieve bytes to update its calculations. As its
> internal pointer increments into each new page, its data accesses will cause
> VM faults, which will cause the pages to be read from disk, which in this
> case is your video file.
>
> That's why this is efficient.
> A "normal" read will transfer pages of data into the system's disk cache,
> then transfer some of it again into your application's address space, and
> if those pages in your address space happen to be swapped out sometime,
> your data is written *back* to disk (in the system VM backing store) for
> later rereading. Using mapped reads with no caching avoids all of that.
>
> > Also, just to check that I understand this:
> >
> > CC_MD5_Update(&md5, buffer + byteOffset, byteLength);
> >
> > In the sum "buffer + byteOffset", does adding a number to an array
> > pointer change its offset?
>
> 'buffer' is not an array, it's just a pointer. It's a pointer to an array
> of bytes, if you choose to think of it that way, but so are all pointers,
> if you choose to think of them that way.
>
> Think of it this way. You were using
> 'getBytes:range:NSMakeRange(offset,length)' to copy the bytes to a local
> buffer. That buffer must have copied the bytes *from* somewhere. The place
> where the bytes are being copied from is, literally, '[fileData bytes] +
> offset', and NSData's API makes it perfectly legal to go to the source
> yourself.

_______________________________________________
Cocoa-dev mailing list (Cocoa-dev@lists.apple.com)
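
For anyone following along, the pointer arithmetic discussed above can be sketched in plain C. This is only an illustration, not code from the thread: the names chunk_range, tail_chunk, copy_range and range_start are made up for the example, and `buffer` stands in for what [fileData bytes] returns. It shows that copying a range out of a buffer and pointing at `buffer + offset` directly give you the same bytes, and how the offset of the last chunk is computed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE 65536u /* same chunk size as in the method above */

/* Hypothetical helper type: one byte range inside a buffer. */
typedef struct {
    size_t offset;
    size_t length;
} chunk_range;

/* Range of the trailing chunk: the last CHUNK_SIZE bytes,
 * or the whole buffer when it is not larger than CHUNK_SIZE. */
chunk_range tail_chunk(size_t total_length) {
    chunk_range r = { 0, total_length };
    if (total_length > CHUNK_SIZE) {
        r.offset = total_length - CHUNK_SIZE;
        r.length = CHUNK_SIZE;
    }
    return r;
}

/* Copying approach: duplicate the bytes into a caller-supplied
 * buffer of at least r.length bytes. */
void copy_range(const uint8_t *buffer, chunk_range r, uint8_t *out) {
    assert(buffer != NULL && out != NULL);
    memcpy(out, buffer + r.offset, r.length);
}

/* Direct approach: no copy, just a pointer into the same memory. */
const uint8_t *range_start(const uint8_t *buffer, chunk_range r) {
    return buffer + r.offset;
}
```

Comparing the output of copy_range with the r.length bytes found at range_start should always show identical data; the direct pointer simply skips the extra copy, which matters most when the buffer is a mapped multi-gigabyte file.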