[issue17436] pass a file object to hashlib.update

2013-03-17 Thread STINNER Victor

STINNER Victor added the comment:

> obj.update(buffer[:size])

This code makes a useless memory copy: obj.update(memoryview(buffer)[:size])
can be used instead.
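
A minimal sketch of the copy-free variant, assuming a bytearray buffer filled by readinto() as in the earlier proposal. The helper name hash_stream and the use of io.BytesIO as a stand-in for a real file are illustrative assumptions, not part of hashlib:

```python
import hashlib
import io


def hash_stream(obj, fp, buffersize=64 * 1024):
    """Feed everything readable from fp into the hash object obj."""
    buffer = bytearray(buffersize)
    view = memoryview(buffer)  # slicing a memoryview does not copy the bytes
    while True:
        size = fp.readinto(buffer)
        if not size:
            break
        # Only the first `size` bytes are valid after a short read.
        obj.update(view[:size])
    return obj


data = b"x" * 100000
digest = hash_stream(hashlib.sha256(), io.BytesIO(data)).hexdigest()
```

Creating the memoryview once, outside the loop, also removes the need for the separate "size == buffersize" branch.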

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue17436
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue17436] pass a file object to hashlib.update

2013-03-16 Thread anatoly techtonik

Changes by anatoly techtonik techto...@gmail.com:


--
title: pass a string to hashlib.update -> pass a file object to hashlib.update




[issue17436] pass a file object to hashlib.update

2013-03-16 Thread anatoly techtonik

anatoly techtonik added the comment:

Otherwise you need to repeat this code everywhere:

import hashlib

def filehash(filepath):
    blocksize = 64*1024
    sha = hashlib.sha256()
    with open(filepath, 'rb') as fp:
        # Read in fixed-size blocks so the whole file is never in memory.
        while True:
            data = fp.read(blocksize)
            if not data:
                break
            sha.update(data)
    return sha.hexdigest()

--




[issue17436] pass a file object to hashlib.update

2013-03-16 Thread STINNER Victor

STINNER Victor added the comment:

> It makes sense to allow hashlib.update accept file like object
> to read from.

Not update directly, but I agree that a helper would be convenient.

Here is another proposal using an unbuffered file and readinto() with a
bytearray. It should be faster, but I have not benchmarked it. I also wrote
two functions, because sometimes you have a file object, not a file path.

---
import hashlib, sys

def hash_readfile_obj(obj, fp, buffersize=64 * 1024):
    # Reuse one buffer for every read instead of allocating new bytes objects.
    buffer = bytearray(buffersize)
    while True:
        size = fp.readinto(buffer)
        if not size:
            break
        if size == buffersize:
            obj.update(buffer)
        else:
            # Short read: hash only the bytes that were actually filled.
            obj.update(buffer[:size])

def hash_readfile(obj, filepath, buffersize=64 * 1024):
    with open(filepath, 'rb', buffering=0) as fp:
        hash_readfile_obj(obj, fp, buffersize)

def file_sha256(filepath):
    sha = hashlib.sha256()
    hash_readfile(sha, filepath)
    return sha.hexdigest()

for name in sys.argv[1:]:
    print("%s %s" % (file_sha256(name), name))
---

readfile() and readfile_obj() should be methods of a hash object.

--
nosy: +haypo




[issue17436] pass a file object to hashlib.update

2013-03-16 Thread Jesús Cea Avión

Changes by Jesús Cea Avión j...@jcea.es:


--
nosy: +jcea




[issue17436] pass a file object to hashlib.update

2013-03-16 Thread anatoly techtonik

anatoly techtonik added the comment:

Why would unbuffered be faster?

--




[issue17436] pass a file object to hashlib.update

2013-03-16 Thread anatoly techtonik

anatoly techtonik added the comment:

Even though I mentioned passing a file object in the title of this bug report,
what I really need is the following API:

  hexhash = hashlib.sha256().readfile(filename).hexdigest()
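
One way to get this call shape today is a small wrapper. This is only a sketch: hashlib hash objects have no readfile() method, and the FileHasher class, its constructor argument, and the blocksize default are hypothetical names chosen for illustration:

```python
import hashlib


class FileHasher:
    """Hypothetical wrapper; hashlib hash objects have no readfile() method."""

    def __init__(self, name="sha256"):
        self._hash = hashlib.new(name)

    def readfile(self, filename, blocksize=64 * 1024):
        # Read in fixed-size blocks so large files never sit fully in memory.
        with open(filename, "rb") as fp:
            for block in iter(lambda: fp.read(blocksize), b""):
                self._hash.update(block)
        return self  # returning self is what allows the call to be chained

    def hexdigest(self):
        return self._hash.hexdigest()
```

With this wrapper the requested one-liner becomes FileHasher("sha256").readfile(filename).hexdigest().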

--




[issue17436] pass a file object to hashlib.update

2013-03-16 Thread STINNER Victor

STINNER Victor added the comment:

> Why would unbuffered be faster?

Well, I'm not sure that it is faster. But I would prefer to avoid
buffering if it is not needed.

--




[issue17436] pass a file object to hashlib.update

2013-03-16 Thread anatoly techtonik

anatoly techtonik added the comment:

I don't get that. I thought that buffered reading should be faster, although I
agree that the OS should handle this better. Why is buffering turned on by
default then? (I miss the ability to fork discussions from the tracker, but
there is no choice).
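
Since nobody in the thread has measured this, here is a rough timing sketch rather than a real benchmark. The function names, the 8 MiB throwaway file, and the single timed run per mode are all assumptions. The file has just been written, so both runs most likely read from the OS page cache, and results will vary by platform:

```python
import hashlib
import os
import tempfile
import time


def sha256_file(path, buffering, buffersize=64 * 1024):
    """Hash a file with readinto(); buffering=0 opens it unbuffered."""
    sha = hashlib.sha256()
    buffer = bytearray(buffersize)
    view = memoryview(buffer)
    with open(path, "rb", buffering=buffering) as fp:
        while True:
            size = fp.readinto(buffer)
            if not size:
                break
            sha.update(view[:size])
    return sha.hexdigest()


def timed(path, buffering):
    start = time.perf_counter()
    digest = sha256_file(path, buffering)
    return digest, time.perf_counter() - start


# Write a throwaway 8 MiB file so the sketch needs no existing data.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(8 * 1024 * 1024))
    path = tmp.name

try:
    unbuffered_digest, unbuffered_time = timed(path, 0)
    buffered_digest, buffered_time = timed(path, -1)  # -1 selects the default buffering
    print("unbuffered: %.4fs  buffered: %.4fs" % (unbuffered_time, buffered_time))
finally:
    os.remove(path)
```

Both modes must produce the same digest. Only the timings differ, and for reads this large the hashing itself is likely to dominate.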

--
