On 2013-07-31, Frank Millman <fr...@chagford.com> wrote:
>
> "Antoine Pitrou" <solip...@pitrou.net> wrote in message
> news:loom.20130731t114936-...@post.gmane.org...
>> Frank Millman <frank <at> chagford.com> writes:
>>>
>>> I have some binary data (a gzipped xml object) that I want to store in a
>>> database. For PostgreSQL I use a column with datatype 'bytea', which is
>>> their recommended way of storing binary strings.
>>>
>>> I use psycopg2 to access the database. It returns binary data
>>> in the form of a python 'memoryview'.
>>>
>> [...]
>>>
>>> Using MS SQL Server and pyodbc, it returns a byte string, not
>>> a memoryview, and it does compare equal with the original.
>>>
>>> I can hack my program to use tobytes(), but it would add
>>> complication, and it would be database-specific. I would
>>> prefer a cleaner solution.
>>
>> Just cast the result to bytes (`bytes(row[1])`). It will work
>> both with bytes and memoryview objects.
>
> Thanks for that, Antoine. It is an improvement over tobytes(),
> but I am afraid it is still not ideal for my purposes.
>
> At present, I loop over a range of columns, comparing 'before'
> and 'after' values, without worrying about their types. Strings
> are returned as str, integers are returned as int, etc. Now I
> will have to check the type of each column before deciding
> whether to cast to 'bytes'.
>
> Can anyone explain *why* the results do not compare equal? If I
> understood the problem, I might be able to find a workaround.

A memoryview will compare equal to another object that supports
the buffer protocol when the format and shape are also equal.
The database must be returning chunks of binary data in a
different shape or format than the one you wrote. Perhaps
psycopg2 is returning a chunk of ints when you have written a
chunk of bytes. Check the .format and .shape members of the
return value to see.

>>> x = memoryview(b"12345")
>>> x.format
'B'
>>> x.shape
(5,)
>>> x == b"12345"
True

My guess is you're getting format "I" from psycopg2. Hopefully
there's a way to coerce your desired "B" format interpretation
of the raw data using psycopg2's API.

-- 
Neil Cerutti
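
For illustration, here is a minimal sketch of how a format or shape mismatch
breaks the comparison. It assumes Python 3.3 or later, where
memoryview.cast() is available. The "I" view is built by hand with cast(),
only to imitate the guess above; nothing here confirms what psycopg2
actually returns. The normalize() helper is hypothetical, not part of
psycopg2 or pyodbc. It is one possible way to keep a type-agnostic
before/after loop without checking each column's type at the call site.

```python
def normalize(value):
    """Hypothetical helper: turn buffer-like values into bytes.

    Any other type (str, int, None, ...) is returned unchanged, so the
    helper can be applied to every column in a comparison loop.
    """
    if isinstance(value, (memoryview, bytearray)):
        return bytes(value)
    return value


raw = b"12345678"

# A plain byte view: format 'B', one item per byte.
as_bytes = memoryview(raw)

# The same memory reinterpreted as unsigned ints (assumed 4 bytes each).
# This only simulates the guessed "I" format.
as_ints = as_bytes.cast("I")

print(as_bytes.format, as_bytes.shape)
print(as_ints.format, as_ints.shape)

print(as_bytes == raw)            # same format and shape as the bytes object
print(as_ints == raw)             # shapes differ, so the views are not equal
print(normalize(as_ints) == raw)  # the underlying raw bytes are identical
```

Since normalize() passes str and int values through untouched, it could be
applied to both the 'before' and 'after' value of every column, whichever
database driver produced them.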