[issue15504] pickle/cPickle saves invalid/incomplete data

2012-07-30 Thread Philipp Lies

New submission from Philipp Lies:

I just stumbled upon a very serious bug in cPickle where cPickle stores the 
data passed to it only partially without a warning/error:

# create a random data string of more than 8 GB (1.1 * 2**33 bytes)
import os
import cPickle
random_string = os.urandom(int(1.1*2**33))
print len(random_string)
fout = open('test.pickle', 'wb')
cPickle.dump(random_string, fout, 2)
fout.close()
fin = open('test.pickle', 'rb')
random_string2 = cPickle.load(fin)
print len(random_string2)
print random_string == random_string2

The loaded string is significantly shorter, meaning that some of the data got 
lost while storing the string. This is a serious issue. However, when I use 
pickle, writing fails with 
error: 'i' format requires -2147483648 <= number <= 2147483647
so I guess pickle is not able to handle data this large. Therefore cPickle 
should either throw an error as well, or pickle/cPickle should be patched to 
handle larger data.
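
The error message above points at a signed 32-bit length field. A minimal 
sketch of that limit, assuming the length of the string is packed with the 
struct format '<i' (the helper name fits_int32 is hypothetical and only used 
for illustration):

```python
import struct

INT32_MAX = 2**31 - 1

# Length of the random string from the report: 1.1 * 2**33 bytes.
payload_len = int(1.1 * 2**33)


def fits_int32(n):
    """Return True if n can be packed as a signed 32-bit integer."""
    try:
        struct.pack('<i', n)
        return True
    except struct.error:
        return False


print(payload_len > INT32_MAX)   # the length exceeds the 32-bit range
print(fits_int32(INT32_MAX))     # the largest value that still packs
print(fits_int32(payload_len))   # packing the reported length fails
```

This only illustrates why a length this large cannot be written into a 
4-byte field; it says nothing about how cPickle ends up writing truncated data.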

Code to reproduce error using numpy (that's how I stumbled upon it):
import numpy as np
import cPickle as pickle
A = np.random.randn(1080,1920,553)
fout = open('test.pickle', 'wb')
pickle.dump(A, fout, 2)
fout.close()
fin = open('test.pickle', 'rb')
B = pickle.load(fin)
Here, numpy detects that the amount of data is wrong and throws an error. 
This is still serious, because saving does not raise an error, so the user 
expects that the data are safely stored.
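
Since the dump itself reports no error, one defensive option is to reload the 
file right after writing and compare it with the original. A minimal sketch, 
assuming the data fit in memory twice; dump_and_verify is a hypothetical helper, 
and a small payload stands in for the multi-GB data of the report:

```python
import os
import pickle
import tempfile


def dump_and_verify(obj, path, protocol=2):
    """Hypothetical helper: pickle obj to path, reload it, and compare."""
    with open(path, 'wb') as fout:
        pickle.dump(obj, fout, protocol)
    with open(path, 'rb') as fin:
        restored = pickle.load(fin)
    # Plain equality works for strings/bytes; it is not suitable for arrays.
    if restored != obj:
        raise ValueError('pickle round trip changed the data')
    return restored


# Small payload so the sketch runs anywhere.
data = os.urandom(1024)
path = os.path.join(tempfile.mkdtemp(), 'test.pickle')
restored = dump_and_verify(data, path)
print(len(restored) == len(data))
```

For numpy arrays the `!=` comparison is elementwise, so the check would have 
to use something like np.array_equal(restored, obj) instead.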

I guess this might be related to http://bugs.python.org/issue13555, which is 
still open.

Python 2.7.3 on latest Ubuntu with numpy 1.6.2, 64bit architecture, 128GB RAM

--
messages: 166906
nosy: Philipp.Lies
priority: normal
severity: normal
status: open
title: pickle/cPickle saves invalid/incomplete data
type: crash
versions: Python 2.7

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue15504
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13555] cPickle MemoryError when loading large file (while pickle works)

2011-12-12 Thread Philipp Lies

Philipp Lies p...@bethgelab.org added the comment:

a) it's 122GB free RAM (out of 128GB total RAM)

b) when I convert the numpy array to a list it works. So it seems to be a 
problem with cPickle and numpy above a certain array size

c) $ /usr/bin/time -v python test_np.py 
Traceback (most recent call last):
  File "test_np.py", line 12, in <module>
    A2 = cPickle.load(f2)
MemoryError
Command exited with non-zero status 1
Command being timed: python test_np.py
User time (seconds): 73.72
System time (seconds): 4.56
Percent of CPU this job got: 87%
Elapsed (wall clock) time (h:mm:ss or m:ss): 1:29.52
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 7402448
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 0
Minor (reclaiming a frame) page faults: 726827
Voluntary context switches: 41043
Involuntary context switches: 7793
Swaps: 0
File system inputs: 3368
File system outputs: 2180744
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 1

hth

--

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13555
___



[issue13555] cPickle MemoryError when loading large file (while pickle works)

2011-12-08 Thread Philipp Lies

Philipp Lies p...@bethgelab.org added the comment:

Well, replace cPickle by pickle and it works. So if there is a memory 
allocation problem, cPickle should be able to handle it, especially since it 
should be completely compatible with pickle.

--
