On Wed, Sep 23, 2009 at 9:42 AM, davew0000 <[email protected]> wrote:
>
> Hi,
>
> I've got a fairly large (but not huge, 58 MB) tab-separated text file, with
> approximately 200 columns and 56k rows of numbers and strings.
>
> Here's a snippet of my code to create a numpy matrix from the data file...
>
> ####
>
> data = map(lambda x : x.strip().split('\t'), sys.stdin.readlines())
> data = array(data)
>
> ###
>
> data = array(data)
>
> It causes the following error:
>
> >> ValueError: setting an array element with a sequence
>
> If I take the 1st 40,000 lines of the file, it works fine.
> If I take the last 40,000 lines of the file, it also works fine, so it
> isn't a problem with the file.
>
> I've found a few other posts complaining of the same problem, but none of
> their fixes work.
>
> It seems like a memory problem to me. This was reinforced when I tried to
> break the dataset into 3 chunks and stack the resulting arrays - I got an
> error message saying "memory error".
>
> I don't really understand why reading in this 57 MB text file is taking up
> ~2 GB of RAM.
>
> Any advice? Thanks in advance
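[A common cause of this particular ValueError, worth ruling out before assuming a memory problem, is a row whose field count differs from the rest: `array()` then falls back to an object array of lists. A quick way to check is to count fields per row. This is a minimal sketch; the in-memory `rows` list stands in for the poster's file, which is not available here.]

```python
import collections

# Stand-in for iterating over the real file, e.g.:
#   rows = open("data.txt")
rows = ["1\ta\t2", "3\tb\t4", "5\tc"]

# Count how many tab-separated fields each row has. More than one
# distinct count means the file is ragged, which makes numpy's
# array() build an object array and can trigger this ValueError.
counts = collections.Counter(len(r.rstrip("\n").split("\t")) for r in rows)
print(counts)  # more than one key => ragged rows
```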
Without knowing more, I wouldn't think that there's really a memory error
trying to load a 57 MB file or stacking it split into 3.

Try using genfromtxt or loadtxt. It should work without a problem unless
there is something funny about your file.

Skipper
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
