Are you tied to ASCII files? HDF5 (via h5py or pytables) might be a better storage format for what you are describing.
Tom

On Wed, Jul 5, 2017 at 8:42 AM <paul.carr...@free.fr> wrote:

> Dear all,
>
> I’m sorry if my question is too basic (it is not strictly about NumPy,
> although the goal is to build matrices and work with NumPy afterward), but
> I’m spending a lot of time and effort trying to read data from an ASCII
> file and assign it to a matrix/array, so far without success!
>
> The only way I have found is to use the *‘append()’* instruction, which
> involves dynamic memory allocation. :-(
>
> From my current experience with Scilab (a Matlab-like scientific solver),
> the well-known approach is:
>
> 1. Step 1: initialize the matrix, e.g. *‘np.zeros((n, n))’*
> 2. Step 2: read the data
> 3. Step 3: write it into the matrix
>
> I’m obviously influenced by my current experience, but I’m interested in
> moving to Python and its packages.
>
> For huge ASCII files (tens of millions of lines), my strategy is to work
> by ‘blocks’:
>
> - Find the line index of the beginning and the end of one block (this
>   implies that the file is read once)
> - Read the block
> - Repeat the process for the other blocks
>
> I tried different codes such as the one below, but each time Python tells
> me *I cannot mix iteration and read methods*
>
> #############################################
>
> position = []; j=0
>
> with open(PATH + file_name, "r") as rough_data:
>
>     for line in rough_data:
>
>         if *my_criteria* in line:
>
>             position.append(j) ## huge blocks but limited in number
>
>         j=j+1
>
>
> i = 0
>
> blockdata = np.zeros( (size_block), dtype=np.float)
>
> with open(PATH + file_name, "r") as f:
>
>     for line in itertools.islice(f,1,size_block):
>
>         blockdata [i]=float(f.readline() )
>
>         i=i+1
>
> #########################################
>
> Should I work on lists using f.readlines() (although this means loading
> the whole file into memory)?
>
> *Additional question*: can I read with vectorization, using ‘i =
> np.arange(0,65406)’, if I stay with the previous example?
>
> Thanks for your time and understanding.
>
> (I’m obviously interested in documentation references covering these
> specific tasks.)
>
> Paul
>
> PS: for Chuck: I’ll have a look at the pandas package, but at the code
> optimization step :-) (nearly 2000 pages of documentation)
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
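
If the data does have to stay in ASCII, the two-pass strategy described in the quoted message could look roughly like the sketch below. It assumes one numeric value per line and a marker string that opens each block; the marker "BLOCK", the sample file, and the function names find_marker_lines and read_block are made up for illustration. The key point is that the loop uses the line it is handed by the iterator; calling f.readline() inside a loop that already iterates over f is most likely what triggers the complaint about mixing iteration and read methods.

```python
import itertools
import os
import tempfile

import numpy as np


def find_marker_lines(path, marker):
    """First pass: return the 0-based line numbers that contain `marker`."""
    positions = []
    with open(path, "r") as f:
        for line_number, line in enumerate(f):
            if marker in line:
                positions.append(line_number)
    return positions


def read_block(path, start, stop):
    """Second pass: read lines start..stop-1 into a preallocated array."""
    block = np.zeros(stop - start, dtype=float)
    with open(path, "r") as f:
        # islice skips ahead to `start` and stops before `stop`.
        # Use the loop variable `line`; do not call f.readline() here.
        for i, line in enumerate(itertools.islice(f, start, stop)):
            block[i] = float(line)
    return block


# Tiny made-up file: a marker line followed by three values, twice.
sample = "BLOCK\n1.0\n2.0\n3.0\nBLOCK\n4.0\n5.0\n6.0\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

markers = find_marker_lines(sample_path, "BLOCK")
# The first block runs from the line after the first marker up to the
# second marker (exclusive).
first_block = read_block(sample_path, markers[0] + 1, markers[1])
os.remove(sample_path)
```

Only one block is held in memory at a time, so this avoids both append() and f.readlines(). For blocks that are plain columns of numbers, np.loadtxt with its skiprows and max_rows arguments may do the same job in a single call.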