Are you tied to ASCII files? HDF5 (via h5py or PyTables) might be a
better storage format for what you are describing.
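
If you do have to stay with ASCII, here is a minimal sketch of the two-pass, block-wise approach you describe, using a preallocated array and no mixing of file iteration with readline(). It assumes one float per line inside each block; the helper names (find_block_starts, read_block), the "# block" marker and the demo file are made up for illustration and are not from your code.

```python
import itertools
import os
import tempfile

import numpy as np


def find_block_starts(path, marker):
    """Return the 0-based line index of every line containing `marker`."""
    with open(path, "r") as f:
        return [j for j, line in enumerate(f) if marker in line]


def read_block(path, first_line, n_values):
    """Read `n_values` floats, one per line, starting at line `first_line`."""
    with open(path, "r") as f:
        # islice does the iterating; we never call f.readline() ourselves.
        lines = itertools.islice(f, first_line, first_line + n_values)
        # count=n_values lets NumPy allocate the result once, up front.
        return np.fromiter((float(s) for s in lines), dtype=float, count=n_values)


# Tiny demonstration file (hypothetical layout): two blocks of three values.
demo_text = "# block\n1.0\n2.0\n3.0\n# block\n4.0\n5.0\n6.0\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(demo_text)

starts = find_block_starts(tmp.name, "# block")   # first pass over the file
second = read_block(tmp.name, starts[1] + 1, 3)   # values of the second block
os.remove(tmp.name)
```

Note that islice still has to read through the skipped lines, so with many blocks it is probably cheaper to keep the file open and read the blocks in order rather than reopening it for each one.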


On Wed, Jul 5, 2017 at 8:42 AM <> wrote:

> Dear all,
> I’m sorry if my question is too basic (it is not strictly about NumPy,
> although the goal is to build matrices and work on them with NumPy
> afterward), but I’m spending a lot of time and effort trying to read data
> from an ASCII file and assign it to a matrix/array, so far without success!
> The only way I have found is to use the *‘append()’* instruction, which
> involves dynamic memory allocation. :-(
> From my current experience with Scilab (a Matlab-like scientific solver),
> the usual approach is well known:
>    1. Step 1: initialize the matrix, e.g. *‘np.zeros((n, n))’*
>    2. Step 2: read the data
>    3. Step 3: write it into the matrix
> I’m obviously influenced by my current experience, but I’m interested in
> moving to Python and its packages.
> For huge ASCII files (tens of millions of lines), my strategy is to work
> by ‘blocks’:
>    - Find the line index of the beginning and the end of each block (this
>    implies that the file is read once)
>    - Read the block
>    - (repeat the process for the other blocks)
> I tried different codes such as the one below, but each time Python tells
> me that *I cannot mix iteration and read methods*:
> #############################################
> import itertools
> import numpy as np
>
> # First pass: store the line index of each block delimiter.
> position = []
> with open(PATH + file_name, "r") as rough_data:
>     for j, line in enumerate(rough_data):
>         if my_criteria in line:
>             position.append(j)  # huge blocks, but limited in number
>
> # Second pass: read one block into a preallocated array.
> blockdata = np.zeros(size_block, dtype=float)
> with open(PATH + file_name, "r") as f:
>     for i, line in enumerate(itertools.islice(f, 1, size_block)):
>         blockdata[i] = float(f.readline())  # readline() mixed with iteration over f
> #############################################
> Should I work on lists using f.readlines() (which implies loading the
> whole file into memory)?
> *Additional question*: can I read with vectorization, e.g. with
> ‘i = np.arange(0, 65406)’, staying with the previous example?
> Thanks for your time and understanding.
> (I’m obviously interested in documentation references covering these
> specific tasks.)
> Paul
> PS: for Chuck: I’ll have a look at the pandas package, but at a later code
> optimization step :-) (nearly 2000 pages of documentation)
> _______________________________________________
> NumPy-Discussion mailing list