Hello all,

I would like to process a large numpy matrix with dimensions:

(100K+, 30K+)

The column names and the row names are meaningful.

My plan was to save the numpy matrix values as a txt file and read it into a
PCollection. However, I am not sure how to attach the row names to the
elements for processing.
The column names are easier: I can pass them as a parameter to the DoFn,
since they do not change.

With regard to the row names, the only way that I can see is to map the
row index to a string, read the row number in the DoFn, and look up the
name from it. Is there a more elegant way to solve this?
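
To make the question concrete, here is a minimal sketch of one alternative:
write the row name as the first field of every line, so each element carries
its own name and no index lookup is needed. This is only an illustration
under assumptions, not a tested pipeline. The helper names (format_row,
parse_line), the tab delimiter, and the toy row/column names are all
hypothetical, and it assumes the row names contain no tabs. A plain list of
lists stands in for the numpy matrix here.

```python
def format_row(row_name, values):
    """Build one text line: the row name first, then the tab-separated values."""
    return "\t".join([row_name] + [str(v) for v in values])


def parse_line(line, column_names):
    """Turn one text line back into (row_name, {column_name: value})."""
    fields = line.rstrip("\n").split("\t")
    row_name = fields[0]
    values = [float(v) for v in fields[1:]]
    return row_name, dict(zip(column_names, values))


# Toy data standing in for the real matrix (hypothetical names).
row_names = ["row_a", "row_b"]
column_names = ["col_1", "col_2", "col_3"]
matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

# One line per row; these lines would be written to the txt file.
lines = [format_row(name, row) for name, row in zip(row_names, matrix)]

# What each element would look like after parsing.
parsed = [parse_line(line, column_names) for line in lines]
```

In a pipeline, I assume parse_line could be applied to the lines coming out
of beam.io.ReadFromText with beam.Map(parse_line, column_names), which would
give keyed (row_name, values) elements.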

Many thanks,
-- 
Eila
www.orielresearch.org
https://www.meetup.com/Deep-Learning-In-Production/
