I am trying to ingest a 575MB CSV file with 192,444 lines using the 
CsvBulkLoadTool MapReduce job. To get the job to run, I have to raise the 
maximum Java heap size to 48GB; with 24GB it fails with Java out-of-memory 
errors.
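
For context, here is a sketch of the sort of invocation involved (the jar 
path, table name, input path, and ZooKeeper quorum below are placeholders 
rather than my actual values, and the client-side HADOOP_CLIENT_OPTS heap 
shown is just one of the places the heap can be raised):

    # Placeholder values; client-side JVM heap raised to 48GB.
    export HADOOP_CLIENT_OPTS="-Xmx48g"

    hadoop jar phoenix-client.jar \
        org.apache.phoenix.mapreduce.CsvBulkLoadTool \
        --table EXAMPLE_TABLE \
        --input /tmp/example.csv \
        --zookeeper zk-host:2181

    # (Map-task heap can instead be raised via -D mapreduce.map.java.opts=-Xmx...)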

I'm concerned about scaling. It seems like it shouldn't take somewhere 
between 24GB and 48GB of heap to ingest a 575MB file. However, I am pretty 
new to Hadoop/HBase/Phoenix, so maybe I am off base here.

Can anybody comment on this observation?

Thanks,
Jonathan
