Hello,

I am trying to import data from a BSON file into a 3-node Accumulo cluster using pyaccumulo. The BSON file is 4 GB and contains a large number of records, all of which should go into a single table. I tried a very naive approach and used the pyaccumulo batch writer to write to the table. After parsing some records, my master became unresponsive and shut down, with the tserver threads stuck on a low-memory error. I am assuming that the records are being created faster than the proxy/master can handle. Is there any other way to go about this? I am thinking of using bulk ingest, but I am not sure how exactly to do it.
Best regards, Yamini Joshi