Hi,

I'm trying to copy a 20 GB CSV file into a fresh 3-node Cassandra cluster. Each node has 32 GB of memory and sufficient disk, and the keyspace uses RF = 1 with durable_writes = false. The machine I'm loading from is external to the cluster, shares a 1 Gbps link, and has 16 GB of RAM. (We chose this setup hoping to reduce CPU and IO usage on the cluster nodes.)
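
For reference, the keyspace is defined roughly like this (the keyspace name is just a placeholder):

    CREATE KEYSPACE bulk_load
      WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
      AND durable_writes = false;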

I'm using the cqlsh COPY command to load the data. It kicks off well, launches a set of worker processes, and does about 50,000 rows per second. But I can see that the parent process keeps accumulating memory, roughly in proportion to the amount of data processed, and after a point the processes just hang. The parent process was consuming 95% of system memory by the time it had processed around 60% of the data.
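
The invocation looks roughly like the following (table, column, and file names are placeholders, and some of the WITH options may only exist in newer cqlsh versions):

    COPY bulk_load.events (id, ts, payload)
      FROM '/data/events.csv'
      WITH HEADER = true AND CHUNKSIZE = 1000 AND NUMPROCESSES = 8;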

I had earlier tried feeding the data in from multiple smaller files (less than 4 GB each), and that worked as expected.

Is this a valid scenario for COPY?

Regards,
Bhuvan
