> > Also as I have mentioned I cant afford to run my code using 4-5 times
> > memory.
> > Total resource available in my server is about 180 GB memory (approx 64
> > GB RAM + 128GB swap).
>
> OK, There is a huge difference between having 100G of RAM and having
> 64G+128G swap.
> swap is basically disk so if you are reading your data into memory and
> that memory is bouncing in and out of swap things will slow down by an
> order of magnitude.
> You need to try to optimise to use real RAM and minimise use of swap.

I concur with Alan, and want to state his point more forcefully. If you
are hitting swap, you are computationally DOOMED and must do something
different.

You _must_ avoid swap at all costs here. You may not understand the
point, so a little more explanation: touching swap is several orders of
magnitude more expensive than anything else you are doing in your
program.

    CPU operations are on the order of nanoseconds. (10^-9 s)
    Disk operations are on the order of milliseconds. (10^-3 s)

That is a gap of roughly six orders of magnitude.

References:

    http://en.wikipedia.org/wiki/Instructions_per_second
    http://en.wikipedia.org/wiki/Hard_disk_drive_performance_characteristics

As soon as you start touching your swap space to simulate virtual
memory, you've lost the battle.

We were trying not to leap to conclusions until we knew more. Now we
know more. If your system has much less RAM than your dataset needs,
then trying to read the whole dataset at once, on your single machine,
into an in-memory buffer, is the wrong approach.
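
To make "something different" a bit more concrete, here is a minimal
sketch of the usual alternative: stream the data through a small,
fixed-size buffer and keep only a running result, so the working set
stays in real RAM. I don't know what your program actually computes, so
the names here (iter_chunks, count_newlines) and the 1 MiB chunk size
are assumptions for illustration only, not your code.

```python
def iter_chunks(path, chunk_size=1 << 20):
    """Yield successive blocks of at most chunk_size bytes from a file.

    Hypothetical helper: only one block is held in memory at a time.
    """
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            yield chunk


def count_newlines(path, chunk_size=1 << 20):
    """Count newline bytes in a file of any size.

    Stand-in for whatever per-record work the real program does: the
    point is that only the running total survives between chunks.
    """
    total = 0
    for chunk in iter_chunks(path, chunk_size):
        total += chunk.count(b"\n")
    return total
```

With this shape, peak memory is bounded by chunk_size rather than by
the size of the file. Whether your real computation can be restructured
this way depends on details you haven't told us yet.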

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor