subject:"Processing large CSV files - how to maximise throughput?"

Re: Processing large CSV files - how to maximise throughput?

2013-10-26 Thread Walter Hurry

On Thu, 24 Oct 2013 18:38:21 -0700, Victor Hooi wrote: Hi, We have a directory of large CSV files that we'd like to process in Python. We process each input CSV, then generate a corresponding output CSV file. input CSV -> munging text, lookups etc. -> output CSV My question is,

Re: Processing large CSV files - how to maximise throughput?

2013-10-25 Thread Chris Angelico

On Fri, Oct 25, 2013 at 2:57 PM, Dave Angel da...@davea.name wrote: But I would concur -- probably they'll both give about the same speedup. I just detest the pain that multithreading can bring, and tend to avoid it if at all possible. I don't have a history of major pain from threading. Is

Re: Processing large CSV files - how to maximise throughput?

2013-10-25 Thread Stefan Behnel

Chris Angelico, 25.10.2013 08:13: On Fri, Oct 25, 2013 at 2:57 PM, Dave Angel wrote: But I would concur -- probably they'll both give about the same speedup. I just detest the pain that multithreading can bring, and tend to avoid it if at all possible. I don't have a history of major pain

Re: Processing large CSV files - how to maximise throughput?

2013-10-25 Thread Chris Angelico

On Fri, Oct 25, 2013 at 5:39 PM, Stefan Behnel stefan...@behnel.de wrote: Basically, with multiple processes, you start with independent systems and add connections specifically where needed, whereas with threads, you start with completely shared state and then prune away interdependencies and

Re: Processing large CSV files - how to maximise throughput?

2013-10-25 Thread Dave Angel

On 25/10/2013 02:13, Chris Angelico wrote: On Fri, Oct 25, 2013 at 2:57 PM, Dave Angel da...@davea.name wrote: But I would concur -- probably they'll both give about the same speedup. I just detest the pain that multithreading can bring, and tend to avoid it if at all possible. I don't have

Re: Processing large CSV files - how to maximise throughput?

2013-10-25 Thread Chris Angelico

On Fri, Oct 25, 2013 at 10:24 PM, Dave Angel da...@davea.name wrote: On 25/10/2013 02:13, Chris Angelico wrote: On Fri, Oct 25, 2013 at 2:57 PM, Dave Angel da...@davea.name wrote: But I would concur -- probably they'll both give about the same speedup. I just detest the pain that

Re: Processing large CSV files - how to maximise throughput?

2013-10-25 Thread Roy Smith

In article mailman.1560.1382744694.18130.python-l...@python.org, Dennis Lee Bieber wlfr...@ix.netcom.com wrote: Memory is cheap -- I/O is slow. <G> Just how massive are these CSV files? Actually, these days, the economics of hardware are more like, CPU is cheap, memory is expensive. I

Processing large CSV files - how to maximise throughput?

2013-10-24 Thread Victor Hooi

Hi, We have a directory of large CSV files that we'd like to process in Python. We process each input CSV, then generate a corresponding output CSV file. input CSV -> munging text, lookups etc. -> output CSV My question is, what's the most Pythonic way of handling this? (Which I'm assuming

Re: Processing large CSV files - how to maximise throughput?

2013-10-24 Thread Dave Angel

On 24/10/2013 21:38, Victor Hooi wrote: Hi, We have a directory of large CSV files that we'd like to process in Python. We process each input CSV, then generate a corresponding output CSV file. input CSV -> munging text, lookups etc. -> output CSV My question is, what's the most Pythonic

Re: Processing large CSV files - how to maximise throughput?

2013-10-24 Thread Steven D'Aprano

On Thu, 24 Oct 2013 18:38:21 -0700, Victor Hooi wrote: Hi, We have a directory of large CSV files that we'd like to process in Python. We process each input CSV, then generate a corresponding output CSV file. input CSV -> munging text, lookups etc. -> output CSV My question is,

Re: Processing large CSV files - how to maximise throughput?

2013-10-24 Thread Steven D'Aprano

On Fri, 25 Oct 2013 02:10:07 +, Dave Angel wrote: If I have multiple large CSV files to deal with, and I'm on a multi-core machine, is there anything else I can do to boost throughput? Start multiple processes. For what you're doing, there's probably no point in multithreading. Since
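For readers skimming the thread, here is a rough sketch of the "start multiple processes" suggestion quoted above. It is not code from any of the posters. It assumes Python 3, one output file per input file, and files that can be processed independently. The names `munge`, `convert`, `plan_jobs` and `convert_all` are hypothetical, and `munge` only stands in for the real "munging text, lookups etc." step.

```python
import csv
import os
from multiprocessing import Pool


def munge(row):
    # Hypothetical placeholder for the real per-row work.
    return [field.strip() for field in row]


def convert(paths):
    """Process one (input, output) pair; intended to run in a worker process."""
    in_path, out_path = paths
    # newline="" lets the csv module handle line endings itself.
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            writer.writerow(munge(row))
    return out_path


def plan_jobs(in_dir, out_dir):
    """Pair every .csv in in_dir with a same-named output path in out_dir."""
    names = sorted(n for n in os.listdir(in_dir) if n.endswith(".csv"))
    return [(os.path.join(in_dir, n), os.path.join(out_dir, n)) for n in names]


def convert_all(in_dir, out_dir, workers=None):
    """Fan the files out over a pool of worker processes."""
    jobs = plan_jobs(in_dir, out_dir)
    # processes=None makes Pool use one worker per CPU reported by the OS.
    with Pool(processes=workers) as pool:
        return list(pool.imap_unordered(convert, jobs))
```

A script would call something like `convert_all("input", "output")` (directory names invented here) from under an `if __name__ == "__main__":` guard, which `multiprocessing` needs on platforms that spawn rather than fork. Whether this beats a single process depends on whether the job is CPU-bound or limited by disk I/O, which is the question the thread goes on to debate.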

Re: Processing large CSV files - how to maximise throughput?

2013-10-24 Thread Mark Lawrence

On 25/10/2013 02:38, Victor Hooi wrote: So for the reading, it'll iterate over the lines one by one, and won't read it into memory which is good. Wow this is fantastic, which OS are you using? Or do you actually mean that the whole file doesn't get read into memory, only one line at a
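The point being made in this exchange, that only one line at a time is held rather than the whole file, can be illustrated with a small sketch. This is an assumption-laden example in Python 3, not the original poster's code. `transform_row` and `process_file` are hypothetical names, and the upper-casing is just a stand-in transformation.

```python
import csv


def transform_row(row):
    # Hypothetical stand-in for the "munging text, lookups etc." step.
    return [field.strip().upper() for field in row]


def process_file(in_path, out_path):
    """Stream in_path to out_path one row at a time; return the row count."""
    count = 0
    # newline="" lets the csv module handle line endings itself.
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        # csv.reader pulls rows lazily from the file object, so only the
        # current row (plus the file's I/O buffer) is held in memory.
        for row in csv.reader(src):
            writer.writerow(transform_row(row))
            count += 1
    return count
```

Because the loop never builds a list of rows, memory use stays roughly constant regardless of file size. The operating system and Python's buffered I/O still read the file in blocks behind the scenes.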

Re: Processing large CSV files - how to maximise throughput?

2013-10-24 Thread Dave Angel

On 24/10/2013 23:35, Steven D'Aprano wrote: On Fri, 25 Oct 2013 02:10:07 +, Dave Angel wrote: If I have multiple large CSV files to deal with, and I'm on a multi-core machine, is there anything else I can do to boost throughput? Start multiple processes. For what you're doing,

13 matches
