Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-02-04 Thread John McKown
I'm now using a Perl script which Malcolm Beattie kindly gave me. I made some minor changes to make it more general, but the main logic is his. I ran it and it was significantly faster than BASH. And I then managed to use up all the space in the filesystem that I had it on. OOPS. I'm going to need to

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-02-04 Thread Pavelka, Tomas
The internal bash parameter expansion functions (e.g. ${line%% *}) tend to be quite inefficient. Here is one example that compares the performance of bash substitution with Perl substitution: #!/bin/bash comma_sep=$(perl -e 'for($i=0;$i<1000;$i++) { print("$i;") };') time space_sep=${comma_sep//;/ } ti

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-02-01 Thread Rob van der Heij
On 31 January 2013 22:01, Philipp Kern wrote: > even quite old Intel boxes manage to saturate 1 GE easily. You're > copying stuff into the send buffer and ring a bell. > > Nowadays it doesn't seem hard to do 10 GE with a Linux box, especially if > you've got HW assist on the network card. The z n

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-31 Thread John McKown
I want to thank Malcolm Beattie for the Perl script. I'm running it now. It has already finished processing one generation, having split it up into 60 output files. This has been about 3 hours now. Significantly faster. I only made one change. I did a close on all the cached output files after fini

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-31 Thread Philipp Kern
Rob, on Thu, Jan 31, 2013 at 03:14:09PM +0100 you wrote: > On 31 January 2013 14:38, Philipp Kern wrote: > > Also you should be able to do between 100MB/s to 1GB/s on 10 GE, which is > My rule of thumb is that pumping 100 MB/s or so through the Linux > TCP/IP stack will burn a

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-31 Thread Rob van der Heij
On 31 January 2013 14:38, Philipp Kern wrote: > Also you should be able to do between 100MB/s to 1GB/s on 10 GE, which is My rule of thumb is that pumping 100 MB/s or so through the Linux TCP/IP stack will burn a CPU, maybe half if he can use large packets. So 1 GB/s takes 5-10 CPUs if the wire
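Rob's rule of thumb can be written out as a small calculation. This is only a sketch of the arithmetic in his message: the function name `cpus_needed` and its parameters are hypothetical, and the figures (about 100 MB/s per CPU, roughly half the cost with large packets) are his stated estimates, not measurements.

```python
def cpus_needed(throughput_mb_s, mb_s_per_cpu=100.0, large_packet_factor=1.0):
    """Estimate CPUs consumed by pushing data through the TCP/IP stack.

    mb_s_per_cpu: rule-of-thumb throughput one CPU can sustain (assumed 100 MB/s).
    large_packet_factor: 1.0 for normal packets, 0.5 if large packets halve the cost.
    """
    return throughput_mb_s / mb_s_per_cpu * large_packet_factor


# 1 GB/s is taken as 1000 MB/s here.
worst_case = cpus_needed(1000)                          # normal packets
best_case = cpus_needed(1000, large_packet_factor=0.5)  # large packets
print(f"1 GB/s needs roughly {best_case:g} to {worst_case:g} CPUs")
```

The two cases bracket the "5-10 CPUs" range quoted in the message.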

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-31 Thread Philipp Kern
John, on Wed, Jan 30, 2013 at 08:24:04AM -0600 you wrote: > Well, I know that downloading the 160 Gig uncompressed data takes > about 8 hours on the 10 Gig/sec Ethernet connection. I then bzip2 > compress [...] don't do that. If you just need a compression on such a high throu

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-31 Thread John McKown
Many thanks for that Perl code. I've taken it and will see how fast it is. On Wed, Jan 30, 2013 at 7:55 AM, Malcolm Beattie wrote: > John McKown writes: > Perl (and Python) aren't simply interpreted. In the case of perl, it > compiles the source into an internal op tree (rather like bytecode) >

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-31 Thread John McKown
On Wed, Jan 30, 2013 at 11:56 PM, Chase, John wrote: >> -Original Message- >> From: Linux on 390 Port On Behalf Of John McKown >> >> Well, I know that downloading the 160 Gig uncompressed data takes about 8 >> hours on the 10 Gig/sec >> Ethernet connection. I then bzip2 compress that to a

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-31 Thread Malcolm Beattie
John McKown writes: > This is more a curiosity question. I have written a bash script which > reads a bzip2 compressed set of files. For each record in the file, it > writes the record into a file name based on the first two "words" in > the record and the "generation number" from the input file na

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-30 Thread Chase, John
> -Original Message- > From: Linux on 390 Port On Behalf Of John McKown > > Well, I know that downloading the 160 Gig uncompressed data takes about 8 > hours on the 10 Gig/sec > Ethernet connection. I then bzip2 compress that to about 50 Meg. Which I > binary upload back to z/OS > for sa

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-30 Thread Patrick Spinler
Regarding i/o buffering, as Rob discusses On 1/30/13 2:40 AM, Rob van der Heij wrote: > > If the input files have a lot of 'chunks' that go to the same output > file, it might be fairly easy to gobble up the ones that go together > and write them in a single go. Based on more heuristics, you may b
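Rob's suggestion of gobbling up consecutive records that go to the same output file can be sketched in Python as follows. This is an illustration only: the helper names `key_of` and `runs` are hypothetical, and the key (the first two whitespace-separated words of a record) is taken from John's description of his script rather than from any code posted in the thread.

```python
from itertools import groupby


def key_of(line):
    """Return the first two whitespace-separated words of a record as a tuple."""
    return tuple(line.split(None, 2)[:2])


def runs(lines):
    """Yield (key, records) for each run of consecutive records sharing a key.

    Only adjacent records are grouped, so the input is still read in one
    streaming pass and nothing needs to be sorted or held in memory beyond
    the current run.
    """
    for key, group in groupby(lines, key=key_of):
        yield key, list(group)


if __name__ == "__main__":
    sample = ["a b 1\n", "a b 2\n", "c d 3\n", "a b 4\n"]
    for key, records in runs(sample):
        print(key, len(records))
```

Each run can then be handed to a single `writelines` call on the matching output file, so the number of write calls depends on the number of runs rather than the number of records. How much this helps depends on how clustered the input actually is.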

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-30 Thread John McKown
Well, I know that downloading the 160 Gig uncompressed data takes about 8 hours on the 10 Gig/sec Ethernet connection. I then bzip2 compress that to about 50 Meg. Which I binary upload back to z/OS for safety (since it's sitting on my Linux desktop) in just a few minutes. But bzgrep can scan the co

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-30 Thread Patrick Spinler
And I should mention, as Shane G. suggested, for simple, big, text processing stuff like this you'll get as good performance from perl as from compiled code. There's a reason big text processing genomics stuff like bioperl is written in perl. I'm not a python wizard, but I have to imagine tha

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-30 Thread Mark Post
>>> On 1/30/2013 at 08:44 AM, John McKown wrote: > But I may be forced into using C or C++ for > speed. Too bad I'm not a very good C programmer. Based on the number of buffer overflow exploits that have been discovered over the years, I don't think very many people are. Mark Post --

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-30 Thread Shane G
On Thu, Jan 31st, 2013 at 12:44 AM, John McKown wrote: > Thanks to all for the input! I _tried_ to run the script over night. I > added an echo to tell me which input file I was working on. I came in > this morning. It had been running from 14:00 to 06:30 (16 1/2 hours) > and was still on the firs

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-30 Thread John McKown
Thanks to all for the input! I _tried_ to run the script over night. I added an echo to tell me which input file I was working on. I came in this morning. It had been running from 14:00 to 06:30 (16 1/2 hours) and was still on the first input file. That ain't gonna cut it. Time to rethink. Using a

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-30 Thread Agblad Tore
LINUX-390@VM.MARIST.EDU] On Behalf Of John McKown Sent: 29 January 2013 23:13 To: LINUX-390@VM.MARIST.EDU Subject: Speed of BASH script vs. Python vs. Perl vs. compiled This is more a curiosity question. I have written a bash script which reads a bzip2 compressed set of files. For each recor

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-30 Thread Rob van der Heij
On 30 January 2013 05:17, Patrick Spinler wrote: > Since no one else seems to be pointing this out, i can see at least one > potential optimization: > > This will re-open the output file and seek to the end every output. > That's a factor of 2 or 3 more syscalls every time. (possibly plus the >

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-29 Thread Patrick Spinler
Since no one else seems to be pointing this out, I can see at least one potential optimization: On 1/29/13 4:13 PM, John McKown wrote: > > If you're interested, the bash script looks like: > > #!/bin/bash > for i in irradu00.g*.bz2;do > gen=${i#irradu00.}; # remove prefix > gen=${g
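The re-open-on-every-record cost discussed in this part of the thread can be avoided by caching output handles. The Python sketch below is a hedged illustration, not John's bash script and not Malcolm Beattie's Perl code: the function name `split_records` and the output naming scheme `<word1>.<word2>.<gen>` are assumptions made for the example.

```python
import os
import tempfile


def split_records(lines, gen, out_dir):
    """Append each record to a file chosen by its first two words and gen.

    Each output file is opened once and kept in a cache, instead of being
    re-opened (and seeked to the end) for every record. All cached handles
    are closed when the input is exhausted.
    """
    handles = {}
    try:
        for line in lines:
            words = line.split(None, 2)
            if len(words) < 2:
                continue  # skip records without two leading words
            # Hypothetical naming scheme; assumes the words are safe file names.
            name = f"{words[0]}.{words[1]}.{gen}"
            fh = handles.get(name)
            if fh is None:
                fh = open(os.path.join(out_dir, name), "a")
                handles[name] = fh
            fh.write(line if line.endswith("\n") else line + "\n")
    finally:
        for fh in handles.values():
            fh.close()
    return sorted(handles)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        created = split_records(["a b one\n", "a c two\n", "a b three\n"], "g0001", tmp)
        print(created)
```

Calling the function once per input generation closes every cached handle at the end of that generation. If the number of distinct output files approached the per-process open file limit, the cache would need an eviction policy, which this sketch omits.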

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-29 Thread Henry Schaffer
On Tue, Jan 29, 2013 at 8:55 PM, John McKown wrote: > Interesting. I've tried looking at R, but just can't get the time to read > the books I've bought. I've been doing some analyses with R. It is *very* complex with lots and lots of commands. I've found faculty who teach courses in which R

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-29 Thread David Boyes
I don't think you're going to see much (if any) improvement, really. This process is pretty much a simple filter, and you're mostly I/O bound, so C/C++ aren't really going to help much, and the amount of code you'll need to write to simulate the parsing capabilities of any of the shell languages

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-29 Thread John McKown
Interesting. I've tried looking at R, but just can't get the time to read the books I've bought. On Jan 29, 2013 7:39 PM, "David Boyes" wrote: > > Actually, it is the IRRADU00 reformatted RACF audit records from SMF. > Can't > > process the SMF itself easily on VM/CMS or Linux. > > I have a f

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-29 Thread David Boyes
> Interesting. I've tried looking at R, but just can't get the time to read the > books I've bought. Another option for CMS or Linux might be the really ancient version of MACSYMA that lives on the MVS CBT tape. If you have access to a Fortran compiler, that beastie can eat structured re

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-29 Thread David Boyes
> Actually, it is the IRRADU00 reformatted RACF audit records from SMF. Can't > process the SMF itself easily on VM/CMS or Linux. I have a faint memory that someone took the SMF publication, extracted the record layouts and created some data descriptions for the S statistical tool on Linux. Don'

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-29 Thread John McKown
Actually, it is the IRRADU00 reformatted RACF audit records from SMF. Can't process the SMF itself easily on VM/CMS or Linux. On Jan 29, 2013 6:14 PM, "Michael Harding" wrote: > Based on John's previous posts and the dataset names referenced in this, > I'd say he's playing with the output from z/

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-29 Thread Michael Harding
Based on John's previous posts and the dataset names referenced in this, I'd say he's playing with the output from z/OS' RACF database unload. If his zlinux is VM-hosted, I'd be inclined to bring the files to VM first, mung them up with CMS pipelines then transfer the output files to his guest. O

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-29 Thread John McKown
Lifetime is as long as I work here and want to. This is not production. The data is RACF audit information that I alone use to answer ad hoc information requests. If pushed, I must recreate the report using z/OS procedures. I do this on my Linux box to save z/OS cycles. I may look at using ooREXX,

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-29 Thread Richard Troth
Shell is a write-only language. (that's an opinion) What I mean is, maintaining "applications" written in shell, even BASH, is *hard*. However ... What is the extent and life of this script? If the purpose is to wrap-up a number of other programs, then use a shell. I agree with Jon. You're ca

Re: Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-29 Thread Jon Miller
Although I'm a fan of Python, I wouldn't rewrite your program. Bash itself is an interpreted language like your other options of Python / Perl. You're going to get near native (compiled program) speed by using that call to bzcat for your decompression and then I like how your while loop is using ba

Speed of BASH script vs. Python vs. Perl vs. compiled

2013-01-29 Thread John McKown
This is more a curiosity question. I have written a bash script which reads a bzip2 compressed set of files. For each record in the file, it writes the record into a file name based on the first two "words" in the record and the "generation number" from the input file name. Due to the extreme size o