Re: sobering observation, python vs. perl
On Thu, 17 Mar 2016 21:18:43 +0530, srinivas devaki wrote:

> please upload the log file,

Sorry, it's work stuff, can't do that, but just take any big set of files
and change the strings appropriately and the numbers should be equivalent.

> and global variables in python are slow, so just keep all that in a
> function and try again. generally i get 20-30% time improvement by
> doin that.

#!/usr/bin/env python
# vim: tw=0
import sys
import re

def faster ():
    isready = re.compile ("(.*) is ready")
    relreq = re.compile (".*release_req")
    for fn in sys.argv[1:]:                 # logfile name
        tn = None
        with open (fn) as fd:
            for line in fd:
                #match = re.match ("(.*) is ready", line)
                match = isready.match (line)
                if match:
                    tn = match.group(1)
                    continue
                #match = re.match (".*release_req", line)
                match = relreq.match (line)
                if match:
                    #print "%s: %s" % (tn, line),
                    print tn

faster()

$ time python ./find-relreq *.out | sort -u
TestCase_F_00_P
TestCase_F_00_S
TestCase_F_01_S
TestCase_F_02_M

real    0m25.515s
user    0m25.294s
sys     0m0.136s

3 more seconds!

--
https://mail.python.org/mailman/listinfo/python-list
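[Editor's sketch] The "keep it in a function" advice rests on local names being cheaper to look up than module-level names in CPython. A minimal way to check that on your own machine might look like the following. It is written for Python 3 (the thread uses Python 2), the function names and COUNT are made up for illustration, and no particular speedup is claimed.

```python
import timeit

COUNT = 100_000  # arbitrary loop size, chosen only for illustration


def loop_with_globals():
    # total_g lives in the module namespace, so every += is a global access
    global total_g
    total_g = 0
    for i in range(COUNT):
        total_g += i
    return total_g


def loop_with_locals():
    # total is a local variable of the function
    total = 0
    for i in range(COUNT):
        total += i
    return total


if __name__ == "__main__":
    print("globals:", timeit.timeit(loop_with_globals, number=10))
    print("locals: ", timeit.timeit(loop_with_locals, number=10))
```

Both functions compute the same sum, so any timing difference comes from the name lookups alone.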
Re: sobering observation, python vs. perl
On 03/17/2016 09:08 AM, Charles T. Smith wrote:
> On Thu, 17 Mar 2016 10:52:30 -0500, Tim Chase wrote:
>> Not saying this will make a great deal of difference, but these two
>> items jumped out at me. I'd even be tempted to just use string
>> manipulations for the isready aspect as well. Something like
>> (untested)
>
> well, I don't want to forgo REs in order to have python's numbers be
> better

The issue is not avoiding REs, but using Python's strengths and idioms.
Write the code in Python's style, get the same results, then compare the
times.

If you posted the data file and exact results the rest of us could try,
but as it is all we can do is offer ideas and you have to test them.

--
~Ethan~
Re: sobering observation, python vs. perl
On Thu, 17 Mar 2016 15:29:47 +, Charles T. Smith wrote:

And for completeness, and also surprising:

time sed -n -e '/ is ready/{s///;h}' -e '/release_req/{g;p}' *.out | sort -u
TestCase_F_00_P
TestCase_F_00_S
TestCase_F_01_S
TestCase_F_02_M

real    0m10.998s
user    0m10.885s
sys     0m0.108s

Twice as long as perl... I guess there's no excuse for sed anymore...
Re: sobering observation, python vs. perl
On Thu, 17 Mar 2016 19:08:58 +0200, Marko Rauhamaa wrote:

> "Charles T. Smith":
>
> Compare Perl (<http://www.perlmonks.org/?node_id=98357>):
>
>    my $str = "I have a dream";
>    my $find = "have";
>    my $replace = "had";
>    $find = quotemeta $find; # escape regex metachars if present
>    $str =~ s/$find/$replace/g;
>    print $str;
>
> with Python:
>
>    print("I have a dream".replace("have", "had"))
>
> Marko

Uh... that perl is way over my head. I admit though, that perl's
powerful substitute command is also clumsy. The best I can do
right now is:

    $v = "I have a dream\n";
    $v =~ s/have/had/;
    print $v

One of the ugliest things about perl is the "silly" type prefixes
($, @, %). But in a python project I'm doing now, I realized an
important advantage that they bring...

I want to be able to initialize msgs to communicate with C. Ideally,
I'd like to just specify the path to an equivalent python instance, but
all intermediate instances have to already exist - python does not have
autovivification. I implemented it but only up until the leaf node -
because python doesn't know their types. Perl can do that, because the
prefix tells it the type.

But, don't get me wrong, coding in python is a JOY!
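[Editor's sketch] For plain nested dictionaries, something close to Perl's autovivification can be approximated with collections.defaultdict. The snippet below is only an illustration under that assumption: the key names are invented, it works on dicts rather than on the C-facing message instances described above, and it does not solve the leaf-type problem the post mentions.

```python
from collections import defaultdict


def tree():
    """Return a dict whose missing keys spring into existence as subtrees."""
    return defaultdict(tree)


# Hypothetical message layout; none of these keys come from the thread.
msg = tree()
msg["header"]["route"]["hops"] = 3  # intermediate levels are created on access
```

Note that merely reading a missing key also creates it, which is the same trade-off Perl's autovivification has.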
Re: sobering observation, python vs. perl
"Charles T. Smith":

> I need the second check to also be a RE because it's not
> separate tokens.

The string "in" check doesn't care about tokens.

Marko
Re: sobering observation, python vs. perl
On 17/03/2016 18:53, Marko Rauhamaa wrote:
> BartC:
>
>> sub replacewith{
>>    $s = $_[0];
>>    $t = $_[1];
>>    $u = $_[2];
>>    $s =~ s/$t/$u/;
>>    return $s;
>> }
>>
>> Although once done, the original task now looks a proper language:
>>
>> print (replacewith("I have a dream","have","had"));
>
> Now try your function with:
>
>    print (replacewith("I have a dream",".","had"));

Yeah, it needs your quotemeta line (whatever that does). But the call is
unaffected as the clutter is in the function.

--
bartc
Re: sobering observation, python vs. perl
On Thu, 17 Mar 2016 10:52:30 -0500, Tim Chase wrote:

> Not saying this will make a great deal of difference, but these two
> items jumped out at me. I'd even be tempted to just use string
> manipulations for the isready aspect as well. Something like
> (untested)

well, I don't want to forgo REs in order to have python's numbers be better
Re: sobering observation, python vs. perl
"Charles T. Smith":

> well, I don't want to forgo REs in order to have python's numbers be
> better

<http://stackoverflow.com/questions/12793562/text-processing-python-vs-perl-performance>

Marko
Re: sobering observation, python vs. perl
"Charles T. Smith":

> Here's the programs:
>
> #!/usr/bin/env python
> # vim: tw=0
> import sys
> import re
>
> isready = re.compile ("(.*) is ready")
> relreq = re.compile (".*release_req")
> for fn in sys.argv[1:]:                 # logfile name
>     tn = None
>     with open (fn) as fd:
>         for line in fd:
>             #match = re.match ("(.*) is ready", line)
>             match = isready.match (line)
>             if match:
>                 tn = match.group(1)
>             #match = re.match (".*release_req", line)
>             match = relreq.match (line)
>             if match:
>                 #print "%s: %s" % (tn, line),
>                 print tn
>
> vs.
>
> while (<>) {
>     if (/(.*) is ready/) {
>         $tn = $1;
>     }
>     elsif (/release_req/) {
>         print "$tn\n";
>     }
> }
>
> Look at those numbers:
> 1 minute for python without precompiled REs
> 1/2 minute with precompiled REs
> 5 seconds with perl.

Can't comment on the numbers but the code segments are not quite
analogous. What about this one:

   #!/usr/bin/env python
   # vim: tw=0
   import sys
   import re

   isready = re.compile("(.*) is ready")
   for fn in sys.argv[1:]:
       tn = None
       with open(fn) as fd:
           for line in fd:
               match = isready.match(line)
               if match:
                   tn = match.group(1)
               elif "release_req" in line:
                   print tn

Marko
Re: sobering observation, python vs. perl
On Thu, 17 Mar 2016 18:30:29 +0200, Marko Rauhamaa wrote:

> "Charles T. Smith":
>
>> I need the second check to also be a RE because it's not
>> separate tokens.
>
> The string "in" check doesn't care about tokens.
>
> Marko

Ah, yes. Okay.
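[Editor's sketch] The point that the "in" check does not care about tokens can be seen on a made-up log line where the search text sits in the middle of a longer word. The sample line below is invented; only the two patterns come from the thread. Python 3 syntax.

```python
import re

# Invented log line: "release_req" is embedded inside a larger token.
line = "12:00:01 xx_release_req_yy sent"

# Substring test: true wherever the text occurs, token boundary or not.
found_in = "release_req" in line

# The unanchored regex search gives the same answer.
found_search = re.search("release_req", line) is not None

# re.match only matches at the start of the string, which is why the
# thread's pattern carries a leading ".*".
found_match = re.match(".*release_req", line) is not None
found_anchored = re.match("release_req", line) is not None
```

Here the first three checks succeed and the last one fails, because re.match without the leading ".*" only looks at the beginning of the line.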
Re: sobering observation, python vs. perl
BartC:

> On 17/03/2016 18:53, Marko Rauhamaa wrote:
>> BartC:
>
>>> sub replacewith{
>>>    $s = $_[0];
>>>    $t = $_[1];
>>>    $u = $_[2];
>>>    $s =~ s/$t/$u/;
>>>    return $s;
>>> }
>>>
>>> Although once done, the original task now looks a proper language:
>>>
>>> print (replacewith("I have a dream","have","had"));
>>
>> Now try your function with:
>>
>>    print (replacewith("I have a dream",".","had"));
>
> Yeah, it needs your quotemeta line (whatever that does). But the call
> is unaffected as the clutter is in the function.

Well, you fell in the trap. Most perl programmers would fall in it.

Same with bash programmers, including myself. That's why I'm wondering
if Python could come to the rescue and offer a solid alternative to
bash. You have to go out of your way to get into accidental
quoting/escaping problems in Python.

Marko
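[Editor's sketch] The trap with "." as the search string can be reproduced in Python by comparing str.replace with re.sub. This is an illustration in Python 3, not code from the thread. One difference to keep in mind: re.sub replaces every match by default, whereas the Perl s/// above, without /g, replaces only the first.

```python
import re

text = "I have a dream"

# str.replace treats "." as a literal character; the text contains none,
# so nothing changes.
literal = text.replace(".", "had")

# re.sub treats "." as "any character", so every character is replaced.
as_regex = re.sub(".", "had", text)

# re.escape plays the role of Perl's quotemeta: the "." becomes literal.
escaped = re.sub(re.escape("."), "had", text)
```

With str.replace there is no pattern language involved, so there is nothing to escape in the first place.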
Re: sobering observation, python vs. perl
On Fri, 18 Mar 2016 03:08 am, Charles T. Smith wrote:

> On Thu, 17 Mar 2016 10:52:30 -0500, Tim Chase wrote:
>
>> Not saying this will make a great deal of difference, but these two
>> items jumped out at me. I'd even be tempted to just use string
>> manipulations for the isready aspect as well. Something like
>> (untested)
>
> well, I don't want to forgo REs in order to have python's numbers be
> better

Even when REs are the wrong tool for the job?

"Yeah, I know I ought to be using a power drill, but all I've got is this
hammer..."

--
Steven
Re: sobering observation, python vs. perl
BartC:

> I was going to suggest just using a function. But never having coded in
> Perl before, I wasn't expecting something this ugly:
>
> sub replacewith{
>    $s = $_[0];
>    $t = $_[1];
>    $u = $_[2];
>    $s =~ s/$t/$u/;
>    return $s;
> }
>
> Although once done, the original task now looks a proper language:
>
> print (replacewith("I have a dream","have","had"));

Now try your function with:

   print (replacewith("I have a dream",".","had"));

Marko
Re: sobering observation, python vs. perl
On Thu, 17 Mar 2016 10:26:12 -0700, Ethan Furman wrote:

> On 03/17/2016 09:36 AM, Charles T. Smith wrote:
>
>> Yes, your point was to forgo REs despite that they are useful.
>> I could have thought the search would have been better as:
>>
>> 'release[-.:][Rr]eq'
>>
>> or something else ... you're in a "defend python at all costs!" mode.
>
> No, I'm in the "don't try to write in Python" mode, and
> "don't use 10lb sledge when 6oz hammer will do" mode:

Yes, fine. I'd only like to add that the perl numbers might also improve
if the print in the loop were postponed.
Re: sobering observation, python vs. perl
On 17/03/2016 17:25, Charles T. Smith wrote:
> On Thu, 17 Mar 2016 19:08:58 +0200, Marko Rauhamaa wrote:
>
>>     my $str = "I have a dream";
>>     my $find = "have";
>>     my $replace = "had";
>>     $find = quotemeta $find; # escape regex metachars if present
>>     $str =~ s/$find/$replace/g;
>>     print $str;
>>
>> with Python:
>>
>>     print("I have a dream".replace("have", "had"))
>
> Uh... that perl is way over my head. I admit though, that perl's
> powerful substitute command is also clumsy. The best I can do
> right now is:
>
>      $v = "I have a dream\n";
>      $v =~ s/have/had/;
>      print $v

I was going to suggest just using a function. But never having coded in
Perl before, I wasn't expecting something this ugly:

sub replacewith{
   $s = $_[0];
   $t = $_[1];
   $u = $_[2];
   $s =~ s/$t/$u/;
   return $s;
}

Although once done, the original task now looks a proper language:

print (replacewith("I have a dream","have","had"));

--
Bartc
Re: sobering observation, python vs. perl
On Thu, 17 Mar 2016 18:07:12 +0200, Marko Rauhamaa wrote:

> "Charles T. Smith":
>
> Ok. The LANG=C setting has a tremendous effect on the performance of
> textutils.
>
> Marko

Good to know, thank you...
Re: sobering observation, python vs. perl
On Thu, 17 Mar 2016 17:48:54 +0200, Marko Rauhamaa wrote:

> "Charles T. Smith":
>
>> On Thu, 17 Mar 2016 15:29:47 +, Charles T. Smith wrote:
>>
>> And for completeness, and also surprising:
>>
>> time sed -n -e '/ is ready/{s///;h}' -e '/release_req/{g;p}' *.out | sort -u
>> TestCase_F_00_P
>> TestCase_F_00_S
>> TestCase_F_01_S
>> TestCase_F_02_M
>>
>> real    0m10.998s
>> user    0m10.885s
>> sys     0m0.108s
>>
>> Twice as long as perl... I guess there's no excuse for sed anymore...
>
> Try running the sed command again after setting:
>
>    export LANG=C
>
> Marko

Hmmm. Interesting thought. But...

$ locale
LANG=C
LANGUAGE=
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE=C
LC_MONETARY="C"
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=
Re: sobering observation, python vs. perl
On 2016-03-17 15:29, Charles T. Smith wrote:
> isready = re.compile ("(.*) is ready")
> relreq = re.compile (".*release_req")
> for fn in sys.argv[1:]:                 # logfile name
>     tn = None
>     with open (fn) as fd:
>         for line in fd:
>             #match = re.match ("(.*) is ready", line)
>             match = isready.match (line)
>             if match:
>                 tn = match.group(1)
>             #match = re.match (".*release_req", line)
>             match = relreq.match (line)
>             if match:

Note that this "match" and "if" get executed for every line

>                 #print "%s: %s" % (tn, line),
>                 print tn
>
> vs.
>
> while (<>) {
>     if (/(.*) is ready/) {
>         $tn = $1;
>     }
>     elsif (/release_req/) {

Note this else ^

>         print "$tn\n";
>     }
> }

Also, you might just test for string-presence on that second one

So what happens if your code looks something like

  isready = re.compile ("(.*) is ready")
  for fn in sys.argv[1:]:                 # logfile name
      tn = None
      with open (fn) as fd:
          for line in fd:
              match = isready.match (line)
              if match:
                  tn = match.group(1)
              elif "release_req" in line:
                  print tn

Not saying this will make a great deal of difference, but these two
items jumped out at me. I'd even be tempted to just use string
manipulations for the isready aspect as well. Something like
(untested)

  IS_READY = " is ready"
  REL_REQ = "release_req"
  for fn in sys.argv[1:]:
      tn = None
      with open(fn) as fd:
          for line in fd:
              try:
                  index = line.rindex(IS_READY)
              except ValueError:
                  if REL_REQ in line:
                      print tn
              else:
                  tn = line[:index]

-tkc
Re: sobering observation, python vs. perl
"Charles T. Smith":

> On Thu, 17 Mar 2016 17:48:54 +0200, Marko Rauhamaa wrote:
>> Try running the sed command again after setting:
>>
>>    export LANG=C
>
> Hmmm. Interesting thought. But...
>
> $ locale
> LANG=C

Ok. The LANG=C setting has a tremendous effect on the performance of
textutils.

Marko
Re: sobering observation, python vs. perl
On Thu, 17 Mar 2016 09:21:51 -0700, Ethan Furman wrote:

>> well, I don't want to forgo REs in order to have python's numbers be
>> better
>
> The issue is not avoiding REs, but using Python's strengths and idioms.
> Write the code in Python's style, get the same results, then compare
> the times.

Yes, your point was to forgo REs despite that they are useful.
I could have thought the search would have been better as:

  'release[-.:][Rr]eq'

or something else ... you're in a "defend python at all costs!" mode.
Re: sobering observation, python vs. perl
Charles T. Smith wrote:

> On Thu, 17 Mar 2016 10:52:30 -0500, Tim Chase wrote:
>
>> Not saying this will make a great deal of difference, but these two
>> items jumped out at me. I'd even be tempted to just use string
>> manipulations for the isready aspect as well. Something like
>> (untested)
>
> well, I don't want to forgo REs in order to have python's numbers be
> better

As has been said, for simple text processing tasks string methods are the
preferred approach in Python. I think this is more for clarity than
performance.

If you need regular expressions a simple way to boost performance may be to
use the external regex module.

(By the way, if you are looking for a simple way to iterate over multiple
files use
for line in fileinput.input(): ...
)

Some numbers:

$ time perl find.pl data/sample*.txt > r1.txt

real    0m0.504s
user    0m0.466s
sys     0m0.036s
$ time python find.py data/sample*.txt > r2.txt

real    0m2.403s
user    0m2.339s
sys     0m0.059s
$ time python find_regex.py data/sample*.txt > r3.txt

real    0m0.693s
user    0m0.631s
sys     0m0.060s
$ time python find_no_re.py data/sample*.txt > r4.txt

real    0m0.319s
user    0m0.267s
sys     0m0.048s

Python 3 slows down things:

$ time python3 find_no_re.py data/sample*.txt > r5.txt

real    0m0.497s
user    0m0.444s
sys     0m0.051s

The scripts:

$ cat find.pl
#!/usr/bin/env perl

while (<>) {
    if (/(.*) is ready/) {
        $tn = $1;
    }
    elsif (/release_req/) {
        print "$tn\n";
    }
}

$ cat find.py
#!/usr/bin/env python

import sys
import re

def main():
    isready = re.compile ("(.*) is ready").match
    relreq = re.compile (".*release_req").match
    tn = ""
    for fn in sys.argv[1:]:
        with open(fn) as fd:
            for line in fd:
                match = isready(line)
                if match:
                    tn = match.group(1)
                elif relreq(line):
                    print(tn)

main()

$ cat find_regex.py
#!/usr/bin/env python

import sys
import regex as re
[rest the same as find.py]

$ cat find_no_re.py
#!/usr/bin/env python

import sys

def main():
    tn = ""
    for fn in sys.argv[1:]:
        with open(fn) as fd:
            for line in fd:
                if " is ready" in line:
                    tn = line.partition(" is ready")[0]
                elif "release_req" in line:
                    print(tn)

main()

The test data was generated with

$ cat make_test_data.py
#!/usr/bin/env python3

import os
import random
import shutil

from itertools import islice


def make_line_factory(words, line_length, isready):
    choice = random.choice

    def make_line():
        while True:
            line = [choice(words)]
            length = len(line[0])
            while length < line_length:
                word = choice(words)
                line.append(word)
                length += len(word) + 1
            if random.randrange(100) < isready:
                pos = random.randrange(len(line))
                line[pos:pos+1] = ["is", "ready"]
            elif random.randrange(100) < isready:
                pos = random.randrange(len(line))
                line[pos:pos] = ["release_req"]
            yield " ".join(line)

    return make_line


def main():
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument("--words", default="/usr/share/dict/words")
    parser.add_argument("--line-length", type=int, default=80)
    parser.add_argument("--num-lines", type=eval, default=10**5)
    parser.add_argument("--num-files", type=int, default=4)
    parser.add_argument("--name-template", default="sample{:0{}}.txt")
    parser.add_argument("--data-folder", default="data")
    parser.add_argument("--remove-data-folder", action="store_true")
    parser.add_argument("--first-match-percent", type=int, default=10)

    try:
        import argcomplete
    except ImportError:
        pass
    else:
        argcomplete.autocomplete(parser)

    args = parser.parse_args()

    if args.remove_data_folder:
        shutil.rmtree(args.data_folder)
    os.mkdir(args.data_folder)
    with open(args.words) as f:
        words = [line.strip() for line in f]
    make_line = make_line_factory(
        words, args.line_length, args.first_match_percent)()
    width = len(str(args.num_files))
    for index in range(1, args.num_files+1):
        filename = os.path.join(
            args.data_folder,
            args.name_template.format(index, width))
        print(filename)
        with open(filename, "w") as f:
            for line in islice(make_line, args.num_lines):
                print(line, file=f)


if __name__ == "__main__":
    main()
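[Editor's sketch] The fileinput hint in the message above could be combined with the find_no_re.py logic roughly as follows. The function name find_release_requests is made up, the code is Python 3, and it is a sketch rather than a tested replacement for the scripts that were timed.

```python
import fileinput


def find_release_requests(paths):
    """Yield the most recent 'is ready' name for every 'release_req' line."""
    tn = ""
    # fileinput.input chains the lines of all named files, much like
    # Perl's while (<>) loop.
    with fileinput.input(files=paths) as lines:
        for line in lines:
            if " is ready" in line:
                tn = line.partition(" is ready")[0]
            elif "release_req" in line:
                yield tn
```

A caller would pass a non-empty list of file names, for example sys.argv[1:], and print each yielded name. With an empty list fileinput falls back to reading standard input.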
Re: sobering observation, python vs. perl
On Thu, 17 Mar 2016 17:47:55 +0200, Marko Rauhamaa wrote:

> Can't comment on the numbers but the code segments are not quite
> analogous. What about this one:
>
>    #!/usr/bin/env python
>    # vim: tw=0
>    import sys
>    import re
>
>    isready = re.compile("(.*) is ready")
>    for fn in sys.argv[1:]:
>        tn = None
>        with open(fn) as fd:
>            for line in fd:
>                match = isready.match(line)
>                if match:
>                    tn = match.group(1)
>                elif "release_req" in line:
>                    print tn
>
> Marko

I need the second check to also be a RE because it's not
separate tokens.

How about this change:

            match = isready.match (line)
            if match:
                tn = match.group(1)
>               continue
            match = relreq.match (line)
            if match:
                print tn

real    0m28.737s
user    0m28.538s
sys     0m0.128s

Shaved 2 seconds off.
Re: sobering observation, python vs. perl
On Thu, 17 Mar 2016 18:34:06 +0200, Marko Rauhamaa wrote:

> <http://stackoverflow.com/questions/12793562/text-processing-python-vs-perl-performance>

Okay, that was interesting.

Actually, I saw a study some years ago that concluded that python
could be both slower and faster than perl, but that perl had much
less deviation than python. I took that and accepted it, but was
surprised now that in exactly the field of application that I've
traditionally used perl, it really is better, er... faster.

Furthermore, the really nice thing about python is its OO, but
I've really neglected looking into that with perl's OO capabilities.
Re: sobering observation, python vs. perl
On 03/17/2016 09:36 AM, Charles T. Smith wrote:

> Yes, your point was to forgo REs despite that they are useful.
> I could have thought the search would have been better as:
>
> 'release[-.:][Rr]eq'
>
> or something else ... you're in a "defend python at all costs!" mode.

No, I'm in the "don't try to write in Python" mode, and
"don't use 10lb sledge when 6oz hammer will do" mode:

# using `in` and printing line as each is found
real    0m1.703s
user    0m0.184s
sys     0m0.260s

# using `in` and printing lines at the end
real    0m0.217s
user    0m0.112s
sys     0m0.068s

# using 're' and printing lines at the end
real    0m0.608s
user    0m0.516s
sys     0m0.060s

As you can see, how you print has a huge impact. Hopefully you also
noticed that using `re` when `in` would do made the script 3 times slower.

# using `in` code
import sys

found = []
for fn in sys.argv[1:]:
    with open(fn) as fh:
        for line in fh:
            if 'timezone' in line:
                found.append(line)
print ''.join(found)

# using `re` code
import sys
import re

found = []
for fn in sys.argv[1:]:
    with open(fn) as fh:
        for line in fh:
            if re.search('timezone', line):
                found.append(line)
print ''.join(found)

--
~Ethan~
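[Editor's sketch] The two printing strategies timed above, writing each match as it is found versus collecting matches and writing once at the end, can be placed side by side like this. The function names are invented, the 'timezone' search string is taken from the message, and the code is Python 3 writing to any file-like object so the two variants are easy to compare.

```python
import io


def emit_each(lines, out):
    # Write every matching line immediately: one write call per match.
    for line in lines:
        if "timezone" in line:
            out.write(line)


def emit_at_end(lines, out):
    # Collect matches first, then write them with a single call.
    found = [line for line in lines if "timezone" in line]
    out.write("".join(found))


if __name__ == "__main__":
    sample = ["set timezone UTC\n", "unrelated\n", "timezone changed\n"]
    buf = io.StringIO()
    emit_at_end(sample, buf)
    print(buf.getvalue(), end="")
```

Both functions produce the same text. Whether the batched version is faster depends on where the output goes (terminal, pipe or file), so it is worth timing on the real data.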
Re: sobering observation, python vs. perl
Charles T. Smith:

> I've really learned to love working with python, but it's too soon
> to pack perl away. I was amazed at how long a simple file search took
> so I ran some statistics:

Write Python in pythonic style instead of translated-from-Perl style, and
the tables are turned:

$ cat find-rel.py
| import sys
| def main():
|     for fn in sys.argv[1:]:
|         tn = None
|         with open(fn, 'rt') as fd:
|             for line in fd:
|                 if ' is ready' in line:
|                     tn = line.split(' is ready', 1)[0]
|                 elif 'release_req' in line:
|                     print tn
| main()


$ time python find-rel.py *.out
real    0m0.647s
user    0m0.616s
sys     0m0.029s

$ time perl find-rel.pl *.out
real    0m0.935s
user    0m0.910s
sys     0m0.023s

I don't have your log files and my quickly assembled test file doesn't
actually contain the phrase 'release_req', so my results may be
misleading. Perhaps you'll try it and post your results?

regards, Anders
Re: sobering observation, python vs. perl
On Thursday, March 17, 2016 at 11:24:00 PM UTC+5:30, BartC wrote:
> On 17/03/2016 17:25, Charles T. Smith wrote:
> > On Thu, 17 Mar 2016 19:08:58 +0200, Marko Rauhamaa wrote:
>
> >>     my $str = "I have a dream";
> >>     my $find = "have";
> >>     my $replace = "had";
> >>     $find = quotemeta $find; # escape regex metachars if present
> >>     $str =~ s/$find/$replace/g;
> >>     print $str;
> >>
> >> with Python:
> >>
> >>     print("I have a dream".replace("have", "had"))
>
> > Uh... that perl is way over my head. I admit though, that perl's
> > powerful substitute command is also clumsy. The best I can do
> > right now is:
> >
> >      $v = "I have a dream\n";
> >      $v =~ s/have/had/;
> >      print $v
>
> I was going to suggest just using a function. But never having coded in
> Perl before, I wasn't expecting something this ugly:
>
> sub replacewith{
>    $s = $_[0];
>    $t = $_[1];
>    $u = $_[2];

I think [untested] you can shorten those 3 lines to:
($s, $t, $u) = @_ ;

>    $s =~ s/$t/$u/;
>    return $s;
> }
Re: sobering observation, python vs. perl
"Charles T. Smith":

> On Thu, 17 Mar 2016 15:29:47 +, Charles T. Smith wrote:
>
> And for completeness, and also surprising:
>
> time sed -n -e '/ is ready/{s///;h}' -e '/release_req/{g;p}' *.out | sort -u
> TestCase_F_00_P
> TestCase_F_00_S
> TestCase_F_01_S
> TestCase_F_02_M
>
> real    0m10.998s
> user    0m10.885s
> sys     0m0.108s
>
> Twice as long as perl... I guess there's no excuse for sed anymore...

Try running the sed command again after setting:

   export LANG=C

Marko
Re: sobering observation, python vs. perl
Marko Rauhamaa writes:

> "Charles T. Smith":
>
>> Actually, I saw a study some years ago that concluded that python
>> could be both slower and faster than perl, but that perl had much less
>> deviation than python. I took that and accepted it, but was surprised
>> now that in exactly the field of application that I've traditionally
>> used perl, it really is better, er... faster.
>>
>> Furthermore, the really nice thing about python is its OO, but I've
>> really neglected looking into that with perl's OO capabilities.
>
> I haven't had such log processing needs as you, nor has it come down to
> performance in such a way. Do use the best tool for the job.
>
> (When it comes to freely formatted logs, gleaning information from them
> is somewhat of a lost cause. I've done my best to move to rigorously
> formatted logs that are much more amenable to post processing.)
>
> Perl might be strong on its home turf, but I am a minimalist and
> reductionist -- Perl was intentionally designed to be a maximalist,
> imitating the principles of natural languages. Python has concise,
> crystal-clear semantics that are convenient to work with.
>
> Compare Perl (<http://www.perlmonks.org/?node_id=98357>):
>
>    my $str = "I have a dream";
>    my $find = "have";
>    my $replace = "had";
>    $find = quotemeta $find; # escape regex metachars if present
>    $str =~ s/$find/$replace/g;
>    print $str;
>
> with Python:
>
>    print("I have a dream".replace("have", "had"))

If you know the strings are "have" and "had", you can just write

  print 'I have a dream' =~ s/have/had/r;

but I think your point is to show up the lack of a string (rather than
regex) replace in Perl, so the strings should be considered arbitrarily
"dangerous".

For that purpose it might have been better to give the example as

  print("I have a dream".replace(find, replace))

for which the closest Perl match is probably

  print 'I have a dream' =~ s/\Q$find/$replace/r;

The closest to the actual line -- where you can just edit two strings
with your only concern being the end quote of the string -- would be
something like

  my $find = 'have';
  print 'I have a dream' =~ s{\Q$find}'had'r

I don't want to start a language war! I'm not saying that this is as
simple and clear as the Python, but a "compare X with Y" should try to
do the best by both X and Y.

--
Ben.
Re: sobering observation, python vs. perl
On 03/17/2016 10:35 AM, Charles T. Smith wrote:
> On Thu, 17 Mar 2016 10:26:12 -0700, Ethan Furman wrote:
>> On 03/17/2016 09:36 AM, Charles T. Smith wrote:
>>> Yes, your point was to forgo REs despite that they are useful.
>>> I could have thought the search would have been better as:
>>>
>>> 'release[-.:][Rr]eq'
>>>
>>> or something else ... you're in a "defend python at all costs!" mode.
>>
>> No, I'm in the "don't try to write in Python" mode, and
>> "don't use 10lb sledge when 6oz hammer will do" mode:
>
> Yes, fine. I'd only like to add that the perl numbers might also improve
> if the print in the loop were postponed.

Yup, might! Try it and see.

--
~Ethan~
sobering observation, python vs. perl
I've really learned to love working with python, but it's too soon
to pack perl away. I was amazed at how long a simple file search took
so I ran some statistics:

$ time python find-rel.py
./find-relreq *.out | sort -u
TestCase_F_00_P
TestCase_F_00_S
TestCase_F_01_S
TestCase_F_02_M

real    1m4.581s
user    1m4.412s
sys     0m0.140s


$ time python find-rel.py
# modified to use precompiled REs:
TestCase_F_00_P
TestCase_F_00_S
TestCase_F_01_S
TestCase_F_02_M

real    0m29.337s
user    0m29.174s
sys     0m0.100s


$ time perl find-rel.pl
find-relreq.pl *.out | sort -u
TestCase_F_00_P
TestCase_F_00_S
TestCase_F_01_S
TestCase_F_02_M

real    0m5.009s
user    0m4.932s
sys     0m0.072s

Here's the programs:

#!/usr/bin/env python
# vim: tw=0
import sys
import re

isready = re.compile ("(.*) is ready")
relreq = re.compile (".*release_req")
for fn in sys.argv[1:]:                 # logfile name
    tn = None
    with open (fn) as fd:
        for line in fd:
            #match = re.match ("(.*) is ready", line)
            match = isready.match (line)
            if match:
                tn = match.group(1)
            #match = re.match (".*release_req", line)
            match = relreq.match (line)
            if match:
                #print "%s: %s" % (tn, line),
                print tn

vs.

while (<>) {
    if (/(.*) is ready/) {
        $tn = $1;
    }
    elsif (/release_req/) {
        print "$tn\n";
    }
}

Look at those numbers:
1 minute for python without precompiled REs
1/2 minute with precompiled REs
5 seconds with perl.
Re: sobering observation, python vs. perl
On 17/03/2016 16:36, Charles T. Smith wrote:
> On Thu, 17 Mar 2016 09:21:51 -0700, Ethan Furman wrote:
>
>>> well, I don't want to forgo REs in order to have python's numbers be
>>> better
>>
>> The issue is not avoiding REs, but using Python's strengths and idioms.
>> Write the code in Python's style, get the same results, then compare
>> the times.
>
> Yes, your point was to forgo REs despite that they are useful.
> I could have thought the search would have been better as:
>
> 'release[-.:][Rr]eq'
>
> or something else ... you're in a "defend python at all costs!" mode.

I believe it is more along the lines of "In Rome, do as the Romans".

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence
Re: sobering observation, python vs. perl
please upload the log file,

and global variables in python are slow, so just keep all that in a
function and try again. generally i get 20-30% time improvement by
doin that.

On Thu, Mar 17, 2016 at 8:59 PM, Charles T. Smith wrote:
> I've really learned to love working with python, but it's too soon
> to pack perl away. I was amazed at how long a simple file search took
> so I ran some statistics:
>
> $ time python find-rel.py
> ./find-relreq *.out | sort -u
> TestCase_F_00_P
> TestCase_F_00_S
> TestCase_F_01_S
> TestCase_F_02_M
>
> real    1m4.581s
> user    1m4.412s
> sys     0m0.140s
>
>
> $ time python find-rel.py
> # modified to use precompiled REs:
> TestCase_F_00_P
> TestCase_F_00_S
> TestCase_F_01_S
> TestCase_F_02_M
>
> real    0m29.337s
> user    0m29.174s
> sys     0m0.100s
>
>
> $ time perl find-rel.pl
> find-relreq.pl *.out | sort -u
> TestCase_F_00_P
> TestCase_F_00_S
> TestCase_F_01_S
> TestCase_F_02_M
>
> real    0m5.009s
> user    0m4.932s
> sys     0m0.072s
>
> Here's the programs:
>
> #!/usr/bin/env python
> # vim: tw=0
> import sys
> import re
>
> isready = re.compile ("(.*) is ready")
> relreq = re.compile (".*release_req")
> for fn in sys.argv[1:]:                 # logfile name
>     tn = None
>     with open (fn) as fd:
>         for line in fd:
>             #match = re.match ("(.*) is ready", line)
>             match = isready.match (line)
>             if match:
>                 tn = match.group(1)
>             #match = re.match (".*release_req", line)
>             match = relreq.match (line)
>             if match:
>                 #print "%s: %s" % (tn, line),
>                 print tn
>
> vs.
>
> while (<>) {
>     if (/(.*) is ready/) {
>         $tn = $1;
>     }
>     elsif (/release_req/) {
>         print "$tn\n";
>     }
> }
>
> Look at those numbers:
> 1 minute for python without precompiled REs
> 1/2 minute with precompiled REs
> 5 seconds with perl.

--
Regards
Srinivas Devaki
Junior (3rd yr) student at Indian School of Mines,(IIT Dhanbad)
Computer Science and Engineering Department
Re: sobering observation, python vs. perl
"Charles T. Smith":

> Actually, I saw a study some years ago that concluded that python
> could be both slower and faster than perl, but that perl had much less
> deviation than python. I took that and accepted it, but was surprised
> now that in exactly the field of application that I've traditionally
> used perl, it really is better, er... faster.
>
> Furthermore, the really nice thing about python is its OO, but I've
> really neglected looking into that with perl's OO capabilities.

I haven't had such log processing needs as you, nor has it come down to
performance in such a way. Do use the best tool for the job.

(When it comes to freely formatted logs, gleaning information from them
is somewhat of a lost cause. I've done my best to move to rigorously
formatted logs that are much more amenable to post processing.)

Perl might be strong on its home turf, but I am a minimalist and
reductionist -- Perl was intentionally designed to be a maximalist,
imitating the principles of natural languages. Python has concise,
crystal-clear semantics that are convenient to work with.

Compare Perl (<http://www.perlmonks.org/?node_id=98357>):

   my $str = "I have a dream";
   my $find = "have";
   my $replace = "had";
   $find = quotemeta $find; # escape regex metachars if present
   $str =~ s/$find/$replace/g;
   print $str;

with Python:

   print("I have a dream".replace("have", "had"))

Marko