On Fri, Mar 11, 2016 at 9:34 AM, Wolfgang Maier <wolfgang.ma...@biologie.uni-freiburg.de> wrote:
> On 11.03.2016 15:23, Fillmore wrote:
>>
>> On 03/11/2016 07:13 AM, Wolfgang Maier wrote:
>>>
>>> One lesson for Perl regex users is that in Python many things can be
>>> solved without regexes.
>>> How about defining:
>>>
>>> printable = {chr(n) for n in range(32, 127)}
>>>
>>> then using:
>>>
>>> if (set(my_string) - set(printable)):
>>>     break
>>
>>
>> seems computationally heavy. I have a file with about 70k lines, of
>> which only 20 contain "funny" chars.
>>
>
> Not sure what you call computationally heavy. I just test-parsed a 30 MB
> file (28k lines) with:
>
> with open(my_file) as i:
>     for line in i:
>         if set(line) - printable:
>             continue
>
> and it finished in less than a second.

Did your test file contain on the order of 100 unique characters, or on the order of 100,000? Granted, most input data would likely fall into the former category.
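
For comparison, here is a minimal sketch of the set-difference check next to a variant that stops at the first offending character. The function names are made up for illustration, and I have not timed either one. Note that a line read from a file keeps its trailing newline, which falls outside range(32, 127), so the sketch assumes you want to strip it before checking.

```python
# Sketch only: both function names are hypothetical, not from the thread.
printable = frozenset(chr(n) for n in range(32, 127))


def has_funny_chars_set(line):
    # Collects every distinct character in the line, then subtracts the
    # printable set; a non-empty result means something unusual is present.
    return bool(set(line.rstrip("\r\n")) - printable)


def has_funny_chars_scan(line):
    # Walks the line and stops at the first character outside the set,
    # so no per-line set is built.
    return any(ch not in printable for ch in line.rstrip("\r\n"))
```

Both return the same answer for a given line. The first always walks the whole line to build its set, so the number of unique characters mainly affects how large that set grows; the second can return as soon as it hits a non-printable character.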