On Fri, Mar 11, 2016 at 9:34 AM, Wolfgang Maier
<wolfgang.ma...@biologie.uni-freiburg.de> wrote:
> On 11.03.2016 15:23, Fillmore wrote:
>>
>> On 03/11/2016 07:13 AM, Wolfgang Maier wrote:
>>>
>>> One lesson for Perl regex users is that in Python many things can be
>>> solved without regexes.
>>> How about defining:
>>>
>>> printable = {chr(n) for n in range(32, 127)}
>>>
>>> then using:
>>>
>>> if set(my_string) - printable:
>>>      break
>>
>>
>> seems computationally heavy. I have a file with about 70k lines, of
>> which only 20 contain "funny" chars.
>>
>
> Not sure what you call computationally heavy. I just test-parsed a 30 MB
> file (28k lines) with:
>
> with open(my_file) as i:
>     for line in i:
>         if set(line) - printable:
>             continue
>
> and it finished in less than a second.

Did your test file contain on the order of 100 unique characters, or
on the order of 100,000?  Granted, most input data would likely fall
into the former category.
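
To make the question concrete, here is a rough sketch of how one might
compare the two cases. It is not something anyone in the thread ran; the
helper names and the sample strings are made up for illustration, and the
trailing newline is stripped first because '\n' falls outside
range(32, 127).

```python
import timeit

# Same idea as the set defined earlier in the thread, frozen so it
# cannot be modified by accident.
PRINTABLE = frozenset(chr(n) for n in range(32, 127))


def has_funny_chars_set(line):
    """Collect the unique characters, then subtract the printable ones."""
    return bool(set(line.rstrip("\n")) - PRINTABLE)


def has_funny_chars_scan(line):
    """Alternative: stop at the first character outside the printable set."""
    return any(ch not in PRINTABLE for ch in line.rstrip("\n"))


# Hypothetical inputs: one with few unique characters, one with many.
few_unique = "hello world " * 4000
many_unique = "".join(chr(n) for n in range(32, 50_000))


def compare(line, number=100):
    """Return (set_seconds, scan_seconds) for `number` calls of each helper."""
    t_set = timeit.timeit(lambda: has_funny_chars_set(line), number=number)
    t_scan = timeit.timeit(lambda: has_funny_chars_scan(line), number=number)
    return t_set, t_scan


if __name__ == "__main__":
    for label, line in (("few unique", few_unique), ("many unique", many_unique)):
        t_set, t_scan = compare(line)
        print(f"{label}: set={t_set:.4f}s scan={t_scan:.4f}s")
```

The numbers will depend on the machine and the data, so treat this as a
way to check the claim on your own input rather than as a benchmark.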
