Re: Python read text file columnwise
Hi!

On 15/01/2019 17:04, Neil Cerutti wrote:
> On 2019-01-11, shibashib...@gmail.com wrote:
>> Hello
>>>
>>> I'm very new in python. I have a file in the format:
>>>
>>> 2018-05-31  16:00:00  28.90   81.77  4.3
>>> 2018-05-31  20:32:00  28.17   84.89  4.1
>>> 2018-06-20  04:09:00  27.36   88.01  4.8
>>> 2018-06-20  04:15:00  27.31   87.09  4.7
>>> 2018-06-28  04.07:00  27.87   84.91  5.0
>>> 2018-06-29  00.42:00  32.20  104.61  4.8
>>
>> I would like to read this file in python column-wise.
>>
>> I tried this way, but it's not working:
>>
>> event_list = open('seismicity_R023E.txt',"r")
>> info_event = read(event_list,'%s %s %f %f %f %f\n');
>
> If it's really tabular data in fixed-width columns, you can read
> it that way with Python:
>
>     records = []
>     for line in file:
>         record = []
>         i = 0
>         for width in (30, 8, 7, 5):  # approximations
>             item = line[i:i+width]
>             record.append(item)
>             i += width
>         records.append(record)
>
> This leaves them all strings, which in my experience is more
> convenient in practice. You can convert as you go if you want,
> though it won't look nice and simple any longer.

Perhaps an even better approach is to use the csv module from the
standard library:

import csv

csv_reader = csv.reader(file, dialect="excel-tab")
for row in csv_reader:
    # do something with the row, which is conveniently parsed into a list
    print(row)

['2018-05-31', '16:00:00', '28.90', '81.77', '4.3']
...
['2018-06-29', '00.42:00', '32.20', '104.61', '4.8']

A sketch combining this with the type conversion Neil mentions follows
after this post.

BR,
Juris
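To tie the two replies together: the per-field conversion Neil mentions
can be folded into the csv loop. A minimal sketch, assuming the file is
tab-delimited with the five columns shown in the sample (the
read_events helper and the skip-short-rows guard are my additions):

import csv

def read_events(path):
    # Parse the tab-delimited event file into typed records.
    # Assumed columns: date, time, latitude, longitude, magnitude.
    records = []
    with open(path, newline='') as f:
        for row in csv.reader(f, dialect="excel-tab"):
            if len(row) != 5:
                continue  # skip blank or malformed lines
            date, time, lat, lon, mag = row
            # Keep date/time as strings (some rows use '.' instead of
            # ':' in the time field); convert the numeric columns.
            records.append((date, time, float(lat), float(lon), float(mag)))
    return records

for event in read_events('seismicity_R023E.txt'):
    print(event)

This converts only the fields you would do arithmetic on and leaves the
rest as strings, which is roughly the trade-off Neil describes.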
Tracemalloc overhead when profiling
Hi,

I was looking for a way to profile memory usage for a script that deals
with log message parsing. Looking through Python's stdlib I stumbled
upon the tracemalloc module, so I tried my hand at profiling my script.
I noticed a few things that I am not 100% sure I can explain:
tracemalloc's memory overhead while tracing seems to be somewhere
around 3x-4x. Is that expected?

A dumb example that demonstrates the behavior:

---8<---
# memprof.py
import tracemalloc


def expensive():
    return [str(x) for x in range(1_000_000)]


if __name__ == '__main__':
    if not tracemalloc.is_tracing():
        tracemalloc.start()
    snapshot1 = tracemalloc.take_snapshot()
    _ = expensive()
    snapshot2 = tracemalloc.take_snapshot()
    tracemalloc.stop()
    for stat in snapshot2.compare_to(snapshot1, key_type="lineno"):
        print(stat)
---8<---

Script output, with naive profiling via the GNU time program:

$ /usr/bin/time python3.7 memprof.py
memprof.py:6: size=60.6 MiB (+60.6 MiB), count=101 (+101), average=64 B
...snip...
1.40user 0.10system 0:01.51elapsed 99%CPU (0avgtext+0avgdata 280284maxresident)k
0inputs+0outputs (0major+62801minor)pagefaults 0swaps

The same script, but without actually tracing with tracemalloc:

$ /usr/bin/time python3.7 memprof.py
0.26user 0.03system 0:00.29elapsed 100%CPU (0avgtext+0avgdata 72316maxresident)k
0inputs+0outputs (0major+17046minor)pagefaults 0swaps

So, when not tracing with tracemalloc, the memory used by the script is
72 MiB (credible, since tracemalloc reports 60.6 MiB allocated in the
hot spot). But when tracemalloc is tracing, the script uses almost 4x
as much memory, i.e. 280 MiB. Is this expected?

Any other tools for memory profiling you can recommend?

Running Python 3.7.2 on an x86_64 Linux system.

BR,
Juris
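One way to see where the extra memory goes: tracemalloc keeps a trace
for every live allocation, and tracemalloc.get_tracemalloc_memory()
reports how much memory that bookkeeping itself takes. A minimal sketch
along the lines of the script above (the file name and printed labels
are mine):

---8<---
# overhead_check.py
import tracemalloc


def expensive():
    return [str(x) for x in range(1_000_000)]


if __name__ == '__main__':
    tracemalloc.start()  # default nframes=1; more frames cost more memory
    data = expensive()   # keep the list alive so its traces stay stored
    current, peak = tracemalloc.get_traced_memory()
    bookkeeping = tracemalloc.get_tracemalloc_memory()
    tracemalloc.stop()
    print(f"traced current:          {current / 2**20:.1f} MiB")
    print(f"traced peak:             {peak / 2**20:.1f} MiB")
    print(f"tracemalloc bookkeeping: {bookkeeping / 2**20:.1f} MiB")
---8<---

Since expensive() creates roughly a million small objects, even a few
dozen bytes of per-trace bookkeeping would plausibly add up to the
3x-4x blow-up seen here; starting tracemalloc with a larger nframes
value would make it larger still.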