Tim Chase wrote:

> On 2016-02-19 02:47, wrong.addres...@gmail.com wrote:
>> 2 12.657823 0.1823467E-04 114 0
>> 3 4 5 9 11
>> "Lower"
>> 278.15
>>
>> Is it straightforward to read this, or does one have to read one
>> character at a time and then figure out what the numbers are? --
>
> It's easy to read.  What you do with that mess of data is the complex
> part.  They come in as byte-strings, but you'd have to convert them
> to the corresponding formats:
>
>   from shlex import shlex
>   USE_LEX = True # False
>   with open('data.txt') as f:
>     for i, line in enumerate(f, 1):
>       if USE_LEX:
>         bits = shlex(line)
>       else:
>         bits = line.split()
>       for j, bit in enumerate(bits, 1):
>         if bit.isdigit():
>           result = int(bit)
>           t = "an int"
>         elif '"' in bit:
>           result = bit
>           t = "a string"
>         else:
>           result = float(bit)
>           t = "a float"
>         print("On line %i I think that item %i %r is %s: %r" % (
>           i,
>           j,
>           bit,
>           t,
>           result,
>           ))
>
> The USE_LEX controls whether the example code uses string-splitting
> on white-space, or uses the built-in "shlex" module to parse for
> quoted strings that might contain a space.  The naive way of
> string-splitting will be faster, but choke on string-data containing
> spaces.
>
> You'd have to make up your own heuristics for determining what type
> each data "bit" is, parsing it out (with int(), float() or whatever),
> but the above gives you some rough ideas with at least one known
> bug/edge-case.
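
If you do want to guess, one possible heuristic is to attempt the conversions in order instead of inspecting the characters. For instance, "-114".isdigit() is False, so the quoted code would treat a negative integer as a float. What follows is only a rough sketch: guess() and guess_line() are made-up names, and the rule "int, then float, else string" is an assumption about what the data means.

```python
import shlex


def guess(token):
    """Try int, then float; fall back to the plain string (hypothetical helper)."""
    for convert in (int, float):
        try:
            return convert(token)
        except ValueError:
            pass
    return token


def guess_line(line):
    # shlex.split() removes the quotes and keeps quoted spaces together.
    return [guess(token) for token in shlex.split(line)]


print(guess_line('2 12.657823 0.1823467E-04 -114 0'))
print(guess_line('"Lower bound" 278.15'))
```

One caveat with this sketch: because shlex.split() strips the quotes before guess() sees the token, a quoted "42" would come back as an int, and tokens such as nan or inf would become floats.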
Or just tell the parser what to expect:

$ cat read_data_shlex2.py
import shlex

CONVERTERS = {
    "i": int,
    "f": float,
    "s": str
}


def parse_line(types, line=None, file=None):
    if line is None:
        line = file.readline()
    values = shlex.split(line)
    if len(values) != len(types):
        raise ValueError("Too few/many values %r <-- %r" % (types, values))
    return tuple(CONVERTERS[t](v) for t, v in zip(types, values))


with open("data.txt") as f:
    print(parse_line("iffii", file=f))
    print(parse_line("iiiii", file=f))
    print(parse_line("s", file=f))
    print(parse_line("fsi", file=f))
    print(parse_line("ff", file=f))
$ cat data.txt
2 12.657823 0.1823467E-04 114 0
3 4 5 9 11
"Lower"
1.2 "foo \" bar \\ baz" 42
278.15
$ python3 read_data_shlex2.py
(2, 12.657823, 1.823467e-05, 114, 0)
(3, 4, 5, 9, 11)
('Lower',)
(1.2, 'foo " bar \\ baz', 42)
Traceback (most recent call last):
  File "read_data_shlex2.py", line 24, in <module>
    print(parse_line("ff", file=f))
  File "read_data_shlex2.py", line 15, in parse_line
    raise ValueError("Too few/many values %r <-- %r" % (types, values))
ValueError: Too few/many values 'ff' <-- ['278.15']
$

But we can't do *all* the work for you ;-) If this thread goes long enough eventually we will ;)
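
For a longer file it may help if the error also says which line did not match its type string. A minimal sketch of that idea, assuming one type string per line: parse_lines() is a hypothetical variation on parse_line() above, and io.StringIO stands in for data.txt so the example runs on its own.

```python
import io
import shlex

CONVERTERS = {"i": int, "f": float, "s": str}


def parse_lines(specs, file):
    """Yield one tuple per line, converted according to the matching type string."""
    for lineno, (types, line) in enumerate(zip(specs, file), 1):
        values = shlex.split(line)
        if len(values) != len(types):
            # Report the line number along with the mismatch.
            raise ValueError(
                "line %d: expected %d values (%r), got %r"
                % (lineno, len(types), types, values))
        yield tuple(CONVERTERS[t](v) for t, v in zip(types, values))


# Stand-in for data.txt (assumed layout: the four lines from the original question).
SAMPLE = io.StringIO('''\
2 12.657823 0.1823467E-04 114 0
3 4 5 9 11
"Lower"
278.15
''')

for record in parse_lines(["iffii", "iiiii", "s", "f"], SAMPLE):
    print(record)
```

Note that zip() stops at the shorter of the two inputs, so extra lines in the file or extra type strings are silently ignored in this sketch.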