g thakuri writes:

> I would want to avoid using multiple split in the below code , what
> options do we have before tokenising the line?, may be validate the
> first line any other ideas
>
>  cmd = 'utility %s' % (file)
>  out, err, exitcode = command_runner(cmd)
>  data = stdout.strip().split('\n')[0].split()[5][:-2]

That .strip() looks suspicious to me, but perhaps you know better.
Also, stdout should be out, right?

You can use io.StringIO to turn a string into an object that you can
read line by line, just like a file object. This reads just the first
line and picks the part that you want:

    data = next(io.StringIO(out)).split()[5][:-2]

I don't know how much this affects performance, but it's kind of neat.

A thing I like to do is name all the fields even if I don't use them
all. The assignment will fail with an exception (a ValueError) if
there's an unexpected number of fields, and that's usually what I want
when input is bad:

    line = next(io.StringIO(out))
    ID, FORM, LEMMA, POS, TAGS, WEV, ETC = line.split()
    data = WEV[:-2]

(Those are probably not appropriate names for your fields :)

Just a couple of ideas that you may like to consider.
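
For what it's worth, here is a self-contained sketch of the named-fields idea that can be run without the real utility. The sample text in out is invented for illustration (the thread does not show the actual output format), first_line_field is a hypothetical helper name, and the field names are placeholders as before.

```python
import io

# Invented sample output; the real utility's format is not shown in the thread.
out = "1 walked walk VERB past 42ms _\n2 home home NOUN sg 17ms _\n"


def first_line_field(text):
    """Return the sixth field of the first line, minus its last two characters."""
    # Read only the first line instead of splitting the whole string on '\n'.
    line = next(io.StringIO(text))
    # Unpacking raises ValueError unless the line has exactly seven fields.
    ID, FORM, LEMMA, POS, TAGS, WEV, ETC = line.split()
    return WEV[:-2]


print(first_line_field(out))

# For this input the result matches the original chain of splits.
assert first_line_field(out) == out.strip().split('\n')[0].split()[5][:-2]

try:
    first_line_field("too few fields\n")
except ValueError as exc:
    print("bad input:", exc)
```

Two caveats. If out is empty, next() raises StopIteration rather than ValueError; writing next(io.StringIO(text), "") instead would turn that case into the same ValueError from the unpacking. And unlike the version with .strip(), this one does not skip leading blank lines, so the two can differ if the output starts with an empty line.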