On Aug 4, 9:21?pm, "Jim Langston" <[EMAIL PROTECTED]> wrote: > <[EMAIL PROTECTED]> wrote in message > > news:[EMAIL PROTECTED] > > > > > > > On Aug 4, 6:35?pm, SMERSH009 <[EMAIL PROTECTED]> wrote: > >> Hi All. > >> Let's say I have some badly formatted text called doc: > > >> doc= > >> """ > >> friendid > >> Female > > >> 23 years old > > >> Los Gatos > > >> United States > >> friendid > >> Male > > >> 24 years old > > >> San Francisco, California > > >> United States > >> """ > > >> How would I get these results to be displayed in a format similar to: > >> friendid;Female;23 years old;Los Gatos;United States > >> friendid;Male; 24 years old;San Francisco, California;United States > > >> The latter is a lot easier to organize and can be quickly imported > >> into Excel's column format. > > >> Thanks Much, > >> Sam > > > d = doc.split('\n') > > > f = [i.split() for i in d if i] > > > g = [' '.join(i) for i in f] > > > rec = [] > > temprec = [] > > for i in g: > > if i: > > if i == 'friendid': > > rec.append(temprec) > > temprec = [i] > > else: > > temprec.append(i) > > rec.append(temprec) > > > output = [';'.join(i) for i in rec if i] > > > for i in output: print i > > > ## friendid;Female;23 years old;Los Gatos;United States > > ## friendid;Male;24 years old;San Francisco, California;United States > > also, I would suggest you use CSV format.
Well, the OP asked for a specific format. One is not always at liberty to change it. > CSV stands for "Comma Seperated > Variable" and Excel can load such a sheet directly. And Excel can load the shown format directly also, just specify the delimiter. > > Instead of seperating using ; seperate using , Of course, this provides a > problem when there is a , in a string. Which explains the popularity of using tabs as delimiters. The data deliverable specification I use at work uses the pipe character | which never appears as data in this particular application. > Resolution is to quote the string. Which makes the file bigger and isn't necessary when tabs and pipes are used as delimiters. > Being such, you can just go ahead and quote all strings. So you would want > the output to be: > > "friendid","Female","23 years old","Los Gatos","United States" > "friendid","Male","24 years old","San Francisco, California","United States" Which I would do if I had a specification that demanded it or was making files for others. For my own use, I wouldn't bother as it's unnecessary work. > > Numbers should not be quoted if you wish to treat them as numeric and not > text. A good reason not to use quotes at all. Besides which, Excel can handle that also. -- http://mail.python.org/mailman/listinfo/python-list