On Aug 4, 8:25 pm, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> On Aug 4, 9:21 pm, "Jim Langston" <[EMAIL PROTECTED]> wrote:
> > <[EMAIL PROTECTED]> wrote in message
> > news:[EMAIL PROTECTED]
> > > On Aug 4, 6:35 pm, SMERSH009 <[EMAIL PROTECTED]> wrote:
> > >> Hi All.
> > >> Let's say I have some badly formatted text called doc:
> > >>
> > >> doc=
> > >> """
> > >> friendid
> > >> Female
> > >>
> > >> 23 years old
> > >>
> > >> Los Gatos
> > >>
> > >> United States
> > >> friendid
> > >> Male
> > >>
> > >> 24 years old
> > >>
> > >> San Francisco, California
> > >>
> > >> United States
> > >> """
> > >>
> > >> How would I get these results to be displayed in a format similar to:
> > >> friendid;Female;23 years old;Los Gatos;United States
> > >> friendid;Male; 24 years old;San Francisco, California;United States
> > >>
> > >> The latter is a lot easier to organize and can be quickly imported
> > >> into Excel's column format.
> > >>
> > >> Thanks Much,
> > >> Sam
> > >
> > > d = doc.split('\n')
> > >
> > > f = [i.split() for i in d if i]
> > >
> > > g = [' '.join(i) for i in f]
> > >
> > > rec = []
> > > temprec = []
> > > for i in g:
> > >     if i:
> > >         if i == 'friendid':
> > >             rec.append(temprec)
> > >             temprec = [i]
> > >         else:
> > >             temprec.append(i)
> > > rec.append(temprec)
> > >
> > > output = [';'.join(i) for i in rec if i]
> > >
> > > for i in output: print i
> > >
> > > ## friendid;Female;23 years old;Los Gatos;United States
> > > ## friendid;Male;24 years old;San Francisco, California;United States
> >
> > also, I would suggest you use CSV format.
>
> Well, the OP asked for a specific format. One is not
> always at liberty to change it.
>
> > CSV stands for "Comma Separated
> > Variable" and Excel can load such a sheet directly.
>
> And Excel can load the shown format directly also,
> just specify the delimiter.
>
> > Instead of separating using ; separate using , Of course, this provides a
> > problem when there is a , in a string.
>
> Which explains the popularity of using tabs as delimiters.
> The data deliverable specification I use at work
> uses the pipe character | which never appears as data
> in this particular application.
>
> > Resolution is to quote the string.
>
> Which makes the file bigger and isn't necessary
> when tabs and pipes are used as delimiters.
>
> > Being such, you can just go ahead and quote all strings. So you would want
> > the output to be:
> >
> > "friendid","Female","23 years old","Los Gatos","United States"
> > "friendid","Male","24 years old","San Francisco, California","United States"
>
> Which I would do if I had a specification that
> demanded it or was making files for others. For my
> own use, I wouldn't bother as it's unnecessary work.
>
> > Numbers should not be quoted if you wish to treat them as numeric and not
> > text.
>
> A good reason not to use quotes at all. Besides which,
> Excel can handle that also.

Thanks for all your posts, guys. mensanator's was the most helpful, and I
only ended up needing to use a few lines from that code.

The only question that remains for me, and this is just for my own
knowledge: what does the "if i" mean in this code snippet?

    f = [i.split() for i in d if i]

How is it helpful to leave a dangling "if i"? Why not just this?

    f = [i.split() for i in d]

And yes John, this was indeed a "homework question." It was for my
daughter's preschool. You are going to help her ace her beginner Python
class! (No, this was not a homework question.)

-- 
http://mail.python.org/mailman/listinfo/python-list
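
On the "if i" question: the trailing "if i" is the filter clause of the list comprehension. An empty string is falsy in Python, so the clause skips the blank lines that doc.split('\n') produces. The snippet below is a minimal sketch of that behaviour, using a shortened, made-up stand-in for the doc string in the original post (the names with_filter and without_filter are only for illustration):

```python
# Shortened stand-in for the `doc` string from the original post
# (an assumption for illustration; the blank lines are the point).
doc = """
friendid
Female

23 years old
"""

d = doc.split('\n')
# d == ['', 'friendid', 'Female', '', '23 years old', '']

# With the filter: '' is falsy, so "if i" drops the blank lines.
with_filter = [i.split() for i in d if i]

# Without the filter: each blank line contributes an empty list.
without_filter = [i.split() for i in d]

print(with_filter)
# [['friendid'], ['Female'], ['23', 'years', 'old']]
print(without_filter)
# [[], ['friendid'], ['Female'], [], ['23', 'years', 'old'], []]
```

In the quoted code, the later "if i:" inside the for loop also appears to skip the empty strings that those empty lists turn into after ' '.join(), so the filter there looks like a convenience that keeps f and g free of blank entries rather than something the final output strictly depends on.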