Re: Best way to clean up list items?

Jussi Piitulainen Mon, 02 May 2016 11:33:30 -0700

DFS writes:

> On 5/2/2016 12:57 PM, Jussi Piitulainen wrote:
>> DFS writes:
>>
>>> Have: list1 = ['\r\n   Item 1  ','  Item 2  ','\r\n  ']
>>> Want: list1 = ['Item 1','Item 2']


. .

>> Funny-looking data you have.
>
> I know - sadly, it's actual data:
>
> --------------------------------------------------------------------
> from lxml import html
> import requests
>
> webpage = "http://www.usdirectory.com/ypr.aspx?fromform=qsearch&qs=TN&wqhqn=2&qc=Nashville&rg=30&qhqn=restaurant&sb=zipdisc&ap=2"
>
> page  = requests.get(webpage)
> tree  = html.fromstring(page.content)
> addr1 = tree.xpath('//span[@class="text3"]/text()')
> print 'Addresses: ', addr1
> --------------------------------------------------------------------
>
> I couldn't figure out a better way to extract it from the HTML (maybe
> XML and DOM?)

I should have guessed :) But now I'm a bit worried about those spaces
inside your items. Can it happen that the text of a single item is split
into several strings in the middle? Then the above sanitation does the
wrong thing.
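
To make the worry concrete, here is a minimal sketch of a strip-and-filter
cleanup. It is not necessarily the exact code posted earlier in the thread;
the helper name clean_items and the second sample list are made up for
illustration.

```python
def clean_items(items):
    """Strip surrounding whitespace from each string and drop the empty ones."""
    stripped = (item.strip() for item in items)
    return [item for item in stripped if item]


# The data from the original question.
list1 = ['\r\n   Item 1  ', '  Item 2  ', '\r\n  ']
print(clean_items(list1))

# Hypothetical case: the text of one item arrives as two separate strings.
# The cleanup then reports three items where there should be two.
split_item = ['\r\n   Item ', '1  ', '  Item 2  ']
print(clean_items(split_item))
```

The first call gives the wanted result, but the second keeps 'Item' and '1'
as separate entries, because nothing in the list says which strings belong
together.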

If someone has the right solution, I'm watching, too.
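
One possible direction, offered only as a sketch and not tried against that
page: collect the text per element rather than per text node, so that a span
whose text is interrupted by a child element (a <br/>, say) still yields one
item. The markup below is invented, and the stdlib xml.etree.ElementTree
stands in for lxml so the example is self-contained; the real page may well
be structured differently.

```python
import xml.etree.ElementTree as ET

# Invented markup: the first span's text is interrupted by a <br/>,
# and the last span contains only whitespace.
SAMPLE = """
<div>
  <span class="text3">
     123 Main St<br/>  Nashville, TN  </span>
  <span class="text3">  456 Oak Ave  </span>
  <span class="text3">
  </span>
</div>
"""


def span_texts(root, css_class):
    """Return one whitespace-normalized string per matching span."""
    items = []
    for span in root.iter('span'):
        if span.get('class') != css_class:
            continue
        # itertext() yields every text fragment inside the element,
        # including text that follows a child element.
        text = ' '.join(''.join(span.itertext()).split())
        if text:
            items.append(text)
    return items


root = ET.fromstring(SAMPLE)
print(span_texts(root, 'text3'))
```

With lxml's HTML elements, text_content() on each span should play the same
role as the join over itertext() here.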