Hi, how can I parse an HTML string? I need to parse the following line: '<FIELD><NAME>BSCS status</NAME><TYPE>string</TYPE><VALUE>none</VALUE></FIELD><FIELD><NAME>TopCre_life</NAME><TYPE>integer</TYPE><VALUE>0</VALUE></FIELD>'
And this is the program:

==============================================
#!/usr/bin/env python

from sgmllib import SGMLParser
import urllib
import pdb

class ParserHTML(SGMLParser):
    #pdb.set_trace()
    def unknown_starttag(self, tag, attrs):
        value = 0
        startTAG = '<' + tag
        for i in attrs:
            if(i[0].lower() == i[1].lower() and not i[0] == i[1]):
                startTAG = startTAG[:-1] + ' ' + str(i[1])
                value = 1
            else:
                startTAG += ' ' + str(i[0]) + '="' + str(i[1]) + '"'
                value = 0
        if(value == 1):
            startTAG += '"'
        startTAG += '>'

    def handle_data(self, data):
        #print data
        detalle = []
        detalle2 = []
        a = ''
        for pruebas in data:
            #pruebas = data
            detalle.extend(pruebas)
            a = ''.join([a, pruebas])
        detalle2.append(a)
        print detalle2
        return detalle2

    def P_main(self, atr):
        return p.feed(atr)

if __name__ == '__main__':
    node = '<FIELD><NAME>BSCS status</NAME><TYPE>string</TYPE><VALUE>none</VALUE></FIELD><FIELD><NAME>TopCre_life</NAME><TYPE>integer</TYPE><VALUE>0</VALUE></FIELD>'
    p = ParserHTML()
    dts = p.P_main(node)
==============================================

The result of the program is:

bash-3.1$ ./pruebasDOM.py
['BSCS status']
['string']
['none']
['TopCre_life']
['integer']
['0']

I can't get the data into a single dict() or list. I need all the values together:

['BSCS status', 'string', 'none', 'TopCre_life', 'integer', '0']

How can I do that?

Thanks and greetings.
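
One possible way to get all the values into a single list, sketched under some assumptions: the parser calls handle_data() once per piece of text and ignores whatever it returns, and feed() itself returns nothing, so the values probably need to be stored on the parser instance and read back afterwards. The sketch below uses html.parser.HTMLParser from the Python 3 standard library rather than sgmllib (which was removed in Python 3); the class name FieldCollector and the attributes values and _chunks are made up for illustration and are not part of the original program.

```python
from html.parser import HTMLParser


class FieldCollector(HTMLParser):
    """Hypothetical helper: collect the text of every element in one flat list."""

    def __init__(self):
        super().__init__()
        self.values = []    # all text values, in document order
        self._chunks = []   # text pieces of the element currently being read

    def handle_starttag(self, tag, attrs):
        # An opening tag starts a fresh text buffer.
        self._chunks = []

    def handle_data(self, data):
        # Text may arrive in several pieces, so buffer it instead of returning it.
        self._chunks.append(data)

    def handle_endtag(self, tag):
        # On a closing tag, keep the buffered text if it is not empty.
        text = ''.join(self._chunks).strip()
        if text:
            self.values.append(text)
        self._chunks = []


node = ('<FIELD><NAME>BSCS status</NAME><TYPE>string</TYPE>'
        '<VALUE>none</VALUE></FIELD>'
        '<FIELD><NAME>TopCre_life</NAME><TYPE>integer</TYPE>'
        '<VALUE>0</VALUE></FIELD>')

parser = FieldCollector()
parser.feed(node)
parser.close()

values = parser.values
print(values)
```

For the line above this should print ['BSCS status', 'string', 'none', 'TopCre_life', 'integer', '0']. The same idea (append to a list kept on self, then read it after feed()) should carry over to an SGMLParser subclass on Python 2, although that variant is not shown here.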
_______________________________________________
python-win32 mailing list
python-win32@python.org
http://mail.python.org/mailman/listinfo/python-win32