> So what should I do to get the exact value (here, the value after
> 'href=') in every case, even if the tags look like these?
>
> <link rel="stylesheet" href="mystylesheet.css" type="text/css">
> -OR-
> <link href="mystylesheet.css" rel="stylesheet" type="text/css">
> -OR-
> <link type="text/css" href="mystylesheet.css" rel="stylesheet">

The following should do it:

    expr = r'<link .*?href="(.*?)"'

or, if single quotes might have been used:

    expr = r'''<link .*?href=["'](.*?)['"]'''

But as the others have said, Beautiful Soup is very good for things like
this.

--
http://mail.python.org/mailman/listinfo/python-list
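
For illustration, here is a minimal sketch that runs the quote-tolerant pattern against the three tag orderings from the question. It also shows a parser-based alternative. Note that the parser is the standard library's html.parser, used here only as a stand-in for Beautiful Soup so that the example has no third-party dependency; it is not the Beautiful Soup API. The names TAGS, LINK_HREF, href_by_regex, LinkHrefParser and href_by_parser are made up for this example, and the code assumes Python 3.

```python
import re
from html.parser import HTMLParser

# The three attribute orderings from the question.
TAGS = [
    '<link rel="stylesheet" href="mystylesheet.css" type="text/css">',
    '<link href="mystylesheet.css" rel="stylesheet" type="text/css">',
    '<link type="text/css" href="mystylesheet.css" rel="stylesheet">',
]

# Second pattern from the reply: accepts single or double quotes.
LINK_HREF = re.compile(r'''<link .*?href=["'](.*?)['"]''')


def href_by_regex(html):
    """Return every href value the pattern captures in html."""
    return LINK_HREF.findall(html)


class LinkHrefParser(HTMLParser):
    """Collect href attributes, but only from <link> start tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "link":
            for name, value in attrs:
                if name == "href":
                    self.hrefs.append(value)


def href_by_parser(html):
    """Return the href values of all <link> tags found in html."""
    parser = LinkHrefParser()
    parser.feed(html)
    parser.close()
    return parser.hrefs


if __name__ == "__main__":
    for tag in TAGS:
        print(href_by_regex(tag), href_by_parser(tag))
```

One difference worth knowing: the lazy `.*?` in the pattern is not confined to a single tag, so a <link> without an href that is followed on the same line by some other tag with an href will make the regex return that other tag's value. The parser-based version only ever looks at attributes of <link> itself.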