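Hi,

This should be easy. Try something like

    // needs java.util.regex.Pattern and java.util.regex.Matcher
    if (title.equals("")) {
        // (?is): case-insensitive, and '.' also matches line breaks
        Pattern p = Pattern.compile("(?is)<title[^>]*>(.*?)</title>");
        Matcher m = p.matcher(text);
        if (m.find()) {
            // group(1) is the text between the tags, without the tags
            title = m.group(1).trim();
        }
    }
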
after line 194 in HtmlParser.java.

Best regards,
Magnus

On Fri, Aug 28, 2009 at 8:07 PM, Alexey Torochkov <all.net...@gmail.com> wrote:

> On Fri, Aug 28, 2009 at 7:39 PM, Fuad Efendi <f...@efendi.ca> wrote:
>
>> Some bad guys even put <div> before <html> tag – check Google cached
>> page :)
>>
>> (just joking...)
>>
>> Wonderfully, browsers understand that...
>
> :-P
> Without sarcasm and irony... I just wanted to say that if a page has a
> title, it should be extracted anyway.
>
> --
> Alexey Torochkov
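
P.S. For anyone who wants to try the idea outside HtmlParser.java, here is a minimal standalone sketch of the regex fallback suggested above. The class and method names (TitleFallback, extractTitle) are made up for illustration and do not exist in HtmlParser.java. It assumes the raw page is available as a String, and it is only a rough fallback: a regex can still be fooled by a title-like string inside a comment or script.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of HtmlParser.java.
final class TitleFallback {

    // (?is): case-insensitive, and '.' also matches line breaks.
    // [^>]* tolerates attributes on the opening tag; (.*?) is reluctant,
    // so the match stops at the first closing tag.
    private static final Pattern TITLE =
            Pattern.compile("(?is)<title[^>]*>(.*?)</title>");

    // Returns the text of the first <title> element, or "" if none is found.
    static String extractTitle(String html) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1).trim() : "";
    }

    public static void main(String[] args) {
        // Malformed page with markup before <html>, as in the quoted joke.
        String page = "<div>oops</div><html><head><TITLE>\n My page </TITLE></head></html>";
        System.out.println(extractTitle(page));
    }
}
```

Returning an empty string when nothing matches keeps the existing title.equals("") check usable as the "no title found" test.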