So the tool you use can extract outlinks and plain text without accessing the DOM?

2006/2/18, Fuad Efendi <[EMAIL PROTECTED]>:
>
> I am using http://htmlparser.sourseforge.net for my Data Mining engine.
> It has a lightweight 'lexer' package, and I don't need to perform ANY
> HTML/XML error checking etc. It is a lightweight, low-level 'parser'; it is
> not a full parser, and it is not DOM, SAX, etc. We do not need to create a
> DOM to extract Outlink[] or to extract plain text.
> What about licensing?
>
> We can develop our own low-level HTML (InputSource) processing engine from
> scratch; we need only Outlink[] and PlainText.
>
>
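
To make the question concrete, here is a minimal sketch of what "outlinks and plain text without a DOM" could look like as a single linear scan. It is only an illustration under stated assumptions: it uses the Java standard library alone and is not the htmlparser lexer API. `LinkTextScanner`, `Result`, and `scan` are hypothetical names. It skips comments and script/style bodies, but it does not decode entities, resolve relative URLs, or cope with `>` inside attribute values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: one linear pass over the markup, no DOM tree is built. */
public class LinkTextScanner {

    /** Holder for the two outputs (illustrative names, not from any library). */
    public static final class Result {
        public final List<String> links = new ArrayList<>();
        public final StringBuilder text = new StringBuilder();
    }

    // Matches href="...", href='...' or an unquoted href value inside a tag.
    private static final Pattern HREF = Pattern.compile(
        "\\bhref\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))",
        Pattern.CASE_INSENSITIVE);

    public static Result scan(String html) {
        Result out = new Result();
        int pos = 0;
        while (pos < html.length()) {
            int lt = html.indexOf('<', pos);
            if (lt < 0) lt = html.length();
            if (lt > pos) appendText(out.text, html.substring(pos, lt));
            if (lt == html.length()) break;

            if (html.startsWith("<!--", lt)) {           // skip comments whole
                int end = html.indexOf("-->", lt + 4);
                if (end < 0) break;
                pos = end + 3;
                continue;
            }
            int gt = html.indexOf('>', lt);
            if (gt < 0) break;                           // unterminated tag: stop
            String tag = html.substring(lt + 1, gt);
            boolean closing = tag.startsWith("/");
            String name = tagName(tag);
            pos = gt + 1;

            if (!closing && name.equals("a")) {
                Matcher m = HREF.matcher(tag);
                if (m.find()) {
                    String href = m.group(1) != null ? m.group(1)
                                : m.group(2) != null ? m.group(2) : m.group(3);
                    out.links.add(href);
                }
            } else if (!closing && (name.equals("script") || name.equals("style"))) {
                // Jump straight to the matching close tag so code is not treated as text.
                int end = indexOfIgnoreCase(html, "</" + name, pos);
                if (end < 0) break;
                pos = end;
            }
        }
        return out;
    }

    // Extracts the lower-cased element name from the inside of a tag.
    private static String tagName(String tag) {
        int i = tag.startsWith("/") ? 1 : 0;
        int j = i;
        while (j < tag.length() && Character.isLetterOrDigit(tag.charAt(j))) j++;
        return tag.substring(i, j).toLowerCase(Locale.ROOT);
    }

    // Collapses whitespace and joins text chunks with a single space.
    private static void appendText(StringBuilder sb, String chunk) {
        String t = chunk.trim().replaceAll("\\s+", " ");
        if (t.isEmpty()) return;
        if (sb.length() > 0) sb.append(' ');
        sb.append(t);
    }

    private static int indexOfIgnoreCase(String s, String needle, int from) {
        for (int i = from; i + needle.length() <= s.length(); i++) {
            if (s.regionMatches(true, i, needle, 0, needle.length())) return i;
        }
        return -1;
    }
}
```

Calling `LinkTextScanner.scan(html)` gives the raw href values in `links` and the collected text in `text`. A real lexer such as the one in htmlparser would be more tolerant of broken markup than this sketch.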
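--
"The Final Combat" (盖世豪侠) won rave reviews and kept TVB's (无线) ratings
high, yet TVB, pleased as it was, still did not give him major roles.
Stephen Chow (周星驰) was never one to stay in a small pond: once his comic
talent had shown itself, he would not accept being sidelined, so he moved
into film and showed his flair on the big screen. TVB had found a
thoroughbred and then lost it, and naturally regretted it too late.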
