On Jan 4, 2007, at 1:55 PM, Brian Whitman wrote:
> I did that kill -SIGQUIT thing on the parse hang-- looks like
> jid3lib has a problem... but if jid3lib throws an exception,
> shouldn't the parse-mp3 plugin and nutch pick it up and continue?
> (Excuse my java lack of knowledge...)

As suspected, jid3lib had a nasty bug in parsing certain malformed ID3v2.x files. I've patched it, only to find that someone else had reported the same bug in 2004 and the developers never integrated the fix. Does anyone know a better (more actively maintained) Java ID3 parsing library? I could volunteer to write parse-mp3 against something else instead.

More to the topic of Nutch, it's interesting that the hangs only happened on a re-parse. These files had been parsed before, during the crawl. If a parse subprocess hangs like this during a crawl, doesn't a reaper come around, kill the thread, and ignore the URL? If so, shouldn't the same thing happen during explicit parses?
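
To make the "reaper" question concrete, here is a rough sketch of what I mean by guarding a parse with a timeout. This is not Nutch's actual code, and I have not checked how (or whether) the crawl path does this; the class name ParseWithTimeout, the Outcome enum, and runGuarded are all made up for illustration. The point is that a hang never throws, so a plain try/catch around the parser call cannot help; the caller has to give up waiting.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Nutch or jid3lib.
public class ParseWithTimeout {

    /** Result of a guarded parse (illustrative type only). */
    public enum Outcome { OK, FAILED, TIMED_OUT }

    /** Runs parseTask on a worker thread and stops waiting after timeoutMillis. */
    public static Outcome runGuarded(Callable<?> parseTask, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "guarded-parse");
            t.setDaemon(true); // a stuck parser must not keep the JVM alive
            return t;
        });
        Future<?> future = executor.submit(parseTask);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return Outcome.OK;
        } catch (TimeoutException e) {
            // cancel(true) only interrupts the worker; a tight loop that
            // never checks for interruption will keep spinning regardless.
            future.cancel(true);
            return Outcome.TIMED_OUT;
        } catch (ExecutionException e) {
            // The parser threw: this is the case a try/catch already covers.
            return Outcome.FAILED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return Outcome.FAILED;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-ins for a good file, a file that throws, and a file that hangs.
        Outcome ok = runGuarded(() -> "parsed", 1000);
        Outcome failed = runGuarded(() -> { throw new IllegalStateException("bad tag"); }, 1000);
        Outcome hung = runGuarded(() -> { while (true) { Thread.sleep(50); } }, 200);
        System.out.println(ok + " " + failed + " " + hung);
    }
}
```

Note that in this sketch the hung worker is only abandoned, not killed. If the library loops without ever blocking or checking the interrupt flag, the thread keeps burning CPU until the process exits, which is presumably why a separate subprocess would be the safer design.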
