[ https://issues.apache.org/jira/browse/PIG-593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vadim Zaliva updated PIG-593: ----------------------------- Attachment: PIG-593.diff Attaching patch. > RegExLoader stops an non-matching line > -------------------------------------- > > Key: PIG-593 > URL: https://issues.apache.org/jira/browse/PIG-593 > Project: Pig > Issue Type: Bug > Components: impl > Affects Versions: 0.1.0 > Reporter: Vadim Zaliva > Attachments: PIG-593.diff > > > Class RegExLoader and all its subclasses stop if some of lines does not match > provided regular expression. > In particular, I have noticed this when CombinedLogLoader stopped at the > following line: > 58.210.62.24 - - [29/Dec/2008:23:06:57 -0800] "GET > /tor/browse/?id=24746&rel=FLY > 999%40Jack's+Teen+America+22%2FFLY999原創%40單掛D.C.資訊交流網+Jack's+Teen+Ameri > ca+22+cd1.avi HTTP/1.1" 8952 200 > "http://img252.imageshack.us/tor/browse/?id=247 > 46&rel=FLY999%40Jack%27s+Teen+America+22" "Mozilla/4.0 (compatible; MSIE 6.0; > Wi > ndows NT 5.1; )" "-" > Looks like some japanese characters here do not match \S expression used. > In general I expect it to skip such lines, not to stop processing data file. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.