[ 
https://issues.apache.org/jira/browse/PIG-593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vadim Zaliva updated PIG-593:
-----------------------------

    Attachment: PIG-593.diff

Attaching patch.



> RegExLoader stops on non-matching line
> --------------------------------------
>
>                 Key: PIG-593
>                 URL: https://issues.apache.org/jira/browse/PIG-593
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.1.0
>            Reporter: Vadim Zaliva
>         Attachments: PIG-593.diff
>
>
> Class RegExLoader and all of its subclasses stop reading if a line does not
> match the provided regular expression.
> In particular, I noticed this when CombinedLogLoader stopped at the
> following line:
> 58.210.62.24 - - [29/Dec/2008:23:06:57 -0800] "GET /tor/browse/?id=24746&rel=FLY999%40Jack's+Teen+America+22%2FFLY999原創%40單掛D.C.資訊交流網+Jack's+Teen+America+22+cd1.avi HTTP/1.1" 8952 200 "http://img252.imageshack.us/tor/browse/?id=24746&rel=FLY999%40Jack%27s+Teen+America+22" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; )" "-"
> It looks like some of the CJK characters here do not match the \S expression
> used.
> In general, I would expect it to skip such lines rather than stop processing
> the data file.
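
For illustration, the expected skip-and-continue behaviour can be sketched in plain Java as follows. This is only a minimal sketch and not the attached patch or Pig's actual RegExLoader code: the class name SkipNonMatchingLines, the helper readMatching, and the sample pattern are all hypothetical and use only java.util.regex and java.io.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SkipNonMatchingLines {

    // Hypothetical helper (not part of Pig): reads every line, keeps the
    // capture groups of lines that match, and skips lines that do not.
    static List<String[]> readMatching(BufferedReader in, Pattern pattern) {
        List<String[]> records = new ArrayList<>();
        try {
            String line;
            while ((line = in.readLine()) != null) {
                Matcher m = pattern.matcher(line);
                if (!m.matches()) {
                    // Skip the bad line and keep going; returning or
                    // breaking here would end the load at the first mismatch.
                    continue;
                }
                String[] fields = new String[m.groupCount()];
                for (int i = 1; i <= m.groupCount(); i++) {
                    fields[i - 1] = m.group(i);
                }
                records.add(fields);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        // Sample pattern chosen for the sketch: one token, a space, digits.
        Pattern p = Pattern.compile("^(\\S+) (\\d+)$");
        String data = "alpha 1\nthis line does not match\nbeta 2\n";
        List<String[]> records =
                readMatching(new BufferedReader(new StringReader(data)), p);
        System.out.println(records.size() + " records loaded");
    }
}
```

With the sample input above, the middle line is dropped and the two surrounding lines are still loaded, which is the behaviour requested in this issue.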

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
