[ https://issues.apache.org/jira/browse/PIG-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmitriy V. Ryaboy resolved PIG-2722.
------------------------------------
Resolution: Won't Fix
> UDF FilterFunc in expression using OR right hand side gets ignored
> ------------------------------------------------------------------
>
> Key: PIG-2722
> URL: https://issues.apache.org/jira/browse/PIG-2722
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.1
> Environment: pig-0.8.1, hadoop-0.20.2 from Cloudera's distribution
> cdh3u3 on Kubuntu 12.04 64-bit.
> Reporter: Johannes Schwenk
>
> The following pig script does not produce the expected output:
> {noformat}
> register adition.jar
> a = LOAD 'TestCONTAINS-testFilteringCluster-input.txt' AS (id:int, grp:int,
> additional:int, referer:chararray);
> b = FILTER a BY com.adition.pig.filtering.string.CONTAINS(referer, 'obama')
> OR com.adition.pig.filtering.string.CONTAINS(referer, 'praesident');
> EXPLAIN b;
> dump b;
> {noformat}
> TestCONTAINS-testFilteringCluster-input.txt contains
> {noformat}
> 1 23 42 http://www.google.com/url&url=http%3A%2F%2Fwww.example.com%2Fmypage.htm&q=flowers
> 2 123 42 http://www.google.com/url&url=http%3A%2F%2Fwww.zeit.de%2Findex.php&q=towers
> 3 223 142 http://www.google.com/url&url=http%3A%2F%2Fwww.nix-wie-weg.de&q=mallorca
> 4 323 242 http://www.google.com/url&url=http%3A%2F%2Fwww.tagesschau.de&q=obama
> 5 423 342 http://www.google.com/url&url=http%3A%2F%2Fwww.bild.de&q=obama
> 6 523 442 http://www.google.com/url&url=http%3A%2F%2Fwww.example.com%2Fmypage.htm&q=praesident
> {noformat}
> The {{adition.jar}} was built against the Cloudera cdh3u3 distribution
> and contains the filter function {{CONTAINS}}; its source is at
> http://pastebin.com/Uwje7v1V .
> The full output is at http://pastebin.com/yXY17mXx . In short, the
> right-hand side of the OR in the FILTER expression is being ignored,
> so the script returns just two lines
> {noformat}
> (4,323,242,http://www.google.com/url&url=http%3A%2F%2Fwww.tagesschau.de&q=obama)
> (5,423,342,http://www.google.com/url&url=http%3A%2F%2Fwww.bild.de&q=obama)
> {noformat}
> instead of three lines
> {noformat}
> (4,323,242,http://www.google.com/url&url=http%3A%2F%2Fwww.tagesschau.de&q=obama)
> (5,423,342,http://www.google.com/url&url=http%3A%2F%2Fwww.bild.de&q=obama)
> (6,523,442,http://www.google.com/url&url=http%3A%2F%2Fwww.example.com%2Fmypage.htm&q=praesident)
> {noformat}
> Running the script with Pig 0.11.0 yields the correct results:
> http://pastebin.com/Cr5CkHui
> See also the discussion on the pig-user mailing list:
> http://www.mail-archive.com/user%40pig.apache.org/msg05278.html
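For reference, the result the reporter expects from the OR expression can be sketched in plain Java, without any Pig classes. This is not the reporter's UDF, whose source is only linked above. It assumes CONTAINS is a simple substring test, and the names OrFilterSketch, contains, and keptIds are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the reported setup; no Pig classes are involved.
class OrFilterSketch {

    // The six referer values from the input file; the row id is index + 1.
    static final String[] REFERERS = {
        "http://www.google.com/url&url=http%3A%2F%2Fwww.example.com%2Fmypage.htm&q=flowers",
        "http://www.google.com/url&url=http%3A%2F%2Fwww.zeit.de%2Findex.php&q=towers",
        "http://www.google.com/url&url=http%3A%2F%2Fwww.nix-wie-weg.de&q=mallorca",
        "http://www.google.com/url&url=http%3A%2F%2Fwww.tagesschau.de&q=obama",
        "http://www.google.com/url&url=http%3A%2F%2Fwww.bild.de&q=obama",
        "http://www.google.com/url&url=http%3A%2F%2Fwww.example.com%2Fmypage.htm&q=praesident"
    };

    // Assumed semantics of the CONTAINS UDF: null-safe substring match.
    static boolean contains(String haystack, String needle) {
        return haystack != null && needle != null && haystack.contains(needle);
    }

    // Ids of the rows kept by:
    //   CONTAINS(referer, 'obama') OR CONTAINS(referer, 'praesident')
    static List<Integer> keptIds() {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < REFERERS.length; i++) {
            String referer = REFERERS[i];
            // Both sides of the OR must be able to accept a row.
            if (contains(referer, "obama") || contains(referer, "praesident")) {
                kept.add(i + 1);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(keptIds()); // prints [4, 5, 6]
    }
}
```

Under these assumptions rows 4, 5 and 6 pass the filter, matching the three expected lines in the report. Evaluating only the left-hand side would keep just rows 4 and 5, which is the behaviour described for 0.8.1.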