Roman Borisov created DATAFU-31:
-----------------------------------

Summary: bags.DistinctBy works incorrectly on string containing minuses
Key: DATAFU-31
URL: https://issues.apache.org/jira/browse/DATAFU-31
Project: DataFu
Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Roman Borisov

How to reproduce:

Input:
{(a-b,c), (a-b,d)}

define distinct as DistinctBy('1')
input = load 'input' as vs:bag{(v0:chararray,v1:chararray)};
output = foreach input generate distinct(vs);
dump output;

expected: {(a-b,c), (a-b,d)}
actual: {(a-b,c)}

The bug is caused by the implementation, which splits the tuple string on '-' to recover the tuple parts. A '-' inside a field value is therefore mistaken for a field boundary.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
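
The failure mode described in the report can be illustrated with a small Python sketch. This is not the DataFu code; the function names `distinct_by_split` and `distinct_by_fields` are hypothetical, and the flatten-then-split step is only an assumed model of "splitting the tuple string by '-'". It shows how a '-' inside a field shifts the recovered parts so that two different keys collapse into one.

```python
def distinct_by_split(bag, index, sep="-"):
    """Hypothetical buggy variant: flatten each tuple to a string, then split it back."""
    seen, out = set(), []
    for tup in bag:
        flat = sep.join(tup)      # ("a-b", "c") becomes "a-b-c"
        parts = flat.split(sep)   # ["a", "b", "c"]: the original field boundaries are lost
        key = parts[index]        # index 1 now points at "b" instead of "c"
        if key not in seen:
            seen.add(key)
            out.append(tup)
    return out


def distinct_by_fields(bag, index):
    """Delimiter-free variant: use the field value itself as the key."""
    seen, out = set(), []
    for tup in bag:
        key = tup[index]
        if key not in seen:
            seen.add(key)
            out.append(tup)
    return out


bag = [("a-b", "c"), ("a-b", "d")]
print(distinct_by_split(bag, 1))   # only the first tuple survives
print(distinct_by_fields(bag, 1))  # both tuples survive
```

Under this model, both tuples yield the key "b" in the split-based variant, which matches the "actual" output above, while keying on the field value directly gives the "expected" output.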