Sergey created PIG-3625:
---------------------------

             Summary: Strange schema transformation and API path to get nested objects
                 Key: PIG-3625
                 URL: https://issues.apache.org/jira/browse/PIG-3625
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.10.1
         Environment: CDH 4.4
            Reporter: Sergey

Hi, here is a part of my script:

{code}
describe groupedNotMatchedSaleItems;
/*
groupedNotMatchedSaleItems: {
    group: long,
    notMatchedSaleItems: {(sale_id: long,sale_item_id: long)}
}
*/

describe groupedFlatSales;
/*
groupedFlatSales: {
    group: long,
    flatSales: {(npl_id: long,block_id: int,is_napoleon: int,rec_cnt: int,recs: chararray,item_id: long,shop_id: int,internal_id: int,catalog_category_id: long,sale_item_id: long,sale_id: long,price: int,count: int)}
}
*/

describe projectedRecsOf2ndLevel;
/*
projectedRecsOf2ndLevel: {sale_id: long,sale_item_id: long,npl_id: long,recs: chararray}
*/

cogroupedSalesNotMatched = COGROUP groupedFlatSales by group,
                                   groupedNotMatchedSaleItems by group,
                                   projectedRecsOf2ndLevel by sale_id;

describe cogroupedSalesNotMatched;
/*
cogroupedSalesNotMatched: {
    group: long,
    groupedFlatSales: {
        (
            group: long,
            flatSales: {(npl_id: long,block_id: int,is_napoleon: int,rec_cnt: int,recs: chararray,item_id: long,shop_id: int,internal_id: int,catalog_category_id: long,sale_item_id: long,sale_id: long,price: int,count: int)}
        )
    },
    groupedNotMatchedSaleItems: {
        (
            group: long,
            notMatchedSaleItems: {(sale_id: long,sale_item_id: long)}
        )
    },
    projectedRecsOf2ndLevel: {
        (sale_id: long,sale_item_id: long,npl_id: long,recs: chararray)
    }
}
*/

secondLevelRecommendations = FOREACH cogroupedSalesNotMatched {
    GENERATE NplRecSecondLevelMatcher(groupedNotMatchedSaleItems.notMatchedSaleItems,
                                      groupedFlatSales.flatSales,
                                      projectedRecsOf2ndLevel);
}
{code}

NplRecSecondLevelMatcher is a Java UDF. The input schema inside the UDF is:

{code}
{
    {
        (
            notMatchedSaleItems: {(sale_id: long,sale_item_id: long)}
        )
    },
    {
        (
            flatSales: {(npl_id: long,block_id: int,is_napoleon: int,rec_cnt: int,recs: chararray,item_id: long,shop_id: int,internal_id: int,catalog_category_id: long,sale_item_id: long,sale_id: long,price: int,count: int)}
        )
    },
    projectedRecsOf2ndLevel: {(sale_id: long,sale_item_id: long,npl_id: long,recs: chararray)}
}
{code}

Why is the schema so strange for notMatchedSaleItems and flatSales? Each of them arrives wrapped in an extra unnamed bag and tuple, while projectedRecsOf2ndLevel arrives as a plain bag.

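To make the nesting concrete, here is a minimal sketch in plain Java. It is not Pig code: I am assuming, purely for illustration, that a java.util.List can stand in for both a bag and a tuple, and the class name, helper names and sample values are hypothetical. It models only the first UDF argument as reported by the schema above, plus the two-step unwrapping needed to reach the inner bag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Plain-Java model of the first UDF argument described by the schema above.
 * Assumption: a List stands in for both a Pig bag and a Pig tuple.
 * No Pig classes are used, so this only illustrates the nesting.
 */
class NestedInputSketch {

    /** Builds a tuple (modelled as a list of fields). */
    public static List<Object> tuple(Object... fields) {
        return new ArrayList<>(Arrays.asList(fields));
    }

    /** Builds a bag (modelled as a list of tuples). */
    public static List<Object> bag(Object... tuples) {
        return new ArrayList<>(Arrays.asList(tuples));
    }

    /** Builds an input tuple whose field 0 has the shape {(notMatchedSaleItems: {...})}. */
    public static List<Object> sampleInput() {
        // Inner bag: {(sale_id, sale_item_id)} with made-up sample values.
        List<Object> notMatchedSaleItems = bag(tuple(1L, 10L), tuple(1L, 11L));
        // Outer bag holding one wrapper tuple whose only field is the inner bag.
        List<Object> wrapped = bag(tuple(notMatchedSaleItems));
        return tuple(wrapped);
    }

    /** Takes the first tuple of the outer bag, then that tuple's field 0. */
    @SuppressWarnings("unchecked")
    public static List<Object> unwrap(List<Object> input, int argIndex) {
        List<Object> outerBag = (List<Object>) input.get(argIndex);
        List<Object> firstTuple = (List<Object>) outerBag.get(0);
        return (List<Object>) firstTuple.get(0);
    }

    public static void main(String[] args) {
        List<Object> input = sampleInput();
        System.out.println("raw argument 0: " + input.get(0));
        System.out.println("unwrapped bag:  " + unwrap(input, 0));
    }
}
```

In this model, input.get(0) is the outer wrapper bag rather than the bag of sale items, which is the behaviour I did not expect.
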
I have to write this strange code to get access to the notMatchedSaleItems bag:

{code}
/**
 * This is Groovy. The DataFu library is used.
 * @param input   the input tuple for the UDF
 * @param bagName the name of a bag in the schema
 */
def getInputBag(Tuple input, String bagName) {
    def bag = getBag(input, bagName)
    // The requested bag is wrapped: take the first tuple of the outer bag,
    // then its first field, which is the bag I actually need.
    (bag.iterator().next() as Tuple).get(0) as DataBag
}
{code}

I supposed that

{code}
(DataBag) udfInputTuple.get(0)
{code}

should return the bag with "notMatchedSaleItems".

Why is my input wrapped with these bags and tuples?

--
This message was sent by Atlassian JIRA
(v6.1.4#6159)