[ https://issues.apache.org/jira/browse/PIG-4705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Dai updated PIG-4705:
----------------------------
    Fix Version/s: 0.16.0

> Error Schema for data cannot be determined using HCatalog
> ---------------------------------------------------------
>
>                 Key: PIG-4705
>                 URL: https://issues.apache.org/jira/browse/PIG-4705
>             Project: Pig
>          Issue Type: Bug
>          Components: tez
>    Affects Versions: 0.15.0
>         Environment: HDP 2.3.2
>            Reporter: Krzysztof Indyk
>             Fix For: 0.16.0
>
>         Attachments: hive_tables.hql, sample.csv, stack_trace.log
>
>
> When we use {{HCatalog}} as both the source and the destination of data for
> {{Pig}} on {{Tez}}, we get ??ERROR 1115: Schema for data cannot be determined??.
> Pig works fine when we use MapReduce, or when we use HCatalog as only one of
> the endpoints, i.e. load data directly from a file and store using HCatalog.
> The error appears after upgrading from {{Pig 0.14}} on {{Tez 0.5.2}} to
> {{Pig 0.15}} on {{Tez 0.7.0}} ({{HDP 2.2.6}} to {{HDP 2.3.2}}).
> To reproduce:
> - create the Hive tables from [^hive_tables.hql]
> - load data into table_input from [^sample.csv]
> - run the following Pig script on Tez
> {code}
> data = LOAD 'table_input' USING org.apache.hive.hcatalog.pig.HCatLoader();
> items_unique = DISTINCT data;
> counted = FOREACH (GROUP items_unique BY col2)
>     GENERATE
>         group AS name,
>         COUNT(items_unique) AS value;
>
> STORE counted INTO 'table_output' USING org.apache.hive.hcatalog.pig.HCatStorer();
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)