Hi,
I wrote a similar, simpler UDTF and a new table. This simple UDTF does work on Hive 0.10, but my original one doesn't. I still don't understand why.
Does the fact that the original query works with the setting 'set hive.optimize.ppd=false' give any clue? Please let me know.

On Tuesday, February 25, 2014 3:28 PM, java8964 <java8...@hotmail.com> wrote:

Works for me on 0.10.

Yong

________________________________
Date: Tue, 25 Feb 2014 11:37:32 -0800
From: kumarbuyonl...@yahoo.com
Subject: Re: java.lang.RuntimeException: cannot find field key from [0:_col0, 1:_col2, 2:_col3]
To: user@hive.apache.org

Hi,
Thanks for looking into it. I am also trying this on Hive 0.11 to see if it works there. If you get a chance to reproduce this problem on Hive 0.10, please let me know.
Thanks.

On Monday, February 24, 2014 10:59 PM, java8964 <java8...@hotmail.com> wrote:

My guess is that your UDTF returns an array of structs. I don't have Hive 0.10 handy right now, but I wrote a simple UDTF that returns an array of structs and tested it on the Hive 0.12 release.

hive> desc test;
OK
id      int     None
name    string  None
Time taken: 0.074 seconds, Fetched: 2 row(s)
hive> select * from test;
OK
1       Apples,Bananas,Carrots
Time taken: 0.08 seconds, Fetched: 1 row(s)

The pair UDTF turns "Apples,Bananas,Carrots" into "Apples, Bananas", "Apples, Carrots", and "Bananas, Carrots", that is, an array of two-element structs.
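The pairing described above can be sketched in plain Java. This is only an illustration, assuming the UDTF emits every unordered pair of the comma-separated items in input order. The class and method names are hypothetical, and this is not the UDTF used in the thread; a real one would extend GenericUDTF and call forward() once per pair.

```java
import java.util.ArrayList;
import java.util.List;

class PairSketch {
    // Hypothetical helper: returns every unordered pair of the
    // comma-separated items, preserving input order.
    static List<String[]> pairs(String csv) {
        List<String[]> out = new ArrayList<>();
        if (csv == null || csv.isEmpty()) {
            return out; // nothing to pair
        }
        String[] items = csv.split(",");
        for (int i = 0; i < items.length; i++) {
            for (int j = i + 1; j < items.length; j++) {
                // Each pair corresponds to one (m1, m2) output row.
                out.add(new String[] { items[i].trim(), items[j].trim() });
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String[] p : pairs("Apples,Bananas,Carrots")) {
            System.out.println(p[0] + "\t" + p[1]);
        }
    }
}
```

For the sample row this yields three pairs, matching the three rows returned by the lateral view queries in this message.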
hive> select id, name, m1, m2 from test lateral view pair(name) p as m1, m2 where m1 is not null;
OK
1       Apples,Bananas,Carrots  Apples  Bananas
1       Apples,Bananas,Carrots  Apples  Carrots
1       Apples,Bananas,Carrots  Bananas Carrots
Time taken: 7.683 seconds, Fetched: 3 row(s)
hive> select id, name, m1, m2 from test lateral view pair(name) p as m1, m2 where m1 = 'Apples';
OK
1       Apples,Bananas,Carrots  Apples  Bananas
1       Apples,Bananas,Carrots  Apples  Carrots
Time taken: 7.726 seconds, Fetched: 2 row(s)
hive> set hive.optimize.ppd=true;
hive> select id, name, m1, m2 from test lateral view pair(name) p as m1, m2 where m1 is not null;
Total MapReduce jobs = 1
OK
1       Apples,Bananas,Carrots  Apples  Bananas
1       Apples,Bananas,Carrots  Apples  Carrots
1       Apples,Bananas,Carrots  Bananas Carrots
Time taken: 7.716 seconds, Fetched: 3 row(s)

As you can see, I cannot reproduce your error in Hive 0.12. I can test on Hive 0.10 tomorrow when I have time, but can you test your case on Hive 0.12, or review your UDTF again?

Yong

________________________________
Date: Mon, 24 Feb 2014 07:09:44 -0800
From: kumarbuyonl...@yahoo.com
Subject: Re: java.lang.RuntimeException: cannot find field key from [0:_col0, 1:_col2, 2:_col3]
To: user@hive.apache.org; kumarbuyonl...@yahoo.com

As suggested, I changed the query like this:

select x.f1, x.f2, x.f3, x.f4 from (
  select e.f1 as f1, e.f2 as f2, e.f3 as f3, e.f4 as f4
  from mytable LATERAL VIEW myfunc(p1,p2,p3,p4) e as f1,f2,f3,f4
  where lang=123) x
where x.f3 is not null;

And it still doesn't work. I am getting the same error. If anyone has any ideas, please let me know.
Thanks.

On Friday, February 21, 2014 11:27 AM, Kumar V <kumarbuyonl...@yahoo.com> wrote:

Line 316 in my UDTF, where it shows the error, is the line where I call forward().
The whole trace is:

Caused by: java.lang.RuntimeException: cannot find field key from [0:_col0, 1:_col2, 2:_col6, 3:_col7, 4:_col8, 5:_col9]
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:346)
    at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:143)
    at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57)
    at org.apache.hadoop.hive.ql.exec.ExprNodeFieldEvaluator.initialize(ExprNodeFieldEvaluator.java:55)
    at org.apache.hadoop.hive.ql.exec.ExprNodeFieldEvaluator.initialize(ExprNodeFieldEvaluator.java:55)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:128)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:128)
    at org.apache.hadoop.hive.ql.exec.ExprNodeGenericFuncEvaluator.initialize(ExprNodeGenericFuncEvaluator.java:128)
    at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:85)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.LateralViewJoinOperator.processOp(LateralViewJoinOperator.java:133)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.UDTFOperator.forwardUDTFOutput(UDTFOperator.java:112)
    at org.apache.hadoop.hive.ql.udf.generic.UDTFCollector.collect(UDTFCollector.java:44)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:81)
    at pb2CSVReadFile.FlatTableFileUDTFTx.process(FlatTableFileUDTFTx.java:316)
    at org.apache.hadoop.hive.ql.exec.UDTFOperator.processOp(UDTFOperator.java:98)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.LateralViewForwardOperator.processOp(LateralViewForwardOperator.java:37)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.FilterOperator.processOp(FilterOperator.java:132)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:83)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:474)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:800)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:658)
    ... 9 more

On Friday, February 21, 2014 11:18 AM, java8964 <java8...@hotmail.com> wrote:

What is your stack trace? Can you paste it here? It may be a different bug.
What if you put e.f3 <> null in an outer query? Does that work? Or maybe you have to enhance your UDTF to push that filter into the UDTF itself. It is not perfect, but it may be a solution for you for now. You can create a new Jira if it is a new bug.
Yong

________________________________
Date: Fri, 21 Feb 2014 07:18:32 -0800
From: kumarbuyonl...@yahoo.com
Subject: java.lang.RuntimeException: cannot find field key from [0:_col0, 1:_col2, 2:_col3]
To: user@hive.apache.org

Hi,
I have a UDTF which works fine except when I run a query like the following:

select e.* from mytable LATERAL VIEW myfunc(p1,p2,p3,p4) e as f1,f2,f3,f4 where lang=123 and e.f3 <> null;

The error I see is:

java.lang.RuntimeException: cannot find field key from [0:_col0, 1:_col2, 2:_col3]

If I remove 'and e.f3 <> null' from the WHERE clause, it works fine. Also, with e.f3 <> null in the WHERE clause, if I add the setting hive.optimize.ppd=false it works fine, but then, instead of using 600 mappers, it uses about 10,000 mappers and runs for more than 2 hours instead of a few minutes.

I am using Hive 0.10. I saw the Jira HIVE-3226, which says that it has been fixed in Hive 0.10. Is this the bug that I am hitting now? Any other ideas on how to make it work? I am actually on CDH 4.4, which has Hive 0.10.

Please let me know.
Thanks,
Murali.
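
One workaround suggested in this thread is to push the null filter into the UDTF itself, so that the WHERE clause no longer needs to reference the lateral view column. A minimal sketch of that idea follows. It is an assumption about how this could look, not the code of the UDTF discussed here: the class, method, and parameter names are hypothetical, and a plain callback stands in for GenericUDTF.forward() so that the example runs without Hive on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class FilteringForwardSketch {
    // Hypothetical stand-in for the loop inside a UDTF's process() method.
    // `forward` plays the role of GenericUDTF.forward(), and `requiredIndex`
    // is the position of the column that must not be null (f3 in the
    // thread's query would be index 2).
    static int forwardNonNull(List<Object[]> rows, int requiredIndex,
                              Consumer<Object[]> forward) {
        int forwarded = 0;
        for (Object[] row : rows) {
            if (row[requiredIndex] == null) {
                continue; // drop the row here instead of in the WHERE clause
            }
            forward.accept(row);
            forwarded++;
        }
        return forwarded;
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] { "a", "b", "c", "d" });
        rows.add(new Object[] { "a", "b", null, "d" });

        List<Object[]> emitted = new ArrayList<>();
        int n = forwardNonNull(rows, 2, emitted::add);
        System.out.println(n + " row(s) forwarded");
    }
}
```

The trade-off is the one already noted in the thread: the filter becomes hard-wired into the function instead of being expressed in the query, so it is a stopgap and not a fix for the underlying planner error.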