[jira] [Created] (HIVE-12530) Merge join in mutiple subsquence join and a mapjoin in it in mr model
Feng Yuan created HIVE-12530:

Summary: Merge join in mutiple subsquence join and a mapjoin in it in mr model
Key: HIVE-12530
URL: https://issues.apache.org/jira/browse/HIVE-12530
Project: Hive
Issue Type: Bug
Components: CLI
Affects Versions: 1.2.1
Reporter: Feng Yuan
Fix For: 2.00

sample hql:

select A.state_date,
       A.customer,
       A.channel_2,
       A.id,
       A.pid,
       A.type,
       A.pv,
       A.uv,
       A.visits,
       if(C.stay_visits is null,0,C.stay_visits) as stay_visits,
       A.stay_time,
       if(B.bounce is null,0,B.bounce) as bounce
from
  (select a.state_date,
          a.customer,
          b.url as channel_2,
          b.id,
          b.pid,
          b.type,
          count(1) as pv,
          count(distinct a.gid) uv,
          count(distinct a.session_id) as visits,
          sum(a.stay_time) as stay_time
   from
     (select state_date, customer, gid, session_id, ep, stay_time
      from bdi_fact.mid_pageview_dt0
      where l_date ='$v_date'
     )a
   join
     (select l_date as state_date, url, id, pid, type, cid
      from bdi_fact.frequency_channel
      where l_date ='$v_date' and type ='2' and dr='0'
     )b
   on a.customer=b.cid
   where a.ep rlike b.url
   group by a.state_date, a.customer, b.url,b.id,b.pid,b.type
  )A
left outer join
  (select c.state_date,
          c.customer,
          d.url as channel_2,
          d.id,
          sum(pagedepth) as bounce
   from
     (select t1.state_date, t1.customer, t1.session_id, t1.ep, t2.pagedepth
      from
        (select state_date, customer, session_id, exit_url as ep
         from ods.mid_session_enter_exit_dt0
         where l_date ='$v_date'
        )t1
      join
        (select state_date, customer, session_id, pagedepth
         from ods.mid_session_action_dt0
         where l_date ='$v_date' and pagedepth='1'
        )t2
      on t1.customer=t2.customer and t1.session_id=t2.session_id
     )c
   join
     (select *
      from bdi_fact.frequency_channel
      where l_date ='$v_date' and type ='2' and dr='0'
     )d
   on c.customer=d.cid
   where c.ep rlike d.url
   group by c.state_date,c.customer,d.url,d.id
  )B
on A.customer=B.customer and A.channel_2=B.channel_2 and A.id=B.id
left outer join
  (select e.state_date,
          e.customer,
          f.url as channel_2,
          f.id,
          f.pid,
          f.type,
          count(distinct e.session_id) as stay_visits
   from
     (select state_date, customer, gid, session_id, ep, stay_time
      from bdi_fact.mid_pageview_dt0
      where l_date ='$v_date'
     )e
   join
     (select l_date as state_date, url, id, pid, type, cid
      from bdi_fact.frequency_channel
      where l_date ='$v_date' and type ='2' and dr='0'
     )f
   on e
[jira] [Created] (HIVE-12531) Implement fast-path for Year/Month UDFs for dates between 1999 and 2038
Gopal V created HIVE-12531:
--

Summary: Implement fast-path for Year/Month UDFs for dates between 1999 and 2038
Key: HIVE-12531
URL: https://issues.apache.org/jira/browse/HIVE-12531
Project: Hive
Issue Type: Improvement
Reporter: Gopal V

Current codepath goes into the JDK Calendar implementation, which is very slow for the simple cases in the current decade.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
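To illustrate what a fast path of this kind could look like, here is a minimal sketch, not the Hive implementation. It assumes the date is already available as a count of days since 1970-01-01 (UTC), and it relies on the fact that between 1999 and 2038 every fourth year is a leap year, so year and month can be derived with integer arithmetic only. The class name `FastDateFields`, its method names, and the choice to fall back to `GregorianCalendar` outside the range are all hypothetical.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical helper: year/month from days since 1970-01-01 (UTC).
final class FastDateFields {
    private static final int MIN_DAY = 10592;   // 1999-01-01
    private static final int MAX_DAY = 24837;   // 2038-01-01, exclusive
    private static final int OFFSET_1968 = 731; // days from 1968-01-01 to 1970-01-01
    private static final int CYCLE = 1461;      // days in a 4-year cycle starting on a leap year
    // First day-of-year (0-based) of each month in a non-leap year.
    private static final int[] MONTH_START =
        {0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334};

    static int year(int epochDay) {
        if (epochDay < MIN_DAY || epochDay >= MAX_DAY) {
            return slow(epochDay, Calendar.YEAR);
        }
        int d = epochDay + OFFSET_1968;
        int rem = d % CYCLE;
        // The first year of each cycle is the leap year (366 days).
        int yearInCycle = rem < 366 ? 0 : 1 + (rem - 366) / 365;
        return 1968 + 4 * (d / CYCLE) + yearInCycle;
    }

    static int month(int epochDay) {
        if (epochDay < MIN_DAY || epochDay >= MAX_DAY) {
            return slow(epochDay, Calendar.MONTH) + 1; // Calendar months are 0-based
        }
        int rem = (epochDay + OFFSET_1968) % CYCLE;
        boolean leap = rem < 366;
        int dayOfYear = leap ? rem : (rem - 366) % 365;
        if (leap && dayOfYear >= 59) {
            dayOfYear--; // fold Feb 29 onto Feb 28 so the non-leap table applies
        }
        int m = 11;
        while (dayOfYear < MONTH_START[m]) {
            m--;
        }
        return m + 1;
    }

    // Slow path for dates outside the fast range, using the JDK Calendar.
    private static int slow(int epochDay, int field) {
        Calendar c = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        c.setTimeInMillis(epochDay * 86_400_000L);
        return c.get(field);
    }
}
```

For example, epoch day 16758 corresponds to 2015-11-19, so `FastDateFields.year(16758)` yields 2015 and `FastDateFields.month(16758)` yields 11. The four-year-cycle shortcut is only valid because the range contains no century year that is not a leap year, which is why the range check guards it.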
[jira] [Created] (HIVE-12532) LLAP Cache: Uncompressed data cache has NPE
Gopal V created HIVE-12532:
--

Summary: LLAP Cache: Uncompressed data cache has NPE
Key: HIVE-12532
URL: https://issues.apache.org/jira/browse/HIVE-12532
Project: Hive
Issue Type: Bug
Components: llap
Affects Versions: 2.0.0
Reporter: Gopal V
Assignee: Sergey Shelukhin

{code}
2015-11-26 08:28:45,232 [TezTaskRunner_attempt_1448429572030_0255_2_02_19_2(attempt_1448429572030_0255_2_02_19_2)] WARN org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: Ignoring exception when closing input a(cleanup). Exception class=java.io.IOException, message=java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
	at org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.rethrowErrorIfAny(LlapInputFormat.java:283)
	at org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.close(LlapInputFormat.java:275)
	at org.apache.hadoop.hive.ql.io.HiveRecordReader.doClose(HiveRecordReader.java:50)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.close(HiveContextAwareRecordReader.java:104)
	at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.close(TezGroupedSplitsInputFormat.java:177)
	at org.apache.tez.mapreduce.lib.MRReaderMapred.close(MRReaderMapred.java:96)
	at org.apache.tez.mapreduce.input.MRInput.close(MRInput.java:559)
	at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.cleanup(LogicalIOProcessorRuntimeTask.java:872)
	at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:104)
	at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35)
	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
	at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.preReadUncompressedStream(EncodedReaderImpl.java:795)
	at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:320)
	at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:413)
	at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:194)
	at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:191)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:191)
	at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:74)
	... 5 more
{code}

Not clear if current.next can set it to null before the continue;

{code}
assert partOffset <= current.getOffset();
if (partOffset == current.getOffset() && current instanceof CacheChunk) {
  // We assume cache chunks would always match the way we read, so check and skip it.
  assert current.getOffset() == partOffset && current.getEnd() == partEnd;
  lastUncompressed = (CacheChunk)current;
  current = current.next;
  continue;
}
{code}
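The concern about `current.next` can be reproduced in isolation. The sketch below is not the Hive code: `ChunkWalkSketch`, `Chunk`, and both methods are hypothetical stand-ins that only mimic the "advance, then continue" shape of the excerpt. It shows that if the loop dereferences `current` at the top of the next iteration, advancing past the last list element leads to a NullPointerException unless a null check is added.

```java
// Hypothetical stand-in for a linked list of disk/cache ranges.
final class ChunkWalkSketch {
    static final class Chunk {
        final long offset;
        final boolean cached;
        Chunk next;

        Chunk(long offset, boolean cached) {
            this.offset = offset;
            this.cached = cached;
        }
    }

    // Mirrors the "advance and continue" pattern without a null check.
    static int skipUnguarded(Chunk current, long[] partOffsets) {
        int skipped = 0;
        for (long partOffset : partOffsets) {
            // Throws NullPointerException here once current has run off the list.
            if (partOffset == current.offset && current.cached) {
                skipped++;
                current = current.next; // null when current was the last chunk
                continue;
            }
        }
        return skipped;
    }

    // Same walk, but stops as soon as the list is exhausted.
    static int skipGuarded(Chunk current, long[] partOffsets) {
        int skipped = 0;
        for (long partOffset : partOffsets) {
            if (current == null) {
                break; // nothing left to match against
            }
            if (partOffset == current.offset && current.cached) {
                skipped++;
                current = current.next;
                continue;
            }
        }
        return skipped;
    }
}
```

With two cached chunks and three part offsets, `skipGuarded` stops after the second chunk, while `skipUnguarded` fails on the third iteration. Whether the real loop in `preReadUncompressedStream` can reach this state is exactly the open question in the report.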
[jira] [Created] (HIVE-12533) Unexpected NULL in map join small table
Rajesh Balamohan created HIVE-12533:
---

Summary: Unexpected NULL in map join small table
Key: HIVE-12533
URL: https://issues.apache.org/jira/browse/HIVE-12533
Project: Hive
Issue Type: Bug
Components: Hive
Reporter: Rajesh Balamohan

{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected NULL in map join small table
	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:110)
	at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:293)
	at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:174)
	at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:170)
	at org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:104)
	... 5 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected NULL in map join small table
	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastLongHashTable.putRow(VectorMapJoinFastLongHashTable.java:88)
	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.putRow(VectorMapJoinFastTableContainer.java:182)
	at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:97)
	... 9 more
{noformat}

\cc [~gopalv]
[jira] [Created] (HIVE-12534) Date functions with vectorization is returning wrong results
Rajesh Balamohan created HIVE-12534:
---

Summary: Date functions with vectorization is returning wrong results
Key: HIVE-12534
URL: https://issues.apache.org/jira/browse/HIVE-12534
Project: Hive
Issue Type: Bug
Components: Hive
Reporter: Rajesh Balamohan

{noformat}
select c.effective_date, year(c.effective_date), month(c.effective_date) from customers c where c.customer_id = 146028;

hive> set hive.vectorized.execution.enabled=true;
hive> select c.effective_date, year(c.effective_date), month(c.effective_date) from customers c where c.customer_id = 146028;
2015-11-19 0 0

hive> set hive.vectorized.execution.enabled=false;
hive> select c.effective_date, year(c.effective_date), month(c.effective_date) from customers c where c.customer_id = 146028;
2015-11-19 2015 11
{noformat}

\cc [~gopalv], [~sseth], [~sershe]