[jira] [Created] (HIVE-12530) Merge join in multiple subsequent joins with a mapjoin among them in MR mode

2015-11-26 Thread Feng Yuan (JIRA)
Feng Yuan created HIVE-12530:


 Summary: Merge join in multiple subsequent joins with a mapjoin among them in MR mode
 Key: HIVE-12530
 URL: https://issues.apache.org/jira/browse/HIVE-12530
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 1.2.1
Reporter: Feng Yuan
 Fix For: 2.00


Sample HQL:
select A.state_date,
       A.customer,
       A.channel_2,
       A.id,
       A.pid,
       A.type,
       A.pv,
       A.uv,
       A.visits,
       if(C.stay_visits is null, 0, C.stay_visits) as stay_visits,
       A.stay_time,
       if(B.bounce is null, 0, B.bounce) as bounce
from
  (select a.state_date,
          a.customer,
          b.url as channel_2,
          b.id,
          b.pid,
          b.type,
          count(1) as pv,
          count(distinct a.gid) uv,
          count(distinct a.session_id) as visits,
          sum(a.stay_time) as stay_time
   from
     (select state_date,
             customer,
             gid,
             session_id,
             ep,
             stay_time
      from bdi_fact.mid_pageview_dt0
      where l_date = '$v_date'
     ) a
   join
     (select l_date as state_date,
             url,
             id,
             pid,
             type,
             cid
      from bdi_fact.frequency_channel
      where l_date = '$v_date'
        and type = '2'
        and dr = '0'
     ) b
   on a.customer = b.cid
   where a.ep rlike b.url
   group by a.state_date, a.customer, b.url, b.id, b.pid, b.type
  ) A
left outer join
  (select c.state_date,
          c.customer,
          d.url as channel_2,
          d.id,
          sum(pagedepth) as bounce
   from
     (select t1.state_date,
             t1.customer,
             t1.session_id,
             t1.ep,
             t2.pagedepth
      from
        (select state_date,
                customer,
                session_id,
                exit_url as ep
         from ods.mid_session_enter_exit_dt0
         where l_date = '$v_date'
        ) t1
      join
        (select state_date,
                customer,
                session_id,
                pagedepth
         from ods.mid_session_action_dt0
         where l_date = '$v_date'
           and pagedepth = '1'
        ) t2
      on t1.customer = t2.customer
      and t1.session_id = t2.session_id
     ) c
   join
     (select *
      from bdi_fact.frequency_channel
      where l_date = '$v_date'
        and type = '2'
        and dr = '0'
     ) d
   on c.customer = d.cid
   where c.ep rlike d.url
   group by c.state_date, c.customer, d.url, d.id
  ) B
on A.customer = B.customer
and A.channel_2 = B.channel_2
and A.id = B.id
left outer join
  (select e.state_date,
          e.customer,
          f.url as channel_2,
          f.id,
          f.pid,
          f.type,
          count(distinct e.session_id) as stay_visits
   from
     (select state_date,
             customer,
             gid,
             session_id,
             ep,
             stay_time
      from bdi_fact.mid_pageview_dt0
      where l_date = '$v_date'
     ) e
   join
     (select l_date as state_date,
             url,
             id,
             pid,
             type,
             cid
      from bdi_fact.frequency_channel
      where l_date = '$v_date'
        and type = '2'
        and dr = '0'
     ) f
   on e

[jira] [Created] (HIVE-12531) Implement fast-path for Year/Month UDFs for dates between 1999 and 2038

2015-11-26 Thread Gopal V (JIRA)
Gopal V created HIVE-12531:
--

 Summary: Implement fast-path for Year/Month UDFs for dates between 
1999 and 2038
 Key: HIVE-12531
 URL: https://issues.apache.org/jira/browse/HIVE-12531
 Project: Hive
  Issue Type: Improvement
Reporter: Gopal V


Current codepath goes into the JDK Calendar implementation, which is very slow 
for the simple cases in the current decade.
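
To illustrate what a Calendar-free path could look like, here is a minimal sketch that derives the year and month from a days-since-epoch value using integer arithmetic only. The class and method names (FastDateFields, yearMonth, year, month) are hypothetical and not part of Hive, and the sketch assumes the input is already a UTC day count in the proleptic Gregorian calendar. It uses the widely known days-to-civil-date arithmetic rather than whatever approach the eventual patch takes, and it is not restricted to the 1999-2038 range named in the summary.

```java
/**
 * Hypothetical sketch (not Hive code): extract year and month from a
 * days-since-1970-01-01 value with integer arithmetic only, avoiding
 * java.util.Calendar. Assumes the proleptic Gregorian calendar and no
 * time-zone adjustment.
 */
final class FastDateFields {
    private FastDateFields() {}

    /** Packs the year into the high 32 bits and the month (1-12) into the low 32 bits. */
    static long yearMonth(long epochDay) {
        long z = epochDay + 719468;                  // re-base so day 0 is 0000-03-01
        long era = Math.floorDiv(z, 146097L);        // number of whole 400-year cycles
        long doe = z - era * 146097L;                // day of era, in [0, 146096]
        long yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365; // year of era, [0, 399]
        long doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of the March-based year
        long mp = (5 * doy + 2) / 153;               // March-based month index, [0, 11]
        long month = mp < 10 ? mp + 3 : mp - 9;      // convert to calendar month, [1, 12]
        long year = yoe + era * 400 + (month <= 2 ? 1 : 0); // Jan/Feb belong to the next year
        return (year << 32) | month;
    }

    static int year(long epochDay) {
        return (int) (yearMonth(epochDay) >> 32);
    }

    static int month(long epochDay) {
        return (int) (yearMonth(epochDay) & 0xFFFFFFFFL);
    }
}
```

A caller holding a day count would call FastDateFields.year(days) and FastDateFields.month(days) directly; checking the results against java.time.LocalDate.ofEpochDay over the range of interest is a cheap way to validate such a path before relying on it.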



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-12532) LLAP Cache: Uncompressed data cache has NPE

2015-11-26 Thread Gopal V (JIRA)
Gopal V created HIVE-12532:
--

 Summary: LLAP Cache: Uncompressed data cache has NPE
 Key: HIVE-12532
 URL: https://issues.apache.org/jira/browse/HIVE-12532
 Project: Hive
  Issue Type: Bug
  Components: llap
Affects Versions: 2.0.0
Reporter: Gopal V
Assignee: Sergey Shelukhin


{code}
2015-11-26 08:28:45,232 [TezTaskRunner_attempt_1448429572030_0255_2_02_19_2(attempt_1448429572030_0255_2_02_19_2)] WARN org.apache.tez.runtime.LogicalIOProcessorRuntimeTask: Ignoring exception when closing input a(cleanup). Exception class=java.io.IOException, message=java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.rethrowErrorIfAny(LlapInputFormat.java:283)
    at org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.close(LlapInputFormat.java:275)
    at org.apache.hadoop.hive.ql.io.HiveRecordReader.doClose(HiveRecordReader.java:50)
    at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.close(HiveContextAwareRecordReader.java:104)
    at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.close(TezGroupedSplitsInputFormat.java:177)
    at org.apache.tez.mapreduce.lib.MRReaderMapred.close(MRReaderMapred.java:96)
    at org.apache.tez.mapreduce.input.MRInput.close(MRInput.java:559)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.cleanup(LogicalIOProcessorRuntimeTask.java:872)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:104)
    at org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:35)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.preReadUncompressedStream(EncodedReaderImpl.java:795)
    at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:320)
    at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.performDataRead(OrcEncodedDataReader.java:413)
    at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:194)
    at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader$4.run(OrcEncodedDataReader.java:191)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:191)
    at org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:74)
    ... 5 more
{code}

It is not clear whether current = current.next can leave current null before the continue:

{code}
  assert partOffset <= current.getOffset();
  if (partOffset == current.getOffset() && current instanceof CacheChunk) {
    // We assume cache chunks would always match the way we read, so check and skip it.
    assert current.getOffset() == partOffset && current.getEnd() == partEnd;
    lastUncompressed = (CacheChunk)current;
    current = current.next;
    continue;
  }
{code}
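
The concern can be illustrated with a small stand-alone sketch: advancing a list cursor and then hitting continue is safe only if the loop condition re-tests the cursor for null. Everything here (RangeWalk, Node, countUncached) is hypothetical and merely stands in for the real range list; whether the actual loop in EncodedReaderImpl has an equivalent guard is exactly what this report leaves open.

```java
/** Hypothetical stand-in for a linked list of buffer ranges; not the Hive classes. */
final class RangeWalk {
    private RangeWalk() {}

    static final class Node {
        final long offset;
        final boolean cached;
        Node next;

        Node(long offset, boolean cached) {
            this.offset = offset;
            this.cached = cached;
        }
    }

    /** Counts uncached nodes, skipping cached ones the way the snippet above skips cache chunks. */
    static int countUncached(Node head) {
        int uncached = 0;
        Node current = head;
        while (current != null) {          // re-evaluated after every 'continue'
            if (current.cached) {
                current = current.next;    // becomes null when a cached node is the tail
                continue;                  // safe only because the loop condition re-tests current
            }
            uncached++;
            current = current.next;
        }
        return uncached;
    }
}
```

If the loop condition did not re-test the cursor, for example because it was driven by an offset comparison instead, the first dereference after the continue would throw a NullPointerException whenever the skipped node was the last one in the list.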





[jira] [Created] (HIVE-12533) Unexpected NULL in map join small table

2015-11-26 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created HIVE-12533:
---

 Summary: Unexpected NULL in map join small table
 Key: HIVE-12533
 URL: https://issues.apache.org/jira/browse/HIVE-12533
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Rajesh Balamohan


{noformat}
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected NULL in map join small table
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:110)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:293)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:174)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:170)
    at org.apache.hadoop.hive.ql.exec.tez.LlapObjectCache.retrieve(LlapObjectCache.java:104)
    ... 5 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unexpected NULL in map join small table
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastLongHashTable.putRow(VectorMapJoinFastLongHashTable.java:88)
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastTableContainer.putRow(VectorMapJoinFastTableContainer.java:182)
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.fast.VectorMapJoinFastHashTableLoader.load(VectorMapJoinFastHashTableLoader.java:97)
    ... 9 more
{noformat}

\cc [~gopalv]





[jira] [Created] (HIVE-12534) Date functions with vectorization are returning wrong results

2015-11-26 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created HIVE-12534:
---

 Summary: Date functions with vectorization are returning wrong results
 Key: HIVE-12534
 URL: https://issues.apache.org/jira/browse/HIVE-12534
 Project: Hive
  Issue Type: Bug
  Components: Hive
Reporter: Rajesh Balamohan


{noformat}
select c.effective_date, year(c.effective_date), month(c.effective_date) from customers c where c.customer_id = 146028;


hive> set hive.vectorized.execution.enabled=true;
hive> select c.effective_date, year(c.effective_date), month(c.effective_date) from customers c where c.customer_id = 146028;

2015-11-19  0   0

hive> set hive.vectorized.execution.enabled=false;
hive> select c.effective_date, year(c.effective_date), month(c.effective_date) from customers c where c.customer_id = 146028;

2015-11-19  2015    11
{noformat}

\cc [~gopalv], [~sseth], [~sershe]


