Rahul Challapalli created DRILL-1482:
----------------------------------------

             Summary: Tpch 3 over text files for SF100 causes some of the drillbits(JVM) to crash
                 Key: DRILL-1482
                 URL: https://issues.apache.org/jira/browse/DRILL-1482
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Flow
            Reporter: Rahul Challapalli


git.commit.id.abbrev=5c220e3

We created views on top of the TPC-H text data (SF 100) so that the original 
TPC-H queries could run unmodified. The views and the query are below.

{code}
create view nation as select
  cast(columns[0] as int) n_nationkey, columns[1] n_name,
  cast(columns[2] as int) n_regionkey, columns[3] n_comment
from `nation_text`;
create view region as select
  cast(columns[0] as int) r_regionkey, columns[1] r_name, columns[2] r_comment
from `region_text`;
create view part as select
  cast(columns[0] as int) p_partkey, columns[1] p_name, columns[2] p_mfgr,
  columns[3] p_brand, columns[4] p_type, cast(columns[5] as int) p_size,
  columns[6] p_container, cast(columns[7] as double) p_retailprice,
  columns[8] p_comment
from `part_text`;
create view supplier as select
  cast(columns[0] as int) s_suppkey, columns[1] s_name, columns[2] s_address,
  cast(columns[3] as int) s_nationkey, columns[4] s_phone,
  cast(columns[5] as double) s_acctbal, columns[6] s_comment
from `supplier_text`;
create view partsupp as select
  cast(columns[0] as int) ps_partkey, cast(columns[1] as int) ps_suppkey,
  cast(columns[2] as int) ps_availqty, cast(columns[3] as double) ps_supplycost,
  columns[4] ps_comment
from `partsupp_text`;
create view customer as select
  cast(columns[0] as int) c_custkey, columns[1] c_name, columns[2] c_address,
  cast(columns[3] as int) c_nationkey, columns[4] c_phone,
  cast(columns[5] as double) c_acctbal, columns[6] c_mktsegment,
  columns[7] c_comment
from `customer_text`;
create view orders as select
  cast(columns[0] as int) o_orderkey, cast(columns[1] as int) o_custkey,
  columns[2] o_orderstatus, cast(columns[3] as double) o_totalprice,
  cast(columns[4] as date) o_orderdate, columns[5] o_orderpriority,
  columns[6] o_clerk, cast(columns[7] as int) o_shippriority,
  columns[8] o_comment
from `orders_text`;
create view lineitem as select
  cast(columns[0] as int) l_orderkey, cast(columns[1] as int) l_partkey,
  cast(columns[2] as int) l_suppkey, cast(columns[3] as int) l_linenumber,
  cast(columns[4] as double) l_quantity,
  cast(columns[5] as double) l_extendedprice,
  cast(columns[6] as double) l_discount, cast(columns[7] as double) l_tax,
  columns[8] l_returnflag, columns[9] l_linestatus,
  cast(columns[10] as date) l_shipdate, cast(columns[11] as date) l_commitdate,
  cast(columns[12] as date) l_receiptdate, columns[13] l_shipinstruct,
  columns[14] l_shipmode, columns[15] l_comment
from `lineitem_text`;

-- TPCH Query 3

select
  l.l_orderkey,
  sum(l.l_extendedprice * (1 - l.l_discount)) as revenue,
  o.o_orderdate,
  o.o_shippriority
from
  customer c,
  orders o,
  lineitem l
where
  c.c_mktsegment = 'HOUSEHOLD'
  and c.c_custkey = o.o_custkey
  and l.l_orderkey = o.o_orderkey
  and o.o_orderdate < date '1995-03-25'
  and l.l_shipdate > date '1995-03-25'
group by
  l.l_orderkey,
  o.o_orderdate,
  o.o_shippriority
order by
  revenue desc,
  o.o_orderdate
limit 10;
{code}

The cluster has 8 drillbits running with DRILL_MAX_DIRECT_MEMORY="32G". After 
the query has run for around 45 seconds, 3 of the 8 drillbits go down due to a 
JVM crash. I have attached the log files. Let me know if you need more 
information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
