Abhishek Girish created DRILL-1562:
--------------------------------------

             Summary: Parquet Writer hangs when converting TPCH text data (SF100)
                 Key: DRILL-1562
                 URL: https://issues.apache.org/jira/browse/DRILL-1562
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - Parquet
            Reporter: Abhishek Girish
            Assignee: Jacques Nadeau


Converting TPCH text data into Parquet hangs. 

Table name: lineitem
Table size: ~80GB
Input format: psv ('|' separated)

Number of drillbits: 4
DRILL_MAX_DIRECT_MEMORY="64G"
DRILL_MAX_HEAP="32G"

Query:
> create table lineitem as select
. . . . . . . . . . . . . . . . . >     cast(columns[0] as int) l_orderkey,
. . . . . . . . . . . . . . . . . >     cast(columns[1] as int) l_partkey,
. . . . . . . . . . . . . . . . . >     cast(columns[2] as int) l_suppkey,
. . . . . . . . . . . . . . . . . >     cast(columns[3] as int) l_linenumber,
. . . . . . . . . . . . . . . . . >     cast(columns[4] as double) l_quantity,
. . . . . . . . . . . . . . . . . >     cast(columns[5] as double) l_extendedprice,
. . . . . . . . . . . . . . . . . >     cast(columns[6] as double) l_discount,
. . . . . . . . . . . . . . . . . >     cast(columns[7] as double) l_tax,
. . . . . . . . . . . . . . . . . >     cast(columns[8] as char(1)) l_returnflag,
. . . . . . . . . . . . . . . . . >     cast(columns[9] as char(1)) l_linestatus,
. . . . . . . . . . . . . . . . . >     cast(columns[10] as date) l_shipdate,
. . . . . . . . . . . . . . . . . >     cast(columns[11] as date) l_commitdate,
. . . . . . . . . . . . . . . . . >     cast(columns[12] as date) l_receiptdate,
. . . . . . . . . . . . . . . . . >     cast(columns[13] as char(25)) l_shipinstruct,
. . . . . . . . . . . . . . . . . >     cast(columns[14] as char(10)) l_shipmode,
. . . . . . . . . . . . . . . . . >     cast(columns[15] as varchar(200)) l_comment
. . . . . . . . . . . . . . . . . > from dfs.`/tpch-text/scale100/lineitem` lineitem;
+------------+---------------------------+
|  Fragment  | Number of records written |
+------------+---------------------------+
| 1_58       | 4072947                   |
| 1_90       | 4088667                   |
| 1_38       | 4072639                   |
...
...
| 1_14       | 6109440                   |
<hangs>
...

The Drillbit endpoint gets set to null, and the point at which the query 
hangs varies from run to run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
