[ 
https://issues.apache.org/jira/browse/TRAFODION-3282?focusedWorklogId=208077&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-208077
 ]

ASF GitHub Bot logged work on TRAFODION-3282:
---------------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Mar/19 21:22
            Start Date: 05/Mar/19 21:22
    Worklog Time Spent: 10m 
      Work Description: DaveBirdsall commented on pull request #1809: 
[TRAFODION-3282] Fix buffer overrun in ExHdfsScanTcb::work
URL: https://github.com/apache/trafodion/pull/1809
 
 
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 208077)
    Time Spent: 40m  (was: 0.5h)

> Buffer overrun in ExHdfsScan::work in certain conditions
> --------------------------------------------------------
>
>                 Key: TRAFODION-3282
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-3282
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-exe
>    Affects Versions: 2.4
>            Reporter: David Wayne Birdsall
>            Assignee: David Wayne Birdsall
>            Priority: Major
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> If we have a large enough Hive text table with string columns, the string 
> columns have values longer than CQD HIVE_MAX_STRING_LENGTH_IN_BYTES, and 
> there is no external table definition that gives longer column sizes, we 
> may core in ExHdfsScanTcb::work with a buffer overrun.
> The following test case reproduces the behavior.
> First, use the following python script, called datagen.py:
> {quote}#! /usr/bin/env python
> import sys
> if len(sys.argv) != 5 or \
>    sys.argv[1].lower() == '-h' or \
>    sys.argv[1].lower() == '-help':
>     print 'Usage: ' + sys.argv[0] + ' <file> <num of rows> <num of varchar columns> <varchar column length>'
>     sys.exit()
> f = open(sys.argv[1], "w+")
> marker = list('ABCDEFGHIJKLMNOPQRSTUVWXYZ')
> for num_rows in range(0, int(sys.argv[2])):
>     f.write(str(num_rows) + '|')
>     for num_cols in range(0, int(sys.argv[3])):
>         f.write(marker[num_rows % len(marker)])
>         for i in range(1, int(sys.argv[4])):
>             f.write(str(i % 10))
>         f.write('|')
>     f.write(str(num_rows))
>     f.write('\n')
> f.close()
> {quote}
> Run this script as follows:
> {quote}chmod 755 ./datagen.py
> ./datagen.py ./data_lgvc.10rows_512KB.txt 10 2 524288
> {quote}
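For reference, the row layout written by datagen.py makes the generated file's size easy to estimate. A quick sketch of the arithmetic, using the parameters from the command above (the per-row layout of id|col1|col2|id is taken from the script):

```python
# Expected size of ./data_lgvc.10rows_512KB.txt as produced by:
#   ./datagen.py ./data_lgvc.10rows_512KB.txt 10 2 524288
num_rows, num_cols, col_len = 10, 2, 524288

total = 0
for r in range(num_rows):
    # leading id + '|', then each column value (col_len bytes) + '|',
    # then a trailing copy of the id and a newline
    total += len(str(r)) + 1 + num_cols * (col_len + 1) + len(str(r)) + 1
```

Each row is therefore about 1 MB (two 512 KB string columns), and the whole file about 10 MB.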
> Next, perform the following commands in a Hive shell:
> {quote}drop table if exists lgvc_base_table;
> create table lgvc_base_table(c_int int, c_string1 string, c_string2 string, p_int int) row format delimited fields terminated by '|';
> load data local inpath './data_lgvc.10rows_512KB.txt' overwrite into table lgvc_base_table;
> {quote}
> Finally, do the following in sqlci:
> {quote}CQD HDFS_IO_BUFFERSIZE '2048';
> prepare s1 from select * from hive.hive.lgvc_base_table where c_int > 10;
> execute s1;
> {quote}
> (The point of the CQD is to reduce the HDFS read buffer size from its 
> default of 65Mb to 2Mb, so that the test fails with a smaller input 
> file.)
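The arithmetic behind the CQD choice can be sketched as follows. The row size expression assumes the id|col1|col2|id layout written by datagen.py; that rows straddling buffer boundaries exercise the failing copy path is an inference from the stack trace below, not a statement from the source code:

```python
# Why CQD HDFS_IO_BUFFERSIZE '2048' (KB) exposes the bug: ~1 MB rows do not
# tile evenly into a 2 MB buffer, so nearly every buffer ends mid-row and
# forces a partial-row copy in ExHdfsScanTcb::work.
buffer_bytes = 2048 * 1024                        # 2 MB read buffer
row_bytes = 1 + 1 + 2 * (524288 + 1) + 1 + 1      # one generated row, ~1 MB

whole_rows = buffer_bytes // row_bytes            # rows that fit completely
leftover = buffer_bytes - whole_rows * row_bytes  # partial row at buffer end
```

Only one complete row fits per buffer, and the remainder is a partial row that must be stitched together with the next read.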
> When this test case is run, we get a core with the following stack trace:
> {quote}(gdb) bt
> #0 0x00007ffff5116495 in raise () from /lib64/libc.so.6
> #1 0x00007ffff5117c75 in abort () from /lib64/libc.so.6
> #2 0x00007ffff6f02935 in ?? ()
>  from /usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/lib/amd64/server/libjvm.so
> #3 0x00007ffff707bfdf in ?? ()
>  from /usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/lib/amd64/server/libjvm.so
> #4 0x00007ffff6f077c2 in JVM_handle_linux_signal ()
>  from /usr/lib/jvm/java-1.7.0-openjdk.x86_64/jre/lib/amd64/server/libjvm.so
> #5 <signal handler called>
> #6 0x00007ffff516d753 in memcpy () from /lib64/libc.so.6
> #7 0x00007ffff35b4dd5 in ExHdfsScanTcb::work (this=0x7ffff7e99148)
>  at ../executor/ExHdfsScan.cpp:601
> #8 0x00007ffff333d7a1 in ex_tcb::sWork (tcb=0x7ffff7e99148)
>  at ../executor/ex_tcb.h:102
> #9 0x00007ffff350dba7 in ExSubtask::work (this=0x7ffff7e99ad0)
>  at ../executor/ExScheduler.cpp:757
> #10 0x00007ffff350cbf1 in ExScheduler::work (this=0x7ffff7e98cb0, 
> prevWaitTime=
>  0) at ../executor/ExScheduler.cpp:280
> #11 0x00007ffff33a41c7 in ex_root_tcb::execute (this=0x7ffff7e99b78, 
>  cliGlobals=0xba5970, glob=0x7ffff7ea5d40, input_desc=0x7ffff7ee1178, 
>  diagsArea=@0x7ffffffee020, reExecute=0) at ../executor/ex_root.cpp:928
> #12 0x00007ffff4e4c452 in Statement::execute (this=0x7ffff7e84f40, cliGlobals=
>  0xba5970, input_desc=0x7ffff7ee1178, diagsArea=..., execute_state=
> ---Type <return> to continue, or q <return> to quit---q
> Statement:Quit
> (gdb)
> {quote}
>  
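The failure mode described above can be modeled in a few lines. This is a hypothetical Python sketch, not the actual ExHdfsScan.cpp code: it assumes the scanner sizes its target field from HIVE_MAX_STRING_LENGTH_IN_BYTES (taken here as 32000, an assumed default) while the copy length comes from the source value, and shows the clamped copy a fix would use.

```python
MAX_STRING_LEN = 32000     # assumed CQD HIVE_MAX_STRING_LENGTH_IN_BYTES default
ACTUAL_VALUE_LEN = 524288  # column value length produced by datagen.py

def unsafe_copy(dst, src):
    """Models memcpy(dst, src, len(src)): the copy length is taken from the
    source, so a value longer than the target overruns it."""
    if len(src) > len(dst):
        raise MemoryError('overrun: %d bytes into a %d-byte buffer'
                          % (len(src), len(dst)))
    dst[:len(src)] = src

def safe_copy(dst, src):
    """Models the fix: clamp the copy to the target's declared size."""
    n = min(len(dst), len(src))
    dst[:n] = src[:n]
    return n

target = bytearray(MAX_STRING_LEN)
source = b'1' * ACTUAL_VALUE_LEN
```

In the real C++ executor the unclamped copy silently corrupts adjacent memory rather than raising an error; the MemoryError here only marks where the overrun would occur.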



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
