[ 
https://issues.apache.org/jira/browse/HAWQ-280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruilong Huo updated HAWQ-280:
-----------------------------
    Affects Version/s: 2.0.0-beta-incubating

> Error accessing external table or copying from file with bad rows
> -----------------------------------------------------------------
>
>                 Key: HAWQ-280
>                 URL: https://issues.apache.org/jira/browse/HAWQ-280
>             Project: Apache HAWQ
>          Issue Type: Bug
>          Components: External Tables
>    Affects Versions: 2.0.0-beta-incubating
>            Reporter: Ruilong Huo
>            Assignee: Lei Chang
>
> The statement errors out without returning a result when accessing an 
> external table or copying from a file with bad rows.
> 1. Error accessing external table with bad rows
> {noformat}
> Step 1: download the attached test.csv, which has 2000 rows, all badly formatted
> Step 2: start gpfdist service
> gpfdist -d /home/gpadmin/data/ -p 8081 -l /home/gpadmin/log/load.log &
> ------------------------------------------------------------------------------------------------
> [1] 34635
> Serving HTTP on port 8081, directory /home/gpadmin/data
> Step 3: create external table
> CREATE EXTERNAL TABLE test_ext (id INT, a TEXT, b TEXT, c TEXT, z TEXT)
> LOCATION ('gpfdist://localhost:8081/test.csv')
> FORMAT 'CSV'
> LOG ERRORS INTO test_ext_err SEGMENT REJECT LIMIT 3000 ROWS;
> -----------------------------------------------------------------------------------------------------
> NOTICE:  Error table "test_ext_err" does not exist. Auto generating an error 
> table with the same name
> CREATE EXTERNAL TABLE
> Step 4: access external table
> SELECT COUNT(*) FROM test_ext;
> -------------------------------------------------
> ERROR:  All 1000 first rows in this segment were rejected. Aborting operation 
> regardless of REJECT LIMIT value. Last error was: missing data for column "z" 
>  (seg0 localhost:40000 pid=35647)
> DETAIL:  External table test_ext, line 1000 of 
> gpfdist://localhost:8081/test.csv: "29,aaa,bbb,zzz"
> {noformat}
> 2. Error copying from file with bad rows
> {noformat}
> Step 1: download the attached test.csv, which has 2000 rows, all badly formatted
> Step 2: create table
> CREATE TABLE test_copy (id INT, a TEXT, b TEXT, c TEXT, z TEXT);
> ------------------------------------------------------------------------------------------------
> CREATE TABLE
> Step 3: copy data in file to table in database
> COPY test_copy FROM '/Users/intern/Downloads/test.csv' LOG ERRORS INTO 
> test_copy_err SEGMENT REJECT LIMIT 3000 ROWS;
> --------------------------------------------------------------------------------------------------------
> NOTICE:  Error table "test_copy_err" does not exist. Auto generating an error 
> table with the same name
> WARNING:  The error table was created in the same transaction as this 
> operation. It will get dropped if transaction rolls back even if bad rows are 
> present
> HINT:  To avoid this create the error table ahead of time using: CREATE TABLE 
> <name> (cmdtime timestamp with time zone, relname text, filename text, 
> linenum integer, bytenum integer, errmsg text, rawdata text, rawbytes bytea)
> ERROR:  All 1000 first rows in this segment were rejected. Aborting operation 
> regardless of REJECT LIMIT value. Last error was: missing data for column "a"
> CONTEXT:  COPY test_copy, line 1000: "29,aaa,bbb,zzz"
> {noformat}
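
The attached test.csv is not included in this message. As a rough stand-in for reproducing both cases, the sketch below generates 2000 rows that each carry four fields instead of the five the tables expect. It is only an assumption about the attachment: the single row known from the error output is "29,aaa,bbb,zzz", and the helper name make_bad_rows and its parameter are hypothetical.

```python
import csv
import io


def make_bad_rows(n_rows=2000):
    """Return CSV text in which every row has 4 fields.

    The tables in this report have 5 columns (id, a, b, c, z), so each
    generated row is one field short. The row layout is an assumption
    modelled on the sample row "29,aaa,bbb,zzz" from the error output.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for i in range(1, n_rows + 1):
        # id plus three text fields; the fifth field is deliberately absent
        writer.writerow([i, "aaa", "bbb", "zzz"])
    return buf.getvalue()
```

Writing the returned text to test.csv in the directory served by gpfdist (or at the path given to COPY) should give a file on which every row is rejected, which is the condition described in both cases above.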
