Jesse Anderson created HIVE-6266:
------------------------------------
Summary: CTAS Properties Not Passed
Key: HIVE-6266
URL: https://issues.apache.org/jira/browse/HIVE-6266
Project: Hive
Issue Type: Bug
Components: Serializers/Deserializers
Affects Versions: 0.8.0
Reporter: Jesse Anderson
I am running a CTAS (CREATE TABLE AS SELECT) with custom SerDe properties that
control the output format. Here is the query:
{code}
CREATE TABLE calldataformat
ROW FORMAT SERDE
'com.loudacre.hiveserdebonus.solution.CallDetailSerDe'
WITH SERDEPROPERTIES
( "fixedwidth.regex" = "^(.{36})(.{17})(.{17})(.{10})(.{10})(.{10})$",
"fixedwidth.dateformat" = "yyyy-DDD kk:mm:ss" )
LOCATION
'/loudacre/calldataformat'
AS
SELECT call_id,
call_begin,
call_end,
status,
from_phone,
to_phone
FROM calldata
WHERE status <> 'SUCCESS';
{code}
The fixedwidth.regex and fixedwidth.dateformat properties are never passed to
the SerDe via the Properties object. I added logging to the initialize method
to log every property it receives. This is the logging output:
{noformat}
2014-01-22 14:53:35,110 INFO CallDetailSerDe: Key:name Value:default.calldataformat
2014-01-22 14:53:35,110 INFO CallDetailSerDe: Key:columns Value:_col0,_col1,_col2,_col3,_col4,_col5
2014-01-22 14:53:35,110 INFO CallDetailSerDe: Key:serialization.format Value:1
2014-01-22 14:53:35,110 INFO CallDetailSerDe: Key:columns.types Value:string:timestamp:timestamp:string:string:string
{noformat}
The workaround is a two-step process instead of a CTAS: create the table first,
then populate it with an INSERT INTO. This way, the properties are passed in
and all of the formatting is correct.
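A minimal sketch of that two-step workaround, reusing the table, SerDe class,
properties, and location from the query above. The column types are an
assumption inferred from the columns.types value in the log output
(string:timestamp:timestamp:string:string:string); adjust them if the source
table differs.
{code}
-- Step 1: create the table explicitly so the SERDEPROPERTIES are stored with it.
-- Column types are assumed from the logged columns.types value.
CREATE TABLE calldataformat (
  call_id    STRING,
  call_begin TIMESTAMP,
  call_end   TIMESTAMP,
  status     STRING,
  from_phone STRING,
  to_phone   STRING
)
ROW FORMAT SERDE
'com.loudacre.hiveserdebonus.solution.CallDetailSerDe'
WITH SERDEPROPERTIES
( "fixedwidth.regex" = "^(.{36})(.{17})(.{17})(.{10})(.{10})(.{10})$",
"fixedwidth.dateformat" = "yyyy-DDD kk:mm:ss" )
LOCATION
'/loudacre/calldataformat';

-- Step 2: populate the table with the same SELECT that the CTAS used.
INSERT INTO TABLE calldataformat
SELECT call_id,
call_begin,
call_end,
status,
from_phone,
to_phone
FROM calldata
WHERE status <> 'SUCCESS';
{code}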