[ 
https://issues.apache.org/jira/browse/HIVE-16667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16039623#comment-16039623
 ] 

Naveen Gangam commented on HIVE-16667:
--------------------------------------

[~rusanu] I have had a chance to test with VARCHAR in the JDO mappings file 
with Oracle, Postgres and Derby. Seems to be working without any issues.
I am uploading the patch with the changes. Could you please review it when you 
get a chance? Thanks

{code}
2017-06-06T15:50:41,716 DEBUG [pool-7-thread-4] metastore.MetaStoreDirectSql: 
Direct SQL query in 280.51028ms + 0.084425ms, the query is [select 
"PARTITIONS"."PART_ID" from "PARTITIONS"  inner join "TBLS" on 
"PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID"     and "TBLS"."TBL_NAME" = ?   inner 
join "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID"      and "DBS"."NAME" = ? ]
2017-06-06T15:50:41,858 DEBUG [pool-7-thread-4] metastore.MetaStoreDirectSql: 
Direct SQL query in 141.674264ms + 0.379869ms, the query is [select 
"PARTITIONS"."PART_ID", "SDS"."SD_ID", "SDS"."CD_ID", "SERDES"."SERDE_ID", 
"PARTITIONS"."CREATE_TIME", "PARTITIONS"."LAST_ACCESS_TIME", 
"SDS"."INPUT_FORMAT", "SDS"."IS_COMPRESSED", "SDS"."IS_STOREDASSUBDIRECTORIES", 
"SDS"."LOCATION", "SDS"."NUM_BUCKETS", "SDS"."OUTPUT_FORMAT", "SERDES"."NAME", 
"SERDES"."SLIB" from "PARTITIONS"  left outer join "SDS" on 
"PARTITIONS"."SD_ID" = "SDS"."SD_ID"   left outer join "SERDES" on 
"SDS"."SERDE_ID" = "SERDES"."SERDE_ID" where "PART_ID" in (1) order by 
"PART_NAME" asc]
2017-06-06T15:50:42,002 DEBUG [pool-7-thread-4] metastore.MetaStoreDirectSql: 
Direct SQL query in 142.592725ms + 0.294649ms, the query is [select "PART_ID", 
"PARAM_KEY", "PARAM_VALUE" from "PARTITION_PARAMS" where "PART_ID" in (1) and 
"PARAM_KEY" is not null order by "PART_ID" asc]
2017-06-06T15:50:42,142 DEBUG [pool-7-thread-4] metastore.MetaStoreDirectSql: 
Direct SQL query in 139.308419ms + 0.229592ms, the query is [select "PART_ID", 
"PART_KEY_VAL" from "PARTITION_KEY_VALS" where "PART_ID" in (1) and 
"INTEGER_IDX" >= 0 order by "PART_ID" asc, "INTEGER_IDX" asc]
2017-06-06T15:50:42,281 DEBUG [pool-7-thread-4] metastore.MetaStoreDirectSql: 
Direct SQL query in 139.136081ms + 0.092378ms, the query is [select "SD_ID", 
"PARAM_KEY", "PARAM_VALUE" from "SD_PARAMS" where "SD_ID" in (2) and 
"PARAM_KEY" is not null order by "SD_ID" asc]
2017-06-06T15:50:42,423 DEBUG [pool-7-thread-4] metastore.MetaStoreDirectSql: 
Direct SQL query in 141.601391ms + 0.08076ms, the query is [select "SD_ID", 
"COLUMN_NAME", "SORT_COLS"."ORDER" from "SORT_COLS" where "SD_ID" in (2) and 
"INTEGER_IDX" >= 0 order by "SD_ID" asc, "INTEGER_IDX" asc]
2017-06-06T15:50:42,564 DEBUG [pool-7-thread-4] metastore.MetaStoreDirectSql: 
Direct SQL query in 140.216113ms + 0.114448ms, the query is [select "SD_ID", 
"BUCKET_COL_NAME" from "BUCKETING_COLS" where "SD_ID" in (2) and "INTEGER_IDX" 
>= 0 order by "SD_ID" asc, "INTEGER_IDX" asc]
2017-06-06T15:50:42,703 DEBUG [pool-7-thread-4] metastore.MetaStoreDirectSql: 
Direct SQL query in 139.086542ms + 0.094296ms, the query is [select "SD_ID", 
"SKEWED_COL_NAME" from "SKEWED_COL_NAMES" where "SD_ID" in (2) and 
"INTEGER_IDX" >= 0 order by "SD_ID" asc, "INTEGER_IDX" asc]
2017-06-06T15:50:42,847 DEBUG [pool-7-thread-4] metastore.MetaStoreDirectSql: 
Direct SQL query in 142.786353ms + 0.433393ms, the query is [select "CD_ID", 
"COMMENT", "COLUMN_NAME", "TYPE_NAME" from "COLUMNS_V2" where "CD_ID" in (1) 
and "INTEGER_IDX" >= 0 order by "CD_ID" asc, "INTEGER_IDX" asc]
2017-06-06T15:50:42,990 DEBUG [pool-7-thread-4] metastore.MetaStoreDirectSql: 
Direct SQL query in 142.115654m
{code}

> PostgreSQL metastore handling of CLOB types for COLUMNS_V2.TYPE_NAME and 
> other field is incorrect
> -------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-16667
>                 URL: https://issues.apache.org/jira/browse/HIVE-16667
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Remus Rusanu
>            Assignee: Naveen Gangam
>         Attachments: HIVE-16667.patch, HiveCLIOutput.txt, PostgresDBOutput.txt
>
>
> The CLOB JDO type introduced with HIVE-12274 does not work correctly with 
> PostgreSQL. The value is written out-of-band and the LOB handle is written,as 
> an INT, into the table. SELECTs return the INT value, which should had been 
> read via the {{lo_get}} PG built-in, and then cast into string.
> Furthermore, the behavior is different between fields upgraded from earlier 
> metastore versions (they retain their string storage) vs. values inserted 
> after the upgrade (inserted as LOB roots).
> Teh code in 
> {{MetasoreDirectSql.getPartitionsFromPartitionIds/extractSqlClob}} expects 
> the underlying JDO/Datanucleus to map the column to a {{Clob}} but that does 
> not happen, the value is a Java String containing the int which is the LOB 
> root saved by PG.
> This manifests at runtime with errors like:
> {code}
> hive> select * from srcpart;
> Failed with exception java.io.IOException:java.lang.IllegalArgumentException: 
> Error: type expected at the position 0 of '24030:24031' but '24030' is found.
> {code}
> the 24030:24031 should be 'string:string'.
> repro:
> {code}
> CREATE TABLE srcpart (key STRING COMMENT 'default', value STRING COMMENT 
> 'default') PARTITIONED BY (ds STRING, hr STRING) STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH "${hiveconf:test.data.dir}/kv1.txt" OVERWRITE INTO 
> TABLE srcpart PARTITION (ds="2008-04-09", hr="11");
> select * from srcpart;
> {code}
> I did not see the issue being hit by non-partitioned/textfile tables, but 
> that is just the luck of the path taken by the code. Inspection of my PG 
> metastore shows all the CLOB fields suffering from this issue.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to