[ https://issues.apache.org/jira/browse/FLINK-29013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
luoyuxia updated FLINK-29013:
-----------------------------
    Description: 
It'll cause an NPE when using the following code:

{code:java}
tableEnv.executeSql(
        "INSERT OVERWRITE TABLE dest1\n"
                + "SELECT TRANSFORM(*)\n"
                + " USING 'cat'\n"
                + " AS mydata STRING\n"
                + " ROW FORMAT SERDE\n"
                + " 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'\n"
                + " WITH SERDEPROPERTIES (\n"
                + " 'serialization.last.column.takes.rest'='true'\n"
                + " )\n"
                + " RECORDREADER 'org.apache.hadoop.hive.ql.exec.BinaryRecordReader'\n"
                + "FROM src");
{code}

The NPE is thrown in:

{code:java}
// HiveScriptTransformOutReadThread
recordReader.next(reusedWritableObject);
{code}

For BinaryRecordReader, we should first call BinaryRecordReader#createRow to do the initialization:

{code:java}
// BinaryRecordReader
public Writable createRow() throws IOException {
  bytes = new BytesWritable();
  bytes.setCapacity(maxRecordLength);
  return bytes;
}
{code}

Otherwise, the following code throws the exception, because {{bytes}} is still null:

{code:java}
// BinaryRecordReader
public int next(Writable row) throws IOException {
  int recordLength = in.read(bytes.get(), 0, maxRecordLength);
  if (recordLength >= 0) {
    bytes.setSize(recordLength);
  }
  return recordLength;
}
{code}

> Fail to use "transform using" when record reader is binary record reader in Hive dialect
> ----------------------------------------------------------------------------------------
>
>                 Key: FLINK-29013
>                 URL: https://issues.apache.org/jira/browse/FLINK-29013
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / Hive
>    Affects Versions: 1.16.0
>            Reporter: luoyuxia
>            Priority: Major
>             Fix For: 1.16.0
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
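The initialize-before-read contract behind the bug can be modeled with a small self-contained sketch. This is a toy model only: {{ToyBinaryRecordReader}} and its plain {{byte[]}} buffer are illustrative stand-ins, not the real {{org.apache.hadoop.hive.ql.exec.BinaryRecordReader}} or {{BytesWritable}}. Skipping {{createRow()}} leaves the buffer null, so {{next()}} throws an NPE; calling {{createRow()}} first makes the read succeed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class CreateRowFirst {

    // Simplified stand-in for a binary record reader: a plain byte[] models
    // the buffer that only createRow() allocates in the real Hive reader.
    static class ToyBinaryRecordReader {
        private final InputStream in;
        private final int maxRecordLength = 16;
        private byte[] bytes; // stays null until createRow() runs

        ToyBinaryRecordReader(InputStream in) {
            this.in = in;
        }

        byte[] createRow() {
            bytes = new byte[maxRecordLength];
            return bytes;
        }

        int next() throws IOException {
            // Dereferences 'bytes': throws NullPointerException if createRow() was skipped.
            return in.read(bytes, 0, maxRecordLength);
        }
    }

    static String runDemo() throws IOException {
        byte[] data = "hello".getBytes();
        StringBuilder log = new StringBuilder();

        // Buggy order: next() before createRow() -> NPE, mirroring the reported failure.
        ToyBinaryRecordReader buggy = new ToyBinaryRecordReader(new ByteArrayInputStream(data));
        try {
            buggy.next();
            log.append("no exception\n");
        } catch (NullPointerException e) {
            log.append("NPE without createRow()\n");
        }

        // Fixed order: allocate the reused row first, then read into it.
        ToyBinaryRecordReader fixed = new ToyBinaryRecordReader(new ByteArrayInputStream(data));
        byte[] row = fixed.createRow();
        int n = fixed.next();
        log.append("read ").append(n).append(" bytes: ")
           .append(new String(Arrays.copyOf(row, n)));
        return log.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.print(runDemo());
    }
}
```

The fix side of the sketch suggests the shape of the change: obtain the reused writable from the reader's own {{createRow()}} before entering the read loop, rather than reusing an uninitialized object.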