https://issues.apache.org/jira/browse/FLINK-17086

This is my first time creating a Flink JIRA issue. 
Please point out and correct anything I wrote wrong.

Thanks,
Lei



wangl...@geekplus.com.cn 

 
From: Jingsong Li
Date: 2020-04-10 11:03
To: wangl...@geekplus.com.cn
CC: Jark Wu; lirui; user
Subject: Re: Re: fink sql client not able to read parquet format table
Hi lei,

I think the reason is that our `HiveMapredSplitReader` does not support 
name-mapping reads for the parquet format.
Can you create a JIRA for tracking this?

Best,
Jingsong Lee

On Fri, Apr 10, 2020 at 9:42 AM wangl...@geekplus.com.cn 
<wangl...@geekplus.com.cn> wrote:

I am using Hive 3.1.1.
The table has many fields; each field corresponds to a field in the 
RobotUploadData0101 class.

CREATE TABLE `robotparquet`(
  `robotid` int, `framecount` int, `robottime` bigint, `robotpathmode` int,
  `movingmode` int, `submovingmode` int, `xlocation` int, `ylocation` int,
  `robotradangle` int, `velocity` int, `acceleration` int,
  `angularvelocity` int, `angularacceleration` int, `literangle` int,
  `shelfangle` int, `onloadshelfid` int, `rcvdinstr` int, `sensordist` int,
  `pathstate` int, `powerpresent` int, `neednewpath` int, `pathelenum` int,
  `taskstate` int, `receivedtaskid` int, `receivedcommcount` int,
  `receiveddispatchinstr` int, `receiveddispatchcount` int,
  `subtaskmode` int, `versiontype` int, `version` int, `liftheight` int,
  `codecheckstatus` int, `cameraworkmode` int, `backrimstate` int,
  `frontrimstate` int, `pathselectstate` int, `codemisscount` int,
  `groundcameraresult` int, `shelfcameraresult` int,
  `softwarerespondframe` int, `paramstate` int, `pilotlamp` int,
  `codecount` int, `dist2waitpoint` int, `targetdistance` int,
  `obstaclecount` int, `obstacleframe` int, `cellcodex` int, `cellcodey` int,
  `cellangle` int, `shelfqrcode` int, `shelfqrangle` int, `shelfqrx` int,
  `shelfqry` int, `trackthetaerror` int, `tracksideerror` int,
  `trackfuseerror` int, `lifterangleerror` int, `lifterheighterror` int,
  `linearcmdspeed` int, `angluarcmdspeed` int, `liftercmdspeed` int,
  `rotatorcmdspeed` int)
PARTITIONED BY (`hour` string)
STORED AS parquet;


Thanks,
Lei


wangl...@geekplus.com.cn 

 
From: Jingsong Li
Date: 2020-04-09 21:45
To: wangl...@geekplus.com.cn
CC: Jark Wu; lirui; user
Subject: Re: Re: fink sql client not able to read parquet format table
Hi lei,

Which hive version did you use?
Can you share the complete hive DDL?

Best,
Jingsong Lee

On Thu, Apr 9, 2020 at 7:15 PM wangl...@geekplus.com.cn 
<wangl...@geekplus.com.cn> wrote:

I am using the newest 1.10 blink planner.

Perhaps it is because of the method I used to write the parquet file.

I receive Kafka messages, transform each message into a Java object, write the 
objects to HDFS using StreamingFileSink, and add the HDFS path as a partition 
of the Hive table.

No matter what order the fields are declared in the Hive DDL statement, the 
Hive client works, as long as the field names match the Java object's field 
names.
But the Flink SQL client does not work.

DataStream<RobotUploadData0101> sourceRobot = source.map(x -> transform(x));
final StreamingFileSink<RobotUploadData0101> sink;
sink = StreamingFileSink
    .forBulkFormat(
        new Path("hdfs://172.19.78.38:8020/user/root/wanglei/robotdata/parquet"),
        ParquetAvroWriters.forReflectRecord(RobotUploadData0101.class))
    .build();
sourceRobot.addSink(sink);

For example, suppose RobotUploadData0101 has two fields: robotId int, robotTime long.

CREATE TABLE `robotparquet`(  `robotid` int,  `robottime` bigint ) and 
CREATE TABLE `robotparquet`(  `robottime` bigint,   `robotid` int)
behave the same in the Hive client, but differently in the Flink SQL client.
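
To make the difference between the two DDL statements concrete, here is a minimal, self-contained sketch of name-based versus position-based column resolution. It is not Flink's or Hive's actual reader code. The class name, the method names, and the assumed field order inside the parquet file are all hypothetical and only illustrate why DDL column order matters for one strategy and not the other.

```java
import java.util.Arrays;
import java.util.List;

public class ColumnMappingSketch {

    /** Resolve each table column to a file column index by name (case-insensitive). */
    static int[] mapByName(List<String> tableColumns, List<String> fileColumns) {
        int[] mapping = new int[tableColumns.size()];
        for (int i = 0; i < tableColumns.size(); i++) {
            mapping[i] = -1; // -1 means no file column has that name
            for (int j = 0; j < fileColumns.size(); j++) {
                if (fileColumns.get(j).equalsIgnoreCase(tableColumns.get(i))) {
                    mapping[i] = j;
                    break;
                }
            }
        }
        return mapping;
    }

    /** Resolve each table column to the file column at the same position. */
    static int[] mapByPosition(List<String> tableColumns, List<String> fileColumns) {
        int[] mapping = new int[tableColumns.size()];
        for (int i = 0; i < tableColumns.size(); i++) {
            mapping[i] = i < fileColumns.size() ? i : -1;
        }
        return mapping;
    }

    public static void main(String[] args) {
        // Assumption: the writer happened to store the fields in this order.
        List<String> fileColumns = Arrays.asList("robotTime", "robotId");
        // Column order as declared in the first DDL statement above.
        List<String> ddlColumns = Arrays.asList("robotid", "robottime");

        // By name, each table column finds its own file column regardless of order.
        System.out.println(Arrays.toString(mapByName(ddlColumns, fileColumns)));     // [1, 0]
        // By position, robotid is read from the file's first column (robotTime).
        System.out.println(Arrays.toString(mapByPosition(ddlColumns, fileColumns))); // [0, 1]
    }
}
```

With name-based resolution, either DDL ordering reads the same data. With position-based resolution, the result depends on whether the DDL order happens to match the order in the file.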

Is this expected behavior?

Thanks,
Lei



wangl...@geekplus.com.cn 

 
From: Jark Wu
Date: 2020-04-09 14:48
To: wangl...@geekplus.com.cn; Jingsong Li; lirui
CC: user
Subject: Re: fink sql client not able to read parquet format table
Hi Lei,

Are you using the newest 1.10 blink planner? 

I'm not familiar with Hive and parquet, but I know @Jingsong Li and 
@li...@apache.org are experts on this. Maybe they can help with this question. 

Best,
Jark

On Tue, 7 Apr 2020 at 16:17, wangl...@geekplus.com.cn 
<wangl...@geekplus.com.cn> wrote:

I have a Hive table stored as parquet.

Under the Hive client: 
hive> select robotid from robotparquet limit 2;
OK
1291097
1291044


But under the Flink SQL client, the result is 0:
Flink SQL> select robotid  from robotparquet limit 2;
                  robotid
                         0
                         0

Any insight on this?

Thanks,
Lei





wangl...@geekplus.com.cn 



-- 
Best, Jingsong Lee

