Thanks, looks good, nice job!

Best,
Jingsong Lee

On Fri, Apr 10, 2020 at 5:56 PM wangl...@geekplus.com.cn <
wangl...@geekplus.com.cn> wrote:

>
> https://issues.apache.org/jira/browse/FLINK-17086
>
> It is my first time creating a Flink JIRA issue.
> Please point it out and correct me if I wrote something wrong.
>
> Thanks,
> Lei
>
> ------------------------------
> wangl...@geekplus.com.cn
>
>
> *From:* Jingsong Li <jingsongl...@gmail.com>
> *Date:* 2020-04-10 11:03
> *To:* wangl...@geekplus.com.cn
> *CC:* Jark Wu <imj...@gmail.com>; lirui <li...@apache.org>; user
> <user@flink.apache.org>
> *Subject:* Re: Re: fink sql client not able to read parquet format table
> Hi lei,
>
> I think the reason is that our `HiveMapredSplitReader` does not support
> name-mapped reading for the parquet format.
> Can you create a JIRA for tracking this?
>
> Best,
> Jingsong Lee
>
> On Fri, Apr 10, 2020 at 9:42 AM wangl...@geekplus.com.cn <
> wangl...@geekplus.com.cn> wrote:
>
>>
>> I am using Hive 3.1.1
>> The table has many fields; each one corresponds to a field in the
>> RobotUploadData0101 class.
>>
>> CREATE TABLE `robotparquet`(`robotid` int,   `framecount` int,
>> `robottime` bigint,   `robotpathmode` int,   `movingmode` int,
>> `submovingmode` int,   `xlocation` int,   `ylocation` int,
>> `robotradangle` int,   `velocity` int,   `acceleration` int,
>> `angularvelocity` int,   `angularacceleration` int,   `literangle` int,
>> `shelfangle` int,   `onloadshelfid` int,   `rcvdinstr` int,   `sensordist`
>> int,   `pathstate` int,   `powerpresent` int,   `neednewpath` int,
>> `pathelenum` int,   `taskstate` int,   `receivedtaskid` int,
>> `receivedcommcount` int,   `receiveddispatchinstr` int,
>> `receiveddispatchcount` int,   `subtaskmode` int,   `versiontype` int,
>> `version` int,   `liftheight` int,   `codecheckstatus` int,
>> `cameraworkmode` int,   `backrimstate` int,   `frontrimstate` int,
>> `pathselectstate` int,   `codemisscount` int,   `groundcameraresult` int,
>> `shelfcameraresult` int,   `softwarerespondframe` int,   `paramstate` int,
>>   `pilotlamp` int,   `codecount` int,   `dist2waitpoint` int,
>> `targetdistance` int,   `obstaclecount` int,   `obstacleframe` int,
>> `cellcodex` int,   `cellcodey` int,   `cellangle` int,   `shelfqrcode` int,
>>   `shelfqrangle` int,   `shelfqrx` int,   `shelfqry` int,
>> `trackthetaerror` int,   `tracksideerror` int,   `trackfuseerror` int,
>> `lifterangleerror` int,   `lifterheighterror` int,   `linearcmdspeed` int,
>>   `angluarcmdspeed` int,   `liftercmdspeed` int,   `rotatorcmdspeed` int)
>> PARTITIONED BY (`hour` string) STORED AS parquet;
>>
>>
>> Thanks,
>> Lei
>> ------------------------------
>> wangl...@geekplus.com.cn
>>
>>
>> *From:* Jingsong Li <jingsongl...@gmail.com>
>> *Date:* 2020-04-09 21:45
>> *To:* wangl...@geekplus.com.cn
>> *CC:* Jark Wu <imj...@gmail.com>; lirui <li...@apache.org>; user
>> <user@flink.apache.org>
>> *Subject:* Re: Re: fink sql client not able to read parquet format table
>> Hi lei,
>>
>> Which hive version did you use?
>> Can you share the complete hive DDL?
>>
>> Best,
>> Jingsong Lee
>>
>> On Thu, Apr 9, 2020 at 7:15 PM wangl...@geekplus.com.cn <
>> wangl...@geekplus.com.cn> wrote:
>>
>>>
>>> I am using the newest 1.10 blink planner.
>>>
>>> Perhaps it is because of the method I used to write the parquet file.
>>>
>>> I receive Kafka messages, transform each message into a Java class object,
>>> write the objects to HDFS using StreamingFileSink, and add the HDFS path as
>>> a partition of the Hive table.
>>>
>>> No matter the order of the field declarations in the Hive DDL statement,
>>> the Hive client works, as long as the field names match the Java object's
>>> field names. But the Flink SQL client does not.
>>>
>>> DataStream<RobotUploadData0101> sourceRobot = source.map(x -> transform(x));
>>> final StreamingFileSink<RobotUploadData0101> sink = StreamingFileSink
>>>     .forBulkFormat(
>>>         new Path("hdfs://172.19.78.38:8020/user/root/wanglei/robotdata/parquet"),
>>>         ParquetAvroWriters.forReflectRecord(RobotUploadData0101.class))
>>>     .build();
>>>
>>> For example
>>> RobotUploadData0101 has two fields:  robotId int, robotTime long
>>>
>>> CREATE TABLE `robotparquet`(`robotid` int, `robottime` bigint) and
>>> CREATE TABLE `robotparquet`(`robottime` bigint, `robotid` int)
>>> are the same to the Hive client, but different to the Flink SQL client.
>>>
>>> Is this expected behavior?
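[Editor's note] The mismatch described above can be sketched in plain Python. This is a conceptual illustration only, not Flink's or Hive's actual reader code, and the sample values are made up: a positional reader pairs DDL column i with file column i, while a name-mapped reader looks the column up by name, so only the latter tolerates a DDL whose field order differs from the file's.

```python
# Columns in the order the file was written (taken from the Java class field order):
file_schema = ["robotId", "robotTime"]
file_row = [1291097, 1586400000000]  # values stored in file order

# Columns in the order the Hive DDL declares them (reversed relative to the file):
ddl_schema = ["robottime", "robotid"]

def read_positional(column):
    """Take file column i for DDL column i (index-based resolution)."""
    return file_row[ddl_schema.index(column)]

def read_by_name(column):
    """Find the file column whose name matches the DDL column (case-insensitive)."""
    names = [f.lower() for f in file_schema]
    return file_row[names.index(column.lower())]

# Positional reading silently pairs the wrong columns when the orders differ:
print(read_positional("robotid"))  # -> 1586400000000 (wrong: robotTime's value)
print(read_by_name("robotid"))     # -> 1291097 (correct)
```

With a name-mapped reader, both DDL orderings above return the same values; with a positional reader, only the DDL that happens to match the file's field order reads correctly.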
>>>
>>> Thanks,
>>> Lei
>>>
>>> ------------------------------
>>> wangl...@geekplus.com.cn
>>>
>>>
>>> *From:* Jark Wu <imj...@gmail.com>
>>> *Date:* 2020-04-09 14:48
>>> *To:* wangl...@geekplus.com.cn; Jingsong Li <jingsongl...@gmail.com>;
>>> lirui <li...@apache.org>
>>> *CC:* user <user@flink.apache.org>
>>> *Subject:* Re: fink sql client not able to read parquet format table
>>> Hi Lei,
>>>
>>> Are you using the newest 1.10 blink planner?
>>>
>>> I'm not familiar with Hive and parquet, but I know @Jingsong Li
>>> <jingsongl...@gmail.com> and @li...@apache.org <li...@apache.org> are
>>> experts on this. Maybe they can help on this question.
>>>
>>> Best,
>>> Jark
>>>
>>> On Tue, 7 Apr 2020 at 16:17, wangl...@geekplus.com.cn <
>>> wangl...@geekplus.com.cn> wrote:
>>>
>>>>
>>>> Hive table stored as parquet.
>>>>
>>>> Under hive client:
>>>> hive> select robotid from robotparquet limit 2;
>>>> OK
>>>> 1291097
>>>> 1291044
>>>>
>>>>
>>>> But under the Flink SQL client the result is 0:
>>>> Flink SQL> select robotid  from robotparquet limit 2;
>>>>                   robotid
>>>>                          0
>>>>                          0
>>>>
>>>> Any insight on this?
>>>>
>>>> Thanks,
>>>> Lei
>>>>
>>>>
>>>>
>>>> ------------------------------
>>>> wangl...@geekplus.com.cn
>>>>
>>>>
>>
>> --
>> Best, Jingsong Lee
>>
>>
>
> --
> Best, Jingsong Lee
>
>

-- 
Best, Jingsong Lee
