Re: flink 1.12.2 sql-cli write to Hive fails with is_generic

Could you share the exact SQL statements (both the DDL and the INSERT)?

On Wed, Apr 21, 2021 at 5:46 PM HunterXHunter <1356469...@qq.com> wrote:
> I set the watermark in the DDL, but on the job page the watermark is never updated.
>
> Sent from: http://apache-flink.147419.n8.nabble.com/

--
Best regards!
Rui Li
Re: flink 1.12.2 sql-cli write to Hive fails with is_generic

I set the watermark in the DDL, but on the job page the watermark is never updated.
Re: flink 1.12.2 sql-cli write to Hive fails with is_generic

With partition-time, the commit is triggered by comparing the watermark against the timestamp extracted from the partition values, so your source also needs to emit a watermark.

On Fri, Apr 16, 2021 at 9:32 AM HunterXHunter <1356469...@qq.com> wrote:
> With process-time there is data, but with partition-time I have never managed to write any data out.

--
Best regards!
Rui Li
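The comparison described above can be sketched as follows; a minimal sketch, assuming a Kafka JSON source (the table names, topic, and bootstrap servers are placeholders), based on the partition-commit options in the Flink Hive connector documentation:

```sql
-- The source must declare a watermark: a partition-time commit fires only
-- once watermark >= partition timestamp + sink.partition-commit.delay.
SET table.sql-dialect=default;
CREATE TABLE kafka_src (
  sys_time BIGINT,
  rt AS TO_TIMESTAMP(FROM_UNIXTIME(sys_time / 1000)),
  WATERMARK FOR rt AS rt - INTERVAL '5' SECOND
) WITH (
  'connector' = 'kafka',
  'topic' = 'placeholder_topic',
  'properties.bootstrap.servers' = 'localhost:9092',
  'format' = 'json'
);

SET table.sql-dialect=hive;
CREATE TABLE hive_sink (
  sys_time BIGINT
) PARTITIONED BY (dt STRING, hr STRING) STORED AS orc TBLPROPERTIES (
  -- the pattern tells the sink how to turn (dt, hr) back into a timestamp
  'partition.time-extractor.timestamp-pattern' = '$dt $hr:00:00',
  'sink.partition-commit.trigger' = 'partition-time',
  'sink.partition-commit.delay' = '1 h',
  'sink.partition-commit.policy.kind' = 'metastore,success-file'
);
```

With the trigger set to partition-time and a delay of '1 h', the partition for hour H commits only after the source watermark passes H plus one hour, i.e. the end of that hour; process-time, by contrast, commits purely on wall-clock time regardless of watermarks.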
Re: flink 1.12.2 sql-cli write to Hive fails with is_generic

With process-time there is data, but with partition-time I have never managed to write any data out.
Re: flink 1.12.2 sql-cli write to Hive fails with is_generic

I found this in the official documentation: "IMPORTANT: Checkpointing needs to be enabled when using the FileSink in streaming mode; writes are finalized on each checkpoint. If checkpointing is disabled, part files will stay in the 'in-progress' or 'pending' state forever and cannot be safely read by downstream systems." So a checkpoint is required. But after I manually triggered a savepoint, a success file did appear, yet there is still no data.
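Checkpointing can be enabled from the SQL client session itself via the standard execution.checkpointing.interval option; a minimal sketch (the 60 s interval is an arbitrary example value):

```sql
-- In sql-client: enable periodic checkpoints so in-progress part files
-- get finalized and partitions can actually be committed.
SET 'execution.checkpointing.interval' = '60 s';
```

Without a regular checkpoint interval, a streaming INSERT into a Hive table leaves its output in in-progress part files indefinitely.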
Re: flink 1.12.2 sql-cli write to Hive fails with is_generic

Looking at the HDFS files: the partition always contains just this single file, and no _SUCCESS file is ever generated:

.part-40a2c94d-0437-4666-8d43-31c908aaa02e-0-0.inprogress.73dcc10b-44f4-47e3-abac-0c14bd59f9c9
Re: flink 1.12.2 sql-cli write to Hive fails with is_generic

Hi, that problem has been solved. I am now following the official example:

SET table.sql-dialect=default;
create table flink_kafka (
  sys_time bigint,
  rt AS TO_TIMESTAMP(FROM_UNIXTIME(sys_time / 1000, 'yyyy-MM-dd HH:mm:ss')),
  WATERMARK FOR rt AS rt - INTERVAL '5' SECOND
) WITH (
  'connector' = 'kafka',
  'topic' = 'xx',
  'scan.startup.mode' = 'latest-offset',
  'properties.bootstrap.servers' = '',
  'properties.group.id' = 'test-sql',
  'format' = 'json',
  'json.ignore-parse-errors' = 'true'
);

SET table.sql-dialect=hive;
CREATE TABLE hive_table (
  sys_time bigint
) PARTITIONED BY (dt STRING, hr STRING) STORED AS orc TBLPROPERTIES (
  'partition.time-extractor.timestamp-pattern'='$dt $hr:00:00',
  'sink.partition-commit.trigger'='process-time',
  'sink.partition-commit.delay'='0s',
  'sink.partition-commit.policy.kind'='metastore,success-file'
);

INSERT INTO hive_table
SELECT sys_time, DATE_FORMAT(rt, 'yyyy-MM-dd') as dt, DATE_FORMAT(rt, 'HH') as hr
FROM flink_kafka;

But no data is ever written to Hive. The job reports no error, and select * from flink_kafka; does return data, yet hive_table stays empty. I have sent data for various time ranges, so the watermark should have passed the partition time, but hive_table still has no data.
Re: flink 1.12.2 sql-cli write to Hive fails with is_generic

Hi, I could not reproduce the problem with the DDL you provided. Could you give more detailed steps? Also, if the Kafka table was created via create table like, there is a known issue: https://issues.apache.org/jira/browse/FLINK-21660

On Thu, Apr 1, 2021 at 4:08 PM HunterXHunter <1356469...@qq.com> wrote:
> After configuring the HiveCatalog, the SQL-Cli can also see the Hive databases and tables.
> I create the Kafka table:
>
> create table test.test_kafka (
>   word VARCHAR
> ) WITH (
>   'connector' = 'kafka',
>   'topic' = 'xx',
>   'scan.startup.mode' = 'latest-offset',
>   'properties.bootstrap.servers' = 'xx',
>   'properties.group.id' = 'test',
>   'format' = 'json',
>   'json.ignore-parse-errors' = 'true'
> );
>
> In Hive the table is visible:
>
> hive> DESCRIBE FORMATTED test_kafka
> ...
> is_generic true
> ...
>
> But in Flink SQL, select * from test.test_kafka; fails with:
>
> org.apache.flink.table.api.ValidationException: Unsupported options found
> for connector 'kafka'.
> Unsupported options:
> is_generic
> Supported options:
> connector
> format
> json.fail-on-missing-field
> json.ignore-parse-errors

--
Best regards!
Rui Li
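If the error does come from the is_generic property being persisted into (and read back from) the Hive metastore, one way to side-step the metastore entirely is to declare the Kafka table as a temporary table; this is a sketch of a possible workaround, not a confirmed fix for this thread, and the connector options are copied from the report above:

```sql
-- Temporary tables are session-scoped and are not stored in the
-- HiveCatalog's metastore, so no Hive-side table properties such as
-- is_generic are ever written or read back.
CREATE TEMPORARY TABLE test_kafka (
  word VARCHAR
) WITH (
  'connector' = 'kafka',
  'topic' = 'xx',
  'scan.startup.mode' = 'latest-offset',
  'properties.bootstrap.servers' = 'xx',
  'properties.group.id' = 'test',
  'format' = 'json',
  'json.ignore-parse-errors' = 'true'
);
```

The trade-off is that the table definition disappears when the session ends and must be re-declared each time.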