Flink and Hive integration problems

2019-05-13 Thread Yaoting Gong
Hi all, I'm new to big data. I've gotten familiar with the Flink Stream API and used it to process Kafka, ES, and HBase. Now I'm evaluating Flink SQL and have run into a problem: we need to support joins across multiple data sources, Hive in particular. The goal is a small platform where a new job only needs a configuration entry. We are on Flink 1.7.1. The trouble is the Hive interaction: connecting to Hive over JDBC will certainly not perform well enough, but reading the underlying HDFS files directly seems to require the batch environment, which conflicts with the stream I need to join against.
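A common workaround in the Flink 1.7 era, before native Hive integration, is to treat the Hive table as a slowly changing dimension table: bulk-load it inside a streaming operator and refresh it on an interval, so the join stays in the stream environment and no per-record JDBC call is made. A minimal sketch; `Order`, its fields, and `loadDimensionTable()` are hypothetical stand-ins:

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

// Made-up stand-in for the stream's event type.
class Order {
    public String productId;
    public String category;
}

// Pass-through enrichment: joins each record against a Hive-backed dimension
// table that is bulk-loaded in open() and refreshed on an interval, instead
// of paying a JDBC round trip per record.
public class HiveDimensionJoin extends RichMapFunction<Order, Order> {

    private static final long REFRESH_INTERVAL_MS = 10 * 60 * 1000; // 10 min

    private transient Map<String, String> productToCategory;
    private transient long lastRefresh;

    @Override
    public void open(Configuration parameters) throws Exception {
        productToCategory = loadDimensionTable();
        lastRefresh = System.currentTimeMillis();
    }

    @Override
    public Order map(Order order) throws Exception {
        long now = System.currentTimeMillis();
        if (now - lastRefresh > REFRESH_INTERVAL_MS) {
            productToCategory = loadDimensionTable(); // periodic refresh
            lastRefresh = now;
        }
        order.category = productToCategory.getOrDefault(order.productId, "unknown");
        return order;
    }

    // Hypothetical bulk loader: one scan of the Hive table per refresh, e.g.
    // by reading its files on HDFS or issuing a single HiveServer2 query.
    private Map<String, String> loadDimensionTable() {
        Map<String, String> table = new HashMap<>();
        // ... fill productId -> category from Hive here ...
        return table;
    }
}
```

This keeps lookups in memory on the hot path; it only fits dimension tables small enough to hold per parallel subtask.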

Re: Reconstruct object through partial select query

2019-05-13 Thread Shahar Cizer Kobrinsky
Hey Hequn & Fabian, It seems like I found a reasonable way using both Row and my own TypeInfo: - I started by just using my own TypeInfo using your example. So I'm using a serializer which is basically a compound of the original event type serializer as well as a string array serializer

AvroSerializer

2019-05-13 Thread Debasish Ghosh
Hello - I am using Avro based encoding with Flink. I see that Flink has an AvroSerializer that gets used for serializing Avro. Is it possible to provide a custom implementation of the serializer e.g. I want to use MyAvroSerializer instead of AvroSerializer in *all* places. Is there any way to

Re: Reconstruct object through partial select query

2019-05-13 Thread Shahar Cizer Kobrinsky
Thanks for looking into it Hequn! I do not have a requirement to use TaggedEvent vs Row. But correct me if I am wrong: creating a Row will require me to know the internal fields of the original event at compile time, is that correct? I do have a requirement to support a generic original event
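For what it's worth, `org.apache.flink.types.Row` does not have to be typed at compile time: its arity and field values are set dynamically, at the cost of static type safety. A minimal, self-contained sketch with made-up field values:

```java
import org.apache.flink.types.Row;

public class RowDemo {
    public static void main(String[] args) {
        // Stand-ins for fields pulled off the original event at runtime
        // (e.g. via reflection); nothing here is known at compile time.
        Object[] eventFields = {"order-1", 42L, true};
        String[] tags = {"tagA", "tagB"};

        // Arity is chosen at runtime: one slot per event field, plus one
        // trailing slot for the tag array.
        Row row = new Row(eventFields.length + 1);
        for (int i = 0; i < eventFields.length; i++) {
            row.setField(i, eventFields[i]);
        }
        row.setField(eventFields.length, tags);

        System.out.println(row);
    }
}
```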

assignTimestampsAndWatermarks not work after KeyedStream.process

2019-05-13 Thread an0
Thanks everyone, I learned a lot through this single thread! On 2019/05/13 07:19:30, Fabian Hueske wrote: > Hi, > > On Fri, May 10, 2019 at 16:55, an0 wrote: > > > > Q2: after a, map(A), and map(B) would work fine. Assigning watermarks > > immediately after a keyBy() is not a good idea,

Flink and Prometheus setup in K8s

2019-05-13 Thread Wouter Zorgdrager
Hey all, I'm working on a deployment setup with Flink and Prometheus on Kubernetes. I'm running into the following issues: 1) Is it possible to use the default Flink Docker image [1] and enable the Prometheus reporter? Modifying the flink-conf.yaml is easy, but somehow the Prometheus reporter
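For reference, the reporter is enabled with the standard `metrics.reporter.*` keys, and with the official image the `flink-metrics-prometheus` jar also has to be copied from `opt/` to `lib/`, typically in a derived Dockerfile. A sketch, assuming a Flink 1.8 image; adjust paths and version to your setup:

```yaml
# flink-conf.yaml — enable the Prometheus reporter
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9249
```

```dockerfile
# Derived image: the reporter jar ships in opt/, not on the classpath.
FROM flink:1.8.0
RUN cp /opt/flink/opt/flink-metrics-prometheus-*.jar /opt/flink/lib/
```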

RE: Re: flink

2019-05-13 Thread Shi Quan
[Reply mangled by an encoding problem; it suggests using OutputTag and adds a further remark about OutputTag.] Sent from Mail for Windows 10 From: deng Sent: Monday, May 13, 2019
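For context on that suggestion, a minimal side-output sketch on the 1.7-era `OutputTag` API; the input elements and the `late:` convention are made up:

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class SideOutputDemo {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Anonymous subclass so the element type survives erasure.
        final OutputTag<String> sideTag = new OutputTag<String>("side") {};

        SingleOutputStreamOperator<String> mainStream = env
            .fromElements("a", "late:b", "c")
            .process(new ProcessFunction<String, String>() {
                @Override
                public void processElement(String value, Context ctx, Collector<String> out) {
                    if (value.startsWith("late:")) {
                        ctx.output(sideTag, value); // diverted to the side output
                    } else {
                        out.collect(value);         // stays on the main stream
                    }
                }
            });

        DataStream<String> side = mainStream.getSideOutput(sideTag);

        mainStream.print();
        side.print();
        env.execute("side-output-demo");
    }
}
```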

Re: Errors when importing the apache-flink project into IDEA

2019-05-13 Thread Zili Chen
Hi, the image is not visible; please attach it as a file or paste a link. In general, you can check the project out in IDEA directly from git version control. Best, tison. 程 婕 wrote on Mon, May 13, 2019 at 3:09 PM: > Dear Flink Committer > > Hello. I am a first-year graduate student with a strong interest in Flink development. I have recently been trying to implement an algorithm in the Flink framework, but ran into some problems importing the Flink code into IDEA. > For example, when I build the project, the following error is shown: > > [image:

Re: flink

2019-05-13 Thread deng
[Reply mangled by an encoding problem; only fragments survive — it appears to discuss delay in the Flink job.] From: Kobeli Sent: 2019-05-11 18:44 To: user-zh Subject: flink hello flink

Re:

2019-05-13 Thread Fabian Hueske
Hi, On Fri, May 10, 2019 at 16:55, an0 wrote: > > Q2: after a, map(A), and map(B) would work fine. Assigning watermarks > > immediately after a keyBy() is not a good idea, because 1) the records > > are shuffled and it's hard to reason about ordering, and 2) you lose the > > KeyedStream
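The recommended shape, then, is to assign timestamps and watermarks as close to the source as possible, before any keyBy(). A minimal sketch on the 1.7-era API; the `Event` type is a made-up stand-in:

```java
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;

public class WatermarkPlacement {

    // Minimal POJO standing in for the real event type.
    public static class Event {
        public String key;
        public long timestampMillis;
        public Event() {}
        public Event(String key, long ts) { this.key = key; this.timestampMillis = ts; }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        env.fromElements(new Event("a", 1000L), new Event("b", 2000L))
           // Watermarks are generated before the keyBy() shuffle, while the
           // source's ordering assumptions still hold.
           .assignTimestampsAndWatermarks(
               new BoundedOutOfOrdernessTimestampExtractor<Event>(Time.seconds(5)) {
                   @Override
                   public long extractTimestamp(Event e) {
                       return e.timestampMillis;
                   }
               })
           .keyBy(e -> e.key) // the shuffle happens after watermarking
           .print();

        env.execute("watermark-placement");
    }
}
```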

Re: how to count kafka sink number

2019-05-13 Thread Konstantin Knauf
Hi Chong, to my knowledge, neither Flink's built-in metrics nor the metrics of the Kafka producer itself give you this number directly. If your sink is chained (no serialization, no network) to another Flink operator, you could take the numRecordsOut of this operator instead. It will tell you how
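If that operator's numRecordsOut is not usable, or you want a dedicated, well-named metric, another option is a user-defined counter in a pass-through map placed right before the sink. A minimal sketch; the class and metric names are made up:

```java
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Counter;

// Identity map that counts every record handed on to the Kafka sink.
public class CountToSink<T> extends RichMapFunction<T, T> {

    private transient Counter counter;

    @Override
    public void open(Configuration parameters) {
        // Shows up in the web UI and metric reporters under this name.
        counter = getRuntimeContext().getMetricGroup().counter("recordsToKafkaSink");
    }

    @Override
    public T map(T value) {
        counter.inc(); // one increment per record reaching the sink
        return value;
    }
}

// Usage: stream.map(new CountToSink<>()).addSink(kafkaSink);
```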

Errors when importing the apache-flink project into IDEA

2019-05-13 Thread 程 婕
Dear Flink Committer, Hello. I am a first-year graduate student with a strong interest in Flink development. I have recently been trying to implement an algorithm in the Flink framework, but ran into some problems when importing the Flink code into IDEA. For example, when I build the project, the following error is shown: [inline image: image001.png] The error appears during the build [inline image: image004.png]. I am curious why this error occurs, because when importing the project and build

Re: Flink cannot report how many records were read from the source and how many were written to the target

2019-05-13 Thread deng
A source operator has no "records received" statistic, only "records sent"; a sink operator is exactly the opposite. From: jszhouch...@163.com Sent: 2019-05-10 17:28 To: user-zh@flink.apache.org Subject: Flink cannot report how many records were read from the source and how many were written to the target. Hi all, I have a job that consumes from Kafka and writes to Kafka. I want to know how many Kafka records Flink consumed and how many it wrote, but on the Flink web UI the first subtask's Records received is 0, and the last subtask's Records
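These counters can also be read from the JobManager's REST API instead of the UI; per-subtask metrics on the vertex metrics endpoint are prefixed with the subtask index. A sketch — host, job id, and vertex id are placeholders:

```sh
# numRecordsOut of subtask 0 of one operator; IDs below are placeholders.
curl "http://<jobmanager-host>:8081/jobs/<job-id>/vertices/<vertex-id>/metrics?get=0.numRecordsOut"
```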

flink

2019-05-13 Thread Kobeli
[Message mangled by an encoding problem; only fragments survive. Roughly: hello, a question about Flink watermarks — a Flink job runs on event time with a Kafka source; some Kafka partitions are under-replicated, and the question is how the state of the Kafka partitions affects the job's watermark.]
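The question is partly lost to the encoding, but on the per-partition angle: the Kafka consumer can generate watermarks per Kafka partition inside the source, so event-time skew between partitions is handled before the stream reaches downstream operators. A minimal sketch on the 1.7-era API; the topic name, broker address, and record format are made up:

```java
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class PerPartitionWatermarks {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092"); // placeholder

        FlinkKafkaConsumer<String> consumer =
            new FlinkKafkaConsumer<>("events", new SimpleStringSchema(), props);

        // Watermarks are tracked per Kafka partition inside the source; the
        // source emits the minimum across its partitions, so skew between
        // partitions never reaches downstream operators.
        consumer.assignTimestampsAndWatermarks(
            new BoundedOutOfOrdernessTimestampExtractor<String>(Time.seconds(5)) {
                @Override
                public long extractTimestamp(String record) {
                    // Made-up record format: "timestampMillis,payload"
                    return Long.parseLong(record.split(",", 2)[0]);
                }
            });

        env.addSource(consumer).print();
        env.execute("per-partition-watermarks");
    }
}
```

Note that this handles skew, not idleness: on these Flink versions a partition that stops receiving data entirely will still hold back the overall watermark.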