Hello everyone, I am a newbie.
I am learning the flink-sql-submit project. From @Jark Wu :
https://github.com/wuchong/flink-sql-submit
My local environment is:
1. Flink 1.9.0, standalone
2. kafka_2.11-2.2.0, single broker
I copied the Flink connector and format jars into $FLINK_HOME/lib.
Reference:
Hi Anyang,
For your information: I plan to support the JSON format in the filesystem
connector after https://issues.apache.org/jira/browse/FLINK-14256
After FLIP-66 [1], we will be able to define time attributes in SQL DDL
regardless of which connector is used.
[1]
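For illustration, a DDL-defined event-time attribute might look like the sketch below (the WATERMARK clause follows the syntax proposed in FLIP-66; the table name `user_log` and its columns are hypothetical, not from the original thread):

```sql
-- Hypothetical source table: the WATERMARK clause declares `ts` as an
-- event-time (rowtime) attribute directly in the DDL, independent of
-- which connector backs the table.
CREATE TABLE user_log (
    user_id BIGINT,
    behavior STRING,
    ts TIMESTAMP(3),
    WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
) WITH (
    'connector.type' = 'kafka',
    'format.type' = 'json'
);
```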
Hi Chan,
If it is a bug, I think it is critical. Could you share the JobManager
logs with me too? I have some time to
analyze them and hope to find the root cause.
Best,
Yang
Chan, Regina wrote on Wednesday, Oct 30, 2019 at 10:55 AM:
> Till, were you able find anything? Do you need more logs?
>
>
>
>
>
> *From:*
Hi Caio,
Did you check whether there are enough resources to launch the other nodes?
Could you attach the logs you mentioned? And elaborate how the tasks are
connected in the topology?
Thanks,
Zhu Zhu
Caio Aoque wrote on Wednesday, Oct 30, 2019 at 8:31 AM:
> Hi, I've been running some flink scala applications
Hi Caio,
Because this involves interaction with external systems, it would be better
if you could provide the full logs.
Best,
Vino
Caio Aoque wrote on Wednesday, Oct 30, 2019 at 8:31 AM:
> Hi, I've been running some flink scala applications on an AWS EMR cluster
> (version 5.26.0 with flink 1.8.0 for scala 2.11)
Hi Amran,
See my inline answers.
Best,
Vino
amran dean wrote on Wednesday, Oct 30, 2019 at 2:59 AM:
> Hello,
> Exact semantics for checkpointing/task recovery are still a little
> confusing to me after reading the docs, so a few questions.
>
> - What does Flink consider a task failure? Is it any exception that the
>
hello:
Environment: Flink 1.9.1, on YARN, Hadoop 2.6.
Flink is installed only on the single machine used to submit jobs.
The lib directory contains these files:
flink-dist_2.11-1.9.1.jar
flink-json-1.9.0-sql-jar.jar
flink-shaded-hadoop-2-uber-2.6.5-7.0.jar
flink-sql-connector-kafka_2.11-1.9.0.jar
flink-table_2.11-1.9.1.jar
flink-table-blink_2.11-1.9.1.jar
log4j-1.2.17.jar
Hi, I've been running some flink scala applications on an AWS EMR cluster
(version 5.26.0 with flink 1.8.0 for scala 2.11) for a while, and I started
to have some issues now.
I have a flink app that reads some files from S3, processes them, and saves
some files to S3 and also some records to a
Hello,
Exact semantics for checkpointing/task recovery are still a little
confusing to me after reading the docs, so a few questions.
- What does Flink consider a task failure? Is it any exception that the job
does not handle?
- Do the failure recovery strategies mentioned in
Hi All,
I want to do a full outer join on two streaming data sources and store the
state of full outer join in some external storage like rocksdb or something
else. And then want to use this intermediate state as a streaming source
again, do some transformation and write it to some external
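Flink SQL can express a full outer join over two streaming tables directly; the join operator keeps its state in the configured state backend (e.g. RocksDB). A minimal sketch, assuming hypothetical registered tables `left_src`, `right_src`, and a sink `joined_sink` (none of these names are from the original message):

```sql
-- Hypothetical streaming full outer join: rows with no match on the other
-- side are emitted with NULLs and updated once a matching row arrives.
INSERT INTO joined_sink
SELECT l.id, l.val AS left_val, r.val AS right_val
FROM left_src AS l
FULL OUTER JOIN right_src AS r
    ON l.id = r.id;
```

Note the result is an updating stream, so consuming it downstream as a source again requires a sink that can carry updates, or re-reading it from the external store.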
Hi,
Thanks Dawid and Florin.
To Dawid:
CsvTableSource doesn't implement the DefinedProctimeAttribute and
DefinedRowtimeAttributes interfaces, so we cannot use proctime or
rowtime in the source DDL. Besides CSV, we also need to consume JSON and PB
data.
To Florin:
Installing local kafka and zk
Hi all,
I am running Flink on a standalone cluster and getting very long
execution times for streaming queries like WordCount on a fixed text
file. My VM runs Debian 10 with 16 CPU cores and 32 GB of RAM. I
have a text file with a size of 2 GB. When I run Flink on a standalone
Hi,
I am not able to roll the files based on file size, as the bulk format only
has OnCheckpointRollingPolicy.
One way is to write a custom StreamingFileSink and provide a RollingPolicy
as RowFormatBuilder does. Is this the correct way to go ahead?
Another way is to write a ParquetEncoder and use
You could also disable the security feature of the Hadoop cluster or upgrade
the Hadoop version. I'm not sure whether this is acceptable for you, as it
requires more changes. Setting the configuration is the smallest change I could
think of to solve this issue, as it will not affect other users of the
Hi!
Another solution would be to locally install Kafka + ZooKeeper and push your
dumped JSON data (from the production server) into a topic (you create a Kafka
producer).
Then you configure your code to point to this local broker and consume your
data from the topic with whichever offset strategy you need (earliest
Hi,
Unfortunately it is not possible out of the box. The only format that
the filesystem connector supports as of now is CSV.
As a workaround you could create a Table out of a DataStream reusing the
JsonRowDeserializationSchema. Have a look at the example below:
Hi guys,
In Flink 1.9, we can set `connector.type` to `kafka` and `format.type` to
`json` to read/write JSON data from Kafka or write JSON data to Kafka.
In my scenario, I wish to read local JSON data as a source table, since I
need to debug locally and don't want to consume the online Kafka data.
For
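The Kafka/JSON setup mentioned above is usually declared with connector properties like the sketch below (table name, topic, and schema are hypothetical; the property keys follow the Flink 1.9 descriptor style):

```sql
-- Hypothetical Kafka JSON source table in Flink 1.9 style DDL.
CREATE TABLE kafka_json_source (
    user_id BIGINT,
    message STRING
) WITH (
    'connector.type' = 'kafka',
    'connector.version' = 'universal',
    'connector.topic' = 'my_topic',
    -- plus 'connector.properties.*' entries for the bootstrap servers
    'format.type' = 'json',
    'update-mode' = 'append'
);
```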
Hi Srikanth,
Which SQL statement throws the exception?
On Tue, Oct 29, 2019 at 3:41 PM vino yang wrote:
> Hi,
>
> The exception shows your SQL statement has grammatical errors. Please
> check it again or provide the whole SQL statement here.
>
> Best,
> Vino
>
> srikanth flink wrote on Tuesday, Oct 29, 2019 at 2:51 PM:
>
>>
Hi,
You can use TUMBLE_ROWTIME(...) to get the rowtime attribute of the first
window result, and use this field to apply a following window aggregate.
See more
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql.html#group-windows
Best,
Jark
On Tue, 29 Oct 2019 at 15:39, 刘建刚
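The suggestion above can be sketched as a cascaded window aggregate (the table `clicks` and its rowtime attribute `ts` are hypothetical): the inner query emits TUMBLE_ROWTIME as a rowtime attribute, and the outer query windows on it again.

```sql
-- Hypothetical cascade: 1-minute pre-aggregation, then a 1-hour rollup
-- grouped on the first window's rowtime attribute.
SELECT
    TUMBLE_START(minute_rowtime, INTERVAL '1' HOUR) AS hour_start,
    SUM(cnt) AS total_cnt
FROM (
    SELECT
        TUMBLE_ROWTIME(ts, INTERVAL '1' MINUTE) AS minute_rowtime,
        COUNT(*) AS cnt
    FROM clicks
    GROUP BY TUMBLE(ts, INTERVAL '1' MINUTE)
)
GROUP BY TUMBLE(minute_rowtime, INTERVAL '1' HOUR);
```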
pom file:
```
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
                             http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.hb</groupId>
    <artifactId>flink</artifactId>
    <packaging>pom</packaging>
    <version>1.9.1-SNAPSHOT</version>
```
Hi Srikanth,
1. Can you share the complete SQL? And the schema of the input table?
2. What schema do you want to select? I didn't understand what the "agg"
field means.
On Tue, Oct 29, 2019 at 3:28 PM Jingsong Li wrote:
> Hi Srikanth,
> 1.Can you share complete sql? And schema of input table?
> 2.What schema you
Hi,
The exception shows your SQL statement has grammatical errors. Please check
it again or provide the whole SQL statement here.
Best,
Vino
srikanth flink wrote on Tuesday, Oct 29, 2019 at 2:51 PM:
> Hi there,
>
> I'm querying JSON data and it is working fine. I would like to add custom
> fields including the
Thanks Fabian,
@Gordon - Can you please help here.
Regards,
Vinay Patil
On Fri, Oct 25, 2019 at 9:11 PM Fabian Hueske wrote:
> Hi Vinay,
>
> Maybe Gordon (in CC) has an idea about this issue.
>
> Best, Fabian
>
> Am Do., 24. Okt. 2019 um 14:50 Uhr schrieb Vinay Patil <
>
For one sql window, I can register table with event time and use time
field in the tumble window. But if I want to use the result for the first
window and use another window to process it, how can I do it? Thank you.
Hi,
Please share your maven pom.
---- Original Message ----
From: "hb" <343122...@163.com>
Sent: Tuesday, Oct 29, 2019, 2:53 PM
To: "user-zh"
Thank you!
On Mon, Oct 28, 2019 at 3:53 AM vino yang wrote:
> Hi Pankaj,
>
> It seems it is a bug. You can report it by opening a Jira issue.
>
> Best,
> Vino
>
> Pankaj Chand wrote on Monday, Oct 28, 2019 at 10:51 AM:
>
>> Hello,
>>
>> I am trying to modify the parallelism of a streaming Flink job
>>
I specified Kafka version 0.11; before, the lib directory only had the 0.11
jar, and I still got this error.
On 2019-10-29 13:47:34, "如影随形" <1246407...@qq.com> wrote:
>Hi:
>
> It looks like a jar conflict: there are 4 flink_kafka jars under your lib
> directory; try removing the unused ones and see.
>
>陈浩
>
>---- Original Message ----
>From: "hb" <343122...@163.com>
>Sent: Tuesday, Oct 29, 2019, 2:41 PM
Hi there,
I'm querying JSON data and it is working fine. I would like to add custom
fields including the query result.
My query looks like: select ROW(`source`), ROW(`destination`), ROW(`dns`),
organization, cnt from (select (source.`ip`,source.`isInternalIP`) as
source,
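Adding a constant custom field next to query results can be sketched as follows (the table `dns_events` and its columns are hypothetical, not the poster's actual schema):

```sql
-- Hypothetical query: a string literal becomes an extra output column
-- alongside the aggregated fields.
SELECT
    source_ip,
    cnt,
    'dns-aggregation' AS record_type
FROM (
    SELECT source_ip, COUNT(*) AS cnt
    FROM dns_events
    GROUP BY source_ip
) t;
```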
Hi,
There are 4 flink_kafka jars under your lib directory.
---- Original Message ----
From: "hb" <343122...@163.com>
Sent: Tuesday, Oct 29, 2019, 2:41 PM
To: "user-zh"
The code runs fine in my local IDE, with normal output.
After packaging it into a fat jar and submitting it to the yarn-session,
it reports:
Caused by: org.apache.kafka.common.config.ConfigException: Invalid value
org.apache.flink.kafka010.shaded.org.apache.kafka.common.serialization.ByteArrayDeserializer
for configuration key.deserializer: Class
Thanks for the information. Without setting such a parameter explicitly, is there
any possibility that it may work intermittently?
From: Dian Fu
Sent: Tuesday, October 29, 2019 7:12 AM
To: V N, Suchithra (Nokia - IN/Bangalore)
Cc: user@flink.apache.org
Subject: Re: Flink 1.8.1 HDFS 2.6.5 issue