Hi Gordon,
here is the link, if anyone else is also interested:
https://issues.apache.org/jira/browse/FLINK-21308
Cheers,
Stephan
From: Tzu-Li (Gordon) Tai
Sent: Friday, February 5, 2021 12:58
To: Stephan Pelikan
Cc: user@flink.apache.org; Igal Shilman
Subject: Re: Statefun: cancel
Thanks Aljoscha!
On Fri, Feb 5, 2021 at 1:48 AM Aljoscha Krettek wrote:
> Hi Dan,
>
> I'm afraid this is not easily possible using the DataStream API in
> STREAMING execution mode today. However, there is one possible solution
> and we're introducing changes that will also make this work on
Could you provide us with a minimal working example which reproduces the
problem for you? This would be super helpful in figuring out the problem
you are experiencing. Thanks a lot for your help.
Cheers,
Till
On Fri, Feb 5, 2021 at 1:03 PM 赵一旦 wrote:
> Yeah, and if it is different, why my job
Dealing with types is not always easy in Flink. If you have further
issues, it might make sense to just pass them explicitly. We list all
types in:
org.apache.flink.api.common.typeinfo.Types
org.apache.flink.api.scala.typeutils.Types
Regards,
Timo
On 05.02.21 16:04, Xavier wrote:
Hi Timo,
Thank you for your clarification; it is very useful to me. I am also
working on the implementation of the map function, trying to do an implicit
conversion of the case class so that I can restore state from the filesystem.
On Fri, Feb 5, 2021 at 10:38 PM Timo Walther wrote:
> Hi Xavier,
>
> the Scala API has
Hi Xavier,
the Scala API has special implicits in methods such as `DataStream.map()`
or `DataStream.keyBy()` to support Scala specifics like case classes. For
Scala one needs to use the macro `createTypeInformation[CaseClass]`; for
Java we use reflection via `TypeInformation.of()`. But Scala and
Hi Stephan,
I think that what you are trying to achieve is very interesting, and
possibly other users might find it useful as well;
we will definitely add that to our roadmap.
I think that Gordon's suggestion of using the state processor API to
examine a savepoint makes a lot of sense in
I've seen checkpoints timeout when using the RocksDB state backend with
very large objects. The issue is that updating a ValueState stored in
RocksDB requires deserializing, updating, and then re-serializing that
object -- and if that's some enormous collection type, that will be slow.
In such
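The cost difference can be sketched in plain Python (a toy simulation, not Flink's actual state API; class names here are invented for illustration): a RocksDB-backed ValueState must round-trip the entire serialized value on every update, while a ListState-style append only serializes the new element.

```python
import pickle

# Toy simulation of the update cost, NOT Flink's real API.
class ValueStateSim:
    """Whole value stored as one serialized blob, like RocksDB ValueState."""
    def __init__(self):
        self._blob = pickle.dumps([])

    def add(self, item):
        value = pickle.loads(self._blob)    # deserialize the whole value
        value.append(item)
        self._blob = pickle.dumps(value)    # re-serialize the whole value

    def get(self):
        return pickle.loads(self._blob)

class ListStateSim:
    """Each element is its own blob, so an append is cheap."""
    def __init__(self):
        self._blobs = []

    def add(self, item):
        self._blobs.append(pickle.dumps(item))  # only the new element

    def get(self):
        return [pickle.loads(b) for b in self._blobs]
```

With an enormous collection, the ValueState round-trip dominates processing time; restructuring such state as ListState or MapState is the usual fix.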
Hello Martijn,
Great to hear that you are exploring StateFun as part of your university
project!
Can you please clarify:
- how do you measure throughput?
- by co-located functions, do you mean a remote function on the same
machine?
- Can you share a little bit more about your functions, what are
nohup bin/flink run -m yarn-cluster \
-c main \
-ynm ${FLINK_NAME} \
-ys 3 \
-p 4 \
-yjm 2048m \
-ytm 2048m \
With Flink on YARN, the above flink run parameters ensure there are 2 TaskManagers.
Strangely, the JobManager reports the error below; both TaskManagers do start, but the one reporting the error cannot work properly.
Thanks for any help.
Error:
Caused by:
Yeah, and if it is different, why does my job run normally? The problem only
occurs when I stop it.
On Fri, Feb 5, 2021 at 7:08 PM, Robert Metzger wrote:
> Are you 100% sure that the jar files in the classpath (/lib folder) are
> exactly the same on all machines? (It can happen quite easily in a
> distributed
Hi Stephan,
Thanks for providing the details of the use case! It does indeed sound like
being able to delete scheduled delayed messages would help here.
And yes, please do proceed with creating an issue. As for details on the
implementation, we can continue to discuss that on the JIRA.
Cheers,
Okay, I am following up to my question. I see information regarding the
threading and distribution model on the documentation about the
architecture.
https://ci.apache.org/projects/flink/flink-docs-release-1.12/concepts/flink-architecture.html
Next, I want to read up on what I have control over.
Flink supports Hadoop's FileSystem abstraction, which has an implementation
for FTP:
https://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/ftp/FTPFileSystem.html
On Tue, Feb 2, 2021 at 3:43 AM 1095193...@qq.com <1095193...@qq.com> wrote:
> Hi
> I have investigated the relevant document
By default, a checkpoint times out after 10 minutes. This means if not all
operators are able to confirm the checkpoint, it will be cancelled.
If you have an operator that is blocking for more than 10 minutes on a
single record (because this record contains millions of elements that are
written
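If the work per checkpoint is legitimately long, the timeout can be raised. As a sketch, assuming Flink 1.11+ configuration keys, a flink-conf.yaml fragment would look like:

```yaml
# Raise the checkpoint timeout from the 10-minute default
# (the value here is illustrative; pick one that fits your workload).
execution.checkpointing.timeout: 30 min
```

Raising the timeout only hides the symptom, though; if a single record takes minutes to process, restructuring the operator is usually the better fix.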
I'm not sure if this is related, but you are mixing scala 2.11 and 2.12
dependencies (and mentioning scala 2.1.1 dependencies).
On Tue, Feb 2, 2021 at 8:32 AM Eleanore Jin wrote:
> Hi experts,
> I am trying to experiment with using Hive to store metadata along with
> Flink SQL. I am running
Great to hear that you were able to resolve the issue!
On Thu, Feb 4, 2021 at 5:12 PM Yordan Pavlov wrote:
> Thank you for your tips Robert,
> I think I narrowed down the problem to having slow hard disks. Once
> the memory runs out, RocksDB starts spilling to the disk and the
> performance
Is it possible to use different statebackends for different operators?
There are certain situations where I want the state to reside completely in
memory, and other situations where I want it stored in rocksdb.
Are you 100% sure that the jar files in the classpath (/lib folder) are
exactly the same on all machines? (It can happen quite easily in a
distributed standalone setup that some files are different)
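One low-tech way to check this (a hypothetical helper, not a Flink tool) is to hash every jar in the /lib folder on each machine and diff the output:

```python
import hashlib
import pathlib

# Hypothetical helper: fingerprint every jar in a directory so the
# output can be compared across machines (e.g. run via ssh and diff).
def fingerprint_jars(lib_dir):
    result = {}
    for jar in sorted(pathlib.Path(lib_dir).glob("*.jar")):
        digest = hashlib.sha256(jar.read_bytes()).hexdigest()
        result[jar.name] = digest
    return result

if __name__ == "__main__":
    for name, digest in fingerprint_jars("/opt/flink/lib").items():
        print(digest, name)
```

Any jar whose hash differs between machines is a candidate for the classpath mismatch.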
On Fri, Feb 5, 2021 at 12:00 PM 赵一旦 wrote:
> Flink1.12.0; only using aligned checkpoint;
Try calling:
get_gateway().jvm.Test2.Test2.main(None)
> On Feb 5, 2021, at 18:27, 瞿叶奇 <389243...@qq.com> wrote:
>
> Hello, with a list argument there is no longer an error, but the jar is still not loaded.
> >>> from pyflink.util.utils import add_jars_to_context_class_loader
> >>> add_jars_to_context_class_loader(['file:///root/Test2.jar
> >>> '])
> >>> from
as data flows from a source through a pipeline of operators and finally
sinks, is there a means to control how many threads are used within an
operator, and how an operator is distributed across the network?
Where can I read up on these types of details specifically?
I don't know what your dependency issue is (post it here if you want
help!), but I generally recommend using mvn dependency:tree to debug
version clashes (and then pin or exclude versions)
On Tue, Feb 2, 2021 at 9:23 PM Sebastián Magrí wrote:
> The root of the previous error seemed to be the
Flink 1.12.0; only using aligned checkpoints; standalone cluster;
On Fri, Feb 5, 2021 at 6:52 PM, Robert Metzger wrote:
> Are you using unaligned checkpoints? (there's a known bug in 1.12.0 which
> can lead to corrupted data when using UC)
> Can you tell us a little bit about your environment? (How are you
>
Hey,
the code and exception are not included in your message. Did you try to
send them as images (screenshots)?
I recommend sending code and exceptions as text for better searchability.
On Wed, Feb 3, 2021 at 12:58 PM cristi.cioriia wrote:
> Hey guys,
>
> I'm pretty new to Flink, I hope I could
Answers inline:
On Wed, Feb 3, 2021 at 3:55 PM Marco Villalobos
wrote:
> Hi Gordon,
>
> Thank you very much for the detailed response.
>
> I considered using the state-state processor API, however, our enrichment
> requirements make the state-processor API a bit inconvenient.
> 1. if an element
Are you using unaligned checkpoints? (there's a known bug in 1.12.0 which
can lead to corrupted data when using UC)
Can you tell us a little bit about your environment? (How are you deploying
Flink, which state backend are you using, what kind of job (I guess
DataStream API))
Somehow the process
from pyflink.util.utils import add_jars_to_context_class_loader
add_jars_to_context_class_loader(['file:///root/Test2.jar'])
from pyflink.java_gateway import get_gateway
get_gateway().jvm.Test2.main()
Traceback (most recent call last):
The image doesn't seem to load, but I guess it's a parameter type problem? This function requires a parameter of type List:
add_jars_to_context_class_loader(["file:///xxx "])
> On Feb 5, 2021, at 17:48, 瞿叶奇 <389243...@qq.com> wrote:
>
> Hello,
> I have upgraded to Flink 1.12.0. Loading the class with this function throws an error; is there a problem with the URL I wrote? Does PyFlink have a way to connect to HDFS with Kerberos authentication?
>
>
>
>
> -- Original message --
> From:
Hi Utopia,
Have you fixed this problem? I also ran into it; I converted the case
class to a Java POJO, and then the problem was fixed.
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Hello, I'd like to ask: when using Flink CDC to sync data I set the snapshot.mode parameter to schema_only, but I find that every time I restart the job it reads data from the latest position. What should I do to resume consumption from the last stop point? Also, I set server_id via MySQLSource.builder().serverId(123456), but judging from the synced data the server_id is not the value I set.
--
Sent from: http://apache-flink.147419.n8.nabble.com/
Hello, I'm running into a problem: when using Flink CDC to sync data I set snapshot.mode to schema_only, but when I restart the job it always consumes from the latest position. What should I do to resume from the point where the job was stopped? Also, I set server_id via MySQLSource.builder().serverId(123456), but judging from the printed data it did not take effect.
Hi Dan,
I'm afraid this is not easily possible using the DataStream API in
STREAMING execution mode today. However, there is one possible solution
and we're introducing changes that will also make this work on STREAMING
mode.
The possible solution is to use the `FileSink` instead of the
Hi all, I find that the failure always occurred in the second task, after
the source task. So I do something in the first chaining task, I transform
the 'Map' based class object to another normal class object, and the
problem disappeared.
Based on the new solution, I also tried to stop and
Basically the only thing that Watermarks do is to trigger event time
timers. Event time timers are used explicitly in KeyedProcessFunctions, but
are also used internally by time windows, CEP (to sort the event stream),
in various time-based join operations, and within the Table/SQL API.
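As a rough illustration (a toy sketch, not Flink's actual TimerService API; all names here are invented), the mechanism boils down to: advancing the watermark fires every registered event-time timer whose timestamp is at or before it.

```python
import heapq
import itertools

# Toy sketch of event-time timers, NOT Flink's real TimerService.
class ToyTimerService:
    def __init__(self):
        self._timers = []               # min-heap of (timestamp, seq, callback)
        self._seq = itertools.count()   # tie-breaker so callbacks never compare

    def register_event_time_timer(self, ts, callback):
        heapq.heappush(self._timers, (ts, next(self._seq), callback))

    def advance_watermark(self, watermark):
        """Fire all timers with timestamp <= watermark, in timestamp order."""
        fired = []
        while self._timers and self._timers[0][0] <= watermark:
            ts, _, cb = heapq.heappop(self._timers)
            fired.append(cb(ts))
        return fired
```

A time window is conceptually just a timer registered at the window's end timestamp; when the watermark passes that timestamp, the window fires.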
If you
Great, thanks for the update.
On Fri, Feb 5, 2021 at 2:06 PM Fabian Paul
wrote:
> We are currently working on supporting arbitrary pod template specs for
> the
> Flink pods. It allows you to specify a V1PodTemplateSpec for taskmanager
> and jobmanager.
>
> The feature will be included in the
Another strategy to resolve such issues is by explicitly excluding the
conflicting dependency from one of the transitive dependencies.
Besides that, I don't think there's a nicer solution here.
On Thu, Feb 4, 2021 at 6:26 PM Jan Oelschlegel <
oelschle...@integration-factory.de> wrote:
> I
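As a sketch, the exclusion goes on whichever transitive dependency drags in the clashing version; the coordinates below are placeholders for whatever `mvn dependency:tree` reveals in your build:

```xml
<!-- Hypothetical example: exclude a conflicting Scala library pulled in
     transitively. groupId/artifactId/version here are placeholders. -->
<dependency>
  <groupId>com.example</groupId>
  <artifactId>some-library</artifactId>
  <version>1.0.0</version>
  <exclusions>
    <exclusion>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

After adding the exclusion, re-run `mvn dependency:tree` to confirm only one version of the library remains on the classpath.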
Caused by: org.apache.flink.table.api.ValidationException: Could not find
any factory for identifier 'kafka' that implements
'org.apache.flink.table.factories.DynamicTableSourceFactory' in the
classpath.
Looks like the kafka-connector dependency is missing.
> On Oct 14, 2020, at 4:55 PM, 奔跑的小飞袁 wrote:
>
> hello,
>
Hi,
First you need to package your Java program into a jar and then hot-load that jar. PyFlink currently has a util method you can call directly:
from pyflink.util.utils import add_jars_to_context_class_loader
add_jars_to_context_class_loader("file:///xxx ") # note: the path must be in URL format
Then you can call it through the Java gateway:
from pyflink.java_gateway import get_gateway
I do not think this is a code-related problem anymore; maybe it is a
bug?
On Fri, Feb 5, 2021 at 4:30 PM, 赵一旦 wrote:
> Hi all, I find that the failure always occurred in the second task, after
> the source task. So I do something in the first chaining task, I transform
> the 'Map' based class object to
We are currently working on supporting arbitrary pod template specs for the
Flink pods. It allows you to specify a V1PodTemplateSpec for taskmanager
and jobmanager.
The feature will be included in the next upcoming release 2.4 of the
ververica platform. We plan to release it in the next few