Re: Statefun: cancel "sendAfter"

2021-02-05 Thread Stephan Pelikan
Hi Gordon, here is the link, if anyone else is also interested: https://issues.apache.org/jira/browse/FLINK-21308 Cheers, Stephan From: Tzu-Li (Gordon) Tai Sent: Friday, February 5, 2021 12:58 To: Stephan Pelikan Cc: user@flink.apache.org; Igal Shilman Subject: Re: Statefun: cancel

Re: Minicluster Flink tests, checkpoints and inprogress part files

2021-02-05 Thread Dan Hill
Thanks Aljoscha! On Fri, Feb 5, 2021 at 1:48 AM Aljoscha Krettek wrote: > Hi Dan, > > I'm afraid this is not easily possible using the DataStream API in > STREAMING execution mode today. However, there is one possible solution > and we're introducing changes that will also make this work on

Re: flink kryo exception

2021-02-05 Thread Till Rohrmann
Could you provide us with a minimal working example which reproduces the problem for you? This would be super helpful in figuring out the problem you are experiencing. Thanks a lot for your help. Cheers, Till On Fri, Feb 5, 2021 at 1:03 PM 赵一旦 wrote: > Yeah, and if it is different, why my job

Re: Question about Scala Case Class and List in Flink

2021-02-05 Thread Timo Walther
Dealing with types is not always easy in Flink. If you have further issues, it might make sense to just pass the type information explicitly. We list all types in: org.apache.flink.api.common.typeinfo.Types org.apache.flink.api.scala.typeutils.Types Regards, Timo On 05.02.21 16:04, Xavier wrote: Hi Timo,

Re: Question about Scala Case Class and List in Flink

2021-02-05 Thread Xavier
Hi Timo, Thank you for your clarification, it is very useful to me. I am also working on the implementation of the map function, trying to do an implicit conversion of the case class so that I can restore state from the FS. On Fri, Feb 5, 2021 at 10:38 PM Timo Walther wrote: > Hi Xavier, > > the Scala API has

Re: Question about Scala Case Class and List in Flink

2021-02-05 Thread Timo Walther
Hi Xavier, the Scala API has special implicits in methods such as `DataStream.map()` or `DataStream.keyBy()` to support Scala specifics like case classes. For Scala one needs to use the macro `createTypeInformation[CaseClass]`; for Java we use reflection via `TypeInformation.of()`. But Scala and

Re: Stateful Functions - accessing the state aside of normal processing

2021-02-05 Thread Igal Shilman
Hi Stephan, I think that what you are trying to achieve is very interesting, and possibly other users might find that useful as well and we will definitely add that to our roadmap. I think that Gordon's suggestion of using the state processor API to examine a savepoint, makes a lot of sense in

Re: question on checkpointing

2021-02-05 Thread David Anderson
I've seen checkpoints timeout when using the RocksDB state backend with very large objects. The issue is that updating a ValueState stored in RocksDB requires deserializing, updating, and then re-serializing that object -- and if that's some enormous collection type, that will be slow. In such
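David's point can be made concrete outside of Flink: a RocksDB-backed ValueState must serialize and deserialize the entire stored object on every access. A minimal plain-Java sketch of that cost model (using JDK serialization rather than Flink's serializers; class and method names are illustrative, not Flink API):

```java
import java.io.*;
import java.util.*;

class StateCostDemo {
    // Write side: a RocksDB-backed ValueState serializes the whole object on every update().
    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return bos.toByteArray();
    }

    // Read side: value() deserializes the whole object again.
    @SuppressWarnings("unchecked")
    static <T> T deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (T) ois.readObject();
        }
    }

    // Appending one element to a list kept in ValueState costs a full round trip
    // over the entire list, so the cost per update grows with the size of the state.
    static ArrayList<Integer> appendToStoredList(byte[] stored, int element) throws Exception {
        ArrayList<Integer> list = deserialize(stored);
        list.add(element);
        return list;
    }

    public static void main(String[] args) throws Exception {
        byte[] stored = serialize(new ArrayList<Integer>());
        for (int i = 0; i < 1000; i++) {
            ArrayList<Integer> list = appendToStoredList(stored, i); // O(n) work for an O(1) change
            stored = serialize(list);
        }
        System.out.println("final serialized size: " + stored.length + " bytes");
    }
}
```

This is why large collections in ValueState hurt: switching to ListState (where RocksDB appends are cheap) or a heap-based state backend avoids the per-update round trip.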

Re: StateFun scalability

2021-02-05 Thread Igal Shilman
Hello Martijn, Great to hear that you are exploring StateFun as part of your university project! Can you please clarify: - how do you measure throughput? - by co-located functions, do you mean a remote function on the same machine? - Can you share a little bit more about your functions, what are

flink on yarn: connection refused with multiple TaskManagers

2021-02-05 Thread Junpb
nohup bin/flink run -m yarn-cluster \ -c main \ -ynm ${FLINK_NAME} \ -ys 3 \ -p 4 \ -yjm 2048m \ -ytm 2048m \ With flink on yarn, the above flink run parameters ensure there are 2 TaskManagers. The strange thing is that the JobManager logs the error below; both TaskManagers do start, but the one reporting the error cannot work properly. Thanks for any help. Error: Caused by:

Re: flink kryo exception

2021-02-05 Thread 赵一旦
Yeah, and if it is different, why does my job run normally? The problem only occurs when I stop it. Robert Metzger wrote on Fri, Feb 5, 2021 at 7:08 PM: > Are you 100% sure that the jar files in the classpath (/lib folder) are > exactly the same on all machines? (It can happen quite easily in a > distributed

Re: Statefun: cancel "sendAfter"

2021-02-05 Thread Tzu-Li (Gordon) Tai
Hi Stephan, Thanks for providing the details of the use case! It does indeed sound like being able to delete scheduled delayed messages would help here. And yes, please do proceed with creating an issue. As for details on the implementation, we can continue to discuss that on the JIRA. Cheers,

Re: threading and distribution

2021-02-05 Thread Marco Villalobos
Okay, I am following up to my question. I see information regarding the threading and distribution model on the documentation about the architecture. https://ci.apache.org/projects/flink/flink-docs-release-1.12/concepts/flink-architecture.html Next, I want to read up on what I have control over.

Re: How to implement a FTP connector Flink Table/sql support?

2021-02-05 Thread Robert Metzger
Flink supports Hadoop's FileSystem abstraction, which has an implementation for FTP: https://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/ftp/FTPFileSystem.html On Tue, Feb 2, 2021 at 3:43 AM 1095193...@qq.com <1095193...@qq.com> wrote: > Hi > I have investigated the relevant document

Re: question on checkpointing

2021-02-05 Thread Robert Metzger
By default, a checkpoint times out after 10 minutes. This means if not all operators are able to confirm the checkpoint, it will be cancelled. If you have an operator that is blocking for more than 10 minutes on a single record (because this record contains millions of elements that are written
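For reference, the timeout Robert mentions is configurable; in Flink 1.12 the flink-conf.yaml option below raises it from the 10-minute default (the same setting exists programmatically via CheckpointConfig#setCheckpointTimeout):

```yaml
# flink-conf.yaml: allow checkpoints to stay in flight longer than the 10-minute default
execution.checkpointing.timeout: 30 min
```

Raising the timeout only hides the symptom, though; the real fix is usually to avoid blocking an operator for minutes on a single record.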

Re: Flink sql using Hive for metastore throws Exception

2021-02-05 Thread Robert Metzger
I'm not sure if this is related, but you are mixing scala 2.11 and 2.12 dependencies (and mentioning scala 2.1.1 dependencies). On Tue, Feb 2, 2021 at 8:32 AM Eleanore Jin wrote: > Hi experts, > I am trying to experiment how to use Hive to store metadata along using > Flink SQL. I am running

Re: Very slow recovery from Savepoint

2021-02-05 Thread Robert Metzger
Great to hear that you were able to resolve the issue! On Thu, Feb 4, 2021 at 5:12 PM Yordan Pavlov wrote: > Thank you for your tips Robert, > I think I narrowed down the problem to having slow Hard disks. Once > the memory runs out, RocksDb starts spilling to the disk and the > performance

hybrid state backends

2021-02-05 Thread Marco Villalobos
Is it possible to use different state backends for different operators? There are certain situations where I want the state to reside completely in memory, and other situations where I want it stored in RocksDB.

Re: flink kryo exception

2021-02-05 Thread Robert Metzger
Are you 100% sure that the jar files in the classpath (/lib folder) are exactly the same on all machines? (It can happen quite easily in a distributed standalone setup that some files are different) On Fri, Feb 5, 2021 at 12:00 PM 赵一旦 wrote: > Flink1.12.0; only using aligned checkpoint;
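One quick way to verify that the /lib jars really are identical is to compare checksums on each machine (e.g. `md5sum lib/*.jar`). The same idea as a small self-contained Java sketch (the `lib` path is illustrative):

```java
import java.nio.file.*;
import java.security.MessageDigest;

class JarChecksum {
    // Hex-encoded SHA-256 of a file; identical jars must produce identical digests,
    // so comparing this output across machines reveals any mismatched /lib contents.
    static String sha256(Path file) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(Files.readAllBytes(file));
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) sb.append(String.format("%02x", b));
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Print a checksum line for every jar under the given directory (default: ./lib).
        Path lib = Paths.get(args.length > 0 ? args[0] : "lib");
        if (!Files.isDirectory(lib)) {
            System.out.println("no such directory: " + lib);
            return;
        }
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(lib, "*.jar")) {
            for (Path jar : jars) {
                System.out.println(sha256(jar) + "  " + jar.getFileName());
            }
        }
    }
}
```

Diffing the sorted output from each machine immediately shows which jar differs.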

Re: Can I call my own Java program through pyflink's py4j?

2021-02-05 Thread Wei Zhong
Try calling: get_gateway().jvm.Test2.Test2.main(None) > On Feb 5, 2021, at 18:27, 瞿叶奇 <389243...@qq.com> wrote: > > Hello, with the list argument there is no more error, but the jar is still not loaded. > >>> from pyflink.util.utils import add_jars_to_context_class_loader > >>> add_jars_to_context_class_loader(['file:///root/Test2.jar > >>> ']) > >>> from

threading and distribution

2021-02-05 Thread Marco Villalobos
As data flows from a source through a pipeline of operators and finally into sinks, is there a means to control how many threads are used within an operator, and how an operator is distributed across the network? Where can I read up on these types of details specifically?

Re: Conflicts between the JDBC and postgresql-cdc SQL connectors

2021-02-05 Thread Robert Metzger
I don't know what your dependency issue is (post it here if you want help!), but I generally recommend using mvn dependency:tree to debug version clashes (and then pin or exclude versions) On Tue, Feb 2, 2021 at 9:23 PM Sebastián Magrí wrote: > The root of the previous error seemed to be the

Re: flink kryo exception

2021-02-05 Thread 赵一旦
Flink 1.12.0; only using aligned checkpoints; standalone cluster. Robert Metzger wrote on Fri, Feb 5, 2021 at 6:52 PM: > Are you using unaligned checkpoints? (there's a known bug in 1.12.0 which > can lead to corrupted data when using UC) > Can you tell us a little bit about your environment? (How are you >

Re: How to use the TableAPI to query the data in the Sql Training Rides table ?

2021-02-05 Thread Robert Metzger
Hey, the code and exception are not included in your message. Did you try to send them as images (screenshots)? I recommend sending code and exceptions as text for better searchability. On Wed, Feb 3, 2021 at 12:58 PM cristi.cioriia wrote: > Hey guys, > > I'm pretty new to Flink, I hope I could

Re: Question regarding a possible use case for Iterative Streams.

2021-02-05 Thread Robert Metzger
Answers inline: On Wed, Feb 3, 2021 at 3:55 PM Marco Villalobos wrote: > Hi Gordon, > > Thank you very much for the detailed response. > > I considered using the state processor API, however, our enrichment > requirements make the state processor API a bit inconvenient. > 1. if an element

Re: flink kryo exception

2021-02-05 Thread Robert Metzger
Are you using unaligned checkpoints? (there's a known bug in 1.12.0 which can lead to corrupted data when using UC) Can you tell us a little bit about your environment? (How are you deploying Flink, which state backend are you using, what kind of job (I guess DataStream API)) Somehow the process

Re: Can I call my own Java program through pyflink's py4j?

2021-02-05 Thread 瞿叶奇
from pyflink.util.utils import add_jars_to_context_class_loader add_jars_to_context_class_loader(['file:///root/Test2.jar']) from pyflink.java_gateway import get_gateway get_gateway().jvm.Test2.main() Traceback (most recent call last):

Re: Can I call my own Java program through pyflink's py4j?

2021-02-05 Thread Wei Zhong
The image doesn't seem to load, but I guess it is a parameter type problem? This function needs a parameter of type List: add_jars_to_context_class_loader(["file:///xxx "]) > On Feb 5, 2021, at 17:48, 瞿叶奇 <389243...@qq.com> wrote: > > Hello, > I upgraded to flink 1.12.0, and loading the class with this function throws an error. Is there a problem with how I wrote the URL? Does pyflink have a way to connect to HDFS with Kerberos authentication? > > > > > -- Original message -- > From:

Re: Question about Scala Case Class and List in Flink

2021-02-05 Thread Xavier
Hi Utopia, have you fixed this problem? I also met this problem, so I converted the case class to a Java POJO, and then the problem was fixed. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Can I call my own Java program through pyflink's py4j?

2021-02-05 Thread 瞿叶奇
I upgraded to flink 1.12.0, and loading the class with this function throws an error. Is there a problem with how I wrote the URL? Does pyflink have a way to connect to HDFS with Kerberos authentication? -- Original message -- From:

flink cdc data synchronization problem

2021-02-05 Thread 奔跑的小飞袁
hello, I'd like to ask: when using flink cdc to synchronize data, I set the snapshot.mode parameter to schema_only, but I find that every time I restart the job it starts reading from the latest data. What should I do to resume consuming from the last breakpoint? Also, I set the server_id via MySQLSource.builder().serverId(123456), but judging from the synchronized data, the server_id is not the value I set. -- Sent from: http://apache-flink.147419.n8.nabble.com/

flink cdc data synchronization

2021-02-05 Thread 奔跑的小飞袁
hello, I've run into a problem: when using flink cdc to synchronize data, I set snapshot.mode to schema_only, but when I restart the job it consumes from the latest data. What should I do to resume consuming from the breakpoint where the job stopped? Also, I set the server_id via MySQLSource.builder().serverId(123456), but judging from the printed data it did not take effect. -- Sent from: http://apache-flink.147419.n8.nabble.com/

Re: Minicluster Flink tests, checkpoints and inprogress part files

2021-02-05 Thread Aljoscha Krettek
Hi Dan, I'm afraid this is not easily possible using the DataStream API in STREAMING execution mode today. However, there is one possible solution and we're introducing changes that will also make this work on STREAMING mode. The possible solution is to use the `FileSink` instead of the

Re: flink kryo exception

2021-02-05 Thread 赵一旦
Hi all, I find that the failure always occurs in the second task, after the source task. So I did something in the first chained task: I transform the 'Map'-based class object into another normal class object, and the problem disappeared. Based on the new solution, I also tried to stop and

Re: Watermarks on map operator

2021-02-05 Thread David Anderson
Basically the only thing that Watermarks do is to trigger event time timers. Event time timers are used explicitly in KeyedProcessFunctions, but are also used internally by time windows, CEP (to sort the event stream), in various time-based join operations, and within the Table/SQL API. If you
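As a toy model of that mechanism (not Flink's actual implementation; names are illustrative): timers are kept ordered by timestamp, and each watermark advance fires every timer at or before the watermark:

```java
import java.util.*;

class ToyTimerService {
    private final TreeMap<Long, List<String>> timers = new TreeMap<>();

    // Ask for a callback once event time reaches `timestamp` -- conceptually what
    // a KeyedProcessFunction does via registerEventTimeTimer().
    void registerEventTimeTimer(long timestamp, String tag) {
        timers.computeIfAbsent(timestamp, t -> new ArrayList<>()).add(tag);
    }

    // A watermark asserts "no more events at or before this time will arrive",
    // so every timer up to the watermark is now safe to fire, in timestamp order.
    List<String> advanceWatermark(long watermark) {
        List<String> fired = new ArrayList<>();
        while (!timers.isEmpty() && timers.firstKey() <= watermark) {
            fired.addAll(timers.pollFirstEntry().getValue());
        }
        return fired;
    }

    public static void main(String[] args) {
        ToyTimerService svc = new ToyTimerService();
        svc.registerEventTimeTimer(10, "close window [0,10)");
        svc.registerEventTimeTimer(20, "close window [10,20)");
        System.out.println(svc.advanceWatermark(15)); // fires only the first timer
    }
}
```

This also explains a common symptom: an operator whose watermark never advances (an idle source partition, for instance) never fires its event-time windows.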

Re: Set Readiness, liveness probes on task/job manager pods via Ververica Platform

2021-02-05 Thread narasimha
Great, thanks for the update. On Fri, Feb 5, 2021 at 2:06 PM Fabian Paul wrote: > We are currently working on supporting arbitrary pod template specs for > the > Flink pods. It allows you to specify a V1PodTemplateSpec for taskmanager > and jobmanager. > > The feature will be included in the

Re: AbstractMethodError while writing to parquet

2021-02-05 Thread Robert Metzger
Another strategy to resolve such issues is by explicitly excluding the conflicting dependency from one of the transitive dependencies. Besides that, I don't think there's a nicer solution here. On Thu, Feb 4, 2021 at 6:26 PM Jan Oelschlegel < oelschle...@integration-factory.de> wrote: > I

Re: flinksql引入flink-parquet_2.11任务提交失败

2021-02-05 Thread zhuxiaoshang
Caused by: org.apache.flink.table.api.ValidationException: Could not find any factory for identifier 'kafka' that implements 'org.apache.flink.table.factories.DynamicTableSourceFactory' in the classpath. This looks like the kafka-connector dependency is missing. > On Oct 14, 2020, at 4:55 PM, 奔跑的小飞袁 wrote: > > hello, >

Re: Can I call my own Java program through pyflink's py4j?

2021-02-05 Thread Wei Zhong
Hi, first you need to package your Java program into a jar, then hot-load that jar. pyflink currently has a util method you can call directly: from pyflink.util.utils import add_jars_to_context_class_loader add_jars_to_context_class_loader("file:///xxx ") # note: the path must be in URL format. After that you can make calls through the java gateway: from pyflink.java_gateway import get_gateway

Re: flink kryo exception

2021-02-05 Thread 赵一旦
I do not think this is a code-related problem anymore; maybe it is a bug? 赵一旦 wrote on Fri, Feb 5, 2021 at 4:30 PM: > Hi all, I find that the failure always occurred in the second task, after > the source task. So I do something in the first chaining task, I transform > the 'Map' based class object to

Re: Set Readiness, liveness probes on task/job manager pods via Ververica Platform

2021-02-05 Thread Fabian Paul
We are currently working on supporting arbitrary pod template specs for the Flink pods. It allows you to specify a V1PodTemplateSpec for taskmanager and jobmanager. The feature will be included in the next upcoming release 2.4 of the Ververica Platform. We plan to release it in the next few