Re:Re: Re: [ANNOUNCE] New PMC member: Dian Fu

2020-08-28 Thread Haibo Sun
Congratulations Dian! Best, Haibo At 2020-08-27 18:03:38, "Zhijiang" wrote: >Congrats, Dian! > > >-- >From:Yun Gao >Send Time: Thursday, August 27, 2020 17:44 >To:dev ; Dian Fu ; user >; user-zh >Subject:Re: Re: [ANNOUNCE] New PMC

Re:[ANNOUNCE] Yu Li is now part of the Flink PMC

2020-06-16 Thread Haibo Sun
Congratulations Yu! Best, Haibo At 2020-06-17 09:15:02, "jincheng sun" wrote: >Hi all, > >On behalf of the Flink PMC, I'm happy to announce that Yu Li is now >part of the Apache Flink Project Management Committee (PMC). > >Yu Li has been very active on Flink's Statebackend component,

Re:[ANNOUNCE] Apache Flink 1.10.0 released

2020-02-12 Thread Haibo Sun
Thanks Gary & Yu. Great work! Best, Haibo At 2020-02-12 21:31:00, "Yu Li" wrote: >The Apache Flink community is very happy to announce the release of Apache >Flink 1.10.0, which is the latest major release. > >Apache Flink® is an open-source stream processing framework for >distributed,

Re:[ANNOUNCE] Apache Flink 1.9.0 released

2019-08-22 Thread Haibo Sun
Great news! Thanks Gordon and Kurt! Best, Haibo At 2019-08-22 20:03:26, "Tzu-Li (Gordon) Tai" wrote: >The Apache Flink community is very happy to announce the release of Apache >Flink 1.9.0, which is the latest major release. > >Apache Flink® is an open-source stream processing framework for

Re:Re: FlatMap returning Row<> based on ArrayList elements()

2019-08-07 Thread Haibo Sun
register the table when the number of elements/columns and data types are both nondeterministic. Correct me if I misunderstood your meaning. Best, Victor From: Andres Angel Date: Wednesday, August 7, 2019 at 9:55 PM To: Haibo Sun Cc: user Subject: Re: FlatMap returning Row<&

Re:Re: [ANNOUNCE] Hequn becomes a Flink committer

2019-08-07 Thread Haibo Sun
Congratulations! Best, Haibo At 2019-08-08 02:08:21, "Yun Tang" wrote: >Congratulations Hequn. > >Best >Yun Tang > >From: Rong Rong >Sent: Thursday, August 8, 2019 0:41 >Cc: dev ; user >Subject: Re: [ANNOUNCE] Hequn becomes a Flink committer > >Congratulations

Re:FlatMap returning Row<> based on ArrayList elements()

2019-08-07 Thread Haibo Sun
Hi Andres Angel, I guess people don't understand your problem (including me). I don't know if the following sample code is what you want; if not, can you describe the problem more clearly? StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
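The archived snippet cuts the sample code off after the first line. A minimal sketch in the same spirit, assuming comma-separated input strings and a fixed three-column schema (the thread's real difficulty is that a Row's arity must be declared up front, so a nondeterministic column count cannot be expressed this way):

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.typeutils.RowTypeInfo;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.types.Row;
import org.apache.flink.util.Collector;

public class FlatMapRowSample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<Row> rows = env
            .fromElements("a,b,c", "d,e,f")
            .flatMap(new FlatMapFunction<String, Row>() {
                @Override
                public void flatMap(String value, Collector<Row> out) {
                    // Build one Row per input line, one field per token.
                    String[] fields = value.split(",");
                    Row row = new Row(fields.length);
                    for (int i = 0; i < fields.length; i++) {
                        row.setField(i, fields[i]);
                    }
                    out.collect(row);
                }
            })
            // Row carries no type information of its own, so declare it explicitly.
            .returns(new RowTypeInfo(Types.STRING, Types.STRING, Types.STRING));

        rows.print();
        env.execute("flatmap-row-sample");
    }
}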

Re:Pramaters in eclipse with Flink

2019-08-06 Thread Haibo Sun
Hi alaa.abutaha, In fact, your problem is not related to Flink, but to how to specify program parameters in Eclipse. I think the following document will help you. https://www.cs.colostate.edu/helpdocs/cmd.pdf Best, Haibo At 2019-07-26 22:02:48, "alaa" wrote: >Hello > I run this example
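In Eclipse the values go into Run Configurations > Arguments > "Program arguments". As a hedged aside, a minimal sketch of how a Flink job typically picks such parameters up with Flink's own ParameterTool utility; the --input and --output parameter names here are hypothetical:

import org.apache.flink.api.java.utils.ParameterTool;

public class ArgsExample {
    public static void main(String[] args) {
        // Whatever is typed into Eclipse's "Program arguments" box arrives here as args.
        ParameterTool params = ParameterTool.fromArgs(args);
        String input = params.getRequired("input");           // --input /path/to/data
        String output = params.get("output", "/tmp/result");  // --output is optional
        System.out.println("input=" + input + ", output=" + output);
    }
}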

Re:Re: How to write value only using flink's SequenceFileWriter?

2019-08-06 Thread Haibo Sun
Hi Liu Bo, If you haven't customized serializations through the configuration item "io.serializations", the default serializer for Writable objects is org.apache.hadoop.io.serializer.WritableSerialization.WritableSerializer. As you said, when WritableSerializer serializes the NullWritable
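To illustrate the behavior with plain Hadoop APIs rather than Flink's writer (a sketch under the assumption that value-only output is the goal; the path and value type are hypothetical): NullWritable's write() is a no-op, so a NullWritable key contributes no bytes to each record.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class ValueOnlySeqFile {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
                SequenceFile.Writer.file(new Path("/tmp/values.seq")),
                SequenceFile.Writer.keyClass(NullWritable.class),
                SequenceFile.Writer.valueClass(Text.class))) {
            // The NullWritable key serializes to zero bytes, so each record
            // effectively holds only the value.
            writer.append(NullWritable.get(), new Text("value-only record"));
        }
    }
}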

Re:StreamingFileSink part file count reset

2019-07-29 Thread Haibo Sun
Hi Sidhartha, Currently, the part counter is never reset to 0, nor can the part filename be customized, so I don't think there's any way to reset it right now. I guess the reason it can't be reset to 0 is the concern that previous parts would be overwritten. Although

Re:sqlQuery split string

2019-07-24 Thread Haibo Sun
Hi Andres Angel, At present, there seems to be no such built-in function, and you need to register a user-defined function to do that. You can look at the following document to see how to do it. https://ci.apache.org/projects/flink/flink-docs-master/dev/table/udfs.html#table-functions Best,
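A minimal sketch of such a user-defined table function, following the pattern in that document (the function, column, and table names are hypothetical):

import org.apache.flink.table.functions.TableFunction;

// Emits one row per token of the input string.
public class SplitFunction extends TableFunction<String> {
    public void eval(String str, String separator) {
        for (String token : str.split(separator)) {
            collect(token);
        }
    }
}

After registering it with tableEnv.registerFunction("split", new SplitFunction()), it can be used in SQL as: SELECT word FROM MyTable, LATERAL TABLE(split(line, ',')) AS T(word).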

Re:Re: Re: Re: Flink and Akka version incompatibility ..

2019-07-24 Thread Haibo Sun
s in Flink ? regards. On Wed, Jul 24, 2019 at 3:44 PM Debasish Ghosh wrote: Hi Haibo - Thanks for the clarification .. regards. On Wed, Jul 24, 2019 at 2:58 PM Haibo Sun wrote: Hi Debasish Ghosh, I agree that Flink should shade its Akka. Maybe you misunderstood me. I mean, in t

Re:Re: Re: Flink and Akka version incompatibility ..

2019-07-24 Thread Haibo Sun
o that all the projects that use flink won't hit this kind of issue. Haibo Sun wrote on Wednesday, July 24, 2019 at 4:07 PM: Hi, Debasish Ghosh I don't know why Akka is not shaded; maybe it can be shaded. Chesnay may be able to answer that. I recommend shading the Akka dependency of your application because it won't be kn

Re:Re: How to handle JDBC connections in a topology

2019-07-24 Thread Haibo Sun
Hi Stephen, I don't think it's possible to use the same connection pool for the entire topology, because the nodes on the topology may run in different JVMs and on different machines. If you want all operators running in the same JVM to use the same connection pool, I think you can
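The archived answer truncates here. One common way to finish that thought is a JVM-wide singleton pool looked up from a rich function, sketched below with hypothetical HikariCP settings (the JDBC URL and pool size are placeholders):

import java.sql.Connection;
import javax.sql.DataSource;

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;
import org.apache.flink.api.common.functions.RichMapFunction;

public class EnrichFunction extends RichMapFunction<String, String> {

    // One pool per JVM: all subtasks deployed into the same task manager share it.
    private static volatile DataSource pool;

    private static DataSource getPool() {
        if (pool == null) {
            synchronized (EnrichFunction.class) {
                if (pool == null) {
                    HikariConfig cfg = new HikariConfig();
                    cfg.setJdbcUrl("jdbc:postgresql://db-host:5432/mydb"); // hypothetical
                    cfg.setMaximumPoolSize(8);
                    pool = new HikariDataSource(cfg);
                }
            }
        }
        return pool;
    }

    @Override
    public String map(String value) throws Exception {
        try (Connection conn = getPool().getConnection()) {
            // ... run the lookup query for "value" and enrich the record ...
            return value;
        }
    }
}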

Re:Re: Flink and Akka version incompatibility ..

2019-07-24 Thread Haibo Sun
sion of my application instead of Flink ? regards. On Wed, Jul 24, 2019 at 12:42 PM Haibo Sun wrote: Hi Debasish Ghosh, Does your application have to depend on Akka 2.5? If not, it's a good idea to always keep the Akka version that the application depend on in line with Flink. If you want to t

Re:Flink and Akka version incompatibility ..

2019-07-24 Thread Haibo Sun
Hi Debasish Ghosh, Does your application have to depend on Akka 2.5? If not, it's a good idea to always keep the Akka version that the application depends on in line with Flink's. If you want to try shading the Akka dependency, I think it is more advisable to shade the Akka dependency of your
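For reference, a sketch of what such a relocation could look like with the maven-shade-plugin in the application's pom.xml. One caveat worth hedging: Akka resolves class names from its reference.conf, so a plain package relocation usually also requires rewriting those config entries.

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <!-- Move the application's Akka classes into a private namespace so
               they cannot clash with the Akka version bundled with Flink. -->
          <relocation>
            <pattern>akka</pattern>
            <shadedPattern>shaded.akka</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>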

Re:Re: MiniClusterResource class not found using AbstractTestBase

2019-07-23 Thread Haibo Sun
Hi, Juan It depends on "flink-runtime-*-tests.jar", so build.sbt should be modified as follows: scalaVersion := "2.11.0" val flinkVersion = "1.8.1" libraryDependencies ++= Seq( "org.apache.flink" %% "flink-test-utils" % flinkVersion % Test, "org.apache.flink" %%
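The archived snippet cuts the dependency list off. A plausible completion, assuming the missing entry is the "tests" classifier of flink-runtime (which is where the test jar the answer names comes from):

scalaVersion := "2.11.0"

val flinkVersion = "1.8.1"

libraryDependencies ++= Seq(
  "org.apache.flink" %% "flink-test-utils" % flinkVersion % Test,
  // flink-runtime-*-tests.jar is pulled in via the "tests" classifier.
  "org.apache.flink" %% "flink-runtime" % flinkVersion % Test classifier "tests"
)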

Re:AW: Re:Unable to build Flink1.10 from source

2019-07-22 Thread Haibo Sun
true Best, Haibo At 2019-07-23 04:54:11, "Yebgenya Lazarkhosrouabadi" wrote: Hi, I used the command mvn clean package -DskipTests -Punsafe-mapr-repo, but it didn’t work. I get the same error. Regards Yebgenya Lazar From: Haibo Sun Sent: Monday, July 22, 201

Re:Unable to build Flink1.10 from source

2019-07-21 Thread Haibo Sun
Hi, Yebgenya The reason for this problem can be found in this email (http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/NOTICE-SSL-issue-when-building-flink-mapr-fs-td30757.html). The solution is to add the parameter "-Punsafe-mapr-repo" to the maven command, as given in the

Re:Re: Writing Flink logs into specific file

2019-07-19 Thread Haibo Sun
Hi, Soheil A log configuration file placed in the resource directory of the job's jar will not be used by Flink, because the log configuration is explicitly specified by the scripts under Flink's bin directory and by the bootstrap code (for example, the BootstrapTools class). If you want
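The snippet truncates before the actual suggestion. As a hedged illustration, edits of this shape to Flink's own conf/log4j.properties would route a job's packages to a dedicated file (the com.example.myjob package and the target path are hypothetical):

# Flink's default root logger, writing to the file Flink chooses via ${log.file}.
log4j.rootLogger=INFO, file
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.file=${log.file}
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n

# Route the job's own classes to a separate file.
log4j.logger.com.example.myjob=INFO, jobfile
log4j.appender.jobfile=org.apache.log4j.FileAppender
log4j.appender.jobfile.file=/var/log/flink/myjob.log
log4j.appender.jobfile.layout=org.apache.log4j.PatternLayout
log4j.appender.jobfile.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n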

Re:Re: yarn-session vs cluster per job for streaming jobs

2019-07-18 Thread Haibo Sun
e standard situation where I deploy a new version of all my jobs. My current impression is that jobs start faster in yarn-session mode. Thanks, Maxim. On Thu, Jul 18, 2019 at 4:57 AM Haibo Sun wrote: Hi, Maxim For the concern raised in your first point: If HA and checkpointing are enabled, the AM (the a

Re:Re: Job leak in attached mode (batch scenario)

2019-07-17 Thread Haibo Sun
" wrote: Thanks Haibo for the response! Is there any community issue or plan to implement a heartbeat mechanism between Dispatcher and Client? If not, should I create one? Regards, Qi On Jul 17, 2019, at 10:19 AM, Haibo Sun wrote: Hi, Qi As far as I know, there is no such mechanism now.

Re:yarn-session vs cluster per job for streaming jobs

2019-07-17 Thread Haibo Sun
Hi, Maxim For the concern raised in your first point: If HA and checkpointing are enabled, the AM (the application master, i.e. the job manager you mentioned) will be restarted by YARN after it dies, and then the dispatcher will try to restore all the previously running jobs correctly. Note that
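A sketch of the flink-conf.yaml settings this answer assumes (the ZooKeeper addresses and the storage path are placeholders):

high-availability: zookeeper
high-availability.zookeeper.quorum: zk-host1:2181,zk-host2:2181
high-availability.storageDir: hdfs:///flink/ha
# Allow YARN to restart the application master after a failure.
yarn.application-attempts: 10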

Re:Job leak in attached mode (batch scenario)

2019-07-16 Thread Haibo Sun
Hi, Qi As far as I know, there is no such mechanism now. To achieve this, I think it may be necessary to add a REST-based heartbeat mechanism between Dispatcher and Client. At present, perhaps you can add a monitoring service to deal with these residual Flink clusters. Best, Haibo At

Re:Converting Metrics from a Reporter to a Custom Events mapping

2019-07-16 Thread Haibo Sun
Hi, Vijay Or can you implement a Reporter that transforms the metrics and sends them directly to a Kinesis Stream? Best, Haibo At 2019-07-16 00:01:36, "Vijay Balakrishnan" wrote: Hi, I need to capture the Metrics sent from a Flink app to a Reporter and transform them to an Events API
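A minimal sketch of such a reporter against Flink's MetricReporter and Scheduled interfaces; the Kinesis client call and the event mapping (toEventJson) are hypothetical stand-ins left as comments:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.flink.metrics.Metric;
import org.apache.flink.metrics.MetricConfig;
import org.apache.flink.metrics.MetricGroup;
import org.apache.flink.metrics.reporter.MetricReporter;
import org.apache.flink.metrics.reporter.Scheduled;

public class KinesisEventsReporter implements MetricReporter, Scheduled {

    private final Map<String, Metric> metrics = new ConcurrentHashMap<>();

    @Override
    public void open(MetricConfig config) {
        // Create the Kinesis client here, e.g. from config.getString("streamName", null).
    }

    @Override
    public void close() {
        // Shut the client down.
    }

    @Override
    public void notifyOfAddedMetric(Metric metric, String metricName, MetricGroup group) {
        metrics.put(group.getMetricIdentifier(metricName), metric);
    }

    @Override
    public void notifyOfRemovedMetric(Metric metric, String metricName, MetricGroup group) {
        metrics.remove(group.getMetricIdentifier(metricName));
    }

    @Override
    public void report() {
        for (Map.Entry<String, Metric> entry : metrics.entrySet()) {
            // Map each metric to the custom event schema and push it, e.g.:
            // kinesisClient.putRecord(streamName, toEventJson(entry), partitionKey);
        }
    }
}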

Re:Re: Creating a Source function to read data continuously

2019-07-15 Thread Haibo Sun
Hi, Soheil As Caizhi said, to create a source that implements `SourceFunction`, you can first take a closer look at the example in javadoc (https://ci.apache.org/projects/flink/flink-docs-release-1.8/api/java/org/apache/flink/streaming/api/functions/source/SourceFunction.html). Although
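For convenience, a minimal sketch in the spirit of that javadoc example: a source that polls until cancelled, where fetchNextRecord() is a hypothetical stand-in for the actual read.

import org.apache.flink.streaming.api.functions.source.SourceFunction;

public class ContinuousSource implements SourceFunction<String> {

    private volatile boolean isRunning = true;

    @Override
    public void run(SourceContext<String> ctx) throws Exception {
        while (isRunning) {
            String record = fetchNextRecord();
            if (record != null) {
                // Emit under the checkpoint lock so emission stays atomic with
                // any state updates, as the SourceFunction javadoc describes.
                synchronized (ctx.getCheckpointLock()) {
                    ctx.collect(record);
                }
            } else {
                Thread.sleep(100); // back off when no data is available
            }
        }
    }

    @Override
    public void cancel() {
        isRunning = false;
    }

    // Hypothetical stand-in for the actual continuous read (file, socket, queue, ...).
    private String fetchNextRecord() {
        return null;
    }
}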

Re:State incompatible

2019-07-15 Thread Haibo Sun
Hi, Avi Levi I don't think there's any way to solve this problem right now, and Flink documentation clearly shows that this is not supported. “Trying to restore state, which was previously configured without TTL, using TTL enabled descriptor or vice versa will lead to compatibility failure

Re:Re: [ANNOUNCE] Rong Rong becomes a Flink committer

2019-07-11 Thread Haibo Sun
Congrats Rong! Best, Haibo At 2019-07-12 09:40:26, "JingsongLee" wrote: Congratulations Rong. Rong Rong has done a lot of nice work for the Flink community in the past. Best, JingsongLee -- From:Rong Rong Send

Re:Cannot catch exception throws by kafka consumer with JSONKeyValueDeserializationSchema

2019-07-08 Thread Haibo Sun
Hi, Zhechao If you can, please share your full exception stack and show where in your code you are trying to catch the exception (preferably by posting the relevant code directly). That will help us understand and locate the issue you're encountering. Best, Haibo At 2019-07-08 14:11:22,

Re:Tracking message processing in my application

2019-07-04 Thread Haibo Sun
Hi, Roey > What do you think about that? I would have some concerns about throughput and latency, so I think the operators should report state data asynchronously and in batches to minimize the impact of monitoring on normal business processing. In addition, if the amount of

Re:Load database table with many columns

2019-07-03 Thread Haibo Sun
Hi, Soheil Pourbafrani The current implementation of JDBCInputFormat cannot automatically infer column types, and as far as I know, there is no other way to do this. If you're going to implement such an input format, you will need to do the inference yourself. Because it
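A sketch of what doing it by hand looks like with JDBCInputFormat (the connection settings and query are hypothetical); each selected column needs a matching entry in the RowTypeInfo:

import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.java.io.jdbc.JDBCInputFormat;
import org.apache.flink.api.java.typeutils.RowTypeInfo;

public class WideTableRead {
    public static void main(String[] args) {
        // One entry per selected column; this is the part that cannot be inferred.
        RowTypeInfo rowTypeInfo = new RowTypeInfo(
            BasicTypeInfo.INT_TYPE_INFO,
            BasicTypeInfo.STRING_TYPE_INFO,
            BasicTypeInfo.DOUBLE_TYPE_INFO);

        JDBCInputFormat inputFormat = JDBCInputFormat.buildJDBCInputFormat()
            .setDrivername("org.postgresql.Driver")
            .setDBUrl("jdbc:postgresql://db-host:5432/mydb")
            .setUsername("user")
            .setPassword("password")
            .setQuery("SELECT id, name, score FROM wide_table")
            .setRowTypeInfo(rowTypeInfo)
            .finish();
    }
}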

Re:RE: Re:Re: File Naming Pattern from HadoopOutputFormat

2019-07-03 Thread Haibo Sun
h From: Haibo Sun Sent: Tuesday, July 2, 2019 5:57 AM To: Yitzchak Lieberman Cc: Hailu, Andreas [Tech] ; user@flink.apache.org Subject: Re:Re: File Naming Pattern from HadoopOutputFormat Hi, Andreas You are right. To meet this requirement, Flink should need to expose a interfac

Re:Could not load the native RocksDB library

2019-07-02 Thread Haibo Sun
Hi, Samya.Patro I guess this may be a setup problem. What OS and which version of the JDK do you use? You can try upgrading the JDK to see if the issue can be solved. Best, Haibo At 2019-07-02 17:16:59, "Patro, Samya" wrote: Hello, I am using rocksdb for storing state. But when I run the

Re:Re: File Naming Pattern from HadoopOutputFormat

2019-07-02 Thread Haibo Sun
file name as getBucketId() defined the directory for the files in case of partitioning the data, for example: /day=20190101/part-1-1 there is an open issue for that: https://issues.apache.org/jira/browse/FLINK-12573 On Tue, Jul 2, 2019 at 6:18 AM Haibo Sun wrote: Hi, Andreas I think the following

Re:File Naming Pattern from HadoopOutputFormat

2019-07-01 Thread Haibo Sun
Hi, Andreas I think the following things may be what you want. 1. For writing Avro, I think you can extend AvroOutputFormat and override the getDirectoryFileName() method to customize a file name, as shown below. The javadoc of AvroOutputFormat:
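The archived message truncates before the code it refers to. A minimal sketch of that suggestion (the part- prefix and .avro suffix are hypothetical naming choices):

import org.apache.flink.formats.avro.AvroOutputFormat;

public class NamedAvroOutputFormat<E> extends AvroOutputFormat<E> {

    public NamedAvroOutputFormat(Class<E> type) {
        super(type);
    }

    // Controls the file name written inside the output directory.
    @Override
    protected String getDirectoryFileName(int taskNumber) {
        return "part-" + taskNumber + ".avro";
    }
}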

Re:Re: Re:Flink batch job memory/disk leak when invoking set method on a static Configuration object.

2019-07-01 Thread Haibo Sun
-06-28 15:49:50, "Vadim Vararu" wrote: Hi, I've run it on a standalone Flink cluster. No Yarn involved. From: Haibo Sun Sent: Friday, June 28, 2019 6:13 AM To: Vadim Vararu Cc: user@flink.apache.org Subject: Re:Flink batch job memory/disk leak when invoking set method on a static Con

Re:Flink batch job memory/disk leak when invoking set method on a static Configuration object.

2019-06-27 Thread Haibo Sun
Hi, Vadim A similar issue occurred in earlier versions, see https://issues.apache.org/jira/browse/FLINK-4485. Are you running a Flink session cluster on YARN? I think it might be a bug. I'll see if I can reproduce it with the master branch code, and if so, I will try to investigate

Re:Limit number of jobs or Job Managers

2019-06-27 Thread Haibo Sun
Hi, Pankaj Chand If you're running Flink on YARN, you can do this by limiting the number of applications in the cluster or in the queue. As far as I know, Flink itself does not limit that. The following are the relevant YARN configuration items: yarn.scheduler.capacity.maximum-applications
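For illustration, the corresponding capacity-scheduler.xml entries (the flink queue name and the limits are hypothetical):

<property>
  <name>yarn.scheduler.capacity.maximum-applications</name>
  <value>100</value>
</property>
<property>
  <!-- Per-queue limit for a hypothetical "flink" queue under root. -->
  <name>yarn.scheduler.capacity.root.flink.maximum-applications</name>
  <value>10</value>
</property>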