Re: Apache Flink - Rest API for num of records in/out

2022-06-08 Thread M Singh
metrics for through the various name() methods 2.  Use the jobid API [1] and find the operator we’ve named in the “vertices” array   [1]https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/rest_api/#jobs-jobid   ah   From: M Singh Sent: Tuesday, June 7, 2022 4:51 PM

Apache Flink - Rest API for num of records in/out

2022-06-07 Thread M Singh
Hi Folks: I am trying to find if I can get the number of records for an operator using flinks REST API.  I've checked the docs at  https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/rest_api/.    I did see some apis that use vertexid, but could not find how to that info without

Apache Flink StateFun user email list

2022-04-17 Thread M Singh
Hi: I just wanted to see if this is the right place for Apache Flink StateFun users email list. If it is not, please let me know and I apologize for any inconvenience. Thanks

Apache StateFun - A few questions about of module.yaml

2022-04-12 Thread M Singh
Hi Folks: I am trying to understand details of module.yaml. The documentation at (https://nightlies.apache.org/flink/flink-statefun-docs-release-3.2/docs/modules/overview/) is just one example and I am trying to find out what are other kinds that can be configured. Also, in the greeter example

Apache StateFun - Broadcast to all functions and Timers

2022-04-10 Thread M Singh
Hi: I am working with Stateful Functions 3.2.0 using Java SDK. I wanted to find out if there is a broadcast an event to all functions functionality available in Stateful Functions just like broadcast process and keyed broadcast processing in Apache Flink. Also, how would we implement processing

StateFun - How to run the application in a Flink cluster and IDE

2022-04-09 Thread M Singh
Hi Folks: I would like to run StateFun application in a flink cluster just like a Flink application. I found the instructions in (https://nightlies.apache.org/flink/flink-statefun-docs-release-3.2/docs/deployment/overview/) but could not find if there is any difference in how to build the jar

Apache Flink - Exception on left outer join with 'kafka' connector

2022-02-22 Thread M Singh
Hi Folks: I am using 'kafka' connector and joining with data from jdbc source (using connector).   I am using Flink v 1.14.3.  If I do a left outer join between kafka source and jdbc source, and try to save it to another kafka sink using connectors api, I get the following exception: Exception

Re: Apache Flink - User Defined Functions - Exception when passing all arguments

2022-02-22 Thread M Singh
phase rather than when parsing, but it shouldn't work anyway. I suggest you to just use Table API for that, as it's richer. You can even use withColumns(range(..)) which gives you more control. Hope it helps,FG On Thu, Feb 17, 2022 at 1:34 AM M Singh wrote: Hi: I have a simple concatenate UDF

Re: Apache Flink - Continuously streaming data using jdbc connector

2022-02-21 Thread M Singh
AM M Singh wrote: Hi Folks: I am trying to monitor a jdbc source and continuously streaming data in an application using the jdbc connector.  However, the application stops after reading the data in the table. I've checked the docs (https://nightlies.apache.org/flink/flink-docs-master/docs

Apache Flink - Continuously streaming data using jdbc connector

2022-02-20 Thread M Singh
Hi Folks: I am trying to monitor a jdbc source and continuously streaming data in an application using the jdbc connector.  However, the application stops after reading the data in the table. I've checked the docs

Re: Apache Flink - Continuously monitoring directory using filesystem connector - 1.14.3

2022-02-18 Thread M Singh
18, 2022 at 1:28 AM M Singh wrote: Hi:   I have a simple application and am using file system connector to monitor a directory and then print to the console (using datastream).  However, the application stops after reading the file in the directory (at the moment I have a single file

Apache Flink - Continuously monitoring directory using filesystem connector - 1.14.3

2022-02-17 Thread M Singh
Hi:   I have a simple application and am using file system connector to monitor a directory and then print to the console (using datastream).  However, the application stops after reading the file in the directory (at the moment I have a single file in the directory).   I am using Apache Flink

Apache Flink - User Defined Functions - Exception when passing all arguments

2022-02-16 Thread M Singh
Hi: I have a simple concatenate UDF (for testing purpose) defined as:     public static class ConcatenateFunction extends ScalarFunction {        public String eval(@DataTypeHint(inputGroup = InputGroup.ANY) Object ... inputs) {            return Arrays.stream(inputs).map(i ->

Re: Apache Flink - Will later events from a table with watermark be propagated and when can CURRENT_WATERMARK be null in that situation

2022-02-12 Thread M Singh
 is NULL, and no events will be considered late. Hope this helps clarify things. Regards,David On Sat, Feb 12, 2022 at 12:01 AM M Singh wrote: I thought a little more about your references Martijn and wanted to confirm one thing - the table is specifying the watermark and the downstream view needs

Re: Apache Flink - Will later events from a table with watermark be propagated and when can CURRENT_WATERMARK be null in that situation

2022-02-11 Thread M Singh
. Mans On Friday, February 11, 2022, 05:02:49 PM EST, M Singh wrote: Hi Martijn: Thanks for the reference.    My understanding was that if we use watermark then any event with event time (in the above example) < event_time - 30 seconds will be dropped automatically.    My question

Re: Apache Flink - Will later events from a table with watermark be propagated and when can CURRENT_WATERMARK be null in that situation

2022-02-11 Thread M Singh
at 16:45, M Singh wrote: Hi: The flink docs (https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/functions/systemfunctions/) indicates that the CURRENT_WATERMARK(rowtime) can return null: Note that this function can return NULL, and you may have to consider this case. Fo

Apache Flink - Will later events from a table with watermark be propagated and when can CURRENT_WATERMARK be null in that situation

2022-02-11 Thread M Singh
Hi: The flink docs (https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/functions/systemfunctions/) indicates that the CURRENT_WATERMARK(rowtime) can return null: Note that this function can return NULL, and you may have to consider this case. For example, if you want to filter

Re: Apache Fink - Adding/Removing KeyedStreams at run time without stopping the application

2022-01-25 Thread M Singh
],groupByType:byAccount,aggregationKey:'2'}       From: Colletta, Edward Sent: Tuesday, January 25, 2022 1:29 PM To: M Singh ; Caizhi Weng ; User-Flink Subject: RE: Apache Fink - Adding/Removing KeyedStreams at run time without stopping the application   You don’t have to add keyBy’s

Re: Apache Fink - Adding/Removing KeyedStreams at run time without stopping the application

2022-01-25 Thread M Singh
to the stream based on the value of groupByFields   The flatMap means one message may be emitted several times with different values of aggregationKey so it may belong to multiple aggregations.       From: M Singh Sent: Monday, January 24, 2022 9:52 PM To: Caizhi Weng ; User-Flink

Re: Apache Flink - Can AllWindowedStream be parallel ?

2022-01-24 Thread M Singh
would be  SingleOutputStreamOperator, which cannot be parallel. [1]  https://github.com/apache/flink/blob/master/flink-streaming-scala/src/main/scala/org/apache/flink/streaming/api/scala/AllWindowedStream.scala BestYun TangFrom: M Singh Sent: Sunday, January 23, 2022 4:24 To: User-Flink Subject

Re: Apache Fink - Adding/Removing KeyedStreams at run time without stopping the application

2022-01-24 Thread M Singh
it is not possible to do so without restarting the job and as far as I know there is no existing framework/pattern to achieve this. By the way, why do you need this functionality? Could you elaborate more on your use case? M Singh 于2022年1月22日周六 21:32写道: Hi Folks: I am working on an exploratory project

Apache Flink - Can AllWindowedStream be parallel ?

2022-01-22 Thread M Singh
Hi Folks: The documentation for AllWindowedStream (https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/overview/#datastream-rarr-allwindowedstream) has a note: This is in many cases a non-parallel transformation. All records will be gathered in one task for the

Apache Fink - Adding/Removing KeyedStreams at run time without stopping the application

2022-01-22 Thread M Singh
Hi Folks: I am working on an exploratory project in which I would like to add/remove KeyedStreams (https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/overview/#keyby) without restarting the Flink streaming application. Is it possible natively in Apache Flink ? 

Re: Apache Flink - Building flink docs locally

2022-01-16 Thread M Singh
Thanks Chesnay for your pointers.  Mans On Sunday, January 16, 2022, 06:19:09 AM EST, Chesnay Schepler wrote: see https://github.com/apache/flink/tree/master/docs On 15/01/2022 15:04, M Singh wrote: Hi:   I wanted to find out what's the command for building the flink

Apache Flink - Building flink docs locally

2022-01-15 Thread M Singh
Hi:   I wanted to find out what's the command for building the flink docs locally and the location of the final docs.  Is it apart of the build commands (Building Flink from Source) and can they be built separately ?   Thanks 

Re: Apache Flink - Using upsert JDBC sink for DataStream

2021-10-20 Thread M Singh
`, `JdbcOutputFormat.Builder` would choose to create a `TableJdbcUpsertOutputFormat` or `JdbcOutputFormat` instance depends on whether key fields is defined in DML.  [1]  https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/table/jdbc/#idempotent-writes Best,JING ZHANG M Singh 于2021年10月19日周

Re: Apache Flink - Using upsert JDBC sink for DataStream

2021-10-18 Thread M Singh
; + ")\n" + "GROUP BY cnt, cTag") .await();   [1]  https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/connectors/table/jdbc/#key-handling[2]   https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/d

Apache Flink - Using upsert JDBC sink for DataStream

2021-10-16 Thread M Singh
Hi Folks: I am working on Flink DataStream pipeline and would like to use JDBC upsert functionality.  I found a class TableJdbcUpsertOutputFormat but am not sure who to use it with the JdbcSink as shown in the document 

Re: Apache Flink - Reading Avro messages from Kafka with schema in schema registry

2021-07-15 Thread M Singh
need schema registry. Best, Dawid [1]https://avro.apache.org/docs/1.10.2/spec.html#Data+Serialization+and+Deserialization [2] https://avro.apache.org/docs/1.10.2/spec.html#Schema+Resolution On 15/07/2021 15:48, M Singh wrote: Hello Dawid: Thanks for your answers and references

Re: Apache Flink - Reading Avro messages from Kafka with schema in schema registry

2021-07-15 Thread M Singh
ever used in a single parallel instance and it is not sent over a network again. So basically there you use the writer schema retrieved from schema registry as the reader schema. I hope this answers your questions. Best, Dawid [1] https://avro.apache.org/docs/1.10.2/spec.html On 09/07/2021 03:09,

Apache Flink - Reading Avro messages from Kafka with schema in schema registry

2021-07-08 Thread M Singh
Hi: I am trying to read avro encoded messages from Kafka with schema registered in schema registry. I am using the class (https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/formats/avro/registry/confluent/ConfluentRegistryAvroDeserializationSchema.html) using the

Re: Apache Flink - How to use/invoke LookupTableSource/Function

2021-07-08 Thread M Singh
ts/flink/flink-docs-release-1.13/docs/dev/table/sql/queries/joins/#event-time-temporal-join M Singh 于2021年7月7日周三 下午5:22写道: Hi Jing: Thanks for your explanation and references.  I looked at your reference (https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sql/queries/joi

Re: Apache Flink - How to use/invoke LookupTableSource/Function

2021-07-07 Thread M Singh
//ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sql/queries/joins/ Best regards,JING ZHANG M Singh 于2021年7月7日周三 上午8:23写道: Hey Folks: I am trying to understand how LookupTableSource works and have a few questions:  1. Are there other examples/documentation on how create

Apache Flink - How to use/invoke LookupTableSource/Function

2021-07-06 Thread M Singh
Hey Folks: I am trying to understand how LookupTableSource works and have a few questions:  1. Are there other examples/documentation on how create a query that uses it vs ScanTableSource ?2. Are there any best practices for using this interface ?3. How does the planner decide to use

Re: Apache Flink - A question about Tables API and SQL interfaces

2021-05-12 Thread M Singh
://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/tableapi/#groupby-window-aggregation[3]: https://github.com/ververica/flink-sql-cookbook#aggregations-and-analytics[4]: https://github.com/ververica/sql-training/wiki/Queries-and-Time On Wed, May 12, 2021 at 1:30 PM M Singh wrote: Hey

Apache Flink - A question about Tables API and SQL interfaces

2021-05-12 Thread M Singh
Hey Folks: I have the following questions regarding Table API/SQL in streaming mode: 1. Is there is a notion triggers/evictors/timers when using Table API or SQL interfaces ?2. Is there anything like side outputs and ability to define allowed lateness when dealing with the Table API or SQL

Re: Flink AskTimeoutException killing the jobs

2020-07-06 Thread M Singh
. The Akka ask timeout does not seem to be the root problem to me. Thank you~ Xintong Song On Sat, Jul 4, 2020 at 12:12 AM M Singh wrote: Hi Xintong/LakeShen: We have the following setting in flink-conf.yaml akka.ask.timeout: 180 s akka.tcp.timeout: 180 s But still see this exception

Re: Flink AskTimeoutException killing the jobs

2020-07-03 Thread M Singh
would suggest to look into the jobmanager logs and gc logs, see if there's any problem that prevent the process from handling the rpc messages timely. Thank you~ Xintong Song On Fri, Jul 3, 2020 at 3:51 AM M Singh wrote: Hi: I am using Flink 1.10 on AWS EMR cluster. We are getting

Flink AskTimeoutException killing the jobs

2020-07-02 Thread M Singh
Hi: I am using Flink 1.10 on AWS EMR cluster. We are getting AskTimeoutExceptions which is causing the flink jobs to die.    Caused by: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://flink/user/resourcemanager#-1602864959]] after [1 ms]. Message of type

Re: Kinesis ProvisionedThroughputExceededException

2020-06-18 Thread M Singh
the data to consumer, instead of consumer periodically pulling the data). Roman Grebennikov | g...@dfdx.me On Wed, Jun 17, 2020, at 04:39, M Singh wrote: Thanks Roman for your response and advice. >From my understanding increasing shards will increase throughput but still if >more

Re: Kinesis ProvisionedThroughputExceededException

2020-06-16 Thread M Singh
ot;1.5") // multiplying pause to 1.5 on each next step     conf.put(ConsumerConfigConstants.SHARD_GETRECORDS_RETRIES, "100") // and make up to 100 retries with best regards, Roman Grebennikov | g...@dfdx.me On Mon, Jun 15, 2020, at 13:45, M Singh wrote: Hi: I am using multiple (almost

Kinesis ProvisionedThroughputExceededException

2020-06-15 Thread M Singh
Hi: I am using multiple (almost 30 and growing) Flink streaming applications that read from the same kinesis stream and get  ProvisionedThroughputExceededException exception which fails the job. I have seen a reference 

Flink savepoints history

2020-06-07 Thread M Singh
Hi: I wanted to find out if we can access the savepoints created for a job or all jobs using Flink CLI or REST API.   I took a look at the CLI (Apache Flink 1.10 Documentation: Command-Line Interface) and REST API

Re: Stopping a job

2020-06-06 Thread M Singh
-docs-release-1.6/api/java/org/apache/flink/api/common/functions/StoppableFunction.html[2] https://github.com/apache/flink/blob/release-1.6/flink-core/src/main/java/org/apache/flink/api/common/functions/StoppableFunction.java On Sat, Jun 6, 2020 at 8:03 PM M Singh wrote: Hi Arvid: I check

Re: Stopping a job

2020-06-06 Thread M Singh
if this SO thread [1] helps you already? [1] https://stackoverflow.com/questions/53735318/flink-how-to-solve-error-this-job-is-not-stoppable On Thu, Jun 4, 2020 at 7:43 PM M Singh wrote: Hi: I am running a job which consumes data from Kinesis and send data to another Kinesis queue.  I am using

Stopping a job

2020-06-04 Thread M Singh
Hi: I am running a job which consumes data from Kinesis and send data to another Kinesis queue.  I am using an older version of Flink (1.6), and when I try to stop the job I get an exception  Caused by: java.util.concurrent.ExecutionException:

Re: Apache Flink - Question about application restart

2020-05-28 Thread M Singh
will recover the submitted jobs from a persistent storage system. Cheers,Till On Thu, May 28, 2020 at 4:05 PM M Singh wrote: Hi Till/Zhu/Yang:  Thanks for your replies. So just to clarify - the job id remains same if the job restarts have not been exhausted.  Does Yarn also resubmit the job in case

Re: ClusterClientFactory selection

2020-05-28 Thread M Singh
e) is also used for the same purpose from the execution environments. Cheers, Kostas On Wed, May 27, 2020 at 4:49 AM Yang Wang wrote: > > Hi M Singh, > > The Flink CLI picks up the correct ClusterClientFactory via java SPI. You > could check YarnClusterClientFactory#isCompatibleWit

Re: Apache Flink - Question about application restart

2020-05-28 Thread M Singh
the checkpoints associated with this job. Cheers,Till On Tue, May 26, 2020 at 12:42 PM M Singh wrote: Hi Zhu Zhu: I have another clafication - it looks like if I run the same app multiple times - it's job id changes.  So it looks like even though the graph is the same the job id is not dependent

ClusterClientFactory selection

2020-05-26 Thread M Singh
Hi: I wanted to find out which parameter/configuration allows flink cli pick up the appropriate cluster client factory (especially in the yarn mode). Thanks

Re: Apache Flink - Question about application restart

2020-05-26 Thread M Singh
know if I've missed anything. Thanks On Monday, May 25, 2020, 05:32:39 PM EDT, M Singh wrote: Hi Zhu Zhu: Just to clarify - from what I understand, EMR also has by default restart times (I think it is 3). So if the EMR restarts the job - the job id is the same since the job graph

Re: Apache Flink - Error on creating savepoints using REST interface

2020-05-25 Thread M Singh
On Saturday, May 23, 2020, 10:17:27 AM EDT, Chesnay Schepler wrote: You also have to set the boolean cancel-job parameter. On 22/05/2020 22:47, M Singh wrote: Hi: I am using Flink 1.6.2 and trying to create a savepoint using the following curl command using the following

Re: Apache Flink - Question about application restart

2020-05-25 Thread M Singh
side and persist in DFS for reuse2. yes if high availability is enabled Thanks,Zhu Zhu M Singh 于2020年5月23日周六 上午4:06写道: Hi Flink Folks: If I have a Flink Application with 10 restarts, if it fails and restarts, then: 1. Does the job have the same id ?2. Does the automatically restarting

Apache Flink - Error on creating savepoints using REST interface

2020-05-22 Thread M Singh
Hi: I am using Flink 1.6.2 and trying to create a savepoint using the following curl command using the following references (https://ci.apache.org/projects/flink/flink-docs-release-1.6/monitoring/rest_api.html) and 

Apache Flink - Question about application restart

2020-05-22 Thread M Singh
Hi Flink Folks: If I have a Flink Application with 10 restarts, if it fails and restarts, then: 1. Does the job have the same id ?2. Does the automatically restarting application, pickup from the last checkpoint ? I am assuming it does but just want to confirm. Also, if it is running on AWS EMR

Re: Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-26 Thread M Singh
BTW - Is there any limit to the amount of data that can be stored on emptyDir in K8 ?   On Wednesday, February 26, 2020, 07:33:54 PM EST, M Singh wrote: Thanks Yang and Arvid for your advice and pointers.  Mans On Wednesday, February 26, 2020, 09:52:26 AM EST, Arvid Heise wrote

Re: Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-26 Thread M Singh
at 3:36 AM Yang Wang wrote: Hi M Singh, > Mans - If we use the session based deployment option for K8 - I thought K8 >will automatically restarts any failed TM or JM.  In the case of failed TM - the job will probably recover, but in the case of failed JM - perhaps we need to resubmit al

Re: Apache Flink Job fails repeatedly due to RemoteTransportException

2020-02-24 Thread M Singh
Thanks will try your recommendations and apologize for the delayed response. On Wednesday, January 29, 2020, 09:58:26 AM EST, Till Rohrmann wrote: Hi M Singh, have you checked the TaskManager logs of  ip-xx-xxx-xxx-xxx.ec2.internal/xx.xxx.xxx.xxx:39623 for any suspicious logging

Re: Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-24 Thread M Singh
should we use emptyDir ? [1].  https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html M Singh 于2020年2月23日周日 上午2:28写道: Hey Folks: I am trying to figure out the options for running Flink on Kubernetes and am trying to find out the pros and cons of running in F

Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-22 Thread M Singh
Hey Folks: I am trying to figure out the options for running Flink on Kubernetes and am trying to find out the pros and cons of running in Flink Session vs Flink Cluster mode

Apache Flink Job fails repeatedly due to RemoteTransportException

2020-01-28 Thread M Singh
Hi Folks: We have streaming Flink application (using v 1.6.2) and it dies within 12 hours.  We have configured number of restarts which is 10 at the moment. Sometimes the job runs for some time and then within a very short time has a number of restarts and finally fails.  In other instances, the

Re: Apache Flink - Sharing state in processors

2020-01-23 Thread M Singh
/KeyGroupStreamPartitioner.java#L58 BestYun Tang From: M Singh Sent: Friday, January 10, 2020 23:29 To: User Subject: Apache Flink - Sharing state in processors Hi: I have a few question about how state is shared in processors in Flink. 1. If I have a processor instantiated in the Flink app, and apply use

Apache Flink - Sharing state in processors

2020-01-10 Thread M Singh
Hi: I have a few question about how state is shared in processors in Flink. 1. If I have a processor instantiated in the Flink app, and apply use in multiple times in the Flink -     (a) if the tasks are in the same slot - do they share the same processor on the taskmanager ?     (b) if the

Re: Apache Flink - Flink Metrics collection using Prometheus on EMR from streaming mode

2019-12-25 Thread M Singh
ecommended for a streaming job? Best,Vino M Singh 于2019年12月24日周二 下午4:02写道: Hi: I wanted to find out what's the best way of collecting Flink metrics using Prometheus in a streaming application on EMR/Hadoop. Since the Flink streaming jobs could be running on any node - is there any Prome

Apache Flink - Flink Metrics collection using Prometheus on EMR from streaming mode

2019-12-24 Thread M Singh
Hi: I wanted to find out what's the best way of collecting Flink metrics using Prometheus in a streaming application on EMR/Hadoop. Since the Flink streaming jobs could be running on any node - is there any Prometheus configuration or service discovery option available that will dynamically

Re: Apache Flink - Flink Metrics - How to distinguish b/w metrics for two job manager on the same host

2019-12-19 Thread M Singh
specify it to distinguish  different Flink cluster.[1] [1]:  https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#datadog-orgapacheflinkmetricsdatadogdatadoghttpreporter Best,Vino M Singh 于2019年12月19日周四 上午2:54写道: Hi: I am using AWS EMR with Flink application and two of the job

Apache Flink - Flink Metrics - How to distinguish b/w metrics for two job manager on the same host

2019-12-18 Thread M Singh
Hi: I am using AWS EMR with Flink application and two of the job managers are running on the same host.  I am looking at the metrics documentation (Apache Flink 1.9 Documentation: Metrics) and and see the following:  | | | | Apache Flink 1.9 Documentation: Metrics | | | -

Re: Apache Flink - Clarifications about late side output

2019-12-11 Thread M Singh
On Wed, Dec 11, 2019 at 1:40 PM M Singh wrote: Thanks Timo for your answer.  I will try the prototype but was wondering if I can find some theoretical documentation to give me a sound understanding. Mans On Wednesday, December 11, 2019, 05:44:15 AM EST, Timo Walther wrote: Little mistake

Re: Apache Flink - Retries for async processing

2019-12-11 Thread M Singh
Thanks Zhu for your advice.  Mans On Tuesday, December 10, 2019, 09:32:01 PM EST, Zhu Zhu wrote: Hi M Singh, I think you would be able to know the request failure cause and whether it is recoverable or not.You can handle the error as you like. For example, if you think the error

Re: Apache Flink - Clarifications about late side output

2019-12-11 Thread M Singh
time for testing the > semantics. > > I hope this helps. Otherwise of course we can help you in finding the > answers to the remaining questions. > > Regards, > Timo > > > > On 10.12.19 20:32, M Singh wrote: >> Hi: >> >> I have a few question

Apache Flink - Clarifications about late side output

2019-12-10 Thread M Singh
Hi: I have a few questions about the side output late data.   Here is the API stream .keyBy(...) <- keyed versus non-keyed windows .window(...) <- required: "assigner" [.trigger(...)]<- optional: "trigger" (else default trigger)

Re: Apache Flink - Retries for async processing

2019-12-10 Thread M Singh
:08:39 PM EST, Jingsong Li wrote: Hi M Singh, Our internal has this scenario too, as far as I know, Flink does not have this internal mechanism in 1.9 too.I can share my solution:- In async function, start a thread factory.- Send the call to thread factory when this call has failed. Do

Re: Side output question

2019-12-10 Thread M Singh
, there should be no issue to only have side-outputs in your operator. There should also be no big drawbacks. I guess mostly some metrics will not be properly populated, but you can always populate them manually or add new ones. Best, Arvid On Mon, Dec 2, 2019 at 8:40 PM M Singh wrote: Hi: I am replacing

Apache Flink - Retries for async processing

2019-12-09 Thread M Singh
Hi Folks: I am working on a project where I will be using Flink's async processing capabilities.  The job has to make http request using a token.  The token expires periodically and needs to be refreshed. So, I was looking for patterns for handling async call failures and retries when the token

Side output question

2019-12-02 Thread M Singh
Hi: I am replacing SplitOperator in my flink application with a simple processor with side outputs. My questions is that does the main stream from which we get the side outputs need to have any events (ie, produced using by the using collector.collect) ?  Or can we have all the output as side

Re: Apache Flink - Throttling stream flow

2019-11-29 Thread M Singh
Wednesday, November 27, 2019, 11:32:06 AM EST, Rong Rong wrote: Hi Mans, is this what you are looking for [1][2]? --Rong [1] https://issues.apache.org/jira/browse/FLINK-11501[2]  https://github.com/apache/flink/pull/7679 On Mon, Nov 25, 2019 at 3:29 AM M Singh wrote: Thanks Ciazhi & Th

Re: Apache Flink - Troubleshooting exception while starting the job from savepoint

2019-11-29 Thread M Singh
-to-all-operators-in-my-job[2]   https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/state/savepoints.html#what-happens-if-i-delete-an-operator-that-has-state-from-my-job Best,Congxian M Singh 于2019年11月26日周二 上午10:49写道: Hi Kostas/Congxian: Thanks fo your response.   Based on your

Re: Apache Flink - Troubleshooting exception while starting the job from savepoint

2019-11-25 Thread M Singh
rovide some code from your job, just to see if > there is anything problematic with the application code? > Normally there should be no problem with not providing UIDs for some > stateless operators. > > Cheers, > Kostas > > On Sat, Nov 23, 2019 at 11:16 AM M Singh wrote:

Re: Apache Flink - Uid and name for Flink sources and sinks

2019-11-25 Thread M Singh
Thanks DIan for your pointers.  MansOn Sunday, November 24, 2019, 08:57:53 PM EST, Dian Fu wrote: Hi Mans, Please see my reply inline below. 在 2019年11月25日,上午5:42,M Singh 写道: Thanks Dian for your answers. A few more questions: 1. If I do not assign uids to operators/sources

Re: Apache Flink - Throttling stream flow

2019-11-25 Thread M Singh
able/udfs.html Here is a throttling iterator example:  https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/utils/ThrottledIterator.java M Singh 于2019年11月25日周一 上午5:50写道: Hi: I have an Flink streaming applica

Apache Flink - Throttling stream flow

2019-11-24 Thread M Singh
Hi: I have an Flink streaming application that invokes  some other web services.  However the webservices have limited throughput.  So I wanted to find out if there any recommendations on how to throttle the Flink datastream so that they don't overload the downstrream services.  I am using

Re: Apache Flink - Uid and name for Flink sources and sinks

2019-11-24 Thread M Singh
ating a uid for the sink ?   >> Not sure what do you mean by "attribute id". Could you give some more >> detailed information about it? Regards, Dian On Fri, Nov 22, 2019 at 6:27 PM M Singh wrote: Hi Folks - Please let me know if you have any advice on the best practices

Re: Apache Flink - Troubleshooting exception while starting the job from savepoint

2019-11-23 Thread M Singh
Hey Folks:    Please let me know how to resolve this issue since using  --allowNonRestoredState without knowing if any state will be lost seems risky. ThanksOn Friday, November 22, 2019, 02:55:09 PM EST, M Singh wrote: Hi: I have a flink application in which some of the operators have

Apache Flink - Troubleshooting exception while starting the job from savepoint

2019-11-22 Thread M Singh
Hi: I have a flink application in which some of the operators have uid and name and some stateless ones don't. I've taken a save point and tried to start another instance of the application from a savepoint - I get the following exception which indicates that the operator is not available to

Re: Apache Flink - Uid and name for Flink sources and sinks

2019-11-22 Thread M Singh
Hi Folks - Please let me know if you have any advice on the best practices for setting uid for sources and sinks.  Thanks.  MansOn Thursday, November 21, 2019, 10:10:49 PM EST, M Singh wrote: Hi Folks: I am assigning uid and name for all stateful processors in our application

Apache Flink - Uid and name for Flink sources and sinks

2019-11-21 Thread M Singh
Hi Folks: I am assigning uid and name for all stateful processors in our application and wanted to find out the following: 1. Should we assign uid and name to the sources and sinks too ?  2. What are the pros and cons of adding uid to sources and sinks ?3. The sinks have uid and hashUid - which

Re: Re:Apache Flink - Operator name and uuid best practices

2019-11-21 Thread M Singh
-runtime/src/test/java/org/apache/flink/runtime/checkpoint/CheckpointMetadataLoadingTest.java Best,Congxian M Singh 于2019年11月21日周四 下午7:44写道: Hi Congxian: For my application i see many uuids under the chk-6 directory ( I posted one in the sample above).   I am trying to understand that if I

Re: Re:Apache Flink - Operator name and uuid best practices

2019-11-21 Thread M Singh
, and the assigned in the application belongs to one operator, they are different. Best,Congxian M Singh 于2019年11月21日周四 上午6:18写道: Hi Arvid: Thanks for your clarification. I am giving supplying uid for the stateful operators and find the following directory structure on in the chkpoint directory

Re: Re:Apache Flink - Operator name and uuid best practices

2019-11-20 Thread M Singh
in any case. Best, Arvid On Sat, Nov 16, 2019 at 6:40 PM M Singh wrote: Thanks Jiayi for your response. I am thinking on the same lines.   Regarding using the same name and uuid, I believe the checkpoint state for an operator will be easy to identify if the uuid is the same as name.  But I am

Re: Apache Airflow - Question about checkpointing and re-run a job

2019-11-18 Thread M Singh
-1.9/ops/state/checkpoints.html#resuming-from-a-retained-checkpoint Best,Congxian M Singh 于2019年11月18日周一 上午2:54写道: Folks - Please let me know if you have any advice on this question.  Thanks On Saturday, November 16, 2019, 02:39:18 PM EST, M Singh wrote: Hi: I have a Flink job

Re: Apache Airflow - Question about checkpointing and re-run a job

2019-11-17 Thread M Singh
Folks - Please let me know if you have any advice on this question.  Thanks On Saturday, November 16, 2019, 02:39:18 PM EST, M Singh wrote: Hi: I have a Flink job and sometimes I need to cancel and re run it.  From what I understand the checkpoints for a job are saved under the job id

Apache Airflow - Question about checkpointing and re-run a job

2019-11-16 Thread M Singh
Hi: I have a Flink job and sometimes I need to cancel and re run it.  From what I understand the checkpoints for a job are saved under the job id directory at the checkpoint location. If I run the same job again, it will get a new job id and the checkpoint saved from the previous run job (which

Re: Re:Apache Flink - Operator name and uuid best practices

2019-11-16 Thread M Singh
ery operator, which gives me better experience in monitoring and scaling. Hope this helps. [1]  https://ci.apache.org/projects/flink/flink-docs-stable/ops/upgrading.html#matching-operator-state Best, Jiayi Liao At 2019-11-16 18:35:38, "M Singh" wrote: Hi: I a

Apache Flink - Operator name and uuid best practices

2019-11-16 Thread M Singh
Hi: I am working on a project and wanted to find out what are the best practices for setting name and uuid for operators: 1. Are there any restrictions on the length of the name and uuid attributes ?2. Are there any restrictions on the characters used for name and uuid (blank spaces, etc) ?3.

Apache Flink - How to get heap dump when a job is failing in EMR

2019-08-21 Thread M Singh
Hi: Is there any configuration to get heap dump when job fails in an EMR ?  Thanks

Re: Apache Flink - Manipulating the state of an object in an evictor or trigger

2019-07-19 Thread M Singh
t's OK for now. But I suggest do not manipulate state in `Evictor` until there is an official guarantee. BTW, I'm just wondering why do you want to manipulate state in `Evictor`? If your requirement is reasonable, maybe we could support it officially. M Singh 于2019年7月14日周日 下午9:07写道: Hi: Is it s

Re: Apache Flink - Event time and process time timers with same timestamp

2019-07-19 Thread M Singh
ent `TimeCharacteristic` in one job at the same time?I guess the answer is no. So I don't think there exists such a scenario. M Singh 于2019年7月19日周五 上午12:19写道: Hey Folks - Just checking if you have any pointers for me.  Thanks for your advice. On Sunday, July 14, 2019, 03:12:25 PM EDT, M Singh wr

Re: Apache Flink - Side output time semantics for DataStream

2019-07-19 Thread M Singh
not sure what you exactly mean. Could you describe more about your requirements? M Singh 于2019年7月14日周日 上午9:33写道: Hi: I wanted to find out what is the timestamp associated with the elements of a stream side output with different stream time characteristics. Thanks Man 

Re: Apache Flink - Event time and process time timers with same timestamp

2019-07-18 Thread M Singh
Hey Folks - Just checking if you have any pointers for me.  Thanks for your advice. On Sunday, July 14, 2019, 03:12:25 PM EDT, M Singh wrote: Also, are the event time timers and processing time timers handled separately - ie,  if I register event time timer and then use the same

Re: Apache Flink - Side output time semantics for DataStream

2019-07-18 Thread M Singh
Hey Folks - Just wanted to see if there are any thoughts on this question. ThanksOn Saturday, July 13, 2019, 09:33:15 PM EDT, M Singh wrote: Hi: I wanted to find out what is the timestamp associated with the elements of a stream side output with different stream time characteristics

  1   2   >