Re: Print on screen DataStream content

2020-11-23 Thread Pankaj Chand
Please correct me if I am wrong. `DataStream#print()` only prints to the screen when running from the IDE, but does not work (print to the screen) when running on a cluster (even a local cluster). Thanks, Pankaj On Mon, Nov 23, 2020 at 5:31 PM Austin Cawley-Edwards < austin.caw...@gmail.com> wro

Re: Questions on Table to Datastream conversion and onTimer method in KeyedProcessFunction

2020-11-23 Thread Jark Wu
AFAIK, FLINK-10886 is not implemented yet. cc @Becket may know more plans about this feature. Best, Jark On Sat, 21 Nov 2020 at 03:46, wrote: > Hi Timo, > > One more question, the blog also mentioned a jira task to solve this > issue. https://issues.apache.org/jira/browse/FLINK-10886. Will thi

Re: Hi I'm having problems with self-signed certificiate trust with Native K8S

2020-11-23 Thread Yang Wang
Hi Kevin, Let me try to understand your problem. You have added the trusted keystore to the Flink app image(my-flink-app:0.0.1) and it could not be loaded. Right? Even though you tunnel in the pod, you could not find the key store. It is strange. I know it is not very convenient to bundle the key

Learn flink source code book recommendation

2020-11-23 Thread ????
Excuse me, I want to learn the flink source code. Do you have any good information and the latest books?

Re: Print on screen DataStream content

2020-11-23 Thread Austin Cawley-Edwards
Hey Simone, I'd suggest trying out the `DataStream#print()` function to start, but there are a few other easy-to-integrate sinks for testing that you can check out in the docs here[1] Best, Austin [1]: https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/datastream_api.html#data-sink

Print on screen DataStream content

2020-11-23 Thread Simone Cavallarin
Hi All, On my code I have a DataStream that I would like to access. I need to understand what I'm getting for each transformation to check if the data that I'm working on make sense. How can I print into the console or get a file (csv, txt) for the variables: "stream", "enriched" and "result"?

Re: Dynamic ad hoc query deployment strategy

2020-11-23 Thread lalala
Hi Till, Thank you for your comment. I am looking forward to hearing from Timo and Dawid as well. Best regards, -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Logs of JobExecutionListener

2020-11-23 Thread Flavio Pompermaier
For the sake of simplification (so everybody looking for missing methods in RestClusterClient) I just shared the new methods at [1]. In this way you can add them to the RestClusterClient when you want (if you want to). I also had to change the visibility of some variables and methods in order to ma

Re: Logs of JobExecutionListener

2020-11-23 Thread Flavio Pompermaier
I don't know if they need to be added also to the ClusterClient but for sure they are missing in the RestClusterClient On Mon, Nov 23, 2020 at 4:31 PM Aljoscha Krettek wrote: > On 23.11.20 16:26, Flavio Pompermaier wrote: > > Thank you Aljosha,.now that's more clear! > > I didn't know that jobGr

Re: Logs of JobExecutionListener

2020-11-23 Thread Aljoscha Krettek
On 23.11.20 16:26, Flavio Pompermaier wrote: Thank you Aljosha,.now that's more clear! I didn't know that jobGraph.getJobID() was the solution for my use case..I was convinced that the job ID was assigned by the cluster! And to me it's really weird that the job listener was not called by the subm

Re: Logs of JobExecutionListener

2020-11-23 Thread Flavio Pompermaier
Thank you Aljosha,.now that's more clear! I didn't know that jobGraph.getJobID() was the solution for my use case..I was convinced that the job ID was assigned by the cluster! And to me it's really weird that the job listener was not called by the submitJob...Probably this should be documented at l

Re: Hi I'm having problems with self-signed certificiate trust with Native K8S

2020-11-23 Thread Till Rohrmann
Thanks for reaching out to the Flink community Kevin. Yes, with Flink 1.12.0 it should be possible to mount secrets with your K8s deployment. >From the posted stack trace it is not possible to see what exactly is going wrong. Could you maybe post the complete logs? I am also pulling in Yang Wang wh

Re: Logs of JobExecutionListener

2020-11-23 Thread Aljoscha Krettek
On 20.11.20 22:09, Flavio Pompermaier wrote: To achieve this, I was using the RestClusterClient because with that I can use the following code and retrieve the JobID: (1) JobID flinkJobId = client.submitJob(jobGraph).thenApply(DetachedJobExecutionResult::new).get().getJobID(); All you wan

Re: Concise example of how to deploy flink on Kubernetes

2020-11-23 Thread Till Rohrmann
Hi George, Here is some documentation about how to deploy a stateful function job [1]. In a nutshell, you need to deploy a Flink cluster on which you can run the stateful function job. This can either happen before (e.g. by spawning a session cluster on K8s [2]) or you can combine your job into a

Re: Is possible that make two operators always locate in same taskmanager?

2020-11-23 Thread Till Rohrmann
In all cases (session and per-job mode cluster) except for the JM recovery of the application mode [1], the main() function only runs once in order to generate the JobGraph which is sent to the cluster and which is also used for recoveries. [1] https://ci.apache.org/projects/flink/flink-docs-stabl

Re: Concise example of how to deploy flink on Kubernetes

2020-11-23 Thread George Costea
Sorry. Forgot to reply to all. On Sun, Nov 22, 2020 at 9:24 PM George Costea wrote: > > Hi Xingbo, > > I’m interested in using stateful functions to build an application on > Kubernetes. Don’t I need to deploy the flink cluster on Kubernetes first > before deploying my stateful functions? > >

Re: Dynamic ad hoc query deployment strategy

2020-11-23 Thread Till Rohrmann
Hi Lalala, I think this approach can work as long as the generated query plan contains the same sub plan for the previous queries as before. Otherwise Flink won't be able to match the state to the operators of the plan. I think Timo and Dawid should know definitely whether this is possible or not.

Re: Is possible that make two operators always locate in same taskmanager?

2020-11-23 Thread Si-li Liu
Thanks for your reply. The source will poll the state of T operator periodicly. The it find the offset is 0 then it can fallback to latest committed offset. Till Rohrmann 于2020年11月23日周一 下午9:35写道: > Hi Si-li Liu, > > if you want to run T with a parallelism of 1, then your parallelism of A > shou

Re: Is possible that make two operators always locate in same taskmanager?

2020-11-23 Thread Si-li Liu
Thanks for your reply! Seems state processor api can solve my problem, the state written by T operator's checkpoint can be read by main function when job restart. My question is, when streaming job restarts due to some reason, does the main function will also rerun again? Arvid Heise 于2020年11月23

Re: Is possible that make two operators always locate in same taskmanager?

2020-11-23 Thread Till Rohrmann
Hi Si-li Liu, if you want to run T with a parallelism of 1, then your parallelism of A should be limited by the total number of slots on your TM. Otherwise you would have some A_i which are not running on a machine with T. For the approach with the colocation constraint, you can take a look at Tr

Re: Logs of JobExecutionListener

2020-11-23 Thread Flavio Pompermaier
My final analysis is that the RestClusterClient lack of many methods (jarUpload, jarRun, getExceptions for example) and that the submitJob (and the JobSubmitHandler endpoint) is bugged or should be deprecated (because it does not call the job listeners). Indeed, if the JarRunHandler endpoint is inv

Re: Is possible that make two operators always locate in same taskmanager?

2020-11-23 Thread Arvid Heise
If you would prefer to have T with parallelism 1, one complete alternative solution would be to leave the timestamp in the state of T and extract the timestamp from the savepoint/checkpoint upon start of the application using the state processor API [1]. Unfortunately, it may be a bit hacky when yo

Re: CREATE TABLE LIKE clause from different catalog or database

2020-11-23 Thread Benchao Li
Hi Dongwon, You are hitting a known bug[1] which is fixed in 1.11.3 and 1.12.0 Another tip, currently, LIKE clause cannot work with Hive table. (General table stored in hive metastore should work) [1] https://issues.apache.org/jira/browse/FLINK-19281 Dongwon Kim 于2020年11月17日周二 上午12:04写道: > Hi

Re: Dynamic ad hoc query deployment strategy

2020-11-23 Thread lalala
Hi Kostas, Yes, that would satisfy my use case as the platform is always future-oriented. Any arbitrary query is executed on the latest data. >From your comment, I understand that even the session mode does not optimize our readers. I wish Flink could support arbitrary job submission and graph ge

Re: Dynamic ad hoc query deployment strategy

2020-11-23 Thread Kostas Kloudas
Hi Lalala, Even in session mode, the jobgraph is created before the job is executed. So all the above hold. Although I am not super familiar with the catalogs, what you want is that two or more jobs share the same readers of a source. This is not done automatically in DataStream or DataSet and I a