Re: runtime memory management

2020-08-30 Thread Xintong Song
Hi, For a complex streaming job, is there any way to tilt the memory towards > stateful operators? If streaming jobs are interested, the quick answer is no. Memory is fetched on demand for all operators. Currently, only managed memory for batch jobs are pre-planned for each operator. Thank you~

Re: Packaging multiple Flink jobs from a single IntelliJ project

2020-08-30 Thread Manas Kale
Hi, I solved my second issue - I was not following Maven's convention for placing source code (I had not placed my source in src/main/java). However, I still would like some help with my first question - what is the recommended way to set a project with multiple main() classes? At the end, I would

Re: Exception on s3 committer

2020-08-30 Thread Yun Gao
Hi Ivan, I think there might be some points to check: 1. Is the job restored from the latest successful checkpoint after restart ? 2. Have you ever changed the timeout settings for uncompleted multipart upload ? 3. Does cbd/late-landing/event_date=2020-08-28/event_hour=16/part-5-26

Re: Implementation of setBufferTimeout(timeoutMillis)

2020-08-30 Thread Yun Gao
Hi Pankaj, I think it should be in org.apache.flink.runtime.io.network.api.writer.RecordWriter$OutputFlusher. Best, Yun -- Sender:Pankaj Chand Date:2020/08/31 02:40:15 Recipient:user Theme:Implementation of setBufferTimeout(

Packaging multiple Flink jobs from a single IntelliJ project

2020-08-30 Thread Manas Kale
Hi, I have an IntelliJ project that has multiple classes with main() functions. I want to package this project as a JAR that I can submit to the Flink cluster and specify the entry class when I start the job. Here are my questions: - I am not really familiar with Maven and would appreciate some

runtime memory management

2020-08-30 Thread lec ssmi
HI: Generally speaking, when we submitting the flink program, the number of taskmanager and the memory of each tn will be specified. And the smallest real execution unit of flink should be operator. Since the calculation logic corresponding to each operator is different, some need to save the

Re: Why consecutive calls of orderBy are forbidden?

2020-08-30 Thread hongfanxo
Hi, Thanks for your reply. I'll look in to the Blink planner later. For the workaround you mentioned, in the actual usage, the second orderBy is wrapped in a function. So I've no idea what has been done for the input Table. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.

Re: flink watermark strategy

2020-08-30 Thread Danny Chan
Watermark mainly serves for windows for the late arrive data, it actually reduces your performance. Best, Danny Chan 在 2020年8月29日 +0800 AM3:09,Vijayendra Yadav ,写道: > Hi Team, > > For regular unbounded streaming application streaming through kafka, which  > does not use any event time or window o

user@flink.apache.org

2020-08-30 Thread Danny Chan
Thanks for the share ~ The query you gave is actually an interval join[1] , a windowed join is two windowed stream join together, see [2]. Theoretically, for interval join, the state would be cleaned periodically based on the watermark and allowed lateness when the range of RHS had been conside

Re: How to use Flink IDE

2020-08-30 Thread Piper Piper
Thank you, Narasimha and Arvid! On Sun, Aug 30, 2020 at 3:09 PM Arvid Heise wrote: > Hi Piper, > > to step into Flink source code, you don't need to import Flink sources > manually or build Flink at all. It's enough to tell IntelliJ to also > download sources for Maven dependencies. [1] > > Flin

Re: How to use Flink IDE

2020-08-30 Thread Arvid Heise
Hi Piper, to step into Flink source code, you don't need to import Flink sources manually or build Flink at all. It's enough to tell IntelliJ to also download sources for Maven dependencies. [1] Flink automatically uploads the source code for each build. For example, see the 1.11.1 artifacts of f

Implementation of setBufferTimeout(timeoutMillis)

2020-08-30 Thread Pankaj Chand
Hello, The documentation gives the following two sample lines for setting the buffer timeout for the streaming environment or transformation. *env.setBufferTimeout(timeoutMillis);env.generateSequence(1,10).map(new MyMapper()).setBufferTimeout(timeoutMillis);* I have been trying to find where (

Re: How to use Flink IDE

2020-08-30 Thread Ardhani Narasimha Swamy
Hi Piper, Welcome to Flink Community. Import flink project like any other project into IDE, only difference while running is you have click on "Include dependencies with "Provided" scope" in the main class run configurations. This bundles the Flink dependencies in the artifact, making it a fat j

How to use Flink IDE

2020-08-30 Thread Piper Piper
Hi, Till now, I have only been using Flink binaries. How do I setup Flink in my IntelliJ IDE so that while running/debugging my Flink application program I can also step into the Flink source code? Do I first need to import Flink's source repository into my IDE and build it? Thanks, Piper

Re: Resource leak in DataSourceNode?

2020-08-30 Thread Robert Metzger
Hi Mark, from the discussion in the JIRA ticket, it seems that we've found somebody in the community who's going to fix this. I don't think calling close() is necessary in the DataSourceNode. The problem is that the connection should not be established in configure() but in open(). Thanks again f

Re: Flink OnCheckpointRollingPolicy streamingfilesink

2020-08-30 Thread Vijayendra Yadav
Thank You Andrey. Regards, Vijay > On Aug 29, 2020, at 3:38 AM, Andrey Zagrebin wrote: > >  > Hi Vijay, > > I would apply the same judgement. It is latency vs throughput vs spent > resources vs practical need. > > The more concurrent checkpoints your system is capable of handling, the > b

Re: Resource leak in DataSourceNode?

2020-08-30 Thread Mark Davis
Hi Robert, Thank you for confirming that there is an issue. I do not have a solution for it and would like to hear the committer insights what is wrong there. I think there are actually two issues - the first one is the HBase InputFormat does not close a connection in close(). Another is DataSo

Re: Flink not outputting windows before all data is seen

2020-08-30 Thread Teodor Spæren
Hey again David! I tried your proposed change of setting the paralilism higher. This worked, but why does this fix the behavior? I don't understand why this would fix it. The only thing that happens to the query plan is that a "remapping" node is added. Thanks for the fix, and for any additi

Re: Flink not outputting windows before all data is seen

2020-08-30 Thread Teodor Spæren
Hey David! I tried what you said, but it did not solve the problem. The job still has to wait until the very end before outputting anything. I mentioned in my original email that I had set the parallelism to 1 job wide, but when I reran the task, I added your line. Are there any circumstance