Re: Network Buffers

2023-06-06 Thread weijie guo
Hi Pritam,

The legacy config option `taskmanager.network.numberOfBuffers` is
deprecated and will be ignored, so please do not refer to it.
The exact size of network memory calculated through
`taskmanager.memory.network.fraction`, while also ensuring that it is
within the constraints of min and max value. So it can be explicitly
specified by setting the min & max to the same value if you want. Another
thing to note is that the fraction is relative to flink memory, not process
memory(8gb in your example), and there is a difference of
jvmMetaspaceAndOverload memory part between them.


Best regards,

Weijie


Hangxiang Yu  于2023年6月7日周三 10:44写道:

> Hi, Pritam.
> IIUC, the number is TM scope and just calculated by "available network
> memory / buffer size".
> For example, if the fraction is 0.1, the number may be about ( 8 * 0.1 *
> 1024 * 1024 * 1024 / 32768).
>
>
> On Tue, Jun 6, 2023 at 3:14 PM Pritam Agarwala <
> pritamagarwala...@gmail.com> wrote:
>
>> Thanks for answering Hangxiang!
>>
>> Still confused. How did Flink get this number 22773 ? Couldn't find the
>> default value of "taskmanager.network.numberOfBuffers" config.
>> and According to the formula it should be around 1000. ( #slots-per-TM^2
>> * #TMs * 4 = 4^2 * 16 * 4 = 1024)
>>
>> I have a total of 6000 tasks with 16 TM , 4 cores each with
>> jobmanger/taskmanger.momry.process.size = 8 gb .
>>
>>
>> Thanks & Regards,
>> Pritam
>>
>>
>>
>> On Tue, Jun 6, 2023 at 9:02 AM Hangxiang Yu  wrote:
>>
>>> Hi, Pritam.
>>> This error message indicates that the current configuration of the
>>> network buffer is not enough to handle the current workload.
>>>
 What is the meaning of this exception (The total number of network
 buffers is currently set to 22773 of 32768 bytes each)?

>>> This just provides some information about the current status of network
>>> buffers (22773 * 32768 bytes ~= 711MB).
>>>
>>> How to figure out a good combination of
 ('taskmanager.memory.network.fraction', 'taskmanager.memory.network.min',
 and 'taskmanager.memory.network.max'.)
 for this issue ?

>>> IIUC,  There is no absolute standard for setting these parameters.
>>> These parameters may be affected by many factors, such as the data flow
>>> rate, computational complexity, and memory usage of your job.
>>>
>>> Some steps to setup and adjust these parameters:
>>> 1. Check the available memory on your job.
>>> 2. Evaluate the network usage and consider how much memory could be
>>> used for network buffers.
>>> 3. Monitor the system and collect metrics such as network throughput,
>>> memory usage.
>>> 4. Adjust these parameters if the job has a high network usage or
>>> memory-intensive.
>>>
>>> Just a personal and immature suggestion about how to adjust when it's
>>> not enough:  1. Increase taskmanager.memory.network.fraction from 0.1
>>> to 0.2, or just increase taskmanager.memory.network.max slightly.
>>> 2. If the buffer size is too large, it may affect checkpoints. So it's
>>> recommended to combine with buffer debloating.
>>> 
>>>
>>>
>>> On Tue, Jun 6, 2023 at 2:44 AM Pritam Agarwala <
>>> pritamagarwala...@gmail.com> wrote:
>>>
 Hi All,


 java.io.IOException: Insufficient number of network buffers: required
 2, but only 0 available. The total number of network buffers is currently
 set to 22773 of 32768 bytes each.

 What is the meaning of this exception (The total number of network
 buffers is currently set to 22773 of 32768 bytes each)?

 How to figure out a good combination of
 ('taskmanager.memory.network.fraction', 'taskmanager.memory.network.min',
 and 'taskmanager.memory.network.max'.)
 for this issue ?


 Thanks & Regards,
 Pritam

>>>
>>>
>>> --
>>> Best,
>>> Hangxiang.
>>>
>>
>
> --
> Best,
> Hangxiang.
>


Setting up a Multi-node Flink Cluster

2023-06-06 Thread Z M Ang
Hello
Is there a reference working implementation of a Multi-VM Flink Cluster
(NOT on Docker)?
e.g., a 1 Master VM + 3 Worker VMs.
Not looking for documentation - but a working example with conf files
modified.

Thanks
Z Mang


Re: changing serializer affects resuming from checkpoint

2023-06-06 Thread Hangxiang Yu
HI, Peng.
Do these two jobs have any dependency?
Or Could you please share the specific logic of the two jobs if convenient
? Could you also share the failure message of the producer job ?
In my opinion, if the two tasks have no other association, as you said, the
consumer job will fail due to unsupported scheme evolution, but it should
not affect the producer job.


On Tue, Jun 6, 2023 at 2:58 PM Peng Peng  wrote:

> Hi,
>
> I have 2 flink jobs, of which one produces records to kafka using kryo
> serializer and the other consumes records from kafka and deserializes with
> kryo. It has been working fine. Then I stopped both jobs with checkpoints
> and changed the consumer job to disable generic type and kryo to use avro
> serialization. However, when I resumed the 2 jobs from the checkpoint, both
> failed. It made sense the consumer job would fail, but why is the producer
> job also affected?
>
> Thanks,
> Peng
>


-- 
Best,
Hangxiang.


Re: Network Buffers

2023-06-06 Thread Hangxiang Yu
Hi, Pritam.
IIUC, the number is TM scope and just calculated by "available network
memory / buffer size".
For example, if the fraction is 0.1, the number may be about ( 8 * 0.1 *
1024 * 1024 * 1024 / 32768).


On Tue, Jun 6, 2023 at 3:14 PM Pritam Agarwala 
wrote:

> Thanks for answering Hangxiang!
>
> Still confused. How did Flink get this number 22773 ? Couldn't find the
> default value of "taskmanager.network.numberOfBuffers" config.
> and According to the formula it should be around 1000. ( #slots-per-TM^2
> * #TMs * 4 = 4^2 * 16 * 4 = 1024)
>
> I have a total of 6000 tasks with 16 TM , 4 cores each with
> jobmanger/taskmanger.momry.process.size = 8 gb .
>
>
> Thanks & Regards,
> Pritam
>
>
>
> On Tue, Jun 6, 2023 at 9:02 AM Hangxiang Yu  wrote:
>
>> Hi, Pritam.
>> This error message indicates that the current configuration of the
>> network buffer is not enough to handle the current workload.
>>
>>> What is the meaning of this exception (The total number of network
>>> buffers is currently set to 22773 of 32768 bytes each)?
>>>
>> This just provides some information about the current status of network
>> buffers (22773 * 32768 bytes ~= 711MB).
>>
>> How to figure out a good combination of
>>> ('taskmanager.memory.network.fraction', 'taskmanager.memory.network.min',
>>> and 'taskmanager.memory.network.max'.)
>>> for this issue ?
>>>
>> IIUC,  There is no absolute standard for setting these parameters.
>> These parameters may be affected by many factors, such as the data flow
>> rate, computational complexity, and memory usage of your job.
>>
>> Some steps to setup and adjust these parameters:
>> 1. Check the available memory on your job.
>> 2. Evaluate the network usage and consider how much memory could be used
>> for network buffers.
>> 3. Monitor the system and collect metrics such as network throughput,
>> memory usage.
>> 4. Adjust these parameters if the job has a high network usage or
>> memory-intensive.
>>
>> Just a personal and immature suggestion about how to adjust when it's not
>> enough:  1. Increase taskmanager.memory.network.fraction from 0.1 to
>> 0.2, or just increase taskmanager.memory.network.max slightly.
>> 2. If the buffer size is too large, it may affect checkpoints. So it's
>> recommended to combine with buffer debloating.
>> 
>>
>>
>> On Tue, Jun 6, 2023 at 2:44 AM Pritam Agarwala <
>> pritamagarwala...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>>
>>> java.io.IOException: Insufficient number of network buffers: required 2,
>>> but only 0 available. The total number of network buffers is currently set
>>> to 22773 of 32768 bytes each.
>>>
>>> What is the meaning of this exception (The total number of network
>>> buffers is currently set to 22773 of 32768 bytes each)?
>>>
>>> How to figure out a good combination of
>>> ('taskmanager.memory.network.fraction', 'taskmanager.memory.network.min',
>>> and 'taskmanager.memory.network.max'.)
>>> for this issue ?
>>>
>>>
>>> Thanks & Regards,
>>> Pritam
>>>
>>
>>
>> --
>> Best,
>> Hangxiang.
>>
>

-- 
Best,
Hangxiang.


Re: flink14 batch mode can read.iceberg table but stream mode can not

2023-06-06 Thread liu ron
Hi,

As Martijn says, you can find help in Iceberg community.

Best,
Ron

Martijn Visser  于2023年6月5日周一 22:45写道:

> Hi,
>
> This question is better suited for the Iceberg community, since they've
> built the Flink-Iceberg integration.
>
> Best regards,
>
> Martijn
>
> On Wed, May 31, 2023 at 9:48 AM 湘晗刚 <1016465...@qq.com> wrote:
>
>> flink14 batch mode can read iceberg table but stream mode can not ,why?
>> Thanks in advance
>> Kobe24
>>
>


Re: flink14 sql sink kafka error

2023-06-06 Thread liu ron
Hi,

You can give more exception info which can help to find the root cause.
But according to the context you provided, as Hang says, it may be the
Kafka Server problem.

Best,
Ron

Hang Ruan  于2023年6月6日周二 10:46写道:

> Hi, 湘晗刚,
>
> This error seem to be an error from the Kafka server. Maybe you should
> check whether the Kafka server occurs some error.
> Or you could provide more messages about the request. These information is
> too short to analyze,
>
> Best,
> Hang
>
> 湘晗刚 <1016465...@qq.com> 于2023年6月5日周一 15:08写道:
>
>> UnknownServerException :The server experienced an unexpected error when
>> processing the reqiest.
>> Thanks
>> Kobe24
>>
>


Re: In HA mode, support the same application run multiple jobs?

2023-06-06 Thread liu ron
Hi,

Your mean Application Mode or other modes?

Best,
Ron

Weihua Hu  于2023年5月23日周二 21:29写道:

> Hi
>
> High-Availability in Application Mode is only supported for
> single-execute() applications.[1]
>
> And the reason is[2]:
>
> The added complexity stems mainly from the fact that the different jobs
>> within an application may at any point be in different stages of their
>> execution,
>
> e.g. some may be running, some finished and some may be waiting for others
>> to finish in order to consume their output.
>
>
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/overview/#application-mode
> [2]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-85%3A+Flink+Application+Mode
>
> Best,
> Weihua
>
>
> On Tue, May 23, 2023 at 3:18 PM melin li  wrote:
>
>> In HA mode,  support the same application run multiple jobs?
>>
>>


Re: Seeking advice on Looking up data - FileSystem connector / Kafka topic

2023-06-06 Thread liu ron
Hi, Neha
> 1. Is there a plan to extend filesystem connector to support lookup on
the file in the way I mentioned above? Or if it already does, I would
really appreciate if you could point me to some example or documentation.
According to my understanding, there are no plans for community to support
it at this time. I think you try to extend.

Best,
Ron

Neha Rawat  于2023年5月26日周五 09:29写道:

> Hello,
>
>
>
> I have a usecase where I use flink to execute sql query on data that’s
> flowing in from a kafka topic. The output of this query is written to
> another kafka topic.  The query that I want to execute is  supposed to do a
> lookup on a small csv file (from which entries can be deleted/updated at
> any point in time). I initially used the filesystem connector which worked
> fine until the csv file was updated. On reading more about it I found out
> that flink processes each file just once on job startup and so any updates
> after that aren’t taken care of.
>
>
>
>1. Is there a plan to extend filesystem connector to support lookup on
>the file in the way I mentioned above? Or if it already does, I would
>really appreciate if you could point me to some example or documentation.
>2. Can you suggest an alternative to this? I did try using a compacted
>kafka topic (that always keeps the latest records for a key) where I would
>pass on the records after reading from csv, but even in that case I found
>that if I update a record in compacted kafka topic such that it does not
>match the query, I still get the matched output. Is it perhaps matching
>against stale/cached data?
>
>
>
> Example of the tables I created and the query-
>
>  Input table ->
>
> "create table input(sessionid string, usern string, ipsrc string, filehash
> string, time_sql_timestamp TIMESTAMP(3), WATERMARK FOR time_sql_timestamp
> AS time_sql_timestamp - INTERVAL '5' SECOND) WITH ( 'connector'\
>
> \ = 'kafka','topic' = 'input','properties.bootstrap.servers' =
> 'kafka:9092','json.ignore-parse-errors'\
>
> \ = 'true','json.fail-on-missing-field'= 'false','properties.group.id'
> = 'input','scan.startup.mode'\
>
> \ = 'latest-offset','format' = 'json');"
>
>
>
> Context Table for lookup ->
>
>"create table context(id string, username string, blacklisted string,
> update_time TIMESTAMP(3) METADATA FROM 'timestamp' VIRTUAL
>
>) WITH ( 'connector'\
>
> \ = 'kafka','topic' = 'context', 'properties.bootstrap.servers' =
> 'kafka:9092', 'json.ignore-parse-errors'\
>
> \ = 'true','json.fail-on-missing-field'= 'false','properties.group.id'
> = 'context','scan.startup.mode'\
>
> \ = 'earliest-offset', 'format' = 'json');"
>
>
>
> Output Table ->
>
>   "create table output(sessionid string, usern string, ipsrc string,
> filehash string, time_sql_timestamp TIMESTAMP(3)) WITH ( 'connector'\
>
> \ = 'kafka','topic' = 'output','properties.bootstrap.servers' =
> 'kafka:9092', 'properties.bootstrap.servers' =
> 'kafka:9092','json.ignore-parse-errors'\
>
>\ = 'true','json.fail-on-missing-field'= 'false','properties.group.id'
> = 'input','scan.startup.mode'\
>
> \ = 'latest-offset','format' = 'json');"
>
>
>
>
>
> Sql à
>
> "INSERT INTO output WITH input_cte AS (SELECT * FROM TABLE (TUMBLE(TABLE
> input, DESCRIPTOR(time_sql_timestamp), INTERVAL '10' SECOND)), context
> where (usern = context.username) ORDER BY time_sql_timestamp) (SELECT
> i.sessionid, i.usern, i.ipsrc, i.filehash, i.time_sql_timestamp  FROM
> input_cte as i  join context on  i.usern = context.username)"
>
>
>
>
>
> Sample input=
>
> {"sessionid":"101", "usern":"abc","ipsrc":"2.1.1.1","filehash":"hash1",
> "time_sql_timestamp":"2023-04-11 23:08:23.000"}
>
> {"sessionid":"102", "usern":"def","ipsrc":"3.1.1.1","filehash":"hash1",
> "time_sql_timestamp":"2023-04-11 23:08:24.000"}
>
> {"sessionid":"100", "usern":"neha","ipsrc":"1.1.1.1","filehash":"hash1",
> "time_sql_timestamp":"2023-04-11 23:08:25.000"}
>
>
>
> Sample context =
>
> {"id" : "1", "username" : "neha1", "blacklisted" : "false"}
>
> {"id" : "2", "username" : "neha2", "blacklisted" : "false"}
>
> {"id" : "3", "username" : "neha3", "blacklisted" : "false"}
>
> {"id" : "4", "username" : "neha", "blacklisted" : "false"}
>
>
>
>
>
> First I get record in output as we do have username=neha in context.
>
> {"sessionid":"100", "usern":"neha","ipsrc":"1.1.1.1","filehash":"hash1",
> "time_sql_timestamp":"2023-04-11 23:08:25.000"}
>
>
>
> But when I update {"id" : "4", "username" : "neha", "blacklisted" :
> "false"} to {"id" : "4", "username" : "some-other-value", "blacklisted" :
> "false"}, even then I get same result in output –
>
> {"sessionid":"100", "usern":"neha","ipsrc":"1.1.1.1","filehash":"hash1",
> "time_sql_timestamp":"2023-04-11 23:08:25.000"}
>
>
>
>
>
> Its only after restarting the job that I do not get the output due to the
> updated record in context.
>
>
>
>
>
>
>
> Thanks,
>
> Neha
>
>
>
>
>
>
> Caution: External email. Do not click or open 

Re: Why I can't run more than 19 tasks?

2023-06-06 Thread liu ron
Hi, Hemi,

According to your context, it seems that the problem comes from MySQL
client side, maybe you can check the client configuration.

Best,
Ron

Shammon FY  于2023年5月25日周四 08:40写道:

> Hi Hemi,
>
> There may be two reasons that I can think of
> 1. The number of connections exceeds the MySQL limit, you can check the
> options in my.cnf for your mysql server and increase the max connections.
> 2. Connection timeout for mysql client, you can try to add
> 'autoReconnect=true' to the connection url
>
> Best,
> Shammon FY
>
>
> On Thu, May 25, 2023 at 8:32 AM Hemi Grs  wrote:
>
>> hey everybody,
>>
>> I have a problem with my apache flink, I am synchronizing from MySQL to
>> Elasticsearch but it seems that I can't run more than 19 tasks. it gave me
>> this error:
>>
>> --
>> Caused by: org.apache.flink.util.FlinkRuntimeException:
>> org.apache.flink.util.FlinkRuntimeException:
>> java.sql.SQLTransientConnectionException: connection-pool-10.10.10.111:3306
>> - Connection is not available, request timed out after 3ms. at
>> com.ververica.cdc.connectors.mysql.debezium.DebeziumUtils.openJdbcConnection(DebeziumUtils.java:64)
>> at
>> com.ververica.cdc.connectors.mysql.source.assigners.MySqlSnapshotSplitAssigner.discoveryCaptureTables(MySqlSnapshotSplitAssigner.java:171)
>> ... 12 more
>> Caused by: org.apache.flink.util.FlinkRuntimeException:
>> java.sql.SQLTransientConnectionException: connection-pool-10.10.10.111:3306
>> - Connection is not available, request timed out after 3ms. at
>> com.ververica.cdc.connectors.mysql.source.connection.JdbcConnectionFactory.connect(JdbcConnectionFactory.java:72)
>> at io.debezium.jdbc.JdbcConnection.connection(JdbcConnection.java:890) at
>> io.debezium.jdbc.JdbcConnection.connection(JdbcConnection.java:885)
>> at io.debezium.jdbc.JdbcConnection.connect(JdbcConnection.java:418) at
>> com.ververica.cdc.connectors.mysql.debezium.DebeziumUtils.openJdbcConnection(DebeziumUtils.java:61)
>> ... 13 moreCaused by: java.sql.SQLTransientConnectionException:
>> connection-pool-10.10.10.111:3306 - Connection is not available, request
>> timed out after 3ms.
>> at
>> com.ververica.cdc.connectors.shaded.com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:696)
>> at
>> com.ververica.cdc.connectors.shaded.com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:197)
>> at
>> com.ververica.cdc.connectors.shaded.com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:162)
>> at
>> com.ververica.cdc.connectors.shaded.com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:100)
>> at
>> com.ververica.cdc.connectors.mysql.source.connection.JdbcConnectionFactory.connect(JdbcConnectionFactory.java:59)
>> ... 17 more
>> -
>>
>> I have try adding this 2 lines on flink-conf.yaml but doesn't do anything:
>> -
>> env.java.opts:
>> "-Dcom.ververica.cdc.connectors.mysql.hikari.maximumPoolSize=100"
>> flink.connector.mysql-cdc.max-pool-size: 100
>> -
>>
>> does anybody know the solution?
>> Additional info, my database is doing fine, because I try creating
>> another apache flink server and it can run another 19 tasks, so total there
>> 38 tasks running and it's doing fine. So how do I run many tasks on 1
>> server and the server still have lots of resources.
>>
>> Thanks
>>
>


Re: Custom Counter on Flink File Source

2023-06-06 Thread Hang Ruan
Hi, Kirti Dhar Upadhyay K.

I check the FLIP-274[1]. This issue will be released in the 1.18.0. It is
not contained in any release now.

Best,
Hang

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-274%3A+Introduce+metric+group+for+OperatorCoordinator

Kirti Dhar Upadhyay K  于2023年6月7日周三
02:51写道:

> Hi Hang,
>
>
>
> Thanks for reply.
>
> I tried using SplitEnumeratorContext passed in
> AbstractFileSource#createEnumerator but resulted as NullPointerException.
>
> As SplitEnumeratorContext provides its implementation as
> SourceCoordinatorContext having metricGroup() as below-
>
>
>
>
>
> @Override
>
> *public* SplitEnumeratorMetricGroup metricGroup() {
>
> *return* *null*;
>
> }
>
>
>
> Am I doing any mistake?
>
>
>
> Regards,
>
> Kirti Dhar
>
>
>
> *From:* Hang Ruan 
> *Sent:* 06 June 2023 08:12
> *To:* Kirti Dhar Upadhyay K 
> *Cc:* user@flink.apache.org
> *Subject:* Re: Custom Counter on Flink File Source
>
>
>
> Hi, Kirti Dhar Upadhyay K.
>
>
>
> We could get the metric group from the context, like `SourceReaderContext`
> and `SplitEnumeratorContext`. These contexts could be found when creating
> readers and enumerators. See `AbstractFileSource#createReader` and
> `AbstractFileSource#createEnumerator`.
>
>
>
> Best,
>
> Hang
>
>
>
> Kirti Dhar Upadhyay K via user  于2023年6月5日周一 22:57
> 写道:
>
> Hi Community,
>
>
>
> I am trying to add a new counter for number of files collected on Flink
> File Source.
>
> Referring the doc
> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/metrics/ I
> understand how to add a new counter on any operator.
>
>
>
> *this.*counter *=* *getRuntimeContext().*getMetricGroup*().*counter*(*
> "myCounter"*);*
>
>
>
> But not able to get this RuntimeContext on FileSource.
>
> Can someone give some clue on this?
>
>
>
> Regards,
>
> Kirti Dhar
>
>


RE: Custom Counter on Flink File Source

2023-06-06 Thread Kirti Dhar Upadhyay K via user
Hi Hang,

Thanks for reply.
I tried using SplitEnumeratorContext passed in 
AbstractFileSource#createEnumerator but resulted as NullPointerException.
As SplitEnumeratorContext provides its implementation as 
SourceCoordinatorContext having metricGroup() as below-


@Override
public SplitEnumeratorMetricGroup metricGroup() {
return null;
}

Am I doing any mistake?

Regards,
Kirti Dhar

From: Hang Ruan 
Sent: 06 June 2023 08:12
To: Kirti Dhar Upadhyay K 
Cc: user@flink.apache.org
Subject: Re: Custom Counter on Flink File Source

Hi, Kirti Dhar Upadhyay K.

We could get the metric group from the context, like `SourceReaderContext` and 
`SplitEnumeratorContext`. These contexts could be found when creating readers 
and enumerators. See `AbstractFileSource#createReader` and 
`AbstractFileSource#createEnumerator`.

Best,
Hang

Kirti Dhar Upadhyay K via user 
mailto:user@flink.apache.org>> 于2023年6月5日周一 22:57写道:
Hi Community,

I am trying to add a new counter for number of files collected on Flink File 
Source.
Referring the doc  
https://nightlies.apache.org/flink/flink-docs-master/docs/ops/metrics/ I 
understand how to add a new counter on any operator.

this.counter = getRuntimeContext().getMetricGroup().counter("myCounter");

But not able to get this RuntimeContext on FileSource.
Can someone give some clue on this?

Regards,
Kirti Dhar


Re: [DISCUSS] Status of Statefun Project

2023-06-06 Thread Marco Villalobos
Why can't the Apache Software Foundation allow community members to bring it up 
to date?

What's the process for that?

I believe that there are people and companies on this mailing list interested 
in supporting Apache Flink Stateful Functions.

You already had two people on this thread express interest.

At the very least, we could keep the library versions up to date.

There are only a small list of new features that might be worthwhile:

1. event time processing
2. state rest api


> On Jun 6, 2023, at 3:06 AM, Chesnay Schepler  wrote:
> 
> If you were to fork it and want to redistribute it then the short version is 
> that 
> you have to adhere to the Apache licensing requirements
> you have to make it clear that your fork does not belong to the Apache Flink 
> project. (Trademarks and all that)
> Neither should be significant hurdles (there should also be plenty of online 
> resources regarding 1), and if you do this then you can freely share your 
> fork with others.
> 
> I've also pinged Martijn to take a look at this thread.
> To my knowledge the project hasn't decided anything yet.
> 
> On 27/05/2023 04:05, Galen Warren wrote:
>> Ok, I get it. No interest.
>> 
>> If this project is being abandoned, I guess I'll work with my own fork. Is
>> there anything I should consider here? Can I share it with other people who
>> use this project?
>> 
>> On Tue, May 16, 2023 at 10:50 AM Galen Warren  
>> 
>> wrote:
>> 
>>> Hi Martijn, since you opened this discussion thread, I'm curious what your
>>> thoughts are in light of the responses? Thanks.
>>> 
>>> On Wed, Apr 19, 2023 at 1:21 PM Galen Warren  
>>> 
>>> wrote:
>>> 
 I use Apache Flink for stream processing, and StateFun as a hand-off
> point for the rest of the application.
> It serves well as a bridge between a Flink Streaming job and
> micro-services.
 
 This is essentially how I use it as well, and I would also be sad to see
 it sunsetted. It works well; I don't know that there is a lot of new
 development required, but if there are no new Statefun releases, then
 Statefun can only be used with older Flink versions.
 
 On Tue, Apr 18, 2023 at 10:04 PM Marco Villalobos <
 mvillalo...@kineteque.com > wrote:
 
> I am currently using Stateful Functions in my application.
> 
> I use Apache Flink for stream processing, and StateFun as a hand-off
> point for the rest of the application.
> It serves well as a bridge between a Flink Streaming job and
> micro-services.
> 
> I would be disappointed if StateFun was sunsetted.  Its a good idea.
> 
> If there is anything I can do to help, as a contributor perhaps, please
> let me know.
> 
>> On Apr 3, 2023, at 2:02 AM, Martijn Visser  
>> 
> wrote:
>> Hi everyone,
>> 
>> I want to open a discussion on the status of the Statefun Project [1]
> in Apache Flink. As you might have noticed, there hasn't been much
> development over the past months in the Statefun repository [2]. There is
> currently a lack of active contributors and committers who are able to 
> help
> with the maintenance of the project.
>> In order to improve the situation, we need to solve the lack of
> committers and the lack of contributors.
>> On the lack of committers:
>> 
>> 1. Ideally, there are some of the current Flink committers who have
> the bandwidth and can help with reviewing PRs and merging them.
>> 2. If that's not an option, it could be a consideration that current
> committers only approve and review PRs, that are approved by those who are
> willing to contribute to Statefun and if the CI passes
>> On the lack of contributors:
>> 
>> 3. Next to having this discussion on the Dev and User mailing list, we
> can also create a blog with a call for new contributors on the Flink
> project website, send out some tweets on the Flink / Statefun twitter
> accounts, post messages on Slack etc. In that message, we would inform how
> those that are interested in contributing can start and where they could
> reach out for more information.
>> There's also option 4. where a group of interested people would split
> Statefun from the Flink project and make it a separate top level project
> under the Apache Flink umbrella (similar as recently has happened with
> Flink Table Store, which has become Apache Paimon).
>> If we see no improvements in the coming period, we should consider
> sunsetting Statefun and communicate that clearly to the users.
>> I'm looking forward to your thoughts.
>> 
>> Best regards,
>> 
>> Martijn
>> 
>> [1] https://nightlies.apache.org/flink/flink-statefun-docs-master/ 
>> 

Re: [DISCUSS] Status of Statefun Project

2023-06-06 Thread Galen Warren via user
Thanks for the information.

On Tue, Jun 6, 2023, 6:07 AM Chesnay Schepler  wrote:

> If you were to fork it /and want to redistribute it/ then the short
> version is that
>
>  1. you have to adhere to the Apache licensing requirements
>  2. you have to make it clear that your fork does not belong to the
> Apache Flink project. (Trademarks and all that)
>
> Neither should be significant hurdles (there should also be plenty of
> online resources regarding 1), and if you do this then you can freely
> share your fork with others.
>
> I've also pinged Martijn to take a look at this thread.
> To my knowledge the project hasn't decided anything yet.
>
> On 27/05/2023 04:05, Galen Warren wrote:
> > Ok, I get it. No interest.
> >
> > If this project is being abandoned, I guess I'll work with my own fork.
> Is
> > there anything I should consider here? Can I share it with other people
> who
> > use this project?
> >
> > On Tue, May 16, 2023 at 10:50 AM Galen Warren
> > wrote:
> >
> >> Hi Martijn, since you opened this discussion thread, I'm curious what
> your
> >> thoughts are in light of the responses? Thanks.
> >>
> >> On Wed, Apr 19, 2023 at 1:21 PM Galen Warren
> >> wrote:
> >>
> >>> I use Apache Flink for stream processing, and StateFun as a hand-off
>  point for the rest of the application.
>  It serves well as a bridge between a Flink Streaming job and
>  micro-services.
> >>>
> >>> This is essentially how I use it as well, and I would also be sad to
> see
> >>> it sunsetted. It works well; I don't know that there is a lot of new
> >>> development required, but if there are no new Statefun releases, then
> >>> Statefun can only be used with older Flink versions.
> >>>
> >>> On Tue, Apr 18, 2023 at 10:04 PM Marco Villalobos <
> >>> mvillalo...@kineteque.com> wrote:
> >>>
>  I am currently using Stateful Functions in my application.
> 
>  I use Apache Flink for stream processing, and StateFun as a hand-off
>  point for the rest of the application.
>  It serves well as a bridge between a Flink Streaming job and
>  micro-services.
> 
>  I would be disappointed if StateFun was sunsetted.  Its a good idea.
> 
>  If there is anything I can do to help, as a contributor perhaps,
> please
>  let me know.
> 
> > On Apr 3, 2023, at 2:02 AM, Martijn Visser
>  wrote:
> > Hi everyone,
> >
> > I want to open a discussion on the status of the Statefun Project [1]
>  in Apache Flink. As you might have noticed, there hasn't been much
>  development over the past months in the Statefun repository [2].
> There is
>  currently a lack of active contributors and committers who are able
> to help
>  with the maintenance of the project.
> > In order to improve the situation, we need to solve the lack of
>  committers and the lack of contributors.
> > On the lack of committers:
> >
> > 1. Ideally, there are some of the current Flink committers who have
>  the bandwidth and can help with reviewing PRs and merging them.
> > 2. If that's not an option, it could be a consideration that current
>  committers only approve and review PRs, that are approved by those
> who are
>  willing to contribute to Statefun and if the CI passes
> > On the lack of contributors:
> >
> > 3. Next to having this discussion on the Dev and User mailing list,
> we
>  can also create a blog with a call for new contributors on the Flink
>  project website, send out some tweets on the Flink / Statefun twitter
>  accounts, post messages on Slack etc. In that message, we would
> inform how
>  those that are interested in contributing can start and where they
> could
>  reach out for more information.
> > There's also option 4. where a group of interested people would split
>  Statefun from the Flink project and make it a separate top level
> project
>  under the Apache Flink umbrella (similar as recently has happened with
>  Flink Table Store, which has become Apache Paimon).
> > If we see no improvements in the coming period, we should consider
>  sunsetting Statefun and communicate that clearly to the users.
> > I'm looking forward to your thoughts.
> >
> > Best regards,
> >
> > Martijn
> >
> > [1]https://nightlies.apache.org/flink/flink-statefun-docs-master/  <
>  https://nightlies.apache.org/flink/flink-statefun-docs-master/>
> > [2]https://github.com/apache/flink-statefun  <
>  https://github.com/apache/flink-statefun>
> 
>


Re: [DISCUSS] Status of Statefun Project

2023-06-06 Thread Chesnay Schepler
If you were to fork it /and want to redistribute it/ then the short 
version is that


1. you have to adhere to the Apache licensing requirements
2. you have to make it clear that your fork does not belong to the
   Apache Flink project. (Trademarks and all that)

Neither should be significant hurdles (there should also be plenty of 
online resources regarding 1), and if you do this then you can freely 
share your fork with others.


I've also pinged Martijn to take a look at this thread.
To my knowledge the project hasn't decided anything yet.

On 27/05/2023 04:05, Galen Warren wrote:

Ok, I get it. No interest.

If this project is being abandoned, I guess I'll work with my own fork. Is
there anything I should consider here? Can I share it with other people who
use this project?

On Tue, May 16, 2023 at 10:50 AM Galen Warren
wrote:


Hi Martijn, since you opened this discussion thread, I'm curious what your
thoughts are in light of the responses? Thanks.

On Wed, Apr 19, 2023 at 1:21 PM Galen Warren
wrote:


I use Apache Flink for stream processing, and StateFun as a hand-off

point for the rest of the application.
It serves well as a bridge between a Flink Streaming job and
micro-services.


This is essentially how I use it as well, and I would also be sad to see
it sunsetted. It works well; I don't know that there is a lot of new
development required, but if there are no new Statefun releases, then
Statefun can only be used with older Flink versions.

On Tue, Apr 18, 2023 at 10:04 PM Marco Villalobos <
mvillalo...@kineteque.com> wrote:


I am currently using Stateful Functions in my application.

I use Apache Flink for stream processing, and StateFun as a hand-off
point for the rest of the application.
It serves well as a bridge between a Flink Streaming job and
micro-services.

I would be disappointed if StateFun was sunsetted.  Its a good idea.

If there is anything I can do to help, as a contributor perhaps, please
let me know.


On Apr 3, 2023, at 2:02 AM, Martijn Visser

wrote:

Hi everyone,

I want to open a discussion on the status of the Statefun Project [1]

in Apache Flink. As you might have noticed, there hasn't been much
development over the past months in the Statefun repository [2]. There is
currently a lack of active contributors and committers who are able to help
with the maintenance of the project.

In order to improve the situation, we need to solve the lack of

committers and the lack of contributors.

On the lack of committers:

1. Ideally, there are some of the current Flink committers who have

the bandwidth and can help with reviewing PRs and merging them.

2. If that's not an option, it could be a consideration that current

committers only approve and review PRs, that are approved by those who are
willing to contribute to Statefun and if the CI passes

On the lack of contributors:

3. Next to having this discussion on the Dev and User mailing list, we

can also create a blog with a call for new contributors on the Flink
project website, send out some tweets on the Flink / Statefun twitter
accounts, post messages on Slack etc. In that message, we would inform how
those that are interested in contributing can start and where they could
reach out for more information.

There's also option 4. where a group of interested people would split

Statefun from the Flink project and make it a separate top level project
under the Apache Flink umbrella (similar as recently has happened with
Flink Table Store, which has become Apache Paimon).

If we see no improvements in the coming period, we should consider

sunsetting Statefun and communicate that clearly to the users.

I'm looking forward to your thoughts.

Best regards,

Martijn

[1]https://nightlies.apache.org/flink/flink-statefun-docs-master/  <

https://nightlies.apache.org/flink/flink-statefun-docs-master/>

[2]https://github.com/apache/flink-statefun  <

https://github.com/apache/flink-statefun>



How can i update OnlineLogisticRegression Model continuously

2023-06-06 Thread 刘勋
Hi All,Im trying OnlineLogisticRegressionExample of flink-ml.How can i update 
OnlineLogisticRegression Model continuously?

How can i update OnlineLogisticRegression Model continuously

2023-06-06 Thread 刘勋
Hi All,Im trying OnlineLogisticRegressionExample of flink-ml.How can i update 
OnlineLogisticRegression Model continuously?

Re: Network Buffers

2023-06-06 Thread Pritam Agarwala
Thanks for answering Hangxiang!

Still confused. How did Flink get this number 22773 ? Couldn't find the
default value of "taskmanager.network.numberOfBuffers" config.
and According to the formula it should be around 1000. ( #slots-per-TM^2 *
#TMs * 4 = 4^2 * 16 * 4 = 1024)

I have a total of 6000 tasks with 16 TM , 4 cores each with
jobmanger/taskmanger.momry.process.size = 8 gb .


Thanks & Regards,
Pritam



On Tue, Jun 6, 2023 at 9:02 AM Hangxiang Yu  wrote:

> Hi, Pritam.
> This error message indicates that the current configuration of the
> network buffer is not enough to handle the current workload.
>
>> What is the meaning of this exception (The total number of network
>> buffers is currently set to 22773 of 32768 bytes each)?
>>
> This just provides some information about the current status of network
> buffers (22773 * 32768 bytes ~= 711MB).
>
> How to figure out a good combination of
>> ('taskmanager.memory.network.fraction', 'taskmanager.memory.network.min',
>> and 'taskmanager.memory.network.max'.)
>> for this issue ?
>>
> IIUC,  There is no absolute standard for setting these parameters.
> These parameters may be affected by many factors, such as the data flow
> rate, computational complexity, and memory usage of your job.
>
> Some steps to setup and adjust these parameters:
> 1. Check the available memory on your job.
> 2. Evaluate the network usage and consider how much memory could be used
> for network buffers.
> 3. Monitor the system and collect metrics such as network throughput,
> memory usage.
> 4. Adjust these parameters if the job has a high network usage or
> memory-intensive.
>
> Just a personal and immature suggestion about how to adjust when it's not
> enough:  1. Increase taskmanager.memory.network.fraction from 0.1 to 0.2,
> or just increase taskmanager.memory.network.max slightly.
> 2. If the buffer size is too large, it may affect checkpoints. So it's
> recommended to combine with buffer debloating.
> 
>
>
> On Tue, Jun 6, 2023 at 2:44 AM Pritam Agarwala <
> pritamagarwala...@gmail.com> wrote:
>
>> Hi All,
>>
>>
>> java.io.IOException: Insufficient number of network buffers: required 2,
>> but only 0 available. The total number of network buffers is currently set
>> to 22773 of 32768 bytes each.
>>
>> What is the meaning of this exception (The total number of network
>> buffers is currently set to 22773 of 32768 bytes each)?
>>
>> How to figure out a good combination of
>> ('taskmanager.memory.network.fraction', 'taskmanager.memory.network.min',
>> and 'taskmanager.memory.network.max'.)
>> for this issue ?
>>
>>
>> Thanks & Regards,
>> Pritam
>>
>
>
> --
> Best,
> Hangxiang.
>


changing serializer affects resuming from checkpoint

2023-06-06 Thread Peng Peng
Hi,

I have 2 flink jobs, of which one produces records to kafka using kryo
serializer and the other consumes records from kafka and deserializes with
kryo. It has been working fine. Then I stopped both jobs with checkpoints
and changed the consumer job to disable generic type and kryo to use avro
serialization. However, when I resumed the 2 jobs from the checkpoint, both
failed. It made sense the consumer job would fail, but why is the producer
job also affected?

Thanks,
Peng