unctions, so they use yield to emit results.
>
> David
>
> On Tue, Nov 7, 2023 at 1:16 PM Alexander Fedulov <
> alexander.fedu...@gmail.com> wrote:
>
>> Java ProcessFunction API defines a clear way to collect data via the
>> Collector object.
>>
>> PyFli
Java ProcessFunction API defines a clear way to collect data via the
Collector object.
PyFlink documentation also refers to the Collector [1] , but it is not
being passed to the function and is also nowhere to be found in the pyflink
source code.
How can multiple elements be collected? Is "yield"
ingestion rate exceeds the processing rate. You also lose any delivery
guarantees because Flink's fault tolerance model relies on having
replayable sources.
Is using a message broker not feasible in your case?
Best,
Alexander Fedulov
On Tue, 31 Oct 2023 at 13:08, Kamal Mittal
wrote:
&g
operators which you can scale
independently from the source parallelism. Can you describe what you are
trying to achieve?
Best,
Alexander Fedulov
On Tue, 31 Oct 2023 at 07:22, Kamal Mittal via user
wrote:
> Hello Community,
>
>
>
> I need to have a custom parallel dat
this
change?
Best,
Alexander Fedulov
On Mon, 30 Oct 2023 at 18:24, Matthias Pohl
wrote:
> Thanks for your proposal, Zhanghao Chen. I think it adds more transparency
> to the configuration documentation.
>
> +1 from my side on the proposal
>
> On Wed, Oct 11, 2023 at 2:0
of the checkpoints you were advising against?
>
> To be sure, I was referring to moving the previously processed files away,
> not the checkpoints themselves.
>
> On Fri, Oct 27, 2023 at 12:45 PM Alexander Fedulov <
> alexander.fedu...@gmail.com> wrote:
>
>> > I wonde
> I wonder if you could use this fact to query the committed checkpoints
and move them away after the job is done.
This is not a robust solution, I would advise against it.
Best,
Alexander
On Fri, 27 Oct 2023 at 16:41, Andrew Otto wrote:
> For moving the files:
> > It will keep the files as is
* with regards to empty string. The null check is still a bit defensive and
one could return false in test(), but it does not matter really since
String.substring in getName() can never return null.
On Fri, 27 Oct 2023 at 16:32, Alexander Fedulov
wrote:
> Actually, this is not even "d
is it possible to get a null file name for some
> sub directories and hence important to return true so that the File Source
> can monitor inside those sub directories?
>
> On Friday, 27 October, 2023 at 12:58:44 am IST, Alexander Fedulov <
> alexander.fedu...@gmail.com> wrote:
&g
Great work, thanks everyone!
Best,
Alexander
On Thu, 26 Oct 2023 at 21:15, Martijn Visser
wrote:
> Thank you all who have contributed!
>
> Op do 26 okt 2023 om 18:41 schreef Feng Jin
>
> > Thanks for the great work! Congratulations
> >
> >
> > Best,
> > Feng Jin
> >
> > On Fri, Oct 27, 2023 at
* to clarify: by different output I mean that for the same input message
the output message could be slightly smaller due to the abovementioned
factors and fall into the allowed size range without causing any failures
On Thu, 26 Oct 2023 at 21:52, Alexander Fedulov
wrote:
> Your expectati
reproduce the error reliably - this is something that needs to be
further looked into.
Best,
Alexander Fedulov
On Mon, 23 Oct 2023 at 19:11, Gabriele Modena wrote:
> Hey folks,
>
> We currently run (py) flink 1.17 on k8s (managed by flink k8s
> operator), with HA and checkpointing (f
,
Alexander Fedulov
On Thu, 26 Oct 2023 at 07:35, Chirag Dewan via user
wrote:
> Hi,
>
> I was looking at this check in DefaultFileFilter:
>
> public boolean test(Path path) {
> final String fileName = path.getName();
> if (fileName == null || fileName.length() == 0) {
lines
processed in each file). In case of failures, the source will pick up where
it left off. Files removal is trickier - the easiest way to achieve that
would be to have tombstones at the end of files and process them in user
code.
Best,
Alexander Fedulov
On Thu, 26 Oct 2023 at 18:17, arjun s
t;
> final DataStreamSource file = env.fromSource(source,
> WatermarkStrategy.*forMonotonousTimestamps*()
> .withTimestampAssigner(new WatermarkAssigner((Object input)
> -> System.*currentTimeMillis*())),"FileSource");
> file.print();
> }
>
>
>
es in your reader schema might help [1]
[1] https://avro.apache.org/docs/1.8.1/spec.html#Aliases
Best,
Alexander Fedulov
On Thu, 26 Oct 2023 at 16:24, Kirti Dhar Upadhyay K via user <
user@flink.apache.org> wrote:
> Hi Team,
>
>
>
> I am using Flink CSV Decoder with AVSC g
and kafka topic that return
>> different datatypes so I dont know how my answers relate to the original
>> problem tbh. Regards,
>> Oscar
>>
>> On Tue, 4 Jul 2023 at 20:53, Alexander Fedulov <
>> alexander.fedu...@gmail.com> wrote:
>>
>>> @Os
@Oscar
1. How do you plan to use that CSV data? Is it needed for lookup from the
"main" stream?
2. Which API are you using? DataStream/SQL/Table or low level
ProcessFunction?
Best,
Alex
On Tue, 4 Jul 2023 at 11:14, Oscar Perez via user
wrote:
> ok, but is it? As I said, both sources have diffe
;
> Add a sub_filter rule to patch the HTML response.
>
> I use this to add a tag to the header and for the
> Flink-Dashboard I experience no glitches.
>
>
>
> As to point 3. … you don’t need to expose that Ingress to the internet,
> but only to the node IP, so it becom
Hi Prabhu,
make sure that the key you use is the same for both records and try to
reproduce the issue with the level of parallelism of 1.
Best,
Alex
On Sun, 25 Jun 2023 at 04:29, Hangxiang Yu wrote:
> Hi, Prabhu.
>
> This is a correctness issue. IIUC, It should not be related to the size of
>
Hi Mike,
no, it is currently hard-coded
https://github.com/apache/flink/blob/master/flink-runtime-web/web-dashboard/src/app/app.component.html#L23
Your options are:
1. Contribute a change to make it configurable
2. Use some browser plugin that allows renaming page titles
3. Always use different p
https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/internals/filesystems/
On Fri, 23 Jun 2023 at 11:20, 李 琳 wrote:
>
>
> Hi all,
>
>
>
> Recently, I have been testing the Flink Kubernetes Operator. In the
> official example, the checkpoint/savepoint path is configured with a file
>
Hi Lu,
I would say that if your application is stable and checkpoints do not
timeout there is no immediate necessity to do anything. The fact that the
consumer lag stays low means that you are able to keep up with the incoming
data. That said, the fact that you observe "constant backpressure" with
Great to see this, congratulations!
Best,
Alex
On Mon, 27 Mar 2023 at 11:24, Yu Li wrote:
> Dear Flinkers,
>
>
>
> As you may have noticed, we are pleased to announce that Flink Table Store
> has joined the Apache Incubator as a separate project called Apache
> Paimon(incubating) [1] [2] [3].
Hi Leonard,
Sure, here is the new ticket:
https://issues.apache.org/jira/browse/FLINK-29890
Best,
Alexander Fedulov
On Fri, Nov 4, 2022 at 2:12 PM Leonard Xu wrote:
> Thanks Alexander for reporting this issue, Could you open a jira ticket as
> well?
>
> CC: Shengkai, please take
/pull/20001
https://github.com/apache/flink/pull/19845
https://github.com/apache/flink/pull/20211 (fixes a similar issue
introduced after classloading changes in 1.16)
How can UDF JARs be loaded in 1.16?
Best,
Alexander Fedulov
Can't you add a flatMap function just before the Sink that does exactly
this verification and filters out everything that is not supposed to be
sent downstream?
Best,
Alexander Fedulov
On Thu, Sep 8, 2022 at 6:15 PM Salva Alcántara
wrote:
> Sorry I meant do nothing when the serializ
ive k8s, taskmanager is a Pod.The data
> directory size of a single container is limited in our company.Are there
> any idea to deal with this ?
>
> --
> Best,
> Hjw
>
>
>
> ---------- 原始邮件 --
> *发件人:* "Alexander Fe
/flink/flink-docs-release-1.15/docs/connectors/datastream/kafka/#fault-tolerance
https://youtu.be/bhcFfS1-eDY?t=410
https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/ops/state/checkpoints/#unaligned-checkpoints
Best,
Alexander Fedulov
On Mon, Sep 5, 2022 at 2:21 PM Sebastian Struss wr
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/config/#state-backend-rocksdb-localdir
Make sure to use a local SSD disk (not NFS/EBS).
Best,
Alexander Fedulov
On Mon, Sep 5, 2022 at 7:24 PM hjw <1010445...@qq.com> wrote:
> The EmbeddedRocksDBStateBackend
You are welcome, glad it helped!
Best,
Alexander Fedulov
On Mon, Jul 18, 2022 at 8:06 PM Salva Alcántara
wrote:
> For the record, Alexander Fedulov pointed me to an example within the
> kafka connector:
>
>
> https://github.com/apache/flink/blob/025675725336cd572aa2601be525efd
Best,
Alexander Fedulov
On Mon, Jul 18, 2022 at 8:01 PM Salva Alcántara
wrote:
> Yep, that is mostly it. I have (DataStream) connector (sources & sink)
> which works for a fixed type (`JsonNode` for what it's worth) as you say
> and I want to reuse it for Table/SQL, which req
Hi Salva,
what is the goal? Do you have some source that already has a fixed type and
you want to reuse its functionality for producing a different data type?
Best,
Alexander Fedulov
On Mon, Jul 18, 2022 at 1:29 PM Salva Alcántara
wrote:
> If I have a Source (Sink), what would be
ble/formats/csv/
Best,
Alexander Fedulov
On Mon, Jul 11, 2022 at 5:43 PM wrote:
> No, I did not mean.
> I said 'Does Table API connector, CSV, has some option to ignore some
> columns in source file?'
>
>
> *Sent:* Monday, July 11, 2022 at 5:28 PM
> *From:* "Xuyan
Hi Min Tu,
try clean install to make sure the build starts from scratch. Refresh maven
modules in IntelliJ after the build. If that doesn't work, try invalidating
IntelliJ caches and/or reimporting the project (remove .idea folder).
Best,
Alexander Fedulov
On Sun, Jul 10, 2022 at 12:
Hi podunk,
please share exceptions that you find in the log/ folder of your Flink
distribution.
The Taskmanger startup issues should be captured in the *-taskexecutor-*
files.
Best,
Alexander Fedulov
On Mon, Jul 11, 2022 at 5:42 PM Xuyang wrote:
> Hi, can you provide the error log so that
an't verify.
start-cluster.sh in Flink 1.15.x works fine on *nix systems .
Best,
Alexander Fedulov
On Fri, Jul 8, 2022 at 7:17 PM wrote:
>
> Fink will not run natively in windows - that is why I use Github CLI
>
> I made test with Flink version 1.14.4 - Taskmanager is running. Bu
Hi Robin,
you should be able to use the State Processor API to modify the operator
state (sources) and override the offsets manually there. I never tried
that, but I believe conceptually it should work.
Best,
Alexander Fedulov
[flink-dist-1.15.0.jar:1.15.0]
> at
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManager(TaskManagerRunner.java:481)
> ~[flink-dist-1.15.0.jar:1.15.0]
> ... 5 more
>
> I'm trying to find the reason.
>
>
> *Sent:* Friday, July 08, 2022 at 12:21 P
You say 'Check in the UI that after you start your cluster you have
> TaskManagers registered successfully'.
> If I go to 'Task Managers' managers menu (
> http://localhost:8081/#/task-manager) I do not see any - list is empty.
>
> No idea what it should be there or
That said, when a non-existent file is specified, the job fails
immediately, so I would actually expect that behavior if the issue was
indeed with the file path.
Which version of Flink are you running?
Best,
Alexander Fedulov
On Wed, Jul 6, 2022 at 10:39 PM wrote:
> If I'm reading
see messages like this one for operators that did not have any
state in the savepoint:
*INFO o.a.f.r.c.CheckpointCoordinator [] - Skipping empty savepoint state
for operator a0f11f7a2c416beb6b7aed14be0d63ca. *
Best,
Alexander Fedulov
On Wed, Jul 6, 2022 at 9:50 PM John Tipper wrote:
> Hi
rator-source/flink-core/src/main/java/org/apache/flink/api/connector/source/lib/DataGeneratorSourceV0.java
[3]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-238%3A+Introduce+FLIP-27-based+Data+Generator+Source#:~:text=%7D-,Usage%3A%C2%A0,-The%20envisioned%20usage
Best,
Alexander Fedulov
O
nts, everything
is straightforward - all SST files are copied over. The interplay of the
incremental checkpoints and compaction is described in this [1] blog post.
[1]
https://flink.apache.org/features/2018/01/30/incremental-checkpointing.html
Best,
Alexander Fedulov
On Mon, Jul 4, 2022 at 4:25
Hi Christian,
thanks for the reply. We use AT_LEAST_ONCE delivery semantics in this
> application. Do you think this might still be related?
No, in that case, Kafka transactions are not used, so it should not be
relevant.
Best,
Alexander Fedulov
On Mon, Jun 13, 2022 at 3:48 PM Christ
,
Alexander Fedulov
[1]
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/kafka/#fault-tolerance
On Mon, Jun 13, 2022 at 12:04 PM Martijn Visser
wrote:
> Hi Christian,
>
> I would expect that after the broker comes back up and recovers
> completely,
/testsuites/SinkTestSuiteBase.java#L226
Best,
Alexander Fedulov
On Wed, Jun 1, 2022 at 3:33 PM Qing Lim wrote:
> Thanks both, that’s perfect!
>
>
>
> *From:* Jing Ge
> *Sent:* 01 June 2022 14:29
> *To:* yuxia
> *Cc:* Qing Lim ; User
> *Subject:* Re: Can we resume a job
Flink jobs.
Maybe you could explain where specifically your situation does not fit in
one of those two scenarios?
Best,
Alexander Fedulov
On Wed, Jun 1, 2022 at 10:57 PM Jing Ge wrote:
> Hi Bariša,
>
> Could you share the reason why your data processing pipeline should keep
> runn
Hi Mac,
I just verified that objects with isXXX methods indeed will be interpreted
as POJOs. Would you be willing to contribute a documentation update?
Here are some guidelines: [1].
[1] https://flink.apache.org/contributing/contribute-documentation.html
Thanks,
Alexander Fedulov
On Thu
l
[2] https://flink.apache.org/news/2020/07/30/demo-fraud-detection-3.html
Best,
Alexander Fedulov
On Wed, Jan 26, 2022 at 10:45 PM Marco Villalobos
wrote:
> Hi Alexander,
>
> Thank you for responding. The solution you proposed uses statically
> defined windows. What I need a are dynamically cre
is no need for separate Flink deployments to create such a pipeline.
Best,
Alexander Fedulov
On Wed, Jan 26, 2022 at 6:47 PM Marco Villalobos
wrote:
> Hi,
>
> I am working with time series data in the form of (timestamp, name,
> value), and an event time that is the timestamp when
with your certificate into this container at startup
- Open a shell into this running connector, locate the "keytool" utility
and try to use it to import the certificate
Best,
Alexander Fedulov | Solutions Architect
<https://www.ververica.com/>
Follow us @VervericaData
first place.
Best,
--
Alexander Fedulov | Solutions Architect
<https://www.ververica.com/>
Follow us @VervericaData
--
Join Flink Forward <https://flink-forward.org/> - The Apache Flink
Conference
Stream Processing | Event Driven | Real Time
On Fri, May 22, 2020 at 8:57 AM Jaswin
, ... *)*;` to decide
where to dispatch the messages (see [1]), collect to none or many side
outputs, depending on your logic.
[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/side_output.html
--
Alexander Fedulov | Solutions Architect
<https://www.ververica.com/>
Fol
this example [2]
[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/event_time.html
[2]
https://github.com/afedulov/fraud-detection-demo/blob/master/flink-job/src/main/java/com/ververica/field/dynamicrules/sources/RulesSource.java#L80
--
Alexander Fedulov | Solutions Architect
&
Hi Felippe,
could you clarify in some more details what you are trying to achieve?
Best regards,
--
Alexander Fedulov | Solutions Architect
+49 1514 6265796
<https://www.ververica.com/>
Follow us @VervericaData
--
Join Flink Forward <https://flink-forward.org/> - The
Hi Sara,
do you have logs? Any exceptions in them?
Best,
--
Alexander Fedulov | Solutions Architect
+49 1514 6265796
<https://www.ververica.com/>
Follow us @VervericaData
--
Join Flink Forward <https://flink-forward.org/> - The Apache Flink
Conference
Stream Processing |
here:
https://github.com/afedulov/fraud-detection-demo/blob/master/flink-job/src/test/java/com/ververica/field/dynamicrules/RulesEvaluatorTest.java
Hope this helps.
Best regards,
--
Alexander Fedulov | Solutions Architect
+49 1514 6265796
<https://www.ververica.com/>
Follow us @Ververi
control
precisely.
Best,
--
Alexander Fedulov | Solutions Architect
+49 1514 6265796
<https://www.ververica.com/>
Follow us @VervericaData
--
Join Flink Forward <https://flink-forward.org/> - The Apache Flink
Conference
Stream Processing | Event Driven | Real Time
--
Ve
ks-in-parallel-streams
[2]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/experimental.html#reinterpreting-a-pre-partitioned-data-stream-as-keyed-stream
Best,
--
Alexander Fedulov | Solutions Architect
+49 1514 6265796
<https://www.ververica.com/>
Follow us @VervericaData
erverica.com/blog/announcing-ververica-platform-community-edition
[2] https://www.ververica.com/getting-started
[3] https://docs.ververica.com/getting_started/index.html
Best regards,
--
Alexander Fedulov | Solutions Architect
+49 1514 6265796
<https://www.ververica.com/>
Follow us @Ver
ontext);
[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html#streaming-file-sink
[2]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html#bucket-assignment
--
Alexander Fedulov | Solutions Architect
+49 1514 626
y the new `KafkaSerializationSchema`, which
would require a slight modification, but, from what I can tell, it will
still be possible to achieve such dynamic events dispatching.
Best regards,
Alexander Fedulov
>
63 matches
Mail list logo