Thanks for the approach but few updates w.r.t my query sent – Parquet file is a binary file so when I said corrupt record it is complete file in itself can’t be processed right? So it is not counting corrupt records rather counting corrupt files or splits in Flink?
From: Ken Krugler <kkrugler_li...@transpac.com> Sent: 16 June 2023 08:19 PM To: user@flink.apache.org Cc: Shammon FY <zjur...@gmail.com>; Kamal Mittal <kamal.mit...@ericsson.com> Subject: Re: Flink bulk and record file source format metrices Hi Kamal, In a similar situation, when a decoding failure happened I would generate a special record that I could then detect/filter out (and increment a counter) in a FilterFunction immediately following the source. — Ken On Jun 16, 2023, at 2:11 AM, Kamal Mittal via user <user@flink.apache.org<mailto:user@flink.apache.org>> wrote: Hello, Any way-forward, please suggest. Rgds, Kamal From: Kamal Mittal via user <user@flink.apache.org<mailto:user@flink.apache.org>> Sent: 15 June 2023 10:39 AM To: Shammon FY <zjur...@gmail.com<mailto:zjur...@gmail.com>> Cc: user@flink.apache.org<mailto:user@flink.apache.org> Subject: RE: Flink bulk and record file source format metrices Hello, I need one counter matric for no. of corrupt records while decoding parquet records at data source level. I know where the corrupt record handling requires but due to non-existence of “SourceContext” or “RuntimeContext”, unable to do anything w.r.t metric. It is needed similarly the way “SourceReaderBase” class maintaining one counter for no. of records emitted. Rgds, Kamal From: Shammon FY <zjur...@gmail.com<mailto:zjur...@gmail.com>> Sent: 14 June 2023 05:33 PM To: Kamal Mittal <kamal.mit...@ericsson.com<mailto:kamal.mit...@ericsson.com>> Cc: user@flink.apache.org<mailto:user@flink.apache.org> Subject: Re: Flink bulk and record file source format metrices Hi Kamal, Can you give more information about the metris you want? In Flink each source task has one source reader which already has some metrics, you can refer to metrics doc[1] for more detailed information. [1] https://nightlies.apache.org/flink/flink-docs-master/docs/ops/metrics/ Best, Shammon FY On Tue, Jun 13, 2023 at 11:13 AM Kamal Mittal via user <user@flink.apache.org<mailto:user@flink.apache.org>> wrote: Hello, Using Flink record stream format file source API as below for parquet records reading. FileSource.FileSourceBuilder<?> source = FileSource.forRecordStreamFormat(streamformat, path); source.monitorContinuously(Duration.ofMillis(10000)); Want to log/generate metrices for corrupt records and for the same need to log flink metrices at source level in parquet reader class, is there any way to do that as right now no handle for SourceContext available? Rgds, Kamal -------------------------- Ken Krugler http://www.scaleunlimited.com<https://protect2.fireeye.com/v1/url?k=31323334-501cfaf3-313273af-454445554331-f6b7d29d8e15caa4&q=1&e=33678f8f-9480-45e5-aac4-a9d76bed10d0&u=http%3A%2F%2Fwww.scaleunlimited.com%2F> Custom big data solutions Flink, Pinot, Solr, Elasticsearch