Hi Gen,

Thanks for your explanation.

Back to this code snippet: since the fields are not marked "transient" now, I suppose Flink will use Avro to serialize them (as null values). Is there any benchmark comparing the performance of serializing null values against marking the fields "transient"? I mean, it is indeed not good to write them without "transient", but it works. So is there any performance loss here?
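
To make the question concrete, here is a minimal plain-Java sketch of the two variants. It involves no Flink at all: `Handle` is a hypothetical stand-in for a state handle such as ValueState, the class names are made up for illustration, and it assumes the function object goes through standard Java serialization when it is deployed (as your description of the function being serialized while deploying suggests). It is not a benchmark, only a way to see what gets written in each case.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientSketch {

    // Hypothetical stand-in for a state handle; deliberately NOT Serializable.
    static class Handle {}

    // Variant 1: the handle field is a regular (non-transient) field.
    static class WithoutTransient implements Serializable {
        private static final long serialVersionUID = 1L;
        Handle handle; // stays null until it is assigned at runtime
    }

    // Variant 2: the handle field is skipped by Java serialization.
    static class WithTransient implements Serializable {
        private static final long serialVersionUID = 1L;
        transient Handle handle;
    }

    // Serializes an object with standard Java serialization.
    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    // Returns false if serialization fails (e.g. NotSerializableException).
    static boolean canSerialize(Object o) {
        try {
            serialize(o);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        WithoutTransient a = new WithoutTransient();
        WithTransient b = new WithTransient();

        // While the field is still null, both variants serialize without error.
        System.out.println("non-transient, null field: " + serialize(a).length + " bytes");
        System.out.println("transient field:           " + serialize(b).length + " bytes");

        // Once the field holds a non-serializable object, only the transient
        // variant can still be serialized.
        a.handle = new Handle();
        b.handle = new Handle();
        System.out.println("non-transient, field set: " + canSerialize(a));
        System.out.println("transient, field set:     " + canSerialize(b));
    }
}
```

Under these assumptions, the non-transient variant only works as long as the field is still null at the moment the object is serialized, which matches the "it works, but it's not good" situation described below.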


On 2023/02/24 06:47:21 Gen Luo wrote:
> Hi,
>
> ValueState is a handle rather than an actual value. So it should never be
> serialized. In fact, ValueState itself is not Serializable. It should be
> ok to always mark it as transient.
>
> In this case, I suppose it works because the ValueState is not set (which
> happens during the runtime) when the function is serialized (while
> deploying). But it's not good.
>
> On Fri, Feb 24, 2023 at 10:29 AM Zhongpu Chen <ch...@gmail.com> wrote:
>
> > Hi,
> >
> > While reading the code from flink-training-repo [1], I noticed the
> > following code:
> >
> > ```java
> >
> > public static class EnrichmentFunction
> >         extends RichCoFlatMapFunction<TaxiRide, TaxiFare, RideAndFare> {
> >
> >     private ValueState<TaxiRide> rideState;
> >     private ValueState<TaxiFare> fareState;
> >     ...
> > }
> >
> > ```
> >
> > From my understanding, since ValueState variables here are scoped to each
> > instance, they should not be serialized, for performance's sake. Thus, we
> > should always mark them with "transient". Similar discussion can be found
> > here [2].
> >
> > Should we always mark ValueState as "transient", and why? Please help me
> > to figure it out.
> >
> > [1]
> > https://github.com/apache/flink-training/blob/master/rides-and-fares/src/solution/java/org/apache/flink/training/solutions/ridesandfares/RidesAndFaresSolution.java
> >
> > [2]
> > https://stackoverflow.com/questions/72556202/flink-managed-state-as-transient
> >
>
