Hi Yarden,

Since it's a bounded source, you could try the Sql transform, grouping by the timestamp column. Here are some examples of grouping:
https://github.com/apache/beam/tree/master/sdks/python/apache_beam/yaml

However, if you want to add a timestamp column alongside the original CSV records, there are multiple ways to achieve that.

1) MapToFields:
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/yaml/yaml_mapping.md
[Your timestamp column could be a callable that returns the current timestamp for each record]

2) If you need an extra layer of transformation complexity, I would recommend creating a custom transformation:

    # - type: MyCustomTransform
    #   name: AddDateTimeColumn
    #   config:
    #     prefix: 'whatever'

    providers:
      - type: 'javaJar'
        config:
          jar: 'gs://path/of/the/java.jar'
        transforms:
          MyCustomTransform: 'beam:transform:org.apache.beam:javatransformation:v1'

Here is a good example of how to do that in Java:
https://github.com/apache/beam/blob/master/examples/multi-language/src/main/java/org/apache/beam/examples/multilanguage/JavaPrefixRegistrar.java

Best,
Ferran

On Mon, 8 Jan 2024 at 19:53, Yarden BenMoshe (<yarde...@gmail.com>) wrote:

>
> Hi all,
> I'm quite new to using Beam YAML. I am working with a CSV file and want to
> implement some windowing logic to it.
> Was wondering what is the right way to add timestamps to each element,
> assuming I have a column including a timestamp.
>
> I am aware of Beam Programming Guide (apache.org) part but not sure how this
> can be implemented and used from a YAML perspective.
>
> Thanks
> Yarden