You are welcome to create a PR to fix this issue if you need to change the connector source code.
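
Before patching the connector, it may be worth checking whether the option named in the error message can be passed through the existing API. The sketch below is untested against a live cluster and rests on two assumptions to verify in your environment: that ReadFromMongoDB forwards its extra_client_params argument unchanged to pymongo's MongoClient, and that your installed pymongo is recent enough (4.3 or later) to accept datetime_conversion. The URI, database, and collection names are placeholders, and describe is a hypothetical helper used only to show which Python type each field decodes to.

```python
# Sketch only: forwards pymongo's datetime_conversion option through Beam's
# MongoDB source. Untested against a live cluster; the URI, database and
# collection names are placeholders.

# Client option named in the InvalidBSON message (assumes pymongo >= 4.3).
EXTRA_CLIENT_PARAMS = {"datetime_conversion": "DATETIME_AUTO"}


def describe(doc):
    """Hypothetical helper: render each field with its Python type name.

    With DATETIME_AUTO, pymongo decodes out-of-range dates as DatetimeMS
    rather than datetime.datetime, so those fields stand out in the output.
    """
    return {key: f"{type(value).__name__}: {value!r}" for key, value in doc.items()}


def run():
    # Imported here so the module loads without Beam or pymongo installed.
    import apache_beam as beam
    from apache_beam.io.mongodbio import ReadFromMongoDB

    with beam.Pipeline() as pipeline:
        (
            pipeline
            | "Read" >> ReadFromMongoDB(
                uri="mongodb://localhost:27017",
                db="mydb",
                coll="mycoll",
                # Assumption: passed as keyword arguments to MongoClient.
                extra_client_params=EXTRA_CLIENT_PARAMS,
            )
            | "Describe" >> beam.Map(describe)
            | "Print" >> beam.Map(print)
        )


if __name__ == "__main__":
    run()
```

If the read succeeds, downstream transforms would still need to handle DatetimeMS values explicitly, since they are not datetime objects. If the option is not forwarded or not accepted, a change to the connector is the remaining route.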

On Sun, Jul 7, 2024 at 5:39 AM Marcin Stańczak <[email protected]> wrote:

> Hello Apache Beam Community,
>
> I'm Marcin and I am currently working on a project using Apache Beam
> 2.57.0. I have encountered an issue when reading data from MongoDB
> with the "mongodbio" connector. I am unable to reach the
> transformation step due to an InvalidBSON error related to
> out-of-range dates.
>
> Error Message:
>
> bson.errors.InvalidBSON: year 55054 is out of range (Consider Using
> CodecOptions(datetime_conversion=DATETIME_AUTO) or
> MongoClient(datetime_conversion='DATETIME_AUTO')). See:
>
> https://pymongo.readthedocs.io/en/stable/examples/datetimes.html#handling-out-of-range-datetimes
>
> Here are the details of my setup:
>
> Apache Beam version: 2.57.0
> Python version: 3.10
>
> In my current MongoDB collection, it is possible to encounter dates
> that are out of the standard range, such as year 0 or years greater
> than 9999, which causes this issue.
>
> I have handled this issue in standalone Python scripts using
> CodecOptions and DatetimeConversion. However, I am facing difficulties
> integrating this logic within an Apache Beam pipeline and I don't
> think it's possible to handle without changing the source code of this
> connector. I would appreciate any guidance or suggestions on how to
> resolve this issue within the Beam framework.
>
> Thank you for your assistance.
>
> Best regards,
> Marcin
