Hello, I am using Spark Structured Streaming to sink data from Kafka to AWS S3. I am wondering if it's possible to introduce a uniquely incrementing identifier for each record, as we do in an RDBMS (an auto-incrementing long id)? This would greatly help with range pruning when reading based on this ID.
Any thoughts? I have looked at monotonically_increasing_id, but it seems it is not deterministic, and it won't ensure that new records get the next id after the latest id already present in the storage (S3).

Regards,
Felix K Jose
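P.S. My understanding from the Spark docs is that monotonically_increasing_id puts the partition ID in the upper 31 bits and the record number within each partition in the lower 33 bits, so the generated ids are unique and increasing but neither consecutive nor anchored to what is already in S3. A rough sketch of that bit layout (plain Python, assuming that documented packing):

```python
def monotonically_increasing_id_sketch(partition_id: int, row_in_partition: int) -> int:
    """Approximate how Spark composes the id: partition ID in the
    upper 31 bits, record position within that partition in the
    lower 33 bits (roughly 8.6 billion records per partition)."""
    return (partition_id << 33) + row_in_partition

# Ids restart at a new 2**33-aligned base for each partition,
# leaving large gaps rather than continuing from existing data:
first_of_p0 = monotonically_increasing_id_sketch(0, 0)   # 0
second_of_p0 = monotonically_increasing_id_sketch(0, 1)  # 1
first_of_p1 = monotonically_increasing_id_sketch(1, 0)   # 2**33
```

So even within one batch the ids jump between partitions, which is why I don't see how it could pick up from the latest id already written to S3.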