Hi devs,

We're discussing the behavior of casting functions (including cast, to_timestamp, to_date, etc.) for invalid input in https://issues.apache.org/jira/browse/FLINK-24924. As this topic is crucial to compatibility and usability, we'd like to continue the discussion publicly on the mailing list.

The main question is whether casting functions should return null for invalid input (keeping the current behavior) or throw exceptions (introducing a new behavior). I'll try to summarize the arguments on both sides. Please correct me if I've missed anything.

*From the devs who support throwing exceptions (new behavior):*

The main concern is that if we silently return null, unexpected results or exceptions (mainly NullPointerException) may show up later on. It will be hard for users to work out the cause because there is no detailed message. If we throw an exception in the first place, it is much easier to catch it, with a detailed message explaining what went wrong. For DATE/TIME/TIMESTAMP in particular, it is very helpful to have a detailed error and see where and why the parsing broke.

For compatibility concerns, we can provide a TRY_CAST function which behaves exactly like the current CAST function, returning null for invalid input.

*From the devs who support returning null (current behavior):*

The main concern is compatibility and usability.

On usability: The upstream system may occasionally produce invalid data. If we throw an exception when we see it, the job will fail again and again, even after a restart, because the invalid data is still there. A streaming job is a long-running program, and users do not want it to fail frequently without being able to recover automatically. Most users are willing to just skip that record and continue processing. Imagine an online job that has been running for a couple of weeks and suddenly fails due to some unexpected dirty data. What choices do users have to quickly resume the job?

On compatibility: There are currently thousands of users and tens of thousands of jobs relying on the current behavior to filter out invalid input. If we change the behavior, it will be a disaster for users, as they will have to check and rewrite their SQL very carefully.

What do you think? We're looking forward to your feedback.
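
P.S. To make the two options concrete, here is a rough sketch in plain Python. It is not Flink code and does not reflect how Flink implements casting; the function names strict_cast_to_date and try_cast_to_date, the yyyy-MM-dd format, and the sample rows are all made up for illustration. It only shows the difference between failing on the first invalid value and turning invalid values into null so they can be filtered out.

```python
from datetime import date, datetime
from typing import Optional


def strict_cast_to_date(text: str) -> date:
    """Proposed behavior (hypothetical helper): raise with a detailed message."""
    try:
        return datetime.strptime(text, "%Y-%m-%d").date()
    except ValueError as err:
        # Include the offending value so the user can see what broke and why.
        raise ValueError(f"Cannot cast {text!r} to DATE: {err}") from err


def try_cast_to_date(text: str) -> Optional[date]:
    """Current behavior / proposed TRY_CAST (hypothetical helper): None on bad input."""
    try:
        return strict_cast_to_date(text)
    except ValueError:
        return None


# Made-up input: one valid date, one non-date string, one impossible date.
rows = ["2021-11-17", "not-a-date", "2021-02-30"]

# Null-returning semantics: invalid rows become None and can be skipped.
parsed = [try_cast_to_date(r) for r in rows]
valid = [d for d in parsed if d is not None]

# Exception semantics: the first invalid row stops processing with a message.
try:
    strict = [strict_cast_to_date(r) for r in rows]
except ValueError as err:
    failure_message = str(err)
```

With the null-returning variant the two bad rows are dropped silently. With the strict variant processing stops at "not-a-date" and the message names the value, which is the trade-off described above.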