Hi Team,
I am using the CSV decoder with the Flink file source.
I am stuck on the decoding issues described below:
1. If there is a blank line between two records, or there are blank lines at
the end of the file, the decoder returns a blank object for each one. E.g.:
Input Records:
id,name,age,isPermanent,tenure,salary,gender,contact
5,emp1,25,true,3.5,4555555,Male,123456789;987654321

10,emp2,25,true,3.5,5555555,Male,123456789;987654321
Output:
{"id": 5, "name": "emp1", "age": 25, "isPermanent": true, "tenure": 3.5,
"salary": 4555555.0, "gender": "Male", "contact": [123456789, 987654321]}
{"id": null, "name": null, "age": 0, "isPermanent": false, "tenure": 0.0,
"salary": 0.0, "gender": null, "contact": null}
{"id": 10, "name": "emp2", "age": 25, "isPermanent": true, "tenure": 3.5,
"salary": 5555555.0, "gender": "Female", "contact": [123456789, 987654321]}
Is there any way to avoid creating a blank object when blank lines are
present?
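To make the request concrete, here is a minimal plain-Java sketch of the behavior I am hoping for. It does not use any Flink API; `BlankLineFilter` and `nonBlankLines` are hypothetical names chosen only for illustration, and the sketch assumes the raw lines are available before decoding.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Flink: drops blank lines before decoding.
final class BlankLineFilter {
    private BlankLineFilter() {}

    // Keeps only lines that contain at least one non-whitespace character.
    static List<String> nonBlankLines(List<String> lines) {
        return lines.stream()
                .filter(line -> line != null && !line.trim().isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "5,emp1,25,true,3.5,4555555,Male,123456789;987654321",
                "",
                "10,emp2,25,true,3.5,5555555,Male,123456789;987654321",
                "   ");
        // Only the two data records remain; the blank lines are skipped.
        nonBlankLines(input).forEach(System.out::println);
    }
}
```

With a filter like this applied first, only the two data records would reach the decoder, so no blank object would be produced.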
2. If a blank value arrives for a numeric field, the decoder assigns the default
value of that data type, whereas decoding fails for object types such as Enum.
Scenario - 1:
Input:
id,name,age,isPermanent,tenure,salary,gender,contact
10,emp1,25,true,2.5,4555555, ,123456789;987654321
Output:
An exception occurs because the gender (Enum) value is blank.
Scenario - 2:
Input:
id,name,age,isPermanent,tenure,salary,gender,contact
10,emp1,25,true,,4555555,Male,123456789;987654321
Output:
{"id": 10, "name": "emp1", "age": 25, "isPermanent": true, "tenure": 0.0,
"salary": 4555555.0, "gender": "Male", "contact": [123456789, 987654321]}
Ideally, decoding should fail here, because a blank is a string and not a number,
but instead the decoder assigns the default value of the data type. Is there any
way to make decoding fail in this case?
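As an illustration of the strict behavior I would expect in Scenario 2, here is a small plain-Java sketch. It is not Flink's decoder; `StrictFieldParser` and `parseRequiredDouble` are hypothetical names, and the sketch assumes each field is available as a raw string.

```java
// Hypothetical helper, not part of Flink: rejects blank numeric fields
// instead of silently falling back to the type's default value.
final class StrictFieldParser {
    private StrictFieldParser() {}

    // Parses a numeric field, failing on null, empty, or whitespace-only input.
    static double parseRequiredDouble(String fieldName, String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            throw new IllegalArgumentException(
                    "Blank value for numeric field '" + fieldName + "'");
        }
        // Throws NumberFormatException if the text is not a valid number.
        return Double.parseDouble(raw.trim());
    }

    public static void main(String[] args) {
        System.out.println(parseRequiredDouble("tenure", "3.5"));
        try {
            parseRequiredDouble("tenure", "");
        } catch (IllegalArgumentException e) {
            System.out.println("Decoding failed: " + e.getMessage());
        }
    }
}
```

With this approach the record in Scenario 2 would be rejected with an error for `tenure` rather than decoded with `0.0`.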
Any help will be appreciated.
Regards,
Kirti Dhar