The validator function will probably have to wrap Long.parseLong(jsonNode.asText()) in a try/catch to test for valid input when the current check, jsonNode.isIntegralNumber() && jsonNode.canConvertToLong(), is false. (Note asText() rather than toString(): toString() on a text node keeps the surrounding JSON quotes, so parseLong would always fail.)
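Roughly something like this — a sketch against Jackson's JsonNode API, not the actual RowJsonValueExtractors code; the class and method names are made up:

```
import com.fasterxml.jackson.databind.JsonNode;

class Int64Extraction {
  /**
   * Sketch of a lenient int64 extractor: accepts either a JSON number or a
   * JSON string containing a number (the form JsonFormat.printer emits for
   * proto int64 fields). Names here are hypothetical.
   */
  static long extractLong(JsonNode jsonNode) {
    // Fast path: a genuine JSON integral number that fits in a long.
    if (jsonNode.isIntegralNumber() && jsonNode.canConvertToLong()) {
      return jsonNode.longValue();
    }
    // Fallback: a JSON string like "12345". asText() strips the quotes that
    // toString() would keep, and parseLong throws on bad input, which (unlike
    // asLong(), which silently returns 0) lets us reject invalid values.
    if (jsonNode.isTextual()) {
      try {
        return Long.parseLong(jsonNode.asText());
      } catch (NumberFormatException e) {
        throw new IllegalArgumentException(
            "Value '" + jsonNode.asText() + "' is not a valid int64", e);
      }
    }
    throw new IllegalArgumentException(
        "Expected a JSON number or numeric string, got " + jsonNode.getNodeType());
  }
}
```

The parseLong fallback also answers the invalid-input concern below: the exception distinguishes garbage input from a legitimate value of 0.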
On Fri, Oct 22, 2021 at 8:07 PM Reuven Lax <re...@google.com> wrote:

> If we want to support this behavior, we would have to change this code:
>
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/util/RowJsonValueExtractors.java#L93
>
> I believe that instead of longValue, JsonNode::asLong would have to be
> used (that function parses strings). I'm not sure how to test for invalid
> input, since that function simply returns 0 in that case, and 0 is also a
> valid value.
>
> Reuven
>
> On Fri, Oct 22, 2021 at 12:41 PM gaurav mishra <
> gauravmishra.it...@gmail.com> wrote:
>
>> I have two pipelines:
>> PipelineA -> output to Pub/Sub (schema-bound JSON) -> PipelineB
>>
>> PipelineA emits proto models serialized as JSON strings. Serialization
>> is done with JsonFormat.printer:
>>
>> ```
>> JsonFormat.printer().preservingProtoFieldNames().print(model)
>> ```
>>
>> In PipelineB I am trying to read these JSON strings as Rows tied to the
>> schema derived from the same proto that was used in PipelineA:
>>
>> ```
>> Schema schema = new ProtoMessageSchema().schemaFor(TypeDescriptor.of(protoClass));
>> ....
>> input.apply(JsonToRow.withSchema(schema));
>> ...
>> ```
>>
>> The problem I am facing is with int64-type fields. When these fields are
>> serialized with `JsonFormat.printer`, they come out as strings in the
>> final JSON, which is by design (
>> https://developers.google.com/protocol-buffers/docs/proto3#json).
>> In PipelineB, when the framework tries to deserialize these fields as
>> int64, it fails:
>>
>> ```
>> Unable to get value from field 'site_id'. Schema type 'INT64'. JSON node
>> type STRING
>> org.apache.beam.sdk.util.RowJson$RowJsonDeserializer.extractJsonPrimitiveValue
>> ```
>>
>> Is there a way to work around this? Can I do something on either the
>> serialization side or the deserialization side to fix it?
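As for the workaround question in the original message: until the extractor changes, one option is to normalize the JSON between Pub/Sub and JsonToRow. A rough sketch, assuming Jackson is on the classpath and that you know the int64 field names of your proto up front (the class name and field list here are made up):

```
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.LongNode;
import com.fasterxml.jackson.databind.node.ObjectNode;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import org.apache.beam.sdk.transforms.DoFn;

/**
 * Sketch: rewrite string-encoded int64 fields back to JSON numbers so that
 * JsonToRow's INT64 extraction succeeds. Handles top-level fields only.
 */
public class NormalizeInt64Fn extends DoFn<String, String> {
  // Hypothetical: the int64 field names of the proto in question.
  private static final Set<String> INT64_FIELDS =
      new HashSet<>(Arrays.asList("site_id"));

  private transient ObjectMapper mapper;

  @Setup
  public void setup() {
    mapper = new ObjectMapper();
  }

  @ProcessElement
  public void processElement(@Element String json, OutputReceiver<String> out)
      throws Exception {
    ObjectNode node = (ObjectNode) mapper.readTree(json);
    for (String field : INT64_FIELDS) {
      JsonNode value = node.get(field);
      if (value != null && value.isTextual()) {
        // JsonFormat.printer emits int64 as "123"; turn it back into 123.
        node.set(field, LongNode.valueOf(Long.parseLong(value.asText())));
      }
    }
    out.output(mapper.writeValueAsString(node));
  }
}
```

Then wire it in ahead of the schema conversion, e.g. input.apply(ParDo.of(new NormalizeInt64Fn())).apply(JsonToRow.withSchema(schema)).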