The validator function will probably have to wrap
Long.parseLong(jsonNode.asText()) in a try/catch to test for valid input
in case the current check, jsonNode.isIntegralNumber() &&
jsonNode.canConvertToLong(), is false.
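
A minimal sketch of that combined check, assuming a Jackson JsonNode as
input (the method name isValidLong is mine, not the actual Beam code):

```
import com.fasterxml.jackson.databind.JsonNode;

// Hypothetical validator: accept native JSON integers as well as
// proto3-style string-encoded int64 values.
static boolean isValidLong(JsonNode jsonNode) {
  // The existing check: a real JSON integral number that fits in a long.
  if (jsonNode.isIntegralNumber() && jsonNode.canConvertToLong()) {
    return true;
  }
  // Fallback for string-encoded int64: try to parse the raw text.
  if (jsonNode.isTextual()) {
    try {
      Long.parseLong(jsonNode.asText());
      return true;
    } catch (NumberFormatException e) {
      return false;
    }
  }
  return false;
}
```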

On Fri, Oct 22, 2021 at 8:07 PM Reuven Lax <re...@google.com> wrote:

> If we want to support this behavior, we would have to change this code:
>
>
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/util/RowJsonValueExtractors.java#L93
>
> I believe that instead of longValue, JsonNode::asLong would have to be
> used (that function parses strings). I'm not sure how to test for invalid
> input, since that function simply returns 0 in that case, which is also a
> valid value.
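>
> For illustration, Jackson's fallback behavior (a standalone sketch, not
> the Beam code path):
>
> ```
> import com.fasterxml.jackson.databind.JsonNode;
> import com.fasterxml.jackson.databind.ObjectMapper;
>
> ObjectMapper mapper = new ObjectMapper();
> // A string value that is not a number at all:
> JsonNode node = mapper.readTree("{\"site_id\": \"abc\"}").get("site_id");
> // asLong() cannot parse the text, so it returns its default of 0,
> // which is indistinguishable from a genuine value of 0.
> System.out.println(node.asLong()); // prints 0
> ```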
>
> Reuven
>
> On Fri, Oct 22, 2021 at 12:41 PM gaurav mishra <
> gauravmishra.it...@gmail.com> wrote:
>
>> I have two pipelines:
>> PipelineA -> output to Pub/Sub (schema-bound JSON) -> PipelineB
>>
>> PipelineA emits proto models serialized as JSON strings. Serialization
>> is done using JsonFormat.printer:
>> ```
>> JsonFormat.printer()
>>     .preservingProtoFieldNames()
>>     .print(model);
>> ```
>>
>> In PipelineB I am trying to read these JSON strings as Rows tied to the
>> schema derived from the same proto that was used in PipelineA.
>>
>> ```
>> Schema schema =
>>     new ProtoMessageSchema().schemaFor(TypeDescriptor.of(protoClass));
>> ....
>> input.apply(JsonToRow.withSchema(schema));
>> ...
>> ```
>>
>> The problem I am facing is with int64 fields. When these fields are
>> serialized using `JsonFormat.printer`, they end up as strings in the
>> final JSON, which is by design (
>> https://developers.google.com/protocol-buffers/docs/proto3#json).
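>>
>> For example, an int64 field site_id with value 12345 (values here are
>> illustrative) is printed as:
>>
>> ```
>> {"site_id": "12345"}
>> ```
>>
>> rather than as a bare JSON number.
>>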
>> In PipelineB, when the framework tries to deserialize these fields as
>> int64, it fails:
>> ```
>> Unable to get value from field 'site_id'. Schema type 'INT64'. JSON node
>> type STRING
>> org.apache.beam.sdk.util.RowJson$RowJsonDeserializer.extractJsonPrimitiveValue
>> ```
>>
>> Is there a way to work around this problem? Can I do something on
>> either the serialization side or the deserialization side to fix it?
>>