Thanks Piotrek! I forgot to mention that I'm using PyFlink and mostly Table
APIs. The documentation (
https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/tableApi.html#row-based-operations)
suggests that Map() function is not currently supported in Python. So, what
do you think would be my options here. Should I convert to a data stream to
perform this in Python?

Thanks again,
Sumeet


On Wed, Apr 14, 2021 at 7:09 PM Piotr Nowojski <pnowoj...@apache.org> wrote:

> Hi,
>
> One thing that you can do is to read this record using Avro keeping
> `Result` as `bytes` and in a subsequent mapping function, you could change
> the record type and deserialize the result. In Data Stream API:
>
> source.map(new MapFunction<record_with_bytes,
> record_with_deserialized_result> { ...} )
>
> Best,
> Piotrek
>
> śr., 14 kwi 2021 o 03:17 Sumeet Malhotra <sumeet.malho...@gmail.com>
> napisał(a):
>
>> Hi,
>>
>> I'm reading data from Kafka, which is Avro encoded and has the following
>> general schema:
>>
>> {
>>   "name": "SomeName",
>>   "doc": "Avro schema with variable embedded encodings",
>>   "type": "record",
>>   "fields": [
>>     {
>>       "name": "Name",
>>       "doc": "My name",
>>       "type": "string"
>>     },
>>     {
>>       "name": "ID",
>>       "doc": "My ID",
>>       "type": "string"
>>     },
>>     {
>>       "name": "Result",
>>       "doc": "Result data, could be encoded differently",
>>       "type": "bytes"
>>     },
>>     {
>>       "name": "ResultEncoding",
>>       "doc": "Result encoding media type (e.g. application/avro,
>> application/json)",
>>       "type": "string"
>>     },
>>   ]
>> }
>>
>> Basically, the "Result" field is bytes whose interpretation depends upon
>> the "ResultEncoding" field i.e. either avro or json. The "Result" byte
>> stream has its own well defined schema also.
>>
>> My use case involves extracting/aggregating data from within the embedded
>> "Result" field. What would be the best approach to perform this runtime
>> decoding and extraction of fields from the embedded byte data? Would user
>> defined functions help in this case?
>>
>> Thanks in advance!
>> Sumeet
>>
>>

Reply via email to