Looking at your use case, the whole processing logic seems to be very
custom.
I would recommend using ParDos to express it. If processing an individual
dictionary is expensive, you can use a Reshuffle operation to distribute
the dictionary updates across multiple workers.

Note: As you are going to make the write API calls yourself, your transform
can be executed multiple times for the same input in case of a worker
failure, so the same write call may be repeated.
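
One possible way to make such retries harmless, sketched under the
assumption that the receiving side can deduplicate on a key (whether your
API supports this is not something I know): derive a deterministic key from
the dictionary content. The names idempotency_key and DedupingSink are
hypothetical, and the sink is only an in-memory stand-in for the real API.

```python
import hashlib
import json


def idempotency_key(record):
    """Deterministic key: the same dictionary content gives the same key."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class DedupingSink:
    """In-memory stand-in for a receiving API that ignores repeated keys."""

    def __init__(self):
        self.stored = {}

    def post(self, record):
        """Store the record once; return False if it was already seen."""
        key = idempotency_key(record)
        if key in self.stored:
            return False
        self.stored[key] = record
        return True
```

If a worker fails and the same dictionary is posted a second time, the
second call produces the same key and is dropped, so 1000 inputs still
result in at most 1000 stored records.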

On Mon, Jun 3, 2019 at 11:41 AM Anjana Pydi <[email protected]>
wrote:

> Hi Ankur,
>
> Thanks for the reply. Please find my responses inline in the mail below.
>
> Thanks,
> Anjana
> ------------------------------
> *From:* Ankur Goenka [[email protected]]
> *Sent:* Monday, June 03, 2019 11:01 AM
> *To:* [email protected]
> *Subject:* Re: How to build a beam python pipeline which does GET/POST
> request to API's
>
> Thanks for providing more information.
>
> Some follow up questions/comments
> 1. Call an API which would provide a dictionary as response.
> Question: Do you need to make multiple of these API calls? If yes, what
> distinguishes call 1 from call 2? If it's the input to the API, can you
> provide the inputs in a file or similar? What I am trying to identify is
> an input source for the pipeline so that Beam can distribute the work.
> Answer: When an API call is made, it can return a list of dictionaries
> as the response. We have to go through every dictionary, apply the same
> transformations to each, and send it.
> 2. Transform dictionary to add / remove few keys.
> 3. Send transformed dictionary as JSON to an API which prints this JSON as
> output.
> Question: Are these write operations idempotent? As you are doing your own
> API calls, it's possible that after a failure the calls are made again for
> the same input. If the write calls are not idempotent, there can be
> duplicate data.
> Answer: Suppose I receive a list of 1000 dictionaries as the response to
> the API call in point 1. I should then do exactly 1000 write operations,
> one per input. If there is a failure for any input, only that input
> should be skipped, and the remaining ones should be posted successfully.
>
> On Sat, Jun 1, 2019 at 8:13 PM Anjana Pydi <[email protected]>
> wrote:
>
>> Hi Ankur,
>>
>> Thanks for the reply! Below are more details of the use case:
>>
>> 1. Call an API which would provide a dictionary as response.
>> 2. Transform dictionary to add / remove few keys.
>> 3. Send transformed dictionary as JSON to an API which prints this JSON
>> as output.
>>
>> Please let me know if you need any clarification.
>>
>> Thanks,
>> Anjana
>> ------------------------------
>> *From:* Ankur Goenka [[email protected]]
>> *Sent:* Saturday, June 01, 2019 6:47 PM
>> *To:* [email protected]
>> *Subject:* Re: How to build a beam python pipeline which does GET/POST
>> request to API's
>>
>> Hi Anjana,
>>
>> You can write your API logic in a ParDo and subsequently pass the
>> elements to other ParDos to transform them and eventually make an API
>> call to another endpoint.
>>
>> However, this might not be a good fit for Beam: the input is not well
>> defined, so scaling and "once processing" of elements will not be
>> possible.
>>
>> Could you elaborate a bit more on the use case so that we can make better
>> suggestions?
>>
>> Thanks,
>> Ankur
>>
>> On Sat, Jun 1, 2019 at 5:50 PM Anjana Pydi <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> I have a requirement to create an Apache Beam Python pipeline to read
>>> JSON from an API endpoint, transform it (add/remove a few fields), and
>>> send the transformed JSON to another API endpoint.
>>>
>>> Can anyone please provide some suggestions on how to do it?
>>>
>>> Thanks,
>>> Anjana
>>> -----------------------------------------------------------------------------------------------------------------------
>>> The information contained in this communication is intended solely for the
>>> use of the individual or entity to whom it is addressed and others
>>> authorized to receive it. It may contain confidential or legally privileged
>>> information. If you are not the intended recipient you are hereby notified
>>> that any disclosure, copying, distribution or taking any action in reliance
>>> on the contents of this information is strictly prohibited and may be
>>> unlawful. If you are not the intended recipient, please notify us
>>> immediately by responding to this email and then delete it from your
>>> system. Bahwan Cybertek is neither liable for the proper and complete
>>> transmission of the information contained in this communication nor for any
>>> delay in its receipt.
>>>
>>
>
