Delta, for example, manages merge/append/delete and also keeps previous
states of the table's data, so you can query what it looked like before.
See delta.io

On Thu, Jan 27, 2022, 11:54 AM Sid Kal <flinkbyhe...@gmail.com> wrote:

> Hi Sean,
>
> So you mean if I use those file formats it will do the work of CDC
> automatically or I would have to handle it via code ?
>
> Hi Mich,
>
> Not sure if I understood you. Let me try to explain my scenario. Suppose
> there is a Id "1" which is inserted today, so I transformed and ingested
> it. Now suppose if this user id is deleted from the source itself. Then how
> can I delete it in my transformed db
> ?
>
>
>
> On Thu, 27 Jan 2022, 22:44 Sean Owen, <sro...@gmail.com> wrote:
>
>> This is what storage engines like Delta, Hudi, Iceberg are for. No need
>> to manage it manually or use a DBMS. These formats allow deletes, upserts,
>> etc of data, using Spark, on cloud storage.
>>
>> On Thu, Jan 27, 2022 at 10:56 AM Mich Talebzadeh <
>> mich.talebza...@gmail.com> wrote:
>>
>>> Where ETL data is stored?
>>>
>>>
>>>
>>> *But now the main problem is when the record at the source is deleted,
>>> it should be deleted in my final transformed record too.*
>>>
>>>
>>> If your final sync (storage) is data warehouse, it should be soft
>>> flagged with op_type (Insert/Update/Delete) and op_time (timestamp).
>>>
>>>
>>>
>>> HTH
>>>
>>>
>>>    view my Linkedin profile
>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>
>>>
>>>
>>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>>> any loss, damage or destruction of data or any other property which may
>>> arise from relying on this email's technical content is explicitly
>>> disclaimed. The author will in no case be liable for any monetary damages
>>> arising from such loss, damage or destruction.
>>>
>>>
>>>
>>>
>>> On Thu, 27 Jan 2022 at 15:48, Sid Kal <flinkbyhe...@gmail.com> wrote:
>>>
>>>> I am using Spark incremental approach for bringing the latest data
>>>> everyday. Everything works fine.
>>>>
>>>> But now the main problem is when the record at the source is deleted,
>>>> it should be deleted in my final transformed record too.
>>>>
>>>> How do I capture such changes and change my table too ?
>>>>
>>>> Best regards,
>>>> Sid
>>>>
>>>>

Reply via email to