Thanks, Abdur-Rahmaan. I will give it a try!

Thanks,
Asif
On Mon, Sep 30, 2024 at 11:19 AM Abdur-Rahmaan Janhangeer <arj.pyt...@gmail.com> wrote:

> Idk if you tried Polars, but it seems to work well with JSON data
>
> import polars as pl
> pl.read_json("file.json")
>
> Kind Regards,
>
> Abdur-Rahmaan Janhangeer
> about <https://compileralchemy.github.io/> | blog <https://www.pythonkitchen.com>
> github <https://github.com/Abdur-RahmaanJ>
> Mauritius
>
>
> On Mon, Sep 30, 2024 at 8:00 AM Asif Ali Hirekumbi via Python-list <python-list@python.org> wrote:
>
>> Dear Python Experts,
>>
>> I am working with the Kenna Application's API to retrieve vulnerability
>> data. The API endpoint provides a single, massive JSON file in gzip
>> format, approximately 60 GB in size. Handling such a large dataset in
>> one go is proving to be quite challenging, especially in terms of
>> memory management.
>>
>> I am looking for guidance on how to efficiently stream this data and
>> process it in chunks using Python. Specifically, I am wondering if
>> there’s a way to use the requests library or any other libraries that
>> would allow us to pull data from the API endpoint in a memory-efficient
>> manner.
>>
>> Here are the relevant API endpoints from Kenna:
>>
>> - Kenna API Documentation
>>   <https://apidocs.kennasecurity.com/reference/welcome>
>> - Kenna Vulnerabilities Export
>>   <https://apidocs.kennasecurity.com/reference/retrieve-data-export>
>>
>> If anyone has experience with similar use cases or can offer any
>> advice, it would be greatly appreciated.
>>
>> Thank you in advance for your help!
>>
>> Best regards,
>> Asif Ali
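
For the streaming question above, here is a minimal sketch of one way to
pull the gzipped export and parse it incrementally, using requests plus the
third-party ijson package. The endpoint URL and token header are
placeholders to verify against Kenna's API reference, and the sketch
assumes the export decompresses to a single top-level JSON array:

import gzip

import ijson      # pip install ijson -- incremental JSON parser
import requests

EXPORT_URL = "https://api.kennasecurity.com/data_exports"  # placeholder
HEADERS = {"X-Risk-Token": "YOUR_TOKEN"}  # verify header name in the docs

with requests.get(EXPORT_URL, headers=HEADERS, stream=True,
                  timeout=300) as resp:
    resp.raise_for_status()
    # resp.raw exposes the response as a raw byte stream; gzip.open
    # decompresses it on the fly, so the 60 GB file is never held in
    # memory at once.
    with gzip.open(resp.raw, "rb") as stream:
        count = 0
        # The "item" prefix makes ijson yield each element of a top-level
        # JSON array one at a time, keeping memory use flat.
        for record in ijson.items(stream, "item"):
            count += 1  # replace with real per-record processing
print(f"Streamed {count} records")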
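
On the Polars suggestion: pl.read_json() parses the whole document in
memory, so it is likely to struggle with a 60 GB file. If the records can
be written out as NDJSON (one JSON object per line), Polars can instead
scan the file lazily. A sketch, with hypothetical column names:

import polars as pl

# scan_ndjson builds a lazy query plan; nothing is read until collect().
lazy = pl.scan_ndjson("kenna_export.ndjson")
high_risk = (
    lazy
    .filter(pl.col("risk_meter_score") > 800)  # hypothetical column
    .select("id", "risk_meter_score")          # hypothetical columns
    .collect()
)
print(high_risk.head())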