Idk if you tried Polars, but it seems to work well with JSON data:

    import polars as pl
    pl.read_json("file.json")
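Since the export is ~60 GB of gzipped JSON, reading it whole may not be an option. Here is a minimal stdlib-only sketch of incremental decompression, assuming the payload is newline-delimited JSON (NDJSON); if Kenna returns one giant JSON document instead, you would need an incremental parser such as ijson, and the `requests` usage at the bottom is illustrative only:

    import json
    import zlib

    def iter_json_lines(chunks):
        """Incrementally decompress gzip chunks and yield parsed JSON
        objects, assuming newline-delimited JSON records."""
        # wbits = MAX_WBITS | 16 tells zlib to expect gzip framing
        decomp = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
        buf = b""
        for chunk in chunks:
            buf += decomp.decompress(chunk)
            # Emit every complete line; keep the partial tail in buf
            while b"\n" in buf:
                line, buf = buf.split(b"\n", 1)
                if line.strip():
                    yield json.loads(line)
        tail = buf + decomp.flush()
        if tail.strip():
            yield json.loads(tail)

    # Hypothetical usage with requests (stream=True keeps memory bounded):
    # resp = requests.get(url, stream=True)
    # for record in iter_json_lines(resp.iter_content(chunk_size=1 << 20)):
    #     process(record)

Only one chunk plus the current partial line is held in memory at a time, so peak usage stays flat regardless of file size.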
Kind Regards,
Abdur-Rahmaan Janhangeer
about <https://compileralchemy.github.io/> | blog <https://www.pythonkitchen.com>
github <https://github.com/Abdur-RahmaanJ>
Mauritius


On Mon, Sep 30, 2024 at 8:00 AM Asif Ali Hirekumbi via Python-list <python-list@python.org> wrote:

> Dear Python Experts,
>
> I am working with the Kenna Application's API to retrieve vulnerability
> data. The API endpoint provides a single, massive JSON file in gzip format,
> approximately 60 GB in size. Handling such a large dataset in one go is
> proving to be quite challenging, especially in terms of memory management.
>
> I am looking for guidance on how to efficiently stream this data and
> process it in chunks using Python. Specifically, I am wondering if there's
> a way to use the requests library or any other libraries that would allow
> us to pull data from the API endpoint in a memory-efficient manner.
>
> Here are the relevant API endpoints from Kenna:
>
> - Kenna API Documentation
>   <https://apidocs.kennasecurity.com/reference/welcome>
> - Kenna Vulnerabilities Export
>   <https://apidocs.kennasecurity.com/reference/retrieve-data-export>
>
> If anyone has experience with similar use cases or can offer any advice, it
> would be greatly appreciated.
>
> Thank you in advance for your help!
>
> Best regards
> Asif Ali
> --
> https://mail.python.org/mailman/listinfo/python-list