Hi folks,

Following what was discussed in an off-topic subthread on d-project, I prepared a demo of some imagined use cases for leveraging LLMs to help Debian development:

https://salsa.debian.org/deeplearning-team/debgpt
To run the demo, the minimum requirement is a CUDA GPU with more than 6 GB of memory. You can of course run it on a CPU, but that requires more than 64 GB of RAM, and a single reply may take more than 20 minutes (I tested this on a Xeon Gold 6140). A longer context requires more memory. If you cannot run the demo, I have also provided a couple of example sessions: you can use `replay.py` to replay my LLM sessions and see how they work. The installation and setup guide can be found in docs/.

First, start the LLM inference backend:

$ debgpt backend --device cuda --precision 4bit

Then launch the frontend to interact with it. The complete list of potential use cases is in demo.sh; I have recorded an example session for every command in it. The following are some selected example use cases. (The results are not always perfect, but you can ask the LLM to retry.)

1. Let the LLM read Policy section 4.9.1 and implement "nocheck" support in pytorch/debian/rules.

   command: debgpt x -f examples/pytorch/debian/rules --policy 4.9.1 free -i
   replay:  python3 replay.py examples/84d5a49c-8436-4970-9955-d14592ef1de1.json

2. Let the LLM add armhf to, and delete kfreebsd-amd64 from, the architecture list in pytorch/debian/control.

   command: debgpt x -f examples/pytorch/debian/control free -i
   replay:  python3 replay.py examples/e98f8167-be4d-4c27-bc49-ac4b5411258f.json

3. I always forget which distribution I should target when uploading to stable. Is it bookworm? bookworm-pu? bookworm-updates? bookworm-proposed-updates? We let the LLM read devref section 5.5 and answer the question.

   command: debgpt devref -s 5.5 free -i
   replay:  python3 replay.py examples/6bc35248-ffe7-4bc3-93a2-0298cf45dbae.json

4. Let the LLM explain the differences among the proposals in vote/2023/vote_002.

   command: debgpt vote -s 2023/vote_002 diff
   replay:  python3 replay.py examples/bab71c6f-1102-41ed-831b-897c80e3acfb.json

   Note: this might be sensitive. I added a big red warning to the program for when you ask the LLM about vote questions.
   Do not let the LLM affect your vote.

5. Mimic licensecheck. The licensecheck Perl implementation is based on regular expressions. It has a small knowledge base and does not work well when the text is very noisy.

   command: debgpt file -f debgpt/llm.py licensecheck -i
   replay:  python3 replay.py examples/c7e40063-003e-4b04-b481-27943d1ad93f.json

6. My email is too long and you don't want to read it. The LLM can summarize it.

   command: debgpt ml -u 'https://lists.debian.org/debian-project/2023/12/msg00029.html' summary -i
   replay:  python3 replay.py examples/95e9759b-1b67-49d4-854a-2dedfff07640.json

7. General chat with the LLM without any additional information.

   command: debgpt none -i
   replay:  python3 replay.py examples/da737d4c-2e93-4962-a685-2a0396d7affb.json

The core idea behind all these sub-functionalities is the same: gather some task-specific information and send it to the LLM together with the question. I feel the state-of-the-art LLMs are much better than they were a few months ago. I'll leave it to the community to evaluate how LLMs can help Debian development, and how useful and reliable they are.

You can also send me more ideas on how we could interact with an LLM for Debian-specific tasks. They are generally not difficult to implement. The difficulty stems from hardware capacity, and hence the limited context length: the client program has to fetch only the most relevant information for the task. What do you think?
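For those curious, the "gather task-specific information and send it to the LLM" idea can be sketched in a few lines of Python. This is a hypothetical illustration, not debgpt's actual code: `build_prompt`, `ask_llm`, and the stub backend are all my own assumptions about the shape of such a pipeline.

```python
def build_prompt(context_chunks, question):
    """Concatenate the retrieved task-specific context with the user's
    question into a single prompt string for the LLM."""
    parts = []
    for i, chunk in enumerate(context_chunks, 1):
        parts.append(f"[context {i}]\n{chunk}")
    parts.append(f"[question]\n{question}")
    return "\n\n".join(parts)


def ask_llm(prompt, backend):
    """`backend` is any callable mapping a prompt string to a reply string
    (e.g. a local inference server client)."""
    return backend(prompt)


if __name__ == "__main__":
    # Hypothetical example: feed a Policy excerpt plus a request to the LLM.
    ctx = ["Policy 4.9.1: (relevant excerpt fetched by the client) ..."]
    prompt = build_prompt(ctx, "Implement nocheck support in debian/rules.")
    # Stub backend so the sketch runs without a GPU.
    reply = ask_llm(prompt, backend=lambda p: f"(stub reply, prompt was {len(p)} chars)")
    print(reply)
```

The interesting part is entirely in what goes into `context_chunks`: because the context window is limited, the client must select only the most relevant material (a Policy section, a debian/rules file, a mailing-list post) rather than dumping everything in.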