On Sun, Apr 2, 2023 at 5:40 PM Erik Moeller <eloque...@gmail.com> wrote:
>
> I can't comment on the hardware requirements, but I would note that in
> addition to the llama.cpp repository
> (https://github.com/ggerganov/llama.cpp), which currently focuses on
> LLaMA/Alpaca, there are other efforts to reduce the computational
> requirements for running LLMs. https://github.com/NolanoOrg/cformers
> looks promising and supports many of the open models. Fabrice Bellard
> of FFmpeg fame was one of the first implementers of a highly optimized
> LLM at https://textsynth.com/ ; sadly much of the work is proprietary
At this point I guess I would recommend adding five or so g2.cores8.ram36.disk20 flavor VPSs to WMCS, with between one and three RTX A6000 GPUs each, plus a 1 TB SSD each, which should cost under $60k. That should allow for very widely multilingual models somewhere between GPT-3.5 and GPT-4 performance at current training rates.

> https://textsynth.com/playground.html remains one of the most
> accessible ways to explore the performance of the open models with
> only a rate limitation, and no requirement to purchase credits.

There are free Alpaca-30B demos for comparison at https://github.com/deep-diver/Alpaca-LoRA-Serve

And a free Alpaca-7B demo online at https://chatllama.baseten.co/

These models can be quantized into int4 weights, which run on cell phones: https://github.com/rupeshs/alpaca.cpp/tree/linux-android-build-support

It seems inevitable that we will someday include such LLMs with Internet-in-a-Box, and why not also the primary mobile apps, so we don't have to give away server CPU utilization? There is a proposal to allow apps over 4 GB in WASM: https://github.com/WebAssembly/memory64/blob/master/proposals/memory64/Overview.md

At the rate things are improving, maybe that won't even be needed to make a reasonable static web app someday.

-LW
_______________________________________________
Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
Public archives at https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/V2NR6LH22JWDXDQL2KSAOEBJKID7LJSZ/
To unsubscribe send an email to wikimedia-l-le...@lists.wikimedia.org
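P.S. For anyone curious what the int4 quantization mentioned above actually does, here is a minimal group-wise quantize/dequantize sketch in Python. This is a simplified illustration in the spirit of llama.cpp's 4-bit formats, not the actual GGML block layout or bit packing; the group size of 32 matches what llama.cpp's Q4_0 format uses, but everything else (function names, scale scheme) is my own simplification.

```python
from typing import List, Tuple

GROUP = 32  # weights per quantization group (llama.cpp's Q4_0 also uses 32)

def quantize_int4(weights: List[float]) -> Tuple[List[int], List[float]]:
    """Map each group of floats to 4-bit ints in [-8, 7] plus one float scale per group."""
    q: List[int] = []
    scales: List[float] = []
    for i in range(0, len(weights), GROUP):
        group = weights[i:i + GROUP]
        amax = max(abs(w) for w in group) or 1.0  # avoid division by zero
        scale = amax / 7.0  # the largest magnitude maps to +/-7
        scales.append(scale)
        for w in group:
            q.append(max(-8, min(7, round(w / scale))))
    return q, scales

def dequantize_int4(q: List[int], scales: List[float]) -> List[float]:
    """Recover approximate floats: w ~= q * scale (one scale per group of 32)."""
    return [q[i] * scales[i // GROUP] for i in range(len(q))]

if __name__ == "__main__":
    weights = [0.7, -0.35, 0.07, -0.014] * 8  # one group of 32 example weights
    q, s = quantize_int4(weights)
    approx = dequantize_int4(q, s)
    # Round-trip error is bounded by half a quantization step (scale / 2).
    assert max(abs(a - b) for a, b in zip(weights, approx)) <= s[0] / 2 + 1e-9
```

The point of the per-group scale is that 4 bits alone can only represent 16 levels; storing one float scale per 32 weights keeps the storage cost near 4.5 bits per weight while letting each group use its own dynamic range, which is why these models shrink enough to fit on phones.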