On behalf of Dr. Elliot Crowley from the School of Engineering at the 
University of Edinburgh (queries at elliot.crow...@ed.ac.uk):


Application link: Affordable Training of Large Language 
Models<https://www.eng.ed.ac.uk/studying/postgraduate/research/phd/affordable-training-large-language-models>


Recent developments in large language models (LLMs) have caught the attention 
of the public. LLMs such as OpenAI's GPT-4 and Google's Bard can generate 
remarkably realistic, coherent text from a user's input and have the potential 
to become general-purpose tools used throughout society, e.g. for customer 
service, summarising text, answering questions, writing contracts, or 
translating between languages.


However, LLMs are prohibitively expensive to train. GPT-3 (which is 
significantly smaller than its successor, GPT-4) has an estimated training time 
of 355 GPU-years and an estimated training cost of $4.6M [1]. Only large, 
wealthy institutions can train these models, and they thereby control how the 
models are trained and who gets access to them. This is undemocratic.


Very recent work, however, provides hope. In [2] the authors explore the 
promising idea of “cramming”: training an LLM on a single GPU in a day. In [3] 
the authors use synthetic data to train “small” language models that can 
produce consistent stories at little cost. Nevertheless, there remains a large 
gap in quality between these models and their expensive counterparts.


In this PhD, the student will investigate affordable LLM training, i.e. with 
limited compute and/or data, inspired by [2,3]. Avenues of research could 
include: (i) generating training data that facilitates fast training, e.g. 
through dataset distillation [4]; (ii) exploring neural architecture search to 
develop models that are "aware" of being resource-constrained while being 
trained; (iii) developing novel, cost-effective training algorithms; and (iv) 
leveraging and tuning open-source LLMs.


The successful student will have opportunities for collaboration within and 
outside Edinburgh’s School of Engineering e.g. with colleagues in the Institute 
for Digital 
Communications<https://www.eng.ed.ac.uk/research/institutes/idcom/>, The 
Bayesian and Neural Systems Group<https://www.bayeswatch.com/>, and Edinburgh 
NLP<https://edinburghnlp.inf.ed.ac.uk/>.


[1] https://lambdalabs.com/blog/demystifying-gpt-3

[2] https://arxiv.org/abs/2212.14034

[3] https://arxiv.org/abs/2305.07759

[4] https://arxiv.org/abs/1811.10959

The University of Edinburgh is a charitable body, registered in Scotland, with 
registration number SC005336. Is e buidheann carthannais a th’ ann an Oilthigh 
Dhùn Èideann, clàraichte an Alba, àireamh clàraidh SC005336.
_______________________________________________
Corpora mailing list -- corpora@list.elra.info
https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
To unsubscribe send an email to corpora-le...@list.elra.info
