Re: [FRIAM] Dissecting Recall of Factual Associations in Auto-Regressive Language Models
Interesting that I downloaded this paper this morning also. Haven't yet looked to see what I could understand about it. From the little I know about it, this looks like it is related to the self-attention mechanism in transformers.

-- Russ Abbott
Professor Emeritus, Computer Science
California State University, Los Angeles

On Sun, May 7, 2023 at 9:29 AM Steve Smith wrote:
> https://arxiv.org/pdf/2304.14767.pdf

-. --- - / ...- .- .-.. .. -.. / -- --- .-. ... . / -.-. --- -.. .
FRIAM Applied Complexity Group listserv
Fridays 9a-12p Friday St. Johns Cafe / Thursdays 9a-12p Zoom https://bit.ly/virtualfriam
to (un)subscribe http://redfish.com/mailman/listinfo/friam_redfish.com
FRIAM-COMIC http://friam-comic.blogspot.com/
archives: 5/2017 thru present https://redfish.com/pipermail/friam_redfish.com/
1/2003 thru 6/2021 http://friam.383.s1.nabble.com/
Re: [FRIAM] Dissecting Recall of Factual Associations in Auto-Regressive Language Models
You know what endlessly fascinates me? The way large language models are like those magic growth pills you see in cartoons. Just add some extra data, give it a stir, and voila! Emergent abilities appear out of thin air. It's like watching a kid turn into a superhero overnight.

I quote verbatim from https://arxiv.org/pdf/2206.07682.pdf

"Scaling up language models has been shown to predictably improve performance and sample efficiency on a wide range of downstream tasks. This paper instead discusses an unpredictable phenomenon that we refer to as emergent abilities of large language models. We consider an ability to be emergent if it is not present in smaller models but is present in larger models. Thus, emergent abilities cannot be predicted simply by extrapolating the performance of smaller models. The existence of such emergence raises the question of whether additional scaling could potentially further expand the range of capabilities of language models."

Emergent properties of large language models are abilities that are not present in smaller models but are present in larger models. They are unpredictable and cannot be explained simply by extrapolating the performance of smaller models. Some examples of emergent abilities are:

- Solving math problems
- Answering factual questions
- Generating summaries
- Writing code
- Playing games
- Translating languages
- Composing music
- Drawing images
- Detecting emotions
- Reasoning logically

On Sun, 7 May 2023 at 18:29, Steve Smith wrote:
> https://arxiv.org/pdf/2304.14767.pdf
[FRIAM] Dissecting Recall of Factual Associations in Auto-Regressive Language Models
https://arxiv.org/pdf/2304.14767.pdf

I am pretty much over my head in this literature, but continue to be fascinated as I watch people who are not try to untangle some explanatory power in their models...

The framing of this analysis as *information flow* rather than *static data/structure* is reminiscent of some very nascent work we *tried* to do 15 years ago, attempting to analyze/understand huge System Dynamics models of Critical Infrastructure joined together/coupled to try to predict the potential for cascading failures through these coupled systems. The representation *as* SD models was natural for this framing, but we made only the tiniest progress IMO in extracting hints of *explanatory* narratives. I was primarily doing visualization on those tasks, but tried to focus on clustering of the Dual Graph/Network to find structure in the *flow* during extreme events rather than in the engineered/designed structure of the network itself.

I know there are others on this list who have worked with complex, dynamic networks (I'm thinking of Frank's colleagues and Causal Discovery in Graphical Models, various projects Glen has alluded to, and a wide variety of problems Stephen has related to me over the years, but I'm sure there are plenty of others)... I'm curious if anyone else is wading in this deep (and more to the point, finding any traction)?

From the paper: