Is this inadequate to prevent the random agent strategy?

> Game Complexity:
>
> - PtB rewards non-random play by favoring agents that detect and
>   exploit patterns in opponents' choices. Random strategies are
>   penalized over time due to predictable health loss.
> - Under the token system, pure random play by an agent will strongly
>   tend toward monetary loss, since a percentage of the prize pool is
>   pre-allocated to the winning agent.
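Matt's two-player claim below is easy to verify numerically: in matching pennies, uniform random play pins the expected payoff at zero no matter what the opponent does, so it cannot be exploited (it can only fail to exploit others). A minimal sketch -- the function name and the payoff convention (+1 for a match, -1 for a mismatch) are my own:

```python
def expected_payoff(p: float, q: float) -> float:
    """Expected payoff to the 'matcher' in matching pennies when the
    matcher plays heads with probability p and the opponent with
    probability q. Matcher gets +1 on a match, -1 on a mismatch."""
    p_match = p * q + (1 - p) * (1 - q)
    return (+1) * p_match + (-1) * (1 - p_match)

# Uniform random play (p = 0.5) yields expected payoff 0 against any q:
for q in (0.0, 0.25, 0.5, 0.9, 1.0):
    assert abs(expected_payoff(0.5, q)) < 1e-12
```

Whether PtB's asymptotic health-loss scaling and winner-reserved prize pool actually break this unexploitability in the multi-agent setting is exactly the question being asked above.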
On Mon, Dec 9, 2024 at 1:31 PM Matt Mahoney <[email protected]> wrote:

> This is a multi-player variant of the matching pennies game. The
> optimal strategy is to pick randomly.
> https://en.m.wikipedia.org/wiki/Matching_pennies
>
> It is also a proof of Wolpert's theorem, that two computers cannot
> mutually predict each other's actions. Imagine a variation of the game
> where each player receives the source code and initial state of their
> opponent as input before the start of the game. Who wins?
>
> Wolpert's theorem is the reason AI is dangerous. We measure
> intelligence by prediction accuracy. If an AI is more intelligent than
> you, then it can predict your actions, but you can't predict (and
> therefore can't control) its actions.
>
> On Mon, Dec 9, 2024, 12:34 PM swkane <[email protected]> wrote:
>
>> I think static dataset benchmarks have their place, but they don't
>> test everything. Perhaps a meta-benchmark could be created that
>> includes both static dataset benchmarks and competitions like PtB in
>> sandboxed environments. Any scientifically relevant benchmark that is
>> independent of the AGI system's architecture and hardware, and is
>> general enough, is worth looking at.
>>
>> Static dataset benchmarks are the low-hanging fruit, IMO. In addition
>> to the Wikipedia compression benchmark, you could feed every problem
>> on Kaggle into an AGI system, just as an example. By 'low-hanging
>> fruit' I don't mean less relevant, just easier to attain. Hence I'm
>> not as interested in static dataset benchmarks, at least currently;
>> I'm more interested in building a dynamic competitive benchmark
>> system.
>>
>> On Mon, Dec 9, 2024, 06:57 James Bowery <[email protected]> wrote:
>>
>>> I like it, even though it is inferior to lossless compression of
>>> Wikipedia as a standard benchmark.
>>> At least it conveys the central idea of Solomonoff Induction:
>>> converging on the algorithm generating one's observations.
>>>
>>> In particular, I like the multi-agent "theory of mind" angle it
>>> takes, which may get people thinking about nuking the social
>>> pseudo-sciences with the Algorithmic Information Criterion for
>>> macrosocial model selection. The primary thing lacking in this
>>> approach, particularly compared to Wikipedia, is that the utility
>>> function of the other agents is a given -- whereas with Wikipedia,
>>> one is required to infer the utility functions of the agents
>>> generating Wikipedia.
>>>
>>> On Sat, Dec 7, 2024 at 11:51 PM <[email protected]> wrote:
>>>
>>>> Pick the Bit and Competitive Computing Platform - Towards a New
>>>> Benchmark for AGI System Performance
>>>> 2024-12-07 Version 0.1.0
>>>> Steven W. Kane
>>>>
>>>> 1. The Pick the Bit Game
>>>> *1.1 Game Overview*
>>>> Pick the Bit (PtB) is a turn-based, multi-agent game (a minimum of
>>>> two agents, but theoretically an unlimited number) in which agents
>>>> compete by guessing a binary value -- either 0 or 1 -- each round.
>>>> The goal is to avoid picking the bit chosen by the majority of
>>>> agents. Agents that pick the majority bit lose health (when an
>>>> agent's health reaches 0 or below, it 'dies' and is removed from
>>>> the game), and the game continues until only one agent remains.
>>>> *1.2 Game Mechanics*
>>>> Health Dynamics:
>>>>
>>>> - Each agent starts with a fixed amount of health points (HP).
>>>> - Agents that guess the majority bit lose health points equal to a
>>>>   predetermined loss value.
>>>> - Agents that guess the minority bit retain their health.
>>>> - Health loss scales asymptotically in later rounds, increasing the
>>>>   stakes over time. The reason is that earlier rounds of the game
>>>>   are more random, so a loss there should not incur as much health
>>>>   loss.
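The Health Dynamics rules above can be sketched in a few lines of plain Python. The names, the tie rule, and the example loss schedule are my own illustration -- the proposal only says that health loss "scales asymptotically":

```python
from collections import Counter

def resolve_round(picks: dict, health: dict, loss: int):
    """One PtB round. picks: agent_id -> chosen bit (0 or 1);
    health: agent_id -> current HP (mutated in place);
    loss: HP deducted from agents that picked the majority bit.
    Returns (majority_bit, minority_bit, survivors)."""
    counts = Counter(picks.values())
    # Ties are broken by the ever-present random agent in the real game;
    # here we arbitrarily treat 0 as the majority on an exact tie.
    majority_bit = 1 if counts[1] > counts[0] else 0
    minority_bit = 1 - majority_bit
    for agent, bit in picks.items():
        if bit == majority_bit:
            health[agent] -= loss
    # Agents at 0 HP or below 'die' and are removed from the game.
    survivors = {a: hp for a, hp in health.items() if hp > 0}
    return majority_bit, minority_bit, survivors

# One possible asymptotic loss schedule (illustrative only): the
# per-round loss rises from 0 toward MAX_LOSS as rounds progress.
MAX_LOSS, HALFWAY_ROUND = 50, 20
def loss_for_round(r: int) -> int:
    return round(MAX_LOSS * r / (r + HALFWAY_ROUND))
```

For example, with `picks = {"a": 1, "b": 1, "c": 0}` and `loss = 40`, agents a and b picked the majority bit and each lose 40 HP, while c is untouched.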
>>>>
>>>> Random Noise Agents:
>>>>
>>>> - At a minimum, one random agent with infinite health is always
>>>>   present to enable tie breaks when only two agents remain.
>>>> - Additional random agents can be added from the beginning to
>>>>   increase noise and maintain unpredictability.
>>>> - Random agents choose their bits using a cryptographically secure
>>>>   PRNG with a securely selected seed value.
>>>>
>>>> Hidden Information:
>>>>
>>>> - The health levels of other agents and the number of agents
>>>>   choosing each bit are hidden, forcing agents to infer patterns
>>>>   and make strategic guesses.
>>>>
>>>> The only four inputs an agent receives each round are:
>>>>
>>>> - The current round number.
>>>> - The majority and minority bits from the previous round.
>>>> - The agent's own current health level.
>>>> - The amount of health that will be lost for a loss in the next
>>>>   round (the full health loss schedule is also passed to the agent
>>>>   at the beginning of the game, at a minimum).
>>>>
>>>> Incentivizing Monetary Rewards:
>>>>
>>>> - Each round, surviving agents collect tokens representing an equal
>>>>   share of the health points lost by the defeated agents.
>>>> - The total tokens accrued by an agent are not revealed to any
>>>>   agent (including the agent they are assigned to), and confer no
>>>>   advantage in the game.
>>>> - The tokens an agent holds at the end of the game can be redeemed
>>>>   for monetary rewards by the team that owns the agent.
>>>> - A percentage of the prize pool is reserved for the game winner,
>>>>   ensuring that strategic play and survival remain paramount.
>>>>
>>>> Game Complexity:
>>>>
>>>> - PtB rewards non-random play by favoring agents that detect and
>>>>   exploit patterns in opponents' choices.
>>>>   Random strategies are penalized over time due to predictable
>>>>   health loss.
>>>> - Under the token system, pure random play by an agent will
>>>>   strongly tend toward monetary loss, since a percentage of the
>>>>   prize pool is pre-allocated to the winning agent.
>>>>
>>>> 2. Running PtB on C2P
>>>> *2.1 The Competitive Computing Platform (C2P)*
>>>>
>>>> - The Competitive Computing Platform (C2P) is an isolated,
>>>>   resource-constrained environment for executing AI-generated
>>>>   agents. It enforces standardization across competitions, ensuring
>>>>   fairness and reproducibility.
>>>>
>>>> *2.2 Agent Constraints*
>>>> WASM WASI Modules:
>>>>
>>>> - All agents are submitted as WebAssembly (WASM) WASI modules,
>>>>   ensuring portability and security.
>>>>
>>>> Resource Limits:
>>>>
>>>> - Memory: limited to 4 GiB.
>>>> - Fuel: execution is capped using Wasmtime's fuel feature to ensure
>>>>   computational fairness.
>>>> - No networking access: agents are entirely sandboxed, removing
>>>>   external dependencies and external learning.
>>>>
>>>> Game State Communication:
>>>>
>>>> - Agents receive game state updates via shared memory and submit
>>>>   their moves back through the same mechanism. No external
>>>>   communication is permitted, ensuring that all strategies are
>>>>   self-contained.
>>>>
>>>> *2.3 C2P Architecture*
>>>> Broker and Hosts:
>>>>
>>>> - The game broker orchestrates competitions, communicating game
>>>>   state updates to agent hosts and logging outcomes.
>>>>
>>>> Single-Node Execution:
>>>>
>>>> - For simplicity, C2P competitions can run on a single node with
>>>>   all components (broker, Kafka instance, WASM modules) co-located.
>>>>
>>>> Turn-Based Execution:
>>>>
>>>> - Each turn, agents receive the game state and submit their moves
>>>>   asynchronously. The broker processes all moves, calculates health
>>>>   adjustments, and updates the game state for the next round.
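The per-round handoff described above (the four agent inputs, exchanged via shared memory) might look like the following sketch. The byte layout is purely hypothetical -- the proposal does not specify C2P's shared-memory format -- and the baseline agent simply draws its bit from a cryptographically secure source, as the Random Noise Agents section requires:

```python
import secrets
import struct

# Hypothetical fixed-width layout for the per-round state the broker
# writes into an agent's shared memory (field order and widths are my
# own illustration, not a C2P specification):
#   round number (u32), previous majority bit (u8), previous minority
#   bit (u8), own health (i32), health loss for losing next round (u32)
STATE_FMT = "<IBBiI"

def pack_state(round_no, prev_majority, prev_minority, own_hp, next_loss):
    """Broker side: serialize one round's inputs for an agent."""
    return struct.pack(STATE_FMT, round_no, prev_majority, prev_minority,
                       own_hp, next_loss)

def unpack_state(buf):
    """Agent side: parse the broker's frame back into a tuple."""
    return struct.unpack(STATE_FMT, buf)

def random_noise_agent(buf) -> int:
    """Baseline random-noise agent: parses the state but ignores it,
    picking its bit from a cryptographically secure source."""
    unpack_state(buf)  # validate the frame; the choice is state-independent
    return secrets.randbelow(2)
```

A real competitor would replace `random_noise_agent` with logic that conditions on the round number, the previous majority bit, and the published loss schedule.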
>>>>
>>>> *2.4 Benchmarking Independence*
>>>>
>>>> - PtB on C2P can benchmark any AGI system, independent of its
>>>>   architecture. The only requirement is that the AGI system
>>>>   generates a WASM WASI agent for the PtB game.
>>>> - This architectural independence ensures that C2P provides a level
>>>>   playing field for all AGI systems, allowing researchers and
>>>>   developers to focus on algorithmic sophistication rather than
>>>>   hardware or language-specific implementations.
>>>>
>>>> 3. PtB and C2P as a Benchmark for AGI Performance
>>>> *3.1 Benchmarking AGI Through PtB*
>>>> Pick the Bit is designed to test core AGI capabilities:
>>>> Strategic Adaptation:
>>>>
>>>> - AGI systems must adapt to the shifting meta-game, learning and
>>>>   optimizing strategies with limited feedback.
>>>>
>>>> Pattern Recognition:
>>>>
>>>> - Detecting and responding to subtle patterns in agent behavior and
>>>>   game state is critical for survival.
>>>>
>>>> Robustness Under Constraints:
>>>>
>>>> - The WASM WASI sandbox ensures that agent performance is tied
>>>>   solely to its algorithmic sophistication, not hardware
>>>>   advantages.
>>>>
>>>> *3.2 C2P as a Universal Standard*
>>>> Decoupling from Hardware:
>>>>
>>>> - By requiring agents to run on commodity hardware with
>>>>   standardized constraints, C2P removes externalities, enabling
>>>>   direct comparisons between AGI systems.
>>>>
>>>> Interoperability:
>>>>
>>>> - WASM WASI ensures agents can be developed in any language that
>>>>   compiles to WASM, making C2P accessible to a wide range of
>>>>   researchers and organizations.
>>>>
>>>> Transparent Competitions:
>>>>
>>>> - C2P logs all game state updates and agent moves, providing a
>>>>   fully auditable record of each competition.
>>>>
>>>> *3.3 Meta-Learning and AGI Evaluation*
>>>> Dynamic Agent Generation:
>>>>
>>>> - PtB encourages the use of meta-learning systems that dynamically
>>>>   generate agents tailored to the game environment.
>>>> - By iteratively refining agents through competitions, AGI systems
>>>>   can demonstrate their ability to generalize, adapt, and innovate.
>>>>
>>>> 4. Conclusion
>>>> Pick the Bit (PtB) and the Competitive Computing Platform (C2P)
>>>> together represent a new frontier in AGI benchmarking. PtB's
>>>> dynamic and evolving meta-game challenges agents to excel in
>>>> adaptability, pattern recognition, and strategic thinking, while
>>>> C2P provides a standardized, resource-constrained environment for
>>>> fair competition. By isolating agent performance from hardware
>>>> advantages and enabling reproducible evaluations, PtB and C2P offer
>>>> a universal platform for AGI research and benchmarking, pushing the
>>>> boundaries of what intelligent systems can achieve. Through these
>>>> competitions, the AI community can foster innovation,
>>>> collaboration, and progress toward truly general intelligence.
>>>>
>>>> Software:
>>>> https://github.com/Competitive-Computing-Network/c2n/tree/main/software
>>>> (proof of concept is a work in progress)

> *Artificial General Intelligence List* <https://agi.topicbox.com/latest>
> Permalink:
> https://agi.topicbox.com/groups/agi/T705ed500a1a7e589-M6a4fd614a298864d3e6ca62d

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T705ed500a1a7e589-M08242314e9027cf1337a75a3
Delivery options: https://agi.topicbox.com/groups/agi/subscription
