Is this inadequate to prevent the random-agent strategy?

>
> Game Complexity:
>
>    - PtB rewards non-random play by favoring agents that detect and
>    exploit patterns in opponents' choices. Random strategies are penalized
>    over time due to predictable health loss.
>
>
>    - Under the token system, pure random play by an agent will strongly
>    tend towards monetary loss since a percentage of the prize pool is
>    pre-allocated to the winning agent.
>
>

On Mon, Dec 9, 2024 at 1:31 PM Matt Mahoney <[email protected]> wrote:

> This is a multi-player variant of the matching pennies game. The optimal
> strategy is to pick randomly.
> https://en.m.wikipedia.org/wiki/Matching_pennies
>
> It is also a proof of Wolpert's theorem: two computers cannot mutually
> predict each other's actions. Imagine a variation of the game where each
> player receives the source code and initial state of their opponent as
> input before the start of the game. Who wins?
>
> Wolpert's theorem is the reason AI is dangerous. We measure intelligence
> by prediction accuracy. If an AI is more intelligent than you, then it can
> predict your actions, but you can't predict (and therefore can't control)
> its actions.
>
> On Mon, Dec 9, 2024, 12:34 PM swkane <[email protected]> wrote:
>
>> I think static dataset benchmarks have their place, but they don't test
>> everything. Perhaps a meta-benchmark could be created that combines
>> static dataset benchmarks with competitions such as PtB run in sandboxed
>> environments. Any scientifically relevant benchmark that is independent
>> of the AGI system's architecture and hardware, and is sufficiently
>> general, is worth looking at.
>>
>> Static dataset benchmarks are the low-hanging fruit, IMO. In addition
>> to the Wikipedia compression benchmark, you could, for example, feed
>> every problem on Kaggle into an AGI system. By 'low-hanging fruit' I
>> don't mean less relevant, just easier to attain. Hence I'm less
>> interested in static dataset benchmarks, at least currently, and more
>> interested in building a dynamic, competitive benchmark system.
>>
>> On Mon, Dec 9, 2024, 06:57 James Bowery <[email protected]> wrote:
>>
>>> I like it, even though it is inferior to lossless compression of
>>> Wikipedia as a standard benchmark.  At least it conveys the central idea of
>>> Solomonoff Induction:  Converging on the algorithm generating one's
>>> observations.
>>>
>>> In particular, I like the multi-agent "theory of mind" angle it takes
>>> which may get people thinking about nuking the social pseudo-sciences with
>>> the Algorithmic Information Criterion for macrosocial model selection.  The
>>> primary thing lacking in this approach, particularly compared to Wikipedia,
>>> is that the utility function of the other agents is a given -- whereas with
>>> Wikipedia, one is required to infer the utility functions of the agents
>>> generating Wikipedia.
>>>
>>> On Sat, Dec 7, 2024 at 11:51 PM <[email protected]> wrote:
>>>
>>>> Pick the Bit and Competitive Computing Platform - Towards a New
>>>> Benchmark for AGI System Performance
>>>> 2024-12-07 Version 0.1.0
>>>> Steven W. Kane
>>>>
>>>> 1. The Pick the Bit Game
>>>> *1.1 Game Overview*
>>>> Pick the Bit (PtB) is a turn-based, multi-agent game (a minimum of
>>>> two agents, with no theoretical upper limit) in which agents compete
>>>> by guessing a binary value, 0 or 1, each round. The goal is to avoid
>>>> picking the bit chosen by the majority of agents. Agents that pick the
>>>> majority bit lose health; when an agent's health drops to 0 or below,
>>>> it 'dies' and is removed from the game. The game continues until only
>>>> one agent remains.
>>>> *1.2 Game Mechanics*
>>>> Health Dynamics:
>>>>
>>>>    - Each agent starts with a fixed amount of health points (HP).
>>>>    - Agents that guess the majority bit lose health points equal to a
>>>>    predetermined loss value.
>>>>    - Agents that guess the minority bit retain their health.
>>>>    - Health loss scales asymptotically in later rounds, increasing the
>>>>    stakes over time; earlier rounds of the game are more random, so a
>>>>    loss there should not incur as much health loss.
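[A minimal sketch of one possible schedule. The saturating shape and the `max_loss`/`ramp` parameters are my assumptions for illustration; the spec only says losses scale asymptotically.]

```python
def health_loss(round_number: int, max_loss: float = 10.0,
                ramp: float = 20.0) -> float:
    """Illustrative health-loss schedule: losses start small while early
    rounds are still mostly random, then rise asymptotically toward
    max_loss as the game progresses."""
    return max_loss * round_number / (round_number + ramp)
```

Under this sketch, round 1 costs about 0.5 HP while round 100 costs about 8.3 HP, approaching but never reaching the 10 HP cap.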
>>>>
>>>> Random Noise Agents:
>>>>
>>>>    - At a minimum, one random agent with infinite health is always
>>>>    present to break ties when only two agents remain.
>>>>    - Additional random agents can be added from the beginning to
>>>>    increase random noise and maintain unpredictability.
>>>>    - These random agents choose their bits pseudorandomly based on a
>>>>    cryptographically secure PRNG with a securely selected seed value.
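[In Python, for instance, a random-noise agent's choice could be drawn as below. Note that the stdlib `secrets` module reads OS entropy directly rather than taking an explicit seed, so the "securely selected seed" detail is abstracted away in this sketch.]

```python
import secrets

def random_agent_bit() -> int:
    # One bit (0 or 1) drawn from a cryptographically secure PRNG.
    return secrets.randbits(1)
```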
>>>>
>>>> Hidden Information:
>>>>
>>>>    - The health levels of other agents and the number of agents
>>>>    choosing each bit are hidden, forcing agents to infer patterns and make
>>>>    strategic guesses.
>>>>
>>>> The only four inputs an agent receives each round are:
>>>>
>>>>    - The current round number.
>>>>    - The majority and minority bits from the previous round.
>>>>    - The agent's own current health level.
>>>>    - The amount of health that will be lost for losing the next round
>>>>    (at a minimum, the full health-loss schedule is also passed to each
>>>>    agent at the beginning of the game).
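[Those four inputs could be bundled as a simple record. The field names and types below are illustrative; the spec defines the content of the inputs, not their encoding.]

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RoundInput:
    """Per-round inputs to an agent (illustrative field names)."""
    round_number: int
    prev_majority_bit: Optional[int]  # None on the first round
    prev_minority_bit: Optional[int]
    own_health: float
    next_round_loss: float  # HP lost for picking the majority bit next round
```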
>>>>
>>>> Incentivizing Monetary Rewards:
>>>>
>>>>    - Each round, agents that survive collect tokens, representing an
>>>>    equal share of the health points lost by the defeated agents.
>>>>    - The total tokens accrued by an agent are not revealed to any
>>>>    agent (including the agent that is assigned the tokens), and confer
>>>>    no advantage in the game.
>>>>    - The tokens an agent holds at the end of the game can be redeemed
>>>>    for monetary rewards by the team that owns the agent.
>>>>    - A percentage of the prize pool is reserved for the game winner,
>>>>    ensuring that strategic play and survival remain paramount.
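[The per-round payout could be sketched as follows, assuming, as the text implies, that each surviving agent gets an equal share of the health points lost by that round's defeated agents.]

```python
def distribute_tokens(survivor_ids: list, defeated_health_lost: float) -> dict:
    """Split the HP lost by this round's defeated agents into equal
    token shares for the survivors. Balances stay hidden from agents."""
    if not survivor_ids:
        return {}
    share = defeated_health_lost / len(survivor_ids)
    return {agent: share for agent in survivor_ids}
```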
>>>>
>>>> Game Complexity:
>>>>
>>>>    - PtB rewards non-random play by favoring agents that detect and
>>>>    exploit patterns in opponents' choices. Random strategies are penalized
>>>>    over time due to predictable health loss.
>>>>    - Under the token system, pure random play by an agent will
>>>>    strongly tend toward monetary loss, since a percentage of the
>>>>    prize pool is pre-allocated to the winning agent.
>>>>
>>>>
>>>> 2. Running PtB on C2P
>>>> *2.1 The Competitive Computing Platform (C2P)*
>>>>
>>>>    - The Competitive Computing Platform (C2P) is an isolated,
>>>>    resource-constrained environment for executing AI-generated agents. It
>>>>    enforces standardization across competitions, ensuring fairness and
>>>>    reproducibility.
>>>>
>>>> *2.2 Agent Constraints*
>>>> WASM WASI Modules:
>>>>
>>>>    - All agents are submitted as WebAssembly (WASM) WASI modules,
>>>>    ensuring portability and security.
>>>>
>>>> Resource Limits:
>>>>
>>>>    - Memory: Limited to 4 GiB.
>>>>    - Fuel: Execution is capped using Wasmtime's fuel feature to ensure
>>>>    computational fairness.
>>>>    - No Networking Access: Agents are fully sandboxed, with no
>>>>    external dependencies and no external learning.
>>>>
>>>> Game State Communication:
>>>>
>>>>    - Agents receive game state updates via shared memory and submit
>>>>    their moves back through the same mechanism. No external communication 
>>>> is
>>>>    permitted, ensuring that all strategies are self-contained.
>>>>
>>>> *2.3 C2P Architecture*
>>>> Broker and Hosts:
>>>>
>>>>    - The game broker orchestrates competitions, communicating game
>>>>    state updates to agent hosts and logging outcomes.
>>>>
>>>> Single-Node Execution:
>>>>
>>>>    - For simplicity, C2P competitions can run on a single node with
>>>>    all components (broker, Kafka instance, WASM modules) co-located.
>>>>
>>>> Turn-Based Execution:
>>>>
>>>>    - Each turn, agents receive the game state and submit their moves
>>>>    asynchronously. The broker processes all moves, calculates health
>>>>    adjustments, and updates the game state for the next round.
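[A single broker turn might look like the sketch below. The tie handling is an assumption on my part (in the full game the infinite-health random agent breaks ties); everything else follows the mechanics described above.]

```python
def play_round(healths: dict, moves: dict, loss: float) -> dict:
    """One broker turn: tally bits, find the majority, deduct `loss` from
    each agent that picked it, and drop agents at 0 HP or below.
    `healths` maps agent id -> HP; `moves` maps agent id -> 0 or 1."""
    ones = sum(moves.values())
    zeros = len(moves) - ones
    if ones == zeros:
        return dict(healths)  # no majority; nobody loses (assumption)
    majority = 1 if ones > zeros else 0
    survivors = {}
    for agent, hp in healths.items():
        if moves[agent] == majority:
            hp -= loss
        if hp > 0:
            survivors[agent] = hp
    return survivors
```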
>>>>
>>>> *2.4 Benchmarking Independence*
>>>>
>>>>    - PtB on C2P can benchmark any AGI system, independent of its
>>>>    architecture. The only requirement is that the AGI system generates a 
>>>> WASM
>>>>    WASI agent for the PtB game.
>>>>    - This architectural independence ensures that C2P provides a level
>>>>    playing field for all AGI systems, allowing researchers and developers 
>>>> to
>>>>    focus on algorithmic sophistication rather than hardware or
>>>>    language-specific implementations.
>>>>
>>>>
>>>> 3. PtB and C2P as a Benchmark for AGI Performance
>>>> *3.1 Benchmarking AGI Through PtB*
>>>> Pick the Bit is designed to test core AGI capabilities:
>>>> Strategic Adaptation:
>>>>
>>>>    - AGI systems must adapt to the shifting meta-game, learning and
>>>>    optimizing strategies with limited feedback.
>>>>
>>>> Pattern Recognition:
>>>>
>>>>    - Detecting and responding to subtle patterns in agent behavior and
>>>>    game state is critical for survival.
>>>>
>>>> Robustness Under Constraints:
>>>>
>>>>    - The WASM WASI sandbox ensures that agent performance is tied
>>>>    solely to its algorithmic sophistication, not hardware advantages.
>>>>
>>>> *3.2 C2P as a Universal Standard*
>>>> Decoupling from Hardware:
>>>>
>>>>    - By requiring agents to run on commodity hardware with
>>>>    standardized constraints, C2P removes externalities, enabling direct
>>>>    comparisons between AGI systems.
>>>>
>>>> Interoperability:
>>>>
>>>>    - WASM WASI ensures agents can be developed in any language that
>>>>    compiles to WASM, making C2P accessible to a wide range of researchers 
>>>> and
>>>>    organizations.
>>>>
>>>> Transparent Competitions:
>>>>
>>>>    - C2P logs all game state updates and agent moves, providing a
>>>>    fully auditable record of each competition.
>>>>
>>>> *3.3 Meta-Learning and AGI Evaluation*
>>>> Dynamic Agent Generation:
>>>>
>>>>    - PtB encourages the use of meta-learning systems that dynamically
>>>>    generate agents tailored to the game environment.
>>>>    - By iteratively refining agents through competitions, AGI systems
>>>>    can demonstrate their ability to generalize, adapt, and innovate.
>>>>
>>>>
>>>> 4. Conclusion
>>>> Pick the Bit (PtB) and the Competitive Computing Platform (C2P)
>>>> together represent a new frontier in AGI benchmarking. PtB's dynamic and
>>>> evolving meta-game challenges agents to excel in adaptability, pattern
>>>> recognition, and strategic thinking, while C2P provides a standardized,
>>>> resource-constrained environment for fair competition. By isolating agent
>>>> performance from hardware advantages and enabling reproducible evaluations,
>>>> PtB and C2P offer a universal platform for AGI research and benchmarking,
>>>> pushing the boundaries of what intelligent systems can achieve. Through
>>>> these competitions, the AI community can foster innovation, collaboration,
>>>> and progress toward truly general intelligence.
>>>>
>>>> Software:
>>>> https://github.com/Competitive-Computing-Network/c2n/tree/main/software
>>>> (proof of concept is a work in progress)
>>>>

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T705ed500a1a7e589-M08242314e9027cf1337a75a3
Delivery options: https://agi.topicbox.com/groups/agi/subscription
