If you would like to host a 100 GB text benchmark, feel free. You'll soon find out that smaller is a lot more practical. I use 1 GB because that is enough for human-level AI: that is about how much language you hear, read, write, and speak in a lifetime. I realize that current AI like Google, Alexa, and GPT-2 are far beyond human level.
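As a rough back-of-envelope on that lifetime figure (every rate below is an assumed round number for illustration, not a measurement), the total lands within a small factor of 1 GB:

# Back-of-envelope for lifetime language exposure, in bytes.
# Every rate here is an assumed round number used only for illustration.
words_per_minute = 150   # assumed speaking/reading rate
hours_per_day = 2        # assumed hours of language heard, read, spoken, written per day
years = 70               # assumed years of language use
bytes_per_word = 6       # roughly five letters plus a space

total_bytes = words_per_minute * 60 * hours_per_day * 365 * years * bytes_per_word
print(total_bytes)       # about 2.8e9 bytes, the same order of magnitude as enwik9's 1 GB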
On Fri, May 28, 2021, 6:05 PM <immortal.discover...@gmail.com> wrote:
> So the Hutter Prize contest rules state only CPU usage, not GPU. I assume you can use CPU cores for parallelization.
>
> Matt's contest (LTCB) allows GPU usage and puts no limit on core count, memory, or time: "Timing information, when available, may vary widely depending on the test machine used."
>
> Both are reasonable, but I think I found a small issue. An algorithm can always train on more data or use more cores; the contests restrict the data to enwik9, fixed at 1 GB, because we don't need to benchmark AI on more data to see whose is better, nor on more cores. That isn't to say one core is all we need and we can simply multiply the usage over 100,000 cores later; we should make sure the algorithm CAN be parallelized, so the rules allow a few cores. The Hutter Prize does this: you can use, say, 4 CPU cores and only enwik9, everything limited in amount. Matt's contest goes further (which is really a good thing, except that it doesn't go all the way, keep reading): there is no core limit. That is not good; it is like using more data. My AI can get a better ratio if it trains on 100 GB of text, since it does better on bigger data, and the same goes for using more cores. Neither really tells us whose AI is better, only who has more cash at home to buy more cores or to train on more data for more time. Matt's allows unlimited cores but a limited data size, so why can't I show my ratio from training on 100 GB?
>
> Now, I DO agree the Hutter Prize should limit cores and data amounts to see whose AI is better, and I also agree we should have an unlimited contest like Matt's that shows how good a predictor can be. But Matt's needs to start allowing unlimited dataset sizes; currently it only allows unlimited core usage. If you have more cash you can use more cores and get a better predictor from having more compute. That is not a level public playing field: only the rich can get the best score, so the contest is no longer really public; it becomes a test of what is possible on Earth. Hence Matt's contest should also allow 100 GB+ of data, so we can show how good a predictor can be. Why unlimited cores but not unlimited data?
>
> We /can/ compare ratios. Notice how Fabrice Bellard scores about 15 MB on 100 MB and about 110 MB on 1 GB. The way I read it, that means that averaged over all the prompts it saw, it predicted the actual answer blind with about 89% accuracy per prompt, and it would be more accurate still on 1 TB of text.
>
> So I'm going to add to my Guide that we should use one contest for finding better AI (the Hutter Prize) and another contest for finding the best implementation of AI (Matt's only half matches this criterion). Simply start adding 10 GB+ benchmarks, Matt; it is easy to take the top algorithms and get some stats right away. Beowulf clusters and supercomputers should also be allowed, with intense parallelization; Korrelan seems to do this, and so do supercomputers and, I think, OpenAI.
>
> My job is obviously the Hutter Prize contest. I could, sooner rather than later, try large GPU usage, but that really isn't my game; it is a rich man's job. I can get richer by doing the Hutter Prize contest (showing my AI is smarter, not that I am rich or have more cores). In that case my AI may appear worse than Bellard's score for now.
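As a minimal sketch of the arithmetic behind the ratios quoted above: the sizes are the ones cited in the message, reading accuracy as one minus the compression ratio is the poster's own interpretation, and bits per input byte is the more conventional way to compare scores across dataset sizes.

# Arithmetic behind the quoted figures. Sizes are the ones cited above;
# "1 - ratio" is the poster's accuracy reading, not a standard metric.
def summarize(compressed_bytes, original_bytes):
    ratio = compressed_bytes / original_bytes
    bits_per_byte = 8 * ratio        # conventional cross-size comparison
    return ratio, bits_per_byte, 1 - ratio

print(summarize(15e6, 100e6))  # 100 MB case: ratio 0.15, 1.2 bits/byte
print(summarize(110e6, 1e9))   # 1 GB case: ratio 0.11, 0.88 bits/byte, 1 - ratio = 0.89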
> Doing any elaborate core usage, or even using a GPU, would mostly be a waste of time for finding smarter AI, just as using more data is.
>
> Then again, Matt's contest is about scaling with more cores, a rich man's job, a test of what we can do on Earth, so maybe it can keep the limited 1 GB dataset size. If his test is really about who has more money, that is already clear at 1 GB, so why use 10 GB? The outcome would only change if you had more cash for more memory and compute; scoring well on 10 GB doesn't change the speed picture any more than scoring on 1 GB does (except as a contest of who has more time to train AI), and the same goes for memory. So it is a contest of who has more cash and who spent longer training their AI, but the latter requires an unlimited dataset size; otherwise Matt's contest is fine as it is. Still, since it is not a "whose AI is smarter" contest but a "who is richer" contest, why would it be any stranger to also see it as a "who spent longer training" contest (and hence use 1 TB of text)?
>
> And if you had a trillion cores and huge RAM or cache but used enwik3, i.e. 1 KB of data, you couldn't really use all your cores, assuming you can look far ahead into the future and are no longer using compression (evaluation) for training. So perhaps using more data, e.g. 1 TB of text, is another way of showing how rich you are.
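On the trillion-cores-with-1-KB point, here is a toy sketch under the simplifying assumption that parallelism comes from splitting the input into independently processed chunks (the chunk size is an arbitrary assumption, not any contest's rule); it only illustrates that the usable core count is capped by the amount of data.

# Toy model: the cores you can keep busy are bounded by the number of chunks.
# chunk_bytes is an arbitrary assumed work-unit size, not a real rule.
def usable_cores(data_bytes, cores, chunk_bytes=64):
    chunks = max(1, data_bytes // chunk_bytes)
    return min(cores, chunks)

print(usable_cores(1_000, 10**12))    # enwik3-scale input: only about 15 cores do useful work
print(usable_cores(10**12, 10**12))   # 1 TB of text: orders of magnitude more cores can help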