On Sun, Jul 23, 2023, 8:16 PM James Bowery <jabow...@gmail.com> wrote:

> https://aclanthology.org/2023.findings-acl.426.pdf
>

Yes, and I think the only reason gzip didn't outperform the other text
classifiers on the largest data sets is that it only finds matching strings
over a 32 KB window.
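
A minimal sketch of the window effect, assuming Python's standard gzip
module (DEFLATE, 32 KB history window) stands in for the compressor and
normalized compression distance (NCD) is the similarity measure. The helper
names and block sizes are illustrative choices, not taken from the paper. A
repeated block is only recognized as a repeat when the two copies are less
than 32 KB apart:

```python
import gzip
import random


def clen(data: bytes) -> int:
    """Length in bytes of the gzip-compressed data."""
    return len(gzip.compress(data))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 = similar, near 1 = unrelated."""
    cx, cy, cxy = clen(x), clen(y), clen(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def random_block(n: int, seed: int) -> bytes:
    """Deterministic incompressible bytes (hypothetical stand-in for a document)."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(n))


near = random_block(16_000, seed=1)  # copy starts 16,000 bytes back: inside the window
far = random_block(40_000, seed=2)   # copy starts 40,000 bytes back: outside the window

ncd_near = ncd(near, near)
ncd_far = ncd(far, far)
print(f"16 KB block vs itself: NCD = {ncd_near:.3f}")
print(f"40 KB block vs itself: NCD = {ncd_far:.3f}")
```

Both comparisons pair a block with an identical copy, yet the first NCD
comes out close to 0 and the second close to 1, because the 40 KB copy lies
beyond the reach of the matcher.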

Of course, text classification is mostly adversarial, used for spam
filtering and censorship. String matching can be easily defeated by
deliberately misspelling words like "v1agra". But smarter algorithms that
understand text and images have solved the problem. Your inbox is no longer
full of spam and viruses. Your political posts just get down-ranked instead
of banned.
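
A toy illustration of the misspelling problem, assuming a hypothetical
one-word blocklist and a hand-picked substitution table. The normalizing
filter is still plain string matching, not one of the smarter algorithms
mentioned above, and it is just as easy to evade with a different trick:

```python
BLOCKLIST = {"viagra"}  # hypothetical blocklist for illustration

# Hand-picked look-alike substitutions (an assumption, far from complete).
LEET = str.maketrans({"1": "i", "0": "o", "3": "e", "4": "a",
                      "5": "s", "@": "a", "$": "s"})


def exact_filter(text: str) -> bool:
    """Flag text if any word, stripped of punctuation, is on the blocklist."""
    return any(w.strip(".,!?") in BLOCKLIST for w in text.lower().split())


def normalized_filter(text: str) -> bool:
    """Undo common character substitutions, then apply the exact filter."""
    return exact_filter(text.translate(LEET))


msg = "Buy v1agra now!"
print(exact_filter(msg), normalized_filter(msg))
```

The exact filter misses the message and the normalizing one catches it, but
a spelling like "v.i.a.g.r.a" slips past both.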

> For some reason this reminds me of Matt's distributed competitive routing
> AGI proposal <https://www.mattmahoney.net/agi2.html>:
>

Ah, yes, in 2008, before smartphones, social media, blockchain, and the
Arab Spring ushered in the demand for internet censorship. A time when we
were young and idealistic and thought that our ideas about AGI would change
the world. Now we are older and watching big tech solve the problems we
failed to solve, and maybe not liking those changes. Instead of the
internet being a tool for the people to control the government, it is
becoming the other way around.

Well, I did warn you that AGI would be expensive. It would require a global
effort and a funding model equal to decades of world GDP, the value of all
the human labor that would be automated. My motivation was to divide the
workload (hardware, software, knowledge collection) among millions of
specialist peers. The goal was to do this in a hostile environment. Like
blockchain, messages could not be deleted, edited, forged, or refuted, but
were protected by pairwise private key digital signatures and a reputation
network instead of proof of work. I didn't give a thought to evading
censorship because at the time it didn't exist. I now realize the dangers
of centralized control.

I suppose the technical reason for failure is that a distributed search
index takes O(n log n) storage and work, but a centralized index like
Google is O(n) and for a long time did everything we wanted. Distributed
search depends on messages being compressible, so it wouldn't work for
encrypted data like Bitcoin transactions. But the bigger reason is that
people won't use a social network unless a lot of people already use it. If
Google+ was a failure, what chance do I have of kick-starting one?
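
On the compressibility point: well-encrypted data is designed to look like
random bytes, so a general-purpose compressor finds nothing to exploit. A
small sketch, assuming zlib as the compressor and os.urandom output as a
stand-in for ciphertext (both are illustrative assumptions, not part of the
proposal):

```python
import os
import zlib


def ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower = more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)


# Repetitive plain text: highly compressible.
text = b"the quick brown fox jumps over the lazy dog. " * 200

# Random bytes of the same length, standing in for encrypted data (assumption).
noise = os.urandom(len(text))

text_ratio = ratio(text)
noise_ratio = ratio(noise)
print(f"plain text:   {text_ratio:.3f}")
print(f"random bytes: {noise_ratio:.3f}")
```

The text shrinks to a small fraction of its size, while the random bytes do
not shrink at all.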

My biggest mistake was assuming that people would be willing to make all
their personal data public, which is what you need to do to distribute all
human knowledge to billions of peers. This would eliminate identity theft
and the need for passwords because everyone could instantly see what you
are doing if you claim to be someone else. You couldn't secretly stalk
someone because your queries would be public, just like the responses.
There would be no such thing as a data breach because nothing is secret.

Of course I was wrong. We only share all our data with a few big companies
like Google, Amazon, MasterCard, etc. And it is getting worse. Social
security numbers and birthdays were never meant to be secret. Facebook
built a face recognition database with a billion names and a trillion
labeled images, and then deleted it. It is now effectively illegal to post
someone's picture without their permission.

Maybe I am being pessimistic about P2P networks. Freenet and Tor are mostly
unusable because they lack search engines. Napster was killed by shutting
down its centralized search service. USENET was O(n^2) and disappeared.
Mastodon lacks a funding model. Bitcoin uses 1% of the world's electricity.
Ethereum with proof of stake and support for arbitrary messages is still
O(n^2), making transactions unaffordable for widespread use. I'm afraid that
censorship is here to stay, in part because we want it as long as it's
applied to other voices we want silenced.


------------------------------------------
Artificial General Intelligence List: AGI
Permalink: 
https://agi.topicbox.com/groups/agi/T4dbad1e5c8d7f685-Mcc76ed80bc4b3196ee78c085
Delivery options: https://agi.topicbox.com/groups/agi/subscription
