Re: [agi] Fast Parsing in support of the coming Intelligent Internet

Matt Mahoney Mon, 18 Mar 2013 18:06:20 -0700

On Mon, Mar 18, 2013 at 4:29 PM, Steve Richfield
<[email protected]> wrote:
> The patent for the details that any competent AI guru could apply to 
> implement REALLY fast parsing, that operates orders of magnitude faster than 
> prior art methods, and use it to make the Internet intelligent, are now 
> embodied in U.S. Patent Application 13/836,678 that is attached to this 
> posting.


Very entertaining, especially the long rambling about how to solve all
the problems of sending targeted spam and how great Dr. Eliza is. I
don't see how any of this is relevant to your (missing) claims. I
guess that the only claim that you actually tested was your new hash
function using floating point arithmetic.

Did you have a patent lawyer look at this before you submitted it? If
so, I would ask for my money back.

As to your hash function, I don't see why this should be any faster
than integer arithmetic. Your function multiplies the previous hash by
pi/4, then adds the next character. To get the final index, you
multiply by a large prime number and truncate to an integer. Of
course, all of this could be done with 32 or 64 bit integer arithmetic
just as fast (or faster), without the problem of rounding errors.
Specifically, expressions like (a+b+c) and many others will give you
different results depending on the order of evaluation, which depends
on which compiler you use, which version, and which optimization
settings you use. It makes a difference whether the computation is
done in the x87 or SSE registers (which depends on compiler options)
because x87 uses 80 bit temporaries but SSE uses 64 bits. They also
handle underflow differently. It may work fine when you test it,
then fail when you update your software and it can't read the old hash
tables anymore because something else changed in the build process.
And no, you do not get overflow errors with unsigned integer
arithmetic. The result just wraps around modulo 2^32 or 2^64, which
for many hash functions is actually what you want.
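
To make the integer point concrete, here is a minimal sketch in Python.
It is not the code from the application. The multiplier 31, the 32-bit
mask and the table size 1021 are illustrative assumptions, and the
function names are hypothetical. It shows a multiply-and-add hash that
stays in 32 bits by truncation, and a three-term sum whose value
depends on how the additions are grouped.

```python
MASK32 = 0xFFFFFFFF  # assumed width: emulate 32-bit unsigned wraparound


def int_hash(text, multiplier=31):
    """Multiply-and-add string hash kept to 32 bits (illustrative only)."""
    h = 0
    for byte in text.encode("utf-8"):
        # Python ints never overflow, so the mask does the truncation
        # that 32-bit unsigned arithmetic performs implicitly.
        h = (h * multiplier + byte) & MASK32
    return h


def table_index(text, table_size=1021):
    """Map a string to a bucket; table_size is a hypothetical choice."""
    return int_hash(text) % table_size


# Floating-point addition is not associative, so the grouping a compiler
# picks can change the low-order bits of a floating-point hash.
a, b, c = 0.1, 0.2, 0.3
left_grouping = (a + b) + c
right_grouping = a + (b + c)

if __name__ == "__main__":
    print(int_hash("parsing"), table_index("parsing"))
    print(left_grouping == right_grouping)  # False
```

The integer version gives the same index on every platform and build,
which is what a hash table stored on disk needs. The floating-point
sums differ only in the last bit, but after multiplying by a large
number and truncating, that can be enough to land in a different
bucket.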

The rest, I guess, is just speculation. Sure, the user could write
lots of rules for parsing English in order to trigger rules for
sending spam. Is your fast hash function (which isn't actually any
faster) and looking up rules based on the least common words first
really going to solve all of the well known NLP problems like
ambiguity and noise and just the enormous complexity of natural
language? Do you really think that your web crawler will run on a
single computer and you are going to tell Google they are doing it all
wrong because they need a 100 petabyte index and a big building with
cooling towers? Do you really think that users will install spyware on
their computers to get around Facebook and Twitter blocking web
crawlers just so they can receive your spam? Do you really think you
are going to intercept email and send spam cleverly disguised to look
like personal replies? Do you really plan to crawl every blog, look
for posts that might be relevant to your spam and post replies? How do
you plan to get the poster's email address? How do you plan to get
past the CAPTCHAs, logins, spam filters, and various other forms of
moderation? Do you think that bloggers have never dealt with spam
before?

BTW, I do appreciate the reference to my AGI proposal.

--
-- Matt Mahoney, [email protected]


