Re: Websearch: ranking recent articles higher (was: Bandwidth-hungry services burden the internet)

2020-06-05 Thread Dmitry Alexandrov
[Please note: Something happened with your MUA, and your letter fell off
the thread.]

Akira Urushibata  wrote:
> On 28 May 2020 Dmitry Alexandrov wrote:
>> "Kaz Kylheku (gnu-misc-discuss)" <936-846-2...@kylheku.com> wrote:
>>> It is fairly well-known that Google ranks newer material above older 
>>> material.  Historic areas of the web are basically in a black hole as far 
>>> as the Google search is concerned.
>>>
>>> And since many people reach for the Google search engine without even 
>>> thinking there might be alternatives, those areas of the web basically 
>>> don't exist.
>>
>> That is, there are some websearch providers that do not rank new and updated 
>> articles higher?  Why don't they, I wonder?  It looks like a pretty sane 
>> choice.
>
> Other conditions being equal, a websearch will rank a newer document above an 
> older one.  But the other conditions are never equal.

Yes, yes, sure.  My question was rather about those ‘alternatives’ mentioned
by @936-846-2...@kylheku.com that treat dusty areas of the web better.




Re: Websearch: ranking recent articles higher (was: Bandwidth-hungry services burden the internet)

2020-05-29 Thread Dmitry Alexandrov
"Kaz Kylheku (gnu-misc-discuss)" <936-846-2...@kylheku.com> wrote:
> It is fairly well-known that Google ranks newer material above older 
> material.  Historic areas of the web are basically in a black hole as far as 
> the Google search is concerned.
>
> And since many people reach for the Google search engine without even 
> thinking there might be alternatives, those areas of the web basically don't 
> exist.

That is, there are some websearch providers that do not rank new and updated 
articles higher?  Why don't they, I wonder?  It looks like a pretty sane 
choice.




Re: Bandwidth-hungry services burden the internet

2020-05-28 Thread Akira Urushibata
On May 2020 14:28:16 Kaz Kylheku wrote:

> > > Pages that Google had ranked top in search result lists last year
> > > are for some reason gone when the same search is conducted.
> >
> > This seems like a different issue though.  Google is not your friend,
> > and you should not trust them.

> It is fairly well-known that Google ranks newer material above older
> material.  Historic areas of the web are basically in a black hole as
> far as the Google search is concerned.

If that were the case, the page would merely be demoted in the list.
I've seen one page disappear from the ranking list altogether.

How far a page is demoted depends on how much newer material has become
available.  If there are many new web pages and Google thinks that they
are better, the once-celebrated page drops far down the list.  If there
are few or no new web pages which fit the given search phrase, the
ranking should stay stable.
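
To make the point concrete, here is a toy sketch in Python.  It assumes
a made-up scoring rule (query relevance minus a fixed penalty per year
of age); how Google actually ranks pages is not public, and every name
and number below is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    name: str
    relevance: float  # hypothetical match quality for the query, 0..1
    year: int         # year the page was published or last updated


def score(page, now=2020, recency_weight=0.05):
    """Toy score: relevance minus a small penalty per year of age."""
    age = max(0, now - page.year)
    return page.relevance - recency_weight * age


def rank_of(target, pages):
    """1-based position of `target` when pages are sorted by score."""
    ordered = sorted(pages, key=score, reverse=True)
    return ordered.index(target) + 1


# An old but highly relevant page, and one weak recent competitor.
old = Page("celebrated-2006-article", relevance=0.9, year=2006)
few_rivals = [old, Page("weak-2019-post", relevance=0.1, year=2019)]

# The same list after five decent new pages appear for the same query.
many_rivals = few_rivals + [
    Page(f"new-{i}", relevance=0.6, year=2020) for i in range(5)
]

print(rank_of(old, few_rivals))   # 1: nothing newer outscores it
print(rank_of(old, many_rivals))  # 6: pushed down by the five new pages
```

Under this rule the old page keeps its place while it has few rivals,
and sinks only once several fresher pages outscore it.  It is demoted,
not removed.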

Though it is true that Google prefers newer articles, saying that it
ignores everything more than a few years old is certainly an
exaggeration.  Try searching for information on an older version of
software such as "gcc 2.96", "Pbmplus" or "Debian Potato" with Google
and you should see articles that are more than a decade old.





Re: Bandwidth-hungry services burden the internet

2020-05-27 Thread Kaz Kylheku (gnu-misc-discuss)

On 2020-05-26 22:33, a...@gnu.org wrote:

> > I have been through some strange experiences recently.  Certain web
> > pages take seconds to load.  In some instances the communication
> > fails with a time-out.
>
> This sounds like an issue with your ISP -- and not a general issue.


Could be an issue with the ISP of those websites, or something else,
like a reverse-proxy, if they use one (e.g. Cloudflare). If you're
originating from a network which the reverse-proxy thinks is attacking
the client website, it will throttle you.

In that case, you will have a problem with multiple websites that use
the same reverse-proxy provider.


> > Pages that Google had ranked top in search result lists last year
> > are for some reason gone when the same search is conducted.
>
> This seems like a different issue though.  Google is not your friend,
> and you should not trust them.


It is fairly well-known that Google ranks newer material above older
material.  Historic areas of the web are basically in a black hole as
far as the Google search is concerned.

And since many people reach for the Google search engine without
even thinking there might be alternatives, those areas of the web
basically don't exist.

Except, of course, old stuff that is manipulated by SEO into fooling
Google into thinking that it's new. You know: a junk blog article
written in 2006, but somehow appearing in a page updated in May 2020 ...

It's really infuriating and outrageous, isn't it?





Re: Bandwidth-hungry services burden the internet

2020-05-26 Thread Alfred M. Szmidt
   I have been through some strange experiences recently.  Certain web
   pages take seconds to load.  In some instances the communication
   fails with a time-out.

This sounds like an issue with your ISP -- and not a general issue.

   Pages that Google had ranked top in search result lists last year
   are for some reason gone when the same search is conducted.

This seems like a different issue though.  Google is not your friend,
and you should not trust them.



Bandwidth-hungry services burden the internet

2020-05-26 Thread Akira Urushibata
Pretty much from our first encounter, Richard Stallman has been asking
me to translate free software and hacker ethos terminology into
Japanese.  I have found some interesting solutions in the classics.

To explain what a hacker is, you first have to understand it yourself.
Hacker is a noun derived from the verb "hack."  Hackers are good at dividing
problems into smaller parts.  They are also good at producing terse
code.  In the past, when memory was expensive and small, this was an
essential skill for programmers attempting anything substantial.
Nowadays, with DRAM measured in gigabytes and communication speed
in gigabits per second, this is not true anymore.

Or so it seemed, until the coronavirus came along.

I have been through some strange experiences recently.  Certain web
pages take seconds to load.  In some instances the communication fails
with a time-out.  Pages that Google had ranked top in search result
lists last year are for some reason gone when the same search is
conducted.

I have a theory of what is going on at Google.  The traffic from live
video conferencing and video content downloads is putting a heavy load
on servers and exchanges.  As a result, page fetch requests
occasionally fail and the relevant content is dropped from the search
results.

Perhaps some fellow list subscribers are going through similar
experiences.  I advise any person in charge of maintaining a web server
to take a look.

I am thinking of how to inform ordinary computer users about this
issue.  I welcome your opinions.