Hey people,

This is not a homework assignment, and there are two problems with Google.
1. It doesn't index recently generated links until some time later.
2. Its results are incomplete.

Google's results:
https://www.google.com/search?newwindow=1&q=site:http://lists.pdxlinux.org/pipermail/plug/+programming+FORTRAN&sa=X&ved=2ahUKEwjj5vvgjJPzAhX1HzQIHWz2Ac4QgwN6BAgBEAE&biw=1920&bih=921

DuckDuckGo's results:
https://duckduckgo.com/?q=site%3Ahttp%3A%2F%2Flists.pdxlinux.org%2Fpipermail%2Fplug%2F+programming+FORTRAN&t=iphone&ia=web

Neither is complete. If there is something like Google that returns
everything and indexes these pages the same day every time, then I won't
have to make a search engine.

Now, there is a thing called Typesense that could work. One option is to
build the search engine over a collection of links or a collection of
files, so that it runs offline and avoids the traffic problem, and then
write a program that updates the files or links with new ones.

Typesense homepage: https://typesense.org

On Fri, Sep 24, 2021 at 8:56 AM Michael Rasmussen <[email protected]> wrote:

> does the original poster want
> * to look things up in the archives
> * figure out how to make such a thing
>
> If the second, how deep does he want to go?
>
> ---
> Michael Rasmussen, Portland Oregon
> Be Appropriate && Follow Your Curiosity
>
> On 2021-09-23 23:51, Tomas Kuchta wrote:
> > +2 for google:
> > a) no additional traffic to the mailing list. This could be significant
> > for trivial search engines.
> > b) speed - google responds in milliseconds
> > c) google's NLP is state of the art. No way <1k people team effort could
> > come close to what you get for free.
> >
> > Just my 2c, -T
> >
> >
> > On Wed, Sep 22, 2021, 13:40 Russell Senior <[email protected]> wrote:
> >
> >> This sounds vaguely like a homework assignment.
> >>
> >> My advice would be to think about what information you'd need to have
> >> to be able to do the things you are describing, and then think about
> >> how to get that information.
> >>
> >> On Wed, Sep 22, 2021 at 8:25 AM Daniel Ortiz <[email protected]> wrote:
> >> >
> >> > Hello everyone,
> >> > May anyone please lead me in making a small search engine for this
> >> > mailing list's archives and another one? All it needs to do is return
> >> > the links that contains the words you put in regardless of space,
> >> > order, location, or capitalization. The words also don't have to be
> >> > all in there. Don't concern yourselves as much with the ranking system
> >> > since that is secondary and could be left out, but a ranking system
> >> > that has ranking from the greatest percentage to lowest percentage of
> >> > words then in the search and ranking from the first to the last word
> >> > in the search (an example of that in action is if the search has
> >> > "programming FORTRAN" then it places first the links with both words
> >> > then the links with the first word then the link with the last word)
> >> > would make it more useful.
> >> > From, Daniel Ortiz
> >> >
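
For reference, the matching and ranking described in the original request
quoted above could be sketched roughly like this in Python. This is only a
minimal sketch under some assumptions: the archive pages have already been
downloaded into a dict that maps each link to its text, and the names
`tokenize`, `search`, and `pages` are made up for illustration. It does not
use Typesense or any other search library.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words, ignoring spacing and punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search(query, pages):
    """Return the links whose text contains at least one query word, best first.

    `pages` is a hypothetical in-memory archive: a dict of link -> page text.
    """
    # Deduplicate the query words while keeping the order they were typed in.
    query_words = list(dict.fromkeys(tokenize(query)))
    if not query_words:
        return []

    results = []
    for link, text in pages.items():
        page_words = set(tokenize(text))
        hits = [word in page_words for word in query_words]
        matched = sum(hits)
        if matched == 0:
            continue  # none of the query words appear on this page
        # Primary key: share of query words found (larger share sorts first).
        # Secondary key: position in the query of the earliest word found.
        results.append((-matched / len(query_words), hits.index(True), link))

    results.sort()
    return [link for _, _, link in results]


if __name__ == "__main__":
    # Made-up sample data, not real archive content.
    sample = {
        "a.html": "I like Fortran.",
        "b.html": "Programming in FORTRAN is fun",
        "c.html": "programming tips",
        "d.html": "nothing relevant",
    }
    print(search("programming FORTRAN", sample))
```

With the sample data, the page with both words comes first, then the page
with only "programming", then the page with only "FORTRAN". Scanning every
page on every search is fine for a small archive; a larger one would
probably want a prebuilt word-to-links index instead.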
