Re: [PLUG] Moving 15 GB ... in 1970

2023-11-17 Thread Russell Senior
On Fri, Nov 17, 2023 at 7:02 PM Keith Lofstrom wrote: > > On Fri, Nov 17, 2023 at 08:26:21AM -0800, Rich Shepard wrote: > > > I need to download ~15G of data from a web site. Using a PLUG mail list > > Apropos of not much, when I first got on this crazy > [...] I will "soon" install 100/100 Mbps

Re: [PLUG] Using wget to download all files from a web site (2)

2023-11-17 Thread Keith Lofstrom
On Fri, Nov 17, 2023 at 12:43:29PM -0800, Keith Lofstrom wrote:
...
> I "wget-ed" a website, and was soon contacted by a
> panicked/angry sysadmin watching their website brought
> to a crawl because their 5 Mbps upload bandwidth was
> clobbered for hours by my scrape of their site.

My bad. When y
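
For a rough sense of scale, the arithmetic behind "clobbered for hours" can be sketched in shell. The 15 GB figure is Rich's download size and the 5 Mbps figure is the uplink from Keith's anecdote about a different site; pairing them is an assumption for illustration, since nobody in the thread knows the Portland Harbor site's real upload bandwidth. Decimal units and zero protocol overhead are also assumed.

```shell
# Rough transfer-time estimate: 15 GB served through a 5 Mbps uplink.
# Assumptions: decimal units (1 GB = 8000 megabits), no protocol overhead,
# and the whole uplink dedicated to this one download.
size_gb=15
rate_mbps=5
seconds=$(( size_gb * 8 * 1000 / rate_mbps ))   # GB -> megabits, then divide by Mbps
hours=$(( seconds / 3600 ))
minutes=$(( (seconds % 3600) / 60 ))
echo "${seconds} s = ${hours} h ${minutes} min"   # prints "24000 s = 6 h 40 min"
```

Under those assumptions an unthrottled scrape would saturate such a link for most of a working day.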

[PLUG] Moving 15 GB ... in 1970

2023-11-17 Thread Keith Lofstrom
> On Fri, Nov 17, 2023 at 08:26:21AM -0800, Rich Shepard wrote: > > I need to download ~15G of data from a web site. Using a PLUG mail list Apropos of not much, when I first got on this crazy internet merry-go-round, the nearest host was UCBVAX in Berkeley, and we connected with modems. I connec

Re: [PLUG] email services supporting IMAP

2023-11-17 Thread Kevin Williams
Correction: $20/yr for Runbox, not $20/mo. On Fri, Nov 17, 2023, at 17:32, Kevin Williams wrote: > Hi Galen, > > I myself have been on this journey to migrate my internet accounts registered > using my Gmail address to my own domain, and use multiple aliases in the form > of serv...@mydomain.tl

Re: [PLUG] email services supporting IMAP

2023-11-17 Thread Kevin Williams
Hi Galen, I myself have been on this journey to migrate my internet accounts registered using my Gmail address to my own domain, and use multiple aliases in the form of serv...@mydomain.tld. Over the last year and a half, I have moved about 90 accounts from Gmail. Some sites allow self-service

[PLUG] email services supporting IMAP

2023-11-17 Thread Galen Seitz
Hi, A smart, but non-sysadmin, non-linux-using friend asks: "Hey I’ve been interested in getting off Gmail and switching to a mail service I pay for. And then using it with IMAP on my various devices. Do you have any knowledge about other services besides Gmail, yahoo, etc?" I'm pretty sure t

Re: [PLUG] Using wget to download all files from a web site

2023-11-17 Thread Rich Shepard
On Fri, 17 Nov 2023, Russell Senior wrote: Fwiw, I played a little bit with some approaches, unsuccessfully. But, the problem might yield under a little more pressure. The problem I eventually encountered and gave up at was that: a) the structure of their site isn't consistent; and b) there are

Re: [PLUG] Using wget to download all files from a web site

2023-11-17 Thread Rich Shepard
On Fri, 17 Nov 2023, Bill Barry wrote:
> Limiting how deep to recurse is helpful. You may want just the page you start with and one level down from that.
> --level=depth
> --level=1 would be a good place to start.

Bill,

Good idea.

Thanks,
Rich
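
A minimal sketch of Bill's suggestion, assuming the start URL from Rich's original post. It only assembles and prints the command, so nothing is fetched until the last line is uncommented. The --no-parent flag is an addition here, not part of Bill's advice; it keeps wget from climbing above the starting directory.

```shell
# Depth-limited recursive fetch, built as an argument list (dry run).
# --level=1 follows links one hop from the start page and no further.
# --no-parent is an extra assumption, not something Bill suggested.
set -- wget --recursive --level=1 --no-parent http://ph-public-data.com/
echo "$@"
# To actually run it, uncomment the next line:
# "$@"
```

If one level turns out to be too shallow, raising the number a step at a time keeps each run small enough to inspect.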

Re: [PLUG] Using wget to download all files from a web site

2023-11-17 Thread Russell Senior
Fwiw, I played a little bit with some approaches, unsuccessfully. But, the problem might yield under a little more pressure. The problem I eventually encountered and gave up at was that: a) the structure of their site isn't consistent; and b) there are links with embedded spaces or something. This

Re: [PLUG] Using wget to download all files from a web site

2023-11-17 Thread Bill Barry
On Fri, Nov 17, 2023 at 3:17 PM Rich Shepard wrote: > > On Fri, 17 Nov 2023, Michael Barnes wrote: > > > I have used this command string successfully in the past to download > > complete websites. > > > > $ wget --recursive --no-clobber --page-requisites > > --html-extension --

Re: [PLUG] Using wget to download all files from a web site

2023-11-17 Thread Rich Shepard
On Fri, 17 Nov 2023, Michael Barnes wrote: I have used this command string successfully in the past to download complete websites. $ wget --recursive --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --domains website.com

Re: [PLUG] Using wget to download all files from a web site

2023-11-17 Thread Rich Shepard
On Fri, 17 Nov 2023, Keith Lofstrom wrote: A related question is "how much will the Portland Harbor Superfund Site need to pay to upload 15 GB to you? How much upload bandwidth do they have? Keith, I don't think anyone knows. I "wget-ed" a website, and was soon contacted by a panicked/angry

Re: [PLUG] Using wget to download all files from a web site

2023-11-17 Thread Keith Lofstrom
On Fri, Nov 17, 2023 at 08:26:21AM -0800, Rich Shepard wrote:
> I need to download ~15G of data from a web site. Using a PLUG mail list
> thread from 2008 I tried this syntax:
> wget -r --accept *.* http://ph-public-data.com/

A related question is "how much will the Portland Harbor Superfund Site

Re: [PLUG] Using wget to download all files from a web site

2023-11-17 Thread Rich Shepard
On Fri, 17 Nov 2023, Michael Barnes wrote: I have used this command string successfully in the past to download complete websites. $ wget --recursive --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --domains website.com

Re: [PLUG] Using wget to download all files from a web site

2023-11-17 Thread Rich Shepard
On Fri, 17 Nov 2023, Michael Ewan wrote: You may be getting caught by robots.txt, try setting the user agent header, i.e. -U agent-string Michael, The wget man page doesn't inform me how to identify the agent-string. All it says is: --user-agent=agent-string Identify as agent-stri
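
As far as the man page wording goes, agent-string is simply free-form text that wget sends in the User-Agent header, so any quoted string works. A hedged sketch follows; the agent value is a made-up example, and the command is only printed, not run.

```shell
# agent-string is free-form text; this particular value is hypothetical.
agent='Mozilla/5.0 (X11; Linux x86_64)'
set -- wget --recursive --level=1 --user-agent="$agent" http://ph-public-data.com/
printf '%s\n' "$@"
# To actually run it, uncomment the next line:
# "$@"
```

Note that changing the user agent does not by itself change how wget treats robots.txt during recursive retrieval. That is governed by the separate `robots` setting, which can be switched off with `-e robots=off`, though doing so may go against the site operator's wishes.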

Re: [PLUG] Using wget to download all files from a web site

2023-11-17 Thread Michael Barnes
I have used this command string successfully in the past to download complete websites. $ wget --recursive --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --domains website.com --no-parent website.com HTH, Michael
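
Michael's command is easier to read with one flag per line. The sketch below keeps his flags and his website.com placeholder, and adds two throttling options, --wait and --limit-rate, which are not in his post; they are included because of the bandwidth concern raised elsewhere in the thread, and the values are arbitrary. The command is printed rather than executed.

```shell
# Michael's flags, one per line, plus two throttling options added for
# illustration. website.com is the placeholder host from his post.
site=website.com
set -- wget \
    --recursive \
    --no-clobber \
    --page-requisites \
    --html-extension \
    --convert-links \
    --restrict-file-names=windows \
    --domains "$site" \
    --no-parent \
    --wait=1 \
    --limit-rate=500k \
    "$site"
printf '%s ' "$@"; echo
# To actually run it, uncomment the next line:
# "$@"
```

On wget 1.12 and later, --html-extension is the older name for --adjust-extension; the old spelling should still be accepted.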

Re: [PLUG] Using wget to download all files from a web site

2023-11-17 Thread Michael Ewan
You may be getting caught by robots.txt; try setting the user agent header, i.e. -U agent-string

On Fri, Nov 17, 2023 at 8:31 AM Rich Shepard wrote:
> On Fri, 17 Nov 2023, Rich Shepard wrote:
>
> > I need to download ~15G of data from a web site. Using a PLUG mail list
> > thread from 2008 I tri

Re: [PLUG] Using wget to download all files from a web site

2023-11-17 Thread Rich Shepard
On Fri, 17 Nov 2023, Rich Shepard wrote: I need to download ~15G of data from a web site. Using a PLUG mail list thread from 2008 I tried this syntax: wget -r --accept *.* http://ph-public-data.com/ To clarify, I don't think that I want to use the wget -m (mirror) command because I don't think

[PLUG] Using wget to download all files from a web site

2023-11-17 Thread Rich Shepard
I need to download ~15G of data from a web site. Using a PLUG mail list thread from 2008 I tried this syntax: wget -r --accept *.* http://ph-public-data.com/ What was quickly returned is the contents of a ph-public-data.com/ directory: about/ contact/ document/ file/ whatsnew/ What I want ar