I was able to mirror the site in Google's Colab. Here's a gist with a notebook describing what I did and its output:
https://gist.github.com/rwcitek/8d3035f6d2931d80f0569d3964fa6e28

In the notebook, you can click on the "Open in Colab" button to run the
commands in your own Colab environment.

Regards,
- Robert

On Fri, Nov 17, 2023 at 3:22 PM Rich Shepard <[email protected]> wrote:

> On Fri, 17 Nov 2023, Russell Senior wrote:
>
> > Fwiw, I played a little bit with some approaches, unsuccessfully. But, the
> > problem might yield under a little more pressure. The problem I eventually
> > encountered and gave up at was that: a) the structure of their site isn't
> > consistent; and b) there are links with embedded spaces or something. This
> > *might* be 75% of a solution or it might be a dead end:
>
> Russell,
>
> Yes, there is great inconsistency throughout the site.
>
> > I kind of agree with Keith. Either just ask the site administrator for all
> > the data in one blob or start clicking. I don't have any better solutions.
>
> There won't be any help from the EPA; I'll do one page at a time.
>
> Many thanks for trying to find a working solution.
>
> Regards,
>
> Rich
