RE: Yahoo Groups going away
I have been reasonably successful, after making a few mods, in backing up Yahoo groups using a clone of "Yahoo Group Archiver", which broadly works (but see below) and doesn't need any scraping tools. I made a few tweaks, concentrating on speed rather than on documenting the code, and the current mess is here:

https://1drv.ms/u/s!Ag4BJfE5B3onleMG29vMs5czmPcoTw?e=TrqawF

The script yahoo.py is supposed to back things up to files. I couldn't get the user/password login part to work, but noted that the scripts also support putting the cookies on the command line. So I downloaded a cookie manager for Firefox, logged into Yahoo and added code to set the values at the top of the script. The result is "yahoo1.py". It's pretty obvious which cookies are needed.

I found this fails on unnamed photo albums. I also found file download flaky. So Yahoo2 will fix photo albums with duff names and skip downloading any existing files.

This leaves one bug: if a download fails, the script may leave an empty file. If you want to retry that download, you need to remove the empty file before restarting the script.

Sometimes Yahoo barfs at a file because updated AV/malware scanners mark it as bad, e.g. archives which contain "netcat". In this case leave the partial download in place and allow the script to skip it.

We also don't get file descriptions.

I am running the scripts on Windows 10 with Python 3.7.5 and use "py" to run them. When installing the required "requests" package (see the readme.md) I found I had to enter the full path to pip (it is in the Scripts folder).

I am happy to answer any questions, but note I am in the UK (that is East Pondia, not the University of Kentucky), so please allow for my time zone.

Dave
G4UGM

P.S. I now hate Python.
P.P.S. I also now hate Yahoo.
> -----Original Message-----
> From: cctalk On Behalf Of Steve Malikoff via cctalk
> Sent: 24 October 2019 02:51
> To: General Discussion: On-Topic and Off-Topic Posts
> Subject: Re: Yahoo Groups going away
>
> Jim said
> > On 10/17/2019 6:52 PM, Cameron Kaiser via cctalk wrote:
> >>>> Yeah, it sucks. The Tomy Tutor users group has been there for
> >>>> years, and I guess we'll jump over to groups.io. I managed to
> >>>> archive everything last night.
> >>> What's your strategy for archiving material off YahooGroups? Their
> >>> Files and Photo (photostreams) sections are so heavily
> >>> Javascript-encrusted that it's not at all easy to bulk archive from
> >>> them. I tried a few tools (httrack, wget, curl) with no valid
> >>> results, but I only used some basic settings.
> >> For the messages, I used
> >>
> >> https://github.com/andrewferguson/YahooGroups-Archiver
> >>
> >> Unfortunately, the (rather inadequate) Y!G API for files makes it
> >> difficult to iterate over files in a directory tree. I ended up
> >> manually downloading them, since it was only about 30 files and not
> >> worth ginning up something to scrape them. Some people have used
> >>
> >> https://github.com/csaftoiu/yahoo-groups-backup
> > I didn't get that to work. Has anyone here got suggestions? Contact
> > off list. It is getting errors, and I spent about an hour trying to
> > figure it out.
> >
> > every issue was a bug in either Python that was unresolved, or the
> > tools they were using, not errors in the tool, so I'm not really
> > interested in a lot more debugging.
> >
> > I suspect it ran at some point, maybe I've got the wrong versions of
> > some sort.
> >
> > thanks
> > Jim
> >> to get everything but it needs a MongoDB instance which seemed kind
> >> of overkill for a one-time dump.
>
> I set it up with python 3.7.3, pip installed the required modules such as
> Selenium, installed geckodriver for Firefox (but I don't run Firefox on this
> machine, I use a popular fork) and it emitted an error that refers to
> Selenium not being the correct match to Firefox.
> I have other things to do so that's where I left it for now, will try it out
> again sometime soon with an earlier actual Firefox.
>
> Steve.
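The remaining bug described in this message (a failed download leaves an empty file, which the skip-existing check then treats as done) can be avoided by writing to a temporary file and renaming it into place only on success. The following is a minimal sketch of that idea and is not taken from yahoo.py or its variants; the function name `download_if_missing` and the `fetch` callable are hypothetical names chosen for illustration.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def download_if_missing(dest: Path, fetch: Callable[[], Iterable[bytes]]) -> str:
    """Write the chunks produced by fetch() to dest unless dest already has content.

    Returns "skipped" or "downloaded". If fetch() raises, the exception
    propagates and no file (empty or partial) is left at dest.
    """
    # Assumption: a zero-length file is the residue of a failed run,
    # so only non-empty files count as already downloaded.
    if dest.exists() and dest.stat().st_size > 0:
        return "skipped"

    dest.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary file in the same folder so the final rename
    # stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=str(dest.parent), suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in fetch():
                tmp.write(chunk)
        # os.replace overwrites an existing (empty) dest on Windows and POSIX.
        os.replace(tmp_name, str(dest))
    except BaseException:
        # Clean up the partial file, then re-raise the original error.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
    return "downloaded"
```

With the "requests" package, `fetch` could be something like `lambda: session.get(url, stream=True).iter_content(chunk_size=65536)`, where `session` is a `requests.Session` carrying the cookies copied from the browser; calling `raise_for_status()` on the response first would turn HTTP errors into exceptions. One trade-off: a file that Yahoo refuses to serve would be retried on every run, whereas leaving a placeholder in place, as suggested above, makes the script skip it.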
Re: Yahoo Groups going away
Jim said
> On 10/17/2019 6:52 PM, Cameron Kaiser via cctalk wrote:
>>>> Yeah, it sucks. The Tomy Tutor users group has been there for years,
>>>> and I guess we'll jump over to groups.io. I managed to archive
>>>> everything last night.
>>> What's your strategy for archiving material off YahooGroups? Their Files and
>>> Photo (photostreams) sections are so heavily Javascript-encrusted that it's
>>> not at all easy to bulk archive from them. I tried a few tools (httrack,
>>> wget, curl) with no valid results, but I only used some basic settings.
>> For the messages, I used
>>
>> https://github.com/andrewferguson/YahooGroups-Archiver
>>
>> Unfortunately, the (rather inadequate) Y!G API for files makes it difficult
>> to iterate over files in a directory tree. I ended up manually downloading
>> them, since it was only about 30 files and not worth ginning up something
>> to scrape them. Some people have used
>>
>> https://github.com/csaftoiu/yahoo-groups-backup
> I didn't get that to work. Has anyone here got suggestions? Contact off
> list. It is getting errors, and I spent about an hour trying to figure
> it out.
>
> every issue was a bug in either Python that was unresolved, or the tools
> they were using, not errors in the tool, so I'm not really interested in
> a lot more debugging.
>
> I suspect it ran at some point, maybe I've got the wrong versions of
> some sort.
>
> thanks
> Jim
>> to get everything but it needs a MongoDB instance which seemed kind of
>> overkill for a one-time dump.

I set it up with python 3.7.3, pip installed the required modules such as Selenium, installed geckodriver for Firefox (but I don't run Firefox on this machine, I use a popular fork) and it emitted an error that refers to Selenium not being the correct match to Firefox.

I have other things to do so that's where I left it for now, will try it out again sometime soon with an earlier actual Firefox.

Steve.
Re: Yahoo Groups going away
On 10/23/19 5:44 PM, jim stephens via cctalk wrote:
>> https://github.com/csaftoiu/yahoo-groups-backup
> I didn't get that to work. Has anyone here got suggestions? Contact off
> list. It is getting errors, and I spent about an hour trying to figure it out.

I couldn't make it work either. This is what archiveteam knows currently:

https://www.archiveteam.org/index.php?title=Yahoo!_Groups
Re: Yahoo Groups going away
On Wed, 23 Oct 2019, jim stephens via cctalk wrote:
> I suspect it ran at some point, maybe I've got the wrong versions of some sort.

I have no special knowledge; this is just uninformed speculation . . .

If it is old enough, could it be that the changes for "NEO" broke it? (Possibly deliberately, since Yahoo had no desire for anybody to access below the user level.)
Re: Yahoo Groups going away
On 10/17/2019 6:52 PM, Cameron Kaiser via cctalk wrote:
>>> Yeah, it sucks. The Tomy Tutor users group has been there for years, and I
>>> guess we'll jump over to groups.io. I managed to archive everything last
>>> night.
>> What's your strategy for archiving material off YahooGroups? Their Files and
>> Photo (photostreams) sections are so heavily Javascript-encrusted that it's
>> not at all easy to bulk archive from them. I tried a few tools (httrack,
>> wget, curl) with no valid results, but I only used some basic settings.
> For the messages, I used
>
> https://github.com/andrewferguson/YahooGroups-Archiver
>
> Unfortunately, the (rather inadequate) Y!G API for files makes it difficult
> to iterate over files in a directory tree. I ended up manually downloading
> them, since it was only about 30 files and not worth ginning up something
> to scrape them. Some people have used
>
> https://github.com/csaftoiu/yahoo-groups-backup

I didn't get that to work. Has anyone here got suggestions? Contact off list. It is getting errors, and I spent about an hour trying to figure it out.

every issue was a bug in either Python that was unresolved, or the tools they were using, not errors in the tool, so I'm not really interested in a lot more debugging.

I suspect it ran at some point, maybe I've got the wrong versions of some sort.

thanks
Jim

> to get everything but it needs a MongoDB instance which seemed kind of
> overkill for a one-time dump.
Re: Yahoo Groups going away
On 10/17/2019 8:02 PM, Ali via cctalk wrote:
>> The groups are free as long as you use less than 1GB of storage. More
>> storage costs. Unfortunately, since this past February, you also have
>> to pay in order to have Groups.IO do the moving of your messages, files
>> and photos. No freebie for that anymore.
> That explains it. BTW: is February when Yahoo first announced the shutdown
> of Yahoo Groups? Just wondering
>
> -Ali

In the last couple of days. It seems to be a bad match to anything Verizon has any use for, and it's a shame they didn't find a way to spin it off. I suspect, though, that there is tons of intertwined infrastructure that would make it hard to cleave it off into a really separate business.

Thanks
Jim
RE: Yahoo Groups going away
On Thu, 17 Oct 2019, Ali wrote:
> I saw a posting about this on one of the groups I am in (XXCopy) and it
> seems as if groups.io is not free. At least there was talk of a $110 payment.

Groups.io has a free level. But if you subscribe to "premium" for one year ($110), then they will do the transfer for you.
RE: Yahoo Groups going away
> The groups are free as long as you use less than 1GB of storage. More
> storage costs. Unfortunately, since this past February, you also have
> to pay in order to have Groups.IO do the moving of your messages, files
> and photos. No freebie for that anymore.

That explains it. BTW: is February when Yahoo first announced the shutdown of Yahoo Groups? Just wondering

-Ali
Re: Yahoo Groups going away
On 10/17/2019 9:49 PM, Ali via cctalk wrote:
>> The guy who runs groups.io is the one who created onelist, which became
>> e-groups, which got swallowed up by Yahoo!, and then he left.
>> He knows how to do it. People who have switched over seem happy with it.
>> BUT, does he have an appropriate level of resources to handle THAT much
>> traffic?
> I saw a posting about this on one of the groups I am in (XXCopy) and it
> seems as if groups.io is not free. At least there was talk of a $110 payment.
>
> -Ali

The groups are free as long as you use less than 1GB of storage. More storage costs. Unfortunately, since this past February, you also have to pay in order to have Groups.IO do the moving of your messages, files and photos. No freebie for that anymore.

--
John H. Reinhardt
RE: Yahoo Groups going away
> The guy who runs groups.io is the one who created onelist, which became
> e-groups, which got swallowed up by Yahoo!, and then he left.
> He knows how to do it. People who have switched over seem happy with it.
> BUT, does he have an appropriate level of resources to handle THAT much
> traffic?

I saw a posting about this on one of the groups I am in (XXCopy) and it seems as if groups.io is not free. At least there was talk of a $110 payment.

-Ali
Re: Yahoo Groups going away
> > Yeah, it sucks. The Tomy Tutor users group has been there for years, and I
> > guess we'll jump over to groups.io. I managed to archive everything last
> > night.
>
> What's your strategy for archiving material off YahooGroups? Their Files and
> Photo (photostreams) sections are so heavily Javascript-encrusted that it's
> not at all easy to bulk archive from them. I tried a few tools (httrack, wget,
> curl) with no valid results, but I only used some basic settings.

For the messages, I used

https://github.com/andrewferguson/YahooGroups-Archiver

Unfortunately, the (rather inadequate) Y!G API for files makes it difficult to iterate over files in a directory tree. I ended up manually downloading them, since it was only about 30 files and not worth ginning up something to scrape them. Some people have used

https://github.com/csaftoiu/yahoo-groups-backup

to get everything but it needs a MongoDB instance which seemed kind of overkill for a one-time dump.

-- personal: http://www.cameronkaiser.com/ --
Cameron Kaiser * Floodgap Systems * www.floodgap.com * ckai...@floodgap.com
-- Political correctness is tyranny with manners. -- Charlton Heston --
Re: Yahoo Groups going away
On 10/17/2019 6:08 PM, Steve Malikoff via cctalk wrote:
> Cameron said
>> Yeah, it sucks. The Tomy Tutor users group has been there for years, and I
>> guess we'll jump over to groups.io. I managed to archive everything last
>> night.
> What's your strategy for archiving material off YahooGroups? Their Files and
> Photo (photostreams) sections are so heavily Javascript-encrusted that it's
> not at all easy to bulk archive from them. I tried a few tools (httrack, wget,
> curl) with no valid results, but I only used some basic settings.

There is a now obsolete plugin for Firefox called "DownThemAll" (DTA) that sucks the files down. I saw elsewhere in the thread there may be scripts to scrape messages, will look at that.

DownThemAll sees the string of crap after the file name, and apparently the download comes down with the correct file contents and file name. I just downloaded one directory at a time, because DTA doesn't do recursion in any way.

I have an old set of Perl code which I used in 2016 to grab several groups in their entirety, and now need to get everything from there forward.

The thing that happened pre-Verizon was they rolled out a mangling of the groups code called "neo", which still remains in the URL. They killed the original code most tools could scrape groups from by turning off all but the neo type site.

Grabyahoogroups.pl is the code FWIW that did work. I'm glad someone found something if it works with the messages.

thanks
Jim
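Downloading one directory at a time by hand is the manual form of a tree walk. As a rough sketch of what recursion over the Files section would look like, the code below walks a folder tree given any function that lists a single folder. It does not use the real Yahoo Groups API: `list_folder` is a hypothetical callable that takes a folder path and returns `(name, is_folder)` pairs, and `walk_files` is an illustrative name.

```python
from typing import Callable, Iterator, List, Tuple

# One listing entry: (name, is_folder). This shape is an assumption made
# for the sketch, not the format Yahoo returns.
Entry = Tuple[str, bool]


def walk_files(list_folder: Callable[[str], List[Entry]], root: str = "") -> Iterator[str]:
    """Yield the path of every file under root, descending into sub-folders."""
    pending = [root]  # folders still to be listed
    while pending:
        folder = pending.pop()
        for name, is_folder in list_folder(folder):
            path = f"{folder}/{name}" if folder else name
            if is_folder:
                pending.append(path)  # visit this sub-folder later
            else:
                yield path


if __name__ == "__main__":
    # Stand-in data in place of a real listing call.
    fake_tree = {
        "": [("readme.txt", False), ("manuals", True)],
        "manuals": [("tutor.pdf", False)],
    }
    for file_path in walk_files(lambda folder: fake_tree[folder]):
        print(file_path)
```

An explicit list of pending folders is used instead of recursive calls so that a deeply nested Files section cannot hit Python's recursion limit; the order in which folders are visited is not significant here.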
Re: Yahoo Groups going away
On Thu, 17 Oct 2019, Nigel Johnson via cctalk wrote:
> Yes, other groups i belong to that moved have previously said that groups.io
> has a method of pulling groups over. Pull rather than push seems to be the
> way to go. Best to attack it from the groups.io end after setting up the new
> group there.

The guy who runs groups.io is the one who created onelist, which became e-groups, which got swallowed up by Yahoo!, and then he left.

He knows how to do it. People who have switched over seem happy with it.

BUT, does he have an appropriate level of resources to handle THAT much traffic?
Re: Yahoo Groups going away
Yes, other groups I belong to that moved have previously said that groups.io has a method of pulling groups over. Pull rather than push seems to be the way to go. Best to attack it from the groups.io end after setting up the new group there.

On 17/10/2019 21:12, Zane Healy via cctalk wrote:
> On Oct 17, 2019, at 3:32 PM, Cameron Kaiser via cctalk wrote:
>>> So if you subscribe to any Yahoo groups, or value any of that content, be
>>> sure to archive it before your friendly telco sends ALL of it to the bit
>>> bucket.
>> Yeah, it sucks. The Tomy Tutor users group has been there for years, and I
>> guess we'll jump over to groups.io. I managed to archive everything last
>> night.
> The 3D photography group I’m on just moved to groups.io this afternoon. When
> I went and looked just now, it looks like all the files moved as well. Other
> groups I’m on had already moved.
>
> Zane

--
Nigel Johnson
MSc., MIEEE
VE3ID/G4AJQ/VA3MCU

Amateur Radio, the origin of the open-source concept!

You can reach me by voice on Skype: TILBURY2591

If time travel ever will be possible, it already is. Ask me again yesterday

This e-mail is not and cannot, by its nature, be confidential. En route from me to you, it will pass across the public Internet, easily readable by any number of system administrators along the way.
Nigel Johnson

Please consider the environment when deciding if you really need to print this message
Re: Yahoo Groups going away
On Fri, 18 Oct 2019, Steve Malikoff via cctalk wrote:
> What's your strategy for archiving material off YahooGroups? Their Files and
> Photo (photostreams) sections are so heavily Javascript-encrusted that it's
> not at all easy to bulk archive from them. I tried a few tools (httrack, wget,
> curl) with no valid results, but I only used some basic settings.

https://www.archiveteam.org/index.php?title=Yahoo!_Groups
has some of the needed information. And there are some Python scripts for slurping up the messages.
Re: Yahoo Groups going away
> On Oct 17, 2019, at 3:32 PM, Cameron Kaiser via cctalk wrote:
>
>> So if you subscribe to any Yahoo groups, or value any of that content, be
>> sure to archive it before your friendly telco sends ALL of it to the bit
>> bucket.
>
> Yeah, it sucks. The Tomy Tutor users group has been there for years, and I
> guess we'll jump over to groups.io. I managed to archive everything last
> night.

The 3D photography group I’m on just moved to groups.io this afternoon. When I went and looked just now, it looks like all the files moved as well. Other groups I’m on had already moved.

Zane
Re: Yahoo Groups going away
Cameron said
> Yeah, it sucks. The Tomy Tutor users group has been there for years, and I
> guess we'll jump over to groups.io. I managed to archive everything last
> night.

What's your strategy for archiving material off YahooGroups? Their Files and Photo (photostreams) sections are so heavily Javascript-encrusted that it's not at all easy to bulk archive from them. I tried a few tools (httrack, wget, curl) with no valid results, but I only used some basic settings.
Re: Yahoo Groups going away
> So if you subscribe to any Yahoo groups, or value any of that content, be
> sure to archive it before your friendly telco sends ALL of it to the bit
> bucket.

Yeah, it sucks. The Tomy Tutor users group has been there for years, and I guess we'll jump over to groups.io. I managed to archive everything last night.

-- personal: http://www.cameronkaiser.com/ --
Cameron Kaiser * Floodgap Systems * www.floodgap.com * ckai...@floodgap.com
-- No good deed goes unpunished. -- Clare Boothe Luce -
Re: Yahoo Groups going away
Thanks for the link! Groups are already jumping ship. The popular destination seems to be groups.io, which has some good features.

Zane

Sent from my iPod

> On Oct 17, 2019, at 1:51 PM, Paul Koning via cctalk wrote:
>
> A bit off topic, but I figure a number of us are interested in this older
> "social media" mechanism.
>
> https://arstechnica.com/information-technology/2019/10/yahoo-is-deleting-all-content-ever-posted-to-yahoo-groups/
>
> So if you subscribe to any Yahoo groups, or value any of that content, be
> sure to archive it before your friendly telco sends ALL of it to the bit
> bucket.
>
> paul