RE: Yahoo Groups going away

2019-10-24 Thread Dave Wade via cctalk
I have been reasonably successful, after making a few mods, in backing up Yahoo
groups using a clone of "Yahoo Group Archiver", which broadly works (but see
below) and doesn't need any scraping tools.
I made a few tweaks, concentrating on speed rather than on documenting the code,
and the current mess is here:-

https://1drv.ms/u/s!Ag4BJfE5B3onleMG29vMs5czmPcoTw?e=TrqawF

The script yahoo.py is supposed to back things up to files. I couldn't get the
user/password login part to work, but noted that the scripts also support
passing the cookies on the command line.
So I downloaded a cookie manager for Firefox, logged into Yahoo and added code
to set the values at the top of the script. The result is "yahoo1.py". It's
pretty obvious which cookies are needed.
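
As a rough illustration of the "set the cookie values at the top of the script" approach, here is a minimal sketch. It is not the code from yahoo1.py: the cookie names "T" and "Y", the placeholder values and the helper names are assumptions made for the example, and it uses the standard library rather than "requests" so that it runs without any extra packages. Nothing is sent over the network.

```python
import urllib.request

# Hypothetical placeholders: paste the values copied from the browser's
# cookie manager here. The names "T" and "Y" are assumed for illustration.
COOKIES = {
    "T": "PASTE_T_COOKIE_VALUE_HERE",
    "Y": "PASTE_Y_COOKIE_VALUE_HERE",
}


def cookie_header(cookies):
    """Join name/value pairs into a single Cookie header value."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


def build_request(url, cookies=COOKIES):
    """Return a Request object that carries the logged-in session cookies."""
    request = urllib.request.Request(url)
    request.add_header("Cookie", cookie_header(cookies))
    return request
```

A request built this way could then be passed to urllib.request.urlopen(); with "requests" the same dictionary could be handed over through its cookies argument instead.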
I found this fails on unnamed photo albums. I also found file download flaky. 
So Yahoo2 will fix photo albums with duff names and skip downloading any 
existing files.
This leaves one bug. If a download fails, the script may leave an empty file
behind. If you want to retry that download, you need to remove the empty file
before restarting the script.
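
One possible way to avoid the leftover empty file, sketched under assumptions rather than taken from the scripts: download into a temporary file and only rename it to the final name once the download has succeeded. The function name save_if_missing and the fetch callable are hypothetical; in a real script fetch would wrap the HTTP download.

```python
import os
import tempfile


def save_if_missing(path, fetch):
    """Write fetch() to path unless a non-empty file is already there.

    fetch is any callable returning the file's bytes (hypothetical here).
    Returns True if the file was written, False if it was skipped.
    """
    if os.path.exists(path) and os.path.getsize(path) > 0:
        return False  # already downloaded on an earlier run

    directory = os.path.dirname(os.path.abspath(path))
    # Download into a temporary file next to the target so that a failure
    # never leaves an empty or partial file under the final name.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(fetch())
        os.replace(tmp_path, path)  # rename only after a complete download
    except BaseException:
        os.remove(tmp_path)  # clean up the temporary file on any failure
        raise
    return True
```

Note that this sketch treats an existing empty file as "not downloaded" and retries it, which is a different design choice from skipping every existing file.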
Sometimes Yahoo barfs at a file because updated AV/malware scanners mark it as
bad, e.g. archives which contain "netcat". In this case leave the partial
download in place and allow the script to skip it.
We also don't get file descriptions.

I am running the scripts on Windows 10 with Python 3.7.5 and use
"py" to run the scripts.
When installing the required "requests" package (see the readme.md) I found I
had to enter the full path to pip (it's in the Scripts folder).
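
A common way around the pip path problem is to run pip as a module of the interpreter that will run the scripts, so pip does not have to be on the PATH. With the Windows launcher that is "py -m pip install requests". The sketch below builds the equivalent command from Python; the helper name pip_install_command is hypothetical, and the function only builds the command without running it.

```python
import sys


def pip_install_command(package):
    """Build a command that runs pip through the current interpreter.

    Using sys.executable means the package is installed for the same
    Python that runs this script, without needing pip on the PATH.
    """
    return [sys.executable, "-m", "pip", "install", package]


# To actually install, the command could be handed to subprocess, e.g.:
#   import subprocess
#   subprocess.run(pip_install_command("requests"), check=True)
```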

I am happy to answer any questions but note I am in the UK (that is East Pondia 
not the University of Kentucky) so please allow for my time zone.

Dave
G4UGM
P.S. I now hate python
PPS I also now hate Yahoo.


> -----Original Message-----
> From: cctalk  On Behalf Of Steve Malikoff via
> cctalk
> Sent: 24 October 2019 02:51
> To: General Discussion: On-Topic and Off-Topic Posts 
> Subject: Re: Yahoo Groups going away
> 
> Jim said
> > On 10/17/2019 6:52 PM, Cameron Kaiser via cctalk wrote:
> >>>> Yeah, it sucks. The Tomy Tutor users group has been there for
> >>>> years, and I guess we'll jump over to groups.io. I managed to
> >>>> archive everything last night.
> >>> What's your strategy for archiving material off YahooGroups? Their
> >>> Files and Photo (photostreams) sections are so heavily
> >>> Javascript-encrusted that it's not at all easy to bulk archive from
> >>> them. I tried a few tools (httrack, wget,
> >>> curl) with no valid results, but I only used some basic settings.
> >> For the messages, I used
> >>
> >>https://github.com/andrewferguson/YahooGroups-Archiver
> >>
> >> Unfortunately, the (rather inadequate) Y!G API for files makes it
> >> difficult to iterate over files in a directory tree. I ended up
> >> manually downloading them, since it was only about 30 files and not
> >> worth ginning up something to scrape them. Some people have used
> >>
> >>https://github.com/csaftoiu/yahoo-groups-backup
> > I didn't get that to work.  Has anyone here got suggestions? Contact
> > off list.  It is getting errors, and I spent about an hour trying to
> > figure it out.
> >
> > every issue was a bug in either Python that was unresolved, or the
> > tools they were using, not errors in the tool, so I'm not really
> > interested in a lot more debugging.
> >
> > I suspect it ran at some point, maybe I've got the wrong versions of
> > some sort.
> >
> > thanks
> > Jim
> >> to get everything but it needs a MongoDB instance which seemed kind
> >> of overkill for a one-time dump.
> 
> 
> I set it up with python 3.7.3, pip installed the required modules such as
> Selenium, installed geckodriver for Firefox (but I don't run Firefox on this
> machine, I use a popular fork) and it emitted an error that referes to 
> Selenium
> not being the correct match to Firefox.
> I have other things to do so that's where I left it for now, will try it out 
> again
> sometime soon with an earlier actual Firefox.
> 
> Steve.




Re: Yahoo Groups going away

2019-10-23 Thread Steve Malikoff via cctalk
Jim said
> On 10/17/2019 6:52 PM, Cameron Kaiser via cctalk wrote:
>>>> Yeah, it sucks. The Tomy Tutor users group has been there for years, and I
>>>> guess we'll jump over to groups.io. I managed to archive everything last
>>>> night.
>>> What's your strategy for archiving material off YahooGroups? Their Files and
>>> Photo (photostreams) sections are so heavily Javascript-encrusted that it's
>>> not at all easy to bulk archive from them. I tried a few tools (httrack, 
>>> wget,
>>> curl) with no valid results, but I only used some basic settings.
>> For the messages, I used
>>
>>  https://github.com/andrewferguson/YahooGroups-Archiver
>>
>> Unfortunately, the (rather inadequate) Y!G API for files makes it difficult
>> to iterate over files in a directory tree. I ended up manually downloading
>> them, since it was only about 30 files and not worth ginning up something
>> to scrape them. Some people have used
>>
>>  https://github.com/csaftoiu/yahoo-groups-backup
> I didn't get that to work.  Has anyone here got suggestions? Contact off
> list.  It is getting errors, and I spent about an hour trying to figure
> it out.
>
> every issue was a bug in either Python that was unresolved, or the tools
> they were using, not errors in the tool, so I'm not really interested in
> a lot more debugging.
>
> I suspect it ran at some point, maybe I've got the wrong versions of
> some sort.
>
> thanks
> Jim
>> to get everything but it needs a MongoDB instance which seemed kind of
>> overkill for a one-time dump.


I set it up with Python 3.7.3, pip-installed the required modules such as
Selenium, and installed geckodriver for Firefox (but I don't run Firefox on this
machine, I use a popular fork). It emitted an error that refers to Selenium
not being the correct match for Firefox.
I have other things to do, so that's where I left it for now. I will try it out
again sometime soon with an earlier actual Firefox.

Steve.



Re: Yahoo Groups going away

2019-10-23 Thread Al Kossow via cctalk



On 10/23/19 5:44 PM, jim stephens via cctalk wrote:

>> https://github.com/csaftoiu/yahoo-groups-backup
> I didn't get that to work.  Has anyone here got suggestions? Contact off 
> list.  It is getting errors, and I spent about
> an hour trying to figure it out.

I couldn't make it work either.
This is what archiveteam knows currently
https://www.archiveteam.org/index.php?title=Yahoo!_Groups




Re: Yahoo Groups going away

2019-10-23 Thread Fred Cisin via cctalk

On Wed, 23 Oct 2019, jim stephens via cctalk wrote:
> I suspect it ran at some point, maybe I've got the wrong versions of some
> sort.


I have no special knowledge.  This is just uninformed speculation . . .

If it is old enough, could it be that the changes for "NEO" broke it?
(possibly deliberately, since Yahoo had no desire for anybody to access 
below the user level)


Re: Yahoo Groups going away

2019-10-23 Thread jim stephens via cctalk




On 10/17/2019 6:52 PM, Cameron Kaiser via cctalk wrote:
>>> Yeah, it sucks. The Tomy Tutor users group has been there for years, and I
>>> guess we'll jump over to groups.io. I managed to archive everything last
>>> night.
>> What's your strategy for archiving material off YahooGroups? Their Files and
>> Photo (photostreams) sections are so heavily Javascript-encrusted that it's
>> not at all easy to bulk archive from them. I tried a few tools (httrack, wget,
>> curl) with no valid results, but I only used some basic settings.
> For the messages, I used
>
> https://github.com/andrewferguson/YahooGroups-Archiver
>
> Unfortunately, the (rather inadequate) Y!G API for files makes it difficult
> to iterate over files in a directory tree. I ended up manually downloading
> them, since it was only about 30 files and not worth ginning up something
> to scrape them. Some people have used
>
> https://github.com/csaftoiu/yahoo-groups-backup

I didn't get that to work.  Has anyone here got suggestions? Contact off
list.  It is getting errors, and I spent about an hour trying to figure
it out.

every issue was a bug in either Python that was unresolved, or the tools
they were using, not errors in the tool, so I'm not really interested in
a lot more debugging.

I suspect it ran at some point, maybe I've got the wrong versions of
some sort.

thanks
Jim

> to get everything but it needs a MongoDB instance which seemed kind of
> overkill for a one-time dump.





Re: Yahoo Groups going away

2019-10-17 Thread jim stephens via cctalk




On 10/17/2019 8:02 PM, Ali via cctalk wrote:
>> The groups are free as long as you use less than 1GB of storage.  More
>> storage costs.  Unfortunately, since this past February, you also have
>> to pay in order to have Groups.IO do the moving of your messages, files
>> and photos.  No freebie for that anymore.
> That explains it. BTW: is February when Yahoo first announced the shutdown of
> Yahoo Groups? Just wondering
>
> -Ali

In the last couple of days.  It seems to be a bad match to anything
Verizon has any use for, and it's a shame they didn't find a way to spin
it off.  I suspect there is tons of intertwined infrastructure though to
try to cleave it off to a really separate business.

Thanks
Jim


RE: Yahoo Groups going away

2019-10-17 Thread Fred Cisin via cctalk

On Thu, 17 Oct 2019, Ali wrote:

> I saw a posting about this on one of the groups I am in (XXCopy) and it
> seems as if groups.io is not free. At least there was talk of a $110
> payment.


Groups.io has a free level.  But, if you subscribe to "premium" for one 
year ($110), then they will do the transfer for you.





RE: Yahoo Groups going away

2019-10-17 Thread Ali via cctalk
> The groups are free as long as you use less than 1GB of storage.  More
> storage costs.  Unfortunately, since this past February, you also have
> to pay in order to have Groups.IO do the moving of your messages, files
> and photos.  No freebie for that anymore.

That explains it. BTW: is February when Yahoo first announced the shutdown of 
Yahoo Groups? Just wondering

-Ali



Re: Yahoo Groups going away

2019-10-17 Thread John H. Reinhardt via cctalk

On 10/17/2019 9:49 PM, Ali via cctalk wrote:

>> The guy who runs groups.io is the one who created onelist, which became
>> e-groups, with got swallowed up by Yahoo!, and then he left.
>>
>> He knows how to do it.  People who have switched over seem happy with
>> it.
>> BUT, does he have an appropriate level of resources to handle THAT much
>> traffic?
>
> I saw a posting about this on one of the groups I am in (XXCopy) and it
> seems as if groups.io is not free. At least there was talk of a $110
> payment.
>
> -Ali


The groups are free as long as you use less than 1GB of storage.  More storage 
costs.  Unfortunately, since this past February, you also have to pay in order 
to have Groups.IO do the moving of your messages, files and photos.  No freebie 
for that anymore.

--
John H. Reinhardt



RE: Yahoo Groups going away

2019-10-17 Thread Ali via cctalk
> 
> The guy who runs groups.io is the one who created onelist, which became
> e-groups, with got swallowed up by Yahoo!, and then he left.
> 
> He knows how to do it.  People who have switched over seem happy with
> it.
> BUT, does he have an appropriate level of resources to handle THAT much
> traffic?


I saw a posting about this on one of the groups I am in (XXCopy) and it
seems as if groups.io is not free. At least there was talk of a $110
payment.

-Ali



Re: Yahoo Groups going away

2019-10-17 Thread Cameron Kaiser via cctalk
> > Yeah, it sucks. The Tomy Tutor users group has been there for years, and I
> > guess we'll jump over to groups.io. I managed to archive everything last
> > night.
> 
> What's your strategy for archiving material off YahooGroups? Their Files and
> Photo (photostreams) sections are so heavily Javascript-encrusted that it's
> not at all easy to bulk archive from them. I tried a few tools (httrack, wget,
> curl) with no valid results, but I only used some basic settings.

For the messages, I used

https://github.com/andrewferguson/YahooGroups-Archiver

Unfortunately, the (rather inadequate) Y!G API for files makes it difficult
to iterate over files in a directory tree. I ended up manually downloading
them, since it was only about 30 files and not worth ginning up something
to scrape them. Some people have used

https://github.com/csaftoiu/yahoo-groups-backup

to get everything but it needs a MongoDB instance which seemed kind of
overkill for a one-time dump.

-- 
 personal: http://www.cameronkaiser.com/ --
  Cameron Kaiser * Floodgap Systems * www.floodgap.com * ckai...@floodgap.com
-- Political correctness is tyranny with manners. -- Charlton Heston --


Re: Yahoo Groups going away

2019-10-17 Thread jim stephens via cctalk




On 10/17/2019 6:08 PM, Steve Malikoff via cctalk wrote:

> Cameron said
>> Yeah, it sucks. The Tomy Tutor users group has been there for years, and I
>> guess we'll jump over to groups.io. I managed to archive everything last
>> night.
>
> What's your strategy for archiving material off YahooGroups? Their Files and
> Photo (photostreams) sections are so heavily Javascript-encrusted that it's
> not at all easy to bulk archive from them. I tried a few tools (httrack, wget,
> curl) with no valid results, but I only used some basic settings.

There is a now obsolete plugin for firefox called "downloadthemall" that 
sucks the files down.  I saw elsewhere in the thread there may be 
scripts to scrape messages, will look at that.  Downloadthemall sees the 
string of crap after the file name, and apparently it comes down with 
the correct file contents and file name.  I just downloaded it one 
directory at a time, because DTA doesn't do a recursion in any way.


I have an old set of perl code which I used in 2016 to grab several 
groups in their entirety, and now need to get from there forward.


What happened pre-Verizon was that they rolled out a mangling of
the groups code called "neo", which still remains in the URL. By turning
off all but the neo-type site, they killed the original code that most
tools could scrape groups from.


Grabyahoogroups.pl is the code FWIW that did work.  I'm glad someone 
found something if it works with the messages.


thanks
Jim


Re: Yahoo Groups going away

2019-10-17 Thread Fred Cisin via cctalk

On Thu, 17 Oct 2019, Nigel Johnson via cctalk wrote:
> Yes, other groups i belong to that moved have previously said that groups.io
> has a method of pulling groups over. Pull rather than push seems to be the
> way to go. Best to attack it from the groups.io end after setting up the new
> group there.


The guy who runs groups.io is the one who created onelist, which became
e-groups, which got swallowed up by Yahoo!, and then he left.


He knows how to do it.  People who have switched over seem happy with it. 
BUT, does he have an appropriate level of resources to handle THAT much 
traffic?





Re: Yahoo Groups going away

2019-10-17 Thread Nigel Johnson via cctalk
Yes, other groups I belong to that moved have previously said that
groups.io has a method of pulling groups over. Pull rather than push 
seems to be the way to go. Best to attack it from the groups.io end 
after setting up the new group there.



On 17/10/2019 21:12, Zane Healy via cctalk wrote:
>> On Oct 17, 2019, at 3:32 PM, Cameron Kaiser via cctalk wrote:
>>
>>> So if you subscribe to any Yahoo groups, or value any of that content, be
>>> sure to archive it before your friendly telco sends ALL of it to the bit
>>> bucket.
>>
>> Yeah, it sucks. The Tomy Tutor users group has been there for years, and I
>> guess we'll jump over to groups.io. I managed to archive everything last
>> night.
>
> The 3D photography group I’m on just moved to groups.io this afternoon.  When I
> went and looked just now, it looks like all the files moved as well.  Other
> groups I’m on had already moved.
>
> Zane


 


--
Nigel Johnson
MSc., MIEEE
VE3ID/G4AJQ/VA3MCU

Amateur Radio, the origin of the open-source concept!


You can reach me by voice on Skype:  TILBURY2591

If time travel ever will be possible, it already is. Ask me again yesterday







Re: Yahoo Groups going away

2019-10-17 Thread Fred Cisin via cctalk

On Fri, 18 Oct 2019, Steve Malikoff via cctalk wrote:

> What's your strategy for archiving material off YahooGroups? Their Files and
> Photo (photostreams) sections are so heavily Javascript-encrusted that it's
> not at all easy to bulk archive from them. I tried a few tools (httrack, wget,
> curl) with no valid results, but I only used some basic settings.


https://www.archiveteam.org/index.php?title=Yahoo!_Groups
has some of the needed information.
And, there are some Python scripts for slurping up the messages.


Re: Yahoo Groups going away

2019-10-17 Thread Zane Healy via cctalk


> On Oct 17, 2019, at 3:32 PM, Cameron Kaiser via cctalk 
>  wrote:
> 
>> So if you subscribe to any Yahoo groups, or value any of that content, be
>> sure to archive it before your friendly telco sends ALL of it to the bit
>> bucket.
> 
> Yeah, it sucks. The Tomy Tutor users group has been there for years, and I
> guess we'll jump over to groups.io. I managed to archive everything last
> night.

The 3D photography group I’m on just moved to groups.io this afternoon.  When I 
went and looked just now, it looks like all the files moved as well.  Other 
groups I’m on had already moved.

Zane




Re: Yahoo Groups going away

2019-10-17 Thread Steve Malikoff via cctalk
Cameron said
> Yeah, it sucks. The Tomy Tutor users group has been there for years, and I
> guess we'll jump over to groups.io. I managed to archive everything last
> night.

What's your strategy for archiving material off YahooGroups? Their Files and
Photo (photostreams) sections are so heavily Javascript-encrusted that it's
not at all easy to bulk archive from them. I tried a few tools (httrack, wget,
curl) with no valid results, but I only used some basic settings.




Re: Yahoo Groups going away

2019-10-17 Thread Cameron Kaiser via cctalk
> So if you subscribe to any Yahoo groups, or value any of that content, be
> sure to archive it before your friendly telco sends ALL of it to the bit
> bucket.

Yeah, it sucks. The Tomy Tutor users group has been there for years, and I
guess we'll jump over to groups.io. I managed to archive everything last
night.

-- 
 personal: http://www.cameronkaiser.com/ --
  Cameron Kaiser * Floodgap Systems * www.floodgap.com * ckai...@floodgap.com
-- No good deed goes unpunished. -- Clare Boothe Luce -


Re: Yahoo Groups going away

2019-10-17 Thread Zane Healy via cctalk
Thanks for the link!

Groups are already jumping ship.  The popular destination seems to be 
groups.io, which has some good features.

Zane 

Sent from my iPod

> On Oct 17, 2019, at 1:51 PM, Paul Koning via cctalk  
> wrote:
> 
> A bit off topic, but I figure a number of us are interested in this older 
> "social media" mechanism.
> 
> https://arstechnica.com/information-technology/2019/10/yahoo-is-deleting-all-content-ever-posted-to-yahoo-groups/
> 
> So if you subscribe to any Yahoo groups, or value any of that content, be 
> sure to archive it before your friendly telco sends ALL of it to the bit 
> bucket.
> 
>paul
>