Re: Need Help Throttling Downloads From an FTP Site
Hi Bob Sneidar, Scott Rossi, Mike Bonner, and Jim Lambert,

Thanks for your suggestions. I'm going to experiment with all of them and share my results with the list.

Regards,

Gregory
___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode
Re: Need Help Throttling Downloads From an FTP Site
Whoops! That should be:

    on getNextFile
       if lListOfFilePaths = empty then exit getNextFile
       put line 1 of lListOfFilePaths into remoteFilePath
       delete line 1 of lListOfFilePaths
       -- SET THE LOCAL FILE'S NAME HOWEVER YOU NORMALLY WOULD
       put whatever into localFileName
       libURLDownloadToFile ("ftp://anonymous:myemailaddr...@ftp.sec.gov/" & remoteFilePath), (exportFolderPath & "/" & localFileName), "downloadComplete"
    end getNextFile

Jim Lambert
Re: Need Help Throttling Downloads From an FTP Site
Gregory,

Try this (untested):

    local lListOfFilePaths

    on downloadAll
       put theListofFiles into lListOfFilePaths
       getNextFile
    end downloadAll

    on getNextFile
       if lListOfFilePaths = empty then exit getNextFile
       put line 1 of lListOfFilePaths into remoteFilePath
       -- SET THE LOCAL FILE'S NAME HOWEVER YOU NORMALLY WOULD
       put whatever into localFileName
       libURLDownloadToFile ("ftp://anonymous:myemailaddr...@ftp.sec.gov/" & remoteFilePath), (exportFolderPath & "/" & localFileName), "downloadComplete"
    end getNextFile

    command downloadComplete pURL, pStatus
       if pStatus = "error" or pStatus = "timeout" then
          answer error "The file" && pURL && "could not be downloaded."
       else
          getNextFile
       end if
    end downloadComplete

Basically, it fetches the files one at a time. No need to add guessed-at delays.

Jim Lambert
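For anyone who also wants a record of which files succeeded and which failed, the one-at-a-time idea above could be extended roughly as follows. This is only a sketch under stated assumptions: the handler names (startDownloads, fetchNext, downloadFinished), the script locals, the half-second pause, and the rule of naming each local file after the last segment of its remote path are all illustrative and not from the thread. The password in the URL is left elided as it appears in the posts. It treats any callback status other than "error" or "timeout" as success, mirroring the check in the handlers above.

```livecode
-- Sketch only: all names here are illustrative, not taken from the thread.
constant kBaseURL = "ftp://anonymous:myemailaddr...@ftp.sec.gov/"  -- address elided as in the posts

local sQueue, sExportFolder, sSucceeded, sFailed

-- pFileList: return-delimited remote paths; pExportFolder: existing local folder
command startDownloads pFileList, pExportFolder
   put pFileList into sQueue
   put pExportFolder into sExportFolder
   put empty into sSucceeded
   put empty into sFailed
   fetchNext
end startDownloads

command fetchNext
   local tRemotePath, tLocalPath, tSummary
   if sQueue is empty then
      put (the number of lines of sSucceeded) && "downloaded," into tSummary
      put space & (the number of lines of sFailed) && "failed." after tSummary
      answer tSummary
      exit fetchNext
   end if
   put line 1 of sQueue into tRemotePath
   delete line 1 of sQueue
   -- Assumed naming rule: reuse the last path segment as the local file name.
   set the itemDelimiter to "/"
   put sExportFolder & "/" & the last item of tRemotePath into tLocalPath
   libURLDownloadToFile (kBaseURL & tRemotePath), tLocalPath, "downloadFinished"
end fetchNext

on downloadFinished pURL, pStatus
   if pStatus is "error" or pStatus is "timeout" then
      put pURL & tab & pStatus & return after sFailed
   else
      put pURL & return after sSucceeded
   end if
   -- Request the next file only after this one has finished, with a short pause.
   send "fetchNext" to me in 500 milliseconds
end downloadFinished
```

Because a failure is logged rather than reported with a dialog, the queue keeps moving, and sFailed can be fed back into startDownloads later for a second pass.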
Re: Need Help Throttling Downloads From an FTP Site
The problem doesn't seem to be a local network issue. When I try to grab files from the SEC site, too many connections too fast make it choke. (Their end, not mine; most likely anti-bot code.) As Scott Rossi said, using a delay should help. I've noticed that the magic number seems to be 5, so I used load and a counter to get reliable downloads.

    local sList, sBaseUrl, sCount

    on mouseUp
       put 0 into sCount
       put "ftp://anonymous:nob...@ftp.sec.gov/edgar/forms/" into sBaseUrl -- the folder I chose to download from
       put empty into field 2 -- my status field
       set the defaultFolder to specialFolderPath("desktop") & "/downloads" -- where I'm saving them
       put field 1 into sList -- my list of files
       downloadit -- start the downloads
    end mouseUp

    command downloadit
       repeat for each line tLine in sList
          if sCount mod 5 is 0 then wait 5 seconds with messages -- pause every 5 files
          load URL (sBaseUrl & tLine) with message "doDownloads" -- load the URL into the cache, then process with doDownloads
          add 1 to sCount
       end repeat
    end downloadit

    command doDownloads pUrl, pStatus
       set the itemDelimiter to "/"
       put URL pUrl into URL ("binfile:" & the last item of pUrl) -- save the file from the cache under its own name
       put pUrl & ":" && pStatus & cr after field 2 -- update the status field
       unload URL pUrl -- clear the URL from the cache
    end doDownloads
Re: Need Help Throttling Downloads From an FTP Site
How large are the files you're retrieving? If the script below is your actual script, you might try allowing some execution time in the loop:

    repeat for each line remoteFilePath in listOfFilePaths
       -- localFileName is set before the download request is made
       put url ("ftp://anonymous:myemailaddr...@ftp.sec.gov/" & remoteFilePath) into url ("file:/" & exportFolderPath & "/" & localFileName)
       wait 2 seconds with messages -- <-- ADD THIS
    end repeat

It would probably be most helpful to you to check the status of each request, so you can keep track of which events succeeded and which failed. I imagine there are folks on the list who have something like this more readily available than me.

Regards,

Scott Rossi
Creative Director
Tactile Media, UX/UI Design
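As a rough illustration of checking each request inside the blocking loop, the sketch below reads the result after every transfer and builds a log of outcomes. It rests on assumptions: the handler name downloadAndLog, its parameters, and the rule of naming each local file after the last segment of its remote path are hypothetical, and it assumes the result is empty after a successful URL fetch or write and holds an error string otherwise. The password in the URL is left elided as it appears in the posts.

```livecode
-- Sketch only: handler and variable names are illustrative, not from the thread.
-- pFileList: return-delimited remote paths; pExportFolder: existing local folder
command downloadAndLog pFileList, pExportFolder
   local tRemotePath, tData, tError, tLog
   set the itemDelimiter to "/"
   repeat for each line tRemotePath in pFileList
      put URL ("ftp://anonymous:myemailaddr...@ftp.sec.gov/" & tRemotePath) into tData
      put the result into tError  -- assumed: empty on success, error text otherwise
      if tError is empty then
         -- Assumed naming rule: reuse the last path segment as the local file name.
         put tData into URL ("binfile:" & pExportFolder & "/" & the last item of tRemotePath)
         put the result into tError  -- also catches a failed write to disk
      end if
      if tError is empty then
         put tRemotePath & tab & "ok" & return after tLog
      else
         put tRemotePath & tab & tError & return after tLog
      end if
      wait 2 seconds with messages  -- the pause suggested above
   end repeat
   return tLog
end downloadAndLog
```

The caller can pick up the log from the result immediately after calling downloadAndLog and filter it for lines that do not end in "ok" to get a retry list.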
Re: Need Help Throttling Downloads From an FTP Site
FTP has been called the misbehaving child of networking, and I'm being kind. While other protocols play nicely on a network, neither grabbing all the bandwidth they can nor refusing to throttle down when needed, FTP generally does the opposite. FTP will try to commandeer all the bandwidth your infrastructure allows, and won't let go once it has it.

I'm sure modern FTP servers are better behaved than their cave-man-days predecessors, but the protocol itself is still what it is. If you have a router or switch with built-in QoS, you may be able to throttle it there. Barring that, you may want to use HTTP file transfers instead.

Bob S

On Sep 21, 2015, at 14:33, Gregory Lypny <gregory.ly...@videotron.ca> wrote:

> Hello everyone,
>
> I posted about this a while back but am still having trouble.
>
> I need to download thousands of files from the Securities and Exchange Commission's website. Access is through anonymous FTP with "anonymous" as the username and my email address as the password. I've been using put in a repeat loop, as follows:
>
>     repeat for each line remoteFilePath in listOfFilePaths
>        -- localFileName is set before the download request is made
>        put url ("ftp://anonymous:myemailaddr...@ftp.sec.gov/" & remoteFilePath) into url ("file:/" & exportFolderPath & "/" & localFileName)
>     end repeat
>
> but my script dies (the stack is lifeless and unresponsive) after a few dozen, and sometimes a few hundred, downloads. I used similar scripts in Mathematica and confirmed that the problem is session-timed-out and cannot-connect-to-server types of errors. The SEC's webmaster tells me, "There is no load/rate limiting on FTP, but if you are running a fast process, it is possible you are temporarily overwhelming the server."
>
> So, I'm thinking that I need to throttle my requests, and maybe should be using libURLDownloadToFile to check the status of the current file being downloaded and not request another file until the current download is complete. I also wonder whether I should be connecting to the FTP site only once with the username and password, looping my requests, and then closing the connection. Not sure how to do either of these and would greatly appreciate any suggestions or tips.
>
> Gregory