Re: Need Help Throttling Downloads From an FTP Site

2015-09-23 Thread Gregory Lypny
Hi Bob Sneidar, Scott Rossi, Mike Bonner, and Jim Lambert,

Thanks for your suggestions. I'm going to experiment with all of them and share 
my results with the list.

Regards,

Gregory

___
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription 
preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode


Re: Need Help Throttling Downloads From an FTP Site

2015-09-21 Thread Jim Lambert
Whoops!

That should be:

on getNextFile
   if lListOfFilePaths = empty then exit getNextFile
   put line 1 of lListOfFilePaths into remoteFilePath
   delete line 1 of lListOfFilePaths

   -- SET THE LOCAL FILE'S NAME HOWEVER YOU NORMALLY WOULD
   put whatever into localFileName
   libURLDownloadToFile ("ftp://anonymous:myemailaddr...@ftp.sec.gov/" & remoteFilePath), \
         (exportFolderPath & "/" & localFileName), "downloadComplete"
end getNextFile

Jim Lambert

Re: Need Help Throttling Downloads From an FTP Site

2015-09-21 Thread Jim Lambert
Gregory,

Try this (untested):

local lListOfFilePaths

on downloadAll
   put theListofFiles into lListOfFilePaths
   getNextFile
end downloadAll

on getNextFile
   if lListOfFilePaths = empty then exit getNextFile
   put line 1 of lListOfFilePaths into remoteFilePath
   delete line 1 of lListOfFilePaths -- so the next call moves on to the next file
   -- SET THE LOCAL FILE'S NAME HOWEVER YOU NORMALLY WOULD
   put whatever into localFileName
   libURLDownloadToFile ("ftp://anonymous:myemailaddr...@ftp.sec.gov/" & remoteFilePath), \
         (exportFolderPath & "/" & localFileName), "downloadComplete"
end getNextFile


command downloadComplete pURL, pStatus
   if pStatus = "error" or pStatus = "timeout" then
      answer error "The file" && pURL && "could not be downloaded."
   else
      getNextFile
   end if
end downloadComplete


Basically it fetches the files one at a time.
No need for adding guessed-at delays.
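
The handlers above stop at the first failed file. As a rough, untested sketch of one way to keep going, the variant below retries a failed file a few times and then logs it and moves on. The names lBaseURL, lExportFolder, lFailedList, lRetries, kMaxRetries and kRetryDelay are made up for illustration, the retry limit and delay are guesses rather than values anyone in this thread measured, and taking the local file name from the last segment of the remote path is an assumption.

```livecode
local lBaseURL, lExportFolder, lListOfFilePaths, lFailedList, lRetries

constant kMaxRetries = 3   -- assumed: give up on a file after this many attempts
constant kRetryDelay = 10  -- assumed: seconds to wait before trying a file again

command downloadAll pBaseURL, pExportFolder, pListOfFiles
   put pBaseURL into lBaseURL
   put pExportFolder into lExportFolder
   put pListOfFiles into lListOfFilePaths
   put empty into lFailedList
   put 0 into lRetries
   getNextFile
end downloadAll

command getNextFile
   local tRemoteFilePath, tLocalFileName
   if lListOfFilePaths is empty then
      -- nothing left to fetch: report whatever could not be downloaded
      if lFailedList is not empty then
         answer "These files could not be downloaded:" & cr & lFailedList
      end if
      exit getNextFile
   end if
   put line 1 of lListOfFilePaths into tRemoteFilePath
   -- assumption: name the local file after the last segment of the remote path
   set the itemDelimiter to "/"
   put the last item of tRemoteFilePath into tLocalFileName
   libURLDownloadToFile (lBaseURL & tRemoteFilePath), \
         (lExportFolder & "/" & tLocalFileName), "downloadComplete"
end getNextFile

command downloadComplete pURL, pStatus
   -- clear libURL's record of this URL so the same URL can be requested again
   -- (assumed to be harmless when nothing is cached)
   unload URL pURL
   if pStatus is "error" or pStatus is "timeout" then
      add 1 to lRetries
      if lRetries < kMaxRetries then
         -- line 1 is still the failed file, so getNextFile retries it
         send "getNextFile" to me in kRetryDelay seconds
         exit downloadComplete
      end if
      put line 1 of lListOfFilePaths & cr after lFailedList
   end if
   -- finished with this file, whether it succeeded or was given up on
   delete line 1 of lListOfFilePaths
   put 0 into lRetries
   send "getNextFile" to me in 0 seconds
end downloadComplete
```

It would be started with something like: downloadAll tBaseURL, exportFolderPath, listOfFilePaths, where tBaseURL holds the ftp:// prefix with your own login. Note that the line is only deleted in downloadComplete, not in getNextFile, so that a retry picks up the same file.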

Jim Lambert

Re: Need Help Throttling Downloads From an FTP Site

2015-09-21 Thread Mike Bonner
The problem doesn't seem to be a local network issue.  When I try to grab
files from the SEC site, too many connections too fast make it choke.
(Their end, not mine; most likely anti-bot code.)

As Scott Rossi said, using a delay should help.  I've noticed the magic
number seems to be 5, so I used load and a counter to get reliable
downloads.

local sList,sBaseUrl,sCount
on mouseUp
   put 0 into sCount
   -- the folder I chose to download from
   put "ftp://anonymous:nob...@ftp.sec.gov/edgar/forms/" into sBaseUrl
   put empty into field 2 -- my status field
   -- where I'm saving 'em
   set the defaultfolder to specialfolderpath("desktop") & "/downloads"
   put field 1 into sList -- my list of files
   downloadit -- start the downloads
end mouseUp

command downloadit
   repeat for each line tLine in sList
      -- pause every 5 files
      if sCount mod 5 is 0 then wait 5 seconds with messages
      -- load the url into cache then process with doDownloads
      load URL (sBaseUrl & tLine) with message "doDownloads"
      add 1 to sCount
   end repeat
end downloadit

command doDownloads pUrl, pStatus
   -- save the file from cache, named after the last segment of its URL
   -- (line 1 of sList would give every file the same name)
   set the itemDelimiter to "/"
   put URL pUrl into URL ("binfile:" & the last item of pUrl)
   put pUrl & ":" && pStatus & cr after field 2 -- update the status field
   unload URL pUrl -- clear the url from the cache
end doDownloads


Re: Need Help Throttling Downloads From an FTP Site

2015-09-21 Thread Scott Rossi
How large are the files you're retrieving?  If the script below is your
actual script, you might try allowing some execution time in the loop:

repeat for each line remoteFilePath in listOfFilePaths
   -- localFileName is set here, before the download request is made
   put url ("ftp://anonymous:myemailaddr...@ftp.sec.gov/" & remoteFilePath) \
         into url ("file:/" & exportFolderPath & "/" & localFileName)
   wait 2 seconds with messages --  <-- ADD THIS
end repeat

It would probably be most helpful to you to check the status of each
request, so you can keep track of which events succeeded and which failed. I
imagine there are folks on the list who have something like this more
readily available than me.
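
For what it's worth, here is a rough, untested sketch of what that status checking might look like with the blocking form of put url: fetch each file into a variable, look at the result, and keep a list of the paths that failed. The handler name downloadWithLog, its parameters, and the choice to name each local file after the last segment of its remote path are all assumptions made for illustration, and the 2-second wait is just the value suggested above.

```livecode
command downloadWithLog pBaseURL, pListOfFilePaths, pExportFolder
   local tRemoteFilePath, tData, tError, tFailures
   repeat for each line tRemoteFilePath in pListOfFilePaths
      put url (pBaseURL & tRemoteFilePath) into tData
      -- the result is empty on success, otherwise it holds the error text
      put the result into tError
      if tError is not empty then
         put tRemoteFilePath & tab & tError & cr after tFailures
      else
         -- assumption: local name = last segment of the remote path
         set the itemDelimiter to "/"
         put tData into url ("binfile:" & pExportFolder & "/" & \
               the last item of tRemoteFilePath)
      end if
      wait 2 seconds with messages
   end repeat
   -- hand the failure list back to the caller via the result
   return tFailures
end downloadWithLog
```

After calling downloadWithLog, the caller can read the result to get one line per failed file (path, tab, error text) and feed those paths back in for a second pass.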

Regards,

Scott Rossi
Creative Director
Tactile Media, UX/UI Design







Re: Need Help Throttling Downloads From an FTP Site

2015-09-21 Thread Bob Sneidar
FTP has been called the misbehaving child of networking, and I'm being kind. 
While other protocols play nicely on a network, not grabbing all the bandwidth 
they can and throttling down when needed, FTP generally does the 
opposite. FTP will try to commandeer all the bandwidth your infrastructure 
allows, and won't let go once it has it.

I'm sure modern FTP servers are better behaved than their cave-man-days 
predecessors, but the protocol itself is still what it is.

If you have a router or switch with built-in QoS, you may be able to throttle 
the transfers there. Barring that, you may want to use HTTP file transfers instead.

Bob S


On Sep 21, 2015, at 14:33, Gregory Lypny <gregory.ly...@videotron.ca> wrote:

Hello everyone,

I posted about this a while back but am still having trouble.

I need to download thousands of files from the Securities and Exchange 
Commission's website. Access is through anonymous FTP with "anonymous" as the 
username and my email address as the password. I've been using put in a repeat 
loop, as follows:

repeat for each line remoteFilePath in listOfFilePaths
   -- localFileName is set here, before the download request is made
   put url ("ftp://anonymous:myemailaddr...@ftp.sec.gov/" & remoteFilePath) \
         into url ("file:/" & exportFolderPath & "/" & localFileName)
end repeat

but my script dies (the stack is lifeless and unresponsive) after a few dozen, 
and sometimes a few hundred downloads. I used similar scripts in Mathematica 
and confirmed that the problem is session-timed-out and 
cannot-connect-to-server types of errors. The SEC's webmaster tells me, "There 
is no load/rate limiting on FTP, but if you are running a fast process, it is 
possible you are temporarily overwhelming the server." So, I'm thinking that I 
need to throttle my requests, and maybe should be using libURLDownloadToFile to 
check the status of the current file being downloaded and not request another 
file until the current download is complete. I also wonder whether I should be 
connecting to the FTP site only once with the username and password, loop my 
requests, and then close the connection. Not sure how to do either of these and 
would greatly appreciate any suggestions or tips.

Gregory
