On Thu, Aug 22, 2019 at 6:16 PM ALT-EMAIL Virilo Tejedor wrote:

> thanks Barry,
>
> I'm not very sure if parallelizing could work, because it has a delay of
> seconds to get a single image from this bucket.

If a single image has a latency of, say, 2 seconds, plus 2 seconds to download, that is 4 seconds per image, or about 15 per minute. But if you download, say, 10 in parallel, then that's 150 per minute, even with the latency per image.

Overall the 'latency' gets spread around: your script wouldn't be waiting on all 10 at the same time. While it is waiting for one to start, it can be downloading a different one.

AWS does not store all 10 million images on the *same physical disk*. They are possibly spread over *millions* of disks.

> I should open several threads, and probably I'm going to have problems
> with other limitations

In theory, yes. But AWS will cope with high concurrency. It's designed that way. You could easily download 1,000 images concurrently, if you had the bandwidth.

But however you download the data (even if it is just to upload it elsewhere, which I still don't understand), you will have to deal with this latency to download it all in a realistic timeframe. At 4 seconds each, downloading all 10 million one by one might take 463 days ;)
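
To make the numbers above concrete, here is a rough sketch in Python. It is only an illustration: `fetch_image` is a hypothetical stand-in that sleeps to mimic per-image latency (it does not talk to any real bucket), and the 4 seconds per image and 10 million images are the figures assumed in this thread.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SECONDS_PER_DAY = 86_400


def estimated_days(num_images, seconds_per_image, concurrency):
    """Rough wall-clock estimate, assuming each worker handles one image at a time."""
    return num_images * seconds_per_image / concurrency / SECONDS_PER_DAY


def fetch_image(key, delay=0.05):
    """Hypothetical stand-in for a real bucket download; sleeps to mimic latency."""
    time.sleep(delay)
    return key


def fetch_all(keys, concurrency):
    """Fetch every key with a pool of worker threads; results keep input order."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(fetch_image, keys))


if __name__ == "__main__":
    keys = [f"image-{i}.jpg" for i in range(20)]
    for workers in (1, 10):
        start = time.perf_counter()
        fetch_all(keys, workers)
        print(f"{workers} worker(s): {time.perf_counter() - start:.2f}s")
    print(f"one by one: {estimated_days(10_000_000, 4, 1):.0f} days")
    print(f"10 at once: {estimated_days(10_000_000, 4, 10):.1f} days")
```

While one thread is waiting on its image, the others carry on with theirs, so the total time drops roughly in proportion to the number of workers, until bandwidth or some other limit becomes the bottleneck.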