On Mon, May 01, 2017 at 10:20:42AM -0700, Ian Monat wrote:
[...]
> Then you have to run the .exe which produces a zipped file, and inside the
> zipped file, is the .txt, which is what I really want. There's no way the
> distributor will change anything about how they store files on their
> website for me. I've written a script using the requests module but I
> think a web scraper like Scrapy, Beautiful Soup or Selenium may be
> required.
>
> What would you do?

Find another distributor.

(It's this sort of business-to-business incompetence that makes me laugh
when people say that private industry is always more efficient than the
alternatives. Did I say laugh? I meant cry.)

Seriously, can't you tell them that your anti-virus blocks the .exe
files, and if they want you to use their system, they'll have to provide
text files as text files? Or tell them that you're using Apple Macs and
the .exe files don't run on a Mac.

I guess it depends on whether you need them more than they need you.

In any case, this isn't a problem that can be solved by a web scraper.
The distributor's website provides .exe files, so that's what you
receive. There's nothing you can do about that except complain or leave.

However, once you have the .exe file in your possession, you *may* be
able to hack open the file and extract the .zip file without running it.
That will require detailed knowledge of how the .exe file does its job,
but it is conceivable that it will work. A good low-level hacker could
probably determine whether the zip file is embedded in the .exe or
generated on the fly. That's beyond my skills though.

If it is generated on the fly, you're screwed. You have no choice but to
run the .exe; until you do, the zip doesn't even exist. But if it is
embedded, it can be extracted, and once the zip file is extracted,
Python can easily unzip it.


-- 
Steve
_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
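
A rough sketch of the "embedded" case, for what it's worth. Many
self-extracting archives are just an executable stub with an ordinary
zip archive appended, and Python's zipfile module can often read such a
file directly, because it locates the archive from the end of the file
rather than the start. Whether this distributor's .exe is built that way
is purely an assumption, and extract_txt_from_exe is a hypothetical
helper name used only for illustration.

```python
import zipfile


def extract_txt_from_exe(exe_path, dest_dir):
    """Try to treat exe_path as a zip archive with a program stub in front.

    Returns the list of .txt member names that were extracted, or None
    if no zip archive could be found in the file (for example, because
    the .exe generates the zip on the fly instead of embedding it).
    """
    # is_zipfile() looks for the zip end-of-central-directory record,
    # which sits at the end of the file, so leading non-zip bytes
    # (the executable stub) do not prevent detection.
    if not zipfile.is_zipfile(exe_path):
        return None

    with zipfile.ZipFile(exe_path) as zf:
        # Only pull out the text files; ignore anything else in the archive.
        txt_members = [name for name in zf.namelist()
                       if name.lower().endswith(".txt")]
        zf.extractall(dest_dir, members=txt_members)
        return txt_members


# Hypothetical usage (both paths are made up):
#     names = extract_txt_from_exe("download.exe", "unpacked")
#     if names is None:
#         print("no embedded zip found")
```

If this returns None, that suggests (but does not prove) that the zip is
not simply appended to the .exe, and you are back to complaining or
running the thing.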