Hmm..

When I try the Yahoo Finance URL in my browser, I get a CSV file. When
I try it with wget, I get a 401 Unauthorized response. When I try it in
an incognito window in my browser (so that I am sure I don't have any
authentication cookies), I get a CSV file.

So, probably, Yahoo is sensitive to something in the HTTP headers
-- possibly the User-Agent string.

To see what the browser sends, open a blank tab in Chrome, use its
Inspect option, go to the Network tab, and select "Preserve log" (just
to be safe -- this lets you retain redirects if they happen). Then
fetch the file, click on the request you want to inspect, and look at
its HTTP headers.  Meanwhile, wget has a --header option that lets you
supply specific headers, and a --debug (-d) option that prints the
request it's actually sending. (--save-headers is different: it writes
the server's response headers into the output file.)
https://www.gnu.org/software/wget/manual/html_node/HTTP-Options.html
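
A minimal sketch of that experiment, assuming the User-Agent is what
differs. The function name fetch_like_browser, the WGET override, and
the User-Agent and Accept values are placeholders for illustration, not
anything Yahoo is known to require; copy the real values from the
browser's network tab.

```shell
#!/bin/sh
# Placeholder User-Agent; replace with the string your browser sends.
UA='Mozilla/5.0 (X11; Linux x86_64)'

# Hypothetical helper: fetch a URL with browser-like request headers.
#   $1 = URL, $2 = output file
# Set WGET=echo for a dry run that only prints the arguments.
fetch_like_browser() {
    ${WGET:-wget} --user-agent="$UA" \
        --header='Accept: text/csv,*/*' \
        -O "$2" "$1"
}
```

Calling fetch_like_browser with the download URL (quoted, since it
contains & characters) and an output filename runs the request. Setting
WGET='wget -d' should also print the request wget sends, provided wget
was built with debug support. Whether this is enough to get past the
401 is untested here.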

Anyways, that's where I'd start -- by comparing requests which work
with requests which do not work, guessing about what's different and
working from there.
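
One way to do that comparison mechanically, as a rough sketch: save the
request headers from the browser and from wget into two text files, one
"Name: value" per line, and list the lines that appear only in the
working request. The function name and file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print header lines found in the first file
# (the request that works) but not in the second (the one that fails).
# Lines are compared as exact, whole-line fixed strings.
headers_only_in_first() {
    grep -F -x -v -f "$2" "$1"
}
```

For example, headers_only_in_first browser.txt wget.txt prints the
headers to try adding to wget one at a time via --header. The match is
case-sensitive, so headers that differ only in capitalization will show
up as differences.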

FYI,

-- 
Raul

On Tue, Jan 5, 2021 at 12:37 AM Devon McCormick <devon...@gmail.com> wrote:
>
> This is not really a J question, but has anyone successfully figured out
> how to download something using https protocol?  It's actually more
> complicated than this.  If I want to get, say, the price and volume history
> for Tesla from Yahoo Finance, the "Download" command there generates a
> string like this:
> https://query1.finance.yahoo.com/v7/finance/download/TSLA?period1=1277769600&period2=1609718400&interval=1d&events=history&includeAdjustedClose=true
>
> I used to be able to take this string and substitute into it to download
> not only prices for TSLA but any other stock on Yahoo Finance for which I
> knew the ticker.  Now this sort of thing fails when I use my former method
> which was just invoking "wget" via the "shell" command in J.  Instead I get
> an 800+ byte html file with error messages.  I think the switch to https
> from http is to blame but having a query string instead of a filename may
> also be an issue.
>
> The "wget" method still works for http files like this (to get the famous
> iris data): shell 'wget -O iris.data
> http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'.
> So, I suspect it's probably https that is to blame but I do not have a
> working example of submitting a query string using http so I cannot be
> completely sure other than I think this used to work.
>
> Any suggestions for an automatable way to do this would be welcome.
>
> Thanks,
>
> Devon
> --
>
> Devon McCormick, CFA
>
> Quantitative Consultant
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
