I believe this has to do with a crumb that Yahoo embeds in the session; unless your request includes the crumb, Yahoo rejects the query. This means you need two-step access: one call to set the session information and extract the crumb, and a second call to actually get the data you are interested in.
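For anyone who wants to see the flow outside of J: here is a minimal Python sketch of the same two-step idea. The page fragment and crumb value below are made-up samples, and Yahoo has since changed this endpoint, so treat it as an illustration of the mechanism, not a working downloader.

```python
# Hypothetical sketch of the two-step crumb flow, in Python for
# illustration. The page fragment and crumb below are sample values,
# not live data; a real run would fetch the quote page through
# `opener` so the session cookie lands in the jar.
import re
import urllib.request
import http.cookiejar

# Step 1: a cookie jar keeps the session cookie between the two calls
# (the same job the J code's '-c cookie.txt' / '-b cookie.txt' does).
jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

# Sample of the fragment the history page used to contain:
page = 'junk "CrumbStore":{"crumb":"jZO816Y7CSK"} more junk'
crumb = re.search(r'"CrumbStore":\{"crumb":"([^"]*)"\}', page).group(1)

# Step 2: append the crumb to the download URL and request it with the
# SAME opener, so the cookie set in step 1 accompanies the query.
url = ('https://query1.finance.yahoo.com/v7/finance/download/AAPL'
       '?period1=1543024670&period2=1574560670'
       '&interval=1d&events=history&crumb=' + crumb)
print(crumb)
```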
I had played with this using web/gethttp and have some code to find the crumb. My code is over a year old and I haven't tried it on J902 or above, but I have attached it to see if it helps with your problem.

Tom McGuire
NB. OK, so here is my final code, cleaned up and now working after fixing
NB. the double quote issue (see the second-to-last line of the code):
NB.
NB. Navigating yahoo.com to programmatically get historical stock prices

require 'web/gethttp'
require 'regex'

NB. use the linux date command to create a unix time stamp
epochtime =: 3 : 0
2!:0 'date -jf ''%m/%d/%Y %H:%M:%S %p'' ''',y,' 05:00:00 PM'' ''+%s'''
)

NB. precision functions
ppq =: 9 !: 10  NB. print-precision query
pps =: 9 !: 11  NB. print-precision set
NB. I set the precision to 16 to ensure full printing of the unix timestamps

NB. Conversion of \u00xx escape sequences
HEX =: 16 #. '0123456789abcdef' i. ]
xutf =: 3 : 0
u: HEX tolower 2 }. y
)

crumbstr =: '"CrumbStore":{"crumb":"'

NB. the crumb is on the page with the link for downloading the historical
NB. data. If you call the correct first page, you only need to search for
NB. crumbstr; there will be only one occurrence.
getcrumb =: 3 : 0
NB. find the start index and end index of the crumb
sidx =. (#crumbstr) + ({: I. crumbstr E. y)
sstr =. (sidx + i. 30) { y
eidx =. {. I. '"' E. sstr
NB. using rxapply, convert all \u00xx unicode escape sequences
crumb =. '(\\u[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F])' xutf rxapply (i. eidx) { sstr
)

financeURL =: 'https://finance.yahoo.com/quote/'  NB. e.g. append AAPL/history?p=AAPL
histURL =: 'https://query1.finance.yahoo.com/v7/finance/download/'
NB. histURL needs a ticker symbol followed by:
NB. ?period1=<unixts for p1>&period2=<unixts for p2>&interval=1d&events=history&crumb=<crumbval>
NB.
NB. here is a full-fledged quote request from the website itself for Apple Computer:
NB. https://query1.finance.yahoo.com/v7/finance/download/AAPL?period1=1543024670&period2=1574560670&interval=1d&events=history&crumb=jZO816Y7CSK

gethistorical =: 3 : 0
'symbol d1 d2' =. y
NB. Create the URL of the start page that carries the crumb.
NB. A BASH implementation uses the following format:
NB.   sURL =. financeURL,symbol,'/?p=',symbol
NB. but the link to the download of historical prices is:
sURL =. financeURL,symbol,'/history?p=',symbol
NB. Get the response using gethttp; -c cookie.txt creates a cookie file
res =. '-s -c cookie.txt' gethttp sURL
crumb =. getcrumb res
qstr =. '?period1=',(}: epochtime d1),'&period2=',(}: epochtime d2),'&interval=1d&events=history&crumb=',crumb
URL =. histURL,symbol,qstr
NB. it turns out that to get a file download you need to double quote the
NB. URL; there is a built-in function (dquote) for that in J
res2 =. '-s -b cookie.txt' gethttp dquote URL
res2
)
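About that double quote issue: '&' is a shell metacharacter, so when a URL is handed to a shell command unquoted, the command line is cut at the first '&' and the remaining key=value pairs run as separate background commands. A small Python illustration of the quoting that J's dquote supplies (shlex.quote is a stand-in here; it happens to use single quotes where dquote uses double quotes, but the effect is the same for this URL):

```python
# '&' is a shell metacharacter: passed unquoted to a shell, everything
# after it is treated as a separate background command, so only the
# period1 part of the query reaches the downloader. Quoting keeps the
# whole URL as one shell word.
import shlex

url = ('https://query1.finance.yahoo.com/v7/finance/download/AAPL'
       '?period1=1543024670&period2=1574560670'
       '&interval=1d&events=history&crumb=jZO816Y7CSK')
print(shlex.quote(url))   # the whole URL survives as one shell word
```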
> On Jan 5, 2021, at 12:37 AM, Devon McCormick <devon...@gmail.com> wrote:
>
> This is not really a J question, but has anyone successfully figured out
> how to download something using https protocol? It's actually more
> complicated than this. If I want to get, say, the price and volume history
> for Tesla from Yahoo Finance, the "Download" command there generates a
> string like this:
> https://query1.finance.yahoo.com/v7/finance/download/TSLA?period1=1277769600&period2=1609718400&interval=1d&events=history&includeAdjustedClose=true
>
> I used to be able to take this string and substitute into it to download
> not only prices for TSLA but any other stock on Yahoo Finance for which I
> knew the ticker. Now this sort of thing fails when I use my former method
> which was just invoking "wget" via the "shell" command in J. Instead I get
> an 800+ byte html file with error messages. I think the switch to https
> from http is to blame but having a query string instead of a filename may
> also be an issue.
>
> The "wget" method still works for http files like this (to get the famous
> iris data): shell 'wget -O iris.data
> http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'.
> So, I suspect it's probably https that is to blame but I do not have a
> working example of submitting a query string using http so I cannot be
> completely sure other than I think this used to work.
>
> Any suggestions for an automatable way to do this would be welcome.
>
> Thanks,
>
> Devon
> --
>
> Devon McCormick, CFA
>
> Quantitative Consultant
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
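One note on the period1/period2 numbers in Devon's URL: they are plain Unix timestamps. Here is a hypothetical Python equivalent of my epochtime verb above (the J verb shells out to BSD date in local time; this sketch uses UTC via calendar.timegm so the result is reproducible):

```python
# Hypothetical analogue of the J epochtime verb: turn an 'mm/dd/YYYY'
# date into the Unix timestamp Yahoo expects in period1/period2.
# The J verb uses BSD `date` in local time; calendar.timegm interprets
# the struct_time as UTC, which keeps this example deterministic.
import calendar
import time

def epochtime(d):
    # 5:00 PM on the given day, matching the ' 05:00:00 PM' suffix
    # hard-coded in the J verb
    t = time.strptime(d + ' 17:00:00', '%m/%d/%Y %H:%M:%S')
    return calendar.timegm(t)

print(epochtime('01/05/2021'))
```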