Shell script to get PIDs from schedules (again)

2017-03-14 Thread Peter Scott

I've fixed it.

 http://peterscott.eu/freeScripts/getPids_html

appears to work for TV as well now.

Sorry I can't get these three posts to link.  I'm having trouble using gmail
to reply to the list.

Peter
-- 
email: p.sc...@shu.ac.uk
website: http://peterscott.eu
NB: My mobile is a "not at home" phone; I don't hear or see it at home.

___
get_iplayer mailing list
get_iplayer@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/get_iplayer


Shell script to get PIDs from schedules (again)

2017-03-14 Thread Peter Scott

I spoke too soon!  It works for radio three and four, maybe others but not TV.

Peter
-- 
email: p.sc...@shu.ac.uk
website: http://peterscott.eu
NB: My mobile is a "not at home" phone; I don't hear or see it at home.



Shell script to get PIDs from schedules (again)

2017-03-14 Thread Peter Scott

Amazingly my old screen-scraping script still works!

You can get it here:

 http://peterscott.eu/freeScripts/getPids_html

Peter
-- 
email: p.sc...@shu.ac.uk
website: http://peterscott.eu
NB: My mobile is a "not at home" phone; I don't hear or see it at home.



Re: Shell script to get PIDs from schedules

2014-11-02 Thread Chris Allison
Sorry, google switched back to html without me noticing, resending this.

Peter,

some good ideas there, but there is no need to scrape the web pages
when all the schedule info you could possibly need is available in
xml, json and yaml files at urls of this form:

www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.json
www.bbc.co.uk/radio4extra/programmes/schedules/2014/11/1.json
www.bbc.co.uk/bbcfour/programmes/schedules/last_week.json

etc.

see this page for further info:
http://www.bbc.co.uk/blogs/legacy/radiolabs/2008/05/helping_machines_play_with_pro.shtml

the :outlet part is 'fm' for radio4, 'england' for bbcone and two and
not needed for radio 4 extra/bbc four etc.

hope this helps.

Chris
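
The URL forms above can be sketched as a small shell helper. This is only an illustration built from the three example URLs and the note about the outlet part; the function name schedule_url and its argument order are made up for this sketch, and the http:// prefix and the json default are assumptions.

```shell
# Build an APS schedule URL from its parts (a sketch based on the URL forms
# quoted in this message, not an official description of the API).
# Usage: schedule_url SERVICE OUTLET PERIOD [FORMAT]
# Pass an empty string as OUTLET for services that have none.
schedule_url() {
    service=$1
    outlet=$2
    period=$3
    format=${4:-json}   # assumption: default to json when no format is given
    if [ -n "$outlet" ]; then
        printf 'http://www.bbc.co.uk/%s/programmes/schedules/%s/%s.%s\n' \
            "$service" "$outlet" "$period" "$format"
    else
        printf 'http://www.bbc.co.uk/%s/programmes/schedules/%s.%s\n' \
            "$service" "$period" "$format"
    fi
}

schedule_url radio4 fm this_week
schedule_url radio4extra '' 2014/11/1
schedule_url bbcfour '' last_week xml
```

The three calls print the radio4 and radio4extra URLs listed above, plus the xml variant of the bbcfour one.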


-- 
 _  o  ,   ,
/   |  |  |_| / \_/ \_|  |
\__/ \/ \/  |/ \/  \/  \/|/
(|



Re: Shell script to get PIDs from schedules

2014-11-02 Thread Charles Johnson
On 02/11/14 08:52, Chris Allison wrote:
> Peter,
>
> some good ideas there, but there is no need to scrape the web pages
> when all the schedule info you could possibly need is available in
> xml, json and yaml files at urls of this form:
>
> www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.json
> www.bbc.co.uk/radio4extra/programmes/schedules/2014/11/1.json
> www.bbc.co.uk/bbcfour/programmes/schedules/last_week.json
>
> etc.

Thanks for that, Chris. That first link got me excited enough to start
experimenting with the JSON-parsing utility 'jq'.

A pipeline like the following will produce all the titles, pids and
synopses:

wget -O - http://www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.json \
  | jq '.[] | .[] | .[] | .[] | .programme as $P |
        $P.display_titles.title,$P.short_synopsis,$P.pid'

So, just a 6-line tail with

wget -q -O - http://www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.json \
  | jq '.[] | .[] | .[] | .[] | .programme as $P |
        $P.display_titles.title,$P.short_synopsis,$P.pid' \
  | tail -n 6

will get you the following:


The Film Programme
Director Mike Leigh discusses art and movie-making in his latest film Mr Turner.
b04mgxtq
Something Understood
Mark Tully debates the cultural benefits of classical music with composer James MacMillan.
b04n2fmh


Regards,

Charles




Re: Shell script to get PIDs from schedules

2014-11-02 Thread Roger Bell_West
On Sun, Nov 02, 2014 at 08:52:09AM +0000, Chris Allison wrote:
> some good ideas there, but there is no need to scrape the web pages
> when all the schedule info you could possibly need is available in
> xml, json and yaml files at urls of this form:
>
> www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.json
> www.bbc.co.uk/radio4extra/programmes/schedules/2014/11/1.json
> www.bbc.co.uk/bbcfour/programmes/schedules/last_week.json

Please check the headers you get back from any such request, though:

X-Aps-Deprecation-Notice: APS is soon to be deprecated. It will first
of all cease to be supported on a 24/7 basis, and will then cease
responding entirely. Nitro is the BBC's new API for programme data,
and can provide all the information previously provided by APS. Go
here to read more: http://developer.bbc.co.uk/nitro

It's nice to have for now, but I wouldn't go building any serious
infrastructure on it.

Roger
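
The header check described above could be scripted along these lines. This is a minimal sketch, not anything posted to the list: the function name has_deprecation_notice is made up, and it assumes the headers arrive on standard input one per line, as printed by curl -sI.

```shell
# Succeed (exit status 0) if the HTTP response headers read from stdin
# include the APS deprecation notice.
# Header names are case-insensitive, hence grep -i.
has_deprecation_notice() {
    grep -qi '^x-aps-deprecation-notice:'
}

# Example against a live URL (needs network access, so it is left commented out):
# curl -sI http://www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.json \
#     | has_deprecation_notice && echo "APS feed: deprecated"
```

A script that relies on these feeds could run this check at start-up and print a warning when the notice is present.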



Re: Shell script to get PIDs from schedules

2014-11-02 Thread Terry L. Ridder
Hello

I may have missed something, but where is there any mention of the
www.bbc.co.uk website programme schedules going away?

This will be sorted out.
I am a happy camper: the Sunday morning programs came down as normal and pain
levels are manageable today, so all is good with the world.


Sent from my iPad
terry l. ridder 



Re: Shell script to get PIDs from schedules

2014-11-02 Thread Roger Bell_West
On Sun, Nov 02, 2014 at 09:10:57AM -0600, Terry L. Ridder wrote:
> I may have missed something, but where is there any mention of the
> www.bbc.co.uk website programme schedules going away?

As I said in the mail that you quoted, it's in the HTTP headers when
you request the actual schedules. It's not announced on the web site
itself as far as I know.

(Please don't reply both to the list and to me. I read the list.)

Roger



Re: Shell script to get PIDs from schedules

2014-11-02 Thread Jeremy Nicoll - ml get_iplayer
Terry L. Ridder artisticfo...@gmail.com wrote:

> Hello
>
> I may have missed something, but where is there any mention of the
> www.bbc.co.uk website programme schedules going away?

You've missed this: if a computer program grabs website pages and 'scrapes'
them, which is to say wades through all the rubbish that's there to make the
page look pretty, trying to extract only the data that says what the
tv/radio programmes are, their pids etc... it's

  - complicated
  - slow
  - unreliable because as soon as the BBC alter how the webpages
work, the scraping programs might need altered

So instead, programmers are concentrating on finding resources that contain
data without frills.  The stuff at:

 www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.json

and

 www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.yaml

and

 www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.xml


(those three URLs are the same except for the last .xxx part) all yield data
that's much more immediately useful to programmers.  The first two are nasty
for a human to look at, the third is easier on the eye.  But as someone said
these simpler-to-use files are going to cease to exist; they're 'deprecated'
which is the term programmers use to mean something that works now but soon
won't. 

-- 
Jeremy Nicoll - my opinions are my own.



Re: Shell script to get PIDs from schedules

2014-11-02 Thread artisticforge .
Hello;

checking with wireshark, only the json and xml versions have the warning.

So yes, JSON and XML may go away at any time, but HTML will still be there.

Parsing the HTML will not be as easy as parsing JSON or XML, but it should
still provide the same results. So using the schedules is still a viable option.



-- 
terry l. ridder 



Re: Shell script to get PIDs from schedules

2014-11-02 Thread artisticforge .
hello;

JSON, XML and YAML, all have the following in the header sent by the
server which most people would never see.

HTTP/1.1 200 OK
Server: Apache
Content-Type: application/x-yaml
Access-Control-Allow-Origin: *
X-PAL-Host: pal131.telhc.bbc.co.uk:80
X-UA-Compatible: IE=edge
X-Aps-Deprecation-Notice: APS is soon to be deprecated. It will first
of all cease to be supported on a 24/7 basis, and will then cease
responding entirely. Nitro is the BBC's new API for programme data,
and can provide all the information previously provided by APS. Go
here to read more: http://developer.bbc.co.uk/nitro
Cache-Control: private, max-age=0, no-store
Content-Length: 495441
Date: Sun, 02 Nov 2014 16:17:13 GMT
Connection: keep-alive
X-Cache-Action: MISS
X-Cache-Age: 0
Vary: X-CDN,Accept-Encoding

Basically, JSON, XML and YAML may disappear at any time. We would then be
left in the same position in which we have recently found ourselves.

So one viable long-term option is to start parsing the HTML version of
the programme schedules.

In my opinion it is better to start now than to wait for "the sky is
falling, the sky is falling, we are doomed".


-- 
terry l. ridder 



Re: Shell script to get PIDs from schedules

2014-11-02 Thread Dirk Husemann
this one seems to work:

% curl -v -v http://www.bbc.co.uk/iplayer/js/episode/b04nhkz9

> GET /iplayer/js/episode/b04nhkz9 HTTP/1.1
> User-Agent: curl/7.37.1
> Host: www.bbc.co.uk
> Accept: */*
>
< HTTP/1.1 200 OK
* Server Apache is not blacklisted
< Server: Apache
< Content-Type: application/json
< Etag: 0336180592c132763a48612b843431b3
< X-PAL-Host: pal120.telhc.bbc.co.uk:80
< X-Ua-Compatible: IE=edge
< Content-Length: 224
< Date: Sun, 02 Nov 2014 18:15:34 GMT
< Connection: keep-alive
< X-Cache-Action: MISS
< X-Cache-Age: 0
< Cache-Control: private, max-age=0, must-revalidate
< Vary: X-CDN,Accept-Language,Accept-Encoding
<
{ [data not shown]
100   224  100   224    0     0    858      0 --:--:-- --:--:-- --:--:--   864
* Connection #0 to host www.bbc.co.uk left intact
{"id":"b04nhkz9","title":"The Apprentice: You're Fired","subtitle":"Series 10: Episode 4","synopsis":"Dara O Briain is joined by Radio 1's Matt Edmondson and comedian Romesh Ranganathan.","tleo":"b007qgcl","versions":["HD"]}

no deprecation warning there...




Re: Shell script to get PIDs from schedules

2014-11-02 Thread Sharon Kimble
Thanks for this Charles. With your last command

--8---cut here---start-8---
wget -q -O - http://www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.json 
| jq '.[] | .[] | .[] | .[] | .programme as $P | 
$P.display_titles.title,$P.short_synopsis,$P.pid' | tail - 6

--8---cut here---end---8---

It is failing for me  saying

╭
│parse error: Invalid numeric literal at line 1, column 10
╰

Presumably it's referring to -O, but what should it be, please, to
get it working properly?

Thanks
Sharon.
-- 
A taste of linux = http://www.sharons.org.uk
my git repo = https://bitbucket.org/boudiccas/dots
TGmeds = http://www.tgmeds.org.uk
Debian testing, fluxbox 1.3.5, emacs 24.4.1.0




Re: Shell script to get PIDs from schedules

2014-11-02 Thread Jeremy Nicoll - ml get_iplayer
Dirk Husemann dirk+getipla...@d2h.net wrote:

> this one seems to work:
>
> % curl -v -v http://www.bbc.co.uk/iplayer/js/episode/b04nhkz9
>
> GET /iplayer/js/episode/b04nhkz9 HTTP/1.1

Interesting, but not a schedule.  You already knew the pid...

-- 
Jeremy Nicoll - my opinions are my own.



Re: Shell script to get PIDs from schedules

2014-11-02 Thread Jeremy Nicoll - ml get_iplayer
Sharon Kimble boudic...@skimble.plus.com wrote:

> Thanks for this Charles. With your last command
>
> --8---cut here---start-8---
> wget -q -O -
> http://www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.json | jq '.[]
> | .[] | .[] | .[] | .programme as $P |
> $P.display_titles.title,$P.short_synopsis,$P.pid' | tail - 6
>
> --8---cut here---end---8---
>
> It is failing for me, saying
>
> ╭
> │parse error: Invalid numeric literal at line 1, column 10
> ╰
>
> Presumably it's referring to -O, but what should it be, please, to
> get it working properly?

Are you sure that the  tail - 6 is right?  It looks different from the
OP's | tail -n 6

Though if that's the problem I don't quite see why the parse error would
mention column 10.

Do the separate stages of the command work for you, before pipelining them
together?

-- 
Jeremy Nicoll - my opinions are my own.



Re: Shell script to get PIDs from schedules

2014-11-02 Thread Dirk Husemann

On 2014-11-02 19:49, Jeremy Nicoll - ml get_iplayer wrote:
> Dirk Husemann dirk+getipla...@d2h.net wrote:
>
>> this one seems to work:
>>
>> % curl -v -v http://www.bbc.co.uk/iplayer/js/episode/b04nhkz9
>>
>> GET /iplayer/js/episode/b04nhkz9 HTTP/1.1
>
> Interesting, but not a schedule.  You already knew the pid...

which you can get from the iplayer guide page:
http://www.bbc.co.uk/iplayer/guide/bbc/20141029




Re: Shell script to get PIDs from schedules

2014-11-02 Thread Jeremy Nicoll - ml get_iplayer
Dirk Husemann dirk+getipla...@d2h.net wrote:

> On 2014-11-02 19:49, Jeremy Nicoll - ml get_iplayer wrote:
>
>> Interesting, but not a schedule.  You already knew the pid...
>
> which you can get from the iplayer guide page:
> http://www.bbc.co.uk/iplayer/guide/bbc/20141029

Yes, but the point of the thread is finding ways to extract all this info
from the bbc (to build a searchable list of available programmes, which is
what get_iplayer's radio.cache and tv.cache files were) without scraping
complicated and - potentially - ever-changing internal format webpages.

-- 
Jeremy Nicoll - my opinions are my own.



Re: Shell script to get PIDs from schedules

2014-11-02 Thread Charles Johnson
On 02/11/14 18:47, Sharon Kimble wrote:
> --8---cut here---start-8---
> wget -q -O -
> http://www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.json | jq '.[]
> | .[] | .[] | .[] | .programme as $P |
> $P.display_titles.title,$P.short_synopsis,$P.pid' | tail - 6
>
> --8---cut here---end---8---

Sharon - have a look at your tail command - it's missing an 'n'. Should be

wget -q -O - http://www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.json \
  | jq '.[] | .[] | .[] | .[] | .programme as $P |
        $P.display_titles.title,$P.short_synopsis,$P.pid' \
  | tail -n 6

You might like to try the following, which will produce pipe-delimited
csv, which is like a mini version of the cache index:

wget -q -O - http://www.bbc.co.uk/radio4/programmes/schedules/fm/this_week.json \
  | jq '.[] | .[] | .[] | .[] as $B | $B.programme as $P |
        $P.display_titles.title+"|"+$B.start+"|"+$B.end+"|"+$P.short_synopsis+"|"+$P.pid' \
  | tr -d '"'

Having said all that, it looks like this is redundant really as
http://packages.hedgerows.org.uk/gip/get_iplayer.pl provides a patch
that uses the schedules to build an index. I have symlinked to that pro-tem.

Charles
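
As a rough sketch of how such a pipe-delimited index might be searched: the helper below prints the PID of every line whose title contains a given piece of text. The function name find_pids and the file name radio4.index are made up for illustration, and the field order title|start|end|synopsis|pid is assumed from the jq expression above.

```shell
# Print the PID (last field) of every index line whose title (first field)
# contains the given text, ignoring case.
# Assumes lines of the form title|start|end|synopsis|pid on stdin.
find_pids() {
    awk -F'|' -v want="$1" 'index(tolower($1), tolower(want)) { print $NF }'
}

# Example, with the pipeline output saved to a hypothetical file radio4.index:
# find_pids 'film programme' < radio4.index
```

Taking the last field rather than the fifth means a stray "|" inside a synopsis does not shift the PID. The search text is matched as a plain substring rather than as a regular expression.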



Re: Shell script to get PIDs from schedules

2014-11-02 Thread Dirk Husemann

On 2014-11-02 20:22, Jeremy Nicoll - ml get_iplayer wrote:
> Dirk Husemann dirk+getipla...@d2h.net wrote:
>
>> On 2014-11-02 19:49, Jeremy Nicoll - ml get_iplayer wrote:
>>> Interesting, but not a schedule.  You already knew the pid...
>> which you can get from the iplayer guide page:
>> http://www.bbc.co.uk/iplayer/guide/bbc/20141029
> Yes, but the point of the thread is finding ways to extract all this info
> from the bbc (to build a searchable list of available programmes, which is
> what get_iplayer's radio.cache and tv.cache files were) without scraping
> complicated and - potentially - ever-changing internal format webpages.

exactly, so we should take a look at how the iplayer webapp is working -
the feeds will be gone sooner rather than later, the iplayer webapp is
probably going to stay, so..

% curl -s http://www.bbc.co.uk/iplayer/guide/bbc/20141029 | grep /iplayer/episode

this will return all iplayer hrefs for 29 october, from which we can
extract the PID and then access the JSON info.
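
The PID extraction step might look something like the sketch below. It is an illustration only: the function name extract_pids is made up, it assumes the links have the form /iplayer/episode/<pid> (optionally followed by more path), and it relies on grep -o, which GNU and BSD grep provide.

```shell
# Print the unique PIDs found in /iplayer/episode/<pid> links read from stdin.
# Assumes a PID is a run of lowercase letters and digits, as in b04nhkz9.
extract_pids() {
    grep -oE '/iplayer/episode/[a-z0-9]+' | sed 's#.*/##' | sort -u
}

# Example against the guide page (needs network access, so it is left commented out):
# curl -s http://www.bbc.co.uk/iplayer/guide/bbc/20141029 | extract_pids
```

Each PID printed this way can then be appended to http://www.bbc.co.uk/iplayer/js/episode/ to fetch the per-episode JSON shown earlier in the thread.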



Shell script to get PIDs from schedules

2014-11-01 Thread Peter Scott
Apologies if this arrives twice -- I forgot to send it as plain text the first
time.

I rarely used get_iplayer's searches and I only use Linux and the
command line. I have a script to get the PIDs from the radio4 schedule;
I have modified it to do TV too. It is very crude but someone may find it
useful. You can get it here:
http://www.apxd65.dsl.pipex.com/freeScripts/#getPids
(It is not polished enough for my github page.)
-- 
email: p.sc...@shu.ac.uk
website: http://peterscott.eu
NB: My mobile is a not at home phone; I don't hear or see it at home.
