RE: New radio PIDs, more than 8 characters - "solved"
Vangelis,

Unlocker only reported "Unlock and Delete Failed" for each of the three files. But it didn't fail, because the files were deleted. I have no other diagnostic data.

Once again, many thanks.

Regards
Hugh

---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus

___
get_iplayer mailing list
get_iplayer@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/get_iplayer
Re: New radio PIDs, more than 8 characters - "solved"
On Wed Aug 23 16:44:30 BST 2017, Hugh Reynolds wrote:

> Reboot didn't help
> Uninstall and Reboot didn't help
> iobit-unlocker helped.
>
> Clean install is now working.
>
> Many, many thanks.

You're welcome! I'm glad you're back up and running :-)

For the sake of completeness/closure, were you even able to determine (via Unlocker) the process(es) locking those .exe files even after a system reboot? Doesn't make much sense to me... Perhaps an overzealous antimalware suite?

Using Unlocker is just a workaround; if you do not discover the root cause of your issue and rectify it, then there's a chance you'll revisit the issue in the next GiP update (?) - are you able now to, e.g., send ffmpeg.exe (temporarily) to the Recycle Bin?

Just my 2p, I am not jinxing your future GiP upgrades!

Best wishes,
Vangelis.
RE: New radio PIDs, more than 8 characters - "solved"
Vangelis, All,

Sorry for the Hijack.

Reboot didn't help
Uninstall and Reboot didn't help
iobit-unlocker helped.

Clean install is now working.

Many, many thanks.

Hugh
Re: New radio PIDs, more than 8 characters - "solved"
On 22/08/2017 16:56, Vangelis forthnet wrote:

> On Tue Aug 22 15:52:56 BST 2017, Hugh Reynolds wrote:
>
>> The installer failed in trying to delete perl.exe, atomicparsley.exe and ffmpeg.exe and left me with the PVR manager reporting "Access is denied." Those three files seem to be undeletable. I am a supervisor but even that doesn't seem to give me permission to delete those files. Any ideas what I could do now?
>
> Hi Hugh
>
> First thing is you seem to have "hijacked" a previous thread, because the title of the thread you posted in (New radio PIDs, more than 8 characters - "solved") has nothing to do with your current predicament... You'd better start a new thread titled, e.g., "Windows installation/upgrade problems".
>
> That thing aside, from looking at https://github.com/get-iplayer/get_iplayer_win32/compare/3.01.0...3.02.0 I can't spot any major changes to the Win installer...

FWIW, I'm on Win7 x64 Ultimate and the upgrade for me was totally faultless. I haven't yet upgraded my Win 10 laptop.

> You should begin by supplying some info on
> 1. The WinOS version/architecture you're using
> 2. The GiP version previously used (prior to updating)
>
> First thing I'd suggest is REBOOT your machine. After logging back in, inspect in Task Manager whether any GiP related processes are running (perl.exe, atomicparsley.exe, ffmpeg.exe) and kill them if they are... Try to first uninstall GiP from Control Panel (you may want to back up your GiP user profile found at %USERPROFILE%\.get_iplayer) followed by a reboot, then proceed with a clean re-install of GiP.
>
> Your problem looks as if it's permissions related, or involving locked processes/files. I would recommend the Unlocker utility http://www.iobit.com/en/iobit-unlocker.php which has saved the day for me on numerous occasions, but use it only if you really know what you're doing...

Yes I wondered that, although files can usually be deleted in the Windows folders assuming admin privileges are given. If processes are running and stop deletion, I'd also suggest something like Unlocker, after trying the Task Manager "end process" route. Another issue may be faulty sectors on your hard drive as that can stop files being accessed, deleted or altered, in which case a CHKDSK may be in order.

> Hopefully a reboot will clear your issue; GiP 3.02 was only released a few days ago, if it's some bug in the installer then I'd expect more reports to come forth... At first glance it only looks like something at your end, but little do I know... If all else fails, then maybe report your issue over at the support forum, where, hopefully, it'll be addressed by the maintainer himself

I'd have thought that even by now, had there been problems we'd have heard about them either here or in the support forum.

Alan

PS hope the sun is shining for you in Greece, Vangelis! :-)
Re: New radio PIDs, more than 8 characters - "solved"
On Tue Aug 22 15:52:56 BST 2017, Hugh Reynolds wrote:

> The installer failed in trying to delete perl.exe, atomicparsley.exe and ffmpeg.exe and left me with the PVR manager reporting "Access is denied." Those three files seem to be undeletable. I am a supervisor but even that doesn't seem to give me permission to delete those files. Any ideas what I could do now?

Hi Hugh

First thing is you seem to have "hijacked" a previous thread, because the title of the thread you posted in (New radio PIDs, more than 8 characters - "solved") has nothing to do with your current predicament... You'd better start a new thread titled, e.g., "Windows installation/upgrade problems".

That thing aside, from looking at https://github.com/get-iplayer/get_iplayer_win32/compare/3.01.0...3.02.0 I can't spot any major changes to the Win installer...

You should begin by supplying some info on
1. The WinOS version/architecture you're using
2. The GiP version previously used (prior to updating)

First thing I'd suggest is REBOOT your machine. After logging back in, inspect in Task Manager whether any GiP related processes are running (perl.exe, atomicparsley.exe, ffmpeg.exe) and kill them if they are... Try to first uninstall GiP from Control Panel (you may want to back up your GiP user profile found at %USERPROFILE%\.get_iplayer) followed by a reboot, then proceed with a clean re-install of GiP.

Your problem looks as if it's permissions related, or involving locked processes/files. I would recommend the Unlocker utility http://www.iobit.com/en/iobit-unlocker.php which has saved the day for me on numerous occasions, but use it only if you really know what you're doing...

Hopefully a reboot will clear your issue; GiP 3.02 was only released a few days ago, if it's some bug in the installer then I'd expect more reports to come forth... At first glance it only looks like something at your end, but little do I know...

If all else fails, then maybe report your issue over at the support forum, where, hopefully, it'll be addressed by the maintainer himself

Regards,
Vangelis.
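The suggestion to back up the GiP user profile before uninstalling could be scripted roughly as follows. This is only a sketch in Python and not part of get_iplayer: the helper name `backup_profile` and the timestamped folder naming are invented for illustration. The only detail taken from the message is the profile location, %USERPROFILE%\.get_iplayer.

```python
import shutil
import time
from pathlib import Path


def backup_profile(profile_dir, backup_root):
    """Copy profile_dir into a new timestamped folder under backup_root.

    Hypothetical helper for illustration only; returns the path of the copy.
    """
    profile_dir = Path(profile_dir)
    if not profile_dir.is_dir():
        raise FileNotFoundError(f"no profile directory at {profile_dir}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"get_iplayer-profile-{stamp}"
    # copytree creates missing parent folders and refuses to overwrite an
    # existing target, so an earlier backup is never clobbered.
    shutil.copytree(profile_dir, target)
    return target
```

On Windows it might be called as backup_profile(Path.home() / ".get_iplayer", Path.home() / "gip-backups"), since Path.home() normally resolves to %USERPROFILE% there; the "gip-backups" folder name is again just an example.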
RE: New radio PIDs, more than 8 characters - "solved"
I've just tried the windows upgrade and now I seem to have a totally broken get_iplayer.

The installer failed in trying to delete perl.exe, atomicparsley.exe and ffmpeg.exe and left me with the PVR manager reporting "Access is denied."

Those three files seem to be undeletable. I am a supervisor but even that doesn't seem to give me permission to delete those files.

Any ideas what I could do now?

Regards
Hugh
Re: New radio PIDs, more than 8 characters - "solved"
Hi C. E.,

> for a high-level language, [Perl's] syntax is unnecessarily difficult
> and obscure.

Perl's syntax is heavy on notation, but then notation is powerful compared to the long-hand alternatives, and that's why it's fine in maths, chemistry, and Perl. For the occasional visitor, Python's that way →

Perl uses sigils, a symbol attached to a variable name, but then so does sh(1) that influenced it. Many languages do, Ruby, PHP, ..., though not all the time, e.g. Python has `@foo' to mean it's a decorator function. Perl also uses sigils to indicate the type of an identifier, so `$foo' is a simple scalar variable whereas `@bar' is an indexable list. Similar syntax differences allow literals to be given: `[42, 314, "xyzzy"]' is a list whereas `{May => 10, Hammond => 11}' is a `hash', AKA associative array or dictionary.

Perl is no harder to learn than C or Ruby. They both like notation too, e.g. the «int foo(int, int (*)(void *, char *), void *);» I wrote recently. Perl's easier than PHP because that has far too much duplication, bad design, and corner cases to memorise. C++ is also something to avoid; too large a language and each coder uses a distinct subset. Assembly languages are easy, once you understand a CPU's workings, but RISC ones like ARM are nice to learn compared to the twisty passages of x86.

> The whole point of high-level languages, the reason they were
> invented, was to make programming more human-readable and therefore
> more understandable, but Perl bucks that trend.

I think it was to give more expressive power than assembly language by introducing abstraction, and notation, at the cost of efficiency.

Plenty of Unix programmers with a sh, sed, awk background found Perl straightforward to pick up because it distilled their features into a single language. It was Perl 4 when I learnt it, and a single well-written man page described the language and that's all the documentation there was.

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
Re: New radio PIDs, more than 8 characters - "solved"
Would you two please take this childish spat to private email and save the rest of us from terminal boredom.

On Wed, 16 Aug 2017 17:20:10 +0100, David Cantrell wrote:

> On Wed, Aug 16, 2017 at 04:57:30PM +0100, C E Macfarlane wrote:
>
>> But, since you are obviously spoiling for a fight, why should anyone listen
>> to someone who has confessed to being a part of putting all that massive
>> bloat in BBC web pages
>
> [citation needed]
>
>> presumably therefore you will feel at home
>> bloating my spam folder henceforth. Bye, bye.
>
> Awww, poor baby who can't bear to hear that he's wrong.

-- 
.
Re: New radio PIDs, more than 8 characters - "solved"
On Wed, Aug 16, 2017 at 04:57:30PM +0100, C E Macfarlane wrote:

> But, since you are obviously spoiling for a fight, why should anyone listen
> to someone who has confessed to being a part of putting all that massive
> bloat in BBC web pages

[citation needed]

> presumably therefore you will feel at home
> bloating my spam folder henceforth. Bye, bye.

Awww, poor baby who can't bear to hear that he's wrong.

-- 
David Cantrell | Bourgeois reactionary pig

I think the most difficult moment that anyone could face is seeing their domestic servants, whether maid or drivers, run away
-- Abdul Rahman Al-Sheikh, writing on 25 Jan 2004 at http://www.arabnews.com/node/243486
RE: New radio PIDs, more than 8 characters - "solved"
> On Wed, Aug 16, 2017 at 04:16:33PM +0100, C E Macfarlane wrote:
>
> > as for arcane-ness of language and difficulty in reading
> > it's about on a par with The Bible!
>
> That's what everyone thinks about languages that they are too damned
> lazy to learn.

Laziness doesn't enter into it. The list of languages I've already learnt includes others, such as Assembler (for three different processors), that were much harder to learn than Perl would ever be, but the fact remains that, for a high-level language, its syntax is unnecessarily difficult and obscure. The whole point of high-level languages, the reason they were invented, was to make programming more human-readable and therefore more understandable, but Perl bucks that trend.

But, since you are obviously spoiling for a fight, why should anyone listen to someone who has confessed to being a part of putting all that massive bloat in BBC web pages - presumably therefore you will feel at home bloating my spam folder henceforth. Bye, bye.
Re: New radio PIDs, more than 8 characters - "solved"
On Wed, Aug 16, 2017 at 04:16:33PM +0100, C E Macfarlane wrote:

> as for arcane-ness of language and difficulty in reading
> it's about on a par with The Bible!

That's what everyone thinks about languages that they are too damned lazy to learn.

-- 
David Cantrell | Nth greatest programmer in the world

Erudite is when you make a classical allusion to a feather. Kinky is when you use the whole chicken.
RE: New radio PIDs, more than 8 characters - "solved"
See reply below ...
-- 
www.macfh.co.uk/MacFH.html

> The first two google results for "perl regular expression" not good
> enough for you :-)

I'm sure they would have been, but that wasn't what I searched for. I searched for something more precise.

> BTW, it's Perl or perl, not PERL. Perl is the name of the language,
> perl is the name of the interpreter.

I'd always understood it to be an acronym, but looking it up in response to your post, acronyms have been applied to it, but after the event, and its true derivation seems to have been from a Biblical quote - quite appropriate really, as for arcane-ness of language and difficulty in reading it's about on a par with The Bible!
Re: New radio PIDs, more than 8 characters - "solved"
On Tue, Aug 15, 2017 at 03:06:51PM +0100, C E Macfarlane wrote:

> Yes, I was aware of \b support in some languages, but RE support varies
> across languages, and, knowing this but not being experienced in PERL, I
> checked at least two online sources for PERL REs and could find no evidence
> of support for it.

The first two google results for "perl regular expression" not good enough for you :-)

BTW, it's Perl or perl, not PERL. Perl is the name of the language, perl is the name of the interpreter.

-- 
David Cantrell | Hero of the Information Age

If you have received this email in error, please add some nutmeg and egg whites, whisk, and place in a warm oven for 40 minutes.
RE: New radio PIDs, more than 8 characters - "solved"
More on REs ...
-- 
www.macfh.co.uk/MacFH.html

> > would we both agree with?:
> > \b[bpw][0-9][a-z0-9]{7,13}\b
>
> I think it's
>
>     \b[bpw]\d[b-df-hj-np-tv-z\d]{6,13}\b
>
> to cover the existing ones that are eight long, up to the 15-long
> w172vg029mkl852 that Vangelis mentioned. And we may as well borrow from
> the specification and cut out the vowels rather than allow a-z.

Yes, I agree. I see that my 7 at the end was an actual error, but for the rest of it you're saving some characters in capturing the second character, and you're being more precise in the tail. I think your suggestion should work well, and am happy to agree with it.

> I'd probably put all of it other than the two `\b' into a variable with
> qr//, and then embed that in regexps as needed, adding `()', or `\b',
> etc., back.

... where and as needed. Yes, a sensible approach.

> Out of interest, I've looked at 3.01's get_iplayer for "]0" to see how
> it already uses it.
>
> [...]
>
>     5253 return $1 if $_[0] =~ m{/?([wpb]0[a-z0-9]{6})};
>
> This one has a `w'!

I would cry out "Gawdon Bennet!", but he wouldn't hear me from shaking his head in disbelief. Even after Martin Clark's post giving a tally of them all, the full horror of it doesn't really sink in until you see them all listed together as you have done. It really is a classic example of the need to declare a multiply-used value up front at the top of the programme as a constant or variable, and why this need is the very first item in my programming check-list!
Re: New radio PIDs, more than 8 characters - "solved"
From: Jim web
Sent: Wednesday, August 16, 2017 09:58

> I've not encountered any problem thus far, so currently am asking just for clarification.

Have a look at World Service programmes from Friday onwards.
http://www.bbc.co.uk/programmes/p002w6r2/episodes/downloads
http://www.bbc.co.uk/programmes/p016tl04/broadcasts/2017/08

> 3) If I still need a validation regex string, what should it be and how would I make the change?

See http://lists.infradead.org/pipermail/get_iplayer/2017-August/011020.html or use the podcast if there is one.
Re: New radio PIDs, more than 8 characters - "solved"
Hi C. E.,

> So, yielding to your superior knowledge of PERL, for the sake of
> clarity for the benefit of those who may have had difficulty in
> following the nuances of the argument, or been confused by the
> multiple suggestions, would we both agree with?:
> \b[bpw][0-9][a-z0-9]{7,13}\b

I think it's

    \b[bpw]\d[b-df-hj-np-tv-z\d]{6,13}\b

to cover the existing ones that are eight long, up to the 15-long w172vg029mkl852 that Vangelis mentioned. And we may as well borrow from the specification and cut out the vowels rather than allow a-z.

I'd probably put all of it other than the two `\b' into a variable with qr//, and then embed that in regexps as needed, adding `()', or `\b', etc., back.

Out of interest, I've looked at 3.01's get_iplayer for "]0" to see how it already uses it.

    941 if ( $this->{pid} !~ m{^([pb]0[a-z0-9]{6})$} ) {

$1 doesn't seem to be used afterwards, so the `()' aren't needed.

    3359 if ( $prog->{pid} =~ m{^http.+\/([pb]0[a-z0-9]{6})\/?.*$} ) {

The `/' are unnecessarily backslashed given that m{} is used so the `/' doesn't have special meaning. The `.+' means the last thing to match the PID RE is used. The `/?' makes the terminating slash optional, but this means "http://.../p0abc123def" matches, but $1 ignores the "def". The `.*$' isn't wanted as it's always true.

    4409 $pid = $1 if $prog->{pid} =~ /\/([bp]0[a-z0-9]{6})/

This time the first PID-like thing would be used. Again, a "def" would be ignored.

    4416 if ( $pid !~ /^[bp]0[a-z0-9]{6}$/ ) {
    4521 if ( $pid !~ /^[bp]0[a-z0-9]{6}$/ && $pid !~ /^http/ ) {
    4531 if ( $pid =~ /^[bp]0[a-z0-9]{6}$/ ) {
    4603 if ( $pid =~ /^[bp]0[a-z0-9]{6}$/ ) {
    4686 } elsif ( $prog->{pid} =~ /^[bp]0[a-z0-9]{6}$/ ) {

All the same. Fine.

    5095 if ( $prog->{pid} !~ m{^([pb]0[a-z0-9]{6})$} ) {

"pb" rather than "bp", just for spice. No need to capture.

    5253 return $1 if $_[0] =~ m{/?([wpb]0[a-z0-9]{6})};

This one has a `w'!

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
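The idea of defining the PID pattern once and adding anchors or captures around it where needed can be sketched as below. This is an illustration in Python rather than get_iplayer's Perl (where qr// would play the role of the shared string); the names `PID`, `is_pid` and `find_pid` are hypothetical and do not exist in get_iplayer. The pattern body is the one proposed in the message above.

```python
import re

# Pattern body from the message above: b, p or w, then a digit, then 6 to 13
# consonants or digits, giving 8 to 15 characters in total.
PID = r"[bpw]\d[b-df-hj-np-tv-z\d]{6,13}"

# Defined once, then wrapped differently per use.
PID_IN_TEXT = re.compile(rf"\b({PID})\b")


def is_pid(s):
    """True if the whole string is PID-shaped (hypothetical helper)."""
    return re.fullmatch(PID, s) is not None


def find_pid(text):
    """Return the first PID-shaped word in text, e.g. a URL, or None."""
    m = PID_IN_TEXT.search(text)
    return m.group(1) if m else None


print(is_pid("b08xy0gl"), is_pid("w172vg029mkl852"), is_pid("programmes"))
print(find_pid("http://www.bbc.co.uk/programmes/b08xy0gl"))
```

With a single definition, a future change to the PID format would mean editing one line instead of the ten listed above.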
Re: New radio PIDs, more than 8 characters - "solved"
The discussion prompts some questions on my part:

1) I always used -pid to specify a programme rather than other methods. Does this *have* to have a validation check by regex? I'd assume it doesn't need to parse an entire url because it could just tack the value I give onto the standard parts. Then tell me it can't find something if I mistyped a pid.

2) If it doesn't need to check, is there a way to tell gip not to do so? Thus dodging this problem entirely?

3) If I still need a validation regex string, what should it be and how would I make the change?

I've not encountered any problem thus far, so currently am asking just for clarification.

Jim

-- 
Electronics https://www.st-andrews.ac.uk/~www_pa/Scots_Guide/intro/electron.htm
Armstrong Audio http://www.audiomisc.co.uk/Armstrong/armstrong.html
Audio Misc http://www.audiomisc.co.uk/index.html
Re: New radio PIDs, more than 8 characters - "solved"
On Mon Aug 14 13:19:14 BST 2017, M Clark wrote:

> changing all 7 occurrences (sigh...) of
> [bp]0[a-z0-9]{6}
> to
> (?:[bp]0[a-z0-9]{6}|w[a-z0-9]{7,14})
> solves the w3*, w1* problem for Me.

Hi Martin; that new code still assumes that both Red Bee & PIPs PIDs will have "0" as the second character in the string. I am not saying this is something that will have to be dealt with soon, but I've watched Red Bee PIDs move from "b08*" to "b09*" and, recently, from "b0909***" to "b0910***", e.g. "b0910w0x". If this is a pattern, then I expect strings like "b0999***" to appear in the future; and the next logical (?) step would be strings beginning with "b1**" (in which case the amended code will break..). Pure speculation on my part, though...

I haven't done the maths myself (the number of possible 7-character alphanumeric strings), but this is supposed to be a huge integer; still, as PIDs are unique (can't be recycled), each linked to a specific audio-visual offering from the beeb, that huge number is bound to be exhausted sometime...

Regards,
Vangelis.
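The sum left undone above is quick to work out. The sketch below assumes, purely for illustration, that each of the 7 positions can independently be any of a-z or 0-9; whether the BBC really allocates PIDs that way is not stated in this thread. Because characters can repeat, the count is a power rather than a permutation. The second figure uses a vowel-free alphabet of 21 consonants plus 10 digits, which is also only an assumption here.

```python
ALNUM = 26 + 10      # a-z plus 0-9
NO_VOWELS = 21 + 10  # a-z without a, e, i, o, u, plus 0-9


def count_strings(alphabet_size, length):
    """Number of distinct strings of the given length over the alphabet."""
    return alphabet_size ** length


print(f"{count_strings(ALNUM, 7):,}")      # all letters and digits
print(f"{count_strings(NO_VOWELS, 7):,}")  # consonants and digits only
```

Under those assumptions the totals are 78,364,164,096 and 27,512,614,111 respectively, so tens of thousands of millions of identifiers per leading letter.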
RE: New radio PIDs, more than 8 characters - "solved"
More on REs ...
-- 
www.macfh.co.uk/MacFH.html

> > Yes, I was aware of \b support in some languages, but RE support
> > varies across languages, and, knowing this but not being experienced
> > in PERL, I checked at least two online sources for PERL REs and could
> > find no evidence of support for it.
>
> One is http://perldoc.perl.org/perlre.html#Assertions

Obviously my brief research was too brief!

So, yielding to your superior knowledge of PERL, for the sake of clarity for the benefit of those who may have had difficulty in following the nuances of the argument, or been confused by the multiple suggestions, would we both agree with?:

    \b[bpw][0-9][a-z0-9]{7,13}\b

> It was Perl that invented `\b', along with many of the other conventions
> that spread to other implementations, e.g. `\d' for digit, a `?' suffix
> for non-greedy as in /<.*?>/, the otherwise invalid `?' after an open
> parenthesis as a gateway for further flags like the non-capturing `:' in
> /(.)(?:.)(.)/, etc. Larry Wall was very knowledgable of the Unix
> programming environment, including the various regular expression
> syntaxes in sed, grep, egrep, ..., and came up with a consistent
> almost-superset that had some nice conveniences too.

As it happens I've been doing some Bash scripting over the last week or so.

> > True, but if that is starting to happen, then one of the other 'rules'
> > was to break a monolithic program into blocks
>
> Alas, AFAIK, get_iplayer wishes to ship as a single file.

You can still break a single file down into blocks, both by using subroutines/functions or even just by appropriate layout and commenting, and in both cases individually testing that the resulting sections do what is expected of them.
Re: New radio PIDs, more than 8 characters - "solved"
Hi C. E.,

> Yes, I was aware of \b support in some languages, but RE support
> varies across languages, and, knowing this but not being experienced
> in PERL, I checked at least two online sources for PERL REs and could
> find no evidence of support for it.

One is http://perldoc.perl.org/perlre.html#Assertions

It was Perl that invented `\b', along with many of the other conventions that spread to other implementations, e.g. `\d' for digit, a `?' suffix for non-greedy as in /<.*?>/, the otherwise invalid `?' after an open parenthesis as a gateway for further flags like the non-capturing `:' in /(.)(?:.)(.)/, etc. Larry Wall was very knowledgeable about the Unix programming environment, including the various regular expression syntaxes in sed, grep, egrep, ..., and came up with a consistent almost-superset that had some nice conveniences too.

> True, but if that is starting to happen, then one of the other 'rules'
> was to break a monolithic program into blocks

Alas, AFAIK, get_iplayer wishes to ship as a single file.

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
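Two of the conventions mentioned above, the non-greedy `?' suffix and the non-capturing `(?:' group, can be tried out in any engine that adopted them. The sketch below uses Python's `re` module as a stand-in for Perl; the sample strings are made up for illustration.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .* runs on to the last '>' in the string, giving one long match.
greedy = re.findall(r"<.*>", html)

# Non-greedy: .*? stops at the first '>' it can, giving one match per tag.
lazy = re.findall(r"<.*?>", html)

# (?:.) consumes the middle character but creates no numbered group.
groups = re.match(r"(.)(?:.)(.)", "abc").groups()

print(greedy)
print(lazy)
print(groups)
```

The greedy form returns the whole input as a single match, the non-greedy form returns the four tags separately, and the last pattern yields only two groups, 'a' and 'c'.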
RE: New radio PIDs, more than 8 characters - "solved"
More about Regular Expressions (REs), which can be safely ignored by those not interested ...
-- 
www.macfh.co.uk/MacFH.html

> > it might be necessary to bracket it at the beginning and end with
> > non-capturing non-word meta- or pseudo-characters
>
> Rather than \W, representing a single non-word character, \b would be
> better, meaning a zero-width boundary between a word, \w, and non-word,
> \W, character, or the start or end of the string. /\Wfoo\W/ matches
> ":foo:", but not "foo", but /\bfoo\b/ matches both. The zero-width
> means it consumes nothing; the test is of the character either side.

Yes, I was aware of \b support in some languages, but RE support varies across languages, and, knowing this but not being experienced in PERL, I checked at least two online sources for PERL REs and could find no evidence of support for it. If you're a PERL programmer and therefore know that it works, I concede to your superior knowledge.

RE support has varied from none to very complete with every language I've programmed, which includes Assembler, Bash, BASIC, C, COBOL, SQL, and Python, but these days I'm more used to HTML and JavaScript, and before that I was doing quite a lot of work in Java, and, despite their similar sounding names and similarities in basic syntax, the latter two are very different in many, perhaps most, other respects, including REs. So REs are regularly one of those areas where I find myself having to refer back to manuals and API documentation in particular cases, and at one point in one particular case I got so frustrated with the complexity of the REs I was trying to develop that I spent some time creating a JS RE test page to help develop the code. Ironically, the JS RE Test Page has blossomed into being quite a successful page on my site, but the original page that caused it to be written is still 'under development'!-)

> > Pretty much the first item in that list was to declare constants at
> > the beginning of the program containing all the fixed or semi-fixed
> > values that the program needed
>
> Though that can put them a long way from their use, removing context
> from their definition and requiring it to be put back into their
> identifier instead.

True, but if that is starting to happen, then one of the other 'rules' was to break a monolithic program into blocks - each of which has one particular purpose, which testing has shown it to do well and without error - and then build the program up by combining such blocks; because the individual components are known to work, the wider program built from them is likely to be more reliable.

Regards.
Re: New radio PIDs, more than 8 characters - "solved"
Hi C. E.,

> it might be necessary to bracket it at the beginning and end with
> non-capturing non-word meta- or pseudo-characters

Rather than \W, representing a single non-word character, \b would be better, meaning a zero-width boundary between a word, \w, and non-word, \W, character, or the start or end of the string. /\Wfoo\W/ matches ":foo:", but not "foo", but /\bfoo\b/ matches both. The zero-width means it consumes nothing; the test is of the character either side.

> pseudo code
> if --url
> strip characters following last /
> use as pid
...
> ... particularly as URLs exist with other characters after the PID,
> though perhaps these might not be used in the context of GiP.

Yes, it's never that simple. :-) URLs have a defined structure and encoding rules, and there's query parameters and fragments to consider.

> Pretty much the first item in that list was to declare constants at
> the beginning of the program containing all the fixed or semi-fixed
> values that the program needed

Though that can put them a long way from their use, removing context from their definition and requiring it to be put back into their identifier instead.

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
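The difference between \W and \b described above is easy to check. This is a small sketch using Python's `re` module, which treats both escapes the same way for these inputs; the helper name `matches` is made up for the example.

```python
import re


def matches(pattern, text):
    """True if pattern is found anywhere in text (illustrative helper)."""
    return re.search(pattern, text) is not None


for text in (":foo:", "foo", "food"):
    print(repr(text),
          "\\Wfoo\\W:", matches(r"\Wfoo\W", text),
          "\\bfoo\\b:", matches(r"\bfoo\b", text))

# \W consumes the neighbouring characters; \b only tests the position.
print(re.search(r"\Wfoo\W", ":foo:").group())
print(re.search(r"\bfoo\b", ":foo:").group())
```

The last two lines show the zero-width point: the \W version matches ":foo:" including the colons, while the \b version matches just "foo".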
RE: New radio PIDs, more than 8 characters - "solved"
More about regular expressions and programming follows, which those with no or little interest in either can safely ignore ... -- www.macfh.co.uk/MacFH.html > > I think what Charles was meaning is that if you were using > --url "http://www.bbc.co.uk/programmes/b08xy0gl"; rather than > a direct PID then the code is looking for something starting > with either b, p or w followed by between 7 and 14 letters or > numbers and the first thing it hits that matches all that > criteria is the word "programmes". Like you say, GiP wouldn't > return any VPID info but as it finds programmes to be a valid > PID, it won't keep looking for the proper PID in that URL so > would never be able to download from a URL. Yes, it depends very much on the intended use for the regular expresion (RE). The most general situation is trawling any text, such as HTML, WITHOUT REGARD TO CONTEXT for capturing something resembling a PID. In this situation, probably even the correction I suggested may not be adequate, it might be necessary to bracket it at the beginning and end with non-capturing non-word meta- or pseudo-characters, the representation of which can sometimes differ from language to language but is usually \W, as it is in PERL, so ... \W([bpw][0-9][a-z0-9]{7,13})\W ... should capture PIDs reasonably accurately without regard to context, though I wouldn't rely even on this without a deal of testing with many actual examples of text to be trawled. However, if you already know something about the context, then of course that makes things easier. The correction I suggested should pick PIDs out of URLs more elegantly and simply, in a single statement in fact, than either the original suggestion or programming to implement the following pseudo-code ... > ? > pseudo code > if --url > strip characters following last / > use as pid > validate_pid > end-if > ? ... particularly as URLs exist with other characters after the PID, though perhaps these might not be used in the context of GiP. 
> Anyway...
> changing all 7 occurrences :-(
> (sigh...)

I think in my case that would more likely have been '(expletive deleted)'!

> of
> [bp]0[a-z0-9]{6}
> to
> (?:[bp]0[a-z0-9]{6}|w[a-z0-9]{7,14})
> solves the w3*, w1* problem for Me.
> Also. No disrespect intended to Dinkypumpkin as "he's" only picked-up
> existing code but, as an ex-programmer I'm horrified by the code
> repetition. Doesn't Perl allow 'functions'? i.e. if valid_pid ...
> where valid_pid contains said validation.

Yes, grateful though I am, probably along with all of us here, for GiP's wonderfully useful functionality, when I first looked at its code I rejected any idea of contributing many actual programming suggestions, because I'd feel I had to completely rewrite the program rather than just tinker with it!

I can't remember where now, whether it was from a book, or a 6th form college or university course, but somewhere somehow I acquired a mental list of very basic things to get right when programming ...

Pretty much the first item in that list was to declare constants at the beginning of the program containing all the fixed or semi-fixed values that the program needed, so that if one of them changed, you only had to change the one easily-found line at the beginning where the value was declared, not the possibly tens, hundreds, even thousands of lines throughout the rest of the program where that value was used. A template for BBC URLs and an RE to capture PIDs would both obviously be prime examples of this.

As you suggest, another, probably second or third on the list, was to put oft-repeated code in subroutines/functions.

When I got out into the 'real' world, I was appalled to find that code that disregarded most or all of the principles outlined in my mental list was actually widespread, perhaps even in the majority! I sometimes think it's a near miracle that some programs ever run correctly at all!

Regards.
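
Both principles can be sketched in a few lines of Perl. Everything named here ($PID_RE, $URL_TEMPLATE, valid_pid, programme_url) is hypothetical and does not come from get_iplayer's source, and the pattern assumes PIDs of 8 to 15 characters that begin with b, p or w followed by a digit.

```perl
use strict;
use warnings;

# Fixed or semi-fixed values, declared once at the top of the program.
# Both values are illustrative assumptions, not get_iplayer's own.
my $PID_RE       = qr/[bpw][0-9][a-z0-9]{6,13}/;
my $URL_TEMPLATE = 'http://www.bbc.co.uk/programmes/%s';

# The oft-repeated check lives in one subroutine instead of being
# pasted wherever a PID needs validating.
sub valid_pid {
    my ($candidate) = @_;
    return 0 unless defined $candidate;
    return $candidate =~ /\A$PID_RE\z/ ? 1 : 0;
}

# Build a programme URL from a PID, refusing anything that is not one.
sub programme_url {
    my ($pid) = @_;
    die "not a PID: $pid\n" unless valid_pid($pid);
    return sprintf $URL_TEMPLATE, $pid;
}

print programme_url('w3csvnyc'), "\n";
```

If the PID format changes again, only the line declaring $PID_RE needs editing, which is the point of the first item on the list.
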
Re: New radio PIDs, more than 8 characters - "solved"
Hi M,

> I'm horrified by the code repetition. Doesn't Perl allow 'functions'?

Yes, those are the `sub foo { ... }' blocks you see. It can also hold a regexp in a variable, so a `$pid_regexp' could be defined once and used repeatedly.

    $ perl -e '
    > $re = qr/^(food|drink|famine)\d*$/;
    > while (<>) {
    >     /$re/ and print "$. $_";
    > }
    > '
    abc
    food
    2 food
    drink42
    3 drink42
    xyz
    $

BTW, given your private email, you might be interested to know that Regular Expressions, of which regexps are an extension, are essentially a "little language" for describing a regular grammar, level 3 in Chomsky's hierarchy. These are grammars that can be matched with a finite-state automaton, and implementations are either non-deterministic, like Perl's, or deterministic, like Go's.

As such, they're a succinct way of expressing many text matching problems, just as BNF is a convenient method for programming language grammars. It's interesting to compare the simple one above to the alternative long-hand imperative programming form.

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy
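
For comparison, one possible long-hand version is sketched below in Perl. The subroutine name matches_longhand is invented for this illustration, and the digit test assumes plain ASCII input; it is one way of writing the check out by hand, not the only one.

```perl
use strict;
use warnings;

# The regexp form, as in the one-liner.
my $re = qr/^(food|drink|famine)\d*$/;

# The same test written out by hand: try each word as a prefix, then
# insist that whatever remains is made up of digits only.
sub matches_longhand {
    my ($line) = @_;
    chomp $line;    # like $, tolerate a single trailing newline
    for my $word ('food', 'drink', 'famine') {
        my $len = length $word;
        next unless substr($line, 0, $len) eq $word;
        my $rest       = substr($line, $len);
        my $all_digits = 1;
        for my $ch (split //, $rest) {
            $all_digits = 0 if $ch lt '0' || $ch gt '9';
        }
        return 1 if $all_digits;
    }
    return 0;
}

# Print both verdicts side by side for the sample input lines.
for my $line ("abc\n", "food\n", "drink42\n", "xyz\n") {
    my $by_regexp   = $line =~ $re ? 1 : 0;
    my $by_longhand = matches_longhand($line);
    print "$by_regexp $by_longhand $line";
}
```

The two columns agree on every sample line, but the hand-written form takes a dozen lines to say what the regexp says in one.
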
Re: New radio PIDs, more than 8 characters - "solved"
Aside from tellyaddict's caveat (others welcome);

snip
> I think what Charles was meaning is that if you were using --url
> "http://www.bbc.co.uk/programmes/b08xy0gl" rather than a direct PID then the
> code is looking for something starting with either b, p or w followed by
> between 7 and 14 letters or numbers and the first thing it hits that matches
> all that criteria is the word "programmes". Like you say, GiP wouldn't return
> any VPID info but as it finds programmes to be a valid PID, it won't keep
> looking for the proper PID in that URL so would never be able to download
> from a URL.

pseudo code
    if --url
        strip characters following last /
        use as pid
        validate_pid
    end-if

Anyway...
changing all 7 occurrences (sigh...) of
    [bp]0[a-z0-9]{6}
to
    (?:[bp]0[a-z0-9]{6}|w[a-z0-9]{7,14})
solves the w3*, w1* problem for me.

I only use WebPVR and the command line with explicit pids. A re-run of WebPVR successfully downloaded all my outstanding programmes. I then refreshed the (radio) cache, made selections (b*, p* & w*) and everything downloaded fine.

Big thanks to Vangelis & Ralph.

FWIW I usually refresh the radio cache daily, make selections and then run WebPVR (or list on the command line). I first noticed a w* pid on the 5th August (Music Extra: The Music of Time - The Music of Time – Cuba (w3csvnyc)). Thought it was a one-off. It was until the 12th, when a slew of w3* and w1* appeared.

Can't comment on the BBC's Pid.php;
https://github.com/bbc/programmes-pages-service/blob/master/src/Domain/ValueObject/Pid.php#l14

Also. No disrespect intended to Dinkypumpkin as "he's" only picked-up existing code but, as an ex-programmer, I'm horrified by the code repetition. Doesn't Perl allow 'functions'? i.e. if valid_pid ... where valid_pid contains said validation.

Regards, Martin
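
The pseudo-code and the widened pattern can be combined in a short Perl sketch. This is an illustration under stated assumptions rather than a patch for get_iplayer: the names $pid_re and pid_from_url are invented here, and the sample URLs are built from PIDs mentioned in this thread.

```perl
use strict;
use warnings;

# The widened pattern from the message above, held in one place.
my $pid_re = qr/(?:[bp]0[a-z0-9]{6}|w[a-z0-9]{7,14})/;

# The pseudo-code: keep what follows the last "/" and validate it.
# Returns the PID, or undef if the last segment is not one.
sub pid_from_url {
    my ($url) = @_;
    my ($last) = $url =~ m{([^/]*)\z};
    return $last =~ /\A$pid_re\z/ ? $last : undef;
}

for my $url ('http://www.bbc.co.uk/programmes/b08xy0gl',
             'http://www.bbc.co.uk/programmes/w3csvnyc',
             'http://www.bbc.co.uk/programmes') {
    my $pid   = pid_from_url($url);
    my $shown = defined $pid ? $pid : 'no PID';
    print "$url -> $shown\n";
}
```

Because the whole last segment must match, "programmes" is rejected rather than mistaken for a PID. The same strictness means a URL with anything after the PID also yields no PID, so this approach only suits URLs that end in the PID.
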