Script 'mail_helper' called by obssrc

Hello community,

here is the log from the commit of package youtube-dl for openSUSE:Factory checked in at 2021-05-20 19:25:32
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/youtube-dl (Old)
 and      /work/SRC/openSUSE:Factory/.youtube-dl.new.2988 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Package is "youtube-dl" Thu May 20 19:25:32 2021 rev:166 rq:894590 version:2021.05.16 Changes: -------- --- /work/SRC/openSUSE:Factory/youtube-dl/python-youtube-dl.changes 2021-04-17 23:24:53.237586092 +0200 +++ /work/SRC/openSUSE:Factory/.youtube-dl.new.2988/python-youtube-dl.changes 2021-05-20 19:26:02.621700509 +0200 @@ -1,0 +2,6 @@ +Thu May 20 09:51:10 UTC 2021 - Jan Engelhardt <[email protected]> + +- Update to release 2021.05.16 + * Add support for sibnet embeds + +------------------------------------------------------------------- youtube-dl.changes: same change Old: ---- youtube-dl-2021.04.17.tar.gz youtube-dl-2021.04.17.tar.gz.sig New: ---- youtube-dl-2021.05.16.tar.gz youtube-dl-2021.05.16.tar.gz.sig ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Other differences: ------------------ ++++++ python-youtube-dl.spec ++++++ --- /var/tmp/diff_new_pack.AclAB4/_old 2021-05-20 19:26:03.433697178 +0200 +++ /var/tmp/diff_new_pack.AclAB4/_new 2021-05-20 19:26:03.437697162 +0200 @@ -19,7 +19,7 @@ %define modname youtube-dl %{?!python_module:%define python_module() python-%{**} python3-%{**}} Name: python-youtube-dl -Version: 2021.04.17 +Version: 2021.05.16 Release: 0 Summary: A Python module for downloading from video sites for offline watching License: CC-BY-SA-3.0 AND SUSE-Public-Domain ++++++ youtube-dl.spec ++++++ --- /var/tmp/diff_new_pack.AclAB4/_old 2021-05-20 19:26:03.461697064 +0200 +++ /var/tmp/diff_new_pack.AclAB4/_new 2021-05-20 19:26:03.461697064 +0200 @@ -17,7 +17,7 @@ Name: youtube-dl -Version: 2021.04.17 +Version: 2021.05.16 Release: 0 Summary: A tool for downloading from video sites for offline watching License: CC-BY-SA-3.0 AND SUSE-Public-Domain ++++++ youtube-dl-2021.04.17.tar.gz -> youtube-dl-2021.05.16.tar.gz ++++++ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/ChangeLog new/youtube-dl/ChangeLog --- old/youtube-dl/ChangeLog 2021-04-16 22:50:06.000000000 +0200 +++ new/youtube-dl/ChangeLog 2021-05-16 17:54:58.000000000 +0200 @@ -1,3 +1,47 @@ +version 2021.05.16 + +Core +* [options] Fix thumbnail option group name (#29042) +* [YoutubeDL] Improve extract_info doc (#28946) + +Extractors ++ [playstuff] Add support for play.stuff.co.nz (#28901, #28931) +* [eroprofile] Fix extraction (#23200, #23626, #29008) ++ [vivo] Add support for vivo.st (#29009) ++ [generic] Add support for og:audio (#28311, #29015) +* [phoenix] Fix extraction (#29057) ++ [generic] Add support for sibnet embeds ++ [vk] Add support for sibnet embeds (#9500) ++ [generic] Add Referer header for direct videojs download URLs (#2879, + #20217, #29053) +* [orf:radio] Switch download URLs to HTTPS (#29012, #29046) +- [blinkx] Remove extractor (#28941) +* [medaltv] Relax URL regular expression (#28884) ++ [funimation] Add support for optional lang code in URLs (#28950) ++ [gdcvault] Add support for HTML5 videos +* [dispeak] Improve FLV extraction (#13513, #28970) +* [kaltura] Improve iframe extraction (#28969) +* [kaltura] Make embed code alternatives actually work +* [cda] Improve extraction (#28709, #28937) +* [twitter] Improve formats extraction from vmap URL (#28909) +* [xtube] Fix formats extraction (#28870) +* [svtplay] Improve extraction (#28507, #28876) +* [tv2dk] Fix extraction (#28888) + + +version 2021.04.26 + +Extractors ++ [xfileshare] Add support for wolfstream.tv (#28858) +* [francetvinfo] Improve video id extraction (#28792) +* [medaltv] Fix extraction (#28807) +* [tver] Redirect all downloads to Brightcove (#28849) 
+* [go] Improve video id extraction (#25207, #25216, #26058) +* [youtube] Fix lazy extractors (#28780) ++ [bbc] Extract description and timestamp from __INITIAL_DATA__ (#28774) +* [cbsnews] Fix extraction for python <3.6 (#23359) + + version 2021.04.17 Core diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/README.md new/youtube-dl/README.md --- old/youtube-dl/README.md 2021-04-16 22:50:08.000000000 +0200 +++ new/youtube-dl/README.md 2021-05-16 17:55:04.000000000 +0200 @@ -287,7 +287,7 @@ --no-cache-dir Disable filesystem caching --rm-cache-dir Delete all filesystem cache files -## Thumbnail images: +## Thumbnail Options: --write-thumbnail Write thumbnail image to disk --write-all-thumbnails Write all thumbnail image formats to disk diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/README.txt new/youtube-dl/README.txt --- old/youtube-dl/README.txt 2021-04-16 22:50:29.000000000 +0200 +++ new/youtube-dl/README.txt 2021-05-16 17:55:39.000000000 +0200 @@ -318,7 +318,7 @@ --rm-cache-dir Delete all filesystem cache files -Thumbnail images: +Thumbnail Options: --write-thumbnail Write thumbnail image to disk --write-all-thumbnails Write all thumbnail image formats to diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/docs/supportedsites.md new/youtube-dl/docs/supportedsites.md --- old/youtube-dl/docs/supportedsites.md 2021-04-16 22:50:09.000000000 +0200 +++ new/youtube-dl/docs/supportedsites.md 2021-05-16 17:55:05.000000000 +0200 @@ -119,7 +119,6 @@ - **BitChuteChannel** - **BleacherReport** - **BleacherReportCMS** - - **blinkx** - **Bloomberg** - **BokeCC** - **BongaCams** @@ -713,6 +712,7 @@ - **play.fm** - **player.sky.it** - **PlayPlusTV** + - **PlayStuff** - **PlaysTV** - **Playtvak**: Playtvak.cz, iDNES.cz and Lidovky.cz - **Playvid** @@ -1162,7 +1162,7 @@ - **WWE** - **XBef** - **XboxClips** - - **XFileShare**: XFileShare based sites: Aparat, ClipWatching, GoUnlimited, GoVid, HolaVid, Streamty, TheVideoBee, Uqload, VidBom, vidlo, VidLocker, VidShare, VUp, XVideoSharing + - **XFileShare**: XFileShare based sites: Aparat, ClipWatching, GoUnlimited, GoVid, HolaVid, Streamty, TheVideoBee, Uqload, VidBom, vidlo, VidLocker, VidShare, VUp, WolfStream, XVideoSharing - **XHamster** - **XHamsterEmbed** - **XHamsterUser** diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/test/test_all_urls.py new/youtube-dl/test/test_all_urls.py --- old/youtube-dl/test/test_all_urls.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/test/test_all_urls.py 2021-05-01 13:57:05.000000000 +0200 @@ -70,15 +70,6 @@ # self.assertMatch('http://www.youtube.com/results?search_query=making+mustard', ['youtube:search_url']) # self.assertMatch('https://www.youtube.com/results?baz=bar&search_query=youtube-dl+test+video&filters=video&lclk=video', ['youtube:search_url']) - def test_youtube_extract(self): - assertExtractId = lambda url, id: self.assertEqual(YoutubeIE.extract_id(url), id) - assertExtractId('http://www.youtube.com/watch?&v=BaW_jenozKc', 'BaW_jenozKc') - assertExtractId('https://www.youtube.com/watch?&v=BaW_jenozKc', 'BaW_jenozKc') - assertExtractId('https://www.youtube.com/watch?feature=player_embedded&v=BaW_jenozKc', 'BaW_jenozKc') - assertExtractId('https://www.youtube.com/watch_popup?v=BaW_jenozKc', 'BaW_jenozKc') - assertExtractId('http://www.youtube.com/watch?v=BaW_jenozKcsharePLED17F32AD9753930', 
'BaW_jenozKc') - assertExtractId('BaW_jenozKc', 'BaW_jenozKc') - def test_facebook_matching(self): self.assertTrue(FacebookIE.suitable('https://www.facebook.com/Shiniknoh#!/photo.php?v=10153317450565268')) self.assertTrue(FacebookIE.suitable('https://www.facebook.com/cindyweather?fref=ts#!/photo.php?v=10152183998945793')) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/test/test_execution.py new/youtube-dl/test/test_execution.py --- old/youtube-dl/test/test_execution.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/test/test_execution.py 2021-05-01 13:57:05.000000000 +0200 @@ -39,6 +39,16 @@ _, stderr = p.communicate() self.assertFalse(stderr) + def test_lazy_extractors(self): + try: + subprocess.check_call([sys.executable, 'devscripts/make_lazy_extractors.py', 'youtube_dl/extractor/lazy_extractors.py'], cwd=rootDir, stdout=_DEV_NULL) + subprocess.check_call([sys.executable, 'test/test_all_urls.py'], cwd=rootDir, stdout=_DEV_NULL) + finally: + try: + os.remove('youtube_dl/extractor/lazy_extractors.py') + except (IOError, OSError): + pass + if __name__ == '__main__': unittest.main() diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/test/test_youtube_misc.py new/youtube-dl/test/test_youtube_misc.py --- old/youtube-dl/test/test_youtube_misc.py 1970-01-01 01:00:00.000000000 +0100 +++ new/youtube-dl/test/test_youtube_misc.py 2021-05-01 13:57:05.000000000 +0200 @@ -0,0 +1,26 @@ +#!/usr/bin/env python +from __future__ import unicode_literals + +# Allow direct execution +import os +import sys +import unittest +sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__)))) + + +from youtube_dl.extractor import YoutubeIE + + +class TestYoutubeMisc(unittest.TestCase): + def test_youtube_extract(self): + assertExtractId = lambda url, id: self.assertEqual(YoutubeIE.extract_id(url), id) + assertExtractId('http://www.youtube.com/watch?&v=BaW_jenozKc', 'BaW_jenozKc') + assertExtractId('https://www.youtube.com/watch?&v=BaW_jenozKc', 'BaW_jenozKc') + assertExtractId('https://www.youtube.com/watch?feature=player_embedded&v=BaW_jenozKc', 'BaW_jenozKc') + assertExtractId('https://www.youtube.com/watch_popup?v=BaW_jenozKc', 'BaW_jenozKc') + assertExtractId('http://www.youtube.com/watch?v=BaW_jenozKcsharePLED17F32AD9753930', 'BaW_jenozKc') + assertExtractId('BaW_jenozKc', 'BaW_jenozKc') + + +if __name__ == '__main__': + unittest.main() Binary files old/youtube-dl/youtube-dl and new/youtube-dl/youtube-dl differ diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube-dl.1 new/youtube-dl/youtube-dl.1 --- old/youtube-dl/youtube-dl.1 2021-04-16 22:50:30.000000000 +0200 +++ new/youtube-dl/youtube-dl.1 2021-05-16 17:55:40.000000000 +0200 @@ -495,7 +495,7 @@ Delete all filesystem cache files .RS .RE -.SS Thumbnail images: +.SS Thumbnail Options: .TP .B \-\-write\-thumbnail Write thumbnail image to disk diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/YoutubeDL.py new/youtube-dl/youtube_dl/YoutubeDL.py --- old/youtube-dl/youtube_dl/YoutubeDL.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/YoutubeDL.py 2021-05-01 13:57:12.000000000 +0200 @@ -773,11 +773,20 @@ def extract_info(self, url, download=True, ie_key=None, extra_info={}, process=True, force_generic_extractor=False): - ''' - Returns a list with a dictionary for each video we find. 
- If 'download', also downloads the videos. - extra_info is a dict containing the extra values to add to each result - ''' + """ + Return a list with a dictionary for each video extracted. + + Arguments: + url -- URL to extract + + Keyword arguments: + download -- whether to download videos during extraction + ie_key -- extractor key hint + extra_info -- dictionary containing the extra values to add to each result + process -- whether to resolve all unresolved references (URLs, playlist items), + must be True for download to work. + force_generic_extractor -- force using the generic extractor + """ if not ie_key and force_generic_extractor: ie_key = 'Generic' diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/bbc.py new/youtube-dl/youtube_dl/extractor/bbc.py --- old/youtube-dl/youtube_dl/extractor/bbc.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/bbc.py 2021-05-01 13:57:05.000000000 +0200 @@ -11,6 +11,7 @@ compat_etree_Element, compat_HTTPError, compat_parse_qs, + compat_str, compat_urllib_parse_urlparse, compat_urlparse, ) @@ -25,8 +26,10 @@ js_to_json, parse_duration, parse_iso8601, + strip_or_none, try_get, unescapeHTML, + unified_timestamp, url_or_none, urlencode_postdata, urljoin, @@ -761,8 +764,17 @@ 'only_matching': True, }, { # custom redirection to www.bbc.com + # also, video with window.__INITIAL_DATA__ 'url': 'http://www.bbc.co.uk/news/science-environment-33661876', - 'only_matching': True, + 'info_dict': { + 'id': 'p02xzws1', + 'ext': 'mp4', + 'title': "Pluto may have 'nitrogen glaciers'", + 'description': 'md5:6a95b593f528d7a5f2605221bc56912f', + 'thumbnail': r're:https?://.+/.+\.jpg', + 'timestamp': 1437785037, + 'upload_date': '20150725', + }, }, { # single video article embedded with data-media-vpid 'url': 'http://www.bbc.co.uk/sport/rowing/35908187', @@ -1164,12 +1176,29 @@ continue formats, subtitles = self._download_media_selector(item_id) self._sort_formats(formats) + item_desc = None + blocks = try_get(media, lambda x: x['summary']['blocks'], list) + if blocks: + summary = [] + for block in blocks: + text = try_get(block, lambda x: x['model']['text'], compat_str) + if text: + summary.append(text) + if summary: + item_desc = '\n\n'.join(summary) + item_time = None + for meta in try_get(media, lambda x: x['metadata']['items'], list) or []: + if try_get(meta, lambda x: x['label']) == 'Published': + item_time = unified_timestamp(meta.get('timestamp')) + break entries.append({ 'id': item_id, 'title': item_title, 'thumbnail': item.get('holdingImageUrl'), 'formats': formats, 'subtitles': subtitles, + 'timestamp': item_time, + 'description': strip_or_none(item_desc), }) for resp in (initial_data.get('data') or {}).values(): name = resp.get('name') diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/blinkx.py new/youtube-dl/youtube_dl/extractor/blinkx.py --- old/youtube-dl/youtube_dl/extractor/blinkx.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/blinkx.py 1970-01-01 01:00:00.000000000 +0100 @@ -1,86 +0,0 @@ -from __future__ import unicode_literals - -import json - -from .common import InfoExtractor -from ..utils import ( - remove_start, - int_or_none, -) - - -class BlinkxIE(InfoExtractor): - _VALID_URL = r'(?:https?://(?:www\.)blinkx\.com/#?ce/|blinkx:)(?P<id>[^?]+)' - IE_NAME = 'blinkx' - - _TEST = { - 'url': 
'http://www.blinkx.com/ce/Da0Gw3xc5ucpNduzLuDDlv4WC9PuI4fDi1-t6Y3LyfdY2SZS5Urbvn-UPJvrvbo8LTKTc67Wu2rPKSQDJyZeeORCR8bYkhs8lI7eqddznH2ofh5WEEdjYXnoRtj7ByQwt7atMErmXIeYKPsSDuMAAqJDlQZ-3Ff4HJVeH_s3Gh8oQ', - 'md5': '337cf7a344663ec79bf93a526a2e06c7', - 'info_dict': { - 'id': 'Da0Gw3xc', - 'ext': 'mp4', - 'title': 'No Daily Show for John Oliver; HBO Show Renewed - IGN News', - 'uploader': 'IGN News', - 'upload_date': '20150217', - 'timestamp': 1424215740, - 'description': 'HBO has renewed Last Week Tonight With John Oliver for two more seasons.', - 'duration': 47.743333, - }, - } - - def _real_extract(self, url): - video_id = self._match_id(url) - display_id = video_id[:8] - - api_url = ('https://apib4.blinkx.com/api.php?action=play_video&' - + 'video=%s' % video_id) - data_json = self._download_webpage(api_url, display_id) - data = json.loads(data_json)['api']['results'][0] - duration = None - thumbnails = [] - formats = [] - for m in data['media']: - if m['type'] == 'jpg': - thumbnails.append({ - 'url': m['link'], - 'width': int(m['w']), - 'height': int(m['h']), - }) - elif m['type'] == 'original': - duration = float(m['d']) - elif m['type'] == 'youtube': - yt_id = m['link'] - self.to_screen('Youtube video detected: %s' % yt_id) - return self.url_result(yt_id, 'Youtube', video_id=yt_id) - elif m['type'] in ('flv', 'mp4'): - vcodec = remove_start(m['vcodec'], 'ff') - acodec = remove_start(m['acodec'], 'ff') - vbr = int_or_none(m.get('vbr') or m.get('vbitrate'), 1000) - abr = int_or_none(m.get('abr') or m.get('abitrate'), 1000) - tbr = vbr + abr if vbr and abr else None - format_id = '%s-%sk-%s' % (vcodec, tbr, m['w']) - formats.append({ - 'format_id': format_id, - 'url': m['link'], - 'vcodec': vcodec, - 'acodec': acodec, - 'abr': abr, - 'vbr': vbr, - 'tbr': tbr, - 'width': int_or_none(m.get('w')), - 'height': int_or_none(m.get('h')), - }) - - self._sort_formats(formats) - - return { - 'id': display_id, - 'fullid': video_id, - 'title': data['title'], - 'formats': formats, - 'uploader': data['channel_name'], - 'timestamp': data['pubdate_epoch'], - 'description': data.get('description'), - 'thumbnails': thumbnails, - 'duration': duration, - } diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/cbsnews.py new/youtube-dl/youtube_dl/extractor/cbsnews.py --- old/youtube-dl/youtube_dl/extractor/cbsnews.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/cbsnews.py 2021-05-01 13:57:05.000000000 +0200 @@ -26,7 +26,7 @@ def _real_extract(self, url): item = self._parse_json(zlib.decompress(compat_b64decode( compat_urllib_parse_unquote(self._match_id(url))), - -zlib.MAX_WBITS), None)['video']['items'][0] + -zlib.MAX_WBITS).decode('utf-8'), None)['video']['items'][0] return self._extract_video_info(item['mpxRefId'], 'cbsnews') diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/cda.py new/youtube-dl/youtube_dl/extractor/cda.py --- old/youtube-dl/youtube_dl/extractor/cda.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/cda.py 2021-05-01 13:57:12.000000000 +0200 @@ -133,6 +133,8 @@ 'age_limit': 18 if need_confirm_age else 0, } + info = self._search_json_ld(webpage, video_id, default={}) + # Source: https://www.cda.pl/js/player.js?t=1606154898 def decrypt_file(a): for p in ('_XDDD', '_CDA', '_ADC', '_CXD', '_QWE', '_Q5', '_IKSDE'): @@ -197,7 +199,7 @@ handler = self._download_webpage webpage = handler( - 
self._BASE_URL + href, video_id, + urljoin(self._BASE_URL, href), video_id, 'Downloading %s version information' % resolution, fatal=False) if not webpage: # Manually report warning because empty page is returned when @@ -209,6 +211,4 @@ self._sort_formats(formats) - info = self._search_json_ld(webpage, video_id, default={}) - return merge_dicts(info_dict, info) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/dispeak.py new/youtube-dl/youtube_dl/extractor/dispeak.py --- old/youtube-dl/youtube_dl/extractor/dispeak.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/dispeak.py 2021-05-01 13:57:12.000000000 +0200 @@ -32,6 +32,18 @@ # From http://www.gdcvault.com/play/1013700/Advanced-Material 'url': 'http://sevt.dispeak.com/ubm/gdc/eur10/xml/11256_1282118587281VNIT.xml', 'only_matching': True, + }, { + # From https://gdcvault.com/play/1016624, empty speakerVideo + 'url': 'https://sevt.dispeak.com/ubm/gdc/online12/xml/201210-822101_1349794556671DDDD.xml', + 'info_dict': { + 'id': '201210-822101_1349794556671DDDD', + 'ext': 'flv', + 'title': 'Pre-launch - Preparing to Take the Plunge', + }, + }, { + # From http://www.gdcvault.com/play/1014846/Conference-Keynote-Shigeru, empty slideVideo + 'url': 'http://events.digitallyspeaking.com/gdc/project25/xml/p25-miyamoto1999_1282467389849HSVB.xml', + 'only_matching': True, }] def _parse_mp4(self, metadata): @@ -84,26 +96,20 @@ 'vcodec': 'none', 'format_id': audio.get('code'), }) - slide_video_path = xpath_text(metadata, './slideVideo', fatal=True) - formats.append({ - 'url': 'rtmp://%s/ondemand?ovpfv=1.1' % akamai_url, - 'play_path': remove_end(slide_video_path, '.flv'), - 'ext': 'flv', - 'format_note': 'slide deck video', - 'quality': -2, - 'preference': -2, - 'format_id': 'slides', - }) - speaker_video_path = xpath_text(metadata, './speakerVideo', fatal=True) - formats.append({ - 'url': 'rtmp://%s/ondemand?ovpfv=1.1' % akamai_url, - 'play_path': remove_end(speaker_video_path, '.flv'), - 'ext': 'flv', - 'format_note': 'speaker video', - 'quality': -1, - 'preference': -1, - 'format_id': 'speaker', - }) + for video_key, format_id, preference in ( + ('slide', 'slides', -2), ('speaker', 'speaker', -1)): + video_path = xpath_text(metadata, './%sVideo' % video_key) + if not video_path: + continue + formats.append({ + 'url': 'rtmp://%s/ondemand?ovpfv=1.1' % akamai_url, + 'play_path': remove_end(video_path, '.flv'), + 'ext': 'flv', + 'format_note': '%s video' % video_key, + 'quality': preference, + 'preference': preference, + 'format_id': format_id, + }) return formats def _real_extract(self, url): diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/eroprofile.py new/youtube-dl/youtube_dl/extractor/eroprofile.py --- old/youtube-dl/youtube_dl/extractor/eroprofile.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/eroprofile.py 2021-05-01 13:57:12.000000000 +0200 @@ -6,7 +6,7 @@ from ..compat import compat_urllib_parse_urlencode from ..utils import ( ExtractorError, - unescapeHTML + merge_dicts, ) @@ -24,7 +24,8 @@ 'title': 'sexy babe softcore', 'thumbnail': r're:https?://.*\.jpg', 'age_limit': 18, - } + }, + 'skip': 'Video not found', }, { 'url': 'http://www.eroprofile.com/m/videos/view/Try-It-On-Pee_cut_2-wmv-4shared-com-file-sharing-download-movie-file', 'md5': '1baa9602ede46ce904c431f5418d8916', @@ -77,19 +78,15 @@ [r"glbUpdViews\s*\('\d*','(\d+)'", 
r'p/report/video/(\d+)'], webpage, 'video id', default=None) - video_url = unescapeHTML(self._search_regex( - r'<source src="([^"]+)', webpage, 'video url')) title = self._html_search_regex( - r'Title:</th><td>([^<]+)</td>', webpage, 'title') - thumbnail = self._search_regex( - r'onclick="showVideoPlayer\(\)"><img src="([^"]+)', - webpage, 'thumbnail', fatal=False) + (r'Title:</th><td>([^<]+)</td>', r'<h1[^>]*>(.+?)</h1>'), + webpage, 'title') + + info = self._parse_html5_media_entries(url, webpage, video_id)[0] - return { + return merge_dicts(info, { 'id': video_id, 'display_id': display_id, - 'url': video_url, 'title': title, - 'thumbnail': thumbnail, 'age_limit': 18, - } + }) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/extractors.py new/youtube-dl/youtube_dl/extractor/extractors.py --- old/youtube-dl/youtube_dl/extractor/extractors.py 2021-04-16 22:49:44.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/extractors.py 2021-05-01 13:57:12.000000000 +0200 @@ -132,7 +132,6 @@ BleacherReportIE, BleacherReportCMSIE, ) -from .blinkx import BlinkxIE from .bloomberg import BloombergIE from .bokecc import BokeCCIE from .bongacams import BongaCamsIE @@ -926,6 +925,7 @@ from .playfm import PlayFMIE from .playplustv import PlayPlusTVIE from .plays import PlaysTVIE +from .playstuff import PlayStuffIE from .playtvak import PlaytvakIE from .playvid import PlayvidIE from .playwire import PlaywireIE diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/francetv.py new/youtube-dl/youtube_dl/extractor/francetv.py --- old/youtube-dl/youtube_dl/extractor/francetv.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/francetv.py 2021-05-01 13:57:05.000000000 +0200 @@ -383,6 +383,10 @@ }, { 'url': 'http://france3-regions.francetvinfo.fr/limousin/emissions/jt-1213-limousin', 'only_matching': True, + }, { + # "<figure id=" pattern (#28792) + 'url': 'https://www.francetvinfo.fr/culture/patrimoine/incendie-de-notre-dame-de-paris/notre-dame-de-paris-de-l-incendie-de-la-cathedrale-a-sa-reconstruction_4372291.html', + 'only_matching': True, }] def _real_extract(self, url): @@ -400,7 +404,7 @@ (r'player\.load[^;]+src:\s*["\']([^"\']+)', r'id-video=([^@]+@[^"]+)', r'<a[^>]+href="(?:https?:)?//videos\.francetv\.fr/video/([^@]+@[^"]+)"', - r'data-id=["\']([\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'), + r'(?:data-id|<figure[^<]+\bid)=["\']([\da-f]{8}-[\da-f]{4}-[\da-f]{4}-[\da-f]{4}-[\da-f]{12})'), webpage, 'video id') return self._make_url_result(video_id) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/funimation.py new/youtube-dl/youtube_dl/extractor/funimation.py --- old/youtube-dl/youtube_dl/extractor/funimation.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/funimation.py 2021-05-01 13:57:12.000000000 +0200 @@ -16,7 +16,7 @@ class FunimationIE(InfoExtractor): - _VALID_URL = r'https?://(?:www\.)?funimation(?:\.com|now\.uk)/shows/[^/]+/(?P<id>[^/?#&]+)' + _VALID_URL = r'https?://(?:www\.)?funimation(?:\.com|now\.uk)/(?:[^/]+/)?shows/[^/]+/(?P<id>[^/?#&]+)' _NETRC_MACHINE = 'funimation' _TOKEN = None @@ -51,6 +51,10 @@ }, { 'url': 'https://www.funimationnow.uk/shows/puzzle-dragons-x/drop-impact/simulcast/', 'only_matching': True, + }, { + # with lang code + 'url': 
'https://www.funimation.com/en/shows/hacksign/role-play/', + 'only_matching': True, }] def _login(self): diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/gdcvault.py new/youtube-dl/youtube_dl/extractor/gdcvault.py --- old/youtube-dl/youtube_dl/extractor/gdcvault.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/gdcvault.py 2021-05-01 13:57:12.000000000 +0200 @@ -6,6 +6,7 @@ from .kaltura import KalturaIE from ..utils import ( HEADRequest, + remove_start, sanitized_Request, smuggle_url, urlencode_postdata, @@ -102,6 +103,26 @@ 'format': 'mp4-408', }, }, + { + # Kaltura embed, whitespace between quote and embedded URL in iframe's src + 'url': 'https://www.gdcvault.com/play/1025699', + 'info_dict': { + 'id': '0_zagynv0a', + 'ext': 'mp4', + 'title': 'Tech Toolbox', + 'upload_date': '20190408', + 'uploader_id': '[email protected]', + 'timestamp': 1554764629, + }, + 'params': { + 'skip_download': True, + }, + }, + { + # HTML5 video + 'url': 'http://www.gdcvault.com/play/1014846/Conference-Keynote-Shigeru', + 'only_matching': True, + }, ] def _login(self, webpage_url, display_id): @@ -175,7 +196,18 @@ xml_name = self._html_search_regex( r'<iframe src=".*?\?xml(?:=|URL=xml/)(.+?\.xml).*?".*?</iframe>', - start_page, 'xml filename') + start_page, 'xml filename', default=None) + if not xml_name: + info = self._parse_html5_media_entries(url, start_page, video_id)[0] + info.update({ + 'title': remove_start(self._search_regex( + r'>Session Name:\s*<.*?>\s*<td>(.+?)</td>', start_page, + 'title', default=None) or self._og_search_title( + start_page, default=None), 'GDC Vault - '), + 'id': video_id, + 'display_id': display_id, + }) + return info embed_url = '%s/xml/%s' % (xml_root, xml_name) ie_key = 'DigitallySpeaking' diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/generic.py new/youtube-dl/youtube_dl/extractor/generic.py --- old/youtube-dl/youtube_dl/extractor/generic.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/generic.py 2021-05-01 13:57:12.000000000 +0200 @@ -126,6 +126,7 @@ from .expressen import ExpressenIE from .zype import ZypeIE from .odnoklassniki import OdnoklassnikiIE +from .vk import VKIE from .kinja import KinjaEmbedIE from .arcpublishing import ArcPublishingIE from .medialaan import MedialaanIE @@ -2248,6 +2249,11 @@ }, 'playlist_mincount': 52, }, + { + # Sibnet embed (https://help.sibnet.ru/?sibnet_video_embed) + 'url': 'https://phpbb3.x-tk.ru/bbcode-video-sibnet-t24.html', + 'only_matching': True, + }, ] def report_following_redirect(self, new_url): @@ -2777,6 +2783,11 @@ if odnoklassniki_url: return self.url_result(odnoklassniki_url, OdnoklassnikiIE.ie_key()) + # Look for sibnet embedded player + sibnet_urls = VKIE._extract_sibnet_urls(webpage) + if sibnet_urls: + return self.playlist_from_matches(sibnet_urls, video_id, video_title) + # Look for embedded ivi player mobj = re.search(r'<embed[^>]+?src=(["\'])(?P<url>https?://(?:www\.)?ivi\.ru/video/player.+?)\1', webpage) if mobj is not None: @@ -3400,6 +3411,9 @@ 'url': src, 'ext': (mimetype2ext(src_type) or ext if ext in KNOWN_EXTENSIONS else 'mp4'), + 'http_headers': { + 'Referer': full_response.geturl(), + }, }) if formats: self._sort_formats(formats) @@ -3468,7 +3482,7 @@ m_video_type = re.findall(r'<meta.*?property="og:video:type".*?content="video/(.*?)"', webpage) # We only look in og:video if the MIME type is a video, 
don't try if it's a Flash player: if m_video_type is not None: - found = filter_video(re.findall(r'<meta.*?property="og:video".*?content="(.*?)"', webpage)) + found = filter_video(re.findall(r'<meta.*?property="og:(?:video|audio)".*?content="(.*?)"', webpage)) if not found: REDIRECT_REGEX = r'[0-9]{,2};\s*(?:URL|url)=\'?([^\'"]+)' found = re.search( diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/go.py new/youtube-dl/youtube_dl/extractor/go.py --- old/youtube-dl/youtube_dl/extractor/go.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/go.py 2021-05-01 13:57:05.000000000 +0200 @@ -4,10 +4,12 @@ import re from .adobepass import AdobePassIE +from ..compat import compat_str from ..utils import ( int_or_none, determine_ext, parse_age_limit, + try_get, urlencode_postdata, ExtractorError, ) @@ -117,6 +119,18 @@ 'skip_download': True, }, }, { + 'url': 'https://abc.com/shows/modern-family/episode-guide/season-01/101-pilot', + 'info_dict': { + 'id': 'VDKA22600213', + 'ext': 'mp4', + 'title': 'Pilot', + 'description': 'md5:74306df917cfc199d76d061d66bebdb4', + }, + 'params': { + # m3u8 download + 'skip_download': True, + }, + }, { 'url': 'http://abc.go.com/shows/the-catch/episode-guide/season-01/10-the-wedding', 'only_matching': True, }, { @@ -149,14 +163,30 @@ brand = site_info.get('brand') if not video_id or not site_info: webpage = self._download_webpage(url, display_id or video_id) - video_id = self._search_regex( - ( - # There may be inner quotes, e.g. data-video-id="'VDKA3609139'" - # from http://freeform.go.com/shows/shadowhunters/episodes/season-2/1-this-guilty-blood - r'data-video-id=["\']*(VDKA\w+)', - # https://abc.com/shows/the-rookie/episode-guide/season-02/03-the-bet - r'\b(?:video)?id["\']\s*:\s*["\'](VDKA\w+)' - ), webpage, 'video id', default=video_id) + data = self._parse_json( + self._search_regex( + r'["\']__abc_com__["\']\s*\]\s*=\s*({.+?})\s*;', webpage, + 'data', default='{}'), + display_id or video_id, fatal=False) + # https://abc.com/shows/modern-family/episode-guide/season-01/101-pilot + layout = try_get(data, lambda x: x['page']['content']['video']['layout'], dict) + video_id = None + if layout: + video_id = try_get( + layout, + (lambda x: x['videoid'], lambda x: x['video']['id']), + compat_str) + if not video_id: + video_id = self._search_regex( + ( + # There may be inner quotes, e.g. data-video-id="'VDKA3609139'" + # from http://freeform.go.com/shows/shadowhunters/episodes/season-2/1-this-guilty-blood + r'data-video-id=["\']*(VDKA\w+)', + # page.analytics.videoIdCode + r'\bvideoIdCode["\']\s*:\s*["\']((?:vdka|VDKA)\w+)', + # https://abc.com/shows/the-rookie/episode-guide/season-02/03-the-bet + r'\b(?:video)?id["\']\s*:\s*["\'](VDKA\w+)' + ), webpage, 'video id', default=video_id) if not site_info: brand = self._search_regex( (r'data-brand=\s*["\']\s*(\d+)', diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/kaltura.py new/youtube-dl/youtube_dl/extractor/kaltura.py --- old/youtube-dl/youtube_dl/extractor/kaltura.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/kaltura.py 2021-05-01 13:57:12.000000000 +0200 @@ -120,7 +120,7 @@ def _extract_urls(webpage): # Embed codes: https://knowledge.kaltura.com/embedding-kaltura-media-players-your-site finditer = ( - re.finditer( + list(re.finditer( r"""(?xs) kWidget\.(?:thumb)?[Ee]mbed\( \{.*? 
@@ -128,8 +128,8 @@ (?P<q2>['"])_?(?P<partner_id>(?:(?!(?P=q2)).)+)(?P=q2),.*? (?P<q3>['"])entry_?[Ii]d(?P=q3)\s*:\s* (?P<q4>['"])(?P<id>(?:(?!(?P=q4)).)+)(?P=q4)(?:,|\s*\}) - """, webpage) - or re.finditer( + """, webpage)) + or list(re.finditer( r'''(?xs) (?P<q1>["']) (?:https?:)?//cdnapi(?:sec)?\.kaltura\.com(?::\d+)?/(?:(?!(?P=q1)).)*\b(?:p|partner_id)/(?P<partner_id>\d+)(?:(?!(?P=q1)).)* @@ -142,16 +142,16 @@ \[\s*(?P<q2_1>["'])entry_?[Ii]d(?P=q2_1)\s*\]\s*=\s* ) (?P<q3>["'])(?P<id>(?:(?!(?P=q3)).)+)(?P=q3) - ''', webpage) - or re.finditer( + ''', webpage)) + or list(re.finditer( r'''(?xs) - <(?:iframe[^>]+src|meta[^>]+\bcontent)=(?P<q1>["']) + <(?:iframe[^>]+src|meta[^>]+\bcontent)=(?P<q1>["'])\s* (?:https?:)?//(?:(?:www|cdnapi(?:sec)?)\.)?kaltura\.com/(?:(?!(?P=q1)).)*\b(?:p|partner_id)/(?P<partner_id>\d+) (?:(?!(?P=q1)).)* [?&;]entry_id=(?P<id>(?:(?!(?P=q1))[^&])+) (?:(?!(?P=q1)).)* (?P=q1) - ''', webpage) + ''', webpage)) ) urls = [] for mobj in finditer: diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/medaltv.py new/youtube-dl/youtube_dl/extractor/medaltv.py --- old/youtube-dl/youtube_dl/extractor/medaltv.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/medaltv.py 2021-05-01 13:57:12.000000000 +0200 @@ -15,33 +15,39 @@ class MedalTVIE(InfoExtractor): - _VALID_URL = r'https?://(?:www\.)?medal\.tv/clips/(?P<id>[0-9]+)' + _VALID_URL = r'https?://(?:www\.)?medal\.tv/clips/(?P<id>[^/?#&]+)' _TESTS = [{ - 'url': 'https://medal.tv/clips/34934644/3Is9zyGMoBMr', + 'url': 'https://medal.tv/clips/2mA60jWAGQCBH', 'md5': '7b07b064331b1cf9e8e5c52a06ae68fa', 'info_dict': { - 'id': '34934644', + 'id': '2mA60jWAGQCBH', 'ext': 'mp4', 'title': 'Quad Cold', 'description': 'Medal,https://medal.tv/desktop/', 'uploader': 'MowgliSB', 'timestamp': 1603165266, 'upload_date': '20201020', - 'uploader_id': 10619174, + 'uploader_id': '10619174', } }, { - 'url': 'https://medal.tv/clips/36787208', + 'url': 'https://medal.tv/clips/2um24TWdty0NA', 'md5': 'b6dc76b78195fff0b4f8bf4a33ec2148', 'info_dict': { - 'id': '36787208', + 'id': '2um24TWdty0NA', 'ext': 'mp4', 'title': 'u tk me i tk u bigger', 'description': 'Medal,https://medal.tv/desktop/', 'uploader': 'Mimicc', 'timestamp': 1605580939, 'upload_date': '20201117', - 'uploader_id': 5156321, + 'uploader_id': '5156321', } + }, { + 'url': 'https://medal.tv/clips/37rMeFpryCC-9', + 'only_matching': True, + }, { + 'url': 'https://medal.tv/clips/2WRj40tpY_EU9', + 'only_matching': True, }] def _real_extract(self, url): diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/orf.py new/youtube-dl/youtube_dl/extractor/orf.py --- old/youtube-dl/youtube_dl/extractor/orf.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/orf.py 2021-05-01 13:57:12.000000000 +0200 @@ -182,7 +182,7 @@ duration = end - start if end and start else None entries.append({ 'id': loop_stream_id.replace('.mp3', ''), - 'url': 'http://loopstream01.apa.at/?channel=%s&id=%s' % (self._LOOP_STATION, loop_stream_id), + 'url': 'https://loopstream01.apa.at/?channel=%s&id=%s' % (self._LOOP_STATION, loop_stream_id), 'title': title, 'description': clean_html(data.get('subtitle')), 'duration': duration, diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/phoenix.py new/youtube-dl/youtube_dl/extractor/phoenix.py --- 
old/youtube-dl/youtube_dl/extractor/phoenix.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/phoenix.py 2021-05-01 13:57:12.000000000 +0200 @@ -9,8 +9,9 @@ from ..utils import ( int_or_none, merge_dicts, + try_get, unified_timestamp, - xpath_text, + urljoin, ) @@ -27,10 +28,11 @@ 'title': 'Wohin führt der Protest in der Pandemie?', 'description': 'md5:7d643fe7f565e53a24aac036b2122fbd', 'duration': 1691, - 'timestamp': 1613906100, + 'timestamp': 1613902500, 'upload_date': '20210221', 'uploader': 'Phoenix', - 'channel': 'corona nachgehakt', + 'series': 'corona nachgehakt', + 'episode': 'Wohin führt der Protest in der Pandemie?', }, }, { # Youtube embed @@ -79,50 +81,53 @@ video_id = compat_str(video.get('basename') or video.get('content')) - details = self._download_xml( + details = self._download_json( 'https://www.phoenix.de/php/mediaplayer/data/beitrags_details.php', - video_id, 'Downloading details XML', query={ + video_id, 'Downloading details JSON', query={ 'ak': 'web', 'ptmd': 'true', 'id': video_id, 'profile': 'player2', }) - title = title or xpath_text( - details, './/information/title', 'title', fatal=True) - content_id = xpath_text( - details, './/video/details/basename', 'content id', fatal=True) + title = title or details['title'] + content_id = details['tracking']['nielsen']['content']['assetid'] info = self._extract_ptmd( 'https://tmd.phoenix.de/tmd/2/ngplayer_2_3/vod/ptmd/phoenix/%s' % content_id, content_id, None, url) - timestamp = unified_timestamp(xpath_text(details, './/details/airtime')) + duration = int_or_none(try_get( + details, lambda x: x['tracking']['nielsen']['content']['length'])) + timestamp = unified_timestamp(details.get('editorialDate')) + series = try_get( + details, lambda x: x['tracking']['nielsen']['content']['program'], + compat_str) + episode = title if details.get('contentType') == 'episode' else None thumbnails = [] - for node in details.findall('.//teaserimages/teaserimage'): - thumbnail_url = node.text + teaser_images = try_get(details, lambda x: x['teaserImageRef']['layouts'], dict) or {} + for thumbnail_key, thumbnail_url in teaser_images.items(): + thumbnail_url = urljoin(url, thumbnail_url) if not thumbnail_url: continue thumbnail = { 'url': thumbnail_url, } - thumbnail_key = node.get('key') - if thumbnail_key: - m = re.match('^([0-9]+)x([0-9]+)$', thumbnail_key) - if m: - thumbnail['width'] = int(m.group(1)) - thumbnail['height'] = int(m.group(2)) + m = re.match('^([0-9]+)x([0-9]+)$', thumbnail_key) + if m: + thumbnail['width'] = int(m.group(1)) + thumbnail['height'] = int(m.group(2)) thumbnails.append(thumbnail) return merge_dicts(info, { 'id': content_id, 'title': title, - 'description': xpath_text(details, './/information/detail'), - 'duration': int_or_none(xpath_text(details, './/details/lengthSec')), + 'description': details.get('leadParagraph'), + 'duration': duration, 'thumbnails': thumbnails, 'timestamp': timestamp, - 'uploader': xpath_text(details, './/details/channel'), - 'uploader_id': xpath_text(details, './/details/originChannelId'), - 'channel': xpath_text(details, './/details/originChannelTitle'), + 'uploader': details.get('tvService'), + 'series': series, + 'episode': episode, }) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/playstuff.py new/youtube-dl/youtube_dl/extractor/playstuff.py --- old/youtube-dl/youtube_dl/extractor/playstuff.py 1970-01-01 01:00:00.000000000 +0100 +++
new/youtube-dl/youtube_dl/extractor/playstuff.py 2021-05-01 13:57:12.000000000 +0200 @@ -0,0 +1,65 @@ +from __future__ import unicode_literals + +from .common import InfoExtractor +from ..compat import compat_str +from ..utils import ( + smuggle_url, + try_get, +) + + +class PlayStuffIE(InfoExtractor): + _VALID_URL = r'https?://(?:www\.)?play\.stuff\.co\.nz/details/(?P<id>[^/?#&]+)' + _TESTS = [{ + 'url': 'https://play.stuff.co.nz/details/608778ac1de1c4001a3fa09a', + 'md5': 'c82d3669e5247c64bc382577843e5bd0', + 'info_dict': { + 'id': '6250584958001', + 'ext': 'mp4', + 'title': 'Episode 1: Rotorua/Mt Maunganui/Tauranga', + 'description': 'md5:c154bafb9f0dd02d01fd4100fb1c1913', + 'uploader_id': '6005208634001', + 'timestamp': 1619491027, + 'upload_date': '20210427', + }, + 'add_ie': ['BrightcoveNew'], + }, { + # geo restricted, bypassable + 'url': 'https://play.stuff.co.nz/details/_6155660351001', + 'only_matching': True, + }] + BRIGHTCOVE_URL_TEMPLATE = 'http://players.brightcove.net/%s/%s_default/index.html?videoId=%s' + + def _real_extract(self, url): + video_id = self._match_id(url) + + webpage = self._download_webpage(url, video_id) + + state = self._parse_json( + self._search_regex( + r'__INITIAL_STATE__\s*=\s*({.+?})\s*;', webpage, 'state'), + video_id) + + account_id = try_get( + state, lambda x: x['configurations']['accountId'], + compat_str) or '6005208634001' + player_id = try_get( + state, lambda x: x['configurations']['playerId'], + compat_str) or 'default' + + entries = [] + for item_id, video in state['items'].items(): + if not isinstance(video, dict): + continue + asset_id = try_get( + video, lambda x: x['content']['attributes']['assetId'], + compat_str) + if not asset_id: + continue + entries.append(self.url_result( + smuggle_url( + self.BRIGHTCOVE_URL_TEMPLATE % (account_id, player_id, asset_id), + {'geo_countries': ['NZ']}), + 'BrightcoveNew', video_id)) + + return self.playlist_result(entries, video_id) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/shared.py new/youtube-dl/youtube_dl/extractor/shared.py --- old/youtube-dl/youtube_dl/extractor/shared.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/shared.py 2021-05-01 13:57:12.000000000 +0200 @@ -86,10 +86,10 @@ class VivoIE(SharedBaseIE): IE_DESC = 'vivo.sx' - _VALID_URL = r'https?://vivo\.sx/(?P<id>[\da-z]{10})' + _VALID_URL = r'https?://vivo\.s[xt]/(?P<id>[\da-z]{10})' _FILE_NOT_FOUND = '>The file you have requested does not exists or has been removed' - _TEST = { + _TESTS = [{ 'url': 'http://vivo.sx/d7ddda0e78', 'md5': '15b3af41be0b4fe01f4df075c2678b2c', 'info_dict': { @@ -98,7 +98,10 @@ 'title': 'Chicken', 'filesize': 515659, }, - } + }, { + 'url': 'http://vivo.st/d7ddda0e78', + 'only_matching': True, + }] def _extract_title(self, webpage): title = self._html_search_regex( diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/svt.py new/youtube-dl/youtube_dl/extractor/svt.py --- old/youtube-dl/youtube_dl/extractor/svt.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/svt.py 2021-05-01 13:57:12.000000000 +0200 @@ -146,7 +146,7 @@ ) (?P<svt_id>[^/?#&]+)| https?://(?:www\.)?(?:svtplay|oppetarkiv)\.se/(?:video|klipp|kanaler)/(?P<id>[^/?#&]+) - (?:.*?modalId=(?P<modal_id>[\da-zA-Z-]+))? + (?:.*?(?:modalId|id)=(?P<modal_id>[\da-zA-Z-]+))? 
) ''' _TESTS = [{ @@ -178,6 +178,9 @@ 'url': 'https://www.svtplay.se/video/30479064/husdrommar/husdrommar-sasong-8-designdrommar-i-stenungsund?modalId=8zVbDPA', 'only_matching': True, }, { + 'url': 'https://www.svtplay.se/video/30684086/rapport/rapport-24-apr-18-00-7?id=e72gVpa', + 'only_matching': True, + }, { # geo restricted to Sweden 'url': 'http://www.oppetarkiv.se/video/5219710/trollflojten', 'only_matching': True, @@ -259,7 +262,7 @@ if not svt_id: svt_id = self._search_regex( (r'<video[^>]+data-video-id=["\']([\da-zA-Z-]+)', - r'<[^>]+\bdata-rt=["\']top-area-play-button["\'][^>]+\bhref=["\'][^"\']*video/%s/[^"\']*\bmodalId=([\da-zA-Z-]+)' % re.escape(video_id), + r'<[^>]+\bdata-rt=["\']top-area-play-button["\'][^>]+\bhref=["\'][^"\']*video/%s/[^"\']*\b(?:modalId|id)=([\da-zA-Z-]+)' % re.escape(video_id), r'["\']videoSvtId["\']\s*:\s*["\']([\da-zA-Z-]+)', r'["\']videoSvtId\\?["\']\s*:\s*\\?["\']([\da-zA-Z-]+)', r'"content"\s*:\s*{.*?"id"\s*:\s*"([\da-zA-Z-]+)"', diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/tv2dk.py new/youtube-dl/youtube_dl/extractor/tv2dk.py --- old/youtube-dl/youtube_dl/extractor/tv2dk.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/tv2dk.py 2021-05-01 13:57:12.000000000 +0200 @@ -74,6 +74,12 @@ webpage = self._download_webpage(url, video_id) entries = [] + + def add_entry(partner_id, kaltura_id): + entries.append(self.url_result( + 'kaltura:%s:%s' % (partner_id, kaltura_id), 'Kaltura', + video_id=kaltura_id)) + for video_el in re.findall(r'(?s)<[^>]+\bdata-entryid\s*=[^>]*>', webpage): video = extract_attributes(video_el) kaltura_id = video.get('data-entryid') @@ -82,9 +88,14 @@ partner_id = video.get('data-partnerid') if not partner_id: continue - entries.append(self.url_result( - 'kaltura:%s:%s' % (partner_id, kaltura_id), 'Kaltura', - video_id=kaltura_id)) + add_entry(partner_id, kaltura_id) + if not entries: + kaltura_id = self._search_regex( + r'entry_id\s*:\s*["\']([0-9a-z_]+)', webpage, 'kaltura id') + partner_id = self._search_regex( + (r'\\u002Fp\\u002F(\d+)\\u002F', r'/p/(\d+)/'), webpage, + 'partner id') + add_entry(partner_id, kaltura_id) return self.playlist_result(entries) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/tver.py new/youtube-dl/youtube_dl/extractor/tver.py --- old/youtube-dl/youtube_dl/extractor/tver.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/tver.py 2021-05-01 13:57:06.000000000 +0200 @@ -9,7 +9,6 @@ int_or_none, remove_start, smuggle_url, - strip_or_none, try_get, ) @@ -45,32 +44,18 @@ query={'token': self._TOKEN})['main'] p_id = main['publisher_id'] service = remove_start(main['service'], 'ts_') - info = { + + r_id = main['reference_id'] + if service not in ('tx', 'russia2018', 'sebare2018live', 'gorin'): + r_id = 'ref:' + r_id + bc_url = smuggle_url( + self.BRIGHTCOVE_URL_TEMPLATE % (p_id, r_id), + {'geo_countries': ['JP']}) + + return { '_type': 'url_transparent', 'description': try_get(main, lambda x: x['note'][0]['text'], compat_str), 'episode_number': int_or_none(try_get(main, lambda x: x['ext']['episode_number'])), + 'url': bc_url, + 'ie_key': 'BrightcoveNew', } - - if service == 'cx': - title = main['title'] - subtitle = strip_or_none(main.get('subtitle')) - if subtitle: - title += ' - ' + subtitle - info.update({ - 'title': title, - 'url': 'https://i.fod.fujitv.co.jp/plus7/web/%s/%s.html' % (p_id[:4], 
p_id), - 'ie_key': 'FujiTVFODPlus7', - }) - else: - r_id = main['reference_id'] - if service not in ('tx', 'russia2018', 'sebare2018live', 'gorin'): - r_id = 'ref:' + r_id - bc_url = smuggle_url( - self.BRIGHTCOVE_URL_TEMPLATE % (p_id, r_id), - {'geo_countries': ['JP']}) - info.update({ - 'url': bc_url, - 'ie_key': 'BrightcoveNew', - }) - - return info diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/twitter.py new/youtube-dl/youtube_dl/extractor/twitter.py --- old/youtube-dl/youtube_dl/extractor/twitter.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/twitter.py 2021-05-01 13:57:12.000000000 +0200 @@ -19,6 +19,7 @@ strip_or_none, unified_timestamp, update_url_query, + url_or_none, xpath_text, ) @@ -52,6 +53,9 @@ return [f] def _extract_formats_from_vmap_url(self, vmap_url, video_id): + vmap_url = url_or_none(vmap_url) + if not vmap_url: + return [] vmap_data = self._download_xml(vmap_url, video_id) formats = [] urls = [] diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/vk.py new/youtube-dl/youtube_dl/extractor/vk.py --- old/youtube-dl/youtube_dl/extractor/vk.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/vk.py 2021-05-01 13:57:12.000000000 +0200 @@ -300,6 +300,13 @@ 'only_matching': True, }] + @staticmethod + def _extract_sibnet_urls(webpage): + # https://help.sibnet.ru/?sibnet_video_embed + return [unescapeHTML(mobj.group('url')) for mobj in re.finditer( + r'<iframe\b[^>]+\bsrc=(["\'])(?P<url>(?:https?:)?//video\.sibnet\.ru/shell\.php\?.*?\bvideoid=\d+.*?)\1', + webpage)] + def _real_extract(self, url): mobj = re.match(self._VALID_URL, url) video_id = mobj.group('videoid') @@ -408,6 +415,10 @@ if odnoklassniki_url: return self.url_result(odnoklassniki_url, OdnoklassnikiIE.ie_key()) + sibnet_urls = self._extract_sibnet_urls(info_page) + if sibnet_urls: + return self.url_result(sibnet_urls[0]) + m_opts = re.search(r'(?s)var\s+opts\s*=\s*({.+?});', info_page) if m_opts: m_opts_url = re.search(r"url\s*:\s*'((?!/\b)[^']+)", m_opts.group(1)) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/xfileshare.py new/youtube-dl/youtube_dl/extractor/xfileshare.py --- old/youtube-dl/youtube_dl/extractor/xfileshare.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/xfileshare.py 2021-05-01 13:57:06.000000000 +0200 @@ -58,6 +58,7 @@ (r'vidlocker\.xyz', 'VidLocker'), (r'vidshare\.tv', 'VidShare'), (r'vup\.to', 'VUp'), + (r'wolfstream\.tv', 'WolfStream'), (r'xvideosharing\.com', 'XVideoSharing'), ) @@ -82,6 +83,9 @@ }, { 'url': 'https://aparat.cam/n4d6dh0wvlpr', 'only_matching': True, + }, { + 'url': 'https://wolfstream.tv/nthme29v9u2x', + 'only_matching': True, }] @staticmethod diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/xtube.py new/youtube-dl/youtube_dl/extractor/xtube.py --- old/youtube-dl/youtube_dl/extractor/xtube.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/xtube.py 2021-05-01 13:57:12.000000000 +0200 @@ -11,6 +11,7 @@ parse_duration, sanitized_Request, str_to_int, + url_or_none, ) @@ -87,10 +88,10 @@ 'Cookie': 'age_verified=1; cookiesAccepted=1', }) - title, thumbnail, duration = [None] * 3 + title, thumbnail, duration, sources, media_definition = [None] * 5 config = 
self._parse_json(self._search_regex( - r'playerConf\s*=\s*({.+?})\s*,\s*(?:\n|loaderConf)', webpage, 'config', + r'playerConf\s*=\s*({.+?})\s*,\s*(?:\n|loaderConf|playerWrapper)', webpage, 'config', default='{}'), video_id, transform_source=js_to_json, fatal=False) if config: config = config.get('mainRoll') @@ -99,20 +100,52 @@ thumbnail = config.get('poster') duration = int_or_none(config.get('duration')) sources = config.get('sources') or config.get('format') + media_definition = config.get('mediaDefinition') - if not isinstance(sources, dict): + if not isinstance(sources, dict) and not media_definition: sources = self._parse_json(self._search_regex( r'(["\'])?sources\1?\s*:\s*(?P<sources>{.+?}),', webpage, 'sources', group='sources'), video_id, transform_source=js_to_json) formats = [] - for format_id, format_url in sources.items(): - formats.append({ - 'url': format_url, - 'format_id': format_id, - 'height': int_or_none(format_id), - }) + format_urls = set() + + if isinstance(sources, dict): + for format_id, format_url in sources.items(): + format_url = url_or_none(format_url) + if not format_url: + continue + if format_url in format_urls: + continue + format_urls.add(format_url) + formats.append({ + 'url': format_url, + 'format_id': format_id, + 'height': int_or_none(format_id), + }) + + if isinstance(media_definition, list): + for media in media_definition: + video_url = url_or_none(media.get('videoUrl')) + if not video_url: + continue + if video_url in format_urls: + continue + format_urls.add(video_url) + format_id = media.get('format') + if format_id == 'hls': + formats.extend(self._extract_m3u8_formats( + video_url, video_id, 'mp4', entry_protocol='m3u8_native', + m3u8_id='hls', fatal=False)) + elif format_id == 'mp4': + height = int_or_none(media.get('quality')) + formats.append({ + 'url': video_url, + 'format_id': '%s-%d' % (format_id, height) if height else format_id, + 'height': height, + }) + self._remove_duplicate_formats(formats) self._sort_formats(formats) diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/extractor/youtube.py new/youtube-dl/youtube_dl/extractor/youtube.py --- old/youtube-dl/youtube_dl/extractor/youtube.py 2021-04-16 22:49:44.000000000 +0200 +++ new/youtube-dl/youtube_dl/extractor/youtube.py 2021-05-01 13:57:06.000000000 +0200 @@ -65,11 +65,6 @@ _PLAYLIST_ID_RE = r'(?:(?:PL|LL|EC|UU|FL|RD|UL|TL|PU|OLAK5uy_)[0-9A-Za-z-_]{10,}|RDMM)' - def _ids_to_results(self, ids): - return [ - self.url_result(vid_id, 'Youtube', video_id=vid_id) - for vid_id in ids] - def _login(self): """ Attempt to log in to YouTube. 
@@ -1219,6 +1214,9 @@ @classmethod def suitable(cls, url): + # Hack for lazy extractors until more generic solution is implemented + # (see #28780) + from .youtube import parse_qs qs = parse_qs(url) if qs.get('list', [None])[0]: return False @@ -2910,6 +2908,9 @@ def suitable(cls, url): if YoutubeTabIE.suitable(url): return False + # Hack for lazy extractors until more generic solution is implemented + # (see #28780) + from .youtube import parse_qs qs = parse_qs(url) if qs.get('v', [None])[0]: return False diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/options.py new/youtube-dl/youtube_dl/options.py --- old/youtube-dl/youtube_dl/options.py 2021-04-16 22:49:38.000000000 +0200 +++ new/youtube-dl/youtube_dl/options.py 2021-05-01 13:57:12.000000000 +0200 @@ -768,7 +768,7 @@ action='store_true', dest='rm_cachedir', help='Delete all filesystem cache files') - thumbnail = optparse.OptionGroup(parser, 'Thumbnail images') + thumbnail = optparse.OptionGroup(parser, 'Thumbnail Options') thumbnail.add_option( '--write-thumbnail', action='store_true', dest='writethumbnail', default=False, diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/youtube-dl/youtube_dl/version.py new/youtube-dl/youtube_dl/version.py --- old/youtube-dl/youtube_dl/version.py 2021-04-16 22:50:06.000000000 +0200 +++ new/youtube-dl/youtube_dl/version.py 2021-05-16 17:54:58.000000000 +0200 @@ -1,3 +1,3 @@ from __future__ import unicode_literals -__version__ = '2021.04.17' +__version__ = '2021.05.16'
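
For reference, the reworked YoutubeDL.extract_info() docstring in this update describes the extraction keyword arguments; below is a minimal, extraction-only sketch against the updated module. It assumes youtube_dl 2021.05.16 is importable on the system and reuses the BaW_jenozKc test video that the package's own test suite exercises; it is an illustration, not part of the packaged change.

    from __future__ import unicode_literals

    import youtube_dl

    # Extraction-only call using the keyword arguments documented in the
    # new extract_info() docstring: download=False skips downloading,
    # process is left at its default True so unresolved references are
    # resolved, and force_generic_extractor stays False.
    with youtube_dl.YoutubeDL({'quiet': True}) as ydl:
        info = ydl.extract_info(
            'https://www.youtube.com/watch?v=BaW_jenozKc',
            download=False,
            force_generic_extractor=False)
        print(info.get('title'))

With download=False the call returns only the metadata dictionary; keeping process at its default True leaves the result usable for a later download, as the new docstring notes.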
