I want to scrape the "Matrix form" data from the BCS website [1],
i.e., the data that appears in the 3rd column of the table.
I tried the following Python snippet, but I still cannot figure out
the trick:
import requests
from bs4 import BeautifulSoup

# Route the request through a local SOCKS5 proxy
proxies = {
    'http': 'socks5h://127.0.0.1:18888',
    'https': 'socks5h://127.0.0.1:18888',
}
# Silence the warning that verify=False would otherwise trigger
requests.packages.urllib3.disable_warnings()
r = requests.get(
    'https://www.cryst.ehu.es/cgi-bin/plane/programs/nph-plane_getgen?gnum=17&type=plane',
    proxies=proxies, verify=False)
soup = BeautifulSoup(r.content, features="lxml")
table = soup.find('table')
# NOTE: find_all('id') searches for <id> tags, not for the id attribute
cells = table.find_all('id')
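
One possible direction, sketched on a hypothetical HTML fragment because the
real markup of the BCS page may differ: walk the rows of the first table, take
the 3rd <td> of each row, and split its text into matrix rows. SAMPLE_HTML and
the function name third_column_matrices are assumptions made for illustration
only. The built-in "html.parser" backend is used so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page; the real BCS markup may differ.
SAMPLE_HTML = """
<table>
  <tr><th>No.</th><th>(x,y) form</th><th>Matrix form</th></tr>
  <tr><td>1</td><td>x,y</td><td><pre>1 0 0
0 1 0</pre></td></tr>
  <tr><td>2</td><td>-x,-y</td><td><pre>-1 0 0
0 -1 0</pre></td></tr>
</table>
"""


def third_column_matrices(html):
    """Return the 3rd cell of every data row as a list of rows of strings."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    matrices = []
    for row in table.find_all("tr"):
        # recursive=False counts only this row's own cells,
        # not cells that belong to a table nested inside one of them.
        cells = row.find_all("td", recursive=False)
        if len(cells) < 3:
            continue  # header rows (<th>) and short rows are skipped
        text = cells[2].get_text(" ", strip=True)
        # Entries are kept as strings, since they might be fractions such as 1/2.
        matrices.append([line.split() for line in text.splitlines() if line.strip()])
    return matrices


if __name__ == "__main__":
    for matrix in third_column_matrices(SAMPLE_HTML):
        print(matrix)
```

With the real page, r.content from the request above would be passed in place
of SAMPLE_HTML. If the data of interest is not in the first <table> of the
page, the soup.find("table") call would need to be adjusted accordingly.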
My Python environment is as follows:
werner@X10DAi:~$ pyenv shell datasci
(datasci) werner@X10DAi:~$ python --version
Python 3.11.1
Any tips would be appreciated.
[1]
https://www.cryst.ehu.es/cgi-bin/plane/programs/nph-plane_getgen?gnum=17&type=plane
Regards,
Zhao