Re: [Tutor] Web Page Scraping
Hi Walter,

Thank you for taking the time to explain everything.

Have a great day.

Cheers,
Hank

On Tue, May 24, 2016 at 10:45 PM, Walter Prins wrote:
> On 24 May 2016 at 15:37, Walter Prins wrote:
>> print(name1.encode(sys.stdout.encoding, "backslashreplace"))
>
> I forgot to mention, you might want to read the following documentation page:
>
> https://docs.python.org/3/howto/unicode.html
>
> (good luck.)
>
> W
___
Tutor maillist - Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Getting started in testing
Terry Carroll writes:

> Thanks to Alan, Danny, Albert-Jan and Ben for their suggestions. I've
> now gotten my feet wet in unittest and have gone from not quite
> knowing where to start to making substantial progress, with a small
> suite of tests up and running.

Great start!

Do keep in mind that unit tests are only one kind of test: the most detailed kind, each testing a single assertion about a single unit of code. A good test suite also has tests for larger groups of functionality.

--
"A life spent making mistakes is not only most honorable but more useful than a life spent doing nothing." (anonymous)
Ben Finney
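To illustrate the distinction drawn above, here is a minimal sketch using unittest. The functions parse_line and total_quantity are hypothetical, invented only for this illustration; the first test case checks one assertion about one function, while the second exercises the two functions working together.

```python
import io
import unittest


def parse_line(line):
    """Split 'name, quantity' into a (name, int) pair."""
    name, quantity = line.split(",")
    return name.strip(), int(quantity)


def total_quantity(lines):
    """Add up the quantities from several 'name, quantity' lines."""
    return sum(parse_line(line)[1] for line in lines)


class ParseLineTest(unittest.TestCase):
    """Unit test: one assertion about one function."""

    def test_returns_name_and_integer_quantity(self):
        self.assertEqual(parse_line("apples, 3"), ("apples", 3))


class TotalQuantityTest(unittest.TestCase):
    """Higher-level test: parsing and summing exercised together."""

    def test_totals_several_lines(self):
        self.assertEqual(total_quantity(["apples, 3", "pears, 4"]), 7)


def run_suite():
    """Run both test cases and return the unittest result object."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite([
        loader.loadTestsFromTestCase(ParseLineTest),
        loader.loadTestsFromTestCase(TotalQuantityTest),
    ])
    # Send the runner's report to a buffer so calling this stays quiet.
    runner = unittest.TextTestRunner(stream=io.StringIO())
    return runner.run(suite)
```

In a real project the test classes would normally live in their own module and be run with "python -m unittest"; run_suite() is only here so the sketch can be driven from other code.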
[Tutor] Fwd: Re: I've subscribed to your service, no confirmation yet. I'm looking for a tutor and I need help with some code.
Forwarding to the list.
Please use reply-all to respond to list messages.

Also please use plain text, as HTML messages often result in code listings being corrupted, especially the spacing, which is very important in Python.

 Forwarded Message

> I just opened the python IDLE 3.5 on my MAC and I imported telnet.lib.
> On the next line I typed telnet 192.09.168.55 and I got an error. It said
> invalid syntax. I'm trying to telnet to my other MAC here at home, just
> to see if I can connect.

You cannot just type telnet commands into Python; you need to use the telnet API. (Type help(telnetlib) at the >>> prompt or visit the module's documentation page.)

A typical session might look something like:

>>> import telnetlib
>>> tn = telnetlib.Telnet('myhost.com')
>>> response = tn.read_some()
>>> print(response)
. some stuff here
>>> tn.close()

That's assuming you have telnet access to myhost.com of course; many sites don't allow it because of the security issues associated with telnet. ssh is probably a better bet.

But in either case don't expect an interactive telnet session - that's what the telnet command (or ssh) is for. Python gives you the ability to automate a session, with no human interactivity required. If you want to interact you'll need to read the output and check for prompts from the host, then relay those prompts to your user from Python.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos
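The "read the output, check for prompts, answer them" idea can be sketched as below. This is only an illustration under stated assumptions: ScriptedConnection is a hypothetical stand-in that replays canned bytes so the example runs without a network, and run_dialogue is an invented helper name. The only methods it relies on, read_until() and write(), do exist on telnetlib.Telnet and work with bytes in Python 3, so a real Telnet object could be passed in instead.

```python
class ScriptedConnection:
    """Hypothetical stand-in for telnetlib.Telnet: replays canned output."""

    def __init__(self, output):
        self._buffer = output   # bytes the pretend host will "send"
        self.sent = []          # everything the script wrote back

    def read_until(self, expected, timeout=None):
        """Return buffered bytes up to and including `expected`."""
        index = self._buffer.find(expected)
        if index == -1:
            # No match: hand back whatever is left, as a timeout would.
            data, self._buffer = self._buffer, b""
            return data
        end = index + len(expected)
        data, self._buffer = self._buffer[:end], self._buffer[end:]
        return data

    def write(self, data):
        self.sent.append(data)


def run_dialogue(conn, steps):
    """Wait for each prompt in turn, answer it, and collect the host output."""
    transcript = []
    for prompt, reply in steps:
        transcript.append(conn.read_until(prompt, timeout=5))
        # Assumption: the host accepts a bare newline as the line terminator.
        conn.write(reply + b"\n")
    return transcript


conn = ScriptedConnection(b"login: Password: $ ")
transcript = run_dialogue(conn, [(b"login: ", b"guest"), (b"Password: ", b"secret")])
```

Relaying to a human would mean printing each transcript entry and taking the reply from input() instead of from a fixed list.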
[Tutor] Fwd: Re: I've subscribed to your service, no confirmation yet. I'm looking for a tutor and I need help with some code.
Forwarding to list...

 Forwarded Message

The box is my controller with an IP address, I'm doing all this from my windows 7 PC. As I said I can type telnet 10.35.56.90 in the dos cmd prompt and get to my controller. I wrote a python script with the user_acct dictionary.

I do get the >>> in the python IDLE but within my python script/file can I telnet to my controller? Keep in mind when I do log into my controller it's command line driven.

I have python 2.7 and 3.5 installed on my windows 7 pc.

So you're saying open the python IDLE and import the telnet lib and just type; telnet 10.45.34.80 and I'll be able to get to my controller???

Thank you for helping me :)

...you cannot direct the wind but you can adjust your sails...

Angelia Spencer (Angie)

From: Alan Gauld via Tutor
To: tutor@python.org
Sent: Tuesday, May 24, 2016 2:38 PM
Subject: Re: [Tutor] I've subscribed to your service, no confirmation yet. I'm looking for a tutor and I need help with some code.

Re your subject...
This is a mailing list. You subscribe and you should receive mails sent to the list. If you have a question send it to the list (like you did here) and one or more of the list members will hopefully respond. The more specific your question the more precise will be the response. Try to include OS, Python version, any error messages (in full).

Now to your message...

On 24/05/16 18:06, Angelia Spencer via Tutor wrote:
> I'm trying to telnet to my box.

What is your "box"? A server somewhere? Running what OS? Where are you telnetting from?

> When I do this in DOS it's simple, I even have a blinking cursor
> for me to type in commands for each response.

I assume you mean you telnet from a Windows PC and login to your server and get an OS command prompt? (Possibly running bash?)

> Not so for python.

What does this mean? If you run the python command on your DOS console you should get a prompt like >>> at which you can type in commands.
Re: [Tutor] Learning Regular Expressions
On 23/05/16 23:08, Terry--gmail wrote:

> script worked great without the notes! I'd like to know what it is in
> the below Triple-Quoted section that is causing me this problem...if
> anyone recognizes. In IDLE's script file..._it's all colored green_,
> which I thought meant Python was going to ignore everything between the
> triple-quotes!

It's all green for me too and it runs perfectly - as in it does absolutely nothing. And if I add print('hello world') at the end it prints OK too.

I even tried assigning your docstring to a variable and printing that and it too worked.

Linux Mint 17
Python 3.4.3
IDLE 3

So I don't think this is your entire problem.
Maybe you should show us some code that actually causes the error?

> But if I run just the below portion of the script in
> its own file, I get the same While Scanning Triple-Quotes error.

As above, it runs silently for me.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos
Re: [Tutor] I've subscribed to your service, no confirmation yet. I'm looking for a tutor and I need help with some code.
Re your subject...
This is a mailing list. You subscribe and you should receive mails sent to the list. If you have a question, send it to the list (like you did here) and one or more of the list members will hopefully respond. The more specific your question, the more precise the response will be. Try to include OS, Python version, and any error messages (in full).

Now to your message...

On 24/05/16 18:06, Angelia Spencer via Tutor wrote:
> I'm trying to telnet to my box.

What is your "box"? A server somewhere? Running what OS? Where are you telnetting from?

> When I do this in DOS it's simple, I even have a blinking cursor
> for me to type in commands for each response.

I assume you mean you telnet from a Windows PC and log in to your server and get an OS command prompt? (Possibly running bash?)

> Not so for python.

What does this mean? If you run the python command on your DOS console you should get a prompt like

>>>

at which you can type in commands. If python is installed on your "box" then telnet to the box and at the OS prompt type python.

If that doesn't work for you, you will need to give us a lot more information about how you installed Python, which version, how you are trying to run python etc.

> I have a dictionary "user_acct" which has the username
> and password in it.
> My box is interactive, I must be able to type in commands

Again we have no idea what your "box" is. What do you mean it's interactive? Nearly all computers are to some extent.

> and I don't know how to do this in python.

While python does have an interactive mode (the >>> prompt) it's not normally used that way. Usually you put your code in a script file (ending .py) and run it from an OS prompt (or file manager) like

C:\WINDOWS> python myscript.py

> 1st prompt = Username:
> 2nd prompt = Password:
>
> After this my box's output looks like this:
> Last Login Date      : May 24 2016 09:42:08
> Last Login Type      : IP Session(CLI)
> Login Failures       : 0 (Since Last Login)
>                      : 0 (Total for Account)
> TA5000>
> then I type en and get
> TA5000#
> then I type conf t and get
> TA5000(config)#

OK, I'm guessing that's a Unix like system but I'm not sure.

> My code is below:

How are you trying to run this? Where is it stored? Where is python installed?

> import getpass
> import sys
> import telnetlib
>
> username = input()
> password = input()
> tid = 'TA5000'
> first_prompt = '>' # type 'en' at this prompt
> second_prompt = '#' # type 'conf t' at this prompt
> third_prompt = '(config)'
> prompt1 = tid + first_prompt
> prompt2 = tid + second_prompt
> prompt3 = tid + third_prompt + second_prompt
> user_acct = {'ADMIN':'PASSWORD','ADTRAN':'BOSCO','READONLY':'PASSWORD','READWRITE':'PASSWORD','TEST':'PASSWORD','guest':'PASSWORD','':'PASSWORD'}
> host = "10.51.18.88"
> #username = "ADMIN" + newline
> #password = "PASSWORD" + newline
> tn = telnetlib.Telnet(host,"23")
> open()

That calls the builtin open() function without arguments, which should cause an error. Do you get an error message? You probably wanted

tn.open()

> tn.read_until("Username: ")
> tn.write(username)
> tn.read_until("Password: ")
> tn.write(password)
>
> if username in user_acct and password == user_acct[username]:
>     print(prompt1 + "Enter en at this prompt" +"\n")
>     print(prompt2 + "Enter conf t at this prompt" + "\n")
>     print(prompt3 + "\n")
> else:
>     print('Invalid Login... Please Try Again')
> close()

Shouldn't you check the login details before passing them to the telnet host? Also note you are not storing anything you get from the host, so you are just checking your own local data.

I don't really know what this is supposed to be doing.

I'd suggest starting slow. Create a script that simply logs in with a hard coded name/password and then prints a success/fail message and logs out again.

Once you know you can connect and log in then you can start to think about extra features.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos
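A minimal sketch of the "log in, report success or failure, log out" script suggested above. Several things here are assumptions rather than facts from the thread: the function name login is invented, the controller is assumed to accept a plain newline as line terminator, and success is assumed to mean that output ending in ">" arrives after the password is sent. The function only needs an object with read_until() and write() taking bytes, which is what telnetlib.Telnet provides in Python 3.

```python
def login(conn, username, password, prompt=b">", timeout=5):
    """Answer the Username/Password prompts; True if the CLI prompt follows."""
    conn.read_until(b"Username:", timeout)
    conn.write(username.encode("ascii") + b"\n")   # assumed line terminator
    conn.read_until(b"Password:", timeout)
    conn.write(password.encode("ascii") + b"\n")
    # Keep what the host sends back so the result reflects the host's answer
    # rather than a local dictionary lookup.
    banner = conn.read_until(prompt, timeout)
    return banner.endswith(prompt)


# Possible use against a real device (not run here; host and credentials
# are the ones quoted in the thread):
#
#     import telnetlib        # standard library up to Python 3.12
#     tn = telnetlib.Telnet("10.51.18.88", 23)
#     print("Login OK" if login(tn, "ADMIN", "PASSWORD") else "Login failed")
#     tn.close()
```

Note that telnetlib was deprecated in Python 3.11 and removed in 3.13, so on newer interpreters a third-party replacement would be needed.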
[Tutor] I've subscribed to your service, no confirmation yet. I'm looking for a tutor and I need help with some code.
I'm trying to telnet to my box. When I do this in DOS it's simple, I even have a blinking cursor for me to type in commands for each response. Not so for python. I have a dictionary "user_acct" which has the username and password in it. My box is interactive, I must be able to type in commands and I don't know how to do this in python. Please help me any way you can.

1st prompt = Username:
2nd prompt = Password:

After this my box's output looks like this:

Last Login Date      : May 24 2016 09:42:08
Last Login Type      : IP Session(CLI)
Login Failures       : 0 (Since Last Login)
                     : 0 (Total for Account)
TA5000>
then I type en and get
TA5000#
then I type conf t and get
TA5000(config)#

My code is below:

import getpass
import sys
import telnetlib

username = input()
password = input()
tid = 'TA5000'
first_prompt = '>' # type 'en' at this prompt
second_prompt = '#' # type 'conf t' at this prompt
third_prompt = '(config)'
prompt1 = tid + first_prompt
prompt2 = tid + second_prompt
prompt3 = tid + third_prompt + second_prompt
user_acct = {'ADMIN':'PASSWORD','ADTRAN':'BOSCO','READONLY':'PASSWORD','READWRITE':'PASSWORD','TEST':'PASSWORD','guest':'PASSWORD','':'PASSWORD'}
host = "10.51.18.88"
#username = "ADMIN" + newline
#password = "PASSWORD" + newline
tn = telnetlib.Telnet(host,"23")
open()
tn.read_until("Username: ")
tn.write(username)
tn.read_until("Password: ")
tn.write(password)

if username in user_acct and password == user_acct[username]:
    print(prompt1 + "Enter en at this prompt" +"\n")
    print(prompt2 + "Enter conf t at this prompt" + "\n")
    print(prompt3 + "\n")
else:
    print('Invalid Login... Please Try Again')
close()

...you cannot direct the wind but you can adjust your sails...

Angelia Spencer (Angie)
Re: [Tutor] Web Page Scraping
On 24 May 2016 at 15:37, Walter Prins wrote:
> print(name1.encode(sys.stdout.encoding, "backslashreplace"))

I forgot to mention, you might want to read the following documentation page:

https://docs.python.org/3/howto/unicode.html

(good luck.)

W
Re: [Tutor] Learning Regular Expressions
On Mon, May 23, 2016 at 5:08 PM, Terry--gmail wrote:
> Running Linux Mint
> The YouTube Sentdex Video tutor I am following.
> He is working in Python3.4 and I am running Python3.4.3
>
> He's demonstrating some Regular Expressions which I wanted to test out. On
> these test scripts, for future reference, I have been putting my notes in
> Triple Quotes and naming the scripts descriptively to be able to find them
> again, when I need to review. However, this time, when I copied in a simple
> script below my RE notes, and ran it from IDLE (and from Console) I got the
> following error:
>
> SyntaxError: EOF while scanning triple-quoted string literal
>
> Now, there was also a triple-quoted string I had set a variable to in my
> script...so I thought it was the active part of the script! But eventually,
> through the process of elimination, I discovered the script worked great
> without the notes! I'd like to know what it is in the below Triple-Quoted
> section that is causing me this problem...if anyone recognizes. In IDLE's
> script file..._it's all colored green_, which I thought meant Python was
> going to ignore everything between the triple-quotes! But if I run just the
> below portion of the script in its own file, I get the same While Scanning
> Triple-Quotes error.

I do not know the exact point of error in your code, but even if you use triple-quoted strings, escape sequences still work. I do not have a Python 3 installation handy, but in the Python 2.7.8 that I do have handy:

Python 2.7.8 (default, Jun 30 2014, 16:03:49) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> print '''
\tTab character!!!
'''

        Tab character!!!

>>>

Note: I had to simulate with spaces what I see in IDLE, as my Gmail refuses to accurately copy my IDLE result.

I suspect that your multiple backslash instances are generating what you are observing. Doesn't your full traceback target the exact line of your code on which this occurs?

HTH,
boB
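The point about escape sequences still being active inside triple quotes can be checked with a short sketch like the one below. Whether a backslash next to the closing quotes is what actually tripped Terry's file is an assumption; the thread never pins down the exact line. The helper name compiles is invented for the illustration.

```python
def compiles(source):
    """Report whether `source` is syntactically valid Python."""
    try:
        compile(source, "<notes>", "exec")
    except SyntaxError:
        return False
    return True


# Escape sequences are still processed inside ordinary triple quotes...
plain = '''\tTab'''
# ...while a raw string (r prefix) keeps the backslash as a literal character.
raw = r'''\tTab'''

# Source text being compiled:  '''must escape them \'''
# The backslash escapes the first closing quote, so only two quotes remain
# and the string never terminates.
unterminated = "'''must escape them \\'''"

# Source text being compiled:  '''must escape them \\ '''
# A doubled backslash, kept away from the closing quotes, closes cleanly.
terminated = "'''must escape them \\\\ '''"

print(compiles(unterminated), compiles(terminated))
```

A raw string does not rescue the unterminated case either: even with the r prefix, a backslash directly before the closing quotes still prevents them from ending the string.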
Re: [Tutor] Web Page Scraping
Hi,

On 24 May 2016 at 04:17, Crusier wrote:
>
> Dear All,
>
> I am trying to scrape a web site using Beautiful Soup. However, BS
> doesn't show any of the data. I am just wondering if it is Javascript
> or some other feature which hides all the data.
>
> I have the following questions:
>
> 1) Please advise how to scrape the following data from the website:
>
> 'http://www.dbpower.com.hk/en/quote/quote-warrant/code/10348'
>
> Type, Listing Date (Y-M-D), Call / Put, Last Trading Day (Y-M-D),
> Strike Price, Maturity Date (Y-M-D), Effective Gearing (X),Time to
> Maturity (D),
> Delta (%), Daily Theta (%), Board Lot...
>
> 2) I am able to scrape most of the data from the same site
>
> 'http://www.dbpower.com.hk/en/quote/quote-cbbc/code/63852'
>
> Please advise what is the difference between these two sites.

You didn't state which version of Python you're using, nor what operating system, but the source contains print calls with parentheses, so I assume some version of Python 3 and I'm going to guess you're using Windows. Be that as it may, your program crashes with both Python 2 and Python 3. The str() conversion is flagged as a problem by Python 2, stating:

Traceback (most recent call last):
  File "test.py", line 30, in <module>
    web_scraper(warrants)
  File "test.py", line 25, in web_scraper
    name1 = str(n.text)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 282: ordinal not in range(128)

Meanwhile Python 3 breaks earlier with the message:

Traceback (most recent call last):
  File "test.py", line 30, in <module>
    web_scraper(warrants)
  File "test.py", line 18, in web_scraper
    print(soup)
  File "C:\Python35-32\lib\encodings\cp850.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 435-439: character maps to <undefined>

Both of these alert you to the fact that this is due to some encoding issue.
Aside from this your program seems to work, and the data you say you want to retrieve is in fact returned. So in short: if you avoid implicitly encoding the Unicode result from Beautiful Soup into ASCII (or the local machine codepage), which is what happens with your unqualified print calls, you should avoid the problem.

But I guess you're going to want to continue to use print, and you may therefore want to know what the issue is and how you might avoid it. So: the reason for the problem is (basically, as I understand it) that on Windows your console (which is where the results of the print statements go) is not Unicode aware. This implies that when you ask Python to print a Unicode string to the console, there must first be a conversion from Unicode to something your console can accept, to allow the print to execute.

On Python 2, if you don't explicitly deal with this, "ascii" is used, which then duly falls over if it runs into anything that doesn't map cleanly into the ASCII character set. Python 3 is clever enough to figure out what my console codepage (cp850) is, which means more characters are mappable to my console character set. However, this is still not enough to convert the characters at positions 435-439 of the Beautiful Soup result, as mentioned in the error message.

The way to avoid this is to tell Python how to deal with this.
For example (changed lines are marked with a trailing #):

from bs4 import BeautifulSoup
import requests
import json
import re
import sys  #

warrants = ['10348']

def web_scraper(warrants):
    url = "http://www.dbpower.com.hk/en/quote/quote-warrant/code/"
    # Scrape from the Web
    for code in warrants:
        new_url = url + code
        response = requests.get(new_url)
        html = response.content
        soup = BeautifulSoup(html,"html.parser")
        print(soup.encode(sys.stdout.encoding, "backslashreplace"))  #
        name = soup.findAll('div', attrs={'class': 'article_content'})
        #print(name)
        for n in name:
            name1 = n.text  #
            s_code = name1[:4]
            print(name1.encode(sys.stdout.encoding, "backslashreplace"))  #

web_scraper(warrants)

Here I'm picking up the encoding from stdout, which on my machine is "cp850". If sys.stdout.encoding is blank on your machine you might try something explicit, or as a last resort you might try "utf-8"; that should at least make the text "printable" (though perhaps not what you want.)

I hope that helps (and look forward to possible corrections or improved advice from other list members, as I'm admittedly not an expert on Unicode handling either.)

For reference, in future always post full error messages, and the version of Python and operating system.

Cheers

Walter
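The effect of the "backslashreplace" error handler can be seen in isolation with a small sketch. The helper name printable is invented here, and the fallback to "utf-8" when stdout reports no encoding is an assumption taken from the advice above. Unlike the listing above, it decodes the result again so that print() receives text rather than a bytes object.

```python
import sys


def printable(text, encoding=None):
    """Return `text` with characters the target encoding lacks shown as escapes."""
    # Assumption: fall back to UTF-8 when stdout reports no encoding.
    encoding = encoding or sys.stdout.encoding or "utf-8"
    # Encode with escapes for unmappable characters, then decode again so
    # the caller gets str back instead of bytes (which print as b'...').
    return text.encode(encoding, "backslashreplace").decode(encoding)


# U+00A0 (no-break space) is the character from the Python 2 traceback.
print(printable("Strike\xa0Price", "ascii"))    # Strike\xa0Price
# U+2019 (right single quote) has no cp850 equivalent.
print(printable("Today\u2019s quote", "cp850"))  # Today\u2019s quote
```

Characters the console can represent pass through unchanged; only the unmappable ones are replaced by their escape sequences.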
[Tutor] Learning Regular Expressions
Running Linux Mint
The YouTube Sentdex Video tutor I am following.
He is working in Python3.4 and I am running Python3.4.3

He's demonstrating some Regular Expressions which I wanted to test out. On these test scripts, for future reference, I have been putting my notes in Triple Quotes and naming the scripts descriptively to be able to find them again, when I need to review. However, this time, when I copied in a simple script below my RE notes, and ran it from IDLE (and from Console) I got the following error:

SyntaxError: EOF while scanning triple-quoted string literal

Now, there was also a triple-quoted string I had set a variable to in my script...so I thought it was the active part of the script! But eventually, through the process of elimination, I discovered the script worked great without the notes! I'd like to know what it is in the below Triple-Quoted section that is causing me this problem...if anyone recognizes. In IDLE's script file..._it's all colored green_, which I thought meant Python was going to ignore everything between the triple-quotes! But if I run just the below portion of the script in its own file, I get the same While Scanning Triple-Quotes error.

#!/usr/bin/env python3

'''
Regular Expressions - or at least some

Identifiers:

\d  any number
\D  anything but a number (digit)
\s  space
\S  anything but a space
\w  any character
\W  anything but a character
.   any character (or even a period itself if you use \.) except for a newline
a   search for just the letter 'a'
\b  the white space around words

Modifiers

{x}     we are expecting "x" number of something
{1, 3}  we're expecting 1-3 in length of something -, so for digits we write \d{1-3}
+       means Match 1 or more
?       means Match 0 or 1
*       Match 0 or more
$       Match the end of a string
^       Match the beginning of a string
|       Match either or - so you might write \d{1-3} | \w{5-6}
[ ]     a range or "variance" such as [A-Z] or [A-Za-z] Cap 1st letter followed by lower case
        or [1-5a-qA-Z] starts with a number inclusive of 1-5 then lower case letter then
        followed by any Cap letter! :)

White Space Characters (may not be seen):

\n  new line
\s  space
\t  tab
\e  escape
\f  form feed
\r  return

DON'T FORGET!:

. + * ? [ ] $ ^ ( ) { } | \ if you really want to use these, you must escape them '\'
'''

Thanks for your thoughts!
--Terry
Re: [Tutor] Getting started in testing
Thanks to Alan, Danny, Albert-Jan and Ben for their suggestions. I've now gotten my feet wet in unittest and have gone from not quite knowing where to start to making substantial progress, with a small suite of tests up and running.
Re: [Tutor] Python 3: string to decimal conversion
c...@zip.com.au wrote:

> On 23May2016 12:18, Saidov wrote:
>>Thanks everyone for all your help. This solved my problem with
>>parenthesis and $ signs in the data:
>>
>>if not row[4]:
>>    pass
>>else:
>>    try:
>>        expenses[ts.Date(row[0]).month] += decimal.Decimal(row[4].strip('()$ ,').replace(',',''))
>>    except decimal.InvalidOperation as e:
>>        print("unexpected expenses value: %r" % (row[4]))
>
> These are three things to remark on with your new code:

Noughtily? Whatever you do, the conversion is complex enough to put it into a separate function. This makes it easier to test the code against typical input and corner cases.

> Firstly, it is best to put try/except around as narrow a piece of code as
[...]
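A sketch of what pulling the conversion into its own function might look like. The name parse_amount is hypothetical; the cleanup expression is the one quoted in the thread, so parentheses are simply stripped and not treated as a negative sign.

```python
import decimal


def parse_amount(text):
    """Convert a cell such as '$1,234.56' or '(45.00)' to a Decimal.

    Mirrors the cleanup quoted above: surrounding parentheses, dollar
    signs, spaces and commas are stripped, then inner commas are removed.
    Raises decimal.InvalidOperation if what remains is not a number.
    """
    cleaned = text.strip("()$ ,").replace(",", "")
    return decimal.Decimal(cleaned)


# Typical input and a few corner cases, checked one at a time.
for cell in ["$1,234.56", "(45.00)", " $ 7 ", "n/a"]:
    try:
        print(repr(cell), "->", parse_amount(cell))
    except decimal.InvalidOperation:
        print("unexpected expenses value: %r" % cell)
```

With the conversion isolated like this, the try/except in the calling loop only needs to wrap the single call to parse_amount.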