from:"Simon Evans"

Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ?

2015-07-12 Thread Simon Evans


Dear Peter Otten, 
I typed in (and did not copy and paste) the code as you suggested just now 
(6.28 pm, Sunday 12th July 2015), this is the result I got: 

Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win
32
Type "help", "copyright", "credits" or "license" for more information.
>>> from bs4 import BeautifulSoup
>>> with open("C:\Beautiful Soup\ecologicalpyramid.html","r")as f:
... soup = BeautifulSoup(f,"lxml")
  File "", line 2
soup = BeautifulSoup(f,"lxml")
   ^
IndentationError: expected an indented block
>>> soup = BeautifulSoup(f,"lxml")
Traceback (most recent call last):
  File "", line 1, in 
NameError: name 'f' is not defined
>>>

The first time I typed in the second line, I got the 
"Indentation error" 
the second time I typed in exactly the same code, I got the: 
"NameError:name 'f' is not defined"
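
The two errors above are consistent with the second line having been typed without leading spaces: the body of a `with` statement has to be indented, and once the console rejects the block, `f` is never assigned, so the retry raises NameError. A minimal sketch, assuming Python 3 and using only the built-in `compile()` to reproduce the syntax check; the file name `demo.txt` is hypothetical and is never actually opened:

```python
# Sketch: the body of a `with` block must be indented.
# "demo.txt" is a made-up name; compile() only checks syntax, it opens nothing.

unindented = 'with open("demo.txt", "r") as f:\ntext = f.read()\n'
indented = 'with open("demo.txt", "r") as f:\n    text = f.read()\n'


def compiles(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        compile(source, "<console>", "exec")
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False
    return True


print(compiles(unindented))  # body typed at the left margin
print(compiles(indented))    # body indented by four spaces
```

At the `>>>` prompt this would mean typing a few spaces before `soup = ...` on the `...` line, then pressing Enter on an empty line to close the block.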
-- 
https://mail.python.org/mailman/listinfo/python-list

Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ?

2015-07-12 Thread Simon Evans

Dear Peter Otten, 
Yes, I have been copying and pasting, as it saves typing. I do get 'indented 
block' error responses, which is a small price to pay for the time and energy 
saved. Also, for reasons best known to itself, the console rejects copied and 
pasted lines with 'indented block' errors, then accepts exactly the same lines 
when they are input again on the following line. Maybe it is an inbuilt 
feature of Python's to discourage copying and pasting.

Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ?

2015-07-12 Thread Simon Evans

Dear Peter Otten,
Incidentally, you have discovered a fault in that there is an erroneous 
difference in my code of 'ecologicalpyramid.html' and that given in the text, 
in the first few lines re: 

 
 
 
 
 
 
plants 
10 
 
 
algae 
10 
 
 






plants
10


algae
10



I have removed the line  to the right html code of the 
lower 

version. Now there is a string ("plants") between the <"li class producerlist"> 
and 
Sorry about that.
However as you said, the input code as quoted in the text, still won't return 
'plants' 

re:

Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win
32
Type "help", "copyright", "credits" or "license" for more information.
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup("C:\Beautiful Soup\ecological_pyramid,","lxml")
>>> producer_entries = soup.find("ul")
>>> print(producer_entries.li.div.string)
Traceback (most recent call last):
  File "", line 1, in 
AttributeError: 'NoneType' object has no attribute 'li'
>>> 
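
One likely reading of the AttributeError above: `BeautifulSoup(...)` was handed the path text itself rather than the contents of the file, so the parser never sees a `<ul>` tag and `find("ul")` comes back as None. The sketch below illustrates the same effect with the standard library's `html.parser` as a stand-in for Beautiful Soup (Python 3 assumed). The class `FirstProducerFinder`, the helper `first_producer` and the sample markup are invented for illustration and are not the book's code.

```python
from html.parser import HTMLParser


class FirstProducerFinder(HTMLParser):
    """Collect the text of the first <div> directly inside an <li> inside a <ul>."""

    def __init__(self):
        super().__init__()
        self.path = []       # stack of currently open tags
        self.result = None   # stays None when no ul > li > div text is seen

    def handle_starttag(self, tag, attrs):
        self.path.append(tag)

    def handle_endtag(self, tag):
        if tag in self.path:
            # pop open tags until the matching one has been removed
            while self.path and self.path.pop() != tag:
                pass

    def handle_data(self, data):
        inside_target = self.path[-3:] == ["ul", "li", "div"]
        if self.result is None and inside_target and data.strip():
            self.result = data.strip()


def first_producer(markup):
    """Parse `markup` and return the first producer name, or None."""
    finder = FirstProducerFinder()
    finder.feed(markup)
    finder.close()  # flush any text the parser is still buffering
    return finder.result


# Invented sample markup, shaped like the list the thread describes.
page = ('<html><body><ul id="producers"><li class="producerlist">'
        '<div class="name">plants</div><div class="number">10</div>'
        '</li></ul></body></html>')
# The path text from the transcript: a string with no tags in it at all.
path_text = r"C:\Beautiful Soup\ecological_pyramid,"

print(first_producer(page))
print(first_producer(path_text))
```

The second call has nothing to find because a file path is just text to a parser. Opening the file and passing the file object (or its contents) is what gives the parser real markup to work with.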


Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ?

2015-07-12 Thread Simon Evans

Dear Peter Otten, thank you for your reply. I have not gone very far into 
its detail yet, as it seems the Python console cannot recognise the name 'f' 
as given to it, re output below :


Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win
32
Type "help", "copyright", "credits" or "license" for more information.

>>> from bs4 import BeautifulSoup
>>> with open("C:\Beautiful Soup\ecologicalpyramid.html","r")as f:
>>> soup = BeautifulSoup(f, "lxml")
Traceback (most recent call last):
  File "", line 1, in 
NameError: name 'f' is not defined
>>>

Re: Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ?

2015-07-12 Thread Simon Evans

Dear Mark Lawrence, thank you for your advice. 
I take it that I use the input you suggest for the line :

soup = BeautifulSoup("C:\Beautiful Soup\ecological_pyramid.html",lxml")

seeing as I have to give the file's full address I therefore have to modify 
your :

soup = BeautifulSoup(ecological_pyramid,"lxml")

to :

soup = BeautifulSoup("C:\Beautiful Soup\ecological_pyramid," "lxml")

otherwise I get :


>>> with open("C:\Beautiful Soup\ecologicalpyramid.html"."r")as 
>>> ecological_pyramid:
>>> soup = BeautifulSoup(ecological_pyramid,"lxml")
Traceback (most recent call last):
  File "", line 1, in 
NameError: name 'ecological_pyramid' is not defined


so anyway with the input therefore as:

>>> with open("C:\Beautiful Soup\ecologicalpyramid.html"."r")as 
>>> ecological_pyramid: 
>>> soup = BeautifulSoup("C:\Beautiful Soup\ecological_pyramid,","lxml")
>>> producer_entries = soup.find("ul")
>>> print(producer_entries.li.div.string)

I still get the following output from the console:

Traceback (most recent call last):
  File "", line 1, in 
AttributeError: 'NoneType' object has no attribute 'li'
>>>

As is probably evident, what is the problem Python has with finding the 
required html code within the 'ecologicalpyramid' html file, or more 
specifically why does it respond that the html file has no such attribute as 
'li' ?
Incidentally I have installed all the xml, lxml, html, and html5 TreeBuilders/ 
Parsers. I am using lxml as that is the format specified in the text. 

I may as well quote the text on the page in question in 'Getting Started with 
Beautiful Soup':

'Since producers come as the first entry for the <ul> tag, we can use the find() 
method, which normally searches for only the first occurrence of a particular 
tag in a BeautifulSoup object. We store this in producer_entries. The next line 
prints the name of the first producer. From the previous HTML diagram we can 
understand that the first producer is stored inside the first <div> tag of the 
first <li> tag that immediately follows the first <ul> tag, as shown in the 
following code: 



plants
10



So after running the preceding code, we will get plants, which is the first 
producer, as the output.'

(page 30)

Why doesn't input code return 'plants' as in 'Getting Started with Beautiful Soup' text (on page 30) ?

2015-07-11 Thread Simon Evans

Dear Programmers, 
Thank you for your advice regarding giving the console a current address in the 
code for it to access the html file. 

The console seems to accept the code to that extent, but when I input the two 
lines of code intended to access the location of a required word, the console 
rejects it re :

AttributeError:'NoneType' object has no attribute 'li' 

However the document 'EcologicalPyramid.html' does contain the words 'li' and 
'ul' in its text. I am not sure how the input is arranged to output 
'plants', which is also in the document's text, but that is the word the code is 
meant to elicit. 

I enclose the pertinent code as input and output from the console, and the html 
code for the document 'EcologicalPyramid.html'

Thank you in advance for your help. 

-
>>> with open("C:\Beautiful Soup\ecologicalpyramid.html","r") as 
>>> ecological_pyramid:
soup = BeautifulSoup("C:\Beautiful Soup\ecological_pyramid.html","lxml")
... producer_entries = soup.find("ul")
  File "", line 2
producer_entries = soup.find("ul")
   ^
SyntaxError: invalid syntax
>>> producer_entries = soup.find("ul")
>>> print (producer_entries.li.div.string)
Traceback (most recent call last):
  File "", line 1, in 
AttributeError: 'NoneType' object has no attribute 'li'
--





plants
10


algae
10




deer
1000

deer
1000


rabbit
2000




fox
100


bear
100




lion
80


tiger
50





Re: Python console rejects an object reference, having made an object with that reference as its name in previous line

2015-01-04 Thread Simon Evans

Dear Michael Torrie,
Thanks for pointing that out to me re: it not being a syntax problem.
The thing is there is a file called 'EcologicalPyramid.html'. I put it in a 
folder called 'Soup' as the text advised on page 28. For what it's worth I also 
shifted the Windows Command Prompt to that folder (re: cd Soup) as instructed on 
page 30, and put a duplicate file of 'EcologicalPyramid.html' in the Python 2.7 
directory. 
I therefore am wondering where I ought to put this html file so that the Python 
console will recognize it ? 
Thank you for your attention,
Yours 
Simon  


Re: Python console rejects an object reference, having made an object with that reference as its name in previous line

2014-12-18 Thread Simon Evans

@Steven D'Aprano,
I input the following to Python 2.7, which got the following:- 

>>> from bs4 import BeautifulSoup
>>> with open("ecologicalpyramid.html","r") as ecological_pyramid:
...  soup= next(ecological_pyramid,"lxml")
...  producer_entries = soup.find("ul")
...
Traceback (most recent call last):
  File "", line 1, in 
IOError: [Errno 2] No such file or directory: 'ecologicalpyramid.html'
>>>

- I kept to your instructions to press 'Enter' after the fourth line and 
before the fifth line, ie between the indented block and the unindented 
one, which as above doesn't give me a chance to actually input the fifth line. 
Whichever way I do it, ie pressing Enter after the fourth line and before the 
fifth, or only after the fourth, it won't let me input the fifth line, because 
before I can, I still get an 
error return.
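
The IOError above says the console could not find 'ecologicalpyramid.html', which suggests the relative file name is being resolved against a different folder from the one the file is in. A small sketch, assuming Python 3 and only the standard `os` module, of how to see which folder Python is working in and where it would look for a relative name; the helper `describe` is hypothetical:

```python
import os


def describe(filename):
    """Return the full path Python would use for `filename`, and whether a file is there."""
    full_path = os.path.abspath(filename)  # resolved against the current working directory
    return full_path, os.path.isfile(full_path)


print(os.getcwd())                          # the folder relative names are looked up in
print(describe("ecologicalpyramid.html"))   # full path, plus True or False
```

If the second value is False, either the file lives elsewhere or the console was started from another folder. Passing the full path to `open()` avoids depending on the working directory at all.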

Re: Python console rejects an object reference, having made an object with that reference as its name in previous line

2014-12-14 Thread Simon Evans

Dear Jussi, and Billy
I have changed the input in accordance with your advice, re:
--
Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win
32
Type "help", "copyright", "credits" or "license" for more information.
>>> from bs4 import BeautifulSoup
>>> with open("ecologicalpyramid.html","r") as ecological_pyramid:
...  soup = next(ecological_pyramid,"lxml")
...  producer_entries = soup.find("ul")
...  print(producer_entries.li.div.string)
... print(producer_entries.li.div.string)
  File "", line 5
print(producer_entries.li.div.string)
^
SyntaxError: invalid syntax
>>> print (producer_entries.li.div.string)
Traceback (most recent call last):
  File "", line 1, in 
NameError: name 'producer_entries' is not defined
>>> from bs4 import BeautifulSoup
>>> with open("ecologicalpyramid.html","r") as ecological_pyramid:
...  soup = next(ecological_pyramid,"lxml")
...  producer_entries = soup.find("ul")
...  print(producer_entries.li.div.string)
...

As no doubt you can see, the last line, indented as it is, does not provide the 
output that the book's text says it will return - ie the word 'plants'
If I do not indent it, it returns an 'invalid syntax error' stating that 
'producer_entries' is not defined. Though code in the previous line is meant to 
do just that - isn't it ? 
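
A note on the `next(ecological_pyramid,"lxml")` line used above, as far as the built-ins go: `next()` on an open file returns one line of text (its second argument is only a fallback value), and `find` on a string returns a position number, not a tag. The sketch below, assuming Python 3, uses `io.StringIO` as a stand-in for the open file; the HTML content is invented for the demonstration.

```python
import io

# Stand-in for an open file; the content is made up for this demo.
fake_file = io.StringIO("<html>\n<ul><li><div>plants</div></li></ul>\n</html>\n")

first_line = next(fake_file, "lxml")  # reads ONE line; "lxml" is only a fallback value
position = first_line.find("ul")      # str.find gives an index, or -1 when absent

print(repr(first_line))
print(position)
print(type(position).__name__)
```

Because `position` is a plain integer, a later `.li.div.string` on it could not work. It is the `BeautifulSoup(ecological_pyramid, "lxml")` call from the book that builds an object with those attributes.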

Re: Python console rejects an object reference, having made an object with that reference as its name in previous line

2014-12-14 Thread Simon Evans

I had another attempt at inputting the code perhaps with the right indentation, 
I still get an error return, but not one that indicates that the code has not 
been read, as you suggested. re:- 


Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win
32
Type "help", "copyright", "credits" or "license" for more information.
>>> from bs4 import BeautifulSoup
>>> with open("ecologicalpyramid.html","r") as ecological_pyramid:
...  soup = BeautifulSoup(ecological_pyramid,"lxml")
... producer_entries = soup.find("ul")
  File "", line 3
producer_entries = soup.find("ul")
   ^
SyntaxError: invalid syntax
>>>  from bs4 import BeautifulSoup
  File "", line 1
from bs4 import BeautifulSoup

If, as you suggest I left a free line after the "with open( etc" line, console 
returns an error, if I leave a free line after the "soup = etc" line which 
comes after, again I get an error return, my only point is that with the above 
input, console return does not seem to infer that soup has not been defined. 
You recommend that I put all the code into a file then run it - how do I do 
that ? I am new to Python, as you might have gathered. 
Thank you for your help.
Yours Simon

Python console rejects an object reference, having made an object with that reference as its name in previous line

2014-12-14 Thread Simon Evans

Dear Python programmers,
Having input the line of code in text: 
cd Soup 
to the Windows console, and having put the file 'EcologicalPyramid.html' into 
the Directory 'Soup', on the C drive, in accordance with instructions I input 
the following code to the Python console, as given on page 30 of 'Getting 
Started with Beautiful Soup':

Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win
32
Type "help", "copyright", "credits" or "license" for more information.
>>> from bs4 import BeautifulSoup
>>> with open("ecologicalpyramid.html","r") as ecological_pyramid:
...  soup = BeautifulSoup(ecological_pyramid,"lxml")
... producer_entries = soup.find("ul")
 
   ^
SyntaxError: invalid syntax
>>> producer_entries = soup.find("ul")
Traceback (most recent call last):
  File "", line 1, in 
NameError: name 'soup' is not defined
>>>
   ^

so I cannot proceed with the next line, which would've been :

print(producer_entries.li.div.string)

which would've given (according to the book) the output:
---
plants

Maybe that is getting a bit far ahead, but I can't quite see where I have gone 
wrong - 'soup' has been defined as an object made of file 
'EcologicalPyramid.html'.

I hope you can help me on this point. 
Yours 
Simon

Re: Text Code(from 'Getting Started in Beautiful Soup' re: cd Soup , returns 'Syntax Error, invalid syntax'

2014-12-14 Thread Simon Evans

Thanks Guys 
This book keeps swapping from the Python console to the Windows one without 
telling you, but it is the only book out there on 'Beautiful Soup' so I have 
got to put up with it. There are more problems with it, but I will start a new 
thread regarding those, as I don't know if they are related to the above or not. 
Yours 
Simon.

Text Code(from 'Getting Started in Beautiful Soup' re: cd Soup , returns 'Syntax Error, invalid syntax'

2014-12-11 Thread Simon Evans

At the start of Chapter 3 of 'Getting Started in Beautiful Soup' it has said to 
create a html file, 'ecological pyramid.html' - which I have already done re:







plants
10


algae
10




deer
1000

deer
1000


rabbit
2000




fox
100


bear
100




lion
80


tiger
50





and ran it okay in 'Explorer', and text then says to save it to a folder named 
'Soup' which I have done. 
On the next page (30) it says to navigate to that folder with the following  
code to the python console :-
 
cd Soup

however console rejects that code with the following return: - 

Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win
32
Type "help", "copyright", "credits" or "license" for more information.
>>> cd Soup
  File "", line 1
cd Soup
  ^
SyntaxError: invalid syntax
>>>
----
Thank you for reading, hope you can help.
Yours
Simon Evans
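
For reference, `cd` is a command of the Windows Command Prompt rather than Python, which is why the `>>>` prompt reports a SyntaxError. Inside the Python console the nearest counterpart is `os.chdir()`. A minimal sketch, assuming Python 3; a temporary folder stands in for the book's 'Soup' folder so the example does not depend on any real path:

```python
import os
import tempfile

start_dir = os.getcwd()
soup_dir = tempfile.mkdtemp()   # hypothetical stand-in for the 'Soup' folder

os.chdir(soup_dir)              # Python-console counterpart of the shell's `cd`
inside = os.path.realpath(os.getcwd()) == os.path.realpath(soup_dir)
print(inside)

os.chdir(start_dir)             # go back so the demo leaves no side effects
os.rmdir(soup_dir)              # remove the empty temporary folder
```

The other route is to type `cd Soup` at the Command Prompt first and only then start Python from that folder.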

Re: python 2.7 and unicode (one more time)

2014-12-02 Thread Simon Evans


Hi Peter Otten
re:

There is no assignment 

soup_atag = whatever 

but there is one to atag. The whole session should work when you omit the 
offending line 

> atag = soup_atag.a 

or insert 

soup_atag = soup 

before it. 

Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win
32
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib2
>>> from bs4 import BeautifulSoup
>>> html_atag = """Test html a tag example
... http://www.packtpub.com'>Home
... >> soup = BeautifulSoup(html_atag,'lxml')
>>> atag = soup.aprint(atag)
>>> atag = soup.a
>>> print(atag)
http://www.packtpub.com'>Home


>>> type(atag)

>>> tagname = atag.name
>>> print tagname
a
>>> atag.name = 'p'
>>> print (soup)
Test html a tag example
http://www.packtpub.com'>Home



>>> atag.name = 'p'
>>> print(soup)
Test html a tag example
http://www.packtpub.com'>Home



>>> atag.name = 'a'
>>> print(soup)
Test html a tag example
http://www.packtpub.com'>Home



>>> soup_atag = soup
>>> atag = soup_atag.a
>>> print (atag['href'])
http://www.packtpub.com'>Home
>>

Thank you.
Yours
Simon.


Tag objects in Beautiful Soup

2014-11-20 Thread Simon Evans

Re:'Accessing the Tag object from Beautiful Soup' (page 22-25 - Getting Started 
with Beautiful Soup)
So far the code to python27 runs as given in the book, re: - 

>>> html_atag = """Test html a tag example
... http://www.packtpub.com'>Home
... >> soup = BeautifulSoup(html_atag,'lxml')
>>> atag = soup.a
>>> print(atag)
http://www.packtpub.com'>Home</a>
<a href=" http="">

>>> type(atag)

>>>
>>> tagname = atag.name
>>> print tagname
a
>>> atag.name = 'p'
>>> print (soup)
Test html a tag example
http://www.packtpub.com'>Home</a>
<a href=" http="">



then under the next Sub heading : 'Attributes of a Tag object'
text reads : 
atag = soup_atag.a
print (atag['href'])

#output
http://www.packtpub.com

however when I put this code to the console I get error returns at the first 
line re:- 

>>> atag = soup_atag.a
Traceback (most recent call last):
  File "", line 1, in 
NameError: name 'soup_atag' is not defined
>>>
--------
Can anyone tell me where I am going wrong or where the text is wrong ? 
So far the given code has run okay, I have put to the console everything the 
text tells you to. 
Thank you for reading.
Simon Evans

Re: How do you download and install HTML5TreeBuilder ?

2014-11-18 Thread Simon Evans

re: 

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Intel Atom>pip install html5lib
Downloading/unpacking html5lib
  Running setup.py (path:c:\users\intela~1\appdata\local\temp\pip_build_Intel At
om\html5lib\setup.py) egg_info for package html5lib

Downloading/unpacking six (from html5lib)
  Downloading six-1.8.0-py2.py3-none-any.whl
Installing collected packages: html5lib, six
  Running setup.py install for html5lib

Successfully installed html5lib six
Cleaning up...

C:\Users\Intel Atom>


- Thanks Mark.

How do you download and install HTML5TreeBuilder ?

2014-11-18 Thread Simon Evans

Dear Programmers,
I have installed the HTMLParserTreebuilder and LXMLTreeBuilder downloads to my 
Python2.7 console, using the Windows Console 'pip install' procedure.

I downloaded HTML5 files and installed them to my Python2.7 directory, and went 
through the 'pip install' procedure, but this did not work. 

I do not know whether it is because different procedure must be followed for 
HTML5, or that I downloaded the wrong files, the files I downloaded and 
attempted to install were the following three :- 

html5lib-0.999(1).tar.gz

html5lib-0.999.tar.gz

HTMLParser-0.0.2.tar.gz

The  Windows 7.0 Console returned the following in response :- 
 
Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Intel Atom>pip install HTML5
Downloading/unpacking HTML5
  Could not find any downloads that satisfy the requirement HTML5
Cleaning up...
No distributions at all found for HTML5
Storing debug log for failure in c:\users\intela~1\appdata\local\temp\tmp4pxazz

C:\Users\Intel Atom>pip install HTML5
Downloading/unpacking HTML5
  Could not find any downloads that satisfy the requirement HTML5
Cleaning up...
No distributions at all found for HTML5
Storing debug log for failure in c:\users\intela~1\appdata\local\temp\tmp81fbka

C:\Users\Intel Atom>pip install HTML5
Downloading/unpacking HTML5
  Could not find any downloads that satisfy the requirement HTML5
Cleaning up...
No distributions at all found for HTML5
Storing debug log for failure in c:\users\intela~1\appdata\local\temp\tmphaw01m

C:\Users\Intel Atom>

I suppose my main conundrum is from where can I download a version of the 
HTML5 Treebuilder that will install using pip. It doesn't help that HTML5 also 
happens to be the name of some video editing software. 
Thank you for reading.
PS: If anyone is upset about 'one line paragraphs' and other such petulancies, 
then please decline to respond, seeing as far as I'm concerned such 
trivialities are besides the point, and are of no help, so vent your ire 
elsewhere. 
Yours Simon Evans. 

Re: Installing Parsers/Tree Builders to, and accessing these packages from Python2.7

2014-11-03 Thread Simon Evans

I input 'pip install html5lib' to the cmd console but again got an error 
return. I thought one of the participants was unhappy about single line spacing 
(re: 'single line paragraphs'). Okay, I will go back to single line spacing; I 
don't think it is all that important, really.
Anyway this is my console's response:- 


Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Intel Atom>pip install html5lib
'pip' is not recognized as an internal or external command,
operable program or batch file.

C:\Users\Intel Atom>  
 

Re: Installing Parsers/Tree Builders to, and accessing these packages from Python2.7

2014-11-03 Thread Simon Evans

I input 'pip install html5lib' to the Python 2.7 console and got :

>>> pip install html5lib
  File "", line 1
pip install html5lib
  ^
SyntaxError: invalid syntax
>>>

I am not sure what you mean about 'single line paragraphs'. I put my text 

into double line spacing in my last missive, I left the code input/ output 

in single line spacing as that is how it reads from the console, after all 

who am I to alter it? Regarding 'context' if you are referring to the text 

I am using, it is from 'Getting Started in Beautiful Soup' by Vineeth G.

Nair. 

For what it's worth some of the subsequent code in the book runs, but not 

all, and I think this may be due to the parser installation factor, and I 

wanted to work through the book (112 pages) without any errors being 

returned. 

Re: Installing Parsers/Tree Builders to, and accessing these packages from Python2.7

2014-11-02 Thread Simon Evans

What I meant to say was I can't get the html5 or the html parsers to install, I 
have got their downloads in their respective directories in the downloads 
directory. 

Re: Installing Parsers/Tree Builders to, and accessing these packages from Python2.7

2014-11-02 Thread Simon Evans


Oh I don't mind quoting console output, I just thought I'd be sparing you 

unnecessary detail. 

output was going nicely as I input text from my 'Getting Started with 

Beautiful Soup' even when the author reckoned things would go wrong - due to

lxml not being installed, things went right, because I had already installed

it, re:

page 17

Python 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)] on win
32
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib2
>>> from bs4 import BeautifulSoup
>>> url = "http://www.packtpub.com/books";
>>> page = urllib2.urlopen(url)
>>> soup_packtpage = BeautifulSoup(page)
>>> with open("foo.html","r") as foo_file:
... soup_foo = Soup(foo_file)
  File "", line 2
soup_foo = Soup(foo_file)
   ^
IndentationError: expected an indented block
>>> soup_foo= BeautifulSoup("foo.html")

page 18

>>> print(soup_foo)
foo.html
>>> soup_url = BeautifulSoup("http://www.packtpub.com/books";)
>>> print(soup_url)
http://www.packtpub.com/books
>>> helloworld = "Hello World"
>>> soup_string = BeautifulSoup(helloworld)
>>> print(soup_string)
Hello World

page 19: no code in text on this page

page 20

>>> soup_xml = BeautifulSoup(helloworld,features= "xml")
>>> soup_xml = BeautifulSoup(helloworld,"xml")
>>> print(soup_xml)

Hello World
>>> soup_xml = BeautifulSoup(helloworld,features = "xml")
>>> print(soup_xml)

Hello World
>>>

Then on bottom of page 20 it says 'we should install the required parsers using 
easy-install,pip or setup.py install' but as I can't get the downloads of html 
or html5 parsers, text code halfway down returns statutory response regarding 
requisite parser needing to be installed, re: 

page 21

>>> invalid_html = '>> soup_invalid_html = BeautifulSoup(invalid_html,'lxml')
>>> print(soup_invalid_html)

>>> soup_invalid_html = BeautifulSoup(invalid_html,'html5lib')
Traceback (most recent call last):
  File "", line 1, in 
  File "C:\Python27\lib\site-packages\bs4\__init__.py", line 155, in __init__
% ",".join(features))
ValueError: Couldn't find a tree builder with the features you requested: 
html5lib. Do you need to install a parser library?
>>>

Re: Installing Parsers/Tree Builders to, and accessing these packages from Python2.7

2014-11-02 Thread Simon Evans

I have got the html5lib-0.999.tar.gz

and the HTMLParser-0.0.2.tar.gz files in my Downloads the problem is how I

install them to Python2.7. 

The lxml-3.3.3.win32-py2.7 is an exe file, which upon clicking will install 

but obviously the html and the html5 installations  are not so 

straightforward. 

Re: Installing Parsers/Tree Builders to, and accessing these packages from Python2.7

2014-11-02 Thread Simon Evans

Dear Mark Lawrence, 
I have tried inputting the code in the first link, re:
>>> import lxml
>>> import lxml.etree
>>> import bs4.builder.htmlparser
Traceback (most recent call last):
  File "", line 1, in 
ImportError: No module named htmlparser
>>> import bs4.builder._lxml
>>> import bs4.builder.html5lib
Traceback (most recent call last):
  File "", line 1, in 
ImportError: No module named html5lib
>>>

which tells me lxml is installed, but that neither html nor html5 is installed. 
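
A caveat on reading the imports above: the one that succeeded, `bs4.builder._lxml`, has a leading underscore in its name, while the two that failed do not, so the failures may say more about the module names than about what is installed. A more direct check, sketched here assuming Python 3.4 or later, is to ask the import system whether a top-level package can be found at all; the helper name `is_installed` is made up for this example:

```python
import importlib.util


def is_installed(top_level_name):
    """Return True if a top-level module or package can be found for import."""
    return importlib.util.find_spec(top_level_name) is not None


for name in ("bs4", "lxml", "html5lib"):
    print(name, is_installed(name))
```

Each line of output pairs a package name with True or False for the Python installation the console is running under.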

Re: Installing Parsers/Tree Builders to, and accessing these packages from Python2.7

2014-11-02 Thread Simon Evans

I have proceeded to click on the 'setup.py' in the html5lib-0.999 folder and got a 
python console for a few seconds. This may have been the installation of the 
HTML5 parser/ treebuilder - I will have to put the code that did not work 
previously to it again; hopefully it will work this time. 


Re: Installing Parsers/Tree Builders to, and accessing these packages from Python2.7

2014-11-02 Thread Simon Evans

Dear Terry Reedy 
I am using operating system Windows 7. 
I put the HTML TreeBuilder / html5 library into the Python2.7 folder. 
I read that the LXML Treebuilder / lxml installs itself automatically to the 
Python2.7 installation, so that is why I am not having difficulty with that 
installation. 
I don't think it really matters where the lxml download ended up necessarily, 
all I want is to know how I can install it so it works, I cannot get any 
feedback because it isn't working, all I get is the automated inbuilt response 
about 'Do I want a treebuilder/ parser that is appropriate to the input' or 
words to that effect. What I want to know is how to get this lxml 
treebuilder/parser to run, ie: what is the protocol for running the lxml 
download so's it'll run, or what sort of code do I put to my python console in 
order to get it to run, seeing as the input suggested by the download site does 
not get it to run.  Maybe I should rephrase my question : how do I install 
LXMLTreeBuilder/lxml, and how do I download and install HTMLParserTreeBuilder 
and LXMLTreeBuilderForXML to my Python2.7, please ? I can post the Traceback 
but all it says is that it doesn't recognise any input with 'html5lib' in it. I 
will post the console response if it is important, but I can't see how it is 
relevant to my request - which is how do I get these 'treebuilder/ parsers' to 
install and run.

Installing Parsers/Tree Builders to, and accessing these packages from Python2.7

2014-11-01 Thread Simon Evans

Hi Programmers, 
I have downloaded, installed, and can access the LXMLTreeBuilder/lxml, from 
Python2.7. 
however I have also downloaded HTMLTreeBuilder/html5lib but cannot get console 
to recognize the download, even using the code the download site suggests. I 
did put it in the Python2.7 directory, but unlike the HTML one, it doesn't 
recognize it, so the import statement returns an error. 
Can anyone tell me how I might proceed so's these TreeBuilders/ Parsers will 
work on my Python console ? 
I also will have to install HTMLParserTreeBuilder/html.parser and 
LXMLTreeBuilderForXML/lxml but best to cross that bridge when gotten to, as 
they say. 
Thank you for reading.I look forward to hearing from you.
Yours 
Simon Evans

Code to Python 27 prompt to access a html file stored on C drive

2014-08-14 Thread Simon Evans

Dear Programmers,  I want to access a html file on my C drive, in the 

Python 27 prompt, all the examples I come across seem to require for 

access for the html file be on a server, rather than on the same 

computer's C drive. I want to do this as a prerequisite to writing 

webscraping code,  surmising that if I can get the Python 27 

prompt (inclusive of 'Beautiful Soup', 'Urllib', 'Requests' downloads) to 

output pertinent html code from a html document, then I can proceed to use 

similar code to output html code from URL addresses, such as 

'RacingPost.com', 'SportingLife.com', 'Oddschecker.com' and 

'Bestbetting.com' which is what I am interested in working on.

Hope you can help. 

Yours Simon Evans.
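
Reading an HTML file from the local drive needs no server: the built-in `open()` is enough, and the text it returns is what a parser would be given. A hedged sketch, assuming Python 3; the folder and the file name `example.html` are temporary stand-ins rather than real locations on a C drive. It also shows why Windows paths are safer written as raw strings:

```python
import os
import tempfile

# Backslashes in ordinary string literals can start escape sequences.
plain = "C:\new\table.html"   # \n and \t silently become a newline and a tab
raw = r"C:\new\table.html"    # a raw string keeps every backslash as typed
print("\n" in plain, "\n" in raw)

# Create a small HTML file in a temporary folder (stand-in for a folder on C:).
folder = tempfile.mkdtemp()
path = os.path.join(folder, "example.html")   # made-up file name
with open(path, "w") as handle:
    handle.write("<html><body><p>hello</p></body></html>")

# Read it back: this string is what a parser should receive.
with open(path, "r") as handle:
    html_text = handle.read()
print(html_text)

# Tidy up the temporary file and folder.
os.remove(path)
os.rmdir(folder)
```

The same `with open(...) as handle:` pattern applies to a real path such as one under `C:\`, written as a raw string or with forward slashes.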

Re: Suitable Python code to scrape specific details from web pages.

2014-08-12 Thread Simon Evans

On Tuesday, August 12, 2014 9:00:30 PM UTC+1, Simon Evans wrote:
> Dear Programmers,
> 
> I have been looking at the You tube 'Web Scraping Tutorials' of Chris Reeves. 
> I have tried a few of his python programs in the Python27 command prompt, but 
> altered them from accessing data using links say from the Dow Jones index, to 
> accessing the details I would be interested in accessing from the 'Racing 
> Post' on a daily basis. Anyhow, what the code returns in the example I 
> am going to give is not the information I am seeking: instead of returning 
> the given odds on a horse, it only returns a [], which isn't much use. 
> 
> I would be glad if you could tell me where I am going wrong. 
> 
> Yours faithfully
> 
> Simon Evans.
> 
> 
> 
> >>>import urllib
> 
> >>>import re
> 
> >>>htmlfile = urllib.urlopen("http://www.racingpost.com/horses2/cards/card.sd?
> 
> 
> 
> race_id=600048r_date=2014-05-08#raceTabs=sc_")
> 
> htmltext = htmlfile.read()
> 
> regex = '1http://www.racingpost.com/horses/horse_home.sd?
> 
> 
> 
> horse_id=758752"onclick="scorecards.send("horse_name":):return 
> Html.popup(this,
> 
> 
> 
> {width:695,height:800})"title="Full details about this HORSE">Lively 
> 
> 
> 
> Baron9/4F'
> 
> >>>pattern = re.compile(regex)
> 
> >>>odds=re.findall(pattern,htmltext)
> 
> >>>print odds
> 
> []
> 
> >>>
> 
> 
> 
> >>>import urllib
> 
> >>>import re
> 
> >>>htmlfile = urllib.urlopen("http://www.racingpost.com/horses2/cards/card.sd?
> 
> 
> 
> >>>race_id=600048r_date=2014-05-08#raceTabs=sc_")
> 
> >>>htmltext = htmlfile.read()
> 
> >>>regex = ''
> 
> >>>pattern = re.compile(regex)
> 
> >>>odds=re.findall(pattern,htmltext)
> 
> >>>print odds
> 
> []
> 
> >>>
> 
> ---
Dear Programmers, Thank you for your responses. I have installed 'Beautiful 
Soup' and I have the 'Getting Started in Beautiful Soup' book, but can't seem 
to make  any progress with it, I am too thick to make much use of it. I was 
hoping I could scrape specified stuff off Web pages without using it. I have 
installed 'Requests' also, is there any code I can use that you can suggest 
that can access the sort of Web page values that I have referred to, such as 
odds, names of runners, stuff like that, from the 'inspect element' or 'source' 
html pages on www.Racingpost.com? 
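
One way to pull named values out of a page without Beautiful Soup is the 
standard library's html.parser module. What follows is only a minimal sketch: 
the markup, the class names 'runner' and 'odds', the RunnerParser class and the 
second horse are all invented for illustration, and the real Racing Post pages 
will be structured differently. It is written for Python 3 (on Python 2.7 the 
module is called HTMLParser).

```python
from html.parser import HTMLParser


class RunnerParser(HTMLParser):
    """Collect the text of tags whose class attribute is one of `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.current = None   # class of the tag we are inside, if wanted
        self.found = []       # (class name, text) pairs in page order

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls in self.wanted:
            self.current = cls

    def handle_data(self, data):
        # Only keep non-blank text that sits directly inside a wanted tag.
        if self.current and data.strip():
            self.found.append((self.current, data.strip()))
            self.current = None

    def handle_endtag(self, tag):
        self.current = None


# Hypothetical markup; with Requests installed, requests.get(url).text
# could supply the real page text instead.
SAMPLE = """
<table>
  <tr><td class="runner">Lively Baron</td><td class="odds">9/4F</td></tr>
  <tr><td class="runner">Second Horse</td><td class="odds">5/1</td></tr>
</table>
"""

parser = RunnerParser(["runner", "odds"])
parser.feed(SAMPLE)
results = parser.found
print(results)
```

If the values on the live page are filled in by JavaScript after loading, they 
will not be in the downloaded text at all, and no parser will find them there.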

Suitable Python code to scrape specific details from web pages.

2014-08-12 Thread Simon Evans

Dear Programmers,
I have been looking at the YouTube 'Web Scraping Tutorials' of Chris Reeves. I 
have tried a few of his Python programs in the Python27 command prompt, but 
altered them from accessing data using links, say from the Dow Jones index, to 
accessing the details I would be interested in accessing from the 'Racing Post' 
on a daily basis. However, in the example I am going to give, what the code 
returns is not the information I am seeking: instead of returning the given 
odds on a horse, it only returns a [], which isn't much use. 
I would be glad if you could tell me where I am going wrong. 
Yours faithfully
Simon Evans.

>>> import urllib
>>> import re
>>> htmlfile = urllib.urlopen("http://www.racingpost.com/horses2/cards/card.sd?race_id=600048r_date=2014-05-08#raceTabs=sc_")
>>> htmltext = htmlfile.read()
>>> regex = '1http://www.racingpost.com/horses/horse_home.sd?horse_id=758752"onclick="scorecards.send("horse_name":):return Html.popup(this,{width:695,height:800})"title="Full details about this HORSE">Lively Baron9/4F'
>>> pattern = re.compile(regex)
>>> odds=re.findall(pattern,htmltext)
>>> print odds
[]
>>>

>>> import urllib
>>> import re
>>> htmlfile = urllib.urlopen("http://www.racingpost.com/horses2/cards/card.sd?race_id=600048r_date=2014-05-08#raceTabs=sc_")
>>> htmltext = htmlfile.read()
>>> regex = ''
>>> pattern = re.compile(regex)
>>> odds=re.findall(pattern,htmltext)
>>> print odds
[]
>>>
---
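
One likely reason for the empty list (an assumption, since the page itself is 
not shown) is that a chunk of page source pasted in as the pattern is read as a 
regular expression, where characters such as ?, ( and { are operators rather 
than literal text. The sketch below uses an invented line of markup to show the 
difference between a raw chunk, the same chunk passed through re.escape, and a 
short pattern that captures only the name and the odds.

```python
import re

# Hypothetical markup, invented for illustration; the real page will differ.
html = ('<a href="http://example.com/horse_home.sd?horse_id=1" '
        'title="Full details about this HORSE">Lively Baron</a> '
        '<span class="odds">9/4F</span>')

chunk = 'horse_home.sd?horse_id=1'

# Used raw, "d?" means "an optional d", so the literal "?" never matches.
raw_hits = re.findall(chunk, html)

# re.escape() turns every special character into a literal one.
escaped_hits = re.findall(re.escape(chunk), html)

# A narrower pattern with two capture groups: horse name and odds.
pairs = re.findall(
    r'HORSE">([^<]+)</a>\s*<span class="odds">([^<]+)</span>', html)

print(raw_hits)
print(escaped_hits)
print(pairs)
```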

Re: How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-15 Thread Simon Evans

Dear Programmers, I noticed a couple of typos in my previous message, so have 
now altered them thus :- 

Dear Programmers,
As anticipated, it has not been too long before I have encountered further 
difficulty. At the top of page 16 of 'Getting Started with Beautiful Soup' it 
gives code to be input, whether to the Python or Windows command prompt I am 
not sure, but both seem to be resistant to it. I quote the response to the code 
below, the code input being :- 

helloworld = "Hello World"
soup_string = BeautifulSoup(helloworld)

to Windows Command prompt this gives :- 
--
SyntaxError: invalid syntax
>>> helloworld = "HelloWorld"
>>> soup_string = BeautifulSoup(helloworld)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'BeautifulSoup' is not defined
--
I have been told by one of the programmers that I ought to be inputting this to 
the Python command prompt (the book doesn't specify), but that doesn't take 
either, re:-
--
>>> helloworld = "HelloWorld"
>>> soup_string = BeautifulSoup(helloworld)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'BeautifulSoup' is not defined
>>>
--
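
A NameError of this kind generally means the name has not been imported in the 
current Python session; the book's later snippet does start with 
'from bs4 import BeautifulSoup'. The following is a minimal sketch, assuming 
Beautiful Soup 4 is installed as the bs4 package. The parse_hello helper is 
hypothetical, and the try/except is only there so the example still runs where 
bs4 is missing.

```python
try:
    # Without this import, BeautifulSoup(...) raises NameError.
    from bs4 import BeautifulSoup
except ImportError:
    BeautifulSoup = None  # bs4 is not installed for this interpreter


def parse_hello(markup="Hello World"):
    """Return the text Beautiful Soup extracts, or None if bs4 is missing."""
    if BeautifulSoup is None:
        return None
    # Naming the parser explicitly avoids depending on lxml being installed.
    soup_string = BeautifulSoup(markup, "html.parser")
    return soup_string.get_text()


text = parse_hello()
print(text)
```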
Looking at the bottom of page 16, there is more code to be input, which again 
does not take in either the Windows Command Prompt or the Python command prompt, 
re:

import urllib2
from bs4 import BeautifulSoup
url = "http://www.packtpub.com/books"
page = urllib2.urlopen(url)
soup_packtpage = BeautifulSoup(page)

returns to the Windows Command prompt:- 
--
>>>import urllib2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named 'urllib2'
>>>

--
returns to the Python command prompt :- 
--
>>> import urllib2
>>> from bs4 import BeautifulSoup
>>> url = "http://www.packtpub.com/books"
>>> page = urllib2.urlopen(url)
Traceback (most recent call last):
  File "C:\Python27\lib\urllib2.py", line 127, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 410, in open
    response = meth(req, response)
  File "C:\Python27\lib\urllib2.py", line 523, in http_response
    'http', request, response, code, msg, hdrs)
  File "C:\Python27\lib\urllib2.py", line 448, in error
    return self._call_chain(*args)
  File "C:\Python27\lib\urllib2.py", line 382, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 531, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 403: Forbidden
-
Anyway, I hope you can tell me what is amiss; there is no point in my proceeding 
with the book (about 111 pages all told) until I find out why it won't take. 
I realise I have been told to learn Python in order to make things less 
painful, but I don't see why code written in the book does not take. 
Thank you for reading.
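
HTTP Error 403 means the request reached the server and the server refused it, 
so the Python code itself ran. Some sites refuse the default user agent that 
Python sends; whether that is the cause here is an assumption. The sketch below 
(Python 3, where urllib2 became urllib.request) builds a request with a 
browser-like User-Agent header. The build_request helper and the user agent 
string are made up for illustration, and the actual fetch is left commented out.

```python
import urllib.request

URL = "http://www.packtpub.com/books"  # the address used in the book


def build_request(url, user_agent="Mozilla/5.0 (compatible; example-script)"):
    """Return a Request object that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = build_request(URL)
print(req.full_url)
print(req.get_header("User-agent"))  # Request stores the key capitalised this way

# To actually fetch the page (needs network access, and may still be refused):
# page = urllib.request.urlopen(req)
```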








 



Re: How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-15 Thread Simon Evans



I thought I might as well include, so's you might be able to see where things 
are going astray. The Windows command prompt :- 




Re: How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-15 Thread Simon Evans

Dear Programmers,
I downloaded PeaZip, which doesn't remove the file/folder hierarchy. I unzipped 
the download with it, input the same command to the console, and it installed 
Beautiful Soup 4 okay, re:- 
-
Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Intel Atom>cd "c:\Beautiful Soup"

c:\Beautiful Soup>c:\Python27\python setup.py install
running install
running build
running build_py
creating build
creating build\lib
creating build\lib\bs4
copying bs4\dammit.py -> build\lib\bs4
copying bs4\element.py -> build\lib\bs4
copying bs4\testing.py -> build\lib\bs4
copying bs4\__init__.py -> build\lib\bs4
creating build\lib\bs4\builder
copying bs4\builder\_html5lib.py -> build\lib\bs4\builder
copying bs4\builder\_htmlparser.py -> build\lib\bs4\builder
copying bs4\builder\_lxml.py -> build\lib\bs4\builder
copying bs4\builder\__init__.py -> build\lib\bs4\builder
creating build\lib\bs4\tests
copying bs4\tests\test_builder_registry.py -> build\lib\bs4\tests
copying bs4\tests\test_docs.py -> build\lib\bs4\tests
copying bs4\tests\test_html5lib.py -> build\lib\bs4\tests
copying bs4\tests\test_htmlparser.py -> build\lib\bs4\tests
copying bs4\tests\test_lxml.py -> build\lib\bs4\tests
copying bs4\tests\test_soup.py -> build\lib\bs4\tests
copying bs4\tests\test_tree.py -> build\lib\bs4\tests
copying bs4\tests\__init__.py -> build\lib\bs4\tests
running install_lib
creating c:\Python27\Lib\site-packages\bs4
creating c:\Python27\Lib\site-packages\bs4\builder
copying build\lib\bs4\builder\_html5lib.py -> c:\Python27\Lib\site-packages\bs4\builder
copying build\lib\bs4\builder\_htmlparser.py -> c:\Python27\Lib\site-packages\bs4\builder
copying build\lib\bs4\builder\_lxml.py -> c:\Python27\Lib\site-packages\bs4\builder
copying build\lib\bs4\builder\__init__.py -> c:\Python27\Lib\site-packages\bs4\builder
copying build\lib\bs4\dammit.py -> c:\Python27\Lib\site-packages\bs4
copying build\lib\bs4\element.py -> c:\Python27\Lib\site-packages\bs4
copying build\lib\bs4\testing.py -> c:\Python27\Lib\site-packages\bs4
creating c:\Python27\Lib\site-packages\bs4\tests
copying build\lib\bs4\tests\test_builder_registry.py -> c:\Python27\Lib\site-packages\bs4\tests
copying build\lib\bs4\tests\test_docs.py -> c:\Python27\Lib\site-packages\bs4\tests
copying build\lib\bs4\tests\test_html5lib.py -> c:\Python27\Lib\site-packages\bs4\tests
copying build\lib\bs4\tests\test_htmlparser.py -> c:\Python27\Lib\site-packages\bs4\tests
copying build\lib\bs4\tests\test_lxml.py -> c:\Python27\Lib\site-packages\bs4\tests
copying build\lib\bs4\tests\test_soup.py -> c:\Python27\Lib\site-packages\bs4\tests
copying build\lib\bs4\tests\test_tree.py -> c:\Python27\Lib\site-packages\bs4\tests
copying build\lib\bs4\tests\__init__.py -> c:\Python27\Lib\site-packages\bs4\tests
copying build\lib\bs4\__init__.py -> c:\Python27\Lib\site-packages\bs4
byte-compiling c:\Python27\Lib\site-packages\bs4\builder\_html5lib.py to _html5lib.pyc
byte-compiling c:\Python27\Lib\site-packages\bs4\builder\_htmlparser.py to _htmlparser.pyc
byte-compiling c:\Python27\Lib\site-packages\bs4\builder\_lxml.py to _lxml.pyc
byte-compiling c:\Python27\Lib\site-packages\bs4\builder\__init__.py to __init__.pyc
byte-compiling c:\Python27\Lib\site-packages\bs4\dammit.py to dammit.pyc
byte-compiling c:\Python27\Lib\site-packages\bs4\element.py to element.pyc
byte-compiling c:\Python27\Lib\site-packages\bs4\testing.py to testing.pyc
byte-compiling c:\Python27\Lib\site-packages\bs4\tests\test_builder_registry.py to test_builder_registry.pyc
byte-compiling c:\Python27\Lib\site-packages\bs4\tests\test_docs.py to test_docs.pyc
byte-compiling c:\Python27\Lib\site-packages\bs4\tests\test_html5lib.py to test_html5lib.pyc
byte-compiling c:\Python27\Lib\site-packages\bs4\tests\test_htmlparser.py to test_htmlparser.pyc
byte-compiling c:\Python27\Lib\site-packages\bs4\tests\test_lxml.py to test_lxml.pyc
byte-compiling c:\Python27\Lib\site-packages\bs4\tests\test_soup.py to test_soup.pyc
byte-compiling c:\Python27\Lib\site-packages\bs4\tests\test_tree.py to test_tree.pyc
byte-compiling c:\Python27\Lib\site-packages\bs4\tests\__init__.py to __init__.pyc
byte-compiling c:\Python27\Lib\site-packages\bs4\__init__.py to __init__.pyc
running install_egg_info
Writing c:\Python27\Lib\site-packages\beautifulsoup4-4.1.0-py2.7.egg-info

c:\Beautiful Soup>

Thank you for your thoughtful help, I am sure I will be needing more though, in 
the not too distant future. 

Re: How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-14 Thread Simon Evans

I have input the above code by copying and pasting it into the IDLE Python 
console, as the Python 2.7 command prompt is fussy about the indentation on the 
eleventh line down: if I indent it, it replies that the indentation is 
unnecessary or unexpected, and if I don't, it says an indentation is expected. 
However, when I get to the next lines of code in the IDLE prompt, re:

C:\Users\Intel Atom>cd "c:\Beautiful Soup"

c:\Beautiful Soup>c:\Python27\python setup.py install

Again it does not recognise 'bs4'. I think having used 'Just unzip it' instead 
of 'WinZip' may have caused this problem in the first place, as when I looked 
at the WinZip version at a local net café, it did have a folder hierarchy. 
However, I wanted, and still want, to skimp the £25 fee for WinZip, which 
nowadays you don't seem to be able to do. I never asked for the darn files to 
be zipped, so why ought I pay to have them unzipped, is my contention.

Re: How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-14 Thread Simon Evans

I downloaded the get-pip.py file and saved it to the same folder on my C drive 
that the Beautiful Soup 4 download was unzipped to. I changed directory to that 
folder in the Command Prompt, as you instructed in step 2. I input the command 
you gave in step 3, which returned some output, as quoted below. I then input 
the command you gave in step 4, but Console seems to reject or not recognise 
'pip' as a term. I am sure quoting the actual prompt response can explain 
things better than I:
---
Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.


C:\Users\Intel Atom>cd "c:\Beautiful Soup"

c:\Beautiful Soup>python get-pip.py
Downloading/unpacking pip from https://pypi.python.org/packages/py2.py3/p/pip/pi
p-1.5.5-py2.py3-none-any.whl#md5=03a932d6f82a3887d8de1cdb837c87ed
Installing collected packages: pip
  Found existing installation: pip 1.5.4
Uninstalling pip:
  Successfully uninstalled pip
Successfully installed pip
Cleaning up...

c:\Beautiful Soup>pip install beautifulsoup4
'pip' is not recognized as an internal or external command,
operable program or batch file.

c:\Beautiful Soup>

Perhaps I oughtn't have downloaded the pip file to the same directory as the 
Beautiful Soup ? I will have a try at transferring the file to another folder
and running the code you gave again. 
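
The log above shows pip installing successfully, so the 'not recognized' 
message most likely means the folder holding pip.exe (typically the Scripts 
folder inside the Python installation) is not on PATH. That is an assumption; 
the output does not confirm it. Running pip as a module through the interpreter 
avoids depending on PATH, which at the command prompt would be 
'python -m pip install beautifulsoup4'. The sketch below builds the same 
command from inside Python; pip_install_command is a hypothetical helper, and 
the line that would actually run the install is left commented out.

```python
import sys


def pip_install_command(package):
    """Build a pip command that uses the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", package]


cmd = pip_install_command("beautifulsoup4")
print(" ".join(cmd))

# To carry out the install rather than just print the command:
# import subprocess
# subprocess.check_call(cmd)
```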

Re: How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-13 Thread Simon Evans

Dear Ian,  and other programmers, thank you for your advice. 
I am resending the last message because this twattish cut and paste facility on 
my computer has a knack of chopping off one's original message. I will try to 
convey the right message this time:  

I have removed the original Beautiful Soup 4 download, that I had unzipped to 
my Beautiful Soup directory on the C drive. 
I downloaded the latest version of Beautiful Soup 4 from the Crummy site. 
I unzipped it, and removed the contents of the unzipped directory and placed 
contents in my Beautiful Soup directory, and again had the same output to my 
console re: 

 

Microsoft Windows [Version 6.1.7601] 
Copyright (c) 2009 Microsoft Corporation.  All rights reserved. 

C:\Users\Intel Atom>cd "c:\Beautiful Soup" 

c:\Beautiful Soup>c:\Python27\python setup.py install 

running install
running build
running build_py
error: package directory 'bs4' does not exist


c:\Beautiful Soup> 
--- 
I have made a note of all the contents of the downloaded and unzipped BS4, i.e. 
the contents of my Beautiful Soup folder on the C drive, which is as follows: 
--- 
init 
_html5lib 
_htmlparser 
_lxml 
6.1 
AUTHORS 
conf 
COPYING 
dammit 
demonstration_markup 
element 
index.rst 
Makefile 
NEWS 
PKG-INFO 
README 
setup 
test_builder_registry 
test_docs 
test_html5lib 
test_htmlparser 
test_lxml 
test_soup 
test_tree 
testing 
TODO 

 
I can see no bs4 folder within the contents. 
 I can not see any setup.py file either, but this is how I downloaded it. 
I am only following instructions as suggested. 
I do not understand why it is not working. 
I hope someone can direct me in the right direction, as I seem to be stuck, and 
I don't think it has much bearing on my fluency or lack of it with Python. 
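
The listing above has every file at one level, which suggests (as an 
assumption) that the unzip tool flattened the archive's folders, so setup.py 
cannot find a bs4 package directory. Python's own tarfile module extracts a 
tarball with its hierarchy intact. The sketch below demonstrates this on a 
throwaway archive it builds itself; extract_keeping_folders and the 'demo' 
names are invented for illustration, and for the real download the archive 
path would be the downloaded .tar.gz file.

```python
import os
import tarfile
import tempfile


def extract_keeping_folders(archive_path, dest_dir):
    """Extract a tarball into dest_dir, preserving its folder hierarchy."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path) as archive:
        archive.extractall(dest_dir)
    return dest_dir


# Build a tiny stand-in archive containing demo/bs4/__init__.py.
work = tempfile.mkdtemp()
package_dir = os.path.join(work, "src", "bs4")
os.makedirs(package_dir)
with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
    handle.write("")
tar_path = os.path.join(work, "demo.tar.gz")
with tarfile.open(tar_path, "w:gz") as archive:
    archive.add(os.path.join(work, "src"), arcname="demo")

out = extract_keeping_folders(tar_path, os.path.join(work, "out"))
has_bs4_folder = os.path.isdir(os.path.join(out, "demo", "bs4"))
print(has_bs4_folder)
```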

Re: How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-13 Thread Simon Evans

I have removed the original Beautiful Soup 4 download, that I had unzipped to 
my Beautiful Soup directory on the C drive. 
I downloaded the latest version of Beautiful Soup 4 from the Crummy site. 
I unzipped it, and removed the contents of the unzipped directory and placed 
contents in my Beautiful Soup directory, and again had the same output to my 
console re: 

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Intel Atom>cd "c:\Beautiful Soup"

c:\Beautiful Soup>c:\Python27\python setup.py install

c:\Beautiful Soup>
---
I have made a note of all the contents of the downloaded and unzipped BS4, i.e. 
the contents of my Beautiful Soup folder on the C drive, which is as follows:
---
init
_html5lib
_htmlparser
_lxml
6.1
AUTHORS
conf
COPYING
dammit
demonstration_markup
element
index.rst
Makefile
NEWS
PKG-INFO
README
setup
test_builder_registry
test_docs
test_html5lib
test_htmlparser
test_lxml
test_soup
test_tree
testing
TODO

I can see no bs4 folder within the contents.
 I can not see any setup.py file either, but this is how I downloaded it.
I am only following instructions as suggested.
I do not understand why it is not working.
I hope someone can direct me in the right direction, as I seem to be stuck, and 
I don't think it has much bearing on my fluency or lack of it with Python. 

Re: How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-12 Thread Simon Evans

I did download the latest version of Beautiful Soup 4 from the download site, 
as the book suggested. 

Re: How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-12 Thread Simon Evans

Dear Ian, 
The book does recommend to use Python 2.7 (see bottom line of page 10).
The book also recommends to use Beautiful Soup 4. 
You are right in that I had placed the unzipped BS4 folder within a 
folder, and I have therefore removed the contents of the inner folder and 
transferred them to the outer folder. 
The console now can access the contents of the Beautiful Soup folder, but it is 
still having problems with it as the last output to my console demonstrates :


Microsoft Windows [Version 6.1.7601]

Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Intel Atom>cd "c:\Beautiful Soup"

c:\Beautiful Soup>c:\Python27\python setup.py install
running install
running build
running build_py
error: package directory 'bs4' does not exist

c:\Beautiful Soup>

Re: How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-12 Thread Simon Evans

Thank you for your advice. I did buy a book on Python, 'Hello Python', but the 
code in it wouldn't run, so I returned it to the shop for a refund. I am going 
to visit the local library to see if they have any books on Python. I am 
familiar with Java and Pascal, and looking at a few YouTube videos on the 
subject, thought it was not much different, and shares many of the OOP concepts 
(variables, initializing, expressions, methods, and so on), but I realize there 
is no point in walking backwards in new territory.

Re: How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-12 Thread Simon Evans

The version of Python the book seems to be referring to is 2.7, re: bottom of 
page 10-
'Pick the Path variable and add the following section to the Path variable: 
;C:\PythonXY for example C:\Python 27'

The version of Beautiful Soup seems to be Beautiful Soup 4 as at the top of 
page 12 it states:
'1.Download the latest tarball from 
https://pypi.python.org/packages/source/b/beautifulsoup4/.'

I have downloaded and unzipped to a folder called 'Beautiful Soup' on the C 
drive the Beautiful Soup 4 version. I am using the Python 2.7 console and IDLE, 
I have removed the 3.4 version. 

All the same I seem to be having difficulties again as console won't accept the 
code it did when it was the previous version of BS that I used yesterday. I 
realise I would not be having this problem if I proceeded to input the 'Hello 
World' code on the Python console, but as said, the text never specifically 
said 'change to Python 2.7 console'. I thought the problem was with the BS 
version and so changed it, but now can't even get as far as I had before 
changing it. Anyhow be that as it may, this is the console response to my input:

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Intel Atom>cd "c:\Beautiful Soup"

c:\Beautiful Soup>Beautiful Soup>c:\Python27\python setup.py install
'Beautiful' is not recognized as an internal or external command,
operable program or batch file.

c:\Beautiful Soup>

Re: How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-12 Thread Simon Evans

Hi Ian, thank you for your help. 
Yes that is the book by Vineeth J Nair.
At the top of page 12, at step 1 it says :

1.Download the latest tarball from 
https://pypi.python.org/packages/source/b/beautifulsoup4/.

So yes, the version the book is dealing with is beautiful soup 4. 
I am using Python 2.7, I have removed Python 3.4.
Also on the bottom of page 10, Mr Nair states:

Pick the path variable and add the following section to the Path variable:

;C:\PythonXY for example C:\Python27

Which tells me that the Python version cited in the book must be 2.7

I downloaded beautiful soup 4 last night. I unzipped it with 'Just unzip it' to 
a folder I called Beautiful Soup, the same as I did with the previous beautiful 
soup download. The console return is as below, showing that I am now facing the 
same conundrum as yesterday, before changing my version of Beautiful Soup. re:

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Intel Atom>cd "c:\Beautiful Soup"

c:\Beautiful Soup>Beautiful Soup>c:\Python27\python setup.py install
'Beautiful' is not recognized as an internal or external command,
operable program or batch file.

c:\Beautiful Soup>



Re: How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-11 Thread Simon Evans

- but wait a moment, 'BeautifulSoup4 works with 2.6+ and 3.x' (Terry Reedy) - 
doesn't 2.6+ include 2.7, which is what I'm using with BeautifulSoup4?  

Re: How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-11 Thread Simon Evans

Oh I think I see - I should be using Python 3.4 now, with BS4 ?  


Re: How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-11 Thread Simon Evans

Yeah well at no point does the book say to start inputting the code mentioned 
in Python command prompt rather than the Windows command prompt, but thank you 
for your guidance anyway. 
I have downloaded the latest version of Beautiful Soup 4, but am again facing 
problems with the second line of code, re:-
--- 
 Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Intel Atom>cd "c:\Beautiful Soup"

c:\Beautiful Soup>c:\Python27\python setup.py install
c:\Python27\python: can't open file 'setup.py': [Errno 2] No such file or directory

(though that was the code I used before, which installed okay; see above). Can 
anyone tell me where I am going wrong ? Thanks.  

Re: How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-11 Thread Simon Evans

I have downloaded Beautiful Soup 3, I am using Python 2.7. I understand from 
your message that I ought to use Python 2.6 or Python 3.4 with Beautiful Soup 
4, the book I am using 'Getting Started with Beautiful Soup' is for Beautiful 
Soup 4. Therefore I gather I must re-download Beautiful Soup and get the 4 
version, dispose of my Python 2.7 and reinstall Python 3.4. I am sure I can do 
this, but doesn't the above information suggest that the only Python grade left 
that might work with Beautiful Soup 3 would be Python 2.7 - which is the 
configuration I have at present, though I am not perfectly happy, as it is not 
taking code in the book (meant for BS4) such as the following on page 16 :

helloworld = "Hello World"

re:- 

c:\Beautiful Soup>helloworld = "Hello World"
'helloworld' is not recognized as an internal or external command,
operable program or batch file.

I take it that this response is due to using code meant for BS4 with Python 
2.6/ 3.4, rather than BS3 with Python 2.7 which is what I am currently using. 
If so I will change the configurations.

Re: How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-11 Thread Simon Evans

Dear Chris Angelico,
Yes, you are right, I did install Python 3.4 as well as 2.7. I have removed 
Python 3.4, and input the code you suggested and it looks like it has installed 
properly, returning the following code:- 

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Intel Atom>cd "c:\Beautiful Soup"

c:\Beautiful Soup>c:\Python27\python setup.py install
running install
running build
running build_py
creating build
creating build\lib
copying BeautifulSoup.py -> build\lib
copying BeautifulSoupTests.py -> build\lib
running install_lib
copying build\lib\BeautifulSoup.py -> c:\Python27\Lib\site-packages
copying build\lib\BeautifulSoupTests.py -> c:\Python27\Lib\site-packages
byte-compiling c:\Python27\Lib\site-packages\BeautifulSoup.py to BeautifulSoup.pyc
byte-compiling c:\Python27\Lib\site-packages\BeautifulSoupTests.py to BeautifulSoupTests.pyc
running install_egg_info
Writing c:\Python27\Lib\site-packages\BeautifulSoup-3.2.1-py2.7.egg-info

c:\Beautiful Soup>

Would that things were as straightforward as they are in the books, but anyway 
thank you much for your assistance, I'd still be typing the zillionth variation 
on the first line without your help. I don't doubt though that I will be coming 
unstuck in the not distant future. Until then, again thank you for your 
selfless help.

Re: How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-11 Thread Simon Evans


Thank you everyone who replied, for your help. Using the command prompt 
console, it accepts the first line of code, but doesn't seem to accept the 
second line. I have altered it a little, but it is not having any of it, I 
quote my console input and output here, as it can probably explain things 
better than I :- 


Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Intel Atom>cd"c:\Beautiful Soup"
The filename, directory name, or volume label syntax is incorrect.

C:\Users\Intel Atom>cd "c:\Beautiful Soup"

c:\Beautiful Soup>python setup.py install.
  File "setup.py", line 22
print "Unit tests have failed!"
  ^
SyntaxError: invalid syntax

c:\Beautiful Soup>python setup.py install"
  File "setup.py", line 22
print "Unit tests have failed!"
  ^
SyntaxError: invalid syntax

c:\Beautiful Soup>

I have tried writing "python setup.py install" 
ie putting the statement in inverted commas, but the console still seems to 
reject it  re:- 

c:\Beautiful Soup>"python setup. py install"
'"python setup. py install"' is not recognized as an internal or external command,
operable program or batch file.

c:\Beautiful Soup>


Again I hope you python practitioners can help. I am only on page 12, and have 
another 99 pages to go, so can only hope it gets easier.
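
A SyntaxError on a line such as print "..." is what Python 3 reports for 
Python 2's print statement, so it seems likely (an assumption, as the console 
output does not say which interpreter ran) that the plain 'python' command 
here starts Python 3.4 while this setup.py was written for Python 2. The 
sketch below, meant to be run under Python 3, checks both spellings with the 
built-in compile(); the compiles helper is made up for illustration.

```python
def compiles(source):
    """Return True if the running interpreter accepts `source` as valid code."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


PY2_LINE = 'print "Unit tests have failed!"'    # Python 2 print statement
PY3_LINE = 'print("Unit tests have failed!")'   # print as a function call

py2_ok = compiles(PY2_LINE)
py3_ok = compiles(PY3_LINE)
print(py2_ok, py3_ok)
```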

How do I access 'Beautiful Soup' on python 2.7 or 3.4 , console or idle versions.

2014-05-10 Thread Simon Evans

I am new to Python, but my main interest is to use it to Webscrape. I have 
downloaded Beautiful Soup, and have followed the instruction in the 'Getting 
Started with Beautiful Soup' book, but my Python installations keep returning 
errors, so I can't get started. I have unzipped Beautiful Soup to a folder of 
the same name on my C drive, in accordance with the first two steps of page 12 
of the aforementioned publication, but proceeding to navigate to the program as 
in step three, re: "Open up the command line prompt and navigate to the folder 
where you have unzipped the folder as follows:
cd Beautiful Soup
python setup python install "

This returns on my Python 27 :
>>> cd Beautiful Soup
  File "<stdin>", line 1
cd Beautiful Soup
   ^
SyntaxError: invalid syntax
>>>

also I get:
>>> cd Beautiful Soup
SyntaxError: invalid syntax
>>>

to my IDLE Python 2.7 version, same goes for the Python 3.4 installations. 
Hope someone can help. 
Thanks in advance.
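
cd is a command for the Windows command prompt rather than Python, which is 
why the >>> prompt reports invalid syntax. Inside Python the nearest equivalent 
is os.chdir. The sketch below uses a temporary folder whose name contains a 
space, standing in for 'Beautiful Soup'; the folder is created only for the 
demonstration.

```python
import os
import tempfile

start = os.getcwd()

# A throwaway folder with a space in its name, like "Beautiful Soup".
folder = tempfile.mkdtemp(prefix="Beautiful Soup ")

os.chdir(folder)       # Python's counterpart to typing cd at the prompt
now = os.getcwd()
os.chdir(start)        # go back to where we started

# realpath() irons out symlinked temp directories before comparing.
changed = os.path.realpath(now) == os.path.realpath(folder)
print(changed)
```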