Hi Python Tutors,
I am currently able to strip the page down to the string I want. However, I
am having problems with the embedded JSON and I am not sure how to parse it
into a dictionary.
import urllib
import json
import requests
from bs4 import BeautifulSoup
url =
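Once you have the page text, the usual move is to cut out just the JSON fragment and hand it to json.loads, which returns a dictionary. A minimal sketch (the sample page and the variable name `quote` are assumptions, since the post is truncated before the URL):

```python
import json
import re

# Stand-in for the scraped response; the real page is not shown above.
html = '<script>var quote = {"symbol": "0018.HK", "price": 1.09};</script>'

# Cut out just the {...} literal, then let json.loads build the dict.
match = re.search(r'var quote = (\{.*?\});', html)
data = json.loads(match.group(1))

print(data["symbol"])  # 0018.HK
print(data["price"])   # 1.09
```

From here `data` is an ordinary dict, so no slicing of the raw string is needed.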
On 13/12/15 07:44, Crusier wrote:
> Dear All,
>
> I am trying to scrape the following website, however, I have
> encountered some problems. As you can see, I am not really familiar
> with regex and I hope you can give me some pointers on how to solve
> this problem.
I'm not sure why you mention
Hey Crusier (and others...),
For your site...
As Alan mentioned, it's a mix of html/jscript/etc.
So you're going to need (or perhaps should) to extract just the
json/struct that you need, and then go from there. I speak from
experience, as I've had to handle a number of sites that are
essentially
Dear All,
I am trying to scrape the following website, however, I have
encountered some problems. As you can see, I am not really familiar
with regex and I hope you can give me some pointers on how to solve
this problem.
I hope I can download all the transaction data into the database.
However, I
Hi
I am using Python 3.4 and trying to do some web scraping at the moment.
I got stuck because there is an IndexError: list index out of range if I
put stock_code = (18). My desired output is that the script is able to
detect and print out the recent price, whether it is up, down or unchanged.
On 12Oct2015 21:21, Crusier wrote:
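The IndexError itself just means the list being indexed came back empty for that stock code. A minimal sketch of the guard (the variable names are assumptions; the original script is not shown in full):

```python
# find_all()-style lookups return an empty list when nothing matches,
# and empty_list[0] then raises IndexError: list index out of range.
rows = []  # pretend the scrape matched nothing for this stock code

if rows:
    price = rows[0]
else:
    price = None  # skip or log this stock code instead of crashing

print(price)  # None
```

Checking `len(rows)` (or printing `rows` itself) is usually the fastest way to see whether the selector, not the index, is the real problem.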
Hi
I have recently finished reading "Starting out with Python" and I
really want to do some web scraping. Please kindly advise where I can
get more information about BeautifulSoup. It seems that the documentation
is too hard for me.
Furthermore, I have tried to scrape this site but it seems that
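For getting started, a minimal working example is often gentler than the full documentation. This sketch (the HTML is made up here, since the site the poster tried is not shown) covers the two most common tasks, grabbing text and links:

```python
from bs4 import BeautifulSoup  # the modern package name is bs4

# Made-up HTML standing in for a fetched page.
html = """
<html><body>
  <h1>Headline</h1>
  <a href="/page1">first</a>
  <a href="/page2">second</a>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

title = soup.h1.get_text()                       # text of the first <h1>
links = [a["href"] for a in soup.find_all("a")]  # all href attributes
print(title, links)
```

Everything else in Beautiful Soup is a variation on find/find_all plus attribute access like the above.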
On Tue, Sep 29, 2015 at 11:47 AM, Crusier wrote:
Crusier wrote:
If you tell us what you don't understand and
Hello, I have personally found this tutorial to be helpful. Check it out:
https://www.youtube.com/watch?v=3xQTJi2tqgk Thank you.
On Tuesday, September 29, 2015 12:05 PM, Joel Goldstick
wrote:
On Tue, Sep 29, 2015 at 11:47 AM, Crusier
There was a thread here a few months ago about problems with running the
unit tests for Beautiful Soup under Python 2.5. These problems have
apparently been fixed with a new release of Beautiful Soup.
http://www.crummy.com/software/BeautifulSoup/
Kent
Hi,
I am using Beautiful Soup to get links from an HTML document.
I found that Beautiful Soup changes the & in the links to &amp;, due to which
some of the links become unusable.
Is there any way I could stop this behaviour?
Regards,
Shitiz
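If a parser hands back hrefs with & escaped as &amp;, the standard library can reverse it. A small sketch in today's Python (the href here is made up):

```python
from html import unescape  # Python 3 standard library

# A link as it might come back entity-escaped from the parser.
href = "/search?q=python&amp;start=10"

# unescape turns &amp; (and other entities) back into literal characters.
print(unescape(href))  # /search?q=python&start=10
```

Running extracted links through unescape before requesting them restores the working URL.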
Hi,
I am using beautiful soup for extracting links from a web page.
Most pages use relative links in their pages, which is causing a problem. Is
there any library to extract complete links or do I have to parse this myself?
Thanks,
Shitiz
Terry Carroll [EMAIL PROTECTED] wrote: On Wed, 29 Nov
On 11/30/06, Shitiz Bansal [EMAIL PROTECTED] wrote:
Beautiful Soup can
* Akash [EMAIL PROTECTED] [061129 20:54]:
On 11/30/06, Shitiz Bansal [EMAIL PROTECTED] wrote:
Thanks, urlparse.urljoin did the trick.
Akash - the problem with directly prefixing the url to the link is that the url
most of the time contains not just the page address but also parameters and
fragments.
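For anyone finding this thread later: the same call lives at urllib.parse.urljoin in Python 3 (urlparse.urljoin is the Python 2 spelling used above). A quick sketch of why plain prefixing fails and urljoin does not (example URLs are made up):

```python
from urllib.parse import urljoin  # urlparse.urljoin in Python 2

# The base URL carries a query string and a fragment; naive prefixing
# would produce "http://example.com/results?page=2#top/item/42".
base = "http://example.com/results?page=2#top"

print(urljoin(base, "/item/42"))    # http://example.com/item/42
print(urljoin(base, "other.html"))  # http://example.com/other.html
```

urljoin applies the normal browser resolution rules, so absolute paths, relative paths, and full URLs all come out right.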
Kent Johnson wrote:
Is there a way to insert a node with Beautiful Soup?
BS doesn't really seem to be set up to support this. The Tags in a soup
are kept in a linked
What would be the appropriate technology to use?
I tried the xml modules, but they fail on the parsing of the html.
--
Bob
Bob Tanner wrote:
What would be the appropriate technology to use?
Fredrik Lundh's elementtidy uses the Tidy library to
Bob Tanner wrote:
What would be the appropriate technology to use?
You might also email the author of BS and see if he has
Bob Tanner wrote:
Is there a way to insert a node with Beautiful Soup?
BS doesn't really seem to be set up to support this. The Tags in a soup are
kept in a linked list by their next attribute, so you will have to find the
right Tag, break the
Is there a way to insert a node with Beautiful Soup?
I found a way to append things. But
<html>
<head>
<title>Blah</title>
</head>
<body>
<h1>Foo!</h1>
</body>
</html>

I'd like to insert a div tag, like this:

<html>
<head>
<title>Blah
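For readers hitting this thread now: the modern bs4 package (which postdates this 2006-era Beautiful Soup) supports insertion directly, via new_tag plus insert/insert_after. A sketch against the document above:

```python
from bs4 import BeautifulSoup  # modern bs4, not the 2006-era Beautiful Soup

soup = BeautifulSoup("<html><body><h1>Foo!</h1></body></html>", "html.parser")

div = soup.new_tag("div", id="content")  # build a detached <div id="content">
soup.h1.insert_after(div)                # splice it in right after the <h1>

print(soup.body)  # <body><h1>Foo!</h1><div id="content"></div></body>
```

insert_before, insert, and append work the same way, so the linked-list surgery described above is no longer needed.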
Oops, Paul is probably right. I thought urllib2 opened local
files in the absence of an identifier like http://. Bad assumption on my part. I remembered that
behavior from somewhere else, maybe urllib.
That path beginning with \\C:\\ could still bite you, however. Good luck,
Andrew
grouchy wrote:
Hi,
I'm having bang-my-head-against-a-wall moments trying to figure all of this
out.
import urllib
from BeautifulSoup import BeautifulSoup

file = urllib.urlopen("http://www.google.com/search?q=beautifulsoup")
file = file.read().decode("utf-8")
soup = BeautifulSoup(file)
results =
Here you go:

>>> import types
>>> print types.StringTypes
(<type 'str'>, <type 'unicode'>)
>>> import sys
>>> print sys.version
2.3.4 (#2, May 29 2004, 03:31:27)
[GCC 3.3.3 (Debian 20040417)]
>>> print type(u'hello') in types.StringTypes
True
>>> sys.getdefaultencoding()
'ascii'
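As a footnote for modern readers: types.StringTypes is Python 2 only. In Python 3 every str is unicode, and the equivalent check is just isinstance:

```python
# Python 3: str is unicode throughout; bytes is the separate byte-string type.
print(isinstance("hello", str))   # True
print(isinstance(b"hello", str))  # False
```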
[CCing Leonard Richardson:
Hi Danny,
If you have a moment, do you mind doing this on your system?
This is the first question in the BeautifulSoup FAQ at
http://www.crummy.com/software/BeautifulSoup/FAQ.html
Unfortunately the author of BS considers this a problem with your
Python installation! So it seems he doesn't have a good understanding
of Python and Unicode. (OK, I can forgive him that,
Hi,
I'm having bang-my-head-against-a-wall moments trying to figure all of this out.
A word of warning: this is the first time I've tried using Unicode or
Beautiful Soup, so if I'm being stupid, please forgive me. I'm trying
to scrape results from Google as a test case with Beautiful Soup.
On Thu, 25 Aug 2005, grouchy wrote:
file = urllib.urlopen("http://www.google.com/search?q=beautifulsoup")
file = file.read().decode("utf-8")
soup = BeautifulSoup(file)
results = soup('p', 'g')
x = results[1].a.renderContents()
type(x)
<type 'unicode'>
print x
Matt Croydon::Postneo 2.0 �
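The underlying issue in this thread was Python 2's default 'ascii' codec, which made printing non-ASCII unicode fragile. A sketch of the decode-once-at-the-boundary pattern in today's Python (the sample string is made up to include a non-ASCII character):

```python
# Bytes as they would come off the wire, UTF-8 encoded, including a
# non-ASCII en dash like the one that tripped up the print above.
raw = "Matt Croydon::Postneo 2.0 \u2013".encode("utf-8")

text = raw.decode("utf-8")  # decode exactly once, at the boundary
print(text)
```

Inside the program everything stays as str (unicode); bytes only appear at the I/O edges.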