date:20150822

Re: [Tutor] Problem using lxml

2015-08-22 Thread Anthony Papillion

Many thanks, Martin! I had indeed skipped creating the tree object and a
few other things you pointed out. Here is my finished simple code that
actually works:

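from lxml import html
import requests

# Fetch the search page and parse it into an element tree
page = requests.get("http://joplin.craigslist.org/search/w4m")
tree = html.fromstring(page.text)
# Select the text of every anchor whose class attribute is "hdrlnk"
titles = tree.xpath('//a[@class="hdrlnk"]/text()')
try:
    for title in titles:
        print title
except:
    pass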

Pretty simple. Thanks for the help!


On Sat, Aug 22, 2015 at 4:20 PM Martin A. Brown  wrote:

>
> Hi there Anthony,
>
> > I'm pretty new to lxml but I pretty much thought I'd understood
> > the basics. However, for some reason, my first attempt at using it
> > is failing miserably.
> >
> > Here's the deal:
> >
> > I'm parsing a specific page on Craigslist (
> > http://joplin.craigslist.org/search/rea) and trying to retrieve the
> > text of each link on that page. When I do an "inspect element" in
> > Firefox, a sample anchor link looks like this:
> >
> > FIRST
> > OPEN HOUSE TOMORROW 2:00pm-4:00pm!!! (8-23-15)
> >
> > The code I'm using to try to get the link text is this:
> >
> > from lxml import html
> > import requests
> >
> > page = requests.get("http://joplin.craigslist.org/search/rea")
>
> You are missing something here that takes the page.content, parses
> it and creates a variable called tree.
>
> > titles = tree.xpath('//a[@title="hdrlnk"]/text()')
>
> And, your xpath is incorrect.  Play with this in the interactive
> browser and you will be able to correct your xpath.  I think you
> will notice from the example anchor link above that the attribute of
> the HTML elements you want to grab is "class", not "title".
> Therefore:
>
>   titles = tree.xpath('//a[@class="hdrlnk"]/text()')
>
> Is probably closer.
>
> > print titles
> >
> > The last line, where it supposedly will print the text of each anchor
> > returns [].
> >
> > I can't seem to figure out what I'm doing wrong. lxml seems pretty
> > straightforward but I can't seem to get this down.
>
> Again, I'd recommend playing with the data in an interactive console
> session.  You will be able to figure out exactly which xpath gets
> you the data you would like, and then you can drop it into your
> script.
>
> Good luck,
>
> -Martin
>
> --
> Martin A. Brown
> http://linux-ip.net/
>
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
https://mail.python.org/mailman/listinfo/tutor

Re: [Tutor] Problem using lxml

2015-08-22 Thread Martin A. Brown



Hi there Anthony,

> I'm pretty new to lxml but I pretty much thought I'd understood
> the basics. However, for some reason, my first attempt at using it
> is failing miserably.
>
> Here's the deal:
>
> I'm parsing a specific page on Craigslist (
> http://joplin.craigslist.org/search/rea) and trying to retrieve the text of
> each link on that page. When I do an "inspect element" in Firefox, a sample
> anchor link looks like this:
>
> FIRST
> OPEN HOUSE TOMORROW 2:00pm-4:00pm!!! (8-23-15)
>
> The code I'm using to try to get the link text is this:
>
> from lxml import html
> import requests
>
> page = requests.get("http://joplin.craigslist.org/search/rea")

You are missing something here that takes the page.content, parses
it and creates a variable called tree.

> titles = tree.xpath('//a[@title="hdrlnk"]/text()')

And, your xpath is incorrect.  Play with this in the interactive
browser and you will be able to correct your xpath.  I think you
will notice from the example anchor link above that the attribute of
the HTML elements you want to grab is "class", not "title".
Therefore:

  titles = tree.xpath('//a[@class="hdrlnk"]/text()')

Is probably closer.

> print titles
>
> The last line, where it supposedly will print the text of each anchor
> returns [].
>
> I can't seem to figure out what I'm doing wrong. lxml seems pretty
> straightforward but I can't seem to get this down.

Again, I'd recommend playing with the data in an interactive console
session.  You will be able to figure out exactly which xpath gets
you the data you would like, and then you can drop it into your
script.


Good luck,

-Martin

--
Martin A. Brown
http://linux-ip.net/
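
The two points in this reply (build a tree first, then match on the class attribute rather than title) can be tried offline. The following is only a sketch: the HTML snippet, its href values and the variable names are made up for illustration and are not the real Craigslist markup.

```python
from lxml import html

# Hypothetical stand-in for the downloaded page; the real markup may differ.
SAMPLE = """
<html><body>
  <a href="/rea/1.html" class="hdrlnk">FIRST OPEN HOUSE TOMORROW</a>
  <a href="/rea/2.html" class="hdrlnk">Second listing</a>
  <a href="/about" title="about">About</a>
</body></html>
"""

# Step 1: parse the text into a tree (the step missing from the first attempt).
tree = html.fromstring(SAMPLE)

# Step 2: compare the two XPath expressions discussed in the thread.
by_title = tree.xpath('//a[@title="hdrlnk"]/text()')  # no anchor has title="hdrlnk"
by_class = tree.xpath('//a[@class="hdrlnk"]/text()')  # matches the listing links

print(by_title)
print(by_class)
```

Running both expressions side by side in an interactive session is one way to follow the advice about experimenting before dropping an xpath into a script.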

Re: [Tutor] Problem using lxml

2015-08-22 Thread Joel Goldstick

On Sat, Aug 22, 2015 at 5:05 PM, Anthony Papillion  wrote:
> Hello Everyone,
>
> I'm pretty new to lxml but I pretty much thought I'd understood the basics.
> However, for some reason, my first attempt at using it is failing miserably.
>
> Here's the deal:
>
> I'm parsing a specific page on Craigslist (
> http://joplin.craigslist.org/search/rea) and trying to retrieve the text of
> each link on that page. When I do an "inspect element" in Firefox, a sample
> anchor link looks like this:
>
> FIRST
> OPEN HOUSE TOMORROW 2:00pm-4:00pm!!! (8-23-15)
>
> The code I'm using to try to get the link text is this:
>
> from lxml import html
> import requests
>
> page = requests.get("http://joplin.craigslist.org/search/rea")
> titles = tree.xpath('//a[@title="hdrlnk"]/text()')
> print titles
>
> The last line, where it supposedly will print the text of each anchor
> returns [].
>
> I can't seem to figure out what I'm doing wrong. lxml seems pretty
> straightforward but I can't seem to get this down.
>
> Can anyone make any suggestions?
>
> Thanks!
> Anthony

Not an answer, but have you checked out Beautiful Soup?  It is a great
html parsing tool, with a good tutorial:
http://www.crummy.com/software/BeautifulSoup/bs4/doc/



-- 
Joel Goldstick
http://joelgoldstick.com

[Tutor] Problem using lxml

2015-08-22 Thread Anthony Papillion

Hello Everyone,

I'm pretty new to lxml but I pretty much thought I'd understood the basics.
However, for some reason, my first attempt at using it is failing miserably.

Here's the deal:

I'm parsing a specific page on Craigslist (
http://joplin.craigslist.org/search/rea) and trying to retrieve the text of
each link on that page. When I do an "inspect element" in Firefox, a sample
anchor link looks like this:

FIRST
OPEN HOUSE TOMORROW 2:00pm-4:00pm!!! (8-23-15)

The code I'm using to try to get the link text is this:

from lxml import html
import requests

page = requests.get("http://joplin.craigslist.org/search/rea")
titles = tree.xpath('//a[@title="hdrlnk"]/text()')
print titles

The last line, where it supposedly will print the text of each anchor
returns [].

I can't seem to figure out what I'm doing wrong. lxml seems pretty
straightforward but I can't seem to get this down.

Can anyone make any suggestions?

Thanks!
Anthony

Re: [Tutor] filtering listed directories

2015-08-22 Thread Laura Creighton

In a message of Sat, 22 Aug 2015 14:32:56 +0100, Alan Gauld writes:
>But maybe some questions on a Tix (or Tk) forum might
>get more help? Once you know how to do it in native
>Tcl/Tk/Tix you can usually figure out how to do it
>in Python.
>
>-- 
>Alan G

I asked the question on tkinter-discuss, but the question hasn't shown
up yet.

In the meantime, I have found this:
http://www.ccs.neu.edu/research/demeter/course/projects/demdraw/www/tickle/u3/tk3_dialogs.html

which looks like, if we converted it to tkinter, would do the job, since
all it wants is a list of files.

I have guests coming over for dinner, so it will be much later before
I can work on this.  (And I will be slow -- so if you are a wizard
at converting tk to tkinter, by all means feel free to step in here. :) )

Laura


Re: [Tutor] filtering listed directories

2015-08-22 Thread Alan Gauld


On 22/08/15 11:43, Laura Creighton wrote:


>> How can I filter out these hidden directories?
>> Help(tkFileDialog) doesn't help me as it just shows **options, but
>> doesn't show what these options might be.

> tix (tkinter extensions) https://wiki.python.org/moin/Tix
> have some more file dialogs, so maybe there is joy there.



There is a FileSelectDialog in Tix that has a dircmd option
according to the Tix documentation.

However, I've played about with it and can't figure out how
to make it work!

There is also allegedly a 'hidden' check-box subwidget
that controls whether hidden files are shown. Again I
couldn't find how to access this.

But maybe some questions on a Tix (or Tk) forum might
get more help? Once you know how to do it in native
Tcl/Tk/Tix you can usually figure out how to do it
in Python.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos



Re: [Tutor] Can someone explain this to me please

2015-08-22 Thread eryksun

On Fri, Aug 21, 2015 at 1:04 PM, Jon Paris  wrote:
>
> import sys
> x = sys.maxsize
> print ("Max size is: ", x)
> y = (x + 1)
> print ("y is", type(y), "with a value of", y)
>
> Produces this result:
>
> Max size is:  9223372036854775807
> y is <class 'int'> with a value of 9223372036854775808
>
> I was expecting it to error out but instead it produces a value greater than
> the supposed maximum while still keeping it as an int. I’m confused. If
> sys.maxsize _isn’t_ the largest possible value then how do I determine what
> is?

sys.maxsize is the "maximum size lists, strings, dicts, and many other
containers can have". This value is related to the theoretical maximum
of Python's built-in arbitrary precision integer type (i.e. long in
Python 2 and int in Python 3), which can be thought of as a
'container' for 15-bit or 30-bit "digits". For example, in a 64-bit
version of Python 3 that's compiled to use 30-bit digits in its int
objects, the limit is about (sys.maxsize bytes) // (4 bytes /
30bit_digit) * (9 decimal_digits / 30bit_digit) ==
20752587082923245559 decimal digits. In practice you'll get a
MemoryError (or probably a human impatience KeyboardInterrupt) long
before that.

sys.maxint (only in Python 2) is the largest positive value for Python
2's fixed-precision int type. In pure Python code, integer operations
seamlessly transition to using arbitrary-precision integers, so you
have no reason to worry, practically speaking, about reaching the
"largest possible value". As a matter of trivia, sys.maxint in CPython
corresponds to the maximum value of a C long int. In a 64-bit Windows
process, a C long int is 32-bit, which means sys.maxint is 2**31 - 1.
In every other supported OS, sys.maxint is 2**63 - 1 in a 64-bit
process.
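
The estimate above can be reproduced with plain integer arithmetic. This is a rough sketch of the calculation as described in this message, not a statement about CPython internals beyond what the message says; the function name and its default arguments (4 bytes and about 9 decimal digits per 30-bit digit) are chosen here for illustration.

```python
import sys

def max_decimal_digits(maxsize, bytes_per_digit=4, decimal_per_digit=9):
    """Estimate from the message above: internal digits that fit in
    `maxsize` bytes, times roughly 9 decimal digits per 30-bit digit."""
    return maxsize // bytes_per_digit * decimal_per_digit

# sys.maxsize bounds container sizes, not integer values, so this is
# still an ordinary int in Python 3:
bigger = sys.maxsize + 1
print(type(bigger).__name__, bigger > sys.maxsize)

# How many bits this build stores per internal digit (15 or 30):
print(sys.int_info.bits_per_digit)

# The figure quoted for a 64-bit build, where sys.maxsize is 2**63 - 1:
estimate_64bit = max_decimal_digits(2**63 - 1)
print(estimate_64bit)
```

The 64-bit value is passed in explicitly so the result does not depend on the machine the sketch runs on.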

Re: [Tutor] filtering listed directories

2015-08-22 Thread Laura Creighton

In a message of Sat, 22 Aug 2015 12:20:31 +1000, Chris Roy-Smith writes:
>Hi,
>environment: Python 2.7, Ubuntu 12.4 Linux
>
>I am trying to get the list of directories shown by 
>tkFileDialog.askdirectory to not show hidden files (starting with .)
>
>this code results in lots of hidden directories listed in the interface 
>making things harder than they need to be for the user.
>
>#! /usr/bin/python
>import Tkinter, tkFileDialog
>root = Tkinter.Tk()
>root.withdraw()
>
>dirname = tkFileDialog.askdirectory(parent=root, initialdir="/home/chris/", title='Pick a directory')
>
>How can I filter out these hidden directories?
>Help(tkFileDialog) doesn't help me as it just shows **options, but 
>doesn't show what these options might be.

The options are listed here:

http://effbot.org/tkinterbook/tkinter-file-dialogs.htm
or
http://infohost.nmt.edu/tcc/help/pubs/tkinter/web/tkFileDialog.html

Unfortunately, they do not help.  There is all sorts of help for
'only show things that match a certain pattern' but not for
'only show things that do not match a certain pattern'.
(Or maybe my pattern-making skill is at fault, but I don't
think so.)

tix (tkinter extensions) https://wiki.python.org/moin/Tix
have some more file dialogs, so maybe there is joy there.

This seems utterly crazy to me -- you surely aren't the first person
who wanted to exclude certain directories in a file dialog.

I will look more later this afternoon in my old tkinter files.  I
must have wanted to do this at one point, mustn't I?

puzzled,
Laura


[Tutor] filtering listed directories

2015-08-22 Thread Chris Roy-Smith


Hi,
environment: Python 2.7, Ubuntu 12.4 Linux

I am trying to get the list of directories shown by 
tkFileDialog.askdirectory to not show hidden files (starting with .)


this code results in lots of hidden directories listed in the interface 
making things harder than they need to be for the user.


#! /usr/bin/python
import Tkinter, tkFileDialog
root = Tkinter.Tk()
root.withdraw()

dirname = tkFileDialog.askdirectory(parent=root, initialdir="/home/chris/", title='Pick a directory')


How can I filter out these hidden directories?
Help(tkFileDialog) doesn't help me as it just shows **options, but 
doesn't show what these options might be.


Re: [Tutor] Do not understand why test is running.

2015-08-22 Thread Peter Otten

boB Stepp wrote:

> In the cold light of morning, I see that in this invocation, the path
> is wrong.  But even if I correct it, I get the same results:
> 
> e:\Projects\mcm>py -m unittest ./test/db/test_manager.py
[...]
> ValueError: Empty module name

Make sure that there are files

./test/__init__.py
./test/db/__init__.py

and then try

py -m unittest test.db.test_manager


> e:\Projects\mcm>py ./test/db/test_manager.py
> Traceback (most recent call last):
>   File "./test/db/test_manager.py", line 16, in <module>
>     import mcm.db.manager
> ImportError: No module named 'mcm'

Make sure the parent directory of the mcm package (I believe this is 
E:\Projects\mcm) is in your PYTHONPATH, then try again.
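
To see both suggestions working together, here is a small sketch that builds a throwaway copy of the layout in a temporary directory and runs the test by its dotted module name. Everything in it (the placeholder test case, the helper name run_by_dotted_name, the use of a temporary directory) is invented for illustration and stands in for the real mcm project.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Placeholder test module; the real test_manager.py is not shown in the thread.
TEST_SOURCE = '''\
import unittest

class PlaceholderTest(unittest.TestCase):
    def test_truth(self):
        self.assertTrue(True)
'''

def run_by_dotted_name(root):
    """Create test/__init__.py and test/db/__init__.py under root, then run
    `python -m unittest test.db.test_manager` from that directory."""
    db_dir = Path(root) / "test" / "db"
    db_dir.mkdir(parents=True)
    (Path(root) / "test" / "__init__.py").write_text("")
    (db_dir / "__init__.py").write_text("")
    (db_dir / "test_manager.py").write_text(TEST_SOURCE)
    # Put the project root on PYTHONPATH so its packages can be imported.
    env = dict(os.environ, PYTHONPATH=str(root))
    return subprocess.run(
        [sys.executable, "-m", "unittest", "test.db.test_manager"],
        cwd=str(root), env=env, capture_output=True, text=True,
    )

with tempfile.TemporaryDirectory() as tmp:
    result = run_by_dotted_name(tmp)

print(result.returncode)
print(result.stderr)  # unittest writes its report to stderr
```

The dotted name works because each directory on the way to test_manager.py is a package; passing a file path such as ./test/db/test_manager.py to -m unittest is what produced the "Empty module name" error quoted above.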



Re: [Tutor] Complications Take Two (Long) Frustrations.

2015-08-22 Thread Laura Creighton

In a message of Sat, 22 Aug 2015 17:00:55 +1000, "Steven D'Aprano" writes:
>On Fri, Aug 21, 2015 at 11:29:52PM +0200, Roel Schroeven wrote:
>> Joel Goldstick schreef op 2015-08-21 23:22:
>> >so:
>> >print -max(-A, -B)
>> 
>> That's what I mean, yes. I haven't tried it, but I don't see why it 
>> wouldn't work.
>
>It won't work with anything which isn't a number:
>
>py> min("hello", "goodbye")
>'goodbye'
>
>
>But the max trick fails:
>
>py> -max(-"hello", -"goodbye")
>Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>TypeError: bad operand type for unary -: 'str'
>
>
>If you want to write your own min without using the built-in, there is 
>only one correct way to do it that works for all objects:
>
>def min(a, b):
>    if a < b: return a
>    return b
>
>Well, more than one way -- you can change the "a < b" to "a <= b" if you 
>prefer. Or reverse the test and use >, or similar, but you know what I 
>mean.
>
>-- 
>Steve

Yes, but I think the OP's problem is that he has a fool for a teacher,
or a course designer at any rate.  For some reason the author thinks
that the fact that min(A, B) == -max(-A, -B) (for integers) is very,
very clever.

And somehow the teacher hasn't learnt that his or her job is to make
students question 'clever programming' while not destroying the
enthusiasm of any students who come up with clever solutions on their
own.  Cleverness is the consolation prize in this business -- what you
want to write is code that demonstrates wisdom, not cleverness.

They are fun to write, though.

But remember:

 Everyone knows that debugging is twice as hard as writing a
 program in the first place. So if you're as clever as you can be
 when you write it, how will you ever debug it?

 -- Brian Kernighan, The Elements of Programming Style

So the reflex you want to develop is 'I just did something clever.
Hmmm.  Maybe _too_ clever.  Let's see ...'  The cleverer you are as a
person, the more you have to develop this reflex, because after all,
somebody much less clever -- or experienced -- than you are may have
to fix a bug in your code some day.

So the min(A, B) == -max(-A, -B) trick has everything to do with
'Watch me pull a rabbit out of this hat' and nothing to do with
'good programming style'.

Too much education of the sort that rewards cleverness and penalises
wisdom means we end up with a lot of smart people in this world who
have managed to get the idea that 'Wisdom is something that only
stupid people need.  It is optional for smart people, and I am smart
enough to do without!'

Some people _never_ unlearn this one.  My family is, alas, full of them.

Laura


Re: [Tutor] Complications Take Two (Long) Frustrations.

2015-08-22 Thread Steven D'Aprano

On Fri, Aug 21, 2015 at 11:29:52PM +0200, Roel Schroeven wrote:
> Joel Goldstick schreef op 2015-08-21 23:22:
> >so:
> >print -max(-A, -B)
> 
> That's what I mean, yes. I haven't tried it, but I don't see why it 
> wouldn't work.

It won't work with anything which isn't a number:

py> min("hello", "goodbye")
'goodbye'

But the max trick fails:

py> -max(-"hello", -"goodbye")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: bad operand type for unary -: 'str'

If you want to write your own min without using the built-in, there is 
only one correct way to do it that works for all objects:

def min(a, b):
    if a < b: return a
    return b

Well, more than one way -- you can change the "a < b" to "a <= b" if you 
prefer. Or reverse the test and use >, or similar, but you know what I 
mean.

-- 
Steve
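
The difference between the two approaches can be checked directly. In this sketch the names my_min and negated_max are made up (my_min avoids shadowing the built-in min); the comparison-based version is the one given in this message, and the negation version is the trick from earlier in the thread.

```python
def my_min(a, b):
    """Comparison-based minimum: works for any two objects that support <."""
    if a < b:
        return a
    return b

def negated_max(a, b):
    """The -max(-a, -b) trick: only works for objects that support unary minus."""
    return -max(-a, -b)

print(my_min(3, 7), negated_max(3, 7))   # both agree for numbers
print(my_min("hello", "goodbye"))        # strings compare fine

try:
    negated_max("hello", "goodbye")
except TypeError as exc:
    print("negated_max failed:", exc)    # strings cannot be negated
```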