Re: [Tutor] Iterating through a list of strings

2010-05-02 Thread Stefan Behnel

Thomas C. Hicks, 03.05.2010 07:16:

%Comment introducing the next block of packages
%Below are the packages for using Chinese on the system
%Third line of comment because I am a verbose guy!
ibus-pinyin
ibus-table-wubi
language-pack-zh-hans

etc.

I read the lines of the file into a list for processing.  To strip
out the comments lines I am using something like this:

for x in lines:
 if x.startswith('%'):
 lines.remove(x)

>

This works great for all incidents of comments that are only one
line.  Sometimes I have blocks of comments that are more than one
line and find that the odd numbered lines are stripped from the list
but not the even numbered lines


You are modifying the list during iteration, so the size changes and the 
iterator gets diverted. Don't remove the line, just skip over it, e.g.


def read_package_names(open_text_file):
"""Read lines, strip any comments and return only the
package names found.
"""
for line in open_text_file:
if '%' in line:
# take only the part before the '%'
line = line.split('%', 1)[0]
line = line.strip()
if line:
yield line

with open('packages.txt') as f:
for package_name in read_package_names(f):
print package_name

Stefan

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


[Tutor] Iterating through a list of strings

2010-05-02 Thread Thomas C. Hicks
I am using Python 2.6.4 in Ubuntu.  Since I use Ubuntu (with its every
6 months updates) and want to learn Python I have been working on a
post-install script that would get my Ubuntu system up and running with
my favorite packages quickly.  Basically the script reads a text file,
processes the lines in the file and then does an apt-get for each line
that is a package name.  The text file looks like this:

%Comment introducing the next block of packages
%Below are the packages for using Chinese on the system
%Third line of comment because I am a verbose guy!
ibus-pinyin
ibus-table-wubi
language-pack-zh-hans

etc.

I read the lines of the file into a list for processing.  To strip
out the comments lines I am using something like this:

for x in lines:
if x.startswith('%'):
lines.remove(x)

This works great for all incidents of comments that are only one
line.  Sometimes I have blocks of comments that are more than one
line and find that the odd numbered lines are stripped from the list
but not the even numbered lines (i.e in the above block the line
"%Below are the ..." line would not be stripped out of the
list).

Obviously there is something I don't understand about processing
the items in the list and using the string function x.startswith() and
the list function list.remove(). Interestingly if I put in "print x"
in place of the lines.remove(x) line I get all the comment lines
printed.

Can anyone point me in the right direction?  

thomas
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] python list, right! but concretely?

2010-05-02 Thread Alan Gauld


"spir ☣"  wrote


...When writing "size = size + new_allocated" instead of
"size = new_allocated", It get a growth pattern of:
   0 3 6 9 16 24 33 43 54 66 80 96 114
Which is not exactly what is stated in code, but rather similar...


You mean like this line in the source?:

new_allocated += newsize;

Alan G. 



___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Is there a better way to use scientific notation in an equation?

2010-05-02 Thread Eike Welk
On Sunday May 2 2010 22:44:42 David Hutto wrote:
> Q1 and Q2 are to be entered as base ten scientific notation.
> When I try to input Q1 as raw input, entering in ((2*(10**7)), I get:
> 
> ValueError: invalid literal for int() with base 10: '((2)*(10**7))'
> 
> Which is why I broke it down into it's sub-components(i.e. what to
> multiply the base 10 by[Q1mult] and what exponent to use with the base
> 10[Q1exp]).
> 
> Is there a better way to write this, and what would be the best way to
> insert the
> scientific notation if not this way. 

Maybe this is what you are looking for:

In [1]: 1e9
Out[1]: 10.0

In [2]: float(raw_input())
1e9
Out[2]: 10.0


Eike.
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Is there a better way to use scientific notation in an equation?

2010-05-02 Thread Steven D'Aprano
On Mon, 3 May 2010 06:44:42 am David Hutto wrote:
> In the following code I'm trying to do basic calculations with
> coulumb's law
>
> #Coulombs Law
> '''
> F = (9*(10**9)) * (Q1*Q2) / (d**2)
> '''
> base = 10
> Q1mult = raw_input('First enter multiplier of base 10
> charge/coloumb(Q1):') Q1exp = raw_input('Now enter exponent of base
> 10(Q1):')
> Q1 = int(Q1mult)*(10**int(Q1exp))
> Q2mult = raw_input('First enter multiplier of base 10
> charge/coloumb(Q2):') Q2exp = raw_input('Now enter exponent of base
> 10(Q2):')
> Q2 = int(Q2mult)*(10**int(Q2exp))
> d = raw_input('Enter distance of charges/coulumbs from Q1 to Q2:')
>
>
>
> a = (9*(10**9))*((int(Q1))*(int(Q2)))/((int(d))**2)
> print a
> **
>
> Q1 and Q2 are to be entered as base ten scientific notation.
> When I try to input Q1 as raw input, entering in ((2*(10**7)), I get:
>
> ValueError: invalid literal for int() with base 10: '((2)*(10**7))'

((2)*(10**7)) is not an integer, it is a mathematical expression. 
Integers look like:

1
-27
3748105

and similar. Apart from a leading minus sign to make negative numbers, 
only digits are allowed.

The way to enter numbers in scientific notation is to use floats, in 
exponential form, just like scientific calculators:

2e7

and to change your program to use float instead of int. Also, you have a 
lot of unnecessary brackets that don't do anything except make it hard 
to read. This should do it:

a = 9e9*float(Q1)*float(Q2)/float(d)**2



-- 
Steven D'Aprano
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


[Tutor] Is there a better way to use scientific notation in an equation?

2010-05-02 Thread David Hutto
In the following code I'm trying to do basic calculations with coulumb's law

#Coulombs Law
'''
F = (9*(10**9)) * (Q1*Q2) / (d**2)
'''
base = 10
Q1mult = raw_input('First enter multiplier of base 10 charge/coloumb(Q1):')
Q1exp = raw_input('Now enter exponent of base 10(Q1):')
Q1 = int(Q1mult)*(10**int(Q1exp))
Q2mult = raw_input('First enter multiplier of base 10 charge/coloumb(Q2):')
Q2exp = raw_input('Now enter exponent of base 10(Q2):')
Q2 = int(Q2mult)*(10**int(Q2exp))
d = raw_input('Enter distance of charges/coulumbs from Q1 to Q2:')



a = (9*(10**9))*((int(Q1))*(int(Q2)))/((int(d))**2)
print a
**

Q1 and Q2 are to be entered as base ten scientific notation.
When I try to input Q1 as raw input, entering in ((2*(10**7)), I get:

ValueError: invalid literal for int() with base 10: '((2)*(10**7))'

Which is why I broke it down into it's sub-components(i.e. what to
multiply the base 10 by[Q1mult] and what exponent to use with the base
10[Q1exp]).

Is there a better way to write this, and what would be the best way to
insert the
scientific notation if not this way. I know there is probably a
module, but this is
for my own practice.

TIA
David
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] python list, right! but concretely?

2010-05-02 Thread spir ☣
On Mon, 3 May 2010 00:50:40 +1000
Steven D'Aprano  wrote:

> On Sun, 2 May 2010 07:44:22 pm Lie Ryan wrote:
> 
> > Python's 'list' is an array of pointers to `PyObject` ('object' in
> > Python) and the resizing algorithm keeps the list size such that
> > "allocated / 2 <= actual <= allocated". When list need to resize, it
> > overallocates the list slightly over 1.125 times than the requested
> > size "(newsize >> 3) + (newsize < 9 ? 3 : 6) + newsize".
> >
> > If you're interested in more precise details (since I do omit some,
> > especially those that seems to be micro-optimization-related), you
> > need to read the source at
> > http://code.python.org/hg/trunk/file/e9d930f8b8ff/Objects/listobject.c
> 
> Thanks for the correction Lie. However, I don't understand what is going 
> on in the code. I'm not a C programmer, so I'm struggling to read it, 
> but as far as I can tell the comment in the code and the code itself 
> are not the same.
> 
> The code calculates an amount to over-allocate the list:
> 
> new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);
> 
> Some experiments in Python shows that the right hand expression varies 
> like this:
> 
> >>> for i in range(30):
> ... print ((i >> 3) + (3 if i < 9 else 6)),
> ...
> 3 3 3 3 3 3 3 3 4 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8 9 9 9 9 9 9

Right, I don't understand neither. Since i>>8 <=> i/8, the above code is 
finally a (fast, because of bit-level op) version of:

if size < 8:
return 3
else:
return size/8 + (3 if size < 9 else 6),

> But the comment in the code says:
> 
> /* This over-allocates proportional to the list size, making room
> * for additional growth.  The over-allocation is mild, but is
> * enough to give linear-time amortized behavior over a long
> * sequence of appends() in the presence of a poorly-performing
> * system realloc().
> * The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
> */
> 
> which doesn't look anything like the numbers above. I don't understand 
> what this growth pattern refers to. What have I misunderstood?

I first thought "growth pattern" is to be interpreted as "how the array size 
grows when it is progressively fed with new items", like when feeding a list in 
a loop.
But I tried to simulate this, and the size sticks at 3. finally, I think what 
is called in source "new_allocated" is not the new array memory size, but the 
_difference_. When writing "size = size + new_allocated" instead of "size = 
new_allocated", It get a growth pattern of:
0 3 6 9 16 24 33 43 54 66 80 96 114
Which is not exactly what is stated in code, but rather similar...

def grow():
size = 0 ; count = 0
for count in range(100):
if count > size:
size = size   +   (size >> 3) + (3 if size < 9 else 6)
print size,

Denis


vit esse estrany ☣

spir.wikidot.com
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] python list, right! but concretely?

2010-05-02 Thread Steven D'Aprano
On Sun, 2 May 2010 07:44:22 pm Lie Ryan wrote:

> Python's 'list' is an array of pointers to `PyObject` ('object' in
> Python) and the resizing algorithm keeps the list size such that
> "allocated / 2 <= actual <= allocated". When list need to resize, it
> overallocates the list slightly over 1.125 times than the requested
> size "(newsize >> 3) + (newsize < 9 ? 3 : 6) + newsize".
>
> If you're interested in more precise details (since I do omit some,
> especially those that seems to be micro-optimization-related), you
> need to read the source at
> http://code.python.org/hg/trunk/file/e9d930f8b8ff/Objects/listobject.c

Thanks for the correction Lie. However, I don't understand what is going 
on in the code. I'm not a C programmer, so I'm struggling to read it, 
but as far as I can tell the comment in the code and the code itself 
are not the same.

The code calculates an amount to over-allocate the list:

new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6);

Some experiments in Python shows that the right hand expression varies 
like this:

>>> for i in range(30):
... print ((i >> 3) + (3 if i < 9 else 6)),
...
3 3 3 3 3 3 3 3 4 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8 9 9 9 9 9 9


But the comment in the code says:

/* This over-allocates proportional to the list size, making room
* for additional growth.  The over-allocation is mild, but is
* enough to give linear-time amortized behavior over a long
* sequence of appends() in the presence of a poorly-performing
* system realloc().
* The growth pattern is:  0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
*/

which doesn't look anything like the numbers above. I don't understand 
what this growth pattern refers to. What have I misunderstood?



-- 
Steven D'Aprano
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] python list, right! but concretely?

2010-05-02 Thread spir ☣
On Sun, 2 May 2010 18:57:41 +1000
Steven D'Aprano  wrote:

> On Sun, 2 May 2010 03:49:02 pm spir ☣ wrote:
> > Hello,
> >
> > Is there anywhere some introduction material to the implementation of
> > python lists (or to fully dynamic and flexible sequences, in
> > general)? More precisely, I'd like to know what kind of base
> > data-structure is used (linked list, dynamic array, something else --
> > and in case of array, how is resizing computed). Also, how is
> > "generics" (element are of free type, and even heterogeneous)
> > managed?
> 
> I'm not sure if this is documented anywhere (other than the source code, 
> of course) but in CPython lists are implemented as arrays of pointers 
> to objects.
> 
> Because re-sizing arrays is expensive, the array is over-allocated: the 
> array is (very approximately) created twice the size needed, to give 
> room to grow. Then when the array is full, or nearly full, it is 
> doubled in size again. This gives amortised O(1) appends, at the cost 
> of being 50% larger than needed on average. This means that append is 
> nearly always very, very fast, with an occasional, and rare, slow 
> append when the list resizes.
> 
> Because the list is an array, iteration is very fast and lookups are 
> O(1). However, deleting from the start or middle of the list is O(N), 
> as are insertions. If speed is important, and you need fast insertions 
> and deletions at both the start and end of the list, then you should 
> use collections.deque; however item lookup for deques is O(N).
> 
> Of course all of this is an implementation detail. Other implementations 
> are free to make other memory/time tradeoffs, although if their lists 
> performed very differently, people would complain.

Thank you. The reason why I started with linked lists rather than flexible 
arrays is the problem of deletion & insertion other than at the end. Is the 
tail of the list after the given element really shifted left or right, or are 
tricks used to avoid that? (I mean in a typical implementation). I imagined for 
instance there could be kinds of null placeholders for deleted items (this 
indeed makes all the rest more complicated but avoids compressing the list 
after deletion).

> > Denis
> >
> > PS: The reason is I'm trying to understand better the underlying
> > layers of magic facilities we use everyday. Starting from about no
> > computer science knowledge. I have a working implementation of linked
> > lists in primitive langguages ;-) (namely C & Pascal), but rather
> > complicated to my taste; and I'm probably overlooking commonly known
> > and used tricks that could make the algorithm simpler or more
> > efficient.
> 
> The simplest implementation of a linked list in Pascal is something like 
> this:
> 
> type:
>   ptr = ^node;
>   node = record
> data: integer;
> next: ptr
> end;
> 
> Here's a procedure to traverse the list, printing each item:
> 
> {untested}
> procedure walk(alist: ptr):
>   begin
> while alist <> nil:
>   begin
> writeln(alist^.data);
> alist := alist^.next
>   end;
>   end;
> 
> Everything else, I leave as an exercise :)

Yes, you're right. But when I said "linked list", I actually meant a kind of 
type with all needed "methods" ;-) 


Denis


vit esse estrany ☣

spir.wikidot.com
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] python list, right! but concretely?

2010-05-02 Thread spir ☣
On Sun, 02 May 2010 19:44:22 +1000
Lie Ryan  wrote:

> On 05/02/10 15:49, spir ☣ wrote:
> > Hello,
> > 
> > Is there anywhere some introduction material to the implementation of 
> > python lists
> > (or to fully dynamic and flexible sequences, in general)?
> 
> 
> > More precisely, I'd like to know what kind of base data-structure is used
> > (linked list, dynamic array, something else -- and in case of array,
> how is
> > resizing computed). Also, how is "generics" (element are of free type, and
> > even heterogeneous) managed?
> 
> Python's 'list' is an array of pointers to `PyObject` ('object' in
> Python) and the resizing algorithm keeps the list size such that
> "allocated / 2 <= actual <= allocated". When list need to resize, it
> overallocates the list slightly over 1.125 times than the requested size
> "(newsize >> 3) + (newsize < 9 ? 3 : 6) + newsize".

Thank you very much, that's exactly the kind of information I was looking for.
One "mystery" is now solved, if I understand correctly: actually it's not 
really generics, rather the type is a supertype of all relevant ones.

> If you're interested in more precise details (since I do omit some,
> especially those that seems to be micro-optimization-related), you need
> to read the source at
> http://code.python.org/hg/trunk/file/e9d930f8b8ff/Objects/listobject.c

Right, I'll have a look myself, now. Thank you for the pointer.

> > Denis
> > 
> > PS: The reason is I'm trying to understand better the underlying layers of
> > magic facilities we use everyday. Starting from about no computer science
> > knowledge. I have a working implementation of linked lists in primitive
> > langguages ;-) (namely C & Pascal), but rather complicated to my taste;
> > and I'm probably overlooking commonly known and used tricks that could
> > make the algorithm simpler or more efficient.
> 
> Real life implementation is always hairy, especially as the programmer
> cut corners to drench some speed benefit and include many error
> checkings. Today is the first time I actually looked at list's
> implementation, now I know why people hated C, for every line of real
> code, there's three or four lines of error checking code, e.g. to ensure
> malloc successfully allocated enough memory or that index access is in
> range.

Yes, I know that for having played with C a very long time ago. The reason why 
I started with Pascal, which makes things slightly simpler (esp. freepascal in 
my case). But maybe even more verbose: about 450 loc for a link list type (with 
all required operations, sure).


Denis


vit esse estrany ☣

spir.wikidot.com
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] python list, right! but concretely?

2010-05-02 Thread Lie Ryan
On 05/02/10 15:49, spir ☣ wrote:
> Hello,
> 
> Is there anywhere some introduction material to the implementation of python 
> lists
> (or to fully dynamic and flexible sequences, in general)?


> More precisely, I'd like to know what kind of base data-structure is used
> (linked list, dynamic array, something else -- and in case of array,
how is
> resizing computed). Also, how is "generics" (element are of free type, and
> even heterogeneous) managed?

Python's 'list' is an array of pointers to `PyObject` ('object' in
Python) and the resizing algorithm keeps the list size such that
"allocated / 2 <= actual <= allocated". When list need to resize, it
overallocates the list slightly over 1.125 times than the requested size
"(newsize >> 3) + (newsize < 9 ? 3 : 6) + newsize".

If you're interested in more precise details (since I do omit some,
especially those that seems to be micro-optimization-related), you need
to read the source at
http://code.python.org/hg/trunk/file/e9d930f8b8ff/Objects/listobject.c

> Denis
> 
> PS: The reason is I'm trying to understand better the underlying layers of
> magic facilities we use everyday. Starting from about no computer science
> knowledge. I have a working implementation of linked lists in primitive
> langguages ;-) (namely C & Pascal), but rather complicated to my taste;
> and I'm probably overlooking commonly known and used tricks that could
> make the algorithm simpler or more efficient.

Real life implementation is always hairy, especially as the programmer
cut corners to drench some speed benefit and include many error
checkings. Today is the first time I actually looked at list's
implementation, now I know why people hated C, for every line of real
code, there's three or four lines of error checking code, e.g. to ensure
malloc successfully allocated enough memory or that index access is in
range.

___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] python list, right! but concretely?

2010-05-02 Thread Steven D'Aprano
On Sun, 2 May 2010 03:49:02 pm spir ☣ wrote:
> Hello,
>
> Is there anywhere some introduction material to the implementation of
> python lists (or to fully dynamic and flexible sequences, in
> general)? More precisely, I'd like to know what kind of base
> data-structure is used (linked list, dynamic array, something else --
> and in case of array, how is resizing computed). Also, how is
> "generics" (element are of free type, and even heterogeneous)
> managed?

I'm not sure if this is documented anywhere (other than the source code, 
of course) but in CPython lists are implemented as arrays of pointers 
to objects.

Because re-sizing arrays is expensive, the array is over-allocated: the 
array is (very approximately) created twice the size needed, to give 
room to grow. Then when the array is full, or nearly full, it is 
doubled in size again. This gives amortised O(1) appends, at the cost 
of being 50% larger than needed on average. This means that append is 
nearly always very, very fast, with an occasional, and rare, slow 
append when the list resizes.

Because the list is an array, iteration is very fast and lookups are 
O(1). However, deleting from the start or middle of the list is O(N), 
as are insertions. If speed is important, and you need fast insertions 
and deletions at both the start and end of the list, then you should 
use collections.deque; however item lookup for deques is O(N).

Of course all of this is an implementation detail. Other implementations 
are free to make other memory/time tradeoffs, although if their lists 
performed very differently, people would complain.


> Denis
>
> PS: The reason is I'm trying to understand better the underlying
> layers of magic facilities we use everyday. Starting from about no
> computer science knowledge. I have a working implementation of linked
> lists in primitive langguages ;-) (namely C & Pascal), but rather
> complicated to my taste; and I'm probably overlooking commonly known
> and used tricks that could make the algorithm simpler or more
> efficient.

The simplest implementation of a linked list in Pascal is something like 
this:

type:
  ptr = ^node;
  node = record
data: integer;
next: ptr
end;

Here's a procedure to traverse the list, printing each item:

{untested}
procedure walk(alist: ptr):
  begin
while alist <> nil:
  begin
writeln(alist^.data);
alist := alist^.next
  end;
  end;

Everything else, I leave as an exercise :)



-- 
Steven D'Aprano
___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] python list, right! but concretely?

2010-05-02 Thread Alan Gauld

"spir ☣"  wrote

Is there anywhere some introduction material to the implementation
of python lists (or to fully dynamic and flexible sequences, in general)?


The definitive information is the source code which is freely available.


More precisely, I'd like to know what kind of base data-structure
is used (linked list, dynamic array, something else -- and in case of
array, how is resizing computed).


I'd look to the source for that. It may be a combination of traditional
techniques. But I've never been curious enough to look :-)


Also, how is "generics" (element are of free type, and even
heterogeneous) managed?


Everything is an object in Python. So it is a list of objects.


no computer science knowledge. I have a working implementation
of linked lists in primitive languages ;-) (namely C & Pascal),
but rather complicated to my taste; and I'm probably overlooking
commonly known and used tricks that could make the algorithm
simpler or more efficient.


Hmm, Linked lists in C and Pascal are usually pretty trivial.
Unless you have done something unusual its hard to think how
you could simplify them. You can complexify them for better speed
or storage efficiency but the basic linked list in either language
is about as simple as dynamic data gets.

HTH,

--
Alan Gauld
Author of the Learn to Program web site
http://www.alan-g.me.uk/ 



___
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor