Re: [Tutor] Iterating through a list of strings
Thomas C. Hicks, 03.05.2010 07:16: %Comment introducing the next block of packages %Below are the packages for using Chinese on the system %Third line of comment because I am a verbose guy! ibus-pinyin ibus-table-wubi language-pack-zh-hans etc. I read the lines of the file into a list for processing. To strip out the comments lines I am using something like this: for x in lines: if x.startswith('%'): lines.remove(x) > This works great for all incidents of comments that are only one line. Sometimes I have blocks of comments that are more than one line and find that the odd numbered lines are stripped from the list but not the even numbered lines You are modifying the list during iteration, so the size changes and the iterator gets diverted. Don't remove the line, just skip over it, e.g. def read_package_names(open_text_file): """Read lines, strip any comments and return only the package names found. """ for line in open_text_file: if '%' in line: # take only the part before the '%' line = line.split('%', 1)[0] line = line.strip() if line: yield line with open('packages.txt') as f: for package_name in read_package_names(f): print package_name Stefan ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
[Tutor] Iterating through a list of strings
I am using Python 2.6.4 in Ubuntu. Since I use Ubuntu (with its every 6 months updates) and want to learn Python I have been working on a post-install script that would get my Ubuntu system up and running with my favorite packages quickly. Basically the script reads a text file, processes the lines in the file and then does an apt-get for each line that is a package name. The text file looks like this: %Comment introducing the next block of packages %Below are the packages for using Chinese on the system %Third line of comment because I am a verbose guy! ibus-pinyin ibus-table-wubi language-pack-zh-hans etc. I read the lines of the file into a list for processing. To strip out the comments lines I am using something like this: for x in lines: if x.startswith('%'): lines.remove(x) This works great for all incidents of comments that are only one line. Sometimes I have blocks of comments that are more than one line and find that the odd numbered lines are stripped from the list but not the even numbered lines (i.e in the above block the line "%Below are the ..." line would not be stripped out of the list). Obviously there is something I don't understand about processing the items in the list and using the string function x.startswith() and the list function list.remove(). Interestingly if I put in "print x" in place of the lines.remove(x) line I get all the comment lines printed. Can anyone point me in the right direction? thomas ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] python list, right! but concretely?
"spir ☣" wrote ...When writing "size = size + new_allocated" instead of "size = new_allocated", It get a growth pattern of: 0 3 6 9 16 24 33 43 54 66 80 96 114 Which is not exactly what is stated in code, but rather similar... You mean like this line in the source?: new_allocated += newsize; Alan G. ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Is there a better way to use scientific notation in an equation?
On Sunday May 2 2010 22:44:42 David Hutto wrote: > Q1 and Q2 are to be entered as base ten scientific notation. > When I try to input Q1 as raw input, entering in ((2*(10**7)), I get: > > ValueError: invalid literal for int() with base 10: '((2)*(10**7))' > > Which is why I broke it down into it's sub-components(i.e. what to > multiply the base 10 by[Q1mult] and what exponent to use with the base > 10[Q1exp]). > > Is there a better way to write this, and what would be the best way to > insert the > scientific notation if not this way. Maybe this is what you are looking for: In [1]: 1e9 Out[1]: 10.0 In [2]: float(raw_input()) 1e9 Out[2]: 10.0 Eike. ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] Is there a better way to use scientific notation in an equation?
On Mon, 3 May 2010 06:44:42 am David Hutto wrote: > In the following code I'm trying to do basic calculations with > coulumb's law > > #Coulombs Law > ''' > F = (9*(10**9)) * (Q1*Q2) / (d**2) > ''' > base = 10 > Q1mult = raw_input('First enter multiplier of base 10 > charge/coloumb(Q1):') Q1exp = raw_input('Now enter exponent of base > 10(Q1):') > Q1 = int(Q1mult)*(10**int(Q1exp)) > Q2mult = raw_input('First enter multiplier of base 10 > charge/coloumb(Q2):') Q2exp = raw_input('Now enter exponent of base > 10(Q2):') > Q2 = int(Q2mult)*(10**int(Q2exp)) > d = raw_input('Enter distance of charges/coulumbs from Q1 to Q2:') > > > > a = (9*(10**9))*((int(Q1))*(int(Q2)))/((int(d))**2) > print a > ** > > Q1 and Q2 are to be entered as base ten scientific notation. > When I try to input Q1 as raw input, entering in ((2*(10**7)), I get: > > ValueError: invalid literal for int() with base 10: '((2)*(10**7))' ((2)*(10**7)) is not an integer, it is a mathematical expression. Integers look like: 1 -27 3748105 and similar. Apart from a leading minus sign to make negative numbers, only digits are allowed. The way to enter numbers in scientific notation is to use floats, in exponential form, just like scientific calculators: 2e7 and to change your program to use float instead of int. Also, you have a lot of unnecessary brackets that don't do anything except make it hard to read. This should do it: a = 9e9*float(Q1)*float(Q2)/float(d)**2 -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
[Tutor] Is there a better way to use scientific notation in an equation?
In the following code I'm trying to do basic calculations with coulumb's law #Coulombs Law ''' F = (9*(10**9)) * (Q1*Q2) / (d**2) ''' base = 10 Q1mult = raw_input('First enter multiplier of base 10 charge/coloumb(Q1):') Q1exp = raw_input('Now enter exponent of base 10(Q1):') Q1 = int(Q1mult)*(10**int(Q1exp)) Q2mult = raw_input('First enter multiplier of base 10 charge/coloumb(Q2):') Q2exp = raw_input('Now enter exponent of base 10(Q2):') Q2 = int(Q2mult)*(10**int(Q2exp)) d = raw_input('Enter distance of charges/coulumbs from Q1 to Q2:') a = (9*(10**9))*((int(Q1))*(int(Q2)))/((int(d))**2) print a ** Q1 and Q2 are to be entered as base ten scientific notation. When I try to input Q1 as raw input, entering in ((2*(10**7)), I get: ValueError: invalid literal for int() with base 10: '((2)*(10**7))' Which is why I broke it down into it's sub-components(i.e. what to multiply the base 10 by[Q1mult] and what exponent to use with the base 10[Q1exp]). Is there a better way to write this, and what would be the best way to insert the scientific notation if not this way. I know there is probably a module, but this is for my own practice. TIA David ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] python list, right! but concretely?
On Mon, 3 May 2010 00:50:40 +1000 Steven D'Aprano wrote: > On Sun, 2 May 2010 07:44:22 pm Lie Ryan wrote: > > > Python's 'list' is an array of pointers to `PyObject` ('object' in > > Python) and the resizing algorithm keeps the list size such that > > "allocated / 2 <= actual <= allocated". When list need to resize, it > > overallocates the list slightly over 1.125 times than the requested > > size "(newsize >> 3) + (newsize < 9 ? 3 : 6) + newsize". > > > > If you're interested in more precise details (since I do omit some, > > especially those that seems to be micro-optimization-related), you > > need to read the source at > > http://code.python.org/hg/trunk/file/e9d930f8b8ff/Objects/listobject.c > > Thanks for the correction Lie. However, I don't understand what is going > on in the code. I'm not a C programmer, so I'm struggling to read it, > but as far as I can tell the comment in the code and the code itself > are not the same. > > The code calculates an amount to over-allocate the list: > > new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6); > > Some experiments in Python shows that the right hand expression varies > like this: > > >>> for i in range(30): > ... print ((i >> 3) + (3 if i < 9 else 6)), > ... > 3 3 3 3 3 3 3 3 4 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8 9 9 9 9 9 9 Right, I don't understand neither. Since i>>8 <=> i/8, the above code is finally a (fast, because of bit-level op) version of: if size < 8: return 3 else: return size/8 + (3 if size < 9 else 6), > But the comment in the code says: > > /* This over-allocates proportional to the list size, making room > * for additional growth. The over-allocation is mild, but is > * enough to give linear-time amortized behavior over a long > * sequence of appends() in the presence of a poorly-performing > * system realloc(). > * The growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ... > */ > > which doesn't look anything like the numbers above. I don't understand > what this growth pattern refers to. What have I misunderstood? I first thought "growth pattern" is to be interpreted as "how the array size grows when it is progressively fed with new items", like when feeding a list in a loop. But I tried to simulate this, and the size sticks at 3. finally, I think what is called in source "new_allocated" is not the new array memory size, but the _difference_. When writing "size = size + new_allocated" instead of "size = new_allocated", It get a growth pattern of: 0 3 6 9 16 24 33 43 54 66 80 96 114 Which is not exactly what is stated in code, but rather similar... def grow(): size = 0 ; count = 0 for count in range(100): if count > size: size = size + (size >> 3) + (3 if size < 9 else 6) print size, Denis vit esse estrany ☣ spir.wikidot.com ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] python list, right! but concretely?
On Sun, 2 May 2010 07:44:22 pm Lie Ryan wrote: > Python's 'list' is an array of pointers to `PyObject` ('object' in > Python) and the resizing algorithm keeps the list size such that > "allocated / 2 <= actual <= allocated". When list need to resize, it > overallocates the list slightly over 1.125 times than the requested > size "(newsize >> 3) + (newsize < 9 ? 3 : 6) + newsize". > > If you're interested in more precise details (since I do omit some, > especially those that seems to be micro-optimization-related), you > need to read the source at > http://code.python.org/hg/trunk/file/e9d930f8b8ff/Objects/listobject.c Thanks for the correction Lie. However, I don't understand what is going on in the code. I'm not a C programmer, so I'm struggling to read it, but as far as I can tell the comment in the code and the code itself are not the same. The code calculates an amount to over-allocate the list: new_allocated = (newsize >> 3) + (newsize < 9 ? 3 : 6); Some experiments in Python shows that the right hand expression varies like this: >>> for i in range(30): ... print ((i >> 3) + (3 if i < 9 else 6)), ... 3 3 3 3 3 3 3 3 4 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8 9 9 9 9 9 9 But the comment in the code says: /* This over-allocates proportional to the list size, making room * for additional growth. The over-allocation is mild, but is * enough to give linear-time amortized behavior over a long * sequence of appends() in the presence of a poorly-performing * system realloc(). * The growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ... */ which doesn't look anything like the numbers above. I don't understand what this growth pattern refers to. What have I misunderstood? -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] python list, right! but concretely?
On Sun, 2 May 2010 18:57:41 +1000 Steven D'Aprano wrote: > On Sun, 2 May 2010 03:49:02 pm spir ☣ wrote: > > Hello, > > > > Is there anywhere some introduction material to the implementation of > > python lists (or to fully dynamic and flexible sequences, in > > general)? More precisely, I'd like to know what kind of base > > data-structure is used (linked list, dynamic array, something else -- > > and in case of array, how is resizing computed). Also, how is > > "generics" (element are of free type, and even heterogeneous) > > managed? > > I'm not sure if this is documented anywhere (other than the source code, > of course) but in CPython lists are implemented as arrays of pointers > to objects. > > Because re-sizing arrays is expensive, the array is over-allocated: the > array is (very approximately) created twice the size needed, to give > room to grow. Then when the array is full, or nearly full, it is > doubled in size again. This gives amortised O(1) appends, at the cost > of being 50% larger than needed on average. This means that append is > nearly always very, very fast, with an occasional, and rare, slow > append when the list resizes. > > Because the list is an array, iteration is very fast and lookups are > O(1). However, deleting from the start or middle of the list is O(N), > as are insertions. If speed is important, and you need fast insertions > and deletions at both the start and end of the list, then you should > use collections.deque; however item lookup for deques is O(N). > > Of course all of this is an implementation detail. Other implementations > are free to make other memory/time tradeoffs, although if their lists > performed very differently, people would complain. Thank you. The reason why I started with linked lists rather than flexible arrays is the problem of deletion & insertion other than at the end. Is the tail of the list after the given element really shifted left or right, or are tricks used to avoid that? (I mean in a typical implementation). I imagined for instance there could be kinds of null placeholders for deleted items (this indeed makes all the rest more complicated but avoids compressing the list after deletion). > > Denis > > > > PS: The reason is I'm trying to understand better the underlying > > layers of magic facilities we use everyday. Starting from about no > > computer science knowledge. I have a working implementation of linked > > lists in primitive langguages ;-) (namely C & Pascal), but rather > > complicated to my taste; and I'm probably overlooking commonly known > > and used tricks that could make the algorithm simpler or more > > efficient. > > The simplest implementation of a linked list in Pascal is something like > this: > > type: > ptr = ^node; > node = record > data: integer; > next: ptr > end; > > Here's a procedure to traverse the list, printing each item: > > {untested} > procedure walk(alist: ptr): > begin > while alist <> nil: > begin > writeln(alist^.data); > alist := alist^.next > end; > end; > > Everything else, I leave as an exercise :) Yes, you're right. But when I said "linked list", I actually meant a kind of type with all needed "methods" ;-) Denis vit esse estrany ☣ spir.wikidot.com ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] python list, right! but concretely?
On Sun, 02 May 2010 19:44:22 +1000 Lie Ryan wrote: > On 05/02/10 15:49, spir ☣ wrote: > > Hello, > > > > Is there anywhere some introduction material to the implementation of > > python lists > > (or to fully dynamic and flexible sequences, in general)? > > > > More precisely, I'd like to know what kind of base data-structure is used > > (linked list, dynamic array, something else -- and in case of array, > how is > > resizing computed). Also, how is "generics" (element are of free type, and > > even heterogeneous) managed? > > Python's 'list' is an array of pointers to `PyObject` ('object' in > Python) and the resizing algorithm keeps the list size such that > "allocated / 2 <= actual <= allocated". When list need to resize, it > overallocates the list slightly over 1.125 times than the requested size > "(newsize >> 3) + (newsize < 9 ? 3 : 6) + newsize". Thank you very much, that's exactly the kind of information I was looking for. One "mystery" is now solved, if I understand correctly: actually it's not really generics, rather the type is a supertype of all relevant ones. > If you're interested in more precise details (since I do omit some, > especially those that seems to be micro-optimization-related), you need > to read the source at > http://code.python.org/hg/trunk/file/e9d930f8b8ff/Objects/listobject.c Right, I'll have a look myself, now. Thank you for the pointer. > > Denis > > > > PS: The reason is I'm trying to understand better the underlying layers of > > magic facilities we use everyday. Starting from about no computer science > > knowledge. I have a working implementation of linked lists in primitive > > langguages ;-) (namely C & Pascal), but rather complicated to my taste; > > and I'm probably overlooking commonly known and used tricks that could > > make the algorithm simpler or more efficient. > > Real life implementation is always hairy, especially as the programmer > cut corners to drench some speed benefit and include many error > checkings. Today is the first time I actually looked at list's > implementation, now I know why people hated C, for every line of real > code, there's three or four lines of error checking code, e.g. to ensure > malloc successfully allocated enough memory or that index access is in > range. Yes, I know that for having played with C a very long time ago. The reason why I started with Pascal, which makes things slightly simpler (esp. freepascal in my case). But maybe even more verbose: about 450 loc for a link list type (with all required operations, sure). Denis vit esse estrany ☣ spir.wikidot.com ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] python list, right! but concretely?
On 05/02/10 15:49, spir ☣ wrote: > Hello, > > Is there anywhere some introduction material to the implementation of python > lists > (or to fully dynamic and flexible sequences, in general)? > More precisely, I'd like to know what kind of base data-structure is used > (linked list, dynamic array, something else -- and in case of array, how is > resizing computed). Also, how is "generics" (element are of free type, and > even heterogeneous) managed? Python's 'list' is an array of pointers to `PyObject` ('object' in Python) and the resizing algorithm keeps the list size such that "allocated / 2 <= actual <= allocated". When list need to resize, it overallocates the list slightly over 1.125 times than the requested size "(newsize >> 3) + (newsize < 9 ? 3 : 6) + newsize". If you're interested in more precise details (since I do omit some, especially those that seems to be micro-optimization-related), you need to read the source at http://code.python.org/hg/trunk/file/e9d930f8b8ff/Objects/listobject.c > Denis > > PS: The reason is I'm trying to understand better the underlying layers of > magic facilities we use everyday. Starting from about no computer science > knowledge. I have a working implementation of linked lists in primitive > langguages ;-) (namely C & Pascal), but rather complicated to my taste; > and I'm probably overlooking commonly known and used tricks that could > make the algorithm simpler or more efficient. Real life implementation is always hairy, especially as the programmer cut corners to drench some speed benefit and include many error checkings. Today is the first time I actually looked at list's implementation, now I know why people hated C, for every line of real code, there's three or four lines of error checking code, e.g. to ensure malloc successfully allocated enough memory or that index access is in range. ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] python list, right! but concretely?
On Sun, 2 May 2010 03:49:02 pm spir ☣ wrote: > Hello, > > Is there anywhere some introduction material to the implementation of > python lists (or to fully dynamic and flexible sequences, in > general)? More precisely, I'd like to know what kind of base > data-structure is used (linked list, dynamic array, something else -- > and in case of array, how is resizing computed). Also, how is > "generics" (element are of free type, and even heterogeneous) > managed? I'm not sure if this is documented anywhere (other than the source code, of course) but in CPython lists are implemented as arrays of pointers to objects. Because re-sizing arrays is expensive, the array is over-allocated: the array is (very approximately) created twice the size needed, to give room to grow. Then when the array is full, or nearly full, it is doubled in size again. This gives amortised O(1) appends, at the cost of being 50% larger than needed on average. This means that append is nearly always very, very fast, with an occasional, and rare, slow append when the list resizes. Because the list is an array, iteration is very fast and lookups are O(1). However, deleting from the start or middle of the list is O(N), as are insertions. If speed is important, and you need fast insertions and deletions at both the start and end of the list, then you should use collections.deque; however item lookup for deques is O(N). Of course all of this is an implementation detail. Other implementations are free to make other memory/time tradeoffs, although if their lists performed very differently, people would complain. > Denis > > PS: The reason is I'm trying to understand better the underlying > layers of magic facilities we use everyday. Starting from about no > computer science knowledge. I have a working implementation of linked > lists in primitive langguages ;-) (namely C & Pascal), but rather > complicated to my taste; and I'm probably overlooking commonly known > and used tricks that could make the algorithm simpler or more > efficient. The simplest implementation of a linked list in Pascal is something like this: type: ptr = ^node; node = record data: integer; next: ptr end; Here's a procedure to traverse the list, printing each item: {untested} procedure walk(alist: ptr): begin while alist <> nil: begin writeln(alist^.data); alist := alist^.next end; end; Everything else, I leave as an exercise :) -- Steven D'Aprano ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor
Re: [Tutor] python list, right! but concretely?
"spir ☣" wrote Is there anywhere some introduction material to the implementation of python lists (or to fully dynamic and flexible sequences, in general)? The definitive information is the source code which is freely available. More precisely, I'd like to know what kind of base data-structure is used (linked list, dynamic array, something else -- and in case of array, how is resizing computed). I'd look to the source for that. It may be a combination of traditional techniques. But I've never been curious enough to look :-) Also, how is "generics" (element are of free type, and even heterogeneous) managed? Everything is an object in Python. So it is a list of objects. no computer science knowledge. I have a working implementation of linked lists in primitive languages ;-) (namely C & Pascal), but rather complicated to my taste; and I'm probably overlooking commonly known and used tricks that could make the algorithm simpler or more efficient. Hmm, Linked lists in C and Pascal are usually pretty trivial. Unless you have done something unusual its hard to think how you could simplify them. You can complexify them for better speed or storage efficiency but the basic linked list in either language is about as simple as dynamic data gets. HTH, -- Alan Gauld Author of the Learn to Program web site http://www.alan-g.me.uk/ ___ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: http://mail.python.org/mailman/listinfo/tutor