from:"Frederic Rentsch"

Re: Tkinter.event.widget: handler gets name instead of widget.

2012-07-16 Thread Frederic Rentsch

On Sat, 2012-07-14 at 20:10 -0700, rantingrickjohn...@gmail.com wrote:
 On Thursday, July 12, 2012 1:53:54 PM UTC-5, Frederic Rentsch wrote:
 
  The hit list is a table of investment titles (stock, funds, bonds)
  that displays upon entry of a search pattern into a respective template.
  The table displays the matching records: name, symbol, ISIN, CUSIP, Sec.
  Any line can be click-selected. So they are to look like buttons.
 
 Hmm. If they appear like a button widget anyway, then why not just use a 
 button widget?
 

Why indeed? Why does one do simple things in a complicated way? Stations
on the sinuous path the explorer takes surveying unknown territory, I
guess. Your example below is just what I need. Thanks! 

  Representing the mentioned names and id codes in Label widgets was the
  simplest way I could come up with to align them in columns, admittedly
  without the benefit of much experience. But it does look good. the
  layout is fine.
 
 But is it really the simplest? :)
 
 ## START CODE ##
 import Tkinter as tk
 from Tkconstants import *
 
 colWidths = (5,10,30,5)
 N_COLS = len(colWidths)
 N_ROWS = 6
 
 root = tk.Tk()
 for r in range(N_ROWS):
     # Create some imaginary text to display in each column.
     # Also try using string methods center and rjust to
     # see alternative justification of text.
     # (c is the column index; it no longer shadows the row index r.)
     lst = [str(c).ljust(colWidths[c]) for c in range(N_COLS)]
     b=tk.Button(root, text=''.join(lst))
     b.pack(padx=5, pady=5)
 root.mainloop()
 ## END CODE ##
 
 You could easily expand that into something reusable.
 
 Now. If you need to place fancy borders around the texts, or use multiple 
 fonts, or use images, or blah blah blah... then you may want to use the 
 canvas items provided by the Tkinter.Canvas widget INSTEAD of buttons. 
 
 With the canvas, you can create a simple rectangle (canvas.create_rectangle) 
 that represents a button's outside dimension and give it a button styled 
 border. Then you can bind click events to mimic the button press action. Then 
 you can place canvas_text items on top of that fake button and configure them 
 to be invisible to click events. These text items will not interfere like the 
 Tkinter.Label widgets are currently doing. 
 
 However, i would suggest the Tkinter.Button solution is the easiest by far.
 
  I find the Tkinter system quite challenging. Doing a layout isn't so
  much a matter of dimensioning and placing things as a struggle to trick
  a number of automatic dimensioning and placing mechanisms into
  obliging--mechanisms that are rather numerous and hard to remember.
 
 I don't think i agree with that assessment. 
 

Sticky, justify, side, anchor, width, height, pad, ipad . . . a plethora
of similar formatting concepts with applicability, precedence and effect
rules that are certainly easy to work with once one knows them inside
out. Until such time it's much trial-and-error, and reading of course,
which also frequently involves guessing what is meant and cross-checking
by experiment.
   For instance, I had to find a way to control the size of frames. The
geometry managers deflate everything bottom to top and utterly ignore
width and height specifications unless the widget is empty. The solution
I found was spanners, frames slimmed down to zero whose length acts as
a foot in the door of their parent, as it were. I suspect there are
better ways.
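
One commonly used alternative to spacer frames is to switch off geometry propagation on the container, so that pack and grid stop shrinking it to fit its children. The sketch below assumes that is the effect wanted here; the helper name `fix_size` and the sizes are hypothetical, and `demo()` uses the Python 3 module name.

```python
def fix_size(frame, width, height):
    """Give a container frame a fixed size that its children cannot shrink.

    By default pack and grid resize a container to fit its contents
    ("geometry propagation"). With propagation off, the frame keeps the
    width and height it was configured with.
    """
    frame.config(width=width, height=height)
    frame.pack_propagate(False)   # for children managed with pack
    frame.grid_propagate(False)   # for children managed with grid
    return frame


def demo():
    # Needs a display. Python 3 spelling; in Python 2 use "import Tkinter as tk".
    import tkinter as tk
    root = tk.Tk()
    box = fix_size(tk.Frame(root, bg='red'), 300, 100)
    box.pack(padx=5, pady=5)
    tk.Label(box, text='small label').pack()
    root.mainloop()
```

Calling `demo()` should show a 300x100 red frame that stays that size even though it only holds one small label.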

 ## START TANGENTIAL MEANDERINGS ##
 I find the geometry management of Tkinter to be quite powerful whilst being 
 simultaneously simplistic. You only have three main types of management: 
 Grid, Place, and Pack. Each of which has a very specific usage. One 
 caveat to know is that you can NEVER mix Grid and Pack in the same 
 container widget! I find myself using grid and pack the most, with grid being 
 at the top of the list.
 
 Now, i will agree that grid can be confusing at first until you understand 
 how to rowconfigure and columnconfigure the containing widget (be it a 
 frame or a toplevel). There is also the sticky attribute to consider. 
 ## END TANGENTIAL MEANDERINGS ##
 

Thanks for the reminder.

 But all in all, i would say the most difficult part of the Tkinter geometry 
 management API is coming to grips as to which of the three geometry managers 
 is the best choice for the particular problem at hand -- and you will find 
 yourself using more than one manager in a single GUI app!
 
 But i don't see you solving this problem by stacking one widget on another. I 
 believe it's time to seek out a new solution.
 

I agree. Your idea of using pre-formatted text in buttons is definitely
the way to go.  
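
A minimal sketch of the pre-formatted-text idea, assuming fixed column widths and a monospaced button font (with a proportional font the columns will not line up). The helper name `format_row` and the sample record are hypothetical, not taken from the thread.

```python
def format_row(fields, widths):
    """Pad or truncate each field to its column width and join the cells.

    Truncation keeps an over-long value from pushing the later columns
    out of alignment.
    """
    cells = [str(field)[:width].ljust(width)
             for field, width in zip(fields, widths)]
    return ' '.join(cells)


# Hypothetical record: name, symbol, security type.
text = format_row(('Acme Corp', 'ACME', 'Stock'), (12, 6, 5))
```

The resulting string can be passed as the `text` option of a button, for example `tk.Button(root, text=text, font=('Courier', 10))`.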

 EASY: Using rows of Tkinter.Button coupled with a pre-formatted text string.
 ADVANCED: Creating pseudo buttons on a canvas and stacking text objects (or 
 whatever you like) on them.

I'll keep that in mind.

Finally I can report that I found the error I started this thread with.
(Attribute 'widget' of an event was type str)
I have a Frame Data as a container of all sorts

Re: Tkinter.event.widget: handler gets name instead of widget.

2012-07-13 Thread Frederic Rentsch

On Tue, 2012-07-10 at 15:11 -0700, Rick Johnson wrote:
 I've tried to condense your code using the very limited info you have
 provided. I have removed unnecessarily configuring of widgets and
 exaggerated the widget borders to make debugging easier. Read below
 for QA.
 
 ## START CONDENSED CODE ##
 import Tkinter as tk
 from Tkconstants import *
 
 records = range(4)
 
 CNF_SUBFRAME = {
     'bd':5, # rowFrame border width.
     'relief':RIDGE,
     }
 
 CNF_LABEL = {
     'anchor':W,
     'width':10,
     'bg':'gray',
     }
 
 class FooFrame(tk.Frame):
     def __init__(self, master, **kw):
         tk.Frame.__init__(self, master, **kw)
         self.build_records()
 
     def build_records(self):
         # Should this method be called by __init__???
         # Not sure if records is passed-in or global???
         for n in range(len(records)):
             record = records[n]
             rowFrame = tk.Frame(self, name='-%d-'%n, **CNF_SUBFRAME)
             rowFrame.bind ('<Enter>', self.evtEnter)
             rowFrame.bind ('<Leave>', self.evtLeave)
             rowFrame.bind ('<ButtonRelease-1>',
                            self.evtButtonOneRelease)
             rowFrame.bind ('<ButtonRelease-3>',
                            self.evtButtonThreeRelease)
             rowFrame.grid (row=n+2, column=1, padx=5, pady=5)
             for i in range(4):
                 lbtext = 'Label_'+str(i)
                 label = tk.Label(rowFrame, text=lbtext, **CNF_LABEL)
                 label.grid (row=0, column=i, sticky=NW)
 
     def evtEnter(self, event):
         w = event.widget
         print 'evtEnter', w.winfo_class()
         w.config(bg='magenta')
 
     def evtLeave(self, event):
         w = event.widget
         print 'evtLeave', w.winfo_class()
         w.config(bg='SystemButtonFace')
 
     def evtButtonOneRelease(self, event):
         w = event.widget
         print 'evtButtonOneRelease', w.winfo_class()
         w.config(bg='Green')
 
     def evtButtonThreeRelease(self, event):
         w = event.widget
         print 'evtButtonThreeRelease', w.winfo_class()
         w.config(bg='Blue')
 
 if __name__ == '__main__':
     root = tk.Tk()
     frame = FooFrame(root, width=100, height=100, bg='red', bd=1)
     frame.pack(padx=5, pady=5)
     root.mainloop()
 ## END CONDENSED CODE ##
 
 
 In the code sample provided, you will see that the label widgets
 stacked on each row will block click events on the containing
 rowFrames below them. You can get a click event (on the sub frames)
 to work by clicking the exaggerated border on the frames. All the
 events work properly for me, although this GUI interface seems
 unintuitive even with proper borders and colors.
 
 Fredric, I can't help but feel that you are not attacking the problem
 correctly. Please explain the following questions in detail so that i
 may be able to provide help:
 
It works for me too.

I spent another day running the offending class in a simplified
environment and it worked flawlessly. In what way the environment makes
the difference is anything but obvious. But it has got to be the
environment.

 Q1. You have subclassed a Tkinter.Frame and you are building rows of
 sub-frames into this toplevel frame; with each row holding
 horizontally stacked label widgets. Okay, I can see a need to wrap up
 a RowFrame object, but i don't see a need to create a
 RowFrameFactory. Can you explain this design decision?
 
I sent this same response yesterday with a screen shot attached. The
message didn't pass. It must have been rejected by a spam filter. So I
try again without the screen shot. (Too bad. A picture is worth a
thousand words).
   The hit list is a table of investment titles (stock, funds, bonds)
that displays upon entry of a search pattern into a respective template.
The table displays the matching records: name, symbol, ISIN, CUSIP, Sec.
Any line can be click-selected. So they are to look like buttons.
Representing the mentioned names and id codes in Label widgets was the
simplest way I could come up with to align them in columns, admittedly
without the benefit of much experience. But it does look good. The
layout is fine.
   I find the Tkinter system quite challenging. Doing a layout isn't so
much a matter of dimensioning and placing things as a struggle to trick
a number of automatic dimensioning and placing mechanisms into
obliging--mechanisms that are rather numerous and hard to remember.

 Q2. It seems odd to me that you want to engage the rowFrame widgets
 via events but NOT the Label widgets. Can you explain this design
 decision?
 
Again, the labels serve to align the fields into columns. As to the
bindings, I just now found out that Enter and Leave can be bound to
the line frame, but the mouse buttons don't act on the frame with the
labels covering it wall to wall. Enter will lighten up the background of
the line. Leave restores the normal color. ButtonRelease-N will select
the line, darkening the text. The coloring has to be done separately on
each label across the line, as the labels cover the frame. That isn't a
problem.
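
Since the labels receive the clicks that the covered frame never sees, one way around it is to bind the same handler on the frame and on each of its labels. The sketch below assumes the callback only needs to know which row was hit; `bind_row` and its parameters are hypothetical names.

```python
def bind_row(row_frame, labels, sequence, callback):
    """Bind `sequence` on the row frame and on every label inside it.

    A mouse event is delivered to the widget under the pointer, so labels
    that cover the frame receive the clicks. Each binding therefore calls
    `callback` with the row frame, whichever widget was actually hit.
    """
    def handler(event, row=row_frame):
        callback(row)

    for widget in [row_frame] + list(labels):
        widget.bind(sequence, handler)
```

With real widgets this would be called as, say, `bind_row(line_frame, fields, '<ButtonRelease-1>', select_line)`, where `fields` is the list of Label widgets in that line and `select_line` is a function taking the frame.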

I'm sorry

Re: Tkinter.event.widget: handler gets name instead of widget.

2012-07-13 Thread Frederic Rentsch

On Fri, 2012-07-13 at 09:26 +0200, Peter Otten wrote:
 Frederic Rentsch wrote:
 
  I'm sorry I can't post an intelligible piece that does NOT work. I
  obviously can't post the whole thing. 
 
 How about a pastebin then? Or even bitbucket/github as you need to track 
 changes anyway?
 
  It is way too convoluted.
 
 Convoluted code is much easier to debug than no code ;)
 
 Another random idea: run your code on a more recent python/tcl installation. 
 If you are lucky you get a different error.
 

So many good ideas! I can hardly keep up. Let me try anyway.

I hesitate to ask dumb questions, but I guess I have to. What is
python/tcl? I enlisted Google, Synaptic, apt-cache, apt-get, dpkg and
scouring the profusion I couldn't detect any actionable piece of
information, undoubtedly due to my modest expertise in matters of system
administration.
   I next spent a day with an attempt to upgrade to Python 2.7.3,
figuring that that might simultaneously take care of upgrading tcl.
Accustomed to installing packages I had to venture into the unknown
territory of compiling source, because no package was available.
(Windows, Apple, yes. Linux, no). The compile went smoothly, but ended
like this:

... build finished, but the necessary bits to build these modules were
not found:

_bsddb
_curses
_curses_panel
_sqlite3
_ssl
_tkinter
bsddb185
bz2
dbm
gdbm
readline
sunaudiodev

To find the necessary bits, look in setup.py in detect_modules() for the
module's name.

I didn't know what to look for in setup.py, spent a couple of hours
turning stones and encountered evidence of a bunch of missing header
files, probably of other modules which I had installed rather than
compiled.
   2.7.3 came up in terminals, but not in an IDLE window. No wonder,
_tkinter was reported not found; and so many others with it that,
anxious to get on, I stopped venturing further into this labyrinth,
erased everything 2.7.3 and now I'm back to 2.6 and would greatly
appreciate advice on upgrading python/tcl.

I shall look at pastebin and bitbucket/github right away.

Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Tkinter.event.widget: handler gets name instead of widget.

2012-07-11 Thread Frederic Rentsch

On Tue, 2012-07-10 at 18:06 -0700, Rick Johnson wrote:
 Also:
 
 Q3: Why are you explicitly setting the name of your subFrame widgets
 instead of allowing Tkinter to assign a unique name?...AND are you
 aware of the conflicts that can arise from such changes[1]?
 

I find custom-named widgets easier to work with during development. I can tell
what this is; .main-frame.data-frame.title-hit-list.-10-. If I didn't assign
names it would look something like this: .2837029.283725.283762.2848308.
Once my program works I can drop custom-naming.
   I understand that conflicts may arise if one assigns numeric names.
To find out whether the digits in the label names ('-10-' above) might
be implicated, I changed to spelled-out names ('ten'). The change had no
effect. The reference you list below (x147-more-on-widget-names.htm)
indeed says don't use names which only contain digits. 

 Q4: Are you aware of the built-in function enumerate[2]? I see you
 are passing around indexes to iterables AND simultaneously needing the
 obj reference itself. I prefer to keep indexing to a minimum.  If
 there is no bleeding edge performance issue to worry about (and there
 almost *always* never is) why not use enumerate?
 
Aware, yes. In the habit of, no. Thanks for the reminder.
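
For comparison, a small sketch of the two loop styles side by side. The record values are placeholders, and the `n + 2` offset only mirrors the grid row used in the posted code.

```python
records = ['alpha', 'beta', 'gamma']   # placeholder data

# Index-based loop, as in the original post.
rows_by_index = []
for n in range(len(records)):
    record = records[n]
    rows_by_index.append((n + 2, record))

# Same result with enumerate: index and item arrive together.
rows_by_enumerate = [(n + 2, record) for n, record in enumerate(records)]
```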

 [1] 
 http://www.pythonware.com/library/tkinter/introduction/x147-more-on-widget-names.htm
 [2] http://docs.python.org/release/3.0.1/library/functions.html#enumerate

Frederic



Re: Tkinter.event.widget: handler gets name instead of widget.

2012-07-10 Thread Frederic Rentsch

On Mon, 2012-07-09 at 10:49 -0700, Rick Johnson wrote:
 On Jul 9, 12:58 am, Terry Reedy tjre...@udel.edu wrote:
  When posting problem code, you should post a minimal, self-contained
  example that people can try on other systems and versions. Can you
  create the problem with one record, which you could give, and one
  binding? Do you need 4 fields, or would 1 'work'?
 
 I'll firmly back that sentiment. Fredric, if you cannot get the
 following simple code events to work properly, then how do you think
 you can get events working properly on something more complex?
 
 ## START CODE ARTISTRY ##
 import Tkinter as tk
 from Tkconstants import *
 
 class MyFrame(tk.Frame):
     def __init__(self, master, **kw):
         tk.Frame.__init__(self, master, **kw)
         self.bind('<Enter>', self.evtMouseEnter)
         self.bind('<Leave>', self.evtMouseLeave)
         self.bind('<Button-1>', self.evtButtonOneClick)
 
     def evtMouseEnter(self, event):
         event.widget.config(bg='magenta')
 
     def evtMouseLeave(self, event):
         event.widget.config(bg='SystemButtonFace')
 
     def evtButtonOneClick(self, event):
         event.widget.config(bg='green')
 
 if __name__ == '__main__':
     root = tk.Tk()
     for x in range(10):
         f = MyFrame(root, height=20, bd=1, relief=SOLID)
         f.pack(fill=X, expand=YES, padx=5, pady=5)
     root.mainloop()
 ## END CODE ARTISTRY ##
 

This works perfectly well!

What makes the case difficult is an exceptional misbehavior for no
apparent reason. If I manage to strip the critical section of
environmental details in the interest of concision and legibility and
still reproduce the error I shall post it. So far I have failed: the
stripped model I distilled yesterday worked fine.

 ---
 More points to ponder:
 ---
 1. Just because the Tkinter designers decided to use idiotic names for
 event sequences does not mean you are required to blindly follow their
 bad example (the whole: if johnny jumps off a cliff..., thing comes
 to mind)
 
 2. I would strongly recommend you invest more thought into your event
 handler identifiers. ALL event handlers should be marked as *event
 handlers* using a prefix. I like to use the prefix evt. Some people
 prefer other prefixes. In any case, just remember to be consistent.
 Also, event handler names should reflect WHAT event they are
 processing, not some esoteric functionality of the application like
 pick_record or info_profile. However if you like, simply have the
 event handler CALL an outside func/meth. This type of consistency is
 what separates the men from the boys.
 
 3. The Python Style Guide[1] frowns on superfluous white space (be it
 horizontal OR vertical!) I would strongly recommend you read and adapt
 as much of this style as you possibly can bear. Even if we don't all
 get along, it IS *very* important that we structure our code in a
 similar style.
 
 [1] http://www.python.org/dev/peps/pep-0008/

Excellent suggestions.


Frederic



Re: Tkinter.event.widget: handler gets name instead of widget.

2012-07-09 Thread Frederic Rentsch

On Mon, 2012-07-09 at 01:58 -0400, Terry Reedy wrote:
 On 7/8/2012 5:19 PM, Frederic Rentsch wrote:
  Hi widget wizards,
 
  The manual describes the event attribute "widget" as "The widget
  which generated this event. This is a valid Tkinter widget instance, not
  a name. This attribute is set for all events."
 
 Same in 3.3, with nice example of using it.
 
 def turnRed(self, event):
     event.widget["activeforeground"] = "red"
 
 self.button.bind("<Enter>", self.turnRed)
 
  And so it is--has been until on the latest occasion event.widget was
  not the widget, but its name, crashing the handler.
 
 Has event.widget been the widget only in other programs or previously
 with the same program?

I bind Enter to Frames, each Frame calling the same handler that is
supposed to change the background color. It is the Enter action that
generates the event. No later the handler receives the event whose
attribute widget is the widget's name (full path). My code doesn't
create events anywhere. I suppose events vanish when the last handler
terminates.

. . .

 When posting problem code, you should post a minimal, self-contained 
 example that people can try on other systems and versions. 

Attempting to strip the critical code, throwing out everything
incidental to the problem so I could post something intelligible, I
failed to fail: the bare essentials work. The problem appears to be in
the incidental.

 Can you 
 create the problem with one record, which you could give, and one 
 binding? Do you need 4 fields, or would 1 'work'?
 
It fails even with the Frame containing no Labels at all, like this:

for n, record in enumerate(records):
    line_frame = Frame (self, name = _verbalize_number (n), width = 600,
                        height = 20, relief = RAISED, **BUTTON_FRAME_)
    line_frame.bind ('<Enter>', self.enter)
    ## No Labels at all:
    ##  for i in self.range_n_fields:
    ##      field = Label (line_frame, text = record [self.INDICES [i]],
    ##                     anchor = W, width = self.COLUMN_WIDTHS [i], **DB_LIST_LABEL_)
    ##      field.grid (row = 0, column = i, sticky = NW)

def enter (self, event):
    w = event.widget
    print 'hit list.enter (). Event, widget', event, w, w.__class__
    w.config (bg = ENTERED_BG_COLOR)

hit list.leave (). Event, widget <Tkinter.Event instance at 0xa52c60c>
.main-frame.data-frame.title-hit-list.one-zero <type 'str'>
Exception in Tkinter callback
Traceback (most recent call last):
  File "/usr/lib/python2.6/lib-tk/Tkinter.py", line 1413, in __call__
    return self.func(*args)
  File "/home/fr/python/finance/piam/hit_list.py", line 114, in enter
    w.config (bg = ENTERED_BG_COLOR)
AttributeError: 'str' object has no attribute 'config'

_verbalize_number spells out the line numbers, because the manual says
something about digits being reserved for the auto-generated widget
names. I thought that assigned names containing digits might be a
problem, but it wasn't.
   The dictionary arguments, by the way, are just style elements:
colors, fonts, reliefs, etc. nothing functionally essential.
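
One mechanism that can produce this symptom: Tkinter turns the path name reported by Tcl back into a Python object by walking each widget's `children` dictionary, and as far as I know it leaves the raw string in `event.widget` when that lookup raises a KeyError. Whether this is what happens in the program above is an assumption; nothing posted so far confirms it. The sketch is a simplified model of that lookup, not Tkinter's actual code, and `Node` and `resolve` are hypothetical names.

```python
class Node(object):
    """Stand-in for a widget: a name plus a dict of child widgets."""
    def __init__(self, name, parent=None):
        self.name = name
        self.children = {}
        if parent is not None:
            parent.children[name] = self


def resolve(root, path):
    """Walk a Tk-style path such as '.a.b' through the children dicts.

    Returns the matching Node, or the path string itself if any step is
    missing, which models event.widget ending up as a str.
    """
    node = root
    try:
        for part in path.strip('.').split('.'):
            node = node.children[part]
    except KeyError:
        return path
    return node


root = Node('')
main = Node('main-frame', root)
row = Node('one-zero', main)

found = resolve(root, '.main-frame.one-zero')      # the Node itself
main.children = {}                                 # registry clobbered
fallback = resolve(root, '.main-frame.one-zero')   # now just the string
```

If this model applies, a place to look would be anything in the surrounding program that replaces or shadows the `children` attribute of a widget along the path, or that reuses a widget name under the same parent.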

 
  # Dell E6500, Ubuntu 10.04, Python 2.6
 
 You might try a current Python release, and the latest tcl/tk 8.5.11 
 released last March (comes with 3.3.0 Windows release, don't know how on 
 Ubuntu). There might be (have been?) a bug with events on Frames, or on 
 Frames within Frames treated as widgets.
 
 -- 
 Terry Jan Reedy
 

Terry,

I interspersed a couple of answers above. As to your last suggestion I
got Python 2.7.3 and managed to compile it. I would have preferred
something ready to install, but that doesn't seem to be available for
Linux. The compile went smoothly. But it'll take me another day to
reorganize, beginning with the Applications menu which still shows IDLE
(Python 2.6), while terminals already call the new version 2.7.3, but
it doesn't know where MySQLdb is, and possibly where other things are.
So for now I can't report on this effort either.

But I certainly appreciate your help. Many thanks.


Frederic



Re: Tkinter.event.widget: handler gets name instead of widget.

2012-07-09 Thread Frederic Rentsch

Rick,
Thanks for your remarks. I spent most of the day working with Terry's
input. And now I am falling asleep. So I shall study your inspirations
tomorrow.

Frederic



Tkinter.event.widget: handler gets name instead of widget.

2012-07-08 Thread Frederic Rentsch

Hi widget wizards,

The manual describes the event attribute "widget" as "The widget
which generated this event. This is a valid Tkinter widget instance, not
a name. This attribute is set for all events."
And so it is--has been until on the latest occasion event.widget was
not the widget, but its name, crashing the handler. 
# Here I build a list of selectable records having each four fields.
# The fields are labels. The selectable line is a frame containing the 
# labels side by side. The line frames go into self, which is a Frame. 

for n in range (len (records)):
    record = records [n]
    line_frame = Frame (self, name = '-%d-' % n, relief = RAISED,
                        **BUTTON_FRAME_)
    line_frame.bind ('<Enter>', self.enter)
    line_frame.bind ('<Leave>', self.leave)
    line_frame.bind ('<ButtonRelease-1>', self.pick_record)
    line_frame.bind ('<ButtonRelease-3>', self.info_profile)
    line_frame.grid (row = n+2, column = 1)
    for i in self.range_n_fields:   # (0, 1, 2, 3)
        field = Label (line_frame, text = record [self.INDICES [i]],
                       anchor = W, width = self.COLUMN_WIDTHS [i], **DB_LIST_LABEL_)
        field.grid (row = 0, column = i, sticky = NW)

# Here is the Enter handler:

def enter (self, event):
    w = event.widget
    print 'hit list.enter (). Event, widget', event, w, w.__class__ # Tracing line
    w.config (bg = SELECTED_BG_COLOR)

# And here is what comes out. The first line is my tracing line. The name is
# correct in that it names the entered line_frame, but is wrong because it
# should be the line_frame, not its name.
# The rest is the red exception message:

hit list.enter (). Event, widget <Tkinter.Event instance at 0x9115dcc>
.main-frame.data-frame.title-hit-list.-10- <type 'str'>
Exception in Tkinter callback
Traceback (most recent call last):
  File "/usr/lib/python2.6/lib-tk/Tkinter.py", line 1413, in __call__
    return self.func(*args)
  File "/home/fr/python/finance/piam/hit_list.py", line 83, in enter
    w.config (bg = SELECTABLE_BG_COLOR)
AttributeError: 'str' object has no attribute 'config'

# The same thing happens with Leave. The other handlers I haven't done yet.
# The same bindings work well in a Menu class with the difference that the
# bindings are on the Labels, not a containing Frame.

# Dell E6500, Ubuntu 10.04, Python 2.6


Frederic



Re: Tkinter binding question

2012-06-20 Thread Frederic Rentsch

On Tue, 2012-06-19 at 19:19 -0700, rantingrickjohn...@gmail.com wrote:
 On Tuesday, June 19, 2012 10:55:48 AM UTC-5, Frederic Rentsch wrote:
  If I copy your event descriptors into my program, the button-release
  callback still fails. It works in your code, not in mine. Here is what
  my code now looks like. It is somewhat more complicated than yours,
  because I bind Frames holding each a line (line_frame) and each frame
  contains a few Labels side by side. The idea is to achieve a table with
  vertically aligning columns each line of which I can click-select. (Is
  there a better way?)
  
  for line_frame in ...:
     line_frame.bind ('<Enter>', self.color_selected)
     line_frame.bind ('<Leave>', self.color_selectable)
     line_frame.bind ('<ButtonRelease-1>', self.pick_record)
     line_frame.bind ('<ButtonRelease-3>', self.info_profile)
     line_frame.grid (row = n+1, column = 0)
     for i in self.range_n_fields:
        field = Label (line_frame, width = ..., text = ...
        field.grid (row = 0, column = i, sticky = W)
     ...
  
  def color_selected (self, event):
     print 'hit list.color_selected ()'
  
  def color_selectable (self, event):
     print 'hit list.color_selectable ()'
  
  def pick_record (self, event): # Never gets called
     print 'hit list.pick_record ()'
  
  def info_profile (self, event): # Never gets called
     print 'hit list.info_profile ()'
 
 Events only fire for the widget that currently has focus. Frames, labels, 
 and other widgets do not receive focus simply by hovering over them. You can 
 set the focus manually by calling w.focus_set() -- where w is any Tkinter 
 widget. I can't be sure because i don't have enough of your code to analyze, 
 but I think you should bind (either globally or by class type) all Enter 
 events to a callback that sets the focus of the current widget under the 
 mouse. Experiment with this code and see if it is what you need:
 
 ## START CODE ##
 from __future__ import print_function
 import Tkinter as tk
 
 def cb(event):
     print(event.widget.winfo_class())
     event.widget.focus_set()
 
 root = tk.Tk()
 root.geometry('200x200+20+20')
 for x in range(10):
     w = tk.Frame(root, width=20, height=20,bg='red')
     w.grid(row=x, column=0, padx=5, pady=5)
     w = tk.Frame(root, width=20, height=20,bg='green', highlightthickness=1)
     w.grid(row=x, column=1, padx=5, pady=5)
     w = tk.Button(root, text=str(x))
     w.grid(row=x, column=2, padx=5, pady=5)
 root.bind_all("<Enter>", cb)
 root.mainloop()
 ## END CODE ##
 
 You will see that the first column of frames are receiving focus but you have 
 no visual cues of that focus (due to a default setting). In the second column 
 you get the visual cue since i set highlightthickness=1. The third column is 
 a button widget which by default has visual focus cues.
 
 Is this the problem? 

Yes, I was unaware of focus control. I understand that it is set either
by a left mouse button click or the method focus_set ().  

 PS: Also check out the w.bind_class() method.
 
  Incidentally, my source of inspiration for chaining event descriptors
  was the New Mexico Tech Tkinter 8.4 reference, 
 
 That's an excellent reference BTW. Keep it under your pillow. Effbot also has 
 a great tutorial.


Thanks for this additional load of advice.

Googling I chanced on an excellent introduction, "Thinking in Tkinter" by
Stephen Ferg
(http://www.ferg.org/thinking_in_tkinter/all_programs.html). He sets out
identifying a common problem with tutorials: "The problem is that the
authors of the books want to rush into telling me about all of the
widgets in the Tkinter toolbox, but never really pause to explain basic
concepts. They don't explain how to 'think in Tkinter'."
He then explains seventeen functionalities, one at a time, and
illustrates them with a little piece of code ready to run. Working
through the examples is a good way to acquire a basic understanding
without falling victim to brain clutter. I shall go through the
examples very attentively and go on from there.

Thanks again

Frederic



Re: Tkinter binding question

2012-06-19 Thread Frederic Rentsch

Rick, 

Thank you for your thorough discussion. I tried your little program.
Enter and leave work as expected. Pushing a mouse button calls
leave-enter, exactly as it happened with my code. So that seems to be a
default behavior. No big deal. Without the tracing messages it would go
unnoticed. Releasing either of the bound mouse buttons displays the
corresponding messages. So all works fine. 
   If I copy your event descriptors into my program, the button-release
callback still fails. It works in your code, not in mine. Here is what
my code now looks like. It is somewhat more complicated than yours,
because I bind Frames holding each a line (line_frame) and each frame
contains a few Labels side by side. The idea is to achieve a table with
vertically aligning columns each line of which I can click-select. (Is
there a better way?)

   for line_frame in ...:
      line_frame.bind ('<Enter>', self.color_selected)
      line_frame.bind ('<Leave>', self.color_selectable)
      line_frame.bind ('<ButtonRelease-1>', self.pick_record)
      line_frame.bind ('<ButtonRelease-3>', self.info_profile)
      line_frame.grid (row = n+1, column = 0)
      for i in self.range_n_fields:
         field = Label (line_frame, width = ..., text = ...
         field.grid (row = 0, column = i, sticky = W)
      ...

   def color_selected (self, event):
      print 'hit list.color_selected ()'

   def color_selectable (self, event):
      print 'hit list.color_selectable ()'

   def pick_record (self, event): # Never gets called
      print 'hit list.pick_record ()'

   def info_profile (self, event): # Never gets called
      print 'hit list.info_profile ()'

I admit that I don't have an accurate conception of the inner workings.
It's something the traveler on the learning curve has to acquire a feel
for by trial and error. In this case the differing behavior should
logically have to do with the structural difference: I bind Frames that
contain Labels. If I click this nested assembly, who gets the event? The
contained widget, the containing widget or both?

Frederic


Incidentally, my source of inspiration for chaining event descriptors
was the New Mexico Tech Tkinter 8.4 reference, which says: ... In
general, an event sequence is a string containing one or more event
patterns. Each event pattern describes one thing that can happen. If
there is more than one event pattern in a sequence, the handler will be
called only when all the patterns happen in that same sequence ...

Again, predicting the precedence with overlaps is much like solving a
murder case: finding suspects and letting the innocent ones off the hook.
The only reference I have found on that topic is in effbot.org's
tkinterbook, which says that precedence goes to the closest match, but
doesn't explain how one evaluates closeness.



Tkinter binding question

2012-06-18 Thread Frederic Rentsch

Hi All,

  For most of an afternoon I've had that stuck-in-a-dead-end feeling 
probing to no avail all permutations formulating bindings, trying to 
make sense of manuals and tutorials. Here are my bindings:

   label_frame.bind ('<Enter>', self.color_selected)
   label_frame.bind ('<Leave>', self.color_selectable)
   label_frame.bind ('<Button-1><ButtonRelease-1>', self.pick_record)
   label_frame.bind ('<Button-3><ButtonRelease-3>', self.info_profile)

Enter and Leave work fine. But when I try to select an entered item, 
the moment I push the left or the right button, color_selectable ()
and color_selected () are called again in rapid succession. The same 
effect happens even when I push the middle mouse button which is 
rather weird, because it has no binding. The behavior suggests that 
no event can occur on an entered widget before it is left again and 
if an event other than Leave comes in, the Leave callback gets 
called automatically. That can't be right, though. It isn't possible 
to click an item without entering it. 
   I put traces in all of the four callbacks. So I know what gets
called and what doesn't. Traces are:

   On Enter:
      hit list.color_selected ()
   On Leave:
      hit list.color_selectable ()

Fine so far. 

   Enter:
      hit list.color_selected ()    # Still fine
   Button-1 or Button-2 or Button-3:
      hit list.color_selectable ()  # Not so fine!
      hit list.color_selected ()
   ButtonRelease-1 (or -2 or -3)
      (nothing)


Thanks for any suggestion

Frederic


OS: Ubuntu 10.04 LTS
Python: sys.version: 2.6.5 (r265:79063, Apr 16 2010, 13:09:56) [GCC 4.4.3]
Tkinter: VERSION 73770 (help (Tkinter), last line)




tkinter: is there a way to switch a widget's master?

2012-05-14 Thread Frederic Rentsch

Hi there,

I would like to prepare a bunch of info text widgets to be displayed in
Toplevel windows at the user's command (whenever he needs directions).
I know how to remove and restore widgets without destroying them in
between. The problem with a Toplevel is that it is a master that comes
and goes, of a Text that is supposed to stay as long as the program
runs. (Building the Text involves reading a file and compiling lots of
edited-in formatting symbols. Repeating this every time the Toplevel is
called seems rather inelegant.) 

toplevel = Toplevel (root, ...)
info_text = MyText (toplevel, ...)
info_text.pack ()
# No problem, except for the inelegant remake of the text on every call.

I tried:

info_text = MyText (root, ...)
# Later, on user demand
toplevel = Toplevel (root, ...)
info_text.lower (belowThis = toplevel)
info_text.pack ()

This doesn't work! toplevel is empty and text appears in the root
window. Is there a way to switch a widget's master?

Thanks for comments

Frederic





Re: Tkinter: IDLE can't get out of mainloop

2012-04-02 Thread Frederic Rentsch

On Sat, 2012-03-31 at 06:29 -0400, Terry Reedy wrote:
 On 3/31/2012 3:42 AM, Frederic Rentsch wrote:
  Hi all,
 
  Is it a bad idea to develop Tkinter applications in IDLE? I understand
  that IDLE is itself a Tkinter application, supposedly in a mainloop and
  mainloops apparently don't nest.
 
 In standard configuration, one process runs IDLE, another runs user 
 code, including tkinter code. So there should be no interference. The 
 example in the tkinter doc runs from IDLE edit window on my system. The 
 revised example in coming releases works even better. There have been 
 several IDLE bugs fixed in the last few months, and even more since 2.6 
 before that. Upgrade if you can to get fixes, suffer the bugs since 
 fixed, or patch your 2.6 installation.
 
 -- 
 Terry Jan Reedy
 

Terry,
It helps to know that an upgrade might improve things. Thank you for
the suggestion. And thanks also to Chris for his remark on .pyc files.

Frederic



Tkinter: IDLE can't get out of mainloop

2012-03-31 Thread Frederic Rentsch

Hi all,

Is it a bad idea to develop Tkinter applications in IDLE? I understand
that IDLE is itself a Tkinter application, supposedly in a mainloop and
mainloops apparently don't nest.

I tried to install a root-destroy-protocol:

def destroy_root ():
    print 'Destroying root'
    root.destroy ()

root.protocol ("WM_DELETE_WINDOW", destroy_root)

I see the tracing message 'Destroying root', but stay stuck unable to
get the IDLE prompt back. Ctrl-C doesn't work. The only way out I know is
killing IDLE. When I do, a warning says that a program is still running.
That must be IDLE's own WM_DELETE_WINDOW protocol. Is there a way to get
the prompt back without killing IDLE? Is there a way to nest a
mainloop? 
   Up to now I have been able to get by without a mainloop. I suppose
this is because I have only been doing layouts. Starting now to do
events I observe what in the absence of a mainloop looks like
synchronization problems with bindings responding to other events than
their own.
   If I run from a terminal things seem to work out. Is it standard
development practice to run code from a terminal ($ python program.py)?
What's the 'program.pyc' for if the source is compiled every time?
   I use Python 2.6 on Ubuntu 10.04 LTS.
 
Thankful for any suggestion

Frederic



How to read image data into Tkinter Canvas?

2012-03-03 Thread Frederic Rentsch

Hi,
Familiarizing myself with Tkinter I'm stuck trying to fill a Canvas
with an image. I believe the class I need is PhotoImage rather than
BitmapImage. But I have no luck with either. The PhotoImage doc lists
available handlers for writing GIF and PPM files. It doesn't say
anything about reading. I would assume that any widely used image format
would activate the appropriate handler automatically. To work with
formats other than GIF and PPM the doc recommends to resort to the Image
module (PIL). This approach also failed.
I did manage to read a GIF file into PhotoImage, which seems to let my
code off the hook. And I have made sure beyond any doubt that my image
files exist and open with PIL.
I must be doing something wrong and will much appreciate any help. 

Frederic



-


Here's what happens:


First trial with PhotoImage:

. . .
canvas = Canvas (root)
picture = PhotoImage (file = "/home/fr/temp/wandbild.bmp")
image = canvas.create_image (10, 10, anchor = NW, image = picture)
. . .

Traceback (most recent call last):
  File "tk2.py", line 23, in <module>
    picture = PhotoImage (file = "/home/fr/temp/wandbild.bmp")
  File "/usr/lib/python2.6/lib-tk/Tkinter.py", line 3288, in __init__
    Image.__init__(self, 'photo', name, cnf, master, **kw)
  File "/usr/lib/python2.6/lib-tk/Tkinter.py", line 3244, in __init__
    self.tk.call(('image', 'create', imgtype, name,) + options)
_tkinter.TclError: couldn't recognize data in image file
"/home/fr/temp/wandbild.bmp"


-


Second trial with BitmapImage:

. . .
picture = BitmapImage (file = "/home/fr/temp/wandbild.bmp")
. . .

Traceback (most recent call last):
  File "tk2.py", line 22, in <module>
    picture = BitmapImage (file = "/home/fr/temp/wandbild.bmp")
  File "/usr/lib/python2.6/lib-tk/Tkinter.py", line 3347, in __init__
    Image.__init__(self, 'bitmap', name, cnf, master, **kw)
  File "/usr/lib/python2.6/lib-tk/Tkinter.py", line 3244, in __init__
    self.tk.call(('image', 'create', imgtype, name,) + options)
_tkinter.TclError: format error in bitmap data


-


Third trial with Image.open:

. . .
picture = Image.open ("/home/fr/temp/wandbild.bmp")
print picture
image = canvas.create_image (10, 10, anchor = NW, image = picture)
. . .

<JpegImagePlugin.JpegImageFile image mode=RGB size=963x616 at 0x9A4D84C>
Traceback (most recent call last):
  File "tk2.py", line 24, in <module>
    image = canvas.create_image (10, 10, anchor = NW, image = picture)
  File "/usr/lib/python2.6/lib-tk/Tkinter.py", line 2159, in create_image
    return self._create('image', args, kw)
  File "/usr/lib/python2.6/lib-tk/Tkinter.py", line 2150, in _create
    *(args + self._options(cnf, kw))))
_tkinter.TclError: image "<BmpImagePlugin.BmpImageFile image mode=RGB
size=3933x2355 at 0x9A0D84C>" doesn't exist


-


Same thing happens with JPG files. As I said, GIF works into PhotoImage,
but that's the only format I managed. 



try - except. How to identify errors unknown in advance?

2011-11-16 Thread Frederic Rentsch

Hi all,


I'd like to log MySQL errors. If I do:

try: (command)
except MySQLdb.OperationalError, e: print e

I may get something like:

(1136, "Column count doesn't match value count at row 1")

If I don't know in advance which error to expect, but on the contrary
want to find out which error occurred, I can catch any error by omitting
the name:

except: (handle)

But now I don't have access to the error message 'e'. I'm sure there's a
way and it's probably ridiculously simple.

Frederic




Re: try - except. How to identify errors unknown in advance?

2011-11-16 Thread Frederic Rentsch

On Wed, 2011-11-16 at 09:09 -0800, Chris Kaynor wrote:
 On Wed, Nov 16, 2011 at 8:57 AM, Frederic Rentsch
 anthra.nor...@bluewin.ch wrote:
  Hi all,
 
 
  I'd like to log MySQL errors. If I do:
 
 try: (command)
 except MySQLdb.OperationalError, e: print e
 
  I may get something like:
 
 (1136, "Column count doesn't match value count at row 1")
 
  If I don't know in advance which error to expect, but on the contrary
  want to find out which error occurred, I can catch any error by omitting
  the name:
 
 except: (handle)
 
  But now I don't have access to the error message 'e'. I'm sure there's a
  way and it's probably ridiculously simple.
 
 "except Exception, e:" (or, in Py3, "except Exception as e" is preferred).
 
 Note that you should generally avoid bare except statements ("except:")
 as that will catch everything, including KeyboardInterrupt and
 SystemExit which may not be desirable.
 
 Even without saving the exception in the except statement, you can get
 the type, value, and traceback with the sys.exc_info command. See
 http://docs.python.org/library/sys.html#sys.exc_info
 
 For example:
 
 py> import sys
 py> try:
 py>     raise RuntimeError
 py> except:
 py>     print sys.exc_info()
 py>
 (<type 'exceptions.RuntimeError'>, RuntimeError(), <traceback object
 at 0x02371588>)

Chris, Thanks very much! Great help!
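
For the archive, here is a minimal sketch of the pattern Chris describes: catch Exception by name, keep the object, and log its type and message. The helper describe_failure and the stand-in bad_insert are invented for illustration. A real MySQLdb call would take the place of bad_insert, and the error raised here is a plain ValueError, not a MySQLdb exception.

```python
import sys

def describe_failure(action):
    """Run action(); return None on success or a (type name, message) pair."""
    try:
        action()
    except Exception as e:      # not a bare "except:", so Ctrl-C still gets through
        exc_type, exc_value, _tb = sys.exc_info()
        assert exc_value is e   # both routes lead to the same exception object
        return type(e).__name__, str(e)
    return None

def bad_insert():
    # Hypothetical stand-in for a failing database call.
    raise ValueError("Column count doesn't match value count at row 1")

print(describe_failure(bad_insert))
```

The "as" spelling is accepted by Python 2.6 as well, so the same lines should serve on either side of the 2/3 divide.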

Frederic



Re: String multi-replace

2010-11-18 Thread Frederic Rentsch

On Wed, 2010-11-17 at 21:12 -0800, Sorin Schwimmer wrote:
 Thanks for your answers.
 
 Benjamin Kaplan: of course dict is a type... silly me! I'll blame it on the 
 time (it's midnight here).
 
 Chris Rebert: I'll have a look.
 
 Thank you both,
 SxN
 
 

Forgive me if this is off the track. I haven't followed the thread. I do
have a little module that I believe does what you attempted to do:
multiple substitutions using a regular expression that joins a bunch of
targets with '|' in between. Whether or not you risk unintended
translations as Dave Angel pointed out where the two characters or one
of your targets join coincidentally you will have to determine. If so
you can't use this approach. If, on the other hand, your format is safe
it'll work just fine. Use like this:

>>> import translator
>>> t = translator.Translator (nodia.items ())
>>> t (name)  # Your example
'Rasca'
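
The idea in a nutshell, as a minimal sketch and not the module itself: escape the targets, join them with '|', and let re.sub look up each hit in a dict. The function name multi_replace is invented for illustration. It orders the targets longest first, which is a simplification standing in for the reverse sort the class below uses.

```python
import re

def multi_replace(definitions, text):
    """Replace every target in text by its substitute in a single pass."""
    table = dict(definitions)
    # Longest targets first, so 'abc' wins over 'ab' and 'a' where they overlap.
    targets = sorted(table, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(t) for t in targets))
    return pattern.sub(lambda match: table[match.group(0)], text)

# ('ab', 'ab') protects 'ab' from being turned into 'AB'.
definitions = (('a', 'A'), ('b', 'B'), ('ab', 'ab'), ('abc', 'xyz'))
print(multi_replace(definitions, 'a b ab abc cab'))   # prints: A B ab xyz cab
```

Because the text is scanned only once, a substitute is never translated a second time, which is the point of doing all replacements in one regular expression.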


Frederic



class Translator:

	"""
		Will translate any number of targets, handling them correctly if some overlap.

		Making Translator
			T = Translator (definitions, [eat = 1])
			'definitions' is a sequence of pairs: ((target, substitute),(t2, s2), ...)
			'eat = True' will make an extraction filter that lets only the replaced targets pass.
			Definitions example: (('a','A'),('b','B'),('ab','ab'),('abc','xyz'),
				('\x0c', 'page break'), ('\r\n','\n'), ('   ','\t'))   # ('ab','ab') see Tricks.
			Order doesn't matter.  

		Testing
			T.test (). Translates the definitions and prints the result. All targets 
			must look like the substitutes as defined. If a substitute differs, it has been
			affected by the translation. (E.g. 'A'|'A' ... 'page break'|'pAge BreAk').
			If this is not intended---the effect can be useful---protect the 
			affected substitute by translating it to itself. See Tricks. 

		Running
			translation = T (source)

		Tricks 
			Deletion:  ('target', '')
			Exception: (('\n',''), ('\n\n','\n\n')) # Eat LF except paragraph breaks.
			Exception: (('\n', '\r\n'), ('\r\n','\r\n')) # Unix to DOS, would leave DOS unchanged
			Translation cascade: 
				# Unwrap paragraphs, Unix or DOS, restoring inter-word space if missing,
				Mark_LF = Translator ((('\n','+LF+'),('\r\n','+LF+'),('\n\n','\n\n'),('\r\n\r\n','\r\n\r\n')))
				# Pick positively identifiable mark for end of lines in either Unix or MS-DOS.   
				Single_Space_Mark = Translator (((' +LF+', ' '),('+LF+', ' '),('-+LF+', '')))
				no_lf_text = Single_Space_Mark (Mark_LF (text))
			Translation cascade: 
				reptiles = T_latin_english (T_german_latin (reptilien))

		Limitations
			1. The number of substitutions and the maximum size of input depends on the respective 
				capabilities of the Python re module.
			2. Regular expressions will not work as such but will be handled literally.

		Author:
			Frederic Rentsch (i...@anthra-norell.ch).
			 
	"""

	def __init__ (self, definitions, eat = 0):

		'''
			definitions: a sequence of pairs of strings. ((target, substitute), (t, s), ...)
			eat: False (0) means translate: unaffected data passes unaltered.
			 True  (1) means extract:   unaffected data doesn't pass (gets eaten).
			 Extraction filters typically require substitutes to end with some separator, 
			 else they fuse together. (E.g. ' ', '\t' or '\n') 
			'eat' is an attribute that can be switched anytime.

		'''			
		self.eat = eat
		self.compile_sequence_of_pairs (definitions)
		
	
	def compile_sequence_of_pairs (self, definitions):

		'''
			Argument 'definitions' is a sequence of pairs:
			(('target 1', 'substitute 1'), ('t2', 's2'), ...)
			Order doesn't matter. 

		'''
	
		import re
		self.definitions = definitions
		targets, substitutes = zip (*definitions)
		re_targets = [re.escape (item) for item in targets]
		re_targets.sort (reverse = True)
		self.targets_set = set (targets)   
		self.table = dict (definitions)
		regex_string = '|'.join (re_targets)
		self.regex = re.compile (regex_string, re.DOTALL)
			
	
	def __call__ (self, s):
		hits = self.regex.findall (s)
		nohits = self.regex.split (s)
		valid_hits = set (hits) & self.targets_set  # Ignore targets with illegal re modifiers.
		if valid_hits:
			substitutes = [self.table [item] for item in hits if item in valid_hits] + []  # Make lengths equal for zip to work right
			if self.eat:
				return ''.join (substitutes)
			else:
				zipped = zip (nohits, substitutes)
				return ''.join (list (reduce (lambda a, b: a + b, [zipped][0]))) + nohits [-1]
		else:
			if self.eat:
				return ''
			else:
				return s


	def test (self):

		'''
			Translates the definitions and prints the result. All targets 
			must look like the substitutes as defined. If a substitute differs,
			it has been affected by the translation, indicating a potential 
			problem, should the substitute occur in the source.

		'''

		targets_translated = [self (item [0]) for item in self.definitions]
		substitutes = [self (item [1

Re: Financial time series data

2010-09-04 Thread Frederic Rentsch

On Fri, 2010-09-03 at 19:58 +0200, Virgil Stokes wrote:
 import urllib2
 import re
 
 def get_SP500_symbolsX ():
     symbols = []
     lsttradestr = re.compile('Last Trade:')
     k = 0
     for page in range(10):
        url = 'http://finance.yahoo.com/q/cp?s=%5EGSPC&c='+str(page)
        print url
        f = urllib2.urlopen (url)
        html = f.readlines ()
        f.close ()
        for line in html:
           if line.lstrip ().startswith ('</script><span id="yfs_params_vcr"'):
              line_split = line.split (':')
              s = [item.strip ().upper () for item in line_split [5].replace ('"','').split (',')]
              for symb in s:
                 url = "http://finance.yahoo.com/q?s="+symb
                 f = urllib2.urlopen(url)
                 html = f.readlines()
                 f.close()
 
                 for line in html:
                    if lsttradestr.search(line):
                       k += 1
                       print 'k = %3d (%s)' %(k,symb)
                       # Here is where I will extract the numerical values and place
                       # them in an appropriate file
              symbols.extend (s [:-3])
 
     return symbols
 # Not quite 500 -- which is correct (for example p. 2 has only 49 
 symbols!)
 # Actually the SP 500 as shown does not contain 500 stocks (symbols)
 
 
 symbols = get_SP500_symbolsX()
 pass
 
 And thanks for your help Frederic --- Have a good day! :-)
 
 --V

Good going! You get the idea. 
   Here's my try for a cleaned-up version that makes the best use of the
facility and takes only fifteen seconds to complete (on my machine).
   You may want to look at historical quotes too. Trent Nelson seems to
have a ready-made solution for this.

---

import urllib2
import re

def get_current_SP500_quotes_from_Yahoo ():

    symbol_reader = re.compile ('([a-z-.]+,)+[a-z-.]+')
    # Make sure you include all characters that may show up in symbols,

    csv_data = ''

    for page in range (10):

        url = 'http://finance.yahoo.com/q/cp?s=%5EGSPC&c=' + str (page)
        print url
        f = urllib2.urlopen (url)
        html = f.readlines ()
        f.close ()

        for line in html:

            if line.lstrip ().startswith ('</script><span id="yfs_params_vcr"'):
                symbols = symbol_reader.search (line).group ()
                ## symbols = line.split (':')[5][2:-18]
                ## ^ This is an alternative to the regex. It won't stumble over
                ## unexpected characters in symbols, but depends on the line
                ## format to stay put.
                # print symbols.count (',') + 1   # Uncomment to check for = 50
                # Regex happens to grab symbols correctly formatted
                url = 'http://download.finance.yahoo.com/d/quotes.csv?s=%s&f=sl1d1t1c1ohgv&e=.csv' % symbols
                # print url
                f = urllib2.urlopen (url)
                csv_data += f.read ()
                f.close ()

                break

    return csv_data
   
---

Here is what you get:

A,29.85,9/3/2010,4:01pm,+0.64,29.49,29.99,29.49,2263815
AA,10.88,9/3/2010,4:00pm,+0.05,11.01,11.07,10.82,16634520
AEE,28.65,9/3/2010,4:01pm,+0.19,28.79,28.79,28.46,3029885
... 494 lines in all (today) 

Symbol, Current or close, Date, Time, Change, Open, High, Low, Volume
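
A hedged sketch of what one might do with the returned text: split it into one dict per symbol with the standard csv module. The function parse_quotes and the short field names are invented for illustration, following the column legend above. The two sample lines are copied from the output shown. The real download may have quoted some fields, which csv.reader handles as well.

```python
import csv

# Field names are assumptions that follow the column legend above.
FIELDS = ['symbol', 'last', 'date', 'time', 'change',
          'open', 'high', 'low', 'volume']

def parse_quotes(csv_data):
    """Turn the downloaded text into a list with one dict per symbol."""
    rows = csv.reader(csv_data.splitlines())
    return [dict(zip(FIELDS, row)) for row in rows if row]

sample = ('A,29.85,9/3/2010,4:01pm,+0.64,29.49,29.99,29.49,2263815\n'
          'AA,10.88,9/3/2010,4:00pm,+0.05,11.01,11.07,10.82,16634520\n')
quotes = parse_quotes(sample)
print(quotes[1]['symbol'], float(quotes[1]['last']), int(quotes[1]['volume']))
```

All values arrive as strings, so numeric columns need an explicit float () or int () before any arithmetic.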


---

Good luck to you in the footsteps of Warren Buffett!

Frederic



Re: Financial time series data

2010-09-03 Thread Frederic Rentsch

On Fri, 2010-09-03 at 13:29 +0200, Virgil Stokes wrote:
 A more direct question on accessing stock information from Yahoo.
 
 First, use your browser to go to:  http://finance.yahoo.com/q/cp?s=%
 5EGSPC+Components
 
 Now, you see the first 50 rows of a 500 row table of information on
 SP 500 index. You can LM click on
 
   1 -50 of 500 |First|Previous|Next|Last
 
 below the table to position to any of the 10 pages.
 
 I would like to use Python to do the following.
 
 Loop on each of the 10 pages and for each page extract information for
 each row --- How can this be accomplished automatically in Python?
 
 Let's take the first page (as shown by default). It is easy to see the
 link to the data for A is http://finance.yahoo.com/q?s=A. That is, I
 can just move 
 my cursor over the A and I see this URL in the message at the bottom
 of my browser (Explorer 8). If I LM click on A then I will go to
 this
 link --- Do this!
 
 You should now see a table which shows information on this stock and
 this is the information that I would like to extract. I would like to
 do this for all 500 stocks without the need to enter the symbols for
 them (e.g. A, AA, etc.). It seems clear that this should be
 possible since all the symbols are in the first column of each of the
 50 tables --- but it is not at all clear how to extract these
 automatically in Python. 
 
 Hopefully, you understand my problem. Again, I would like Python to
 cycle through these 10 pages and extract this information for each
 symbol in this table.
 
 --V
 
 
 

Here's a quick hack to get the SP500 symbols from the visual page with
the index letters. From this collection you can then order fifty at a
time from the download facility. (If you get a better idea from Yahoo,
you'll post it of course.)



def get_SP500_symbols ():
    import urllib
    symbols = []
    url = 'http://finance.yahoo.com/q/cp?s=^GSPC&alpha=%c'
    for c in [chr(n) for n in range (ord ('A'), ord ('Z') + 1)]:

        print url % c
        f = urllib.urlopen (url % c)
        html = f.readlines ()
        f.close ()
        for line in html:
            if line.lstrip ().startswith ('</script><span id="yfs_params_vcr"'):
                line_split = line.split (':')
                s = [item.strip ().upper () for item in line_split [5].replace ('"', '').split (',')]
                symbols.extend (s [:-3])

    return symbols
    # Not quite 500 (!?)
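
Ordering fifty at a time could look roughly like this. It is a sketch only: the helper name batches and the placeholder symbols are invented, and the limit of fifty per request is taken from the remark above, not from any Yahoo documentation.

```python
def batches(symbols, size=50):
    """Yield successive comma-joined groups of at most `size` symbols."""
    for start in range(0, len(symbols), size):
        yield ','.join(symbols[start:start + size])

symbols = ['S%03d' % n for n in range(120)]   # made-up placeholders, not real tickers
groups = list(batches(symbols))
print([group.count(',') + 1 for group in groups])   # prints: [50, 50, 20]
```

Each group is a ready-made value for the s= parameter of a download request.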


Frederic




Re: Financial time series data

2010-09-03 Thread Frederic Rentsch

On Fri, 2010-09-03 at 16:48 +0200, Virgil Stokes wrote:
 On 03-Sep-2010 15:45, Frederic Rentsch wrote:
  On Fri, 2010-09-03 at 13:29 +0200, Virgil Stokes wrote:
  A more direct question on accessing stock information from Yahoo.
 
  First, use your browser to go to:  http://finance.yahoo.com/q/cp?s=%
  5EGSPC+Components
 
  Now, you see the first 50 rows of a 500 row table of information on
  SP 500 index. You can LM click on
 
 1 -50 of 500 |First|Previous|Next|Last
 
  below the table to position to any of the 10 pages.
 
  I would like to use Python to do the following.
 
  Loop on each of the 10 pages and for each page extract information for
  each row --- How can this be accomplished automatically in Python?
 
  Let's take the first page (as shown by default). It is easy to see the
  link to the data for A is http://finance.yahoo.com/q?s=A. That is, I
  can just move
  my cursor over the A and I see this URL in the message at the bottom
  of my browser (Explorer 8). If I LM click on A then I will go to
  this
  link --- Do this!
 
  You should now see a table which shows information on this stock and
  this is the information that I would like to extract. I would like to
  do this for all 500 stocks without the need to enter the symbols for
  them (e.g. A, AA, etc.). It seems clear that this should be
  possible since all the symbols are in the first column of each of the
  50 tables --- but it is not at all clear how to extract these
  automatically in Python.
 
  Hopefully, you understand my problem. Again, I would like Python to
  cycle through these 10 pages and extract this information for each
  symbol in this table.
 
  --V
 
 
 
  Here's a quick hack to get the SP500 symbols from the visual page with
  the index letters. From this collection you can then order fifty at a
  time from the download facility. (If you get a better idea from Yahoo,
  you'll post it of course.)
 
 
 
  def get_SP500_symbols ():
      import urllib
      symbols = []
      url = 'http://finance.yahoo.com/q/cp?s=^GSPC&alpha=%c'
      for c in [chr(n) for n in range (ord ('A'), ord ('Z') + 1)]:
  
          print url % c
          f = urllib.urlopen (url % c)
          html = f.readlines ()
          f.close ()
          for line in html:
              if line.lstrip ().startswith ('</script><span id="yfs_params_vcr"'):
                  line_split = line.split (':')
                  s = [item.strip ().upper () for item in line_split [5].replace ('"', '').split (',')]
                  symbols.extend (s [:-3])
 
      return symbols
      # Not quite 500 (!?)
 
 
  Frederic
 
 
 
 I made a few modifications --- very minor. But, I believe that it is a little 
 faster.
 
 import urllib2
 
 def get_SP500_symbolsX ():
     symbols = []
     for page in range(0,9):
        url = 'http://finance.yahoo.com/q/cp?s=%5EGSPC&c='+str(page)
        print url
        f = urllib2.urlopen (url)
        html = f.readlines ()
        f.close ()
        for line in html:
           if line.lstrip ().startswith ('</script><span id="yfs_params_vcr"'):
              line_split = line.split (':')
              s = [item.strip ().upper () for item in line_split [5].replace ('"','').split (',')]
              symbols.extend (s [:-3])
 
     return symbols
 # Not quite 500 -- which is correct (for example p. 2 has only 49 
 symbols!)
 # Actually the SP 500 as shown does not contain 500 stocks (symbols)
 
 
 symbols = get_SP500_symbolsX()
 pass

Oh, yes, and there's no use reading lines to the end once the symbols
are in the bag. The symbol-line-finder conditional section should end
with break.
   And do let us know if you get an answer from Yahoo. Hacks like this
are unreliable. They fail almost certainly the next time a page gets
redesigned, which can be any time. 

Frederic
 


Re: expression in an if statement

2010-08-19 Thread Frederic Rentsch

On Thu, 2010-08-19 at 00:12 +0200, Thomas Jollans wrote:
 On Wednesday 18 August 2010, it occurred to John Nagle to exclaim:
  On 8/18/2010 11:24 AM, ernest wrote:
   Hi,
   
   In this code:
   
   if set(a).union(b) == set(a): pass
   
   Does Python compute set(a) twice?
  
  CPython does.  Shed Skin might optimize.  Don't know
  about Iron Python.
 
 I doubt any actual Python implementation optimizes this -- how could it? 

And why should it if a programmer uses its facilities inefficiently. I
would write

 if set(a).issuperset (b): pass
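
A quick check that the two spellings agree, as a sketch with made-up sample data. The function names union_test and superset_test are invented for illustration.

```python
def union_test(a, b):
    return set(a).union(b) == set(a)     # builds set(a) twice, plus the union

def superset_test(a, b):
    return set(a).issuperset(b)          # builds set(a) once, no union needed

samples = [([1, 2, 3], [2, 3]), ([1, 2], [2, 5]), ([], []), ('abc', 'cab')]
for a, b in samples:
    assert union_test(a, b) == superset_test(a, b)
print([superset_test(a, b) for a, b in samples])   # prints: [True, False, True, True]
```

issuperset accepts any iterable, so b need not be converted to a set first.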

Frederic




Re: Simple Problem but tough for me if i want it in linear time

2010-08-18 Thread Frederic Rentsch

On Mon, 2010-08-16 at 23:17 +, Steven D'Aprano wrote:
 On Mon, 16 Aug 2010 20:40:52 +0200, Frederic Rentsch wrote:
 
  How about
  
  [obj for obj in dataList if obj.number == 100]
  
  That should create a list of all objects whose .number is 100. No need
  to cycle through a loop. 
 
 What do you think the list comprehension does, if not cycle through a 
 loop?
 
 
 -- 
 Steven

What I think is that list comprehensions cycle through a loop a lot
faster than a coded loop (for n in ...:). As at the time of my post only
coded loops had been proposed and the OP was concerned about speed, I
thought I'd propose a list comprehension. I guess my explanation was
poorly phrased. Thanks for the reminder.

Frederic



Re: Simple Problem but tough for me if i want it in linear time

2010-08-16 Thread Frederic Rentsch

On Sun, 2010-08-15 at 15:14 +0200, Peter Otten wrote:
 ChrisChia wrote:
 
  dataList = [a, b, c, ...]
  where a, b, c are objects of a Class X.
  In Class X, it contains self.name and self.number
  
  If i wish to test whether a number (let's say 100) appears in one of
  the object, and return that object,
  is that only fast way of solving this problem without iterating
  through every object to see the number value?
  
  dataList.__contains__ can only check my object instance name...
  anyone can solve this in linear complexity?
 
 Well, iteration as in 
 
 next(item for item in dataList if item.number == 100) 
 
 is O(n) and list.__contains__() has no magic way to do better. If you need 
 O(1) lookup you can use a dict or collections.defaultdict that maps 
 item.number to a list (or set) of X instances:
 
 lookup = {}
 for item in dataList:
     lookup.setdefault(item.number, []).append(item)
 
 print lookup[100]
 
 
 Peter
 

How about 

 [obj for obj in dataList if obj.number == 100]

That should create a list of all objects whose .number is 100. No need
to cycle through a loop. If .number doesn't repeat get your object at
index 0. The approach may seem inefficient for the purpose of extracting
a single item, but the list needs to be gone through in any case and the
list comprehension is surely the most efficient way to do it. 
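
Side by side with Peter's dict, as a small sketch: the comprehension scans the whole list on every query, while the dict costs one pass to build and then answers each query directly. The class X and its sample values are invented here to match the OP's description.

```python
class X(object):
    def __init__(self, name, number):
        self.name = name
        self.number = number

dataList = [X('a', 7), X('b', 100), X('c', 42), X('d', 100)]

# One O(n) scan per query:
hits = [obj for obj in dataList if obj.number == 100]

# One O(n) pass to build the index, then O(1) per query:
lookup = {}
for obj in dataList:
    lookup.setdefault(obj.number, []).append(obj)

print([obj.name for obj in hits])                  # prints: ['b', 'd']
print([obj.name for obj in lookup.get(100, [])])   # prints: ['b', 'd']
```

For a single query the comprehension is the less work of the two. The dict pays off only when the same list is searched repeatedly.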

Frederic



'reload M' doesn't update 'from M inport *'

2010-07-09 Thread Frederic Rentsch

I develop in an IDLE window.

Module M says 'from service import *'.
Next I correct a mistake in function 'service.f'.
Now 'service.f' works fine.

I do 'reload (service); reload (M)'.
The function 'M.f' still misbehaves.

'print inspect.getsource (service.f)' and
'print inspect.getsource (M.f)' shows the same 
corrected code. 

'print service.f' and 'print M.f' show different ids.

So I do 'del M; reload (M)'. Nothing changes.

I delete M again and run gc.collect () to really 
clean house. I reload M again and still nothing changes.
The id of the reloaded function 'M.f' is still the 
same as it was before the purge and so M.f still isn't 
fixed.  

I know I have more radical options, such as starting 
a new IDLE window. That would save me time, but 
I'd like to take the opportunity to understand what
is happening. Surely someone out there knows.

Frederic
 





Re: 'reload M' doesn't update 'from M inport *'

2010-07-09 Thread Frederic Rentsch

On Fri, 2010-07-09 at 15:58 +, Steven D'Aprano wrote:
 On Fri, 09 Jul 2010 15:02:25 +0200, Frederic Rentsch wrote:
 
  I develop in an IDLE window.
  
  Module M says 'from service import *'. Next I correct a mistake in
  function 'service.f'. Now 'service.f' works fine.
 
 from service import *
 
 should be considered advanced functionality that is discouraged unless 
 you really know what you are doing, precisely for the problems you are 
 experiencing. You should try to avoid it.
 
 But putting that aside, if you have done "from service import *" in 
 module m, where are you getting "service.f" from? The only way that is 
 possible is if you ALSO say "import service".
 
 
  I do 'reload (service); reload (M)'.
  The function 'M.f' still misbehaves.
  
  'print inspect.getsource (service.f)' and 'print inspect.getsource
  (M.f)' shows the same corrected code.
 
 inspect.getsource always looks at the source code on disk, no matter what 
 the byte code in memory actually says.
 
  'print service.f' and 'print M.f' show different ids.
  
  So I do 'del M; reload (M)'. Nothing changes.
  
  I delete M again and run gc.collect () to really clean house. I reload M
  again and still nothing changes. The id of the reloaded function 'M.f'
  is still the same as it was before the purge and so M.f still isn't
  fixed.
 
  I know I have more radical options, such as starting a new IDLE window.
  That would save me time, but I'd like to take the opportunity to
  understand what is happening. Surely someone out there knows.
 
 Yes. You have to understand importing. Let's start with the simple:
 
 import m
 
 In *very* simplified pseudo-code, this does:
 
 look for module m in the global cache
 if not there, then:
     search for m.py
     compile it to a Module object
     put the Module object in the cache
 create a new name m in the local namespace
 set the name m to the Module object in the cache
 
 Now let's compare it to:
 
 from m import f
 
 look for module m in the global cache
 if not there, then:
     search for m.py
     compile it to a Module object
     put the Module object in the cache
 look for object named f in the Module object
 create a new name f in the local namespace
 set the name f to cached object
 
 The important thing to notice is that the name "f" is a local variable. It 
 doesn't, and can't, remember that it comes from module m. Reloading m 
 can't do anything to f, because the connection is lost.
 
 Now consider that the object "f" that came from m was itself imported 
 from another module, "service". Reloading service doesn't help, because 
 m.f doesn't know it came from service. Reloading m doesn't help, because 
 all that does is run "from service import f" again, and that just fetches 
 f from the global cache.
 
 The simplest, easiest way of dealing with this is not to have to deal 
 with it: don't use "from service import f", and ESPECIALLY don't use 
 "from service import *". Always use fully-qualified importing:
 
 import service
 service.f
 
 Now reload(service) should do what you expect.
 
 The other way is not to bother with reload. It's not very powerful, only 
 good for the simplest use in the interactive interpreter. Just exit the 
 interpreter and restart it.
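
The binding behaviour described above can be reproduced in a few lines. This is a minimal sketch, written for Python 3 (where reload lives in importlib) rather than the Python 2 used in this thread. The module name service_demo and the temporary directory are made up for the demonstration. It only illustrates that a name bound with "from ... import" keeps pointing at the old function object after a reload, while attribute access through the module picks up the new one.

```python
import importlib
import os
import sys
import tempfile


def demo():
    # Hypothetical throwaway module; the name service_demo is an assumption.
    sys.modules.pop("service_demo", None)   # start from a clean slate
    tmp = tempfile.mkdtemp()
    path = os.path.join(tmp, "service_demo.py")
    sys.path.insert(0, tmp)
    sys.dont_write_bytecode = True           # always compile from source here

    with open(path, "w") as fh:
        fh.write("def f():\n    return 'old'\n")

    import service_demo                      # binds the module object
    from service_demo import f as stale_f    # binds the function object itself

    # "Fix the bug" on disk, then reload the module.
    with open(path, "w") as fh:
        fh.write("def f():\n    return 'new!'\n")
    importlib.invalidate_caches()
    importlib.reload(service_demo)

    # stale_f still refers to the function created by the first import.
    return stale_f(), service_demo.f()
```

Calling demo() returns ('old', 'new!'): the directly imported name is stale, the fully qualified service_demo.f is current.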
 
 
 -- 
 Steven

Thank you very much for your excellent explanation!
   I must say that I haven't been using the from soandso import ...
formula at all. I thought it might expose names to collision, and why
should I assume the responsibility if I can avoid the problem altogether
using explicit names. If I used the, shall we say, direct import this
time it was in an effort to develop a more extensive program. I thought
if a module grows beyond a size that's comfortable to edit, I could just
move select segments to separate files and replace the vacancy with
from the_respective_segment_module import *, analogous to #include
in C.
   The remedy seems to have side-effects that can kill the patient. So
I'll go back to the explicit imports, then. No problem at all. 

Thanking you and the other helpers too

Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: 'reload M' doesn't update 'from M inport *'

2010-07-09 Thread Frederic Rentsch

On Fri, 2010-07-09 at 19:38 +0200, Jean-Michel Pichavant wrote:
 Frederic Rentsch wrote:
  I develop in an IDLE window.
 
  Module M says 'from service import *'.
  Next I correct a mistake in function 'service.f'.
  Now 'service.f' works fine.
 
  I do 'reload (service); reload (M)'.
  The function 'M.f' still misbehaves.
 
  'print inspect.getsource (service.f)' and
  'print inspect.getsource (M.f)' shows the same 
  corrected code. 
 
  'print service.f' and 'print M.f' show different ids.
 
  So I do 'del M; reload (M)'. Nothing changes.
 
  I delete M again and run gc.collect () to really 
  clean house. I reload M again and still nothing changes.
  The id of the reloaded function 'M.f' is still the 
  same as it was before the purge and so M.f still isn't 
  fixed.  
 
  I know I have more radical options, such as starting 
  a new IDLE window. That would save me time, but 
  I'd like to take the opportunity to understand what
  is happening. Surely someone out there knows.
 
  Frederic
   
 
 
 
 

 Hi,
 
 Don't use reload, this is nothing but a trap, especially if you're using 
 it to update your objects with the code you are writing.
 
 JM

I've found reload very usable for development in IDLE. IDLE memorizes
my input, and the variables I assign output to. If I restart IDLE I lose
it all and start over. That is an awfully awkward alternative to
reload, an alternative I wouldn't consider.
   I found reload tricky with several modules, because all
dependencies need to be updated and which way they go isn't always
obvious. Reloading all modules in the right order works for me. The
reload commands come up with Alt-P as long, precisely, as I don't
restart IDLE. 

 PS: You're misusing the del statement. It does not remove any object 
 from memory; it only removes the reference to it, and the object is still 
 in memory. There are very few cases where del is useful in Python, so 
 try to avoid using it as well.

I understand that things going out of scope delete themselves. I have
used del on occasion, for instance, to get rid of invalid members of a
list or a dictionary. It has to be done in two passes, though, because
neither can be altered during an iteration. The first pass makes a
delete list of indices or keys, so that the actual deletion iterates
through the delete list, not the object deleted from. 
   Would you call that a misuse?
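
The two-pass deletion described above might look like the following. This is only a sketch in Python 3 syntax; the function names purge_dict and purge_list and the is_invalid callback are made up for illustration.

```python
def purge_dict(mapping, is_invalid):
    """Delete invalid entries in place, using a separate list of keys."""
    # Pass 1: collect the keys to delete without touching the dict.
    doomed = [key for key, value in mapping.items() if is_invalid(value)]
    # Pass 2: iterate over the delete list, not over the dict itself.
    for key in doomed:
        del mapping[key]
    return mapping


def purge_list(items, is_invalid):
    """Delete invalid elements in place, highest index first."""
    doomed = [i for i, value in enumerate(items) if is_invalid(value)]
    for i in reversed(doomed):   # deleting from the end keeps lower indices valid
        del items[i]
    return items
```

For lists, a single-pass alternative that avoids del altogether is slice assignment with a filtered copy: items[:] = [v for v in items if not is_invalid(v)].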

Frederic



Requesting direction for installation problem

2008-11-26 Thread Frederic Rentsch


Hi,

Where can one get assistance if a Windows installation service fails to 
install an msi installer? I used to download zip files, but they seem to 
have been replaced with msi files. I know this issue is off topic here. 
So my question simply is: where is it not off topic?


Thanks for any hint

Frederic


Re: problem deriving form type long

2008-01-23 Thread Frederic Rentsch

Gabriel Genellina wrote:
 En Mon, 21 Jan 2008 18:33:10 -0200, Frederic Rentsch
 [EMAIL PROTECTED] escribió:

 Hi, here's something that puzzles me:

   class Fix_Point (long):
      def __init__ (self, l):
         long.__init__ (self, l * 0x1)

   fp = Fix_Point (99)
   fp
 99

 You have to override __new__, not __init__. Immutable types like numbers
 and tuples don't use __init__.
 See http://docs.python.org/ref/customization.html
That's a big help! Thank you very much.
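
A minimal sketch of the __new__ approach suggested above, written for Python 3, where long has been merged into int. The class name FixPoint and the factor of 2 are assumptions chosen for illustration (the factor mirrors the "l * 2" experiment in the original question).

```python
class FixPoint(int):
    """Integer that stores its argument multiplied by a fixed factor."""

    SCALE = 2  # assumption: illustrative factor, not taken from real code

    def __new__(cls, value):
        # The value of an immutable object is fixed here, in __new__.
        # By the time __init__ runs, the number already exists and
        # cannot be changed any more.
        return super().__new__(cls, value * cls.SCALE)


fp = FixPoint(99)
print(fp)   # 198
```

The instance is still a FixPoint, so methods added to the subclass remain available on it.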

Frederic


problem deriving form type long

2008-01-21 Thread Frederic Rentsch

Hi, here's something that puzzles me:

  class Fix_Point (long):
     def __init__ (self, l):
        long.__init__ (self, l * 0x1)

  fp = Fix_Point (99)
  fp
99

With prints:

  class Fix_Point (long):
     def __init__ (self, l):
        print l
        l_ = l * 2
        print l_
        long.__init__ (self, l_)
        print self

  fp = Fix_Point (99)
99
198
99

I have tried dozens of variations: type casts, variables, ... nothing 
doing. Fix_Point instances always get the argument assigned regardless 
of the transformation __init__ () performs on it prior to calling 
long.__init__ (). Looks like long.__init__ () isn't called at all. Any 
idea anyone what's going on?

Frederic


(P.S. I am not currently a subscriber. I was and had to bail out when I 
couldn't handle the volume anymore. To subscribe just to post one 
question doesn't seem practical at all. So, I don't even know if this 
message goes through. In case it does, I would appreciate a CC directly 
to my address, as I don't think I can receive the list. Thanks a million.)


Re: Parsing HTML

2007-02-14 Thread Frederic Rentsch

mtuller wrote:
 Alright. I have tried everything I can find, but am not getting
 anywhere. I have a web page that has data like this:

 tr 
 td headers=col1_1  style=width:21%   
 span  class=hpPageText LETTER/span/td
 td headers=col2_1  style=width:13%; text-align:right   
 span  class=hpPageText 33,699/span/td
 td headers=col3_1  style=width:13%; text-align:right   
 span  class=hpPageText 1.0/span/td
 td headers=col4_1  style=width:13%; text-align:right   
 /tr

 What is shown is only a small section.

 I want to extract the 33,699 (which is dynamic) and set the value to a
 variable so that I can insert it into a database. I have tried parsing
 the html with pyparsing, and the examples will get it to print all
 instances with span, of which there are a hundred or so when I use:

 for srvrtokens in printCount.searchString(printerListHTML):
   print srvrtokens

 If I set the last line to srvtokens[3] I get the values, but I don't
 know how to grab a single line and then set that as a variable.

 I have also tried Beautiful Soup, but had trouble understanding the
 documentation, and HTMLParser doesn't seem to do what I want. Can
 someone point me to a tutorial or give me some pointers on how to
 parse html where there are multiple lines with the same tags and then
 be able to go to a certain line and grab a value and set a variable's
 value to that?


 Thanks,

 Mike

   
Posted problems rarely provide exhaustive information. It's just not 
possible. I have been taking shots in the dark of late suggesting a 
stream-editing approach to extracting data from htm files. The 
mainstream approach is to use a parser (beautiful soup or pyparsing).
  Often times nothing more is attempted than the location and 
extraction of some text irrespective of page layout. This can sometimes 
be done with a simple regular expression, or with a stream editor if a 
regular expression gets too unwieldy. The advantage of the stream editor 
over a parser is that it doesn't mobilize an arsenal of unneeded 
functionality and therefore tends to be easier, faster and shorter to 
implement. The editor's inability to understand structure isn't a 
shortcoming when structure doesn't matter and can even be an advantage 
in the presence of malformed input that sends a parser on a tough and 
potentially hazardous mission for no purpose at all.
  SE doesn't impose the study of massive documentation, nor the 
memorization of dozens of classes, methods and what not. The following 
four lines would solve the OP's problem (provided the post really is all 
there is to the problem):


  import re, SE    # http://cheeseshop.python.org/pypi/SE/2.3

  Filter = SE.SE ('<EAT> ~(?i)col[0-9]_[0-9](.|\n)*?</td>~==SOME SPLIT MARK')

  r = re.compile ('(?i)(col[0-9]_[0-9])(.|\n)*?([0-9,]+)</span>')

  for line in Filter (s).split ('SOME SPLIT MARK'):
     print r.search (line).group (1, 3)

('col2_1', '33,699')
('col3_1', '0')
('col4_1', '7,428')


---

Input:

  s = '''
<td headers="col1_1" style="width:21%">
<span class="hpPageText">LETTER</span></td>
<td headers="col2_1" style="width:13%; text-align:right">
<span class="hpPageText">33,699</span></td>
<td headers="col3_1" style="width:13%; text-align:right">
<span class="hpPageText">1.0</span></td>
<td headers="col5_1" style="width:13%; text-align:right">
<span class="hppagetext">7,428</span></td>
</tr>'''

The SE object handles file input too:

  # '' commands string output
  for line in Filter ('file_name', '').split ('SOME SPLIT MARK'):
     print r.search (line).group (1, 3)
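
For comparison, the "simple regular expression" route mentioned above can be sketched with the standard library alone. This is not the SE approach, just a plain re example in Python 3. It assumes the page markup has the usual angle brackets and quoted attribute values; the names CELL_RE and extract_cells are made up here.

```python
import re

# One <td headers="colN_M" ...> cell followed by a <span> holding the value.
CELL_RE = re.compile(
    r'<td\s+headers="(col\d+_\d+)"[^>]*>\s*<span[^>]*>\s*([^<]*?)\s*</span>',
    re.IGNORECASE)


def extract_cells(html):
    """Return a dict mapping each headers id to the text of its span."""
    return dict(CELL_RE.findall(html))


html = (
    '<tr><td headers="col1_1" style="width:21%">'
    '<span class="hpPageText">LETTER</span></td>'
    '<td headers="col2_1" style="width:13%; text-align:right">'
    '<span class="hpPageText">33,699</span></td></tr>'
)
cells = extract_cells(html)
count = int(cells["col2_1"].replace(",", ""))   # 33699, ready for a database
```

Like any regex-based extraction, this ignores document structure and will break if the markup changes shape, which is the trade-off discussed above.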






Re: Find and replace in a file with regular expression

2007-02-03 Thread Frederic Rentsch

TOXiC wrote:
 Hi everyone,
 First I say that I searched and tried everything but I cannot figure 
 out how I can do it.
 I want to open a file (not necessarily a txt) and find and replace a 
 string.
 I can do it with:

 import fileinput, string, sys
 fileQuery = "Text.txt"
 sourceText = '''SOURCE'''
 replaceText = '''REPLACE'''
 def replacemachine(fileName, sourceText, replaceText):

     file = open(fileName, "r")
     text = file.read() # Reads the file and assigns the value to a variable
     file.close() # Closes the file (read session)
     file = open(fileName, "w")
     file.write(text.replace(sourceText, replaceText))
     file.close() # Closes the file (write session)
     print "All went well, the modifications are done"

 replacemachine(fileQuery, sourceText, replaceText)

 Now all went ok but I'm wondering if it's possible to replace text if /
 sourceText/ match a regex.
 Help me please!
 Thx in advance

   
Try this:

  import SE   # from http://cheeseshop.python.org/pypi/SE/2.3

  # Define as many replacements as you like. Identify regexes by placing them between '~'
  replacements = 'SOURCE=REPLACE "another source"="another replace" ~[0-9]+~=int ~[0-9]+\\.[0-9]+~=float  ~'

  Stream_Editor = SE.SE (replacements)

  Stream_Editor (input_file, output_file)

That's it.


PS 1: Your Stream_Editor accepts strings as well as file names and then 
returns a string by default. This is a great help for developing 
substitution sets interactively.

  print Stream_Editor ('''
If it works, this SOURCE should read REPLACE
and another source should become another replace
and this 123456789 should become int
and this 12345.6789 is a float and so should read float.''')

If it works, this REPLACE should read REPLACE
and another replace should become another replace
and this int should become int
and this float is a float and so should read float.

PS 2: It is convenient to keep large and frequently used substitution 
sets in text files. The SE constructor accepts a file name instead of 
the replacements string:

  Stream_Editor = SE.SE ('path/replacement_definitions_file')
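
If a single regex substitution is all that is needed, the standard library can do it without SE. The following is a minimal sketch in Python 3; the function name regex_replace_in_file and its parameters are made up for illustration, and it reads the whole file into memory, as the original poster's code does.

```python
import re


def regex_replace_in_file(path, pattern, replacement, encoding="utf-8"):
    """Replace every match of pattern in the file; return the match count."""
    with open(path, encoding=encoding) as fh:
        text = fh.read()
    # re.subn returns the new text together with the number of substitutions.
    new_text, count = re.subn(pattern, replacement, text)
    with open(path, "w", encoding=encoding) as fh:
        fh.write(new_text)
    return count
```

A call such as regex_replace_in_file('Text.txt', r'[0-9]+', 'int') would replace every run of digits in the file with the word int.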


Regards

Frederic






Re: Type casting a base class to a derived one?

2007-01-15 Thread Frederic Rentsch

Chris Mellon wrote:
 On 11 Jan 2007 15:01:48 +0100, Neil Cerutti [EMAIL PROTECTED] wrote:
   
 On 2007-01-11, Frederic Rentsch [EMAIL PROTECTED] wrote:
 
 If I derive a class from another one because I need a few extra
 features, is there a way to promote the base class to the
 derived one without having to make copies of all attributes?

 class Derived (Base):
def __init__ (self, base_object):
   # ( copy all attributes )
   ...

 This looks expensive. Moreover __init__ () may not be available
 if it needs to do something else.

 Thanks for suggestions
   
 How does it make sense to cast a base to a derived in your
 application?

 

 I can't figure out any circumstance when you'd need to do this in
 Python. Upcasting like this is something you do in statically typed
 languages. I suspect that the OP doesn't really believe dynamic
 casting works and doesn't want to pass a derived class for some
 reason.
   
What for? If an instance needs to collect a substantial amount of data 
and needs to perform a substantial amount of processing in order to 
analyze that data, and if the appropriate type of the instance depends 
on the analysis, I thought that the instance might at that point just 
kind of slip into the appropriate identity.
  After studying the various helpful suggestions, some of which, 
like this one, question the wisdom of such an approach, I think I see 
the light: I'd have a class that does the collecting and the analyzing, 
or even two classes: one collecting the other analyzing and then, 
depending on the outcome of the analysis, make the appropriate processor 
and hand it the data it needs. Right?
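
A minimal sketch of that arrangement, in Python 3. All class and function names here (Collector, NumericProcessor, TextProcessor, make_processor) and the toy analysis rule are invented for illustration; the point is only that the analysis result selects which class to instantiate, so no object ever has to change its type.

```python
class Collector:
    """Gathers data and decides what kind of data it is."""

    def __init__(self):
        self.samples = []

    def add(self, value):
        self.samples.append(value)

    def analyze(self):
        # Toy rule: everything numeric -> "numeric", otherwise "text".
        if all(isinstance(s, (int, float)) for s in self.samples):
            return "numeric"
        return "text"


class NumericProcessor:
    def __init__(self, samples):
        self.samples = list(samples)

    def process(self):
        return sum(self.samples)


class TextProcessor:
    def __init__(self, samples):
        self.samples = [str(s) for s in samples]

    def process(self):
        return " ".join(self.samples)


PROCESSORS = {"numeric": NumericProcessor, "text": TextProcessor}


def make_processor(collector):
    """Pick the processor class from the analysis and hand it the data."""
    return PROCESSORS[collector.analyze()](collector.samples)
```

The collector stays a collector; the processor is created once the appropriate type is known.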
 
Thank you all very much for your input.

Frederic (OP)



Type casting a base class to a derived one?

2007-01-11 Thread Frederic Rentsch

Hi all,
   If I derive a class from another one because I need a few extra 
features, is there a way to promote the base class to the derived one 
without having to make copies of all attributes?

class Derived (Base):
   def __init__ (self, base_object):
  # ( copy all attributes )
  ...

This looks expensive. Moreover __init__ () may not be available if it 
needs to do something else.

Thanks for suggestions

Frederic



Re: textwrap.dedent replaces tabs?

2006-12-29 Thread Frederic Rentsch

Tom Plunket wrote:
 Frederic Rentsch wrote:

   
 Your rules seem incomplete.
 

 Not my rules, the stated documentation for dedent.  My understanding
 of them may not be equivalent to yours, however.
It's not about understanding; it's about the objective. Let us consider 
the difference between passing a driving test and riding a bicycle in 
city traffic. The objective of passing the test is getting the license 
and the means is knowing the rules. The objective of riding the bicycle 
is surviving and the means is anticipating all possible breaches of 
rules on the part of motorists.

 What if common tabs remain after stripping common white space?
 What if we just go with, [r]emove any whitespace than can be uniformly
 removed from the left of every line in `text`. ?
   
 Does this never happen? Or can we hope it doesn't happen?
 

 Hope has no place in programming software that is to be used by
 others.

   
That's exactly what I am saying. That's exactly why it may be a good 
idea to provide preventive measures for rules being breached by those 
others over whom we have no control.

 To err on the side of caution I complete your rules and this is my 
 (tested) attempt at expressing them pythonically.
 

 Inasmuch as my rules have been expressed via tests, the provided code
 fails four of the five tests provided.

   
toms_test_data = (
   ( \n   Hello\n  World,   # Do this
 \nHello\n   World, ),  # Expect this
   ( \n\tHello\n\t   World,
 \nHello\n   World, ),
   ( \t\tHello\n\tWorld,
 \tHello\nWorld, ),
   ( Hello\n\tWorld,
 Hello\n\tWorld, ),
   (   \t Hello\n   \tWorld,
 \t Hello\n \tWorld, ),
)
  for dedent_this, expect_this in toms_test_data:
     done = '\n'.join (dedent (dedent_this.splitlines ()))
     if done == expect_this: print 'BRAVO!!!'
     else: print 'SHAME ON YOU!!!'
   
BRAVO!!!
BRAVO!!!
BRAVO!!!
BRAVO!!!
BRAVO!!!

You seem to have plugged my function into your tester. I wasn't 
concerned about your testing interface but about the dedentation.
 (I admit it does look awfully seventies-ish. Just a vulgar little 
 function.)
 

 Seventies-ish is as much a statement about the lack of statement about
 how you actually tested it as it is that an implementation was made
 apparently without understanding of the requirements.


 -tom!

   
Best regards

Frederic



Re: textwrap.dedent replaces tabs?

2006-12-28 Thread Frederic Rentsch

Tom Plunket wrote:
 Frederic Rentsch wrote:

   
 If this works, good for you. I can't say I understand your objective. 
 (You dedent common leading tabs, except if preceded by common leading 
 spaces (?)).
 

 I dedent common leading whitespace, and tabs aren't equivalent to
 spaces.

 E.g. if some text is indented exclusively with tabs, then the leading
 tabs are stripped appropriately.  If some other text is indented with
 common leading spaces, those are stripped appropriately.  If the text to
 be stripped has some lines starting with spaces and others starting with
 tabs, there are no /common/ leading whitespace characters, and thus
 nothing is stripped.

   
Your rules seem incomplete. What if common tabs remain after stripping common 
white space? Does this never happen? Or can we hope it doesn't happen? To err 
on the side of caution I complete your rules and this is my (tested) attempt at 
expressing them pythonically. (I admit it does look awfully seventies-ish. Just a 
vulgar little function.)

Cheers

Frederic

-

import re

def dedent (lines):

   leading_space_re = re.compile (' *')
   leading_tab_re   = re.compile ('\t*')
   number_of_lines = len (lines)

   while 1:
      # Alternately strip common leading spaces and common leading tabs
      # until neither is left.
      common_space_length = common_tab_length = 10
      for line in lines:
         if line:   # No '\n'
            try: common_space_length = min (common_space_length, len (leading_space_re.match (line).group ()))
            except AttributeError: pass
            try: common_tab_length = min (common_tab_length, len (leading_tab_re.match (line).group ()))
            except AttributeError: pass
      if 0 < common_space_length < 10:
         for i in xrange (number_of_lines):
            lines [i] = lines [i][common_space_length:]
      elif 0 < common_tab_length < 10:
         for i in xrange (number_of_lines):
            lines [i] = lines [i][common_tab_length:]
      else:
         break

   return lines
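
The same rule ("remove whatever leading whitespace is literally common to every non-blank line, without treating tabs and spaces as equivalent") can also be sketched in Python 3 with os.path.commonprefix. The name dedent_strict is made up here, and the handling of whitespace-only lines (left alone unless they start with the common margin) is an assumption, not something the thread specifies.

```python
import os


def dedent_strict(text):
    """Strip the longest whitespace prefix shared by all non-blank lines."""
    lines = text.split("\n")
    # Leading run of spaces/tabs of every line that has real content.
    margins = [line[:len(line) - len(line.lstrip(" \t"))]
               for line in lines if line.strip()]
    if not margins:
        return text
    # commonprefix compares character by character, so ' ' never matches '\t'.
    common = os.path.commonprefix(margins)
    return "\n".join(
        line[len(common):] if line.startswith(common) else line
        for line in lines)
```

With this rule, a line starting with a space followed by a tab and a line starting with a tab share no margin, so nothing is stripped from either.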



Re: BeautifulSoup vs. loose chars

2006-12-26 Thread Frederic Rentsch

John Nagle wrote:
 Felipe Almeida Lessa wrote:
   
 On 26 Dec 2006 04:22:38 -0800, placid [EMAIL PROTECTED] wrote:

 
 So do you want to remove & or replace them with &amp; ? If you want
 to replace it try the following;
   
 I think he wants to replace them, but just the invalid ones. I.e.,

 This & this &amp; that

 would become

 This &amp; this &amp; that


 No, i don't know how to do this efficiently. =/...
 I think some kind of regex could do it.
 

 Yes, and the appropriate one is:

   krefindamp = re.compile(r'&(?!(\w|#)+;)')
   ...
   xmlsection = re.sub(krefindamp,'&amp;',xmlsection)

 This will replace an '&' with '&amp;' if the '&' isn't
 immediately followed by some combination of letters, numbers,
 and '#' ending with a ';'  Admittedly this would let something
 like '&xx#2;', which isn't a legal entity, through unmodified.

 There's still a potential problem with unknown entities in the output XML, but
 at least they're recognized as entities.

   John Nagle
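
A small runnable sketch of the negative-lookahead substitution described above, in Python 3. It assumes the pattern begins with a literal ampersand; the names BARE_AMP_RE and escape_bare_ampersands are made up here.

```python
import re

# An ampersand NOT followed by letters/digits/'#' and a closing ';'.
BARE_AMP_RE = re.compile(r'&(?!(\w|#)+;)')


def escape_bare_ampersands(text):
    """Turn loose '&' characters into '&amp;', leaving entities alone."""
    return BARE_AMP_RE.sub('&amp;', text)


print(escape_bare_ampersands('This & this &amp; that'))
# This &amp; this &amp; that
```

As noted above, anything shaped like an entity passes through unmodified, including malformed ones such as '&xx#2;'.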


   

Here's another idea:

  s = '''<html> htm tag should not translate
   > & should be &amp;
   > &xx#2; isn't a legal entity and should translate
   > &#123; is a legal entity and should not translate
   </html>'''

  import SE  # http://cheeseshop.python.org/pypi/SE/2.3
  HTM_Escapes = SE.SE (definitions)  # See definitions below the dotted line

  print HTM_Escapes (s)
<html> htm tag should not translate
   &gt; &amp; should be &amp;
   &gt; &amp;xx#2; isn&quot;t a legal entity and should translate
   &gt; &#123; is a legal entity and should not translate
   </html>

Regards

Frederic


--


definitions = '''

  # Do # Don't do
#  =&nbsp;     &nbsp;==     #   32  20
  (34)=&dquot; &dquot;==    #   34  22
  &=&amp;      &amp;==      #   38  26
  '=&quot;     &quot;==     #   39  27
  <=&lt;       &lt;==       #   60  3c
  >=&gt;       &gt;==       #   62  3e
  ©=&copy;     &copy;==     #  169  a9
  ·=&middot;   &middot;==   #  183  b7
  »=&raquo;    &raquo;==    #  187  bb
  À=&Agrave;   &Agrave;==   #  192  c0
  Á=&Aacute;   &Aacute;==   #  193  c1
  Â=&Acirc;    &Acirc;==    #  194  c2
  Ã=&Atilde;   &Atilde;==   #  195  c3
  Ä=&Auml;     &Auml;==     #  196  c4
  Å=&Aring;    &Aring;==    #  197  c5
  Æ=&AElig;    &AElig;==    #  198  c6
  Ç=&Ccedil;   &Ccedil;==   #  199  c7
  È=&Egrave;   &Egrave;==   #  200  c8
  É=&Eacute;   &Eacute;==   #  201  c9
  Ê=&Ecirc;    &Ecirc;==    #  202  ca
  Ë=&Euml;     &Euml;==     #  203  cb
  Ì=&Igrave;   &Igrave;==   #  204  cc
  Í=&Iacute;   &Iacute;==   #  205  cd
  Î=&Icirc;    &Icirc;==    #  206  ce
  Ï=&Iuml;     &Iuml;==     #  207  cf
  Ð=&ETH;      &ETH;==      #  208  d0
  Ñ=&Ntilde;   &Ntilde;==   #  209  d1
  Ò=&Ograve;   &Ograve;==   #  210  d2
  Ó=&Oacute;   &Oacute;==   #  211  d3
  Ô=&Ocirc;    &Ocirc;==    #  212  d4
  Õ=&Otilde;   &Otilde;==   #  213  d5
  Ö=&Ouml;     &Ouml;==     #  214  d6
  Ø=&Oslash;   &Oslash;==   #  216  d8
  Ù=&Ugrave;   &Ugrave;==   #  217  d9
  Ú=&Uacute;   &Uacute;==   #  218  da
  Û=&Ucirc;    &Ucirc;==    #  219  db
  Ü=&Uuml;     &Uuml;==     #  220  dc
  Ý=&Yacute;   &Yacute;==   #  221  dd
  Þ=&THORN;    &THORN;==    #  222  de
  ß=&szlig;    &szlig;==    #  223  df
  à=&agrave;   &agrave;==   #  224  e0
  á=&aacute;   &aacute;==   #  225  e1
  â=&acirc;    &acirc;==    #  226  e2
  ã=&atilde;   &atilde;==   #  227  e3
  ä=&auml;     &auml;==     #  228  e4
  å=&aring;    &aring;==    #  229  e5
  æ=&aelig;    &aelig;==    #  230  e6
  ç=&ccedil;   &ccedil;==   #  231  e7
  è=&egrave;   &egrave;==   #  232  e8
  é=&eacute;   &eacute;==   #  233  e9
  ê=&ecirc;    &ecirc;==    #  234  ea
  ë=&euml;     &euml;==     #  235  eb
  ì=&igrave;   &igrave;==   #  236  ec
  í=&iacute;   &iacute;==   #  237  ed
  î=&icirc;    &icirc;==    #  238  ee
  ï=&iuml;     &iuml;==     #  239  ef
  ð=&eth;      &eth;==      #  240  f0
  ñ=&ntilde;   &ntilde;==   #  241  f1
  ò=&ograve;   &ograve;==   #  242  f2
  ó=&oacute;   &oacute;==   #  243  f3
  ô=&ocirc;    &ocirc;==    #  244  f4
  õ=&otilde;   &otilde;==   #  245  f5
  ö=&ouml;     &ouml;==     #  246  f6
  ø=&oslash;   &oslash;==   #  248  f8
  ù=&ugrave;   &ugrave;==   #  249  f9
  ú=&uacute;   &uacute;==   #  250  fa
  û=&ucirc;    &ucirc;==    #  251  fb
  ü=&uuml;     &uuml;==     #  252  fc
  ý=&yacute;   &yacute;==   #  253  fd
  þ=&thorn;    &thorn;==    #  254  fe
  (xff)=&#255;              #  255  ff
               &#==         #  All numeric codes
  ~<(.|\n)*?>~==            #  All HTM tags '''

If the ampersand is all you need to handle you can erase the others
in the first column. You need to keep the second column though, except
the last entry, because the tags don't need protection if '<' and
'>' in the first column are gone.
  Definitions are easily edited and

Re: textwrap.dedent replaces tabs?

2006-12-24 Thread Frederic Rentsch

Tom Plunket wrote:
 Frederic Rentsch wrote:

   
 Following a call to dedent () it shouldn't be hard to translate leading 
 groups of so many spaces back to tabs.
 

 Sure, but the point is more that I don't think it's valid to change to
 tabs in the first place.

 E.g.:

  input = ' ' + '\t' + 'hello\n' +
  '\t' + 'world'

  output = textwrap.dedent(input)

 will yield all of the leading whitespace stripped, which IMHO is a
 violation of its stated function.  In this case, nothing should be
 stripped, because the leading whitespace in these two lines does not
 /actually/ match.  Sure, it visually matches, but that's not the point
 (although I can understand that that's a point of contention in the
 interpreter anyway, I would have no problem with it not accepting 1 tab
 = 8 spaces for indentation...  But that's another holy war.

   
 If I understand your problem, you want to restore the dedented line to 
 its original composition if spaces and tabs are mixed and this doesn't 
 work because the information doesn't survive dedent ().
 

 Sure, although would there be a case to be made to simply not strip the
 tabs in the first place?

 Like this, keeping current functionality and everything...  (although I
 would think if someone wanted tabs expanded, they'd call expandtabs on
 the input before calling the function!):

 def dedent(text, expand_tabs=True):
     """dedent(text : string, expand_tabs : bool) -> string

     Remove any whitespace than can be uniformly removed from the left
     of every line in `text`, optionally expanding tabs before altering
     the text.

     This can be used e.g. to make triple-quoted strings line up with
     the left edge of screen/whatever, while still presenting it in the
     source code in indented form.

     For example:

         def test():
             # end first line with \ to avoid the empty line!
             s = '''\
              hello
             \t  world
             '''
             print repr(s) # prints ' hello\n\t  world\n'
             print repr(dedent(s))  # prints ' hello\n\t  world\n'
     """
     if expand_tabs:
         text = text.expandtabs()
     lines = text.split('\n')

     margin = None
     for line in lines:
         if margin is None:
             content = line.lstrip()
             if not content:
                 continue
             indent = len(line) - len(content)
             margin = line[:indent]
         elif not line.startswith(margin):
             if len(line) < len(margin):
                 content = line.lstrip()
                 if not content:
                     continue
             while not line.startswith(margin):
                 margin = margin[:-1]

     if margin is not None and len(margin) > 0:
         margin = len(margin)
         for i in range(len(lines)):
             lines[i] = lines[i][margin:]

     return '\n'.join(lines)

 import unittest

 class DedentTest(unittest.TestCase):
 def testBasicWithSpaces(self):
 input = \n   Hello\n  World
 expected = \nHello\n   World
 self.failUnlessEqual(expected, dedent(input))

 def testBasicWithTabLeadersSpacesInside(self):
 input = \n\tHello\n\t   World
 expected = \nHello\n   World
 self.failUnlessEqual(expected, dedent(input, False))
 
 def testAllTabs(self):
 input = \t\tHello\n\tWorld
 expected = \tHello\nWorld
 self.failUnlessEqual(expected, dedent(input, False))
 
 def testFirstLineNotIndented(self):
 input = Hello\n\tWorld
 expected = input
 self.failUnlessEqual(expected, dedent(input, False))
 
 def testMixedTabsAndSpaces(self):
 input =   \t Hello\n   \tWorld
 expected = \t Hello\n \tWorld
 self.failUnlessEqual(expected, dedent(input, False))
 
 if __name__ == '__main__':
 unittest.main()
 -tom!

   
If this works, good for you. I can't say I understand your objective. 
(You dedent common leading tabs, except if preceded by common leading 
spaces (?)). Neither do I understand the existence of indentations made 
up of tabs mixed with spaces, but that is another topic.
 I have been wasting a lot of time with things of this nature coding 
away before forming a clear conception in my mind of what my code was 
supposed to accomplish. Sounds stupid. But many problems seem trivial 
enough at first sight to create the illusion of perfect understanding. 
The encounter with the devil in the details can be put off but not 
avoided. Best to get it over with from the start and write an exhaustive 
formal description of the problem. Follows an exhaustive formal 
description of the rules for its solution. The rules can then be morphed 
into code in a straightforward manner. In other words, coding should be 
the translation of a logical system into a language a machine 
understands. It should not be the construction of the logical system. 
This, anyway

Re: textwrap.dedent replaces tabs?

2006-12-22 Thread Frederic Rentsch

Tom Plunket wrote:
 Frederic Rentsch wrote:

   
 Well, there is that small problem that there are leading tabs that I
 want stripped.  I guess I could manually replace all tabs with eight
 spaces (as opposed to 'correct' tab stops), and then replace them when
 done, but it's probably just as easy to write a non-destructive dedent.
   
 This should do the trick:

   Dedent = re.compile ('^\s+')
   for line in lines: print Dedent.sub ('', line)
 

 The fact that this doesn't do what dedent() does makes it not useful.
 Stripping all leading spaces from text is as easy as calling lstrip() on
 each line:
   

My goodness! How right you are.
 text = '\n'.join([line.lstrip() for line in text.split('\n')])

 alas, that isn't what I am looking for, nor is that what
 textwrap.dedent() is intended to do.

 -tom!
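
The difference between the two operations is easy to show on a space-indented snippet. A minimal Python 3 sketch; the helper name lstrip_all and the sample text are made up for illustration.

```python
import textwrap


def lstrip_all(text):
    """Strip ALL leading whitespace from every line (not a dedent)."""
    return "\n".join(line.lstrip() for line in text.split("\n"))


sample = "    if ready:\n        go()\n"

flattened = lstrip_all(sample)        # 'if ready:\ngo()\n'
dedented = textwrap.dedent(sample)    # 'if ready:\n    go()\n'
```

Stripping every line flattens the block, while dedent removes only the margin shared by all lines and so keeps the relative indentation.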

   
Following a call to dedent () it shouldn't be hard to translate leading 
groups of so many spaces back to tabs. But this is probably not what you 
want. If I understand your problem, you want to restore the dedented 
line to its original composition if spaces and tabs are mixed and this 
doesn't work because the information doesn't survive dedent (). Could 
the information perhaps be passed around dedent ()? Like this: make a 
copy of your lines and translate the copy's tabs to so many (8?) marker 
bytes (e.g. ascii 0). Dedent  the originals. Left-strip each of the 
marked line copies to the length of its dedented original and translate 
the marked groups back to tabs.

Frederic



Re: textwrap.dedent replaces tabs?

2006-12-17 Thread Frederic Rentsch

Tom Plunket wrote:
 CakeProphet wrote:

   
 Hmmm... a quick fix might be to temporarily replace all tab characters
 with another, relatively unused control character.

 MyString = MyString.replace("\t", chr(1))
 MyString = textwrap.dedent(MyString)
 MyString = MyString.replace(chr(1), "\t")

 Of course... this isn't exactly safe, but it's not going to be fatal,
 if it does mess something up. As long as you don't expect receiving any
 ASCII 1 characters.
 

 Well, there is that small problem that there are leading tabs that I
 want stripped.  I guess I could manually replace all tabs with eight
 spaces (as opposed to 'correct' tab stops), and then replace them when
 done, but it's probably just as easy to write a non-destructive dedent.

 It's not that I don't understand /why/ it does it; indeed I'm sure it
 does this so you can mix tabs and spaces in Python source.  Why anyone
 would intentionally do that, though, I'm not sure.  ;)

 -tom!

   
This should do the trick:

  Dedent = re.compile ('^\s+')
  for line in lines: print Dedent.sub ('', line)

Frederic

---

Testing:

  text = s = '''   # Dedent demo
No indent
   Three space indent
\tOne tab indent
   \t\tThree space, two tab indent
\t  \tOne tab, two space, one tab indent with two tabs here \t\t'''

  print text
print s
   # Dedent demo
No indent
   Three space indent
One tab indent
   Three space - two tab indent
   One tab - two spaces - one tab indent with two tabs here 


  for line in text.splitlines (): print Dedent.sub ('', line)

# Dedent demo
No indent
Three space indent
One tab indent
Three space - two tab indent
One tab - two spaces - one tab indent with two tabs here 

---



Re: Python spam?

2006-12-01 Thread Frederic Rentsch

Hendrik van Rooyen wrote:
 Aahz [EMAIL PROTECTED] wrote:

   
 Anyone else getting Python-related spam?  So far, I've seen messages
 from Barry Warsaw and Skip Montanaro (although of course header
 analysis proves they didn't send it).
 --
 

 not like that - just the normal crud from people giving me get rich quick tips
 on the stock market that is aimed at mobilising my money to follow theirs to
 help influence the price of a share...

 - Hendrik


   
...which I noticed works amazingly well in many cases, looking at the 
charts, which, again, means that the trick isn't likely to fizzle out 
soon as others have with victims getting wise to it. Getting feathers 
plucked in this game isn't a turn-off. It's an opportunity to join the 
pluckers by speeding up one's turnover at the expense of the slowpokes. 
Like pyramid sales, this is a self-generating market.
   This game, at least, isn't unethical, other than clogging the 
internet with reckless traffic. I've been asking myself why it seems so 
difficult to backtrace such obtrusive, if not criminal, traffic to the 
source and squash it there. Perhaps some knowledgeable volunteer would 
share his insights. Perhaps stalking con artists could be another 
interest group.

Frederic



-- 
http://mail.python.org/mailman/listinfo/python-list

Re: splitting a long string into a list

2006-11-29 Thread Frederic Rentsch

ronrsr wrote:
 still having a heckuva time with this.

 here's where it stand - the split function doesn't seem to work the way
 i expect it to.


 longkw1,type(longkw):   Agricultural subsidies; Foreign
 aid;Agriculture; Sustainable Agriculture - Support; Organic
 Agriculture; Pesticides, US, Childhood Development, Birth Defects;
 type 'list' 1

 longkw.replace(',',';')

 Agricultural subsidies; Foreign aid;Agriculture; Sustainable
 Agriculture - Support; Organic Agriculture; Pesticides, US, Childhood
 Development


  kw = longkw.split(; ,)#kw is now a list of len 1

 kw,typekw= ['Agricultural subsidies; Foreign aid;Agriculture;
 Sustainable Agriculture - Support; Organic Agriculture; Pesticides, US,
 Childhood Development, Birth Defects; Toxic Chemicals;Antibiotics,
 Animals;Agricultural Subsidies


 what I would like is to break the string into a list of the delimited
 words, but have had no luck doing that - I thought split wuld do that,
 but it doens't.

 bests,

 -rsr-


   

  import SE    # http://cheeseshop.python.org/pypi/SE/2.3

  # Translates both ',' and ';' into an arbitrary split mark ('|')
  Split_Marker = SE.SE (' ,=|  ;=| ')
  for item in Split_Marker (longstring).split ('|'): print item

Agricultural subsidies
 Foreign aidAgriculture
Sustainable Agriculture - Support
 Organic Agriculture

... etc.

To get rid of the leading space on some lines simply add 
corresponding replacements. SE does any number of substitutions in one 
pass. Defining them is a simple matter of writing them up in one single 
string from which the translator object is made:

  Split_Marker = SE.SE (' ,=|  ;=|  , =|  ; =| ')
  for item in Split_Marker (longstring).split ('|'): print item

Agricultural subsidies
Foreign aidAgriculture
Sustainable Agriculture - Support
Organic Agriculture
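
For comparison, here is a minimal standard-library sketch of the same split, written in Python 3 syntax. It assumes that both ',' and ';' are delimiters and that surrounding blanks and empty fields are unwanted; the name split_keywords is invented for the example.

```python
import re

def split_keywords(text):
    """Split on ',' or ';' and drop surrounding blanks and empty fields."""
    parts = re.split(r'[;,]', text)
    return [part.strip() for part in parts if part.strip()]

longstring = ("Agricultural subsidies; Foreign aid;Agriculture; "
              "Sustainable Agriculture - Support; Organic Agriculture; "
              "Pesticides, US, Childhood Development, Birth Defects;")

for item in split_keywords(longstring):
    print(item)
```

The character class handles both delimiters in one pass, and strip() takes care of the leading spaces without extra substitution rules.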


Regards

Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: utf - string translation

2006-11-29 Thread Frederic Rentsch

Dan wrote:
 On 22 nov, 22:59, John Machin [EMAIL PROTECTED] wrote:

   
 processes (Vigenère)
   
 So why do you want to strip off accents? The history of communication
 has several examples of significant difference in meaning caused by
 minute differences in punctuation or accents including one of which you
 may have heard: a will that could be read (in part) as either a chacun
 d'eux million francs or a chacun deux million francs with the
 remainder to a 3rd party.

 
 of course.
 My purpose is not doing something realistic on a cryptographic view.
 It's for learning rudiments of programming.
 In fact, coding characters is a kind of cryptography I mean, sometimes,
 when friends can't read an email because of the characters used...

 I wanted to strip off accents because I use the frequences of the
 charactacters. If  I only have 26 char, it's more easy to analyse (the
 text can be shorter for example)

   
Try this:

import string

from_characters   = '\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd8\xd9\xda\xdb\xdc\xdd\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf8\xf9\xfa\xfb\xfc\xfd\xff\xe7\xe8\xe9\xea\xeb'
# One replacement per character above, in the same order
# (string.maketrans requires both strings to have equal length).
to_characters     = 'AAAAAAACEEEEIIIIDNOOOOOOUUUUYaaaaaaaiiiionoooooouuuuyyceeee'
translation_table = string.maketrans (from_characters, to_characters)
translated_string = string.translate (original_string, translation_table)
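
On Python 3, where strings are Unicode, a table of this kind can often be replaced by Unicode decomposition. The following is only a sketch under that assumption; strip_accents is a made-up name, and letters without a decomposition (such as Æ or Ø) pass through unchanged, so they would still need an explicit mapping.

```python
import unicodedata

def strip_accents(text):
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize('NFD', text)
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))

print(strip_accents('René est un garçon qui paraît plus âgé.'))
```

For frequency analysis over 26 letters, the result would typically be lowercased and filtered to a-z afterwards.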


Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: multi split function taking delimiter list

2006-11-16 Thread Frederic Rentsch

Paddy wrote:
 Paddy wrote:

 Paddy wrote:

 [EMAIL PROTECTED] wrote:

 Hi, I'm looking for something like:

 multi_split( 'a:=b+c' , [':=','+'] )

 returning:
 ['a', ':=', 'b', '+', 'c']

 whats the python way to achieve this, preferably without regexp?

 Thanks.

 Martin
 I resisted my urge to use a regexp and came up with this:

 from itertools import groupby
 s = 'apple=blue+cart'
 [''.join(g) for k,g in groupby(s, lambda x: x in '=+')]
 ['apple', '=', 'blue', '+', 'cart']
 For me, the regexp solution would have been clearer, but I need to
 stretch my itertools skills.

 - Paddy.
 Arghhh!
 No colon!
 Forget the above please.

 - Pad.

 With colon:

 from itertools import groupby
 s = 'apple:=blue+cart'
 [''.join(g) for k,g in groupby(s,lambda x: x in ':=+')]
 ['apple', ':=', 'blue', '+', 'cart']

 - Pad.

Automatic grouping may or may not work as intended. If some subsets 
should not be split, the solution raises a new problem.

I have been demonstrating solutions based on SE with such frequency of 
late that I have begun to irritate some readers and SE in sarcastic 
exaggeration has been characterized as the 'Solution of Everything'. 
With some trepidation I am going to demonstrate another SE solution, 
because the truth of the exaggeration is that SE is a versatile tool for 
handling a variety of relatively simple problems in a simple, 
straightforward manner.

  test_string = 'a:=b+c: apple:=blue:+cart'
  # For repeats the SE object would be assigned to a variable
  SE.SE (':\==/:\=/ +=/+/')(test_string).split ('/')
['a', ':=', 'b', '+', 'c: apple', ':=', 'blue:', '+', 'cart']

This is a nuts-and-bolts approach. What you do is what you get. What you 
want is what you do. By itself SE doesn't do anything but search and 
replace, a concept without a learning curve. The simplicity doesn't 
suggest versatility. Versatility comes from application techniques.
SE is a game of challenge. You know the result you want. You know 
the pieces you have. The game is how to get the result with the pieces 
using search and replace, either per se or as an auxiliary, as in this 
case for splitting. That's all. The example above inserts some 
appropriate split mark ('/'). It takes thirty seconds to write it up and 
see the result. No need to ponder formulas and inner workings. If you 
don't like what you see you also see what needs to be changed. Supposing 
we should split single colons too, adding the corresponding substitution 
and verifying the effect is a matter of another ten seconds:

  SE.SE (':\==/:\=/ +=/+/ :=/:/')(test_string).split ('/')
['a', ':=', 'b', '+', 'c', ':', ' apple', ':=', 'blue', ':', '', '+', 
'cart']

Now we see an empty field we don't like towards the end. Why?

  SE.SE (':\==/:\=/ +=/+/ :=/:/')(test_string)
'a/:=/b/+/c/:/ apple/:=/blue/://+/cart'

Ah! It's two slashes next to each other. No problem. We de-multiply 
double slashes in a second pass:

  SE.SE (':\==/:\=/ +=/+/ :=/:/ | //=/')(test_string).split ('/')
['a', ':=', 'b', '+', 'c', ':', ' apple', ':=', 'blue', ':', '+', 'cart']

On second thought the colon should not be split if a plus sign follows:

  SE.SE (':\==/:\=/ +=/+/ :=/:/ :+=:/+/ | //=/')(test_string).split ('/') 

['a', ':=', 'b', '+', 'c', ':', ' apple', ':=', 'blue:', '+', 'cart']

No, wrong again! 'Colon-plus' should be exempt altogether. And no spaces 
please:

  SE.SE (':\==/:\=/ +=/+/ :=/:/ :+=:+  = | //=/')(test_string).split ('/')
['a', ':=', 'b', '+', 'c', ':', 'apple', ':=', 'blue:+cart']

etc.
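
For reference, the originally requested multi_split can also be sketched with the standard library, although it does use a regular expression, which the original poster hoped to avoid. This is Python 3 syntax and an illustration only: a capturing group makes re.split keep the delimiters, and sorting the delimiters longest first is an assumption made here so that ':=' is tried before ':'.

```python
import re

def multi_split(text, delimiters):
    """Split text on any of the delimiters, keeping the delimiters."""
    # Longest delimiters first, so ':=' is matched before a lone ':'.
    ordered = sorted(delimiters, key=len, reverse=True)
    pattern = '(' + '|'.join(re.escape(d) for d in ordered) + ')'
    return re.split(pattern, text)

print(multi_split('a:=b+c', [':=', '+']))
```

As with the SE version, adjacent delimiters produce an empty string between them; filtering with a list comprehension removes those if they are unwanted.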

It is easy to get carried away and to forget that SE should not be used 
instead of Python's built-ins, or to get carried away doing contextual 
or grammar processing explicitly, which gets messy very fast. SE fills a 
gap somewhere between built-ins and parsers.
 Stream editing is not a mainstream technique. I believe it has the 
potential to make many simple problems trivial and many harder ones 
simpler. This is why I believe the technique deserves more attention, 
which, again, may explain the focus of my posts.

Frederic

-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Seeking assistance - string processing.

2006-11-14 Thread Frederic Rentsch

[EMAIL PROTECTED] wrote:
 I've been working on some code to search for specific textstrings and
 act upon them insome way. I've got the conversion sorted however there
 is 1 problem remaining.

 I am trying to work out how to make it find a string like this ===
 and when it has found it, I want it to add === to the end of the
 line.

 For example.

 The text file contains this:

 ===Heading

 and I am trying to make it be processed and outputted as a .dat file
 with the contents

 ===Heading===

 Here's the code I have got so far.

 import string
 import glob
 import os

 mydir = os.getcwd()
 newdir = mydir#+\\Test\\;

 for filename in glob.glob1(newdir,*.txt):
 #print This is an input file:  + filename
 fileloc = newdir+\\+filename
 #print fileloc

 outputname = filename
 outputfile = string.replace(outputname,'.txt','.dat')
 #print filename
 #print a

 print This is an input file:  + filename + .  Output file:
 +outputfile

 #temp = newdir + \\ + outputfile
 #print temp


 fpi = open(fileloc);
 fpo = open(outputfile,w+);

 output_lines = []
 lines = fpi.readlines()

 for line in lines:
 if line.rfind() is not -1:
 new = line.replace(,)
 elif line.rfind(img:) is not -1:
 new = line.replace(img:,[[Image:)
 elif line.rfind(.jpg) is not -1:
 new = line.replace(.jpg,.jpg]])
 elif line.rfind(.gif) is not -1:
 new = line.replace(.gif,.gif]])
 else:
 output_lines.append(line);
 continue
 output_lines.append(new);

 for line in output_lines:
 fpo.write(line)

 fpi.close()
 fpo.flush()
 fpo.close()


 I hope this gets formatted correctly :-p

 Cheers, hope you can help.

   

Here's a suggestion:

  import SE
  Editor = SE.SE ('==  img:=[[Image: 
.jpg=.jpg]] .gif=.gif]]')
  Editor ('  img: .jpg .gif')# See if it works
'  [[Image: .jpg]] .gif]]'

It works. (Add in other replacements if the need arises.)

Works linewise

  for line in f:
      new_line = Editor (line)
      ...

Or filewise, which comes in handy in your case:

  for in_filename in glob.glob (newdir+'/*.txt'):
      out_filename = in_filename.replace ('.txt','.dat')
      Editor (in_filename, out_filename)


See if that helps. Find SE here: http://cheeseshop.python.org/pypi/SE/2.3
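
The part of the question about closing a heading (turning ===Heading into ===Heading===) can be sketched with plain string methods. This is Python 3 syntax and an illustration only: the replacement pairs are taken from the code quoted in the question, the function names are made up, and the sketch assumes a heading is any line that starts with === and does not already end with it.

```python
import glob
import os

# Replacement pairs as they appear in the quoted code.
REPLACEMENTS = [("img:", "[[Image:"), (".jpg", ".jpg]]"), (".gif", ".gif]]")]

def convert_line(line):
    """Apply the replacements and close an opening '===' heading marker."""
    body = line.rstrip("\r\n")
    ending = line[len(body):]          # keep the original line ending
    for old, new in REPLACEMENTS:
        body = body.replace(old, new)
    if body.startswith("===") and not body.endswith("==="):
        body += "==="
    return body + ending

def convert_directory(directory):
    """Write a converted .dat file next to every .txt file in directory."""
    for in_name in glob.glob(os.path.join(directory, "*.txt")):
        out_name = os.path.splitext(in_name)[0] + ".dat"
        with open(in_name) as src, open(out_name, "w") as dst:
            for line in src:
                dst.write(convert_line(line))

print(convert_line("===Heading\n"), end="")
```

Unlike the if/elif chain in the question, every replacement is applied to every line, so a line containing both img: and .jpg is fully converted.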

Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Character encoding

2006-11-08 Thread Frederic Rentsch

mp wrote:
 I have html document titles with characters like &gt;, &nbsp;, and
 &#135;. How do I decode a string with these values in Python?

 Thanks

   
This is definitely the most frequently asked question. It comes up about once a week.

The stream-editing way is like this:

  import SE
  HTM_Decoder = SE.SE ('htm2iso.se') # Include path

 test_string = '''I have html document titles with characters like &gt;, &nbsp;, and
&#135;. How do I decode a string with these values in Python?'''
 print HTM_Decoder (test_string)
I have html document titles with characters like >,  , and
‡. How do I decode a string with these values in Python?

An SE object does files too.

 HTM_Decoder ('with_codes.txt', 'translated_codes.txt')  # Include path

You could download SE from - http://cheeseshop.python.org/pypi/SE/2.3. The 
translation definitions file htm2iso.se is included. If you open it in your 
editor, you can see how to write your own definition files for other 
translation tasks you may have some other time.
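
Readers on a current Python may prefer the standard library for this particular task. The sketch below assumes Python 3.4 or later, where html.unescape exists; it was not available in the Python versions discussed in this thread, and decode_entities is just a made-up wrapper name.

```python
import html

def decode_entities(text):
    """Replace named and numeric HTML character references in text."""
    return html.unescape(text)

title = "Ren&eacute; &gt; Andr&#233; &amp; co."
print(decode_entities(title))
```

Note that &nbsp; decodes to a no-break space (U+00A0), not an ordinary blank, which matters if the result is split on whitespace or compared with plain text.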

Regards

Frederic



-- 
http://mail.python.org/mailman/listinfo/python-list

Re: string to list of numbers conversion

2006-11-06 Thread Frederic Rentsch

[EMAIL PROTECTED] wrote:
 Hi,
   I have a string '((1,2), (3,4))' and I want to convert this into a
 python tuple of numbers. But I do not want to use eval() because I do
 not want to execute any code in that string and limit it to list of
 numbers.
   Is there any alternative way?

 Thanks.
 Suresh

   
import re
s = '((1,2), (3,4))'
separators = re.compile (r'\(\s*\(|\)\s*,\s*\(|\)\s*\)')
tuple ([(float (n[0]), float (n[1])) for n in
        [pair.split (',') for pair in separators.split (s) if pair]])
((1.0, 2.0), (3.0, 4.0))
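
Later Python versions (2.6 and up) offer ast.literal_eval, which evaluates literals only and refuses function calls or other expressions. A minimal sketch in Python 3 syntax, with a made-up function name and the assumption that the input is always a tuple of number pairs:

```python
import ast

def parse_number_tuples(text):
    """Parse a string such as '((1,2), (3,4))' into a tuple of float tuples."""
    value = ast.literal_eval(text)     # literals only; no code is executed
    return tuple(tuple(float(n) for n in pair) for pair in value)

print(parse_number_tuples('((1,2), (3,4))'))
```

Input that is not a plain literal, such as a function call, makes ast.literal_eval raise ValueError instead of running it.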

Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: ANN: SE 2.3. Available now

2006-11-04 Thread Frederic Rentsch

Fredrik Lundh wrote:
 Frederic Rentsch wrote:

   
 And here's the proof I am being perceived as a nuisance. I apologize, 
 keeping to myself that I don't care.
 

 since you're constantly targeting newbies, and are hawking your stuff 
 also for things for which there are simple and efficient solutions 
 available in Python's standard library, your behavior could be seen
 as more than just a nuisance.

 /F

   
Thank you for making me aware of it. I totally agree with you that 
inefficient complexity should never replace efficient simplicity. SE, in 
fact, is much more a convenient interface complement than a stand-alone 
tool and as such meant to cooperate with the available Python idiom.
  I used this forum to connect with real-life problems, which I 
believe is one of its main purposes. Testing the envelope of my 'stuff' 
I would almost certainly have overstretched it on occasion. And again, 
what's wrong with this absolutely normal development process? And what 
better environment could there be for it than an interest group like 
this one where experts like yourself offer direction?
  I am not targeting newbies (whatever that is). I am not targeting 
at all. I invite response. You had cause to respond as this message of 
yours reveals. But you didn't respond and now you are angry with me for it.
  So, thanks again for the information. Your comments are always 
welcome. I understand your style as an expression of your personality. 
Belligerence passes my notice if I don't see what I have to do with it.

Regards

Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: SE 2.3 temporarily unavailable. Cheese shop defeats upload with erratic behavior. Urgently requesting help.

2006-11-03 Thread Frederic Rentsch

jim-on-linux wrote:
 Frederic,

 I've been trying to get back into my package in 
 the Cheese Shop for over a year. The phone 
 company changed my e:mail address and to make a 
 long and frustrating story short I can't get back 
 into the Cheese Shop to make changes to my file.

 Time is money.  At some time you have to consider 
 if it is worth it.  At least you have the name of 
 your program listed.

 I wish I could be more helpfull. I'll watch the 
 responses you get from others.

 Good Luck,
 jim-on-linux

 http://www.inqvista.com






 On Thursday 02 November 2006 09:00, you wrote:
   
 Some time ago I had managed to upload a small
 package to the Cheese Shop using the data entry
 template. Uploading is in two steps: first the
 text then the package file. When I had a new
 version it went like this: The new text made a
 new page, but the new file went to the old
 
 snip
   


Thanks for letting me know that I am not alone. Do you know of an 
alternative to the Cheese Shop?

Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

ANN: SE 2.3. Available now

2006-11-03 Thread Frederic Rentsch

A few Cheese Shop upload problems have been solved with the help of this 
creative group. Thank you all!

Version 2.2 beta should be phased out. It has a functional defect, 
missing matches with a very low statistical probability. Version 2.3 has 
this fixed.

   Download URL: http://cheeseshop.python.org/pypi/SE/2.3

A list of possible handling improvements is being made to be 
incorporated in the next version. One major flaw of the interface design 
came to light the other day when a user reported a non-functioning 
Editor Object made with a file name. If the constructor cannot find the 
file it records the fact in the object's log without making the user 
aware that his Editor Object is incomplete or altogether devoid of 
substitutions. His obvious conclusion is that the damn thing isn't 
working right.
   Compile errors should certainly be reported at compile time. The next 
version will send all messages being logged also to stderr by default. 
The thing to do with the current version, if it appears to malfunction, 
is to inspect the log and the compiled substitutions.

  Editor = SE.SE ('operators.se')
  Editor.show_log ()

Fri Nov 03 12:49:17 2006 - Compiler - Ignoring single word 
'operators.se'. Not an existing file 'operators.se'.

  Editor = SE.SE ('se/operators.se')   # path was missing
  Editor.show_log ()

(Log is empty. All is well.)

  Editor.show (show_translators = 1)

(snip)

Single-Byte Targets
1: ||-|LT|
2: ||-|GT|

Multi-Byte Targets
3: ||-|AND|
4: -|OR|

etc...

The display makes definition errors conspicuous. Missing definitions 
indicate malformed or redefined (overwritten) ones.


Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: SE

2006-11-03 Thread Frederic Rentsch

C or L Smith wrote:
 Hello,

 I'm evaluating different methods of handling a transliteration (from an 
 ascii-based representation of the devanagari/indian script to a romanized 
 representation). I found SE and have been working with it today. One thing 
 that I ran into (that I don't see a reason for right off) is the behavior 
 where a single period just gets eaten and no output is produced:

   
 import SE
 s=SE.SE('.=period')
 s('A.')
 
 'Aperiod'
   
 s('.')
 

 It's not the fact that it's a single character target that is causing the 
 problem:

   
 s=SE.SE('x=y')
 s('x')
 
 'y'

 I also tried the following:

   
 s=SE.SE('(46)=dot')
 s('A.')
 
 'Adot'
   
 s('.')
 

 Am I missing something?

 /chris


   

No, you are not missing anything. Quite on the contrary. You caught 
something! Thanks for the report.

Here's the patch:

Line 343 in SEL.py is:  if name == '':

Make it:                if name.replace ('.', '') == '':

In Version 2.2 it's line 338. But Version 2.2 should be phased out anyway.

The problem had to do with the automatic typing of the input data that 
tells string input from file names, which of course are also strings. 
The test is done by os.stat (). If it says an input string is an 
existing file, then that file is translated. Admittedly this is not a 
very robust mechanism. In practice, though, I have never had a problem 
with it, because input data just about never are single words. If names 
of existing files are the object of a translation they would be passed 
all together and be unambiguously recognized as a string. To translate a 
single file name SE is hardly a means of choice, but it could still be 
done if the file name is given a leading space.
   Now, if a single dot (or multiple dots) comes along, os.stat does not 
raise an error, and that results in a typing error at that point. The 
patch above takes care of that.
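
The effect can be reproduced outside SE. The sketch below is not SE's actual code; is_existing_file_name is a hypothetical stand-in for the kind of test described, written in Python 3 syntax. It shows that '.' passes an os.stat check because it names the current directory, and how a dots-only guard in the spirit of the patch prevents the misclassification.

```python
import os

def is_existing_file_name(text):
    """Guess whether text names an existing file (hypothetical helper)."""
    # Strings consisting only of dots ('.', '..') name directories and
    # therefore pass os.stat; rule them out first, as the patch does.
    if text.replace('.', '') == '':
        return False
    try:
        os.stat(text)
    except (OSError, ValueError):
        return False
    return True

print(os.path.exists('.'))           # '.' is the current directory
print(is_existing_file_name('.'))
```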

Frederic
















-- 
http://mail.python.org/mailman/listinfo/python-list

Re: ANN: SE 2.3. Available now

2006-11-03 Thread Frederic Rentsch

Gary Herron wrote:
 Frederic Rentsch wrote:
   
 A few Cheese Shop upload problems have been solved with the help of this 
 creative group. Thank you all!

 Version 2.2 beta should be phased out. It has a functional defect, 
 missing matches with a very low statistical probability. Version 2.3 has 
 this fixed.

Download URL: http://cheeseshop.python.org/pypi/SE/2.3
   
 
 As a matter of polite netiquette, a message like this really ought to
 have a paragraph telling us what SE *is*.(Unless it's a secret :-))

   
Thanks for the inquiry. I've been hawking this thing so persistently of 
late that I'm afraid I am starting to be perceived as a nuisance. SE is a 
stream editor that is particularly easy and fast to use. A summary of 
its characteristics is only a click away at the URL a few lines up from 
this line.

Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: ANN: SE 2.3. Available now

2006-11-03 Thread Frederic Rentsch

Fredrik Lundh wrote:
 Gary Herron wrote:

   
 As a matter of polite netiquette, a message like this really ought to
 have a paragraph telling us what SE *is*.(Unless it's a secret :-))
 

 nah, if you've spent more than five minutes on c.l.python lately, you'd 
 noticed that it's the Solution to Everything (up there with pyparsing, I 
 think).

 /F

   
And here's the proof I am being perceived as a nuisance. I apologize, 
keeping to myself that I don't care.

Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: unescape HTML entities

2006-11-02 Thread Frederic Rentsch

Rares Vernica wrote:
 Hi,

 Nice module!

 I downloaded 2.3 and I started to play with it. The file names have 
 funny names, they are all caps, including extension.

 For example the main module file is SE.PY. Is you try import SE it 
 will not work as Python expects the file extension to be py.

 Thanks,
 Ray

 Frederic Rentsch wrote:
   
 Rares Vernica wrote:
 
 Hi,

 How can I unescape HTML entities like &nbsp;?

 I know about xml.sax.saxutils.unescape() but it only deals with &amp;, 
 &lt;, and &gt;.

 Also, I know about htmlentitydefs.entitydefs, but not only this 
 dictionary is the opposite of what I need, it does not have &nbsp;.

 It has to be in python 2.4.

 Thanks a lot,
 Ray

   
 One way is this:

   import SE    # Download from http://cheeseshop.python.org/pypi/SE/2.2%20beta
   SE.SE ('HTM2ISO.se')('input_file_name', 'output_file_name')    # HTM2ISO.se is included
 'output_file_name'

 For repeated translations the SE object would be assigned to a variable:

   HTM_Decoder = SE.SE ('HTM2ISO.se')

 SE objects take and return strings as well as file names which is useful 
 for translating string variables, doing line-by-line translations and 
 for interactive development or verification. A simple way to check a 
 substitution set is to use its definitions as test data. The following 
 is a section of the definition file HTM2ISO.se:

 test_string = '''
 &oslash;=(xf8)   #  248  f8
 &ugrave;=(xf9)   #  249  f9
 &uacute;=(xfa)   #  250  fa
 &ucirc;=(xfb)    #  251  fb
 &uuml;=(xfc)     #  252  fc
 &yacute;=(xfd)   #  253  fd
 &thorn;=(xfe)    #  254  fe
 &#233;=(xe9)
 &#234;=(xea)
 &#235;=(xeb)
 &#236;=(xec)
 &#237;=(xed)
 &#238;=(xee)
 &#239;=(xef)
 '''

   print HTM_Decoder (test_string)

 ø=(xf8)   #  248  f8
 ù=(xf9)   #  249  f9
 ú=(xfa)   #  250  fa
 û=(xfb)#  251  fb
 ü=(xfc) #  252  fc
 ý=(xfd)   #  253  fd
 þ=(xfe)#  254  fe
 é=(xe9)
 ê=(xea)
 ë=(xeb)
 ì=(xec)
 í=(xed)
 î=(xee)
 ï=(xef)

 Another feature of SE is modularity.

   strip_tags = '''
~<(.|\x0a)*?>~=(9)   # one tag to one tab
~<!--(.|\x0a)*?-->~=(9)  # one comment to one tab
 |   # run
~\x0a[ \x09\x0d\x0a]*~=(x0a)   # delete empty lines
~\t+~=(32)   # one or more tabs to one space
~\x20\t+~=(32)   # one space and one or more tabs to one space
~\t+\x20~=(32)   # one or more tab and one space to one space
 '''

   HTM_Stripper_Decoder = SE.SE (strip_tags + ' HTM2ISO.se ')   # 
 Order doesn't matter

 If you write 'strip_tags' to a file, say 'STRIP_TAGS.se' you'd name it 
 together with HTM2ISO.se:

   HTM_Stripper_Decoder = SE.SE ('STRIP_TAGS.se  HTM2ISO.se')   # 
 Order doesn't matter

 Or, if you have two SE objects, one for stripping tags and one for 
 decoding the ampersands, you can nest them like this:

   test_string = "<p class=MsoNormal style='line-height:110%'><i>Ren&eacute;</i> est un gar&ccedil;on qui para&icirc;t plus &acirc;g&eacute;. </p>"

   print Tag_Stripper (HTM_Decoder (test_string))
   René est un garçon qui paraît plus âgé.

 Nesting works with file names too, because file names are returned:

   Tag_Stripper (HTM_Decoder ('input_file_name'), 'output_file_name')
 'output_file_name'


 Frederic



 

   
Arrrgh!

Did it again capitalizing extensions. We had solved this problem and 
here we have it again. I am so sorry. Fortunately it isn't hard to 
solve, renaming the files once one identifies the problem, which you 
did. I shall change the upload within the next sixty seconds.

Frederic

I'm glad you find it useful.


-- 
http://mail.python.org/mailman/listinfo/python-list

SE 2.3 temporarily unavailable. Cheese shop defeats upload with erratic behavior. Urgently requesting help.

2006-11-02 Thread Frederic Rentsch

Some time ago I had managed to upload a small package to the Cheese Shop 
using the data entry template. Uploading is in two steps: first the text 
then the package file. When I had a new version it went like this: The 
new text made a new page, but the new file went to the old page. The old 
page then had both files and all attempts at uploading the new file to 
the new page failed with the error message that a file could not be 
uploaded if it was already there. So I let it go and handed out the url 
of the old page, figuring that given the choice of two versions, picking 
the latest one was well within the capabilities of anyone.
  One downloader just now made me aware that the new version had 
misspelled extensions 'PY'. They should be lower case. So I fixed it and 
tried to exchange the file. One hour later I have three text pages, 
headed V2,2beta, V2,2beta/V2.3  and V2.3. The first (oldest) page has 
the old package file. The two other pages have no files. The new version 
package is gone, because, prior to re-uploading I was asked to delete 
it. The re-upload fails on all three pages with the message: 'Error 
processing form, invalid distribution file'. The file is a zip file made 
exactly the way I had made and uploaded the ones before. I am banging my 
real head against the proverbial wall. This thing definitely behaves 
erratically and I see no alternative other than to stop banging and go 
to the gym to take my anger out on machines. When I come back in an 
hour, I hope a kind, knowledgeable soul will have posted some good 
advice on how to vanquish such stubborn obstinacy. I have disseminated 
the url, and the thought that someone might show up there and not find 
the promised goods makes me really unhappy.
  Until such time as this upload is made, I will be happy to send 
SE-2.3 out off list by request. 

Infinite thanks

Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: unescape HTML entities

2006-11-02 Thread Frederic Rentsch

Rares Vernica wrote:
 Hi,

 I downloades 2.2 beta, just to be sure I have the same version as you 
 specify. (The file names are no longer funny.) Anyway, it does not seem 
 to do as you said:

 In [14]: import SE

 In [15]: SE.version
 --- SE.version()
 Out[15]: 'SE 2.2 beta - SEL 2.2 beta'

 In [16]: HTM_Decoder = SE.SE ('HTM2ISO.se')

 In [17]: test_string = '''
 : &oslash;=(xf8)   #  248  f8
 : &ugrave;=(xf9)   #  249  f9
 : &uacute;=(xfa)   #  250  fa
 : &ucirc;=(xfb)    #  251  fb
 : &uuml;=(xfc)     #  252  fc
 : &yacute;=(xfd)   #  253  fd
 : &thorn;=(xfe)    #  254  fe
 : &#233;=(xe9)
 : &#234;=(xea)
 : &#235;=(xeb)
 : &#236;=(xec)
 : &#237;=(xed)
 : &#238;=(xee)
 : &#239;=(xef)
 : '''

 In [18]: print HTM_Decoder (test_string)

 &oslash;=(xf8)   #  248  f8
 &ugrave;=(xf9)   #  249  f9
 &uacute;=(xfa)   #  250  fa
 &ucirc;=(xfb)    #  251  fb
 &uuml;=(xfc)     #  252  fc
 &yacute;=(xfd)   #  253  fd
 &thorn;=(xfe)    #  254  fe
 &#233;=(xe9)
 &#234;=(xea)
 &#235;=(xeb)
 &#236;=(xec)
 &#237;=(xed)
 &#238;=(xee)
 &#239;=(xef)


 In [19]:

 Thanks,
 Ray





Ray,

I am sorry you're having a problem. I cannot duplicate it. It works fine 
here. I suspect that SE.SE doesn't find your file HTM2ISO.SE. Do this:

  HTM_Decoder = SE.SE ('HTM2ISO.SE')
  HTM_Decoder.show_log ()

Thu Nov 02 15:15:39 2006 - Compiler - Ignoring single word 'HTM2ISO.SE'. 
Not an existing file 'HTM2ISO.SE'.

If you see this, then you might have forgotten to include the path with 
the file name.

Rather than getting an old version, you could just have renamed the two 
py-files. Version 2.3 has some minor bugs corrected. I fixed the names 
and tried to re-upload to the Cheese Shop and the damn thing stubbornly 
refuses the upload after having required that I delete the file I was 
going to replace. So it isn't there anymore and the replacement isn't 
there yet. I'll

Re: Where do nested functions live?

2006-11-01 Thread Frederic Rentsch

Rob Williscroft wrote:
 Frederic Rentsch wrote in news:mailman.1556.1162316571.11739.python-
 [EMAIL PROTECTED] in comp.lang.python:

   
 Rob Williscroft wrote:
 
 Frederic Rentsch wrote in news:mailman.1536.1162292996.11739.python-
   
 Rob Williscroft wrote:
 
 Frederic Rentsch wrote in news:mailman.1428.1162113628.11739.python-
   

 [snip]

   
   
   
 Here I'm lost. What's the advantage of this? It looks more convoluted.
 
 
 I'll agree that having to explicitly define a namespace class first 
 does add some convolution. 

 But if you refer to having to prefix your outer variables with
 scope. then this would be the same as claiming that the explicit use
 of self is convoluted, which is a valid opinion, so fair enough, but
 I can't say that I agree. 

   
   
 I didn't mean to call it into question. I didn't understand the advantage 
 of the added complexity of your second example over the first.

 

 Ok, I missed your point, as for the second example it was an attempt 
 to show that further refactoring from a function with local functions
 that are sharing some state via the scope object, to a class with
 methods that share state via the instance, is a matter of editing
 a few lines.  

 This is useful when a function grows too large (for some value of 
 too large). As an added bonus you get to use the same techniques
 with both styles of coding.
  
 [snip]

 Rob.
   

Rob,

Thanks a lot for your input. I'll have to digest that. Another question 
I had and forgot was this: Since we have a class that goes out of scope 
when the function returns, and we don't need more than one instance, why 
bother to make an instance? Why not use the class object itself?

def whatever( new_ms ):

   class scope ( object ):
      pass

   def inner():
      scope.mseconds = new_ms - s * 1000
      m, scope.seconds = divmod (s, 60)
      h, scope.minutes = divmod (m, 60)
      d, scope.hours = divmod (h, 24)
      scope.weeks, scope.days = divmod (d, 7)
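
For anyone who wants to try the idea, here is a minimal runnable sketch in present-day Python 3 (not from the original exchange). It assumes `s` is derived from `new_ms` inside `inner`, and the returned tuple is only there so the result can be inspected; neither detail appears in the snippet above.

```python
def whatever(new_ms):

    class scope(object):      # the class object itself is the namespace
        pass                  # it is never instantiated

    def inner():
        s = int(new_ms / 1000.0)                 # assumption: whole seconds
        scope.mseconds = new_ms - s * 1000
        m, scope.seconds = divmod(s, 60)
        h, scope.minutes = divmod(m, 60)
        d, scope.hours = divmod(h, 24)
        scope.weeks, scope.days = divmod(d, 7)

    inner()
    # Returned only for inspection in this sketch
    return (scope.weeks, scope.days, scope.hours,
            scope.minutes, scope.seconds, scope.mseconds)


print(whatever(90061500))   # one day, one hour, one minute, 1.5 seconds
```

Since a fresh `scope` class is created on every call to `whatever`, attributes set on it do not leak from one call to the next.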


Frederic






-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Where do nested functions live?

2006-10-31 Thread Frederic Rentsch

Rob Williscroft wrote:
 Frederic Rentsch wrote in news:mailman.1428.1162113628.11739.python-
 [EMAIL PROTECTED] in comp.lang.python:

   
def increment_time (interval_ms):
   outer weeks, days, hours, minutes, seconds, mseconds   # 'outer' akin to 'global'
   (...)
   mseconds = new_ms - s * 1000# Assignee remains outer
   m, seconds = divmod (s, 60)
   h, minutes = divmod (m, 60)
   d, hours = divmod (h, 24)
   weeks, days = divmod (d, 7) # No return necessary

 The call would now be:

increment_time (msec)  # No reassignment necessary


 Hope this makes sense
 

 Yes it does, but I prefer explicit in this case:

 def whatever( new_ms ):
   class namespace( object ):
     pass
   scope = namespace()

   def inner():
     scope.mseconds = new_ms - s * 1000
     m, scope.seconds = divmod (s, 60)
     h, scope.minutes = divmod (m, 60)
     d, scope.hours = divmod (h, 24)
     scope.weeks, scope.days = divmod (d, 7)

   

This is interesting. I am not too familiar with this way of using 
objects. Actually it isn't all that different from a list, because a 
list is also an object. But this way it's attribute names instead of 
list indexes which is certainly easier to work with. Very good!

 The only thing I find annoying is that I can't write:

   scope = object()
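
A short sketch for readers on a current interpreter (Python 3.3 or later, so not available to anyone in this thread): it shows why a bare `object()` refuses new attributes and how the standard library's `types.SimpleNamespace` fills that gap. The attribute name `seconds` is just an illustration.

```python
from types import SimpleNamespace

try:
    scope = object()
    scope.seconds = 42            # plain object instances have no __dict__
except AttributeError as error:
    print("object():", error)

scope = SimpleNamespace()         # accepts arbitrary attributes
scope.seconds = 42
print(scope.seconds)
```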
  
 Additionally if appropriate I can refactor further:

 def whatever( new_ms ):
   class namespace( object ):
     def inner( scope ):
       scope.mseconds = new_ms - s * 1000
       m, scope.seconds = divmod (s, 60)
       h, scope.minutes = divmod (m, 60)
       d, scope.hours = divmod (h, 24)
       scope.weeks, scope.days = divmod (d, 7)

   scope = namespace()
   scope.inner()

 In short I think an outer keyword (or whatever it gets called)
 will just add another way of doing something I can already do,
 and potentially makes further refactoring harder.

   

Here I'm lost. What's the advantage of this? It looks more convoluted. 
And speaking of convoluted, what about efficiency? There is much talk of 
efficiency on this forum. I (crudely) benchmark your previous example 
approximately three times slower than a simple inner function taking and 
returning three parameters. It was actually the aspect of increased 
efficiency that prompted me to play with the idea of allowing direct 
outer writes.
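
A crude comparison of this kind can be sketched with `timeit` as follows (Python 3; the function names and the test value are made up for illustration, and the ratio will vary by machine and interpreter version). Both variants compute the same three values, one via a scope object and one via arguments and return values.

```python
import timeit


def with_scope_object(new_ms):
    class namespace(object):          # class is rebuilt on every call
        pass
    scope = namespace()

    def inner():
        s = new_ms // 1000
        scope.mseconds = new_ms - s * 1000
        m, scope.seconds = divmod(s, 60)
        h, scope.minutes = divmod(m, 60)

    inner()
    return scope.minutes, scope.seconds, scope.mseconds


def with_return_values(new_ms):
    def inner(ms):
        s = ms // 1000
        ms -= s * 1000
        m, s = divmod(s, 60)
        h, m = divmod(m, 60)
        return m, s, ms

    return inner(new_ms)


for func in (with_scope_object, with_return_values):
    seconds = timeit.timeit(lambda: func(3725250), number=10000)
    print("%-20s %.4f s" % (func.__name__, seconds))
```

Much of the cost in the first variant probably comes from building a new class on each call rather than from the attribute writes themselves, so the numbers say little about what an `outer` declaration would save.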

 That's -2 import-this points already.

   

Which ones are the two?

 Rob.
   

Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Where do nested functions live?

2006-10-31 Thread Frederic Rentsch

Rob Williscroft wrote:
 Frederic Rentsch wrote in news:mailman.1536.1162292996.11739.python-
 [EMAIL PROTECTED] in comp.lang.python:

   
 Rob Williscroft wrote:
 
 Frederic Rentsch wrote in news:mailman.1428.1162113628.11739.python-
 [EMAIL PROTECTED] in comp.lang.python:

   
   

   
 def whatever( new_ms ):
   class namespace( object ):
     pass
   scope = namespace()

   def inner():
     scope.mseconds = new_ms - s * 1000
     m, scope.seconds = divmod (s, 60)
     h, scope.minutes = divmod (m, 60)
     d, scope.hours = divmod (h, 24)
     scope.weeks, scope.days = divmod (d, 7)

   
   
 This is interesting. I am not too familiar with this way of using 
 objects. Actually it isn't all that different from a list, because a 
 list is also an object. But this way it's attribute names instead of 
 list indexes which is certainly easier to work with. Very good!

 

   
 In short I think an outer keyword (or whatever it gets called)
 will just add another way of doing something I can already do,
 and potentially makes further refactoring harder.

   
   
 Here I'm lost. What's the advantage of this? It looks more convoluted.
 

 I'll agree that having to explicitly define a namespace class first
 does add some convolution.

 But if you refer to having to prefix your outer variables with scope.
 then this would be the same as claiming that the explicit use of self is
 convoluted, which is a valid opinion, so fair enough, but I can't say 
 that I agree.

   

I didn't mean to call it into question. I didn't understand the advantage 
of the added complexity of your second example over the first.

 It should also be noted that although I have to declare and create a 
 scope object, my method doesn't require the attributes passed back from 
 the inner function to be pre-declared. Should I realise during coding
 that I need to return another attribute, I just assign it in the inner
 function and use it in the outer function. I would count that as less
 convoluted, YMMV.
   

That is certainly a very interesting aspect.
  
   
 And speaking of convoluted, what about efficiency? There is much talk
 of efficiency on this forum. I (crudely) benchmark your previous 
 example approximately three times slower than a simple inner function 
 taking and returning three parameters. It was actually the aspect of
 increased efficiency that prompted me to play with the idea of 
 allowing direct outer writes.
 

 Did you have optimisation turned on ?

   

No. I did a hundred thousand loops over each in IDLE using xrange.

 As I see it there should be no reason an optimiser couldn't transform 
 my code into the same code we might expect from your outer keyword
 example, as the scope object's type and attributes are all contained 
 within (thus known to) the outer function definition.

   

I doubt that very much. The 'outer' keyword would give me the choice 
between two alternatives. Such a choice can hardly be automated.

 Whether CPython's optimiser does this or not, I don't know.

   
 That's -2 import-this points already.

   
   
 Which ones are the two?
 

 Explicit is better than implicit.
 There should be one-- and preferably only one --obvious way to do it.

 Rob.
   


Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: How to convert &nbsp; in a string to blank space?

2006-10-30 Thread Frederic Rentsch

一首诗 wrote:
 Oh, I didn't make myself clear.

 What I mean is how to convert a piece of html to plain text but keep as
 much format as possible.

 Such as convert &nbsp; to blank space and convert <br> to \r\n

 Gary Herron wrote:
   
 一首诗 wrote:
 
 Is there any simple way to solve this problem?


   
 Yes, strings have a replace method:

 
 >>> s = "abc&nbsp;def"
 >>> s.replace('&nbsp;',' ')
 'abc def'

 Also various modules that are meant to deal with web and xml and such
 have functions to do such operations.


 Gary Herron
 

   
  my_translations = '''
   "&nbsp;= "
   # <br>=\r\n  <BR>=\r\n   # Windows
   <br>=\n  <BR>=\n         # Linux
   # Add others to your heart's content
'''

  import SE  # From http://cheeseshop.python.org/pypi/SE/2.2%20beta

  My_Translator = SE.SE (my_translations)

  print My_Translator ('ABC&nbsp;DEFG<br>XYZ')
ABC DEFG
XYZ

SE can also strip tags and translate all HTM escapes and generally lets 
you do ad hoc translations in seconds. You just write them up, make an 
SE object from your text and run your data through it. As simple as that.
  If you wish further explanations, I'll be happy to explain.
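
For comparison, a minimal standard-library sketch of the same conversion in present-day Python 3 (this is not SE and was not available on the 2.x interpreters discussed here). `html.unescape` is a real stdlib function; the helper name `html_to_text` and the `<br>` pattern are illustrative assumptions.

```python
import html
import re


def html_to_text(fragment, newline="\n"):
    # Turn <br>, <BR>, <br/> and <br /> into line breaks first
    text = re.sub(r"(?i)<br\s*/?>", newline, fragment)
    # Decode entities; &nbsp; becomes U+00A0 (no-break space)
    text = html.unescape(text)
    # Map the no-break space to an ordinary blank
    return text.replace("\xa0", " ")


print(html_to_text("ABC&nbsp;DEFG<br>XYZ"))
```

Pass `newline="\r\n"` for Windows-style line ends.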

Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Where do nested functions live?

2006-10-29 Thread Frederic Rentsch

Diez B. Roggisch wrote:
 If I may turn the issue around, I could see a need for an inner function 
 to be able to access the variables of the outer function, the same way a 
 function can access globals. Why? Because inner functions serve to 
 de-multiply code segments one would otherwise need to repeat or to 
 provide a code segment with a name suggestive of its function. In either 
 case the code segment moved to the inner function loses contact with its 
 environment, which rather mitigates its benefit.
 

 Maybe I'm dense here, but where is your point? Python has nested lexical 
 scoping, and while some people complain about its actual semantics, it 
 works very well:

 def outer():
     outer_var = 10
     def inner():
         return outer_var * 20
     return inner

 print outer()()



 Diez
   
My point is that an inner function operating on existing outer variables 
should be allowed to do so directly. Your example in its simplicity is 
unproblematic. Let us consider a case where several outer variables need 
to be changed:

   weeks = days = hours = minutes = seconds = 0
   mseconds = 0.0

   (code)

   # add interval in milliseconds
   have_ms = (((((weeks * 7 + days) * 24 + hours) * 60 + minutes) * 60
               + seconds) * 1000 + mseconds)
   new_ms = have_ms + interval_ms
   # reconvert
   s = new_ms / 1000.0
   s = int (s)
   mseconds = new_ms - s * 1000
   m, seconds = divmod (s, 60)
   h, minutes = divmod (m, 60)
   d, hours = divmod (h, 24)
   weeks, days = divmod (d, 7)

   (more code)

At some later point I need to increment my units some more and probably 
will again a number of times. Clearly this has to go into a function. I 
make it an inner function, because the scope of its service happens to 
be local to the function in which it comes to live. It operates on 
existing variables of what is now its containing function.

   def increment_time (interval_ms):
      have_ms = (((((weeks * 7 + days) * 24 + hours) * 60 + minutes) * 60
                  + seconds) * 1000 + mseconds)
      new_ms = have_ms + interval_ms
      # reconvert
      s = new_ms / 1000.0
      s = int (s)
      ms = new_ms - s * 1000   # Was mseconds = new_ms - s * 1000
      m, s = divmod (s, 60)    # Was m, seconds = divmod (s, 60)
      h, m = divmod (m, 60)    # Was h, minutes = divmod (m, 60)
      d, h = divmod (h, 24)    # Was d, hours = divmod (h, 24)
      w, d = divmod (d, 7)     # Was weeks, days = divmod (d, 7)
      return w, d, h, m, s, ms

Turning this into a function, I must change the names of the assignees. 
Assigning to the outer names would make them local, their outer namesakes 
would become invisible and I'd have to pass them all as arguments. It is 
simpler to rename the assignees, which keeps the outer variables visible 
and so avoids passing arguments. In either case I have to return the 
result for reassignment by the caller.

   weeks, days, hours, minutes, seconds, mseconds = increment_time (msec)

This is a little like a shop where the mechanics have to get their tools 
and work pieces from the manager and hand them back to him when they're 
done. The following two examples are illustrations of my point. They are 
not proposals for 'improvement' of a language I would not presume to 
improve. 

   def increment_time (interval_ms):
      outer weeks, days, hours, minutes, seconds, mseconds   # 'outer' akin to 'global'
  (...)
  mseconds = new_ms - s * 1000# Assignee remains outer
  m, seconds = divmod (s, 60)
  h, minutes = divmod (m, 60)
  d, hours = divmod (h, 24)
  weeks, days = divmod (d, 7) # No return necessary

The call would now be:

   increment_time (msec)  # No reassignment necessary
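
For later readers: Python 3 added a `nonlocal` statement (PEP 3104) that behaves much like the hypothetical `outer` above. A minimal sketch, assuming a containing function `make_clock` and a helper `read`, both invented here so the example can run on its own:

```python
def make_clock():
    weeks = days = hours = minutes = seconds = 0
    mseconds = 0.0

    def increment_time(interval_ms):
        # Rebind the enclosing function's variables directly
        nonlocal weeks, days, hours, minutes, seconds, mseconds
        have_ms = (((((weeks * 7 + days) * 24 + hours) * 60 + minutes) * 60
                    + seconds) * 1000 + mseconds)
        new_ms = have_ms + interval_ms
        s = int(new_ms / 1000.0)
        mseconds = new_ms - s * 1000
        m, seconds = divmod(s, 60)
        h, minutes = divmod(m, 60)
        d, hours = divmod(h, 24)
        weeks, days = divmod(d, 7)       # no return necessary

    def read():
        return weeks, days, hours, minutes, seconds, mseconds

    return increment_time, read


increment_time, read = make_clock()
increment_time(61500)       # 1 min 1.5 s
increment_time(3600000)     # plus one hour
print(read())
```

On the Python 2 interpreters discussed in this thread `nonlocal` is a syntax error, so the scope-object and return-value approaches remain the options there.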


Hope this makes sense

Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: unescape HTML entities

2006-10-29 Thread Frederic Rentsch

Rares Vernica wrote:
 Hi,

 How can I unescape HTML entities like &nbsp;?

 I know about xml.sax.saxutils.unescape() but it only deals with &amp;, 
 &lt;, and &gt;.

 Also, I know about htmlentitydefs.entitydefs, but not only is this 
 dictionary the opposite of what I need, it does not have &nbsp;.

 It has to be in python 2.4.

 Thanks a lot,
 Ray

One way is this:

  import SE  # Download from http://cheeseshop.python.org/pypi/SE/2.2%20beta
  SE.SE ('HTM2ISO.se')('input_file_name', 'output_file_name')  # HTM2ISO.se is included
'output_file_name'

For repeated translations the SE object would be assigned to a variable:

  HTM_Decoder = SE.SE ('HTM2ISO.se')

SE objects take and return strings as well as file names which is useful 
for translating string variables, doing line-by-line translations and 
for interactive development or verification. A simple way to check a 
substitution set is to use its definitions as test data. The following 
is a section of the definition file HTM2ISO.se:

test_string = '''
&oslash;=(xf8)   #  248  f8
&ugrave;=(xf9)   #  249  f9
&uacute;=(xfa)   #  250  fa
&ucirc;=(xfb)    #  251  fb
&uuml;=(xfc)     #  252  fc
&yacute;=(xfd)   #  253  fd
&thorn;=(xfe)    #  254  fe
&#233;=(xe9)
&#234;=(xea)
&#235;=(xeb)
&#236;=(xec)
&#237;=(xed)
&#238;=(xee)
&#239;=(xef)
'''

  print HTM_Decoder (test_string)

ø=(xf8)   #  248  f8
ù=(xf9)   #  249  f9
ú=(xfa)   #  250  fa
û=(xfb)#  251  fb
ü=(xfc) #  252  fc
ý=(xfd)   #  253  fd
þ=(xfe)#  254  fe
é=(xe9)
ê=(xea)
ë=(xeb)
ì=(xec)
í=(xed)
î=(xee)
ï=(xef)
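
Without SE, the same decoding can be sketched with a regular expression and the standard entity table. This version is written for Python 3 (`html.entities.name2codepoint`); on 2.4 the table lives in `htmlentitydefs` and `unichr` takes the place of `chr`. The function name `decode_entities` is made up for this example.

```python
import re
from html.entities import name2codepoint

# Decimal (&#233;), hexadecimal (&#xE9;) and named (&eacute;) references
_entity_re = re.compile(r"&(#\d+|#[xX][0-9a-fA-F]+|\w+);")


def decode_entities(text):
    def replace(match):
        body = match.group(1)
        if body[:2] in ("#x", "#X"):
            return chr(int(body[2:], 16))
        if body.startswith("#"):
            return chr(int(body[1:]))
        if body in name2codepoint:
            return chr(name2codepoint[body])
        return match.group(0)          # unknown entity: leave untouched
    return _entity_re.sub(replace, text)


print(decode_entities("Ren&eacute; est un gar&ccedil;on &#233; &bogus;"))
```

Unknown names are passed through unchanged rather than dropped, which makes it easy to spot entities the table does not cover.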

Another feature of SE is modularity.

  strip_tags = '''
   ~<(.|\x0a)*?>~=(9)   # one tag to one tab
   ~<!--(.|\x0a)*?-->~=(9)  # one comment to one tab
|   # run
   ~\x0a[ \x09\x0d\x0a]*~=(x0a)   # delete empty lines
   ~\t+~=(32)   # one or more tabs to one space
   ~\x20\t+~=(32)   # one space and one or more tabs to one space
   ~\t+\x20~=(32)   # one or more tabs and one space to one space
'''

  HTM_Stripper_Decoder = SE.SE (strip_tags + ' HTM2ISO.se ')   # Order doesn't matter

If you write 'strip_tags' to a file, say 'STRIP_TAGS.se', you'd name it 
together with HTM2ISO.se:

  HTM_Stripper_Decoder = SE.SE ('STRIP_TAGS.se  HTM2ISO.se')   # Order doesn't matter

Or, if you have two SE objects, one for stripping tags and one for 
decoding the ampersands, you can nest them like this:

  test_string = "<p class=MsoNormal style='line-height:110%'><i>Ren&eacute;</i> est un gar&ccedil;on qui para&icirc;t plus &acirc;g&eacute;. </p>"

  print Tag_Stripper (HTM_Decoder (test_string))
  René est un garçon qui paraît plus âgé.

Nesting works with file names too, because file names are returned:

  Tag_Stripper (HTM_Decoder ('input_file_name'), 'output_file_name')
'output_file_name'


Frederic



-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Where do nested functions live?

2006-10-29 Thread Frederic Rentsch

Fredrik Lundh wrote:
 Frederic Rentsch wrote:

   
 At some later point I need to increment my units some more and probably 
 will again a number of times. Clearly this has to go into a function.
 

 since Python is an object-based language, clearly you could make your 
 counter into a self-contained object instead of writing endless amounts 
 of code and wasting CPU cycles by storing what's really a *single* state 
 in a whole bunch of separate variables.
   
This is surely a good point I'll have to think about.
 in your specific example, you can even use an existing object:
   
Of course. But my example wasn't about time. It was about the situation
 t = datetime.datetime.now()

 # increment
 t += datetime.timedelta(milliseconds=msec)

 print t.timetuple() # get the contents

 if you're doing this so much that it's worth streamlining the timedelta 
 addition, you can wrap the datetime instance in a trivial class, and do

 t += 1500 # milliseconds

 when you need to increment the counter.
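
 A minimal sketch of such a trivial wrapper, in Python 3 syntax (the class name `MillisecondCounter` and the fixed start time are invented here; the post above does not spell the class out):

```python
import datetime


class MillisecondCounter(object):
    """Wraps a datetime so that `t += n` advances it by n milliseconds."""

    def __init__(self, start):
        self.moment = start

    def __iadd__(self, milliseconds):
        self.moment += datetime.timedelta(milliseconds=milliseconds)
        return self                     # keep the same counter object

    def timetuple(self):
        return self.moment.timetuple()


t = MillisecondCounter(datetime.datetime(2006, 10, 29, 12, 0, 0))
t += 1500   # milliseconds
print(t.moment)
```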

   This is a little like a shop where the mechanics have to get their
   tools and work pieces from the manager and hand them back to him when 
   they're done.

 that could of course be because when he was free to use whatever tool he 
 wanted, he always used a crowbar, because he hadn't really gotten around 
 to read that tool kit for dummies book.
   
No mechanic always uses a crowbar. He'd use it just once--with the same 
employer.
 /F

   


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Where do nested functions live?

2006-10-28 Thread Frederic Rentsch

Fredrik Lundh wrote:
 Steven D'Aprano wrote:

   
 I defined a nested function:

 def foo():
     def bar():
         return "bar"
     return "foo " + bar()

 which works. Knowing how Python loves namespaces, I thought I could do
 this:

 
 >>> foo.bar()
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
 AttributeError: 'function' object has no attribute 'bar'

 but it doesn't work as I expected.

 where do nested functions live?
 

 in the local variable of an executing function, just like the variable 
 bar in the following function:

  def foo():
      bar = "who am I? where do I live?"

 (yes, an inner function is *created* every time you execute the outer 
 function.  but it's created from prefabricated parts, so that's not a 
 very expensive process).
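
 The "prefabricated parts" remark can be checked with a small sketch (Python 3 attribute names; `foo` here returns the inner function so it can be inspected, which differs from the example in the question):

```python
def foo():
    def bar():
        return "bar"
    return bar


first = foo()
second = foo()

# A new function object is built on every call of foo ...
print(first is second)
# ... but both share the one code object compiled along with foo
print(first.__code__ is second.__code__)
# and bar never becomes an attribute of foo
print(hasattr(foo, "bar"))
```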

 /F

   
If I may turn the issue around, I could see a need for an inner function 
to be able to access the variables of the outer function, the same way a 
function can access globals. Why? Because inner functions serve to 
de-multiply code segments one would otherwise need to repeat or to 
provide a code segment with a name suggestive of its function. In either 
case the code segment moved to the inner function loses contact with its 
environment, which rather mitigates its benefit.
   If I have an inner function that operates on quite a few outer 
variables it would be both convenient and surely more efficient, if I 
could start the inner function with a declaration analogous to a 
declaration of globals, listing the outer variables which I wish to 
remain writable directly.
   I guess I could put the outer variables into a list as argument to 
the inner function. But while this relieves the inner function of 
returning lots of values it burdens the outer function with handling the 
list which it wouldn't otherwise need.

Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Search Replace

2006-10-27 Thread Frederic Rentsch

DataSmash wrote:
 Hello,
 I need to search and replace 4 words in a text file.
 Below is my attempt at it, but this code appends
 a copy of the text file within itself 4 times.
 Can someone help me out.
 Thanks!

 # Search & Replace
 file = open("text.txt", "r")
 text = file.read()
 file.close()

 file = open("text.txt", "w")
 file.write(text.replace("Left_RefAddr", "FromLeft"))
 file.write(text.replace("Left_NonRefAddr", "ToLeft"))
 file.write(text.replace("Right_RefAddr", "FromRight"))
 file.write(text.replace("Right_NonRefAddr", "ToRight"))
 file.close()

   

Here's a perfect problem for a stream editor, like 
http://cheeseshop.python.org/pypi/SE/2.2%20beta. This is how it works:

  replacement_definitions = '''
   Left_RefAddr=FromLeft
   Left_NonRefAddr=ToLeft
   Right_RefAddr=FromRight
   Right_NonRefAddr=ToRight
'''
  import SE
  Replacements = SE.SE (replacement_definitions)
  Replacements ('text.txt', 'new_text.txt')

That's all! Or in place:

  ALLOW_IN_PLACE = 3
  Replacements.set (file_handling_flag = ALLOW_IN_PLACE)
  Replacements ('text.txt')

This should solve your task.

An SE object takes strings too, which is required for line-by-line 
processing and is very useful for development or verification:

  print Replacements (replacement_definitions)   # Use definitions as test data

   FromLeft=FromLeft
   ToLeft=ToLeft
   FromRight=FromRight
   ToRight=ToRight

Checks out. All substitutions are made.
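
For reference, the original attempt can also be repaired without SE. The sketch below (Python 3 syntax; the helper name `replace_in_file` and the temporary demo file are assumptions) applies all four replacements to the text in memory and writes the result once, instead of writing the whole text after each replacement.

```python
import os
import tempfile

REPLACEMENTS = {
    "Left_RefAddr": "FromLeft",
    "Left_NonRefAddr": "ToLeft",
    "Right_RefAddr": "FromRight",
    "Right_NonRefAddr": "ToRight",
}


def replace_in_file(path, replacements):
    with open(path, "r") as handle:
        text = handle.read()
    for old, new in replacements.items():
        text = text.replace(old, new)     # accumulate all substitutions
    with open(path, "w") as handle:
        handle.write(text)                # write exactly once


# Demo on a throwaway file
demo_path = os.path.join(tempfile.mkdtemp(), "text.txt")
with open(demo_path, "w") as handle:
    handle.write("Left_RefAddr Left_NonRefAddr Right_RefAddr Right_NonRefAddr\n")
replace_in_file(demo_path, REPLACEMENTS)
with open(demo_path) as handle:
    result = handle.read()
print(result)
```

Sequential `str.replace` calls are safe here because no search word occurs inside another search word or inside a replacement; where that is not the case, a single-pass tool is the more robust choice.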


Regards

Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Attempting to parse free-form ANSI text.

2006-10-25 Thread Frederic Rentsch

Dennis Lee Bieber wrote:
 On Mon, 23 Oct 2006 20:34:20 +0100, Steve Holden [EMAIL PROTECTED]
 declaimed the following in comp.lang.python:


   
 Don't give up, attach it as a file!

 
   Which might be acceptable on a mailing list, but might be
 problematic on a text newsgroup... Though one attachment a year might
 not be noticed by those providers with strict binaries in binary groups
 only G
   
The comment isn't lost on me. Much less as it runs in an open door. 
Funny thing is that I verified my settings by sending the message to 
myself and it looked fine. Then I sent it to the news group and it was 
messed up again. I will work some more on my setting.

Frederic


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: mean ans std dev of an array?

2006-10-25 Thread Frederic Rentsch

SpreadTooThin wrote:
 import array
 a = array.array('f', [1,2,3])

 print a.mean()
 print a.std_dev()

 Is there a way to calculate the mean and standard deviation on array
 data?

 Do I need to import it into a Numeric Array to do this?

   
I quickly fish this out of my functions toolbox. There's got to be 
faster functions in scipy, though.

Frederic

(Disclaimer: If you build an air liner or an ocean liner with this and 
the wings fall off at thirty thousand feet or it turns upside down in 
the middle of an ocean, respectively of course, I expect a bunch of 
contingency lawyers lining up at my door wanting to sue you on my behalf.)


def standard_deviation (values):

   """
   Takes a sequence and returns mean, variance and standard deviation.
   Non-values (None) are skipped
   """

   import math

   mean = _sum_values_squared = _sum_values = 0.0

   l = len (values)
   i = 0
   item_count = 0
   while i < l:
      value = values [i]
      if value != None:
         _sum_values += value
         _sum_values_squared += value * value
         item_count += 1
      i += 1

   if item_count < 2:  # fewer than two values left after skipping Nones
      return None, None, None

   mean = _sum_values / item_count

   variance = (_sum_values_squared - item_count * mean * mean) / (item_count - 1)

   if variance < 0.0: variance = 0.0
   # Rounding errors can cause minute negative values which would crash the sqrt

   standard_deviation = math.sqrt (variance)

   return mean, variance, standard_deviation
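
A note for readers on Python 3.4 or later: the standard library's `statistics` module covers this directly and works on `array.array` data. A minimal sketch using the same sample (n - 1) variance as the function above:

```python
import array
import statistics

a = array.array('f', [1, 2, 3])

mean = statistics.mean(a)
variance = statistics.variance(a)    # sample variance, divides by n - 1
std_dev = statistics.stdev(a)        # square root of the sample variance

print(mean, variance, std_dev)
```

`statistics.pvariance` and `statistics.pstdev` give the population versions (dividing by n) if that is what is wanted.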


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Attempting to parse free-form ANSI text.

2006-10-24 Thread Frederic Rentsch


Steve Holden wrote:

Frederic Rentsch wrote:
  

Frederic Rentsch wrote:



Frederic Rentsch wrote:
 

  

Paul McGuire wrote:
 
   


Michael B. Trausch mike$#at^nospam!%trauschus wrote in message 
   
 
  
Sorry about the line wrap mess in the previous message. I try again with 
another setting:


Frederic
 
  

I give up!




Don't give up, attach it as a file!

regards
  Steve
  


Thank you for the encouragement!

Frederic

The following code does everything Mike needs to do, except interact with wx. 
It is written to run standing alone. To incorporate it in Mike's class the 
functions would be methods and the globals would be instance attributes. 
Running it does this:

 chunk_1 = """This is a test string containing some ANSI sequences.
Sequence 1 Valid code, invalid numbers: \x1b[10;12mEnd of sequence 1
Sequence 2 Not an 'm'-code: \x1b[30;4;77hEnd of sequence 2
Sequence 3 Color setting code: \x1b[30;45mEnd of sequence 3
Sequence 4 Parameter setting code: \x1b[7mEnd of sequence 4
Sequence 5 Color setting code spanning calls: \x1b[3"""

 chunk_2 = """7;42mEnd of sequence 5
Sequence 6 Invalid code: \x1b[End of sequence 6
Sequence 7 A valid code at the end: \x1b[9m
"""

 init ()
 process_text (chunk_1)
 process_text (chunk_2)
 print output

This is a test string containing some ANSI sequences.
Sequence 1 Valid code, invalid numbers:  !!!Ignoring unknown number 10!!!  
!!!Ignoring unknown number 1!!! End of sequence 1
Sequence 2 Not an 'm'-code: End of sequence 2
Sequence 3 Color setting code:  setting foreground BLACK  setting 
background MAGENTA End of sequence 3
Sequence 4 Parameter setting code:  Calling parameter setting function 7 
End of sequence 4
Sequence 5 Color setting code spanning calls:  setting foreground GREY  
setting background GREEN End of sequence 5
Sequence 6 Invalid code: nd of sequence 6
Sequence 7 A valid code at the end:  Calling parameter setting function 9
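
The chunk-spanning behaviour shown above can also be sketched with the `re` module alone. This is a separate illustration in Python 3, not the SE-based code from this post; the class name `AnsiFilter` and the event tuples are invented for the example. It drops every escape sequence except the 'm' (color/attribute) codes and carries a truncated escape over to the next chunk.

```python
import re

ESCAPE_RE = re.compile(r"\x1b\[([\d;]*)([A-Za-z])")   # complete sequence
PARTIAL_RE = re.compile(r"\x1b(\[[\d;]*)?\Z")          # cut off at chunk end


class AnsiFilter(object):
    def __init__(self):
        self.held = ""    # truncated escape carried over to the next chunk

    def feed(self, chunk):
        """Return a list of ('text', str) and ('sgr', [int, ...]) events."""
        data = self.held + chunk
        self.held = ""
        partial = PARTIAL_RE.search(data)
        if partial:                       # hold back the incomplete tail
            self.held = partial.group(0)
            data = data[:partial.start()]
        events = []
        position = 0
        for match in ESCAPE_RE.finditer(data):
            if match.start() > position:
                events.append(("text", data[position:match.start()]))
            if match.group(2) == "m":     # keep color codes, drop the rest
                numbers = [int(n) for n in match.group(1).split(";") if n]
                events.append(("sgr", numbers))
            position = match.end()
        if position < len(data):
            events.append(("text", data[position:]))
        return events


ansi = AnsiFilter()
first_events = ansi.feed("red: \x1b[3")
second_events = ansi.feed("1;42mhi\x1b[2Jbye")
print(first_events)
print(second_events)
```

As in the output above, an invalid sequence such as ESC[ followed by plain text swallows the first letter after the bracket, because any letter is taken as the terminating command character.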


#


def init (): # This would have to be added to __init__ ()

   import re   # Needed for digits_re below
   import SE   # SEL is less import overhead but doesn't have interactive
               # development features (not needed in production versions)

   global output  #- For testing
   global Pre_Processor, digits_re, Colors, truncated_escape_hold   # global -> instance attributes

   # Screening out invalid characters and all ansi escape sequences except
   # those controlling color
   grit = '\n'.join (['(%d)=' % i for i in range (128,255)]) + ' (13)= '
   # Makes 127 fixed expressions plus deletion of \r
   # Regular expression r'[\x80-\xff\r]' would work fine but is four times
   # slower than 127 fixed expressions
   all_escapes   = r'\x1b\[\d*(;\d*)*[A-Za-z]'
   color_escapes = r'\x1b\[\d*(;\d*)*m'
   Pre_Processor = SE.SE ('%s ~%s~= ~%s~==' % (grit, all_escapes, color_escapes))  # SEL.SEL for production
   # 'all_escapes' also matches what 'color_escapes' matches. With identical
   # regular expression matches the last regex definition applies.

   # Isolating digits.
   digits_re = re.compile (r'\d+')
  
   # Color numbers to color names
   Colors = SE.SE ('''
      30=BLACK    40=BLACK
      31=RED      41=RED
      32=GREEN    42=GREEN
      33=YELLOW   43=YELLOW
      34=BLUE     44=BLUE
      35=MAGENTA  45=MAGENTA
      36=CYAN     46=CYAN
      37=GREY     47=GREY
      39=GREY     49=BLACK
      <EAT>
   ''')
  
   truncated_escape_hold = ''  #- self.truncated_escape_hold
   output= ''  #- For testing only



# What follows replaces all others of Mike's methods in class AnsiTextCtrl(wx.TextCtrl)

def process_text (text):

   global output  #- For testing
   global truncated_escape_hold, digits_re, Pre_Processor, Colors

   purged_text = truncated_escape_hold + Pre_Processor (text)
   truncated_escape_hold = ''   # Consumed; stays empty until the next truncated escape
   # Text is now clean except for color codes, which begin with ESC

   ansi_controlled_sections = purged_text.split ('\x1b')
   # Each section starts with a color control, except the first one (leftmost split-off)

   if ansi_controlled_sections:
      #- self.AppendText(ansi_controlled_sections [0])   #- For real
      output += ansi_controlled_sections [0]             #- For testing
      for section in ansi_controlled_sections [1:]:
         if section == '': continue
         try: escape_ansi_controlled_section, data = section.split ('m', 1)
         except ValueError:   # Truncated escape
            truncated_escape_hold = '\x1b' + section  # Restore ESC removed by split ('\x1b')
         else:
            escapes = escape_ansi_controlled_section.split (';')
            for escape in escapes:
               try: number = digits_re.search (escape).group ()
               except AttributeError:
                  output += ' !!!Invalid number %s!!! ' % escape   #- For testing
                  continue
               _set_wx (number)
            #- self.AppendText(data)   #- For real
            output += data             #- For testing


def _set_wx (n):

   global output

Re: can't open word document after string replacements

2006-10-24 Thread Frederic Rentsch

Antoine De Groote wrote:
 Hi there,

 I have a word document containing pictures and text. This document 
 holds several 'ABCDEF' strings which serve as placeholders for names. 
 Now I want to replace these occurrences with names in a list (members). I 
 open both input and output file in binary mode and do the 
 transformation. However, I can't open the resulting file, Word just 
 telling that there was an error. Does anybody know what I am doing wrong?

 Oh, and is this approach pythonic anyway? (I have a strong Java background.)

 Regards,
 antoine


 import os

 members = somelist

 os.chdir(somefolder)

 doc = file('ttt.doc', 'rb')
 docout = file('ttt1.doc', 'wb')

 counter = 0

 for line in doc:
  while line.find('ABCDEF') > -1:
  try:
  line = line.replace('ABCDEF', members[counter], 1)
  docout.write(line)
  counter += 1
  except:
  docout.write(line.replace('ABCDEF', '', 1))
  else:
  docout.write(line)

 doc.close()
 docout.close()

   
DOC files contain housekeeping info which becomes inconsistent if you 
change text. Possibly you can exchange stuff of equal length but that 
wouldn't serve your purpose. RTF files let you do substitutions and they 
save a lot of space too. But I kind of doubt whether RTF files can 
contain pictures.

Frederic



-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Attempting to parse free-form ANSI text.

2006-10-23 Thread Frederic Rentsch

Paul McGuire wrote:
 Michael B. Trausch mike$#at^nospam!%trauschus wrote in message 
 news:[EMAIL PROTECTED]
   
 Alright... I am attempting to find a way to parse ANSI text from a
 telnet application.  However, I am experiencing a bit of trouble.

 What I want to do is have all ANSI sequences _removed_ from the output,
 save for those that manage color codes or text presentation (in short,
 the ones that are ESC[#m (with additional #s separated by ; characters).
 The ones that are left, the ones that are the color codes, I want to
 act on, and remove from the text stream, and display the text.

 
 Here is a pyparsing-based scanner/converter, along with some test code at 
 the end.  It takes care of partial escape sequences, and strips any 
 sequences of the form
 ESC[##;##;...alpha, unless the trailing alpha is 'm'.
 The pyparsing project wiki is at http://pyparsing.wikispaces.com.

 -- Paul

 from pyparsing import *

   
snip
  

 test = \
 """This is a test string containing some ANSI sequences.
 Sequence 1: ~[10;12m
 Sequence 2: ~[3;4h
 Sequence 3: ~[4;5m
 Sequence 4; ~[m
 Sequence 5; ~[24HNo more escape sequences.
 ~[7""".replace('~',chr(27))

 leftOver = processInputString(test)


 Prints:
 This is a test string containing some ANSI sequences.
 Sequence 1:
 change color attributes to ['1012']
   
I doubt we should concatenate numbers.
 Sequence 2:

 Sequence 3:
 change color attributes to ['45']

 Sequence 4;
 change color attributes to ['']

 Sequence 5;
 No more escape sequences.

 found partial escape sequence ['\x1b[7'], tack it on front of next


   
Another one of Paul's elegant pyparsing solutions. To satisfy my own 
curiosity, I tried to see how SE stacked up and devoted more time than I 
really should to finding out. In the end I don't know if it was worth 
the effort, but having made it I might as well just throw it in.

The following code does everything Mike needs to do, except interact 
with wx. It is written to run standing alone. To incorporate it in 
Mike's class the functions would be methods and the globals would be 
instance attributes. Running it does this:

  chunk_1 = """This is a test string containing some ANSI sequences.
Sequence 1 Valid code, invalid numbers: \x1b[10;12mEnd of sequence 1
Sequence 2 Not an 'm'-code: \x1b[30;4;77hEnd of sequence 2
Sequence 3 Color setting code: \x1b[30;45mEnd of sequence 3
Sequence 4 Parameter setting code: \x1b[7mEnd of sequence 4
Sequence 5 Color setting code spanning calls: \x1b[3"""

  chunk_2 = """7;42mEnd of sequence 5
Sequence 6 Invalid code: \x1b[End of sequence 6
Sequence 7 A valid code at the end: \x1b[9m
"""

  init ()
  process_text (chunk_1)
  process_text (chunk_2)
  print output

This is a test string containing some ANSI sequences.
Sequence 1 Valid code, invalid numbers:  ! Ignoring unknown number 10 
!  ! Ignoring unknown number 12 ! End of sequence 1
Sequence 2 Not an 'm'-code: End of sequence 2
Sequence 3 Color setting code:  setting foreground BLACK  setting 
background MAGENTA End of sequence 3
Sequence 4 Parameter setting code:  Calling parameter setting function 
7 End of sequence 4
Sequence 5 Color setting code spanning calls:  setting foreground 
GREY  setting background GREEN End of sequence 5
Sequence 6 Invalid code: nd of sequence 6
Sequence 7 A valid code at the end:  Calling parameter setting 
function 9
 

###

And here it goes:

def init (): 

   # To add to AnsiTextCtrl.__init__ ()

   import SE   # SEL is less import overhead but doesn't have 
interactive development features (not needed in production versions)

   global output  #- For testing
   global Pre_Processor, digits_re, Colors, truncated_escape_hold   # 
global - instance attributes

   # Screening out all ansi escape sequences except those controlling color
   grit = '\n'.join (['(%d)=' % i for i in range (128,255)]) + ' (13)= '
   # Regular expression r'[\x80-\xff\r]' would work fine but is four 
times slower than 127 fixed definitions
   all_escapes   = r'\x1b\[\d*(;\d*)*[A-Za-z]'
   color_escapes = r'\x1b\[\d*(;\d*)*m'
   Pre_Processor = SE.SE ('%s ~%s~= ~%s~==' % (grit, all_escapes, 
color_escapes))  # SEL.SEL for production
   # 'all_escapes' also matches what 'color_escapes' matches. With 
identical regular expression matches it is the last definitions that 
applies. Other than that, the order of definitions is irrelevant to 
precedence.

   # Isolating digits.
   digits_re = re.compile ('\d+')

   # Color numbers to color names
   Colors = SE.SE ('''
   30=BLACK40=BLACK
   31=RED  41=RED
   32=GREEN42=GREEN
   33=YELLOW   43=YELLOW
   34=BLUE 44=BLUE
   35=MAGENTA  45=MAGENTA
   36=CYAN 46=CYAN
   37=GREY 47=GREY
   39=GREY 49=BLACK
   EAT
   ''')

   truncated_escape_hold = ''  #- self.truncated_escape_hold
   output= ''  #- For testing only


# What follows replaces all others of Mike's methods

def process_text (text):

   global output  #-

Re: Attempting to parse free-form ANSI text.

2006-10-23 Thread Frederic Rentsch

Frederic Rentsch wrote:
 Paul McGuire wrote:
   
 Michael B. Trausch mike$#at^nospam!%trauschus wrote in message 
 

Sorry about the line wrap mess in the previous message. I try again with 
another setting:

Frederic

##

The following code does everything Mike needs to do, except interact 
with wx. It is written to run standing alone. To incorporate it in 
Mike's class the functions would be methods and the globals would be 
instance attributes. Running it does this:

 chunk_1 = This is a test string containing some ANSI sequences.
Sequence 1 Valid code, invalid numbers: \x1b[10;12mEnd of sequence 1
Sequence 2 Not an 'm'-code: \x1b[30;4;77hEnd of sequence 2
Sequence 3 Color setting code: \x1b[30;45mEnd of sequence 3
Sequence 4 Parameter setting code: \x1b[7mEnd of sequence 4
Sequence 5 Color setting code spanning calls: \x1b[3

 chunk_2 = 7;42mEnd of sequence 5
Sequence 6 Invalid code: \x1b[End of sequence 6
Sequence 7 A valid code at the end: \x1b[9m


 init ()
 process_text (chunk_1)
 process_text (chunk_2)
 print output

This is a test string containing some ANSI sequences.
Sequence 1 Valid code, invalid numbers:  !!!Ignoring unknown number 10!!!  
!!!Ignoring unknown number 1!!! End of sequence 1
Sequence 2 Not an 'm'-code: End of sequence 2
Sequence 3 Color setting code:  setting foreground BLACK  setting 
background MAGENTA End of sequence 3
Sequence 4 Parameter setting code:  Calling parameter setting function 7 
End of sequence 4
Sequence 5 Color setting code spanning calls:  setting foreground GREY  
setting background GREEN End of sequence 5
Sequence 6 Invalid code: nd of sequence 6
Sequence 7 A valid code at the end:  Calling parameter setting function 9


#


def init ():  # To add to AnsiTextCtrl.__init__ ()

  import re   # Needed for digits_re below
  import SE   # SEL is less import overhead but doesn't have interactive
              # development features (not needed in production versions)

  global output  #- For testing
  global Pre_Processor, digits_re, Colors, truncated_escape_hold   # global -> instance attributes

  # Screening out all ansi escape sequences except those controlling color
  grit = '\n'.join (['(%d)=' % i for i in range (128,255)]) + ' (13)= '
  # Makes 127 fixed expressions plus delete \r
  # Regular expression r'[\x80-\xff\r]' would work fine but is four
  # times slower than 127 fixed expressions
  all_escapes   = r'\x1b\[\d*(;\d*)*[A-Za-z]'
  color_escapes = r'\x1b\[\d*(;\d*)*m'
  Pre_Processor = SE.SE ('%s ~%s~= ~%s~==' % (grit, all_escapes, color_escapes))  # SEL.SEL for production
  # 'all_escapes' also matches what 'color_escapes' matches. With
  # identical regular expression matches the last regex definition applies.

  # Isolating digits.
  digits_re = re.compile (r'\d+')

  # Color numbers to color names
  Colors = SE.SE ('''
  30=BLACK    40=BLACK
  31=RED      41=RED
  32=GREEN    42=GREEN
  33=YELLOW   43=YELLOW
  34=BLUE     44=BLUE
  35=MAGENTA  45=MAGENTA
  36=CYAN     46=CYAN
  37=GREY     47=GREY
  39=GREY     49=BLACK
  EAT
  ''')

  truncated_escape_hold = ''  #- self.truncated_escape_hold
  output = ''                 #- For testing only


# What follows replaces all others of Mike's methods

def process_text (text):

  global output  #- For testing
  global truncated_escape_hold, digits_re, Pre_Processor, Colors

  purged_text = truncated_escape_hold + Pre_Processor (text)
  # Text is now clean except for color codes beginning with ESC

  ansi_controlled_sections = purged_text.split ('\x1b')
  # Each ansi_controlled_section starts with a color control, except the
  # first one (leftmost split-off)

  if ansi_controlled_sections:
     #- self.AppendText(ansi_controlled_sections [0])   #- For real
     output += ansi_controlled_sections [0]             #- For testing
     for section in ansi_controlled_sections [1:]:
        if section == '': continue
        try: escape_ansi_controlled_section, data = section.split ('m', 1)
        except ValueError:   # Truncated escape
           truncated_escape_hold = '\x1b' + section  # Restore ESC removed by split ('\x1b')
        else:
           escapes = escape_ansi_controlled_section.split (';')
           for escape in escapes:
              try: number = digits_re.search (escape).group ()
              except AttributeError:
                 output += ' !!!Invalid number %s!!! ' % escape   #- For testing
                 continue
              _set_wx (number)
           #- self.AppendText(data)   #- For real
           output += data             #- For testing


def _set_wx (n):

  global output  # For testing only
  global Colors

  int_n = int (n)
  if 0 <= int_n <= 9:
     #- self._number_to_method (n)()                           #- For real
     output += ' Calling parameter setting function %s ' % n   #- For testing
     return
  color = Colors (n)
  if color:
     if 30 <= int_n < 50

Re: Attempting to parse free-form ANSI text.

2006-10-23 Thread Frederic Rentsch

Frederic Rentsch wrote:
 Frederic Rentsch wrote:
   
 Paul McGuire wrote:
   
 
 Michael B. Trausch mike$#at^nospam!%trauschus wrote in message 
 
   

 Sorry about the line wrap mess in the previous message. I try again with 
 another setting:

 Frederic
   
I give up!


-- 
http://mail.python.org/mailman/listinfo/python-list

Re: Attempting to parse free-form ANSI text.

2006-10-22 Thread Frederic Rentsch

Michael B. Trausch wrote:
 Alright... I am attempting to find a way to parse ANSI text from a
 telnet application.  However, I am experiencing a bit of trouble.

 What I want to do is have all ANSI sequences _removed_ from the output,
 save for those that manage color codes or text presentation (in short,
 the ones that are ESC[#m (with additional #s separated by ; characters).
  The ones that are left, the ones that are the color codes, I want to
 act on, and remove from the text stream, and display the text.

 I am using wxPython's TextCtrl as output, so when I get an ANSI color
 control sequence, I want to basically turn it into a call to wxWidgets'
 TextCtrl.SetDefaultStyle method for the control, adding the appropriate
 color/brightness/italic/bold/etc. settings to the TextCtrl until the
 next ANSI code comes in to alter it.

 It would *seem* easy, but I cannot seem to wrap my mind around the idea.
  :-/

 I have a source tarball up at http://fd0man.theunixplace.com/Tmud.tar
 which contains the code in question.  In short, the information is
 coming in over a TCP/IP socket that is traditionally connected to with a
 telnet client, so things can be broken mid-line (or even mid-control
 sequence).  If anyone has any ideas as to what I am doing, expecting, or
 assuming that is wrong, I would be delighted to hear it.  The code that
 is not behaving as I would expect it to is in src/AnsiTextCtrl.py, but I
 have included the entire project as it stands for completeness.

 Any help would be appreciated!  Thanks!

   -- Mike
   
I have no experience with reading from TCP/IP. But looking at your 
program with a candid mind I'd say that it is written to process a chunk 
of data in memory. If, as you say, the chunks you get from TCP/IP may 
start and end anywhere and, presumably you pass each chunk through 
AppendText, then you have a synchronization problem, as each call resets 
your escape flag, even if the new chunk starts in the middle of an 
escape sequence. Perhaps you should cut off incomplete escapes at the 
end and prepend them to the next chunk.
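
A minimal sketch of that idea, not taken from Mike's code: the names PARTIAL_ESCAPE, split_partial_escape and ChunkJoiner are made up for illustration, and the pattern assumes only sequences of the form ESC[##;##...letter need to be held back.

```python
import re

# An unfinished escape at the very end of a chunk: ESC alone, or ESC [
# followed by digits/semicolons with the final letter still missing.
PARTIAL_ESCAPE = re.compile(r'\x1b(?:\[[\d;]*)?\Z')

def split_partial_escape(chunk):
    """Return (complete_text, held_back) for one incoming chunk."""
    match = PARTIAL_ESCAPE.search(chunk)
    if match is None:
        return chunk, ''
    return chunk[:match.start()], chunk[match.start():]

class ChunkJoiner(object):
    """Hypothetical helper that hands out only complete text."""

    def __init__(self):
        self.hold = ''   # truncated escape left over from the last chunk

    def feed(self, chunk):
        # Prepend what was held back, then cut off any new partial escape.
        complete, self.hold = split_partial_escape(self.hold + chunk)
        return complete

joiner = ChunkJoiner()
first = joiner.feed('Sequence 5: \x1b[3')    # escape is cut mid-sequence
second = joiner.feed('7;42mEnd of sequence 5')
```

Each call returns text whose escape sequences are whole, so the code that interprets them never has to remember a flag between calls.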

And:

if(len(buffer) > 0):
    wx.TextCtrl.AppendText(self, buffer)       # Are you sure text goes into the same place as the controls?

if(len(AnsiBuffer) > 0):
    wx.TextCtrl.AppendText(self, AnsiBuffer)   # You say you want to strip the control sequences


Frederic


Re: File read-write mode: problem appending after reading

2006-10-16 Thread Frederic Rentsch

Tim Peters wrote:
 [Frederic Rentsch]
   
   Thanks a lot for your input. I seemed to notice that  everything
 works fine without setting the cursor as long as it stops before the end
 of the file. Is that also a coincidence that may not work?
 

 if you want to read following a write, or write following a read, on
 the same stream, you must perform a file-positioning operation
 (typically a seek) between them
   
I appreciate the clarification. Thanks!

Re: Ok. This IS homework ...

2006-10-16 Thread Frederic Rentsch

spawn wrote:
 but I've been struggling with this for far too long and I'm about to
 start beating my head against the wall.

 My assignment seemed simple: create a program that will calculate the
 running total of user inputs until it hits 100.  At 100 it should stop.
  That's not the problem, in fact, that part works.  It's the adding
 that isn't working.  How can my program add 2 + 7 and come up with 14?

 I'm posting my code (so that you may all laugh).  If ANYONE has any
 ideas on what I'm doing wrong, I'd appreciate.

 ---

 running = True
 goal = 100

 # subtotal = 0
 # running_total = subtotal + guess

 while running:
   guess = int(raw_input('Enter an integer that I can use to add : '))
   subtotal = guess

   while running:
   guess = int(raw_input('I\'ll need another number : '))
   running_total = guess + subtotal
   print running_total

   if running_total == goal:
   print 'Congratulations!  You\'re done.'

   elif running_total > goal:
   print 'That\'s a good number, but too high.  Try again.'

 print 'Done'

 --

 I tried adding an additional while statement to capture the second
 number, but it didn't seem to solve my problem.  Help!

   
Dear anonymous student,

Once upon a time programmers did things like this:

            BEGIN
              |
      ------->|<----------------------------------------------
     |        |                                               |
     |   catch input                                          |
     |        |                                               |
     |   input type valid? - --> prompt for correct input ----|
     |        +                                               |
     |   input too large? + ---> prompt for new input --------
     |        -
     |   add to running total
     |        |
     |   status report
     |        |
      -- - running total >= max?
              +
         report done
              |
             END

It was called a flow chart. Flow charts could be translated directly 
into machine code written in assembly languages which had labels, tests 
and jumps as the only flow-control constructs. When structured 
programming introduced for and while loops they internalized labeling 
and jumping. That was a great convenience. Flow-charting became rather 
obsolete because the one-to-one correspondence between flow chart and 
code was largely lost.
I still find flow charting useful for conceptualizing a system of 
logical states too complex for my intuition. Everybody's intuition has a 
limit. Your homework solution shows that the assignment exceeds yours. 
So my suggestion is that you use the flow chart, like this:


def homework ():

   # Local functions. (I won't do those for you.)

   def explain_rules (): 
   def check_type (r):   
   def explain_type ():
   def check_size (r):
   def explain_max_size ():
   def report_status (rt):
   def report_done ():


   # Main function

   GOAL          = 100                           #         BEGIN
   MAX_INPUT     =  20                           #           |
   running_total =   0                           #           |
                                                 #           |
   explain_rules ()                              #           |
                                                 #           |
   while 1:                                      #   ------->|<----------------------------------------------
                                                 #  |        |                                               |
      response = raw_input ('Enter a number  ')  #  |   catch input                                          |
                                                 #  |        |                                               |
      if check_type (response) == False:         #  |   input type valid? - --> prompt for correct input ----|
         explain_type ()                         #  |        +                                               |
         continue                                #  |        |                                               |
                                                 #  |        |                                               |
      if check_size (response) == False:         #  |   input too large? + ---> prompt for new input --------
         explain_max_size ()                     #  |        -
         continue                                #  |        |
                                                 #  |        |
      running_total += int (response)            #  |   add to running total
      report_status (running_total)              #  |

Re: Ok. This IS homework ...

2006-10-16 Thread Frederic Rentsch

Nick Craig-Wood wrote:
 Frederic Rentsch [EMAIL PROTECTED] wrote:
   
  It was called a flow chart. Flow charts could be translated directly 
  into machine code written in assembly languages which had labels, tests 
  and jumps as the only flow-control constructs. When structured 
  programming introduced for and while loops they internalized labeling 
  and jumping. That was a great convenience. Flow-charting became rather 
  obsolete because the one-to-one correspondence between flow chart and 
  code was largely lost.
 

 The trouble with flow charts is that they aren't appropriate maps for
 the modern computing language territory.
   
Yes. That's why they aren't used anymore.
 I was born and bred on flow charts and I admit they were useful back
 in the days when I wrote 1000s of lines of assembler code a week.

 Nowadays a much better map for the territory is pseudo-code.
 Python is pretty much executable pseudo-code anyway
   
Yes. But it's the executable pseudo code our friend has problems with. 
So your very pertinent observation doesn't help him. My suggestion to 
use a flow chart was on the impression that he didn't have a clear 
conception of the solution's logic and that the flow chart was a simple 
means to acquire that clear conception. I like flow charts because they 
exhaustively map states and transitions exactly the way they 
connect---solution imaging as it were. If they can help intelligence map 
a territory it is no issue if they don't map it themselves very well.


Re: Insert characters into string based on re ?

2006-10-15 Thread Frederic Rentsch

Frederic Rentsch wrote:
 Matt wrote:
 I am attempting to reformat a string, inserting newlines before certain
 phrases. For example, in formatting SQL, I want to start a new line at
 each JOIN condition. Noting that strings are immutable, I thought it
 best to split the string at the key points, then join with '\n'.

 Regexps can seem the best way to identify the points in the string
 ('LEFT.*JOIN' to cover 'LEFT OUTER JOIN' and 'LEFT JOIN'), since I need
 to identify multiple locations in the string. However, the re.split
 method returns the list without the split phrases, and re.findall does
 not seem useful for this operation.

 Suggestions?

   

 Matt,

   You may want to try this solution:

  import SE 
... snip


 http://cheeseshop.python.org/pypi?:action=display&name=SE&version=2.3

For reasons unknown, the new download for SE is on the old page:  
http://cheeseshop.python.org/pypi/SE/2.2%20beta.


 Frederic


 --
  


Re: File read-write mode: problem appending after reading

2006-10-14 Thread Frederic Rentsch

Tim,
  Thanks a lot for your input. I seemed to notice that  everything 
works fine without setting the cursor as long as it stops before the end 
of the file. Is that also a coincidence that may not work?
Frederic


Tim Peters wrote:
 [Frederic Rentsch]
   
Working with read and write operations on a file I stumbled on a
 complication when writes fail following a read to the end.

   f = file ('T:/z', 'r+b')
   f.write ('abcdefg')
   f.tell ()
 30L
   f.seek (0)
   f.read ()
 'abcdefg'
   f.flush ()  # Calling or not makes no difference
   f.write ('abcdefg')
 

 Nothing is defined about what happens at this point, and this is
 inherited from C.  In standard C, if you want to read following a
 write, or write following a read, on the same stream, you must perform
 a file-positioning operation (typically a seek) between them.

   
 Traceback (most recent call last):
   File pyshell#62, line 1, in -toplevel-
 f.write ('abcdefg')
 IOError: (0, 'Error')
 

 That's one possible result.  Since nothing is defined, any other
 outcome is also a possible result ;-)

   
 Flushing doesn't help.
 

 Right, and because flush() is not a file-positioning operation.

   
 I found two work arounds:

   f.read ()
 'abcdefg'
   f.read ()   # Workaround 1: A second read (returning an empty string)
 ''
   f.write ('abcdefg')
 (No error)
 

 Purely an accident; may or may not work the next time you try it, or
 on another platform; etc.

   
   f.read ()
 'abcdefg'
   f.seek (f.tell ())   # Workaround 2: Setting the cursor (to where it 
 is!)
 

 That's a correct approach.  f.seek(0, 1) (seek 0 bytes from the
 current position) is a little easier to spell.

   
   f.write ('abcdefg')
 (No error)

 I found no problem with writing into the file. So it looks like it has
 to do with the cursor which a read puts past the end, unless it is past
 the end, in which case it goes back to the end. Is there a less kludgy
 alternative to fseek (ftell ())?
 

 As above, you need to seek when switching from reading to writing, or
 from writing to reading.
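
The rule above can be sketched as follows. This is only an illustration written for a current Python (open instead of file, bytes literals); the scratch file from tempfile stands in for 'T:/z'.

```python
import os
import tempfile

# Create a scratch file so the example is self-contained.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, 'wb') as f:
    f.write(b'abcdefg')

with open(path, 'r+b') as f:
    data = f.read()          # read up to the end of the file
    f.seek(0, os.SEEK_CUR)   # positioning call: 0 bytes from the current position
    f.write(b'hijklmn')      # write following a read, with a seek in between
    f.seek(0)                # seek again before switching back to reading
    result = f.read()

os.remove(path)
```

f.seek(0, os.SEEK_CUR) is the same call as the f.seek(0, 1) spelling mentioned above.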
   


Re: Insert characters into string based on re ?

2006-10-14 Thread Frederic Rentsch

Matt wrote:
 I am attempting to reformat a string, inserting newlines before certain
 phrases. For example, in formatting SQL, I want to start a new line at
 each JOIN condition. Noting that strings are immutable, I thought it
 best to split the string at the key points, then join with '\n'.

 Regexps can seem the best way to identify the points in the string
 ('LEFT.*JOIN' to cover 'LEFT OUTER JOIN' and 'LEFT JOIN'), since I need
 to identify multiple locations in the string. However, the re.split
 method returns the list without the split phrases, and re.findall does
 not seem useful for this operation.

 Suggestions?

   

Matt,

   You may want to try this solution:

>>> import SE

>>> Formatter = SE.SE (' ~(?i)(left|inner|right|outer).*join~=\n= ')  # Details explained below the dotted line
>>> print Formatter ('select id, people.* from ids left outer join people where ...\nSELECT name, job from people INNER JOIN jobs WHERE ...;\n')
select id, people.* from ids
left outer join people where ...
SELECT name, job from people
INNER JOIN jobs WHERE ...;

You may add other substitutions as required one by one, interactively 
tweaking each one until it does what it is supposed to do:

>>> Formatter = SE.SE ('''
    ~(?i)(left|inner|right|outer).*join~=\n  =   # Add an indentation
    where=\n  where  WHERE=\n  WHERE             # Add a newline also before 'where'
    ;\n=;\n\n                                    # Add an extra line feed
    \n=;\n\n                                     # And add any missing semicolon
    # etc.
    ''')

>>> print Formatter ('select id, people.* from ids left outer join people where ...\nSELECT name, job from people INNER JOIN jobs WHERE ...;\n')
select id, people.* from ids
  left outer join people
  where ...;

SELECT name, job from people
  INNER JOIN jobs
  WHERE ...;


http://cheeseshop.python.org/pypi?:action=display&name=SE&version=2.3


Frederic


--

The anatomy of a replacement definition

  Formatter = SE.SE (' ~(?i)(left|inner|right|outer).*join~=\n= ')

    target=substitute   (the first '=' separates target and substitute)

  Formatter = SE.SE (' ~(?i)(left|inner|right|outer).*join~=\n= ')
    =   (each following '=' stands for the matched target)

  Formatter = SE.SE (' ~(?i)(left|inner|right|outer).*join~=\n= ')
    ~ ... ~   (the tildes contain a regular expression)

  Formatter = SE.SE (' ~(?i)(left|inner|right|outer).*join~=\n= ')
  
(contain definition containing white space)
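
For comparison, here is a sketch of the same newline insertion with the standard re module alone. It is not part of SE; the name break_before_joins and the exact list of qualifiers are assumptions. As an aside on the original question, re.split does return the split phrases too when the pattern is wrapped in a capturing group.

```python
import re

# Zero or more qualifiers followed by JOIN, matched case-insensitively.
# The leading whitespace is consumed and replaced by a newline.
JOIN_PHRASE = re.compile(
    r'\s+((?:(?:left|right|inner|outer|full|cross)\s+)*join\b)',
    re.IGNORECASE)

def break_before_joins(sql):
    """Insert a newline before each JOIN phrase, keeping the phrase itself."""
    return JOIN_PHRASE.sub(r'\n\1', sql)

formatted = break_before_joins(
    'select id, people.* from ids left outer join people where ...')
```

Unlike 'LEFT.*JOIN', this pattern cannot run across unrelated text between a qualifier and a later JOIN, because only whitespace is allowed between the words.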


File read-write mode: problem appending after reading

2006-10-13 Thread Frederic Rentsch

Hi all,

   Working with read and write operations on a file I stumbled on a 
complication when writes fail following a read to the end.

>>> f = file ('T:/z', 'r+b')
>>> f.write ('abcdefg')
>>> f.tell ()
30L
>>> f.seek (0)
>>> f.read ()
'abcdefg'
>>> f.flush ()  # Calling or not makes no difference
>>> f.write ('abcdefg')

Traceback (most recent call last):
  File "<pyshell#62>", line 1, in -toplevel-
    f.write ('abcdefg')
IOError: (0, 'Error')

Flushing doesn't help. I found two workarounds:

>>> f.read ()
'abcdefg'
>>> f.read ()   # Workaround 1: A second read (returning an empty string)
''
>>> f.write ('abcdefg')
(No error)

>>> f.read ()
'abcdefg'
>>> f.seek (f.tell ())   # Workaround 2: Setting the cursor (to where it is!)
>>> f.write ('abcdefg')
(No error)

I found no problem with writing into the file. So it looks like it has 
to do with the cursor which a read puts past the end, unless it is past 
the end, in which case it goes back to the end. Is there a less kludgy 
alternative to fseek (ftell ())?

Frederic


Re: Operator += works once, fails if called again

2006-10-03 Thread Frederic Rentsch

John Machin wrote:
 Frederic Rentsch wrote:
   
 Hi all,

I have a class Time_Series derived from list. It lists days and
 contains a dictionary of various Lists also derived from list which
 contain values related to said days. (e.g. Stock quotes, volumes traded,
 etc.)
I defined an operator += which works just fine, but only once. If I
 repeat the operation, it fails and leaves me utterly mystified and crushed.

 Craving to be uncrushed by a superior intelligence.

 Frederic


 ---

 A test:

 
 [perfection snipped]

   
 Perfect!

   TS1 += TS2# Second call

 Traceback (most recent call last):
   File pyshell#311, line 1, in -toplevel-
 TS += TS2
   File c:\i\sony\fre\src\python\TIME_SERIES_7.py, line 1314, in __iadd__
 return self.__add__ (other)
   File c:\i\sony\fre\src\python\TIME_SERIES_7.py, line 1273, in __add__
 Sum = copy.deepcopy (self)
   File C:\PYTHON24\lib\copy.py, line 188, in deepcopy
 y = _reconstruct(x, rv, 1, memo)
   File C:\PYTHON24\lib\copy.py, line 335, in _reconstruct
 state = deepcopy(state, memo)
   File C:\PYTHON24\lib\copy.py, line 161, in deepcopy
 y = copier(x, memo)
   File C:\PYTHON24\lib\copy.py, line 252, in _deepcopy_dict
 y[deepcopy(key, memo)] = deepcopy(value, memo)
   File C:\PYTHON24\lib\copy.py, line 161, in deepcopy
 y = copier(x, memo)
   File C:\PYTHON24\lib\copy.py, line 252, in _deepcopy_dict
 y[deepcopy(key, memo)] = deepcopy(value, memo)
   File C:\PYTHON24\lib\copy.py, line 188, in deepcopy
 y = _reconstruct(x, rv, 1, memo)
   File C:\PYTHON24\lib\copy.py, line 335, in _reconstruct
 state = deepcopy(state, memo)
   File C:\PYTHON24\lib\copy.py, line 161, in deepcopy
 y = copier(x, memo)
   File C:\PYTHON24\lib\copy.py, line 252, in _deepcopy_dict
 y[deepcopy(key, memo)] = deepcopy(value, memo)
   File C:\PYTHON24\lib\copy.py, line 188, in deepcopy
 y = _reconstruct(x, rv, 1, memo)
   File C:\PYTHON24\lib\copy.py, line 320, in _reconstruct
 y = callable(*args)
   File C:\PYTHON24\lib\copy_reg.py, line 92, in __newobj__
 

 Oho! What's it doing there, in copy_reg.py?  Something has stashed away
 a reference to copy_reg.__newobj__, and the stasher is *not* the copy
 module 

   
 return cls.__new__(cls, *args)
 TypeError: instancemethod expected at least 2 arguments, got 0
 

 From a quick browse through the code for copy.py and copy_reg.py [Isn't
 open source a wonderful thing?], I'm guessing:
 (1) The cls  is *your* class. You can confirm that with a debugger
 (or by hacking in a print statement!).
 (2) You may need to read this in the copy docs:
 
 Classes can use the same interfaces to control copying that they use to
 control pickling. See the description of module pickle for information
 on these methods. The copy module does not use the copy_reg
 registration module.
 
 (3) You may need to read this in the pickle docs:
 
 New-style types can provide a __getnewargs__() method that is used for
 protocol 2. Implementing this method is needed if the type establishes
 some internal invariants when the instance is created, or if the memory
 allocation is affected by the values passed to the __new__() method for
 the type (as it is for tuples and strings). Instances of a new-style
 type C are created using
  obj = C.__new__(C, *args)
  where args is the result of calling __getnewargs__() on the original
 object; if there is no __getnewargs__(), an empty tuple is assumed.
 
 (4) You may need to debug your pickling/unpickling before you debug
 your deepcopying.
 (5) Look at function _copy_inst at line 134 in copy.py -- work out what
 it will do with your class instance, depending on what __magic__
 method(s) you have implemented.

 Hope some of this helps,
 John

   

John,

Thank you very much for your suggestions. I followed them one by one and 
in the end found the cause of the problem to be a circular reference. 
Some species of contained Lists need a reference to the containing 
Time_Series and that circular reference trips up deepcopy.  Supposedly I 
can define my own __deepcopy__, but didn't understand whose method it is 
supposed to be. Time_Series.__deepcopy__ () doesn't make sense, nor does 
it work. Time_Series should be the argument of the method, not the owner. 
Anyway, I managed without deepcopy by making a new instance and loading it 
as I would if it were the first one.
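
On the question of whose method it is: copy.deepcopy (x) looks for x.__deepcopy__ and calls it with the memo dictionary as the only argument, so it is an ordinary method of the class being copied. A minimal sketch with stand-in classes follows; TimeSeries, Floats, owner and lists are hypothetical names and not the real TIME_SERIES_7 code.

```python
import copy

class Floats(list):
    """Stand-in for a contained List that points back at its container."""
    def __init__(self, values, name, owner=None):
        list.__init__(self, values)
        self.name = name
        self.owner = owner           # back-reference: this creates the cycle

class TimeSeries(list):
    """Stand-in for Time_Series: a list of days plus named value lists."""
    def __init__(self, days, name):
        list.__init__(self, days)
        self.name = name
        self.lists = {}

    def add_list(self, values):
        values.owner = self
        self.lists[values.name] = values

    def __deepcopy__(self, memo):
        # Called by copy.deepcopy(self); memo maps id(original) -> copy.
        clone = TimeSeries(list(self), self.name)
        memo[id(self)] = clone       # register first so back-references find the clone
        for key, values in self.lists.items():
            clone.lists[key] = copy.deepcopy(values, memo)
        return clone

ts = TimeSeries(range(3), 'TS1')
ts.add_list(Floats((1.0, 2.0, 3.0), 'Numbers'))
ts_copy = copy.deepcopy(ts)
```

Registering the clone in memo before copying the contained lists is what lets their owner attribute end up pointing at the copy rather than at the original.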

Thanks again

Frederic


Operator += works once, fails if called again

2006-10-02 Thread Frederic Rentsch

Hi all,

   I have a class Time_Series derived from list. It lists days and 
contains a dictionary of various Lists also derived from list which 
contain values related to said days. (e.g. Stock quotes, volumes traded, 
etc.)
   I defined an operator += which works just fine, but only once. If I 
repeat the operation, it fails and leaves me utterly mystified and crushed.

Craving to be uncrushed by a superior intelligence.

Frederic


---

A test:

>>> TS1 = TIME_SERIES_7.Time_Series (range (10), 'TS1')  # Some days
>>> L1 = LIST_7.Floats ((22,44,323,55,344,55,66,77,-1,0), 'Numbers')  # Some List with values
>>> TS1.add_List (L1)
>>> TS2 = TIME_SERIES_7.Time_Series ((3,4,5,7,8), 'TS2')  # Other days (subset)
>>> L2 = LIST_7.Floats ((7,-2,-5,0,2), 'Numbers')   # Another List with values
>>> TS2.add_list (L2)
>>> TS1 += TS2   # First call
>>> TS1.write ()
TS1  | Date   | Numbers |

0.00 | 1900.00.00 |   22.00 |
1.00 | 1900.01.01 |   44.00 |
2.00 | 1900.01.02 |  323.00 |
3.00 | 1900.01.03 |   62.00 |
4.00 | 1900.01.04 |  342.00 |
5.00 | 1900.01.05 |   50.00 |
6.00 | 1900.01.06 |   63.00 |
7.00 | 1900.01.07 |   77.00 |
8.00 | 1900.01.08 |   -1.00 |
9.00 | 1900.01.09 |0.00 |

Perfect!

  TS1 += TS2# Second call

Traceback (most recent call last):
  File pyshell#311, line 1, in -toplevel-
TS += TS2
  File c:\i\sony\fre\src\python\TIME_SERIES_7.py, line 1314, in __iadd__
return self.__add__ (other)
  File c:\i\sony\fre\src\python\TIME_SERIES_7.py, line 1273, in __add__
Sum = copy.deepcopy (self)
  File C:\PYTHON24\lib\copy.py, line 188, in deepcopy
y = _reconstruct(x, rv, 1, memo)
  File C:\PYTHON24\lib\copy.py, line 335, in _reconstruct
state = deepcopy(state, memo)
  File C:\PYTHON24\lib\copy.py, line 161, in deepcopy
y = copier(x, memo)
  File C:\PYTHON24\lib\copy.py, line 252, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
  File C:\PYTHON24\lib\copy.py, line 161, in deepcopy
y = copier(x, memo)
  File C:\PYTHON24\lib\copy.py, line 252, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
  File C:\PYTHON24\lib\copy.py, line 188, in deepcopy
y = _reconstruct(x, rv, 1, memo)
  File C:\PYTHON24\lib\copy.py, line 335, in _reconstruct
state = deepcopy(state, memo)
  File C:\PYTHON24\lib\copy.py, line 161, in deepcopy
y = copier(x, memo)
  File C:\PYTHON24\lib\copy.py, line 252, in _deepcopy_dict
y[deepcopy(key, memo)] = deepcopy(value, memo)
  File C:\PYTHON24\lib\copy.py, line 188, in deepcopy
y = _reconstruct(x, rv, 1, memo)
  File C:\PYTHON24\lib\copy.py, line 320, in _reconstruct
y = callable(*args)
  File C:\PYTHON24\lib\copy_reg.py, line 92, in __newobj__
return cls.__new__(cls, *args)
TypeError: instancemethod expected at least 2 arguments, got 0


It seems to crash on the second call to deepcopy. Why? The type of the 
deepcopied object is still 'Time_Series'.

Here's Time_Series.__add__ () but I don't think there's anything wrong 
with that.

   def __add__ (self, other):

      if self [0] <= other [0]:   # One or the other ...
         Sum = copy.deepcopy (self)
         Summand = copy.deepcopy (other)
      else:                       # depending on which starts earlier
         Sum = copy.deepcopy (other)
         Summand = copy.deepcopy (self)

      if Sum [0] != Summand [0]:
         Summand.insert (0, Sum [0])
         for list_name in Summand.Lists:
            Summand.Lists [list_name].insert (0, None)
      if Sum [-1] < Summand [-1]:
         Sum.append (Summand [-1])
      elif Sum [-1] > Summand [-1]:
         Summand.append (Sum [-1])

      Sum.make_continuous ()  # Fills in missing days and values
      Summand.make_continuous ()

      for list_name in Summand.Lists:
         if Sum.Lists.has_key (list_name):
            Sum.Lists [list_name] += Summand.Lists [list_name]   # List operators work fine

      return Sum


---


Re: Escapeism

2006-10-01 Thread Frederic Rentsch

Kay Schluehr wrote:
 Sybren Stuvel wrote:
   
 Kay Schluehr enlightened us with:
 
 Usually I struggle a short while with \ and either succeed or give up.
 Today I'm in a different mood and don't give up. So here is my
 question:

 You have an unknown character string c such as '\n' , '\a' , '\7' etc.

 How do you echo them using print?

 print_str( c ) prints representation '\a' to stdout for c = '\a'
 print_str( c ) prints representation '\n' for c = '\n'
 ...

 It is required that not a beep or a linebreak shall be printed.
   
 try print repr(c).
 

 This yields the hexadecimal representation of the ASCII character and
 does not simply echo the keystrokes '\' and 'a' for '\a' ignoring the
 escape semantics. One way to achieve this naturally is by prefixing
 '\a' with r where r'\a' indicates a raw string. But unfortunately
 rawrification applies only to string literals and not to string
 objects ( such as c ). I consider creating a table consisting of pairs
 {'\0': r'\0','\1': r'\1',...}  i.e. a handcrafted mapping but maybe
 I've overlooked some simple function or trick that does the same for
 me.

 Kay

   
Kay,

This is perhaps yet another case for SE? I don't really know, because I 
don't quite get what you're after. See for yourself:

>>> import SE
>>> Printabilizer = SE.SE ( '''
   (1)=\\1   # All 256 octets can be written as parenthesized ascii
   (2)=\\2
   \a=\\a  # (7)=\\a
   \n=\\n  # or (10)=\\n  or (10)=LF or whatever
   \r=\\r  # (13)=CR
   \f=\\f
   \v=\\v
   # Add whatever other ones you like
   #and translate them to anything you like.
 ''')

>>> print Printabilizer ('abd\aefg\r\nhijk\vlmnop\1\2.')
abd\aefg\r\nhijk\vlmnop\1\2.
  

If you think this may help, you'll find SE here: 
http://cheeseshop.python.org/pypi/SE/2.2%20beta
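
For comparison, the handcrafted mapping Kay considered can be sketched in 
plain Python without SE. This is only an illustration: the names 
NAMED_ESCAPES and printabilize are made up here, and writing the remaining 
control codes in octal is an assumption about what the output should look 
like.

```python
# Hypothetical helper, not part of SE: spell control characters the way
# a programmer would type them in a string literal.
NAMED_ESCAPES = {
    '\a': r'\a', '\b': r'\b', '\f': r'\f', '\n': r'\n',
    '\r': r'\r', '\t': r'\t', '\v': r'\v',
}

def printabilize(text):
    """Return text with control characters written out as escapes."""
    pieces = []
    for ch in text:
        if ch in NAMED_ESCAPES:
            pieces.append(NAMED_ESCAPES[ch])
        elif ord(ch) < 32 or ord(ch) == 127:
            # Remaining control codes as octal escapes, e.g. \1
            pieces.append('\\%o' % ord(ch))
        else:
            pieces.append(ch)
    return ''.join(pieces)

print(printabilize('abd\aefg\r\nhijk\vlmnop\1\2.'))
```

Backslashes already present in the input are passed through unchanged, 
so the result is meant for reading, not for feeding back to the 
interpreter.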


Regards

Frederic


Re: Can string formatting be used to convert an integer to its binary form ?

2006-09-29 Thread Frederic Rentsch

[EMAIL PROTECTED] wrote:
 Frederic Rentsch:
   
 Good idea, but shorter with -
   SE.SE ('se_definition_files/int_to_binary.se') ('%X' % 987654321)
 '00111010110111100110100010110001'
 

 Note that your version keeps the leading zeros.
 Have you tested the relative speeds too?
 (I'll probably have to learn to use SE.)

 Bye,
 bearophile

   
If you say speed, I presume you mean speed of execution. No I have not 
tested that. I know it can't be fast on a test bench. After all, SE is 
written in Python. I did a first version fifteen years ago in C, am 
still using it today on occasion and it runs much, much faster than this 
Python SE. This SE here could be done in C if it passes the test of 
acceptance.
 Professionals need to measure execution speed as a part of 
documenting their products. I am not a professional and so I am free to 
define my own scale of grades: A (fast enough) and F (not fast enough). 
I have yet to encounter a situation where SE gets an F. But that says 
less about SE than about my better knowledge which prevents me from 
using SE to, say, change colors in a 50 Mb bitmap. Obviously, fast 
enough and not fast enough pertains not to code per se, but to code 
in a specific situation. So, as the saying goes: the proof of the 
pudding ...
 Another kind of speed is production speed. I do believe that SE 
rather excels on that side. I also believe that the two kinds of speed 
are economically related by the return-on-investment principle.
 The third kind of speed is learning speed. SE is so simple that it 
has no technical learning curve to speak of. Its versatility comes from 
a wealth of application techniques that invite exploration, invention 
even. Take leading zeroes:

Leading zeroes can be stripped in a second pass if they are made 
recognizable in the first pass by some leading mark that is not a zero 
or a one. ([^01]; I use @ in the following example). To purists this 
may seem hackish. So it is! And what's wrong with that if it leads to 
simpler solutions?

>>> Hex_To_Binary = SE.SE ('''0=0000 1=0001 2=0010 3=0011 4=0100 5=0101 
6=0110 7=0111 8=1000 9=1001 A=1010 a=1010 B=1011 b=1011 C=1100 c=1100 
D=1101 d=1101 E=1110 e=1110 F=1111 f=1111 | ~[^01]0*~=''')
>>> Hex_To_Binary.set (keep_chain = 1)
>>> Hex_To_Binary ('@%x' % 1234567890)
'1001001100101100000001011010010'
>>> Hex_To_Binary.show ()

... snip ...

Data Chain
--
  @499602d2
0 

  @01001001100101100000001011010010
1 

  1001001100101100000001011010010
--


Frederic

(The previously posted example Int_To_Binary = SE.SE (SE.SE ( ... was 
a typo, of course. One SE.SE ( does it. Sorry about that.)
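
For readers without SE, the same two passes (table lookup per hex digit, 
then stripping the leading zeroes) can be sketched in plain Python. The 
names HEX_TO_BITS and int_to_binary are illustrative only and appear 
nowhere in SE.

```python
# Plain-Python sketch of the table lookup; names are illustrative.
# One entry per lowercase hex digit: '0' -> '0000' ... 'f' -> '1111'
HEX_TO_BITS = {'%x' % n: format(n, '04b') for n in range(16)}

def int_to_binary(number, keep_leading_zeros=False):
    """Convert a non-negative integer to a binary string via its hex digits."""
    # First pass: replace every hex digit with its four bits
    bits = ''.join(HEX_TO_BITS[digit] for digit in '%x' % number)
    if keep_leading_zeros:
        return bits
    # Second pass: drop the leading zeroes, but keep a lone '0'
    return bits.lstrip('0') or '0'

print(int_to_binary(1234567890, keep_leading_zeros=True))
print(int_to_binary(1234567890))
```

Python 2.6 and later also offer bin () and format (n, 'b'), which make 
the table unnecessary for this particular job.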


Re: Can string formatting be used to convert an integer to its binary form ?

2006-09-28 Thread Frederic Rentsch

[EMAIL PROTECTED] wrote:
 Mirco Wahab:
   
 But where is the %b in Python?
 

 Python doesn't have that. You can convert the number to a hex, and then
 map the hex digits to binary strings using a dictionary, like this:
 http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/440528

 Bye,
 bearophile

   
Good idea, but shorter with - 
http://cheeseshop.python.org/pypi/SE/2.2%20beta

  import SE
>>> Int_To_Binary = SE.SE ('''0=0000 1=0001 2=0010 3=0011 4=0100 
5=0101 6=0110 7=0111 8=1000 9=1001 A=1010 a=1010 B=1011 b=1011 C=1100 
c=1100 D=1101 d=1101 E=1110 e=1110 F=1111 f=1111''')
>>> Int_To_Binary ('%x' % 1234567890)
'01001001100101100000001011010010'

>>> Int_To_Binary.save ('se_definition_files/int_to_binary.se')

>>> SE.SE ('se_definition_files/int_to_binary.se') ('%X' % 987654321)
'00111010110111100110100010110001'


Frederic


Re: I need some help with a regexp please

2006-09-26 Thread Frederic Rentsch

Dennis Lee Bieber wrote:
 On 25 Sep 2006 10:25:01 -0700, codefire [EMAIL PROTECTED]
 declaimed the following in comp.lang.python:

   
 Yes, I didn't make it clear in my original post - the purpose of the
 code was to learn something about regexps (I only started coding Python
 last week). In terms of learning a little more the example was
 successful. However, creating a full email validator is way beyond me -
 the rules are far too complex!! :)
 

   I've been doing small things in Python for over a decade now
 (starting with the Amiga port)...

   I still don't touch regular expressions... They may be fast, but to
 me they are just as much line noise as PERL... I can usually code a
 partial parser faster than try to figure out an RE.
   
If I may add another thought along the same line: regular expressions 
seem to tend towards an art form, or an intellectual game. Many 
discussions revolving around regular expressions convey the impression 
that the challenge being pursued is finding a magic formula much more 
than solving a problem. In addition there seems to exist some code of 
honor which dictates that the magic formula must consist of one single 
expression that does it all. I suspect that the complexity of one single 
expression grows somehow exponentially with the number of 
functionalities it has to perform and at some point enters a gray zone 
of impending conceptual intractability where the quest for the magic 
formula becomes particularly fascinating. I also suspect that some 
problems are impossible to solve with a single expression and that no 
test of intractability exists other than giving up after so many hours 
of trying.
With reference to the OP's question, what speaks against passing his 
texts through several simple expressions in succession? Speed of 
execution? Hardly. The speed penalty would not be perceptible. 
Conversely, multiple expressions have the advantage that each one can be 
kept simple and that the performance of the entire set can be improved 
incrementally by adding another simple expression whenever an 
unexpected contingency occurs, as it may at any time with 
informal systems. One may not win a coding contest this way, but saving 
time isn't bad either, and may even be better.
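
A minimal sketch of what passing a text through several simple 
expressions in succession could look like. The three passes and the 
names PASSES and run_passes are invented for illustration; they are not 
taken from the OP's code.

```python
import re

# Hypothetical pipeline: each pass is one simple (pattern, replacement) pair.
# A new contingency is handled by appending another pair.
PASSES = [
    (re.compile(r'[ \t]+'), ' '),           # collapse runs of blanks
    (re.compile(r'\(at\)', re.I), '@'),     # spelled-out at-sign
    (re.compile(r' ?@ ?'), '@'),            # no blanks around the at-sign
]

def run_passes(text, passes=PASSES):
    """Apply every pass in order, feeding each result to the next pass."""
    for pattern, replacement in passes:
        text = pattern.sub(replacement, text)
    return text

print(run_passes('john   (AT)  example.org'))
```

The order of the passes matters: each one sees the output of the one 
before it, which is what keeps the individual expressions simple.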

Frederic


Re: newbe's re question

2006-09-26 Thread Frederic Rentsch

[EMAIL PROTECTED] wrote:
 Frederic Rentsch wrote:
   
 [EMAIL PROTECTED] wrote:
 
 These are csound files.  Csound recently added Python as a scripting
 language and is also allowing csound calls from outside of
 csound.  The nice thing about csound is that instead of worrying about
 viruses and large files it is an interpreter and all the files look
 somewhat like html.  4,000 virus-free instruments for $20 are available
 at
 http://www.csounds.com and the csound programming book is also
 available.  The downside is that csound can be really ugly looking
 (that is what I am trying to change) and it lets you write ugly looking
 song code that is almost unreadable at times (would look nice in a
 grid)

   
snip snip snip snip snip .

 I was hoping to add and remove instruments..  although the other should
 go into my example file because it will come in handy at some point.
 It would also be cool to get a list of instruments along with any
 comment line if there is one into a grid but that is two different
 functions.

 http://www.dexrow.com

   

Eric,

Below the dotted line there are two functions.
  csound_filter () extracts and formats instrument blocks from 
csound files and writes the output to another file. (If called without 
an output file name, the output displays on the screen).
  The second function, make_instrument_dictionaries (), takes the 
file generated by the first function and makes two dictionaries. One of 
them lists instrument ids keyed on instrument description, the other one 
does it the other way around, listing descriptions by instrument id. 
Instrument ids are file name plus instrument number.
  You cannot depend on this system to function reliably, because it 
extracts information from comments. I took my data from 
ems.music.utexas.edu/program/mus329j/CSPrimer.pdf the author of which 
happens to lead his instrument blocks with a header made up of comment 
lines, the first of which characterizes the block. That first line I use 
to make the dictionaries. If your data doesn't follow this practice, 
then you may not get meaningful dictionaries and are in for some 
hacking. In any case, this doesn't look like a job that can be fully 
automated. But if you can build your data base with a manageable amount 
of manual work you should be okay.
  The SE filter likewise works depending on formal consistency with 
my sample. If it fails, you may have to either tweak it or move up to a 
parser.
  I'll be glad to provide further assistance to the best of my 
knowledge. But you will have to make an effort to express yourself 
intelligibly. As a matter of fact the hardest part of proposing 
solutions to your problem is guessing what your problem is. I suggest 
you do this: before you post, show the message to a good friend and edit 
it until he understands it.
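
The dictionary-building step described above can be sketched in plain 
Python roughly as follows. This is not the SE-based function itself, 
only a reading of its description: the first line of the comment header 
that precedes an instr statement is taken as the description, and the id 
is file name plus instrument number. The separator used in the id and 
the name INSTR_LINE are assumptions.

```python
import re

# Assumption: an instrument starts with a line of the form "instr <number>"
INSTR_LINE = re.compile(r'^\s*instr\s+(\d+)')

def make_instrument_dictionaries(file_name, lines):
    """Return (ids_by_description, descriptions_by_id) for one file."""
    ids_by_description = {}
    descriptions_by_id = {}
    header = []     # comment lines collected since the last code line
    for raw in lines:
        line = raw.strip()
        if line.startswith(';'):
            header.append(line.lstrip('; ').strip())
            continue
        match = INSTR_LINE.match(line)
        if match and header:
            instrument_id = '%s %s' % (file_name, match.group(1))
            description = header[0]     # first header line describes the block
            ids_by_description[description] = instrument_id
            descriptions_by_id[instrument_id] = description
        if line:
            header = []     # any code line ends the current header

    return ids_by_description, descriptions_by_id
```

Instrument blocks without a comment header are simply skipped, which is 
where the manual work mentioned above comes in.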

Regards

Frederic
......................................................................
 


import sys

def csound_filter (csound_file_name, 
                   name_of_formatted_instrument_blocks_file = sys.stdout):

   """
      This function filters and formats instrument blocks out of a 
      csound file.

      csound_filter (csound_file_name, 
                     name_of_formatted_instrument_blocks_file)
      csound_filter (csound_file_name) # Single argument: screen output

   """

   import SE
   Instruments_Filter = SE.SE (
      'EAT ~;.*~==(10) ~instr(.|\n)*?endin~==(10)(10)')

   INDENT  = 10
   CODE_LENGTH = 50

   def format ():
      for l in instruments_file:
         line = l.strip ()
         if line == '':
            out_file.write ('\n')
         else:
            if line [0] == ';':   # Comment line
               out_file.write ('%s\n' % line)
            else:
               if line [-1] == ':':   # Label
                  out_file.write ('%s\n' % line)
               else:
                  if ';' in line:   # Code followed by a comment
                     # Split at the first semicolon only
                     code, comment = line.split (';', 1)
                     out_file.write ('%*s%-*s;%s\n' % (INDENT, '', 
                                     CODE_LENGTH, code, comment))
                  else:
                     out_file.write ('%*s%s\n' % (INDENT, '', line))

   instruments_file_name = Instruments_Filter (csound_file_name)
   # Read the filtered instrument blocks back in
   instruments_file = open (instruments_file_name, 'r')
   if name_of_formatted_instrument_blocks_file != sys.stdout:
      out_file = open (name_of_formatted_instrument_blocks_file, 'w')
      owns_out_file = True
   else:
      out_file = name_of_formatted_instrument_blocks_file
      owns_out_file = False

   format ()
   instruments_file.close ()

   if owns_out_file: out_file.close ()



def make_instrument_dictionaries (name_of_formatted_instrument_blocks_file):

   
  This function takes a file made by the previous function and 
generates two dictionaries.
  One records instrument ids by description. The other one 
instrument descriptions by id.
  Instrument ids are made up of file name and instrument number. 
  If these two

Re: newbe's re question

2006-09-24 Thread Frederic Rentsch

[EMAIL PROTECTED] wrote:
 These are csound files.  Csound recently added Python as a scripting
 language and is also allowing csound calls from outside of
 csound.  The nice thing about csound is that instead of worrying about
 viruses and large files it is an interpreter and all the files look
 somewhat like html.  4,000 virus-free instruments for $20 are available
 at
 http://www.csounds.com and the csound programming book is also
 available.  The downside is that csound can be really ugly looking
 (that is what I am trying to change) and it lets you write ugly looking
 song code that is almost unreadable at times (would look nice in a
 grid)

 http://www.msn.com
 ..


 Frederic Rentsch wrote:
   
 [EMAIL PROTECTED] wrote:
 
 Frederic Rentsch wrote:

   
 [EMAIL PROTECTED] wrote:

 
 Frederic Rentsch wrote:


   
 [EMAIL PROTECTED] wrote:


 
 All I am after realy is to change this

  reline = re.line.split('instr', '/d$')

 into something that grabs any line with instr in it take all the
 numbers and then grab any comment that may or may not be at the end of
 the line starting with ; until the end of the line including white
 spaces..  this is a corrected version from

 http://python-forum.org/py/viewtopic.php?t=1703

 thanks in advance the hole routine is down below..






 [code]
 def extractCsdInstrument (input_File_Name, output_File_Name,
 instr_number):

 takes an .csd input file and grabs instr_number instrument and
 creates output_File_Name
 f = open (input_File_Name , 'r')#opens file passed
 in to read
 f2 = open (output_File_Name, 'w')   #opens file passed
 in to write
 instr_yes = 'false' #set flag to false

 for line in f:  #for through all
 the lines
   if instr in line:   #look for instr in
 the file
if instr_yes == 'true':#check to see if
 this ends the instr block
break#exit the block

reline = re.line.split('instr', '/d$') #error probily
 split instr and /d (decimal number into parts) $ for end of line
number = int(reline[1])  #convert to a
 number maybe not important
 if number == instr_number:#check to see if
 it is the instr passed to function
 instr_yes = true: #change flag to
 true because this is the instr we want
   if instr_yes = true:#start of code to
 copy to another file
f2.write(f.line) #write line to
 output file

 f.close #close input file
 f2.close

 [/code]




   
 Eric,
   From your problem description and your code it is unclear what
 exactly it is you want. The task appears to be rather simple, though,
 and if you don't get much useful help I'd say it is because you don't
 explain it very well.
   I believe we've been through this before and your input data is
 like this

data = '''
CsoundSynthesizer;
  ; test.csd - a Csound structured data file

CsOptions
  -W -d -o tone.wav
/CsOptions

CsVersion;optional section
  Before 4.10  ;these two statements check for
  After 4.08   ;   Csound version 4.09
/CsVersion

CsInstruments
  ; originally tone.orc
  sr = 44100
  kr = 4410
  ksmps = 10
  nchnls = 1
  instr   1
  a1 oscil p4, p5, 1 ; simple oscillator
 out a1
endin
/CsInstruments

CsScore
  ; originally tone.sco
  f1 0 8192 10 1
  i1 0 1 2 1000 ;play one second of one kHz tone
  e
/CsScore

/CsoundSynthesizer

 Question 1: Is this your input?
 if yes:
 Question 1.1: What do you want to extract from it? In what format?
 if no:
 Question 1.1: What is your input?
 Question 1.2: What do you want to extract from it? In what format?
 Question 2: Do you need to generate output file names from the data?
 (One file per instrument?)
 if yes:
Question 2.1: What do you want to make your file name from?
 (Instrument number?)


 Regards

 Frederic


 
 I want to pass the file name to the subroutine and return a comment
 string if it is there. Maybe it should be simpler.  I probably should
 have the option of grabbing the comment in other related routines.  I
 am pretty ambitious with the main program.  I did notice some code in
 tcl that would be useful to the app if I compile it..  I am probably
 not ready for that though..

 http://www.dexrow.com



   
 Eric,
  I'm beginning to enjoy this. I'm sure we'll sort this out in no
 time if we proceed methodically. Imagine you are a teacher and I am your
 student. This is a quiz. I have to take it and you need to explain to me
 the problem you want me to solve. If you don't

Re: Replacing line in a text file

2006-09-22 Thread Frederic Rentsch

CSUIDL PROGRAMMEr wrote:
 Folks
 I am trying to read a file
 This file has a line containing string  'disable = yes'

 I want to change this line to 'disable = no'

 The concern here is that , i plan to take into account the white spaces
 also.

 I tried copying all file int list and then tried to manipulate that
 list

 But the search is not working

 Any answer

 thanks

   
>>> s = '''Folks! I am trying to read a file
This file has a line containing string  'disable = yes'
I want to change this line to 'disable = no' ...'''

The second line is the one to change. Okay?

>>> import SE
>>> Translator = SE.SE ('disable \= yes=disable \= no')
>>> print Translator (s)
Folks! I am trying to read a file
This file has a line containing string  'disable = no'
I want to change this line to 'disable = no' ...

Did it! - I don't know if it 'takes into account the white spaces' as I don't 
exactly understand what you mean by that. If need be, just change the 
substitution definition that makes the Translator to suit your needs. In an 
IDLE window you can work trial-and-error style five seconds per try. If you 
want to do other translations, just add more substitution definitions, as many 
as you want. It will do files too. No need to read them. Like this:

>>> Translator ('file_name', 'translated_file_name')

If this approach seems suitable, you'll find SE here: 
http://cheeseshop.python.org/pypi/SE/2.2%20beta
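
If "taking the white spaces into account" means that any amount of blank 
space around the equals sign should be tolerated and left as it is, a 
sketch with the standard re module could look like this. That reading of 
the question is an assumption, and the names DISABLE_YES and 
set_disable_no are made up for the example.

```python
import re

# Assumption: the setting sits alone on its line, with arbitrary blanks
# or tabs before, around and after "disable = yes".
DISABLE_YES = re.compile(r'^([ \t]*disable[ \t]*=[ \t]*)yes([ \t]*)$',
                         re.MULTILINE)

def set_disable_no(text):
    """Replace 'yes' with 'no' on disable lines, keeping all white space."""
    return DISABLE_YES.sub(r'\1no\2', text)

config = 'service ftp\n  disable   =  yes\n  port = 21\n'
print(set_disable_no(config))
```

To change a file, read it into a string, pass it through the function 
and write the result back.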


Regards

Frederic


Re: newbe's re question

2006-09-21 Thread Frederic Rentsch

[EMAIL PROTECTED] wrote:
 Frederic Rentsch wrote:
   
 [EMAIL PROTECTED] wrote:
 
 Frederic Rentsch wrote:

   
 [EMAIL PROTECTED] wrote:

 
 All I am after realy is to change this

  reline = re.line.split('instr', '/d$')

 into something that grabs any line with instr in it take all the
 numbers and then grab any comment that may or may not be at the end of
 the line starting with ; until the end of the line including white
 spaces..  this is a corrected version from

 http://python-forum.org/py/viewtopic.php?t=1703

 thanks in advance the hole routine is down below..






 [code]
 def extractCsdInstrument (input_File_Name, output_File_Name,
 instr_number):

 takes an .csd input file and grabs instr_number instrument and
 creates output_File_Name
 f = open (input_File_Name , 'r')#opens file passed
 in to read
 f2 = open (output_File_Name, 'w')   #opens file passed
 in to write
 instr_yes = 'false' #set flag to false

 for line in f:  #for through all
 the lines
   if instr in line:   #look for instr in
 the file
if instr_yes == 'true':#check to see if
 this ends the instr block
break#exit the block

reline = re.line.split('instr', '/d$') #error probily
 split instr and /d (decimal number into parts) $ for end of line
number = int(reline[1])  #convert to a
 number maybe not important
 if number == instr_number:#check to see if
 it is the instr passed to function
 instr_yes = true: #change flag to
 true because this is the instr we want
   if instr_yes = true:#start of code to
 copy to another file
f2.write(f.line) #write line to
 output file

 f.close #close input file
 f2.close

 [/code]



   
 Eric,
   From your problem description and your code it is unclear what
 exactly it is you want. The task appears to be rather simple, though,
 and if you don't get much useful help I'd say it is because you don't
 explain it very well.
   I believe we've been through this before and your input data is
 like this

data = '''
CsoundSynthesizer;
  ; test.csd - a Csound structured data file

CsOptions
  -W -d -o tone.wav
/CsOptions

CsVersion;optional section
  Before 4.10  ;these two statements check for
  After 4.08   ;   Csound version 4.09
/CsVersion

CsInstruments
  ; originally tone.orc
  sr = 44100
  kr = 4410
  ksmps = 10
  nchnls = 1
  instr   1
  a1 oscil p4, p5, 1 ; simple oscillator
 out a1
endin
/CsInstruments

CsScore
  ; originally tone.sco
  f1 0 8192 10 1
  i1 0 1 2 1000 ;play one second of one kHz tone
  e
/CsScore

/CsoundSynthesizer

 Question 1: Is this your input?
 if yes:
 Question 1.1: What do you want to extract from it? In what format?
 if no:
 Question 1.1: What is your input?
 Question 1.2: What do you want to extract from it? In what format?
 Question 2: Do you need to generate output file names from the data?
 (One file per instrument?)
 if yes:
Question 2.1: What do you want to make your file name from?
 (Instrument number?)


 Regards

 Frederic

 
 I want to pass the file name to the subroutine and return a comment
 string if it is there. Maybe it should be simpler.  I probably should
 have the option of grabbing the comment in other related routines.  I
 am pretty ambitious with the main program.  I did notice some code in
 tcl that would be useful to the app if I compile it..  I am probably
 not ready for that though..

 http://www.dexrow.com


   
 Eric,
  I'm beginning to enjoy this. I'm sure we'll sort this out in no
 time if we proceed methodically. Imagine you are a teacher and I am your
 student. This is a quiz. I have to take it and you need to explain to me
 the problem you want me to solve. If you don't explain it clearly, I
 will not know what I have to do and cannot do the quiz. If you answer my
 questions above, your description of the problem will be clear and I can
 take the quiz. Okay?

 Frederic
 


 instr   1
  a1 oscil p4, p5, 1 ; simple oscillator; comment is
 sometimes here
 out a1
endin


 I need to know the file I want to grab this from. I need to grab this
 out of the larger file and put it into its own file. I need to know
 what instr the user wants. I need to know what file to put it into, and
 it would be useful to have the comment line returned (if any).
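
 One possible reading of these requirements, sketched with the standard
 re module. It assumes one instrument number per instr line and an
 optional comment on that same line; the function name and both patterns
 are illustrative, not taken from any posted code.

```python
import re

# Assumptions: "instr <number>" with an optional "; comment" on the same
# line starts a block, and a line beginning with "endin" ends it.
INSTR_START = re.compile(r'^\s*instr\s+(\d+)\s*(?:;(.*))?$')
INSTR_END = re.compile(r'^\s*endin\b')

def extract_csd_instrument(input_file_name, output_file_name, instr_number):
    """Copy one instrument block to its own file; return its comment or ''."""
    comment = ''
    copying = False
    with open(input_file_name) as source:
        with open(output_file_name, 'w') as target:
            for line in source:
                if not copying:
                    start = INSTR_START.match(line)
                    if start and int(start.group(1)) == instr_number:
                        copying = True
                        comment = (start.group(2) or '').strip()
                if copying:
                    target.write(line)
                    if INSTR_END.match(line):
                        break       # block complete
    return comment
```

 If the requested instrument is not in the input, the output file is
 created empty and the returned comment is an empty string.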

 I did just get python essential reference 3rd edition..  If there is a
 better reference on just the subject I am after I would be glad

Re: newbe's re question

2006-09-20 Thread Frederic Rentsch

[EMAIL PROTECTED] wrote:
 All I am after realy is to change this

  reline = re.line.split('instr', '/d$')

 into something that grabs any line with instr in it take all the
 numbers and then grab any comment that may or may not be at the end of
 the line starting with ; until the end of the line including white
 spaces..  this is a corrected version from

 http://python-forum.org/py/viewtopic.php?t=1703

 thanks in advance the hole routine is down below..






 [code]
 def extractCsdInstrument (input_File_Name, output_File_Name,
 instr_number):

 takes an .csd input file and grabs instr_number instrument and
 creates output_File_Name
 f = open (input_File_Name , 'r')#opens file passed
 in to read
 f2 = open (output_File_Name, 'w')   #opens file passed
 in to write
 instr_yes = 'false' #set flag to false

 for line in f:  #for through all
 the lines
   if instr in line:   #look for instr in
 the file
if instr_yes == 'true':#check to see if
 this ends the instr block
break#exit the block

reline = re.line.split('instr', '/d$') #error probily
 split instr and /d (decimal number into parts) $ for end of line
number = int(reline[1])  #convert to a
 number maybe not important
 if number == instr_number:#check to see if
 it is the instr passed to function
 instr_yes = true: #change flag to
 true because this is the instr we want
   if instr_yes = true:#start of code to
 copy to another file
f2.write(f.line) #write line to
 output file

 f.close #close input file
 f2.close  

 [/code]

   
Eric,
  From your problem description and your code it is unclear what 
exactly it is you want. The task appears to be rather simple, though, 
and if you don't get much useful help I'd say it is because you don't 
explain it very well.
  I believe we've been through this before and your input data is 
like this

   data = '''
   CsoundSynthesizer;
 ; test.csd - a Csound structured data file
 
   CsOptions
 -W -d -o tone.wav
   /CsOptions
 
   CsVersion;optional section
 Before 4.10  ;these two statements check for
 After 4.08   ;   Csound version 4.09
   /CsVersion
 
   CsInstruments
 ; originally tone.orc
 sr = 44100
 kr = 4410
 ksmps = 10
 nchnls = 1
 instr   1
 a1 oscil p4, p5, 1 ; simple oscillator
out a1
   endin
   /CsInstruments

   CsScore
 ; originally tone.sco
 f1 0 8192 10 1
 i1 0 1 2 1000 ;play one second of one kHz tone
 e
   /CsScore

   /CsoundSynthesizer

Question 1: Is this your input?
if yes:
Question 1.1: What do you want to extract from it? In what format?
if no:
Question 1.1: What is your input?
Question 1.2: What do you want to extract from it? In what format?
Question 2: Do you need to generate output file names from the data? 
(One file per instrument?)
if yes:
   Question 2.1: What do you want to make your file name from? 
(Instrument number?)


Regards

Frederic


Re: newbe's re question

2006-09-20 Thread Frederic Rentsch

[EMAIL PROTECTED] wrote:
 Frederic Rentsch wrote:
   
 [EMAIL PROTECTED] wrote:
 
 All I am after realy is to change this

  reline = re.line.split('instr', '/d$')

 into something that grabs any line with instr in it take all the
 numbers and then grab any comment that may or may not be at the end of
 the line starting with ; until the end of the line including white
 spaces..  this is a corrected version from

 http://python-forum.org/py/viewtopic.php?t=1703

 thanks in advance the hole routine is down below..






 [code]
 def extractCsdInstrument (input_File_Name, output_File_Name,
 instr_number):

 takes an .csd input file and grabs instr_number instrument and
 creates output_File_Name
 f = open (input_File_Name , 'r')#opens file passed
 in to read
 f2 = open (output_File_Name, 'w')   #opens file passed
 in to write
 instr_yes = 'false' #set flag to false

 for line in f:  #for through all
 the lines
   if instr in line:   #look for instr in
 the file
if instr_yes == 'true':#check to see if
 this ends the instr block
break#exit the block

reline = re.line.split('instr', '/d$') #error probily
 split instr and /d (decimal number into parts) $ for end of line
number = int(reline[1])  #convert to a
 number maybe not important
 if number == instr_number:#check to see if
 it is the instr passed to function
 instr_yes = true: #change flag to
 true because this is the instr we want
   if instr_yes = true:#start of code to
 copy to another file
f2.write(f.line) #write line to
 output file

 f.close #close input file
 f2.close

 [/code]


   
 Eric,
   From your problem description and your code it is unclear what
 exactly it is you want. The task appears to be rather simple, though,
 and if you don't get much useful help I'd say it is because you don't
 explain it very well.
   I believe we've been through this before and your input data is
 like this

data = '''
CsoundSynthesizer;
  ; test.csd - a Csound structured data file

CsOptions
  -W -d -o tone.wav
/CsOptions

CsVersion;optional section
  Before 4.10  ;these two statements check for
  After 4.08   ;   Csound version 4.09
/CsVersion

CsInstruments
  ; originally tone.orc
  sr = 44100
  kr = 4410
  ksmps = 10
  nchnls = 1
  instr   1
  a1 oscil p4, p5, 1 ; simple oscillator
 out a1
endin
/CsInstruments

CsScore
  ; originally tone.sco
  f1 0 8192 10 1
  i1 0 1 2 1000 ;play one second of one kHz tone
  e
/CsScore

/CsoundSynthesizer

 Question 1: Is this your input?
 if yes:
 Question 1.1: What do you want to extract from it? In what format?
 if no:
 Question 1.1: What is your input?
 Question 1.2: What do you want to extract from it? In what format?
 Question 2: Do you need to generate output file names from the data?
 (One file per instrument?)
 if yes:
Question 2.1: What do you want to make your file name from?
 (Instrument number?)


 Regards

 Frederic
 

 I want to pass the file name to the subroutine and return a comment
 string if it is there. Maybe it should be simpler.  I probably should
 have the option of grabbing the comment in other related routines.  I
 am pretty ambitious with the main program.  I did notice some code in
 tcl that would be useful to the app if I compile it..  I am probably
 not ready for that though..

 http://www.dexrow.com

   

Eric,
 I'm beginning to enjoy this. I'm sure we'll sort this out in no 
time if we proceed methodically. Imagine you are a teacher and I am your 
student. This is a quiz. I have to take it and you need to explain to me 
the problem you want me to solve. If you don't explain it clearly, I 
will not know what I have to do and cannot do the quiz. If you answer my 
questions above, your description of the problem will be clear and I can 
take the quiz. Okay?

Frederic





Re: How to change font direction?

2006-09-18 Thread Frederic Rentsch

theju wrote:
 Well here are some self explanatory functions that I've written for
 displaying the text vertically and from right to left. As for rotation
 gimme some more time and i'll come back to you. Also I don't guarantee
 that this is the best method(cos I myself am a newbie), but I can
 guarantee you that it works.
 Here it goes...

 im=Image.open("imgfile.jpg")
 draw=ImageDraw.Draw(im)

 def verdraw(width,height,spacing,a="Defaulttext"):
     for i in range(0,len(a)):
         draw.text((width,height),a[i],(options))
         height+=spacing

 def right2leftdraw(width,height,spacing,a="Defaulttext"):
     for i in range(0,len(a)):
         draw.text((width,height),a[len(a)-i-1],(options))
         width += spacing

 options is a 3-field tuple, as mentioned in the PIL Handbook.
 Hope you find it useful
 -Theju

 Daniel Mark wrote:
   
 Hello all:

 I am using PIL to draw some graphics and I need to draw some texts
 in vertical direction rather than the default left-to-right horizontal
 direction.

 Is there anyway I could do that?


 Thank you
 -Daniel
 

   

I have done a circular 24-hour dial with the numbers arranged to read 
upright as seen from the center. I remember writing each number into a 
frame, rotating the frame and pasting it into the dial image at the 
right place. I also remember using a transparency mask, so the white 
background of the little frame didn't cover up what it happened to 
overlap (minute marks). The numbers were black on white, so the frame 
was monochrome and could be used as its own transparency mask.
  (This could well be a needlessly complicated approach. When 
confronting a problem, I tend to weigh the estimated time of hacking 
against the estimated time of shopping and reading recipes and often 
decide for hacking as the faster alternative.)

Regards

Frederic


Re: stock quotes

2006-09-14 Thread Frederic Rentsch

Donlingerfelt wrote:
 I would like to download stock quotes from the web, store them, do
 calculations and sort the results.  However I am fairly new and don't have a
 clue how to parse the results of a web page download.   I can get to the
 site, but do not know how to request the certain data  need.  Does anyone
 know how to do this?  I would really appreciate it.  Thanks.


   

Hi,

   Here's example 8.4 from the SE manual:

--

>>> def get_current_stock_quotes (symbols):
       import urllib
       url = 'http://finance.yahoo.com/q/cq?d=v1s=' + '+'.join (symbols)
       htm_page = urllib.urlopen (url)
       import SE
       keep = '~[A-Z]+ [JFMAJSOND].+?%~==(10)  ~[A-Z]+ [0-9][0-2]?:[0-5][0-9][AP]M.+?%~==(10)'
       Data_Extractor = SE.SE ('EAT ' + keep)
       Tag_Stripper = SE.SE ('~<(.|\n)*?>~=  se/htm2iso.se | ~\n[ \t\n]*~=(10) ~ +~= ')
       data = Data_Extractor (Tag_Stripper (htm_page.read ()))
       htm_page.close ()
       return data

>>> print get_current_stock_quotes (('GE','IBM','AAPL', 'MSFT', 'AA', 'MER'))
GE 3:17PM ET 33.15 0.30 0.90%
IBM 3:17PM ET 76.20 0.47 0.61%
AAPL 3:22PM ET 55.66 0.66 1.20%
MSFT 3:22PM ET 23.13 0.37 1.57%
AA 3:17PM ET 31.80 1.61 4.82%
MER 3:17PM ET 70.24 0.82 1.15%

--

If this meets your requirements you'll find SE here: 
http://cheeseshop.python.org/pypi/SE/2.2%20beta
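
Once the quotes arrive as lines of text like the ones above, storing, 
calculating and sorting needs only plain Python. The sketch below assumes 
exactly six blank-separated fields per line (symbol, time, time zone, 
price, change, percent change); the function and key names are 
illustrative.

```python
def parse_quote_line(line):
    """Turn 'GE 3:17PM ET 33.15 0.30 0.90%' into a dictionary."""
    symbol, time, zone, price, change, percent = line.split()
    return {
        'symbol': symbol,
        'time': time + ' ' + zone,
        'price': float(price),
        'change': float(change),
        'percent': float(percent.rstrip('%')),
    }

def parse_quotes(text):
    """Parse every non-empty line of a block of quote lines."""
    return [parse_quote_line(line)
            for line in text.splitlines() if line.strip()]

sample = """GE 3:17PM ET 33.15 0.30 0.90%
IBM 3:17PM ET 76.20 0.47 0.61%
AA 3:17PM ET 31.80 1.61 4.82%
"""

# Largest percentage move first
for quote in sorted(parse_quotes(sample),
                    key=lambda q: q['percent'], reverse=True):
    print('%-5s %8.2f %6.2f%%' % (quote['symbol'], quote['price'],
                                  quote['percent']))
```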

Regards

Frederic


Re: How to get the longest possible match with Python's RE module?

2006-09-13 Thread Frederic Rentsch

[EMAIL PROTECTED] wrote:
 Frederic Rentsch wrote:


   
If you need regexes, why not just reverse-sort your expressions? This
 seems a lot easier and faster than writing another regex compiler.
 Reverse-sorting places the longer ones ahead of the shorter ones.
 

 Unfortunately, not all regular expressions have a fixed match length.
 Which is the longest of, for example, /(abc)?def/ and /(def)?ghi/
 depends on the input. 

 Lorenzo Gatti

   
Very true! Funny you should remind me, considering that I spent quite 
some time upgrading SE to allow regular expressions. Version 1 didn't 
allow them, so it could resolve precedence at compile time. Version 2 
resolves precedence at runtime by the length of the matches and should 
function correctly in this respect, although it might not be fast enough 
for speed-critical applications. But then there is in general a 
trade-off between convenience and speed.
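
To illustrate what resolving precedence at runtime by match length can look like, here is a rough Python 3 sketch using only the standard re module. It is not how SE is implemented; longest_match is a name invented for this example. It tries each pattern at the same position and keeps whichever match ends furthest to the right.

```python
import re


def longest_match(patterns, text, pos=0):
    """Return the longest match among compiled patterns at pos, or None."""
    best = None
    for pattern in patterns:
        m = pattern.match(text, pos)
        # Keep the match that extends furthest; ties go to the earlier pattern.
        if m is not None and (best is None or m.end() > best.end()):
            best = m
    return best


# The two expressions from the example quoted above.
patterns = [re.compile(r'(abc)?def'), re.compile(r'(def)?ghi')]

for text in ('abcdefghi', 'defghi'):
    m = longest_match(patterns, text)
    print(text, '->', m.group(), 'via', m.re.pattern)
```

Which pattern wins depends on the input, as pointed out above. Each pattern is still matched by re in its usual way, so an alternation inside a single pattern keeps its first-alternative behaviour, and trying every pattern at every position is where the speed cost comes from.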


Frederic


Re: How to get the longest possible match with Python's RE module?

2006-09-13 Thread Frederic Rentsch

[EMAIL PROTECTED] wrote:
 Frederic Rentsch wrote:


   
If you need regexes, why not just reverse-sort your expressions? This
 seems a lot easier and faster than writing another regex compiler.
 Reverse-sorting places the longer ones ahead of the shorter ones.
 

 Unfortunately, not all regular expressions have a fixed match length.
 Which is the longest of, for example, /(abc)?def/ and /(def)?ghi/
 depends on the input. 

 Lorenzo Gatti

   
Oh yes, and my proposal to reverse-sort the targets was in response to 
the OP who wanted to generate a regex from a bunch of strings. It should 
work for that.


Frederic


Re: How to get the longest possible match with Python's RE module?

2006-09-12 Thread Frederic Rentsch

Licheng Fang wrote:
 Basically, the problem is this:

   
 >>> p = re.compile("do|dolittle")
 >>> p.match("dolittle").group()
 'do'

 Python's NFA regexp engine tries only the first option, and happily
 rests on that. There's another example:

 >>> p = re.compile("one(self)?(selfsufficient)?")
 >>> p.match("oneselfsufficient").group()
 'oneself'

 The Python regular expression engine doesn't exhaust all the
 possibilities, but in my application I hope to get the longest possible
 match, starting from a given point.

 Is there a way to do this in Python?

   
Licheng,

   If you need regexes, why not just reverse-sort your expressions? This 
seems a lot easier and faster than writing another regex compiler. 
Reverse-sorting places a longer target ahead of any shorter target that 
is a prefix of it, so the alternation tries the longer one first.

  >>> import re
  >>> targets = ['be', 'bee', 'been', 'being']
  >>> targets.sort ()
  >>> targets.reverse ()
  >>> regex = '|'.join (targets)
  >>> re.findall (regex, 'Having been a bee in a former life, I don\'t mind being what I am and wouldn\'t want to be a bee ever again.')
['been', 'bee', 'being', 'be', 'bee']
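
The same idea can be wrapped in a small helper. This is only a sketch for Python 3, and longest_first_regex is a name made up here. It sorts by length rather than alphabetically (either order puts a target ahead of its own prefixes) and passes each target through re.escape, on the assumption that the targets are literal strings rather than regexes.

```python
import re


def longest_first_regex(targets):
    """Compile an alternation of literal targets, longest alternative first."""
    ordered = sorted(targets, key=len, reverse=True)
    return re.compile('|'.join(re.escape(t) for t in ordered))


# The case from the original question: the longer alternative now wins.
print(longest_first_regex(['do', 'dolittle']).match('dolittle').group())

sentence = ("Having been a bee in a former life, I don't mind being "
            "what I am and wouldn't want to be a bee ever again.")
print(longest_first_regex(['be', 'bee', 'been', 'being']).findall(sentence))
```

This only helps when the alternatives are fixed strings; for general regexes the match length is not known in advance.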

You might also take a look at a stream editor I recently came out with: 
http://cheeseshop.python.org/pypi/SE/2.2%20beta

It has been well received, especially by newbies, I believe because it 
is so simple to use and allows very compact coding.

  >>> import SE
  >>> Bee_Editor = SE.SE ('be=BE bee=BEE  been=BEEN being=BEING')
  >>> Bee_Editor ('Having been a bee in a former life, I don\'t mind being what I am and wouldn\'t want to be a bee ever again.')

Having BEEN a BEE in a former life, I don't mind BEING what I am and wouldn't 
want to BE a BEE ever again.

Because SE works by precedence on length, the targets can be defined in any 
order and modular theme sets can be spliced freely to form supersets.
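
For comparison, the effect of length precedence on substitutions can be imitated with the standard re module. The following Python 3 sketch is not SE and does not use its API; make_editor is a hypothetical helper that takes a plain dict of literal targets and their replacements.

```python
import re


def make_editor(substitutions):
    """Return a function that replaces every target with its substitute,
    always preferring the longest target that matches at a position."""
    ordered = sorted(substitutions, key=len, reverse=True)
    regex = re.compile('|'.join(re.escape(t) for t in ordered))
    return lambda text: regex.sub(lambda m: substitutions[m.group()], text)


# The definitions are deliberately given shortest first; order does not matter.
bee_editor = make_editor({'be': 'BE', 'bee': 'BEE', 'been': 'BEEN', 'being': 'BEING'})

sentence = ("Having been a bee in a former life, I don't mind being "
            "what I am and wouldn't want to be a bee ever again.")
print(bee_editor(sentence))
```

Because the ordering is worked out from the lengths when the editor is built, two dictionaries can be merged before calling make_editor without worrying about which definitions come first.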


  >>> SE.SE ('EAT be==, bee==,  been==, being==,')(above_string)
'been,bee,being,be,bee,'

You can do extraction filters, deletion filters and substitutions in any 
combination. It does multiple passes and can take files as input instead of 
strings, and it can output files.

 Key_Word_Translator = SE.SE ('''
   *INT=int substitute
   *DECIMAL=decimal substitute
   *FACTION=faction substitute
   *NUMERALS=numerals substitute
   # ... etc.
''')

I don't know if that could serve.

Regards

Frederic


