fabric problem with ssh

2012-08-24 Thread Tim Arnold
I am getting started with fabric and trying to connect to any machine in 
my known hosts file. I want fabric to figure out that it doesn't need me 
to enter my password.


My fabfile.py:

from fabric.api import run, env
env.disable_known_hosts=False

def lsfiles():
    run('ls -l ~/')

My console result:

> fab lsfiles
No hosts found. Please specify (single) host string for connection: 
localhost


[localhost] run: ls -l ~/
[localhost] Passphrase for private key:

... file listing follows ...

How can I get fabric to use my .ssh/known_hosts file?

system details:
Python 2.7.3 (default, Aug 22 2012, 13:09:20)
[GCC 3.4.6 20060404 (Red Hat 3.4.6-11)] on linux2

thanks,
--Tim Arnold


--
http://mail.python.org/mailman/listinfo/python-list


communicate with external process via pty

2012-10-09 Thread Tim Arnold
I have an external process, 'tralics' that emits mathml when you feed it 
latex equations. I want to get that mathml into a string.


The problem for me is that tralics wants to talk to a tty and I've never 
done that before; it basically starts its own subshell.


I have the following code which works for simple things. I'm not sure 
this is the best way though: basically I got this from google...


import os
import subprocess
import shlex
import pty

cmd = 'tralics --interactivemath'
mathml = []                            # collected replies from tralics

(master, slave) = pty.openpty()
p = subprocess.Popen(shlex.split(cmd), close_fds=True,
                     stdin=slave, stdout=slave, stderr=slave)

os.read(master, 1024)                  # start the process
os.write(master, '$\\sqrt{x}$\n')      # feed it an equation
mathml.append(os.read(master, 1024))   # get the mathml in a string

os.write(master, '$\\sqrt{x}$\n')      # feed more equations
mathml.append(os.read(master, 1024))   # get the next string


Any suggestions for improvement?
thanks,
--Tim
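
One possible improvement, sketched here rather than tested against tralics: a single os.read(master, 1024) returns whatever happens to be available at that moment, so a long reply may arrive in pieces. A small helper that polls with select and keeps reading until the descriptor goes quiet is one way around that. The helper name read_available and the 0.5 second quiet period are assumptions of mine, and the sketch is written for Python 3 (bytes in, bytes out).

```python
import os
import select


def read_available(fd, quiet=0.5, chunk=1024):
    """Read from fd until no new data arrives for `quiet` seconds."""
    pieces = []
    while True:
        ready, _, _ = select.select([fd], [], [], quiet)
        if not ready:
            break               # nothing new within the quiet period
        try:
            data = os.read(fd, chunk)
        except OSError:
            break               # e.g. the other side of the pty was closed
        if not data:
            break               # end of file
        pieces.append(data)
    return b''.join(pieces)
```

With the code in the post, mathml.append(read_available(master)) would take the place of each bare os.read(master, 1024).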


xhtml encoding question

2012-01-31 Thread Tim Arnold

I have to follow a specification for producing xhtml files.
The original files are in cp1252 encoding and I must reencode them to utf-8.
Also, I have to replace certain characters with html entities.

I think I've got this right, but I'd like to hear if there's something 
I'm doing that is dangerous or wrong.


Please see the appended code, and thanks for any comments or suggestions.

I have two functions, translate (replaces high characters with entities) 
and reencode (um, reencodes):

-
import codecs, StringIO
from lxml import etree
high_chars = {
   0x2014:'&mdash;', # 'EM DASH',
   0x2013:'&ndash;', # 'EN DASH',
   0x0160:'&Scaron;',# 'LATIN CAPITAL LETTER S WITH CARON',
   0x201d:'&rdquo;', # 'RIGHT DOUBLE QUOTATION MARK',
   0x201c:'&ldquo;', # 'LEFT DOUBLE QUOTATION MARK',
   0x2019:"&rsquo;", # 'RIGHT SINGLE QUOTATION MARK',
   0x2018:"&lsquo;", # 'LEFT SINGLE QUOTATION MARK',
   0x2122:'&trade;', # 'TRADE MARK SIGN',
   0x00A9:'&copy;',  # 'COPYRIGHT SYMBOL',
   }
def translate(string):
    s = ''
    for c in string:
        if ord(c) in high_chars:
            c = high_chars.get(ord(c))
        s += c
    return s

def reencode(filename, in_encoding='cp1252',out_encoding='utf-8'):
    with codecs.open(filename,encoding=in_encoding) as f:
        s = f.read()
    sio = StringIO.StringIO(translate(s))
    parser = etree.HTMLParser(encoding=in_encoding)
    tree = etree.parse(sio, parser)
    result = etree.tostring(tree.getroot(), method='html',
                            pretty_print=True,
                            encoding=out_encoding)
    with open(filename,'wb') as f:
        f.write(result)

if __name__ == '__main__':
    fname = 'mytest.htm'
    reencode(fname)


Re: xhtml encoding question

2012-02-01 Thread Tim Arnold

On 2/1/2012 3:26 AM, Stefan Behnel wrote:

Tim Arnold, 31.01.2012 19:09:

I have to follow a specification for producing xhtml files.
The original files are in cp1252 encoding and I must reencode them to utf-8.
Also, I have to replace certain characters with html entities.
-
import codecs, StringIO
from lxml import etree
high_chars = {
    0x2014:'&mdash;', # 'EM DASH',
    0x2013:'&ndash;', # 'EN DASH',
    0x0160:'&Scaron;',# 'LATIN CAPITAL LETTER S WITH CARON',
    0x201d:'&rdquo;', # 'RIGHT DOUBLE QUOTATION MARK',
    0x201c:'&ldquo;', # 'LEFT DOUBLE QUOTATION MARK',
    0x2019:"&rsquo;", # 'RIGHT SINGLE QUOTATION MARK',
    0x2018:"&lsquo;", # 'LEFT SINGLE QUOTATION MARK',
    0x2122:'&trade;', # 'TRADE MARK SIGN',
    0x00A9:'&copy;',  # 'COPYRIGHT SYMBOL',
}
def translate(string):
    s = ''
    for c in string:
        if ord(c) in high_chars:
            c = high_chars.get(ord(c))
        s += c
    return s


I hope you are aware that this is about the slowest possible algorithm
(well, the slowest one that doesn't do anything unnecessary). Since none of
this is required when parsing or generating XHTML, I assume your spec tells
you that you should do these replacements?


I wasn't aware of it, but I am now--the code's embarrassing now.
The spec I must follow forces me to do the translation.

I am actually working with html not xhtml; which makes a huge 
difference, sorry for that.


Ulrich's line of code for translate is elegant.
    for c in string:
        s += high_chars.get(c,c)




def reencode(filename, in_encoding='cp1252',out_encoding='utf-8'):
    with codecs.open(filename,encoding=in_encoding) as f:
        s = f.read()
    sio = StringIO.StringIO(translate(s))
    parser = etree.HTMLParser(encoding=in_encoding)
    tree = etree.parse(sio, parser)


Yes, you are doing something dangerous and wrong here. For one, you are
decoding the data twice. Then, didn't you say XHTML? Why do you use the
HTML parser to parse XML?


I see that I'm decoding twice now, thanks.

Also, I now see that when lxml writes the result back out the entities I 
got from my translate function are resolved, which defeats the whole 
purpose.



    result = etree.tostring(tree.getroot(), method='html',
                            pretty_print=True,
                            encoding=out_encoding)
    with open(filename,'wb') as f:
        f.write(result)


Use tree.write(f, ...)


From all the info I've received on this thread, plus some 
additional reading, I think I need the following code.


Use the HTMLParser because the source files are actually HTML, and use 
output from etree.tostring() as input to translate() as the very last step.


def reencode(filename, in_encoding='cp1252', out_encoding='utf-8'):
    parser = etree.HTMLParser(encoding=in_encoding)
    tree = etree.parse(filename, parser)
    result = etree.tostring(tree.getroot(), method='html',
                            pretty_print=True,
                            encoding=out_encoding)
    # decode first so translate() sees characters, not utf-8 bytes
    text = translate(result.decode(out_encoding))
    with open(filename, 'wb') as f:
        f.write(text.encode(out_encoding))

not simply tree.write(f...) because I have to do the translation at the 
end, so I get the entities instead of the resolved entities from lxml.
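
A sketch of that last step on its own, written for Python 3 and using only the standard library. The serialized output is decoded before the replacement so that the table is matched against code points, and str.translate does the per-character lookup in one pass. The names ENTITY_TABLE, translate_entities and to_entity_bytes are hypothetical, and the named entities are my assumption about what the spec asks for.

```python
# Code point -> HTML entity. The named entities are an assumption about the spec.
ENTITY_TABLE = {
    0x2014: '&mdash;',   # EM DASH
    0x2013: '&ndash;',   # EN DASH
    0x0160: '&Scaron;',  # LATIN CAPITAL LETTER S WITH CARON
    0x201d: '&rdquo;',   # RIGHT DOUBLE QUOTATION MARK
    0x201c: '&ldquo;',   # LEFT DOUBLE QUOTATION MARK
    0x2019: '&rsquo;',   # RIGHT SINGLE QUOTATION MARK
    0x2018: '&lsquo;',   # LEFT SINGLE QUOTATION MARK
    0x2122: '&trade;',   # TRADE MARK SIGN
    0x00A9: '&copy;',    # COPYRIGHT SIGN
}


def translate_entities(text):
    """Replace each mapped character in a str with its entity."""
    return text.translate(ENTITY_TABLE)


def to_entity_bytes(serialized, encoding='utf-8'):
    """Decode serialized HTML bytes, substitute entities, and re-encode."""
    return translate_entities(serialized.decode(encoding)).encode(encoding)
```

The bytes returned by etree.tostring() would be passed to to_entity_bytes() just before writing the file.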


Again, it would be simpler if this was xhtml, but I misspoke 
(mis-wrote?) when I said xhtml; this is for html.



Assuming you really meant XHTML and not HTML, I'd just drop your entire
code and do this instead:

   tree = etree.parse(in_path)
   tree.write(out_path, encoding='utf8', pretty_print=True)

Note that I didn't provide an input encoding. XML is safe in that regard.

Stefan



thanks everyone for the help.

--Tim Arnold



Re: XSLT to Python script conversion?

2012-02-15 Thread Tim Arnold

On 2/13/2012 6:20 AM, Matej Cepl wrote:

Hi,

I am getting more and more discouraged from using XSLT for a
transformation from one XML scheme to another one. Could anybody
share any experience with porting a moderately complicated XSLT stylesheet
(https://gitorious.org/sword/czekms-csp_bible/blobs/master/CEP2OSIS.xsl)
into a Python script using ElementTree's iterparse or perhaps xml.sax?

Any tools for this? Speed differences (currently I am using xsltproc)?
Any thoughts?

Thank you,

Matěj


Just a note to encourage you to stick with XSLT. I also use lxml for 
creating and postprocessing my DocBook documents and it is great.  But I 
use the DocBook XSL stylesheets to convert to html; if you're like me, 
you got discouraged at the strangeness of the XSLT language.


I'm no expert with it by any means, but I'm amazed at some of the things 
it does. It is a great tool to add to your programming toolbox.


Also, I used xsltproc for a while but bogged down in processing time. 
Now I use SAXON which is much faster for my documents.


Good luck,
--Tim




multiprocessing timing issue

2011-08-09 Thread Tim Arnold

Hi, I'm having problems with an empty Queue using multiprocessing.

The task:
I have a bunch of chapters that I want to gather data on individually 
and then update a report database with the results.

I'm using multiprocessing to do the data-gathering simultaneously.

Each chapter report gets put on a Queue in their separate processes. 
Then each report gets picked off the queue and the report database is 
updated with the results.


My problem is that sometimes the Queue is empty and I guess it's
because the get_data() method takes a lot of time.

I've used multiprocessing before, but never with a Queue like this.
Any notes or suggestions are very welcome.

The task starts off with:
Reporter(chapters).report()

thanks,
--Tim Arnold

from Queue import Empty
from multiprocessing import Process, Queue

def run_mp(objects,fn):
    q = Queue()
    procs = dict()
    for obj in objects:
        procs[obj['name']] = Process(target=fn, args=(obj,q))
        procs[obj['name']].start()

    return q

class Reporter(object):
    def __init__(self, chapters):
        self.chapters = chapters

    def report(self):
        q = run_mp(self.chapters, self.get_data)

        for i in range(len(self.chapters)):
            try:
                data = q.get(timeout=30)
            except Empty:
                print 'Report queue empty at %s' % (i)
            else:
                self.update_report(data)

    def get_data(self, chapter, q):
        data = expensive_calculations()
        q.put(data)

    def update_report(self, data):
        pass  # db connection, etc.


Re: multiprocessing timing issue

2011-08-11 Thread Tim Arnold

On 8/10/2011 11:36 PM, Philip Semanchuk wrote:


On Aug 9, 2011, at 1:07 PM, Tim Arnold wrote:


Hi, I'm having problems with an empty Queue using multiprocessing.

The task:
I have a bunch of chapters that I want to gather data on individually and then 
update a report database with the results.
I'm using multiprocessing to do the data-gathering simultaneously.

Each chapter report gets put on a Queue in their separate processes. Then each 
report gets picked off the queue and the report database is updated with the 
results.

My problem is that sometimes the Queue is empty and I guess it's
because the get_data() method takes a lot of time.

I've used multiprocessing before, but never with a Queue like this.
Any notes or suggestions are very welcome.



Hi Tim,
This might be a dumb question, but...why is it a problem if the queue is empty? 
It sounds like you figured out already that get_data() sometimes takes longer 
than your timeout. So either increase your timeout or learn to live with the 
fact that the queue is sometimes empty. I don't mean to be rude, I just don't 
understand the problem.

Cheers
Philip



Hi Philip,
Not a dumb or rude question at all, thanks for thinking about it. When 
the queue is empty the report cannot be updated, so that's why I was 
concerned--I couldn't figure out how to block. Now that's dumb!


From your response and Tim Roberts too, I see that it's possible to 
block until the data comes back. I just should never have put that 
timeout in there. I must have assumed it would not block with no timeout 
given. Wrong.


From the docs on q.get():
If optional args 'block' is true and 'timeout' is None (the default),
block if necessary until an item is available. If 'timeout' is
a positive number, it blocks at most 'timeout' seconds and raises
the Empty exception if no item was available within that time.
Otherwise ('block' is false), return an item if one is immediately
available, else raise the Empty exception ('timeout' is ignored
in that case).
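
A minimal sketch of the blocking version, assuming Python 3. The worker body, the chapter dictionaries and the names drain and report_all are hypothetical stand-ins, not the real report code. Each q.get() has no timeout, so it waits until a worker delivers a result, and the queue is read before the workers are joined.

```python
from multiprocessing import Process, Queue


def get_data(chapter, q):
    """Stand-in for the expensive per-chapter work (hypothetical)."""
    q.put({'name': chapter['name'], 'words': len(chapter['text'].split())})


def drain(q, count):
    """Collect exactly `count` results, blocking until each one arrives."""
    return [q.get() for _ in range(count)]   # no timeout: waits as long as needed


def report_all(chapters):
    """Run one worker per chapter and return their results."""
    q = Queue()
    procs = [Process(target=get_data, args=(ch, q)) for ch in chapters]
    for p in procs:
        p.start()
    results = drain(q, len(procs))   # read the queue before joining the workers
    for p in procs:
        p.join()
    return results
```

report_all() is meant to be called from under an if __name__ == '__main__': guard. One caveat with this approach: if a worker dies before it calls q.put(), a get() with no timeout waits forever, so a long timeout plus an error report may still be worth keeping.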

thanks,
--Tim Arnold


sqlite3 with context manager

2011-09-02 Thread Tim Arnold

Hi,
I'm using the 'with' context manager for a sqlite3 connection:

with sqlite3.connect(my.database,timeout=10) as conn:
    conn.execute('update config_build set datetime=?,result=? '
                 'where id=?',
                 (datetime.datetime.now(), success, self.b['id']))

my question is what happens if the update fails? Shouldn't it throw an
exception?

I ask because apparently something went wrong yesterday and the code
never updated but I never got any warning.  I rebooted the machine and
everything is okay now, but I'd like to understand what happened.

thanks,
--Tim


Re: sqlite3 with context manager

2011-09-06 Thread Tim Arnold

On 9/3/2011 3:03 AM, Carl Banks wrote:

On Friday, September 2, 2011 11:43:53 AM UTC-7, Tim Arnold wrote:

Hi,
I'm using the 'with' context manager for a sqlite3 connection:

with sqlite3.connect(my.database,timeout=10) as conn:
    conn.execute('update config_build set datetime=?,result=? '
                 'where id=?',
                 (datetime.datetime.now(), success, self.b['id']))

my question is what happens if the update fails? Shouldn't it throw an
exception?


If you look at the sqlite3 syntax documentation, you'll see it has a SQL 
extension that allows you to specify error semantics.  It looks something like 
this:

UPDATE OR IGNORE
UPDATE OR FAIL
UPDATE OR ROLLBACK

I'm not sure exactly how this interacts with pysqlite3, but using one of these 
might help it throw exceptions when you want it to.


Carl Banks


I see now. You can use 'update or fail' if you have the extensions built 
in: http://docs.python.org/library/sqlite3.html#f1


example of use, line 76:
http://projects.developer.nokia.com/TECwidget/browser/data/montreal/updsqlite.py?rev=7ca2ebd301ed1eff0e2c28283470db060b872cd6

For my case, however, I'll follow Ian's advice and check on the rowcount 
after the update.
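
A sketch of that rowcount check, assuming Python 3 and a cut-down config_build table with only id and result columns. The function name update_result and the choice of LookupError are mine. An UPDATE that matches no rows is not an error as far as SQL is concerned, so no exception is raised unless the code checks for it.

```python
import sqlite3


def update_result(conn, build_id, result):
    """Update one row and raise if no row was changed."""
    with conn:   # commits on success, rolls back if an exception escapes
        cur = conn.execute(
            'update config_build set result=? where id=?',
            (result, build_id))
        if cur.rowcount == 0:
            raise LookupError('no config_build row with id %r' % (build_id,))
    return cur.rowcount
```

Note that the connection's with block only manages the transaction; it does not close the connection.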


thanks for the explanation and advice,
--Tim


Re: Literate Programming

2011-04-08 Thread Tim Arnold
"Hans Georg Schaathun"  wrote in message 
news:r7b178-602@svn.schaathun.net...
> Has anyone found a good system for literate programming in python?
>
> I have been trying to use pylit/sphinx/pdflatex to generate
> technical documentation.  The application is scientific/numerical
> programming, so discussing maths in maths syntax in between
> python syntax is important.
>


Hi Hans,
If you already know LaTeX, you might experiment with the *.dtx docstrip 
capability.
It has some pain points if you're developing from scratch, but I use it once 
I've got a system in reasonable shape. You have full control over the 
display and you can make the code files go anywhere you like when you run 
pdflatex on your file.
--Tim Arnold




Re: Literate Programming

2011-04-11 Thread Tim Arnold
"Hans Georg Schaathun"  wrote in message 
news:aca678-b87@svn.schaathun.net...
> On Fri, 8 Apr 2011 12:58:34 -0400, Tim Arnold
>   wrote:
> :  If you already know LaTeX, you might experiment with the *.dtx docstrip
> :  capability.
>
> Hi.  Hmmm.  That's a new thought.  I never thought of using docstrip
> with anything but LaTeX.  It sounds like a rather primitive tool for
> handling python code, and I would expect some serious trouble getting
> sensible highlighting in vim/eclim (or most other editors for that
> matter).  But I'll give it a thought.  Thanks.
>
> :  It has some pain points if you're developing from scratch, but I use it once
> :  I've got a system in reasonable shape.
>
> Hmmm.  I wonder if I am ever going to reach that stage :-)
>
> : You have full control over the
> :  display and you can make the code files go anywhere you like when you run
> :  pdflatex on your file.
>
> If you use docstrip with python, what packages do you use to highlight
> code and markup programming concepts (methods/classes/variables)?
> If I may ask ...
>
> -- 
> :-- Hans Georg

I don't use anything special, just the verbatim environment is fine for my 
purposes.  But you might like the listings package which iirc has syntax 
highlighting built-in for python. ah, yes:
http://en.wikibooks.org/wiki/LaTeX/Packages/Listings

There's also the 'fancyvrb' package:
http://www.ctan.org/tex-archive/macros/latex/contrib/fancyvrb

good luck,
--Tim




Re: Best way to share a python list of objects

2005-10-11 Thread Tim Arnold

"Magnus Lycka" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> kyle.tk wrote:
>> So I have a central list of python objects that I want to be able to
>> share between different processes that are possibly on different
>> computers on the network. Some of the processes will add objects to
>> list and another process will be a GUI that will view objects in the
>> list. I want this all to happen in real-time (e.g. once a process adds
>> an object to the list the GUI will see it.)
>>
>> What would be the best way to accomplish this? Some of my ideas:
>> - An XML file r/w-able by all processes
>> - Send pickled objects between all processes and each keeps its own list
>> locally
>> - An ASCII-type protocol akin to ftp that hands out all the info to the
>> processes
>>
>> Any other ideas? What would work the best
>
> Relational database are useful for sharing data in a controlled way.
> A better option for arbitrary Python objects might be ZODB with ZEO.
>
> http://www.zope.org/Wikis/ZODB/FrontPage
> http://www.zope.org/Wikis/ZODB/FrontPage/guide/index.html

Another one to consider is wddx, quite useful if you need to share data 
between different languages.

--Tim




pyparsing and LaTeX?

2005-11-30 Thread Tim Arnold
Anyone parsing simple LaTeX constructs with pyparsing?

I'm playing around with it (and looking at John Hunter's matplotlib stuff), 
but I thought I'd ask here if anyone had other TeX/LaTeX examples.

thanks,
--Tim Arnold




Re: pyparsing and LaTeX?

2005-12-09 Thread Tim Arnold

>"Ezequiel, Justin" <[EMAIL PROTECTED]> wrote in message 
> >news:[EMAIL PROTECTED]
>> Anyone parsing simple LaTeX constructs with pyparsing?
>
> Greetings Tim,
>
> Have always wanted a way to parse LaTeX myself.
> Unfortunately, I have been moved to a different project.
> However, I am still very much interested.
> Did you ever get a reply?

Hi Justin,
Yes, I did, from the pyparsing forum on Sourceforge. Paul's responses are 
excellent and his help for my simple needs really got me started.
http://pyparsing.sourceforge.net/

For now I'm working on a tag translator to convert from one LaTeX tagset to 
another, which is a pretty simple task compared to writing a full parser 
like pyLaTeX
http://pylatex.sourceforge.net/

--Tim




Re: Python IDE (was: PythonWin troubleshooting)

2005-12-16 Thread Tim Arnold

"chuck" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Apparently not too many people use or are interested in PythonWin.  I'm
> giving up on it.  It used to work pretty good.
>
> I'm lucky that I have found PyScripter (http://www.mmm-experts.com/) a
> python IDE for the windows platform which is much more stable and has
> more features than PythonWin.  If you are doing Python development on
> Windows I'd recommend taking a look at it.
>
> I'm also evaluating Wing IDE.  I may have another post with comments on
> it for anyone who might be interested.
>

Here's a plug for SPE, since I haven't heard anyone extolling its virtues.
I use it every day and love it.
--Tim 




Re: Parser or regex ?

2005-12-16 Thread Tim Arnold

"Fuzzyman" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Hello all,
>
> I'm writing a module that takes user input as strings and (effectively)
> translates them to function calls with arguments and keyword
> arguments. To pass a list I use a sort of 'list constructor' - so the
> syntax looks a bit like :
>
>   checkname(arg1, "arg 2", 'arg 3', keywarg="value",
> keywarg2='value2', default=list("val1", 'val2'))
>
> Worst case anyway :-)
>

pyparsing is great, easy to configure and very powerful--I think it looks 
like a great tool for your inputs.

http://pyparsing.sourceforge.net/


--Tim 




Re: Marshaling unicode WDDX

2006-01-05 Thread Tim Arnold

"isthar" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Hi !
> i am trying to serialise object which contains some unicode objects
> but looks like there is no way to do it.
>

hi, I'm sure you'll get better answers for the unicode part of your problem 
(I'd start with a look at the codecs module), but I wanted you to know that 
I'm using the wddx connection between Python and PHP with no problem. Here's 
the python code that writes a fairly complex data structure to the wddx 
file:

def writeDocReport(self):
    from xml.marshal import wddx
    fileName = '%s/%s.doc.wddx' % (os.path.join(adsWeb,'jobs','wddx'),
                                   self.name)
    tmpFile  = open(fileName,'wb')
    tmpFile.write(wddx.dumps(self.getDocData()))
    tmpFile.close()

and the PHP:

function getInputData($prodName){
    $file = "wddx/$prodName.wddx";

    if (!is_file($file)) {
        exit("Report is not available for $prodName");
    }

    $output = wddx_deserialize(file_get_contents($file));




testing machine responsiveness

2006-10-06 Thread Tim Arnold
I have a bunch of processes that I farm out over several HPux machines on 
the network. There are 60 machines to choose from and I want to
(1) find out which ones are alive (the 'ping' method below) and
(2) sort them by their current load (the 'get' method below, using the rup 
command)

I'm no expert--I bet what I'm doing could be done better. I'd appreciate any 
tips, caveats, etc.
Thanks in advance for looking at the code.
--Tim Arnold


Say the host names are in a global list tmpList...
# The final sorted list of cpus is called as:
cpuList  = [x[1] for x in Machines().get()]

#
import os
import socket

class Machines(object):
    ' List of available, alive machines. '
    def __init__(self):
        global tmpList
        self.asList = [y for y in tmpList if self.ping(y)]
        self.asString = ' '.join(self.asList)

    def ping(self, cpu):
        ''' Determine whether a machine is alive.
        tcp connect to machine, port 7 (echo port).
        Response within the 2-second socket timeout -> return true
        '''
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        s.settimeout(2)
        try:
            s.connect((cpu,7))
        except socket.error:
            return 0
        try:
            s.send('test')
            s.recv(128)
            s.close()
            return 1
        except socket.error:
            return 0

    def get(self, maxLoad=3.0):
        ''' return sorted list of available machines, sorted on load.
        Optionally, specify maxload, default is no more than 3.0
        '''
        tmpDict = {}
        try:
            rup = os.popen('rup %s | sort -n -t, -k4 | grep day' %
                           (self.asString))
        except OSError:
            return self.asList

        for s in rup:
            (name, t0) = s.split(' ',1)
            (t1,t2,t3,avgLoad,t4) = s.split(',')
            load = float(avgLoad)
            if load < maxLoad:
                tmpDict['%s.com' % name] = load

        # (load, name) pairs, least loaded first
        retList = [(load, name) for (name, load) in tmpDict.items()]
        retList.sort()
        return retList
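
For comparison, a shorter sketch of the two pieces in Python 3, standard library only. The names is_alive and sort_by_load are hypothetical. Unlike the ping method above, is_alive only checks that the TCP connection opens and does not send or receive anything on the echo port. sort_by_load takes a dict of already-parsed loads, so it makes no assumption about the rup output format.

```python
import socket


def is_alive(host, port=7, timeout=2.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:        # refused, unreachable, timed out, unknown host
        return False
    conn.close()
    return True


def sort_by_load(loads, max_load=3.0):
    """Return (load, name) pairs below max_load, least loaded first."""
    return sorted((load, name) for name, load in loads.items()
                  if load < max_load)
```

Since the 60 pings are independent, running is_alive in a pool of threads would cut the worst-case wait from about two minutes to a few seconds, but that is left out of the sketch.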




wddx problem with entities

2006-06-07 Thread Tim Arnold
I'm confused about why I get this error:
UnicodeError: ASCII encoding error: ordinal not in range(128)

when I try to load a wddx file containing this string:
The image file, gif/aperçu.png, does
  not exist.

When I loop through the file as if it's text and check the ord() value of 
each character, of course it's clean. Do I have to replace numbered entities 
in the wddx file before I can wddx.load() it?
thanks!
--tim

example program:
---
from xml.marshal import wddx

datastring = '''




  The image file, gif/aperçu.png,does not exist.
'''

data = wddx.loads(datastring)




Re: wddx problem with entities

2006-06-09 Thread Tim Arnold
"Tim Arnold" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> I'm confused about why I get this error:
> UnicodeError: ASCII encoding error: ordinal not in range(128)
>
> when I try to load a wddx file containing this string:
> The image file, gif/aperçu.png, does
>  not exist.
>
> When I loop through the file as if it's text and check the ord() value of 
> each character, of course it's clean. Do I have to replace numbered 
> entities in the wddx file before I can wddx.load() it?
> thanks!
> --tim
>
> example program:
> ---
> from xml.marshal import wddx
>
> datastring = '''
> 
> 
> 
> 
>  The image file, gif/aperçu.png,does not exist.
> '''
>
> data = wddx.loads(datastring)

Replying to my own post.
I got around this problem, which *looks* like a bug to me, (although I'm 
sure I don't understand the internals of xml.marshal) with this:

self.data = wddx.loads(''.join(codecs.open(self.filename,
   errors='ignore',encoding='ascii').readlines()))
where self.filename contains the numbered entity.

just in case anyone else out there is using wddx to communicate between 
python and php.
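
A possible explanation, sketched with the standard library's ElementTree in Python 3 rather than with xml.marshal: the bytes in the file are pure ASCII, but an XML parser resolves a numbered entity into the real character, and anything downstream that then forces the text back to ASCII will fail. The fragment below is an assumption; the exact reference in the original packet is not known, and &#231; is only a guess at it.

```python
import xml.etree.ElementTree as ET

# Assumed fragment: the exact character reference in the real packet is unknown.
packet = b'<string>The image file, gif/aper&#231;u.png, does not exist.</string>'

text = ET.fromstring(packet).text                  # parser resolves &#231;
is_ascii = all(ord(ch) < 128 for ch in text)       # False after parsing
```

So checking ord() on the raw file contents shows nothing above 127, while the parsed string does contain a non-ASCII character.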

--Tim Arnold





xinclude and pathnames

2006-09-13 Thread Tim Arnold
I'm using ElementTree to access some xml configuration files, and using the 
module's xinclude capability. I've got lines like this in the parent xml 
file (which lives in the same directory as the included xml file):


When I started the project it was Unix-only; this worked fine. Now I have 
users who want to use the system on Windows and of course that directory 
path doesn't exist on Windows, but it is available on the network using a 
name like \\ladida\current\en\xml\asdf\asdf_syntaxterms.xml

if relative paths worked, I could imagine
 would work.
Also,the file can be read via an http server.

My question: is there a way to make xinclude work with relative paths or 
perhaps urls?
Any ideas welcome--to me it looks like I'll have to restructure this part of 
the system since I've basically programmed myself into a corner.

thanks,
--Tim Arnold




Re: xinclude and pathnames

2006-09-14 Thread Tim Arnold
"Tim Arnold" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> I'm using ElementTree to access some xml configuration files, and using 
> the module's xinclude capability. I've got lines like this in the parent 
> xml file (which lives in the same directory as the included xml file):
>  href="/dept/app/doc/current/en/xml/asdf/asdf_syntaxterms.xml"/>
>
> When I started the project it was Unix-only; this worked fine. Now I have 
> users who want to use the system on Windows and of course that directory 
> path doesn't exist on Windows, but it is available on the network using a 
> name like \\ladida\current\en\xml\asdf\asdf_syntaxterms.xml
>
> if relative paths worked, I could imagine
>  would work.
> Also,the file can be read via an http server.
>
> My question: is there a way to make xinclude work with relative paths or 
> perhaps urls?
> Any ideas welcome--to me it looks like I'll have to restructure this part 
> of the system since I've basically programmed myself into a corner.
>

Replying to my own post. With no replies I assume that means either (a) I 
didn't explain the problem very well, or (b) I really have programmed myself 
into a corner and there's no other way to happiness except to rethink the 
problem.

That is, is there really no way to share xinclude'd files between *nix and 
Windows platforms?

Anyone been down this road before?
thanks,
--Tim Arnold




Re: latex openoffice converter

2006-09-14 Thread Tim Arnold
"Fabian Braennstroem" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Hi,
>
> I would like to use python to convert 'simple' latex
> documents into openoffice format. Maybe, anybody has done
> something similar before and can give me a starting point!?
> Would be nice to hear some hints!
>
> Greetings!
> Fabian

You can use pLasTeX to render to XML. May not be that difficult to render to 
the OO xml spec.
http://plastex.sourceforge.net/plastex/index.html

--T 




Re: xinclude and pathnames

2006-09-14 Thread Tim Arnold
"Fredrik Lundh" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Rob Williscroft wrote:
>
>> The default handler just sees the href value as a filename, so you
>> should be able to use a relative path if you os.chdir() to the working
>> directory before processing your xml file.
>
> and if that's not good enough, writing a custom loader is trivial (see the 
> default_loader implementation in ElementInclude.py for details).
>
> 

Thanks for the information--I just tested Rob's idea, which works fine for 
my case. You guys saved me a bunch of hair-pulling. I'm going to check out 
the default_loader implementation too.
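
For the custom loader route, a minimal sketch using the ElementInclude module as it ships in the Python 3 standard library (xml.etree.ElementInclude). It resolves each href relative to the directory of the parent file, which avoids the os.chdir() call. The names make_loader and parse_with_includes are hypothetical.

```python
import os
from xml.etree import ElementInclude, ElementTree as ET


def make_loader(base_dir):
    """Build an XInclude loader that resolves hrefs relative to base_dir."""
    def loader(href, parse, encoding=None):
        path = os.path.join(base_dir, href)   # an absolute href is kept as-is
        return ElementInclude.default_loader(path, parse, encoding)
    return loader


def parse_with_includes(filename):
    """Parse filename and expand its xi:include elements in place."""
    root = ET.parse(filename).getroot()
    base_dir = os.path.dirname(os.path.abspath(filename))
    ElementInclude.include(root, loader=make_loader(base_dir))
    return root
```

The same loader hook could fetch the href over http instead of from disk, since the loader only has to return the parsed element.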

--Tim Arnold




Re: php and python: how to unpickle using PHP?

2006-10-03 Thread Tim Arnold

<[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Ted Zeng:
>> I store some test results into a database after I use python
>> To pickle them (say, misfiles=['file1','file2'])
>> Now I want to display the result on a web page which uses PHP.
>> How could the web page unpickle the results and display them?
>> Is there a PHP routine that can do unpickle ?
>
> Instead of pickling, maybe you can save the data from python in json
> format:
> http://www.json.org/
> Then you can read it from PHP.

wddx format is a workable solution as well

http://pyxml.sourceforge.net/
plus
http://us2.php.net/wddx




pypdf assert error on documentinfo

2007-06-28 Thread Tim Arnold
Using pyPdf, nice user interface. Maybe it doesn't handle pdf 1.4? I'm 
getting an assertion error from the following code. The pdf file shows it 
does have a title in its document info (using acrobat 8 or reader 5).

pdf is version 1.4, produced with pdfeTex (pdflatex) 1.304
using python 2.4.1

# file test.py 
import pyPdf
filename = 'test.pdf'
test= pyPdf.PdfFileReader(open(filename,'rb'))
print test.getNumPages()
tmp = test.getDocumentInfo()

===
python test.py
55
Traceback (most recent call last):
  File "test.py", line 6, in ?
    tmp =  test.getDocumentInfo()
  File "tiarno/pyPdf-1.9/pyPdf/pdf.py", line 291, in getDocumentInfo
    obj = self.getObject(self.trailer['/Info'])
  File "tiarno/pyPdf-1.9/pyPdf/pdf.py", line 407, in getObject
    retval = readObject(self.stream, self)
  File "tiarno/pyPdf-1.9/pyPdf/generic.py", line 64, in readObject
    return DictionaryObject.readFromStream(stream, pdf)
  File "tiarno/pyPdf-1.9/pyPdf/generic.py", line 348, in readFromStream
    assert False
AssertionError
AssertionError

===




Re: Quote aware split

2007-05-16 Thread Tim Arnold
"Ondrej Baudys" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Hi,
>
> After trawling through the archives for a simple quote aware split
> implementation (ie string.split-alike that only splits outside of
> matching quote) and coming up short,  I implemented a quick and dirty
> function that suits my purposes.

Take a look at pyparsing--you'll like it I think.
esp. http://pyparsing.wikispaces.com/Examples

--Tim Arnold




extra xml header with ElementTree?

2007-05-25 Thread Tim Arnold
Hi, I'm using ElementTree which is wonderful. I have a need now to write out 
an XML file with these two headers:



My elements have the root named tocbody and I'm using:
newtree = ET.ElementTree(tocbody)
newtree.write(fname)

I assume if I add the encoding arg I'll get the xml header:
newtree = ET.ElementTree(tocbody)
newtree.write(fname,encoding='utf-8')

but how can I get the  into the tree?

python2.4.1,hpux10,ElementTree1.2.6

thanks,
--Tim 




Re: extra xml header with ElementTree?

2007-05-25 Thread Tim Arnold
"Gerard Flanagan" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> On May 25, 3:55 pm, "Tim Arnold" <[EMAIL PROTECTED]> wrote:
>> Hi, I'm using ElementTree which is wonderful. I have a need now to write 
>> out
>> an XML file with these two headers:
>> 
>> 
>>
>> My elements have the root named tocbody and I'm using:
>> newtree = ET.ElementTree(tocbody)
>> newtree.write(fname)
>>
>> I assume if I add the encoding arg I'll get the xml header:
>> newtree = ET.ElementTree(tocbody)
>> newtree.write(fname,encoding='utf-8')
>>
>> but how can I get the  into the tree?
>>
>> python2.4.1,hpux10,ElementTree1.2.6
>>
>
> #This import is for 2.5, change for 2.4
> from xml.etree import cElementTree as ET
>
> tocbody = 'onetwo'
>
> doc = ET.ElementTree(ET.fromstring(tocbody))
>
> outfile = open('\\working\\tmp\\toctest.xml', 'w')
>
> outfile.write('')
>
> outfile.write('')
>
> doc._write(outfile, doc._root, 'utf-8', {})
>
> outfile.close()
>
> -
>
>  
>  
>  
>  one
>  two
>  
>

thanks, this works well. After looking at the ET code, I just used the 
'write' method straight since it calls _write in turn.
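For reference, a minimal sketch of the same idea using only the public API (written for Python 3, where ElementTree ships as xml.etree.ElementTree). The second header line was lost from the archived post, so the DOCTYPE used below is only a placeholder assumption, and so are the 'item' element names:

```python
import io
import xml.etree.ElementTree as ET

def write_with_headers(root, stream, extra_header):
    """Write an XML declaration, one extra header line, then the tree."""
    stream.write('<?xml version="1.0" encoding="utf-8"?>\n')
    stream.write(extra_header + '\n')
    # encoding='unicode' returns text without a declaration of its own
    stream.write(ET.tostring(root, encoding='unicode'))

# Hypothetical document: only the root name 'tocbody' comes from the thread.
tocbody = ET.fromstring('<tocbody><item>one</item><item>two</item></tocbody>')
buf = io.StringIO()
write_with_headers(tocbody, buf, '<!DOCTYPE tocbody>')  # placeholder header
result = buf.getvalue()
```

Writing the extra lines to the stream first and serializing the tree afterwards avoids touching private methods such as _write.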

thanks again,
--Tim




encode/decode misunderstanding

2007-07-26 Thread Tim Arnold
Hi, I'm beginning to understand the encode/decode string methods, but I'd 
like confirmation that I'm still thinking in the right direction:

I have a file of latin1 encoded text. Let's say I put one line of that file 
into a string variable 'tocline', as follows:
tocline = 'Ficha Datos de p\xe9rdida AND acci\xf3n'

import codecs
tocFile = codecs.open('mytoc.htm','wb',encoding='utf8',errors='replace')
tocline = tocline.decode('latin1','replace')
tocFile.write(tocline)
tocFile.close()

What I think is that tocFile is wrapped to ensure that anything written to 
it is in utf8.
I decode the latin1 string into python's internal unicode encoding and that 
gets written out as utf8.

Questions:
what exactly is the tocline when it's read in with that \xe9 and \xf3 in the 
string? A latin1 encoded string?
Is my method the right way to write such a line out to a file with utf8 
encoding?

If I read in the latin1 file using
codecs.open(filename,encoding='latin1') and write out the utf8 file by 
opening with
codecs.open(othername,encoding='utf8'), would I no longer have a problem --  
I could just read in latin1 and write out utf8 with no more worries about 
encoding?

thanks,
--Tim




Re: encode/decode misunderstanding

2007-07-27 Thread Tim Arnold
> If I read in the latin1 file using
> codecs.open(filename,encoding='latin1') and write out the utf8 file by 
> opening with
> codecs.open(othername,encoding='utf8'), would I no longer have a 
> problem --  I could just read in latin1 and write out utf8 with no more 
> worries about encoding?
>
> thanks,

Replying to my own post, I feel so lonely! I guess that silence means I *am* 
thinking correctly about the encoding/decoding stuff; I'll keep heading in 
this direction unless someone out there sees it differently.

--Tim




encoding misunderstanding

2007-07-27 Thread Tim Arnold
Hi,  I'm beginning to understand the encode/decode string methods, but
I'd like confirmation that I'm still thinking in the right direction:

I have a file of latin1 encoded text. Let's say I put one line of that
file
into a string variable 'tocline', as follows:
tocline = 'Ficha Datos de p\xe9rdida AND acci\xf3n'

import codecs
tocFile = codecs.open('mytoc.htm','wb',encoding='utf8',errors='replace')
tocline = tocline.decode('latin1','replace')
tocFile.write(tocline)
tocFile.close()

What I think is that tocFile is wrapped to insure that anything
written to it is in utf8
I decode the latin1 string into python's internal unicode encoding and
that gets written out as utf8.

Questions:
what exactly is the tocline when it's read in with that \xe9 and \xed
in the string? A latin1 encoded string?
Is my method the right way to write such a line out to a file with
utf8
encoding?

If I read in the latin1 file using
codecs.open(filename,encoding='latin1') and write out the utf8 file
by
opening with
codecs.open(othername,encoding='utf8'), would I no longer have a
problem --  I could just read in latin1 and write out utf8 with no
more worries about
encoding?

thanks,
--Tim
p.s. sorry if you see this twice--my newsreader is flaky right now.



Re: encode/decode misunderstanding

2007-07-30 Thread Tim Arnold
"Diez B. Roggisch" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Tim Arnold schrieb:
>> Hi, I'm beginning to understand the encode/decode string methods, but I'd 
>> like confirmation that I'm still thinking in the right direction:
>>
>> I have a file of latin1 encoded text. Let's say I put one line of that 
>> file into a string variable 'tocline', as follows:
>> tocline = 'Ficha Datos de p\xe9rdida AND acci\xf3n'
>>
>> import codecs
>> tocFile = codecs.open('mytoc.htm','wb',encoding='utf8',errors='replace')
>> tocline = tocline.decode('latin1','replace')
>> tocFile.write(tocline)
>> tocFile.close()
>>
>> What I think is that tocFile is wrapped to insure that anything written 
>> to it is in utf8
>> I decode the latin1 string into python's internal unicode encoding and 
>> that gets written out as utf8.
>>
>> Questions:
>> what exactly is the tocline when it's read in with that \xe9 and \xed in 
>> the string? A latin1 encoded string?
>
> Yes. A simple, pure byte-string, that happens to contain bytes which under 
> the latin1-encoding are "correct".
>
>> Is my method the right way to write such a line out to a file with utf8 
>> encoding?
>
> Yes.
>
>> If I read in the latin1 file using
>> codecs.open(filename,encoding='latin1') and write out the utf8 file by 
>> opening with
>> codecs.open(othername,encoding='utf8'), would I no longer have a 
>> problem --  I could just read in latin1 and write out utf8 with no more 
>> worries about encoding?
>
> As long as you don't mix bytestrings and only use unicode-objects, you 
> should be fine, yes.
>
> Diez

wow, I was thinking correctly about encoding! time for a beer!
Diez, thanks very much for confirming my thoughts.
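
The read-latin1, write-utf8 approach confirmed above can be sketched as follows. This is a Python 3 version (the thread itself is Python 2 with codecs.open); the function name and the temporary file names are made up for the demo:

```python
import os
import tempfile

def transcode(src, dst, src_enc='latin-1', dst_enc='utf-8'):
    """Copy src to dst, decoding with src_enc and encoding with dst_enc."""
    # newline='' keeps the line endings exactly as they are in the source.
    with open(src, encoding=src_enc, newline='') as fin, \
         open(dst, 'w', encoding=dst_enc, newline='') as fout:
        for line in fin:        # each line arrives as decoded text
            fout.write(line)    # and is encoded to dst_enc on the way out

# Demo with the sample line from the thread (hypothetical file names).
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'latin1.htm')
dst = os.path.join(workdir, 'utf8.htm')
with open(src, 'wb') as f:
    f.write(b'Ficha Datos de p\xe9rdida AND acci\xf3n\n')
transcode(src, dst)
with open(dst, 'rb') as f:
    utf8_bytes = f.read()
```

Only text objects pass between the two files, so no byte string is ever mixed in, which is the condition Diez names.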

--Tim Arnold 




encoding confusions

2007-03-29 Thread Tim Arnold
I have the contents of a file that contains French documentation.
I've iterated over it and now I want to write it out to a file.

I'm running into problems and I don't understand why--I don't get how the 
encoding works.
My first attempt was just this:
< snipped code for classes, etc; fname is string, codecs module loaded.>
< self.contents is the French file's contents as a single string >

tFile = codecs.open(fname,'w',encoding='latin-1', errors='ignore')
tFile.write(self.contents)
tFile.close()

ok, so that didn't work and I read some more and did this:
tFile.write(self.contents.encode('latin-1'))

but that gives me the same error
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 48: 
ordinal not in range(128)

this is python2.4.1 (hpux)
sys.getdefaultencoding()
'ascii'
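
One likely explanation, assuming self.contents is a byte string rather than a unicode object: in Python 2, both the codecs writer and str.encode() first decode the bytes with the default 'ascii' codec before encoding, and that implicit decode fails on 0xe9. The errors='ignore' argument only applies to the encode step. The sketch below (Python 3, with a made-up sample string) reproduces the failing step explicitly and shows the decode-first alternative:

```python
# Hypothetical sample; the real file contents are not shown in the post.
raw = b'Fran\xe7ais: p\xe9riode'

# This is the step Python 2 performed implicitly with the 'ascii' default.
try:
    raw.decode('ascii')
    error_name = None
except UnicodeDecodeError as exc:
    error_name = type(exc).__name__

# Decoding with the file's real encoding first avoids the error.
text = raw.decode('latin-1')       # bytes -> text
encoded = text.encode('latin-1')   # text -> bytes, ready to write
```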

thanks,
--Tim Arnold




subprocess confusion

2007-04-16 Thread Tim Arnold
Hi,
Just discovered that my subprocess call with the preexec_fn wasn't doing 
what I thought.
If 'machine' value is different than the current machine name, I want to 
remsh the command to that machine, but obviously I misunderstood the 
preexec_fn arg.

Should I just put the remsh in the actual command instead of preexec_fn?
thanks,
--Tim Arnold
---
if machine == socket.gethostname():
    shname = None
else:
    shname = lambda :'/bin/remsh %s ' % (machine)
p = subprocess.Popen(preexec_fn = shname,
                     shell  = True,
                     args   = command,
                     stderr = subprocess.STDOUT,
                     stdout = log,
                     env= env,
                     )
try:
    p.wait()
    if log:
        log.close()
except:
    pass

--- 




Re: subprocess confusion

2007-04-17 Thread Tim Arnold
"Nick Craig-Wood" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Tim Arnold <[EMAIL PROTECTED]> wrote:

>>  Should I just put the remsh in the actual command instead of
>>  preexec_fn?
>
> Yes.
>
> The preexec_fn is run after the fork() but before the exec().  Ie a
> new process has been made, but it hasn't started your task yet.
>
> For example a classic use of preexec_fn is
>
>  preexec_fn=os.setsid
>
> You seem to be thinking it is pre-pending something to your command
> line which isn't how it works.
>
> -- 
> Nick Craig-Wood <[EMAIL PROTECTED]> -- http://www.craig-wood.com/nick

Thanks much to you and Michael H. for the great explanations.
Now everything is working fine, and I understand subprocess a little better!
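
A minimal sketch of putting remsh into the command itself, as suggested above. The helper name and the argument-list form are assumptions for illustration, not the code that was actually used; only the /bin/remsh path comes from the original post:

```python
import socket

def build_argv(command, machine, local_host=None):
    """Return the argument list to hand to subprocess.Popen."""
    if local_host is None:
        local_host = socket.gethostname()
    if machine == local_host:
        # Run locally through the shell.
        return ['/bin/sh', '-c', command]
    # Run remotely: remsh receives the whole command as one argument.
    return ['/bin/remsh', machine, command]

local_argv = build_argv('ls -l', 'hostA', local_host='hostA')
remote_argv = build_argv('ls -l', 'hostB', local_host='hostA')
```

The result would be passed as subprocess.Popen(argv, stdout=log, stderr=subprocess.STDOUT, env=env) with no preexec_fn and no shell=True.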

--Tim Arnold 




cvs module

2007-09-18 Thread Tim Arnold
Hi, I need to do some scripting that interacts with CVS. I've been just 
doing system calls and parsing the output to figure out what's going on, but 
it would be nice to deal with CVS directly.

Does anyone know of a python module I can use to interface with CVS?
thanks,
--Tim Arnold




elementtree question

2007-09-21 Thread Tim Arnold
Hi, I'm using elementtree and elementtidy to work with some HTML files. For 
some of these files I need to enclose the body content in a new div tag, 
like this:

  
   original contents...
  


I figure there must be a way to do it by creating a 'div' SubElement to the 
'body' tag and somehow copying the rest of the tree under that SubElement, 
but it's beyond my comprehension.

How can I accomplish this?
(I know I could put the class on the body tag itself, but that won't satisfy 
the powers-that-be).
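
One way this could be done is sketched below, using the standard-library ElementTree in Python 3 rather than elementtidy. The helper name, the sample markup and the class value 'wrapper' are all assumptions, since the original example lost its tags in the archive:

```python
import xml.etree.ElementTree as ET

def wrap_children(parent, tag, attrib=None):
    """Move parent's text and children under a new child element."""
    wrapper = ET.Element(tag, attrib or {})
    wrapper.text = parent.text          # leading text moves into the wrapper
    parent.text = None
    for child in list(parent):          # copy the list: we mutate parent
        parent.remove(child)
        wrapper.append(child)           # each child keeps its own tail text
    parent.append(wrapper)
    return wrapper

doc = ET.fromstring('<html><body>intro<p>one</p><p>two</p></body></html>')
wrap_children(doc.find('body'), 'div', {'class': 'wrapper'})
result = ET.tostring(doc, encoding='unicode')
```

Elements are moved rather than copied, so nothing below the body needs to be rebuilt.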

thanks,
--Tim Arnold




Re: elementtree question

2007-09-24 Thread Tim Arnold
Thanks for the great answers--I learned a lot. I'm looking forward to the ET 
1.3 version.  I'm currently working on some older HP10.20ux machines and 
haven't been able to compile lxml all the way through yet.

thanks again,
--Tim Arnold




image resize question

2007-10-18 Thread Tim Arnold
Hi, I'm using the Image module to resize PNG images from 300 to 100dpi for 
use in HTML pages, but I'm losing some vertical and horizontal lines in the 
images (usually images of x-y plots).

Here's what I do:
import Image
def imgResize(self,filename):
    img = Image.open(filename)
    dpi = img.info.get('dpi')
    if dpi and 295 < int(dpi[0]) < 305:
        wd = img.size[0]/3.0 #convert from 300dpi to 100dpi
        ht = img.size[1]/3.0
        newimg= img.resize((int(wd),int(ht)))
        newimg.save(filename)

imgResize('myimage.png')

Can someone point me to a better way so I don't lose the reference lines in 
the images?
thanks,
--Tim Arnold




Re: image resize question

2007-10-19 Thread Tim Arnold
"Matimus" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> On Oct 18, 11:56 am, "Tim Arnold" <[EMAIL PROTECTED]> wrote:
>> Hi, I'm using the Image module to resize PNG images from 300 to 100dpi 
>> for
>> use in HTML pages, but I'm losing some vertical and horizontal lines in 
>> the
>> images (usually images of x-y plots).
>>
>> Here's what I do:
>> import Image
>> def imgResize(self,filename):
>>     img = Image.open(filename)
>>     dpi = img.info.get('dpi')
>>     if dpi and 295 < int(dpi[0]) < 305:
>>         wd = img.size[0]/3.0 #convert from 300dpi to 100dpi
>>         ht = img.size[1]/3.0
>>         newimg= img.resize((int(wd),int(ht)))
>>         newimg.save(filename)
>>
>> imgResize('myimage.png')
>>
>> Can someone point me to a better way so I don't lose the reference lines 
>> in
>> the images?
>> thanks,
>> --Tim Arnold
>
> Resize accepts a second parameter that is used to determine what kind
> of downsampling filter to use (http://www.pythonware.com/library/pil/
> handbook/image.htm). The default is Image.NEAREST, which just samples
> the nearest pixel and results in the type of data loss you are seeing.
> If you want something better try one of the following and see which
> works best for you: Image.BILINEAR, Image.BICUBIC or Image.ANTIALIAS.
>
> example:
> ...
>newimg = img.resize((int(wd),int(ht)),Image.ANTIALIAS)
> ...
>
> Matt

Thank you! The ANTIALIAS filter works great. With any of the others, I still 
lost my reference lines.
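
To illustrate why a thin line can vanish, here is a simplified one-dimensional model in plain Python. It is not PIL's actual implementation; it only contrasts picking one source pixel per output pixel with averaging each block of source pixels:

```python
def downsample_nearest(row, factor):
    """Keep one source pixel per output pixel (first pixel of each block)."""
    return [row[i * factor] for i in range(len(row) // factor)]

def downsample_box(row, factor):
    """Average each block of `factor` source pixels."""
    return [sum(row[i * factor:(i + 1) * factor]) / factor
            for i in range(len(row) // factor)]

row = [255] * 9      # white background
row[4] = 0           # a one-pixel-wide dark reference line

nearest = downsample_nearest(row, 3)   # samples pixels 0, 3 and 6 only
box = downsample_box(row, 3)           # every source pixel contributes
```

In this model the sampled result is all white and the line is gone, while the averaged result keeps a grey pixel where the line was.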
--Tim




elementtree w/utf8

2007-10-25 Thread Tim Arnold
Hi, I'm getting the by-now-familiar error:
return codecs.charmap_decode(input,errors,decoding_map)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa9' in position 
4615: ordinal not in range(128)

the html file I'm working with is in utf-8, I open it with codecs, try to 
feed it to TidyHTMLTreeBuilder, but no luck. Here's my code:
from elementtree import ElementTree as ET
from elementtidy import TidyHTMLTreeBuilder

fd = codecs.open(htmfile,encoding='utf-8')
tidyTree = TidyHTMLTreeBuilder.TidyHTMLTreeBuilder(encoding='utf-8')
tidyTree.feed(fd.read())
self.tree = tidyTree.close()
fd.close()

what am I doing wrong? Thanks in advance.

On a related note, I have another question--where/how can I get the 
cElementTree.py module? Sorry for something so basic, but I tried installing 
cElementTree, but while I could compile with setup.py build, I didn't end up 
with a cElementTree.py file anywhere. The directory structure on my system 
(HPux, but no root access) doesn't work well with setup.py install.

thanks,
--Tim Arnold




Re: elementtree w/utf8

2007-10-26 Thread Tim Arnold

"Marc 'BlackJack' Rintsch" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> On Thu, 25 Oct 2007 17:15:36 -0400, Tim Arnold wrote:
>
>> Hi, I'm getting the by-now-familiar error:
>> return codecs.charmap_decode(input,errors,decoding_map)
>> UnicodeEncodeError: 'ascii' codec can't encode character u'\xa9' in 
>> position
>> 4615: ordinal not in range(128)
>>
>> the html file I'm working with is in utf-8, I open it with codecs, try to
>> feed it to TidyHTMLTreeBuilder, but no luck. Here's my code:
>> from elementtree import ElementTree as ET
>> from elementtidy import TidyHTMLTreeBuilder
>>
>> fd = codecs.open(htmfile,encoding='utf-8')
>> tidyTree = TidyHTMLTreeBuilder.TidyHTMLTreeBuilder(encoding='utf-8')
>> tidyTree.feed(fd.read())
>> self.tree = tidyTree.close()
>> fd.close()
>>
>> what am I doing wrong? Thanks in advance.
>
> You feed decoded data to `TidyHTMLTreeBuilder`.  As the `encoding`
> argument suggests this class wants bytes not unicode.  Decoding twice
> doesn't work.
>
> Ciao,
> Marc 'BlackJack' Rintsch

well now that you say it, it seems so obvious...
some day I will get the hang of this encode/decode stuff. When I read about 
it, I'm fine, it makes sense, etc. maybe even a little boring. And then I 
write stuff like the above!
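
The bytes-in idea can be sketched with the standard-library parser standing in for TidyHTMLTreeBuilder, which is not used here. This is Python 3, and the sample markup is made up; the point is only that the parser receives raw bytes and does the single decode itself:

```python
import xml.etree.ElementTree as ET

# Stand-in for open(htmfile, 'rb').read(): undecoded utf-8 bytes.
raw = '<p>\u00a9 2007</p>'.encode('utf-8')

parser = ET.XMLParser(encoding='utf-8')   # told how to decode the bytes
parser.feed(raw)                          # bytes in, decoded exactly once
root = parser.close()
text = root.text
```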

Thanks to you and Diez for straightening me out.
--Tim




Re: elementtree w/utf8

2007-10-29 Thread Tim Arnold

"Stefan Behnel" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Tim Arnold wrote:
>> On a related note, I have another question--where/how can I get the
>> cElementTree.py module? Sorry for something so basic, but I tried 
>> installing
>> cElementTree, but while I could compile with setup.py build, I didn't end 
>> up
>> with a cElementTree.py file anywhere.
>
> That's because it compiles into a binary extension module, not a plain 
> Python
> module (mind the 'c' in its name, which stands for the C language here).
>
> I don't know what the standard library extension is under HP-UX, but look 
> a
> little closer at the files that weren't there before, you'll find it.
> Depending on what you did to build it, it might also end up in the "build"
> directory or as an installable package in the "dist" directory.
>
>
>> The directory structure on my system
>> (HPux, but no root access) doesn't work well with setup.py install.
>
> That shouldn't be a problem as long as you keep the binary in your 
> PYTHONPATH.
>
> As suggested before, if you have Python 2.5, you don't even need to 
> install it
> yourself.
>
> Stefan

very nice--thanks. I saw the cElementTree.sl file, but didn't realize it 
would work as-is.
thanks,
--Tim




Re: Finding non ascii characters in a set of files

2007-02-23 Thread Tim Arnold
"Peter Bengtsson" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> On Feb 23, 2:38 pm, [EMAIL PROTECTED] wrote:
>> Hi,
>>
>> I'm updating my program to Python 2.5, but I keep running into
>> encoding problems. I have no ecodings defined at the start of any of
>> my scripts. What I'd like to do is scan a directory and list all the
>> files in it that contain a non ascii character. How would I go about
>> doing this?
>>
>
> How about something like this:
> content = open('file.py').read()
> try:
>     content.encode('ascii')
> except UnicodeDecodeError:
>     print "file.py contains non-ascii characters"
>
Here's what I do (I need to know the line number).

import os,sys,codecs
def checkfile(filename):
    f = codecs.open(filename,encoding='ascii')

    lines = open(filename).readlines()
    print 'Total lines: %d' % len(lines)
    for i in range(0,len(lines)):
        try:
            l = f.readline()
        except:
            num = i+1
            print 'problem: line %d' % num

    f.close()





Re: Finding non ascii characters in a set of files

2007-02-23 Thread Tim Arnold
"Marc 'BlackJack' Rintsch" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> In <[EMAIL PROTECTED]>, Tim Arnold wrote:
>


>   Untested:
>
> import os, sys, codecs
>
> def checkfile(filename):
>     f = codecs.open(filename,encoding='ascii')
>
>     try:
>         for num, line in enumerate(f):
>             pass
>     except UnicodeError:
>         print 'problem: line %d' % num
>
>     f.close()
>
> Ciao,
> Marc 'BlackJack' Rintsch

Thanks Marc,
That looks much cleaner. I didn't know the 'num' from the enumerate would 
persist so the except block could report it.

thanks again,
--Tim




Re: Finding non ascii characters in a set of files

2007-02-23 Thread Tim Arnold
"Marc 'BlackJack' Rintsch" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> In <[EMAIL PROTECTED]>, Tim Arnold wrote:
>
>> Here's what I do (I need to know the line number).
>>
>> import os,sys,codecs
>> def checkfile(filename):
>>     f = codecs.open(filename,encoding='ascii')
>>
>>     lines = open(filename).readlines()
>>     print 'Total lines: %d' % len(lines)
>>     for i in range(0,len(lines)):
>>         try:
>>             l = f.readline()
>>         except:
>>             num = i+1
>>             print 'problem: line %d' % num
>>
>>     f.close()
>
> I see a `NameError` here.  Where does `i` come from?  And there's no need
> to read the file twice.  Untested:
>
> import os, sys, codecs
>
> def checkfile(filename):
>     f = codecs.open(filename,encoding='ascii')
>
>     try:
>         for num, line in enumerate(f):
>             pass
>     except UnicodeError:
>         print 'problem: line %d' % num
>
>     f.close()
>
> Ciao,
> Marc 'BlackJack' Rintsch

well, I take it back--that code doesn't work, or at least it doesn't for 
my test case.
but thanks anyway, I'm sticking to my original code. the 'i' came from for i 
in range.
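
For comparison, a sketch that reads the data once as bytes and checks each line on its own, so the reported numbers do not depend on how a decoding reader buffers its input. It is written for Python 3, and the function name and sample data are made up:

```python
def non_ascii_lines(data):
    """Return the 1-based numbers of lines containing non-ascii bytes."""
    bad = []
    for num, line in enumerate(data.splitlines(), 1):
        try:
            line.decode('ascii')
        except UnicodeDecodeError:
            bad.append(num)
    return bad

# For a real file: non_ascii_lines(open(filename, 'rb').read())
sample = b'plain\ncaf\xe9\nok\nna\xefve\n'
found = non_ascii_lines(sample)
```

Unlike the try-around-the-loop version, this keeps going after the first problem line and reports all of them.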
--Tim




insert comments into elementtree

2007-11-16 Thread Tim Arnold
Hi, I'm using the TidyHTMLTreeBuilder to generate some elementtrees from 
html. One by-product is that I'm losing comments embedded in the html. So 
I'm trying to put them back in, but I'm doing something wrong: here's the 
code snippet of how I generate the Trees:

from elementtree import ElementTree as ET
from elementtidy import TidyHTMLTreeBuilder
XHTML = "{http://www.w3.org/1999/xhtml}"

htmfile = os.path.join(self.htmloc,filename)
fd = open(htmfile)
tidyTree = TidyHTMLTreeBuilder.TidyHTMLTreeBuilder('utf-8')
tidyTree.feed(fd.read())
fd.close()
try:
    tmp = tidyTree.close()
except:
    print 'Bad file: %s\nSkipping.' % filename
    continue
tree = ET.ElementTree(tmp)

and here's the method I use to put the comments back in:

def addComments(self,tree):
    body = tree.find('./%sbody' % XHTML)
    for elem in body:
        if elem.tag == '%sdiv' % XHTML and elem.get('class'):
            if elem.get('class') == 'remapped':
                comElem = ET.SubElement(elem,ET.Comment('stopindex'))

self.addComments(tree)
filename = os.path.join(self.deliverloc,name)
self.htmlcontent.write(tree,filename,encoding=self.encoding)

when I try this I get errors from the ElementTree _write method:
TypeError: cannot concatenate 'str' and 'instance' objects

thanks for any help!
--Tim Arnold






Re: insert comments into elementtree

2007-11-19 Thread Tim Arnold
"Stefan Behnel" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Tim Arnold wrote:
>> Hi, I'm using the TidyHTMLTreeBuilder to generate some elementtrees from
>> html. One by-product is that I'm losing comments embedded in the html.
>
> That's how the parser in ET works. Use lxml instead, which keeps documents
> intact while parsing.
>
> http://codespeak.net/lxml/dev/
> http://codespeak.net/lxml/dev/lxmlhtml.html
>
> Stefan

Thanks Stefan, I certainly would use lxml if I could get everything to 
compile on this HPux10.20.
I did manage to get this one solved by inserting the comments back in like 
this:
elem.insert(0,ET.Comment('stopindex'))
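
The reason this works where SubElement did not: SubElement expects a tag name as its second argument, while ET.Comment() already returns a finished node, so it has to be inserted or appended directly. A small Python 3 sketch with the standard-library ElementTree and made-up markup:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; only the class and comment text come from the thread.
div = ET.fromstring('<div class="remapped"><p>text</p></div>')

# ET.Comment() builds the comment node; insert() places it first.
div.insert(0, ET.Comment('stopindex'))
result = ET.tostring(div, encoding='unicode')
```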

thanks,
--Tim Arnold




build tool opinions

2008-02-01 Thread Tim Arnold
Hi, I'm going through the various build tools (i.e., make-like) available 
and could use some advice.
My documentation-build system is written in python and uses the pdflatex and 
plasTeX engines to create pdfs, html, and docbook XML from latex source 
files.  All that is ok, but I can clean up a lot of code by letting a build 
tool take care of installing the doc, creating build dirs, interacting with 
the CMS, etc.

So I started out looking at ant, since it's so popular. That led me on to 
look at SCons since it's written in python, and that led me on to look at 
'A-A-P'. Well, there's a lot of options out there.

So my question is what should I use? Impossible to answer I know, but it 
looks like SCons and A-A-P haven't had a lot of development activity--that 
seems to be because they're stable, not because they've been abandoned.

Right now I like the fact that SCons and A-A-P are both written in Python; 
On the other hand I think I could use Jython and Ant too.

Any ideas/opinions/advice would be helpful.
--Tim




using scons as a library

2008-02-08 Thread Tim Arnold
Hi, I've been reading up on the SCons build tool. It's intended to
work by the end-user calling 'scons' on a buildscript. However, I'd
like to use it from my own python project as an imported module, and
have my already-written classes use the Scons objects to take actions
without an external script.

The reason for this somewhat odd question is that I don't want SCons
to build the project--the project itself builds documentation (pdf/
html/xml) from LaTeX sources--my classes handle some complex
configuration issues, source parsing, actual rendering, etc. What I
would gain by using SCons is to let my code hand-off tasks to SCons
like making and cleaning directories, creating zip files, interacting
with CVS, etc.

Has anyone tried this before? It seems doable, but if someone has an
example that would help to shorten my learning curve.

thanks,
--Tim Arnold




first interactive app

2008-03-26 Thread Tim Arnold
hi,
I want to write a tiny interactive app for the following situation:
I have books of many chapters that must be split into volumes before going 
to the printer.
A volume can have up to 600 pages. We obviously break the book into volumes 
only at chapter breaks. Since some chapters make a natural grouping, we want 
some human interaction for where the volume breaks occur.

Not having experience with interactive apps, I'm asking for advice about how 
to go about it. The data I start with is just a dictionary with chapter name 
= ending page number. I figured I would first show where the volumes would 
break with no human interaction, with the begin and ending chapter 
names/pagenumbers for each volume.
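
The no-interaction first pass could look something like the sketch below (Python 3). It assumes the chapters are given as an ordered list of (name, ending page) pairs instead of a plain dictionary, because the order matters; the function name and the sample page numbers are made up:

```python
def split_volumes(chapters, limit=600):
    """Greedily group chapters into volumes of at most `limit` pages.

    chapters: list of (name, ending_page) pairs in book order.
    A single chapter longer than `limit` still gets a volume of its own.
    """
    volumes, current = [], []
    start_page = 0        # last page of the previous volume
    prev_end = 0          # last page of the previous chapter
    for name, end in chapters:
        if current and end - start_page > limit:
            volumes.append(current)      # close the volume before this chapter
            current = []
            start_page = prev_end
        current.append(name)
        prev_end = end
    if current:
        volumes.append(current)
    return volumes

chapters = [('ch1', 200), ('ch2', 300), ('ch3', 450),
            ('ch4', 500), ('ch5', 700), ('ch6', 1250)]
volumes = split_volumes(chapters)
```

The interactive part would then only need to move a break forward or backward by one chapter and rerun the page count for the affected volumes.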

From here I thought about having a slider for each volume, but the number of 
volumes could change during the session.
Or maybe I should just ask 'enter the ending chapter for the first volume' 
and recalculate, etc until all volumes are defined.

Any ideas on a simple interface for this?
thanks,
--Tim




Re: first interactive app

2008-03-27 Thread Tim Arnold

"Miki" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Hello Tim,
>>

>> Any ideas on a simple interface for this?
>>
> How about something like:
>
> Chapter 1 (001-200 200)
> Chapter 2 (200-300 100)
> -- 001-300 300 
> Chapter 3 (300-450 150)
> Chapter 4 (450-500 50)
> -- 300-450 250 
> Chapter 5 (500-600 100)
> -- 500-600 100 
>
> Where the user can move the divider up and down to create new volume,
> they can also add and delete dividers.
>
> The program will not allow to drag the divider above the 600 page
> limit.
>
> HTH,
> --
> Miki <[EMAIL PROTECTED]>
> http://pythonwise.blogspot.com

Hi Miki,
that looks nice, simple, and intuitive. thanks for thinking about it.
Now to dive into some gui coding!

thanks,
--Tim




Re: python resource management

2009-01-19 Thread Tim Arnold
"Philip Semanchuk"  wrote in message 
news:mailman.7530.1232375454.3487.python-l...@python.org...
>
> On Jan 19, 2009, at 3:12 AM, S.Selvam Siva wrote:
>
>> Hi all,
>>
>> I am running a python script which parses nearly 22,000 html files 
>> locally
>> stored using BeautifulSoup.
>> The problem is the memory usage linearly increases as the files are 
>> being
>> parsed.
>> When the script has crossed parsing 200 files or so, it consumes all  the
>> available RAM and The CPU usage comes down to 0% (may be due to 
>> excessive
>> paging).
>>
>> We tried 'del soup_object'  and used 'gc.collect()'. But, no 
>> improvement.
>>
>> Please guide me how to limit python's memory-usage or proper method  for
>> handling BeautifulSoup object in resource effective manner
>
> You need to figure out where the memory is disappearing. Try  commenting 
> out parts of your script. For instance, maybe start with a  minimalist 
> script: open and close the files but don't process them.  See if the 
> memory usage continues to be a problem. Then add elements  back in, making 
> your minimalist script more and more like the real  one. If the extreme 
> memory usage problem is isolated to one component  or section, you'll find 
> it this way.
>
> HTH
> Philip

Also, are you creating a separate soup object for each file or reusing one 
object over and over?
--Tim




Re: Using lxml to screen scrap a site, problem with charset

2009-02-02 Thread Tim Arnold
"?? ???"  wrote in message 
news:ciqh56-ses@archaeopteryx.softver.org.mk...
> So, I'm using lxml to screen scrap a site that uses the cyrillic
> alphabet (windows-1251 encoding). The site's HTML doesn't have the
> <meta ..content-type.. charset=..> header, but does have an HTTP header that
> specifies the charset... so they are standards compliant enough.
>
> Now when I run this code:
>
> from lxml import html
> doc = html.parse('http://a1.com.mk/')
> root = doc.getroot()
> title = root.cssselect(('head title'))[0]
> print title.text
>
> the title.text is a unicode string, but it has been wrongly decoded as
> latin1 -> unicode
>
> So.. is this a deficiency/bug in lxml or I'm doing something wrong.
> Also, what are my other options here?
>
>
> I'm running Python 2.6.1 and python-lxml 2.1.4 on Linux if matters.
>
> -- 
> ?? ( http://softver.org.mk/damjan/ )
>
> "Debugging is twice as hard as writing the code in the first place.
> Therefore, if you write the code as cleverly as possible, you are,
> by definition, not smart enough to debug it." - Brian W. Kernighan
>

The way I do that is to open the file with codecs, encoding=cp1251, read it 
into variable and feed that to the parser.
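
A small sketch of the difference, in Python 3 and with a made-up sample word: the same bytes give the wrong text when read as latin1 and the right text when read as cp1251 (windows-1251), which is why decoding explicitly before parsing helps:

```python
# Hypothetical sample: a Cyrillic word as it would arrive from the site.
raw = '\u0412\u0435\u0441\u0442\u0438'.encode('cp1251')

misread = raw.decode('latin-1')   # what a latin1 guess produces
text = raw.decode('cp1251')       # decoded with the site's real charset
```

The decoded text, not the raw bytes, is then what gets handed to the parser.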

--Tim




Re: parse/slice/...

2009-01-07 Thread Tim Arnold
"rcmn"  wrote in message 
news:51451b8a-6377-45d7-a8c8-54d4cadb2...@n33g2000pri.googlegroups.com...
> I'm not sure how to call it sorry for the subject description.
>   Here what i'm trying to accomplish.
> the script i'm working on, take a submitted list (for line in file)
> and generate thread for it. unfortunately winxp has a limit of 500
> thread . So I have to parse/slice the file by chunk of 500 and loop
> until the list is done.
> I though i would of done it in no time but i can't get started for
> some reason.
> And i can't find a good way to do it efficiently . Does anyone have
> something similar to this.
>
> thank you

Here's how I work on a list a bunch of items (100 by default) at a time:

def drain_list(tlist,step=None):
    if not step:
        step = 100
    j=0
    for i in range(step,len(tlist), step):
        yield tlist[j:i]
        j = i
    if j < len(tlist):
        yield tlist[j:]
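
A shorter variant of the same generator, shown with the 500-item batches from the question. This is a Python 3 sketch and the names are made up:

```python
def chunks(items, step=100):
    """Yield successive slices of at most `step` items."""
    for start in range(0, len(items), step):
        yield items[start:start + step]

work = list(range(1200))                  # stand-in for the submitted list
sizes = [len(batch) for batch in chunks(work, 500)]
```

Slicing past the end of a list is safe, so the last short batch needs no special case.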

--Tim




Re: Printed Documentation

2009-01-08 Thread Tim Arnold
"floob"  wrote in message 
news:0af87074-6d9c-41a8-98ec-501f6f37b...@s1g2000prg.googlegroups.com...
>I have been searching for a way to print the official Python
> documentation into some kind of book (for my own uses).  I don't
> really care if it's printed on newspaper and bound with elmer's
> glue ... any way I can get relatively recent _official documentation_
> in print form will do.
>
> I'm on the go a lot, and can't read for long periods of time on LCD
> screens anyhow (so having a laptop is not my solution).  Until eBook
> readers grow up a bit, I'm stuck trying to print the documentation
> that I REALLY need to read and absorb.
>
> Lulu.com is an option, but it would cost something around $100 US
> before shipping to get everything printed.  Also, I would have to
> split up some larger documents into Volumes, which I'd rather not have
> to do.
>
> Has anyone tried this before?  Is the documentation already available
> in print?
>
> Thanks,
>
> drfloob

just a datapoint, but I used lulu.com to print the latex sources (525 pages) 
hardbound for a cost of $25 US.
--Tim Arnold




drive a desktop app from python?

2009-01-08 Thread Tim Arnold
Hi, I don't even know what to google for on this one. I need to drive a 
commercial desktop app (on windows xp) since the app doesn't have a batch 
interface.  It's intended to analyze one file at a time and display a 
report.

I can get the thing to write out the report to an html browser, but I have 
thousands of files I need it to analyze every night.

Is there any lib or recipe(s) for doing something like this via python?

thanks,
--Tim Arnold




lxml removing tag, keeping text order

2008-10-24 Thread Tim Arnold
Hi,
Using lxml to clean up auto-generated xml to validate against a dtd; I need 
to remove an element tag but keep the text in order. For example
s0 = '''

   first text
ladida
emphasized text
middle text

last text
  
'''

I want to get rid of the <emphasis> tag but keep everything else as it is; 
that is, I need this result:


   first text
ladida
emphasized text
middle text

last text
  


I'm beginning to think this an impossible task, so I'm asking here to see if 
there is some method that will work. What I've done so far is this:

(outer encloses the parent, outside is the parent, inside is the child to 
remove)
from lxml import etree
import copy
def rm_tag(elem, outer, outside, inside):
    newdiv = etree.Element(outside)
    newdiv.text = ''
    for e0 in elem.getiterator(outside):
        for i,e1 in enumerate(e0.getiterator()):
            if i == 0:
                if e1.text: newdiv.text += e1.text
            elif (e1.tag != inside):
                newdiv.append(copy.deepcopy(e1))
            elif (e1.text):
                newdiv.text += e1.text

    for t in elem.getiterator():
        if t.tag == outer:
            t.clear()
            t.append(newdiv)
            break
    return etree.ElementTree(elem)

print etree.tostring(rm_tag(el,'option','optional','emphasis'),pretty_print=True)

But the text is messed up using this method. I see why it's wrong, but not 
how to make it right.
It returns:

   first text
emphasized text
ladida

last text
  


Maybe I should send the outside element (via tostring) to a regexp for 
removing the child and return that string? Regexp? Getting desperate, hey.

Any pointers much appreciated,
--Tim Arnold




Re: lxml removing tag, keeping text order

2008-10-27 Thread Tim Arnold
"Stefan Behnel" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Tim Arnold wrote:
>> Hi,
>> Using lxml to clean up auto-generated xml to validate against a dtd; I 
>> need
>> to remove an element tag but keep the text in order. For example
>> s0 = '''
>> 
>>first text
>> ladida
>> emphasized text
>> middle text
>> 
>> last text
>>   
>> '''
>>
>> I want to get rid of the <emphasis> tag but keep everything else as it 
>> is;
>> that is, I need this result:
>>
>> 
>>first text
>> ladida
>> emphasized text
>> middle text
>> 
>> last text
>>   
>> 
>
> There's a drop_tag() method in lxml.html (lxml/html/__init__.py) that does
> what you want. Just copy the code over to your code base and adapt it as 
> needed.
>
> Stefan
Thanks Stefan, I was going crazy with this. That method is going to be quite 
useful for my project and it's good to learn from too; I was making it too 
hard.

thanks,
--Tim Arnold 
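
For readers without lxml.html at hand, the idea behind that method can be sketched with the standard library's xml.etree.ElementTree (Python 3). This is not the lxml code itself: drop_tag and _splice below are illustrative names, and the sample document is an assumption built from the tag names passed to rm_tag in the original question. The point is that the dropped element's text joins whatever text precedes it, and its tail joins whatever follows.

```python
import xml.etree.ElementTree as ET


def _splice(parent, index):
    """Replace parent[index] with its own text, children and tail."""
    child = parent[index]
    grandchildren = list(child)
    text = child.text or ''
    tail = child.tail or ''
    if grandchildren:
        # The dropped element's tail now follows its last child.
        last = grandchildren[-1]
        last.tail = (last.tail or '') + tail
    else:
        text += tail
    if index == 0:
        # Nothing precedes the child, so its text joins the parent's text.
        parent.text = (parent.text or '') + text
    else:
        # Otherwise it joins the tail of the previous sibling.
        previous = parent[index - 1]
        previous.tail = (previous.tail or '') + text
    parent.remove(child)
    for offset, grandchild in enumerate(grandchildren):
        parent.insert(index + offset, grandchild)


def drop_tag(parent, tag):
    """Remove every <tag> element below parent, keeping all text in order."""
    i = 0
    while i < len(parent):
        if parent[i].tag == tag:
            _splice(parent, i)  # re-examine whatever now sits at index i
        else:
            drop_tag(parent[i], tag)
            i += 1
    return parent


# Hypothetical sample; the real markup was lost from the archived post.
sample = ('<optional>first text <emphasis>emphasized text</emphasis>'
          ' middle text <b>x</b> last text</optional>')
root = drop_tag(ET.fromstring(sample), 'emphasis')
print(ET.tostring(root, encoding='unicode'))
```

With lxml installed, the drop_tag() method Stefan mentions (or lxml.etree.strip_tags) is the less error-prone choice.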




Re: storing a string data in access

2008-11-03 Thread Tim Arnold
"alex23" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
On Nov 3, 3:47 am, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
>> Hi
>> I have
>> access.Fields("Time").value=t
>> I would like t to be a string representing a data. How can I do this?
>
> t = "string representing a datum"
> access.Fields("Time").value = t

maybe OP means t = "string representing a date", but I'm just guessing.

--Tim Arnold




modifying a codec

2008-11-05 Thread Tim Arnold
Hi, I'm using the codecs module to read in utf8 and write out cp1252 
encodings. For some characters I'd like to override the default behavior. 
For example, the mdash character comes out as the code point \227 and I'd 
like to translate it as &mdash; instead.
Example: the file myutf8.txt contains this string:
'factor one - initially'

import codecs

fd0 = codecs.open('myutf8.txt', 'rb', encoding='utf8')
line = fd0.read()
fd0.close()

fd1 = codecs.open('my1252.txt', 'wb', encoding='cp1252')
fd1.write(line)
fd1.close()


The codec is doing its job, but I want to override the codepoint for this 
character (plus others) to use the html entity instead (from \227 to 
&mdash; in this case).

I see hints about writing your own codec and updating the decoding_map, but I 
could use some more detail.

Is that the best way to solve the problem?

thanks,
--Tim Arnold
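
One alternative to a custom codec, sketched here for Python 3 and not taken from the thread, is to substitute the entities on the decoded text before it is encoded. The ENTITIES table and the function name are assumptions for illustration; the table would need to list whichever characters should be overridden.

```python
# Code points to replace with HTML entities before encoding (illustrative).
ENTITIES = {
    0x2014: '&mdash;',  # em dash: byte 0x97 (octal \227) in cp1252
    0x2013: '&ndash;',  # en dash
}


def encode_with_entities(text, table=ENTITIES, encoding='cp1252'):
    """Swap selected characters for entities, then encode the rest as usual."""
    return text.translate(table).encode(encoding)


print(encode_with_entities('factor one \u2014 initially'))
```

Characters that are not in the table still go through the stock cp1252 codec, so nothing else about the output changes.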




piping input to an external script

2009-05-11 Thread Tim Arnold
Hi, I have some html files that I want to validate by using an external 
script 'validate'. The html files need a doctype header attached before 
validation. The files are in utf8 encoding. My code:
---
import os,sys
import codecs,subprocess
HEADER = ''

filename  = 'mytest.html'
fd = codecs.open(filename,'rb',encoding='utf8')
s = HEADER + fd.read()
fd.close()

p = subprocess.Popen(['validate'],
                     stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT)
validate = p.communicate(unicode(s,encoding='utf8'))
print validate
---

I get lots of lines like this:
Error at line 1, character 66:\tillegal character number 0
etc etc.

But I can give the command in a terminal 'cat mytest.html | validate' and 
get reasonable output. My subprocess code must be wrong, but I could use 
some help to see what the problem is.

python2.5.1, freebsd6
thanks,
--Tim




Re: piping input to an external script

2009-05-12 Thread Tim Arnold
"Dave Angel"  wrote in message 
news:mailman.25.1242113076.8015.python-l...@python.org...
> Tim Arnold wrote:
>> Hi, I have some html files that I want to validate by using an external 
>> script 'validate'. The html files need a doctype header attached before 
>> validation. The files are in utf8 encoding. My code:
>> ---
>> import os,sys
>> import codecs,subprocess
>> HEADER = '> Transitional//EN">'
>>
>> filename  = 'mytest.html'
>> fd = codecs.open(filename,'rb',encoding='utf8')
>> s = HEADER + fd.read()
>> fd.close()
>>
>> p = subprocess.Popen(['validate'],
>> stdin=subprocess.PIPE,
>> stdout=subprocess.PIPE,
>> stderr=subprocess.STDOUT)
>> validate = p.communicate(unicode(s,encoding='utf8'))
>> print validate
>> ---
>>
>> I get lots of lines like this:
>> Error at line 1, character 66:\tillegal character number 0
>> etc etc.
>>
>> But I can give the command in a terminal 'cat mytest.html | validate' and 
>> get reasonable output. My subprocess code must be wrong, but I could use 
>> some help to see what the problem is.
>>
>> python2.5.1, freebsd6
>> thanks,
>> --Tim
>>
>>
>>
>>
> The usual rule in debugging:  split the problem into two parts, and test 
> each one separately, starting with the one you think most likely to be the 
> culprit
>
> In this case the obvious place to split is with the data you're passing to 
> the  communicate call.  I expect it's already wrong, long before you hand 
> it to the subprocess.  So write it to a file instead, and inspect it with 
> a binary file viewer.  And of course test it manually with your validate 
> program.  Is validate really expecting a Unicode stream in stdin ?
>

Good advice from everyone. The example was simpler than my actual situation, 
but it did show the problem. Dave's final question was the right one: I 
needed to pass the html content as a string, not unicode object:

HEADER = '\n'

filename  = 'mytest.html'
fd = codecs.open(filename,'rb',encoding='utf8')
s = HEADER + fd.read().encode('utf8') # <- made the difference
fd.close()

p = subprocess.Popen(['validate',],
                     stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT)
validate = p.communicate(s)
print validate
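
The same fix in Python 3 terms, as a minimal sketch: a pipe carries bytes, so the text is encoded explicitly on the way in and decoded on the way out. pipe_through is a hypothetical helper name, and the command passed to it stands in for 'validate'.

```python
import subprocess


def pipe_through(cmd, text, encoding='utf-8'):
    """Send text to cmd's stdin as encoded bytes and return its decoded output."""
    proc = subprocess.Popen(cmd,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    out, _ = proc.communicate(text.encode(encoding))
    return out.decode(encoding)
```

A call would look like pipe_through(['validate'], HEADER + html_text), assuming the validator reads utf-8 on stdin.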





Re: LaTeXing python programs

2009-05-20 Thread Tim Arnold

"John Reid"  wrote in message 
news:mailman.458.1242842132.8015.python-l...@python.org...
> Edward Grefenstette wrote:
>> I'm typing up my master's  thesis and will be including some of the
>> code used for my project in an appendix. The question is thus: is
>> there a LaTeX package out there that works well for presenting python
>> code?
>>
>> verbatim is a bit ugly and doesn't wrap code, and while there are a
>> plethora of code colouring packages out there, they are not all easy
>> to use, so I thought I might ask what the popular options in the
>> community are.
>
> pygments
>
and listings




using PIL for good screenshots

2008-05-12 Thread Tim Arnold
Hi,
I'm using PIL to enhance screenshots for print and online publication. I'm 
writing to see if someone else is doing similar work. The shots are dialogs, 
menus, etc. -- my workflow to create the print images:

(1) writer takes screenshot on Windows XP box (96dpi)
--
*** Python Image Library ***
(2) convert to RGB
(3) resize to a writer-specified width using nearest neighbor*
(4) enhance with Sharpness enhancer, factor=2

I think these look pretty good, but if you have a different method or good 
advice, I'd like to hear from you.
*nearest neighbor used for going up in size, antialias for going down in 
size.

thanks,
--Tim Arnold
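
Steps (2) to (4) can be sketched as below, assuming the Pillow fork of PIL under Python 3. The function name is hypothetical, keeping the aspect ratio is an assumption (the post only says the writer specifies a width), and Image.LANCZOS is used as the antialiasing filter for downsizing.

```python
from PIL import Image, ImageEnhance


def prepare_for_print(img, target_width, sharpness=2.0):
    """Convert to RGB, resize to target_width, then sharpen."""
    img = img.convert('RGB')
    width, height = img.size
    # Assumption: scale the height so the aspect ratio is preserved.
    target_height = max(1, int(round(height * target_width / float(width))))
    # Nearest neighbour when enlarging, an antialiasing filter when shrinking.
    resample = Image.NEAREST if target_width >= width else Image.LANCZOS
    img = img.resize((target_width, target_height), resample)
    return ImageEnhance.Sharpness(img).enhance(sharpness)
```

A typical call would be prepare_for_print(Image.open('dialog.png'), 600).save('dialog_print.png'), with both file names made up for the example.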




Re: using PIL for good screenshots

2008-05-13 Thread Tim Arnold
On May 12, 8:11 pm, [EMAIL PROTECTED] wrote:
> Tim,
>
> Sounds like an interesting project.
>
> Have you considered using SnagIt to produce your screenshots?
> www.TechSmith.com/SnagIt
>
> Malcolm

Thanks for the interest on this, but I don't control what the writers
use to get the screenshot. They give me 8-bit png screenshots
(sometimes 24-bit) captured at 96 dpi. The part I can control comes
after that--I need a workflow for getting the screenshot into print,
looking as good as possible.

thanks,
--Tim Arnold



pass data from optparse to other class instances

2008-06-09 Thread Tim Arnold
Hi,
I'm writing a command-line interface using optparse. The cli takes
several options with a single action and several parameters to be used
in the resulting worker classes.

I've been passing parameters from optparse to the workers in two ways:
(1) creating a Globals.py module, set parameters once in the cli code
and read it when needed in the worker class methods. Something like this:
import Globals
class Foo(object):
    def __init__(self):
        if Globals.debug:
            etc
(2) passing a parameter directly to the worker class __init__ method:
class Bar(object):
    def __init__(self, verbose=False):
        etc

Are those methods the best/only ways to pass these parameters around?
What's the smart way to do it?
thanks,
--Tim


Re: pass data from optparse to other class instances

2008-06-10 Thread Tim Arnold
On Jun 9, 5:42 pm, "Diez B. Roggisch" <[EMAIL PROTECTED]> wrote:
> Tim Arnold wrote:
>
>
>
> > Hi,
> > I'm writing a command-line interface using optparse. The cli takes
> > several options with a single action and several parameters to be used
> > in the resulting worker classes.
>
> > I've been passing parameters from optparse to the workers in two ways:
> > (1) creating a Globals.py module, set parameters once in the cli code
> > and read it
> > when needed in the worker class methods. Something like this:
> > import Globals
> > class Foo(object):
> >     def __init__(self):
> >         if Globals.debug:
> >             etc
> > (2) passing a parameter directly to the worker class __init__ method:
> > class Bar(object):
> >     def __init__(self, verbose=False):
> >         etc
>
> > Are those methods the best/only ways to pass these parameters around?
> > What's the smart way to do it?
>
> Essentially these are the two ways - and there is not "the" way. Both
> approaches are reasonable.
>
> Generally it is better to refuse the temptation to work with global
> state - because only that ensures that code is de-coupled and more
> responsible regarding state.
>
> However there is no need to jump through overly high mounted hoops to
> reach that - especially when config-options affect overall program
> behaviour, such as verbosity.
>
> So - no clear answer, sorry :)
>
> Diez

Thanks for this info. I'm glad to know my thought process is on the
right track. What if I put all this (optparse, worker classes)
together into a package: I guess then I could have my globals set in
the __init__.py.

Which doesn't buy me that much over just importing Globals.py, does it?
I kind-of understand about avoiding globals -- your comment about
coupling helped me understand it more. If I put my often used
functions and some global vars in __init__.py, is that any better than
importing them explicitly from Globals.py?
thanks again,
--Tim
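
A minimal sketch of approach (2), where the parsed options are handed to the worker explicitly so that no module-level state is needed. Worker, build_parser and the two flags are illustrative names, not part of the original program.

```python
from optparse import OptionParser


class Worker(object):
    """Receives its settings explicitly instead of reading a Globals module."""

    def __init__(self, verbose=False, debug=False):
        self.verbose = verbose
        self.debug = debug


def build_parser():
    parser = OptionParser()
    parser.add_option('-v', '--verbose', action='store_true', default=False)
    parser.add_option('-d', '--debug', action='store_true', default=False)
    return parser


def main(argv):
    # Parse once in the cli layer, then pass the values down.
    options, args = build_parser().parse_args(argv)
    return Worker(verbose=options.verbose, debug=options.debug)
```

Calling main(sys.argv[1:]) from the script entry point keeps the worker classes importable and testable without any command line.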




Re: regex for balanced parentheses?

2008-06-12 Thread Tim Arnold
"Paul McGuire" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Parsing TeX is definitely not for the faint-of-heart!  You might try
> something like QuotedString('$', escQuote='$$') in pyparsing.  (I've
> not poked at TeX or its ilk since the mid-80's so my TeXpertise is
> long rusted away.)
>
> I know of two projects that have taken on the problem using pyparsing
> - one is the mathtext module in John Hunter's matplotlib, and Tim
> Arnold posted some questions on the subject a while back - try
> googling for "pyparsing tex" for further leads.
>
> -- Paul

Definitely agree that TeX can get pretty complicated. My method (writing a 
converter from one TeX tag system to another)  was to pre-parse using string 
match/replace for really simple stuff, regular expressions for the more 
complex and pyparsing for the really tough stuff.

One thing that was surprisingly hard for me to figure out was filtering out 
comments. I finally just looped through the file line by line, looking for a 
'%' that wasn't in a verbatim environment and wasn't escaped, etc.
Funny how sometimes the simplest thing can be difficult to handle.

Definitely pyparsing made the job possible; I can't imagine that job without 
it.

--Tim Arnold
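
The line-by-line comment filter described above might look roughly like this. It is a simplified sketch, not the original converter: it only knows the verbatim environment, treats a '%' as escaped when an odd number of backslashes precedes it, and expects lines without their trailing newline. The function names are made up.

```python
def strip_comment(line):
    """Cut a line at the first '%' that is not escaped by a backslash."""
    backslashes = 0
    for i, ch in enumerate(line):
        if ch == '\\':
            backslashes += 1
            continue
        if ch == '%' and backslashes % 2 == 0:
            return line[:i]
        backslashes = 0
    return line


def strip_tex_comments(lines):
    """Remove comments from TeX lines, leaving verbatim blocks untouched."""
    out = []
    in_verbatim = False
    for line in lines:
        if r'\begin{verbatim}' in line:
            in_verbatim = True
        if in_verbatim:
            out.append(line)
            if r'\end{verbatim}' in line:
                in_verbatim = False
            continue
        out.append(strip_comment(line))
    return out
```

Other verbatim-like constructs (\verb, listings environments and so on) would need their own handling.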





Re: IDE on the level of Eclipse or DEVc++?

2008-06-23 Thread Tim Arnold
"cirfu" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> is there an IDE for python of the same quality as Eclipse or DEVC++?
>
> I am currently using the editor that comes with python and it is all
> fine but for bigger projects it would be nice to have some way to
> easier browse the projectfiles for example.

why not eclipse itself, using the pydev plugin?
--Tim




Re: Build tool for Python

2008-07-30 Thread Tim Arnold
"Terry Reedy" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
>
>
> Hussein B wrote:
>> Hi.
>> Apache Ant is the de facto building tool for Java (whether JSE, JEE
>> and JME) application.
>> With Ant you can do what ever you want: compile, generate docs,
>> generate code, packing, deploy, connecting to remote servers and every
>> thing.
>> Do we have such a tool for Python projects?
>
> Also see thread Continuous integration for Python projects and mention of 
> buildbot.

Surprised no one has mentioned SCons, http://www.scons.org/
I've used it a bit and found it pretty good, out of the box.

--Tim Arnold




set file permission on windows

2008-04-08 Thread Tim Arnold
hi, I need to set file permissions on some directory trees in windows using 
Python.

When I click on properties for a file and select the 'Security' tab, I see a 
list of known 'Group or user names' with permissions for each entry such as
Full Control, Modify, Read&Execute,  etc.

I need to (for example) periodically set Group Permissions for one group to 
Read, and another Group to None. I need to apply the settings to several 
directory trees recursively.

If this was on Unix, I'd just use os.chmod I guess. I don't think that will 
work in this case since all I know is the Group names and the permissions I 
need to allow.

thanks for any pointers,
--Tim Arnold




Re: set file permission on windows

2008-04-08 Thread Tim Arnold
"Mike Driscoll" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> On Apr 8, 12:03 pm, "Tim Arnold" <[EMAIL PROTECTED]> wrote:
>>

> According to the following thread, you can use os.chmod on Windows:
>
> http://mail.python.org/pipermail/python-list/2003-June/210268.html
>
> You can also do it with the PyWin32 package. Tim Golden talks about
> one way to do it here:
>
> http://timgolden.me.uk/python/win32_how_do_i/add-security-to-a-file.html
>
> Also see the following thread:
>
> http://mail.python.org/pipermail/python-win32/2004-July/002102.html
>
> or
>
> http://bytes.com/forum/thread560518.html
>
> Hope that helps!
>
> Mike

Hi Mike,
It does help indeed, especially the last two links. That certainly gets me 
started in the right direction. I'm always amazed at the helpful generosity 
of the folks on this list.
thanks again for the help.
--Tim Arnold




Re: set file permission on windows

2008-04-09 Thread Tim Arnold
"Tim Golden" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Tim Arnold wrote:
>> "Mike Driscoll" <[EMAIL PROTECTED]> wrote in message 
>> news:[EMAIL PROTECTED]
>>> On Apr 8, 12:03 pm, "Tim Arnold" <[EMAIL PROTECTED]> wrote:
>>>>
>>
>>> According to the following thread, you can use os.chmod on Windows:
>>>
>>> http://mail.python.org/pipermail/python-list/2003-June/210268.html
>>>
>>> You can also do it with the PyWin32 package. Tim Golden talks about
>>> one way to do it here:
>>>
>>> http://timgolden.me.uk/python/win32_how_do_i/add-security-to-a-file.html
>>>
>>> Also see the following thread:
>>>
>>> http://mail.python.org/pipermail/python-win32/2004-July/002102.html
>>>
>>> or
>>>
>>> http://bytes.com/forum/thread560518.html
>>>
>>> Hope that helps!
>>>
>>> Mike
>>
>> Hi Mike,
>> It does help indeed, especially the last two links.
>
> Hi, Tim. For the purposes of improving that page of mine linked
> above, would you mind highlighting what made it less useful
> than the last two links? On the surface, it seems to match your
> use case pretty closely. Was there too much information? Too
> little? Poor formatting? Just didn't feel right? I've a small set
> of security-related pages in train and I'd rather produce something which 
> people find useful.
>
> Thanks
>
> TJG

Hi TJG. Thanks for the site. Unfortunately, I mis-typed in the previous 
reply and that should have been the 'first two links' instead of 'last two 
links'. In fact I bookmarked your site so I can re-read the material and I 
copied the code to play around with.  Excellent example--it contains just 
what I needed to know, esp. since it replaces the dacl instead of modifying 
one. Now I can remove access for 'Everybody' by simply not including it in 
the new dacl.

thanks!
--Tim Arnold




convert xhtml back to html

2008-04-24 Thread Tim Arnold
hi, I've got lots of xhtml pages that need to be fed to MS HTML Workshop to 
create  CHM files. That application really hates xhtml, so I need to convert 
self-ending tags (e.g. ) to plain html (e.g. ).

Seems simple enough, but I'm having some trouble with it. regexps trip up 
because I also have to take into account 'img', 'meta', 'link' tags, not 
just the simple 'br' and 'hr' tags. Well, maybe there's a simple way to do 
that with regexps, but my simpleminded )]+/> doesn't work. I'm not 
enough of a regexp pro to figure out that lookahead stuff.

I'm not sure where to start now; I looked at BeautifulSoup and 
BeautifulStoneSoup, but I can't see how to modify the actual tag.

thanks,
--Tim Arnold




Re: convert xhtml back to html

2008-04-24 Thread Tim Arnold
"Gary Herron" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> Tim Arnold wrote:
>> hi, I've got lots of xhtml pages that need to be fed to MS HTML Workshop 
>> to create  CHM files. That application really hates xhtml, so I need to 
>> convert self-ending tags (e.g. ) to plain html (e.g. ).
>>
>> Seems simple enough, but I'm having some trouble with it. regexps trip up 
>> because I also have to take into account 'img', 'meta', 'link' tags, not 
>> just the simple 'br' and 'hr' tags. Well, maybe there's a simple way to 
>> do that with regexps, but my simpleminded )]+/> doesn't work. 
>> I'm not enough of a regexp pro to figure out that lookahead stuff.
>>
>> I'm not sure where to start now; I looked at BeautifulSoup and 
>> BeautifulStoneSoup, but I can't see how to modify the actual tag.
>>
>> thanks,
>> --Tim Arnold
>>
>>
>> --
>> http://mail.python.org/mailman/listinfo/python-list
>>
> Whether or not you can find an application that does what you want, I 
> don't know, but at the very least I can say this much.
>
> You should not be reading and parsing the text yourself!  XHTML is valid 
> XML, and there are lots of ways to read and parse XML with Python. 
> (ElementTree is what I use, but other choices exist.)   Once you use an 
> existing package to read your files into an internal tree structure 
> representation, it should be a relatively easy job to traverse the tree to 
> emit the tags and text you want.
>
>
> Gary Herron
>
I agree and I'd really rather not parse it myself. However, ET will clean up 
the file which in my case includes some comments required as metadata, so 
that won't work. Oh, I could get ET to read it and write a new parser--I see 
what you mean. I think I need to subclass so I could get ET to honor those 
comments too.
That's one way to go, I was just hoping for something easier.
thanks,
--Tim




Re: convert xhtml back to html

2008-04-24 Thread Tim Arnold
"Arnaud Delobelle" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> "Tim Arnold" <[EMAIL PROTECTED]> writes:
>
>> hi, I've got lots of xhtml pages that need to be fed to MS HTML Workshop 
>> to
>> create  CHM files. That application really hates xhtml, so I need to 
>> convert
>> self-ending tags (e.g. ) to plain html (e.g. ).
>>
>> Seems simple enough, but I'm having some trouble with it. regexps trip up
>> because I also have to take into account 'img', 'meta', 'link' tags, not
>> just the simple 'br' and 'hr' tags. Well, maybe there's a simple way to 
>> do
>> that with regexps, but my simpleminded )]+/> doesn't work. I'm 
>> not
>> enough of a regexp pro to figure out that lookahead stuff.
>
> Hi, I'm not sure if this is very helpful but the following works on
> the very simple example below.
>
>>>> import re
>>>> xhtml = 'hello  spam  bye '
>>>> xtag = re.compile(r'<([^>]*?)/>')
>>>> xtag.sub(r'<\1>', xhtml)
> 'hello  spam  bye '
>
>
> -- 
> Arnaud

Thanks for that. It is helpful--I guess I had a brain malfunction. Your 
example will work for me I'm pretty sure, except in some cases where the IMG 
alt text contains a gt sign. I'm not sure that's even possible, so maybe 
this will do the job.
thanks,
--Tim
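
A variant of Arnaud's substitution that only touches a fixed list of empty elements can be sketched as follows. The element list and the function name are assumptions for the example, and the caveat raised above still applies: an attribute value containing '<' or '>' is not handled.

```python
import re

# Elements that HTML writes without a closing slash (illustrative list).
VOID_ELEMENTS = ('br', 'hr', 'img', 'meta', 'link', 'input')

_SELF_CLOSING = re.compile(
    r'<(%s)\b([^<>]*?)\s*/>' % '|'.join(VOID_ELEMENTS),
    re.IGNORECASE)


def xhtml_to_html(text):
    """Rewrite <br/> or <img ... /> as <br> or <img ...>."""
    return _SELF_CLOSING.sub(r'<\1\2>', text)
```

Other self-closed elements, for example an empty <p/>, are left alone, since plain HTML would need a separate closing tag there.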




Re: convert xhtml back to html

2008-04-25 Thread Tim Arnold
"bryan rasmussen" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> I'll second the recommendation to use xsl-t, set the output to html.
>
>
> The code for an XSL-T to do it would be basically:
> http://www.w3.org/1999/XSL/Transform"; 
> version="1.0">
> 
>
> 
>
> you would probably want to do other stuff than just  copy it out but
> that's another case.
>
> Also, from my recollection the solution in CHM to make XHTML br
> elements behave correctly was  as opposed to , at any rate
> I've done projects generating CHM and my output markup was well formed
> XML at all occasions.
>
> Cheers,
> Bryan Rasmussen

Thanks Bryan, Walter, John, Marc, and Stefan. I finally went with the xslt 
transform which works very well and is simple.  regexps would work, but they 
just scare me somehow. Bryan, my tags were formatted as  but the help 
compiler would issue warnings on each one resulting in log files with 
thousands of warnings. It did finish the compile though, but it made 
understanding the logs too painful.

Stefan, I *really* look forward to being able to use lxml when I move to RH 
linux next month. I've been using hp10.20 and never could get the requisite 
libraries to compile. Once I make that move, maybe I won't have as many 
markup related questions here!

thanks again to all for the great suggestions.
--Tim Arnold






multiple processes, private working directories

2008-09-24 Thread Tim Arnold
I have a bunch of processes to run and each one needs its own working
directory. I'd also like to know when all of the processes are
finished.

(1) First thought was threads, until I saw that os.chdir was process-global.
(2) Next thought was fork, but I don't know how to signal when each child is
finished.
(3) Current thought is to break the process from a method into an external
script; call the script in separate threads.  This is the only way I can see
to give each process a separate dir (external process fixes that), and I can
find out when each process is finished (thread fixes that).

Am I missing something? Is there a better way? I hate to rewrite this method
as a script since I've got a lot of object metadata that I'll have to
regenerate with each call of the script.

thanks for any suggestions,
--Tim Arnold


multiple processes with private working dirs

2008-09-24 Thread Tim Arnold
I have a bunch of processes to run and each one needs its own working 
directory. I'd also like to know when all of the processes are finished.

(1) First thought was threads, until I saw that os.chdir was process-global.
(2) Next thought was fork, but I don't know how to signal when each child is 
finished.
(3) Current thought is to break the process from a method into an external 
script; call the script in separate threads.  This is the only way I can see 
to give each process a separate dir (external process fixes that), and I can 
find out when each process is finished (thread fixes that).

Am I missing something? Is there a better way? I hate to rewrite this method 
as a script since I've got a lot of object metadata that I'll have to 
regenerate with each call of the script.

thanks for any suggestions,
--Tim Arnold




Re: multiple processes with private working dirs

2008-09-25 Thread Tim Arnold
On Sep 25, 12:11 am, alex23 <[EMAIL PROTECTED]> wrote:
> On Sep 25, 3:37 am, "Tim Arnold" <[EMAIL PROTECTED]> wrote:
>
> > Am I missing something?
>
> Do you mean something other than the replies you got the last time you
> asked the exact same question?
>
> http://groups.google.com/group/comp.lang.python/browse_frm/thread/42c...

arggg. My newsreader didn't show the initial post so I thought it never
made it through.
sorry for the noise.
--Tim Arnold


Re: multiple processes, private working directories

2008-09-25 Thread Tim Arnold
"Tim Arnold" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
>I have a bunch of processes to run and each one needs its own working
> directory. I'd also like to know when all of the processes are
> finished.

Thanks for the ideas everyone--I now have some new tools in the toolbox. 
The task is to use pdflatex to compile a bunch of (>100) chapters and know 
when the book is complete (i.e. the book pdf is done and the separate 
chapter pdfs are finished). I have to wait for that before I start some 
postprocessing and reporting chores.

My original scheme was to use a class to manage the builds with threads, 
calling pdflatex within each thread. Since pdflatex really does need to be 
in the directory with the source, I had a problem.

I'm reading now about python's multiprocessing capability, but I think I can 
use Karthik's suggestion to call pdflatex in subprocess with the cwd set. 
That seems like the simple solution at this point, but I'm going to give 
Cameron's pipes suggestion a go as well.

In any case, it's clear I need to rethink the problem. Thanks to everyone 
for helping me get past my brain-lock.

--Tim Arnold
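
The subprocess-with-cwd idea can be sketched like this in Python 3. Each command runs in its own working directory, a small thread pool limits how many run at once, and the call returns only when every one has finished. run_in_dirs and the pool size are assumptions for the example, not code from the thread.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_in_dirs(jobs, max_workers=4):
    """Run each (cmd, workdir) pair with cwd=workdir; return the exit codes."""
    def run(job):
        cmd, workdir = job
        # cwd only affects the child process, so threads do not interfere.
        return subprocess.call(cmd, cwd=workdir)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() keeps the results in the same order as jobs.
        return list(pool.map(run, jobs))
```

For the book build, jobs might be a list such as (['pdflatex', 'chapter.tex'], chapter_dir) per chapter, with those names again hypothetical; the postprocessing can start as soon as run_in_dirs returns.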




Re: Python IDE for MacOS-X

2010-01-19 Thread Tim Arnold
"Jean Guillaume Pyraksos"  wrote in message 
news:wissme-9248e1.08090319012...@news.free.fr...
> What's the best one to use with beginners ?
> Something with integrated syntax editor, browser of doc...
> Thanks,
>
>JG

eclipse + pydev works well for me.
--Tim Arnold




Re: Splitting text at whitespace but keeping the whitespace in the returned list

2010-01-25 Thread Tim Arnold
"MRAB"  wrote in message 
news:mailman.1362.1264353878.28905.python-l...@python.org...
> pyt...@bdurham.com wrote:
>> I need to parse some ASCII text into 'word' sized chunks of text AND 
>> collect the whitespace that separates the split items. By 'word' I mean 
>> any string of characters separated by whitespace (newlines, carriage 
>> returns, tabs, spaces, soft-spaces, etc). This means that my split text 
>> can contain punctuation and numbers - just not whitespace.
>>  The split( None ) method works fine for returning the word sized chunks 
>> of text, but destroys the whitespace separators that I need.
>>  Is there a variation of split() that returns delimiters as well as 
>> tokens?
>>
> I'd use the re module:
>
> >>> import re
> >>> re.split(r'(\s+)', "Hello world!")
> ['Hello', ' ', 'world!']

also, partition works though it returns a tuple instead of a list.
>>> s = 'hello world'
>>> s.partition(' ')
('hello', ' ', 'world')
>>>

--Tim Arnold
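
A short side-by-side of the two suggestions, as a sketch with a made-up input string: re.split with a capturing group returns every separator, while partition splits only at the first occurrence of one fixed separator.

```python
import re

text = 'Hello  world!\tbye'

# Capturing group: each whitespace run is kept as its own list item.
tokens = re.split(r'(\s+)', text)

# partition: one split, at the first single space only.
head, sep, rest = text.partition(' ')

print(tokens)
print((head, sep, rest))
```

Joining tokens back together reproduces the original string, which is a quick check that no whitespace was lost.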




subprocess hangs on reading stdout

2009-10-14 Thread Tim Arnold
Hi, I'm querying a list of network servers for processes belonging to a 
specific user. The problem is that when I try to read the stdout from the 
subprocess it sometimes hangs. Not always though.

I thought maybe I needed to set unbuffered to true, so at the beginning of 
the code I set
os.environ['PYTHONUNBUFFERED'] = '1'
But I think it's more likely the subprocess that needs to be unbuffered.

Here's the bit of code:
---
for machine_name in self.alive:  # a list of servers that responded to ping already.
    cmd = ["/bin/remsh", machine_name, 'ps -flu %s' % uid]
    finish = time.time() + 4.0
    p = subprocess.Popen(cmd,stdout=subprocess.PIPE)
    while p.poll() is None:
        time.sleep(0.5)
        if finish < time.time():
            p.kill()
            print 'skipping' # this works ok
            break

    s = ''
    if p:
        s = p.stdout.read() # this will hang occasionally
    if not s:
        continue
-----------

Any ideas? comments on code welcome also.
thanks,
--Tim Arnold
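
One possible cause, offered as a guess and not a diagnosis: polling without reading lets the child block once the pipe buffer is full. In Python 3.3 and later, communicate() accepts a timeout and reads while it waits, so the loop above can be reduced to the sketch below. run_with_timeout is a hypothetical name.

```python
import subprocess


def run_with_timeout(cmd, timeout=4.0):
    """Return cmd's stdout as bytes, or None if it runs past the timeout."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        out, _ = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.communicate()  # reap the killed child and drain the pipe
        return None
    return out
```

Inside the loop this would read s = run_with_timeout(cmd), followed by the same 'if not s: continue' check.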




Re: subprocess hangs on reading stdout

2009-10-15 Thread Tim Arnold
"Minesh Patel"  wrote in message 
news:mailman.1408.1255583431.2807.python-l...@python.org...
> >
>> Any ideas? comments on code welcome also.
>
> Here's something that I would probably do, there may be better ways.
> This only works on python2.6 for the terminate() method.
>
>
> import signal
> import subprocess
>
> def timeout_handler(signum, frame):
>     print "About to kill process"
>     p.terminate()
>
> for machine_name in self.alive:
>     cmd = ["/bin/remsh", machine_name, 'ps -flu %s' % uid]
>     signal.signal(signal.SIGALRM, timeout_handler)
>     signal.alarm(1)
>     p = subprocess.Popen(cmd,stdout=subprocess.PIPE)
>     (stdout, stderr) = p.communicate()
>     signal.alarm(0)
>     if stdout:
>         print stdout
>     elif stderr:
>         print stderr
>
>
>
> -- 
> Thanks,
> --Minesh

Hi Minesh,
Looks like I need to learn about signals--that code looks nice. I'm using 
python2.6.
thanks,
--Tim Arnold




subprocess executing shell

2009-10-21 Thread Tim Arnold
Hi, I'm writing a script to capture a command on the commandline and run it 
on a remote server.
I guess I don't understand subprocess because the code below exec's the 
user's .cshrc file even though by default shell=False in the Popen call.

Here's the code. I put a line in my .cshrc file: echo 'testing' which 
appears when I run this script on the remote host.

import os,sys,subprocess,shlex

def main():
    if action:
        action.insert(0,'rsh my_remotehost')
        p = subprocess.Popen(shlex.split(' '.join(action)))
        p.wait()

if __name__ == '__main__':
    action = sys.argv[1:] or list()
    main()


Since a shell is executing in the child process anyway, is the only 
difference with shell=True that environment variables can be 
expanded in the command to be executed?

thanks,
--Tim Arnold




Re: subprocess executing shell

2009-10-22 Thread Tim Arnold
"Gabriel Genellina"  wrote in message 
news:mailman.1840.1256202325.2807.python-l...@python.org...
> En Wed, 21 Oct 2009 12:24:37 -0300, Tim Arnold  
> escribió:
>
>> Hi, I'm writing a script to capture a command on the commandline and run 
>> it
>> on a remote server.
>> I guess I don't understand subprocess because the code below exec's the
>> user's .cshrc file even though by default shell=False in the Popen call.
>
> Do you mean it execs the .cshrc file in your *local* system or the 
> *remote* one?
> Popen controls what happens on the local system only.
>
>> action.insert(0,'rsh my_remotehost')
>> p = subprocess.Popen(shlex.split(' '.join(action)))
>> p.wait()
>>
>> Since the shell is executing in the child process anyway, is the only
>> difference when using shell=True is that environment variables can be
>> expanded in the command to be executed?
>
> Note that in this case, "the child process" is rsh on the local system. 
> Popen has no control over what happens once rsh starts.
>
> -- 
> Gabriel Genellina

Thanks, I see my mistake now. Arggg, I keep forgetting that one.
thanks,
--Tim Arnold
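
The local half of the question can be checked without rsh. The sketch below is Unix-only, written for current Python 3, and uses a made-up variable name DEMO_VAR. With the default shell=False the argument reaches the program untouched; with shell=True a local /bin/sh expands it first. Whatever the remote login shell does (such as sourcing .cshrc) is outside Popen's control either way.

```python
import os
import subprocess

# DEMO_VAR is a hypothetical variable used only for this demonstration.
env = dict(os.environ, DEMO_VAR='expanded')

# shell=False (the default): no local shell runs, so '$DEMO_VAR' is passed
# to echo as a literal string.
literal = subprocess.check_output(['echo', '$DEMO_VAR'], env=env)

# shell=True: the command string is handed to /bin/sh, which expands it.
expanded = subprocess.check_output('echo $DEMO_VAR', shell=True, env=env)

if __name__ == '__main__':
    print(literal, expanded)
```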




Re: calling server side function

2009-10-28 Thread Tim Arnold
"Gabriel Genellina"  wrote in message 
news:mailman.2155.1256716617.2807.python-l...@python.org...
> En Wed, 28 Oct 2009 04:04:50 -0300, Paul Hartley  
> escribió:
>
>> I have a socket set up between a client and server program.  Let's say 
>> that I serialize (pickle) some data in the client and send it to the 
>> server with the intention of calling a function in the server to process 
>> the data.  How would one execute the function?  This is not for a 
>> web-based application, BTW -- it's a desktop based application
>> My current thought process is (using a generalized example):
>> I have a list of numbers in the client and want to find the length of 
>> the list using the server.  There exists a function find_len() in the 
>> server code.  I have a list of numbers [1,2,3].  On the client side, I 
>> create the tuple ("find_len", [1,2,3]), and serialize it.  I pass this 
>> serialized object via a socket to the server, which unpickles it.  The 
>> server takes the key (find_len) and uses a getattr call to get the 
>> find_len function.  The server then calls find_len([1,2,3]) to get the 
>> sum.
>> def find_len(list_):return
>> Are there better ways of accomplishing this (I'm aware that there are 
>> security pitfalls here...)
>
> xmlrpc does more or less the same thing, but serializing in xml instead of 
> pickling.
>
> -- 
> Gabriel Genellina
>

Also, have a look at RPyC. I've been playing with it for a few days and it 
sounds like it may be what you're after.
http://rpyc.wikidot.com/
http://www.ibm.com/developerworks/linux/library/l-rpyc/index.html

--Tim Arnold
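
For comparison, the hand-rolled dispatch Paul describes can be sketched as below. This is not what xmlrpc or RPyC do internally. It deliberately swaps two things from the original description: JSON instead of pickle (unpickling data from a socket can execute arbitrary code) and an explicit table of allowed functions instead of getattr. The names HANDLERS and handle_request are hypothetical, and the socket transport is left out.

```python
import json


def find_len(items):
    """The server-side function from the example in the thread."""
    return len(items)


# Explicit whitelist: only functions listed here can be called remotely.
HANDLERS = {'find_len': find_len}


def handle_request(payload):
    """Decode a [name, argument] request, call the handler, encode the result."""
    name, argument = json.loads(payload)
    func = HANDLERS[name]   # KeyError for any name that is not whitelisted
    return json.dumps(func(argument))


if __name__ == '__main__':
    request = json.dumps(['find_len', [1, 2, 3]])   # what the client would send
    print(handle_request(request))
```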




Re: a simple unicode question

2009-10-28 Thread Tim Arnold
"Chris Jones"  wrote in message 
news:mailman.2149.1256707687.2807.python-l...@python.org...
> On Tue, Oct 27, 2009 at 06:21:11AM EDT, Lie Ryan wrote:
>> Chris Jones wrote:
>
> [..]
>
>>> Best part of Unicode is that there are multiple encodings, right? ;-)
>>
>> No, the best part about Unicode is there is no encoding!
>
>> Unicode does not define any encoding;
>
> RFC 3629:
>
> "ISO/IEC 10646 and Unicode define several encoding forms of their
> common repertoire: UTF-8, UCS-2, UTF-16, UCS-4 and UTF-32."
>
>> what it defines is code-points for  characters which is not related to
>> how characters are encoded in files or network transmission.
>
> In other words, Unicode is "not related to any encoding" .. and yet the
> UTF-8, UTF-16.. "encoding forms" are clearly "related" to Unicode.
>
> How is that possible?
>
> CJ

When I first saw it, my first thought was that the subject line was an 
oxymoron.

--Tim Arnold
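
The distinction being argued over above can be shown in a few lines: one character has one code point, and each encoding form turns that same code point into a different byte sequence. A small illustration (current Python 3 syntax; the character chosen is arbitrary):

```python
ch = '\u00e9'                       # LATIN SMALL LETTER E WITH ACUTE

code_point = ord(ch)                # the Unicode code point, independent of any encoding

# The same code point serialized by different encodings:
as_utf8 = ch.encode('utf-8')        # two bytes
as_utf16 = ch.encode('utf-16-le')   # one 16-bit unit, little-endian
as_utf32 = ch.encode('utf-32-le')   # one 32-bit unit, little-endian
as_cp1252 = ch.encode('cp1252')     # a legacy single-byte encoding

if __name__ == '__main__':
    print(hex(code_point), as_utf8, as_utf16, as_utf32, as_cp1252)
```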




Re: HTML Parser which allows low-keyed local changes (upon serialization)

2010-02-01 Thread Tim Arnold

"Robert"  wrote in message 
news:hk729b$na...@news.albasani.net...
> Stefan Behnel wrote:
>> Robert, 01.02.2010 14:36:
>>> Stefan Behnel wrote:
>>>> Robert, 31.01.2010 20:57:
>>>>> I tried lxml, but after walking and making changes in the element 
>>>>> tree,
>>>>> I'm forced to do a full serialization of the whole document
>>>>> (etree.tostring(tree)) - which destroys the "human edited" format of 
>>>>> the
>>>>> original HTML code. makes it rather unreadable.
>>>> What do you mean? Could you give an example? lxml certainly does not
>>>> destroy anything it parsed, unless you tell it to do so.
>>> of course it does not destroy during parsing.(?)
>>

I think I understand what you want, but I don't understand why yet. Do you 
want to view the differences in an IDE or something like that? If so, why 
not pretty-print both and compare that?
--Tim




best way to create a dict from string

2010-02-18 Thread Tim Arnold
Hi,
I've got some text to parse that looks like this

text = ''' blah blah blah
\Template[Name=RAD,LMRB=False,LMRG=True]{tables}
ho dee ho
'''
I want to extract the bit between the brackets and create a dictionary. 
Here's what I'm doing now:

def options(text):
    d = dict()
    options = text[text.find('[')+1:text.find(']')]
    for k,v in [val.split('=') for val in options.split(',')]:
        d[k] = v
    return d

if __name__ == '__main__':
    for line in text.split('\n'):
        if line.startswith('\\Template'):
            print options(line)


is that the best way or maybe there's something simpler?  The options will 
always be key=value format, comma separated.
thanks,
--Tim
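
One possible simplification, offered only as a sketch: a regular expression can find the bracketed part, and dict() can consume the key=value pairs directly. The names TEMPLATE_RE and parse_options are made up for this example, and it assumes, as stated above, that the options are always comma-separated key=value pairs.

```python
import re

# Matches \Template[...] and captures whatever sits between the brackets.
TEMPLATE_RE = re.compile(r'\\Template\[([^\]]*)\]')


def parse_options(line):
    """Return the bracketed key=value options of a \\Template line as a dict."""
    match = TEMPLATE_RE.search(line)
    if match is None or not match.group(1):
        return {}
    # split('=', 1) keeps any further '=' characters inside the value.
    return dict(item.split('=', 1) for item in match.group(1).split(','))


if __name__ == '__main__':
    print(parse_options(r'\Template[Name=RAD,LMRB=False,LMRG=True]{tables}'))
```

The values stay strings ('False', 'True'), exactly as in the original function.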




freebsd and multiprocessing

2010-03-02 Thread Tim Arnold
Hi,
I'm intending to use multiprocessing on a freebsd machine (6.3
release, quad core, 8cpus, amd64). I see in the doc that on this
platform I can't use synchronize:

ImportError: This platform lacks a functioning sem_open
implementation, therefore, the required synchronization primitives
needed will not function, see issue 3770.

As far as I can tell, I have no need to synchronize the processes--I
have several processes run separately and I need to know when they're
all finished; there's no communication between them and each owns its
own log file for output.

Is anyone using multiprocessing on FreeBSD and run into any other
gotchas?
thanks,
--Tim Arnold


Re: freebsd and multiprocessing

2010-03-02 Thread Tim Arnold
On Mar 2, 11:52 am, Philip Semanchuk  wrote:
> On Mar 2, 2010, at 11:31 AM, Tim Arnold wrote:
>
> > Hi,
> > I'm intending to use multiprocessing on a freebsd machine (6.3
> > release, quad core, 8cpus, amd64). I see in the doc that on this
> > platform I can't use synchronize:
>
> > ImportError: This platform lacks a functioning sem_open
> > implementation, therefore, the required synchronization primitives
> > needed will not function, see issue 3770.
>
> > As far as I can tell, I have no need to synchronize the processes--I
> > have several processes run separately and I need to know when they're
> > all finished; there's no communication between them and each owns its
> > own log file for output.
>
> > Is anyone using multiprocessing on FreeBSD and run into any other
> > gotchas?
>
> Hi Tim,
> I don't use multiprocessing but I've written two low-level IPC  
> packages, one for SysV IPC and the other for POSIX IPC.
>
> I think that multiprocessing prefers POSIX IPC (which is where  
> sem_open() comes from). I don't know what it uses if that's not  
> available, but SysV IPC seems a likely alternative. I must emphasize,  
> however, that that's a guess on my part.
>
> FreeBSD didn't have POSIX IPC support until 7.0, and that was sort of  
> broken until 7.2. As it happens, I was testing my POSIX IPC code  
> against 7.2 last night and it works just fine.
>
> SysV IPC works under FreeBSD 6 (and perhaps earlier versions; 6 is the  
> oldest I've tested). ISTR that by default each message queue is  
> limited to 2048 bytes in total size. 'sysctl kern.ipc' can probably  
> tell you that and may even let you change it. Other than that I can't  
> think of any SysV limitations that might bite you.
>
> HTH
> Philip

Hi Philip,
Thanks for that information. I wish I could upgrade the machine to
7.2! alas, out of my power.  I get the following results from sysctl:
% sysctl kern.ipc | grep msg
kern.ipc.msgseg: 2048
kern.ipc.msgssz: 8
kern.ipc.msgtql: 40
kern.ipc.msgmnb: 2048
kern.ipc.msgmni: 40
kern.ipc.msgmax: 16384

I'll write some test programs using multiprocessing and see how they
go before committing to rewrite my current code. I've also been
looking at 'parallel python' although it may have the same issues.
http://www.parallelpython.com/

thanks again,
--Tim


Re: freebsd and multiprocessing

2010-03-02 Thread Tim Arnold
On Mar 2, 12:59 pm, Tim Arnold  wrote:
> On Mar 2, 11:52 am, Philip Semanchuk  wrote:
> > On Mar 2, 2010, at 11:31 AM, Tim Arnold wrote:
>
> > > Hi,
> > > I'm intending to use multiprocessing on a freebsd machine (6.3
> > > release, quad core, 8cpus, amd64). I see in the doc that on this
> > > platform I can't use synchronize:
>
> > > ImportError: This platform lacks a functioning sem_open
> > > implementation, therefore, the required synchronization primitives
> > > needed will not function, see issue 3770.
>
> > > As far as I can tell, I have no need to synchronize the processes--I
> > > have several processes run separately and I need to know when they're
> > > all finished; there's no communication between them and each owns its
> > > own log file for output.
>
> > > Is anyone using multiprocessing on FreeBSD and run into any other
> > > gotchas?
>
> > Hi Tim,
> > I don't use multiprocessing but I've written two low-level IPC  
> > packages, one for SysV IPC and the other for POSIX IPC.
>
> > I think that multiprocessing prefers POSIX IPC (which is where  
> > sem_open() comes from). I don't know what it uses if that's not  
> > available, but SysV IPC seems a likely alternative. I must emphasize,  
> > however, that that's a guess on my part.
>
> > FreeBSD didn't have POSIX IPC support until 7.0, and that was sort of  
> > broken until 7.2. As it happens, I was testing my POSIX IPC code  
> > against 7.2 last night and it works just fine.
>
> > SysV IPC works under FreeBSD 6 (and perhaps earlier versions; 6 is the  
> > oldest I've tested). ISTR that by default each message queue is  
> > limited to 2048 bytes in total size. 'sysctl kern.ipc' can probably  
> > tell you that and may even let you change it. Other than that I can't  
> > think of any SysV limitations that might bite you.
>
> > HTH
> > Philip
>
> Hi Philip,
> Thanks for that information. I wish I could upgrade the machine to
> 7.2! alas, out of my power.  I get the following results from sysctl:
> % sysctl kern.ipc | grep msg
> kern.ipc.msgseg: 2048
> kern.ipc.msgssz: 8
> kern.ipc.msgtql: 40
> kern.ipc.msgmnb: 2048
> kern.ipc.msgmni: 40
> kern.ipc.msgmax: 16384
>
> I'll write some test programs using multiprocessing and see how they
> go before committing to rewrite my current code. I've also been
> looking at 'parallel python' although it may have the same issues.
> http://www.parallelpython.com/
>
> thanks again,
> --Tim

Well that didn't work out well. I can't import either Queue or Pool
from multiprocessing, so I'm back to the drawing board. I'll see now
how parallel python does on freebsd.

--Tim
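
Since the jobs described earlier in this thread are independent and each writes its own log, one fallback that avoids multiprocessing (and so sem_open) altogether is plain subprocess: start every job, then wait for all of them. This is only a sketch of that different technique, assuming each job can be expressed as a command line; run_all and the jobN.log naming are hypothetical.

```python
import os
import subprocess
import sys


def run_all(commands, log_dir):
    """Start every command at once, each with its own log file; return exit codes."""
    running = []
    for index, cmd in enumerate(commands):
        log = open(os.path.join(log_dir, 'job%d.log' % index), 'wb')
        proc = subprocess.Popen(cmd, stdout=log, stderr=subprocess.STDOUT)
        running.append((proc, log))

    exit_codes = []
    for proc, log in running:
        exit_codes.append(proc.wait())   # blocks until this job has finished
        log.close()
    return exit_codes                    # all jobs are done when this returns


if __name__ == '__main__':
    jobs = [[sys.executable, '-c', 'print(%d)' % n] for n in range(3)]
    print(run_all(jobs, '.'))
```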



multiprocessing on freebsd

2010-03-17 Thread Tim Arnold
Hi,
I'm checking to see if multiprocessing works on freebsd for any
version of python. My server is about to get upgraded from 6.3 to 8.0
and I'd sure like to be able to use multiprocessing.

I think the minimal test would be:
-
import multiprocessing
q = multiprocessing.Queue()
-

with 6.3, I get

  File "/usr/local/lib/python2.6/multiprocessing/__init__.py", line 212, in Queue
    from multiprocessing.queues import Queue
  File "/usr/local/lib/python2.6/multiprocessing/queues.py", line 22, in <module>
    from multiprocessing.synchronize import Lock, BoundedSemaphore, Semaphore, Condition
  File "/usr/local/lib/python2.6/multiprocessing/synchronize.py", line 33, in <module>
    " function, see issue 3770.")
ImportError: This platform lacks a functioning sem_open implementation, therefore, the required synchronization primitives needed will not function, see issue 3770.


thanks for any info,
--Tim Arnold
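
The traceback above shows the ImportError coming from multiprocessing.synchronize, so a script could probe for that module up front and fall back to something else when it is missing. A minimal sketch; the function name multiprocessing_usable is made up for this example.

```python
def multiprocessing_usable():
    """Return True if the synchronization primitives behind Queue can be imported."""
    try:
        # On platforms without a working sem_open this import raises ImportError.
        import multiprocessing.synchronize  # noqa: F401
    except ImportError:
        return False
    return True


if __name__ == '__main__':
    print('multiprocessing usable:', multiprocessing_usable())
```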


Re: multiprocessing on freebsd

2010-03-17 Thread Tim Arnold
On Mar 17, 11:26 am, Philip Semanchuk  wrote:
> On Mar 17, 2010, at 9:30 AM, Tim Arnold wrote:
>
> > Hi,
> > I'm checking to see if multiprocessing works on freebsd for any
> > version of python. My server is about to get upgraded from 6.3 to 8.0
> > and I'd sure like to be able to use multiprocessing.
>
> > I think the minimal test would be:
> > -
> > import multiprocessing
> > q = multiprocessing.Queue()
> > -
>
> > with 6.3, I get
>
> >   File "/usr/local/lib/python2.6/multiprocessing/__init__.py", line 212, in Queue
> >     from multiprocessing.queues import Queue
> >   File "/usr/local/lib/python2.6/multiprocessing/queues.py", line 22, in <module>
> >     from multiprocessing.synchronize import Lock, BoundedSemaphore, Semaphore, Condition
> >   File "/usr/local/lib/python2.6/multiprocessing/synchronize.py", line 33, in <module>
> >     " function, see issue 3770.")
> > ImportError: This platform lacks a functioning sem_open
> > implementation, therefore, the required synchronization primitives
> > needed will not function, see issue 3770.
> Hi Tim,
> Under FreeBSD 8/Python 2.6.2 I get the same result, unfortunately.  
> That's a pity because sem_open works under FreeBSD >= 7.2 as we  
> discussed before.
>
> Issue 3770 is closed with the note, "we've removed hard-coded platform  
> variables for a better autoconf approach." I'm using the Python built  
> from FreeBSD's ports, and the note makes me think that it's possible  
> that if I built my own Python from the Python.org tarball rather than  
> ports the problem would go away due to autoconf magic. I don't have  
> the time to offer to do this for you, unfortunately. But why not  
> install FreeBSD 8 under VirtualBox or somesuch and give it a go  
> yourself?
>
> A couple of quirks I noted related to FreeBSD & POSIX IPC that you  
> might find useful --
> - The sem and mqueuefs kernel modules must be loaded, otherwise you'll  
> get a message like this when you try to create a semaphore or message  
> queue:
> Bad system call: 12 (core dumped)
>
> Under 8.0 they're loaded by default, I think.
>
> - C apps that want to use message queues must link to the realtime  
> libs (pass -lrt to the linker). This tripped me up for a while.  
> Linking to the realtime libs is required for all POSIX IPC calls under  
> Linux; FreeBSD does not require it for semaphores or shared mem, only  
> message queues.
>
> Hope this helps
> Philip

Hi Philip,
Thanks for that information (esp the linker info). Once the machine is
upgraded, I'll try building python from the tarball. I'll post back
here with the results.

here's hoping!
thanks,
--Tim


Re: multiprocessing on freebsd

2010-03-18 Thread Tim Arnold
"Martin P. Hellwig"  wrote in message 
news:hnrabj$c4...@news.eternal-september.org...
> On 03/17/10 13:30, Tim Arnold wrote:
>> Hi,
>> I'm checking to see if multiprocessing works on freebsd for any
>> version of python. My server is about to get upgraded from 6.3 to 8.0
>> and I'd sure like to be able to use multiprocessing.
>>
>> I think the minimal test would be:
>> -
>> import multiprocessing
>> q = multiprocessing.Queue()
>> -
>>
>> with 6.3, I get
>>
>>   File "/usr/local/lib/python2.6/multiprocessing/__init__.py", line 212, in Queue
>>     from multiprocessing.queues import Queue
>>   File "/usr/local/lib/python2.6/multiprocessing/queues.py", line 22, in <module>
>>     from multiprocessing.synchronize import Lock, BoundedSemaphore, Semaphore, Condition
>>   File "/usr/local/lib/python2.6/multiprocessing/synchronize.py", line 33, in <module>
>>     " function, see issue 3770.")
>> ImportError: This platform lacks a functioning sem_open
>> implementation, therefore, the required synchronization primitives
>> needed will not function, see issue 3770.
>>
>
> Build mine from ports, with the following options (notice SEM & PTH):
> [mar...@aspire8930 /usr/home/martin]$ cat /var/db/ports/python26/options
> # This file is auto-generated by 'make config'.
> # No user-servicable parts inside!
> # Options for python26-2.6.4
> _OPTIONS_READ=python26-2.6.4
> WITH_THREADS=true
> WITHOUT_HUGE_STACK_SIZE=true
> WITH_SEM=true
> WITH_PTH=true
> WITH_UCS4=true
> WITH_PYMALLOC=true
> WITH_IPV6=true
> WITHOUT_FPECTL=true
>
> [mar...@aspire8930 /usr/home/martin]$ uname -a
> FreeBSD aspire8930 8.0-STABLE FreeBSD 8.0-STABLE #3: Wed Feb  3 17:01:18 
> GMT 2010 mar...@aspire8930:/usr/obj/usr/src/sys/ASPIRE8930  amd64
> [mar...@aspire8930 /usr/home/martin]$ python
> Python 2.6.4 (r264:75706, Mar 17 2010, 18:44:24)
> [GCC 4.2.1 20070719  [FreeBSD]] on freebsd8
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import multiprocessing as mp
> >>> queue = mp.Queue()
> >>>
>
> hth
> -- 
> mph

Hi Martin, thanks very much for posting that. All I can say is YAY! I'm 
really looking forward to my machine's upgrade now!

thanks,
--Tim




remote server and effective uid

2010-11-15 Thread Tim Arnold
Hi,
I have a remote server on a FreeBSD box with clients connecting from
linux, all running python2.7. I've setup the remote server as an inetd
service (currently running as 'nobody'). Both client and server have
access to the same file systems.

How can I enable the server process to write into the client's
directories?
If I change the inetd service to run as 'root', I guess that would
work, but then the client couldn't remove the files put there after
the request.
I could ditch the whole server process and wrap client requests with
rsh calls, but is there a way I can switch the effective uid of the
server process without asking clients to login?

Or is there a better way to solve the problem?

thanks,
--Tim Arnold
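
To make the "switch the effective uid" idea concrete, here is a rough Unix-only sketch, not a recommendation. It assumes the server runs as root and already knows, by some trustworthy means not shown here, which uid and gid to act as; accepting them unchecked from a client would be a security hole. The name effective_user is hypothetical, and supplementary groups are not changed.

```python
import contextlib
import os


@contextlib.contextmanager
def effective_user(uid, gid):
    """Temporarily switch the process's effective gid and uid."""
    saved_uid, saved_gid = os.geteuid(), os.getegid()
    os.setegid(gid)      # change the group first, while still privileged
    os.seteuid(uid)
    try:
        yield
    finally:
        os.seteuid(saved_uid)   # regain the original uid before restoring the gid
        os.setegid(saved_gid)


if __name__ == '__main__':
    # Switching to our own ids works without root and just shows the usage.
    with effective_user(os.geteuid(), os.getegid()):
        print('running as uid', os.geteuid())
```

Files created inside the with block would be owned by the target user, so that user could remove them afterwards.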

