[Tutor] Python RE uses DFA or NFA for string check?

2006-01-10 Thread Intercodes
Hello everyone,

This question is just out of curiosity. I am working with this dragon book. From what I have learnt so far, RE uses either NFA or DFA to check whether the string is accepted or not. (Correct?)

So what does Python's RE module use to check the correctness of the string, NFA or DFA?

-- Intercodes
___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Python RE uses DFA or NFA for string check?

2006-01-10 Thread Kent Johnson
Intercodes wrote:
 Hello everyone,
 
 This question is just out of curiosity. I am working with this dragon 
 book. From what I have learnt so far, RE uses either NFA or DFA to check 
 whether the string is accepted or not. (Correct?)
 
 So what does the Python's RE module use to check the correctness of the 
 string, NFA or DFA?

You could look at the source. A little digging shows that REs are parsed 
by sre_parse.parse() which is in Python24\Lib\sre_parse.py on my computer.
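
For readers on a current Python 3 rather than Python 2.4 (an assumption; the path above is from 2.4), the parse step can also be inspected without opening the source, since the re.DEBUG flag prints the parse tree when a pattern is compiled. A minimal sketch, where dump_parse is a made-up helper name and the pattern is only an example:

```python
import contextlib
import io
import re


def dump_parse(pattern):
    """Return the text that re.DEBUG prints while compiling pattern."""
    buf = io.StringIO()
    # re.DEBUG writes to standard output, so capture it temporarily.
    with contextlib.redirect_stdout(buf):
        re.compile(pattern, re.DEBUG)
    return buf.getvalue()


# Show the parse tree for a small example pattern.
print(dump_parse(r"a*b+"))
```

The exact layout of the dump differs between Python versions, so treat it as a debugging aid rather than a stable format.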

Kent



Re: [Tutor] Python RE uses DFA or NFA for string check?

2006-01-10 Thread Tim Peters
[Intercodes]
 This question is just out of curiosity. I am working with this dragon book.
 From what I have learnt so far, RE uses either NFA or DFA to check whether
 the string is accepted or not. (Correct?)

In the world of computer science regular expressions, yes.  But the
things _called_ regular expressions in programming languages are
generally richer than those.  For example, almost all regexp
implementations support backreferences, and backreferences allow
recognizing languages that computer-science regexps cannot.  For
example,

^(a*)b+\1$

recognizes strings that begin and end with the same number of a's,
separated by one or more b's.  It's the "same number" part that's
beyond a pure regexp's abilities.
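
The pattern above can be tried directly with Python's re module. This is only a small sketch; the helper name balanced is invented for the example:

```python
import re

# The pattern from the message: \1 must repeat exactly what group 1 matched.
PATTERN = re.compile(r"^(a*)b+\1$")


def balanced(s):
    """Return True if s is n a's, one or more b's, then n a's again."""
    return PATTERN.match(s) is not None


for s in ["aabbaa", "abba", "b", "aabba", "aa"]:
    print(repr(s), balanced(s))
```

The first three strings match and the last two do not: "aabba" has two a's in front but only one behind, and "aa" has no b at all.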

 So what does the Python's RE module use to check the correctness of the
 string, NFA or DFA?

Neither, but it's much closer to NFA than to DFA.  Most regexp
implementations in most languages supporting such a thing are
implemented via backtracking search.  Jeffrey Friedl's Mastering
Regular Expressions is more useful than the dragon book if you want
insight into how most programming-language regexp implementations
actually work:

http://www.oreilly.com/catalog/regex/

To increase confusion ;-), Friedl calls backtracking search NFA in that book.
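
To give a feel for what "backtracking search" means, here is a toy matcher. It is emphatically not how Python's engine is written; it is a sketch that assumes a tiny pattern language with only literal characters, '.' (any character) and 'x*' (zero or more of x), and the function name matches is made up:

```python
def matches(pat, text):
    """Return True if pat matches the whole of text.

    Toy pattern language: literals, '.' and 'x*'. Illustration only.
    """
    if not pat:
        # Pattern used up: succeed only if the text is used up too.
        return not text
    if len(pat) >= 2 and pat[1] == "*":
        i = 0
        while True:
            # Guess that the starred item consumed text[:i]; try the rest.
            if matches(pat[2:], text[i:]):
                return True
            # The guess failed, so come back and let the star take one
            # more character, or give up if the next one does not fit.
            if i < len(text) and pat[0] in (".", text[i]):
                i += 1
            else:
                return False
    if text and pat[0] in (".", text[0]):
        return matches(pat[1:], text[1:])
    return False


print(matches("a*a", "aaa"))   # needs several guesses for the star
print(matches("a*b", "aaac"))
```

Each failed guess is abandoned and the next one tried, which is the sense in which such a matcher "backtracks" instead of stepping through a precomputed DFA.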


Re: [Tutor] Python RE uses DFA or NFA for string check?

2006-01-10 Thread Intercodes
Thanks Mr. Tim. That was helpful :)
-- Intercodes


[Tutor] How can I make a python script go directory by directory and execute on files of choice

2006-01-10 Thread Srinivas Iyyer
Dear group, 
I have Excel files that are arranged according to
transactions under various names/directories.

I found out that not all of these Excel files are real
OLE-based files; some of them are really tab-delimited
files with .XLS appended to the end of the file name. I
got fooled and started using the pyExcelerator module.

Now I want to write a script that can go to each
directory from base directory. 

Say: my dir path is like this:

/home/srini/data/sales
  - jan05/(1-31).xls
  - feb05/(1-27).xls 
  - mar05/(1-31).xls

How can I ask my script, residing in the data directory, to
go to sales, scan all the directories, and report whether the
.xls files are really Excel files or text files with an
.xls extension?

Could anyone please help me here?

Thank you.






[Tutor] need help with syntax

2006-01-10 Thread bill nieuwendorp
Hello all, I am new to Python, and this list has been helpful so far.

I am trying to convert a binary file to ASCII.

Here is the format spec:

steps = int 4
value = int 4
time = float 4 * steps

So in the Python terminal I convert it like this:


>>> import struct
>>> import string
>>> f = file('binary_file','rb')
>>> line = f.readline()
>>> L = tuple(line)

Then to get the steps value I do

>>> s = L[:4]
>>> s
('\x00', '\x00', '\x00', '\x06')
>>> a = string.join( s , '')
>>> steps = struct.unpack (">l" , a)
>>> steps
(6,)

and for value

>>> v = L[4:8]
>>> v
('\x00', '\x00', '\x00', '\x08')
>>> b = string.join( v , '')
>>> value = struct.unpack (">l" , b)
>>> value
(8,)

For the time value, this is where I need help.

I know steps = 6, so

>>> 4*6 + 8
32
>>> t = L[8:32]
>>> t
('\x00', '\x00', '\x00', '\x00', '=', '\x08', '\x88', '\x89', '=', '\x88',
'\x88', '\x89', '=', '\xcc', '\xcc', '\xcd', '>', '\x08', '\x88', '\x89', '>',
'*', '\xaa', '\xab')
>>> c = string.join(t , '')
>>> time = struct.unpack (">6f", c)
>>> time
(0.0, 0.033333335071802139, 0.066666670143604279, 0.10000000149011612,
0.13333334028720856, 0.1666666716337204)

So I was happy to figure this much out.

What I need is a way to do something like this:

time = struct.unpack(steps,c)

but this does not work because steps is not an integer.

Any help would be great, thanks.


Re: [Tutor] need help with syntax

2006-01-10 Thread John Fouhy
On 11/01/06, bill nieuwendorp wrote:

Hi Bill,

Some comments ---

  import struct
  import string
  f = file('binary_file','rb')
  line = f.readline()
  L = tuple(line)

You can do slicing (things like L[:4]) on strings as well as on lists
and tuples.  So there is probably no need for you to convert to a
tuple --- just work with the lines directly.

  s = L[:4]
  s
 ('\x00', '\x00', '\x00', '\x06')
  a = string.join( s , '')

You can also do:

   a = ''.join(s)

and this is considered better python these days.

 what I need is a way to do somthing like this

 time = struct.unpack(steps,c)

 but this does not work because steps is not an integer

Sounds like you want string substitution.  You can read about it
here: http://python.org/doc/2.4.2/lib/typesseq-strings.html

For instance:

>>> steps = 6
>>> "%s" % steps
'6'
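
Applied to the struct question, the same substitution can build the whole format string. A small sketch, assuming big-endian data (the ">" prefix) as the bytes in the original post suggest; float_format is a made-up helper name:

```python
import struct


def float_format(count, byte_order=">"):
    """Build a struct format for `count` 4-byte floats, e.g. '>6f'."""
    return "%s%df" % (byte_order, count)


fmt = float_format(6)
# struct.calcsize reports how many bytes the format expects.
print(fmt, struct.calcsize(fmt))
```

With six steps the format expects 24 bytes, which agrees with the 4*6 worked out in the original post.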

HTH!

--
John.


Re: [Tutor] need help with syntax

2006-01-10 Thread bill nieuwendorp
Hi John, thanks for the tips.

I had a bit of a typo in my first post.

time = struct.unpack(steps,c)

should read something more like

 time = struct.unpack(steps+"f",c)
adding the "f" to tell struct it is in float format.

The string substitution
seems like it would work, but now I can't figure out how I would add the
"f" on the end.

Also, this still requires me to reassign 6 to steps.

Isn't this already done here?

>>> s = L[:4]
>>> s
('\x00', '\x00', '\x00', '\x06')
>>> a = string.join( s , '')
>>> steps = struct.unpack (">l" , a)
>>> steps
(6,)


what I would like to do in the end is take the format formula.

steps = int 4
value = int 4
time = float 4 * steps

and have a python script that will spit out the ascii contents,
without me having to assign 6 to steps as that number will change from
file to file.


[Tutor] Fwd: need help with syntax

2006-01-10 Thread Liam Clarke
oops, forward to list.

-- Forwarded message --
From: Liam Clarke
Date: Jan 11, 2006 4:18 PM
Subject: Re: [Tutor] need help with syntax
To: bill nieuwendorp


On 1/11/06, bill nieuwendorp wrote:
 hello all I am new to python and this list has been helpfull so far

 I am trying to convert binary file to ascii

 here is the format spec

 steps = int 4
 value = int 4
 time = float 4 * steps

 so in the python terminal terminal i convert it like this


  import struct
  import string
  f = file('binary_file','rb')
  line = f.readline()
  L = tuple(line)


Try this -

import struct

f = open('binary_file', 'rb')
toUnpack = f.read(4)
# ">" selects big-endian, matching the bytes shown ('\x00\x00\x00\x06' is 6)
steps = struct.unpack(">i", toUnpack)[0]
floatBytes = steps * 4
furtherUnpack = f.read(4 + floatBytes)  # The first 4 for value
pattern = ">i%df" % steps
data = struct.unpack(pattern, furtherUnpack)
value = data[0]
time = data[1:]

Or probably simpler to do

f = open('binary_file', 'rb')
toUnpack = f.read(8)
(steps, value) = struct.unpack(">2i", toUnpack)
floatBytes = steps * 4
furtherUnpack = f.read(floatBytes)
pattern = ">%df" % steps
time = struct.unpack(pattern, furtherUnpack)

You could then call f.readline() to advance to the next line if
needed, although binary data tends not to be line break delimited as
such...

But I found that using f.read(num_of_bytes) is the best way to do
this, especially when the data you're reading is unknown but specified
at a known offset.
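
Put together as a function, the f.read() approach might look like the sketch below. It assumes Python 3, big-endian data, and the three-field layout from the format spec; read_record is a made-up name, and the sample bytes are generated in memory rather than read from a real file:

```python
import io
import struct


def read_record(f):
    """Read one (steps, value, times) record from a binary file object."""
    header = f.read(8)                      # two 4-byte integers
    if len(header) < 8:
        raise EOFError("truncated header")
    steps, value = struct.unpack(">2i", header)
    body = f.read(4 * steps)                # steps 4-byte floats
    if len(body) < 4 * steps:
        raise EOFError("truncated float data")
    times = struct.unpack(">%df" % steps, body)
    return steps, value, times


# Build sample data in memory instead of opening a real file.
sample = struct.pack(">2i6f", 6, 8, 0.0, 1 / 30, 2 / 30, 0.1, 4 / 30, 5 / 30)
steps, value, times = read_record(io.BytesIO(sample))
print(steps, value, times)
```

Because the float count is read from the file itself, nothing has to be assigned to steps by hand, and the function can be called again to read a following record.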

You may also want to have a look at f.seek() and f.tell(), and if the
%d stuff is unfamiliar you may want to google - 'format string
inurl:python.org'

Good luck, it's a bit tricky at times*, but it works, I built a simple
iPod database parser using f.read() and string substitution for the
struct module patterns.

Regards,

Liam Clarke

*Like when your brilliant plan to have objects generate patterns for
the data they're working with by checking type founders thanks to
shorts.


Re: [Tutor] need help with syntax

2006-01-10 Thread John Fouhy
On 11/01/06, bill nieuwendorp wrote:
  time = struct.unpack(steps+"f",c)
 adding the f to tell structs it is in float format

 the string substitution
 seems like it would work but now I cant figure out how I would add the
 f on the end

Did you read up on string substitution?  You can do a few tricks with
it, but in its simplest form, it just replaces '%s' with the argument.
 eg:

'foo%sbar' % 'seven' == 'foosevenbar'

Or even:

'x%sy%sz' % ('foo', 'bar') == 'xfooybarz'

So, in your case, try something like '%sf' % steps, which gives '6f'
when steps is 6.

 also this still requires me to reassign 6 to steps

 isnt this already done here ?

  s = L[:4]
  s
 ('\x00', '\x00', '\x00', '\x06')
  a = string.join( s , '')
  steps = struct.unpack (">l" , a)
  steps
 (6,)

Yes, although in this case steps is a 1-element tuple.  So, steps[0]
will give you what you want :-)
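
A short sketch of the tuple point, assuming Python 3 bytes and the big-endian ">l" format; the variable names are only for illustration:

```python
import struct

raw = b"\x00\x00\x00\x06"                  # the four header bytes from the post

result = struct.unpack(">l", raw)          # unpack always returns a tuple
steps = result[0]                          # index it to get the plain integer
(also_steps,) = struct.unpack(">l", raw)   # or unpack the 1-tuple directly

fmt = ">%sf" % steps                       # the integer now formats cleanly
print(result, steps, fmt)
```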

--
John.


Re: [Tutor] How can I make a python script go directory by directory and execute on files of choice

2006-01-10 Thread Liam Clarke
Hi Srinivas -

For walking a directory, you can use os.walk() or os.path.walk(), but
I prefer the path module here -
http://www.jorendorff.com/articles/python/path/.

As for determining if a file is really an .XLS format file or a tab
delimited file with .xls on the end, go to www.wotsit.org, have a look
at the .XLS specifications; there'll probably be a sequence of
identifying bytes you can read to confirm that it's an XLS.
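
As a rough sketch of both steps using os.walk(): legacy binary .xls files are stored in an OLE2 compound file, which begins with the eight bytes D0 CF 11 E0 A1 B1 1A E1. The names classify_xls and scan are made up for this example, and the check only shows that a file is an OLE2 container (other formats use the same container), not that it is definitely a spreadsheet:

```python
import os

# First eight bytes of an OLE2 compound file.
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def classify_xls(path):
    """Return 'ole2' if the file starts with the OLE2 signature, else 'other'."""
    with open(path, "rb") as f:
        head = f.read(len(OLE2_MAGIC))
    return "ole2" if head == OLE2_MAGIC else "other"


def scan(base_dir):
    """Yield (path, kind) for every .xls file under base_dir."""
    for dirpath, dirnames, filenames in os.walk(base_dir):
        for name in sorted(filenames):
            if name.lower().endswith(".xls"):
                path = os.path.join(dirpath, name)
                yield path, classify_xls(path)
```

Something like `for path, kind in scan("/home/srini/data/sales"): print(kind, path)` would then list each file with its guess; files reported as "other" are the candidates for tab-delimited text.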

Regards,

Liam Clarke



[Tutor] declaring list in python

2006-01-10 Thread Logesh Pillay
Hello list

I want to declare a list of a specific size as global to some nested
function like so

def foo (n):
    A[] (of size n)
    def foo1
    ...

The only way I can think to declare the list is to use dummy values:
A = [0] * n

A = [] * n doesn't work.  [] * n = []

I'd prefer not to use dummy values I have no use for.  Is there any way?

Thanks

Logesh Pillay




Re: [Tutor] declaring list in python

2006-01-10 Thread Brian van den Broek
Logesh Pillay said unto the world upon 10/01/06 11:28 PM:
  Hello list
 
  I want to declare a list of a specific size as global to some nested
  function like so

Hi Logesh,

what problem are you trying to solve by doing this? Knowing that will 
help generate more useful answers, I suspect.


  def foo (n):
   A[] (of size n)
   def foo1
   ...
 
  The only way I can think to declare the list is to use dummy values:
   A = [0] * n
 
  A = [] * n doesn't work.  [] * n = []

A = [None] * n

would be a better way to create an n-placed list of dummy values in 
Python, I think.


  I'd prefer not to use dummy values I have no use for.  Is there any way?

This is why knowing your problem would be helpful. Built-in Python 
data structures don't have size limitations that are declared when an 
instance of the data structure is created. (Python's not C.) There is 
no way (that I know of) to create an n-placed list save creating a 
list with n objects.

So, I think you will have to either give up on not employing dummy 
values or give up on creating a list of a fixed length.
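
The two options can be sketched side by side. This is only an illustration with made-up function names, using squares as stand-in data since the real problem is not known:

```python
def squares_preallocated(n):
    # Option 1: accept placeholders. All n slots exist up front.
    a = [None] * n

    def fill():
        # The nested function sees `a` from the enclosing scope
        # and can assign to its slots by index.
        for i in range(n):
            a[i] = i * i

    fill()
    return a


def squares_grown(n):
    # Option 2: no fixed length. Start empty and append as values appear.
    a = []

    def fill():
        for i in range(n):
            a.append(i * i)

    fill()
    return a


print(squares_preallocated(4), squares_grown(4))
```

Both give the same list; the difference is only whether slots must exist before they are assigned by index.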

You could subclass list to create a class with a max. and/or min. 
length, but I think knowing more about what you want to do would be 
helpful before getting into that :-)
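
For the curious, a bare-bones version of such a subclass might look like this. It is a hypothetical sketch, assuming Python 3; BoundedList is an invented name and only append() is guarded:

```python
class BoundedList(list):
    """A list that refuses to grow past max_len through append().

    Sketch only: extend(), insert(), += and slice assignment are not
    guarded and would need the same check.
    """

    def __init__(self, max_len, items=()):
        items = list(items)
        if len(items) > max_len:
            raise ValueError("too many initial items")
        super().__init__(items)
        self.max_len = max_len

    def append(self, item):
        if len(self) >= self.max_len:
            raise ValueError("list is full (max_len=%d)" % self.max_len)
        super().append(item)


a = BoundedList(2)
a.append("x")
a.append("y")
print(a, len(a))
```

A third append would raise ValueError. Whether that is worth the trouble depends on the problem being solved.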

Best,

Brian vdB