Re: Reg Ex help

2006-05-12 Thread Anthra Norell
>>> se = SE.SE (' "~/[A-Za-z0-9_]+/CHECKEDOUT~==" | /= CHECKEDOUT=')
>>> se ('/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT')
'dbg_for_python'

If I understand your problem, this might be a solution. It is a stream
editor I devised on the impression that it could handle in a simple manner a
number of relatively simple problems on this list for which no
commensurately simple methodologies seem to exist. I intend to propose it to
the group when I finish the doc. Meantime who do I propose it to?

Frederic


- Original Message -
From: "don" <[EMAIL PROTECTED]>
Newsgroups: comp.lang.python
To: 
Sent: Thursday, May 11, 2006 7:39 PM
Subject: Reg Ex help


> I have a string  from a clearcase cleartool ls command.
>
> /main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT
> from /main/parallel_branch_1/release_branch_1.0/4
>
> I want to write a regex that gives me the branch the file was
> checkedout on ,in this case - 'dbg_for_python'
>
> Also if there is a better way than using regex, please let me know.
>
> Thanks in advance,
> Don
>
> --
> http://mail.python.org/mailman/listinfo/python-list

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Reg Ex help

2006-05-12 Thread Edward Elliott
bruno at modulix wrote:

> parts = s.replace(' ', '/').strip('/').split('/')
> branch = parts[parts.index('CHECKEDOUT') - 1]
>  
> Edward Elliott wrote:
>>
>> marker = s.index('/CHECKEDOUT')
>> branch = s [s.rindex('/', 0, marker) + 1 : marker]
> 
> Much cleaner than mine. I shouldn't try to code when it's time to bed !-)

Not terribly readable though, hard to tell what the magic slice indexes
mean.  Yours is easier to follow.  I think I'd just use a regex though.
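
One way to make the slice indexes less magic is to name them. A minimal
sketch, assuming the string from the original post; the names start and end
are only illustrative:

```python
s = ("/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT "
     "from /main/parallel_branch_1/release_branch_1.0/4")

# Position of the slash that introduces the CHECKEDOUT marker.
end = s.index('/CHECKEDOUT')
# One past the last slash before the marker, i.e. where the branch name starts.
start = s.rindex('/', 0, end) + 1
branch = s[start:end]
print(branch)
```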

-- 
Edward Elliott
UC Berkeley School of Law (Boalt Hall)
complangpython at eddeye dot net


Re: Reg Ex help

2006-05-12 Thread bruno at modulix
Edward Elliott wrote:
(snip)
>>don a écrit :
>>
>>>Also if there is a better way than using regex, please let me know.
>>
(snip)
> 
> I wouldn't call these better (or worse) than regexes, but a slight variation
> on the above:
> 
> marker = s.index('/CHECKEDOUT')
> branch = s [s.rindex('/', 0, marker) + 1 : marker]

Much cleaner than mine. I shouldn't try to code when it's time to bed !-)


-- 
bruno desthuilliers
python -c "print '@'.join(['.'.join([w[::-1] for w in p.split('.')]) for
p in '[EMAIL PROTECTED]'.split('@')])"

Re: Reg Ex help

2006-05-11 Thread Paddy
P.S.

This is how it works:

>>> s = ("/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT "
...      "from /main/parallel_branch_1/release_branch_1.0/4")
>>> s
'/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT from /main/parallel_branch_1/release_branch_1.0/4'
>>> s.split()
['/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT',
'from', '/main/parallel_branch_1/release_branch_1.0/4']
>>> s.split()[0]
'/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT'
>>> s.split()[0].split('/')
['', 'main', 'parallel_branch_1', 'release_branch_1.0',
'dbg_for_python', 'CHECKEDOUT']
>>> s.split()[0].split('/')[-1]
'CHECKEDOUT'
>>> s.split()[0].split('/')[-2]
'dbg_for_python'
>>> s = ("/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT "
...      "from /main/parallel_branch_1/release_branch_1.0/4")
>>> s.split()[0].split('/')[-2]
'dbg_for_python'
>>> 

- Paddy.



Re: Reg Ex help

2006-05-11 Thread Paddy
If you have strings of all the CHECKEDOUT (is this from the lsco
command?), then the following might work for you:

>>> s = ("/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT "
...      "from /main/parallel_branch_1/release_branch_1.0/4")
>>> s.split()[0].split('/')[-2]
'dbg_for_python'
>>> 
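
The same split can be applied to every line of the command output. A rough
sketch, assuming each line has the shape shown above; the second sample line
and the helper name branch_of are made up for illustration:

```python
def branch_of(line):
    """Return the path component just before CHECKEDOUT in one output line."""
    # The first whitespace-separated field is the checked-out version path.
    version_path = line.split()[0]
    # The second-to-last component is the branch; the last one is CHECKEDOUT.
    return version_path.split('/')[-2]

lines = [
    "/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT "
    "from /main/parallel_branch_1/release_branch_1.0/4",
    # Hypothetical second line, not from the original post.
    "/main/other_branch/CHECKEDOUT from /main/3",
]
print([branch_of(line) for line in lines])
```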

- Paddy.



Re: Reg Ex help

2006-05-11 Thread Edward Elliott
Bruno Desthuilliers wrote:
> don a écrit :
>> Also if there is a better way than using regex, please let me know.
> 
> s = ("/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT "
>      "from /main/parallel_branch_1/release_branch_1.0/4")
> parts = s.replace(' ', '/').strip('/').split('/')
> branch = parts[parts.index('CHECKEDOUT') - 1]

I wouldn't call these better (or worse) than regexes, but a slight variation
on the above:

marker = s.index('/CHECKEDOUT')
branch = s [s.rindex('/', 0, marker) + 1 : marker]

This version will throw exceptions when the marker isn't found, which may or
may not be preferable under the circumstances.
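
To illustrate the point about exceptions: str.index and str.rindex raise
ValueError when the substring is missing. A sketch, assuming that returning
None is an acceptable fallback (the helper name branch_or_none is
hypothetical):

```python
def branch_or_none(s):
    """Return the component before /CHECKEDOUT, or None if it is absent."""
    try:
        marker = s.index('/CHECKEDOUT')
        return s[s.rindex('/', 0, marker) + 1:marker]
    except ValueError:
        # Raised by index() when the marker is missing, or by rindex()
        # when no slash precedes it.
        return None

print(branch_or_none("/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT"))
print(branch_or_none("/main/parallel_branch_1/release_branch_1.0/4"))
```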


Re: Reg Ex help

2006-05-11 Thread Bruno Desthuilliers
don a écrit :
> I have a string  from a clearcase cleartool ls command.
> 
> /main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT
> from /main/parallel_branch_1/release_branch_1.0/4
> 
> I want to write a regex that gives me the branch the file was
> checkedout on ,in this case - 'dbg_for_python'
> 
> Also if there is a better way than using regex, please let me know.

s = ("/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT "
     "from /main/parallel_branch_1/release_branch_1.0/4")
parts = s.replace(' ', '/').strip('/').split('/')
branch = parts[parts.index('CHECKEDOUT') - 1]


Re: Reg Ex help

2006-05-11 Thread Mirco Wahab
Hi don
> I have a string  from a clearcase cleartool ls command.
> /main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT
> from /main/parallel_branch_1/release_branch_1.0/4
> I want to write a regex that gives me the branch the file was
> checkedout on ,in this case - 'dbg_for_python'
> Also if there is a better way than using regex, please let me know.

This is a good situation for a regex, because the
other solutions won't easily cope with differently
structured strings.

If you know that you will need the string
before CHECKEDOUT, you can, for example use some
nice positive lookahead (mentioned today here)

pseudo: take all strings between / ... / and
return 'em if the next thing is CHECKEDOUT
(or something else):

/ ([^/]+) / (?=CHECKEDOUT)

Here [^/] is a character class meaning 'not /',
[^/]+ matches it one or more times, and the
parentheses in ([^/]+) capture the match.

The code:

   import re

   t = ('/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT '
        'from /main/p...')
   r = r'/([^/]+)/(?=CHECKEDOUT)'

   print(re.search(r, t).group(1))

would do the job, independent of the structure
of string - except the /CHECKEDOUT thing (which
has to be there)
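
The spaced-out pseudo pattern above can be kept almost verbatim with
re.VERBOSE, which ignores whitespace in the pattern. A small sketch; the
check for None is an addition, since re.search returns None when
/CHECKEDOUT is not there, and the function name branch is only illustrative:

```python
import re

# Same pattern as above, written with the spaces from the pseudo version.
pattern = re.compile(r' / ([^/]+) / (?=CHECKEDOUT) ', re.VERBOSE)

def branch(t):
    m = pattern.search(t)
    # search() gives None when no component is followed by CHECKEDOUT.
    return m.group(1) if m else None

print(branch('/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT '
             'from /main/parallel_branch_1/release_branch_1.0/4'))
print(branch('/main/parallel_branch_1/release_branch_1.0/4'))
```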

If there are 'better ways' - that depends on
'better ways for whom?'. If you can handle
the Railgun, why bother with the Pistols ;-)

Regards

M.


Re: Reg Ex help

2006-05-11 Thread Aaron Barclay
Hi don,

there may well be a better way than regex, although I find them useful
and use them a lot.

The way they work would be dependent on knowing some things. For
example, if the dir you are after is always 4 deep in the structure
you could try something like...

import re

path = ('/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT '
        'from /main/parallel_branch_1/release_branch_1.0/4')

p = re.compile(r'/\S*/\S*/\S*/(\S*)/')
m = re.search(p, path)

if m:
    print(m.group(1))
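
One caveat: \S also matches '/', so with a deeper path the greedy \S* can
swallow extra components. A sketch of a stricter variant, assuming the
directory really is always the fourth component from the root; the
[^/\s]* class and the use of match() are substitutions, not part of the
pattern above:

```python
import re

path = ('/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT '
        'from /main/parallel_branch_1/release_branch_1.0/4')

# [^/\s]* cannot cross a slash, so each piece is exactly one path component.
p = re.compile(r'/[^/\s]*/[^/\s]*/[^/\s]*/([^/\s]*)/')
# match() anchors at the start, so the depth is counted from the root.
m = p.match(path)
if m:
    print(m.group(1))
```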
   

This is a good reference...
http://www.amk.ca/python/howto/regex/

Hope that helps,
aaron.

don wrote:

>I have a string  from a clearcase cleartool ls command.
>
>/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT
>from /main/parallel_branch_1/release_branch_1.0/4
>
>I want to write a regex that gives me the branch the file was
>checkedout on ,in this case - 'dbg_for_python'
>
>Also if there is a better way than using regex, please let me know.
>
>Thanks in advance,
>Don
>
>  
>



Re: Reg Ex help

2006-05-11 Thread Tim Chase
> /main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT
> from /main/parallel_branch_1/release_branch_1.0/4
> 
> I want to write a regex that gives me the branch the file was
> checkedout on ,in this case - 'dbg_for_python'
> 
> Also if there is a better way than using regex, please let me know.

Well, if you have it all in a single string:

s = ("/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT "
     "from /main/parallel_branch_1/release_branch_1.0/4")

you can do

branch = s.split("/")[4]

which returns the branch, assuming the path from root is the 
same for each item in question.

If not, you can tinker with something like

import re
r = re.compile(r'/([^/]*)/CHECKEDOUT')
m = r.search(s)

which should make m.group(1) the resulting item.  You 
don't give much detail regarding what is constant (the 
number of subdirectories in the path?  the CHECKEDOUT 
portion?, etc) so it's kinda hard to figure out what is most 
globally applicable.
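
To see why it matters what is constant, here is a sketch comparing the two
approaches on a path with a different depth. The second path is invented for
the example, and search() is used because match() only matches at the start
of the string:

```python
import re

r = re.compile(r'/([^/]*)/CHECKEDOUT')

paths = [
    "/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT "
    "from /main/parallel_branch_1/release_branch_1.0/4",
    # Hypothetical shallower path, not from the original post.
    "/main/dbg_for_python/CHECKEDOUT from /main/4",
]

results = []
for s in paths:
    by_index = s.split("/")[4]        # depends on the depth being fixed
    by_regex = r.search(s).group(1)   # depends only on /CHECKEDOUT
    results.append((by_index, by_regex))
print(results)
```

Only the regex still finds the branch on the shallower path; the fixed index
picks up a different component there.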

-tkc





Re: Reg Ex help

2006-05-11 Thread James Thiele
don wrote:
> I have a string  from a clearcase cleartool ls command.
>
> /main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT
> from /main/parallel_branch_1/release_branch_1.0/4
>
> I want to write a regex that gives me the branch the file was
> checkedout on ,in this case - 'dbg_for_python'
>
> Also if there is a better way than using regex, please let me know.
>
> Thanks in advance,
> Don

Not regex, but does this do what you want?
>>> s = "/main/parallel_branch_1/release_branch_1.0/dbg_for_python/CHECKEDOUT"
>>> s = s + " from /main/parallel_branch_1/release_branch_1.0/4"
>>> s.split('/')[4]
'dbg_for_python'
