Hi Kent,

Once again, thanks a lot. The problem is solved now; your suggestions work
like a charm. You were absolutely right about the last group matching.
I modified my matching pattern to:

oRe = re.compile( "(\d\d_\d\d\_)(\d\d(\D|$))" )

Instead of

oRe = re.compile( "(\d\d_\d\d\_)(\d\d)" )


I had no idea you could nest groups inside groups. I tried it on a
hunch and, whoa, it works. This is fantastic!
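
As a quick check of what the extra (\D|$) subgroup buys, here is a minimal
sketch comparing the two patterns. The names OLD_RE, NEW_RE, REPLACEMENT and
pad are made up for this illustration, and the patterns are written as raw
strings without the redundant \_ escape, which should match the same text.

```python
import re

# Original pattern: nothing guards the end of the last two digits.
OLD_RE = re.compile(r"(\d\d_\d\d_)(\d\d)")
# Guarded pattern: the last two digits must be followed by a non-digit
# or by the end of the string.
NEW_RE = re.compile(r"(\d\d_\d\d_)(\d\d(\D|$))")

# \g<1> is used instead of \1 because r'\10\2' would be read as group 10.
REPLACEMENT = r"\g<1>0\2"

def pad(pattern, name):
    """Return name with a 0 inserted before the matched shot number."""
    return pattern.sub(REPLACEMENT, name)

old_three = pad(OLD_RE, "mt_03_04_044_anim")  # unguarded: also hits three digits
new_three = pad(NEW_RE, "mt_03_04_044_anim")  # guarded: left untouched
new_two = pad(NEW_RE, "mt_03_04_04_anim")     # guarded: two digits get padded

print(old_three)  # mt_03_04_0044_anim
print(new_three)  # mt_03_04_044_anim
print(new_two)    # mt_03_04_004_anim
```

Because the guard character sits inside group 2, it is copied back by \2 and
nothing is lost from the name.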

So here is the full script, followed by the output:


import os, re


def matchShot( sSubString, oRe ):
        
        sNewString = oRe.sub( r'\g<1>0\2', sSubString )
        return sNewString




def processPath( sString, bTest ):
        
        """
        ARGUMENTS:
        sString (string): the path to process
        bTest (boolean): test the validity of paths at each step of the script
        """
        
        # Create regular expression object
        oRe = re.compile( "(\d\d_\d\d\_)(\d\d(\D|$))" )
        
        
        if bTest == True:
                # Test validity of integral path
                if not os.path.exists( sString ): return None
                
        # Break-up path
        aString = sString.split( os.sep )
        
        aNewPath = []
        
        # Iterate individual components
        for sSubString in aString:
                
                
                if bTest == True:
                        # Test whether this component exists when appended
                        # to the path built so far
                        sTempPath = '\\'.join( aNewPath + [ sSubString ] )
                        if not os.path.exists( sTempPath ):
                                sSubString = matchShot( sSubString, oRe )
                else: sSubString = matchShot( sSubString, oRe )
                
                
                aNewPath.append( sSubString )
                
                
                if bTest == True:
                        # Test again if path is valid with this substring
                        sTempPath = '\\'.join( aNewPath )
                        if not os.path.exists( sTempPath ): return None
        
        
        sNewPath = '\\'.join( aNewPath )
        print( sNewPath )




processPath( r'C:\temp\MT_03_03_03\allo.txt', False )
processPath( r'C:\temp\MT_03_04_04\mt_03_04_04_anim_v1.scn', False )
processPath( r'C:\temp\MT_03_05_005_anim\mt_03_05_05_anim_v1.scn', False )
processPath( r'C:\temp\MT_03_06_006\mt_03_06_006_anim_v1.scn', False )

# ============================================================

C:\temp\MT_03_03_003\allo.txt
C:\temp\MT_03_04_004\mt_03_04_004_anim_v1.scn
C:\temp\MT_03_05_005_anim\mt_03_05_005_anim_v1.scn
C:\temp\MT_03_06_006\mt_03_06_006_anim_v1.scn


This is exactly what I was after. Thanks a lot!!
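
For anyone who wants to reproduce the output listed above without a Windows
box, here is a condensed sketch of the bTest=False code path only. It makes a
few assumptions that are not in the script: the pattern is compiled once at
module level (as suggested in the quoted reply), the separator is hard-coded
to a backslash instead of os.sep, and the path is returned rather than
printed. SHOT_RE, pad_shot and fix_path are hypothetical names.

```python
import re

# Compiled once at module level (assumption: follows the advice in the
# quoted reply rather than the script, which compiles inside processPath).
SHOT_RE = re.compile(r"(\d\d_\d\d_)(\d\d(\D|$))")

def pad_shot(component):
    """Insert a 0 before a two-digit shot number in one path component."""
    return SHOT_RE.sub(r"\g<1>0\2", component)

def fix_path(path, sep="\\"):
    """Apply pad_shot to every component of path and return the new path.

    sep defaults to a backslash (assumption) so the sketch behaves the
    same on non-Windows systems; the script itself splits on os.sep.
    """
    return sep.join(pad_shot(part) for part in path.split(sep))

paths = [
    r"C:\temp\MT_03_03_03\allo.txt",
    r"C:\temp\MT_03_04_04\mt_03_04_04_anim_v1.scn",
    r"C:\temp\MT_03_05_005_anim\mt_03_05_05_anim_v1.scn",
    r"C:\temp\MT_03_06_006\mt_03_06_006_anim_v1.scn",
]

results = [fix_path(p) for p in paths]
for result in results:
    print(result)
```

Returning the string instead of printing it makes the function easy to check
against the expected paths.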

Bernard




On 9/8/05, Kent Johnson <[EMAIL PROTECTED]> wrote:
> Bernard Lebel wrote:
> > Ok, I think I understand what is going on: I'm using a 0 in the
> > replacement argument, between the two groups. If I try with a letter
> > or other types of characters it works fine. So how can I use a digit
> > here?
> 
> There is a longer syntax for \1 - \g<1> means the same thing but without the 
> ambiguity of where it ends. So you can use r'\g<1>0\2' as your substitution 
> string.
> 
> >>def matchShot( sSubString ):
> >>
> >>        # Create regular expression object
> >>        oRe = re.compile( "(\d\d_\d\d\_)(\d\d)" )
> >>
> >>        oMatch = oRe.search( sSubString )
> >>        if oMatch != None:
> >>                sNewString = oRe.sub( r'\10\2', sSubString )
> >>                return sNewString
> >>        else:
> >>                return sSubString
> 
> You don't have to do the search, oRe.sub() won't do anything if there is no 
> match. Also if you are doing this a lot you should pull the re.compile() out 
> of the function (so oRe is a module variable), this is an expensive step that 
> only has to be done once.
> 
> You hinted in your original post that you are trying to find strings where 
> the last _\d\d has only two digits. The re you are using will also match 
> something like 'mt_03_04_044_anim' and your matchShot() will change that to 
> 'mt_03_04_0044_anim'. If that is not what you want you have to put some kind 
> of a guard at the end of the re - something that won't match a digit. If you 
> always have the _ at the end it is easy, just use r"(\d\d_\d\d\_)(\d\d_)". If 
> you can't count on the underscore you will have to be more clever.
> 
> Kent
> 
> _______________________________________________
> Tutor maillist  -  Tutor@python.org
> http://mail.python.org/mailman/listinfo/tutor
>