Re: [off-ish] Regex help
Hello Chip,

Did you try this:

$x:=Split string($pathname;Folder separator)

Vincent

**
4D Internet Users Group (4D iNUG)
Archive: http://lists.4d.com/archives.html
Options: https://lists.4d.com/mailman/options/4d_tech
Unsub: mailto:4d_tech-unsubscr...@lists.4d.com
**
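For readers who have not used Split string, here is a minimal sketch of how Vincent's one-liner might be used. It assumes a 4D version that supports collections and that the code runs on macOS, where Folder separator is ":". The path value and the variable names ($parts, $volume, $fileName, $count) are hypothetical and do not come from the thread.

```4d
  // Sketch only: hypothetical Mac-style path, assumed to run on macOS
C_TEXT($pathname;$volume;$fileName)
C_LONGINT($count)
C_COLLECTION($parts)

$pathname:="Macintosh HD:Users:chip:Documents:file.txt"

  // Split string returns a collection of the substrings between separators
$parts:=Split string($pathname;Folder separator)

$count:=$parts.length  // number of path components
$volume:=$parts[0]  // first component, here the volume name
$fileName:=$parts[$count-1]  // last component, here the file name
```

A folder path that ends with a separator would produce a trailing empty element; passing the sk ignore empty strings option as a third parameter should drop it. No regex is involved at all, which may be the simplest route if the goal is only to get the path components.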
Re: [off-ish] Regex help
An alternative to ungreedy would be to exclude colons from the strings to be matched:

$folderPathMotif:="([^:]*:)"

Jeremy

> On 28 Nov 2018, at 19:27, Peter Bozek via 4D_Tech <4d_tech@lists.4d.com> wrote:
>
> If you want the match to stop at the first occurrence of : you need to make
> the * operator ungreedy by attaching ? to it. If you try the pattern (.*?:) it
> should return the substring up to the first :.
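As a rough illustration of Jeremy's suggestion, the negated class can be combined with the loop idea from elsewhere in this thread to collect every folder component. This is only a sketch: the path value and the names $path, $start, $pos, $len and $folders are hypothetical.

```4d
  // Sketch only: hypothetical Mac-style path
C_TEXT($path;$folderPathMotif)
C_LONGINT($start)
ARRAY LONGINT($pos;0)
ARRAY LONGINT($len;0)
ARRAY TEXT($folders;0)

$path:="Macintosh HD:Users:chip:Documents:file.txt"
$folderPathMotif:="([^:]*:)"  // zero or more non-colon characters, then one colon
$start:=1

  // Each call finds the next match at or after $start
While (Match regex($folderPathMotif;$path;$start;$pos;$len))
    // element 1 of the arrays describes the first capture group
  APPEND TO ARRAY($folders;Substring($path;$pos{1};$len{1}))
    // element 0 describes the whole match; continue right after it
  $start:=$pos{0}+$len{0}
End while

  // expected: $folders holds "Macintosh HD:", "Users:", "chip:", "Documents:"
```

Because [^:] can never run past a colon, the question of greedy versus ungreedy matching does not arise with this pattern. Every match consumes at least the colon, so $start always advances and the loop ends once no colon remains.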
RE: [off-ish] Regex help
Thanks

On Thu, 29 Nov 2018 08:37:42 +, Epperlein, Lutz (agendo) via 4D_Tech wrote:
> Peter answered already, but if you want to test your regex, you can
> use e.g.:
> https://regexr.com/440e2
> This is with Kirk's example.
>
> Your regex is here:
> https://regexr.com/440ee
>
> This is a nice tool which provides explanations for your regex too.
>
> Regards
> Lutz
RE: [off-ish] Regex help
Peter answered already, but if you want to test your regex, you can use e.g.:
https://regexr.com/440e2
This is with Kirk's example.

Your regex is here:
https://regexr.com/440ee

This is a nice tool which provides explanations for your regex too.

Regards
Lutz
Re: [off-ish] Regex help
Peter

Thanks for that explanation!

On Wed, 28 Nov 2018 20:27:40 +0100, Peter Bozek wrote:
> Regex behaves like that because, by default, its matching is greedy, which
> means it tries to match the pattern to as many characters as possible.
>
> If you want the match to stop at the first occurrence of : you need to make
> the * operator ungreedy by attaching ? to it.

---
Gas is for washing parts
Alcohol is for drinkin'
Nitromethane is for racing
Re: [off-ish] Regex help
On Wed, Nov 28, 2018 at 7:30 PM Chip Scheide via 4D_Tech <4d_tech@lists.4d.com> wrote:

> so I would expect(ed)
> Match regex($folderPathMotif;$File_Path;1;$path_pos;$path_len)
>
> to populate the arrays with each occurrence of ":", as your supplied
> code appears to do,
> OR
> if the Match Regex does not find/report all occurrences of ":" to
> report the FIRST instance of a ":", not the last.

Regex behaves like that because, by default, its matching is greedy, which means it tries to match the pattern to as many characters as possible. In your case, it tries to find the longest run that matches the pattern .*: and that run consists of all characters up to the last :.

If you want the match to stop at the first occurrence of : you need to make the * operator ungreedy by attaching ? to it. If you try the pattern (.*?:) it should return the substring up to the first :. While the pattern .*: says "find the longest run of characters ending with :", the pattern .*?: means "find the shortest run of characters ending with :".

The size of each array is the number of matching groups, in your case 1. This is how it works, IMHO. If the whole pattern repeats itself several times, you need to run Match regex in a loop.

HTH,

--
Peter Bozek
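To make the greedy versus ungreedy difference concrete, here is a small sketch that runs both patterns from Peter's explanation against the same string. The path value and the variable names ($path, $greedy, $ungreedy, $pos, $len) are hypothetical; only the two patterns and the Match regex call shape come from the thread.

```4d
  // Sketch only: hypothetical Mac-style path
C_TEXT($path;$greedy;$ungreedy)
ARRAY LONGINT($pos;0)
ARRAY LONGINT($len;0)

$path:="Macintosh HD:Users:chip:Documents:file.txt"

  // Greedy: .* takes as much as it can, then backs off to the last ":"
If (Match regex("(.*:)";$path;1;$pos;$len))
  $greedy:=Substring($path;$pos{1};$len{1})
    // expected: "Macintosh HD:Users:chip:Documents:"
End if

  // Ungreedy: .*? takes as little as it can, so it stops at the first ":"
If (Match regex("(.*?:)";$path;1;$pos;$len))
  $ungreedy:=Substring($path;$pos{1};$len{1})
    // expected: "Macintosh HD:"
End if
```

In both calls the arrays describe a single match: element 0 covers the whole match and element 1 the one capture group. Neither pattern reports every colon in one call, which is why a loop is needed to walk the whole path.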
Re: [off-ish] Regex help
Thanks Kirk,

That will be useful, but mostly I am wondering, based on what (very tiny) understanding I have of Regex, why the previously posted statement does not do what your loop does.

According to Wiki :
. - matches any single character
( ) - defines a marked subexpression
* - matches the preceding element zero or more times

so... as I read the definitions...
"(.*:)"

match any character(s), before a ":"
a file path (on a Mac) is : : ... :

Match Regex says:
"...If you pass arrays, the command returns the position and length of the occurrence in the element zero of the arrays and the positions and lengths of the groups captured by the regular expression in the following elements."

so I would expect(ed)
Match regex($folderPathMotif;$File_Path;1;$path_pos;$path_len)

to populate the arrays with each occurrence of ":", as your supplied code appears to do,
OR
if the Match Regex does not find/report all occurrences of ":" to report the FIRST instance of a ":", not the last.

Thanks
again
Chip

On Wed, 28 Nov 2018 10:16:01 -0800, Kirk Brooks via 4D_Tech wrote:
> Chip,
>
> I think what you want is to parse the path into its component parts.
Re: [off-ish] Regex help
Chip,

I think what you want is to parse the path into its component parts.

Try this pattern for matching:

([ \w\d_-]+):

This will match letters, numbers, spaces, underscores and dashes up to the colon. You will want to use it in a loop like so:

  // the hyphen is placed last in the class so it is read as a literal dash
$pattern:="([ \\w\\d_-]+):"
$start:=1
While(Match regex($pattern;$text;$start;$aPos;$aLen))  // pass arrays for pos and len
    // $aLen{0} will be the length of the entire match
    // $aLen{1} will be the length of the match within the parens
  APPEND TO ARRAY($aTheParts;Substring($text;$aPos{1};$aLen{1}))
  $start:=$aPos{0}+$aLen{0}  // move up to the next match
End while

On Wed, Nov 28, 2018 at 9:51 AM Chip Scheide via 4D_Tech <4d_tech@lists.4d.com> wrote:

> can anyone who has a clue help me?
>
> I am looking at some code:
> Match regex($folderPathMotif;$File_Path;1;$path_pos;$path_len)
>
> where:
> ARRAY LONGINT($path_pos;0)
> ARRAY LONGINT($path_len;0)
> $folderPathMotif:="(.*:)"
> and
> File_Path is, well.., a file path on a Mac (so folder separator is ":")
>
> When the above regex runs, $Path_pos and $path_len each have 1 element,
> and that element is a reference to the LAST occurrence of ":" in the
> file path.
>
> Why does the regex not populate the arrays with the location of ALL
> occurrences of ":", or the first occurrence of ":"?
>
> Thanks for any help...
> and off Nug help is fine

--
Kirk Brooks
San Francisco, CA

*We go vote - they go home*