Yes, Spencer's info is correct. This script gives an example for a date in
dd-mm-yy[yy] format:

candidate = '14-06-2023'
matcher = candidate =~ /(?x)   # enable whitespace and comments
    ^                          # start of line
    (0?[1-9]|[12]\d|3[01])     # capture day, e.g. 1, 01, 12, 30
    [\-\/]+                    # ignore separator
    (\d{1,2})                  # capture month, e.g. 1, 01, 12
    [\-\/]+                    # ignore separator
    (\d{4}|\d{2})              # capture year, e.g. 1975, 23
    $                          # end of line
/
(_, day, month, year) = matcher[0]
assert [year, month, day] == ['2023', '06', '14']

You'd need a slight tweak, such as swapping the day and month groups, to
instead/also handle USA dates in mm-dd-yy[yy] format like your 06-14-2023
sample.
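
As a rough sketch of that tweak (not tested inside NiFi), the day and month
groups can be swapped and the month restricted to 1-12. The name usDate and
the stricter month check are assumptions for illustration, not part of the
script above:

```groovy
// Hypothetical mm-dd-yy[yy] variant of the pattern; usDate is an illustrative name.
def usDate = /(?x)             # enable whitespace and comments
    ^                          # start of input
    (0?[1-9]|1[0-2])           # capture month, e.g. 1, 01, 12
    [\-\/]+                    # skip separator
    (0?[1-9]|[12]\d|3[01])     # capture day, e.g. 1, 01, 12, 30
    [\-\/]+                    # skip separator
    (\d{4}|\d{2})              # capture year, e.g. 1975, 23
    $                          # end of input
/
def candidate = '06-14-2023'
def matcher = candidate =~ usDate
// matcher[0] is [wholeMatch, group1, group2, group3]
def (_, month, day, year) = matcher[0]
assert [year, month, day] == ['2023', '06', '14']
```

A date such as 03-04-2023 is valid in both orders, so the pattern alone
cannot tell the two formats apart; you would still need to know which
convention the source uses.
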
Cheers, Paul.
On Mon, Jul 10, 2023 at 6:53 AM Spencer Allain via users wrote:
> You have changed the first grouping into a non-capture group with ?:, so
> matcher[0][3] will be null, and matcher[0][1] will be the month and
> matcher[0][2] will be the year.
>
> I also believe that you should be able to reduce [\\\-\\\/] to be simply
> [-\/] because of using slashy-string for the regex (as only forward slash
> needs to be escaped)
>
> -Spencer
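
Both points can be checked in a plain Groovy console with a small sketch.
The variable names are illustrative, and the month group is simplified to
\d{1,2}:

```groovy
def candidate = '06-14-2023'

// Non-capturing first group (?:...), as in the original pattern:
// only two groups are captured, so index 3 is missing.
def twoGroups = candidate =~ /^(?:0?[1-9]|[12]\d|3[01])[-\/]+(\d{1,2})[-\/]+(\d{4}|\d{2})$/
assert twoGroups[0] == ['06-14-2023', '14', '2023']
assert twoGroups[0][3] == null

// Capturing first group: indexes 1 to 3 are all populated.
// The simplified class [-\/] still accepts both separators.
def threeGroups = candidate =~ /^(0?[1-9]|[12]\d|3[01])[-\/]+(\d{1,2})[-\/]+(\d{4}|\d{2})$/
assert threeGroups[0] == ['06-14-2023', '06', '14', '2023']
```
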
>
> On Sunday, July 9, 2023 at 12:08:17 PM EDT, James McMahon wrote:
>
>
> Correction: this is my code...
>
> else if ( candidate =~
> /^(?:0?[1-9]|[12]\d|3[01])[\\\-\\\/]+(\d{2}|\d{1})[\\\-\\\/]+(\d{4}|\d{2})$/
> ) {
>
> log.error("BINGO!")
>
> matcher = candidate =~
> /^(?:0?[1-9]|[12]\d|3[01])[\\\-\\\/]+(\d{2}|\d{1})[\\\-\\\/]+(\d{4}|\d{2})$/
> matchedSubstring = matcher[0][0]
> log.error("Matcher: ${matcher.toString()}")
> log.error("Matched substring: ${matchedSubstring}")
> day = matchedSubstring[matcher[0][1]]
> month = matchedSubstring[matcher[0][2]]
> year = matcherSubstriong[matcher[0][3]]
>
> log.error("Day: ${day}")
> log.error("Month: ${month}")
> log.error("Year: ${year}")
>
> log.error("Length of Day: ${day.length()}")
> log.error("Length of Month: ${month.length()}")
> log.error("Length of Year: ${year.length()}")
>
> }
>
> I suspect I need to look at how I'm setting day, month, and year from
> matcher.
>
> The log output:
> 2023-07-09 16:04:47,406 ERROR [Timer-Driven Process Thread-10]
> o.a.nifi.processors.script.ExecuteScript
> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Candidate: 06-14-2023
> 2023-07-09 16:04:47,406 ERROR [Timer-Driven Process Thread-10]
> o.a.nifi.processors.script.ExecuteScript
> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] BINGO!
> 2023-07-09 16:04:47,406 ERROR [Timer-Driven Process Thread-10]
> o.a.nifi.processors.script.ExecuteScript
> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Matcher:
> java.util.regex.Matcher[pattern=^(?:0?[1-9]|[12]\d|3[01])[\\\-\\/]+(\d{2}|\d{1})[\\\-\\/]+(\d{4}|\d{2})$
> region=0,10 lastmatch=06-14-2023]
> 2023-07-09 16:04:47,406 ERROR [Timer-Driven Process Thread-10]
> o.a.nifi.processors.script.ExecuteScript
> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Matched substring:
> 06-14-2023
> 2023-07-09 16:04:47,406 ERROR [Timer-Driven Process Thread-10]
> o.a.nifi.processors.script.ExecuteScript
> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Could not parse:
> 06-14-2023
>
> On Sun, Jul 9, 2023 at 11:59 AM James McMahon wrote:
>
> Hello. I have a conditional clause in my Groovy script that attempts to
> parse a date pattern of this form: 06-14-2023. It fails - I believe in the
> matcher.
>
> I am running from a NiFi ExecuteScript processor. Here is my conditional:
>
> } else if ( candidate =~
> /^(?:0?[1-9]|[12]\d|3[01])[\\\-\\\/]+(\d{2}|\d{1})[\\\-\\\/]+(\d{4}|\d{2})$/
> ) {
>
> log.error("BINGO!")
>
> matcher = candidate =~
> /^(?:0?[1-9]|[12]\d|3[01])[\\\-\\\/]+(\d{2}|\d{1})[\\\-\\\/]+(\d{4}|\d{2})$/
> log.error("Matcher: ${matcher.toString()}")
> log.error("Matched substring: ${matchedSubstring}")
> day = matchedSubstring[matcher[0][1]]
> month = matchedSubstring[matcher[0][2]]
> year = matcherSubstriong[matcher[0][3]]
>
> log.error("Day: ${day}")
> log.error("Month: ${month}")
> log.error("Year: ${year}")
>
> log.error("Length of Day: ${day.length()}")
> log.error("Length of Month: ${month.length()}")
> log.error("Length of Year: ${year.length()}")
>
> }
>
> My log output tells me I make it into the conditional, but then I fail on
> the matcher:
>
> 2023-07-09 15:52:23,547 ERROR [Timer-Driven Process Thread-2]
> o.a.nifi.processors.script.ExecuteScript
> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Candidate: 06-14-2023
> 2023-07-09 15:52:23,547 ERROR [Timer-Driven Process Thread-2]
> o.a.nifi.processors.script.ExecuteScript
> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] BINGO!
> 2023-07-09 15:52:23,547 ERROR [Timer-Driven Process Thread-2]
> o.a.nifi.processors.script.ExecuteScript
> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Matcher:
> java.util.regex.Matcher[pattern=^(?:0?[1-9]|[12]\d|3[01])[\\\-\\/]+(\d{2}|\d{1})[\\\-\\/]+(\d{4}|\d{2})$
> region=0,10 lastmatch=]
> 2023-07-09 15:52:23,547 ERROR [Timer-Driven Process Thread-2]
> o.a.nifi.processors.script.ExecuteScript
> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Matched substring:
> *null*
> 2023-07-09 15:52:23,547 ERROR [Timer-Driven Process Thread-2]
> o.a.nifi.processors.script.ExecuteScript
> ExecuteScript[id=33a5179c-1df4-128b-52be-aaa96b947012] Could not parse:
> 06-14-2023
>
> Can anyone help me get this matcher to work?
>
> Thanks in advance for any help.
>
> Jim
>
>