First, a reply to the previous suggestions:
Since {0,1} means 'this occurs 0 or 1 times', it can be replaced by ?, which 
means the same thing. So the following also works:
^([a-z,A-Z,\.]+) ([a-z,A-Z]+(\s(jr|sr)\.?)?)
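
If you want to convince yourself that the two quantifiers behave the same, here is a minimal Python sketch. Two things in it are my own assumptions: the exact form of the earlier {0,1} pattern (I rebuilt it from the pattern above), and re.IGNORECASE standing in for a search with case sensitivity turned off. Python's re module is not BBEdit's grep engine, but for these simple patterns I would expect the same results.

```python
import re

# Assumed form of the earlier suggestion, with {0,1} on the suffix group.
WITH_BRACES = r"^([a-z,A-Z,\.]+) ([a-z,A-Z]+(\s(jr|sr)\.?){0,1})"
# The same pattern with ? in place of {0,1}.
WITH_QMARK = r"^([a-z,A-Z,\.]+) ([a-z,A-Z]+(\s(jr|sr)\.?)?)"

SAMPLES = [
    r"Felix Jose\josefe01",
    r"B.J. Surhoff\surhob.01",
    r"Eric Young Sr.\younger0",
    r"Ken Griffey Jr.\griffke02",
]


def full_name(pattern, line):
    """Return the whole matched name, or None if the pattern does not match."""
    # IGNORECASE is needed so that the lower-case (jr|sr) also matches Jr/Sr.
    m = re.search(pattern, line, re.IGNORECASE)
    return m.group(0) if m else None


for line in SAMPLES:
    # Both quantifiers must give the same result on every sample line.
    assert full_name(WITH_BRACES, line) == full_name(WITH_QMARK, line)
    print(full_name(WITH_QMARK, line))
```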

Also, the commas in the [a-z,A-Z,\.] blocks are not necessary. If you do want 
to allow a comma you can leave it in, but once is enough: [a-zA-Z,\.]. Without 
the comma you can just use [a-zA-Z\.]. Moreover, I see that the suggestion for 
Jr./Sr. is only lower case, so I assume you do not have case sensitivity 
enabled. In that case [a-z\.] would also suffice for the first names, 
resulting in this:
^([a-z\.]+) ([a-z]+(\s(jr|sr)\.?)?)
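
To show why the case-sensitivity setting matters for this shorter pattern, here is a rough Python sketch. The function name split_name is made up for the illustration, and re.IGNORECASE is my stand-in for a search with case sensitivity turned off; treat it as an approximation of what BBEdit does, not a description of it.

```python
import re

# The lower-case-only pattern; it relies on case-insensitive matching.
PATTERN = r"^([a-z\.]+) ([a-z]+(\s(jr|sr)\.?)?)"


def split_name(line, flags=re.IGNORECASE):
    """Return (first, last) where last includes an optional Jr./Sr. suffix.

    split_name is a hypothetical helper for this example only.
    """
    m = re.search(PATTERN, line, flags)
    if m is None:
        return None
    return m.group(1), m.group(2)


for line in [
    r"John McDonald\mcdonjo03",
    r"J.T. Snow\snowj.01",
    r"Ken Griffey Jr.\griffke02",
]:
    print(split_name(line))

# With case-sensitive matching the same pattern fails on capitalised names,
# because "F" is not in [a-z\.].
print(split_name(r"Felix Jose\josefe01", flags=0))
```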

Now for an alternative suggestion based on the data shown:
In your example data the full name always comes before a \. If the first name 
is 'everything before the first space' and the last name is 'everything after 
that space up to the \', you can simply use:
(.+?) (.+?)\\

The first group will match everything until the first space, the second 
group everything after that until the first backslash, and both groups must 
contain at least one character.
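
As a quick check of that description, here is a small Python sketch that runs the pattern over the sample lines from your post. The helper name first_last and the single-word line at the end are hypothetical additions of mine, and Python's re module is only an approximation of BBEdit's grep, so take the results as indicative.

```python
import re

# Lazy groups: first name up to the first space, last name up to the backslash.
PATTERN = r"(.+?) (.+?)\\"

SAMPLES = [
    r"Felix Jose\josefe01",
    r"Tony Clark\clarkto02",
    r"Matt Williams\willima04",
    r"John McDonald\mcdonjo03",
    r"Mark Grace\gracema01",
    r"Steve Finley\finlest01",
    r"B.J. Surhoff\surhob.01",
    r"J.T. Snow\snowj.01",
    r"Eric Young Sr.\younger0",
    r"Ken Griffey Jr.\griffke02",
]


def first_last(line):
    """Return (first, last) or None; first_last is a hypothetical helper."""
    m = re.search(PATTERN, line)
    if m is None:
        return None
    return m.group(1), m.group(2)


for line in SAMPLES:
    print(first_last(line))

# A made-up line with no space before the backslash: there is nothing to put
# in the second group, so the pattern does not match at all.
print(first_last(r"Cher\cher01"))
```

Note that no case-insensitive flag is needed here, since the pattern does not name any letters at all.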


On Saturday, February 29, 2020 at 2:40:53 PM UTC+1, anotherhoward wrote:
>
> In a previous post, "Extracting parts of names from full names," I asked 
> about how to extract the first and last names from a string.
>
> *Here is an input sample:*
>
> Felix Jose\josefe01
> Tony Clark\clarkto02
> Matt Williams\willima04
> John McDonald\mcdonjo03
> Mark Grace\gracema01
> Steve Finley\finlest01
> B.J. Surhoff\surhob.01
> J.T. Snow\snowj.01
>
> When I re-examined the full dataset, I noticed two cases not included 
> previously.
> *Eric Young Sr.\younger0*
> *Ken Griffey Jr.\griffke02*
>
> Based on the helpful feedback I got in my previous post, now what I need 
> is a pattern that would handle not only all the original input items plus 
> the two new cases. Further, what I would like extracted is everything up to 
> but not including the backslash.
>
> This attempt (of mine) finds everything in the original dataset, but I was 
> unable to expand it to include either the "Sr." or "Jr.":
> *^([a-z,A-Z,\.]+) ([a-z,A-Z]+)*
>
> 1. Is there a way my attempt can be expanded so it includes "Sr." or "Jr." 
> when a name has either of them?
> 2. What pattern would you write to accomplish the task?
>
