On 16/04/2022 14:37, wilson wrote:
In a shell script, how can I use a regex to extract values from a string? Some type conversion may also be required.
For instance, take the string: "black berry 12".
I want to get the name: black berry [String]
and the price: 12 [Int]
I did this in other languages such as Java/Scala:
scala> val regex = """(.+)\s+(\d+)""".r
val regex: scala.util.matching.Regex = (.+)\s+(\d+)
scala> str match {
      |   case regex(name,price) => (name,price.toInt)
      | }
val res2: (String, Int) = (black berry,12)
But sometimes for a simple task I don't want to write a Java program.
So what's the shell solution?

Python is a convenient and scriptable solution for many text processing problems. You can call it from within bash:


### suppose

X="black berry 12"


### then as a python3 one-liner with -c

python3 -c "import re; m = re.search(r'(.+)\\s+(\\d+)', '$X'); print((m.group(1), m.group(2)))"

# prints: ('black berry', '12')


### and you could get the output via command substitution (edit the print statement to get just one group; you could do this twice to get both items in separate bash variables)

Y=$(python3 -c "import re; m = re.search(r'(.+)\\s+(\\d+)', '$X'); print((m.group(1), m.group(2)))")

echo "$Y"

# prints: ('black berry', '12')
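
To avoid running Python twice, one option is to print both groups on a single line with a separator that bash can split on. This is only a sketch: fields_for_shell is a hypothetical helper name, and it assumes a tab never appears in the name.

```python
import re


def fields_for_shell(text):
    """Return 'name<TAB>price' for a line such as 'black berry 12'."""
    m = re.search(r"(.+)\s+(\d+)", text)
    if m is None:
        raise ValueError(f"no trailing price in {text!r}")
    # A tab is used as the separator because the name itself may contain spaces.
    return f"{m.group(1)}\t{m.group(2)}"


print(fields_for_shell("black berry 12"))
```

If that output is captured in Y as above, bash can split it into two variables with: IFS=$'\t' read -r NAME PRICE <<< "$Y"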


### you can also inline python as a bash here-doc

python3 <<EOF
import re
m = re.search(r'(.+)\\s+(\\d+)', '$X')
print((m.group(1), m.group(2)))
EOF

# prints: ('black berry', '12')


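Note the change to your regex: '\' is written as '\\' because the Python code sits inside bash double quotes (or an unquoted here-doc), which is what lets bash interpolate $X , and in that context bash collapses '\\' to a single '\'. Strictly speaking, bash leaves '\s' and '\d' untouched there, so the doubling is a precaution rather than a requirement. It could be avoided entirely if single quotes were used, but then you would need another way to pass in your data, such as a command-line argument or reading it from stdin. Passing the data separately also stops a quote character inside $X from breaking the Python code.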
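
As a minimal sketch of that alternative, the data can be handed to Python as a command-line argument instead of being interpolated into the code, so the regex needs no extra escaping. The function name extract and the file name extract.py are hypothetical.

```python
import re
import sys


def extract(text):
    """Return (name, price) as strings, or None if the text does not match."""
    m = re.search(r"(.+)\s+(\d+)", text)
    return (m.group(1), m.group(2)) if m else None


# The data arrives as the first command-line argument, so bash never
# rewrites the Python source and quotes inside the data are harmless.
if __name__ == "__main__" and len(sys.argv) > 1:
    print(extract(sys.argv[1]))
```

Saved as extract.py, it would be called from bash as: python3 extract.py "$X"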

It depends on your workflow, but you might consider writing the whole thing in Python as a script with first line #!/usr/bin/python3 . Many is the time I rewrote a bash script in Python when I realised that I wanted proper types, error handling, and unit tests.
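
For illustration, a minimal sketch of such a script follows. The names Item and parse_item are hypothetical, and it uses fullmatch rather than search on the assumption that the whole line should fit the pattern, as in the Scala match.

```python
#!/usr/bin/python3
import re
import sys
from typing import NamedTuple


class Item(NamedTuple):
    name: str
    price: int


ITEM_RE = re.compile(r"(.+)\s+(\d+)")


def parse_item(text: str) -> Item:
    """Parse '<name> <price>' into a typed Item, or raise ValueError."""
    m = ITEM_RE.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"expected '<name> <price>', got {text!r}")
    # int() gives the [Int] conversion asked for in the question.
    return Item(name=m.group(1), price=int(m.group(2)))


if __name__ == "__main__" and len(sys.argv) > 1:
    try:
        print(parse_item(sys.argv[1]))
    except ValueError as err:
        # Report the problem on stderr and exit with a non-zero status.
        sys.exit(str(err))
```

Because parse_item is an ordinary function, it can be imported and checked from a unit test without going through the shell at all.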

Kind regards,

--
Ash Joubert <a...@transient.nz>
Director
Transient Software Limited <https://transient.nz/>
New Zealand
