On 16/04/2022 14:37, wilson wrote:
In a shell script, how can I use a regex to extract values from a string?
Some type conversion may also be required.
For instance, given the string: "black berry 12",
I want to get the name: black berry [String]
and the price: 12 [Int]
I have done this in other languages such as Java/Scala:
scala> val regex = """(.+)\s+(\d+)""".r
val regex: scala.util.matching.Regex = (.+)\s+(\d+)
scala> str match {
| case regex(name,price) => (name,price.toInt)
| }
val res2: (String, Int) = (black berry,12)
But for a simple task I don't want to write a Java program,
so what is the shell solution?
Python is a convenient and scriptable solution for many text processing
problems. You can call it from within bash:
### suppose
X="black berry 12"
### then as a python3 one-liner with -c
python3 -c "import re; m = re.search(r'(.+)\\s+(\\d+)', '$X');
print((m.group(1), m.group(2)))"
# prints: ('black berry', '12')
### and you could get the output via command substitution (edit the
### print statement to get just one group; you could do this twice to get
### both items in separate bash variables)
Y=$(python3 -c "import re; m = re.search(r'(.+)\\s+(\\d+)', '$X');
print((m.group(1), m.group(2)))")
echo "$Y"
# prints: ('black berry', '12')
### you can also inline python as a bash here-doc
python3 <<EOF
import re
m = re.search(r'(.+)\\s+(\\d+)', '$X')
print((m.group(1), m.group(2)))
EOF
# prints: ('black berry', '12')
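Note the change to your regex: '\' is written as '\\' because the Python
code sits inside bash double quotes (or an unquoted here-doc), which is
what allows $X to be interpolated, and there bash treats a backslash as
an escape character. Strictly, bash only consumes a backslash before a
few characters such as \ and $, so a lone \d would pass through
unchanged, but doubling is the safer habit. Be aware too that pasting $X
into the Python source breaks if the data itself contains a single
quote. All of this could be avoided if single quotes were used, but then
you would need another way to access your data, such as reading it from
stdin.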
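As a minimal sketch of that single-quoted alternative: the Python below
takes its data from a stream rather than having bash paste $X into the
source, so the pattern can be a raw string with single backslashes. The
name extract_fields is my own, chosen for illustration, and io.StringIO
stands in for sys.stdin so that the block runs on its own.

```python
import io
import re

# Raw string: no doubling needed once bash is not rewriting the source.
PATTERN = re.compile(r'(.+)\s+(\d+)')

def extract_fields(stream):
    """Yield (name, price) for each line of `stream` that matches.

    extract_fields is a hypothetical helper, not part of any library.
    """
    for line in stream:
        m = PATTERN.search(line)
        if m:
            # Convert the price to int here, as the question asked.
            yield m.group(1), int(m.group(2))

# io.StringIO stands in for sys.stdin so the example is self-contained.
for name, price in extract_fields(io.StringIO("black berry 12\n")):
    print(name)
    print(price)
```

With io.StringIO(...) swapped for sys.stdin and the code saved in a file
(say extract.py, a made-up name), it could be fed from bash with
something like: printf '%s\n' "$X" | python3 extract.py
A single quote inside the data is then no longer a problem.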
It depends on your workflow, but you might consider writing the whole
thing in Python as a script with the first line #!/usr/bin/python3.
Many is the time I have rewritten a bash script in Python on realising
that I wanted proper types, error handling, and unit tests.
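A rough sketch of what such a script might look like, with a typed
return value, an explicit error for bad input, and a small unit test.
The names Item, parse_item, and ParseItemTest are hypothetical, picked
here for illustration; the regex is the one from the question.

```python
#!/usr/bin/python3
import re
import sys
import unittest
from typing import NamedTuple

class Item(NamedTuple):
    """Hypothetical record type: a name and an integer price."""
    name: str
    price: int

PATTERN = re.compile(r'(.+)\s+(\d+)')

def parse_item(text: str) -> Item:
    """Split 'black berry 12' into Item('black berry', 12)."""
    # fullmatch rejects trailing junk after the price.
    m = PATTERN.fullmatch(text.strip())
    if m is None:
        raise ValueError(f"expected '<name> <price>', got {text!r}")
    return Item(m.group(1), int(m.group(2)))

class ParseItemTest(unittest.TestCase):
    def test_name_and_price(self):
        self.assertEqual(parse_item("black berry 12"), Item("black berry", 12))

    def test_missing_price(self):
        with self.assertRaises(ValueError):
            parse_item("black berry")

if __name__ == "__main__":
    # Parse each command-line argument; report bad ones on stderr.
    for arg in sys.argv[1:]:
        try:
            item = parse_item(arg)
            print(item.name, item.price, sep="\t")
        except ValueError as err:
            print(err, file=sys.stderr)
```

Saved as, say, extract.py and made executable, it could be called from
bash as ./extract.py "$X", and the tests run with
python3 -m unittest extract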
Kind regards,
--
Ash Joubert <a...@transient.nz>
Director
Transient Software Limited <https://transient.nz/>
New Zealand