On Fri, Feb 17, 2023 at 10:28:43PM -0600, David Wright wrote:
> On Fri 17 Feb 2023 at 11:30:43 (-0500), Greg Wooledge wrote:
> > On Fri, Feb 17, 2023 at 09:20:34AM -0600, David Wright wrote:
> > >   $ ps -eo '%p %C' | sed -e 's/\([^ ]\+\) /\1|/;'

> > Eww, GNUisms.
> 
> I don't keep a list of differences to hand, but I guess you'd prefer:
> 
>   $ ps -eo '%p %C' | sed -E 's/([^ ]+) /\1|/;'
>     PID|%CPU
>       1| 0.0
>       2| 0.0

That's *slightly* better, in that it works on both GNU and BSD (and
maybe some future edition of POSIX -- I've been told they're considering
adopting the -E flag).  A truly portable version would either use \{1,\}
or would simply repeat itself: [^ ][^ ]*   (The latter is by far the
more common, especially in scripts that target ancient Unixes where \{1,\}
might not work.)
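
For reference, the two portable spellings can be sketched as below. The function names are made up for illustration, and the input is a canned stand-in for `ps -eo '%p %C'` so the result does not depend on the running system.

```shell
# Portable rewrites of the substitution: no -E, no GNU-only \+.
# (Function names are hypothetical, chosen for this example.)

# "One or more non-spaces" spelled by repeating the bracket expression.
pipe_first_gap_star() {
    sed 's/\([^ ][^ ]*\) /\1|/'
}

# "One or more non-spaces" spelled with the BRE interval \{1,\}.
pipe_first_gap_interval() {
    sed 's/\([^ ]\{1,\}\) /\1|/'
}

# Canned two-column input standing in for: ps -eo '%p %C'
printf '    PID %%CPU\n      1  0.0\n' | pipe_first_gap_star
printf '    PID %%CPU\n      1  0.0\n' | pipe_first_gap_interval
```

Both calls should print the same two lines, "    PID|%CPU" and "      1| 0.0".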

However, a bigger issue is that your command only works for the two-column
case.  It doesn't support more columns:

unicorn:~$ ps -o '%p|%U|%a'
    PID|USER    |COMMAND
   1010|greg    |bash
2093990|greg    |ps -o %p|%U|%a
unicorn:~$ ps -o '%p %U %a' | sed -E 's/([^ ]+) /\1|/;'
    PID|USER     COMMAND
   1010|greg     bash
2093858|greg     ps -o %p %U %a
2093859|greg     sed -E s/([^ ]+) /\1|/;

And even if you extended it in the "obvious" way, it would break down on
columns that can contain internal whitespace (e.g. %a).
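
To make that concrete, here is a small sketch that assumes the "obvious" extension means adding the g flag to the substitution. The function name is hypothetical, and the input is one canned line in the shape of the `ps -o '%p %U %a'` output above.

```shell
# Assumed "obvious" extension: apply the same substitution globally.
# (pipe_every_gap is a made-up name for this example.)
pipe_every_gap() {
    sed -E 's/([^ ]+) /\1|/g'
}

# One canned data line; the last column (%a) contains internal spaces.
printf '2093858 greg     ps -o %%p %%U %%a\n' | pipe_every_gap
```

This should print "2093858|greg|    ps|-o|%p|%U|%a": the padding after "greg" is left behind, and the command line is chopped into five extra fields.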

> > That aside, a workaround like this is ugly and should
> > not be needed.
> 
> The OP wrote: "How can I restore the previous behaviour that
> allowed other than whitespace separators between fields?"
> 
> If that's the required format, what are the alternatives?

Because data fields can contain internal whitespace, the only way to
parse the output of ps and determine the right spot to put pipe
characters (or whatever separator) would be to parse the header row.
All of the headers listed under "AIX format specifiers" are free of
whitespace.  So, one could in theory parse that line, determine the
column numbers where each data field will end, and then replace the
spaces in those column numbers with pipe characters.
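
The first step, finding where each header word sits, might look like this sketch. It assumes fixed-width output whose headers contain no spaces; header_spans is a hypothetical name, and the header line is copied from a ps example rather than produced by a live ps.

```shell
# header_spans: read ps-style output on stdin and print
# "NAME FIRST LAST" (1-based character columns) for each header word.
# Assumes headers contain no whitespace. (Hypothetical helper.)
header_spans() {
    awk 'NR == 1 {
        line = $0
        pos = 0                          # columns already consumed
        while (match(line, /[^ ]+/)) {
            first = pos + RSTART
            last = first + RLENGTH - 1
            print substr(line, RSTART, RLENGTH), first, last
            pos = last
            line = substr(line, RSTART + RLENGTH)
        }
        exit                             # only the header row matters here
    }'
}

printf '%s\n' '%CPU RGROUP    NI     PID USER     COMMAND' | header_spans
```

Knowing the spans is not enough by itself to place the separators, since the data under a header may be wider than the header text.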

It should be noted that there appear to be two TYPES of data fields:
numeric and string.  Look at this example:

unicorn:~$ ps -o '%C %g %n %p %U %a'
%CPU RGROUP    NI     PID USER     COMMAND
 0.0 greg       0    1010 greg     bash
 0.0 greg       0 2094243 greg     ps -o %C %g %n %p %U %a

The "%CPU", "NI" and "PID" fields are right-justified.  The "RGROUP",
"USER" and "COMMAND" fields are left-justified.

This means the header parser will also need to contain knowledge about
each header -- whether it's left-justified (string) or right- (numeric).
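
A rough sketch of such a filter follows. It is not a finished tool, and it rests on assumptions that go beyond what was established above: output is fixed-width with no field overflowing its column, right-justified data ends in the same column as its header, and left-justified data starts in the same column as its header. The caller supplies the justification as one letter per column; pipe_columns and that argument format are invented for this example.

```shell
# pipe_columns JUST: JUST has one letter per column, r (right-justified)
# or l (left-justified), e.g. "rll" for '%p %U %a'. (Hypothetical helper.)
pipe_columns() {
    awk -v just="$1" '
    NR == 1 {
        # Record where each header word starts and ends (1-based).
        line = $0; pos = 0; n = 0
        while (match(line, /[^ ]+/)) {
            n++
            first[n] = pos + RSTART
            last[n] = first[n] + RLENGTH - 1
            pos = last[n]
            line = substr(line, RSTART + RLENGTH)
        }
        if (length(just) != n) {
            print "pipe_columns: need one r/l letter per column" > "/dev/stderr"
            exit 2
        }
        for (i = 1; i < n; i++) {
            a = substr(just, i, 1); b = substr(just, i + 1, 1)
            if (a == "r")
                sep[i] = last[i] + 1        # data ends where its header ends
            else if (b == "l")
                sep[i] = first[i + 1] - 1   # next data starts at its header
            else {
                # left then right: the header alone cannot place the gap
                print "pipe_columns: left-then-right boundary is ambiguous" > "/dev/stderr"
                exit 2
            }
        }
    }
    {
        out = $0
        for (i = 1; i < n; i++)
            if (substr(out, sep[i], 1) == " ")
                out = substr(out, 1, sep[i] - 1) "|" substr(out, sep[i] + 1)
        print out
    }'
}

# Canned input in the shape of: ps -o '%p %U %a'
printf '%s\n' '    PID USER     COMMAND' \
              '   1010 greg     bash' | pipe_columns rll
```

Note that the sketch gives up on a left-justified column followed by a right-justified one (RGROUP then NI above): placing that boundary needs the column widths, not just the justification, which is one more piece of per-header knowledge the real filter would have to carry.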

With all those pieces, I think the problem can be "solved", although I
wouldn't care to write such a thing.  Time spent on writing that
parser/filter would be better spent advocating to restore the previous
functionality, IMHO.
