Re: Structured (XML-like) input/output for shell apps?

2005-06-13 Thread GOMBAS Gabor
Hi,

On Sat, Jun 11, 2005 at 07:40:10PM +0200, Olaf van der Spek wrote:

 Many shell apps/scripts output data in tables, for example ls -l, ps 
 aux, top, netstat, etc.
 At the moment, most of these apps use fixed-width columns with a 
 variable-width last-column.
 This results in (unnecessary) truncation, for example:
 Debian-  11918  0.0  0.1  4428 1464 ?Ss   Jun05   0:00 
 /usr/sbin/exim4 -bd -q30m
 tcp 0 0 TC218-187-80-45.2:35589 bananensaft.inline.:www ESTABLISHEDproxy 
 153239
 
 Also, because the output isn't structured (in a way easily readable by 
 machines), using the data in a script isn't (very) easy and is likely to 
 break due to strict dependency on the syntax.
 
 Are there already any plans to solve these issues?

Yes. The commands you mention were designed for _human_ consumption. Do
not use them in scripts without good reasons. There are a lot of
commands to get well-formatted output without truncation. For example,
ls has a -n option for exactly this reason; stat(1) can be used
instead of ls -l to avoid clipping; ps has a _lot_ of formatting
options itself and all the data can be found under /proc in an easily
parseable format etc. You just have to select the right tool for the job
(that also includes using more powerful scripting languages if the task
is complicated).
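
As a minimal sketch of the "right tool" idea in a scripting language (Python here, as an assumption; the helper name list_sizes and the .ab suffix are only illustrative), a script can ask for file metadata directly instead of parsing the columns of ls -l:

```python
import os

def list_sizes(directory, suffix=""):
    """Return sorted (name, size_in_bytes) pairs for regular files in directory.

    The sizes come from the stat data exposed by os.scandir, so no column
    of human-oriented output has to be parsed and nothing is truncated.
    """
    entries = []
    with os.scandir(directory) as it:
        for entry in it:
            if entry.is_file() and entry.name.endswith(suffix):
                entries.append((entry.name, entry.stat().st_size))
    return sorted(entries)
```

Calling list_sizes(".", ".ab") would give the same name and size information as ls -l *.ab, but as values a script can use without depending on the column layout.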

 I was thinking, using structured output (and maybe input) in an XML-like 
 way would solve these and allow neat post-processing.

XML is just _terrible_ for human input/output.

Gabor

-- 
 -
 MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences
 -


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Re: Structured (XML-like) input/output for shell apps?

2005-06-13 Thread Olaf van der Spek
On 6/13/05, GOMBAS Gabor [EMAIL PROTECTED] wrote:
  Are there already any plans to solve these issues?
 
 Yes. The commands you mention were designed for _human_ consumption. Do
 not use them in scripts without good reasons. There are a lot of

The maintainer of netstat didn't want to change the layout (by
default) because scripts might get broken.
What's the solution here?

 commands to get well-formatted output without truncation. For example,
 ls has a -n option for exactly this reason; stat(1) can be used
 instead of ls -l to avoid clipping; ps has a _lot_ of formatting
 options itself and all the data can be found under /proc in an easily
 parseable format etc. You just have to select the right tool for the job
 (that also includes using more powerful scripting languages if the task
 is complicated).
 
  I was thinking, using structured output (and maybe input) in an XML-like
  way would solve these and allow neat post-processing.
 
 XML is just _terrible_ for human input/output.

It's not meant for human IO, it's meant for IO to the next chain. The
final chain would then process it to normal text output.



Re: Structured (XML-like) input/output for shell apps?

2005-06-13 Thread Humberto Massa Guimarães
* Gabor ::

 Hi,

 On Sat, Jun 11, 2005 at 07:40:10PM +0200, Olaf van der Spek wrote:

  Many shell apps/scripts output data in tables, for example ls
  -l, ps aux, top, netstat, etc.  At the moment, most of these
  apps use fixed-width columns with a variable-width last-column.
  This results in (unnecessary) truncation, for example: Debian-
  11918  0.0  0.1  4428 1464 ?Ss   Jun05   0:00
  /usr/sbin/exim4 -bd -q30m tcp 0 0 TC218-187-80-45.2:35589
  bananensaft.inline.:www ESTABLISHEDproxy 153239
  
  Also, because the output isn't structured (in a way easily
  readable by machines), using the data in a script isn't (very)
  easy and is likely to break due to strict dependency on the
  syntax.
  
  Are there already any plans to solve these issues?

 Yes. The commands you mention were designed for _human_
 consumption. Do not use them in scripts without good reasons.
 There are a lot of commands to get well-formatted output without
 truncation. For example, ls has a -n option for exactly this
 reason; stat(1) can be used instead of ls -l to avoid clipping;
 ps has a _lot_ of formatting options itself and all the data can
 be found under /proc in an easily parseable format etc. You just
 have to select the right tool for the job (that also includes
 using more powerful scripting languages if the task is
 complicated).

  I was thinking, using structured output (and maybe input) in an
  XML-like way would solve these and allow neat post-processing.

 XML is just _terrible_ for human input/output.

What Olaf *really* seems to want is a resource like the new (vapor?)
Monad shell from MS. Which can be a good thing, if done right, but
is generally a waste of CPU and memory, if you ask me. As you said,
there is not a lot of difference between

ls *.ab | fields name, size | table

in Monad and 

printf "%-50.50s %d\n", $_, -s $_ for <*.ab>

in Perl. Domain knowledge is necessary either way, i.e., you have to
know Monad to understand the first, and Perl to grok the second.

* Olaf ::

  XML is just _terrible_ for human input/output.

 It's not meant for human IO, it's meant for IO to the next chain.
 The final chain would then process it to normal text output.

Even so; imagine a long (6 links) chain of things. Each of them
would have to unserialize the input and serialize the output (XML no
less! big overhead!), besides trying to know if its input is xml or
not, if its output should be xml or not. In the Monad case, it
*seems* that what is passed around are (DCOM?) objects, lowering the
overhead a little bit, but there is a lot of overhead nonetheless.
And it's still easier to use a tool (like Perl, Python or Ruby for
instance) that can do the job you want (see my example above).

IOW, I don't think Monad is such a hot idea.

--
HTH,
Massa






Re: Structured (XML-like) input/output for shell apps?

2005-06-13 Thread Olaf van der Spek
On 6/13/05, Humberto Massa Guimarães [EMAIL PROTECTED] wrote:
 What Olaf *really* seems to want is a resource like the new (vapor?)
 Monad shell from MS. Which can be a good thing, if done right, but
 is generally a waste of CPU and memory, if you ask me. As you said,
 there is not a lot of difference between
 
 ls *.ab | fields name, size | table
 
 in Monad and
 
 printf "%-50.50s %d\n", $_, -s $_ for <*.ab>
 
 in Perl. The domain is necessary anyway, ie, you have to know Monad
 to understand the first, you have to know perl to grok the second.

Except that in Perl you have hard-coded the size of the name field and
hard-coded sizes are almost never optimal (either too large or too
small in most of the cases).
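
A rough Python equivalent of that format string shows the effect. This is only a sketch: fixed_width is a hypothetical helper, and the width of 5 in the comments is chosen just to keep the output short.

```python
def fixed_width(name, size, width=50):
    """Pad or truncate name to exactly `width` columns, like "%-50.50s %d"."""
    return "%-*.*s %d" % (width, width, name, size)

# With width=5, a short name is padded and a long one is cut off:
#   fixed_width("ab", 1, 5)          gives "ab    1"
#   fixed_width("abcdefghij", 1, 5)  gives "abcde 1"
```

Whatever width is chosen, it is either wasted on short names or too small for long ones.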

   XML is just _terrible_ for human input/output.
 
  It's not meant for human IO, it's meant for IO to the next chain.
  The final chain would then process it to normal text output.
 
 Even so; imagine a long (6 links) chain of things. Each of them
 would have to unserialize the input and serialize the output (XML no
 less! big overhead!), besides trying to know if its input is xml or

Note that I said structured (XML-like) IO. I didn't say XML. I'm sure
an implementation without big overhead is possible.
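
One possible low-overhead shape for such a format, purely as a sketch: one record per line, fields separated by tabs, and a header line naming the fields. The escaping rules and the names dump_records and load_records are assumptions made for illustration, not an existing standard.

```python
def escape(value):
    """Protect backslash, tab and newline so a field never breaks the framing."""
    return value.replace("\\", "\\\\").replace("\t", "\\t").replace("\n", "\\n")

def unescape(value):
    """Inverse of escape()."""
    out = []
    chars = iter(value)
    for ch in chars:
        if ch == "\\":
            nxt = next(chars, "\\")
            out.append({"t": "\t", "n": "\n"}.get(nxt, nxt))
        else:
            out.append(ch)
    return "".join(out)

def dump_records(fields, records):
    """Serialize dict records: a header line, then one line per record."""
    lines = ["\t".join(escape(f) for f in fields)]
    for rec in records:
        lines.append("\t".join(escape(str(rec[f])) for f in fields))
    return "\n".join(lines) + "\n"

def load_records(text):
    """Parse the output of dump_records back into dicts of strings."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()
    fields = [unescape(f) for f in lines[0].split("\t")]
    return [
        dict(zip(fields, (unescape(v) for v in line.split("\t"))))
        for line in lines[1:]
    ]
```

Because fields are named in the header rather than identified by column position, a consumer keeps working if a producer adds a field, and no value needs to be truncated to fit a column.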

 not, if its output should be xml or not. In the Monad case, it
 *seems* that what is passed around are (DCOM?) objects, lowering the
 overhead a little bit, but there is a lot of overhead nonetheless.
 And it's still easier to use a tool (like Perl, Python or Ruby for
 instance) that can do the job you want (see my example above).



Re: Structured (XML-like) input/output for shell apps?

2005-06-13 Thread Humberto Massa Guimarães
* Olaf ::

 On 6/13/05, Humberto Massa Guimarães [EMAIL PROTECTED]
 wrote:
snikt
  printf "%-50.50s %d\n", $_, -s $_ for <*.ab>
  
  in Perl. The domain is necessary anyway, ie, you have to know
  Monad to understand the first, you have to know perl to grok the
  second.

 Except that in Perl you have hard-coded the size of the name field
 and hard-coded sizes are almost never optimal (either too large or
 too small in most of the cases).

Not necessarily. Just as you have tableout as an external command
(built-in or not) in Monad, you can have a Perl module to print
things in a tabular manner, expanding the column sizes as needed
(based on HTML::Format::Table or somesuch)
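
The idea of expanding the column sizes as needed can be sketched in a few lines. The example below is in Python rather than Perl, format_table is a hypothetical helper (not the API of any module named above), and it assumes every row has the same number of cells.

```python
def format_table(rows):
    """Render rows as text, making each column as wide as its longest cell."""
    cells = [[str(value) for value in row] for row in rows]
    if not cells:
        return ""
    # The width of a column is only known after all rows have been seen.
    widths = [max(len(row[i]) for row in cells) for i in range(len(cells[0]))]
    lines = []
    for row in cells:
        line = "  ".join(cell.ljust(width) for cell, width in zip(row, widths))
        lines.append(line.rstrip())
    return "\n".join(lines)
```

No width is hard-coded, so nothing is truncated; the cost is that all rows must be collected before the first line can be printed.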

XML is just _terrible_ for human input/output.
  
   It's not meant for human IO, it's meant for IO to the next
   chain.  The final chain would then process it to normal text
   output.
  
  Even so; imagine a long (6 links) chain of things. Each of them
  would have to unserialize the input and serialize the output
  (XML no less! big overhead!), besides trying to know if its
  input is xml or

 Note that I said structured (XML-like) IO. I didn't say XML. I'm
 sure an implementation without big overhead is possible.

Yes, and I withdraw :-) what I said about XML. But *any*
serialization / deserialization necessary for this scheme to work
would add (unnecessary) overhead. This and the fact that you would
create incompatibilities with other Unices ... Those are indications
that this won't be done.

Obviously, some Monad clone can be done with its entire toolchain
(monad-ls, monad-tableout) ...

--
HTH,
Massa



Re: Structured (XML-like) input/output for shell apps?

2005-06-13 Thread Olaf van der Spek
On 6/13/05, Humberto Massa Guimarães [EMAIL PROTECTED] wrote:
 Not necessarily. Just as you have tableout as an external command
 (built-in or not) in Monad, you can have a Perl module to print
 things in a tabular manner, expanding the column sizes as needed
 (based on HTML::Format::Table or somesuch)

But I doubt that'd be as simple as things are now.

 Yes, and I withdraw :-) what I said about XML. But *any*
 serialization / deserialization necessary for this scheme to work
 would add (unnecessary) overhead. This and the fact that you would
 create incompatibilities with other Unices ... Those are indications
 that this won't be done.

What kind of incompatibilities?
 
 Obviously, some Monad clone can be done with its entire toolchain
 (monad-ls, monad-tableout) ...

Why not ls --monad?



Re: Structured (XML-like) input/output for shell apps?

2005-06-13 Thread Olaf van der Spek
On 6/13/05, Humberto Massa Guimarães [EMAIL PROTECTED] wrote:
 Yes, and I withdraw :-) what I said about XML. But *any*
 serialization / deserialization necessary for this scheme to work
 would add (unnecessary) overhead. This and the fact that you would

Well, if you can do it with Perl without overhead, you can of course
also do it without Perl without overhead.
In that case the 'structured' support would be included in the utility itself.



Re: Structured (XML-like) input/output for shell apps?

2005-06-13 Thread Humberto Massa Guimarães
* Olaf ::

 On 6/13/05, Humberto Massa Guimarães [EMAIL PROTECTED]
 wrote:
  Yes, and I withdraw :-) what I said about XML. But *any*
  serialization / deserialization necessary for this scheme to
  work would add (unnecessary) overhead. This and the fact that
  you would

 Well, if you can do it with Perl without overhead, you can of
 course also do it without Perl without overhead.  In that case the
 'structured' support would be included

Not exactly. Don't get me wrong, object component technology is a
great thing, standing just next to sliced bread in the list of great
things, but (just like sliced bread) it does not cure cancer.

When I do my example inside of Perl, I am supposing whatever objects
or handles the Perl interpreter has stay inside the interpreter's
process; when you do a pipe like

monad-ls *.ab | monad-fields name, size | monad-tableout

you are implying the existence of 3 processes, two of them
serializing their (internal) objects for output, two of them
deserializing their inputs to (internal) objects, all of them
analyzing (or receiving a hint from the shell, that had to
analyze for them all) to see if their input came from an object
pipe and if their output goes to an object pipe. At least two of
those processes have to read all of their input into memory before
spitting any output (ls -- because it sorts the filenames -- and
tableout -- because it dimensions the columns beautifully).

This is a *lot* of overhead -- normal overhead, contention overhead
(ls blocks the other two processes until it starts spitting its
output), and synch overhead (any object read in the input must be
perfectly synchronized to be a valid IPC object), and it's a lot of
overhead independently of the IPC mechanism utilized.
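
The difference between a stage that can stream and one that has to buffer can be illustrated with a small sketch. It uses Python generators inside one process as a stand-in for the three separate processes, so it shows only the blocking behaviour, not the serialization cost; all function names are hypothetical.

```python
def counted(records, counter):
    """Source wrapper that counts how many records downstream has pulled."""
    for rec in records:
        counter["read"] += 1
        yield rec

def select_fields(records, names):
    """Streaming stage: each record is passed on as soon as it arrives."""
    for rec in records:
        yield {name: rec[name] for name in names}

def sort_by(records, key):
    """Buffering stage: every record must be read before the first is emitted."""
    yield from sorted(records, key=lambda rec: rec[key])
```

Pulling one record through select_fields reads one record from the source; pulling one record through sort_by reads the whole input first, which is the contention described above for ls and tableout.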

In the case of my Perl example, verbosely made to use a hypothetical
Text::Table:

use Text::Table; $t=new Text::Table;
$t->addline($_, -s $_) for <*.ab>; print $t->as_text

You still have some contention inherent to the operation you want to
convey (sorting *.ab, determining optimal column width), but none of
the (really expensive) freeze-serialize/deserialize-thaw cycles the
monad version has, nor the (expensive, complex, and even with
security implications(*)) input-format, input-synch, etc issues.

(*) security implications because when you make a pipe component
like monad-fields that can receive an arbitrary object as its input,
you have to have in mind that said object can have security bugs in
its methods, either on purpose or not. Imagine a malicious-ls that
spits objects whose get-name method (property getter) copies the
.ssh directory of the current user to another, publicly readable
location. This can be installed somewhere on the net and you can
convince people that your ls is better and 0wn the poor bastards...

This *will* certainly happen in an environment like this because,
well, there will be a point in time where it will be too much
trouble to check all those distributed objects... Not unlike the way
a lot of websites install spyware via ActiveX on the poor IE-using
folk.

--
HTH,
Massa



Re: Structured (XML-like) input/output for shell apps?

2005-06-13 Thread Humberto Massa Guimarães
 On 6/13/05, Humberto Massa Guimarães [EMAIL PROTECTED]
 wrote:
  Not necessarily. Just as you have tableout as an external
  command (built-in or not) in Monad, you can have a Perl module
  to print things in a tabular manner, expanding the column sizes
  as needed (based on HTML::Format::Table or somesuch)

 But I doubt that'd be as simple as things are now.

As I said in my other answer, things will *never* be simpler than
they are right now. Any other stuff will tend to complicate instead of
simplify things.

  Yes, and I withdraw :-) what I said about XML. But *any*
  serialization / deserialization necessary for this scheme to
  work would add (unnecessary) overhead. This and the fact that
  you would create incompatibilities with other Unices ... Those
  are indications that this won't be done.

 What kind of incompatibilities?
  
There are a lot of scripts today in production use that use the
output of ls, ps, in a text-way. If you want to put another command,
or another switch to ls, ok, but the fact that you *can* do it
does not mean that you *should* do it. (see below)

  Obviously, some Monad clone can be done with its entire
  toolchain (monad-ls, monad-tableout) ...

 Why not ls --monad?

If you want to fork and maintain forever util-linux, I have nothing
to say about that.

But I *will* leave you (I'm going home from work now) with Occam's
razor:

Entia non sunt multiplicanda praeter necessitatem.

(Things shouldn't be multiplied without necessity)
IOW: if it's not broken, don't fix it.

--
HTH,
Massa



Re: Structured (XML-like) input/output for shell apps?

2005-06-13 Thread Olaf van der Spek
On 6/13/05, Humberto Massa Guimarães [EMAIL PROTECTED] wrote:
  Well, if you can do it with Perl without overhead, you can of
  course also do it without Perl without overhead.  In that case the
  'structured' support would be included
 
 Not exactly. Don't get me wrong, object component technology is a
 great thing, standing just next to sliced bread in the list of great
 things, but (just like sliced bread) it does not cure cancer.
 
 When I do my example inside of Perl, I am supposing whatever objects
 or handles the Perl interpreter has stay inside the interpreter's
 process; when you do a pipe like
 
 monad-ls *.ab | monad-fields name, size | monad-tableout

If you do a pipe like that. But the functionality you showed in Perl
could also be done completely inside ls itself.



Re: Structured (XML-like) input/output for shell apps?

2005-06-13 Thread Olaf van der Spek
On 6/13/05, Humberto Massa Guimarães [EMAIL PROTECTED] wrote:
 There are a lot of scripts today in production use that use the
 output of ls, ps, in a text-way. If you want to put another command,
 or another switch to ls, ok, but the fact that you *can* do it
 does not mean that you *should* do it. (see below)

Didn't you (or someone else) say the output of these commands was
(only) for human consumption?
 
   Obviously, some Monad clone can be done with its entire
   toolchain (monad-ls, monad-tableout) ...
 
  Why not ls --monad?
 
 If you want to fork and maintain forever util-linux, I have nothing
 to say about that.

Why fork and not just change the 'real' util-linux? ;-)

 But I *will* leave you (I'm going home from work now) with Occam's
 razor:
 
 Entia non sunt multiplicanda praeter necessitatem.
 
 (Things shouldn't be multiplied without necessity)
 IOW: if it's not broken, don't fix it.

If only it wasn't broken.
Netstat for example suffers from truncation.



Structured (XML-like) input/output for shell apps?

2005-06-11 Thread Olaf van der Spek

Hi,

Many shell apps/scripts output data in tables, for example ls -l, ps 
aux, top, netstat, etc.
At the moment, most of these apps use fixed-width columns with a 
variable-width last-column.

This results in (unnecessary) truncation, for example:
Debian-  11918  0.0  0.1  4428 1464 ?Ss   Jun05   0:00 
/usr/sbin/exim4 -bd -q30m
tcp 0 0 TC218-187-80-45.2:35589 bananensaft.inline.:www ESTABLISHEDproxy 
153239


Also, because the output isn't structured (in a way easily readable by 
machines), using the data in a script isn't (very) easy and is likely to 
break due to strict dependency on the syntax.


Are there already any plans to solve these issues?
I was thinking, using structured output (and maybe input) in an XML-like 
way would solve these and allow neat post-processing.
It also separates data generation and presentation, which would be an 
advantage if the presentation needs to be changed.


Any thoughts?

Greetings,

Olaf

