Re: Structured (XML-like) input/output for shell apps?
Hi,

On Sat, Jun 11, 2005 at 07:40:10PM +0200, Olaf van der Spek wrote:
> Many shell apps/scripts output data in tables, for example ls -l, ps aux, top, netstat, etc. At the moment, most of these apps use fixed-width columns with a variable-width last-column. This results in (unnecessary) truncation, for example:
> Debian- 11918 0.0 0.1 4428 1464 ?Ss Jun05 0:00 /usr/sbin/exim4 -bd -q30m
> tcp 0 0 TC218-187-80-45.2:35589 bananensaft.inline.:www ESTABLISHEDproxy 153239
> Also, because the output isn't structured (in a way easily readable by machines), using the data in a script isn't (very) easy and is likely to break due to strict dependency on the syntax. Are there already any plans to solve these issues?

Yes. The commands you mention were designed for _human_ consumption. Do not use them in scripts without good reasons. There are a lot of commands to get well-formatted output without truncation. For example, ls has a -n option for exactly this reason; stat(1) can be used instead of ls -l to avoid clipping; ps has a _lot_ of formatting options itself and all the data can be found under /proc in an easily parseable format etc. You just have to select the right tool for the job (that also includes using more powerful scripting languages if the task is complicated).

> I was thinking, using structured output (and maybe input) in an XML-like way would solve these and allow neat post-processing.

XML is just _terrible_ for human input/output.

Gabor

--
MTA SZTAKI Computer and Automation Research Institute
Hungarian Academy of Sciences

--
To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
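A minimal sketch of the "select the right tool" advice, assuming the job is "names and sizes of some files" and that a scripting language is acceptable: the data is read through the stat interface directly, so no column of ls -l output is ever parsed or clipped. The helper name file_sizes and the suffix filter are illustrative, not something proposed in the thread.

```python
import os


def file_sizes(directory, suffix=""):
    """Return sorted (name, size_in_bytes) pairs for regular files in `directory`.

    The sizes come from the stat data the OS provides, not from parsing
    the human-oriented output of `ls -l`.
    """
    rows = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.endswith(suffix):
                rows.append((entry.name, entry.stat().st_size))
    return sorted(rows)


if __name__ == "__main__":
    # List the *.ab files of the current directory, one "name size" pair per line.
    for name, size in file_sizes(".", ".ab"):
        print(name, size)
```

Because the result is a list of tuples, how it is presented (aligned table, CSV, anything else) becomes a separate decision made by the caller.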
Re: Structured (XML-like) input/output for shell apps?
On 6/13/05, GOMBAS Gabor [EMAIL PROTECTED] wrote:
> > Are there already any plans to solve these issues?
>
> Yes. The commands you mention were designed for _human_ consumption. Do not use them in scripts without good reasons.

The maintainer of netstat didn't want to change the layout (by default) because scripts might get broken. What's the solution here?

> There are a lot of commands to get well-formatted output without truncation. For example, ls has a -n option for exactly this reason; stat(1) can be used instead of ls -l to avoid clipping; ps has a _lot_ of formatting options itself and all the data can be found under /proc in an easily parseable format etc. You just have to select the right tool for the job (that also includes using more powerful scripting languages if the task is complicated).
>
> > I was thinking, using structured output (and maybe input) in an XML-like way would solve these and allow neat post-processing.
>
> XML is just _terrible_ for human input/output.

It's not meant for human IO, it's meant for IO to the next link in the chain. The final link would then process it into normal text output.
Re: Structured (XML-like) input/output for shell apps?
* Gabor ::
> Hi,
>
> On Sat, Jun 11, 2005 at 07:40:10PM +0200, Olaf van der Spek wrote:
> > Many shell apps/scripts output data in tables, for example ls -l, ps aux, top, netstat, etc. At the moment, most of these apps use fixed-width columns with a variable-width last-column. This results in (unnecessary) truncation, for example:
> > Debian- 11918 0.0 0.1 4428 1464 ?Ss Jun05 0:00 /usr/sbin/exim4 -bd -q30m
> > tcp 0 0 TC218-187-80-45.2:35589 bananensaft.inline.:www ESTABLISHEDproxy 153239
> > Also, because the output isn't structured (in a way easily readable by machines), using the data in a script isn't (very) easy and is likely to break due to strict dependency on the syntax. Are there already any plans to solve these issues?
>
> Yes. The commands you mention were designed for _human_ consumption. Do not use them in scripts without good reasons. There are a lot of commands to get well-formatted output without truncation. For example, ls has a -n option for exactly this reason; stat(1) can be used instead of ls -l to avoid clipping; ps has a _lot_ of formatting options itself and all the data can be found under /proc in an easily parseable format etc. You just have to select the right tool for the job (that also includes using more powerful scripting languages if the task is complicated).
>
> > I was thinking, using structured output (and maybe input) in an XML-like way would solve these and allow neat post-processing.
>
> XML is just _terrible_ for human input/output.

What Olaf *really* seems to want is something like the new (vapor?) Monad shell from MS. That can be a good thing, if done right, but is generally a waste of CPU and memory, if you ask me. As you said, there is not a lot of difference between

    ls *.ab | fields name, size | table

in Monad and

    printf "%-50.50s %d\n", $_, -s $_ for <*.ab>

in Perl. Domain knowledge is necessary either way, i.e., you have to know Monad to understand the first, and you have to know Perl to grok the second.

* Olaf ::
> > XML is just _terrible_ for human input/output.
>
> It's not meant for human IO, it's meant for IO to the next chain. The final chain would then process it to normal text output.

Even so; imagine a long (6 links) chain of things. Each of them would have to unserialize the input and serialize the output (XML no less! big overhead!), besides trying to know whether its input is XML or not, and whether its output should be XML or not. In the Monad case, it *seems* that what is passed around are (DCOM?) objects, lowering the overhead a little bit, but there is a lot of overhead nonetheless. And it's still easier to use a tool (like Perl, Python or Ruby, for instance) that can do the job you want (see my example above).

IOW, I don't think Monad is such a hot idea.

--
HTH, Massa
Re: Structured (XML-like) input/output for shell apps?
On 6/13/05, Humberto Massa Guimarães [EMAIL PROTECTED] wrote:
> What Olaf *really* seems to want is a resource like the new (vapor?) Monad shell from MS. Which can be a good thing, if done right, but is generally a waste of CPU and memory, if you ask me. As you said, there is not a lot of difference between
>
>     ls *.ab | fields name, size | table
>
> in Monad and
>
>     printf "%-50.50s %d\n", $_, -s $_ for <*.ab>
>
> in Perl. The domain is necessary anyway, ie, you have to know Monad to understand the first, you have to know perl to grok the second.

Except that in Perl you have hard-coded the size of the name field, and hard-coded sizes are almost never optimal (in most cases they are either too large or too small).

> > > XML is just _terrible_ for human input/output.
> >
> > It's not meant for human IO, it's meant for IO to the next chain. The final chain would then process it to normal text output.
>
> Even so; imagine a long (6 links) chain of things. Each of them would have to unserialize the input and serialize the output (XML no less! big overhead!), besides trying to know if its input is xml or not, if its output should be xml or not.

Note that I said structured (XML-like) IO. I didn't say XML. I'm sure an implementation without big overhead is possible.

> In the Monad case, it *seems* that what is passed around are (DCOM?) objects, lowering the overhead a little bit, but there is a lot of overhead nonetheless. And it's still easier to use a tool (like Perl, Python or Ruby for instance) that can do the job you want (look my example above)
Re: Structured (XML-like) input/output for shell apps?
* Olaf ::
> On 6/13/05, Humberto Massa Guimarães [EMAIL PROTECTED] wrote:
> > snikt
> >
> >     printf "%-50.50s %d\n", $_, -s $_ for <*.ab>
> >
> > in Perl. The domain is necessary anyway, ie, you have to know Monad to understand the first, you have to know perl to grok the second.
>
> Except that in Perl you have hard-coded the size of the name field and hard-coded sizes are almost never optimal (either too large or too small in most of the cases).

Not necessarily. Just as you have tableout as an external command (built-in or not) in Monad, you can have a Perl module to print things in a tabular manner, expanding the column sizes as needed (based on HTML::Format::Table or somesuch).

> > > > XML is just _terrible_ for human input/output.
> > >
> > > It's not meant for human IO, it's meant for IO to the next chain. The final chain would then process it to normal text output.
> >
> > Even so; imagine a long (6 links) chain of things. Each of them would have to unserialize the input and serialize the output (XML no less! big overhead!), besides trying to know if its input is xml or
>
> Note that I said structured (XML-like) IO. I didn't say XML. I'm sure an implementation without big overhead is possible.

Yes, and I withdraw :-) what I said about XML. But *any* serialization / deserialization necessary for this scheme to work would add (unnecessary) overhead. This, and the fact that you would create incompatibilities with other Unices ... Those are indications that this won't be done.

Obviously, some Monad clone can be done with its entire toolchain (monad-ls, monad-tableout) ...

--
HTH, Massa
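The "expanding the column sizes as needed" idea can be sketched in a few lines. This is only an illustration under assumptions: it is written in Python rather than Perl, format_table is a hypothetical helper (not the API of any module named in the thread), and it assumes every row has the same number of cells.

```python
def format_table(rows):
    """Render rows as left-aligned text columns, each as wide as its widest cell.

    All rows must be available before the first line can be produced,
    because a column's width depends on every cell in that column.
    """
    cells = [[str(cell) for cell in row] for row in rows]
    if not cells:
        return ""
    # One width per column: the length of the longest cell in that column.
    widths = [max(len(row[i]) for row in cells) for i in range(len(cells[0]))]
    lines = [
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in cells
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_table([("a.ab", 5), ("longer-name.ab", 12345)]))
```

Nothing is truncated and no width is hard-coded, at the price of buffering the whole input first.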
Re: Structured (XML-like) input/output for shell apps?
On 6/13/05, Humberto Massa Guimarães [EMAIL PROTECTED] wrote:
> Not necessarily. Just as you have tableout as an external command (built-in or not) in Monad, you can have a Perl module to print things in a tabular manner, expanding the column sizes as needed (based on HTML::Format::Table or somesuch)

But I doubt that'd be as simple as things are now.

> Yes, and I withdraw :-) what I said about XML. But *any* serialization / deserialization necessary for this scheme to work would add (unnecessary) overhead. This and the fact that you would create incompatibilities with other Unices ... Those are indications that this won't be done.

What kind of incompatibilities?

> Obviously, some Monad clone can be done with its entire toolchain (monad-ls, monad-tableout) ...

Why not ls --monad?
Re: Structured (XML-like) input/output for shell apps?
On 6/13/05, Humberto Massa Guimarães [EMAIL PROTECTED] wrote:
> Yes, and I withdraw :-) what I said about XML. But *any* serialization / deserialization necessary for this scheme to work would add (unnecessary) overhead. This and the fact that you would

Well, if you can do it with Perl without overhead, you can of course also do it without Perl without overhead. In that case the 'structured' support would be included in the utility itself.
Re: Structured (XML-like) input/output for shell apps?
* Olaf ::
> On 6/13/05, Humberto Massa Guimarães [EMAIL PROTECTED] wrote:
> > Yes, and I withdraw :-) what I said about XML. But *any* serialization / deserialization necessary for this scheme to work would add (unnecessary) overhead. This and the fact that you would
>
> Well, if you can do it with Perl without overhead, you can of course also do it without Perl without overhead. In that case the 'structured' support would be included

Not exactly. Don't get me wrong, object component technology is a great thing, standing just next to sliced bread in the list of great things, but (just like sliced bread) it does not cure cancer.

When I do my example inside of Perl, I am supposing that whatever objects or handles the Perl interpreter has stay inside the interpreter's process; when you do a pipe like

    monad-ls *.ab | monad-fields name, size | monad-tableout

you are implying the existence of 3 processes, two of them serializing their (internal) objects for output, two of them de-serializing their inputs into (internal) objects, and all of them analyzing (or receiving a hint from the shell, which had to analyze for them all) whether their input came from an object pipe and whether their output goes to an object pipe. At least two of those processes have to read all of their input into memory before spitting out any output (ls -- because it sorts the filenames -- and tableout -- because it dimensions the columns beautifully).

This is a *lot* of overhead -- normal overhead, contention overhead (ls blocks the other two processes until it starts spitting out its output), and synch overhead (any object read from the input must be perfectly synchronized to be a valid IPC object), and it's a lot of overhead independently of the IPC mechanism used.

In the case of my Perl example, verbosely made to use a hypothetical Text::Table:

    use Text::Table;
    $t = new Text::Table;
    $t->addline($_, -s $_) for <*.ab>;
    print $t->as_text

You still have some contention inherent to the operation you want to convey (sorting *.ab, determining optimal column width), but none of the (really expensive) freeze-serialize/deserialize-thaw cycles the monad version has, nor the (expensive, complex, and even with security implications(*)) input-format, input-synch, etc. issues.

(*) Security implications because when you make a pipe component like monad-fields that can receive an arbitrary object as its input, you have to keep in mind that said object can have security bugs in its methods, whether on purpose or not. Imagine a malicious-ls that spits out objects whose get-name method (property getter) copies the .ssh directory of the current user to another, publicly readable location. This can be installed someplace on the net and you can convince people that your ls is better and 0wn the poor bastards... This *will* certainly happen in an environment like this because, well, there will be a point in time where it will be too much trouble to check all those distributed objects... Not unlike the way a lot of websites install spyware via ActiveX on the poor IE-using folk.

--
HTH, Massa
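To make the serialize/deserialize argument concrete, here is a rough sketch, not a measurement. Assumptions: JSON lines stand in for whatever structured wire format might be used (nobody in the thread proposed JSON), the three functions are hypothetical stand-ins for monad-ls, monad-fields and monad-tableout, and generators in one process stand in for real pipes between three processes.

```python
import json


def ls_stage(files):
    """First link: serialize each (name, size, mode) record as one JSON line."""
    for name, size, mode in files:
        yield json.dumps({"name": name, "size": size, "mode": mode})


def fields_stage(lines, keep):
    """Middle link: deserialize, keep only the requested fields, serialize again."""
    for line in lines:
        record = json.loads(line)
        yield json.dumps({key: record[key] for key in keep})


def tableout_stage(lines):
    """Last link: deserialize once more and render plain text for a human."""
    for line in lines:
        record = json.loads(line)
        yield " ".join(str(value) for value in record.values())


files = [("a.ab", 5, "rw-"), ("b.ab", 0, "r--")]

# Chained version: every record is dumped twice and loaded twice on its way through.
piped = list(tableout_stage(fields_stage(ls_stage(files), ["name", "size"])))

# Single-process version: the same text, with no wire format in between.
in_process = [f"{name} {size}" for name, size, _mode in files]

print(piped == in_process)
```

Both paths produce identical text; the chained one simply does extra encoding and decoding per record at every link, which is the cost being argued about. Whether that cost matters in practice is exactly what the two sides disagree on.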
Re: Structured (XML-like) input/output for shell apps?
On 6/13/05, Humberto Massa Guimarães [EMAIL PROTECTED] wrote:
> > Not necessarily. Just as you have tableout as an external command (built-in or not) in Monad, you can have a Perl module to print things in a tabular manner, expanding the column sizes as needed (based on HTML::Format::Table or somesuch)
>
> But I doubt that'd be as simple as things are now.

As I said in my other answer, things will *never* be simpler than they are right now. Any other stuff will tend to complicate instead of simplify things.

> > Yes, and I withdraw :-) what I said about XML. But *any* serialization / deserialization necessary for this scheme to work would add (unnecessary) overhead. This and the fact that you would create incompatibilities with other Unices ... Those are indications that this won't be done.
>
> What kind of incompatibilities?

There are a lot of scripts in production use today that consume the output of ls, ps, etc. as text. If you want to add another command, or another switch to ls, OK, but the fact that you *can* do it does not mean that you *should* do it. (see below)

> > Obviously, some Monad clone can be done with its entire toolchain (monad-ls, monad-tableout) ...
>
> Why not ls --monad?

If you want to fork and maintain forever util-linux, I have nothing to say about that. But I *will* leave you (I'm going home from work now) with Occam's razor:

Entia non sunt multiplicanda praeter necessitatem. (Things shouldn't be multiplied without necessity)

IOW: if it's not broken, don't fix it.

--
HTH, Massa
Re: Structured (XML-like) input/output for shell apps?
On 6/13/05, Humberto Massa Guimarães [EMAIL PROTECTED] wrote:
> > Well, if you can do it with Perl without overhead, you can of course also do it without Perl without overhead. In that case the 'structured' support would be included
>
> Not exactly. Don't get me wrong, object component technology is a great thing, standing just next to sliced bread in the list of great things, but (just like sliced bread) it does not cure cancer. When I do my example inside of Perl, I am supposing whatever objects or handles the Perl interpreter has stay inside the interpreter's process; when you do a pipe like
>
>     monad-ls *.ab | monad-fields name, size | monad-tableout

If you do a pipe like that. But the functionality you showed in Perl could also be done completely inside ls itself.
Re: Structured (XML-like) input/output for shell apps?
On 6/13/05, Humberto Massa Guimarães [EMAIL PROTECTED] wrote:
> There are a lot of scripts today in production use that use the output of ls, ps, in a text-way. If you want to put another command, or another switch to ls, ok, but the fact that you *can* do it does not mean that you *should* do it. (see below)

Didn't you (or someone else) say the output of these commands was (only) for human consumption?

> > > Obviously, some Monad clone can be done with its entire toolchain (monad-ls, monad-tableout) ...
> >
> > Why not ls --monad?
>
> If you want to fork and maintain forever util-linux, I have nothing to say about that.

Why fork and not just change the 'real' util-linux? ;-)

> But I *will* leave you (I'm going home from work now) with Occam's razor: Entia non sunt multiplicanda praeter necessitem. (Things shouldn't be multiplied without necessity) IOW: if it's not broken, don't fix it.

If only it wasn't broken. Netstat, for example, suffers from truncation.
Structured (XML-like) input/output for shell apps?
Hi,

Many shell apps/scripts output data in tables, for example ls -l, ps aux, top, netstat, etc. At the moment, most of these apps use fixed-width columns with a variable-width last column. This results in (unnecessary) truncation, for example:

Debian- 11918 0.0 0.1 4428 1464 ?Ss Jun05 0:00 /usr/sbin/exim4 -bd -q30m
tcp 0 0 TC218-187-80-45.2:35589 bananensaft.inline.:www ESTABLISHEDproxy 153239

Also, because the output isn't structured (in a way easily readable by machines), using the data in a script isn't (very) easy and is likely to break due to strict dependency on the syntax.

Are there already any plans to solve these issues?

I was thinking, using structured output (and maybe input) in an XML-like way would solve these and allow neat post-processing. It also separates data generation and presentation, which would be an advantage if the presentation needs to be changed.

Any thoughts?

Greetings,
Olaf