Re: [fonc] Error trying to compile COLA
On 3/11/2012 4:51 PM, Martin Baldan wrote:
> I won't pretend I really know what I'm talking about, I'm just guessing here, but don't you think the requirement for "independent and identically-distributed random variable data" in Shannon's source coding theorem may not be applicable to pictures, sounds or frame sequences normally handled by compression algorithms?

that is a description of random data, which granted, doesn't apply to most (compressible) data. that wasn't really the point though.

once one gets to a point where one's data looks like this, then further compression is no longer possible (hence why there is a limit).

typically, compression will transform low-entropy data (with many repeating patterns and redundancies) into a smaller amount of high-entropy compressed data (with almost no repeating patterns or redundancy).

> I mean, many compression techniques rely on domain knowledge about the things to be compressed. For instance, a complex picture or video sequence may consist of a well-known background with a few characters from a well-known inventory in well-known positions. If you know those facts, you can increase the compression dramatically. A practical example may be Xtranormal stories, where you get a cute 3-D animated dialogue from a small script.

yes, but this can only compress what redundancies exist. once the redundancies are gone, one is at a limit.

specialized knowledge allows one to do a little better, but does not change the basic nature of the limit.

for example, I was able to devise a compression scheme which reduced S-Expressions to only 5% of their original size. now what if I want 3%, or 1%? this is not an easy problem. it is much easier to get from 10% to 5% than to get from 5% to 3%.

the big question then is how much redundancy exists within a typical OS, or other large piece of software?
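a rough way to see the limit described above is to compress some redundant data, then try to compress the result again. what follows is only a sketch under stated assumptions: `zlib` stands in for "a compressor" in general, the Lisp-flavoured token vocabulary and the seed are made up for illustration (this is not the S-Expression scheme mentioned above), and the byte-level (order-0) entropy estimate is a crude proxy for how much redundancy remains.

```python
import math
import random
import zlib
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Empirical order-0 entropy in bits per byte (8.0 is the maximum)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical input: a random stream drawn from a small, fixed vocabulary.
# It is not valid Lisp; it only mimics text with a lot of redundancy.
VOCAB = ["(define", "(lambda", "(let", "(if", "(car", "(cdr", "(cons", "(quote",
         "(+", "(*", "x)", "y)", "z)", "0)", "1)", "nil)"]
rng = random.Random(0)  # fixed seed so the run is repeatable
original = " ".join(rng.choice(VOCAB) for _ in range(20000)).encode("ascii")

once = zlib.compress(original, 9)   # first pass removes most redundancy
twice = zlib.compress(once, 9)      # second pass has almost nothing left to remove

for label, blob in (("original", original), ("once", once), ("twice", twice)):
    print(f"{label:8s} {len(blob):7d} bytes  {byte_entropy(blob):.2f} bits/byte")
```

the first pass should shrink the data considerably and push the per-byte entropy close to the maximum, while the second pass should gain essentially nothing, since its input already looks close to random.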
I expect one can likely reduce the code size by a fair amount (such as by aggressive refactoring and DSLs), but there will likely be a bit of a limit, and once one approaches this limit, there is little more that can be done (as it quickly becomes a fight against diminishing returns).

otherwise, one can start throwing away features, but then there is still a limit, namely how much can one discard and still keep the "essence" of the software intact.

although many current programs are, arguably, huge, the vast majority of the code is likely still there for a reason, and is unlikely to be the result of programmers just endlessly writing the same stuff over and over again, or of other simple patterns. rather, it is more likely piles of special case logic and optimizations and similar.

(BTW: now have in-console text editor, but ended up using full words for most command names, seems basically workable...).

> Best,
> -Martin
>
> On Sun, Mar 11, 2012 at 7:53 PM, BGB wrote:
>> On 3/11/2012 5:28 AM, Jakub Piotr Cłapa wrote:
>>> On 28.02.12 06:42, BGB wrote:
>>>> but, anyways, here is a link to another article:
>>>> http://en.wikipedia.org/wiki/Shannon%27s_source_coding_theorem
>>>
>>> Shannon's theory applies to lossless transmission. I doubt anybody here wants to reproduce everything down to the timings and bugs of the original software. Information theory is not thermodynamics.
>>
>> Shannon's theory also applies some to lossy transmission, as it also sets a lower bound on the size of the data as expressed with a certain degree of loss.
>>
>> this is why, for example, with JPEGs or MP3s, getting a smaller size tends to result in reduced quality. the higher quality can't be expressed in a smaller size.

___ fonc mailing list fonc@vpri.org http://vpri.org/mailman/listinfo/fonc
Re: [fonc] Error trying to compile COLA
I won't pretend I really know what I'm talking about, I'm just guessing here, but don't you think the requirement for "independent and identically-distributed random variable data" in Shannon's source coding theorem may not be applicable to pictures, sounds or frame sequences normally handled by compression algorithms?

I mean, many compression techniques rely on domain knowledge about the things to be compressed. For instance, a complex picture or video sequence may consist of a well-known background with a few characters from a well-known inventory in well-known positions. If you know those facts, you can increase the compression dramatically. A practical example may be Xtranormal stories, where you get a cute 3-D animated dialogue from a small script.

Best,
-Martin

On Sun, Mar 11, 2012 at 7:53 PM, BGB wrote:
> On 3/11/2012 5:28 AM, Jakub Piotr Cłapa wrote:
>> On 28.02.12 06:42, BGB wrote:
>>> but, anyways, here is a link to another article:
>>> http://en.wikipedia.org/wiki/Shannon%27s_source_coding_theorem
>>
>> Shannon's theory applies to lossless transmission. I doubt anybody here
>> wants to reproduce everything down to the timings and bugs of the original
>> software. Information theory is not thermodynamics.
>
> Shannon's theory also applies some to lossy transmission, as it also sets a
> lower bound on the size of the data as expressed with a certain degree of
> loss.
>
> this is why, for example, with JPEGs or MP3s, getting a smaller size tends
> to result in reduced quality. the higher quality can't be expressed in a
> smaller size.
Re: [fonc] Error trying to compile COLA
On 3/11/2012 5:28 AM, Jakub Piotr Cłapa wrote:
> On 28.02.12 06:42, BGB wrote:
>> but, anyways, here is a link to another article:
>> http://en.wikipedia.org/wiki/Shannon%27s_source_coding_theorem
>
> Shannon's theory applies to lossless transmission. I doubt anybody here wants to reproduce everything down to the timings and bugs of the original software. Information theory is not thermodynamics.

Shannon's theory also applies some to lossy transmission, as it also sets a lower bound on the size of the data as expressed with a certain degree of loss.

this is why, for example, with JPEGs or MP3s, getting a smaller size tends to result in reduced quality. the higher quality can't be expressed in a smaller size.

I had originally figured that the assumption would have been to try to recreate everything in a reasonably feature-complete way.

this means such things in the OS as:
an OpenGL implementation;
a command-line interface, probably implementing ANSI / VT100 style control-codes (even in my 3D engine, my in-program console currently implements a subset of these codes);
a loader for program binaries (ELF or PE/COFF);
POSIX or some other similar OS APIs;
probably a C compiler, assembler, linker, run-time libraries, ...;
network stack, probably a web-browser, ...;
...

then it would be a question of how small one could get everything while still implementing a reasonably complete (if basic) feature-set, using any DSLs/... one could think up to shave off lines of code.

one could probably shave off OS-specific features which few people use anyways (for example, no need to implement support for things like GDI or the X11 protocol). a "simple" solution being that OpenGL largely is the interface for the GUI subsystem (probably with a widget toolkit built on this, and some calls for things not directly supported by OpenGL, like managing mouse/keyboard/windows/...).
also, potentially, a vast amount of what would be standalone tools, could be reimplemented as library code and merged (say, one has the "shell" as a kernel module, which directly implements nearly all of the basic command-line tools, like ls/cp/sed/grep/...).

the result of such an effort, under my estimates, would likely still end up in the Mloc range, but maybe one could get from say, 200 Mloc (for a Linux-like configuration) down to maybe about 10-15 Mloc, or if one tried really hard, maybe closer to 1 Mloc, and much smaller is fairly unlikely.

apparently this wasn't the plan though, rather the intent was to substitute something entirely different in its place, but this sort of implies that it isn't really feature-complete per-se (and it would be a bit difficult trying to port existing software to it).

someone asks: "hey, how can I build Quake 3 Arena for your OS?", and gets back a response roughly along the lines of "you will need to largely rewrite it from the ground up". much nicer and simpler would be if it could be reduced to maybe a few patches and modifying some of the OS glue stubs or something.

(tangent time):

but, alas, there seems to be a bit of a philosophical split here. I tend to be a bit more conservative, even if some of this stuff is put together in dubious ways. one adds features, but often ends up jerry-rigging things, and using bits of functionality in different contexts: like, for example, an in-program command-entry console, is not normally where one expects ANSI codes, but at the time, it seemed a sane enough strategy (adding ANSI codes was a fairly straightforward way to support things like embedding color information in console message strings, ...).

so, the basic idea still works, and so was applied in a new context (a console in a 3D engine, vs a terminal window in the OS).
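as an illustration of the "ANSI codes in console message strings" idea, here is a minimal sketch, not the engine's actual code: a console that keeps one 2D array of characters and a parallel 2D array of (color, flags) cells, and interprets a small subset of the standard SGR escape sequences (`ESC [ ... m`). the class name, the flag names, the default color and the grid size are all assumptions made for the example.

```python
import re

# SGR ("Select Graphic Rendition") sequences look like ESC [ 4 ; 31 m.
SGR = re.compile(r"\x1b\[([0-9;]*)m")

# Modifier flags stored as bits next to the color (names are illustrative).
ITALIC, UNDERLINE, BLINK, STRIKEOUT = 1, 2, 4, 8
FLAG_CODES = {3: ITALIC, 4: UNDERLINE, 5: BLINK, 9: STRIKEOUT}
DEFAULT_COLOR = 7  # assumed default: white foreground


class Console:
    def __init__(self, cols=80, rows=25):
        self.cols, self.rows = cols, rows
        self.chars = [[" "] * cols for _ in range(rows)]
        self.attrs = [[(DEFAULT_COLOR, 0)] * cols for _ in range(rows)]
        self.x = self.y = 0
        self.color, self.flags = DEFAULT_COLOR, 0

    def _apply_sgr(self, params):
        # An empty parameter counts as 0, so "ESC [ m" is a reset.
        for code in (int(p or 0) for p in params.split(";")):
            if code == 0:
                self.color, self.flags = DEFAULT_COLOR, 0
            elif 30 <= code <= 37:          # the 8 basic foreground colors
                self.color = code - 30
            elif code in FLAG_CODES:
                self.flags |= FLAG_CODES[code]
            # any other code is silently ignored in this sketch

    def _newline(self):
        self.x = 0
        if self.y < self.rows - 1:
            self.y += 1
        else:                               # last row: scroll everything up
            self.chars.pop(0)
            self.chars.append([" "] * self.cols)
            self.attrs.pop(0)
            self.attrs.append([(DEFAULT_COLOR, 0)] * self.cols)

    def write(self, text):
        pos = 0
        while pos < len(text):
            m = SGR.match(text, pos)
            if m:                           # escape sequence: change state only
                self._apply_sgr(m.group(1))
                pos = m.end()
                continue
            ch = text[pos]
            pos += 1
            if ch == "\n":
                self._newline()
                continue
            if self.x >= self.cols:         # wrap at the right edge
                self._newline()
            self.chars[self.y][self.x] = ch
            self.attrs[self.y][self.x] = (self.color, self.flags)
            self.x += 1


con = Console(cols=20, rows=2)
con.write("\x1b[31mERR\x1b[0m ok")
print("".join(con.chars[0]).rstrip())       # ERR ok
print(con.attrs[0][0], con.attrs[0][4])     # (1, 0) (7, 0)
```

the renderer would then walk both arrays and draw each cell with its stored color and flags. cursor-movement and erase sequences are left out here; an escape that does not match the SGR pattern is simply written to the grid as ordinary characters.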
side note: internally, the console is represented as a 2D array of characters, and another 2D array to store color and modifier flags (underline, strikeout, blink, italic, ...).

the console can be used both for program-related commands, accessing "cvars", and for evaluating script fragments (sadly, limited to what can be reasonably typed into a console command, which can be a little limiting for much more than "make that thing over there explode" or similar). functionally, the console is less advanced than something like bash or similar.

I have also considered the possibility of supporting multiple consoles, and maybe a console-integrated text-editor, but have yet to decide on the specifics (I am torn between a specialized text-editor interface, or making the text editor be a console command which hijacks the console and probably does most of its user-interface via ANSI codes or similar...).

but, it is not obvious what is the "best" way to integrate a text-editor into the UI for a 3D engine, hence why I have had this idea floating around for months, but haven't really acted on it (out of humor, it could be given a VIM-like user-interface... ok, probably not, I was imagining mo
Re: [fonc] Error trying to compile COLA
On 28.02.12 06:42, BGB wrote:
> but, anyways, here is a link to another article:
> http://en.wikipedia.org/wiki/Shannon%27s_source_coding_theorem

Shannon's theory applies to lossless transmission. I doubt anybody here wants to reproduce everything down to the timings and bugs of the original software. Information theory is not thermodynamics.

--
regards,
Jakub Piotr Cłapa