On Fri, Jul 30, 2004 at 10:49:56AM -0700, Josh Coalson wrote:
> it's good you brought this up, I want to finalize the Ogg FLAC
> bitstream mapping and add it to the docs.  currently the way it
> is done in flac 1.1.0 is not ideal and probably should change.

And it should change soon; the time for end users mixing and matching
arbitrary codecs with each other (i.e., Theora + FLAC) is here.  FLAC
bitstream "upgrade" utilities can be written fairly easily, too.

> lacking granulepos?  not sure what you mean, flac 1.1.0 writes
> a granulepos of 0 for metadata packets and the correct granulepos
> for audio packets.

My bad, then.  I have the most recent version but was using test
material that must have been encoded with a previous version...

> in flac 1.1.0, what ends up in page 0 is not clearly defined,
> which is not good.  I don't remember the ogg documentation
> giving any recommendations as to what should be in the first
> page and I didn't really investigate it.

This is a good point.  I'm going to sit down and write up a draft doc
for "Ogg codec design", especially how to map "standalone codecs".
I'll pass it around and get feedback on it, especially from Monty and
Ralph.

> the way *CVS* libOggFLAC currently writes is as follows:
>
> - the 4 byte 'fLaC' header is put in the first packet
> - each FLAC metadata block is put in its own packet
> - each FLAC audio frame is put in its own packet
>
> granulepos is 0 for the fLaC packet and metadata packets.
> (I recall somewhere someone suggesting using a granulepos
> of -1 for non-audio header packets?)

No, -1 has a special meaning for granulepos: it means "no Packet
completes on this Page".  0 is the correct granulepos for header
pages, you have this right.

> when the fLaC packet and metadata packets are written, the
> library currently calls ogg_stream_flush() on each packet
> so that each fLaC/metadata packet is in its own page.

Packet 0 must always be flushed.  You want decoders/demuxers to be
able to quickly grab a small Page from each stream and pass that page
around to the various codecs asking "is this yours?".  Libogg2 flushes
Packet 0 automatically, no choice in the matter.
But you also want some minimal information in it, such as its version
and granulerate, so that (de)muxers can handle granulepos -> time
mapping without having to decode any further than Page 0.

You must also flush at the end of your header packets so that tools
such as Icecast can easily cache the header pages, and nothing else,
and send them immediately preceding "current" live data in the stream.
The "end of header" flush is something you have to do manually.

> but because some metadata packets can exceed the Ogg nominal
> page size of 4K, and even the max Ogg page size (~64K?), some
> metadata packets may not fit in a single page.

This is not a problem.  As long as the base data for the codec is in
Page 0, the rest of the headers can span multiple pages if needed.

> what is your recommendation here?  is it that the fLaC and
> first STREAMINFO (which holds sample rate, #samples,
> resolution, etc) packets be flushed together so that they
> are in the same page, page 0?  the total size of both packets
> is 42 bytes so this seems to be no problem.

No, Page 0 should contain only one Packet.  My recommendation is that
this packet contain, at minimum, a version, samplerate, #channels, and
samplesize (resolution? 4/8/16/24/32 bit).  Block and frame size
constraints should also go here.  You may also want to combine your
"Registered application ID" packet with the first header packet
(perhaps as a field right after the version).

I do not recommend putting in fields for total samples in stream,
since this goes against Ogg framing, and the MD5 signature (while
useful) is a little redundant given that Ogg provides a CRC for each
page, and this too goes against Ogg framing conventions.  Having these
fields as optional is OK, especially since they exist in FLAC, but
requiring either makes live streaming and one-pass encoding
impossible.  The "Vorbis Comment" section should be in a packet by
itself.
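
As a rough illustration of the first-packet recommendation above, here
is a minimal Python sketch of such a header.  The field order, the
field widths, the big-endian byte order, and the function names are
all assumptions made up for this example; nothing here is an agreed
Ogg FLAC mapping.

```python
import struct

# HYPOTHETICAL layout, for illustration only (not an official Ogg FLAC mapping):
#   4s  signature "fLaC"
#   B   major version     B   minor version
#   I   sample rate (Hz)  B   channels    B   bits per sample
#   H   minimum block size (samples)      H   maximum block size (samples)
# ">" selects big-endian with no padding, so the packet is always 16 bytes.
FIRST_PACKET = struct.Struct(">4sBBIBBHH")
SIGNATURE = b"fLaC"


def pack_first_packet(major, minor, sample_rate, channels,
                      bits_per_sample, min_block, max_block):
    """Build the single packet that would be flushed alone into Page 0."""
    return FIRST_PACKET.pack(SIGNATURE, major, minor, sample_rate, channels,
                             bits_per_sample, min_block, max_block)


def unpack_first_packet(packet):
    """Return the header fields as a dict, or None if the packet is not ours."""
    if len(packet) != FIRST_PACKET.size or packet[:4] != SIGNATURE:
        return None
    names = ("major", "minor", "sample_rate", "channels",
             "bits_per_sample", "min_block", "max_block")
    return dict(zip(names, FIRST_PACKET.unpack(packet)[1:]))
```

Total samples and the MD5 signature are deliberately left out of this
sketch, so a live or one-pass encoder could write the packet before
any audio has been seen.
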
The seekpoint stuff is redundant and should not be used in Ogg
encoding; this data can easily be regenerated from the Page granulepos
values when transferring from OggFLAC -> FLAC.  I'm not sure if
cuepoints are especially helpful.  It doesn't hurt to include them
(especially if you want to be able to transfer FLAC <-> OggFLAC
without data loss).  I think cuepoints, if necessary, make more sense
in a generic metadata codec, such as has been suggested for the
"kitchen sink metadata codec", so that apps that support CDDA stuff
could use it with any codec instead of needing codec-specific support.

> but adding in the codec version to the first page....  in
> FLAC the codec version is in the vendor tag of the vorbis
> comments.

When a codec looks at a stream, it needs to know from Page 0 whether
it can support that stream.  If someone is using a FLAC 1.x decoder
and the stream is marked as 2.x, the codec needs to know to reject it.

> the vorbis comments are in their own metadata
> block/packet, but there is no requirement for the vorbis
> comments to follow the STREAMINFO immediately.  there may
> be other metadata in between.  now, I can enforce it in
> libOggFLAC that it follows STREAMINFO immediately, but then
> the question is, how can you guarantee that the whole vorbis
> comment packet will also fit in page 0, given that there is
> not much restriction on the size of comments?  is it enough
> that the first part of the vorbis comment packet that contains
> the vendor string is in page 0?

Comments don't belong in Page 0.  They are not useful to a decoder or
muxer trying to figure out how to use a codec or displaying metadata
about it.

A good example for this is how Vorbis works:

Packet 0: Identification Header (always flushed to Page 0)
    32 bits: vorbis_version
     8 bits: audio_channels
    32 bits: audio_sample_rate
    32 bits: bitrate_maximum
    32 bits: bitrate_nominal
    32 bits: bitrate_minimum
     4 bits: blocksize_0
     4 bits: blocksize_1
     1 bit:  framing_flag
Packet 1: Comment Header
Packet 2: Codec Setup Header
    floors, residues, codebooks, etc. used for decoding

A flush follows Packet 2, but Packets 1 and 2 may share a page or span
many pages, and may be continued between pages.

The purpose of Packet 2 is so this information doesn't have to be
repeated in every data packet.  FLAC appears to repeat this
information in every frame, and as such, better compression may be
possible simply by moving this data to a header packet.  Or, at least,
doing so would offset the additional overhead we get from the Ogg page
headers.

> audio packets are written out with ogg_page_out() with no
> attempt to manipulate the page boundaries.  but the first
> audio packet will always start a page because all the metadata
> is flushed out to pages before audio data is written.  is this
> also OK?

This is perfect.  The end of the headers needs a flush; everything
else should go out normally.  This behavior is identical to that of
Vorbis.

> your help is appreciated.  I wouldn't worry too much about
> backward compatibility with old-and-previously-unwieldy Ogg
> FLAC because 1) not many people (anyone?) are using it yet since
> it has had no seeking support until recently in CVS; 2) it is
> trivial to decode an old stream with an old decoder and
> re-encode it with a newer encoder that complies to an official
> Ogg FLAC bitstream mapping.

The latter is a very good point, something I keep forgetting: FLAC is
lossless, so transcoding FLAC -> FLAC is a lossless operation.

However, this is a good example of why a version field in Page 0 is
needed :-)  Older apps will choke on the new format and, without the
version field, they won't know why.
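
To make the version point concrete with a format that already has
such a field: on the wire, the Vorbis identification header listed
above is 30 bytes, because the Vorbis I specification puts a
packet-type byte (0x01) and the string "vorbis" in front of those
fields and stores them little-endian.  A decoder can accept or reject
the stream from that one packet.  The following is a minimal Python
sketch; the function name and the returned dict are illustrative only,
not any library's API.

```python
import struct

# Vorbis I identification header: 0x01 + "vorbis", then version, channels,
# sample rate, three signed bitrates, one byte holding both block-size
# exponents, and one byte holding the framing bit.  30 bytes in total.
_IDENT = struct.Struct("<7sIBIiiiBB")


def parse_vorbis_ident(packet: bytes) -> dict:
    """Parse Packet 0 of a Vorbis stream, raising ValueError if unusable."""
    if len(packet) != _IDENT.size:
        raise ValueError("not a Vorbis identification header (wrong size)")
    (magic, version, channels, rate,
     br_max, br_nom, br_min, blocks, framing) = _IDENT.unpack(packet)
    if magic != b"\x01vorbis":
        raise ValueError("not a Vorbis stream")
    if version != 0:
        # Vorbis I streams carry version 0; anything else must be rejected.
        raise ValueError(f"unsupported vorbis_version {version}")
    if not framing & 1:
        raise ValueError("framing bit not set")
    return {
        "channels": channels,
        "sample_rate": rate,
        "bitrate_maximum": br_max,
        "bitrate_nominal": br_nom,
        "bitrate_minimum": br_min,
        # The 4-bit fields are exponents: blocksize_0 is the low nibble.
        "blocksize_0": 1 << (blocks & 0x0F),
        "blocksize_1": 1 << (blocks >> 4),
    }
```

Because the version is checked before anything else is trusted, an old
decoder handed a newer stream can report "unsupported version" rather
than failing somewhere deep inside the audio data.
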
Also, new apps will have to detect a four-byte Page 0 as being the
"old way" if they want to support it.

> I haven't been following Ogg2 (are there any docs for it?) so
> I don't know what that entails.

Sorry, docs haven't been written yet.  I believe that only Monty and I
are familiar with it, but docs are going to be written "real soon
now".  I was going to write a "dummy's guide to migrating to libogg2"
but figure it'll be easier to just do a lot of that work myself.

The API is very similar to libogg1, but at the same time, "everything
has changed".  All buffers are "owned" by the library now, which is
responsible for memory management, and while some of the functions
retain their names, their arguments are of different types.  They are,
however, similar enough that libtheora supports both with only a few
#ifdef LIBOGG2's here and there.

libogg2's advantage is speed and memory consumption.  libogg1
repeatedly copies memory between buffers and does other really
inefficient things like that.  Monty wrote libogg2 originally as part
of Tremor, since lower memory usage was needed, and wrote it such that
data goes from the bitpacker to the sync buffer without ever being
copied or moved in memory.

OggFile, the "Ogg System Library", will use libogg2 and will probably
be distributed with it, and all "next-generation" apps which use Ogg
are likely to use libogg2.  In other words, migrating FLAC is a pretty
high priority as far as getting it ready to be used with other Ogg
codecs.

_______________________________________________
Flac-dev mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/flac-dev
