Re: [flac-dev] Two questions
On Aug 9, 2021, at 11:31, Martijn van Beurden wrote:
> On Sun, 8 Aug 2021 at 06:24, Federico Miyara wrote:
>> As I wanted to stop at sample Nend, I used Nend + 1, instead, to
>> ensure that the sample Nend is the last one.
>
> Considering Nend, did the count start at 0 or at 1? The flac utility
> starts counting at 0 AFAIK, if your count starts at 1, that would
> explain this behaviour.

Exactly. If you wanted to break a FLAC into 588-sample pieces (say, for
CD audio blocks), then you would ask for --until=588 and you would get
exactly 588 samples, numbered 0 through 587. The second piece would be
another 588 samples, numbered 588 through 1,175.

It's always worked as I expected, but then again I'm a C programmer, not
a Pascal programmer. ;-)

Brian
___
flac-dev mailing list
flac-dev@xiph.org
http://lists.xiph.org/mailman/listinfo/flac-dev
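The counting convention Brian describes can be written down as a half-open
range: --skip names the first sample that is decoded and --until names the
first sample that is not. The following is only a sketch of that arithmetic;
the type and function names (sample_range, piece_range, range_count,
range_last) are made up for illustration and are not part of flac or libFLAC.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open sample range [skip, until): sample number `until` is excluded. */
typedef struct {
    uint64_t skip;   /* number of the first decoded sample (counting from 0) */
    uint64_t until;  /* number of the first sample that is NOT decoded */
} sample_range;

/* Range covering piece number `index` (counting from 0) of `piece_len` samples. */
sample_range piece_range(uint64_t index, uint64_t piece_len)
{
    sample_range r;
    r.skip = index * piece_len;
    r.until = r.skip + piece_len;
    return r;
}

/* How many samples the range contains. */
uint64_t range_count(sample_range r)
{
    assert(r.until >= r.skip);
    return r.until - r.skip;
}

/* Number of the last sample that is decoded; the range must not be empty. */
uint64_t range_last(sample_range r)
{
    assert(r.until > r.skip);
    return r.until - 1;
}
```

With piece_len = 588, piece 0 is skip 0 / until 588 (samples 0 through 587)
and piece 1 is skip 588 / until 1176 (samples 588 through 1175), which
matches the numbers in the message above. To make sample Nend the last one
under this convention, the until value is Nend + 1 when counting from 0.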
Re: [flac-dev] Two questions
On Sun, 8 Aug 2021 at 06:24, Federico Miyara wrote:
> As I wanted to stop at sample Nend, I used Nend + 1, instead, to ensure
> that the sample Nend is the last one.

Considering Nend, did the count start at 0 or at 1? The flac utility
starts counting at 0 AFAIK; if your count starts at 1, that would explain
this behaviour.
Re: [flac-dev] Two questions
Martijn,

I've discovered exactly why I was getting that skip/until error. I was
just following what the flac detailed help (flac --explain) says:

    --until={#|[+|-]mm:ss.ss}
        Stop at the given sample number for each input file. The given
        sample number is not included in the decoded output. (...)

As I wanted to stop at sample Nend, I used Nend + 1, instead, to ensure
that the sample Nend is the last one.

So the explanation given above is misleading since the given sample
number is indeed included in the decoded interval.

Regards,
Federico Miyara

On 05/08/2021 03:35, Federico Miyara wrote:
> Martijn,
>
>> If you are looking for a tool, take a look at the metaflac command
>> line utility.
>>
>> metaflac --list file.flac
>>
>> returns all metadata in a FLAC file
>
> Thanks, this is very interesting.
>
>>> 2) I decode using the option --skip=0 --until=1. I would expect to
>>> get a wav file with only 1 sample, but I get 3 samples.
>>
>> Strange, I am not able to replicate that behaviour. When I use
>>
>> flac -d --skip 0 --until 1 file.flac
>>
>> I get a WAV file with a single sample per channel. I tested with a
>> mono and a stereo file. Can you perhaps share the full command line
>> you're using?
>
> It is embarrassing, I cannot replicate it either. The only explanation
> I can find is that I was not running flac from the command line but
> through the command line function dos() of Scilab. When arranging the
> command string to pass to the function I used string variables instead
> of the actual 0 and 1, and it is possible that they contained different
> values from the desired (and believed) ones. Apologies, my fault.
>
> Regards,
> Federico Miyara
Re: [flac-dev] Two questions
Martijn,

> If you are looking for a tool, take a look at the metaflac command line
> utility.
>
> metaflac --list file.flac
>
> returns all metadata in a FLAC file

Thanks, this is very interesting.

>> 2) I decode using the option --skip=0 --until=1. I would expect to get
>> a wav file with only 1 sample, but I get 3 samples.
>
> Strange, I am not able to replicate that behaviour. When I use
>
> flac -d --skip 0 --until 1 file.flac
>
> I get a WAV file with a single sample per channel. I tested with a mono
> and a stereo file. Can you perhaps share the full command line you're
> using?

It is embarrassing, I cannot replicate it either. The only explanation I
can find is that I was not running flac from the command line but through
Scilab's dos() function. When building the command string to pass to that
function I used string variables instead of the literal 0 and 1, and they
may have held values other than the ones I intended. Apologies, my fault.

Regards,
Federico Miyara
Re: [flac-dev] Two questions
On Aug 4, 2021, at 00:10, Federico Miyara wrote:
> Brian,
>
> Once more, thanks for taking your time to answer my questions and
> provide interesting insights. Some comments below.
>
>> I recommend writing your own utility based on the FLAC library, in C,
>> with the features you want. I do not recall any feature in the flac
>> command line utility that would allow this. Your workaround is a
>> reasonable attempt, but it seems to have too many undefined
>> side-effects.
>
> I'm not a C programmer, which I regret, but that's how things are.

Not to wax too philosophical, but nobody is born a C programmer. Things
do not have to remain how they are!

However, Martijn has a great suggestion: use the metaflac command line
utility. I think this will do a lot for you.

> My quest has to do with economy. I already have a function that can
> read the metadata from a wav file; the only information it requires is
> where the metadata start (which I know for the wav file). Exactly the
> same function can be used with a flac file without using any codec, as
> long as I can find the beginning of the "local copy" of the metadata.

Be careful. Many audio programs have failed by attempting to hard-code
the beginning of certain pieces of data in a file. A related failure is
to assume the location of the end of data. Both failures can create audio
glitches.

RIFF (WAV) and IFF/FORM (AIFF) files are made from a sequence of chunks,
and most chunks can appear in any order. I recall that WAV is slightly
more restrictive than AIFF, but I've certainly seen errors in WAV
software due to fixed file offsets.

Similarly, FLAC files are a sequence of blocks. Your best bet is to
create functions that can properly scan blocks, and then once the 'riff'
block is found, you can use a WAV function to properly scan chunks.

But don't worry too much about economy. Your function only needs to read
4 bytes from each FLAC block in order to determine type and size. When
you find an APPLICATION block, you can read 4 more bytes to look for
'riff'. If an uninteresting block is found, then seek in the file
according to the size of the block, and read 4 bytes to examine the next
block. Proper code can scan even very large FLAC files very quickly by
reading only 4 or 8 bytes at a time and then seeking over the
uninteresting data rather than actually reading it from the file into
memory.

When writing C code, I just use the virtual memory support to map the
entire audio file into memory, and then only read a few bytes. The
virtual memory paging system will not read from disk except as necessary.

>> The --keep-foreign-metadata feature was added to the command-line
>> application after the FLAC format was finalized. The metadata ends up
>> in an APPLICATION block, which is usually skipped by the FLAC library
>> decoder. These are intended for third-party applications, and thus
>> it's typically impossible to document them. Normally, a third-party
>> software developer would add their own proprietary block to the FLAC
>> file, and all other applications would just skip over it (because all
>> blocks have a universal name and length at the start).
>
> This information is most useful for me, since at least now I know the
> name of the block containing the foreign metadata, and I know it comes
> before the audio data.

I never noticed that the audio block is last in a FLAC file. I'm used to
AIFF and WAV, where chunks can appear in almost any order. This is good
news because you'll never seek into the audio data.

> I could manually "read" the first few metadata blocks (following the
> format specification) and found that there is a seek table whose length
> is roughly proportional to the size of the audio samples, then a Vorbis
> comment indicating the version of the FLAC library, and then the
> Application block which contains the data I'm interested in. This makes
> its position predictable so I can find it without having to read all
> the file in search of some key words!

The position is not guaranteed to be predictable. If you design your
Scilab algorithm based on predictable positions, then you'll probably end
up with issues. It will be more successful to create a block scanner,
reading 4 or 8 bytes per block, in order to find the APPLICATION 'riff'
block that you want. The algorithm will be very similar to scanning
chunks in the WAV file.

>> The only documentation of the APPLICATION block format is probably the
>> source code for the flac command line utility. I did not design this,
>> but I remember suggesting it a few times. Basically, the entire WAV or
>> AIFF contents are in the block, verbatim, except for the chunk that
>> would contain the audio. Since the FLAC data outside the APPLICATION
>> block already contains the audio, that chunk is empty in the
>> APPLICATION block.
>
> I wonder why there is a long run of zeros (about 8192 zero bytes) in
> the example I'm attaching, almost as long a
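The block scanner recommended in this message can be sketched roughly as
follows. This is an illustration, not code from the flac sources. It
assumes the whole file is already available in memory (for instance through
a memory mapping, as described above) and it relies on the metadata block
header layout from the FLAC format specification: a 1-bit "last block" flag,
a 7-bit block type, and a 24-bit big-endian length, with APPLICATION being
block type 2 and its body starting with a 4-byte application ID. The name
find_application_block is made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FLAC_BLOCK_APPLICATION 2  /* block type number from the FLAC format spec */

/*
 * Scan the metadata blocks of a FLAC file held in memory. Returns the offset
 * of the payload (the bytes after the 4-byte application ID) of the first
 * APPLICATION block whose ID equals `id`, and stores the payload length in
 * *payload_len. Returns -1 if there is no such block or the data is malformed.
 */
long find_application_block(const uint8_t *buf, size_t len,
                            const char id[4], size_t *payload_len)
{
    size_t pos = 4;  /* the blocks start right after the "fLaC" marker */

    assert(buf != NULL && payload_len != NULL);
    if (len < 4 || memcmp(buf, "fLaC", 4) != 0)
        return -1;

    while (len - pos >= 4) {
        int is_last = buf[pos] >> 7;         /* last-metadata-block flag */
        int type = buf[pos] & 0x7F;          /* 7-bit block type */
        size_t size = ((size_t)buf[pos + 1] << 16) |
                      ((size_t)buf[pos + 2] << 8) |
                      (size_t)buf[pos + 3];  /* 24-bit big-endian length */

        pos += 4;                            /* move past the block header */
        if (size > len - pos)
            return -1;                       /* block runs past the buffer */
        if (type == FLAC_BLOCK_APPLICATION && size >= 4 &&
            memcmp(buf + pos, id, 4) == 0) {
            *payload_len = size - 4;
            return (long)(pos + 4);
        }
        if (is_last)
            break;                           /* the audio frames follow */
        pos += size;                         /* skip the body without reading it */
    }
    return -1;
}
```

Calling it with the ID "riff" would return where the stored WAV chunks
begin, at which point an existing WAV chunk scanner could take over. Only
the 4-byte headers (plus 4 ID bytes for APPLICATION blocks) are examined;
seek tables, comments, and padding are skipped by their declared size.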
Re: [flac-dev] Two questions
On Mon, 2 Aug 2021 at 17:40, Federico Miyara wrote:
>
> Dear All,
>
> 1) Is there a way to get the audio size (number of samples) and other
> information, such as number of channels, from a flac file without fully
> decoding it?
> I've found that the WAV header is replicated after the "riffRIFF"
> keyword, but I don't seem to be able to predict where it is located or
> whether it is safe or not to assume that the first time such keyword
> appears is the correct one, and if there is an upper bound for its
> location; for instance, some text such as the name of a song or some
> comment could contain that keyword, even if it is unlikely. The
> information preceding the "reference libFLAC 1.3.3" encoder version
> seems to be non-text information.

If you are looking for a tool, take a look at the metaflac command line
utility.

metaflac --list file.flac

returns all metadata in a FLAC file

> 2) I decode using the option --skip=0 --until=1. I would expect to get
> a wav file with only 1 sample, but I get 3 samples.

Strange, I am not able to replicate that behaviour. When I use

flac -d --skip 0 --until 1 file.flac

I get a WAV file with a single sample per channel. I tested with a mono
and a stereo file. Can you perhaps share the full command line you're
using?
Re: [flac-dev] Two questions
On Aug 2, 2021, at 19:19, Federico Miyara wrote:
> Brian,
>
> Thanks for your reply.
>
>> You should not use the RIFF information. It is not part of the FLAC
>> specification. It is an optional enhancement to store information that
>> is only about the input WAV files. It will strictly be missing from
>> FLAC files that are recorded live, converted from AIFF, and will even
>> be missing when WAV is the source if the option is not added.
>
> I'm aware of that. It happens that my flac files (there are many of
> them) have been converted from wav files and contain riff cues and
> associated text that have been annotated, many years ago, according to
> a research protocol. I need to retrieve that information automatically.
> The encoding has been done preserving foreign metadata and tests have
> shown the information is correctly kept. I have a script that can
> retrieve the info from the wav file, so I decode a dummy wav with very
> few samples and the whole metadata stuff, but it would be better to
> retrieve that information directly from the file.

I recommend writing your own utility based on the FLAC library, in C,
with the features you want. I do not recall any feature in the flac
command line utility that would allow this. Your workaround is a
reasonable attempt, but it seems to have too many undefined side-effects.

> I've read the FLAC format and cannot find any mention of where the
> foreign metadata are included in the stream. Is it possible that it
> isn't actually documented yet?

The --keep-foreign-metadata feature was added to the command-line
application after the FLAC format was finalized. The metadata ends up in
an APPLICATION block, which is usually skipped by the FLAC library
decoder. These are intended for third-party applications, and thus it's
typically impossible to document them. Normally, a third-party software
developer would add their own proprietary block to the FLAC file, and all
other applications would just skip over it (because all blocks have a
universal name and length at the start). This foreign metadata feature is
a special case, where the command-line flac utility uses 'RIFF', 'riff'
or 'aiff' as "application" names, when actually it's the external file
format and not the application that's being referred to.

https://xiph.org/flac/format.html#def_APPLICATION
and
https://xiph.org/flac/id.html

The only documentation of the APPLICATION block format is probably the
source code for the flac command line utility. I did not design this, but
I remember suggesting it a few times. Basically, the entire WAV or AIFF
contents are in the block, verbatim, except for the chunk that would
contain the audio. Since the FLAC data outside the APPLICATION block
already contains the audio, that chunk is empty in the APPLICATION block.

By the way, one of the challenges of making a completely lossless WAV or
AIFF compressor is that there is no predefined order for the various
chunks in those files. The audio data chunk can appear before or after
various other optional chunks. The solution for FLAC was to have that
empty chunk inside the APPLICATION block. For WAV, the audio chunk is
named 'data' and for AIFF the audio chunk is named 'SSND'. All other
chunks are copied verbatim, but these audio chunks only have a name and
size with no further bytes. It's basically a marker. I'm pretty sure
that's how it was implemented, but you can check the flac command line
source to confirm.

>> Are you seeing 3 bytes for 1 sample? ... or are you seeing 3 samples?
>> Also, I recall that the FLAC library returns 32-bit numbers, so you
>> have to convert these to 16-bit or 24-bit samples.
>
> I think it returns exactly the sample type the original file contained,
> otherwise I guess it wouldn't be a lossless compressor.

There are two levels to the FLAC source code. At the lowest level is the
FLAC library, which deals only with the FLAC stream, either seekable or
restricted to streaming only. The FLAC library does not understand WAV or
AIFF or anything besides FLAC. The high-level code is separate, and it's
the flac command line.

So, yes, the flac command line returns the same sample type as the
original file. However, if you use the FLAC library directly, you'll note
that samples are always 32-bit fixed-point integers. I've written my own
object-oriented framework to convert FLAC to WAV, AIFF, CAF, and other
formats. In this code base, I had to deal with the 32-bit integers.

My apologies for confusing the FLAC library with the output files from
the flac command line.

If you're going to use the command line, I'd recommend getting or writing
some utilities that can analyze a WAV (or AIFF) file directly. It seems
that some of the GUI applications out there can do unexpected things (for
a long time, certain GUI apps would show MP3 song metadata in the audio
samples!)

> However, I made a more careful test and with skip=0 until=1 and get 2
> samples inst
Re: [flac-dev] Two questions
Brian,

Thanks for your reply.

> Yes. There is a specific header with this information. Look for the
> documentation of the FLAC format for reference, and then look through
> the library for calls that might return this information.

OK, I'll see that later.

> You should not use the RIFF information. It is not part of the FLAC
> specification. It is an optional enhancement to store information that
> is only about the input WAV files. It will strictly be missing from
> FLAC files that are recorded live, converted from AIFF, and will even
> be missing when WAV is the source if the option is not added.

I'm aware of that. It happens that my flac files (there are many of them)
have been converted from wav files and contain riff cues and associated
text that have been annotated, many years ago, according to a research
protocol. I need to retrieve that information automatically. The encoding
has been done preserving foreign metadata and tests have shown the
information is correctly kept. I have a script that can retrieve the info
from the wav file, so I decode a dummy wav with very few samples and the
whole metadata stuff, but it would be better to retrieve that information
directly from the file.

I've read the FLAC format and cannot find any mention of where the
foreign metadata are included in the stream. Is it possible that it isn't
actually documented yet?

> Are you seeing 3 bytes for 1 sample? ... or are you seeing 3 samples?
> Also, I recall that the FLAC library returns 32-bit numbers, so you
> have to convert these to 16-bit or 24-bit samples.

I think it returns exactly the sample type the original file contained,
otherwise I guess it wouldn't be a lossless compressor.

However, I made a more careful test and with skip=0 until=1 and get 2
samples instead of 3. I confirm this in two different ways:

1) From the RIFF DATA header the size of audio data is 4 bytes, and my
   wav was originally encoded at 16 bit per sample
2) I open it in Audacity and it shows 2 samples

What confused me is that when I open the same file in Cool Edit 2000 (an
ancient commercial audio editor) it displays 4 samples, I don't quite
understand why. The first 2 are the first two of the original file, the
3rd and 4th are 0. But this anomalous behavior corresponds to Cool Edit,
not FLAC. (My first example was with skip=1 until=1, so it should have
yielded 1 sample but Cool Edit showed two more samples, both 0, which
made the 3 samples)

Let's go back to the result: I have 2 samples. Since I had skipped 0 or
no sample, I have the first two samples, so I guess the first one is #0
and the second one is #1. But there is still a conflict with the
documentation, which says (I quote again):

    --until={#|[+|-]mm:ss.ss}
        Stop at the given sample number for each input file. _The given
        sample number is not included in the decoded output._

From my sample case it would appear that the given sample number is
indeed included in the decoded output. If I'm correct, this wouldn't be
exactly a bug but an error in the documentation.

Regards,
Federico Miyara
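The step from "4 bytes of audio data at 16 bits per sample" to a sample
count depends on the number of channels, which the message does not state.
The sketch below only spells out that arithmetic for both cases; the
function name wav_frame_count is made up for illustration, and the channel
counts used in the comments that follow are assumptions, not facts about
the file discussed in the thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of sample frames (one sample for every channel) stored in a WAV
 * 'data' chunk of `data_bytes` bytes.
 */
uint32_t wav_frame_count(uint32_t data_bytes, unsigned channels,
                         unsigned bits_per_sample)
{
    unsigned bytes_per_sample = (bits_per_sample + 7) / 8;  /* whole bytes per sample */

    assert(channels > 0 && bytes_per_sample > 0);
    return data_bytes / (channels * bytes_per_sample);
}
```

For 4 bytes of 16-bit audio this gives 2 frames if the file is mono and
1 frame if it is stereo. An editor that lists the individual values would
show two numbers in either case, so the channel count is worth checking
before comparing the result with the --until documentation.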
Re: [flac-dev] Two questions
On Aug 2, 2021, at 08:40, Federico Miyara wrote:
> 1) Is there a way to get the audio size (number of samples) and other
> information, such as number of channels, from a flac file without fully
> decoding it?

Yes. There is a specific header with this information. Look for the
documentation of the FLAC format for reference, and then look through the
library for calls that might return this information.

I usually start from
https://xiph.org/flac/documentation_format_overview.html for the official
specification of the file format. Note that FLAC is mostly a streaming
format, where metadata is not generally used. But the file does have this
information early on.

See https://xiph.org/flac/format.html#def_STREAMINFO where it says that
the STREAMINFO packet must be the first packet in the file.

> I've found that the WAV header is replicated after the "riffRIFF"
> keyword, but I don't seem to be able to predict where it is located or
> whether it is safe or not to assume that the first time such keyword
> appears is the correct one, and if there is an upper bound for its
> location; for instance, some text such as the name of a song or some
> comment could contain that keyword, even if it is unlikely. The
> information preceding the "reference libFLAC 1.3.3" encoder version
> seems to be non-text information.

You should not use the RIFF information. It is not part of the FLAC
specification. It is an optional enhancement to store information that is
only about the input WAV files. It will strictly be missing from FLAC
files that are recorded live, converted from AIFF, and will even be
missing when WAV is the source if the option is not added.

> 2) I decode using the option --skip=0 --until=1. I would expect to get
> a wav file with only 1 sample, but I get 3 samples. Strictly no sample
> should be decoded since 0 samples should be skipped and the help says:
>
> --until={#|[+|-]mm:ss.ss} Stop at the given sample number for each
> input file. The given sample number is not included in the decoded
> output.
>
> so sample #1 should not be included, unless the first sample is sample
> #0, in which case only 1 sample, and not 3 samples, should be included
> in the output file.

Are you seeing 3 bytes for 1 sample? ... or are you seeing 3 samples?

Also, I recall that the FLAC library returns 32-bit numbers, so you have
to convert these to 16-bit or 24-bit samples.

Brian
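As a rough illustration of reading that header without decoding any audio,
the sketch below pulls the sample rate, channel count, bit depth, and total
sample count out of the STREAMINFO block. It is not libFLAC code: the
struct and function names are made up, and the bit layout (20-bit sample
rate, 3-bit channels minus one, 5-bit bits-per-sample minus one, 36-bit
total samples, starting at byte 10 of the 34-byte block) is taken from the
STREAMINFO section of the FLAC format documentation linked above. It
assumes the first 42 bytes of the file are already in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fields of interest from STREAMINFO (hypothetical struct for this sketch). */
typedef struct {
    uint32_t sample_rate;      /* in Hz */
    unsigned channels;
    unsigned bits_per_sample;
    uint64_t total_samples;    /* per channel; 0 means "unknown" */
} flac_streaminfo;

/* Parse STREAMINFO from the start of a FLAC file. Returns 0 on success, -1 otherwise. */
int read_streaminfo(const uint8_t *buf, size_t len, flac_streaminfo *out)
{
    const uint8_t *p;

    assert(buf != NULL && out != NULL);
    /* "fLaC" marker + 4-byte block header + 34-byte STREAMINFO body */
    if (len < 4 + 4 + 34 || memcmp(buf, "fLaC", 4) != 0)
        return -1;
    if ((buf[4] & 0x7F) != 0)  /* the first block must be STREAMINFO (type 0) */
        return -1;

    p = buf + 8;               /* start of the STREAMINFO body */
    out->sample_rate = ((uint32_t)p[10] << 12) | ((uint32_t)p[11] << 4) |
                       (uint32_t)(p[12] >> 4);
    out->channels = ((p[12] >> 1) & 0x07u) + 1u;
    out->bits_per_sample = (((p[12] & 0x01u) << 4) | (p[13] >> 4)) + 1u;
    out->total_samples = ((uint64_t)(p[13] & 0x0F) << 32) |
                         ((uint64_t)p[14] << 24) | ((uint64_t)p[15] << 16) |
                         ((uint64_t)p[16] << 8) | (uint64_t)p[17];
    return 0;
}
```

The same values are what metaflac --list prints for the STREAMINFO block,
so that tool is a convenient cross-check for a hand-written reader like
this one.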
[flac-dev] Two questions
Dear All,

1) Is there a way to get the audio size (number of samples) and other
information, such as number of channels, from a flac file without fully
decoding it?

I've found that the WAV header is replicated after the "riffRIFF"
keyword, but I don't seem to be able to predict where it is located or
whether it is safe or not to assume that the first time such keyword
appears is the correct one, and if there is an upper bound for its
location; for instance, some text such as the name of a song or some
comment could contain that keyword, even if it is unlikely. The
information preceding the "reference libFLAC 1.3.3" encoder version seems
to be non-text information.

2) I decode using the option --skip=0 --until=1. I would expect to get a
wav file with only 1 sample, but I get 3 samples. Strictly no sample
should be decoded since 0 samples should be skipped and the help says:

    --until={#|[+|-]mm:ss.ss}
        Stop at the given sample number for each input file. The given
        sample number is not included in the decoded output.

so sample #1 should not be included, unless the first sample is sample
#0, in which case only 1 sample, and not 3 samples, should be included in
the output file.

Thanks,
Federico Miyara
Re: [flac-dev] Two questions about RG in flac
Robert Kausch wrote:
> The problem seems to be that sum is interpreted as a 64 bit value if
> SSE2 was used in the loop (the lower 32 bits of the result give the
> expected value). If sum is evaluated another time before or after (!)
> the printf, the problem goes away. For example, changing the last line
> to "return sum + 1;" lets the problem disappear.
>
> I confirmed the bug with GCC 4.6.3 on Ubuntu. As on Windows, only 32 bit
> code generation is affected.

Thank you for testing.

> You should file a bug report with the GCC team.

Done: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61423
Re: [flac-dev] Two questions about RG in flac
Ozkan Sezer wrote:
> With gcc-3.3.6, 3.4.6, 4.3.0 and gcc-4.9.1 (svn r210839) the output is
> normal:
> Sum = 64.00 (should be equal to 64)
>
> With gcc-4.8.3 (release version) it's broken:
> Sum = 206158430272.00 (should be equal to 64)
>
> With clang-3.4.1 (compiled with gcc-4.8.3) the output is normal again.
>
> This is on i686-linux (fedora9, glibc-2.8, kernel-2.6.27.35)

Thank you for testing.

However, I compiled my test with gcc 4.9.1 20140604 from dongsheng-daily
(
http://sourceforge.net/projects/mingw-w64/files/Toolchains%20targetting%20Win32/Personal%20Builds/dongsheng-daily/4.9/
, file gcc-4.9-win32_4.9.1-20140604.7z) and it still fails...
Re: [flac-dev] Two questions about RG in flac
On 6/3/14, Robert Kausch wrote:
> On 03.06.2014 at 16:45, lvqcl wrote:
>> 2) to ALL:
>> I attached a small program. Compile and run it.
>> * Does it work correctly when compiled with -O3 -msse2 options?
>> * If yes, does it work correctly when compiled with -O3 -funroll-loops
>> -msse2 options?
>> ( and what is the version of your GCC? )
> I further reduced the testcase (attached).
>
> The bug only occurs if N >= 64; presumably the second loop is only SSE2
> optimized if that's the case.
>
> The problem seems to be that sum is interpreted as a 64 bit value if
> SSE2 was used in the loop (the lower 32 bits of the result give the
> expected value). If sum is evaluated another time before or after (!)
> the printf, the problem goes away. For example, changing the last line
> to "return sum + 1;" lets the problem disappear.
>
> I confirmed the bug with GCC 4.6.3 on Ubuntu. As on Windows, only 32 bit
> code generation is affected.
>
> You should file a bug report with the GCC team.
>

With gcc-3.3.6, 3.4.6, 4.3.0 and gcc-4.9.1 (svn r210839) the output is
normal:
Sum = 64.00 (should be equal to 64)

With gcc-4.8.3 (release version) it's broken:
Sum = 206158430272.00 (should be equal to 64)

With clang-3.4.1 (compiled with gcc-4.8.3) the output is normal again.

This is on i686-linux (fedora9, glibc-2.8, kernel-2.6.27.35)
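The remark that "the lower 32 bits of the result give the expected value"
can be checked against the broken output quoted above with plain integer
arithmetic. The sketch below does only that; it does not reproduce the
compiler bug, and the helper names (low32, high32, check_reported_sum) are
made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low 32 bits of a 64-bit value. */
uint32_t low32(uint64_t v)
{
    return (uint32_t)(v & 0xFFFFFFFFu);
}

/* Keep only the high 32 bits of a 64-bit value. */
uint32_t high32(uint64_t v)
{
    return (uint32_t)(v >> 32);
}

/* Split the value printed by the broken build into its two halves. */
void check_reported_sum(void)
{
    const uint64_t reported = 206158430272ULL;  /* "Sum =" value from gcc-4.8.3 */

    assert(low32(reported) == 64);   /* the expected sum */
    assert(high32(reported) == 48);  /* stray upper 32 bits */
}
```

In other words, 206158430272 equals 48 * 2^32 + 64, so the reported number
is the correct sum with unrelated data in the upper half, which is
consistent with the description of sum being read as a 64 bit value.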
Re: [flac-dev] Two questions about RG in flac
On 03.06.2014 at 16:45, lvqcl wrote:
> 2) to ALL:
> I attached a small program. Compile and run it.
> * Does it work correctly when compiled with -O3 -msse2 options?
> * If yes, does it work correctly when compiled with -O3 -funroll-loops
> -msse2 options?
> ( and what is the version of your GCC? )

I further reduced the testcase (attached).

The bug only occurs if N >= 64; presumably the second loop is only SSE2
optimized if that's the case.

The problem seems to be that sum is interpreted as a 64 bit value if SSE2
was used in the loop (the lower 32 bits of the result give the expected
value). If sum is evaluated another time before or after (!) the printf,
the problem goes away. For example, changing the last line to
"return sum + 1;" lets the problem disappear.

I confirmed the bug with GCC 4.6.3 on Ubuntu. As on Windows, only 32 bit
code generation is affected.

You should file a bug report with the GCC team.

#include <stdio.h>

#define N 64 /* problem is triggered only if N >= 64 */

unsigned A[N];

int main()
{
    /* both sum and A[] need to be unsigned for the bug to happen */
    unsigned i, sum = 0;

    for (i = 0; i < N; i++) A[i] = 1;
    for (i = 0; i < N; i++) sum += A[i];

    printf("Sum = %f (should be equal to %i)\n", (float) sum, N);

    return 0;
}
Re: [flac-dev] Two questions about RG in flac
On 03.06.2014 at 16:45, lvqcl wrote:
> 2) to ALL:
> I attached a small program. Compile and run it.
> * Does it work correctly when compiled with -O3 -msse2 options?
> * If yes, does it work correctly when compiled with -O3 -funroll-loops
> -msse2 options?
> ( and what is the version of your GCC? )

Tested various versions of TDM-GCC on Windows.

A 32 bit executable produced with TDM-GCC 4.8.1 fails as soon as -O3 and
SSE2 come together. SSE2 is enabled by -O3 here, so compiling with -O3 is
sufficient to trigger the bug. Compiling with -O3 -mno-sse2 produces a
correctly working executable, just as -O2 -msse2 does. -funroll-loops
does not make any difference.

The same holds for TDM-GCC 4.4.1, 4.5.0, 4.6.1 and 4.7.1; the only
difference is that -O3 does not include SSE2 there, so it has to be
enabled manually with -msse2 to trigger the problem.

TDM-GCC 4.3.2 produces a correctly working executable even with
-O3 -msse2.

64 bit executables produced with any of the tested GCC versions work
correctly in all cases.