2018-01-08 20:25 GMT+01:00 Jörn Heusipp <osm...@problemloesungsmaschine.de>:
>
> On 01/08/2018 12:57 AM, Carl Eugen Hoyos wrote:
>>
>> 2018-01-07 15:40 GMT+01:00 Jörn Heusipp
>> <osm...@problemloesungsmaschine.de>:
>>>
>>> On 01/06/2018 04:10 PM, Carl Eugen Hoyos wrote:
>>>>
>>>> 2018-01-06 11:07 GMT+01:00 Jörn Heusipp
>>>> <osm...@problemloesungsmaschine.de>:
>
>>>>> libopenmpt file header probing is tested regularly against the FATE
>>>>> suite and other diverse file collections by libopenmpt upstream in
>>>>> order to avoid false positives.
>>>>
>>>> You could also test tools/probetest
>>>
>>> I was not aware of that tool. Thanks for the suggestion.
>>>
>>> It currently lists a failure related to libopenmpt:
>>> Failure of libopenmpt probing code with score=76 type=0 p=FDC size=1024
>>
>> I did not look at this closely but I suspect libopenmpt should return a
>> smaller score in this case.
>
> We (libopenmpt developers) are currently considering making the heuristic
> for M15 files even stricter. The changes will land in libopenmpt 0.3.5.
> libopenmpt 0.3.5 versus 0.3.4 or earlier can be distinguished at runtime via
> openmpt_get_library_version(). I would be fine with only doing probing via
> libopenmpt in FFmpeg starting with libopenmpt 0.3.5 and relying on file
> extensions for earlier versions.
>
> However, the data that tools/probetest.c generates here fundamentally does
> have a somewhat high probability of looking like a completely legit M15
> file. False positives are not really avoidable completely no matter what
> libopenmpt does here. The failing data is synthetic, and I am not seeing any
> M15 false positives at all on real-world file collections (media and
> non-media files (tested on 1.2 million data and system files)).
>
>> A solution could be to never return a high value for the FFmpeg
>> probe function.
>
> Maybe, but given what tools/probetest generates here, I somewhat doubt these
> examples have any real-world implication at all.
> Anyway, in case a lower score is deemed to be useful, do you have any
> suggestions which score I should pick? AVPROBE_SCORE_EXTENSION or less would
> probably not be very useful, so what comes to mind for me is
> AVPROBE_SCORE_EXTENSION+1 (51).

No real suggestion here, above was just an idea.

>>> Looking at tools/probetest.c, that would be a file with very few bits
>>> set.
>>> libopenmpt detects the random data in question as M15 .MOD files
>>> (original Amiga 15 samples .mod files), and there is sadly not much
>>> that can be done about that. There are perfectly valid real-world M15
>>> .MOD files with only 73 bits set in the first 600 bytes
>>> (untouchables-station.mod,
>>> <https://modarchive.org/index.php?request=view_by_moduleid&query=104280>).
>>> The following up to 64*4*4*63 bytes could even be completely 0 (empty
>>> pattern data) for valid files (even without the file being totally
>>> silent).
>>> The generated random data that tools/probetest complains about here
>>> contains 46 bits set to 1 in the first 600 bytes. What follows are
>>> various other examples with up to 96 bits set to 1. Completely loading
>>> a file like that would very likely reject it (in particular, libopenmpt
>>> can deduce the expected file size from the sample headers and, with
>>> some slack to account for corrupted real-world examples, reject files
>>> with non-matching size), however, that is not possible when only
>>> probing the file header.
>>> The libopenmpt API allows for the file header probing functions to
>>> return false-positives, however false-negatives are not allowed.
>>>
>>> Performance numbers shown by tools/probetest are what I had expected
>>> (measured on an AMD A4-5000, during normal Xorg session (i.e. not 100%
>>> idle)):
>>>
>
> [...]
>
>>> 109589637233 cycles, libopenmpt
>>
>> This sadly may not be acceptable, others may want to comment.
>>
>>> 2672917981 libopenmpt (per module format)
>>>
>>> At first glance, libopenmpt looks huge here in comparison. However one
>>> should consider that libopenmpt internally has to probe for (currently)
>>> 41 different module file formats, going through 41 separate probing
>>> functions internally.
>>>
>>> Dividing 109589637233 by 41 gives 2672917981, which is in the ballpark of
>>> all other probing functions in ffmpeg.
>
> What are your expectations for probing speed of 41 completely distinct file
> formats?

My only expectation is that other FFmpeg developers comment. A (imo
strong) argument in your favour is that this will only apply if an
optional external library is activated at compile-time.

> Even only h261,h263,h264,hevc,aac,ac3 (raw streams) combined take more time
> than libopenmpt takes for its 41 formats together.

It is otoh imo not a useful argument to compare four of the most common
formats (we have to auto-detect them for mpeg-ts recordings) to
libopenmpt;-)

> All other FFmpeg probing functions combined (234 formats) take 1201426924609
> cycles. libopenmpt adds 109589637233 cycles for 41 different file formats to
> that, which is about 10%. I do not think probing performance is in general
> that performance critical that would make 10% a problem, especially
> considering that for real-world use cases when probing a whole media
> library, the data also has to be read from storage in the first place.

It is 10% for probetest, not sure if this compares well to real-world
files.

But if nobody else comments, I support your patch!

Carl Eugen
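
For reference, the two ideas discussed in this thread (header probing only
from libopenmpt 0.3.5 on, and a score capped at AVPROBE_SCORE_EXTENSION+1)
could be combined roughly as in the sketch below. This is only an
illustration, not the actual patch. The helper name, the SKETCH_* macros and
the two boolean inputs are hypothetical, and the version packing
(major << 24 | minor << 16 | patch) is an assumption about what
openmpt_get_library_version() returns for 0.3.x.

```c
#include <stdint.h>

/* Same value as AVPROBE_SCORE_EXTENSION in libavformat (50, hence the
 * "+1 (51)" quoted above); redefined so the sketch builds standalone. */
#define SKETCH_PROBE_SCORE_EXTENSION 50

/* Assumption: libopenmpt 0.3.x packs its library version as
 * (major << 24) | (minor << 16) | patch. */
#define SKETCH_OPENMPT_VERSION(major, minor, patch) \
    (((uint32_t)(major) << 24) | ((uint32_t)(minor) << 16) | (uint32_t)(patch))

/* Hypothetical helper, not part of FFmpeg or libopenmpt.
 *
 * lib_version       value as returned by openmpt_get_library_version()
 * header_matches    non-zero if header probing accepted the data
 * extension_matches non-zero if the file name has a known module extension
 */
int sketch_probe_score(uint32_t lib_version, int header_matches,
                       int extension_matches)
{
    if (lib_version >= SKETCH_OPENMPT_VERSION(0, 3, 5)) {
        /* Stricter M15 heuristic available: trust header probing, but
         * only one point above a plain extension match. */
        return header_matches ? SKETCH_PROBE_SCORE_EXTENSION + 1 : 0;
    }
    /* 0.3.4 or earlier: ignore header probing, rely on the extension. */
    return extension_matches ? SKETCH_PROBE_SCORE_EXTENSION : 0;
}
```

In a real demuxer, lib_version would come from openmpt_get_library_version()
and header_matches from libopenmpt's header probing. Whether a failed header
probe on 0.3.5 or later should still fall back to the extension is left open
here.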