On Mon, 2004-02-23 at 14:19, Axel IS Main wrote:
> Yes, and in fact that is what I am doing now. This is a spider bot
> though, so I'm having to think of every single type of binary file that
> could be linked to on the web. So far I'm up to 28 with no end in sight.
> What about a .com file? I can't omit links that end in .com can I? That
> would be counterproductive to say the least. Also, the function that
> does the checking just keeps getting longer and longer, which makes the
> spider go slower and slower. Granted, the thing is pretty fast if it has
> enough BW to work with, but still. This could eventually turn into a
> script killer. Detecting whether the stream from file_get_contents(), or
> fopen() for that matter, is binary or not and going with that result is
> the elegant solution to this problem. There has to be a way to do it.

You could try writing a script that checks the first several bytes of the file for control characters. If control characters make up 20% or more of the first 1 KB (a threshold pulled at random from my head), it's a safe bet that it is a binary file. This is not 100% accurate, but it's something to play with that doesn't rely on MIME types or file extensions, both of which can easily be inaccurate.

-- 
Adam Bregenzer
[EMAIL PROTECTED]
http://adam.bregenzer.net/

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
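
The control-character check suggested in the reply could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread: the function name looksBinary(), its parameters, and the exact set of bytes counted as "control characters" (0x00-0x08, 0x0B, 0x0C, 0x0E-0x1F and 0x7F, with tab, LF and CR deliberately excluded) are all hypothetical choices. It also assumes PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical helper (not from the thread): guesses whether $data looks
// binary by measuring the share of control characters in its first
// $sampleSize bytes. The 1024-byte sample and 20% threshold follow the
// rough numbers floated in the reply and are not tuned values.
function looksBinary(string $data, int $sampleSize = 1024, float $threshold = 0.20): bool
{
    $sample = substr($data, 0, $sampleSize);
    $length = strlen($sample);

    if ($length === 0) {
        return false; // Nothing to inspect; treat empty input as text.
    }

    // Assumed definition of "control character": bytes 0x00-0x08, 0x0B,
    // 0x0C, 0x0E-0x1F and 0x7F. Tab, LF and CR are left out because they
    // are routine in ordinary text. Without the /u modifier the pattern
    // is matched byte by byte.
    $controlCount = preg_match_all('/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/', $sample);

    return ($controlCount / $length) >= $threshold;
}
```

A spider could pass in whatever it has already fetched, for example the string returned by file_get_contents(), and skip the link when the function returns true. Because only low control bytes are counted, formats whose headers consist mostly of high bytes or printable characters may slip under the threshold, which fits the reply's caveat that the check is not 100% accurate.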