Re: [Openfontlibrary] ccHost compression
On Mon, 2008-11-03 at 13:46, Ed Trager wrote:
at the *nix file command source code, I bet you could fairly easily find a reference to the magic file header bytes that are used to detect TTF/OTF files and then add this to the getId3() stuff, assuming that getId3() is well-written.

OpenType fonts begin with the four bytes OTTO
TrueType fonts begin with either 0x0001 true 0x0002 (I doubt this will occur in a user supplied font)
TTC fonts begin with ttcf
There are some ancient apple/adobe sfnt formats which start with typ1 CID but these aren't likely to be generated any more.
PFB files begin with 0x8001
PFA files begin with %!PS-AdobeFont-1.0 (Other adobe fonts probably also start with this, so it might just mean PostScript font).
Bare CFF fonts (the thingies inside otf files) begin with 0x010004
BDF fonts begin with STARTFONT
PCF fonts begin with 0x01 fcp
sfd files begin with SplineFontDB:
Ikarus fonts begin with IK 0x0055
(There are some exceedingly complex rules for recognizing fonts in mac resource forks, or mac dfont format (almost the same) but I doubt it's worth implementing them).

On Tue, 2008-11-04 at 05:02, Brendan Ferguson wrote:
Say, will any of the font source files read like a unix script file with #!/ as the first bits of information in the file?

I am not aware of any font format which starts with #!/

On Tue, 2008-11-04 at 14:17, Dave Crossland wrote:
I think with extensions and the file command we can reliably detect images as images and have them displayed instead of for download :-)

png images start with 0x89 PNG
jpeg images contain JFIF in bytes 6,7,8,9
gif files start with GIF87 GIF89

___ Openfontlibrary mailing list Openfontlibrary@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/openfontlibrary
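The signature list in this message maps fairly directly onto a lookup table. The following is only a sketch of that idea in Python, not ccHost or getID3 code. The names `SIGNATURES`, `sniff` and `sniff_file` are made up for illustration, offsets are zero-based, the TrueType version tag is assumed to be the full four bytes 0x00010000, and the Ikarus, CID and 0x0002 cases are left out.

```python
# Hypothetical table: (zero-based offset, magic bytes, label).
SIGNATURES = [
    (0, b"OTTO", "OpenType font"),
    (0, b"\x00\x01\x00\x00", "TrueType font"),   # assumed full 4-byte tag
    (0, b"true", "TrueType font"),
    (0, b"ttcf", "TrueType collection"),
    (0, b"typ1", "sfnt-wrapped Type 1 font"),
    (0, b"\x80\x01", "PFB font"),
    (0, b"%!PS-AdobeFont-1.0", "PFA / PostScript font"),
    (0, b"\x01\x00\x04", "bare CFF font"),
    (0, b"STARTFONT", "BDF font"),
    (0, b"\x01fcp", "PCF font"),
    (0, b"SplineFontDB:", "SFD font source"),
    (0, b"\x89PNG", "PNG image"),
    (6, b"JFIF", "JPEG image"),
    (0, b"GIF87", "GIF image"),
    (0, b"GIF89", "GIF image"),
]


def sniff(data: bytes):
    """Return a label for the first matching signature, or None."""
    for offset, magic, label in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return label
    return None


def sniff_file(path):
    """Read only the first few bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return sniff(handle.read(32))
```

A file that matches nothing in the table returns `None`, which suits an allow-list policy: anything unrecognised is simply not accepted.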
Re: [Openfontlibrary] ccHost compression
2008/11/3 Brendan Ferguson [EMAIL PROTECTED]:
I have joined the development mailing list. Waiting for my first mail.

Which dev list? :-)

One note of concern that I will research. If someone starts with a .html file and adds php content, then uploads it and renames it to .php, a script could be executed if the detect script does not register it as a php file

I've tested this and it doesn't detect a file as a PHP file if its first bytes are html while having <?php echo oops ?> on the 2nd line.
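Because the first bytes alone miss PHP that starts on a later line, one possible complement is to scan the whole upload for a PHP opening tag. This is a rough Python sketch; `contains_php` is a hypothetical helper, and it deliberately ignores bare `<?` short tags so that `<?xml` declarations in SVG or XML sources are not flagged.

```python
import re

# Matches "<?php" and "<?=" anywhere in the data, case-insensitively.
# Bare "<?" short tags are NOT matched (an assumption made to spare "<?xml").
PHP_OPEN_TAG = re.compile(rb"<\?(php|=)", re.IGNORECASE)


def contains_php(data: bytes) -> bool:
    """Return True if a PHP opening tag appears anywhere in the data."""
    return PHP_OPEN_TAG.search(data) is not None
```

A server that has short tags enabled would need a stricter rule than this one.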
Re: [Openfontlibrary] ccHost compression
Sounds like you are an expert around here :-) But I have not done any coding in 4 years.. Brendan
Re: [Openfontlibrary] ccHost compression
Hi,

Dave Crossland wrote:
2008/11/3 Brendan Ferguson [EMAIL PROTECTED]:
Getting the file onto the server is the first big step in launching an attack.

We can set the webserver to send files for download, so neither the webserver or webbrowser will interpret them. So could we accept all files, but make them only for download, and tell site visitors to report problems to us if there are dodgy files?

http://www.thingy-ma-jig.co.uk/blog/06-08-2007/force-a-pdf-to-download explains how to do this for *.pdf files in a case insensitive, cross-browser way.

This download-as-dumb-data policy, combined with ccHost's file-verification capabilities seems adequate to me. I do see the potential for attacks based on the contents of an upload, but why should we accept uploaded HTML files and why should we allow any uploaded file to be executed by Apache?

I believe what is needed is this:
- accept upload as either loose files or an archive (.tgz, .zip, perhaps .7zip and .bzip)
- if this is a new typeface, create a directory for it inside the user's directory
- unarchive everything once the archive has been uploaded, *replacing any files with the same name*

And then have download links for each individual file and a .tgz (or perhaps better a .zip) for the whole directory.

That's different in detail to what ccHost does right now, but it's compatible in spirit. It also leaves the way open for access via special URLs for package maintaining scripts or whatever with no need for human intervention.

Cheers, Ben
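The unarchive step proposed here could look roughly like the following Python sketch for the .zip case only. `unpack_upload` and its arguments are hypothetical names, not ccHost functions. The path check is an extra precaution that the message does not mention: it refuses archive entries that would land outside the typeface directory.

```python
import os
import zipfile


def unpack_upload(archive_path, typeface_dir):
    """Extract a zip into typeface_dir, replacing files with the same name."""
    os.makedirs(typeface_dir, exist_ok=True)   # new typeface: create its directory
    root = os.path.realpath(typeface_dir)
    written = []
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            target = os.path.realpath(os.path.join(root, info.filename))
            # Reject entries such as "../x" that escape the typeface directory.
            if os.path.commonpath([root, target]) != root:
                raise ValueError("unsafe path in archive: " + info.filename)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            # Opening with "wb" overwrites any existing file of the same name.
            with archive.open(info) as src, open(target, "wb") as dst:
                dst.write(src.read())
            written.append(target)
    return written
```

File-type filtering would still have to run on each extracted file before it is offered for download.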
Re: [Openfontlibrary] ccHost compression
We can set the webserver to send files for download, so neither the webserver or webbrowser will interpret them.

I imagine that even if the files are set for download, they will be interpreted. If, say, I set up a GIF for PHP to run through it, and then force the download header, it will probably download an interpreted GIF. Now if you changed the type of file to, say, text, this might work... Probably. But you will not be able to view any of the images any more; the browser would be treating them like text. :(

There are Apache configs that can disable PHP and CGI for specific directories, though. I just spent some time playing with them. It seems as though we will have to put them in our own server config files. They are not universally accepted in .htaccess files.

I can see if I can change the permissions of the files that are uploaded so there is read and write access, but not execution access. Not sure if this will work, but worth a try.

Other than that, we will just have to rely on our blacklist, which should also disable some Windows executables to prevent people from uploading viruses, which will not affect the server, but when downloaded could affect the clients.

Another option, which I am really not up to coding, would be to rename the files when they are downloaded and use a database to connect all the original file names with the randomly generated file names we rename them all to. Then we never link directly to any file, but use a script to send the files when they are asked for. This way even if someone got something ugly up on to the server, and they did somehow have execution permissions, they would not know what file to call.
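The last option, random stored names tied to the originals through a database, might be sketched as below in Python with SQLite. The table name `uploads` and the helpers `open_store`, `register_upload` and `original_name_for` are all invented for this example. A real deployment would use ccHost's own database, which this sketch does not touch.

```python
import secrets
import sqlite3


def open_store(db_path=":memory:"):
    """Open (or create) the hypothetical name-mapping table."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS uploads ("
        "stored_name TEXT PRIMARY KEY, original_name TEXT NOT NULL)"
    )
    return db


def register_upload(db, original_name):
    """Pick an unguessable on-disk name and remember the original one."""
    stored_name = secrets.token_hex(16)   # 32 hex characters, no extension
    db.execute("INSERT INTO uploads VALUES (?, ?)", (stored_name, original_name))
    db.commit()
    return stored_name


def original_name_for(db, stored_name):
    """Look up the name to present when the download script sends the file."""
    row = db.execute(
        "SELECT original_name FROM uploads WHERE stored_name = ?", (stored_name,)
    ).fetchone()
    return row[0] if row else None
```

Since the stored name carries no extension, the web server has no suffix such as .php to act on, and an attacker cannot guess the URL of what they uploaded.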
Re: [Openfontlibrary] ccHost compression
On Tue, 2008-11-04 at 08:20 +, Dave Crossland wrote:
2008/11/3 Ed Trager [EMAIL PROTECTED]:
The PHP getId3() library is at http://getid3.sourceforge.net/. It might be worth looking into how to expand this library to recognize the TTF and OTF file headers, perhaps? The idea here seems quite similar to what the *Nix file command does. If someone were to look at the *nix file command source code, I bet you could fairly easily find a reference to the magic file header bytes that are used to detect TTF/OTF files and then add this to the getId3() stuff, assuming that getId3() is well-written.

Ben Weiner has been looking at ways to extract metadata from font files directly, but I think he gave up because he couldn't complete it in the time he had to allocate to it. He was looking at the TTX tools for this, I think. Anyway, since none of our files have ID3 tags inside them, it seems to me that OFLB can get rid of getID3() and replace it with a PHP wrapper around the file command, perhaps combined with TTX.

getID3() reads and writes metadata for different file formats, using the standard type of metadata for each format, and not just ID3, which is only for mp3. The project has a badly misleading name ;)

Jon

-- Jon Phillips San Francisco, CA + Guangzhou + Beijing GLOBAL +1.415.830.3884 CHINA +86.1.360.282.8624 [EMAIL PROTECTED] http://www.rejon.org IM/skype: kidproto Jabber: [EMAIL PROTECTED] IRC: [EMAIL PROTECTED]
Re: [Openfontlibrary] ccHost compression
2008/11/4 Brendan Ferguson [EMAIL PROTECTED]:
This is not really my area of expertise. I was primarily a php programmer who made websites, content management systems and such. Also did website design using DHTML and Usability.

Sounds like you are an expert around here :-)

The extent of unix I know is to get my web servers up and running on my own boxes.

:-)
Re: [Openfontlibrary] ccHost compression
2008/11/4 Brendan Ferguson [EMAIL PROTECTED]: Say, will any of the font source files read like a unix script file with #!/ as the first bits of information in the file? Maybe. There is a font on OFLB now that is a SFD and has a makeOTF.sh file uploaded too. I forget which one though :(
Re: [Openfontlibrary] ccHost compression
2008/11/3 Ed Trager [EMAIL PROTECTED]: The PHP getId3() library is at http://getid3.sourceforge.net/. It might be worth looking into how to expand this library to recognize the TTF and OTF file headers, perhaps? The idea here seems quite similar to what the *Nix file command does. If someone were to look at the *nix file command source code, I bet you could fairly easily find a reference to the magic file header bytes that are used to detect TTF/OTF files and then add this to the getId3() stuff, assuming that getId3() is well-written. Ben Weiner has been looking at ways to extract metadata from font files directly, but I think he gave up because he couldn't complete it in the time he had to allocate to it. He was looking at the TTX tools for this, I think. Anyway, since none of our files have ID3 tags inside them, it seems to me that OFLB can get rid of getID3() and replace it with a PHP wrapper around the file command, perhaps combined with TTX.
Re: [Openfontlibrary] ccHost compression
2008/11/3 Brendan Ferguson [EMAIL PROTECTED]:
By headers Ed means the first few bytes of the file. So the file command does indeed identify PHP files perfectly:

I will take your word on it. I am clearly not up to date on this.

Perhaps the file manual will help? :-)

$ man file

Will it detect a PHP file with a .html file if I have enabled PHP support in .html files?

Well, it detects PHP files when renamed as HTML files:

$ cp index.php index.html
$ file index.html
index.html: PHP script text
Re: [Openfontlibrary] ccHost compression
2008/11/3 Brendan Ferguson [EMAIL PROTECTED]:
It sounds like you are describing user security. This is really a server security issue for me. Take a PHP file. What headers will it have? NONE!

By headers Ed means the first few bytes of the file. So the file command does indeed identify PHP files perfectly:

$ file web/www/index.php
web/www/index.php: PHP script text
$ cp web/www/index.{php,rpm}
$ file web/www/index.rpm
web/www/index.rpm: PHP script text
$

So I think we can use this program to reliably detect file types, whatever extension they may have, and filter them accordingly.
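A wrapper around the file command, as suggested in this thread, could be as small as the Python sketch below. It relies only on `file --brief`, which prints the description without the leading filename. The helper names `file_type` and `is_blocked`, the `run` parameter (there so the command can be stubbed out in tests) and the `BLOCKED_WORDS` tuple are all assumptions for illustration, and the exact wording of `file` output varies between versions.

```python
import subprocess

# Hypothetical deny words, matched against the description printed by file(1).
BLOCKED_WORDS = ("PHP script", "shell script", "executable")


def file_type(path, run=subprocess.run):
    """Return the description file(1) gives for path, e.g. 'PHP script text'."""
    result = run(
        ["file", "--brief", "--", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def is_blocked(description):
    """Return True if the description mentions any blocked kind of file."""
    return any(word in description for word in BLOCKED_WORDS)
```

With this, `is_blocked(file_type("web/www/index.rpm"))` would reject the renamed PHP file from the transcript above regardless of its extension.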
Re: [Openfontlibrary] ccHost compression
Hi,

Just to add a bookend: ccHost ships with two means of filtering files. Both are additive - IOW nothing is allowed until explicitly permitted (I prefer this, and ccHost is at v5...)

The two methods:
- clever: getID3 (php file identifying lib). Useless to OFLB ATM as nobody's added any fonts to the library.
- less clever: signature recognition (called 'pseudo verify'). Already in place in the current OFLB live site for TTF, OTF, PFA etc (done by the brave souls who set up the site) and perfectly reasonable IMO. It basically does a head command on the file (so more basic than the file command).

I tried to co-opt some of the file verification code in the new typeface page (reasoning that the site already knew from info added to the pseudo verify system which types of files belonged to each typeface) but gave up. Instead I just made my own list of file extensions to check against the filename (the 'get you home' solution :-).

Cheers, Ben
Re: [Openfontlibrary] ccHost compression
Hi, Brendan, The PHP getId3() library is at http://getid3.sourceforge.net/. It might be worth looking into how to expand this library to recognize the TTF and OTF file headers, perhaps? The idea here seems quite similar to what the *Nix file command does. If someone were to look at the *nix file command source code, I bet you could fairly easily find a reference to the magic file header bytes that are used to detect TTF/OTF files and then add this to the getId3() stuff, assuming that getId3() is well-written. - Ed On Mon, Nov 3, 2008 at 2:06 PM, Brendan Ferguson [EMAIL PROTECTED] wrote: Could you point me to the page for this script? I would like to read more about it.
Re: [Openfontlibrary] ccHost compression
Yes, I think it is worthwhile. We have done this for SVG on http://openclipart.org

It's for more than just id3 now ;) Should be more like readWriteMetadataWithPHP() ;)

Jon

On Mon, 2008-11-03 at 16:46 -0500, Ed Trager wrote:
Hi, Brendan, The PHP getId3() library is at http://getid3.sourceforge.net/. It might be worth looking into how to expand this library to recognize the TTF and OTF file headers, perhaps? The idea here seems quite similar to what the *Nix file command does. If someone were to look at the *nix file command source code, I bet you could fairly easily find a reference to the magic file header bytes that are used to detect TTF/OTF files and then add this to the getId3() stuff, assuming that getId3() is well-written. - Ed

On Mon, Nov 3, 2008 at 2:06 PM, Brendan Ferguson [EMAIL PROTECTED] wrote:
Could you point me to the page for this script? I would like to read more about it.
Re: [Openfontlibrary] ccHost compression
2008/11/2 Brendan Ferguson [EMAIL PROTECTED]:
(c) when any individual files are added to the typeface, create a new zip that includes everything

For what reason? Downloading? Is this essential or ideal?

Here is the use-case scenario that this is for: Mary, soccer mom and scrap book hobbyist, is searching the web for free fonts. She finds OFLB and wants to browse all the fonts in the library at once, and quickly download all the files for the dozen typeface families she thinks are cute. She sees something about how these fonts are free as in no price, but also free as in she can change them, and she bookmarks the site to learn more about all that later.

More simply: A user goes to a font's page, and wants to download all the files associated with that font - font files, font sources, license, FONTLOG, everything. A ZIP file with everything, available with a single click, is ideal for this.

Now. It looks as though people can fill out tags and also a description. We will not be able to do this while decompressing. The Name will have to take the form of the file name.

The name is the human name for the overall collection of files, and is not directly related to those files. The tags and description are like that too. The upload form has a user fill these things in when they say the location of the first file on the disk to be uploaded, and when they click upload then the decompression would happen.

I guess the easiest way to work this is to make them hidden by default. Navigating to the hidden files is confusing though. The file submission page could use consistent language (instead of publish now one could use hide this file). One could also rename the tab in the user page from hidden to unpublished or something like that. Additionally after a compressed file has been uploaded, a link on the confirmation page could be provided to the hidden page.

Sorry, I don't understand this, please re-explain it :-) Perhaps explain it from the point of view of a user, as the steps they take.

Now to the issue of allowed file types. The most secure thing to do would be to allow only certain file types. Files such as php files should not be allowed. Nor should any other executable file. The decompression will need to check for the file types and then filter out the ones we do not want.

Yes.

Have we come to some kind of decision on how the file types are going to work? How are we going to solve the problem of all the source files? Should we just input them all or what?

I think an exclude list is better than an include list - that is, we should exclude files with .exe .php and so on, and include any files not matching this ban-list.
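The single-click ZIP described in this message can be rebuilt whenever a file is added to the typeface. A minimal Python sketch follows, assuming one directory per typeface. `bundle_typeface` and its arguments are hypothetical names and nothing here comes from ccHost.

```python
import os
import zipfile


def bundle_typeface(typeface_dir, zip_path):
    """Write every file under typeface_dir into a fresh zip at zip_path."""
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subdirs, files in os.walk(typeface_dir):
            for name in sorted(files):
                full = os.path.join(folder, name)
                # Do not pack the archive into itself if it lives in the same directory.
                if os.path.abspath(full) == os.path.abspath(zip_path):
                    continue
                arcname = os.path.relpath(full, typeface_dir)
                archive.write(full, arcname)
                names.append(arcname)
    return names
```

Calling it again after each new upload replaces the old archive, so the download link always reflects the current contents of the directory.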
Re: [Openfontlibrary] ccHost compression
One can always change a file name extension to something else, so testing against the file extension is probably not useful. PHP's $_FILES['userfile']['type'] will indicate the file's mime type if provided by the browser, but I don't know how browsers determine the mime type for uploaded files. The *nix file command reads the file headers and determines file type based on the pattern of bytes in the headers of files -- that is the most reliable way to do it. But again, I don't know if browsers use a similar method or not.

- Ed

I think an exclude list is better than an include list - that is, we should exclude files with .exe .php and so on, and include any files not matching this ban-list.
Re: [Openfontlibrary] ccHost compression
I suppose the Report possible License violation feature could be duplicated/extended to Report possible malicious file so a simple machine filter like file extensions would have a social safety net.

The *nix file command reads the file headers and determines file type based on the pattern of bytes in the headers of files -- that is the most reliable way to do it.

Well, in the supposed upload zip, uncompress zip, if other files added, compress all the files into a new zip process, running the file command on the files to check their type matches their file extension at the uncompress zip and files added stages would be great. Brendan, what do you think? :-)

It sounds like you are describing user security. This is really a server security issue for me. Take a PHP file. What headers will it have? NONE! I have also looked at projects that read headers, and they primarily read audio file headers. Even HTML files will have to be disabled if php support is enabled for html files (which it is not).

With a PHP file being executed by the server, you may (depending on the way passwords were stored) be able to produce a dump of all the emails and stored passwords for them. Or say someone uploads an rpm file and then manages to execute it on the server? I am not a security expert, but do know basic security rules. Getting the file onto the server is the first big step in launching an attack. I have managed to hack several sites, gaining access to privileged database information this way, constructing a map of the database from error messages I purposefully evoked. All due to a lack of upload rules.

As for a blacklist, we would need to find a tried and true list as I doubt we would be able to come up with them all. And it would constantly change with the evolution of technology (.php3 .php4 .phtml .php + more) for php. Then there is Cold Fusion, ASP, Server Side Includes, Server-side JavaScript etc. This is just part of the web-based technologies that can cause an execution on the server. Although I am not familiar with many of them, many may have more than one extension.
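To make the blacklist concern concrete, here is a small Python sketch of an extension ban-list that looks at every suffix of a name, not only the last one. The set `BANNED_SUFFIXES` is purely illustrative: the .php variants and .exe come from this thread, while .asp, .cfm, .shtml and .cgi are guesses at the technologies mentioned, and the list is certainly incomplete, which is the point being made above.

```python
from pathlib import PurePosixPath

# Illustrative only; a real list would be longer and would keep changing.
BANNED_SUFFIXES = {
    ".php", ".php3", ".php4", ".phtml",   # named in the thread
    ".exe",                               # named in the thread
    ".asp", ".cfm", ".shtml", ".cgi",     # assumed extensions, not from the thread
}


def has_banned_extension(filename):
    """Return True if any dot-suffix of the name is on the ban-list."""
    suffixes = [s.lower() for s in PurePosixPath(filename).suffixes]
    return any(suffix in BANNED_SUFFIXES for suffix in suffixes)
```

Checking all suffixes catches names such as evil.php.jpg, which some server configurations may still hand to PHP. It does nothing for a renamed file, so it only makes sense alongside a content check.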
[Openfontlibrary] ccHost compression
(c) when any individual files are added to the typeface, create a new zip that includes everything

For what reason? Downloading? Is this essential or ideal?

(d) have the decompression work for any common format
(e) have the compression happen in a range of formats

So, everything is decompressed. Great.

Now. It looks as though people can fill out tags and also a description. We will not be able to do this while decompressing. The Name will have to take the form of the file name. I guess the easiest way to work this is to make them hidden by default. Navigating to the hidden files is confusing though. The file submission page could use consistent language (instead of publish now one could use hide this file). One could also rename the tab in the user page from hidden to unpublished or something like that. Additionally after a compressed file has been uploaded, a link on the confirmation page could be provided to the hidden page.

Now to the issue of allowed file types. The most secure thing to do would be to allow only certain file types. Files such as php files should not be allowed. Nor should any other executable file. The decompression will need to check for the file types and then filter out the ones we do not want. Have we come to some kind of decision on how the file types are going to work? How are we going to solve the problem of all the source files? Should we just input them all or what?

Thoughts or comments on any of the above?

Brendan