Re: [Openfontlibrary] ccHost compression

2008-11-05 Thread George Williams
On Mon, 2008-11-03 at 13:46, Ed Trager wrote:
 at the *nix file command source code, I bet you could fairly easily
 find a reference to the magic file header bytes that are used to
 detect TTF/OTF files and then add this to the getId3() stuff, assuming
 that getId3() is well-written.
OpenType fonts begin with the four bytes
   OTTO

TrueType fonts begin with either
   0x00010000
   true
   0x00020000 (I doubt this will occur in a user-supplied font)

TTC fonts begin with
   ttcf

There are some ancient apple/adobe sfnt formats which start with
   typ1
   CID 
 but these aren't likely to be generated any more.

PFB files begin with
   0x8001
PFA files begin with
   %!PS-AdobeFont-1.0
(Other adobe fonts probably also start with this, so it might just mean
PostScript font).
Bare CFF fonts (the thingies inside otf files) begin with
   0x010004

BDF fonts begin with
   STARTFONT

PCF fonts begin with
   0x01 fcp

sfd files begin with
   SplineFontDB:

Ikarus fonts begin with
   IK 0x0055

(There are some exceedingly complex rules for recognizing fonts in mac
resource forks, or mac dfont format (almost the same) but I doubt it's
worth implementing them).

On Tue, 2008-11-04 at 05:02, Brendan Ferguson wrote:
 Say, will any of the font source files read like a unix script file  
 with #!/ as the first bits of information in the file?
I am not aware of any font format which starts with #!/

On Tue, 2008-11-04 at 14:17, Dave Crossland wrote:
 I think with extensions and the file command we can reliably detect
 images as images and have them displayed instead of for download :-)
png images start with
  0x89 PNG
jpeg images contain JFIF in bytes 6,7,8,9 (counting from 0)
gif files start with
   GIF87
   GIF89
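
A minimal sketch of how the signatures listed above could be checked, assuming Python rather than the PHP that ccHost uses. The names SIGNATURES, sniff and sniff_file are hypothetical. The TrueType entry assumes the full four-byte version value (00 01 00 00), and the JFIF offset is counted from 0. The rarer formats (typ1, CID, bare CFF, Ikarus) are left out to keep it short.

```python
from typing import Optional

# (offset, magic bytes, label). Offsets count from 0.
# Entries follow the lists in this message; labels are illustrative only.
SIGNATURES = [
    (0, b"OTTO", "OpenType (CFF)"),
    (0, b"\x00\x01\x00\x00", "TrueType"),  # assumed four-byte form
    (0, b"true", "TrueType (Apple)"),
    (0, b"ttcf", "TrueType Collection"),
    (0, b"\x80\x01", "PostScript Type 1 (PFB)"),
    (0, b"%!PS-AdobeFont-1.0", "PostScript font (PFA)"),
    (0, b"STARTFONT", "BDF"),
    (0, b"\x01fcp", "PCF"),
    (0, b"SplineFontDB:", "FontForge SFD"),
    (0, b"\x89PNG", "PNG image"),
    (6, b"JFIF", "JPEG image"),
    (0, b"GIF87", "GIF image"),
    (0, b"GIF89", "GIF image"),
]


def sniff(data: bytes) -> Optional[str]:
    """Return a label for the first matching signature, or None."""
    for offset, magic, label in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return label
    return None


def sniff_file(path: str) -> Optional[str]:
    """Read only the first few bytes of a file and sniff them."""
    with open(path, "rb") as fh:
        return sniff(fh.read(32))
```

Anything that returns None would simply be treated as "not a recognised font or image"; what to do with such files is a separate policy question.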


___
Openfontlibrary mailing list
Openfontlibrary@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/openfontlibrary


Re: [Openfontlibrary] ccHost compression

2008-11-04 Thread Dave Crossland
2008/11/3 Brendan Ferguson [EMAIL PROTECTED]:

 I have joined the development mailing list. Waiting for my first mail.

Which dev list? :-)

 One note of concern that I will research. If someone starts with a .html
 file and adds php content, then uploads it and renames it to .php, a script
 could be executed if the detect script does not register it as a php file

I've tested this, and the file command doesn't detect a file as a PHP
file if its first bytes are

<html>

while having <?php echo "oops" ?> on the 2nd line.
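
A check that only looks at the first bytes misses this case, so one option is to scan the whole upload for a PHP open tag. The following is only a sketch in Python (not ccHost code), and contains_php is a hypothetical helper name. It looks for "<?php" and the short echo tag "<?=" and deliberately ignores the bare "<?" short tag, which would also match XML declarations.

```python
import re

# Matches "<?php" (any case) and the short echo tag "<?=" anywhere in the data.
PHP_OPEN_TAG = re.compile(rb"<\?(php|=)", re.IGNORECASE)


def contains_php(data: bytes) -> bool:
    """Return True if a PHP open tag appears anywhere in the data."""
    return PHP_OPEN_TAG.search(data) is not None
```

This would flag the HTML-then-PHP file described above, but it is a heuristic: a font or image could contain those bytes by coincidence, so a hit is a reason to reject or review, not proof of an attack.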


Re: [Openfontlibrary] ccHost compression

2008-11-04 Thread Brendan Ferguson

 Sounds like you are an expert around here :-)


But I have not done any coding in 4 years..

Brendan


Re: [Openfontlibrary] ccHost compression

2008-11-04 Thread Ben Weiner
Hi,

Dave Crossland wrote:
 2008/11/3 Brendan Ferguson [EMAIL PROTECTED]:
   
 Getting the
 file onto the server is the first big step in launching an attack.
 

 We can set the webserver to send files for download, so neither the
 webserver nor the web browser will interpret them.

 So could we accept all files, but make them only for download, and
 tell site visitors to report problems to us if there are dodgy files?

 http://www.thingy-ma-jig.co.uk/blog/06-08-2007/force-a-pdf-to-download
 explains how to do this for *.pdf files in a case insensitive,
 cross-browser way.
   

This download-as-dumb-data policy, combined with ccHost's 
file-verification capabilities, seems adequate to me. I do see the 
potential for attacks based on the contents of an upload, but why should 
we accept uploaded HTML files and why should we allow any uploaded file 
to be executed by Apache?

I believe what is needed is this:

- accept upload as either loose files or an archive (.tgz, .zip, perhaps 
.7zip and .bzip)
- if this is a new typeface, create a directory for it inside the user's 
directory
- unarchive everything once the archive has been uploaded, *replacing 
any files with the same name*

And then have download links for each individual file and a .tgz (or 
perhaps better a .zip) for the whole directory.

That's different in detail to what ccHost does right now, but it's 
compatible in spirit. It also leaves the way open for access via special 
URLs for package maintaining scripts or whatever with no need for human 
intervention.
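
The unpacking steps listed above could be sketched roughly as follows. This is Python for illustration only, handles just .zip, and unpack_upload is a hypothetical name. It also assumes we want to refuse archive entries that would land outside the typeface directory (paths such as "../x"), which the list above does not mention but which seems prudent.

```python
import zipfile
from pathlib import Path


def unpack_upload(archive_path, user_dir, typeface):
    """Extract a zip into <user_dir>/<typeface>, replacing same-named files."""
    target = Path(user_dir) / typeface
    target.mkdir(parents=True, exist_ok=True)  # new typeface: create its directory
    root = target.resolve()

    with zipfile.ZipFile(archive_path) as zf:
        # First pass: validate every entry before writing anything.
        planned = []
        for info in zf.infolist():
            if info.is_dir():
                continue
            dest = (root / info.filename).resolve()
            if root not in dest.parents:
                raise ValueError(f"unsafe path in archive: {info.filename}")
            planned.append((info, dest))

        # Second pass: write files, overwriting any existing file of that name.
        written = []
        for info, dest in planned:
            dest.parent.mkdir(parents=True, exist_ok=True)
            dest.write_bytes(zf.read(info))
            written.append(dest)
    return written
```

Validating all entries before writing means a bad archive leaves the directory untouched rather than half-updated.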

Cheers,
Ben


Re: [Openfontlibrary] ccHost compression

2008-11-04 Thread Brendan Ferguson
 We can set the webserver to send files for download, so neither the
 webserver nor the web browser will interpret them.

I imagine that even if the files are set for download, they will still be  
interpreted. If, say, I set up a GIF so that PHP runs through it, and then  
force the download header, it will probably download an interpreted GIF.

Now if you changed the type of file to say text, this might work...  
Probably. But you will not be able to view any of the images any more,  
the browser would be treating them like text. :(

There are Apache config directives that can disable PHP and CGI for a  
specific directory, though. I just spent some time playing with them. It  
seems as though we will have to put them in our own server config files.  
They are not universally accepted in .htaccess files.

I can see if I can change the permissions of the files that are  
uploaded so there is read and write access, but not execution access.  
Not sure if this will work, but worth a try.
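
For what it is worth, stripping the execute bits from an uploaded file is a one-liner in most languages. Here is a small Python sketch; strip_execute_bits is a hypothetical name. As far as I know, an Apache PHP module only needs read access to interpret a script, so this would guard against CGI-style execution rather than against PHP being run.

```python
import os
import stat


def strip_execute_bits(path):
    """Remove the execute permission for user, group and others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    new_mode = mode & ~(stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    os.chmod(path, new_mode)
    return new_mode
```

Read and write permissions are left exactly as they were.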

Other than that, we will just have to rely on our blacklist, which  
should also block some Windows executables to prevent people from  
uploading viruses, which will not affect the server but, when  
downloaded, could affect the clients.

Another option, which I am really not up to coding, would be to rename  
the files when they are uploaded and use a database to connect all  
the original file names with the randomly generated file names we  
rename them to. Then we never link directly to any file, but use a  
script to send the files when they are asked for. This way, even if  
someone got something ugly up onto the server, and they did somehow  
have execution permissions, they would not know what file to call.
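
A rough sketch of that renaming idea, in Python with SQLite purely for illustration; the table layout and the function names are assumptions, not anything ccHost provides. Each upload is stored under a random name with no extension, and the database maps it back to the original name for the download script.

```python
import secrets
import sqlite3


def open_store(db_path=":memory:"):
    """Open (or create) the table that maps stored names to original names."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS uploads ("
        "stored_name TEXT PRIMARY KEY, original_name TEXT NOT NULL)"
    )
    return conn


def register_upload(conn, original_name):
    """Pick an unguessable stored name and remember the original name."""
    stored_name = secrets.token_hex(16)  # 32 hex characters, no extension
    conn.execute(
        "INSERT INTO uploads (stored_name, original_name) VALUES (?, ?)",
        (stored_name, original_name),
    )
    conn.commit()
    return stored_name


def original_name_for(conn, stored_name):
    """Look up the original name, or return None if the name is unknown."""
    row = conn.execute(
        "SELECT original_name FROM uploads WHERE stored_name = ?",
        (stored_name,),
    ).fetchone()
    return row[0] if row else None
```

Because the stored name has no extension, the webserver also has no extension-based reason to hand the file to an interpreter.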



Re: [Openfontlibrary] ccHost compression

2008-11-04 Thread Jon Phillips
On Tue, 2008-11-04 at 08:20 +0000, Dave Crossland wrote:
 2008/11/3 Ed Trager [EMAIL PROTECTED]:
 
  The PHP getId3() library is at http://getid3.sourceforge.net/.  It
  might be worth looking into how to expand this library to recognize
  the TTF and OTF file headers, perhaps?  The idea here seems quite
  similar to what the *Nix file command does.  If someone were to look
  at the *nix file command source code, I bet you could fairly easily
  find a reference to the magic file header bytes that are used to
  detect TTF/OTF files and then add this to the getId3() stuff, assuming
  that getId3() is well-written.
 
 Ben Weiner has been looking at ways to extract metadata from font
 files directly, but I think he gave up because he couldn't complete it
 in the time he had to allocate to it. He was looking at the TTX
 tools for this, I think.
 
 Anyway, since none of our files have ID3 tags inside them, it seems to
 me that OFLB can get rid of getID3() and replace it with a PHP wrapper
 around the file command, perhaps combined with TTX.

getID3() reads and writes metadata for different file formats, using the
standard kind of metadata for each format, and not just ID3, which is
only for MP3.

The project has a bad misleading name ;)

Jon

-- 
Jon Phillips
San Francisco, CA + Guangzhou + Beijing
GLOBAL +1.415.830.3884
CHINA +86.1.360.282.8624
[EMAIL PROTECTED]
http://www.rejon.org
IM/skype: kidproto
Jabber: [EMAIL PROTECTED]
IRC: [EMAIL PROTECTED]



Re: [Openfontlibrary] ccHost compression

2008-11-04 Thread Dave Crossland
2008/11/4 Brendan Ferguson [EMAIL PROTECTED]:

 This is not really my area of expertise. I was primarily a php
 programmer who made websites, content management systems and such.
 Also did website design using DHTML and Usability.

Sounds like you are an expert around here :-)

 The extent of unix i know is to get my web servers up and running on
 my own boxes.

:-)


Re: [Openfontlibrary] ccHost compression

2008-11-04 Thread Dave Crossland
2008/11/4 Brendan Ferguson [EMAIL PROTECTED]:

 Say, will any of the font source files read like a unix script file with #!/
 as the first bits of information in the file?

Maybe. There is a font on OFLB now that is a SFD and has a
makeOTF.sh file uploaded too. I forget which one though :(


Re: [Openfontlibrary] ccHost compression

2008-11-04 Thread Dave Crossland
2008/11/3 Ed Trager [EMAIL PROTECTED]:

 The PHP getId3() library is at http://getid3.sourceforge.net/.  It
 might be worth looking into how to expand this library to recognize
 the TTF and OTF file headers, perhaps?  The idea here seems quite
 similar to what the *Nix file command does.  If someone were to look
 at the *nix file command source code, I bet you could fairly easily
 find a reference to the magic file header bytes that are used to
 detect TTF/OTF files and then add this to the getId3() stuff, assuming
 that getId3() is well-written.

Ben Weiner has been looking at ways to extract metadata from font
files directly, but I think he gave up because he couldn't complete it
in the time he had to allocate to it. He was looking at the TTX
tools for this, I think.

Anyway, since none of our files have ID3 tags inside them, it seems to
me that OFLB can get rid of getID3() and replace it with a PHP wrapper
around the file command, perhaps combined with TTX.
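
A wrapper around the file command is short in any language. Here is a sketch in Python rather than PHP, with file_type as a hypothetical name. It assumes the file program is on the PATH and returns None when it is not, or when the command fails.

```python
import shutil
import subprocess


def file_type(path):
    """Return the description printed by `file -b`, or None if unavailable."""
    exe = shutil.which("file")
    if exe is None:
        return None
    # -b ("brief") prints only the description, without the file name.
    result = subprocess.run(
        [exe, "-b", "--", path],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()
```

Passing the arguments as a list avoids going through a shell, so an oddly named upload cannot inject extra commands.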


Re: [Openfontlibrary] ccHost compression

2008-11-03 Thread Dave Crossland
2008/11/3 Brendan Ferguson [EMAIL PROTECTED]:

 By headers Ed means the first few bytes of the file. So the file
 command does indeed identify PHP files perfectly:

 I will take your word on it. I am clearly not up to date on this.

Perhaps the file manual will help? :-)

$ man file

 Will it detect a PHP file with a .html file if I have enabled PHP support in
 .html files?

Well, it detects PHP files when renamed as HTML files:

$ cp index.php index.html
$ file index.html
index.html: PHP script text


Re: [Openfontlibrary] ccHost compression

2008-11-03 Thread Dave Crossland
2008/11/3 Brendan Ferguson [EMAIL PROTECTED]:

 It sounds like you are describing user security. This is really a server
 security issue for me.

 Take a PHP file. What headers will it have? NONE!

By headers Ed means the first few bytes of the file. So the file
command does indeed identify PHP files perfectly:

$ file web/www/index.php
web/www/index.php: PHP script text
$ cp web/www/index.{php,rpm}
$ file web/www/index.rpm
web/www/index.rpm: PHP script text
$

So I think we can use this program to reliably detect file types,
whatever extension they may have, and filter them accordingly.


Re: [Openfontlibrary] ccHost compression

2008-11-03 Thread Ben Weiner
Hi,

Just to add a bookend: ccHost ships with two means of filtering files. 
Both are additive - IOW nothing is allowed until explicitly permitted (I 
prefer this, and ccHost is at v5...)

The two methods:
- clever: getID3 (PHP file-identifying lib). Useless to OFLB ATM as 
nobody's added any fonts to the library.
- less clever: signature recognition (called 'pseudo verify'). Already 
in place in the current OFLB live site for TTF, OTF, PFA etc (done by 
the brave souls who set up the site) and perfectly reasonable IMO. It 
basically does a head command on the file (so more basic than the file 
command).

I tried to co-opt some of the file verification code in the new typeface 
page (reasoning that the site already knew from info added to the pseudo 
verify system which types of files belonged to each typeface) but gave 
up. Instead I just made my own list of file extensions to check against 
the filename (the 'get you home' solution :-).

Cheers,
Ben


Re: [Openfontlibrary] ccHost compression

2008-11-03 Thread Ed Trager
Hi, Brendan,

The PHP getId3() library is at http://getid3.sourceforge.net/.  It
might be worth looking into how to expand this library to recognize
the TTF and OTF file headers, perhaps?  The idea here seems quite
similar to what the *Nix file command does.  If someone were to look
at the *nix file command source code, I bet you could fairly easily
find a reference to the magic file header bytes that are used to
detect TTF/OTF files and then add this to the getId3() stuff, assuming
that getId3() is well-written.

- Ed

On Mon, Nov 3, 2008 at 2:06 PM, Brendan Ferguson [EMAIL PROTECTED] wrote:
 Could you point me to the page for this script? I would like to read
 more about it.



Re: [Openfontlibrary] ccHost compression

2008-11-03 Thread Jon Phillips
Yes, I think worthy. We have done this for SVG on http://openclipart.org

It's for more than just ID3 now ;) Should be more like
readWriteMetadataWithPHP() 

;)

Jon

On Mon, 2008-11-03 at 16:46 -0500, Ed Trager wrote:
 Hi, Brendan,
 
 The PHP getId3() library is at http://getid3.sourceforge.net/.  It
 might be worth looking into how to expand this library to recognize
 the TTF and OTF file headers, perhaps?  The idea here seems quite
 similar to what the *Nix file command does.  If someone were to look
 at the *nix file command source code, I bet you could fairly easily
 find a reference to the magic file header bytes that are used to
 detect TTF/OTF files and then add this to the getId3() stuff, assuming
 that getId3() is well-written.
 
 - Ed
 
 On Mon, Nov 3, 2008 at 2:06 PM, Brendan Ferguson [EMAIL PROTECTED] wrote:
  Could you point me to the page for this script? I would like to read
  more about it.
 


Re: [Openfontlibrary] ccHost compression

2008-11-02 Thread Dave Crossland
2008/11/2 Brendan Ferguson [EMAIL PROTECTED]:

 (c) when any individual files are added to the typeface, create a new
 zip that includes everything

 For what reason? Downloading? Is this essential or ideal?

Here is the use-case scenario that this is for:

Mary, soccer mom and scrap book hobbyist, is searching the web for
free fonts. She finds OFLB and wants to browse all the fonts in the
library at once, and quickly download all the files for the dozen
typeface families she thinks are cute. She sees something about how
these fonts are free as in no price, but also free as in she can
change them, and she bookmarks the site to learn more about all that
later.

More simply: A user goes to a font's page, and wants to download all
the files associated with that font - font files, font sources,
license, FONTLOG, everything. A ZIP file with everything, available
with a single click, is ideal for this.
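
Building the everything-in-one ZIP is straightforward once all of a font's files live in one directory. The following is a Python sketch only; build_family_zip is a hypothetical name, and the one-directory-per-typeface layout is an assumption. It skips the output archive itself in case the ZIP is written into the same directory.

```python
import zipfile
from pathlib import Path


def build_family_zip(typeface_dir, zip_path):
    """Zip every file under typeface_dir, keeping relative paths."""
    root = Path(typeface_dir)
    out = Path(zip_path).resolve()
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            # Skip directories and the archive being written.
            if path.is_file() and path.resolve() != out:
                zf.write(path, path.relative_to(root).as_posix())
    return zip_path
```

Re-running it whenever an individual file is added keeps the single-click download in step with the directory.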

 . Now. It looks as though people can fill out tags and also a
 description. We will not be able to do this while decompressing. The
 Name will have to take the form of the file name.

The name is the human name for the overall collection of files, and is
not directly related to those files. The tags and description are like
that too.

The upload form has a user fill these things in when they say the
location of the first file on the disk to be uploaded, and when they
click upload then the decompression would happen.

 I guess the easiest way to work this is to make them hidden by
 default. Navigating to the hidden files is confusing though. A
 consistent language on the file submission would help (instead of "publish
 now" one could use "hide this file"). One could also rename the tab in the
 user page from "hidden" to "unpublished" or something like that.
 Additionally after a compressed file has been uploaded, a link on the
 confirmation page could be provided to the hidden page.

Sorry, I don't understand this, please re-explain it :-) Perhaps
explain it from the point of view of a user, as the steps they take.

 Now to the issue of allowed file types. The most secure thing to do
 would be to allow only certain file types. Files such as PHP files should
 not be allowed. Nor should any other executable file. The
 decompression will need to check the file types and then filter
 out the ones we do not want.

Yes.

 Have we come to some kind of decision on how the file types are going to
 work? How are we going to solve the problem of all the source files?
 Should we just input them all or what?

I think an exclude list is better than an include list - that is, we
should exclude files with .exe .php and so on, and include any files
not matching this ban-list.
___
Openfontlibrary mailing list
Openfontlibrary@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/openfontlibrary


Re: [Openfontlibrary] ccHost compression

2008-11-02 Thread Ed Trager
One can always change a file name extension to something else, so
testing against the file extension is probably not useful.  PHP's
$_FILES['userfile']['type'] will indicate the file's mime type if
provided by the browser, but I don't know how browsers determine the
mime type for uploaded files.  The *nix file command reads the file
headers and determines file type based on the pattern of bytes in the
headers of files -- that is the most reliable way to do it.  But
again, I don't know if browsers use a similar method or not.

- Ed

 I think an exclude list is better than an include list - that is, we
 should exclude files with .exe .php and so on, and include any files
 not matching this ban-list.


Re: [Openfontlibrary] ccHost compression

2008-11-02 Thread Brendan Ferguson
 I suppose the Report possible License violation feature could be
 duplicated/extended to Report possible malicious file so a simple
 machine filter like file extensions would have a social safety net.

 The *nix file command reads the file
 headers and determines file type based on the pattern of bytes in the
 headers of files -- that is the most reliable way to do it.

 Well, in the supposed "upload zip, uncompress zip, if other files
 added, compress all the files into a new zip" process, running the
 file command on the files to check that their type matches their file
 extension at the "uncompress zip" and "files added" stages would be
 great.

 Brendan, what do you think? :-)


It sounds like you are describing user security. This is really a  
server security issue for me.

Take a PHP file. What headers will it have? NONE! I have also looked  
at projects that read headers, and they primarily read audio file  
headers. Even HTML files will have to be disabled if PHP support is  
enabled for HTML files (which it is not). With a PHP file being  
executed by the server, you may (depending on the way passwords were  
stored) be able to produce a dump of all the emails and stored  
passwords for them. Or say someone uploads an rpm file and then manages  
to execute it on the server?

I am not a security expert, but do know basic security rules. Getting  
the file onto the server is the first big step in launching an attack.  
I have managed to hack several sites and gain access to privileged  
database information this way, constructing a map of the database from  
error messages I purposefully provoked. All due to a lack of upload rules.

As per a blacklist, we would need to find a tried and true list, as I  
doubt we would be able to come up with them all. And it would  
constantly change with the evolution of technology  
(.php3 .php4 .phtml .php + more for PHP alone). Then there is Cold Fusion,  
ASP, Server Side Includes, Server-side JavaScript etc. This is just  
part of the web-based technologies that can cause code execution on the  
server. Although I am not familiar with many of them, many may have  
more than one extension.
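
To make the maintenance problem concrete, here is a small Python sketch comparing the two approaches by file extension. Both lists are invented examples, not OFLB policy, and the function names are hypothetical. The point is only that a blocklist has to enumerate every dangerous extension, while an allowlist fails closed.

```python
from pathlib import PurePath

# Hypothetical lists for illustration; a real site would choose its own.
ALLOWED_EXTENSIONS = {".ttf", ".otf", ".pfa", ".pfb", ".sfd", ".bdf", ".txt", ".png"}
BLOCKED_EXTENSIONS = {".php", ".exe"}


def allowed_by_allowlist(filename):
    """Accept only files whose final extension is explicitly permitted."""
    return PurePath(filename).suffix.lower() in ALLOWED_EXTENSIONS


def allowed_by_blocklist(filename):
    """Accept everything except files whose final extension is banned."""
    return PurePath(filename).suffix.lower() not in BLOCKED_EXTENSIONS
```

With these lists a file named shell.phtml passes the blocklist but is refused by the allowlist. Neither check says anything about what is actually inside the file.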


[Openfontlibrary] ccHost compression

2008-11-01 Thread Brendan Ferguson

 (c) when any individual files are added to the typeface, create a new
 zip that includes everything

For what reason? Downloading? Is this essential or ideal?


 (d) have the decompression work for any common format

 (e) have the compression happen in a range of formats


So, everything is decompressed. Great.

. Now. It looks as though people can fill out tags and also a  
description. We will not be able to do this while decompressing. The  
Name will have to take the form of the file name.

I guess the easiest way to work this is to make them hidden by  
default. Navigating to the hidden files is confusing though. A  
consistent language on the file submission would help (instead of "publish  
now" one could use "hide this file"). One could also rename the tab in the  
user page from "hidden" to "unpublished" or something like that.  
Additionally, after a compressed file has been uploaded, a link on the  
confirmation page could be provided to the hidden page.

Now to the issue of allowed file types. The most secure thing to do  
would be to allow only certain file types. Files such as PHP files should  
not be allowed. Nor should any other executable file. The  
decompression will need to check the file types and then filter  
out the ones we do not want.

Have we come to some kind of decision on how the file types are going to  
work? How are we going to solve the problem of all the source files?  
Should we just input them all or what?

Thoughts or comments on any of the above?

Brendan
