mime - library for parsing shared MIME database

2015-08-15 Thread FreeSlave via Digitalmars-d-announce
I'm currently working on a MIME library for D. Dub page:
http://code.dlang.org/packages/mime
It can parse MIME database files, including binary ones such as
mime.cache. It also has algorithms for detecting MIME types by
file name.


It's not fully implemented yet and does not have a stable API.
Issues and goals are listed on the GitHub page:
https://github.com/MyLittleRobo/mime


If anyone is interested in the project, I would be glad to
discuss the interface and implementation details of the library.


If you don't know what the shared MIME database is and why it
matters, read this:

http://standards.freedesktop.org/shared-mime-info-spec/shared-mime-info-spec-latest.html#idm140625831778224



Re: mime - library for parsing shared MIME database

2015-08-15 Thread Rikki Cattermole via Digitalmars-d-announce


I had a MIME implementation in Cmsed that was basically a hard-coded
file with a whole bunch of MIME types along with file extensions.


I would be interested in seeing whether this can match it 1:1 for
features, while not allocating. Say, give me the MIME type for a
payload.


Possibly with its own override/addition CSV files.



Re: mime - library for parsing shared MIME database

2015-08-16 Thread FreeSlave via Digitalmars-d-announce

On Sunday, 16 August 2015 at 03:56:45 UTC, Rikki Cattermole wrote:

I had a MIME implementation in Cmsed that was basically a
hard-coded file with a whole bunch of MIME types along with file
extensions.


I would be interested in seeing whether this can match it 1:1 for
features, while not allocating. Say, give me the MIME type for a
payload.


Possibly with its own override/addition CSV files.


This library focuses on the shared MIME database used on
freedesktop systems, usually for detecting file types in file
managers so that they can display an appropriate icon and choose
the correct default application (well, that's another spec) to
open a file. I don't think this suits the web world.


Your "hardcoded" approach is what is usually used on the web. If
I remember correctly, MIME types are hardcoded in Chromium too.
But a MIME type is not only about the extension. In general, a
pattern can be any glob pattern. That's why file managers can
detect the type of a Makefile as text/x-makefile, even though it
has no extension. The same goes for CMakeLists.txt: the preferred
type is text/x-cmake, not just text/plain.
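
To illustrate the glob idea, here is a minimal sketch in Python.
It is not this library's API: the GLOBS table and the function
name mime_type_by_name are made up for the example, and the
tie-breaking (highest weight, then longest pattern) is a
simplification of what the shared MIME database spec describes.

```python
from fnmatch import fnmatchcase
from typing import Optional

# Hypothetical, hand-written glob table: (weight, pattern, MIME type).
# A real database is read from the system's MIME directories.
GLOBS = [
    (50, "*.txt", "text/plain"),
    (50, "CMakeLists.txt", "text/x-cmake"),
    (50, "Makefile", "text/x-makefile"),
    (50, "*.py", "text/x-python"),
]


def mime_type_by_name(file_name: str) -> Optional[str]:
    """Return the best matching MIME type for a file name, or None."""
    # Collect every pattern that matches, case-sensitively.
    matches = [
        (weight, len(pattern), mime)
        for (weight, pattern, mime) in GLOBS
        if fnmatchcase(file_name, pattern)
    ]
    if not matches:
        return None
    # Prefer the highest weight, then the longest pattern.
    return max(matches)[2]
```

With this table, CMakeLists.txt matches both "*.txt" and the
literal "CMakeLists.txt" pattern, and the longer pattern wins, so
the result is text/x-cmake rather than text/plain.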


Patterns are not the only mechanism. Magic rules come to the
rescue when the MIME type can't be detected from the file name.
That's how Linux file managers tell a shell script from a Python
script even if neither has an extension (but they have a leading
comment like #!/bin/sh or #!/usr/bin/python). That's also how
file managers detect a file with the unknown pk3 extension (used
in Quake III based games) as a zip file, because by its contents
it really is just a zip file.
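
A rough sketch of content-based detection, again in Python and
again not this library's API. The MAGIC_RULES table and the name
mime_type_by_content are hypothetical; real magic rules also
support offset ranges, masks and nested sub-rules, which are left
out here.

```python
from typing import Optional

# Hypothetical magic table: (offset, expected bytes, MIME type).
MAGIC_RULES = [
    (0, b"PK\x03\x04", "application/zip"),         # zip local file header
    (0, b"#!/bin/sh", "application/x-shellscript"),
    (0, b"#!/usr/bin/python", "text/x-python"),
]


def mime_type_by_content(data: bytes) -> Optional[str]:
    """Return the MIME type of the first rule that matches, or None."""
    for offset, expected, mime in MAGIC_RULES:
        # Compare the bytes at the given offset with the expected value.
        if data[offset:offset + len(expected)] == expected:
            return mime
    return None
```

A pk3 file starts with the same bytes as any other zip archive,
so it is reported as application/zip no matter what it is called.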


The whole shared MIME database is system- and user-dependent, so
again it's probably not what you want to use for the web, unless
you manage the database yourself on the server. Even in that
case, patterns and magic rules are just hints. You can't rely on
them to check whether an uploaded file is of the required type.
For example, if you want to validate an image file, the only way
to do it is to parse the whole file. Still, you can use the hints
to reject obviously invalid files.
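
As a sketch of using a hint only as a pre-filter (Python; the
function name may_be_png is made up for the example): a failed
check rejects the upload early, while a passed check only means
the file is still worth parsing in full.

```python
# The fixed 8-byte signature that every PNG file starts with.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def may_be_png(data: bytes) -> bool:
    """Cheap hint: False means 'certainly not a PNG'.

    True does NOT mean the file is valid; the rest of the file
    still has to be parsed to find out.
    """
    return data.startswith(PNG_SIGNATURE)
```

A file that is just the signature followed by garbage passes this
check, which is exactly why it cannot replace real validation.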


I have not targeted non-allocating code yet, though I believe
it's possible to make MimeCache not allocate when detecting a
file type.