Filename suffix mappings for compression formats
------------------------------------------------

                 Key: COMPRESS-68
                 URL: https://issues.apache.org/jira/browse/COMPRESS-68
             Project: Commons Compress
          Issue Type: New Feature
            Reporter: Jukka Zitting
            Priority: Minor


There are many file name suffix conventions, such as .tgz for gzipped .tar 
files and .svgz for gzipped .svg files. It would be useful if Commons Compress 
knew about these conventions and provided tools that help client applications 
apply them.

For example, in Apache Tika we currently have the following custom code to 
deduce the original filename from a gzipped file:

{code}
    if (name.endsWith(".tgz")) {
        name = name.substring(0, name.length() - 4) + ".tar";
    } else if (name.endsWith(".gz") || name.endsWith("-gz")) {
        name = name.substring(0, name.length() - 3);
    } else if (name.toLowerCase().endsWith(".svgz")) {
        name = name.substring(0, name.length() - 1);
    } else if (name.toLowerCase().endsWith(".wmz")) {
        name = name.substring(0, name.length() - 1) + "f";
    } else if (name.toLowerCase().endsWith(".emz")) {
        name = name.substring(0, name.length() - 1) + "f";
    }
{code}

It would be nice if we could instead do something like this:

{code}
    name = GzipUtils.getGunzipFilename(name);
{code}
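
As a rough illustration of how such a helper might be built, here is a minimal 
table-driven sketch. The class name GunzipNames and the suffix table are 
assumptions made for this example only, not an existing Commons Compress API. 
Unlike the Tika snippet above, this sketch matches every suffix 
case-insensitively and leaves a name unchanged when it consists of nothing but 
the suffix.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Commons Compress.
final class GunzipNames {

    // Compressed suffix -> replacement suffix, mirroring the Tika rules.
    private static final Map<String, String> SUFFIXES = new LinkedHashMap<>();
    static {
        SUFFIXES.put(".tgz", ".tar");
        SUFFIXES.put(".svgz", ".svg");
        SUFFIXES.put(".wmz", ".wmf");
        SUFFIXES.put(".emz", ".emf");
        SUFFIXES.put(".gz", "");
        SUFFIXES.put("-gz", "");
    }

    private GunzipNames() {
    }

    // Returns the presumed original name, or the input if no suffix matches.
    static String getGunzipFilename(String name) {
        for (Map.Entry<String, String> entry : SUFFIXES.entrySet()) {
            String suffix = entry.getKey();
            int start = name.length() - suffix.length();
            // Require a non-empty base name and a case-insensitive suffix match.
            if (start > 0
                    && name.regionMatches(true, start, suffix, 0, suffix.length())) {
                return name.substring(0, start) + entry.getValue();
            }
        }
        return name;
    }
}
```

Keeping the conventions in one table means a new suffix needs only one more 
entry, and the same table could be inverted to go from an original name back 
to a compressed one.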



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.