[ 
https://issues.apache.org/jira/browse/COUCHDB-583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793569#action_12793569
 ] 

Filipe Manana commented on COUCHDB-583:
---------------------------------------

Hi Adam,

I see. I am just thinking about how to do it without breaking backward 
compatibility.

Currently couch_file:append_term[_md5]/2 calls term_to_binary and prefixes the 
result with a 32-bit header and an optional md5 digest. The 32-bit header + 
optional term md5 + term_to_binary(Term) is then appended to the end of the DB 
file.

The high-order bit of this header indicates whether an md5 hash follows the 
header (preceding the serialized term).
Without looking deeply into the code, I am thinking of adding an extra bit to 
the header to indicate whether the term is compressed.
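
To make the layout concrete, here is a rough sketch of the framing described 
above, with the proposed compression bit added. It is written in Python purely 
for illustration and is not couch_file's actual code. Assumptions: the header 
bits below the flags hold the byte length of what follows, the compression 
flag sits in the second-highest bit, zlib is the codec, and the names frame, 
unframe, COMPRESSED_FLAG and LENGTH_MASK are made up. The payload stands in 
for term_to_binary(Term).

```python
import hashlib
import struct
import zlib

MD5_FLAG = 1 << 31                 # high-order bit: an md5 digest follows the header
COMPRESSED_FLAG = 1 << 30          # hypothetical new bit: the payload is compressed
LENGTH_MASK = COMPRESSED_FLAG - 1  # assumption: remaining 30 bits hold the body length


def frame(payload: bytes, with_md5: bool = False, compress: bool = False) -> bytes:
    """Build 32-bit header + optional md5 + payload, ready to append to a file."""
    flags = 0
    if compress:
        payload = zlib.compress(payload)
        flags |= COMPRESSED_FLAG
    digest = b""
    if with_md5:
        # Assumption: the digest covers the bytes as stored (after compression).
        digest = hashlib.md5(payload).digest()
        flags |= MD5_FLAG
    length = len(digest) + len(payload)
    if length > LENGTH_MASK:
        raise ValueError("payload too large for the length field")
    return struct.pack(">I", flags | length) + digest + payload


def unframe(blob: bytes) -> bytes:
    """Reverse of frame(): check the md5 if present, decompress if flagged."""
    (header,) = struct.unpack(">I", blob[:4])
    body = blob[4:4 + (header & LENGTH_MASK)]
    if header & MD5_FLAG:
        digest, body = body[:16], body[16:]
        if hashlib.md5(body).digest() != digest:
            raise ValueError("md5 mismatch")
    if header & COMPRESSED_FLAG:
        body = zlib.decompress(body)
    return body
```

With an explicit flag the reader never has to guess: unframe(frame(term, 
compress=True)) and unframe(frame(term)) both return the original bytes, and 
the header alone says which path was taken.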

Of course this implies adding a new DB header version value, etc. Not as 
straightforward as attachment compression.

An (ugly) alternative I see is to not add a new header bit and, when reading a 
serialized term from the DB file, always try to gunzip it and catch the 
exception:

3> catch(zlib:gunzip(<<"hello world">>)).
{'EXIT',{data_error,[{zlib,call,3},
                     {zlib,inflate,2},
                     {zlib,gunzip,1},
                     {erl_eval,do_apply,5},
                     {erl_eval,expr,5},
                     {shell,exprs,6},
                     {shell,eval_exprs,6},
                     {shell,eval_loop,3}]}}

The issue is that a data_error exception does not necessarily mean that the 
data was never gzip compressed: corrupted gzip data raises the same error.
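
A small sketch of why the try-and-catch approach is ambiguous, again in Python 
only for illustration (the Erlang session above is the real thing). The 
function name read_maybe_gzipped is hypothetical. It treats any decompression 
failure as "was never compressed", which is exactly the weak spot.

```python
import gzip
import zlib


def read_maybe_gzipped(data: bytes) -> bytes:
    """Try to gunzip; on any zlib error, assume the data was stored uncompressed."""
    try:
        return zlib.decompress(data, wbits=31)  # wbits=31 selects the gzip container
    except zlib.error:
        return data


good = gzip.compress(b"hello world")
# Flip one byte of the CRC32 trailer to simulate on-disk corruption.
damaged = good[:-5] + bytes([good[-5] ^ 0xFF]) + good[-4:]

plain_result = read_maybe_gzipped(b"hello world")  # never compressed: returned as-is
good_result = read_maybe_gzipped(good)             # decompressed normally
damaged_result = read_maybe_gzipped(damaged)       # corruption silently passed through
```

Both the plain input and the damaged gzip input end up in the except branch, 
so the caller cannot tell "not compressed" apart from "compressed but corrupt".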

If using an extra header bit, then why not add a few more bits reserved for 
future features? A little like most protocol RFCs do: they reserve a few bits 
in a header for future use :)
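
As a rough illustration of the reserved-bits idea (Python, hypothetical layout, 
not an agreed format): assume 4 flag bits at the top of the 32-bit header, two 
of them reserved and required to be zero, with the lower 28 bits left for the 
length. The names parse_header and RESERVED_MASK are made up.

```python
import struct

MD5_FLAG = 1 << 31                     # existing flag: md5 digest follows the header
COMPRESSED_FLAG = 1 << 30              # hypothetical flag: term is compressed
RESERVED_MASK = (1 << 29) | (1 << 28)  # hypothetical reserved bits, must be zero for now
LENGTH_MASK = (1 << 28) - 1            # assumption: lower 28 bits carry the length


def parse_header(raw: bytes):
    """Return (has_md5, is_compressed, length) for a 4-byte big-endian header."""
    (word,) = struct.unpack(">I", raw)
    if word & RESERVED_MASK:
        # A reader that does not understand a reserved bit should refuse the
        # record instead of misinterpreting it.
        raise ValueError("reserved header bits are set")
    return bool(word & MD5_FLAG), bool(word & COMPRESSED_FLAG), word & LENGTH_MASK
```

The trade-off is that every bit taken from the header halves the largest 
length it can describe: with 28 bits left, that is 256 MiB per term.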

What's your opinion?

cheers



> adding ?compression=(gzip|deflate) optional parameter to the attachment 
> download API
> ------------------------------------------------------------------------------------
>
>                 Key: COUCHDB-583
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-583
>             Project: CouchDB
>          Issue Type: New Feature
>          Components: HTTP Interface
>         Environment: CouchDB trunk revision 885240
>            Reporter: Filipe Manana
>         Attachments: couchdb-583-trunk-3rd-try.patch, 
> couchdb-583-trunk-4th-try-trunk.patch, couchdb-583-trunk-5th-try.patch, 
> couchdb-583-trunk-6th-try.patch, jira-couchdb-583-1st-try-trunk.patch, 
> jira-couchdb-583-2nd-try-trunk.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The following new feature is added in the patch following this ticket 
> creation.
> A new optional http query parameter "compression" is added to the attachments 
> API.
> This parameter can have one of the values:  "gzip" or "deflate".
> When asking for an attachment (GET http request), if the query parameter 
> "compression" is found, CouchDB will send the attachment compressed to the 
> client (and sets the header Content-Encoding with gzip or deflate).
> Further, it adds a new config option "treshold_for_chunking_comp_responses" 
> (httpd section) that specifies an attachment length threshold. If an 
> attachment has a length >= this threshold, the http response will be 
> chunked (besides compressed).
> Note that using non chunked compressed  body responses requires storing all 
> the compressed blocks in memory and then sending each one to the client. This 
> is a necessary "evil", as we only know the length of the compressed body 
> after compressing all the body, and we need to set the "Content-Length" 
> header for non chunked responses. By sending chunked responses, we can send 
> each compressed block immediately, without accumulating all of them in memory.
> Examples:
> $ curl http://localhost:5984/testdb/testdoc1/readme.txt?compression=gzip
> $ curl http://localhost:5984/testdb/testdoc1/readme.txt?compression=deflate
> $ curl http://localhost:5984/testdb/testdoc1/readme.txt   # attachment will 
> not be compressed
> $ curl http://localhost:5984/testdb/testdoc1/readme.txt?compression=rar   # 
> will give a 500 error code
> Etap test case included.
> Feedback would be very welcome.
> cheers

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
