Re: ddoc protocol for query server

2009-12-20 Thread Chris Anderson
Devs,

My design doc query server branch is now ready for commit. It
maintains backwards compatibility, with some small improvements to
query handling, across all the APIs except for the query server
protocol. It does not affect the map-reduce part of the query server
protocol.

The current branch can be found here:

http://github.com/jchris/couchdb/tree/ddoc-qs

I'm also attaching an (almost current) snapshot of the diff to:

https://issues.apache.org/jira/browse/COUCHDB-589

I've got all the tests passing, and moved the native query servers to
a gen_server (I didn't completely eliminate use of the process
dictionary, but that will not be hard now). There are other
goodies in here, like better query server error handling. I've also
refactored the JavaScript code to avoid global variables, although we
could do more work here on the sandboxing.

I intend to commit this patch soon. It's a lot of lines, so I want to
get it in the next couple of days. I don't think it breaks anything,
but it's still better to let it sit in trunk for a while before the
release.

The only people this change should affect are those using alternate
query servers for more than just map-reduce. If you are one of these
people, please inspect the branch and let me know how you feel about
it.

The gen_server around the native query server could use some review,
too. It all seems to work just fine but there's nothing really in
there about when the server should be shut down. I think that's fine
as long as it goes down on config change, which it seems to.

The way design docs are cached inside the server simplifies function
dispatch and fixes the one os-process per filtered _changes feed
problem. It would be good to profile list / show / update / validation
now. They should be faster as they involve less data sent between
Couch and the query server.
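
To make the caching idea concrete, here is a minimal sketch, written in
Python rather than the query server's JavaScript. It is not code from the
branch: the class name DdocCache, the store/call methods, and the use of
Python lambda strings as stand-ins for JavaScript function source are all
assumptions for illustration. The point is only that a design doc is sent
once, and later calls name a function by its path inside the cached doc
instead of resending its source.

```python
class DdocCache:
    """Toy model of a query server that caches design documents by id."""

    def __init__(self):
        self._ddocs = {}      # ddoc id -> parsed design document
        self._compiled = {}   # (ddoc id, path) -> compiled callable

    def store(self, ddoc_id, ddoc):
        # A new copy of a ddoc replaces the old one, so any functions
        # compiled from the old copy are dropped.
        self._ddocs[ddoc_id] = ddoc
        self._compiled = {
            key: fun for key, fun in self._compiled.items() if key[0] != ddoc_id
        }
        return True

    def call(self, ddoc_id, path, args):
        key = (ddoc_id, tuple(path))
        fun = self._compiled.get(key)
        if fun is None:
            node = self._ddocs[ddoc_id]
            for part in path:      # walk e.g. ["shows", "title"]
                node = node[part]
            # Hypothetical stand-in: the real server would compile
            # JavaScript source here, not evaluate a Python lambda.
            fun = eval(node)
            self._compiled[key] = fun
        return fun(*args)


cache = DdocCache()
cache.store("_design/app", {
    "shows": {"title": "lambda doc, req: doc['title'].upper()"},
})
# Only the ddoc id, the function path and the arguments travel per request.
result = cache.call("_design/app", ["shows", "title"], [{"title": "hello"}, {}])
print(result)
```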

One concern is that on systems with a large number of distinct design
documents (or very large design documents) the memory utilization
could go up. Currently attachments are not loaded into the design
document for this reason.

Only the latest version of each design document is stored in the query
server, so if you have 2 dbs with the same ddoc at different revs,
it's possible that some flapping could occur. The alternative is
storing ddocs by rev and using a JavaScript LRU cache. That'd be a
worthy improvement but it can come later.
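
For what the by-rev alternative might look like, here is a small sketch of
an LRU cache keyed on (ddoc id, rev). It is in Python for brevity, whereas
the real thing would live in the JavaScript query server. The class name
DdocLRU, the capacity parameter and the example revs are assumptions, not
anything in the branch.

```python
from collections import OrderedDict


class DdocLRU:
    """Least-recently-used cache of design docs, keyed by (id, rev)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()   # oldest entry first

    def put(self, ddoc_id, rev, ddoc):
        key = (ddoc_id, rev)
        self._entries[key] = ddoc
        self._entries.move_to_end(key)          # mark as most recently used
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)   # evict least recently used

    def get(self, ddoc_id, rev):
        key = (ddoc_id, rev)
        if key not in self._entries:
            return None                         # caller must resend the ddoc
        self._entries.move_to_end(key)
        return self._entries[key]


lru = DdocLRU(capacity=2)
lru.put("_design/app", "1-abc", {"shows": {}})
lru.put("_design/app", "2-def", {"shows": {}, "lists": {}})
lru.get("_design/app", "1-abc")              # touch rev 1 so rev 2 is oldest
lru.put("_design/other", "1-xyz", {"views": {}})
```

With two dbs holding the same ddoc at different revs, both revs can stay
resident as long as the capacity allows, which avoids the flapping.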

The patch opens the door for more flexibility within the JavaScript
query server. For instance, CouchDB will be able to support APIs like
the require() statement from CommonJS:

http://commonjs.org/specs/modules/1.0.html

Chris

On Thu, Dec 17, 2009 at 6:39 PM, Chris Anderson wrote:
> I've been working on a medium-sized branch for a while and I'm finally
> getting something worth showing... almost.
>
> I'm developing on http://github.com/jchris/couchdb/tree/ddoc-qs
>
> The essential difference is in the way the code is handled by the
> query server. The query server keeps a cache of design documents
> around. This allows us to simplify the programming model for show,
> list, update, etc, and also I think will make it easier to add new
> commands in the future.
>
> My aim was to cut a lot of code, but stay close to a pure refactoring.
> So I haven't substantially changed the Futon test suite. But I did
> make some changes to query_server_spec.rb.
>
> If anyone would like to pull the Erlang query server into line with
> these changes, I'd appreciate the help.
>
> The branch is still deeply messy in some places. When it comes time to
> push to CouchDB trunk, I'll hand-edit the patch for log levels and
> inane comments.
>
> Chris
>
> --
> Chris Anderson
> http://jchrisa.net
> http://couch.io
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io


[jira] Updated: (COUCHDB-589) Simplify Query Server interface and Design Handlers

2009-12-20 Thread Chris Anderson (JIRA)

 [ 
https://issues.apache.org/jira/browse/COUCHDB-589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Anderson updated COUCHDB-589:
---

Attachment: ddoc_qs.patch

This patch reflects the current state of:

http://github.com/jchris/couchdb/tree/ddoc-qs

I plan to commit this code soon, so I'm putting it in Jira to round out this 
ticket before I close it.

There is a dev@ thread about the "ddoc protocol for query server" which 
discusses the change.

> Simplify Query Server interface and Design Handlers
> ---
>
> Key: COUCHDB-589
> URL: https://issues.apache.org/jira/browse/COUCHDB-589
> Project: CouchDB
> Issue Type: Improvement
> Components: JavaScript View Server
> Reporter: Chris Anderson
> Assignee: Chris Anderson
> Attachments: ddoc_qs.patch, design_handlers.patch
>
>
> This patch refactors list, show, update, filter, and view handling to have a 
> unified interface for loading the design document and functions from it. It 
> is a step on the path to removing a lot of function src passing overhead from 
> the view server protocol.
> The patch also removes some old parts of the API while improving others.
> The big changes:
> Load the design document before handing it to the design document handlers. 
> This removes a lot of duplicated code.
> Remove /db/_view handler. I think it's in our interest to remove the last 
> bits of the old-style API before 0.11.
> I also clean up some rough edges in the API.
> I plan to commit this patch soon, but want to give people an opportunity to 
> look it over.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Updating the CouchDB roadmap

2009-12-20 Thread Filipe David Manana
An updated (and no longer breaking 4 test cases) patch for storing
compressed attachments was added to tickets 583 and 437.
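
For context, the md5 mismatch that the previous patch ran into (described
in the quoted message below) comes down to hashing the gzipped bytes
instead of the bytes the client sent. A minimal Python sketch of the
effect follows. It is not code from the patch, and the function name and
sample body are made up for illustration.

```python
import gzip
import hashlib


def digests(body, level=6):
    """Return (md5 of the original bytes, md5 of the gzipped bytes)."""
    client_md5 = hashlib.md5(body).hexdigest()
    stored = gzip.compress(body, compresslevel=level)
    stored_md5 = hashlib.md5(stored).hexdigest()
    return client_md5, stored_md5


body = b"hello attachment\n" * 100
client_md5, stored_md5 = digests(body)

# The digest of what is written to disk is not the digest the client sent,
# so an integrity check has to hash the content before compression.
print(client_md5 == stored_md5)
```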

cheers

On Sat, Dec 19, 2009 at 8:46 AM, Filipe David Manana wrote:

> After rev 891077 (ticket 558, md5 integrity check) I need to update that
> patch for storing gzip compressed attachments.
> The md5 calculated by couch_stream corresponds to the md5 of the gzipped
> content, therefore not matching the attachment content sent by the client
> for those attachments having a mime type listed in the additional config
> file.
>
> I'll be fixing this, adding a config option for the compression level and
> making the necessary modifications to the existing test cases. A new patch
> will be coming soon :)
>
> cheers
>
>
> On Fri, Dec 18, 2009 at 8:31 PM, Damien Katz wrote:
>
>> FYI, I haven't looked at this patch's code, but I like its concepts and I
>> hope some of the other committers will have a chance to look at it soon and
>> work out any issues to get it checked into trunk. If not, I will eventually
>> get around to it, but I can't promise when.
>>
>> -Damien
>>
>>
>> On Dec 9, 2009, at 8:07 AM, Filipe David Manana wrote:
>>
>> > that's ticket 583 - https://issues.apache.org/jira/browse/COUCHDB-583
>> >
>> > The ticket's title is no longer fair. One of the last comments
>> > mentions the possibility of storing attachments in gzip compressed
>> > form (suggestion from Damien).
>> > I submitted a patch for that feature yesterday.
>> >
>> > cheers
>> >
>> > On Fri, Dec 4, 2009 at 7:17 PM, Robert Newson 
>> wrote:
>> >> Support for compressed attachments? I think there's a ticket for it,
>> >> but the basic idea is to support Accept-Encoding/Content-Encoding from
>> >> the client and store the attachment with compression (and
>> >> decompressing on demand).
>> >>
>> >> A bulk insertion endpoint that included attachments (without base64
>> >> inflation) would also be nice.
>> >>
>> >> On Fri, Dec 4, 2009 at 5:55 PM, Dirkjan Ochtman 
>> wrote:
>> >>> On Fri, Dec 4, 2009 at 18:32, Noah Slater 
>> wrote:
>> >>>> Hey, does anyone want to start the discussion off?
>> >>>
>> >>> Question from one of those trailing around on actual releases, that
>> >>> may help getting started: what big-ticket features are in trunk, but
>> >>> not 0.10.1?
>> >>>
>> >>> Cheers,
>> >>>
>> >>> Dirkjan
>> >>>
>> >>
>> >
>> >
>> >
>> > --
>> > Filipe David Manana,
>> > fdman...@gmail.com
>> > PGP key - http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xC569452B
>> >
>> > "Reasonable men adapt themselves to the world.
>> > Unreasonable men adapt the world to themselves.
>> > That's why all progress depends on unreasonable men."
>>
>>
>
>


-- 
Filipe David Manana,
fdman...@gmail.com
PGP key - http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xC569452B

"Reasonable men adapt themselves to the world.
Unreasonable men adapt the world to themselves.
That's why all progress depends on unreasonable men."


[jira] Updated: (COUCHDB-437) Make compression level configurable, and allow attachments to be compressed

2009-12-20 Thread Filipe Manana (JIRA)

 [ 
https://issues.apache.org/jira/browse/COUCHDB-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Filipe Manana updated COUCHDB-437:
--

Attachment: couchdb-583-trunk-4th-try-trunk.patch

This seems to be the same feature as the one for ticket 583 => 
https://issues.apache.org/jira/browse/COUCHDB-583

> Make compression level configurable, and allow attachments to be compressed
> ---
>
> Key: COUCHDB-437
> URL: https://issues.apache.org/jira/browse/COUCHDB-437
> Project: CouchDB
> Issue Type: Improvement
> Components: Database Core
> Reporter: Jason Davies
> Priority: Minor
> Attachments: couchdb-583-trunk-4th-try-trunk.patch
>
>
> As suggested by Adam Kocoloski in 
> http://www.mail-archive.com/dev@couchdb.apache.org/msg02858.html
> > The nice thing is that binary_to_term seems perfectly happy reading a mix 
> > of compressed and uncompressed binaries, which means the compression level 
> > can be a configuration parameter if we want it to be. gzip decompresses 
> > pretty quickly, so I'm guessing that reading a compressed DB will be faster 
> > than an uncompressed one. We'll have to measure it, though.
> Just thinking that space may be at a premium for some users and enabling 
> compression could save them quite a bit of space depending on the data stored 
> in docs. Compressing attachments could be beneficial too, and for particular 
> use cases compression might increase read throughput by requiring fewer disk 
> reads. As Adam says, we need to measure it.
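
As a starting point for that measurement, the size side of the trade-off
can be sketched in a few lines. This is only an illustration in Python:
CouchDB stores Erlang terms, not zlib-compressed JSON, so the sample
document and the sizes_by_level helper are assumptions standing in for
real data and the real storage path.

```python
import json
import zlib


def sizes_by_level(payload, levels=(1, 6, 9)):
    """Map each zlib compression level to the compressed size in bytes."""
    return {level: len(zlib.compress(payload, level)) for level in levels}


# Made-up document with repetitive content, as JSON bytes.
doc = {
    "_id": "doc-1",
    "tags": ["couchdb", "erlang"] * 50,
    "body": "lorem ipsum " * 200,
}
raw = json.dumps(doc).encode("utf-8")

sizes = sizes_by_level(raw)
for level, size in sizes.items():
    # Ratio of compressed to uncompressed size at this level.
    print(level, size, round(size / len(raw), 3))
```

Read throughput would need a separate timing run against real databases.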




[jira] Updated: (COUCHDB-583) adding ?compression=(gzip|deflate) optional parameter to the attachment download API

2009-12-20 Thread Filipe Manana (JIRA)

 [ 
https://issues.apache.org/jira/browse/COUCHDB-583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Filipe Manana updated COUCHDB-583:
--

Attachment: couchdb-583-trunk-4th-try-trunk.patch

Here is an updated patch.

Relative to the previous one:

+ works with ticket 558 (attachment upload md5 integrity check)
+ adds gzip compression level configuration option
+ no longer breaks 2 Etap tests and 2 JavaScript tests :)

All tests, except for the Etap ICU test (which is broken for some other 
reason), are working with this patch.
An Etap test for the new feature is included.

The feature in this patch seems to be the same as the one requested in 
ticket 437 => https://issues.apache.org/jira/browse/COUCHDB-437

Feedback please

cheers

> adding ?compression=(gzip|deflate) optional parameter to the attachment 
> download API
> 
>
> Key: COUCHDB-583
> URL: https://issues.apache.org/jira/browse/COUCHDB-583
> Project: CouchDB
> Issue Type: New Feature
> Components: HTTP Interface
> Environment: CouchDB trunk revision 885240
> Reporter: Filipe Manana
> Attachments: couchdb-583-trunk-3rd-try.patch, 
> couchdb-583-trunk-4th-try-trunk.patch, jira-couchdb-583-1st-try-trunk.patch, 
> jira-couchdb-583-2nd-try-trunk.patch
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> The patch attached to this ticket adds the following new feature.
> A new optional HTTP query parameter "compression" is added to the attachments 
> API.
> This parameter can have one of the values "gzip" or "deflate".
> When an attachment is requested (HTTP GET) and the query parameter 
> "compression" is present, CouchDB sends the attachment compressed to the 
> client and sets the Content-Encoding header to gzip or deflate.
> The patch also adds a new config option "treshold_for_chunking_comp_responses" 
> (httpd section) that specifies an attachment length threshold. If an 
> attachment's length is >= this threshold, the HTTP response will be 
> chunked as well as compressed.
> Note that non-chunked compressed responses require storing all the 
> compressed blocks in memory before sending them to the client. This is a 
> necessary "evil": we only know the length of the compressed body after 
> compressing the whole body, and we need to set the "Content-Length" header 
> for non-chunked responses. With chunked responses, we can send each 
> compressed block immediately, without accumulating all of them in memory.
> Examples:
> $ curl http://localhost:5984/testdb/testdoc1/readme.txt?compression=gzip
> $ curl http://localhost:5984/testdb/testdoc1/readme.txt?compression=deflate
> $ curl http://localhost:5984/testdb/testdoc1/readme.txt   # attachment will not be compressed
> $ curl http://localhost:5984/testdb/testdoc1/readme.txt?compression=rar   # will give a 500 error code
> Etap test case included.
> Feedback would be very welcome.
> cheers
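
The buffering trade-off described in the ticket can be sketched as follows.
This is Python, not the Erlang in the patch, and the function names, the
header dictionaries and the choice of the zlib container for "deflate" are
assumptions for illustration. The buffered variant must compress everything
before it can state a Content-Length, while the chunked variant can hand
out each compressed block as soon as it exists.

```python
import zlib

# zlib "wbits" values: 31 selects a gzip container, 15 a zlib container.
WBITS = {"gzip": 31, "deflate": 15}


def compress_blocks(blocks, encoding):
    """Yield compressed output incrementally, one input block at a time."""
    if encoding not in WBITS:
        raise ValueError("unsupported compression: " + encoding)
    comp = zlib.compressobj(6, zlib.DEFLATED, WBITS[encoding])
    for block in blocks:
        out = comp.compress(block)
        if out:
            yield out
    tail = comp.flush()
    if tail:
        yield tail


def buffered_response(blocks, encoding):
    # Non-chunked: the whole compressed body sits in memory so that its
    # length is known before the headers are written.
    body = b"".join(compress_blocks(blocks, encoding))
    headers = {"Content-Encoding": encoding, "Content-Length": str(len(body))}
    return headers, body


def chunked_response(blocks, encoding):
    # Chunked: no Content-Length, so blocks can be sent as they are produced.
    headers = {"Content-Encoding": encoding, "Transfer-Encoding": "chunked"}
    return headers, compress_blocks(blocks, encoding)


blocks = [b"a" * 1000, b"b" * 1000]
headers, body = buffered_response(blocks, "gzip")
chunk_headers, chunks = chunked_response(blocks, "deflate")
streamed = b"".join(chunks)
```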
