Re: Candidates for built-in filter functions?

2016-03-20 Thread Robert Samuel Newson
That idea came up explicitly this week and it has obvious merit. I don't know enough about mango selectors to know if it's rich "enough" but it would be simple to add and whatever it did cover would run much faster than today's JS approach. > On 20 Mar 2016, at 17:03, Adam Kocoloski wrote: >

The CouchDB Weekly News, March 17, is out 📰

2016-03-20 Thread Jenn Turner
Happy Saint Patrick’s Day CouchDB-ers! 🍀 Just a short message to tell you that the CouchDB Weekly News is out: http://blog.couchdb.org/2016/03/17/couchdb-weekly-news-march-17-2016/ Highlights: - There’s a survey which you should definitely check out (especially if you haven’t yet)! - There’s al

Re: Candidates for built-in filter functions?

2016-03-20 Thread Adam Kocoloski
Hi Bob, instead of trying to anticipate all popular options what about enabling Mango selectors as filters? I’d hope that over time the performance of a selector is comparable to a builtin. Adam > On Mar 20, 2016, at 12:34 PM, Alexander Shorin wrote: > > On Sun, Mar 20, 2016 at 7:30 PM, Const

Re: Candidates for built-in filter functions?

2016-03-20 Thread Alexander Shorin
On Sun, Mar 20, 2016 at 7:30 PM, Constantin Teodorescu wrote: > On Sun, Mar 20, 2016 at 6:19 PM, Robert Newson wrote: > >> As part of a new effort to improve replicator performance I'm planning to >> add new built-in filter functions. These run in the Erlang vm; saving the >> couchjs round trip.

Re: Candidates for built-in filter functions?

2016-03-20 Thread Constantin Teodorescu
On Sun, Mar 20, 2016 at 6:19 PM, Robert Newson wrote: > As part of a new effort to improve replicator performance I'm planning to > add new built-in filter functions. These run in the Erlang vm; saving the > couchjs round trip. > The first candidate is one that skips deleted documents as it's qui

Candidates for built-in filter functions?

2016-03-20 Thread Robert Newson
Hi, As part of a new effort to improve replicator performance I'm planning to add new built-in filter functions. These run in the Erlang vm; saving the couchjs round trip. The first candidate is one that skips deleted documents as it's quite common to replicate with such a filter to remove de

Re: Multiple database backup strategy

2016-03-20 Thread Robert Newson
Groovy, that's consensus then. Where we can use _db_updates to know which pending jobs are worth running, we will. If it's a 404, we'll do something less optimal, start the job itself. Using the connection pool as discussed and accounting for and penalising jobs that complete very quickly or p

Cardinality estimate (COUNT DISTINCT) as a builtin reduce

2016-03-20 Thread Adam Kocoloski
Hi all, we’ve seen a number of applications now where a user needs to count the number of unique keys in a view. Currently the recommended approach is to add a trivial reduce function and then count the number of rows in a _list function or client-side application code, but of course that doesn’

Re: Multiple database backup strategy

2016-03-20 Thread Adam Kocoloski
I’ll never berate anyone for top-posting (or bottom-posting for that matter). I just follow suit with whatever the current thread is doing — in this, very very clearly top-posting ;) Thank you for making this distinction clear. Personally I was only ever interested in the first case. Scoping th

Re: Multiple database backup strategy

2016-03-20 Thread Robert Samuel Newson
Final note, we've conflated two uses of /_db_updates that I want to be very clear on; 1) using /_db_updates to detect active source databases of a replication job. 2) using /_db_updates to hear about new/updated/deleted _replicator documents. It was the 2nd case where the unreliability was a con

Re: Multiple database backup strategy

2016-03-20 Thread Robert Samuel Newson
(I swear I'll stop soon...) Using /_db_updates as a cheap mechanism to detect activity at the source for any database we're interested in is an important optimization. We didn't discuss it this past week as we felt that /_db_updates wasn't sufficiently reliable. We can save a lot of churn in th

Re: Multiple database backup strategy

2016-03-20 Thread Robert Samuel Newson
I missed a point in Adam's earlier post. The current scheme uses couch_event for runtime changes to _replicator docs but has to read all updates of all _replicator databases at startup. In the steady state it is just receiving couch_event notifications. The /_db_updates option would change that

Re: Multiple database backup strategy

2016-03-20 Thread Robert Samuel Newson
Since I'm typing anyway, and haven't yet been dinged for top-posting, I wanted to mention one other optimization we had in mind. Currently each replicator job has its own connection pool. When we introduce the notion that we can stop and restart jobs, those become approximately useless. So we w

Re: Multiple database backup strategy

2016-03-20 Thread Robert Samuel Newson
My point is that we can (and currently do) trigger the replication manager on receipt of the database updated event, so it avoids all of the other parts of the sequence you describe which could fail. The obvious difference, and I suspect this is what motivates Adam's position, is that _db_updat

Re: Multiple database backup strategy

2016-03-20 Thread Robert Samuel Newson
Hi, If there's a chance that a user can add a single _replicator doc without it being picked up by _db_updates, I think that's a deal breaker. If a user is regularly adding/updating/deleting _replicator docs then, yes, I believe we can say we'll eventually notice. I did mean 'use couch_event'

[GitHub] couchdb-couch-log-lager pull request: Get lager event handlers fro...

2016-03-20 Thread jaydoane
Github user jaydoane commented on a diff in the pull request: https://github.com/apache/couchdb-couch-log-lager/pull/2#discussion_r56443141 --- Diff: src/couch_log_lager.erl --- @@ -64,10 +64,9 @@ emergency(Fmt, Args) -> -spec set_level(atom()) -> ok. set_level(Leve