On Sep 1, 2012, at 10:27 AM, Dave Cottlehuber wrote:
> On 1 September 2012 07:55, Nathan Vander Wilt <nate-li...@calftrail.com> wrote:
>> I've got CouchDB mostly working on my Raspberry Pi, simply via
>> `apt-get install couchdb` plus the permissions fix Jens posted about recently.
>>
>> However, I can't get a particularly complex design document to finish its
>> initial view generation. (See
>> https://github.com/natevw/LocLog/tree/master/views especially
>> https://github.com/natevw/LocLog/blob/master/views/by_utc/reduce.js for
>> source code.) Originally I was getting explicit timeout errors, so after
>> unsuccessfully trying more conservative values I cranked os_process_timeout
>> to 9000000. This got it a lot farther, but now it seems stuck with no
>> indication of what's going wrong except the server suddenly drops out before
>> getting respawned:
>>
>> [Sat, 01 Sep 2012 04:55:55 GMT] [info] [<0.15090.1>] checkpointing view update at seq 2272 for loctest _design/loclog
>> [Sat, 01 Sep 2012 05:00:01 GMT] [info] [<0.15090.1>] checkpointing view update at seq 2409 for loctest _design/loclog
>> [Sat, 01 Sep 2012 05:09:49 GMT] [info] [<0.15090.1>] checkpointing view update at seq 2517 for loctest _design/loclog
>> [Sat, 01 Sep 2012 05:14:46 GMT] [info] [<0.32.0>] Apache CouchDB has started on http://0.0.0.0:5984/
>>
>> [Sat, 01 Sep 2012 05:19:50 GMT] [info] [<0.121.0>] 192.168.1.6 - - GET /_active_tasks 200
>> [Sat, 01 Sep 2012 05:19:55 GMT] [info] [<0.121.0>] 192.168.1.6 - - GET /_active_tasks 200
>> [Sat, 01 Sep 2012 05:20:00 GMT] [info] [<0.121.0>] 192.168.1.6 - - GET /_active_tasks 200
>> [Sat, 01 Sep 2012 05:20:05 GMT] [info] [<0.121.0>] 192.168.1.6 - - GET /_active_tasks 200
>> [Sat, 01 Sep 2012 05:20:10 GMT] [info] [<0.121.0>] 192.168.1.6 - - GET /_active_tasks 200
>> [Sat, 01 Sep 2012 05:20:15 GMT] [info] [<0.121.0>] 192.168.1.6 - - GET /_active_tasks 200
>> [Sat, 01 Sep 2012 05:20:55 GMT] [info] [<0.32.0>] Apache CouchDB has started on http://0.0.0.0:5984/
>>
>> Any idea how to determine what could cause this, and/or if there's a remedy?
>> My reduce function is rather float-heavy and I suspect perhaps the package
>> build is using soft floats instead of hardware (not sure how to verify), but
>> regardless the view made it this far and to see it simply fail without so
>> much as a trace is a new one to me. I don't particularly suspect an
>> out-of-memory condition: the whole database is <100MB (albeit snappy
>> compressed) and this is spread across well over 5000 separate documents.
>>
>> thanks,
>> -natevw
>
> Does it pass the test suite?
> If not, what errors are coming up?
> If it does, you might try running couchjs directly like this:
>
> /usr/local/bin/couchjs /usr/local/share/couchdb/server/main.js
>
> & read
> http://wiki.apache.org/couchdb/View_server?action=show&redirect=ViewServer#Basic_API
> for driving this.
>
> with a few of your docs & see what happens.

Thanks, yes it does pass most of the test suite (via Futon in Firefox). The only issues are replication-related:

replication  301560ms
  1. Assertion failed: copy !== null
  2. Exception raised: {}
replicator_db  35653ms
  1. Assertion 'typeof repDoc._replication_stats === "object", "doc has stats"' failed: doc has stats
  2. Exception raised: {}

To be clear, what fails is not the processing of an individual document, nor view generation in general. I have managed to generate a complete "catchup run" of (simpler) views in a different design document on a different dataset. On this particular view, before I increased the timeout, it made it to sequence 78 before stopping with a timeout logged after an error. I set the timeout to 2.5 hours and it got a lot farther. But...

The problem I am having now is that the view can't get past its current checkpoint (over halfway through the change sequences) and there is no trace of why not; the whole server just disappears until restarted. So basically I can query the ?stale=ok view and get some of the data, but if I want the full set from what's in the database, my view request waits a while and then just drops when the server disappears five or ten minutes into the view update.

It seems to be hitting some sort of very unexpected issue. It's not a simple timeout, as those were logged. I don't *think* it's a memory issue, as none of the documents are particularly large, and up until then I see most of my Pi's real memory still available as well as all the swap. (Perhaps the first [not re-]reduce dataset is somehow overlarge, as it's my reduce function that is the complex part, but I would expect the index to be reasonably balanced…)

Under what conditions during view generation would the entire CouchDB server simply abend without leaving any indication in its logs?

regards,
-natevw
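
For reference, the suggestion quoted above (running couchjs directly against main.js with a few docs) could be scripted roughly as below. This is only a minimal sketch, not a tested recipe for the Pi: the helper names (map_commands, reduce_command, encode, run_couchjs) are made up for illustration, the two paths are the ones from the quoted message and will likely differ for a packaged install, and the line-oriented JSON commands (reset, add_fun, map_doc, reduce) are the ones described on the View_server wiki page. It also assumes couchjs exits once its stdin is closed.

```python
import json
import subprocess

# Paths taken from the quoted suggestion; a packaged install may use others.
COUCHJS = "/usr/local/bin/couchjs"
MAIN_JS = "/usr/local/share/couchdb/server/main.js"


def map_commands(map_src, docs):
    """Build the view-server commands that run one map function over docs."""
    commands = [["reset"], ["add_fun", map_src]]
    for doc in docs:
        commands.append(["map_doc", doc])
    return commands


def reduce_command(reduce_src, rows):
    """Build a reduce command from (key, doc_id, value) rows."""
    pairs = [[[key, doc_id], value] for key, doc_id, value in rows]
    return ["reduce", [reduce_src], pairs]


def encode(commands):
    """Serialize commands as one JSON value per line, as the protocol expects."""
    return "".join(json.dumps(command) + "\n" for command in commands)


def run_couchjs(commands, couchjs=COUCHJS, main_js=MAIN_JS):
    """Feed commands to couchjs; return (exit code, reply lines, stderr).

    Assumption: couchjs stops reading and exits at end of input.
    On POSIX a negative exit code means the process was killed by a signal.
    """
    proc = subprocess.run(
        [couchjs, main_js],
        input=encode(commands),
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout.splitlines(), proc.stderr
```

Something like run_couchjs(map_commands(map_src, docs)) with map_src read from the view's map.js, followed by run_couchjs([reduce_command(reduce_src, rows)]) with rows taken from the map replies, would exercise the reduce function outside the server. If couchjs itself dies on the float-heavy reduce, the exit code and stderr should show it even when couch.log shows nothing.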