Christian, All,

Bad news: my laptop is completely dead. Good news: I have a new one, and it's now fully operational (backups FTW!).
The log files have finally been uploaded: https://www.dropbox.com/s/j7l3lniu0wogu29/riak-died.tar.gz

I have attached our config to this mail. The machine is a virtual Xen instance at Linode with 4GB of memory. I know it's probably not the very best setup, but 1) we're on a budget and 2) we assumed it would fit our needs quite well.

To give a bit more detail: initially we did not use allow_mult, and things worked fine for a couple of days. As soon as we enabled allow_mult, we were not able to run the cluster for more than 5 hours without seeing failing nodes, which is why I'm convinced we must be doing something wrong. The question is: what?

Thanks

On Sun, May 12, 2013 at 8:07 PM, Christian Dahlqvist <christ...@basho.com> wrote:

> Hi Julien,
>
> I was not able to access the logs based on the link you provided.
>
> Could you please attach a copy of your app.config file so we can get a
> better understanding of the configuration of your cluster? Also, what is
> the specification of the machines in the cluster?
>
> How much data do you have in the cluster and how are you querying it?
>
> Best regards,
>
> Christian
>
> On 12 May 2013, at 19:11, Julien Genestoux <julien.genest...@gmail.com>
> wrote:
>
> Hi,
>
> We are running a cluster of 5 servers, or at least trying to, because
> nodes seem to be dying 'randomly'
> without us knowing any reason why. We don't have a great Erlang guy
> aboard, and the error logs are not
> that verbose.
> So I've just .tgz the whole log directory and I was hoping somebody could
> give us a clue.
> It's there: https://www.dropbox.com/s/z9ezv0qlxgfhcyq/riak-died.tar.gz (might
> not be fully uploaded to dropbox yet!)
>
> I've looked at the archive and some people said their server was dying
> because some object's size was just
> too big to allocate the whole memory. Maybe that's what we're seeing?
>
> As one of our buckets is set with allow_mult, I am tempted to think that
> some object's size may be exploding.
> However, we do actually try to resolve conflicts in our code. Any idea how
> to confirm and then debug that we have an issue there?
>
> Thanks a lot for your precious help...
>
> Julien
>
> _______________________________________________
> riak-users mailing list
> riak-users@lists.basho.com
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
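
To make the "object size exploding" suspicion concrete, here is a toy model of sibling growth. It is only a sketch in plain Python and does not talk to Riak: the function names (merge, write, simulate) and the set-valued objects are assumptions made for illustration. It assumes every client reads all siblings, merges them, adds one item and writes back. A write that carries no causal context (vector clock) from that read is kept as an extra sibling; a write that carries it replaces the siblings it has seen.

```python
# Toy model only: nothing here calls Riak. All names are hypothetical.

def merge(siblings):
    """Application-level resolution (assumed): union of all sibling sets."""
    merged = set()
    for sibling in siblings:
        merged |= sibling
    return merged


def write(stored, item, with_context):
    """Read-merge-modify-write of one key; returns the new sibling list."""
    new_value = frozenset(merge(stored) | {item})
    if with_context:
        # The write carries the context of the read, so it supersedes
        # every sibling that was read.
        return [new_value]
    # No context: the store cannot tell what this write supersedes,
    # so the old siblings stay and the new value is added next to them.
    return stored + [new_value]


def stored_items(stored):
    """Total number of items held across all siblings of the key."""
    return sum(len(sibling) for sibling in stored)


def simulate(writes, with_context):
    """Apply `writes` successive updates to one initially empty key."""
    stored = []
    for i in range(writes):
        stored = write(stored, f"item-{i}", with_context)
    return stored


if __name__ == "__main__":
    for with_context in (False, True):
        result = simulate(100, with_context)
        print(with_context, len(result), stored_items(result))
```

In this model, 100 updates without context leave 100 siblings holding 5050 items in total, while the same updates with context leave 1 sibling holding 100 items. If the real objects behave like the first case, that would be consistent with the memory problem described above, but the model alone does not prove it.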
Attachment: app.config (binary data)