Re: Accumulo defaults

2014-05-17 Thread Keith Turner
You can also set these in the shell with:

  config -s tserver.compaction.minor.concurrent.max=5
  config -s table.walog.enabled=false

Disabling walogs in the shell does not require a tserver restart, but I am not sure about the minor compaction setting. The advantage of setting the config in the shell

Re: Accumulo defaults

2014-05-17 Thread Jeremy Kepner
Agreed. On Sat, May 17, 2014 at 06:01:26PM -0400, Josh Elser wrote: > Absolutely, if you restrict a problem, you can work around it in > other ways. Not going to argue that. > > Since this is a user list though, I got very worried seeing > something that roughly says "I'm benchmarking Accumulo wi

Re: Accumulo defaults

2014-05-17 Thread Josh Elser
tserver.compaction.minor.concurrent.max 5 table.walog.enabled false On 5/17/14, 5:34 PM, Kepner, Jeremy - 0553 - MITLL wrote: Thanks. Does anyone know the precise syntax that would be used in conf/accumulo-site.xml? Regards. -jeremy On May 17, 2014, at 4:12 PM, John Vines m
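
For reference, the two name/value pairs above would typically be written out in conf/accumulo-site.xml roughly as follows. This is a minimal sketch that assumes the usual Hadoop-style property layout; the exact markup of the original reply did not survive in the archive, and the enclosing configuration element is shown only for context.

```xml
<!-- Sketch only: assumes the standard Hadoop-style <property> layout -->
<configuration>
  <!-- Maximum number of concurrent minor compactions per tablet server -->
  <property>
    <name>tserver.compaction.minor.concurrent.max</name>
    <value>5</value>
  </property>
  <!-- Disables write-ahead logs; see the data-loss warnings elsewhere in this thread -->
  <property>
    <name>table.walog.enabled</name>
    <value>false</value>
  </property>
</configuration>
```

If the file already has a configuration element, only the two property entries would need to be added inside it.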

Re: Accumulo defaults

2014-05-17 Thread Josh Elser
And, one last thought, be careful about accidentally overriding walogs for the metadata table. There isn't ever a reason to turn off walogs for the metadata table (that I can think of). I'm not sure if setting the table.walog.enabled property in accumulo-site.xml would override the value that

Re: Accumulo defaults

2014-05-17 Thread Josh Elser
Absolutely, if you restrict a problem, you can work around it in other ways. Not going to argue that. Since this is a user list though, I got very worried seeing something that roughly says "I'm benchmarking Accumulo with the WALs off". If you're providing resiliency against data loss using ot

Re: Accumulo defaults

2014-05-17 Thread Kepner, Jeremy - 0553 - MITLL
Thanks. Does anyone know the precise syntax that would be used in conf/accumulo-site.xml? Regards. -jeremy On May 17, 2014, at 4:12 PM, John Vines wrote: > Accumulo-site.xml > > Sent from my phone, please pardon the typos and brevity. > > On May 17, 2014 3:25 PM, "Kepner, Jeremy - 0553 -

Re: Accumulo defaults

2014-05-17 Thread Jeremy Kepner
walog provides data loss protection in a specific set of circumstances. Most of our deployments are under a different set of circumstances. Accumulo is only one part of our systems and we have other mechanisms for protecting against the loss of data. We find the walog actually becomes a bottleneck

Re: Accumulo defaults

2014-05-17 Thread Josh Elser
You're likely to lose data in *any* deployment with the walogs turned off. And, to reiterate what Sean says, I wouldn't really consider any benchmark with the walogs turned off valid except for "internal" benchmarks (ones where we evaluate components only within Accumulo for the sake of improv

Re: Accumulo defaults

2014-05-17 Thread John Vines
Accumulo-site.xml Sent from my phone, please pardon the typos and brevity. On May 17, 2014 3:25 PM, "Kepner, Jeremy - 0553 - MITLL" wrote: > As part of our Accumulo benchmarking we have decided to set certain values > as defaults for all our databases: > > tserver.compaction.minor.concur

Re: Accumulo defaults

2014-05-17 Thread Sean Busbey
You can set both of those in the accumulo-site.xml. However, it's going to be difficult to use benchmarks with walogs disabled for valid comparisons to other systems. Also you are very likely to lose data in any significantly sized deployment. On Sat, May 17, 2014 at 1:35 PM, Kepner, Jeremy - 0

Accumulo defaults

2014-05-17 Thread Kepner, Jeremy - 0553 - MITLL
As part of our Accumulo benchmarking we have decided to set certain values as defaults for all our databases: tserver.compaction.minor.concurrent.max=5 table.walog.enabled=false We were wondering which file(s) we would need to modify to apply these defaults?

Re: Tracking cardinality in Accumulo

2014-05-17 Thread David Medinets
>What's the expected size of your unique key set? Thousands? Millions? Billions? This project is something to occupy me in my spare time. And it's intended to explore aspects of Accumulo that I haven't needed to use yet. In the past, I simply ran a map-reduce job using the Word Counting technique. t