Re: How to delete all rows in Accumulo table?

2014-04-14 Thread Billie Rinaldi
In many use cases, the key distribution changes over time. If the row portion of the key is itself time-based, deleterows provides the most efficient method of removing old data while also keeping you from having a bunch of empty tablets. On Apr 14, 2014 8:51 PM, "Arshak Navruzyan" wrote: >
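
As a rough illustration of the time-based case, here is a minimal Python sketch that computes an end row for expiring old data. It assumes row IDs start with a `yyyymmdd` prefix followed by a suffix; that layout, the `keep_days` parameter, and both function names are hypothetical and not taken from the thread. It relies on the end row of a deleterows range being inclusive and on rows sorting lexicographically, so check the shell's help for deleterows before using a computed value.

```python
from datetime import date, timedelta


def expiry_end_row(today, keep_days):
    """Row prefix of the first day to keep; rows sorting at or before it expire.

    Assumes (hypothetically) that every row ID is 'yyyymmdd' plus a suffix,
    so no real row is exactly equal to the bare prefix.
    """
    return (today - timedelta(days=keep_days)).strftime("%Y%m%d")


def rows_in_delete_range(rows, end_row):
    """Rows covered by a range with an unbounded start and an inclusive end row."""
    return [r for r in rows if r <= end_row]


rows = ["20140301_a", "20140315_b", "20140401_c", "20140414_d"]
end_row = expiry_end_row(date(2014, 4, 14), keep_days=30)
expired = rows_in_delete_range(rows, end_row)
print(end_row)
print(expired)
```

Because a suffixed row such as `20140315_b` sorts after the bare prefix `20140315`, the first kept day survives in full, and only rows from earlier days fall inside the range.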

Re: How to delete all rows in Accumulo table?

2014-04-14 Thread Arshak Navruzyan
BTW, I noticed that with deleterows -t foo -f you lose your split points. Not sure why this is desirable behavior in the code. On Mon, Apr 14, 2014 at 12:53 PM, Tiffany Reid wrote: > Thanks so much for all responses. > > Tiffany > > On Apr 14, 2014, at 3:24 PM, "Keith Turner" wrote: > > del

Re: ZooKeeper ConnectionLoss in Accumulo 1.4.5

2014-04-14 Thread Eric Newton
Make sure you add some limit to the New Generation size. We have "-XX:NewSize=500m -XX:MaxNewSize=500m " in the 3G version of accumulo-env.sh. You can go larger than 500m, but try to keep it small (~1G). Look for evidence of a stop-the-world Java garbage collection. 1) look for "gc" lines in th

Re: How to delete all rows in Accumulo table?

2014-04-14 Thread Tiffany Reid
Thanks so much for all responses. Tiffany On Apr 14, 2014, at 3:24 PM, "Keith Turner" wrote: deletemany pulls data back to the client and writes deletes back. The deleterows command is more efficient; it performs operations on the server side. Entire tablets that f

Re: How to delete all rows in Accumulo table?

2014-04-14 Thread Keith Turner
deletemany pulls data back to the client and writes deletes back. The deleterows command is more efficient; it performs operations on the server side. Entire tablets that fall within the range are just dropped. On Mon, Apr 14, 2014 at 3:20 PM, Russ Weeks wrote: > deletemany -t -f > > If you
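
To picture what "entire tablets that fall within the range are just dropped" means, here is a toy Python model. It is not Accumulo code. It assumes each tablet covers the rows between the previous split point (exclusive) and its own split point (inclusive), and that the delete range is likewise start-exclusive and end-inclusive. The function names and the sample split points are made up for illustration.

```python
def tablet_extents(split_points):
    """Return (prev_end, end) pairs for each tablet; None means unbounded."""
    bounds = [None] + sorted(split_points) + [None]
    return list(zip(bounds[:-1], bounds[1:]))


def fully_covered(extent, start, end):
    """True if tablet (prev_end, end] lies entirely inside the range (start, end]."""
    prev_end, tab_end = extent
    lower_ok = start is None or (prev_end is not None and prev_end >= start)
    upper_ok = end is None or (tab_end is not None and tab_end <= end)
    return lower_ok and upper_ok


extents = tablet_extents(["g", "n", "t"])

# Bounded range (g, t]: only the two middle tablets can be dropped whole.
dropped = [e for e in extents if fully_covered(e, "g", "t")]
print(dropped)

# Unbounded range: every tablet is covered, so none has to be scanned.
all_dropped = [e for e in extents if fully_covered(e, None, None)]
print(len(all_dropped), "of", len(extents))
```

In this model an unbounded delete covers every tablet, which is consistent with the observation elsewhere in the thread that a forced deleterows over the whole table leaves no split points behind.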

Re: How to delete all rows in Accumulo table?

2014-04-14 Thread Mike Drob
All commands are from memory, so typos might exist. Deleting all rows can be a very lengthy operation. It will likely be much faster to delete the table and create a new one. > droptable foo > createtable foo If you had configuration settings on the table that you wanted to keep, then it might be

Re: How to delete all rows in Accumulo table?

2014-04-14 Thread Sean Busbey
You can do this in the shell with: $> deleterows -f -t myTable Where "myTable" is the name of your table. Be careful with this, as the "-f" effectively means "I really know what I'm doing." You could also just delete the table and recreate it. On Mon, Apr 14, 2014 at 12:13 PM, Tiffany Reid w

Re: How to delete all rows in Accumulo table?

2014-04-14 Thread Russ Weeks
deletemany -t -f If you have a large table, that command will produce a lot of output. I don't know if there's a way to make it less verbose? Maybe best to pipe it to /dev/null. -Russ On Mon, Apr 14, 2014 at 12:13 PM, Tiffany Reid wrote: > Hi, > > > > How do I delete all rows in a table via

How to delete all rows in Accumulo table?

2014-04-14 Thread Tiffany Reid
Hi, How do I delete all rows in a table via Accumulo Shell? Thanks, Tiffany

Re: ZooKeeper ConnectionLoss in Accumulo 1.4.5

2014-04-14 Thread Frans Lawaetz
The system swappiness warning is a bit of a red herring in that the systems aren't configured with any swap space. They all have 64GB RAM of which currently ~50GB is sitting as fs cache. The load on these systems was very high during ingest so I'm sure there was IO latency even without swap use.

Re: Optimal # proxy servers

2014-04-14 Thread Josh Elser
Hrm. 10x may have been overstating too. 5x is probably more accurate. YMMV :) On 4/14/14, 1:38 PM, Josh Elser wrote: If you care about maximizing your throughput, ingest is probably not desirable through the proxy (you can probably get ~10x faster using the Java BatchWriter API). I wouldn't avo

Re: Optimal # proxy servers

2014-04-14 Thread Josh Elser
If you care about maximizing your throughput, ingest is probably not desirable through the proxy (you can probably get ~10x faster using the Java BatchWriter API). I wouldn't avoid the proxy server purely because of using batch_scans though. If you look at the Java impl of the BatchScanner, it

Re: ZooKeeper ConnectionLoss in Accumulo 1.4.5

2014-04-14 Thread Sean Busbey
The log looks like it is retrying the ZK connection issues but that it independently lost the lock. The very start of the log claims you have vm.swappiness set to 60. Can you zero this out and see if the issue still happens? Also, check to see if you're hitting swap once the user is running a she

Re: Optimal # proxy servers

2014-04-14 Thread Eric Newton
It will work fine... and you can run more than one in your cluster if needed. If you observe a performance problem, please post a ticket to jira. On Mon, Apr 14, 2014 at 12:12 PM, David O'Gwynn wrote: > Ah, thanks Eric, that answers my question. It sounds like using the > proxy server for batch

ZooKeeper ConnectionLoss in Accumulo 1.4.5

2014-04-14 Thread Frans Lawaetz
Hi- I'm running a five-node Accumulo 1.4.5 cluster with zookeeper 3.4.6 distributed across the same systems. We've seen a couple tserver failures in a manifestation that feels similar to ACCUMULO-1572 (which was patched in 1.4.5). What is perhaps unique in this circumstance is that the user repo

Re: Optimal # proxy servers

2014-04-14 Thread David O'Gwynn
Ah, thanks Eric, that answers my question. It sounds like using the proxy server for batch_scans and ingest is a bit beyond its scope. Are there plans for beefing up the proxy to handle a wider range of purposes from multiple clients? Thanks, David On Mon, Apr 14, 2014 at 11:06 AM, Eric Newton w

Re: Optimal # proxy servers

2014-04-14 Thread Eric Newton
High ingest and batch scans use resources within the proxy for queuing data. If I was using a proxy for these activities, I would want to have a proxy for each client. Administrative requests, and even basic single-range scans are simple pass-throughs with a much lower chance of overloading the p

Re: Optimal # proxy servers

2014-04-14 Thread David Medinets
"number of proxy servers should be proportional to the number of clients" - I hate to be pedantic but this is a very general statement. Can you be more specific? Should the proportion be 1:1 or 5:1? What factors affect the ratio? On Mon, Apr 14, 2014 at 9:32 AM, Eric Newton wrote: > The number

Re: Optimal # proxy servers

2014-04-14 Thread Eric Newton
The number of proxy servers should be proportional to the number of clients. The proxy can talk to all the tablet servers, but the client of the proxy only has the proxy to make requests on its behalf. As always, it's going to depend on what you want to do, what your schema looks like, and the to