In many use cases, the key distribution changes over time. If the row
portion of the key is itself time-based, deleterows provides the most
efficient method of removing old data while also keeping you from having a
bunch of empty tablets.
On Apr 14, 2014 8:51 PM, "Arshak Navruzyan" wrote:
>
> BTW, I noticed that with
> deleterows -t foo -f
> you lose your split points. Not sure why this is desirable behavior in the
> code.
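For the time-based rows mentioned at the top of the thread, deleterows can
also take a range instead of wiping the whole table. A sketch, with a
hypothetical table name and date-prefixed row keys (check `deleterows -?` in
your shell for the exact option semantics):

```shell
# Accumulo shell sketch: drop the rows between a begin row (-b) and an
# end row (-e). "events" and the date-prefixed keys are hypothetical.
# Tablets that fall entirely inside the range are simply dropped.
deleterows -t events -b 2014-01-01 -e 2014-03-31
```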
Make sure you add some limit to the New Generation size. We have
"-XX:NewSize=500m -XX:MaxNewSize=500m " in the 3G version of
accumulo-env.sh. You can go larger than 500m, but try to keep it
small (~1G).
Look for evidence of a stop-the-world java garbage collection.
1) look for "gc" lines in th
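The flags above live in accumulo-env.sh; a sketch of the relevant lines,
assuming the 3G tserver heap mentioned above (variable name from the stock
accumulo-env.sh, heap sizes illustrative):

```shell
# accumulo-env.sh sketch: a 3G tserver heap with the New Generation
# capped at 500m, per the advice above.
ACCUMULO_TSERVER_OPTS="-Xmx3g -Xms3g -XX:NewSize=500m -XX:MaxNewSize=500m"
export ACCUMULO_TSERVER_OPTS
```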
Thanks so much for all responses.
Tiffany
On Apr 14, 2014, at 3:24 PM, "Keith Turner"
mailto:ke...@deenlo.com>> wrote:
deletemany pulls data back to the client and writes deletes back. The
deleterows command is more efficient: it performs the operation on the server
side. Entire tablets that fall within the range are just dropped.
On Mon, Apr 14, 2014 at 3:20 PM, Russ Weeks wrote:
> deletemany -t -f
>
> If you
All commands are from memory, so typos might exist. Deleting all rows can
be a very lengthy operation. It will likely be much faster to delete the
table and create a new one.
> droptable foo
> createtable foo
If you had configuration settings on the table that you wanted to keep,
then it might be
You can do this in the shell with:
$> deleterows -f -t myTable
Where "myTable" is the name of your table. Be careful with this, as the
"-f" effectively means "I really know what I'm doing."
You could also just delete the table and recreate it.
On Mon, Apr 14, 2014 at 12:13 PM, Tiffany Reid wrote:
deletemany -t -f
If you have a large table, that command will produce a lot of output. I
don't know if there's a way to make it less verbose? Maybe best to pipe it
to /dev/null.
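Piping the output away might look like this (user, password, and table name
are placeholders; `-e` runs a single shell command non-interactively):

```shell
# Sketch: run deletemany non-interactively and discard the per-entry
# output. Credentials and table name are hypothetical.
accumulo shell -u root -p secret -e "deletemany -t myTable -f" > /dev/null
```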
-Russ
On Mon, Apr 14, 2014 at 12:13 PM, Tiffany Reid wrote:
Hi,
How do I delete all rows in a table via Accumulo Shell?
Thanks,
Tiffany
The system swappiness warning is a bit of a red herring in that the systems
aren't configured with any swap space. They all have 64GB RAM of which
currently ~50GB is sitting as fs cache. The load on these systems was very
high during ingest so I'm sure there was IO latency even without swap use.
Hrm. 10x may have been overstating too. 5x is probably more accurate.
YMMV :)
On 4/14/14, 1:38 PM, Josh Elser wrote:
If you care about maximizing your throughput, ingest is probably not
desirable through the proxy (you can probably get ~10x faster using the
Java BatchWriter API).
I wouldn't avoid the proxy server purely because of using batch_scans
though. If you look at the Java impl of the BatchScanner, it
The log looks like it is retrying through the ZK connection issues, but that
it lost the lock independently.
The very start of the log claims you have vm.swappiness set to 60. Can you
zero this out and see if the issue still happens?
Also, check to see if you're hitting swap once the user is running a she
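Zeroing out swappiness as suggested can be done like so (takes effect
immediately; add `vm.swappiness=0` to /etc/sysctl.conf to survive reboots):

```shell
# Check the current value, then set it to 0 (setting it needs root).
cat /proc/sys/vm/swappiness
sysctl -w vm.swappiness=0
# Quick check for swap use: look at the "Swap" line.
free -m
```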
It will work fine... and you can run more than one in your cluster if needed.
If you observe a performance problem, please post a ticket to jira.
On Mon, Apr 14, 2014 at 12:12 PM, David O'Gwynn wrote:
> Ah, thanks Eric, that answers my question. It sounds like using the
> proxy server for batch
Hi-
I'm running a five-node Accumulo 1.4.5 cluster with ZooKeeper 3.4.6
distributed across the same systems.
We've seen a couple tserver failures in a manifestation that feels similar
to ACCUMULO-1572 (which was patched in 1.4.5). What is perhaps unique in
this circumstance is that the user repo
Ah, thanks Eric, that answers my question. It sounds like using the
proxy server for batch_scans and ingest is a bit beyond its scope. Are
there plans for beefing up the proxy to handle a wider range of
purposes from multiple clients?
Thanks,
David
On Mon, Apr 14, 2014 at 11:06 AM, Eric Newton wrote:
High ingest and batch scans use resources within the proxy for queuing
data. If I was using a proxy for these activities, I would want to
have a proxy for each client. Administrative requests, and even basic
single-range scans are simple pass-throughs with a much lower chance
of overloading the proxy.
"number of proxy servers should be proportional to the number of clients" -
I hate to be pedantic but
this is a very general statement. Can you be more specific? Should the
proportion be 1:1 or 5:1? What factors affect the ratio?
On Mon, Apr 14, 2014 at 9:32 AM, Eric Newton wrote:
The number of proxy servers should be proportional to the number of clients.
The proxy can talk to all the tablet servers, but the client of the
proxy only has the proxy to make requests on its behalf.
As always, it's going to depend on what you want to do, what your
schema looks like, and the to