Re: Is anyone using serialized iterators to provide provenance data?

2013-05-15 Thread Christopher
Seems to me this is nothing more than "clone and also add these per-table iterators on all scopes". Might be a neat little utility to wrap those features into a single step from the user's perspective. -- Christopher L Tubbs II http://gravatar.com/ctubbsii On Wed, May 15, 2013 at 8:58 PM, Josh E

Re: Is anyone using serialized iterators to provide provenance data?

2013-05-15 Thread Josh Elser
Oh, I see what you mean. Table B was created from table A with a function F (where F is some collection of iterators like you said). It could be a neat application of the clone command. Storing that information on table B is some exercise in where to put that immutable information (that's me i

Re: Is anyone using serialized iterators to provide provenance data?

2013-05-15 Thread David Medinets
I don't see those as covering the same ground. Let's say I have an Accumulo table for a given human's genome. As a scientist, I want to apply a set of filters to create a subset of the genome. This provides a transform from data-set A to data-set B. Since iterators were used for the transform, we c

Re: Cancelling queued compactions in Accumulo 1.4

2013-05-15 Thread John Vines
Ugh, you got me! It was late and the cell service on the BART was spotty. On Wed, May 15, 2013 at 5:04 PM, Keith Turner wrote: > > > > On Wed, May 15, 2013 at 2:56 AM, John Vines wrote: > >> I do not believe there is a way to tell a tserver to cancel all >> compactions. It would be a nice feat

Re: Cancelling queued compactions in Accumulo 1.4

2013-05-15 Thread Keith Turner
On Wed, May 15, 2013 at 2:56 AM, John Vines wrote: > I do not believe there is a way to tell a tserver to cancel all > compactions. It would be a nice feature though. Mind putting on a ticket? > See ACCUMULO-990; it's complete for 1.5 > Sorry for the dupe mike, missed hitting reply all > > Sent

Re: interesting

2013-05-15 Thread Josh Elser
Definitely, with a note on the ingest job duration, too. On 05/15/2013 04:27 PM, Christopher wrote: I'd be very curious how something faster, like Snappy, compared. -- Christopher L Tubbs II http://gravatar.com/ctubbsii On Wed, May 15, 2013 at 2:52 PM, Eric Newton wrote: I don't intend to d

Re: Alternate user for tracer

2013-05-15 Thread Terry P.
Beauty, that's even cleaner Eric as then I can reuse the existing trace table and keep its data. Thanks for the quick reply! -tp On May 15, 2013, at 3:27 PM, Eric Newton wrote: Yes. But, technically, you don't have to give it CREATE_TABLE perms if you create the table first, and give the tra

Re: Alternate user for tracer

2013-05-15 Thread Eric Newton
Yes. But, technically, you don't have to give it CREATE_TABLE perms if you create the table first, and give the tracer user WRITE perms. -Eric On Wed, May 15, 2013 at 4:07 PM, Terry P. wrote: > To help get the Accumulo root password out of my accumulo-site.xml file > (security folks get touc

Re: interesting

2013-05-15 Thread Christopher
I'd be very curious how something faster, like Snappy, compared. -- Christopher L Tubbs II http://gravatar.com/ctubbsii On Wed, May 15, 2013 at 2:52 PM, Eric Newton wrote: > I don't intend to do that. > > > On Wed, May 15, 2013 at 12:11 PM, Josh Elser wrote: >> >> Just kidding, re-read the res

Re: Is anyone using serialized iterators to provide provenance data?

2013-05-15 Thread Christopher
I think this might relate to ACCUMULO-1397, in the form of providing a mechanism to specify iterator profiles, or ACCUMULO-415. -- Christopher L Tubbs II http://gravatar.com/ctubbsii On Wed, May 15, 2013 at 2:51 PM, David Medinets wrote: > If you apply a set of iterators to one table to produce

Alternate user for tracer

2013-05-15 Thread Terry P.
To help get the Accumulo root password out of my accumulo-site.xml file (security folks get touchy when they find 'root' and 'password' together in a clear text file), I'd like to configure the tracer to run as a separate user. Is it as simple as: 1. create a user and give that user create_table

Re: Cancelling queued compactions in Accumulo 1.4

2013-05-15 Thread Eric Newton
Looks like you need to offline/online the table, too. -Eric On Wed, May 15, 2013 at 3:29 PM, Mike Drob wrote: > Some progress on this issue - > > If I stop the master then I can delete the fate transaction from > zookeeper. First I used "accumulo org.apache.accumulo.server.fate.Admin > print

Re: Cancelling queued compactions in Accumulo 1.4

2013-05-15 Thread Mike Drob
Some progress on this issue - If I stop the master then I can delete the fate transaction from zookeeper. First I used "accumulo org.apache.accumulo.server.fate.Admin print | grep CompactRange" to find the transactions and then "accumulo o.a.a.s.f.Admin delete " to delete it. Started the master ba

Is anyone using serialized iterators to provide provenance data?

2013-05-15 Thread David Medinets
If you apply a set of iterators to one table to produce another, it seems possible to serialize the iterator stack alongside the new table in some catalog to provide provenance. The assumption is that the tables are immutable, I think. Is anyone doing this or has anyone thought about doing so? Just

Re: interesting

2013-05-15 Thread Eric Newton
I don't intend to do that. On Wed, May 15, 2013 at 12:11 PM, Josh Elser wrote: > Just kidding, re-read the rest of this. Let me try again: > > Any intents to retry this with different compression codecs? > > > On 5/15/13 12:00 PM, Josh Elser wrote: > >> RFile... with gzip? Or did you use anothe

Re: interesting

2013-05-15 Thread Eric Newton
gzip. In fact, everything was basically done w/the default settings. On Wed, May 15, 2013 at 12:00 PM, Josh Elser wrote: > RFile... with gzip? Or did you use another compressor? > > > On 5/15/13 10:58 AM, Eric Newton wrote: > >> I ingested the 2-gram data on a 10 node cluster. It took just un

Re: interesting

2013-05-15 Thread Josh Elser
Just kidding, re-read the rest of this. Let me try again: Any intents to retry this with different compression codecs? On 5/15/13 12:00 PM, Josh Elser wrote: RFile... with gzip? Or did you use another compressor? On 5/15/13 10:58 AM, Eric Newton wrote: I ingested the 2-gram data on a 10 node

Re: interesting

2013-05-15 Thread Josh Elser
RFile... with gzip? Or did you use another compressor? On 5/15/13 10:58 AM, Eric Newton wrote: I ingested the 2-gram data on a 10 node cluster. It took just under 7 hours. For most of the job, accumulo ingested at about 200K k-v/server. $ hadoop fs -dus /accumulo/tables/2 /data/n-grams/2-gram

Re: interesting

2013-05-15 Thread Eric Newton
I ingested the 2-gram data on a 10 node cluster. It took just under 7 hours. For most of the job, accumulo ingested at about 200K k-v/server.
$ hadoop fs -dus /accumulo/tables/2 /data/n-grams/2-grams
/accumulo/tables/2 74632273653
/data/n-grams/2-grams 154271541304
That's a very nice result.
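For readers who want the "very nice result" as a number, here is a minimal Python sketch of the arithmetic behind the two byte counts in the `hadoop fs -dus` output. The variable names are illustrative, and the message does not say how the raw n-gram files are stored, so the ratio only compares the two directories as reported.

```python
# Byte counts copied from the `hadoop fs -dus` output in the message.
accumulo_bytes = 74632273653   # /accumulo/tables/2 (RFiles, gzip per the thread)
raw_bytes = 154271541304       # /data/n-grams/2-grams (source data as stored)

GIB = 2 ** 30  # bytes per gibibyte

# Fraction of the source size that the Accumulo table occupies on disk.
ratio = accumulo_bytes / raw_bytes

print(f"Accumulo table: {accumulo_bytes / GIB:.1f} GiB")
print(f"Source data:    {raw_bytes / GIB:.1f} GiB")
print(f"Table size is {ratio:.1%} of the source size")
```

In other words, the table takes a little under half the space of the source directory, which is the comparison the follow-up questions about gzip versus Snappy are reacting to.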