Re: Impala 3.x (and 2.x)

2018-01-12 Thread Philip Zeyliger
On Fri, Jan 12, 2018 at 4:04 PM, Jim Apple  wrote:

> This makes sense to me.
>
> In this mode, for 2.x-only changes and for changes on 3.0 that don't
> apply cleanly, there will be a manual way to do the step labelled "1.
> Cherrypick tool", and that way is the same way we send patches for
> review now, but pushing to HEAD:refs/for/2.x rather than
> HEAD:refs/for/master, yes?
>

Exactly. So, non-clean cherrypicks or 2.x-only changes go through review on
Gerrit, but we give an implicit review pass to clean cherrypicks.

We could have the cherrypick tool between gerrit/master and gerrit/2.x do
the cherrypicks and run the tests on Jenkins. Do you think that's
preferable?

-- Philip



>
> On Fri, Jan 12, 2018 at 3:57 PM, Philip Zeyliger 
> wrote:
> > Picture:
> > https://gist.github.com/philz/323c8b4cb411dc12eb7231d922c195
> 1f#file-impala-branch-image-pdf
> >
> >
> > On Fri, Jan 12, 2018 at 3:47 PM, Jim Apple  wrote:
> >
> >> Often, this list seems to filter out images. Could you post it and send
> a
> >> link?
> >>
> >> Thanks for taking this on, Phil!
> >>
> >> On Fri, Jan 12, 2018 at 3:15 PM, Philip Zeyliger 
> >> wrote:
> >>
> >> > I think most patches go to Gerrit branch 'master', which happens to
> >> > identify itself as 3.0. (Or 3.x?).
> >> >
> >> > Here's a picture:
> >> >
> >> > [image: Inline image 1]
> >> >
> >> >
> >> > With this, every time "cherrypick_and_push_to_asf.py" is run, it
> would
> >> > first offer to cherrypick changes between master and 2.x. Then, it
> would
> >> > offer push those cherrypicks to gerrit/2.x. After that, it continues
> on
> >> as
> >> > before and offers to push changes to ASF. I think this maintains the
> >> > invariant that pushing to ASF is only done with a human trigger. (We
> >> could
> >> > also have step 1 be done by a Jenkins robot, since it's between Gerrit
> >> and
> >> > Gerrit.)
> >> >
> >> > I looked at the How to Release page, and the main difference would be
> >> > that, for a 2.x release, the $COMMIT_HASH_YOU_CHOSE would come from
> the
> >> 2.x
> >> > branch, as would any cherrypicks.
> >> >
> >> > Does this match what you're thinking?
> >> >
> >> > Thanks!
> >> >
> >> > -- Philip
> >> >
> >> > On Fri, Jan 12, 2018 at 11:07 AM, Jim Apple 
> >> wrote:
> >> >
> >> >> Which gerrit branch were you thinking most patches would go to?
> >> >>
> >> >> If they go to 3.0, then push_to_asf.py would have to be amended to
> >> >> push to gerrit, bypassing code review. I think that's possible, but
> >> >> I'm not 100%.
> >> >>
> >> >> There is also security to think about, since the push_to_asf.py users
> >> >> can push a few commits at a time, including ones they didn't author
> or
> >> >> review.
> >> >>
> >> >> We'll also want to clarify
> >> >> https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release
> and
> >> >> keep it consistent with the git & gerrit statuses quo.
> >> >>
> >> >> On Fri, Jan 12, 2018 at 10:52 AM, Philip Zeyliger <
> phi...@cloudera.com>
> >> >> wrote:
> >> >> > Hi!
> >> >> >
> >> >> >> Should we start tagging all candidates with a common label, e.g.
> >> >> > include-in-v3?
> >> >> >
> >> >> > I agree with Lars's suggestion for tagging JIRAs with
> include-in-v3.
> >> >> I've
> >> >> > done so, and the relevant query is
> >> >> > https://issues.apache.org/jira/issues/?jql=labels%20%3D%20in
> >> >> clude-in-v3%20and%20project%3Dimpala
> >> >> > .
> >> >> >
> >> >> >> What sort of process were you thinking of for the automation?
> >> >> >
> >> >> > I think amending push_to_asf.py, as you suggest, is a great idea. I
> >> >> think
> >> >> > we have a string ("not for 2.x") which can be used in commit
> messages
> >> to
> >> >> > discourage the cherrypick for the changes we want to be exclusive
> >> until
> >> >> we
> >> >> > want to change the defaults in the other direction. (I.e., right
> now
> >> the
> >> >> > string is "not for 2.x", but at some point the string may be
> "should
> >> be
> >> >> > cherrypicked to 2.x".)
> >> >> >
> >> >> > I do think that we want to create a gerrit branch to allow 2.x-only
> >> >> changes
> >> >> > to be reviewed in the straight-forward fashion.
> >> >> >
> >> >> > -- Philip
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Thu, Jan 11, 2018 at 9:31 AM, Jim Apple 
> >> >> wrote:
> >> >> >
> >> >> >> I'm on-board with all of this. (I also would be OK delaying 3.0,
> if
> >> >> >> that were the consensus).
> >> >> >>
> >> >> >> There is one issue in here I think we should dive into:
> >> >> >>
> >> >> >> > Both master and 2.x would be active, and, at least for the
> >> beginning,
> >> >> >> > changes would automatically be pulled into the 2.x line, unless
> >> >> >> explicitly
> >> >> >> > blacklisted.
> >> >> >>
> >> >> >> What sort of process were you thinking of for the automation?
> >> >> >>
> >> >> >> Some context, starting from what we all likely already know:
> >> >> >>

Re: Impala 3.x (and 2.x)

2018-01-12 Thread Jim Apple
This makes sense to me.

In this mode, for 2.x-only changes and for changes on 3.0 that don't
apply cleanly, there will be a manual way to do the step labelled "1.
Cherrypick tool", and that way is the same way we send patches for
review now, but pushing to HEAD:refs/for/2.x rather than
HEAD:refs/for/master, yes?

On Fri, Jan 12, 2018 at 3:57 PM, Philip Zeyliger  wrote:
> Picture:
> https://gist.github.com/philz/323c8b4cb411dc12eb7231d922c1951f#file-impala-branch-image-pdf
>
>
> On Fri, Jan 12, 2018 at 3:47 PM, Jim Apple  wrote:
>
>> Often, this list seems to filter out images. Could you post it and send a
>> link?
>>
>> Thanks for taking this on, Phil!
>>
>> On Fri, Jan 12, 2018 at 3:15 PM, Philip Zeyliger 
>> wrote:
>>
>> > I think most patches go to Gerrit branch 'master', which happens to
>> > identify itself as 3.0. (Or 3.x?).
>> >
>> > Here's a picture:
>> >
>> > [image: Inline image 1]
>> >
>> >
>> > With this, every time "cherrypick_and_push_to_asf.py" is run, it would
>> > first offer to cherrypick changes between master and 2.x. Then, it would
>> > offer push those cherrypicks to gerrit/2.x. After that, it continues on
>> as
>> > before and offers to push changes to ASF. I think this maintains the
>> > invariant that pushing to ASF is only done with a human trigger. (We
>> could
>> > also have step 1 be done by a Jenkins robot, since it's between Gerrit
>> and
>> > Gerrit.)
>> >
>> > I looked at the How to Release page, and the main difference would be
>> > that, for a 2.x release, the $COMMIT_HASH_YOU_CHOSE would come from the
>> 2.x
>> > branch, as would any cherrypicks.
>> >
>> > Does this match what you're thinking?
>> >
>> > Thanks!
>> >
>> > -- Philip
>> >
>> > On Fri, Jan 12, 2018 at 11:07 AM, Jim Apple 
>> wrote:
>> >
>> >> Which gerrit branch were you thinking most patches would go to?
>> >>
>> >> If they go to 3.0, then push_to_asf.py would have to be amended to
>> >> push to gerrit, bypassing code review. I think that's possible, but
>> >> I'm not 100%.
>> >>
>> >> There is also security to think about, since the push_to_asf.py users
>> >> can push a few commits at a time, including ones they didn't author or
>> >> review.
>> >>
>> >> We'll also want to clarify
>> >> https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release and
>> >> keep it consistent with the git & gerrit statuses quo.
>> >>
>> >> On Fri, Jan 12, 2018 at 10:52 AM, Philip Zeyliger 
>> >> wrote:
>> >> > Hi!
>> >> >
>> >> >> Should we start tagging all candidates with a common label, e.g.
>> >> > include-in-v3?
>> >> >
>> >> > I agree with Lars's suggestion for tagging JIRAs with include-in-v3.
>> >> I've
>> >> > done so, and the relevant query is
>> >> > https://issues.apache.org/jira/issues/?jql=labels%20%3D%20in
>> >> clude-in-v3%20and%20project%3Dimpala
>> >> > .
>> >> >
>> >> >> What sort of process were you thinking of for the automation?
>> >> >
>> >> > I think amending push_to_asf.py, as you suggest, is a great idea. I
>> >> think
>> >> > we have a string ("not for 2.x") which can be used in commit messages
>> to
>> >> > discourage the cherrypick for the changes we want to be exclusive
>> until
>> >> we
>> >> > want to change the defaults in the other direction. (I.e., right now
>> the
>> >> > string is "not for 2.x", but at some point the string may be "should
>> be
>> >> > cherrypicked to 2.x".)
>> >> >
>> >> > I do think that we want to create a gerrit branch to allow 2.x-only
>> >> changes
>> >> > to be reviewed in the straight-forward fashion.
>> >> >
>> >> > -- Philip
>> >> >
>> >> >
>> >> >
>> >> > On Thu, Jan 11, 2018 at 9:31 AM, Jim Apple 
>> >> wrote:
>> >> >
>> >> >> I'm on-board with all of this. (I also would be OK delaying 3.0, if
>> >> >> that were the consensus).
>> >> >>
>> >> >> There is one issue in here I think we should dive into:
>> >> >>
>> >> >> > Both master and 2.x would be active, and, at least for the
>> beginning,
>> >> >> > changes would automatically be pulled into the 2.x line, unless
>> >> >> explicitly
>> >> >> > blacklisted.
>> >> >>
>> >> >> What sort of process were you thinking of for the automation?
>> >> >>
>> >> >> Some context, starting from what we all likely already know:
>> >> >>
>> >> >> The bulk of the code review and pre-merge testing results are
>> recorded
>> >> >> in gerrit. Once the pre-merge testing passes, a patch is
>> cherry-picked
>> >> >> to the git repo hosted with gerrit. To get the patch to the Impala
>> git
>> >> >> repo hosted by the ASF, bin/push_to_asf.py is run by a human who
>> >> >> supplies his or her ASF credentials. That script copies the commit to
>> >> >> the ASF git repo.
>> >> >>
>> >> >> Often, 2-3 commits will pile up in gerrit before some committer runs
>> >> >> that script and pushes them to ASF.
>> >> >>
>> >> >> We could edit that script (bin/push_to_asf.py) to help with the
>> cherry
>> >> >> 

Re: Impala 3.x (and 2.x)

2018-01-12 Thread Jim Apple
Often, this list seems to filter out images. Could you post it and send a
link?

Thanks for taking this on, Phil!

On Fri, Jan 12, 2018 at 3:15 PM, Philip Zeyliger 
wrote:

> I think most patches go to Gerrit branch 'master', which happens to
> identify itself as 3.0. (Or 3.x?).
>
> Here's a picture:
>
> [image: Inline image 1]
>
>
> With this, every time "cherrypick_and_push_to_asf.py" is run, it would
> first offer to cherrypick changes between master and 2.x. Then, it would
> offer push those cherrypicks to gerrit/2.x. After that, it continues on as
> before and offers to push changes to ASF. I think this maintains the
> invariant that pushing to ASF is only done with a human trigger. (We could
> also have step 1 be done by a Jenkins robot, since it's between Gerrit and
> Gerrit.)
>
> I looked at the How to Release page, and the main difference would be
> that, for a 2.x release, the $COMMIT_HASH_YOU_CHOSE would come from the 2.x
> branch, as would any cherrypicks.
>
> Does this match what you're thinking?
>
> Thanks!
>
> -- Philip
>
> On Fri, Jan 12, 2018 at 11:07 AM, Jim Apple  wrote:
>
>> Which gerrit branch were you thinking most patches would go to?
>>
>> If they go to 3.0, then push_to_asf.py would have to be amended to
>> push to gerrit, bypassing code review. I think that's possible, but
>> I'm not 100%.
>>
>> There is also security to think about, since the push_to_asf.py users
>> can push a few commits at a time, including ones they didn't author or
>> review.
>>
>> We'll also want to clarify
>> https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release and
>> keep it consistent with the git & gerrit statuses quo.
>>
>> On Fri, Jan 12, 2018 at 10:52 AM, Philip Zeyliger 
>> wrote:
>> > Hi!
>> >
>> >> Should we start tagging all candidates with a common label, e.g.
>> > include-in-v3?
>> >
>> > I agree with Lars's suggestion for tagging JIRAs with include-in-v3.
>> I've
>> > done so, and the relevant query is
>> > https://issues.apache.org/jira/issues/?jql=labels%20%3D%20in
>> clude-in-v3%20and%20project%3Dimpala
>> > .
>> >
>> >> What sort of process were you thinking of for the automation?
>> >
>> > I think amending push_to_asf.py, as you suggest, is a great idea. I
>> think
>> > we have a string ("not for 2.x") which can be used in commit messages to
>> > discourage the cherrypick for the changes we want to be exclusive until
>> we
>> > want to change the defaults in the other direction. (I.e., right now the
>> > string is "not for 2.x", but at some point the string may be "should be
>> > cherrypicked to 2.x".)
>> >
>> > I do think that we want to create a gerrit branch to allow 2.x-only
>> changes
>> > to be reviewed in the straight-forward fashion.
>> >
>> > -- Philip
>> >
>> >
>> >
>> > On Thu, Jan 11, 2018 at 9:31 AM, Jim Apple 
>> wrote:
>> >
>> >> I'm on-board with all of this. (I also would be OK delaying 3.0, if
>> >> that were the consensus).
>> >>
>> >> There is one issue in here I think we should dive into:
>> >>
>> >> > Both master and 2.x would be active, and, at least for the beginning,
>> >> > changes would automatically be pulled into the 2.x line, unless
>> >> explicitly
>> >> > blacklisted.
>> >>
>> >> What sort of process were you thinking of for the automation?
>> >>
>> >> Some context, starting from what we all likely already know:
>> >>
>> >> The bulk of the code review and pre-merge testing results are recorded
>> >> in gerrit. Once the pre-merge testing passes, a patch is cherry-picked
>> >> to the git repo hosted with gerrit. To get the patch to the Impala git
>> >> repo hosted by the ASF, bin/push_to_asf.py is run by a human who
>> >> supplies his or her ASF credentials. That script copies the commit to
>> >> the ASF git repo.
>> >>
>> >> Often, 2-3 commits will pile up in gerrit before some committer runs
>> >> that script and pushes them to ASF.
>> >>
>> >> We could edit that script (bin/push_to_asf.py) to help with the cherry
>> >> picks, so that each time a commit is made, the committer must say
>> >> whether the commit goes in 2.x, 3.0, or both, but the commits are
>> >> often made by people who didn't author the patches, so they may not be
>> >> sure which branch to go in.
>> >>
>> >> Additionally, gerrit code review is intimately tied to the git repo.
>> >> Gerrit runs a git repo under-the-hood, and I believe that the branch
>> >> on gerrit's git that changes are cherry-picked to after pre-merge
>> >> testing is identical to the Impala git repo hosted by the ASF - down
>> >> to the hashes, even. If we think 2.x and 3.0 will diverge enough that
>> >> we'll want different code reviews for different branches, then we
>> >> might want two different branches on gerrit, too.
>> >>
>>
>
>


Re: Impala 3.x (and 2.x)

2018-01-12 Thread Jim Apple
Which gerrit branch were you thinking most patches would go to?

If they go to 3.0, then push_to_asf.py would have to be amended to
push to gerrit, bypassing code review. I think that's possible, but
I'm not 100%.

There is also security to think about, since the push_to_asf.py users
can push a few commits at a time, including ones they didn't author or
review.

We'll also want to clarify
https://cwiki.apache.org/confluence/display/IMPALA/How+to+Release and
keep it consistent with the git & gerrit statuses quo.

On Fri, Jan 12, 2018 at 10:52 AM, Philip Zeyliger  wrote:
> Hi!
>
>> Should we start tagging all candidates with a common label, e.g.
> include-in-v3?
>
> I agree with Lars's suggestion for tagging JIRAs with include-in-v3. I've
> done so, and the relevant query is
> https://issues.apache.org/jira/issues/?jql=labels%20%3D%20include-in-v3%20and%20project%3Dimpala
> .
>
>> What sort of process were you thinking of for the automation?
>
> I think amending push_to_asf.py, as you suggest, is a great idea. I think
> we have a string ("not for 2.x") which can be used in commit messages to
> discourage the cherrypick for the changes we want to be exclusive until we
> want to change the defaults in the other direction. (I.e., right now the
> string is "not for 2.x", but at some point the string may be "should be
> cherrypicked to 2.x".)
>
> I do think that we want to create a gerrit branch to allow 2.x-only changes
> to be reviewed in the straight-forward fashion.
>
> -- Philip
>
>
>
> On Thu, Jan 11, 2018 at 9:31 AM, Jim Apple  wrote:
>
>> I'm on-board with all of this. (I also would be OK delaying 3.0, if
>> that were the consensus).
>>
>> There is one issue in here I think we should dive into:
>>
>> > Both master and 2.x would be active, and, at least for the beginning,
>> > changes would automatically be pulled into the 2.x line, unless
>> explicitly
>> > blacklisted.
>>
>> What sort of process were you thinking of for the automation?
>>
>> Some context, starting from what we all likely already know:
>>
>> The bulk of the code review and pre-merge testing results are recorded
>> in gerrit. Once the pre-merge testing passes, a patch is cherry-picked
>> to the git repo hosted with gerrit. To get the patch to the Impala git
>> repo hosted by the ASF, bin/push_to_asf.py is run by a human who
>> supplies his or her ASF credentials. That script copies the commit to
>> the ASF git repo.
>>
>> Often, 2-3 commits will pile up in gerrit before some committer runs
>> that script and pushes them to ASF.
>>
>> We could edit that script (bin/push_to_asf.py) to help with the cherry
>> picks, so that each time a commit is made, the committer must say
>> whether the commit goes in 2.x, 3.0, or both, but the commits are
>> often made by people who didn't author the patches, so they may not be
>> sure which branch to go in.
>>
>> Additionally, gerrit code review is intimately tied to the git repo.
>> Gerrit runs a git repo under-the-hood, and I believe that the branch
>> on gerrit's git that changes are cherry-picked to after pre-merge
>> testing is identical to the Impala git repo hosted by the ASF - down
>> to the hashes, even. If we think 2.x and 3.0 will diverge enough that
>> we'll want different code reviews for different branches, then we
>> might want two different branches on gerrit, too.
>>


Re: Accumulo Integration [SEC=UNOFFICIAL]

2018-01-12 Thread Edward Capriolo
https://www.quora.com/How-do-we-compare-Apache-HBase-vs-Apache-Accumulo

HBase and Accumulo are both implementations of BigTable using HDFS as their
storage mechanism.They are virtually identical in most regards from
both an operational and architectual standpoint; they have different APIs,
and use different nomenclature, but conceptually they do the same thing and
performance is close enough that there's no significant advantage to using
one over the other.

I was under the general impression that the above was true. So it should be
a near 1 to 1 port right? IE fundamentally both are BigTable clones.

On Fri, Jan 12, 2018 at 11:59 AM, Jim Apple  wrote:

> I don't know that Impala can scan Accumulo tables. Assuming it can't:
> Impala can read HBase tables, so that might be a good place for you to
> start looking on what this could look like.
>
> I suspect the most complex piece is going to be the auth integration.
>
> On Thu, Jan 11, 2018 at 7:16 PM, Dickson, Matt MR
>  wrote:
> > UNOFFICIAL
> > I'm investigate integrating Accumulo as a backed to Impala.  At this
> point it is simply an idea and I'm interested to know if anything has been
> done to achieve this in the past/present?
> >
> > If so, is the code open source?
> >
> > If not, I'd be very keen to gain some detail on how this could be
> achieved or issues that prevent it being feasible.
> >
> > Thanks in advance,
> > Matt
> >
>


Re: Impala 3.x (and 2.x)

2018-01-12 Thread Philip Zeyliger
Hi!

> Should we start tagging all candidates with a common label, e.g.
include-in-v3?

I agree with Lars's suggestion for tagging JIRAs with include-in-v3. I've
done so, and the relevant query is
https://issues.apache.org/jira/issues/?jql=labels%20%3D%20include-in-v3%20and%20project%3Dimpala
.

> What sort of process were you thinking of for the automation?

I think amending push_to_asf.py, as you suggest, is a great idea. I think
we have a string ("not for 2.x") which can be used in commit messages to
discourage the cherrypick for the changes we want to be exclusive until we
want to change the defaults in the other direction. (I.e., right now the
string is "not for 2.x", but at some point the string may be "should be
cherrypicked to 2.x".)

I do think that we want to create a gerrit branch to allow 2.x-only changes
to be reviewed in the straight-forward fashion.

-- Philip



On Thu, Jan 11, 2018 at 9:31 AM, Jim Apple  wrote:

> I'm on-board with all of this. (I also would be OK delaying 3.0, if
> that were the consensus).
>
> There is one issue in here I think we should dive into:
>
> > Both master and 2.x would be active, and, at least for the beginning,
> > changes would automatically be pulled into the 2.x line, unless
> explicitly
> > blacklisted.
>
> What sort of process were you thinking of for the automation?
>
> Some context, starting from what we all likely already know:
>
> The bulk of the code review and pre-merge testing results are recorded
> in gerrit. Once the pre-merge testing passes, a patch is cherry-picked
> to the git repo hosted with gerrit. To get the patch to the Impala git
> repo hosted by the ASF, bin/push_to_asf.py is run by a human who
> supplies his or her ASF credentials. That script copies the commit to
> the ASF git repo.
>
> Often, 2-3 commits will pile up in gerrit before some committer runs
> that script and pushes them to ASF.
>
> We could edit that script (bin/push_to_asf.py) to help with the cherry
> picks, so that each time a commit is made, the committer must say
> whether the commit goes in 2.x, 3.0, or both, but the commits are
> often made by people who didn't author the patches, so they may not be
> sure which branch to go in.
>
> Additionally, gerrit code review is intimately tied to the git repo.
> Gerrit runs a git repo under-the-hood, and I believe that the branch
> on gerrit's git that changes are cherry-picked to after pre-merge
> testing is identical to the Impala git repo hosted by the ASF - down
> to the hashes, even. If we think 2.x and 3.0 will diverge enough that
> we'll want different code reviews for different branches, then we
> might want two different branches on gerrit, too.
>


Re: .test file format

2018-01-12 Thread Tim Armstrong
It is possible to get the test infra to generate .test files based on the
actual output. For planner tests, they automatically get written
to logs/fe_tests/PlannerTest/.

If you're running end-to-end tests with impala-py.test, you can use the
--update_results flag to get it to generate similar files for tests that
fail. There are some more pitfalls here but it can be a useful tool.

If you use these shortcuts it's still important to manually check the
changes to the files.

On Fri, Jan 12, 2018 at 9:03 AM, Jim Apple  wrote:

> .test files are written by hand, with the exception of long complex
> input, which I'm sure authors copy-paste from Impala shell sometimes.
>
> The formats are described in
>
> https://github.com/apache/impala/blob/master/fe/src/
> test/java/org/apache/impala/testutil/TestFileParser.java
>
> and
>
> https://github.com/apache/impala/blob/master/tests/util/
> test_file_parser.py
>
> On Fri, Jan 12, 2018 at 8:57 AM, Xinran Yu Tinney
>  wrote:
> > Hi,
> >Is there a link for how to generate the .test files and what is each
> > part in it? Thanks!
> >
> > Xinran
>


Re: .test file format

2018-01-12 Thread Jim Apple
.test files are written by hand, with the exception of long complex
input, which I'm sure authors copy-paste from Impala shell sometimes.

The formats are described in

https://github.com/apache/impala/blob/master/fe/src/test/java/org/apache/impala/testutil/TestFileParser.java

and

https://github.com/apache/impala/blob/master/tests/util/test_file_parser.py

On Fri, Jan 12, 2018 at 8:57 AM, Xinran Yu Tinney
 wrote:
> Hi,
>Is there a link for how to generate the .test files and what is each
> part in it? Thanks!
>
> Xinran


Re: Accumulo Integration [SEC=UNOFFICIAL]

2018-01-12 Thread Jim Apple
I don't know that Impala can scan Accumulo tables. Assuming it can't:
Impala can read HBase tables, so that might be a good place for you to
start looking on what this could look like.

I suspect the most complex piece is going to be the auth integration.

On Thu, Jan 11, 2018 at 7:16 PM, Dickson, Matt MR
 wrote:
> UNOFFICIAL
> I'm investigate integrating Accumulo as a backed to Impala.  At this point it 
> is simply an idea and I'm interested to know if anything has been done to 
> achieve this in the past/present?
>
> If so, is the code open source?
>
> If not, I'd be very keen to gain some detail on how this could be achieved or 
> issues that prevent it being feasible.
>
> Thanks in advance,
> Matt
>


.test file format

2018-01-12 Thread Xinran Yu Tinney
Hi,
   Is there a link for how to generate the .test files and what is each
part in it? Thanks!

Xinran