FWIW, most vendors will also publish the source code for their patched releases. That would at least help with the source code reading.
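For the periodic relogin discussed further down the thread, here is a minimal sketch of what such a background thread could look like. It is not taken from Accumulo or from any vendor distribution. KeytabRelogin and ReloginAction are hypothetical names made up for illustration, and the Hadoop call appears only in a comment so the snippet compiles without Hadoop on the classpath. The interval is an assumption too; pick something well inside your ticket lifetime.

```java
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Accumulo or Hadoop.
final class KeytabRelogin {

    // Hypothetical callback type standing in for the Hadoop call.
    @FunctionalInterface
    interface ReloginAction {
        void run() throws IOException;
    }

    // Runs the action repeatedly on a single daemon thread, waiting
    // `period` between the end of one run and the start of the next.
    static ScheduledExecutorService start(ReloginAction action, long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "keytab-relogin");
            t.setDaemon(true); // do not keep the JVM alive just for this thread
            return t;
        });
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                action.run();
            } catch (IOException | RuntimeException e) {
                // An uncaught exception would cancel all future runs,
                // so log it and let the next scheduled run try again.
                System.err.println("Kerberos relogin check failed: " + e);
            }
        }, period, period, unit);
        return scheduler;
    }
}

// Assumed usage in a client that has already logged in from a keytab
// (requires Hadoop's org.apache.hadoop.security.UserGroupInformation):
//
//   KeytabRelogin.start(
//       () -> UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab(),
//       10, TimeUnit.MINUTES);
```

Catching the exception inside the task matters because a scheduled executor stops rescheduling a task once it throws. This only covers the client-side renewal loop; it does not explain the ticket-cache-based re-login message reported below on the vendor build.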
On Wed, Jul 12, 2017 at 2:48 PM, James Srinivasan <[email protected]> wrote:
> Yup, I'm going to spin up a vanilla 1.7.0 (maybe newer) install too to
> see if it behaves any differently. There is at least one patch
> included in their distro that isn't in the formal documentation, plus
> it makes matching line numbers in logs to src code rather difficult.
>
> Thanks,
>
> James
>
> On 12 July 2017 at 20:37, Sean Busbey <[email protected]> wrote:
>> Hi James!
>>
>> It sounds like you may need to chase things down with your vendor,
>> since the precise combination of patches included will make looking at
>> things hard for the community.
>>
>> On Wed, Jul 12, 2017 at 11:01 AM, James Srinivasan
>> <[email protected]> wrote:
>>> Hi,
>>>
>>> So I've fired off a thread to perform the periodic
>>> checkTGTAndReloginFromKeytab call which seems to be running, but the
>>> connection still fails with GSS errors after precisely 10 hours.
>>>
>>> While I am running 1.7.0, it seems the vendor included the
>>> ACCUMULO-4069 patch, and immediately after the exception is thrown I
>>> see a log entry "Performing ticket-cache-based Kerberos re-login".
>>> However, it should be using a keytab - have turned up the logging to
>>> 11 and will leave running overnight...
>>>
>>> James
>>>
>>> On 11 July 2017 at 16:17, Josh Elser <[email protected]> wrote:
>>>> Nope, you've got it exactly right! That's the code I would've pointed
>>>> you at to copy :)
>>>>
>>>> If/when you do get to long-running MR jobs, see the
>>>> "general.delegation.token.*" configuration properties in this table[1]. I
>>>> think the docs are citing that one delegation token is valid for 7 days,
>>>> but it's been a long time since writing/testing that code.
>>>>
>>>> - Josh
>>>>
>>>> [1]
>>>> https://accumulo.apache.org/1.8/accumulo_user_manual.html#_server_configuration_2
>>>>
>>>> On 7/11/17 1:25 AM, James Srinivasan wrote:
>>>>>
>>>>> Thanks both. I can't (easily) upgrade beyond 1.7.0, but have raised a
>>>>> support case with our Hadoop distribution vendor.
>>>>>
>>>>> I'm not (yet) worried about expiration with MapReduce - for now I'll
>>>>> try to keep such jobs to under 24h! Outside MR, sounds like I just
>>>>> need to periodically call
>>>>> UserGroupInformation.checkTGTAndReloginFromKeytab like
>>>>>
>>>>>
>>>>> https://github.com/apache/accumulo/blob/master/server/base/src/main/java/org/apache/accumulo/server/security/SecurityUtil.java#L121
>>>>>
>>>>> Or is the TGT associated with an Accumulo KerberosToken separate?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> James
>>>>>
>>>>> On 11 July 2017 at 02:59, Josh Elser <[email protected]> wrote:
>>>>>>
>>>>>> No, you are (likely) not running into ACCUMULO-4069. What you've
>>>>>> described sounds like your client's ticket expired. Accumulo does not
>>>>>> spawn any ticket renewal on the behalf of clients.
>>>>>>
>>>>>> Hadoop's UGI code will automatically spawn a renewal thread when you
>>>>>> log in using a ticket cache. This does not happen automatically when
>>>>>> you use a keytab (I have no explanation as to why this is). This is
>>>>>> the most likely cause of your error and something you need to correct
>>>>>> in your application (spawn a thread to renew your application's
>>>>>> ticket).
>>>>>>
>>>>>> If you are using MapReduce, you have yet another layer of indirection
>>>>>> with DelegationTokens, but that's probably not what you're seeing (as
>>>>>> DelegationTokens don't actually have a Kerberos TGT).
>>>>>>
>>>>>> On Mon, Jul 10, 2017 at 5:42 PM, Christopher <[email protected]> wrote:
>>>>>>>
>>>>>>> It certainly sounds like the same issue. I'd recommend upgrading to the
>>>>>>> latest 1.7.3 (currently the latest 1.7 version) to include all the bugs
>>>>>>> we've found and fixed in that release line.
>>>>>>>
>>>>>>> On Mon, Jul 10, 2017 at 5:50 AM James Srinivasan
>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> I'm using Accumulo 1.7.0 and finding that after some period of time
>>>>>>>> (>8 hours, <3 days - happened over the weekend) my ingest fails with
>>>>>>>> errors regarding "Failed to find any Kerberos tgt". My guess is that
>>>>>>>> the ticket from the keytab has expired, and needs to be renewed - from
>>>>>>>> memory, I had seen a Kerberos tgt renewer thread running in my client,
>>>>>>>> so assumed it happened automagically. Is that the case? Perhaps I am
>>>>>>>> hitting this bug? https://issues.apache.org/jira/browse/ACCUMULO-4069
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> James
>>
>>
>>
>> --
>> busbey

--
busbey
