Re: client config files
So does this mean you'd rather have the config switches be called something like javax.net.ssl.trustStore rather than general.server.ssl.trustStore in the Accumulo Properties? Our implementation of SSL will be provided by the thrift connectors rather than us using JSSE directly, so we'll have to interpret them ourselves rather than JSSE doing it automatically. Should we respect these flags if they're passed in through System.getProperties() in addition to whatever the normal config flow ends up being? Note that we won't be able to respect many of the flags, since we're at Thrift's mercy for socket construction.

On Thu, Aug 1, 2013 at 10:06 PM, Christopher ctubb...@apache.org wrote:
I'm generally not a fan of the way some standard Java things have been reinvented in Hadoop. I've always seen JSSE SSL config as system properties, optionally stored in a properties file. Even if Hadoop is using an XML-based configuration for this purpose, I'd still steer clear of it, for this reason.
-- Christopher L Tubbs II http://gravatar.com/ctubbsii

On Thu, Aug 1, 2013 at 4:33 PM, Joey Echeverria j...@cloudera.com wrote:
I generally prefer properties files to XML, but there may be an argument for reusing Hadoop's SSL configuration system, which is XML-based.
-Joey

On Thu, Aug 1, 2013 at 3:08 PM, Christopher ctubb...@apache.org wrote:
^ Another reason I like commons-configuration here is for property-interpolation with HierarchicalConfiguration.
-- Christopher L Tubbs II http://gravatar.com/ctubbsii

On Thu, Aug 1, 2013 at 3:07 PM, Christopher ctubb...@apache.org wrote:
I absolutely DO think they should be combined in a properties file located in $HOME/.accumulo/config. I absolutely DO NOT think this client configuration should be exclusive to the shell, and I absolutely DO NOT think it should be XML.

I would love to see all our clients/client code use commons-configuration to hold properties from the properties file, so that only a --config parameter is needed (with reasonable defaults, so even that is not absolutely necessary). I also think that every property that can exist in the file should be possible to override on the command-line. I personally prefer to use system properties, using commons-configuration's HierarchicalConfiguration, but jcommander may make it easier to do the same thing in a slightly different way.
-- Christopher L Tubbs II http://gravatar.com/ctubbsii

On Thu, Aug 1, 2013 at 12:25 PM, Michael Berman mber...@sqrrl.com wrote:
As part of SSL, we need to introduce configuration so accumulo clients (such as ZooKeeperInstance) can find trust stores. It seems like this has a lot in common with shell config files in ACCUMULO-1397. Do people think these should be combined, or should the shell have its own separate config?

I was imagining a simple java .properties-style key=value list. Does this seem reasonable? Or should the format be more like the xml of the site config?

I was also imagining looking through a list of files that would each override settings, perhaps in the following order (from lowest to highest priority):

* /etc/accumulo/client.conf
* $ACCUMULO_HOME/conf/client.conf
* $HOME/.accumulo/config
* --client-config command line switch for shell or explicit parameter passed to ZooKeeperInstance

Does this sound good to y'all? Should the explicit switch/parameter have per-property override semantics, or should it just be used as the exclusive source of properties if specified?

Mike Drob, are you actively working on the shell side of this already? I see that bug is assigned to you...

Thanks, Michael
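To make the System.getProperties() question concrete, here is a minimal Java sketch of one possible lookup. It assumes the Accumulo-side key is consulted first and the JSSE system property is only a fallback; that precedence, and the class name TrustStoreLookup, are assumptions made for illustration. javax.net.ssl.trustStore is the standard JSSE property name, while general.server.ssl.trustStore is only the name floated in this thread.

```java
import java.util.Properties;

// Hypothetical helper for illustration; not part of Accumulo or Thrift.
final class TrustStoreLookup {
    // Standard JSSE system property name.
    static final String JSSE_KEY = "javax.net.ssl.trustStore";
    // Name floated in the thread; an assumption, not a confirmed Accumulo property.
    static final String ACCUMULO_KEY = "general.server.ssl.trustStore";

    // Returns the trust store path, or null when neither source sets one.
    // Assumed precedence: client config first, then system properties.
    static String resolve(Properties clientConfig, Properties systemProps) {
        String fromConfig = clientConfig.getProperty(ACCUMULO_KEY);
        if (fromConfig != null) {
            return fromConfig;
        }
        return systemProps.getProperty(JSSE_KEY);
    }
}
```

A caller would pass System.getProperties() as the second argument and hand the result to whatever code ends up building the Thrift socket.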
Re: client config files
Yeah, I was starting to think along those lines in ACCUMULO-1397 (a generic configuration library). Although options about how to make connections to accumulo are needed by both clients and accumulo services themselves (for masters talking to tablets, for example).

On Fri, Aug 2, 2013 at 1:20 PM, Christopher ctubb...@apache.org wrote:
Yes, I think that would provide the best user experience for client code. I'm not too stuck on this point, though I do think it should be independent of AccumuloConfiguration for reasons I mentioned on ACCUMULO-1397.

I've been playing with the idea of creating a generic typed configuration library that extends commons-configuration to make it easier to get the same value as AccumuloConfiguration's Property enums, but without being so monolithic and Accumulo server-specific. That common interface could form the basis of an AccumuloClientConfiguration, and independently, an AccumuloServerConfiguration. Do you think that would be useful for your client configuration?
-- Christopher L Tubbs II http://gravatar.com/ctubbsii

On Fri, Aug 2, 2013 at 10:11 AM, Michael Berman mber...@sqrrl.com wrote:
So does this mean you'd rather have the config switches be called something like javax.net.ssl.trustStore rather than general.server.ssl.trustStore in the Accumulo Properties? Our implementation of SSL will be provided by the thrift connectors rather than us using JSSE directly, so we'll have to interpret them ourselves rather than JSSE doing it automatically. Should we respect these flags if they're passed in through System.getProperties() in addition to whatever the normal config flow ends up being? Note that we won't be able to respect many of the flags, since we're at Thrift's mercy for socket construction.
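The "generic typed configuration" idea mentioned in this message could look roughly like the sketch below. It is only an illustration: TypedKey and the key name rpc.timeout.ms are hypothetical, and it is backed by java.util.Properties rather than commons-configuration so that it stays self-contained.

```java
import java.util.Properties;
import java.util.function.Function;

// Hypothetical typed key; not an Accumulo or commons-configuration class.
final class TypedKey<T> {
    final String name;
    final T defaultValue;
    private final Function<String, T> parser;

    TypedKey(String name, T defaultValue, Function<String, T> parser) {
        this.name = name;
        this.defaultValue = defaultValue;
        this.parser = parser;
    }

    // Returns the parsed value, or the default when the key is absent.
    T get(Properties props) {
        String raw = props.getProperty(name);
        return raw == null ? defaultValue : parser.apply(raw.trim());
    }
}
```

Separate client and server configuration classes could then each declare only the keys that belong to their scope, instead of sharing one monolithic enum.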
Re: Hadoop 2.0 Support for Accumulo 1.4 Branch
Sorry for the delay, it's been one of those weeks. The current version would probably not be backwards compatible to 0.20.2 just based on changes in dependencies. We're looking right now to see how hard it is to have three-way compatibility (0.20, 1.0, 2.0).
-Joey

On Thu, Aug 1, 2013 at 7:33 PM, Dave Marion dlmar...@comcast.net wrote:
Any update?

-Original Message-
From: Joey Echeverria [mailto:j...@cloudera.com]
Sent: Monday, July 29, 2013 1:24 PM
To: dev@accumulo.apache.org
Subject: Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

We're testing this today. I'll report back what we find.
-Joey

On Fri, Jul 26, 2013 at 3:34 PM, null dlmar...@comcast.net wrote:
Will 1.4 still work with 0.20 with these patches? Great point Billie.

- Original Message -
From: Billie Rinaldi billie.rina...@gmail.com
To: dev@accumulo.apache.org
Sent: Friday, July 26, 2013 3:02:41 PM
Subject: Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

On Fri, Jul 26, 2013 at 11:33 AM, Joey Echeverria j...@cloudera.com wrote:
If these patches are going to be included with 1.4.4 or 1.4.5, I would like to see the following test run using CDH4 on at least a 5 node cluster. More nodes would be better.

* unit test
* Functional test
* 24 hr Continuous ingest + verification
* 24 hr Continuous ingest + verification + agitation
* 24 hr Random walk
* 24 hr Random walk + agitation

I may be able to assist with this, but I cannot make any promises.

Sure thing. Is there already a write-up on running this full battery of tests? I have a 10 node cluster that I can use for this.

Great. I think this would be a good patch for 1.4. I assume that if a user stays with Hadoop 1 there are no dependency changes?

Yup. It works the same way as 1.5 where all of the dependency changes are in a Hadoop 2.0 profile.

In 1.5.0, we gave up on compatibility with 0.20 (and early versions of 1.0) to make the compatibility requirements simpler; we ended up without dependency changes in the hadoop version profiles. Will 1.4 still work with 0.20 with these patches? If there are dependency changes in the profiles, 1.4 would have to be compiled against a hadoop version compatible with the running version of hadoop, correct? We had some trouble in the 1.5 release process with figuring out how to provide multiple binary artifacts (each compiled against a different version of hadoop) for the same release. Just something we should consider before we are in the midst of releasing 1.4.4.

Billie

-Joey
-- Joey Echeverria Director, Federal FTS Cloudera, Inc.
Re: client config files
On Thu, Aug 1, 2013 at 4:33 PM, Joey Echeverria j...@cloudera.com wrote:
I generally prefer properties files to XML, but there may be an argument for reusing Hadoop's SSL configuration system, which is XML-based.

I also prefer properties files over XML. The only reason I can think of that we might want to use XML is for consistency with Hadoop and the Accumulo server-side config. But that does not seem like a very compelling reason; it's not like prop files are hard to use once you realize you need to use them.
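For the lookup order proposed earlier in this thread (lowest to highest priority, with later files overriding earlier ones per property), a minimal Java sketch might look like the following. LayeredClientConfig is a hypothetical name and plain java.util.Properties is used instead of commons-configuration; whether an explicit --client-config file should override per property, as here, or be the exclusive source is exactly the open question in the thread.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Properties;

// Hypothetical helper for illustration; not an Accumulo class.
final class LayeredClientConfig {
    // Loads a properties file, or returns an empty set when the file is absent.
    static Properties loadIfPresent(Path file) {
        Properties props = new Properties();
        if (Files.isRegularFile(file)) {
            try (Reader reader = Files.newBufferedReader(file)) {
                props.load(reader);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return props;
    }

    // Merges layers given from lowest to highest priority;
    // a key in a later layer replaces the same key from an earlier one.
    static Properties merge(List<Properties> lowToHigh) {
        Properties merged = new Properties();
        for (Properties layer : lowToHigh) {
            merged.putAll(layer);
        }
        return merged;
    }
}
```

A caller would build the list from /etc/accumulo/client.conf, $ACCUMULO_HOME/conf/client.conf, $HOME/.accumulo/config and the explicit file, in that order, and pass it to merge().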
Re: client config files
The overlap is only a conceptual overlap, not an implementation one. Servers use HdfsZooInstance, which reads the xml configuration file and reads the instanceId out of HDFS. Clients have ZooKeeperInstance, which requires user input and gets the instanceId from a convenient pointer in ZK. Even if this were changed slightly to get either the alias or the instanceID directly from a client configuration file, it still wouldn't be quite the same, because we don't expect clients to use the xml configuration file at all.
-- Christopher L Tubbs II http://gravatar.com/ctubbsii

On Fri, Aug 2, 2013 at 1:25 PM, Michael Berman mber...@sqrrl.com wrote:
Yeah, I was starting to think along those lines in ACCUMULO-1397 (a generic configuration library). Although options about how to make connections to accumulo are needed by both clients and accumulo services themselves (for masters talking to tablets, for example).
Re: client config files
Yeah, I agree. Consistency with Hadoop here is probably not that valuable.
-Joey

On Fri, Aug 2, 2013 at 2:28 PM, Keith Turner ke...@deenlo.com wrote:
On Thu, Aug 1, 2013 at 4:33 PM, Joey Echeverria j...@cloudera.com wrote:
I generally prefer properties files to XML, but there may be an argument for reusing Hadoop's SSL configuration system, which is XML-based.

I also prefer properties files over XML. The only reason I can think of that we might want to use XML is for consistency with Hadoop and the Accumulo server-side config. But that does not seem like a very compelling reason; it's not like prop files are hard to use once you realize you need to use them.
Re: Hadoop 2.0 Support for Accumulo 1.4 Branch
Would it be reasonable to consider a version of 1.4 that breaks compatibility with 0.20? I'm not really a fan of this, personally, but am curious what others think.
-- Christopher L Tubbs II http://gravatar.com/ctubbsii

On Fri, Aug 2, 2013 at 2:22 PM, Joey Echeverria j...@cloudera.com wrote:
Sorry for the delay, it's been one of those weeks. The current version would probably not be backwards compatible to 0.20.2 just based on changes in dependencies. We're looking right now to see how hard it is to have three-way compatibility (0.20, 1.0, 2.0).
-Joey
Re: Hadoop 2.0 Support for Accumulo 1.4 Branch
I don't think that's a good idea unless you can come up with a very clear version number change.
-Joey

On Fri, Aug 2, 2013 at 2:31 PM, Christopher ctubb...@apache.org wrote:
Would it be reasonable to consider a version of 1.4 that breaks compatibility with 0.20? I'm not really a fan of this, personally, but am curious what others think.
-- Christopher L Tubbs II http://gravatar.com/ctubbsii
Re: client config files
I believe it is an implementation overlap. Both ZKInstance and the master-tablet thrift connections get created in ThriftUtil.getClient(). Higher up in the stack, in both paths, we have access to an Instance from which to draw configuration (with getConfiguration()). In one case, it's a ZKInstance with a degenerate AccumuloConfiguration, and in the other case it's an HDFSInstance with a site.xml-backed configuration, but the thrift stack makes no distinction. It seems silly to me to introduce a distinction all the way down the stack just so we can have two different config sources (which have many of the same flags). Unless we were going to implement it as a ThriftConnectionConfiguration interface with named methods that both AccumuloConfiguration and ClientConfiguration could implement...but that would be a big departure from the Property enum configuration model.

On Fri, Aug 2, 2013 at 2:29 PM, Joey Echeverria j...@cloudera.com wrote:
Yeah, I agree. Consistency with Hadoop here is probably not that valuable.
-Joey
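The ThriftConnectionConfiguration alternative mentioned in this message could be sketched in Java roughly as follows. Only the interface name comes from the thread; the method names, the property keys (rpc.ssl.enabled, rpc.ssl.truststore.path, rpc.timeout.ms), the default timeout, and the two helper classes are assumptions made purely for illustration.

```java
import java.util.Properties;

// Interface name taken from the thread; the methods are illustrative assumptions.
interface ThriftConnectionConfiguration {
    boolean isSslEnabled();
    String getTrustStorePath();
    long getRpcTimeoutMillis();
}

// Hypothetical stand-in for a properties-backed client configuration.
final class PropertiesConnectionConfig implements ThriftConnectionConfiguration {
    private final Properties props;

    PropertiesConnectionConfig(Properties props) {
        this.props = props;
    }

    @Override
    public boolean isSslEnabled() {
        return Boolean.parseBoolean(props.getProperty("rpc.ssl.enabled", "false"));
    }

    @Override
    public String getTrustStorePath() {
        return props.getProperty("rpc.ssl.truststore.path");
    }

    @Override
    public long getRpcTimeoutMillis() {
        // 120000 is a placeholder default, not Accumulo's actual value.
        return Long.parseLong(props.getProperty("rpc.timeout.ms", "120000"));
    }
}

// Stand-in for code low in the stack, which only ever sees the interface.
final class ConnectionDescriber {
    static String describe(ThriftConnectionConfiguration conf) {
        return (conf.isSslEnabled() ? "ssl" : "plain")
                + ", timeout=" + conf.getRpcTimeoutMillis() + "ms";
    }
}
```

A server-side class backed by the site xml could implement the same interface, so the code that builds connections would not need to know which config source it was given.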
Re: Hadoop 2.0 Support for Accumulo 1.4 Branch
Which version of 0.20 are you testing against? Vanilla, or cdh3 flavored? On Fri, Aug 2, 2013 at 2:37 PM, Joey Echeverria j...@cloudera.com wrote: I don't think that's a good idea unless you can come up with very clear version number change. -Joey On Fri, Aug 2, 2013 at 2:31 PM, Christopher ctubb...@apache.org wrote: Would it be reasonable to consider a version of 1.4 that breaks compatibility with 0.20? I'm not really a fan of this, personally, but am curious what others think. -- Christopher L Tubbs II http://gravatar.com/ctubbsii On Fri, Aug 2, 2013 at 2:22 PM, Joey Echeverria j...@cloudera.com wrote: Sorry for the delay, it's been one of those weeks. The current version would probably not be backwards compatible to 0.20.2 just based on changes in dependencies. We're looking right now to see how hard it is to have three way compatibility (0.20, 1.0, 2.0). -Joey On Thu, Aug 1, 2013 at 7:33 PM, Dave Marion dlmar...@comcast.net wrote: Any update? -Original Message- From: Joey Echeverria [mailto:j...@cloudera.com] Sent: Monday, July 29, 2013 1:24 PM To: dev@accumulo.apache.org Subject: Re: Hadoop 2.0 Support for Accumulo 1.4 Branch We're testing this today. I'll report back what we find. -Joey — Sent from Mailbox for iPhone On Fri, Jul 26, 2013 at 3:34 PM, null dlmar...@comcast.net wrote: Will 1.4 still work with 0.20 with these patches? Great point Billie. - Original Message - From: Billie Rinaldi billie.rina...@gmail.com To: dev@accumulo.apache.org Sent: Friday, July 26, 2013 3:02:41 PM Subject: Re: Hadoop 2.0 Support for Accumulo 1.4 Branch On Fri, Jul 26, 2013 at 11:33 AM, Joey Echeverria j...@cloudera.com wrote: If these patches are going to be included with 1.4.4 or 1.4.5, I would like to see the following test run using CDH4 on at least a 5 node cluster. More nodes would be better. 
* Unit test
* Functional test
* 24 hr Continuous ingest + verification
* 24 hr Continuous ingest + verification + agitation
* 24 hr Random walk
* 24 hr Random walk + agitation

I may be able to assist with this, but I cannot make any promises.

Sure thing. Is there already a write-up on running this full battery of tests? I have a 10 node cluster that I can use for this.

Great. I think this would be a good patch for 1.4. I assume that if a user stays with Hadoop 1 there are no dependency changes?

Yup. It works the same way as 1.5 where all of the dependency changes are in a Hadoop 2.0 profile.

In 1.5.0, we gave up on compatibility with 0.20 (and early versions of 1.0) to make the compatibility requirements simpler; we ended up without dependency changes in the hadoop version profiles. Will 1.4 still work with 0.20 with these patches? If there are dependency changes in the profiles, 1.4 would have to be compiled against a hadoop version compatible with the running version of hadoop, correct? We had some trouble in the 1.5 release process with figuring out how to provide multiple binary artifacts (each compiled against a different version of hadoop) for the same release. Just something we should consider before we are in the midst of releasing 1.4.4. Billie

-Joey -- Joey Echeverria Director, Federal FTS Cloudera, Inc.
Re: client config files
Okay, so there is implementation overlap, but that overlap is pretty minimal (admittedly, it could potentially grow). The only thing it is currently used for is carrying the value of the RPC timeout, and this is not currently very friendly to end users (they'd have to instantiate something that extends AccumuloConfiguration, just to override that one property's value, and then call instance.setConfiguration()). I like configuration with scopes, so I would prefer to have configuration properties scoped to RPC. I implemented the Property-enum configuration model to reflect this preference, but I made the mistake of carrying all the scopes in a single configuration object. I don't see a problem separating out the RPC scope into a separate configuration object that could be shared between the other, broader scopes (client, master, tserver, etc.). In other words, I'd prefer smaller, more localized scopes for configuration, and I think it's sensible to have an RPC/connection scope (not necessarily thrift-specific). See BatchWriterConfig for a good example of a locally-scoped configuration. -- Christopher L Tubbs II http://gravatar.com/ctubbsii On Fri, Aug 2, 2013 at 3:24 PM, Michael Berman mber...@sqrrl.com wrote: I believe it is an implementation overlap. Both ZKInstance and the master-tablet thrift connections get created in ThriftUtil.getClient(). Higher up in the stack, in both paths, we have access to an Instance from which to draw configuration (with getConfiguration()). In one case, it's a ZKInstance with a degenerate AccumuloConfiguration, and in the other case it's an HDFSInstance with a site.xml-backed configuration, but the thrift stack makes no distinction. It seems silly to me to introduce a distinction all the way down the stack just so we can have two different config sources (which have many of the same flags).
Unless we were going to implement it as a ThriftConnectionConfiguration interface with named methods that both AccumuloConfiguration and ClientConfiguration could implement... but that would be a big departure from the Property enum configuration model. On Fri, Aug 2, 2013 at 2:29 PM, Joey Echeverria j...@cloudera.com wrote: Yeah, I agree. Consistency with Hadoop here is probably not that valuable. -Joey On Fri, Aug 2, 2013 at 2:28 PM, Keith Turner ke...@deenlo.com wrote: On Thu, Aug 1, 2013 at 4:33 PM, Joey Echeverria j...@cloudera.com wrote: I generally prefer properties files to XML, but there may be an argument for reusing Hadoop's SSL configuration system which is XML based. I also prefer properties files over XML. The only reason I can think of that we might want to use XML is for consistency with Hadoop and Accumulo server-side config. But it does not seem like a very compelling reason; it's not like prop files are hard to use once you realize you need to use them. -Joey On Thu, Aug 1, 2013 at 3:08 PM, Christopher ctubb...@apache.org wrote: ^ Another reason I like commons-configuration here is for property-interpolation with HierarchicalConfiguration. -- Christopher L Tubbs II http://gravatar.com/ctubbsii On Thu, Aug 1, 2013 at 3:07 PM, Christopher ctubb...@apache.org wrote: I absolutely DO think they should be combined in a properties file located in $HOME/.accumulo/config I absolutely DO NOT think this client configuration should be exclusive to the shell, and I absolutely DO NOT think it should be XML. I would love to see all our clients/client code use commons-configuration to hold properties from the properties file, so that only a --config parameter is needed (with reasonable defaults, so even that is not absolutely necessary). I also think that every property that can exist in the file should be possible to override on the command-line.
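[Editor's sketch] The per-property override described just above (a properties file supplies defaults, and anything given on the command line wins) might look roughly like the following. This is only an illustration using plain java.util.Properties as a stand-in; the thread proposes commons-configuration, which is not shown here. The class name LayeredClientConfig and the keys rpc.timeout and trustStore are hypothetical and do not come from Accumulo.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

/** Hypothetical helper (not an Accumulo class): merges layers of client properties. */
final class LayeredClientConfig {

    /** Merges layers given lowest priority first; later layers override per property. */
    static Properties merge(List<Properties> layersLowestFirst) {
        Properties merged = new Properties();
        for (Properties layer : layersLowestFirst) {
            merged.putAll(layer); // keys in a later layer replace earlier values
        }
        return merged;
    }

    /** Parses key=value text in java .properties syntax. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Assumed contents of a client properties file (keys are made up).
        Properties fileLayer = parse("trustStore=/etc/accumulo/truststore.jks\nrpc.timeout=120s\n");
        // Assumed overrides collected from command-line switches.
        Properties commandLine = parse("rpc.timeout=30s\n");

        Properties effective = merge(Arrays.asList(fileLayer, commandLine));
        System.out.println(effective.getProperty("rpc.timeout")); // prints 30s
        System.out.println(effective.getProperty("trustStore"));  // prints /etc/accumulo/truststore.jks
    }
}
```

Passing the layers lowest priority first keeps the precedence rule in one place, so adding another source (for example a system-properties layer) only means adding one more element to the list.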
I personally prefer to use system properties, using commons-configuration's HierarchicalConfiguration, but jcommander may make it easier to do the same thing in a slightly different way. -- Christopher L Tubbs II http://gravatar.com/ctubbsii On Thu, Aug 1, 2013 at 12:25 PM, Michael Berman mber...@sqrrl.com wrote: As part of SSL, we need to introduce configuration so accumulo clients (such as ZooKeeperInstance) can find trust stores. It seems like this has a lot in common with shell config files in ACCUMULO-1397. Do people think these should be combined, or should the shell have its own separate config? I was imagining a simple java .properties-style key=value list. Does this seem reasonable? Or should the format be more like the xml of the site config? I was also imagining looking through a list of files that would each override settings, perhaps in the following order (from lowest to highest priority): /etc/accumulo/client.conf $ACCUMULO_HOME/conf/client.conf $HOME/.accumulo/config