Re: client config files

2013-08-02 Thread Michael Berman
So does this mean you'd rather have the config switches be called something
like javax.net.ssl.trustStore rather than general.server.ssl.trustStore in
the Accumulo Properties?  Our implementation of SSL will be provided by the
thrift connectors rather than us using JSSE directly, so we'll have to
interpret them ourselves rather than JSSE doing it automatically.  Should
we respect these flags if they're passed in through System.getProperties()
in addition to whatever the normal config flow ends up being?  Note that we
won't be able to respect many of the flags, since we're at Thrift's mercy
for socket construction.
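
Editor's sketch, not from the thread: one way the fallback asked about above could look, assuming the Accumulo-specific key wins when both sources are set (the thread does not settle that ordering). The class TrustStoreLookup and its method are hypothetical. The two key names are the ones mentioned in the message, and only javax.net.ssl.trustStore is a standard JSSE property.

```java
import java.util.Properties;

/** Hypothetical helper; not part of Accumulo or Thrift. */
final class TrustStoreLookup {
    // Key name quoted in the message above; whether Accumulo adopts it is undecided.
    static final String ACCUMULO_KEY = "general.server.ssl.trustStore";
    // Standard JSSE system property name.
    static final String JSSE_KEY = "javax.net.ssl.trustStore";

    /**
     * Returns the trust store path from the Accumulo properties if present,
     * otherwise from the supplied system properties, otherwise null.
     * Assumption: the Accumulo-specific setting takes precedence.
     */
    static String resolve(Properties accumuloProps, Properties systemProps) {
        String fromConfig = accumuloProps.getProperty(ACCUMULO_KEY);
        if (fromConfig != null) {
            return fromConfig;
        }
        return systemProps.getProperty(JSSE_KEY);
    }

    public static void main(String[] args) {
        // With an empty client config this prints whatever -Djavax.net.ssl.trustStore was set to, or null.
        System.out.println(resolve(new Properties(), System.getProperties()));
    }
}
```

The resolved path would still have to be handed to whatever builds the Thrift socket, since JSSE would not pick it up on its own in this setup.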


On Thu, Aug 1, 2013 at 10:06 PM, Christopher ctubb...@apache.org wrote:

 I'm generally not a fan of the way some standard Java things have been
 reinvented in Hadoop. I've always seen JSSE SSL config as system
 properties, optionally stored in a properties file. Even if Hadoop is
 using an XML-based configuration for this purpose, I'd still steer
 clear of it, for this reason.

 --
 Christopher L Tubbs II
 http://gravatar.com/ctubbsii


 On Thu, Aug 1, 2013 at 4:33 PM, Joey Echeverria j...@cloudera.com wrote:
  I generally prefer properties files to XML, but there may be an argument
 for reusing Hadoop's SSL configuration system which is XML based.
 
 
  -Joey
 
  On Thu, Aug 1, 2013 at 3:08 PM, Christopher ctubb...@apache.org wrote:
 
  ^ Another reason I like commons-configuration here is for
  property-interpolation with HierarchicalConfiguration.
  --
  Christopher L Tubbs II
  http://gravatar.com/ctubbsii
  On Thu, Aug 1, 2013 at 3:07 PM, Christopher ctubb...@apache.org
 wrote:
  I absolutely DO think they should be combined in a properties file
  located in $HOME/.accumulo/config
  I absolutely DO NOT think this client configuration should be
  exclusive to the shell, and I absolutely DO NOT think it should be
  XML.
 
  I would love to see all our clients/client code use
  commons-configuration to hold properties from the properties file, so
  that only a --config parameter is needed (with reasonable defaults, so
  even that is not absolutely necessary). I also think that every
  property that can exist in the file should be possible to override on
  the command-line. I personally prefer to use system properties, using
  commons-configuration's HierarchicalConfiguration, but jcommander may
  make it easier to do the same thing in a slightly different way.
 
  --
  Christopher L Tubbs II
  http://gravatar.com/ctubbsii
 
 
  On Thu, Aug 1, 2013 at 12:25 PM, Michael Berman mber...@sqrrl.com
 wrote:
  As part of SSL, we need to introduce configuration so accumulo clients
  (such as ZooKeeperInstance) can find trust stores.  It seems like
 this has
  a lot in common with shell config files in ACCUMULO-1397.  Do people
 think
  these should be combined, or should the shell have its own separate
 config?
   I was imagining a simple java .properties-style key=value list.
  Does this
  seem reasonable?  Or should the format be more like the xml of the
 site
  config?  I was also imagining looking through a list of files that
 would
  each override settings, perhaps in the following order (from lowest to
  highest priority):
 
  /etc/accumulo/client.conf
  $ACCUMULO_HOME/conf/client.conf
  $HOME/.accumulo/config
  --client-config command line switch for shell or explicit parameter
 passed
  to ZooKeeperInstance
 
  Does this sound good to y'all?  Should the explicit switch/parameter
 have
  per-property override semantics, or should it just be used as the
 exclusive
  source of properties if specified?
 
  Mike Drob, are you actively working on the shell side of this
 already?  I
  see that bug is assigned to you...
 
  Thanks,
  Michael
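
Editor's sketch, not from the thread: the lowest-to-highest lookup order listed in the quoted proposal above, written with plain java.util.Properties. It assumes per-property override semantics, which is only one of the two options the proposal asks about. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical loader; not an existing Accumulo class. */
final class LayeredClientConfig {

    /** Candidate files from lowest to highest priority, following the proposed list. */
    static List<Path> searchPath(String accumuloHome, String userHome, String explicitFile) {
        List<Path> paths = new ArrayList<>();
        paths.add(Paths.get("/etc/accumulo/client.conf"));
        if (accumuloHome != null) {
            paths.add(Paths.get(accumuloHome, "conf", "client.conf"));
        }
        if (userHome != null) {
            paths.add(Paths.get(userHome, ".accumulo", "config"));
        }
        if (explicitFile != null) {
            paths.add(Paths.get(explicitFile)); // e.g. the value of --client-config
        }
        return paths;
    }

    /** Reads one key=value file; a missing file yields an empty layer. */
    static Properties loadIfPresent(Path file) {
        Properties props = new Properties();
        if (Files.isRegularFile(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return props;
    }

    /** Merges layers so that later (higher-priority) layers override earlier ones per key. */
    static Properties merge(List<Properties> lowestToHighest) {
        Properties merged = new Properties();
        for (Properties layer : lowestToHighest) {
            merged.putAll(layer);
        }
        return merged;
    }
}
```

Under the "exclusive source" alternative, the caller would skip the merge and use only the layer loaded from the explicit file whenever one is given.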



Re: client config files

2013-08-02 Thread Michael Berman
Yeah, I was starting to think along those lines in ACCUMULO-1397 (a generic
configuration library).  That said, options about how to make connections to
Accumulo are needed by both clients and the Accumulo services themselves (for
masters talking to tablets, for example).
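
Editor's sketch, not from the thread: a rough picture of the "generic typed configuration" idea discussed in this part of the thread, using plain java.util.Properties rather than commons-configuration. The enum, its keys, and its defaults are all invented for illustration and are not Accumulo property names.

```java
import java.util.Properties;

/** Hypothetical typed keys; key names and defaults are illustrative only. */
enum ClientProperty {
    INSTANCE_NAME("instance.name", "accumulo"),
    RPC_SSL_ENABLED("rpc.ssl.enabled", "false"),
    RPC_TIMEOUT_MS("rpc.timeout.ms", "120000");

    private final String key;
    private final String defaultValue;

    ClientProperty(String key, String defaultValue) {
        this.key = key;
        this.defaultValue = defaultValue;
    }

    String key() {
        return key;
    }

    String defaultValue() {
        return defaultValue;
    }
}

/** Hypothetical typed view over a key=value properties source. */
final class TypedClientConfig {
    private final Properties props;

    TypedClientConfig(Properties props) {
        this.props = props;
    }

    /** Raw string value, falling back to the key's declared default. */
    String get(ClientProperty p) {
        return props.getProperty(p.key(), p.defaultValue());
    }

    boolean getBoolean(ClientProperty p) {
        return Boolean.parseBoolean(get(p));
    }

    long getLong(ClientProperty p) {
        return Long.parseLong(get(p));
    }
}
```

A server-side counterpart could expose the same typed getters over a different key set, which is roughly the client/server split suggested in the quoted message.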


On Fri, Aug 2, 2013 at 1:20 PM, Christopher ctubb...@apache.org wrote:

 Yes, I think that would provide the best user experience for client
 code. I'm not too stuck on this point, though I do think it should be
 independent of AccumuloConfiguration for reasons I mentioned on
 ACCUMULO-1397.

 I've been playing with the idea of creating a generic typed
 configuration library that extends commons-configuration to make it
 easier to get the same value as AccumuloConfiguration's Property
 enums, but without being so monolithic and Accumulo server-specific.
 That common interface could form the basis of an
 AccumuloClientConfiguration, and independently, an
 AccumuloServerConfiguration. Do you think that would be useful for
 your client configuration?

 --
 Christopher L Tubbs II
 http://gravatar.com/ctubbsii


 On Fri, Aug 2, 2013 at 10:11 AM, Michael Berman mber...@sqrrl.com wrote:
  So does this mean you'd rather have the config switches be called
 something
  like javax.net.ssl.trustStore rather than general.server.ssl.trustStore
 in
  the Accumulo Properties?  Our implementation of SSL will be provided by
 the
  thrift connectors rather than us using JSSE directly, so we'll have to
  interpret them ourselves rather than JSSE doing it automatically.  Should
  we respect these flags if they're passed in through
 System.getProperties()
  in addition to whatever the normal config flow ends up being?  Note that
 we
  won't be able to respect many of the flags, since we're at Thrift's mercy
  for socket construction.
 
 
  On Thu, Aug 1, 2013 at 10:06 PM, Christopher ctubb...@apache.org
 wrote:
 
  I'm generally not a fan of the way some standard Java things have been
  reinvented in Hadoop. I've always seen JSSE SSL config as system
  properties, optionally stored in a properties file. Even if Hadoop is
  using an XML-based configuration for this purpose, I'd still steer
  clear of it, for this reason.
 
  --
  Christopher L Tubbs II
  http://gravatar.com/ctubbsii
 
 
  On Thu, Aug 1, 2013 at 4:33 PM, Joey Echeverria j...@cloudera.com
 wrote:
   I generally prefer properties files to XML, but there may be an
 argument
  for reusing Hadoop's SSL configuration system which is XML based.
  
  
   -Joey
  
   On Thu, Aug 1, 2013 at 3:08 PM, Christopher ctubb...@apache.org
 wrote:
  
   ^ Another reason I like commons-configuration here is for
   property-interpolation with HierarchicalConfiguration.
   --
   Christopher L Tubbs II
   http://gravatar.com/ctubbsii
   On Thu, Aug 1, 2013 at 3:07 PM, Christopher ctubb...@apache.org
  wrote:
   I absolutely DO think they should be combined in a properties file
   located in $HOME/.accumulo/config
   I absolutely DO NOT think this client configuration should be
   exclusive to the shell, and I absolutely DO NOT think it should be
   XML.
  
   I would love to see all our clients/client code use
   commons-configuration to hold properties from the properties file,
 so
   that only a --config parameter is needed (with reasonable defaults,
 so
   even that is not absolutely necessary). I also think that every
   property that can exist in the file should be possible to override
 on
   the command-line. I personally prefer to use system properties,
 using
   commons-configuration's HierarchicalConfiguration, but jcommander
 may
   make it easier to do the same thing in a slightly different way.
  
   --
   Christopher L Tubbs II
   http://gravatar.com/ctubbsii
  
  
   On Thu, Aug 1, 2013 at 12:25 PM, Michael Berman mber...@sqrrl.com
  wrote:
   As part of SSL, we need to introduce configuration so accumulo
 clients
   (such as ZooKeeperInstance) can find trust stores.  It seems like
  this has
   a lot in common with shell config files in ACCUMULO-1397.  Do
 people
  think
   these should be combined, or should the shell have its own separate
  config?
I was imagining a simple java .properties-style key=value list.
   Does this
   seem reasonable?  Or should the format be more like the xml of the
  site
   config?  I was also imagining looking through a list of files that
  would
   each override settings, perhaps in the following order (from
 lowest to
   highest priority):
  
   /etc/accumulo/client.conf
   $ACCUMULO_HOME/conf/client.conf
   $HOME/.accumulo/config
   --client-config command line switch for shell or explicit parameter
  passed
   to ZooKeeperInstance
  
   Does this sound good to y'all?  Should the explicit
 switch/parameter
  have
   per-property override semantics, or should it just be used as the
  exclusive
   source of properties if specified?
  
   Mike Drob, are you actively working on the shell side of this
  already?  I
   see that bug is 

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

2013-08-02 Thread Joey Echeverria
Sorry for the delay, it's been one of those weeks.

The current version would probably not be backwards compatible with
0.20.2, just based on changes in dependencies. We're looking right now
to see how hard it is to have three-way compatibility (0.20, 1.0,
2.0).

-Joey

On Thu, Aug 1, 2013 at 7:33 PM, Dave Marion dlmar...@comcast.net wrote:
 Any update?

 -Original Message-
 From: Joey Echeverria [mailto:j...@cloudera.com]
 Sent: Monday, July 29, 2013 1:24 PM
 To: dev@accumulo.apache.org
 Subject: Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

 We're testing this today. I'll report back what we find.


 -Joey

 On Fri, Jul 26, 2013 at 3:34 PM, null dlmar...@comcast.net wrote:

 Will 1.4 still work with 0.20 with these patches?
 Great point Billie.
 - Original Message -
 From: Billie Rinaldi billie.rina...@gmail.com
 To: dev@accumulo.apache.org
 Sent: Friday, July 26, 2013 3:02:41 PM
 Subject: Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

 On Fri, Jul 26, 2013 at 11:33 AM, Joey Echeverria j...@cloudera.com wrote:
  If these patches are going to be included with 1.4.4 or 1.4.5, I would
  like to see the following tests run using CDH4 on at least a 5 node
  cluster. More nodes would be better.
 
* unit test
* Functional test
* 24 hr Continuous ingest + verification
* 24 hr Continuous ingest + verification + agitation
* 24 hr Random walk
* 24 hr Random walk + agitation
 
  I may be able to assist with this, but I can not make any promises.

 Sure thing. Is there already a write-up on running this full battery
 of tests? I have a 10 node cluster that I can use for this.


  Great.  I think this would be a good patch for 1.4.   I assume that
  if a user stays with Hadoop 1 there are no dependency changes?

 Yup. It works the same way as 1.5 where all of the dependency changes
 are in a Hadoop 2.0 profile.

 In 1.5.0, we gave up on compatibility with 0.20 (and early versions of
 1.0) to make the compatibility requirements simpler; we ended up
 without dependency changes in the hadoop version profiles.  Will 1.4
 still work with 0.20 with these patches?  If there are dependency
 changes in the profiles, 1.4 would have to be compiled against a
 hadoop version compatible with the running version of hadoop, correct?
 We had some trouble in the
 1.5 release process with figuring out how to provide multiple binary
 artifacts (each compiled against a different version of hadoop) for
 the same release.  Just something we should consider before we are in
 the midst of releasing 1.4.4.
 Billie
 -Joey





-- 
Joey Echeverria
Director, Federal FTS
Cloudera, Inc.


Re: client config files

2013-08-02 Thread Keith Turner
On Thu, Aug 1, 2013 at 4:33 PM, Joey Echeverria j...@cloudera.com wrote:

 I generally prefer properties files to XML, but there may be an argument
 for reusing Hadoop's SSL configuration system which is XML based.


I also prefer properties files over XML.  The only reason I can think
of for using XML is consistency with Hadoop and the Accumulo server-side
config.  But that does not seem like a very compelling reason; it's not
like properties files are hard to use once you realize you need to use
them.




 -Joey

 On Thu, Aug 1, 2013 at 3:08 PM, Christopher ctubb...@apache.org wrote:

  ^ Another reason I like commons-configuration here is for
  property-interpolation with HierarchicalConfiguration.
  --
  Christopher L Tubbs II
  http://gravatar.com/ctubbsii
  On Thu, Aug 1, 2013 at 3:07 PM, Christopher ctubb...@apache.org wrote:
  I absolutely DO think they should be combined in a properties file
  located in $HOME/.accumulo/config
  I absolutely DO NOT think this client configuration should be
  exclusive to the shell, and I absolutely DO NOT think it should be
  XML.
 
  I would love to see all our clients/client code use
  commons-configuration to hold properties from the properties file, so
  that only a --config parameter is needed (with reasonable defaults, so
  even that is not absolutely necessary). I also think that every
  property that can exist in the file should be possible to override on
  the command-line. I personally prefer to use system properties, using
  commons-configuration's HierarchicalConfiguration, but jcommander may
  make it easier to do the same thing in a slightly different way.
 
  --
  Christopher L Tubbs II
  http://gravatar.com/ctubbsii
 
 
  On Thu, Aug 1, 2013 at 12:25 PM, Michael Berman mber...@sqrrl.com
 wrote:
  As part of SSL, we need to introduce configuration so accumulo clients
  (such as ZooKeeperInstance) can find trust stores.  It seems like this
 has
  a lot in common with shell config files in ACCUMULO-1397.  Do people
 think
  these should be combined, or should the shell have its own separate
 config?
   I was imagining a simple java .properties-style key=value list.  Does
 this
  seem reasonable?  Or should the format be more like the xml of the site
  config?  I was also imagining looking through a list of files that
 would
  each override settings, perhaps in the following order (from lowest to
  highest priority):
 
  /etc/accumulo/client.conf
  $ACCUMULO_HOME/conf/client.conf
  $HOME/.accumulo/config
  --client-config command line switch for shell or explicit parameter
 passed
  to ZooKeeperInstance
 
  Does this sound good to y'all?  Should the explicit switch/parameter
 have
  per-property override semantics, or should it just be used as the
 exclusive
  source of properties if specified?
 
  Mike Drob, are you actively working on the shell side of this already?
  I
  see that bug is assigned to you...
 
  Thanks,
  Michael



Re: client config files

2013-08-02 Thread Christopher
The overlap is only a conceptual overlap, not an implementation one.
Servers use HdfsZooInstance, which reads the XML configuration file
and reads the instanceId out of HDFS. Clients have ZooKeeperInstance,
which requires user input and gets the instanceId from a convenient
pointer in ZK. Even if this were changed slightly to get either the
alias or the instanceId directly from a client configuration file, it
still wouldn't be quite the same, because we don't expect clients to
use the XML configuration file at all.

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Fri, Aug 2, 2013 at 1:25 PM, Michael Berman mber...@sqrrl.com wrote:
 Yeah, I was starting to think along those lines in ACCUMULO-1397 (a generic
 configuration library).  Although options about how to make connections to
 accumulo are needed by both clients and accumulo services themselves (for
 masters talking to tablets, for example).


 On Fri, Aug 2, 2013 at 1:20 PM, Christopher ctubb...@apache.org wrote:

 Yes, I think that would provide the best user experience for client
 code. I'm not too stuck on this point, though I do think it should be
 independent of AccumuloConfiguration for reasons I mentioned on
 ACCUMULO-1397.

 I've been playing with the idea of creating a generic typed
 configuration library that extends commons-configuration to make it
 easier to get the same value as AccumuloConfiguration's Property
 enums, but without being so monolithic and Accumulo server-specific.
 That common interface could form the basis of an
 AccumuloClientConfiguration, and independently, an
 AccumuloServerConfiguration. Do you think that would be useful for
 your client configuration?

 --
 Christopher L Tubbs II
 http://gravatar.com/ctubbsii


 On Fri, Aug 2, 2013 at 10:11 AM, Michael Berman mber...@sqrrl.com wrote:
  So does this mean you'd rather have the config switches be called
 something
  like javax.net.ssl.trustStore rather than general.server.ssl.trustStore
 in
  the Accumulo Properties?  Our implementation of SSL will be provided by
 the
  thrift connectors rather than us using JSSE directly, so we'll have to
  interpret them ourselves rather than JSSE doing it automatically.  Should
  we respect these flags if they're passed in through
 System.getProperties()
  in addition to whatever the normal config flow ends up being?  Note that
 we
  won't be able to respect many of the flags, since we're at Thrift's mercy
  for socket construction.
 
 
  On Thu, Aug 1, 2013 at 10:06 PM, Christopher ctubb...@apache.org
 wrote:
 
  I'm generally not a fan of the way some standard Java things have been
  reinvented in Hadoop. I've always seen JSSE SSL config as system
  properties, optionally stored in a properties file. Even if Hadoop is
  using an XML-based configuration for this purpose, I'd still steer
  clear of it, for this reason.
 
  --
  Christopher L Tubbs II
  http://gravatar.com/ctubbsii
 
 
  On Thu, Aug 1, 2013 at 4:33 PM, Joey Echeverria j...@cloudera.com
 wrote:
   I generally prefer properties files to XML, but there may be an
 argument
  for reusing Hadoop's SSL configuration system which is XML based.
  
  
   -Joey
  
   On Thu, Aug 1, 2013 at 3:08 PM, Christopher ctubb...@apache.org
 wrote:
  
   ^ Another reason I like commons-configuration here is for
   property-interpolation with HierarchicalConfiguration.
   --
   Christopher L Tubbs II
   http://gravatar.com/ctubbsii
   On Thu, Aug 1, 2013 at 3:07 PM, Christopher ctubb...@apache.org
  wrote:
   I absolutely DO think they should be combined in a properties file
   located in $HOME/.accumulo/config
   I absolutely DO NOT think this client configuration should be
   exclusive to the shell, and I absolutely DO NOT think it should be
   XML.
  
   I would love to see all our clients/client code use
   commons-configuration to hold properties from the properties file,
 so
   that only a --config parameter is needed (with reasonable defaults,
 so
   even that is not absolutely necessary). I also think that every
   property that can exist in the file should be possible to override
 on
   the command-line. I personally prefer to use system properties,
 using
   commons-configuration's HierarchicalConfiguration, but jcommander
 may
   make it easier to do the same thing in a slightly different way.
  
   --
   Christopher L Tubbs II
   http://gravatar.com/ctubbsii
  
  
   On Thu, Aug 1, 2013 at 12:25 PM, Michael Berman mber...@sqrrl.com
  wrote:
   As part of SSL, we need to introduce configuration so accumulo
 clients
   (such as ZooKeeperInstance) can find trust stores.  It seems like
  this has
   a lot in common with shell config files in ACCUMULO-1397.  Do
 people
  think
   these should be combined, or should the shell have its own separate
  config?
I was imagining a simple java .properties-style key=value list.
   Does this
   seem reasonable?  Or should the format be more like the xml of the
  site
   config?  I 

Re: client config files

2013-08-02 Thread Joey Echeverria
Yeah, I agree. Consistency with Hadoop here is probably not that valuable.

-Joey

On Fri, Aug 2, 2013 at 2:28 PM, Keith Turner ke...@deenlo.com wrote:
 On Thu, Aug 1, 2013 at 4:33 PM, Joey Echeverria j...@cloudera.com wrote:

 I generally prefer properties files to XML, but there may be an argument
 for reusing Hadoop's SSL configuration system which is XML based.


 I also prefer properties files over XML.  The only reason I can think
 of for using XML is consistency with Hadoop and the Accumulo server-side
 config.  But that does not seem like a very compelling reason; it's not
 like properties files are hard to use once you realize you need to use
 them.




 -Joey

 On Thu, Aug 1, 2013 at 3:08 PM, Christopher ctubb...@apache.org wrote:

  ^ Another reason I like commons-configuration here is for
  property-interpolation with HierarchicalConfiguration.
  --
  Christopher L Tubbs II
  http://gravatar.com/ctubbsii
  On Thu, Aug 1, 2013 at 3:07 PM, Christopher ctubb...@apache.org wrote:
  I absolutely DO think they should be combined in a properties file
  located in $HOME/.accumulo/config
  I absolutely DO NOT think this client configuration should be
  exclusive to the shell, and I absolutely DO NOT think it should be
  XML.
 
  I would love to see all our clients/client code use
  commons-configuration to hold properties from the properties file, so
  that only a --config parameter is needed (with reasonable defaults, so
  even that is not absolutely necessary). I also think that every
  property that can exist in the file should be possible to override on
  the command-line. I personally prefer to use system properties, using
  commons-configuration's HierarchicalConfiguration, but jcommander may
  make it easier to do the same thing in a slightly different way.
 
  --
  Christopher L Tubbs II
  http://gravatar.com/ctubbsii
 
 
  On Thu, Aug 1, 2013 at 12:25 PM, Michael Berman mber...@sqrrl.com
 wrote:
  As part of SSL, we need to introduce configuration so accumulo clients
  (such as ZooKeeperInstance) can find trust stores.  It seems like this
 has
  a lot in common with shell config files in ACCUMULO-1397.  Do people
 think
  these should be combined, or should the shell have its own separate
 config?
   I was imagining a simple java .properties-style key=value list.  Does
 this
  seem reasonable?  Or should the format be more like the xml of the site
  config?  I was also imagining looking through a list of files that
 would
  each override settings, perhaps in the following order (from lowest to
  highest priority):
 
  /etc/accumulo/client.conf
  $ACCUMULO_HOME/conf/client.conf
  $HOME/.accumulo/config
  --client-config command line switch for shell or explicit parameter
 passed
  to ZooKeeperInstance
 
  Does this sound good to y'all?  Should the explicit switch/parameter
 have
  per-property override semantics, or should it just be used as the
 exclusive
  source of properties if specified?
 
  Mike Drob, are you actively working on the shell side of this already?
  I
  see that bug is assigned to you...
 
  Thanks,
  Michael




-- 
Joey Echeverria
Director, Federal FTS
Cloudera, Inc.


Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

2013-08-02 Thread Christopher
Would it be reasonable to consider a version of 1.4 that breaks
compatibility with 0.20? I'm not really a fan of this, personally, but
am curious what others think.

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Fri, Aug 2, 2013 at 2:22 PM, Joey Echeverria j...@cloudera.com wrote:
 Sorry for the delay, it's been one of those weeks.

 The current version would probably not be backwards compatible to
 0.20.2 just based on changes in dependencies. We're looking right now
 to see how hard it is to have three way compatibility (0.20, 1.0,
 2.0).

 -Joey

 On Thu, Aug 1, 2013 at 7:33 PM, Dave Marion dlmar...@comcast.net wrote:
 Any update?

 -Original Message-
 From: Joey Echeverria [mailto:j...@cloudera.com]
 Sent: Monday, July 29, 2013 1:24 PM
 To: dev@accumulo.apache.org
 Subject: Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

 We're testing this today. I'll report back what we find.


 -Joey

 On Fri, Jul 26, 2013 at 3:34 PM, null dlmar...@comcast.net wrote:

 Will 1.4 still work with 0.20 with these patches?
 Great point Billie.
 - Original Message -
 From: Billie Rinaldi billie.rina...@gmail.com
 To: dev@accumulo.apache.org
 Sent: Friday, July 26, 2013 3:02:41 PM
 Subject: Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

 On Fri, Jul 26, 2013 at 11:33 AM, Joey Echeverria j...@cloudera.com wrote:
  If these patches are going to be included with 1.4.4 or 1.4.5, I would
  like to see the following tests run using CDH4 on at least a 5 node
  cluster. More nodes would be better.
 
* unit test
* Functional test
* 24 hr Continuous ingest + verification
* 24 hr Continuous ingest + verification + agitation
* 24 hr Random walk
* 24 hr Random walk + agitation
 
  I may be able to assist with this, but I can not make any promises.

 Sure thing. Is there already a write-up on running this full battery
 of tests? I have a 10 node cluster that I can use for this.


  Great.  I think this would be a good patch for 1.4.   I assume that
  if a user stays with Hadoop 1 there are no dependency changes?

 Yup. It works the same way as 1.5 where all of the dependency changes
 are in a Hadoop 2.0 profile.

 In 1.5.0, we gave up on compatibility with 0.20 (and early versions of
 1.0) to make the compatibility requirements simpler; we ended up
 without dependency changes in the hadoop version profiles.  Will 1.4
 still work with 0.20 with these patches?  If there are dependency
 changes in the profiles, 1.4 would have to be compiled against a
 hadoop version compatible with the running version of hadoop, correct?
 We had some trouble in the
 1.5 release process with figuring out how to provide multiple binary
 artifacts (each compiled against a different version of hadoop) for
 the same release.  Just something we should consider before we are in
 the midst of releasing 1.4.4.
 Billie
 -Joey





 --
 Joey Echeverria
 Director, Federal FTS
 Cloudera, Inc.


Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

2013-08-02 Thread Joey Echeverria
I don't think that's a good idea unless you can come up with a very
clear version number change.

-Joey

On Fri, Aug 2, 2013 at 2:31 PM, Christopher ctubb...@apache.org wrote:
 Would it be reasonable to consider a version of 1.4 that breaks
 compatibility with 0.20? I'm not really a fan of this, personally, but
 am curious what others think.

 --
 Christopher L Tubbs II
 http://gravatar.com/ctubbsii


 On Fri, Aug 2, 2013 at 2:22 PM, Joey Echeverria j...@cloudera.com wrote:
 Sorry for the delay, it's been one of those weeks.

 The current version would probably not be backwards compatible to
 0.20.2 just based on changes in dependencies. We're looking right now
 to see how hard it is to have three way compatibility (0.20, 1.0,
 2.0).

 -Joey

 On Thu, Aug 1, 2013 at 7:33 PM, Dave Marion dlmar...@comcast.net wrote:
 Any update?

 -Original Message-
 From: Joey Echeverria [mailto:j...@cloudera.com]
 Sent: Monday, July 29, 2013 1:24 PM
 To: dev@accumulo.apache.org
 Subject: Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

 We're testing this today. I'll report back what we find.


 -Joey

 On Fri, Jul 26, 2013 at 3:34 PM, null dlmar...@comcast.net wrote:

 Will 1.4 still work with 0.20 with these patches?
 Great point Billie.
 - Original Message -
 From: Billie Rinaldi billie.rina...@gmail.com
 To: dev@accumulo.apache.org
 Sent: Friday, July 26, 2013 3:02:41 PM
 Subject: Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

 On Fri, Jul 26, 2013 at 11:33 AM, Joey Echeverria j...@cloudera.com wrote:
  If these patches are going to be included with 1.4.4 or 1.4.5, I would
  like to see the following tests run using CDH4 on at least a 5 node
  cluster. More nodes would be better.
 
* unit test
* Functional test
* 24 hr Continuous ingest + verification
* 24 hr Continuous ingest + verification + agitation
* 24 hr Random walk
* 24 hr Random walk + agitation
 
  I may be able to assist with this, but I can not make any promises.

 Sure thing. Is there already a write-up on running this full battery
 of tests? I have a 10 node cluster that I can use for this.


  Great.  I think this would be a good patch for 1.4.   I assume that
  if a user stays with Hadoop 1 there are no dependency changes?

 Yup. It works the same way as 1.5 where all of the dependency changes
 are in a Hadoop 2.0 profile.

 In 1.5.0, we gave up on compatibility with 0.20 (and early versions of
 1.0) to make the compatibility requirements simpler; we ended up
 without dependency changes in the hadoop version profiles.  Will 1.4
 still work with 0.20 with these patches?  If there are dependency
 changes in the profiles, 1.4 would have to be compiled against a
 hadoop version compatible with the running version of hadoop, correct?
 We had some trouble in the
 1.5 release process with figuring out how to provide multiple binary
 artifacts (each compiled against a different version of hadoop) for
 the same release.  Just something we should consider before we are in
 the midst of releasing 1.4.4.
 Billie
 -Joey





 --
 Joey Echeverria
 Director, Federal FTS
 Cloudera, Inc.



-- 
Joey Echeverria
Director, Federal FTS
Cloudera, Inc.


Re: client config files

2013-08-02 Thread Michael Berman
I believe it is an implementation overlap.  Both ZKInstance and the
master-tablet thrift connections get created in ThriftUtil.getClient().
Higher up in the stack, in both paths, we have access to an Instance from
which to draw configuration (with getConfiguration()).  In one case, it's a
ZKInstance with a degenerate AccumuloConfiguration, and in the other case
it's an HDFSInstance with a site.xml-backed configuration, but the thrift
stack makes no distinction.  It seems silly to me to introduce a
distinction all the way down the stack just so we can have two different
config sources (which have many of the same flags).  Unless we were going
to implement it as a ThriftConnectionConfiguration interface with named
methods that both AccumuloConfiguration and ClientConfiguration could
implement...but that would be a big departure from the Property enum
configuration model.
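
Editor's sketch, not from the thread: what the ThriftConnectionConfiguration alternative mentioned above might look like. The interface name comes from the message. Its methods, the two implementing classes, and the "example.*" keys are hypothetical stand-ins, not Accumulo code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Interface named in the message above; the method set here is an assumption. */
interface ThriftConnectionConfiguration {
    boolean isSslEnabled();

    String getTrustStorePath();
}

/** Stand-in for a server-side, site.xml-backed configuration. */
final class SiteBackedConfiguration implements ThriftConnectionConfiguration {
    private final Map<String, String> site;

    SiteBackedConfiguration(Map<String, String> site) {
        this.site = new HashMap<>(site);
    }

    @Override
    public boolean isSslEnabled() {
        // parseBoolean(null) is false, so an unset key means SSL is off.
        return Boolean.parseBoolean(site.get("example.ssl.enabled"));
    }

    @Override
    public String getTrustStorePath() {
        return site.get("example.ssl.trustStore");
    }
}

/** Stand-in for a client-side, properties-file-backed configuration. */
final class PropertiesBackedConfiguration implements ThriftConnectionConfiguration {
    private final Properties props;

    PropertiesBackedConfiguration(Properties props) {
        this.props = props;
    }

    @Override
    public boolean isSslEnabled() {
        return Boolean.parseBoolean(props.getProperty("example.ssl.enabled", "false"));
    }

    @Override
    public String getTrustStorePath() {
        return props.getProperty("example.ssl.trustStore");
    }
}

/** The connection layer sees only the interface, never the config source. */
final class ConnectionDescriber {
    static String describe(ThriftConnectionConfiguration conf) {
        if (!conf.isSslEnabled()) {
            return "plain";
        }
        return "ssl, trustStore=" + conf.getTrustStorePath();
    }
}
```

The trade-off is the one the message points out: named methods keep the Thrift layer agnostic about the source, but they move away from the Property enum lookup style.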


On Fri, Aug 2, 2013 at 2:29 PM, Joey Echeverria j...@cloudera.com wrote:

 Yeah, I agree. Consistency with Hadoop here is probably not that valuable.

 -Joey

 On Fri, Aug 2, 2013 at 2:28 PM, Keith Turner ke...@deenlo.com wrote:
  On Thu, Aug 1, 2013 at 4:33 PM, Joey Echeverria j...@cloudera.com
 wrote:
 
  I generally prefer properties files to XML, but there may be an argument
  for reusing Hadoop's SSL configuration system which is XML based.
 
 
  I also prefer properties files over XML.  The only reason I can think
  of for using XML is consistency with Hadoop and the Accumulo server-side
  config.  But that does not seem like a very compelling reason; it's not
  like properties files are hard to use once you realize you need to use
  them.
 
 
 
 
  -Joey
 
  On Thu, Aug 1, 2013 at 3:08 PM, Christopher ctubb...@apache.org
 wrote:
 
   ^ Another reason I like commons-configuration here is for
   property-interpolation with HierarchicalConfiguration.
   --
   Christopher L Tubbs II
   http://gravatar.com/ctubbsii
   On Thu, Aug 1, 2013 at 3:07 PM, Christopher ctubb...@apache.org
 wrote:
   I absolutely DO think they should be combined in a properties file
   located in $HOME/.accumulo/config
   I absolutely DO NOT think this client configuration should be
   exclusive to the shell, and I absolutely DO NOT think it should be
   XML.
  
   I would love to see all our clients/client code use
   commons-configuration to hold properties from the properties file, so
   that only a --config parameter is needed (with reasonable defaults,
 so
   even that is not absolutely necessary). I also think that every
   property that can exist in the file should be possible to override on
   the command-line. I personally prefer to use system properties, using
   commons-configuration's HierarchicalConfiguration, but jcommander may
   make it easier to do the same thing in a slightly different way.
  
   --
   Christopher L Tubbs II
   http://gravatar.com/ctubbsii
  
  
   On Thu, Aug 1, 2013 at 12:25 PM, Michael Berman mber...@sqrrl.com wrote:
   As part of SSL, we need to introduce configuration so accumulo clients
   (such as ZooKeeperInstance) can find trust stores.  It seems like this has
   a lot in common with shell config files in ACCUMULO-1397.  Do people think
   these should be combined, or should the shell have its own separate config?
   I was imagining a simple java .properties-style key=value list.  Does this
   seem reasonable?  Or should the format be more like the xml of the site
   config?  I was also imagining looking through a list of files that would
   each override settings, perhaps in the following order (from lowest to
   highest priority):
  
   /etc/accumulo/client.conf
   $ACCUMULO_HOME/conf/client.conf
   $HOME/.accumulo/config
   --client-config command line switch for shell or explicit parameter passed
   to ZooKeeperInstance
  
   Does this sound good to y'all?  Should the explicit switch/parameter have
   per-property override semantics, or should it just be used as the exclusive
   source of properties if specified?
  
   Mike Drob, are you actively working on the shell side of this already?  I
   see that bug is assigned to you...
  
   Thanks,
   Michael
 



 --
 Joey Echeverria
 Director, Federal FTS
 Cloudera, Inc.


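The layered lookup proposed in the quoted message above (files read from lowest to highest priority) could be sketched roughly as follows. This is an illustration only: the class and method names are hypothetical, and it assumes per-property override semantics, which the thread leaves as an open question. It relies on the fact that `java.util.Properties.load` overwrites keys that are already present.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

class LayeredClientConfig {

  /** Loads each existing file in order; later files override earlier ones per property. */
  static Properties load(List<Path> lowestToHighest) throws IOException {
    Properties merged = new Properties();
    for (Path p : lowestToHighest) {
      if (!Files.isRegularFile(p)) {
        continue; // a missing layer is simply skipped
      }
      try (InputStream in = Files.newInputStream(p)) {
        merged.load(in); // keys already present are overwritten
      }
    }
    return merged;
  }

  /** Search path from the proposal; explicitFile stands in for --client-config and may be null. */
  static List<Path> defaultSearchPath(String explicitFile) {
    List<Path> paths = new ArrayList<>();
    paths.add(Paths.get("/etc/accumulo/client.conf"));
    String accumuloHome = System.getenv("ACCUMULO_HOME");
    if (accumuloHome != null) {
      paths.add(Paths.get(accumuloHome, "conf", "client.conf"));
    }
    paths.add(Paths.get(System.getProperty("user.home"), ".accumulo", "config"));
    if (explicitFile != null) {
      paths.add(Paths.get(explicitFile));
    }
    return paths;
  }
}
```

If the explicit file were instead meant to be the exclusive source of properties, `defaultSearchPath` would return only that one path whenever it is given.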

Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

2013-08-02 Thread Mike Drob
Which version of 0.20 are you testing against? Vanilla, or cdh3 flavored?


On Fri, Aug 2, 2013 at 2:37 PM, Joey Echeverria j...@cloudera.com wrote:

 I don't think that's a good idea unless you can come up with a very
 clear version number change.

 -Joey

 On Fri, Aug 2, 2013 at 2:31 PM, Christopher ctubb...@apache.org wrote:
  Would it be reasonable to consider a version of 1.4 that breaks
  compatibility with 0.20? I'm not really a fan of this, personally, but
  am curious what others think.
 
  --
  Christopher L Tubbs II
  http://gravatar.com/ctubbsii
 
 
  On Fri, Aug 2, 2013 at 2:22 PM, Joey Echeverria j...@cloudera.com
 wrote:
  Sorry for the delay, it's been one of those weeks.
 
  The current version would probably not be backwards compatible to
  0.20.2 just based on changes in dependencies. We're looking right now
  to see how hard it is to have three way compatibility (0.20, 1.0,
  2.0).
 
  -Joey
 
  On Thu, Aug 1, 2013 at 7:33 PM, Dave Marion dlmar...@comcast.net
 wrote:
  Any update?
 
  -Original Message-
  From: Joey Echeverria [mailto:j...@cloudera.com]
  Sent: Monday, July 29, 2013 1:24 PM
  To: dev@accumulo.apache.org
  Subject: Re: Hadoop 2.0 Support for Accumulo 1.4 Branch
 
  We're testing this today. I'll report back what we find.
 
 
  -Joey
 
  On Fri, Jul 26, 2013 at 3:34 PM, Dave Marion dlmar...@comcast.net wrote:
 
  Will 1.4 still work with 0.20 with these patches?
  Great point Billie.
  - Original Message -
  From: Billie Rinaldi billie.rina...@gmail.com
  To: dev@accumulo.apache.org
  Sent: Friday, July 26, 2013 3:02:41 PM
  Subject: Re: Hadoop 2.0 Support for Accumulo 1.4 Branch

  On Fri, Jul 26, 2013 at 11:33 AM, Joey Echeverria j...@cloudera.com wrote:
   If these patches are going to be included with 1.4.4 or 1.4.5, I would
   like to see the following tests run using CDH4 on at least a 5 node
   cluster.  More nodes would be better.
  
 * unit test
 * Functional test
 * 24 hr Continuous ingest + verification
 * 24 hr Continuous ingest + verification + agitation
 * 24 hr Random walk
 * 24 hr Random walk + agitation
  
   I may be able to assist with this, but I can not make any promises.
 
  Sure thing. Is there already a write-up on running this full battery
  of tests? I have a 10 node cluster that I can use for this.
 
 
   Great.  I think this would be a good patch for 1.4.   I assume that
   if a user stays with Hadoop 1 there are no dependency changes?
 
  Yup. It works the same way as 1.5 where all of the dependency changes
  are in a Hadoop 2.0 profile.
 
  In 1.5.0, we gave up on compatibility with 0.20 (and early versions of
  1.0) to make the compatibility requirements simpler; we ended up
  without dependency changes in the hadoop version profiles.  Will 1.4
  still work with 0.20 with these patches?  If there are dependency
  changes in the profiles, 1.4 would have to be compiled against a
  hadoop version compatible with the running version of hadoop, correct?
  We had some trouble in the 1.5 release process with figuring out how to
  provide multiple binary artifacts (each compiled against a different
  version of hadoop) for the same release.  Just something we should
  consider before we are in the midst of releasing 1.4.4.
  Billie
  -Joey
 
 
 
 
 
  --
  Joey Echeverria
  Director, Federal FTS
  Cloudera, Inc.



 --
 Joey Echeverria
 Director, Federal FTS
 Cloudera, Inc.



Re: client config files

2013-08-02 Thread Christopher
Okay, so there is implementation overlap, but that overlap is pretty
minimal (admittedly, it could potentially grow). The only thing it is
currently used for is to carry the value of the RPC timeout, and this
is not currently very friendly to end users (they'd have to
instantiate something that extends AccumuloConfiguration, just to
override that one property's value, and then call
instance.setConfiguration()).

I like configuration with scopes, so I would prefer to have
configuration properties scoped to RPC. I implemented the
Property-enum configuration model to reflect this preference, but I
made the mistake of carrying all the scopes in a single configuration
object. I don't see a problem separating out the RPC scope into a
separate configuration object that could be shared between the other,
broader scopes (client, master, tserver, etc.). In other words,
I'd prefer smaller, more localized scopes for configuration, and I
think it's sensible to have an RPC/connection scope (not necessarily
thrift-specific). See BatchWriterConfig for a good example of a
locally-scoped configuration.
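
As a rough illustration of a locally scoped RPC/connection configuration object in the style of BatchWriterConfig (fluent setters, typed getters), here is a minimal sketch. The class name `RpcConfig`, its single setting, and the default value are assumptions made for this example; nothing like it is implied to exist in Accumulo.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical RPC-scoped configuration, independent of any broader scope. */
final class RpcConfig {
  // Assumed default for the sketch, not taken from Accumulo
  private static final long DEFAULT_TIMEOUT_MILLIS = 120_000L;

  private long timeoutMillis = DEFAULT_TIMEOUT_MILLIS;

  /** Sets the RPC timeout; returns this so calls can be chained. */
  RpcConfig setTimeout(long timeout, TimeUnit unit) {
    if (timeout < 0) {
      throw new IllegalArgumentException("Negative timeout not allowed: " + timeout);
    }
    this.timeoutMillis = unit.toMillis(timeout);
    return this;
  }

  /** Returns the RPC timeout converted to the requested unit. */
  long getTimeout(TimeUnit unit) {
    return unit.convert(timeoutMillis, TimeUnit.MILLISECONDS);
  }
}
```

With something like this, an end user who only wants to change the RPC timeout would write `new RpcConfig().setTimeout(30, TimeUnit.SECONDS)` instead of extending AccumuloConfiguration, and the same object could be handed to the client, master, or tserver scopes.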

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Fri, Aug 2, 2013 at 3:24 PM, Michael Berman mber...@sqrrl.com wrote:
 I believe it is an implementation overlap.  Both ZKInstance and the
 master-tablet thrift connections get created in ThriftUtil.getClient().
  Higher up in the stack, in both paths, we have access to an Instance from
 which to draw configuration (with getConfiguration()).  In one case, it's a
 ZKInstance with a degenerate AccumuloConfiguration, and in the other case
 it's an HDFSInstance with a site.xml-backed configuration, but the thrift
 stack makes no distinction.  It seems silly to me to introduce a
 distinction all the way down the stack just so we can have two different
 config sources (which have many of the same flags).  Unless we were going
 to implement it as a ThriftConnectionConfiguration interface with named
 methods that both AccumuloConfiguration and ClientConfiguration could
 implement...but that would be a big departure from the Property enum
 configuration model.


 On Fri, Aug 2, 2013 at 2:29 PM, Joey Echeverria j...@cloudera.com wrote:

 Yeah, I agree. Consistency with Hadoop here is probably not that valuable.

 -Joey

 On Fri, Aug 2, 2013 at 2:28 PM, Keith Turner ke...@deenlo.com wrote:
  On Thu, Aug 1, 2013 at 4:33 PM, Joey Echeverria j...@cloudera.com
 wrote:
 
  I generally prefer properties files to XML, but there may be an argument
  for reusing Hadoop's SSL configuration system, which is XML-based.
 
 
  I also prefer properties files over XML.  The only reason I can think of
  to use XML is consistency with Hadoop and the Accumulo server-side
  config.  But that does not seem like a very compelling reason; it's not
  like properties files are hard to use once you realize you need to use
  them.
 
 
 
 
  -Joey
 
  On Thu, Aug 1, 2013 at 3:08 PM, Christopher ctubb...@apache.org
 wrote:
 
   ^ Another reason I like commons-configuration here is for
   property-interpolation with HierarchicalConfiguration.
   --
   Christopher L Tubbs II
   http://gravatar.com/ctubbsii
   On Thu, Aug 1, 2013 at 3:07 PM, Christopher ctubb...@apache.org
 wrote:
   I absolutely DO think they should be combined in a properties file
   located in $HOME/.accumulo/config
   I absolutely DO NOT think this client configuration should be
   exclusive to the shell, and I absolutely DO NOT think it should be
   XML.
  
   I would love to see all our clients/client code use
   commons-configuration to hold properties from the properties file, so
   that only a --config parameter is needed (with reasonable defaults, so
   even that is not absolutely necessary). I also think that every
   property that can exist in the file should be possible to override on
   the command-line. I personally prefer to use system properties, using
   commons-configuration's HierarchicalConfiguration, but jcommander may
   make it easier to do the same thing in a slightly different way.
  
   --
   Christopher L Tubbs II
   http://gravatar.com/ctubbsii
  
  
   On Thu, Aug 1, 2013 at 12:25 PM, Michael Berman mber...@sqrrl.com wrote:
   As part of SSL, we need to introduce configuration so accumulo clients
   (such as ZooKeeperInstance) can find trust stores.  It seems like this has
   a lot in common with shell config files in ACCUMULO-1397.  Do people think
   these should be combined, or should the shell have its own separate config?
   I was imagining a simple java .properties-style key=value list.  Does this
   seem reasonable?  Or should the format be more like the xml of the site
   config?  I was also imagining looking through a list of files that would
   each override settings, perhaps in the following order (from lowest to
   highest priority):
  
   /etc/accumulo/client.conf
   $ACCUMULO_HOME/conf/client.conf
   $HOME/.accumulo/config