Running Accumulo on the IBM JVM

2014-06-19 Thread Hayden Marchant
Hi there,

I have been working on getting Accumulo running on the IBM JDK, in preparation 
for including Accumulo in an upcoming version of BigInsights (IBM's Hadoop 
distribution). I have come across a number of issues, for which I have made 
some local fixes in my own environment. Since I'm a newbie to Accumulo, I 
wanted to make sure that the approach I have taken for resolving 
these issues is aligned with the design intent of Accumulo.

Some of the issues are real defects, and some are instances in which the 
assumption that Sun/Oracle's JDK is the JVM in use is hard-coded into the 
source code.

I have grouped the issues into two sections, unit test failures and 
Sun-specific dependencies (though there is some overlap).

1. Unit test failures - these should pass consistently no matter which OS, 
Java vendor/version, etc.
a. 
org.apache.accumulo.core.util.format.ShardedTableDistributionFormatterTest.testAggregate
 
fails on the IBM JRE, since the test asserts the order of elements in 
a HashMap. It consistently passes on the Sun/Oracle JRE and consistently 
fails on the IBM JRE. Proposal: Change ShardedTableDistributionFormatter.countsByDay to a 
TreeMap.
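To illustrate why the proposal fixes the test, here is a minimal sketch (the class name and map contents below are invented for illustration, not Accumulo's real data): a TreeMap iterates its keys in sorted order on every JVM, so a test that asserts element order becomes vendor-independent.

```java
import java.util.Map;
import java.util.TreeMap;

public class CountsByDayOrder {
    // Hypothetical stand-in for countsByDay: a TreeMap guarantees sorted
    // key iteration order regardless of JVM vendor.
    static Map<String, Integer> countsByDay() {
        Map<String, Integer> counts = new TreeMap<>();
        counts.put("20140619", 42);
        counts.put("20140601", 7);
        counts.put("20140610", 13);
        return counts;
    }

    public static void main(String[] args) {
        // Iteration follows the keys' natural (sorted) order on any JVM.
        System.out.println(countsByDay().keySet()); // prints [20140601, 20140610, 20140619]
    }
}
```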
 
b. 
org.apache.accumulo.core.security.crypto.BlockedIOStreamTest.testGiantWrite
assumes a max heap of about 1GB. It fails on the IBM JRE because no max 
heap is specified, and the IBM JRE's default depends on the OS (see 
http://www-01.ibm.com/support/knowledgecenter/SSYKE2_6.0.0/com.ibm.java.doc.diagnostics.60/diag/appendixes/defaults.html?lang=en
). 
Proposal: add -Xmx1g to the Surefire plugin configuration in the 
parent Maven pom.
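As a sketch, the Surefire fragment in the parent pom might look like the following (the argLine value is the proposed -Xmx1g; the plugin version and any existing argLine contents are omitted here):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <configuration>
    <!-- Pin the test heap so results do not depend on vendor defaults -->
    <argLine>-Xmx1g</argLine>
  </configuration>
</plugin>
```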
 
c. Both org.apache.accumulo.core.security.crypto.CryptoTest and 
org.apache.accumulo.core.file.rfile.RFileTest have many failures due to 
calls to SecureRandom with the random number generator provider hard-coded 
as "SUN". The IBM JRE has its own built-in RNG provider, called IBMJCE. There 
are two issues: hard-coded calls to SecureRandom.getInstance(..., "SUN"), and 
the default value in the Property class being "SUN". 
Proposal: add a mechanism to override a default Property via a 
system property, through a new annotator in the Property class. The only 
usage would be Property.CRYPTO_SECURE_RNG_PROVIDER.
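A sketch of the provider-agnostic lookup the proposal implies. The system property name here is hypothetical, and the fallback to the JRE's default provider is my assumption of the desired behavior, not Accumulo's actual code:

```java
import java.security.NoSuchAlgorithmException;
import java.security.NoSuchProviderException;
import java.security.SecureRandom;

public class RngProviderFallback {
    // Try the configured provider first; if the running JRE does not have
    // it (e.g. "SUN" on an IBM JRE), fall back to any provider that
    // supplies the requested algorithm.
    static SecureRandom secureRandom(String algorithm, String provider) {
        try {
            return SecureRandom.getInstance(algorithm, provider);
        } catch (NoSuchAlgorithmException | NoSuchProviderException e) {
            try {
                return SecureRandom.getInstance(algorithm);
            } catch (NoSuchAlgorithmException e2) {
                throw new IllegalStateException(e2);
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical property name; on an IBM JRE one would set it to
        // "IBMJCE", while "SUN" remains the default for Sun/Oracle.
        String provider = System.getProperty("crypto.secure.rng.provider", "SUN");
        SecureRandom rng = secureRandom("SHA1PRNG", provider);
        System.out.println(rng.getProvider().getName());
    }
}
```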
 
 
2. Environment/Configuration
a. The generated configuration files contain references to GC 
params that are specific to the Sun JVM. In accumulo-env.sh, 
ACCUMULO_TSERVER_OPTS contains -XX:NewSize and -XX:MaxNewSize, and 
ACCUMULO_GENERAL_OPTS uses
-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75.
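For illustration, a vendor switch in accumulo-env.sh might look like the config sketch below. The IBM J9 flags (-Xmn, -Xgcpolicy:gencon) are assumptions drawn from IBM's documented options, and all sizes are placeholders rather than Accumulo's shipped defaults:

```sh
# Hypothetical vendor detection; flag values are illustrative placeholders.
if "${JAVA_HOME}/bin/java" -version 2>&1 | grep -qi ibm; then
  # IBM J9: nursery size and generational-concurrent GC policy
  ACCUMULO_TSERVER_OPTS="-Xmx1g -Xms1g -Xmn256m"
  ACCUMULO_GENERAL_OPTS="-Xgcpolicy:gencon"
else
  # HotSpot (Sun/Oracle): flags of the kind the generated files use today
  ACCUMULO_TSERVER_OPTS="-Xmx1g -Xms1g -XX:NewSize=256m -XX:MaxNewSize=256m"
  ACCUMULO_GENERAL_OPTS="-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75"
fi
```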
b. In bin/accumulo, a ClassNotFoundException occurs due to the 
hard-coded specification of the JAXP document builder: 
-Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
The Sun implementation of DocumentBuilderFactory does not exist 
in the IBM JDK, so a ClassNotFoundException is thrown when running the 
accumulo script.
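A sketch of the portable alternative: letting JAXP choose the implementation through its normal lookup order instead of pinning the com.sun.* class via a system property.

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

public class PortableJaxp {
    // DocumentBuilderFactory.newInstance() uses the standard JAXP lookup
    // (system property, services, then the platform default), so each
    // vendor's JRE supplies its own parser and no com.sun.* class name
    // needs to be hard-coded.
    static DocumentBuilder newDocumentBuilder() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Prints whichever builder implementation the running JRE selected.
        System.out.println(newDocumentBuilder().getClass().getName());
    }
}
```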
 
c. MiniAccumuloCluster - in MiniAccumuloClusterImpl, Sun-specific 
GC params are passed to the java process (similar to item a).
 
Single proposal for solving all three issues above:
Enhance bootstrap_config.sh to ask which Java vendor to target. 
The selection would set the correct GC params (they differ between IBM and 
Sun) and control inclusion/omission of the JAXP setting. 
MiniAccumuloClusterImpl could read the same environment variable that was 
set for the GC params and use it in the exec command.
 
 
 So far, my work has focused on getting the unit tests working cleanly for 
all Java vendors. I have not yet run intensive testing of real clusters 
with these changes, and would be happy to get pointers to what else might 
need treatment.
 
 I would also like to hear whether these changes make sense, and if so, 
whether I should go ahead and create some JIRAs and attach my patches for 
commit approval.
 
 Looking forward to hearing feedback!
 
 Regards,
 Hayden Marchant
 Software Architect
 IBM BigInsights, IBM
 

Re: Running Accumulo on the IBM JVM

2014-06-19 Thread Vicky Kak
Hi Hayden,

Most of the recommendations look okay to me. Since there are many changes to
be made, I think you should go ahead and create a main JIRA with
multiple subtasks addressing all the changes.
I am almost sure you would run into similar issues running
other Java-based NoSQL distributions, e.g. HBase/Cassandra, on the IBM JDK; I
personally had surprises with ordering-related API calls in my own application
a long time ago. Your observations look reasonable to me.

Regards,
Vicky


On Thu, Jun 19, 2014 at 3:47 PM, Hayden Marchant  wrote:

> Hi there,
>
> I have been working on getting Accumulo running on IBM JDK, as preparation
> of including Accumulo in an upcoming version of BigInsights (IBM's Hadoop
> distribution). I have come across a number of issues, to which I have made
> some local fixes in my own environment. Since I'm a newbie in Accumulo, I
> wanted to make sure that the approach that I have taken for resolving
> these issues is aligned with the design intent of Accumulo.
>
> Some of the issues are real defects, and some are instances in which the
> assumption of Sun/Oracle JDK being the used JVM is hard-coded into the
> source-code.
>
> I have grouped the issues into 2 sections -  Unit test failures and
> Sun-specific dependencies (though there is an overlap)
>
> 1. Unit Test failures - should run consistently no matter which OS, Java
> vendor/version etc...
> a.
>
> org.apache.accumulo.core.util.format.ShardedTableDistributionFormatterTest.testAggregate
> . This fails on IBM JRE, since the test is asserting order of elements in
> a HashMap. This consistently passes on Sun , and consistently fails on
> Oracle. Proposal: Change ShardedTableDistributionFormatter.countsByDay to
> TreeMap
>
> b.
>
> org.apache.accumulo.core.security.crypto.BlockedIOStreamTest.testGiantWrite.
> This test assumes a max heap of about 1GB. This fails on IBM JRE,
> since the default max heap is not specified, and on IBM JRE this depends
> on the OS (see
>
> http://www-01.ibm.com/support/knowledgecenter/SSYKE2_6.0.0/com.ibm.java.doc.diagnostics.60/diag/appendixes/defaults.html?lang=en
> ).
> Proposal: add -Xmx1g to the surefire maven plugin reference in
> parent maven pom.
>
> c. Both org.apache.accumulo.core.security.crypto.CrypoTest &
> org.apache.accumulo.core.file.rfile.RFileTest have lots of failures due to
> calls to SEcureRandom with Random Number Generator Provider hard-coded as
> Sun. The IBM JRE has it's own built in RNG Provider called IBMJCE. 2
> issues - hard-coded calls to SecureRandom.getInstance(,"SUN") and
> also default value in Property class is "SUN".
> Proposal: Add mechanism to override default Property through
> System property through new annotator in Property class. Only usage will
> be by Property.CRYPTO_SECURE_RNG_PROVIDER
>
>
> 2. Environment/Configuration
> a. The generated configuration files contain references to GC
> params that are specific to Sun JVM. In accumulo-env.sh, the
> ACCUMULO_TSERVER_OPTS contains -XX:NewSize and -XX:MaxNewSize , and also
> in ACCUMULO_GENERAL_OPTS,
> -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 are used.
> b. in bin/accumulo, get ClassNotFoundException due to
> specification of JAXP Doc Builder:
>
> -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
> .
> The Sun implementation of Document Builder Factory does not exists
> in IBM JDK, so a ClassNotFoundException is thrown on running accumulo
> script
>
> c. MiniAccumuloCluster - in the MiniAccumuloClusterImpl,
> Sun-speciifc GC params are passed as params to the java process (similar
> to section a. )
>
> Single proposal for solving all three above issues:
> Enhance bootstrap_config.sh with request to select Java vendor.
> Selecting this will set correct values for GC params (they differ between
> IBM and Sun), inclusion/ommision of JAXP setting. The
> MiniAccumuloClusterImpl can read the same env variable that was set in
> code for the GC Params, and use in the exec command.
>
>
>  So far, my work has been focused on getting unit tests working for all
> Java vendors in a clean manner. I have not yet run intensive testing of
> real clusters following these changes, and would be happy to get pointers
> to what else might need treatment.
>
>  I would also like to hear if these changes make sense, and if so, should
> I go ahead and create some JIRAs, and attach my patches for commit
> approval?
>
>  Looking forward to hearing feedback!
>
>  Regards,
>  Hayden Marchant
>  Software Architect
>  IBM BigInsights, IBM
>


Time to release 1.6.1?

2014-06-19 Thread Corey Nolet
I'd like to start getting a candidate together if there are no objections.

It looks like we have 65 resolved tickets with a fix version of 1.6.1.


Is Data Locality Helpful? (or why run tserver and datanode on the same box?)

2014-06-19 Thread David Medinets
At the Accumulo Summit and on a recent client site, there have been
conversations about Data Locality and Accumulo.

I ran an experiment to verify that Accumulo can scan tables when the
tserver process runs on a server without a datanode process. I
followed these steps:

1. Start three node cluster
2. Load data
3. Kill datanode on slave1
4. Wait until Hadoop notices dead node.
5. Kill tserver on slave2
6. Wait until Accumulo notices dead node.
7. Run the accumulo shell on master and slave1 to verify entries can be scanned.

Accumulo handled this situation just fine. As I expected.

How important (or not) is it to run tserver and datanode on the same server?
Does the Data Locality implied by running them together exist?
Can the benefit be quantified?


Re: Time to release 1.6.1?

2014-06-19 Thread Mike Drob
I'd like to see 1.5.2 released first, just in case there are issues we
discover during that process that need to be addressed. Also, I think it
would be useful to resolve the discussion surrounding upgrades[1] before
releasing.

[1]:
http://mail-archives.apache.org/mod_mbox/accumulo-dev/201406.mbox/%3CCAGHyZ6LFuwH%3DqGF9JYpitOY9yYDG-sop9g6iq57VFPQRnzmyNQ%40mail.gmail.com%3E


On Thu, Jun 19, 2014 at 8:09 AM, Corey Nolet  wrote:

> I'd like to start getting a candidate together if there are no objections.
>
> It looks like we have 65 resolved tickets with a fix version of 1.6.1.
>


Re: Is Data Locality Helpful? (or why run tserver and datanode on the same box?)

2014-06-19 Thread Corey Nolet
AFAIK, locality may not be guaranteed right away unless the data for a
tablet was first ingested on the tablet server that is responsible for that
tablet; otherwise you'll need to wait for a major compaction to rewrite the
RFiles locally on the tablet server. I would assume that if the tablet server
is not on the same node as the datanode, those files will probably be spread
across the cluster as if you were ingesting data from outside the cloud.

A recent discussion with Bill Slacum also brought to light a possible
problem: the HDFS balancer [1] re-balancing blocks after the fact, which
could eventually pull blocks onto datanodes that are not local to the
tablets. I believe the remedy for this was to turn off the balancer or not
have it run.

[1]
http://www.swiss-scalability.com/2013/08/hadoop-hdfs-balancer-explained.html




On Thu, Jun 19, 2014 at 10:07 AM, David Medinets 
wrote:



Re: Running Accumulo on the IBM JVM

2014-06-19 Thread Mike Drob
Hi Hayden! Welcome to Accumulo!

Detailed responses are inline.

Mike


> On Thu, Jun 19, 2014 at 3:47 PM, Hayden Marchant 
> wrote:
>
> > 1. Unit Test failures - should run consistently no matter which OS, Java
> > vendor/version etc...
> > a.
> >
> >
> org.apache.accumulo.core.util.format.ShardedTableDistributionFormatterTest.testAggregate
> > . This fails on the IBM JRE, since the test asserts the order of elements in
> > a HashMap. It consistently passes on the Sun/Oracle JRE and consistently
> > fails on the IBM JRE. Proposal: Change ShardedTableDistributionFormatter.countsByDay to
> > TreeMap
>

This is probably a real defect. We should not be asserting order on a
HashMap. Another possible solution is to change the test to check for
unordered elements - HamCrest matchers may be useful here.
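A minimal sketch of that alternative, using a plain Set comparison rather than Hamcrest (the class name and map contents here are made up for illustration):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

public class UnorderedAssertion {
    // Vendor-independent check: compare the map's keys as a set instead of
    // asserting iteration order.
    static boolean hasExactlyKeys(Map<String, ?> map, String... keys) {
        return map.keySet().equals(new HashSet<>(Arrays.asList(keys)));
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("shard_01", 3);
        counts.put("shard_02", 5);
        // Passes on any JVM, whatever order HashMap happens to iterate in.
        System.out.println(hasExactlyKeys(counts, "shard_01", "shard_02")); // prints true
    }
}
```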


> >
> > b.
> >
> >
> org.apache.accumulo.core.security.crypto.BlockedIOStreamTest.testGiantWrite.
> > This test assumes a max heap of about 1GB. This fails on IBM JRE,
> > since the default max heap is not specified, and on IBM JRE this depends
> > on the OS (see
> >
> >
> http://www-01.ibm.com/support/knowledgecenter/SSYKE2_6.0.0/com.ibm.java.doc.diagnostics.60/diag/appendixes/defaults.html?lang=en
> > ).
> > Proposal: add -Xmx1g to the surefire maven plugin reference in
> > parent maven pom.
> >
>
This might be https://issues.apache.org/jira/browse/ACCUMULO-2774


>  > c. Both org.apache.accumulo.core.security.crypto.CryptoTest &
> > org.apache.accumulo.core.file.rfile.RFileTest have lots of failures due to
> > calls to SecureRandom with the Random Number Generator Provider hard-coded
> > as Sun. The IBM JRE has its own built-in RNG Provider called IBMJCE. Two
> > issues - hard-coded calls to SecureRandom.getInstance(..., "SUN") and
> > also the default value in the Property class is "SUN".
> > Proposal: Add mechanism to override default Property through
> > System property through new annotator in Property class. Only usage will
> > be by Property.CRYPTO_SECURE_RNG_PROVIDER
>
>
>
I'm not sure about adding new annotators to Property. However, the
CryptoTest should get the value from the conf instead of hard-coding it.
Then you can specify the correct value in accumulo-site.xml.

I think another part of the issue is in
CryptoModuleFactory::fillParamsObjectFromStringMap, because it looks like
that ignores the default setting.


Re: Running Accumulo on the IBM JVM

2014-06-19 Thread Benson Margulies
On Thu, Jun 19, 2014 at 10:49 AM, Mike Drob  wrote:
> This is probably a real defect. We should not be asserting order on a
> HashMap. Another possible solution is to change the test to check for
> unordered elements - HamCrest matchers may be useful here.

You don't want to slow down the production code just to make a test
case pass, that's for sure. If order is not part of the contract, do
as Mike suggests, or copy the entries out and sort them.


Re: moving rat to a profile?

2014-06-19 Thread Bill Havanki
I've filed ACCUMULO-2927 to make 'git clean -df' sufficient. No matter how
we decide about the rat plugin, I think not requiring -x is a worthwhile
goal.
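For reference, moving rat into a profile might look roughly like the pom fragment below; the profile id, phase, and activation details are my assumptions, not the project's actual build configuration:

```xml
<profile>
  <id>rat</id>
  <!-- On by default; disable with: mvn -P '!rat' ... -->
  <activation>
    <activeByDefault>true</activeByDefault>
  </activation>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.rat</groupId>
        <artifactId>apache-rat-plugin</artifactId>
        <executions>
          <execution>
            <phase>verify</phase>
            <goals>
              <goal>check</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</profile>
```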


On Tue, Jun 17, 2014 at 5:43 PM, Christopher  wrote:

> I agree that instructing users to use this option to modify the build isn't
> acceptable and I wouldn't recommend this as a response to users... I was
> only stating this as a fact, to point out that a special profile on by
> default with an option to disable isn't needed, since that's the current
> behavior.
>
> I'm more interested in the targeted .gitignore with the recommended "git
> clean -df" option without -x. This helps contributors understand build
> tools, makes them aware of the differences between branches, and doesn't
> hide problems introduced by switching branches in an obscure error, all
> without blowing away their IDE build files. (though switching branches
> often warrants blowing these IDE files away anyway, since different modules
> in different branches will be problematic for most IDEs).
>
>
> --
> Christopher L Tubbs II
> http://gravatar.com/ctubbsii
>
>
> On Tue, Jun 17, 2014 at 4:47 PM, Alex Moundalexis 
> wrote:
>
> > This kind of response is hardly conducive to prospective contributors.
> >
> > We should consider ourselves lucky whenever a contributor provides a patch,
> > let alone runs a build. Expecting a new contributor to be fully aware of the
> > Apache licensing details isn't realistic, much less being aware of the
> > arguments concerning Rat; if the ignoreErrors argument is TheWay, it ought
> > to be mentioned prominently in the source documentation [1], but I don't
> > think that's correct either...
> >
> > I don't want to encourage contributors to skip the build. I want
> > contributors to be aware of the licensing requirements, but not at the
> > expense of providing otherwise-viable patches. I'd recommend relaxing the
> > Rat checks for contributors, and making it a required part of the profile
> > for automated Jenkins builds and during the release process.
> >
> > The onus should be on the committers to ensure that all of the licensing is
> > in place before the release, but preferably long before that point, by
> > guiding the contributor to make the necessary license additions before the
> > commit.
> >
> > I've been told to correct whitespace at the end of a line and re-submit a
> > patch before; it seems trivial to address missing license files in the
> > same way.
> >
> > [1] https://accumulo.apache.org/source.html
> >
> > On Tue, Jun 17, 2014 at 3:15 PM, Christopher 
> wrote:
> >
> > > There's already a way to skip it for those who don't understand why it's
> > > failing and are incapable/unwilling to troubleshoot:
> > > -Drat.ignoreErrors=true
> > >
> > >
> > > --
> > > Christopher L Tubbs II
> > > http://gravatar.com/ctubbsii
> > >
> > >
> > > On Tue, Jun 17, 2014 at 3:09 PM, Billie Rinaldi <
> > billie.rina...@gmail.com>
> > > wrote:
> > >
> > > > I'm not thrilled about turning it off by default. How about putting it
> > > > in a profile that would be enabled by default, but could be disabled
> > > > with a flag for those who don't understand why it's failing?
> > > >
> > > >
> > > > On Tue, Jun 17, 2014 at 11:44 AM, Sean Busbey 
> > > wrote:
> > > >
> > > > > I've had a few different new-to-Accumulo contributors recently run
> > > > > into the issue of Rat failing the build after changing branches.
> > > > >
> > > > > I know we already have a warning about this[1], but AFAICT it's over
> > > > > the threshold for consumable information.
> > > > >
> > > > > Even after pointing people to the warning, the existing workaround
> > > > > tripped up at least one of them. Despite the warning about using "git
> > > > > clean," the destruction of their local IDE changes was surprising.
> > > > >
> > > > > For contributions to Accumulo that aren't coming from committers, the
> > > > > Rat plugin seems much more likely to give a false positive than to
> > > > > catch an error. Additionally, whatever committer is reviewing the
> > > > > contribution should be checking for license compliance anyway.
> > > > >
> > > > > In the interests of reducing the surprise for new contributors, I'd
> > > > > like to move our use of Rat to a profile that is only enabled by
> > > > > default during a release run.
> > > > >
> > > > > The profile would still let those who want rat to run on every build
> > > > > enable it, and we could update the guide for handling new contributions
> > > > > to say committers should enable the rat profile to help guard against
> > > > > errors.
> > > > >
> > > > > Any objections?
> > > > >
> > > > > [1]: http://accumulo.apache.org/source.html#running-a-build
> > > > >
> > > > > --
> > > > > Sean
> > > > >
> > > >
> > >
> >
>



-- 
// Bill Havanki
// Solutions Architect, Cloudera Govt Solutions
// 443.686.9283


Re: [DISCUSS] Should we support upgrading 1.4 -> 1.6 w/o going through 1.5?

2014-06-19 Thread Mike Drob
In a nutshell: stop 1.4, install 1.6, copy the WALs to HDFS
(ACCUMULO-2770), start 1.6

Mike


On Wed, Jun 18, 2014 at 5:54 PM, Drew Farris  wrote:

> Mike,
>
> So works just like upgrading from 1.5?
>
> (After 1.4 shutdown, install 1.6 and restart?)
>
> That sounds entirely reasonable.
>
> Drew
> On Jun 17, 2014 10:52 PM, "Mike Drob"  wrote:
>
> > We initially tried to set it up as a stand-alone utility but eventually
> > gave up. In order to properly do the upgrade, you need to run the
> > upgrade code concurrently with a tablet server hosting !METADATA
> > and a tablet server that can replay WALs. We ended up duplicating a lot of
> > logic already present in master before scrapping that plan. An alternative
> > would have been to try to build on MAC, but that was also non-trivial to
> > deploy, so we spliced the code into the existing upgrade path. How do you
> > feel about that, Drew?
> >
> >
> > On Tue, Jun 17, 2014 at 8:57 PM, Drew Farris 
> > wrote:
> >
> > > I'm +1 for a utility that would allow us to go directly from 1.4 to 1.6.
> > >
> > > In terms of a general policy, I suggest we make this sort of decision on a
> > > case-by-case basis. My unreasonably self-centered intuition suggests that
> > > there may be some folks that want to go from 1.4 to 1.6 now due to a
> > > relatively short 1.5 cycle. The need to jump multiple versions like this
> > > might not exist in the future.
> > >
> > >
> > >
> > > On Mon, Jun 16, 2014 at 5:24 PM, Sean Busbey 
> > wrote:
> > >
> > > > In an effort to get more users off of our now unsupported 1.4 release,
> > > > should we support upgrading directly to 1.6 without going through a 1.5
> > > > upgrade?
> > > >
> > > > More directly, for those on user@: would you be more likely to upgrade
> > > > off of 1.4 if you could do so directly to 1.6?
> > > >
> > > > We have this working locally at Cloudera as a part of our CDH integration
> > > > (we shipped 1.4 and we're planning to ship 1.6 next).
> > > >
> > > > We can get into implementation details on a jira if there's positive
> > > > consensus, but the changes weren't very complicated. They're mostly
> > > >
> > > > * forward porting and consolidating some upgrade code
> > > > * additions to the README for instructions
> > > >
> > > > Personally, I can see both sides of the argument. On the plus side,
> > > > anything to get more users off of 1.4 is a good thing. On the negative
> > > > side, it means we have the 1.4-related upgrade code sitting in a
> > > > supported code branch longer.
> > > >
> > > > Thoughts?
> > > >
> > > > --
> > > > Sean
> > > >
> > >
> >
>


Re: moving rat to a profile?

2014-06-19 Thread Christopher
Agreed. That's a minimum.


--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Thu, Jun 19, 2014 at 10:54 AM, Bill Havanki 
wrote:

> I've filed ACCUMULO-2927 to make 'git clean -df' sufficient. No matter how
> we decide about the rat plugin, I think not requiring -x is a worthwhile
> goal.
> workaround
> > > > > tripped
> > > > > > up at least one of them. Despite the warning about using "git
> > clean,"
> > > > the
> > > > > > destruction of their local IDE changes was surprising.
> > > > > >
> > > > > > For contributions to Accumulo that aren't coming from committers,
> > the
> > > > Rat
> > > > > > plugin seems much more likely to give a false positive than to
> > catch
> > > an
> > > > > > error. Additionally, whatever committer is reviewing the
> > contribution
> > > > > > should be checking for license compliance anyways.
> > > > > >
> > > > > > In the interests of reducing the surprise for new contributors,
> I'd
> > > > like
> > > > > to
> > > > > > move our use of Rat to a profile that is only default enabled
> > during
> > > a
> > > > > > release run.
> > > > > >
> > > > > > The profile would still let those who want rat to run on every
> > build
> > > to
> > > > > > enable it and we could upd

Re: moving rat to a profile?

2014-06-19 Thread Bill Havanki
Ooh, had another thought. We can probably make the rat plugin run under a
pre-commit hook [1]. That way, while you're actively developing, the rat
plugin need not get in the way, but it still serves as a gatekeeper before
you can commit.

Git also allows for hooks around git am, so committers can invoke rat then
to ensure contributed patches have licenses. That would be useful in case a
contributor never commits locally, for example (or disables the pre-commit
hook locally :) ).

So, specifically, elements for this option:
* by default, either do not run rat or run it with ignoreErrors=true
* set pre-commit hook to run rat:check and verify
* set pre-applypatch hook to also run rat:check and verify

[1] http://git-scm.com/book/en/Customizing-Git-Git-Hooks
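To make the pre-commit idea concrete, here is a minimal sketch of such a hook. The plugin invocation uses the stock apache-rat-plugin `check` goal; the `unapproved_count` helper and the exact summary-line format it parses are assumptions based on the Maven log excerpt quoted in this thread.

```shell
#!/bin/sh
# Sketch of a .git/hooks/pre-commit hook: run the Rat check and abort the
# commit when any file lacks an approved license. The helper and message
# wording are hypothetical; adjust the goal/paths to the actual build.

# Extract the unapproved-file count from a Maven summary line such as:
#   [INFO] Rat check: Summary of files. Unapproved: 1 unknown: 1 ...
unapproved_count() {
  sed -n 's/.*Unapproved: *\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Only attempt the check when invoked inside a Maven project with mvn on PATH.
if command -v mvn >/dev/null 2>&1 && [ -f pom.xml ]; then
  if ! log=$(mvn org.apache.rat:apache-rat-plugin:check 2>&1); then
    n=$(printf '%s\n' "$log" | unapproved_count)
    echo "Rat check failed (${n:-?} unapproved); see target/rat.txt." >&2
    exit 1
  fi
fi
```

Committers applying patches could hang the same check off a pre-applypatch hook in the same way.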


On Tue, Jun 17, 2014 at 5:11 PM, Alex Moundalexis 
wrote:

> I like this plan.
>
> * doesn't discourage new contributors
> * provides information for those who want to dig deeper
>
> On Tue, Jun 17, 2014 at 5:04 PM, Bill Havanki 
> wrote:
>
> > It seems like a middle way would be:
> >
> > * always run the rat plugin
> > * configure it by default with ignoreErrors=true
> > * let committers / Jenkins / release managers et al. explicitly set
> > rat.ignoreErrors=false (in MAVEN_OPTS or wherever)
> >
> > By default, the plugin will warn about files lacking a license, but will
> > continue the build. Contributors are exposed to the check but not
> > constrained by it. Example:
> >
> > ---
> > [INFO] Rat check: Summary of files. Unapproved: 1 unknown: 1 generated: 0
> > approved: 187 licence.
> > [WARNING] Rat check: 1 files with unapproved licenses. See RAT report in:
> > /Users/bhavanki/dev/accumulo/server/base/target/rat.txt
> > ---
> >
> > Any entity that should enforce licenses then needs to set the
> ignoreErrors
> > flag to false. This can be part of committer onboarding.
> >
> > Bill
> >
> >
> > On Tue, Jun 17, 2014 at 4:59 PM, Josh Elser 
> wrote:
> >
> > >
> > >
> > > On 6/17/14, 1:47 PM, Alex Moundalexis wrote:
> > >
> > >> This kind of response is hardly conducive to prospective contributors.
> > >>
> > >> We should consider ourselves lucky whenever a contributor provides a
> > >> patch,
> > >> let alone runs a build. Expecting a new contributor be fully aware of
> > the
> > >> Apache licensing details isn't realistic, much less being aware of the
> > >> arguments concerning Rat; if the ignoreErrors argument is TheWay, it
> > ought
> > >> to be mentioned prominently in the source documentation [1], but I
> don't
> > >> think that's correct either...
> > >>
> > >> I don't want to encourage contributors to skip the build. I want
> > >> contributors to be aware of the licensing requirements, but not at the
> > >> expense of providing otherwise-viable patches. I'd recommend relaxing
> > the
> > >> Rat checks for contributors, and making it a required part of the
> > profile
> > >> for automated Jenkins builds and during the release process.
> > >>
> > >> The onus should be on the committers to ensure that all of the
> licensing
> > >> is
> > >> in place before the release, but preferably long before that point by
> > >> guiding the contributor to make the necessary license additions before
> > the
> > >> commit.
> > >>
> > >
> > > This is an important thing to remember. The point of shepherding
> > > contributors is to eventually get them to committer status, at which
> > point
> > > they'll be personally responsible for these things. While we definitely
> > > don't want to be so abrasive initially that they get fed up and go
> away,
> > we
> > > can't fully insulate them from what's necessary either.
> > >
> > >
> > >
> > >> I've been told to correct whitespace at the end of a line before and
> to
> > >> re-submit a patch, seems trivial to address missing licensing files in
> > the
> > >> same way.
> > >>
> > >> [1] https://accumulo.apache.org/source.html
> > >>
> > >>
> >
> >
> > --
> > // Bill Havanki
> > // Solutions Architect, Cloudera Govt Solutions
> > // 443.686.9283
> >
>



-- 
// Bill Havanki
// Solutions Architect, Cloudera Govt Solutions
// 443.686.9283


Re: moving rat to a profile?

2014-06-19 Thread Mike Drob
I hope you mean verify the output of rat:check, and not run "mvn verify" as
a pre-commit hook.


On Thu, Jun 19, 2014 at 10:38 AM, Bill Havanki 
wrote:

> Ooh, had another thought. We can probably make the rat plugin run under a
> pre-commit hook [1]. That way, while you're actively developing, the rat
> plugin need not get in the way, but it still serves as a gatekeeper before
> you can commit.
>
> Git also allows for hooks around git am, so committers can invoke rat then
> to ensure contributed patches have licenses. That would be useful in case a
> contributor never commits locally, for example (or disables the pre-commit
> hook locally :) ).
>
> So, specifically, elements for this option:
> * by default, either do not run rat or run it with ignoreErrors=true
> * set pre-commit hook to run rat:check and verify
> * set pre-applypatch hook to also run rat:check and verify
>
> [1] http://git-scm.com/book/en/Customizing-Git-Git-Hooks
>
>
> On Tue, Jun 17, 2014 at 5:11 PM, Alex Moundalexis 
> wrote:
>
> > I like this plan.
> >
> > * doesn't discourage new contributors
> > * provides information for those who want to dig deeper
> >
> > On Tue, Jun 17, 2014 at 5:04 PM, Bill Havanki  >
> > wrote:
> >
> > > It seems like a middle way would be:
> > >
> > > * always run the rat plugin
> > > * configure it by default with ignoreErrors=true
> > > * let committers / Jenkins / release managers et al. explicitly set
> > > rat.ignoreErrors=false (in MAVEN_OPTS or wherever)
> > >
> > > By default, the plugin will warn about files lacking a license, but
> will
> > > continue the build. Contributors are exposed to the check but not
> > > constrained by it. Example:
> > >
> > > ---
> > > [INFO] Rat check: Summary of files. Unapproved: 1 unknown: 1
> generated: 0
> > > approved: 187 licence.
> > > [WARNING] Rat check: 1 files with unapproved licenses. See RAT report
> in:
> > > /Users/bhavanki/dev/accumulo/server/base/target/rat.txt
> > > ---
> > >
> > > Any entity that should enforce licenses then needs to set the
> > ignoreErrors
> > > flag to false. This can be part of committer onboarding.
> > >
> > > Bill
> > >
> > >
> > > On Tue, Jun 17, 2014 at 4:59 PM, Josh Elser 
> > wrote:
> > >
> > > >
> > > >
> > > > On 6/17/14, 1:47 PM, Alex Moundalexis wrote:
> > > >
> > > >> This kind of response is hardly conducive to prospective
> contributors.
> > > >>
> > > >> We should consider ourselves lucky whenever a contributor provides a
> > > >> patch,
> > > >> let alone runs a build. Expecting a new contributor be fully aware
> of
> > > the
> > > >> Apache licensing details isn't realistic, much less being aware of
> the
> > > >> arguments concerning Rat; if the ignoreErrors argument is TheWay, it
> > > ought
> > > >> to be mentioned prominently in the source documentation [1], but I
> > don't
> > > >> think that's correct either...
> > > >>
> > > >> I don't want to encourage contributors to skip the build. I want
> > > >> contributors to be aware of the licensing requirements, but not at
> the
> > > >> expense of providing otherwise-viable patches. I'd recommend
> relaxing
> > > the
> > > >> Rat checks for contributors, and making it a required part of the
> > > profile
> > > >> for automated Jenkins builds and during the release process.
> > > >>
> > > >> The onus should be on the committers to ensure that all of the
> > licensing
> > > >> is
> > > >> in place before the release, but preferably long before that point
> by
> > > >> guiding the contributor to make the necessary license additions
> before
> > > the
> > > >> commit.
> > > >>
> > > >
> > > > This is an important thing to remember. The point of shepherding
> > > > contributors is to eventually get them to committer status, at which
> > > point
> > > > they'll be personally responsible for these things. While we
> definitely
> > > > don't want to be to abrasive initially that they get fed up and go
> > away,
> > > we
> > > > can't fully insulate from the necessary either.
> > > >
> > > >
> > > >
> > > >> I've been told to correct whitespace at the end of a line before and
> > to
> > > >> re-submit a patch, seems trivial to address missing licensing files
> in
> > > the
> > > >> same way.
> > > >>
> > > >> [1] https://accumulo.apache.org/source.html
> > > >>
> > > >>
> > >
> > >
> > > --
> > > // Bill Havanki
> > > // Solutions Architect, Cloudera Govt Solutions
> > > // 443.686.9283
> > >
> >
>
>
>
> --
> // Bill Havanki
> // Solutions Architect, Cloudera Govt Solutions
> // 443.686.9283
>


Re: Time to release 1.6.1?

2014-06-19 Thread Josh Elser
I was thinking the same thing, but I also haven't made any strides 
towards getting 1.5.2 closer to happening (as I said I'd try to do).


I still lack "physical" resources to do the week-long testing as our 
guidelines currently force us to do. I still think this testing is 
excessive if we're actually releasing bug-fixes, but it does 
differentiate us from other communities.


I'm really not sure how to approach this, which is why I've been 
stalling on it.


On 6/19/14, 7:18 AM, Mike Drob wrote:

I'd like to see 1.5.2 released first, just in case there are issues we
discover during that process that need to be addressed. Also, I think it
would be useful to resolve the discussion surrounding upgrades[1] before
releasing.

[1]:
http://mail-archives.apache.org/mod_mbox/accumulo-dev/201406.mbox/%3CCAGHyZ6LFuwH%3DqGF9JYpitOY9yYDG-sop9g6iq57VFPQRnzmyNQ%40mail.gmail.com%3E


On Thu, Jun 19, 2014 at 8:09 AM, Corey Nolet  wrote:


I'd like to start getting a candidate together if there are no objections.

It looks like we have 65 resolved tickets with a fix version of 1.6.1.





Re: Running Accumulo on the IBM JVM

2014-06-19 Thread Josh Elser




 b.



org.apache.accumulo.core.security.crypto.BlockedIOStreamTest.testGiantWrite.

 This test assumes a max heap of about 1GB. This fails on IBM JRE,
since the default max heap is not specified, and on IBM JRE this depends
on the OS (see



http://www-01.ibm.com/support/knowledgecenter/SSYKE2_6.0.0/com.ibm.java.doc.diagnostics.60/diag/appendixes/defaults.html?lang=en

).
 Proposal: add -Xmx1g to the surefire maven plugin reference in
parent maven pom.




This might be https://issues.apache.org/jira/browse/ACCUMULO-2774


Yup! I actually bumped this up to 1G already after I started seeing 
failures (again) from the ACCUMULO-2774 patch which set a 768M heap. 
Pull the upstream changes and feel free to submit something to address 
any problem you still have.






  > c. Both org.apache.accumulo.core.security.crypto.CryptoTest &

org.apache.accumulo.core.file.rfile.RFileTest have lots of failures due

to

calls to SecureRandom with the Random Number Generator Provider hard-coded as
Sun. The IBM JRE has its own built-in RNG Provider called IBMJCE. Two
issues - hard-coded calls to SecureRandom.getInstance(,"SUN") and
also default value in Property class is "SUN".
 Proposal: Add mechanism to override default Property through
System property through new annotator in Property class. Only usage will
be by Property.CRYPTO_SECURE_RNG_PROVIDER





I'm not sure about adding new annotators to Property. However, the
CryptoTest should be getting the value from the conf instead of hard-coding
it. Then you can specify the correct value in accumulo-site.xml

I think another part of the issue is in
CryptoModuleFactory::fillParamsObjectFromStringMap because it looks like
that ignores the default setting.


  >

2. Environment/Configuration
 a. The generated configuration files contain references to GC
params that are specific to Sun JVM. In accumulo-env.sh, the
ACCUMULO_TSERVER_OPTS contains -XX:NewSize and -XX:MaxNewSize , and also
in ACCUMULO_GENERAL_OPTS,
-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 are used.
 b. in bin/accumulo, get ClassNotFoundException due to
specification of JAXP Doc Builder:



-Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl

.
 The Sun implementation of Document Builder Factory does not

exists

in IBM JDK, so a ClassNotFoundException is thrown on running accumulo
script

 c. MiniAccumuloCluster - in the MiniAccumuloClusterImpl,
Sun-specific GC params are passed as params to the java process (similar
to section a. )

 Single proposal for solving all three above issues:
 Enhance bootstrap_config.sh with request to select Java vendor.
Selecting this will set correct values for GC params (they differ between
IBM and Sun), and the inclusion/omission of the JAXP setting. The
MiniAccumuloClusterImpl can read the same env variable that was set in
code for the GC Params, and use in the exec command.




I don't know enough about the IBM JDK to comment on this part
intelligently. Go ahead and generate a patch, and we can use that as a
starting point for discussion.


I'm a little hesitant to remove the CMS configuration (as it really 
helps). My first thought about how to address this is you can submit 
some example Accumulo configurations that work with IBM JDK or you can 
add something to the configuration template/script (conf/examples and 
conf/templates with bin/bootstrap_config.sh, respectively). I think 
you're on the right path.
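To illustrate the kind of vendor switch bootstrap_config.sh could grow, here is a rough sketch. The vendor match string and the IBM `-Xgcpolicy:gencon` choice are my assumptions, not settings taken from this thread; verify them against the JVMs actually being targeted.

```shell
#!/usr/bin/env bash
# Sketch: choose GC options based on the JVM vendor reported by
# `java -version`. The HotSpot flags are the ones the stock Accumulo
# templates already use; the IBM J9 policy is an illustrative assumption.
select_gc_opts() {
  case "$1" in
    *"IBM J9"*)
      # J9 does not understand HotSpot's CMS flags; it has its own policies.
      echo "-Xgcpolicy:gencon"
      ;;
    *)
      # HotSpot (Sun/Oracle/OpenJDK) defaults from the generated configs.
      echo "-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75"
      ;;
  esac
}

# Example wiring in accumulo-env.sh (java -version prints to stderr):
#   ACCUMULO_GENERAL_OPTS="$(select_gc_opts "$(java -version 2>&1)")"
```

MiniAccumuloClusterImpl could consume the same selected value rather than hard-coding the HotSpot flags in its exec command.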





  >

  So far, my work has been focused on getting unit tests working for all
Java vendors in a clean manner. I have not yet run intensive testing of
real clusters following these changes, and would be happy to get pointers
to what else might need treatment.





Unit tests are a good first pass. Integration tests (mvn verify) are probably
the minimum that you want in your continuous integration once you have
things set up.

Accumulo also comes with a set of longer running, cluster based tests,
since we know that there are some pieces too complex for unit tests to
catch. Have a look in the test module for the Continuous Ingest test. Once
you get to that point, we can help you set it up if the README is unclear.


  I would also like to hear if these changes make sense, and if so, should

I go ahead and create some JIRAs, and attach my patches for commit
approval?




Filing JIRAs is going to be the most straightforward path, yes.

  >  Looking forward to hearing feedback!


Likewise. Looking forward to applying some patches!


Re: Is Data Locality Helpful? (or why run tserver and datanode on the same box?)

2014-06-19 Thread Josh Elser
I believe this happens via the DFSClient, but you can only expect the 
first block of a file to actually be on the local datanode (assuming 
there is one). Everything else is possible to be remote. Assuming you 
have a proper rack script set up, you would imagine that you'll still 
get at least one rack-local replica (so you'd have a block nearby).


Interestingly (at least to me), I believe HBase does a bit of work in 
region (tablet) assignments to try to maximize the locality of regions 
WRT the datanode that is hosting the blocks that make up that file. I 
need to dig into their code some day though.


In general, Accumulo and HBase tend to be relatively comparable to one 
another with performance when properly configured which makes me apt to 
think that data locality can help, but it's not some holy grail (of 
course you won't ever hear me claim anything be in that position). I 
will say that I haven't done any real quantitative analysis either though.


tl;dr HDFS block locality should not be affecting the functionality of 
Accumulo.
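One way to check replica placement empirically is `hdfs fsck <path> -files -blocks -locations`, which prints each block with the datanodes holding its replicas. The counting helper below is hypothetical, and the line format it assumes varies somewhat across Hadoop versions.

```shell
#!/bin/sh
# Count blocks whose replica list includes the given datanode address in
# fsck output, e.g. lines shaped like:
#   0. blk_1073741825_1001 len=134217728 repl=3 [10.0.0.1:50010, 10.0.0.2:50010]
local_block_count() {
  addr="$1"
  grep "blk_" | grep -c "$addr"
}

# Example (table path is an assumption; pick a real RFile directory):
#   hdfs fsck /accumulo/tables/1 -files -blocks -locations \
#     | local_block_count 10.0.0.1:50010
```

Comparing the count for a tablet server's own datanode against the total block count gives a crude locality ratio before and after a major compaction.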


On 6/19/14, 7:25 AM, Corey Nolet wrote:

AFAIK, the locality may not be guaranteed right away unless the data for a
tablet was first ingested on the tablet server that is responsible for that
tablet, otherwise you'll need to wait for a major compaction to rewrite the
RFiles locally on the tablet server. I would assume if the tablet server is
not on the same node as the datanode, those files will probably be spread
across the cluster as if you were ingesting data from outside the cloud.

A recent discussion with Bill Slacum also brought to light a possible
problem of the HDFS balancer [1] re-balancing blocks after the fact which
could eventually pull blocks onto datanodes that are not local to the
tablets. I believe the remedy for this was to turn off the balancer or not have
it run.

[1]
http://www.swiss-scalability.com/2013/08/hadoop-hdfs-balancer-explained.html




On Thu, Jun 19, 2014 at 10:07 AM, David Medinets 
wrote:


At the Accumulo Summit and on a recent client site, there have been
conversations about Data Locality and Accumulo.

I ran an experiment to see that Accumulo can scan tables when the
tserver process is run on a server without a datanode process. I
followed these steps:

1. Start three node cluster
2. Load data
3. Kill datanode on slave1
4. Wait until Hadoop notices dead node.
5. Kill tserver on slave2
6. Wait until Accumulo notices dead node.
7. Run the accumulo shell on master and slave1 to verify entries can be
scanned.

Accumulo handled this situation just fine. As I expected.

How important (or not) is it to run tserver and datanode on the same
server?
Does the Data Locality implied by running them together exist?
Can the benefit be quantified?





Re: moving rat to a profile?

2014-06-19 Thread Bill Havanki
I do. :)

"... to run rat:check and verify that it passes"


On Thu, Jun 19, 2014 at 11:42 AM, Mike Drob  wrote:

> I hope you mean verify the output of rat:check, and not run "mvn verify" as
> a pre-commit hook.
>
>
> On Thu, Jun 19, 2014 at 10:38 AM, Bill Havanki 
> wrote:
>
> > Ooh, had another thought. We can probably make the rat plugin run under a
> > pre-commit hook [1]. That way, while you're actively developing, the rat
> > plugin need not get in the way, but it still serves as a gatekeeper
> before
> > you can commit.
> >
> > Git also allows for hooks around git am, so committers can invoke rat
> then
> > to ensure contributed patches have licenses. That would be useful in
> case a
> > contributor never commits locally, for example (or disables the
> pre-commit
> > hook locally :) ).
> >
> > So, specifically, elements for this option:
> > * by default, either do not run rat or run it with ignoreErrors=true
> > * set pre-commit hook to run rat:check and verify
> > * set pre-applypatch hook to also run rat:check and verify
> >
> > [1] http://git-scm.com/book/en/Customizing-Git-Git-Hooks
> >
> >
> > On Tue, Jun 17, 2014 at 5:11 PM, Alex Moundalexis <
> al...@clouderagovt.com>
> > wrote:
> >
> > > I like this plan.
> > >
> > > * doesn't discourage new contributors
> > > * provides information for those who want to dig deeper
> > >
> > > On Tue, Jun 17, 2014 at 5:04 PM, Bill Havanki <
> bhava...@clouderagovt.com
> > >
> > > wrote:
> > >
> > > > It seems like a middle way would be:
> > > >
> > > > * always run the rat plugin
> > > > * configure it by default with ignoreErrors=true
> > > > * let committers / Jenkins / release managers et al. explicitly set
> > > > rat.ignoreErrors=false (in MAVEN_OPTS or wherever)
> > > >
> > > > By default, the plugin will warn about files lacking a license, but
> > will
> > > > continue the build. Contributors are exposed to the check but not
> > > > constrained by it. Example:
> > > >
> > > > ---
> > > > [INFO] Rat check: Summary of files. Unapproved: 1 unknown: 1
> > generated: 0
> > > > approved: 187 licence.
> > > > [WARNING] Rat check: 1 files with unapproved licenses. See RAT report
> > in:
> > > > /Users/bhavanki/dev/accumulo/server/base/target/rat.txt
> > > > ---
> > > >
> > > > Any entity that should enforce licenses then needs to set the
> > > ignoreErrors
> > > > flag to false. This can be part of committer onboarding.
> > > >
> > > > Bill
> > > >
> > > >
> > > > On Tue, Jun 17, 2014 at 4:59 PM, Josh Elser 
> > > wrote:
> > > >
> > > > >
> > > > >
> > > > > On 6/17/14, 1:47 PM, Alex Moundalexis wrote:
> > > > >
> > > > >> This kind of response is hardly conducive to prospective
> > contributors.
> > > > >>
> > > > >> We should consider ourselves lucky whenever a contributor
> provides a
> > > > >> patch,
> > > > >> let alone runs a build. Expecting a new contributor be fully aware
> > of
> > > > the
> > > > >> Apache licensing details isn't realistic, much less being aware of
> > the
> > > > >> arguments concerning Rat; if the ignoreErrors argument is TheWay,
> it
> > > > ought
> > > > >> to be mentioned prominently in the source documentation [1], but I
> > > don't
> > > > >> think that's correct either...
> > > > >>
> > > > >> I don't want to encourage contributors to skip the build. I want
> > > > >> contributors to be aware of the licensing requirements, but not at
> > the
> > > > >> expense of providing otherwise-viable patches. I'd recommend
> > relaxing
> > > > the
> > > > >> Rat checks for contributors, and making it a required part of the
> > > > profile
> > > > >> for automated Jenkins builds and during the release process.
> > > > >>
> > > > >> The onus should be on the committers to ensure that all of the
> > > licensing
> > > > >> is
> > > > >> in place before the release, but preferably long before that point
> > by
> > > > >> guiding the contributor to make the necessary license additions
> > before
> > > > the
> > > > >> commit.
> > > > >>
> > > > >
> > > > > This is an important thing to remember. The point of shepherding
> > > > > contributors is to eventually get them to committer status, at
> which
> > > > point
> > > > > they'll be personally responsible for these things. While we
> > definitely
> > > > > don't want to be to abrasive initially that they get fed up and go
> > > away,
> > > > we
> > > > > can't fully insulate from the necessary either.
> > > > >
> > > > >
> > > > >
> > > > >> I've been told to correct whitespace at the end of a line before
> and
> > > to
> > > > >> re-submit a patch, seems trivial to address missing licensing
> files
> > in
> > > > the
> > > > >> same way.
> > > > >>
> > > > >> [1] https://accumulo.apache.org/source.html
> > > > >>
> > > > >>
> > > >
> > > >
> > > > --
> > > > // Bill Havanki
> > > > // Solutions Architect, Cloudera Govt Solutions
> > > > // 443.686.9283
> > > >
> > >
> >
> >
> >
> > --
> > // Bill Havanki
> > // Solutions Architect, Cloudera Govt Solutions
> > // 443.686.9283
> >
>



--

Re: moving rat to a profile?

2014-06-19 Thread Christopher
Well, to be clear, rat:check executes a different plugin than the apache-rat
one we use. However, it'd be good to do "mvn validate", which will execute
the correct rat check and the enforcer plugin and other very basic checks.


--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Thu, Jun 19, 2014 at 11:52 AM, Bill Havanki 
wrote:

> I do. :)
>
> "... to run rat:check and verify that it passes"
>
>
> On Thu, Jun 19, 2014 at 11:42 AM, Mike Drob  wrote:
>
> > I hope you mean verify the output of rat:check, and not run "mvn verify"
> as
> > a pre-commit hook.
> >
> >
> > On Thu, Jun 19, 2014 at 10:38 AM, Bill Havanki <
> bhava...@clouderagovt.com>
> > wrote:
> >
> > > Ooh, had another thought. We can probably make the rat plugin run
> under a
> > > pre-commit hook [1]. That way, while you're actively developing, the
> rat
> > > plugin need not get in the way, but it still serves as a gatekeeper
> > before
> > > you can commit.
> > >
> > > Git also allows for hooks around git am, so committers can invoke rat
> > then
> > > to ensure contributed patches have licenses. That would be useful in
> > case a
> > > contributor never commits locally, for example (or disables the
> > pre-commit
> > > hook locally :) ).
> > >
> > > So, specifically, elements for this option:
> > > * by default, either do not run rat or run it with ignoreErrors=true
> > > * set pre-commit hook to run rat:check and verify
> > > * set pre-applypatch hook to also run rat:check and verify
> > >
> > > [1] http://git-scm.com/book/en/Customizing-Git-Git-Hooks
> > >
> > >
> > > On Tue, Jun 17, 2014 at 5:11 PM, Alex Moundalexis <
> > al...@clouderagovt.com>
> > > wrote:
> > >
> > > > I like this plan.
> > > >
> > > > * doesn't discourage new contributors
> > > > * provides information for those who want to dig deeper
> > > >
> > > > On Tue, Jun 17, 2014 at 5:04 PM, Bill Havanki <
> > bhava...@clouderagovt.com
> > > >
> > > > wrote:
> > > >
> > > > > It seems like a middle way would be:
> > > > >
> > > > > * always run the rat plugin
> > > > > * configure it by default with ignoreErrors=true
> > > > > * let committers / Jenkins / release managers et al. explicitly set
> > > > > rat.ignoreErrors=false (in MAVEN_OPTS or wherever)
> > > > >
> > > > > By default, the plugin will warn about files lacking a license, but
> > > will
> > > > > continue the build. Contributors are exposed to the check but not
> > > > > constrained by it. Example:
> > > > >
> > > > > ---
> > > > > [INFO] Rat check: Summary of files. Unapproved: 1 unknown: 1
> > > generated: 0
> > > > > approved: 187 licence.
> > > > > [WARNING] Rat check: 1 files with unapproved licenses. See RAT
> report
> > > in:
> > > > > /Users/bhavanki/dev/accumulo/server/base/target/rat.txt
> > > > > ---
> > > > >
> > > > > Any entity that should enforce licenses then needs to set the
> > > > ignoreErrors
> > > > > flag to false. This can be part of committer onboarding.
> > > > >
> > > > > Bill
> > > > >
> > > > >

Re: Time to release 1.6.1?

2014-06-19 Thread Christopher
Guidelines don't force anything. By definition, a guideline is a suggestion
or recommendation. Even if they were strict requirements, we can agree on
different guidelines for bugfix releases. Ultimately, it comes down to
whoever has time to create the release plan/release candidate and the
results of the vote.

I agree with Mike that 1.5.2 should get out first, and that the upgrade
discussion should complete first. If we're going to support 1.4->1.6
upgrades (and I think that's the direction we're converging on), that
should happen in 1.6.1, not later.


--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Thu, Jun 19, 2014 at 11:46 AM, Josh Elser  wrote:

> I was thinking the same thing, but I also haven't made any strides towards
> getting 1.5.2 closer to happening (as I said I'd try to do).
>
> I still lack "physical" resources to do the week-long testing as our
> guidelines currently force us to do. I still think this testing is
> excessive if we're actually releasing bug-fixes, but it does differentiate
> us from other communities.
>
> I'm really not sure how to approach this which is really why I've been
> stalling on it.
>
>
> On 6/19/14, 7:18 AM, Mike Drob wrote:
>
>> I'd like to see 1.5.2 released first, just in case there are issues we
>> discover during that process that need to be addressed. Also, I think it
>> would be useful to resolve the discussion surrounding upgrades[1] before
>> releasing.
>>
>> [1]:
>> http://mail-archives.apache.org/mod_mbox/accumulo-dev/
>> 201406.mbox/%3CCAGHyZ6LFuwH%3DqGF9JYpitOY9yYDG-
>> sop9g6iq57VFPQRnzmyNQ%40mail.gmail.com%3E
>>
>>
>> On Thu, Jun 19, 2014 at 8:09 AM, Corey Nolet  wrote:
>>
>>  I'd like to start getting a candidate together if there are no
>>> objections.
>>>
>>> It looks like we have 65 resolved tickets with a fix version of 1.6.1.
>>>
>>>
>>


Re: Initial Release Plan for 2.0.0

2014-06-19 Thread Christopher
Since there's not been any further activity on this thread, I'm going to
assume there's no issue bumping the version in JIRA to 2.0.0 instead of
1.7.0 and in git master branch.


--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Sat, Jun 7, 2014 at 1:17 PM, Christopher  wrote:

> On Sat, Jun 7, 2014 at 12:52 PM, Sean Busbey  wrote:
>
>> inline
>>
>>
>> On Sat, Jun 7, 2014 at 11:40 AM, Christopher  wrote:
>>
>> > I'd consider the compatibility statement a blocker for the release, but
>> not
>> > a blocker for the release plan.
>> >
>> >
>>
>> Certainly. I just don't see it listed on the release plan for something
>> we'd want done prior to releasing. That's the only reason I mentioned it.
>
>
> I knew I'd forget something important. :)
>
> > I said 2.2, because the only Hadoop releases prior to that in the 2.x
>> > series are alpha/beta releases... and I wouldn't want to have to
>> maintain
>> > compatibility with alpha/beta releases. It may be that those would work
>> > just fine... I just don't want to make it a goal.
>> >
>> >
>> That sounds reasonable to me. I just want to make sure we discuss it in
>> case someone else has a particular need for an earlier compat.
>
>
> Personally, I think a requirement to use an alpha/beta HDFS is probably a
> sufficiently fringe use case to expect those users to consider patching
> upstream Accumulo or upgrading to a stable HDFS release, rather than
> require our work to hold back to the alpha/beta APIs. However, I don't feel
> strongly, and we can easily do testing on 2.0.0 <= HDFS < 2.2.0 also.
>
> > Given our past history of releases, I think Sept 12 would be *way* too
>> > optimistic. This timeline is already shorter than the 1.6 one, but I
>> want
>> > to be practical. If things go more rapidly than we expect, we can
>> release
>> > earlier, but I'd rather not impose an artificial rush on things.
>> >
>> >
>> Didn't 1.6 have a much larger target feature set? I don't recall if a
>> formal set of "what do we want in 1.6" plan happened, but IIRC the meeting
>> notes from the initial video chat discussion had a fairly extensive list.
>>
>
> It's hard to say which has a larger feature set, since the list for 2.0.0
> is not exhaustive. Something that should be understood about the history of
> Accumulo development, is that it is far more dynamic than a formulaic
> enumeration of features followed by a release of those features. We tend to
> identify new needs and opportunities during the active development on the
> next major release, and not just up front at the beginning. I know this
> isn't necessarily ideal, and we may want to work towards something more
> formulaic in the future (I don't think we'll get there overnight), but
> that's the reality of the project as it has been and currently is.
>
> To put this in context, the video discussion/meeting notes you're
> referring to at the beginning of the 1.6.0 development wasn't a decision
> about what features we were including, in the formulaic, up-front feature
> set sense (at least, that's not how I saw it). Rather, it was a discussion
> about what features each of the people involved wanted to work on and
> accomplish. It was not an exhaustive list, and some of the things we
> discussed didn't get done. Other contributors, who weren't even in the
> video discussion contributed to yet other features. So, there's still many
> opportunities for people to pick up existing and neglected tickets, as well
> as new ones, and complete them for 2.0.0.
>
>
>> The obvious blocker is going to be the new API. Probably that work can be
>> broken up across multiple people though?
>>
>
> Yes. There's already multiple issues for that, and will almost certainly
> include development contributions from multiple developers.
>


Re: Is Data Locality Helpful? (or why run tserver and datanode on the same box?)

2014-06-19 Thread Josh Elser
I may also be getting this conflated with how reads work. Time for me to 
read some HDFS code.


On 6/19/14, 8:52 AM, Josh Elser wrote:

I believe this happens via the DfsClient, but you can only expect the
first block of a file to actually be on the local datanode (assuming
there is one). Everything else may be remote. Assuming you
have a proper rack script set up, you would imagine that you'll still
get at least one rack-local replica (so you'd have a block nearby).
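Josh's expectation above matches HDFS's default replica placement: the first copy goes to the writer's node when that node runs a datanode, and the remaining copies go off-rack. A minimal sketch of that policy, with made-up node and rack names (a simplification, not the actual NameNode code):

```python
import random

def place_replicas(writer, topology):
    """Sketch of HDFS default placement: replica 1 on the writer's node
    (if it is a datanode), replica 2 on a different rack, replica 3 on
    the same rack as replica 2 but a different node."""
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    first = writer if writer in rack_of else random.choice(list(rack_of))
    remote_rack = random.choice([r for r in topology if r != rack_of[first]])
    second = random.choice(topology[remote_rack])
    third = random.choice([n for n in topology[remote_rack] if n != second])
    return [first, second, third]

# Hypothetical two-rack cluster; the writer dn1 also runs a datanode.
topology = {"rack1": ["dn1", "dn2"], "rack2": ["dn3", "dn4"]}
replicas = place_replicas("dn1", topology)
print(replicas)  # first replica is always the writer's own node
```

This is why only the blocks a tserver wrote itself can be assumed local: the other two replicas land wherever the policy sends them.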

Interestingly (at least to me), I believe HBase does a bit of work in
region (tablet) assignments to try to maximize the locality of regions
WRT the datanode that is hosting the blocks that make up that file. I
need to dig into their code some day though.

In general, Accumulo and HBase tend to be relatively comparable to one
another with performance when properly configured which makes me apt to
think that data locality can help, but it's not some holy grail (of
course you won't ever hear me claim anything to be in that position). I
will say that I haven't done any real quantitative analysis either though.

tl;dr HDFS block locality should not be affecting the functionality of
Accumulo.

On 6/19/14, 7:25 AM, Corey Nolet wrote:

AFAIK, the locality may not be guaranteed right away unless the data
for a
tablet was first ingested on the tablet server that is responsible for
that
tablet, otherwise you'll need to wait for a major compaction to
rewrite the
RFiles locally on the tablet server. I would assume if the tablet
server is
not on the same node as the datanode, those files will probably be spread
across the cluster as if you were ingesting data from outside the cloud.

A recent discussion with Bill Slacum also brought to light a possible
problem of the HDFS balancer [1] re-balancing blocks after the fact which
could eventually pull blocks onto datanodes that are not local to the
tablets. I believe the remedy for this was to turn off the balancer or
not have it run.

[1]
http://www.swiss-scalability.com/2013/08/hadoop-hdfs-balancer-explained.html





On Thu, Jun 19, 2014 at 10:07 AM, David Medinets

wrote:


At the Accumulo Summit and on a recent client site, there have been
conversations about Data Locality and Accumulo.

I ran an experiment to see that Accumulo can scan tables when the
tserver process is run on a server without a datanode process. I
followed these steps:

1. Start three node cluster
2. Load data
3. Kill datanode on slave1
4. Wait until Hadoop notices the dead node.
5. Kill tserver on slave2
6. Wait until Accumulo notices the dead node.
7. Run the accumulo shell on master and slave1 to verify entries can be
scanned.

Accumulo handled this situation just fine. As I expected.

How important (or not) is it to run tserver and datanode on the same
server?
Does the Data Locality implied by running them together exist?
Can the benefit be quantified?
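One way to start quantifying the benefit is to measure, per tserver, what fraction of its files' blocks have a replica on that server. On a real cluster the block-to-hosts mapping would come from FileSystem#getFileBlockLocations or `hdfs fsck -locations`; the data below is entirely hypothetical, just to show the bookkeeping:

```python
def locality_fraction(tserver_host, block_hosts):
    """Fraction of a file's blocks with a replica on the given host."""
    local = sum(1 for hosts in block_hosts if tserver_host in hosts)
    return local / len(block_hosts)

# Hypothetical replica hosts for the 4 blocks of one RFile.
blocks = [
    ["slave1", "slave2", "slave3"],  # block 0: written locally on slave1
    ["slave2", "slave3", "master"],
    ["slave1", "slave3", "master"],
    ["slave2", "master", "slave3"],
]
print(locality_fraction("slave1", blocks))  # 0.5
```

Comparing this fraction against observed scan throughput, before and after a major compaction, would put a number on how much locality actually buys.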





Re: Time to release 1.6.1?

2014-06-19 Thread Josh Elser
In the below context, I was using the term "guidelines" loosely, not in 
the strictest grammatical sense. I did not see anything on 
http://accumulo.apache.org/governance/releasing.html that makes me think 
one way or the other.


The general verbiage of the page is using a SHOULD context which is 
usually interpreted as a "must". I just don't want to ruffle anyone's 
feathers (or waste my own time) if it's going to be -1'ed because of 
insufficient testing.


On 6/19/14, 9:27 AM, Christopher wrote:

Guidelines don't force anything. By definition, a guideline is a suggestion
or recommendation. Even if they were strict requirements, we can agree on
different guidelines for bugfix releases. Ultimately, it comes down to
whoever has time to create the release plan/release candidate and the
results of the vote.

I agree with Mike that 1.5.2 should get out first, and that the upgrade
discussion should complete first. If we're going to support 1.4->1.6
upgrades (and I think that's the direction we're converging on), that
should happen in 1.6.1, not later.


--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Thu, Jun 19, 2014 at 11:46 AM, Josh Elser  wrote:


I was thinking the same thing, but I also haven't made any strides towards
getting 1.5.2 closer to happening (as I said I'd try to do).

I still lack "physical" resources to do the week-long testing as our
guidelines currently force us to do. I still think this testing is
excessive if we're actually releasing bug-fixes, but it does differentiate
us from other communities.

I'm really not sure how to approach this which is really why I've been
stalling on it.


On 6/19/14, 7:18 AM, Mike Drob wrote:


I'd like to see 1.5.2 released first, just in case there are issues we
discover during that process that need to be addressed. Also, I think it
would be useful to resolve the discussion surrounding upgrades[1] before
releasing.

[1]:
http://mail-archives.apache.org/mod_mbox/accumulo-dev/
201406.mbox/%3CCAGHyZ6LFuwH%3DqGF9JYpitOY9yYDG-
sop9g6iq57VFPQRnzmyNQ%40mail.gmail.com%3E


On Thu, Jun 19, 2014 at 8:09 AM, Corey Nolet  wrote:

  I'd like to start getting a candidate together if there are no

objections.

It looks like we have 65 resolved tickets with a fix version of 1.6.1.








Re: Time to release 1.6.1?

2014-06-19 Thread Christopher
"I just don't want to ruffle anyone's feathers (or waste my own time) if
it's going to be -1'ed because of insufficient testing."

Yeah, understood. I'm just thinking that it might be good to first propose
new guidelines for a bugfix release, and then release accordingly. If
somebody objects to the looser guidelines (it won't be me), that should
come out in the guidelines proposal, rather than hold up the release.


--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Thu, Jun 19, 2014 at 1:03 PM, Josh Elser  wrote:

> In the below context, I was using the term "guidelines" loosely, not in
> the strictest grammatical sense. I did not see anything on
> http://accumulo.apache.org/governance/releasing.html that makes me think
> one way or the other.
>
> The general verbiage of the page is using a SHOULD context which is usually
> interpreted as a "must". I just don't want to ruffle anyone's feathers (or
> waste my own time) if it's going to be -1'ed because of insufficient
> testing.
>
>
> On 6/19/14, 9:27 AM, Christopher wrote:
>
>> Guidelines don't force anything. By definition, a guideline is a
>> suggestion
>> or recommendation. Even if they were strict requirements, we can agree on
>> different guidelines for bugfix releases. Ultimately, it comes down to
>> whoever has time to create the release plan/release candidate and the
>> results of the vote.
>>
>> I agree with Mike that 1.5.2 should get out first, and that the upgrade
>> discussion should complete first. If we're going to support 1.4->1.6
>> upgrades (and I think that's the direction we're converging on), that
>> should happen in 1.6.1, not later.
>>
>>
>> --
>> Christopher L Tubbs II
>> http://gravatar.com/ctubbsii
>>
>>
>> On Thu, Jun 19, 2014 at 11:46 AM, Josh Elser 
>> wrote:
>>
>>  I was thinking the same thing, but I also haven't made any strides
>>> towards
>>> getting 1.5.2 closer to happening (as I said I'd try to do).
>>>
>>> I still lack "physical" resources to do the week-long testing as our
>>> guidelines currently force us to do. I still think this testing is
>>> excessive if we're actually releasing bug-fixes, but it does
>>> differentiate
>>> us from other communities.
>>>
>>> I'm really not sure how to approach this which is really why I've been
>>> stalling on it.
>>>
>>>
>>> On 6/19/14, 7:18 AM, Mike Drob wrote:
>>>
>>>  I'd like to see 1.5.2 released first, just in case there are issues we
 discover during that process that need to be addressed. Also, I think it
 would be useful to resolve the discussion surrounding upgrades[1] before
 releasing.

 [1]:
 http://mail-archives.apache.org/mod_mbox/accumulo-dev/
 201406.mbox/%3CCAGHyZ6LFuwH%3DqGF9JYpitOY9yYDG-
 sop9g6iq57VFPQRnzmyNQ%40mail.gmail.com%3E


 On Thu, Jun 19, 2014 at 8:09 AM, Corey Nolet  wrote:

   I'd like to start getting a candidate together if there are no

> objections.
>
> It looks like we have 65 resolved tickets with a fix version of 1.6.1.
>
>
>

>>


Re: Is Data Locality Helpful? (or why run tserver and datanode on the same box?)

2014-06-19 Thread Donald Miner
I had to think about this problem a lot for a product I worked on at one
point, but I think a lot of the same applies here.

To Corey's point, running the rebalancer is most definitely an issue, but
simply turning it off is not a good answer in a lot of situations. It
exists for a reason! You can run into problems with highly utilized
clusters where individual data nodes run out of disk space and all kinds of
bad things start to happen then. Also, if you are using the cluster
for MapReduce, you can see performance gains by rebalancing on highly
utilized clusters.

In general, the placement of blocks is the NameNode's responsibility, so
even if it's nice to assume that blocks get written to the local data node,
that's not really an assumption you can always make.

There has been talk about custom block placement strategies for HDFS in the
NameNode. I just checked up on it and it does look like it is on the
horizon: https://issues.apache.org/jira/browse/HDFS-2576
In theory, you could have Accumulo "hint" that it wants certain blocks
colocated.

There is another interesting problem with the results of minor compactions.
Let's say you've been minor compacting all day and have a dozen or so of
these files written. The replication policy is pretty random. Say the
DataNode that the tablet server runs on has a fatal problem and never
comes back. There is no way
to "collect" the replicas together onto one DataNode-- they are scattered
all over data nodes. Eventually a major compaction happens and all is good
again. There were some ideas of telling NameNode that certain blocks have
an affinity for one another to keep them together.
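Donald's scatter point can be illustrated with a quick simulation: a dozen single-block minor-compaction files, each with its first replica on the local tserver's node and two more replicas placed at random. The cluster size, file count, and node names are all invented for illustration (and the random placement is a crude stand-in for HDFS's actual policy):

```python
import random

random.seed(42)  # deterministic for the example
nodes = [f"dn{i}" for i in range(20)]
local = "dn0"  # datanode co-located with the tserver

# Each minor-compacted file: one block, replica 1 on the local node,
# replicas 2 and 3 on two random other nodes.
files = []
for _ in range(12):
    others = random.sample([n for n in nodes if n != local], 2)
    files.append([local] + others)

# If dn0 dies, how many of the 12 files does the best surviving node hold?
coverage = {n: sum(n in f for f in files) for n in nodes if n != local}
print(max(coverage.values()), "of", len(files))
```

With dn0 gone, no single surviving node holds anywhere near all twelve files; that scatter persists until a major compaction rewrites them.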

I think this can be solved scientifically pretty easily on a live
production cluster:
 Step 1: measure performance of your current application and note if it
does lots of single fetches, full table scans, etc.
 Step 2: Run the rebalancer
 Step 3: measure performance again
 Step 4: force major compaction to move everything back (optional)
Unfortunately I don't have any systems right now that I could do this on
that would provide me any sort of real results.
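The four steps above amount to a simple before/after timing harness. A minimal sketch, where `run_scan` is a hypothetical stand-in for whatever client workload you measure (here it just sleeps; on a real cluster it would be a scan via the Accumulo client API):

```python
import statistics
import time

def run_scan():
    # Stand-in for the real workload; replace with the query you care about.
    time.sleep(0.01)

def measure(trials=5):
    """Median wall-clock time of the workload over several trials."""
    times = []
    for _ in range(trials):
        start = time.perf_counter()
        run_scan()
        times.append(time.perf_counter() - start)
    return statistics.median(times)

before = measure()          # step 1: baseline
# ... run `hdfs balancer` here (step 2) ...
after = measure()           # step 3: remeasure
print(f"before={before:.4f}s after={after:.4f}s")
```

Using the median over several trials damps out cache warm-up and GC noise, which matters more than the exact workload chosen.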

Overall, on the question of whether it matters: it most absolutely does.
Slurping off disk locally is always going to be faster than slurping off
disk AND going over the network. The real question is if it's worth our
time. 10GigE is a beautiful thing. In some cases it may be, in others it
may not. For example, if you are just doing small fetches of data here and
there you might not notice. I imagine if you were doing multiple large
scans you might start seeing your network get saturated. I think this also
becomes a problem at larger scales where your network infrastructure is a
bit more ridiculous. Let's say for the sake of argument you have a 25,000
node Accumulo cluster... you might have some sort of tiered network where
you are constrained from a throughput perspective somewhere. This would
matter then.

My 8 cents,

-d








On Thu, Jun 19, 2014 at 12:56 PM, Josh Elser  wrote:

> I may also be getting this conflated with how reads work. Time for me to
> read some HDFS code.
>
>
> On 6/19/14, 8:52 AM, Josh Elser wrote:
>
>> I believe this happens via the DfsClient, but you can only expect the
>> first block of a file to actually be on the local datanode (assuming
>> there is one). Everything else may be remote. Assuming you
>> have a proper rack script set up, you would imagine that you'll still
>> get at least one rack-local replica (so you'd have a block nearby).
>>
>> Interestingly (at least to me), I believe HBase does a bit of work in
>> region (tablet) assignments to try to maximize the locality of regions
>> WRT the datanode that is hosting the blocks that make up that file. I
>> need to dig into their code some day though.
>>
>> In general, Accumulo and HBase tend to be relatively comparable to one
>> another with performance when properly configured which makes me apt to
>> think that data locality can help, but it's not some holy grail (of
>> course you won't ever hear me claim anything to be in that position). I
>> will say that I haven't done any real quantitative analysis either though.
>>
>> tl;dr HDFS block locality should not be affecting the functionality of
>> Accumulo.
>>
>> On 6/19/14, 7:25 AM, Corey Nolet wrote:
>>
>>> AFAIK, the locality may not be guaranteed right away unless the data
>>> for a
>>> tablet was first ingested on the tablet server that is responsible for
>>> that
>>> tablet, otherwise you'll need to wait for a major compaction to
>>> rewrite the
>>> RFiles locally on the tablet server. I would assume if the tablet
>>> server is
>>> not on the same node as the datanode, those files will probably be spread
>>> across the cluster as if you were ingesting data from outside the cloud.
>>>
>>> A recent discussion with Bill Slacum also brought to light a possible
>>> problem of the HDFS balancer [1] re-balancing blocks after the fact which
>>> could eventually pull blocks onto datanodes that are

Reduced testing burden for bug-fix releases

2014-06-19 Thread Josh Elser
As we're starting to consider 1.5.2 and 1.6.1 coming out in the near 
future, I want to revisit a discussion[1] I started at the end of April 
regarding the "testing burden" that is currently set forth in our 
release document[2].


What I'm proposing is to modify the language of the release document to 
be explicit about the amount of testing needed. For bug-fix, "minor" 
releases (e.g. 1.5.2 and 1.6.1), the 7 days of testing using continuous 
ingest and randomwalk (with and without agitation) will be clearly 
defined as "may" instead of "should" or "must" language. If the 
resources are available, it is recommended that some longer, 
multi-process/node test is run against the release candidate; however, 
it is not required and should not prevent us from making the minor release.


I will also include language that strongly recommends that the changes 
included in the "minor" release be vetted/reviewed as a way to mitigate 
the risk of shipping new regressions.


I am not recommending that the language be changed for "major" releases 
(e.g. 1.7.0 and 2.0.0) as these releases still imply significant new 
features or internal changes.


Unless someone informs me otherwise, I will treat this as a normal 
lazy-consensus approval. Assuming we move closer to "proper" semantic 
versioning for 2.0.0, I believe these updated guidelines will change 
again. I do however think there is merit in making this change now so 
that we can get the good bugs that we've fixed out to our users.


Let me know what you think. I will wait, at least, the prescribed three 
days before changing anything.


- Josh

[1] 
http://mail-archives.apache.org/mod_mbox/accumulo-dev/201404.mbox/%3C535931A7.30605%40gmail.com%3E

[2] http://accumulo.apache.org/governance/releasing.html


Re: Reduced testing burden for bug-fix releases

2014-06-19 Thread Christopher
+1 for reduced testing burden for bugfixes and a change in the guidelines
to reflect that.


--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Thu, Jun 19, 2014 at 3:13 PM, Josh Elser  wrote:

> As we're starting to consider 1.5.2 and 1.6.1 coming out in the near
> future, I want to revisit a discussion[1] I started at the end of April
> regarding the "testing burden" that is currently set forth in our release
> document[2].
>
> What I'm proposing is to modify the language of the release document to be
> explicit about the amount of testing needed. For bug-fix, "minor" releases
> (e.g. 1.5.2 and 1.6.1), the 7 days of testing using continuous ingest and
> randomwalk (with and without agitation) will be clearly defined as "may"
> instead of "should" or "must" language. If the resources are available, it
> is recommended that some longer, multi-process/node test is run against the
> release candidate; however, it is not required and should not prevent us
> from making the minor release.
>
> I will also include language that strongly recommends that the changes
> included in the "minor" release be vetted/reviewed as a way to mitigate the
> risk of shipping new regressions.
>
> I am not recommending that the language be changed for "major" releases
> (e.g. 1.7.0 and 2.0.0) as these releases still imply significant new
> features or internal changes.
>
> Unless someone informs me otherwise, I will treat this as a normal
> lazy-consensus approval. Assuming we move closer to "proper" semantic
> versioning for 2.0.0, I believe these updated guidelines will change again.
> I do however think there is merit in making this change now so that we can
> get the good bugs that we've fixed out to our users.
>
> Let me know what you think. I will wait, at least, the prescribed three
> days before changing anything.
>
> - Josh
>
> [1] http://mail-archives.apache.org/mod_mbox/accumulo-dev/
> 201404.mbox/%3C535931A7.30605%40gmail.com%3E
> [2] http://accumulo.apache.org/governance/releasing.html
>


Re: Reduced testing burden for bug-fix releases

2014-06-19 Thread David Medinets
-1 I hesitate to step into this discussion because I can't also step
up and do the long-term testing even as I recommend that it must be
done. There are at least four companies supporting Accumulo and
contributing back to the project. Surely one of those companies can
supply the resources to continue the existing test regimen? Is there
some concern that those resources won't be available for the next
release cycle?

On Thu, Jun 19, 2014 at 3:13 PM, Josh Elser  wrote:
> As we're starting to consider 1.5.2 and 1.6.1 coming out in the near future,
> I want to revisit a discussion[1] I started at the end of April regarding
> the "testing burden" that is currently set forth in our release document[2].
>
> What I'm proposing is to modify the language of the release document to be
> explicit about the amount of testing needed. For bug-fix, "minor" releases
> (e.g. 1.5.2 and 1.6.1), the 7 days of testing using continuous ingest and
> randomwalk (with and without agitation) will be clearly defined as "may"
> instead of "should" or "must" language. If the resources are available, it
> is recommended that some longer, multi-process/node test is run against the
> release candidate; however, it is not required and should not prevent us
> from making the minor release.
>
> I will also include language that strongly recommends that the changes
> included in the "minor" release be vetted/reviewed as a way to mitigate the
> risk of shipping new regressions.
>
> I am not recommending that the language be changed for "major" releases
> (e.g. 1.7.0 and 2.0.0) as these releases still imply significant new
> features or internal changes.
>
> Unless someone informs me otherwise, I will treat this as a normal
> lazy-consensus approval. Assuming we move closer to "proper" semantic
> versioning for 2.0.0, I believe these updated guidelines will change again.
> I do however think there is merit in making this change now so that we can
> get the good bugs that we've fixed out to our users.
>
> Let me know what you think. I will wait, at least, the prescribed three days
> before changing anything.
>
> - Josh
>
> [1]
> http://mail-archives.apache.org/mod_mbox/accumulo-dev/201404.mbox/%3C535931A7.30605%40gmail.com%3E
> [2] http://accumulo.apache.org/governance/releasing.html


Re: Reduced testing burden for bug-fix releases

2014-06-19 Thread Keith Turner
On Thu, Jun 19, 2014 at 3:13 PM, Josh Elser  wrote:

> As we're starting to consider 1.5.2 and 1.6.1 coming out in the near
> future, I want to revisit a discussion[1] I started at the end of April
> regarding the "testing burden" that is currently set forth in our release
> document[2].
>
> What I'm proposing is to modify the language of the release document to be
> explicit about the amount of testing needed. For bug-fix, "minor" releases
> (e.g. 1.5.2 and 1.6.1), the 7 days of testing using continuous ingest and
> randomwalk (with and without agitation) will be clearly defined as "may"
> instead of "should" or "must" language. If the resources are available, it
> is recommended that some longer, multi-process/node test is run against the
> release candidate; however, it is not required and should not prevent us
> from making the minor release.
>
> I will also include language that strongly recommends that the changes
> included in the "minor" release be vetted/reviewed as a way to mitigate the
> risk of shipping new regressions.
>
> I am not recommending that the language be changed for "major" releases
> (e.g. 1.7.0 and 2.0.0) as these releases still imply significant new
> features or internal changes.
>
> Unless someone informs me otherwise, I will treat this as a normal
> lazy-consensus approval. Assuming we move closer to "proper" semantic
> versioning for 2.0.0, I believe these updated guidelines will change again.
> I do however think there is merit in making this change now so that we can
> get the good bugs that we've fixed out to our users.
>
> Let me know what you think. I will wait, at least, the prescribed three
> days before changing anything.
>

I am in favor of lowering the required threshold.  If someone feels not
enough testing was done, they can vote -1 on the release vote.


>
> - Josh
>
> [1] http://mail-archives.apache.org/mod_mbox/accumulo-dev/
> 201404.mbox/%3C535931A7.30605%40gmail.com%3E
> [2] http://accumulo.apache.org/governance/releasing.html
>


Re: Reduced testing burden for bug-fix releases

2014-06-19 Thread Christopher
I don't know Josh's concerns, but the concern for me is both resources and
time. No matter how much resources we have, it is still not infinite, and
I'd rather we focus our testing efforts on the changeset between the
previous release and the minor/bugfix release, rather than spend resources
and time on all the exhaustive general testing, which mostly exercises code
that has not changed.

For instance, do we really need 72-hours of continuous ingest on a large
cluster to release a bugfix which affects the shell?
If the long running tests are what is necessary to exercise the changeset,
that makes sense, but otherwise, no.



--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Thu, Jun 19, 2014 at 3:20 PM, David Medinets 
wrote:

> -1 I hesitate to step into this discussion because I can't also step
> up and do the long-term testing even as I recommend that it must be
> done. There are at least four companies supporting Accumulo and
> contributing back to the project. Surely one of those companies can
> supply the resources to continue the existing test regimen? Is there
> some concern that those resources won't be available for the next
> release cycle?
>
> On Thu, Jun 19, 2014 at 3:13 PM, Josh Elser  wrote:
> > As we're starting to consider 1.5.2 and 1.6.1 coming out in the near
> future,
> > I want to revisit a discussion[1] I started at the end of April regarding
> > the "testing burden" that is currently set forth in our release
> document[2].
> >
> > What I'm proposing is to modify the language of the release document to
> be
> > explicit about the amount of testing needed. For bug-fix, "minor"
> releases
> > (e.g. 1.5.2 and 1.6.1), the 7 days of testing using continuous ingest and
> > randomwalk (with and without agitation) will be clearly defined as "may"
> > instead of "should" or "must" language. If the resources are available,
> it
> > is recommended that some longer, multi-process/node test is run against
> the
> > release candidate; however, it is not required and should not prevent us
> > from making the minor release.
> >
> > I will also include language that strongly recommends that the changes
> > included in the "minor" release be vetted/reviewed as a way to mitigate
> the
> > risk of shipping new regressions.
> >
> > I am not recommending that the language be changed for "major" releases
> > (e.g. 1.7.0 and 2.0.0) as these releases still imply significant new
> > features or internal changes.
> >
> > Unless someone informs me otherwise, I will treat this as a normal
> > lazy-consensus approval. Assuming we move closer to "proper" semantic
> > versioning for 2.0.0, I believe these updated guidelines will change
> again.
> > I do however think there is merit in making this change now so that we
> can
> > get the good bugs that we've fixed out to our users.
> >
> > Let me know what you think. I will wait, at least, the prescribed three
> days
> > before changing anything.
> >
> > - Josh
> >
> > [1]
> >
> http://mail-archives.apache.org/mod_mbox/accumulo-dev/201404.mbox/%3C535931A7.30605%40gmail.com%3E
> > [2] http://accumulo.apache.org/governance/releasing.html
>


Re: Reduced testing burden for bug-fix releases

2014-06-19 Thread Josh Elser

I'm sorry, David, but you need to remember that we are individuals.

My employer is absolutely irrelevant to the equation. Unless you are 
willing to supply the funds to pay to use some resources, I don't feel 
like this is a valid -1.


On 6/19/14, 12:20 PM, David Medinets wrote:

-1 I hesitate to step into this discussion because I can't also step
up and do the long-term testing even as I recommend that it must be
done. There are at least four companies supporting Accumulo and
contributing back to the project. Surely one of those companies can
supply the resources to continue the existing test regimen? Is there
some concern that those resources won't be available for the next
release cycle?

On Thu, Jun 19, 2014 at 3:13 PM, Josh Elser  wrote:

As we're starting to consider 1.5.2 and 1.6.1 coming out in the near future,
I want to revisit a discussion[1] I started at the end of April regarding
the "testing burden" that is currently set forth in our release document[2].

What I'm proposing is to modify the language of the release document to be
explicit about the amount of testing needed. For bug-fix, "minor" releases
(e.g. 1.5.2 and 1.6.1), the 7 days of testing using continuous ingest and
randomwalk (with and without agitation) will be clearly defined as "may"
instead of "should" or "must" language. If the resources are available, it
is recommended that some longer, multi-process/node test is run against the
release candidate; however, it is not required and should not prevent us
from making the minor release.

I will also include language that strongly recommends that the changes
included in the "minor" release be vetted/reviewed as a way to mitigate the
risk of shipping new regressions.

I am not recommending that the language be changed for "major" releases
(e.g. 1.7.0 and 2.0.0) as these releases still imply significant new
features or internal changes.

Unless someone informs me otherwise, I will treat this as a normal
lazy-consensus approval. Assuming we move closer to "proper" semantic
versioning for 2.0.0, I believe these updated guidelines will change again.
I do however think there is merit in making this change now so that we can
get the good bugs that we've fixed out to our users.

Let me know what you think. I will wait, at least, the prescribed three days
before changing anything.

- Josh

[1]
http://mail-archives.apache.org/mod_mbox/accumulo-dev/201404.mbox/%3C535931A7.30605%40gmail.com%3E
[2] http://accumulo.apache.org/governance/releasing.html


Re: Reduced testing burden for bug-fix releases

2014-06-19 Thread Josh Elser

On 6/19/14, 12:36 PM, Keith Turner wrote:

I am in favor of lowering the required threshold.  If someone feels not
enough testing was done, they can vote -1 on the release vote.


I like that. If there is a specific area/reason to be concerned, it can 
be addressed as a part of the regular release voting process.


Re: Reduced testing burden for bug-fix releases

2014-06-19 Thread Josh Elser
For context, I do not have the physical resources available to me at any 
point in time.


Time is also a concern, but usually less of one, because I don't mind 
working evenings and weekends to get this done, and I can usually 
work on my other priorities concurrently.


On 6/19/14, 12:38 PM, Christopher wrote:

I don't know Josh's concerns, but the concern for me is both resources and
time. No matter how many resources we have, they are still not infinite, and
I'd rather we focus our testing efforts on the changeset between the
previous release and the minor/bugfix release, rather than spend resources
and time on all the exhaustive general testing, which mostly exercises code
that has not changed.

For instance, do we really need 72 hours of continuous ingest on a large
cluster to release a bugfix which affects the shell?
If the long-running tests are necessary to exercise the changeset,
that makes sense, but otherwise, no.
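Christopher's point about scoping tests to the changeset can be sketched with a short git example. This is purely illustrative: the tag names, file name, and repository here are invented for the demo, not taken from the Accumulo release process.

```shell
#!/bin/sh
set -e
# Sketch (hypothetical tags/files): list the files changed between a previous
# release tag and a release-candidate tag, so testing can target that changeset.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git -c user.email=a@b -c user.name=demo commit -q --allow-empty -m "1.5.1 release"
git tag 1.5.1
echo fix > Shell.java                     # stand-in for a shell-only bugfix
git add Shell.java
git -c user.email=a@b -c user.name=demo commit -q -m "bugfix to shell"
git tag 1.5.2-RC1
# The changeset to focus testing on:
git diff --name-only 1.5.1..1.5.2-RC1    # prints: Shell.java
```

A reviewer could use a listing like this to argue that a release candidate touching only shell code does not need a week of cluster-scale continuous ingest.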



--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Thu, Jun 19, 2014 at 3:20 PM, David Medinets wrote:


-1 I hesitate to step into this discussion because I can't also step
up and do the long-term testing even as I recommend that it must be
done. There are at least four companies supporting Accumulo and
contributing back to the project. Surely one of those companies can
supply the resources to continue the existing test regimen? Is there
some concern that those resources won't be available for the next
release cycle?



Re: Reduced testing burden for bug-fix releases

2014-06-19 Thread David Medinets
+0 I'm changing my vote after some reconsideration. Having the ability
to vote -1 on a release as Keith mentioned is good enough for a
bug-fix release.
