Re: Review Request 26188: ACCUMULO-3176 Add ability to create a table with user specified initial properties
On Oct. 6, 2014, 2:19 p.m., Jenna Huston wrote: Did you plan to change the proxy API and/or create table command in shell? Jenna Huston wrote: I am in the process of testing the new command option in the shell. Can you suggest a good test to look at for testing the new option? test/src/test/java/org/apache/accumulo/test/ShellServerIT.java - kturner --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26188/#review55498 --- On Oct. 3, 2014, 5:30 p.m., Jenna Huston wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26188/ --- (Updated Oct. 3, 2014, 5:30 p.m.) Review request for accumulo. Bugs: ACCUMULO-3176 https://issues.apache.org/jira/browse/ACCUMULO-3176 Repository: accumulo Description --- Gives the ability to add properties to tables before they are initialized. Therefore these properties will take effect before the default tablet is created. We create a NewTableConfiguration class and send that in the create method as opposed to adding another method. 
Diffs - core/src/main/java/org/apache/accumulo/core/client/admin/TableOperations.java 97f538d core/src/main/java/org/apache/accumulo/core/client/impl/NewTableConfiguration.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/client/impl/TableOperationsImpl.java e46b9c9 core/src/main/java/org/apache/accumulo/core/client/mock/MockAccumulo.java 32dbb28 core/src/main/java/org/apache/accumulo/core/client/mock/MockTable.java 35cbdd2 core/src/main/java/org/apache/accumulo/core/client/mock/MockTableOperations.java 08750fe core/src/test/java/org/apache/accumulo/core/client/impl/TableOperationsHelperTest.java 02838ed proxy/src/main/java/org/apache/accumulo/proxy/ProxyServer.java a778add shell/src/main/java/org/apache/accumulo/shell/commands/CreateTableCommand.java 81b39d2 test/src/test/java/org/apache/accumulo/test/CreateTableWithNewTableConfigIT.java PRE-CREATION Diff: https://reviews.apache.org/r/26188/diff/ Testing --- New IT, ran unit test and integration tests Thanks, Jenna Huston
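The description in the review request above (pass a configuration object to create so that properties are in place before the table exists) can be illustrated with a minimal sketch. Everything here is a hypothetical stand-in written for illustration: SketchTableConfig, InMemoryTables, and their methods are assumptions, not the classes or signatures from the patch under review, and the property value is arbitrary.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class CreateWithPropertiesSketch {

    /** Hypothetical stand-in for a "new table configuration" object. */
    static final class SketchTableConfig {
        private final Map<String, String> properties = new HashMap<>();

        /** Adds the given properties and returns this object so calls can be chained. */
        SketchTableConfig setProperties(Map<String, String> props) {
            properties.putAll(props);
            return this;
        }

        Map<String, String> getProperties() {
            return Collections.unmodifiableMap(properties);
        }
    }

    /** Hypothetical in-memory registry; it only records what each table was created with. */
    static final class InMemoryTables {
        private final Map<String, Map<String, String>> tables = new HashMap<>();

        /** One create method takes the config object instead of growing more overloads. */
        void create(String name, SketchTableConfig config) {
            if (tables.containsKey(name)) {
                throw new IllegalArgumentException("table exists: " + name);
            }
            // The properties are copied in before the table becomes visible,
            // so there is no window in which the table exists without them.
            tables.put(name, new HashMap<>(config.getProperties()));
        }

        Map<String, String> propertiesOf(String name) {
            return Collections.unmodifiableMap(tables.get(name));
        }
    }

    /** Creates a table with one initial property and returns what was stored. */
    static Map<String, String> demo() {
        Map<String, String> initial = new HashMap<>();
        initial.put("table.split.threshold", "512M"); // illustrative value only
        InMemoryTables tables = new InMemoryTables();
        tables.create("example", new SketchTableConfig().setProperties(initial));
        return tables.propertiesOf("example");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The design point being sketched is that further creation-time options can later be added to the config object without adding another create overload.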
Re: Review Request 26188: ACCUMULO-3176 Add ability to create a table with user specified initial properties
On Oct. 6, 2014, 10:19 a.m., kturner wrote: core/src/main/java/org/apache/accumulo/core/client/impl/TableOperationsImpl.java, line 200 https://reviews.apache.org/r/26188/diff/1/?file=713519#file713519line200 Is there a benefit to deprecating here if it's deprecated in the parent class? I am not sure if it's needed; does the deprecated annotation inherit? It's good practice to deprecate implementing sub-class methods for deprecated interface methods, unless there's a good reason to expect the sub-class to be referenced directly and it still needs the method. Annotations are not inherited, and it can lead to API confusion if a method is deprecated in an interface but not in the implementing class. - Christopher --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26188/#review55498 --- On Oct. 3, 2014, 1:30 p.m., Jenna Huston wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26188/ --- (Updated Oct. 3, 2014, 1:30 p.m.) Review request for accumulo. Bugs: ACCUMULO-3176 https://issues.apache.org/jira/browse/ACCUMULO-3176 Repository: accumulo Description --- Gives the ability to add properties to tables before they are initialized. Therefore these properties will take effect before the default tablet is created. We create a NewTableConfiguration class and send that in the create method as opposed to adding another method. 
Diffs - core/src/main/java/org/apache/accumulo/core/client/admin/TableOperations.java 97f538d core/src/main/java/org/apache/accumulo/core/client/impl/NewTableConfiguration.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/client/impl/TableOperationsImpl.java e46b9c9 core/src/main/java/org/apache/accumulo/core/client/mock/MockAccumulo.java 32dbb28 core/src/main/java/org/apache/accumulo/core/client/mock/MockTable.java 35cbdd2 core/src/main/java/org/apache/accumulo/core/client/mock/MockTableOperations.java 08750fe core/src/test/java/org/apache/accumulo/core/client/impl/TableOperationsHelperTest.java 02838ed proxy/src/main/java/org/apache/accumulo/proxy/ProxyServer.java a778add shell/src/main/java/org/apache/accumulo/shell/commands/CreateTableCommand.java 81b39d2 test/src/test/java/org/apache/accumulo/test/CreateTableWithNewTableConfigIT.java PRE-CREATION Diff: https://reviews.apache.org/r/26188/diff/ Testing --- New IT, ran unit test and integration tests Thanks, Jenna Huston
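The question in the review comment above, whether @Deprecated carries over from an interface method to its implementation, can be checked with a small reflection sketch. The types Ops, PlainImpl, and AnnotatedImpl are hypothetical stand-ins written for this example; they are not Accumulo classes.

```java
import java.lang.reflect.Method;

public class DeprecationInheritanceDemo {

    /** Hypothetical interface with a deprecated method. */
    interface Ops {
        @Deprecated
        void create(String name);
    }

    /** Implementation that does NOT repeat the annotation. */
    static class PlainImpl implements Ops {
        @Override
        public void create(String name) {
            // no-op for the demo
        }
    }

    /** Implementation that repeats the annotation on the overriding method. */
    static class AnnotatedImpl implements Ops {
        @Deprecated
        @Override
        public void create(String name) {
            // no-op for the demo
        }
    }

    /** Reports whether the create(String) method, as seen through the given type, carries @Deprecated. */
    static boolean isCreateDeprecated(Class<?> type) {
        try {
            Method m = type.getMethod("create", String.class);
            return m.isAnnotationPresent(Deprecated.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("interface:      " + isCreateDeprecated(Ops.class));
        System.out.println("plain impl:     " + isCreateDeprecated(PlainImpl.class));
        System.out.println("annotated impl: " + isCreateDeprecated(AnnotatedImpl.class));
    }
}
```

A caller that holds a reference typed as PlainImpl rather than Ops sees a method with no deprecation marker, which is the API confusion the reply describes.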
Re: C++ accumulo client -- native clients for Python, Go, Ruby etc
It'd be really cool to see a C++ client -- fully implemented or not. The increased performance via other languages like you said would be really nice, but I'd also be curious to see how the server characteristics change when the client might be sending data at a much faster rate. My C++ is super rusty these days, but I'd be happy to help out any devs who can spearhead the effort :) John R. Frank wrote: Accumulo Developers, We're trying to boost throughput of non-Java tools with Accumulo. It seems that the lowest hanging fruit is to stop using the thrift proxy. Per discussion about Python and thrift proxy in the users list [1], I'm wondering if anyone is interested in helping with a native C++ client? There is a start on one here [2]. We could offer a bounty or maybe make a consulting project depending who is interested in it. We also looked at trying to run a separate thrift proxy for every worker thread or process. With many cores on a box, eg 32, it just doesn't seem practical to run that many proxies, even if they all run on a single JVM. We'd be glad to hear ideas on that front too. A potentially big benefit of making a proper C++ accumulo client is that it is straightforward to expose native interfaces in Python (via pyObject), Go [3], Ruby [4], and other languages. Thanks for any advice, pointers, interest. John 1-- http://www.mail-archive.com/user@accumulo.apache.org/msg03999.html 2-- https://github.com/phrocker/apeirogon 3-- http://golang.org/cmd/cgo/ 4-- https://www.amberbit.com/blog/2014/6/12/calling-c-cpp-from-ruby/ Sent from +1-617-899-2066
Re: Review Request 26188: ACCUMULO-3176 Add ability to create a table with user specified initial properties
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26188/ --- (Updated Oct. 6, 2014, 6:40 p.m.) Review request for accumulo. Bugs: ACCUMULO-3176 https://issues.apache.org/jira/browse/ACCUMULO-3176 Repository: accumulo Description --- Gives the ability to add properties to tables before they are initialized. Therefore these properties will take effect before the default tablet is created. We create a NewTableConfiguration class and send that in the create method as opposed to adding another method. Diffs (updated) - core/src/main/java/org/apache/accumulo/core/client/NewTableConfiguration.java PRE-CREATION core/src/main/java/org/apache/accumulo/core/client/admin/TableOperations.java 97f538d core/src/main/java/org/apache/accumulo/core/client/impl/TableOperationsImpl.java e46b9c9 core/src/main/java/org/apache/accumulo/core/client/mock/MockAccumulo.java 32dbb28 core/src/main/java/org/apache/accumulo/core/client/mock/MockTable.java 35cbdd2 core/src/main/java/org/apache/accumulo/core/client/mock/MockTableOperations.java 08750fe core/src/test/java/org/apache/accumulo/core/client/impl/TableOperationsHelperTest.java 02838ed proxy/src/main/java/org/apache/accumulo/proxy/ProxyServer.java a778add shell/src/main/java/org/apache/accumulo/shell/commands/CreateTableCommand.java 81b39d2 test/src/test/java/org/apache/accumulo/test/CreateTableWithNewTableConfigIT.java PRE-CREATION test/src/test/java/org/apache/accumulo/test/ShellServerIT.java 5a068af Diff: https://reviews.apache.org/r/26188/diff/ Testing --- New IT, ran unit test and integration tests Thanks, Jenna Huston
Re: C++ accumulo client -- native clients for Python, Go, Ruby etc
I'm all for this- though I'm curious to know the thoughts about maintenance and the design. Are we going to use thrift to tie the C++ client calls into the server-side components? Is that going to be maintained through a separate effort or is the plan to have the Accumulo community officially support it? On Mon, Oct 6, 2014 at 2:34 PM, Josh Elser josh.el...@gmail.com wrote: It'd be really cool to see a C++ client -- fully implemented or not. The increased performance via other languages like you said would be really nice, but I'd also be curious to see how the server characteristics change when the client might be sending data at a much faster rate. My C++ is super rusty these days, but I'd be happy to help out any devs who can spearhead the effort :) John R. Frank wrote: Accumulo Developers, We're trying to boost throughput of non-Java tools with Accumulo. It seems that the lowest hanging fruit is to stop using the thrift proxy. Per discussion about Python and thrift proxy in the users list [1], I'm wondering if anyone is interested in helping with a native C++ client? There is a start on one here [2]. We could offer a bounty or maybe make a consulting project depending who is interested in it. We also looked at trying to run a separate thrift proxy for every worker thread or process. With many cores on a box, eg 32, it just doesn't seem practical to run that many proxies, even if they all run on a single JVM. We'd be glad to hear ideas on that front too. A potentially big benefit of making a proper C++ accumulo client is that it is straightforward to expose native interfaces in Python (via pyObject), Go [3], Ruby [4], and other languages. Thanks for any advice, pointers, interest. John 1-- http://www.mail-archive.com/user@accumulo.apache.org/msg03999.html 2-- https://github.com/phrocker/apeirogon 3-- http://golang.org/cmd/cgo/ 4-- https://www.amberbit.com/blog/2014/6/12/calling-c-cpp-from-ruby/ Sent from +1-617-899-2066
Re: Deprecation removal for 1.7.0
No objection to removing aggregators. If anything first deprecated in 1.5 has managed to live this long in 1.7 I'd like to keep it so folks have an easier time getting off of 1.5 when we EOL it. But I realize some things have probably already been removed. On Mon, Oct 6, 2014 at 3:00 PM, Christopher ctubb...@apache.org wrote: Re: ACCUMULO-3197 First: Any objections to finally removing Aggregators in 1.7.0? They've been deprecated in favor of Combiners since 1.4. Second: Is there any API deprecated in 1.6.x or earlier that you really want preserved in 1.7.0? (I know we need to keep INSTANCE_DFS_{URI,DIR} properties for volume upgrades, at least.) -- Christopher L Tubbs II http://gravatar.com/ctubbsii -- Sean
Re: Deprecation removal for 1.7.0
The main thing I'm looking at which is causing problems for me is the instance.getConfiguration() stuff. It was never well defined, usually didn't work or do what was expected of it, and is still being leveraged (incorrectly) by new code (replication, for instance, and I've already informed Josh), because of ServerConfigurationUtil.getConfiguration(Instance instance). It wasn't formally deprecated until 1.6.0, though. Aside from that, everything else is just a nice cleanup. A somewhat exhaustive list of what I was looking at was: Scanner timeout options extra batchwriter/batchdeleter factory methods some junk in MutationsRejectedException extra ZooKeeperInstance constructors securityOperations stuff from 1.5 extra getSplits and flush in tableOperations Constants.NO_AUTHS KeyExtents.getKeyExtentsForRange an extra Value constructor which copies from a ByteBuffer iterators that moved packages in 1.4 some protected getters in the mapred stuff unused RangeInputSplit in InputFormatBase LogFileKey/LogFileValue (old version) You can review the expected changes at https://github.com/ctubbsii/accumulo/tree/ACCUMULO-3197 (in two commits, one for instance stuff, the other for aggregators and everything else). -- Christopher L Tubbs II http://gravatar.com/ctubbsii On Mon, Oct 6, 2014 at 4:11 PM, Sean Busbey bus...@cloudera.com wrote: No objection to removing aggregators. If anything first deprecated in 1.5 has managed to live this long in 1.7 I'd like to keep it so folks have an easier time getting off of 1.5 when we EOL it. But I realize some things have probably already been removed. On Mon, Oct 6, 2014 at 3:00 PM, Christopher ctubb...@apache.org wrote: Re: ACCUMULO-3197 First: Any objections to finally removing Aggregators in 1.7.0? They've been deprecated in favor of Combiners since 1.4. Second: Is there any API deprecated in 1.6.x or earlier that you really want preserved in 1.7.0? (I know we need to keep INSTANCE_DFS_{URI,DIR} properties for volume upgrades, at least.) 
-- Christopher L Tubbs II http://gravatar.com/ctubbsii -- Sean
Re: Deprecation removal for 1.7.0
Do we still have mapred(uce) stuff? On Mon, Oct 6, 2014 at 3:54 PM, Christopher ctubb...@apache.org wrote: The main thing I'm looking at which is causing problems for me is the instance.getConfiguration() stuff. It was never well defined, usually didn't work or do what was expected of it, and is still being leveraged (incorrectly) by new code (replication, for instance, and I've already informed Josh), because of ServerConfigurationUtil.getConfiguration(Instance instance). It wasn't formally deprecated until 1.6.0, though. Aside from that, everything else is just a nice cleanup. A somewhat exhaustive list of what I was looking at was: Scanner timeout options extra batchwriter/batchdeleter factory methods some junk in MutationsRejectedException extra ZooKeeperInstance constructors securityOperations stuff from 1.5 extra getSplits and flush in tableOperations Constants.NO_AUTHS KeyExtents.getKeyExtentsForRange an extra Value constructor which copies from a ByteBuffer iterators that moved packages in 1.4 some protected getters in the mapred stuff unused RangeInputSplit in InputFormatBase LogFileKey/LogFileValue (old version) You can review the expected changes at https://github.com/ctubbsii/accumulo/tree/ACCUMULO-3197 (in two commits, one for instance stuff, the other for aggregators and everything else). -- Christopher L Tubbs II http://gravatar.com/ctubbsii On Mon, Oct 6, 2014 at 4:11 PM, Sean Busbey bus...@cloudera.com wrote: No objection to removing aggregators. If anything first deprecated in 1.5 has managed to live this long in 1.7 I'd like to keep it so folks have an easier time getting off of 1.5 when we EOL it. But I realize some things have probably already been removed. On Mon, Oct 6, 2014 at 3:00 PM, Christopher ctubb...@apache.org wrote: Re: ACCUMULO-3197 First: Any objections to finally removing Aggregators in 1.7.0? They've been deprecated in favor of Combiners since 1.4. 
Second: Is there any API deprecated in 1.6.x or earlier that you really want preserved in 1.7.0? (I know we need to keep INSTANCE_DFS_{URI,DIR} properties for volume upgrades, at least.) -- Christopher L Tubbs II http://gravatar.com/ctubbsii -- Sean
Re: C++ accumulo client -- native clients for Python, Go, Ruby etc
Two kinds of gains: 1) single client throughput: the extra RPC hop through the proxy deserializes and then reserializes the messages. With the proxy running locally the extra network hop is less of an issue. This was discussed on the user list (see link earlier in this thread), and 5x slow down was suggested as a possible swag estimate. 2) cluster management complexity: it's clearly best to have the proxy local to the workers, but if you have a worker on every core of a large box (eg 32), then having a single proxy on each worker machine becomes a bottleneck. Running many proxies on a single JVM is the next thing we could try to improve this --- having a native client seems preferable. Comments? jrf On Oct 6, 2014, at 4:15 PM, David Medinets david.medin...@gmail.com wrote: How far away from the theoretical maximum rate is the thrift protocol? What kind of gain is expected from the native C++ approach? On Sat, Oct 4, 2014 at 12:56 PM, John R. Frank j...@diffeo.com wrote: Accumulo Developers, We're trying to boost throughput of non-Java tools with Accumulo. It seems that the lowest hanging fruit is to stop using the thrift proxy. Per discussion about Python and thrift proxy in the users list [1], I'm wondering if anyone is interested in helping with a native C++ client? There is a start on one here [2]. We could offer a bounty or maybe make a consulting project depending who is interested in it. We also looked at trying to run a separate thrift proxy for every worker thread or process. With many cores on a box, eg 32, it just doesn't seem practical to run that many proxies, even if they all run on a single JVM. We'd be glad to hear ideas on that front too. A potentially big benefit of making a proper C++ accumulo client is that it is straightforward to expose native interfaces in Python (via pyObject), Go [3], Ruby [4], and other languages. Thanks for any advice, pointers, interest. 
John 1-- http://www.mail-archive.com/user@accumulo.apache.org/msg03999.html 2-- https://github.com/phrocker/apeirogon 3-- http://golang.org/cmd/cgo/ 4-- https://www.amberbit.com/blog/2014/6/12/calling-c-cpp-from-ruby/ Sent from +1-617-899-2066
Re: Deprecation removal for 1.7.0
Christopher, would it make sense to get a patch of the actual things you're looking at potentially removing, or would that be a waste of time this early? Mike Drob wrote: I think before we can agree on a deprecation strategy, we need to firm up the scope for this release plan. What are the intentions for 1.7.0? Is it a minor release in the sense of our previous minor releases, where we add a bunch of new features and maintain some compatibility promises? Or are we going to try and make it a truer minor release, where we cut down on the number of features and have more conservative stakes in the ground? Personally, I think 1.7.0 is shaping up to be a full-featured release given the amount of time since 1.6.0. I wanted to do a scrape of JIRA and collect the stuff that I know is done/in-progress. Is this the same 1.7.0 that was going to be renamed to 2.0.0? Or an intermediate release? Intermediate -- the revised client API that Christopher is working on would be punted to a 1.8/2.0. When do we need to deprecate the mapred API if we plan to drop Hadoop 1 support in Accumulo 2? (as has been discussed, but I'm not sure it was ever formally decided.) In general, I'm inclined to leave as much in as possible, and then if we must remove things then do so in 2.0.0. I know that our compatibility statement only promises one minor version, but that doesn't mean we have to be strict at every opportunity. Mike On Mon, Oct 6, 2014 at 4:03 PM, Billie Rinaldi billie.rina...@gmail.com wrote: Yes, we have both. Neither is deprecated. On Mon, Oct 6, 2014 at 1:56 PM, Mike Drob mad...@cloudera.com wrote: Do we still have mapred(uce) stuff? On Mon, Oct 6, 2014 at 3:54 PM, Christopher ctubb...@apache.org wrote: The main thing I'm looking at which is causing problems for me is the instance.getConfiguration() stuff. 
It was never well defined, usually didn't work or do what was expected of it, and is still being leveraged (incorrectly) by new code (replication, for instance, and I've already informed Josh), because of ServerConfigurationUtil.getConfiguration(Instance instance). It wasn't formally deprecated until 1.6.0, though. Aside from that, everything else is just a nice cleanup. A somewhat exhaustive list of what I was looking at was: Scanner timeout options extra batchwriter/batchdeleter factory methods some junk in MutationsRejectedException extra ZooKeeperInstance constructors securityOperations stuff from 1.5 extra getSplits and flush in tableOperations Constants.NO_AUTHS KeyExtents.getKeyExtentsForRange an extra Value constructor which copies from a ByteBuffer iterators that moved packages in 1.4 some protected getters in the mapred stuff unused RangeInputSplit in InputFormatBase LogFileKey/LogFileValue (old version) You can review the expected changes at https://github.com/ctubbsii/accumulo/tree/ACCUMULO-3197 (in two commits, one for instance stuff, the other for aggregators and everything else). -- Christopher L Tubbs II http://gravatar.com/ctubbsii On Mon, Oct 6, 2014 at 4:11 PM, Sean Busbey bus...@cloudera.com wrote: No objection to removing aggregators. If anything first deprecated in 1.5 has managed to live this long in 1.7 I'd like to keep it so folks have an easier time getting off of 1.5 when we EOL it. But I realize some things have probably already been removed. On Mon, Oct 6, 2014 at 3:00 PM, Christopher ctubb...@apache.org wrote: Re: ACCUMULO-3197 First: Any objections to finally removing Aggregators in 1.7.0? They've been deprecated in favor of Combiners since 1.4. Second: Is there any API deprecated in 1.6.x or earlier that you really want preserved in 1.7.0? (I know we need to keep INSTANCE_DFS_{URI,DIR} properties for volume upgrades, at least.) -- Christopher L Tubbs II http://gravatar.com/ctubbsii -- Sean
Re: Deprecation removal for 1.7.0
On Mon, Oct 6, 2014 at 4:12 PM, Mike Drob mad...@cloudera.com wrote: In general, I'm inclined to leave as much in as possible, and then if we must remove things then do so in 2.0.0. I know that our compatibility statement only promises one minor version, but that doesn't mean we have to be strict at every opportunity. Mike Related, I'd like to EOL 1.5 shortly after 1.7 gets released. I don't want to derail this thread with that discussion, but my guess is it's a much easier sell if we're conservative about removing things. Just so everyone knows where I'm coming from. -- Sean
Re: Deprecation removal for 1.7.0
See https://github.com/ctubbsii/accumulo/tree/ACCUMULO-3197 for the two commits proposed for removing deprecated stuff. One removes the instance.getConfiguration nightmare that I'd really like to proceed with. The other removes aggregators and other cleanup, which I don't feel strongly about. -- Christopher L Tubbs II http://gravatar.com/ctubbsii On Mon, Oct 6, 2014 at 5:20 PM, Josh Elser josh.el...@gmail.com wrote: Christopher, would it make sense to get a patch of the actual things you're looking at potentially removing, or would that be a waste of time this early? Mike Drob wrote: I think before we can agree on a deprecation strategy, we need to firm up the scope for this release plan. What are the intentions for 1.7.0? Is it a minor release in the sense of our previous minor releases, where we add a bunch of new features and maintain some compatibility promises? Or are we going to try and make it a truer minor release, where we cut down on the number of features and have more conservative stakes in the ground? Personally, I think 1.7.0 is shaping up to be a full-featured release given the amount of time since 1.6.0. I wanted to do a scrape of JIRA and collect the stuff that I know is done/in-progress. Is this the same 1.7.0 that was going to be renamed to 2.0.0? Or an intermediate release? Intermediate -- the revised client API that Christopher is working on would be punted to a 1.8/2.0. When do we need to deprecate the mapred API if we plan to drop Hadoop 1 support in Accumulo 2? (as has been discussed, but I'm not sure it was ever formally decided.) In general, I'm inclined to leave as much in as possible, and then if we must remove things then do so in 2.0.0. I know that our compatibility statement only promises one minor version, but that doesn't mean we have to be strict at every opportunity. Mike On Mon, Oct 6, 2014 at 4:03 PM, Billie Rinaldi billie.rina...@gmail.com wrote: Yes, we have both. Neither is deprecated. 
On Mon, Oct 6, 2014 at 1:56 PM, Mike Drob mad...@cloudera.com wrote: Do we still have mapred(uce) stuff? On Mon, Oct 6, 2014 at 3:54 PM, Christopher ctubb...@apache.org wrote: The main thing I'm looking at which is causing problems for me is the instance.getConfiguration() stuff. It was never well defined, usually didn't work or do what was expected of it, and is still being leveraged (incorrectly) by new code (replication, for instance, and I've already informed Josh), because of ServerConfigurationUtil.getConfiguration(Instance instance). It wasn't formally deprecated until 1.6.0, though. Aside from that, everything else is just a nice cleanup. A somewhat exhaustive list of what I was looking at was: Scanner timeout options extra batchwriter/batchdeleter factory methods some junk in MutationsRejectedException extra ZooKeeperInstance constructors securityOperations stuff from 1.5 extra getSplits and flush in tableOperations Constants.NO_AUTHS KeyExtents.getKeyExtentsForRange an extra Value constructor which copies from a ByteBuffer iterators that moved packages in 1.4 some protected getters in the mapred stuff unused RangeInputSplit in InputFormatBase LogFileKey/LogFileValue (old version) You can review the expected changes at https://github.com/ctubbsii/accumulo/tree/ACCUMULO-3197 (in two commits, one for instance stuff, the other for aggregators and everything else). -- Christopher L Tubbs II http://gravatar.com/ctubbsii On Mon, Oct 6, 2014 at 4:11 PM, Sean Busbey bus...@cloudera.com wrote: No objection to removing aggregators. If anything first deprecated in 1.5 has managed to live this long in 1.7 I'd like to keep it so folks have an easier time getting off of 1.5 when we EOL it. But I realize some things have probably already been removed. On Mon, Oct 6, 2014 at 3:00 PM, Christopher ctubb...@apache.org wrote: Re: ACCUMULO-3197 First: Any objections to finally removing Aggregators in 1.7.0? They've been deprecated in favor of Combiners since 1.4. 
Second: Is there any API deprecated in 1.6.x or earlier that you really want preserved in 1.7.0? (I know we need to keep INSTANCE_DFS_{URI,DIR} properties for volume upgrades, at least.) -- Christopher L Tubbs II http://gravatar.com/ctubbsii -- Sean
Re: Deprecation removal for 1.7.0
On Mon, Oct 6, 2014 at 4:51 PM, Christopher ctubb...@apache.org wrote: On Mon, Oct 6, 2014 at 5:20 PM, Sean Busbey bus...@cloudera.com wrote: On Mon, Oct 6, 2014 at 4:12 PM, Mike Drob mad...@cloudera.com wrote: In general, I'm inclined to leave as much in as possible, and then if we must remove things then do so in 2.0.0. I know that our compatibility statement only promises one minor version, but that doesn't mean we have to be strict at every opportunity. Mike Related, I'd like to EOL 1.5 shortly after 1.7 gets released. I don't want to derail this thread with that discussion, but my guess is it's a much easier sell if we're conservative about removing things. Just so everyone knows where I'm coming from. (+1 for EOL 1.5 after) In general, does this mean that you're okay with removing stuff deprecated prior to 1.5? With the exception of the instance.getConfiguration stuff, which was deprecated in 1.6.0 and I'd like to remove in 1.7.0, due to its problematic nature (requires further discussion), I could restrict the remaining cleanup to only stuff deprecated prior to 1.5. For me, yeah that's the cut point I'd prefer to use. I'm hoping anyone who did the move to 1.5 didn't move from a removed api to a deprecated API. Maybe we should send a ping to user@ asking if any 1.5 users want to pipe up about APIs they're using that were deprecated prior to 1.5? -- Sean
Re: 1.7 release timeline
Thanks, John. I was thinking about trying to gun for January time-frame for a release. I'd love to say before 2014 is over, but that probably just won't happen for a major release with the holidays. For 1.7 right now, I see the following bigger items (correct me where I'm wrong): * Replication (done) * Upgrade rules/guarantees (proposed) * Replace cloudtrace (in-progress) * Rewrite monitor, include REST service (in-progress) * Drop Hadoop 1 support (proposed) * Decouple MiniAccumulo from ITs (in-progress) * Other minicluster types: in-process, shim to real instance (in-progress) * Support Hadoop metrics2 (proposed) * A few WAL/metadata related performance improvements (in-progress) Also, would be good to check the In-Progress state issues on JIRA. What do people think? John Vines wrote: Moving this to its own thread... On Mon, Oct 6, 2014 at 5:54 PM, Mike Drob mad...@cloudera.com wrote: Related: Do we have a release timeline for 1.7?
Re: 1.7 release timeline
Yes, of course. We definitely need to see some code here before it gets officially slated for 1.7. I just know that efforts are being put towards it, so I wanted to list it. Christopher wrote: Would replacing cloudtrace be part of 1.7? I'm not sure about that. I'd like to see where that's headed before we decide on that. Personally, I'd prefer Zipkin, since htrace is basically a copy of cloudtrace/accumulo-trace, and it has some of the same issues (millis time, for instance, instead of relative nanos, which is independent of the system clock and actually intended for time spans).
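The remark above about millisecond wall-clock time versus relative nanoseconds can be sketched in plain Java. This is only an illustration of the general timing technique using the JDK's System.currentTimeMillis() and System.nanoTime(); the Span class and timeNanos helper are hypothetical names for this example and are not taken from cloudtrace, htrace, or Zipkin.

```java
import java.util.concurrent.TimeUnit;

public class SpanTimingDemo {

    /** Hypothetical span: wall-clock start for display, nanoTime start for measuring duration. */
    static final class Span {
        // Wall-clock time can jump if the system clock is adjusted; useful only as a label.
        final long startWallMillis = System.currentTimeMillis();
        // nanoTime is relative to an arbitrary origin and is meant for elapsed-time measurement.
        private final long startNanos = System.nanoTime();

        /** Elapsed time since the span was created, independent of system clock changes. */
        long elapsedNanos() {
            return System.nanoTime() - startNanos;
        }
    }

    /** Runs the given work inside a span and returns the measured duration in nanoseconds. */
    static long timeNanos(Runnable work) {
        Span span = new Span();
        work.run();
        return span.elapsedNanos();
    }

    public static void main(String[] args) {
        long nanos = timeNanos(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                throw new IllegalStateException("unreachable");
            }
        });
        System.out.println("elapsed micros: " + TimeUnit.NANOSECONDS.toMicros(nanos));
    }
}
```

A duration computed by subtracting two currentTimeMillis() readings can come out wrong, even negative, if the clock is adjusted between them, which is one reason to keep the two kinds of timestamp separate.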
Re: 1.7 release timeline
Zipkin is a possible replacement for our trace collection system. It does not provide instrumentation like cloudtrace or htrace, so even if we make zipkin the default collection system we will still need instrumentation. Anyway, we can discuss the details and approach elsewhere. I'd certainly want the trace work to be in 2.0, but if we decide not to put it in 1.7 that would be okay. On Mon, Oct 6, 2014 at 5:59 PM, Christopher ctubb...@apache.org wrote: Would replacing cloudtrace be part of 1.7? I'm not sure about that. I'd like to see where that's headed before we decide on that. Personally, I'd prefer Zipkin, since htrace is basically a copy of cloudtrace/accumulo-trace, and it has some of the same issues (millis time, for instance, instead of relative nanos, which is independent of the system clock and actually intended for time spans). I think the upgrade guarantees are more a 2.0.0 thing, but I think we can be a bit more conservative in 1.x to move towards that. I wouldn't mind dropping Hadoop 1 support in 1.7.0. (I guess we should just vote on that). I'd really like to include the VolumeChooser improvements (in particular ACCUMULO-3177, which depends on ACCUMULO-3176). -- Christopher L Tubbs II http://gravatar.com/ctubbsii On Mon, Oct 6, 2014 at 8:38 PM, Josh Elser josh.el...@gmail.com wrote: Thanks, John. I was thinking about trying to gun for January time-frame for a release. I'd love to say before 2014 is over, but that probably just won't happen for a major release with the holidays. 
For 1.7 right now, I see the following bigger items (correct me where I'm wrong): * Replication (done) * Upgrade rules/guarantees (proposed) * Replace cloudtrace (in-progress) * Rewrite monitor, include REST service (in-progress) * Drop Hadoop 1 support (proposed) * Decouple MiniAccumulo from ITs (in-progress) * Other minicluster types: in-process, shim to real instance (in-progress) * Support Hadoop metrics2 (proposed) * A few WAL/metadata related performance improvements (in-progress) Also, would be good to check the In-Progress state issues on JIRA. What do people think? John Vines wrote: Moving this to its own thread... On Mon, Oct 6, 2014 at 5:54 PM, Mike Drob mad...@cloudera.com wrote: Related: Do we have a release timeline for 1.7?
[GitHub] accumulo pull request: ACCUMULO-2826 Allow single CF for Intersect...
Github user asfgit closed the pull request at: https://github.com/apache/accumulo/pull/8
Re: Deprecation removal for 1.7.0
So, I think we can make a general argument to set policy, and when removing a specific method we should make a specific argument. Personally, I would set the bar at identifying the specific harm caused by the retention of the method, as well as polling the community and considering objections. Christopher, you made an argument about people misunderstanding the semantics of the method and using it incorrectly. Is that not solved by just deprecating the method? It would be nice to have a more structured way of polling the community for continuing use of deprecated code. Can anyone propose a way of doing this? Maybe a call-back system where people can register the deprecated methods that they care about? Maybe some scripts that people can use to determine which deprecated methods they depend on and submit those to us? Adam On Mon, Oct 6, 2014 at 4:42 PM, Jeremy Kepner kep...@ll.mit.edu wrote: -1 Need a good reason why the current deprecated code is causing harm to Accumulo. In general, keeping around deprecated code restricts how much we can optimize behind the scenes (both for performance or maintainability). It also keeps our test burden higher. I'll let Christopher speak to the specifics of what he wants to remove, but it sounds like at least one of them is something that commonly results in incorrect usage, even internally. -- Sean
Re: [PROPOSAL] 1.7/2.0 branches and git workflow change
True. Everything I'm thinking of would work with no master, but that might be confusing, and might break some tooling without extra effort (which branch is default when cloning?). We also kind of assume that the master branch is forward-moving only, but other branches are disposable and can be rebase'd, deleted, re-created, etc. Alternatively, if people understood that a 2.0 branch is a future branch when 1.7 (master) is the current, that'd work, too... I just worry that people will merge it poorly. I suppose the best option, then, is probably to keep the status quo, and use a branch name like ACCUMULO- which represents the overall work for a particular future release plan, instead of a name which looks like a maintenance branch. -- Christopher L Tubbs II http://gravatar.com/ctubbsii On Mon, Oct 6, 2014 at 10:59 PM, William Slacum wilhelm.von.cl...@accumulo.net wrote: It seems to me you can get everything you want by merely getting rid of master or making master just be the 1.7 branch. I'm not really concerned about the name, because it's easy enough to figure out. Master duplicating a tag doesn't really seem useful to me, save for here's the highest version we have released, which is of limited utility when a user can just check the tags. I don't see the point in having master be something for the sake of having master. On Mon, Oct 6, 2014 at 9:19 PM, Josh Elser josh.el...@gmail.com wrote: Christopher wrote: What purpose does the master branch serve if it's just the same as the last major release tag? I think Josh had some specific opinions on this, but the general idea from what I understood was that master is supposed to be stable... representative of the latest, most modern release, because it's what a new contributor would expect to fork to create a patch. That's hard to do if the goalpost is moving a lot, and it makes feature merges more complicated, since contributors have to rebase or merge themselves in order to create a patch that merges cleanly. 
Having a stable master makes it very easy to contribute to the most recent release. No, I don't really care for a stable-only master (I think I diverge from the git-flow model in that regard). I like master to just be a commits-go-here area more than anything.