
[jira] [Commented] (HBASE-11721) jdiff script no longer works as usage instructions indicate

2014-08-11 Thread Aleksandr Shulman (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-11721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093739#comment-14093739
]

Aleksandr Shulman commented on HBASE-11721:
---

Thanks [~misty] for trying out the tool. I am open to suggestions on how to
make the usage instructions clearer.

Regarding the error you are seeing:
It looks like the artifact that this script depends on may have been moved:
curl is returning an HTML redirect (301) instead of the zip file.
Using wget, it looks like there may also be some OpenSSL or certificate issues.
Doing a little homework on the matter, it looks like some people have hit this
issue with . Will look into this further.

{code}
wget http://cloud.github.com/downloads/tomwhite/jdiff/jdiff-1.1.1-with-incompatible-option.zip
--2014-08-11 21:42:11--  http://cloud.github.com/downloads/tomwhite/jdiff/jdiff-1.1.1-with-incompatible-option.zip
Resolving cloud.github.com (cloud.github.com)... 54.230.141.84, 54.230.143.7, 54.230.140.148, ...
Connecting to cloud.github.com (cloud.github.com)|54.230.141.84|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://cloud.github.com/downloads/tomwhite/jdiff/jdiff-1.1.1-with-incompatible-option.zip [following]
--2014-08-11 21:42:11--  https://cloud.github.com/downloads/tomwhite/jdiff/jdiff-1.1.1-with-incompatible-option.zip
Connecting to cloud.github.com (cloud.github.com)|54.230.141.84|:443... connected.
OpenSSL: error:14077410:SSL routines:SSL23_GET_SERVER_HELLO:sslv3 alert handshake failure
Unable to establish SSL connection.
{code}
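
The failure described above (a small redirect page saved under the zip's file name, which unzip then rejects) could be caught before extraction. The following is a minimal sketch in Python, not part of the jdiff script itself; the function names `looks_like_zip` and `check_download` are hypothetical and used only for illustration.

```python
import zipfile


def looks_like_zip(path):
    """Return True if the file at `path` is a readable zip archive.

    A redirect page saved under a .zip name fails this check, so the
    caller can stop before invoking unzip.
    """
    return zipfile.is_zipfile(path)


def check_download(path):
    """Raise a descriptive error instead of letting unzip fail later."""
    if not looks_like_zip(path):
        raise ValueError(
            f"{path} is not a zip archive; the download URL may have moved "
            "or returned an HTML redirect page"
        )
    return path
```

A wrapper script could call `check_download` right after the download step and print the error, which points at the moved artifact rather than at unzip.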

jdiff script no longer works as usage instructions indicate
---

Key: HBASE-11721
URL: https://issues.apache.org/jira/browse/HBASE-11721
Project: HBase
Issue Type: Bug
Components: scripts
Reporter: Misty Stanley-Jones

I pasted the command from the usage instructions embedded in the script, but
it fails as follows:
[misty@cheezel dev-support](master)$ bash ./jdiffHBasePublicAPI.sh https://github.com/apache/hbase.git 0.94 https://github.com/MY_REPO/hbase.git 0.94
JDiff evaluation beginning:
Determining if this is a local directory or a git repo.
Looks like https://github.com/apache/hbase.git is a git repo
Determining if this is a local directory or a git repo.
Looks like https://github.com/MY_REPO/hbase.git is a git repo
We are going to compare source 1 which is a git_repo and source 2, which is a git_repo
0.94
0.94
JDIFF_WORKING_DIRECTORY not set. That's not an issue. We will default it to /tmp/jdiff.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   183  100   183    0     0    447      0 --:--:-- --:--:-- --:--:--   448
Archive: jdiff-1.1.1-with-incompatible-option.zip
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of
jdiff-1.1.1-with-incompatible-option.zip or
jdiff-1.1.1-with-incompatible-option.zip.zip, and cannot find
jdiff-1.1.1-with-incompatible-option.zip.ZIP, period.

--
This message was sent by Atlassian JIRA
(v6.2#6252)

[jira] [Updated] (HBASE-11400) Edit, consolidate, and update Compression and data encoding docs

2014-06-23 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-11400:
--

  Priority: Minor  (was: Major)
Issue Type: Improvement  (was: Bug)
   Summary: Edit, consolidate, and update Compression and data encoding 
docs  (was: Edit, colsolidate, and update Compression and data encoding docs)

 Edit, consolidate, and update Compression and data encoding docs
 

 Key: HBASE-11400
 URL: https://issues.apache.org/jira/browse/HBASE-11400
 Project: HBase
  Issue Type: Improvement
  Components: documentation
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones
Priority: Minor
 Attachments: HBASE-11400.patch


 Current docs are here: http://hbase.apache.org/book.html#compression.test
 It could use some editing and expansion.




[jira] [Commented] (HBASE-11400) Edit, consolidate, and update Compression and data encoding docs

2014-06-23 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041671#comment-14041671
 ] 

Aleksandr Shulman commented on HBASE-11400:
---

Thanks for taking this up, [~misty]. I'll have a look.

 Edit, consolidate, and update Compression and data encoding docs
 

 Key: HBASE-11400
 URL: https://issues.apache.org/jira/browse/HBASE-11400
 Project: HBase
  Issue Type: Improvement
  Components: documentation
Reporter: Misty Stanley-Jones
Assignee: Misty Stanley-Jones
Priority: Minor
 Attachments: HBASE-11400.patch



[jira] [Commented] (HBASE-10924) [region_mover]: Adjust region_mover script to retry unloading a server a configurable number of times in case of region splits/merges

2014-06-22 Thread Aleksandr Shulman (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040236#comment-14040236
]

Aleksandr Shulman commented on HBASE-10924:
---

Hmm - I think it's still the intention of the patch to have region_mover do a
best-effort move of all the regions, as the script had done before. The main
addition is that it will retry that process a configurable number of times, in
case of the strange transient conditions we've seen, such as the master being
down when the move request is sent.

Overall, I've seen the region_mover work pretty well and I see this patch as
just being a minor stability improvement. If you believe there is a better way
to do this region movement, such as failing fast on a split region, I'd be
happy to test such a patch in our frameworks.

If we're happy with the logic of this patch, then I can post a version for
0.96, 0.98, trunk, etc.

[region_mover]: Adjust region_mover script to retry unloading a server a
configurable number of times in case of region splits/merges
-

Key: HBASE-10924
URL: https://issues.apache.org/jira/browse/HBASE-10924
Project: HBase
Issue Type: Bug
Components: Region Assignment
Affects Versions: 0.94.15
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Labels: region_mover, rolling_upgrade
Fix For: 0.94.22

Attachments: HBASE-10924-0.94-v2.patch, HBASE-10924-0.94-v3.patch

Observed behavior:
In about 5% of cases, my rolling upgrade tests fail because of stuck regions
during a region server unload. My theory is that this occurs when region
assignment information changes between the time the region list is generated,
and the time when the region is to be moved.
An example of such a region information change is a split or merge.
Example:
Regionserver A has 100 regions (#0-#99). The balancer is turned off and the
regionmover script is called to unload this regionserver. The regionmover
script will generate the list of 100 regions to be moved and then proceed
down that list, moving the regions off in series. However, there is a region,
#84, that has split into two daughter regions while regions 0-83 were moved.
The script will be stuck trying to move #84, timeout, and then the failure
will bubble up (attempt 1 failed).
Proposed solution:
This specific failure mode should be caught and the region_mover script
should now attempt to move off all the regions. Now, it will have 16+1 (due
to split) regions to move. There is a good chance that it will be able to
move all 17 off without issues. However, should it encounter this same issue
(attempt 2 failed), it will retry again. This process will continue until the
maximum number of unload retry attempts has been reached.
This is not foolproof, but let's say for the sake of argument that 5% of
unload attempts hit this issue, then with a retry count of 3, it will reduce
the unload failure probability from 0.05 to 0.000125 (0.05^3).
Next steps:
I am looking for feedback on this approach. If it seems like a sensible
approach, I will create a strawman patch and test it.


[jira] [Updated] (HBASE-11122) Annotate coprocessor APIs

2014-05-16 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-11122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-11122:
--

Labels: compatibility coprocessors  (was: )

 Annotate coprocessor APIs
 -

 Key: HBASE-11122
 URL: https://issues.apache.org/jira/browse/HBASE-11122
 Project: HBase
  Issue Type: Task
Affects Versions: 0.99.0, 0.98.3
Reporter: Andrew Purtell
  Labels: compatibility, coprocessors

 Add annotations to coprocessor APIs for:
 - Interface stability
 - Whether or not it is bypassable
 - Whether or not it is executed under row lock




[jira] [Commented] (HBASE-4920) We need a mascot, a totem

2014-05-15 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-4920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998017#comment-13998017
 ] 

Aleksandr Shulman commented on HBASE-4920:
--

+1 for the Orca.

 We need a mascot, a totem
 -

 Key: HBASE-4920
 URL: https://issues.apache.org/jira/browse/HBASE-4920
 Project: HBase
  Issue Type: Task
Reporter: stack
 Attachments: Apache_HBase_Orca_Logo_1.jpg, 
 Apache_HBase_Orca_Logo_Mean_version-3.pdf, 
 Apache_HBase_Orca_Logo_Mean_version-4.pdf, HBase Orca Logo.jpg, 
 Orca_479990801.jpg, Screen shot 2011-11-30 at 4.06.17 PM.png, apache hbase 
 orca logo_Proof 3.pdf, apache logo_Proof 8.pdf, krake.zip, more_orcas.png, 
 more_orcas2.png, photo (2).JPG, plus_orca.png


 We need a totem for our t-shirt that is yet to be printed.  O'Reilly owns the 
 Clydesdale.  We need something else.
 We could have a fluffy little duck that quacks 'hbase!' when you squeeze it 
 and we could order boxes of them from some off-shore sweatshop that 
 subcontracts to a contractor who employs child labor only.
 Or we could have an Orca (Big!, Fast!, Killer!, and in a poem that Marcy from 
 Salesforce showed me, that was a bit too spiritual for me to be seen quoting 
 here, it had the Orca as the 'Guardian of the Cosmic Memory': i.e. in 
 translation, bigdata).




[jira] [Commented] (HBASE-10924) [region_mover]: Adjust region_mover script to retry unloading a server a configurable number of times in case of region splits/merges

2014-05-13 Thread Aleksandr Shulman (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996537#comment-13996537
]

Aleksandr Shulman commented on HBASE-10924:
---

This issue affects all branches. I will upload patches for the other branches
as well.

[region_mover]: Adjust region_mover script to retry unloading a server a
configurable number of times in case of region splits/merges
-

Attachments: HBASE-10924-0.94-v2.patch, HBASE-10924-0.94-v3.patch


[jira] [Commented] (HBASE-10251) Restore API Compat for PerformanceEvaluation.generateValue()

2014-04-29 Thread Aleksandr Shulman (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984841#comment-13984841
]

Aleksandr Shulman commented on HBASE-10251:
---

[~ndimiduk], I like the idea of having utility classes that have a
compatibility story, from which the tests (which do not have compatibility
considerations) can pull.

Restore API Compat for PerformanceEvaluation.generateValue()

Key: HBASE-10251
URL: https://issues.apache.org/jira/browse/HBASE-10251
Project: HBase
Issue Type: Bug
Components: Client
Affects Versions: 0.99.0
Reporter: Aleksandr Shulman
Labels: api_compatibility

Observed:
A couple of my client tests fail to compile against trunk because the method
PerformanceEvaluation.generateValue was removed as part of HBASE-8496.
This is an issue because it was used in a number of places, including unit
tests. Since we did not explicitly label this API as private, it's ambiguous
as to whether this could/should have been used by people writing apps against
0.96. If they used it, then they would be broken upon upgrade to 0.98 and
trunk.
Potential Solution:
The method was renamed to generateData, but the logic is still the same. We
can reintroduce it as deprecated in 0.98, as a compat shim over generateData.
The patch should be a few lines. We may also consider doing so in trunk, but
I'd be just as fine with leaving it out.
More generally, this raises the question about what other code is in this
grey area, where it is public, is used outside of the package, but is not
explicitly labeled with an InterfaceAudience annotation.
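
The compat shim idea above (keep the old name, mark it deprecated, delegate to the renamed method) can be sketched as follows. HBase itself is Java, where the equivalent would be a `@Deprecated` method that calls the new one; this Python version only illustrates the pattern. The function bodies and signatures here are placeholders and do not reproduce the real PerformanceEvaluation methods.

```python
import warnings


def generate_data(length):
    """Stand-in for the renamed method; returns `length` placeholder bytes."""
    return bytes(length)


def generate_value(length):
    """Deprecated alias kept so that callers of the old name keep working."""
    warnings.warn(
        "generate_value is deprecated; use generate_data",
        DeprecationWarning,
        stacklevel=2,
    )
    # Delegate so the two names can never drift apart in behavior.
    return generate_data(length)
```

Because the old name only forwards to the new one, client code written against the old API keeps compiling and running while the warning nudges callers to migrate.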


[jira] [Updated] (HBASE-10924) [region_mover]: Adjust region_mover script to retry unloading a server a configurable number of times in case of region splits/merges

2014-04-19 Thread Aleksandr Shulman (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Aleksandr Shulman updated HBASE-10924:
--

Status: Patch Available (was: Open)

[region_mover]: Adjust region_mover script to retry unloading a server a
configurable number of times in case of region splits/merges
-

Attachments: HBASE-10924-0.94-v2.patch, HBASE-10924-0.94-v3.patch


[jira] [Updated] (HBASE-10924) [region_mover]: Adjust region_mover script to retry unloading a server a configurable number of times in case of region splits/merges

2014-04-18 Thread Aleksandr Shulman (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Aleksandr Shulman updated HBASE-10924:
--

Attachment: HBASE-10924-0.94-v3.patch

Adding v3 which includes a sleep and some comments as to the rationale of the
fix.

[region_mover]: Adjust region_mover script to retry unloading a server a
configurable number of times in case of region splits/merges
-

Attachments: HBASE-10924-0.94-v2.patch, HBASE-10924-0.94-v3.patch


[jira] [Updated] (HBASE-10924) [region_mover]: Adjust region_mover script to retry unloading a server a configurable number of times in case of region splits/merges

2014-04-17 Thread Aleksandr Shulman (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Aleksandr Shulman updated HBASE-10924:
--

Attachment: HBASE-10924-0.94-v1.patch

Attaching v1 of the patch. For 0.94 only.

[region_mover]: Adjust region_mover script to retry unloading a server a
configurable number of times in case of region splits/merges
-

Attachments: HBASE-10924-0.94-v1.patch


[jira] [Updated] (HBASE-10924) [region_mover]: Adjust region_mover script to retry unloading a server a configurable number of times in case of region splits/merges

2014-04-17 Thread Aleksandr Shulman (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Aleksandr Shulman updated HBASE-10924:
--

Attachment: (was: HBASE-10924-0.94-v1.patch)

[region_mover]: Adjust region_mover script to retry unloading a server a
configurable number of times in case of region splits/merges
-


[jira] [Updated] (HBASE-10924) [region_mover]: Adjust region_mover script to retry unloading a server a configurable number of times in case of region splits/merges

2014-04-17 Thread Aleksandr Shulman (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Aleksandr Shulman updated HBASE-10924:
--

Attachment: HBASE-10924-0.94-v2.patch

Attaching a better version of the patch here.
It's relatively straightforward, but if there is interest in a formal review, I
can put it up on RB.

Testing: I ran this patch through an in-house rolling upgrade test framework.
It performs MR jobs, splits, compactions, and DML while regions are moving.

I also did some explicit testing by installing this on a cluster and moving
regions back and forth while doing splits.

The results were fine for all the testing.

[region_mover]: Adjust region_mover script to retry unloading a server a
configurable number of times in case of region splits/merges
-

Attachments: HBASE-10924-0.94-v2.patch


[jira] [Commented] (HBASE-11008) Align bulk load, flush, and compact to require Action.CREATE

2014-04-16 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-11008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13972123#comment-13972123
 ] 

Aleksandr Shulman commented on HBASE-11008:
---

We should be careful to consider what workflows we might disrupt with this 
change. Specifically, we should consider both the period while the user is 
upgrading (rolling upgrade) and the period after the upgrade is complete.

Bulk loading is something that users can expect to do while a rolling upgrade 
is going on. If some regionservers begin enforcing a more restrictive 
requirement, then it will cause issues. If we choose to make it more 
restrictive, we should document any changes users need to make to the ACL 
table in order to allow the upgrade to go smoothly.

If we choose to make it less restrictive (e.g. allow users with CREATE to run 
operations that previously required ADMIN), then we have to acknowledge that 
the ACL semantics have changed and document that appropriately.

 Align bulk load, flush, and compact to require Action.CREATE
 

 Key: HBASE-11008
 URL: https://issues.apache.org/jira/browse/HBASE-11008
 Project: HBase
  Issue Type: Improvement
  Components: security
Reporter: Jean-Daniel Cryans
Assignee: Jean-Daniel Cryans
 Fix For: 0.99.0, 0.98.2, 0.96.3, 0.94.20

 Attachments: HBASE-11008.patch


 Over in HBASE-10958 we noticed that it might make sense to require 
 Action.CREATE for bulk load, flush, and compact since it is also required for 
 things like enable and disable.
 This means the following changes:
  - preBulkLoadHFile goes from WRITE to CREATE
  - compact/flush go from ADMIN to ADMIN or CREATE
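
To make the effect of these two changes concrete, here is a small sketch that models the required actions before and after and reports which operations a user would lose. It is an illustration only: the operation labels (`bulk_load`, `compact`, `flush`) and helper names are hypothetical, and it does not use the HBase AccessController API.

```python
# Required actions per operation, as described in the issue text above.
# An operation is allowed if the user holds ANY of the listed actions.
BEFORE = {
    "bulk_load": {"WRITE"},
    "compact": {"ADMIN"},
    "flush": {"ADMIN"},
}
AFTER = {
    "bulk_load": {"CREATE"},
    "compact": {"ADMIN", "CREATE"},
    "flush": {"ADMIN", "CREATE"},
}


def is_allowed(rules, operation, granted):
    """True if any granted action satisfies the rule for `operation`."""
    return bool(rules[operation] & set(granted))


def newly_denied(granted):
    """Operations the user could perform before the change but not after."""
    return sorted(
        op for op in BEFORE
        if is_allowed(BEFORE, op, granted) and not is_allowed(AFTER, op, granted)
    )
```

Under this model, a user holding only WRITE loses bulk load after the change, which is the kind of case an upgrade note on ACL adjustments would need to cover.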




[jira] [Created] (HBASE-10924) [region_mover]: Adjust region_mover script to retry unloading a server a configurable number of times in case of region splits/merges

2014-04-07 Thread Aleksandr Shulman (JIRA)

Aleksandr Shulman created HBASE-10924:
-

 Summary: [region_mover]: Adjust region_mover script to retry 
unloading a server a configurable number of times in case of region 
splits/merges
 Key: HBASE-10924
 URL: https://issues.apache.org/jira/browse/HBASE-10924
 Project: HBase
  Issue Type: Bug
  Components: Region Assignment
Affects Versions: 0.94.15
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
 Fix For: 0.94.19


Observed behavior:
In about 5% of cases, my rolling upgrade tests fail because of stuck regions 
during a region server unload. My theory is that this occurs when region 
assignment information changes between the time the region list is generated, 
and the time when the region is to be moved.

An example of such a region information change is a split or merge.

Example:
Regionserver A has 100 regions (#0-#99). The balancer is turned off and the 
regionmover script is called to unload this regionserver. The regionmover 
script will generate the list of 100 regions to be moved and then proceed down 
that list, moving the regions off in series. However, there is a region, #84, 
that has split into two daughter regions while regions 0-83 were moved. The 
script will be stuck trying to move #84, timeout, and then the failure will 
bubble up (attempt 1 failed).

Proposed solution:
This specific failure mode should be caught and the region_mover script should 
now attempt to move off all the regions. Now, it will have 16+1 (due to split) 
regions to move. There is a good chance that it will be able to move all 17 off 
without issues. However, should it encounter this same issue (attempt 2 
failed), it will retry again. This process will continue until the maximum 
number of unload retry attempts has been reached.
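
The retry behavior proposed above can be sketched roughly as follows. This is not the region_mover patch; `list_regions`, `move_region`, and `UnloadError` are hypothetical placeholders for whatever the real script uses. The key point is that the region list is regenerated on every attempt, so daughter regions created by a split are picked up.

```python
import time


class UnloadError(Exception):
    """Raised by the hypothetical move step when a region move gets stuck."""


def unload_with_retries(list_regions, move_region, max_attempts=3, sleep_seconds=0.0):
    """Try to move every region off a server, retrying the whole pass on failure.

    Returns the number of attempts used; re-raises after the last attempt.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # Regenerate the list each time so stale entries are dropped.
            for region in list_regions():
                move_region(region)
            return attempt
        except UnloadError:
            if attempt == max_attempts:
                raise
            # Optional pause before the next pass over the regions.
            time.sleep(sleep_seconds)
```

Retrying the whole pass, rather than the single stuck region, keeps the logic simple at the cost of re-listing regions that may already have moved.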

This is not foolproof, but let's say for the sake of argument that 5% of unload 
attempts hit this issue, then with a retry count of 3, it will reduce the 
unload failure probability from 0.05 to 0.000125 (0.05^3).
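
The arithmetic above can be checked with a few lines of Python. Note the assumptions, which the original text does not state explicitly: each attempt fails independently with the same probability, and "a retry count of 3" means three attempts in total. The function name is hypothetical.

```python
def unload_failure_probability(p_single, attempts):
    """Probability that every one of `attempts` independent tries fails."""
    return p_single ** attempts


if __name__ == "__main__":
    for attempts in (1, 2, 3):
        print(attempts, unload_failure_probability(0.05, attempts))
```

If failures are correlated (for example, a region that keeps splitting under heavy load), the real failure probability would be higher than this estimate.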

Next steps:
I am looking for feedback on this approach. If it seems like a sensible 
approach, I will create a strawman patch and test it.




[jira] [Commented] (HBASE-10924) [region_mover]: Adjust region_mover script to retry unloading a server a configurable number of times in case of region splits/merges

2014-04-07 Thread Aleksandr Shulman (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962112#comment-13962112
]

Aleksandr Shulman commented on HBASE-10924:
---

That seems like a good place to put that logic since it'll be easier to
maintain.
As a bonus, we'll have implicit compatibility checks at compile time :)
Only concern is that we don't break the shell api, but that shouldn't be
difficult to maintain.

[region_mover]: Adjust region_mover script to retry unloading a server a
configurable number of times in case of region splits/merges
-


[jira] [Commented] (HBASE-10184) [Online Schema Change]: Add additional tests for online schema change

2014-03-14 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934697#comment-13934697
 ] 

Aleksandr Shulman commented on HBASE-10184:
---

I will take a look. Should be interesting.

 [Online Schema Change]: Add additional tests for online schema change
 -

 Key: HBASE-10184
 URL: https://issues.apache.org/jira/browse/HBASE-10184
 Project: HBase
  Issue Type: Task
  Components: test
Affects Versions: 0.96.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
  Labels: online_schema_change
 Fix For: 0.98.1, 0.99.0

 Attachments: 10184-4.patch, 10184.addendum, HBASE-10184-trunk.diff


 There are some gaps in testing for Online Schema Change:
 Examples of some tests that should be added:
 1. Splits with online schema change
 2. Merge during online schema change
 3. MR over HBase during online schema change
 4. Bulk Load during online schema change
 5. Online change table owner
 6. Online Replication scope change
 7. Online Bloom Filter change
 8. Snapshots during online schema change (HBASE-10136)




[jira] [Commented] (HBASE-10184) [Online Schema Change]: Add additional tests for online schema change

2014-03-14 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934706#comment-13934706
 ] 

Aleksandr Shulman commented on HBASE-10184:
---

From the above runs (3), I only see one test failure (TF) related to my tests: 
https://builds.apache.org/job/HBase-0.98/228/testReport/junit/org.apache.hadoop.hbase.regionserver/TestEndToEndSplitTransaction/testFromClientSideOnlineSchemaChangeWhileSplitting/

Can you point me to the additional failures?

 [Online Schema Change]: Add additional tests for online schema change
 -

 Key: HBASE-10184
 URL: https://issues.apache.org/jira/browse/HBASE-10184
 Project: HBase
  Issue Type: Task
  Components: test
Affects Versions: 0.96.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
  Labels: online_schema_change
 Fix For: 0.99.0, 0.98.2

 Attachments: 10184-4.patch, 10184.addendum, HBASE-10184-trunk.diff



[jira] [Commented] (HBASE-10184) [Online Schema Change]: Add additional tests for online schema change

2014-03-14 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935630#comment-13935630
 ] 

Aleksandr Shulman commented on HBASE-10184:
---

I ran the tests locally a while ago before submitting the patch and they all 
passed for me. It's possible something has changed between now and then.
Let me look into these test failures and whether they reveal actual product 
bugs. Otherwise, I'll adjust the tests to be more stable on our Jenkins runs.

 [Online Schema Change]: Add additional tests for online schema change
 -

 Key: HBASE-10184
 URL: https://issues.apache.org/jira/browse/HBASE-10184
 Project: HBase
  Issue Type: Task
  Components: test
Affects Versions: 0.96.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
  Labels: online_schema_change
 Fix For: 0.99.0, 0.98.2

 Attachments: 10184-4.patch, 10184.addendum, HBASE-10184-trunk.diff



[jira] [Commented] (HBASE-10184) [Online Schema Change]: Add additional tests for online schema change

2014-03-14 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935743#comment-13935743
 ] 

Aleksandr Shulman commented on HBASE-10184:
---

I don't want to relax the requirements of the test unless they are testing 
something that is not always guaranteed to be true. If that's the case, then I 
can remove it. I'd rather get to the bottom of the issue.

 [Online Schema Change]: Add additional tests for online schema change
 -

 Key: HBASE-10184
 URL: https://issues.apache.org/jira/browse/HBASE-10184
 Project: HBase
  Issue Type: Task
  Components: test
Affects Versions: 0.96.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
  Labels: online_schema_change
 Fix For: 0.99.0, 0.98.2

 Attachments: 10184-4.patch, 10184.addendum, HBASE-10184-trunk.diff



[jira] [Commented] (HBASE-10184) [Online Schema Change]: Add additional tests for online schema change

2014-03-13 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934303#comment-13934303
 ] 

Aleksandr Shulman commented on HBASE-10184:
---

Thanks Andrew!

 [Online Schema Change]: Add additional tests for online schema change
 -

 Key: HBASE-10184
 URL: https://issues.apache.org/jira/browse/HBASE-10184
 Project: HBase
  Issue Type: Task
  Components: test
Affects Versions: 0.96.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
  Labels: online_schema_change
 Fix For: 0.98.1, 0.99.0

 Attachments: 10184-4.patch, HBASE-10184-trunk.diff



[jira] [Commented] (HBASE-10184) [Online Schema Change]: Add additional tests for online schema change

2014-03-13 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13934367#comment-13934367
 ] 

Aleksandr Shulman commented on HBASE-10184:
---

+1 on the addendum as well.

 [Online Schema Change]: Add additional tests for online schema change
 -

 Key: HBASE-10184
 URL: https://issues.apache.org/jira/browse/HBASE-10184
 Project: HBase
  Issue Type: Task
  Components: test
Affects Versions: 0.96.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
  Labels: online_schema_change
 Fix For: 0.98.1, 0.99.0

 Attachments: 10184-4.patch, 10184.addendum, HBASE-10184-trunk.diff



[jira] [Commented] (HBASE-10184) [Online Schema Change]: Add additional tests for online schema change

2014-03-12 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932733#comment-13932733
 ] 

Aleksandr Shulman commented on HBASE-10184:
---

Thanks for following up on this. Was the patch that you applied the one from 
the JIRA, or was it the latest from the Review Board review?

The latest one is here:
https://reviews.apache.org/r/16457/diff/raw/

 [Online Schema Change]: Add additional tests for online schema change
 -

 Key: HBASE-10184
 URL: https://issues.apache.org/jira/browse/HBASE-10184
 Project: HBase
  Issue Type: Task
  Components: test
Affects Versions: 0.96.1, 0.99.0, 0.98.2
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
  Labels: online_schema_change
 Attachments: HBASE-10184-trunk.diff


 There are some gaps in testing for Online Schema Change:
 Examples of some tests that should be added:
 1. Splits with online schema change
 2. Merge during online schema change
 3. MR over HBase during online schema change
 4. Bulk Load during online schema change
 5. Online change table owner
 6. Online Replication scope change
 7. Online Bloom Filter change
 8. Snapshots during online schema change (HBASE-10136)




[jira] [Commented] (HBASE-10653) Incorrect table status in HBase shell Describe

2014-03-03 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13918913#comment-13918913
 ] 

Aleksandr Shulman commented on HBASE-10653:
---

The formatting is a bit confusing, but it does appear that the table is
disabled:
'ENABLED' is just the name of the attribute. Below, it says 'false', right
under that heading.

DESCRIPTION                                                          *ENABLED*
 'TestTable', {NAME => 'info', DATA_BLOCK_ENCODING => 'NONE', BLOOMF *false*
 ILTER => 'ROW', REPLICATION_SCOPE => '0', VERSIONS => '1', COMPRESS
 ION => 'SNAPPY', MIN_VERSIONS => '0', TTL => '2147483647', KEEP_DEL
 ETED_CELLS => 'false', BLOCKSIZE => '65536', IN_MEMORY => 'false',
 BLOCKCACHE => 'true'}
1 row(s) in 1.4220 seconds

This might be a usability concern though, and maybe it's worth exploring a
clearer formatting.
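
To illustrate why the value ends up in the middle of an attribute name, here is a minimal sketch of a two-column layout that wraps the left column at a fixed width without regard to word boundaries. It is written in Java for illustration only; the HBase shell's real formatter is not shown in this thread, and the class name `TwoColumnSketch`, its methods, and the width used are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; not the HBase shell's actual formatter. */
final class TwoColumnSketch {
    /** Splits text into fixed-width chunks, ignoring word boundaries. */
    static List<String> chunk(String text, int width) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < text.length(); i += width) {
            out.add(text.substring(i, Math.min(text.length(), i + width)));
        }
        return out;
    }

    /** Renders rows; the second column's value is printed on the first row only. */
    static List<String> render(String description, String enabled, int width) {
        List<String> rows = new ArrayList<>();
        List<String> chunks = chunk(description, width);
        for (int i = 0; i < chunks.size(); i++) {
            if (i == 0) {
                // Pad the first chunk to the column width, then append the value.
                rows.add(String.format("%-" + width + "s", chunks.get(i)) + " " + enabled);
            } else {
                rows.add(chunks.get(i));
            }
        }
        return rows;
    }
}
```

With a shortened description and a width of 36, the first row ends in `BLOOMF false` and the second row starts with `ILTER`, which mirrors the output pasted above.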









 Incorrect table status in HBase shell Describe
 --

 Key: HBASE-10653
 URL: https://issues.apache.org/jira/browse/HBASE-10653
 Project: HBase
  Issue Type: Bug
  Components: shell
Reporter: Biju Nair
  Labels: HbaseShell, describe

 Describe output of table which is disabled shows as enabled.




[jira] [Commented] (HBASE-10653) Incorrect table status in HBase shell Describe

2014-03-03 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13918922#comment-13918922
 ] 

Aleksandr Shulman commented on HBASE-10653:
---

To clarify, the 'false' ended up here:
{code}BLOOMF *false* ILTER {code}

 Incorrect table status in HBase shell Describe
 --

 Key: HBASE-10653
 URL: https://issues.apache.org/jira/browse/HBASE-10653
 Project: HBase
  Issue Type: Bug
  Components: shell
Reporter: Biju Nair
  Labels: HbaseShell, describe

 Describe output of table which is disabled shows as enabled.




[jira] [Updated] (HBASE-10579) [Documentation]: ExportSnapshot tool package incorrectly documented

2014-02-23 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-10579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-10579:
--

Attachment: HBASE-10579-v0.patch

Trivial fix. There is only one reference to this path in the book, so I just 
had to fix it in one spot.

 [Documentation]: ExportSnapshot tool package incorrectly documented
 ---

 Key: HBASE-10579
 URL: https://issues.apache.org/jira/browse/HBASE-10579
 Project: HBase
  Issue Type: Bug
  Components: documentation, snapshots
Affects Versions: 0.98.0
Reporter: Aleksandr Shulman
Priority: Minor
 Fix For: 0.96.2, 0.98.1

 Attachments: HBASE-10579-v0.patch


 Documentation Page: http://hbase.apache.org/book/ops.snapshots.html
 Expected documentation:
 The class should be specified as 
 org.apache.hadoop.hbase.snapshot.ExportSnapshot
 Current documentation:
 Specified as: org.apache.hadoop.hbase.snapshot.tool.ExportSnapshot
 The expected name is correct because the class is located in the 
 org.apache.hadoop.hbase.snapshot package:
 https://github.com/apache/hbase/blob/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java#19



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

[jira] [Commented] (HBASE-10567) Add overwrite manifest option to ExportSnapshot

2014-02-21 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13909089#comment-13909089
 ] 

Aleksandr Shulman commented on HBASE-10567:
---

Took a first read of the patch. Looks good to me. I'd maybe like to see a few 
more tests, but this is probably okay for now.

 Add overwrite manifest option to ExportSnapshot
 ---

 Key: HBASE-10567
 URL: https://issues.apache.org/jira/browse/HBASE-10567
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.98.0, 0.94.16, 0.99.0, 0.96.1.1
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
Priority: Minor
 Fix For: 0.96.2, 0.98.1, 0.99.0

 Attachments: HBASE-10567-v0.patch, HBASE-10567-v1.patch


 If you want to export a snapshot twice (e.g. in case you accidentally removed 
 a file and now your snapshot is corrupted) you have to manually remove the 
 .hbase-snapshot/SNAPSHOT_NAME directory and then run the ExportSnapshot tool.
 Add an -overwrite option to do this cleanup automatically.




[jira] [Created] (HBASE-10579) [Documentation]: ExportSnapshot tool package incorrectly documented

2014-02-20 Thread Aleksandr Shulman (JIRA)

Aleksandr Shulman created HBASE-10579:
-

 Summary: [Documentation]: ExportSnapshot tool package incorrectly 
documented
 Key: HBASE-10579
 URL: https://issues.apache.org/jira/browse/HBASE-10579
 Project: HBase
  Issue Type: Bug
  Components: documentation, snapshots
Affects Versions: 0.98.0
Reporter: Aleksandr Shulman
Priority: Minor
 Fix For: 0.96.2, 0.98.1


Documentation Page: http://hbase.apache.org/book/ops.snapshots.html

Expected documentation:
The class should be specified as org.apache.hadoop.hbase.snapshot.ExportSnapshot

Current documentation:
Specified as: org.apache.hadoop.hbase.snapshot.tool.ExportSnapshot

The expected name is correct because the class is located in the 
org.apache.hadoop.hbase.snapshot package:

https://github.com/apache/hbase/blob/0.98/hbase-server/src/main/java/org/apache/hadoop/hbase/snapshot/ExportSnapshot.java#19




[jira] [Commented] (HBASE-10481) API Compatibility JDiff script does not properly handle arguments in reverse order

2014-02-07 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13895104#comment-13895104
 ] 

Aleksandr Shulman commented on HBASE-10481:
---

Semantically, it does not make sense for the previous version to be greater 
than the current version. The script would just generate a report that is the 
mirror image (additions reported as removals). 

I don't think this is a meaningful use case to support. The solution would be 
to add a meaningful error message and also to document the logic.
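
As an illustration of such an error check, here is a minimal sketch. It is written in Java rather than in the script's own language, and it assumes the two arguments are dotted numeric version strings such as 0.94.16; the class name `VersionOrderCheck` and both method names are hypothetical and are not part of the jdiff script.

```java
/** Hypothetical helper; not part of the HBase jdiff script. */
final class VersionOrderCheck {
    /** Compares dotted numeric versions such as "0.94.16" and "0.96.1.1". */
    static int compareVersions(String a, String b) {
        String[] pa = a.split("\\.");
        String[] pb = b.split("\\.");
        int n = Math.max(pa.length, pb.length);
        for (int i = 0; i < n; i++) {
            // Missing components count as 0, so "0.96" equals "0.96.0".
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    /** Throws with a descriptive message if previous is newer than current. */
    static void requireOrdered(String previous, String current) {
        if (compareVersions(previous, current) > 0) {
            throw new IllegalArgumentException(
                "Previous version " + previous + " is newer than current version "
                + current + "; swap the arguments.");
        }
    }
}
```

Calling `requireOrdered("0.98.0", "0.94.16")` fails fast with a message that names both versions, instead of silently producing a mirrored report.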

 API Compatibility JDiff script does not properly handle arguments in reverse 
 order
 --

 Key: HBASE-10481
 URL: https://issues.apache.org/jira/browse/HBASE-10481
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.98.0, 0.94.16, 0.99.0, 0.96.1.1
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.94.16, 0.98.1, 0.99.0, 0.96.1.1


 [~jmhsieh] found an issue when doing a diff between a pre-0.96 branch and a 
 post-0.96 branch.
 Typically, if the pre-0.96 branch is specified first and the post-0.96 
 branch second, the existing logic handles it.
 When the arguments are given in the reverse order, the script does not handle them properly.
 The fix should address this.




[jira] [Updated] (HBASE-10481) API Compatibility JDiff script does not properly handle arguments in reverse order

2014-02-07 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-10481:
--

Attachment: HBASE-10481-v1.patch

Adding v1 of the patch. Fixes the case identified in the jira and also corrects 
some of the output about where the working directory is.

 API Compatibility JDiff script does not properly handle arguments in reverse 
 order
 --

 Key: HBASE-10481
 URL: https://issues.apache.org/jira/browse/HBASE-10481
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.98.0, 0.94.16, 0.99.0, 0.96.1.1
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.94.16, 0.98.1, 0.99.0, 0.96.1.1

 Attachments: HBASE-10481-v1.patch


 [~jmhsieh] found an issue when doing a diff between a pre-0.96 branch and a 
 post-0.96 branch.
 Typically, if the pre-0.96 branch is specified first and the post-0.96 
 branch second, the existing logic handles it.
 When the arguments are given in the reverse order, the script does not handle them properly.
 The fix should address this.




[jira] [Work started] (HBASE-10481) API Compatibility JDiff script does not properly handle arguments in reverse order

2014-02-07 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-10481 started by Aleksandr Shulman.

 API Compatibility JDiff script does not properly handle arguments in reverse 
 order
 --

 Key: HBASE-10481
 URL: https://issues.apache.org/jira/browse/HBASE-10481
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.98.0, 0.94.16, 0.99.0, 0.96.1.1
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.94.16, 0.98.1, 0.99.0, 0.96.1.1

 Attachments: HBASE-10481-v1.patch


 [~jmhsieh] found an issue when doing a diff between a pre-0.96 branch and a 
 post-0.96 branch.
 Typically, if the pre-0.96 branch is specified first and the post-0.96 
 branch second, the existing logic handles it.
 When the arguments are given in the reverse order, the script does not handle them properly.
 The fix should address this.




[jira] [Updated] (HBASE-10481) API Compatibility JDiff script does not properly handle arguments in reverse order

2014-02-07 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-10481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-10481:
--

Status: Patch Available  (was: In Progress)

 API Compatibility JDiff script does not properly handle arguments in reverse 
 order
 --

 Key: HBASE-10481
 URL: https://issues.apache.org/jira/browse/HBASE-10481
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.96.1.1, 0.94.16, 0.98.0, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.98.1, 0.99.0, 0.96.1.1, 0.94.16

 Attachments: HBASE-10481-v1.patch


 [~jmhsieh] found an issue when doing a diff between a pre-0.96 branch and a 
 post-0.96 branch.
 Typically, if the pre-0.96 branch is specified first and the post-0.96 
 branch second, the existing logic handles it.
 When the arguments are given in the reverse order, the script does not handle them properly.
 The fix should address this.




[jira] [Created] (HBASE-10481) API Compatibility JDiff script does not properly handle arguments in reverse order

2014-02-06 Thread Aleksandr Shulman (JIRA)

Aleksandr Shulman created HBASE-10481:
-

 Summary: API Compatibility JDiff script does not properly handle 
arguments in reverse order
 Key: HBASE-10481
 URL: https://issues.apache.org/jira/browse/HBASE-10481
 Project: HBase
  Issue Type: Bug
  Components: test
Affects Versions: 0.96.1.1, 0.94.16, 0.98.0, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Fix For: 0.98.0, 0.99.0, 0.96.1.1, 0.94.16


[~jmhsieh] found an issue when doing a diff between a pre-0.96 branch and a 
post-0.96 branch.

Typically, if the pre-0.96 branch is specified first and the post-0.96 branch 
second, the existing logic handles it.

When the arguments are given in the reverse order, the script does not handle them properly.

The fix should address this.




[jira] [Commented] (HBASE-10264) [MapReduce]: CompactionTool in mapred mode is missing classes in its classpath

2014-01-02 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13860747#comment-13860747
 ] 

Aleksandr Shulman commented on HBASE-10264:
---

+1, looks good to me too. I also smoke tested it against MRv1.

 [MapReduce]: CompactionTool in mapred mode is missing classes in its classpath
 --

 Key: HBASE-10264
 URL: https://issues.apache.org/jira/browse/HBASE-10264
 Project: HBase
  Issue Type: Bug
  Components: Compaction, mapreduce
Affects Versions: 0.98.0, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Himanshu Vashishtha
 Attachments: HBase-10264.patch


 Calling o.a.h.h.regionserver.CompactionTool fails due to classpath-related 
 issues in both MRv1 and MRv2.
 {code}hbase org.apache.hadoop.hbase.regionserver.CompactionTool -mapred 
 -major hdfs://`hostname`:8020/hbase/data/default/orig_1388179858868{code}
 Results:
 {code}2013-12-27 13:31:49,478 INFO  [main] mapreduce.Job: Task Id : 
 attempt_1388179525649_0011_m_00_2, Status : FAILED
 Error: java.lang.ClassNotFoundException: 
 org.apache.hadoop.hbase.TableInfoMissingException
   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
   at 
 org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionWorker.compact(CompactionTool.java:115)
   at 
 org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionMapper.map(CompactionTool.java:231)
   at 
 org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionMapper.map(CompactionTool.java:207)
   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499)
   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160){code}




[jira] [Created] (HBASE-10269) [Nit]: Spelling issue in HFileContext.setCompresssion

2014-01-02 Thread Aleksandr Shulman (JIRA)

Aleksandr Shulman created HBASE-10269:
-

 Summary: [Nit]: Spelling issue in HFileContext.setCompresssion
 Key: HBASE-10269
 URL: https://issues.apache.org/jira/browse/HBASE-10269
 Project: HBase
  Issue Type: Bug
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor


HBASE-7544 introduced a misspelling into 
HFileContext.java:

https://github.com/apache/hbase/blob/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContext.java#L103

The fix is trivial. Will attach.




[jira] [Updated] (HBASE-10269) [Nit]: Spelling issue in HFileContext.setCompresssion

2014-01-02 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-10269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-10269:
--

Attachment: HBASE-10269-1.patch

It looks like this call was not used anywhere, so the change is a one-liner.

 [Nit]: Spelling issue in HFileContext.setCompresssion
 -

 Key: HBASE-10269
 URL: https://issues.apache.org/jira/browse/HBASE-10269
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0, 0.98.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Attachments: HBASE-10269-1.patch


 HBASE-7544 introduced a misspelling into 
 HFileContext.java:
 https://github.com/apache/hbase/blob/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContext.java#L103
 The fix is trivial. Will attach.




[jira] [Updated] (HBASE-10269) [Nit]: Spelling issue in HFileContext.setCompresssion

2014-01-02 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-10269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-10269:
--

Affects Version/s: 0.99.0
   0.98.1
   0.98.0

 [Nit]: Spelling issue in HFileContext.setCompresssion
 -

 Key: HBASE-10269
 URL: https://issues.apache.org/jira/browse/HBASE-10269
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0, 0.98.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Attachments: HBASE-10269-1.patch


 HBASE-7544 introduced a misspelling into 
 HFileContext.java:
 https://github.com/apache/hbase/blob/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContext.java#L103
 The fix is trivial. Will attach.




[jira] [Updated] (HBASE-10269) [Nit]: Spelling issue in HFileContext.setCompresssion

2014-01-02 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-10269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-10269:
--

Status: Patch Available  (was: Open)

 [Nit]: Spelling issue in HFileContext.setCompresssion
 -

 Key: HBASE-10269
 URL: https://issues.apache.org/jira/browse/HBASE-10269
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0, 0.98.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Attachments: HBASE-10269-1.patch


 HBASE-7544 introduced a misspelling into 
 HFileContext.java:
 https://github.com/apache/hbase/blob/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContext.java#L103
 The fix is trivial. Will attach.




[jira] [Commented] (HBASE-10269) [Nit]: Spelling issue in HFileContext.setCompresssion

2014-01-02 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13860802#comment-13860802
 ] 

Aleksandr Shulman commented on HBASE-10269:
---

Should apply cleanly to both 0.98 and trunk. Not necessary for 0.96.

 [Nit]: Spelling issue in HFileContext.setCompresssion
 -

 Key: HBASE-10269
 URL: https://issues.apache.org/jira/browse/HBASE-10269
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0, 0.98.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Attachments: HBASE-10269-1.patch


 HBASE-7544 introduced a misspelling into 
 HFileContext.java:
 https://github.com/apache/hbase/blob/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContext.java#L103
 The fix is trivial. Will attach.




[jira] [Created] (HBASE-10264) [MapReduce]: CompactionTool in mapred mode is missing classes in its classpath

2013-12-31 Thread Aleksandr Shulman (JIRA)

Aleksandr Shulman created HBASE-10264:
-

 Summary: [MapReduce]: CompactionTool in mapred mode is missing 
classes in its classpath
 Key: HBASE-10264
 URL: https://issues.apache.org/jira/browse/HBASE-10264
 Project: HBase
  Issue Type: Bug
  Components: Compaction, mapreduce
Affects Versions: 0.98.0, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Himanshu Vashishtha


Calling o.a.h.h.regionserver.CompactionTool fails due to classpath-related 
issues in both MRv1 and MRv2.

{code}hbase org.apache.hadoop.hbase.regionserver.CompactionTool -mapred -major 
hdfs://`hostname`:8020/hbase/data/default/orig_1388179858868{code}

Results:

{code}2013-12-27 13:31:49,478 INFO  [main] mapreduce.Job: Task Id : 
attempt_1388179525649_0011_m_00_2, Status : FAILED
Error: java.lang.ClassNotFoundException: 
org.apache.hadoop.hbase.TableInfoMissingException
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at 
org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionWorker.compact(CompactionTool.java:115)
at 
org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionMapper.map(CompactionTool.java:231)
at 
org.apache.hadoop.hbase.regionserver.CompactionTool$CompactionMapper.map(CompactionTool.java:207)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:165)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1499)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:160){code}
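
The trace shows that the task JVM's class loader cannot find org.apache.hadoop.hbase.TableInfoMissingException. One way to confirm which classes are visible inside a task is a small probe like the sketch below. It uses only standard Java class loading; the class name `ClasspathProbe` and the idea of calling it from task code are assumptions for illustration, not something HBase or Hadoop provides.

```java
/** Hypothetical diagnostic; not part of HBase or Hadoop. */
final class ClasspathProbe {
    /** Returns true if the named class can be located by this class's loader. */
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }
}
```

Logging the result of `ClasspathProbe.isLoadable("org.apache.hadoop.hbase.TableInfoMissingException")` from within the job would show whether the jar containing that class reached the task classpath.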




[jira] [Created] (HBASE-10251) Restore API Compat for PerformanceEvaluation.generateValue()

2013-12-27 Thread Aleksandr Shulman (JIRA)

Aleksandr Shulman created HBASE-10251:
-

 Summary: Restore API Compat for 
PerformanceEvaluation.generateValue()
 Key: HBASE-10251
 URL: https://issues.apache.org/jira/browse/HBASE-10251
 Project: HBase
  Issue Type: Bug
  Components: Client
Affects Versions: 0.98.1
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman


Observed:

A couple of my client tests fail to compile against trunk because the method 
PerformanceEvaluation.generateValue was removed as part of HBASE-8496.

This is an issue because it was used in a number of places, including unit 
tests. Since we did not explicitly label this API as private, it's ambiguous as 
to whether this could/should have been used by people writing apps against 
0.96. If they used it, then they would be broken upon upgrade to 0.98 and trunk.

Potential Solution:
The method was renamed to generateData, but the logic is still the same. We can 
reintroduce it as a deprecated method in 0.98, as a compat shim over generateData. The 
patch should be a few lines. We may also consider doing so in trunk, but I'd be 
just as fine with leaving it out.
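
A minimal sketch of what such a shim could look like follows. The class `PerformanceEvaluationSketch` is a stand-in, and the method signatures and the `VALUE_LENGTH` constant are assumptions for illustration; the real signatures in PerformanceEvaluation are not given in this issue.

```java
import java.util.Random;

/** Stand-in for the real class; the signatures below are assumptions. */
final class PerformanceEvaluationSketch {
    /** Hypothetical buffer size, chosen only for this example. */
    static final int VALUE_LENGTH = 1000;

    /** New name: fills a fixed-size buffer with random bytes. */
    static byte[] generateData(Random r) {
        byte[] b = new byte[VALUE_LENGTH];
        r.nextBytes(b);
        return b;
    }

    /**
     * Old name, kept so that existing callers still compile.
     * @deprecated use {@link #generateData(Random)} instead.
     */
    @Deprecated
    static byte[] generateValue(Random r) {
        return generateData(r);
    }
}
```

Because the old method only delegates, callers of either name get identical results for the same Random seed, and the compiler's deprecation warning points them to the new name.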

More generally, this raises the question of what other code is in this 
grey area, where it is public, is used outside of the package, but is not 
explicitly labeled with an InterfaceAudience annotation.




[jira] [Updated] (HBASE-10251) Restore API Compat for PerformanceEvaluation.generateValue()

2013-12-27 Thread Aleksandr Shulman (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Aleksandr Shulman updated HBASE-10251:
--

Labels: api_compatibility (was: )

Restore API Compat for PerformanceEvaluation.generateValue()

Key: HBASE-10251
URL: https://issues.apache.org/jira/browse/HBASE-10251
Project: HBase
Issue Type: Bug
Components: Client
Affects Versions: 0.98.1
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Labels: api_compatibility

Observed:
A couple of my client tests fail to compile against trunk because the method
PerformanceEvaluation.generateValue was removed as part of HBASE-8496.
This is an issue because it was used in a number of places, including unit
tests. Since we did not explicitly label this API as private, it's ambiguous
as to whether this could/should have been used by people writing apps against
0.96. If they used it, then they would be broken upon upgrade to 0.98 and
trunk.
Potential Solution:
The method was renamed to generateData, but the logic is still the same. We
can reintroduce it as a deprecated method in 0.98, as a compat shim over generateData.
The patch should be a few lines. We may also consider doing so in trunk, but
I'd be just as fine with leaving it out.
More generally, this raises the question of what other code is in this
grey area, where it is public, is used outside of the package, but is not
explicitly labeled with an InterfaceAudience annotation.


[jira] [Updated] (HBASE-10251) Restore API Compat for PerformanceEvaluation.generateValue()

2013-12-27 Thread Aleksandr Shulman (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-10251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Aleksandr Shulman updated HBASE-10251:
--

Description:
Observed:

A couple of my client tests fail to compile against trunk because the method
PerformanceEvaluation.generateValue was removed as part of HBASE-8496.

This is an issue because it was used in a number of places, including unit
tests. Since we did not explicitly label this API as private, it's ambiguous as
to whether this could/should have been used by people writing apps against
0.96. If they used it, then they would be broken upon upgrade to 0.98 and trunk.

Potential Solution:
The method was renamed to generateData, but the logic is still the same. We can
reintroduce it as a deprecated method in 0.98, as a compat shim over generateData. The
patch should be a few lines. We may also consider doing so in trunk, but I'd be
just as fine with leaving it out.

More generally, this raises the question of what other code is in this
grey area, where it is public, is used outside of the package, but is not
explicitly labeled with an InterfaceAudience annotation.

was:
Observed:

A couple of my client tests fail to compile against trunk because the method
PerformanceEvaluation.generateValue was removed as part of HBASE-8496.

This is an issue because is was used in a number of places, including unit
tests. Since we did not explicitly label this API as private, it's ambiguous as
to whether this could/should have been used by people writing apps against
0.96. If they used it, then they would be broken upon upgrade to 0.98 and trunk.

More generally, this raises the question about what other code is in this
grey-area, where it is public, is used outside of the package, but is not
explicitly labeled with an AudienceInterface.

Restore API Compat for PerformanceEvaluation.generateValue()


[jira] [Commented] (HBASE-10184) [Online Schema Change]: Add additional tests for online schema change

2013-12-24 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13856465#comment-13856465
 ] 

Aleksandr Shulman commented on HBASE-10184:
---

Review available here: https://reviews.apache.org/r/16457/

 [Online Schema Change]: Add additional tests for online schema change
 -

 Key: HBASE-10184
 URL: https://issues.apache.org/jira/browse/HBASE-10184
 Project: HBase
  Issue Type: Task
  Components: test
Affects Versions: 0.96.1, 0.98.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
  Labels: online_schema_change
 Attachments: HBASE-10184-trunk.diff


 There are some gaps in testing for Online Schema Change:
 Examples of some tests that should be added:
 1. Splits with online schema change
 2. Merge during online schema change
 3. MR over HBase during online schema change
 4. Bulk Load during online schema change
 5. Online change table owner
 6. Online Replication scope change
 7. Online Bloom Filter change
 8. Snapshots during online schema change (HBASE-10136)




[jira] [Updated] (HBASE-10194) [Usability]: Instructions in CompactionTool no longer accurate because of namespaces

2013-12-18 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-10194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-10194:
--

Attachment: HBASE-10194-trunk.patch

Attaching patch. Should apply cleanly to every branch after 0.94.

 [Usability]: Instructions in CompactionTool no longer accurate because of 
 namespaces
 

 Key: HBASE-10194
 URL: https://issues.apache.org/jira/browse/HBASE-10194
 Project: HBase
  Issue Type: Bug
  Components: Compaction, util
Affects Versions: 0.96.2, 0.98.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor
 Attachments: HBASE-10194-trunk.patch


 Observed Behavior:
 The usage instructions for org.apache.hadoop.hbase.regionserver.CompactionTool 
 still suggest the pre-0.95 HBase directory layout:
 {code}Examples:
  To compact the full 'TestTable' using MapReduce:
  $ bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool -mapred 
 hdfs:///hbase/TestTable{code}
 Expected behavior:
 It should now take into account namespaces, for example:
 {code}
 hdfs:///hbase/data/default/TestTable
 {code}



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

[jira] [Updated] (HBASE-10184) [Online Schema Change]: Add additional tests for online schema change

2013-12-17 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-10184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-10184:
--

Attachment: HBASE-10184-trunk.diff

First draft of patch against trunk. All tests pass.

 [Online Schema Change]: Add additional tests for online schema change
 -

 Key: HBASE-10184
 URL: https://issues.apache.org/jira/browse/HBASE-10184
 Project: HBase
  Issue Type: Task
  Components: test
Affects Versions: 0.96.1, 0.98.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
  Labels: online_schema_change
 Attachments: HBASE-10184-trunk.diff


 There are some gaps in testing for Online Schema Change:
 Examples of some tests that should be added:
 1. Splits with online schema change
 2. Merge during online schema change
 3. MR over HBase during online schema change
 4. Bulk Load during online schema change
 5. Online change table owner
 6. Online Replication scope change
 7. Online Bloom Filter change
 8. Snapshots during online schema change (HBASE-10136)




[jira] [Created] (HBASE-10194) [Usability]: Instructions in CompactionTool no longer accurate because of namespaces

2013-12-17 Thread Aleksandr Shulman (JIRA)

Aleksandr Shulman created HBASE-10194:
-

 Summary: [Usability]: Instructions in CompactionTool no longer 
accurate because of namespaces
 Key: HBASE-10194
 URL: https://issues.apache.org/jira/browse/HBASE-10194
 Project: HBase
  Issue Type: Bug
  Components: Compaction, util
Affects Versions: 0.96.2, 0.98.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
Priority: Minor


Observed Behavior:
The usage instructions for org.apache.hadoop.hbase.regionserver.CompactionTool 
still suggest the pre-0.95 HBase directory layout:
{code}Examples:
 To compact the full 'TestTable' using MapReduce:
 $ bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool -mapred 
hdfs:///hbase/TestTable{code}

Expected behavior:
It should now take into account namespaces, for example:
{code}
hdfs:///hbase/data/default/TestTable
{code}
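
The difference between the two layouts can be sketched as a small path builder. This is an illustration only: the class name `TablePathSketch` and its methods are hypothetical, and the only facts taken from this issue are the `data/<namespace>/<table>` layout and the `default` namespace.

```java
/** Hypothetical helper showing the namespace-aware table directory layout. */
final class TablePathSketch {
    /** Namespace used in the example path from this issue. */
    static final String DEFAULT_NAMESPACE = "default";

    /** Builds root/data/namespace/table, dropping one trailing slash from root. */
    static String tableDir(String rootDir, String namespace, String table) {
        String root = rootDir.endsWith("/")
            ? rootDir.substring(0, rootDir.length() - 1)
            : rootDir;
        return root + "/data/" + namespace + "/" + table;
    }

    /** Convenience overload for tables in the default namespace. */
    static String tableDir(String rootDir, String table) {
        return tableDir(rootDir, DEFAULT_NAMESPACE, table);
    }
}
```

For example, `TablePathSketch.tableDir("hdfs:///hbase", "TestTable")` yields `hdfs:///hbase/data/default/TestTable`, the path the usage text should show.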




[jira] [Created] (HBASE-10184) [Online Schema Change]: Add additional tests for online schema change

2013-12-16 Thread Aleksandr Shulman (JIRA)

Aleksandr Shulman created HBASE-10184:
-

 Summary: [Online Schema Change]: Add additional tests for online 
schema change
 Key: HBASE-10184
 URL: https://issues.apache.org/jira/browse/HBASE-10184
 Project: HBase
  Issue Type: Task
  Components: test
Affects Versions: 0.96.1, 0.98.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman


There are some gaps in testing for Online Schema Change:

Examples of some tests that should be added:
1. Splits with online schema change
2. Merge during online schema change
3. MR over HBase during online schema change
4. Bulk Load during online schema change
5. Online change table owner
6. Online Replication scope change
7. Online Bloom Filter change
8. Snapshots during online schema change (HBASE-10136)




[jira] [Updated] (HBASE-10136) Alter table conflicts with concurrent snapshot attempt on that table

2013-12-16 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-10136:
--

Attachment: HBASE-10136-trunk.patch

Adding a patch for a test that exposes this issue. Test should pass once this 
issue is resolved.

 Alter table conflicts with concurrent snapshot attempt on that table
 

 Key: HBASE-10136
 URL: https://issues.apache.org/jira/browse/HBASE-10136
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.96.0, 0.98.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Matteo Bertozzi
  Labels: online_schema_change
 Attachments: HBASE-10136-trunk.patch


 Expected behavior:
 A user can issue a request for a snapshot of a table while that table is 
 undergoing an online schema change and expect that snapshot request to 
 complete correctly. Also, the same is true if a user issues an online schema 
 change request while a snapshot attempt is ongoing.
 Observed behavior:
 Snapshot attempts time out when there is an ongoing online schema change 
 because the region is closed and opened during the snapshot. 
 As a side-note, I would expect that the attempt should fail quickly as 
 opposed to timing out. 
 Further, what I have seen is that subsequent attempts to snapshot the table 
 fail because of some state/cleanup issues. This is also concerning.
 Immediate error:
 {code}type=FLUSH }' is still in progress!
 2013-12-11 15:58:32,883 DEBUG [Thread-385] client.HBaseAdmin(2696): (#11) 
 Sleeping: 1ms while waiting for snapshot completion.
 2013-12-11 15:58:42,884 DEBUG [Thread-385] client.HBaseAdmin(2704): Getting 
 current status of snapshot from master...
 2013-12-11 15:58:42,887 DEBUG [FifoRpcScheduler.handler1-thread-3] 
 master.HMaster(2891): Checking to see if snapshot from request:{ ss=snapshot0 
 table=changeSchemaDuringSnapshot1386806258640 type=FLUSH } is done
 2013-12-11 15:58:42,887 DEBUG [FifoRpcScheduler.handler1-thread-3] 
 snapshot.SnapshotManager(374): Snapshoting '{ ss=snapshot0 
 table=changeSchemaDuringSnapshot1386806258640 type=FLUSH }' is still in 
 progress!
 Snapshot failure occurred
 org.apache.hadoop.hbase.snapshot.SnapshotCreationException: Snapshot 
 'snapshot0' wasn't completed in expectedTime:6 ms
   at 
 org.apache.hadoop.hbase.client.HBaseAdmin.snapshot(HBaseAdmin.java:2713)
   at 
 org.apache.hadoop.hbase.client.HBaseAdmin.snapshot(HBaseAdmin.java:2638)
   at 
 org.apache.hadoop.hbase.client.HBaseAdmin.snapshot(HBaseAdmin.java:2602)
   at 
 org.apache.hadoop.hbase.client.TestAdmin$BackgroundSnapshotThread.run(TestAdmin.java:1974){code}
 Likely root cause of error:
 {code}Exception in SnapshotSubprocedurePool
 java.util.concurrent.ExecutionException: 
 org.apache.hadoop.hbase.NotServingRegionException: 
 changeSchemaDuringSnapshot1386806258640,,1386806258720.ea776db51749e39c956d771a7d17a0f3.
  is closing
   at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
   at java.util.concurrent.FutureTask.get(FutureTask.java:83)
   at 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedurePool.waitForOutstandingTasks(RegionServerSnapshotManager.java:314)
   at 
 org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.flushSnapshot(FlushSnapshotSubprocedure.java:118)
   at 
 org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.insideBarrier(FlushSnapshotSubprocedure.java:137)
   at 
 org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:181)
   at 
 org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:1)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.hadoop.hbase.NotServingRegionException: 
 changeSchemaDuringSnapshot1386806258640,,1386806258720.ea776db51749e39c956d771a7d17a0f3.
  is closing
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5327)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5289)
   at 
 org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:79)
   at 
 org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:1)
   at

[jira] [Commented] (HBASE-10184) [Online Schema Change]: Add additional tests for online schema change

2013-12-16 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13850213#comment-13850213
 ] 

Aleksandr Shulman commented on HBASE-10184:
---

Thanks [~xieliang007]. I thought through some of the things that could happen 
simultaneously and be compromised by this operation. I have some test cases 
locally for some of these already that seem to pass. If you have any others 
you'd like to suggest, let me know :)

 [Online Schema Change]: Add additional tests for online schema change
 -

 Key: HBASE-10184
 URL: https://issues.apache.org/jira/browse/HBASE-10184
 Project: HBase
  Issue Type: Task
  Components: test
Affects Versions: 0.96.1, 0.98.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
  Labels: online_schema_change


[jira] [Created] (HBASE-10136) [Online Schema Change]: Online Schema Change on a table conflicts with snapshot attempt on the table

2013-12-11 Thread Aleksandr Shulman (JIRA)

Aleksandr Shulman created HBASE-10136:
-

 Summary: [Online Schema Change]: Online Schema Change on a table 
conflicts with snapshot attempt on the table
 Key: HBASE-10136
 URL: https://issues.apache.org/jira/browse/HBASE-10136
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.96.0, 0.98.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman


Expected behavior:
A user can take a snapshot of a table while that table is undergoing an online 
schema change.

Observed behavior:
Snapshot attempts time out when there is an ongoing online schema change 
because the region is closed and opened during the snapshot. 

As a side-note, I would expect that the attempt should fail quickly as opposed 
to timing out. 
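
As an illustration of the fail-fast behavior I have in mind, here is a minimal sketch of a wait loop that returns as soon as the operation reports a terminal state rather than sleeping until the deadline. Everything in it (SnapshotWaitSketch, the Status enum, the await method and its parameters) is hypothetical and is not the HBaseAdmin implementation.

```java
import java.util.function.Supplier;

// Hypothetical sketch; not how HBaseAdmin.snapshot is implemented.
final class SnapshotWaitSketch {
    enum Status { IN_PROGRESS, DONE, FAILED }

    private SnapshotWaitSketch() {}

    /**
     * Polls statusSource until it reports DONE or FAILED, or until timeoutMs elapses.
     * Returns the last status seen; IN_PROGRESS means the wait timed out or was interrupted.
     */
    static Status await(Supplier<Status> statusSource, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            Status s = statusSource.get();
            if (s != Status.IN_PROGRESS) {
                return s; // terminal state: stop immediately, including on FAILED
            }
            if (System.nanoTime() - deadline >= 0) {
                return Status.IN_PROGRESS; // deadline passed without a terminal state
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return Status.IN_PROGRESS;
            }
        }
    }
}
```

The point of the sketch is only that a FAILED status ends the wait on the next poll, so the caller sees the failure promptly instead of a timeout.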

Further, what I have seen is that subsequent attempts to snapshot the table 
fail because of some state/cleanup issues. This is also concerning.

Immediate error:
{code}type=FLUSH }' is still in progress!
2013-12-11 15:58:32,883 DEBUG [Thread-385] client.HBaseAdmin(2696): (#11) 
Sleeping: 1ms while waiting for snapshot completion.
2013-12-11 15:58:42,884 DEBUG [Thread-385] client.HBaseAdmin(2704): Getting 
current status of snapshot from master...
2013-12-11 15:58:42,887 DEBUG [FifoRpcScheduler.handler1-thread-3] 
master.HMaster(2891): Checking to see if snapshot from request:{ ss=snapshot0 
table=changeSchemaDuringSnapshot1386806258640 type=FLUSH } is done
2013-12-11 15:58:42,887 DEBUG [FifoRpcScheduler.handler1-thread-3] 
snapshot.SnapshotManager(374): Snapshoting '{ ss=snapshot0 
table=changeSchemaDuringSnapshot1386806258640 type=FLUSH }' is still in 
progress!
Snapshot failure occurred
org.apache.hadoop.hbase.snapshot.SnapshotCreationException: Snapshot 
'snapshot0' wasn't completed in expectedTime:6 ms
at 
org.apache.hadoop.hbase.client.HBaseAdmin.snapshot(HBaseAdmin.java:2713)
at 
org.apache.hadoop.hbase.client.HBaseAdmin.snapshot(HBaseAdmin.java:2638)
at 
org.apache.hadoop.hbase.client.HBaseAdmin.snapshot(HBaseAdmin.java:2602)
at 
org.apache.hadoop.hbase.client.TestAdmin$BackgroundSnapshotThread.run(TestAdmin.java:1974){code}

Likely root cause of error:
{code}Exception in SnapshotSubprocedurePool
java.util.concurrent.ExecutionException: 
org.apache.hadoop.hbase.NotServingRegionException: 
changeSchemaDuringSnapshot1386806258640,,1386806258720.ea776db51749e39c956d771a7d17a0f3.
 is closing
at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
at java.util.concurrent.FutureTask.get(FutureTask.java:83)
at 
org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedurePool.waitForOutstandingTasks(RegionServerSnapshotManager.java:314)
at 
org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.flushSnapshot(FlushSnapshotSubprocedure.java:118)
at 
org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.insideBarrier(FlushSnapshotSubprocedure.java:137)
at 
org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:181)
at 
org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.hadoop.hbase.NotServingRegionException: 
changeSchemaDuringSnapshot1386806258640,,1386806258720.ea776db51749e39c956d771a7d17a0f3.
 is closing
at 
org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5327)
at 
org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5289)
at 
org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:79)
at 
org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
... 5 more{code}




[jira] [Updated] (HBASE-10136) [Online Schema Change]: Online Schema Change on a table conflicts with snapshot attempt on the table

2013-12-11 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-10136:
--

Labels: online_schema_change  (was: )

 [Online Schema Change]: Online Schema Change on a table conflicts with 
 snapshot attempt on the table
 

 Key: HBASE-10136
 URL: https://issues.apache.org/jira/browse/HBASE-10136
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.96.0, 0.98.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
  Labels: online_schema_change


[jira] [Assigned] (HBASE-10136) [Online Schema Change]: Online Schema Change on a table conflicts with snapshot attempt on the table

2013-12-11 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman reassigned HBASE-10136:
-

Assignee: Matteo Bertozzi  (was: Aleksandr Shulman)

 [Online Schema Change]: Online Schema Change on a table conflicts with 
 snapshot attempt on the table
 

 Key: HBASE-10136
 URL: https://issues.apache.org/jira/browse/HBASE-10136
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.96.0, 0.98.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Matteo Bertozzi
  Labels: online_schema_change


[jira] [Updated] (HBASE-10136) [Online Schema Change]: Online Schema Change on a table conflicts with concurrent snapshot attempt on the table

2013-12-11 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-10136:
--

Summary: [Online Schema Change]: Online Schema Change on a table conflicts 
with concurrent snapshot attempt on the table  (was: [Online Schema Change]: 
Online Schema Change on a table conflicts with snapshot attempt on the table)

 [Online Schema Change]: Online Schema Change on a table conflicts with 
 concurrent snapshot attempt on the table
 ---

 Key: HBASE-10136
 URL: https://issues.apache.org/jira/browse/HBASE-10136
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.96.0, 0.98.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Matteo Bertozzi
  Labels: online_schema_change


[jira] [Commented] (HBASE-10136) [Online Schema Change]: Online Schema Change on a table conflicts with concurrent snapshot attempt on the table

2013-12-11 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13845934#comment-13845934
 ] 

Aleksandr Shulman commented on HBASE-10136:
---

A potential solution might be table locking:
With the table lock, we would expect the modifyTable to wait for the snapshot 
to complete, or the snapshot to wait for the modifyTable to complete.
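
A minimal sketch of that idea, assuming one in-process lock per table name. This is not HBase's table lock implementation; the class TableLockSketch and its runExclusive method are hypothetical and only show the mutual exclusion between two operations on the same table.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of per-table mutual exclusion; not the HBase table lock.
final class TableLockSketch {
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    /**
     * Runs op while holding the lock for the given table.
     * Returns false without running op if the lock cannot be acquired within
     * timeoutMs or if the calling thread is interrupted while waiting.
     */
    boolean runExclusive(String table, long timeoutMs, Runnable op) {
        ReentrantLock lock = locks.computeIfAbsent(table, t -> new ReentrantLock());
        try {
            if (!lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                return false; // another operation on this table is still running
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        try {
            op.run();
            return true;
        } finally {
            lock.unlock();
        }
    }
}
```

In this sketch a snapshot and a modifyTable on the same table would each be wrapped in runExclusive, so whichever starts second waits for the first to finish or gives up after its timeout. Operations on different tables do not block each other.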

 [Online Schema Change]: Online Schema Change on a table conflicts with 
 concurrent snapshot attempt on the table
 ---

 Key: HBASE-10136
 URL: https://issues.apache.org/jira/browse/HBASE-10136
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.96.0, 0.98.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Matteo Bertozzi
  Labels: online_schema_change


[jira] [Commented] (HBASE-10136) [Online Schema Change]: Online Schema Change on a table conflicts with concurrent snapshot attempt on the table

2013-12-11 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13845945#comment-13845945
 ] 

Aleksandr Shulman commented on HBASE-10136:
---

Sorry, yes, I should have phrased it as:
{quote}
A user can issue a request for a snapshot of a table while that table is 
undergoing an online schema change and expect that snapshot request to complete 
correctly. Also, the same is true if a user issues an online schema change 
request while a snapshot attempt is ongoing.{quote}

 [Online Schema Change]: Online Schema Change on a table conflicts with 
 concurrent snapshot attempt on the table
 ---

 Key: HBASE-10136
 URL: https://issues.apache.org/jira/browse/HBASE-10136
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.96.0, 0.98.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Matteo Bertozzi
  Labels: online_schema_change

 Expected behavior:
 A user can take a snapshot of a table while that table is undergoing an 
 online schema change.
 Observed behavior:
 Snapshot attempts time out when there is an ongoing online schema change 
 because the region is closed and opened during the snapshot. 
 As a side-note, I would expect that the attempt should fail quickly as 
 opposed to timing out. 
 Further, what I have seen is that subsequent attempts to snapshot the table 
 fail because of some state/cleanup issues. This is also concerning.
 Immediate error:
 {code}type=FLUSH }' is still in progress!
 2013-12-11 15:58:32,883 DEBUG [Thread-385] client.HBaseAdmin(2696): (#11) 
 Sleeping: 1ms while waiting for snapshot completion.
 2013-12-11 15:58:42,884 DEBUG [Thread-385] client.HBaseAdmin(2704): Getting 
 current status of snapshot from master...
 2013-12-11 15:58:42,887 DEBUG [FifoRpcScheduler.handler1-thread-3] 
 master.HMaster(2891): Checking to see if snapshot from request:{ ss=snapshot0 
 table=changeSchemaDuringSnapshot1386806258640 type=FLUSH } is done
 2013-12-11 15:58:42,887 DEBUG [FifoRpcScheduler.handler1-thread-3] 
 snapshot.SnapshotManager(374): Snapshoting '{ ss=snapshot0 
 table=changeSchemaDuringSnapshot1386806258640 type=FLUSH }' is still in 
 progress!
 Snapshot failure occurred
 org.apache.hadoop.hbase.snapshot.SnapshotCreationException: Snapshot 
 'snapshot0' wasn't completed in expectedTime:6 ms
   at 
 org.apache.hadoop.hbase.client.HBaseAdmin.snapshot(HBaseAdmin.java:2713)
   at 
 org.apache.hadoop.hbase.client.HBaseAdmin.snapshot(HBaseAdmin.java:2638)
   at 
 org.apache.hadoop.hbase.client.HBaseAdmin.snapshot(HBaseAdmin.java:2602)
   at 
 org.apache.hadoop.hbase.client.TestAdmin$BackgroundSnapshotThread.run(TestAdmin.java:1974){code}
 Likely root cause of error:
 {code}Exception in SnapshotSubprocedurePool
 java.util.concurrent.ExecutionException: 
 org.apache.hadoop.hbase.NotServingRegionException: 
 changeSchemaDuringSnapshot1386806258640,,1386806258720.ea776db51749e39c956d771a7d17a0f3.
  is closing
   at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
   at java.util.concurrent.FutureTask.get(FutureTask.java:83)
   at 
 org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedurePool.waitForOutstandingTasks(RegionServerSnapshotManager.java:314)
   at 
 org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.flushSnapshot(FlushSnapshotSubprocedure.java:118)
   at 
 org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.insideBarrier(FlushSnapshotSubprocedure.java:137)
   at 
 org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:181)
   at 
 org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:1)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.hadoop.hbase.NotServingRegionException: 
 changeSchemaDuringSnapshot1386806258640,,1386806258720.ea776db51749e39c956d771a7d17a0f3.
  is closing
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5327)
   at 
 org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5289)
   at 
 org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:79)
   at

[jira] [Updated] (HBASE-10136) [Online Schema Change]: Online Schema Change on a table conflicts with concurrent snapshot attempt on the table

2013-12-11 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-10136:
--

Description: 
Expected behavior:
A user can issue a request for a snapshot of a table while that table is 
undergoing an online schema change and expect that snapshot request to complete 
correctly. Also, the same is true if a user issues an online schema change 
request while a snapshot attempt is ongoing.

Observed behavior:
Snapshot attempts time out when there is an ongoing online schema change 
because the region is closed and opened during the snapshot. 

As a side note, I would expect the attempt to fail quickly rather than wait 
for the timeout. 
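
To illustrate the fail-fast expectation, here is a minimal sketch in Python (it is not HBase code). It assumes a hypothetical get_status callable that returns "DONE", "FAILED", or "IN_PROGRESS"; the names wait_for_snapshot, SnapshotFailed, and SnapshotTimedOut are made up for this example. The only point is that a poller can surface a failure as soon as it is reported instead of sleeping until the deadline.

```python
import time


class SnapshotFailed(Exception):
    """Raised as soon as the (hypothetical) status source reports a failure."""


class SnapshotTimedOut(Exception):
    """Raised only when the deadline passes with the snapshot still in progress."""


def wait_for_snapshot(get_status, timeout_s, poll_interval_s=0.01,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() until it reports DONE, FAILED, or the deadline passes.

    get_status is an assumed callable returning one of the strings
    "DONE", "FAILED", or "IN_PROGRESS".
    """
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == "DONE":
            return
        if status == "FAILED":
            # Fail fast: do not keep sleeping until the timeout expires.
            raise SnapshotFailed("snapshot failed; not waiting for the timeout")
        if clock() >= deadline:
            raise SnapshotTimedOut(
                "snapshot still in progress after %.2fs" % timeout_s)
        sleep(poll_interval_s)
```

With a status source that reports a failure on the second poll, the caller sees SnapshotFailed after two polls, even if the timeout is 60 seconds.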

Further, what I have seen is that subsequent attempts to snapshot the table 
fail because of some state/cleanup issues. This is also concerning.

Immediate error:
{code}type=FLUSH }' is still in progress!
2013-12-11 15:58:32,883 DEBUG [Thread-385] client.HBaseAdmin(2696): (#11) Sleeping: 1ms while waiting for snapshot completion.
2013-12-11 15:58:42,884 DEBUG [Thread-385] client.HBaseAdmin(2704): Getting current status of snapshot from master...
2013-12-11 15:58:42,887 DEBUG [FifoRpcScheduler.handler1-thread-3] master.HMaster(2891): Checking to see if snapshot from request:{ ss=snapshot0 table=changeSchemaDuringSnapshot1386806258640 type=FLUSH } is done
2013-12-11 15:58:42,887 DEBUG [FifoRpcScheduler.handler1-thread-3] snapshot.SnapshotManager(374): Snapshoting '{ ss=snapshot0 table=changeSchemaDuringSnapshot1386806258640 type=FLUSH }' is still in progress!
Snapshot failure occurred
org.apache.hadoop.hbase.snapshot.SnapshotCreationException: Snapshot 'snapshot0' wasn't completed in expectedTime:6 ms
    at org.apache.hadoop.hbase.client.HBaseAdmin.snapshot(HBaseAdmin.java:2713)
    at org.apache.hadoop.hbase.client.HBaseAdmin.snapshot(HBaseAdmin.java:2638)
    at org.apache.hadoop.hbase.client.HBaseAdmin.snapshot(HBaseAdmin.java:2602)
    at org.apache.hadoop.hbase.client.TestAdmin$BackgroundSnapshotThread.run(TestAdmin.java:1974){code}

Likely root cause of error:
{code}Exception in SnapshotSubprocedurePool
java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.NotServingRegionException: changeSchemaDuringSnapshot1386806258640,,1386806258720.ea776db51749e39c956d771a7d17a0f3. is closing
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
    at java.util.concurrent.FutureTask.get(FutureTask.java:83)
    at org.apache.hadoop.hbase.regionserver.snapshot.RegionServerSnapshotManager$SnapshotSubprocedurePool.waitForOutstandingTasks(RegionServerSnapshotManager.java:314)
    at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.flushSnapshot(FlushSnapshotSubprocedure.java:118)
    at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure.insideBarrier(FlushSnapshotSubprocedure.java:137)
    at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:181)
    at org.apache.hadoop.hbase.procedure.Subprocedure.call(Subprocedure.java:1)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.hadoop.hbase.NotServingRegionException: changeSchemaDuringSnapshot1386806258640,,1386806258720.ea776db51749e39c956d771a7d17a0f3. is closing
    at org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5327)
    at org.apache.hadoop.hbase.regionserver.HRegion.startRegionOperation(HRegion.java:5289)
    at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:79)
    at org.apache.hadoop.hbase.regionserver.snapshot.FlushSnapshotSubprocedure$RegionSnapshotTask.call(FlushSnapshotSubprocedure.java:1)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    ... 5 more{code}

  was:
Expected behavior:
A user can take a snapshot of a table while that table is undergoing an online 
schema change.

Observed behavior:
Snapshot attempts time out when there is an ongoing online schema change 
because the region is closed and opened during the snapshot. 

As a side-note, I would expect that the attempt should fail quickly as opposed 
to timing out. 

Further, what I have seen is that subsequent attempts to snapshot the table 
fail because of some state/cleanup issues. This is

[jira] [Updated] (HBASE-10136) Alter table conflicts with concurrent snapshot attempt on that table

2013-12-11 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-10136:
--

Summary: Alter table conflicts with concurrent snapshot attempt on that 
table  (was: [Online Schema Change]: Online Schema Change on a table conflicts 
with concurrent snapshot attempt on the table)

 Alter table conflicts with concurrent snapshot attempt on that table
 

 Key: HBASE-10136
 URL: https://issues.apache.org/jira/browse/HBASE-10136
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.96.0, 0.98.1, 0.99.0
Reporter: Aleksandr Shulman
Assignee: Matteo Bertozzi
  Labels: online_schema_change


[jira] [Commented] (HBASE-9966) Create IntegrationTest for Online Bloom Filter Change

2013-12-10 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13844770#comment-13844770
 ] 

Aleksandr Shulman commented on HBASE-9966:
--

Thanks Andrew!

 Create IntegrationTest for Online Bloom Filter Change
 -

 Key: HBASE-9966
 URL: https://issues.apache.org/jira/browse/HBASE-9966
 Project: HBase
  Issue Type: Sub-task
  Components: HFile, test
Affects Versions: 0.98.0, 0.96.1
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
  Labels: online_schema_change
 Fix For: 0.96.1, 0.98.1, 0.99.0

 Attachments: HBASE-9966-96.patch, HBASE-9966-98.patch, 
 HBASE-9966-trunk.patch


 For online schema change, a user is perfectly within her rights to modify the 
 compression algorithm used, or the bloom filter.
 Therefore, we should add these actions to our ChaosMonkey tests to ensure 
 that they do not introduce instability.
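
As a rough illustration of what such a chaos action might do, here is a minimal Python sketch (not the ChaosMonkey code from the attached patches). It treats a column family's settings as a plain dict and picks one attribute, bloom filter or compression, to change to a different value. The function name pick_schema_change and the dict layout are hypothetical, and the value lists are assumptions modeled on HBase's bloom filter types and compression algorithms.

```python
import random

# Assumed value sets, modeled on HBase's BloomType and Compression.Algorithm.
BLOOM_TYPES = ("NONE", "ROW", "ROWCOL")
COMPRESSION_ALGORITHMS = ("NONE", "GZ", "SNAPPY", "LZO", "LZ4")


def pick_schema_change(current, rng=random):
    """Return a copy of `current` with exactly one attribute changed.

    `current` is a hypothetical dict with the keys "bloom_filter" and
    "compression". `rng` is anything with a choice() method, so a seeded
    random.Random can be passed in for reproducible runs.
    """
    attribute = rng.choice(("bloom_filter", "compression"))
    if attribute == "bloom_filter":
        choices = BLOOM_TYPES
    else:
        choices = COMPRESSION_ALGORITHMS
    # Exclude the current value so every action is a real schema change.
    candidates = [value for value in choices if value != current[attribute]]
    changed = dict(current)
    changed[attribute] = rng.choice(candidates)
    return changed
```

A real action would then apply the chosen setting through the admin API and let the test verify that reads and writes stay healthy while regions reopen.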



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

[jira] [Commented] (HBASE-9966) Create IntegrationTest for Online Bloom Filter and Compression Algorithm Change

2013-12-09 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13843699#comment-13843699
 ] 

Aleksandr Shulman commented on HBASE-9966:
--

Thanks for taking a look. Would you like me to create a formal code review or 
is this enough?
Also, I may be adding some more online schema change monkeys, but I'll file a 
separate jira for that.

 Create IntegrationTest for Online Bloom Filter and Compression Algorithm 
 Change
 ---

 Key: HBASE-9966
 URL: https://issues.apache.org/jira/browse/HBASE-9966
 Project: HBase
  Issue Type: Sub-task
  Components: HFile, test
Affects Versions: 0.98.0, 0.96.1
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
  Labels: online_schema_change
 Fix For: 0.96.1, 0.98.1, 0.99.0

 Attachments: HBASE-9966-96.patch, HBASE-9966-98.patch, 
 HBASE-9966-trunk.patch


 For online schema change, a user is perfectly within her rights to modify the 
 compression algorithm used, or the bloom filter.
 Therefore, we should add these actions to our ChaosMonkey tests to ensure 
 that they do not introduce instability.




[jira] [Updated] (HBASE-9966) Create IntegrationTest for Online Bloom Filter and Compression Algorithm Change

2013-12-07 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-9966:
-

Fix Version/s: (was: 0.95.2)
   0.99.0
   0.98.1
   0.96.1

 Create IntegrationTest for Online Bloom Filter and Compression Algorithm 
 Change
 ---

 Key: HBASE-9966
 URL: https://issues.apache.org/jira/browse/HBASE-9966
 Project: HBase
  Issue Type: Sub-task
  Components: HFile, test
Affects Versions: 0.98.0, 0.96.1
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
 Fix For: 0.96.1, 0.98.1, 0.99.0


 For online schema change, a user is perfectly within her rights to modify the 
 compression algorithm used, or the bloom filter.
 Therefore, we should add these actions to our ChaosMonkey tests to ensure 
 that they do not introduce instability.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

[jira] [Updated] (HBASE-9966) Create IntegrationTest for Online Bloom Filter and Compression Algorithm Change

2013-12-07 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-9966:
-

Attachment: HBASE-9966-96.patch
HBASE-9966-98.patch
HBASE-9966-trunk.patch

The patch ends up being the same for trunk, 0.98, and 0.96.

 Create IntegrationTest for Online Bloom Filter and Compression Algorithm 
 Change
 ---

 Key: HBASE-9966
 URL: https://issues.apache.org/jira/browse/HBASE-9966
 Project: HBase
  Issue Type: Sub-task
  Components: HFile, test
Affects Versions: 0.98.0, 0.96.1
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
 Fix For: 0.96.1, 0.98.1, 0.99.0

 Attachments: HBASE-9966-96.patch, HBASE-9966-98.patch, 
 HBASE-9966-trunk.patch


 For online schema change, a user is perfectly within her rights to modify the 
 compression algorithm used, or the bloom filter.
 Therefore, we should add these actions to our ChaosMonkey tests to ensure 
 that they do not introduce instability.




[jira] [Updated] (HBASE-9966) Create IntegrationTest for Online Bloom Filter and Compression Algorithm Change

2013-12-07 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-9966:
-

Labels: online_schema_change  (was: )

 Create IntegrationTest for Online Bloom Filter and Compression Algorithm 
 Change
 ---

 Key: HBASE-9966
 URL: https://issues.apache.org/jira/browse/HBASE-9966
 Project: HBase
  Issue Type: Sub-task
  Components: HFile, test
Affects Versions: 0.98.0, 0.96.1
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
  Labels: online_schema_change
 Fix For: 0.96.1, 0.98.1, 0.99.0

 Attachments: HBASE-9966-96.patch, HBASE-9966-98.patch, 
 HBASE-9966-trunk.patch


 For online schema change, a user is perfectly within her rights to modify the 
 compression algorithm used, or the bloom filter.
 Therefore, we should add these actions to our ChaosMonkey tests to ensure 
 that they do not introduce instability.




[jira] [Updated] (HBASE-10073) [Hadoop1]: hbase zkcli broken due to slf4j incompatibility

2013-12-05 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-10073:
--

Assignee: Andrew Purtell  (was: Matteo Bertozzi)

 [Hadoop1]: hbase zkcli broken due to slf4j incompatibility
 --

 Key: HBASE-10073
 URL: https://issues.apache.org/jira/browse/HBASE-10073
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Affects Versions: 0.96.1
 Environment: Centos6, sun-jdk-64bit-1.7.0.25
Reporter: Aleksandr Shulman
Assignee: Andrew Purtell

 Observed behavior:
 In my automation, I have a call to hbase zkcli. That call recently broke with 
 this checkin: 
 https://github.com/apache/hbase/commit/5af0a60efed91ac2084f25f13edb21db0f510e7c
 The error that is reported is:
 {code}++ ./hbase zkcli
 11:19:58  Warning: $HADOOP_HOME is deprecated.
 11:19:58  
 11:20:00  Exception in thread "main" java.lang.IllegalAccessError: tried to access field org.slf4j.impl.StaticLoggerBinder.SINGLETON from class org.slf4j.LoggerFactory
 11:20:00      at org.slf4j.LoggerFactory.<clinit>(LoggerFactory.java:60)
 11:20:00      at org.apache.zookeeper.ZooKeeperMain.<clinit>(ZooKeeperMain.java:50)
 11:20:00      at org.apache.hadoop.hbase.zookeeper.ZooKeeperMainServer.main(ZooKeeperMainServer.java:78)
 11:20:00  Build step 'Execute shell' marked build as failure{code}
 That said, this checkin is perfectly valid as each component should be 
 allowed to specify its own dependencies.
 The issue is a deeper one of dependency mismatches.
 Note: This issue only affects hadoop1, not hadoop2. It also appears in trunk, 
 where there is a similar checkin, but since trunk is not required to work 
 against hadoop1, this is not an issue for trunk.




[jira] [Commented] (HBASE-10073) [Hadoop1]: hbase zkcli broken due to slf4j incompatibility

2013-12-05 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13840904#comment-13840904
 ] 

Aleksandr Shulman commented on HBASE-10073:
---

Assigned to Andrew Purtell to have a look, since he is the author of the patch. 
The patch itself is not incorrect, but reveals a larger issue of maintaining 
compatibility and harmony among the dependencies.

 [Hadoop1]: hbase zkcli broken due to slf4j incompatibility
 --

 Key: HBASE-10073
 URL: https://issues.apache.org/jira/browse/HBASE-10073
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Affects Versions: 0.96.1
 Environment: Centos6, sun-jdk-64bit-1.7.0.25
Reporter: Aleksandr Shulman
Assignee: Andrew Purtell





[jira] [Commented] (HBASE-10073) [Hadoop1]: hbase zkcli broken due to slf4j incompatibility

2013-12-05 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841014#comment-13841014
 ] 

Aleksandr Shulman commented on HBASE-10073:
---

I think the problem runs a little deeper. If there are two versions of slf4j 
(or any dependency for that matter) on the classpath, then something will be 
affected by a resulting incompatibility. Because of changes in the ordering of 
the classpath in 0.96, I had to make some changes to my own setup scripts. Both 
configurations could reasonably be considered representative of what a typical 
user would do. 
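
One way to spot this kind of mismatch is to scan the classpath for the same artifact at more than one version. Below is a minimal Python sketch of that check. The function name conflicting_versions, the jar-name pattern, and the paths and version numbers in the usage note are all assumptions for illustration; jars that do not follow the artifact-version.jar naming are simply ignored.

```python
import os
import re
from collections import defaultdict

# Assumed naming convention: <artifact>-<version>.jar, version starts with a digit.
JAR_NAME = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def conflicting_versions(classpath, separator=os.pathsep):
    """Map each artifact that appears at several versions to those versions.

    Versions are listed in classpath order, which matters because the first
    matching jar on the classpath is normally the one the JVM loads from.
    """
    versions = defaultdict(list)
    for entry in classpath.split(separator):
        match = JAR_NAME.match(os.path.basename(entry))
        if not match:
            continue  # directories, wildcards, or unversioned jars
        artifact, version = match.group("artifact"), match.group("version")
        if version not in versions[artifact]:
            versions[artifact].append(version)
    return {a: v for a, v in versions.items() if len(v) > 1}
```

For example, with a made-up classpath such as "/opt/hbase/lib/slf4j-api-1.6.4.jar:/opt/hadoop/lib/slf4j-api-1.4.3.jar", the function reports slf4j-api with both versions, in the order they appear on the classpath.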

Before my change, I reported this issue.
After the change, hbase zkcli works fine, but there is a similar error now when 
starting master (much worse!). The revert you are suggesting may fix it though. 
I'll test it.

Here's the master startup error:
{code}21:53:54  2013-12-05 21:53:40,596 INFO  [main] impl.MetricsSourceAdapter: MBean for source jvm registered.
21:53:54  2013-12-05 21:53:40,604 INFO  [main] impl.MetricsSourceAdapter: MBean for source IPC,sub=IPC registered.
21:53:54  2013-12-05 21:53:41,250 INFO  [main] impl.MetricsSourceAdapter: MBean for source ugi registered.
21:53:54  2013-12-05 21:53:41,628 INFO  [main] master.HMaster: hbase.rootdir=hdfs://snapshot-tarball-vm-6.ent.cloudera.com:8020/hbase, hbase.cluster.distributed=true
21:53:54  2013-12-05 21:53:41,758 ERROR [main] master.HMasterCommandLine: Master exiting
21:53:54  java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster
21:53:54      at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2779)
21:53:54      at org.apache.hadoop.hbase.master.HMasterCommandLine.startMaster(HMasterCommandLine.java:184)
21:53:54      at org.apache.hadoop.hbase.master.HMasterCommandLine.run(HMasterCommandLine.java:134)
21:53:54      at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
21:53:54      at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
21:53:54      at org.apache.hadoop.hbase.master.HMaster.main(HMaster.java:2793)
21:53:54  Caused by: java.lang.IllegalAccessError: tried to access field org.slf4j.impl.StaticLoggerBinder.SINGLETON from class org.slf4j.LoggerFactory
21:53:54      at org.slf4j.LoggerFactory.<clinit>(LoggerFactory.java:60)
21:53:54      at org.apache.zookeeper.ZooKeeper.<clinit>(ZooKeeper.java:94)
21:53:54      at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.<init>(RecoverableZooKeeper.java:112)
21:53:54      at org.apache.hadoop.hbase.zookeeper.ZKUtil.connect(ZKUtil.java:132)
21:53:54      at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:165)
21:53:54      at org.apache.hadoop.hbase.master.HMaster.<init>(HMaster.java:472)
21:53:54      at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
21:53:54      at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
21:53:54      at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
21:53:54      at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
21:53:54      at org.apache.hadoop.hbase.master.HMaster.constructMaster(HMaster.java:2774)
21:53:54      ... 5 more{code}

 [Hadoop1]: hbase zkcli broken due to slf4j incompatibility
 --

 Key: HBASE-10073
 URL: https://issues.apache.org/jira/browse/HBASE-10073
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Affects Versions: 0.96.1
 Environment: Centos6, sun-jdk-64bit-1.7.0.25
Reporter: Aleksandr Shulman
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.96.1, 0.99.0

 Attachments: 10073-0.96.patch



[jira] [Commented] (HBASE-10073) [Hadoop1]: hbase zkcli broken due to slf4j incompatibility

2013-12-05 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841032#comment-13841032
 ] 

Aleksandr Shulman commented on HBASE-10073:
---

{quote}The revert you are suggesting may fix it though. I'll test it.{quote}
As expected, even with the different classpath, when applying your revert, 
everything builds and runs correctly.

 [Hadoop1]: hbase zkcli broken due to slf4j incompatibility
 --

 Key: HBASE-10073
 URL: https://issues.apache.org/jira/browse/HBASE-10073
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Affects Versions: 0.96.1
 Environment: Centos6, sun-jdk-64bit-1.7.0.25
Reporter: Aleksandr Shulman
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.96.1, 0.99.0

 Attachments: 10073-0.96.patch






[jira] [Commented] (HBASE-10073) Revert HBASE-9718 (Add a test scope dependency on org.slf4j:slf4j-api to hbase-client)

2013-12-05 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13841067#comment-13841067
 ] 

Aleksandr Shulman commented on HBASE-10073:
---

Thanks Andrew!

 Revert HBASE-9718 (Add a test scope dependency on org.slf4j:slf4j-api to 
 hbase-client)
 --

 Key: HBASE-10073
 URL: https://issues.apache.org/jira/browse/HBASE-10073
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Affects Versions: 0.96.1
 Environment: Centos6, sun-jdk-64bit-1.7.0.25
Reporter: Aleksandr Shulman
Assignee: Andrew Purtell
 Fix For: 0.98.0, 0.96.1, 0.99.0

 Attachments: 10073.patch





--
This message was sent by Atlassian JIRA
(v6.1#6144)

[jira] [Created] (HBASE-10073) [Hadoop1]: hbase zkcli broken due to slf4j incompatibility

2013-12-03 Thread Aleksandr Shulman (JIRA)

Aleksandr Shulman created HBASE-10073:
-

 Summary: [Hadoop1]: hbase zkcli broken due to slf4j incompatibility
 Key: HBASE-10073
 URL: https://issues.apache.org/jira/browse/HBASE-10073
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Affects Versions: 0.96.1
 Environment: Centos6, sun-jdk-64bit-1.7.0.25
Reporter: Aleksandr Shulman


Observed behavior:
In my automation, I have a call to hbase zkcli. That call recently broke with 
this checkin: 
https://github.com/apache/hbase/commit/5af0a60efed91ac2084f25f13edb21db0f510e7c

The error that is reported is:
{code}++ ./hbase zkcli
11:19:58  Warning: $HADOOP_HOME is deprecated.
11:19:58  
11:20:00  Exception in thread "main" java.lang.IllegalAccessError: tried to 
access field org.slf4j.impl.StaticLoggerBinder.SINGLETON from class 
org.slf4j.LoggerFactory
11:20:00  at org.slf4j.LoggerFactory.<clinit>(LoggerFactory.java:60)
11:20:00  at 
org.apache.zookeeper.ZooKeeperMain.<clinit>(ZooKeeperMain.java:50)
11:20:00  at 
org.apache.hadoop.hbase.zookeeper.ZooKeeperMainServer.main(ZooKeeperMainServer.java:78)
11:20:00  Build step 'Execute shell' marked build as failure{code}

That said, this checkin is perfectly valid as each component should be allowed 
to specify its own dependencies.

The issue is a deeper one of dependency mismatches.
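
One way to spot this kind of mismatch before running zkcli is to list the slf4j jars on the effective classpath and compare their versions. The following is only a rough sketch: the helper names, jar paths, and version numbers are made up for illustration and are not taken from the HBase or Hadoop builds.

```python
import re
from collections import defaultdict

# Matches jar names such as slf4j-api-1.6.4.jar or slf4j-log4j12-1.4.3.jar.
SLF4J_JAR = re.compile(r"(slf4j-[a-z0-9]+)-(\d+(?:\.\d+)*)\.jar$")


def slf4j_versions(classpath):
    """Map each slf4j artifact on a ':'-separated classpath to its versions."""
    found = defaultdict(set)
    for entry in classpath.split(":"):
        match = SLF4J_JAR.search(entry)
        if match:
            found[match.group(1)].add(match.group(2))
    return dict(found)


def has_mismatch(classpath):
    """True when more than one distinct slf4j version is on the classpath."""
    versions = set()
    for artifact_versions in slf4j_versions(classpath).values():
        versions |= artifact_versions
    return len(versions) > 1
```

Feeding it the classpath string the launcher script assembles would flag an slf4j-api jar paired with a binding jar of a different version, which is the usual cause of an IllegalAccessError on StaticLoggerBinder.SINGLETON.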




[jira] [Updated] (HBASE-10073) [Hadoop1]: hbase zkcli broken due to slf4j incompatibility

2013-12-03 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-10073:
--

Assignee: Matteo Bertozzi

 [Hadoop1]: hbase zkcli broken due to slf4j incompatibility
 --

 Key: HBASE-10073
 URL: https://issues.apache.org/jira/browse/HBASE-10073
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Affects Versions: 0.96.1
 Environment: Centos6, sun-jdk-64bit-1.7.0.25
Reporter: Aleksandr Shulman
Assignee: Matteo Bertozzi

 Observed behavior:
 In my automation, I have a call to hbase zkcli. That call recently broke with 
 this checkin: 
 https://github.com/apache/hbase/commit/5af0a60efed91ac2084f25f13edb21db0f510e7c
 The error that is reported is:
 {code}++ ./hbase zkcli
 11:19:58  Warning: $HADOOP_HOME is deprecated.
 11:19:58  
 11:20:00  Exception in thread "main" java.lang.IllegalAccessError: tried to 
 access field org.slf4j.impl.StaticLoggerBinder.SINGLETON from class 
 org.slf4j.LoggerFactory
 11:20:00  at org.slf4j.LoggerFactory.<clinit>(LoggerFactory.java:60)
 11:20:00  at 
 org.apache.zookeeper.ZooKeeperMain.<clinit>(ZooKeeperMain.java:50)
 11:20:00  at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperMainServer.main(ZooKeeperMainServer.java:78)
 11:20:00  Build step 'Execute shell' marked build as failure{code}
 That said, this checkin is perfectly valid as each component should be 
 allowed to specify its own dependencies.
 The issue is a deeper one of dependency mismatches.
 Note: This issue only affects hadoop1, not hadoop2. It also appears in trunk, 
 where there is a similar checkin, but since trunk is not required to work 
 against hadoop1, this is not an issue for trunk.




[jira] [Updated] (HBASE-10073) [Hadoop1]: hbase zkcli broken due to slf4j incompatibility

2013-12-03 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-10073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-10073:
--

Description: 
Observed behavior:
In my automation, I have a call to hbase zkcli. That call recently broke with 
this checkin: 
https://github.com/apache/hbase/commit/5af0a60efed91ac2084f25f13edb21db0f510e7c

The error that is reported is:
{code}++ ./hbase zkcli
11:19:58  Warning: $HADOOP_HOME is deprecated.
11:19:58  
11:20:00  Exception in thread "main" java.lang.IllegalAccessError: tried to 
access field org.slf4j.impl.StaticLoggerBinder.SINGLETON from class 
org.slf4j.LoggerFactory
11:20:00  at org.slf4j.LoggerFactory.<clinit>(LoggerFactory.java:60)
11:20:00  at 
org.apache.zookeeper.ZooKeeperMain.<clinit>(ZooKeeperMain.java:50)
11:20:00  at 
org.apache.hadoop.hbase.zookeeper.ZooKeeperMainServer.main(ZooKeeperMainServer.java:78)
11:20:00  Build step 'Execute shell' marked build as failure{code}

That said, this checkin is perfectly valid as each component should be allowed 
to specify its own dependencies.

The issue is a deeper one of dependency mismatches.

Note: This issue only affects hadoop1, not hadoop2. It also appears in trunk, 
where there is a similar checkin, but since trunk is not required to work 
against hadoop1, this is not an issue for trunk.

  was:
Observed behavior:
In my automation, I have a call to hbase zkcli. That call recently broke with 
this checkin: 
https://github.com/apache/hbase/commit/5af0a60efed91ac2084f25f13edb21db0f510e7c

The error that is reported is:
{code}++ ./hbase zkcli
11:19:58  Warning: $HADOOP_HOME is deprecated.
11:19:58  
11:20:00  Exception in thread "main" java.lang.IllegalAccessError: tried to 
access field org.slf4j.impl.StaticLoggerBinder.SINGLETON from class 
org.slf4j.LoggerFactory
11:20:00  at org.slf4j.LoggerFactory.<clinit>(LoggerFactory.java:60)
11:20:00  at 
org.apache.zookeeper.ZooKeeperMain.<clinit>(ZooKeeperMain.java:50)
11:20:00  at 
org.apache.hadoop.hbase.zookeeper.ZooKeeperMainServer.main(ZooKeeperMainServer.java:78)
11:20:00  Build step 'Execute shell' marked build as failure{code}

That said, this checkin is perfectly valid as each component should be allowed 
to specify its own dependencies.

The issue is a deeper one of dependency mismatches.


 [Hadoop1]: hbase zkcli broken due to slf4j incompatibility
 --

 Key: HBASE-10073
 URL: https://issues.apache.org/jira/browse/HBASE-10073
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Affects Versions: 0.96.1
 Environment: Centos6, sun-jdk-64bit-1.7.0.25
Reporter: Aleksandr Shulman

 Observed behavior:
 In my automation, I have a call to hbase zkcli. That call recently broke with 
 this checkin: 
 https://github.com/apache/hbase/commit/5af0a60efed91ac2084f25f13edb21db0f510e7c
 The error that is reported is:
 {code}++ ./hbase zkcli
 11:19:58  Warning: $HADOOP_HOME is deprecated.
 11:19:58  
 11:20:00  Exception in thread "main" java.lang.IllegalAccessError: tried to 
 access field org.slf4j.impl.StaticLoggerBinder.SINGLETON from class 
 org.slf4j.LoggerFactory
 11:20:00  at org.slf4j.LoggerFactory.<clinit>(LoggerFactory.java:60)
 11:20:00  at 
 org.apache.zookeeper.ZooKeeperMain.<clinit>(ZooKeeperMain.java:50)
 11:20:00  at 
 org.apache.hadoop.hbase.zookeeper.ZooKeeperMainServer.main(ZooKeeperMainServer.java:78)
 11:20:00  Build step 'Execute shell' marked build as failure{code}
 That said, this checkin is perfectly valid as each component should be 
 allowed to specify its own dependencies.
 The issue is a deeper one of dependency mismatches.
 Note: This issue only affects hadoop1, not hadoop2. It also appears in trunk, 
 where there is a similar checkin, but since trunk is not required to work 
 against hadoop1, this is not an issue for trunk.




[jira] [Commented] (HBASE-9980) [0.94] Wire compatibility test for 0.94

2013-11-21 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-9980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13829353#comment-13829353
 ] 

Aleksandr Shulman commented on HBASE-9980:
--

Thanks [~andrew.purt...@gmail.com] for the explanation. Makes sense.

 [0.94] Wire compatibility test for 0.94
 ---

 Key: HBASE-9980
 URL: https://issues.apache.org/jira/browse/HBASE-9980
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.13
Reporter: Lars Hofhansl
Assignee: Andrew Purtell

 See HBASE-9834.
 We should have a test that:
 # generates a file with all kinds of objects serialized into it. Save that 
 file as part of the HBase tests
 # a test can then read the objects back from that file
 # a test can regenerate that file
 If both tests pass we can be reasonably sure that neither readFields nor 
 write was changed in an incompatible way.
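
The three steps above amount to a golden-file test. A minimal sketch of the idea follows, assuming a toy record format in place of HBase's Writable classes: write_record and read_record are hypothetical stand-ins for write and readFields, and the sample records are invented.

```python
import io
import struct


def write_record(out, key, timestamp):
    """Serialize one record: 2-byte key length, key bytes, 8-byte timestamp."""
    data = key.encode("utf-8")
    out.write(struct.pack(">H", len(data)))
    out.write(data)
    out.write(struct.pack(">q", timestamp))


def read_record(inp):
    """Inverse of write_record; returns None at end of input."""
    header = inp.read(2)
    if not header:
        return None
    (length,) = struct.unpack(">H", header)
    key = inp.read(length).decode("utf-8")
    (timestamp,) = struct.unpack(">q", inp.read(8))
    return key, timestamp


def serialize_all(records):
    """Write every record into one byte string (the would-be golden file)."""
    out = io.BytesIO()
    for key, timestamp in records:
        write_record(out, key, timestamp)
    return out.getvalue()


def deserialize_all(data):
    """Read records back until the input is exhausted."""
    inp = io.BytesIO(data)
    records = []
    while True:
        record = read_record(inp)
        if record is None:
            return records
        records.append(record)


# Objects the golden file is expected to contain (illustrative values).
SAMPLES = [("row1", 1384454830701), ("row2", 0)]


def check_wire_compat(golden_bytes):
    """Both directions must agree with the checked-in golden bytes."""
    reads_ok = deserialize_all(golden_bytes) == SAMPLES    # read path unchanged
    writes_ok = serialize_all(SAMPLES) == golden_bytes     # write path unchanged
    return reads_ok and writes_ok
```

In practice golden_bytes would be loaded from a file saved alongside the tests. Regenerating it is just serialize_all(SAMPLES), and should happen only when a format change is intentional.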




[jira] [Commented] (HBASE-9980) [0.94] Wire compatibility test for 0.94

2013-11-20 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-9980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828494#comment-13828494
 ] 

Aleksandr Shulman commented on HBASE-9980:
--

Great this is getting attention :)

[~lhofhansl]
{quote} those that are assignable from Writable.{quote}
Can you elaborate on what exactly you mean by this? Can you also give a couple 
of examples of objects that are and are not?

 [0.94] Wire compatibility test for 0.94
 ---

 Key: HBASE-9980
 URL: https://issues.apache.org/jira/browse/HBASE-9980
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.13
Reporter: Lars Hofhansl
Assignee: Andrew Purtell

 See HBASE-9834.
 We should have a test that:
 # generates a file with all kinds of objects serialized into it. Save that 
 file as part of the HBase tests
 # a test can then read the objects back from that file
 # a test can regenerate that file
 If both tests pass we can be reasonably sure that neither readFields nor 
 write was changed in an incompatible way.




[jira] [Updated] (HBASE-9973) [ACL]: Users with 'Admin' ACL permission will lose permissions after upgrade to 0.96.x from 0.94.x or 0.92.x

2013-11-14 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-9973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-9973:
-

Labels: acl  (was: )

 [ACL]: Users with 'Admin' ACL permission will lose permissions after upgrade 
 to 0.96.x from 0.94.x or 0.92.x
 

 Key: HBASE-9973
 URL: https://issues.apache.org/jira/browse/HBASE-9973
 Project: HBase
  Issue Type: Bug
  Components: security
Affects Versions: 0.96.0, 0.96.1
Reporter: Aleksandr Shulman
  Labels: acl
 Fix For: 0.96.1


 In our testing, we have uncovered that the ACL permissions for users with the 
 'A' credential do not hold after the upgrade to 0.96.x.
 This is because in the ACL table, the entry for the admin user is a 
 permission on the '_acl_' table with permission 'A'. However, because of the 
 namespace transition, there is no longer an '_acl_' table. Therefore, that 
 entry in the hbase:acl table is no longer valid.
 Example:
 {code}hbase(main):002:0> scan 'hbase:acl'
 ROW          COLUMN+CELL
  TestTable   column=l:hdfs, timestamp=1384454830701, value=RW
  TestTable   column=l:root, timestamp=1384455875586, value=RWCA
  _acl_       column=l:root, timestamp=1384454767568, value=C
  _acl_       column=l:tableAdmin, timestamp=1384454788035, value=A
  hbase:acl   column=l:root, timestamp=1384455875786, value=C
 {code}
 In this case, the following entry becomes meaningless:
 {code} _acl_   column=l:tableAdmin, timestamp=1384454788035, value=A {code}
 As a result, users who held the 'A' permission lose it after the upgrade.
 Proposed fix:
 I see the fix being relatively straightforward. As part of the migration, 
 change any entries in the '_acl_' table with key '_acl_' into a new row with 
 key 'hbase:acl', all else being the same. And the old entry would be deleted.
 This can go into the standard migration script that we expect users to run.




[jira] [Created] (HBASE-9973) [ACL]: Users with 'Admin' ACL permission will lose permissions after upgrade to 0.96.x from 0.94.x or 0.92.x

2013-11-14 Thread Aleksandr Shulman (JIRA)

Aleksandr Shulman created HBASE-9973:


 Summary: [ACL]: Users with 'Admin' ACL permission will lose 
permissions after upgrade to 0.96.x from 0.94.x or 0.92.x
 Key: HBASE-9973
 URL: https://issues.apache.org/jira/browse/HBASE-9973
 Project: HBase
  Issue Type: Bug
  Components: security
Affects Versions: 0.96.0, 0.96.1
Reporter: Aleksandr Shulman
 Fix For: 0.96.1


In our testing, we have uncovered that the ACL permissions for users with the 
'A' credential do not hold after the upgrade to 0.96.x.

This is because in the ACL table, the entry for the admin user is a permission 
on the '_acl_' table with permission 'A'. However, because of the namespace 
transition, there is no longer an '_acl_' table. Therefore, that entry in the 
hbase:acl table is no longer valid.

Example:

{code}hbase(main):002:0> scan 'hbase:acl'
ROW          COLUMN+CELL
 TestTable   column=l:hdfs, timestamp=1384454830701, value=RW
 TestTable   column=l:root, timestamp=1384455875586, value=RWCA
 _acl_       column=l:root, timestamp=1384454767568, value=C
 _acl_       column=l:tableAdmin, timestamp=1384454788035, value=A
 hbase:acl   column=l:root, timestamp=1384455875786, value=C
{code}

In this case, the following entry becomes meaningless:
{code} _acl_   column=l:tableAdmin, timestamp=1384454788035, value=A {code}

As a result, users who held the 'A' permission lose it after the upgrade.

Proposed fix:
I see the fix being relatively straightforward. As part of the migration, 
change any entries in the '_acl_' table with key '_acl_' into a new row with 
key 'hbase:acl', all else being the same. And the old entry would be deleted.

This can go into the standard migration script that we expect users to run.
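
To make the proposed rewrite concrete, here is a rough sketch that models the ACL table as a plain dictionary rather than using any HBase client API. The function name is hypothetical, and merging permission letters when the same column already exists under 'hbase:acl' is an assumption added here; the proposal above does not say how such collisions should be handled.

```python
OLD_ROW = "_acl_"
NEW_ROW = "hbase:acl"


def migrate_acl_rows(table):
    """Return a copy of the ACL table with the '_acl_' row folded into 'hbase:acl'.

    `table` maps row key -> {column: permission string}.
    """
    # Copy every row except the old one, so the old entry is effectively deleted.
    migrated = {row: dict(cells) for row, cells in table.items() if row != OLD_ROW}
    target = migrated.setdefault(NEW_ROW, {})
    for column, perms in table.get(OLD_ROW, {}).items():
        # Assumption: on a column collision, keep the union of permission letters.
        merged = target.get(column, "") + perms
        target[column] = "".join(dict.fromkeys(merged))
    return migrated
```

Applied to the scan output above, the l:tableAdmin entry with value A ends up under the 'hbase:acl' row and no '_acl_' row remains.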




[jira] [Updated] (HBASE-9973) [ACL]: Users with 'Admin' ACL permission will lose permissions after upgrade to 0.96.x from 0.94.x or 0.92.x

2013-11-14 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-9973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-9973:
-

Assignee: Himanshu Vashishtha

 [ACL]: Users with 'Admin' ACL permission will lose permissions after upgrade 
 to 0.96.x from 0.94.x or 0.92.x
 

 Key: HBASE-9973
 URL: https://issues.apache.org/jira/browse/HBASE-9973
 Project: HBase
  Issue Type: Bug
  Components: security
Affects Versions: 0.96.0, 0.96.1
Reporter: Aleksandr Shulman
Assignee: Himanshu Vashishtha
  Labels: acl
 Fix For: 0.96.1


 In our testing, we have uncovered that the ACL permissions for users with the 
 'A' credential do not hold after the upgrade to 0.96.x.
 This is because in the ACL table, the entry for the admin user is a 
 permission on the '_acl_' table with permission 'A'. However, because of the 
 namespace transition, there is no longer an '_acl_' table. Therefore, that 
 entry in the hbase:acl table is no longer valid.
 Example:
 {code}hbase(main):002:0> scan 'hbase:acl'
 ROW          COLUMN+CELL
  TestTable   column=l:hdfs, timestamp=1384454830701, value=RW
  TestTable   column=l:root, timestamp=1384455875586, value=RWCA
  _acl_       column=l:root, timestamp=1384454767568, value=C
  _acl_       column=l:tableAdmin, timestamp=1384454788035, value=A
  hbase:acl   column=l:root, timestamp=1384455875786, value=C
 {code}
 In this case, the following entry becomes meaningless:
 {code} _acl_   column=l:tableAdmin, timestamp=1384454788035, value=A {code}
 As a result, users who held the 'A' permission lose it after the upgrade.
 Proposed fix:
 I see the fix being relatively straightforward. As part of the migration, 
 change any entries in the '_acl_' table with key '_acl_' into a new row with 
 key 'hbase:acl', all else being the same. And the old entry would be deleted.
 This can go into the standard migration script that we expect users to run.




[jira] [Work started] (HBASE-9966) Create IntegrationTest for Online Bloom Filter and Compression Algorithm Change

2013-11-13 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-9966 started by Aleksandr Shulman.

 Create IntegrationTest for Online Bloom Filter and Compression Algorithm 
 Change
 ---

 Key: HBASE-9966
 URL: https://issues.apache.org/jira/browse/HBASE-9966
 Project: HBase
  Issue Type: Sub-task
  Components: HFile, test
Affects Versions: 0.98.0, 0.96.1
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
 Fix For: 0.95.2


 For online merge, a user is perfectly within her rights to modify the 
 compression algorithm used, or the bloom filter.
 Therefore, we should add these actions to our ChaosMonkey tests to ensure 
 that they do not introduce instability.




[jira] [Created] (HBASE-9966) Create IntegrationTest for Online Bloom Filter and Compression Algorithm Change

2013-11-13 Thread Aleksandr Shulman (JIRA)

Aleksandr Shulman created HBASE-9966:


 Summary: Create IntegrationTest for Online Bloom Filter and 
Compression Algorithm Change
 Key: HBASE-9966
 URL: https://issues.apache.org/jira/browse/HBASE-9966
 Project: HBase
  Issue Type: Sub-task
  Components: HFile, test
Affects Versions: 0.98.0, 0.96.1
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman


For online merge, a user is perfectly within her rights to modify the compression 
algorithm used, or the bloom filter.

Therefore, we should add these actions to our ChaosMonkey tests to ensure that 
they do not introduce instability.




[jira] [Updated] (HBASE-9966) Create IntegrationTest for Online Bloom Filter and Compression Algorithm Change

2013-11-13 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-9966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-9966:
-

Description: 
For online schema change, a user is perfectly within her rights to modify the 
compression algorithm used, or the bloom filter.

Therefore, we should add these actions to our ChaosMonkey tests to ensure that 
they do not introduce instability.

  was:
For online merge, a user is perfectly within her rights to modify the compression 
algorithm used, or the bloom filter.

Therefore, we should add these actions to our ChaosMonkey tests to ensure that 
they do not introduce instability.


 Create IntegrationTest for Online Bloom Filter and Compression Algorithm 
 Change
 ---

 Key: HBASE-9966
 URL: https://issues.apache.org/jira/browse/HBASE-9966
 Project: HBase
  Issue Type: Sub-task
  Components: HFile, test
Affects Versions: 0.98.0, 0.96.1
Reporter: Aleksandr Shulman
Assignee: Aleksandr Shulman
 Fix For: 0.95.2


 For online schema change, a user is perfectly within her rights to modify the 
 compression algorithm used, or the bloom filter.
 Therefore, we should add these actions to our ChaosMonkey tests to ensure 
 that they do not introduce instability.




[jira] [Updated] (HBASE-7639) Enable online schema update by default

2013-11-08 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-7639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-7639:
-

Labels: online_schema_change  (was: )

 Enable online schema update by default 
 ---

 Key: HBASE-7639
 URL: https://issues.apache.org/jira/browse/HBASE-7639
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.95.2
Reporter: Enis Soztutar
Assignee: Elliott Clark
  Labels: online_schema_change
 Fix For: 0.98.0, 0.95.2

 Attachments: HBASE-7639-0.patch


 After we get HBASE-7305 and HBASE-7546, things will become stable enough for 
 online schema update to be enabled by default. 
 {code}
   <property>
     <name>hbase.online.schema.update.enable</name>
     <value>false</value>
     <description>
     Set true to enable online schema changes.  This is an experimental 
     feature.
     There are known issues modifying table schemas at the same time a region
     split is happening so your table needs to be quiescent or else you have to
     be running with splits disabled.
     </description>
   </property>
 {code}




[jira] [Updated] (HBASE-8726) Create an Integration Test for online schema change

2013-11-08 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-8726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-8726:
-

Labels: online_schema_change  (was: )

 Create an Integration Test for online schema change
 ---

 Key: HBASE-8726
 URL: https://issues.apache.org/jira/browse/HBASE-8726
 Project: HBase
  Issue Type: Bug
  Components: Admin
Affects Versions: 0.98.0, 0.95.1
Reporter: Elliott Clark
Assignee: Elliott Clark
  Labels: online_schema_change
 Fix For: 0.95.2

 Attachments: HBASE-8726-0.patch, HBASE-8726-1.patch, 
 HBASE-8726-2.patch, HBASE-8726-3.patch, HBASE-8726-4.patch


 With table locks in place it should be time to start really testing online 
 table schema changes.




[jira] [Updated] (HBASE-5678) Dynamic configuration capability for Hbase.

2013-11-08 Thread Aleksandr Shulman (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-5678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Aleksandr Shulman updated HBASE-5678:
-

Description:
I think some properties can be configured dynamically, without restarting the
nodes.
This is an umbrella JIRA for this feature.

Hadoop already has such a feature, but it is not yet implemented by the nodes.
I think we can have a similar base framework here, implemented by the nodes,
so that whatever properties are allowed to be reconfigured can take new values
without restarting the node.

I will come up with a design doc covering the node implementations and will
raise subtasks for each.

was:
I think some properties can be configured dynamically, without restarting the
nodes.
This is an umbrella JIRA for this feature.

I will come up with a design doc covering the node implementations and will
raise subtasks for each.

Dynamic configuration capability for Hbase.
---

Key: HBASE-5678
URL: https://issues.apache.org/jira/browse/HBASE-5678
Project: HBase
Issue Type: New Feature
Components: master, regionserver, util
Affects Versions: 0.95.2
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G

I think some properties can be configured dynamically, without restarting the
nodes.
This is an umbrella JIRA for this feature.
Hadoop already has such a feature, but it is not yet implemented by the nodes.
I think we can have a similar base framework here, implemented by the nodes,
so that whatever properties are allowed to be reconfigured can take new values
without restarting the node.
I will come up with a design doc covering the node implementations and will
raise subtasks for each.


[jira] [Updated] (HBASE-9395) Disable Schema Change on 0.96

2013-11-08 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-9395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-9395:
-

Labels: online_schema_change  (was: )

 Disable Schema Change on 0.96
 -

 Key: HBASE-9395
 URL: https://issues.apache.org/jira/browse/HBASE-9395
 Project: HBase
  Issue Type: Bug
  Components: master
Affects Versions: 0.98.0
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Blocker
  Labels: online_schema_change
 Fix For: 0.96.0

 Attachments: HBASE-9395-95-0.patch


 Running LoadTestAndVerify fails when the chaos monkey is slowDeterministic.  
 When commenting out all of the schema change actions everything passes.  We 
 should disable the schema change until we can be 100% sure of data integrity.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

[jira] [Updated] (HBASE-8775) Throttle online schema changes.

2013-11-08 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-8775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-8775:
-

Labels: online_schema_change  (was: )

 Throttle online schema changes.
 ---

 Key: HBASE-8775
 URL: https://issues.apache.org/jira/browse/HBASE-8775
 Project: HBase
  Issue Type: Improvement
  Components: master
Affects Versions: 0.89-fb
Reporter: Shane Hogan
Priority: Minor
  Labels: online_schema_change
 Fix For: 0.89-fb


 Throttle the open and close of the regions after an online schema change




[jira] [Updated] (HBASE-9407) Online Schema Change causes Test Load and Verify to fail.

2013-11-08 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-9407:
-

Labels: online_schema_change  (was: )

 Online Schema Change causes Test Load and Verify to fail.
 -

 Key: HBASE-9407
 URL: https://issues.apache.org/jira/browse/HBASE-9407
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Elliott Clark
  Labels: online_schema_change






[jira] [Updated] (HBASE-4741) Online schema change doesn't return errors

2013-11-08 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-4741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-4741:
-

Labels: online_schema_change  (was: )

 Online schema change doesn't return errors
 --

 Key: HBASE-4741
 URL: https://issues.apache.org/jira/browse/HBASE-4741
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.92.0
Reporter: Jean-Daniel Cryans
Assignee: stack
Priority: Critical
  Labels: online_schema_change
 Fix For: 0.92.0

 Attachments: 4741-v2.txt, 4741-v3.txt, 4741-v4.txt, 4741-v5.txt, 
 4741-v6.txt, 4741-v7.txt, 4741.txt


 Still after the fun I had over in HBASE-4729, I tried to finish altering my 
 table (remove a family) since only half of it was changed so I did this:
 {quote}
 hbase(main):002:0> alter 'TestTable', NAME => 'allo', METHOD => 'delete' 
 Updating all regions with the new schema...
 244/244 regions updated.
 Done.
 0 row(s) in 1.2480 seconds
 {quote}
 Nice it all looks good, but over in the master log:
 {quote}
 org.apache.hadoop.hbase.InvalidFamilyOperationException: Family 'allo' does 
 not exist so cannot be deleted
 at 
 org.apache.hadoop.hbase.master.handler.TableDeleteFamilyHandler.handleTableOperation(TableDeleteFamilyHandler.java:56)
 at 
 org.apache.hadoop.hbase.master.handler.TableEventHandler.process(TableEventHandler.java:86)
 at 
 org.apache.hadoop.hbase.master.HMaster.deleteColumn(HMaster.java:1011)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at 
 org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:348)
 at 
 org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1242)
 {quote}
 Maybe we should do checks before launching the async task.
 Marking critical as this is a regression.
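
A rough sketch of what checking before launching the async task could look like. This is not the HMaster code: the schema is modelled as a dictionary of sets, and the exception class and function name are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


class InvalidFamilyOperation(Exception):
    """Raised when a schema change names a column family that does not exist."""


def delete_family(schema, table, family, executor):
    """Validate synchronously, then hand the actual change to a background worker."""
    families = schema[table]
    if family not in families:
        # Fail in the caller's thread so the client sees the error
        # instead of it only appearing in the server log.
        raise InvalidFamilyOperation(
            f"Family '{family}' does not exist so cannot be deleted")
    return executor.submit(families.remove, family)
```

With the precondition checked up front, a second attempt to delete 'allo' would raise in the calling thread rather than reporting success.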




[jira] [Updated] (HBASE-5335) Dynamic Schema Configurations

2013-11-08 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-5335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-5335:
-

Labels: configuration online_schema_change schema  (was: configuration 
schema)

 Dynamic Schema Configurations
 -

 Key: HBASE-5335
 URL: https://issues.apache.org/jira/browse/HBASE-5335
 Project: HBase
  Issue Type: New Feature
Reporter: Nicolas Spiegelberg
Assignee: Nicolas Spiegelberg
  Labels: configuration, online_schema_change, schema
 Fix For: 0.94.7, 0.95.0

 Attachments: ASF.LICENSE.NOT.GRANTED--D2247.1.patch, 
 ASF.LICENSE.NOT.GRANTED--D2247.2.patch, 
 ASF.LICENSE.NOT.GRANTED--D2247.3.patch, 
 ASF.LICENSE.NOT.GRANTED--D2247.4.patch, 
 ASF.LICENSE.NOT.GRANTED--D2247.5.patch, 
 ASF.LICENSE.NOT.GRANTED--D2247.6.patch, 
 ASF.LICENSE.NOT.GRANTED--D2247.7.patch, 
 ASF.LICENSE.NOT.GRANTED--D2247.8.patch, HBASE-5335-trunk-2.patch, 
 HBASE-5335-trunk-3.patch, HBASE-5335-trunk-3.patch, HBASE-5335-trunk-4.patch, 
 HBASE-5335-trunk.patch


 Currently, the ability for a core developer to add per-table and per-CF 
 configuration settings is very heavyweight.  You need to add a reserved 
 keyword all the way up the stack, and you have to support this variable 
 long-term if you're going to expose it explicitly to the user.  This has 
 ended up with using Configuration.get() a lot because it is lightweight and 
 you can tweak settings while you're trying to understand system behavior 
 [since there are many config params that may never need to be tuned].  We 
 need to add the ability to put and read arbitrary KV settings in the HBase 
 schema.  Combined with online schema change, this will allow us to safely 
 iterate on configuration settings.
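
The lookup order implied here (column-family setting, then table setting, then site-wide configuration) can be sketched with layered dictionaries. This only illustrates the idea and is not how HBase implements it; the function name is hypothetical, and the keys and values are example placeholders.

```python
from collections import ChainMap


def effective_config(site_conf, table_conf, family_conf):
    """Column-family settings override table settings, which override site config."""
    return ChainMap(family_conf, table_conf, site_conf)


# Example placeholder settings.
site = {"hbase.hstore.blockingStoreFiles": "7", "hbase.hstore.compaction.min": "3"}
table = {"hbase.hstore.compaction.min": "5"}
family = {"hbase.hstore.blockingStoreFiles": "20"}

conf = effective_config(site, table, family)
```

Because a ChainMap is a view over the underlying dictionaries, a later change to the table-level dictionary shows up in conf on the next lookup, which loosely mirrors tweaking a setting through an online schema change.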




[jira] [Updated] (HBASE-7236) add per-table/per-cf configuration via metadata

2013-11-08 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-7236:
-

Labels: online_schema_change  (was: )

 add per-table/per-cf configuration via metadata
 ---

 Key: HBASE-7236
 URL: https://issues.apache.org/jira/browse/HBASE-7236
 Project: HBase
  Issue Type: Umbrella
Affects Versions: 0.95.2
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
  Labels: online_schema_change
 Fix For: 0.95.0

 Attachments: HBASE-7236-PROTOTYPE-v1.patch, 
 HBASE-7236-PROTOTYPE.patch, HBASE-7236-PROTOTYPE.patch, HBASE-7236-v0.patch, 
 HBASE-7236-v1.patch, HBASE-7236-v2.patch, HBASE-7236-v3.patch, 
 HBASE-7236-v4.patch, HBASE-7236-v5.patch, HBASE-7236-v6.patch, 
 HBASE-7236-v6.patch


 Regardless of the compaction policy, it makes sense to have separate 
 configuration for compactions for different tables and column families, as 
 their access patterns and workloads can be different. In particular, tiered 
 compactions, which are being ported from the 0.89-fb branch, need such 
 configuration in order to be used properly.
 We might want to add support for compaction configuration via metadata on 
 table/cf.




[jira] [Commented] (HBASE-9407) Online Schema Change causes Test Load and Verify to fail.

2013-11-08 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-9407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13817674#comment-13817674
 ] 

Aleksandr Shulman commented on HBASE-9407:
--

Hi Elliott, is this issue still occurring?
If so, can you add more specifics about the failure mode, how often it occurs, 
potential root causes, etc.?

 Online Schema Change causes Test Load and Verify to fail.
 -

 Key: HBASE-9407
 URL: https://issues.apache.org/jira/browse/HBASE-9407
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.98.0
Reporter: Elliott Clark
  Labels: online_schema_change






[jira] [Created] (HBASE-9786) [hbck]: hbck -metaonly incorrectly reports inconsistent regions after HBASE-9698 fix

2013-10-16 Thread Aleksandr Shulman (JIRA)

Aleksandr Shulman created HBASE-9786:


 Summary: [hbck]: hbck -metaonly incorrectly reports inconsistent 
regions after HBASE-9698 fix
 Key: HBASE-9786
 URL: https://issues.apache.org/jira/browse/HBASE-9786
 Project: HBase
  Issue Type: Bug
  Components: hbck
Affects Versions: 0.96.0
Reporter: Aleksandr Shulman
Assignee: Matteo Bertozzi


In my testing, I found that this call began to fail:
{code}sudo -u hbase hbase hbck -metaonly
{code}

It began to fail after this checkin: 
https://github.com/apache/hbase/commit/818749ff9f261aac4206054d331189e92290b408

The full output is below. The issue seems to be that the patch does not account for the -metaOnly option.

Testing done:
I built 0.96 up to commit a6f208d91efff207860b049eb8466a069f0c71a9 and the test 
passes.

The output:
{code}Summary:
  clonedtestSnapshotSource-1381959945438 is okay.
Number of regions: 0
Deployed on: 
  hbase:meta is okay.
Number of regions: 1
Deployed on:  tarball-target-2.ent.cloudera.com,60020,1381952904985
  hbase:namespace is okay.
Number of regions: 0
Deployed on: 
  sampleTable_tarball-target-2.ent.cloudera.com is okay.
Number of regions: 0
Deployed on: 
  testMRIncrementalLoadWithSplit_1381959500784 is okay.
Number of regions: 0
Deployed on: 
  testMRIncrementalLoad_1381959434211 is okay.
Number of regions: 0
Deployed on: 
  testSnapshotSource-1381959945438 is okay.
Number of regions: 0
Deployed on: 
  testSnapshotSource-1381959995995 is okay.
Number of regions: 0
Deployed on: 
7 inconsistencies detected.
Status: INCONSISTENT
{code}
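
Because every table above is reported as "okay" while the run still ends as INCONSISTENT, automation is probably better off keying on the final two lines than on the per-table lines. A minimal sketch, assuming the summary lines look exactly like the output above (other hbck versions may format them differently):

```python
import re


def parse_hbck_summary(output):
    """Return (inconsistency_count, status); each part is None if its line is absent."""
    count = re.search(r"^\s*(\d+) inconsistencies detected\.", output, re.MULTILINE)
    status = re.search(r"^\s*Status: (\w+)", output, re.MULTILINE)
    return (int(count.group(1)) if count else None,
            status.group(1) if status else None)


# Shortened sample in the same shape as the output above.
sample = """Summary:
  hbase:meta is okay.
Number of regions: 1
7 inconsistencies detected.
Status: INCONSISTENT
"""
result = parse_hbck_summary(sample)
print(result)
```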




[jira] [Commented] (HBASE-9786) [hbck]: hbck -metaonly incorrectly reports inconsistent regions after HBASE-9698 fix

2013-10-16 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-9786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13797440#comment-13797440
 ] 

Aleksandr Shulman commented on HBASE-9786:
--

Tested the fix -- looks good.

 [hbck]: hbck -metaonly incorrectly reports inconsistent regions after 
 HBASE-9698 fix
 

 Key: HBASE-9786
 URL: https://issues.apache.org/jira/browse/HBASE-9786
 Project: HBase
  Issue Type: Bug
  Components: hbck
Affects Versions: 0.98.0, 0.96.0
Reporter: Aleksandr Shulman
Assignee: Matteo Bertozzi
 Fix For: 0.98.0, 0.96.0

 Attachments: HBASE-9786-v0.patch


 In my testing, I found that this call began to fail:
 {code}sudo -u hbase hbase hbck -metaonly
 {code}
 It began to fail after this checkin: 
 https://github.com/apache/hbase/commit/818749ff9f261aac4206054d331189e92290b408
 The full output is below. The issue seems to be that the patch does not account for the -metaOnly option.
 Testing done:
 I built 0.96 up to commit a6f208d91efff207860b049eb8466a069f0c71a9 and the 
 test passes.
 The output:
 {code}
 $ hbase org.apache.hadoop.hbase.PerformanceEvaluation --rows=1 
 sequentialWrite 1
 $ hbase hbck -metaonly
 ...
 2013-10-16 23:52:24,075 DEBUG [main] util.HBaseFsck: There are 1 region info 
 entries
 ERROR: There is a hole in the region chain between  and .  You need to create 
 a new .regioninfo and region dir in hdfs to plug the hole.
 ERROR: Found inconsistency in table TestTable
 ERROR: There is a hole in the region chain between  and .  You need to create 
 a new .regioninfo and region dir in hdfs to plug the hole.
 ERROR: Found inconsistency in table hbase:namespace
 2013-10-16 23:52:24,182 INFO  [main] zookeeper.ZooKeeper: Initiating client 
 connection, connectString=localhost:2181 sessionTimeout=9 watcher=hbase 
 Fsck
 2013-10-16 23:52:24,183 INFO  [main] zookeeper.RecoverableZooKeeper: Process 
 identifier=hbase Fsck connecting to ZooKeeper ensemble=localhost:2181
 2013-10-16 23:52:24,183 INFO  [main-SendThread(localhost:2181)] 
 zookeeper.ClientCnxn: Opening socket connection to server 
 localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL 
 (unknown error)
 2013-10-16 23:52:24,184 INFO  [main-SendThread(localhost:2181)] 
 zookeeper.ClientCnxn: Socket connection established to 
 localhost/127.0.0.1:2181, initiating session
 2013-10-16 23:52:24,188 INFO  [main-SendThread(localhost:2181)] 
 zookeeper.ClientCnxn: Session establishment complete on server 
 localhost/127.0.0.1:2181, sessionid = 0x141c377e423000d, negotiated timeout = 
 4
 Summary:
   TestTable is okay.
 Number of regions: 0
 Deployed on:
   hbase:meta is okay.
 Number of regions: 1
 Deployed on:  localhost,49217,1381963918103
   hbase:namespace is okay.
 Number of regions: 0
 Deployed on:
 2 inconsistencies detected.
 Status: INCONSISTENT
 {code}




[jira] [Commented] (HBASE-9735) region_mover.rb uses the removed HConnection.getZooKeeperWatcher() method

2013-10-09 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-9735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13790774#comment-13790774
 ] 

Aleksandr Shulman commented on HBASE-9735:
--

I tested the patch. It looks good! Thanks Matteo for the fix!

 region_mover.rb uses the removed HConnection.getZooKeeperWatcher() method
 -

 Key: HBASE-9735
 URL: https://issues.apache.org/jira/browse/HBASE-9735
 Project: HBase
  Issue Type: Bug
  Components: scripts
Affects Versions: 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 0.98.0, 0.96.0

 Attachments: HBASE-9735-v0.patch


 [~aleksshulman] found that region_mover.rb is using 
 HConnection.getZooKeeperWatcher(), which is deprecated in 94 and removed in 
 96.
 {code}
 14:02:34  2013-09-16 14:02:34,945 INFO  [main] region_mover: Moving 7 
 region(s) from c5-rolling2-4.ent.cloudera.com,60020,1379364656888 during this 
 cycle
 14:02:34  [c5-rolling2-2.ent.cloudera.com] out: 2013-09-16 14:02:34,951 INFO  
 [main] region_mover: Moving region 1588230740 (0 of 7) to 
 server=c5-rolling2-2.ent.cloudera.com,60020,1379365188814
 14:02:35  [c5-rolling2-2.ent.cloudera.com] out: NoMethodError: undefined 
 method `getZooKeeperWatcher' for #<#<Class:0x1fe91485>:0x465098f9>
 14:02:35  [c5-rolling2-2.ent.cloudera.com] out:   getServerNameForRegion at 
 /usr/lib/hbase/bin/region_mover.rb:91
 14:02:35  [c5-rolling2-2.ent.cloudera.com] out: isSameServer at 
 /usr/lib/hbase/bin/region_mover.rb:73
 14:02:35  [c5-rolling2-2.ent.cloudera.com] out: move at 
 /usr/lib/hbase/bin/region_mover.rb:157
 14:02:35  [c5-rolling2-2.ent.cloudera.com] out:  __for__ at 
 /usr/lib/hbase/bin/region_mover.rb:327
 14:02:35  [c5-rolling2-2.ent.cloudera.com] out: each at 
 file:/usr/lib/hbase/lib/jruby-complete-1.6.8.jar!/builtin/java/java.util.rb:7
 14:02:35  [c5-rolling2-2.ent.cloudera.com] out:unloadRegions at 
 /usr/lib/hbase/bin/region_mover.rb:318
 14:02:35  [c5-rolling2-2.ent.cloudera.com] out:   (root) at 
 /usr/lib/hbase/bin/region_mover.rb:456
 {code}




[jira] [Created] (HBASE-9663) PerformanceEvaluation does not properly honor specified table name parameter

2013-09-26 Thread Aleksandr Shulman (JIRA)

Aleksandr Shulman created HBASE-9663:


 Summary: PerformanceEvaluation does not properly honor specified 
table name parameter
 Key: HBASE-9663
 URL: https://issues.apache.org/jira/browse/HBASE-9663
 Project: HBase
  Issue Type: Bug
  Components: Client, test
Reporter: Aleksandr Shulman
 Fix For: 0.94.13, 0.96.1, 0.95.2


Expected behavior:
A user should be able to specify a given table for PerformanceEvaluation and 
have that table be used. That table does not need to exist. If it doesn't 
exist, PE will create it.

Observed behavior:
In creating the job, PE will use the new table name. However, the map tasks 
will fail because they are still looking for TestTable, which is not there.

Potential causes:
In the PE code, we see that the table name is not passed as an argument to the MR job:
https://github.com/apache/hbase/blob/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/PerformanceEvaluation.java#L723
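
The suspected failure mode can be modelled in a few lines. This is not PE's actual code, only a sketch of a driver that forgets to copy an option into the configuration that map tasks read; the function and key names are hypothetical.

```python
DEFAULT_TABLE = "TestTable"


def build_job_conf(opts, pass_table_name):
    """Driver side: copy options into the configuration shipped to map tasks."""
    conf = {"rows": opts["rows"]}
    if pass_table_name:
        conf["table"] = opts["table"]
    return conf


def map_task_table(conf):
    """Task side: a task only sees what is in the job configuration."""
    return conf.get("table", DEFAULT_TABLE)


opts = {"table": "t2", "rows": 2}
buggy = map_task_table(build_job_conf(opts, pass_table_name=False))
fixed = map_task_table(build_job_conf(opts, pass_table_name=True))
print(buggy, fixed)
```

In the buggy case the driver creates and uses t2, but the tasks fall back to the default name, which matches the observed behavior.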

Command:
{code} hbase org.apache.hadoop.hbase.PerformanceEvaluation --table=t2 
sequentialWrite 2{code}

Output:
{code}initiating session
13/09/26 00:36:02 INFO zookeeper.ClientCnxn: Session establishment complete on 
server REDACTED/10.20.187.137:2181, sessionid = 0x14159256f9b0031, negotiated 
timeout = 18
13/09/26 00:36:02 DEBUG client.HConnectionManager$HConnectionImplementation: 
Looked up root region location, 
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@c8d427c;
 serverName=REDACTED,60020,1380180157520
13/09/26 00:36:02 DEBUG client.HConnectionManager$HConnectionImplementation: 
Cached location for .META.,,1.1028785192 is REDACTED:60020
13/09/26 00:36:02 DEBUG client.ClientScanner: Creating scanner over .META. 
starting at key 't2,,'
13/09/26 00:36:02 DEBUG client.ClientScanner: Advancing internal scanner to 
startKey at 't2,,'
13/09/26 00:36:02 DEBUG catalog.CatalogTracker: Stopping catalog tracker 
org.apache.hadoop.hbase.catalog.CatalogTracker@466e06d7
13/09/26 00:36:02 INFO zookeeper.ZooKeeper: Session: 0x14159256f9b0031 closed
13/09/26 00:36:02 INFO zookeeper.ClientCnxn: EventThread shut down
13/09/26 00:36:02 INFO zookeeper.ZooKeeper: Initiating client connection, 
connectString=REDACTED:2181 sessionTimeout=18 
watcher=catalogtracker-on-org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@c8d427c
13/09/26 00:36:02 INFO zookeeper.RecoverableZooKeeper: The identifier of this 
process is 10390@REDACTED
13/09/26 00:36:02 DEBUG catalog.CatalogTracker: Starting catalog tracker 
org.apache.hadoop.hbase.catalog.CatalogTracker@5152cfbb
13/09/26 00:36:02 INFO zookeeper.ClientCnxn: Opening socket connection to 
server REDACTED/10.20.187.137:2181. Will not attempt to authenticate using 
SASL (unknown error)
13/09/26 00:36:02 INFO zookeeper.ClientCnxn: Socket connection established to 
REDACTED/10.20.187.137:2181, initiating session
13/09/26 00:36:02 INFO zookeeper.ClientCnxn: Session establishment complete on 
server REDACTED/10.20.187.137:2181, sessionid = 0x14159256f9b0032, negotiated 
timeout = 18
13/09/26 00:36:02 DEBUG client.ClientScanner: Creating scanner over .META. 
starting at key 't2,,'
13/09/26 00:36:02 DEBUG client.ClientScanner: Advancing internal scanner to 
startKey at 't2,,'
13/09/26 00:36:02 DEBUG catalog.CatalogTracker: Stopping catalog tracker 
org.apache.hadoop.hbase.catalog.CatalogTracker@5152cfbb
13/09/26 00:36:03 INFO zookeeper.ZooKeeper: Session: 0x14159256f9b0032 closed
13/09/26 00:36:03 INFO zookeeper.ClientCnxn: EventThread shut down
13/09/26 00:36:04 WARN mapred.JobClient: Use GenericOptionsParser for parsing 
the arguments. Applications should implement Tool for the same.
13/09/26 00:36:06 INFO input.FileInputFormat: Total input paths to process : 1
13/09/26 00:36:06 DEBUG hbase.PerformanceEvaluation: split[0]  startRow=1363147 
rows=104857 totalRows=2097152 clients=2 flushCommits=true writeToWAL=true
13/09/26 00:36:06 DEBUG hbase.PerformanceEvaluation: split[1]  startRow=1468004 
rows=104857 totalRows=2097152 clients=2 flushCommits=true writeToWAL=true
13/09/26 00:36:06 DEBUG hbase.PerformanceEvaluation: split[2]  startRow=1887432 
rows=104857 totalRows=2097152 clients=2 flushCommits=true writeToWAL=true
13/09/26 00:36:06 DEBUG hbase.PerformanceEvaluation: split[3]  startRow=209714 
rows=104857 totalRows=2097152 clients=2 flushCommits=true writeToWAL=true
13/09/26 00:36:06 DEBUG hbase.PerformanceEvaluation: split[4]  startRow=524285 
rows=104857 totalRows=2097152 clients=2 flushCommits=true writeToWAL=true
13/09/26 00:36:06 DEBUG hbase.PerformanceEvaluation: split[5]  startRow=1048576 
rows=104857 totalRows=2097152 clients=2 flushCommits=true writeToWAL=true
13/09/26 00:36:06 DEBUG hbase.PerformanceEvaluation: split[6]  startRow=1572861 
rows=104857 totalRows=2097152 clients=2 flushCommits=true writeToWAL=true
13/09/26 00:36:06 DEBUG hbase.PerformanceEvaluation: split[7]

[jira] [Commented] (HBASE-9603) IsRestoreSnapshotDoneResponse has wrong default causing restoreSnapshot() to be async

2013-09-20 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-9603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13773552#comment-13773552
 ] 

Aleksandr Shulman commented on HBASE-9603:
--

Patch looks good Matteo. +1

 IsRestoreSnapshotDoneResponse has wrong default causing restoreSnapshot() to 
 be async
 -

 Key: HBASE-9603
 URL: https://issues.apache.org/jira/browse/HBASE-9603
 Project: HBase
  Issue Type: Bug
  Components: snapshots
Affects Versions: 0.96.0
Reporter: Matteo Bertozzi
Assignee: Matteo Bertozzi
 Fix For: 0.96.0

 Attachments: HBASE-9603-v0.patch


 The done field of IsRestoreSnapshotDoneRequest is set to true, which causes 
 restoreSnapshot() to not wait until the restore is done, resulting in 
 async behavior.
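
 To illustrate why the default matters, here is a small simulation, assuming the client polls a status call and that an unset "done" field falls back to a default. It is a model of the symptom, not the HBase client code, and the names are hypothetical.

```python
def make_poller(polls_until_done):
    """Simulate a server that leaves 'done' unset until the restore finishes."""
    calls = {"n": 0}

    def poll():
        calls["n"] += 1
        if calls["n"] >= polls_until_done:
            return {"done": True}
        return {}  # field unset, so the client falls back to its default

    return poll


def wait_for_restore(poll, default_done, max_polls=10):
    """Return how many polls were made before the client believed it was done."""
    for attempt in range(1, max_polls + 1):
        if poll().get("done", default_done):
            return attempt
    raise TimeoutError("restore did not finish")


wrong_default = wait_for_restore(make_poller(3), default_done=True)
right_default = wait_for_restore(make_poller(3), default_done=False)
print(wrong_default, right_default)
```

 With a default of true the client returns after the first poll, before the simulated restore has finished.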

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-9153) Introduce/update a script to generate jdiff reports

2013-09-19 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-9153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13772094#comment-13772094
 ] 

Aleksandr Shulman commented on HBASE-9153:
--

Looks like it's not going in smoothly in a few places. What can I do to help?

 Introduce/update a script to generate jdiff reports
 ---

 Key: HBASE-9153
 URL: https://issues.apache.org/jira/browse/HBASE-9153
 Project: HBase
  Issue Type: Task
Reporter: Jonathan Hsieh
Assignee: Aleksandr Shulman
 Fix For: 0.98.0, 0.96.0, 0.94.13

 Attachments: HBASE-9153-v1.patch, HBASE-9153-v3.patch, 
 HBASE-9153-v4-0.94.patch, HBASE-9153-v4-0.95.patch, 
 HBASE-9153-v4-trunk.patch, HBASE-9153-v5-0.94.patch, 
 HBASE-9153-v5-0.95.patch, HBASE-9153-v5-trunk.patch, 
 HBASE-9153-v6-0.94.patch, HBASE-9153-v6-0.95.patch, HBASE-9153-v6-trunk.patch


 We've had a few issues now where we've removed APIs without deprecating 
 them, or after deprecating them only in a late release.  (HBASE-9142, 
 HBASE-9093)  We should just have a tool that enforces our API deprecation 
 policy as a release-time check or as a precommit check.
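
 The core of such a check can be sketched as a set comparison between two API snapshots. This is not how the JDiff-based script works internally; it only illustrates the policy, and the "Example" method names are hypothetical.

```python
def find_policy_violations(old_api, new_api):
    """Each dict maps a method signature to True if it is marked deprecated.

    A removal violates the policy when the method was not deprecated
    in the old version.
    """
    removed = set(old_api) - set(new_api)
    return sorted(m for m in removed if not old_api[m])


old_api = {
    "HConnection.getZooKeeperWatcher()": True,   # deprecated before removal
    "Example.removedWithoutWarning()": False,    # hypothetical method
    "Example.stillPresent()": False,             # hypothetical method
}
new_api = {"Example.stillPresent()": False}
violations = find_policy_violations(old_api, new_api)
print(violations)
```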


[jira] [Commented] (HBASE-9533) List of dependency jars for MR jobs is hard-coded and does not include netty, breaking MRv1 jobs

2013-09-19 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13772239#comment-13772239
 ] 

Aleksandr Shulman commented on HBASE-9533:
--

[~saint@gmail.com] and I took a look at the issue and discovered that the 
dependency is being excluded by zookeeper in 0.96 and trunk. Removing the 
exclusion fixes the problem. 

Testing: We have automation that runs MRv1 over HBase that failed because of 
this issue. When we removed the exclusion and ran it from a custom branch, it 
passed.
The branch is the latest 0.96 + the patch. 
https://github.com/AleksandrShulman/hbase/commit/5f7df8e7b08eebe2d28337e2eb0750acea21d51d

After the patch is applied, MRv1 and MRv2 both work for a regular pi job (MR 
only) and a rowcounter job (MR over HBase)

The patch is straightforward. I will attach it shortly.

 List of dependency jars for MR jobs is hard-coded and does not include netty, 
 breaking MRv1 jobs
 

 Key: HBASE-9533
 URL: https://issues.apache.org/jira/browse/HBASE-9533
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.95.2, 0.96.1
Reporter: Aleksandr Shulman
Assignee: Matteo Bertozzi
 Fix For: 0.95.2, 0.96.1

 Attachments: failed_mrv1_rowcounter_tt_taskoutput.out


 Observed behavior:
 Against trunk, using MRv1 with hadoop 1.0.4, r1393290, I am able to run MRv1 
 jobs (e.g. pi 2 4).
 However, when I use it to run MR over HBase jobs, they fail with the stack 
 trace below.
 From the trace, the issue seems to be that it cannot find a class that the 
 netty jar contains. This would make sense, given that the dependency jars 
 that we use for the MapReduce job are hard-coded, and that the netty jar is 
 not one of them.
 https://github.com/apache/hbase/blob/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.java#L519
 Strangely, this is only an issue in trunk, not 0.95, even though the code 
 hasn't changed.
 Command:
 {code}/bin/hbase org.apache.hadoop.hbase.mapreduce.RowCounter 
 sampletable{code}
 TT logs (attached)
 Output from console running job:
 {code}13/09/13 16:02:58 INFO mapred.JobClient: Task Id : 
 attempt_201309131601_0002_m_00_2, Status : FAILED
 java.io.IOException: Cannot create a record reader because of a previous 
 error. Please look at the previous logs lines from the task's full log for 
 more details.
   at 
 org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.createRecordReader(TableInputFormatBase.java:119)
   at 
 org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.init(MapTask.java:489)
   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:731)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:396)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
   at org.apache.hadoop.mapred.Child.main(Child.java:249)
 13/09/13 16:03:09 INFO mapred.JobClient: Job complete: job_201309131601_0002
 13/09/13 16:03:09 INFO mapred.JobClient: Counters: 7
 13/09/13 16:03:09 INFO mapred.JobClient:   Job Counters 
 13/09/13 16:03:09 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=29913
 13/09/13 16:03:09 INFO mapred.JobClient: Total time spent by all reduces 
 waiting after reserving slots (ms)=0
 13/09/13 16:03:09 INFO mapred.JobClient: Total time spent by all maps 
 waiting after reserving slots (ms)=0
 13/09/13 16:03:09 INFO mapred.JobClient: Launched map tasks=4
 13/09/13 16:03:09 INFO mapred.JobClient: Data-local map tasks=4
 13/09/13 16:03:09 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
 13/09/13 16:03:09 INFO mapred.JobClient: Failed map tasks=1{code}
 Expected behavior:
 As a stopgap, the netty jar should be included in that list. More generally, 
 there should be a more elegant way to include the jars that are needed.
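
 One possible direction for the "more elegant way" is to derive the jar list from the classes a job needs and compare it with the hard-coded list. The sketch below uses a hypothetical class-to-jar index and jar names; a real implementation would have to ask the class loader instead.

```python
# Hypothetical class-to-jar index, for illustration only.
CLASS_TO_JAR = {
    "org.apache.zookeeper.ZooKeeper": "zookeeper.jar",
    "com.google.protobuf.Message": "protobuf-java.jar",
    "org.jboss.netty.channel.ChannelFactory": "netty.jar",
}

# Stand-in for a hard-coded dependency list that omits netty.
HARD_CODED_JARS = {"zookeeper.jar", "protobuf-java.jar"}


def jars_for(classes):
    """Map each required class to the jar that provides it."""
    return {CLASS_TO_JAR[name] for name in classes}


needed = jars_for(list(CLASS_TO_JAR))
missing = needed - HARD_CODED_JARS
print(missing)
```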


[jira] [Commented] (HBASE-9533) List of dependency jars for MR jobs is hard-coded and does not include netty, breaking MRv1 jobs

2013-09-19 Thread Aleksandr Shulman (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-9533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13772585#comment-13772585
 ] 

Aleksandr Shulman commented on HBASE-9533:
--

I'm +1 on the patch for 0.96 and trunk.

 List of dependency jars for MR jobs is hard-coded and does not include netty, 
 breaking MRv1 jobs
 

 Key: HBASE-9533
 URL: https://issues.apache.org/jira/browse/HBASE-9533
 Project: HBase
  Issue Type: Bug
  Components: mapreduce
Affects Versions: 0.95.2, 0.96.1
Reporter: Aleksandr Shulman
Assignee: Matteo Bertozzi
Priority: Blocker
 Fix For: 0.96.0

 Attachments: 9533.txt, 9533v3.txt, 
 failed_mrv1_rowcounter_tt_taskoutput.out




[jira] [Commented] (HBASE-9153) Introduce/update a script to generate jdiff reports

2013-09-18 Thread Aleksandr Shulman (JIRA)

[
https://issues.apache.org/jira/browse/HBASE-9153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13771283#comment-13771283
]

Aleksandr Shulman commented on HBASE-9153:
--

Hey Nick, thanks for the commit.
In terms of release note, here is what I propose:
Added a tool to generate a report of API compatibility between different
versions of hbase. It is found in dev-support and uses JDiff under the covers.
Usage info at the top of the script.

I'm not sure if this is exactly what you're looking for, but we can adjust as
necessary.

Introduce/update a script to generate jdiff reports
---

Key: HBASE-9153
URL: https://issues.apache.org/jira/browse/HBASE-9153
Project: HBase
Issue Type: Task
Reporter: Jonathan Hsieh
Assignee: Aleksandr Shulman
Fix For: 0.98.0, 0.96.0

Attachments: HBASE-9153-v1.patch, HBASE-9153-v3.patch,
HBASE-9153-v4-0.94.patch, HBASE-9153-v4-0.95.patch,
HBASE-9153-v4-trunk.patch, HBASE-9153-v5-0.94.patch,
HBASE-9153-v5-0.95.patch, HBASE-9153-v5-trunk.patch,
HBASE-9153-v6-0.94.patch, HBASE-9153-v6-0.95.patch, HBASE-9153-v6-trunk.patch

We've had a few issues now where we've removed APIs without deprecating
them, or after deprecating them only in a late release. (HBASE-9142,
HBASE-9093) We should just have a tool that enforces our API deprecation
policy as a release-time check or as a precommit check.


[jira] [Updated] (HBASE-9153) Create a deprecation policy enforcement check

2013-09-17 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-9153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-9153:
-

Attachment: HBASE-9153-v5-0.95.patch
HBASE-9153-v5-0.94.patch
HBASE-9153-v5-trunk.patch

File name fixed in the comments.

 Create a deprecation policy enforcement check
 -

 Key: HBASE-9153
 URL: https://issues.apache.org/jira/browse/HBASE-9153
 Project: HBase
  Issue Type: Task
Reporter: Jonathan Hsieh
 Attachments: HBASE-9153-v1.patch, HBASE-9153-v3.patch, 
 HBASE-9153-v4-0.94.patch, HBASE-9153-v4-0.95.patch, 
 HBASE-9153-v4-trunk.patch, HBASE-9153-v5-0.94.patch, 
 HBASE-9153-v5-0.95.patch, HBASE-9153-v5-trunk.patch


 We've had a few issues now where we've removed APIs without deprecating 
 them, or after deprecating them only in a late release.  (HBASE-9142, 
 HBASE-9093)  We should just have a tool that enforces our API deprecation 
 policy as a release-time check or as a precommit check.


[jira] [Updated] (HBASE-9153) Create a deprecation policy enforcement check

2013-09-17 Thread Aleksandr Shulman (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-9153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksandr Shulman updated HBASE-9153:
-

Attachment: HBASE-9153-v6-trunk.patch
HBASE-9153-v6-0.95.patch
HBASE-9153-v6-0.94.patch

Had to fix one last thing.

 Create a deprecation policy enforcement check
 -

 Key: HBASE-9153
 URL: https://issues.apache.org/jira/browse/HBASE-9153
 Project: HBase
  Issue Type: Task
Reporter: Jonathan Hsieh
 Attachments: HBASE-9153-v1.patch, HBASE-9153-v3.patch, 
 HBASE-9153-v4-0.94.patch, HBASE-9153-v4-0.95.patch, 
 HBASE-9153-v4-trunk.patch, HBASE-9153-v5-0.94.patch, 
 HBASE-9153-v5-0.95.patch, HBASE-9153-v5-trunk.patch, 
 HBASE-9153-v6-0.94.patch, HBASE-9153-v6-0.95.patch, HBASE-9153-v6-trunk.patch


 We've had a few issues now where we've removed APIs without deprecating 
 them, or after deprecating them only in a late release.  (HBASE-9142, 
 HBASE-9093)  We should just have a tool that enforces our API deprecation 
 policy as a release-time check or as a precommit check.

