[jira] [Created] (HADOOP-11969) Text encoder/decoder factory initialization are not thread safe

2015-05-14 Thread Sean Busbey (JIRA)
Sean Busbey created HADOOP-11969:


 Summary: Text encoder/decoder factory initialization are not 
thread safe
 Key: HADOOP-11969
 URL: https://issues.apache.org/jira/browse/HADOOP-11969
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Reporter: Sean Busbey
Assignee: Sean Busbey
Priority: Critical


Right now, the initialization of the thread-local factories for the encoder 
and decoder in Text is not marked final. This means they end up with a static 
initializer that is not guaranteed to have finished running before the members 
become visible. 

Under heavy contention, this means during initialization some users will get an 
NPE:

{code}
(2015-05-05 08:58:03.974 : solr_server_log.log) 
 org.apache.solr.common.SolrException; null:java.lang.NullPointerException
at org.apache.hadoop.io.Text.decode(Text.java:406)
at org.apache.hadoop.io.Text.decode(Text.java:389)
at org.apache.hadoop.io.Text.toString(Text.java:280)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:764)
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.buildBaseHeader(DataTransferProtoUtil.java:81)
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.buildClientHeader(DataTransferProtoUtil.java:71)
at org.apache.hadoop.hdfs.protocol.datatransfer.Sender.readBlock(Sender.java:101)
at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:400)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:785)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:663)
at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:327)
at org.apache.hadoop.hdfs.DFSInputStream.actualGetFromOneDataNode(DFSInputStream.java:1027)
at org.apache.hadoop.hdfs.DFSInputStream.fetchBlockByteRange(DFSInputStream.java:974)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1305)
at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:78)
at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:107)
... SNIP...
{code} 
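For illustration, here is a minimal sketch of the fix the report suggests. The class and field names are hypothetical, not the actual Text.java source; the point is simply the contrast between a plain static ThreadLocal and one marked final, whose safe publication the final-field semantics guarantee:

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;

// Hypothetical sketch, not the actual Text.java source.
public class TextDecoderFactories {

    // Before: no final modifier. Per the report, a thread that observes
    // this field before class initialization completes under heavy
    // contention can see null and throw an NPE from decode().
    private static ThreadLocal<CharsetDecoder> DECODER_FACTORY_UNSAFE =
        new ThreadLocal<CharsetDecoder>() {
            @Override
            protected CharsetDecoder initialValue() {
                return Charset.forName("UTF-8").newDecoder();
            }
        };

    // After: final static fields are safely published once assigned during
    // class initialization, so callers never see a null factory.
    private static final ThreadLocal<CharsetDecoder> DECODER_FACTORY =
        new ThreadLocal<CharsetDecoder>() {
            @Override
            protected CharsetDecoder initialValue() {
                return Charset.forName("UTF-8").newDecoder();
            }
        };

    // Simplified analogue of Text.decode(): reuse the per-thread decoder.
    public static String decode(byte[] utf8) throws Exception {
        return DECODER_FACTORY.get()
            .decode(java.nio.ByteBuffer.wrap(utf8)).toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(decode("hello".getBytes("UTF-8")));
    }
}
```

The per-thread CharsetDecoder is the reason for the ThreadLocal in the first place: CharsetDecoder instances are stateful and not safe to share across threads.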



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Protocol Buffers version

2015-05-14 Thread Chris Nauroth
Thanks for that link, Alan.  That looks like a useful site!

Ideally, the Protocol Buffers project would give a clear statement about
wire compatibility between 2.5.0 and 2.6.1.  Unfortunately, I can't find
that anywhere.  If it's not documented, then it's probably worth following
up on the Protocol Buffers support lists to ask them.

One thing we could try is starting up a mix of Hadoop processes using
2.5.0 and 2.6.1 to see how it goes.  We've made a commitment to both
forward and backward compatibility within Hadoop 2.x, so we'd need a 2.5.0
client to be able to talk to a 2.6.1 server, and we'd need a 2.6.1 client
to be able to talk to a 2.5.0 server.  Even if this appears to go well, I
wouldn't consider it a substitute for a formal statement of the
compatibility policy from the Protocol Buffers project.  Otherwise, there
might be some subtle lurking issue that we miss in our initial testing.

As a reminder though, the community probably would want to see a strong
justification for the upgrade in terms of features or performance or
something else.  Right now, I'm not seeing a significant benefit for us
based on my reading of their release notes.  I think it's worthwhile to
figure this out first.  Otherwise, there is a risk that any testing work
turns out to be a wasted effort.

--Chris Nauroth




On 5/14/15, 7:23 AM, Alan Burlison alan.burli...@oracle.com wrote:

On 13/05/2015 17:13, Chris Nauroth wrote:

 It was important to complete this upgrade before Hadoop 2.x came out of
 beta.  After that, we committed to a policy of backwards-compatibility
 within the 2.x release line.  I can't find a statement about whether or
 not Protocol Buffers 2.6.1 is backwards-compatible with 2.5.0 (both at
 compile time and on the wire).  Do you know the answer?  If it's
 backwards-incompatible, then we wouldn't be able to do this upgrade
within
 Hadoop 2.x, though we could consider it for 3.x (trunk).

I'm not sure about the wire format; what's the best way of checking for
wire-format issues?

http://upstream-tracker.org/versions/protobuf.html suggests there are
some source-level issues which will require investigation.

 In general, we upgrade dependencies when a new release offers a
compelling
 benefit, not solely to keep up with the latest.  In the case of 2.5.0,
 there was a performance benefit.  Looking at the release notes for 2.6.0
 and 2.6.1, I don't see anything particularly compelling.  (That's just
my
 opinion though, and others might disagree.)

I think bundling or forking is the only practical option. I was looking
to see if we could provide Protocol Buffers as an installable option on
our platform, but if it's a version-compatibility nightmare as you say,
that's going to be difficult, as we really don't want to have to provide
multiple versions.

 BTW, if anyone is curious, it's possible to try a custom build right now
 linked against 2.6.1.  You'd pass -Dprotobuf.version=2.6.1 and
 -Dprotoc.path=<path to protoc 2.6.1 binary> when you run the mvn
command.
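Spelled out, the custom build described above might look like this; the
protoc install location is an assumed example path, not a standard one:

```shell
# Build Hadoop against Protocol Buffers 2.6.1 instead of the default 2.5.0.
# /opt/protobuf-2.6.1/bin/protoc is a hypothetical install location.
mvn clean install -DskipTests \
  -Dprotobuf.version=2.6.1 \
  -Dprotoc.path=/opt/protobuf-2.6.1/bin/protoc
```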

Once I have fixed all the other source portability issues I'll circle
back around and take a look at this.

-- 
Alan Burlison
--



[jira] [Created] (HADOOP-11971) Move test utilities for tracing from hadoop-hdfs to hadoop-common

2015-05-14 Thread Masatake Iwasaki (JIRA)
Masatake Iwasaki created HADOOP-11971:
-

 Summary: Move test utilities for tracing from hadoop-hdfs to 
hadoop-common
 Key: HADOOP-11971
 URL: https://issues.apache.org/jira/browse/HADOOP-11971
 Project: Hadoop Common
  Issue Type: Improvement
  Components: tracing
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor


Utilities used in TestTracing, such as SetSpanReceiver, should be moved to 
{{hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/tracing/}}
 in order to make them usable from YARN and other modules.





[jira] [Resolved] (HADOOP-11220) Jenkins should verify mvn site if the patch contains *.apt.vm changes

2015-05-14 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HADOOP-11220.
---
Resolution: Fixed

Yup. Closing as contained by HADOOP-11746, which rewrote the precommit checks 
to run mvn site as appropriate.

 Jenkins should verify mvn site if the patch contains *.apt.vm changes
 ---

 Key: HADOOP-11220
 URL: https://issues.apache.org/jira/browse/HADOOP-11220
 Project: Hadoop Common
  Issue Type: Improvement
Reporter: Zhijie Shen

 It would be good to make Jenkins verify mvn site if the patch contains 
 *.apt.vm changes, to avoid obvious build failures such as YARN-2732.
 This is not the first time that similar issues have been raised. Having an 
 automated verification step would alert us before we encounter an actual 
 build failure involving the site lifecycle.





[jira] [Created] (HADOOP-11973) Some ZkDelegationTokenSecretManager znodes do not have ACLs

2015-05-14 Thread Gregory Chanan (JIRA)
Gregory Chanan created HADOOP-11973:
---

 Summary: Some ZkDelegationTokenSecretManager znodes do not have 
ACLs
 Key: HADOOP-11973
 URL: https://issues.apache.org/jira/browse/HADOOP-11973
 Project: Hadoop Common
  Issue Type: Bug
  Components: security
Affects Versions: 2.6.0
Reporter: Gregory Chanan


I recently added an ACL Provider to the Curator framework instance I pass to 
the ZkDelegationTokenSecretManager, and noticed some strangeness around ACLs.

I set: zk-dt-secret-manager.znodeWorkingPath to:
solr/zkdtsm

and noticed that
/solr/zkdtsm/
/solr/zkdtsm/ZKDTSMRoot
do not have ACLs

but all the znodes under /solr/zkdtsm/ZKDTSMRoot have ACLs.  From adding some 
logging, it looks like the ACLProvider is never called for /solr/zkdtsm and 
/solr/zkdtsm/ZKDTSMRoot.  I don't know if that's a Curator or 
ZkDelegationTokenSecretManager issue.


