[jira] [Created] (HADOOP-11969) Text encoder/decoder factory initialization are not thread safe
Sean Busbey created HADOOP-11969:

Summary: Text encoder/decoder factory initialization are not thread safe
Key: HADOOP-11969
URL: https://issues.apache.org/jira/browse/HADOOP-11969
Project: Hadoop Common
Issue Type: Bug
Components: io
Reporter: Sean Busbey
Assignee: Sean Busbey
Priority: Critical

Right now, the initialization of the thread-local factories for the encoder/decoder in Text is not marked final. This means they end up with a static initializer that is not guaranteed to be finished running before the members are visible. Under heavy contention, this means during initialization some users will get an NPE:

{code}
(2015-05-05 08:58:03.974 : solr_server_log.log) org.apache.solr.common.SolrException; null:java.lang.NullPointerException
at org.apache.hadoop.io.Text.decode(Text.java:406)
at org.apache.hadoop.io.Text.decode(Text.java:389)
at org.apache.hadoop.io.Text.toString(Text.java:280)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:764)
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.buildBaseHeader(DataTransferProtoUtil.java:81)
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.buildClientHeader(DataTransferProtoUtil.java:71)
at org.apache.hadoop.hdfs.protocol.datatransfer.Sender.readBlock(Sender.java:101)
at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:400)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:785)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:663)
at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:327)
at org.apache.hadoop.hdfs.DFSInputStream.actualGetFromOneDataNode(DFSInputStream.java:1027)
at org.apache.hadoop.hdfs.DFSInputStream.fetchBlockByteRange(DFSInputStream.java:974)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1305)
at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:78)
at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:107)
... SNIP...
{code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
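To illustrate the kind of fix being described (a hedged sketch, not the actual Hadoop patch; the class and field names below are hypothetical), declaring the static ThreadLocal factory final gives it safe-publication guarantees under the Java Memory Model, so no thread can observe the field before its initializer has completed:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class TextCharsets {
    // Marking the factory final ensures the fully constructed ThreadLocal
    // is safely published; a non-final static field has no such guarantee
    // for unsynchronized readers.
    private static final ThreadLocal<CharsetDecoder> DECODER_FACTORY =
        ThreadLocal.withInitial(() ->
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT));

    // Decode UTF-8 bytes with the per-thread cached decoder.
    public static String decode(byte[] utf8) {
        try {
            // CharsetDecoder.decode(ByteBuffer) resets the decoder before
            // decoding, so the cached instance is safe to reuse per thread.
            return DECODER_FACTORY.get().decode(ByteBuffer.wrap(utf8)).toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("invalid UTF-8", e);
        }
    }
}
```

(The real Text class predates Java 8 and builds its factories with anonymous ThreadLocal subclasses, but the safe-publication point is the same.)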
Re: Protocol Buffers version
Thanks for that link, Alan. That looks like a useful site!

Ideally, the Protocol Buffers project would give a clear statement about wire compatibility between 2.5.0 and 2.6.1. Unfortunately, I can't find that anywhere. If it's not documented, then it's probably worth following up on the Protocol Buffers support lists to ask them.

One thing we could try is starting up a mix of Hadoop processes using 2.5.0 and 2.6.1 to see how it goes. We've made a commitment to both forward and backward compatibility within Hadoop 2.x, so we'd need a 2.5.0 client to be able to talk to a 2.6.1 server, and we'd need a 2.6.1 client to be able to talk to a 2.5.0 server. Even if this appears to go well, I wouldn't consider it a substitute for a formal statement of the compatibility policy from the Protocol Buffers project. Otherwise, there might be some subtle lurking issue that we miss in our initial testing.

As a reminder though, the community probably would want to see a strong justification for the upgrade in terms of features or performance or something else. Right now, I'm not seeing a significant benefit for us based on my reading of their release notes. I think it's worthwhile to figure this out first. Otherwise, there is a risk that any testing work turns out to be a wasted effort.

--Chris Nauroth

On 5/14/15, 7:23 AM, Alan Burlison alan.burli...@oracle.com wrote:

On 13/05/2015 17:13, Chris Nauroth wrote:
It was important to complete this upgrade before Hadoop 2.x came out of beta. After that, we committed to a policy of backwards-compatibility within the 2.x release line. I can't find a statement about whether or not Protocol Buffers 2.6.1 is backwards-compatible with 2.5.0 (both at compile time and on the wire). Do you know the answer? If it's backwards-incompatible, then we wouldn't be able to do this upgrade within Hadoop 2.x, though we could consider it for 3.x (trunk).

I'm not sure about the wire format; what's the best way of checking for wire format issues?
http://upstream-tracker.org/versions/protobuf.html suggests there are some source-level issues which will require investigation.

In general, we upgrade dependencies when a new release offers a compelling benefit, not solely to keep up with the latest. In the case of 2.5.0, there was a performance benefit. Looking at the release notes for 2.6.0 and 2.6.1, I don't see anything particularly compelling. (That's just my opinion though, and others might disagree.) I think bundling or forking is the only practical option.

I was looking to see if we could provide ProtocolBuffers as an installable option on our platform. If it's a version-compatibility nightmare as you say, that's going to be difficult, as we really don't want to have to provide multiple versions.

BTW, if anyone is curious, it's possible to try a custom build right now linked against 2.6.1. You'd pass -Dprotobuf.version=2.6.1 and -Dprotoc.path=<path to protoc 2.6.1 binary> when you run the mvn command.

Once I have fixed all the other source portability issues I'll circle back around and take a look at this.

-- Alan Burlison
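Put together, the custom-build invocation described above would look something like this (a sketch; the protoc path shown is a placeholder and must point at a locally installed 2.6.1 binary):

```shell
# Build Hadoop against Protocol Buffers 2.6.1 instead of the default 2.5.0.
# The protoc binary version must match protobuf.version.
mvn clean install -DskipTests \
    -Dprotobuf.version=2.6.1 \
    -Dprotoc.path=/usr/local/protobuf-2.6.1/bin/protoc   # hypothetical path; adjust locally
```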
[jira] [Created] (HADOOP-11971) Move test utilities for tracing from hadoop-hdfs to hadoop-common
Masatake Iwasaki created HADOOP-11971:

Summary: Move test utilities for tracing from hadoop-hdfs to hadoop-common
Key: HADOOP-11971
URL: https://issues.apache.org/jira/browse/HADOOP-11971
Project: Hadoop Common
Issue Type: Improvement
Components: tracing
Reporter: Masatake Iwasaki
Assignee: Masatake Iwasaki
Priority: Minor

Utilities used in TestTracing, such as SetSpanReceiver, should be moved to {{hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/tracing/}} in order to make them usable from YARN and other modules.
[jira] [Resolved] (HADOOP-11220) Jenkins should verify mvn site if the patch contains *.apt.vm changes
[ https://issues.apache.org/jira/browse/HADOOP-11220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allen Wittenauer resolved HADOOP-11220.
Resolution: Fixed

Yup. Closing as contained by HADOOP-11746, which rewrote the precommit checks to run mvn site as appropriate.

Jenkins should verify mvn site if the patch contains *.apt.vm changes

Key: HADOOP-11220
URL: https://issues.apache.org/jira/browse/HADOOP-11220
Project: Hadoop Common
Issue Type: Improvement
Reporter: Zhijie Shen

It would be good to make Jenkins verify mvn site if the patch contains *.apt.vm changes, to avoid some obvious build failures such as YARN-2732. This is not the first time that similar issues have been raised. Having an automated verification would alert us before we encounter an actual build failure that involves the site lifecycle.
[jira] [Created] (HADOOP-11973) Some ZkDelegationTokenSecretManager znodes do not have ACLs
Gregory Chanan created HADOOP-11973:

Summary: Some ZkDelegationTokenSecretManager znodes do not have ACLs
Key: HADOOP-11973
URL: https://issues.apache.org/jira/browse/HADOOP-11973
Project: Hadoop Common
Issue Type: Bug
Components: security
Affects Versions: 2.6.0
Reporter: Gregory Chanan

I recently added an ACL provider to the Curator framework instance I pass to the ZkDelegationTokenSecretManager, and noticed some strangeness around ACLs. I set zk-dt-secret-manager.znodeWorkingPath to solr/zkdtsm and noticed that /solr/zkdtsm and /solr/zkdtsm/ZKDTSMRoot do not have ACLs, but all the znodes under /solr/zkdtsm/ZKDTSMRoot do have ACLs. From adding some logging, it looks like the ACLProvider is never called for /solr/zkdtsm or /solr/zkdtsm/ZKDTSMRoot. I don't know whether that's a Curator issue or a ZkDelegationTokenSecretManager issue.