> On Aug 22, 2017, at 6:00 AM, Steve Loughran <[email protected]> wrote:
>
>
> I'm having problems getting the s3 classpath setup on the CLI & am trying to
> work out what I'm doing wrong.
>
>
> without setting things up, you can't expect to talk to blobstores
>
> hadoop fs -ls wasb://something/
> hadoop fs -ls s3a://landsat-pds/
>
> That's expected.
Yup.
> but what I can't do is get the aws bits on the CP via HADOOP_OPTIONAL_TOOLS
>
> export
> HADOOP_OPTIONAL_TOOLS="hadoop-azure,hadoop-aws,hadoop-adl,hadoop-openstack"
>
> Once I do that the wasb:// ls works (or at least doesn't throw a CNFE), but
> the s3a URL still fails
Hmm. So HOT is getting processed at least somewhat then...
> if I add the line to ~/.hadooprc, all becomes well
>
> hadoop_add_to_classpath_tools hadoop-aws
>
> any ideas?
Setting HOT should be calling the equivalent of
hadoop_add_to_classpath_tools hadoop-aws in the code path. Luckily, we have
debugging tools in 3.x[1]:
First, let’s duplicate the failure conditions, but only activate hadoop-aws
since it should be standalone and cuts our output down:
=======================
$ cat ~/.hadooprc
cat: /Users/aw/.hadooprc: No such file or directory
$ bin/hadoop envvars | grep CONF
HADOOP_CONF_DIR='/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/etc/hadoop'
$ pwd
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT
$ grep OPTIONAL_TOOLS etc/hadoop/hadoop-env.sh
# export HADOOP_OPTIONAL_TOOLS="hadoop-aliyun,hadoop-aws,hadoop-azure,hadoop-azure-datalake,hadoop-kafka,hadoop-openstack"
export HADOOP_OPTIONAL_TOOLS="hadoop-aws"
=======================
Using --debug, let’s see what happens:
=======================
$ bin/hadoop --debug classpath 2>&1 | egrep '(tools|hadoop-aws)'
DEBUG: shellprofiles:
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-aliyun.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-archive-logs.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-archives.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-aws.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-azure-datalake.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-azure.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-distcp.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-extras.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-gridmix.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-hdfs.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-httpfs.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-kafka.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-kms.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-mapreduce.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-openstack.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-rumen.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-streaming.sh
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-yarn.sh
DEBUG: Adding hadoop-aws to HADOOP_TOOLS_OPTIONS
DEBUG: Profiles: importing
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/bin/../libexec/shellprofile.d/hadoop-aws.sh
DEBUG: HADOOP_SHELL_PROFILES accepted hadoop-aws
DEBUG: Profiles: hadoop-aws classpath
DEBUG: Append CLASSPATH:
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/tools/lib/aws-java-sdk-bundle-1.11.134.jar
DEBUG: Append CLASSPATH:
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/tools/lib/java-xmlbuilder-0.4.jar
DEBUG: Append CLASSPATH:
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/tools/lib/jets3t-0.9.0.jar
DEBUG: Append CLASSPATH:
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/tools/lib/hadoop-aws-3.0.0-beta1-SNAPSHOT.jar
/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/etc/hadoop:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/common/lib/*:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/common/*:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/tools/lib/aws-java-sdk-bundle-1.11.134.jar:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/tools/lib/java-xmlbuilder-0.4.jar:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/tools/lib/jets3t-0.9.0.jar:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/tools/lib/hadoop-aws-3.0.0-beta1-SNAPSHOT.jar:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/hdfs:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/hdfs/lib/*:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/hdfs/*:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/mapreduce/*:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/yarn/lib/*:/Users/aw/H/hadoop-3.0.0-beta1-SNAPSHOT/share/hadoop/yarn/*
=======================
OK, the “extra” bits are definitely getting added. The debug lines show
that:
* the hadoop-aws profile and tools hooks are getting executed
* the hadoop-aws classpath function is getting executed (aka
hadoop_add_to_classpath_tools hadoop-aws)
* the classpath isn’t rejecting any jars
* the final line definitely has AWS there.
So we should be good to go assuming the profile and supplemental tools code is
correct.
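
For anyone who wants to repeat that last check without wading through the
whole debug dump, here is a minimal sketch. The helper name classpath_has is
made up for this mail and is not something Hadoop ships; it only splits a
classpath string on ':' and looks for a substring in the entries.

```shell
#!/usr/bin/env bash
# classpath_has is a hypothetical helper, not part of Hadoop.
# Usage: classpath_has "<classpath string>" "<substring>"
# Returns 0 if any ':'-separated entry contains the substring.
classpath_has() {
  local cp="$1" needle="$2"
  # One entry per line, then a fixed-string (non-regex) search.
  printf '%s\n' "${cp}" | tr ':' '\n' | grep -qF -- "${needle}"
}

# Example against a live install (run from the top of the Hadoop directory):
#   classpath_has "$(bin/hadoop classpath)" "hadoop-aws" \
#     && echo "hadoop-aws is on the classpath"
```

If that prints nothing with HADOOP_OPTIONAL_TOOLS set, the --debug output is
the next place to look.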
=======================
$ bin/hadoop fs -ls s3a://landsat-pds/
ls: Interrupted
=======================
umm, ok? No CNFE though. If I disable the network:
=======================
$ bin/hadoop fs -ls s3a://landsat-pds/
ls: doesBucketExist on landsat-pds: com.amazonaws.AmazonClientException: No AWS
Credentials provided by BasicAWSCredentialsProvider
EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider :
com.amazonaws.SdkClientException: Unable to load credentials from service
endpoint: No AWS Credentials provided by BasicAWSCredentialsProvider
EnvironmentVariableCredentialsProvider InstanceProfileCredentialsProvider :
com.amazonaws.SdkClientException: Unable to load credentials from service
endpoint
=======================
Ugly error, but still no CNFE. So, at least out of the box with a build from
last week, I guess this is working? At this point, it’d probably be worthwhile
to make sure that the libexec/shellprofile.d/hadoop-aws.sh on your system is in
working order. In particular...
=======================
if hadoop_verify_entry HADOOP_TOOLS_OPTIONS "hadoop-aws"; then
  hadoop_add_profile "hadoop-aws"
fi
=======================
… is the magic code. It (effectively[2]) says that if HADOOP_OPTIONAL_TOOLS
has hadoop-aws in it, then activate the hadoop-aws profile which should end up
calling hadoop_add_to_classpath_tools hadoop-aws. Might also be worthwhile to
check simple stuff like permissions.
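
To make those checks concrete, here is a rough sketch. Both function names
are invented for this mail and are not Hadoop's; the real logic lives in the
libexec shell code. The first only mimics the idea of the guard (a
comma-separated list goes in, a whole-name membership test comes out). The
second checks that the profile file is readable and still contains the guard.

```shell
#!/usr/bin/env bash
# Both helpers are hypothetical illustrations, not Hadoop's implementation.

# tools_list_has "<comma-separated list>" "<entry>"
# Returns 0 if the entry is one of the comma-separated items (exact match).
tools_list_has() {
  local padded
  # Turn commas into spaces and pad both ends so only whole names match.
  padded=" $(printf '%s' "$1" | tr ',' ' ') "
  case "${padded}" in
    *" $2 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# profile_looks_sane "<path to shellprofile.d/hadoop-aws.sh>"
# Returns 0 if the file is readable and mentions the hadoop_verify_entry guard.
profile_looks_sane() {
  local file="$1"
  [ -r "${file}" ] || { echo "not readable: ${file}" >&2; return 1; }
  grep -q 'hadoop_verify_entry HADOOP_TOOLS_OPTIONS' "${file}" || {
    echo "guard not found in: ${file}" >&2
    return 1
  }
}

# Example (run from the top of the Hadoop directory):
#   tools_list_has "${HADOOP_OPTIONAL_TOOLS}" hadoop-aws && echo "requested"
#   profile_looks_sane libexec/shellprofile.d/hadoop-aws.sh && echo "profile ok"
```

If the first check passes and the second fails, the problem is most likely
the file on disk rather than the environment.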
[1] It’s tempting to say “now”, but given that debug was added several years
ago, it’s more like branch-2 is just really ancient rather than 3.x being
"current".
[2] Yes, that variable is supposed to be HADOOP_TOOLS_OPTIONS. HOT
(HADOOP_OPTIONAL_TOOLS) gets transformed into HADOOP_TOOLS_OPTIONS internally
for “reasons”. It’s a longer discussion that most people aren’t interested in.