Hi Andre
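Are you using the LocalResource type ARCHIVE? Using FILE may not help in your
scenario, since a FILE resource is localized as-is and the jars in its lib
folder stay packed inside it.
If you are using ARCHIVE, YARN unpacks the jar into a directory named after the
resource key, and you can then use the classpath config
(TEZ_CLUSTER_ADDITIONAL_CLASSPATH_PREFIX) to modify the classpath.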
For example, assume foo.jar and bar.jar (in the structure that you called out)
are added to the map of local resources using keys foo and bar:
- classpath prefix would be
"$PWD/foo/*:$PWD/foo/lib/*:$PWD/bar/*:$PWD/bar/lib/*:"
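To make the pattern concrete, here is a minimal sketch of how that prefix could be assembled from the local resource keys. It is plain string building, not Tez code: the class and method names are made up for illustration, and it assumes every archive uses the same layout (classes at the top level, dependencies under lib).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Tez or YARN.
final class ClasspathPrefix {
    private ClasspathPrefix() {}

    /**
     * Builds a classpath prefix for ARCHIVE resources registered under the
     * given keys. Assumes each archive is unpacked into $PWD/<key> and keeps
     * its dependencies in a lib sub-directory.
     */
    static String build(List<String> resourceKeys) {
        StringBuilder prefix = new StringBuilder();
        for (String key : resourceKeys) {
            // Top level of the unpacked archive.
            prefix.append("$PWD/").append(key).append("/*:");
            // Embedded lib directory of the unpacked archive.
            prefix.append("$PWD/").append(key).append("/lib/*:");
        }
        return prefix.toString();
    }

    public static void main(String[] args) {
        // Keys must match the keys used in the local resources map.
        System.out.println(build(Arrays.asList("foo", "bar")));
    }
}
```

For the keys foo and bar this yields the same string as above. The result would then be set as the value of TEZ_CLUSTER_ADDITIONAL_CLASSPATH_PREFIX on the configuration you pass to Tez; exactly where you set it depends on how your client is wired up.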
As mentioned on the JIRA, the launch_container.sh from your cluster would help
in debugging this. Also, if you upload an example jar to the JIRA, I can help
provide a working example.
thanks
— Hitesh
On Jun 18, 2015, at 9:40 AM, Andre Kelpe <[email protected]> wrote:
> On Wed, Jun 17, 2015 at 4:58 PM, Bikas Saha <[email protected]> wrote:
>
>> If I understand this right, there is a jar with user code in it. The jar
>> needs to be available during split creation but it is not available.
>>
>>
>>
>> Is split creation happening on the client or on the AM? If it's happening
>> on the AM, and the AM is not getting the jars, then how are you specifying
>> the jars to be sent to the AM? There are different ways to do it.
>>
>
> In our case the AM is doing the split calculation. We are sending the jar
> over as LocalResources given in the TezClient#create method.
>
>
>> 1) Set tez.aux.uris in tez-site.xml to an HDFS location and copy
>> user jars there
>>
>> 2) Upload the user jar to HDFS and create a YARN local resource for
>> it. Then use either of the following to add the local resource to the
>> AM/DAG that needs it.
>>
>> a. TezClient#addAppMasterLocalFiles(…)
>>
>> b. DAG#addTaskLocalFiles(…)
>>
>>
>>
>> Not sure what is meant by classic Hadoop style jars?
>>
>
> Hadoop style jars are jar files where you have the user code + all
> required libs in a sub-directory within the jar. This is the layout that
> RunJar has always understood.
>
> The thing is that we can't find a way to put the jars in the lib folder in
> the job-jar on the classpath of the AM.
>
> - André
>
>
>
>>
>>
>> Bikas
>>
>>
>>
>> *From:* Chris K Wensel [mailto:[email protected]]
>> *Sent:* Wednesday, June 17, 2015 4:41 PM
>> *To:* [email protected]
>> *Cc:* [email protected]
>> *Subject:* Re: ClassNotFoundException with custom InputFormat.
>>
>>
>>
>> cross posting down to dev… should continue the discussion there I believe.
>>
>>
>>
>> as I understand it, all Cascading users familiar with packaging a Hadoop
>> job jar with a lib folder, in which the packaged custom InputFormat is
>> placed — pulled from maven etc, will have this issue.
>>
>>
>>
>> this also expands to projects on top of Cascading including Scalding and
>> Cascalog.
>>
>>
>>
>> oddly the org.apache.tez.client.AMConfiguration has a
>>
>>
>>
>> private Map<String, String> env;
>>
>>
>>
>> but it is unused.
>>
>>
>>
>> On Jun 17, 2015, at 4:32 PM, Andre Kelpe <[email protected]>
>> wrote:
>>
>>
>>
>> Hi,
>>
>> we are currently running into a problem when a user of Cascading uses a
>> custom InputFormat with Tez. The ApplicationMaster is running into a
>> ClassNotFoundException when calculating the splits, since we are unable to
>> control the environment/classpath visible to the ApplicationMaster. We
>> have a work-around, where the users have to supply a fat-jar to make it
>> work, but we need to be able to support other ways as well.
>>
>> When interacting with the DAG, we are able to pass along a custom
>> environment/classpath, but that API is missing on the TezClient, causing
>> the AppMaster to fail, when the user is using classic hadoop style jars
>> (embedded lib directory).
>>
>> In order to get lingual, our SQL layer on top of Cascading, to work
>> correctly, we need a way to supply the environment in a more dynamic way
>> than one fat-jar, so it would be great if the API could be extended to do
>> that.
>>
>> I have opened https://issues.apache.org/jira/browse/TEZ-2563
>>
>> Thanks!
>>
>>
>>
>> - André
>>
>>
>> --
>>
>> André Kelpe
>> [email protected]
>> http://concurrentinc.com
>>
>>
>>
>> —
>>
>> Chris K Wensel
>>
>> [email protected]
>>
>
>
>
> --
> André Kelpe
> [email protected]
> http://concurrentinc.com