Hey Jonathan,

Yep, you should put your jar into the lib directory of your Samza job
package. This can be done using Maven's assembly plugin. You can have the
plugin put all runtime dependencies (including transitive dependencies)
into your lib directory. Then you just need to add a dependency, and it'll
get sucked in as part of the Maven build. This is what hello-samza does.

Here's hello-samza's assembly file:

https://github.com/linkedin/hello-samza/blob/master/samza-job-package/src/main/assembly/src.xml?source=c

The most important part is the "dependencySets" section.

  <dependencySet>
    <outputDirectory>lib</outputDirectory>
    <includes>
      <include>org.apache.samza:samza-core_2.8.1</include>
      <include>org.apache.samza:samza-kafka_2.8.1</include>
      <include>org.apache.samza:samza-serializers_2.8.1</include>
      <include>org.apache.samza:samza-yarn_2.8.1</include>
      <include>org.slf4j:slf4j-log4j12</include>
      <include>samza:samza-wikipedia</include>
      <include>org.apache.kafka:kafka_2.8.1</include>
    </includes>
    <useTransitiveFiltering>true</useTransitiveFiltering>
  </dependencySet>

You can see that we're including transitive dependencies, and depending on
the Samza libraries, as well as samza-wikipedia. You can either add a new
<include> block here for your MySQL jars, OR you can have your Samza code
package's pom.xml depend on the MySQL jar, and it will get sucked in
automatically (due to useTransitiveFiltering).

Cheers,
Chris

On 2/10/14 3:49 PM, "Jonathan Poltak Samosir" <jonathan.samo...@gmail.com>
wrote:

>Yep, sure enough the jar I need is not in my classpath (as per the YARN
>container's stdout).
>
>To fix this issue, and for future reference, what is the best way of
>adding external libraries to the classpath for Samza?
>
>Going by how the samza/bin/run-class.sh script is written, it seems like
>it would probably work if I place the jars I need in the samza/lib
>directory. This seems rather hacky though, and I would have to do it every
>time I run `mvn clean package`.
>
>So is there something I'm missing in the Maven pom.xml that is placing
>all the needed libraries, apart from the new ones I've added as
>dependencies, into the samza/lib directory of the tarball? (sorry if this
>is something obvious...)
>
>Thanks,
>Jonathan
>
>
>------------------------------------------------------
>From: Chris Riccomini criccom...@linkedin.com
>Reply: dev@samza.incubator.apache.org dev@samza.incubator.apache.org
>Date: 10 February 2014 at 14:16:04
>To: dev@samza.incubator.apache.org dev@samza.incubator.apache.org
>Subject: Re: Unable to call external library classes from within Samza
>
>>
>> Hey Jonathan,
>>
>> Can you post your classpath? You can usually find this in your YARN
>> container's stdout file, if you're running with YARN. If you're running
>> with LocalJobFactory, it should print to STDOUT.
>>
>> Cheers,
>> Chris
>>
>> On 2/8/14 11:54 AM, "Jonathan Poltak Samosir" wrote:
>>
>> >Hello,
>> >
>> >This is a bit of a follow-on to a previous thread of mine,
>> >"org.apache.hadoop.util.Shell$ExitCodeException whenever Samza
>> >container launches".
>> >
>> >So I am not sure whether this is a bug, or if I am not doing something
>> >correctly, or it is intended, but whenever I attempt to call
>> >Class.forName() on an external library class from within Samza, I am
>> >running into a ClassNotFoundException. My Maven deps are set up
>> >correctly, and the exact same code works without any issues if invoked
>> >manually through a main() method, for example, as opposed to running it
>> >through Samza.
>> >
>> >The reason I want to do this is to test that classes for JDBC drivers
>> >can be found and used. I fully understand that this method is no longer
>> >needed as of JDBC 4.0, and that the drivers should be automatically
>> >loaded if found in the class path, although this isn't happening either
>> >(an SQLException is thrown with "No suitable driver found" if I leave
>> >out the Class.forName() call, which leads to the same problem). Just so
>> >you know, this same code also works fine and loads the appropriate JDBC
>> >driver if invoked directly through a main() method.
>> >
>> >I have also tried this with a number of different external libraries,
>> >both pulled in using Maven and manually linking .jar files on my disk,
>> >with the same result: those classes can be found fine from a standard
>> >Java main() method call, but cannot be found running in a Samza
>> >container.
>> >
>> >Has anyone encountered the same issue, or know why this could be
>> >happening?
>> >
>> >I hope all that made sense, let me know if it didn't and I'll try to
>> >rephrase.
>> >
>> >Here is the class I am trying to do this in, in case anyone wants to
>> >see what I'm talking about:
>> >https://github.com/poltak/hello-samza/blob/database-reader/samza-wikipedia/src/main/java/samza/examples/databasereader/system/DatabaseReaderConsumer.java
>> >
>> >Anyway, thanks for your time,
>> >Jonathan
>> >
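
For anyone hitting the same ClassNotFoundException: the Class.forName() test from the original message, combined with the classpath listing asked for earlier in the thread, could be wrapped in a small diagnostic along these lines. This is only a sketch. The class name ClasspathCheck and its isOnClasspath helper are made up for illustration and are not part of Samza, and com.mysql.jdbc.Driver is assumed as the default only because the thread mentions MySQL jars; substitute whichever driver class you actually use.

```java
import java.io.File;

public class ClasspathCheck {

    /** Returns true if the named class can be located by this class's class loader. */
    public static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not found, or found but one of its own dependencies is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed default driver class; pass another fully qualified name as the first argument.
        String driver = args.length > 0 ? args[0] : "com.mysql.jdbc.Driver";

        // Print each entry of the JVM classpath on its own line.
        String classpath = System.getProperty("java.class.path");
        for (String entry : classpath.split(File.pathSeparator)) {
            System.out.println("classpath entry: " + entry);
        }

        System.out.println(driver + " found: " + isOnClasspath(driver));
    }
}
```

Running the same check once from a plain main() and once from inside the job should show whether the driver jar actually made it into the lib directory of the job package.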