moving to dev@
So, I can run KMeansDriver directly on EMR, but what I actually want to do is
run MahoutDriver on EMR. The only sticking point is these lines:
<snip classname="MahoutDriver">
InputStream propsStream = Thread.currentThread()
    .getContextClassLoader()
    .getResourceAsStream("driver.classes.props");
mainClasses.load(propsStream);
</snip>
because the properties files are not on the classpath that EMR gets.
Anyone have suggestions on working around this?
My first thought is to create a JOB jar that contains the properties files, but
it occurred to me that there might be a way to extend the classpath instead.
Other thoughts:
1. Instead of requiring driver.classes.props, we could have an interface that
each of those drivers implements and that reports its short name and
description. Then we just need to do some reflection at startup to find all
implementers of the interface.
2. We create a "default.driver.classes.props" that is actually packaged into
the JOB jar. We look for driver.classes.props first, then for
default.driver.classes.props, and throw an exception if neither is found.
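A rough sketch of what option 1 might look like. The JDK has no built-in way to
enumerate every implementer of an interface, so this swaps in
java.util.ServiceLoader, which only finds implementations listed in a
META-INF/services file inside the jar. The names DriverRegistry and
DriverDescriptor are hypothetical and do not exist in Mahout:

```java
import java.util.Map;
import java.util.ServiceLoader;
import java.util.TreeMap;

public class DriverRegistry {

  /** Hypothetical interface each driver would implement (not a Mahout API). */
  public interface DriverDescriptor {
    String shortName();
    String description();
  }

  /** Indexes the given drivers by short name; a later duplicate replaces an earlier one. */
  public static Map<String, DriverDescriptor> index(
      Iterable<? extends DriverDescriptor> drivers) {
    Map<String, DriverDescriptor> byName = new TreeMap<>();
    for (DriverDescriptor driver : drivers) {
      byName.put(driver.shortName(), driver);
    }
    return byName;
  }

  /**
   * Finds implementations through ServiceLoader. Assumes the jar lists them in
   * a META-INF/services file named after the interface's binary name;
   * returns an empty map when no such file is present.
   */
  public static Map<String, DriverDescriptor> discover() {
    return index(ServiceLoader.load(DriverDescriptor.class));
  }
}
```

Note that this still depends on a file packaged in the jar, so it trades one
packaged resource for another rather than removing the packaging step.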
I guess my preference is #2, since that is the least code, still allows the
existing functionality to work and provides reasonable defaults w/o any setup.
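A minimal sketch of option 2, assuming both files are resolved through the same
class loader. The class DriverPropsLoader and the method loadDriverClasses are
hypothetical names, not existing MahoutDriver code, and IOException is just one
possible choice for the exception:

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class DriverPropsLoader {

  /**
   * Tries driver.classes.props first, then default.driver.classes.props,
   * and throws if neither is visible to the given class loader.
   */
  public static Properties loadDriverClasses(ClassLoader loader) throws IOException {
    String[] candidates = {"driver.classes.props", "default.driver.classes.props"};
    for (String name : candidates) {
      // getResourceAsStream returns null when the resource is not found;
      // try-with-resources skips close() for a null resource.
      try (InputStream in = loader.getResourceAsStream(name)) {
        if (in != null) {
          Properties props = new Properties();
          props.load(in);
          return props;
        }
      }
    }
    throw new IOException(
        "Neither driver.classes.props nor default.driver.classes.props is on the classpath");
  }

  public static void main(String[] args) {
    try {
      Properties mainClasses =
          loadDriverClasses(Thread.currentThread().getContextClassLoader());
      System.out.println("Loaded " + mainClasses.size() + " driver entries");
    } catch (IOException e) {
      System.err.println(e.getMessage());
    }
  }
}
```

A user-supplied driver.classes.props still wins when it is present, so existing
setups keep working unchanged.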
Thoughts?
-Grant
On Sep 12, 2010, at 8:07 AM, Grant Ingersoll wrote:
>
> On Sep 12, 2010, at 7:42 AM, Grant Ingersoll wrote:
>
>>
>> On Sep 11, 2010, at 10:11 PM, Drew Farris wrote:
>>
>>> I will write up notes on the EMR wiki page.
>
> https://cwiki.apache.org/confluence/display/MAHOUT/Mahout+on+Elastic+MapReduce
> is updated to 0.4-SNAPSHOT.
>
> -Grant
>
>