PS: assuming we clean the mahout-math and scala modules, this should be fairly
easy. Maybe there's some stuff in the colt classes, but there shouldn't be a
lot?
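
For what it's worth, here is a minimal sketch of what a Guava-free replacement
for the Preconditions calls and the Iterable-to-collection copying discussed
in the quoted thread might look like. The class name GuavaFreeChecks and its
method names are made up for illustration; this is not existing Mahout code.
Unlike a Java assert, these checks stay active whether or not the JVM runs
with -ea, and they throw the same exception types as Guava's checkArgument and
checkNotNull.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Mahout or Guava.
final class GuavaFreeChecks {

  private GuavaFreeChecks() {
  }

  // Stand-in for Guava's Preconditions.checkArgument:
  // throws IllegalArgumentException when the condition is false.
  static void checkArgument(boolean condition, String message) {
    if (!condition) {
      throw new IllegalArgumentException(message);
    }
  }

  // Stand-in for Guava's Preconditions.checkNotNull:
  // throws NullPointerException on null, otherwise returns the reference.
  static <T> T checkNotNull(T reference, String message) {
    if (reference == null) {
      throw new NullPointerException(message);
    }
    return reference;
  }

  // Copies any Iterable (a HashSet, for example) into a new ArrayList,
  // preserving the iteration order of the source.
  static <T> List<T> toList(Iterable<? extends T> items) {
    List<T> result = new ArrayList<T>();
    for (T item : items) {
      result.add(item);
    }
    return result;
  }
}
```

A call site would then read GuavaFreeChecks.checkArgument(rows > 0, "rows must
be positive") instead of the Guava call, so the change is mostly mechanical.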


On Tue, May 19, 2015 at 11:16 AM, Dmitriy Lyubimov <[email protected]>
wrote:

> Can't we just declare mahout-mr's own Guava? Or inherit it from
> wherever it is declared in the Hadoop we depend on there?
>
> On Tue, May 19, 2015 at 9:24 AM, Pat Ferrel <[email protected]> wrote:
>
>> I was hoping someone knew the differences. Andrew and I are feeling our
>> way along since we haven’t used either to any extent.
>>
>> On May 19, 2015, at 9:17 AM, Suneel Marthi <[email protected]> wrote:
>>
>> OK, I see your point if it's only for mahout-math and mahout-hdfs. Not sure
>> if it's just a straight replacement of Preconditions -> asserts, though.
>> Preconditions throw an exception if some condition is not satisfied. Java
>> asserts are disabled at runtime unless enabled with -ea, and were never
>> meant to be used in production code.
>>
>> So the right fix would be to replace all references to Preconditions with
>> some exception handling boilerplate.
>>
>> On Tue, May 19, 2015 at 11:58 AM, Pat Ferrel <[email protected]>
>> wrote:
>>
>> > We only have to worry about mahout-math and mahout-hdfs.
>> >
>> > Yes, Andrew was working on those; they were replaced with plain Java
>> > asserts.
>> >
>> > There still remain the uses you mention in those two modules, but I see no
>> > good alternative to hacking them out. Maybe we can move some code out to
>> > mahout-mr if it’s easier.
>> >
>> > On May 19, 2015, at 8:48 AM, Suneel Marthi <[email protected]> wrote:
>> >
>> > I had tried minimizing the Guava dependency to a large extent in the run-up
>> > to 0.10.0. It's not as trivial as it seems: there are parts of the code
>> > (Collocations, lucene2seq, Lucene TokenStream processing and tokenization
>> > code) that are heavily reliant on AbstractIterator, and there are sections
>> > of the code that assign a HashSet to a List (again, one has to use Guava
>> > for that to avoid writing boilerplate for doing the same).
>> >
>> > Moreover, things that return something like Iterable<?> and need to be
>> > converted into a regular collection can easily be converted using Guava,
>> > again without writing our own boilerplate.
>> >
>> > Are we replacing all Preconditions with straight asserts now?
>> >
>> >
>> > On Tue, May 19, 2015 at 11:21 AM, Pat Ferrel <[email protected]>
>> > wrote:
>> >
>> >> We need to move to Spark 1.3 ASAP and set the stage for beyond 1.3. The
>> >> primary reason is that the big distros are there already or will be very
>> >> soon. Many people using Mahout will have the environment they must use
>> >> dictated by support orgs in their companies, so our current position of
>> >> running only on Spark 1.1.1 means many potential users are out of luck.
>> >>
>> >> Here are the problems I know of in moving Mahout ahead on Spark:
>> >> 1) Guava in any backend code (executor closures) relies on being
>> >> serialized with JavaSerializer, which is broken and hasn’t been fixed in
>> >> 1.2+. There is a workaround, which involves moving a Guava jar to all
>> >> Spark workers, which is unacceptable in many cases. Guava in the Spark-1.2
>> >> PR has been removed from Scala code and will be pushed to master probably
>> >> this week. That leaves a bunch of uses of Guava in java math and hdfs.
>> >> Andrew has (I think) removed the Preconditions and replaced them with
>> >> asserts. But there remain some uses of Map and AbstractIterator from
>> >> Guava. Not sure how many of these remain, but if anyone can help please
>> >> check here: https://issues.apache.org/jira/browse/MAHOUT-1708
>> >> 2) The Mahout Shell relies on APIs not available in Spark 1.3.
>> >> 3) The API for writing to sequence files now requires implicit values that
>> >> are not available in the current code. I think Andy did a temp fix to
>> >> write to object files, but this is probably not what we want to release.
>> >>
>> >> I for one would dearly love to see Mahout 0.10.1 support Spark 1.3+, and
>> >> soon. This is a call for help in cleaning these things up. Even with no
>> >> new features, the above things would make Mahout much more usable in
>> >> current environments.
>> >
>> >
>>
>>
>
