Re: Beyond Spark 1.1.1

2015-05-19 Thread Pat Ferrel
Are all those classes really needed for scala/spark? Seems like we should prune out non-dependencies if possible before we start changing the code. There are probably a lot of things that could be used with Mahout-Samsara but aren’t explicitly in it. Do we lose much by moving those to another m

Re: Beyond Spark 1.1.1

2015-05-19 Thread Andrew Musselman
Might not be terrible, I didn't look too hard but there are 97 instances of "com.google.common" in mahout-math and 4 in mahout-hdfs. On Tue, May 19, 2015 at 11:17 AM, Dmitriy Lyubimov wrote: > PS assuming we clean mahout-math and scala modules -- this should be fairly > easy. Maybe there's some

Build failed in Jenkins: Mahout-Examples-Cluster-Reuters-II #1193

2015-05-19 Thread Apache Jenkins Server
See -- [...truncated 1701 lines...] A mrlegacy/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/qr A mrlegacy/src/main/java/org/apache/mahout/math/hadoop/stoch

Re: Beyond Spark 1.1.1

2015-05-19 Thread Dmitriy Lyubimov
PS assuming we clean mahout-math and scala modules -- this should be fairly easy. Maybe there's some stuff in the colt classes but there shouldn't be a lot? On Tue, May 19, 2015 at 11:16 AM, Dmitriy Lyubimov wrote: > can't we just declare its own guava for mahout-mr? Or inherit it from > whenev

Re: Beyond Spark 1.1.1

2015-05-19 Thread Dmitriy Lyubimov
can't we just declare its own guava for mahout-mr? Or inherit it from whenever it is declared in hadoop we depend on there? On Tue, May 19, 2015 at 9:24 AM, Pat Ferrel wrote: > I was hoping someone knew the differences. Andrew and I are feeling our > way along since we haven’t used either to any

[jira] [Updated] (MAHOUT-1708) Replace Google/Guava in mahout-math and mahout-hdfs

2015-05-19 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pat Ferrel updated MAHOUT-1708: --- Summary: Replace Google/Guava in mahout-math and mahout-hdfs (was: Replace Preconditions with assert

[jira] [Commented] (MAHOUT-1708) Replace Preconditions with asserts for Spark code

2015-05-19 Thread Andrew Musselman (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550724#comment-14550724 ] Andrew Musselman commented on MAHOUT-1708: -- PR 132 has the removals of guava fro

Re: Beyond Spark 1.1.1

2015-05-19 Thread Pat Ferrel
I was hoping someone knew the differences. Andrew and I are feeling our way along since we haven’t used either to any extent. On May 19, 2015, at 9:17 AM, Suneel Marthi wrote: OK, I see your point if it's only for mahout-math and mahout-hdfs. Not sure if it's just a straight replacement of Preconditio

Re: Beyond Spark 1.1.1

2015-05-19 Thread Suneel Marthi
OK, I see your point if it's only for mahout-math and mahout-hdfs. Not sure if it's just a straight replacement of Preconditions -> asserts though. Preconditions throw an exception if some condition is not satisfied. Java asserts are never meant to be used in production code. So the right fix would be to
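The distinction raised in this message can be illustrated with a minimal sketch. It is not Mahout code: `ArgChecks`, `checkArgument`, `safeSqrt`, and `assertedSqrt` are hypothetical names, and `checkArgument` here is a plain-Java stand-in written for illustration, assuming the goal is to keep exception-throwing checks without depending on Guava.

```java
// Hypothetical illustration only; none of these names come from Mahout.
final class ArgChecks {
    private ArgChecks() {}

    // Plain-Java stand-in for an exception-throwing precondition check.
    // It is always enforced, regardless of JVM flags.
    static void checkArgument(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }

    // Explicit check: a negative input is rejected on every run.
    static double safeSqrt(double x) {
        checkArgument(x >= 0, "x must be non-negative: " + x);
        return Math.sqrt(x);
    }

    // Java assert: evaluated only when the JVM is started with -ea.
    // With assertions disabled (the JVM default) a negative input
    // passes through and Math.sqrt returns NaN.
    static double assertedSqrt(double x) {
        assert x >= 0 : "x must be non-negative: " + x;
        return Math.sqrt(x);
    }

    public static void main(String[] args) {
        try {
            safeSqrt(-1.0);
        } catch (IllegalArgumentException e) {
            System.out.println("explicit check rejected input: " + e.getMessage());
        }
        try {
            System.out.println("assert version returned: " + assertedSqrt(-1.0));
        } catch (AssertionError e) {
            System.out.println("assert fired (JVM run with -ea): " + e.getMessage());
        }
    }
}
```

Running this with and without `-ea` shows why a one-for-one swap of precondition checks for asserts changes behavior: the assert version only validates its input when assertions are enabled.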

Re: Beyond Spark 1.1.1

2015-05-19 Thread Pat Ferrel
We only have to worry about mahout-math and mahout-hdfs. Yes, Andrew was working on those; they were replaced with plain Java asserts. There still remain the uses you mention in those two modules but I see no good alternative to hacking them out. Maybe we can move some code out to mahout-mr if i

Re: Beyond Spark 1.1.1

2015-05-19 Thread Andrew Musselman
I only looked at replacing Preconditions by asserts and found a bunch of other stuff from Google common package, so held off. On Tuesday, May 19, 2015, Suneel Marthi wrote: > I had tried minimizing the Guava Dependency to a large extent in the run up > to 0.10.0. It's not as trivial as it seems,

[jira] [Comment Edited] (MAHOUT-1708) Replace Preconditions with asserts for Spark code

2015-05-19 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550671#comment-14550671 ] Pat Ferrel edited comment on MAHOUT-1708 at 5/19/15 4:01 PM: -

[jira] [Commented] (MAHOUT-1708) Replace Preconditions with asserts for Spark code

2015-05-19 Thread Pat Ferrel (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550671#comment-14550671 ] Pat Ferrel commented on MAHOUT-1708: [~andrew.musselman] can you create a PR branch s

Re: Beyond Spark 1.1.1

2015-05-19 Thread Suneel Marthi
I had tried minimizing the Guava dependency to a large extent in the run up to 0.10.0. It's not as trivial as it seems; there are parts of the code (Collocations, lucene2seq, Lucene TokenStream processing and tokenization code) that are heavily reliant on AbstractIterator; and there are sections o

Beyond Spark 1.1.1

2015-05-19 Thread Pat Ferrel
We need to move to Spark 1.3 ASAP and set the stage for beyond 1.3. The primary reason is that the big distros are there already or will be very soon. Many people using Mahout will have the environment they must use dictated by support orgs in their companies so our current position as running o