Hi Andy,

I ran the following tests as you specified:

- classify-wikipedia.sh - Option 2
- cluster-reuters.sh - Options 1, 2
- classify-20newsgroups.sh - Option 1

All these examples *ran successfully* on a Cloudera QuickStart VM 5.7.

I had to change the cluster JVM to 1.8 to make it work; otherwise lucene was failing with an incompatible class major/minor version error (because lucene 6.1.0 was built for JVM 1.8). On seeing that this patch wasn't working before I switched to Java 8, I wondered why at first.

Thanks,
Raviteja

On Sat, Aug 6, 2016 at 5:15 PM, Raviteja Lokineni <raviteja.lokin...@gmail.com> wrote:

> I will let you know by tomorrow. Will run them now.
>
> On Aug 6, 2016 5:13 PM, "Andrew Palumbo" <ap....@outlook.com> wrote:
>
>> We will likely move to Java 8 at some point of course, but I personally
>> would not be inclined to enforce it right now as most of our current new
>> work is Scala-based, and this (the lucene dep.) is only used in legacy
>> components. Admittedly though, one useful legacy component. Were you
>> able to get the examples to run in pseudo-cluster mode with lucene 6?
>>
>> Thanks,
>>
>> Andy
>>
>> ________________________________
>> From: Andrew Palumbo <ap....@outlook.com>
>> Sent: Saturday, August 6, 2016 5:03:45 PM
>> To: dev@mahout.apache.org
>> Subject: Re: MAHOUT-1876 - Lucene compatibility
>>
>> Thank you Raviteja, this is something that we will have to discuss.
>>
>> ________________________________
>> From: Raviteja Lokineni <raviteja.lokin...@gmail.com>
>> Sent: Friday, August 5, 2016 11:41:09 PM
>> To: mahout
>> Subject: Re: MAHOUT-1876 - Lucene compatibility
>>
>> Guys, found an issue: lucene 6.x is compatible only with Java 8. What's the
>> plan for mahout compatibility? Do you guys want to call in a vote for Java
>> compatibility?
>>
>> On Aug 5, 2016 4:58 PM, "Andrew Palumbo" <ap....@outlook.com> wrote:
>>
>> > Hi Raviteja,
>> >
>> > Since this upgrade affects the entire Mahout MapReduce text processing
>> > pipeline it is important to make sure that it is working in the end to end
>> > examples.
>> >
>> > Could you please set up a Hadoop 2.4.1 pseudo cluster and run through the
>> > previously mentioned examples?
>> >
>> > The instructions are here (this is from 2.7.1 but should be the same for
>> > 2.4.1) :
>> >
>> > https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation
>> >
>> > Thanks very much,
>> >
>> > Andy
>> >
>> > ________________________________
>> > From: Andrew Palumbo <ap....@outlook.com>
>> > Sent: Friday, August 5, 2016 2:38 PM
>> > To: dev@mahout.apache.org
>> > Subject: Re: MAHOUT-1876 - Lucene compatibility
>> >
>> > Ahh- yes I think we started removing MAHOUT_LOCAL capability I see the
>> > check for MAHOUT_LOCAL was removed in this commit:
>> >
>> > https://github.com/apache/mahout/commit/daad3a4ce618cbd05be468c4ce6e451618f3a028
>> >
>> > MAHOUT-1665: Update hadoop commands in example scripts (akm) closes a… · apache/mahout@daad3a4
>> > …pache/mahout#98
>> >
>> > So it would make sense that you are seeing that Error in local mode.
>> >
>> > ________________________________
>> > From: Raviteja Lokineni <raviteja.lokin...@gmail.com>
>> > Sent: Friday, August 5, 2016 2:28:08 PM
>> > To: mahout
>> > Subject: Re: MAHOUT-1876 - Lucene compatibility
>> >
>> > Nope in a Linux environment.
>> >
>> > On Aug 5, 2016 2:21 PM, "Suneel Marthi" <smar...@apache.org> wrote:
>> >
>> > > r u running this on windows prompt or in Cygwin.
>> > >
>> > > Suggest use Cygwin.
>> > >
>> > > On Fri, Aug 5, 2016 at 2:15 PM, Raviteja Lokineni <raviteja.lokin...@gmail.com> wrote:
>> > >
>> > > > This is what I get.
>> > > >
>> > > > $ ./classify-20newsgroups.sh
>> > > > /home/lok268/projects/mahout/examples/bin/set-dfs-commands.sh: line 36: /bin/hadoop: No such file or directory
>> > > > /home/lok268/projects/mahout/examples/bin/set-dfs-commands.sh: line 38: [: too many arguments
>> > > > /home/lok268/projects/mahout/examples/bin/set-dfs-commands.sh: line 43: [: -eq: unary operator expected
>> > > > Can't determine Hadoop version.
>> > > >
>> > > > On Fri, Aug 5, 2016 at 2:08 PM, Suneel Marthi <smar...@apache.org> wrote:
>> > > >
>> > > > > u don't need a hadoop cluster for that,
>> > > > >
>> > > > > set MAHOUT_LOCAL=true
>> > > > > and u should be able to run locally
>> > > > >
>> > > > > On Fri, Aug 5, 2016 at 1:57 PM, Raviteja Lokineni <raviteja.lokin...@gmail.com> wrote:
>> > > > >
>> > > > > > Hi Andrew,
>> > > > > >
>> > > > > > Looks like the examples don't seem to work unless on a hadoop cluster. If I
>> > > > > > get some time I will download a cloudera quickstart vm and test it out.
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Raviteja
>> > > > > >
>> > > > > > On Fri, Aug 5, 2016 at 12:53 PM, Andrew Palumbo <ap....@outlook.com> wrote:
>> > > > > >
>> > > > > > > Thanks again Raviteja,
>> > > > > > > Tests pass in my Linux env as well.
>> > > > > > >
>> > > > > > > FYI, if the windows script has not yet been officially deprecated it
>> > > > > > > should be soon.
>> > > > > > >
>> > > > > > > As Suneel said, someone will merge it over the weekend. In the meantime
>> > > > > > > it would be good to ensure that some of the examples are working in the
>> > > > > > > $MAHOUT_HOME/examples/bin dir. Could you try running
>> > > > > > > classify-wikipedia.sh option (2), cluster-reuters.sh option (1) or (2) and
>> > > > > > > classify-20newsgroups.sh option 1 in (pseudo)cluster mode if possible?
>> > > > > > >
>> > > > > > > This would ensure that seq2sparse, which relies heavily on lucene, is
>> > > > > > > working correctly.
>> > > > > > >
>> > > > > > > Thanks again for the great contribution.
>> > > > > > >
>> > > > > > > Andy
>> > > > > > >
>> > > > > > > -------- Original message --------
>> > > > > > > From: Raviteja Lokineni <raviteja.lokin...@gmail.com>
>> > > > > > > Date: 08/05/2016 12:42 PM (GMT-05:00)
>> > > > > > > To: mahout <dev@mahout.apache.org>
>> > > > > > > Subject: Re: MAHOUT-1876 - Lucene compatibility
>> > > > > > >
>> > > > > > > Just a FYI, all the tests are successful on windows too ;)
>> > > > > > >
>> > > > > > > On Fri, Aug 5, 2016 at 12:18 PM, Andrew Palumbo <ap....@outlook.com> wrote:
>> > > > > > >
>> > > > > > > > +1
>> > > > > > > >
>> > > > > > > > ________________________________
>> > > > > > > > From: Raviteja Lokineni <raviteja.lokin...@gmail.com>
>> > > > > > > > Sent: Friday, August 5, 2016 12:14:24 PM
>> > > > > > > > To: mahout
>> > > > > > > > Subject: Re: MAHOUT-1876 - Lucene compatibility
>> > > > > > > >
>> > > > > > > > Yay! for the heads up on merging.
>> > > > > > > >
>> > > > > > > > FYI, I take back my word on failure on windows though. I had to include the
>> > > > > > > > hadoop.dll file on PATH.
>> > > > > > > > Tests are running (I am running it just to
>> > > > > >
>> > > > > > --
>> > > > > > *Raviteja Lokineni* | Business Intelligence Developer
>> > > > > > TD Ameritrade
>> > > > > >
>> > > > > > E: raviteja.lokin...@gmail.com
>> > > > > >
>> > > > > > <http://in.linkedin.com/in/ravitejalokineni>

--
*Raviteja Lokineni* | Business Intelligence Developer
TD Ameritrade

E: raviteja.lokin...@gmail.com

<http://in.linkedin.com/in/ravitejalokineni>
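
For anyone hitting the same class major/minor version error: it usually means a class was compiled for a newer JVM than the one running it. Every .class file records its target in the header, and major version 52 corresponds to Java 8 (51 to Java 7, 50 to Java 6). Below is a minimal sketch for checking this by hand; the helper names and the jar path in the usage comment are assumptions for illustration, not part of Mahout or Lucene.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not Mahout/Lucene tooling.

# Print the major version stored in a .class file header.
# Header layout: 4 bytes magic (CAFEBABE), 2 bytes minor, 2 bytes major,
# all big-endian.
class_major_version() {
    # Skip the first 6 bytes, then dump the next 2 as unsigned decimals.
    bytes=$(od -An -j6 -N2 -tu1 "$1")
    # Split the two decimal values into $1 (high byte) and $2 (low byte).
    set -- $bytes
    echo $(( $1 * 256 + $2 ))
}

# Map a class-file major version to the Java release it targets
# (49 -> 5, 50 -> 6, 51 -> 7, 52 -> 8).
java_release_for_major() {
    echo $(( $1 - 44 ))
}

# Usage (the jar location is an assumption; adjust to your checkout):
#   unzip -p lucene-core-6.1.0.jar org/apache/lucene/util/Version.class > /tmp/Version.class
#   java_release_for_major "$(class_major_version /tmp/Version.class)"
```

If the release printed for a dependency is higher than what `java -version` reports on the cluster nodes, the jobs will fail at class-loading time no matter how the examples are invoked.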