Re: Mahout for 1.5 JVM

2009-03-09 Thread deneche abdelhakim

The following classes use the Deque interface, which is not available in Java 
1.5:

. org.apache.mahout.classifier.bayes.BayesClassifier
. org.apache.mahout.classifier.cbayes.CBayesClassifier
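
For anyone attempting the backport, here is a minimal sketch of one possible workaround. It assumes the code only needs head and tail access: java.util.LinkedList already offers addFirst/addLast/removeFirst/getFirst/getLast on Java 5, so declaring the field as LinkedList rather than Deque avoids the Java 6-only interface. The class and method names below are hypothetical and are not taken from BayesClassifier or CBayesClassifier.

```java
import java.util.LinkedList;

// Hypothetical helper, not Mahout code: keeps the N most recent labels
// using only LinkedList methods that already exist in Java 5.
final class RecentLabels {
    private final LinkedList<String> labels = new LinkedList<String>();
    private final int capacity;

    RecentLabels(int capacity) {
        this.capacity = capacity;
    }

    // Append at the tail and evict from the head once over capacity.
    void add(String label) {
        labels.addLast(label);
        if (labels.size() > capacity) {
            labels.removeFirst();
        }
    }

    String oldest() {
        return labels.getFirst();
    }

    String newest() {
        return labels.getLast();
    }

    int size() {
        return labels.size();
    }
}
```

Whether this swap is enough depends on how the two classifiers actually use the interface, which would need checking against the source.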


--- On Mon 9.3.09, Sean Owen wrote:

> From: Sean Owen 
> Subject: Re: Mahout for 1.5 JVM
> To: mahout-dev@lucene.apache.org
> Date: Monday, March 9, 2009, 22:17
> Yeah I don't know of anything in my bits that actually uses a Java
> 6-only class, but could be proved wrong there. You can dig out my old
> build.xml file in a pinch to build just this bit -- I can write up a
> quick Ant build for you too for the same purpose.
> 
> You do need to make sure you compile with Java 6 since I do surely use
> stuff like @Override on methods implementing interface methods which
> isn't allowed in Java 5, but which javac in Java 6 can take care of if
> source is 6 and target is 5.
> 
> On Mon, Mar 9, 2009 at 9:13 PM, Otis Gospodnetic wrote:
> >
> > Hm, yeah, 1.6 because of Hadoop, I forgot about that.  I need only the Tasty
> > part of Mahout, though, and that one doesn't really need to run on Hadoop.
> > Any way to build just that (for 1.5)?
> 





Re: Google's Gson JSON

2009-03-09 Thread Jeff Eastman
Sounds like we have a couple of additional alternatives to try.  I'm 
going to continue with Gson in the DP MR stuff for the short term, since 
it is already working, and maybe try all three with Vector and Matrix as 
they stand now without annotations.


From a recent posting it sounds like annotations are bubbling up in 
priority.

Jeff


Sean Owen wrote:

If we're going this way -- and I strongly support it -- I'd suggest we
look a step beyond JSON. It is a more compact and standard string
encoding of complex data types, indeed. But it has the secondary goal
of being parseable as Javascript, and a string representation is not
the most efficient encoding.

This strikes me as exactly what Protocol Buffers (or Thrift from FB
perhaps) is for. It is certainly exactly what is used inside Google
for moving data around among MapReduces. It also has Java bindings.

On Mon, Mar 9, 2009 at 6:45 PM, Jeff Eastman  wrote:
  

A few months back, in the context of vector annotations, we had a discussion
of a more standard means to serialize our object state. The Dirichlet
Process implementation has a rather complicated DirichletState object which
must be serialized and so I have worked out a way to do this using Gson.
Though I had to use the 1.3 beta 2 release to get past a problem in the
1.2.3 release, the package seems to be up to the task of serializing
complicated, generic, classes.

In the post 0.1 timeframe, I will look into using native Gson to replace the
current Vector asFormatString as a step towards vector annotations. It's
Apache licensed.

Does anybody else have experience with or comments about this package?

Jeff





  






Re: Mahout for 1.5 JVM

2009-03-09 Thread Sean Owen
Yeah I don't know of anything in my bits that actually uses a Java
6-only class, but could be proved wrong there. You can dig out my old
build.xml file in a pinch to build just this bit -- I can write up a
quick Ant build for you too for the same purpose.

You do need to make sure you compile with Java 6 since I do surely use
stuff like @Override on methods implementing interface methods which
isn't allowed in Java 5, but which javac in Java 6 can take care of if
source is 6 and target is 5.
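
To make the @Override point concrete, here is a small sketch of the construct in question. The interface and class names are hypothetical and do not come from the Mahout source; they only show an annotation placed on a method that implements an interface method.

```java
// Hypothetical interface and class, only to show the construct in question.
interface Scorer {
    double score(long userId, long itemId);
}

final class ConstantScorer implements Scorer {
    private final double value;

    ConstantScorer(double value) {
        this.value = value;
    }

    // The annotation sits on a method that implements an interface method
    // rather than overriding a superclass method. Java 5 language rules
    // reject this; Java 6 rules accept it.
    @Override
    public double score(long userId, long itemId) {
        return value;
    }
}
```

Removing the annotation from such methods would be the other way to get the code past a Java 5 compiler, at the cost of losing the compile-time check.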

On Mon, Mar 9, 2009 at 9:13 PM, Otis Gospodnetic
 wrote:
>
> Hm, yeah, 1.6 because of Hadoop, I forgot about that.  I need only the Tasty 
> part of Mahout, though, and that one doesn't really need to run on Hadoop.  
> Any way to build just that (for 1.5)?


Re: Mahout for 1.5 JVM

2009-03-09 Thread Sean Owen
You need to compile with Java 6, and set source to "1.6" (er, is it
"6" and "5"? Ant accepts both and I bet Maven does too). The target is
indeed "1.5". Java 6 should be able to generate byte code for Java 5;
there is some chance though that the code or a dependency like Hadoop
uses a class or API introduced in Java 6. There weren't many of those
so some large subset of Mahout may indeed work on Java 5.

What error are you getting? That may point to the problem.
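
One way to confirm what the build really produced is to read the version stamped into a compiled class. The sketch below is a hypothetical utility, not part of Mahout; it relies on the class file header layout (magic number, minor version, major version), where major version 49 corresponds to Java 5 and 50 to Java 6. A caveat worth checking locally: javac is generally understood to refuse a target release lower than the source release, so the flag combination that actually works under the Java 6 compiler may be source 1.5 with target 1.5.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical utility: reads the header of a compiled .class file.
final class ClassFileVersion {
    private static final int MAGIC = 0xCAFEBABE;

    // Returns the major version: 49 means Java 5, 50 means Java 6.
    static int majorVersion(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readInt() != MAGIC) {
            throw new IOException("not a class file");
        }
        data.readUnsignedShort();        // minor version, ignored here
        return data.readUnsignedShort(); // major version
    }

    // True when a Java 5 JVM should be able to load the class.
    static boolean runsOnJava5(InputStream in) throws IOException {
        return majorVersion(in) <= 49;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: ClassFileVersion <path to .class file>");
            return;
        }
        InputStream in = new FileInputStream(args[0]);
        try {
            System.out.println("major version: " + majorVersion(in));
        } finally {
            in.close();
        }
    }
}
```

Running it against a class extracted from the built jar shows whether the target setting took effect. It says nothing about whether the code calls APIs that only exist in Java 6, which is a separate problem.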


On Mon, Mar 9, 2009 at 9:04 PM, Grant Ingersoll  wrote:
> Mahout requires 1.6 due to Hadoop requiring 1.6.  You'd have to backport to
> an older version of Hadoop.
>
> At any rate, I think the maven stuff looks right.  Are you exporting a 1.5
> JVM JAVA_HOME?  I think Maven just uses JAVA_HOME to determine the JVM
> version.


Re: Mahout for 1.5 JVM

2009-03-09 Thread Otis Gospodnetic

Hm, yeah, 1.6 because of Hadoop, I forgot about that.  I need only the Tasty 
part of Mahout, though, and that one doesn't really need to run on Hadoop.  Any 
way to build just that (for 1.5)?


Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message -----
> From: Grant Ingersoll 
> To: mahout-dev@lucene.apache.org
> Sent: Monday, March 9, 2009 5:04:09 PM
> Subject: Re: Mahout for 1.5 JVM
> 
> Mahout requires 1.6 due to Hadoop requiring 1.6.  You'd have to backport to 
> an older version of Hadoop.
> 
> At any rate, I think the maven stuff looks right.  Are you exporting a 1.5 
> JVM JAVA_HOME?  I think Maven just uses JAVA_HOME to determine the JVM version.
> 
> On Mar 9, 2009, at 4:51 PM, Otis Gospodnetic wrote:
> 
> > 
> > Hi,
> > 
> > This could be a Maven question but I think I've done this successfully 
> > with other Maven-based projects, so it could be a Mahout-specific thing.  I'm 
> > trying to build Mahout to run on 1.5 JVM.  So I tried:
> > 
> > $ mvn -Dmaven.compiler.target=1.5 clean install
> > 
> > But that built the same jar as without that parameter.  So I then modified 
> > 2 pom.xml files, but that didn't seem to build a jar that can run on 1.5.  
> > How should one go about doing this?
> > 
> > o...@lesina:~/workspace/asf-mahout$ svn diff pom.xml core/pom.xml
> > Index: pom.xml
> > ===
> > --- pom.xml(revision 751789)
> > +++ pom.xml(working copy)
> > @@ -24,9 +24,22 @@
> >examples
> >  
> > 
> > +
> > +1.5
> > +1.5
> > +
> > +
> >  
> >
> >  
> > +maven-compiler-plugin
> > +
> > +  1.5
> > +  1.5
> > +
> > +  
> > +
> > +  
> >maven-assembly-plugin
> >
> >  
> > Index: core/pom.xml
> > ===
> > --- core/pom.xml(revision 751789)
> > +++ core/pom.xml(working copy)
> > @@ -29,6 +29,13 @@
> >
> >  
> > 
> > +  
> > +maven-compiler-plugin
> > +
> > +  1.5
> > +  1.5
> > +
> > +  
> > 
> >  
> >  
> > 
> > 
> > Thanks,
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > 



Re: Mahout for 1.5 JVM

2009-03-09 Thread Grant Ingersoll
Mahout requires 1.6 due to Hadoop requiring 1.6.  You'd have to  
backport to an older version of Hadoop.


At any rate, I think the maven stuff looks right.  Are you exporting a  
1.5 JVM JAVA_HOME?  I think Maven just uses JAVA_HOME to determine the  
JVM version.


On Mar 9, 2009, at 4:51 PM, Otis Gospodnetic wrote:



Hi,

This could be a Maven question but I think I've done this  
successfully with other Maven-based projects, so it could be a  
Mahout-specific thing.  I'm trying to build Mahout to run on 1.5  
JVM.  So I tried:


$ mvn -Dmaven.compiler.target=1.5 clean install

But that built the same jar as without that parameter.  So I then
modified 2 pom.xml files, but that didn't seem to build a jar that
can run on 1.5.  How should one go about doing this?


o...@lesina:~/workspace/asf-mahout$ svn diff pom.xml core/pom.xml
Index: pom.xml
===
--- pom.xml(revision 751789)
+++ pom.xml(working copy)
@@ -24,9 +24,22 @@
examples
  

+
+1.5
+1.5
+
+
  

  
+maven-compiler-plugin
+
+  1.5
+  1.5
+
+  
+
+  
maven-assembly-plugin

  
Index: core/pom.xml
===
--- core/pom.xml(revision 751789)
+++ core/pom.xml(working copy)
@@ -29,6 +29,13 @@

  

+  
+maven-compiler-plugin
+
+  1.5
+  1.5
+
+  

  

  


Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch






Mahout for 1.5 JVM

2009-03-09 Thread Otis Gospodnetic

Hi,

This could be a Maven question but I think I've done this successfully with 
other Maven-based projects, so it could be a Mahout-specific thing.  I'm trying 
to build Mahout to run on 1.5 JVM.  So I tried:

$ mvn -Dmaven.compiler.target=1.5 clean install

But that built the same jar as without that parameter.  So I then modified 2 
pom.xml files, but that didn't seem to build a jar that can run on 1.5.  How 
should one go about doing this?

o...@lesina:~/workspace/asf-mahout$ svn diff pom.xml core/pom.xml 
Index: pom.xml
===
--- pom.xml(revision 751789)
+++ pom.xml(working copy)
@@ -24,9 +24,22 @@
 examples
   
 
+
+1.5
+1.5
+
+
   
 
   
+maven-compiler-plugin
+
+  1.5
+  1.5
+
+  
+
+  
 maven-assembly-plugin
 
   
Index: core/pom.xml
===
--- core/pom.xml(revision 751789)
+++ core/pom.xml(working copy)
@@ -29,6 +29,13 @@
 
   
 
+  
+maven-compiler-plugin
+
+  1.5
+  1.5
+
+  
 
   
   


Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



Re: Google's Gson JSON

2009-03-09 Thread Ted Dunning
+1

I use Thrift all the time now and find it helps things enormously.  One of
the best things about it is the good design with respect to version crossing.
Another is the way that unit tests of servers become trivial, traditional
unit tests of ordinary objects, and how easily connections to mock servers
can be built (as in, create an object).

One of the worst things is documentation (essentially nil).  It is also
pretty hard to figure out how to serialize an object as opposed to invoking
a service call.

On Mon, Mar 9, 2009 at 12:42 PM, Sean Owen  wrote:

> If we're going this way -- and I strongly support it -- I'd suggest we
> look a step beyond JSON. It is a more compact and standard string
> encoding of complex data types, indeed. But it has the secondary goal
> of being parseable as Javascript, and a string representation is not
> the most efficient encoding.
>
> This strikes me as exactly what Protocol Buffers (or Thrift from FB
> perhaps) is for. It is certainly exactly what is used inside Google
> for moving data around among MapReduces. It also has Java bindings.
>
>


Re: Google's Gson JSON

2009-03-09 Thread Sean Owen
If we're going this way -- and I strongly support it -- I'd suggest we
look a step beyond JSON. It is a more compact and standard string
encoding of complex data types, indeed. But it has the secondary goal
of being parseable as Javascript, and a string representation is not
the most efficient encoding.

This strikes me as exactly what Protocol Buffers (or Thrift from FB
perhaps) is for. It is certainly exactly what is used inside Google
for moving data around among MapReduces. It also has Java bindings.
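
As a rough illustration of the size argument, the sketch below encodes the same array of doubles once as JSON-style text and once as fixed-width binary. Everything in it is hypothetical: it is neither Mahout's Vector format nor the Protocol Buffers or Thrift wire format, just plain java.io to show why a string representation tends to be larger for full-precision values.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical comparison, not Mahout's Vector format and not Protocol Buffers.
final class EncodingSize {
    // JSON-style text: [v0,v1,...]
    static String toText(double[] values) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(values[i]);
        }
        return sb.append(']').toString();
    }

    // Fixed-width binary: a 4-byte length followed by 8 bytes per double.
    static byte[] toBinary(double[] values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(values.length);
        for (double v : values) {
            out.writeDouble(v);
        }
        out.flush();
        return bytes.toByteArray();
    }
}
```

The binary form always costs 8 bytes per value, while the text form costs one character per digit plus separators. For short values such as 1.5 the text can actually be smaller, so the gain depends on the data; the parsing cost of text is a separate consideration not measured here.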

On Mon, Mar 9, 2009 at 6:45 PM, Jeff Eastman  wrote:
> A few months back, in the context of vector annotations, we had a discussion
> of a more standard means to serialize our object state. The Dirichlet
> Process implementation has a rather complicated DirichletState object which
> must be serialized and so I have worked out a way to do this using Gson.
> Though I had to use the 1.3 beta 2 release to get past a problem in the
> 1.2.3 release, the package seems to be up to the task of serializing
> complicated, generic, classes.
>
> In the post 0.1 timeframe, I will look into using native Gson to replace the
> current Vector asFormatString as a step towards vector annotations. It's
> Apache licensed.
>
> Does anybody else have experience with or comments about this package?
>
> Jeff
>


Re: Google's Gson JSON

2009-03-09 Thread Otis Gospodnetic

Here is a handy list:
http://www.simpy.com/user/otis/search/json

People, including Doug Cutting, had good things to say about Jackson.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message -----
> From: Ted Dunning 
> To: mahout-dev@lucene.apache.org
> Sent: Monday, March 9, 2009 2:15:52 PM
> Subject: Re: Google's Gson JSON
> 
> I have just used json-lib and like it a lot.  Gson looks like it might be
> even more so.
> 
> On Mon, Mar 9, 2009 at 11:45 AM, Jeff Eastman wrote:
> 
> > A few months back, in the context of vector annotations, we had a
> > discussion of a more standard means to serialize our object state. The
> > Dirichlet Process implementation has a rather complicated DirichletState
> > object which must be serialized and so I have worked out a way to do this
> > using Gson. Though I had to use the 1.3 beta 2 release to get past a problem
> > in the 1.2.3 release, the package seems to be up to the task of serializing
> > complicated, generic, classes.
> >
> > In the post 0.1 timeframe, I will look into using native Gson to replace
> > the current Vector asFormatString as a step towards vector annotations. It's
> > Apache licensed.
> >
> > Does anybody else have experience with or comments about this package?
> >
> > Jeff
> >
> 
> 
> 
> -- 
> Ted Dunning, CTO
> DeepDyve



Re: Google's Gson JSON

2009-03-09 Thread Ted Dunning
I have just used json-lib and like it a lot.  Gson looks like it might be
even better.

On Mon, Mar 9, 2009 at 11:45 AM, Jeff Eastman wrote:

> A few months back, in the context of vector annotations, we had a
> discussion of a more standard means to serialize our object state. The
> Dirichlet Process implementation has a rather complicated DirichletState
> object which must be serialized and so I have worked out a way to do this
> using Gson. Though I had to use the 1.3 beta 2 release to get past a problem
> in the 1.2.3 release, the package seems to be up to the task of serializing
> complicated, generic, classes.
>
> In the post 0.1 timeframe, I will look into using native Gson to replace
> the current Vector asFormatString as a step towards vector annotations. It's
> Apache licensed.
>
> Does anybody else have experience with or comments about this package?
>
> Jeff
>



-- 
Ted Dunning, CTO
DeepDyve


Google's Gson JSON

2009-03-09 Thread Jeff Eastman
A few months back, in the context of vector annotations, we had a 
discussion of a more standard means to serialize our object state. The 
Dirichlet Process implementation has a rather complicated DirichletState 
object which must be serialized and so I have worked out a way to do 
this using Gson. Though I had to use the 1.3 beta 2 release to get past 
a problem in the 1.2.3 release, the package seems to be up to the task 
of serializing complicated, generic, classes.


In the post 0.1 timeframe, I will look into using native Gson to replace 
the current Vector asFormatString as a step towards vector annotations. 
It's Apache licensed.


Does anybody else have experience with or comments about this package?

Jeff

