Hi,
I'm upgrading some classification code from 0.6 to 0.8 and am wondering what
the replacement is for the ClassifierContext?
Thanks,
Grant
Hi Mahout Users,
Drew Farris, Tom Morton and I are currently working on the 2nd Edition of
Taming Text (http://www.manning.com/ingersoll for first ed.) and are soliciting
interested parties who would be willing to contribute to a chapter on practical
use cases (i.e. you have something in produc
>
> Sorry if this is an obvious question but I find it hard to find details on
> these specifics.
>
> Many thanks,
>
> Will
Grant Ingersoll | @gsingers
http://www.lucidworks.com
The Apache Mahout PMC is pleased to announce the release of Mahout 0.8.
Mahout's goal is to build scalable machine learning libraries focused
primarily in the areas of collaborative filtering (recommenders),
clustering and classification (known collectively as the "3Cs"), as well as the
necessa
A _preview_ of the release artifacts for 0.8 is at
https://repository.apache.org/content/repositories/orgapachemahout-113/org/apache/mahout/.
This is not an official release. I will call a vote in a day or two, pending
feedback on this thread, so please review/test.
A _preview_ of the release no
>> You need to group by user before converting to vector to get sensible
>> clustering.
>>
>>
>> On Wed, Jun 12, 2013 at 1:06 PM, Grant Ingersoll wrote:
>>
>>> The CSVVectorIterator in the Integration package will take in a CSV file
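The grouping step suggested in the quoted reply (group by user before converting to vectors) might look roughly like the sketch below. It is plain Java with no Mahout dependency. The `userId,itemId,rating` column order and the class name `GroupByUserSketch` are assumptions made for illustration; they are not taken from the thread or from any Mahout API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch only: groups "userId,itemId,rating" rows into one sparse vector per user.
// The column order is an assumption about the ratings file, not something the thread states.
final class GroupByUserSketch {

    // Returns userId -> (itemId -> rating); each inner map is one user's sparse vector.
    static Map<Integer, Map<Integer, Double>> groupByUser(List<String> csvLines) {
        Map<Integer, Map<Integer, Double>> vectors = new TreeMap<>();
        for (String line : csvLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = trimmed.split(",");
            int user = Integer.parseInt(fields[0].trim());
            int item = Integer.parseInt(fields[1].trim());
            double rating = Double.parseDouble(fields[2].trim());
            // All ratings from the same user land in the same vector.
            vectors.computeIfAbsent(user, k -> new TreeMap<>()).put(item, rating);
        }
        return vectors;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("1,10,5.0");
        lines.add("2,10,3.0");
        lines.add("1,20,4.0");
        System.out.println(groupByUser(lines));
    }
}
```

Each per-user map could then be turned into whatever vector type the clustering job expects. Without this step, every CSV row becomes its own tiny vector and the clusters describe individual ratings rather than users.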
>
ng algorithm. My question is: is
> there any need to convert the movielens rating.csv file into a sequence
> file? If so, what are the commands for applying a clustering technique
> using Mahout and Hadoop?
>
> Thanking you,
> Neetha Suan Thampi
------
.m.math.hadoop.decomposer and port all code that uses it to SSVD.
No opinion.
+1 on everything else.
>
> To all users and other committers, this is a biased first proposal;
> please shout if you see things differently and want things kept.
>
> Best,
> Sebastian
>
>
parse -> rowid -> cvb.
lucene.vector will still give you higher performance at the cost of extra
storage (and the fact that it doesn't work in M/R and can't handle multiple
directories).
I'd say we keep it for now.
>
>
>
>
> _
as a sequence file from lucene.vector?
>
> Thanks for your help.
>
> James
Grant Ingersoll | @gsingers
http://www.lucidworks.com
On Jun 2, 2013, at 10:42 AM, Sebastian Schelter wrote:
> I don't think unmaintained code should stay in our codebase.
+1
> This will
> only create frustration amongst our users, as they will not get
> questions answered and bugs fixed. It would also be an obstacle for a
> 1.0 release, where we
FP Growth seems to not have a lot of dev support. Are there users out there
using it? Should it live on or get the axe prior to 1.0?
-Grant
o
>> ./contentDataDir/sparseVectors --namedVector -wt tf -a
>> org.apache.lucene.analysis.EnglishAnalyzer
>>
>> java.lang.ClassNotFoundException: org.apache.lucene.analysis.EnglishAnalyzer
>>
>> Looking at the output from bin/mahout classpath
>>
>> it shows that lucene-analyzers-common-4.2.1.jar is in there as a dependency,
>> so any idea why the above is throwing an exception?
Grant Ingersoll | @gsingers
http://www.lucidworks.com
Hi,
I'm looking for interns for the summer for those interested in Mahout and
Machine Learning:
Research Engineer Internship
DESCRIPTION
LucidWorks, the leading commercial company for Apache Lucene and Solr, is
looking for interns to work on building next generation search, analytics and
mach
stering-using-Solr-Index-vs-Lucene-Index-Different-Results-tp4037198.html
> Sent from the Mahout User List mailing list archive at Nabble.com.
Grant Ingersoll
http://www.lucidworks.com
Is there a way to build and to use any actual version with Lucene 4.0?
>
> thanks,
>
> --tomw
>
>
>
>
>
>
>
Grant Ingersoll
http://www.lucidworks.com
Analyzer.(DefaultAnalyzer.java:34)
>>... 11 more
>>
>> Any idea what causes this?
>>
Grant Ingersoll
http://www.lucidworks.com
e, heavily modified) on top of YARN.
>>
>> See ya'll there.
>>
>> JP
>>
>> --
>> Twitter: @jpatanooga
>> Principal Solution Architect @ Cloudera
>> hadoop: http://www.cloudera.com
>>
Grant Ingersoll
http://www.lucidworks.com
nd SIPC. Unless clearly
> stated, nothing herein shall be construed to be an offer to sell, nor a
> solicitation of an offer to buy, any financial product.
Grant Ingersoll
http://www.lucidworks.com
Hi,
I'm wondering if anyone has any rules of thumb around model size and memory usage
for SGD? I'm doing some testing of it myself, but thought I would ask to see
how it compares.
Thanks,
Grant
ut-user/201204.mbox/%3cca+y9ocwgs2se7doqqrse3p+qe5gvxct8xutucfdzvgkjkpo...@mail.gmail.com%3E
>
Grant Ingersoll
http://www.lucidimagination.com
On Apr 20, 2012, at 12:05 PM, Hector Yee wrote:
> On a related note, wish I could share the data I have to see how these
> algorithms stack up to the ones we use for large scale learning.
That certainly would be interesting.
>
> Are there other examples of large data sets people use? I know th
, attention, and work to the Mahout
>> project, rather than subtract from it. I hope to see more
>> Mahout-related commercializations, beyond the inclusions in
>> distributions we're already seeing, in 2012, as it's key to the
>> long-term project health. It's most certainly going to be the year of
>> the application layer (analytics, machine learning) for Big Data.
>>
>> Thank you!
>> Sean
>>
Grant Ingersoll
http://www.lucidimagination.com
Hi,
I have internships open for this summer for students interested in working on
search and machine learning. Description is below.
-Grant
Research Engineer Internship
DESCRIPTION
Lucid Imagination, the leading commercial company for Apache Lucene and Solr,
is looking for interns to work on
Hi Mahouts,
Thought some here might be interested as search and machine learning often go
together.
--
Lucene Revolution will be here May 9-10 in Boston. Reserve your spot today with
Early Bird pricing of $575. Committers and accepted speakers are entitled to
free admission. Our CFP is op
>> and fixing it up". Since I think this is the only realistic approach
>> to a next version, in this conversation I could not support any
>> approach that pretends to do five more things in the next version --
>> at least not unless accompanied by some plan to address the
>> contributions already in line in JIRA. It's not OK to be implicitly
>> rejecting so much from the community by not planning to fix that first
>> and foremost.
>>
>>
>
Grant Ingersoll
http://www.lucidimagination.com
On Feb 22, 2012, at 7:24 AM, Jake Mannix wrote:
> On recent threads on the dev@ list, and discussions off-list, it's pretty
> clear that we need to have "cleanup" be a priority for the next release.
>
> How about this for a formal proposal:
>
>
> - The 0.7 release will have issues (both ne
On Jan 31, 2012, at 2:14 PM, Keary Cavin wrote:
>
> Dhruv, I downloaded the MAHOUT-627 patch and applied the files to the current
> mahout release. I'll let you know when I have questions.
Note, the plan is to put this patch into 0.7 once the remaining test issue is
fixed.
-Grant
dwood City, California
TRAVEL
Minimal
--------
Grant Ingersoll
http://www.lucidimagination.com
wt tf --minSupport 2
> --minDF 2 --maxDFSigma 3 -seq
>
> Thanks,
> John
>
> On Sun, Jan 22, 2012 at 3:00 PM, Grant Ingersoll wrote:
>
>> What were the command/options you were passing in?
>>
>>
>> On Jan 18, 2012, at 4:26 PM, John Conw
>
>DictionaryVectorizer.createTermFrequencyVectors(tokenizedPath,
> outputDir, tfDirName, conf, minSupport, maxNGramSize,
>
> minLLRValue, -1.0f, false, reduceTasks, chunkSize,
> sequentialAccessOutput, namedVectors);
>
> }
>
> --
>
> Thanks,
> John C
>
>
>
>
> --
>
> -- John C
Grant Ingersoll
http://www.lucidimagination.com
e
> there existing tools?)
----
Grant Ingersoll
http://www.lucidimagination.com
Doc 3 is ZZZ similar*
Have a look at the RowSimilarityJob, which will do pairwise similarity.
> *
> *
> Can you please help?
>
> --
> Regards
> Junaid
Grant Ingersoll
http://www.lucidimagination.com
euters, amongst others, for examples of these in
action.
>
> On Wed, Jan 4, 2012 at 8:31 AM, Grant Ingersoll wrote:
>> Hi Junaid,
>>
>> Have a look at the SparseVectorsFromSequenceFiles class, as this does this
>> already, in combination with SequenceFilesFromDirect
totype to calculate the TF IDF from the documents
> present in a directory.
>
> Can you please help me with the Steps to go about it using Apache Mahout?
> Thank you.
>
> --
> Regards
> Junaid
----
Grant Ingersoll
http://www.lucidimagination.com
hms of Mahout on Amazon EMR including
> clusterdumper following the instructions on:
>
> https://cwiki.apache.org/MAHOUT/mahout-on-elastic-mapreduce.html
>
> Thanks once again,
> Ipshita
--------
Grant Ingersoll
http://www.lucidimagination.com
ASF projects.
The basic task is to try and predict what project an email belongs to based on
its content.
> Are these textual
> features? Or what?
>
> On Tue, Jan 3, 2012 at 2:53 PM, Grant Ingersoll wrote:
>
>> I'm trying to run the full ASF email SGD classifier p
I'm trying to run the full ASF email SGD classifier problem and am facing heap
size issues. My current setup has 105 features and I am using a cardinality of
100K. I'm using the AdaptiveLogisticRegression. I'm getting heap errors and
they occur when trying to construct the ALR class (i.e. not
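A rough way to reason about heap use for a setup like the one described above is to count the doubles held by dense coefficient matrices. The sketch below is back-of-the-envelope arithmetic, not a description of how AdaptiveLogisticRegression actually allocates memory. It assumes each model keeps one dense `rows x columns` matrix of 8-byte doubles, and `modelCopies` is a hypothetical placeholder for however many candidate models a learner holds at once. Treating 105 as the row count is also an assumption.

```java
// Sketch only: estimates memory for dense coefficient matrices of doubles.
// Nothing here calls Mahout; every parameter value in main() is illustrative.
final class SgdMemorySketch {

    // Bytes for one dense rows x columns matrix of doubles.
    static long denseModelBytes(int rows, int columns) {
        return (long) rows * columns * Double.BYTES;
    }

    // Bytes when several independent copies of such a matrix are held at once.
    static long totalBytes(int rows, int columns, int modelCopies) {
        return denseModelBytes(rows, columns) * modelCopies;
    }

    public static void main(String[] args) {
        int rows = 105;          // assumption: one row per "feature" mentioned in the message
        int columns = 100_000;   // the stated vector cardinality
        int modelCopies = 100;   // hypothetical: candidate models kept alive simultaneously

        System.out.println(String.format("one model:  %.1f MB",
                denseModelBytes(rows, columns) / 1e6));
        System.out.println(String.format("all copies: %.1f GB",
                totalBytes(rows, columns, modelCopies) / 1e9));
    }
}
```

If an estimate like this lands near or above the JVM heap limit, failing at construction time would not be surprising, and lowering the cardinality or the number of simultaneous models are the obvious knobs to try.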
ome notes in the script to document this a bit more.
Note, there are some issues w/ this example and the SGD code that are still
being worked through. See https://issues.apache.org/jira/browse/MAHOUT-904 for
more info.
----
Grant Ingersoll
http://www.lucidimagination.com
rong?
>
> I'm running a 0.6-SNAPSHOT I cloned today from github. Was considering
> trying 0.5 but a quick look at recent changes doesn't seem to suggest this
> code has changed in a while...
>
> Cheers,
> Mat
Grant Ingersoll
http://www.lucidimagination.com
should become a separate binary attribute,
> MapBackedARFFModel.java doesn't seem to do the right thing.
We can patch this if you have an alternate implementation.
>
> Seems like a compressed binary format would be useful for representing such
> attributes, unless you also neede
will mahout insert derived attributes (hour of day, day
> of week)? I presume not and I presume I have to add them myself.
>
> Thanks, Don
Grant Ingersoll
http://www.lucidimagination.com
s Grant that was the point of my first question..
>> Now I'll take a look at the vector implementation.
>> Thanks again
>> Daniele
>>
>> On 14 December 2011 23:44, Grant Ingersoll wrote:
>>> While Ted answered the Dissector question, your original issu
> org.apache.mahout.math.VectorWritable
>>> at
>>>
>>>
>> org.apache.mahout.classifier.naivebayes.training.IndexInstancesMapper.map(IndexInstancesMapper.java:1)
>>> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>>> at org.apache.hado
ethod through the seqdirectory program i get this error:
>
> java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to
> org.apache.mahout.math.VectorWritable
>
> Do you have some hints on the right usage of this class?
>
> Thanks,
> Daniele Volpi
---
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/mahout-exception-lucene-vector-tp3569144p3569144.html
> Sent from the Mahout User List mailing list archive at Nabble.com.
Grant Ingersoll
http://www.lucidimagination.com
RK_DIR}/myproj-bydate/bayes-test-input \
> -type bayes \
> -ng 1 \
> -source hdfs \
> -v \
> -method mapreduce
>
> Any suggestions? Thanks
>
Grant Ingersoll
http://www.lucidimagination.com
Consider the examples ret
>
>
>
>
>
>Thanks and Regards,
>
>S SYED ABDUL KATHER
>
>9731841519
Grant Ingersoll
http://www.lucidimagination.com
>
>
>Thanks and Regards,
>S SYED ABDUL KATHER
> 9731841519
Grant Ingersoll
http://www.lucidimagination.com
, Dec 1, 2011 at 3:32 AM, Ted Dunning wrote:
>>> Sure. I attached it, but those get stripped. I didn't realize that this
>>> was going to the list.
>>>
>>> Try here: http://dl.dropbox.com/u/36863361/cluster-viz.r
>>>
>>> And here for the i
I launched a micro instance and mounted the volume and downloaded it. That's
the only way to get that exact data set that I am aware of. I've got a smaller
sample up on the Lucid website. Otherwise, if you just want something like it,
you can use your ASF credentials to get it. I can point y
le clusters are near.
>
> On Tue, Nov 29, 2011 at 8:03 AM, Grant Ingersoll wrote:
> I'm still learning R, do you have code handy you could share?
>
> On Nov 29, 2011, at 6:25 AM, Ted Dunning wrote:
>
> > Coloring is pretty easy in R, which is what I use. I just bu
ns, I vary the transparency according to how seriously
> down-sampled the cluster is. That lets me get a good visual feel for the
> actual cluster size.
>
> On Tue, Nov 29, 2011 at 5:03 AM, Grant Ingersoll wrote:
>
>> Anyone have an easy algorithm for coloring clusters
on
https://issues.apache.org/jira/browse/MAHOUT-899) but would really like to be
able to produce much prettier visualizations out of the box.
--------
Grant Ingersoll
http://www.lucidimagination.com
xamples.
>
>
> It may be a good idea to add 'JaccardDistance' measure to the existing
> Distance measures in Mahout (unless there was a reason for not having it in
> the first place).
TanimotoDistanceMeasure is the Jaccard Distance.
>
>
>
> __
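To illustrate the remark above that the Tanimoto distance is the Jaccard distance: for binary (0/1) vectors the Tanimoto coefficient a·b / (|a|² + |b|² − a·b) equals |A ∩ B| / |A ∪ B|, the Jaccard index of the corresponding sets. The sketch below uses the textbook formulas in plain Java. It is not copied from Mahout's TanimotoDistanceMeasure, and the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch only: textbook Tanimoto and Jaccard distances, independent of Mahout.
final class TanimotoSketch {

    // 1 - a.b / (|a|^2 + |b|^2 - a.b); assumes both arrays have the same length.
    static double tanimotoDistance(double[] a, double[] b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        double denominator = normA + normB - dot;
        // Two all-zero vectors are treated as identical.
        return denominator == 0.0 ? 0.0 : 1.0 - dot / denominator;
    }

    // 1 - |X intersect Y| / |X union Y|
    static double jaccardDistance(Set<Integer> x, Set<Integer> y) {
        Set<Integer> intersection = new HashSet<>(x);
        intersection.retainAll(y);
        Set<Integer> union = new HashSet<>(x);
        union.addAll(y);
        return union.isEmpty() ? 0.0 : 1.0 - (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        // The same two documents, once as 0/1 vectors and once as sets of term indexes.
        double[] a = {1, 1, 0, 1};
        double[] b = {0, 1, 1, 1};
        Set<Integer> x = new HashSet<>(Set.of(0, 1, 3));
        Set<Integer> y = new HashSet<>(Set.of(1, 2, 3));
        System.out.println(tanimotoDistance(a, b));
        System.out.println(jaccardDistance(x, y));
    }
}
```

Both calls give the same value for this pair, since two shared terms out of four distinct terms is a similarity of 0.5 either way. With weighted (non-binary) vectors the Tanimoto form still applies, while the set form no longer does.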
ught the actual text content in both the
> files is different. How?
>
> I am assuming that the NGram attribute was set to the default value of 1 when
> creating the tf-idf vectors from sequence files.
>
> Suneel
>
>
>
>
the mahout root directory?
>
> Isabel
--------
Grant Ingersoll
http://www.lucidimagination.com
For those in the San Francisco area, there will be a Mahout User Meeting on
Nov. 29th at Lucid Imagination's offices. Details and RSVP are at
http://sf-mahout-11-11.eventbrite.com/
For those not in the SF area, I _believe_ we will be recording it and posting
it.
alue = new WeightedVectorWritable();
> while (reader.next(key, value))
> {
>
> System.out.println(value.toString() + " belongs to cluster "+
> key.toString());
>
> }
>
> reader.close();
> But it is returning null.
> Please help me to move further.
>
> Thanks and Regards,
> S SYED ABDUL KATHER
Grant Ingersoll
http://www.lucidimagination.com
t to the default value of 1 when
> creating the tf-idf vectors from sequence files.
>
> Suneel
>
>
>
>
> From: Grant Ingersoll
> To: user@mahout.apache.org
> Sent: Tuesday, October 25, 2011 5:55 AM
> Subject: Re: MinHash C
>> If yes then I would update the wiki page
>> https://cwiki.apache.org/confluence/display/MAHOUT/Minhash+Clustering with
>> the instructions.
>>
>> Otherwise, could someone tell me what I am doing wrong?
>
> I haven't looked into the code, but I get similar outputs, so I assume it is
> working. Might be good to incorporate this into the build-reuters.sh as well
> as try it on some other input.
>
> -Grant
Grant Ingersoll
http://www.lucidimagination.com
hout in Action. If they do what I think they do, I will definitely try
> them, and probably complain on the list (Ted) if I can't interpret them right
> :).
>
> Thanks for the reply,
>
> --
> Ioan Eugen Stan
Grant Ingersoll
http://www.lucidimagination.com
>x2: JList[JPair[JList[String], JLong]]) = {
> println(x1 + ":" +
>x2.map(pair => "[" + pair.getFirst.mkString(",") + "] : " +
> pair.getSecond).mkString("; "))
> }
>
Might be of interest: "Clustering Very Large Multi-dimensional Datasets with
MapReduce"
http://www.cs.cmu.edu/~jclopez/ref/kdd2011-mr-clustering.pdf
--------
Grant Ingersoll
http://www.lucidimagination.com
On Nov 16, 2011, at 9:39 PM, Ioan Eugen Stan wrote:
> Hello,
>
> I have to figure out how much hardware is required to do clustering
> for my company on about 10+ million user accounts, each with 100-5000
> documents. The documents will be indexed so vector creation will be
> done at indexing.
>
I've never implemented LSI. Is there a way to incrementally build the model
(by simply indexing documents) or is it something that one only runs after the
fact once one has built up the much bigger matrix? If it's the former, I bet
it wouldn't be that hard to just implement the appropriate new
should I give the class? I tried to change the canopy
> thresholds (250, 120) to some other numbers, tried also changing the
> EuclideanDistanceMeasure for the canopy clustering to
> CosineDistanceMeasure, with no use.
>
> Many thanks in advance,
> Ahmad
Grant Ingersoll
http://www.lucidimagination.com
Might be useful: https://github.com/algoriffic/lsa4solr
Looks like it hasn't been kept up to date.
On Nov 13, 2011, at 1:47 PM, Sebastian Schelter wrote:
> Is there some documentation/tutorial available on how to build a LSI
> pipeline with mahout and lucene?
>
> --sebastian
> MSV-770{n=1 c=[0:-0.025,1:0.011,2:0.032,..etc
>
> As seen above in MSV-441 there is no presence of ":" in the output whereas
> MSV-770 has 0:-0.025.
> Can anyone shed some light on what the difference is and why it is
> present there?
>
> Thanks.
solution anyhow).
See the ClusterDumper code.
>
> I'm new to Mahout and have to admit I've been struggling even to get this
> far. Any help would be gratefully received.
>
>
> R
Grant Ingersoll
http://www.lucidimagination.com
Cool, how about adding it to the Wiki?
On Nov 9, 2011, at 8:15 AM, Suneel Marthi wrote:
> I can put together a doc if we don't already have one, know the SGD code
> pretty well.
>
> Regards,
> Suneel
>
>
>
> ____
>
In the SGD TrainNewsGroups example, we have:
System.out.printf("%.2f\t%.2f\t%.2f\t%.2f\t%.8g\t%.8g\t", maxBeta, nonZeros,
positive, norm, lambda, mu);
Do we have any docs explaining what these values mean and what one should be
looking for to know whether the system is performing or not?
Thanks,
-Grant
On Nov 7, 2011, at 8:54 PM, Suneel Marthi wrote:
> Do we have an answer for this?
>
> Sent from my iPhone
>
> On Nov 2, 2011, at 7:20 AM, Grant Ingersoll wrote:
>
>> What's the Minhash key groups value used for in the MinhashDriver? I mean,
>> I see
I haven't seen an answer yet. I also asked on dev@.
On Nov 7, 2011, at 8:54 PM, Suneel Marthi wrote:
> Do we have an answer for this?
>
> Sent from my iPhone
>
> On Nov 2, 2011, at 7:20 AM, Grant Ingersoll wrote:
>
>> What's the Minhash key groups value
st of the heavy lifting.
On Nov 5, 2011, at 11:20 AM, Robert Stewart wrote:
> Can you point me to the code in trunk which implements "lucene.vector"
> command?
>
> Bob
>
>
> On Nov 4, 2011, at 2:05 PM, Grant Ingersoll wrote:
>
>> Should be doable, but like
to make this work via
codecs.
-Grant
--------
Grant Ingersoll
http://www.lucidimagination.com
g to
have two speakers giving presentations related to Mahout: Ted Dunning, MapR
and Grant Ingersoll of Lucid Imagination (me). Both Ted and Grant are long
time committers on the Mahout project.
Ted's talk: How and why random projections work?
Mine: Using Mahout to Cluster, Classify a
ector file that I can use with mahout.
>
> Probably I should not use internal docid, but instead some unique identifier
> field.
>
> Also, I assume at some point this could be a map-reduce job in hadoop.
>
> I'm just asking for sanity check, or if there are any better ideas out there.
>
> Thanks
> Bob
--
Grant Ingersoll
http://www.lucidimagination.com
> code to explain each distance measure implementation, that will really help,
> thanks guys.
That would be a great addition! Also, javadoc would be helpful, so patches
would be great there.
Grant Ingersoll
http://www.lucidimagination.com
We've been debating removing/archiving the Watchmaker integration in Mahout due
to seeming lack of maintenance and interest. Is anybody actually using it?
-Grant
>> So, what helped me was to process this into a map with cluster Id as the
>>>>>> key and vector list as the value. I read the clustered points and all
>>>>>> the data in the map in the form. In the end, the list against each
>>>>>>
I've tried various open source tools (Gephi, others), but haven't found one yet
that can handle large volumes of points in an efficient way. FWIW, the Carrot2
workbench is BSD, perhaps it could be used with some work?
That being said, I did recently add the ability to ClusterDumper to output
ll points/vector belong
>>> to this cluster, but... so did i miss something? Thanks a lot. Cheers
>>> Ramon
>>>
>>>
>>> -
>>> No virus found in this message.
>>> Checked by AVG - www.avg.com
>>> Version: 10.0.1411 / Virus Database: 2092/3992 - Release Date: 11/02/11
>>
>
Grant Ingersoll
http://www.lucidimagination.com
In the vein of users become contributors become committers:
It seems there has been some spark of interest in contributing more, so I
thought I would pass along a few pointers:
1. https://cwiki.apache.org/MAHOUT/how-to-contribute.html -- Details how to
submit patches, etc. IDE codestyles at
ssion about them way back when and Ted and Jeff went through a few
iterations to add them in.
>
> -jake
>
> On Wed, Nov 2, 2011 at 8:08 AM, Grant Ingersoll wrote:
>
>>
>> On Nov 2, 2011, at 10:58 AM, Jake Mannix wrote:
>>
>>> On Wed, Nov 2, 20
On Nov 2, 2011, at 10:58 AM, Jake Mannix wrote:
> On Wed, Nov 2, 2011 at 7:34 AM, Grant Ingersoll wrote:
>
>> What functionality, specifically, are you proposing to remove?
>
>
> I'm suggesting we kill, from Matrix.java and descendants, all of the
>
On Nov 2, 2011, at 7:17 AM, Tharindu Mathew wrote:
> I want to create a java UI tool (based on a web app) that can pick and
> apply different algorithms available in Mahout to different data sets.
Very cool! Keep us posted, as this would be immensely useful! Any chance it
will be donated back
What functionality, specifically, are you proposing to remove? I know we had a
lot of discussion around some of this stuff way back when as to how best to do
it, but of course, that doesn't mean it has uptake. If it's on the Matrix,
then doesn't it more easily get shipped around via the Writab
What's the Minhash key groups value used for in the MinhashDriver? I mean, I
see it is used for building up the key out of the hashed values, but what's the
significance of different values for it? The default is 2, what does it mean
practically speaking if I choose, say, 10? AFAICT, it would
On Nov 1, 2011, at 2:16 PM, Patrick Hunt wrote:
> On Tue, Nov 1, 2011 at 10:44 AM, Ted Dunning wrote:
>> On Tue, Nov 1, 2011 at 9:18 AM, Patrick Hunt wrote:
>>
>>> 2011/10/31 Ted Dunning :
Keep in mind that Cloudera has packaged the 0.5 release. That is
>>> probably
OK for most reco
any info if it's available?
>
> --
> Regards,
>
> Tharindu
>
> blog: http://mackiemathew.com/
--------
Grant Ingersoll
http://www.lucidimagination.com
you like, but on Hadoop.
>
> On Wed, Oct 26, 2011 at 1:56 PM, Grant Ingersoll wrote:
>> I seem to recall past discussions on where one hits the bottleneck w/ user
>> based recommendation approaches in Mahout, but I can't seem to locate it
>> anymore. Anyone know of
I seem to recall past discussions on where one hits the bottleneck w/ user
based recommendation approaches in Mahout, but I can't seem to locate it
anymore. Anyone know off hand? Where do user based approaches hit their
limits, more or less?
Thanks,
Grant
the one the index was
>> created with?
>>
>>
>> Isabel
>>
>
>
> --
> Lance Norskog
> goks...@gmail.com
Grant Ingersoll
http://www.lucidimagination.com
On Oct 19, 2011, at 11:38 AM, Varun Thacker wrote:
> I was trying to run the MinHash algorithm on the Reuters data set, so I did
> the following before running MinHashDriver
>
> - Get the Reuters dataset
> - Run org.apache.lucene.benchmark.utils.ExtractReuters to generate
> reuters-out fro
m/zkpy0k.png>
>
> Please help me out . Thanks a lot .
--------
Grant Ingersoll
http://www.lucidimagination.com
Just a friendly nudge to those on the fence for ApacheCon in Vancouver this
year that there will be both a Mahout training and some Mahout talks. I think
a few of us committers will also be hacking Mahout on Tuesday if you are
interested.
Training info: http://na11.apachecon.com/talks/18395
M
rainer-thetaNormalizer
> drwxrwxrwx - hadoop supergroup 0 2011-10-17 10:18
> /user/hadoop/bayes-model/trainer-weights
>
> When I use this model to classify new data, every sample is classified as
> "unknown".
>
> My Environment:
>
>
the shell script with -x as
> you will probably have to tweak it.
>
> Lance
>
> On Thu, Oct 13, 2011 at 11:04 PM, Sebastian Schelter wrote:
>
>> Only got the raw data, how did you convert it to our standard
>> recommender input?
>>
>> --sebastian
>>
>> recommender input?
>>
>> --sebastian
>>
>>
>> On 14.10.2011 01:17, Grant Ingersoll wrote:
>>> Were you able to get the data, Sebastian?
>>>
>>> On Oct 13, 2011, at 4:01 AM, Sebastian Schelter wrote:
>>>
>>>>