[jira] Updated: (MAHOUT-274) Use avro for serialization of structured documents.
[ https://issues.apache.org/jira/browse/MAHOUT-274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robin Anil updated MAHOUT-274:
--

    Affects Version/s: 0.4
    Fix Version/s: 0.4
    Assignee: Drew Farris

> Use avro for serialization of structured documents.
> ---
>
> Key: MAHOUT-274
> URL: https://issues.apache.org/jira/browse/MAHOUT-274
> Project: Mahout
> Issue Type: Improvement
> Affects Versions: 0.4
> Reporter: Drew Farris
> Assignee: Drew Farris
> Priority: Minor
> Fix For: 0.4
>
> Attachments: mahout-avro-examples.tar.gz, mahout-avro-examples.tar.gz
>
>
> Explore the intersection between Writables and Avro to see how serialization
> can be improved within Mahout.
> An intermediate goal is to provide a structured document format that can be
> serialized using Avro as an Input/OutputFormat and Writable

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (MAHOUT-274) Use avro for serialization of structured documents.
[ https://issues.apache.org/jira/browse/MAHOUT-274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Farris updated MAHOUT-274: --- Attachment: (was: mahout-avro-examples.tar.bz)
[jira] Updated: (MAHOUT-274) Use avro for serialization of structured documents.
[ https://issues.apache.org/jira/browse/MAHOUT-274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Farris updated MAHOUT-274: --- Comment: was deleted (was: re-added latest tarball with proper extension.)
[jira] Updated: (MAHOUT-274) Use avro for serialization of structured documents.
[ https://issues.apache.org/jira/browse/MAHOUT-274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Farris updated MAHOUT-274: --- Attachment: (was: mahout-colloc.tar.gz)
[jira] Updated: (MAHOUT-274) Use avro for serialization of structured documents.
[ https://issues.apache.org/jira/browse/MAHOUT-274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Drew Farris updated MAHOUT-274: --- Attachment: mahout-avro-examples.tar.gz (this is really the right tarball this time, honest)
[jira] Updated: (MAHOUT-274) Use avro for serialization of structured documents.
[ https://issues.apache.org/jira/browse/MAHOUT-274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Drew Farris updated MAHOUT-274:
---

    Attachment: mahout-colloc.tar.gz

re-added latest tarball with proper extension.

> Use avro for serialization of structured documents.
> ---
>
> Key: MAHOUT-274
> URL: https://issues.apache.org/jira/browse/MAHOUT-274
> Project: Mahout
> Issue Type: Improvement
> Reporter: Drew Farris
> Priority: Minor
> Attachments: mahout-avro-examples.tar.bz,
> mahout-avro-examples.tar.gz, mahout-colloc.tar.gz
>
>
> Explore the intersection between Writables and Avro to see how serialization
> can be improved within Mahout.
> An intermediate goal is to provide a structured document format that can be
> serialized using Avro as an Input/OutputFormat and Writable
[jira] Updated: (MAHOUT-274) Use avro for serialization of structured documents.
[ https://issues.apache.org/jira/browse/MAHOUT-274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Drew Farris updated MAHOUT-274:
---

    Attachment: mahout-avro-examples.tar.bz

Status update w/ new tarball, which contains a maven project (mvn clean install should do the trick). README.txt included, relevant portions included below:

Provided are two different versions of AvroInputFormat/AvroOutputFormat that are compatible with the mapred (pre 0.20) and mapreduce (0.20+) apis. They are based on code provided as a part of MAPREDUCE-815 and other patches. Also provided are backports of the SerializationBase/AvroSerialization classes from the current hadoop-core trunk.

When writing a job using the pre 0.20 apis:

Add serializations:

{code}
conf.setStrings("io.serializations",
    new String[] {
        WritableSerialization.class.getName(),
        AvroSpecificSerialization.class.getName(),
        AvroReflectSerialization.class.getName(),
        AvroGenericSerialization.class.getName()
    });
{code}

Setup input and output formats:

{code}
conf.setInputFormat(AvroInputFormat.class);
conf.setOutputFormat(AvroOutputFormat.class);

AvroInputFormat.setAvroInputClass(conf, AvroDocument.class);
AvroOutputFormat.setAvroOutputClass(conf, AvroDocument.class);
{code}

AvroInputFormat provides the specified class as the key and a LongWritable file offset as the value. AvroOutputFormat expects the specified class as the key and expects a NullWritable as a value.

If an avro serializable class is passed between the map and reduce phases it is necessary to set the following:

{code}
AvroComparator.setSchema(AvroDocument._SCHEMA);
conf.setClass("mapred.output.key.comparator.class", AvroComparator.class, RawComparator.class);
{code}

So far I've been using avro 'specific' serialization, which compiles an avro schema into a Java class; see src/main/schemata/org/apache/mahout/avro/AvroDocument.avsc. This is currently compiled into classes o.a.m.avro.document (AvroDocument|AvroField) using o.a.m.avro.util.AvroDocumentCompiler (eventually to be replaced by a maven plugin; generated sources are currently checked in). Helper classes for AvroDocument and AvroField include o.a.m.avro.document.Avro(Document|Field)Builder, o.a.m.avro(Document|Field)Reader.

This seems to work ok here, but I'm not certain that this is the best pattern to use, especially when there are many pre-existing classes (such as there are in the case of vector). Avro also provides reflection-based serialization and schema-based serialization; both should be supported by the infrastructure that has been backported here, but that's something else to explore.

Examples:

These are quick and dirty and need much cleanup work before they can be taken out to the dance. See o.a.m.avro.text, o.a.m.avro.text.mapred and o.a.m.avro.text.mapreduce:

* AvroDocumentsFromDirectory: quick and dirty port of SequenceFilesFromDirectory to use AvroDocuments. Writes a file containing documents in avro format; file contents are stored in a single field named 'content', in the originalText portion of this field.
* AvroDocumentsDumper: dump an avro documents file to standard output.
* AvroDocumentsWordCount: perform a wordcount on an avro document input file.
* AvroDocumentProcessor: tokenizes the text found in the input document file; reads from the originalText of the field named content and writes original document+tokens to the output file.

Running the examples: (haven't tested with the hadoop driver yet)

{code}
mvn exec:java -Dexec.mainClass=org.apache.mahout.avro.text.AvroDocumentsFromDirectory \
  -Dexec.args='--parent /home/drew/mahout/20news-18828 \
  --outputDir /home/drew/mahout/20news-18828-example \
  --charset UTF-8'

mvn exec:java -Dexec.mainClass=org.apache.mahout.avro.text.mapred.AvroDocumentProcessor \
  -Dexec.args='/home/drew/mahout/20news-18828-example /home/drew/mahout/20news-18828-processed'

mvn exec:java -Dexec.mainClass=org.apache.mahout.avro.text.AvroDocumentsDumper \
  -Dexec.args='/home/drew/mahout/20news-18828-processed/.avro-r-0' > foobar.txt
{code}

The Wikipedia stuff is in there, but isn't working yet.

Many thanks (apologies) to Robin for the starting point for much of this code and hacking it to pieces so badly.

> Use avro for serialization of structured documents.
> ---
>
> Key: MAHOUT-274
> URL: https://issues.apache.org/jira/browse/MAHOUT-274
> Project: Mahout
> Issue Type: Improvement
> Reporter: Drew Farris
> Priority: Minor
> Attachments: mahout-avro-examples.tar.bz, mahout-avro-examples.tar.gz
>
>
> Explore the intersection between Writables and Avro to see how serialization
> can be improved within Mahout.
> An intermediate g
[jira] Updated: (MAHOUT-292) Classifier Test Data and Self Tests
[ https://issues.apache.org/jira/browse/MAHOUT-292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robin Anil updated MAHOUT-292:
--

    Resolution: Fixed
    Status: Resolved (was: Patch Available)

Committed

> Classifier Test Data and Self Tests
> ---
>
> Key: MAHOUT-292
> URL: https://issues.apache.org/jira/browse/MAHOUT-292
> Project: Mahout
> Issue Type: Improvement
> Affects Versions: 0.3
> Reporter: Robin Anil
> Assignee: Robin Anil
> Fix For: 0.3
>
> Attachments: MAHOUT-292.patch
>
>
> Till now there was no means to test if the quality of classification suffered
> due to a code change.
> Added Classifier data with 3 labels (mahout, lucene and spamassasin) with 4
> long sentences in each of the labels.
> Added a SelfTest which trains Bayes and CBayes models, classifies the training
> dataset while testing, and checks accuracy and the confusion matrix
Re: Mass Code Cleanup
I picked out the formatter issues and committed the rest. Will have smaller patches if anything looks horribly machine formatted. So far not much.

Robin

On Mon, Feb 15, 2010 at 8:18 PM, Robin Anil wrote:

> SGD kmeans++ pegasus seems fine. Isabel can you check with the latest trunk
> if the perceptron is alright?
> I dont see any other open issues which requires patch testing as extensive
> as these do
>
> Robin
>
>
> On Mon, Feb 15, 2010 at 8:10 PM, Drew Farris wrote:
>
>> On Mon, Feb 15, 2010 at 1:09 AM, Robin Anil wrote:
>> > If its A. I have a few patches ready to commit like the static qualifier
>> > fix. I really need you guys to be on board on this. We just cant leave
>> it at
>> > this discussion.
>> >
>> > If its B. I will do the revert. But would have to patch some commits.
>> >
>> > If A sounds reasonable. Its easier to go forward than go back. I will
>> not be
>> > making any more changes at this scale. except bunch of classes from time
>> to
>> > time.
>>
>> I think A sounds reasonable, given a patch for MAHOUT-291 that isn't
>> as extensive, but I can't really comment on the potential for breaking
>> other patches here. I would say that the people with that sort of time
>> invested really should have the final say.
>>
>> Would it make sense for those with outstanding patches to apply 291
>> and then attempt to apply their patches to determine the extent of
>> breakage? To be honest, anyone can do it really. If someone wants to
>> post some jira issue references for patches that need to be tested I
>> can mess around with trying to apply them this evening.
>>
>> Drew
>>
>
>
[jira] Commented: (MAHOUT-291) Mahout Code Cleanup
[ https://issues.apache.org/jira/browse/MAHOUT-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833905#action_12833905 ] Robin Anil commented on MAHOUT-291: --- Picked out the formatter errors. Committing the rest of the fix. > Mahout Code Cleanup > --- > > Key: MAHOUT-291 > URL: https://issues.apache.org/jira/browse/MAHOUT-291 > Project: Mahout > Issue Type: Improvement > Components: Classification, Clustering, Collaborative Filtering, > Frequent Itemset/Association Rule Mining, Genetic Algorithms, Math, Utils, > Website >Affects Versions: 0.3 >Reporter: Robin Anil >Assignee: Robin Anil >Priority: Minor > Fix For: 0.3 > > Attachments: MAHOUT-291.patch > > > Code Cleanup > Organize imports > Remove space in blank lines > make local variables final
[jira] Commented: (MAHOUT-291) Mahout Code Cleanup
[ https://issues.apache.org/jira/browse/MAHOUT-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833879#action_12833879 ] Benson Margulies commented on MAHOUT-291: - Robin's going to have to pick through the diffs and find all the places where the formatter splatted and put them back.
[jira] Created: (MAHOUT-293) Add more tunable parameters to PFPGrowth implementation
Add more tunable parameters to PFPGrowth implementation
---

Key: MAHOUT-293
URL: https://issues.apache.org/jira/browse/MAHOUT-293
Project: Mahout
Issue Type: Improvement
Components: Frequent Itemset/Association Rule Mining
Affects Versions: 0.4
Reporter: Robin Anil
Assignee: Robin Anil
Fix For: 0.4

Objective is to add more tunable parameters to the PFPGrowth algorithm.

From Neal on Mahout User list:

I often use Christian Borgelt's itemset implementations for playing with data. He's implemented a nice set of switches, see below. Setting a minimum support threshold and minimum itemset size are both convenient and tend to make the algorithm run a bit faster.

http://www.borgelt.net/software.html

ne...@nrichter-laptop:~$ fpgrowth_fim
usage: fpgrowth_fim [options] infile outfile
find frequent item sets with the fpgrowth algorithm
version 1.13 (2008.05.02) (c) 2004-2008 Christian Borgelt
-m#      minimal number of items per item set (default: 1)
-n#      maximal number of items per item set (default: no limit)
-s#      minimal support of an item set (default: 10%)
         (positive: percentage, negative: absolute number)
-d#      minimal binary logarithm of support quotient (default: none)
-p#      output format for the item set support (default: "%.1f")
-a       print absolute support (number of transactions)
-g       write output in scanable form (quote certain characters)
-q#      sort items w.r.t. their frequency (default: -2)
         (1: ascending, -1: descending, 0: do not sort,
          2: ascending, -2: descending w.r.t. transaction size sum)
-u       use alternative tree projection method
-z       do not prune tree projections to bonsai
-j       use quicksort to sort the transactions (default: heapsort)
-i#      ignore records starting with a character in the given string
-b/f/r#  blank characters, field and record separators
         (default: " \t\r", " \t", "\n")
infile   file to read transactions from
outfile  file to write frequent item se
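As a rough illustration of the two switches Neal singles out (minimum support and minimum itemset size), here is a small Java sketch that filters itemsets which have already been mined. The class and method names are hypothetical and are not part of Mahout's PFPGrowth or of fpgrowth_fim. The sign convention for the support value (positive: percentage, negative: absolute number) is taken from the usage text quoted in the issue; rounding the percentage up to a whole transaction count is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names exist in Mahout or fpgrowth_fim.
class ItemsetFilterSketch {

  // Turn a "-s" style value into an absolute transaction count:
  // positive values are a percentage of all transactions, negative values
  // are an absolute count. Rounding up is an assumption of this sketch.
  static long absoluteSupport(double s, long numTransactions) {
    if (s < 0) {
      return (long) Math.ceil(-s);
    }
    return (long) Math.ceil(s * numTransactions / 100.0);
  }

  // Keep only itemsets that have enough support and enough items.
  static List<List<String>> filter(Map<List<String>, Long> supportByItemset,
                                   long minSupport, int minItems) {
    List<List<String>> kept = new ArrayList<>();
    for (Map.Entry<List<String>, Long> e : supportByItemset.entrySet()) {
      if (e.getValue() >= minSupport && e.getKey().size() >= minItems) {
        kept.add(e.getKey());
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    // Made-up itemsets and support counts over 200 transactions.
    Map<List<String>, Long> mined = new LinkedHashMap<>();
    mined.put(Arrays.asList("a"), 50L);
    mined.put(Arrays.asList("a", "b"), 30L);
    mined.put(Arrays.asList("a", "b", "c"), 3L);

    long minSupport = absoluteSupport(10, 200); // 10% of 200 transactions
    System.out.println(filter(mined, minSupport, 2));
  }
}
```

In a real implementation the thresholds would more likely be applied while the tree is being built, so that infrequent branches are never expanded; filtering after the fact only shows what the parameters mean.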
[jira] Updated: (MAHOUT-292) Classifier Test Data and Self Tests
[ https://issues.apache.org/jira/browse/MAHOUT-292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robin Anil updated MAHOUT-292: -- Status: Patch Available (was: Open) Patch ready to go in
[jira] Updated: (MAHOUT-292) Classifier Test Data and Self Tests
[ https://issues.apache.org/jira/browse/MAHOUT-292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robin Anil updated MAHOUT-292: -- Attachment: MAHOUT-292.patch
[jira] Created: (MAHOUT-292) Classifier Test Data and Self Tests
Classifier Test Data and Self Tests
---

Key: MAHOUT-292
URL: https://issues.apache.org/jira/browse/MAHOUT-292
Project: Mahout
Issue Type: Improvement
Affects Versions: 0.3
Reporter: Robin Anil
Assignee: Robin Anil
Fix For: 0.3

Till now there was no means to test if the quality of classification suffered due to a code change.
Added Classifier data with 3 labels (mahout, lucene and spamassasin) with 4 long sentences in each of the labels.
Added a SelfTest which trains Bayes and CBayes models, classifies the training dataset while testing, and checks accuracy and the confusion matrix
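For readers unfamiliar with the two checks named in the issue, the following Java sketch shows how accuracy and a confusion matrix can be computed from actual and predicted labels. It is not the SelfTest from MAHOUT-292; the class name, method names and the integer label encoding (0, 1, 2 for the three labels) are assumptions made purely for illustration.

```java
import java.util.Arrays;

// Hypothetical sketch; not the SelfTest attached to MAHOUT-292.
class ConfusionMatrixSketch {

  // matrix[actual][predicted] counts how often each actual label
  // was classified as each predicted label.
  static int[][] confusionMatrix(int[] actual, int[] predicted, int numLabels) {
    int[][] matrix = new int[numLabels][numLabels];
    for (int i = 0; i < actual.length; i++) {
      matrix[actual[i]][predicted[i]]++;
    }
    return matrix;
  }

  // Accuracy is the diagonal (correct predictions) divided by all predictions.
  static double accuracy(int[][] matrix) {
    int correct = 0;
    int total = 0;
    for (int r = 0; r < matrix.length; r++) {
      for (int c = 0; c < matrix[r].length; c++) {
        total += matrix[r][c];
        if (r == c) {
          correct += matrix[r][c];
        }
      }
    }
    return total == 0 ? 0.0 : (double) correct / total;
  }

  public static void main(String[] args) {
    // Made-up labels: 0, 1 and 2 stand in for the three classes.
    int[] actual = {0, 0, 1, 1, 2, 2};
    int[] predicted = {0, 1, 1, 1, 2, 0};
    int[][] matrix = confusionMatrix(actual, predicted, 3);
    System.out.println(Arrays.deepToString(matrix));
    System.out.println(accuracy(matrix));
  }
}
```

A regression check in this style would compare the computed accuracy against a fixed lower bound and fail when a code change pushes it below that bound.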
Re: Mass Code Cleanup
SGD kmeans++ pegasus seems fine. Isabel can you check with the latest trunk if the perceptron is alright? I dont see any other open issues which requires patch testing as extensive as these do Robin On Mon, Feb 15, 2010 at 8:10 PM, Drew Farris wrote: > On Mon, Feb 15, 2010 at 1:09 AM, Robin Anil wrote: > > If its A. I have a few patches ready to commit like the static qualifier > > fix. I really need you guys to be on board on this. We just cant leave it > at > > this discussion. > > > > If its B. I will do the revert. But would have to patch some commits. > > > > If A sounds reasonable. Its easier to go forward than go back. I will not > be > > making any more changes at this scale. except bunch of classes from time > to > > time. > > I think A sounds reasonable, given a patch for MAHOUT-291 that isn't > as extensive, but I can't really comment on the potential for breaking > other patches here. I would say that the people with that sort of time > invested really should have the final say. > > Would it make sense for those with outstanding patches to apply 291 > and then attempt to apply their patches to determine the extent of > breakage? To be honest, anyone can do it really. If someone wants to > post some jira issue references for patches that need to be tested I > can mess around with trying to apply them this evening. > > Drew >
[jira] Commented: (MAHOUT-291) Mahout Code Cleanup
[ https://issues.apache.org/jira/browse/MAHOUT-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833810#action_12833810 ] Robin Anil commented on MAHOUT-291: --- The last one was done by me, not the formatter. The removed lines are the ones created by the formatter. About the first one: there are only a couple of places with such a problem (that's when the number of characters treads close to the limit). I do agree the options looked much better before. The same formatter does the above one in 80 columns and does the below one in 120 columns.
Re: Mass Code Cleanup
On Mon, Feb 15, 2010 at 1:09 AM, Robin Anil wrote: > If its A. I have a few patches ready to commit like the static qualifier > fix. I really need you guys to be on board on this. We just cant leave it at > this discussion. > > If its B. I will do the revert. But would have to patch some commits. > > If A sounds reasonable. Its easier to go forward than go back. I will not be > making any more changes at this scale. except bunch of classes from time to > time. I think A sounds reasonable, given a patch for MAHOUT-291 that isn't as extensive, but I can't really comment on the potential for breaking other patches here. I would say that the people with that sort of time invested really should have the final say. Would it make sense for those with outstanding patches to apply 291 and then attempt to apply their patches to determine the extent of breakage? To be honest, anyone can do it really. If someone wants to post some jira issue references for patches that need to be tested I can mess around with trying to apply them this evening. Drew
[jira] Commented: (MAHOUT-291) Mahout Code Cleanup
[ https://issues.apache.org/jira/browse/MAHOUT-291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833806#action_12833806 ]

Drew Farris commented on MAHOUT-291:
---

Thanks very much Robin for posting a patch to review. Things like KMeansClusterer.log.debug -> log.debug look great here, and I'm ok with the whitespace oriented changes for the most part, but there are some cases where auto-code formatting is really making a hash of things:

e.g:

{code}
System.out.println("Generating " + num + " samples m=[" + mx + ", " + my + "] sd=[" + sdx + ", " + sdy + ']');
{code}

gets transformed to:

{code}
System.out.println("Generating " + num + " samples m=[" + mx + ", " + my + "] sd=[" + sdx + ", " + sdy + ']');
{code}

which despite the 120 line length rule seems a little too strict IMHO.

Also, a nicely formatted OptionBuilder is turned into something nasty and unreadable.

{code}
-Option clustersOpt = obuilder
-.withLongName("clusters")
-.withRequired(true)
- .withArgument(abuilder.withName("clusters").withMinimum(1).withMaximum(1).create())
-.withDescription(
- "The input centroids, as Vectors. Must be a SequenceFile of Writable, Cluster/Canopy. "
- + "If k is also specified, then a random set of vectors will be selected and written out to this path first")
+Option clustersOpt = obuilder.withLongName("clusters").withRequired(true).withArgument(
+ abuilder.withName("clusters").withMinimum(1).withMaximum(1).create()).withDescription(
+ "The input centroids, as Vectors. Must be a SequenceFile of Writable, Cluster/Canopy. "
+ + "If k is also specified, then a random set of vectors will be selected and"
+ + "written out to this path first")
 .withShortName("c").create();
{code}

And things like the following, but honestly which of these is the greater sin?

(From LDAInference)

{code}
-double t = f
- * (-1 / 12.0 + f
- * (1 / 120.0 + f
- * (-1 / 252.0 + f
- * (1 / 240.0 + f
-* (-1 / 132.0 + f
- * (691 / 32760.0 + f
- * (-1 / 12.0 + f * 3617.0 / 8160.0)));
+double t = f * (-1 / 12.0 + f * (1 / 120.0 + f * (-1 / 252.0
++ f * (1 / 240.0 + f * (-1 / 132.0 + f * (691 / 32760.0 + f * (-1 / 12.0 + f * 3617.0 / 8160.0)));
{code}

What's the best way to proceed from here given this?

> Mahout Code Cleanup
> ---
>
> Key: MAHOUT-291
> URL: https://issues.apache.org/jira/browse/MAHOUT-291
> Project: Mahout
> Issue Type: Improvement
> Components: Classification, Clustering, Collaborative Filtering,
> Frequent Itemset/Association Rule Mining, Genetic Algorithms, Math, Utils,
> Website
> Affects Versions: 0.3
> Reporter: Robin Anil
> Assignee: Robin Anil
> Priority: Minor
> Fix For: 0.3
>
> Attachments: MAHOUT-291.patch
>
>
> Code Cleanup
> Organize imports
> Remove space in blank lines
> make local variables final
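One way to sidestep the formatting question for the LDAInference expression is to keep the coefficients in an array and evaluate the polynomial with a Horner-style loop, so that no formatter has a deeply nested expression to re-wrap. The Java sketch below is only an illustration and is not code from the patch: the class and method names are hypothetical, and the nested form assumes the fully parenthesised expression (the closing parentheses in the quoted diff are incomplete).

```java
// Hypothetical sketch; not the LDAInference code from the MAHOUT-291 patch.
class HornerSketch {

  // Coefficients in the order they appear in the quoted expression.
  static final double[] COEFFS = {
    -1 / 12.0,
    1 / 120.0,
    -1 / 252.0,
    1 / 240.0,
    -1 / 132.0,
    691 / 32760.0,
    -1 / 12.0,
    3617.0 / 8160.0
  };

  // Evaluates f * (c0 + f * (c1 + ... + f * c7)) from the innermost term outwards.
  static double horner(double f) {
    double acc = 0.0;
    for (int i = COEFFS.length - 1; i >= 0; i--) {
      acc = COEFFS[i] + f * acc;
    }
    return f * acc;
  }

  // The nested form, with parentheses balanced by assumption.
  static double nested(double f) {
    return f * (-1 / 12.0 + f * (1 / 120.0 + f * (-1 / 252.0 + f * (1 / 240.0
        + f * (-1 / 132.0 + f * (691 / 32760.0 + f * (-1 / 12.0 + f * 3617.0 / 8160.0)))))));
  }

  public static void main(String[] args) {
    double f = 0.25;
    // The two values should agree up to floating-point rounding.
    System.out.println(horner(f));
    System.out.println(nested(f));
  }
}
```

The loop performs the same multiplications and additions as the nested expression, apart from the order of the final multiply and divide in the innermost term, so any difference between the two is limited to rounding.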
Re: Mass Code Cleanup
In a previous jira issue I had aligned the CS with the lucene style. It's that version which is checked in. Could you try now with it?

Robin

On Mon, Feb 15, 2010 at 7:44 PM, Benson Margulies wrote:

> To answer a question of Robin's:
>
> Some months ago, I started to make arrangements to include cs in our
> build. However, I discovered an aspect of 'Lucene style' that was, at
> the time, 100%-incompatible with cs. There was no option to cs to
> align it.
>
> So, the first step here is to agree to a style that cs can, in fact, check.
>
Re: Mass Code Cleanup
To answer a question of Robin's: Some months ago, I started to make arrangements to include cs in our build. However, I discovered an aspect of 'Lucene style' that was, at the time, 100%-incompatible with cs. There was no option to cs to align it. So, the first step here is to agree to a style that cs can, in fact, check.
[jira] Updated: (MAHOUT-291) Mahout Code Cleanup
[ https://issues.apache.org/jira/browse/MAHOUT-291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robin Anil updated MAHOUT-291: -- Attachment: MAHOUT-291.patch Remove static qualifiers. Fix most of the 120+ line issues
Re: Mass Code Cleanup
On Mon, Feb 15, 2010 at 1:09 AM, Robin Anil wrote: > If its A. I have a few patches ready to commit like the static qualifier > fix. I really need you guys to be on board on this. We just cant leave it at > this discussion. Is there a patch on JIRA for these? It would be easier to review and vote on if there is. Drew
Re: Mahout as TLP
+1 on Isabel's comments. Isabel Drost wrote: On Sat Grant Ingersoll wrote: I don't see any harm in getting 0.3 out first if that makes folks more comfortable. Yeah, this feels better to me the more I think about it. +1 from me as well: I really like the idea of Mahout becoming a TLP - even before a 1.0 release is available. However I think it makes sense to sort out the 0.3 release first. If I am counting correctly, that would make for three reasons for press releases: A new release, Mahout becoming a TLP and later on a 1.0 release. ;) Isabel
Re: Mahout as TLP
+1
Re: Mahout as TLP
On Sat Grant Ingersoll wrote: > > I don't see any harm in getting 0.3 out first if that makes folks > > more comfortable. > > Yeah, this feels better to me the more I think about it. +1 from me as well: I really like the idea of Mahout becoming a TLP - even before a 1.0 release is available. However I think it makes sense to sort out the 0.3 release first. If I am counting correctly, that would make for three reasons for press releases: A new release, Mahout becoming a TLP and later on a 1.0 release. ;) Isabel