Re: Where to start refactoring?

2013-01-13 Thread Ted Dunning
That is a pity. Good use cases with realistic (not necessarily real) data would be very helpful. Probably much more impact than small code fixes. On Sun, Jan 13, 2013 at 5:54 PM, Florents Tselai wrote: > For now, I'm afraid no, I don't. > > On Mon, Jan 14, 2013 at 3:31 AM, Ted Dunning > wrote:

Re: Where to start refactoring?

2013-01-13 Thread Florents Tselai
For now, I'm afraid no, I don't. On Mon, Jan 14, 2013 at 3:31 AM, Ted Dunning wrote: > Do you have any sample data? > > On Sun, Jan 13, 2013 at 5:13 PM, Florents Tselai >wrote: > > > Thanks for the reply! > > > > Yes, you're correct the data source is a smart-meter installed in each > > buildin

Re: Where to start refactoring?

2013-01-13 Thread Ted Dunning
Do you have any sample data? On Sun, Jan 13, 2013 at 5:13 PM, Florents Tselai wrote: > Thanks for the reply! > > Yes, you're correct the data source is a smart-meter installed in each > building. > > On Mon, Jan 14, 2013 at 3:07 AM, Ted Dunning > wrote: > > > If you have discrete data, then I wo

Re: Where to start refactoring?

2013-01-13 Thread Florents Tselai
Thanks for the reply! Yes, you're correct: the data source is a smart meter installed in each building. On Mon, Jan 14, 2013 at 3:07 AM, Ted Dunning wrote: > If you have discrete data, then I would think that simple cooccurrence > mining would be more useful than full on association mining. > >

Re: Where to start refactoring?

2013-01-13 Thread Ted Dunning
If you have discrete data, then I would think that simple cooccurrence mining would be more useful than full-on association mining. But is your data really a time series? Are you extracting discrete features from the time series? In the following, I am assuming that when you say "real-time energ

Re: scalding and mahout vector

2013-01-13 Thread Jake Mannix
I think the key is, as Ted says, to wrap the vector in a VectorWritable every place where you want to emit a writable form of it. In Scala terms, there are certainly two implicit conversions (a, ahem, bijection in fact) between Vector and VectorWritable, by the get/set encapsulation of the latter aroun

Re: Where to start refactoring?

2013-01-13 Thread Florents Tselai
Real-time energy data. Association mining is in fact the core analysis applied (but not the only one; it could be classification as well, for example). On Mon, Jan 14, 2013 at 1:34 AM, Ted Dunning wrote: > Can you say more about what kind of data and what kind of analysis? > > It is usually best if t

Re: Mahout TDD?

2013-01-13 Thread Ted Dunning
Not strictly, no. But most of the production code has reasonable levels of testing. On Sun, Jan 13, 2013 at 3:21 PM, Florents Tselai wrote: > Hello, > > is there any code in Mahout that was developed following TDD principles? >

Re: Where to start refactoring?

2013-01-13 Thread Ted Dunning
Can you say more about what kind of data and what kind of analysis? It is usually best if the work you do is motivated by your needs. On Sun, Jan 13, 2013 at 3:18 PM, Florents Tselai wrote: > Hello, > > In the next weeks/months I'll be using mahout for analyzing some big data > for a start-up a

Mahout TDD?

2013-01-13 Thread Florents Tselai
Hello, is there any code in Mahout that was developed following TDD principles?

[jira] [Commented] (MAHOUT-1140) Uniform random sampling problem in RandomSeedGenerator.java

2013-01-13 Thread sam wu (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552346#comment-13552346 ] sam wu commented on MAHOUT-1140: I attach a simple test program, TestUniformRandomSample.

[jira] [Updated] (MAHOUT-1140) Uniform random sampling problem in RandomSeedGenerator.java

2013-01-13 Thread sam wu (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam wu updated MAHOUT-1140: --- Attachment: (was: TestUniformRandomSample.java) > Uniform random sampling problem in RandomSeedGener

[jira] [Updated] (MAHOUT-1140) Uniform random sampling problem in RandomSeedGenerator.java

2013-01-13 Thread sam wu (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam wu updated MAHOUT-1140: --- Attachment: TestUniformRandomSample.java > Uniform random sampling problem in RandomSeedGenerator.java >

[jira] [Issue Comment Deleted] (MAHOUT-1140) Uniform random sampling problem in RandomSeedGenerator.java

2013-01-13 Thread sam wu (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam wu updated MAHOUT-1140: --- Comment: was deleted (was: I attach a simple test program, TestUniformRandomSample.java , to test the unifo

[jira] [Issue Comment Deleted] (MAHOUT-1140) Uniform random sampling problem in RandomSeedGenerator.java

2013-01-13 Thread sam wu (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam wu updated MAHOUT-1140: --- Comment: was deleted (was: test uniform sampling) > Uniform random sampling problem in RandomSeedGenera

[jira] [Commented] (MAHOUT-1140) Uniform random sampling problem in RandomSeedGenerator.java

2013-01-13 Thread sam wu (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13552345#comment-13552345 ] sam wu commented on MAHOUT-1140: I attach a simple test program, TestUniformRandomSample.

[jira] [Updated] (MAHOUT-1140) Uniform random sampling problem in RandomSeedGenerator.java

2013-01-13 Thread sam wu (JIRA)
[ https://issues.apache.org/jira/browse/MAHOUT-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam wu updated MAHOUT-1140: --- Attachment: TestUniformRandomSample.java test uniform sampling > Uniform random sampling pr

[jira] [Created] (MAHOUT-1140) Uniform random sampling problem in RandomSeedGenerator.java

2013-01-13 Thread sam wu (JIRA)
sam wu created MAHOUT-1140: -- Summary: Uniform random sampling problem in RandomSeedGenerator.java Key: MAHOUT-1140 URL: https://issues.apache.org/jira/browse/MAHOUT-1140 Project: Mahout Issue Type: