[jira] [Assigned] (MAHOUT-1252) Add support for Finite State Transducers (FST) as a DictionaryType.

2013-09-28 Thread Suneel Marthi (JIRA)

 [ https://issues.apache.org/jira/browse/MAHOUT-1252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suneel Marthi reassigned MAHOUT-1252:
-

Assignee: Suneel Marthi

> Add support for Finite State Transducers (FST) as a DictionaryType.
> ---
>
> Key: MAHOUT-1252
> URL: https://issues.apache.org/jira/browse/MAHOUT-1252
> Project: Mahout
> Issue Type: Improvement
> Components: Integration
> Affects Versions: 0.7
> Reporter: Suneel Marthi
> Assignee: Suneel Marthi
> Fix For: 1.0
>
>
> Add support for Finite State Transducers (FST) as a DictionaryType. This
> should result in an order-of-magnitude speedup of seq2sparse.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Build failed in Jenkins: mahout-nightly #1367

2013-09-28 Thread Apache Jenkins Server
See 

--
[...truncated 1817 lines...]
INFO: 
Sep 28, 2013 11:04:55 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-jar-plugin:2.4:jar (default-jar) @ mahout-core ---
[INFO] Building jar: 

Sep 28, 2013 11:04:56 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Sep 28, 2013 11:04:56 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-jar-plugin:2.4:test-jar (default) @ mahout-core ---
[WARNING] Artifact org.apache.mahout:mahout-core:test-jar:tests:0.9-SNAPSHOT 
already attached to project, ignoring duplicate
Sep 28, 2013 11:04:56 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Sep 28, 2013 11:04:56 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-assembly-plugin:2.4:single (job) @ mahout-core ---
[INFO] Reading assembly descriptor: src/main/assembly/job.xml
[INFO] Building jar: 

[WARNING] Artifact org.apache.mahout:mahout-core:jar:job:0.9-SNAPSHOT already 
attached to project, ignoring duplicate
Sep 28, 2013 11:05:03 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Sep 28, 2013 11:05:03 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-source-plugin:2.2.1:jar-no-fork (attach-sources) @ mahout-core 
---
[WARNING] Artifact 
org.apache.mahout:mahout-core:java-source:sources:0.9-SNAPSHOT already attached 
to project, ignoring duplicate
Sep 28, 2013 11:05:03 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Sep 28, 2013 11:05:03 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-install-plugin:2.5:install (default-install) @ mahout-core ---
[INFO] Installing 

 to 
/home/jenkins/jenkins-slave/maven-repositories/1/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT.jar
[INFO] Installing 
 to 
/home/jenkins/jenkins-slave/maven-repositories/1/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT.pom
[INFO] Installing 

 to 
/home/jenkins/jenkins-slave/maven-repositories/1/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT-tests.jar
[INFO] Installing 

 to 
/home/jenkins/jenkins-slave/maven-repositories/1/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT-job.jar
[INFO] Installing 

 to 
/home/jenkins/jenkins-slave/maven-repositories/1/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-SNAPSHOT-sources.jar
Sep 28, 2013 11:05:03 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Sep 28, 2013 11:05:03 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-deploy-plugin:2.5:deploy (default-deploy) @ mahout-core ---
Downloading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml
Downloaded: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml
 (2 KB at 2.4 KB/sec)
Uploading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130928.230504-61.jar
Uploaded: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130928.230504-61.jar
 (1333 KB at 5163.2 KB/sec)
Uploading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130928.230504-61.pom
Uploaded: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/mahout-core-0.9-20130928.230504-61.pom
 (7 KB at 66.6 KB/sec)
Downloading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/maven-metadata.xml
Downloaded: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/maven-metadata.xml
 (344 B at 2.1 KB/sec)
Uploading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml
Uploaded: 
https://repository.apache.org/content/repositories/snapshots/org/apache/mahout/mahout-core/0.9-SNAPSHOT/maven-metadata.xml
 (2 KB at 11.4 KB/sec)
Uploading: 
http

Build failed in Jenkins: mahout-nightly » Mahout Integration #1367

2013-09-28 Thread Apache Jenkins Server
See 


--
Sep 28, 2013 11:05:16 PM org.apache.maven.cli.event.ExecutionEventLogger 
projectStarted
INFO: 
Sep 28, 2013 11:05:16 PM org.apache.maven.cli.event.ExecutionEventLogger 
projectStarted
INFO: 
Sep 28, 2013 11:05:16 PM org.apache.maven.cli.event.ExecutionEventLogger 
projectStarted
INFO: Building Mahout Integration 0.9-SNAPSHOT
Sep 28, 2013 11:05:16 PM org.apache.maven.cli.event.ExecutionEventLogger 
projectStarted
INFO: 
Sep 28, 2013 11:05:17 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Sep 28, 2013 11:05:17 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-clean-plugin:2.4.1:clean (default-clean) @ mahout-integration 
---
[INFO] Deleting 

Sep 28, 2013 11:05:17 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Sep 28, 2013 11:05:17 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-resources-plugin:2.6:resources (default-resources) @ 
mahout-integration ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 0 resource
Sep 28, 2013 11:05:17 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Sep 28, 2013 11:05:17 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-compiler-plugin:3.1:compile (default-compile) @ 
mahout-integration ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 127 source files to 

[WARNING] Note: 

 uses or overrides a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
[WARNING] Note: 

 uses unchecked or unsafe operations.
[WARNING] Note: Recompile with -Xlint:unchecked for details.
Sep 28, 2013 11:05:19 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Sep 28, 2013 11:05:19 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-resources-plugin:2.6:testResources (default-testResources) @ 
mahout-integration ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 10 resources
Sep 28, 2013 11:05:20 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Sep 28, 2013 11:05:20 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ 
mahout-integration ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 37 source files to 

[WARNING] Note: Some input files use or override a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
[WARNING] Note: Some input files use unchecked or unsafe operations.
[WARNING] Note: Recompile with -Xlint:unchecked for details.
Sep 28, 2013 11:05:20 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: 
Sep 28, 2013 11:05:20 PM org.apache.maven.cli.event.ExecutionEventLogger 
mojoStarted
INFO: --- maven-surefire-plugin:2.16:test (default-test) @ mahout-integration 
---
[INFO] Surefire report directory: 


---
 T E S T S
---

Running org.apache.mahout.clustering.TestClusterEvaluator
Running org.apache.mahout.utils.vectors.arff.MapBackedARFFModelTest
Running org.apache.mahout.clustering.cdbw.TestCDbwEvaluator
Running org.apache.mahout.clustering.TestClusterDumper
Running org.apache.mahout.utils.vectors.VectorHelperTest
Running org.apache.mahout.utils.regex.RegexMapperTest
Running org.apache.mahout.text.SequenceFilesFromMailArchivesTest
Running org.apache.mahout.utils.TestConcatenateVectorsJob
Running org.apache.mahout.utils.SplitInputTest
Running org.apache.mahout.utils.vectors.lucene.DriverTest
Running org.apache.mahout.utils.vectors.arff.ARFFVectorIterableTest
Running 
org.apache.mahout.cf.taste.impl.similarity.jdbc.MySQLJD

Re: 0.9?

2013-09-28 Thread Ted Dunning
The one large-ish feature that I think would find general use would be a high 
performance classifier trainer.  

As a cleanup sort of thing, it would be good to fully integrate the streaming 
k-means into the normal clustering commands while revamping the command line 
API.  

Dmitriy's recent Scala work would help quite a bit before 1.0. Not sure it can 
make 0.9. 

For recommendations, I think that the demo system that Pat started with the 
elaborations by Ellen and Tim would be very good to have. 

I would be happy to collaborate with somebody on these but am not at all likely 
to have time to actually do them end to end. 

Sent from my iPhone

On Sep 28, 2013, at 12:40, Grant Ingersoll  wrote:

> Moving closer to 1.0, removing cruft, etc.  Do we have any more major 
> features planned for 1.0?  I think we said during 0.8 that we would try to 
> follow pretty quickly w/ another release.
> 
> -Grant
> 
> On Sep 28, 2013, at 12:33 PM, Ted Dunning  wrote:
> 
>> Sounds right in principle but perhaps a bit soon.  
>> 
>> What would define the release?
>> 
>> Sent from my iPhone
>> 
>> On Sep 27, 2013, at 7:48, Grant Ingersoll  wrote:
>> 
>>> Anyone interested in thinking about 0.9 in the early Nov. time frame?
>>> 
>>> -Grant
> 
> 
> Grant Ingersoll | @gsingers
> http://www.lucidworks.com


Re: 0.9?

2013-09-28 Thread Grant Ingersoll
Moving closer to 1.0, removing cruft, etc.  Do we have any more major features 
planned for 1.0?  I think we said during 0.8 that we would try to follow pretty 
quickly w/ another release.

-Grant

On Sep 28, 2013, at 12:33 PM, Ted Dunning  wrote:

> Sounds right in principle but perhaps a bit soon.  
> 
> What would define the release?
> 
> Sent from my iPhone
> 
> On Sep 27, 2013, at 7:48, Grant Ingersoll  wrote:
> 
>> Anyone interested in thinking about 0.9 in the early Nov. time frame?
>> 
>> -Grant


Grant Ingersoll | @gsingers
http://www.lucidworks.com







Re: 0.9?

2013-09-28 Thread Tom Griffin
unsubscribe

--
*Tom Griffin*
*Director, Innovation*
*
*
*Office: *732-562-6531
*Mobile: *201-259-8860
*Email:* t.p.grif...@ieee.org




On Sat, Sep 28, 2013 at 8:41 AM, Grant Ingersoll wrote:

>
> On Sep 27, 2013, at 9:07 AM, Suneel Marthi 
> wrote:
>
> > I was gonna bring this up myself next week (and was chatting with Isabel
> > about it today morning).
> >
> > I was thinking of the following for 0.9:-
> >
> > 1. We have already removed the algorithms that have been marked as
> > deprecated in 0.8
> > 2.  Bugs that have been fixed since 0.8.
> > 3.  New Features in 0.9 could include :-
> > a) New Multilayer Perceptron that Yexi had contributed recently and
> > is presently pending review (don't know the JIRA# top of my head).
> > b)  Using Finite State Transducers as a dictionary type. I had
> > opened a Jira for this and can work on it.
> >
>
> Are you using Lucene's FSTs for this?
>
> Rest sounds good.
>
>
> > Anything else others would like to add???
> >
> > Grant, could we have a hangout the week of Oct 7 :) ??
>
> I can't that week, but probably the following.
>
> >
> >
> >
> >
> > 
> > From: Grant Ingersoll 
> > To: "dev@mahout.apache.org" 
> > Sent: Friday, September 27, 2013 8:48 AM
> > Subject: 0.9?
> >
> >
> > Anyone interested in thinking about 0.9 in the early Nov. time frame?
> >
> > -Grant
>
> 
> Grant Ingersoll | @gsingers
> http://www.lucidworks.com


Re: 0.9?

2013-09-28 Thread Ted Dunning
Sounds right in principle but perhaps a bit soon.  

What would define the release?

Sent from my iPhone

On Sep 27, 2013, at 7:48, Grant Ingersoll  wrote:

> Anyone interested in thinking about 0.9 in the early Nov. time frame?
> 
> -Grant


Re: 0.9?

2013-09-28 Thread Suneel Marthi
Yep, I was planning on using Lucene's FST.





 From: Grant Ingersoll 
To: dev@mahout.apache.org; Suneel Marthi  
Sent: Saturday, September 28, 2013 8:41 AM
Subject: Re: 0.9?
 




On Sep 27, 2013, at 9:07 AM, Suneel Marthi  wrote:

>I was gonna bring this up myself next week (and was chatting with Isabel about 
>it today morning).
>
>I was thinking of the following for 0.9:-
>
>1. We have already removed the algorithms that have been marked as deprecated 
>in 0.8
>2.  Bugs that have been fixed since 0.8.
>3.  New Features in 0.9 could include :-
>    a) New Multilayer Perceptron that Yexi had contributed recently and is 
>presently pending review (don't know the JIRA# top of my head).  
>    b)  Using Finite State Transducers as a dictionary type. I had opened a 
>Jira for this and can work on it.
> 
>

Are you using Lucene's FSTs for this?

Rest sounds good.


>Anything else others would like to add???
>
>Grant, could we have a hangout the week of Oct 7 :) ??
>

I can't that week, but probably the following.


>
>
>
>
>From: Grant Ingersoll 
>To: "dev@mahout.apache.org"  
>Sent: Friday, September 27, 2013 8:48 AM
>Subject: 0.9?
>
>
>Anyone interested in thinking about 0.9 in the early Nov. time frame?
>
>-Grant


Grant Ingersoll | @gsingers
http://www.lucidworks.com

Re: 0.9?

2013-09-28 Thread Grant Ingersoll

On Sep 27, 2013, at 9:07 AM, Suneel Marthi  wrote:

> I was gonna bring this up myself next week (and was chatting with Isabel 
> about it today morning).
> 
> I was thinking of the following for 0.9:-
> 
> 1. We have already removed the algorithms that have been marked as deprecated 
> in 0.8
> 2.  Bugs that have been fixed since 0.8.
> 3.  New Features in 0.9 could include :-
> a) New Multilayer Perceptron that Yexi had contributed recently and is 
> presently pending review (don't know the JIRA# top of my head).  
> b)  Using Finite State Transducers as a dictionary type. I had opened a 
> Jira for this and can work on it.
>  

Are you using Lucene's FSTs for this?

Rest sounds good.


> Anything else others would like to add???
> 
> Grant, could we have a hangout the week of Oct 7 :) ??

I can't that week, but probably the following.

> 
> 
> 
> 
> 
> From: Grant Ingersoll 
> To: "dev@mahout.apache.org"  
> Sent: Friday, September 27, 2013 8:48 AM
> Subject: 0.9?
> 
> 
> Anyone interested in thinking about 0.9 in the early Nov. time frame?
> 
> -Grant


Grant Ingersoll | @gsingers
http://www.lucidworks.com