I agree with Ted: screwing up your repository (I assume a local clone
of something remote) in svn is much easier than in git, for example by
moving a folder from one place to another. If I can recommend
something, this book is quite nice, especially for beginners ("Basic
Usage" chapter):
http://bo
[
https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107640#comment-13107640
]
Lance Norskog commented on MAHOUT-524:
--
As for 5-d points vs. 2-d points, SVD does a
I have the ability to bollix svn in ways that nobody else fathoms. Some fans
promote Mercurial as "Git without pain".
On Sun, Sep 18, 2011 at 8:09 PM, Ted Dunning wrote:
> On Sun, Sep 18, 2011 at 5:21 PM, Lance Norskog wrote:
>
> > One important caveat: git is a rope factory for hanging yoursel
[
https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lance Norskog updated MAHOUT-524:
-
Attachment: EclipseLog_20110918.txt
> DisplaySpectralKMeans example fails
> -
[
https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107635#comment-13107635
]
Lance Norskog commented on MAHOUT-524:
--
For completeness, the log when running under
[
https://issues.apache.org/jira/browse/MAHOUT-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107633#comment-13107633
]
Lance Norskog commented on MAHOUT-524:
--
Possibly a little help. When run from the com
[
https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107609#comment-13107609
]
Dmitriy Lyubimov commented on MAHOUT-814:
-
it actually will only manifest in tes
On Sun, Sep 18, 2011 at 5:21 PM, Lance Norskog wrote:
> One important caveat: git is a rope factory for hanging yourself. It badly
> needs a Chef/Puppet-style "describe the end result" executor. Don't be
> surprised when you have to re-build your whole checkout when something
> unfathomable blows
[
https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107567#comment-13107567
]
Grant Ingersoll commented on MAHOUT-814:
Yeah, it is different. It's not actually
It's gitk on windows. Also there's a Tortoise git manager for the windows
desktop. And Github has a mac-only local management app.
One important caveat: git is a rope factory for hanging yourself. It badly
needs a Chef/Puppet-style "describe the end result" executor. Don't be
surprised when you ha
As part of that learning curve, make sure you check out gitx (on the mac,
gitg on linux, I don't care what is on windows).
It makes it easier to understand what the branching structure is.
I recommend invoking as gitx --all to show all of the branches right away.
This will highlight the interest
On Sun, Sep 18, 2011 at 2:15 PM, Grant Ingersoll wrote:
> Cool, I've pushed my changes to ClusterDumper to Lucid's github account
> (lucidimagination) and am planning on pushing all of it to Mahout this week.
> It is now possible to output CSV, Text (the current option) and GraphML.
> Easy enoug
[
https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107534#comment-13107534
]
Dmitriy Lyubimov commented on MAHOUT-814:
-
oh I think this particular thing is dif
I didn't mean to criticize github -- I use it myself for a number of
projects and I've been extremely happy with their service. I merely
suggested that in terms of the learning curve one may wish to start
with local branches and then slowly progress to adding more remote
sources. I think throwing m
[
https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dmitriy Lyubimov updated MAHOUT-814:
Summary: SSVD local tests should use their own tmp space to avoid
collisions (was: QRFirst
[
https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107529#comment-13107529
]
Dmitriy Lyubimov commented on MAHOUT-814:
-
they used to use their own space. But I
Yes, one doesn't have to use github of course. I do it just to share,
collaborate and let people try and preview what I do with a more timely,
detailed history and in a more convenient way than an issue patch allows.
Besides, it allows me to have a backup in case my desktop disk goes cuckoo,
and wor
> That is, once you are over the learning curve and have a good workflow! I've
> been doing an SVN patch workflow for a long time now and it has served me
> well. Oh well, time to move on!
I'll put it this way: moving to git is well worth the time spent on
learning. I was a skeptic myself... f
On Sep 18, 2011, at 3:20 PM, Ted Dunning wrote:
> Actually, this is important to say. Speed is one of the huge advantages of
> git over other options.
That is, once you are over the learning curve and have a good workflow! I've
been doing an SVN patch workflow for a long time now and it has s
Cool, I've pushed my changes to ClusterDumper to Lucid's github account
(lucidimagination) and am planning on pushing all of it to Mahout this week.
It is now possible to output CSV, Text (the current option) and GraphML. Easy
enough to extend to output JSON or whatever. I would imagine it wo
Actually, this is important to say. Speed is one of the huge advantages of
git over other options.
On Sun, Sep 18, 2011 at 1:13 PM, Dawid Weiss
wrote:
> In case of Lucene you can also work on multiple svn branches and do
> the switching using git... needless to say this is way faster than
> usin
[
https://issues.apache.org/jira/browse/MAHOUT-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107508#comment-13107508
]
Ted Dunning commented on MAHOUT-542:
Go for it. The Hadoop API is very confused at th
I looked at it -- yes, this is the way to follow. You can save some
complexity by not keeping a github remote (if you work from one place,
a local feature branch is enough, no need to push/pull to github).
In case of Lucene you can also work on multiple svn branches and do
the switching using git.
Dmitriy documented his work-flow which is very similar to this:
http://weatheringthrutechdays.blogspot.com/2011/04/git-github-and-committing-to-asf-svn.html
I use his process almost exactly.
On Sun, Sep 18, 2011 at 5:58 AM, Dawid Weiss
wrote:
> Yes, these instructions worked for me:
> go to htt
You have to make one hack to make sure that the JS downloads from your local
server, but that is easy.
On Sun, Sep 18, 2011 at 12:17 PM, Ted Dunning wrote:
> Yes. The old stuff from google used to require their servers and was very
> limited on size of data.
>
> This newer stuff is not.
>
>
> O
Yes. The old stuff from google used to require their servers and was very
limited on size of data.
This newer stuff is not.
On Sun, Sep 18, 2011 at 4:46 AM, Grant Ingersoll wrote:
>
> On Sep 17, 2011, at 9:22 PM, Ted Dunning wrote:
>
> > I strongly recommend Google's visualization API.
>
> Cool
The LuceneIterator has a built-in circuit breaker if it gets too many errors.
If you are using lucene.vector, you can pass in --maxPercentErrorDocs X, where
X is some percentage of docs you are willing to allow errors in. The default
is no errors.
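
To make the circuit-breaker idea concrete, here is a minimal sketch in Python. It is not the LuceneIterator code: the function name, the exception class, and the convention that a failed document converts to None are all assumptions for illustration, and so is reading the threshold as a percentage of the total document count. The only behaviour taken from the description above is that errors are tolerated up to a limit and that the default tolerates none.

```python
class TooManyErrorDocs(RuntimeError):
    """Raised when more documents fail than the caller agreed to tolerate."""


def vectors_with_circuit_breaker(docs, to_vector, max_percent_error_docs=0.0):
    """Yield one vector per document, skipping failures until the limit trips.

    to_vector is a hypothetical converter that returns None when a document
    has no usable term vector. max_percent_error_docs is assumed to be a
    percentage (0-100) of the total number of documents.
    """
    docs = list(docs)
    # Number of failing documents we are willing to skip before giving up.
    max_error_docs = len(docs) * max_percent_error_docs / 100.0
    error_docs = 0
    for doc in docs:
        vector = to_vector(doc)
        if vector is None:
            error_docs += 1
            if error_docs > max_error_docs:
                raise TooManyErrorDocs(
                    "%d documents failed, limit is %.1f" % (error_docs, max_error_docs)
                )
            continue  # tolerated failure: skip this document
        yield vector
```

With the default of 0.0 the first failing document trips the breaker, which matches "the default is no errors"; raising the percentage lets the iteration skip bad documents instead.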
On Sep 18, 2011, at 10:48 AM, Philippe Adji
[
https://issues.apache.org/jira/browse/MAHOUT-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107458#comment-13107458
]
Fabian Alenius commented on MAHOUT-542:
---
Okay. I'll wait a bit to see if anyone obje
I'd say no because it's only a copy of
ExpectationMaximizationSVDFactorizer that uses a hacky/quirky DataModel
implementation to save a lot of RAM.
On 18.09.2011 10:59, Lance Norskog wrote:
> Should ParallelArraysSGDFactorizer be promoted to the svdrecommender package
> in core/src?
>
[
https://issues.apache.org/jira/browse/MAHOUT-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107454#comment-13107454
]
Sebastian Schelter commented on MAHOUT-542:
---
Current hadoop version is 0.20.204.
[
https://issues.apache.org/jira/browse/MAHOUT-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13107453#comment-13107453
]
Fabian Alenius commented on MAHOUT-542:
---
Hi, I was thinking of rewriting the itemRat
Hi,
I was trying to generate vectors from a Lucene index using the lucene.vector
driver. It worked fine with Mahout 0.4, but in Mahout 0.5 I get the
following exception:
SEVERE: There are too many documents that do not have a term vector for
description
Exception in thread "main" java.lang.Illega
[
https://issues.apache.org/jira/browse/MAHOUT-814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Ingersoll updated MAHOUT-814:
---
Summary: QRFirstStep should use their own tmp space to avoid collisions
(was: LocalSSDSolver
LocalSSDSolver tests should use their own tmp space to avoid collisions
---
Key: MAHOUT-814
URL: https://issues.apache.org/jira/browse/MAHOUT-814
Project: Mahout
Issue Type:
[
https://issues.apache.org/jira/browse/MAHOUT-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen resolved MAHOUT-813.
--
Resolution: Fixed
Assignee: Sean Owen (was: Grant Ingersoll)
> RecommenderJob incorrectly sets i
RecommenderJob incorrectly sets io.sort.mb
--
Key: MAHOUT-813
URL: https://issues.apache.org/jira/browse/MAHOUT-813
Project: Mahout
Issue Type: Bug
Affects Versions: 0.6
Reporter: Grant
I opened MAHOUT-813.
Agreed, a cap would be good. And agreed on reasoning to set it.
On Sep 18, 2011, at 9:30 AM, Sean Owen wrote:
> I can just cap it at, say, 1024MB.
>
> This isn't in the config because that would change it for all jobs, and it
> is probably not a good idea in general to us
I can just cap it at, say, 1024MB.
This isn't in the config because that would change it for all jobs, and it
is probably not a good idea in general to use so much memory for the
combiner. Here it's the right thing to do.
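
A rough sketch of what such a cap could look like, written in Python for brevity rather than as the actual RecommenderJob change. The function name and the way the requested value is obtained are hypothetical; only the 1024MB figure comes from the suggestion above.

```python
SORT_MB_CAP = 1024  # cap proposed in this thread, in megabytes


def capped_sort_mb(requested_mb, cap_mb=SORT_MB_CAP):
    """Return the io.sort.mb value to set for this one job, never above the cap.

    requested_mb stands in for whatever the job would otherwise compute
    (for example from the task heap size); that computation is not shown here.
    """
    if requested_mb <= 0:
        raise ValueError("io.sort.mb must be positive, got %r" % (requested_mb,))
    return min(requested_mb, cap_mb)
```

Under this sketch a request for 2048, the value rejected in the reported exception, would come out as 1024, while smaller requests pass through unchanged. Setting the result on the job's own configuration rather than in the cluster config keeps the change local to this job.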
On Sun, Sep 18, 2011 at 2:26 PM, Grant Ingersoll wrote:
> I'm trying to r
Awesome! Thanks.
On Sep 18, 2011, at 7:58 AM, Dawid Weiss wrote:
> Yes, these instructions worked for me:
> go to http://wiki.apache.org/general/GitAtApache, then: "Git for
> Apache committers". The URL for git svn init needs to be:
>
> git svn init --prefix=origin/ --tags=tags --trunk=trunk
>
I'm trying to run the RecommenderJob (trunk as of this morning) and am getting:
java.io.IOException: Invalid "io.sort.mb": 2048
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:939)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:673)
Yes, these instructions worked for me:
go to http://wiki.apache.org/general/GitAtApache, then: "Git for
Apache committers". The URL for git svn init needs to be:
git svn init --prefix=origin/ --tags=tags --trunk=trunk
--branches=branches https://svn.apache.org/repos/asf/lucene/dev
Should work out
Resurrecting old thread...
I originally just cloned from the ASF Git mirrors. Is there a way to then
associate it with an SVN repos so that I can then push a branch to SVN? I've
got a rather large set of changes across several commits (and don't remember
when I started). My thinking was I wo
On Sep 17, 2011, at 9:22 PM, Ted Dunning wrote:
> I strongly recommend Google's visualization API.
Cool. Here I thought it required using Goog's servers, but I guess not. So
you can run the server and hit it locally?
>
> This is divided into two parts, the reporting half and the data source
Should ParallelArraysSGDFactorizer be promoted to the svdrecommender package
in core/src?
--
Lance Norskog
goks...@gmail.com