On Mon, 2012-11-19 at 15:54 -0500, Grant Ingersoll wrote:
> Correct. The 4.0 work is not committed yet. I'm hoping to consolidate some
> of the redundant code around Lucene as part of this upgrade. Also, some of
> the constructors, etc. appear to have changed. In general, I'd like to make
> it a little easier to leverage the variety of options some of the
Correct. The 4.0 work is not committed yet. I'm hoping to consolidate some of
the redundant code around Lucene as part of this upgrade. Also, some of the
constructors, etc. appear to have changed. In general, I'd like to make it a
little easier to leverage the variety of options some of the
On Nov 19, 2012, at 12:16 PM, Ted Dunning wrote:
> This looks like it may be an artifact of switching to Lucene 4.0.
>
> Grant?
I don't believe we have updated to 4 yet, unless I missed something.
Christopher, can you provide details on:
1. What version are you running? Is it 0.7 or a build
> I'm using 0.8-SNAPSHOT and indeed the root cause may be that Mahout
> does not support Solr 4.0 yet. I just tested with 3.6.1 and it works fine.
> The error, however, is a bit misleading...
>
Digging a bit deeper, I realized that the dependencies point to Lucene
3.6.0, which is the reason that Lucene
This looks like it may be an artifact of switching to Lucene 4.0.
Grant?
On Mon, Nov 19, 2012 at 9:12 AM, Christopher Laux wrote:
> Caused by: java.lang.NoSuchFieldError: LUCENE_36
> at
>
> org.apache.mahout.vectorizer.DefaultAnalyzer.<init>(DefaultAnalyzer.java:34)
> ... 11 more
>
> Any idea
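A `NoSuchFieldError: LUCENE_36` at class-initialization time usually means the `lucene-core` jar found on the runtime classpath does not define the `Version.LUCENE_36` constant that `DefaultAnalyzer` was compiled against. If you build against Mahout with Maven, one hedged way to rule out a version mix is to pin the Lucene dependency explicitly (coordinates below are the standard Lucene ones; 3.6.1 is the version the thread reports as working, so treat it as an illustrative choice, not a verified requirement):

```xml
<!-- Hypothetical pom.xml fragment: force a single lucene-core version
     so Mahout's DefaultAnalyzer resolves Version.LUCENE_36 at runtime.
     Adjust the version to whatever your Mahout build expects. -->
<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-core</artifactId>
  <version>3.6.1</version>
</dependency>
```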
Thanks for the hint. Now I get this exception:
$ mahout seq2sparse -i ~/run/posts2.seq -o ~/run/posts2-vec -seq -nv
Nov 19, 2012 6:09:22 PM org.apache.hadoop.mapred.LocalJobRunner$Job run
WARNING: job_local_0001
java.lang.IllegalStateException: java.lang.reflect.InvocationTargetException
at o
Hi Jake,
It's a great idea indeed. However, I'm new to Mahout; could you give me
some pointers as to where to publish this guide, and maybe point me to an
existing, well-formed guide that I could use as a model?
Thank you !
Jeremie
2012/11/16 Jake Mannix
> I'm glad to hear it's
(Yes, it is a Java binary requiring Java 6+. It runs against Hadoop
0.20.x - 2.0.x or work-alikes, or Amazon EMR. The work is in the
reducer in this implementation, so you would need to hand the reducers
extra memory instead of mappers. I think that you can run the whole
20M rows of input in Myrrix
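Since Sean notes the work sits in the reducers for this implementation, it is the reducer heap (not the mapper heap) that would need to grow. A hedged config fragment, not a verified invocation; the property name shown is the Hadoop 1.x-era one, and Hadoop 2.x uses `mapreduce.reduce.java.opts` instead:

```
# Illustrative only: raise the per-reducer JVM heap for the job.
# Property names vary by Hadoop version.
hadoop jar <your-job.jar> <MainClass> \
    -Dmapred.reduce.child.java.opts=-Xmx4g \
    ...
```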
Just checked top at a worker node during the 20% job (4,000,000 users in
my case): the java process uses 2800 MB (resident memory).
Good news for me, both U and M iteration passed on 20% sample.
Can I use the current M (computed on 20% of users, 15 iterations) to process
the remainder (the other 80% of users)?
Fix M, recompute
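Reusing the M trained on the 20% sample for the remaining users is essentially the "fix M, recompute U" fold-in step: with M held fixed, every user's factor vector is the solution of an independent k-by-k regularized least-squares problem, so it parallelizes trivially over users. A minimal pure-Python sketch of that step (illustrative only, not Mahout's implementation; the lambda-times-rating-count regularization follows the ALS-WR convention):

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    A = [row[:] for row in A]
    b = list(b)
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(A[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (b[r] - s) / A[r][r]
    return x

def user_factor(M, ratings, lam=0.065):
    """Recompute one user's factor vector with the item factors M held fixed.

    M: n_items x k item-factor matrix (list of lists).
    ratings: {item_index: rating} for this user.
    Solves (M_u^T M_u + lam * n_u * I) u = M_u^T r_u (ALS-WR normal equations).
    """
    k = len(M[0])
    n_u = len(ratings)
    A = [[lam * n_u if i == j else 0.0 for j in range(k)] for i in range(k)]
    b = [0.0] * k
    for item, r in ratings.items():
        m = M[item]
        for i in range(k):
            b[i] += m[i] * r
            for j in range(k):
                A[i][j] += m[i] * m[j]
    return solve(A, b)
```

Each remaining user can be handled this way in a map-only pass, since only M (150,000 x 20 doubles, a few tens of MB) has to be loaded per worker.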
That's huge. It means you need to fit a dense 20M x 20 matrix into the RAM
of the mappers that recompute U. This will require a few gigabytes...
If that doesn't work for you, you could try to rewrite the job to use
reduce-side joins to recompute the factors; this would, however, be a much
slower implementation.
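The "few gigabytes" figure checks out with back-of-envelope arithmetic (assuming 8-byte doubles and ignoring JVM object overhead, which pushes the real footprint higher):

```python
# Size of a dense 20M x 20 factor matrix held in each mapper's RAM.
users, features, bytes_per_double = 20_000_000, 20, 8
matrix_bytes = users * features * bytes_per_double
print(matrix_bytes / 10**9)  # 3.2 (GB), before JVM overhead
```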
About 20,000,000 users and 150,000 items, 0.03% non-zeros. 20 features
required.
Pavel
On 19.11.12 at 12:31, "Sebastian Schelter" wrote:
>You need to give much more memory than 200 MB to your mappers. What are
>the dimensions of your input in terms of users and items?
>
>--sebastian
>
>
You need to give much more memory than 200 MB to your mappers. What are
the dimensions of your input in terms of users and items?
--sebastian
On 19.11.2012 09:28, Abramov Pavel wrote:
> Thanks for your replies.
>
> 1)
>> Can you describe your failure or give us a stack trace?
>
>
> Here is the job log:
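Sebastian's advice means raising the per-mapper JVM heap well past the 200 MB default. A hedged config fragment, not a verified invocation; the property name is the Hadoop 1.x-era one (`mapreduce.map.java.opts` on Hadoop 2.x), and the flag names are those the Mahout 0.7-era ALS driver is believed to accept, so double-check them against your build:

```
# Illustrative only -- give each mapper a multi-gigabyte heap.
mahout parallelALS \
    -Dmapred.map.child.java.opts=-Xmx4g \
    --input ... --output ... \
    --numFeatures 20 --numIterations 15 --lambda ...
```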
Hi Sean,
> PS I think I mentioned off-list, but this is more or less exactly the
>basis
> of Myrrix (http://myrrix.com). It should be able to handle this scale,
> maybe slightly more easily since it can load only the subset of these
> matrices needed by each worker -- more reducers means less RAM
Thanks for your replies.
1)
> Can you describe your failure or give us a stack trace?
Here is the job log:
12/11/19 09:54:07 INFO als.ParallelALSFactorizationJob: Recomputing U
(iteration 0/15)
…
12/11/19 10:03:31 INFO mapred.JobClient: Job complete:
job_201211150152_1671
12/11/19 10:03:31 INFO a