Dear Fang
There are quite a few tutorials available for doing matrix operations.
For example, a quick Google search for Hadoop matrix multiplication
turned up this result:
http://www.norstad.org/matrix-multiply/index.html
Is there any specific matrix operation you are looking for?
If yo
I see, thank you very much
On Sat, Mar 31, 2012 at 12:13 PM, Subir S wrote:
> You won't receive your own post, but others will. You are able to post;
> proof is my reply.
>
> On 3/31/12, Fang Xin wrote:
>> all
>>
>> sorry to bother you; as a new user, it seems that I cannot post anything.
>> I tried twice yesterday, but I didn't receive my own post...
You won't receive your own post, but others will. You are able to post; proof is my reply.
On 3/31/12, Fang Xin wrote:
> all
>
> sorry to bother you; as a new user, it seems that I cannot post anything.
> I tried twice yesterday, but I didn't receive my own post...
>
> Can anyone enlighten me? Thanks.
>
all
sorry to bother you; as a new user, it seems that I cannot post anything.
I tried twice yesterday, but I didn't receive my own post...
Can anyone enlighten me? Thanks.
Hi,
Just use "hadoop fs -text ". It would read most of these files
without breaking a sweat :)
You can look at its implementation inside FsShell.java if you want to
implement/reuse things in Java.
On Fri, Mar 30, 2012 at 11:01 PM, kasi subrahmanyam wrote:
> Hi Pedro, I am not sure we have a single method for reading the data in
> output files for different output formats.
Yes, JAVA_LIBRARY_PATH seems to be the approach that works (rather
than just putting it into tasktracker_opts etc.)
Thanks.
On Wed, Mar 28, 2012 at 9:57 PM, Harsh J wrote:
> George,
>
> This ought to work. Did you restart all your TTs for it to take effect?
>
> Also, the right way to do thi
Hi Pedro, I am not sure we have a single method for reading the data in
output files for different output formats.
But for sequence files we can use the SequenceFile.Reader class in the API
to read the sequence files.
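Roughly like this (a sketch only, assuming Writable key/value types and using
the older FileSystem/Path reader constructor):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

public class SeqFileDump {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path path = new Path(args[0]);  // e.g. a part-00000 written as a sequence file
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
    try {
      // Instantiate the key/value classes recorded in the file header.
      Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
      Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
      while (reader.next(key, value)) {
        System.out.println(key + "\t" + value);
      }
    } finally {
      reader.close();
    }
  }
}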
On Fri, Mar 30, 2012 at 10:49 PM, Pedro Costa wrote:
>
> The ReduceTask can save t
I believe there is no requirement for an OutputFormat to save both key and
value; therefore, it is not guaranteed that you can extract (key, value)
pairs from a file generated by an arbitrary OutputFormat.
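To illustrate the point (my own sketch, not anything in the Hadoop code
base), an output format like the following is perfectly legal, and the files
it writes can never be turned back into (key, value) pairs because the keys
are simply dropped:

import java.io.DataOutputStream;
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.RecordWriter;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ValueOnlyOutputFormat<K> extends FileOutputFormat<K, Text> {
  @Override
  public RecordWriter<K, Text> getRecordWriter(TaskAttemptContext context)
      throws IOException, InterruptedException {
    Path file = getDefaultWorkFile(context, "");
    final DataOutputStream out =
        file.getFileSystem(context.getConfiguration()).create(file, false);
    return new RecordWriter<K, Text>() {
      public void write(K key, Text value) throws IOException {
        out.writeBytes(value + "\n");  // the key is intentionally discarded
      }
      public void close(TaskAttemptContext c) throws IOException {
        out.close();
      }
    };
  }
}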
Zhu, Guojun
Modeling Sr Graduate
571-3824370
guojun_...@freddiemac.com
Financial Engineering
Hi,
In many Hadoop production environments you get gzipped files as the raw
input. Usually these are Apache HTTPD logfiles. When putting these gzipped
files into Hadoop, you are stuck with exactly one map task per input file. In
many scenarios this is fine. However, when doing a lot of work in this ve
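For reference, the one-map-per-gzip behaviour comes from the input format
refusing to split compressed files. A rough sketch of the check, along the
lines of what TextInputFormat does (newer releases additionally allow
SplittableCompressionCodec implementations such as bzip2):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class GzipAwareTextInputFormat extends TextInputFormat {
  @Override
  protected boolean isSplitable(JobContext context, Path file) {
    CompressionCodec codec =
        new CompressionCodecFactory(context.getConfiguration()).getCodec(file);
    // A gzipped file has a non-null codec and is therefore never split:
    // the whole .gz file becomes one split, hence exactly one map task.
    return codec == null;
  }
}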
A Hadoop cluster of low-end machines (2 cores, 2 GB RAM) can, with a parsing
fetcher, proper configuration, and evenly distributed fetch lists, process
up to ~15 URLs per second per node. Such a machine has a single mapper and a
single reducer running because of limited memory.
On Friday 30 M
@Christoph: Thanks for replying. I will try with more nodes and a larger URL
set to see how much improvement in processing time I get from the cluster.
@mapreduce-mailing-community: It would be great if anybody could help me with
a Nutch benchmark on a small cluster, since it would help me in determining the
no. of
Hi Ashish,
IMHO your numbers (2 machines, 10 URLs) are way too small for the parallelism
to outweigh the natural overhead that occurs with a distributed computation
(distributing the
program code, coordinating the distributed file system, making sure everybody
is starting and stopping, etc.). Also, if you're web c
>
> Hi,
>
>
> I have set up a Hadoop cluster (2 nodes) and I am running a Nutch crawl
> on it. I am trying to compare results and the improvement in processing time
> when I crawl with 10 URLs and depth 2. When I run the crawl on the
> cluster, it takes more time than on the pseudo cluster, which in turn