hello Chris!
(if you are talking about serving language models and/or phrase tables)
i had a student look at using HBase for LMs this summer. i don't
think it is sufficiently quick to deal with millions of queries per
second, but that may be due to blunders on our part.
it may be possible that
Hi Mafish,
Thanks for your suggestions.
Finally I could resolve the issue. The *-site.xml on the namenode had
fs.default.name set to localhost, whereas on the data nodes it was the actual IP. I
changed localhost to the actual IP on the namenode and it started working.
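For anyone hitting the same thing: the relevant property is fs.default.name, and it has to resolve to the same namenode address from every machine in the cluster. A sketch of the entry (host and port here are placeholders, not from the original thread):

```xml
<!-- hadoop-site.xml: keep this identical on the namenode and all datanodes -->
<property>
  <name>fs.default.name</name>
  <!-- the actual namenode IP or hostname, never localhost on a multi-node cluster -->
  <value>hdfs://10.0.0.1:9000</value>
</property>
```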
Regards,
Sourav
Hi,
I am looking for information in the area of Hadoop tracing, instrumentation,
benchmarking and so forth.
What utilities exist? What's their maturity? Where can I get more information
about them?
I am curious about statistics on Hadoop behavior (for a typical workload?
for different workloads?). I am
[EMAIL PROTECTED] wrote:
thanks for the replies. So looks like replication might be the real
overhead when compared to scp.
Makes sense, but there's no reason why the first node you copy the data to
couldn't continue and pass that data on to the other nodes. If
it's in the same rack,
On Thursday 18 September 2008 04:12:13 pm Steve Loughran wrote:
[EMAIL PROTECTED] wrote:
thanks for the replies. So looks like replication might be the real
overhead when compared to scp.
Makes sense, but there's no reason why the first node you copy the data to
couldn't continue
Your custom implementation of any interface from hadoop-core should be
archived together with the application (i.e. in the same jar).
And the jar will be added to the CLASSPATH of the task runner, so your
CustomWritable class can be found.
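For what it's worth, the Writable contract is just a pair of write(DataOutput)/readFields(DataInput) methods. A minimal sketch of such a class (field names are made up for illustration; it uses plain java.io so it compiles without Hadoop on the classpath — the real class would additionally declare `implements org.apache.hadoop.io.Writable`):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class CustomWritable {
    private String url;
    private int count;

    public CustomWritable() {}  // Writable types need a no-arg constructor
    public CustomWritable(String url, int count) { this.url = url; this.count = count; }

    // Same signatures as org.apache.hadoop.io.Writable
    public void write(DataOutput out) throws IOException {
        out.writeUTF(url);
        out.writeInt(count);
    }

    public void readFields(DataInput in) throws IOException {
        url = in.readUTF();
        count = in.readInt();
    }

    public String getUrl() { return url; }
    public int getCount() { return count; }

    public static void main(String[] args) throws IOException {
        // Round-trip through the serialized form, as the framework does between tasks
        CustomWritable original = new CustomWritable("http://example.com", 42);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        original.write(new DataOutputStream(bytes));

        CustomWritable copy = new CustomWritable();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        System.out.println(copy.getUrl() + " " + copy.getCount());
    }
}
```

Then jar this together with your job classes, as said above, and the task runner's classloader will find it.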
On Thu, Sep 18, 2008 at 8:09 PM, Deepak Diwakar [EMAIL
You can refer to the Hadoop Map-Reduce Tutorial
On Thu, Sep 18, 2008 at 8:40 PM, Shengkai Zhu [EMAIL PROTECTED] wrote:
Your custom implementation of any interface from hadoop-core should be
archived together with the application (i.e. in the same jar).
And the jar will be added to the CLASSPATH
Where can you find the Hadoop Map-Reduce Tutorial?
Shengkai Zhu wrote:
You can refer to the Hadoop Map-Reduce Tutorial
On Thu, Sep 18, 2008 at 8:40 PM, Shengkai Zhu [EMAIL PROTECTED] wrote:
Your custom implementation of any interface from hadoop-core should be
archived together with the
Isn't one of the features of replication a guarantee that when my
write finishes, I know there are N replicas written?
Seems like if you want the quicker behavior, you write with
replication set to 1 for that file, then change the replication count
when you're finished.
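That write-then-raise pattern can also be done from the shell; a sketch, with made-up paths (the -setrep syntax is in the FsShell help):

```
# write with a single replica, then raise the count once the copy is done
hadoop dfs -D dfs.replication=1 -put local.dat /user/me/local.dat
hadoop dfs -setrep -w 3 /user/me/local.dat   # -w blocks until replication completes
```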
--
James Moore | [EMAIL
James Moore wrote:
Isn't one of the features of replication a guarantee that when my
write finishes, I know there are N replicas written?
This is what happens normally, but it is not a guarantee. When there are
errors, data might be written to fewer replicas.
Raghu.
Seems like if you want
Release 0.18.1 fixes 9 critical bugs in 0.18.0.
For Hadoop release details and downloads, visit:
http://hadoop.apache.org/core/releases.html
Hadoop 0.18.1 Release Notes are at
http://hadoop.apache.org/core/docs/r0.18.1/releasenotes.html
Thanks to all who contributed to this release!
Nigel
Hi,
I'm using 0.17.2.1 and see a reduce hang in the shuffle phase due
to an unresponsive node. From the reduce log (sorry that I didn't
keep it around), it got stuck copying map output from a dead
node (I cannot ssh to that one). At that point, all maps had already
finished. I'm wondering why this
Hello all,
Does anyone have some working example code for doing a map-side
(inner) join? The documentation at
http://tinyurl.com/43j5pp is less than enlightening...
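Not a complete working example, but as far as I can tell the shape of a map-side inner join in the mapred.join package is roughly this (paths are made up; both inputs have to be sorted on the key and partitioned identically for this to work, so treat it as a sketch, not tested code):

```java
JobConf job = new JobConf(MyJoinJob.class);
job.setInputFormat(CompositeInputFormat.class);
// "inner" requests an inner join over the listed sources
job.set("mapred.join.expr", CompositeInputFormat.compose(
    "inner", KeyValueTextInputFormat.class,
    new Path("/data/left"), new Path("/data/right")));
// The mapper then receives one TupleWritable per joined key,
// with one slot per input source.
```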
Thanks,
-Stuart
On 16-Sep-08, at 1:25 AM, Christian Ulrik Søttrup wrote:
Ok i've tried what you suggested and all sorts of combinations with
no luck.
Then I went through the source of the Streaming lib. It looks like
it checks for the existence of the combiner while it is building the
jobconf, i.e. before
Here is the link
http://hadoop.apache.org/core/docs/current/mapred_tutorial.html
On Thu, Sep 18, 2008 at 9:16 PM, chanel [EMAIL PROTECTED] wrote:
Where can you find the Hadoop Map-Reduce Tutorial?
Shengkai Zhu wrote:
You can refer to the Hadoop Map-Reduce Tutorial
On Thu, Sep 18, 2008 at
Replying to myself: I'm using streaming and the task timeout was set to 0,
so that's why.
On Fri, Sep 19, 2008 at 3:34 AM, Rong-en Fan [EMAIL PROTECTED] wrote:
Hi,
I'm using 0.17.2.1 and see a reduce hang in the shuffle phase due
to an unresponsive node. From the reduce log (sorry that I didn't
keep
Hello,
I am running a custom crawler (written internally) using hadoop
streaming. I am attempting to
compress the output using LZO, but instead I am receiving corrupted
output that is neither in the
format I am aiming for nor a valid LZO file. Is this a known
issue? Is there anything
I am
this time, I set task timeout to 10m via
-jobconf mapred.task.timeout=60
However, I still see this hang at the shuffle stage, and lots
of messages like the one below appear in the log
2008-09-19 12:34:02,289 INFO org.apache.hadoop.mapred.ReduceTask:
task_200809190308_0007_r_01_1 Need 6 map output(s)
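One thing worth checking: mapred.task.timeout is specified in milliseconds, so ten minutes would be a value like the following (illustrative, not from the original message):

```
-jobconf mapred.task.timeout=600000
```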
hi everybody.
can anyone please help me: how do I get the input filename in DFS as the key in
the output?
example: [filename, value]
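One common approach: the framework puts the current split's file path in the job conf under map.input.file, so a mapper can grab it in configure() and emit it as the key. A hedged sketch against the old org.apache.hadoop.mapred API (class and field names are mine, not from this thread):

```java
// Sketch only; assumes the 0.18-era org.apache.hadoop.mapred API.
public class FileNameMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {
    private final Text fileName = new Text();

    @Override
    public void configure(JobConf job) {
        // "map.input.file" holds the path of the split being processed
        fileName.set(job.get("map.input.file"));
    }

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, Text> out, Reporter reporter)
            throws IOException {
        out.collect(fileName, value);  // emits [filename, value] pairs
    }
}
```

(In streaming, the same value shows up as the map_input_file environment variable, since job conf entries are exported with dots replaced by underscores.)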
The key is of the form ID:DenseVector representation in Mahout with
I guess the vector size is too large, so it'll need a distributed vector
architecture (or 2D partitioning strategies) for large-scale matrix
operations. The Hama team is investigating these problem areas. So, it will
be improved if
Even if writes happen in parallel from a single machine, wouldn't
network congestion cause a slowdown due to packet collisions?
- Prasad.
On Thursday 18 September 2008 10:47:48 pm Raghu Angadi wrote:
Steve Loughran wrote:
[EMAIL PROTECTED] wrote:
thanks for the replies. So looks
Yeah, that was the problem. And Hama can surely be useful for large-scale
matrix operations.
But for this problem, I have modified the code to pass just the ID information
and read the vector information only when it is needed. In this case, it was
needed only in the reduce phase. This way,