> On 24 May 2016, at 19:53, Chris Nauroth wrote:
>
> Hello Dmitry,
>
> To clarify, the intent of MAPREDUCE-5065 was to message the user that
> using different block sizes on source and destination might cause a
> failure due to a checksum mismatch. The message to the user
> On 21 May 2016, at 09:34, Dmitry Sivachenko <trtrmi...@gmail.com> wrote:
>
>
>> On 21 May 2016, at 02:15, Chris Nauroth <cnaur...@hortonworks.com> wrote:
>>
>> Hello Dmitry,
>>
>> MAPREDUCE-5065 has been included in these branches f
> On 21 May 2016, at 02:15, Chris Nauroth wrote:
>
> Hello Dmitry,
>
> MAPREDUCE-5065 has been included in these branches for a long time. Are
> you certain that you passed a dfs.blocksize equal to what was used in the
> source files? Did all source files use the
Hello,
When I copy files with distcp and -D dfs.blocksize=XXX (hadoop-2.7.2), it fails
with a "Source and target differ in block-size" error, even though MAPREDUCE-5065
was committed 3 years ago.
Is it possible to merge this change into the 2.7 / 2.8 branches?
Thanks.
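For reference, a sketch of the kind of invocation in question (the block-size value
and paths here are illustrative, not the ones from the failing run):

hadoop distcp -D dfs.blocksize=268435456 hdfs://src-cluster/data hdfs://dst-cluster/data

Passing -pb so that distcp preserves the source block size is the usual way to keep
the block-level checksums comparable, assuming preserving the block size is acceptable:

hadoop distcp -pb hdfs://src-cluster/data hdfs://dst-cluster/data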
Hello,
I set up a hadoop 2.7.2 cluster on Ubuntu 16.04 with OpenJDK 8. After running
TeraGen from the examples jar:
hadoop jar
/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar
teragen 100 /user/mitya/terasort-input
I see that many NodeManagers are not running
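(As an aside, not from the original message: one generic way to check which
NodeManagers the ResourceManager currently sees is

yarn node -list -all

which also shows nodes in UNHEALTHY or LOST state.)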
> On 23 Sep 2015, at 22:08, Naganarasimha Garla
> wrote:
>
> Sorry for the late reply; I thought of providing you some search strings for
> blacklisting, hence got a little delayed.
> As Varun mentioned, it looks more like an app blacklisting case.
>
is 1000
What do these mean?
>
>
> Regards,
> + Naga
>
>
> From: Dmitry Sivachenko [trtrmi...@gmail.com]
> Sent: Wednesday, September 23, 2015 03:57
> To: user@hadoop.apache.org
> Subject: node remains unused after reboot
>
> Hello!
>
> I am using h
Hello!
I am using hadoop-2.7.1. I have a large map job running (about 3000 cores available
on the cluster, about 35000 tasks in total).
In the middle of this process one server reboots.
After reboot, the nodemanager starts successfully and registers with the resource
manager:
2015-09-23 01:06:24,656 INFO
Hello!
I am trying to perform a rolling upgrade of a cluster running hadoop-2.4.1
without downtime, following the procedure described at
http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html
The first command to execute is:
% hdfs dfsadmin -rollingUpgrade
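For context, the rolling-upgrade subcommands described on that page are listed here
for reference (the original message is truncated before the subcommand):

% hdfs dfsadmin -rollingUpgrade prepare    (create a rollback image before upgrading)
% hdfs dfsadmin -rollingUpgrade query      (check whether the rollback image is ready)
% hdfs dfsadmin -rollingUpgrade finalize   (finalize once all nodes run the new version)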
Hello!
I am using the following procedure to build hadoop from sources:
First, run mvn -Dmaven.repo.local=/path/to/m2 to populate the /path/to/m2 directory
with the required artifacts;
then I always run mvn -Dmaven.repo.local=/path/to/m2 --offline so that it does not
download anything during the build.
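A sketch of that two-step build (the goals and profiles shown are illustrative; the
original message does not include them):

mvn install -DskipTests -Dmaven.repo.local=/path/to/m2
mvn package -Pdist -DskipTests -Dmaven.repo.local=/path/to/m2 --offline

The first pass is allowed to download into /path/to/m2; the second pass then runs
with --offline so Maven resolves everything from that local repository.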
Hello!
I am using hadoop-2.4.1 in distributed mode. After a job completes, its logs are
aggregated to HDFS and become available via the history server.
Sometimes the logs appear very quickly after the job completes (or fails), but
sometimes it takes a long time (10, 20, or 30 minutes).
During that period the history server
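(As an aside, not from the original message: once aggregation has finished, the
aggregated logs can also be fetched from the command line with

yarn logs -applicationId <application_id>

which reads them directly from HDFS.)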
After a streaming job outputs some data to stdout, some Hadoop code receives it
and splits it into a key/value pair before it reaches TextOutputFormat.
Can anyone point me to that piece of code please?
Thanks!
On 11 Sep 2014, at 0:37, Dmitry Sivachenko trtrmi...@gmail.com wrote:
On 10 Sep
Okay, FWIW I found the solution:
https://issues.apache.org/jira/browse/MAPREDUCE-6085
Thanks to all who replied.
On 11 Sep 2014, at 11:16, Dmitry Sivachenko trtrmi...@gmail.com wrote:
After a streaming job outputs some data to stdout, some Hadoop code receives it
and splits it into key
Hello!
Imagine the following common task: I want to process a big text file line-by-line
using the streaming interface.
Run the Unix grep command, for instance, or some other line-by-line processing,
e.g. line.upper().
I copy the file to HDFS.
Then I run a map task on this file which reads one line,
it in python.
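A sketch of how such a map-only streaming job might be launched (the jar path,
input/output paths and the grep pattern are illustrative, not from the original
message):

hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
    -input /user/mitya/input \
    -output /user/mitya/output \
    -mapper 'grep some-pattern' \
    -numReduceTasks 0

With a mapper like this, everything up to the first tab of each output line is
treated as the key and the remainder as the value, which is the behaviour the rest
of this thread is about.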
On 9/10/14, Dmitry Sivachenko trtrmi...@gmail.com wrote:
Hello!
Imagine the following common task: I want to process a big text file
line-by-line using the streaming interface.
Run the Unix grep command, for instance, or some other line-by-line processing,
e.g. line.upper().
I copy
On 10 Sep 2014, at 22:19, Rich Haase rdha...@gmail.com wrote:
You can write a custom output format
Any clues how this can be done?
, or you can write your mapreduce job in Java and use a NullWritable as
Susheel recommended.
grep (and every other *nix text processing
in the original line) or not.
What is the proper way to work around that issue?
Regards,
Shahab
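For what it's worth, a minimal sketch of the NullWritable approach mentioned above
(the class name and wiring are illustrative, not code from this thread): a map-only
job whose mapper emits NullWritable keys, so TextOutputFormat writes just the value,
without a key or tab separator.

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Illustrative mapper: passes each input line through unchanged and emits
// NullWritable as the key, so TextOutputFormat writes the value alone.
public class PassThroughMapper
    extends Mapper<LongWritable, Text, NullWritable, Text> {
  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    context.write(NullWritable.get(), value);
  }
}

In the driver one would set job.setMapperClass(PassThroughMapper.class),
job.setOutputKeyClass(NullWritable.class), job.setOutputValueClass(Text.class)
and job.setNumReduceTasks(0), so the map output goes straight to TextOutputFormat.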
On Wed, Sep 10, 2014 at 2:28 PM, Dmitry Sivachenko trtrmi...@gmail.com
wrote:
On 10 Sep 2014, at 22:19, Rich Haase rdha...@gmail.com wrote:
You can write a custom output format
Any
    if (!nullValue) {
      writeObject(value);
    }
    out.write(newline);
  }
On Sep 10, 2014, at 1:37 PM, Dmitry Sivachenko trtrmi...@gmail.com wrote:
On 10 Sep 2014, at 22:33, Felix Chern idry...@gmail.com wrote:
Use ‘tr -s’ to strip out tabs?
$ echo -e a\t\t\tb
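(For illustration, not from the original message: with -s, tr squeezes runs of tabs
down to a single tab rather than removing them entirely, e.g.

$ echo -e 'a\t\t\tb' | tr -s '\t'

prints a<TAB>b.)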
I am trying hadoop-1.2.1 on FreeBSD-10 (installed from ports).
I see the following exception in the datanode's and tasktracker's logs:
2014-06-29 10:13:17,105 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded
the native-hadoop library
2014-06-29 10:13:17,106 ERROR
Hello!
FreeBSD does not need -ldl when linking programs that use dlopen() (dlopen is
in libc).
Now I am getting the following error trying to compile hadoop-2.2.0 on FreeBSD:
[exec] /usr/bin/cc -fPIC -g -Wall -O2 -D_REENTRANT -D_GNU_SOURCE
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64
Hello!
I set up hadoop-1.2.1 on FreeBSD-10/stable with openjdk version 1.7.0_60.
At first glance it is doing well, except for one annoying thing: after
executing some tasks, the tasktracker process starts to eat CPU when idle.
Sometimes it is 10-20% (numbers from top(1) output), sometimes it is
, at 2:15, Dmitry Sivachenko trtrmi...@gmail.com wrote:
Hello!
I set up hadoop-1.2.1 on FreeBSD-10/stable with openjdk version 1.7.0_60.
At first glance it is doing well, except for one annoying thing: after
executing some tasks, the tasktracker process starts to eat CPU when idle.
Sometimes