Kai, this is great. It is well down the path to solving the
small/object-as-file problem. Good show!
*Daemeon C.M. Reiydelle | San Francisco 1.415.501.0198 | London 44 020 8144 9872*
On Mon, Sep 4, 2017 at 8:56 PM, Zheng, Kai wrote:
> A nice discussion about support of small
Determine what is meant by "disaster recovery". What are the scenarios, and
what data?
Architect to the business need, not the buzzwords.
*“Anyone who isn’t embarrassed by who they were last year probably isn’t
learning enough.” - Alain de Botton*
*Daemeon C.M. Reiydelle | USA (+1)
Another option is to stop the node's relevant Hadoop services (including
e.g. Spark, Impala, etc. if applicable), move the existing local storage,
mount the desired file system, and move the data over. Then just restart
Hadoop. As long as this does not take too long, you don't have write
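Roughly, the sequence looks like this (service names, devices, and paths
below are placeholders for whatever your distribution uses, so treat it as a
sketch rather than a recipe):

    # stop the Hadoop worker services on this node first (names vary by packaging)
    sudo systemctl stop hadoop-yarn-nodemanager hadoop-hdfs-datanode
    # mount the desired file system and copy the existing DataNode data over
    sudo mount /dev/sdX1 /data/new
    sudo rsync -aHAX /data/old/dfs/ /data/new/dfs/
    # point dfs.datanode.data.dir at the new location (or remount over the old
    # path), then restart the services
    sudo systemctl start hadoop-hdfs-datanode hadoop-yarn-nodemanager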
A possibility is that the node showing errors was not able to get a TCP
connection, or hit heavy network congestion, or (possibly) heavy garbage
collection timeouts. I would suspect the network
...
There is no sin except stupidity - Oscar Wilde
...
Daemeon (Dæmœn) Reiydelle
USA 1.415.501.0198
On Jul 3, 2017
For fairly simple transformations, Flume is great, and works fine
subscribing to some pretty high volumes of messages from Kafka (I think we
hit 50M/second at one point). If you need to do complex transformations,
e.g. database lookups for the Kafka to Hadoop ETL, then you will start having
Readers ARE parallel processes, one per map task. There are defaults in the
map phase for how many readers there are for the input file(s). The default
is one mapper task per block (or per file, where the file is smaller than
the HDFS block size). There is no Java framework per se for splitting up a file
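If it helps, one rough way to see that mapping (the file path here is just
an example) is to count the blocks with fsck; by default you get roughly one
map task per block:

    hdfs fsck /data/input/big.log -files -blocks
    # a file smaller than dfs.blocksize shows a single block, hence a single mapper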
Hello, and thanks for reaching out to me. Let me know if the first section
of the attached resume is helpful?
CV attached. Currently working in San Francisco
Daemeon Reiydelle
*...*
*Daemeon C.M. Reiydelle | USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*
On Sat, Feb 18, 2017 at 6:33
JHS does NOT use ZK for any purpose. The HA options of the NN and RM do. That
is the ONLY ZK use in Hadoop.
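A quick, non-authoritative way to confirm what your own cluster points at ZK
(standard HA property names; the config path is the usual default):

    # NameNode HA (ZKFC) quorum; empty/missing if ZK-based NN HA is not configured
    hdfs getconf -confKey ha.zookeeper.quorum
    # ResourceManager HA ZooKeeper address, if set
    grep -A1 'yarn.resourcemanager.zk-address' /etc/hadoop/conf/yarn-site.xml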
*...*
*Daemeon C.M. Reiydelle | USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*
On Wed, Nov 16, 2016 at 4:29 PM, Benson Qiu
wrote:
> Hi Ravi,
>
> Nothing gave
What do you mean by "does not enter ... class(es)"?
Does the log show that the scheduler ever accepts the job (you may have to
turn logging up)? Are "other" jobs that are submitted to the same class
under your user scheduled and executed? I wonder which scheduler? What is
the definition for the
Have you considered the probability of failure (mean time to failure) of a
disk, then factored in that the probability is 12 times as likely with a
RAID 0? Then compare that to the time to replicate in degraded mode where
you have such a large number of drives on each node?
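As a back-of-envelope sketch (the 3% annual failure rate is purely an assumed
number, not from this thread): with 12 drives striped in RAID 0, losing any
one drive loses the whole array.

    awk 'BEGIN { p = 0.03; n = 12;
                 printf "single drive: %.1f%%  12-drive RAID 0: %.1f%%\n",
                        p*100, (1 - (1-p)^n)*100 }'
    # -> single drive: 3.0%  12-drive RAID 0: 30.6%, close to the naive 12x figure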
Secondly, there
There are indeed many tuning points here. If the name nodes and journal
nodes can be larger, perhaps even bonding multiple 10 GbE NICs, one can
easily scale. I did have one client where the file counts forced multiple
clusters. But we were able to differentiate by airframe types ... e.g. fixed
wing
"Best" practices are either so generic as to be mostly useless, or
dependent on very specific business processes, SLA's, and OLA's relevant to
different data models, etc.
If you can identify specific business use cases?
Even basic attributes change the model: Containers on CoreOS? White boxes?
In addition to the data privacy concern described well below, there are a
couple of other areas you might consider (you can also respond privately,
where I can be a bit more candid). My experience with most banks (I work
with most of the players in the EU and US) is such that (1) below drives
Based on the brief description, which includes the relatively small
number of records and the type of queries I can imagine the end customer
would make, my question would be how ad hoc the queries are vs. how well
they are handled by traditional RDBMS schemas?
Then I would be interested to understand the nature
With rep of 3 you would have to lose 3 entire nodes to lose data. The rep
factor is 3 nodes, not 3 spindles. The number of disks (sort of) determines
how HDFS spreads IO across the spindles for the single copy of the data
(one of 3 nodes with copies) that the node owns. Note that things get
You will find much information if you Google configuring Linux page
stealing. This is actually the core of the problem with swap (and throwing
away pages of shared libraries). Or talk to your devops team about how to
avoid page stealing in systems with large memory storage footprints ...
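One common knob here, as a sketch (the value 1 is a typical low-swappiness
setting for large-heap worker nodes, not something mandated anywhere):

    # how aggressively the kernel steals pages to swap (default is often 60)
    sysctl vm.swappiness
    # lower it so large JVM heaps are not paged out under memory pressure
    sudo sysctl -w vm.swappiness=1
    echo 'vm.swappiness = 1' | sudo tee -a /etc/sysctl.conf   # persist across reboots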
Do a reverse lookup and use the name you find. There are a few areas
of Hadoop that require reverse name lookup, but in general just
create relevant entries (shared across the cluster, e.g. via Ansible
if more than just a few nodes) in /etc/hosts.
Not hard.
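A minimal sketch of the check and the resulting entry (the IP, names, and
inventory are placeholders):

    # what name does the reverse lookup return for this node?
    getent hosts 10.0.0.21          # or: dig -x 10.0.0.21 +short
    # use exactly that name in /etc/hosts on every node, e.g.
    #   10.0.0.21  dn21.cluster.local  dn21
    # and push the same file out, e.g. with an Ansible ad hoc copy:
    ansible all -i inventory -m copy -a "src=hosts dest=/etc/hosts" --become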
On Thu, Mar 5, 2015 at 6:35 PM,
Duh. Try running ntpdate as root (sudo).
On Fri, Feb 27, 2015 at 11:39 PM, daemeon reiydelle daeme...@gmail.com
wrote:
try ntpdate -b -p8 whichever server
However, you flat should not be seeing 13 minutes. Something is wrong. Suggest
ntpdate -d -b -p8 whichever server and look at the results.
On Thu, Feb 26, 2015 at 10:54 AM, Jan van Bemmelen j...@tokyoeye.net wrote:
Hi Tariq,
So this is not really a Hadoop
When the access fails, do you have a way to check the utilization on the
target node ... i.e. was the target node utilization at 100%?
On Thu, Feb 26, 2015 at 10:30 PM, hadoop.supp...@visolve.com wrote:
Hello Krishna,
Exception seems to be IP specific. It might have occurred due to
Presently, I convert the
video files to individual frames, make a sequence file out of them and
transfer the sequence file to HDFS.
This flow is not optimized and I need to optimize it.
On Thu, Feb 26, 2015 at 3:00 AM, daemeon reiydelle daeme...@gmail.com
wrote:
Can you explain your use case?
Only one RM will be active at a time. The other is in standby. When you
started the new RM, the configuration files direct the new RM to come up
and take over; the old primary will go to standby (or should!). Working as
designed, except you will see a slowdown in scheduling. I suspect what you
want
Look for a busy network resulting in network timeouts to the LDAP server
(assuming LDAP itself is not overloaded).
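A crude way to see whether the LDAP round trip itself is slow (the server,
base DN, and the anonymous bind are placeholders):

    # time a trivial search against the LDAP server the cluster authenticates to
    time ldapsearch -x -H ldap://ldap.example.com -b "dc=example,dc=com" -s base
    # intermittent multi-second results point at the network path (or an
    # overloaded LDAP server) rather than Hadoop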
I would guess you do not have your SSL certs set up, client or server,
based on the error.
What have you set dfs.datanode.fsdataset.volume.choosing.policy to
(assuming you are on a current version of Hadoop)? Is the policy set to
org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy?
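To check what the DataNodes are actually running with, something like:

    hdfs getconf -confKey dfs.datanode.fsdataset.volume.choosing.policy
    # if unset, HDFS uses the default RoundRobinVolumeChoosingPolicy; to change it,
    # set the property in hdfs-site.xml to the AvailableSpaceVolumeChoosingPolicy
    # class above and restart the DataNodes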
Absolutely a critical error to lose the configured ntpd time source in
Hadoop. Replication and many other services require millisecond-level time
sync between the nodes. Interesting that your SRE design called for ntpd
running on each node. Curious.
What is the problem you are trying to
Are your nodes actually stuck, or are you in e.g. a reduce step that is
reading so much data across the network that the node SEEMS unreachable?
Since you mention it gets stuck for a while at 25%, that suggests that
eventually the node finishes up its work ...
My first concern is that temp is a virtual file system and cannot exceed
real memory plus swap space. You may not see file system full errors flow
up as the root cause of the stack trace, but you will see null pointers. So
move your temp directory onto your file system (~/Documents/HDFSTMP or
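As a quick sketch of the check and the move (the target directory is just an
example):

    # is /tmp memory-backed (tmpfs), and how big is it?
    df -hT /tmp
    # where is Hadoop putting its temp data right now?
    hdfs getconf -confKey hadoop.tmp.dir
    # point hadoop.tmp.dir in core-site.xml at a directory on real disk,
    # e.g. /data/hadoop-tmp, and restart the affected daemons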
No Kerberos TGT was issued. This looks like an auth issue to Kerberos for
e.g. user hadoop. Check your Kerb server.
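A minimal check from the node reporting the error (principal and keytab path
are placeholders):

    # is there a valid TGT in the credential cache for the service user?
    sudo -u hadoop klist
    # if not, obtain one from the keytab and retry
    sudo -u hadoop kinit -kt /etc/security/keytabs/hadoop.keytab hadoop/$(hostname -f)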
Null map step (at a guess?), 3-step reduce. No problem. Suspect step 3 may be
rather long running?
On Thu, Feb 5, 2015 at 4:51 AM, daemeon reiydelle daeme...@gmail.com
wrote:
Null map step (at a guess?), 3-step reduce. No problem. Suspect step 3 may be
rather long running?
Fantastic! I was delighted to have recently worked for a large search
engine company that has moved significant components of their Hadoop to
Docker containers on Ubuntu, seeing amazing performance/density
improvements. And yes, the build process is really picky. Thanks SO much!
Make virtualization an option. Federation will NOT solve your problems.
Add me to the list of interested parties. I am heavily involved with
security and controls of Hadoop, Google MR, use of OSSEC on Hadoop
clusters, etc.
Check the IP addresses and host names of the RM (could be an issue around
which interface the nodes are now using?)
If you are running the /var/lib/hadoop/tmp dir in the / file system, you may
want to reconsider that. Disk IO will cause issues with the OS as it
attempts to use its file system.
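A hedged way to confirm whether that is what is hurting you (device names
will differ):

    # which device holds / (and therefore /var/lib/hadoop/tmp)?
    df -h /var/lib/hadoop/tmp
    # watch per-device utilization while the job runs; %util pegged near 100%
    # on the root device points at exactly this contention
    iostat -xz 2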
At the end of the day, the more data that is pulled from multiple physical
nodes, the (relatively) slower your response time to queries.
Until you reach a point where that response time exceeds your business
requirements, keep it simple. As volumes grow with distributed data sources
to
Formally OpenJDK is not supported as of a recent posting, and I have had
incompatibility issues myself. This is a work in progress so YMMV.
There would be thousands of tasks, but not all fired off at the same time.
The number of parallel tasks is configurable but typically 1 per data node
core.
On Wed, Dec 17, 2014 at 6:31 PM, bit1...@163.com bit1...@163.com wrote:
Thanks Mark and Dieter for the reply.
Actually, I got
I found the terminology of primary and secondary to be a bit confusing in
describing operation after a failure scenario. Perhaps it is helpful to
think that the Hadoop instance is guided to select a node as primary for
normal operation. If that node fails, then the backup becomes the new
primary.
Exactly HOW did you manually remove the block?
sent from my mobile
Daemeon C.M. Reiydelle
USA 415.501.0198
London +44.0.20.8144.9872
On Nov 12, 2014 9:45 PM, sam liu samliuhad...@gmail.com wrote:
Hi Experts,
In my hdfs, there is a file named /tmp/test.txt belonging to 1 block with
2 replicas.
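For what it's worth, one way to see how the NameNode views that file's block
and replicas before and after the removal (path taken from the question
above):

    hdfs fsck /tmp/test.txt -files -blocks -locations
    # lists the single block, its replica count, and which DataNodes hold copies;
    # a manually deleted replica eventually shows up as under-replicated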
I would consider a JBOD with a 16-64 MB stride. This would be a choice where
one or more (e.g. MR) steps will be IO bound. Otherwise one or more tasks
will be hit with the slow read/write times of having large amounts of data
behind a single spindle
On Nov 12, 2014 8:37 AM, Brian C. Huffman
Yes. That is why you should consider striping across raid 0 (JBOD)
*... “The race is not to the swift, nor the battle to the strong, but to
those who can see it coming and jump aside.” - Hunter Thompson*
*Daemeon C.M. Reiydelle | USA (+1) 415.501.0198 | London (+44) (0) 20 8144 9872*
On Wed, Nov
no relo
On Wed, Nov 12, 2014 at 11:31 AM, Amendra Singh Gangwar
I observed what I thought was a similar problem but found that I was
actually hitting the physical disks much harder/more efficiently ... I
found that I had some spindles with pretty deep queue depths. When I sat
back and actually measured, I found that my throughput had substantially
increased,
What you want as a sandbox depends on what you are trying to learn.
If you are trying to learn to code in e.g. Pig Latin, Sqoop, or similar, all
of the suggestions (perhaps excluding BigTop due to its setup complexities)
are great. Laptop? Perhaps, but laptops are really kind of infuriatingly
slow
Hadoop is in effect a massively fast ETL with high latency as the tradeoff.
Other solutions allow different tradeoffs. And some of those occur in the Map
phase, some in the Reduce phase (e.g. Stream or columnar stores).
On Oct 7, 2014 11:32 PM, Dattatrya Moin dattatryam...@gmail.com wrote:
Hi ,
We