No, it’s flat out saying that the config cannot be set to anything starting
with /home.
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
From: Naganarasimha G R (Naga)
Sent: Thursday, November 05, 2015
So when you say remount, what exactly am I remounting? /dev/mapper/centos-home?
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
From: Naganarasimha G R (Naga)
Sent: Thursday, November 05, 2015 1
By “manually” do you mean actually going in with nano and editing the config
file? I could do that, but if Ambari won’t let you do it through the interface,
isn’t it possible that trying to add the directory in /home might break something?
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
Filesystem               Size  Used Avail Use% Mounted on
tmpfs                     16G   97M   16G   1% /run
tmpfs                     16G     0   16G   0% /sys/fs/cgroup
/dev/sda2                494M  124M  370M  26% /boot
/dev/mapper/centos-home  2.7T   33M  2.7T   1% /home
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
though it’s in root that it’s still somehow pointing to /home? So confused.
It’s the part about mounting a drive to another folder... on the same disk. Is
it kind of like how on Windows you can have more than one “drive” on a disk?
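For what it’s worth, a rough sketch of the remount idea under discussion (the
device name is taken from the df output in this thread; the target directory
and the fstab note are assumptions, and anything already in /home would need to
be copied off first):

# unmount the large logical volume from /home...
umount /home
# ...and remount it at a directory the DataNode is allowed to use
mkdir -p /hadoop/hdfs/data2
mount /dev/mapper/centos-home /hadoop/hdfs/data2
# update the /home entry in /etc/fstab the same way so it survives a reboot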
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
Is there a maximum amount of disk space that HDFS will use? Is 100GB the max?
When we’re supposed to be dealing with “big data”, why is the amount of data to
be held on any one box such a small number when you’ve got terabytes available?
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
/home directory. So I made it
/hdfs/data.
2. When I restarted, the space available increased by a whopping 100GB.
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
From: Naganarasimha G R (Naga)
So like I can just create a new folder in the home directory like:
/home/hdfs/data
and then set dfs.datanode.data.dir to:
/hadoop/hdfs/data,/home/hdfs/data
Restart the node and that should do it, correct?
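If it helps, a minimal sketch of what that change looks like in hdfs-site.xml
(dfs.datanode.data.dir is the real property name; the /home/hdfs/data path is
just the one proposed above, and since Ambari manages this file, hand edits may
be overwritten on restart):

<property>
  <name>dfs.datanode.data.dir</name>
  <!-- comma-separated list; the DataNode spreads block storage across every directory given -->
  <value>/hadoop/hdfs/data,/home/hdfs/data</value>
</property>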
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
/hadoop/hdfs/data
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
From: P lva
Sent: Wednesday, November 04, 2015 3:41 PM
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS
/dev/mapper/centos-home 2.7T 33M 2.7T 1% /home
That’s from one datanode. The second one is nearly identical. I discovered that
50GB is actually a default. That seems really weird. Disk space is cheap. Why
would you not just use most of the disk and why is it so hard to reset the
default?
Adaryl "Bob" Wakefield, MBA
Yeah. It has the current value of 1073741824, which is exactly 1 GiB (about 1.07 gig).
B.
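If the property being asked about below is dfs.datanode.du.reserved (the
archived snippet is truncated, so this is an assumption), its value is the
number of bytes reserved per volume for non-HDFS use, and 1073741824 = 1024^3 =
exactly 1 GiB. As a sketch, the stanza would read:

<property>
  <name>dfs.datanode.du.reserved</name>
  <!-- bytes kept free per disk volume for non-DFS use; 1073741824 bytes = 1 GiB -->
  <value>1073741824</value>
</property>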
From: Chris Nauroth
Sent: Tuesday, November 03, 2015 11:57 AM
To: user@hadoop.apache.org
Subject: Re: hadoop not using whole disk for HDFS
Hi Bob,
Does the hdfs-site.xml configuration file contain the property
dfs.datanod
I’ve got the Hortonworks distro running on a three-node cluster. For some
reason the disk available for HDFS is MUCH less than the total disk space. Both
of my data nodes have 3TB hard drives. Only 100GB of that is being used for
HDFS. Is it possible that I have a setting wrong somewhere?
B.
To: user@hadoop.apache.org
Subject: Re: hdfs commands tutorial
I am confused. The link posted above tells you exactly how you
interact with HDFS to do various tasks, with examples.
What else are you looking for?
Regards,
Shahab
On Aug 14, 2015 12:14 AM, "Adaryl "Bob" Wakefield, MBA" wrote:
Subject: Re: hdfs commands tutorial
Did you try this? I referred to it when I was learning.
http://hadoop.apache.org/docs/r2.4.1/hadoop-project-dist/hadoop-common/FileSystemShell.html
Thanks and Regards,
Ashish Kumar
From:"Adaryl \"Bob\" Wakefield, MBA"
To: user@hadoop.apache.org
Does anybody know of a good place to learn and practice HDFS commands?
B.
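For practicing, the basic FileSystem shell commands from that guide look like
this (the paths are made up):

hdfs dfs -mkdir -p /user/bob/practice              # create a directory in HDFS
hdfs dfs -put localfile.txt /user/bob/practice/    # copy a local file into it
hdfs dfs -ls /user/bob/practice                    # list the directory
hdfs dfs -cat /user/bob/practice/localfile.txt     # print the file’s contents
hdfs dfs -rm /user/bob/practice/localfile.txt      # delete the file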
This is turning into less about Ambari and more general computing. I’m trying
to set up Hadoop on a home network. Not at work, not on EC2; just a simple three-
node cluster in my personal computer lab. My machines don’t belong to a domain.
Everything I read says that in this situation, the computer
I’m trying to set up a Hadoop cluster but Ambari is giving me issues. At the
screen where it asks me to confirm hosts, I get:
1. Warning that I’m not inputting a fully qualified domain name.
2. The host that the Ambari instance is actually sitting on is not even
registering.
When I run hostname --fqdn
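On a home network without DNS, one common workaround (an assumption on my part,
not something from this thread) is to hand out fully qualified names yourself
via /etc/hosts on every node, and set each machine’s hostname to its FQDN:

# /etc/hosts on every machine (names and addresses hypothetical)
192.168.1.10  master1.home.local  master1
192.168.1.11  worker1.home.local  worker1
192.168.1.12  worker2.home.local  worker2

After that, hostname --fqdn should print the dotted name Ambari is looking for.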
should be able to call a method in Java from Scala but cannot
figure out how to turn a Comparator into a Comparator[_ <: ...]
wrote:
Yeah, compared to something as performant as Java...
On 10/20/2014 10:16 PM, Adaryl "Bob" Wakefield, MBA wrote:
Using an interpreted scripting lang
Friday, October 17, 2014, Adaryl "Bob" Wakefield, MBA
wrote:
“The only problem with Spark adoption is the steep learning curve of Scala,
and understanding the API properly.”
This is why I’m looking for reasons to avoid Spark. In my mind, it’s one more
thing to have to master a
On Fri, Oct 17, 2014 at 11:06 AM, Adaryl "Bob" Wakefield, MBA
wrote:
Does anybody have any performance figures on how Spark stacks up against Tez?
If you don’t have figures, does anybody have an opinion? Spark seems so popular
but I’m not really seeing why.
B.
: Spark vs Tez
What aspects of Tez and Spark are you comparing? They have different purposes
and thus are not directly comparable, as far as I understand.
Regards,
Shahab
On Fri, Oct 17, 2014 at 2:06 PM, Adaryl "Bob" Wakefield, MBA
wrote:
Does anybody have any performance figures on
Does anybody have any performance figures on how Spark stacks up against Tez?
If you don’t have figures, does anybody have an opinion? Spark seems so popular
but I’m not really seeing why.
B.
Can Tez and MapReduce live together and get along in the same cluster?
B.
You've got MapReduce jobs right? What is it called if, instead, you're using
Tez? A Tez job?
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
? I've been thinking about
it like DOS. Is that an incorrect analogy?
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
messing up
every SUM() function result.
In the old world, it was simply a matter of going into the warehouse and blowing
away those records. I think the solution we came up with is, instead of dropping
that data into a file, drop it into HBase, where you can do row-level deletes.
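For illustration, row-level deletes in the HBase shell look like this (the
table, row key, and column names are hypothetical):

hbase shell
deleteall 'sales_events', 'order-12345'            # delete the whole row
delete 'sales_events', 'order-12345', 'cf:amount'  # or delete a single cell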
Adaryl "Bob" Wakefield, MBA
and what does what and how.
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
From: Kilaru, Sambaiah
Sent: Wednesday, August 13, 2014 1:10 PM
To: user@hadoop.apache.org
Subject: Re: Started learning Hadoop
Is this up to date?
http://www.mapr.com/products/product-overview/overview
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
From: Aaron Eng
Sent: Tuesday, August 12, 2014 4:31 PM
To: user@hadoop.apache.org
You fell into my trap, sir. I was hoping someone would clear that up. :)
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
From: Kai Voigt
Sent: Tuesday, August 12, 2014 4:10 PM
To: user@hadoop.apache.org
community around
it.
5. Who the heck is BigInsights? (Which should tell you something.)
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
From: mani kandan
Sent: Tuesday, August 12, 2014 3:12 P
Hadoop and
the individual projects, but there is very little on how to actually manage
data in Hadoop.
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
From: Bertrand Dechoux
Sent: Sunday, August 10,
. It’s better
to just blow these records away, I’m just not certain what the best way to
accomplish that is in the new world.
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
From: Sriram Ramachandrasek
Or... as an alternative, since HBase uses HDFS to store its data, can we get
around the no-editing-files rule by dropping structured data into HBase? That
way, we have data in HDFS that can be deleted. Any real problem with that idea?
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
data load up into
periodic files (days, months, etc.) that can easily be rebuilt should errors
occur
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData
From: Adaryl "Bob" Wakefield, MBA
Sent:
that their report is off/the numbers don’t look right.
We investigate and find the bug in the transactional system.
Question: Can we then go back into HDFS and rid ourselves of the bad records?
If not, what is the recommended course of action?
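Since HDFS files can’t be edited in place, one pattern that fits this scenario
(a sketch only; the paths and the bad-record filter are made up) is to rewrite
the affected file without the bad rows and swap it in:

hdfs dfs -cat /data/sales/2014-07.csv | grep -v '^BAD-ORDER-ID,' | hdfs dfs -put - /data/sales/2014-07.csv.clean
hdfs dfs -rm /data/sales/2014-07.csv
hdfs dfs -mv /data/sales/2014-07.csv.clean /data/sales/2014-07.csv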
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
http://hortonworks.com/hdp/downloads/
Use the Sandbox with YouTube and lots of Google. That’s what I’m doing at least.
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
From: thejas prasad
Sent: Wednesday, August 06, 2014
The book Hadoop Operations by Eric Sammer helped answer a lot of these
questions for me.
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
-----Original Message-----
From: Chris MacKenzie
Sent: Friday, August 01, 2014
This may not be the right place to ask this question. I asked a more generic
question about how to do predictive modeling on Hadoop and nobody answered. It
perplexes me as well how to take these machine learning concepts and implement
them in a MapReduce paradigm.
Adaryl "Bob" Wake
I’ve been working with predictive models for three years now. My models have
been single threaded and written against data in a non-distributed environment.
I’m not certain how to translate my skills to Hadoop. Mahout, yes, but I don’t
know Java, as I tend to work with Python (as do a lot of my col
Someone contacted me directly and suggested the book Hadoop Operations by Eric
Sammer.
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
From: YIMEN YIMGA Gael
Sent: Tuesday, July 22, 2014 9:48 AM
To: user@hadoop.apache.org
What is the rule for determining how many nodes should be in your initial
cluster?
B.
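One back-of-the-envelope approach (my own rule of thumb, not an official rule):
size from the data, not the other way around. A worked example with assumed
numbers:

10 TB of data x 3-way replication x ~1.25 working-space overhead = ~37.5 TB raw
3 TB of disk per node, keeping ~25% free = ~2.25 TB usable per node
37.5 TB / 2.25 TB = ~17 DataNodes, plus the master node(s)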
like a single line
item in an invoice.
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
From: Mark Kerzner
Sent: Sunday, July 20, 2014 2:08 PM
To: Hadoop User
Subject: Re: Merging small files
Bob,
you don't have to
In the old world, data cleaning used to be a large part of the data warehouse
load. Now that we’re working in a schemaless environment, I’m not sure where
data cleansing is supposed to take place. NoSQL sounds fun because
theoretically you just drop everything in, but transactional systems that
Hadoop in one big
file, process them, then store the results of the processing in Oracle.
Source file -> Oracle -> Hadoop -> Oracle
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
From: Shashidhar Rao
Sent: Sunday,
“Even if we kept the discussion to the mailing list's technical Hadoop usage
focus, any company/organization looking to use a distro is going to have to
consider the costs, support, platform, partner ecosystem, market share, company
strategy, etc.”
Yeah good point.
Adaryl "Bob"
=-N9i-YXoQBE&index=77&list=WL
Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics
913.938.6685
www.linkedin.com/in/bobwakefieldmba
From: Kilaru, Sambaiah
Sent: Sunday, July 20, 2014 3:47 AM
To: user@hadoop.apache.org
Subject: Re: Merging small files
This is not the place to
And by that I mean: is there an HDFS file type? I feel like I’m missing
something. Let’s say I have a HUGE JSON file that I import into HDFS. Does it
retain its JSON format in HDFS? What if it’s just random tweets I’m streaming?
Is it kind of like a normal disk where there are all kinds of files
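For what it’s worth, HDFS just stores whatever bytes you hand it; there is no
special HDFS file type. A quick way to see that (the file name is made up):

hdfs dfs -put tweets.json /data/tweets.json
hdfs dfs -cat /data/tweets.json | head -n 5    # prints the same JSON text you put in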
/book.html#arch.hdfs
On Mon, Jul 14, 2014 at 2:52 PM, Adaryl "Bob" Wakefield, MBA
wrote:
HBase uses HDFS to store its data, correct?
B.
HBase uses HDFS to store its data, correct?
B.
http://www.cs.cmu.edu/~./enron/
Not sure of the uncompressed size, but pretty sure it’s over a gig.
B.
From: navaz
Sent: Monday, July 07, 2014 6:22 PM
To: user@hadoop.apache.org
Subject: Huge text file for Hadoop Mapreduce
Hi
I am running basic word-count MapReduce code. I have downloaded a fi
If you have a server with more than one hard drive, is that one node or n
nodes, where n = the number of hard drives?
B.
project sponsored by ASF. Look here:
http://storm.apache.org
On 04/07/14 12:28, Adaryl "Bob" Wakefield, MBA wrote:
Storm. It’s not a part of the Apache project but it seems to be what people
are using to process event data.
B.
From: santosh.viswanat...@accenture.com
Sent: Frida
Storm. It’s not a part of the Apache project but it seems to be what people are
using to process event data.
B.
From: santosh.viswanat...@accenture.com
Sent: Friday, July 04, 2014 11:25 AM
To: user@hadoop.apache.org
Subject: Streaming data - Avaiable tools
Hello Experts,
Wanted to explore
at a generic stack without oversimplifying to the
point of serious deficiencies. There are, as you say, a multitude of options.
You are attempting to boil them down to A vs. B, as opposed to A may work better
under the following conditions...
2014-07-02 13:25 GMT-07:00 Adaryl "Bob" W
constructs.)
So given this, you can pick the framework which is more attuned to your needs.
On Wed, Jul 2, 2014 at 3:31 PM, Adaryl "Bob" Wakefield, MBA
wrote:
Do these two projects do essentially the same thing? Is one better than the
other?
Do these two projects do essentially the same thing? Is one better than the
other?
fies things... Just like you can evaluate all kinds of
Apache ecosystem products to meet your needs, MapReduce is no longer the only
kid on the block.
On Tue, Jul 1, 2014 at 3:07 PM, Adaryl "Bob" Wakefield, MBA
wrote:
From your answer, it sounds like you need to be able to d
right now". Most are looking for *real-time* fraud
detection or recommendations, for example, which MapReduce is not ideal for.
Marco
On Tue, Jul 1, 2014 at 12:00 PM, Adaryl "Bob" Wakefield, MBA
wrote:
“The Mahout community decided to move its codebase onto modern data
proce
“The Mahout community decided to move its codebase onto modern data processing
systems that offer a richer programming model and more efficient execution than
Hadoop MapReduce.”
Does this mean that learning MapReduce is a waste of time? Is Storm the future
or are both technologies necessary?
B