at 9:32 AM, max scalf oracle.bl...@gmail.com wrote:
Krish,
I don't mean to hijack your mail here, but I wanted to find out how/what
you did for the portion below, as I am trying to go down your path as well.
I was able to get a 4-5 node cluster using Ambari and CDH and now want to
take it to the next level.
On Mon, Mar 9, 2015 at 5:15 PM, max scalf oracle.bl...@gmail.com wrote:
When you say the security group has all open ports, is that open to the
public (0.0.0.0/0) or to your specific IP (if so, is your IP correct)?
Also, are the instances inside a VPC?
On Mon, Mar 9, 2015 at 5:05 PM, Krish Donald gotomyp
@jonathan,
I totally agree that this is reinventing the wheel, but think about the
folks who want to do this setup from scratch to better understand Hadoop, or
maybe those folks who are going to do admin-related work... hence the
need to set it up from scratch...
@alexandar,
Yes, you are right:
SORT BY in Hive works differently from terasort. In the case of terasort
you can merge the output files and get one file with globally sorted data.
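As a side note, the per-reducer files produced by Hive's SORT BY are each sorted internally, but their key ranges overlap, so (unlike terasort's range-partitioned output, which can simply be concatenated) a merge step is needed to get one globally sorted file. A minimal local sketch, with hypothetical part-file names and data:

```shell
# Simulate two per-reducer output files (names and contents are made up).
# Each file is sorted on its own, as SORT BY guarantees per reducer,
# but their key ranges overlap, so plain concatenation is not globally sorted.
printf 'apple\ncherry\n' > part-r-00000
printf 'banana\ndate\n'  > part-r-00001

# sort -m merge-sorts already-sorted inputs into one globally sorted file.
sort -m part-r-00000 part-r-00001 > merged.txt
cat merged.txt
# merged.txt now contains: apple, banana, cherry, date (one per line)
```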
On Sun, Mar 8, 2015 at 7:55 AM, max scalf oracle.bl...@gmail.com wrote:
Thank you, Alexander. So is it fair to assume that when SORT BY is used and
multiple files
Krish,
I don't mean to hijack your mail here, but I wanted to find out how/what you
did for the portion below, as I am trying to go down your path as well. I
was able to get a 4-5 node cluster using Ambari and CDH and now want to
take it to the next level. What have you done for the below?
I have done a
Hello all,
I am new to Hadoop and Hive in general, and I am reading Hadoop: The
Definitive Guide by Tom White. On page 504, in the Hive chapter, Tom
says the below with regard to sorting:
*Sorting and Aggregating*
*Sorting data in Hive can be achieved by using a standard ORDER BY clause.
ORDER BY
You will not get the private IP to work unless you are in your
VPC, connected via a VPN or Direct Connect. For what you are doing, I
would use the public IP; that should work just fine.
On Fri, Mar 13, 2015 at 3:00 PM, Krish Donald gotomyp...@gmail.com wrote:
Hi,
I am using Elastic
Destination      Target
172.31.0.0/16    local
0.0.0.0/0        igw-6d16cxxx
On Tue, Mar 10, 2015 at 6:47 AM, max scalf oracle.bl...@gmail.com wrote:
Inside your VPC -> subnet -> does the route table have an internet
gateway attached (it should have a 0.0.0.0/0 route pointing to it as well)...
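One quick way to check this, assuming the AWS CLI is configured (the VPC ID below is a placeholder):

```shell
# List the routes in every route table of the VPC (vpc-12345678 is a
# placeholder). Look for a route with DestinationCidrBlock 0.0.0.0/0 whose
# GatewayId starts with igw- ; without it, instances in the subnet have no
# path to the internet.
aws ec2 describe-route-tables \
  --filters "Name=vpc-id,Values=vpc-12345678" \
  --query "RouteTables[].Routes[].[DestinationCidrBlock,GatewayId]" \
  --output table
```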
On Mon, Mar 9, 2015
Here is an easy way to go about assigning a static name to your EC2 instance.
When you launch an EC2 instance from the AWS console and get to
the point of selecting the VPC/IP address, there is a field that says
USER DATA... put the below in, with the appropriate host name (change
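The actual script was cut off by the archive; a minimal sketch of such a user-data script, where the hostname "nn1.hadoop.local" is an assumption to change per instance:

```shell
#!/bin/bash
# Hypothetical user-data sketch: set a static hostname at first boot.
# "nn1.hadoop.local" is an assumed name; change it per instance.
HOSTNAME=nn1.hadoop.local
hostnamectl set-hostname "$HOSTNAME" 2>/dev/null || hostname "$HOSTNAME"
echo "$HOSTNAME" > /etc/hostname

# Map this instance's own private IP to the name in /etc/hosts,
# using the EC2 instance metadata endpoint.
PRIVATE_IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
echo "$PRIVATE_IP $HOSTNAME" >> /etc/hosts
```

This runs as root at launch, so no sudo is needed inside user data.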
Alexander Pivovarov apivova...@gmail.com wrote:
What about DNS?
If you have two computers (nn and dn), how does the nn know the dn's IP?
The script puts only this computer's IP into /etc/hosts.
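This is the gap Alexander is pointing at: each node needs every node's IP, not just its own. A sketch, with made-up cluster IPs and names, that generates the shared entries into a local file (on a real node the target would be /etc/hosts):

```shell
# Generate hosts entries for the whole cluster from one list
# (IPs and names here are hypothetical).
HOSTS_FILE=cluster_hosts.txt
cat > "$HOSTS_FILE" <<'EOF'
172.31.0.10 nn
172.31.0.11 dn1
172.31.0.12 dn2
EOF
# Appending these same lines to /etc/hosts on each instance lets
# nn resolve dn1/dn2 and vice versa without a DNS server.
cat "$HOSTS_FILE"
```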
On Thu, Mar 5, 2015 at 6:39 PM, max scalf oracle.bl...@gmail.com wrote:
Here is an easy way to go about assigning static
Thank you, Harsh. Can you please explain what you mean when you said "just
simple virtual memory used by the process"? Doesn't virtual memory mean
swap?
On Wednesday, March 25, 2015, Harsh J ha...@cloudera.com wrote:
The suggestion (regarding swappiness) is not for disabling swap as much as
it
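In practice this suggestion is usually implemented by lowering vm.swappiness rather than disabling swap outright; a config sketch (the value 1 is a common Hadoop tuning, not something stated in this thread; check your distro's guidance):

```shell
# Lower swappiness so the kernel strongly avoids swapping out JVM heap
# pages, without disabling swap entirely. Requires root.
echo 'vm.swappiness=1' >> /etc/sysctl.conf   # persist across reboots
sysctl -w vm.swappiness=1                     # apply immediately
```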
(PARTITION BY A ORDER BY B)
On Sat, Mar 7, 2015 at 3:02 PM, max scalf oracle.bl...@gmail.com wrote:
Hello all,
I am new to Hadoop and Hive in general, and I am reading Hadoop: The
Definitive Guide by Tom White. On page 504, in the Hive chapter, Tom
says the below with regard to sorting:
Not to hijack this post, but how would you deal with data that is maintained
by Hive (ORC-format files, Hive-created tables, etc.)... Would we copy the
Hive metastore (MySQL) and move that over to the new cluster?
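One common way to move a MySQL-backed metastore is a dump-and-restore; a hypothetical sketch, assuming the metastore database and user are both named "hive" (neither name is stated in the thread):

```shell
# Dump the Hive metastore database on the old cluster's MySQL host.
mysqldump -u hive -p --databases hive > hive_metastore.sql

# ...copy hive_metastore.sql to the new metastore host, then restore:
mysql -u hive -p < hive_metastore.sql
```

Note that the metastore stores absolute HDFS locations, so restored table LOCATION entries would still point at the old cluster's namenode and would need to be updated.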
On Friday, June 19, 2015, Joep Rottinghuis jrottingh...@gmail.com wrote:
You can't set up a
May I ask why you need to do that? Why not let Hadoop handle that for you?
On Sunday, July 19, 2015, Shiyao Ma i...@introo.me wrote:
Hi,
I'd like to place my data selectively on some datanodes.
Currently I can do that by shutting down the unneeded datanodes, but this
is a little laborious.
Is
Hello Hadoop community,
We are running Hadoop in AWS (not EMR), using the Hortonworks distro on EC2
instances. Everything is set up and working as expected. Our design
calls for running HDFS/datanodes on local/ephemeral storage, and we have 3x
replication enabled by default; all of the metastore
ver, that would be a code change in DistCp, and not as easy as a
> script. But that would address the scalability issue that you are worried
> about.
>
> Thanks
>
> Anu
>
> *From: *max scalf <oracle.bl...@gmail.com>
> *Date
Just out of curiosity, have you enabled an S3 endpoint for this? Hopefully you
are running this cluster inside a VPC; if so, an endpoint would help, as the
S3 traffic will not go out to the Internet...
Were any new policies put in place for your S3 bucket? Others have mentioned
something about throttling
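Creating a gateway VPC endpoint for S3 is a one-liner with the AWS CLI; a sketch where the VPC ID, region, and route table ID are all placeholders:

```shell
# Create a gateway VPC endpoint for S3 (IDs and region are placeholders)
# so S3 traffic from the cluster stays on the AWS network instead of
# going out via the internet gateway.
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-12345678 \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids rtb-12345678
```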