RE: running hive on windows 7

2015-03-10 Thread Kiran Kumar.M.R
Try posting this query on the Hive mailing list:
u...@hive.apache.org


Regards,
Kiran




From: 北极星 [mailto:150201...@qq.com]
Sent: Sunday, March 08, 2015 14:17
To: user
Subject: running hive on windows 7

Hi

I'm a newcomer to the Hadoop world. After some struggle, I've successfully got 
Hadoop 2.6 running on my Windows 7 laptop.

However, when I tried to run Hive 1.0.0 on my Windows 7 system, I found there is no 
command-line script like the one provided for Linux.

It's also hard to find any useful information on Google, which is why I'm asking here.

Can anyone give me a clue on how to run Hive on Windows 7?


Many thanks.

Regards

Iven Chen


getting amazon emr to access a single file when reducing

2015-03-10 Thread Jonathan Aquilina
 

Hi guys,

I need to run a job in which the reducers must read an additional file
containing data they need (in my case, waypoints). The waypoints
themselves do not need to be processed.

On Amazon EMR this is proving very tricky. What would be the simplest
way to do this?

-- 
Regards,
Jonathan Aquilina
Founder Eagle Eye T
 

Unsubscribe

2015-03-10 Thread Ravi Mummulla (BIG DATA)


--





Re: Not able to ping AWS host

2015-03-10 Thread max scalf
Inside your VPC, does the subnet's route table have an internet gateway
attached? (There should be a route for 0.0.0.0/0 pointing to it as well.)

On Mon, Mar 9, 2015 at 10:23 PM, Krish Donald gotomyp...@gmail.com wrote:

 Yes security group has all open ports to 0.0.0.0 and yes cluster is under
 VPC

 On Mon, Mar 9, 2015 at 5:15 PM, max scalf oracle.bl...@gmail.com wrote:

 When you say the security group has all ports open, is that open to the
 public (0.0.0.0) or to your specific IP (and if so, is your IP correct)?

 Also, are the instances inside a VPC?

 On Mon, Mar 9, 2015 at 5:05 PM, Krish Donald gotomyp...@gmail.com
 wrote:

 Hi,

 I am trying to set up a Hadoop cluster on AWS.
 After creating an instance, I got the public IP and DNS name.
 But when I try to ping it from my Windows machine, I get no reply.

 I am also not able to log on to the machine using PuTTY.
 It says "Network timed out".

 The security group in the AWS cluster has all TCP, UDP, ICMP and SSH
 open.

 Please let me know if anybody has any idea.

 Thanks
 Krish






Re: video stream as input to sequence files

2015-03-10 Thread tesm...@gmail.com
Thanks. Is there an example of this process?


Regards,



On Sat, Feb 28, 2015 at 7:11 AM, daemeon reiydelle daeme...@gmail.com
wrote:

 My thinking: in your map step, take each frame and tag it with an
 appropriate unique key. Your reducers (if used) then do the frame analysis.
 If you are analyzing frame sequences, you need to decide on the granularity
 versus the time each node spends executing. It is the same sort of process
 that is used for, e.g., satellite images undergoing feature recognition
 analysis.
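
A minimal sketch of the keying idea described above, written in plain Python with no Hadoop involved. The function names, the `(video_id, window)` key layout, and the sample byte strings are assumptions made for illustration only; the "analysis" is a placeholder that just reports frame indices and total size.

```python
from collections import defaultdict


def map_frames(video_id, frames, window=1):
    """Tag each frame with a key; `window` consecutive frames share one key."""
    for index, frame in enumerate(frames):
        # window=1 gives one key per frame; larger windows group frame sequences.
        yield (video_id, index // window), (index, frame)


def shuffle(pairs):
    """Group mapped values by key, imitating the MapReduce shuffle phase."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_frames(key, values):
    """Placeholder frame analysis: list frame indices and sum frame sizes."""
    ordered = sorted(values)
    indices = [index for index, _ in ordered]
    total_bytes = sum(len(frame) for _, frame in ordered)
    return key, indices, total_bytes


# Hypothetical stand-ins for decoded frame data.
FRAMES = [b"f0", b"f1-", b"f2--", b"f3---", b"f4"]

groups = shuffle(map_frames("clip-a", FRAMES, window=2))
results = [reduce_frames(key, values) for key, values in sorted(groups.items())]
```

The `window` parameter stands in for the granularity trade-off: a larger window means fewer, longer-running reduce calls, while `window=1` treats every frame independently.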



 *“Life should not be a journey to the grave with the intention of arriving
 safely in a pretty and well preserved body, but rather to skid in broadside
 in a cloud of smoke, thoroughly used up, totally worn out, and loudly
 proclaiming “Wow! What a Ride!”* - Hunter Thompson

 Daemeon C.M. Reiydelle
 USA (+1) 415.501.0198
 London (+44) (0) 20 8144 9872

 On Wed, Feb 25, 2015 at 11:54 PM, tesm...@gmail.com tesm...@gmail.com
 wrote:

 Dear Daemeon,

 Thanks for your reply. Here is my flow.

 I am processing video frames using MapReduce. Presently, I convert the
 video files to individual frames, make a sequence file out of them and
 transfer the sequence file to HDFS.

 This flow is not optimized and I need to optimize it.

 On Thu, Feb 26, 2015 at 3:00 AM, daemeon reiydelle daeme...@gmail.com
 wrote:

 Can you explain your use case?




 On Wed, Feb 25, 2015 at 4:01 PM, tesm...@gmail.com tesm...@gmail.com
 wrote:

 Hi,

 How can I use my video data files as input for a sequence file, or load
 them into HDFS directly?


 Regards,
 Tariq







Re: Not able to ping AWS host

2015-03-10 Thread Krish Donald
It is as below:

Route Table: rtb-f377cbxx | myroute
https://us-west-2.console.aws.amazon.com/vpc/home?region=us-west-2#routetables:filter=rtb-f377cb96

Destination      Target
172.31.0.0/16    local
0.0.0.0/0        igw-6d16cxxx








Re: Not able to ping AWS host

2015-03-10 Thread max scalf
That is very interesting. Is any network ACL blocking your inbound
connections? If you would like, you can set up a WebEx/GoToMeeting session
and we can troubleshoot this together offline, since this is specific to AWS
and has nothing to do with Hadoop.

I can be reached at oracle.bl...@gmail.com; I am available to do this later
today after 4 PM CST.
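
Because ping uses ICMP while PuTTY/SSH uses TCP port 22, the two can fail for different reasons. As a rough aid for narrowing this down, the following sketch tries a plain TCP connection from the client machine. The helper name `tcp_port_open` and the default timeout are assumptions for illustration; it uses only the Python standard library and does not query AWS in any way.

```python
import socket


def tcp_port_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name-resolution failures.
        return False
```

Calling `tcp_port_open(public_dns_name, 22)` with the instance's public DNS name returns False both when the connection times out (typical of traffic being dropped by a security group, network ACL, or missing route) and when it is refused, so it only tells you whether the port is reachable, not why.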









Pydoop 1.0.0-rc2

2015-03-10 Thread Simone Leo

Hello everyone,

we're happy to announce the 1.0.0-rc2 release of Pydoop 
(http://crs4.github.io/pydoop), the non-Streaming Python interface to 
Hadoop.  Adding to the simplified installation and new Pythonic API 
introduced with 1.0.0-rc1, this rc provides built-in Avro support (for 
now, only with Hadoop 2).  By setting a few flags in the submitter and 
selecting the new AvroContext as your application's context class, you 
can read and write Avro data, transparently manipulating records as 
Python dictionaries.  For instance, you could count your favorite colors 
stored in an Avro file like this:


   export STATS_SCHEMA="$(cat stats.avsc)"
   pydoop submit \
     -D pydoop.mapreduce.avro.value.output.schema="${STATS_SCHEMA}" \
     --avro-input v --avro-output v \
     --upload-file-to-cache color_count.py --mrv2 \
     color_count input output

And your Pydoop code would be these few lines:

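   from collections import Counter

   import pydoop.mapreduce.api as api

   class Mapper(api.Mapper):
       def map(self, ctx):
           # ctx.value is an Avro record, exposed as a Python dict
           user = ctx.value
           color = user['favorite_color']
           if color is not None:
               ctx.emit(user['office'], Counter({color: 1}))

   class Reducer(api.Reducer):
       def reduce(self, ctx):
           # merge the per-record Counters emitted for this office
           s = sum(ctx.values, Counter())
           ctx.emit('', {'office': ctx.key, 'counts': s})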
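
To see what the mapper and reducer above compute without a Hadoop cluster, here is a small stand-alone simulation of the same logic in plain Python. It does not use Pydoop at all: the sample records, the `run` driver, and the function names are assumptions made for illustration, and only the `Counter` arithmetic mirrors the code above.

```python
from collections import Counter, defaultdict

# Hypothetical sample records, shaped like the Avro "user" records above.
USERS = [
    {"office": "office-1", "favorite_color": "blue"},
    {"office": "office-1", "favorite_color": "blue"},
    {"office": "office-1", "favorite_color": None},
    {"office": "office-2", "favorite_color": "red"},
    {"office": "office-1", "favorite_color": "green"},
]


def map_user(user):
    """Emit (office, Counter) for users that have a favorite color."""
    color = user["favorite_color"]
    if color is not None:
        yield user["office"], Counter({color: 1})


def reduce_office(office, counters):
    """Sum the Counters for one office, as the reducer does with ctx.values."""
    total = sum(counters, Counter())
    return {"office": office, "counts": dict(total)}


def run(users):
    """Group mapper output by key, then reduce each group (a local stand-in for the shuffle)."""
    grouped = defaultdict(list)
    for user in users:
        for key, value in map_user(user):
            grouped[key].append(value)
    return [reduce_office(office, grouped[office]) for office in sorted(grouped)]


RESULTS = run(USERS)
```

Passing `Counter()` as the start value of `sum` is what makes the addition work: each `+` merges two Counters by adding their per-color counts.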

Any input/output format that exchanges Avro records is supported, 
including the Parquet ones.  For more detailed information, see the docs 
at http://crs4.github.io/pydoop/examples/avro.html


Pydoop is a Python API for Hadoop that allows you to write full-fledged 
MapReduce applications with HDFS access.  Pydoop powers several 
scientific projects at CRS4, including Seal 
(http://biodoop-seal.sourceforge.net), Biodoop-BLAST 
(http://biodoop.sourceforge.net/blast) and VISPA 
(https://github.com/crs4/vispa), as well as successful commercial 
services such as Slacker Radio (http://www.slacker.com).


Please note that this is a release candidate that's not been used in 
production yet.  This means, among other things, that you have to add 
the --pre flag if installing with pip.  As usual, we're happy to 
receive your feedback: please open an issue on GitHub if you spot a bug 
or find something that could be improved.


Links:

  * download: http://pypi.python.org/pypi/pydoop
  * docs: http://crs4.github.io/pydoop
  * git repo: https://github.com/crs4/pydoop
  * paper: http://dx.doi.org/10.1145/1851476.1851594
  * Dr.Dobb's review:
http://www.drdobbs.com/database/pydoop-writing-hadoop-programs-in-python/240156473

Happy pydooping!

The Pydoop Team

--
Simone Leo
Data Fusion - Distributed Computing
CRS4
POLARIS - Building #1
Piscina Manna
I-09010 Pula (CA) - Italy
e-mail: simone@crs4.it
http://www.crs4.it