Re: Need command to compress the files

2015-07-29 Thread Hadoop User
I already have the data in HDFS; I want to test the compression ratio with gzip and snappy. Thanks, Sajid. Sent from my iPhone.

> On Jul 29, 2015, at 5:37 PM, Ron Gonzalez zlgonza...@yahoo.com wrote:
> I think you can pick the compression algorithm when using sqoop - either deflate or snappy when
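Since the data is already in HDFS, one quick way to gauge the gzip ratio is to compress a local sample and compare byte counts. A minimal sketch, assuming `sample.txt` stands in for a file fetched with `hadoop fs -get` (snappy would additionally need the Hadoop native codec or a tool such as `snzip`, which is not shown here):

```shell
# Generate a stand-in for a data sample pulled out of HDFS
seq 1 100000 > sample.txt

# Compress with gzip, keeping the original for comparison
gzip -c sample.txt > sample.txt.gz

orig_bytes=$(wc -c < sample.txt)
gz_bytes=$(wc -c < sample.txt.gz)

# Ratio of uncompressed to compressed size
awk -v o="$orig_bytes" -v c="$gz_bytes" 'BEGIN { printf "gzip ratio: %.2fx\n", o/c }'
```

For files that stay in HDFS, the same comparison works with `hadoop fs -du` on the original path and the compressed output path.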

Re:

2015-05-08 Thread Hadoop User
I did execute it with the -m option but it did not work. Thanks, Jay.

> On May 7, 2015, at 11:35 PM, Kai Voigt k...@123.org wrote:
> Not sure if that will fully help, but --m is bad syntax; use -m instead. Maybe sqoop freaks out about that and its syntax parser gets confused. Kai. On 08.05.2015
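For reference, a hypothetical sqoop import using the single-dash form Kai describes. The connection string, credentials, table, and target directory are all placeholders; the command is held in a variable only so it can be shown without a live cluster:

```shell
# -m is the short form of --num-mappers and takes a single dash;
# "--m" is not a valid sqoop option and may confuse its argument parser.
sqoop_cmd='sqoop import
  --connect jdbc:mysql://dbhost/mydb
  --username etl -P
  --table orders
  --target-dir /user/etl/orders
  -m 4'
echo "$sqoop_cmd"
```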

Re: parque table

2015-05-01 Thread Hadoop User
Here is the content of the file once it's unzipped:

106,2003-02-03,20,2,A,2,2,037
106,2003-02-03,20,3,A,2,2,037
106,2003-02-03,8,2,A,2,2,037

> On May 1, 2015, at 7:32 AM, Asit Parija a...@sigmoidanalytics.com wrote:
> Hi Kumar, you can remove the stored as text file part and then try that
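Following Asit's suggestion, a hypothetical DDL for a Parquet table over columns like those rows (the table and column names are invented, since the thread does not show the real schema). The statement is held in a shell variable only so it can be displayed without a Hive installation:

```shell
# Hypothetical Hive DDL: declare the table as Parquet rather than
# "STORED AS TEXTFILE"; column names/types are assumptions.
hive_ddl='CREATE TABLE readings (
  site_id INT, reading_date STRING, h INT, seq INT,
  flag STRING, a INT, b INT, code STRING)
STORED AS PARQUET'
echo "$hive_ddl"
```

A common pattern for CSV input like the rows above is to load it into a plain TEXTFILE staging table first, then `INSERT ... SELECT` into the Parquet table, since raw text cannot be loaded directly into a Parquet-format table.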

Re: how to feed sampled data into each mapper

2014-02-27 Thread Hadoop User
Try changing the split size in the driver code (the MapReduce split-size properties). Sent from my iPhone.

> On Feb 27, 2014, at 11:20 AM, qiaoresearcher qiaoresearc...@gmail.com wrote:
> Assume there is one large data set with size 100G on HDFS; how can we control that every data set sent to each
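The split-size properties can also be set per job with `-D` on the command line, which is equivalent to setting them on the driver's `Configuration`. A sketch assuming the MRv2 property names and a 128 MB target split (the job jar, driver class, and paths are placeholders; the command is held in a variable so it can be shown without a cluster):

```shell
# 134217728 bytes = 128 MB; pinning min and max to the same value
# forces splits of roughly that size.
mr_cmd='hadoop jar myjob.jar MyDriver
  -D mapreduce.input.fileinputformat.split.minsize=134217728
  -D mapreduce.input.fileinputformat.split.maxsize=134217728
  /input /output'
echo "$mr_cmd"
```

On older MRv1 deployments the equivalent properties were named `mapred.min.split.size` and `mapred.max.split.size`.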

task tracker fails to start on one of the slave nodes

2014-01-04 Thread hadoop user
I am trying to set up a 4-node cluster on EC2. The machine setup is as follows: 1 namenode (master), 1 secondary namenode, and 2 slave nodes. After issuing start-all.sh on the master, all daemons start as expected, with only one issue: on slave2 the datanode and tasktracker start, but on slave1 only

Re: task tracker fails to start on one of the slave nodes

2014-01-04 Thread hadoop user
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hdfs://ec2-namdenode(master).compute.amazonaws.com:8021</value>
  </property>
</configuration>

> On Sat, Jan 4, 2014 at 12:34 PM, hadoop user using.had...@gmail.com wrote:
> I am trying to setup 4 node cluster on ec2 ec2 machine setup
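For comparison, a cleaned-up version of that property (the hostname is a placeholder). One detail worth checking: `mapred.job.tracker` conventionally takes a plain host:port pair, while the `hdfs://` scheme belongs to `fs.default.name` in core-site.xml, so the URL form shown above may itself be the problem:

```shell
# mapred-site.xml fragment, captured in a variable only for display
conf=$(cat <<'EOF'
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>ec2-jobtracker-host.compute.amazonaws.com:8021</value>
  </property>
</configuration>
EOF
)
echo "$conf"
```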

Why do I get SocketTimeoutException?

2011-01-28 Thread hadoop user
What are the possible causes of a SocketTimeoutException?

11/01/28 19:01:36 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.net.SocketTimeoutException: 69000 millis timeout while waiting for channel to be ready for connect. ch :
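A connect timeout in `createBlockOutputStream` usually points at a datanode that is down or overloaded, or at a firewall/security group blocking the datanode transfer port (50010 by default). `hadoop dfsadmin -report` shows which datanodes are live, and `nc -vz <datanode> 50010` checks reachability. If the nodes are healthy but just slow, one hedged workaround is raising the client socket timeout in hdfs-site.xml (the property name used by Hadoop of this era; the value is an example):

```shell
# hdfs-site.xml fragment, captured in a variable only for display
conf=$(cat <<'EOF'
<property>
  <name>dfs.socket.timeout</name>
  <value>120000</value>
</property>
EOF
)
echo "$conf"
```

Raising the timeout masks rather than fixes a network or load problem, so the connectivity checks above are worth running first.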