I already have the data in HDFS. I want to test the compression ratio with gzip and
snappy.
Thanks
Sajid
Sent from my iPhone
On Jul 29, 2015, at 5:37 PM, Ron Gonzalez zlgonza...@yahoo.com wrote:
I think you can pick the compression algorithm when using sqoop - either
deflate or snappy.
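For example, a Snappy-compressed import would look roughly like this (a sketch only: the JDBC URL, table name, and target directory are placeholders, and Snappy has to be available on the cluster); keeping just --compress without --compression-codec falls back to gzip:

  # placeholders for connection, table and output location
  sqoop import \
    --connect jdbc:mysql://dbhost/mydb \
    --table mytable \
    --compress \
    --compression-codec org.apache.hadoop.io.compress.SnappyCodec \
    --target-dir /user/sajid/mytable_snappy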
I did execute it with the -m option, but it did not work.
Thanks
Jay
On May 7, 2015, at 11:35 PM, Kai Voigt k...@123.org wrote:
Not sure if that will fully help, but --m is bad syntax, use -m instead.
Maybe sqoop freaks out about that and its syntax parser gets confused.
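For reference, a working invocation would look something like this (connection string and table name are made up):

  # -m (alias for --num-mappers) takes a single dash; everything else here is a placeholder
  sqoop import --connect jdbc:mysql://dbhost/mydb --table mytable -m 4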
Kai
On 08.05.2015
Here is the content of the file once it's unzipped:
106,2003-02-03,20,2,A,2,2,037
106,2003-02-03,20,3,A,2,2,037
106,2003-02-03,8,2,A,2,2,037
On May 1, 2015, at 7:32 AM, Asit Parija a...@sigmoidanalytics.com wrote:
Hi Kumar,
You can remove the "stored as text file" part and then try that.
Try changing the split size in the driver code.
The relevant MapReduce split size properties can be set as in the sketch below.
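The same properties can also be passed from the command line instead of being hard-coded in the driver, assuming the driver runs through ToolRunner so -D options are honored (jar name, driver class, and paths below are placeholders; on Hadoop 1.x the property names are mapred.min.split.size and mapred.max.split.size):

  # floor splits at 128 MB and cap them at 256 MB (values are in bytes)
  hadoop jar myjob.jar com.example.MyDriver \
    -D mapreduce.input.fileinputformat.split.minsize=134217728 \
    -D mapreduce.input.fileinputformat.split.maxsize=268435456 \
    /input/path /output/path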
Sent from my iPhone
On Feb 27, 2014, at 11:20 AM, qiaoresearcher qiaoresearc...@gmail.com wrote:
Assume there is one large data set of size 100 GB on HDFS; how can we control
how much data is sent to each mapper?
I am trying to set up a 4 node cluster on EC2.
The EC2 machine setup is as follows:
1 namenode (master), 1 secondary namenode, and 2 slave nodes.
After issuing start-all.sh on the master, all daemons start as expected, with
only one issue:
on slave2 the datanode and tasktracker start, but on slave1 only
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hdfs://ec2-namdenode(master).*compute.amazonaws.com:8021</value>
  </property>
</configuration>
On Sat, Jan 4, 2014 at 12:34 PM, hadoop user using.had...@gmail.com wrote:
I am trying to set up a 4 node cluster on EC2.
The EC2 machine setup
What are the possible causes of the SocketTimeoutException I am getting?
11/01/28 19:01:36 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.net.SocketTimeoutException: 69000 millis timeout while waiting for
channel to be ready for connect. ch :
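That message generally means the DFSClient timed out while trying to open a connection to a datanode. A couple of quick checks, as a sketch (the datanode hostname is a placeholder and 50010 is only the default data transfer port):

  # confirm which datanodes the namenode considers live
  hadoop dfsadmin -report
  # confirm the datanode's transfer port is reachable from the client machine
  telnet datanode-host 50010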