Re: merging small files in HDFS

2017-01-09 Thread Gabriel Balan
# have the inner input format give you a record reader for that split # iterate over the record reader's k-v pairs, outputting them into the mapper's output. # (you need to set the output format appropriately) my 2c Gabriel Balan On 12/30/2016 3:57 PM, Chris Nauroth w

Re: Output File could only be replicated to 0 nodes

2016-07-25 Thread Gabriel Balan
) are excluded in this operation. Can it be there is no more space left (for HDFS) on the host running data nodes? Try running "hdfs dfsadmin -report" hth Gabriel Balan On 7/24/2016 7:53 PM, Madhav Sharan wrote: Hi hadoop users, We are running a mapreduce jobs with 10 nodes. Each map j

Re: Accessing files in Hadoop 2.7.2 Distributed Cache

2016-06-21 Thread Gabriel Balan
ndependent of *conf1*, or a copy/clone of *conf1*). hth Gabriel Balan P.S. Btw, you don't have to copy the local file to hdfs using "IOUtils.copyBytes(in, out, 4096, true);" Try FileSystem.copyFromLocalFile <https://hadoop.apache.org/docs/r2.6.1/api/src-html/org/apache/hadoop/

Re: Packaging multiple map reduce jobs into one Jar

2015-07-01 Thread gabriel balan
Hi Adding to Harshit Mathur's reply: "What should be the main class in the manifest file?" From what I remember, you must not set that (i.e., if you set it to MyMain1, then you can't use MyMain2). hth Gabriel Balan On 7/1/2015 12:02 AM, Harshit Mathur wrote: Yes you can do this. You can have

Re: how to assign unique ID (Long Value) in mapper

2015-06-29 Thread gabriel balan
are the line of text. If you have multiple files, you may want to combine the file offset with the file name (path) to get a unique id. See how to get the input file name in the mapper. hth Gabriel Balan On 6/26/2015 5:29 AM
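A minimal sketch of the "combine the file offset with the file name" idea, in plain Java with no Hadoop dependency so it can be run on its own. The class name RecordIds and both method names are hypothetical, not from the thread. compositeId joins the path and the byte offset into a string; packedId produces a long, but it assumes each input file has already been given a small stable index (for example its position in a sorted list of input paths), which is an assumption and not something the mapper hands you directly.

```java
class RecordIds {
    // Number of low bits reserved for the byte offset (assumption: files < 1 TiB).
    static final int OFFSET_BITS = 40;
    // Remaining 23 bits hold the file index, keeping the result non-negative.
    static final int MAX_FILE_INDEX = (1 << 23) - 1;

    // String id: a path plus a byte offset identifies exactly one line.
    static String compositeId(String filePath, long byteOffset) {
        return filePath + ":" + byteOffset;
    }

    // Long id: file index in the high bits, byte offset in the low bits.
    // fileIndex is a hypothetical per-file number supplied by the caller.
    static long packedId(int fileIndex, long byteOffset) {
        if (fileIndex < 0 || fileIndex > MAX_FILE_INDEX) {
            throw new IllegalArgumentException("file index out of range: " + fileIndex);
        }
        if (byteOffset < 0 || byteOffset >= (1L << OFFSET_BITS)) {
            throw new IllegalArgumentException("offset out of range: " + byteOffset);
        }
        return ((long) fileIndex << OFFSET_BITS) | byteOffset;
    }

    public static void main(String[] args) {
        // Same offset in two different files yields two different ids.
        System.out.println(compositeId("/data/a.txt", 128L));
        System.out.println(compositeId("/data/b.txt", 128L));
        System.out.println(packedId(0, 128L));
        System.out.println(packedId(1, 128L));
    }
}
```

In a real mapper the offset would be the input key and the path would come from the input split; the string form needs no extra bookkeeping, while the long form only stays unique if the file indexes themselves are unique.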

Re: parque table

2015-05-04 Thread gabriel balan
, then you could try putting a view on top of the table, and have the view use UDFs to strip the quotes. hth Gabriel Balan On 5/2/2015 1:04 AM, Kumar Jayapal wrote: Hi, When I am loading this data I am getting [quotes] inserted into the table how to load without it. thanks jay

Re: parque table

2015-05-02 Thread gabriel balan
in this thread, you need to specify the partition clause (in red above), or you get an error: hive> LOAD DATA LOCAL INPATH 'access.log.gz' into table raw; FAILED: SemanticException [Error 10062]: Need to specify partition columns because the destination table is partitioned hth Gabriel Balan On 5/1