Hey Brock, do you have working code? This one is giving a lot of errors!

On Thu, Oct 13, 2011 at 4:29 PM, Brock Noland <br...@cloudera.com> wrote:

> Hi,
>
> The code is very similar, just create a SequenceFile reader.
>
> Brock
>
> On Thu, Oct 13, 2011 at 4:53 AM, visioner sadak
> <visioner.sa...@gmail.com> wrote:
>
>> Hello Brock,
>>
>> Thanks a lot for your help. Should I run this code after doing the small
>> file uploads? I have a Java API that does the small file uploads and reads
>> as well. How will I be able to read the files back?
>>
>>
>>
>> On Thu, Oct 13, 2011 at 2:26 AM, Brock Noland <br...@cloudera.com> wrote:
>>
>>> Hi,
>>>
>>> This:  http://pastebin.com/YFzAh0Nj
>>>
>>> will convert a directory of small files to a sequence file. The key is
>>> the filename, the value the file itself. This works if each individual file
>>> is small enough to fit in memory. If you have some files which are larger
>>> and those files can be split up, they can be split over multiple key value
>>> pairs.
>>>
>>> Brock
>>>
>>> On Wed, Oct 12, 2011 at 4:50 PM, visioner sadak <
>>> visioner.sa...@gmail.com> wrote:
>>>
>>>> Hello guys,
>>>>
>>>> Thanks a lot again for your previous guidance. I tried out the Java API
>>>> for file uploads and it's working fine. Now I need to modify the code to
>>>> use sequence files so that I can handle a large number of small files in
>>>> Hadoop. For that I came across 2 links:
>>>>
>>>> 1. http://stuartsierra.com/2008/04/24/a-million-little-files (tar to
>>>> sequence)
>>>> 2. http://www.jointhegrid.com/hadoop_filecrush/index.jsp (file crush)
>>>>
>>>> Could you please tell me which approach is better to follow, or should I
>>>> follow the HAR (Hadoop archive) approach? I understand that a sequence file
>>>> lets us combine smaller files into one big one, but I don't know how to
>>>> split it and retrieve the small files again when reading.
>>>>
>>>> Thanks and gratitude
>>>>
>>>> On Wed, Oct 5, 2011 at 1:27 AM, visioner sadak <
>>>> visioner.sa...@gmail.com> wrote:
>>>>
>>>>> Thanks a lot, Wellington and Bejoy, for your inputs. I will try out this
>>>>> API and sequence files.
>>>>>
>>>>>
>>>>> On Wed, Oct 5, 2011 at 1:17 AM, Wellington Chevreuil <
>>>>> wellington.chevre...@gmail.com> wrote:
>>>>>
>>>>>> Yes, Sadak,
>>>>>>
>>>>>> With this API, you copy your files into HDFS much as you would when
>>>>>> writing to an OutputStream. The data will then be replicated in your
>>>>>> cluster's HDFS.
>>>>>>
>>>>>> Cheers.
>>>>>>
>>>>>> 2011/10/4 visioner sadak <visioner.sa...@gmail.com>:
>>>>>> > Hey, thanks Wellington. Just a thought: will my data be replicated as
>>>>>> > well? I thought the mapper does the job of breaking data into pieces
>>>>>> > and distributing it, and the reducer does the joining and combining
>>>>>> > when fetching data back. That's why I was confused about using MR.
>>>>>> > Can I use this API for uploading a large number of small files through
>>>>>> > my application as well, or should I use the sequence file class for
>>>>>> > that? I ask because I saw the small files problem in Hadoop, as
>>>>>> > mentioned in the link below:
>>>>>> >
>>>>>> > http://www.cloudera.com/blog/2009/02/the-small-files-problem/
>>>>>> >
>>>>>> > On Wed, Oct 5, 2011 at 12:54 AM, Wellington Chevreuil
>>>>>> > <wellington.chevre...@gmail.com> wrote:
>>>>>> >>
>>>>>> >> Hey Sadak,
>>>>>> >>
>>>>>> >> You don't need to write an MR job for that. Your Java program can
>>>>>> >> use the Hadoop Java API instead. You would need the FileSystem
>>>>>> >> (http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/fs/FileSystem.html)
>>>>>> >> and Path
>>>>>> >> (http://hadoop.apache.org/common/docs/current/api/index.html?org/apache/hadoop/fs/Path.html)
>>>>>> >> classes.
>>>>>> >>
>>>>>> >> Cheers,
>>>>>> >> Wellington.
>>>>>> >>
>>>>>> >> 2011/10/4 visioner sadak <visioner.sa...@gmail.com>:
>>>>>> >> > Hello guys,
>>>>>> >> >
>>>>>> >> > I would like to know how to do file uploads to HDFS using Java.
>>>>>> >> > Does it have to be done using MapReduce? And if I have a large
>>>>>> >> > number of small files, should I use a sequence file along with
>>>>>> >> > MapReduce? It would be great if you could provide some
>>>>>> >> > information.
>>>>>> >
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
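
On the question at the top of the thread (how to get the small files back
out): a sequence file is a series of key/value records, so reading it means
iterating over the records and treating each key as a filename and each value
as that file's contents. In Hadoop that would be a SequenceFile.Reader over
the file a SequenceFile.Writer produced. The pastebin code is not reproduced
here, so what follows is only a rough plain-Java sketch of the same idea, with
no Hadoop dependency. SmallFilePacker, pack and unpack are made-up names, and
the record layout (UTF name, int length, raw bytes) is an assumption for
illustration only; it is not the real SequenceFile on-disk format.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, NOT part of Hadoop: packs small files into one
// container as (filename, contents) records and reads them back.
class SmallFilePacker {

    // Write one record per regular file in dir: key = file name, value = file bytes.
    static void pack(Path dir, Path container) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(container));
             DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip subdirectories and other non-files
                }
                // Assumes each file is small enough to fit in memory.
                byte[] contents = Files.readAllBytes(file);
                out.writeUTF(file.getFileName().toString()); // key
                out.writeInt(contents.length);               // value length
                out.write(contents);                         // value
            }
        }
    }

    // Read records until end of file and return them as filename -> contents.
    static Map<String, byte[]> unpack(Path container) throws IOException {
        Map<String, byte[]> result = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(Files.newInputStream(container))) {
            while (true) {
                String name;
                try {
                    name = in.readUTF();
                } catch (EOFException endOfContainer) {
                    break; // no more records
                }
                byte[] contents = new byte[in.readInt()];
                in.readFully(contents);
                result.put(name, contents);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("small-files");
        Files.write(dir.resolve("a.txt"), "alpha".getBytes(StandardCharsets.UTF_8));
        Files.write(dir.resolve("b.txt"), "beta".getBytes(StandardCharsets.UTF_8));

        // The container lives outside dir so it is not packed into itself.
        Path container = Files.createTempFile("packed", ".bin");
        pack(dir, container);

        for (Map.Entry<String, byte[]> entry : unpack(container).entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue().length + " bytes");
        }
    }
}
```

With Hadoop, the reading loop would sit on a SequenceFile.Reader rather than a
DataInputStream, but the shape should be roughly the same: read a key, read a
value, and repeat until the end of the file.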
