I just tried to copy a local file to HDFS and it's the same size. Maybe you are reading it wrong?
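One quick sanity check is to compare the byte counts each side reports before and after each copy step. A minimal sketch (the paths below are placeholders, not the actual ones from this thread):

    # On the cluster: per-file sizes in bytes under the table directory
    bin/hadoop fs -du /hbase/yourtable
    bin/hadoop fs -ls /hbase/yourtable

    # On the machine you copied to
    du -sh /path/to/local/copy

If the HDFS numbers and the local numbers already disagree, the problem is in how the sizes are being read, not in the copy itself.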
J-D

On Thu, Mar 31, 2011 at 11:58 PM, 陈加俊 <cjjvict...@gmail.com> wrote:
> I want to copy the files of one table from one cluster to another
> cluster, so I do it in steps:
> 1. bin/hadoop fs -copyToLocal on cluster A
> 2. scp the files from A to B
> 3. bin/hadoop fs -copyFromLocal on cluster B
> I scanned the table and saved the data to a file, using a file format I
> defined myself the day before yesterday, and found that file is only
> 18M. It is strange that the files copied with -copyFromLocal come to
> 99G. Why?
>
> 2011/4/1 Jean-Daniel Cryans <jdcry...@apache.org>
>>
>> Depends what you're trying to do? Like I said, you didn't give us a lot
>> of information, so we're pretty much in the dark regarding what you're
>> trying to achieve.
>>
>> At first you asked why the files were so big; I don't see the relation
>> with the log files.
>>
>> Also, I'm not sure why you referred to the number of versions; unless
>> you are overwriting your data, it's irrelevant to on-disk size. Again,
>> not enough information about what you're trying to do.
>>
>> J-D
>>
>> On Thu, Mar 31, 2011 at 12:27 AM, 陈加俊 <cjjvict...@gmail.com> wrote:
>> > Can I skip the log files?
>> >
>> > On Thu, Mar 31, 2011 at 2:17 PM, 陈加俊 <cjjvict...@gmail.com> wrote:
>> >>
>> >> I found there are so many log files under the table folder, and they
>> >> are very big!
>> >>
>> >> On Thu, Mar 31, 2011 at 1:37 PM, 陈加俊 <cjjvict...@gmail.com> wrote:
>> >>>
>> >>> Thank you, JD. The type of the key is Long, and the family's number
>> >>> of versions is 5.
>> >>>
>> >>> On Thu, Mar 31, 2011 at 12:42 PM, Jean-Daniel Cryans
>> >>> <jdcry...@apache.org> wrote:
>> >>>>
>> >>>> (Trying to answer with the very little information you gave us)
>> >>>>
>> >>>> So in HBase every cell is stored along with its row key, family
>> >>>> name, qualifier and timestamp (plus the length of each). Depending
>> >>>> on how big your keys are, this can grow your total dataset, so it's
>> >>>> not just a function of value sizes.
>> >>>>
>> >>>> J-D
>> >>>>
>> >>>> On Wed, Mar 30, 2011 at 9:34 PM, 陈加俊 <cjjvict...@gmail.com> wrote:
>> >>>> > I scanned the table; it has just 29000 rows, and each row is
>> >>>> > under 1 K. I saved it to files totalling 18M.
>> >>>> >
>> >>>> > But when I used /app/cloud/hadoop/bin/hadoop fs -copyFromLocal,
>> >>>> > it came to 99G.
>> >>>> >
>> >>>> > Why?
>
> --
> Thanks & Best regards
> jiajun
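To put a rough number on J-D's point about per-cell overhead, here is a back-of-the-envelope sketch. The fixed sizes come from the HBase KeyValue on-disk layout as I understand it; the family, qualifier and value sizes are made-up assumptions for illustration:

    # Per-cell size in an HBase store file, roughly:
    #   4 (key length) + 4 (value length)    two int prefixes
    #   + 2 (row length) + row key
    #   + 1 (family length) + family name
    #   + qualifier
    #   + 8 (timestamp) + 1 (key type)
    #   + value
    ROW=8       # a Long row key, as in this thread
    FAM=2       # assumed two-character family name
    QUAL=10     # assumed qualifier length
    VAL=100     # assumed value size in bytes
    echo $(( 4 + 4 + 2 + ROW + 1 + FAM + QUAL + 8 + 1 + VAL ))   # prints 140

So a 100-byte value costs about 140 bytes per cell before compression, and with 5 versions kept (as in this thread) an overwritten cell can take several times that until old versions are compacted away. Overhead of that kind explains a few-fold blowup, though, not 18M versus 99G, which is why the question of what exactly was copied matters.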