Just be aware that the data in the write-buffer is in the client - so it hasn't been sent to the RegionServers yet.
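
To make that concrete, here is a toy model in plain Java. It is only a sketch and not the HBase client: the class BufferedTableSketch and its fields are made up for illustration, and only the method names put and flushCommits are borrowed from HTable. It assumes autoflush is off, so puts sit in a client-side list until they are flushed. The WAL lives on the RegionServer side, so it can only cover a put once the put has actually been sent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a client-side write buffer. Hypothetical class, NOT the HBase API. */
final class BufferedTableSketch {
    // Stand-in for rows that have actually reached the RegionServers.
    private final Map<String, String> serverRows = new TreeMap<>();
    // Stand-in for the client-side write buffer used when autoflush is off.
    private final List<String[]> writeBuffer = new ArrayList<>();

    /** Queues a put on the client only; nothing is sent to the "server" yet. */
    void put(String rowKey, String value) {
        writeBuffer.add(new String[] {rowKey, value});
    }

    /** Sends every buffered put to the "server" and empties the buffer. */
    void flushCommits() {
        for (String[] p : writeBuffer) {
            serverRows.put(p[0], p[1]);
        }
        writeBuffer.clear();
    }

    int bufferedCount() {
        return writeBuffer.size();
    }

    int serverRowCount() {
        return serverRows.size();
    }

    public static void main(String[] args) {
        BufferedTableSketch table = new BufferedTableSketch();
        table.put("row-1", "a");
        table.put("row-2", "b");
        // If the client process died here, both puts would be gone.
        System.out.println("before flush: buffered=" + table.bufferedCount()
                + " onServer=" + table.serverRowCount());
        table.flushCommits();
        System.out.println("after flush: buffered=" + table.bufferedCount()
                + " onServer=" + table.serverRowCount());
    }
}
```

In this model anything still in writeBuffer when the process exits is simply never written, which is the window to keep in mind.
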
So as long as your client doesn't die, you should be ok.

-----Original Message-----
From: saurabh....@gmail.com [mailto:saurabh....@gmail.com] On Behalf Of Sam Seigal
Sent: Monday, June 20, 2011 10:17 PM
To: user@hbase.apache.org
Subject: Re: Insert a lot of data in HBase

When using the write cache and setting setAutoFlush() to false, is there a risk of data loss, even if WAL is enabled ?

On Mon, Jun 20, 2011 at 12:27 PM, Jeff Whiting <je...@qualtrics.com> wrote:

> There is the possibility that your keys have the same timestamp --
> especially if you are running multi-threaded. If the puts are
> buffered then it isn't outside the realm of possibility that they are
> executed within the same millisecond. If you have the same keys for
> multiple puts you would "loose" data as you describe because it would
> just update the row rather than inserting a new one.
>
> ~Jeff
>
> On 6/20/2011 1:16 PM, Laurent Hatier wrote:
>
>> I think that there is a solution in your link, i will check it ! :)
>>
>> 2011/6/20 Laurent Hatier <laurent.hat...@gmail.com>
>>
>>> my keys are the moment where the data is inserted into HBase (so
>>> System.currentTimeMillis()*1000). As you can see, i use the put
>>> method which insert data... there is an another way to insert data ?
>>>
>>> 2011/6/20 Doug Meil <doug.m...@explorysmedical.com>
>>>
>>>> Look here in the HBase book for these, and other, tips.
>>>>
>>>> http://hbase.apache.org/book.html#performance
>>>>
>>>> -----Original Message-----
>>>> From: jdcry...@gmail.com [mailto:jdcry...@gmail.com] On Behalf Of
>>>> Jean-Daniel Cryans
>>>> Sent: Monday, June 20, 2011 2:03 PM
>>>> To: user@hbase.apache.org
>>>> Subject: Re: Insert a lot of data in HBase
>>>>
>>>> 4M is small data :)
>>>>
>>>> Could there be an overlap in the keys? Are you disabling autoflush
>>>> and not flushing the write buffer? These are common
>>>> errors/misconceptions.
>>>>
>>>> J-D
>>>>
>>>> On Mon, Jun 20, 2011 at 10:36 AM, Laurent Hatier
>>>> <laurent.hat...@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I'm new in HBase. I want to insert 4'000'000 rows in HBase (each
>>>>> row has 4 columns). I have already looked the HBase wiki to insert
>>>>> data, but i've a problem : i loss data. When i do a COUNT with the
>>>>> shell, there is approximativly 1'500'000 in the DB...
>>>>> I've tested to create multiple Put and insert it with a List, i've
>>>>> already tested a simple Put with four add functions, open and
>>>>> close the socket it each time i put the line or i read the file don't
>>>>> run...
>>>>> If anyone have an idea.
>>>>>
>>>>> here we go my code if you want to see :
>>>>>
>>>>> List<Put> arrayPut = new ArrayList<Put>();
>>>>>
>>>>> arrayPut.add(new Put(Bytes.toBytes(id)));
>>>>> arrayPut.get(arrayPut.size() - 1).add(FAMILY_GEOLOC, QUALIFIER_START, Bytes.toBytes(tStart));
>>>>> arrayPut.add(new Put(Bytes.toBytes(id)));
>>>>> arrayPut.get(arrayPut.size() - 1).add(FAMILY_GEOLOC, QUALIFIER_END, Bytes.toBytes(tEnd));
>>>>> arrayPut.add(new Put(Bytes.toBytes(id)));
>>>>> arrayPut.get(arrayPut.size() - 1).add(FAMILY_GEOLOC, QUALIFIER_COUNTRY, Bytes.toBytes(countryCode));
>>>>> arrayPut.add(new Put(Bytes.toBytes(id)));
>>>>> arrayPut.get(arrayPut.size() - 1).add(FAMILY_GEOLOC, QUALIFIER_REGION, Bytes.toBytes(regionCode));
>>>>> table.put(arrayPut);
>>>>>
>>>>> --
>>>>> Laurent HATIER
>>>>> Étudiant en 2e année du Cycle Ingénieur à l'EISTI
>>>
>>> --
>>> Laurent HATIER
>>> Étudiant en 2e année du Cycle Ingénieur à l'EISTI
>
> --
> Jeff Whiting
> Qualtrics Senior Software Engineer
> je...@qualtrics.com
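
On the key overlap that Jeff and J-D point to in the quoted thread, a small plain-Java sketch may help. It is not HBase code: the class RowKeyCollisionSketch, its method names, and the "millis-sequence" key layout are all assumptions made up for illustration, and a TreeMap stands in for the table. It models the worst case where every put lands in the same millisecond, so a key built only from System.currentTimeMillis()*1000 is identical for all of them and each put overwrites the previous row.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

/** Toy model of row-key collisions. Hypothetical class, NOT the HBase API. */
final class RowKeyCollisionSketch {

    /**
     * Stores n values keyed only by a millisecond timestamp (times 1000, as in
     * the thread). All n puts are assumed to fall in the same millisecond.
     * Returns how many distinct rows exist afterwards.
     */
    static int insertWithTimestampKeys(int n, long fixedMillis) {
        Map<Long, String> table = new TreeMap<>();
        for (int i = 0; i < n; i++) {
            long key = fixedMillis * 1000; // same millisecond gives the same key
            table.put(key, "value-" + i);  // overwrites the earlier row
        }
        return table.size();
    }

    /**
     * Same scenario, but each key gets a per-client sequence number appended
     * (an assumed key layout) so that no two puts share a key.
     */
    static int insertWithSequenceSuffix(int n, long fixedMillis) {
        Map<String, String> table = new TreeMap<>();
        AtomicLong sequence = new AtomicLong();
        for (int i = 0; i < n; i++) {
            String key = fixedMillis + "-" + sequence.getAndIncrement();
            table.put(key, "value-" + i);
        }
        return table.size();
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println("timestamp-only keys: " + insertWithTimestampKeys(1000, now) + " rows");
        System.out.println("timestamp+sequence keys: " + insertWithSequenceSuffix(1000, now) + " rows");
    }
}
```

With keys like the first variant, a row count well below the number of puts is what you would expect, with or without the write buffer. A sequence suffix is only one possible way to make keys unique; anything that differs per put would do.
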