It is possible that some of your keys have the same timestamp, especially if you are running
multi-threaded. If the puts are buffered, it is quite plausible that several of them are
executed within the same millisecond. If multiple puts share the same key you would "lose"
data as you describe, because each put would just update the existing row rather than insert a new one.
~Jeff
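To make the collision concrete, here is a minimal sketch in plain Java. It is a toy model, not HBase code: a HashMap stands in for the table, and the class and helper names (TimestampKeyDemo, rowsAfterPuts, nextUniqueKey) are made up for illustration. It assumes the key scheme from this thread, System.currentTimeMillis()*1000, and shows one possible way to use the three spare digits as a tie-breaker.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Toy model only: a HashMap stands in for an HBase table, where a put to an
// existing row key updates that row instead of adding a new one.
class TimestampKeyDemo {

    // Hypothetical helper: "inserts" every key and returns how many rows exist afterwards.
    static int rowsAfterPuts(long[] rowKeys) {
        Map<Long, String> table = new HashMap<>();
        for (long key : rowKeys) {
            table.put(key, "value"); // same key => overwrite, not a new row
        }
        return table.size();
    }

    // Last key handed out; shared by all threads.
    private static final AtomicLong lastKey = new AtomicLong();

    // Hypothetical key generator: millis * 1000 leaves three decimal digits,
    // used here as a tie-breaker when several keys fall in the same millisecond.
    // Every returned key is strictly greater than the previous one, so keys stay
    // unique even if more than 1000 are requested per millisecond (they then
    // simply run ahead of the clock).
    static long nextUniqueKey() {
        long candidate = System.currentTimeMillis() * 1000;
        return lastKey.updateAndGet(prev -> Math.max(prev + 1, candidate));
    }

    public static void main(String[] args) {
        int n = 100_000;

        long[] naive = new long[n];
        for (int i = 0; i < n; i++) {
            naive[i] = System.currentTimeMillis() * 1000; // the scheme from the thread
        }

        long[] unique = new long[n];
        for (int i = 0; i < n; i++) {
            unique[i] = nextUniqueKey();
        }

        System.out.println("puts issued:           " + n);
        System.out.println("rows with naive keys:  " + rowsAfterPuts(naive));
        System.out.println("rows with unique keys: " + rowsAfterPuts(unique));
    }
}
```

On any reasonably fast machine the naive count should come out far below the number of puts, while the unique count matches it. Whether a purely time-ordered key is a good row key for your workload is a separate question.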
On 6/20/2011 1:16 PM, Laurent Hatier wrote:
I think there is a solution in your link, I will check it! :)
2011/6/20 Laurent Hatier<laurent.hat...@gmail.com>
My keys are the moment when the data is inserted into HBase (so
System.currentTimeMillis()*1000). As you can see, I use the put method, which
inserts the data... Is there another way to insert data?
2011/6/20 Doug Meil<doug.m...@explorysmedical.com>
Look here in the HBase book for these, and other, tips.
http://hbase.apache.org/book.html#performance
-----Original Message-----
From: jdcry...@gmail.com [mailto:jdcry...@gmail.com] On Behalf Of
Jean-Daniel Cryans
Sent: Monday, June 20, 2011 2:03 PM
To: user@hbase.apache.org
Subject: Re: Insert a lot of data in HBase
4M is small data :)
Could there be an overlap in the keys? Are you disabling autoflush and not
flushing the write buffer? These are common errors/misconceptions.
J-D
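A toy sketch of the autoflush point, in plain Java. This is not the HBase client: BufferedTableDemo and its methods are invented here to mimic the behaviour of a client-side write buffer, and the buffer is sized in rows for simplicity, whereas the real client sizes it in bytes. In the HBase client of this era the corresponding calls are HTable.setAutoFlush(false) and HTable.flushCommits().

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only, not the HBase client: it mimics a client-side write buffer
// so the effect of a missing final flush is visible.
class BufferedTableDemo {
    private final List<String> serverRows = new ArrayList<>();  // what the "server" has received
    private final List<String> writeBuffer = new ArrayList<>(); // rows still held by the client
    private final int bufferCapacity;
    private boolean autoFlush = true;

    BufferedTableDemo(int bufferCapacity) {
        this.bufferCapacity = bufferCapacity;
    }

    void setAutoFlush(boolean autoFlush) {
        this.autoFlush = autoFlush;
    }

    // With autoflush on, every put is sent at once; with it off, puts are
    // only sent once the buffer fills up.
    void put(String row) {
        writeBuffer.add(row);
        if (autoFlush || writeBuffer.size() >= bufferCapacity) {
            flushCommits();
        }
    }

    // Sends whatever is still buffered.
    void flushCommits() {
        serverRows.addAll(writeBuffer);
        writeBuffer.clear();
    }

    int rowsOnServer() {
        return serverRows.size();
    }

    public static void main(String[] args) {
        BufferedTableDemo table = new BufferedTableDemo(1000);
        table.setAutoFlush(false);
        for (int i = 0; i < 2500; i++) {
            table.put("row-" + i);
        }
        System.out.println("before final flush: " + table.rowsOnServer()); // 2000
        table.flushCommits();
        System.out.println("after final flush:  " + table.rowsOnServer()); // 2500
    }
}
```

The last 500 rows only reach the "server" after the explicit flush. A program that exits without that final flush would see a short count.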
On Mon, Jun 20, 2011 at 10:36 AM, Laurent Hatier <laurent.hat...@gmail.com> wrote:
Hi all,
I'm new to HBase. I want to insert 4,000,000 rows into HBase (each row
has 4 columns). I have already looked at the HBase wiki on inserting data,
but I have a problem: I lose data. When I do a COUNT with the shell,
there are approximately 1,500,000 rows in the DB...
I've tried creating multiple Puts and inserting them with a List, and I've
already tried a simple Put with four add calls. I've also tried opening and
closing the socket each time I put a line, or else the file read doesn't run...
If anyone has an idea.
Here is my code if you want to see:
List<Put> arrayPut = new ArrayList<Put>();
arrayPut.add(new Put(Bytes.toBytes(id)));
arrayPut.get(arrayPut.size() - 1).add(FAMILY_GEOLOC, QUALIFIER_START, Bytes.toBytes(tStart));
arrayPut.add(new Put(Bytes.toBytes(id)));
arrayPut.get(arrayPut.size() - 1).add(FAMILY_GEOLOC, QUALIFIER_END, Bytes.toBytes(tEnd));
arrayPut.add(new Put(Bytes.toBytes(id)));
arrayPut.get(arrayPut.size() - 1).add(FAMILY_GEOLOC, QUALIFIER_COUNTRY, Bytes.toBytes(countryCode));
arrayPut.add(new Put(Bytes.toBytes(id)));
arrayPut.get(arrayPut.size() - 1).add(FAMILY_GEOLOC, QUALIFIER_REGION, Bytes.toBytes(regionCode));
table.put(arrayPut);
--
Laurent HATIER
Second-year student in the engineering program at EISTI
--
Jeff Whiting
Qualtrics Senior Software Engineer
je...@qualtrics.com