Re: Overwrite a row

2013-04-20 Thread Kristoffer Sjögren
The schema is known beforehand so this is exactly what I need. Great!

One more question. What guarantees does the batch operation have? Are the
operations contained within each batch atomic, i.e. will all mutations be
given the same timestamp? If something fails, do all operations fail, or can it
fail partially?

Thanks for your help, much appreciated.

Cheers,
-Kristoffer


On Sat, Apr 20, 2013 at 4:47 AM, Ted Yu yuzhih...@gmail.com wrote:

 I don't know details about Kristoffer's schema.
 If all the column qualifiers are known a priori, mutateRow() should serve
 his needs.

 HBase allows arbitrary number of columns in a column family. If the schema
 is dynamic, mutateRow() wouldn't suffice.
 If the column qualifiers are known but the row is very wide (and a few
 columns are updated per call), performance would degrade.

 Just some factors to consider.

 Cheers

 On Fri, Apr 19, 2013 at 1:41 PM, Mohamed Ibrahim mibra...@mibrahim.net
 wrote:

  Actually I do see it in the 0.94 JavaDocs (
 
 
 http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/client/HTable.html#mutateRow(org.apache.hadoop.hbase.client.RowMutations)
  ),
  so maybe it was added in 0.94.6 even though the jira says fixed in 0.95
 .
  I haven't used it though, but it seems that's what you're looking for.
 
  Sorry for confusion.
 
  Mohamed
 
 
  On Fri, Apr 19, 2013 at 4:35 PM, Mohamed Ibrahim mibra...@mibrahim.net
  wrote:
 
   It seems that 0.95 is not released yet, mutateRow won't be a solution
 for
   now. I saw it in the downloads and I thought it was released.
  
  
   On Fri, Apr 19, 2013 at 4:18 PM, Mohamed Ibrahim 
 mibra...@mibrahim.net
  wrote:
  
   Just noticed you want to delete as well. I think that's supported
 since
   0.95 in mutateRow (
  
 
 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#mutateRow(org.apache.hadoop.hbase.client.RowMutations)
 ).
   You can do multiple puts and deletes and they will be performed
  atomically.
   So you can remove qualifiers and put new ones.
  
   Mohamed
  
  
   On Fri, Apr 19, 2013 at 3:44 PM, Kristoffer Sjögren sto...@gmail.com
  wrote:
  
   What would you suggest? I want the operation to be atomic.
  
  
   On Fri, Apr 19, 2013 at 8:32 PM, Ted Yu yuzhih...@gmail.com wrote:
  
What is the maximum number of versions do you allow for the
  underlying
table ?
   
Thanks
   
On Fri, Apr 19, 2013 at 10:53 AM, Kristoffer Sjögren 
  sto...@gmail.com
wrote:
   
 Hi

 Is it possible to completely overwrite/replace a row in a single
   _atomic_
 action? Already existing columns and qualifiers should be removed
  if
   they
 do not exist in the data inserted into the row.

 The only way to do this is to first delete the row then insert
 new
   data
in
 its place, correct? Or is there an operation to do this?

 Cheers,
 -Kristoffer

   
  
  
  
  
 



Re: RefGuide schema design examples

2013-04-20 Thread Ravindranath Akila
+1

R. A.
On 20 Apr 2013 12:07, Viral Bajaria viral.baja...@gmail.com wrote:

 +1!


 On Fri, Apr 19, 2013 at 4:09 PM, Marcos Luis Ortiz Valmaseda 
 marcosluis2...@gmail.com wrote:

  Wow, great work, Doug.
 
 
  2013/4/19 Doug Meil doug.m...@explorysmedical.com
 
   Hi folks,
  
   I reorganized the Schema Design case studies 2 weeks ago and
 consolidated
   them into here, plus added several cases common on the dist-list.
  
   http://hbase.apache.org/book.html#schema.casestudies
  
   Comments/suggestions welcome.  Thanks!
  
  
   Doug Meil
   Chief Software Architect, Explorys
   doug.m...@explorysmedical.com
  
  
  
 
 
  --
  Marcos Ortiz Valmaseda,
  *Data-Driven Product Manager* at PDVSA
  *Blog*: http://dataddict.wordpress.com/
  *LinkedIn: *http://www.linkedin.com/in/marcosluis2186
  *Twitter*: @marcosluis2186 http://twitter.com/marcosluis2186
 



Re: Slow region server recoveries

2013-04-20 Thread Nicolas Liochon
Hi,

I looked at it again with a fresh eye. As Varun was saying, the root cause
is the wrong order of the block locations.

The root cause of the root cause is actually simple: HBASE started the
recovery while the node was not yet stale from an HDFS pov.

Varun mentioned this timing:
Lost Beat: 27:30
Became stale: 27:50 - * this is a guess and reverse engineered (stale
timeout 20 seconds)
Became dead: 37:51

But the  recovery started at 27:13 (15 seconds before we have this log
line)
2013-04-19 00:27:28,432 WARN org.apache.hadoop.hdfs.DFSClient: Failed to
connect to /10.156.194.94:50010 for file
/hbase/feeds/1479495ad2a02dceb41f093ebc29fe4f/home/02f639bb43944d4ba9abcf58287831c0
for block
BP-696828882-10.168.7.226-1364886167971:blk_-5977178030490858298_99853:java.net.SocketTimeoutException:
15000 millis timeout while waiting for channel to be ready for connect. ch
: java.nio.channels.SocketChannel[connection-pending remote=/
10.156.194.94:50010]

So when we took the blocks from the NN, the datanode was not stale, so you
have the wrong (random) order.

ZooKeeper can expire a session before the timeout. I don't know why it does
this in this case, but I don't consider it a ZK bug: if ZK knows that a
node is dead, it's its role to expire the session. There is something more
fishy: we started the recovery while the datanode was still responding to
heartbeats. I don't know why. Maybe the OS was able to kill -15 (SIGTERM) the
RS before vanishing away.

Anyway, we then have an exception when we try to connect, because the RS
does not have a TCP connection to this datanode. And this is retried many
times.

You would not have this with trunk, because HBASE-6435 reorders the blocks
inside the client, using information not available to the NN, to exclude
the datanode of the region server under recovery.

Some conclusions:
 - we should likely backport HBASE-6435 to 0.94.
 - I will revive HDFS-3706 and HDFS-3705 (the non-hacky way to get
HBASE-6435).
 - There are some things that could be better in HDFS. I will look into them.
 - I'm worried by the SocketTimeoutException. We should get NoRouteToHost
at some point, and we don't. That's also why it takes ages. I think it's an
AWS thing, but it brings two issues: it's slow, and, in HBase, you don't know
whether the operation was executed or not, so it adds complexity to
some scenarios. If someone with enough network and AWS knowledge could
clarify this point, it would be great.

 Cheers,

 Nicolas









On Fri, Apr 19, 2013 at 10:10 PM, Varun Sharma va...@pinterest.com wrote:

 This is 0.94.3 hbase...


 On Fri, Apr 19, 2013 at 1:09 PM, Varun Sharma va...@pinterest.com wrote:

  Hi Ted,
 
  I had a long offline discussion with nicholas on this. Looks like the
 last
  block which was still being written to, took an enormous time to
 recover.
  Here's what happened.
  a) Master split tasks and region servers process them
  b) Region server tries to recover lease for each WAL log - most cases are
  noop since they are already rolled over/finalized
  c) The last file lease recovery takes some time since the crashing server
  was writing to it and had a lease on it - but basically we have the
 lease 1
  minute after the server was lost
  d) Now we start the recovery for this but we end up hitting the stale
 data
  node which is puzzling.
 
  It seems that we did not hit the stale datanode when we were trying to
  recover the finalized WAL blocks with trivial lease recovery. However,
 for
  the final block, we hit the stale datanode. Any clue why this might be
  happening ?
 
  Varun
 
 
  On Fri, Apr 19, 2013 at 10:40 AM, Ted Yu yuzhih...@gmail.com wrote:
 
  Can you show snippet from DN log which mentioned UNDER_RECOVERY ?
 
  Here is the criteria for stale node checking to kick in (from
 
 
 https://issues.apache.org/jira/secure/attachment/12544897/HDFS-3703-trunk-read-only.patch
  ):
 
  +   * Check if the datanode is in stale state. Here if
  +   * the namenode has not received heartbeat msg from a
  +   * datanode for more than staleInterval (default value is
  +   * {@link
  DFSConfigKeys#DFS_NAMENODE_STALE_DATANODE_INTERVAL_MILLI_DEFAULT}),
  +   * the datanode will be treated as stale node.
 
 
  On Fri, Apr 19, 2013 at 10:28 AM, Varun Sharma va...@pinterest.com
  wrote:
 
   Is there a place to upload these logs ?
  
  
   On Fri, Apr 19, 2013 at 10:25 AM, Varun Sharma va...@pinterest.com
   wrote:
  
Hi Nicholas,
   
Attached are the namenode, dn logs (of one of the healthy replicas
 of
  the
WAL block) and the rs logs which got stuck doing the log split.
 Action
begins at 2013-04-19 00:27*.
   
Also, the rogue block is 5723958680970112840_174056. Its very
  interesting
to trace this guy through the HDFS logs (dn and nn).
   
Btw, do you know what the UNDER_RECOVERY stage is for, in HDFS ?
 Also
   does
the stale node stuff kick in for that state ?
   
Thanks
Varun
   
   
On Fri, Apr 19, 2013 at 4:00 AM, Nicolas Liochon 

Re: Overwrite a row

2013-04-20 Thread Ted Yu
Operations within each batch are atomic. 
They would either all succeed or all fail. 

Time stamps would all refer to the latest cell (KeyVal). 

Cheers

On Apr 20, 2013, at 12:17 AM, Kristoffer Sjögren sto...@gmail.com wrote:

 The schema is known beforehand so this is exactly what I need. Great!
 
 One more question. What guarantees does the batch operation have? Are the
 operations contained within each batch atomic? I.e. all mutations will be
 given the same timestamp? If something fails, all operation fail or can it
 fail partially?
 
 Thanks for your help, much appreciated.
 
 Cheers,
 -Kristoffer
 
 
 On Sat, Apr 20, 2013 at 4:47 AM, Ted Yu yuzhih...@gmail.com wrote:
 
 I don't know details about Kristoffer's schema.
 If all the column qualifiers are known a priori, mutateRow() should serve
 his needs.
 
 HBase allows arbitrary number of columns in a column family. If the schema
 is dynamic, mutateRow() wouldn't suffice.
 If the column qualifiers are known but the row is very wide (and a few
 columns are updated per call), performance would degrade.
 
 Just some factors to consider.
 
 Cheers
 
 On Fri, Apr 19, 2013 at 1:41 PM, Mohamed Ibrahim mibra...@mibrahim.net
 wrote:
 
 Actually I do see it in the 0.94 JavaDocs (
 http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/client/HTable.html#mutateRow(org.apache.hadoop.hbase.client.RowMutations)
 ),
 so may be it was added in 0.94.6 even though the jira says fixed in 0.95
 .
 I haven't used it though, but it seems that's what you're looking for.
 
 Sorry for confusion.
 
 Mohamed
 
 
 On Fri, Apr 19, 2013 at 4:35 PM, Mohamed Ibrahim mibra...@mibrahim.net
 wrote:
 
 It seems that 0.95 is not released yet, mutateRow won't be a solution
 for
 now. I saw it in the downloads and I thought it was released.
 
 
 On Fri, Apr 19, 2013 at 4:18 PM, Mohamed Ibrahim 
 mibra...@mibrahim.net
 wrote:
 
 Just noticed you want to delete as well. I think that's supported
 since
 0.95 in mutateRow (
 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#mutateRow(org.apache.hadoop.hbase.client.RowMutations)
 ).
 You can do multiple puts and deletes and they will be performed
 atomically.
 So you can remove qualifiers and put new ones.
 
 Mohamed
 
 
 On Fri, Apr 19, 2013 at 3:44 PM, Kristoffer Sjögren sto...@gmail.com
 wrote:
 
 What would you suggest? I want the operation to be atomic.
 
 
 On Fri, Apr 19, 2013 at 8:32 PM, Ted Yu yuzhih...@gmail.com wrote:
 
 What is the maximum number of versions do you allow for the
 underlying
 table ?
 
 Thanks
 
 On Fri, Apr 19, 2013 at 10:53 AM, Kristoffer Sjögren 
 sto...@gmail.com
 wrote:
 
 Hi
 
 Is it possible to completely overwrite/replace a row in a single
 _atomic_
 action? Already existing columns and qualifiers should be removed
 if
 they
 do not exist in the data inserted into the row.
 
 The only way to do this is to first delete the row then insert
 new
 data
 in
 its place, correct? Or is there an operation to do this?
 
 Cheers,
 -Kristoffer
 


Re: talk list table

2013-04-20 Thread Amit Sela
Hope I'm not too late here... regarding hot spotting with sequential keys,
I'd suggest you read this Sematext blog -
http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
They present a nice idea there for this kind of issue.

Good Luck!
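
For reference, here is a minimal sketch of the prefix-salting idea that post
describes (this is not the HBaseWD library itself; the bucket count and key
layout below are assumptions):

import org.apache.hadoop.hbase.util.Bytes;

public class SaltedKeys {
  // Number of salt buckets; ideally matches the number of pre-split regions.
  private static final int NUM_BUCKETS = 16;

  // Prepend a deterministic one-byte bucket prefix to a (non-negative)
  // sequential id so that consecutive ids land in different regions.
  // Readers have to scan all NUM_BUCKETS prefixes and merge the results
  // to see the data in key order.
  public static byte[] saltedKey(long sequentialId) {
    byte[] key = new byte[1 + Bytes.SIZEOF_LONG];
    key[0] = (byte) (sequentialId % NUM_BUCKETS);
    System.arraycopy(Bytes.toBytes(sequentialId), 0, key, 1, Bytes.SIZEOF_LONG);
    return key;
  }
}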



On Mon, Apr 15, 2013 at 11:18 PM, Ted Yu yuzhih...@gmail.com wrote:

 bq. write performance would be lower

 The above means poorer performance.

 bq. I could batch them up application side

 Please do that.

 bq. I guess there is no way to turn that off?

 That's right.

 On Mon, Apr 15, 2013 at 11:15 AM, Kireet kir...@feedly.com wrote:

 
 
 
  Thanks for the reply. write performance would be lower - this means
  better?
 
  Also I think I used the wrong terminology regarding batching. I meant to
  ask if it uses the client side write buffer. I would think not since the
  append() method returns a Result. I could batch them up application side
 I
  suppose. Append also seems to return the updated value. This seems like a
  lot of unnecessary I/O in my case since I am not immediately interested
 in
  the updated value. I guess there is no way to turn that off?
 
 
  On 4/15/13 1:28 PM, Ted Yu wrote:
 
  I assume you would select HBase 0.94.6.1 (the latest release) for this
  project.
 
  For #1, write performance would be lower if you choose to use Append
 (vs.
  using Put).
 
  bq. Can appends be batched by the client or do they execute immediately?
  This depends on your use case. Take a look at the following method in
  HTable where you can send a list of actions (Appends):
 
 public void batch(final List<? extends Row> actions, final Object[] results)
  For #2
  bq. The other would be to prefix the timestamp row key with a random
  leading byte.
 
  This technique has been used elsewhere and is better than the first one.
 
  Cheers
 
  On Mon, Apr 15, 2013 at 6:09 AM, Kireet Reddy
  kireet-Teh5dPVPL8nQT0dZR+a...@public.gmane.org
  wrote:
 
    We are planning to create a scheduled task list table in our hbase
  cluster. Essentially we will define a table with key timestamp and then
  the
  row contents will be all the tasks that need to be processed within
 that
  second (or whatever time period). I am trying to do the reasonably
 wide
  rows design mentioned in the hbasecon opentsdb talk. A couple of
  questions:
 
  1. Should we use append or put to create tasks? Since these rows will
 not
   live forever, storage space is not a concern, read/write performance is
  more important. As concurrency increases I would guess the row lock may
  become an issue in append? Can appends be batched by the client or do
  they
  execute immediately?
 
  2. I am a little worried about hotspots. This basic design may cause
  issues in terms of the table's performance. Many tasks will execute and
  reschedule themselves using the same interval, t + 1 hour for example.
 So
   many of the writes may all go to the same block. Also, we have a lot of
  other
  data so I am worried it may impact performance of unrelated data if the
  region server gets too busy servicing the task list table. I can think
  of 2
  strategies to avoid this. One would be to create N different tables and
  read/write tasks to them randomly. This may spread load across servers,
  but
  there is no guarantee hbase will place the tables on different region
  servers, correct? The other would be to prefix the timestamp row key
  with a
  random leading byte. Then when reading from the task list table,
  consumers
  could scan from any/all possible values of the random byte + current
  timestamp to obtain tasks. Both strategies seem like they could spread
  out
  load, but at the cost of more work/complexity to read tasks from the
  table.
  Do either of those approaches make sense?
 
  On the read side, it seems like a similar problem exists in that all
  consumers will be reading rows based on the current timestamp. Is this
  good
  because the block will very likely be cached or bad because the region
  server may become overloaded? I have a feeling the answer is going to
 be
  it depends. :)
 
  I did see the previous posts on queues and the tips there - use
 zookeeper
  for coordination, schedule major compactions, etc. Sorry if these
  questions
  are basic, I am pretty new to hbase. Thanks!
 
 
 
 
 



hbase + mapreduce

2013-04-20 Thread Adrian Acosta Mitjans
Hello:

I'm working on a project and I'm using HBase to store the data. I have this
method that works fine but without the performance I'm looking for, so what I
want is to do the same thing using MapReduce.


public ArrayList<MyObject> findZ(String z) throws IOException {
    ArrayList<MyObject> rows = new ArrayList<MyObject>();
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "test");
    Scan s = new Scan();
    s.addColumn(Bytes.toBytes("x"), Bytes.toBytes("y"));
    ResultScanner scanner = table.getScanner(s);
    try {
        for (Result rr : scanner) {
            if (Bytes.toString(rr.getValue(Bytes.toBytes("x"),
                    Bytes.toBytes("y"))).equals(z)) {
                rows.add(getInformation(Bytes.toString(rr.getRow())));
            }
        }
    } finally {
        scanner.close();
    }
    return rows;
}

The getInformation method takes all the columns and converts the row into a
MyObject instance.

I just want an example or a link to a tutorial that does something like this; I
want to get result objects back as the answer, not a word count like in many of
the examples I found.
My native language is Spanish, so sorry if something is not well written.

Thanks
http://www.uci.cu
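
As a rough starting point, below is a minimal sketch of the same filter
expressed as a map-only job with TableMapReduceUtil. The table and column names
("test", "x", "y") follow the snippet above; the job name and output path are
made up. A SingleColumnValueFilter on the Scan pushes the comparison to the
region servers, so only matching rows reach the mapper, which here just emits
the matching row keys (building MyObject from the full row, as getInformation
does, would happen inside the mapper instead).

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class FindZJob {

  // Emits the row key of every row whose x:y column equals the value z.
  static class FindZMapper extends TableMapper<Text, NullWritable> {
    @Override
    protected void map(ImmutableBytesWritable key, Result value, Context context)
        throws IOException, InterruptedException {
      context.write(new Text(Bytes.toString(key.get())), NullWritable.get());
    }
  }

  public static void main(String[] args) throws Exception {
    String z = args[0];
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "findZ");
    job.setJarByClass(FindZJob.class);

    Scan scan = new Scan();
    scan.addColumn(Bytes.toBytes("x"), Bytes.toBytes("y"));
    scan.setCaching(500);          // larger scanner batches for MapReduce
    scan.setCacheBlocks(false);    // don't pollute the block cache
    // Server-side filter: only rows where x:y equals z reach the mapper.
    SingleColumnValueFilter filter = new SingleColumnValueFilter(
        Bytes.toBytes("x"), Bytes.toBytes("y"), CompareOp.EQUAL, Bytes.toBytes(z));
    filter.setFilterIfMissing(true);   // skip rows that don't have x:y at all
    scan.setFilter(filter);

    TableMapReduceUtil.initTableMapperJob("test", scan, FindZMapper.class,
        Text.class, NullWritable.class, job);
    job.setNumReduceTasks(0);      // map-only job
    job.setOutputFormatClass(TextOutputFormat.class);
    FileOutputFormat.setOutputPath(job, new Path("/tmp/findz-output"));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}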

Re: Overwrite a row

2013-04-20 Thread Kristoffer Sjögren
Just to be absolutely clear, is this also true for a batch that spans
multiple rows?


On Sat, Apr 20, 2013 at 2:42 PM, Ted Yu yuzhih...@gmail.com wrote:

 Operations within each batch are atomic.
 They would either all succeed or all fail.

 Time stamps would all refer to the latest cell (KeyVal).

 Cheers

 On Apr 20, 2013, at 12:17 AM, Kristoffer Sjögren sto...@gmail.com wrote:

  The schema is known beforehand so this is exactly what I need. Great!
 
  One more question. What guarantees does the batch operation have? Are the
  operations contained within each batch atomic? I.e. all mutations will be
  given the same timestamp? If something fails, all operation fail or can
 it
  fail partially?
 
  Thanks for your help, much appreciated.
 
  Cheers,
  -Kristoffer
 
 
  On Sat, Apr 20, 2013 at 4:47 AM, Ted Yu yuzhih...@gmail.com wrote:
 
  I don't know details about Kristoffer's schema.
  If all the column qualifiers are known a priori, mutateRow() should
 serve
  his needs.
 
  HBase allows arbitrary number of columns in a column family. If the
 schema
  is dynamic, mutateRow() wouldn't suffice.
  If the column qualifiers are known but the row is very wide (and a few
  columns are updated per call), performance would degrade.
 
  Just some factors to consider.
 
  Cheers
 
  On Fri, Apr 19, 2013 at 1:41 PM, Mohamed Ibrahim mibra...@mibrahim.net
  wrote:
 
  Actually I do see it in the 0.94 JavaDocs (
 
 http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/client/HTable.html#mutateRow(org.apache.hadoop.hbase.client.RowMutations)
  ),
  so may be it was added in 0.94.6 even though the jira says fixed in
 0.95
  .
  I haven't used it though, but it seems that's what you're looking for.
 
  Sorry for confusion.
 
  Mohamed
 
 
  On Fri, Apr 19, 2013 at 4:35 PM, Mohamed Ibrahim 
 mibra...@mibrahim.net
  wrote:
 
  It seems that 0.95 is not released yet, mutateRow won't be a solution
  for
  now. I saw it in the downloads and I thought it was released.
 
 
  On Fri, Apr 19, 2013 at 4:18 PM, Mohamed Ibrahim 
  mibra...@mibrahim.net
  wrote:
 
  Just noticed you want to delete as well. I think that's supported
  since
  0.95 in mutateRow (
 
 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#mutateRow(org.apache.hadoop.hbase.client.RowMutations)
  ).
  You can do multiple puts and deletes and they will be performed
  atomically.
  So you can remove qualifiers and put new ones.
 
  Mohamed
 
 
  On Fri, Apr 19, 2013 at 3:44 PM, Kristoffer Sjögren 
 sto...@gmail.com
  wrote:
 
  What would you suggest? I want the operation to be atomic.
 
 
  On Fri, Apr 19, 2013 at 8:32 PM, Ted Yu yuzhih...@gmail.com
 wrote:
 
  What is the maximum number of versions do you allow for the
  underlying
  table ?
 
  Thanks
 
  On Fri, Apr 19, 2013 at 10:53 AM, Kristoffer Sjögren 
  sto...@gmail.com
  wrote:
 
  Hi
 
  Is it possible to completely overwrite/replace a row in a single
  _atomic_
  action? Already existing columns and qualifiers should be removed
  if
  they
  do not exist in the data inserted into the row.
 
  The only way to do this is to first delete the row then insert
  new
  data
  in
  its place, correct? Or is there an operation to do this?
 
  Cheers,
  -Kristoffer
 



Re: Slow region server recoveries

2013-04-20 Thread Varun Sharma
Hi Nicholas,

Regarding the following, I think this is not a recovery - the file below is
an HFile and is being accessed on a get request. On this cluster, I don't
have block locality. I see these exceptions for a while and then they are
gone, which means the stale node thing kicks in.

2013-04-19 00:27:28,432 WARN org.apache.hadoop.hdfs.DFSClient: Failed to
connect to /10.156.194.94:50010 for file
/hbase/feeds/1479495ad2a02dceb41f093ebc29fe4f/home/
02f639bb43944d4ba9abcf58287831c0
for block

This is the real bummer. The stale datanode is 1st even 90 seconds
afterwards.

*2013-04-19 00:28:35*,777 WARN
org.apache.hadoop.hbase.regionserver.SplitLogWorker: log splitting of
hdfs://
ec2-107-20-237-30.compute-1.amazonaws.com/hbase/.logs/ip-10-156-194-94.ec2.internal,60020,1366323217601-splitting/ip-10-156-194-94.ec2.internal%2C60020%2C1366323217601.1366331156141failed,
returning error
java.io.IOException: Cannot obtain block length for
LocatedBlock{BP-696828882-10.168.7.226-1364886167971:blk_-5723958680970112840_174056;
getBlockSize()=0; corrupt=false; offset=0; locs=*[10.156.194.94:50010,
10.156.192.106:50010, 10.156.195.38:50010]}*
---at
org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:238)
---at
org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:182)
---at
org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:124)
---at org.apache.hadoop.hdfs.DFSInputStream.init(DFSInputStream.java:117)
---at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1080)
---at
org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:245)
---at
org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:78)
---at
org.apache.hadoop.io.SequenceFile$Reader.openFile(SequenceFile.java:1787)
---at
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.openFile(SequenceFileLogReader.java:62)
---at
org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1707)
---at
org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1728)
---at
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.init(SequenceFileLogReader.java:55)
---at
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:175)
---at
org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:717)
---at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getReader(HLogSplitter.java:821)
---at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getReader(HLogSplitter.java:734)
---at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:381)
---at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:348)
---at
org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:111)
---at
org.apache.hadoop.hbase.regionserver.SplitLogWorker.grabTask(SplitLogWorker.java:264)
---at
org.apache.hadoop.hbase.regionserver.SplitLogWorker.taskLoop(SplitLogWorker.java:195)
---at
org.apache.hadoop.hbase.regionserver.SplitLogWorker.run(SplitLogWorker.java:163)
---at java.lang.Thread.run(Thread.java:662)



On Sat, Apr 20, 2013 at 1:16 AM, Nicolas Liochon nkey...@gmail.com wrote:

 Hi,

 I looked at it again with a fresh eye. As Varun was saying, the root cause
 is the wrong order of the block locations.

 The root cause of the root cause is actually simple: HBASE started the
 recovery while the node was not yet stale from an HDFS pov.

 Varun mentioned this timing:
 Lost Beat: 27:30
 Became stale: 27:50 - * this is a guess and reverse engineered (stale
 timeout 20 seconds)
 Became dead: 37:51

 But the  recovery started at 27:13 (15 seconds before we have this log
 line)
 2013-04-19 00:27:28,432 WARN org.apache.hadoop.hdfs.DFSClient: Failed to
 connect to /10.156.194.94:50010 for file

 /hbase/feeds/1479495ad2a02dceb41f093ebc29fe4f/home/02f639bb43944d4ba9abcf58287831c0
 for block

 BP-696828882-10.168.7.226-1364886167971:blk_-5977178030490858298_99853:java.net.SocketTimeoutException:
 15000 millis timeout while waiting for channel to be ready for connect. ch
 : java.nio.channels.SocketChannel[connection-pending remote=/
 10.156.194.94:50010]

 So when we took the blocks from the NN, the datanode was not stale, so you
 have the wrong (random) order.

 ZooKeeper can expire a session before the timeout. I don't what why it does
 this in this case, but I don't consider it as a ZK bug: if ZK knows that a
 node is dead, it's its role to expire the session. There is something more
 fishy: we started the recovery while the datanode was still responding to
 heartbeat. I don't know why. Maybe the OS has been able to kill 15 the RS
 before vanishing away.

 Anyway, we then have an exception when we try to connect, because the RS
 does not have a TCP connection to this datanode. And this is retried many
 times.

 You would not have this with trunk, because HBASE-6435 reorders the blocks
 inside the client, using 

default region splitting on which value?

2013-04-20 Thread Pal Konyves
Hi,

I am just reading about region splitting. By default - as I understand -
HBase handles splitting the regions itself. I just don't know how to picture
which key it splits a region on.

1) For example, when I write MD5 hashes as rowkeys, they are most probably
evenly distributed from 00... to F..., right? HBase starts with one region,
so all the writes go into that region, and when the HFile gets too big, does
it just take, for example, the median value of the stored keys and split the
region at that point?

2) I want to bulk load tons of data with the HBase java client API put
operations. I want it to perform well. My keys are numeric sequential
values (which I know from this post, I cannot load into Hbase sequentially,
because the Hbase tables are going to be sad
http://ikaisays.com/2011/01/25/app-engine-datastore-tip-monotonically-increasing-values-are-bad/
 )
So I thought I would pre-split the table into regions and load the data in
randomized order. This way I will get good distribution among region servers in
terms of network IO from the beginning. Is that a good idea?

3) If my rowkeys are not evenly distributed in the keyspace but show some
peaks or bursts - e.g. the keyspace is 000-999 but most of the keys gather
around the 020 and 060 values - is it a good idea to place the pre-split
region boundaries at those peaks?

Thanks in advance,
Pal


Re: Slow region server recoveries

2013-04-20 Thread Varun Sharma
The important thing to note is that the block for this rogue WAL is in the
UNDER_RECOVERY state. I have repeatedly asked HDFS dev whether the stale node
logic kicks in correctly for UNDER_RECOVERY blocks, but without success.


On Sat, Apr 20, 2013 at 10:47 AM, Varun Sharma va...@pinterest.com wrote:

 Hi Nicholas,

 Regarding the following, I think this is not a recovery - the file below
 is an HFile and is being accessed on a get request. On this cluster, I
 don't have block locality. I see these exceptions for a while and then they
 are gone, which means the stale node thing kicks in.

 2013-04-19 00:27:28,432 WARN org.apache.hadoop.hdfs.DFSClient: Failed to
 connect to /10.156.194.94:50010 for file
 /hbase/feeds/1479495ad2a02dceb41f093ebc29fe4f/home/
 02f639bb43944d4ba9abcf58287831c0
 for block

 This is the real bummer. The stale datanode is 1st even 90 seconds
 afterwards.

 *2013-04-19 00:28:35*,777 WARN
 org.apache.hadoop.hbase.regionserver.SplitLogWorker: log splitting of
 hdfs://
 ec2-107-20-237-30.compute-1.amazonaws.com/hbase/.logs/ip-10-156-194-94.ec2.internal,60020,1366323217601-splitting/ip-10-156-194-94.ec2.internal%2C60020%2C1366323217601.1366331156141failed,
  returning error
 java.io.IOException: Cannot obtain block length for
 LocatedBlock{BP-696828882-10.168.7.226-1364886167971:blk_-5723958680970112840_174056;
 getBlockSize()=0; corrupt=false; offset=0; locs=*[10.156.194.94:50010,
 10.156.192.106:50010, 10.156.195.38:50010]}*
 ---at
 org.apache.hadoop.hdfs.DFSInputStream.readBlockLength(DFSInputStream.java:238)
 ---at
 org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:182)
 ---at
 org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:124)
 ---at
 org.apache.hadoop.hdfs.DFSInputStream.init(DFSInputStream.java:117)
 ---at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1080)
 ---at
 org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:245)
 ---at
 org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:78)
 ---at
 org.apache.hadoop.io.SequenceFile$Reader.openFile(SequenceFile.java:1787)
 ---at
 org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.openFile(SequenceFileLogReader.java:62)
 ---at
 org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1707)
 ---at
 org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1728)
 ---at
 org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader$WALReader.init(SequenceFileLogReader.java:55)
 ---at
 org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.init(SequenceFileLogReader.java:175)
 ---at
 org.apache.hadoop.hbase.regionserver.wal.HLog.getReader(HLog.java:717)
 ---at
 org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getReader(HLogSplitter.java:821)
 ---at
 org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getReader(HLogSplitter.java:734)
 ---at
 org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:381)
 ---at
 org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:348)
 ---at
 org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:111)
 ---at
 org.apache.hadoop.hbase.regionserver.SplitLogWorker.grabTask(SplitLogWorker.java:264)
 ---at
 org.apache.hadoop.hbase.regionserver.SplitLogWorker.taskLoop(SplitLogWorker.java:195)
 ---at
 org.apache.hadoop.hbase.regionserver.SplitLogWorker.run(SplitLogWorker.java:163)
 ---at java.lang.Thread.run(Thread.java:662)



 On Sat, Apr 20, 2013 at 1:16 AM, Nicolas Liochon nkey...@gmail.com wrote:

 Hi,

 I looked at it again with a fresh eye. As Varun was saying, the root cause
 is the wrong order of the block locations.

 The root cause of the root cause is actually simple: HBASE started the
 recovery while the node was not yet stale from an HDFS pov.

 Varun mentioned this timing:
 Lost Beat: 27:30
 Became stale: 27:50 - * this is a guess and reverse engineered (stale
 timeout 20 seconds)
 Became dead: 37:51

 But the  recovery started at 27:13 (15 seconds before we have this log
 line)
 2013-04-19 00:27:28,432 WARN org.apache.hadoop.hdfs.DFSClient: Failed to
 connect to /10.156.194.94:50010 for file

 /hbase/feeds/1479495ad2a02dceb41f093ebc29fe4f/home/02f639bb43944d4ba9abcf58287831c0
 for block

 BP-696828882-10.168.7.226-1364886167971:blk_-5977178030490858298_99853:java.net.SocketTimeoutException:
 15000 millis timeout while waiting for channel to be ready for connect. ch
 : java.nio.channels.SocketChannel[connection-pending remote=/
 10.156.194.94:50010]

 So when we took the blocks from the NN, the datanode was not stale, so you
 have the wrong (random) order.

 ZooKeeper can expire a session before the timeout. I don't what why it
 does
 this in this case, but I don't consider it as a ZK bug: if ZK knows that a
 node is dead, it's its role to expire the session. There is something more
 fishy: we started the recovery while the datanode was still responding to
 heartbeat. I 

Re: default region splitting on which value?

2013-04-20 Thread Ted Yu
How many column families do you have ?

For #3, pre-splitting the table at the row keys corresponding to the peaks makes sense.
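
For illustration, a minimal sketch of creating a pre-split table with explicit
split points around the peaks mentioned in the question (the table name,
family name and exact boundaries are assumptions):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    HTableDescriptor desc = new HTableDescriptor("routes");   // assumed table name
    desc.addFamily(new HColumnDescriptor("f"));                // assumed single family

    // Explicit split points: extra boundaries around the hot key ranges
    // (020 and 060 in the question) so those peaks land in their own regions.
    byte[][] splits = new byte[][] {
        Bytes.toBytes("010"), Bytes.toBytes("020"), Bytes.toBytes("030"),
        Bytes.toBytes("050"), Bytes.toBytes("060"), Bytes.toBytes("070"),
    };
    admin.createTable(desc, splits);
    admin.close();
  }
}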

On Apr 20, 2013, at 10:52 AM, Pal Konyves paul.kony...@gmail.com wrote:

 Hi,
 
 I am just reading about region splitting. By default - as I understand -
 Hbase handles splitting the regions. I just don't know how to imagine on
 which key it splits the regions.
 
 1) For example when I write MD5 hash of rowkeys, they are most probably
 evenly distributed from
 00... to F... right? When  Hbase starts with one region, all the
 writes goes into that region, and when the HFile get's too big, it just
 gets for example the median value of the stored keys, and split the region
 by this?
 
 2) I want to bulk load tons of data with the HBase java client API put
 operations. I want it to perform well. My keys are numeric sequential
 values (which I know from this post, I cannot load into Hbase sequentially,
 because the Hbase tables are going to be sad
 http://ikaisays.com/2011/01/25/app-engine-datastore-tip-monotonically-increasing-values-are-bad/
 )
 So I thought I would pre-split the table into regions, and load the data
 randomized. This way I will get good distribution among region servers in
 terms of network IO from the beginning. Is that a good idea?
 
 3) If my rowkeys are not evenly distributed in the keyspace, but they show
 some peaks or bursts. e.g. 000-999, but most of the keys gather around 020
 and 060 values, is it a good idea to have the pre region splits at those
 peaks?
 
 Thanks in advance,
 Pal


Re: default region splitting on which value?

2013-04-20 Thread Pal Konyves
Hi Ted,
Only one family, my data is very simple key-value, although I want to make
sequential scan, so making a hash of the key is not an option.



On Sat, Apr 20, 2013 at 10:07 PM, Ted Yu yuzhih...@gmail.com wrote:

 How many column families do you have ?

 For #3, pre-splitting the table at the row keys corresponding to peaks makes
 sense.

 On Apr 20, 2013, at 10:52 AM, Pal Konyves paul.kony...@gmail.com wrote:

  Hi,
 
  I am just reading about region splitting. By default - as I understand -
  Hbase handles splitting the regions. I just don't know how to imagine on
  which key it splits the regions.
 
  1) For example when I write MD5 hash of rowkeys, they are most probably
  evenly distributed from
  00... to F... right? When  Hbase starts with one region, all the
  writes goes into that region, and when the HFile get's too big, it just
  gets for example the median value of the stored keys, and split the
 region
  by this?
 
  2) I want to bulk load tons of data with the HBase java client API put
  operations. I want it to perform well. My keys are numeric sequential
  values (which I know from this post, I cannot load into Hbase
 sequentially,
  because the Hbase tables are going to be sad
 
 http://ikaisays.com/2011/01/25/app-engine-datastore-tip-monotonically-increasing-values-are-bad/
  )
  So I thought I would pre-split the table into regions, and load the data
  randomized. This way I will get good distribution among region servers in
  terms of network IO from the beginning. Is that a good idea?
 
  3) If my rowkeys are not evenly distributed in the keyspace, but they
 show
  some peaks or bursts. e.g. 000-999, but most of the keys gather around
 020
  and 060 values, is it a good idea to have the pre region splits at those
  peaks?
 
  Thanks in advance,
  Pal



Re: default region splitting on which value?

2013-04-20 Thread Ted Yu
The answer to your first question is yes - midkey of the key range would be 
chosen as split key.

For #2, can you tell us how you plan to randomize the loading ?
Bulk load normally means preparing HFiles which would be loaded directly into 
your table. 
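
For completeness, a minimal sketch of the HFile path: after a MapReduce job
using HFileOutputFormat (whose configureIncrementalLoad() sets up partitioning
to match the table's region boundaries) has written HFiles to a directory,
LoadIncrementalHFiles hands them to the region servers. The table name and
directory below are made up:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class BulkLoadExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "routes");        // assumed table name
    // /tmp/hfiles is the output directory of the HFileOutputFormat job.
    LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
    loader.doBulkLoad(new Path("/tmp/hfiles"), table);
    table.close();
  }
}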

Cheers

On Apr 20, 2013, at 1:11 PM, Pal Konyves paul.kony...@gmail.com wrote:

 Hi Ted,
 Only one family, my data is very simple key-value, although I want to make
 sequential scan, so making a hash of the key is not an option.
 
 
 
 On Sat, Apr 20, 2013 at 10:07 PM, Ted Yu yuzhih...@gmail.com wrote:
 
 How many column families do you have ?
 
 For #3, pre-splitting the table at the row keys corresponding to peaks makes
 sense.
 
 On Apr 20, 2013, at 10:52 AM, Pal Konyves paul.kony...@gmail.com wrote:
 
 Hi,
 
 I am just reading about region splitting. By default - as I understand -
 Hbase handles splitting the regions. I just don't know how to imagine on
 which key it splits the regions.
 
 1) For example when I write MD5 hash of rowkeys, they are most probably
 evenly distributed from
 00... to F... right? When  Hbase starts with one region, all the
 writes goes into that region, and when the HFile get's too big, it just
 gets for example the median value of the stored keys, and split the
 region
 by this?
 
 2) I want to bulk load tons of data with the HBase java client API put
 operations. I want it to perform well. My keys are numeric sequential
 values (which I know from this post, I cannot load into Hbase
 sequentially,
 because the Hbase tables are going to be sad
 http://ikaisays.com/2011/01/25/app-engine-datastore-tip-monotonically-increasing-values-are-bad/
 )
 So I thought I would pre-split the table into regions, and load the data
 randomized. This way I will get good distribution among region servers in
 terms of network IO from the beginning. Is that a good idea?
 
 3) If my rowkeys are not evenly distributed in the keyspace, but they
 show
 some peaks or bursts. e.g. 000-999, but most of the keys gather around
 020
 and 060 values, is it a good idea to have the pre region splits at those
 peaks?
 
 Thanks in advance,
 Pal
 


Re: default region splitting on which value?

2013-04-20 Thread Pal Konyves
I am writing a paper for school about HBase, so the data I chose is not a
real-world example. I am familiar with GTFS, which is a de facto standard
for storing information about public transportation schedules: when a vehicle
arrives at a stop and where it goes next.

I chose to generate the rows on the fly, where each row represents a
sequence of 'bus' stops that make up a route from the first stop to the
last stop,
e.g.: [first_stop_id,last_stop_id],string_sequence_of_stops
where the part within the [...] is the rowkey.

So long story short, I generate the data. I want to use the HBase Java
client API to store the rows with Put. I plan to randomize it by picking
random first_stop_id-s and using more threads.

The rowkeys will still be somewhat sequential, because the way I generate the
rows will output about 100-1000 rows starting with the same first_stop_id in
the rowkey. The total number of rows will be in the billions and would
take up about 1 TB.


On Sat, Apr 20, 2013 at 10:54 PM, Ted Yu yuzhih...@gmail.com wrote:

 The answer to your first question is yes - midkey of the key range would
 be chosen as split key.

 For #2, can you tell us how you plan to randomize the loading ?
 Bulk load normally means preparing HFiles which would be loaded directly
 into your table.

 Cheers

 On Apr 20, 2013, at 1:11 PM, Pal Konyves paul.kony...@gmail.com wrote:

  Hi Ted,
  Only one family, my data is very simple key-value, although I want to
 make
  sequential scan, so making a hash of the key is not an option.
 
 
 
  On Sat, Apr 20, 2013 at 10:07 PM, Ted Yu yuzhih...@gmail.com wrote:
 
  How many column families do you have ?
 
  For #3, pre-splitting the table at the row keys corresponding to peaks makes
  sense.
 
  On Apr 20, 2013, at 10:52 AM, Pal Konyves paul.kony...@gmail.com
 wrote:
 
  Hi,
 
  I am just reading about region splitting. By default - as I understand
 -
  Hbase handles splitting the regions. I just don't know how to imagine
 on
  which key it splits the regions.
 
  1) For example when I write MD5 hash of rowkeys, they are most probably
  evenly distributed from
  00... to F... right? When  Hbase starts with one region, all
 the
  writes goes into that region, and when the HFile get's too big, it just
  gets for example the median value of the stored keys, and split the
  region
  by this?
 
  2) I want to bulk load tons of data with the HBase java client API put
  operations. I want it to perform well. My keys are numeric sequential
  values (which I know from this post, I cannot load into Hbase
  sequentially,
  because the Hbase tables are going to be sad
 
 http://ikaisays.com/2011/01/25/app-engine-datastore-tip-monotonically-increasing-values-are-bad/
  )
  So I thought I would pre-split the table into regions, and load the
 data
  randomized. This way I will get good distribution among region servers
 in
  terms of network IO from the beginning. Is that a good idea?
 
  3) If my rowkeys are not evenly distributed in the keyspace, but they
  show
  some peaks or bursts. e.g. 000-999, but most of the keys gather around
  020
  and 060 values, is it a good idea to have the pre region splits at
 those
  peaks?
 
  Thanks in advance,
  Pal
 



Re: talk list table

2013-04-20 Thread Otis Gospodnetic
+ 
http://blog.sematext.com/2012/12/24/hbasewd-and-hbasehut-handy-hbase-libraries-available-in-public-maven-repo/
if you use Maven and want to use HBaseWD.

Otis
--
HBASE Performance Monitoring - http://sematext.com/spm/index.html




On Sat, Apr 20, 2013 at 11:24 AM, Amit Sela am...@infolinks.com wrote:
 Hope I'm not too late here... regarding hot spotting with sequential keys,
 I'd suggest you read this Sematext blog -
 http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
 They present a nice idea there for this kind of issue.

 Good Luck!



 On Mon, Apr 15, 2013 at 11:18 PM, Ted Yu yuzhih...@gmail.com wrote:

 bq. write performance would be lower

 The above means poorer performance.

 bq. I could batch them up application side

 Please do that.

 bq. I guess there is no way to turn that off?

 That's right.

 On Mon, Apr 15, 2013 at 11:15 AM, Kireet kir...@feedly.com wrote:

 
 
 
  Thanks for the reply. write performance would be lower - this means
  better?
 
  Also I think I used the wrong terminology regarding batching. I meant to
  ask if it uses the client side write buffer. I would think not since the
  append() method returns a Result. I could batch them up application side
 I
  suppose. Append also seems to return the updated value. This seems like a
  lot of unnecessary I/O in my case since I am not immediately interested
 in
  the updated value. I guess there is no way to turn that off?
 
 
  On 4/15/13 1:28 PM, Ted Yu wrote:
 
  I assume you would select HBase 0.94.6.1 (the latest release) for this
  project.
 
  For #1, write performance would be lower if you choose to use Append
 (vs.
  using Put).
 
  bq. Can appends be batched by the client or do they execute immediately?
  This depends on your use case. Take a look at the following method in
  HTable where you can send a list of actions (Appends):
 
 public void batch(final List<? extends Row> actions, final Object[] results)
  For #2
  bq. The other would be to prefix the timestamp row key with a random
  leading byte.
 
  This technique has been used elsewhere and is better than the first one.
 
  Cheers
 
  On Mon, Apr 15, 2013 at 6:09 AM, Kireet Reddy
  kireet-Teh5dPVPL8nQT0dZR+a...@public.gmane.org
  wrote:
 
    We are planning to create a scheduled task list table in our hbase
  cluster. Essentially we will define a table with key timestamp and then
  the
  row contents will be all the tasks that need to be processed within
 that
  second (or whatever time period). I am trying to do the reasonably
 wide
  rows design mentioned in the hbasecon opentsdb talk. A couple of
  questions:
 
  1. Should we use append or put to create tasks? Since these rows will
 not
   live forever, storage space is not a concern, read/write performance is
  more important. As concurrency increases I would guess the row lock may
  become an issue in append? Can appends be batched by the client or do
  they
  execute immediately?
 
  2. I am a little worried about hotspots. This basic design may cause
  issues in terms of the table's performance. Many tasks will execute and
  reschedule themselves using the same interval, t + 1 hour for example.
 So
   many of the writes may all go to the same block. Also, we have a lot of
  other
  data so I am worried it may impact performance of unrelated data if the
  region server gets too busy servicing the task list table. I can think
  of 2
  strategies to avoid this. One would be to create N different tables and
  read/write tasks to them randomly. This may spread load across servers,
  but
  there is no guarantee hbase will place the tables on different region
  servers, correct? The other would be to prefix the timestamp row key
  with a
  random leading byte. Then when reading from the task list table,
  consumers
  could scan from any/all possible values of the random byte + current
  timestamp to obtain tasks. Both strategies seem like they could spread
  out
  load, but at the cost of more work/complexity to read tasks from the
  table.
  Do either of those approaches make sense?
 
  On the read side, it seems like a similar problem exists in that all
  consumers will be reading rows based on the current timestamp. Is this
  good
  because the block will very likely be cached or bad because the region
  server may become overloaded? I have a feeling the answer is going to
 be
  it depends. :)
 
  I did see the previous posts on queues and the tips there - use
 zookeeper
  for coordination, schedule major compactions, etc. Sorry if these
  questions
  are basic, I am pretty new to hbase. Thanks!
 
 
 
 
 



Re: Overwrite a row

2013-04-20 Thread Ted Yu
Here is code from 0.94 code base:

  public void mutateRow(final RowMutations rm) throws IOException {
    new ServerCallable<Void>(connection, tableName, rm.getRow(),
        operationTimeout) {
      public Void call() throws IOException {
        server.mutateRow(location.getRegionInfo().getRegionName(), rm);
        return null;

where RowMutations has the following check:

  private void internalAdd(Mutation m) throws IOException {
    int res = Bytes.compareTo(this.row, m.getRow());
    if (res != 0) {
      throw new IOException("The row in the recently added Put/Delete " +
          Bytes.toStringBinary(m.getRow()) + " doesn't match the original one " +
          Bytes.toStringBinary(this.row));

This means you need to issue multiple mutateRow() calls for different rows.

I think you should consider the potential impact on performance due to this
limitation.
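
For illustration, a minimal sketch of a single-row update via RowMutations, in
line with the earlier suggestion of putting new qualifiers and deleting
obsolete ones in one atomic call (the table, family and qualifier names are
assumptions):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.RowMutations;
import org.apache.hadoop.hbase.util.Bytes;

public class OverwriteRowExample {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(HBaseConfiguration.create(), "t1"); // assumed table
    byte[] row = Bytes.toBytes("row-1");
    byte[] fam = Bytes.toBytes("f");                              // assumed family

    RowMutations rm = new RowMutations(row);

    // Put the qualifiers that make up the new version of the row.
    Put put = new Put(row);
    put.add(fam, Bytes.toBytes("q1"), Bytes.toBytes("v1"));
    put.add(fam, Bytes.toBytes("q2"), Bytes.toBytes("v2"));
    rm.add(put);

    // Delete the known qualifiers that are absent from the new version.
    Delete del = new Delete(row);
    del.deleteColumns(fam, Bytes.toBytes("q3"));
    rm.add(del);

    table.mutateRow(rm);   // put + delete applied atomically for this row
    table.close();
  }
}

Note that mutateRow() works on a single row; mutations for different rows still
need separate calls (or the coprocessor approach below).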

For advanced usage, take a look at MultiRowMutationEndpoint:

 * This class demonstrates how to implement atomic multi row transactions
using
 * {@link HRegion#mutateRowsWithLocks(java.util.Collection,
java.util.Collection)}
 * and Coprocessor endpoints.

Cheers

On Sat, Apr 20, 2013 at 10:11 AM, Kristoffer Sjögren sto...@gmail.com wrote:

 Just to absolutely be clear, is this also true for a batch that span
 multiple rows?


 On Sat, Apr 20, 2013 at 2:42 PM, Ted Yu yuzhih...@gmail.com wrote:

  Operations within each batch are atomic.
  They would either all succeed or all fail.
 
  Time stamps would all refer to the latest cell (KeyVal).
 
  Cheers
 
  On Apr 20, 2013, at 12:17 AM, Kristoffer Sjögren sto...@gmail.com
 wrote:
 
   The schema is known beforehand so this is exactly what I need. Great!
  
   One more question. What guarantees does the batch operation have? Are
 the
   operations contained within each batch atomic? I.e. all mutations will
 be
   given the same timestamp? If something fails, all operation fail or can
  it
   fail partially?
  
   Thanks for your help, much appreciated.
  
   Cheers,
   -Kristoffer
  
  
   On Sat, Apr 20, 2013 at 4:47 AM, Ted Yu yuzhih...@gmail.com wrote:
  
   I don't know details about Kristoffer's schema.
   If all the column qualifiers are known a priori, mutateRow() should
  serve
   his needs.
  
   HBase allows arbitrary number of columns in a column family. If the
  schema
   is dynamic, mutateRow() wouldn't suffice.
   If the column qualifiers are known but the row is very wide (and a few
   columns are updated per call), performance would degrade.
  
   Just some factors to consider.
  
   Cheers
  
   On Fri, Apr 19, 2013 at 1:41 PM, Mohamed Ibrahim 
 mibra...@mibrahim.net
   wrote:
  
   Actually I do see it in the 0.94 JavaDocs (
  
 
 http://hbase.apache.org/0.94/apidocs/org/apache/hadoop/hbase/client/HTable.html#mutateRow(org.apache.hadoop.hbase.client.RowMutations)
   ),
   so may be it was added in 0.94.6 even though the jira says fixed in
  0.95
   .
   I haven't used it though, but it seems that's what you're looking
 for.
  
   Sorry for confusion.
  
   Mohamed
  
  
   On Fri, Apr 19, 2013 at 4:35 PM, Mohamed Ibrahim 
  mibra...@mibrahim.net
   wrote:
  
   It seems that 0.95 is not released yet, mutateRow won't be a
 solution
   for
   now. I saw it in the downloads and I thought it was released.
  
  
   On Fri, Apr 19, 2013 at 4:18 PM, Mohamed Ibrahim 
   mibra...@mibrahim.net
   wrote:
  
   Just noticed you want to delete as well. I think that's supported
   since
   0.95 in mutateRow (
  
 
 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#mutateRow(org.apache.hadoop.hbase.client.RowMutations)
   ).
   You can do multiple puts and deletes and they will be performed
   atomically.
   So you can remove qualifiers and put new ones.
  
   Mohamed
  
  
   On Fri, Apr 19, 2013 at 3:44 PM, Kristoffer Sjögren 
  sto...@gmail.com
   wrote:
  
   What would you suggest? I want the operation to be atomic.
  
  
   On Fri, Apr 19, 2013 at 8:32 PM, Ted Yu yuzhih...@gmail.com
  wrote:
  
   What is the maximum number of versions do you allow for the
   underlying
   table ?
  
   Thanks
  
   On Fri, Apr 19, 2013 at 10:53 AM, Kristoffer Sjögren 
   sto...@gmail.com
   wrote:
  
   Hi
  
   Is it possible to completely overwrite/replace a row in a single
   _atomic_
   action? Already existing columns and qualifiers should be
 removed
   if
   they
   do not exist in the data inserted into the row.
  
   The only way to do this is to first delete the row then insert
   new
   data
   in
   its place, correct? Or is there an operation to do this?
  
   Cheers,
   -Kristoffer
  
 



Re: default region splitting on which value?

2013-04-20 Thread Ted Yu
Thanks for sharing the information below.

How do you plan to store time (when the bus gets to each stop) in the row ?
Or maybe it is not of importance to you ?

On Sat, Apr 20, 2013 at 2:24 PM, Pal Konyves paul.kony...@gmail.com wrote:

 I am making a paper for school about HBase, so the data I chose is not a
 real usable example. I am familiar with GTFS that is a de facto standard
 for storing information about public transportation schedules: when vehicle
 arrives to a stop and where it goes toward.

 I chose to genrate the rows on the fly, where each row represents a
 sequence of 'bus' stops that make a route from the first stop until the
 last stop.
 e.g.: [first_stop_id,last_stop_id],string_sequence_of_stops
 where within the [...] is the rowkey.

 So long story short, I generate the data. I want to use the HBase java
 client api to store the rows with Put. I plan to randomize it by picking
 random first_stop_id-s, and use more threads.

 the rowkeys will still have a sequence, because the way I generate the rows
 will output about 100-1000 rows starting with the same first_stop_id within
 the rowkey. The total ammount of rows will be about billions, and would
 take up about 1TB.


 On Sat, Apr 20, 2013 at 10:54 PM, Ted Yu yuzhih...@gmail.com wrote:

  The answer to your first question is yes - midkey of the key range would
  be chosen as split key.
 
  For #2, can you tell us how you plan to randomize the loading ?
  Bulk load normally means preparing HFiles which would be loaded directly
  into your table.
 
  Cheers
 
  On Apr 20, 2013, at 1:11 PM, Pal Konyves paul.kony...@gmail.com wrote:
 
   Hi Ted,
   Only one family, my data is very simple key-value, although I want to
  make
   sequential scan, so making a hash of the key is not an option.
  
  
  
   On Sat, Apr 20, 2013 at 10:07 PM, Ted Yu yuzhih...@gmail.com wrote:
  
   How many column families do you have ?
  
    For #3, pre-splitting the table at the row keys corresponding to peaks
 makes
   sense.
  
   On Apr 20, 2013, at 10:52 AM, Pal Konyves paul.kony...@gmail.com
  wrote:
  
   Hi,
  
   I am just reading about region splitting. By default - as I
 understand
  -
   Hbase handles splitting the regions. I just don't know how to imagine
  on
   which key it splits the regions.
  
   1) For example when I write MD5 hash of rowkeys, they are most
 probably
   evenly distributed from
   00... to F... right? When  Hbase starts with one region, all
  the
   writes goes into that region, and when the HFile get's too big, it
 just
   gets for example the median value of the stored keys, and split the
   region
   by this?
  
   2) I want to bulk load tons of data with the HBase java client API
 put
   operations. I want it to perform well. My keys are numeric sequential
   values (which I know from this post, I cannot load into Hbase
   sequentially,
   because the Hbase tables are going to be sad
  
 
 http://ikaisays.com/2011/01/25/app-engine-datastore-tip-monotonically-increasing-values-are-bad/
   )
   So I thought I would pre-split the table into regions, and load the
  data
   randomized. This way I will get good distribution among region
 servers
  in
   terms of network IO from the beginning. Is that a good idea?
  
   3) If my rowkeys are not evenly distributed in the keyspace, but they
   show
   some peaks or bursts. e.g. 000-999, but most of the keys gather
 around
   020
   and 060 values, is it a good idea to have the pre region splits at
  those
   peaks?
  
   Thanks in advance,
   Pal