Re: Setting TTL at the row level

2017-06-23 Thread yonghu
I did not quite understand what you mean by "row timestamp". As far as I know, a timestamp is associated with each data version (cell). Will you store multiple data versions in a single column? On Thu, Jun 22, 2017 at 4:35 AM, Jean-Marc Spaggiari < jean-m...@spaggiari.org> wrote: > Why not using

Re: multiple data versions vs. multiple rows?

2015-01-20 Thread yonghu
need to run way bigger and longer tests. Don't run them in parallel. Run them at least 10 times each to make sure you have a good trend. Is the difference between the 2 significant? It should not be. JM 2015-01-19 15:17 GMT-05:00 yonghu yongyong...@gmail.com: Hi, Thanks for your

multiple data versions vs. multiple rows?

2015-01-19 Thread yonghu
Dear all, I want to record user history data. I know there exist two options: one is to store user events in a single row with multiple data versions, and the other is to use multiple rows. I wonder which one is better for performance? Thanks! Yong
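The trade-off discussed in this thread can be sketched in plain Python. This is not HBase code; it is a minimal model of the two layouts, assuming a hypothetical cap of 3 versions per cell (HBase's historical default) to show why the versioned-cell design silently drops history while the row-per-event design keeps it all.

```python
# Sketch (not the HBase API): two ways to model per-user event history.
# Option A: one row per user, multiple cell versions (bounded).
# Option B: one row per event, row key = user + reversed timestamp.
from collections import defaultdict

MAX_VERSIONS = 3  # hypothetical per-cell version cap, like HBase's default

def put_versioned(table, user, ts, event):
    versions = table[user]
    versions.append((ts, event))
    versions.sort(reverse=True)      # newest first, as HBase returns them
    del versions[MAX_VERSIONS:]      # older versions fall off the end

def put_wide(table, user, ts, event):
    # Reversed timestamp so the newest event sorts first in a scan.
    table[f"{user}#{2**63 - ts}"] = event

versioned = defaultdict(list)
wide = {}
for ts, e in [(1, "login"), (2, "click"), (3, "buy"), (4, "logout")]:
    put_versioned(versioned, "u1", ts, e)
    put_wide(wide, "u1", ts, e)

print(len(versioned["u1"]))                            # only 3 survive
print(len([k for k in wide if k.startswith("u1#")]))   # all 4 kept
```

For a full, unbounded history the row-per-event layout is usually the safer choice; the versioned-cell layout only works if the version limit is raised to cover the expected history length.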

Re: multiple data versions vs. multiple rows?

2015-01-19 Thread yonghu
-01-19 14:28 GMT-05:00 yonghu yongyong...@gmail.com: Dear all, I want to record the user history data. I know there exists two options, one is to store user events in a single row with multiple data versions and the other one is to use multiple rows. I wonder which one is better

Strange behavior when using MapReduce to process HBase.

2014-11-25 Thread yonghu
Hi, I wrote a copyTable MapReduce program. My HBase version is 0.94.16. Several rows in the source table contain multiple data versions. The map function looks as follows: public void map(ImmutableBytesWritable rowKey, Result res, Context context) throws IOException, InterruptedException{

Is it possible to get the table name at the Map phase?

2014-11-21 Thread yonghu
Hello all, I want to implement a difference operator using MapReduce. I read two tables using MultiTableInputFormat. In the Map phase, I need to tag the name of the table onto each row, but how can I get the table name? One way I can think of is to create an HTable instance for each table in the

Is it possible to direct a specific key range to a specific node?

2014-11-02 Thread yonghu
Dear All, Suppose that I have a key range from 1 to 100 and want to store 1-50 on the first node and 51-100 on the second node. How can I do this in HBase? regards! Yong
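The usual answer to this question is pre-splitting: create the table with explicit split keys so each range becomes its own region, then let the balancer (or a manual move) place the regions. The routing logic itself is just a sorted lookup over region start keys, which this small sketch models; the split key 51 is taken from the question, everything else is illustrative.

```python
import bisect

# Sketch of how pre-split regions route keys. In HBase you would create
# the table with split keys (here, [51]) so 1-50 land in the first region
# and 51-100 in the second; placing those regions on specific nodes is a
# separate step done by the balancer or a manual region move.
split_keys = [51]  # start keys of every region after the first

def region_for(key):
    # Number of split keys <= key gives the region index.
    return bisect.bisect_right(split_keys, key)

print(region_for(1), region_for(50), region_for(51), region_for(100))
```

Note that real HBase row keys are byte arrays compared lexicographically, so integer keys should be zero-padded (or binary-encoded) for this routing to behave as expected.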

A use case for ttl deletion?

2014-09-26 Thread yonghu
Hello, Can anyone give me a concrete use case for TTL deletions? I mean, in which situations should we set the TTL property? regards! Yong

Re: hbase memstore size

2014-08-06 Thread yonghu
I did not quite understand your problem. You store your data in HBase, and I guess later you will also read data from it. Generally, HBase will first check whether the data exists in the memstore; if not, it will check the disk. If you set the memstore to 0, every read will go directly to

Re: HBase appends

2014-07-22 Thread yonghu
Hi, If an author does not have hundreds of publications, you can directly write them into one column. Hence, your column will contain multiple data versions. The default number of versions is 3, but you can set more. On Tue, Jul 22, 2014 at 4:20 AM, Ishan Chhabra ichha...@rocketfuel.com wrote: Arun, You

Re: single node's performance and cluster's performance

2014-04-03 Thread yonghu
I think the right understanding is that it will slow down query processing. You can think of the RS that hits heavy I/O as a hotspot node. It will not slow down the whole cluster; it will only slow down the applications which access data from that RS. On Thu, Apr 3, 2014 at

Re: Moving older data versions to archive

2014-04-03 Thread yonghu
I think you can define coprocessors to do this. For example, for every Put command, you can keep the desired versions and later put the older versions into another table or into HDFS. Finally, either let HBase delete your stale data or let the coprocessor do that for you. The problem of

Re: LSM tree, SSTable and fractal tree

2014-02-28 Thread yonghu
HBase uses an LSM tree and SSTables; not sure about fractal trees. On Fri, Feb 28, 2014 at 2:57 PM, Shahab Yunus shahab.yu...@gmail.com wrote: http://www.slideshare.net/jaxlondon2012/hbase-advanced-lars-george http://hortonworks.com/hadoop/hbase/ Regards, Shahab On Fri, Feb 28, 2014 at 8:36 AM,

Re: When should we trigger a major compaction?

2014-02-21 Thread yonghu
Before you trigger a major compaction, let's first explain why we need one. A major compaction will: 1. delete data which is masked by tombstones; 2. delete data whose TTL has expired; 3. compact several small HFiles into a single larger one. I didn't quite
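The three effects listed above can be modeled in a few lines of Python. This is a deliberately simplified sketch, not HBase internals: cells are plain tuples, the TTL value is hypothetical, and tombstone masking here ignores per-version timestamps (real HBase only masks versions at or below the marker's timestamp).

```python
# Sketch of a major compaction over cells shaped as
# (row, qualifier, ts, value, is_tombstone), with a per-family TTL.
TTL = 100  # seconds, hypothetical

def major_compact(cells, now):
    # Simplification: a tombstone masks every version of its cell.
    tombstoned = {(r, q) for (r, q, ts, v, d) in cells if d}
    out = []
    for (r, q, ts, v, d) in cells:
        if d:
            continue                  # 1a. drop the delete marker itself
        if (r, q) in tombstoned:
            continue                  # 1b. drop data masked by a tombstone
        if now - ts > TTL:
            continue                  # 2. drop data whose TTL has expired
        out.append((r, q, ts, v))
    return out                        # 3. one merged output "HFile"

cells = [
    ("r1", "c", 10, "old", False),
    ("r1", "c", 20, None, True),      # tombstone masks r1/c
    ("r2", "c", 50, "live", False),
    ("r3", "c", 1, "expired", False),
]
print(major_compact(cells, now=120))  # [('r2', 'c', 50, 'live')]
```

Minor compactions, by contrast, merge files without being able to drop delete markers, since a masked KeyValue might live in a file outside the compaction set.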

Re: TTL forever

2014-02-18 Thread yonghu
I also calculated the years of ttl, just for fun. :). But as Jean said, default ttl is forever. On Tue, Feb 18, 2014 at 2:05 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Hi Mohamed, Default value is MAX_VALUE, which is considered as forever. So default TTL is NOT 69 years.

Re: What's the Common Way to Execute an HBase Job?

2014-02-11 Thread yonghu
Hi, To process data in HBase you have several options: 1. a Java program using the HBase API; 2. a MapReduce program; 3. high-level languages such as Hive or Pig (built on top of MapReduce); 4. Phoenix, also a high-level language (built on coprocessors). Which one you should use depends

Re: Newbie question: Rowkey design

2013-12-17 Thread yonghu
In my opinion, it really depends on your queries. The first one achieves data locality: there is no additional data transfer between nodes. But this strategy sacrifices parallelism, and the node which stores A will become a hot node if too many applications try to access A. The second

Re: Online/Realtime query with filter and join?

2013-11-29 Thread yonghu
The question is what you mean by real-time. What is your performance requirement? In my opinion, I don't think MapReduce is suitable for real-time data processing. On Fri, Nov 29, 2013 at 9:55 AM, Azuryy Yu azury...@gmail.com wrote: you can try Phoenix. On 2013-11-29 3:44 PM, Ramon Wang

Re: How to create HTableInterface in coprocessor?

2013-10-24 Thread yonghu
...@gmail.com wrote: I logged HBASE-9819 to backport HBASE-8372 'Provide mutability to CompoundConfiguration' to 0.94 If you have time, you can work on the backport. Cheers On Tue, Oct 22, 2013 at 11:56 AM, yonghu yongyong...@gmail.com wrote: Hi Ted, This is because I tried

How to create HTableInterface in coprocessor?

2013-10-22 Thread yonghu
Hello, In older versions of HBase, I could get an HTableInterface via the HTablePool.getTable() method. However, in the latest HBase version, 0.94.12, HTablePool is deprecated. So I tried to use HConnectionManager to create an HTableInterface, but it does not work. Can anyone tell me how to create

Re: How to create HTableInterface in coprocessor?

2013-10-22 Thread yonghu
9.3.1.1. On Tue, Oct 22, 2013 at 9:37 AM, yonghu yongyong...@gmail.com wrote: Hello, In the oldest verison of HBase , I can get the HTableInterface by HTablePool.getTable() method. However, in the latest Hbase version0.94.12, HTablePool is deprecated. So, I tried to use

Re: How to create HTableInterface in coprocessor?

2013-10-22 Thread yonghu
) at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.prePut(RegionCoprocessorHost.java:808) ... 9 more : 1 time, servers with issues: hans-laptop:60020, On Tue, Oct 22, 2013 at 8:14 PM, yonghu yongyong...@gmail.com wrote: Ted, Can you tell me how to dump the stack trace of HBase

Re: How to create HTableInterface in coprocessor?

2013-10-22 Thread yonghu
to be unmanaged.); } Cheers On Tue, Oct 22, 2013 at 11:14 AM, yonghu yongyong...@gmail.com wrote: Ted, Can you tell me how to dump the stack trace of HBase? By the way, I check the log of RegionServer. It has following error messages: java.io.IOException: The connection has

When will a log file be transferred from the /.log folder to the /.oldlogs folder?

2013-10-17 Thread yonghu
Hello, I saw some descriptions saying that when data modifications (like Put or Delete) have already been persisted to disk, the log file will be transferred from the /.log folder to the /.oldlogs folder. However, I made a simple test. I first created a table, inserted several rows, and then used flush

Re: How to understand the TS of each data version?

2013-09-28 Thread yonghu
in the Cell. I used two rows, but as you said, the two Cells can be written using one RPC call. This way, NetworkSupplier column is not needed. Cheers On Fri, Sep 27, 2013 at 3:04 PM, yonghu yongyong...@gmail.com wrote: To Ted, --Can you tell me why readings corresponding to different

Re: How to understand the TS of each data version?

2013-09-27 Thread yonghu
(1,3,5) are timestamp. regards! Yong On Fri, Sep 27, 2013 at 4:47 PM, Ted Yu yuzhih...@gmail.com wrote: In {10K:1, 20K:3, 15K:5}, what does the value (1, 3, 5) represent ? Cheers On Fri, Sep 27, 2013 at 7:24 AM, yonghu yongyong...@gmail.com wrote: Hello, In my understanding

Re: How to understand the TS of each data version?

2013-09-27 Thread yonghu
) -- Lars From: yonghu yongyong...@gmail.com To: user@hbase.apache.org Sent: Friday, September 27, 2013 7:24 AM Subject: How to understand the TS of each data version? Hello, In my understanding, the timestamp of each data version is generated by Put command

Re: hbase schema design

2013-09-18 Thread yonghu
Different from an RDBMS, the data in HBase is stored as key-value pairs in HDFS. Hence, for every data version in a cell, the row key appears again. On Tue, Sep 17, 2013 at 7:53 PM, Ted Yu yuzhih...@gmail.com wrote: w.r.t. Data Block Encoding, you can find some performance numbers here:
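The point above, that the full key is repeated per cell, is why long row keys and qualifier names inflate storage. A rough back-of-the-envelope sketch (hypothetical sizes; it ignores per-KeyValue length fields, type bytes, and any block encoding, all of which change the exact numbers):

```python
# Sketch: each stored cell carries row + family + qualifier + timestamp,
# so key length is paid once per cell version, not once per row.
def stored_bytes(row, family, qualifier, value, versions):
    key = len(row) + len(family) + len(qualifier) + 8  # 8-byte timestamp
    return versions * (key + len(value))

short = stored_bytes("u1", "d", "c", "x", versions=5)
long_ = stored_bytes("user-0000000001", "data", "city", "x", versions=5)
print(short, long_)  # 65 160: same 5 values, ~2.5x the bytes
```

This is the usual argument for short family names (often a single letter) and compact row keys in HBase schema design.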

Re: one column family but lots of tables

2013-08-24 Thread yonghu
I think you can take a look at http://hbase.apache.org/book/regions.arch.html, which describes the data storage hierarchy of HBase. Since, as Lars stated, HBase flushes by region (which means all stores of that region are flushed), you can think of the limitations of

Does HBase support parallel table scans if I use MapReduce?

2013-08-20 Thread yonghu
Hello, I know that if I use the default scan API, HBase scans a table in a serial manner, as it needs to guarantee the order of the returned tuples. My question is: if I use MapReduce to read the HBase table and directly output the results to HDFS, not returning them to the client, is the HBase scan still in a
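The short answer in such setups is that TableInputFormat creates one map task per region, so the regions are scanned concurrently even though each per-region scan stays sequential. A toy illustration of that decomposition (plain Python threads standing in for map tasks; keys and region bounds are made up):

```python
from concurrent.futures import ThreadPoolExecutor

# Sketch: split one table scan into independent per-region scans,
# the way one map task per region does under MapReduce.
table = {f"k{i:03d}": i for i in range(100)}
region_bounds = [("k000", "k050"), ("k050", "k100")]  # [start, stop)

def scan_region(bounds):
    start, stop = bounds
    # Each region scan is sequential and ordered within its key range.
    return [v for k, v in sorted(table.items()) if start <= k < stop]

with ThreadPoolExecutor(max_workers=2) as ex:
    parts = list(ex.map(scan_region, region_bounds))

print(sum(len(p) for p in parts))  # 100: full table covered in parallel
```

Global ordering is lost across parts, which is exactly the price paid for parallelism; if the output goes straight to HDFS and order does not matter, that price is usually acceptable.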

Re: slow operation in postPut

2013-08-01 Thread yonghu
If I want to use multi-thread with thread safe, which class should I use? On Thu, Aug 1, 2013 at 3:08 PM, Ted Yu yuzhih...@gmail.com wrote: HTable is not thread safe. On Aug 1, 2013, at 5:58 AM, Pavel Hančar pavel.han...@gmail.com wrote: Hello, I have a class extending

Re: slow operation in postPut

2013-08-01 Thread yonghu
Use HTablePool instead. For more info, see http://hbase.apache.org/book/client.html. On Thu, Aug 1, 2013 at 3:32 PM, yonghu yongyong...@gmail.com wrote: If I want to use multi-thread with thread safe, which class should I use? On Thu, Aug 1, 2013 at 3:08 PM, Ted Yu yuzhih...@gmail.com wrote

Re: Data missing in import bulk data

2013-07-24 Thread yonghu
The ways you can lose data, in my view: 1. some tuples share the same row-key+cf+column; hence, when you load your data into HBase, they will be loaded into the same column and may exceed the predefined max versions. 2. As Ted mentioned, you may import some deletes; do you generate

Re: How to join 2 tables using hadoop?

2013-07-19 Thread yonghu
You can write one MR job to do this. First, read the two tables in the Map function; the output key will be the reference key for one table and the primary key for the other table. In the Reduce function, you can join the tuples which share the same key. Please note this is a very naive approach; for
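The reduce-side join described above can be sketched without Hadoop at all: tag each record with its source table in the "map" step, group by join key in the "shuffle", and pair the two sides in the "reduce". The table names and records below are invented for illustration.

```python
from collections import defaultdict

# Sketch of the naive reduce-side join: tag, shuffle by key, then join.
orders = [("o1", "alice"), ("o2", "bob")]   # (order_id, user) - "table 1"
users  = [("alice", "DE"), ("bob", "FR")]   # (user, country)  - "table 2"

shuffle = defaultdict(list)
for oid, user in orders:                    # "map": tag with table name
    shuffle[user].append(("orders", oid))
for user, country in users:
    shuffle[user].append(("users", country))

joined = []
for key, tagged in shuffle.items():         # "reduce": join per key
    left  = [v for t, v in tagged if t == "orders"]
    right = [v for t, v in tagged if t == "users"]
    joined += [(oid, key, c) for oid in left for c in right]

print(sorted(joined))  # [('o1', 'alice', 'DE'), ('o2', 'bob', 'FR')]
```

It is naive in exactly the way the email warns: every record of both tables is shuffled, and a skewed join key sends all its records to one reducer; map-side or bucketed joins avoid that for large or skewed inputs.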

several doubts about region split?

2013-07-17 Thread yonghu
Dear all, From the HBase reference book, it mentions that when a RegionServer splits regions, it will offline the split region, add the daughter regions to META, open the daughters on the parent's hosting RegionServer, and then report the split to the Master. I have several questions: 1.

Re: several doubts about region split?

2013-07-17 Thread yonghu
of the daughter regions if continuous write load incurs. Cheers On Wed, Jul 17, 2013 at 6:53 AM, yonghu yongyong...@gmail.com wrote: Dear all, From the HBase reference book, it mentions that when RegionServer splits regions, it will offline the split region and then adds the daughter

a question about assigning timestamp?

2013-07-13 Thread yonghu
Hello, From the reference book, 5.8.1.4 Put: if I issue a Put command without specifying a timestamp, the server will generate a TS for me. I wonder whether "the server" means the master node or the RegionServers? In my understanding, the server means the RegionServer, as the master only tells the

Re: Deleting range of rows from Hbase.

2013-07-04 Thread yonghu
I checked the latest API of the Delete class. I am afraid you have to do it yourself. regards! Yong On Wed, Jul 3, 2013 at 6:46 PM, Rahul Bhattacharjee rahul.rec@gmail.com wrote: Hi, Like scan with a range, I would like to delete rows within a range. Is this supported from the hbase shell? Lets

Re: Deleting range of rows from Hbase.

2013-07-04 Thread yonghu
API also.. You can have a look at BulkDeleteEndpoint which can do what you want to -Anoop- On Thu, Jul 4, 2013 at 4:09 PM, yonghu yongyong...@gmail.com wrote: I check the latest api of Delete class. I am afraid you have to do it by yourself. regards! Yong On Wed, Jul 3, 2013

Re: Deleting range of rows from Hbase.

2013-07-04 Thread yonghu
and the actual execution will happen at server side.. (This is what will happen with Endpoints :) ) -Anoop- On Thu, Jul 4, 2013 at 4:29 PM, yonghu yongyong...@gmail.com wrote: Hi Anoop one more question. Can I use BulkDeleteEndpoint at the client side or should I use it like

Re: HBase failure scenarios

2013-06-11 Thread yonghu
Hi Lucas, First, a write request in HBase consists of two parts: 1. write into the WAL; 2. write into the Memstore; when the Memstore reaches its threshold, its data will be flushed to disk. In my understanding, there are two data synchronization points: the first one is the write to the WAL. As
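The two-part write path described above is easy to model: every put appends to the log first (durability), then updates the in-memory store, and the memstore is snapshotted out to a file when it crosses a threshold. This is a toy model, not HBase code; the threshold of 3 cells is arbitrary (the real default is a per-memstore byte size, e.g. 128 MB in later versions).

```python
# Sketch of the WAL + memstore write path.
FLUSH_THRESHOLD = 3  # cells; hypothetical stand-in for the byte threshold

wal, memstore, hfiles = [], {}, []

def put(key, value):
    wal.append((key, value))           # 1. append to the WAL first
    memstore[key] = value              # 2. then update the memstore
    if len(memstore) >= FLUSH_THRESHOLD:
        hfiles.append(dict(memstore))  # flush a snapshot to "disk"
        memstore.clear()

for i in range(5):
    put(f"k{i}", i)

print(len(wal), len(hfiles), len(memstore))  # 5 1 2
```

After a crash, everything in the WAL but not yet in a flushed file (here the last 2 cells) is replayed to rebuild the memstore, which is why the WAL append must happen before the write is acknowledged.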

Re: Poor HBase map-reduce scan performance

2013-06-05 Thread yonghu
Can anyone explain why client + RPC + server decreases scan performance? I mean, the RegionServer and TaskTracker are on the same node when you use MapReduce to scan the HBase table. So, in my understanding, there should be no RPC cost. Thanks! Yong On Wed, Jun 5, 2013 at 10:09 AM,

Re: Poor HBase map-reduce scan performance

2013-06-05 Thread yonghu
, yonghu yongyong...@gmail.com wrote: Can anyone explain why client + rpc + server will decrease the performance of scanning? I mean the Regionserver and Tasktracker are the same node when you use MapReduce to scan the HBase table. So, in my understanding, there will be no rpc cost

Re: Change data capture tool for hbase

2013-06-03 Thread yonghu
Hello, I have presented 5 CDC approaches based on HBase and published my results at ADBIS 2013. regards! Yong On Mon, Jun 3, 2013 at 11:16 AM, yavuz gokirmak ygokir...@gmail.com wrote: Hi all, Currently we are working on a hbase change data capture (CDC) tool. I want to share our ideas

Re: Doubt Regading HLogs

2013-05-17 Thread yonghu
In this situation, you can set the property &lt;property&gt;&lt;name&gt;hbase.regionserver.logroll.period&lt;/name&gt;&lt;value&gt;360&lt;/value&gt;&lt;/property&gt; to a short value, let's say 3000, and then you can see your log file with its current size after 3 seconds. To Nicolas, I guess he wants somehow to analyze the HLog.

How can I set heap size for HBase?

2013-03-11 Thread yonghu
Dear All, I wonder how I can set the heap size for HBase, and what a suitable portion of the whole memory is. My other question is how much memory I need to give to Java when I run HBase, as I sometimes get out-of-memory problems. Thanks! Yong

Re: Possible to delete a specific cell?

2013-03-07 Thread yonghu
Hello, I think you can use HBase's org.apache.hadoop.hbase.client.Delete class. It already supports deleting a specific version in a cell; see the public Delete deleteColumn(byte[] family, byte[] qualifier, long timestamp) method. regards! Yong On Thu, Mar 7, 2013 at 9:25 PM, Jonathan Natkins

Re: Hbase table with a nested entity

2013-02-27 Thread yonghu
Hello Dastgiri, I don't think HBase can support the nested schema which you want to define, but you can still store your data in HBase. I figured out several possible solutions: 1. row_key: profileid + profilename + date; the column will be monthwiseProfileCount:uk and so on. However, this

Re: coprocessor enabled put very slow, help please~~~

2013-02-18 Thread yonghu
Parkash, I have a six-node cluster and met the same problem you had. In my test, inserting one tuple using a coprocessor is nearly 10 times slower than a normal put operation. I think the main reason is what Lars pointed out: the main overhead is executing RPCs. regards! Yong On Mon, Feb 18,

Re: coprocessor enabled put very slow, help please~~~

2013-02-18 Thread yonghu
Forgot to say: I also tested MapReduce. It's faster than the coprocessor. On Mon, Feb 18, 2013 at 10:01 AM, yonghu yongyong...@gmail.com wrote: Parkash, I have a six nodes cluster and met the same problem as you had. In my test, inserting one tuple using coprocessor is nearly 10 times slower

Re: coprocessor enabled put very slow, help please~~~

2013-02-18 Thread yonghu
Hi Michael, I don't quite understand what you mean by a round trip back to the client. In my understanding, as the RegionServer and TaskTracker can be on the same node, MR doesn't have to pull data to the client and then process it. You also mention unnecessary overhead; can you explain a little

Re: coprocessor enabled put very slow, help please~~~

2013-02-18 Thread yonghu
, at 4:57 AM, yonghu yongyong...@gmail.com wrote: Hi, Michael I don't quite understand what do you mean by round trip back to the client. In my understanding, as the RegionServer and TaskTracker can be the same node, MR don't have to pull data into client and then process. And you also mention

Re: Is it possible to indicate the column scan order when scanning table?

2013-02-07 Thread yonghu
7, 2013 at 6:29 PM, Ted Yu yuzhih...@gmail.com wrote: Can you give us the use case where the scanning order is significant ? Thanks On Thu, Feb 7, 2013 at 9:23 AM, yonghu yongyong...@gmail.com wrote: Dear all, I wonder if it is possible to indicate the column scan order when scanning

Re: Is it possible to indicate the column scan order when scanning table?

2013-02-07 Thread yonghu
Thanks for your response. I will take a look. yong On Thu, Feb 7, 2013 at 10:11 PM, Ted Yu yuzhih...@gmail.com wrote: Yonghu: You may want to take a look at HBASE-5416: Improve performance of scans with some kind of filters. It would be in the upcoming 0.94.5 release. You can designate

Re: Json+hbase

2013-02-04 Thread yonghu
I think you can treat id as the row key, and address-type/home or address-type/office as column families; each address can be treated as a column. The question is how you transform your JSON metadata into HBase schema information. regards! Yong On Mon, Feb 4, 2013 at 11:28 AM, ranjin...@polarisft.com
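The mapping suggested in this reply, id as row key, address type as column family, each field as a qualifier, can be sketched as a small flattening function. The JSON document below is invented for illustration.

```python
import json

# Sketch: flatten a nested JSON document into HBase-style cells of
# (row_key, column_family, qualifier, value), per the suggested mapping.
doc = json.loads("""
{"id": "42",
 "home":   {"street": "Elm St",  "city": "Springfield"},
 "office": {"street": "Main St", "city": "Shelbyville"}}
""")

row_key = doc.pop("id")                       # id becomes the row key
cells = [(row_key, family, qualifier, value)
         for family, fields in doc.items()    # address type -> family
         for qualifier, value in fields.items()]  # field -> qualifier

print(len(cells))  # 4 cells for one logical document
```

One caveat: HBase column families must be declared at table-creation time, so this mapping only works if the set of address types is small and known up front; otherwise the type is better encoded in the qualifier (e.g. home:street, office:street within one family).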

Re: Is there a way to close automatically log deletion in HBase?

2013-02-02 Thread yonghu
mean the normal logging or the WAL logs. Sorry am not getting your question here. WAL trigger is for the WAL logs. Regards Ram On Sat, Feb 2, 2013 at 1:31 PM, yonghu yongyong...@gmail.com wrote: Hello, For some reasons, I need to analyze the log of hbase. However, the log

Re: Is there a way to close automatically log deletion in HBase?

2013-02-02 Thread yonghu
version then you have a way to plugin your own logcleaner class. 'BaseLogCleanerDelegate' is the default thing available. Customise your logcleaner as per your requirement so that you can have a back up of the logs. Hope this helps. Regards Ram On Sat, Feb 2, 2013 at 3:39 PM, yonghu

How can I set column information when I use YCSB to test HBase?

2013-01-18 Thread yonghu
Dear all, I read the information of https://github.com/brianfrankcooper/YCSB/wiki/Running-a-Workload For example, I can indicate the column family name when I issue the command line as java -cp build/ycsb.jar:db/hbase/lib/* com.yahoo.ycsb.Client -load -db com.yahoo.ycsb.db.HBaseClient -P

Re: Hbase Question

2012-12-28 Thread yonghu
I think you can take a look at your row-key design and evenly distribute your data across your cluster, since, as you mentioned, even when you added more nodes there was no performance improvement. Maybe you have a node that is a hot spot while the other nodes have no work to do. regards! Yong On Tue, Dec

Re: Coprocessor slow down problem!

2012-12-02 Thread yonghu
30, 2012 at 2:04 PM, ramkrishna vasudevan ramkrishna.s.vasude...@gmail.com wrote: Hi Pls check if this issue is similar to HBASE-5897. It is fixed in 0.92.2 as i see from the fix versions. Regards Ram On Fri, Nov 30, 2012 at 1:13 PM, yonghu yongyong...@gmail.com wrote

Re: Major compactions not firing

2012-11-29 Thread yonghu
Did you check your log info? As far as I understand, if there is a major compaction, the event will be recorded in the log. regards! Yong On Fri, Nov 30, 2012 at 8:41 AM, Varun Sharma va...@pinterest.com wrote: Hi, I turned off automatic major compactions and tried to major compact all regions

Re: Major compactions not firing

2012-11-29 Thread yonghu
How did you disable major compaction, and which HBase version do you use? On Fri, Nov 30, 2012 at 8:51 AM, Varun Sharma va...@pinterest.com wrote: I see nothing like major compaction in the logs of the region server or the master... On Thu, Nov 29, 2012 at 11:46 PM, yonghu yongyong

Re: Column family names and data size on disk

2012-11-28 Thread yonghu
I like the illustration of Stack. regards! Yong On Wed, Nov 28, 2012 at 6:56 PM, Stack st...@duboce.net wrote: On Wed, Nov 28, 2012 at 6:40 AM, matan ma...@cloudaloe.org wrote: Why does the CF have to be in the HFile, isn't the entire HFile dedicated to just one CF to start with (I'm

Re: Can we insert into Hbase without specifying the column name?

2012-11-26 Thread yonghu
Hi Rams, yes, you can. See below:
hbase(main):001:0> create 'test1','course'
0 row(s) in 1.6760 seconds
hbase(main):002:0> put 'test1','tom','course',90
0 row(s) in 0.1040 seconds
hbase(main):003:0> scan 'test1'
ROW COLUMN+CELL tom column=course:,

Re: Log files occupy lot of Disk size

2012-11-23 Thread yonghu
I think you can call setWriteToWAL(false) to reduce the amount of log info. But you risk losing data if your cluster goes down. regards! yong On Fri, Nov 23, 2012 at 7:58 AM, iwannaplay games funnlearnfork...@gmail.com wrote: Hi, Everytime i query hbase or hive ,there is a significant

Re: why reduce doesn't work by HBase?

2012-11-16 Thread yonghu
a method 3) Use a break point while debugging The answer to your current problem : o.a.h.mapreduce.Reducer method has no Iterator parameter but it does have a Iterable parameter... Regards Bertrand PS : It is absolutely not related to HBase. On Thu, Nov 15, 2012 at 8:42 PM, yonghu yongyong

Re: Why Regionserver is not serving when I set the WAL trigger?

2012-11-12 Thread yonghu
the coprocessor and region not online. Pls check if your META region is online. Regards ram On Sat, Nov 10, 2012 at 8:37 PM, yonghu yongyong...@gmail.com wrote: Dear All, I used hbase 0.94.1 and implemented the test example of WAL trigger like: public class WalTrigger extends

Re: A question of storage structure for memstore?

2012-10-22 Thread yonghu
- From: Kevin O'dell [kevin.od...@cloudera.com] Sent: Monday, October 22, 2012 5:55 PM To: user@hbase.apache.org Subject: Re: A question of storage structure for memstore? Yes, there will be two memstores if you have two CFs. On Oct 22, 2012 7:25 AM, yonghu yongyong

Re: Can I use coprocessor to record the deleted data caused by ttl?

2012-09-01 Thread yonghu
From: yonghu yongyong...@gmail.com To: user@hbase.apache.org Sent: Friday, August 31, 2012 1:13 PM Subject: Can I use coprocessor to record the deleted data caused by ttl? Dear All, I wonder if I can use coprocessor to record the deleted data caused by ttl. Any ideas

Re: What happens in the HLog if data are deleted caused by TTL?

2012-08-22 Thread yonghu
in memstore. But the question is that if memstore doesn't have enough size to accept new incoming ttl data what will happen? Can anybody explain? Thanks! Yong On Wed, Aug 22, 2012 at 10:19 AM, yonghu yongyong...@gmail.com wrote: I can fully understand normal deletion. But, in my point of view

Re: What happens in the HLog if data are deleted caused by TTL?

2012-08-22 Thread yonghu
Sorry for that. I didn't use the right parameter. Now I get the point. regards! Yong On Wed, Aug 22, 2012 at 10:49 AM, Harsh J ha...@cloudera.com wrote: Hey Yonghu, You are right that TTL deletions (it isn't exactly a delete, its more of a compact-time skip wizardry) do not go to the HLog

Re: What happens in the HLog if data are deleted caused by TTL?

2012-08-21 Thread yonghu
. They aren't Deleted in the sense of what a Delete insert signifies, but are rather eliminated in the write process when new storefiles are written out - if the value being written to the compacted store has already expired. On Tue, Aug 14, 2012 at 8:40 PM, yonghu yongyong...@gmail.com wrote

Re: Put w/ timestamp - Deleteall - Put w/ timestamp fails

2012-08-15 Thread yonghu
Hi Harsh, I have a question about your description. The delete tombstone masks the newly inserted value with the old timestamp; that's why the newly inserted data can't be seen. But after a major compaction, this new value will be seen again. So the question is how the deletion really executes. In my
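The surprising behaviour in this thread, put with an old timestamp, deleteall, re-put with the same old timestamp, and the value stays invisible until a major compaction, follows directly from timestamp-based masking, which this sketch models (cells and timestamps are invented; real tombstones also come in column/family/row flavours):

```python
# Sketch: a delete marker at timestamp T masks every put with ts <= T.
# A major compaction removes both the masked puts present at compaction
# time AND the marker, so a LATER re-put with an old ts becomes visible.
def visible(cells, delete_ts):
    # delete_ts=None models "no marker" (e.g. after major compaction).
    return [(ts, v) for ts, v in cells
            if delete_ts is None or ts > delete_ts]

cells = [(100, "old")]
print(visible(cells, None))              # [(100, 'old')] - before delete

cells2 = cells + [(100, "re-put")]       # re-insert with the same old ts
print(visible(cells2, delete_ts=100))    # [] - marker still masks it
print(visible(cells2, None))             # both visible once marker is gone
```

So the delete never rewrites data in place; it only changes what reads are allowed to see, and compaction is the point where the physical state catches up with the logical one.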

What happens in the HLog if data are deleted caused by TTL?

2012-08-14 Thread yonghu
My HBase version is 0.92. I tried the following: 1. created a table 'test' with 'course' in which ttl=5; 2. inserted one row into the table. 5 seconds later, the row was deleted. Later, when I checked the log info of the 'test' table, I only found the insert information but not the delete

Re: is there anyway to turn off compaction in hbase

2012-08-13 Thread yonghu
Harsh is right. You find the wrong place. regards! Yong On Sun, Aug 12, 2012 at 1:40 PM, Harsh J ha...@cloudera.com wrote: Richard, The property disables major compactions from happening automatically. However, if you choose to do this, you should ensure you have a cron job that does

Re: column based or row based storage for HBase?

2012-08-05 Thread yonghu
In my understanding of the column-oriented structure of HBase, the first thing is the term column-oriented. It means that data which belongs to the same column family is stored contiguously on disk. Within each column family, the data is stored as a row store. If you want to understand the
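This "column-family-oriented" layout, one store per family, cells inside it ordered by row key, can be made concrete with a tiny sketch (rows, families, and values are invented):

```python
# Sketch: group cells by column family (one "store" each), then lay
# each store out in row-key order - not one file per column.
rows = {
    "r1": {"cf1": {"a": 1, "b": 2}, "cf2": {"x": 9}},
    "r2": {"cf1": {"a": 3},         "cf2": {"x": 8}},
}

stores = {}
for row in sorted(rows):                      # row-key order within a store
    for family, cols in rows[row].items():
        for qual, val in sorted(cols.items()):
            stores.setdefault(family, []).append((row, qual, val))

print(stores["cf1"])  # [('r1', 'a', 1), ('r1', 'b', 2), ('r2', 'a', 3)]
```

Reading only cf1 thus touches only cf1's files (the column-oriented benefit), but within a family a scan still reads whole rows' worth of that family's cells in order (the row-store part).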

Re: Why Hadoop can't find Reducer when Mapper reads data from HBase?

2012-07-12 Thread yonghu
12, 2012 at 1:15 PM, yonghu yongyong...@gmail.com wrote: java.lang.RuntimeException: java.lang.ClassNotFoundException: com.mapreducetablescan.MRTableAccess$MTableReducer; Does anybody know why? Its not in your job jar? Check the job jar (jar -tf JAR_FILE). St.Ack

Re: HBase RegionServer can't connect to Master

2012-05-04 Thread yonghu
I think you can also use ifconfig command in the VM to see the ip address. And then you can change your ip address in /etc/hosts. Regards! Yong On Wed, May 2, 2012 at 7:21 PM, Ben Lisbakken lisba...@gmail.com wrote: Hello -- I've got a problem where the RegionServers try to connect to

Re: Hbase custom filter

2012-05-02 Thread yonghu
It means that the Java runtime can't find the org/apache/hadoop/hbase/filter/FilterBase class. You have to add hbase.jar to your classpath. regards! Yong On Wed, May 2, 2012 at 12:12 PM, cldo datk...@gmail.com wrote: i want to custom filter hbase. i created jar file by eclipse, copy to server

Re: Are minor compaction and major compaction different in HBase 0.92?

2012-04-27 Thread yonghu
cleans out delete markers. Delete markers cannot be removed during a minor compaction since an affected KeyValue could exist in an HFile that is not part of this compaction. -- Lars From: yonghu yongyong...@gmail.com To: user@hbase.apache.org Sent: Wednesday

Are minor compaction and major compaction different in HBase 0.92?

2012-04-26 Thread yonghu
Hello, My HBase version is 0.92.0, and I find that when I use minor compaction and major compaction to compact a table, there is no difference. The minor compaction removes the deleted cells and discards excess data versions, which should be the task of a major compaction. I

Re: Problem to Insert the row that i was deleted

2012-04-25 Thread yonghu
As Lars mentioned, the row is not physically deleted. HBase inserts a cell called a tombstone which masks the deleted value, but the value is still there (if the deleted value is in the same memstore as the tombstone, it will be deleted in the memstore, so you will not

Re: HBase 0.92 with Hadoop 0.22

2012-04-16 Thread yonghu
Yes, you can compile the Hadoop jar file yourself and put it into the HBase lib folder. Regards! Yong On Mon, Apr 16, 2012 at 2:09 PM, Harsh J ha...@cloudera.com wrote: While I haven't tried this personally, it should be alright to do. You need to replace HBase's default hadoop jars (which

dump HLog content!

2012-04-15 Thread yonghu
Hello, My HBase version is 0.92.0, installed in pseudo-distributed mode. I found a strange situation with the HLog: after I inserted new data into a table, the size of the HLog was 0. I checked in HDFS: drwxr-xr-x - yonghu supergroup 0 2012-04-15 17:34 /hbase/.logs drwxr-xr-x - yonghu

Re: dump HLog content!

2012-04-15 Thread yonghu
Thanks for your reply. After nearly 60 minutes, I can see the HLog volume: -rw-r--r-- 3 yonghu supergroup 2125 2012-04-15 17:34 /hbase/.logs/yonghu-laptop,60020,1334504008467/yonghu-laptop%2C60020%2C1334504008467.1334504048854 I have no idea why it takes so long. Yong On Sun, Apr

Re: dump HLog content!

2012-04-15 Thread yonghu
byte.eventually it moves the log into .oldlogs also. Thanks Manish Sent from my BlackBerry, pls excuse typo -Original Message- From: yonghu yongyong...@gmail.com Date: Sun, 15 Apr 2012 18:58:45 To: user@hbase.apache.org Reply-To: user@hbase.apache.org Subject: Re: dump HLog content

A confusion about the RegionCoprocessorEnvironment.getRegion() method

2012-04-10 Thread yonghu
Hello, The description of this method is /** @return the region associated with this coprocessor */ and the return value is an HRegion instance. If I configure the region coprocessor class in hbase-site.xml, it means this coprocessor will be applied to every HRegion which resides on this

Re: A confusion about the RegionCoprocessorEnvironment.getRegion() method

2012-04-10 Thread yonghu
help explain things a bit more: https://blogs.apache.org/hbase/entry/coprocessor_introduction --gh On Tue, Apr 10, 2012 at 2:30 AM, yonghu yongyong...@gmail.com wrote: Hello, The description of this method is /** @return the region associated with this coprocessor */ and the return value

Re: Still Seeing Old Data After a Delete

2012-03-27 Thread yonghu
Hi Shwan, My HBase version is 0.92.0. I have to mention that recently I noticed that the delete semantics between the shell and the Java API are different. In the shell, if you delete one version, it will mask the versions whose timestamps are older than that version, meaning that scan will not return

Re: There is no data value information in HLog?

2012-03-20 Thread yonghu
. On Mon, Mar 19, 2012 at 7:31 AM, yonghu yongyong...@gmail.com wrote: Hello, I used the $ ./bin/hbase org.apache.hadoop.hbase.regionserver.wal.HLog --dump command to check the HLog information. But I can not find any data information. The output of my HLog file is looks like follows

Re: How can I see the right format data content when I export the table content from Hbase?

2012-03-18 Thread yonghu
Thanks for your response. Regards! Yong On Sat, Mar 17, 2012 at 10:09 PM, Stack st...@duboce.net wrote: On Sat, Mar 17, 2012 at 1:06 AM, yonghu yongyong...@gmail.com wrote: Hello, I have used the command ./hbase org.apache.hadoop.hbase.mapreduce.Export 'test' http://localhost:8020/test

Re: Can anyone show me how to construct the HFileReaderV2 object to read the HFile content.

2012-03-16 Thread yonghu
the command line. I want to know that why its code dose not work? Regards! Yong On Fri, Mar 16, 2012 at 5:50 PM, yonghu yongyong...@gmail.com wrote: Thanks for your information. Regards! Yong On Fri, Mar 16, 2012 at 5:39 PM, Stack st...@duboce.net wrote: On Fri, Mar 16, 2012 at 9:01 AM, yonghu

Re: Can anyone show me how to construct the HFileReaderV2 object to read the HFile content.

2012-03-16 Thread yonghu
I noticed the problem is that somehow I lost the data from hdfs. The code is ok. Regards! Yong On Fri, Mar 16, 2012 at 5:59 PM, yonghu yongyong...@gmail.com wrote: I implemented the code like this way. My Hbase version is 0.92.0.                Configuration conf = new Configuration

Re: Write data content into HFile.

2012-03-06 Thread yonghu
Thanks for your reply. But I am using hbase 0.90.2. There is no HFileWriterV2 class. Can you show me how to use HFileWriter constructor? Thanks Yong On Tue, Mar 6, 2012 at 3:19 PM, Konrad Tendera kon...@tendera.eu wrote: yonghu yongyong313@... writes: Hello, ... try something like

Re: Write data content into HFile.

2012-03-06 Thread yonghu
Thanks for your reply. I have already solved the problem. Yong On Tue, Mar 6, 2012 at 5:02 PM, Stack st...@duboce.net wrote: On Tue, Mar 6, 2012 at 6:48 AM, yonghu yongyong...@gmail.com wrote: Thanks for your reply. But I am using hbase 0.90.2. There is no HFileWriterV2 class. Can you show me

Re: Write data content into HFile.

2012-03-06 Thread yonghu
= new HFile.Writer(fs, new Path(hdfs://localhost:8020/test), 2, (Compression.Algorithm)null, null); the data is stored in the hdfs. regards! Yong On Tue, Mar 6, 2012 at 5:10 PM, Stack st...@duboce.net wrote: On Tue, Mar 6, 2012 at 8:07 AM, yonghu yongyong...@gmail.com wrote: Thanks for your

a question about append operation of HFile.Writer

2012-03-06 Thread yonghu
Hello, An HFile consists of many blocks. Suppose we have two blocks, b1 and b2, and the size of each block is 2K. In b1 we have two key-value pairs whose keys are t1 and t2, respectively. Each key-value pair is 1K, so b1 is full. Suppose that now we insert a new tuple whose key is also t1. The

a strange situation of HBase when I issue scan '.META.' command

2012-03-05 Thread yonghu
Hello, My HBase version is 0.90.2 and installed in pseudo mode. I have successfully inserted two tuples in the 'test' table. hbase(main):005:0 scan 'test' ROWCOLUMN+CELL jim column=course:english, timestamp=1330949116240, value=1.3 tom

Re: a strange situation of HBase when I issue scan '.META.' command

2012-03-05 Thread yonghu
      is the first region in a table.  If region has both an empty start and an empty end key, its the only region in the table On 3/5/12 7:27 AM, yonghu yongyong...@gmail.com wrote: Hello, My HBase version is 0.90.2 and installed in pseudo mode. I have successfully inserted two tuples in the 'test' table

Which server store the root and .meta. information?

2012-02-10 Thread yonghu
Hello, I read some articles which mention that before the client connects to the master node, it will first connect to the ZooKeeper node and find the location of the root region. So my question is: is the node which stores the root information different from the master node, or are they the same node?
