Re: Setting TTL at the row level

2017-06-23 Thread yonghu
I did not quite understand what you mean by "row timestamp"? As far as I know, a timestamp is associated with each data version (cell). Will you store multiple data versions in a single column? On Thu, Jun 22, 2017 at 4:35 AM, Jean-Marc Spaggiari < jean-m...@spaggiari.org> wrote: > Why not using th

Re: multiple data versions vs. multiple rows?

2015-01-20 Thread yonghu
If you want to compare the performances, you need to run way bigger and > > longer tests. Don't run them in parallel. Run them at least 10 times each > to > > make sure you have a good trend. Is the difference between the 2 > > significant? It should not. > > > > JM

Re: multiple data versions vs. multiple rows?

2015-01-19 Thread yonghu
give you the consistency of being > within a row. > > JM > > 2015-01-19 14:28 GMT-05:00 yonghu : > > > Dear all, > > > > I want to record the user history data. I know there exists two options, > > one is to store user events in a single row with multiple data

multiple data versions vs. multiple rows?

2015-01-19 Thread yonghu
Dear all, I want to record user history data. I know there exist two options: one is to store user events in a single row with multiple data versions, and the other is to use multiple rows. I wonder which one is better for performance? Thanks! Yong

Strange behavior when using MapReduce to process HBase.

2014-11-25 Thread yonghu
Hi, I wrote a copyTable MapReduce program. My HBase version is 0.94.16. Several rows in the source table contain multiple data versions. The Map function looks as follows: public void map(ImmutableBytesWritable rowKey, Result res, Context context) throws IOException, InterruptedException{ for(K

Is it possible to get the table name at the Map phase?

2014-11-21 Thread yonghu
Hello all, I want to implement a difference operator using MapReduce. I read two tables using MultiTableInputFormat. In the Map phase, I need to tag each row with the name of its table, but how can I get the table name? One way I can think of is to create an HTable instance for each table in the setup(

Is it possible to assign a specific key range to a specific node?

2014-11-02 Thread yonghu
Dear All, Suppose that I have a key range from 1 to 100 and want to store 1-50 on the first node and 51-100 on the second node. How can I do this in HBase? regards! Yong
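A shell-level answer is to pre-split the table at creation time (e.g. `create 't1', 'cf', SPLITS => ['51']`), since each region owns a contiguous row-key range and the balancer then assigns regions to servers. A toy sketch of that routing logic follows; the split point and zero-padded keys are illustrative only (HBase compares keys as bytes, so padding matters):

```python
# Toy model of how pre-split regions route keys: each region owns a
# contiguous [start, end) key range, as HBase regions do.
def build_regions(split_points):
    """Turn sorted split points into [start, end) ranges (None = open end)."""
    bounds = [None] + list(split_points) + [None]
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]

def region_for(regions, key):
    """Return the index of the region whose range contains `key`."""
    for i, (start, end) in enumerate(regions):
        if (start is None or key >= start) and (end is None or key < end):
            return i
    raise KeyError(key)

regions = build_regions(["051"])          # one split point -> two regions
assert region_for(regions, "001") == 0    # keys 1-50 land in region 0
assert region_for(regions, "100") == 1    # keys 51-100 land in region 1
```

Note that unpadded decimal keys ("100" < "51" lexicographically) would route incorrectly, which is why the sketch zero-pads.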

A use case for ttl deletion?

2014-09-26 Thread yonghu
Hello, Can anyone give me a concrete use case for TTL deletions? I mean, in which situations should we set the TTL property? regards! Yong

Re: hbase memstore size

2014-08-06 Thread yonghu
I did not quite understand your problem. You store your data in HBase, and I guess later you will also read data from it. Generally, HBase first checks whether the data exists in the memstore; if not, it checks the disk. If you set the memstore to 0, every read will be forwarded directly to dis
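The read path described here can be sketched as a toy model (a simplification for illustration — real HBase merges memstore and HFile scanners rather than checking one then the other):

```python
# Simplified sketch of the described read path: check the in-memory
# store first, then fall back to the on-disk store.
memstore = {}
disk = {}

def put(key, value):
    memstore[key] = value

def flush():
    disk.update(memstore)   # flush: persist memstore contents to "disk"
    memstore.clear()

def get(key):
    if key in memstore:
        return memstore[key]
    return disk.get(key)

put("row1", "v1")
assert get("row1") == "v1"                      # served from memstore
flush()
assert memstore == {} and get("row1") == "v1"   # now served from "disk"
```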

Re: HBase appends

2014-07-22 Thread yonghu
Hi, If an author does not have hundreds of publications, you can directly write them into one column. Hence, your column will contain multiple data versions. The default number of versions kept is 3, but you can set more. On Tue, Jul 22, 2014 at 4:20 AM, Ishan Chhabra wrote: > Arun, > You need to represent your
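The versioning behavior described here can be sketched as a toy model (illustrative only; in HBase the real knob is the per-family VERSIONS setting):

```python
import bisect

# Toy versioned cell: each put stores (timestamp, value) and only the
# newest `max_versions` survive, mirroring HBase's VERSIONS behavior.
class Cell:
    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        self.versions = []          # sorted by timestamp, oldest first

    def put(self, ts, value):
        bisect.insort(self.versions, (ts, value))
        self.versions = self.versions[-self.max_versions:]  # keep newest N

    def get(self):
        return self.versions[-1][1]  # newest version, like a default Get

c = Cell(max_versions=3)
for ts in range(1, 6):
    c.put(ts, "pub-%d" % ts)
assert c.get() == "pub-5"
assert [ts for ts, _ in c.versions] == [3, 4, 5]  # oldest versions dropped
```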

Re: Moving older data versions to archive

2014-04-03 Thread yonghu
I think you can define coprocessors to do this. For example, for every Put command you can keep the desired versions and later put the older versions into another table or HDFS. Finally, either let HBase delete your stale data or let the coprocessor do that for you. The problem of this

Re: single node's performance and cluster's performance

2014-04-03 Thread yonghu
I think the right understanding is that it will slow down query processing. You can think of the RS that hits heavy I/O as a hotspot node. It will not slow down the whole cluster; it will only slow down the applications which access data from that RS. On Thu, Apr 3, 2014 at 3:58

Re: LSM tree, SSTable and fractal tree

2014-02-28 Thread yonghu
HBase uses LSM trees and SSTables; I am not sure about fractal trees. On Fri, Feb 28, 2014 at 2:57 PM, Shahab Yunus wrote: > http://www.slideshare.net/jaxlondon2012/hbase-advanced-lars-george > http://hortonworks.com/hadoop/hbase/ > > Regards, > Shahab > > > On Fri, Feb 28, 2014 at 8:36 AM, Vimal Jain wro

Re: When should we trigger a major compaction?

2014-02-21 Thread yonghu
Before you trigger a major compaction, let's first explain why we need one. A major compaction will: 1. delete data which is masked by a tombstone; 2. delete data whose TTL has expired; 3. compact several small HFiles into a single larger one. I didn't quite unde
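The three effects listed above can be sketched in a toy compaction (illustrative only, not HBase internals; timestamps and TTLs are in arbitrary units):

```python
# Toy major compaction over KeyValues: merges several "files" into one,
# dropping (1) cells masked by a delete tombstone and (2) cells past
# their TTL -- the three effects listed in the mail.
def major_compact(files, tombstones, ttl, now):
    merged = {}
    for f in files:                       # 3. merge small files into one
        for (key, ts), value in f.items():
            if key in tombstones and ts <= tombstones[key]:
                continue                  # 1. masked by a tombstone
            if now - ts > ttl:
                continue                  # 2. TTL expired
            merged[(key, ts)] = value
    return merged

files = [{("a", 900): "x"}, {("b", 100): "y"}, {("c", 950): "z"}]
out = major_compact(files, tombstones={"c": 960}, ttl=500, now=1000)
assert out == {("a", 900): "x"}   # "b" expired (TTL), "c" tombstoned
```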

Re: TTL forever

2014-02-18 Thread yonghu
I also calculated the years of ttl, just for fun. :). But as Jean said, default ttl is forever. On Tue, Feb 18, 2014 at 2:05 PM, Jean-Marc Spaggiari < jean-m...@spaggiari.org> wrote: > Hi Mohamed, > > Default value is MAX_VALUE, which is considered as "forever". So default > TTL is NOT 69 years.

Re: What's the Common Way to Execute an HBase Job?

2014-02-11 Thread yonghu
Hi, To process the data in HBase, you have different options: 1. a Java program using the HBase API; 2. a MapReduce program; 3. high-level languages such as Hive or Pig (built on top of MapReduce); 4. Phoenix, also a high-level language (built on coprocessors). Which one you should use depends

Re: Newbie question: Rowkey design

2013-12-17 Thread yonghu
In my opinion, it really depends on your queries. The first one achieves data locality: there is no additional data transfer between different nodes. But this strategy sacrifices parallelism, and the node which stores A will become a hot node if too many applications try to access A. The second appro

Re: Online/Realtime query with filter and join?

2013-11-29 Thread yonghu
The question is what you mean by "real-time". What is your performance requirement? In my opinion, I don't think MapReduce is suitable for real-time data processing. On Fri, Nov 29, 2013 at 9:55 AM, Azuryy Yu wrote: > you can try phoniex. > On 2013-11-29 3:44 PM, "Ramon Wang" wrote: > > >

Re: How to create HTableInterface in coprocessor?

2013-10-24 Thread yonghu
s fully understand the case and verify the fix. > > > > Thanks > > > > > > On Tue, Oct 22, 2013 at 12:51 PM, Ted Yu wrote: > > > >> I logged HBASE-9819 to backport HBASE-8372 'Provide mutability to > >> CompoundConfiguration' to 0.94 &

Re: How to create HTableInterface in coprocessor?

2013-10-22 Thread yonghu
e() or hc.getTable(). regards! Yong On Tue, Oct 22, 2013 at 8:42 PM, Ted Yu wrote: > There're two types of exceptions. In the code below, I saw rce.getTable() > being commented out. > > Can you tell us the correlation between types of exception and getTable() > calls ? >

Re: How to create HTableInterface in coprocessor?

2013-10-22 Thread yonghu
nection has to be unmanaged."); > } > > Cheers > > > On Tue, Oct 22, 2013 at 11:14 AM, yonghu wrote: > > > Ted, > > > > Can you tell me how to dump the stack trace of HBase? By the way, I check > > the log of RegionServer. It has

Re: How to create HTableInterface in coprocessor?

2013-10-22 Thread yonghu
tion.java:61) at org.apache.hadoop.hbase.regionserver.RegionCoprocessorHost.prePut(RegionCoprocessorHost.java:808) ... 9 more : 1 time, servers with issues: hans-laptop:60020, On Tue, Oct 22, 2013 at 8:14 PM, yonghu wrote: > Ted, > > Can you tell me how to dump the stack trace of HBa

Re: How to create HTableInterface in coprocessor?

2013-10-22 Thread yonghu
roblem ? > > Cheers > > > On Tue, Oct 22, 2013 at 11:01 AM, yonghu wrote: > > > Gray, > > > > Finally, I saw the error messages. ERROR: > > org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException: > Failed > > 1 action: o

Re: How to create HTableInterface in coprocessor?

2013-10-22 Thread yonghu
' threw: 'java.lang.UnsupportedOperationException: Immutable Configuration' and has been removed from the active coprocessor set. I will try a different approach, as Ted mentioned. On Tue, Oct 22, 2013 at 7:49 PM, yonghu wrote: > Gray > > Thanks for your response. I tried your appro

Re: How to create HTableInterface in coprocessor?

2013-10-22 Thread yonghu
specially 9.3.1.1. > > > > > > On Tue, Oct 22, 2013 at 9:37 AM, yonghu wrote: > > > > > Hello, > > > > > > In the oldest verison of HBase , I can get the HTableInterface by > > > HTablePool.getTable() method. However, in the latest Hbase

How to create HTableInterface in coprocessor?

2013-10-22 Thread yonghu
Hello, In older versions of HBase, I could get an HTableInterface via the HTablePool.getTable() method. However, in the latest HBase version, 0.94.12, HTablePool is deprecated. So I tried to use HConnectionManager to create an HTableInterface, but it does not work. Can anyone tell me how to create HTab

When will a log file be transferred from the /.logs folder to the /.oldlogs folder?

2013-10-17 Thread yonghu
Hello, I saw some descriptions saying that once data modifications (like Put or Delete) have been persisted to disk, the log file will be transferred from the /.logs folder to the /.oldlogs folder. However, I made a simple test: I first created a table, inserted several rows, and then used the flush comma

Re: How to understand the TS of each data version?

2013-09-28 Thread yonghu
unknown, you can store a special marker in the Cell. > I used two rows, but as you said, the two Cells can be written using one > RPC call. > > This way, NetworkSupplier column is not needed. > > Cheers > > > On Fri, Sep 27, 2013 at 3:04 PM, yonghu wrote: > > &g

Re: How to understand the TS of each data version?

2013-09-27 Thread yonghu
#x27;d see that supplier c has 15k iff you query the > latest data, which seems to be what you want. > Note that you could also query as of TS 4 (c:20k), TS3 (d:20k), TS2 (d:10k) > > > -- Lars > > > > > From: yonghu > To: user@hbase.apa

Re: How to understand the TS of each data version?

2013-09-27 Thread yonghu
(1, 3, 5) are timestamps. regards! Yong On Fri, Sep 27, 2013 at 4:47 PM, Ted Yu wrote: > In {10K:1, 20K:3, 15K:5}, what does the value (1, 3, 5) represent ? > > Cheers > > > On Fri, Sep 27, 2013 at 7:24 AM, yonghu wrote: > > > Hello, > > > > In my un

How to understand the TS of each data version?

2013-09-27 Thread yonghu
Hello, In my understanding, the timestamp of each data version is generated by the Put command. The value of the TS is either specified by the user or assigned by HBase itself. If the TS is generated by HBase, it only records when (the time point) that data version was generated (having no meaning to the applica

Re: hbase schema design

2013-09-18 Thread yonghu
Unlike in an RDBMS, the data in HBase is stored as key-value pairs in HDFS. Hence, for every data version in a cell, the row key appears again. On Tue, Sep 17, 2013 at 7:53 PM, Ted Yu wrote: > w.r.t. Data Block Encoding, you can find some performance numbers here: > > > https://issues.apache
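A small sketch of why this matters for sizing; the key layout is simplified (real KeyValues also carry length fields and a key type), but it shows the row key and family name being paid for once per stored version:

```python
# Simplified size of one stored cell: HBase persists each cell as a full
# KeyValue (row key, column family, qualifier, timestamp, value), so the
# row key and family name are repeated for every version of every column.
def kv_size(row, family, qualifier, ts_bytes, value):
    return len(row) + len(family) + len(qualifier) + ts_bytes + len(value)

# One row, one column, three versions: the 15-byte row key is stored 3x,
# which is why short row keys and short CF names save real disk space.
sizes = [kv_size("user00000001abc", "cf", "q", 8, "v%d" % i) for i in range(3)]
assert sizes == [28, 28, 28]
```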

Re: one column family but lots of tables

2013-08-24 Thread yonghu
I think you can take a look at the http://hbase.apache.org/book/regions.arch.html, it describes the data storage hierarchy of HBase. Due to the statement of Lars "stems from the fact that HBase flushes by region (which means all stores of that region are flushed)", you can think the limitations of

Re: Does HBase support parallel table scans if I use MapReduce?

2013-08-21 Thread yonghu
table (look at TableInputFormat to see how it is > done). The map tasks will run in parallel. > > Jeff > > > On Tue, Aug 20, 2013 at 8:45 AM, yonghu wrote: > > > Hello, > > > > I know if I use default scan api, HBase scans table in a serial manner, > as >

Does HBase support parallel table scans if I use MapReduce?

2013-08-20 Thread yonghu
Hello, I know that if I use the default scan API, HBase scans the table in a serial manner, as it needs to guarantee the order of the returned tuples. My question is: if I use MapReduce to read the HBase table and directly output the results to HDFS, not returning them to the client, is the HBase scan still in a se

Re: slow operation in postPut

2013-08-01 Thread yonghu
Use HTablePool instead. For more info, see http://hbase.apache.org/book/client.html. On Thu, Aug 1, 2013 at 3:32 PM, yonghu wrote: > If I want to use multiple threads in a thread-safe way, which class should I use? > > > On Thu, Aug 1, 2013 at 3:08 PM, Ted Yu wrote: >> HTab

Re: slow operation in postPut

2013-08-01 Thread yonghu
If I want to use multiple threads in a thread-safe way, which class should I use? On Thu, Aug 1, 2013 at 3:08 PM, Ted Yu wrote: > HTable is not thread safe. > > On Aug 1, 2013, at 5:58 AM, Pavel Hančar wrote: > > > Hello, > > I have a class extending BaseRegionObserver and I use the postPut method > t

Re: Data missing in import bulk data

2013-07-24 Thread yonghu
The ways you can lose data, in my view: 1. Some tuples share the same row-key+cf+column. Hence, when you load your data into HBase, they will be loaded into the same column and may exceed the predefined max versions. 2. As Ted mentioned, you may import some deletes; do you generate tomb

Re: How to join 2 tables using hadoop?

2013-07-19 Thread yonghu
You can write one MR job to do this. First, read both tables in the Map function; the output key will be the reference key for one table and the primary key for the other. In the Reduce function, you can "join" the tuples which share the same key. Please note this is a very naive approach; for m
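The naive reduce-side join described here can be sketched as follows; the table and field names are made up, and the shuffle is simulated with a dict:

```python
from collections import defaultdict

# Sketch of a naive reduce-side join: the map phase tags each tuple with
# its source table, the reduce phase joins tuples sharing the same key.
def map_phase(orders, customers):
    for cust_id, order in orders:       # foreign key as the output key
        yield cust_id, ("orders", order)
    for cust_id, name in customers:     # primary key as the output key
        yield cust_id, ("customers", name)

def reduce_phase(mapped):
    groups = defaultdict(lambda: {"orders": [], "customers": []})
    for key, (table, value) in mapped:  # "shuffle": group by key
        groups[key][table].append(value)
    for key, g in sorted(groups.items()):   # join within each group
        for name in g["customers"]:
            for order in g["orders"]:
                yield key, name, order

rows = list(reduce_phase(map_phase(
    orders=[(1, "book"), (2, "pen")],
    customers=[(1, "alice"), (2, "bob")])))
assert rows == [(1, "alice", "book"), (2, "bob", "pen")]
```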

Re: several doubts about region split?

2013-07-17 Thread yonghu
alancer is supposed to offload one of the daughter regions > if continuous write load incurs. > > Cheers > > On Wed, Jul 17, 2013 at 6:53 AM, yonghu wrote: > > > Dear all, > > > > From the HBase reference book, it mentions that when RegionServer splits > > regio

several doubts about region split?

2013-07-17 Thread yonghu
Dear all, From the HBase reference book, it mentions that when a RegionServer splits regions, it will offline the split region, add the daughter regions to META, open the daughters on the parent's hosting RegionServer, and then report the split to the Master. I have several questions: 1.

a question about assigning timestamp?

2013-07-13 Thread yonghu
Hello, From the reference book, "5.8.1.4. Put": if I issue a Put command without specifying a timestamp, the server will generate a TS for me. I wonder whether "server" means the master node or the regionservers? In my understanding, the server means the regionserver, as the master only tells the

Re: Deleting range of rows from Hbase.

2013-07-04 Thread yonghu
he actual execution > will happen at server side.. (This is what will happen with Endpoints :) ) > > -Anoop- > > On Thu, Jul 4, 2013 at 4:29 PM, yonghu wrote: > > > Hi Anoop > > one more question. Can I use BulkDeleteEndpoint at the client side or > > should I

Re: Deleting range of rows from Hbase.

2013-07-04 Thread yonghu
> You can have a look at BulkDeleteEndpoint which can do what you want to > > -Anoop- > > On Thu, Jul 4, 2013 at 4:09 PM, yonghu wrote: > > > I check the latest api of Delete class. I am afraid you have to do it by > > yourself. > > > > regards! > &g

Re: Deleting range of rows from Hbase.

2013-07-04 Thread yonghu
I checked the latest API of the Delete class. I am afraid you have to do it yourself. regards! Yong On Wed, Jul 3, 2013 at 6:46 PM, Rahul Bhattacharjee wrote: > Hi, > > Like scan with range. I would like to delete rows with range. Is this > supported from hbase shell ? > > Lets say I have a table
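The usual client-side workaround (absent the BulkDeleteEndpoint mentioned later in the thread) is scan-then-delete: scan the key range, then issue a Delete per returned row. A toy model over a sorted table, for illustration only:

```python
# Toy scan-then-delete over a key range, mimicking the client-side
# workaround: "scan" collects matching row keys, then each is deleted.
def delete_range(table, start_row, stop_row):
    """Delete rows with start_row <= key < stop_row; return the count."""
    to_delete = [k for k in table if start_row <= k < stop_row]  # "scan"
    for k in to_delete:
        del table[k]                                             # "delete"
    return len(to_delete)

table = {"row1": "a", "row2": "b", "row3": "c", "row9": "d"}
assert delete_range(table, "row2", "row4") == 2
assert sorted(table) == ["row1", "row9"]
```

Note the range is half-open, matching HBase scan semantics (startRow inclusive, stopRow exclusive).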

Re: HBase failure scenarios

2013-06-10 Thread yonghu
Hi Lucas, First, a write request in HBase consists of two parts: 1. write into the WAL; 2. write into the Memstore; when the Memstore reaches its threshold, the data in the Memstore is flushed to disk. In my understanding, there are two data synchronization points: the first one is the write to the WAL. As WA

Re: Poor HBase map-reduce scan performance

2013-06-05 Thread yonghu
be tackled. > > Cheers > > On Wed, Jun 5, 2013 at 7:55 AM, yonghu wrote: > > > Can anyone explain why client + rpc + server will decrease the > performance > > of scanning? I mean the Regionserver and Tasktracker are the same node > when > > you use MapReduc

Re: Poor HBase map-reduce scan performance

2013-06-05 Thread yonghu
Can anyone explain why client + rpc + server decreases scanning performance? I mean, the RegionServer and the TaskTracker are on the same node when you use MapReduce to scan the HBase table. So, in my understanding, there should be no RPC cost. Thanks! Yong On Wed, Jun 5, 2013 at 10:09 AM, San

Re: Change data capture tool for hbase

2013-06-03 Thread yonghu
Hello, I have presented 5 CDC approaches based on HBase and published my results at ADBIS 2013. regards! Yong On Mon, Jun 3, 2013 at 11:16 AM, yavuz gokirmak wrote: > Hi all, > > Currently we are working on a hbase change data capture (CDC) tool. I want > to share our ideas and continue deve

Re: Doubt Regading HLogs

2013-05-17 Thread yonghu
In this situation, you can set the > > > hbase.regionserver.logroll.period > > 360 > > to a short value, let's say 3000, and then you can see your log file with its current size after 3 seconds. To Nicolas: I guess he wants to somehow analyze the HLog. regards! Yong On Fri, May 1

How can I set heap size for HBase?

2013-03-11 Thread yonghu
Dear All, I wonder how I can set the heap size for HBase, and what a suitable portion is compared to the whole memory size. The other question is how much memory I need to give to the JVM when I run HBase, as I sometimes get "out of memory" problems. Thanks! Yong

Re: Possible to delete a specific cell?

2013-03-07 Thread yonghu
Hello, I think you can use HBase's org.apache.hadoop.hbase.client.Delete class. It already supports deleting a specific version in a cell; see the public Delete deleteColumn(byte[] family, byte[] qualifier, long timestamp) method. regards! Yong On Thu, Mar 7, 2013 at 9:25 PM, Jonathan Natkins wrot
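The semantics of deleting one exact version can be sketched as a toy model (not the HBase API itself; the values reuse the {10K:1, 20K:3, 15K:5} example from the "TS of each data version" thread below):

```python
# Toy model of deleting one specific version of a cell by timestamp, as
# Delete.deleteColumn(family, qualifier, ts) does: only the version with
# exactly that timestamp is removed; the other versions survive.
def delete_version(versions, ts):
    return {t: v for t, v in versions.items() if t != ts}

versions = {1: "10K", 3: "20K", 5: "15K"}   # ts -> value
versions = delete_version(versions, 3)
assert versions == {1: "10K", 5: "15K"}     # only the ts=3 version removed
```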

Re: Hbase table with a nested entity

2013-02-27 Thread yonghu
Hello Dastgiri, I don't think HBase can support the original nested schema which you want to define, but you can still store your data in HBase. I figured out several possible solutions: 1. row_key: profileid + profilename + date; the columns will be monthwiseProfileCount:uk and so on. However, this a

Re: coprocessor enabled put very slow, help please~~~

2013-02-18 Thread yonghu
run a M/R job that rebuilds the index should > something occur to the system where you might lose the data. Indexes *ARE* > expendable. ;-) > > Does that explain it? > > -Mike > > On Feb 18, 2013, at 4:57 AM, yonghu wrote: > >> Hi, Michael >> >> I don&#x

Re: coprocessor enabled put very slow, help please~~~

2013-02-18 Thread yonghu
Hi, Michael, I don't quite understand what you mean by "round trip back to the client". In my understanding, as the RegionServer and TaskTracker can be on the same node, MR doesn't have to pull data to the client and then process it. You also mention the "unnecessary overhead"; can you explain a litt

Re: coprocessor enabled put very slow, help please~~~

2013-02-18 Thread yonghu
Forgot to say: I also tested MapReduce. It's faster than the coprocessor. On Mon, Feb 18, 2013 at 10:01 AM, yonghu wrote: > Parkash, > > I have a six nodes cluster and met the same problem as you had. In my > test, inserting one tuple using coprocessor is nearly 10 times slower

Re: coprocessor enabled put very slow, help please~~~

2013-02-18 Thread yonghu
Parkash, I have a six-node cluster and met the same problem as you had. In my test, inserting one tuple using a coprocessor is nearly 10 times slower than a normal put operation. I think the main reason is what Lars pointed out: the main overhead is executing RPCs. regards! Yong On Mon, Feb 18, 201

Re: Is it possible to indicate the column scan order when scanning table?

2013-02-07 Thread yonghu
Thanks for your response. I will take a look. yong On Thu, Feb 7, 2013 at 10:11 PM, Ted Yu wrote: > Yonghu: > You may want to take a look at HBASE-5416: Improve performance of scans > with some kind of filters. > It would be in the upcoming 0.94.5 release. > > You can desi

Re: Is it possible to indicate the column scan order when scanning table?

2013-02-07 Thread yonghu
7, 2013 at 6:29 PM, Ted Yu wrote: > Can you give us the use case where the scanning order is significant ? > > Thanks > > On Thu, Feb 7, 2013 at 9:23 AM, yonghu wrote: > >> Dear all, >> >> I wonder if it is possible to indicate the column scan order when >&

Re: Json+hbase

2013-02-04 Thread yonghu
I think you can treat id as the row key, and address-type/home or address-type/office as column families; each address can be treated as a column. The question is how you transform your JSON metadata into HBase schema information. regards! Yong On Mon, Feb 4, 2013 at 11:28 AM, wrote: > Hi, > > >
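The proposed mapping can be sketched as a flattening step from JSON into HBase-style (row_key, column_family, qualifier, value) tuples; the input document below is made up for illustration:

```python
import json

# Flatten a JSON document into HBase-style cells: id as the row key,
# the address type as the column family, each address field as a
# qualifier. The document shape here is hypothetical.
def to_kvs(doc):
    row_key = str(doc["id"])
    for addr_type, fields in doc["addresses"].items():
        for qualifier, value in fields.items():
            yield (row_key, addr_type, qualifier, value)

doc = json.loads('{"id": 7, "addresses": '
                 '{"home": {"city": "Berlin"}, "office": {"city": "Bonn"}}}')
kvs = sorted(to_kvs(doc))
assert kvs == [("7", "home", "city", "Berlin"),
               ("7", "office", "city", "Bonn")]
```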

Re: Is there a way to close automatically log deletion in HBase?

2013-02-02 Thread yonghu
version then you have a way to plugin your own > logcleaner class. > 'BaseLogCleanerDelegate' is the default thing available. Customise your > logcleaner as per your requirement so that you can have a back up of the > logs. > > Hope this helps. > > Regards > Ra

Re: Is there a way to close automatically log deletion in HBase?

2013-02-02 Thread yonghu
the WAL logs. Sorry am not > getting your question here. > WAL trigger is for the WAL logs. > > Regards > Ram > > On Sat, Feb 2, 2013 at 1:31 PM, yonghu wrote: > >> Hello, >> >> For some reasons, I need to analyze the log of hbase. However, the log &

How can I set column information when I use YCSB to test HBase?

2013-01-18 Thread yonghu
Dear all, I read the information of https://github.com/brianfrankcooper/YCSB/wiki/Running-a-Workload For example, I can indicate the column family name when I issue the command line as java -cp build/ycsb.jar:db/hbase/lib/* com.yahoo.ycsb.Client -load -db com.yahoo.ycsb.db.HBaseClient -P workload

Re: Hbase Question

2012-12-28 Thread yonghu
I think you can take a look at your row-key design and evenly distribute your data across your cluster; as you mentioned, even when you added more nodes there was no performance improvement. Maybe you have one node that is a hot spot while the other nodes have no work to do. regards! Yong On Tue, Dec 2
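One standard fix for such a hot spot is to "salt" the row key with a hash-derived prefix so that consecutive keys spread over all regions; a sketch (the bucket count and key format are illustrative):

```python
import hashlib

# Salt a row key with a stable hash-derived prefix so sequential keys
# land in different key ranges (and hence different regions/nodes).
def salted_key(row_key, buckets=4):
    salt = int(hashlib.md5(row_key.encode()).hexdigest(), 16) % buckets
    return "%d-%s" % (salt, row_key)

keys = [salted_key("user%05d" % i) for i in range(1000)]
prefixes = {k.split("-")[0] for k in keys}
assert prefixes == {"0", "1", "2", "3"}   # load spread across all buckets
```

The trade-off: range scans over the original key order now require one scan per salt bucket.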

Re: Coprocessor slow down problem!

2012-12-02 Thread yonghu
>> -Anoop- >> >> >> On Fri, Nov 30, 2012 at 2:04 PM, ramkrishna vasudevan < >> ramkrishna.s.vasude...@gmail.com> wrote: >> >> > Hi >> > >> > Pls check if this issue is similar to HBASE-5897. It is fixed in 0.92.2 >> as >> >

Re: Major compactions not firing

2012-11-29 Thread yonghu
How did you disable major compaction, and which HBase version do you use? On Fri, Nov 30, 2012 at 8:51 AM, Varun Sharma wrote: > I see nothing like major compaction in the logs of the region server or the > master... > > On Thu, Nov 29, 2012 at 11:46 PM, yonghu wrote: >>

Re: Major compactions not firing

2012-11-29 Thread yonghu
Did you check your log info? As far as I understand, if there is a major compaction, the event will be recorded in the log. regards! Yong On Fri, Nov 30, 2012 at 8:41 AM, Varun Sharma wrote: > Hi, > > I turned off automatic major compactions and tried to major compact all > regions via both the re

Re: Column family names and data size on disk

2012-11-28 Thread yonghu
I like the illustration of Stack. regards! Yong On Wed, Nov 28, 2012 at 6:56 PM, Stack wrote: > On Wed, Nov 28, 2012 at 6:40 AM, matan wrote: > >> Why does the CF have to be in the HFile, isn't the entire HFile dedicated >> to >> just one CF to start with (I'm speaking at the HBase architectur

Re: Can we insert into Hbase without specifying the column name?

2012-11-26 Thread yonghu
Hi Rams, yes, you can. See below:
hbase(main):001:0> create 'test1','course'
0 row(s) in 1.6760 seconds
hbase(main):002:0> put 'test1','tom','course',90
0 row(s) in 0.1040 seconds
hbase(main):003:0> scan 'test1'
ROW          COLUMN+CELL
 tom         column=course:, timestamp

Re: Log files occupy lot of Disk size

2012-11-23 Thread yonghu
I think you can call setWriteToWAL(false) to reduce the amount of log information, but you run a risk when your cluster goes down. regards! yong On Fri, Nov 23, 2012 at 7:58 AM, iwannaplay games wrote: > Hi, > > Everytime i query hbase or hive ,there is a significant growth in my > log file

Re: why reduce doesn't work by HBase?

2012-11-16 Thread yonghu
ng a method > 3) Use a break point while debugging > > The answer to your current problem : o.a.h.mapreduce.Reducer method has no > Iterator parameter but it does have a Iterable parameter... > > Regards > > Bertrand > > PS : It is absolutely not related to HBase. >

Re: Why is the RegionServer not serving when I set the WAL trigger?

2012-11-12 Thread yonghu
on not online. > Pls check if your META region is online. > > Regards > ram > > On Sat, Nov 10, 2012 at 8:37 PM, yonghu wrote: > >> Dear All, >> >> I used hbase 0.94.1 and implemented the test example of WAL trigger like: >> >> public class Wal

Re: A question of storage structure for memstore?

2012-10-22 Thread yonghu
gt; > -Anoop- > > From: Kevin O'dell [kevin.od...@cloudera.com] > Sent: Monday, October 22, 2012 5:55 PM > To: user@hbase.apache.org > Subject: Re: A question of storage structure for memstore? > > Yes, there will be two memstores if you have two CFs. &

Re: Can I use coprocessor to record the deleted data caused by ttl?

2012-09-01 Thread yonghu
at is your usecase for this? > > -- Lars > > > > > From: yonghu > To: user@hbase.apache.org > Sent: Friday, August 31, 2012 1:13 PM > Subject: Can I use coprocessor to record the deleted data caused by ttl? > > Dear All, > > I wonder if I can

Re: What happens in the HLog if data are deleted because of TTL?

2012-08-22 Thread yonghu
Sorry for that. I didn't use the right parameter. Now I get the point. regards! Yong On Wed, Aug 22, 2012 at 10:49 AM, Harsh J wrote: > Hey Yonghu, > > You are right that TTL "deletions" (it isn't exactly a delete, its > more of a compact-time skip wizardry) do

Re: What happens in the HLog if data are deleted because of TTL?

2012-08-22 Thread yonghu
cating LruBlockCache with maximum size 247.9m Scanned kv count -> 1 so, I guess the ttl data is only managed in memstore. But the question is that if memstore doesn't have enough size to accept new incoming ttl data what will happen? Can anybody explain? Thanks! Yong On Wed, Aug 22, 2012 a

Re: What happens in the HLog if data are deleted because of TTL?

2012-08-22 Thread yonghu
tml > > > ./Zahoor > HBase Musings > > > On 14-Aug-2012, at 6:54 PM, Harsh J wrote: > >> Hi Yonghu, >> >> A timestamp is stored along with each insert. The ttl is maintained at >> the region-store level. Hence, when the log replays, all ent

Re: What happens in the HLog if data are deleted because of TTL?

2012-08-21 Thread yonghu
;> >> Yes, TTL deletions are done only during compactions. They aren't >> "Deleted" in the sense of what a Delete insert signifies, but are >> rather eliminated in the write process when new >> storefiles are written out - if the value being written to the &

Re: Put w/ timestamp -> Deleteall -> Put w/ timestamp fails

2012-08-15 Thread yonghu
Hi Harsh, I have a question about your description. The delete tag masks the newly inserted value with the old timestamp; that's why the newly inserted data can't be seen. But after a major compaction, this new value will be seen again. So the question is how the deletion really executes. In my understand
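The masking semantics under discussion can be sketched as a toy model (a simplification: the "resurrection" after major compaction only happens when the re-put survives the compaction that drops the tombstone, e.g. because it was flushed separately):

```python
# Toy model of delete-marker masking: a tombstone at timestamp T masks
# every put with ts <= T, even puts issued *after* the delete. Once a
# major compaction has removed the tombstone (and the re-put survived),
# the old-timestamped value becomes visible again.
def visible(puts, tombstone_ts):
    return {ts: v for ts, v in puts.items()
            if tombstone_ts is None or ts > tombstone_ts}

puts = {100: "v1"}
tombstone = 100                 # deleteall covering ts=100
puts[100] = "v2"                # re-put with the same old timestamp
assert visible(puts, tombstone) == {}       # masked: the put seems to fail
assert visible(puts, None) == {100: "v2"}   # tombstone gone: visible again
```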

Re: What happens in the HLog if data are deleted because of TTL?

2012-08-14 Thread yonghu
Hi Harsh, Thanks for your reply. If I understand you right, it means the TTL deletion will not be reflected in the log. On Tue, Aug 14, 2012 at 3:24 PM, Harsh J wrote: > Hi Yonghu, > > A timestamp is stored along with each insert. The ttl is maintained at > the region-store level. Hence,

What happens in the HLog if data are deleted because of TTL?

2012-08-14 Thread yonghu
My HBase version is 0.92. I tried the following: 1. created a table 'test' with 'course' in which ttl=5; 2. inserted one row into the table. 5 seconds later, the row was deleted. Later, when I checked the log information of the 'test' table, I only found the insert information but not the delete informat

Re: is there anyway to turn off compaction in hbase

2012-08-13 Thread yonghu
Harsh is right. You are looking in the wrong place. regards! Yong On Sun, Aug 12, 2012 at 1:40 PM, Harsh J wrote: > Richard, > > The property disables major compactions from happening automatically. > However, if you choose to do this, you should ensure you have a cron > job that does trigger major_compa

Re: column based or row based storage for HBase?

2012-08-05 Thread yonghu
In my understanding of the column-oriented structure of HBase, the first thing is the term "column-oriented". It means that data belonging to the same column family is stored contiguously on disk. Within each column family, the data is stored row-wise. If you want to understand the intern

Re: Why Hadoop can't find Reducer when Mapper reads data from HBase?

2012-07-12 Thread yonghu
:15 PM, yonghu wrote: >> java.lang.RuntimeException: java.lang.ClassNotFoundException: >> com.mapreducetablescan.MRTableAccess$MTableReducer; >> >> Does anybody know why? >> > > Its not in your job jar? Check the job jar (jar -tf JAR_FILE). > > St.Ack

Re: HBase RegionServer can't connect to Master

2012-05-04 Thread yonghu
I think you can also use ifconfig command in the VM to see the ip address. And then you can change your ip address in /etc/hosts. Regards! Yong On Wed, May 2, 2012 at 7:21 PM, Ben Lisbakken wrote: > Hello -- > > I've got a problem where the RegionServers try to connect to localhost for > the Ma

Re: Hbase custom filter

2012-05-02 Thread yonghu
It means that the Java runtime can't find the org/apache/hadoop/hbase/filter/FilterBase class. You have to add hbase.jar to your classpath. regards! Yong On Wed, May 2, 2012 at 12:12 PM, cldo wrote: > > i want to custom filter hbase. > i created jar file by eclipse, copy to sever and in file hbase

Re: Are minor compaction and major compaction different in HBase 0.92?

2012-04-27 Thread yonghu
eans out > delete markers. > Delete markers cannot be removed during a minor compaction since an affected > KeyValue could exist in an HFile that is not part of this compaction. > > -- Lars > ________ > > From: yonghu > To: user@hbase.apache.o

Are minor compaction and major compaction different in HBase 0.92?

2012-04-25 Thread yonghu
Hello, My HBase version is 0.92.0, and I find that when I use minor compaction and major compaction to compact a table, there is no difference: the minor compaction removes the deleted cells and discards the data versions exceeding the maximum, which should be the task of a major compaction. I wonder

Re: Problem to Insert the row that i was deleted

2012-04-25 Thread yonghu
As Lars mentioned, the row is not physically deleted. HBase inserts a cell called a "tombstone" which masks the deleted value, but the value is still there (if the deleted value is in the same memstore as the tombstone, it will be deleted in the memstore, so you will not f

Re: HBase 0.92 with Hadoop 0.22

2012-04-16 Thread yonghu
Yes. You can compile the Hadoop jar file yourself and put it into the HBase lib folder. Regards! Yong On Mon, Apr 16, 2012 at 2:09 PM, Harsh J wrote: > While I haven't tried this personally, it should be alright to do. You > need to replace HBase's default hadoop jars (which are 1.0.x/0.20 > v

Re: Is it possible to install two different Hbase versions in the same Cluster?

2012-04-16 Thread yonghu
evice. Please excuse any typos... > > Mike Segel > > On Apr 16, 2012, at 6:31 AM, yonghu wrote: > >> Hello, >> >> I wonder if it's possible to install two different Hbase versions in >> the same cluster? >> >> Thanks >> >> Yong >>

Re: dump HLog content!

2012-04-15 Thread yonghu
byte.eventually it moves the log into .oldlogs also. > > Thanks > Manish > Sent from my BlackBerry, pls excuse typo > > -Original Message- > From: yonghu > Date: Sun, 15 Apr 2012 18:58:45 > To: > Reply-To: user@hbase.apache.org > Subject: Re: dump HLog cont

Re: dump HLog content!

2012-04-15 Thread yonghu
Thanks for your reply. After nearly 60 minutes, I can see the HLog volume:
-rw-r--r--   3 yonghu supergroup   2125 2012-04-15 17:34 /hbase/.logs/yonghu-laptop,60020,1334504008467/yonghu-laptop%2C60020%2C1334504008467.1334504048854
I have no idea why it takes so long. Yong On Sun, Apr

Re: dump HLog content!

2012-04-15 Thread yonghu
yes On Sun, Apr 15, 2012 at 6:30 PM, Ted Yu wrote: > Did 'HLog --dump' show real contents for a 0-sized file ? > > Cheers > > On Sun, Apr 15, 2012 at 8:58 AM, yonghu wrote: > >> Hello, >> >> My hbase version is 0.92.0 and is installed in pseud

dump HLog content!

2012-04-15 Thread yonghu
Hello, My HBase version is 0.92.0, installed in pseudo-distributed mode. I found a strange situation with the HLog: after I inserted new data into a table, the size of the HLog was 0. I checked in HDFS:
drwxr-xr-x   - yonghu supergroup   0 2012-04-15 17:34 /hbase/.logs
drwxr-xr-x   - yonghu

Re: A confusion about the RegionCoprocessorEnvironment.getRegion() method

2012-04-10 Thread yonghu
blog post by Mingjie may help explain things a bit more: > https://blogs.apache.org/hbase/entry/coprocessor_introduction > > > --gh > > > > On Tue, Apr 10, 2012 at 2:30 AM, yonghu wrote: >> Hello, >> >> The description of this method is " /** @return t

A confusion about the RegionCoprocessorEnvironment.getRegion() method

2012-04-10 Thread yonghu
Hello, The description of this method is "/** @return the region associated with this coprocessor */" and the return value is an HRegion instance. If I configure the region-coprocessor class in hbase-site.xml, it means that this coprocessor will be applied to every HRegion which resides on this

Re: Still Seeing Old Data After a Delete

2012-03-27 Thread yonghu
Hi Shwan, My HBase version is 0.92.0. I have to mention that I recently noticed that the delete semantics of the shell and the Java API are different. In the shell, if you delete one version, it masks the versions whose timestamps are older than that version, meaning a scan will not return the

Re: There is no data value information in HLog?

2012-03-20 Thread yonghu
you should have specified the above option. > > On Mon, Mar 19, 2012 at 7:31 AM, yonghu wrote: > >> Hello, >> >> I used the $ ./bin/hbase org.apache.hadoop.hbase.regionserver.wal.HLog >> --dump command to check the HLog information. But I can not find any >> da

There is no data value information in HLog?

2012-03-19 Thread yonghu
Hello, I used the $ ./bin/hbase org.apache.hadoop.hbase.regionserver.wal.HLog --dump command to check the HLog information, but I could not find any data values. The output of my HLog file looks as follows: Sequence 933 from region 85986149309dff24ecf7be4873136f15 in table test Action:
