key byte pattern

2014-11-06 Thread Rita
ATM, I have a key which is a string: "Z.1415276116.223232". Are there any algorithms or design patterns I should follow for something like this to decrease the key size? Obviously, storing it as a String is costly; I was wondering if there was a better way to store something like this? BTW, 141527611
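If the numeric parts of such a key are genuinely just integers, one option is to pack them as fixed-width binary instead of text. A minimal sketch in Python, assuming both numbers fit in 4 unsigned bytes (adjust the widths to the real ranges); big-endian packing keeps the binary keys sorting in the same order as the numbers, which matters for HBase row ordering:

    import struct

    key_str = "Z.1415276116.223232"
    prefix, ts, seq = key_str.split(".")

    # 1-byte prefix + 4-byte unsigned timestamp + 4-byte unsigned sequence = 9 bytes,
    # versus 19 bytes for the delimited string form.
    packed = prefix.encode("ascii") + struct.pack(">II", int(ts), int(seq))
    print(len(key_str), len(packed))  # 19 9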

measuring iops

2013-06-11 Thread Rita
Hi, I am using org.apache.hadoop.hbase.PerformanceEvaluation for performance evaluation and was wondering if there is a way to get IOPS for the entire test(s). Or is there a particular test which will determine this metric for me? The goal is to see how many extra I/O operations each node deliv

Re: discp versus export

2013-03-04 Thread Rita
the end goal is to have a backup of our hbase tables. On Mon, Mar 4, 2013 at 7:10 AM, Kevin O'dell wrote: > DistCP is typically used for HDFS level back up jobs. It can be used for > HBase but can be quite tricky. I would recommend using Export, CopyTable, > or Replication. These are tools de

Re: Announcing Phoenix v 1.1: Support for HBase v 0.94.4 and above

2013-02-26 Thread Rita
Looks great. It seems SQL in HBase is becoming a trend (not complaining). How does this compare to Cloudera's Impala? On Mon, Feb 25, 2013 at 5:09 PM, Ted Yu wrote: > I ran test suite and they passed: > > Tests run: 452, Failures: 0, Errors: 0, Skipped: 0 > > [INFO] > --

Re: restrict clients

2013-02-11 Thread Rita
Hi, I am looking for more than an ACL. I want to control what clients can connect to the hbase cluster. Is that possible? On Fri, Feb 8, 2013 at 10:36 AM, Stas Maksimov wrote: > Hi Rita, > > As far as I know ACL is on a user basis. Here's a link for you: > http://hbas

Re: migrating hbase table from one cluster to another

2012-11-17 Thread Rita
; Looks like you need: > > https://issues.apache.org/jira/browse/HBASE-3271 > > Allow .META. table to be exported > > > > But it is in 0.94.2 and above. > > > > Cheers > > > > > > On Sat, Nov 17, 2012 at 8:41 AM, Rita wrote: > > > &g

Re: migrating hbase table from one cluster to another

2012-11-17 Thread Rita
Do I need to restart the cluster? My 0.92.1 already has several hbase tables. On Sat, Nov 17, 2012 at 9:54 AM, wrote: > When you start the new cluster with 0.92.1, hbase would migrate your data > to 0.92 format. > > Thanks > > > > On Nov 17, 2012, at 6:31 AM,

Re: time series question

2012-09-20 Thread Rita
occurance) > I didn't get you very well, you store the serialized value as qualifier? > > Jieshan > -Original Message- > From: Rita [mailto:rmorgan...@gmail.com] > Sent: Thursday, September 20, 2012 8:48 PM > To: user@hbase.apache.org > Cc: Zhouxunmiao > Subje

Re: time series question

2012-09-20 Thread Rita
V1V2V3 :- Limit the length for each metrics. Likes v1=10, v2=8, v3=9. > And the length we set to 3. The value should be : 010008009. > > [Key-Schema 2]: Sensor + time(milliseconds) + > :- M bytes random number. > Just store the metrics in value part. > > Jieshan
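A rough illustration of the fixed-width encoding described above (each metric zero-padded to the agreed length, so v1=10, v2=8, v3=9 with length 3 becomes 010008009); the width of 3 is just the example value from the thread:

    def encode_metrics(values, width=3):
        # zero-pad each metric to a fixed width and concatenate
        return "".join(str(v).zfill(width) for v in values)

    def decode_metrics(encoded, width=3):
        return [int(encoded[i:i + width]) for i in range(0, len(encoded), width)]

    print(encode_metrics([10, 8, 9]))   # 010008009
    print(decode_metrics("010008009"))  # [10, 8, 9]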

Re: time series question

2012-09-20 Thread Rita
the part > of "v". Is it the version number? Or some random number to distinguish the > different version? > > Jieshan > -Original Message- > From: Rita [mailto:rmorgan...@gmail.com] > Sent: Thursday, September 20, 2012 9:09 AM > To: user@hbase.apache.org

Re: creating column families

2012-09-19 Thread Rita
Got it. Thanks. That should have been obvious. On Wed, Sep 19, 2012 at 2:16 AM, Yusup Ashrap wrote: > hi Rita ,check out this link. > > http://happybase.readthedocs.org/en/latest/api.html#happybase.Connection.create_table > > On Wed, Sep 19, 2012 at 7:58 AM, Rita wrote: >
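For reference, the call behind the linked happybase documentation looks roughly like this (host, table, and family names are placeholders):

    import happybase

    connection = happybase.Connection('thrift-host')  # HBase Thrift gateway
    # column families are fixed at table-creation time; per-family options
    # (versions, compression, TTL, ...) go in the dict for each family
    connection.create_table('mytable', {'cf1': dict(), 'cf2': dict()})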

Re: lookup table

2012-09-16 Thread Rita
ep 16, 2012 at 6:16 PM, Stack wrote: > On Sat, Sep 15, 2012 at 8:09 AM, Rita wrote: > > I am debating if a lookup table would help my situation. > > > > I have a bunch of codes which map with timestamp (unsigned int). The > codes > > look like this > > > &g

Re: backup strategies

2012-08-23 Thread Rita
-10 to backup_dir_B. Would that be feasible? On Wed, Aug 22, 2012 at 6:48 AM, Rita wrote: > what is the typical conversion process? My biggest worry is I come from a > higher version of Hbase to a lower version of Hbase, say CDH4 to CDH3U1. > > > > On Thu, Aug 16, 2012 at 7:5

Re: backup strategies

2012-08-22 Thread Rita
what is the typical conversion process? My biggest worry is I come from a higher version of Hbase to a lower version of Hbase, say CDH4 to CDH3U1. On Thu, Aug 16, 2012 at 7:53 AM, Paul Mackles wrote: > Hi Rita > > By default, the export that ships with hbase writes KeyValue obj

Re: hfile v2 in cloudera cdh3

2012-08-11 Thread Rita
is there a way to convert it? On Wed, Aug 8, 2012 at 6:46 AM, Harsh J wrote: > Hi Weishung, > > No, CDH3's HBase, being based on 0.90.x, does not carry this new format. > > On Wed, Aug 8, 2012 at 3:23 PM, Weishung Chung wrote: > > Hi HBase users, > > > > Does any of the cloudera cdh3 versions

Re: Announcing HappyBase, a developer-friendly Python library to interact with HBase

2012-05-22 Thread Rita
Looks great, and thanks for doing this. What's with the name? I fear I will be running into this problem when presenting... http://news.ycombinator.com/item?id=2687372 On Mon, May 21, 2012 at 6:30 PM, Wouter Bolsterlee wrote: > Todd DeLuca schreef op ma 21-05-2012 om 15:32 [-0400]: > > Please

Re: large machine configuration

2012-05-18 Thread Rita
Mike, Where can I find your talk? On Fri, May 11, 2012 at 7:51 AM, Rita wrote: > most of the operations I do with MR are exporting tables and importing > tables. Does that still require a lot of memory and does it help to > allocate more memory for jobs like that? > > Yes, I hav

hbase data

2012-05-17 Thread Rita
Hello, Currently I am using HBase to store sensor data -- basically large time series data hitting close to 2 billion rows for a type of sensor. I was wondering how HBase differs from the HDF (http://www.hdfgroup.org/HDF5/) file format. Most of my operations are scanning a range and getting its values bu

Re: hbase security

2012-05-15 Thread Rita
Thanks! Can't wait until CDH4 :p On Tue, May 15, 2012 at 6:37 PM, Kevin O'dell wrote: > CDH4 is based off of 92 and will have HBase security. > > On Tue, May 15, 2012 at 6:35 PM, Rita wrote: > > > Do any of the CDH have this feature? > > > > > >

Re: hbase security

2012-05-15 Thread Rita
e Segel > > On May 15, 2012, at 6:11 AM, Rita wrote: > > > I am guessing I can't use these features using shell, right? > > > > > > > > On Tue, May 15, 2012 at 5:24 AM, Harsh J wrote: > > > >> HBase 0.92 has table-level security

Re: hbase security

2012-05-15 Thread Rita
ww.hbasecon.com/sessions/hbase-security-for-the-enterprise/ > which also includes a tutorial (from Andrew). > > On Tue, May 15, 2012 at 8:11 AM, Rita wrote: > > Hello, > > > > It seems for my hbase installation anyone can delete my tables. Is there > a > &g

hbase security

2012-05-14 Thread Rita
Hello, It seems that in my HBase installation anyone can delete my tables. Is there a way to prevent this? I would like only the owner of the HMaster to have super authority. tia -- --- Get your facts first, then you can distort them as you please.--

Re: large machine configuration

2012-05-11 Thread Rita
just > for M/R. > (4GB a core is a good rule of thumb ) > > Depending on what you want to do, you could set aside 8GB of heap and tune > that, but even that might not be enough... > > > On May 11, 2012, at 5:42 AM, Rita wrote: > > > Hello, > > > >

large machine configuration

2012-05-11 Thread Rita
Hello, While looking at http://hbase.apache.org/book.html#important_configurations, I noticed the large machine configuration section still isn't completed. "Unfortunately", I am running on a large machine which has 64GB of memory, therefore I would need some help tuning my hbase/hadoop instance for max

Re: importing a large table

2012-03-31 Thread Rita
; http://www.meetup.com/LA-HUG/**pages/Video_from_April_13th_** > HBASE_DO%27S_and_DON%27TS/<http://www.meetup.com/LA-HUG/pages/Video_from_April_13th_HBASE_DO%27S_and_DON%27TS/> > > > > On 3/31/2012 5:33 AM, Rita wrote: > >> I have close to 9200 regions. Is there an e

Re: importing a large table

2012-03-31 Thread Rita
I have close to 9200 regions. Is there an example I can follow, or are there tools to do this already? On Fri, Mar 30, 2012 at 10:11 AM, Marcos Ortiz wrote: > > > On 03/30/2012 04:54 AM, Rita wrote: > > Thanks for the responses. I am using 0.90.4-cdh3. i exported the tabl

Re: importing a large table

2012-03-30 Thread Rita
then import them in? What is the difference between that and the regular MR export job? The idea sounds good because it sounds simple on the surface :-) On Fri, Mar 30, 2012 at 12:08 AM, Stack wrote: > On Thu, Mar 29, 2012 at 7:57 PM, Rita wrote: > > Hello, > > > > I am import

importing a large table

2012-03-29 Thread Rita
Hello, I am importing a 40+ billion row table which I exported several months ago. The data size is close to 18TB on hdfs (3x replication). My problem is when I try to import it with mapreduce it takes a few days -- which is ok -- however when the job fails for whatever reason, I have to restart e

serialize hbase data

2012-03-27 Thread Rita
Hello, I was wondering if there is an easy way to serialize HBase data so I can store it on a Unix filesystem. Since the data is unstructured, I was thinking of creating an XML file which would represent it for each key and value. Any thoughts or ideas about this? -- --- Get your facts first, th
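One simple approach, assuming access through the Thrift gateway and that keys and values are printable text: scan the table and write one serialized record per row to a flat file. A sketch with happybase and JSON (an XML writer would follow the same pattern; names below are placeholders):

    import happybase, json

    connection = happybase.Connection('thrift-host')
    table = connection.table('mytable')

    with open('mytable.dump', 'w') as out:
        for key, data in table.scan():
            # data maps b'family:qualifier' -> value; decode for a text file
            record = {k.decode(): v.decode() for k, v in data.items()}
            out.write(json.dumps({'key': key.decode(), 'data': record}) + '\n')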

Re: thrift or avro

2012-03-26 Thread Rita
Thanks. You answered my question. 0.94 looks really exciting! On Mon, Mar 26, 2012 at 11:19 AM, Stack wrote: > On Sun, Mar 25, 2012 at 6:05 AM, Rita wrote: > > Currently, I am connecting to Hbase, Trift and Python and was curious if > > Avro does something similar to Thrift. T

Re: export schema only

2012-03-13 Thread Rita
thanks. I will give it a try On Tue, Mar 13, 2012 at 2:58 AM, Laxman wrote: > Rita, > > I guess you are looking for something similar to RBDMS (say, oracle) is > providing. > > If that is the case, Exporting table structure alone doesn’t make much > sense > in HBase

export schema only

2012-03-12 Thread Rita
Hello, Is it possible to export table schema only versus exporting the entire table? I need this so I can create a separate table for QA purposes. -- --- Get your facts first, then you can distort them as you please.--
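One lightweight way to copy just the structure, assuming the Thrift gateway is reachable on both clusters (hostnames below are placeholders): read the column-family names from the source table and recreate an empty table with them.

    import happybase

    src = happybase.Connection('source-thrift-host')
    qa = happybase.Connection('qa-thrift-host')

    # families() returns a mapping of column-family name -> settings
    families = src.table('mytable').families()
    # note: empty option dicts recreate only the family names, not their settings
    fam_spec = {name.decode().rstrip(':'): dict() for name in families}
    qa.create_table('mytable', fam_spec)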

Re: python and hbase

2012-02-07 Thread Rita
> > On Tue, Feb 7, 2012 at 8:36 AM, Rita wrote: > > Running CDH U1 and using Hbase. So far I have been extremely happy with > the > > exception of Python support. Currently, I am using thrift but I suspect > > there are some major features missing in it such as RegExFilters

Re: want to try HBase on a large cluster running Lustre - any advice?

2011-12-06 Thread Rita
How would you handle a node failure? Do you have shared storage which exports LUNs to the datanodes? The beauty of hbase+hdfs is that you can afford nodes going down (depending on your replication policy). Lustre is a great high-performance scratch filesystem, but using it as a backend fo

Re: thrift and hbase

2011-12-04 Thread Rita
Thanks. I will keep a watch on this. Hopefully, CDH will have it in 4.x On Mon, Nov 28, 2011 at 7:13 PM, Ted Yu wrote: > With HBASE-1744, support for thrift is better. > But that is in TRUNK only. > > On Mon, Nov 28, 2011 at 3:41 PM, Rita wrote: > > > Hello, > &g

Re: zookeeper quorum verification

2011-12-04 Thread Rita
ndrobindnsentry"); // Here we are running zookeeper locally On Sun, Dec 4, 2011 at 10:03 AM, Rita wrote: > Thanks for the nice responses and advice. > > To sum up this thread > > This will not work, > > $ host roundrobindnsentry > roundrobindnsentry has address

Re: zookeeper quorum verification

2011-12-04 Thread Rita
ed about in (4). The DNS.getDefaultHost > returns > > host.domain.com and zookeeper fails to start. > > > > I would be interested if some one has a different way of handling the > > situation I described. > > > > On Sat, Dec 3, 2011 at 4:45 PM, Suraj Varma wrote: &g

zookeeper quorum verification

2011-11-30 Thread Rita
Hello, Previously, I assigned 5 servers as part of the zookeeper quorum. Everything works fine, but I was hard-coding these 5 servers everywhere, and I was thinking of creating a DNS entry called appquorum which will always return these 5 servers' IPs. Any thoughts about this? -- --- Get your f

Re: statistics using nc

2011-11-30 Thread Rita
osed by the JMX metrics for HBase? > > Lars > > On Nov 30, 2011, at 1:01 AM, Rita wrote: > > > Is it possible to get hbase statistics using, nc? Similar to zookeeper > > > > > > http://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html > > > > This

statistics using nc

2011-11-29 Thread Rita
Is it possible to get hbase statistics using nc? Similar to zookeeper: http://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html This would actually be a great feature for hbase. Any objections if I submit a jira request for this? -- --- Get your facts first, then you can distort them as yo

thrift and hbase

2011-11-28 Thread Rita
Hello, I am planning to use Thrift with Python and am curious what its limitations are compared to the de facto Java API. Is it possible to do everything with it, or are there things it cannot do? -- --- Get your facts first, then you can distort them as you please.--

major compaction

2011-11-27 Thread Rita
Hello, When I do a major compaction of a table (1 billion rows), how do I know when it's completed? Also, what can I look at in the metrics to see its status? -- --- Get your facts first, then you can distort them as you please.--

Re: understanding Rowcounter

2011-11-21 Thread Rita
> > >Region is the split used by mapper. > > > >Thanks, > >Jahangir Mohammed. > > > >On Mon, Nov 21, 2011 at 6:44 AM, Rita wrote: > > > >> Hello, > >> > >> I have been looking at > >> > >> > >> > http:/

understanding Rowcounter

2011-11-21 Thread Rita
Hello, I have been looking at http://search-hadoop.com/c/HBase:/src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java%7C%7C+RowCounter -- the RowCounter code. I would like to implement this by using thrift and python without mapreduce. My question is how is the RowCounter getting the m
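Without MapReduce the count has to come from an ordinary client-side scan, which is single-threaded and much slower on large tables than the per-region mappers RowCounter uses, but it is straightforward. A sketch with happybase; the filter string is assumed to be supported by the Thrift server version in use:

    import happybase

    connection = happybase.Connection('thrift-host')
    table = connection.table('mytable')

    count = 0
    # FirstKeyOnlyFilter returns just one small cell per row, cutting network traffic
    for _key, _data in table.scan(filter="FirstKeyOnlyFilter()", batch_size=1000):
        count += 1
    print(count)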

Re: speeding up rowcount

2011-10-29 Thread Rita
Ha. You are overestimating my Java, Ted. I am no programmer, just an ignorant consumer of great technologies. On Sat, Oct 29, 2011 at 10:46 AM, Ted Yu wrote: > Thanks Rita for logging the JIRA. > > Do you want to provide a patch ? > > On Sat, Oct 29, 2011 at 7:29 AM, Rita wrot

Re: speeding up rowcount

2011-10-29 Thread Rita
nt via mapreduce, and it took 6 hours for 7.5 > > million rows... > > > > On Sun, Oct 9, 2011 at 7:50 AM, Rita wrote: > > > >> Hi, > >> > >> I have been doing a rowcount via mapreduce and its taking about 4-5 > hours > >> to > >&

Re: sum, avg, count, etc...

2011-10-29 Thread Rita
For the values, ... price=26.81 open= close= ... Does HBase do a full scan across all values or does it have a constant-time lookup, O(1)? On Wed, Oct 26, 2011 at 8:27 PM, Rita wrote: > Thanks for all of your responses. > > The original file is a text file and when I try to search t

Re: querying questions

2011-10-29 Thread Rita
rical values stored as native bytes as bit masks. > > > > A long is 8 bytes. You can store an unsigned number up to > 18,446,744,073,709,551,615 in those eight bytes. If you stored this > number as a String -- presuming a byte per character -- you need > nearly 3x the by

Re: querying questions

2011-10-28 Thread Rita
Thanks for your reply. I am curious, can you give me an example of having a key as raw bytes and doing a byte comparison? I am not a native Java programmer, so an example would be extremely helpful in my case. On Fri, Oct 28, 2011 at 12:23 PM, Stack wrote: > On Fri, Oct 28, 2011 at 4:30 AM, R
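Since the replies here are Java-centric, here is roughly what "key as raw bytes" means from Python: pack the numeric fields into fixed-width big-endian bytes, and comparison is then plain byte-string comparison, which is exactly how HBase orders rows. The field layout below is only illustrative:

    import struct

    def make_key(server_id, timestamp, event_id):
        # 4-byte server id + 8-byte epoch + 4-byte event id, all big-endian so that
        # byte-wise (lexicographic) order matches numeric order
        return struct.pack(">IQI", server_id, timestamp, event_id)

    k1 = make_key(7, 1319562974, 42)
    k2 = make_key(7, 1319562975, 1)
    print(len(k1))   # 16 bytes instead of a delimited string
    print(k1 < k2)   # True -- the earlier timestamp sorts first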

querying questions

2011-10-28 Thread Rita
Hello All, I will be querying a lot of time series data and I have been studying opentsdb. Currently my data looks like this, server#timestamp (epoch)#event id#username Couple of questions: What is the best delimiter for a key? Does it even matter? I read somewhere that using a \t is optimal f

Re: sum, avg, count, etc...

2011-10-26 Thread Rita
the way to the end of the table, just > filtering out all the remaining rows. > > On Wed, Oct 26, 2011 at 6:18 AM, Doug Meil > wrote: > > Hi there- > > > > First, make sure you aren't tripping on any of these issues.. > > > > http://hbase.apache.org/b

sum, avg, count, etc...

2011-10-26 Thread Rita
I am trying to do some simple statistics with my data but it's taking longer than expected. Here is how my data is structured in hbase. keys (symbol#epoch time stamp) msft#1319562974#NASDAQ t#1319562974#NYSE yhoo#1319562974#NASDAQ msft#1319562975#NASDAQ The values look like this (for instance m
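Client-side, an average over one symbol is a prefix scan plus arithmetic; it is still a scan over every row for that symbol, not a constant-time lookup. A sketch with happybase, with table and column names as placeholders:

    import happybase

    connection = happybase.Connection('thrift-host')
    table = connection.table('quotes')  # hypothetical table name

    total, n = 0.0, 0
    # all rows for one symbol share the 'msft#' key prefix
    for _key, data in table.scan(row_prefix=b'msft#', columns=[b'cf:price']):
        total += float(data[b'cf:price'])
        n += 1
    print(total / n if n else None)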

Re: speeding up rowcount

2011-10-09 Thread Rita
e.org/book.html#perf.hbase.client.caching > > Himanshu > > On Sun, Oct 9, 2011 at 9:09 AM, Ted Yu wrote: > > I guess your hbase.hregion.max.filesize is quite high. > > If possible, lower its value so that you have smaller regions. > > > > On Sun, Oct 9, 2011 at 7:50 AM, Rita wrote:

speeding up rowcount

2011-10-09 Thread Rita
Hi, I have been doing a rowcount via mapreduce and it's taking about 4-5 hours to count 500 million rows in a table. I was wondering if there are any mapreduce tunings I can do so it will go much faster. I have a 10-node cluster, each node with 8 CPUs and 64GB of memory. Any tuning advice would be

Re: range query

2011-10-05 Thread Rita
> > > On 10/5/11 3:29 AM, "Rita" wrote: > > >Hello, > > > >I have a simple table where the data looks like this, > > > >key,value > >2011-01-01.foo,data01 > >2011-01-02.foo,data02 > >2011-01-03.foo,data03 > >2011-01-04.foo,

range query

2011-10-05 Thread Rita
Hello, I have a simple table where the data looks like this, key,value 2011-01-01.foo,data01 2011-01-02.foo,data02 2011-01-03.foo,data03 2011-01-04.foo,data04 2011-01-05.foo,data05 2011-01-05.foo,data06 Does anyone have any example code to perform a range query like, get all values for keys whic
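With date-prefixed keys like these, a range query is just a scan bounded by a start row and a stop row (the stop row is exclusive, so use the day after the last day wanted). A minimal happybase sketch; the table name and connection host are placeholders:

    import happybase

    connection = happybase.Connection('thrift-host')
    table = connection.table('mytable')

    # all keys from 2011-01-02.foo up to (but not including) 2011-01-05
    for key, data in table.scan(row_start=b'2011-01-02', row_stop=b'2011-01-05'):
        print(key, data)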

Re: exporting a table

2011-09-26 Thread Rita
going to work at all since the table is disabled. > > If you refer to a distcp, which would make more sense since by being > disabled the table won't get new mutations, then it won't be any > faster... just safer. > > J-D > > On Mon, Sep 26, 2011 at 2:03 AM, Rita wro

exporting a table

2011-09-26 Thread Rita
Is there a performance boost to exporting a table while it's disabled? -- --- Get your facts first, then you can distort them as you please.--

table version question

2011-09-25 Thread Rita
I understand that by default HBase keeps 3 versions of each cell. This is taking up more space than anticipated, therefore I was wondering: if I default to 1 version, are there any drawbacks? -- --- Get your facts first, then you can distort them as you please.--
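Dropping to 1 version is a per-column-family setting; the main drawback is simply that older cell values are no longer retrievable. A sketch of creating a table that way with happybase (names are placeholders):

    import happybase

    connection = happybase.Connection('thrift-host')
    # keep only the latest value per cell in family 'cf1'
    connection.create_table('mytable', {'cf1': dict(max_versions=1)})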

Re: schema doubt

2011-09-15 Thread Rita
nd writes each one as a row in > > HBase. WIth the java APIs you can write the raw bytes pretty easily. > > > > -Joey > > > > On Thu, Sep 15, 2011 at 7:56 AM, Rita wrote: > > > I have many small files (close to 1 million) and I was thinking of

Re: schema help

2011-08-25 Thread Rita
I, you could create a scanner with a > startrow that is the concatenation of your value for fieldA and the start > time, and an endrow that has the current time. > > http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html > > Ian > > On Aug 25, 2011, at 9:53 A

Re: schema help

2011-08-25 Thread Rita
matter of fact, I can do 100% of my queries. I will leave the 5% out of my project/schema. On Thu, Aug 25, 2011 at 10:13 AM, Ian Varley wrote: > Rita, > > There's no need to create separate tables here--the table is really just a > "namespace" for keys. A better opti

Re: is mapreduce needed?

2011-07-12 Thread Rita
ith Hbase, but it is extremely useful. > > > > On 7/12/11 6:06 AM, "Rita" wrote: > > >Hello, > > > > > >I have a dataset which is several terabytes in size. I would like to query > >this data using hbase (sql). Would I need to setup mapreduce t

is mapreduce needed?

2011-07-12 Thread Rita
Hello, I have a dataset which is several terabytes in size. I would like to query this data using hbase (sql). Would I need to set up mapreduce to use hbase? Currently the data is stored in hdfs and I am using `hdfs -cat ` to get the data and pipe it into stdin. -- --- Get your facts first, th