Re: HBase scan performance decreases over time.

2012-11-05 Thread Asaf Mesika
Where is this setting located? Sent from my iPhone On Nov 5, 2012, at 15:05, Michael Segel wrote: > There's an HDFS bandwidth setting which is set to 10MB/s. > > Way too low for even 1GbE. > > Have you modified this setting yet? > > -Mike > > On Nov 3, 2012, at 2:50 PM, David Koch wrote: > >>

Re: Announcing KijiSchema for HBase schema management

2012-11-14 Thread Asaf Mesika
Hi, This looks great! I have a question regarding schema. It is written in the user guide that the schema of a cell is saved next to the data in the cell. I presume it would: Take more space, as the schema is duplicated for each row this cell is saved in; Make reading records slower since it needs

Re: Announcing KijiSchema for HBase schema management

2012-11-15 Thread Asaf Mesika
schema mappings in a > metadata table (also stored in HBase) and looks them up as needed. I have > logged https://jira.kiji.org/browse/DOCS-2 to note this improvement > > Cheers, > - Aaron > > > On Wed, Nov 14, 2012 at 9:58 PM, Asaf Mesika wrote: > >> Hi, >> This l

Re: scan is slower after bulk load

2012-11-22 Thread Asaf Mesika
Did you end up finding the answer? How fast is this method of insertion relative to a simple insert of List ? On Nov 13, 2012, at 02:29, Bijieshan wrote: > I think one possible reason is block caching. Have you turned the block > caching off during scanning? > > Regards, > Jieshan >

Re: Can a RegionObserver coprocessor increment counter of a row key that may not belong to the region ?

2012-12-05 Thread Asaf Mesika
Why not simply send Increment objects from the client side to HBase, to the URL row key and to the domain row key? On Dec 5, 2012, at 14:35, Amit Sela wrote: > Hi all, > > I have to count the occurrence of URLs in our system using the URL as row > key and using Increment. > I also want to coun

Re: Can a RegionObserver coprocessor increment counter of a row key that may not belong to the region ?

2012-12-07 Thread Asaf Mesika
(hourly, daily...) and the qualifiers are >>>>> impressions_criteria1,impressions_criteria2... I am going to use >>>>> coprocessors in order to sum all impressions counters (all criteria >>>>> segments) and then increment another counter (general

Re: Heterogeneous cluster

2012-12-09 Thread Asaf Mesika
So just to get this right: the class you have built is a custom Load Balancer which replaces the default HBase load balancer implementation? Sent from my iPhone On Dec 8, 2012, at 05:33, Jean-Marc Spaggiari wrote: Hi, Here is the situation. I have a heterogeneous cluster with 2 cores CPUs, 4

Re: Meaure server time of Get/Scan - through RPC logging?

2012-12-09 Thread Asaf Mesika
The scans are done in parallel in many region servers and are specific to your query, so I don't think any JMX counter can help you. Maybe you can measure it using your own RegionObserver on Pre/Post Scan, and writing it to a shared log file on HDFS. Sent from my iPhone On Dec 9, 2012, at 04:41, Wei

Re: Checking if a coprocessor was loaded successfully from client

2012-12-10 Thread Asaf Mesika
My test is on a table that doesn't have any coprocessors. It does load successfully, but I want to understand two things: 1. How should I use getAlterStatus? I'm asking since even placing it between modifyTable and enable and waiting for getFirst to return the number of regions fails (always returns 0) 2.

Re: Checking if a coprocessor was loaded successfully from client

2012-12-11 Thread Asaf Mesika
ion server other than > scanning region server log. > > There're several places in (Region)CoprocessorHost where we log exception > if loading encounters problem. > > I hope HBASE-6873 <https://issues.apache.org/jira/browse/HBASE-6873> can > get into the next 0.94

Re: How to design a data warehouse in HBase?

2012-12-14 Thread Asaf Mesika
Here's my take on this matter: In the current situation, there isn't any good solution for the large-scale data warehousing you want. Impala and Drill are both projects that head in this direction, but they still have a way to go and are not production ready yet. If you can stay at

Re: MR missing lines

2012-12-16 Thread Asaf Mesika
Did you check the returned array of the delete method to make sure all records sent for delete have been deleted? Sent from my iPhone On Dec 16, 2012, at 14:52, Jean-Marc Spaggiari wrote: > Hi, > > I have a table where I'm running MR each time it's exceeding 100 000 rows. > > When the target is re

Re: How to use substring in Rowkey

2012-12-27 Thread Asaf Mesika
You can define the StartKey of your scan object to be the customer number (i.e. the prefix of your rowkey). Define the end key to be the same but append a zero byte at the end to signify inclusion of the rowkey. This should retrieve all rows starting with that customer number. Sent from my iPhone O
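
A minimal sketch of this prefix scan with the 0.94-era client API; the table name and customer-number prefix are hypothetical. Note that to catch every rowkey starting with the prefix, the stop row is usually the prefix with its last byte incremented (as the "Understanding scan behaviour" reply further down also notes), since the stop row is exclusive:

    import java.util.Arrays;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    // Scan all rows whose rowkey starts with a given customer number.
    Configuration conf = HBaseConfiguration.create();
    byte[] prefix = Bytes.toBytes("CUST0042");      // hypothetical prefix
    byte[] stop = Arrays.copyOf(prefix, prefix.length);
    stop[stop.length - 1]++;                        // exclusive upper bound
    Scan scan = new Scan();
    scan.setStartRow(prefix);
    scan.setStopRow(stop);
    ResultScanner scanner = new HTable(conf, "orders").getScanner(scan);
    for (Result r : scanner) {
        // every r here has a rowkey beginning with the customer number
    }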

Re: 0.94.3 in Central (Maven)

2013-01-02 Thread Asaf Mesika
hosts latest HBase version. >> >> I see that central ( >> http://mvnrepository.com/artifact/org.apache.hbase/hbase) only hosts >> 0.94.1 and not 0.94.4 >> >> >> Thanks, >> >> Asaf Mesika >> Senior Developer / CSI Infrastructure Team >

Re: regionserver's log analysis?

2013-01-03 Thread Asaf Mesika
When I encountered this error it was caused by a slow network, which caused HDFS to be slow, which caused flushes to take more time, and thus blocked updates occurred. Sent from my iPhone On Jan 3, 2013, at 18:25, ramkrishna vasudevan < ramkrishna.s.vasude...@gmail.com> wrote: You need to see your applica

Re: Fastest way to find is a row exist?

2013-01-06 Thread Asaf Mesika
Why not write your own filter class which you can initialize with a set of keys to search for? The HTable on the client side will split the keys based on row keys so the filter is sent to the right regions. There your filter can utilize the SEEK_NEXT_USING_HINT return code to seek efficiently on those set
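
A rough sketch of such a filter against the 0.94 filter API, assuming a client-supplied list of target row keys; the Writable serialization that 0.94 uses to ship a filter to the region servers is left as a no-op here:

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;
    import java.util.List;
    import java.util.SortedSet;
    import java.util.TreeSet;
    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.filter.FilterBase;
    import org.apache.hadoop.hbase.util.Bytes;

    public class RowSetFilter extends FilterBase {
        private final SortedSet<byte[]> targets =
                new TreeSet<byte[]>(Bytes.BYTES_COMPARATOR);

        public RowSetFilter() {}                           // needed for deserialization

        public RowSetFilter(List<byte[]> rows) { targets.addAll(rows); }

        @Override
        public ReturnCode filterKeyValue(KeyValue kv) {
            byte[] row = kv.getRow();
            if (targets.contains(row)) {
                return ReturnCode.INCLUDE;                 // a wanted row
            }
            return targets.tailSet(row).isEmpty()
                    ? ReturnCode.NEXT_ROW                  // past the last target
                    : ReturnCode.SEEK_NEXT_USING_HINT;     // jump ahead
        }

        @Override
        public KeyValue getNextKeyHint(KeyValue kv) {
            // First possible KeyValue of the next target row after the current one.
            return KeyValue.createFirstOnRow(targets.tailSet(kv.getRow()).first());
        }

        // Sketch only: a real filter must (de)serialize the target set here.
        public void write(DataOutput out) throws IOException {}
        public void readFields(DataInput in) throws IOException {}
    }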

Re: Storing images in Hbase

2013-01-06 Thread Asaf Mesika
What's the penalty performance-wise of saving a very large value in a KeyValue in HBase? Splits, scans, etc. Sent from my iPad On Jan 6, 2013, at 22:12, Andrew Purtell wrote: > Also YFrog / ImageShack serves all of its assets out of HBase too, so for > reasonably sized images some are having su

Re: Tune MapReduce over HBase to insert data

2013-01-08 Thread Asaf Mesika
Start by testing HDFS throughput by doing a simple copyFromLocal using the Hadoop command line shell (bin/hadoop fs -copyFromLocal pathTo8GBFile /tmp/dummyFile1). If you have a 1000Mbit/sec network between the computers, you should get around 75 MB/sec. On Tuesday, January 8, 2013, Bing Jiang wrote: >

Re: HBase - Secondary Index

2013-01-08 Thread Asaf Mesika
I guess one reason is the amount of data traveling. In your design, you have to query a secondary index table, read all the matched original table row keys, send them back to the client, and then issue a special scan that retrieves only those row keys' values. In his example, he retrieved 2% of

Re: Maximizing throughput

2013-01-11 Thread Asaf Mesika
I've done similar work a couple of months ago. Start by sharing more details on your program, HBase setup, and the way you measure network and disk bottlenecks. Also, have you isolated network and disk on all nodes and between all nodes? (Each two nodes) Test them separately and give us those numbe

Re: Increment operations in hbase

2013-01-12 Thread Asaf Mesika
Most time is spent reading from the Store file and not on network transfer time of Increment objects. Sent from my iPhone On Jan 12, 2013, at 17:40, Anoop John wrote: Hi Can you check with using API HTable#batch()? Here you can batch a number of increments for many rows in just one RPC call.

Re: exception during coprocessor run

2013-01-13 Thread Asaf Mesika
Where is the 60 sec timeout defined? Sent from my iPhone On Jan 13, 2013, at 22:28, Himanshu Vashishtha wrote: How much time do your CP calls take? More than 60 sec? On Sun, Jan 13, 2013 at 11:42 AM, Andrew Purtell wrote: This means your client disconnected. On Sun, Jan 13, 2013 at 6:04 AM, S

Re: best read path explanation

2013-01-14 Thread Asaf Mesika
I have a follow up question here: A column family can be defined to have a maximum number of versions per column qualifier value. Is this enforced only by the client side code (HTable) or also by the InternalScanner implementations? On Monday, January 14, 2013, S Ahmed wrote: > Thanks Lars! > > S

Re: Loading data, hbase slower than Hive?

2013-01-19 Thread Asaf Mesika
Start by telling us your row key design. Consider pre-splitting your table regions. I managed to get to 25 MB/sec write throughput in HBase using 1 region server. If your data is evenly spread you can get around 7 times that in a 10-region-server environment. Should mean that 1 gig should take 4 s

Re: write throughput in cassandra, understanding hbase

2013-01-22 Thread Asaf Mesika
Sent from my iPhone On Jan 22, 2013, at 20:47, Jean-Daniel Cryans wrote: On Tue, Jan 22, 2013 at 10:38 AM, S Ahmed wrote: I've read articles online where I see Cassandra doing like 20K writes per second, and HBase around 2-3K. Numbers with 0 context don't mean much, if at all. I understa

Re: HBASE-7114 Increment does not extend Mutation but probably should

2013-01-26 Thread Asaf Mesika
Is the 'all' counter on the same row? By the way, did you guys handle the HBase bug where an increment sent to the region server that fails is still applied, but an exception is thrown to the client, which causes it to retry that increment? Sent from my iPhone On Jan 26, 2013, at 17:32, Amit Sel

Re: HBASE-7114 Increment does not extend Mutation but probably should

2013-01-26 Thread Asaf Mesika
Why not have the Increment object have two columns: one for the country and one for the allCountries? Sent from my iPhone On Jan 26, 2013, at 18:54, Infolinks wrote: Yes, of course. It's an all counter for the specific keyword. On Jan 26, 2013, at 18:40, Asaf Mesika wrote: Th

Re: increment-related bug Was: HBASE-7114 Increment does not extend Mutation but probably should

2013-01-26 Thread Asaf Mesika
saf: > Were you referring to HBASE-6291: Don't retry increments on an invalid cell > ? > That was fixed in 0.94.2 > > Or maybe: HBASE-6195 Increment data will be lost when the memstore is > flushed > The above was fixed in 0.94.1 > > Cheers > > On Sat, Jan 26, 2013

Re: HBASE-7114 Increment does not extend Mutation but probably should

2013-01-26 Thread Asaf Mesika
rver (each row has its own counters so no risk in going outside > the region right?) . > On Jan 26, 2013 7:15 PM, "Asaf Mesika" wrote: > >> Why not have the Increment object have two columns: one for the country and >> one for the allCountries ? >> >

Re: Tables vs CFs vs Cs

2013-01-28 Thread Asaf Mesika
I would go on using the row-key, on one table. = Row Key Structure = group-depth: 1..4, encoded as 1 byte; A-D group, encoded as 1 byte and not as a string. Examples: <1><192> <2><192><168> <3><192><168><1> <4><192><168><1><10> Column Qualifier: "c" - stands for counters Column Qualifier: "t" - st
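
A small sketch of building that rowkey, with the depth byte followed by one byte per IP octet (the helper name is hypothetical):

    // Builds <depth><octet1>...<octetN>, e.g. rowKey(3, 192, 168, 1)
    // produces <3><192><168><1>. Octets 0..255 fit in one unsigned byte.
    public static byte[] rowKey(int depth, int... octets) {
        byte[] key = new byte[1 + depth];
        key[0] = (byte) depth;              // group-depth: 1..4
        for (int i = 0; i < depth; i++) {
            key[i + 1] = (byte) octets[i];  // cast keeps the low 8 bits
        }
        return key;
    }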

Re: Getting the scan type at preCompact

2013-01-29 Thread Asaf Mesika
But my scanner is a wrapper to the scanner passed on to me in preCompact. This scanner is created in the Store object (StoreScanner) with arguments I don't have at preCompactScannerOpen(), so I can't create it. I want the vanilla scanner HBase uses in preCompact so I can wrap it with my behavior bu

Re: Pagination with HBase - getting previous page of data

2013-02-03 Thread Asaf Mesika
Here are my thoughts on this matter: 1. If you define setCaching(numOfRows) on the scan object, you can check before each call to make sure you haven't passed your page limit, thus won't get to the point in which you retrieve from each region pageSize results. 2. I think it's o.k. for the UI t

Re: column count guidelines

2013-02-08 Thread Asaf Mesika
Can you elaborate more on those features? I thought 0.94.4 was just for bug fixes. Sent from my iPhone On Feb 8, 2013, at 02:34, Ted Yu wrote: How many column families are involved ? Have you considered upgrading to 0.94.4 where you would be able to benefit from lazy seek, Data Block Encoding, etc ?

Re: Using HBase for Deduping

2013-02-14 Thread Asaf Mesika
You can load the events into an HBase table, which has the event id as the unique row key. You can define max versions of 1 for the column family, thus letting HBase get rid of the duplicates for you during major compaction. On Thursday, February 14, 2013, Rahul Ravindran wrote: > Hi, >We hav
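
A sketch of that table setup, assuming the 0.94-era admin API; the table and family names are hypothetical:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    // Dedup table keyed by event id; with a single version, a duplicate put
    // simply shadows the earlier cell and compaction drops the old copy.
    Configuration conf = HBaseConfiguration.create();
    HTableDescriptor table = new HTableDescriptor("events_dedup"); // hypothetical name
    HColumnDescriptor family = new HColumnDescriptor("d");         // hypothetical family
    family.setMaxVersions(1);
    table.addFamily(family);
    new HBaseAdmin(conf).createTable(table);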

Re: Using HBase for Deduping

2013-02-15 Thread Asaf Mesika
Michael, does this mean a read for every write? On Friday, February 15, 2013, Michael Segel wrote: > What constitutes a duplicate? > > An over simplification is to do a HTable.checkAndPut() where you do the > put if the column doesn't exist. > Then if the row is inserted (TRUE) return value, you push t

Re: Using HBase for Deduping

2013-02-15 Thread Asaf Mesika
ons set as 1 and duplicate key is added, the last added will > win removing the old. This is what you want Rahul? I think from his > explanation he needs the reverse way > > -Anoop- > ____ > From: Asaf Mesika [asaf.mes...@gmail.com ] > Sent: F

Re: increment counters via bulk upload in HBase

2013-02-18 Thread Asaf Mesika
We solved this partially by converting increments to puts and aggregating in preCompact. I guess that if you bulk upload an HFile which has these puts, which are a representation of a delta to a column, and merge it in the compaction, it could work. On Monday, February 18, 2013, Andrew Purtell wrote:

Re: coprocessor enabled put very slow, help please~~~

2013-02-19 Thread Asaf Mesika
1. Try batching your increment calls into a List and use batch() to execute it. Should reduce RPC calls by 2 orders of magnitude. 2. Combine batching with scanning more words, thus aggregating your count for a certain word, resulting in fewer Increment commands. 3. Enable Bloom Filters. Should speed up Increment by a f
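
A sketch of suggestion 1, assuming the 0.94-era client API; the family/qualifier names are hypothetical and htable is an existing HTable:

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.hbase.client.Increment;
    import org.apache.hadoop.hbase.client.Row;
    import org.apache.hadoop.hbase.util.Bytes;

    // One batched call instead of one increment() RPC per word.
    List<Row> increments = new ArrayList<Row>();
    for (String word : words) {
        Increment inc = new Increment(Bytes.toBytes(word));
        inc.addColumn(Bytes.toBytes("f"), Bytes.toBytes("count"), 1L);
        increments.add(inc);
    }
    Object[] results = new Object[increments.size()];
    htable.batch(increments, results);   // executes the whole list together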

Re: Rowkey design question

2013-02-21 Thread Asaf Mesika
An easier way is to place one byte before the timestamp which is called a bucket. You can calculate it by taking the timestamp modulo the number of buckets. We are now in the process of field testing it. On Tuesday, February 19, 2013, Paul van Hoven wrote: > Yeah it worked fine. > > But a
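
A sketch of the bucket prefix, with a hypothetical bucket count of 16:

    import org.apache.hadoop.hbase.util.Bytes;

    static final int NUM_BUCKETS = 16;   // hypothetical; pick per cluster size

    // One leading bucket byte spreads time-ordered writes across regions.
    static byte[] bucketedKey(long timestamp) {
        byte[] key = new byte[1 + Bytes.SIZEOF_LONG];
        key[0] = (byte) (timestamp % NUM_BUCKETS);   // the bucket byte
        System.arraycopy(Bytes.toBytes(timestamp), 0, key, 1, Bytes.SIZEOF_LONG);
        return key;
    }
    // Readers issue one scan per bucket (16 here) and merge the results.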

Re: Row Key Design in time based aplication

2013-02-21 Thread Asaf Mesika
How does it avoid memory-hogging the region server when multiple queries with group by are executed, which is done in the HBase JVM? I know that HBase does not handle heap space set beyond 12G well, and combined with compactions happening concurrently with queries, it creates memory competition.

Re: HBase Thrift inserts bottlenecked somewhere -- but where?

2013-03-01 Thread Asaf Mesika
Maybe you've hit the limit on the number of RPC threads handling your write requests per region server, thus your requests are queued? On Fri, Mar 1, 2013 at 2:17 PM, Dan Crosta wrote: > We are using a 6-node HBase cluster with a Thrift Server on each of the > RegionServer nodes, and trying to eval

Re: Map Reduce with multiple scans

2013-03-01 Thread Asaf Mesika
Nick, if he didn't specify startKey, endKey in the Scan object and delegates it to a Filter, this means he will send this scan to *all* regions in the system, instead of just one or two, no? On Tue, Feb 26, 2013 at 10:12 PM, Nick Dimiduk wrote: > Hi Paul, > > You want to run multiple scans so t

Re: HBase Thrift inserts bottlenecked somewhere -- but where?

2013-03-02 Thread Asaf Mesika
Make sure you are not sending a lot of Puts with the same rowkey. This can cause contention on the region server side. We fixed that in our project by aggregating all the columns for the same rowkey into the same Put object, thus when sending a List of Puts we made sure each Put has a unique rowkey. On S
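
A sketch of that client-side aggregation; Record is a hypothetical value object carrying (row, qualifier, value), family "f" is hypothetical, and htable is an existing HTable:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    // Merge all columns destined for the same rowkey into a single Put,
    // so the submitted List<Put> contains unique rowkeys only.
    Map<String, Put> byRow = new HashMap<String, Put>();
    for (Record rec : records) {
        Put put = byRow.get(rec.row);
        if (put == null) {
            put = new Put(Bytes.toBytes(rec.row));
            byRow.put(rec.row, put);
        }
        put.add(Bytes.toBytes("f"), Bytes.toBytes(rec.qualifier), rec.value);
    }
    htable.put(new ArrayList<Put>(byRow.values()));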

Re: Rowkey design and presplit table

2013-03-06 Thread Asaf Mesika
I would convert each id to a long and then use Bytes.toBytes to convert this long to a byte array. If it is an int then even better. Now, write all 3 longs one after another into one array which will be your rowkey. This gives you: * fixed size * small row key - 3*8 bytes if you use longs and 3*4 for in
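
A sketch of that rowkey construction (Bytes.add is from the HBase util package):

    import org.apache.hadoop.hbase.util.Bytes;

    // Fixed-size 24-byte rowkey: three ids as longs, back to back.
    static byte[] rowKey(long id1, long id2, long id3) {
        return Bytes.add(Bytes.toBytes(id1), Bytes.toBytes(id2),
                         Bytes.toBytes(id3));
    }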

Re: Why InternalScanner doesn't have a method that returns entire row or object of Result

2013-03-07 Thread Asaf Mesika
Guys, Just to make things clear: if I have a row which has 12 key values (KVs), and then another row with 5 KVs, and I called InternalScanner.next(results, 10), where 10 is the limit, then I would get: 1. 10 KVs of the 1st row 2. 2 KVs of the 1st row 3. 5 KVs of the 2nd row Is this correct? On Sat, Dec 1,

Re: Compaction problem

2013-03-26 Thread Asaf Mesika
The 1st thing I would do to find the bottleneck is to benchmark HDFS solo performance. Create a 16GB file (using dd) which is 2x your memory and run "time hadoop fs -copyFromLocal yourFile.txt /tmp/a.txt". Tell us what is the speed of this file copy in MB/sec. On Mar 22, 2013, at 4:44 PM, tarang daw

Re: Compaction problem

2013-03-27 Thread Asaf Mesika
On Mar 27, 2013, at 5:05 PM, Jean-Marc Spaggiari wrote: > Hi Asaf, > > What kind of results should we expect from the test you are suggesting? > > I mean, how many MB/sec should we see on an healthy cluster? > > Thanks, > > JM > > 2013/3/26 Asaf Mesika : >&g

Re: HBase Writes With Large Number of Columns

2013-03-27 Thread Asaf Mesika
Correct me if I'm wrong, but the drop is expected, according to the following math: If you have a Put for a specific rowkey, and that rowkey weighs 100 bytes, then if you have 20 columns you should add the following size to the combined size of the columns: 20 x (100 bytes) = 2000 bytes. So th

Re: HBase Writes With Large Number of Columns

2013-03-27 Thread Asaf Mesika
dec > .getClass().getCanonicalName()); > > if (this.compressor != null) { > > builder.setCellBlockCompressorClass(this.compressor > .getClass().getCanonicalName()); > > } > Cheers > > On Wed, Mar 27, 2013 at 2:52 PM, Asaf Mesika > wrote: > >

Re: Understanding scan behaviour

2013-03-30 Thread Asaf Mesika
Yes. Watch out for the last byte being max. On Fri, Mar 29, 2013 at 7:31 PM, Mohit Anchlia wrote: > Thanks everyone, it's really helpful. I'll change my prefix filter to end > row. Is it necessary to increment the last byte? So if I have hash of > 1234555 my end key should be 1234556? > > > On Thu, M
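
A sketch of deriving the exclusive end row from a prefix, including the last-byte-is-max case warned about above:

    import java.util.Arrays;

    // Smallest rowkey strictly greater than every key starting with prefix.
    static byte[] stopRowForPrefix(byte[] prefix) {
        int i = prefix.length - 1;
        while (i >= 0 && prefix[i] == (byte) 0xFF) {
            i--;                         // trailing max bytes can't be incremented
        }
        if (i < 0) {
            return new byte[0];          // all bytes 0xFF: scan to end of table
        }
        byte[] stop = Arrays.copyOf(prefix, i + 1);
        stop[i]++;
        return stop;
    }
    // stopRowForPrefix(Bytes.toBytes("1234555")) yields "1234556".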

Re: Read thruput

2013-04-01 Thread Asaf Mesika
What does your client call look like? Get? Scan? Filters? Is 3000/sec client-side calls or is it in number of rows per sec? If you measure in MB/sec, how much read throughput do you get? Where is your client located? Same router as the cluster? Have you activated dfs read short circuit? If not t

Re: Read thruput

2013-04-03 Thread Asaf Mesika
ter parallel to your reading client? > --There is small upload code running. I have never seen the CPU usage more > than 5%, so actually didnt bother to look at this angle. > > -Vibhav > > > On Tue, Apr 2, 2013 at 1:42 AM, Asaf Mesika wrote: > > > How does your

Re: Evenly splitting the table

2013-04-03 Thread Asaf Mesika
Is the number of salt keys configurable? Do you support range scans on row keys although a hash is used? On Thursday, March 21, 2013, Aaron Kimball wrote: > Hi Cole, > > How are your keys structured? In Kiji, we default to using hashed row keys > where each key starts with two bytes of salt. This m

deleteOnExit when JVM shutsdown non gracefully

2013-04-10 Thread Asaf Mesika
Hi, In the CoprocessorHost.java file, there's the following code section used to load a coprocessor jar: fs.copyToLocalFile(path, dst); File tmpLocal = new File(dst.toString()); tmpLocal.deleteOnExit(); There's an assumption here that the JVM will shut down gracefully (as oppo

Getting list of Region In Transition via command line

2013-04-18 Thread Asaf Mesika
Hi, The Web UI for HBase Master shows a list of Regions In Transition. Is there a way to get this list via the command line? HBase shell or zkCli? Does the /hbase/unassigned node in ZooKeeper hold the regions in transition, or could in transition also mean assigned (thus not in that zNode) but the regi

Re: checkAnd...

2013-04-28 Thread Asaf Mesika
Yep. You can write a RegionObserver which takes all event qualifiers with a timestamp larger than a certain grace period, sums them up, adds the total to the current value of the Count qualifier and emits an updated Count qualifier. I wrote something very similar for us at Akamai and it improved throughput by

Re: HBase is not running.

2013-04-28 Thread Asaf Mesika
http://devving.com has a good post on setting the hosts file for standalone HBase. On Saturday, April 27, 2013, Yves S. Garret wrote: > Hi, but I don't understand what you mean. Did I miss a step > in the tutorial? > > > On Fri, Apr 26, 2013 at 4:26 PM, Leonid Fedotov > > >wrote: > > > Looks like

Re: Schema Design Question

2013-04-28 Thread Asaf Mesika
I actually don't see the benefit of saving the data into HBase if all you do is read per job id and purge it. Why not accumulate into HDFS per job id and then dump the file? The way I see it, HBase is good for querying parts of your data, even if it is only 10 rows. In your case your average is 1

Re: Read access pattern

2013-04-29 Thread Asaf Mesika
Couple of raw implementation thoughts: 1. Change the schema: take the timestamps inside the row. Rowkey is the hash(objectid), and the column qualifier is Long.MAX_VALUE - changeDate.getTime(). You can even save it using Bytes.toBytes(ts) to save space - it will always be 8 bytes, instead of the lon
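
A sketch of thought 1, assuming a hypothetical family "f" and a hash(...) helper producing the rowkey:

    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    // Descending-time qualifier: the most recent change sorts first in the
    // row, and Bytes.toBytes(long) keeps it at a fixed 8 bytes.
    long reversed = Long.MAX_VALUE - changeDate.getTime();
    Put put = new Put(hash(objectId));              // hash(...) is hypothetical
    put.add(Bytes.toBytes("f"), Bytes.toBytes(reversed), value);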

Re: HBase and Datawarehouse

2013-04-29 Thread Asaf Mesika
I think for Phoenix to truly succeed, it needs HBase to break the JVM heap barrier of 12G, as I saw mentioned in a couple of posts, since lots of analytics queries utilize memory; since that memory is shared with HBase, there's only so much you can do on a 12GB heap. On the other hand, if Phoenix was i

Re: discp versus export

2013-04-30 Thread Asaf Mesika
The replication.html reference appears to contain a reference to a bug (2611) which was solved two years ago :) On Wed, Mar 6, 2013 at 12:15 AM, Damien Hardy wrote: > IMO the easier would be hbase export. For long term offline backup (for > disaster recovery). It can even be stored on a differe

Re: availability of 0.94.4 and 0.94.5 in maven repo?

2013-05-06 Thread Asaf Mesika
Hi, I only see up to 0.94.3 Any chance of deploying 0.94.7 there? Thanks! On Wed, Feb 20, 2013 at 9:04 AM, lars hofhansl wrote: > Time permitting, I will do that tomorrow. > > > > > > From: Andrew Purtell > To: "user@hbase.apache.org" > Sent: Tuesday, Febru

Re: availability of 0.94.4 and 0.94.5 in maven repo?

2013-05-06 Thread Asaf Mesika
maven. > > Just change the version number. > > > On Mon, May 6, 2013 at 1:35 PM, Asaf Mesika wrote: > > > Hi, > > > > I only see up to 0.94.3 > > Any chance of deploying 0.94.7 there? > > Thanks! > > > > > > > > On Wed, Fe

Secure HBase upon Replication

2013-05-09 Thread Asaf Mesika
Hi, I know that HBase supports secure RPC between its nodes (Master, Region Server). I have a couple of questions about it: 1. Does HBase support secure RPC between Master and Slave replication? 2. Does enabling security in HBase entail using the latest HBase security branch? 3. Suppose the only

Re: Secure HBase upon Replication

2013-05-09 Thread Asaf Mesika
> > >>1. Does HBase support secure RPC between Master and Slave replication? > Sorry am not sure on this. > > Regards > Ram > > > On Thu, May 9, 2013 at 2:04 PM, Asaf Mesika wrote: > > > Hi, > > > > I know that HBase supports secure RPC betwee

Re: Secure HBase upon Replication

2013-05-09 Thread Asaf Mesika
act contains all of 0.94 plus: > - A secure RPC engine, for integrating with Hadoop security / Kerberos > - The AccessController coprocessor > - The TokenProvider coprocessor > > From 0.95 and forward there won't be separate security artifacts. > > > On Thu, May

Re: How to implement this check put and then update something logic?

2013-05-11 Thread Asaf Mesika
Maybe this problem is more in the graph domain? I know that there are projects aimed at better representing graphs at large scale. I'm saying this since you have one ID referencing another ID (using target ID). On May 10, 2013, at 11:47 AM, "Liu, Raymond" wrote: > Thanks, seems there are no

Re: How to remove dependency of jruby from HBase

2013-05-11 Thread Asaf Mesika
If JRuby is GPL, it also means HBase is GPL, but it's Apache licensed. On May 11, 2013, at 12:32 AM, xia_y...@dell.com wrote: > Hi, > > We are using Hbase 0.94.1. There is dependency from Hbase to Jruby. I found > below in hbase-0.94.1.pom. > > > org.jruby > jruby-complete > $

Re: EC2 Elastic MapReduce HBase install recommendations

2013-05-11 Thread Asaf Mesika
We ran into that as well. You need to make sure when sending a List of Puts that all rowkeys there are unique, otherwise as Ted said, the for loop acquiring locks will run multiple times for rowkeys which repeat themselves. On Sunday, May 12, 2013, Ted Yu wrote: > High collision rate means high contenti

Re: Secure HBase upon Replication

2013-05-20 Thread Asaf Mesika
Just pinging the question, in case anyone missed it... (No answers were found in the resources I've searched for, so before diving into the code...) Thanks! On Fri, May 10, 2013 at 12:50 AM, Asaf Mesika wrote: > Thank you for the detailed answer. > Regarding my 1st questio

Re: HBase is not running.

2013-05-21 Thread Asaf Mesika
Devving.com has a good tutorial on HBase first setup On Tuesday, May 21, 2013, Yves S. Garret wrote: > Hi Mohammad, > > I was following your tutorial and when I got to the part when you do > $ bin/start-hbase.sh, this is what I get: > http://bin.cakephp.org/view/428090088 > > I'll keep looking on

Re: HBase is not running.

2013-05-21 Thread Asaf Mesika
Yes. On May 21, 2013, at 8:32 PM, "Yves S. Garret" wrote: > Do you mean this? > > http://blog.devving.com/hbase-quickstart-guide/ > > > On Tue, May 21, 2013 at 1:29 PM, Asaf Mesika wrote: > >> Devving.com has a good tutorial on HBase first setup &g

Re: Out of memory error in Hbase

2012-06-28 Thread Asaf Mesika
How many Put objects are you sending to the server? Maybe you should decrease this amount? Sent from my iPad On Jun 28, 2012, at 11:13, Prakrati Agrawal wrote: > Hi, > > > > I am getting heap space error while running HBase on certain nodes of my > cluster. I can't increase the heap space allocated

Understanding HBase log output regarding memstore flush

2012-06-28 Thread Asaf Mesika
Hi, I'm trying to figure out some discrepancies I'm witnessing in the HBase Region Server log file. It states that a flush was requested, and then a memstore flush is started. It says the flush size after snapshotting is 139105600 (~132.7m). In the log message below, the file size of the file th

Re: HBase configuration using two hadoop servers

2012-06-30 Thread Asaf Mesika
I'm new to HBase myself, and when first trying to learn its installation path, I couldn't find a decent installation guide end-to-end (HDFS, HBase, Linux specific stuff, etc). I wrote installation guide notes, which I'll be happy to expand into a full-fledged guide, if it can be added to the HB

Re: HBase configuration using two hadoop servers

2012-06-30 Thread Asaf Mesika
my iPhone On Jun 30, 2012, at 17:53, Stack wrote: On Sat, Jun 30, 2012 at 3:50 PM, Asaf Mesika wrote: I'm new to HBase myself, and when first trying to learn its installation path, I couldn't find a decent installation guide end-to-end (HDFS, HBase, Linux specific stuff, etc). So

Re: Powered By Page

2012-07-02 Thread Asaf Mesika
Adding captcha didn't help? Sent from my iPad On Jul 2, 2012, at 15:13, Stack wrote: > On Mon, Jul 2, 2012 at 1:06 PM, Ben Cuthbert wrote: >> Thanks Lars >> >> Just a note how do I edit and add? I have just registered. >> >> > > Oh yeah... you have to be granted perms to edit wiki because it wa

Composing your own timestamp

2012-07-09 Thread Asaf Mesika
Hi, We've been tinkering around ideas of implementing a secondary index. One of the ideas is based on concatenating three meaningful fields into a long: int, short (2 bytes), short. This long will be used as the timestamp when issuing a put to the secondary index table. This means on puts, the timestamp w
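
A sketch of packing the three fields into the long that Put accepts as a timestamp; the field names are hypothetical:

    // int in the high 32 bits, then the two shorts (masked to stay unsigned).
    // Assumes a >= 0 so the composite stays a valid non-negative timestamp.
    static long composeTimestamp(int a, short b, short c) {
        return ((long) a << 32) | ((b & 0xFFFFL) << 16) | (c & 0xFFFFL);
    }

    // Usage on a put to the secondary index table:
    Put put = new Put(rowKey);
    put.add(family, qualifier, composeTimestamp(id, shard, seq), value);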

Re: Composing your own timestamp

2012-07-09 Thread Asaf Mesika
tech.co> > > <http://in.linkedin.com/in/sonalgoyal> > > > > > > On Mon, Jul 9, 2012 at 6:49 PM, Asaf Mesika wrote: > >> Hi, >> >> We've been tinkering around ideas of implementing secondary index. >> One of the ideas is based on con

Re: Enable Snappy compression - not able to load the libs on startup

2012-07-09 Thread Asaf Mesika
Maybe you should look at the content of the JVM argument switch -Djava.library.path (use ps -ef | grep hbase to see the command line). This will give you a hint on the directories the .so object is being looked for in. On Jul 9, 2012, at 21:00, Harsh J wrote: > The hbase-daemon.sh does not ssh b

HBase becomes ultra-slaggy

2012-07-09 Thread Asaf Mesika
this slowdown? Thanks! Asaf Mesika

Re: HBase becomes ultra-slaggy

2012-07-09 Thread Asaf Mesika
I got JMX counters hooked up to JConsole (a couple of them opened). Do you have any advice from your experience on what metrics I should focus on to spot this issue? On Jul 9, 2012, at 22:19, Stack wrote: > On Mon, Jul 9, 2012 at 8:35 PM, Asaf Mesika wrote: >> Hi, >>

Re: Composing your own timestamp

2012-07-09 Thread Asaf Mesika
The int, short, short part goes to the timestamp. Thanks! Sent from my iPad On Jul 10, 2012, at 01:08, Mohammad Tariq wrote: > Hello Asaf, >If the 'int' parts of your rowkeys are close to each other, you may > face hotspotting. > > Best > -Tariq > > On

Re: HBase becomes ultra-slaggy

2012-07-09 Thread Asaf Mesika
ually show you what is going on with various parts of your 3 servers. Otis Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm - Original Message ----- From: Asaf Mesika To: user@hbase.apache.org Cc: Sent: Monday, July 9, 2012 2:35 PM Subject: HBase bec

Re: Enable Snappy compression - not able to load the libs on startup

2012-07-10 Thread Asaf Mesika
On Jul 10, 2012, at 8:57 AM, Arvid Warnecke wrote: > Hello, > > On Mon, Jul 09, 2012 at 09:10:12PM +0300, Asaf Mesika wrote: >> On Jul 9, 2012, at 21:00 PM, Harsh J wrote: >>> The hbase-daemon.sh does not ssh back into the host, so preserves any >>> environmen

Re: Enable Snappy compression - not able to load the libs on startup

2012-07-11 Thread Asaf Mesika
hbase". -- Asaf Mesika Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Tuesday 10 July 2012 at 20:04, Arvid Warnecke wrote: > Hello Asaf, > > On Tue, Jul 10, 2012 at 02:20:03PM +0300, Asaf Mesika wrote: > > On Jul 10, 2012, at 8:57 AM, Arvid Warnecke wrote: >

Re: Slow random reads, SocketTimeoutExceptions

2012-07-11 Thread Asaf Mesika
What's your cell value size? What do you mean by 100 tables in one column family? Can you please specify what your insert rate was and how many nodes you have? Sent from my iPhone On Jul 11, 2012, at 22:08, Adrien Mogenet wrote: Hi there, I'm discovering HBase and comparing it with other distr

HDFS + HBASE process high cpu usage

2012-07-12 Thread Asaf Mesika
Hi, I have a cluster of 3 DN/RS and another computer hosting NN/Master. For some reason, two of the DataNode nodes are showing a high load average (~17). When using "top" I can see the HDFS and HBase processes are the ones using most of the CPU (95% in top). When inspecting both HDFS and HBase th

Re: HDFS + HBASE process high cpu usage

2012-07-12 Thread Asaf Mesika
0 7 set_robust_list -- --- --- - - 100.00 17.057362109518 53680 total On Jul 12, 2012, at 18:09 PM, Asaf Mesika wrote: > Hi, > > I have a cluster of 3 DN/RS and another compute

Re: HDFS + HBASE process high cpu usage

2012-07-13 Thread Asaf Mesika
Thanks a lot! That must have been it. Unfortunately I couldn't really test this command, since the guys from ops rebooted the entire computer room during maintenance, and it fixed the issue. (This room is a lab room of course) Asaf On Jul 13, 2012, at 4:27 AM, Esteban Gutierrez wrote: > date

Disable table works, but truncate says table does not exists

2012-07-17 Thread Asaf Mesika
Hi there, I have a strange situation on one of my tables in HBase: * disable/describe 'my_table' works (using HBase shell) * truncate/drop doesn't - it complains the table does not exist. How do I fix it? HBase Shell quote: hbase(main):003:0> disable 'my_table' 0 row(s) in 0.0480 seconds hbase(main):

Re: Java Client Tutorial

2012-07-20 Thread Asaf Mesika
The best examples I saw were on the book HBase - The Definitive Guide. -- Asaf Mesika Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Friday 20 July 2012 at 22:25, Mohit Anchlia wrote: > Is there any place that has good examples of HBase java API calls? > >

Re: Insert blocked

2012-07-23 Thread Asaf Mesika
Is htable in autoFlush? What's your client buffer size? What is the thread stuck on? Take a thread dump. Sent from my iPad On Jul 24, 2012, at 03:42, Mohit Anchlia wrote: > I am now using HTablePool but still the call hangs at "put". My code is > something like this: > > > hTablePool = *new* HTable
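
For reference, the two client-side knobs asked about, with a hypothetical table name and buffer size:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable");     // table name hypothetical
    table.setAutoFlush(false);                      // buffer puts client-side
    table.setWriteBufferSize(2 * 1024 * 1024);      // flush roughly every 2 MB
    // With autoFlush off, put() only fills the local buffer; the actual RPC
    // happens when the buffer fills or on an explicit flushCommits().
    table.flushCommits();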

Re: Enabling compression

2012-07-24 Thread Asaf Mesika
You also need to install Snappy - the shared object. I've done it using "yum install snappy" on Fedora Core. Sent from my iPad On Jul 25, 2012, at 04:40, Dhaval Shah wrote: Yes you need to add the snappy libraries to the hbase path (i think the variable to set is called HBASE_LIBRARY_PATH) -

Re: Retrieve Put timestamp

2012-07-31 Thread Asaf Mesika
What do you mean by using TS as version? Are you determining the ts long value beforehand and then setting it in the Put object? If so, I think you can use a specific cell as a counter (Sequence in Oracle language, or Auto Increment column in MySQL). In that case of course you need the value of the

Re: HBase configuration using two hadoop servers

2012-08-13 Thread Asaf Mesika
work - Setup a cluster and benchmark it - can save time and go home earlier with this guide :) Asaf Mesika On Jul 2, 2012, at 11:03, Stack wrote: > On Sat, Jun 30, 2012 at 5:55 PM, Asaf Mesika wrote: >> I've tried editing but I don't have permissions. What should b

Re: HBase is not running.

2013-05-24 Thread Asaf Mesika
If you truly want to understand the weirdness behind what you witnessed, then make a big cup of coffee, prepare a notebook with a pen and sit down to read this: http://blog.devving.com/why-does-hbase-care-about-etchosts/ My friend at devving.com had a fight like this with HBase pseudo mode, but dec

Re: Explosion in datasize using HBase as a MR sink

2013-05-31 Thread Asaf Mesika
On your data set size, I would go with HFileOutputFormat and then bulk load it into HBase. Why go through the Put flow anyway (memstore, flush, WAL), especially if you have the input ready at your disposal for re-try if something fails? Sounds faster to me anyway. On May 30, 2013, at 10:52 PM, R

Re: HBASE install shell script on cluster

2013-05-31 Thread Asaf Mesika
We have developed some custom scripts on top of fabric (http://docs.fabfile.org/en/1.6/). I've asked the developer on our team to see if we can share some of it with the community. It's mainly used for development/QA/integration test purposes. For production deployment we have an in-house "chef"-like

Weird Replication exception

2013-06-01 Thread Asaf Mesika
Hi, I have a weird error in a cluster I'm checking Replication with. I have two clusters set up, each in its own DC (different continents). Each has 1 master and 3 RS. I've done all the required setup, started replication and pushed some data into the master. I had an issue where the slave (peer
