Re: hbase 0.19.0 configuration

2009-01-26 Thread Michael Stack
Hey Jun: For 0.19.x, do not set the *dfs.datanode.socket.write.timeout* to zero as we suggest for hadoop 0.18.x. Leave it at its default of 8 minutes. On timeout, in 0.19.0, the client will reestablish the connection (See HADOOP-3831). If lots of activity against hdfs, up *dfs.datanode.han

Re: RetriesExhaustedException during HTable.commit

2008-12-11 Thread Michael Stack
Edward J. Yoon wrote: I am receiving the RetriesExhaustedException exception during HTable.commit, the size of cell is 50 mb (2,500 * 2,500 double entries). Is there a configuration to avoid this problem? Looks like HADOOP-4802 (in hbase, it's part of HBASE-900). Big cells can trigger OOME.

Re: Again UnknowScannerException

2008-12-09 Thread Michael Stack
Edward J. Yoon wrote: To change hbase.regionserver.lease.period, Should I restart hbase cluster? Edward: Yes, unfortunately (should make it so you don't have to). David: Please upgrade to hbase 0.18.1 for other reasons than hbase.regionserver.lease.period (and hadoop 0.18.2). You do not

Re: Newbie: best practice for building sharded SOLR indexes

2008-12-07 Thread Michael Stack
tim robertson wrote: Can someone please help me with the best way to build up SOLR indexes from data held in HBase, that will be too large to sit on a single machine (100s of millions of rows)? I am assuming in a 20 node Hadoop cluster, I should build a 20 shard index and use SOLR's distributed search?

Re: implementing selection/projection using mapreduce

2008-12-04 Thread Michael Stack
The #configure method gets called with JobConf before your map task starts up. Get what you need from JobConf at that time and save it off into data members. Then use those members inside your map. See GroupingTableMap in mapred package for example or the examples in hadoop. St.Ack abhini
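The pattern described above (read settings once in configure, keep them in fields, use them in map) can be sketched without Hadoop on the classpath. Everything here is an illustrative assumption: `Conf` is a tiny stand-in for Hadoop's `JobConf`, and `ProjectingMapper`, the `projection.columns` key, and the simplified `map` signature are hypothetical names, not the real `org.apache.hadoop.mapred` interfaces. In an actual job the class would extend Hadoop's `MapReduceBase` and receive a real `JobConf`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ConfigureSketch {
    /** Stand-in for Hadoop's JobConf (assumption): plain string settings. */
    static class Conf {
        private final Map<String, String> settings = new HashMap<>();

        void set(String key, String value) {
            settings.put(key, value);
        }

        String get(String key, String defaultValue) {
            return settings.getOrDefault(key, defaultValue);
        }
    }

    /** Hypothetical mapper that projects each row onto the configured columns. */
    static class ProjectingMapper {
        // Data member filled in by configure() and read by map().
        private List<String> columns = new ArrayList<String>();

        /** Called once before any map() call: copy what is needed into fields. */
        void configure(Conf job) {
            String csv = job.get("projection.columns", "");
            columns = csv.isEmpty()
                    ? new ArrayList<String>()
                    : Arrays.asList(csv.split(","));
        }

        /** Called per row: relies only on the fields saved by configure(). */
        Map<String, String> map(Map<String, String> row) {
            Map<String, String> projected = new LinkedHashMap<>();
            for (String column : columns) {
                if (row.containsKey(column)) {
                    projected.put(column, row.get(column));
                }
            }
            return projected;
        }
    }

    public static void main(String[] args) {
        Conf job = new Conf();
        job.set("projection.columns", "info:name,info:age");

        ProjectingMapper mapper = new ProjectingMapper();
        mapper.configure(job);

        Map<String, String> row = new HashMap<>();
        row.put("info:name", "alice");
        row.put("info:age", "30");
        row.put("info:email", "alice@example.org");

        // prints {info:name=alice, info:age=30}
        System.out.println(mapper.map(row));
    }
}
```

Reading the configuration once in `configure` keeps the per-row `map` call free of lookups and parsing, which is the point of saving the values into data members.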

Re: Bulk import question.

2008-12-01 Thread Michael Stack
There is none in hbase; it doesn't manage the filesystem so doesn't make the best sense adding it there (We could add it as a metric I suppose). In hdfs there are facilities for asking that it only fill a percentage or an explicit amount of the allocated space -- see hadoop-default.xml. I'm n

Re: Question of Column families in Hbase

2008-11-21 Thread Michael Stack
Nishant Khurana wrote: Hi, I was looking at the API Docs and HQL shell to create tables for Hbase. I couldn't figure out : 1) How to create column families such that each column family contains more than one column through shell. Please tell us how you tripped over HQL. It was removed from hba

Re: hbase-thrift: Helper to install HBase Thrift bindings

2008-11-18 Thread Michael Stack
Looks great Carlos. Would suggest you add hbase-thrift here: http://wiki.apache.org/hadoop/SupportingProjects. St.Ack Carlos Valiente wrote: I've hacked together a package to ease the installation of Thrift Perl and Python bindings for HBase: http://code.google.com/p/hbase-thrift/ Enjoy,

Re: How to detect when the mapper is called the last time?

2008-11-16 Thread Michael Stack
Thibaut_ wrote: Hi, As each row of my hbase table can take a lot of time to process (waiting on answers from other hosts), I would like to create a few threads to process that data in parallel. I would then use the last call to the map function to wait for all threads to finish their job and

Re: create table is time consuming

2008-11-15 Thread Michael Stack
Things take a while usually because messaging is done on a period; most of the elapsed time is just waiting on the period to pass. See in src/test where we have an hbase-site.xml with different config. from hbase defaults. Most of the config. here are tunings to make stuff run faster in the u

Re: xceiverCount 257 exceeds the limit of concurrent xcievers 256

2008-11-12 Thread Michael Stack
Try upping the limit on your datanodes. Set dfs.datanode.max.xcievers up to 1024 or more. St.Ack Dru Jensen wrote: hbase-users, I have been running MR processes for several days against HBase with success until recently the region servers shut themselves down. Hadoop 0.18.1 Hbase 0.18.1 3 n

Re: HBase read performance

2008-11-12 Thread Michael Stack
 wrote: Are you using hbase TRUNK? If so, and if your checkout was recent, you'll see benefit/disadvantage of cache. hadoop 0.18.1, hbase 0.18.0. I do not use TRUNK; any useful update? What do you mean by the disadvantage of cache? Disadvantage is that if you are getting mostly cach

Re: HBase read performance

2008-11-11 Thread Michael Stack
 wrote: On 2008-11-12, "Michael Stack" <[EMAIL PROTECTED]> wrote: hello, every one. i used to test the performance in PE, but the performance is not well enough. Please say more. What kind of numbers were you getting? especially, the

Re: java.io.IOException: java.util.NoSuchElementException

2008-11-11 Thread Michael Stack
checks to keep this from happening on startup or post about it in the faq in case it comes up again. Billy "Michael Stack" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] Billy Pearson wrote: could it be from the global memcache limit I set my hb

Re: Regionserver fails to serve region

2008-11-11 Thread Michael Stack
Slava Gorelik wrote: Hi. DEBUG wasn't enabled, because it decreases performance and increases log size. Sure. But maybe leave it on while we're trying to figure out issues. Regarding the ulimit - yes, it's upped to 32K. Good. You remember correctly - during massive load I run the balancer

Re: HBase read performance

2008-11-11 Thread Michael Stack
王凯 wrote: Hello, everyone. I used PE to test the performance, but the performance is not good enough. Please say more. What kind of numbers were you getting? In particular, the table format is not what I need. So I created a table and wrote a string into every cell. Then I use the count

Re: java.io.IOException: java.util.NoSuchElementException

2008-11-11 Thread Michael Stack
Billy Pearson wrote: could it be from the global memcache limit I set my hbase.hregion.memcache.flush.size = hbase.regionserver.globalMemcacheLimit So that memcache flushes are only as needed. That would probably explain it. The global memcache limit will likely be reached before a flush ha

Re: java.io.IOException: java.util.NoSuchElementException

2008-11-10 Thread Michael Stack
That's an odd one Billy. We're in that bit of code because we need to flush some regions fast because we're up at memory thresholds -- but we're getting java.util.NoSuchElementException because there are no regions to flush. HBASE-990 in trunk addresses the immediate silly error of trying to g

Re: Pigi project

2008-11-04 Thread Michael Stack
Edward J. Yoon wrote: Hey stack. I'd like to add hama and heart (http://rdf-proj.blogspot.com) to the supporting projects page if you are ok. :) I see you added them without us saying yes or no. IMO, you should take them off the supporting projects page. Your projects are not like the oth

Re: Regionserver fails to serve region

2008-11-04 Thread Michael Stack
Slava Gorelik wrote: Hi Michael. After reformatting HDFS, HBase started to work like a Swiss clock. It worked with 8 clients under about 30 hours of intensive load. Thanks for reporting back to the list. Just a small question: after about 28 hours (when I came back to work) I found that one of 7 datanodes in

Re: hbase performance period

2008-11-03 Thread Michael Stack
CaiSijie wrote: Thank you for your reply. My version of HBase is 0.18.0. Yes, I read data in series. But what I see is that reading the 1st item costs the least time and reading the 128th item costs the most. That is, the time increases from the 1st to the 128th data item. Then when reading the 129th data item

Re: problem installing hbase

2008-10-31 Thread Michael Stack
You can put and get to this hdfs with "./bin/hadoop fs ..."? HBase is having trouble getting to 'localhost'. Try using an IP or resolvable hostname instead of 'localhost'. St.Ack antonis labadaridis wrote: Michael Stack wrote: antonis labadaridis wrote: Hi.

Re: problem installing hbase

2008-10-31 Thread Michael Stack
antonis labadaridis wrote: Hi. I am trying to install Hbase on debian testing. I installed hadoop 0.18.1 successfully using these instructions: http://wiki.apache.org/hadoop/Running_Hadoop_On_Ubuntu_Linux_(Single-Node_Cluster) (the examples worked fine). Then I tried to install HBase, I follow

Re: Regionserver fails to serve region

2008-10-30 Thread Michael Stack
unctionality I filed a JIRA: https://issues.apache.org/jira/browse/HBASE-961 On Sun, Oct 26, 2008 at 12:58 AM, Michael Stack <[EMAIL PROTECTED]>

Re: Metrics

2008-10-30 Thread Michael Stack
, Michael Stack <[EMAIL PROTECTED]> wrote: Slava Gorelik wrote: Hi. I'm looking for HBase metrics functionality; I need to display metrics such as the number of transactions, average transaction time, and others. Does HBase provide something like this? It does as of this a

Re: Metrics

2008-10-29 Thread Michael Stack
Slava Gorelik wrote: Hi. I'm looking for HBase metrics functionality; I need to display metrics such as the number of transactions, average transaction time, and others. Does HBase provide something like this? It does as of this afternoon in TRUNK. There is documentation but you'll probably

Re: Question on concurrent update

2008-10-29 Thread Michael Stack
Yes. Each client will hold a lock on the row while it's updating its column. St.Ack Michael Dagaev wrote: Hi, All Let two concurrent clients update values of different qualifiers of the same column family of the same row. I guess that both qualifiers will be updated properly. Is it correct

Re: Regionserver fails to serve region

2008-10-25 Thread Michael Stack
Slava Gorelik wrote: Hi. I haven't tried them yet; I'll try tomorrow morning. In general the cluster is working well; the problems begin when I try to add 10M rows, and it happened after 1.2M. Anything else running beside the regionserver or datanodes that would suck resources? When datanodes begin to

Re: Regionserver fails to serve region

2008-10-25 Thread Michael Stack
Does your cluster still not work Slava? Have you seen the recent prescription from Jean-Adrien on the list for a problem that looks related? St.Ack Slava Gorelik wrote: Hi. Most of the time I get Premeture [sic] EOF from inputStream, sometimes it is also "No live nodes contain current block". No

Re: HBase region server - utilization extremely unbalanced

2008-10-23 Thread Michael Stack
Yossi Ittach wrote: Thanks for the quick reply. I'm following the JVM memory consumption (using "top"), and what bothers me is that it seems the percentages are just going up and up, and it makes me kind of worried. 'top' is an extremely crude tool for figuring how the JVM is doing memor

Re: Regionserver fails to serve region

2008-10-22 Thread Michael Stack
Jean-Adrien wrote: .. Stack, you ask me if my hard disks were full. I said one is. Why did you link the above problem with that. Because of the du problem noticed in HADOOP-3232 ? I don't think I'm affected by this problem, my BlockReport process duration is less than a second. We were seeing

Re: How to get all columns from the scanner in a Map-Reduce job?

2008-10-19 Thread Michael Stack
What happens if you pass a column name of "^.*$"? Will it return all columns? I don't think it will. IIRC the regex can only be applied to the column qualifier portion of column name which means you'd have to write out a column spec. for your mapreduce job per column family. So, if you had

Re: Writing a RowResult to HTable?

2008-10-19 Thread Michael Stack
No. We should add a constructor to BatchUpdate that takes a RowResult. HBASE-880, the client API revamping has some mild affectation of RowResult reuse in updates (or at least the SortedMap of columnname->Cell that is heart of RR). Perhaps it could be made stronger. Want to comment in HBASE-8

Re: Possible to set the results' sort method?

2008-10-19 Thread Michael Stack
(Going further down the path I believe J-D was heading), no, it's not possible to change the sort-order server-side, not without subclassing regionserver class, but you might be able, client-side, to give the RowResult to a new SortedMap, one that has a different comparator; e.g. one that sorts th
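The client-side re-sort suggested above can be sketched with plain java.util collections. This is a minimal sketch, not HBase API: a `SortedMap<String, String>` stands in for the RowResult's column-name-to-cell map, and the class name `ResortColumns`, the `resort` helper, and the reverse-order comparator are assumptions chosen for illustration.

```java
import java.util.Comparator;
import java.util.SortedMap;
import java.util.TreeMap;

public class ResortColumns {
    /** Copies the entries of source into a new map ordered by the given comparator. */
    static <K, V> SortedMap<K, V> resort(SortedMap<K, V> source,
                                         Comparator<? super K> order) {
        SortedMap<K, V> resorted = new TreeMap<>(order);
        resorted.putAll(source);
        return resorted;
    }

    public static void main(String[] args) {
        // Stand-in for a row as returned by the server, in its default key order.
        SortedMap<String, String> row = new TreeMap<>();
        row.put("info:a", "1");
        row.put("info:b", "2");
        row.put("info:c", "3");

        // Re-sort on the client with a different comparator (here: descending).
        SortedMap<String, String> descending =
                resort(row, Comparator.<String>reverseOrder());

        // prints [info:c, info:b, info:a]
        System.out.println(descending.keySet());
    }
}
```

The copy costs one pass over the row's columns, so the server-side order is untouched and only the client pays for the alternative ordering.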

"i usually see lots of compation work after I restart hbase cluster after running for some time."

2008-10-19 Thread Michael Stack
(Answering an IRC question up here so Rong-en, the asker, sees response) In the log it will say what kind of compaction it is, or explicitly it logs when it's a 'major' compaction. Are the compactions you are seeing 'major' compactions? My guess is they are. Major compactions happen on a pe

too busy host causes NotServingRegion exception?

2008-04-18 Thread Michael Stack
> 08/04/18 01:51:15 starting compaction > 08/04/18 01:51:22 region closed I'd guess a split has just happened and that it was responsible for the close of the region. > 08/04/18 01:51:41 NotServingRegion Exception > 08/04/18 01:51:47 compaction done > 08/04/18 01:51:51 NotServingRegion Exce

RE: Regions Offline

2008-04-18 Thread Michael Stack
(Sorry if this is a duplicate -- sent it from another account last night but it doesn't seem to have shown up given my mailbox listing; one below had a few additions) > >> Hi >> >> My system is quite simple: >> - two (one quad core, one dual core) servers with 2GB mem and 150 GB >> allocated to