Re: Dying region servers...

2009-02-25 Thread stack
Regarding the timeout, it's a configuration for the DFSClient, the thing HBase uses to communicate with HDFS. ./org/apache/hadoop/hdfs/DFSClient.java:this.datanodeWriteTimeout = conf.getInt("dfs.datanode.socket.write.timeout", If the config is off in Hadoop, HBase doesn't see it. To get number

Re: HBase and Web-Scale BI

2009-02-25 Thread Ryan Rawson
Hey, You have to be clear about what HBase does and does not do. HBase is just not a relational database - its "weakness" is its strength. In general, you can only access rows in key order. Keys are stored lexicographically sorted, however. There aren't declarative secondary indexes (minus the

HBase and Web-Scale BI

2009-02-25 Thread Bradford Stephens
Greetings, I'm in charge of the data analysis and collection platform at my company, and we're basing a large part of our core analysis platform on Hadoop, Nutch, and Lucene -- it's a delight to use. However, we're going to be wanting some on-demand "web-scale" business intelligence, and I'm wonde

Re: HBase 0.19.0: Xcievers / shutdown / JVM GC settings

2009-02-25 Thread stack
Jean-Adrien: Thank you for the high-quality addition to the wiki troubleshooting page and for the detail in hbase-1214. (I added a question there that you might have some input on). St.Ack On Tue, Feb 24, 2009 at 4:39 AM, Jean-Adrien wrote: > > #3 I put the information in the Troubleshooting s

Re: Dying region servers...

2009-02-25 Thread Larry Compton
Actually, I spoke too soon. In "hbase-env.sh", we have "HBASE_CLASSPATH" set to include the Hadoop conf directory on all 4 nodes, so the HBase servers should have access to all of the Hadoop parameters. I'm going to try a symlink to "hadoop-site.xml" and see if the behavior changes. Larry On Wed

Re: Dying region servers...

2009-02-25 Thread Larry Compton
"dfs.datanode.socket.write.timeout" is set in "hadoop-site.xml" and isn't linked or contained in the HBase "conf" directory. I'll try that out. I'm not sure I understand why this is necessary, though. It seems like this parameter would only matter to Hadoop, so why is it necessary for the HBase ser

Re: duplicated hbase timestamps

2009-02-25 Thread Toby White
On 14 Dec 2008, at 22:41, stack wrote: The below looks like the known issue, HBASE-29 'HStore#get and HStore#getFull may not return expected values by timestamp when there is more than one MapFile'. What do you think Toby? Basically, if updates do not go in in chronological order, you'll

Re: MapReduce job to update HBase table in-place

2009-02-25 Thread Erik Holstad
Hi Stuart! Our tests showed that inputting your data using a BatchUpdate is faster than using the collector, but these were done a while ago, so if you find something else please let us know. Erik

Re: MapReduce job to update HBase table in-place

2009-02-25 Thread Jean-Daniel Cryans
Stuart, Currently there are some issues when outputting to a table with scanners active on it, mainly that the regions won't be able to split until the scanners are gone. No, you should not have loop back issues. My opinion is that you should just do a normal MR with an identity mapper or reduce

MapReduce job to update HBase table in-place

2009-02-25 Thread Stuart White
I'd like to write a MapReduce job to update an HBase table in-place, and I'd like to solicit a little guidance. Here's what I think I should do, as well as some questions. Any feedback is appreciated. - I want to examine all the rows in the table, and for some subset of these rows, update some o

Re: Question on region server/data node restart

2009-02-25 Thread Jean-Daniel Cryans
Correction, I was suggesting 0.18.2 (the svn branch) since it has many fixes that Michael would need and it won't break anything for him (as 0.19.0 will do with MR jobs). J-D On Wed, Feb 25, 2009 at 1:33 AM, stack wrote: > Michael, as J-D suggests above, can you update to 0.19.0 hbase? It's bet