Re: Big machines or (relatively) small machines?

2010-06-08 Thread Tim Robertson
> - Do you plan to serve data out of HBase or will you just use it for > MapReduce? Or will it be a mix (not recommended)? I am also curious what would be the recommended deployment when you have this need (e.g. building multiple Lucene indexes which hold only the Row ID, so building is MR intens

Re: HBASE-2001 and ElasticSearch

2010-06-08 Thread Steven Noels
On Tue, Jun 8, 2010 at 2:55 AM, Daniel Einspanjer wrote: > We are specifically looking for the ability to create callbacks on put, > increment, and delete for specific tables so we can implement the indexing > solution. This is actually advance preparation for Socorro 2.0 which won't > be releas

Lily pre-release info available

2010-06-08 Thread Steven Noels
Hi you all, first of all, thanks for a great project to base our work on, and we hope to find more time to help and make it better. Yesterday, we announced Lily at Berlin Buzzwords, which is still going on at the moment (I'm sitting in on the Cloudera sales pitch -eer- Hadoop overview present

Re: Lily pre-release info available

2010-06-08 Thread Lars George
Hi Steven, First off, congrats on the progress! This is super exciting. As usual, if you need help you know where to find us :) but it seems you have it all well under control. As far as Buzzwords is concerned, I had a proposal submitted but it was rejected. But with projects like Lily we will get

help regarding HadoopDB

2010-06-08 Thread Muhammad Mudassar
Hi all, I am trying to run HadoopDB on a single node along with MySQL on an Ubuntu machine. I have done the basic configuration by following the HadoopDB Quick Start Guide at http://hadoopdb.sourceforge.net/guide/ I am able to create and query tables from the Hive shell for those tables which are stored in

Re: HBASE-2001 and ElasticSearch

2010-06-08 Thread Daniel Einspanjer
I didn't realize that Lily was that far along, I thought you were still in R&D for a few more months. This sounds very promising and we'll take a look at what you have available. -Daniel On 6/8/10 3:38 AM, Steven Noels wrote: On Tue, Jun 8, 2010 at 2:55 AM, Daniel Einspanjer wrote: We a

Re: HBASE-2001 and ElasticSearch

2010-06-08 Thread Steven Noels
On Tue, Jun 8, 2010 at 4:46 PM, Daniel Einspanjer wrote: > I didn't realize that Lily was that far along, I thought you were still in > R&D for a few more months. This sounds very promising and we'll take a look > at what you have available. > We have only that much time to make a first and ho

Re: help regarding HadoopDB

2010-06-08 Thread Sean Bigdatafun
(My limited understanding is that HadoopDB is far from production-ready. Correct me if I am wrong.) It uses M/R to build the index, and from there on it serves queries from the indexed data stored on the PostgreSQL storage engine. I do not fully understand how they provide a JDBC interface. Put it in an

Re: HBASE-2001 and ElasticSearch

2010-06-08 Thread Andrew Purtell
Steven, > From: Steven Noels [...] > We'll soon (2-3 weeks) be releasing a HBase table-backed > WAL/queue implementation that binds HBase with SOLR for > index updates in Lily, but is designed not to be Lily- > specific or -dependent. I don't know if this is of > any help for you, but it could be

RE: Lily pre-release info available

2010-06-08 Thread Gibbon, Robert, VF-Group
Hello Lars, It's a pity that your presentation was bounced - I was eating BuzzWords too; there was no formal HBase representation, which was really a crying shame. It was noted by some that HBase has pretty much eclipsed Cassandra since FB dumped it in favour of HBase - but I guess it's no competi

Re: Suggested config changes to be made

2010-06-08 Thread Jean-Daniel Cryans
On Mon, Jun 7, 2010 at 5:52 PM, Daniel Einspanjer wrote: >  For Socorro, we currently have a 15 node HBase 0.20.3 cluster. > The hardware is dual hyperthreaded quads with 24GB of RAM (RS JVM is > allocated 8GB). > HDFS Health reports that we are currently using 20TB out of 60TB. (Storage > is only

RE: Lily pre-release info available

2010-06-08 Thread Andrew Purtell
> From: Gibbon, Robert, VF-Group [...] > It was noted by some that Hbase has pretty much eclipsed > Cassandra since FB dumped it in favour of hbase - but I > guess its no competition... I don't see it as a competition. These two systems solve a similar but partially disjoint set of scalability pr

RE: Suggested config changes to be made

2010-06-08 Thread Jonathan Gray
When thinking about your region size (max file size), you need to consider the number of regions you will have and how many regions there will then be on each node. This has a significant impact on the _actual_ size of files that get flushed to disk. You can increase your flush size to 256 but
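A rough back-of-the-envelope sketch of why the region count per node matters for flush size. Everything here is an assumption made for illustration and not something stated in the thread: it assumes the region server caps total memstore usage at a fraction of its heap (0.4 is used as a stand-in value), and it assumes writes are spread evenly over the active regions, which real workloads rarely are. The class and method names are hypothetical.

```java
// Illustrative arithmetic only; the 0.4 fraction and the even spread of
// writes across regions are assumptions, not measured behaviour.
class FlushSizeEstimate {
    /**
     * Estimates how large a single region's in-memory buffer can grow before
     * it is flushed: either the configured flush size, or this region's even
     * share of the server-wide memstore budget, whichever is smaller.
     */
    static double estimateFlushMb(double heapMb, double memstoreFraction,
                                  int activeRegions, double configuredFlushMb) {
        double sharePerRegionMb = heapMb * memstoreFraction / activeRegions;
        return Math.min(configuredFlushMb, sharePerRegionMb);
    }

    public static void main(String[] args) {
        // 8 GB region server heap, flush size raised to 256 MB.
        for (int regions : new int[] {10, 100, 500}) {
            System.out.printf("%d active regions -> about %.1f MB per flush%n",
                    regions, estimateFlushMb(8192, 0.4, regions, 256));
        }
    }
}
```

Under these assumptions, raising the flush size only helps while each region's share of the memory budget is larger than the configured value; with many actively written regions per node the share becomes the limit.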

Re: Suggested config changes to be made

2010-06-08 Thread Stack
On Mon, Jun 7, 2010 at 5:52 PM, Daniel Einspanjer wrote: > This output indicates that block cache is disabled on -ROOT-.  It sounds > like it was recommended to enable this.  Is it just an alter table or is > there anything else that needs to be done? > In 0.20.4/5 there is a little script named b

Reads of a recently written/modified value

2010-06-08 Thread Vidhyashankar Venkataraman
I was trying to execute some operations on an HBase instance. After performing a dozen write operations (with auto flush not set), HBase could not read the inserted/modified records successfully (using the Get operations). But with auto flush set and after writing the records, I could read the

Re: HBase fail-over/reliability issues

2010-06-08 Thread James Baldassari
Well it took almost a month, but I finally saw this problem in the wild again (0.20.3 + HBASE-2180), and this time I think I have more info about what happened. The symptoms were the same as usual: read throughput drops drastically, the HBase client throws tons of errors, and at least one region s

Re: Reads of a recently written/modified value

2010-06-08 Thread Ryan Rawson
Turning auto flush off will cause the client to accumulate puts without sending them to the server. Gets and scans only talk to the server and thus ignore the client write cache. On Jun 8, 2010 1:55 PM, "Vidhyashankar Venkataraman" wrote: I was trying to execute some operations on a Hbase instanc
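The behaviour described here can be sketched with a toy model. This is not the HBase client API: `Server` and `BufferedTable` are hypothetical classes that only borrow the method names mentioned in this thread (`setAutoFlush`, `flushCommits`, `close`), and the model leaves out other triggers such as a flush when the buffer fills up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of a client-side write buffer; NOT the real HBase classes.
class WriteBufferDemo {
    /** Stands in for the server: the only place reads are answered from. */
    static class Server {
        private final Map<String, String> rows = new HashMap<>();
        void apply(String row, String value) { rows.put(row, value); }
        String get(String row) { return rows.get(row); }
    }

    /** Stands in for the client-side table handle. */
    static class BufferedTable {
        private final Server server;
        private final List<String[]> writeBuffer = new ArrayList<>();
        private boolean autoFlush = true;

        BufferedTable(Server server) { this.server = server; }

        void setAutoFlush(boolean autoFlush) { this.autoFlush = autoFlush; }

        /** Puts are queued locally; they reach the server only on a flush. */
        void put(String row, String value) {
            writeBuffer.add(new String[] {row, value});
            if (autoFlush) flushCommits();
        }

        /** Gets go straight to the server and never look at the buffer. */
        String get(String row) { return server.get(row); }

        void flushCommits() {
            for (String[] p : writeBuffer) server.apply(p[0], p[1]);
            writeBuffer.clear();
        }

        /** Closing the table flushes whatever is still buffered. */
        void close() { flushCommits(); }
    }

    public static void main(String[] args) {
        BufferedTable table = new BufferedTable(new Server());
        table.setAutoFlush(false);
        table.put("row1", "v1");
        System.out.println("before flush: " + table.get("row1"));
        table.flushCommits();
        System.out.println("after flush:  " + table.get("row1"));
    }
}
```

In this model a put followed immediately by a get misses the new value until `flushCommits()` or `close()` runs, which matches the symptom reported at the start of the thread.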

Re: Reads of a recently written/modified value

2010-06-08 Thread Stack
When you say you could not read the inserted/modified records successfully, can you say more about what this means? Were all inserts done in the same millisecond? St.Ack On Tue, Jun 8, 2010 at 1:53 PM, Vidhyashankar Venkataraman wrote: > I was trying to execute some operations on a Hbase instance. After perfo

RE: Reads of a recently written/modified value

2010-06-08 Thread Hegner, Travis
Is there a manual flush function that the user can call before attempting to read the data back out? Travis Hegner http://www.travishegner.com/ -Original Message- From: Ryan Rawson [mailto:ryano...@gmail.com] Sent: Tuesday, June 08, 2010 5:03 PM To: user@hbase.apache.org Subject: Re: Re

Re: Reads of a recently written/modified value

2010-06-08 Thread Vidhyashankar Venkataraman
Misunderstood about the flush option then.. Thanks for clarifying: So, why doesn't the flush option include Delete operations? Why only Put operations? And shouldn't the client flush after it ends? Vidhya On 6/8/10 2:03 PM, "Ryan Rawson" wrote: Turning auto flush off will cause the client to

RE: Reads of a recently written/modified value

2010-06-08 Thread Jonathan Gray
There is indeed an HTable.flushCommits() > -Original Message- > From: Hegner, Travis [mailto:theg...@trilliumit.com] > Sent: Tuesday, June 08, 2010 2:05 PM > To: user@hbase.apache.org > Subject: RE: Reads of a recently written/modified value > > Is there a manual flush function that the u

RE: Reads of a recently written/modified value

2010-06-08 Thread Jonathan Gray
What do you mean "after it ends"? If you call HTable.close(), that then calls flushCommits(). Currently Delete operations are not buffered on the client side, only Puts are. > -Original Message- > From: Vidhyashankar Venkataraman [mailto:vidhy...@yahoo-inc.com] > Sent: Tuesday, June 08,

Re: Reads of a recently written/modified value

2010-06-08 Thread Vidhyashankar Venkataraman
A question related to the previous mail: do I have to call flushCommits every time I finish the HBase client? Thank you, Vidhya On 6/8/10 2:03 PM, "Ryan Rawson" wrote: Turning auto flush off will cause the client to accumulate puts without sending them to the server. Gets and scans only talk t

RE: Reads of a recently written/modified value

2010-06-08 Thread Jonathan Gray
What do you mean by "finish the HBase client"? If you turn off auto-flush, this means that every time you do HTable.put(Put) you are just adding to an internal, client-side buffer. When you are done with insertions and you want them to be persisted into HBase, you must say that you are done in

Re: Reads of a recently written/modified value

2010-06-08 Thread Vidhyashankar Venkataraman
>> What do you mean "after it ends"? I meant when the client program finishes running without calling the close function or the flushCommits.. Shouldn't commits be flushed when Htable's finalize is called? >> Currently Delete operations are not buffered on the client side, only Puts >> are. Wil

RE: Reads of a recently written/modified value

2010-06-08 Thread Jonathan Gray
> >> What do you mean "after it ends"? > I meant when the client program finishes running without calling the > close function or the flushCommits.. Shouldn't commits be flushed when > Htable's finalize is called? HTable has no finalize(). Correct me if I'm wrong, but isn't finalize() what is ca

Zookeeper exception - ConnectionLoss for /hbase

2010-06-08 Thread Raghava Mutharaju
Hi all, I am using HBase API to insert data into a table on a cluster. I am getting the following exception (full stack trace in pastebin, link given below). [java] 10/06/08 05:02:15 WARN zookeeper.ZooKeeperWrapper: Failed to create /hbase -- check quorum servers, currently=localhost:2181

RE: Zookeeper exception - ConnectionLoss for /hbase

2010-06-08 Thread Jonathan Gray
Is the directory containing hbase-site.xml in the classpath of your client? > -Original Message- > From: Raghava Mutharaju [mailto:m.vijayaragh...@gmail.com] > Sent: Tuesday, June 08, 2010 4:39 PM > To: user@hbase.apache.org > Subject: Zookeeper exception - ConnectionLoss for /hbase > > H
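One quick way to answer this question from inside the client JVM is to ask the class loader where it finds the file. This is a minimal sketch using only the standard Java class loader API; the class name `ConfigOnClasspath` is hypothetical, and it assumes the client loads hbase-site.xml as a classpath resource and falls back to default settings when the file is missing.

```java
import java.net.URL;

// Diagnostic sketch: reports whether a resource is visible on the classpath.
class ConfigOnClasspath {
    /** Returns the location of the resource, or null if the classpath lacks it. */
    static URL locate(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resourceName);
    }

    static String describe(String resourceName) {
        URL url = locate(resourceName);
        return url == null
                ? resourceName + " NOT found on classpath"
                : resourceName + " found at " + url;
    }

    public static void main(String[] args) {
        // Run this with exactly the same classpath as the failing client.
        System.out.println(describe("hbase-site.xml"));
    }
}
```

If the file is reported as not found, the directory that contains it (not the file itself) needs to be added to the client's classpath.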

Re: Reads of a recently written/modified value

2010-06-08 Thread Vidhyashankar Venkataraman
> You need to explicitly flush your buffer by either calling flushCommits() or > close() this seems normal and logical to me. Yup you are right.. Pardon my gaffe about finalize.. I was thinking in C++.. But I think a line about closing the table in the Javadoc might help.. I borrowed the cod

Re: Zookeeper exception - ConnectionLoss for /hbase

2010-06-08 Thread Raghava Mutharaju
Is it enough if $HBASE_HOME/conf (the directory containing hbase-site.xml) is in HADOOP_CLASSPATH in the hadoop-env.sh file? Regards, Raghava. On Tue, Jun 8, 2010 at 7:51 PM, Jonathan Gray wrote: > Is the directory containing hbase-site.xml in the classpath of your client? > > > -Original Message---

RE: Reads of a recently written/modified value

2010-06-08 Thread Jonathan Gray
Deletes could be buffered. There has been recent work on unifying these as Mutate operations which would allow buffering of them both. There is no explicit reason why except that it was previously difficult to do. > -Original Message- > From: Vidhyashankar Venkataraman [mailto:vidhy...

RE: Zookeeper exception - ConnectionLoss for /hbase

2010-06-08 Thread Jonathan Gray
So you are running the insert from mapreduce and not a standalone client? If so, then it should pick that up. If not, then hadoop-env would need to be used by your client. Feel free to jump into #hbase on freenode irc if you want faster help (though I'm on my way out shortly). > -Origina

Re: Zookeeper exception - ConnectionLoss for /hbase

2010-06-08 Thread Patrick Hunt
The client is unable to connect to the server. According to the log the client thinks that the server is at localhost:2181, is it? (check netstat -a output for example) Patrick On 06/08/2010 04:38 PM, Raghava Mutharaju wrote: Hi all, I am using HBase API to insert data into a table o

Re: Lily pre-release info available

2010-06-08 Thread Imran M Yousuf
Congratulations to Steven and the rest of the Lily CMS team. It's extremely exciting to see all the developments centering on HBase, especially since we are also looking to develop services around HBase. I have to say that I find the HBase community to be extremely helpful and as rightly pointed out earlier

Re: Zookeeper exception - ConnectionLoss for /hbase

2010-06-08 Thread Raghava Mutharaju
Hi JG, aah, that could be the reason. It's a standalone client. I will include hbase-site.xml in the classpath and check it out. Thank you. Patrick: The server is not localhost. But the client thinks so. I modified the hbase.zookeeper.quorum property in hbase-site.xml but even then, it didn

Re: Zookeeper exception - ConnectionLoss for /hbase

2010-06-08 Thread Raghava Mutharaju
Hi, I added $HBASE_HOME/conf to the classpath. Even after that, I still get the same exceptions. I added it to my ant file as follows env property doe