Re: blocked when creating HTable

2010-12-07 Thread exception qin
Hi, I did some investigation on this issue. There are several things I need to make clear: 1) I'm using Cloudera's Distribution for Hadoop (0.20.0), HBase (0.20.6), and ZooKeeper (3.3.1). 2) HBase doesn't manage its own instance of ZooKeeper. 3) I found my program can connect to ZooKeeper successfully

Re: blocked when creating HTable

2010-12-07 Thread Lars George
Hi Exception, This is up to you to set up properly. If you run them on the same cluster/network, then you either share the same ZooKeeper or make sure they use different ports (as per the zoo.cfg). Also make sure you have the proper ZooKeeper quorum set and that your client is able to "see" it. If
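A minimal sketch of pointing a client at a specific quorum, assuming the 0.20-era Java client API (the property names and hypothetical host names below are my recollection, not from the thread):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;

    // Point the client at the same quorum the cluster uses (see zoo.cfg / hbase-site.xml).
    HBaseConfiguration conf = new HBaseConfiguration();
    conf.set("hbase.zookeeper.quorum", "zk1.example.com,zk2.example.com,zk3.example.com");
    conf.set("hbase.zookeeper.property.clientPort", "2181"); // must match zoo.cfg's clientPort
    HTable table = new HTable(conf, "mytable");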

Re: Best Practices Adding Rows

2010-12-07 Thread Lars George
Hi Alex, That is indeed the recommended way, i.e. use binary values if you can. As long as you can express the same sorting as a long as opposed to a string, then that's the way to go for sure. Lars On Dec 7, 2010, at 8:21, Alex Baranau wrote: > I think I've faced by the key format, smth lik
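To illustrate what "binary values" for keys means here, a hedged sketch using HBase's Bytes utility (the timestamp value is hypothetical):

    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    long timestamp = 1291708800000L;
    // 8 big-endian bytes: lexicographic byte order matches numeric order for
    // non-negative longs, unlike Long.toString(), where "10" sorts before "9".
    byte[] rowKey = Bytes.toBytes(timestamp);
    Put put = new Put(rowKey);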

Re: Importing to HBase from Java problem

2010-12-07 Thread Gökhan Çapan
I had forgotten to add the HBase conf to the MapReduce classpath. That solved the problem, thanks. On Mon, Dec 6, 2010 at 8:40 PM, Gökhan Çapan wrote: > > > On Mon, Dec 6, 2010 at 7:54 PM, Stack wrote: > >> That looks like a mismatch between client and server hbase versions. >> Ensure you have same run
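One hedged alternative to fixing the task classpath is to load hbase-site.xml into the job configuration explicitly (the path below is hypothetical; this is a sketch, not the fix used in the thread):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;

    Configuration conf = new Configuration();
    // Pull in the HBase client settings directly instead of relying on the task classpath.
    conf.addResource(new Path("file:///root/hbase/conf/hbase-site.xml")); // hypothetical path
    Job job = new Job(conf, "hbase-import");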

Re: serialized objects as strings or as object? & data corruption?

2010-12-07 Thread Friso van Vollenhoven
For large binary objects, you could consider Google Protocol Buffers. It is very compact when working with large lists of numbers, etc., where Java serialization adds a lot of overhead (for example: a single BigInteger object of value 0 takes 50 bytes in serialized form). If you anticipate
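To see the Java-serialization overhead being described, a self-contained check using only the JDK (exact byte counts vary by JVM version):

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.ObjectOutputStream;
    import java.math.BigInteger;

    public class SerializedSize {
        public static void main(String[] args) throws IOException {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            ObjectOutputStream oos = new ObjectOutputStream(bos);
            oos.writeObject(BigInteger.ZERO); // a single zero-valued BigInteger
            oos.close();
            // Prints the full serialized size, including class-descriptor overhead.
            System.out.println("serialized BigInteger(0): " + bos.size() + " bytes");
        }
    }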

Re: blocked when creating HTable

2010-12-07 Thread exception qin
Thanks, George. I have already shut down the Flume instance, so there should be no ZooKeeper conflict. I wrote a shell script to run the Java program. This is the script:

    #!/bin/bash
    HADOOPHOME="/root/hadoop/";
    HBASEHOME="/root/hbase";
    ZOOKEEPERHOME="/root/zookeeper";
    RUNLIB="${HADOOPHOME}/hado

A fatal error has been detected by the Java Runtime Environment

2010-12-07 Thread 陈加俊
One HBase regionserver crashed. What can I do to avoid this happening again? Thank you!

    #
    # A fatal error has been detected by the Java Runtime Environment:
    #
    # SIGSEGV (0xb) at pc=0x7fc147a4a0b1, pid=31145, tid=140468029049104
    #
    # JRE version: 6.0_17-b17
    # Java VM: OpenJDK 64-Bit Server VM (

Re: blocked when creating HTable

2010-12-07 Thread Lars George
OK, so ZooKeeper works now and the client can obviously connect. After the cluster is running, can you start the HBase shell and see if you can scan meta or root? Simply try "> scan '-ROOT-'" in the shell. Do you have anything blocking access to the server hosting those regions? Is port 60020

Weird Problem: Can not delete a specified row

2010-12-07 Thread 曾伟
Hi, I found a strange problem today: the delete operation does not work. Here is the code:

    HTable table = this.pool.getTable(this.tableName);
    System.out.println(table.exists(new Get(row)));
    Delete d = new Delete(row);
    table.delete(d);
    System.out.println(tabl

Re: blocked when creating HTable

2010-12-07 Thread exception qin
Yes, the HBase shell works fine. I can scan both '-ROOT-' and '.META.'. Also, I created a table and put some data into it. Port 60020 is being listened on by the region server:

    tcp6 0 0 dev_26:60020 [::]:* LISTEN hbase 1563816 5610/java

I wrote a simple program to try to locat

"Child" processes not getting killed

2010-12-07 Thread Hari Sreekumar
Hi, my cluster was running great until yesterday. Today I submitted some jobs and saw that they were taking way too long. On investigation, I saw that the "Child" processes created by previous MR jobs were not getting killed, even though no jobs were running on the cluster, and there

Re: Maps sharing a common table.

2010-12-07 Thread rajgopalv
@all: Thanks. I created an instance variable in the setup() function as suggested. rajgopalv wrote: > > Hi, > I'm writing an MR job to read values from a CSV file and insert them into > HBase using HTable.put(). > So each map function will insert one row. > There is no reduce function. > > But now
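For reference, a hedged sketch of that pattern against the 0.20-era client API (the table name "mytable", family "cf", and CSV layout are hypothetical):

    import java.io.IOException;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class CsvInsertMapper extends Mapper<LongWritable, Text, NullWritable, NullWritable> {
        private HTable table; // one HTable per task, not one per map() call

        @Override
        protected void setup(Context context) throws IOException {
            table = new HTable(new HBaseConfiguration(context.getConfiguration()), "mytable");
        }

        @Override
        protected void map(LongWritable key, Text line, Context context)
                throws IOException, InterruptedException {
            String[] fields = line.toString().split(",");
            Put put = new Put(Bytes.toBytes(fields[0]));
            put.add(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes(fields[1]));
            table.put(put); // one row per map() call, as in the original post
        }

        @Override
        protected void cleanup(Context context) throws IOException {
            table.flushCommits(); // push any client-side buffered puts
        }
    }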

Zoo keeper exception in the middle of MR

2010-12-07 Thread rajgopalv
Hi all, I wrote an MR job for inserting rows into HBase. I open a CSV file which is present in HDFS, and I use Put in the map() to insert into HBase. This technique worked for just 50% of the data and then the job got killed. The log file is here: http://pastebin.com/12gmF3Z6 Why is this happening

Re: Zoo keeper exception in the middle of MR

2010-12-07 Thread Ted Dunning
It looks like your task took a long time to complete (> 10 minutes) and didn't produce any output or report any status to Hadoop during this time. This often happens during indexing tasks where a reducer or mapper builds some off-line data structure for a long time. Can you force your mappers to
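A hedged sketch of what "report status" looks like in the new (0.20) MapReduce API; calling these periodically resets the task's 10-minute timeout clock:

    import java.io.IOException;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class HeartbeatMapper extends Mapper<LongWritable, Text, NullWritable, NullWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            long processed = 0;
            for (String record : value.toString().split(",")) {
                // ... expensive per-record work ...
                if (++processed % 1000 == 0) {
                    context.progress(); // tells the TaskTracker the task is still alive
                    context.setStatus("processed " + processed + " records");
                }
            }
        }
    }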

DataDevRoom at the 2011 edition of the FOSDEM

2010-12-07 Thread Isabel Drost
Hello, We (Olivier, Nicolas and I) are organizing a Data Analytics DevRoom that will take place during the next edition of the FOSDEM in Brussels on Feb. 5. Here is the CFP: http://datadevroom.couch.it/CFP You might be interested in attending the event and taking the opportunity to speak about y

Re: DataDevRoom at the 2011 edition of the FOSDEM

2010-12-07 Thread Ted Dunning
Isabel, If there is a good technical conference some time after March, I would love to hear about it. On Tue, Dec 7, 2010 at 7:58 AM, Isabel Drost wrote: > Hello, > > We (Olivier, Nicolas and I) are organizing a Data Analytics DevRoom > that will take place during the next edition of the FOSDEM

Re: "Child" processes not getting killed

2010-12-07 Thread Stack
Can you dig in more, Hari? When a child process won't go down, try figuring out what it's doing. Thread-dump it or study its logs? St.Ack On Tue, Dec 7, 2010 at 4:36 AM, Hari Sreekumar wrote: > Hi, > > My cluster was running great till yesterday. Today, I submitted some > jobs and I saw that th

Re: blocked when creating HTable

2010-12-07 Thread Stack
On Tue, Dec 7, 2010 at 3:51 AM, exception qin wrote: > Does this mean port 60020 is inaccessible? > Yes -- can you get to 10.1.1.26:60020? -- or your program is not picking up the configuration and is pointed at 60020 on the wrong interface, or there is a mismatch in hbase versions between client and

Re: Weird Problem: Can not delete a specified row

2010-12-07 Thread Stack
Can you dig in more on the row? Query for all versions of the row to see what is in there, and then see if you have the same issue if you scan across the row as opposed to Getting it. Sounds like a bug. Would be nice to get more info so we can try to reproduce it on this end. St.Ack 2010/12/7 曾伟 : >

RE: blocked when creating HTable

2010-12-07 Thread Buttler, David
I thought ZooKeeper 3.2.2 was the one that worked with HBase 0.20.6? I know I have had problems in the past when I have tried to mix and match ZooKeeper versions. Dave -Original Message- From: exception qin [mailto:exceptionq...@gmail.com] Sent: Tuesday, December 07, 2010 12:20 AM To:

RE: Weird Problem: Can not delete a specified row

2010-12-07 Thread Veeramachaneni, Ravi
Check the timestamp of the row; if it is a future date, you may not be able to delete it. From: saint@gmail.com [saint@gmail.com] On Behalf Of Stack [st...@duboce.net] Sent: Tuesday, December 07, 2010 10:27 AM To: user@hbase.apache.org Subject: Re:
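A hedged sketch of checking those timestamps by fetching every version of the row (0.20-era client API; `table` and `row` are the same variables as in the code earlier in the thread):

    import org.apache.hadoop.hbase.KeyValue;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    Get get = new Get(row);
    get.setMaxVersions(); // return every stored version, not just the latest
    Result result = table.get(get);
    for (KeyValue kv : result.raw()) {
        // A timestamp later than System.currentTimeMillis() would explain the undeletable row.
        System.out.println(Bytes.toString(kv.getQualifier()) + " @ " + kv.getTimestamp());
    }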

Re: Issues running a large MapReduce job over a complete HBase table

2010-12-07 Thread Stack
On Mon, Dec 6, 2010 at 11:15 PM, Gabriel Reid wrote: > Hi St.Ack, > > The cluster is a set of 5 machines, each with 3GB of RAM and 1TB of > storage. One machine is doing duty as Namenode, HBase Master, HBase > Regionserver, Datanode, Job Tracker, and Task Tracker, while the other > four are all Da

Re: Weird Problem: Can not delete a specified row

2010-12-07 Thread Stack
2010/12/7 Veeramachaneni, Ravi : > Check the timestamp of the row, if it is in future date, you may not be able > to delete. > Yes. Also do what Ravi suggests. St.Ack

RE: serialized objects as strings or as object? & data corruption?

2010-12-07 Thread Hiller, Dean (Contractor)
Purely application bugs are what I am thinking about, and the plan to fix that data corruption when it happens (i.e. a bug is in prod for 1 day and I need to fix all records that it touched). I really like that JSON approach. That sounds quite nice, and I think a short-lived Map-Reduce job might f

RE: serialized objects as strings or as object? & data corruption?

2010-12-07 Thread Buttler, David
If you are not doing any type of aggregation, then a reduce job adds unnecessary overhead. For your example, I would definitely recommend a single map job that does a get/put operation pair. Also, don't forget that HBase stores versions, so you may be able to simply delete a corrupted value. Da
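A hedged sketch of wiring up such a map-only job over a table, assuming the 0.20-era TableMapReduceUtil helpers (`conf` and the `FixupMapper` class are hypothetical):

    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.mapreduce.Job;

    Job job = new Job(conf, "fix-corrupted-values");
    TableMapReduceUtil.initTableMapperJob("mytable", new Scan(), FixupMapper.class,
        ImmutableBytesWritable.class, Put.class, job);
    // Null reducer: this call just configures TableOutputFormat for writing Puts back.
    TableMapReduceUtil.initTableReducerJob("mytable", null, job);
    job.setNumReduceTasks(0); // map-only: each map() does its get/put pair, no shuffle/sort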

Re: Issues running a large MapReduce job over a complete HBase table

2010-12-07 Thread Gabriel Reid
Hi St.Ack, > You might be leaking scanners but that should have no effect on number > of open store files. On deploy of a region, we open its store files > and hold them open and do not open others -- not unless > compacting/splitting. > > Hope this helps, Yes, huge help, thank you very much for

RE: serialized objects as strings or as object? & data corruption?

2010-12-07 Thread Hiller, Dean (Contractor)
Yes, I have been wondering about that exact scenario of "rollback" from versions. I also wonder: if I set it to store the last 3 versions, do I triple my 7 terabytes into 21 terabytes? As it stands now, I don't know yet whether that is :( or :). Thoughts on versioning here from experienced use
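For reference, a hedged sketch of where that setting lives (0.20-era API; table and family names hypothetical). Note that extra versions are only stored for cells that are actually rewritten, so keeping 3 versions does not triple storage unless every cell gets written at least three times:

    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;

    HTableDescriptor desc = new HTableDescriptor("mytable");
    HColumnDescriptor family = new HColumnDescriptor("data");
    family.setMaxVersions(3); // retain at most 3 versions per cell
    desc.addFamily(family);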

Re: Issues running a large MapReduce job over a complete HBase table

2010-12-07 Thread Stack
Please be our guest. You'll need to make yourself an account on the wiki, but don't let that intimidate you. Thanks Gabriel, St.Ack On Tue, Dec 7, 2010 at 11:30 AM, Gabriel Reid wrote: > Hi St.Ack, > >> You might be leaking scanners but that should have no effect on number >> of open store files. On dep

restart hbase after configuration changes?

2010-12-07 Thread Albert Shau
Hi all, If I'm changing the configuration (editing value of hbase.hregion.majorcompaction in hbase-site.xml), do I need to restart hbase? Also, do I need to change the setting on all the regionservers, or is the master enough? Thanks, Albert

Re: restart hbase after configuration changes?

2010-12-07 Thread Ted Yu
You need to restart the cluster. See also https://issues.apache.org/jira/browse/HBASE-2789 On Tue, Dec 7, 2010 at 12:14 PM, Albert Shau wrote: > Hi all, > > If I'm changing the configuration (editing value of > hbase.hregion.majorcompaction in hbase-site.xml), do I need to restart > hbase? Also, d

Re: restart hbase after configuration changes?

2010-12-07 Thread Stack
Yeah, what Ted says... and before restart, copy the new value out to all hosts. St.Ack On Tue, Dec 7, 2010 at 1:30 PM, Ted Yu wrote: > You need to restart cluster. > > See also https://issues.apache.org/jira/browse/HBASE-2789 > > On Tue, Dec 7, 2010 at 12:14 PM, Albert Shau wrote: > >> Hi all, >

slow move from rdbm to hadoop/hbase(is there replication strategies for this?)

2010-12-07 Thread Hiller, Dean (Contractor)
We are going to move 7 terabytes (set to grow to 35 when our SLA goes from 2 years to 10 years of storage) from an RDBMS to a Hadoop/HBase-type system, and I was wondering if anyone knew how to get events from HBase on persisted/modified entities so that changes can be replicated to our RDBMS easily

0.90 ready for launch?

2010-12-07 Thread Christian van der Leeden
Hi, as a newbie and (hopefully) future adopter, I just wanted to ask if 0.90 is ready to launch. No bugs in JIRA, at least. Btw, 0.90 requires an adapted Hadoop version; can this also be downloaded through apache.org? Christian

Re: slow move from rdbm to hadoop/hbase(is there replication strategies for this?)

2010-12-07 Thread Todd Lipcon
Hi Dean, There's no good "event log" capability right now, though a JIRA was recently filed to discuss this. One thing you could do is run periodic MR jobs that scan with a timestamp range to find any edits made in the last hour. Then from that MR job, update the RDBMS directly from the map tasks
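A hedged sketch of the timestamp-range scan Todd describes, assuming the 0.20-era Scan API:

    import org.apache.hadoop.hbase.client.Scan;

    long now = System.currentTimeMillis();
    Scan scan = new Scan();
    // Only return cells written in the last hour; the periodic MR job
    // then pushes those edits to the RDBMS from the map tasks.
    scan.setTimeRange(now - 60 * 60 * 1000L, now);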

Re: 0.90 ready for launch?

2010-12-07 Thread Stack
Christian: Its pretty close to release. The second RC was put up yesterday (See dev list for announcement or see here http://mail-archives.apache.org/mod_mbox/hbase-dev/201012.mbox/browser). Here's the story on hadoop http://people.apache.org/~stack/hbase-0.90.0-candidate-1/docs/notsoquick.html#

jobtracker start and mapred.job.tracker is using local

2010-12-07 Thread Hiller, Dean (Contractor)
This is for those people that get the typical:

    2010-12-07 10:10:22,537 FATAL org.apache.hadoop.mapred.JobTracker: java.lang.RuntimeException: Not a host:port pair: local
            at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:136)
            at org.apache.hadoop.net.NetUtils.c
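This error means mapred.job.tracker is still at its default value "local" (the in-process LocalJobRunner); the fix is to give it a real host:port value, normally in conf/mapred-site.xml. A hedged sketch of the expected form (host name hypothetical):

    import org.apache.hadoop.conf.Configuration;

    Configuration conf = new Configuration();
    // "local" selects the in-process LocalJobRunner; a real JobTracker needs host:port.
    conf.set("mapred.job.tracker", "master.example.com:9001"); // hypothetical host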

Re: blocked when creating HTable

2010-12-07 Thread exception qin
Hi everyone, it was because of a version mismatch between HBase and my client. Thanks for all your replies. On Wed, Dec 8, 2010 at 12:32 AM, Buttler, David wrote: > I thought zookeeper 3.2.2 was the one that worked with hbase 0.20.6? I > know I have had problems in the past when I have tried to m

Re: "Child" processes not getting killed

2010-12-07 Thread Hari Sreekumar
Hi Stack, The logs don't show anything nasty. E.g., I ran a job which spawned 5 mappers. All of the Child processes spawned by them remained even after the job completed. 3 map tasks completed, and they have the following log: *stdout logs* -- *stderr l