Re: Lowering HDFS socket timeouts

2012-07-18 Thread N Keywal
Hi Bryan, It's a difficult question, because dfs.socket.timeout is used all over the place in hdfs. I'm currently documenting this. Especially: - it's used for connections between datanodes, and not only for connections between hdfs clients and hdfs datanodes. - It's also used for the two types of

Re: Hbase throwing error on batch inserts

2012-07-18 Thread John Hancock
Prakrati, Are you putting all of the puts in a list and then doing them all at once? Without having seen the code (which you may want to post to this list), I am guessing you are putting too many puts in the list before you call Table.put, which is why it works for 50,000 but not 75,000. -John
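
A minimal sketch of the chunking John is suggesting (the class name and CHUNK_SIZE are illustrative, assuming the 0.94-era HTable client API):

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;

    public class ChunkedPuts {
        // Illustrative batch size; tune it against your client and server heap.
        private static final int CHUNK_SIZE = 10000;

        // Sends the puts in bounded batches instead of one huge list.
        public static void putInChunks(HTable table, List<Put> puts) throws IOException {
            List<Put> chunk = new ArrayList<Put>(CHUNK_SIZE);
            for (Put p : puts) {
                chunk.add(p);
                if (chunk.size() >= CHUNK_SIZE) {
                    table.put(chunk);
                    table.flushCommits(); // drain the client-side write buffer
                    chunk.clear();
                }
            }
            if (!chunk.isEmpty()) {
                table.put(chunk);
                table.flushCommits();
            }
        }
    }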

RE: Hbase throwing error on batch inserts

2012-07-18 Thread Prakrati Agrawal
Yes, I am putting the puts in a list before I call Table.put, but it was working fine when I was doing the same for a table which has only 2 column families. Thanks Prakrati

Re: Hbase joins using MultiTableInputCollection [HBASE-3996]

2012-07-18 Thread David Koch
Hi Ted, Thank you for your reply. You are right, the ticket has not been closed yet. At this point, I am mainly trying to understand how MultiTableInputCollection can be used to do joins between HBase tables, if possible with an example. Thanks, /David

Re: hbase map reduce is talking lot of time

2012-07-18 Thread syed kather
Thanks Shashwat. After increasing the size of the disk, the MR job works fine. There are 6 mappers running: the first 5 mappers take 1L (100,000) records each and execute in 20 min, but the 6th mapper takes 5L records and needs nearly 1.5 hours to execute. Can anyone give me an idea to

Re: Lowering HDFS socket timeouts

2012-07-18 Thread Bryan Beaudreault
Thanks for the response, N. I could be wrong here, but since this problem is in the HDFS client code, couldn't I set this dfs.socket.timeout in my hbase-site.xml and it would only affect hbase connections to hdfs? I.e. we wouldn't have to worry about affecting connections between datanodes,
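
If that is indeed how it behaves, the client-side override is a one-liner. A sketch, assuming the property is read from the HBase client's Configuration (the 10s value is illustrative, and whether datanode-to-datanode traffic stays unaffected is exactly the open question in this thread):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class ClientTimeoutConfig {
        public static void main(String[] args) {
            // Loads hbase-site.xml from the classpath, then overrides the
            // HDFS read timeout for connections made by this client only.
            Configuration conf = HBaseConfiguration.create();
            conf.setInt("dfs.socket.timeout", 10000); // 10s; illustrative
            System.out.println("dfs.socket.timeout = " + conf.get("dfs.socket.timeout"));
        }
    }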

Re: How to merge regions in HBase?

2012-07-18 Thread Bryan Beaudreault
Shouldn't it be possible for him to have empty regions if he has a TTL on his data? -- Bryan Beaudreault On Wednesday, July 18, 2012 at 9:58 AM, Kevin O'dell wrote: Also, depending on the version of HBase that you are running, you may have to bring down the cluster to merge and not just

Re: HBase 0.94.1 release date

2012-07-18 Thread Harsh J
Amit, It would be released if the current vote passes: http://search-hadoop.com/m/2LLx01jlcYd On Wed, Jul 18, 2012 at 6:43 PM, Amit Sela am...@infolinks.com wrote: Hi all, Does anyone know when (approximately) the HBase 0.94.1 release will be available? Thanks. -- Harsh J

Re: secondary index using coprocessors

2012-07-18 Thread Sever Fundatureanu
Hi, I also looked at your implementation. I have some issues with it, but maybe someone can correct me if I'm wrong. What you're doing is writing the secondary index from the memstore in the preFlush hook. However, there is no mutually exclusive lock (write lock) taken on the whole memstore
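
For reference, a bare skeleton of the kind of hook being debated (the class name is hypothetical; this assumes the 0.94-era coprocessor API). The point of the thread is that nothing in this hook locks the memstore against concurrent writes:

    import java.io.IOException;

    import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
    import org.apache.hadoop.hbase.coprocessor.ObserverContext;
    import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;

    public class SecondaryIndexObserver extends BaseRegionObserver {
        @Override
        public void preFlush(ObserverContext<RegionCoprocessorEnvironment> c)
                throws IOException {
            // Runs just before the memstore is flushed to a StoreFile.
            // An index writer would read memstore contents here, but new
            // puts can still arrive while it runs (no exclusive write lock).
        }
    }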

Bulk Import Data Locality

2012-07-18 Thread Alex Baranau
Hello, As far as I understand, the Bulk Import functionality does not take data locality into account. The MR job will create as many reducer tasks as there are regions to write into, but it will not advise on which nodes to run these tasks. In that case a reducer task which writes HFiles of some

Re: Rowkey hashing to avoid hotspotting

2012-07-18 Thread AnandaVelMurugan Chandra Mohan
Hi Cristofer, The data I store is test cell reports about a component. I have many test cell reports for each model number + serial number combination. So, to make the rowkey unique, I added a timestamp.
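
For context, a sketch of the usual hash-prefix approach from the thread title (the helper is hypothetical): prefix the key with a few bytes of a hash of the stable part (model + serial), so consecutive timestamps don't all land on one region, then append the timestamp for uniqueness:

    import java.security.MessageDigest;
    import java.security.NoSuchAlgorithmException;

    import org.apache.hadoop.hbase.util.Bytes;

    public class RowKeys {
        // Builds: [4-byte hash prefix][model:serial][timestamp]
        public static byte[] makeKey(String model, String serial, long timestamp)
                throws NoSuchAlgorithmException {
            byte[] id = Bytes.toBytes(model + ":" + serial);
            byte[] hash = MessageDigest.getInstance("MD5").digest(id);
            // The prefix spreads writes across regions, at the cost of
            // losing globally time-ordered scans.
            return Bytes.add(Bytes.head(hash, 4), id, Bytes.toBytes(timestamp));
        }
    }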

Re: Lowering HDFS socket timeouts

2012-07-18 Thread N Keywal
I don't know. The question is mainly about the read timeout: you will connect to the ipc.Client with a read timeout of, let's say, 10s. Server side, the implementation may do something with another server, with a connect/read timeout of 60s. So if you have: HBase -- live DN -- dead DN The timeout will

Re: Table not storing versions after deleteall

2012-07-18 Thread Jean-Daniel Cryans
The deleteall marker will hide everything that comes before its timestamp (in your case, the current time); if you just want to delete specific values, use delete instead. Hope this helps, J-D
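
The distinction in client API terms, as a rough sketch (the table name and cell coordinates are made up; assumes the 0.94-era client):

    import java.io.IOException;

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Delete;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.util.Bytes;

    public class DeleteExamples {
        public static void main(String[] args) throws IOException {
            HTable table = new HTable(HBaseConfiguration.create(), "t1");
            byte[] row = Bytes.toBytes("row1");

            // Row-level delete (what the shell's deleteall does): a marker
            // that hides everything in the row at or before its timestamp.
            table.delete(new Delete(row));

            // Column + version delete: masks only the cell with this exact
            // timestamp; other versions of the column stay visible.
            Delete d = new Delete(row);
            d.deleteColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), 1342000000000L);
            table.delete(d);

            table.close();
        }
    }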

RES: Bulk Import Data Locality

2012-07-18 Thread Cristofer Weber
Hi Alex, Here we worked with bulk import, creating the HFiles in an MR job, and we finish the load by calling the doBulkLoad method of the LoadIncrementalHFiles class (probably the same method used by the completebulkload tool); the HFiles generated by the reducer tasks are correctly 'adopted' by each corresponding

RE: Table not storing versions after deleteall

2012-07-18 Thread Zoltán Tóth-Czifra
Hi, Thanks for the quick answer! So I understand it's the expected behavior...? For me it doesn't explain why I can't re-add the exact same rows.

Re: Table not storing versions after deleteall

2012-07-18 Thread Jean-Daniel Cryans
On Wed, Jul 18, 2012 at 10:32 AM, Zoltán Tóth-Czifra zoltan.tothczi...@softonic.com wrote: Thanks for the quick answer! So I understand it's the expected behavior...? Yeah, it's an edge case. For me it doesn't explain why I can't re-add the exact same rows. As I mentioned, they are hidden by

RES: Rowkey hashing to avoid hotspotting

2012-07-18 Thread Cristofer Weber
Hi Anand! I see... sorry for being so curious, but since I started studying HBase I have been curious about how people model their tables and in what kinds of systems HBase is used. Have you evaluated recording your reports in a distinct CF, using timestamps as column qualifiers? It's my

Re: Table not storing versions after deleteall

2012-07-18 Thread Jason Frantz
Zoltan, It's actually a bit more complicated because the behavior is non-deterministic. If a compaction happens the delete marker may be removed, and if you add the rows back after this time they *will* be visible. See the following for more info:

Scanning columns

2012-07-18 Thread Mohit Anchlia
I am designing an HBase schema as a timeseries model. Taking advice from the Definitive Guide and tsdb, I am planning to use a row key of the form metricname:(Long.MAX_VALUE - basetimestamp), and the column names would be (timestamp - basetimestamp). My column names would then look like 1, 2, 3, 4, 5 ... for instance. I

Re: Scanning columns

2012-07-18 Thread Jerry Lam
Hi, This sounds like you are looking for ColumnRangeFilter? Best Regards, Jerry
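
A sketch of what that looks like against the schema Mohit describes (the table name and range bounds are illustrative): select only the columns whose qualifier falls in a given range:

    import java.io.IOException;

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.ColumnRangeFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ColumnRangeScan {
        public static void main(String[] args) throws IOException {
            HTable table = new HTable(HBaseConfiguration.create(), "metrics");
            Scan scan = new Scan();
            // Keep only columns whose qualifier is in [10, 20):
            // inclusive lower bound, exclusive upper bound.
            scan.setFilter(new ColumnRangeFilter(
                    Bytes.toBytes(10L), true,
                    Bytes.toBytes(20L), false));
            ResultScanner scanner = table.getScanner(scan);
            for (Result r : scanner) {
                System.out.println(r);
            }
            scanner.close();
            table.close();
        }
    }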

[poll] Does anyone run or test against hadoop 0.21, 0.22, 0.23 under HBase 0.92.0+/0.94.0?

2012-07-18 Thread Jonathan Hsieh
Hi, I'm trying to get a feel for how affected folks would be if we potentially had hbase support only a hadoop 1.0 build profile and a hadoop 2.0 build profile (and perhaps a hadoop 3.0-SNAPSHOT profile). Specifically, does anyone use hbase on top of hadoop 0.21.x, 0.22.x, or 0.23.x (which became

Re: [poll] Does anyone run or test against hadoop 0.21, 0.22, 0.23 under HBase 0.92.0+/0.94.0?

2012-07-18 Thread Ted Yu
We use hadoop 0.22 Cheers

RES: Bulk Import Data Locality

2012-07-18 Thread Cristofer Weber
Hi Alex, I ran one of our bulk import jobs with a partial payload, without proceeding with a major compaction, and you are right: some hdfs blocks end up on a different datanode.

RE: [poll] Does anyone run or test against hadoop 0.21, 0.22, 0.23 under HBase 0.92.0+/0.94.0?

2012-07-18 Thread Tony Dean
We are using HBase 0.94.0 against Hadoop 1.0.3, but plan to move to 0.23.x.

Re: cannot invoke coprocessor in trunk

2012-07-18 Thread Ted Yu
ClassNotFoundException ... How did you load your coprocessor jar? On Wed, Jul 18, 2012 at 3:05 PM, Yin Huai huaiyin@gmail.com wrote: Hello All, when I was trying to invoke a coprocessor against trunk, I got the following error... 12/07/18 13:43:49 WARN

Re: Problem with ColumnPaginationFilter: after put twice,get half of limit columns

2012-07-18 Thread Suraj Varma
It's not clear what your question is ... can you provide your hbase shell session or a code snippet that shows the scenario below? --S On Tue, Jul 17, 2012 at 8:01 PM, deanforwever2010 deanforwever2...@gmail.com wrote: it only happened when I put the same data twice into the column. Any ideas?

Re: Load balancer repeatedly close and open region in the same regionserver.

2012-07-18 Thread Suraj Varma
You can use pastebin.com or similar services to cut/paste your logs. --S On Tue, Jul 17, 2012 at 7:11 PM, Howard rj03...@gmail.com wrote: this problem happened only once. Because it happened two days before, I remember I checked the master-status and only ever saw regions pending open in Regions

Re: [poll] Does anyone run or test against hadoop 0.21, 0.22, 0.23 under HBase 0.92.0+/0.94.0?

2012-07-18 Thread Cristofer Weber
We are using CDH4

Re: [poll] Does anyone run or test against hadoop 0.21, 0.22, 0.23 under HBase 0.92.0+/0.94.0?

2012-07-18 Thread lars hofhansl
Out of curiosity, why 0.23 and not wait a bit for the 2.0.x branch?

Re: HBase Fault tolerance

2012-07-18 Thread Suraj Varma
My question is how is this HLog file different from a StoreFile? Why is it faster to write to an HLog file and not write directly to a StoreFile? Read this: http://www.larsgeorge.com/2010/01/hbase-architecture-101-write-ahead-log.html and

Client side hbase-site.xml config

2012-07-18 Thread Mohit Anchlia
I just wanted to check whether most people copy hbase-site.xml into the classpath, or use some properties file as a resource and then set it in the Configuration object returned by HBaseConfiguration.create();

Re: Client side hbase-site.xml config

2012-07-18 Thread Stack
On Thu, Jul 19, 2012 at 2:36 AM, Mohit Anchlia mohitanch...@gmail.com wrote: I just wanted to check whether most people copy hbase-site.xml into the classpath, or use some properties file as a resource and then set it in the Configuration object returned by HBaseConfiguration.create(); The former in my
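
Both options side by side, as a sketch (the quorum value is illustrative):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class ClientConfig {
        public static void main(String[] args) {
            // Option 1: hbase-site.xml on the classpath; create() picks it
            // up automatically.
            Configuration conf = HBaseConfiguration.create();

            // Option 2: override settings programmatically, e.g. from your
            // own properties file.
            conf.set("hbase.zookeeper.quorum", "zk1.example.com,zk2.example.com");
        }
    }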

Re: Smart Managed Major Compactions

2012-07-18 Thread Stack
On Wed, Jul 18, 2012 at 7:26 PM, Bryan Beaudreault bbeaudrea...@gmail.com wrote: I am looking into managing major compactions ourselves, but there don't appear to be any mechanisms I can hook into to determine which tables need compacting. Ideally, each time my cron job runs it would compact
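
For the triggering half (not the "which tables need it" half, which is the open question here), a cron-friendly sketch where table names come in as arguments, assuming the 0.94-era HBaseAdmin API:

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class CompactJob {
        public static void main(String[] args) throws Exception {
            HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
            try {
                for (String table : args) {
                    // Queues a major compaction request and returns; it
                    // does not wait for the compaction to finish.
                    admin.majorCompact(table);
                }
            } finally {
                admin.close();
            }
        }
    }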

Re: cannot invoke coprocessor in trunk

2012-07-18 Thread Yin Huai
I load it through a table attribute. Here is what I did: htd.setValue("COPROCESSOR$1", path.toString() + "|" + RowCountTrunkEndpoint.class.getCanonicalName() + "|" + Coprocessor.PRIORITY_USER); Just found the reason... I forgot to add my jar to the classpath... Thanks, Yin

Re: CCSHB : PASS!

2012-07-18 Thread Todd Lipcon
Congrats, Cristofer. Glad you found the exam both challenging and useful as something to study for. A few of us in the community spent lots of time reviewing the questions, etc. Hope to see you continue to use HBase and maybe even contribute some patches down the road! -Todd

Re: Bulk Import Data Locality

2012-07-18 Thread Ben Kim
I added some Q&As that went back and forth with Lars. Hope this is somewhat related to your data locality questions. On Jun 15, 2012, at 6:56 AM, Ben Kim wrote: Hi, I've been posting questions to the mailing list quite often lately, and here goes another one, about data locality. I read the excellent

Re: CCSHB : PASS!

2012-07-18 Thread Ben Kim
Can I ask you: for the Cloudera Certified Specialist exam, what's the coverage of HBase? How many questions are related to hbase and how many are specific to hadoop? Thank you! Ben

Re: CCSHB : PASS!

2012-07-18 Thread Stephen Boesch
Can we move this discussion to another forum and keep the hbase ML about hbase usage (not taking/passing exams)?

Re: Problem with ColumnPaginationFilter: after put twice,get half of limit columns

2012-07-18 Thread deanforwever2010
hi Suraj, my code is like this: for (int i = 0; i < 5; i++) ht.commonDao.insert(row, sl, 3466673706998492l+i+, ); for (int i = 0; i < 5; i++) ht.commonDao.insert(row, sl, 3466673706998492l+i+, ); i.e. I put the same record twice. When I get 500 records with ColumnPaginationFilter, I just get 250. Get get
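
For reference, the read side of this usually looks like the following sketch (the table and row names are made up), which would be expected to return up to 500 distinct columns regardless of how many times each was written:

    import java.io.IOException;

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.filter.ColumnPaginationFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PaginatedGet {
        public static void main(String[] args) throws IOException {
            HTable table = new HTable(HBaseConfiguration.create(), "t1");
            Get get = new Get(Bytes.toBytes("row"));
            // Return at most 500 columns, starting at column offset 0.
            get.setFilter(new ColumnPaginationFilter(500, 0));
            Result r = table.get(get);
            System.out.println(r.size() + " columns returned");
            table.close();
        }
    }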

Re: Bulk Import Data Locality

2012-07-18 Thread Alex Baranau
Thanks a lot for the replies. To me it is clear when data locality gets broken (and it is not only on failure of the RS; there are other cases). I was hoping more for suggestions around this particular use-case: assuming that nodes/RSs are stable, how to make sure to achieve the data

RE: [poll] Does anyone run or test against hadoop 0.21, 0.22, 0.23 under HBase 0.92.0+/0.94.0?

2012-07-18 Thread Ramkrishna.S.Vasudevan
We work on hadoop 2.0. Regards Ram

RE: Applying QualifierFilter to one column family only.

2012-07-18 Thread Anoop Sam John
Hi David, You want the below use case in a scan:

Table: T1
  CF: T           CF: S
  q1 q2 ..        q1 q2 ..

Now in the Scan you want to scan all the qualifiers under S and one qualifier under T. (I think I got your use case correctly.) Well, this use case you can
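
Anoop's answer is cut off here, but one way to express that scan (a sketch; not necessarily what the reply goes on to recommend) is to combine a whole-family selection with a single-column selection:

    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class FamilyPlusColumnScan {
        public static Scan build() {
            Scan scan = new Scan();
            scan.addFamily(Bytes.toBytes("S"));                      // every qualifier under S
            scan.addColumn(Bytes.toBytes("T"), Bytes.toBytes("q1")); // only q1 under T
            return scan;
        }
    }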