Re: Bulk upload

2011-08-16 Thread Ophir Cohen
ched in, not sure about the others. > > J-D > > On Thu, Aug 11, 2011 at 1:28 AM, Ophir Cohen wrote: > > I did some more tests and found the problem: on a local run the distributed > > cache does not work. > > > > On a full cluster it works. > > Sorry for you

Re: Bulk upload

2011-08-11 Thread Ophir Cohen
Ophir Cohen wrote: > Now I see that it uses the distributed cache - but for some reason > the TotalOrderPartitioner does not grab it. > Ophir > > > On Thu, Aug 11, 2011 at 11:08, Ophir Cohen wrote: > >> Hi, >> I started to use bulk upload and encountered a strange

Re: Bulk upload

2011-08-11 Thread Ophir Cohen
Now I see that it uses the distributed cache - but for some reason the TotalOrderPartitioner does not grab it. Ophir On Thu, Aug 11, 2011 at 11:08, Ophir Cohen wrote: > Hi, > I started to use bulk upload and encountered a strange problem. > I'm using Cloudera cdh3-u1.

Bulk upload

2011-08-11 Thread Ophir Cohen
Hi, I started to use bulk upload and encountered a strange problem. I'm using Cloudera cdh3-u1. I'm using HFileOutputFormat.configureIncrementalLoad() to configure my job. This method creates a partition file for the TotalOrderPartitioner and saves it to HDFS. When the TotalOrderPartitioner initiated

MR Jobs managing on Hadoop cluster

2011-07-03 Thread Ophir Cohen
Hi, Recently we deployed a 20-node cluster in our organization. Shortly it will double (at least) and start to handle billions of rows. My question concerns the management options. I would like to let users (i.e. internal developers) submit, schedule and monitor their jobs. Of course, I can

Re: Add a column family to a table

2011-06-30 Thread Ophir Cohen
We had an argument here on that issue: can somebody shed a little light on the reason we need to disable the table in order to add a CF? It looks to me that adding a CF should be as simple as saying: there is a new one - anyway it's in a different file? What am I missing here? Thanks! Ophir On Sat, Jun 18, 20

Re: Mapreduce counters

2011-05-18 Thread Ophir Cohen
unds like a useful feature, maybe file a jira? > > I've never tried to save counters from the MR job into HBase, but you > could pull it from the file as you said or from the Job object after > waitForCompletion() returns by calling getCounters(). > > -Joey > > On Wed

Mapreduce counters

2011-05-18 Thread Ophir Cohen
Hi All, Currently an MR job spills its counters into a file at the end of the run. Is there any built-in configuration/plug-in to make it store these counters in HBase as well? Sounds to me like a great feature! Has anybody done something similar? If you did, how did you do it? Run on directory an

Region locality and .META. region

2011-05-16 Thread Ophir Cohen
Hi, I have two questions: 1. Does HBase know how to handle blocks moving? E.g. can HBase recognize that some local block was deleted from a machine and move that region to a machine with that block? 2. What happens if the region server of the .META. fails? Does HBase have a duplicate region for that? ho

Re: Data retention in HBase

2011-05-12 Thread Ophir Cohen
Thanks, good luck with the release... Ophir On Thu, May 12, 2011 at 8:05 PM, Jean-Daniel Cryans wrote: > > So, now with that and with the security/co-processors I can ask: when do > you > > think 0.92 is going to be deployed? > > When it's ready, there's no formal plan. We were targeting May 1st for >

Re: Data retention in HBase

2011-05-12 Thread Ophir Cohen
key. 2. You have HBase 0.92 or higher. So, now with that and with the security/co-processors I can ask: when do you think 0.92 is going to be deployed? BTW, do you have any simulator to run an HBase master and region server to check this code? Ophir On Wed, May 11, 2011 at 10:32 PM, Ophir Cohen

Re: Data retention in HBase

2011-05-11 Thread Ophir Cohen
Thanks for the comments. Going to work on it tomorrow - I'll keep you updated. Ophir On Wed, May 11, 2011 at 8:01 PM, Stack wrote: > On Wed, May 11, 2011 at 6:14 AM, Ophir Cohen wrote: > > My results from today's research: > > > > I tried to delete region

Re: Data retention in HBase

2011-05-11 Thread Ophir Cohen
de it stated that it triggered compaction and that should be enough (). 3. Is there a way to choose my method of region splitting? I think it can be a great option - a way to state when and how a region is split... Any thoughts? Thanks, Ophir BTW On Tue, May 10, 2011 at 6:50 PM, Ophir

Re: Data retention in HBase

2011-05-10 Thread Ophir Cohen
side-effect of requiring that you disable the table for a short period (I > think). > > On Mon, May 9, 2011 at 10:09 AM, Ophir Cohen wrote: > > > Thanks for the answer! > > > > A little bit more info: > > Our data is internal events grouped for sessions (i.e.

Re: Data retention in HBase

2011-05-09 Thread Ophir Cohen
PS The deletion is a matter of privacy, security and terms-of-service, not only storage problems... On Mon, May 9, 2011 at 8:33 PM, Ophir Cohen wrote: > Tell it to my company ;) > > It looks like a nice tool to have such a region dropper... > I'll take a look and will come b

Re: Data retention in HBase

2011-05-09 Thread Ophir Cohen
Tell it to my company ;) It looks like a nice tool to have such a region dropper... I'll take a look and will come back to discuss it. If I go in this direction I'm surely going to automate it... Ophir On Mon, May 9, 2011 at 8:29 PM, Stack wrote: > On Mon, May 9, 2011

Re: Data retention in HBase

2011-05-09 Thread Ophir Cohen
ata is actually visible to the consumer of the data? > > On Mon, May 9, 2011 at 2:59 AM, Ophir Cohen wrote: > > > Hi All, > > In my company currently we are working hard on deploying our cluster > with > > HBase. > > > > We are talking of ~20 nodes to hold pretty b

Data retention in HBase

2011-05-09 Thread Ophir Cohen
(customers and time) I thought of this option: 1. Split regions and create a region with 'candidates to be removed'. 2. Drop this region. - Is it possible to drop a region? - Do you think it is a good idea? - Any other ideas? Thanks, Ophir Cohen LivePerson