subject:"data structure"

Re: Distributed Data structure

2011-08-29 Thread Jean-Daniel Cryans

so much interest after looking
> at the way it got evolved and started slowly working on it. I want to
> develop a distributed data structure in HBase environment like Distributed
> Hash Table, distributed set, or list. Can somebody help me in this regard.
> Even if anyone worked or ha

Distributed Data structure

2011-08-27 Thread vamshi krishna

Hi folks, I am new to HBase and recently got so much interest after looking at the way it evolved, and started slowly working on it. I want to develop a distributed data structure in the HBase environment, like a distributed hash table, distributed set, or list. Can somebody help me in this

Re: data structure

2011-07-29 Thread Otis Gospodnetic

ematext.com/ :: Solr - Lucene - Hadoop - HBase
Hadoop ecosystem search :: http://search-hadoop.com/
>
>From: Andre Reiter
>To: user@hbase.apache.org
>Sent: Thursday, July 14, 2011 3:52 PM
>Subject: data structure
>
>Hi everybody,
>
>we

Re: data structure

2011-07-17 Thread Ted Dunning

Averages are easy to roll up as well. Rank statistics like median, min, max and quartiles are not much harder. Total uniques are more difficult. If you have decent distributional information, these can be estimated reasonably well. Mahout has code for the first two.
On Sun, Jul 17, 2011 at 9:30
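The point about averages rolling up easily can be illustrated with a minimal sketch. It assumes each day is summarized as a sum and a count (plus min and max), so that daily summaries merge without rescanning the raw rows. The `DailySummary` class and `roll_up` function are hypothetical names for illustration; they are not Mahout or HBase APIs.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class DailySummary:
    """Mergeable summary of one day's values (hypothetical helper)."""
    total: float
    count: int
    minimum: float
    maximum: float

    @classmethod
    def from_values(cls, values):
        # Requires at least one value; an empty day should simply be skipped.
        values = list(values)
        return cls(sum(values), len(values), min(values), max(values))

    def merge(self, other):
        # Sums and counts add; min and max combine directly.
        return DailySummary(
            self.total + other.total,
            self.count + other.count,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )

    @property
    def mean(self):
        return self.total / self.count


def roll_up(summaries):
    """Combine per-day summaries into one summary for the whole period."""
    return reduce(DailySummary.merge, summaries)
```

Keeping the sum and the count separately, rather than the average itself, is what makes the merge exact: averaging the daily averages would weight a quiet day the same as a busy one.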

Re: data structure

2011-07-17 Thread Arvind Jayaprakash

On Jul 14, Andre Reiter wrote:
>now we are running mapreduce jobs, to generate a report: for example we
>want to know how many impressions were done by all users in last x
>days. therefore the scan of the MR job is running over all data in our
>hbase table for the particular family. this takes at t

Re: data structure

2011-07-15 Thread Claudio Martella

>> - Original Message -
>> From: Claudio Martella
>> Sent: Fri Jul 15 2011 14:40:38 GMT+0200 (CET)
>> To:
>> CC:
>> Subject: Re: data structure
>
>> supposed you want a per-hour granularity, you could have a key like this
>>
>> _
>

Re: data structure

2011-07-15 Thread Andre Reiter

Hi Claudio, thanks for the hint. The point is that we need fast requests to the user data, which is why we need the row key to be the user_id.
- Original Message -
From: Claudio Martella
Sent: Fri Jul 15 2011 14:40:38 GMT+0200 (CET)
To:
CC:
Subject: Re: data structure
supposed

Re: data structure

2011-07-15 Thread Claudio Martella

>> Sent: Thu Jul 14 2011 23:17:20 GMT+0200 (CET)
>> To:
>> CC:
>> Subject: Re: data structure
>
>> You can play tricks with the arrangement of the key.
>>
>> For instance, you can put date at the end of the key. That would let
>> you
>> p

Re: data structure

2011-07-14 Thread Ted Dunning

:
> - Original Message -
>> From: Ted Dunning
>> Sent: Thu Jul 14 2011 23:17:20 GMT+0200 (CET)
>> To:
>> CC:
>> Subject: Re: data structure
>>
> > You can play tricks with the arrangement of the key.
>>
>> For instance, you can pu

Re: data structure

2011-07-14 Thread Andre Reiter

- Original Message -
From: Ted Dunning
Sent: Thu Jul 14 2011 23:17:20 GMT+0200 (CET)
To:
CC:
Subject: Re: data structure
You can play tricks with the arrangement of the key. For instance, you can put date at the end of the key. That would let you pull data for a particular user for

Re: data structure

2011-07-14 Thread Ted Dunning

You can play tricks with the arrangement of the key. For instance, you can put date at the end of the key. That would let you pull data for a particular user for a particular date range. The date should not be a time stamp, but should be a low-res version of time (day-level resolution might be o
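A minimal sketch of the key arrangement described above, assuming a row key of the form user id, separator, day (`YYYYMMDD`). The `_` separator, the function names, and the in-memory sorted list standing in for a table are all assumptions for illustration; no HBase client API is used.

```python
from bisect import bisect_left
from datetime import date, timedelta


def make_row_key(user_id: str, day: date) -> bytes:
    """Build a key with a low-resolution (day-level) date at the end."""
    return f"{user_id}_{day:%Y%m%d}".encode("ascii")


def scan_bounds(user_id: str, first_day: date, last_day: date):
    """Return (start_row, stop_row) covering first_day..last_day inclusive.

    The stop row is exclusive, so it is the key for the day after last_day.
    """
    start_row = make_row_key(user_id, first_day)
    stop_row = make_row_key(user_id, last_day + timedelta(days=1))
    return start_row, stop_row


def range_scan(sorted_keys, start_row, stop_row):
    """Simulate a range scan over lexicographically sorted row keys."""
    lo = bisect_left(sorted_keys, start_row)
    hi = bisect_left(sorted_keys, stop_row)
    return sorted_keys[lo:hi]
```

Because the zero-padded date sorts lexicographically in calendar order, all rows for one user and one date range sit next to each other, and lookups by user id remain a prefix scan.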

Re: data structure

2011-07-14 Thread Andre Reiter

- Original Message -
From: Doug Meil
Sent: Thu Jul 14 2011 22:29:16 GMT+0200 (CET)
To:
CC:
Subject: Re: data structure
Hi there- A few high-level suggestions... re: "to generate a report: for example we want to know how many impressions were done by all users in last x days"

Re: data structure

2011-07-14 Thread Andre Reiter

Stack wrote: On Thu, Jul 14, 2011 at 12:52 PM, Andre Reiter wrote: Why is 70 seconds too long for a report? 70 seconds seems like a short mapreduce job (to me). You don't have that many regions. How fast would you like this operation to complete in? The report you describe above is predicat

Re: data structure

2011-07-14 Thread Doug Meil

Hi there- A few high-level suggestions... re: "to generate a report: for example we want to know how many impressions were done by all users in last x days" Can you create a summary table by day (via MR job), and then have your ad-hoc report hit the summary table? Re: "and with the data grow
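The summary-table suggestion can be sketched in a few lines. This is an in-memory stand-in under stated assumptions: impressions arrive as `(user_id, day)` pairs, a `Counter` plays the role of the per-day summary table, and "last x days" is taken to include today. The function names are hypothetical and the real aggregation would run as an MR job.

```python
from collections import Counter
from datetime import date, timedelta


def build_daily_summary(impressions):
    """Aggregate raw (user_id, day) impressions into a per-day count.

    Stands in for the periodic job that fills the summary table.
    """
    summary = Counter()
    for _user_id, day in impressions:
        summary[day] += 1
    return summary


def impressions_in_last_days(summary, today: date, days: int) -> int:
    """Answer the ad-hoc report from the summary alone.

    Reads at most `days` entries instead of scanning every user row.
    """
    return sum(
        summary.get(today - timedelta(days=offset), 0)
        for offset in range(days)
    )
```

The report cost then depends on the number of days requested, not on the number of users or impressions in the main table.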

Re: data structure

2011-07-14 Thread Stack

On Thu, Jul 14, 2011 at 12:52 PM, Andre Reiter wrote:
> now we are running mapreduce jobs, to generate a report: for example we want
> to know how many impressions were done by all users in last x days.
> therefore the scan of the MR job is running over all data in our hbase table
> for the partic

data structure

2011-07-14 Thread Andre Reiter

Hi everybody, we have our Hadoop + HBase cluster running at the moment with 6 servers; everything is working just fine. We have a web application where data is stored with the row key = user id (a meaningless UUID). So our users have a cookie, which is the row key; behind this key are families w

16 matches
