RE: [jira] Commented: (HADOOP-2006) Aggregate Functions in select statement

2007-12-05 Thread edward yoon

Sorry for my mistake...

> If you mistake that standard sql is all of A-DBMS capacity,
> I think you don't want to studies about database structure, access 
> algorithms, philosophies,.., etc of A-DBMS.
>
> Then, Can i make you use the A-DBMS's 100% Full capacity by force?
>
> Or
>
> Let's assume the A-DBMS didn't provide standard sql.
> Are you want to use the A-DBMS?

>> Do you want to use the A-DBMS? 

>
> Ok..
> If you want to use the A-DBMS, you already didn't thought the sql isn't all 
> of A-DBMS.

>> If you want to use the A-DBMS, you already think that SQL isn't all of 
>> the A-DBMS.

> So, conclusion?
> The more affluent the hbase shell, the use of hbase will be growing very 
> rapidly.


--

B. Regards,

Edward yoon @ NHN, corp.
Home : http://www.udanax.org


RE: [jira] Commented: (HADOOP-2006) Aggregate Functions in select statement

2007-12-05 Thread edward yoon

>> it will encourage people to think that the shell is a good way to interact 
>> with HBase in general...

I think this is a key point. :)

The aim of the HBase Shell is to improve working efficiency without requiring 
specialized knowledge.
I'll make an accessory for database access methods on HBase.
Also, I'm thinking about matrix operations on HBase.

But the HBase Shell is just one of the applications on HBase.

...
Let's think.

If you mistakenly believe that standard SQL is the full capacity of the A-DBMS, 
I think you don't want to study the database structure, access algorithms, 
philosophies, etc. of the A-DBMS.

Then, can I force you to use 100% of the A-DBMS's capacity?

Or 

Let's assume the A-DBMS didn't provide standard SQL.
Do you want to use the A-DBMS?

OK. 
If you want to use the A-DBMS, you already think that SQL isn't all of the 
A-DBMS.

So, conclusion?
The richer the HBase Shell, the more rapidly the use of HBase will grow.


--

B. Regards,

Edward yoon @ NHN, corp.
Home : http://www.udanax.org




Re: [jira] Commented: (HADOOP-2006) Aggregate Functions in select statement

2007-12-05 Thread Bryan Duxbury
If you have a table with something like a billion rows, and do an
aggregate function on the table from the shell, you will end up
reading all billion rows through a single machine, essentially
aggregating the entire dataset locally. This defeats the purpose of
having a massively distributed database like HBase. To do this more
efficiently, you'd ideally kick off a MapReduce job that can perform
the various aggregation functions on the dataset in parallel,
harnessing the power of the distributed dataset, and then return
the results to a central location once they are calculated.
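
The difference between the two access patterns can be sketched in a few lines. This is only an illustration in plain Python, not Hadoop or HBase code: the function names, the `partitions` list and its data are all hypothetical, and each inner list merely stands in for the rows held by one server. Each partition computes a small partial result (a sum and a count per group), and only those partials travel to the central location.

```python
from collections import defaultdict

def map_partition(rows):
    """Map/combine step: one partial [sum, count] per group for one partition."""
    partial = defaultdict(lambda: [0, 0])
    for producer, year in rows:
        partial[producer][0] += year
        partial[producer][1] += 1
    return partial

def reduce_partials(partials):
    """Reduce step: merge the partial [sum, count] pairs, then divide once."""
    total = defaultdict(lambda: [0, 0])
    for partial in partials:
        for producer, (s, n) in partial.items():
            total[producer][0] += s
            total[producer][1] += n
    return {producer: s / n for producer, (s, n) in total.items()}

# Hypothetical data: each inner list stands in for rows held by one server.
partitions = [
    [("A", 2000), ("B", 1990)],
    [("A", 2004), ("A", 2005)],
    [("B", 1994)],
]

averages = reduce_partials(map_partition(p) for p in partitions)
print(averages)
```

The partials carry sums and counts rather than per-partition averages, because an average of averages is wrong when partitions hold different numbers of rows per group.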


I think putting this option into the shell is risky, because it will  
encourage people to think that the shell is a good way to interact  
with HBase in general, which it isn't. We want people to understand  
HBase is best consumed in parallel and discourage solutions that  
aggregate access through a single point. As such, we shouldn't build  
features that allow people to inadvertently use the wrong access  
patterns.


On Dec 5, 2007, at 3:38 PM, Edward Yoon (JIRA) wrote:



[ https://issues.apache.org/jira/browse/HADOOP-2006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548879 ]


Edward Yoon commented on HADOOP-2006:
-

I don't understand your comment.
Please explain it in more detail.


Aggregate Functions in select statement
---

Key: HADOOP-2006
URL: https://issues.apache.org/jira/browse/HADOOP-2006
Project: Hadoop
Issue Type: Sub-task
Components: contrib/hbase
Affects Versions: 0.14.1
Reporter: Edward Yoon
Assignee: Edward Yoon
Priority: Minor
Fix For: 0.16.0


Aggregation functions on collections of data values: average,
minimum, maximum, sum, count.
Group rows by the value of a column family and apply the aggregate
function independently to each group of rows.

 *   ƒ ~function_list~ (Relation)
{code}
select producer, avg(year) from movieLog_table group by producer
{code}
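
As a rough illustration of what the statement above is meant to compute, the grouping semantics can be written out in plain Python. This is a sketch under stated assumptions, not the shell's implementation: the sample `rows` are invented, and `group_by` and `aggregate` are hypothetical helper names. Rows are grouped by `producer`, and each of the five listed functions is applied to every group independently.

```python
from collections import defaultdict

# Hypothetical rows of movieLog_table as (producer, year) pairs.
rows = [
    ("A", 2000),
    ("B", 1990),
    ("A", 2004),
    ("B", 1994),
    ("A", 2005),
]

def group_by(rows):
    """Collect the year values of each producer into one list per group."""
    groups = defaultdict(list)
    for producer, year in rows:
        groups[producer].append(year)
    return groups

def aggregate(rows):
    """Apply avg, min, max, sum and count to each group independently."""
    return {
        producer: {
            "avg": sum(years) / len(years),
            "min": min(years),
            "max": max(years),
            "sum": sum(years),
            "count": len(years),
        }
        for producer, years in group_by(rows).items()
    }

result = aggregate(rows)
for producer, stats in sorted(result.items()):
    print(producer, stats["avg"])
```

Here the whole table sits in one process, which is exactly the single-machine pattern questioned in this thread; the sketch shows only the semantics of the query, not how it should be executed at scale.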


--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.