Wenhai Li has posted comments on this change.

Change subject: RangeGenerator aggfunc for the numeric/asciiString datatype 
based on parallel streaming histogram.
......................................................................


Patch Set 21:

(21 comments)

Hi, Yingyi and Preston.
I didn't know you can't see the comments until they are published. :( I hope it's not 
too late.

https://asterix-gerrit.ics.uci.edu/#/c/806/21/asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/global-rg/global-rg.1.ddl.aql
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/global-rg/global-rg.1.ddl.aql:

Line 20: 
> remove this file
Done


https://asterix-gerrit.ics.uci.edu/#/c/806/21/asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/global-rg/global-rg.2.update.aql
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/global-rg/global-rg.2.update.aql:

Line 17:  * under the License.
> remove this file
Done


https://asterix-gerrit.ics.uci.edu/#/c/806/21/asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/global-rg/global-rg.3.query.aql
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/global-rg/global-rg.3.query.aql:

Line 19: use dataverse test;
> rename this file to global-rg.1.query.aql
Done


Line 22:   for $x in [1.0, 2.0, double("3.0"), 3.1, 3.2, 3.3, 3.4] 
> WS
Done


https://asterix-gerrit.ics.uci.edu/#/c/806/21/asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/local-rg/local-rg.1.ddl.aql
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/local-rg/local-rg.1.ddl.aql:

Line 20: 
> remove this file.
Done


https://asterix-gerrit.ics.uci.edu/#/c/806/21/asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/local-rg/local-rg.2.update.aql
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/local-rg/local-rg.2.update.aql:

Line 16:  * specific language governing permissions and limitations
> remove this file
Done


https://asterix-gerrit.ics.uci.edu/#/c/806/21/asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/local-rg/local-rg.3.query.aql
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/local-rg/local-rg.3.query.aql:

Line 13:  * software distributed under the License is distributed on an
> rename this file to local-rg.1.query.aql
Done


Line 22:   for $x in [1.0, 2.0, double("3.0"), 3.1, 3.2, 3.3, 3.4] 
> WS
Done


https://asterix-gerrit.ics.uci.edu/#/c/806/21/asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/rg-double-null/rg-double-null.1.ddl.aql
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/rg-double-null/rg-double-null.1.ddl.aql:

Line 8:  * with the License.  You may obtain a copy of the License at
> remove this file
Done


https://asterix-gerrit.ics.uci.edu/#/c/806/21/asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/rg-double-null/rg-double-null.2.update.aql
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/rg-double-null/rg-double-null.2.update.aql:

Line 17:  * under the License.
> remove this file.
Done


https://asterix-gerrit.ics.uci.edu/#/c/806/21/asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/rg-double-null/rg-double-null.3.query.aql
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/rg-double-null/rg-double-null.3.query.aql:

Line 19: use dataverse test;
> rename this file to rg-double-null.1.query.aql
Done


https://asterix-gerrit.ics.uci.edu/#/c/806/21/asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/rg-double/rg-double.1.ddl.aql
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/rg-double/rg-double.1.ddl.aql:

Line 21: create dataverse test;
> remove this file
Done


https://asterix-gerrit.ics.uci.edu/#/c/806/21/asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/rg-double/rg-double.2.update.aql
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/rg-double/rg-double.2.update.aql:

Line 12:  * Unless required by applicable law or agreed to in writing,
> remove this file
Done


https://asterix-gerrit.ics.uci.edu/#/c/806/21/asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/rg-double/rg-double.3.query.aql
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries/aggregate/rg-double/rg-double.3.query.aql:

Line 20: set partitions '2'
> rename this file to rg-double.3.query.aql.
Done


Line 22:   for $x in [1.0, 2.0, double("3.0"), 3.1, 3.2, 3.3, 3.4] 
> WS
Done


https://asterix-gerrit.ics.uci.edu/#/c/806/21/asterixdb/asterix-om/src/main/java/org/apache/asterix/om/functions/AsterixBuiltinFunctions.java
File 
asterixdb/asterix-om/src/main/java/org/apache/asterix/om/functions/AsterixBuiltinFunctions.java:

Line 225:             "ceiling", 1);
> code style doesn't seem right.
Done


https://asterix-gerrit.ics.uci.edu/#/c/806/21/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/GlobalRangeGeneratorAggregateFunction.java
File 
asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/GlobalRangeGeneratorAggregateFunction.java:

Line 71:         IAType listedItemType = ((AOrderedListType) 
inRecType).getItemType();
How can we get the listedItemType without the inRecType?


https://asterix-gerrit.ics.uci.edu/#/c/806/21/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/LocalRangeGeneratorAggregateFunction.java
File 
asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/LocalRangeGeneratorAggregateFunction.java:

Line 47: public class LocalRangeGeneratorAggregateFunction extends 
AbstractRangeGeneratorAggregateFunction {
> Can't we use open lists for local/intermediate aggregate output?
Great, how about the final output of the globalgenerator?


https://asterix-gerrit.ics.uci.edu/#/c/806/21/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/RangeGeneratorAggregateFunction.java
File 
asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/RangeGeneratorAggregateFunction.java:

Line 45:     public void step(IFrameTupleReference tuple) throws 
AlgebricksException {
> why no implementation?
Currently, the local/global pair of functions is enough to construct and merge 
the histogram in parallel. Do we need another round to add single-node 
construction to this class?


https://asterix-gerrit.ics.uci.edu/#/c/806/21/hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/range/structures/GenericStreamingHistogram.java
File 
hyracks-fullstack/hyracks/hyracks-dataflow-std/src/main/java/org/apache/hyracks/dataflow/std/range/structures/GenericStreamingHistogram.java:

Line 38: 
> move the class to org/apache/asterix/runtime/aggregates/std/range?
Questions about the comments on the Hyracks histogram:
1. The balancing pre-computation is needed by all sorts of order-dependent 
operations, potentially including job division for a future AsterixDB 
sort-merge join and matrix segmentation for some mining/graph workloads. To 
accommodate these, shouldn't we provide the fundamental algorithms in the 
Hyracks runtime base?
2. The interfaces exported by IHistogram depend only on Hyracks data types, and 
the conversion between the various types and Double (and back) covers the 
Hyracks primitive types. Doesn't this naturally support type-agnostic calls 
from the Asterix aggregate layer above, as well as potential statistics 
requirements from a future optimizer?
Given this, is it still better to integrate all the changes into the Asterix 
aggregates?
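
To make point 2 concrete, here is a minimal sketch of an order-preserving 
mapping between an ASCII string and a Double, together with its inverse. It is 
only an illustration: the class name, the method names, and the 7-character 
prefix are assumptions made for this example, not the encoding used in the 
patch. It also assumes the strings contain no NUL characters.

```java
// Hypothetical sketch: NOT the conversion used by the patch.
final class AsciiDoubleCodecSketch {
    // 7 characters * 7 bits = 49 bits, which fits exactly in a double's 53-bit mantissa.
    static final int PREFIX_LEN = 7;

    // Encode the first PREFIX_LEN ASCII characters as base-128 digits.
    // Shorter strings are padded with 0, so lexicographic order is preserved.
    static double toDouble(String s) {
        long acc = 0;
        for (int i = 0; i < PREFIX_LEN; i++) {
            int c = i < s.length() ? s.charAt(i) : 0;
            if (c > 127) {
                throw new IllegalArgumentException("non-ASCII character in: " + s);
            }
            acc = (acc << 7) | c;
        }
        return (double) acc;
    }

    // Inverse: recover the (at most PREFIX_LEN character) prefix from the double.
    static String toPrefix(double d) {
        long acc = (long) d;
        char[] buf = new char[PREFIX_LEN];
        for (int i = PREFIX_LEN - 1; i >= 0; i--) {
            buf[i] = (char) (acc & 0x7F);
            acc >>>= 7;
        }
        // Drop the trailing 0 padding added by toDouble.
        int end = PREFIX_LEN;
        while (end > 0 && buf[end - 1] == 0) {
            end--;
        }
        return new String(buf, 0, end);
    }
}
```

With a mapping of this kind the histogram only ever sees doubles, and the 
caller converts bin boundaries back to the original type. The trade-off in this 
sketch is that strings sharing the same first 7 characters collapse to one 
value.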


https://asterix-gerrit.ics.uci.edu/#/c/806/21/hyracks-fullstack/hyracks/hyracks-examples/hyracks-integration-tests/data/skew/zipfan2.tbl
File 
hyracks-fullstack/hyracks/hyracks-examples/hyracks-integration-tests/data/skew/zipfan2.tbl:

Line 3: 5.1143520826504275E7    9669    51143520        20291   
-1171915.9960645214     3003    42424281        291     =kO98+.DI)QN#Z
> what does this mean?
The columns are, respectively:
Zipfian unsigned double, uniform unsigned int16, Zipfian unsigned long/int32, 
Gaussian unsigned int16/32, Zipfian double, uniform int16, Zipfian long/int32, 
Gaussian int16/32, and a variable-length ASCII string.

We can construct histograms locally for both files and then globally merge the 
generated intermediate bins.

This is only for verifying the accuracy of histogram construction. We can 
remove it once the code is stable enough.
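
The local-construct / global-merge flow described above can be sketched as 
follows. This is a simplified illustration and not the 
GenericStreamingHistogram from the patch: the class and method names are 
hypothetical, the merge rule (collapse the two closest bins into their weighted 
mean) is the usual streaming-histogram rule, and inputs are assumed to be 
finite doubles.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a bounded-size streaming histogram.
final class StreamingHistogramSketch {
    private final int maxBins;
    // Bin centroid -> number of points in the bin, sorted by centroid.
    private final TreeMap<Double, Long> bins = new TreeMap<>();

    StreamingHistogramSketch(int maxBins) {
        if (maxBins < 1) {
            throw new IllegalArgumentException("maxBins must be >= 1");
        }
        this.maxBins = maxBins;
    }

    // Local step: add one value as a bin of count 1.
    void add(double value) {
        if (!Double.isFinite(value)) {
            throw new IllegalArgumentException("value must be finite");
        }
        addBin(value, 1L);
    }

    // Global step: fold the intermediate bins of another histogram into this one.
    void mergeFrom(StreamingHistogramSketch other) {
        for (Map.Entry<Double, Long> e : other.bins.entrySet()) {
            addBin(e.getKey(), e.getValue());
        }
    }

    private void addBin(double centroid, long count) {
        bins.merge(centroid, count, Long::sum);
        while (bins.size() > maxBins) {
            mergeClosestPair();
        }
    }

    // Replace the two adjacent bins with the smallest gap by their weighted mean.
    private void mergeClosestPair() {
        Double prev = null;
        Double left = null;
        double bestGap = Double.POSITIVE_INFINITY;
        for (Double c : bins.keySet()) {
            if (prev != null && c - prev < bestGap) {
                bestGap = c - prev;
                left = prev;
            }
            prev = c;
        }
        double right = bins.higherKey(left);
        long nl = bins.remove(left);
        long nr = bins.remove(right);
        double merged = (left * nl + right * nr) / (nl + nr);
        bins.merge(merged, nl + nr, Long::sum);
    }

    long totalCount() {
        long total = 0;
        for (long n : bins.values()) {
            total += n;
        }
        return total;
    }

    int binCount() {
        return bins.size();
    }
}
```

In this sketch each partition would build its own histogram with add(), and 
the global side would call mergeFrom() once per partition. The total count is 
preserved while the number of bins stays bounded.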


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/806
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I450d0962fbeacfb2b6ab9fae0750f025ef17ba01
Gerrit-PatchSet: 21
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: Wenhai Li <lwhaym...@yahoo.com>
Gerrit-Reviewer: Jenkins <jenk...@fulliautomatix.ics.uci.edu>
Gerrit-Reviewer: Jianfeng Jia <jianfeng....@gmail.com>
Gerrit-Reviewer: Michael Blow <mb...@apache.org>
Gerrit-Reviewer: Preston Carman <prest...@apache.org>
Gerrit-Reviewer: Till Westmann <ti...@apache.org>
Gerrit-Reviewer: Wenhai Li <lwhaym...@yahoo.com>
Gerrit-Reviewer: Yingyi Bu <buyin...@gmail.com>
Gerrit-Reviewer: Yingyi Bu <ying...@google.com>
Gerrit-HasComments: Yes
