Yingyi Bu has posted comments on this change.

Change subject: Equivalence induced onetoone operatiions for 
insert/delete/upsert/query exchange optimization for lookup and delete when the 
operation is merely directed by the primary key.
......................................................................


Patch Set 10:

(26 comments)

Here are some high-level comments:

1. We need to revisit what needs to be added into the partitioning property to 
indicate where the operator actually needs to be run.  I'm not sure the 
abstraction of "isSingle" is enough. I inlined two motivating queries.

2. The caller of an IMetadataProvider probably shouldn't be a aware of a node 
id, but a hash value, or more general, the INodeDomain information. I think 
this also needs a discussion.

3. The code style seems not follow the recommended one: 
http://asterixdb.incubator.apache.org/dev-setup.html

Other detailed comments are inlined.

https://asterix-gerrit.ics.uci.edu/#/c/427/10/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/metadata/IMetadataProvider.java
File 
algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/metadata/IMetadataProvider.java:

Line 71:             JobGenContext context, JobSpecification jobSpec, boolean 
bulkload, int nodeId) throws AlgebricksException;
int nodeId->Integer hashValue

I guess the caller shouldn't be aware of nodeId, but only hashValue.
if the Integer is null, that means the reader needs to run on all storage nodes.


Line 75:             LogicalVariable payLoadVar, List<LogicalVariable> 
additionalNonKeyFields, RecordDescriptor recordDesc,
int nodeId->Integer hashValue

I guess the caller shouldn't be aware of nodeId, but only hashValue.
if the Integer is null, that means the reader needs to run on all storage nodes.


Line 115:             JobGenContext context, JobSpecification spec, boolean 
bulkload, int nodeId) throws AlgebricksException;
int nodeId->Integer hashValue

I guess the caller shouldn't be aware of nodeId, but only hashValue.
if the Integer is null, that means the reader needs to run on all storage nodes.


Line 154:             JobGenContext context, JobSpecification spec, int nodeId) 
throws AlgebricksException;
int nodeId->Integer hashValue

I guess the caller shouldn't be aware of nodeId, but only hashValue.
if the Integer is null, that means the reader needs to run on all storage nodes.


Line 199:             RecordDescriptor recordDesc, JobGenContext context, 
JobSpecification jobSpec, int nodeId)
int nodeId->Integer hashValue

I guess the caller shouldn't be aware of nodeId, but only hashValue.
if the Integer is null, that means the reader needs to run on all storage nodes.


Line 207:             RecordDescriptor inputDesc, JobGenContext context, 
JobSpecification spec, int nodeId)
int nodeId->Integer hashValue

I guess the caller shouldn't be aware of nodeId, but only hashValue.
if the Integer is null, that means the reader needs to run on all storage nodes.


https://asterix-gerrit.ics.uci.edu/#/c/427/10/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/logical/visitors/FDsAndEquivClassesVisitor.java
File 
algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/logical/visitors/FDsAndEquivClassesVisitor.java:

Line 316:                     + GroupByOperator.veListToString(gByList) + " to 
" + GroupByOperator.veListToString(newGbyList)
It seems your code style is not right.


https://asterix-gerrit.ics.uci.edu/#/c/427/10/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/logical/visitors/ProducedVariableExpressionVisitor.java
File 
algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/logical/visitors/ProducedVariableExpressionVisitor.java:

Line 66:  * @author whli
add some comments to describe what this class is used for.


Line 72:     public 
ProducedVariableExpressionVisitor(Map<List<LogicalVariable>, 
List<ILogicalExpression>> varToExpr) {
Why not map a variable to an expression?


Line 78:         return null;
aggregate produces variables.


Line 83:         return null;
running aggregate produces variables.


Line 93:         return null;
group by produces variables.


Line 129:         varToExpr.put(vars, exprs);
why not map each individual variable to its definition expression?


Line 169:     public Void visitSubplanOperator(SubplanOperator op, Void arg) 
throws AlgebricksException {
subplan produces variables.


Line 180:         // TODO Auto-generated method stub
union produces variables.


Line 186:         // TODO Auto-generated method stub
unnest produces variables.


Line 204:     }
data scan produces variables.


Line 234:         List<LogicalVariable> vars = new ArrayList<LogicalVariable>();
insertdelete doesn't produces any variables.
Here the code maps used variables to null?


Line 256:         // TODO Auto-generated method stub
intersect produces variables.


Line 262:         // TODO Auto-generated method stub
outer unnest produces variables.


https://asterix-gerrit.ics.uci.edu/#/c/427/10/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/BulkloadPOperator.java
File 
algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/BulkloadPOperator.java:

Line 106:                 inputDesc, context, spec, true, -1);
pass in null instead of -1.


https://asterix-gerrit.ics.uci.edu/#/c/427/10/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/IndexBulkloadPOperator.java
File 
algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/IndexBulkloadPOperator.java:

Line 131:                 additionalFilteringKeys, filterExpr, inputDesc, 
context, spec, true, -1);
-1 -> null


https://asterix-gerrit.ics.uci.edu/#/c/427/10/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/InsertDeleteUpsertPOperator.java
File 
algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/operators/physical/InsertDeleteUpsertPOperator.java:

Line 105:             ((UnorderedPartitionedProperty) 
r.getPartitioningProperty()).setSingle(this.isSinglePartition());
There shouldn't be a special handling of isSinglePart.


https://asterix-gerrit.ics.uci.edu/#/c/427/10/algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/properties/IPartitioningProperty.java
File 
algebricks/algebricks-core/src/main/java/org/apache/hyracks/algebricks/core/algebra/properties/IPartitioningProperty.java:

Line 94:     public void setSingle(boolean isSingle);
This looks to be an immutable field, thus it's better to remove the setter and 
enforce that in the constructor.


Line 96:     public boolean isSingle();
I'm not sure whether isSingle() is enough to capture the partitioning property.

You also need to know which partition(s) are involved.

for example, consider the following query:

create dataset foo(id int32, value string);
create dataset bar(id int32, value string);

insert into dataset foo(
  for $d in dataset bar
  where $d.key = 1
  return {
     "key":  $d.key + 3,
     "value":  $d.value
  }
);

The primary key search for bar goes to one node and the insert to foo goes to 
one node.  Both datasets are hash partitioned in the same way. But the two 
nodes could be different nodes.

I'm thinking where we can extend DefaultNodeGroupDomain with a boolean vector 
to indicate where the operator instances need to be run.

Also consider the following query (more general cases):

for $d in dataset bar
where $d.key = 1 OR $d.key=2
return $d

It would be nice that the query can only go to two nodes  (or one node), 
depending on the hash function.  That also seems to require an more general 
abstraction than isSingle.


https://asterix-gerrit.ics.uci.edu/#/c/427/10/algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules/EnforceStructuralPropertiesRule.java
File 
algebricks/algebricks-rewriter/src/main/java/org/apache/hyracks/algebricks/rewriter/rules/EnforceStructuralPropertiesRule.java:

Line 97: 
it seems the code style is not right.


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/427
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic836210e57b87128120e1f8dabbfed062d09f5e4
Gerrit-PatchSet: 10
Gerrit-Project: hyracks
Gerrit-Branch: master
Gerrit-Owner: Wenhai Li <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-Reviewer: Jianfeng Jia <[email protected]>
Gerrit-Reviewer: Steven Jacobs <[email protected]>
Gerrit-Reviewer: Till Westmann <[email protected]>
Gerrit-Reviewer: Wenhai Li <[email protected]>
Gerrit-Reviewer: Yingyi Bu <[email protected]>
Gerrit-HasComments: Yes

Reply via email to