> On May 26, 2015, 7:40 a.m., Yi Pan (Data Infrastructure) wrote:
> > samza-sql-calcite/src/main/java/org/apache/samza/sql/calcite/schema/Stream.java,
> >  line 80
> > <https://reviews.apache.org/r/33280/diff/5/?file=970946#file970946line80>
> >
> >     Question: do we have a way to specify the primary keys/index keys used 
> > in the stream?
> 
> Milinda Pathirage wrote:
>     We don't have a way to specify primary key in the current code. But I am 
> wondering whether primary key is really required during query execution. As I 
> understand primary key constraint is used during insertions in databases and 
> may be during query planning. But I never saw a need to use the primary key 
> during the query execution. May be this is because primary key constraint is 
> handled during insertion (at least in relational databases) and it is assumed 
> that there are no duplicate rows in query execution time. But this can be a 
> problem in our case because currently we don't have a way to handle 
> duplicates (tuples with same primary key) at ingest.
> 
> Yi Pan (Data Infrastructure) wrote:
>     Hi, Milinda, I wonder how we specify the ordering of records in a stream 
> if we don't know the index keys? I think that we would need that at least? I 
> was using offset as the unique identifier in the physical operator 
> implementation. But that's a system specific details. I think what we need at 
> least is the ordering key (i.e. in Order by clause, at least we will get the 
> index keys).

Hi, Yi, Current implementation use Calcite AbstractTable's getStatistics method 
to return statistics about the stream with ordering information. Please have a 
look at the OrderStreamTableFactory in org.apache.samza.sql.calcite.test. 
getStatistics implementation from that class:

```java
public Statistic getStatistic() {
  return Statistics.of(100d,
    ImmutableList.<ImmutableBitSet>of(),
    RelCollations.createSingleton(3));
}
```
  
```RelCollations.createSingleton(3)``` from above code tells Calcite planner 
that this streams is ordered (ascending)  by fourth column (or 'rowtime' 
column). We can also use RelFieldCollation instances to provide more 
information (e.g. change the order to descending) to the planner. We can use 
hints/constraints from DML to return proper statistics to the planner. But this 
information is not used outside of the planner. 

If we are to use information from DML or Calcite in the core, we can implement 
org.apache.samza.sql.api.dat.Stream interface in 
org.apache.samza.sql.calcite.schema.Stream. I'll have a look at that.


- Milinda


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33280/#review85153
-----------------------------------------------------------


On May 23, 2015, 4:27 p.m., Milinda Pathirage wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33280/
> -----------------------------------------------------------
> 
> (Updated May 23, 2015, 4:27 p.m.)
> 
> 
> Review request for samza, Guozhang Wang and Yi Pan (Data Infrastructure).
> 
> 
> Bugs: SAMZA-561
>     https://issues.apache.org/jira/browse/SAMZA-561
> 
> 
> Repository: samza
> 
> 
> Description
> -------
> 
> This patch contains initial query execution planner implementation based on 
> Apache Calcite. 
> 
> - Basic 'insert into' and 'where' clause support
> - Doesn't support projections, widnowing and aggregates. They will be added 
> later.
> 
> 
> Diffs
> -----
> 
>   build.gradle 16c3574 
>   samza-sql-calcite/src/main/java/org/apache/samza/sql/calcite/Utils.java 
> PRE-CREATION 
>   
> samza-sql-calcite/src/main/java/org/apache/samza/sql/calcite/planner/ExecutionPlanner.java
>  PRE-CREATION 
>   
> samza-sql-calcite/src/main/java/org/apache/samza/sql/calcite/planner/QueryPlanner.java
>  e1c22e9 
>   
> samza-sql-calcite/src/main/java/org/apache/samza/sql/calcite/planner/RexToJavaCompiler.java
>  PRE-CREATION 
>   
> samza-sql-calcite/src/main/java/org/apache/samza/sql/calcite/planner/rules/FilterableStreamScanRule.java
>  PRE-CREATION 
>   
> samza-sql-calcite/src/main/java/org/apache/samza/sql/calcite/planner/rules/ProjectableStreamScanRule.java
>  PRE-CREATION 
>   
> samza-sql-calcite/src/main/java/org/apache/samza/sql/calcite/planner/rules/RemoveIdentityProjectRule.java
>  PRE-CREATION 
>   
> samza-sql-calcite/src/main/java/org/apache/samza/sql/calcite/rel/ProjectableFilterableStreamScan.java
>  PRE-CREATION 
>   
> samza-sql-calcite/src/main/java/org/apache/samza/sql/calcite/rel/StreamScan.java
>  PRE-CREATION 
>   
> samza-sql-calcite/src/main/java/org/apache/samza/sql/calcite/schema/AvroSchemaConverter.java
>  705c0ff 
>   
> samza-sql-calcite/src/main/java/org/apache/samza/sql/calcite/schema/AvroSchemaUtils.java
>  PRE-CREATION 
>   
> samza-sql-calcite/src/main/java/org/apache/samza/sql/calcite/schema/RelDataTypeUtils.java
>  PRE-CREATION 
>   
> samza-sql-calcite/src/main/java/org/apache/samza/sql/calcite/schema/SamzaStreamType.java
>  PRE-CREATION 
>   
> samza-sql-calcite/src/main/java/org/apache/samza/sql/calcite/schema/Stream.java
>  PRE-CREATION 
>   
> samza-sql-calcite/src/main/java/org/apache/samza/sql/calcite/task/StreamSqlTask.java
>  PRE-CREATION 
>   
> samza-sql-calcite/src/test/java/org/apache/samza/sql/calcite/planner/SamzaStreamTableFactory.java
>  fd87aa5 
>   
> samza-sql-calcite/src/test/java/org/apache/samza/sql/calcite/planner/TestExecutionPlanner.java
>  PRE-CREATION 
>   
> samza-sql-calcite/src/test/java/org/apache/samza/sql/calcite/planner/TestQueryPlanner.java
>  0bb15b2 
>   
> samza-sql-calcite/src/test/java/org/apache/samza/sql/calcite/planner/TestRexToJavaCompiler.java
>  PRE-CREATION 
>   
> samza-sql-calcite/src/test/java/org/apache/samza/sql/calcite/schema/TestAvroSchemaConverter.java
>  fbb5c59 
>   
> samza-sql-calcite/src/test/java/org/apache/samza/sql/calcite/schema/TestAvroSchemaUtils.java
>  PRE-CREATION 
>   
> samza-sql-calcite/src/test/java/org/apache/samza/sql/calcite/test/Constants.java
>  PRE-CREATION 
>   
> samza-sql-calcite/src/test/java/org/apache/samza/sql/calcite/test/OrderStreamTableFactory.java
>  PRE-CREATION 
>   
> samza-sql-calcite/src/test/java/org/apache/samza/sql/calcite/test/Utils.java 
> PRE-CREATION 
>   samza-sql-core/src/main/java/org/apache/samza/sql/Utils.java PRE-CREATION 
>   samza-sql-core/src/main/java/org/apache/samza/sql/api/data/Field.java 
> PRE-CREATION 
>   samza-sql-core/src/main/java/org/apache/samza/sql/api/data/Schema.java 
> 1e8f192 
>   
> samza-sql-core/src/main/java/org/apache/samza/sql/api/expressions/Expression.java
>  PRE-CREATION 
>   
> samza-sql-core/src/main/java/org/apache/samza/sql/api/operators/OperatorSpec.java
>  4d670fd 
>   samza-sql-core/src/main/java/org/apache/samza/sql/data/DataUtils.java 
> PRE-CREATION 
>   
> samza-sql-core/src/main/java/org/apache/samza/sql/data/IntermediateMessageTuple.java
>  PRE-CREATION 
>   samza-sql-core/src/main/java/org/apache/samza/sql/data/avro/AvroSchema.java 
> 577cf74 
>   
> samza-sql-core/src/main/java/org/apache/samza/sql/data/serializers/SqlAvroSerdeFactory.java
>  aad18f4 
>   
> samza-sql-core/src/main/java/org/apache/samza/sql/data/string/StringSchema.java
>  348fc0c 
>   
> samza-sql-core/src/main/java/org/apache/samza/sql/operators/factory/SimpleOperatorFactoryImpl.java
>  cbc84d0 
>   
> samza-sql-core/src/main/java/org/apache/samza/sql/operators/factory/SimpleOperatorSpec.java
>  56753b6 
>   
> samza-sql-core/src/main/java/org/apache/samza/sql/operators/factory/TypeAwareOperatorSpec.java
>  PRE-CREATION 
>   
> samza-sql-core/src/main/java/org/apache/samza/sql/operators/insert/InsertToStreamOp.java
>  PRE-CREATION 
>   
> samza-sql-core/src/main/java/org/apache/samza/sql/operators/insert/InsertToStreamSpec.java
>  PRE-CREATION 
>   
> samza-sql-core/src/main/java/org/apache/samza/sql/operators/scan/FilterableStreamScanOp.java
>  PRE-CREATION 
>   
> samza-sql-core/src/main/java/org/apache/samza/sql/operators/scan/FilterableStreamScanSpec.java
>  PRE-CREATION 
>   
> samza-sql-core/src/main/java/org/apache/samza/sql/operators/scan/StreamScanSpec.java
>  PRE-CREATION 
>   
> samza-sql-core/src/test/java/org/apache/samza/sql/data/serializers/SqlAvroSerdeTest.java
>  7412669 
>   
> samza-sql-core/src/test/java/org/apache/samza/sql/data/serializers/TestSqlAvroSerde.java
>  PRE-CREATION 
>   samza-sql-core/src/test/resources/orders.avsc PRE-CREATION 
>   samza-test/src/main/config/sql-filter.properties PRE-CREATION 
>   
> samza-test/src/main/java/org/apache/samza/test/integration/sql/OrdersStreamFactory.java
>  PRE-CREATION 
>   samza-test/src/main/python/integration_tests.py df64e23 
>   samza-test/src/main/python/perf.py 144cf58 
>   samza-test/src/main/python/requirements.txt 2ae9590 
>   samza-test/src/main/python/tests/sql_tests.py PRE-CREATION 
>   samza-test/src/main/resources/orders.avsc PRE-CREATION 
>   samza-test/src/main/resources/orders.json PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/33280/diff/
> 
> 
> Testing
> -------
> 
> * ./bin/check-all.sh passed.
> * Integration tests passed including new streaming sql integration test.
> 
> 
> Thanks,
> 
> Milinda Pathirage
> 
>

Reply via email to