[ 
https://issues.apache.org/jira/browse/BEAM-7545?focusedWorklogId=277692&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-277692
 ]

ASF GitHub Bot logged work on BEAM-7545:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Jul/19 19:36
            Start Date: 16/Jul/19 19:36
    Worklog Time Spent: 10m 
      Work Description: akedin commented on pull request #9040: [BEAM-7545] 
Reordering Beam Joins
URL: https://github.com/apache/beam/pull/9040#discussion_r304084580
 
 

 ##########
 File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/planner/BeamRuleSets.java
 ##########
 @@ -103,6 +106,10 @@
 
           // join rules
           JoinPushExpressionsRule.INSTANCE,
+          JoinCommuteRule.INSTANCE,
 
 Review comment:
   There are a few separate kinds of tests relevant here:
   
   * tests of Beam/Beam SQL:
     * we test that the system works correctly;
     * we verify that expected inputs produce correct results;
        * as long as everything works correctly and produces correct data, 
everything is good;
     * if we add a new optimization rule that doesn't affect the behavior or 
correctness of the system, no tests in this category should fail;
   
   * tests of specific rules. They test that a concrete rule works as expected. 
To test it we need to make sure that:
     * without the rule the behavior doesn't happen (no reorders);
     * with the rule the behavior happens (reorder does happen);
     * this specific rule was applied and caused the change, and no other rule 
could have caused this effect:
       * this can potentially happen if there are multiple interdependent rules 
that trigger in the presence of each other. Whether or not this is a correct 
and intended situation, we should isolate this case;
     * if someone adds another rule to the bigger "production" set, this suite 
should not break, because it only tests the specific rule;
     * if we decide not to use the rule and exclude it from the "production" 
rule set, this test should not break;
   
   * "integration" test, or system configuration test. We ensure that the 
system is configured in a certain way:
      * we can look at what rules are enabled and fail if the enabled rule set 
is different from what we expect;
      * we can test that the larger "production" ruleset has some behavior, 
e.g. that all rules combined have the effect of enabling join reordering, so 
that the test breaks if someone disables it with another rule;
   
   I am not sure if we currently have a straightforward way to set up tests 
that exercise only one rule. E.g. to test a concrete rule we probably need to 
set up the parser and make it work with only the rules we need, probably even 
without Beam physical rules (which we also don't test directly). If we are 
able to get something like this from Calcite/Flink, we should do it.
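   
   As a rough illustration of the rule-specific checks listed above, here is a 
toy Java sketch. It does not use Calcite or Beam APIs: `Join`, `Rule`, 
`COMMUTE`, `Result` and `optimize` are hypothetical stand-ins, and the "swap 
when the left name sorts after the right" condition is an arbitrary placeholder 
for a real cost-based decision. It only shows the shape of the three checks: no 
reorder without the rule, a reorder with it, and the reorder attributed to that 
rule alone.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

/** Toy model of a planner; every name here is hypothetical, not a Beam/Calcite API. */
final class RuleIsolationSketch {

    /** A two-input join, identified only by the names of its inputs. */
    static final class Join {
        final String left;
        final String right;

        Join(String left, String right) {
            this.left = left;
            this.right = right;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Join
                && ((Join) o).left.equals(left)
                && ((Join) o).right.equals(right);
        }

        @Override
        public int hashCode() {
            return Objects.hash(left, right);
        }

        @Override
        public String toString() {
            return left + " JOIN " + right;
        }
    }

    /** A rewrite rule; it returns the input plan unchanged when it does not fire. */
    interface Rule {
        String name();

        Join apply(Join plan);
    }

    /** Placeholder commute rule: swaps inputs when the left name sorts after the right. */
    static final Rule COMMUTE = new Rule() {
        @Override
        public String name() {
            return "commute";
        }

        @Override
        public Join apply(Join plan) {
            return plan.left.compareTo(plan.right) > 0
                ? new Join(plan.right, plan.left)
                : plan;
        }
    };

    /** The final plan plus the names of the rules that actually changed it. */
    static final class Result {
        final Join plan;
        final List<String> firedRules;

        Result(Join plan, List<String> firedRules) {
            this.plan = plan;
            this.firedRules = firedRules;
        }
    }

    /** Applies each rule once, in order, recording which ones changed the plan. */
    static Result optimize(Join input, List<Rule> rules) {
        Join current = input;
        List<String> fired = new ArrayList<>();
        for (Rule rule : rules) {
            Join next = rule.apply(current);
            if (!next.equals(current)) {
                fired.add(rule.name());
                current = next;
            }
        }
        return new Result(current, fired);
    }

    public static void main(String[] args) {
        Join input = new Join("orders", "customers");
        Result withoutRule = optimize(input, Collections.<Rule>emptyList());
        Result withRule = optimize(input, Collections.singletonList(COMMUTE));
        System.out.println("without rule: " + withoutRule.plan);
        System.out.println("with rule:    " + withRule.plan + " via " + withRule.firedRules);
    }
}
```

   Because the rule list is passed in explicitly, such a test does not depend 
on the contents of the "production" rule set, which is the isolation property 
described above.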
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 277692)
    Time Spent: 8h 50m  (was: 8h 40m)

> Row Count Estimation for CSV TextTable
> --------------------------------------
>
>                 Key: BEAM-7545
>                 URL: https://issues.apache.org/jira/browse/BEAM-7545
>             Project: Beam
>          Issue Type: New Feature
>          Components: dsl-sql
>            Reporter: Alireza Samadianzakaria
>            Assignee: Alireza Samadianzakaria
>            Priority: Major
>             Fix For: Not applicable
>
>          Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> Implementing Row Count Estimation for CSV Tables by reading the first few 
> lines of the file and estimating the number of records based on the length of 
> these lines and the total length of the file.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
