[ 
https://issues.apache.org/jira/browse/BEAM-6276?focusedWorklogId=178318&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-178318
 ]

ASF GitHub Bot logged work on BEAM-6276:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 23/Dec/18 11:28
            Start Date: 23/Dec/18 11:28
    Worklog Time Spent: 10m 
      Work Description: kanterov commented on issue #7331: [BEAM-6276] Fix 
performance regression.
URL: https://github.com/apache/beam/pull/7331#issuecomment-449630174
 
 
   @reuvenlax I agree that it isn't a problem for graph construction. I didn't 
elaborate properly, but the reason to have a deterministic schema is so that 
it can be passed to a method like:
   
   ```
   class FromRowUsingCreatorGenerator {
     public static <T> FromRowUsingCreator<T> generate(
         Class<T> clazz, Schema schema);
   }
   ```
   
   and then generate byte code based on a schema and class:
   
   ```java
   public class GeneratedSchemaUserTypeCreator
       extends SerializableFunction4<Object, Object, Object, Object, JavaBean>
       implements UserTypeCreatorFactory {
     private final FieldValueSetter[] setters;
   
     Object apply(Object... args) { // for UserTypeCreatorFactory
       return apply(args[0], args[1], args[2], args[3]);
     }
   
     Object apply(Object p0, Object p1, Object p2, Object p3) { // faster
       // we don't use newInstance; instead we just generate byte code with a
       // constructor call
       Object object = new JavaBean();
       setters[0].set(object, p0);
       setters[1].set(object, p1);
       setters[2].set(object, p2);
       setters[3].set(object, p3);
       return object;
     }
   }
   
   public class GeneratedFromRowUsingCreator
       extends FromRowUsingCreator<JavaBean> {
   
     private final SerializableFunction4 creator;
     // for field_1
     private final FromRowUsingCreator<InnerJavaBean> underlying1;
   
     GeneratedFromRowUsingCreator(Schema schema) {
       // we know that it is SerializableFunction4 because there are 4 fields
       // in the schema
       creator = (SerializableFunction4)
           schemaTypeCreatorFactory.create(JavaBean.class, schema);
     }
   
     public JavaBean apply(Row row) {
       // calling .apply(Object p0, ..., Object p3) is much faster than
       // `.apply(Object... params)` due to JIT
       return (JavaBean) creator.apply(
           row.getValue(0),
           // byte code will contain a call to the underlying FromRowUsingCreator
           // only in the case of a ROW, MAP or ARRAY field
           underlying1.apply(row.getRow(1)),
           row.getValue(2),
           row.getValue(3));
     }
     
   }
   ```
   
   When I benchmarked this, I used JMH and didn't include any cost that we 
pay only once per pipeline construction.
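
To make the fixed-arity idea concrete, here is a minimal, self-contained sketch of the shape the generated creator could take. All names in it (`PersonBean`, `Function4`, `VarargsCreator`, `FixedArityCreatorSketch`) are hypothetical stand-ins rather than Beam classes, and a hand-written lambda stands in for the generated byte code. It only illustrates the two call paths; it does not measure or demonstrate the JIT effect described above.

```java
// Hypothetical stand-in for the JavaBean type from the comment above.
final class PersonBean {
  String name;
  Integer age;
  String city;
  Boolean active;
}

// Hypothetical fixed-arity function type, playing the role of SerializableFunction4.
interface Function4<A, B, C, D, R> {
  R apply(A a, B b, C c, D d);
}

// Hypothetical varargs creator, playing the role of the generic factory path.
interface VarargsCreator<R> {
  R create(Object... args);
}

final class FixedArityCreatorSketch {
  // Roughly what generated code would amount to: a direct constructor call
  // followed by direct field writes, with no reflection and no Object[] argument.
  static final Function4<Object, Object, Object, Object, PersonBean> FIXED =
      (p0, p1, p2, p3) -> {
        PersonBean bean = new PersonBean();
        bean.name = (String) p0;
        bean.age = (Integer) p1;
        bean.city = (String) p2;
        bean.active = (Boolean) p3;
        return bean;
      };

  // Varargs bridge for callers that only know the generic interface;
  // it unpacks the array and delegates to the fixed-arity path.
  static final VarargsCreator<PersonBean> VARARGS =
      args -> FIXED.apply(args[0], args[1], args[2], args[3]);

  public static void main(String[] args) {
    PersonBean direct = FIXED.apply("Ada", 36, "London", true);
    PersonBean bridged = VARARGS.create("Ada", 36, "London", true);
    System.out.println(direct.name + " " + bridged.age);
  }
}
```

The bridge keeps the generic interface usable, while a caller that knows the field count at generation time can bind to the four-argument method directly.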
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 178318)
    Time Spent: 2h 40m  (was: 2.5h)

> Performance regression caused by extra calls to TypeDescriptor.getRawType
> -------------------------------------------------------------------------
>
>                 Key: BEAM-6276
>                 URL: https://issues.apache.org/jira/browse/BEAM-6276
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>            Reporter: Reuven Lax
>            Assignee: Reuven Lax
>            Priority: Major
>          Time Spent: 2h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
