[ 
https://issues.apache.org/jira/browse/ARROW-13266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Diana Clarke updated ARROW-13266:
---------------------------------
    Description: 
I propose we make the following changes to the JS benchmark results so that 
they are easier for Conbench to parse. We'll also need to add the suite name to 
the JSON results to disambiguate otherwise duplicate benchmark names.

1) Rename {{name}} to {{column}}

    {code}"name": "name: 'lat', length: 1,000,000, type: Float32",{code}
    
    vs.
    
    {code}"name": "column: 'lat', length: 1,000,000, type: Float32",{code}

2) Add the suite name to the JSON results.

3) Remove the dataset name from the suite name and move it to the benchmark name.
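
To illustrate why the suite name is needed, here is a minimal TypeScript sketch that finds benchmark keys that collide, with and without a suite qualifier. The {{suite}} field name, the {{BenchmarkResult}} shape, and the {{findDuplicateKeys}} helper are assumptions made for illustration; none of them is taken from the current JSON output or from js/perf/index.ts.

```typescript
// Assumed shape of one entry in the JSON results (hypothetical).
interface BenchmarkResult {
    suite: string; // assumed field name for the proposed suite name
    name: string;  // benchmark name as it appears in the JSON today
}

// Return every key that occurs more than once. When useSuite is true,
// the key is the suite name plus the benchmark name.
function findDuplicateKeys(results: BenchmarkResult[], useSuite: boolean): string[] {
    const counts: Record<string, number> = {};
    for (const r of results) {
        const key = useSuite ? `${r.suite} | ${r.name}` : r.name;
        counts[key] = (counts[key] || 0) + 1;
    }
    return Object.keys(counts).filter((key) => counts[key] > 1);
}
```

If two suites each report a benchmark with the same name, calling the helper with {{useSuite}} set to false reports that name as a duplicate, while calling it with {{useSuite}} set to true reports none.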


  was:
1) I found the double usage of "name" confusing.

    {code}"name": "name: 'lat', length: 1,000,000, type: Float32",{code}
    
    Perhaps `column` instead?
    
    {code}"name": "column: 'lat', length: 1,000,000, type: Float32",{code}

2) It would probably be more readable if {{tracks}} was in single quotes.

{{Running "Get "tracks" values by index" suite...}}

3) The names could be more informative (and there are currently duplicates). I 
see the following in the json.

{code}
    "name": "Table.from",
    "name": "readBatches",
    "name": "serialize",
    "name": "name: 'lat', length: 1,000,000, type: Float32",
    "name": "name: 'lng', length: 1,000,000, type: Float32",
    "name": "name: 'origin', length: 1,000,000, type: Dictionary<Int8, Utf8>",
    "name": "name: 'destination', length: 1,000,000, type: Dictionary<Int8, Utf8>",
    "name": "name: 'lat', length: 1,000,000, type: Float32",
    "name": "name: 'lng', length: 1,000,000, type: Float32",
    "name": "name: 'origin', length: 1,000,000, type: Dictionary<Int8, Utf8>",
    "name": "name: 'destination', length: 1,000,000, type: Dictionary<Int8, Utf8>",
    "name": "name: 'lat', length: 1,000,000, type: Float32",
    "name": "name: 'lng', length: 1,000,000, type: Float32",
    "name": "name: 'origin', length: 1,000,000, type: Dictionary<Int8, Utf8>",
    "name": "name: 'destination', length: 1,000,000, type: Dictionary<Int8, Utf8>",
    "name": "name: 'lat', length: 1,000,000, type: Float32",
    "name": "name: 'lng', length: 1,000,000, type: Float32",
    "name": "name: 'origin', length: 1,000,000, type: Dictionary<Int8, Utf8>",
    "name": "name: 'destination', length: 1,000,000, type: Dictionary<Int8, Utf8>",
    "name": "length: 1,000,000",
    "name": "name: 'lat', length: 1,000,000, type: Float32, test: gt, value: 0",
    "name": "name: 'lng', length: 1,000,000, type: Float32, test: gt, value: 0",
    "name": "name: 'origin', length: 1,000,000, type: Dictionary<Int8, Utf8>, test: eq, value: Seattle",
{code}


  Yet I do see informative names in the code (like {{DataFrame Count By...}} & 
{{DataFrame Filter-Scan Count...}}):

    - 
https://github.com/apache/arrow/blob/5ca16287a389afceabdd4b487d2e43e62745abcc/js/perf/index.ts#L124
    - 
https://github.com/apache/arrow/blob/5ca16287a389afceabdd4b487d2e43e62745abcc/js/perf/index.ts#L114
 
   Perhaps add the suite name? And make the values a JSON object rather than a 
single string of comma-separated values?
   
   Something like this:
   
{code}
       ...
       "name": "DataFrame Count By",
       "values": {
           "column": "lng",
           "length": "1,000,000",
           "type": "Float32",
           "test": "gt",
           "value": "0"
        }
        ...
{code}    
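
One reason a JSON object would be easier to consume: the current string cannot simply be split on commas, because values such as {{1,000,000}} and {{Dictionary<Int8, Utf8>}} contain commas themselves. Below is a rough TypeScript sketch of what a consumer would otherwise have to do. {{parseBenchmarkName}} is a hypothetical helper, and it assumes every key is a single word followed by a colon and a space.

```typescript
type BenchmarkValues = Record<string, string>;

// Hypothetical helper, not part of js/perf/index.ts.
function parseBenchmarkName(name: string): BenchmarkValues {
    const values: BenchmarkValues = {};
    // Split only on ", " that is directly followed by "key: ", so the commas
    // inside "1,000,000" and "Dictionary<Int8, Utf8>" are left alone.
    for (const part of name.split(/, (?=\w+: )/)) {
        const sep = part.indexOf(": ");
        if (sep === -1) continue; // not a key/value pair, e.g. "Table.from"
        const key = part.slice(0, sep);
        // Drop the single quotes around column names such as 'lat'.
        const value = part.slice(sep + 2).replace(/^'(.*)'$/, "$1");
        // Apply the proposed rename of the inner "name" key to "column".
        values[key === "name" ? "column" : key] = value;
    }
    return values;
}
```

Emitting the values as a JSON object in the first place would make this kind of parsing unnecessary.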


> [JS] Improve benchmark names
> ----------------------------
>
>                 Key: ARROW-13266
>                 URL: https://issues.apache.org/jira/browse/ARROW-13266
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: JavaScript
>            Reporter: Diana Clarke
>            Assignee: Diana Clarke
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> I propose we make the following changes to the JS benchmark results so that 
> they are easier for Conbench to parse. We'll also need to add the suite name to 
> the JSON results to disambiguate otherwise duplicate benchmark names.
> 1) Rename {{name}} to {{column}}
>     {code}"name": "name: 'lat', length: 1,000,000, type: Float32",{code}
>     
>     vs.
>     
>     {code}"name": "column: 'lat', length: 1,000,000, type: Float32",{code}
> 2) Add the suite name to the JSON results.
> 3) Remove the dataset name from the suite name and move it to the benchmark name.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
