[ https://issues.apache.org/jira/browse/SPARK-26225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16735954#comment-16735954 ]

Wenchen Fan commented on SPARK-26225:
-------------------------------------

I think it's hard to define the decoding time, as every data source may have its 
own definition.

For data source v1, I think we just need to update `RowDataSourceScanExec` and 
track the time spent in the unsafe projection that turns a Row into an InternalRow.
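To illustrate the idea, here is a minimal, Spark-free sketch of that approach: wrap the per-row decode function (standing in for the Row-to-InternalRow unsafe projection inside `RowDataSourceScanExec`) so the time spent decoding accumulates into a metric. The `TimeMetric` class and `timedDecode` helper are hypothetical names for illustration, not actual Spark internals.

```scala
object DecodingTimeSketch {
  // Stand-in for an SQLMetric that accumulates elapsed nanoseconds.
  final class TimeMetric { var totalNs: Long = 0L }

  // Wrap an iterator so each per-row decode call is timed and the
  // elapsed time is added to the metric.
  def timedDecode[A, B](input: Iterator[A],
                        decode: A => B,
                        metric: TimeMetric): Iterator[B] =
    input.map { row =>
      val start = System.nanoTime()
      val decoded = decode(row)
      metric.totalNs += System.nanoTime() - start
      decoded
    }

  def main(args: Array[String]): Unit = {
    val metric = new TimeMetric
    // Pretend the source produces string-encoded rows we must decode to Int.
    val rows = Iterator("1", "2", "3")
    val decoded = timedDecode[String, Int](rows, _.toInt, metric).toList
    println(decoded)             // List(1, 2, 3)
    println(metric.totalNs >= 0) // true
  }
}
```

Note that this times every row individually with `System.nanoTime()`, which is exactly the per-record overhead concern raised in the issue description; a real implementation might batch the measurement to amortize that cost.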

For data source v2, it's totally different: Spark needs to ask the data source 
to report the decoding time (or any other metrics). I'd like to defer this until 
after the data source v2 metrics API is introduced.

> Scan: track decoding time for row-based data sources
> ----------------------------------------------------
>
>                 Key: SPARK-26225
>                 URL: https://issues.apache.org/jira/browse/SPARK-26225
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Reynold Xin
>            Priority: Major
>
> Scan node should report the decoding time for each record, provided doing so 
> does not add too much overhead.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
