[ https://issues.apache.org/jira/browse/PARQUET-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gabor Szadovszky resolved PARQUET-2072.
---------------------------------------
    Resolution: Fixed

> Do Not Determine Both Min/Max for Binary Stats
> ----------------------------------------------
>
> Key: PARQUET-2072
> URL: https://issues.apache.org/jira/browse/PARQUET-2072
> Project: Parquet
> Issue Type: Improvement
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Minor
>
> I'm looking at some benchmarking code of Apache ORC vs. Apache Parquet and
> see that Parquet is quite a bit slower for writes (reads TBD). Based on my
> investigation, I have noticed a significant amount of time spent in
> determining min/max for binary types.
> One quick improvement is to bypass the "max" value determination if the value
> has already been determined to be a "min".
> While I'm at it, remove calls to deprecated functions.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
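
The shortcut described in the issue can be sketched as follows. This is a minimal illustration, not the actual parquet-mr code: the class name BinaryMinMaxSketch and its methods are hypothetical, raw byte arrays stand in for Parquet's binary values, and an unsigned lexicographic comparison is assumed as the ordering. The point is only that a value smaller than the current min cannot also exceed the current max (because min <= max), so the second comparison can be skipped in that case.

```java
import java.util.Arrays;

/** Hypothetical sketch of min/max tracking for binary values (not parquet-mr code). */
public class BinaryMinMaxSketch {
    private byte[] min; // null until the first value is seen
    private byte[] max;

    /** Assumed ordering: unsigned lexicographic comparison (requires Java 9+). */
    private static int compare(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    /** Updates the running min/max with at most one "successful" branch per value. */
    public void update(byte[] value) {
        if (min == null) {
            // The first value is both the min and the max.
            min = value.clone();
            max = value.clone();
        } else if (compare(value, min) < 0) {
            // New min: since min <= max, the value cannot also be a new max,
            // so the comparison against max is bypassed.
            min = value.clone();
        } else if (compare(value, max) > 0) {
            max = value.clone();
        }
    }

    public byte[] min() {
        return min;
    }

    public byte[] max() {
        return max;
    }
}
```

With this structure, every value that lowers the min costs one comparison instead of two; values that fall between min and max still need both.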