[GitHub] [spark] cchighman commented on pull request #28841: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

GitBox Sun, 26 Jul 2020 21:45:11 -0700


cchighman commented on pull request #28841:
URL: https://github.com/apache/spark/pull/28841#issuecomment-664115981



   > Test failure looks related.
   
   I'm a little baffled by the SparkR failures in the SparkSQL Arrow optimization tests. I'm hoping they clear up once the SQL test failures are resolved.
   
   I just noticed that these tests are being skipped in my local environment.
   **From Local:**
   ```
   test_sparkSQL_arrow.R:25: skip: createDataFrame/collect Arrow optimization
   Reason: arrow cannot be loaded
   ```
   
   **From Build:**
   ```
   2020-07-26T08:42:28.2744499Z ⠏ |   0       | SparkSQL Arrow optimization
   2020-07-26T08:42:28.2744809Z ⠋ |   0 1     | SparkSQL Arrow optimization
   2020-07-26T08:42:28.2745122Z ⠙ |   0 2     | SparkSQL Arrow optimization
   2020-07-26T08:42:28.2745447Z ⠹ |   0 3     | SparkSQL Arrow optimization
   2020-07-26T08:42:28.2745754Z ⠸ |   0 4     | SparkSQL Arrow optimization
   2020-07-26T08:42:28.2746063Z ⠼ |   0 5     | SparkSQL Arrow optimization
   2020-07-26T08:42:28.2746368Z ⠴ |   0 6     | SparkSQL Arrow optimization
   2020-07-26T08:42:28.2746888Z ⠦ |   0 7     | SparkSQL Arrow optimization
   2020-07-26T08:42:28.2747222Z ⠧ |   0 8     | SparkSQL Arrow optimization
   2020-07-26T08:42:28.2747545Z ⠇ |   0 9     | SparkSQL Arrow optimization
   2020-07-26T08:42:28.2748054Z ⠙ |   3 9     | SparkSQL Arrow optimization
   2020-07-26T08:42:28.2748439Z ✖ |   5 9     | SparkSQL Arrow optimization [12.5 s]
   2020-07-26T08:42:28.2752608Z ────────────────────────────────────────────────────────────────────────────────
   2020-07-26T08:42:28.2752908Z test_sparkSQL_arrow.R:39: error: createDataFrame/collect Arrow optimization
   2020-07-26T08:42:28.2753470Z (converted from warning) Use 'read_ipc_stream' or 'read_feather' instead.
   2020-07-26T08:42:28.2753635Z Backtrace:
   2020-07-26T08:42:28.2753875Z   1. base::tryCatch(...) tests/fulltests/test_sparkSQL_arrow.R:39:2
   2020-07-26T08:42:28.2754069Z   7. SparkR::collect(createDataFrame(mtcars))
   2020-07-26T08:42:28.2754233Z   8. SparkR:::.local(x, ...)
   2020-07-26T08:42:28.2754408Z  11. arrow::read_arrow(readRaw(conn))
   2020-07-26T08:42:28.2754958Z  12. base::.Deprecated(msg = "Use 'read_ipc_stream' or 'read_feather' instead.")
   2020-07-26T08:42:28.2755145Z  13. base::warning(...)
   2020-07-26T08:42:28.2755305Z  14. base::withRestarts(...)
   2020-07-26T08:42:28.2755502Z  15. base:::withOneRestart(expr, restarts[[1L]])
   2020-07-26T08:42:28.2755702Z  16. base:::doWithOneRestart(return(expr), restart)
   2020-07-26T08:42:28.2755790Z 
   2020-07-26T08:42:28.2756278Z test_sparkSQL_arrow.R:54: error: createDataFrame/collect Arrow optimization - many partitions (partition order test)
   2020-07-26T08:42:28.2756658Z (converted from warning) Use 'read_ipc_stream' or 'read_feather' instead.
   2020-07-26T08:42:28.2756826Z Backtrace:
   2020-07-26T08:42:28.2757048Z   1. base::tryCatch(...) tests/fulltests/test_sparkSQL_arrow.R:54:2
   2020-07-26T08:42:28.2757267Z   7. SparkR::collect(createDataFrame(mtcars, numPartitions = 32))
   2020-07-26T08:42:28.2757434Z   8. SparkR:::.local(x, ...)
   2020-07-26T08:42:28.2757791Z  11. arrow::read_arrow(readRaw(conn))
   2020-07-26T08:42:28.2758303Z  12. base::.Deprecated(msg = "Use 'read_ipc_stream' or 'read_feather' instead.")
   2020-07-26T08:42:28.2758506Z  13. base::warning(...)
   2020-07-26T08:42:28.2758675Z  14. base::withRestarts(...)
   2020-07-26T08:42:28.2758881Z  15. base:::withOneRestart(expr, restarts[[1L]])
   2020-07-26T08:42:28.2759088Z  16. base:::doWithOneRestart(return(expr), restart)
   ```
   
   
   

