Hyukjin Kwon created SPARK-19117:
------------------------------------

             Summary: script transformation does not work on Windows due to 
fixed bash executable location
                 Key: SPARK-19117
                 URL: https://issues.apache.org/jira/browse/SPARK-19117
             Project: Spark
          Issue Type: Bug
          Components: SQL
            Reporter: Hyukjin Kwon


Some tests fail on Windows (run via AppVeyor), as shown below, because of this 
problem:

{code}
 - script *** FAILED *** (553 milliseconds)
   org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 56.0 failed 1 times, most recent failure: Lost task 0.0 in stage 56.0 
(TID 54, localhost, executor driver): java.io.IOException: Cannot run program 
"/bin/bash": CreateProcess error=2, The system cannot find the file specified

 - Star Expansion - script transform *** FAILED *** (2 seconds, 375 
milliseconds)
   org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 389.0 failed 1 times, most recent failure: Lost task 0.0 in stage 389.0 
(TID 725, localhost, executor driver): java.io.IOException: Cannot run program 
"/bin/bash": CreateProcess error=2, The system cannot find the file specified

 - test script transform for stdout *** FAILED *** (2 seconds, 813 milliseconds)
   org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 391.0 failed 1 times, most recent failure: Lost task 0.0 in stage 391.0 
(TID 726, localhost, executor driver): java.io.IOException: Cannot run program 
"/bin/bash": CreateProcess error=2, The system cannot find the file specified

 - test script transform for stderr *** FAILED *** (2 seconds, 407 milliseconds)
   org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 393.0 failed 1 times, most recent failure: Lost task 0.0 in stage 393.0 
(TID 727, localhost, executor driver): java.io.IOException: Cannot run program 
"/bin/bash": CreateProcess error=2, The system cannot find the file specified

 - test script transform data type *** FAILED *** (171 milliseconds)
   org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 395.0 failed 1 times, most recent failure: Lost task 0.0 in stage 395.0 
(TID 728, localhost, executor driver): java.io.IOException: Cannot run program 
"/bin/bash": CreateProcess error=2, The system cannot find the file specified

 - transform *** FAILED *** (359 milliseconds)
   Failed to execute query using catalyst:
   Error: Job aborted due to stage failure: Task 0 in stage 1347.0 failed 1 
times, most recent failure: Lost task 0.0 in stage 1347.0 (TID 2395, localhost, 
executor driver): java.io.IOException: Cannot run program "/bin/bash": 
CreateProcess error=2, The system cannot find the file specified
  
 - schema-less transform *** FAILED *** (344 milliseconds)
   Failed to execute query using catalyst:
   Error: Job aborted due to stage failure: Task 0 in stage 1348.0 failed 1 
times, most recent failure: Lost task 0.0 in stage 1348.0 (TID 2396, localhost, 
executor driver): java.io.IOException: Cannot run program "/bin/bash": 
CreateProcess error=2, The system cannot find the file specified

 - transform with custom field delimiter *** FAILED *** (296 milliseconds)
   Failed to execute query using catalyst:
   Error: Job aborted due to stage failure: Task 0 in stage 1349.0 failed 1 
times, most recent failure: Lost task 0.0 in stage 1349.0 (TID 2397, localhost, 
executor driver): java.io.IOException: Cannot run program "/bin/bash": 
CreateProcess error=2, The system cannot find the file specified

 - transform with custom field delimiter2 *** FAILED *** (297 milliseconds)
   Failed to execute query using catalyst:
   Error: Job aborted due to stage failure: Task 0 in stage 1350.0 failed 1 
times, most recent failure: Lost task 0.0 in stage 1350.0 (TID 2398, localhost, 
executor driver): java.io.IOException: Cannot run program "/bin/bash": 
CreateProcess error=2, The system cannot find the file specified

 - transform with custom field delimiter3 *** FAILED *** (312 milliseconds)
   Failed to execute query using catalyst:
   Error: Job aborted due to stage failure: Task 0 in stage 1351.0 failed 1 
times, most recent failure: Lost task 0.0 in stage 1351.0 (TID 2399, localhost, 
executor driver): java.io.IOException: Cannot run program "/bin/bash": 
CreateProcess error=2, The system cannot find the file specified

 - transform with SerDe2 *** FAILED *** (437 milliseconds)
   org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 1355.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1355.0 
(TID 2403, localhost, executor driver): java.io.IOException: Cannot run program 
"/bin/bash": CreateProcess error=2, The system cannot find the file specified

 - script transformation - schemaless *** FAILED *** (78 milliseconds)
   ...
   Cause: org.apache.spark.SparkException: Job aborted due to stage failure: 
Task 0 in stage 1968.0 failed 1 times, most recent failure: Lost task 0.0 in 
stage 1968.0 (TID 3932, localhost, executor driver): java.io.IOException: 
Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find 
the file specified

 - script transformation - alias list *** FAILED *** (94 milliseconds)
   ...
   Cause: org.apache.spark.SparkException: Job aborted due to stage failure: 
Task 0 in stage 1969.0 failed 1 times, most recent failure: Lost task 0.0 in 
stage 1969.0 (TID 3933, localhost, executor driver): java.io.IOException: 
Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find 
the file specified

 - script transformation - alias list with type *** FAILED *** (93 milliseconds)
   ...
   Cause: org.apache.spark.SparkException: Job aborted due to stage failure: 
Task 0 in stage 1970.0 failed 1 times, most recent failure: Lost task 0.0 in 
stage 1970.0 (TID 3934, localhost, executor driver): java.io.IOException: 
Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find 
the file specified

 - script transformation - row format delimited clause with only one format 
property *** FAILED *** (78 milliseconds)
   ...
   Cause: org.apache.spark.SparkException: Job aborted due to stage failure: 
Task 0 in stage 1971.0 failed 1 times, most recent failure: Lost task 0.0 in 
stage 1971.0 (TID 3935, localhost, executor driver): java.io.IOException: 
Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find 
the file specified

 - script transformation - row format delimited clause with multiple format 
properties *** FAILED *** (94 milliseconds)
   ...
   Cause: org.apache.spark.SparkException: Job aborted due to stage failure: 
Task 0 in stage 1972.0 failed 1 times, most recent failure: Lost task 0.0 in 
stage 1972.0 (TID 3936, localhost, executor driver): java.io.IOException: 
Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find 
the file specified

 - script transformation - row format serde clauses with SERDEPROPERTIES *** 
FAILED *** (78 milliseconds)
   ...
   Cause: org.apache.spark.SparkException: Job aborted due to stage failure: 
Task 0 in stage 1973.0 failed 1 times, most recent failure: Lost task 0.0 in 
stage 1973.0 (TID 3937, localhost, executor driver): java.io.IOException: 
Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find 
the file specified

 - script transformation - row format serde clauses without SERDEPROPERTIES *** 
FAILED *** (78 milliseconds)
   ...
   Cause: org.apache.spark.SparkException: Job aborted due to stage failure: 
Task 0 in stage 1974.0 failed 1 times, most recent failure: Lost task 0.0 in 
stage 1974.0 (TID 3938, localhost, executor driver): java.io.IOException: 
Cannot run program "/bin/bash": CreateProcess error=2, The system cannot find 
the file specified

 - cat without SerDe *** FAILED *** (156 milliseconds)
   ...
   Caused by: java.io.IOException: Cannot run program "/bin/bash": 
CreateProcess error=2, The system cannot find the file specified

 - cat with LazySimpleSerDe *** FAILED *** (63 milliseconds)
    ...
    org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 
in stage 2383.0 failed 1 times, most recent failure: Lost task 0.0 in stage 
2383.0 (TID 4819, localhost, executor driver): java.io.IOException: Cannot run 
program "/bin/bash": CreateProcess error=2, The system cannot find the file 
specified

 - script transformation should not swallow errors from upstream operators (no 
serde) *** FAILED *** (78 milliseconds)
    ...
    org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 
in stage 2384.0 failed 1 times, most recent failure: Lost task 0.0 in stage 
2384.0 (TID 4820, localhost, executor driver): java.io.IOException: Cannot run 
program "/bin/bash": CreateProcess error=2, The system cannot find the file 
specified

 - script transformation should not swallow errors from upstream operators 
(with serde) *** FAILED *** (47 milliseconds)
    ...
    org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 
in stage 2385.0 failed 1 times, most recent failure: Lost task 0.0 in stage 
2385.0 (TID 4821, localhost, executor driver): java.io.IOException: Cannot run 
program "/bin/bash": CreateProcess error=2, The system cannot find the file 
specified

 - SPARK-14400 script transformation should fail for bad script command *** 
FAILED *** (47 milliseconds)
   "Job aborted due to stage failure: Task 0 in stage 2386.0 failed 1 times, 
most recent failure: Lost task 0.0 in stage 2386.0 (TID 4822, localhost, 
executor driver): java.io.IOException: Cannot run program "/bin/bash": 
CreateProcess error=2, The system cannot find the file specified
{code}

The problem is in this line:
https://github.com/apache/spark/blob/21c7539a5274a7e77686d17a6261d56592b85c2d/sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/ScriptTransformation.scala#L70

We always assume that {{bash}} is located at {{/bin/bash}}. In some cases, such 
as when bash is installed via Cygwin and used from cmd, or when bash is used on 
Windows 10, it is not located there.
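
One possible way to avoid the fixed location is to look for bash on the PATH 
before falling back to {{/bin/bash}}. The following is only a minimal sketch of 
that idea, not the change made in Spark. It is written in Java using JDK APIs 
only (so it can be called from Scala), and the names {{BashLocator}}, 
{{findOnPath}} and {{bashCommand}} are hypothetical and do not exist in Spark.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Spark: resolves bash from PATH instead of
// assuming it lives at /bin/bash.
final class BashLocator {

    // Returns the first executable file named "bash" or "bash.exe" found in
    // the directories listed in pathValue, or empty if there is none.
    static Optional<File> findOnPath(String pathValue, String pathSeparator) {
        if (pathValue == null || pathValue.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : pathValue.split(Pattern.quote(pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            for (String name : new String[] {"bash", "bash.exe"}) {
                File candidate = new File(dir, name);
                if (candidate.isFile() && candidate.canExecute()) {
                    return Optional.of(candidate);
                }
            }
        }
        return Optional.empty();
    }

    // Builds the argument list for ProcessBuilder. Falls back to the
    // hard-coded /bin/bash described in this issue when PATH has no bash.
    static List<String> bashCommand(String script) {
        String bash = findOnPath(System.getenv("PATH"), File.pathSeparator)
            .map(File::getAbsolutePath)
            .orElse("/bin/bash");
        return Arrays.asList(bash, "-c", script);
    }

    public static void main(String[] args) {
        // Prints the command line that would be passed to ProcessBuilder.
        System.out.println(String.join(" ", bashCommand("echo hello")));
    }
}
```

With such a helper, the process could be started with 
{{new ProcessBuilder(BashLocator.bashCommand(script))}}. On a machine with no 
bash at all this would still fail, so the affected tests would also need to be 
skipped or guarded in that case.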



