Hi,

I am new to Spark development. When I ran the unit tests locally on
macOS 10.15.4, everything passed except a single test case, the
SPARK-6330 regression test. After struggling with it for a few hours, I
moved to Linux, where it passed. My Linux OS is Ubuntu 18.0.4.

Digging into the code, I believe the intention of the test is to validate
that the filesystem scheme is taken from the file path when no default
filesystem is provided, so that the code avoids the exception
"IllegalArgumentException: Wrong FS: hdfs://..., expected: file:///".
Instead, the code is expected to go further and fail with an error like
"UnknownHostException" when it tries to connect to the remote filesystem,
because the path is fake. However, the test broke in my local environment
because a different exception was thrown at that point:
"java.lang.IllegalArgumentException: Pathname  from hdfs://nonexistent is
not a valid DFS filename."
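
As a rough illustration of what "taken from the file path" means for the
two paths in the test, the sketch below splits them with java.net.URI only.
It does not use Hadoop's own classes, so treating it as equivalent to what
Hadoop does is an assumption. SchemeFromPath and parts are names made up
for this illustration.

```scala
import java.net.URI

// Hypothetical helper, not part of Spark or Hadoop.
object SchemeFromPath {
  // Splits a location string into (scheme, host, path) using java.net.URI.
  // The host is wrapped in Option because URI.getHost returns null when the
  // authority is empty, as in "file:///...".
  def parts(location: String): (String, Option[String], String) = {
    val uri = new URI(location)
    (uri.getScheme, Option(uri.getHost), uri.getPath)
  }

  def main(args: Array[String]): Unit = {
    // The two locations used by the SPARK-6330 regression test.
    println(parts("hdfs://nonexistent"))
    println(parts("file:///nonexistent"))
  }
}
```

For hdfs://nonexistent the host is "nonexistent" and the path component is
empty. That would be consistent with the blank pathname in the "not a valid
DFS filename" message, though I have not confirmed that this is the cause.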

The code is below:

  test("SPARK-6330 regression test") {
    // In 1.3.0, save to fs other than file: without configuring core-site.xml would get:
    // IllegalArgumentException: Wrong FS: hdfs://..., expected: file:///
    intercept[Throwable] {
      spark.read.parquet("file:///nonexistent")
    }
    val errorMessage = intercept[Throwable] {
      spark.read.parquet("hdfs://nonexistent")
    }.toString
    assert(errorMessage.contains("UnknownHostException"))
  }

I am wondering whether anyone has seen this test fail before. If so, what
did you change to make it pass? It could be something I missed when
setting up my local environment. I was using Hadoop 3.2 and Hive 2.3.

If it is due to a difference between operating systems, does it make sense
to change the test case to help local development? Though we have Jenkins,
we may still need to run tests locally sometimes. My proposals would be:

1. assert(!errorMessage.contains("Wrong FS"))
The risk is that a later version of Hadoop might change the content of the
error message.

2. assert(errorMessage.contains("UnknownHostException") ||
errorMessage.contains("not a valid DFS filename"))
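
To make the two proposals easier to compare, here is a small standalone
sketch that applies each check to sample messages. It does not touch Spark
or Hadoop. Spark6330Check and its methods are names made up for this
illustration, and the UnknownHostException sample string is an assumed
shape, not copied from a real run.

```scala
// Hypothetical helper, not part of the Spark test suite.
object Spark6330Check {
  // Proposal 1: only require that the old "Wrong FS" failure is absent.
  def passesProposal1(errorMessage: String): Boolean =
    !errorMessage.contains("Wrong FS")

  // Proposal 2: accept either of the two failures observed so far.
  def passesProposal2(errorMessage: String): Boolean =
    errorMessage.contains("UnknownHostException") ||
      errorMessage.contains("not a valid DFS filename")

  def main(args: Array[String]): Unit = {
    val samples = Seq(
      // Assumed shape of the message the current assertion expects.
      "java.net.UnknownHostException: nonexistent",
      // Message seen in my local macOS run.
      "java.lang.IllegalArgumentException: Pathname  from hdfs://nonexistent " +
        "is not a valid DFS filename.",
      // The original 1.3.0 failure that the test guards against.
      "IllegalArgumentException: Wrong FS: hdfs://..., expected: file:///"
    )
    samples.foreach { m =>
      println(s"proposal1=${passesProposal1(m)} proposal2=${passesProposal2(m)} <- $m")
    }
  }
}
```

Both checks accept the first two messages and reject the "Wrong FS" one.
Proposal 1 would also accept any unrelated failure, while proposal 2 has to
be extended whenever a new message shows up.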

Any suggestions would be really appreciated. Thanks for your time!


