[ https://issues.apache.org/jira/browse/SPARK-24930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiaochen Ouyang updated SPARK-24930:
------------------------------------
Description: 
# As the root user, create a test.txt file containing the record '123' in the /root/ directory.
# Switch to the mr user and run spark-shell --master local:
{code:java}
scala> spark.version
res2: String = 2.2.1

scala> spark.sql("create table t1(id int) partitioned by(area string)");
2018-07-26 17:20:37,523 WARN org.apache.hadoop.hive.metastore.HiveMetaStore: Location: hdfs://nameservice/spark/t1 specified for non-external table:t1
res4: org.apache.spark.sql.DataFrame = []

scala> spark.sql("load data local inpath '/root/test.txt' into table t1 partition(area ='025')")
org.apache.spark.sql.AnalysisException: LOAD DATA input path does not exist: /root/test.txt;
  at org.apache.spark.sql.execution.command.LoadDataCommand.run(tables.scala:339)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:68)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:639)
  ... 48 elided

scala>
{code}
In fact, the input path exists, but the mr user does not have permission to access the /root/ directory, so the message carried by the `AnalysisException` is misleading and steers the user away from the real problem. 
was:
# As the root user, create a test.txt file containing the record '123' in the /root/ directory.
# Switch to the mr user and run spark-shell --master local:
{code:java}
scala> spark.version
res2: String = 2.2.1

scala> spark.sql("load data local inpath '/root/test.txt' into table t1 partition(area ='025')")
org.apache.spark.sql.AnalysisException: LOAD DATA input path does not exist: /root/test.txt;
  at org.apache.spark.sql.execution.command.LoadDataCommand.run(tables.scala:339)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:68)
  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:639)
  ... 48 elided

scala>
{code}
In fact, the input path exists, but the mr user does not have permission to access the /root/ directory, so the message carried by the `AnalysisException` is misleading and steers the user away from the real problem. 
> Exception information is not accurate when using `LOAD DATA LOCAL INPATH`
> --------------------------------------------------------------------------
>
>                 Key: SPARK-24930
>                 URL: https://issues.apache.org/jira/browse/SPARK-24930
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.2.1, 2.2.2, 2.3.0, 2.3.1
>            Reporter: Xiaochen Ouyang
>            Priority: Major
>
> # As the root user, create a test.txt file containing the record '123' in the /root/ directory.
> # Switch to the mr user and run spark-shell --master local:
> {code:java}
> scala> spark.version
> res2: String = 2.2.1
>
> scala> spark.sql("create table t1(id int) partitioned by(area string)");
> 2018-07-26 17:20:37,523 WARN org.apache.hadoop.hive.metastore.HiveMetaStore: Location: hdfs://nameservice/spark/t1 specified for non-external table:t1
> res4: org.apache.spark.sql.DataFrame = []
>
> scala> spark.sql("load data local inpath '/root/test.txt' into table t1 partition(area ='025')")
> org.apache.spark.sql.AnalysisException: LOAD DATA input path does not exist: /root/test.txt;
>   at org.apache.spark.sql.execution.command.LoadDataCommand.run(tables.scala:339)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
>   at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:68)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:639)
>   ... 48 elided
>
> scala>
> {code}
> In fact, the input path exists, but the mr user does not have permission to access the /root/ directory, so the message carried by the `AnalysisException` is misleading and steers the user away from the real problem. 
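The confusing message arises because a plain existence check (e.g. `java.io.File.exists()` or `java.nio.file.Files.exists()`) returns false not only when a path is genuinely missing, but also when an ancestor directory denies traversal to the current user, so the check in `LoadDataCommand` cannot tell the two cases apart. A minimal sketch of how the two cases could be distinguished is below; the helper name, message wording, and overall approach are illustrative assumptions, not Spark's actual code:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LoadPathCheck {
    // Hypothetical helper: when the target path appears to be missing,
    // walk up the ancestor chain to see whether the nearest existing
    // directory is actually traversable. If it is not, the real problem
    // is a permission denial, not a missing path.
    static String describe(Path p) {
        if (Files.exists(p)) {
            return Files.isReadable(p)
                ? "OK"
                : "LOAD DATA input path is not readable: " + p;
        }
        // Files.exists() also returns false when stat() fails with
        // EACCES, i.e. when an ancestor directory lacks the search (x)
        // bit for the current user -- exactly the /root/ case above.
        for (Path dir = p.getParent(); dir != null; dir = dir.getParent()) {
            if (Files.exists(dir)) {
                if (!Files.isExecutable(dir)) {
                    return "LOAD DATA cannot access input path (permission denied on "
                            + dir + "): " + p;
                }
                // Nearest existing ancestor is traversable, so the
                // path really is missing.
                break;
            }
        }
        return "LOAD DATA input path does not exist: " + p;
    }

    public static void main(String[] args) {
        // A genuinely missing path still reports "does not exist".
        System.out.println(describe(Paths.get("/no/such/dir/test.txt")));
    }
}
```

Run as the mr user against /root/test.txt, the walk-up would find that /root exists but is not executable and report a permission problem instead of a misleading "does not exist".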
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org