[ 
https://issues.apache.org/jira/browse/SPARK-24400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16492291#comment-16492291
 ] 

Supreeth Sharma commented on SPARK-24400:
-----------------------------------------

cc [~hyukjin.kwon]

> Issue with spark while accessing managed table with partitions across 
> multiple namespaces - HDFS Federation
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-24400
>                 URL: https://issues.apache.org/jira/browse/SPARK-24400
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Submit
>    Affects Versions: 2.3.0
>            Reporter: Supreeth Sharma
>            Priority: Critical
>         Attachments: federation_managed_table.py
>
>
> Spark fails when accessing a managed, partitioned table whose location spans 
> multiple namespaces in an HDFS federated cluster.
> Test steps:
> 1) Create an HDFS federated cluster with two namespaces.
> 2) Create a managed table whose location is in the default namespace (CREATE 
> TABLE test_managed_tbl (id int, name string, dept string) PARTITIONED BY 
> (year int))
> 3) Insert a row into the table and check that the insert succeeds.
> 4) Alter the table to set a new location in the second namespace 
> (ALTER TABLE test_managed_tbl SET LOCATION 
> 'hdfs://ns2/apps/hive/warehouse/test_managed_tbl')
> 5) Insert a new row into the table (INSERT INTO test_managed_tbl 
> PARTITION (year=2017) VALUES (9,'Harris','CSE'))
> This insert fails with the following error:
> {code:java}
> 18/05/23 02:50:59 INFO FileUtils: Creating directory if it doesn't exist: hdfs://ns2/apps/hive/warehouse/test_managed_tbl/year=2017
> Traceback (most recent call last):
>   File "/tmp/federation_managed.py", line 17, in <module>
>     spark.sql("INSERT INTO test_managed_tbl PARTITION (year=2017) VALUES (9,'Harris','CSE')")
>   File "/usr/hdp/current/spark2-client/python/lib/pyspark.zip/pyspark/sql/session.py", line 714, in sql
>   File "/usr/hdp/current/spark2-client/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
>   File "/usr/hdp/current/spark2-client/python/lib/pyspark.zip/pyspark/sql/utils.py", line 69, in deco
> pyspark.sql.utils.AnalysisException: u'java.lang.IllegalArgumentException: Wrong FS: hdfs://ns2/apps/hive/warehouse/test_managed_tbl/.hive-staging_hive_2018-05-23_02-50-56_484_3662347267719413000-1/-ext-10000/part-00000-5ee3003b-d41f-41d8-adaa-8937919f896d-c000, expected: hdfs://ns1;'
> 18/05/23 02:50:59 INFO SparkContext: Invoking stop() from shutdown hook
> {code}
> spark-submit command:
> {code:java}
> spark-submit --master yarn-client --conf spark.sql.catalogImplementation=hive /tmp/federation_managed_table.py
> {code}
> The script federation_managed_table.py is attached.
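
For readers unfamiliar with the "Wrong FS" message in the trace above: it is the kind of error a Hadoop FileSystem client raises when it is handed a path whose scheme and authority do not match the filesystem it is bound to. The Python sketch below only mimics that comparison to show why a staging file under hdfs://ns2 is rejected by a client bound to hdfs://ns1. It is not Hadoop's or Spark's code: check_path is a hypothetical helper, the staging path is a shortened stand-in for the one in the log, and the real check handles more cases (default ports, paths without an authority).

```python
from urllib.parse import urlparse


def check_path(fs_uri: str, path: str) -> None:
    """Raise ValueError if `path` names a different filesystem than `fs_uri`.

    Hypothetical helper for illustration; not part of Hadoop or Spark.
    """
    fs = urlparse(fs_uri)
    target = urlparse(path)

    # A path without a scheme is taken as relative to this filesystem.
    if not target.scheme:
        return

    # Compare scheme and authority (the nameservice, e.g. ns1 or ns2).
    same_fs = (target.scheme.lower(), target.netloc.lower()) == (
        fs.scheme.lower(),
        fs.netloc.lower(),
    )
    if not same_fs:
        raise ValueError(
            f"Wrong FS: {path}, expected: {fs.scheme}://{fs.netloc}"
        )


# Shortened, made-up staging path under the table's new location.
staging = (
    "hdfs://ns2/apps/hive/warehouse/test_managed_tbl/"
    ".hive-staging_example/part-00000"
)

# A client bound to ns2 accepts the path.
check_path("hdfs://ns2", staging)

# A client bound to the default namespace (ns1) rejects it.
try:
    check_path("hdfs://ns1", staging)
    message = None
except ValueError as err:
    message = str(err)
```

If this reading is right, the insert in step 5 writes its staging files under the new ns2 location while some later step still uses a FileSystem object obtained for ns1.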



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
