Pedro Prado created SPARK-15347:
-----------------------------------

             Summary: Problem select empty ORC table
                 Key: SPARK-15347
                 URL: https://issues.apache.org/jira/browse/SPARK-15347
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 1.6.1
         Environment: Hadoop 2.7.1.2.4.2.0-258
Subversion g...@github.com:hortonworks/hadoop.git -r 13debf893a605e8a88df18a7d8d214f571e05289
Compiled by jenkins on 2016-04-25T05:46Z
Compiled with protoc 2.5.0
From source with checksum 2a2d95f05ec6c3ac547ed58cab713ac
This command was run using /usr/hdp/2.4.2.0-258/hadoop/hadoop-common-2.7.1.2.4.2.0-258.jar

            Reporter: Pedro Prado
             Fix For: 1.6.0



Error when selecting from an empty ORC table:

    [pprado@hadoop-m ~]$ beeline -u jdbc:hive2://
    WARNING: Use "yarn jar" to launch YARN applications.
    Connecting to jdbc:hive2://
    Connected to: Apache Hive (version 1.2.1000.2.4.2.0-258)
    Driver: Hive JDBC (version 1.2.1000.2.4.2.0-258)
    Transaction isolation: TRANSACTION_REPEATABLE_READ
    Beeline version 1.2.1000.2.4.2.0-258 by Apache Hive

On beeline => create table my_test (id int, name String) stored as orc;
On beeline => select * from my_test;

    16/05/13 18:18:57 [main]: ERROR hdfs.KeyProviderCache: Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !!
    OK
    +-------------+---------------+--+
    | my_test.id  | my_test.name  |
    +-------------+---------------+--+
    +-------------+---------------+--+
    No rows selected (1.227 seconds)

Hive is OK!

Now, when I execute pyspark:

    Welcome to Spark version 1.6.1

    Using Python version 2.6.6 (r266:84292, Jul 23 2015 15:22:56)
    SparkContext available as sc, HiveContext available as sqlContext.

PySpark => sqlContext.sql("select * from my_test")

    16/05/13 18:33:41 INFO ParseDriver: Parsing command: select * from my_test
    16/05/13 18:33:41 INFO ParseDriver: Parse Completed
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/hdp/2.4.2.0-258/spark/python/pyspark/sql/context.py", line 580, in sql
        return DataFrame(self.ssql_ctx.sql(sqlQuery), self)
      File "/usr/hdp/2.4.2.0-258/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
      File "/usr/hdp/2.4.2.0-258/spark/python/pyspark/sql/utils.py", line 53, in deco
        raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
    pyspark.sql.utils.IllegalArgumentException: u'orcFileOperator: path hdfs://hadoop-m.c.sva-0001.internal:8020/apps/hive/warehouse/my_test does not have valid orc files matching the pattern'
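For what it's worth, the failure looks like it happens during ORC schema inference from files rather than in the metastore lookup. A sketch that I assume hits the same code path directly, pointing the ORC reader at the warehouse directory taken from the error message above (not verified on this cluster):

    PySpark => sqlContext.read.format("orc").load("hdfs://hadoop-m.c.sva-0001.internal:8020/apps/hive/warehouse/my_test")

Since the directory contains no ORC files yet, there is nothing to infer a schema from.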

When I create a Parquet table instead, everything works; I do not have this problem.
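A possible workaround, under my assumption that the error is triggered only by the absence of ORC files in the table directory: write a single row from beeline first, so the warehouse path holds a valid ORC file, then the select from PySpark succeeds.

    On beeline => insert into my_test values (1, 'dummy');
    PySpark => sqlContext.sql("select * from my_test").show()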



