[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2020-01-02 Thread Sofia (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17007014#comment-17007014
 ] 

Sofia commented on HIVE-21367:
--

[~sophie1] we re still using this workaround with the same version. I don't 
have any idea if this bug has been fixed in the latest version.

 

> Hive returns an incorrect result when using a simple select query
> -
>
> Key: HIVE-21367
> URL: https://issues.apache.org/jira/browse/HIVE-21367
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2, JDBC, SQL
>Affects Versions: 3.1.0
> Environment:  - HDP 3.1
>   - Hive 3.1.0
>   - Spark 2.3.2
>   - Sqoop 1.4.7
>Reporter: LEMBARKI Mohamed Amine
>Priority: Blocker
> Attachments: mapred_input_dir_recursive.png
>
>
> Hive returns an incorrect result when using a simple select query with a 
> where clause
>  While with an aggregation it returns a correct result
>  The problem arises for tables created by Spark or Sqoop
> Also when we use spark-shell with HiveWarehouseConnector it returns a correct 
> result
>  
> Workflow: 
>      - Loading data with sqoop to hive
>      - Data processing with spark using HiveWarehouseConnector and Storage to 
> Hive
>   
> below the error log :
>  
>  */-* 
>  *1 - Executing Query : select code from db1.tbl1 where code = '123'*
>  */-*
> {code:java}
> [data@data1 ~]$ hive -e "select code from db1.tbl1 where code = '123'"
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Connecting to 
> jdbc:hive2://data2:2181,data1:2181/default;password=data;serviceDiscoveryMode=zooKeeper;user=data;zooKeeperNamespace=hiveserver2
> 19/03/01 10:31:36 [main]: INFO jdbc.HiveConnection: Connected to data2:1
> Connected to: Apache Hive (version 3.1.0.3.1.0.0-78)
> Driver: Hive JDBC (version 3.1.0.3.1.0.0-78)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> INFO : Compiling 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2): 
> select code from db1.tbl1 where code = '123'
> INFO : Semantic Analysis Completed (retrial = false)
> INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:code, 
> type:string, comment:null)], properties:null)
> INFO : Completed compiling 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2); 
> Time taken: 0.142 seconds
> INFO : Executing 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2): 
> select code from db1.tbl1 where code = '123'
> INFO : Completed executing 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2); 
> Time taken: 0.003 seconds
> INFO : OK
> +--+
> | code |
> +--+
> +--+
> No rows selected (4,307 seconds)
> Beeline version 3.1.0.3.1.0.0-78 by Apache Hive
> Closing: 0: 
> jdbc:hive2://data2:2181,data1:2181/default;password=data;serviceDiscoveryMode=zooKeeper;user=data;zooKeeperNamespace=hiveserver2
> {code}
> */-*
> *2 - Executing Query using count :* 
>       *select count(code) from db1.tbl1 where code = '123'*
>  */-*
> {code:java}
> [data@data1 ~]$ hive -e "select count(code) from db1.tbl1 where code = '123'"
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Connecting to 
> jdbc:hive2://data2:2181,data1:2181/default;password=data;serviceDiscoveryMode=zooKeeper;user=data;zooKeeperNamespace=hiveserver2
> 19/03/01 10:31:56 [main]: INFO jdbc.HiveConnection: Connected to data2:1
> Connected to: Apache Hive (version 3.1.0.3.1.0.0-78)
> Driver: Hive JDBC (version 3.1.0.3.1.0.0-78)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> INFO : Compiling 
> command(queryId=hive_20190301103149_90aa338b-b99b-4f1c-b7e5-6b285f64cb3e): 
> select count(code) from db1.t

[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-05 Thread Sofia (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16784226#comment-16784226
 ] 

Sofia commented on HIVE-21367:
--

Both sqoop and spark makes subdirs.

Setting property hive.fetch.task.conversion= to none resolved the issue.

Thanks [~starphin]

> Hive returns an incorrect result when using a simple select query
> -
>
> Key: HIVE-21367
> URL: https://issues.apache.org/jira/browse/HIVE-21367
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2, JDBC, SQL
>Affects Versions: 3.1.0
> Environment:  - HDP 3.1
>   - Hive 3.1.0
>   - Spark 2.3.2
>   - Sqoop 1.4.7
>Reporter: LEMBARKI Mohamed Amine
>Priority: Blocker
> Attachments: mapred_input_dir_recursive.png
>
>
> Hive returns an incorrect result when using a simple select query with a 
> where clause
>  While with an aggregation it returns a correct result
>  The problem arises for tables created by Spark or Sqoop
> Also when we use spark-shell with HiveWarehouseConnector it returns a correct 
> result
>  
> Workflow: 
>      - Loading data with sqoop to hive
>      - Data processing with spark using HiveWarehouseConnector and Storage to 
> Hive
>   
> below the error log :
>  
>  */-* 
>  *1 - Executing Query : select code from db1.tbl1 where code = '123'*
>  */-*
> {code:java}
> [data@data1 ~]$ hive -e "select code from db1.tbl1 where code = '123'"
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Connecting to 
> jdbc:hive2://data2:2181,data1:2181/default;password=data;serviceDiscoveryMode=zooKeeper;user=data;zooKeeperNamespace=hiveserver2
> 19/03/01 10:31:36 [main]: INFO jdbc.HiveConnection: Connected to data2:1
> Connected to: Apache Hive (version 3.1.0.3.1.0.0-78)
> Driver: Hive JDBC (version 3.1.0.3.1.0.0-78)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> INFO : Compiling 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2): 
> select code from db1.tbl1 where code = '123'
> INFO : Semantic Analysis Completed (retrial = false)
> INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:code, 
> type:string, comment:null)], properties:null)
> INFO : Completed compiling 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2); 
> Time taken: 0.142 seconds
> INFO : Executing 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2): 
> select code from db1.tbl1 where code = '123'
> INFO : Completed executing 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2); 
> Time taken: 0.003 seconds
> INFO : OK
> +--+
> | code |
> +--+
> +--+
> No rows selected (4,307 seconds)
> Beeline version 3.1.0.3.1.0.0-78 by Apache Hive
> Closing: 0: 
> jdbc:hive2://data2:2181,data1:2181/default;password=data;serviceDiscoveryMode=zooKeeper;user=data;zooKeeperNamespace=hiveserver2
> {code}
> */-*
> *2 - Executing Query using count :* 
>       *select count(code) from db1.tbl1 where code = '123'*
>  */-*
> {code:java}
> [data@data1 ~]$ hive -e "select count(code) from db1.tbl1 where code = '123'"
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Connecting to 
> jdbc:hive2://data2:2181,data1:2181/default;password=data;serviceDiscoveryMode=zooKeeper;user=data;zooKeeperNamespace=hiveserver2
> 19/03/01 10:31:56 [main]: INFO jdbc.HiveConnection: Connected to data2:1
> Connected to: Apache Hive (version 3.1.0.3.1.0.0-78)
> Driver: Hive JDBC (version 3.1.0.3.1.0.0-78)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> INFO : Compiling 
> command(queryId=hive_20190301103149_90aa338b-b99b-4f1c-b7e5-6b285f64cb3e): 
> select count(code) from db1.tbl1 where code

[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-04 Thread Sofia (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16783485#comment-16783485
 ] 

Sofia commented on HIVE-21367:
--

The target table is from two different sources :
 * {color:#33}*From SQOOP*{color}: when loading tables we use the following 
code.

{code:java}
sqoop import --connect ${CONNECTION} \
--username ${USER} \
--password ${PASSWORD} \
--table $1 \
--hive-database $2 \
--hive-table ${TBNAME} \
--hive-import \
--as-orcfile \
--hive-overwrite \
-m 1 \
--delete-target-dir 

{code}
 *  *From SPARK*: when processing the data, we store the output as a table in 
hive using the following code.
{code:java}
df.write
  .mode(mode)
  .format(HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR)
  .option("table",tableName)
  .save(){code}
How do we load the data into the root path of the target table in each case ? 

> Hive returns an incorrect result when using a simple select query
> -
>
> Key: HIVE-21367
> URL: https://issues.apache.org/jira/browse/HIVE-21367
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2, JDBC, SQL
>Affects Versions: 3.1.0
> Environment:  - HDP 3.1
>   - Hive 3.1.0
>   - Spark 2.3.2
>   - Sqoop 1.4.7
>Reporter: LEMBARKI Mohamed Amine
>Priority: Blocker
> Attachments: mapred_input_dir_recursive.png
>
>
> Hive returns an incorrect result when using a simple select query with a 
> where clause
>  While with an aggregation it returns a correct result
>  The problem arises for tables created by Spark or Sqoop
> Also when we use spark-shell with HiveWarehouseConnector it returns a correct 
> result
>  
> Workflow: 
>      - Loading data with sqoop to hive
>      - Data processing with spark using HiveWarehouseConnector and Storage to 
> Hive
>   
> below the error log :
>  
>  */-* 
>  *1 - Executing Query : select code from db1.tbl1 where code = '123'*
>  */-*
> {code:java}
> [data@data1 ~]$ hive -e "select code from db1.tbl1 where code = '123'"
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Connecting to 
> jdbc:hive2://data2:2181,data1:2181/default;password=data;serviceDiscoveryMode=zooKeeper;user=data;zooKeeperNamespace=hiveserver2
> 19/03/01 10:31:36 [main]: INFO jdbc.HiveConnection: Connected to data2:1
> Connected to: Apache Hive (version 3.1.0.3.1.0.0-78)
> Driver: Hive JDBC (version 3.1.0.3.1.0.0-78)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> INFO : Compiling 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2): 
> select code from db1.tbl1 where code = '123'
> INFO : Semantic Analysis Completed (retrial = false)
> INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:code, 
> type:string, comment:null)], properties:null)
> INFO : Completed compiling 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2); 
> Time taken: 0.142 seconds
> INFO : Executing 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2): 
> select code from db1.tbl1 where code = '123'
> INFO : Completed executing 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2); 
> Time taken: 0.003 seconds
> INFO : OK
> +--+
> | code |
> +--+
> +--+
> No rows selected (4,307 seconds)
> Beeline version 3.1.0.3.1.0.0-78 by Apache Hive
> Closing: 0: 
> jdbc:hive2://data2:2181,data1:2181/default;password=data;serviceDiscoveryMode=zooKeeper;user=data;zooKeeperNamespace=hiveserver2
> {code}
> */-*
> *2 - Executing Query using count :* 
>       *select count(code) from db1.tbl1 where code = '123'*
>  */-*
> {code:java}
> [data@data1 ~]$ hive -e "select count(code) from db1.tbl1 where code = '123'"
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of t

[jira] [Commented] (HIVE-21367) Hive returns an incorrect result when using a simple select query

2019-03-04 Thread Sofia (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16783236#comment-16783236
 ] 

Sofia commented on HIVE-21367:
--

Hi [~starphin],  why do hive behave that way and create subdirs when executing 
a simple select ? Is there any workaround for that ?

> Hive returns an incorrect result when using a simple select query
> -
>
> Key: HIVE-21367
> URL: https://issues.apache.org/jira/browse/HIVE-21367
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2, JDBC, SQL
>Affects Versions: 3.1.0
> Environment:  - HDP 3.1
>   - Hive 3.1.0
>   - Spark 2.3.2
>   - Sqoop 1.4.7
>Reporter: LEMBARKI Mohamed Amine
>Priority: Blocker
>
> Hive returns an incorrect result when using a simple select query with a 
> where clause
>  While with an aggregation it returns a correct result
>  The problem arises for tables created by Spark or Sqoop
> Also when we use spark-shell with HiveWarehouseConnector it returns a correct 
> result
>  
> Workflow: 
>      - Loading data with sqoop to hive
>      - Data processing with spark using HiveWarehouseConnector and Storage to 
> Hive
>   
> below the error log :
>  
>  */-* 
>  *1 - Executing Query : select code from db1.tbl1 where code = '123'*
>  */-*
> {code:java}
> [data@data1 ~]$ hive -e "select code from db1.tbl1 where code = '123'"
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Connecting to 
> jdbc:hive2://data2:2181,data1:2181/default;password=data;serviceDiscoveryMode=zooKeeper;user=data;zooKeeperNamespace=hiveserver2
> 19/03/01 10:31:36 [main]: INFO jdbc.HiveConnection: Connected to data2:1
> Connected to: Apache Hive (version 3.1.0.3.1.0.0-78)
> Driver: Hive JDBC (version 3.1.0.3.1.0.0-78)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> INFO : Compiling 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2): 
> select code from db1.tbl1 where code = '123'
> INFO : Semantic Analysis Completed (retrial = false)
> INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:code, 
> type:string, comment:null)], properties:null)
> INFO : Completed compiling 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2); 
> Time taken: 0.142 seconds
> INFO : Executing 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2): 
> select code from db1.tbl1 where code = '123'
> INFO : Completed executing 
> command(queryId=hive_20190301103129_d48e71f6-a8dd-490e-a574-04d8d4f893e2); 
> Time taken: 0.003 seconds
> INFO : OK
> +--+
> | code |
> +--+
> +--+
> No rows selected (4,307 seconds)
> Beeline version 3.1.0.3.1.0.0-78 by Apache Hive
> Closing: 0: 
> jdbc:hive2://data2:2181,data1:2181/default;password=data;serviceDiscoveryMode=zooKeeper;user=data;zooKeeperNamespace=hiveserver2
> {code}
> */-*
> *2 - Executing Query using count :* 
>       *select count(code) from db1.tbl1 where code = '123'*
>  */-*
> {code:java}
> [data@data1 ~]$ hive -e "select count(code) from db1.tbl1 where code = '123'"
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/3.1.0.0-78/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> Connecting to 
> jdbc:hive2://data2:2181,data1:2181/default;password=data;serviceDiscoveryMode=zooKeeper;user=data;zooKeeperNamespace=hiveserver2
> 19/03/01 10:31:56 [main]: INFO jdbc.HiveConnection: Connected to data2:1
> Connected to: Apache Hive (version 3.1.0.3.1.0.0-78)
> Driver: Hive JDBC (version 3.1.0.3.1.0.0-78)
> Transaction isolation: TRANSACTION_REPEATABLE_READ
> INFO : Compiling 
> command(queryId=hive_20190301103149_90aa338b-b99b-4f1c-b7e5-6b285f64cb3e): 
> select count(code) from db1.tbl1 where code = '123'
> INFO : Semantic Analysis Completed (retria