[jira] [Commented] (IMPALA-8109) Impala cannot read the gzip files bigger than 2 GB
[ https://issues.apache.org/jira/browse/IMPALA-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16764091#comment-16764091 ]

hakki commented on IMPALA-8109:
---
Hi Tim,

I also tested the case where the file fits into a single block. Nothing changed.

> Impala cannot read the gzip files bigger than 2 GB
> --
>
> Key: IMPALA-8109
> URL: https://issues.apache.org/jira/browse/IMPALA-8109
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.12.0
> Reporter: hakki
> Priority: Major
>
> When querying a partition containing gzip files, the query fails with the
> error below:
> WARNINGS: Disk I/O error: Error seeking to -2147483648 in file:
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz:
> Error(255): Unknown error 255
> Root cause: EOFException: Cannot seek to negative offset
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file is
> a delimited text file and has a size of bigger than 2 GB (approx: 2.4 GB) The
> uncompressed size is ~13GB
> The impalad version is : 2.12.0-cdh5.15.0

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
[jira] [Comment Edited] (IMPALA-8109) Impala cannot read the gzip files bigger than 2 GB
[ https://issues.apache.org/jira/browse/IMPALA-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760860#comment-16760860 ]

hakki edited comment on IMPALA-8109 at 2/5/19 2:40 PM:
---
I tried a similar query on my table; however, it fails. Fsck output:
{code:java}
[root@ ~]# hdfs fsck /user/impala/test/XX_198457.log.gz
Connecting to namenode via http://FFF.FFF.FFF.FFF/user/impala/test/XX_198457.log.gz
FSCK started by root (auth:SIMPLE) from /XX.XX.XX.XX for path /user/impala/test/XX_198457.log.gz at Tue Feb 05 17:23:52 EET 2019
.Status: HEALTHY
Total size: 2173247385 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 17 (avg. block size 127838081 B)
Minimally replicated blocks: 17 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 10
Number of racks: 1
FSCK ended at Tue Feb 05 17:23:52 EET 2019 in 0 milliseconds
The filesystem under path '/user/impala/test/XX_198457.log.gz' is HEALTHY
{code}

was (Author: hakkibc):
I tried a similar query on my table; however, it fails. Fsck output:
{code:java}
[root@ ~]# hdfs fsck /user/impala/test/XX_198457.log.gz
Connecting to namenode via http://FFF.FFF.FFF.FFF/user/impala/test/XX_198457.log.gz
FSCK started by root (auth:SIMPLE) from /XX.XX.XX.XX for path /user/impala/test/XX_198457.log.gz at Tue Feb 05 17:23:52 EET 2019
.Status: HEALTHY
Total size: 2173247385 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 17 (avg. block size 127838081 B)
Minimally replicated blocks: 17 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 100
Number of racks: 1
FSCK ended at Tue Feb 05 17:23:52 EET 2019 in 0 milliseconds
The filesystem under path '/user/impala/test/XX_198457.log.gz' is HEALTHY
{code}

> Impala cannot read the gzip files bigger than 2 GB
> --
>
> Key: IMPALA-8109
> URL: https://issues.apache.org/jira/browse/IMPALA-8109
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.12.0
> Reporter: hakki
> Assignee: Tim Armstrong
> Priority: Major
>
> When querying a partition containing gzip files, the query fails with the
> error below:
> WARNINGS: Disk I/O error: Error seeking to -2147483648 in file:
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz:
> Error(255): Unknown error 255
> Root cause: EOFException: Cannot seek to negative offset
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file is
> a delimited text file and has a size of bigger than 2 GB (approx: 2.4 GB) The
> uncompressed size is ~13GB
> The impalad version is : 2.12.0-cdh5.15.0
[jira] [Comment Edited] (IMPALA-8109) Impala cannot read the gzip files bigger than 2 GB
[ https://issues.apache.org/jira/browse/IMPALA-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760860#comment-16760860 ]

hakki edited comment on IMPALA-8109 at 2/5/19 2:32 PM:
---
I tried a similar query on my table; however, it fails. Fsck output:
- [root@ ~]# hdfs fsck /user/impala/test/XX_198457.log.gz
Connecting to namenode via http://FFF.FFF.FFF.FFF/user/impala/test/XX_198457.log.gz
FSCK started by root (auth:SIMPLE) from /XX.XX.XX.XX for path /user/impala/test/XX_198457.log.gz at Tue Feb 05 17:23:52 EET 2019
.Status: HEALTHY
Total size: 2173247385 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 17 (avg. block size 127838081 B)
Minimally replicated blocks: 17 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 100
Number of racks: 1
FSCK ended at Tue Feb 05 17:23:52 EET 2019 in 0 milliseconds
The filesystem under path '/user/impala/test/XX_198457.log.gz' is HEALTHY-

was (Author: hakkibc):
I tried a similar query on my table; however, it fails. Fsck output:
[root@ ~]# hdfs fsck /user/impala/test/XX_198457.log.gz
Connecting to namenode via http://FFF.FFF.FFF.FFF/user/impala/test/XX_198457.log.gz
FSCK started by root (auth:SIMPLE) from /XX.XX.XX.XX for path /user/impala/test/XX_198457.log.gz at Tue Feb 05 17:23:52 EET 2019
.Status: HEALTHY
Total size: 2173247385 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 17 (avg. block size 127838081 B)
Minimally replicated blocks: 17 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 100
Number of racks: 1
FSCK ended at Tue Feb 05 17:23:52 EET 2019 in 0 milliseconds
The filesystem under path '/user/impala/test/XX_198457.log.gz' is HEALTHY

> Impala cannot read the gzip files bigger than 2 GB
> --
>
> Key: IMPALA-8109
> URL: https://issues.apache.org/jira/browse/IMPALA-8109
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.12.0
> Reporter: hakki
> Assignee: Tim Armstrong
> Priority: Major
>
> When querying a partition containing gzip files, the query fails with the
> error below:
> WARNINGS: Disk I/O error: Error seeking to -2147483648 in file:
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz:
> Error(255): Unknown error 255
> Root cause: EOFException: Cannot seek to negative offset
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file is
> a delimited text file and has a size of bigger than 2 GB (approx: 2.4 GB) The
> uncompressed size is ~13GB
> The impalad version is : 2.12.0-cdh5.15.0
[jira] [Commented] (IMPALA-8109) Impala cannot read the gzip files bigger than 2 GB
[ https://issues.apache.org/jira/browse/IMPALA-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760860#comment-16760860 ]

hakki commented on IMPALA-8109:
---
I tried a similar query on my table; however, it fails. Fsck output:
~[root@ ~]# hdfs fsck /user/impala/test/XX_198457.log.gz
Connecting to namenode via http://FFF.FFF.FFF.FFF/user/impala/test/XX_198457.log.gz
FSCK started by root (auth:SIMPLE) from /XX.XX.XX.XX for path /user/impala/test/XX_198457.log.gz at Tue Feb 05 17:23:52 EET 2019
.Status: HEALTHY
Total size: 2173247385 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 17 (avg. block size 127838081 B)
Minimally replicated blocks: 17 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 100
Number of racks: 1
FSCK ended at Tue Feb 05 17:23:52 EET 2019 in 0 milliseconds
The filesystem under path '/user/impala/test/XX_198457.log.gz' is HEALTHY~

> Impala cannot read the gzip files bigger than 2 GB
> --
>
> Key: IMPALA-8109
> URL: https://issues.apache.org/jira/browse/IMPALA-8109
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.12.0
> Reporter: hakki
> Assignee: Tim Armstrong
> Priority: Major
>
> When querying a partition containing gzip files, the query fails with the
> error below:
> WARNINGS: Disk I/O error: Error seeking to -2147483648 in file:
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz:
> Error(255): Unknown error 255
> Root cause: EOFException: Cannot seek to negative offset
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file is
> a delimited text file and has a size of bigger than 2 GB (approx: 2.4 GB) The
> uncompressed size is ~13GB
> The impalad version is : 2.12.0-cdh5.15.0
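The figures in the fsck output above can be cross-checked with a little arithmetic. The following is only a sketch: the constants are copied from the fsck output, while the helper name `exceeds_int32` and the choice to compare against the signed 32-bit limit are assumptions made for illustration, not something the thread states.

```python
# Values copied from the fsck output in the comment above.
TOTAL_SIZE_BYTES = 2173247385
TOTAL_BLOCKS = 17
REPORTED_AVG_BLOCK_SIZE = 127838081

# Largest value a signed 32-bit integer can hold (2 GiB minus one byte).
INT32_MAX = 2**31 - 1


def exceeds_int32(n: int) -> bool:
    """Hypothetical helper: True if n cannot be stored in a signed 32-bit integer."""
    return n > INT32_MAX


# fsck reports the average block size as an integer; floor division reproduces it.
computed_avg_block_size = TOTAL_SIZE_BYTES // TOTAL_BLOCKS

# How far the file extends past the 2 GiB boundary.
bytes_past_2gib = TOTAL_SIZE_BYTES - 2**31

print(computed_avg_block_size)          # 127838081
print(exceeds_int32(TOTAL_SIZE_BYTES))  # True
print(bytes_past_2gib)                  # 25763737
```

So the test file is roughly 25 MB past the 2 GiB mark, which makes it a reasonable reproduction candidate if the failure depends on that boundary.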
[jira] [Comment Edited] (IMPALA-8109) Impala cannot read the gzip files bigger than 2 GB
[ https://issues.apache.org/jira/browse/IMPALA-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760860#comment-16760860 ]

hakki edited comment on IMPALA-8109 at 2/5/19 2:33 PM:
---
I tried a similar query on my table; however, it fails. Fsck output:
[root@ ~]# hdfs fsck /user/impala/test/XX_198457.log.gz
Connecting to namenode via http://FFF.FFF.FFF.FFF/user/impala/test/XX_198457.log.gz
FSCK started by root (auth:SIMPLE) from /XX.XX.XX.XX for path /user/impala/test/XX_198457.log.gz at Tue Feb 05 17:23:52 EET 2019
.Status: HEALTHY
Total size: 2173247385 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 17 (avg. block size 127838081 B)
Minimally replicated blocks: 17 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 100
Number of racks: 1
FSCK ended at Tue Feb 05 17:23:52 EET 2019 in 0 milliseconds
The filesystem under path '/user/impala/test/XX_198457.log.gz' is HEALTHY

was (Author: hakkibc):
I tried a similar query on my table; however, it fails. Fsck output:
- [root@ ~]# hdfs fsck /user/impala/test/XX_198457.log.gz
Connecting to namenode via http://FFF.FFF.FFF.FFF/user/impala/test/XX_198457.log.gz
FSCK started by root (auth:SIMPLE) from /XX.XX.XX.XX for path /user/impala/test/XX_198457.log.gz at Tue Feb 05 17:23:52 EET 2019
.Status: HEALTHY
Total size: 2173247385 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 17 (avg. block size 127838081 B)
Minimally replicated blocks: 17 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 100
Number of racks: 1
FSCK ended at Tue Feb 05 17:23:52 EET 2019 in 0 milliseconds
The filesystem under path '/user/impala/test/XX_198457.log.gz' is HEALTHY-

> Impala cannot read the gzip files bigger than 2 GB
> --
>
> Key: IMPALA-8109
> URL: https://issues.apache.org/jira/browse/IMPALA-8109
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.12.0
> Reporter: hakki
> Assignee: Tim Armstrong
> Priority: Major
>
> When querying a partition containing gzip files, the query fails with the
> error below:
> WARNINGS: Disk I/O error: Error seeking to -2147483648 in file:
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz:
> Error(255): Unknown error 255
> Root cause: EOFException: Cannot seek to negative offset
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file is
> a delimited text file and has a size of bigger than 2 GB (approx: 2.4 GB) The
> uncompressed size is ~13GB
> The impalad version is : 2.12.0-cdh5.15.0
[jira] [Comment Edited] (IMPALA-8109) Impala cannot read the gzip files bigger than 2 GB
[ https://issues.apache.org/jira/browse/IMPALA-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760860#comment-16760860 ]

hakki edited comment on IMPALA-8109 at 2/5/19 2:32 PM:
---
I tried a similar query on my table; however, it fails. Fsck output:
[root@ ~]# hdfs fsck /user/impala/test/XX_198457.log.gz
Connecting to namenode via http://FFF.FFF.FFF.FFF/user/impala/test/XX_198457.log.gz
FSCK started by root (auth:SIMPLE) from /XX.XX.XX.XX for path /user/impala/test/XX_198457.log.gz at Tue Feb 05 17:23:52 EET 2019
.Status: HEALTHY
Total size: 2173247385 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 17 (avg. block size 127838081 B)
Minimally replicated blocks: 17 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 100
Number of racks: 1
FSCK ended at Tue Feb 05 17:23:52 EET 2019 in 0 milliseconds
The filesystem under path '/user/impala/test/XX_198457.log.gz' is HEALTHY

was (Author: hakkibc):
{{I tried a similar query on my table; however, it fails. Fsck output: }}
{{ [root@ ~]# hdfs fsck /user/impala/test/XX_198457.log.gz}}
{{ Connecting to namenode via [http://FFF.FFF.FFF.FFF/user/impala/test/XX_198457.log.gz|http://fff.fff.fff.fff/user/impala/test/XX_198457.log.gz]}}
{{ FSCK started by root (auth:SIMPLE) from /XX.XX.XX.XX for path /user/impala/test/XX_198457.log.gz at Tue Feb 05 17:23:52 EET 2019}}
{{ .Status: HEALTHY}}
{{ Total size: 2173247385 B}}
{{ Total dirs: 0}}
{{ Total files: 1}}
{{ Total symlinks: 0}}
{{ Total blocks (validated): 17 (avg. block size 127838081 B)}}
{{ Minimally replicated blocks: 17 (100.0 %)}}
{{ Over-replicated blocks: 0 (0.0 %)}}
{{ Under-replicated blocks: 0 (0.0 %)}}
{{ Mis-replicated blocks: 0 (0.0 %)}}
{{ Default replication factor: 2}}
{{ Average block replication: 2.0}}
{{ Corrupt blocks: 0}}
{{ Missing replicas: 0 (0.0 %)}}
{{ Number of data-nodes: 100}}
{{ Number of racks: 1}}
{{ FSCK ended at Tue Feb 05 17:23:52 EET 2019 in 0 milliseconds}}
{{The filesystem under path '/user/impala/test/XX_198457.log.gz' is HEALTHY}}

> Impala cannot read the gzip files bigger than 2 GB
> --
>
> Key: IMPALA-8109
> URL: https://issues.apache.org/jira/browse/IMPALA-8109
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.12.0
> Reporter: hakki
> Assignee: Tim Armstrong
> Priority: Major
>
> When querying a partition containing gzip files, the query fails with the
> error below:
> WARNINGS: Disk I/O error: Error seeking to -2147483648 in file:
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz:
> Error(255): Unknown error 255
> Root cause: EOFException: Cannot seek to negative offset
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file is
> a delimited text file and has a size of bigger than 2 GB (approx: 2.4 GB) The
> uncompressed size is ~13GB
> The impalad version is : 2.12.0-cdh5.15.0
[jira] [Comment Edited] (IMPALA-8109) Impala cannot read the gzip files bigger than 2 GB
[ https://issues.apache.org/jira/browse/IMPALA-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16760860#comment-16760860 ]

hakki edited comment on IMPALA-8109 at 2/5/19 2:31 PM:
---
{{I tried a similar query on my table; however, it fails. Fsck output: }}
{{ [root@ ~]# hdfs fsck /user/impala/test/XX_198457.log.gz}}
{{ Connecting to namenode via [http://FFF.FFF.FFF.FFF/user/impala/test/XX_198457.log.gz|http://fff.fff.fff.fff/user/impala/test/XX_198457.log.gz]}}
{{ FSCK started by root (auth:SIMPLE) from /XX.XX.XX.XX for path /user/impala/test/XX_198457.log.gz at Tue Feb 05 17:23:52 EET 2019}}
{{ .Status: HEALTHY}}
{{ Total size: 2173247385 B}}
{{ Total dirs: 0}}
{{ Total files: 1}}
{{ Total symlinks: 0}}
{{ Total blocks (validated): 17 (avg. block size 127838081 B)}}
{{ Minimally replicated blocks: 17 (100.0 %)}}
{{ Over-replicated blocks: 0 (0.0 %)}}
{{ Under-replicated blocks: 0 (0.0 %)}}
{{ Mis-replicated blocks: 0 (0.0 %)}}
{{ Default replication factor: 2}}
{{ Average block replication: 2.0}}
{{ Corrupt blocks: 0}}
{{ Missing replicas: 0 (0.0 %)}}
{{ Number of data-nodes: 100}}
{{ Number of racks: 1}}
{{ FSCK ended at Tue Feb 05 17:23:52 EET 2019 in 0 milliseconds}}
{{The filesystem under path '/user/impala/test/XX_198457.log.gz' is HEALTHY}}

was (Author: hakkibc):
I tried a similar query on my table; however, it fails. Fsck output:
~[root@ ~]# hdfs fsck /user/impala/test/XX_198457.log.gz
Connecting to namenode via http://FFF.FFF.FFF.FFF/user/impala/test/XX_198457.log.gz
FSCK started by root (auth:SIMPLE) from /XX.XX.XX.XX for path /user/impala/test/XX_198457.log.gz at Tue Feb 05 17:23:52 EET 2019
.Status: HEALTHY
Total size: 2173247385 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 17 (avg. block size 127838081 B)
Minimally replicated blocks: 17 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 2
Average block replication: 2.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 100
Number of racks: 1
FSCK ended at Tue Feb 05 17:23:52 EET 2019 in 0 milliseconds
The filesystem under path '/user/impala/test/XX_198457.log.gz' is HEALTHY~

> Impala cannot read the gzip files bigger than 2 GB
> --
>
> Key: IMPALA-8109
> URL: https://issues.apache.org/jira/browse/IMPALA-8109
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.12.0
> Reporter: hakki
> Assignee: Tim Armstrong
> Priority: Major
>
> When querying a partition containing gzip files, the query fails with the
> error below:
> WARNINGS: Disk I/O error: Error seeking to -2147483648 in file:
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz:
> Error(255): Unknown error 255
> Root cause: EOFException: Cannot seek to negative offset
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file is
> a delimited text file and has a size of bigger than 2 GB (approx: 2.4 GB) The
> uncompressed size is ~13GB
> The impalad version is : 2.12.0-cdh5.15.0
[jira] [Commented] (IMPALA-8109) Impala cannot read the gzip files bigger than 2 GB
[ https://issues.apache.org/jira/browse/IMPALA-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16752186#comment-16752186 ]

hakki commented on IMPALA-8109:
---
[~tarmstrong] [~boroknagyz]

There is an important point: I just tested the file on two different impalad versions.
1- On version 2.12.0-cdh5.15.0, the read fails.
2- However, on version 2.6.0-cdh5.8.3, the read of the same file is successful.

FYI. Thanks.

> Impala cannot read the gzip files bigger than 2 GB
> --
>
> Key: IMPALA-8109
> URL: https://issues.apache.org/jira/browse/IMPALA-8109
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.12.0
> Reporter: hakki
> Priority: Minor
>
> When querying a partition containing gzip files, the query fails with the
> error below:
> WARNINGS: Disk I/O error: Error seeking to -2147483648 in file:
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz:
> Error(255): Unknown error 255
> Root cause: EOFException: Cannot seek to negative offset
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file is
> a delimited text file and has a size of bigger than 2 GB (approx: 2.4 GB) The
> uncompressed size is ~13GB
> The impalad version is : 2.12.0-cdh5.15.0
[jira] [Updated] (IMPALA-8109) Impala cannot read the gzip files bigger than 2 GB
[ https://issues.apache.org/jira/browse/IMPALA-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hakki updated IMPALA-8109:
--
Description:
When querying a partition containing gzip files, the query fails with the error below:
WARNINGS: Disk I/O error: Error seeking to -2147483648 in file: hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz:
Error(255): Unknown error 255
Root cause: EOFException: Cannot seek to negative offset

hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file is a delimited text file and has a size of bigger than 2 GB (approx: 2.4 GB)
The uncompressed size is ~13GB

The impalad version is : 2.12.0-cdh5.15.0

was:
When querying a partition containing gzip files, the query fails with the error below:
WARNINGS: Disk I/O error: Error seeking to -2147483648 in file: hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz:
Error(255): Unknown error 255
Root cause: EOFException: Cannot seek to negative offset

hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file has a size of bigger than 2 GB (approx: 2.4 GB)

The exact version is : 2.12.0-cdh5.15.0

> Impala cannot read the gzip files bigger than 2 GB
> --
>
> Key: IMPALA-8109
> URL: https://issues.apache.org/jira/browse/IMPALA-8109
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.12.0
> Reporter: hakki
> Priority: Minor
>
> When querying a partition containing gzip files, the query fails with the
> error below:
> WARNINGS: Disk I/O error: Error seeking to -2147483648 in file:
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz:
> Error(255): Unknown error 255
> Root cause: EOFException: Cannot seek to negative offset
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file is
> a delimited text file and has a size of bigger than 2 GB (approx: 2.4 GB) The
> uncompressed size is ~13GB
> The impalad version is : 2.12.0-cdh5.15.0
[jira] [Updated] (IMPALA-8109) Impala cannot read the gzip files bigger than 2 GB
[ https://issues.apache.org/jira/browse/IMPALA-8109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

hakki updated IMPALA-8109:
--
Description:
When querying a partition containing gzip files, the query fails with the error below:
WARNINGS: Disk I/O error: Error seeking to -2147483648 in file: hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz:
Error(255): Unknown error 255
Root cause: EOFException: Cannot seek to negative offset

hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file has a size of bigger than 2 GB (approx: 2.4 GB)

The exact version is : 2.12.0-cdh5.15.0

was:
When querying a partition containing gzip files, the query fails with the error below:
WARNINGS: Disk I/O error: Error seeking to -2147483648 in file: hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz:
Error(255): Unknown error 255
Root cause: EOFException: Cannot seek to negative offset

hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file has a size of bigger than 2 GB (approx: 2.4 GB)

> Impala cannot read the gzip files bigger than 2 GB
> --
>
> Key: IMPALA-8109
> URL: https://issues.apache.org/jira/browse/IMPALA-8109
> Project: IMPALA
> Issue Type: Bug
> Components: Backend
> Affects Versions: Impala 2.12.0
> Reporter: hakki
> Priority: Minor
>
> When querying a partition containing gzip files, the query fails with the
> error below:
> WARNINGS: Disk I/O error: Error seeking to -2147483648 in file:
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz:
> Error(255): Unknown error 255
> Root cause: EOFException: Cannot seek to negative offset
> hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file has
> a size of bigger than 2 GB (approx: 2.4 GB)
> The exact version is : 2.12.0-cdh5.15.0
[jira] [Created] (IMPALA-8109) Impala cannot read the gzip files bigger than 2 GB
hakki created IMPALA-8109:
-

Summary: Impala cannot read the gzip files bigger than 2 GB
Key: IMPALA-8109
URL: https://issues.apache.org/jira/browse/IMPALA-8109
Project: IMPALA
Issue Type: Bug
Components: Backend
Affects Versions: Impala 2.12.0
Reporter: hakki

When querying a partition containing gzip files, the query fails with the error below:
WARNINGS: Disk I/O error: Error seeking to -2147483648 in file: hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz:
Error(255): Unknown error 255
Root cause: EOFException: Cannot seek to negative offset

hdfs://HADOOP_CLUSTER/user/hive/AAA/BBB/datehour=20180910/XXX.gz file has a size of bigger than 2 GB (approx: 2.4 GB)
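The report does not state a root cause. One observation, offered only as a hedged reading, is that the offset in the error, -2147483648, is exactly what the value 2^31 (the first byte offset past 2 GiB) becomes when it is forced into a signed 32-bit integer. The sketch below only demonstrates that arithmetic; the function name `wrap_to_int32` is hypothetical and nothing here is taken from Impala's code.

```python
INT32_MAX = 2**31 - 1  # 2147483647, the largest signed 32-bit value


def wrap_to_int32(offset: int) -> int:
    """Hypothetical helper: keep the low 32 bits of offset and read them as signed.

    This mimics what a narrowing conversion to a 32-bit signed integer
    typically produces on two's-complement hardware.
    """
    return (offset + 2**31) % 2**32 - 2**31


# Offsets below 2 GiB survive the conversion unchanged.
print(wrap_to_int32(INT32_MAX))      # 2147483647

# The first offset past 2 GiB wraps to the value seen in the error message.
print(wrap_to_int32(INT32_MAX + 1))  # -2147483648
```

If this reading is right, any file with data at or beyond the 2 GiB offset could trigger the negative seek, regardless of how the file is split into HDFS blocks. That remains an assumption until confirmed by the developers.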
[jira] [Commented] (IMPALA-4909) Redhat timezone update rpm causes queries to disappear from CM screen
[ https://issues.apache.org/jira/browse/IMPALA-4909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16675092#comment-16675092 ]

hakki commented on IMPALA-4909:
---
Rather, it seems to be a Cloudera Management Service (CMS) issue. After placing the new joda jar file under /usr/share/cmf/common_jars and restarting the CMS, the issue was resolved.

> Redhat timezone update rpm causes queries to disappear from CM screen
> -
>
> Key: IMPALA-4909
> URL: https://issues.apache.org/jira/browse/IMPALA-4909
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 2.2, Impala 2.6.0
> Reporter: hakki
> Priority: Minor
>
> After applying timezone update packages (tzdata-2016g-2.el6.noarch.rpm for
> redhat and tzdata2016g.tar.gz for java) to redhat 6.6 on which impala daemons
> run, queries do not appear on the cloudera manager impala queries screen.
> Note: Timezone update package is also applied to the java. Cloudera manager
> server is located on identical servers with impala daemons, catalog server
> and statestore.
> Reproduce scenario:
> 1- Install CDH-5.4.7 with parcel and embedded postgresql database. (all the
> os are redhat 6.6, the default timezone was EEST, initially)
> 2- After installation, apply tzdata-2016g-2.el6.noarch.rpm to all servers.
> 3- Apply java tz update package to java (java version "1.7.0_67" Java(TM) SE
> Runtime Environment (build 1.7.0_67-b01))
> 4- Run a query from impala
> 5- Open the impala queries screen from the cloudera manager.