Re: how to let hive support lzo

2013-07-22 Thread Sanjay Subramanian
This works for us

SET hive.exec.compress.intermediate=true
SET hive.exec.compress.output=true
SET mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec
SET mapreduce.map.output.compress=true
SET mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec
SET mapreduce.output.fileoutputformat.compress=true
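
For a one-off query, the same settings can also be passed on the command line instead of being typed into the Hive shell. The sketch below wraps that in a small shell function. The function name run_compressed_query is made up for illustration; --hiveconf and -e are standard hive CLI options, and the property values are the ones listed above. Whether the LZO codec class actually loads still depends on hadoop-lzo being installed on the cluster.

```shell
#!/bin/sh
# Sketch: run one Hive query with the compression settings listed above.
# run_compressed_query is a hypothetical helper name, not part of Hive.
run_compressed_query() {
  hive \
    --hiveconf hive.exec.compress.intermediate=true \
    --hiveconf hive.exec.compress.output=true \
    --hiveconf mapreduce.output.fileoutputformat.compress=true \
    --hiveconf mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec \
    --hiveconf mapreduce.map.output.compress=true \
    --hiveconf mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec \
    -e "$1"
}
```

Usage would look like: run_compressed_query 'select count(*) from test;'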


From: "bejoy...@yahoo.com" <bejoy...@yahoo.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>, "bejoy...@yahoo.com" <bejoy...@yahoo.com>
Date: Monday, July 22, 2013 5:09 AM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Re: how to let hive support lzo


Hi,

Along with the mapred.compress* properties try to set
hive.exec.compress.output to true.
Regards
Bejoy KS

Sent from remote device, Please excuse typos

From: ch huang <justlo...@gmail.com>
Date: Mon, 22 Jul 2013 13:41:01 +0800
To: <user@hive.apache.org>
Reply-To: user@hive.apache.org
Subject: Re: how to let hive support lzo


# hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://CH22:9000/alex/my.txt lzo
13/07/22 13:27:58 WARN conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
13/07/22 13:27:59 INFO util.ChecksumType: Checksum using org.apache.hadoop.util.PureJavaCrc32
13/07/22 13:27:59 INFO util.ChecksumType: Checksum can use org.apache.hadoop.util.PureJavaCrc32C
13/07/22 13:27:59 ERROR metrics.SchemaMetrics: Inconsistent configuration. Previous configuration for using table name in metrics: true, new configuration: false
13/07/22 13:27:59 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
13/07/22 13:27:59 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev 6bb1b7f8b9044d8df9b4d2b6641db7658aab3cf8]
13/07/22 13:27:59 INFO compress.CodecPool: Got brand-new compressor [.lzo_deflate]
13/07/22 13:28:00 INFO compress.CodecPool: Got brand-new decompressor [.lzo_deflate]
SUCCESS

# hadoop jar /usr/lib/hadoop/lib/hadoop-lzo-0.4.15.jar com.hadoop.compression.lzo.LzoIndexer /alex
13/07/22 09:39:04 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
13/07/22 09:39:04 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev 6bb1b7f8b9044d8df9b4d2b6641db7658aab3cf8]
13/07/22 09:39:04 INFO lzo.LzoIndexer: LZO Indexing directory /alex...
13/07/22 09:39:04 INFO lzo.LzoIndexer:   LZO Indexing directory hdfs://CH22:9000/alex/alex_t...
13/07/22 09:39:04 INFO lzo.LzoIndexer:   [INDEX] LZO Indexing file hdfs://CH22:9000/alex/sqoop-1.99.2-bin-hadoop200.tar.gz.lzo, size 0.02 GB...
13/07/22 09:39:05 WARN conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
13/07/22 09:39:06 INFO lzo.LzoIndexer:   Completed LZO Indexing in 1.16 seconds (13.99 MB/s).  Index size is 0.52 KB.
13/07/22 09:39:06 INFO lzo.LzoIndexer:   [INDEX] LZO Indexing file hdfs://CH22:9000/alex/test1.lzo, size 0.00 GB...
13/07/22 09:39:06 INFO lzo.LzoIndexer:   Completed LZO Indexing in 0.08 seconds (0.00 MB/s).  Index size is 0.01 KB.


On Mon, Jul 22, 2013 at 1:37 PM, ch huang <justlo...@gmail.com> wrote:
Hi all,
I have already installed and tested LZO in Hadoop and HBase, and both work,
but when I try it in Hive it fails. How can I get Hive to recognize LZO?


hive> set mapred.map.output.compression.codec;
mapred.map.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec
hive> set mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzoCodec
hive> select count(*) from test;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_1374463239553_0003, Tracking URL = http://CH22:8088/proxy/application_1374463239553_0003/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1374463239553_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2013-07-22 13:33:27,243 Stage-1 map = 0%,  reduce = 0%
2013-07-22 13:33:45,403 Stage-1 map = 100%,  reduce = 0%
Ended Job = job_1374463239553_0003 with errors
Error during job, obtaining debugging information...
Job Tracking URL: http://CH22:8088/proxy/application_1374463239553_0003/
Examining task ID: task_1374463239553_0003_m_00 (and more) from job job_137446

Re: how to let hive support lzo

2013-07-22 Thread bejoy_ks

Hi,

Along with the mapred.compress* properties try to set
hive.exec.compress.output to true.

Regards 
Bejoy KS

Sent from remote device, Please excuse typos


Re: how to let hive support lzo

2013-07-21 Thread ch huang
# hbase org.apache.hadoop.hbase.util.CompressionTest hdfs://CH22:9000/alex/my.txt lzo
13/07/22 13:27:58 WARN conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
13/07/22 13:27:59 INFO util.ChecksumType: Checksum using org.apache.hadoop.util.PureJavaCrc32
13/07/22 13:27:59 INFO util.ChecksumType: Checksum can use org.apache.hadoop.util.PureJavaCrc32C
13/07/22 13:27:59 ERROR metrics.SchemaMetrics: Inconsistent configuration. Previous configuration for using table name in metrics: true, new configuration: false
13/07/22 13:27:59 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
13/07/22 13:27:59 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev 6bb1b7f8b9044d8df9b4d2b6641db7658aab3cf8]
13/07/22 13:27:59 INFO compress.CodecPool: Got brand-new compressor [.lzo_deflate]
13/07/22 13:28:00 INFO compress.CodecPool: Got brand-new decompressor [.lzo_deflate]
SUCCESS

# hadoop jar /usr/lib/hadoop/lib/hadoop-lzo-0.4.15.jar com.hadoop.compression.lzo.LzoIndexer /alex
13/07/22 09:39:04 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
13/07/22 09:39:04 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev 6bb1b7f8b9044d8df9b4d2b6641db7658aab3cf8]
13/07/22 09:39:04 INFO lzo.LzoIndexer: LZO Indexing directory /alex...
13/07/22 09:39:04 INFO lzo.LzoIndexer:   LZO Indexing directory hdfs://CH22:9000/alex/alex_t...
13/07/22 09:39:04 INFO lzo.LzoIndexer:   [INDEX] LZO Indexing file hdfs://CH22:9000/alex/sqoop-1.99.2-bin-hadoop200.tar.gz.lzo, size 0.02 GB...
13/07/22 09:39:05 WARN conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
13/07/22 09:39:06 INFO lzo.LzoIndexer:   Completed LZO Indexing in 1.16 seconds (13.99 MB/s).  Index size is 0.52 KB.
13/07/22 09:39:06 INFO lzo.LzoIndexer:   [INDEX] LZO Indexing file hdfs://CH22:9000/alex/test1.lzo, size 0.00 GB...
13/07/22 09:39:06 INFO lzo.LzoIndexer:   Completed LZO Indexing in 0.08 seconds (0.00 MB/s).  Index size is 0.01 KB.



how to let hive support lzo

2013-07-21 Thread ch huang
Hi all,
I have already installed and tested LZO in Hadoop and HBase, and both work,
but when I try it in Hive it fails. How can I get Hive to recognize LZO?


hive> set mapred.map.output.compression.codec;
mapred.map.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec
hive> set mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzoCodec
hive> select count(*) from test;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_1374463239553_0003, Tracking URL = http://CH22:8088/proxy/application_1374463239553_0003/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1374463239553_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2013-07-22 13:33:27,243 Stage-1 map = 0%,  reduce = 0%
2013-07-22 13:33:45,403 Stage-1 map = 100%,  reduce = 0%
Ended Job = job_1374463239553_0003 with errors
Error during job, obtaining debugging information...
Job Tracking URL: http://CH22:8088/proxy/application_1374463239553_0003/
Examining task ID: task_1374463239553_0003_m_00 (and more) from job job_1374463239553_0003
Task with the most failures(4):
-
Task ID:
  task_1374463239553_0003_m_00
URL:
  http://CH22:8088/taskdetails.jsp?jobid=job_1374463239553_0003&tipid=task_1374463239553_0003_m_00
-
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: native-lzo library not available
    at com.hadoop.compression.lzo.LzoCodec.getCompressorType(LzoCodec.java:155)
    at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:104)
    at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:118)
    at org.apache.hadoop.mapred.IFile$Writer.<init>(IFile.java:115)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1580)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1457)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)
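
The trace above is thrown inside the map task (YarnChild, MapTask, IFile$Writer), that is, in a task JVM on a worker node, whereas the CompressionTest and LzoIndexer runs executed on the machine where the commands were typed. One possible explanation, not confirmed anywhere in this thread, is that the hadoop-lzo native library (libgplcompression) is not visible to the task JVMs on every node. The sketch below is a small check that could be run on each node. The function name has_native_lzo is made up for illustration, and the directory to inspect is an assumption that depends on how Hadoop was packaged.

```shell
#!/bin/sh
# Sketch: report whether a directory holds the hadoop-lzo native library.
# has_native_lzo is a hypothetical helper; the directory to check is an
# assumption and depends on how Hadoop was installed on each node.
has_native_lzo() {
  dir=$1
  for f in "$dir"/libgplcompression.so*; do
    # The glob stays unexpanded when nothing matches, so test for existence.
    if [ -e "$f" ]; then
      echo "found: $f"
      return 0
    fi
  done
  echo "missing: no libgplcompression.so* in $dir"
  return 1
}
```

Usage would look like: has_native_lzo /usr/lib/hadoop/lib/native (the path here is only an example). If the library is present on the client but missing on a worker node, that would be consistent with the error shown in the task diagnostics.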