num_rows is always 0 in statistics

2012-08-29 Thread Hiroyuki Yamada
Hi,

I have run the ANALYZE TABLE command several times to gather statistics,
but I always get num_rows=0, as shown below.
(raw_data_size is also 0.)

-
hive> analyze table lineitem compute statistics;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201208291425_0011, Tracking URL = http://hadoop-node1:50030/jobdetails.jsp?jobid=job_201208291425_0011
Kill Command = /usr/lib/hadoop-0.20/bin/hadoop job -Dmapred.job.tracker=hadoop-node1:8021 -kill job_201208291425_0011
Hadoop job information for Stage-0: number of mappers: 3; number of reducers: 0
2012-08-29 15:16:16,133 Stage-0 map = 0%,  reduce = 0%
2012-08-29 15:16:20,154 Stage-0 map = 100%,  reduce = 0%
2012-08-29 15:16:22,168 Stage-0 map = 100%,  reduce = 100%
Ended Job = job_201208291425_0011
Table sf1.lineitem stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 759863287, raw_data_size: 0]
-

I tried versions 0.7.1, 0.8.1, and 0.9.0 and got the same result with each.
Is there anything else I have to do to make it work?

Also, do statistics only work for managed tables?
I tried it on external tables and it doesn't seem to work (all the
values are 0).

Thanks,

Hiroyuki


Re: num_rows is always 0 in statistics

2012-08-29 Thread Hiroyuki Yamada
Hi,

Thank you for the reply.
I tried the following setting, but I got the same result (num_rows=0).

hive.stats.dbconnectionstring=jdbc:derby:;databaseName=/tmp/TempStatsStore;create=true

Is there anything else I should check?
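
A quick way to confirm which statistics settings the session is actually using is to print them from the Hive CLI before re-running the analysis. This is only a sketch: the property names below are the statistics settings documented for Hive releases of this period, and whether each one is set in a given installation is an assumption.

```sql
-- "SET <name>;" with no value prints the current value of that property.
SET hive.stats.autogather;
SET hive.stats.dbclass;
SET hive.stats.jdbcdriver;
SET hive.stats.dbconnectionstring;

-- Re-run the analysis once the settings look as intended.
ANALYZE TABLE lineitem COMPUTE STATISTICS;
```

Note that the connection string contains semicolons, which the CLI also treats as statement terminators, so placing the value in hive-site.xml may be less error-prone than typing it into a SET command. That is a general caution, not a confirmed cause of the problem in this thread.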

On Wed, Aug 29, 2012 at 4:09 PM, rohithsharma rohithsharm...@huawei.com wrote:
 I resolved the issue in the following way.

 Configure
 hive.stats.dbconnectionstring=jdbc:derby:;databaseName=/home/TempStore
 This works only on a single-node cluster.

 Please check HIVE-3324.


 -Original Message-
 From: Hiroyuki Yamada [mailto:mogwa...@gmail.com]
 Sent: Wednesday, August 29, 2012 11:57 AM
 To: user@hive.apache.org
 Subject: num_rows is always 0 in statistics


Re: num_rows is always 0 in statistics

2012-08-29 Thread Hiroyuki Yamada
Hi,

Sorry, it works now. Thank you.
However, the value is not correct: it is about half of the real number of rows.
Is this a sampled value?
As far as I can tell from TableScanOperator.java, it seems to count every row.

Thanks,

Hiroyuki
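
One way to measure the discrepancy described above is to compare an exact count with the stored statistic. A minimal sketch, assuming the same lineitem table; the exact layout of the DESCRIBE EXTENDED output varies between Hive versions.

```sql
-- Exact row count, computed by a full scan of the table.
SELECT COUNT(*) FROM lineitem;

-- Stored table-level statistics; the row count appears among the
-- table parameters in the detailed table information.
DESCRIBE EXTENDED lineitem;
```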

On Wed, Aug 29, 2012 at 5:39 PM, Hiroyuki Yamada mogwa...@gmail.com wrote:
 Hi,

 Thank you for the reply.
 I tried with the following setting, but I got the same result. (with num_rows=0)

 hive.stats.dbconnectionstring=jdbc:derby:;databaseName=/tmp/TempStatsStore;create=true

 Is there any clue ?


How to see the intermediate results between AST and optimized logical query plan.

2011-10-19 Thread Hiroyuki Yamada
Hello,

I have been trying to learn the Hive query compiler, and I am wondering
whether there is a way to see the result of semantic analysis (the query
block tree) and the non-optimized logical query plan.
I know we can get the AST and the optimized logical query plan with EXPLAIN,
but I would like to see the intermediate results between them.

Also, is there any detailed documentation on the Hive query compiler?

I would greatly appreciate it if anyone could answer my questions.

Thanks,
Hiroyuki
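
For reference, the two endpoints mentioned above (the AST and the final plan) can be printed as follows in Hive releases of this period. This is a sketch only: the lineitem table and the l_returnflag and l_quantity columns are assumed for illustration, and the output does not include the query block tree or the non-optimized plan asked about.

```sql
-- EXPLAIN EXTENDED prints the abstract syntax tree, the stage
-- dependencies, and the stage plans with additional detail.
EXPLAIN EXTENDED
SELECT l_returnflag, SUM(l_quantity)
FROM lineitem
GROUP BY l_returnflag;
```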