Re: Hive UDF gives duplicate result regardless of parameters, when nested in a subquery

2014-07-23 Thread Navis류승우
Looks like it's caused by HIVE-7314. Could you try that with "hive.cache.expr.evaluation=false"? Thanks, Navis 2014-07-24 14:34 GMT+09:00 丁桂涛(桂花) : > Yes. The output is correct: ["tp","p","sp"]. > > I developed the UDF using JAVA in eclipse and exported the jar file into > the auxlib directory

Column Statistics with Parquet

2014-07-23 Thread Suma Shivaprasad
I am trying to enable column statistics usage with Parquet tables. This is the query I am executing. However, in the explain output, even though "Basic stats: COMPLETE" is shown, "Column stats" is shown as NONE. Can someone please explain what else I need to debug or fix? set hive.compute.query.us

Re: Hive UDF gives duplicate result regardless of parameters, when nested in a subquery

2014-07-23 Thread 丁桂涛(桂花)
Yes. The output is correct: ["tp","p","sp"]. I developed the UDF in Java using Eclipse and exported the jar file into the auxlib directory of Hive. Then I added the following line to the ~/.hiverc file: create temporary function getad as 'xxx'; The Hive version is 0.12.0. Perhaps the problem r

Re: Hive UDF gives duplicate result regardless of parameters, when nested in a subquery

2014-07-23 Thread Jie Jin
Have you tried this query without the UDF, say: select array(tp, p, sp) as ps from ( select 'tp' as tp, 'p' as p, 'sp' as sp from table_name where id = ) t; And how did you implement the UDF? Thanks, 金杰 (Jie Jin) On Wed, Jul 23, 2014 at 1:34 PM, 丁桂涛(桂花) wrote: > R

Re: MoveTasks releasing locks that don't belong to it?

2014-07-23 Thread Edward Capriolo
For what it is worth, I run with locks off. I played with them in versions 0.8x -> 0.10 and found them problematic, particularly from hive-server. We ended up doing our own locking on the application side. I am very surprised that some vendors' "distributions" suggest that this is better/safer "on" when I have f

MoveTasks releasing locks that don't belong to it?

2014-07-23 Thread Darren Yin
In releaseLocks in MoveTask, it looks like the lockMgr.getLocks line actually grabs all locks associated with whatever lock objects (e.g., a partition) the MoveTask is concerned with. My theory for

Re: HCat and non-string partition types

2014-07-23 Thread Prem Yadav
Related? https://issues.apache.org/jira/browse/HCATALOG-23 On Wed, Jul 23, 2014 at 4:49 PM, Brian Jeltema < brian.jelt...@digitalenvoy.net> wrote: > I have some Hive tables that are partitioned by an int field. When I tried > to do a Sqoop import using Sqoop's HCatalog > support, it failed compla

HCat and non-string partition types

2014-07-23 Thread Brian Jeltema
I have some Hive tables that are partitioned by an int field. When I tried to do a Sqoop import using Sqoop's HCatalog support, it failed, complaining that HCatalog only supports string partitions. However, I’ve used HCatalog in MapReduce jobs with int partitions successfully. The docs that I’ve s

RE: Hive Statistics

2014-07-23 Thread Navdeep Agrawal
Stuck, need help. I created a small table with multiple partitions, desc (id int, term int) partitioned by id. Whenever I run analyze on any id, I get perfectly good answers. I am unable to figure out the difference each file is making. New table Table Parameters: transient_lastDdlT

Re: Hive does not show data generated with HCatOutputFormat

2014-07-23 Thread D K
I am wondering if the presence of the _SUCCESS file is causing the empty result. Can you try setting the property "mapreduce.fileoutputcommitter.marksuccessfuljobs" to false to disable the generation of the _SUCCESS file? It is a long shot, but it might not hurt to try while debugging this. On Mon, Jul 21, 2014 at 4:47 AM, Ti

RE: Hive Statistics

2014-07-23 Thread Navdeep Agrawal
No, I have not set these to the MySQL db. When I set them to the one I am using for Hive, I get an error that the stats publisher is not initialized. But if I have not set these parameters, why is a new row created in the part_col_stats table in the MySQL db every time? From: Andre Araujo [mailto:ara...@p

Re: Hive Statistics

2014-07-23 Thread Andre Araujo
Hi Navdeep, Please note that the configuration for the stats database is separate from the configuration for the metastore db. Can you confirm you have both set to use a MySQL db? The properties for the stats db are: hive.stats.dbclass= hive.stats.dbconnectionstring= On 23 July 2014 16:07, Navdee

RE: Hive Stats

2014-07-23 Thread Navdeep Agrawal
I dug into the partition params and checked COLUMN_STATS_ACCURATE in the metastore, but found it to be true. From: Bala Krishna Gangisetty [mailto:b...@altiscale.com] Sent: Wednesday, July 23, 2014 12:46 AM To: user@hive.apache.org Subject: Re: Hive Stats What is the value of "COLUMN_STATS_ACCUR

RE: Hive Stats

2014-07-23 Thread Navdeep Agrawal
Hi, I am unable to see COLUMN_STATS_ACCURATE in desc formatted/extended, but I think you are pointing to stats.reliable; I have set that to false. From: Bala Krishna Gangisetty [mailto:b...@altiscale.com] Sent: Wednesday, July 23, 2014 12:46 AM To: user@hive.apache.org Subject: Re: Hive Stats Wha