Re: UDAF terminatePartial structure

2013-07-30 Thread Robin Morris
There are limitations as to what can be passed between terminatePartial() and merge(). I'm not sure that you can pass Java arrays (i.e. your double[] c1;) through all the Hive reflection gubbins. Try using ArrayLists instead, but be warned, you need to make explicit deep copies of anything
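A minimal sketch of the advice above, in plain Java with no Hive dependencies. The class name PartialStateSketch, the State holder, and the element-wise sum are all hypothetical stand-ins for whatever the real UDAF computes; the only point illustrated is that the partial result and the merged state never share a list instance.

```java
import java.util.ArrayList;
import java.util.List;

public class PartialStateSketch {

    /** Hypothetical aggregation state, standing in for a UDAF's buffer. */
    static class State {
        ArrayList<Double> c1 = new ArrayList<>();
    }

    /** Hands out a copy so the caller never holds a reference to the internal list. */
    static ArrayList<Double> terminatePartial(State s) {
        // Double is immutable, so copying the list is enough for a deep copy here.
        return new ArrayList<>(s.c1);
    }

    /** Folds a partial result into the state by copying values, never by aliasing. */
    static void merge(State s, List<Double> partial) {
        if (partial == null) {
            return;
        }
        // Pad the state with zeros so both lists have at least the same length.
        while (s.c1.size() < partial.size()) {
            s.c1.add(0.0);
        }
        // Assumed combine step: element-wise sum.
        for (int i = 0; i < partial.size(); i++) {
            s.c1.set(i, s.c1.get(i) + partial.get(i));
        }
    }
}
```

If the list held mutable elements (for example nested lists), each element would need its own copy as well.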

Re: PL/SQL to HiveQL translation

2013-07-30 Thread Jérôme Verdier
Hi, Thanks for this link, it was very helpful :-) I have another question. I have some HiveQL scripts which are stored in .hql files. What is the best way to execute these scripts with a Java/JDBC program? Thanks. 2013/7/29 Brendan Heussler bheuss...@gmail.com Jerome, There is a really
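One possible approach to the question above, sketched under assumptions: read the .hql file, split it into statements, and run each through a plain java.sql.Statement. The class name HqlScriptRunner and both method names are hypothetical. The JDBC URL is supplied by the caller, and a Hive JDBC driver is assumed to be on the classpath. The splitter is deliberately naive: it drops blank lines and lines starting with "--", and it would break on a semicolon inside a string literal.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class HqlScriptRunner {

    /** Splits a script into statements on ';', skipping blank and "--" comment lines. */
    static List<String> splitStatements(String script) {
        StringBuilder joined = new StringBuilder();
        for (String line : script.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("--")) {
                continue;
            }
            joined.append(trimmed).append(' ');
        }
        List<String> statements = new ArrayList<>();
        for (String part : joined.toString().split(";")) {
            String sql = part.trim();
            if (!sql.isEmpty()) {
                statements.add(sql);
            }
        }
        return statements;
    }

    /** Executes every statement of the script file over one JDBC connection. */
    static void run(String jdbcUrl, String scriptPath) throws Exception {
        String script = new String(
                Files.readAllBytes(Paths.get(scriptPath)), StandardCharsets.UTF_8);
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement()) {
            for (String sql : splitStatements(script)) {
                stmt.execute(sql);
            }
        }
    }
}
```

Running the statements one at a time keeps each failure attributable to a single statement, at the cost of the naive parsing noted above.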

Re: Hive Metastore Server 0.9 Connection Reset and Connection Timeout errors

2013-07-30 Thread Nitin Pawar
The mentioned flow is called when you use the unsecure mode of the Thrift metastore client-server connection. So one way to avoid this is to use the secure mode. code public boolean process(final TProtocol in, final TProtocol out) throws TException { setIpAddress(in); ... ... ... @Override protected void

Hive Join with distinct rows

2013-07-30 Thread Sunita Arvind
Hi Praveen / All, I also have a requirement similar to the one explained (by Praveen) below: distinct rows on a single column with corresponding data from other columns.

RE: Hive Join with distinct rows

2013-07-30 Thread Marcin Mejran
I've used a rank udf for this previously, distribute and sort by the column then select all rows where rank=1. That should work with a join but I never tried it. It'd be an issue if the join outputs a lot of records that then are all dropped since that'd slow down the query. I've actually
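The rank logic described above can be sketched in plain Java as follows. This is an illustration only, not the poster's actual UDF: the class name RankSketch is hypothetical, and a real Hive version would additionally need to extend Hive's UDF base class. It assumes rows arrive already distributed and sorted by the key column, so equal keys are adjacent.

```java
import java.util.Objects;

public final class RankSketch {
    private String lastKey;   // key seen on the previous row
    private int counter;      // position of the current row within its key group

    /** Returns 1 for the first row of each run of equal keys, 2 for the next, and so on. */
    public int evaluate(String key) {
        if (counter > 0 && Objects.equals(key, lastKey)) {
            counter++;
        } else {
            counter = 1;
            lastKey = key;
        }
        return counter;
    }
}
```

Keeping only the rows where the returned value is 1 yields one row per distinct key, which is the rank=1 filter mentioned above.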

Prevent users from killing each other's jobs

2013-07-30 Thread Murat Odabasi
Hi there, I am trying to introduce some sort of security to prevent different people using the cluster from interfering with each other's jobs. Following the instructions at http://hadoop.apache.org/docs/stable/cluster_setup.html and

Re: Prevent users from killing each other's jobs

2013-07-30 Thread Vinod Kumar Vavilapalli
You need to set up Job ACLs. See http://hadoop.apache.org/docs/stable/mapred_tutorial.html#Job+Authorization. It is a per-job configuration, which you can provide with defaults. If the job owner wishes to give others access, he/she can do so. Thanks, +Vinod Kumar Vavilapalli Hortonworks Inc.

Re: Prevent users from killing each other's jobs

2013-07-30 Thread Edward Capriolo
Honestly tell your users to stop being jerks. People know if they kill my query there is going to be hell to pay :) On Tue, Jul 30, 2013 at 2:25 PM, Vinod Kumar Vavilapalli vino...@apache.org wrote: You need to set up Job ACLs. See

Re: Prevent users from killing each other's jobs

2013-07-30 Thread Murat Odabasi
I'm not sure how I should do that. The documentation says A job submitter can specify access control lists for viewing or modifying a job via the configuration properties mapreduce.job.acl-view-job and mapreduce.job.acl-modify-job respectively. By default, nobody is given access in these

Re: Prevent users from killing each other's jobs

2013-07-30 Thread Mikhail Antonov
In addition to using job ACLs you could have a more brutal scheme. Track all requests to kill the jobs, and if any request is coming from a user who shouldn't be trying to kill this particular job, then ssh from the script to his client machine and forcibly reboot it :) 2013/7/30 Edward

Re: Prevent users from killing each other's jobs

2013-07-30 Thread pandees waran
Hi Mikhail, Could you please explain how we can track all the kill requests for a job? Is there any feature available in hadoop stack for this? Or do we need to track this in OS layer by capturing the signals? Thanks, Pandeesh On Jul 31, 2013 12:03 AM, Mikhail Antonov olorinb...@gmail.com wrote:

Re: Prevent users from killing each other's jobs

2013-07-30 Thread Vinod Kumar Vavilapalli
That is correct. Seems like something else is happening. One thing to check is whether all your users, or more importantly their group, are added to the cluster-admin ACL (mapreduce.cluster.administrators). You should look at the mapreduce audit logs (which by default go into JobTracker logs, search for

Select statements return null

2013-07-30 Thread Sunita Arvind
Hi, I have written a script which generates JSON files, loads them into a dictionary, adds a few attributes and uploads the modified files to HDFS. After the files are generated, if I perform a select * from..; on the table which points to this location, I get null, null as the result. I also

Write access for the wiki

2013-07-30 Thread Mark Wagner
Hi all, Would someone with the right permissions grant me write access to the Hive wiki? I'd like to update some information on the Avro Serde. Thanks, Mark

Re: Write access for the wiki

2013-07-30 Thread Ashutosh Chauhan
Mark, Do you have an account on the Hive cwiki? What's your id? Thanks, Ashutosh On Tue, Jul 30, 2013 at 1:06 PM, Mark Wagner wagner.mar...@gmail.com wrote: Hi all, Would someone with the right permissions grant me write access to the Hive wiki? I'd like to update some information on the Avro

Re: Write access for the wiki

2013-07-30 Thread Mark Wagner
My id is mwagner. Thanks! On Tue, Jul 30, 2013 at 1:36 PM, Ashutosh Chauhan hashut...@apache.org wrote: Mark, Do you have an account on the Hive cwiki? What's your id? Thanks, Ashutosh On Tue, Jul 30, 2013 at 1:06 PM, Mark Wagner wagner.mar...@gmail.com wrote: Hi all, Would someone with

Re: Write access for the wiki

2013-07-30 Thread Ashutosh Chauhan
Is that your cwiki id? I am not seeing it there. Remember that the cwiki account is separate from the jira account. Ashutosh On Tue, Jul 30, 2013 at 1:40 PM, Mark Wagner wagner.mar...@gmail.com wrote: My id is mwagner. Thanks! On Tue, Jul 30, 2013 at 1:36 PM, Ashutosh Chauhan hashut...@apache.org wrote: Mark,

Re: Write access for the wiki

2013-07-30 Thread Mark Wagner
Yes, I created it right before emailing the list: https://cwiki.apache.org/confluence/display/~mwagner On Tue, Jul 30, 2013 at 1:45 PM, Ashutosh Chauhan hashut...@apache.org wrote: Is that your cwiki id? I am not seeing it there. Remember that the cwiki account is separate from the jira account. Ashutosh On

Re: Write access for the wiki

2013-07-30 Thread Ashutosh Chauhan
Done. Added you as contributor. Happy Documenting!! Ashutosh On Tue, Jul 30, 2013 at 2:15 PM, Mark Wagner wagner.mar...@gmail.com wrote: Yes, I created it right before emailing the list: https://cwiki.apache.org/confluence/display/~mwagner On Tue, Jul 30, 2013 at 1:45 PM, Ashutosh Chauhan

Review Request (wikidoc): LZO Compression in Hive

2013-07-30 Thread Sanjay Subramanian
Hi, I met with Lefty this afternoon and she was kind enough to spend time adding my documentation to the site, since I still don't have editing privileges :-) Please review the new wikidoc about LZO compression in the Hive language manual. If anything is unclear or needs more information, you can

UDFs with package names

2013-07-30 Thread Michael Malak
Thus far, I've been able to create Hive UDFs, but now I need to define them within a Java package name (as opposed to the default Java package as I had been doing), but once I do that, I'm no longer able to load them into Hive. First off, this works: add jar

Re: Hive Join with distinct rows

2013-07-30 Thread Sunita Arvind
Thanks for sharing your experience Marcin Sunita On Tue, Jul 30, 2013 at 11:54 AM, Marcin Mejran marcin.mej...@hooklogic.com wrote: I’ve used a rank udf for this previously, distribute and sort by the column then select all rows where rank=1. That should work with a join but I never tried

Re: UDFs with package names

2013-07-30 Thread Edward Capriolo
It might be a better idea to use your own package com.mystuff.x. You might be running into an issue where Java is not finding the file because it assumes the relation between package and jar is 1 to 1. You might also be compiling it wrong. If your package is com.mystuff, that class file should be in a
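A small sketch of the package-to-path rule discussed above: in standard Java, a class in a package is stored in the jar under a path that mirrors the package name. The class name JarLayoutCheck, the method names, and the example class com.mystuff.x.MyUdf are hypothetical; the check uses only java.util.jar.JarFile from the JDK.

```java
import java.io.IOException;
import java.util.jar.JarFile;

public class JarLayoutCheck {

    /** Maps a fully qualified class name to the entry path expected inside a jar. */
    static String entryPathFor(String fullyQualifiedClassName) {
        return fullyQualifiedClassName.replace('.', '/') + ".class";
    }

    /** Returns true if the jar holds the class at the path its package requires. */
    static boolean jarContainsClass(String jarPath, String fullyQualifiedClassName)
            throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryPathFor(fullyQualifiedClassName)) != null;
        }
    }
}
```

For instance, entryPathFor("com.mystuff.x.MyUdf") gives com/mystuff/x/MyUdf.class; if the jar stores the class file at its root instead, the class cannot be loaded under its packaged name.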

Multiple Insert with Where Clauses

2013-07-30 Thread Sha Liu
Hi Hive Gurus, When using the Hive extension of multiple inserts, can we add Where clauses for each Select statement, like the following? FROM ... INSERT OVERWRITE TABLE ... SELECT col1, col2, col3 WHERE col4='abc' INSERT OVERWRITE TABLE ... SELECT col1, col4, col2 WHERE col3='xyz'

Re: Multiple Insert with Where Clauses

2013-07-30 Thread Brad Ruderman
Have you simply tried INSERT OVERWRITE TABLE destination SELECT col1, col2, col3 FROM source WHERE col4 = 'abc' Thanks! On Tue, Jul 30, 2013 at 8:25 PM, Sha Liu lius...@hotmail.com wrote: Hi Hive Gurus, When using the Hive extension of multiple inserts, can we add Where clauses for each

RE: Multiple Insert with Where Clauses

2013-07-30 Thread Sha Liu
Yes, for the example you gave, it works. It even works when there is a single insert under the from clause, but when there are multiple inserts, the where clauses no longer seem effective. Date: Tue, 30 Jul 2013 20:29:19 -0700 Subject: Re: Multiple Insert with Where Clauses From:

Re: Multiple Insert with Where Clauses

2013-07-30 Thread Brad Ruderman
Hive doesn't support inserting a few records into a table. You will need to write a query to union your select and then insert. If you can partition, then you can insert a whole partition at a time instead of the table. Thanks, Brad On Tue, Jul 30, 2013 at 9:04 PM, Sha Liu lius...@hotmail.com

RE: Multiple Insert with Where Clauses

2013-07-30 Thread Sha Liu
Doesn't INSERT INTO do what you said? I'm not sure I understand "inserting a few records into a table". Anyway, the problem here seems different to me. In my case, the where clauses for multiple inserts seem to have no effect, yet Hive doesn't complain about that. -Sha Date: Tue, 30 Jul 2013