Re: Export to RDBMS directly

2013-07-17 Thread Siddharth Tiwari
Why don't you try implementing that and contributing it to the community?

Sent from my iPhone

On Jul 17, 2013, at 10:18 PM, "Omkar Joshi"  wrote:

> I read of the term ‘JDBC Storage Handler’ at 
> https://issues.apache.org/jira/browse/HIVE-1555
>  
> The issue seems open, but I just want to confirm that it has not been 
> implemented in the latest Hive releases.
>  
> Regards,
> Omkar Joshi
>  
> From: Bertrand Dechoux [mailto:decho...@gmail.com] 
> Sent: Thursday, July 18, 2013 10:43 AM
> To: user@hive.apache.org
> Subject: Re: Export to RDBMS directly
>  
> The short answer is no. At the moment you could write your own input 
> format/output format in order to do so. I don't know all the details for Hive, 
> but that's possible. However, you will likely run a DoS against your database 
> if you are not careful. Hive could embed Sqoop in order to do that smartly for 
> you, but that's not the case and I doubt it is a feature planned in the 
> short term. You have to manage the steps yourself (for which you could use a 
> scheduler like Oozie).
> 
> Regards
> Bertrand
>  
> 
> On Thu, Jul 18, 2013 at 7:04 AM, Omkar Joshi  
> wrote:
> Hi,
>  
> Currently, I'm executing the following steps (Hadoop 1.1.2, Hive 0.11 and 
> Sqoop-1.4.3.bin__hadoop-1.0.0):
> 
> 1. Import data from MySQL to Hive using Sqoop
> 
> 2. Execute a query in Hive and store its output in a Hive table
> 
> 3. Export the output to MySQL using Sqoop
> 
> I was wondering if it would be possible to combine steps 2 & 3 – the output 
> of the Hive query written directly to the MySQL database. I read about the 
> external tables but couldn't find an example where the LOCATION clause points 
> to something like 
> jdbc:mysql://localhost:3306//. Is it 
> really possible?
>  
> Regards,
> Omkar Joshi
>  
>  
> 
> 
> 
> -- 
> Bertrand Dechoux


RE: Export to RDBMS directly

2013-07-17 Thread Omkar Joshi
I read of the term 'JDBC Storage Handler' at 
https://issues.apache.org/jira/browse/HIVE-1555

The issue seems open, but I just want to confirm that it has not been 
implemented in the latest Hive releases.

Regards,
Omkar Joshi

From: Bertrand Dechoux [mailto:decho...@gmail.com]
Sent: Thursday, July 18, 2013 10:43 AM
To: user@hive.apache.org
Subject: Re: Export to RDBMS directly

The short answer is no. At the moment you could write your own input 
format/output format in order to do so. I don't know all the details for Hive, 
but that's possible. However, you will likely run a DoS against your database 
if you are not careful. Hive could embed Sqoop in order to do that smartly for you, 
but that's not the case and I doubt it is a feature planned in the short 
term. You have to manage the steps yourself (for which you could use a 
scheduler like Oozie).
Regards
Bertrand

On Thu, Jul 18, 2013 at 7:04 AM, Omkar Joshi <omkar.jo...@lntinfotech.com> wrote:
Hi,

Currently, I'm executing the following steps (Hadoop 1.1.2, Hive 0.11 and 
Sqoop-1.4.3.bin__hadoop-1.0.0):

1. Import data from MySQL to Hive using Sqoop

2. Execute a query in Hive and store its output in a Hive table

3. Export the output to MySQL using Sqoop

I was wondering if it would be possible to combine steps 2 & 3 - the output of 
the Hive query written directly to the MySQL database. I read about the 
external tables but couldn't find an example where the LOCATION clause points 
to something like 
jdbc:mysql://localhost:3306//. Is it really 
possible?

Regards,
Omkar Joshi






--
Bertrand Dechoux


Re: Export to RDBMS directly

2013-07-17 Thread Bertrand Dechoux
The short answer is no. At the moment you could write your own input
format/output format in order to do so. I don't know all the details for
Hive, but that's possible. However, you will likely run a DoS against your
database if you are not careful. Hive could embed Sqoop in order to do that
smartly for you, but that's not the case and I doubt it is a feature
planned in the short term. You have to manage the steps yourself (for which
you could use a scheduler like Oozie).

Regards
Bertrand
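
A minimal sketch of the hand-off described above, under the assumption of
illustrative names (query_output, source_table, mydb, target_table) and a local
MySQL instance, none of which come from this thread: Hive materializes the
query result (step 2), then Sqoop exports that directory (step 3).

# Hedged sketch, not a drop-in command; adjust names, credentials and paths.
hive -e "INSERT OVERWRITE DIRECTORY '/tmp/query_output'
         SELECT col1, col2 FROM source_table;"

sqoop export \
  --connect jdbc:mysql://localhost:3306/mydb \
  --username dbuser -P \
  --table target_table \
  --export-dir /tmp/query_output \
  --input-fields-terminated-by '\001'   # Hive's default field delimiter (Ctrl-A)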


On Thu, Jul 18, 2013 at 7:04 AM, Omkar Joshi wrote:

>  Hi,
>
> Currently, I'm executing the following steps (Hadoop 1.1.2, Hive 0.11 and
> Sqoop-1.4.3.bin__hadoop-1.0.0):
>
> 1. Import data from MySQL to Hive using Sqoop
>
> 2. Execute a query in Hive and store its output in a Hive table
>
> 3. Export the output to MySQL using Sqoop
>
> I was wondering if it would be possible to combine steps 2 & 3 – the
> output of the Hive query written directly to the MySQL database. I read
> about the external tables but couldn't find an example where the LOCATION
> clause points to something like
> jdbc:mysql://localhost:3306//. Is it
> really possible?
>
> Regards,
>
> Omkar Joshi
>
>



-- 
Bertrand Dechoux


Export to RDBMS directly

2013-07-17 Thread Omkar Joshi
Hi,

Currently, I'm executing the following steps (Hadoop 1.1.2, Hive 0.11 and 
Sqoop-1.4.3.bin__hadoop-1.0.0):

1. Import data from MySQL to Hive using Sqoop

2. Execute a query in Hive and store its output in a Hive table

3. Export the output to MySQL using Sqoop

I was wondering if it would be possible to combine steps 2 & 3 - the output of 
the Hive query written directly to the MySQL database. I read about the 
external tables but couldn't find an example where the LOCATION clause points 
to something like 
jdbc:mysql://localhost:3306//. Is it really 
possible?

Regards,
Omkar Joshi





Re: Hive does not package a custom inputformat in the MR job jar when the custom inputformat class is added as an aux jar.

2013-07-17 Thread Andrew Trask
Put them in hive's lib folder?

Sent from my Rotary Phone

On Jul 17, 2013, at 11:14 PM, Mitesh Peshave  wrote:

> Hello,
> 
> I am trying to use a custom inputformat for a hive table. 
> 
> When I add the jar containing the custom inputformat through a client such 
> as Beeline, by executing the "add jar" command, all seems to work fine. In this 
> scenario, Hive seems to pass the inputformat class to the JT and TTs. I believe 
> it correctly adds the jar to the distributed cache, and the MR jobs complete 
> without any errors.
> 
> But when I add the jar containing the custom input format under the Hive auxlib 
> dir or the Hive lib dir, Hive does not seem to pass the inputformat class to 
> the JT and TTs, causing the MR jobs to fail with ClassNotFoundException.
> 
> The use case I am looking at here is multiple users connecting to the 
> HiveServer using Hive clients and querying a table that uses a custom 
> inputformat. I would not want each user to have to run the "add 
> jar" command before they start querying the table.
> 
> Is there a way to add extra jars to the Hive server once and force the server 
> to push these jars to the JT for every MR job it generates?
> 
> Appreciate,
> Mitesh 


Hive does not package a custom inputformat in the MR job jar when the custom inputformat class is added as an aux jar.

2013-07-17 Thread Mitesh Peshave
Hello,

I am trying to use a custom inputformat for a hive table.

When I add the jar containing the custom inputformat through a client such
as Beeline, by executing the "add jar" command, all seems to work fine. In
this scenario, Hive seems to pass the inputformat class to the JT and TTs. I
believe it correctly adds the jar to the distributed cache, and the MR
jobs complete without any errors.

But when I add the jar containing the custom input format under the Hive
auxlib dir or the Hive lib dir, Hive does not seem to pass the inputformat
class to the JT and TTs, causing the MR jobs to fail with
ClassNotFoundException.

The use case I am looking at here is multiple users connecting to the
HiveServer using Hive clients and querying a table that uses a custom
inputformat. I would not want each user to have to run the "add
jar" command before they start querying the table.

Is there a way to add extra jars to the Hive server once and force the
server to push these jars to the JT for every MR job it generates?

Appreciate,
Mitesh
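
One configuration that is often suggested for this, sketched here as an
assumption to test rather than a confirmed fix (the jar path is illustrative):
register the jar with the Hive service itself through HIVE_AUX_JARS_PATH or
--auxpath before starting it, so individual clients do not need "add jar".
Whether the jar is then shipped to the JT/TTs automatically depends on the
Hive version.

# Hedged sketch; /opt/hive/auxlib/custom-inputformat.jar is an assumed path.
export HIVE_AUX_JARS_PATH=/opt/hive/auxlib/custom-inputformat.jar
hive --service hiveserver
# or equivalently:
# hive --auxpath /opt/hive/auxlib/custom-inputformat.jar --service hiveserver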


Re: can hive use where clause in jdbc?

2013-07-17 Thread Thejas Nair
It is unlikely to be specifically caused by the where clause.
Are you able to run this query using the Hive CLI?
Are you able to run any query that involves running an MR job through JDBC?

What do you see in the Hive logs?
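
A hedged pointer for that last question: with the stock hive-log4j settings the
server-side log is written under the Java temp dir per user
(${java.io.tmpdir}/${user.name}/hive.log); if hive.log.dir was overridden, the
path below is only an assumption.

tail -f /tmp/$USER/hive.log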

On Tue, Jul 16, 2013 at 1:10 AM, ch huang  wrote:
> here is my test output. Why does beeline report an error when I use a where
> condition?
>
> hive> select foo from demo_hive where bar='value3';
> Total MapReduce jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks is set to 0 since there's no reduce operator
> Starting Job = job_1373509276088_0015, Tracking URL =
> http://CH22:8088/proxy/application_1373509276088_0015/
> Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1373509276088_0015
> Hadoop job information for Stage-1: number of mappers: 1; number of
> reducers: 0
> 2013-07-16 16:00:05,538 Stage-1 map = 0%,  reduce = 0%
> 2013-07-16 16:00:11,958 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.6
> sec
> MapReduce Total cumulative CPU time: 5 seconds 600 msec
> Ended Job = job_1373509276088_0015
> MapReduce Jobs Launched:
> Job 0: Map: 1   Cumulative CPU: 5.6 sec   HDFS Read: 245 HDFS Write: 2
> SUCCESS
> Total MapReduce CPU Time Spent: 5 seconds 600 msec
> OK
> 3
> Time taken: 15.236 seconds
>
> 0: jdbc:hive2://192.168.10.22:1> select foo from demo_hive where
> bar='value3';
> Error: Error while processing statement: FAILED: Execution Error, return
> code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask (state=08S01,code=1)
> Error: Error while processing statement: FAILED: Execution Error, return
> code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask (state=08S01,code=1)


Problem with the windowing function ntile (Exceptions)

2013-07-17 Thread Lars Francke
Hi,

I'm running a query like this:

CREATE TABLE foo
  STORED AS ORC
AS
SELECT
  id,
  season,
  amount,
  ntile(10)
OVER (
  PARTITION BY season
  ORDER BY amount DESC
)
FROM bar;

On a small enough dataset that works fine but when switching to a
larger sample we're seeing exceptions like this:

"Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Reset on
PersistentByteBasedList not supported"

Looking at the code (without really understanding it) we tried setting:
SET 
hive.ptf.partition.persistence='org.apache.hadoop.hive.ql.exec.PTFPersistence$PartitionedByteBasedList';

because that List supports reset but we are seeing a
ClassNotFoundException so we're doing that wrong.

Our next try was setting hive.ptf.partition.persistence.memsize higher,
which worked, but first of all we don't really understand what all of
that stuff is doing, and second of all we fear that it might just break
down again.
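
For reference, a hedged sketch of how the two knobs above would be set from the
CLI; the memsize value is an arbitrary example, and the note about quoting is a
guess at the ClassNotFoundException, not a confirmed diagnosis.

-- SET takes the value literally, so the class name should not be wrapped in
-- quotes; the quotes in the attempt above may be what broke the class lookup.
SET hive.ptf.partition.persistence=org.apache.hadoop.hive.ql.exec.PTFPersistence$PartitionedByteBasedList;
-- Raise the in-memory partition buffer (value in bytes; 256 MB is illustrative).
SET hive.ptf.partition.persistence.memsize=268435456;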

Any hints as to what that error really means and how to deal with it
would be greatly appreciated.

Thanks!

Lars


Re: Question regarding external table and csv in NFS

2013-07-17 Thread Mainak Ghosh

Hey Saurabh,

I tried this command and it still gives the same error. Actually, the folder
name is supplier and supplier.tbl is the csv which resides inside it. I had
it correct in the query, but in the mail it is wrong. So the query that I
executed was:


create external table outside_supplier (S_SUPPKEY INT, S_NAME STRING,
S_ADDRESS STRING, S_NATIONKEY INT, S_PHONE STRING, S_ACCTBAL DOUBLE,
S_COMMENT STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS
TEXTFILE LOCATION 'file:///mnt/h/tpc-h-impala/data/supplier';


Thanks and Regards,
Mainak.



From:   Saurabh M 
To: user@hive.apache.org,
Date:   07/17/2013 02:06 PM
Subject:Re: Question regarding external table and csv in NFS



Hi Mainak,


Can you try using this:


 create external table outside_supplier (S_SUPPKEY INT, S_NAME STRING,
S_ADDRESS STRING, S_NATIONKEY INT, S_PHONE STRING, S_ACCTBAL DOUBLE,
S_COMMENT STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS
TEXTFILE LOCATION 'file:///mnt/h/tpc-h-impala/data/supplier.tbl';


I assume that "supplier.tbl" is a directory and the csv file is present in
the same.


Let me know if it worked!


Thanks,


Saurabh




On Thu, Jul 18, 2013 at 1:55 AM, Mainak Ghosh  wrote:
  Hello,

  I have just started using Hive and I was trying to create an external
  table with the csv file placed in NFS. I tried using file:// and
  local://. Both of these attempts failed with the error:

  create external table outside_supplier (S_SUPPKEY INT, S_NAME STRING,
  S_ADDRESS STRING, S_NATIONKEY INT, S_PHONE STRING, S_ACCTBAL DOUBLE,
  S_COMMENT STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS
  TEXTFILE LOCATION 'local://mnt/h/tpc-h-impala/data/supplier.tbl';

  FAILED: Error in metadata: MetaException(message:Got exception:
  java.io.IOException No FileSystem for scheme: local)

  and

  create external table outside_supplier (S_SUPPKEY INT, S_NAME STRING,
  S_ADDRESS STRING, S_NATIONKEY INT, S_PHONE STRING, S_ACCTBAL DOUBLE,
  S_COMMENT STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS
  TEXTFILE LOCATION 'file://mnt/h/tpc-h-impala/data/supplier.tbl';

  FAILED: Error in metadata: MetaException
  (message:file:/h/tpc-h-impala/data/supplier.tbl is not a directory or
  unable to create one)

  Am I missing some configuration? Any help would be really appreciated.

  Thanks and Regards,
  Mainak.





Re: Question regarding external table and csv in NFS

2013-07-17 Thread Saurabh M
Hi Mainak,

Can you try using this:

 create external table outside_supplier (S_SUPPKEY INT, S_NAME STRING,
S_ADDRESS STRING, S_NATIONKEY INT, S_PHONE STRING, S_ACCTBAL DOUBLE,
S_COMMENT STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS
TEXTFILE LOCATION 'file:///mnt/h/tpc-h-impala/data/supplier.tbl';

I assume that "supplier.tbl" is a directory and the csv file is present in
the same.

Let me know if it worked!

Thanks,

Saurabh


On Thu, Jul 18, 2013 at 1:55 AM, Mainak Ghosh  wrote:

> Hello,
>
> I have just started using Hive and I was trying to create an external
> table with the csv file placed in NFS. I tried using file:// and local://.
> Both of these attempts failed with the error:
>
> create external table outside_supplier (S_SUPPKEY INT, S_NAME STRING,
> S_ADDRESS STRING, S_NATIONKEY INT, S_PHONE STRING, S_ACCTBAL DOUBLE,
> S_COMMENT STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS
> TEXTFILE LOCATION 'local://mnt/h/tpc-h-impala/data/supplier.tbl';
>
> FAILED: Error in metadata: MetaException(message:Got exception:
> java.io.IOException No FileSystem for scheme: local)
>
> and
>
> create external table outside_supplier (S_SUPPKEY INT, S_NAME STRING,
> S_ADDRESS STRING, S_NATIONKEY INT, S_PHONE STRING, S_ACCTBAL DOUBLE,
> S_COMMENT STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS
> TEXTFILE LOCATION 'file://mnt/h/tpc-h-impala/data/supplier.tbl';
>
> FAILED: Error in metadata:
> MetaException(message:file:/h/tpc-h-impala/data/supplier.tbl is not a
> directory or unable to create one)
>
> Am I missing some configuration? Any help would be really appreciated.
>
> Thanks and Regards,
> Mainak.
>


Question regarding external table and csv in NFS

2013-07-17 Thread Mainak Ghosh


Hello,

I have just started using Hive and I was trying to create an external table
with the csv file placed in NFS. I tried using file:// and local://. Both
of these attempts failed with the error:

create external table outside_supplier (S_SUPPKEY INT, S_NAME STRING,
S_ADDRESS STRING, S_NATIONKEY INT, S_PHONE STRING, S_ACCTBAL DOUBLE,
S_COMMENT STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS
TEXTFILE LOCATION 'local://mnt/h/tpc-h-impala/data/supplier.tbl';

FAILED: Error in metadata: MetaException(message:Got exception:
java.io.IOException No FileSystem for scheme: local)

and

create external table outside_supplier (S_SUPPKEY INT, S_NAME STRING,
S_ADDRESS STRING, S_NATIONKEY INT, S_PHONE STRING, S_ACCTBAL DOUBLE,
S_COMMENT STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS
TEXTFILE LOCATION 'file://mnt/h/tpc-h-impala/data/supplier.tbl';

FAILED: Error in metadata: MetaException
(message:file:/h/tpc-h-impala/data/supplier.tbl is not a directory or
unable to create one)

Am I missing some configuration? Any help would be really appreciated.

Thanks and Regards,
Mainak.
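
A hedged side note on why the two spellings behave so differently (it explains
the mangled path in the second error, although per the follow-up earlier in the
thread it did not resolve the whole problem): in a URI the text between // and
the next / is treated as the authority, so with two slashes "mnt" is parsed as
a host name and only the remainder is kept as the path.

file://mnt/h/tpc-h-impala/data/supplier.tbl    # "mnt" becomes the authority; path is /h/tpc-h-impala/...
file:///mnt/h/tpc-h-impala/data/supplier.tbl   # empty authority; the full local path is preserved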

Re: Java Courses for Scripters/Big Data Geeks

2013-07-17 Thread xiufeng liu
You could also take a look the flowing resources for data science:
http://datascienc.es/resources/
http://blog.zipfianacademy.com/



Regards,
Xiufeng Liu


On Wed, Jul 17, 2013 at 10:09 PM, Yasmin Lucero wrote:

> Ha. I have the same problem. It is hard to find resources aimed at the
> right level. I have been pretty happy with the book Head First Java by
> Kathy Sierra and Bert someone er other.
>
> y
>
> 
> Yasmin Lucero
> Senior Statistician, Gravity.com
> Santa Monica, CA
> 831-332-4596
>
> www.yasminlucero.net
>
>
> On Wed, Jul 17, 2013 at 12:52 PM, John Omernik  wrote:
>
>> Hey all -
>>
>> I was wondering if there were any "shortcut" Java courses out there.  As
>> in, I am not looking for a holistic learn everything about Java course, but
>> more of a "So you are a big data/hive geek and you get Python/Perl pretty
>> well, but when you try to understand Java your head explodes and it feels
>> like you are missing something entry level and basic thus you need these
>> basic things and you'll be fine" course.
>>
>>
>> Any thoughts?
>>
>>
>>
>


Re: Java Courses for Scripters/Big Data Geeks

2013-07-17 Thread Yasmin Lucero
Ha. I have the same problem. It is hard to find resources aimed at the
right level. I have been pretty happy with the book Head First Java by
Kathy Sierra and Bert someone er other.

y


Yasmin Lucero
Senior Statistician, Gravity.com
Santa Monica, CA
831-332-4596

www.yasminlucero.net


On Wed, Jul 17, 2013 at 12:52 PM, John Omernik  wrote:

> Hey all -
>
> I was wondering if there were any "shortcut" Java courses out there.  As
> in, I am not looking for a holistic learn everything about Java course, but
> more of a "So you are a big data/hive geek and you get Python/Perl pretty
> well, but when you try to understand Java your head explodes and it feels
> like you are missing something entry level and basic thus you need these
> basic things and you'll be fine" course.
>
>
> Any thoughts?
>
>
>


Re: Java Courses for Scripters/Big Data Geeks

2013-07-17 Thread John Meagher
The Data Science course on Coursera has a pretty good overview of map
reduce, Hive, and Pig without going into the Java side of things.
https://www.coursera.org/course/datasci.  It's not in depth, but it is
enough to get started.

On Wed, Jul 17, 2013 at 3:52 PM, John Omernik  wrote:
> Hey all -
>
> I was wondering if there were any "shortcut" Java courses out there.  As in,
> I am not looking for a holistic learn everything about Java course, but more
> of a "So you are a big data/hive geek and you get Python/Perl pretty well,
> but when you try to understand Java your head explodes and it feels like you
> are missing something entry level and basic thus you need these basic things
> and you'll be fine" course.
>
>
> Any thoughts?
>
>


Java Courses for Scripters/Big Data Geeks

2013-07-17 Thread John Omernik
Hey all -

I was wondering if there were any "shortcut" Java courses out there.  As
in, I am not looking for a holistic learn everything about Java course, but
more of a "So you are a big data/hive geek and you get Python/Perl pretty
well, but when you try to understand Java your head explodes and it feels
like you are missing something entry level and basic thus you need these
basic things and you'll be fine" course.


Any thoughts?


Re: New to hive.

2013-07-17 Thread Mohammad Tariq
Great. Good luck with that.

Warm Regards,
Tariq
cloudfront.blogspot.com


On Thu, Jul 18, 2013 at 12:43 AM, Bharati Adkar <
bharati.ad...@mparallelo.com> wrote:

> Hi Tariq,
>
> No Problems,
> It was the hive.jar.path property that was not being set. Figured it out
> and fixed it.
> Got the plan.xml and jobconf.xml now will debug hadoop to get the rest of
> info.
>
> Thanks,
> Warm regards,
> Bharati
>
> On Jul 17, 2013, at 12:08 PM, Mohammad Tariq  wrote:
>
> Hello ma'm,
>
> Apologies first of all for responding so late. Stuck with some urgent
> deliverables. Was out of touch for a while.
>
> java.io.IOException: Cannot run program
> "/Users/bharati/hive-0.11.0/src/testutils/hadoop" (in directory
> "/Users/bharati/eclipse/tutorial/src"): error=13, Permission denied
>  at java.lang.ProcessBuilder.processException(ProcessBuilder.java:478)
>  at java.lang.ProcessBuilder.start(ProcessBuilder.java:457)
>  at java.lang.Runtime.exec(Runtime.java:593)
>  at java.lang.Runtime.exec(Runtime.java:431)
>  at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:269)
>  at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:144)
>  at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(
> TaskRunner.java:57)
>  at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
>  at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
>  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
>  at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(
> HiveServer.java:198)
>  at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(
> ThriftHive.java:644)
>  at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(
> ThriftHive.java:1)
>  at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>  at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>  at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(
> TThreadPoolServer.java:206)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(
> ThreadPoolExecutor.java:895)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(
> ThreadPoolExecutor.java:918)
>  at java.lang.Thread.run(Thread.java:680)
> Caused by: java.io.IOException: error=13, Permission denied
>  at java.lang.UNIXProcess.forkAndExec(Native Method)
>  at java.lang.UNIXProcess.(UNIXProcess.java:53)
>  at java.lang.ProcessImpl.start(ProcessImpl.java:91)
>  at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
>  ... 17 more
>
> Please make sure you have proper permissions set for this path.
>
> Warm Regards,
> Tariq
> cloudfront.blogspot.com
>
>
> On Wed, Jul 17, 2013 at 8:03 PM, Puneet Khatod 
> wrote:
>
>>  Hi,
>>
>> There are many online tutorials and blogs to provide quick get-set-go
>> sort of information. To start with you can learn Hadoop. For detailed
>> knowledge you will have to go through e-books as mentioned by Lefty.
>> These books are bulky but will provide every bit of Hadoop.
>>
>> I recently came across an Android app called 'Big data Xpert', which has
>> tips and tricks about big data technologies. I think that it can be a quick
>> and good reference for beginners as well as experienced developers.
>>
>> For reference:
>>
>> https://play.google.com/store/apps/details?id=com.mobiknights.xpert.bigdata
>>
>> Thanks,
>>
>> Puneet
>>
>> *From:* Lefty Leverenz [mailto:le...@hortonworks.com]
>> *Sent:* Thursday, June 20, 2013 11:05 AM
>> *To:* user@hive.apache.org
>> *Subject:* Re: New to hive.
>>
>> "Programming Hive" and "Hadoop: The Definitive Guide" are available at
>> the O'Reilly website (http://oreilly.com/) and on Amazon.
>>
>> But don't forget the Hive wiki:
>>
>> - Hive Home -- https://cwiki.apache.org/confluence/display/Hive/Home
>> - Getting Started -- https://cwiki.apache.org/confluence/display/Hive/GettingStarted
>> - Hive Tutorial -- https://cwiki.apache.org/confluence/display/Hive/Tutorial
>>
>> – Lefty
>>
>> On Wed, Jun 19, 2013 at 7:02 PM, Mohammad Tariq 
>> wrote:
>>
>> Hello ma'am,
>>
>> Hive queries are parsed using ANTLR and are converted into corresponding
>> MR jobs (actually a lot of things happen under the hood). I had answered
>> a similar question a few days ago on SO, you might find it helpful. But I
>> would suggest you go through the original paper, which explains all
>> these things in proper detail. I would also recommend you go through the
>> book "Programming Hive". It's really nice.
>>
>> HTH
>>
>> Warm Regards,
>>
>> Tariq
>>
>> cloudfront.blogspot.com
>>
>>
>> On Thu, Jun 20, 20

Re: New to hive.

2013-07-17 Thread Bharati Adkar
Hi Tariq,

No Problems,
It was the hive.jar.path property that was not being set. Figured it out and 
fixed it. 
Got the plan.xml and jobconf.xml; now I will debug Hadoop to get the rest of the info.

Thanks,
Warm regards,
Bharati
On Jul 17, 2013, at 12:08 PM, Mohammad Tariq  wrote:

> Hello ma'm,
> 
> Apologies first of all for responding so late. Stuck with some urgent 
> deliverables. Was out of touch for a while.
>  
> java.io.IOException: Cannot run program 
> "/Users/bharati/hive-0.11.0/src/testutils/hadoop" (in directory 
> "/Users/bharati/eclipse/tutorial/src"): error=13, Permission denied
>   at java.lang.ProcessBuilder.processException(ProcessBuilder.java:478)
>   at java.lang.ProcessBuilder.start(ProcessBuilder.java:457)
>   at java.lang.Runtime.exec(Runtime.java:593)
>   at java.lang.Runtime.exec(Runtime.java:431)
>   at 
> org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:269)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:144)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
>   at 
> org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:198)
>   at 
> org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:644)
>   at 
> org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:1)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>   at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>   at java.lang.Thread.run(Thread.java:680)
> Caused by: java.io.IOException: error=13, Permission denied
>   at java.lang.UNIXProcess.forkAndExec(Native Method)
>   at java.lang.UNIXProcess.(UNIXProcess.java:53)
>   at java.lang.ProcessImpl.start(ProcessImpl.java:91)
>   at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
>   ... 17 more
> 
> Please make sure you have proper permissions set for this path.
> 
> Warm Regards,
> Tariq
> cloudfront.blogspot.com
> 
> 
> On Wed, Jul 17, 2013 at 8:03 PM, Puneet Khatod  
> wrote:
> Hi,
>  
> There are many online tutorials and blogs to provide quick get-set-go sort of 
> information. To start with you can learn Hadoop. For detailed knowledge you 
> will have to go through e-books as mentioned by Lefty.
> These books are bulky but will provide every bit of hadoop.
> 
> I recently came across an Android app called 'Big data Xpert', which has tips 
> and tricks about big data technologies. I think that it can be a quick and good 
> reference for beginners as well as experienced developers.
> For reference:
> https://play.google.com/store/apps/details?id=com.mobiknights.xpert.bigdata
> 
>  
> 
> Thanks,
> 
> Puneet
> 
>  
> 
> From: Lefty Leverenz [mailto:le...@hortonworks.com] 
> Sent: Thursday, June 20, 2013 11:05 AM
> To: user@hive.apache.org
> Subject: Re: New to hive.
> 
>  
> 
> "Programming Hive" and "Hadoop: The Definitive Guide" are available at the 
> O'Reilly website (http://oreilly.com/) and on Amazon. 
> 
>  
> 
> But don't forget the Hive wiki:
> 
> Hive Home -- https://cwiki.apache.org/confluence/display/Hive/Home 
> Getting Started -- 
> https://cwiki.apache.org/confluence/display/Hive/GettingStarted
> Hive Tutorial -- https://cwiki.apache.org/confluence/display/Hive/Tutorial
> – Lefty
> 
>  
> 
>  
> 
> On Wed, Jun 19, 2013 at 7:02 PM, Mohammad Tariq  wrote:
> 
> Hello ma'am,
> 
>  
> 
>   Hive queries are parsed using ANTLR and are converted into 
> corresponding MR jobs (actually a lot of things happen under the hood). I had 
> answered a similar question a few days ago on SO, you might find it helpful. 
> But I would suggest you go through the original paper which explains all 
> these things in proper detail. I would also recommend you go through the 
> book "Programming Hive". It's really nice.
> 
>  
> 
> HTH
> 
> 
> 
> Warm Regards,
> 
> Tariq
> 
> cloudfront.blogspot.com
> 
>  
> 
> On Thu, Jun 20, 2013 at 4:24 AM, Bharati  wrote:
> 
> Hi Folks,
> 
> I am new to hive and need information, tutorials etc that you can point to. I 
> have installed hive to work with MySQL.
> 
>  I can run queries. Now I would like to understand how the map and reduce 
> classes are created and how I can look at the data for the map job and map 
> class the hive query generates.  Also is there a way to create custom map 
> classes.
> I would appreciate if anyone can help me get started.
> 
> Thanks,

Re: New to hive.

2013-07-17 Thread Mohammad Tariq
Hello ma'am,

Apologies first of all for responding so late. Stuck with some urgent
deliverables. Was out of touch for a while.

java.io.IOException: Cannot run program
"/Users/bharati/hive-0.11.0/src/testutils/hadoop" (in directory
"/Users/bharati/eclipse/tutorial/src"): error=13, Permission denied
 at java.lang.ProcessBuilder.processException(ProcessBuilder.java:478)
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:457)
 at java.lang.Runtime.exec(Runtime.java:593)
 at java.lang.Runtime.exec(Runtime.java:431)
 at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:269)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:144)
 at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(
TaskRunner.java:57)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1355)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1139)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:945)
 at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(
HiveServer.java:198)
 at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(
ThriftHive.java:644)
 at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(
ThriftHive.java:1)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
 at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(
TThreadPoolServer.java:206)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(
ThreadPoolExecutor.java:895)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(
ThreadPoolExecutor.java:918)
 at java.lang.Thread.run(Thread.java:680)
Caused by: java.io.IOException: error=13, Permission denied
 at java.lang.UNIXProcess.forkAndExec(Native Method)
 at java.lang.UNIXProcess.(UNIXProcess.java:53)
 at java.lang.ProcessImpl.start(ProcessImpl.java:91)
 at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
 ... 17 more

Please make sure you have proper permissions set for this path.
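
A hedged one-liner for that check; the path is taken from the stack trace, and
whether a missing execute bit is the only problem is an assumption.

chmod +x /Users/bharati/hive-0.11.0/src/testutils/hadoop
ls -l /Users/bharati/hive-0.11.0/src/testutils/hadoop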

Warm Regards,
Tariq
cloudfront.blogspot.com


On Wed, Jul 17, 2013 at 8:03 PM, Puneet Khatod wrote:

>  Hi,
>
> There are many online tutorials and blogs to provide quick get-set-go sort
> of information. To start with you can learn Hadoop. For detailed knowledge
> you will have to go through e-books as mentioned by Lefty.
> These books are bulky but will provide every bit of Hadoop.
>
> I recently came across an Android app called 'Big data Xpert', which has
> tips and tricks about big data technologies. I think that it can be a quick
> and good reference for beginners as well as experienced developers.
>
> For reference:
>
> https://play.google.com/store/apps/details?id=com.mobiknights.xpert.bigdata
>
> Thanks,
>
> Puneet
>
> *From:* Lefty Leverenz [mailto:le...@hortonworks.com]
> *Sent:* Thursday, June 20, 2013 11:05 AM
> *To:* user@hive.apache.org
> *Subject:* Re: New to hive.
>
> "Programming Hive" and "Hadoop: The Definitive Guide" are available at the
> O'Reilly website (http://oreilly.com/) and on Amazon.
>
> But don't forget the Hive wiki:
>
> - Hive Home -- https://cwiki.apache.org/confluence/display/Hive/Home
> - Getting Started -- https://cwiki.apache.org/confluence/display/Hive/GettingStarted
> - Hive Tutorial -- https://cwiki.apache.org/confluence/display/Hive/Tutorial
>
> – Lefty
>
>
> On Wed, Jun 19, 2013 at 7:02 PM, Mohammad Tariq 
> wrote:
>
>  Hello ma'am,
>
> Hive queries are parsed using ANTLR and
> are converted into corresponding MR jobs (actually a lot of things happen
> under the hood). I had answered a similar question a few
> days ago on SO, you might find it helpful. But I would suggest you
> go through the original paper, which explains all
> these things in proper detail. I would also recommend
> you go through the book "Programming Hive". It's really nice.
>
> HTH
>
> Warm Regards,
>
> Tariq
>
> cloudfront.blogspot.com
>
>
> On Thu, Jun 20, 2013 at 4:24 AM, Bharati 
> wrote:
>
>  Hi Folks,
>
> I am new to hive and need information, tutorials etc that you can point
> to. I have installed hive to work with MySQL.
>
>  I can run queries. Now I would like to understand how the map and reduce
> classes are created and how I can look at the data for the map job and map
> class the hive query generates.  Also is there a way to create custom map
> classes.
> I would appreciate if anyone can help me get started.
>
> Thanks,
> Bharati
>
> Sent from my iPad
>
> Fortigate Filtered
>

Re: which approach is better

2013-07-17 Thread Hamza Asad
I use the data to generate reports on a daily basis and do a couple of
analyses; it is insert-once, read-many on a daily basis. But my main purpose
is to secure my data and easily recover it even if my Hadoop (datanode) or
HDFS crashes. Up until now I have been using an approach in which data is
retrieved directly from HDFS, and a few days back my Hadoop crashed; when I
repaired it, I was unable to recover my old data which resided on HDFS. So
please let me know whether I have to make an architectural change, OR is
there any way to recover data which resides in a crashed HDFS?
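
On the recovery worry specifically, one safeguard that is independent of the
HDFS-versus-HBase choice is a scheduled copy of the warehouse directory to a
second cluster (or other storage) with DistCp; a hedged sketch, with host
names, ports and paths as pure assumptions.

hadoop distcp \
  hdfs://prod-namenode:8020/user/hive/warehouse \
  hdfs://backup-namenode:8020/backups/hive/warehouse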


On Wed, Jul 17, 2013 at 11:00 PM, Nitin Pawar wrote:

> what's the purpose of data storage?
> whats the read and write throughput you expect?
> whats the way you will access data while read?
> whats are your SLAs on both read and write?
>
> there will be more questions others will ask so be ready for that :)
>
>
>
> On Wed, Jul 17, 2013 at 11:10 PM, Hamza Asad wrote:
>
>> Please let me knw which approach is better. Either i save my data
>> directly to HDFS and run hive (shark) queries over it OR store my data in
>> HBASE, and then query it.. as i want to ensure efficient data retrieval and
>> data remains safe and can easily recover if hadoop crashes.
>>
>> --
>> *Muhammad Hamza Asad*
>>
>
>
>
> --
> Nitin Pawar
>



-- 
*Muhammad Hamza Asad*


Re: which approach is better

2013-07-17 Thread kulkarni.swar...@gmail.com
First of all, that might not be the right approach to choose the underlying
storage. You should choose HDFS or HBase depending on whether the data is
going to be used for batch processing or you need random access on top of
it. HBase is just another layer on top of HDFS. So obviously the queries
running on top of HBase are going to be less efficient. So if you can get
away with using HDFS, I would say that is the best and simplest approach.


On Wed, Jul 17, 2013 at 12:40 PM, Hamza Asad  wrote:

> Please let me knw which approach is better. Either i save my data directly
> to HDFS and run hive (shark) queries over it OR store my data in HBASE, and
> then query it.. as i want to ensure efficient data retrieval and data
> remains safe and can easily recover if hadoop crashes.
>
> --
> *Muhammad Hamza Asad*
>



-- 
Swarnim


Re: which approach is better

2013-07-17 Thread Nitin Pawar
what's the purpose of the data storage?
what's the read and write throughput you expect?
what's the way you will access data when reading?
what are your SLAs on both read and write?

there will be more questions others will ask so be ready for that :)



On Wed, Jul 17, 2013 at 11:10 PM, Hamza Asad  wrote:

> Please let me knw which approach is better. Either i save my data directly
> to HDFS and run hive (shark) queries over it OR store my data in HBASE, and
> then query it.. as i want to ensure efficient data retrieval and data
> remains safe and can easily recover if hadoop crashes.
>
> --
> *Muhammad Hamza Asad*
>



-- 
Nitin Pawar


which approach is better

2013-07-17 Thread Hamza Asad
Please let me know which approach is better: either I save my data directly
to HDFS and run Hive (Shark) queries over it, OR store my data in HBase and
then query it. I want to ensure efficient data retrieval, and that the data
remains safe and can easily be recovered if Hadoop crashes.

-- 
*Muhammad Hamza Asad*


RE: New to hive.

2013-07-17 Thread Puneet Khatod
Hi,

There are many online tutorials and blogs to provide quick get-set-go sort of 
information. To start with you can learn Hadoop. For detailed knowledge you 
will have to go through e-books as mentioned by Lefty.
These books are bulky but will provide every bit of hadoop.

I recently came across an Android app called 'Big data Xpert', which has tips 
and tricks about big data technologies. I think that it can be a quick and good 
reference for beginners as well as experienced developers.
For reference:
https://play.google.com/store/apps/details?id=com.mobiknights.xpert.bigdata

Thanks,
Puneet

From: Lefty Leverenz [mailto:le...@hortonworks.com]
Sent: Thursday, June 20, 2013 11:05 AM
To: user@hive.apache.org
Subject: Re: New to hive.

"Programming Hive" and "Hadoop: The Definitive Guide" are available at the 
O'Reilly website (http://oreilly.com/) and on Amazon.

But don't forget the Hive wiki:

  *   Hive Home -- https://cwiki.apache.org/confluence/display/Hive/Home
  *   Getting Started -- 
https://cwiki.apache.org/confluence/display/Hive/GettingStarted
  *   Hive Tutorial -- https://cwiki.apache.org/confluence/display/Hive/Tutorial
- Lefty


On Wed, Jun 19, 2013 at 7:02 PM, Mohammad Tariq <donta...@gmail.com> wrote:
Hello ma'am,

  Hive queries are parsed using ANTLR and are 
converted into corresponding MR jobs (actually a lot of things happen under the 
hood). I had answered a similar question a few days ago on SO, you might find 
it helpful. But I would suggest you go through the original paper, 
which explains all these things in proper detail. I would also recommend you 
go through the book "Programming Hive". It's really nice.

HTH

Warm Regards,
Tariq
cloudfront.blogspot.com

On Thu, Jun 20, 2013 at 4:24 AM, Bharati <bharati.ad...@mparallelo.com> wrote:
Hi Folks,

I am new to hive and need information, tutorials etc that you can point to. I 
have installed hive to work with MySQL.

 I can run queries. Now I would like to understand how the map and reduce 
classes are created and how I can look at the data for the map job and map 
class the hive query generates.  Also is there a way to create custom map 
classes.
I would appreciate if anyone can help me get started.

Thanks,
Bharati

Sent from my iPad
Fortigate Filtered





Re: Use RANK OVER PARTITION function in Hive 0.11

2013-07-17 Thread Richa Sharma
my bad ... in relational databases we generally do not give a column name
inside rank() ... but the one in (partition by  order by..) is
sufficient.

But looks like that's not the case in Hive


Jerome,

Please look at the examples in the link below. See if you are able to make it
work.

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics#LanguageManualWindowingAndAnalytics-PARTITIONBYwithpartitioning%2CORDERBY%2Candwindowspecification



Can't help you beyond this as I don't have Hive 0.11 :-(


Richa


On Wed, Jul 17, 2013 at 3:08 PM, Jérôme Verdier
wrote:

> Hi Richa,
>
> I have tried one query, with what i've understand of  Vijay's tips.
>
> SELECT code_entite, RANK(mag.me_vente_ht) OVER (PARTITION BY
> mag.co_societe ORDER BY  mag.me_vente_ht) AS rank FROM
> default.thm_renta_rgrp_produits_n_1 mag;
>
> This query is working, it gives me results.
>
> You say that maybe i'm hitting the same bug of JIRA HIVE-4663, but query
> is also failling when i put analytical columns in...
>
>
> 2013/7/17 Richa Sharma 
>
>> Vijay
>>
>> Jerome has already passed column -> mag.co_societe for rank.
>>
>> syntax -> RANK() OVER (PARTITION BY mag.co_societe ORDER BY
>> mag.me_vente_ht)
>> This will generate a rank for column mag.co_societe based on column value
>> me_vente_ht
>>
>> Jerome,
>>
>> Its possible you are also hitting the same bug as I mentioned in my email
>> before.
>>
>>
>> Richa
>>
>>
>> On Wed, Jul 17, 2013 at 2:31 PM, Vijay  wrote:
>>
>>> As the error message states: "One ore more arguments are expected," you
>>> have to pass a column to the rank function.
>>>
>>>
>>> On Wed, Jul 17, 2013 at 1:12 AM, Jérôme Verdier <
>>> verdier.jerom...@gmail.com> wrote:
>>>
 Hi Richa,

 I have tried a simple query without joins, etc

 SELECT RANK() OVER (PARTITION BY mag.co_societe ORDER BY
 mag.me_vente_ht),mag.co_societe, mag.me_vente_ht FROM
 default.thm_renta_rgrp_produits_n_1 mag;

 Unfortunately, the error is the same like previously.

 Error: Query returned non-zero code: 4, cause: FAILED:
 SemanticException Failed to breakup Windowing invocations into Groups. At
 least 1 group must only depend on input columns. Also check for circular
 dependencies.

 Underlying error:
 org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: One or more
 arguments are expected.
 SQLState:  42000
 ErrorCode: 4




 2013/7/17 Richa Sharma 

> Jerome
>
> I would recommend that you try Rank function with columns from just
> one table first.
> Once it is established that rank is working fine then add all the
> joins.
>
> I am still on Hive 0.10 so cannot test it myself.
> However, I can find a similar issue on following link - so its
> possible you are facing issues due to this reported bug.
>
> https://issues.apache.org/jira/browse/HIVE-4663
>
>
> Richa
>
>
> On Tue, Jul 16, 2013 at 6:41 PM, Jérôme Verdier <
> verdier.jerom...@gmail.com> wrote:
>
>> You can see my query below :
>>
>> SELECT
>> mag.co_magasin,
>> dem.id_produit  as
>> id_produit_orig,
>> pnvente.dt_debut_commercial as
>> dt_debut_commercial,
>> COALESCE(pnvente.id_produit,dem.id_produit) as
>> id_produit,
>> RANK() OVER (PARTITION BY mag.co_magasin, dem.id_produit
>> ORDER BY pnvente.dt_debut_commercial DESC,
>> COALESCE(pnvente.id_produit,dem.id_produit) DESC) as rang
>>
>> FROM default.demarque_mag_jour dem
>>
>>   LEFT OUTER JOIN default.produit_norm pn
>>   ON  pn.co_societe = dem.co_societe
>>   AND pn.id_produit = dem.id_produit
>>   LEFT OUTER JOIN default.produit_norm pnvente
>>   ON  pnvente.co_societe = pn.co_societe
>>   AND pnvente.co_produit_rfu = pn.co_produit_lip
>>   INNER JOIN default.kpi_magasin mag
>>   ON mag.id_magasin = dem.id_magasin
>>
>>
>> GROUP BY
>> mag.co_magasin,
>> dem.id_produit,
>> pnvente.dt_debut_commercial,
>> COALESCE(pnvente.id_produit,dem.id_produit);
>>
>>
>> 2013/7/16 Richa Sharma 
>>
>>> Can you share query with just RANK().
>>>
>>> Richa
>>>
>>>
>>> On Tue, Jul 16, 2013 at 6:08 PM, Jérôme Verdier <
>>> verdier.jerom...@gmail.com> wrote:
>>>
 Hi Richa,

 I tried to execute the rank function alone, but the result is the
 same

 Thanks


 2013/7/16 Richa Sharma 

> Hi Jerome
>
>
> I think the problem is you are trying to use MIN, SUM and RANK
> function in a single query.
>
> T

Re: Use RANK OVER PARTITION function in Hive 0.11

2013-07-17 Thread Jérôme Verdier
Hi Richa,

I have tried one query, with what I've understood of Vijay's tips.

SELECT code_entite, RANK(mag.me_vente_ht) OVER (PARTITION BY mag.co_societe
ORDER BY  mag.me_vente_ht) AS rank FROM default.thm_renta_rgrp_produits_n_1
mag;

This query is working, it gives me results.

You say that maybe I'm hitting the same bug as in JIRA HIVE-4663, but the query
is also failing when I put analytical columns in...
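
A hedged sketch of Richa's earlier suggestion (compute the rank in an inner
query, then filter or aggregate on top of it), using the argument form of RANK
that worked in the query above; the table and column names are the ones from
this thread, and picking the top-ranked row per co_societe is only an
illustrative assumption.

SELECT co_societe, me_vente_ht
FROM (
  SELECT mag.co_societe,
         mag.me_vente_ht,
         RANK(mag.me_vente_ht) OVER (PARTITION BY mag.co_societe
                                     ORDER BY mag.me_vente_ht DESC) AS rang
  FROM default.thm_renta_rgrp_produits_n_1 mag
) ranked
WHERE rang = 1;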


2013/7/17 Richa Sharma 

> Vijay
>
> Jerome has already passed column -> mag.co_societe for rank.
>
> syntax -> RANK() OVER (PARTITION BY mag.co_societe ORDER BY
> mag.me_vente_ht)
> This will generate a rank for column mag.co_societe based on column value
> me_vente_ht
>
> Jerome,
>
> Its possible you are also hitting the same bug as I mentioned in my email
> before.
>
>
> Richa
>
>
> On Wed, Jul 17, 2013 at 2:31 PM, Vijay  wrote:
>
>> As the error message states: "One ore more arguments are expected," you
>> have to pass a column to the rank function.
>>
>>
>> On Wed, Jul 17, 2013 at 1:12 AM, Jérôme Verdier <
>> verdier.jerom...@gmail.com> wrote:
>>
>>> Hi Richa,
>>>
>>> I have tried a simple query without joins, etc
>>>
>>> SELECT RANK() OVER (PARTITION BY mag.co_societe ORDER BY
>>> mag.me_vente_ht),mag.co_societe, mag.me_vente_ht FROM
>>> default.thm_renta_rgrp_produits_n_1 mag;
>>>
>>> Unfortunately, the error is the same like previously.
>>>
>>> Error: Query returned non-zero code: 4, cause: FAILED:
>>> SemanticException Failed to breakup Windowing invocations into Groups. At
>>> least 1 group must only depend on input columns. Also check for circular
>>> dependencies.
>>>
>>> Underlying error:
>>> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: One or more
>>> arguments are expected.
>>> SQLState:  42000
>>> ErrorCode: 4
>>>
>>>
>>>
>>>
>>> 2013/7/17 Richa Sharma 
>>>
 Jerome

 I would recommend that you try Rank function with columns from just one
 table first.
 Once it is established that rank is working fine then add all the joins.

 I am still on Hive 0.10 so cannot test it myself.
 However, I can find a similar issue on following link - so its possible
 you are facing issues due to this reported bug.

 https://issues.apache.org/jira/browse/HIVE-4663


 Richa


 On Tue, Jul 16, 2013 at 6:41 PM, Jérôme Verdier <
 verdier.jerom...@gmail.com> wrote:

> You can see my query below :
>
> SELECT
> mag.co_magasin,
> dem.id_produit  as
> id_produit_orig,
> pnvente.dt_debut_commercial as
> dt_debut_commercial,
> COALESCE(pnvente.id_produit,dem.id_produit) as
> id_produit,
> RANK() OVER (PARTITION BY mag.co_magasin, dem.id_produit
> ORDER BY pnvente.dt_debut_commercial DESC,
> COALESCE(pnvente.id_produit,dem.id_produit) DESC) as rang
>
> FROM default.demarque_mag_jour dem
>
>   LEFT OUTER JOIN default.produit_norm pn
>   ON  pn.co_societe = dem.co_societe
>   AND pn.id_produit = dem.id_produit
>   LEFT OUTER JOIN default.produit_norm pnvente
>   ON  pnvente.co_societe = pn.co_societe
>   AND pnvente.co_produit_rfu = pn.co_produit_lip
>   INNER JOIN default.kpi_magasin mag
>   ON mag.id_magasin = dem.id_magasin
>
>
> GROUP BY
> mag.co_magasin,
> dem.id_produit,
> pnvente.dt_debut_commercial,
> COALESCE(pnvente.id_produit,dem.id_produit);
>
>
> 2013/7/16 Richa Sharma 
>
>> Can you share query with just RANK().
>>
>> Richa
>>
>>
>> On Tue, Jul 16, 2013 at 6:08 PM, Jérôme Verdier <
>> verdier.jerom...@gmail.com> wrote:
>>
>>> Hi Richa,
>>>
>>> I tried to execute the rank function alone, but the result is the
>>> same
>>>
>>> Thanks
>>>
>>>
>>> 2013/7/16 Richa Sharma 
>>>
 Hi Jerome


 I think the problem is you are trying to use MIN, SUM and RANK
 function in a single query.

 Try to get the rank first in a query and on top of it apply these
 aggregate functions

 Richa




 On Tue, Jul 16, 2013 at 2:15 PM, Jérôme Verdier <
 verdier.jerom...@gmail.com> wrote:

> Hi,
>
> I have a problem while using RANK OVER PARTITION function with
> Hive.
>
> Hive is in version 0.11 and, as we can see here :
> https://cwiki.apache.org/Hive/languagemanual-windowingandanalytics.html,
> we can now use these functions in Hive.
>
> But, when i use it, i encountered this error :
>
> FAILED: SemanticException Failed to breakup Windowing invocations
> into Groups. At 

Re: Use RANK OVER PARTITION function in Hive 0.11

2013-07-17 Thread Richa Sharma
Vijay

Jerome has already passed column -> mag.co_societe for rank.

syntax -> RANK() OVER (PARTITION BY mag.co_societe ORDER BY
mag.me_vente_ht)
This will generate a rank for column mag.co_societe based on column value
me_vente_ht

Jerome,

It's possible you are also hitting the same bug as I mentioned in my email
before.


Richa


On Wed, Jul 17, 2013 at 2:31 PM, Vijay  wrote:

> As the error message states: "One ore more arguments are expected," you
> have to pass a column to the rank function.
>
>
> On Wed, Jul 17, 2013 at 1:12 AM, Jérôme Verdier <
> verdier.jerom...@gmail.com> wrote:
>
>> Hi Richa,
>>
>> I have tried a simple query without joins, etc
>>
>> SELECT RANK() OVER (PARTITION BY mag.co_societe ORDER BY
>> mag.me_vente_ht),mag.co_societe, mag.me_vente_ht FROM
>> default.thm_renta_rgrp_produits_n_1 mag;
>>
>> Unfortunately, the error is the same like previously.
>>
>> Error: Query returned non-zero code: 4, cause: FAILED:
>> SemanticException Failed to breakup Windowing invocations into Groups. At
>> least 1 group must only depend on input columns. Also check for circular
>> dependencies.
>>
>> Underlying error:
>> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: One or more
>> arguments are expected.
>> SQLState:  42000
>> ErrorCode: 4
>>
>>
>>
>>
>> 2013/7/17 Richa Sharma 
>>
>>> Jerome
>>>
>>> I would recommend that you try Rank function with columns from just one
>>> table first.
>>> Once it is established that rank is working fine then add all the joins.
>>>
>>> I am still on Hive 0.10 so cannot test it myself.
>>> However, I can find a similar issue on following link - so its possible
>>> you are facing issues due to this reported bug.
>>>
>>> https://issues.apache.org/jira/browse/HIVE-4663
>>>
>>>
>>> Richa
>>>
>>>
>>> On Tue, Jul 16, 2013 at 6:41 PM, Jérôme Verdier <
>>> verdier.jerom...@gmail.com> wrote:
>>>
 You can see my query below :

 SELECT
 mag.co_magasin,
 dem.id_produit  as
 id_produit_orig,
 pnvente.dt_debut_commercial as
 dt_debut_commercial,
 COALESCE(pnvente.id_produit,dem.id_produit) as
 id_produit,
 RANK() OVER (PARTITION BY mag.co_magasin, dem.id_produit
 ORDER BY pnvente.dt_debut_commercial DESC,
 COALESCE(pnvente.id_produit,dem.id_produit) DESC) as rang

 FROM default.demarque_mag_jour dem

   LEFT OUTER JOIN default.produit_norm pn
   ON  pn.co_societe = dem.co_societe
   AND pn.id_produit = dem.id_produit
   LEFT OUTER JOIN default.produit_norm pnvente
   ON  pnvente.co_societe = pn.co_societe
   AND pnvente.co_produit_rfu = pn.co_produit_lip
   INNER JOIN default.kpi_magasin mag
   ON mag.id_magasin = dem.id_magasin


 GROUP BY
 mag.co_magasin,
 dem.id_produit,
 pnvente.dt_debut_commercial,
 COALESCE(pnvente.id_produit,dem.id_produit);


 2013/7/16 Richa Sharma 

> Can you share query with just RANK().
>
> Richa
>
>
> On Tue, Jul 16, 2013 at 6:08 PM, Jérôme Verdier <
> verdier.jerom...@gmail.com> wrote:
>
>> Hi Richa,
>>
>> I tried to execute the rank function alone, but the result is the same
>>
>> Thanks
>>
>>
>> 2013/7/16 Richa Sharma 
>>
>>> Hi Jerome
>>>
>>>
>>> I think the problem is you are trying to use MIN, SUM and RANK
>>> function in a single query.
>>>
>>> Try to get the rank first in a query and on top of it apply these
>>> aggregate functions
>>>
>>> Richa
>>>
>>>
>>>
>>>
>>> On Tue, Jul 16, 2013 at 2:15 PM, Jérôme Verdier <
>>> verdier.jerom...@gmail.com> wrote:
>>>
 Hi,

 I have a problem while using RANK OVER PARTITION function with Hive.

 Hive is in version 0.11 and, as we can see here :
 https://cwiki.apache.org/Hive/languagemanual-windowingandanalytics.html,
 we can now use these functions in Hive.

 But, when i use it, i encountered this error :

 FAILED: SemanticException Failed to breakup Windowing invocations
 into Groups. At least 1 group must only depend on input columns. Also 
 check
 for circular dependencies.
 Underlying error:
 org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: One or more
 arguments are expected.

 Here is my script :

 SELECT
 mag.co_magasin,
 dem.id_produit  as
 id_produit_orig,
 pnvente.dt_debut_commercial as
 dt_debut_commercial,
 COALESCE(pnvente.id_produit,dem.id_produit) as
>>

Re: Use RANK OVER PARTITION function in Hive 0.11

2013-07-17 Thread Jérôme Verdier
Hi Vijay,

Could you give me an example? I'm not sure what you mean.

Thanks,


2013/7/17 Vijay 

> As the error message states: "One ore more arguments are expected," you
> have to pass a column to the rank function.
>
>
> On Wed, Jul 17, 2013 at 1:12 AM, Jérôme Verdier <
> verdier.jerom...@gmail.com> wrote:
>
>> Hi Richa,
>>
>> I have tried a simple query without joins, etc
>>
>> SELECT RANK() OVER (PARTITION BY mag.co_societe ORDER BY
>> mag.me_vente_ht),mag.co_societe, mag.me_vente_ht FROM
>> default.thm_renta_rgrp_produits_n_1 mag;
>>
>> Unfortunately, the error is the same like previously.
>>
>> Error: Query returned non-zero code: 4, cause: FAILED:
>> SemanticException Failed to breakup Windowing invocations into Groups. At
>> least 1 group must only depend on input columns. Also check for circular
>> dependencies.
>>
>> Underlying error:
>> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: One or more
>> arguments are expected.
>> SQLState:  42000
>> ErrorCode: 4
>>
>>
>>
>>
>> 2013/7/17 Richa Sharma 
>>
>>> Jerome
>>>
>>> I would recommend that you try Rank function with columns from just one
>>> table first.
>>> Once it is established that rank is working fine then add all the joins.
>>>
>>> I am still on Hive 0.10 so cannot test it myself.
>>> However, I can find a similar issue on following link - so its possible
>>> you are facing issues due to this reported bug.
>>>
>>> https://issues.apache.org/jira/browse/HIVE-4663
>>>
>>>
>>> Richa
>>>
>>>
>>> On Tue, Jul 16, 2013 at 6:41 PM, Jérôme Verdier <
>>> verdier.jerom...@gmail.com> wrote:
>>>
 You can see my query below :

 SELECT
 mag.co_magasin,
 dem.id_produit  as
 id_produit_orig,
 pnvente.dt_debut_commercial as
 dt_debut_commercial,
 COALESCE(pnvente.id_produit,dem.id_produit) as
 id_produit,
 RANK() OVER (PARTITION BY mag.co_magasin, dem.id_produit
 ORDER BY pnvente.dt_debut_commercial DESC,
 COALESCE(pnvente.id_produit,dem.id_produit) DESC) as rang

 FROM default.demarque_mag_jour dem

   LEFT OUTER JOIN default.produit_norm pn
   ON  pn.co_societe = dem.co_societe
   AND pn.id_produit = dem.id_produit
   LEFT OUTER JOIN default.produit_norm pnvente
   ON  pnvente.co_societe = pn.co_societe
   AND pnvente.co_produit_rfu = pn.co_produit_lip
   INNER JOIN default.kpi_magasin mag
   ON mag.id_magasin = dem.id_magasin


 GROUP BY
 mag.co_magasin,
 dem.id_produit,
 pnvente.dt_debut_commercial,
 COALESCE(pnvente.id_produit,dem.id_produit);


 2013/7/16 Richa Sharma 

> Can you share query with just RANK().
>
> Richa
>
>
> On Tue, Jul 16, 2013 at 6:08 PM, Jérôme Verdier <
> verdier.jerom...@gmail.com> wrote:
>
>> Hi Richa,
>>
>> I tried to execute the rank function alone, but the result is the same
>>
>> Thanks
>>
>>
>> 2013/7/16 Richa Sharma 
>>
>>> Hi Jerome
>>>
>>>
>>> I think the problem is that you are trying to use the MIN, SUM and RANK
>>> functions in a single query.
>>>
>>> Try to get the rank first in a query and on top of it apply these
>>> aggregate functions
>>>
>>> Richa
>>>
>>>
>>>
>>>
>>> On Tue, Jul 16, 2013 at 2:15 PM, Jérôme Verdier <
>>> verdier.jerom...@gmail.com> wrote:
>>>
 Hi,

 I have a problem while using RANK OVER PARTITION function with Hive.

 Hive is in version 0.11 and, as we can see here:
 https://cwiki.apache.org/Hive/languagemanual-windowingandanalytics.html,
 we can now use these functions in Hive.

 But, when I use it, I encountered this error:

 FAILED: SemanticException Failed to breakup Windowing invocations
 into Groups. At least 1 group must only depend on input columns. Also 
 check
 for circular dependencies.
 Underlying error:
 org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: One or more
 arguments are expected.

 Here is my script :

 SELECT
 mag.co_magasin,
 dem.id_produit  as
 id_produit_orig,
 pnvente.dt_debut_commercial as
 dt_debut_commercial,
 COALESCE(pnvente.id_produit,dem.id_produit) as
 id_produit,
 min(
   CASE WHEN dem.co_validation IS NULL THEN 0 ELSE 1 END
 )   as
 flg_demarque_valide,
 sum(CASE WHEN dem.co_validatio
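
A rough sketch of the restructuring Richa suggests in this thread (compute the
rank in an inner query first, then apply MIN/SUM on top of it), built only from
the tables and columns quoted above. It is untested on Hive 0.11 and only shows
the shape of the rewrite; the passthrough columns and the final GROUP BY would
have to be adapted to the real query.

-- Inner query: joins plus RANK() only, no aggregates.
-- Outer query: MIN/SUM over the ranked rows.
SELECT
    t.co_magasin,
    t.id_produit_orig,
    t.dt_debut_commercial,
    t.id_produit,
    t.rang,
    min(CASE WHEN t.co_validation IS NULL THEN 0 ELSE 1 END)    as flg_demarque_valide,
    sum(CASE WHEN t.co_validation IS NULL THEN 0
             ELSE CAST(t.mt_revient_ope AS INT) END)            as me_dem_con_prx_cs
FROM (
    SELECT
        mag.co_magasin                                as co_magasin,
        dem.id_produit                                as id_produit_orig,
        pnvente.dt_debut_commercial                   as dt_debut_commercial,
        COALESCE(pnvente.id_produit,dem.id_produit)   as id_produit,
        dem.co_validation                             as co_validation,
        dem.mt_revient_ope                            as mt_revient_ope,
        RANK() OVER (PARTITION BY mag.co_magasin, dem.id_produit
                     ORDER BY pnvente.dt_debut_commercial DESC,
                              COALESCE(pnvente.id_produit,dem.id_produit) DESC) as rang
    FROM default.demarque_mag_jour dem
      LEFT OUTER JOIN default.produit_norm pn
        ON  pn.co_societe = dem.co_societe
        AND pn.id_produit = dem.id_produit
      LEFT OUTER JOIN default.produit_norm pnvente
        ON  pnvente.co_societe = pn.co_societe
        AND pnvente.co_produit_rfu = pn.co_produit_lip
      INNER JOIN default.kpi_magasin mag
        ON mag.id_magasin = dem.id_magasin
) t
GROUP BY
    t.co_magasin,
    t.id_produit_orig,
    t.dt_debut_commercial,
    t.id_produit,
    t.rang;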

Re: Use RANK OVER PARTITION function in Hive 0.11

2013-07-17 Thread Vijay
As the error message states: "One or more arguments are expected," you
have to pass a column to the rank function.
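
A rough sketch of that suggestion, applied to the simple query from earlier in
the thread (untested; whether Hive 0.11 accepts an argument here depends on how
it resolves the rank function, so treat it as an illustration, not a confirmed fix):

-- Pass the ORDER BY column to rank(), as suggested above.
SELECT rank(mag.me_vente_ht) OVER (PARTITION BY mag.co_societe
                                   ORDER BY mag.me_vente_ht),
       mag.co_societe,
       mag.me_vente_ht
FROM default.thm_renta_rgrp_produits_n_1 mag;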


On Wed, Jul 17, 2013 at 1:12 AM, Jérôme Verdier
wrote:

> Hi Richa,
>
> I have tried a simple query without joins, etc.
>
> SELECT RANK() OVER (PARTITION BY mag.co_societe ORDER BY
> mag.me_vente_ht),mag.co_societe, mag.me_vente_ht FROM
> default.thm_renta_rgrp_produits_n_1 mag;
>
> Unfortunately, the error is the same as previously.
>
> Error: Query returned non-zero code: 4, cause: FAILED:
> SemanticException Failed to breakup Windowing invocations into Groups. At
> least 1 group must only depend on input columns. Also check for circular
> dependencies.
>
> Underlying error: org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException:
> One or more arguments are expected.
> SQLState:  42000
> ErrorCode: 4
>
>
>
>
> 2013/7/17 Richa Sharma 
>
>> Jerome
>>
>> I would recommend that you try the RANK function with columns from just one
>> table first.
>> Once it is established that RANK is working fine, then add all the joins.
>>
>> I am still on Hive 0.10 so cannot test it myself.
>> However, I can find a similar issue at the following link - so it's possible
>> you are facing issues due to this reported bug.
>>
>> https://issues.apache.org/jira/browse/HIVE-4663
>>
>>
>> Richa
>>
>>
>> On Tue, Jul 16, 2013 at 6:41 PM, Jérôme Verdier <
>> verdier.jerom...@gmail.com> wrote:
>>
>>> You can see my query below :
>>>
>>> SELECT
>>> mag.co_magasin,
>>> dem.id_produit  as
>>> id_produit_orig,
>>> pnvente.dt_debut_commercial as
>>> dt_debut_commercial,
>>> COALESCE(pnvente.id_produit,dem.id_produit) as
>>> id_produit,
>>> RANK() OVER (PARTITION BY mag.co_magasin, dem.id_produit
>>> ORDER BY pnvente.dt_debut_commercial DESC,
>>> COALESCE(pnvente.id_produit,dem.id_produit) DESC) as rang
>>>
>>> FROM default.demarque_mag_jour dem
>>>
>>>   LEFT OUTER JOIN default.produit_norm pn
>>>   ON  pn.co_societe = dem.co_societe
>>>   AND pn.id_produit = dem.id_produit
>>>   LEFT OUTER JOIN default.produit_norm pnvente
>>>   ON  pnvente.co_societe = pn.co_societe
>>>   AND pnvente.co_produit_rfu = pn.co_produit_lip
>>>   INNER JOIN default.kpi_magasin mag
>>>   ON mag.id_magasin = dem.id_magasin
>>>
>>>
>>> GROUP BY
>>> mag.co_magasin,
>>> dem.id_produit,
>>> pnvente.dt_debut_commercial,
>>> COALESCE(pnvente.id_produit,dem.id_produit);
>>>
>>>
>>> 2013/7/16 Richa Sharma 
>>>
 Can you share query with just RANK().

 Richa


 On Tue, Jul 16, 2013 at 6:08 PM, Jérôme Verdier <
 verdier.jerom...@gmail.com> wrote:

> Hi Richa,
>
> I tried to execute the rank function alone, but the result is the same
>
> Thanks
>
>
> 2013/7/16 Richa Sharma 
>
>> Hi Jerome
>>
>>
>> I think the problem is that you are trying to use the MIN, SUM and RANK
>> functions in a single query.
>>
>> Try to get the rank first in a query and on top of it apply these
>> aggregate functions
>>
>> Richa
>>
>>
>>
>>
>> On Tue, Jul 16, 2013 at 2:15 PM, Jérôme Verdier <
>> verdier.jerom...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I have a problem while using RANK OVER PARTITION function with Hive.
>>>
>>> Hive is in version 0.11 and, as we can see here:
>>> https://cwiki.apache.org/Hive/languagemanual-windowingandanalytics.html,
>>> we can now use these functions in Hive.
>>>
>>> But, when I use it, I encountered this error:
>>>
>>> FAILED: SemanticException Failed to breakup Windowing invocations
>>> into Groups. At least 1 group must only depend on input columns. Also 
>>> check
>>> for circular dependencies.
>>> Underlying error:
>>> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: One or more
>>> arguments are expected.
>>>
>>> Here is my script :
>>>
>>> SELECT
>>> mag.co_magasin,
>>> dem.id_produit  as
>>> id_produit_orig,
>>> pnvente.dt_debut_commercial as
>>> dt_debut_commercial,
>>> COALESCE(pnvente.id_produit,dem.id_produit) as
>>> id_produit,
>>> min(
>>>   CASE WHEN dem.co_validation IS NULL THEN 0 ELSE 1 END
>>> )   as
>>> flg_demarque_valide,
>>> sum(CASE WHEN dem.co_validation IS NULL THEN 0 ELSE
>>> CAST(dem.mt_revient_ope AS INT) END)
>>> as
>>> me_dem_con_prx_cs,
>>> 0   as
>>> me_dem_inc_prx_cs,
>>> 0 

Re: Use RANK OVER PARTITION function in Hive 0.11

2013-07-17 Thread Jérôme Verdier
Hi Richa,

I have tried a simple query without joins, etc.

SELECT RANK() OVER (PARTITION BY mag.co_societe ORDER BY
mag.me_vente_ht),mag.co_societe, mag.me_vente_ht FROM
default.thm_renta_rgrp_produits_n_1 mag;

Unfortunately, the error is the same as previously.

Error: Query returned non-zero code: 4, cause: FAILED:
SemanticException Failed to breakup Windowing invocations into Groups. At
least 1 group must only depend on input columns. Also check for circular
dependencies.
Underlying error: org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException:
One or more arguments are expected.
SQLState:  42000
ErrorCode: 4




2013/7/17 Richa Sharma 

> Jerome
>
> I would recommend that you try the RANK function with columns from just one
> table first.
> Once it is established that RANK is working fine, then add all the joins.
>
> I am still on Hive 0.10 so cannot test it myself.
> However, I can find a similar issue at the following link - so it's possible
> you are facing issues due to this reported bug.
>
> https://issues.apache.org/jira/browse/HIVE-4663
>
>
> Richa
>
>
> On Tue, Jul 16, 2013 at 6:41 PM, Jérôme Verdier <
> verdier.jerom...@gmail.com> wrote:
>
>> You can see my query below :
>>
>> SELECT
>> mag.co_magasin,
>> dem.id_produit  as
>> id_produit_orig,
>> pnvente.dt_debut_commercial as
>> dt_debut_commercial,
>> COALESCE(pnvente.id_produit,dem.id_produit) as id_produit,
>> RANK() OVER (PARTITION BY mag.co_magasin, dem.id_produit
>> ORDER BY pnvente.dt_debut_commercial DESC,
>> COALESCE(pnvente.id_produit,dem.id_produit) DESC) as rang
>>
>> FROM default.demarque_mag_jour dem
>>
>>   LEFT OUTER JOIN default.produit_norm pn
>>   ON  pn.co_societe = dem.co_societe
>>   AND pn.id_produit = dem.id_produit
>>   LEFT OUTER JOIN default.produit_norm pnvente
>>   ON  pnvente.co_societe = pn.co_societe
>>   AND pnvente.co_produit_rfu = pn.co_produit_lip
>>   INNER JOIN default.kpi_magasin mag
>>   ON mag.id_magasin = dem.id_magasin
>>
>>
>> GROUP BY
>> mag.co_magasin,
>> dem.id_produit,
>> pnvente.dt_debut_commercial,
>> COALESCE(pnvente.id_produit,dem.id_produit);
>>
>>
>> 2013/7/16 Richa Sharma 
>>
>>> Can you share query with just RANK().
>>>
>>> Richa
>>>
>>>
>>> On Tue, Jul 16, 2013 at 6:08 PM, Jérôme Verdier <
>>> verdier.jerom...@gmail.com> wrote:
>>>
 Hi Richa,

 I tried to execute the rank function alone, but the result is the same

 Thanks


 2013/7/16 Richa Sharma 

> Hi Jerome
>
>
> I think the problem is that you are trying to use the MIN, SUM and RANK
> functions in a single query.
>
> Try to get the rank first in a query and on top of it apply these
> aggregate functions
>
> Richa
>
>
>
>
> On Tue, Jul 16, 2013 at 2:15 PM, Jérôme Verdier <
> verdier.jerom...@gmail.com> wrote:
>
>> Hi,
>>
>> I have a problem while using RANK OVER PARTITION function with Hive.
>>
>> Hive is in version 0.11 and, as we can see here:
>> https://cwiki.apache.org/Hive/languagemanual-windowingandanalytics.html,
>> we can now use these functions in Hive.
>>
>> But, when I use it, I encountered this error:
>>
>> FAILED: SemanticException Failed to breakup Windowing invocations
>> into Groups. At least 1 group must only depend on input columns. Also 
>> check
>> for circular dependencies.
>> Underlying error:
>> org.apache.hadoop.hive.ql.exec.UDFArgumentTypeException: One or more
>> arguments are expected.
>>
>> Here is my script :
>>
>> SELECT
>> mag.co_magasin,
>> dem.id_produit  as
>> id_produit_orig,
>> pnvente.dt_debut_commercial as
>> dt_debut_commercial,
>> COALESCE(pnvente.id_produit,dem.id_produit) as
>> id_produit,
>> min(
>>   CASE WHEN dem.co_validation IS NULL THEN 0 ELSE 1 END
>> )   as
>> flg_demarque_valide,
>> sum(CASE WHEN dem.co_validation IS NULL THEN 0 ELSE
>> CAST(dem.mt_revient_ope AS INT) END)
>> as
>> me_dem_con_prx_cs,
>> 0   as
>> me_dem_inc_prx_cs,
>> 0   as
>> me_dem_prov_stk_cs,
>> sum(CASE WHEN dem.co_validation IS NULL THEN 0 ELSE
>> CAST(dem.qt_demarque AS INT) END)
>> as
>> qt_dem_con,
>> 0   as
>> qt_

Hive header line in Select query help?

2013-07-17 Thread Matouk IFTISSEN
Hello Hive users,

I want to know if there is a way to export the header line in a SELECT query, in
order to store the result in a file (in a local or HDFS directory), like this
query result:

set hive.cli.print.header=true;

INSERT OVERWRITE LOCAL DIRECTORY 'C:\resultats\alerts_http_500\par_heure'
SELECT  heure_actuelle,nb_erreur_500, pct_erreurs_500 ,
 heure_precedentele, nb_erreur_500, hp.pct_erreurs_500
FROM Ma_table;

Note that set hive.cli.print.header=true; does not work here; it only works in
the CLI.
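
One possible workaround (a sketch, assuming the query can be run through the
Hive CLI on the machine where the file should land, and that the output path
below is only a placeholder): set the header option and redirect the CLI output
to a local file, since INSERT OVERWRITE LOCAL DIRECTORY writes the data files
without headers.

# Run the query via the CLI with headers enabled and capture stdout in a file.
hive --hiveconf hive.cli.print.header=true \
     -e "SELECT heure_actuelle, nb_erreur_500, pct_erreurs_500 FROM Ma_table" \
     > /tmp/alerts_http_500/par_heure/resultat.tsv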

Thanks for your answers ;)