Re: Export to RDBMS directly

2013-07-17 Thread Siddharth Tiwari
Why don't you try implementing that and contributing it to the community? Sent from my iPhone On Jul 17, 2013, at 10:18 PM, "Omkar Joshi" wrote: > I read of the term ‘JDBC Storage Handler’ at > https://issues.apache.org/jira/browse/HIVE-1555 > > The issue seems open but I just want to confirm

RE: Export to RDBMS directly

2013-07-17 Thread Omkar Joshi
I read of the term 'JDBC Storage Handler' at https://issues.apache.org/jira/browse/HIVE-1555. The issue seems open, but I just want to confirm that it has not been implemented in the latest Hive releases. Regards, Omkar Joshi From: Bertrand Dechoux [mailto:decho...@gmail.com] Sent: Thursday, Ju

Re: Export to RDBMS directly

2013-07-17 Thread Bertrand Dechoux
The short answer is no. You could, at the moment, write your own input format/output format in order to do so. I don't know all the details for Hive, but that's possible. However, you will likely run a DoS against your database if you are not careful. Hive could embed Sqoop in order to do that smartly fo

Export to RDBMS directly

2013-07-17 Thread Omkar Joshi
Hi, Currently, I'm executing the following steps (Hadoop 1.1.2, Hive 0.11 and Sqoop-1.4.3.bin__hadoop-1.0.0): 1. Import data from MySQL to Hive using Sqoop. 2. Execute a query in Hive and store its output in a Hive table. 3. Export the output to MySQL using Sqoop. I was wondering if it
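Step 2 of the workflow above might look like the following sketch; the table and column names are hypothetical, not taken from the thread:

```sql
-- Step 2 (illustrative): run a query in Hive and store its output
-- in a managed Hive table, which Sqoop can later export to MySQL.
CREATE TABLE report_output AS
SELECT customer_id,
       SUM(amount) AS total_amount
FROM imported_orders          -- table previously imported via Sqoop
GROUP BY customer_id;
```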

Re: Hive does not package a custom inputformat in the MR job jar when the custom inputformat class is added as an aux jar.

2013-07-17 Thread Andrew Trask
Put them in hive's lib folder? Sent from my Rotary Phone On Jul 17, 2013, at 11:14 PM, Mitesh Peshave wrote: > Hello, > > I am trying to use a custom inputformat for a hive table. > > When I add the jar containing the custom inputformat through a client, such > as the beeline, executing "ad

Hive does not package a custom inputformat in the MR job jar when the custom inputformat class is added as an aux jar.

2013-07-17 Thread Mitesh Peshave
Hello, I am trying to use a custom inputformat for a Hive table. When I add the jar containing the custom inputformat through a client such as Beeline, executing the "add jar" command, all seems to work fine. In this scenario, Hive seems to pass the inputformat class to the JT and TTs. I believe, it
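The session-scoped approach that works in the thread can be sketched as follows; the jar path and class names are hypothetical:

```sql
-- Session-scoped: "add jar" ships the jar with the MR job,
-- which is why this path works where the aux-jar path does not.
ADD JAR /opt/jars/custom-inputformat.jar;

-- Reference the custom class when defining the table
-- (class and table names are placeholders).
CREATE TABLE events (line STRING)
STORED AS
  INPUTFORMAT 'com.example.CustomInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
```

By contrast, jars listed in hive.aux.jars.path are placed on the Hive client's classpath but are not necessarily packaged into the submitted MR job, which matches the behavior described in the subject line.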

Re: can hive use where clause in jdbc?

2013-07-17 Thread Thejas Nair
It is unlikely to be specifically caused by the WHERE clause. Are you able to run this query using the Hive CLI? Are you able to run any query that involves running an MR job through JDBC? What do you see in the Hive logs? On Tue, Jul 16, 2013 at 1:10 AM, ch huang wrote: > here is my test output,why t

Problem with the windowing function ntile (Exceptions)

2013-07-17 Thread Lars Francke
Hi, I'm running a query like this: CREATE TABLE foo STORED AS ORC AS SELECT id, season, amount, ntile(10) OVER ( PARTITION BY season ORDER BY amount DESC ) FROM bar; On a small enough dataset that works fine, but when switching to a larger sample we're seeing exceptions like this:

Re: Question regarding external table and csv in NFS

2013-07-17 Thread Mainak Ghosh
Hey Saurabh, I tried this command and it still gives the same error. Actually, the folder name is supplier and supplier.tbl is the csv that resides inside it. I had it correct in the query, but in the mail it was wrong. So the query that I executed was: create external table outside_supplier (S_SUPPK

Re: Question regarding external table and csv in NFS

2013-07-17 Thread Saurabh M
Hi Mainak, Can you try using this: create external table outside_supplier (S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING, S_NATIONKEY INT, S_PHONE STRING, S_ACCTBAL DOUBLE, S_COMMENT STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE LOCATION 'file:///mnt/h/tpc-h-impala/da

Question regarding external table and csv in NFS

2013-07-17 Thread Mainak Ghosh
Hello, I have just started using Hive and I was trying to create an external table with the csv file placed in NFS. I tried using file:// and local://. Both of these attempts failed with the error: create external table outside_supplier (S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING, S_NATIONK
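Putting the thread's pieces together, the DDL being attempted looks like the sketch below. Note two common pitfalls: LOCATION must point at a directory rather than the csv file itself, and a `file://` path on an NFS mount only works if that path is visible on every node running the query. The path is the one quoted in the thread; treat it as illustrative:

```sql
-- Sketch of the external table from the thread; LOCATION is the
-- directory containing supplier.tbl, not the file itself.
CREATE EXTERNAL TABLE outside_supplier (
  S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING,
  S_NATIONKEY INT, S_PHONE STRING,
  S_ACCTBAL DOUBLE, S_COMMENT STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE
LOCATION 'file:///mnt/h/tpc-h-impala/data/supplier';
```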

Re: Java Courses for Scripters/Big Data Geeks

2013-07-17 Thread xiufeng liu
You could also take a look at the following resources for data science: http://datascienc.es/resources/ http://blog.zipfianacademy.com/ Regards, Xiufeng Liu On Wed, Jul 17, 2013 at 10:09 PM, Yasmin Lucero wrote: > Ha. I have the same problem. It is hard to find resources aimed at the > right level

Re: Java Courses for Scripters/Big Data Geeks

2013-07-17 Thread Yasmin Lucero
Ha. I have the same problem. It is hard to find resources aimed at the right level. I have been pretty happy with the book Head First Java by Kathy Sierra and Bert someone-or-other. y Yasmin Lucero Senior Statistician, Gravity.com Santa

Re: Java Courses for Scripters/Big Data Geeks

2013-07-17 Thread John Meagher
The Data Science course on Coursera has a pretty good overview of map reduce, Hive, and Pig without going into the Java side of things. https://www.coursera.org/course/datasci. It's not in depth, but it is enough to get started. On Wed, Jul 17, 2013 at 3:52 PM, John Omernik wrote: > Hey all - >

Java Courses for Scripters/Big Data Geeks

2013-07-17 Thread John Omernik
Hey all - I was wondering if there were any "shortcut" Java courses out there. As in, I am not looking for a holistic, learn-everything-about-Java course, but more of a "So you are a big data/Hive geek and you get Python/Perl pretty well, but when you try to understand Java your head explodes" and

Re: New to hive.

2013-07-17 Thread Mohammad Tariq
Great. Good luck with that. Warm Regards, Tariq cloudfront.blogspot.com On Thu, Jul 18, 2013 at 12:43 AM, Bharati Adkar < bharati.ad...@mparallelo.com> wrote: > Hi Tariq, > > No Problems, > It was the hive.jar.path property that was not being set. Figured it out > and fixed it. > Got the plan.x

Re: New to hive.

2013-07-17 Thread Bharati Adkar
Hi Tariq, No problems, It was the hive.jar.path property that was not being set. Figured it out and fixed it. Got the plan.xml and jobconf.xml; now I will debug Hadoop to get the rest of the info. Thanks, Warm regards, Bharati On Jul 17, 2013, at 12:08 PM, Mohammad Tariq wrote: > Hello ma'm, > > Ap

Re: New to hive.

2013-07-17 Thread Mohammad Tariq
Hello ma'm, Apologies first of all for responding so late. Stuck with some urgent deliverables. Was out of touch for a while. java.io.IOException: Cannot run program "/Users/bharati/hive-0.11.0/src/testutils/hadoop" (in directory "/Users/bharati/eclipse/tutorial/src"): error=13, Permission denied

Re: which approach is better

2013-07-17 Thread Hamza Asad
I use the data to generate reports on a daily basis and do a couple of analyses; it's insert-once, read-many on a daily basis. But my main purpose is to secure my data and easily recover it even if my Hadoop (datanode) or HDFS crashes. Up till now, I'm using an approach in which data has been retrieved dir

Re: which approach is better

2013-07-17 Thread kulkarni.swar...@gmail.com
First of all, that might not be the right way to choose the underlying storage. You should choose HDFS or HBase depending on whether the data is going to be used for batch processing or you need random access on top of it. HBase is just another layer on top of HDFS, so obviously the queries ru

Re: which approach is better

2013-07-17 Thread Nitin Pawar
What's the purpose of the data storage? What read and write throughput do you expect? How will you access the data when reading? What are your SLAs on both read and write? There will be more questions others will ask, so be ready for that :) On Wed, Jul 17, 2013 at 11:10 PM, Hamza Asad wrote

which approach is better

2013-07-17 Thread Hamza Asad
Please let me know which approach is better: either I save my data directly to HDFS and run Hive (Shark) queries over it, OR I store my data in HBase and then query it, as I want to ensure efficient data retrieval, and that the data remains safe and can easily be recovered if Hadoop crashes. -- *Muhammad Hamza As

RE: New to hive.

2013-07-17 Thread Puneet Khatod
Hi, There are many online tutorials and blogs that provide quick get-set-go sort of information. To start with, you can learn Hadoop. For detailed knowledge you will have to go through the e-books mentioned by Lefty. These books are bulky but will cover every bit of Hadoop. I recently came across

Re: Use RANK OVER PARTITION function in Hive 0.11

2013-07-17 Thread Richa Sharma
My bad ... in relational databases we generally do not give a column name inside rank() ... the one in (partition by ... order by ...) is sufficient. But it looks like that's not the case in Hive. Jerome, please look at the examples in the link below and see if you are able to make it work: https://cwi

Re: Use RANK OVER PARTITION function in Hive 0.11

2013-07-17 Thread Jérôme Verdier
Hi Richa, I have tried one query, with what I've understood of Vijay's tips. SELECT code_entite, RANK(mag.me_vente_ht) OVER (PARTITION BY mag.co_societe ORDER BY mag.me_vente_ht) AS rank FROM default.thm_renta_rgrp_produits_n_1 mag; This query is working; it gives me results. You say that may

Re: Use RANK OVER PARTITION function in Hive 0.11

2013-07-17 Thread Richa Sharma
Vijay, Jerome has already passed a column -> mag.co_societe for rank. Syntax -> RANK() OVER (PARTITION BY mag.co_societe ORDER BY mag.me_vente_ht). This will generate a rank for column mag.co_societe based on the column value me_vente_ht. Jerome, it's possible you are also hitting the same bug as I menti

Re: Use RANK OVER PARTITION function in Hive 0.11

2013-07-17 Thread Jérôme Verdier
Hi Vijay, Could you give me an example? I'm not sure what you mean. Thanks, 2013/7/17 Vijay > As the error message states: "One or more arguments are expected," you > have to pass a column to the rank function. > > > On Wed, Jul 17, 2013 at 1:12 AM, Jérôme Verdier < > verdier.jerom.

Re: Use RANK OVER PARTITION function in Hive 0.11

2013-07-17 Thread Vijay
As the error message states: "One or more arguments are expected," you have to pass a column to the rank function. On Wed, Jul 17, 2013 at 1:12 AM, Jérôme Verdier wrote: > Hi Richa, > > I have tried a simple query without joins, etc > > SELECT RANK() OVER (PARTITION BY mag.co_societe ORDER

Re: Use RANK OVER PARTITION function in Hive 0.11

2013-07-17 Thread Jérôme Verdier
Hi Richa, I have tried a simple query without joins, etc.: SELECT RANK() OVER (PARTITION BY mag.co_societe ORDER BY mag.me_vente_ht), mag.co_societe, mag.me_vente_ht FROM default.thm_renta_rgrp_produits_n_1 mag; Unfortunately, the error is the same as previously. Error: Query returned non-ze
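For reference, the standard ANSI-style form of the query being debugged in this thread is sketched below, using the table and columns quoted in the messages. In standard SQL, RANK() takes no argument; the ordering column belongs in the OVER clause. Whether Hive 0.11 accepts this exact form may depend on the bug discussed above:

```sql
-- Standard windowing form: RANK() with no argument;
-- partitioning and ordering are supplied by the OVER clause.
SELECT mag.co_societe,
       mag.me_vente_ht,
       RANK() OVER (PARTITION BY mag.co_societe
                    ORDER BY mag.me_vente_ht) AS rnk
FROM default.thm_renta_rgrp_produits_n_1 mag;
```

Note that `rank` itself is a function name, so aliasing the result column as something like `rnk` avoids a potential keyword clash.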

Hive header line in Select query help?

2013-07-17 Thread Matouk IFTISSEN
Hello Hive users, I want to know if there is a way to export the header line in a select query, in order to store the result in a file (in a local or HDFS directory), like this query result: set hive.cli.print.header=true; INSERT OVERWRITE LOCAL DIRECTORY 'C:\resultats\alerts_http_500\par_heure' SELEC
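One caveat worth noting: hive.cli.print.header affects only what the CLI prints to the console; it is not applied to INSERT OVERWRITE DIRECTORY output. A common workaround is to union a literal header row onto the result. The sketch below uses hypothetical table and column names, and the output path is illustrative:

```sql
-- Workaround sketch: prepend a header row by UNION ALL-ing string
-- literals; all columns must be cast to STRING so the types match.
-- Caution: row order in the output files is not guaranteed.
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/alerts_http_500/par_heure'
SELECT * FROM (
  SELECT 'hour' AS c1, 'alert_count' AS c2
  FROM alerts_http_500 LIMIT 1
  UNION ALL
  SELECT hour, CAST(alert_count AS STRING)
  FROM alerts_http_500
) t;
```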