Re: how to make async call to hive
Hi Gary, HiveServer2 has recently added an API to support asynchronous execution: https://github.com/apache/hive/blob/trunk/service/if/TCLIService.thrift#L604 You will have to create an instance of the Thrift HiveServer2 client and, while creating the request object for ExecuteStatement, set runAsync to true. Thanks, --Vaibhav On Sun, Sep 29, 2013 at 9:23 PM, Gary Zhao garyz...@gmail.com wrote: I'm using node.js, which is async. On Sun, Sep 29, 2013 at 5:32 PM, Brad Ruderman bruder...@radiumone.com wrote: Typically it would be your application that runs the query off the main thread. Hue (Beeswax specifically) does this, and you can see the code here: https://github.com/cloudera/hue/tree/master/apps/beeswax Thx On Sun, Sep 29, 2013 at 5:15 PM, kentkong_work kentkong_w...@163.com wrote: hi all, just wondering if there is an official solution for async calls to Hive? Hive queries run for a long time, and my application can't block until they return. Kent
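For readers working in Java rather than node.js, here is a minimal sketch of the flow Vaibhav describes, written against the generated Thrift bindings. It assumes HiveServer2 is configured for a plain NOSASL transport; the host, port, and statement are placeholders:

import org.apache.hive.service.cli.thrift.*;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class AsyncHiveSketch {
    public static void main(String[] args) throws Exception {
        // Raw Thrift connection to HiveServer2 (NOSASL transport assumed)
        TTransport transport = new TSocket("localhost", 10000);
        transport.open();
        TCLIService.Client client = new TCLIService.Client(new TBinaryProtocol(transport));

        TOpenSessionResp session = client.OpenSession(new TOpenSessionReq());

        // runAsync=true makes ExecuteStatement return immediately with an operation handle
        TExecuteStatementReq exec = new TExecuteStatementReq(session.getSessionHandle(),
                "SELECT COUNT(*) FROM some_table");
        exec.setRunAsync(true);
        TExecuteStatementResp resp = client.ExecuteStatement(exec);

        // Poll the operation state instead of blocking on the call
        TGetOperationStatusReq statusReq = new TGetOperationStatusReq(resp.getOperationHandle());
        TOperationState state;
        do {
            Thread.sleep(1000);
            state = client.GetOperationStatus(statusReq).getOperationState();
        } while (state == TOperationState.INITIALIZED_STATE || state == TOperationState.RUNNING_STATE);

        // Once state is FINISHED_STATE, fetch rows with FetchResults
        transport.close();
    }
}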
issue about remote hive client
Hi all, I run the Hive client on a separate box, but every job submitted from that client runs as a local job. Why? When I try the same thing from the box where HiveServer2 is running, the job is submitted as a distributed job.
Load Timestamp data type from local file
Hello, For unit testing I would like to load, from a local file, data that has several columns, one of which is a Timestamp. The command I use is LOAD DATA LOCAL INPATH... . Unfortunately, with that column present I cannot load the dataset at all. There are no errors in the log of my local Apache Hive server; everything looks OK. And officially the Timestamp data type is supported. For completeness, I'm using Hive version 0.10.0, and I report both the script that creates the table and the dataset: - hive DROP TABLE momis_test_a_3 hive CREATE TABLE momis_test_a_3 (col1 STRING, col2 DOUBLE, col3 FLOAT, col4 TIMESTAMP, col5 BOOLEAN) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE hive LOAD DATA LOCAL INPATH '/home/nophiq/Programmi/Eclipse-Indigo-Momis/workspace/datariver/datariver-querymanager/test/sources/hive/dataset3' OVERWRITE INTO TABLE momis_test_a_3 - testo1,100.00,201.00,2013-01-01 04:00:00.123,true testo2,300.00,401.00,2013-01-02 04:00:00.123,false testo3,500.00,601.00,2013-01-03 04:00:00.123,false Finally, here is the log from the local server: Copying data from file:/home/nophiq/Programmi/Eclipse-Indigo-Momis/workspace/datariver/datariver-querymanager/test/sources/hive/dataset3 Copying file: file:/home/nophiq/Programmi/Eclipse-Indigo-Momis/workspace/datariver/datariver-querymanager/test/sources/hive/dataset3 Loading data to table default.momis_test_a_3 Deleted file:/home/nophiq/Programmi/hive-0.10.0/warehouse/momis_test_a_3 Table default.momis_test_a_3 stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 182, raw_data_size: 0] OK OK How can I load the Timestamp data type from a local file? I don't want to create an external table. Any suggestions? Thanks Claudio Reggiani
Re: Load Timestamp data type from local file
Sorry, but I could not understand the issue you are facing. After you loaded the data, did you run a select on the timestamp column? What error did you get? What data did you get? The default timestamp format is yyyy-MM-dd HH:mm:ss, and your sample data seems to match it. Can you show us the error, or what you expect to see as the query output? -- Nitin Pawar
Re: Load Timestamp data type from local file
Thanks Nitin for the reply. If I run the query SELECT * FROM momis_test_a_3, I get an empty result set with no errors, whereas I would expect all the rows. My best guess is that the timestamp column is preventing the whole dataset from being loaded, but since I don't get any errors (of any kind), I don't know where to start looking. Claudio
Re: Load Timestamp data type from local file
Hi Claudio, When you do a SELECT * FROM table, there is no MapReduce involved: Hive reads your files through the HDFS API and displays the data as tab-separated columns. If the data were populated wrongly, Hive would show the entire row in the first column and the remaining columns as NULL. Since you are seeing no data at all, I suspect the data file was somehow deleted from your table. I would recommend the following: 1) create the table with a LOCATION clause, 2) load data into the table and check the directory for the file, 3) if the file is present, run the SELECT * query. Alternatively, check your current table directory, /home/nophiq/Programmi/hive-0.10.0/warehouse/momis_test_a_3: if there are any files in it, you can run hadoop dfs -cat on them and see whether they show your content. If they do, we will need to see why Hive is not able to read the file. -- Nitin Pawar
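If it helps, the same check can be scripted with the Hadoop FileSystem API instead of the shell. A small sketch, assuming the local-filesystem warehouse path from the log above:

import java.io.InputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class CheckTableFiles {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path tableDir = new Path("file:///home/nophiq/Programmi/hive-0.10.0/warehouse/momis_test_a_3");
        FileSystem fs = tableDir.getFileSystem(conf);
        // List whatever files the LOAD left behind and dump each one to stdout
        for (FileStatus status : fs.listStatus(tableDir)) {
            System.out.println(status.getPath() + " (" + status.getLen() + " bytes)");
            try (InputStream in = fs.open(status.getPath())) {
                IOUtils.copyBytes(in, System.out, conf, false);
            }
        }
    }
}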
Re: unable to create a table in hive
Can you share your CREATE TABLE DDL and the hive warehouse directory setting from hive-site.xml? On Mon, Sep 30, 2013 at 4:57 PM, Manickam P manicka...@outlook.com wrote: Guys, when I try to create a new table in Hive I get the error below. FAILED: Error in metadata: MetaException(message:Got exception: java.io.FileNotFoundException /user) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask I've created directories in HDFS like /home/storate/tmp and /home/storage/user/hive/warehouse and given permissions, but Hive is not picking them up. I'm running an HDFS federated cluster with 2 name nodes. Does anyone have any idea? Thanks, Manickam P -- Nitin Pawar
RE: unable to create a table in hive
Hi, I have given below the script I used. I've not used any hive-site.xml here. CREATE TABLE TABLE_A (EMPLOYEE_ID INT, EMPLOYEE_NAME STRING, EMPLOYEE_LOCATION STRING, EMPLOYEE_DEPT STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE; Thanks, Manickam P
Re: unable to create a table in hive
hive-site.xml will be placed under your Hive conf directory. Anyway, try adding a LOCATION clause to your DDL, like below: CREATE TABLE TABLE_A (EMPLOYEE_ID INT, EMPLOYEE_NAME STRING, EMPLOYEE_LOCATION STRING, EMPLOYEE_DEPT STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE LOCATION '/home/storage/user/hive/warehouse/TABLE_A' -- Nitin Pawar
RE: unable to create a table in hive
Thanks man. I added hive-site.xml and it worked. Thanks, Manickam P
RE: Hive Query via Hue, Only column headers in downloaded CSV or XLS results, sometimes
Hmm... no replies on this one? Does no one use Hue? :-) That would be interesting to know... if not Hue, how are others exposing Hive to end users without giving them a direct login to a node on the cluster? --- Mark E. Sunderlin Data Architect | AOL NETWORKS BDM P: 703-265-6935 | C: 540-327-6222 | AIM: MESunderlin 22000 AOL Way, Dulles, VA 20166 -Original Message- From: Sunderlin, Mark [mailto:mark.sunder...@teamaol.com] Sent: Wednesday, September 18, 2013 2:08 PM To: user@hive.apache.org Subject: Hive Query via Hue, Only column headers in downloaded CSV or XLS results, sometimes Using Hive V11, via Hue from CDH4, I can run my query, output 10 rows (limit 10), and download to a nice CSV or XLS file ... sometimes. :-( Sometimes, even when the run is error free, the download contains only the column headers. This is true for both the CSV and XLS options. It is only ten lines of output, so it cannot be a number-of-rows issue. Is there a limit to the width of the data you can download? A limit on the number of columns? Has anyone seen this before? Does anyone know a fix or a workaround? --- Mark E. Sunderlin Data Architect | AOL NETWORKS BDM P: 703-265-6935 | C: 540-327-6222 | AIM: MESunderlin 22000 AOL Way, Dulles, VA 20166
Error - loading data into tables
Hi, I'm getting the error below while loading data into a Hive table: return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask I used this query to load the table: LOAD DATA INPATH '/home/storage/mount1/tabled.txt' INTO TABLE TEST; Thanks, Manickam P
Not able to execute this query
When executing the query to create a table in Hive, I am getting this error: 'NoneType' object has no attribute 'columns' The table script is below: create external table test1( - ) PARTITIONED BY (col1 timestamp, col2 timestamp) CLUSTERED BY(col1) SORTED BY(col1 ASC) into 40 buckets ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE LOCATION '/user/test/' Thanks, Shouvanik
Re: Not able to execute this query
You are trying to bucket and partition on the same column? I could create the Hive table once I changed the bucketing column to a non-partition column. -- Nitin Pawar
RE: Hive Query via Hue, Only column headers in downloaded CSV or XLS results, sometimes
Hi Mark - we hit this issue as well. We use Hue as the Hive front-end for our users, and this is a pretty big roadblock for them. We're on Hue 2.2 and Hive 11. If you figure out a fix, let me know :)
RE: Not able to execute this query
Hi, Have you used the HUE web console? Actually, I have not used the same columns. But when I run the query, I get that error! Please help? Thanks, Shouvanik
Re: Not able to execute this query
I am really not sure what your entire query is, but the one below works. If possible, share your entire DDL, and mask or hide columns if there is something you cannot share: create table test1( col3 int, col4 string) PARTITIONED BY (col1 timestamp, col2 timestamp) CLUSTERED BY(col3) SORTED BY(col3 ASC) into 40 buckets ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE; -- Nitin Pawar
RE: Not able to execute this query
Hi Nitin, Thanks. That answers my previous query. But if I add a LOCATION '/user/hue/' clause, I get a big fat exception in Beeswax. Thanks, Shouvanik
Converting from textfile to sequencefile using Hive
Hi, I have a lot of tweets saved as text. I created an external table on top of it to access it as textfile. I need to convert these to sequencefiles with each tweet as its own record. To do this, I created another table as a sequencefile table like so - CREATE EXTERNAL TABLE tweetseq( tweet STRING ) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054' STORED AS SEQUENCEFILE LOCATION '/user/hdfs/tweetseq' Now when I insert into this table from my original tweets table, each line gets its own record as expected. This is great. However, I don't have any record ids here. Short of writing my own UDF to make that happen, are there any obvious solutions I am missing here? PS, I need the ids to be there because mahout seq2sparse expects that. Without ids, it fails with - java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.io.Text at org.apache.mahout.vectorizer.document.SequenceFileTokenizerMapper.map(SequenceFileTokenizerMapper.java:37) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330) at org.apache.hadoop.mapred.Child$4.run(Child.java:268) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) at org.apache.hadoop.mapred.Child.main(Child.java:262) Regards, S
Re: Not able to execute this query
Does the hue user have permissions to access '/user/hue'? Does that directory exist? -- Nitin Pawar
Re: Error - loading data into tables
Is this /home/storage/... an HDFS directory? I think it is a normal filesystem directory. Try running this instead: LOAD DATA LOCAL INPATH '/home/storage/mount1/tabled.txt' INTO TABLE TEST; -- Nitin Pawar
RE: how to treat an existing partition data file as a table?
You need to specify a table partition from which you want to sample. Olga From: Yang [mailto:tedd...@gmail.com] Sent: Sunday, September 29, 2013 1:39 PM To: hive-u...@hadoop.apache.org Subject: how to treat an existing partition data file as a table? we have a huge table, including browsing data for the past 5 years, let's say. now I want to take a few samples to play around with. so I did select * from mytable limit 10; but it actually went full out and tried to scan the entire table. is there a way to create a view pointing to only one of the data files used by the original table mytable? this way the total number of files to be scanned is much smaller. thanks! yang
RE: Not able to execute this query
Thanks Nitin. I am able to create the table in HUE now. You are right: there was no such directory, and hence no permissions. Thanks, Shouvanik
Doing MSCK throws error
Hi, On executing MSCK REPAIR TABLE table1, I get the error below: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask What could the cause be? Thanks, Shouvanik
RE: Doing MSCK throws error
The script for this table is: add jar json-serde-1.1.3-jar-with-dependencies.jar; list jars; CREATE EXTERNAL TABLE IF NOT EXISTS table1 ( instance_type string, category string, session_id string, nonce string, user_id string, properties array<struct<name : string, value : string>>, instance map<string,string>, true_as_of_secs string ) PARTITIONED BY (type string, dth string) ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe' LOCATION 's3n://com...xx.xxx/tables/pl0/ctg='; MSCK REPAIR TABLE table1; Thanks, Shouvanik
RE: Hive Query via Hue, Only column headers in downloaded CSV or XLS results, sometimes
Mark - is the Hive table you're using for this fairly wide? If so, are you doing a select * from table_name limit 10? We ran some tests this morning on one of the Hive tables that was giving us fits: if we limit the select to ~20 columns and keep the limit on the query, we get the results fairly quickly and are able to export.
Re: Hive Query via Hue, Only column headers in downloaded CSV or XLS results, sometimes
+ hue-user hue-u...@cloudera.org thanks Prasad
Re: Converting from textfile to sequencefile using Hive
Are you using Hive just to convert your text files to sequence files? If that's the case, you may want to look at the purpose Hive was developed for. If you want to do data manipulation or enrichment on a routine basis, without any kind of analytics functions, and you do not want to code a lot of MapReduce jobs, you can take a look at Pig scripts. Basically, what you want to do is generate a UUID for each of your tweets and then feed them to the Mahout algorithms. Sorry if I understood it wrong or it sounds rude.
Re: Converting from textfile to sequencefile using Hive
Hi Nitin, No offense taken, and thank you for your response. Part of this is also trying to find the right tool for the job. I am running queries to determine the cuts of tweets that I want, then doing some modest normalization (through a Python script), and then I want to create sequence files from that. So far Hive seems to be the most convenient way to do this, but I can take a look at Pig too. It looked like STORED AS SEQUENCEFILE gets me 99% of the way there, so I was wondering if there was a way to get those ids in there as well. The last piece is always the stumbler :) Thanks again, S
Re: Converting from textfile to sequencefile using Hive
S, Check out these presentations from Data Science Maryland back in May [1]. 1. working with tweets in Hive: http://www.slideshare.net/JoeyEcheverria/analyzing-twitter-data-with-hadoop-20929978 2. then pulling stuff out of Hive to use with Mahout: http://files.meetup.com/6195792/Working%20With%20Mahout.pdf The Mahout talk didn't have a directly useful outcome (largely because it tried to work with the tweets as individual text documents), but it does go through all the mechanics of exactly what you say you want. The meetup page also has links to video, if the slides don't give enough context. HTH [1]: http://www.meetup.com/Data-Science-MD/events/111081282/ -- Sean
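The ClassCastException in the original post shows mahout seq2sparse receiving a BytesWritable key where it expects a Text document id, which is what Hive's STORED AS SEQUENCEFILE output hands it. One workaround is to write the Text/Text file yourself from a plain-text export of the query; a sketch using the Hadoop API, where the paths, the id scheme, and the class name are all illustrative:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class TweetsToSeq {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path in = new Path(args[0]);   // text output of the Hive query, one tweet per line
        Path out = new Path(args[1]);  // Text/Text sequence file for mahout seq2sparse

        SequenceFile.Writer writer = SequenceFile.createWriter(fs, conf, out, Text.class, Text.class);
        BufferedReader reader = new BufferedReader(new InputStreamReader(fs.open(in)));
        String line;
        long id = 0;
        while ((line = reader.readLine()) != null) {
            // Mahout wants a Text key (document id) and a Text value (document body)
            writer.append(new Text("tweet-" + id++), new Text(line));
        }
        reader.close();
        writer.close();
    }
}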
Re: Want query to use more reducers
Hey Keith, It sounds like you should tweak the settings for how Hive handles query execution [1]: 1) Tune the guessed number of reducers based on input size: hive.exec.reducers.bytes.per.reducer, which defaults to 1GB. Based on your description, this is probably still at the default. In this case you should also set a max number of reducers based on your cluster size: hive.exec.reducers.max. I usually set this to the number of reduce slots if there's a decent chance I'll get to saturate the cluster; if not, don't worry about it. 2) Hard-code a number of reducers: mapred.reduce.tasks. Setting this will cause Hive to always use that number. It defaults to -1, which tells Hive to use the input-size heuristic to guess. In either case, you should look at the options to merge small files (search for merge in the configuration property list) to avoid getting lots of little outputs. HTH [1]: https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryExecution -Sean On Mon, Sep 30, 2013 at 11:31 AM, Keith Wiley kwi...@keithwiley.com wrote: I have a query that doesn't use reducers as efficiently as I would hope. If I run it on a large table, it uses more reducers, even saturating the cluster, as I desire. However, on smaller tables it uses as few as a single reducer. While I understand the logic in this (not using multiple reducers until the data size is larger), it is nevertheless inefficient to run a query for thirty minutes leaving the entire cluster vacant when the query could distribute the work evenly and wrap things up in a fraction of the time. The query is shown below (abstracted to its basic form). As you can see, it is a little atypical: it is a nested query (which obviously implies two map-reduce jobs), and it uses a script for the reducer stage that I am trying to speed up. I thought the DISTRIBUTE BY clause would make it use the reducers more evenly, but as I said, that is not the behavior I am seeing. Any ideas how I could improve this situation? Thanks. CREATE TABLE output_table ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' as SELECT * FROM ( FROM ( SELECT * FROM input_table DISTRIBUTE BY input_column_1 SORT BY input_column_1 ASC, input_column_2 ASC, input_column_etc ASC) q SELECT TRANSFORM(*) USING 'python my_reducer_script.py' AS( output_column_1, output_column_2, output_column_etc ) ) s ORDER BY output_column_1; Keith Wiley kwi...@keithwiley.com keithwiley.com music.keithwiley.com Luminous beings are we, not this crude matter. -- Yoda -- Sean
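If the query is submitted from an application rather than the Hive CLI, the same knobs can be set per-session over JDBC before the heavy statement runs. A sketch under assumed values; the HiveServer2 URL is a placeholder, and the numbers shown are not recommendations:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ReducerTuningSketch {
    public static void main(String[] args) throws Exception {
        // HiveServer2 JDBC driver; URL, user, and password are placeholders
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        Connection conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "", "");
        Statement stmt = conn.createStatement();
        // Lower the per-reducer input target so smaller inputs still fan out (128 MB here)
        stmt.execute("set hive.exec.reducers.bytes.per.reducer=134217728");
        // Cap the fan-out at roughly the number of reduce slots in the cluster
        stmt.execute("set hive.exec.reducers.max=32");
        // Queries issued on this connection now pick up the tuned settings
        conn.close();
    }
}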
Re: Converting from textfile to sequencefile using Hive
Thanks Sean, that is exactly what I want.
Re: Want query to use more reducers
Thanks. mapred.reduce.tasks and hive.exec.reducers.max seem to have fixed the problem. It is now saturating the cluster and running the query super fast. Excellent! Keith Wiley kwi...@keithwiley.com keithwiley.com music.keithwiley.com I do not feel obliged to believe that the same God who has endowed us with sense, reason, and intellect has intended us to forgo their use. -- Galileo Galilei
Re: how to treat an existing partition data file as a table?
Thanks guys. I found that the table is not partitioned, so I guess there is no way out...
Re: UDF error?
It's been ages since I wrote one, but the differences from mine:

a) I use LongWritable:

public LongWritable evaluate(LongWritable startAt) {

b) I have annotations on the class (but I think they are just for docs):

@Description(name = "row_sequence",
    value = "_FUNC_() - Returns a generated row sequence number starting from 1")
@UDFType(deterministic = false)
public class UDFRowSequence extends UDF {

Hope this helps!
Tim

On Mon, Sep 30, 2013 at 10:47 PM, Yang tedd...@gmail.com wrote:

I wrote a super simple UDF, but got some errors.

UDF:

package yy;

import org.apache.hadoop.hive.ql.exec.UDF;
import java.util.Random;
import java.util.UUID;
import java.lang.management.*;

public class MyUdf extends UDF {
    static Random rand = new Random(System.currentTimeMillis() + Thread.currentThread().getId() * 100);
    String name = ManagementFactory.getRuntimeMXBean().getName();
    long startValue = Long.valueOf(name.replaceAll("[^\\d]+", "")) * 1 + Thread.currentThread().getId() * 1000;

    public long evaluate(long x) {
        //return (long) UUID.randomUUID().hashCode();
        //return rand.nextLong();
        return startValue++;
    }
}

sql script:

CREATE TEMPORARY FUNCTION gen_uniq2 AS 'yy.MyUdf';
select gen_uniq2(field1), field2 from yy_mapping limit 10;

field1 is bigint, field2 is int.

error:

hive> source aa.sql;
Added ./MyUdf.jar to class path
Added resource: ./MyUdf.jar
OK
Time taken: 0.0070 seconds
FAILED: SemanticException [Error 10014]: Line 2:7 Wrong arguments 'field1': No matching method for class yy.MyUdf with (bigint). Possible choices: _FUNC_()

So I'm declaring a UDF with an arg of long, which should work for a bigint (more importantly, it's complaining not about long vs. bigint, but about bigint vs. void). I tried changing both to int; same failure.

thanks!
yang
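Putting Tim's two points together, here is a minimal sketch of Yang's class using a Hadoop writable in the evaluate() signature plus the two annotations. This is a sketch, not the linked UDFRowSequence; the multipliers in the seed arithmetic are illustrative, since the original post appears to have lost some digits in transit:

package yy;

import java.lang.management.ManagementFactory;
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.hive.ql.udf.UDFType;
import org.apache.hadoop.io.LongWritable;

@Description(name = "gen_uniq2",
    value = "_FUNC_(col) - returns a counter seeded from the JVM pid and thread id")
@UDFType(deterministic = false)  // tells Hive not to treat every call as the same constant
public class MyUdf extends UDF {
  // The RuntimeMXBean name is typically "pid@hostname"; strip everything but the digits.
  private static long seed() {
    String name = ManagementFactory.getRuntimeMXBean().getName();
    long pid = Long.parseLong(name.replaceAll("[^\\d]+", ""));
    return pid * 1000000L + Thread.currentThread().getId() * 1000L;  // multipliers are illustrative
  }

  private long nextValue = seed();
  private final LongWritable result = new LongWritable();

  // bigint columns arrive as LongWritable; a matching signature avoids the
  // "No matching method ... with (bigint)" error from the original post.
  public LongWritable evaluate(LongWritable ignored) {
    result.set(nextValue++);
    return result;
  }
}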
Re: UDF error?
That class is: https://code.google.com/p/gbif-occurrencestore/source/browse/trunk/occurrence-store/src/main/java/org/gbif/occurrencestore/hive/udf/UDFRowSequence.java

Cheers,
Tim
Re: UDF error?
thanks! at first I did have a no-arg evaluate(), but somehow "select myfunction(), field1, field2 from mytable;" spits out the same value for myfunction() on every row. so I wondered whether the UDF got called only once: since the hive compiler sees that the function takes no arguments, it may assume all invocations yield the same value. I passed in a param to rule out that possibility.
Re: UDF error?
Here is an example of a no-arg UDF that will return a different value for each row: https://code.google.com/p/gbif-occurrencestore/source/browse/trunk/occurrence-store/src/main/java/org/gbif/occurrencestore/hive/udf/UuidUDF.java

Hope this helps,
Tim
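To illustrate the pattern (this is a sketch in the spirit of the linked class, not a copy of it): a no-arg evaluate() works fine for per-row values as long as the class is marked non-deterministic, which is what stops Hive from evaluating the call once and reusing the result:

package yy;

import java.util.UUID;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.hive.ql.udf.UDFType;
import org.apache.hadoop.io.Text;

@UDFType(deterministic = false)  // without this, Hive may fold the call to a single constant
public class UuidUdf extends UDF {
  private final Text result = new Text();

  public Text evaluate() {
    result.set(UUID.randomUUID().toString());
    return result;
  }
}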
Re: UDF error?
ok I found the reason: I had modified the jar file, and even though I re-ran "add jar ./MyUdf.jar; CREATE TEMPORARY FUNCTION ...;", it doesn't take effect. I have to get out of the hive session and then rerun these again.
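Spelled out as a sketch using the names from this thread, the workaround after rebuilding the jar is a fresh session:

hive> quit;
$ hive
hive> add jar ./MyUdf.jar;
hive> CREATE TEMPORARY FUNCTION gen_uniq2 AS 'yy.MyUdf';
hive> select gen_uniq2(field1), field2 from yy_mapping limit 10;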
Re: Tableau connectivity available on KR
Olga, I'm sure it was not intended for me and a lot of us; hive-u...@hadoop.apache.org made it happen.

From: Olga L. Natkovich ol...@yahoo-inc.com
To: kryptonite-u...@yahoo-inc.com; hive-u...@hadoop.apache.org; ygrid-sandbox-annou...@yahoo-inc.com; ygrid-production-annou...@yahoo-inc.com; ygrid-research-annou...@yahoo-inc.com; hcat-us...@yahoo-inc.com
Sent: Monday, September 30, 2013 2:12 PM
Subject: Tableau connectivity available on KR

Dear Grid Users, the Hadoop Services team is happy to announce that Tableau is now supported on KR. Please come give it a try and provide your feedback. The steps to connect with Tableau are described here: http://twiki.corp.yahoo.com/view/Grid/HiveServer2BITools. In addition, we also provide support for MicroStrategy users. If you want to connect your MS server to KR, please follow the instructions here: https://docs.google.com/a/yahoo-inc.com/document/d/1QzAh19bysE6ooFeCPSZZTFgcVR36stK6v8ZPcq2Yi30.

Olga
RE: Tableau connectivity available on KR
Sorry for the spam. This was meant as an internal Yahoo announcement.

Olga

From: Mohammad Islam [mailto:misla...@yahoo.com]
Sent: Monday, September 30, 2013 3:53 PM
To: user@hive.apache.org
Subject: Re: Tableau connectivity available on KR