RE: Why does the user need write permission on the location of external hive table?

2016-06-06 Thread Markovitz, Dudu
P.s. There are some risky data manipulations going on there. I’m not sure this is a desired result… ☺ hive> select CAST(REGEXP_REPLACE('And the Lord spake, saying, "First shalt thou take out the Holy Pin * Then shalt thou count to 3, no more, no less * 3 shall be the number thou shalt count, and
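A minimal sketch of the kind of conversion being flagged (the text and pattern here are hypothetical, not Dudu's exact query): stripping every non-digit character from free text and casting the remainder silently yields a number, or NULL when nothing is left, rather than an error.

  -- hypothetical illustration of the risk, not the original query
  SELECT CAST(REGEXP_REPLACE('Then shalt thou count to 3, no more, no less', '[^0-9]', '') AS INT);  -- returns 3
  SELECT CAST(REGEXP_REPLACE('no digits at all', '[^0-9]', '') AS INT);                              -- returns NULL, not an error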

RE: Why does the user need write permission on the location of external hive table?

2016-06-06 Thread Markovitz, Dudu
Hi guys, I would strongly recommend not working with zipped files. “Hadoop will not be able to split your file into chunks/blocks and run multiple maps in parallel” https://cwiki.apache.org/confluence/display/Hive/CompressedStorage Dudu From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com]
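For reference, the linked CompressedStorage page suggests re-storing such data block-compressed in a splittable container so multiple mappers can run in parallel; a minimal sketch, with hypothetical table names:

  -- assumes a raw text table raw_gz_table(line STRING) already defined over the compressed files
  SET hive.exec.compress.output=true;
  SET io.seqfile.compression.type=BLOCK;
  CREATE TABLE staging_seq (line STRING) STORED AS SEQUENCEFILE;
  INSERT OVERWRITE TABLE staging_seq SELECT line FROM raw_gz_table;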

Re: Why does the user need write permission on the location of external hive table?

2016-06-06 Thread Mich Talebzadeh
Hi Igor, Hive can read from zipped files. If you are getting a lot of external files it makes sense to zip them and store them in a staging HDFS directory. 1) download, say, these CSV files into your local file system and use bzip2 to zip them as part of ETL: ls -l total 68 -rw-r--r-- 1 hduser hadoop
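A hedged sketch of the staging-table idea (directory, table, and column names are made up): an external text table pointed at the staging directory reads the bzip2-compressed files transparently based on their .bz2 extension, and bzip2, unlike gzip, is splittable.

  CREATE EXTERNAL TABLE staging_csv (
    id      INT,
    payload STRING
  )
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
  STORED AS TEXTFILE
  LOCATION '/data/staging/csv_bz2';   -- directory containing the .bz2 files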

Re: Why does the user need write permission on the location of external hive table?

2016-06-06 Thread Igor Kravzov
Mich, will Hive automatically detect and unzip zipped files? Or is there a special option in the table configuration? It will affect performance, correct? On Mon, Jun 6, 2016 at 4:14 PM, Mich Talebzadeh wrote: > Hi Sandeep. > > I tend to use Hive external tables as staffing

Re: Why does the user need write permission on the location of external hive table?

2016-06-06 Thread Mich Talebzadeh
Sorry, should read *staging* tables .. Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw * http://talebzadehmich.wordpress.com On 6 June 2016 at

RE: alter partitions on hive external table

2016-06-06 Thread Markovitz, Dudu
And here is a full example -- bash mkdir -p t mkdir -p t/20150122/dudu/cust1 mkdir

Re: Why does the user need write permission on the location of external hive table?

2016-06-06 Thread Mich Talebzadeh
Hi Sandeep. I tend to use Hive external tables as staffing tables but still I will require write access to HDFS. Zip files work OK as well. For example, our CSV files are zipped using bzip2 to save space. However, you may request a temporary solution by disabling permission in

Re: Spark support for update/delete operations on Hive ORC transactional tables

2016-06-06 Thread Mich Talebzadeh
Thanks for this update. I can create a Hive ORC transactional table with Spark no problem. The whole thing in Hive on Spark, including updates, works fine. My Spark is 1.6.1 and Hive is version 2. But updates of an ORC transactional table through Spark fail, I am afraid. Welcome to
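For context, a minimal sketch of the kind of table under discussion (table and column names are illustrative): an ORC table created with the transactional property, which is what Hive ACID UPDATE/DELETE requires.

  CREATE TABLE tx_test (
    id    INT,
    value STRING
  )
  CLUSTERED BY (id) INTO 8 BUCKETS
  STORED AS ORC
  TBLPROPERTIES ('transactional' = 'true');

  -- an update like this works from Hive once ACID is configured;
  -- the question in the thread is whether it also works when issued through Spark
  UPDATE tx_test SET value = 'changed' WHERE id = 1;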

Re: Why does the user need write permission on the location of external hive table?

2016-06-06 Thread Igor Kravzov
I see the files are with extension .gz. Are these zipped? Did you try with unzipped files? Maybe in order to read the data Hive needs to unzip the files but does not have write permission? Just a wild guess... On Tue, May 31, 2016 at 4:20 AM, Sandeep Giri wrote: > Yes, when I run

Re: Why does the user need write permission on the location of external hive table?

2016-06-06 Thread Sandeep Giri
Hi Mich, Thank you for your response. My question is very simple. How do you process huge read-only data in HDFS using Hive? Regards, Sandeep Giri, +1 347 781 4573 (US) +91-953-899-8962 (IN) www.CloudxLab.com Phone: +1 (412) 568-3901 (Office)

Re: Spark support for update/delete operations on Hive ORC transactional tables

2016-06-06 Thread Alan Gates
This JIRA https://issues.apache.org/jira/browse/HIVE-12366 moved the heartbeat logic from the engine to the client. AFAIK this was the only issue preventing working with Spark as an engine. That JIRA was released in 2.0. I want to stress that to my knowledge no one has tested this combination

Re: Delete hive partition while executing query.

2016-06-06 Thread Alan Gates
Do you have the system configured to use the DbTxnManager? See https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration for details on how to set this up. The transaction manager is what manages locking and makes sure that your queries don’t stomp each
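A minimal sketch of the client-side settings from the linked configuration page (the metastore-side compactor settings, such as hive.compactor.initiator.on and hive.compactor.worker.threads, also need to be enabled there):

  SET hive.support.concurrency=true;
  SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
  SET hive.exec.dynamic.partition.mode=nonstrict;
  -- SET hive.enforce.bucketing=true;   -- required before Hive 2.0 only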

Re: Why does the user need write permission on the location of external hive table?

2016-06-06 Thread Mich Talebzadeh
Well Sandeep, the permissioning on HDFS resembles that of the Linux file system. For security reasons it does not allow you to write to that file. An external table in Hive is just an interface. Any reason why you have not got access to that file? Can you try to log in with beeline with username and

Hive table show less records than hbase

2016-06-06 Thread Amrit Jangid
Hi All, *Using HBase 0.98.6, CDH 5.3.5 and Hive 0.13.1* I am accessing an HBase table from Hive; all columns are mapped as string type. If I do count(*) from Hive on the table, I get fewer records than the total records in the HBase table. Also found this case - I have a column email of type string. hive>
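For reference, a hedged sketch of an HBase-backed Hive table of the kind described (the table, column family, and column names here are made up; the actual mapping is not shown in the message):

  CREATE EXTERNAL TABLE hbase_users (
    rowkey STRING,
    email  STRING
  )
  STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
  WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:email')
  TBLPROPERTIES ('hbase.table.name' = 'users');

With such a mapping the Hive-side scan is restricted to the mapped columns, so it is worth checking whether every HBase row actually has cells in them.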

Re: Why does the user need write permission on the location of external hive table?

2016-06-06 Thread Sandeep Giri
Yes, Mich, that's right. That folder is read-only to me. That's my question: why do we need modification permissions on the location while creating an external table? This data is read-only. In Hive, how can we process the huge data on which we don't have write permissions? Is cloning this data the

RE: alter partitions on hive external table

2016-06-06 Thread Markovitz, Dudu
… are just logical connections between certain values and specific directories … From: Markovitz, Dudu [mailto:dmarkov...@paypal.com] Sent: Monday, June 06, 2016 6:07 PM To: user@hive.apache.org Subject: RE: alter partitions on hive external table Hi Raj 1. I don’t understand the reason

RE: alter partitions on hive external table

2016-06-06 Thread Markovitz, Dudu
Hi Raj 1. I don’t understand the reason for this change, can you please elaborate? 2. An external table is just an interface: instructions for how to read existing data. Partitions of an external table are just logical connections between certain values and specific directories.
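An illustrative sketch of that point (the table name and path are hypothetical, loosely following the directory names in the "full example" message above): adding a partition to an external table only records a value-to-directory mapping in the metastore; no data is moved or rewritten.

  ALTER TABLE ext_events ADD IF NOT EXISTS
    PARTITION (dt = '20150122', cust = 'cust1')
    LOCATION '/data/t/20150122/dudu/cust1';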

Delete hive partition while executing query.

2016-06-06 Thread Igor Kuzmenko
Hello, I'm trying to find a safe way to delete a partition with all the data it includes. I'm using Hive 1.2.1 and Hive JDBC driver 1.2.1, and perform a simple test on a transactional table: asyncExecute("Select count(distinct in_info_msisdn) from mobile_connections where dt=20151124 and msisdn_last_digit=2",
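A minimal sketch of the drop itself (the partition spec is taken from the query in the message; whether it is safe to run concurrently is exactly what the thread is about): with the DbTxnManager configured, the exclusive lock taken by the drop should block until conflicting readers finish.

  ALTER TABLE mobile_connections
    DROP IF EXISTS PARTITION (dt = 20151124, msisdn_last_digit = 2);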

Re: alter partitions on hive external table

2016-06-06 Thread Mich Talebzadeh
So you are doing this for partition elimination? It is a tough call whatever you do. Since userid is unique you can try CLUSTERED BY (userid,datetime,customerid) INTO 256 BUCKETS, or try creating a new table based on the new partition column and insert/select part of the data and see it actually
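A sketch of that bucketing suggestion as DDL (column types are assumed, and `datetime` is back-quoted only to keep the suggested column name valid):

  CREATE TABLE events_bucketed (
    userid     STRING,
    `datetime` TIMESTAMP,
    customerid STRING
  )
  CLUSTERED BY (userid, `datetime`, customerid) INTO 256 BUCKETS
  STORED AS ORC;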

Re: alter partitions on hive external table

2016-06-06 Thread raj hive
Hi Mich, the table type is external table. Yes, I am doing this for certain queries where userid is the most significant column. On Mon, Jun 6, 2016 at 12:35 PM, Mich Talebzadeh wrote: > That order datetime/userid/customerId looks more natural to me. > > Two questions: >

Re: alter partitions on hive external table

2016-06-06 Thread Mich Talebzadeh
That order datetime/userid/customerId looks more natural to me. Two questions: What is the type of the table in Hive? Are you doing this for certain queries where you think userid, as the most significant column, is going to help queries? HTH Dr Mich Talebzadeh LinkedIn *

Re: alter partitions on hive external table

2016-06-06 Thread Margus Roo
Hi, the first idea that pops up is: 1. Use HDFS commands to copy your existing structure and data to support the new partition structure. 2. Create a new, temporary Hive external table. 3. (optional) If you created a temporary table, then drop the old one and insert ... select from the temporary table (see the sketch below).
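A minimal sketch of step 3 under these assumptions (all names are hypothetical; events_new is the repartitioned table, partitioned by userid, with columns event_time and customerid, and events_tmp is the temporary table):

  SET hive.exec.dynamic.partition=true;
  SET hive.exec.dynamic.partition.mode=nonstrict;

  -- dynamic partition insert: the partition column comes last in the SELECT list
  INSERT OVERWRITE TABLE events_new PARTITION (userid)
  SELECT event_time, customerid, userid
  FROM events_tmp;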