Re: Updating ORC table fails with [Error 10122]: Bucketized tables do not support INSERT INTO:

2016-07-29 Thread Mich Talebzadeh
42 HARRODS HARRODS LTD CD 4610 4 HARRODS HARRODS LTD CD 4636 13 HARRODS HARRODS LTD CD 5916 28 HARRODS HARRODS LTD CD 4628 111 HARRODS cheers Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <

Analytical function works in Spark SQL but not in Hive 2 QL

2016-07-31 Thread Mich Talebzadeh
flect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) *FAILED: ParseException line 6:7 Failed to recogni

Re: Analytical function works in Spark SQL but not in Hive 2 QL

2016-07-31 Thread Mich Talebzadeh
d Stage-1_0: 0(+1)/1 2016-07-31 10:48:35,780 Stage-0_0: 1/1 Finished Stage-1_0: 1/1 Finished Status: Finished successfully in 10.10 seconds OK 2015-12-15 HARRODS LTD CD 4636 10.95 1 Time taken: 46.546 seconds, Fetched: 1 row(s)

Re: Hive on spark

2016-08-01 Thread Mich Talebzadeh
Hi, You can download the pdf from here <https://talebzadehmich.files.wordpress.com/2016/08/hive_on_spark_only.pdf> HTH

Hive transactional table with delta files, Spark cannot read and sends error

2016-08-01 Thread Mich Talebzadeh
/delta_100_100_ drwxr-xr-x - hduser supergroup 0 2016-07-29 21:20 /user/hive/warehouse/accounts.db/payees/delta_101_101_ Spark fails reading this table. What options do I have here? And interestingly Hive running on Spark engine and its works

How can I force Hive to start compaction on a table immediately

2016-08-01 Thread Mich Talebzadeh
Rather than queuing it hive> alter table payees COMPACT 'major'; Compaction enqueued. OK Thanks

Re: Hive transactional table with delta files, Spark cannot read and sends error

2016-08-01 Thread Mich Talebzadeh
Thanks Gopal. I am on Spark 1.6.1 and getting the following error scala> var conn = LlapContext.newInstance(sc, hs2_url); :28: error: not found: value LlapContext var conn = LlapContext.newInstance(sc, hs2_url);

Re: How can I force Hive to start compaction on a table immediately

2016-08-01 Thread Mich Talebzadeh
Thanks Alan. One crude solution would be to copy data from the ACID table to a simple table and present that table to Spark to see the data. This is basically a Spark optimiser issue, not the engine itself. My Hive runs on the Spark query engine and all works fine there. HTH

Vectorised Query Execution extension

2016-08-04 Thread Mich Talebzadeh
to extend it beyond 1024 rows to include the whole column in table? VQE would be very useful especially with ORC as it basically means that one can process the whole column separately thus improving performance of the query. HTH

Re: Crate Non-partitioned table from partitioned table using CREATE TABLE .. LIKE

2016-08-04 Thread Mich Talebzadeh
Do you know the existing table schema? The new table schema will be based on that table without partitioning?

Re: Crate Non-partitioned table from partitioned table using CREATE TABLE .. LIKE

2016-08-04 Thread Mich Talebzadeh
ATE EXTERNAL TABLE sales5 AS SELECT * FROM SALES; FAILED: SemanticException [Error 10070]: CREATE-TABLE-AS-SELECT cannot create external table

Re: Crate Non-partitioned table from partitioned table using CREATE TABLE .. LIKE

2016-08-04 Thread Mich Talebzadeh
Yes, but that essentially copies the metadata and leaves the partition there with no data. It is just an image copy. It won't help in this case.

Re: Crate Non-partitioned table from partitioned table using CREATE TABLE .. LIKE

2016-08-04 Thread Mich Talebzadeh
: bigint, quantity_sold: decimal(10,0), amount_sold: decimal(10,0)] scala> s2.write.mode("overwrite").parquet("/data/stg/newtable/sales5") HTH

Re: hive concurrency not working

2016-08-04 Thread Mich Talebzadeh
you won't have this problem if you use Spark as the execution engine? That handles concurrency OK

Re: hive concurrency not working

2016-08-04 Thread Mich Talebzadeh
You won't have this problem if you use Spark as the execution engine! This set up handles concurrency but Hive with Spark is not part of the HW distro. HTH

Re: hive concurrency not working

2016-08-05 Thread Mich Talebzadeh
great in that case they can try it and I am pretty sure if they are stuck they can come and ask you for expert advice since Hortonworks do not support Hive on Spark and I know that

Re: hive not showing up default database

2016-08-05 Thread Mich Talebzadeh
H Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com *Disclaimer:* Use it at your own risk. Any and all res

Re: hive concurrency not working

2016-08-06 Thread Mich Talebzadeh
wonder whether hive.support.concurrency is set to true with zookeeper running and hive.lock.manager set to org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager HTH

Re: hive concurrency not working

2016-08-07 Thread Mich Talebzadeh
-- I am pretty sure that they will support it because the Spark option is supported Hortonworks support Spark but not Hive on Spark. Their official distro is Hive on Tez + LLAP Not sure where you get your information from though I got it from Hortonworks and I know that

Re: Crate Non-partitioned table from partitioned table using CREATE TABLE .. LIKE

2016-08-07 Thread Mich Talebzadeh
reated without those two columns and of course will not be partitioned. HTH

Re: Crate Non-partitioned table from partitioned table using CREATE TABLE .. LIKE

2016-08-07 Thread Mich Talebzadeh
`prod_id` bigint, `cust_id` bigint, `time_id` timestamp, `channel_id` bigint, `promo_id` bigint, `quantity_sold` decimal(10,0), `amount_sold` decimal(10,0)) *PARTITIONED BY ( `year` int, `month` int)* - Which is not that useful HTH

Re: Re: hive will die or not?

2016-08-08 Thread Mich Talebzadeh
basically Hive thrift server and without it would not exist - Without Hive and HiveContext there would not be Spark-sql I am a fan of Spark and use it extensively. However, you have to consider the use case when talking about a product. HTH

Re: hive session drops tables when restarted

2016-08-08 Thread Mich Talebzadeh
see them netstat -pltenp 'Local|1000|9083' HTH

Re: hiveserver2 start hang

2016-08-09 Thread Mich Talebzadeh
what command did you use to start hiveserver2? $HIVE_HOME/bin/hiveserver2 & is the port 1 used? netstat -alnp|egrep 'Local|10010' HTH

Re: hiver errors

2016-08-10 Thread Mich Talebzadeh
Which version of Hive is it? Can you log in to Hive via the Hive CLI *$HIVE_HOME/bin/hive* Logging initialized using configuration in file:/usr/lib/hive/conf/hive-log4j2.properties hive> HTH

Re: hive will die or not?

2016-08-14 Thread Mich Talebzadeh
which is a Data Warehouse. HTH

Re: hive will die or not?

2016-08-14 Thread Mich Talebzadeh
...I do not agree with you... Yeah right. I am so upset. Was waiting for your nod LOL

Re: Hive logging 2.0.0

2016-08-17 Thread Mich Talebzadeh
binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] Logging initialized using configuration in file:/usr/lib/hive/conf/hive-log4j2.properties hive>

Re: Hive logging 2.0.0

2016-08-18 Thread Mich Talebzadeh
copy $HIVE_HOME/conf/hive-log4j2.properties.template to $HIVE_HOME/conf/hive-log4j2.properties Change the values in that file to WARN etc. For example property.hive.log.level = INFO HTH

Re: Getting column statistics on paritioned Hive tables

2016-08-18 Thread Mich Talebzadeh
ump --rowindex HTH

Re: Inserting data in hive bucket

2016-08-19 Thread Mich Talebzadeh
Hi, You are partitioning by month and bucketing by date or day? If that is the case, you only have 30-31 hash partitions (buckets) for each month? HTH

Re: Inserting data in hive bucket

2016-08-19 Thread Mich Talebzadeh
potentially many (definitely not known until we encounter them all) and if you want to spread them evenly (after all that is what hash partitioning is all about) then I think day of the month makes more sense. HTH
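The point about spreading rows evenly can be illustrated with a minimal Python sketch. It assumes Hive assigns a row to a bucket as (hash & Integer.MAX_VALUE) % number_of_buckets and that an integer column hashes to its own value; both are assumptions about Hive internals rather than something stated in this thread, and the names bucket_for and spread are hypothetical.

```python
from collections import Counter

JAVA_INT_MAX = 0x7FFFFFFF  # Java's Integer.MAX_VALUE


def bucket_for(hash_code: int, num_buckets: int) -> int:
    """Assumed Hive-style rule: (hash & Integer.MAX_VALUE) % num_buckets."""
    return (hash_code & JAVA_INT_MAX) % num_buckets


def spread(keys, num_buckets: int) -> Counter:
    """Count how many distinct keys land in each bucket."""
    return Counter(bucket_for(key, num_buckets) for key in keys)


# Day of the month as the bucketing key: a small, known set of values.
# Assumption: an int column hashes to its own value.
by_day = spread(range(1, 32), 8)

for bucket in sorted(by_day):
    print(bucket, by_day[bucket])
```

With 31 day values and 8 buckets, no bucket differs from another by more than one key, which is the even spread the snippet argues for. Actual row counts per bucket still depend on how many rows share each day.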

Re: Inserting data in hive bucket

2016-08-21 Thread Mich Talebzadeh
Hi Rahul, I don't believe you can drop a particular bucket in Hive HTH

Re: Populating tables using hive and spark

2016-08-22 Thread Mich Talebzadeh
l1| col2| ++--+ | 1|London| | 2|NY| | 3|California| | 4| Dehli| ++--+ So the rows are there. Let me go to Hive again now hive> select * from testme; OK 1 London 2 NY 3 California 4 Dehli hive> analyze tabl

Re: Hive transaction doesn't release lock.

2016-08-22 Thread Mich Talebzadeh
there are issues with locks not being released even when the transaction is aborted. There are still entries in hive_locks. I ended up deleting the row from hive_locks table manually. Not ideal but you know that the lock should not be there as the table is dropped. HTH

Re: Hive transaction doesn't release lock.

2016-08-23 Thread Mich Talebzadeh
Hi Igor, I don't think so. Well I never raised one! HTH

Re: How to remove Hive table property?

2016-08-23 Thread Mich Talebzadeh
Has the table got data in it? Can you create a new table WITHOUT serialization.null.format and INSERT/SELECT from old to new, drop old and rename new to old? If the data is already there then the setting will apply to new rows only. That may be acceptable. HTH

Re: Fw: Hive update operation

2016-08-24 Thread Mich Talebzadeh
Has the underlying table to be updated been defined as transactional? Can you give the update example?

Re: Loading Sybase to hive using sqoop

2016-08-24 Thread Mich Talebzadeh
;) Insert into Hive table sqltext = """ INSERT INTO TABLE dummy SELECT ID , CLUSTERED , SCATTERED , RANDOMISED , RANDOM_STRING , SMALL_VC , PADDING FROM tmp """ HiveContext.sql

Re: Loading Sybase to hive using sqoop

2016-08-24 Thread Mich Talebzadeh
hm. Watching paint dry :)

Re: Loading Sybase to hive using sqoop

2016-08-24 Thread Mich Talebzadeh
Are you using a vendor distro or in-house build?

Re: Loading Sybase to hive using sqoop

2016-08-24 Thread Mich Talebzadeh
out first time and sync Hive table with Sybase IQ table real time. You will need SRS SP 204 or above to make this work. Talk to your DBA if they can get SRS SP from Sybase for this purpose. I have done it many times. I think it is stable enough for this purpose. HTH

Re: Fw: Hive update operation

2016-08-25 Thread Mich Talebzadeh
Hi, What is your current RDBMS and are these SQL the ones used in RDBMS? Have you tried them on Hive? HTH

Re: Fw: Hive update operation

2016-08-25 Thread Mich Talebzadeh
timestamp AND mv.acqnum=t2.acqnum > INNER JOIN table1 t1 on mv.acqnum=t1.deal_number > where t1.deal_number=mv.acqnum; > > OUTPUT: > > " FAILED: ParseException line 1:221 missing EOF at 'FROM' near 'bgps' " > > >

Re: hive 2.1.0 + drop view

2016-08-26 Thread Mich Talebzadeh
shed Status: Finished successfully in 24.11 seconds OK 3325000 hive> drop view v_dummy2; OK HTH

Re: hive 2.1.0 + drop view

2016-08-26 Thread Mich Talebzadeh
Sounds like there are a number of issues with Hive metastore on Postgres. There have been a number of reports on this. HTH

Re: hive 2.1.0 + drop view

2016-08-26 Thread Mich Talebzadeh
You don't really want to mess around with the schema. This is what I have in Oracle 12c schema for TBLS. The same as yours [image: Inline images 1] But this is Oracle, a serious database :) HTH

Re: hive 2.1.0 + drop view

2016-08-26 Thread Mich Talebzadeh
r set). *char(n)* and *varchar(n)* allocate *n* bytes of storage. What character set are you using for your server/database?
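When n counts bytes rather than characters, whether a value fits depends on the character set, which is why the question above matters. A minimal Python sketch follows; the sample string and the helper name storage_bytes are illustrative and not taken from the thread.

```python
def storage_bytes(text: str, encoding: str) -> int:
    """Number of bytes needed to store text in the given character set."""
    return len(text.encode(encoding))


sample = "caf\u00e9"  # four characters, the last one is non-ASCII

for charset in ("latin-1", "utf-8", "utf-16-le"):
    print(charset, len(sample), "chars ->", storage_bytes(sample, charset), "bytes")
```

The same four-character value needs a different number of bytes under each encoding, so a column sized in bytes can reject a value that looks short enough when counted in characters.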

Re: Quota for rogue ad-hoc queries

2016-08-31 Thread Mich Talebzadeh
Try this hive.limit.optimize.fetch.max - Default Value: 5 - Added In: Hive 0.8.0 Maximum number of rows allowed for a smaller subset of data for simple LIMIT, if it is a fetch query. Insert queries are not restricted by this limit. HTH

Re: hive on spark job not start enough executors

2016-09-09 Thread Mich Talebzadeh
when you start hive on spark do you set any parameters for the submitted job (or read them from init file)? set spark.master=yarn; set spark.deploy.mode=client; set spark.executor.memory=3g; set spark.driver.memory=3g; set spark.executor.instances=2; set spark.ui.port=;

Re: Working of HiveServer2

2016-09-13 Thread Mich Talebzadeh
engine. HTH

Re: Sqoop: SQL Server to Hive import

2016-09-14 Thread Mich Talebzadeh
into Hive table. There are other ways of using JDBC say through Spark etc. HTH

Re: ACID transactions on data added from Spark not working

2016-09-14 Thread Mich Talebzadeh
my experience. HTH

Re: Hive On Spark - ORC Table - Hive Streaming Mutation API

2016-09-14 Thread Mich Talebzadeh
; set spark.ui.port=; HTH

Re: Hive 2.x usage

2016-09-14 Thread Mich Talebzadeh
type, then you possibly can sort it out. Loads of times I have seen guys waiting for a vendor's supply or fix that could have been sorted out in a fraction of the time because they could not be bothered to DIY. We are vendor agnostic and so far so good. HTH

Re: Hive on Spark - Mesos

2016-09-15 Thread Mich Talebzadeh
sorry on Yarn only but I gather it should work with Mesos. I don't think that comes into it. The issue is the compatibility of Spark assembly library with Hive. HTH

Re: on duplicate update equivalent?

2016-09-23 Thread Mich Talebzadeh
Hi Vijay, What is the use case for UPSERT in Hive? The functionality does not exist but there are other solutions. Are we talking about a set of dimension tables with primary keys that need to be updated (existing rows) or inserted (new rows)? HTH

Re: on duplicate update equivalent?

2016-09-23 Thread Mich Talebzadeh
es much like Hive. HiveContext in Spark is mapping here to HiveQL var sqltext = "" sqltext = """ SELECT rs.Month, rs.SalesChannel, round(TotalSales,2) As Sales, .... FROM ( SELECT t_t.CALENDAR_MONTH_DESC AS Month, t_c.CHANNEL_DESC AS SalesChannel, SUM(t_s.AMOUN

Re: on duplicate update equivalent?

2016-09-23 Thread Mich Talebzadeh
that I suggested earlier on may serve better. HTH

populating Hive table periodically from files on HDFS

2016-09-25 Thread Mich Talebzadeh
xternal table. This seems to be OK. The other option is to add only new rows since last time with INSERT INTO WHERE rows do not exist in target table. Any other suggestions? Thanks

Re: populating Hive table periodically from files on HDFS

2016-09-25 Thread Mich Talebzadeh
partition Year, Months, Days etc. I thought about bucketing the partitions but one needs to balance the housekeeping with the number of buckets within each partition. So I did not bother. Cheers

Re: populating Hive table periodically from files on HDFS

2016-09-26 Thread Mich Talebzadeh
ction happens nothing can be done to make Spark read data. If my assumptions are incorrect, I stand corrected. Regards

Re: Hive orc use case

2016-09-26 Thread Mich Talebzadeh
I have not encountered this case before. However, you can create a temporary table in Hive put all writes into it, read the rows as needed, and finally append data from the temporary table to ORC once reads are done. HTH

Re: Hive orc use case

2016-09-26 Thread Mich Talebzadeh
until compaction takes place which cannot be forced. I don't know whether there is a way to enforce quick compaction. Thanks

Re: Hive orc use case

2016-09-26 Thread Mich Talebzadeh
alter table payees compact 'minor'; Compaction enqueued. OK It queues compaction but there is no way I can force it to do compaction immediately?

Hive in-memory offerings in forthcoming releases

2016-10-08 Thread Mich Talebzadeh
Hi, Is there any documentation on Apache Hive proposed new releases which is going to offer an in-memory database (IMDB) in the form of LLAP or built on LLAP. Love to see something like SAP ASE IMDB or Oracle 12c in-memory offerings with Hive as well. Regards,

Re: Hive in-memory offerings in forthcoming releases

2016-10-10 Thread Mich Talebzadeh
part is interesting. The primary use case for this capability is to accelerate the analytics part of mixed OLTP and Analytical workloads by eliminating the need for most of the Analytics indexes. This speeds up the analytical queries by a huge amount. HTH

Hive metadata on Hbase

2016-10-23 Thread Mich Talebzadeh
one is going to gain by having Hbase as the Hive metastore? I trust that we can still use our existing schemas on Oracle. HTH

Re: Hive metadata on Hbase

2016-10-24 Thread Mich Talebzadeh
e now relying on HDFS itself plus Hbase as well for persistent storage. So the situation might change. HTH

Re: Hive metadata on Hbase

2016-10-24 Thread Mich Talebzadeh
Hive 2.0.1 Subversion git://reznor-mbp-2.local/Users/sergey/git/hivegit -r e3cfeebcefe9a19c5055afdcbb00646908340694 Compiled by sergey on Tue May 3 21:03:11 PDT 2016 From source with checksum 5a49522e4b572555dbbe5dd4773bc7c2

Re: Hive metadata on Hbase

2016-10-24 Thread Mich Talebzadeh
just to > optimize the query, before even running it. > > I guess another advantage is that using a RDBMS as metastore makes it a > SPOF, unless you setup replication etc. while, HBase would give HA for free. > > > > On Mon, Oct 24, 2016 at 9:06 AM, Mich Talebzadeh <

Re: hiveserver2 java heap space

2016-10-24 Thread Mich Talebzadeh
does this work ok through Hive cli?

Re: Hive metadata on Hbase

2016-10-24 Thread Mich Talebzadeh
l is a non-starter" They already do and pay more if they have to. We will stick with Hive metadata on Oracle with schema on SSD. HTH

Re: Connect metadata

2016-10-25 Thread Mich Talebzadeh
ables and views in your schema and you only need it once and the schema will be populated by hive user that you have specified the details in hive-site.xml HTH

Re: Hive metadata on Hbase

2016-10-25 Thread Mich Talebzadeh
not touch these system tables but things are not generally that simple. Are you using Hbase as Hive metastore now?

Re: Can I specify database name in hive metastore service?

2016-10-26 Thread Mich Talebzadeh

Re: Creating Index and no performance improvements

2016-10-27 Thread Mich Talebzadeh
Have you checked running SQL with EXPLAIN EXTENDED SELECT .. And post the results. In general your compact index will not work HTH

Re: Creating Index and no performance improvements

2016-10-27 Thread Mich Talebzadeh
Hive table in ORC format and partition it by ndate or Dtsramp = '2016-10-27' Then you can do periodic INSERT/SELECT from the external table to ORC table. In that case you will utilise Store Index in Hive. HTH

Happy Diwali to those forum members who celebrate this great festival

2016-10-30 Thread Mich Talebzadeh
Enjoy the festive season. Regards,

Re: Happy Diwali to those forum members who celebrate this great festival

2016-10-30 Thread Mich Talebzadeh
I can hear and see plenty of fireworks in this foggy London tonight :)

Re: Big Data Event London, 3-4th November 2016 from Tomorrow

2016-11-02 Thread Mich Talebzadeh
models. The next generation of Hive with LLAP as an in-memory database (not to be confused with LLAP as the execution engine) is extremely interesting. I am looking forward to asking about that and a host of others :) cheers

Re: import sql file

2016-11-23 Thread Mich Talebzadeh
. Alternatively, use Sqoop to read the RDBMS table and create and import data into a Hive table. You need the JAR file for the relevant RDBMS. HTH

Re: Hive on Spark not working

2016-11-29 Thread Mich Talebzadeh
Hive on Spark engine only works with Spark 1.3.1.

Re: Difference between MANAGED_TABLE and EXTERNAL_TABLE in org.apache.hadoop.hive.metastore.TableType

2016-12-01 Thread Mich Talebzadeh
e etc through ALTER table like below ALTER TABLE ${DATABASE}.EXTERNALMARKETDATA set location 'hdfs://rhes564:9000/data/prices/${TODAY}'; HTH
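The ${DATABASE} and ${TODAY} placeholders in the statement above are presumably filled in by a wrapper script before the statement is sent to Hive. A minimal Python sketch of that substitution is shown here; the database name "test", the ISO date format for ${TODAY} and the helper name render are assumptions for illustration, not details given in the thread.

```python
from datetime import date
from string import Template

# The statement from the snippet, with its two placeholders kept as-is.
STATEMENT = Template(
    "ALTER TABLE ${DATABASE}.EXTERNALMARKETDATA "
    "SET LOCATION 'hdfs://rhes564:9000/data/prices/${TODAY}';"
)


def render(database: str, day: date) -> str:
    """Fill in the placeholders; the YYYY-MM-DD directory name is an assumption."""
    return STATEMENT.substitute(DATABASE=database, TODAY=day.isoformat())


# "test" is a hypothetical database name.
print(render("test", date(2016, 12, 1)))
```

Repointing the external table at a new directory each day only changes metadata, so the rendered statement can be run once per load without moving any files.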

Parquet timestamp storage in Hive and impact when using Impala to read timestampvalues

2016-12-02 Thread Mich Talebzadeh
goes directly to Parquet files. there is an impact to business. my suggestion is that if they want performant reads they should use Spark SQL on Hive. it will always get the same values as stored by Hive

Re: Specifying orc.stripe.size in Spark

2016-12-18 Thread Mich Talebzadeh
.filter.columns"="ID", "orc.bloom.filter.fpp"="0.05", "orc.stripe.size"="268435456", "orc.row.index.stride"="1" ) """ HiveContext.sql(sqltext) sqltext = """ INSERT INTO TABLE test.dummy2 SELECT

Vectorised Queries in Hive

2017-01-10 Thread Mich Talebzadeh
confirms this please? Thanks

VARCHAR or STRING fields in Hive

2017-01-16 Thread Mich Talebzadeh
STRING. What is the thread view on this? Thanks

Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread Mich Talebzadeh
in HDFS compared to STRING columns? Cheers

Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread Mich Talebzadeh
opposed to String make any difference in terms of storage efficiency? Regards

Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread Mich Talebzadeh
Sounds like VARCHAR and CHAR types were created for Hive to have ANSI SQL Compliance. Otherwise they seem to be practically the same as String types. HTH

Parquet tables with snappy compression

2017-01-25 Thread Mich Talebzadeh
Hi, Has there been any study of how much compressing Hive Parquet tables with snappy reduces storage space or simply the table size in quantitative terms? Thanks

Re: Difference between join and inner join

2017-02-13 Thread Mich Talebzadeh
join is by default inner join as in Oracle or Sybase. HTH

Using Sqoop to get data from Impala/Hive to another Hive table

2017-02-21 Thread Mich Talebzadeh
Hi, I have not tried this but someone mentioned that it is possible to use Sqoop to get data from one Impala/Hive table in one cluster to another? The clusters are in different zones. This is to test the cluster. Has anyone done such a thing? Thanks

Re: Using Sqoop to get data from Impala/Hive to another Hive table

2017-02-21 Thread Mich Talebzadeh
. this is not really a test is it?

Re: Using Sqoop to get data from Impala/Hive to another Hive table

2017-02-21 Thread Mich Talebzadeh
regardless there is no point using Sqoop for such purpose. it is not really designed for it :)

Real time data streaming into Hive text table and excessive amount of file numbers

2017-03-26 Thread Mich Talebzadeh
op.db/t/00_0_copy_1544 So I was wondering what are the best ways of compacting these files? Is there any detriment when the number of these files grows very high, such as 1000s of them? Thanks

beeline connection to Hive using both Kerberos and LDAP with SSL

2017-04-07 Thread Mich Talebzadeh

Re: beeline connection to Hive using both Kerberos and LDAP with SSL

2017-04-30 Thread Mich Talebzadeh
Thanks Kapil. Does this mean that one can have both Kerberos and LDAP (with SSL) and use either? Cheers, Mich

Re: beeline connection to Hive using both Kerberos and LDAP with SSL

2017-05-02 Thread Mich Talebzadeh
So it translates to either LDAP or Kerberos, we cannot enable both for same Hive Server. SSL is independent. So the supported situations are as below. 1. Anonymous authentication (w/ or w/o SSL) 2. LDAP authentication (w/ or w/o SSL) 3. Kerberos Cheers
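The three supported situations listed above map onto different beeline connection strings. A minimal Python sketch that composes a HiveServer2 JDBC URL for each case is given below; the host name, realm and the helper name hs2_url are hypothetical, and truststore parameters are left out for brevity. For LDAP the user name and password are supplied separately (for example with beeline's -n and -p options), so the URL itself carries no extra property.

```python
from typing import Optional


def hs2_url(
    host: str,
    port: int = 10000,
    database: str = "default",
    *,
    kerberos_principal: Optional[str] = None,
    ssl: bool = False,
) -> str:
    """Compose a HiveServer2 JDBC URL; session properties are ';'-separated."""
    parts = [f"jdbc:hive2://{host}:{port}/{database}"]
    if kerberos_principal:
        # Kerberos: the server principal is passed in the URL.
        parts.append(f"principal={kerberos_principal}")
    if ssl:
        # SSL is independent of the authentication mode.
        parts.append("ssl=true")
    return ";".join(parts)


# Hypothetical host and realm, for illustration only.
print(hs2_url("hs2.example.com"))                 # anonymous or LDAP, no SSL
print(hs2_url("hs2.example.com", ssl=True))       # anonymous or LDAP, with SSL
print(hs2_url("hs2.example.com", kerberos_principal="hive/_HOST@EXAMPLE.COM"))
```

Keeping SSL as a separate flag mirrors the point in the snippet that encryption is configured independently of whether the server authenticates through LDAP or Kerberos.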
