Re: Hive SQL query

2016-04-07 Thread Mich Talebzadeh
e -u jdbc:hive2://HOST:PORT/default org.apache.hive.jdbc.HiveDriver -n hduser -p xx HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>

Work on Spark engine for Hive

2016-04-08 Thread Mich Talebzadeh
Hi, Is there any scheduled work to enable Hive to use recent versions of the Spark engine? This is becoming an issue, as some applications have to rely on the MapReduce engine for operations on Hive 2, which is serial and slow. Thanks

Re: Work on Spark engine for Hive

2016-04-08 Thread Mich Talebzadeh
This is a different thing. The question is when Hive 2 will be able to run on Spark 1.6.1 installed binaries as its execution engine.

Re: Work on Spark engine for Hive

2016-04-08 Thread Mich Talebzadeh
> d goal we will have to do eventually. I was not aware that it is not working to be honest. Can you let us know what is broken on Hive 2 on Spark 1.6.1? Preferably via filing a JIRA on HIVE side? On Fri, Apr 8, 2016 at 7:47 AM, Mich Talebzadeh wrote:

Re: Hive 0.14 schema evolution for orc table

2016-04-08 Thread Mich Talebzadeh
l = 'abc'; FAILED: SemanticException [Error 10122]: Bucketized tables do not support INSERT INTO: Table: test.dummy2 Check the discussion in the Hive user mailing list here <http://mail-archives.apache.org/mod_mbox/hive-user/201512.mbox/%3cd290a400.39823%25ekoif...@hortonworks.com%3E>

Re: ORC file sort order ..

2016-04-09 Thread Mich Talebzadeh
) INTO 256 BUCKETS STORED AS ORC TBLPROPERTIES ( "orc.create.index"="true", "orc.bloom.filter.columns"="ID", "orc.bloom.filter.fpp"="0.05", "orc.compress"="SNAPPY", "orc.stripe.size"="16777216",

Moving Hive metastore to Solid State Disks

2016-04-17 Thread Mich Talebzadeh
dedicated to Hive. HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com

Re: Moving Hive metastore to Solid State Disks

2016-04-17 Thread Mich Talebzadeh
ity. On 17 Apr 2016, at 11:52, Mich Talebzadeh wrote: > Hi, I have had my Hive metastore database on Oracle 11g supporting concurrency (with added transactional capability). Over the past few days I created a new schema on Oracle 12c on Solid State D

Re: Insert after typecast fails for Timestamp

2016-04-18 Thread Mich Talebzadeh
inished Stage-4_0: 1/1 Finished Status: Finished successfully in 2.26 seconds Loading data to table default.dummy OK Time taken: 2.586 seconds

Re: [VOTE] Bylaws change to allow some commits without review

2016-04-18 Thread Mich Talebzadeh
+1 On 18 April 2016 at 18:24, Alan Gates wrote:

Re: Mappers spawning Hive queries

2016-04-18 Thread Mich Talebzadeh
What is the version of Hive and the execution engine (MR, Tez, Spark)? HTH

Hive footprint

2016-04-18 Thread Mich Talebzadeh
FS, a good engine for Hive (sounds like many prefer TEZ although I am a Spark fan) and the ubiquitous YARN. Let me know your thoughts.

Re: Hive footprint

2016-04-18 Thread Mich Talebzadeh
Thanks Marcin. What is the definition of low latency here? Are you referring to the performance of SQL against HBase tables compared to Hive? As I understand it, HBase is a columnar database. Would it be possible to use Hive against ORC to achieve the same?

Re: Hive footprint

2016-04-19 Thread Mich Talebzadeh
itmap| | +---+---+---+--+---+--+--+

Re: Hive footprint

2016-04-19 Thread Mich Talebzadeh
BTW what is the situation with Impala?

Re: Standard Deviation in Hive 2 is still incorrect

2016-04-19 Thread Mich Talebzadeh
Will do, thanks

Re: Hive footprint

2016-04-20 Thread Mich Talebzadeh
On 20 April 2016 at 13:07, Sabarish Sasidharan wrote:

Re: Hive footprint

2016-04-20 Thread Mich Talebzadeh
Hi, If I may, I would also like to see where the Hive optimizer shows that it is used with explain ... or other means. It will be interesting. HTH

Re: Standard Deviation in Hive 2 is still incorrect

2016-04-21 Thread Mich Talebzadeh
HIVE-13574 <https://issues.apache.org/jira/browse/HIVE-13574> created and assigned to myself. Thanks

Re: Hive footprint

2016-04-21 Thread Mich Talebzadeh
This simply does not work, but we need to make Hive use external indexes. This is a must.

Jira Hive-13574 raised to resolve Standard deviation calculation in Hive

2016-04-21 Thread Mich Talebzadeh
Hi, Jira HIVE-13574 <https://issues.apache.org/jira/browse/HIVE-13574> has been raised to resolve the Hive standard deviation function STDDEV(), which is incorrect at the moment. Please vote for it. Thanks

Hive external indexes incorporation into Hive CBO

2016-04-21 Thread Mich Talebzadeh
we are down the road with Work In Progress on this. However, I am happy to help with this. HTH

Re: Hive external indexes incorporation into Hive CBO

2016-04-21 Thread Mich Talebzadeh
Kindly provide an example where EXPLAIN SELECT ... shows external index usage. That would be great. Choose your table and block size. Thanks

Re: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-23 Thread Mich Talebzadeh
user_parameters t2 JOIN user_details t1 ON t2.user_id = t1.user_id;

Re: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-23 Thread Mich Talebzadeh
Thanks, I may have missed something. Deepak might clarify. Cheers

Re: Hive footprint

2016-04-25 Thread Mich Talebzadeh
em by investing in the existing tools rather than trying to fragment it further. There seems to be little effort in this area, for reasons that I may not be aware of. However, I am more than happy to contribute to this case. Kind regards, Mich

Re: Hive TTransportException - Create Table

2016-04-27 Thread Mich Talebzadeh
ary tables (private to that session). A DDL in any database is a heavy operation; if you can truncate or overwrite the existing tables, that would be prudent. HTH

Re: Sqoop_Sql_blob_types

2016-04-27 Thread Mich Talebzadeh
Is the source of data Oracle?

Hive configuration parameter hive.enforce.bucketing does not exist in Hive 2

2016-04-29 Thread Mich Talebzadeh
Is the parameter --set hive.enforce.bucketing = true; deprecated in Hive 2, as it causes hql code not to work? hive> set hive.enforce.bucketing = true; Query returned non-zero code: 1, cause: hive configuration hive.enforce.bucketing does not exists.

Re: Issue with correlated subqueries being case-sensitive

2016-04-29 Thread Mich Talebzadeh
Why not just try the standard way: SELECT * FROM P WHERE EXISTS(SELECT 1 FROM B WHERE P.ID = B.ID) You don't need '*'; that is not standard SQL as far as I know. HTH
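
For anyone who wants to try the correlated EXISTS form quickly, here is a minimal sketch. It uses Python's built-in sqlite3 purely as a stand-in for Hive, and the contents of tables P and B are made up for illustration, so treat it as a demonstration of the SQL semantics only, not of Hive behaviour.

```python
import sqlite3

# Hypothetical tables P and B mirroring the names in the query above;
# SQLite stands in for Hive purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE P (ID INTEGER, NAME TEXT);
    CREATE TABLE B (ID INTEGER);
    INSERT INTO P VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO B VALUES (2), (3), (3), (4);
""")

# Correlated EXISTS: a row of P is kept if at least one B row matches it.
exists_rows = conn.execute(
    "SELECT * FROM P WHERE EXISTS (SELECT 1 FROM B WHERE P.ID = B.ID) ORDER BY ID"
).fetchall()

# Equivalent IN form, shown only for comparison.
in_rows = conn.execute(
    "SELECT * FROM P WHERE ID IN (SELECT ID FROM B) ORDER BY ID"
).fetchall()

print(exists_rows)
print(in_rows)
```

Note that the duplicate ID in B does not duplicate rows of P in either form, which is the main difference from a plain join.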

Re: Issue with correlated subqueries being case-sensitive

2016-04-29 Thread Mich Talebzadeh
ts/Not Exists operator SubQuery must be Correlated. As a workaround, this works but is not that efficient: hive> select count(1) from smallsales where PROD_ID IN (SELECT PROD_ID FROM sales_staging); HTH

Re: Disable Hive autogather optimization

2016-04-29 Thread Mich Talebzadeh
ITE operation is involved in an existing table, then column stats kick in and that adds to the timing of the process? Sounds like it is a general feature and can be disabled as part of table struct.

Re: Hive configuration parameter hive.enforce.bucketing does not exist in Hive 2

2016-04-29 Thread Mich Talebzadeh
Well, having it in the old code causes the query to crash as well!

Re: Hive configuration parameter hive.enforce.bucketing does not exist in Hive 2

2016-04-29 Thread Mich Talebzadeh
Unfortunately that needs to be done or, better, the whole line removed in every hql code where it is set to true. Thanks
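
Removing that line from many hql files by hand is tedious, so a small script may help. The sketch below is only an assumption about how one might do it: the function name strip_enforce_bucketing and the sample text are hypothetical, and it works on a string, leaving file handling to the reader.

```python
import re

# Matches a whole line of the form "set hive.enforce.bucketing = true;",
# ignoring case and spacing, together with its trailing newline.
_ENFORCE_BUCKETING = re.compile(
    r"^[ \t]*set[ \t]+hive\.enforce\.bucketing[ \t]*=[ \t]*true[ \t]*;[ \t]*\r?\n?",
    re.IGNORECASE | re.MULTILINE,
)


def strip_enforce_bucketing(hql: str) -> str:
    """Return hql with every 'set hive.enforce.bucketing = true;' line removed."""
    return _ENFORCE_BUCKETING.sub("", hql)


# Hypothetical script contents used only to exercise the function.
sample = (
    "set hive.enforce.bucketing = true;\n"
    "set hive.exec.dynamic.partition = true;\n"
    "SELECT COUNT(1) FROM sales;\n"
)
cleaned = strip_enforce_bucketing(sample)
print(cleaned)
```

Other settings and the queries themselves are left untouched; only lines that set this one parameter to true are dropped.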

Re: Disable Hive autogather optimization

2016-04-29 Thread Mich Talebzadeh
Hopefully that will turn off the autogather feature for existing tables. HTH

Re: Disable Hive autogather optimization

2016-04-29 Thread Mich Talebzadeh
Apologies, should read "Udit"

Re: Hive configuration parameter hive.enforce.bucketing does not exist in Hive 2

2016-04-29 Thread Mich Talebzadeh
Ok thanks Lefty

Making sqoop import use Spark engine as opposed to MapReduce for Hive

2016-04-30 Thread Mich Talebzadeh
Hi, What is the simplest way of making sqoop import use the Spark engine, as opposed to the default MapReduce, when putting data into a Hive table? I did not see any parameter for this in the sqoop command line doc. Thanks

Re: Making sqoop import use Spark engine as opposed to MapReduce for Hive

2016-04-30 Thread Mich Talebzadeh
e.execution.engine=spark does not matter. Sqoop seems to internally set hive.execution.engine=mr anyway. Maybe there should be an option --hive-execution-engine='mr/tez/spark' etc in the above command? Cheers, Mich

Re: Making sqoop import use Spark engine as opposed to MapReduce for Hive

2016-04-30 Thread Mich Talebzadeh
Yes, I was thinking of that: use Spark to load JDBC data from Oracle and flush it into an ORC table in Hive. I am now using Spark 1.6.1, and the JDBC driver, as I recall (I raised a thread for it), is throwing an error. This was working under Spark 1.5.2. Cheers

Re: Making sqoop import use Spark engine as opposed to MapReduce for Hive

2016-04-30 Thread Mich Talebzadeh
into temp table. The code actually creates the Hive ORC table in the Hive database and populates it from the temp table. See how it goes

Spark Streaming, Batch interval, Windows length and Sliding Interval settings

2016-05-04 Thread Mich Talebzadeh
on what is being measured. However, I believe having sliding interval = batch interval makes sense? Appreciate any views on this. Thanks,

Re: Spark Streaming, Batch interval, Windows length and Sliding Interval settings

2016-05-05 Thread Mich Talebzadeh
Any ideas/experience on this?

Re: Predicates for 'like' and 'between' operators to custom storage handler.

2016-05-05 Thread Mich Talebzadeh
row selected (153.959 seconds) So it does work HTH

Re: Predicates for 'like' and 'between' operators to custom storage handler.

2016-05-05 Thread Mich Talebzadeh
Hi, Do you have the equivalent of that operation in pure SQL? Also, have you tried the Spark query tool with the Hive table? I gather you are doing this through Java?

Re: Predicates for 'like' and 'between' operators to custom storage handler.

2016-05-05 Thread Mich Talebzadeh
yyy-MM-dd')) AS TransactionDate

Hive-Hbase vs Phoenix-Hbase

2016-05-05 Thread Mich Talebzadeh
elies on memory (what else) to speed up this process. Hive on a newer engine can do most of this these days. Thanks

Re: NullPointerException when dropping database backed by S3

2016-05-06 Thread Mich Talebzadeh
in your metastore. Cheers,

Re: Create external table

2016-05-10 Thread Mich Talebzadeh
| NULL | | year | int | | | month| string| | +--+---+-------+--+ 13 rows selected (0.13 seconds) 0: jdbc:hive2:

Re: Create external table

2016-05-10 Thread Mich Talebzadeh
Yes, but the table then exists, correct? I mean the second time. Did you try use default; drop table if exists trips; It is still registered within the Hive metadata as an existing table.

Re: clustered bucket and tablesample

2016-05-14 Thread Mich Talebzadeh
vel in Hive, the number of partitions/files will be fixed. In contrast, with partitioning you do not have this limitation. Can you do show create table X and send the output, please? Thanks

Re: clustered bucket and tablesample

2016-05-14 Thread Mich Talebzadeh
stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.l

Re: Query Failing while querying on ORC Format

2016-05-14 Thread Mich Talebzadeh
Check this thread: "alter table add columns aternatives or hive refresh". That might help. HTH

Re: Query Failing while querying on ORC Format

2016-05-15 Thread Mich Talebzadeh
Hi Mahender, Please check this thread https://mail.google.com/mail/#search/alter+table+add+columns+aternatives+or+hive+refresh/153fe59e7c2970b2 HTH

Re: clustered bucket and tablesample

2016-05-15 Thread Mich Talebzadeh
mn is unpredictable. With integer it is fine. I believe there is an underlying bug here. Another alternative is to use an integer as a surrogate column for hash partitioning, like a sequence in Oracle or identity in Sybase/MSSQL. HTH

Re: Query Failing while querying on ORC Format

2016-05-16 Thread Mich Talebzadeh
spark. thanks http://permalink.gmane.org/gmane.comp.lang.scala.spark.user/32484 | 10 Apr 12:41 201

Re: Query Failing while querying on ORC Format

2016-05-17 Thread Mich Talebzadeh
Hi Mahendar, That version 1.2 is reasonable. One alternative is to create a new table (new_table) in Hive, stored as ORC etc., with the columns from old_table plus the added column new_column. Then do an INSERT/SELECT from old_table to new_table: INSERT INTO new_table SELECT *,
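
The copy-into-a-new-table idea can be tried on a toy scale. The sketch below uses Python's sqlite3 as a stand-in for Hive, so there is no ORC storage involved, and the column layout of old_table and the NULL default for new_column are assumptions made only for illustration.

```python
import sqlite3

# SQLite stands in for Hive here; the table layouts are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_table (id INTEGER, name TEXT);
    INSERT INTO old_table VALUES (1, 'a'), (2, 'b');

    -- new_table has the columns of old_table plus the added column.
    CREATE TABLE new_table (id INTEGER, name TEXT, new_column TEXT);

    -- Assumption: existing rows get NULL for the added column.
    INSERT INTO new_table SELECT *, NULL FROM old_table;
""")

rows = conn.execute("SELECT * FROM new_table ORDER BY id").fetchall()
print(rows)
```

Once the copy is verified, the old table can be dropped and the new one renamed, which keeps the original data intact until the very end.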

Re: Query Failing while querying on ORC Format

2016-05-17 Thread Mich Talebzadeh
I am afraid AFAIK the old partitions cannot be modified as they are fixed in size. That is the existing partition file. I agree this is very tedious. We should come up with a more flexible design for ORC tables. HTH

Re: Hive setup on Hadoop cluster

2016-05-18 Thread Mich Talebzadeh
e etc. You also need to set up environment variables for both Hadoop and Hive in your start-up script, like .profile, .kshrc etc. Have a look anyway. HTH

Re: Hive setup on Hadoop cluster

2016-05-18 Thread Mich Talebzadeh
Hi John, can you please start a new thread for your problem so we can deal with it separately? Thanks

Re: Missing HIVE Execution JAR

2016-05-18 Thread Mich Talebzadeh
How about CLASSPATH?

Re: Hive setup on Hadoop cluster

2016-05-18 Thread Mich Talebzadeh
Hi John, I see this error: Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 FAIL. Can you check whether you have a problem with Hadoop storage or an issue with your user, say hduser, on Linux? HTH

Re: Missing HIVE Execution JAR

2016-05-18 Thread Mich Talebzadeh
I don't use Windows, but check bin/hive.cmd for environment variables.

Re: Hive system catalog

2016-05-18 Thread Mich Talebzadeh
Hi Braj, Any tool, GUI or OS level, can log in and see the schema created for Hive. For example, my metadata for Hive is on Oracle and I can use SQL Developer Data Model to create a logical model from the physical model. HTH

Re: Hive system catalog

2016-05-19 Thread Mich Talebzadeh
The Hive 2 metastore with concurrency capability has 194 tables, 127 views and 38 relationships for a metastore created on Oracle 12c. I have created an Entity-Relationship diagram but need to decide in what format to post it.

Re: Hive setup on Hadoop cluster

2016-05-19 Thread Mich Talebzadeh
ndLoadMain(LauncherHelper.java:486) However, it sounds like you may have an issue with yarn container memory. How big is the underlying table? Also, can you just do a plain select count(1) from the table itself (no distinct etc.) and see if it works? HTH

Hive 2 database Entity-Relationship Diagram

2016-05-19 Thread Mich Talebzadeh
pr=1> Fairly big diagram in PDF format. However, you can zoom into it. Please have a look; I would appreciate comments, and if it is useful we can load it into the wiki. HTH

Re: Hive 2 database Entity-Relationship Diagram

2016-05-19 Thread Mich Talebzadeh
down Thanks

Re: Hive 2 database Entity-Relationship Diagram

2016-05-19 Thread Mich Talebzadeh
Thanks. This is the list of tables and views

Re: Unable to pick data from subdirectories into hive table in CDH 5.3.3

2016-05-19 Thread Mich Talebzadeh
Hi, I am not familiar with CDH, but in a default set-up, the hive directory is under hdfs://

Re: Unable to pick data from subdirectories into hive table in CDH 5.3.3

2016-05-19 Thread Mich Talebzadeh
Agreed, but it still needs to know where the hive top node directory starts from, which is normally under ../../ warehouse

Compatibility of Hive 2 with TEZ

2016-05-21 Thread Mich Talebzadeh
Hi, I see in a matrix that Hive 2 is compatible with Tez 0.8.2 as its execution engine. Can someone verify this please, as I am trying to test Hive 2 with Tez? I normally use Hive 2 on the Spark 1.3 engine fine. Thanks

Hive 2.0 on Spark 1.6.1 Engine

2016-05-21 Thread Mich Talebzadeh
ure anyone has tried this?

Re: Hive and XML

2016-05-22 Thread Mich Talebzadeh
That is interesting. DBs like MarkLogic are adept at this. BTW how do you define your base Hive table for XML and what table type have you used? HTH

Re: Hive 2 Metastore Entity-Relationship Diagram, Base tables

2016-05-22 Thread Mich Talebzadeh
for now to be used as a quick reference for hive metadata tables, columns, pk and constraint. It only covers the base tables excluding transactional add ons in hive-txn-schema-2.0.0.oracle.sql HTH

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-05-23 Thread Mich Talebzadeh
Have a look at this thread

Re: Using Spark as execution engine for Hive

2016-05-23 Thread Mich Talebzadeh
Hi Sharath, See this thread: "Using Spark on Hive with Hive also using Spark as its execution engine". HTH

Re: Compatibility of Hive 2 with TEZ

2016-05-23 Thread Mich Talebzadeh
Thanks Seth.

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-05-24 Thread Mich Talebzadeh
very large table from Oracle to Hive and decided to use Spark 1.6.1 with Hive 2 on Spark 1.3.1 and that worked fine. We just used JDBC connection with temp table and it was good. We could have used sqoop but decided to settle for Spark so it all depends on use case. HTH

Hive 2 loss of connection to metadstore and multiple connections/disconnect in the same session

2016-05-24 Thread Mich Talebzadeh
connections: 2 2016-05-24T16:16:44,864 INFO [0deb842d-9b15-4dd9-8d60-0e198a9d3865 0deb842d-9b15-4dd9-8d60-0e198a9d3865 main]: hive.metastore (HiveMetaStoreClient.java:open(505)) - Connected to metastore. Thanks

Re: Insert query with selective columns in Hive

2016-05-24 Thread Mich Talebzadeh
only col4 hive> insert into testme (col4) values(6); Loading data to table test.testme OK HTH

Re: Any way in hive to have functionality like SQL Server collation on Case sensitivity

2016-05-24 Thread Mich Talebzadeh
r_expression1 is equal to char_expression2 or uchar_expression2. - -1 – indicates that char_expression1 or uchar_expression1 is less than char_expression2 or uchar_expression2. hive> select compare("aaa", "bbb"); FAILED: SemanticException [Error 100

Re: Copying all Hive tables from Prod to UAT

2016-05-25 Thread Mich Talebzadeh
duct to do the same. I am not sure vendors parallelise this sort of thing. HTH

Hive and using Pooled Connections

2016-05-25 Thread Mich Talebzadeh
the reuse of connection objects and reduce the number of times that connection objects are created. Connection pools significantly improve performance for database-intensive applications because creating connection objects is costly both in terms of time and resources. Thanks
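
To make the pooling idea concrete, here is a minimal sketch of a fixed-size pool. It is not how Hive or DataNucleus implement pooling; the ConnectionPool class is hypothetical and uses Python's sqlite3 only so that the example runs anywhere.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """A tiny fixed-size pool: connections are created once and then reused."""

    def __init__(self, size: int, database: str = ":memory:"):
        self.created = 0
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets a pooled connection move between threads.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))
            self.created += 1

    @contextmanager
    def connection(self):
        conn = self._idle.get()       # blocks while every connection is in use
        try:
            yield conn
        finally:
            self._idle.put(conn)      # hand it back to the pool instead of closing it


pool = ConnectionPool(size=2)
results = []
for i in range(10):
    with pool.connection() as conn:
        results.append(conn.execute("SELECT ?", (i,)).fetchone()[0])

print(results, pool.created)
```

Ten queries run here while only two connection objects are ever created, which is where the saving in time and resources comes from.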

Re: Copying all Hive tables from Prod to UAT

2016-05-26 Thread Mich Talebzadeh
be an option. NAS is better as it saves scp and copy across, with the target having enough external space to get the files in. A more useful tool would be to export the full Hive database in binary format and import it in the target. Cheers

Re: Hive and using Pooled Connections

2016-05-26 Thread Mich Talebzadeh
mittent "No such lock.." and "No such transaction..." errors. Setting "datanucleus.connectionPoolingType=DBCP" is recommended in this case. So I changed the setting to DBCP. Don't know how useful it is going to be. Regards,

Re: Test

2016-05-29 Thread Mich Talebzadeh
yep

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-05-29 Thread Mich Talebzadeh
course Spark has both plus in-memory capability. It would be interesting to see what version of TEZ works as execution engine with Hive. Vendors are divided on this (use Hive with TEZ) or use Impala instead of Hive etc as I am sure you already know. Cheers,

Anyone successfully deployed Hive on TEZ engine?

2016-05-29 Thread Mich Talebzadeh
Please bear in mind that I am talking about your own build, not anything that comes as part of a vendor's package. If so, kindly specify both Hive and TEZ versions. Thanks

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-05-29 Thread Mich Talebzadeh
Thanks. I think the problem is that the TEZ user group is exceptionally quiet. Just sent an email to the Hive user group to see if anyone has managed to build a vendor-independent version.

Re: Anyone successfully deployed Hive on TEZ engine?

2016-05-30 Thread Mich Talebzadeh
Thanks Damien. I tried TEZ 0.8.2 with Hive 2, although I did not persevere. When you say "Not stable", are you referring to using it with YARN etc.? In short, in the simplest set-up, which Resource Manager does it work with? Cheers

Re: Anyone successfully deployed Hive on TEZ engine?

2016-05-30 Thread Mich Talebzadeh
to make it work as I have hive on spark engine as well. Please tell me what version of tez and yarn etc. Thanks

Re: Does hive need exact schema in Hive Export/Import?

2016-05-30 Thread Mich Talebzadeh
select count(1) from test.sales_staging; exit;

Re: Anyone successfully deployed Hive on TEZ engine?

2016-05-30 Thread Mich Talebzadeh
Hi Gopal, please see my correspondence about Tez in the tez user group. I forwarded it to the hive user group. Thanks

Re: Does hive need exact schema in Hive Export/Import?

2016-05-30 Thread Mich Talebzadeh
oup 1588 2016-05-25 16:46 hdfs://rhes564:9000/export/_metadata drwxr-xr-x - hduser supergroup 0 2016-05-25 16:46 hdfs://rhes564:9000/export/data and uses the metadata file to create the target table which somehow does not work in this case! HTH

Re: SHOW DATABASES/TABLES with SQL standard authorization

2016-05-30 Thread Mich Talebzadeh
have access rights to that database. HTH

Re: SHOW DATABASES/TABLES with SQL standard authorization

2016-05-30 Thread Mich Talebzadeh
with no access right given? -- 1> use ASEIMDB 2> go Msg 10351, Level 14, State 1: Server 'SYB_157', Line 1: Server user id 24 is not a valid user in database 'ASEIMDB' HTH

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-05-30 Thread Mich Talebzadeh
another stack like Tez. Cloudera supports Impala instead of Hive but it is not something I have used. HTH

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-05-30 Thread Mich Talebzadeh
data). 80-20 rule? In reality maybe just 2TB or the most recent partitions etc. The rest is cold data. Cheers

Re: Why does the user need write permission on the location of external hive table?

2016-05-31 Thread Mich Talebzadeh
Is this location correct and valid? LOCATION '/data/SentimentFiles/*SentimentFiles*/upload/data/tweets_raw/'
