from:"\"Mich Talebzadeh\""

Re: Hive SQL query

2016-04-07 Thread Mich Talebzadeh

e -u jdbc:hive2://HOST:PORT/default org.apache.hive.jdbc.HiveDriver -n hduser -p xx HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>

Work on Spark engine for Hive

2016-04-08 Thread Mich Talebzadeh

Hi, Is there any scheduled work to enable Hive to use recent version of Spark engines? This is becoming an issue as some applications have to rely on MapR engine to do operations on Hive 2 which is serial and slow. Thanks Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view

Re: Work on Spark engine for Hive

2016-04-08 Thread Mich Talebzadeh

This is a different thing. The question is when Hive 2 will be able to run on Spark 1.6.1 installed binaries as its execution engine. Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view

Re: Work on Spark engine for Hive

2016-04-08 Thread Mich Talebzadeh

d goal we will have to do eventually. I was not aware > that it is not working to be honest. > > Can you let us know what is broken on Hive 2 on Spark 1.6.1? Preferably > via filing a JIRA on HIVE side? > > On Fri, Apr 8, 2016 at 7:47 AM, Mich Talebzadeh > wrote: >

Re: Hive 0.14 schema evolution for orc table

2016-04-08 Thread Mich Talebzadeh

l = 'abc'; FAILED: SemanticException [Error 10122]: Bucketized tables do not support INSERT INTO: Table: test.dummy2 Check the discussion in Hive user mailing list here <http://mail-archives.apache.org/mod_mbox/hive-user/201512.mbox/%3cd290a400.39823%25ekoif...@hortonworks.com%3E&g

Re: ORC file sort order ..

2016-04-09 Thread Mich Talebzadeh

) INTO 256 BUCKETS*STORED AS ORC TBLPROPERTIES ( *"orc.create.index"="true","orc.bloom.filter.columns"="ID","* orc.bloom.filter.fpp"="0.05", "orc.compress"="SNAPPY", "orc.stripe.size"="16777216", &

Moving Hive metastore to Solid State Disks

2016-04-17 Thread Mich Talebzadeh

dedicated to Hive. HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com

Re: Moving Hive metastore to Solid State Disks

2016-04-17 Thread Mich Talebzadeh

ity. > > On 17 Apr 2016, at 11:52, Mich Talebzadeh > wrote: > > Hi, > > I have had my Hive metastore database on Oracle 11g supporting concurrency > (with added transactional capability) > > Over the past few days I created a new schema on Oracle 12c on Solid State > D

Re: Insert after typecast fails for Timestamp

2016-04-18 Thread Mich Talebzadeh

inished Stage-4_0: 1/1 Finished Status: Finished successfully in 2.26 seconds Loading data to table default.dummy OK Time taken: 2.586 seconds Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com

Re: [VOTE] Bylaws change to allow some commits without review

2016-04-18 Thread Mich Talebzadeh

+1 Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 18 April 2016 at 18:24, Alan Gates wrote:

Re: Mappers spawning Hive queries

2016-04-18 Thread Mich Talebzadeh

What is the version of Hive and the execution engine (MR, Tez, Spark)? HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*

Hive footprint

2016-04-18 Thread Mich Talebzadeh

FS, a good engine for Hive (sounds like many prefer TEZ although I am a Spark fan) and the ubiquitous YARN. Let me know your thoughts. Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/pr

Re: Hive footprint

2016-04-18 Thread Mich Talebzadeh

Thanks Marcin. What is the definition of low latency here? Are you referring to the performance of SQL against HBase tables compared to Hive. As I understand HBase is a columnar database. Would it be possible to use Hive against ORC to achieve the same? Dr Mich Talebzadeh LinkedIn * https

Re: Hive footprint

2016-04-19 Thread Mich Talebzadeh

itmap| | +---+---+---+--+---+--+--+ Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdO

Re: Hive footprint

2016-04-19 Thread Mich Talebzadeh

BTW what is the situation with Impala? Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 19 Apri

Re: Standard Deviation in Hive 2 is still incorrect

2016-04-19 Thread Mich Talebzadeh

Will do thanks Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 19 April 2016 at 23:33, Alan

Re: Hive footprint

2016-04-20 Thread Mich Talebzadeh

Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 20 April 2016 at 13:07, Sabarish Sasidharan wrote:

Re: Hive footprint

2016-04-20 Thread Mich Talebzadeh

Hi, If I may, I would also like to see where the Hive optimizer shows that it is used with explain ... or other means. It will be interesting. HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/prof

Re: Standard Deviation in Hive 2 is still incorrect

2016-04-21 Thread Mich Talebzadeh

HIVE-13574 <https://issues.apache.org/jira/browse/HIVE-13574> Created and assigned to myself Thanks Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profil

Re: Hive footprint

2016-04-21 Thread Mich Talebzadeh

This simply does not work but we need to make Hive use external indexes. This is a must Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*

Jira Hive-13574 raised to resolve Standard deviation calculation in Hive

2016-04-21 Thread Mich Talebzadeh

Hi, Jira HIVE-13574 <https://issues.apache.org/jira/browse/HIVE-13574> has been raised to resolve the Hive standard deviation function STDDEV(), which is incorrect at the moment. Please vote for it. Thanks Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/v

Hive external indexes incorporation into Hive CBO

2016-04-21 Thread Mich Talebzadeh

we are down the road with Work In Progress on this. However, I am happy to help with this. HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV

Re: Hive external indexes incorporation into Hive CBO

2016-04-21 Thread Mich Talebzadeh

Kindly provide an example where one can see EXPLAIN SELECT .shows external index usage? That will be great. Choose your table and block size Thanks Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.

Re: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-23 Thread Mich Talebzadeh

user_parameters t2 JOIN user_details t1 ON t2.user_id = t1.user_id; Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*

Re: Hive query to split one row into many rows such that Row 1 will have col 1 Name, col 1 Value and Row 2 will have col 2 Name and col 2 value

2016-04-23 Thread Mich Talebzadeh

thanks I may have missed something. Deepak might clarify. cheers Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*

Re: Hive footprint

2016-04-25 Thread Mich Talebzadeh

em by investing in the existing tools rather than trying to fragment it further. There seems to be little effort in this area for reasons that I may not be aware. However, I am more than happy to contribute to this case. Kind regards, Mich Dr Mich Talebzadeh LinkedIn * https://www.linkedi

Re: Hive TTransportException - Create Table

2016-04-27 Thread Mich Talebzadeh

ary tables (private to that session). A DDL in any database is a heavy operation if you can truncate or overwrite the existing tables it would be prudent. HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com

Re: Sqoop_Sql_blob_types

2016-04-27 Thread Mich Talebzadeh

Is the source of data Oracle? Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 27 April 2016 at

Hive configuration parameter hive.enforce.bucketing does not exist in Hive 2

2016-04-29 Thread Mich Talebzadeh

Is the parameter --set hive.enforce.bucketing = true; deprecated in Hive 2, as it causes hql code not to work? hive> set hive.enforce.bucketing = true; Query returned non-zero code: 1, cause: hive configuration hive.enforce.bucketing does not exists. Dr Mich Talebzadeh LinkedIn * ht

Re: Issue with correlated subqueries being case-sensitive

2016-04-29 Thread Mich Talebzadeh

Why not just try the standard way SELECT * FROM P WHERE EXISTS(SELECT 1 FROM B WHERE P.ID = B.ID) You don't need '*' that is not standard SQL as far as I know HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdO
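
A minimal sketch of the EXISTS form suggested above, run against an in-memory SQLite database rather than Hive (an assumption made only so the example is self-contained). The tables P and B follow the snippet, but the name column and the rows are made up for illustration. It compares the correlated EXISTS query with an IN subquery and shows that both select the same IDs.

```python
import sqlite3

# SQLite stands in for Hive here; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE P (ID INTEGER, name TEXT);
    CREATE TABLE B (ID INTEGER);
    INSERT INTO P VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO B VALUES (2), (3), (3), (4);
""")

# Correlated EXISTS: keep each row of P that has at least one match in B.
exists_rows = conn.execute(
    "SELECT P.ID FROM P "
    "WHERE EXISTS (SELECT 1 FROM B WHERE P.ID = B.ID) "
    "ORDER BY P.ID"
).fetchall()

# Uncorrelated IN subquery: same rows, duplicates in B do not multiply the result.
in_rows = conn.execute(
    "SELECT P.ID FROM P WHERE P.ID IN (SELECT ID FROM B) ORDER BY P.ID"
).fetchall()

print(exists_rows, in_rows)
```

Whether Hive plans the two forms equally well is a separate question that this sketch does not address.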

Re: Issue with correlated subqueries being case-sensitive

2016-04-29 Thread Mich Talebzadeh

ts/Not Exists operator SubQuery must be Correlated. As a workaround, this works but is not that efficient: hive> select count(1) from smallsales where PROD_ID IN (SELECT PROD_ID FROM sales_staging); HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP

Re: Disable Hive autogather optimization

2016-04-29 Thread Mich Talebzadeh

ITE operation is involved in an existing table, then column stats kicks in and that adds to timing process? Sounds like it is a general feature and can be disabled as part of table struct. Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP

Re: Hive configuration parameter hive.enforce.bucketing does not exist in Hive 2

2016-04-29 Thread Mich Talebzadeh

Well having it in the old code causes the query to crash as well! Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*

Re: Hive configuration parameter hive.enforce.bucketing does not exist in Hive 2

2016-04-29 Thread Mich Talebzadeh

Unfortunately that needs to be done or better the whole line removed in every hql code where it is set as true . Thanks Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view
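
One way to remove the line from existing scripts, as a rough sketch: the helper below (a hypothetical name, not part of Hive) drops any line that consists solely of `set hive.enforce.bucketing = true;` from a block of HQL text. It assumes the setting sits on a line of its own; commented-out or combined statements are left untouched.

```python
import re

# Matches a line that only sets hive.enforce.bucketing to true (case-insensitive).
_ENFORCE_BUCKETING = re.compile(
    r"^\s*set\s+hive\.enforce\.bucketing\s*=\s*true\s*;\s*$",
    re.IGNORECASE,
)


def strip_enforce_bucketing(hql_text):
    """Return hql_text without lines that set hive.enforce.bucketing = true."""
    kept = [
        line
        for line in hql_text.splitlines()
        if not _ENFORCE_BUCKETING.match(line)
    ]
    return "\n".join(kept)


sample = "use test;\nset hive.enforce.bucketing = true;\nselect count(1) from dummy;"
print(strip_enforce_bucketing(sample))
```

Applying it to files on disk would mean reading each .hql file, passing its text through the function and writing the result back, ideally after taking a backup.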

Re: Disable Hive autogather optimization

2016-04-29 Thread Mich Talebzadeh

Hopefully that will turn off the autogather feature for existing tables. HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://taleb

Re: Disable Hive autogather optimization

2016-04-29 Thread Mich Talebzadeh

apologies should read "Udit" Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 30 Apr

Re: Hive configuration parameter hive.enforce.bucketing does not exist in Hive 2

2016-04-29 Thread Mich Talebzadeh

Ok thanks Lefty Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 30 April 2016 at 02:23,

Making sqoop import use Spark engine as opposed to MapReduce for Hive

2016-04-30 Thread Mich Talebzadeh

Hi, What is the simplest way of making sqoop import use spark engine as opposed to the default mapreduce when putting data into hive table. I did not see any parameter for this in sqoop command line doc. Thanks Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id

Re: Making sqoop import use Spark engine as opposed to MapReduce for Hive

2016-04-30 Thread Mich Talebzadeh

e.execution.engine=spark does not matter. Sqoop seems to internally set hive.execution.engine=mr anyway. Maybe there should be an option --hive-execution-engine='mr/tez/spark' etc. in the above command? Cheers, Mich Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view

Re: Making sqoop import use Spark engine as opposed to MapReduce for Hive

2016-04-30 Thread Mich Talebzadeh

yes I was thinking of that. use Spark to load JDBC data from Oracle and flush it into ORC table in Hive. Now I am using Spark 1.6.1 and JDBC driver as I recall (I raised a thread for it) throwing error. This was working under Spark 1.5.2. Cheers Dr Mich Talebzadeh LinkedIn * https

Re: Making sqoop import use Spark engine as opposed to MapReduce for Hive

2016-04-30 Thread Mich Talebzadeh

into temp table. The code actually creates the Hive ORC table in Hive database and populates it from temp table. See How it goes Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view

Spark Streaming, Batch interval, Windows length and Sliding Interval settings

2016-05-04 Thread Mich Talebzadeh

on what is being measured. However, I believe having slidinginterval = batch interval makes sense? Appreciate any views on this. Thanks, Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/v
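
For reference, Spark Streaming requires both the window length and the sliding interval to be multiples of the batch interval, so a sliding interval equal to the batch interval is always a valid choice. The small check below is only a sketch of that rule; the function name and the use of whole seconds are assumptions for illustration.

```python
def check_window_settings(batch_interval_s, window_length_s, sliding_interval_s):
    """Return a list of problems with the given Spark Streaming intervals (in seconds)."""
    problems = []
    if window_length_s % batch_interval_s != 0:
        problems.append("window length is not a multiple of the batch interval")
    if sliding_interval_s % batch_interval_s != 0:
        problems.append("sliding interval is not a multiple of the batch interval")
    return problems


# Sliding interval equal to the batch interval, window of five batches.
print(check_window_settings(batch_interval_s=2, window_length_s=10, sliding_interval_s=2))
```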

Re: Spark Streaming, Batch interval, Windows length and Sliding Interval settings

2016-05-05 Thread Mich Talebzadeh

Any ideas/experience on this? Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 4 May 2016 at

Re: Predicates for 'like' and 'between' operators to custom storage handler.

2016-05-05 Thread Mich Talebzadeh

row selected (153.959 seconds) So it does work HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com

Re: Predicates for 'like' and 'between' operators to custom storage handler.

2016-05-05 Thread Mich Talebzadeh

Hi, Do you have the equivalent of that operation in pure SQL. Also have you tried Spark query tool with Hive table. I gather you are doing this through Java? Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <ht

Re: Predicates for 'like' and 'between' operators to custom storage handler.

2016-05-05 Thread Mich Talebzadeh

yyy-MM-dd')) AS TransactionDate Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 5 May 2016 at 13:25

Hive-Hbase vs Phoenix-Hbase

2016-05-05 Thread Mich Talebzadeh

elies on memory (what else) to speed up this process. Hive on newer engine can do most of this these days. Thanks Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAE

Re: NullPointerException when dropping database backed by S3

2016-05-06 Thread Mich Talebzadeh

in your metastore. Cheers, Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 6 May 2016 at 17:29,

Re: Create external table

2016-05-10 Thread Mich Talebzadeh

| NULL | | year | int | | | month| string| | +--+---+-------+--+ 13 rows selected (0.13 seconds) 0: jdbc:hive2:

Re: Create external table

2016-05-10 Thread Mich Talebzadeh

Yes, but the table then exists, correct? I mean the second time. Did you try *use default;* *drop table if exists trips;* It is still registered within Hive metadata as an existing table. Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id

Re: clustered bucket and tablesample

2016-05-14 Thread Mich Talebzadeh

vel in Hive, the number of partitions/files will be fixed. In contrast, with partitioning you do not have this limitation. can you do show create table X and send the output. please. Thanks Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view

Re: clustered bucket and tablesample

2016-05-14 Thread Mich Talebzadeh

stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.TextInputFormat output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat serde: org.apache.hadoop.hive.serde2.l

Re: Query Failing while querying on ORC Format

2016-05-14 Thread Mich Talebzadeh

Check this thread: alter table add columns aternatives or hive refresh. That might help. HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV

Re: Query Failing while querying on ORC Format

2016-05-15 Thread Mich Talebzadeh

Hi Mahender, Please check this thread https://mail.google.com/mail/#search/alter+table+add+columns+aternatives+or+hive+refresh/153fe59e7c2970b2 HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.

Re: clustered bucket and tablesample

2016-05-15 Thread Mich Talebzadeh

mn is unpredictable. With an integer it is fine. I believe there is an underlying bug here. The other alternative is to use an integer as a surrogate column for hash partitioning, like a sequence in Oracle or identity in Sybase/MSSQL. HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?i

Re: Query Failing while querying on ORC Format

2016-05-16 Thread Mich Talebzadeh

spark. thanks http://permalink.gmane.org/gmane.comp.lang.scala.spark.user/32484 | 10 Apr 12:41 201

Re: Query Failing while querying on ORC Format

2016-05-17 Thread Mich Talebzadeh

Hi Mahendar, That version 1.2 is reasonable. One alternative is to create a new table (new_table) in Hive with columns from old_table plus the added column new_column as ORC etc Do an INSERT/SELECT from old_table to new_table INSERT INTO new_table SELECT *, https://www.linkedin.com/profile/view
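
The copy-into-a-new-table approach can be sketched as follows, using an in-memory SQLite database in place of Hive so that it runs anywhere. The names old_table, new_table and new_column come from the message; the other columns, the sample rows and the NULL used to fill new_column are assumptions, since the original statement is cut off.

```python
import sqlite3

# SQLite stands in for Hive; ORC storage is not modelled here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE old_table (id INTEGER, descr TEXT);
    INSERT INTO old_table VALUES (1, 'x'), (2, 'y');

    -- New table: every column of old_table plus the added column.
    CREATE TABLE new_table (id INTEGER, descr TEXT, new_column TEXT);

    -- Copy the existing rows, filling the added column with NULL (assumed default).
    INSERT INTO new_table SELECT *, NULL FROM old_table;
""")

rows = conn.execute(
    "SELECT id, descr, new_column FROM new_table ORDER BY id"
).fetchall()
print(rows)
```

Once the copy has been verified, the old table could be dropped and the new one renamed, although the message does not say whether that was the intended next step.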

Re: Query Failing while querying on ORC Format

2016-05-17 Thread Mich Talebzadeh

I am afraid AFAIK the old partitions cannot be modified as they are fixed in size. That is the existing partition file. I agree this is very tedious. We should come up with a more flexible design for ORC tables. HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id

Re: Hive setup on Hadoop cluster

2016-05-18 Thread Mich Talebzadeh

e etc. You also need to set up environment variables for both Hadoop and hive in your start up script like .profile .kshrc etc Have a look anyway. HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com

Re: Hive setup on Hadoop cluster

2016-05-18 Thread Mich Talebzadeh

Hi John, can you please start a new thread for your problem so we can deal with it separately. Thanks Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV

Re: Missing HIVE Execution JAR

2016-05-18 Thread Mich Talebzadeh

how about CLASSPATH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 18 May 2016 at 20:46,

Re: Hive setup on Hadoop cluster

2016-05-18 Thread Mich Talebzadeh

Hi John, I see this error Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 FAIL Can you check in case you have a problem under Hadoop storage or you have an issue with your user say hduser on Linux! HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id

Re: Missing HIVE Execution JAR

2016-05-18 Thread Mich Talebzadeh

I don't use windows but check bin/hive.cmd for environment variables. Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw&g

Re: Hive system catalog

2016-05-18 Thread Mich Talebzadeh

Hi Braj, Any tool GUI or OS level can log in and see the schema created for Hive. For example my metadata for Hive is on Oracle and I can use SQL Developer Data Model to create a logical model from the physical model HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view

Re: Hive system catalog

2016-05-19 Thread Mich Talebzadeh

The Hive 2 metastore with concurrency capability has 194 tables, 127 views and 38 relationships for a metastore created on Oracle 12c I have created an Entity-Relationship diagram but need to decide in what format to post it Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view

Re: Hive setup on Hadoop cluster

2016-05-19 Thread Mich Talebzadeh

ndLoadMain(LauncherHelper.java:486) However, sounds like you may have an issue with yarn container memory. How big is the underlying table. Also can you just do a plain select count(1) from itself (no distinct etc) and see it works? HTH Dr Mich Talebzadeh LinkedIn * https://www.li

Hive 2 database Entity-Relationship Diagram

2016-05-19 Thread Mich Talebzadeh

pr=1> Fairly big diagram in PDF format. However, you can zoom into it. Please have a look; I would appreciate comments, and if it is useful we can load it into the wiki. HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

Re: Hive 2 database Entity-Relationship Diagram

2016-05-19 Thread Mich Talebzadeh

down Thanks[image: Inline images 1] Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 19 May 2016 at

Re: Hive 2 database Entity-Relationship Diagram

2016-05-19 Thread Mich Talebzadeh

Thanks These are the list of tables and views Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com

Re: Unable to pick data from subdirectories into hive table in CDH 5.3.3

2016-05-19 Thread Mich Talebzadeh

Hi, I am not familiar with CDH, but in a default set -up, the hive directory is under hdfs://https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw * http://talebzadehmich.wordpress.com

Re: Unable to pick data from subdirectories into hive table in CDH 5.3.3

2016-05-19 Thread Mich Talebzadeh

agreed but it still needs to know where the hive top node directory starts from, which is normally under ../../ warehouse Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view

Compatibility of Hive 2 with TEZ

2016-05-21 Thread Mich Talebzadeh

Hi, I see in a matrix that Hive 2 is compatible with Tez 0.8.2 as its execution engine. Can someone verify this please as I am trying to test Hive 2 with Tez. I normally use Hive 2 on Spark 1.3 engine fine. Thanks Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id

Hive 2.0 on Spark 1.6.1 Engine

2016-05-21 Thread Mich Talebzadeh

ure anyone has tried this? Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com

Re: Hive and XML

2016-05-22 Thread Mich Talebzadeh

That is interesting. DBs like MarkLogic are adept at this. BTW how do you define your base Hive table for XML and what table type have you used? HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.

Re: Hive 2 Metastore Entity-Relationship Diagram, Base tables

2016-05-22 Thread Mich Talebzadeh

for now to be used as a quick reference for hive metadata tables, columns, pk and constraint. It only covers the base tables excluding transactional add ons in hive-txn-schema-2.0.0.oracle.sql HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-05-23 Thread Mich Talebzadeh

Have a look at this thread Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 23 May 2016 at 09:10

Re: Using Spark as execution engine for Hive

2016-05-23 Thread Mich Talebzadeh

Hi Sharath See this thread Using Spark on Hive with Hive also using Spark as its execution engine HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view

Re: Compatibility of Hive 2 with TEZ

2016-05-23 Thread Mich Talebzadeh

Thanks Seth. Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 23 May 2016 at 21:08, Siddhart

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-05-24 Thread Mich Talebzadeh

very large table from Oracle to Hive and decided to use Spark 1.6.1 with Hive 2 on Spark 1.3.1 and that worked fine. We just used JDBC connection with temp table and it was good. We could have used sqoop but decided to settle for Spark so it all depends on use case. HTH Dr Mich Talebzadeh Lin

Hive 2 loss of connection to metastore and multiple connections/disconnect in the same session

2016-05-24 Thread Mich Talebzadeh

connections: 2 2016-05-24T16:16:44,864 INFO [0deb842d-9b15-4dd9-8d60-0e198a9d3865 0deb842d-9b15-4dd9-8d60-0e198a9d3865 main]: hive.metastore (HiveMetaStoreClient.java:open(505)) - Connected to metastore. Thanks Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id

Re: Insert query with selective columns in Hive

2016-05-24 Thread Mich Talebzadeh

only col4 hive> insert into testme (col4) values(6); Loading data to table test.testme OK HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPC
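
As a small illustration of the selective-column insert shown above, the sketch below runs the same statement against an in-memory SQLite table instead of Hive. The table name testme and column col4 come from the message; col1 to col3 and their types are assumed. Columns not named in the INSERT are left NULL.

```python
import sqlite3

# SQLite stands in for Hive; col1..col3 are hypothetical columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE testme (col1 INTEGER, col2 INTEGER, col3 INTEGER, col4 INTEGER);
    INSERT INTO testme (col4) VALUES (6);
""")

row = conn.execute("SELECT col1, col2, col3, col4 FROM testme").fetchone()
print(row)
```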

Re: Any way in hive to have functionality like SQL Server collation on Case sensitivity

2016-05-24 Thread Mich Talebzadeh

r_expression1* is equal to *char_expression2* or* uchar_expression2*. - -1 – indicates that *char_expression1* or *uchar_expression1* is less than *char_expression2 *or* uchar expression2*. hive> select compare("aaa", "bbb"); FAILED: SemanticException [Error 100

Re: Copying all Hive tables from Prod to UAT

2016-05-25 Thread Mich Talebzadeh

duct to do the same. I am not sure vendors do parallelise this sort of things. HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http

Hive and using Pooled Connections

2016-05-25 Thread Mich Talebzadeh

the reuse of connection objects and reduce the number of times that connection objects are created. Connection pools significantly improve performance for database-intensive applications because creating connection objects is costly both in terms of time and resources. Thanks Dr Mich Talebzadeh

Re: Copying all Hive tables from Prod to UAT

2016-05-26 Thread Mich Talebzadeh

be an option. NAS is better as it saves scp and copy across, with the target having enough external space to get the files in. A more useful tool would be to export the full Hive database in binary format and import it in the target. Cheers Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile

Re: Hive and using Pooled Connections

2016-05-26 Thread Mich Talebzadeh

mittent "No such lock.." and "No such transaction..." errors. Setting "datanucleus.connectionPoolingType=DBCP" is recommended in this case So I changed the setting to DBCP. Don't know how useful it is going to be. Regards, Dr Mich Talebzadeh L

Re: Test

2016-05-29 Thread Mich Talebzadeh

yep Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 29 May 2016 at 18:01, Igor Kravzov

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-05-29 Thread Mich Talebzadeh

course Spark has both plus in-memory capability. It would be interesting to see what version of TEZ works as execution engine with Hive. Vendors are divided on this (use Hive with TEZ) or use Impala instead of Hive etc as I am sure you already know. Cheers, Dr Mich Talebzadeh LinkedIn

Anyone successfully deployed Hive on TEZ engine?

2016-05-29 Thread Mich Talebzadeh

Please bear in mind that I am talking about your own build not anything comes as part of Vendor's package. If so kindly specify both Hive and TEZ versions. Thanks Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw &

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-05-29 Thread Mich Talebzadeh

Thanks. I think the problem is that the TEZ user group is exceptionally quiet. Just sent an email to the Hive user group to see whether anyone has managed to build a vendor-independent version. Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

Re: Anyone successfully deployed Hive on TEZ engine?

2016-05-30 Thread Mich Talebzadeh

thanks Damien. I tried TEZ 0.82 with Hive 2 although I did not persevere. When you say "Not stable" are you referring to using it with YARN etc. In short at the simplest set up what Resource Manager it works with? Cheers Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/pr

Re: Anyone successfully deployed Hive on TEZ engine?

2016-05-30 Thread Mich Talebzadeh

to make it work as I have hive on spark engine as well. please tell me what version of tez and yarn etc. I thanks Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6

Re: Does hive need exact schema in Hive Export/Import?

2016-05-30 Thread Mich Talebzadeh

select count(1) from test.sales_staging; exit; Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 30 May 2016 at 12

Re: Anyone successfully deployed Hive on TEZ engine?

2016-05-30 Thread Mich Talebzadeh

Hi Gopal, please see my correspondence about Tez in tez user group. I forwarded to hive user group. thanks Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view

Re: Does hive need exact schema in Hive Export/Import?

2016-05-30 Thread Mich Talebzadeh

oup 1588 2016-05-25 16:46 hdfs://rhes564:9000/export/ *_metadata*drwxr-xr-x - hduser supergroup 0 2016-05-25 16:46 hdfs://rhes564:9000/export/data and uses the metadata file to create the target table which somehow does not work in this case! HTH Dr Mich Talebzadeh LinkedIn * ht

Re: SHOW DATABASES/TABLES with SQL standard authorization

2016-05-30 Thread Mich Talebzadeh

have access rights to that database. HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 30 Ma

Re: SHOW DATABASES/TABLES with SQL standard authorization

2016-05-30 Thread Mich Talebzadeh

with no access right given? -- 1> use ASEIMDB 2> go Msg 10351, Level 14, State 1: Server 'SYB_157', Line 1: Server user id 24 is not a valid user in database 'ASEIMDB' HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-05-30 Thread Mich Talebzadeh

another stack like Tez. Cloudera support Impala instead of Hive but it is not something I have used. . HTH Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-05-30 Thread Mich Talebzadeh

data). 80-20 rule? In reality may be just 2TB or most recent partitions etc. The rest is cold data. cheers Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view

Re: Why does the user need write permission on the location of external hive table?

2016-05-31 Thread Mich Talebzadeh

is this location correct and valid? LOCATION '/data/SentimentFiles/*SentimentFiles*/upload/data/tweets_raw/' Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/pr
