Re: Hive footprint

2016-04-19 Thread Mich Talebzadeh
BTW what is the situation with Impala? Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>* http://talebzadehmich.wordpress.com On 19 Apri

Hive footprint

2016-04-18 Thread Mich Talebzadeh
engine for Hive (sounds like many prefer TEZ although I am a Spark fan) and the ubiquitous YARN. Let me know your thoughts.

Re: Hive footprint

2016-04-18 Thread Mich Talebzadeh
Thanks Marcin. What is the definition of low latency here? Are you referring to the performance of SQL against HBase tables compared to Hive? As I understand HBase is a columnar database. Would it be possible to use Hive against ORC to achieve the same?

Re: Hive footprint

2016-04-19 Thread Mich Talebzadeh
| | +---+---+---+--+---+--+--+

Re: Hive footprint

2016-04-20 Thread Mich Talebzadeh
Hi, If I may, I would also like to see where the Hive optimizer shows that it is used with explain ... or other means. It will be interesting. HTH

Jira Hive-13574 raised to resolve Standard deviation calculation in Hive

2016-04-21 Thread Mich Talebzadeh
Hi, Jira HIVE-13574 <https://issues.apache.org/jira/browse/HIVE-13574> has been raised to fix the Hive standard deviation function STDDEV(), which is incorrect at the moment. Please vote for it. Thanks
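
The snippet does not say in what way the result is wrong, so the following is only a minimal Python sketch for cross-checking a reported standard deviation by hand. It computes the two usual definitions, population (divide by n) and sample (divide by n - 1). The function names and the sample values are assumptions made for illustration and are not taken from the thread or the Jira.

```python
import math


def stddev_pop(values):
    """Population standard deviation: squared deviations divided by n."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def stddev_samp(values):
    """Sample standard deviation: squared deviations divided by n - 1."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


# Hypothetical data, chosen only so the population result is a round number.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print("population:", stddev_pop(sample))
print("sample:", stddev_samp(sample))
```

Comparing both figures against what the engine returns for the same column shows which definition, if either, it is following.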

Hive external indexes incorporation into Hive CBO

2016-04-21 Thread Mich Talebzadeh
we are down the road with Work In Progress on this. However, I am happy to help with this. HTH

Re: Standard Deviation in Hive 2 is still incorrect

2016-04-21 Thread Mich Talebzadeh
HIVE-13574 <https://issues.apache.org/jira/browse/HIVE-13574> Created and assigned to myself Thanks

Re: Hive footprint

2016-04-21 Thread Mich Talebzadeh
This simply does not work but we need to make Hive use external indexes. This is a must

Re: Insert query with selective columns in Hive

2016-05-24 Thread Mich Talebzadeh
only col4 hive> insert into testme (col4) values(6); Loading data to table test.testme OK HTH

Re: Copying all Hive tables from Prod to UAT

2016-05-25 Thread Mich Talebzadeh
not sure vendors do parallelise this sort of thing. HTH

Re: can't start up hive 2.1 hiveserver2/metastore services

2016-07-13 Thread Mich Talebzadeh
default that runs on port 1 HTH

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-07-12 Thread Mich Talebzadeh
ut the cluster. It must be using some clever algorithm to do so. Cheers.

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-07-12 Thread Mich Talebzadeh
.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink Time taken: 0.1 seconds, Fetched: 44 row(s) HTH

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-07-12 Thread Mich Talebzadeh
I suggest that you try it for yourself then

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-07-12 Thread Mich Talebzadeh
memory computing. As usual your mileage varies. HTH

Fwd: Building Spark 2 from source that does not include the Hive jars

2016-07-28 Thread Mich Talebzadeh
My interest is to extract the jar file similar to below from the build spark-assembly-1.3.1-hadoop2.4.0.jar Thanks

Re: Hive on spark

2016-07-27 Thread Mich Talebzadeh
-07-27 20:38:29,395 Stage-2_0: 24/24 Finished Stage-3_0: 1/1 Finished Status: Finished successfully in 13.14 seconds OK 1 Time taken: 13.426 seconds, Fetched: 1 row(s) HTH

Re: Updating ORC table fails with [Error 10122]: Bucketized tables do not support INSERT INTO:

2016-07-29 Thread Mich Talebzadeh
42 HARRODS HARRODS LTD CD 4610 4 HARRODS HARRODS LTD CD 4636 13 HARRODS HARRODS LTD CD 5916 28 HARRODS HARRODS LTD CD 4628 111 HARRODS cheers

Re: Hive on spark

2016-08-01 Thread Mich Talebzadeh
Hi, You can download the pdf from here <https://talebzadehmich.files.wordpress.com/2016/08/hive_on_spark_only.pdf> HTH

Re: Analytical function works in Spark SQL but not in Hive 2 QL

2016-07-31 Thread Mich Talebzadeh
0: 0(+1)/1 2016-07-31 10:48:35,780 Stage-0_0: 1/1 Finished Stage-1_0: 1/1 Finished Status: Finished successfully in 10.10 seconds OK 2015-12-15 HARRODS LTD CD 4636 10.95 1 Time taken: 46.546 seconds, Fetched: 1 row(s)

Analytical function works in Spark SQL but not in Hive 2 QL

2016-07-31 Thread Mich Talebzadeh
thodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) *FAILED: ParseException line 6:7 Failed to recognize predicate 'inner'. Failed r

Re: Hive on spark

2016-07-27 Thread Mich Talebzadeh
n send the presentation Cheers

Re: hiver errors

2016-08-10 Thread Mich Talebzadeh
Which version of Hive is it? Can you log in to Hive via the Hive CLI? *$HIVE_HOME/bin/hive* Logging initialized using configuration in file:/usr/lib/hive/conf/hive-log4j2.properties hive> HTH

Re: hive will die or not?

2016-08-14 Thread Mich Talebzadeh
which is a Data Warehouse. HTH

Re: hive will die or not?

2016-08-14 Thread Mich Talebzadeh
...I do not agree with you... Yeah right. I am so upset. Was waiting for your nod LOL

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-07-12 Thread Mich Talebzadeh
Thanks Marcin. What is your guesstimate on the order of "faster" please? Cheers

Re: can't start up hive 2.1 hiveserver2/metastore services

2016-07-13 Thread Mich Talebzadeh
" ""=== Starting hiveserver metastore ===" >> ${LOG_FILE} $HIVE_HOME/bin/hive --service metastore & netstat -alnp|egrep 'Local|9083' echo `date` " ""=== Started hiveserver2 metastore ===" >> ${LOG_FILE} exit HTH

Re: 答复: Using Spark on Hive with Hive also using Spark as its execution engine

2016-07-13 Thread Mich Talebzadeh
to Hive on Spark or they apply equally to Hive on MapReduce as well. In other words a general issue with Hive optimizer case hive-9044? Thanks

Re: Verifying Hive execution engine used within a session

2016-07-13 Thread Mich Talebzadeh
Nice one Shaw hive> set hive.execution.engine; hive.execution.engine=mr Thanks
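
For anyone scripting this check, here is a minimal Python sketch that parses the `key=value` line printed by `set hive.execution.engine;` and reports the engine. The helper name `parse_set_output` is hypothetical, and the input line is hard-coded from the snippet above rather than read from a live session.

```python
# Engine names accepted by hive.execution.engine.
KNOWN_ENGINES = {"mr", "tez", "spark"}


def parse_set_output(line):
    """Split a 'key=value' line, as echoed by Hive's `set <key>;`, into a pair."""
    key, sep, value = line.strip().partition("=")
    if not sep:
        raise ValueError("expected key=value, got: " + repr(line))
    return key, value


# Line copied from the session output quoted in the snippet above.
key, engine = parse_set_output("hive.execution.engine=mr")
print(key, "->", engine, "(known)" if engine in KNOWN_ENGINES else "(unexpected)")
```

In practice the line would come from whatever client is driving the session; that wiring is left out here because it depends on the setup.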

Re: 答复: 答复: Using Spark on Hive with Hive also using Spark as its execution engine

2016-07-14 Thread Mich Talebzadeh
Which version of Hive and Spark please?

Re: 答复: 答复: 答复: Using Spark on Hive with Hive also using Spark as its execution engine

2016-07-14 Thread Mich Talebzadeh
Fine. Which version of Spark are you using for the Hive execution/query engine please?

Re: Hive External Storage Handlers

2016-07-18 Thread Mich Talebzadeh
will need to build from source. It works and it is stable. Otherwise you may decide to use Spark Thrift Server (STS) that allows JDBC access to Spark SQL (through beeline, Squirrel, Zeppelin) that has Hive SQL context built into it as if you were using Hive Thrift Server (HSS) HTH

Re: Hive External Storage Handlers

2016-07-19 Thread Mich Talebzadeh
make sense and is meaningless without any evidence. Either you provide evidence that you have done this work and you encountered errors or better not mention it. Sounds like scaremongering.

Re: hive external table on gzip

2016-07-19 Thread Mich Talebzadeh
://rhes564:9000/data/stg/*.gz HTH

Re: Hive on TEZ + LLAP

2016-07-19 Thread Mich Talebzadeh
and d_moy=12 and d_year=2001 group by i_brand, i_brand_id order by ext_price desc, i_brand_id limit 100 ; What was the type (Parquet, text, ORC etc) and row count for each of the three tables above? Thanks

Re: Hive on TEZ + LLAP

2016-07-19 Thread Mich Talebzadeh
Sounds like, if I am correct, joining a fact table (store_sales) with two dimensions? Cool, thanks

Re: ORC does not support type conversion from INT to STRING.

2016-07-18 Thread Mich Talebzadeh
or it is stored as is and just masked? The version of Hive I am using is 2 and it works OK for primitive data types (insert/select from INT to String). However, I believe Mahender is referring to complex types? Thanks

Re: Hive on TEZ + LLAP

2016-07-18 Thread Mich Talebzadeh
14.38 16 times ORC 202.33317.77 11 times So the hybrid engine seems to make much difference which if I just consider Tez only and Tez + LLAP the gain is more than 3 times Cheers, Dr Mich Talebzadeh LinkedIn * https://www.linkedin.com

Re: Presentation in London: Running Spark on Hive or Hive on Spark

2016-07-19 Thread Mich Talebzadeh
in La Tasca West India Docks Road E14 <http://www.meetup.com/futureofdata-london/events/232423292/> and especially if you like Spanish food :) Regards,

Re: ORC does not support type conversion from INT to STRING.

2016-07-19 Thread Mich Talebzadeh
in Hive 2, I don't see this issue INSERT/SELECT from INT to String column!

Re: Hive on TEZ + LLAP

2016-07-16 Thread Mich Talebzadeh
Hi, This is interesting. Are there any recent presentations of Hive on Tez and Hive on Tez with LLAP? Also, have there been simple benchmarks to compare: 1. Hive on MR 2. Hive on Tez 3. Hive on Tez with LLAP It would be interesting to see how these three fare. Thanks

Re: A dedicated Web UI interface for Hive

2016-07-15 Thread Mich Talebzadeh
Hi Marcin, For Hive on Spark I can use the Spark 1.3.1 UI, which does not have a DAG diagram (later versions like 1.6.1 have it). But yes, you are correct. However, I was certain that Gopal was working on a UI, if my memory serves me right. Cheers, Mich

A dedicated Web UI interface for Hive

2016-07-14 Thread Mich Talebzadeh
Hi Gopal, If I recall you were working on UI support for Hive. Currently the one available is the standard Hadoop one on port 8088. Do you have any timelines for which release of Hive is going to have this facility? Thanks,

Re: A dedicated Web UI interface for Hive

2016-07-15 Thread Mich Talebzadeh
Hi Marcin, Which two web interfaces are these? I know the usual one on 8088. Any other one? I want something in line with what Spark provides. I thought Gopal had got something. Cheers

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-07-11 Thread Mich Talebzadeh
14.38 ORC 202.33317.77 Still I would use Spark if I had a choice and I agree that on VLT (very large tables), the limitation in available memory may be the overriding factor in using Spark. HTH

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-07-12 Thread Mich Talebzadeh
I guess that is what DAG adds up to with Tez

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-07-12 Thread Mich Talebzadeh
later but it will be very useful to remove thriftserver, if we can. " Cheers, Mich

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-07-12 Thread Mich Talebzadeh
to Hive on MR. One experiment is worth hundreds of opinions

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-07-11 Thread Mich Talebzadeh
compared to Hive or not? Will it keep the data in memory for reuse or not? 6. What I don't understand is what makes Tez and LLAP more efficient compared to Spark! Cheers

Re: Verifying Hive execution engine used within a session

2016-07-13 Thread Mich Talebzadeh
Please send a brief message to Unsubscribe: user-unsubscr...@hive.apache.org in here <https://hive.apache.org/mailing_lists.html> HTH

Verifying Hive execution engine used within a session

2016-07-13 Thread Mich Talebzadeh
can switch the engines set hive.execution.engine=tez; Thanks

Querying Hive tables from Spark

2016-06-27 Thread Mich Talebzadeh
(10,0)) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 ListSink *And Hive on Spark returns the same 24 rows in 30 seconds* Ok Hive query is just slower with Spark engine. Assuming that the time taken will be optimization time + query time then it appear

Presentation in London: Running Spark on Hive or Hive on Spark

2016-07-06 Thread Mich Talebzadeh
erested please register here <http://www.meetup.com/futureofdata-london/events/232423292/> Looking forward to seeing those who can make it to have an interesting discussion and leverage your experience. Regards,

Re: hive 2.1.0 beeline cannot show verbose log

2016-07-07 Thread Mich Talebzadeh
16/07/07 11:36:22 [main]: DEBUG conf.VariableSubstitution: Substitution is on: hive HTH

Re: Hive Metastore on Amazon Aurora

2016-07-11 Thread Mich Talebzadeh
of transaction activity using ORC files with Insert/Update/Delete that need to communicate with metastore with heartbeat etc? HTH

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-07-11 Thread Mich Talebzadeh

Re: hive 2.1.0 beeline cannot show verbose log

2016-07-07 Thread Mich Talebzadeh
Hi, Is this available in Hive 2? hive> set hive.async.log.enabled=false; Query returned non-zero code: 1, cause: hive configuration hive.async.log.enabled does not exists. Thanks

Re: Using Spark on Hive with Hive also using Spark as its execution engine

2016-07-11 Thread Mich Talebzadeh
umulative CPU: 721.83 sec HDFS Read: 400442823 HDFS Write: 10 SUCCESS Total MapReduce CPU Time Spent: 12 minutes 1 seconds 830 msec OK 1 *Time taken: 239.532 seconds, Fetched: 1 row(s)* I leave it to you guys to guess which one is better :) Cheers

Re: Crate Non-partitioned table from partitioned table using CREATE TABLE .. LIKE

2016-08-04 Thread Mich Talebzadeh
,0), amount_sold: decimal(10,0)] scala> s2.write.mode("overwrite").parquet("/data/stg/newtable/sales5") HTH

Re: Crate Non-partitioned table from partitioned table using CREATE TABLE .. LIKE

2016-08-04 Thread Mich Talebzadeh
Do you know the existing table schema? The new table schema will be based on that table without partitioning?

Re: Crate Non-partitioned table from partitioned table using CREATE TABLE .. LIKE

2016-08-04 Thread Mich Talebzadeh
ATE EXTERNAL TABLE sales5 AS SELECT * FROM SALES; FAILED: SemanticException [Error 10070]: CREATE-TABLE-AS-SELECT cannot create external table

Re: Crate Non-partitioned table from partitioned table using CREATE TABLE .. LIKE

2016-08-04 Thread Mich Talebzadeh
Yes, but that essentially copies the metadata and leaves the partition there with no data. It is just an image copy. It won't help in this case.

Vectorised Query Execution extension

2016-08-04 Thread Mich Talebzadeh
to extend it beyond 1024 rows to include the whole column in table? VQE would be very useful especially with ORC as it basically means that one can process the whole column separately thus improving performance of the query. HTH

Re: hive concurrency not working

2016-08-05 Thread Mich Talebzadeh
You won't have this problem if you use Spark as the execution engine? That handles concurrency OK

Re: hive concurrency not working

2016-08-05 Thread Mich Talebzadeh
You won't have this problem if you use Spark as the execution engine! This setup handles concurrency but Hive with Spark is not part of the HW distro. HTH

Re: hive not showing up default database

2016-08-05 Thread Mich Talebzadeh

Re: hive concurrency not working

2016-08-05 Thread Mich Talebzadeh
Great, in that case they can try it, and I am pretty sure if they are stuck they can come and ask you for expert advice, since Hortonworks do not support Hive on Spark and I know that

Re: hive concurrency not working

2016-08-06 Thread Mich Talebzadeh
wonder whether hive.support.concurrency is set to true with zookeeper running and hive.lock.manager set to org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager HTH

Re: hive concurrency not working

2016-08-07 Thread Mich Talebzadeh
-- I am pretty sure that they will support it because the Spark option is supported Hortonworks support Spark but not Hive on Spark. Their official distro is Hive on Tez + LLAP Not sure where you get your information from though I got it from Hortonworks and I know that

Re: Crate Non-partitioned table from partitioned table using CREATE TABLE .. LIKE

2016-08-07 Thread Mich Talebzadeh
hout those two columns and of course will not be partitioned. HTH

Re: Crate Non-partitioned table from partitioned table using CREATE TABLE .. LIKE

2016-08-07 Thread Mich Talebzadeh
rod_id` bigint, `cust_id` bigint, `time_id` timestamp, `channel_id` bigint, `promo_id` bigint, `quantity_sold` decimal(10,0), `amount_sold` decimal(10,0)) *PARTITIONED BY ( `year` int, `month` int)* - Which is not that useful HTH

Re: hiveserver2 start hang

2016-08-09 Thread Mich Talebzadeh
What command did you use to start hiveserver2? $HIVE_HOME/bin/hiveserver2 & Is the port 1 used? netstat -alnp|egrep 'Local|10010' HTH

Re: hive session drops tables when restarted

2016-08-08 Thread Mich Talebzadeh
see them netstat -pltenp 'Local|1000|9083' HTH

Re: How can I force Hive to start compaction on a table immediately

2016-08-01 Thread Mich Talebzadeh
Thanks Alan. One crude solution would be to copy data from the ACID table to a simple table and present that table to Spark to see the data. This is basically a Spark optimiser issue, not the engine itself. My Hive runs on the Spark query engine and all works fine there. HTH

Re: Hive transactional table with delta files, Spark cannot read and sends error

2016-08-01 Thread Mich Talebzadeh
Thanks Gopal. I am on Spark 1.6.1 and getting the following error scala> var conn = LlapContext.newInstance(sc, hs2_url); :28: error: not found: value LlapContext var conn = LlapContext.newInstance(sc, hs2_url);

How can I force Hive to start compaction on a table immediately

2016-08-01 Thread Mich Talebzadeh
Rather than queuing it hive> alter table payees COMPACT 'major'; Compaction enqueued. OK Thanks

Hive transactional table with delta files, Spark cannot read and sends error

2016-08-01 Thread Mich Talebzadeh
options do I have here? And interestingly, Hive running on the Spark engine works fine. Thanks

Updating ORC table fails with [Error 10122]: Bucketized tables do not support INSERT INTO:

2016-07-29 Thread Mich Talebzadeh
support INSERT INTO: Table: accounts.payees What would be the least painful solution without some elaborate means? Thanks

Re: Hive logging 2.0.0

2016-08-17 Thread Mich Talebzadeh
binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] Logging initialized using configuration in file:/usr/lib/hive/conf/hive-log4j2.properties hive>

Parquet tables with snappy compression

2017-01-25 Thread Mich Talebzadeh
Hi, Has there been any study of how much compressing Hive Parquet tables with snappy reduces storage space or simply the table size in quantitative terms? Thanks

Using Sqoop to get data from Impala/Hive to another Hive table

2017-02-21 Thread Mich Talebzadeh
Hi, I have not tried this but someone mentioned that it is possible to use Sqoop to get data from one Impala/Hive table in one cluster to another? The clusters are in different zones. This is to test the cluster. Has anyone done such a thing? Thanks

Re: Using Sqoop to get data from Impala/Hive to another Hive table

2017-02-21 Thread Mich Talebzadeh
. this is not really a test, is it?

Re: Using Sqoop to get data from Impala/Hive to another Hive table

2017-02-21 Thread Mich Talebzadeh
Regardless, there is no point using Sqoop for such a purpose. It is not really designed for it :)

Re: Difference between join and inner join

2017-02-13 Thread Mich Talebzadeh
join is by default inner join as in Oracle or Sybase. HTH
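
The equivalence is easy to confirm on any SQL engine. The Python sketch below uses SQLite from the standard library purely as a stand-in (it is not Hive, Oracle or Sybase), and the `customers` and `orders` tables are hypothetical. It runs the same query once with `JOIN` and once with `INNER JOIN` and compares the rows.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in SQL engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO orders VALUES (1, 10.0), (1, 5.0), (3, 7.5), (4, 1.0);
""")

QUERY = """
    SELECT c.name, o.amount
    FROM customers c {join} orders o ON c.id = o.customer_id
    ORDER BY c.name, o.amount
"""

# Same query, differing only in the join keyword.
plain = conn.execute(QUERY.format(join="JOIN")).fetchall()
inner = conn.execute(QUERY.format(join="INNER JOIN")).fetchall()

print(plain)
print("same rows:", plain == inner)
```

Customer 2 (no orders) and order 4 (no matching customer) drop out of both results, which is the inner-join behaviour.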

VARCHAR or STRING fields in Hive

2017-01-16 Thread Mich Talebzadeh
to STRING. What is the thread view on this? Thanks

Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread Mich Talebzadeh
in HDFS compared to STRING columns? Cheers

Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread Mich Talebzadeh
as opposed to String make any difference in terms of storage efficiency? Regards

Re: VARCHAR or STRING fields in Hive

2017-01-16 Thread Mich Talebzadeh
Sounds like VARCHAR and CHAR types were created for Hive to have ANSI SQL Compliance. Otherwise they seem to be practically the same as String types. HTH

Re: Inserting data in hive bucket

2016-08-19 Thread Mich Talebzadeh
Hi, You are partitioning by month and bucketing by date or day? If that is the case you only have 30-31 hash partitions (buckets) for each month? HTH
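
To make the point about the number of usable buckets concrete, here is a small Python sketch. It assumes a simplified assignment rule, bucket = key mod number of buckets, which only stands in for Hive's hash-based assignment, and it treats the day of the month (1 to 31) as the bucketing key. The function name `occupied_buckets` is hypothetical.

```python
def occupied_buckets(keys, num_buckets):
    """Return the set of bucket numbers that receive at least one key.

    Assumption: a key lands in bucket (key mod num_buckets). This is a
    simplification used for illustration only.
    """
    return {key % num_buckets for key in keys}


days_in_month = range(1, 32)  # at most 31 distinct day-of-month values

for num_buckets in (8, 31, 64, 256):
    used = occupied_buckets(days_in_month, num_buckets)
    print(num_buckets, "buckets declared,", len(used), "actually used")
```

However many buckets are declared, a key with 31 distinct values can never fill more than 31 of them within one monthly partition.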

Re: Inserting data in hive bucket

2016-08-21 Thread Mich Talebzadeh
Hi Rahul, I don't believe you can drop a particular bucket in Hive HTH

Re: Populating tables using hive and spark

2016-08-22 Thread Mich Talebzadeh
+--+ | 1|London| | 2|NY| | 3|California| | 4| Dehli| ++--+ So the rows are there. Let me go to Hive again now hive> select * from testme; OK 1 London 2 NY 3 California 4 Dehli hive> analyze table testme compute sta

Re: Loading Sybase to hive using sqoop

2016-08-24 Thread Mich Talebzadeh
Are you using a vendor distro or in-house build?

Re: Loading Sybase to hive using sqoop

2016-08-24 Thread Mich Talebzadeh
out first time and sync Hive table with Sybase IQ table real time. You will need SRS SP 204 or above to make this work. Talk to your DBA if they can get SRS SP from Sybase for this purpose. I have done it many times. I think it is stable enough for this purpose. HTH

Re: Loading Sybase to hive using sqoop

2016-08-24 Thread Mich Talebzadeh
hm. Watching paint dry :)

Re: Quota for rogue ad-hoc queries

2016-08-31 Thread Mich Talebzadeh
Try this: hive.limit.optimize.fetch.max - Default Value: 5 - Added In: Hive 0.8.0 Maximum number of rows allowed for a smaller subset of data for simple LIMIT, if it is a fetch query. Insert queries are not restricted by this limit. HTH

Re: Sqoop: SQL Server to Hive import

2016-09-14 Thread Mich Talebzadeh
there into Hive table. There are other ways of using JDBC say through Spark etc. HTH

Re: ACID transactions on data added from Spark not working

2016-09-14 Thread Mich Talebzadeh
this is my experience. HTH

Re: Hive On Spark - ORC Table - Hive Streaming Mutation API

2016-09-14 Thread Mich Talebzadeh
; set spark.ui.port=; HTH

Re: Working of HiveServer2

2016-09-13 Thread Mich Talebzadeh
engine. HTH
