BTW what is the situation with Impala?
Dr Mich Talebzadeh
LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com
On 19 Apri
engine for Hive (it sounds like many prefer Tez, although I am a Spark fan)
and the ubiquitous YARN.
Let me know your thoughts.
Dr Mich Talebzadeh
Thanks Marcin.
What is the definition of low latency here? Are you referring to the
performance of SQL against HBase tables compared to Hive? As I understand it,
HBase is a columnar database. Would it be possible to use Hive against ORC
to achieve the same?
Dr Mich Talebzadeh
Dr Mich Talebzadeh
Hi,
If I may, I would also like to see where the Hive optimizer shows that it
is used, with EXPLAIN or other means. It will be interesting.
HTH
Dr Mich Talebzadeh
Hi,
Jira HIVE-13574 <https://issues.apache.org/jira/browse/HIVE-13574> is
raised to resolve the Hive standard deviation function STDDEV(), which is
incorrect at the moment.
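As background, the two standard definitions of standard deviation differ only in the divisor (N for the population form, N-1 for the sample form), and discrepancies between SQL engines usually come down to which one a function implements. A small Python sketch of the two formulas; this is a generic illustration, not Hive's implementation:

```python
import math

def stddev_pop(xs):
    # Population standard deviation: divide by N
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def stddev_samp(xs):
    # Sample standard deviation: divide by N - 1 (Bessel's correction)
    m = sum(xs) / len(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(stddev_pop(data))   # 2.0
print(stddev_samp(data))  # ~2.138
```

On small data sets the two can differ noticeably, which is why a mismatch against another database is easy to spot.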
Please vote for it.
Thanks
Dr Mich Talebzadeh
Work is already in progress on this.
However, I am happy to help with it.
HTH
Dr Mich Talebzadeh
HIVE-13574 <https://issues.apache.org/jira/browse/HIVE-13574>
Created and assigned to myself
Thanks
Dr Mich Talebzadeh
This simply does not work, but we need to make Hive use external indexes.
This is a must.
Dr Mich Talebzadeh
only col4
hive> insert into testme (col4) values(6);
Loading data to table test.testme
OK
HTH
Dr Mich Talebzadeh
I am not sure vendors parallelise this sort of thing.
HTH
Dr Mich Talebzadeh
default that runs on port 1
HTH
Dr Mich Talebzadeh
ut the cluster. It must be using some
clever algorithm to do so.
Cheers
Dr Mich Talebzadeh
.hive.serde2.lazy.LazySimpleSerDe
Stage: Stage-0
Fetch Operator
limit: -1
Processor Tree:
ListSink
Time taken: 0.1 seconds, Fetched: 44 row(s)
HTH
Dr Mich Talebzadeh
I suggest that you try it for yourself then
Dr Mich Talebzadeh
memory
computing.
As usual your mileage varies.
HTH
Dr Mich Talebzadeh
My interest is to extract a jar file similar to the one below from the build:
spark-assembly-1.3.1-hadoop2.4.0.jar
Thanks
Dr Mich Talebzadeh
-07-27 20:38:29,395 Stage-2_0: 24/24 Finished Stage-3_0: 1/1
Finished
Status: Finished successfully in 13.14 seconds
OK
1
Time taken: 13.426 seconds, Fetched: 1 row(s)
HTH
Dr Mich Talebzadeh
42 HARRODS
HARRODS LTD CD 4610 4 HARRODS
HARRODS LTD CD 4636 13 HARRODS
HARRODS LTD CD 5916 28 HARRODS
HARRODS LTD CD 4628 111 HARRODS
cheers
Dr Mich Talebzadeh
Hi,
You can download the pdf from here
<https://talebzadehmich.files.wordpress.com/2016/08/hive_on_spark_only.pdf>
HTH
Dr Mich Talebzadeh
0: 0(+1)/1
2016-07-31 10:48:35,780 Stage-0_0: 1/1 Finished Stage-1_0: 1/1 Finished
Status: Finished successfully in 10.10 seconds
OK
2015-12-15 HARRODS LTD CD 4636 10.95 1
Time taken: 46.546 seconds, Fetched: 1 row(s)
Dr Mich Talebzadeh
thodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
*FAILED: ParseException line 6:7 Failed to recognize predicate 'inner'.
Failed r
n send the presentation
Cheers
Dr Mich Talebzadeh
Which version of Hive is it?
Can you log in to Hive via the Hive CLI?
*$HIVE_HOME/bin/hive*
Logging initialized using configuration in
file:/usr/lib/hive/conf/hive-log4j2.properties
hive>
HTH
Dr Mich Talebzadeh
which is a Data Warehouse.
HTH
Dr Mich Talebzadeh
...I do not agree with you...
Yeah right. I am so upset. Was waiting for your nod
LOL
Dr Mich Talebzadeh
Thanks Marcin.
What is your guesstimate on the order of "faster", please?
Cheers
Dr Mich Talebzadeh
echo `date` " ""=== Starting hiveserver metastore ===" >> ${LOG_FILE}
$HIVE_HOME/bin/hive --service metastore &
netstat -alnp|egrep 'Local|9083'
echo `date` " ""=== Started hiveserver2 metastore ===" >> ${LOG_FILE}
exit
HTH
to Hive on Spark, or do they apply equally to Hive on
MapReduce as well? In other words, is this a general issue with the Hive
optimizer, as in HIVE-9044?
Thanks
Dr Mich Talebzadeh
Nice one Shaw
hive> set hive.execution.engine;
hive.execution.engine=mr
Thanks
Dr Mich Talebzadeh
Which version of Hive and Spark, please?
Dr Mich Talebzadeh
Fine. Which version of Spark are you using for the Hive execution/query
engine, please?
Dr Mich Talebzadeh
will need to build from source.
It works and it is stable.
Otherwise you may decide to use the Spark Thrift Server (STS), which allows
JDBC access to Spark SQL (through Beeline, SQuirreL, Zeppelin) and has the
Hive SQL context built into it, as if you were using HiveServer2 (HS2).
HTH
Dr Mich Talebzadeh
make sense and is meaningless without any evidence.
Either provide evidence that you have done this work and encountered
errors, or better, do not mention it. It sounds like scaremongering.
Dr Mich Talebzadeh
://rhes564:9000/data/stg/*.gz
HTH
Dr Mich Talebzadeh
and d_moy=12
and d_year=2001
group by i_brand, i_brand_id
order by ext_price desc, i_brand_id
limit 100 ;
What was the type (Parquet, text, ORC etc) and row count for each three
tables above?
thanks
Dr Mich Talebzadeh
Sounds like, if I am correct, you are joining a fact table (store_sales)
with two dimensions?
Cool.
thanks
Dr Mich Talebzadeh
or is it
stored as-is and just masked?
The version of Hive I am using is 2 and it works OK for primitive data
types (INSERT/SELECT from INT to STRING).
However, I believe Mahender is referring to complex types?
Thanks
Dr Mich Talebzadeh
14.38 16 times
ORC 202.33317.77 11 times
So the hybrid engine seems to make a big difference; if I just compare
Tez only against Tez + LLAP, the gain is more than 3 times.
Cheers,
Dr Mich Talebzadeh
in La Tasca West India Docks Road E14
<http://www.meetup.com/futureofdata-london/events/232423292/>
and especially if you like Spanish food :)
Regards,
Dr Mich Talebzadeh
In Hive 2, I don't see this issue (INSERT/SELECT from an INT to a STRING column)!
Dr Mich Talebzadeh
Hi,
This is interesting. Are there any recent presentations of Hive on Tez and
Hive on Tez with LLAP?
Also, have there been simple benchmarks to compare:
1. Hive on MR
2. Hive on Tez
3. Hive on Tez with LLAP
It would be interesting to see how these three fare.
Thanks
Dr Mich Talebzadeh
Hi Marcin,
For Hive on Spark I can use the Spark 1.3.1 UI, which does not have the DAG
diagram (later versions like 1.6.1 have it). But yes, you are correct.
However, I was certain that Gopal was working on a UI interface, if my
memory serves me right.
Cheers,
Mich
Hi Gopal,
If I recall correctly, you were working on UI support for Hive. Currently
the one available is the standard Hadoop one on port 8088.
Do you have any timeline for which release of Hive is going to have this
facility?
Thanks,
Dr Mich Talebzadeh
Hi Marcin,
Which two web interfaces are these? I know the usual one on 8088; is there
any other one?
I want something in line with what Spark provides. I thought Gopal had
something:
[image: Inline images 1]
Cheers
Dr Mich Talebzadeh
14.38
ORC 202.33317.77
Still, I would use Spark if I had a choice, and I agree that on VLT (very
large tables) the limitation in available memory may be the overriding
factor in using Spark.
HTH
Dr Mich Talebzadeh
I guess that is what DAG adds up to with Tez
Dr Mich Talebzadeh
later but it will be very useful to remove thriftserver, if we can. "
Cheers,
Mich
Dr Mich Talebzadeh
to Hive on MR.
One experiment is worth hundreds of opinions
Dr Mich Talebzadeh
compared to Hive or not? Will it keep the data in memory for reuse or not?
6. What I don't understand is what makes Tez and LLAP more efficient
compared to Spark!
Cheers
Dr Mich Talebzadeh
Please send a brief message to Unsubscribe: user-unsubscr...@hive.apache.org
in here <https://hive.apache.org/mailing_lists.html>
HTH
Dr Mich Talebzadeh
can switch the engines
set hive.execution.engine=tez;
Thanks
Dr Mich Talebzadeh
(10,0))
outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5,
_col6
ListSink
*And Hive on Spark returns the same 24 rows in 30 seconds*
OK, the Hive query is just slower with the Spark engine.
Assuming that the time taken will be optimization time + query time then it
appear
erested please register here
<http://www.meetup.com/futureofdata-london/events/232423292/>
Looking forward to seeing those who can make it, for an interesting
discussion leveraging your experience.
Regards,
Dr Mich Talebzadeh
16/07/07 11:36:22 [main]: DEBUG conf.VariableSubstitution: Substitution is
on: hive
HTH
Dr Mich Talebzadeh
of transaction activity using ORC files with
Insert/Update/Delete that need to communicate with metastore with heartbeat
etc?
HTH
Dr Mich Talebzadeh
Dr Mich Talebzadeh
Hi
Is this available in Hive 2?
hive> set hive.async.log.enabled=false;
Query returned non-zero code: 1, cause: hive configuration
hive.async.log.enabled does not exists.
Thanks
Dr Mich Talebzadeh
Cumulative CPU: 721.83 sec HDFS Read:
400442823 HDFS Write: 10 SUCCESS
Total MapReduce CPU Time Spent: 12 minutes 1 seconds 830 msec
OK
1
*Time taken: 239.532 seconds, Fetched: 1 row(s)*
I leave it to you guys to guess which one is better :)
Cheers
Dr Mich Talebzadeh
,0), amount_sold: decimal(10,0)]
scala> s2.write.mode("overwrite").parquet("/data/stg/newtable/sales5")
HTH
Dr Mich Talebzadeh
Do you know the existing table schema? The new table's schema will be based
on that table, without partitioning?
Dr Mich Talebzadeh
CREATE EXTERNAL TABLE sales5 AS SELECT * FROM SALES;
FAILED: SemanticException [Error 10070]: CREATE-TABLE-AS-SELECT cannot
create external table
Dr Mich Talebzadeh
Yes, but that essentially copies the metadata and leaves the partition there
with no data. It is just an image copy; it won't help in this case.
Dr Mich Talebzadeh
to extend it
beyond 1024 rows to include the whole column in the table?
VQE would be very useful, especially with ORC, as it basically means that one
can process the whole column separately, thus improving the performance of
the query.
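As a rough illustration of the idea behind vectorized execution (processing values in fixed-size batches rather than row by row), here is a minimal Python sketch; the 1024 batch size mirrors Hive's vector size, but nothing else here is Hive code:

```python
BATCH_SIZE = 1024  # Hive's default vectorized row batch size

def vectorized_sum(column):
    """Process a column in fixed-size batches instead of row by row."""
    total = 0
    for start in range(0, len(column), BATCH_SIZE):
        batch = column[start:start + BATCH_SIZE]  # one "vector" of values
        total += sum(batch)                       # tight loop over the batch
    return total

print(vectorized_sum(list(range(5000))))  # 12497500
```

The performance win in a real engine comes from the tight inner loop over a batch (better cache locality, fewer per-row virtual calls), not from the batching itself.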
HTH
Dr Mich Talebzadeh
You won't have this problem if you use Spark as the execution engine. That
handles concurrency OK.
Dr Mich Talebzadeh
You won't have this problem if you use Spark as the execution engine! This
setup handles concurrency, but Hive with Spark is not part of the HW distro.
HTH
Dr Mich Talebzadeh
Mich Talebzadeh
Great. In that case they can try it, and I am pretty sure that if they are
stuck they can come and ask you for expert advice, since Hortonworks does
not support Hive on Spark, and I know that.
Dr Mich Talebzadeh
I wonder whether hive.support.concurrency is set to true, with ZooKeeper
running and hive.lock.manager set to
org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.
HTH
Dr Mich Talebzadeh
-- I am pretty sure that they will support it because the Spark option is
supported
Hortonworks supports Spark but not Hive on Spark. Their official distro is
Hive on Tez + LLAP.
Not sure where you get your information from, though.
I got it from Hortonworks and I know that
Dr Mich Talebzadeh
without those two columns and of course will not
be partitioned.
HTH
Dr Mich Talebzadeh
rod_id` bigint,
`cust_id` bigint,
`time_id` timestamp,
`channel_id` bigint,
`promo_id` bigint,
`quantity_sold` decimal(10,0),
`amount_sold` decimal(10,0))
*PARTITIONED BY ( `year` int, `month` int)*
-
Which is not that useful
HTH
Dr Mich Talebzadeh
what command did you use to start hiveserver2?
$HIVE_HOME/bin/hiveserver2 &
Is the port 10010 used?
netstat -alnp|egrep 'Local|10010'
HTH
Dr Mich Talebzadeh
see them:
netstat -plten|egrep 'Local|1000|9083'
HTH
Dr Mich Talebzadeh
Thanks Alan.
One crude solution would be to copy the data from the ACID table to a simple
table and present that table to Spark to see the data.
This is basically a Spark optimiser issue, not the engine itself.
My Hive runs on the Spark query engine and all works fine there.
HTH
Dr Mich Talebzadeh
Thanks Gopal.
I am on Spark 1.6.1 and getting the following error
scala> var conn = LlapContext.newInstance(sc, hs2_url);
:28: error: not found: value LlapContext
var conn = LlapContext.newInstance(sc, hs2_url);
Dr Mich Talebzadeh
Rather than queuing it
hive> alter table payees COMPACT 'major';
Compaction enqueued.
OK
Thanks
Dr Mich Talebzadeh
options do I have here?
And interestingly, Hive running on the Spark engine works fine.
Thanks
Dr Mich Talebzadeh
support
INSERT INTO: Table: accounts.payees
What would be the least painful solution without some elaborate means?
Thanks
Dr Mich Talebzadeh
binding is of type
[org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in
file:/usr/lib/hive/conf/hive-log4j2.properties
hive>
Dr Mich Talebzadeh
Hi,
Has there been any study of how much compressing Hive Parquet tables with
Snappy reduces storage space, or simply the table size, in quantitative terms?
Thanks
Dr Mich Talebzadeh
Hi,
I have not tried this, but someone mentioned that it is possible to use
Sqoop to get data from an Impala/Hive table in one cluster to another?
The clusters are in different zones. This is to test the cluster. Has
anyone done such a thing?
Thanks
Dr Mich Talebzadeh
This is not really a test, is it?
Dr Mich Talebzadeh
Regardless, there is no point using Sqoop for such a purpose; it is not
really designed for it :)
Dr Mich Talebzadeh
A join is by default an inner join, as in Oracle or Sybase.
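The same default can be demonstrated in any SQL engine; a small sketch using Python's built-in sqlite3, with made-up table names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE emp (id INTEGER, dept_id INTEGER)")
cur.execute("CREATE TABLE dept (id INTEGER, name TEXT)")
cur.executemany("INSERT INTO emp VALUES (?, ?)", [(1, 10), (2, 20), (3, 99)])
cur.executemany("INSERT INTO dept VALUES (?, ?)", [(10, "sales"), (20, "hr")])

# A bare JOIN behaves as INNER JOIN: the unmatched dept_id 99 is dropped.
rows = sorted(cur.execute(
    "SELECT e.id, d.name FROM emp e JOIN dept d ON e.dept_id = d.id"
).fetchall())
print(rows)  # [(1, 'sales'), (2, 'hr')]
```

To keep the unmatched row, you would have to say LEFT OUTER JOIN explicitly.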
HTH
Dr Mich Talebzadeh
to STRING.
What is the thread's view on this?
Thanks
Dr Mich Talebzadeh
in HDFS compared to
STRING columns?
Cheers
Dr Mich Talebzadeh
as opposed to
String make any difference in terms of storage efficiency?
Regards
Dr Mich Talebzadeh
Sounds like the VARCHAR and CHAR types were created for Hive to have ANSI SQL
compliance. Otherwise they seem to be practically the same as the String type.
HTH
Dr Mich Talebzadeh
Hi,
You are partitioning by month and bucketing by date or day?
If that is the case, you only have 30-31 hash partitions (buckets) for
each month?
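The arithmetic behind that: Hive assigns a row to a bucket by hashing the bucketing column modulo the number of buckets, so a day-of-month column can only ever populate at most 31 distinct buckets. A rough Python sketch (Hive's actual hash function differs, and the bucket count here is hypothetical):

```python
NUM_BUCKETS = 31  # hypothetical bucket count for the table

def bucket_for(day_of_month: int) -> int:
    # Hive-style bucket assignment: hash(column) mod number of buckets.
    # Python's hash of a small non-negative int is the int itself.
    return hash(day_of_month) % NUM_BUCKETS

# Days 1..31 can only ever land in 31 distinct buckets, so declaring
# more buckets than that would leave the extra buckets empty.
buckets = {bucket_for(d) for d in range(1, 32)}
print(len(buckets))  # 31
```

This is why bucketing on a low-cardinality column caps the useful bucket count regardless of how many buckets the DDL declares.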
HTH
Dr Mich Talebzadeh
Hi Rahul,
I don't believe you can drop a particular bucket in Hive.
HTH
Dr Mich Talebzadeh
+--+----------+
| 1|London    |
| 2|NY        |
| 3|California|
| 4|Dehli     |
+--+----------+
So the rows are there.
Let me go to Hive again now
hive> select * from testme;
OK
1 London
2 NY
3 California
4 Dehli
hive> analyze table testme compute statistics
Are you using a vendor distro or an in-house build?
Dr Mich Talebzadeh
out first
time and sync the Hive table with the Sybase IQ table in real time. You will
need SRS SP 204 or above to make this work.
Talk to your DBA about getting the SRS SP from Sybase for this purpose. I
have done it many times and I think it is stable enough for this purpose.
HTH
Dr Mich Talebzadeh
hm. Watching paint dry :)
Dr Mich Talebzadeh
Try this:
hive.limit.optimize.fetch.max
- Default Value: 5
- Added In: Hive 0.8.0
Maximum number of rows allowed for a smaller subset of data for simple
LIMIT, if it is a fetch query. Insert queries are not restricted by this
limit.
HTH
Dr Mich Talebzadeh
there into Hive table.
There are other ways of using JDBC say through Spark etc.
HTH
Dr Mich Talebzadeh
this is my experience.
HTH
Dr Mich Talebzadeh
;
set spark.ui.port=;
HTH
Dr Mich Talebzadeh
engine.
HTH
Dr Mich Talebzadeh