Hi friends,
I am trying to create a Hive table through Spark with Java code in Eclipse,
using the code below.
HiveContext sqlContext = new org.apache.spark.sql.hive.HiveContext(sc.sc());
sqlContext.sql("CREATE TABLE IF NOT EXISTS src (key INT, value STRING)");
but I am getting an error.
Hi,
Here is my Java code:
SparkConf sparkConf = Constance.getSparkConf();
JavaSparkContext sc = new JavaSparkContext(sparkConf);
SQLContext sql = new SQLContext(sc);
HiveContext sqlContext = new HiveContext(sc.sc());
List fields = new
I solved this now:
just run 'refresh table shop.id' in beeline.
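(As an aside, Spark has a counterpart for its own side of this: when data changes underneath a table Spark has cached metadata for, HiveContext in Spark 1.3+ exposes refreshTable. A minimal sketch, using the table name from the message above:)

// refresh Spark SQL's cached metadata/data for the table
hiveContext.refreshTable("shop.id")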
> Hi Ted,
>
> The self join works fine on tables where the HiveContext tables are direct
> hive tables, therefore
>
> table1 = hiveContext.sql("select columnA, columnB from hivetable1")
> table1.registerTempTable("table1")
> table1.cache()
> table1
in SPARK
>
> https://forums.databricks.com/questions/2142/self-join-in-spark-sql.html
>
>
> Regards,
> Gourav
>
> On Thu, Dec 17, 2015 at 10:52 AM, Gourav Sengupta <
> gourav.sengu...@gmail.com> wrote:
>
>> Hi Ted,
>>
>> The self join works fi
a <
> gourav.sengu...@gmail.com> wrote:
>
>> hi,
>>
>> I think that people have reported the same issue elsewhere, and this
>> should be registered as a bug in SPARK
>>
>> https://forums.databricks.com/questions/2142/self-join-in-spark-sql.html
>> [programme_key#1802,is_logged_in#1295L,is_4od_video_view#1327L],
>> (MetastoreRelation default, omnitureweb_log, None), [hit_month#1289 IN
>> (2015-11),hit_day#1290 IN (20)]
>>
>> Code Generation: true
>>
>>
>>
>> Regards,
>> Gourav
Hi Ted,
The self join works fine on tables where the HiveContext tables are direct
hive tables, therefore
table1 = hiveContext.sql("select columnA, columnB from hivetable1")
table1.registerTempTable("table1")
table1.cache()
table1.count()
and if I do a self join on table1 t
I did the following exercise in spark-shell ("c" is a cached table):
scala> sqlContext.sql("select x.b from c x join c y on x.a = y.a").explain
== Physical Plan ==
Project [b#4]
+- BroadcastHashJoin [a#3], [a#125], BuildRight
:- InMemoryColumnarTableScan [b#4,a#3], InMemoryRelation
Hi,
This is how the data can be created:
1. TableA : cached()
2. TableB : cached()
3. TableC: TableA inner join TableB cached()
4. TableC join TableC does not take the data from cache but starts reading
the data for TableA and TableB from disk.
Does this sound like a bug? The self join between
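(A minimal sketch of the four steps above, with hypothetical table and column names; this illustrates the reported behaviour, it is not the poster's actual code:)

// steps 1 and 2: cache the two source tables
val tableA = hiveContext.sql("SELECT id, a FROM hive_a").cache()
tableA.registerTempTable("TableA")
val tableB = hiveContext.sql("SELECT id, b FROM hive_b").cache()
tableB.registerTempTable("TableB")
// step 3: TableC is the cached join of TableA and TableB
val tableC = hiveContext.sql(
  "SELECT TableA.id, a, b FROM TableA JOIN TableB ON TableA.id = TableB.id").cache()
tableC.registerTempTable("TableC")
// step 4: the reported problem: this self join re-reads TableA/TableB from disk
hiveContext.sql("SELECT x.id FROM TableC x JOIN TableC y ON x.id = y.id").count()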
Hi,
>>>
>>> I have a HIVE table with a few thousand partitions (based on date and
>>> time). It takes a long time to run the first time and then
>>> subsequently it is fast.
>>>
>>> Is there a way to store the cache of partition lookups
server running
continuously), I can immediately restore back the temptable in hiveContext
without asking it to go again and cache the partition lookups?
Currently it takes around 1.5 hours for me just to cache in the partition
information and after that I can see that the job gets queued in the SPARK
UI
n subsequently it
> is fast.
>
> Is there a way to store the cache of partition lookups so that every time
> I start a new SPARK instance (cannot keep my personal server running
> continuously), I can immediately restore back the temptable in hiveContext
> without asking it to go again and cache
Hi everyone,
I'm using HiveContext and SparkSQL to query a Hive table and doing a join
operation on it.
After changing the default serializer to Kryo with
spark.kryo.registrationRequired = true, the Spark application failed with
the following error:
java.lang.IllegalArgumentException: Class
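(For context, a hedged sketch of how classes are normally registered when spark.kryo.registrationRequired is on; MyCaseClass stands in for whatever class the exception names:)

import org.apache.spark.SparkConf

case class MyCaseClass(id: Long) // hypothetical class that the job serializes

val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrationRequired", "true")
  // register every serialized class; the IllegalArgumentException lists the missing one
  .registerKryoClasses(Array(classOf[MyCaseClass], classOf[Array[MyCaseClass]]))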
I'm trying to do this in unit tests:
val sConf = new SparkConf()
  .setAppName("RandomAppName")
  .setMaster("local")
val sc = new SparkContext(sConf)
val sqlContext = new TestHiveContext(sc) // tried new HiveContext(sc) as well
But I get this:
[sc
Date: Tuesday, December 8, 2015 at 4:09 AM
To: "user@spark.apache.org" <user@spark.apache.org>
Subject: Re: HiveContext creation failed with Kerberos
On 8 Dec 2015, at 06:52, Neal Yin
> wrote:
15/12/08 04:12:28 ERROR transport.TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException:
No valid credentials provided (Mechanism
eSparkConf(…)
val sparkContext = new JavaSparkContext(sparkConf)
new HiveContext(sparkContext.sc) // failed
})
Spark context boots up fine with UGI, but HiveContext creation fails with the
following message. If I manually do kinit within the same shell, this code works.
Any thoughts?
15/12/0
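(A hedged sketch of the usual UGI alternative to a manual kinit: log in from a keytab before creating the HiveContext. The principal and keytab path are hypothetical:)

import org.apache.hadoop.security.UserGroupInformation
import org.apache.spark.sql.hive.HiveContext

// log in explicitly so the Hive metastore client can find Kerberos credentials
UserGroupInformation.loginUserFromKeytab("someuser@EXAMPLE.COM",
  "/etc/security/keytabs/someuser.keytab")
val hiveContext = new HiveContext(sc)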
Could you provide your hive-site.xml file info?
Best,
Sun.
fightf...@163.com
From: Chandra Mohan, Ananda Vel Murugan
Date: 2015-11-27 17:04
To: fightf...@163.com; user
Subject: RE: error while creating HiveContext
Hi,
I verified and I could see hive-site.xml in spark conf directory
Hi,
I am building a spark-sql application in Java. I created a maven project in
Eclipse and added all dependencies including spark-core and spark-sql. I am
creating a HiveContext in my spark program and then trying to run sql queries
against my Hive Table. When I submit this job in spark, for some
Hi,
I think you just want to put hive-site.xml in the spark/conf directory, and it
will be loaded into the Spark classpath.
Best,
Sun.
fightf...@163.com
From: Chandra Mohan, Ananda Vel Murugan
Date: 2015-11-27 15:04
To: user
Subject: error while creating HiveContext
Hi,
I am building
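(A quick way to verify that advice took effect, a sketch to run from spark-shell:)

// prints a file: or jar: URL when hive-site.xml is on the classpath, null otherwise
println(getClass.getClassLoader.getResource("hive-site.xml"))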
Hi Zhan,
Thank you for providing a workaround!
I will try this out but I agree with Ted, there should be a better way to
capture the exception and handle it by just initializing SQLContext instead of
HiveContext. WARN the user that something is wrong with his hive setup.
Having
1:9083
HW11188:spark zzhang$
By the way, I don’t know whether there is any caveat for this workaround.
Thanks.
Zhan Zhang
On Nov 6, 2015, at 2:40 PM, Jerry Lam
<chiling...@gmail.com> wrote:
Hi Zhan,
I don’t use HiveContext features at
I agree with minor change. Adding a config to provide the option to init
SQLContext or HiveContext, with HiveContext as default instead of bypassing
when hitting the Exception.
Thanks.
Zhan Zhang
On Nov 6, 2015, at 2:53 PM, Ted Yu
<yuzhih...@gmail.com>
I would suggest adding a config parameter that allows bypassing
initialization of HiveContext in case of SQLException
Cheers
On Fri, Nov 6, 2015 at 2:50 PM, Zhan Zhang <zzh...@hortonworks.com> wrote:
> Hi Jerry,
>
> OK. Here is an ugly workaround.
>
> Put a hive-site.xml u
If your assembly jar has the hive jars included, the HiveContext will be used.
Typically, HiveContext has more functionality than SQLContext. In what case do you
have to use SQLContext for something that cannot be done by HiveContext?
Thanks.
Zhan Zhang
On Nov 6, 2015, at 10:43 AM, Jerry Lam
<chiling...@gmail.
Hi Zhan,
I don’t use HiveContext features at all. I use mostly the DataFrame API. It is
sexier and has far fewer typos. :)
Also, HiveContext requires a metastore database setup (Derby by default). The
problem is that I cannot have 2 spark-shell sessions running at the same time
in the same host (e.g
I agree with Ted, there should be a better way to
capture the exception and handle it by just initializing SQLContext instead of
HiveContext. WARN the user that something is wrong with his hive setup.
Having spark.sql.hive.enabled false configuration would be lovely too. :)
Just an addi
Hi Ted,
I was trying to set spark.sql.dialect to sql to specify that I only need
“SQLContext”, not HiveContext. It didn’t work. It still instantiates HiveContext.
Since I don’t use HiveContext and I don’t want to start a mysql database
because I want to have more than 1 session of spark-shell
Hi spark users and developers,
Is it possible to disable HiveContext from being instantiated when using
spark-shell? I got the following errors when more than one session
starts. Since I don't use HiveContext, it would be great if I can have more
than 1 spark-shell start at the same time
What is interesting is that the pyspark shell works fine with multiple sessions on
the same host even though multiple HiveContexts have been created. What does
pyspark do differently in terms of starting up the shell?
> On Nov 6, 2015, at 12:12 PM, Ted Yu <yuzhih...@gmail.com> wrote:
Hi all,
# Program Sketch
I create a HiveContext `hiveContext`
With that context, I create a DataFrame `df` from a JDBC relational table.
I register the DataFrame `df` via df.registerTempTable("TESTTABLE").
I start a HiveThriftServer2 via
HiveThriftServer2.startWithContext(h
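(A minimal sketch of that program, assuming Spark 1.4+; the JDBC URL, source table, and properties are placeholders, not the poster's actual settings:)

import org.apache.spark.sql.hive.HiveContext
import org.apache.spark.sql.hive.thriftserver.HiveThriftServer2

val hiveContext = new HiveContext(sc)
val df = hiveContext.read.jdbc("jdbc:postgresql://dbhost/db", "source_table",
  new java.util.Properties())
df.registerTempTable("TESTTABLE")
// expose the registered temp table to JDBC/ODBC clients such as beeline
HiveThriftServer2.startWithContext(hiveContext)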
I am not sure if we really want to support that with HiveContext, but a
workaround is to use the Spark package at https://github.com/databricks/spark-csv
From: Felix Cheung [mailto:felixcheun...@hotmail.com]
Sent: Tuesday, October 27, 2015 10:54 AM
To: Daniel Haviv; user
Subject: RE: HiveContext
I will
Thank you.
> On 27 Oct 2015, at 4:54, Felix Cheung <felixcheun...@hotmail.com> wrote:
>
> Please open a JIRA?
>
>
> Date: Mon, 26 Oct 2015 15:32:42 +0200
> Subject: HiveContext ignores ("skip.header.line.count"="1")
> From: daniel.ha
Please open a JIRA?
Date: Mon, 26 Oct 2015 15:32:42 +0200
Subject: HiveContext ignores ("skip.header.line.count"="1")
From: daniel.ha...@veracity-group.com
To: user@spark.apache.org
Hi, I have a csv table in Hive which is configured to skip the header row
Hi,
I have a csv table in Hive which is configured to skip the header row using
TBLPROPERTIES("skip.header.line.count"="1").
When querying from Hive the header row is not included in the data, but
when running the same query via HiveContext I get the header row.
I made sure t
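(A sketch that reproduces the mismatch, with a hypothetical table; in the Hive CLI the header is skipped, while the same query through HiveContext returns it:)

hiveContext.sql("""CREATE TABLE IF NOT EXISTS csv_test (a STRING, b STRING)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
  TBLPROPERTIES ("skip.header.line.count"="1")""")
// the first row returned here is the header, unlike in the Hive CLI
hiveContext.sql("SELECT * FROM csv_test").show()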
I think DF performs the same as the SQL API does in the multi-inserts, if you
don’t use the cached table.
Hao
From: Daniel Haviv [mailto:daniel.ha...@veracity-group.com]
Sent: Friday, October 9, 2015 3:09 PM
To: Cheng, Hao
Cc: user
Subject: Re: Insert via HiveContext is slow
Thanks Hao
Hi all, would this be a bug??
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.rowNumber

val ws = Window.
  partitionBy("clrty_id").
  orderBy("filemonth_dtt")
val nm = "repeatMe"
df.select(df.col("*"), rowNumber().over(ws).cast("int").as(nm))
Which version of Spark?
On Thu, Oct 8, 2015 at 7:25 AM, wrote:
> Hi all, would this be a bug??
>
> val ws = Window.
> partitionBy("clrty_id").
> orderBy("filemonth_dtt")
>
> val nm = "repeatMe"
>
Hi, thanks for looking into it. v1.5.1. I am really worried.
I don't have Hive/Hadoop for real in the environment.
Saif
From: Michael Armbrust [mailto:mich...@databricks.com]
Sent: Thursday, October 08, 2015 2:57 PM
To: Ellafi, Saif A.
Cc: user
Subject: Re: RowNumber in HiveContext returns null
...@databricks.com
Cc: user@spark.apache.org
Subject: RE: RowNumber in HiveContext returns null or negative values
Hi, thanks for looking into it. v1.5.1. I am really worried.
I don't have Hive/Hadoop for real in the environment.
Saif
From: Michael Armbrust [mailto:mich...@databricks.com]
Sent: Thursday
Oct 8, 2015 at 9:51 PM, Daniel Haviv <
daniel.ha...@veracity-group.com> wrote:
> Hi,
> I'm inserting into a partitioned ORC table using an insert sql statement
> passed via HiveContext.
> The performance I'm getting is pretty bad and I was wondering if there are
> ways to speed thi
out soon.
Hao
From: Daniel Haviv [mailto:daniel.ha...@veracity-group.com]
Sent: Friday, October 9, 2015 3:08 AM
To: user
Subject: Re: Insert via HiveContext is slow
Forgot to mention that my insert is a multi table insert :
sqlContext2.sql("""from avro_events
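(For readers unfamiliar with the syntax, a hedged sketch of a Hive multi-table insert; the target tables and columns are hypothetical:)

sqlContext2.sql("""
  FROM avro_events
  INSERT INTO TABLE events_2015 SELECT ts, payload WHERE year = 2015
  INSERT INTO TABLE events_2014 SELECT ts, payload WHERE year = 2014
""")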
later
Hi,
I'm inserting into a partitioned ORC table using an insert sql statement
passed via HiveContext.
The performance I'm getting is pretty bad and I was wondering if there are
ways to speed things up.
Would saving the DF like this
df.write().mode(SaveMode.Append).partitionBy("date").s
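(Completing that line as a hedged sketch in Scala; the ORC format and output path are assumptions based on the table described above:)

import org.apache.spark.sql.SaveMode

df.write.mode(SaveMode.Append).partitionBy("date").format("orc")
  .save("/path/to/table_dir")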
Repartition and default parallelism to 1, in cluster mode, is still broken.
So the problem is not the parallelism, but the cluster mode itself. Something
wrong with HiveContext + cluster mode.
Saif
From: saif.a.ell...@wellsfargo.com [mailto:saif.a.ell...@wellsfargo.com]
Sent: Thursday, October
Can you open a JIRA?
On Thu, Oct 8, 2015 at 11:24 AM, <saif.a.ell...@wellsfargo.com> wrote:
> Repartition and default parallelism to 1, in cluster mode, is still
> *broken*.
>
>
>
> So the problem is not the parallelism, but the cluster mode itself.
> Something wrong
Hi,
I do a sql query on about 10,000 partitioned orc files. Because of the
partition schema the files cannot be merged any longer (to reduce the
total number).
From this command hiveContext.sql(sqlText), 10K tasks were created
to handle each file. Is it possible to use fewer tasks? How
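(One knob worth trying, offered as an assumption rather than a known fix: coalesce the result down to fewer partitions; without a shuffle, adjacent file splits get merged into the same task:)

// 200 is a hypothetical target partition count
val result = hiveContext.sql(sqlText).coalesce(200)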
Hi,
I have a HiveContext job which takes less than 1 minute to complete in local
mode with 16 cores.
However, when I launch it over a stand-alone cluster, it takes forever and probably
can't even finish, even when the only node up is the same one on which I
execute it locally.
How could I
on 'clean'
context.
I found it hard to do as, first of all, I couldn't find any way to
tell whether a SparkContext is already stopped. It has a flag for that,
but it's private.
Another problem is that when creating a local HiveContext it initializes a Derby
instance; when trying to create a new
wrote:
> Hi,
> I want to create an external hive table using HiveContext. I have the
> following :
> 1. full path/location of parquet data directory
> 2. name of the new table
> 3. I can get the schema as well.
>
> What API will be the best (for 1.3.x or 1.4.x)? I can see 6
...OPTIONS (path '')")
When you specify the path, it's automatically created as an external table. The
schema will be discovered.
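(A minimal sketch of one of those APIs, assuming Spark 1.3+; the table name and path are hypothetical:)

// external because a path is given; the schema is discovered from the parquet files
hiveContext.createExternalTable("my_table", "parquet",
  Map("path" -> "/data/parquet_dir"))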
On Wed, Sep 9, 2015 at 9:33 PM, Mohammad Islam <misla...@yahoo.com.invalid>
wrote:
Hi, I want to create an external hive table using HiveContext. I have the
following
Hi, I want to create an external hive table using HiveContext. I have the following:
1. full path/location of parquet data directory
2. name of the new table
3. I can get the schema as well.
What API will be the best (for 1.3.x or 1.4.x)? I can see 6
createExternalTable() APIs but not sure which
of the unit
test) and I believe it has something to do with HiveContext not reclaiming
memory after it is finished (or I'm not shutting it down properly).
It could very well be related to sbt, however, it's not clear to me.
On Tue, Aug 25, 2015 at 1:12 PM, Yana Kadiyska yana.kadiy...@gmail.com
wrote
.
However, the primary issue is that running the same unit test in the same
JVM (multiple times) results in increased memory (each run of the unit
test) and I believe it has something to do with HiveContext not reclaiming
memory after it is finished (or I'm not shutting it down properly).
It could
Hello,
I am using sbt and created a unit test where I create a `HiveContext` and
execute some query and then return. Each time I run the unit test the JVM
will increase it's memory usage until I get the error:
Internal error when running tests: java.lang.OutOfMemoryError: PermGen space
Exception
test where I create a `HiveContext` and
execute some query and then return. Each time I run the unit test the JVM
will increase its memory usage until I get the error:
Internal error when running tests: java.lang.OutOfMemoryError: PermGen
space
Exception in thread Thread-2
Well, I managed to solve that issue after running my tests on a Linux
system instead of Windows (which I was originally using). However, now I
have an error when I try to reset the hive context using hc.reset(). It
tries to create a file inside directory /user/my_user_name instead of the
usual
Well, I tried this approach and still have issues. Apparently TestHive cannot
delete the hive metastore directory. The complete error that I have is:
15/08/06 15:01:29 ERROR Driver: FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask.
if not
exists and does insert into using hiveContext.sql. Now we can't execute
hiveContext on an executor, so I have to execute this for loop in the driver
program, and it runs serially one by one. When I submit this Spark job on a YARN
cluster, almost all the time my executor gets lost because of shuffle not
found
Hello,
I am trying to define an external Hive table from Spark HiveContext like the
following:
import org.apache.spark.sql.hive.HiveContext
val hiveCtx = new HiveContext(sc)
hiveCtx.sql(s"CREATE EXTERNAL TABLE IF NOT EXISTS Rentrak_Ratings (Version
string, Gen_Date string, Market_Number
We are using a local hive context in order to run unit tests. Our unit
tests run perfectly fine if we run them one by one using sbt, as in the next
example:
sbt "test-only com.company.pipeline.scalers.ScalerSuite.scala"
sbt "test-only com.company.pipeline.labels.ActiveUsersLabelsSuite.scala"
However, if we
TestHive takes care of creating a temporary directory for each invocation
so that multiple test runs won't conflict.
On Mon, Aug 3, 2015 at 3:09 PM, Cesar Flores ces...@gmail.com wrote:
We are using a local hive context in order to run unit tests. Our unit
tests run perfectly fine if we run
has anyone tried to make a HiveContext only if the class is available?
I tried this:
implicit lazy val sqlc: SQLContext = try {
  Class.forName("org.apache.spark.sql.hive.HiveContext", true,
    Thread.currentThread.getContextClassLoader)
    .getConstructor(classOf[SparkContext]).newInstance(sc
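(A completed version of that idea, as a sketch; the fallback branch is an assumption about the intent:)

import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

implicit lazy val sqlc: SQLContext = try {
  Class.forName("org.apache.spark.sql.hive.HiveContext", true,
    Thread.currentThread.getContextClassLoader)
    .getConstructor(classOf[SparkContext]).newInstance(sc).asInstanceOf[SQLContext]
} catch {
  // fall back to a plain SQLContext when the hive classes are not on the classpath
  case _: ClassNotFoundException => new SQLContext(sc)
}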
Does spark HiveContext support the rank() ... distribute by syntax (as in
the following article-
http://www.edwardcapriolo.com/roller/edwardcapriolo/entry/doing_rank_with_hive
)?
If not, how can it be achieved?
Thanks,
Lior
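(A sketch of the syntax being asked about, with a hypothetical table; Hive accepts DISTRIBUTE BY/SORT BY inside OVER as the equivalent of PARTITION BY/ORDER BY:)

hiveContext.sql("""
  SELECT user_id, score,
         rank() OVER (DISTRIBUTE BY user_id SORT BY score DESC) AS rnk
  FROM scores
""")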
the customized
UDF of rank.
Yong
Date: Thu, 16 Jul 2015 15:10:58 +0300
Subject: Use rank with distribute by in HiveContext
From: lio...@taboola.com
To: user@spark.apache.org
Does spark HiveContext support the rank() ... distribute by syntax (as in the
following article-
http://www.edwardcapriolo.com
functions (SQL / DataFrame API): rank / rank, dense_rank / denseRank,
percent_rank / percentRank, ntile / ntile, row_number / rowNumber
HTH.
-Todd
On Thu, Jul 16, 2015 at 8:10 AM, Lior Chaga lio...@taboola.com wrote:
Does spark HiveContext support the rank() ... distribute by syntax (as in
the following article-
http://www.edwardcapriolo.com
https://github.com/apache/spark/blob/master/repl/scala-2.10/src/main/scala/org/apache/spark/repl/SparkILoop.scala#L1023-L1037).
What is the version of Spark you are using? How did you add the spark-csv
jar?
On Thu, Jul 16, 2015 at 1:21 PM, Koert Kuipers ko...@tresata.com
wrote:
has anyone tried to make HiveContext
Hi All,
I am trying to run a simple join on Hive through the Spark shell on a pseudo
Cloudera cluster on an Ubuntu machine:
val hc = new HiveContext(sc)
hc.sql("use testdb")
But it is failing with the message :
org.apache.hadoop.hive.ql.parse.SemanticException: Database does not exist:
testdb
To: huangzheng
Cc: Apache Spark User List
Subject: Re: [SparkR] Float type coercion with hiveContext
I used spark 1.4.0 binaries from official site:
http://spark.apache.org/downloads.html
And running it on:
* Hortonworks HDP 2.2.0.0-2041
* with Hive 0.14
* with disabled hooks for Application Timeline Servers
track it to see
how it will be solved.
Ray
-Original Message-
From: Evgeny Sinelnikov [mailto:esinelni...@griddynamics.com]
Sent: Monday, July 6, 2015 7:27 PM
To: huangzheng
Cc: Apache Spark User List
Subject: Re: [SparkR] Float type coercion with hiveContext
I used spark 1.4.0
Just trying to get started with Spark and attempting to use HiveContext using
spark-shell to interact with existing Hive tables on my CDH cluster but keep
running into the errors (pls see below) when I do hiveContext.sql("show
tables"). Wanted to know what all JARs need to be included to have
[mailto:buntu...@gmail.com]
Sent: Tuesday, July 7, 2015 5:07 PM
To: user@spark.apache.org
Subject: HiveContext throws
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Just trying to get started with Spark and attempting to use HiveContext using
spark-shell to interact with existing
Hello,
I've got trouble with float type coercion in SparkR with hiveContext.
result <- sql(hiveContext, "SELECT offset, percentage from data limit 100")
show(result)
DataFrame[offset:float, percentage:float]
head(result)
Error in as.data.frame.default(x[[i]], optional = TRUE) :
cannot
6:31 PM
To: user@spark.apache.org
Subject: [SparkR] Float type coercion with hiveContext
Hello,
I've got trouble with float type coercion in SparkR with hiveContext.
result <- sql(hiveContext, "SELECT offset, percentage from data limit
100")
show(result)
DataFrame[offset:float
.
Thanks
Best Regards
On Thu, Jul 2, 2015 at 6:11 PM, Daniel Haviv
daniel.ha...@veracity-group.com wrote:
Hi,
I've downloaded the pre-built binaries for Hadoop 2.6 and whenever I start
the spark-shell it always starts with HiveContext.
How can I disable the HiveContext from being initialized
The main reason is Spark's startup time and the need to configure a component I
don't really need (without configs the HiveContext takes more time to load)
Thanks,
Daniel
On 3 ביולי 2015, at 11:13, Robin East robin.e...@xense.co.uk wrote:
As Akhil mentioned there isn’t AFAIK any kind
HiveContext should be a superset of SQLContext, so you should be able to
perform all your tasks. Are you facing any problem with HiveContext?
On 3 Jul 2015 17:33, Daniel Haviv daniel.ha...@veracity-group.com wrote:
Thanks
I was looking for a less hack-ish way :)
Daniel
On Fri, Jul 3, 2015
Hi,
I've downloaded the pre-built binaries for Hadoop 2.6 and whenever I start
the spark-shell it always starts with HiveContext.
How can I disable the HiveContext from being initialized automatically?
Thanks,
Daniel
Hi,
As per my use case I need to submit multiple queries to Spark SQL in
parallel, but due to HiveContext being thread safe the jobs are getting
submitted sequentially.
I could see many threads waiting for HiveContext.
on-spray-can-akka.actor.default-dispatcher-26 - Thread t@149
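(For illustration, a hedged sketch of submitting queries from separate threads; whether they truly run in parallel through one HiveContext is exactly what this thread questions:)

import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// hypothetical queries fired concurrently against the shared HiveContext
Seq("SELECT count(*) FROM t1", "SELECT count(*) FROM t2").foreach { q =>
  Future { hiveContext.sql(q).collect() }
}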
from
mx3.post_tp_annotated_mb_impr where ad_id = 30590918987 and datestamp
='20150623' )
Thanks
Ayman
Hi Marcelo,
The issue does not happen while connecting to the hive metastore; that works
fine. It seems that HiveContext only uses the Hive CLI to execute the queries,
while HiveServer2 does not support it. I don't think you can specify any
configuration in hive-site.xml which can make it connect