> ...spark-acid-shaded-dependencies_2.11
>
> ...and it will be automatically fetched and used.
>
> Thanks,
> Abhishek
>
>
> On Sun, Jul 28, 2019 at 4:42 AM naresh Goud
> wrote:
>
>> It looks like some internal dependency is missing.
>>
>> libraryDependencies ++= Seq(
>>   "com.qubole" % "spark-acid-shaded-dependencies_2.11" % "0.1"
>> )
Hi Abhishek,
We are not able to build the jar from the GitHub code; it fails with the
error below.
Is anyone else able to build the jars? Is there anything else missing?
Note: Unresolved dependencies path:
[warn] com.qubole:spark-acid-shaded-dependencies_2.11:0.1
Thanks, Abhishek.
Will it work on a Hive ACID table which is not compacted, i.e. a table
having base and delta files?
Say we have a Hive ACID table, customer:
CREATE TABLE customer (customer_id INT, customer_name STRING, customer_email STRING)
CLUSTERED BY (customer_id) INTO 10 BUCKETS
STORED AS ORC                            -- ACID tables must be bucketed ORC
LOCATION '/test/customer'
TBLPROPERTIES ('transactional'='true');  -- and marked transactional
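For reference, the Qubole spark-acid data source reads such a table roughly
like this (a sketch based on its README; untested here):

val customerDf = spark.read
  .format("HiveAcid")                   // data source provided by the spark-acid jar
  .option("table", "default.customer")  // database.table from the DDL above
  .load()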
> Spark SQL does not use the Hive
> parser or optimization engine. Instead it uses Catalyst; see
> https://databricks.com/blog/2015/04/13/deep-dive-into-spark-sqls-catalyst-optimizer.html
Hello All,
How can we override jars in spark-submit?
We have the hive-exec-spark jar, which is available as part of the default
Spark cluster jars.
We want to override the above-mentioned jar in spark-submit with the
latest-version jar.
How do we do that?
Thank you,
Naresh
--
Thanks,
Naresh
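One common approach (a sketch, untested; the jar path, class, and app jar
below are hypothetical) is to ship the newer jar and ask Spark to prefer
user-supplied jars over the cluster's copies:

spark-submit \
  --jars /path/to/newer-hive-exec.jar \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  --class com.example.MyApp myapp.jar

Note that the userClassPathFirst settings are marked experimental in the
Spark documentation.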
Hi Team,
Does Spark SQL use the Hive engine to run queries?
My understanding is that Spark SQL uses the Hive metastore to get the
metadata needed to run queries.
Thank you,
Naresh
--
Thanks,
Naresh
www.linkedin.com/in/naresh-dulam
http://hadoopandspark.blogspot.com/
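For what it's worth, Spark needs Hive only for the metastore; enabling that
looks like this (a minimal sketch):

import org.apache.spark.sql.SparkSession

// enableHiveSupport wires Spark SQL to the Hive metastore for table
// metadata; planning and execution still go through Catalyst and Spark.
val spark = SparkSession.builder()
  .appName("hive-metastore-example")
  .enableHiveSupport()
  .getOrCreate()

spark.sql("SHOW TABLES").show()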
You can use the statement below for multiple topics:
val dfStatus = spark.readStream
  .format("kafka")
  .option("subscribe", "utility-status,utility-critical")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("startingOffsets", "earliest")
  .load()
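If the topics share a prefix, a pattern subscription is an alternative
(a sketch; same source otherwise):

val dfStatus = spark.readStream
  .format("kafka")
  .option("subscribePattern", "utility-.*")  // regex instead of a fixed list
  .option("kafka.bootstrap.servers", "localhost:9092")
  .load()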
Also check that enough space is available on the /tmp directory.
On Fri, Aug 17, 2018 at 10:14 AM Jeevan K. Srivatsa <
jeevansriva...@gmail.com> wrote:
> Hi Venkata,
>
> On a quick glance, it looks like a file-related issue more so than an
> executor issue. If the logs are not that important, I would clear
>
What are you doing? Please give more details about what you are doing.
On Wed, May 30, 2018 at 12:58 PM Arun Hive
wrote:
>
> Hi
>
> While running my Spark job component I am getting the following exception.
> Requesting your help on this:
> Spark core version -
> spark-core_2.10-2.1.1
>
> Spark
Change your table name in the query to spam.spamdataset instead of spamdataset.
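For example (assuming the database is named spam):

spark.sql("SELECT * FROM spam.spamdataset").show()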
On Sun, Apr 15, 2018 at 2:12 PM Rishikesh Gawade
wrote:
> Hello there. I am a newbie in the world of Spark. I have been working on a
> Spark Project using Java.
> I have configured Hive and
Whenever Spark reads data, it keeps it in executor memory until there is no
room left for new data being read or processed. This is the beauty of
Spark.
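To guarantee reuse across the two executions, explicit caching is the usual
tool (a minimal sketch; the input path is hypothetical):

val df = spark.read.parquet("/data/input")
df.cache()              // keep the data in executor memory after first use
val first = df.count()  // materializes the cache
val second = df.count() // served from memory, not re-read from storage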
On Tue, Apr 3, 2018 at 12:42 AM snjv wrote:
> Hi,
>
> When we execute the same operation twice, spark
From Spark's point of view it shouldn't have an effect. It's possible to
extend the columns of new Parquet files; it won't affect performance, and
no change to the Spark application code is required.
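If the application should see the union of the old and new columns, schema
merging can be enabled at read time (a sketch; the path is hypothetical):

val df = spark.read
  .option("mergeSchema", "true")  // reconcile differing Parquet schemas
  .parquet("/data/events")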
On Tue, Apr 3, 2018 at 9:14 AM Vitaliy Pisarev
wrote:
> This is not strictly a
When storing as a Parquet file, I don't think it requires the header
option:
option("header", "true")
Give it a try: remove the header option and then try to read it back. I
haven't tried it; just a thought.
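For example (an untested sketch): the header option belongs to CSV, not
Parquet:

val df = spark.range(5).toDF("id")
df.write.parquet("/tmp/out-parquet")                  // Parquet stores its own schema
df.write.option("header", "true").csv("/tmp/out-csv") // header is a CSV option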
Thank you,
Naresh
On Tue, Mar 27, 2018 at 9:47 PM Mina Aslani wrote:
> Hi,
How about accumulators?
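A rough sketch of the idea (the input path and condition are made up):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("acc-example").getOrCreate()
val badRecords = spark.sparkContext.longAccumulator("badRecords")

spark.sparkContext.textFile("/tmp/input").foreach { line =>
  if (line.isEmpty) badRecords.add(1)  // count per-row events without collecting rows to the driver
}
println(s"bad records: ${badRecords.value}")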
Thanks,
Naresh
www.linkedin.com/in/naresh-dulam
http://hadoopandspark.blogspot.com/
On Thu, Mar 8, 2018 at 12:07 AM Chethan Bhawarlal <
cbhawar...@collectivei.com> wrote:
> Hi Dev,
>
> I am doing Spark operations at the RDD level for each row, like this:
>
> private def
Change it to readStream instead of read, as below:
val df = spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "host1:port1,host2:port2")
  .option("subscribe", "topic1")
  .load()
Check if this helps.
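Note that a streaming DataFrame also needs a sink before anything runs; a
minimal console sink looks like this (a sketch):

val query = df.writeStream
  .format("console")
  .outputMode("append")
  .start()
query.awaitTermination()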
Hi Kant,
TD's explanation makes a lot of sense. Refer to this Stack Overflow post,
where it was explained with program output. Hope this helps.
https://stackoverflow.com/questions/45579100/structured-streaming-watermark-vs-exactly-once-semantics
Thanks,
Naresh
www.linkedin.com/in/naresh-dulam
What is your driver memory?
Thanks,
Naresh
www.linkedin.com/in/naresh-dulam
http://hadoopandspark.blogspot.com/
On Mon, Feb 26, 2018 at 3:45 AM, Patrick wrote:
> Hi,
>
> We were getting an OOM error when accumulating the results of each
> worker. We were trying to
Does this help?
import spark.implicits._
sc.parallelize(List((1, 10), (2, 20)))
  .toDF("foo", "bar")
  .write
  .partitionBy("foo")
  .json("json-out")
On Mon, Feb 26, 2018 at 4:28 PM, Alex Nastetsky
wrote:
> Is there a way to make outputs created with "partitionBy" to
> https://dist.apache.org/repos/dist/dev/spark/v2.3.0-rc5-docs/_site/structured-streaming-programming-guide.html#continuous-processing
Appu,
I have also landed in the same problem.
Were you able to solve this issue? Could you please share a snippet of the
code if you were able to?
Thanks,
Naresh
On Wed, Feb 14, 2018 at 8:04 PM, Tathagata Das
wrote:
> 1. Just loop like this.
>
>
> def startQuery(): Streaming
Hello Spark Experts,
What is the difference between Trigger.Continuous(10.seconds) and
Trigger.ProcessingTime("10 seconds")?
Thank you,
Naresh
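For reference (a sketch; assumes df is a streaming DataFrame):
ProcessingTime keeps the micro-batch model and plans a batch every
interval, while Continuous is the experimental continuous-processing mode,
where the interval is a checkpoint interval rather than a batch interval.

import org.apache.spark.sql.streaming.Trigger

// micro-batch: a new batch is planned every 10 seconds
df.writeStream.trigger(Trigger.ProcessingTime("10 seconds")).format("console").start()

// continuous (experimental in 2.3): rows flow continuously; 10 seconds is
// how often offsets are checkpointed
df.writeStream.trigger(Trigger.Continuous("10 seconds")).format("console").start()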
> ...differentiate records of one type of entity from another type of
> entity.
>
> -Beejal
>
> *From:* naresh Goud [mailto:nareshgoud.du...@gmail.com]
> *Sent:* Friday, February 23, 2018 8:56 AM
> *To:* Vibhakar, Beejal <beejal.vibha...@fisglobal.com>
>
> Regards,
> Keith.
>
> http://keith-chapman.com
It would be very difficult to tell without knowing what your application
code is doing and what kind of transformations/actions it performs. From my
previous experience, tuning application code to avoid unnecessary object
creation reduces pressure on the GC.
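One common instance of that (a sketch; assumes rdd is an RDD of
java.util.Date): build heavy objects once per partition rather than once
per record:

// per record: allocates a formatter for every row, adding GC pressure
rdd.map(d => new java.text.SimpleDateFormat("yyyy-MM-dd").format(d))

// per partition: one allocation per partition
rdd.mapPartitions { it =>
  val fmt = new java.text.SimpleDateFormat("yyyy-MM-dd")
  it.map(d => fmt.format(d))
}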
On Thu, Feb 22, 2018 at 2:13 AM, Keith Chapman
Even I am not able to reproduce the error.
On Thu, Feb 22, 2018 at 2:51 AM, Michael Artz
wrote:
> I am not able to reproduce your error. You should do something before you
> do that last function and maybe get some more help from the exception it
> returns. Like just add a
> ...spark/streaming/kafka010/KafkaUtils.scala
>
> FYI
>
Hello Team,
I see the "KafkaUtils.createStream()" method is not available in Spark 2.2.1.
Can someone please confirm whether these methods were removed?
Below are my pom.xml entries:
<properties>
  <scala.version>2.11.8</scala.version>
  <scala.tools.version>2.11</scala.tools.version>
</properties>

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming_${scala.tools.version}</artifactId>
  <version>2.2.1</version>
  <scope>provided</scope>
</dependency>
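For what it's worth, KafkaUtils is not part of spark-streaming itself; it
lives in the separate Kafka integration artifact. For Spark 2.2.1 with the
Kafka 0.10+ integration that would be roughly (a sketch):

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming-kafka-0-10_${scala.tools.version}</artifactId>
  <version>2.2.1</version>
</dependency>

The 0-10 integration exposes only the direct stream API
(KafkaUtils.createDirectStream); the receiver-based createStream exists
only in the older 0-8 artifact.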
Spark/Hive converts a decimal to a null value if we specify a precision
greater than the precision available in the file. The example below gives
the details; I am not sure why it converts to null.
Note: you need to trim the string before casting to decimal.
Table data with col1 and col2 columns:
val r =
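Something along these lines implements the trim-before-cast note (a sketch;
the table and column names are assumed):

import org.apache.spark.sql.functions.{col, trim}

val r = spark.table("some_db.some_table").select(
  trim(col("col1")).cast("decimal(18,2)").as("col1_dec")  // trim first, or the cast can yield null
)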
If I understand your requirement correctly:
use broadcast variables to replicate, across all nodes, the small amount of
data you want to reuse.
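A minimal sketch (the lookup data is made up):

// a small lookup map, shipped once to each executor instead of with every task
val lookup = Map(1 -> "a", 2 -> "b")
val bc = spark.sparkContext.broadcast(lookup)

val rdd = spark.sparkContext.parallelize(Seq(1, 2, 3))
rdd.map(k => bc.value.getOrElse(k, "unknown")).collect()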
On Mon, Jan 22, 2018 at 9:24 PM David Rosenstrauch
wrote:
> This seems like an easy thing to do, but I've been banging my head